Adding a time series/1d data use case #2
Conversation
Markus.. I'm looking over the annotated serialization.
On Wed, Feb 24, 2021 at 03:35:48PM -0800, Mark Cresitello-Dittmar wrote:
> Some questions..
> 1) it doesn't use any of the current models-in-progress (well.. Dataset looks ok)
> One of the primary goals of this workshop is to exercise the models in various scenarios to see how they serve.
> So, while this gives a good look at the annotation scheme you have in mind, it isn't applied to the same models as the other implementations will be.. so a bit apples vs oranges.
True, but that's kind of the purpose of the exercise; part of my
point is that the models as they stand have serious drawbacks that
can relatively easily be fixed; the implicit (if I may say so) models
here are intended to show that.
> 2) Magnitude
> the table has magnitudes, the annotation does not.
> It is not in Coords, presumably because it is not space or time domain
> It is not a Measurement, presumably because it has no errors?
There are two PhotCal annotations, one for the flux, the other for
the magnitude. Only the flux has an error (which, incidentally, is
because the error transforms in a way that it becomes badly
asymmetrical in mag when it's large), and hence only the flux
is annotated as a measurement. One *could* add a measurement
annotation without an error reference in the mag column as well, but
I don't think that would be useful -- would you want to do that? And
if so, why?
Somewhat more fundamentally, since PhotCal and Measurement are
independent annotations in my scheme, you can't really say "mag is
(or is not) a Measurement", only that "mag is (or is not) annotated
as Measurement version 1". I don't think our annotation scheme
(or rather, meta-model) should do more than that.
> 3) Position of the source
> The table has 2 Params giving the RA,DEC with description saying it is the "Position of source object"
> There is a SphericalCoordinate mapping to those values
> This is in the Coords annotation at Coords.space, so it can be found as a Coordinate
> But no other annotation which gives it context as a source position ( Target.position ? ) or anywhere in relation to the TimeSeries. I don't see when/how a client would know when to use the SphericalCoordinate.
Right, that's missing so far -- it would, I think, be part
ds:Dataset. If I got to say how it should do that, it would be a
sequence of column references in order to keep Dataset separate from
STC.
> 4) PhotCal reference to phot/flux?
> If I recall.. the usage of photDM:PhotCal is to be referenced by the Flux/Magnitudes as part of their "Frame" (SpectralDM).
> The PhotDM model PhotCal object does not have any association to the Flux/Magnitude values
...which is one of the things I claim we need to fix, which is why
it's done as it is in the example.
> The table flux and phot FIELD elements have this reference to PhotCal, but also has backward pointing references within the PhotCal objects to the FIELDs.
> Is that part of your annotation scheme? What if multiple columns share the same PhotCal?
No, the backward references are due to the custom GROUP from Ada's
time series proposal, which the example contains as well (I've just
pulled it out of my live system, which has them). I've maintained
that these backward references are a bad idea (indeed, they're one of
two reasons why I started on the STC annotation thing all these years
ago), and I've been struggling against them, in particular in COOSYS
and TIMESYS. Consider them legacy.
On Thu, Feb 25, 2021 at 08:53:22AM -0800, Mark Cresitello-Dittmar wrote:
> Help me see how this approach works well/better for clients..
> Your implicit model for Cube is:
> - ndcube:Cube
>   o independent_axes: RealQuantity[*][*] <-- [naxes][nrows]
>   o dependent_axes: RealQuantity[*][*] <-- [naxes][nrows]
Since the things with the square brackets unnerve me a bit, let me
make a brief statement: The annotation proposed is for VOTable (which
for STC is the most pressing use case).
If we want to annotate FITS arrays or even CDF files or anything
else, we'll have to come up with some syntax on how to reference the
various things happening in these container formats, including
header cards or other native metadata items.
However, the mess is bad enough as is, so I'd suggest to keep in mind
we'll have to do that at some point and develop ideas how that could
be done but focus on VOTable so we get *something* to work properly.
Big problems need to be solved in reasonable steps.
> Which is essentially "Table + knowledge of dependent/independent axes"
> There are 2 Cubes defined
> 1. has independent_axes = Field('obs_time'); dependent_axes = Field('phot')
> 2. has independent_axes = Field('obs_time'); dependent_axes = Field('flux'), Field('phot')
I'd not put it this way -- as the annotation says, there's one Cube
with two observables.
> I, as a client, find Cube instance no. 1 and want to know 'what is
> this a cube of?'
This is an example of why I think it's paramount to have concrete use
cases as scenarios what clients are supposed to do with the
annotation.
You see, I don't think a client will generally enumerate cubes in
this way. Instead, I expect it will see: "Ah, there's a cube in
there; offer the user the option to plot this as a cube" (or, in a
library: "I'll expose: you have two observables; before proceeding
further, choose one").
This cube-plotting option will then tell the user: you can plot
either flux or phot, and only when the user selects one of these
fields does it become relevant what that actually is.
The client will then inspect the additional annotations of that
column and pick one or more that help it configure the plot in the
most useful way, for instance:
(a) Ah, I have a measurement annotation -- I'm adding error bars.
(b) Ah, I have a Photcal annotation and a zero point -- offer to
convert to mag (or so).
(c) Ah, I have a time annotation -- offer to convert to other time
systems.
-- note that the cool part of this scheme is that a client that
perhaps doesn't know how to do (b) still can do (a) and (c) without a
problem.
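To make (a)-(c) concrete, here is a toy sketch of such a client in Python. The dmtype strings and the dict layout are purely illustrative stand-ins, not any agreed mapping syntax:

```python
# Illustrative sketch only: the dmtype strings and dict layout are
# hypothetical stand-ins for whatever the real mapping syntax provides.

def plot_features(column_annotations):
    """Collect the plot features enabled by the annotations on one column.

    Each recognised annotation contributes independently; unknown
    annotation types are simply skipped, so a client that cannot do
    (b) still gets (a) and (c).
    """
    features = []
    for ann in column_annotations:
        dmtype = ann.get("dmtype")
        if dmtype == "meas:Measurement":
            features.append("error-bars")              # (a)
        elif dmtype == "phot:PhotCal":
            features.append("mag-conversion")          # (b)
        elif dmtype == "coords:TimeCoordinate":
            features.append("time-system-conversion")  # (c)
    return features

flux_annotations = [
    {"dmtype": "meas:Measurement", "error": "flux_error"},
    {"dmtype": "phot:PhotCal", "zeroPointFlux": 3631.0},
]
print(plot_features(flux_annotations))  # ['error-bars', 'mag-conversion']
```

The point of the sketch is only the dispatch structure: each annotation is consumed on its own, with no cross-annotation dependencies.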
> To find this, I need to extract other annotations which contain the same Field reference
> - 'obs_time' is in "stc2:Coords" (indirectly) and "stc2:TimeCoordinate.location" (directly)
> - 'phot' is in "phot:PhotCal.value" (directly) and cube no. 2 "ndcube:dependent_axes" (directly)
> Which leads me to conclude that this is either:
> 1. Cube( Time, PhotCal )
> 2. Cube ( Time, Cube )
Ummm -- I don't think I could follow here, perhaps because I've not
quite worked out why you see two Cubes in here. Could you perhaps
elaborate a bit why a client would want to do this kind of analysis?
> The only way I see to identify column 'phot' as a Magnitude from
> the annotation, is from the PhotCal.magnitudeSystem attribute
> having a value. That makes it a VERY important attribute!
Yes, of course -- value is all-important, because that says what the
column you are annotating is. Without it, the whole annotation is
pointless in this scheme. True, this means you may have to
repeat items like filterIdentifier if you have multiple columns using
the same photometric system -- but that seems a small price to pay
for saving on referencing.
> There is the UCD on the Field, but unless your annotation scheme
> requires it to exist, one can't rely on that being there to help
> out.
Well, a client knows it's some kind of photometric thing because of
the PhotCal annotation -- and I'm quite convinced a client can
conveniently learn this as soon as it needs to: when it knows what
column it is operating on.
But yes, I'm very sure we should *not* encode the information
contained in the ucd and unit attributes in the annotation again
wherever we can help it. Repeating things *will* lead to conflicting
information on both ends (don't get me started, with my GloTS
operator hat on, on the endless pain of conflicting information in
VOSI tables and TAP schema in TAP services) and hence make our
clients' lives
hard.
Container formats that don't have UCD or even unit will need some
special handling in *their* annotation schemes, but that's certainly
solvable relatively straightforwardly once we have the container
format-specific referencing worked out, and we shouldn't encumber the
modelling and mapping question with such considerations at this
point -- this kind of thing has held up the whole effort for far too
long already.
On Fri, Feb 26, 2021 at 3:36 AM msdemlei ***@***.***> wrote:
On Thu, Feb 25, 2021 at 08:53:22AM -0800, Mark Cresitello-Dittmar wrote:
> Help me see how this approach works well/better for clients..
> Your implicit model for Cube is:
> - ndcube:Cube
> o independent_axes: RealQuantity[*][*] <-- [naxes][nrows]
> o dependent_axes: RealQuantity[*][*] <-- [naxes][nrows]
Since the things with the square brackets unnerve me a bit, let me
make a brief statement: The annotation proposed is for VOTable (which
for STC is the most pressing use case).
If we want to annotate FITS arrays or even CDF files or anything
else, we'll have to come up with some syntax on how to reference the
various things happening in these container formats, including
header cards or other native metadata items.
However, the mess is bad enough as is, so I'd suggest to keep in mind
we'll have to do that at some point and develop ideas how that could
be done but focus on VOTable so we get *something* to work properly.
Big problems need to be solved in reasonable steps.
I can't tell if you are saying that we don't need anything more complex
than RealQuantity[naxes][nrows],
or if you're suggesting this representation is in some way targeting more
complicated FITS, CDF cases..
Annotation:
<INSTANCE ID="nduppompmgea" dmtype="ndcube:Cube">
<ATTRIBUTE dmrole="independent_axes">
<COLUMN ref="obs_time"/>
</ATTRIBUTE>
<ATTRIBUTE dmrole="dependent_axes">
<COLUMN ref="phot"/>
</ATTRIBUTE>
</INSTANCE>
"independent_axes" points to the obs_time column, i.e. a list of
RealQuantity-s of length nrows.
There can be > 1 independent axis ("independent_axes" is plural), so
that is another dimension for the number of axes.
Hence, independent_axes == RealQuantity[naxes][nrows]
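For what it's worth, that [naxes][nrows] reading can be sketched in Python over a toy table (all column names and values invented):

```python
# Toy illustration of the [naxes][nrows] reading: each COLUMN reference
# in independent_axes names one table column; stacking the referenced
# columns gives a (naxes, nrows)-shaped structure. All values invented.

table = {
    "obs_time": [55197.0, 55198.0, 55199.0],  # nrows = 3
    "phot":     [11.2, 11.3, 11.1],
}

independent_axes = ["obs_time"]   # one COLUMN reference, as in the annotation
values = [table[ref] for ref in independent_axes]

assert len(values) == 1      # naxes
assert len(values[0]) == 3   # nrows
```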
> Which is essentially "Table + knowledge of dependent/independent axes"
>
> There are 2 Cubes defined
> 1. has independent_axes = Field('obs_time'); dependent_axes = Field('phot')
> 2. has independent_axes = Field('obs_time'); dependent_axes = Field('flux'), Field('phot')
I'd not put it this way -- as the annotation says, there's one Cube
with two observables.
Yes, but with very little more information than you get with the TABLE of
FIELDs.
Literally, the only information added is the dependency flag.
> I, as a client, find Cube instance no. 1 and want to know 'what is
> this a cube of?'
This is an example of why I think it's paramount to have concrete use
cases as scenarios what clients are supposed to do with the
annotation.
You see, I don't think a client will generally enumerate cubes in
this way. Instead, I expect it will see: "Ah, there's a cube in
there; offer the user the option to plot this as a cube" (or, in a
library: "I'll expose: you have two observables; before proceeding
further, choose one").
Write a script which scans through data for target "X", find Cubes which
have TimeSeries with Magnitude-s.
To find this, I need to extract other annotations which contain the same
Field reference
> - 'obs_time' is in "stc2:Coords" (indirectly) and "stc2:TimeCoordinate.location" (directly)
> - 'phot' is in "phot:PhotCal.value" (directly) and cube no. 2 "ndcube:dependent_axes" (directly)
>
> Which leads me to conclude that this is either:
> 1. Cube( Time, PhotCal )
> 2. Cube ( Time, Cube )
Ummm -- I don't think I could follow here, perhaps because I've not
quite worked out why you see two Cubes in here. Could you perhaps
elaborate a bit why a client would want to do this kind of analysis?
There is 1 cube at line 86, and another at line 155.
I assumed that was intentional, but did wonder why you'd want to do that.
> The only way I see to identify column 'phot' as a Magnitude from
> the annotation, is from the PhotCal.magnitudeSystem attribute
> having a value. That makes it a VERY important attribute!
Yes, of course -- value is all-important, because that says what the
column you are annotating is. Without it, the whole annotation is
pointless in this scheme. True, this means you may have to
repeat items like filterIdentifier if you have multiple columns using
the same photometric system -- but that seems a small price to pay
for saving on referencing.
"saving on referencing"..
This isn't something I'm terribly savvy about, but sounds like it may be
an important criterion for the requirements on the annotation. You're
saying that resolving the <REFERENCE> or ref=* references is more
costly than the added bulk and downstream vulnerability to inconsistency?
In my experience, once the knowledge that there is 1 instance being shared
is lost, the various occurrences tend to be treated independently.
On Fri, Feb 26, 2021 at 06:54:13AM -0800, Mark Cresitello-Dittmar wrote:
On Fri, Feb 26, 2021 at 3:36 AM msdemlei ***@***.***> wrote:
> On Thu, Feb 25, 2021 at 08:53:22AM -0800, Mark Cresitello-Dittmar wrote:
> > Help me see how this approach works well/better for clients..
> > Your implicit model for Cube is:
> > - ndcube:Cube
> > o independent_axes: RealQuantity[*][*] <-- [naxes][nrows]
> > o dependent_axes: RealQuantity[*][*] <-- [naxes][nrows]
>
> Since the things with the square brackets unnerve me a bit, let me
> make a brief statement: The annotation proposed is for VOTable (which
> for STC is the most pressing use case).
I can't tell if you are saying that we don't need anything more
complex than RealQuantity[naxes][nrows], or if you're suggesting
this representation is in some way targeting more complicated FITS,
CDF cases..
I was trying to say that at this point -- when we don't talk about
arrays in VOTable cells -- for *referencing* we don't need anything
but columns and params, i.e. XML ids. Which I now think had nothing
to do with what you were saying: you were trying to give
independent_axes and dependent_axes types other than "set of column
references", right?
I don't think we'd be doing anyone, including the clients, a favour
if we said anything about the types of the columns we reference here;
the types are given by the container format. All the annotation says
is, if you will, "consider this column as something you ought to plot
on the abscissa".
I can perfectly see people having categorical variables as axes, and I
see not much reason to keep them from doing that. Of course a
given analysis might not be able to deal with strings instead of
numbers, but it's better to let it jump and fail than try to ward
these things off in advance. After all, this isn't much different
from having to deal with, say, negative numbers, which, to mention an
example, will kill any analyses involving logarithms.
Trying to catch things like that in advance involves a lot of
complexity and rigidity for no gain -- the result in either case
would be an error message, and there's no guarantees anything
up-front would be better understandable to users than what the
programmes produce when they've jumped and failed.
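The "jump and fail" point can be sketched in a few lines of Python (the function name and values are invented): an analysis that cannot handle a value fails at the point of use, for a categorical axis exactly as for a negative number under a logarithm:

```python
import math

# "Jump and fail" sketch: no up-front type checking; the analysis fails
# naturally at the point of use. All names here are illustrative.

def log_axis(values):
    return [math.log(v) for v in values]

print(log_axis([1.0, 10.0, 100.0]))  # works on well-behaved numbers

try:
    log_axis([1.0, -5.0])            # negative number: fails at use
except ValueError as exc:
    print("analysis failed, as expected:", exc)

try:
    log_axis(["red", "blue"])        # categorical axis: same failure mode
except TypeError as exc:
    print("analysis failed, as expected:", exc)
```

In both failure cases the user gets an error at the point where the data actually cannot be used, which is the behaviour argued for above.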
> You see, I don't think a client will generally enumerate cubes in
> this way. Instead, I expect it will see: "Ah, there's a cube in
> there; offer the user the option to plot this as a cube" (or, in a
> library: "I'll expose: you have two observables; before proceeding
> further, choose one").
>
Write a script which scans through data for target "X", find Cubes which
have TimeSeries with Magnitude-s.
Well, I'd frankly expect such discovery operations to be run on
Obscore-like tables, but let me see how I'd do that if I had to do
that exercise on a wild bunch of annotated data files:
(1) find the ds:Dataset annotation
(2) dereference the dataProductType attribute, and if it's a
literal, compare it against TIMESERIES
(3) find ds:Dataset's target location attribute (not in the current
annotation; I don't remember why I didn't put it in. I'll do
some AstroTarget-like thing as I find a bit of time).
(4) That probably will never be a literal, so you'll get a bunch of
params or columns. If it's columns, give up, if it's params,
look for a spatial annotation on one of the params that you
understand. That sounds more complicated than it is
-- it just makes sure that we can evolve our spatial annotation
without breaking everything, and most of the time you'll have it
on the first attempt. If you don't find a spatial annotation
you understand, give up.
(5) compare the target position you find against target X's.
(6) see if there's a column with a UCD of phot.* among the columns.
I'd say the likelihood for a false positive here is negligible.
But again: Iterating over a bunch of files is probably not how we'll
do dataset discovery in any desirable future -- I suppose obscore, caom
or similar tech will remain the norm for that.
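The six steps might look roughly like this in Python, run over a made-up in-memory stand-in for an annotated file. The dict keys, the TIMESERIES literal handling, and the position lookup are all assumptions for illustration, not actual mapping syntax:

```python
# Hedged sketch of the discovery steps above. The dict layout and key
# names are invented stand-ins for the real annotation structures.

def is_timeseries_of_target(annotated_file, target_pos, tolerance=1e-3):
    # (1) find the ds:Dataset annotation
    dataset = annotated_file.get("ds:Dataset")
    if dataset is None:
        return False
    # (2) dereference dataProductType and compare against TIMESERIES
    if dataset.get("dataProductType") != "TIMESERIES":
        return False
    # (3)+(4) find a target position annotation we understand
    pos = dataset.get("target_position")  # assumed to resolve to (ra, dec)
    if pos is None:
        return False
    # (5) compare against target X's position
    ra, dec = pos
    if abs(ra - target_pos[0]) > tolerance or abs(dec - target_pos[1]) > tolerance:
        return False
    # (6) look for a column with a phot.* UCD
    return any(ucd.startswith("phot.")
               for ucd in annotated_file.get("column_ucds", []))

candidate = {
    "ds:Dataset": {"dataProductType": "TIMESERIES",
                   "target_position": (83.63, 22.01)},
    "column_ucds": ["time.epoch", "phot.mag;em.opt.V"],
}
print(is_timeseries_of_target(candidate, (83.63, 22.01)))  # True
```

Note how each "give up" in the prose maps to an early `return False`: a client that does not understand some annotation simply drops the file rather than erroring out.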
There is 1 cube at line 86, and another at line 155.
I assumed that was intentional, but did wonder why you'd want to do that.
Oops, no, that's a bug. Sorry about that. It *is* rather
experimental code that's been hacked on mostly in bad moods a couple
of times since the utype days... I'll fix it.
"saving on referencing"..
This isn't something I'm terribly savvy about, but sounds like it may be
an important criteria for the requirements on the annotation. You're
saying that resolving the <REFERENCE> or ref=* references is more
costly than the added bulk and downstream vulnerability to inconsistency?
In my experience, once the knowledge that there is 1 instance being shared
is lost, the various occurrences tend to be treated independently.
This is exactly analogous to the question of de-normalising database
schemas. There are very good reasons to keep them normalised in
general, but there are equally good reasons to de-normalise them in
individual cases.
For the photometry, my instinct would be that having the metadata in
an immediate annotation helps much more in the vast majority of
the use cases than it might harm in the relatively few cases where
having explicit filter objects referenced from photometry instances
might help a bit.
But of course that's based on data I've dealt with. There might be
cases when explicit filter objects actually help a lot, and that
could change my considerations.
The nesting is getting deep, so I'll re-base these comments
1) implicit model
> you were trying to give independent_axes and dependent_axes types
> other than "set of column references", right?
Yes.. I'm looking to work out the mapping from the underlying model to the
annotation... you don't have a document to describe that do you?
The annotation is that the ATTRIBUTE independent_axes == list of COLUMN
elements which reference a VOTable column.
So what would the underlying Data Model which this annotates be?
* these columns essentially describe Quantity-s (numeric values with
units)
When you say "I can perfectly see people having categorial variables as
axes"
* this translates, in my head, to there being either different kinds of
'axes', or different kinds of values (stored in columns).
- which starts adding structure to the model
I don't think it is viable/useful for the model to be:
* independent_axes: void[*][*]
which is what I believe you are proposing.
2) Write a script which scans through data for target "X", find Cubes
which have TimeSeries with Magnitude-s.
> (1) find the ds:Dataset annotation
> (2) dereference the dataProductType attribute, and if it's a literal,
> compare it against TIMESERIES
> (3) find ds:Dataset's target location attribute (not in the
> current annotation; I don't remember why I didn't put it in. I'll do some
> AstroTarget-like thing as I find a bit of time).
> (4) That probably will never be a literal, so you'll get a bunch
> of params or columns. If it's columns, give up, if it's params, look for a
> spatial annotation on one of the params that you understand. That sounds
> more complicated than it is
> -- it just makes sure that we can evolve our spatial annotation
> without breaking everything, and most of the time you'll have it on the
> first attempt. If you don't find a spatial annotation you understand, give
> up.
> (5) compare the target position you find against target X's.
> (6) see if there's a column with a UCD of phot.* among the columns.
> I'd say the likelihood for a false positive here is negligible.
This speaks to the Annotation/Model requirements.
To fully execute the thread, you must rely directly on the VOTable content
(ucd-s) to identify the 'type' of the data (Magnitude).
The main question here is: "Is the Magnitude-ness part of the model?" I
believe it is. In which case, I should be able to identify it via
Annotation content.
I think this question may be better explored in the "Standard Properties"
case
On Mon, Mar 01, 2021 at 07:48:59AM -0800, Mark Cresitello-Dittmar wrote:
> The nesting is getting deep, so I'll re-base these comments
> 1) implicit model
> > you were trying to give independent_axes and dependent_axes types
> > other than "set of column references", right?
> Yes.. I'm looking to work out the mapping from the underlying model to the
> annotation... you don't have a document to describe that do you?
Sorry, no -- I could write something up, but the way I'm expecting
this to work really is just "set of objects", so this particular
thing would be really short.
> The annotation is that the ATTRIBUTE independent_axes == list of COLUMN
> elements which reference a VOTable column.
> So what would the underlying Data Model which this annotates be?
> * these columns essentially describe Quantity-s (numeric values with
> units)
That's exactly the sort of cross-model references I'd like to avoid
whenever it doesn't hurt much.
Hence, it's really just "set of objects", which in a VOTable
annotation would usually translate into "set of columns and params".
Whether it makes any sense to reference params in the axes attributes
is another question -- probably not, but I suspect we'll be grateful
one day if we don't rule it out (PARAMs can be array-valued, after
all).
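A toy resolver can illustrate that reading (every name and value below is invented): an axes attribute is just a set of references, each of which may point at a column with per-row values or at a possibly array-valued PARAM:

```python
# Illustrative only: references resolve to columns (per-row values) or
# to PARAMs (here an array-valued one). All names and values invented.

columns = {"obs_time": [55197.0, 55198.0], "flux": [1.2, 1.1]}
params = {"band_centers": [4400.0, 5500.0]}   # an array-valued PARAM

def resolve(ref):
    """Return the values behind a reference, column or PARAM alike."""
    if ref in columns:
        return columns[ref]
    if ref in params:
        return params[ref]
    raise KeyError(f"unresolved reference: {ref}")

independent_axes = ["obs_time", "band_centers"]
print([resolve(r) for r in independent_axes])
# [[55197.0, 55198.0], [4400.0, 5500.0]]
```

The resolver does not care what the referenced values are; typing is left to the container format, as argued below.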
> When you say "I can perfectly see people having categorical variables as
> axes"
> * this translates, in my head, to there being either different kinds of
> 'axes', or different kinds of values (stored in columns).
> - which starts adding structure to the model
> I don't think it is viable/useful for the model to be:
> * independent_axes: void[*][*]
> which is what I believe you are proposing.
No, it's less than that, because it's really just a set (where I'd
not worry about implicit order too much, so a list would do, too).
I'm really sure we ought to leave typing to the container format (or
the target object, *if* we can't avoid referencing complex things).
Without that, bad flag days are *really* hard to avoid -- plus I
don't believe static typing is going to help us anyway in this
particular metamodel.
Let's learn from python (where this kind of thing is known as
"dynamic typing").
> > (6) see if there's a column with a UCD of phot.* among the columns.
> > I'd say the likelihood for a false positive here is negligible.
> This speaks to the Annotation/Model requirements.
> To fully execute the thread, you must rely directly on the VOTable content
> (ucd-s) to identify the 'type' of the data (Magnitude).
> The main question here is: "Is the Magnitude-ness part of the model?" I
> believe it is. In which case, I should be able to identify it via
> Annotation content.
You're probably expecting this, but anyway: What functionality would
be enabled if you have "physical meaning of scalar" in the model?
You see, there's a rather high price tag on that (you'll have to
replicate the UCD semantics, and you're severely limiting what people
can annotate; we had indications of the trouble of pulling this kind
of semantics into the models in the meeting the other day), and hence
we should reap a proportional benefit from it.
Also consider that the Registry already uses UCDs for data discovery
(*if* folks do these advanced sorts of data discovery at all).
Building something that will eventually require a parallel structure
with a similar functionality is painful, just as with the current
situation in image search, where you have to run ObsCore, SIAP1, and
SIAP2 for completeness. Let's try hard to avoid that in resource
discovery (which is hard enough as-is).
> I think this question may be better explored in the "Standard Properties"
> case
If you open a discussion there, would you mention ("@msdemlei") me
there so github pings me?
This has two things I'd like to mention. For one, there are a few actual use cases (in the sense of: what should be done with this particular annotation). And the annotation scheme assumes largely independent DMs and, I think, gets away with that quite nicely.