Using UCD in models #15
Comments
This thread continues the topic initiated [here](https://github.com/ivoa/dm-usecases/issues/12#issuecomment-795751659)
|
A challenge we have to face is to build a model encompassing any parameter one can find in astronomical catalogs.
My proposal is to use the Uniform Content Descriptors (UCDs) for this. Both @mcdittmar and @msdemlei claim that UCDs cannot be used in a model (I do not agree). This issue has been tackled by the MANGO proposal, which forces UCD prefixes for specific classes, so that a magnitude with ucd=pos.eq is not compliant with the model. It has also been proposed to use a vocabulary, but building a vocabulary that would be a subset of the UCDs is a job which is not worth it, unless it just maps UCDs (or a subset of them). This would be like hiding the use of UCDs. Other suggestions? |
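The prefix rule described above could be checked along these lines. This is only a sketch: the class names and the prefix table are hypothetical and are not taken from the MANGO draft.

```python
# Hypothetical table: measure class -> UCD prefixes it accepts.
# Neither the class names nor the prefixes come from the MANGO draft.
ALLOWED_UCD_PREFIXES = {
    "Position": ("pos.",),
    "Magnitude": ("phot.",),
    "GenericMeasure": (),  # empty tuple: any UCD is accepted
}


def is_compliant(class_name, ucd):
    """Return True if the primary UCD word fits the prefixes of the class."""
    prefixes = ALLOWED_UCD_PREFIXES[class_name]
    # The primary word is the part before the first ';'.
    primary_word = ucd.split(";")[0].strip().lower()
    if not prefixes:
        return True
    return primary_word.startswith(prefixes)


print(is_compliant("Position", "pos.eq"))    # True
print(is_compliant("Magnitude", "pos.eq"))   # False: the case from this thread
```

With such a table, a magnitude carrying ucd=pos.eq is rejected, while a generic measure accepts any UCD.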
To state/recap my position more specifically.
Mango is mixing semantic modeling and formal vo-dml modeling techniques.
* Parameter.semantic: is an entry from a semantic vocabulary identifying
the role of the content in the Property. This equates to an attribute name
(or vo-dml role) in a formally modeled Property type.
* Parameter.ucd: is using the UCD vocabulary to identify the Type of the
Property.measure element (mainly for when it is GenericMeasure class). This
equates to the vo-dml type of the object at Property.measure.
UCD tells more than measure type.
UCDs are 2-word labels, e.g. pos;meta.main
Therefore you cannot put UCDs in measures as built-in parameters.
The issue from my perspective *is* that UCD conveys more than the Type, and
so, is maybe not the best choice for this job.
The concepts overlap, so it can get the job done, but the UCD is a tool
which was designed for a different job.
* "pos.eq" conveys the type as Position, but also information about the
coordinate space
* "pos;meta.main", the second word contains information regarding the
role.. which overlaps with the purpose of Parameter.semantic
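To make the two examples concrete, here is a small sketch that splits a UCD into its words (separated by ";") and atoms (separated by "."). The function name is made up for illustration.

```python
from typing import List


def split_ucd(ucd: str) -> List[List[str]]:
    """Split a UCD into words (';'-separated), each a list of atoms ('.'-separated)."""
    return [word.strip().split(".") for word in ucd.split(";") if word.strip()]


# "pos.eq": a single word whose second atom already says something
# about the coordinate space.
print(split_ucd("pos.eq"))         # [['pos', 'eq']]
# "pos;meta.main": the second word carries role-like information.
print(split_ucd("pos;meta.main"))  # [['pos'], ['meta', 'main']]
```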
Using any semantic for this purpose is doing the same modeling work, but in
a different way.
ie: applying ucd="phot.flux" to a meas:GenericMeasure, indirectly
defines FluxMeasure, a specialization of GenericMeasure
By not doing the formal vo-dml equivalent you:
* lose the benefits that come with it ( auto-class generation, defining
associated metadata (PhotCal), etc )
* no longer have a model which defines what a "phot.flux" is.. what are
the expectations?, are there algorithm details?, was cos(dec) applied?. Is
that information to be recorded in the UCD document?
* still have the dependency on the Measure model version
* add a dependency on the UCD vocabulary version, which would have to
take on the job of being updated to add new Measure types
The other part of my objection (left to the original issue) has to do with
"whose job is it to identify the Type of the Measure?"
…On Fri, Mar 12, 2021 at 10:38 AM Laurent MICHEL ***@***.***> wrote:
A challenge we have to face is to build a model encompassing any
parameter one can find in astronomical catalogs.
- The natural way to do this is to make one class for each quantity.
  This does not work because there are too many different sorts of
  parameters, and their number increases day after day.
- The solution adopted by Mango is to use specific classes in a few
  cases (position ...) and to model the others with generic objects
  (meas:GenericMeasure).
- The problem is now to get the physical meaning of a GenericMeasure.
  We need a semantic tag for doing this (e.g. this measure is a magnetic
  field).
My proposal is to use the Uniform Content Descriptors (UCDs) for this.
Both @mcdittmar <https://github.com/mcdittmar> and @msdemlei
<https://github.com/msdemlei> claim that UCDs cannot be used in a model
(I do not agree).
The arguments are that the scope of the UCDs goes beyond the quantity
roles and that we can get mismatches between UCDs and measure classes (e.g.
a magnitude with ucd=pos.eq).
This issue has been tackled by the MANGO proposal, which forces UCD
prefixes for specific classes, so that a magnitude with ucd=pos.eq is not
compliant with the model.
It has also been proposed to use a vocabulary, but building a vocabulary
that would be a subset of the UCDs is a job which is not worth it, unless
it just maps UCDs (or a subset of them). This would be like hiding the use
of UCDs.
Other suggestions?
|
On Fri, Mar 12, 2021 at 07:38:10AM -0800, Laurent MICHEL wrote:
A challenge we have to face is to build a model encompassing any
parameter one can find in astronomical catalogs.
My take is: This is a problem that immediately goes away when you
properly define "build a model".
You see, as I've argued in the past few years, there is not one model
for something you find in astronomical catalogues (let's call it a
"column" here, but other entities are possible, too), but each column
can be annotated using several different models.
If you have a column magK, it can be:
* the value in a photometry annotation (which may also define the
bandpass, zero point, etc)
* the value in a measurement annotation (which will link it to an
err_magK and perhaps even correlations if applicable)
* one of the dependent_axis in an nDCube annotation
* the value in a characterisation annotation (which might give the
legal and actual range of the thing, and perhaps some advanced
statistics)
* the result (or whatever) in a provenance annotation that links this
with an image and a software that did the source extraction.
…-- and so on.
Seen like this, there is no challenge -- simple, plain models
automatically work for anything that matches their purpose. Which is
one of the reasons I keep arguing for them.
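As a rough illustration of "one column, several annotations", here is a sketch. The registry, the model labels and the role names are all invented for the example and do not correspond to any existing library.

```python
from collections import defaultdict

# column name -> list of (model, role) pairs; every name here is illustrative.
annotations = defaultdict(list)


def annotate(column, model, role):
    """Record that `column` plays `role` in an annotation from `model`."""
    annotations[column].append((model, role))


def roles_for(column, model):
    """Return the roles `column` plays in annotations from `model`."""
    return [role for m, role in annotations[column] if m == model]


# The same magK column takes part in several independent annotations.
annotate("magK", "phot", "value")
annotate("magK", "meas", "value")
annotate("err_magK", "meas", "error")
annotate("magK", "ndcube", "dependent_axis")

print(roles_for("magK", "meas"))    # ['value']
print(roles_for("magK", "ndcube"))  # ['dependent_axis']
```

A client interested only in errors would look at the "meas" entries and ignore the rest.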
|
I understand Mark's point of view, though I do not follow it totally. You are right if we consider that 1- providing types to measures
The requirements for
2- Describing the measure content Actually this is not the purpose of I insist that the best candidate to describe the measure role is the UCD. In this configuration, the measurement model would be able to associate type with |
Markus,
I anticipated your take in this post. To me, the model must be valid, self-consistent and usable outside the scope of VOTable parsing. This is why it cannot be designed as a set of independent objects spread over VOTable FIELDs. We have a model for column quantities: MCT |
I've read this thread and also the long #12 one. This said I think I follow Laurent for the remaining part. |
On Mon, May 03, 2021 at 10:47:22AM -0700, Bonnarel wrote:
First of all, I disagree that UCDs mix quantity types and roles. They
are quantity types. The combination feature with semicolon was
Good -- and we should keep it that way. We *have* sinned a bit with
meta.main, but that shouldn't be a reason to re-invent UCDs in model
hierarchies.
b ) IN case we find some odd measurements just use a Parameter with
Generic Measure and we have the ucd attribute in mango:Parameter to
tell us what it is.
Well, actually we already have ***@***.*** -- let's see how far we get
with that.
d) this may be a ucd coarse grained compared to the one used in
VOTable on each FIELD (but they should be consistent)
Um... why would someone use different UCDs in DM annotation and on
the field? What would clients be supposed to do in such a case?
e ) the ucd attribute in the Parameter class is very useful for
non-VOTable serializations
Perhaps, but then it's easy to define UCD attachment per such
serialisation. There's no need to encumber our DMs with that.
But maintaining UCDs in DMs is really a side show. The main issue
with current Meas is the large number of subclasses for which I still
haven't seen a credible use case (over just using ***@***.***).
And that's a large problem, because they're actually heavily damaging
a fundamental and attractive baseline use case (cf. issue #36): Let a
client figure out errors.
That one is rather dear to me because (learn from STC-1) this can
greatly help uptake: Data providers will love it when TOPCAT
automatically plots error bars on their data sets. If we make that
cheap and easy, chances that people will actually add annotation to
what they produce get a lot better.
With just one Measurement class (or perhaps a few when we add proper
distributions), with the API of https://github.com/msdemlei/astropy,
all TOPCAT would need to do is:
```
ann = col.get_annotations("meas:Measurement")
if ann:
    associated_error = ann.naive_error
```
(or it would try a few attributes it knows how to plot).
With current Meas, it will, as far as I can see, have to do something
like
```
MEAS_CLASSES = ["meas:GenericMeasure", "meas:Time",
    "meas:Position", "meas:Velocity", "meas:ProperMotion"]
# I'm leaving out Polarisation because it really doesn't belong here
for class_name in MEAS_CLASSES:
    ann = col.get_annotations(class_name)
    if ann:
        associated_error = ann.naive_error
```
And, worse, each time we invent a new Measure subclass, it will have
to amend MEAS_CLASSES.
That's a high price to pay; it would be worth paying if we got a
major benefit from it. But I can't even see a minor one.
Of course, you could tell Mark to
(a) retrieve the VO-DML
(b) parse it
(c) derive the MEAS_CLASSES from the class hierarchy.
(or, equivalently, have the VOTable library do that, and have a new
get_annotation_subclass API function).
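Step (c) might look roughly like the sketch below. The parent-to-subclasses table stands in for whatever a VO-DML parser would deliver after steps (a) and (b); both the table and the assumption that the subclasses hang off a base called "meas:Measure" are illustrative only.

```python
# Hypothetical result of parsing the VO-DML: parent class -> direct subclasses.
# The base name "meas:Measure" is an assumption made for this sketch.
SUBCLASSES = {
    "meas:Measure": ["meas:GenericMeasure", "meas:Time", "meas:Position",
                     "meas:Velocity", "meas:ProperMotion"],
}


def all_subclasses(root, table):
    """Collect direct and indirect subclasses of `root`, breadth first."""
    found = []
    queue = list(table.get(root, []))
    while queue:
        name = queue.pop(0)
        if name not in found:
            found.append(name)
            queue.extend(table.get(name, []))
    return found


MEAS_CLASSES = all_subclasses("meas:Measure", SUBCLASSES)
print(MEAS_CLASSES)
```

The traversal itself is short; the cost discussed here lies in obtaining and parsing the VO-DML that fills the table.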
That's an even higher price, not only because of all the extra code
for VO-DML processing (lesson from STC-1: make takeup easy and cheap)
but also because you start having client code retrieve stuff from
ivoa.net in normal operations (lesson from Registry: That's a pain;
don't get me started on the fun I keep having with validators having
to pull schema documents from our schema repo). Again: there are
cases when that may be a price worth paying. But here, I can't see a
proportional benefit.
|
Hi Markus
On 04/05/2021 at 10:45, msdemlei wrote:
On Mon, May 03, 2021 at 10:47:22AM -0700, Bonnarel wrote:
> First of all, I disagree that UCDs mix quantity types and roles. They
> are quantity types. The combination feature with semicolon was
Good -- and we should keep it that way. We *have* sinned a bit with
meta.main, but that shouldn't be a reason to re-invent UCDs in model
hierarchies.
OK
> b ) IN case we find some odd measurements just use a Parameter with
> Generic Measure and we have the ucd attribute in mango:Parameter to
> tell us what it is.
Well, actually we already have ***@***.*** -- let's see how far we get
with that.
Sorry I don't understand what you are talking about. Can you be more
explicit ?
> d) this may be a ucd coarse grained compared to the one used in
> VOTable on each FIELD (but they should be consistent)
Um... why would someone use different UCDs in DM annotation and on
the field? What would clients be supposed to do in such a case?
OK. My point was unclear, I admit it.
The ucd in mango:Parameter is associated not with a single FIELD or PARAM
but with a group of them (a group in the plain English sense, not always a
VOTable GROUP), as is the Measure.
The FIELDs embedded in that mango:Parameter have their own UCDs, which can
be more precise.
But they have to be consistent, of course.
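One possible reading of "consistent" is that the primary word of the group UCD appears in the FIELD UCD, either exactly or as a coarser prefix. The sketch below implements that reading only; it is an assumption, not a rule stated by MANGO or by the UCD standard.

```python
def ucd_words(ucd):
    """Lower-cased ';'-separated words of a UCD."""
    return [w.strip().lower() for w in ucd.split(";") if w.strip()]


def is_consistent(group_ucd, field_ucd):
    """True if the group's primary word matches, or is a coarser form of,
    one of the words in the field UCD (assumed meaning of 'consistent')."""
    group = ucd_words(group_ucd)
    if not group:
        return False
    primary = group[0]
    return any(word == primary or word.startswith(primary + ".")
               for word in ucd_words(field_ucd))


print(is_consistent("pos.eq", "pos.eq.ra"))  # True: the field UCD is finer grained
print(is_consistent("pos.eq", "phot.mag"))   # False
```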
> e ) the ucd attribute in the Parameter class is very useful for
> non-VOTable serializations
Perhaps, but then it's easy to define UCD attachment per such
serialisation. There's no need to encumber our DMs with that.
But maintaining UCDs in DMs is really a side show. The main issue
with current Meas is the large number of subclasses for which I still
haven't seen a credible use case (over just using ***@***.***).
See above my question
And that's a large problem, because they're actually heavily damaging
a fundamental and attractive baseline use case (cf. issue #36): Let a
client figure out errors.
That one is rather dear to me because (learn from STC-1) this can
greatly help uptake: Data providers will love it when TOPCAT
automatically plots error bars on their data sets. If we make that
cheap and easy, chances that people will actually add annotation to
what they produce get a lot better.
With just one Measurement class (or perhaps a few when we add proper
distributions), with the API of https://github.com/msdemlei/astropy,
all TOPCAT would need to do is:
```
ann = col.get_annotations("meas:Measurement")
if ann:
    associated_error = ann.naive_error
```
(or it would try a few attributes it knows how to plot).
With current Meas, it will, as far as I can see, have to do something
like
```
MEAS_CLASSES = ["meas:GenericMeasure", "meas:Time",
    "meas:Position", "meas:Velocity", "meas:ProperMotion"]
# I'm leaving out Polarisation because it really doesn't belong here
for class_name in MEAS_CLASSES:
    ann = col.get_annotations(class_name)
    if ann:
        associated_error = ann.naive_error
```
And, worse, each time we invent a new Measure subclass, it will have
to amend MEAS_CLASSES.
That's a high price to pay; it would be worth paying if we got a
major benefit from it. But I can't even see a minor one.
But the code for getting back the error should be the same for all the
subclasses, no?
Apart from polarization, there is absolutely no constraint on Error for
any of the subclasses (which is perfectly understandable).
The code for getting the error from any of the subclasses is typically
the code for the generic measure, isn't it?
So why is it so large a problem?
… Of course, you could tell Mark to
(a) retrieve the VO-DML
(b) parse it
(c) derive the MEAS_CLASSES from the class hierarchy.
(or, equivalently, have the VOTable library do that, and have a new
get_annotation_subclass API function).
That's an even higher price, not only because of all the extra code
for VO-DML processing (lesson from STC-1: make takeup easy and cheap)
but also because you start having client code retrieve stuff from
ivoa.net in normal operations (lesson from Registry: That's a pain;
don't get me started on the fun I keep having with validators having
to pull schema documents from our schema repo). Again: there are
cases when that may be a price worth paying. But here, I can't see a
proportional benefit.
|
Speaking for MANGO, I admit that the current draft has too many per-domain classes (see MANGO issues).
Having said that, we need a way to give a role to those generic measures. In the context of a MANGO-annotated VOTable, UCDs could either be set as a reference to FIELD@ucd (see here) or as literals.
|
The current list of per-domain classes is rather limited (position, pm, veloc, time, luminosity)
Mango has a placeholder for this. |
I can agree with most of this.
My only assertion is that if UCD is considered the solution for defining
what the GenericMeasure holds, that should be assigned to the
GenericMeasure.
Other usage of GenericMeasure (in Cube for example), will have the same
question.
The same GenericMeasure will not be a "phys.energy" in Source but a
"stat.snr" in Cube.
…On Wed, May 5, 2021 at 4:44 AM Laurent MICHEL ***@***.***> wrote:
MANGO speaking, I admit that the current draft has too many per-domain
classes. (see MANGO issues)
My point of view:
- Use per-domain class when it is necessary.
- domain comes with specific frame classes (e.g. Time)
- Specific set of coordinates (e.g. 2 coordinates for one position)
- This allows validators to check that positions are bound with
SpaceFrame and not with TimeFrame
- Use generic measure anywhere else
- no frame
- one single coordinate
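The kind of check a validator could run under these rules is sketched below. The class names, frame kinds and coordinate counts are illustrative and are not copied from the MANGO or Meas drafts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    kind: str  # "space" or "time" in this sketch


@dataclass
class Measure:
    class_name: str
    n_coordinates: int
    frame: Optional[Frame] = None


# Illustrative rules: expected frame kind and coordinate count per class.
RULES = {
    "Position": {"frame": "space", "n_coordinates": 2},
    "Time": {"frame": "time", "n_coordinates": 1},
    "GenericMeasure": {"frame": None, "n_coordinates": 1},
}


def validate(measure: Measure) -> List[str]:
    """Return a list of problems; an empty list means the measure passes."""
    rule = RULES.get(measure.class_name)
    if rule is None:
        return ["unknown class " + measure.class_name]
    errors = []
    actual = measure.frame.kind if measure.frame else None
    if actual != rule["frame"]:
        errors.append("expected frame %s, got %s" % (rule["frame"], actual))
    if measure.n_coordinates != rule["n_coordinates"]:
        errors.append("expected %d coordinate(s), got %d"
                      % (rule["n_coordinates"], measure.n_coordinates))
    return errors


print(validate(Measure("Position", 2, Frame("space"))))  # []
print(validate(Measure("Position", 2, Frame("time"))))   # one frame error
```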
Having said that we need a way to give a role to those generic measures.
I reaffirm that using UCDs for this is not only valid but also smart.
This is why MANGO requires UCDs for each parameter.
In the context of a MANGO-annotated VOTable, UCDs could either be set as a
reference to FIELD@ucd (see here
<#23 (comment)>)
or as literals.
- *by reference* when both column and Mango parameter UCDs match
together
- *as literal* in any other cases.
- UCD not provided in the VOTable.
- UCDs do not match (e.g pos.eq.ra and pos.eq.dec for the position
fields vs pos.eq for the MANGO position Parameter)
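The choice between the two modes can be written as a small decision function. This is a sketch of the rule as stated in the list above; the function name and the "reference"/"literal" return values are invented for the example.

```python
from typing import Optional, Sequence


def ucd_binding(parameter_ucd: str, column_ucds: Sequence[Optional[str]]) -> str:
    """Return "reference" when every column carries the parameter's UCD,
    "literal" otherwise (UCD missing on a column, or UCDs differ)."""
    if column_ucds and all(ucd == parameter_ucd for ucd in column_ucds):
        return "reference"
    return "literal"


# Column and parameter UCDs match: the annotation can point at the FIELD.
print(ucd_binding("phys.energy", ["phys.energy"]))            # reference
# Position case from the list above: the FIELD UCDs are finer grained.
print(ucd_binding("pos.eq", ["pos.eq.ra", "pos.eq.dec"]))     # literal
# No UCD on the column at all.
print(ucd_binding("pos.eq", [None]))                          # literal
```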
|
On Wed, May 05, 2021 at 01:36:50AM -0700, Bonnarel wrote:
On 04/05/2021 at 10:45, msdemlei wrote:
> Well, actually we already have ***@***.*** -- let's see how far we get
> with that.
Sorry I don't understand what you are talking about. Can you be more
explicit ?
Ah... that was "the ucd attribute of FIELD" (and PARAM) in xpath
notation before github's spam obfuscator ate it.
All I was saying is: We have a perfectly good place to store UCDs.
Before we create another place, let's have crystal-clear use (!)
cases that that place isn't good enough any more.
> > d) this may be a ucd coarse grained compared to the one used in
> > VOTable on each FIELD (but they should be consistent)
>
> Um... why would someone use different UCDs in DM annotation and on
> the field? What would clients be supposed to do in such a case?
OK. My point was unclear, I admit it.
The ucd in Mango:parameter is not associated to a single FIELD or PARAM
but to a group of them. (a group english word, not always a VOTable GROUP)
as is the Measure.
Yeah, that *would* be a clear case, except I can't see a use case for
these ucds on Measure (or whatever).
On the contrary, I expect them to lead to rather confusing
situations. Say you have a Photometric measurement. Its "group"
UCD would presumably be something like phot.mag;em.opt.V, right?
But now it would group two columns, the value and an error. These
would then have UCDs phot.mag;em.opt.V and
stat.error;phot.mag;em.opt.V. Don't you agree it's a bit odd to
repeat the UCD on one of the reference fields, and to have a
different one on the other?
And of course, as usual: What would clients do with the UCD on
Measure that they could not (possibly better) do with the UCD on the
Measurement's value?
> ```
> MEAS_CLASSES = ["meas:GenericMeasure", "meas:Time",
>     "meas:Position", "meas:Velocity", "meas:ProperMotion"]
> # I'm leaving out Polarisation because it really doesn't belong here
>
> for class_name in MEAS_CLASSES:
>     ann = col.get_annotations(class_name)
>     if ann:
>         associated_error = ann.naive_error
> ```
>
> And, worse, each time we invent a new Measure subclass, it will have
> to amend MEAS_CLASSES.
>
> That's a high price to pay; it would be worth paying if we got a
> major benefit from it. But I can't even see a minor one.
>
But the code for getting back the error should be the same for all the
subclasses, no ?
I'd hope so, yes.
The code for getting the error from any of the subclasses is typically
the code for the generic measure, isn't it?
Again, that's my understanding, and part of the reason why I can't
see why we'd want these additional classes.
So why is it so large a problem?
Because getting, managing, and parsing VO-DML is a big deal that
people just wanting to find the error for a value shouldn't have to
do if we can help it. And manually enumerating all the sub-classes
is brittle and asking quite a bit of our adopters, in particular when
at least I can't explain why these sub-classes are there to begin with.
|
Having UCDs on GenericMeasure only makes sense, but it would require the Meas model to do it. Two elements in answer to Markus about the UCD repetitions:
|
I don't think we propose to replace the ucd attribute on FIELDS/PARAM. What is in Mango is a wider usage of ucd as semantic tags. Both usages have to be consistent when appropriate
The code can prepare for managing anything "below" this parameter as a Photometric Measurement |
Is it possible to subset the mango:parameter ucd attribute value for per-physics classes ?
+1 : good summary Laurent |