Make sure dtype.names ends up with the "name" attribute instead of "id" #819

Merged
merged 1 commit into from Dec 12, 2013

2 participants

@eteq
The Astropy Project member

This is prompted by a question by Susana Sanchez on the astropy mailing list (http://mail.scipy.org/pipermail/astropy/2013-February/002303.html) - see that discussion for additional info.

The current VOTable behavior is that if you have an XML VOTable with FIELDs that have both an ID attribute and a name attribute, io.vo puts both the name and the ID in the resulting numpy dtype when it parses the file. That is, for a particular example where the IDs are "col#", the first entry of table.dtype is (('CIG Number', 'col1'), '|O8'). This is all fine so far - it's good to record both of those.

The problem is that if you then do table.dtype.names, the resulting list has the IDs, not the names from the XML Field. It seems more natural for the name attribute to end up as the numpy dtype's name.

If I understand multi-name dtypes, I think the fix is to simply construct the dtypes so that the last name for each column is the name attribute, rather than the first. But @mdboom will likely know if this might have unintended side effects.

I'm not sure if this is a bug or feature request, as I'm not sure if this is intended or not.

@mdboom
The Astropy Project member

The problem is this (and I'd be in favor of better documentation, of course):

In VOTable, ID is guaranteed to be unique, but is not required. Names are not guaranteed to be unique, but are required.

In numpy, names are required and must be unique. Titles are not required, and are not required to be unique.

So a name is not a name. The conceptual mapping is really that vo name == numpy title and vo ID == numpy name.
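To make that mapping concrete, here is a minimal numpy-only sketch. It assumes a made-up two-column layout: the second column and its `Velocity` title are hypothetical, and only `CIG Number` / `col1` come from the example above. It does not call io.vo at all; it just shows how numpy treats the `((title, name), format)` form.

```python
import numpy as np

# Each field is declared as ((title, name), format).
# Under the mapping described above: VOTable name -> numpy title,
# VOTable ID -> numpy name. The second column is hypothetical.
dt = np.dtype([
    (("CIG Number", "col1"), "f8"),
    (("Velocity", "col2"), "f8"),
])

# dtype.names holds only the numpy names, i.e. the VOTable IDs.
names = dt.names

# Titles are stored as the third element of each dtype.fields entry.
titles = tuple(dt.fields[n][2] for n in dt.names)

# A title can still be used to index the array, just like a name.
arr = np.zeros(3, dtype=dt)
arr["col1"] = [1.0, 2.0, 3.0]
by_title = arr["CIG Number"]
```

So the VOTable name is not lost; it just does not show up in `dtype.names`.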

@eteq
The Astropy Project member

Ahh, I see. That's very subtly annoying, but understandable.

I'll leave this open and assign to you, @mdboom, as an "improve the documentation on this whenever you have the time" sort of thing.

@mdboom mdboom was assigned Feb 25, 2013
@mdboom
The Astropy Project member

@eteq: How does this look? It's a mess, really, so hard to explain well.

@eteq
The Astropy Project member

I think it's as good as it can be, considering how subtle and confusing this is. So I'll just go ahead and merge this and we can revisit it if more people have trouble after this. Thanks @mdboom!

@eteq eteq merged commit 9c54d8e into astropy:master Dec 12, 2013

1 check passed

The Travis CI build passed
@embray embray commented on the diff Dec 13, 2013
docs/io/votable/index.rst
@@ -81,6 +82,27 @@ specified as follows:
</DESCRIPTION>
</FIELD>
+.. note::
+
+ The mapping from VOTable ``name`` and ``ID`` attributes to Numpy
+ dtype ``names`` and ``titles`` is highly confusing.
+
+ In VOTable, ``ID`` is guaranteed to be unique, but is not
+ required. ``name`` is not guaranteed to be unique, but is
+ required.
@embray
The Astropy Project member
embray added a line comment Dec 13, 2013

Who came up with that?

@mdboom
The Astropy Project member
mdboom added a line comment Dec 13, 2013

I have a feeling we'll be writing a paper called "The Future of Astronomical Data Formats I. Learning from VOTable" in about 5-10 years.

@embray
The Astropy Project member
embray added a line comment Dec 13, 2013

Why wait? Just call it "The Future of Astronomical Data Formats II: Learning from VOTable"

@mdboom mdboom deleted the mdboom:vo/name-mapping-docs branch May 21, 2014