Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for CDS reader requiring at least two description characters #4474

Merged
merged 10 commits into from
Jan 18, 2016
3 changes: 3 additions & 0 deletions CHANGES.rst
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,9 @@ Bug fixes

- ``astropy.io.ascii``

- Fixed handling of CDS data file when no description is given and also
included stripping out of markup for missing value from description. [#4437]

- ``astropy.io.fits``

- ``astropy.io.misc``
Expand Down
9 changes: 5 additions & 4 deletions astropy/io/ascii/cds.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,8 +103,8 @@ def get_cols(self, lines):
(?P<end> \d+) \s+
(?P<format> [\w.]+) \s+
(?P<units> \S+) \s+
(?P<name> \S+) \s+
(?P<descr> \S.+)""",
(?P<name> \S+)
(\s+ (?P<descr> \S.*))?""",
re.VERBOSE)

cols = []
Expand All @@ -120,13 +120,14 @@ def get_cols(self, lines):
col.unit = match.group('units')
if col.unit == '---':
col.unit = None # "---" is the marker for no unit in CDS table
col.description = match.group('descr').strip()
col.description = (match.group('descr') or '').strip()
col.raw_type = match.group('format')
col.type = self.get_col_type(col)

match = re.match(
r'\? (?P<equal> =)? (?P<nullval> \S*)', col.description, re.VERBOSE)
r'\? (?P<equal> =)? (?P<nullval> \S*) (\s+ (?P<descriptiontext> \S.*))?', col.description, re.VERBOSE)
if match:
col.description=(match.group('descriptiontext') or '').strip()
if issubclass(col.type, core.FloatType):
fillval = 'nan'
else:
Expand Down
2 changes: 2 additions & 0 deletions astropy/io/ascii/setup_package.py
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,8 @@ def get_package_data():
't/simple_csv.csv',
't/simple_csv_missing.csv',
't/fixed_width_2_line.txt',
't/cds/description/ReadMe',
't/cds/description/table.dat',
]
}

Expand Down
67 changes: 67 additions & 0 deletions astropy/io/ascii/tests/t/cds/description/ReadMe
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
J/A+A/511/A56 Abundances of five open clusters (Pancino+, 2010)
================================================================================
Chemical abundance analysis of the open clusters Cr 110, NGC 2420, NGC 7789,
and M 67 (NGC 2682).
Pancino E., Carrera R., Rossetti, E., Gallart C.
<Astron. Astrophys. 511, A56 (2010)>
=2010A&A...511A..56P
================================================================================
ADC_Keywords: Clusters, open ; Stars, giant ; Equivalent widths ; Spectroscopy
Keywords: stars: abundances - Galaxy: disk -
open clusters and associations: general

Abstract:
The present number of Galactic open clusters that have high resolution
abundance determinations, not only of [Fe/H], but also of other key
elements, is largely insufficient to enable a clear modeling of the
Galactic disk chemical evolution. To increase the number of Galactic
open clusters with high quality measurements, we obtained high
resolution (R~30000), high quality (S/N~50-100 per pixel), echelle
spectra with the fiber spectrograph FOCES, at Calar Alto, Spain, for
three red clump stars in each of five Open Clusters. We used the
classical equivalent width analysis method to obtain accurate
abundances of sixteen elements: Al, Ba, Ca, Co, Cr, Fe, La, Mg, Na,
Nd, Ni, Sc, Si, Ti, V, and Y. We also derived the oxygen abundance
using spectral synthesis of the 6300{AA} forbidden line.

Description:
Atomic data and equivalent widths for 15 red clump giants in 5 open
clusters: Cr 110, NGC 2099, NGC 2420, M 67, NGC 7789.

File Summary:
--------------------------------------------------------------------------------
FileName Lrecl Records Explanations
--------------------------------------------------------------------------------
ReadMe 80 . This file
table1.dat 103 15 Observing logs and programme stars information
table5.dat 56 5265 Atomic data and equivalent widths
--------------------------------------------------------------------------------

See also:
J/A+A/455/271 : Abundances of red giants in NGC 6441 (Gratton+, 2006)
J/A+A/464/953 : Abundances of red giants in NGC 6441 (Gratton+, 2007)
J/A+A/505/117 : Abund. of red giants in 15 globular clusters (Carretta+, 2009)

Byte-by-byte Description of file: table.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 7 A7 --- Cluster Cluster name
9- 12 I4 --- Star
14- 20 F7.2 0.1nm Wave wave
? Wavelength in Angstroms
22- 23 A2 --- El a
24 I1 --- ion ?=0
- Ionization stage (1 for neutral element)
26- 30 F5.2 eV chiEx Excitation potential
32- 37 F6.2 --- loggf Logarithm of the oscillator strength
39- 43 F5.1 0.1pm EW ?=-9.9 Equivalent width (in mA)
46- 49 F4.1 0.1pm e_EW ?=-9.9 rms uncertainty on EW
51- 56 F6.3 --- Q ?=-9.999 DAOSPEC quality parameter Q
(large values are bad)
--------------------------------------------------------------------------------

Acknowledgements:
Elena Pancino, elena.pancino(at)oabo.inaf.it
================================================================================
(End) Elena Pancino [INAF-OABo, Italy], Patricia Vannier [CDS] 23-Nov-2009
2 changes: 2 additions & 0 deletions astropy/io/ascii/tests/t/cds/description/table.dat
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Cr110 2108 6696.79 Al1 4.02 -1.42 29.5 2.2 0.289
Cr110 2108 6698.67 Al1 3.14 -1.65 58.0 2.0 0.325
16 changes: 16 additions & 0 deletions astropy/io/ascii/tests/test_cds_header_from_readme.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,21 @@ def read_table3(readme, data):
return ascii.read(data, readme=readme)


def test_description():
readme = 't/cds/description/ReadMe'
data = 't/cds/description/table.dat'
for read_table in (read_table1, read_table2, read_table3):
table = read_table(readme, data)
assert_equal(len(table), 2)
assert_equal(table['Cluster'].description, 'Cluster name')
assert_equal(table['Star'].description, '')
assert_equal(table['Wave'].description, 'wave? Wavelength in Angstroms')
assert_equal(table['El'].description, 'a')
assert_equal(table['ion'].description, '- Ionization stage (1 for neutral element)')
assert_equal(table['EW'].description, 'Equivalent width (in mA)')
assert_equal(table['Q'].description, 'DAOSPEC quality parameter Q(large values are bad)')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also test one of EW, e_EW, or Q? They are a little special in the sense that there is also the missing value markup (?=-9.9) in the description string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this markup for missing value wont change actual description right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the point of the test, to confirm that the missing value markup does not end up in the description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taldcroft @hamogu So table['Q'].description should be DAOSPEC quality parameter Q(large values are bad) not ?=-9.999 DAOSPEC quality parameter Q(large values are bad)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The markup value is processed in line 128 in cds.py.
Currently, the description string is unaltered from what you read in, i.e. the ?=-9.9 is part of the description. What we really should do is something like this:

match = re.match(
                     r'\? (?P<equal> =)? (?P<nullval> \S* (?P<descriptiontext> .))', col.description, re.VERBOSE)
col.description = match.group('descriptiontext')

(untested, but something like that.)
Not sure how that works out with line breaks in the description 0 I think those are taken care of in an earlier stage. I would just try and see if it works.
Technically is this outside of the scope of your PR ("requiring at least two description characters") and if you don't want to do it we can postpone it to another PR but it would be great if you could take care of it, while you are working on the description stuff anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would try to do it.


def test_multi_header():
readme = 't/cds/multi/ReadMe'
data = 't/cds/multi/lhs2065.dat'
Expand Down Expand Up @@ -137,3 +152,4 @@ def test_header_from_readme():
test_header_from_readme()
test_multi_header()
test_glob_header()
test_description()