New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for CDS reader requiring at least two description characters #4474

Merged
merged 10 commits into from Jan 18, 2016

Conversation

Projects
None yet
7 participants
@anchitjain1234
Contributor

anchitjain1234 commented Jan 12, 2016

#4437 fix

anchitjain1234 added some commits Jan 12, 2016

@hamogu

This comment has been minimized.

Show comment
Hide comment
@hamogu

hamogu Jan 12, 2016

Member

I think this looks very promising. Can you add an entry to the CHANGES.rst file and a test?
You can look at the existing tests e.g. in tests/test_cds_header_... to see how it is done.
The easiest way is probably to take the example from #4437 (with a line or two of data) , add that table in the io/ascii/tests/t/cds directory, and then add a test that reads that table.

Note: If you add a file to /io/ascii/tests/t then you also have to list it here: https://github.com/astropy/astropy/blob/master/astropy/io/ascii/setup_package.py
to make sure that the test file gets installed (so that the tests find it).

Member

hamogu commented Jan 12, 2016

I think this looks very promising. Can you add an entry to the CHANGES.rst file and a test?
You can look at the existing tests e.g. in tests/test_cds_header_... to see how it is done.
The easiest way is probably to take the example from #4437 (with a line or two of data) , add that table in the io/ascii/tests/t/cds directory, and then add a test that reads that table.

Note: If you add a file to /io/ascii/tests/t then you also have to list it here: https://github.com/astropy/astropy/blob/master/astropy/io/ascii/setup_package.py
to make sure that the test file gets installed (so that the tests find it).

@anchitjain1234

This comment has been minimized.

Show comment
Hide comment
@anchitjain1234

anchitjain1234 Jan 12, 2016

Contributor

Where should the entry be added in CHANGES.rst? In version 1.1?

Contributor

anchitjain1234 commented Jan 12, 2016

Where should the entry be added in CHANGES.rst? In version 1.1?

@hamogu

This comment has been minimized.

Show comment
Hide comment
@hamogu

hamogu Jan 12, 2016

Member

1.1 is released already (see https://github.com/astropy/astropy/blob/master/CHANGES.rst), so we cannot make any changes to that any longer.
Put it into "1.1.1. - bug fixes".

Also, when you do another commit, you can put "fixes #4437" anywhere into the git commit message and github will automatically close that issue, when this pull request is merged.

Member

hamogu commented Jan 12, 2016

1.1 is released already (see https://github.com/astropy/astropy/blob/master/CHANGES.rst), so we cannot make any changes to that any longer.
Put it into "1.1.1. - bug fixes".

Also, when you do another commit, you can put "fixes #4437" anywhere into the git commit message and github will automatically close that issue, when this pull request is merged.

@bsipocz

This comment has been minimized.

Show comment
Hide comment
@bsipocz

bsipocz Jan 12, 2016

Member

@hamogu - v1.1.1 is also released, so this should go in 1.1.2.

Member

bsipocz commented Jan 12, 2016

@hamogu - v1.1.1 is also released, so this should go in 1.1.2.

Fixes #4437
Added test files and also updated CHANGES.rst

@eteq eteq added this to the v1.1.2 milestone Jan 12, 2016

@eteq

This comment has been minimized.

Show comment
Hide comment
@eteq

eteq Jan 12, 2016

Member

Yup, as @bsipocz said, @anchitjain1234, you should put the changelog entry in the "Bug Fixes" section of 1.1.2.

One other minor thing (not worth changing here, but just for the future) @anchitjain1234: can you try to have your commit messages be a bit more descriptive "Fixes #4437" is not ideal because it doesn't say what that particular commit does. Something like "Adds tests to make sure that the fix for #4437 works" I think would be more accurate. Thanks!

Member

eteq commented Jan 12, 2016

Yup, as @bsipocz said, @anchitjain1234, you should put the changelog entry in the "Bug Fixes" section of 1.1.2.

One other minor thing (not worth changing here, but just for the future) @anchitjain1234: can you try to have your commit messages be a bit more descriptive "Fixes #4437" is not ideal because it doesn't say what that particular commit does. Something like "Adds tests to make sure that the fix for #4437 works" I think would be more accurate. Thanks!

Show outdated Hide outdated CHANGES.rst
@@ -167,6 +167,10 @@ Bug fixes
- ``astropy.wcs``
- ``astropy.io.ascii``

This comment has been minimized.

@embray

embray Jan 12, 2016

Member

Adding this section heading isn't necessary--it already exists above.

@embray

embray Jan 12, 2016

Member

Adding this section heading isn't necessary--it already exists above.

Show outdated Hide outdated CHANGES.rst
@@ -167,6 +167,10 @@ Bug fixes
- ``astropy.wcs``
- ``astropy.io.ascii``
- Handling of CDS data file when no description is given. [#4474]

This comment has been minimized.

@embray

embray Jan 12, 2016

Member

This is a fragment. If nothing else it can be "Fixed handling of..."

@embray

embray Jan 12, 2016

Member

This is a fragment. If nothing else it can be "Fixed handling of..."

@embray

This comment has been minimized.

Show comment
Hide comment
@embray

embray Jan 12, 2016

Member

Other than my minor nitpicks about the changelog this looks good--thanks @anchitjain1234

Member

embray commented Jan 12, 2016

Other than my minor nitpicks about the changelog this looks good--thanks @anchitjain1234

@anchitjain1234

This comment has been minimized.

Show comment
Hide comment
@anchitjain1234

anchitjain1234 Jan 12, 2016

Contributor

Thanks @embray for review. Will commit again after including all the changes suggested.

Contributor

anchitjain1234 commented Jan 12, 2016

Thanks @embray for review. Will commit again after including all the changes suggested.

@taldcroft taldcroft changed the title from Fix for 4437 to Fix for CDS reader requiring at least two description characters Jan 12, 2016

@taldcroft

This comment has been minimized.

Show comment
Hide comment
@taldcroft

taldcroft Jan 12, 2016

Member

You might be able to make a much smaller change by using a conditional match on descr with:

@@ -103,8 +103,8 @@ class CdsHeader(core.BaseHeader):
                                     (?P<end>   \d+)        \s+
                                     (?P<format> [\w.]+)     \s+
                                     (?P<units> \S+)        \s+
-                                    (?P<name>  \S+)        \s+
-                                    (?P<descr> \S.+)""",
+                                    (?P<name>  \S+)
+                                    (\s+ (?P<descr> \S.*))?""",
                                 re.VERBOSE)

Can you give this a try? If there is no matching description then match.group('descr') will be None, so you might just have something like (match.group('descr') or '').strip() later on.

Member

taldcroft commented Jan 12, 2016

You might be able to make a much smaller change by using a conditional match on descr with:

@@ -103,8 +103,8 @@ class CdsHeader(core.BaseHeader):
                                     (?P<end>   \d+)        \s+
                                     (?P<format> [\w.]+)     \s+
                                     (?P<units> \S+)        \s+
-                                    (?P<name>  \S+)        \s+
-                                    (?P<descr> \S.+)""",
+                                    (?P<name>  \S+)
+                                    (\s+ (?P<descr> \S.*))?""",
                                 re.VERBOSE)

Can you give this a try? If there is no matching description then match.group('descr') will be None, so you might just have something like (match.group('descr') or '').strip() later on.

@taldcroft

This comment has been minimized.

Show comment
Hide comment
@taldcroft

taldcroft Jan 12, 2016

Member

Your test case could actually just modify a couple of the descriptions in the existing tests/t/cds.dat file, e.g.:

================================================================================
Byte-by-byte Description of file: datafile3.txt
--------------------------------------------------------------------------------
   Bytes Format Units  Label  Explanations
--------------------------------------------------------------------------------
   1-  3 I3     ---    Index
   5-  6 I2     h      RAh    Hour of Right Ascension (J2000) 
   8-  9 I2     min    RAm    Minute of Right Ascension (J2000) 
  11- 15 F5.2   s      RAs
                              - continuation of description
      17 A1     ---    DE-    Sign of the Declination (J2000)
  18- 19 I2     deg    DEd    Degree of Declination (J2000) 
  21- 22 I2     arcmin DEm    Arcminute of Declination (J2000) 
  24- 27 F4.1   arcsec DEs    Arcsecond of Declination (J2000) 
  29- 68 A40    ---    Match  Literature match 
  70- 75 A6     ---    Class  Source classification (1)
  77-80  F4.2   mag    AK     ? The K band extinction (2) 
  82-86  F5.2   ---    Fit    ? Fit of IRAC photometry (3)
--------------------------------------------------------------------------------

The testing should actually verify that the final description attribute of the relevant table columns are as expected.

It's possible I'm just not seeing it, but as far as I can see there doesn't seem to be any testing of parsing the description for CDS. That's definitely an oversight and should be fixed here while you're there. 😄

Member

taldcroft commented Jan 12, 2016

Your test case could actually just modify a couple of the descriptions in the existing tests/t/cds.dat file, e.g.:

================================================================================
Byte-by-byte Description of file: datafile3.txt
--------------------------------------------------------------------------------
   Bytes Format Units  Label  Explanations
--------------------------------------------------------------------------------
   1-  3 I3     ---    Index
   5-  6 I2     h      RAh    Hour of Right Ascension (J2000) 
   8-  9 I2     min    RAm    Minute of Right Ascension (J2000) 
  11- 15 F5.2   s      RAs
                              - continuation of description
      17 A1     ---    DE-    Sign of the Declination (J2000)
  18- 19 I2     deg    DEd    Degree of Declination (J2000) 
  21- 22 I2     arcmin DEm    Arcminute of Declination (J2000) 
  24- 27 F4.1   arcsec DEs    Arcsecond of Declination (J2000) 
  29- 68 A40    ---    Match  Literature match 
  70- 75 A6     ---    Class  Source classification (1)
  77-80  F4.2   mag    AK     ? The K band extinction (2) 
  82-86  F5.2   ---    Fit    ? Fit of IRAC photometry (3)
--------------------------------------------------------------------------------

The testing should actually verify that the final description attribute of the relevant table columns are as expected.

It's possible I'm just not seeing it, but as far as I can see there doesn't seem to be any testing of parsing the description for CDS. That's definitely an oversight and should be fixed here while you're there. 😄

@taldcroft

This comment has been minimized.

Show comment
Hide comment
@taldcroft

taldcroft Jan 12, 2016

Member

@anchitjain1234 - as mentioned earlier you can continue with this same branch / PR in your evolving solution. At the end if there are too many superfluous commits we'll walk you through cleaning things up.

Member

taldcroft commented Jan 12, 2016

@anchitjain1234 - as mentioned earlier you can continue with this same branch / PR in your evolving solution. At the end if there are too many superfluous commits we'll walk you through cleaning things up.

@anchitjain1234

This comment has been minimized.

Show comment
Hide comment
@anchitjain1234

anchitjain1234 Jan 12, 2016

Contributor

@taldcroft Yes I was only testing the case when description is empty and not focusing on parsing.
So I should remove added test and files and modify tests/t/cds.dat file?

Contributor

anchitjain1234 commented Jan 12, 2016

@taldcroft Yes I was only testing the case when description is empty and not focusing on parsing.
So I should remove added test and files and modify tests/t/cds.dat file?

@taldcroft

This comment has been minimized.

Show comment
Hide comment
@taldcroft

taldcroft Jan 12, 2016

Member

So I should remove added test and files and modify tests/t/cds.dat file?

Yes.

Member

taldcroft commented Jan 12, 2016

So I should remove added test and files and modify tests/t/cds.dat file?

Yes.

anchitjain1234 added some commits Jan 12, 2016

CHANGES.rst update. Fixes #4437
Changed issue number
@anchitjain1234

This comment has been minimized.

Show comment
Hide comment
@anchitjain1234

anchitjain1234 Jan 13, 2016

Contributor

Added test for document parsing. Please check them.

Contributor

anchitjain1234 commented Jan 13, 2016

Added test for document parsing. Please check them.

@pllim pllim added the io.ascii label Jan 14, 2016

@anchitjain1234

This comment has been minimized.

Show comment
Hide comment
@anchitjain1234

anchitjain1234 Jan 14, 2016

Contributor

@taldcroft Is it OK?

Contributor

anchitjain1234 commented Jan 14, 2016

@taldcroft Is it OK?

Show outdated Hide outdated CHANGES.rst
@@ -134,6 +134,8 @@ Bug fixes
- ``astropy.io.ascii``
- Fixed handling of CDS data file when no description is given. [#4437]

This comment has been minimized.

@hamogu

hamogu Jan 15, 2016

Member

@taldcroft : Is this for 1.2 or 1.1.2 as a bug fix?

@hamogu

hamogu Jan 15, 2016

Member

@taldcroft : Is this for 1.2 or 1.1.2 as a bug fix?

Show outdated Hide outdated astropy/io/ascii/tests/test_cds_header_from_readme.py
assert_equal(table['Wave'].description, 'wave? Wavelength in Angstroms')
assert_equal(table['El'].description, 'a')
assert_equal(table['ion'].description, '- Ionization stage (1 for neutral element)')

This comment has been minimized.

@hamogu

hamogu Jan 15, 2016

Member

Can we also test one of EW, e_EW, or Q? They are a little special in the sense that there is also the missing value markup (?=-9.9) in the description string.

@hamogu

hamogu Jan 15, 2016

Member

Can we also test one of EW, e_EW, or Q? They are a little special in the sense that there is also the missing value markup (?=-9.9) in the description string.

This comment has been minimized.

@anchitjain1234

anchitjain1234 Jan 15, 2016

Contributor

But this markup for missing value wont change actual description right?

@anchitjain1234

anchitjain1234 Jan 15, 2016

Contributor

But this markup for missing value wont change actual description right?

This comment has been minimized.

@taldcroft

taldcroft Jan 15, 2016

Member

That's the point of the test, to confirm that the missing value markup does not end up in the description.

@taldcroft

taldcroft Jan 15, 2016

Member

That's the point of the test, to confirm that the missing value markup does not end up in the description.

This comment has been minimized.

@anchitjain1234

anchitjain1234 Jan 15, 2016

Contributor

@taldcroft @hamogu So table['Q'].description should be DAOSPEC quality parameter Q(large values are bad) not ?=-9.999 DAOSPEC quality parameter Q(large values are bad)

@anchitjain1234

anchitjain1234 Jan 15, 2016

Contributor

@taldcroft @hamogu So table['Q'].description should be DAOSPEC quality parameter Q(large values are bad) not ?=-9.999 DAOSPEC quality parameter Q(large values are bad)

This comment has been minimized.

@hamogu

hamogu Jan 15, 2016

Member

The markup value is processed in line 128 in cds.py.
Currently, the description string is unaltered from what you read in, i.e. the ?=-9.9 is part of the description. What we really should do is something like this:

match = re.match(
                     r'\? (?P<equal> =)? (?P<nullval> \S* (?P<descriptiontext> .))', col.description, re.VERBOSE)
col.description = match.group('descriptiontext')

(untested, but something like that.)
Not sure how that works out with line breaks in the description 0 I think those are taken care of in an earlier stage. I would just try and see if it works.
Technically is this outside of the scope of your PR ("requiring at least two description characters") and if you don't want to do it we can postpone it to another PR but it would be great if you could take care of it, while you are working on the description stuff anyway.

@hamogu

hamogu Jan 15, 2016

Member

The markup value is processed in line 128 in cds.py.
Currently, the description string is unaltered from what you read in, i.e. the ?=-9.9 is part of the description. What we really should do is something like this:

match = re.match(
                     r'\? (?P<equal> =)? (?P<nullval> \S* (?P<descriptiontext> .))', col.description, re.VERBOSE)
col.description = match.group('descriptiontext')

(untested, but something like that.)
Not sure how that works out with line breaks in the description 0 I think those are taken care of in an earlier stage. I would just try and see if it works.
Technically is this outside of the scope of your PR ("requiring at least two description characters") and if you don't want to do it we can postpone it to another PR but it would be great if you could take care of it, while you are working on the description stuff anyway.

This comment has been minimized.

@anchitjain1234

anchitjain1234 Jan 15, 2016

Contributor

I would try to do it.

@anchitjain1234

anchitjain1234 Jan 15, 2016

Contributor

I would try to do it.

Show outdated Hide outdated astropy/io/ascii/tests/t/cds/description/table.dat
Cr110 2108 8772.86 Al1 4.02 -0.32 87.6 5.1 0.957
Cr110 2108 8773.90 Al1 4.02 -0.16 118.6 14.6 0.736
Cr110 2108 5853.67 Ba2 0.60 -1.00 121.9 5.5 1.435
Cr110 2108 6141.71 Ba2 0.70 -0.08 191.0 8.7 1.117

This comment has been minimized.

@hamogu

hamogu Jan 15, 2016

Member

For the test it woul be totally sufficient to have 2 lines of data here, but this is fine with me, too.

@hamogu

hamogu Jan 15, 2016

Member

For the test it woul be totally sufficient to have 2 lines of data here, but this is fine with me, too.

@taldcroft taldcroft self-assigned this Jan 15, 2016

@anchitjain1234

This comment has been minimized.

Show comment
Hide comment
@anchitjain1234

anchitjain1234 Jan 17, 2016

Contributor

@taldcroft @hamogu Is it alright now?

Contributor

anchitjain1234 commented Jan 17, 2016

@taldcroft @hamogu Is it alright now?

@hamogu

This comment has been minimized.

Show comment
Hide comment
@hamogu

hamogu Jan 17, 2016

Member

Looks good to me!
Thanks for going through this. It's often the seemingly simple and innocent changes that take a lot longer than one thinks, but fixing these seemingly small issues is very important for astropy as a whole, because users easily get annoyed by them ("Grr. I guess the table reading just does not work. Stupid package."), probably more so than by big missing features ("How do I plot my spectrum? Ah, I guess astropy just is not a package that deals with spectroscopy.")

Member

hamogu commented Jan 17, 2016

Looks good to me!
Thanks for going through this. It's often the seemingly simple and innocent changes that take a lot longer than one thinks, but fixing these seemingly small issues is very important for astropy as a whole, because users easily get annoyed by them ("Grr. I guess the table reading just does not work. Stupid package."), probably more so than by big missing features ("How do I plot my spectrum? Ah, I guess astropy just is not a package that deals with spectroscopy.")

taldcroft added a commit that referenced this pull request Jan 18, 2016

Merge pull request #4474 from anchitjain1234/cds-fix
Fix bugs and improve header parsing for CDS reader

@taldcroft taldcroft merged commit 27b9a39 into astropy:master Jan 18, 2016

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.06%) to 76.604%
Details

@taldcroft taldcroft added the Bug label Jan 18, 2016

@taldcroft

This comment has been minimized.

Show comment
Hide comment
@taldcroft
Member

taldcroft commented Jan 18, 2016

Thanks @anchitjain1234 !!

@anchitjain1234 anchitjain1234 deleted the anchitjain1234:cds-fix branch Jan 19, 2016

@embray

This comment has been minimized.

Show comment
Hide comment
@embray

embray Jan 19, 2016

Member

I'm also confused. The changelog entry for this is in the section for v1.2. But the milestone says v1.1.2. Which is it?

Member

embray commented Jan 19, 2016

I'm also confused. The changelog entry for this is in the section for v1.2. But the milestone says v1.1.2. Which is it?

@taldcroft

This comment has been minimized.

Show comment
Hide comment
@taldcroft

taldcroft Jan 19, 2016

Member

Sorry, didn't notice that the changelog entry was in the wrong section. This should be 1.1.2 if possible.

Member

taldcroft commented Jan 19, 2016

Sorry, didn't notice that the changelog entry was in the wrong section. This should be 1.1.2 if possible.

@anchitjain1234

This comment has been minimized.

Show comment
Hide comment
@anchitjain1234

anchitjain1234 Jan 19, 2016

Contributor

Should I create a new pull request for updating CHANGES.rst or push to this pull request only?

Contributor

anchitjain1234 commented Jan 19, 2016

Should I create a new pull request for updating CHANGES.rst or push to this pull request only?

@taldcroft

This comment has been minimized.

Show comment
Hide comment
@taldcroft

taldcroft Jan 19, 2016

Member

@anchitjain1234 - this has been merged already. The changelog will get fixed one way or another, and you don't need to worry about it at this point. Thanks for asking though!

Member

taldcroft commented Jan 19, 2016

@anchitjain1234 - this has been merged already. The changelog will get fixed one way or another, and you don't need to worry about it at this point. Thanks for asking though!

@embray

This comment has been minimized.

Show comment
Hide comment
@embray

embray Jan 19, 2016

Member

Partly my fault--I didn't put the section for v1.1.2 in the master branch after I released 1.1.1.

Member

embray commented Jan 19, 2016

Partly my fault--I didn't put the section for v1.1.2 in the master branch after I released 1.1.1.

embray added a commit that referenced this pull request Jan 19, 2016

embray added a commit that referenced this pull request Jan 19, 2016

Merge pull request #4474 from anchitjain1234/cds-fix
Fix bugs and improve header parsing for CDS reader
@embray

This comment has been minimized.

Show comment
Hide comment
@embray

embray Jan 19, 2016

Member

Fixed it.

Member

embray commented Jan 19, 2016

Fixed it.

dhomeier added a commit to dhomeier/astropy that referenced this pull request Jun 12, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment