DM-26343: Use less expansive definition of extension #354
Before I leave a review, is there a completely different path for |
I wasn't sure what to do with |
"""
special = {".gz", ".bz2", ".xz"}
I think I see that .fz is not "special" because it is handled natively by the fits formatter. Which is actually good, because then we can support fits.fz or .fz, which I think are both acceptable.
Yes, at the moment, even though .fits.fz is listed as a supported extension in the formatter, it's irrelevant because the .fz on its own is going to be enough to match.
("file", ""),
("flat_i_sim_1.4_blah.fits.gz", ".fits.gz"),
("flat_i_sim_1.4_blah.txt", ".txt"),
)
If we do support .fits and .fits.fz then I think that we should explicitly test that here.
Since .fz is not listed as a special extension at the moment, getExtension() will return .fz for .fits.fz.
I think that's fine! I would suggest putting those in the list so that it's clear what it will return (and that we know it's returning what we think it should return).
We generally want a single extension to be returned unless it's one of the standard compression extensions such as .gz. Now recognize those and return .fits.gz for "a.b.c.fits.gz" and .fits for "a.b.c.fits".
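The rule described in this commit message can be sketched roughly as follows. This is a standalone illustration, not the code in the PR: the function name `get_extension`, the `SPECIAL` constant, and the use of `pathlib.PurePosixPath` are assumptions made for the example. The set of compression suffixes is the one shown in the diff above.

```python
from pathlib import PurePosixPath

# Compression suffixes that are reported together with the suffix before them.
# Same set as in the diff; the constant name is an assumption for this sketch.
SPECIAL = {".gz", ".bz2", ".xz"}


def get_extension(path: str) -> str:
    """Hypothetical stand-in for getExtension(); not the project's code."""
    suffixes = PurePosixPath(path).suffixes
    if not suffixes:
        return ""
    ext = suffixes.pop()
    # Prepend the previous suffix only for recognized compression extensions.
    if suffixes and ext in SPECIAL:
        ext = suffixes[-1] + ext
    return ext
```

Under these assumptions, "a.b.c.fits.gz" gives ".fits.gz", "a.b.c.fits" gives ".fits", and "a.fits.fz" gives ".fz" because .fz is not in the special set, which matches the behavior discussed in the review comments.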
Rather than parsing extensions out of filenames, change to the safer approach of seeing if the file to be ingested ends with one of the supported extensions.
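A minimal sketch of the "ends with a supported extension" check, assuming the supported extensions are available as a plain collection of strings. The helper name `match_supported_extension` and the example extension set are hypothetical and are not taken from the formatter code.

```python
from typing import Iterable, Optional


def match_supported_extension(path: str, supported: Iterable[str]) -> Optional[str]:
    """Return the longest supported extension that ``path`` ends with, else None."""
    # Try longer extensions first so ".fits.fz" is preferred over ".fz".
    for ext in sorted(supported, key=len, reverse=True):
        if path.endswith(ext):
            return ext
    return None


# Hypothetical extension set, for illustration only.
supported = {".fits", ".fits.gz", ".fits.fz", ".fz"}
print(match_supported_extension("flat_i_sim_1.4_blah.fits", supported))
```

Because nothing is parsed out of the filename, the extra dots in a name such as "flat_i_sim_1.4_blah.fits" do not affect the result.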
This can happen if it is implemented as an instance property. In these cases we assume that the supported extensions class property is complete.
These changes allow "flat_i_sim_1.4_blah.fits" to be ingested as a fits file.