New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-13835: Cannot ingest empty data #33
Conversation
Having a parser quit the program makes sense when the user provides bad arguments to the main program, but is dangerous when it happens as an implementation detail of a submodule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handful of changes requested, but I don't need to see it again. Thanks for documenting the exceptions.
Raises | ||
------ | ||
RuntimeError | ||
Raised if there are no files to ingest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for documenting this. We need more docstrings that call out known exceptions.
python/lsst/ap/verify/ingestion.py
Outdated
""" | ||
if not dataFiles: | ||
raise RuntimeError("No raw files to ingest.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe be a little more explicit here, to help with debugging: "No raw files to ingest; dataFiles is %s."%dataFiles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if
statement guarantees that won't display anything interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataFiles
could be None
, False
, []
, ()
, {}
, ''
. It might be informative to know which of those it was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, weakly typed languages. 😞 I'll write it with %r
, then.
python/lsst/ap/verify/ingestion.py
Outdated
""" | ||
if not calibDataFiles: | ||
raise RuntimeError("No calib files to ingest.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, please include calibDataFiles
in the raised string.
python/lsst/ap/verify/ingestion.py
Outdated
with tarfile.open(defectTarball, "r") as opened: | ||
if opened.getNames(): | ||
if not os.path.isdir(defectDir): | ||
os.mkdir(defectDir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use pathlib here in place of if exist: mkdir
, to avoid os.mkdir
race conditions on parallel ingest:
pathlib.Path(defectDir).mkdir(parents=True, exist_ok=True)
https://stackoverflow.com/a/14364249
https://docs.python.org/3/library/pathlib.html#pathlib.Path.mkdir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure parallel ingestion is possible, but your solution is more elegant.
butler = self._rawButler() | ||
for datum in IngestionTestSuite.rawData: | ||
dataId = {'visit': datum['visit']} | ||
self.assertEqual(butler.datasetExists('raw', dataId), datum['file'] not in badFiles) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooog. I'm not too keen on this syntax (it was hard for me to parse), but I'm not sure of a better approach given the butler methods that exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could do the approach of if in badFiles: self.assertFalse(...) else: self.assertTrue(...)
, but I'd find that ickier.
Attempts to ingest empty lists of files now raise an exception. Unit tests have been updated to reflect the new behavior.
Path.mkdir provides a clean postcondition on the directory's existence, unlike os.mkdir.
This factoring makes it easier to share the information among test cases.
1fa5d53
to
3af2236
Compare
This PR changes the behavior of
DatasetIngestTask
's private methods from crashing to raising a specific exception. It also adds a unit test that could not be implemented because of this bug (at least, not while we were still usingtestdata_decam
).