New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-34484: Update tests to remove dependency on obs_test. #278
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. I have one performance comment in the tests and confusion about the return values of the mock not matching (I think) the base class.
tests/test_refObjLoader.py
Outdated
|
||
class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase, | ||
lsst.utils.tests.TestCase): | ||
def setUp(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a shame that this gets run for every test when seemingly it's a readonly butler that is created and the tests aren't multi-threaded (at least I don't see any of the tests writing to a butler since the butler is not stored anywhere apart from the deferred dataset refs). Otherwise there is a lot of overhead creating a butler and populating it and then doing it all again. Can you see if setUpClass works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running all the tests here only takes 10s or so, even with the overhead. I was having trouble with setupClass following the the same format of the other test this was based on. There seemed to be some sort of filename clashes when running with xdist. The performance didn't seem like a problem and I didn't want to spend too much time trying to optimize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Ok. It's a bit odd because all the temp dirs in here use tempfile.TemporaryDirectory()
. Something to optimize in the future then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem might have been before I fixed all the temporary directory calls. But once I got it working I didn't want to poke it any further...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change works for me:
diff --git a/tests/test_refObjLoader.py b/tests/test_refObjLoader.py
index 1941b559..4519d4e2 100644
--- a/tests/test_refObjLoader.py
+++ b/tests/test_refObjLoader.py
@@ -39,15 +39,18 @@ import ingestIndexTestBase
class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
lsst.utils.tests.TestCase):
- def setUp(self):
+
+ @classmethod
+ def setUpClass(cls):
+ super().setUpClass()
np.random.seed(10)
# Generate a second catalog, with different ids
inTempDir = tempfile.TemporaryDirectory()
inPath = inTempDir.name
- skyCatalogFile, _, skyCatalog = self.makeSkyCatalog(inPath, idStart=25, seed=123)
+ skyCatalogFile, _, skyCatalog = cls.makeSkyCatalog(inPath, idStart=25, seed=123)
- self.skyCatalog = skyCatalog
+ cls.skyCatalog = skyCatalog
# override some field names, and use multiple cores
config = ingestIndexTestBase.makeConvertConfig(withRaDecErr=True, withMagErr=True,
@@ -59,8 +62,8 @@ class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
config.file_reader.format = 'ascii.commented_header'
config.n_processes = 1
config.id_name = 'id' # Use the ids from the generated catalogs
- self.repoTempDir = tempfile.TemporaryDirectory()
- repoPath = self.repoTempDir.name
+ cls.repoTempDir = tempfile.TemporaryDirectory()
+ repoPath = cls.repoTempDir.name
# Convert the input data files to our HTM indexed format.
dataTempDir = tempfile.TemporaryDirectory()
@@ -69,7 +72,7 @@ class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
converter.run([skyCatalogFile])
# Make a temporary butler to ingest them into.
- butler = self.makeTemporaryRepo(repoPath, config.dataset_config.indexer.active.depth)
+ butler = cls.makeTemporaryRepo(repoPath, config.dataset_config.indexer.active.depth)
dimensions = [f"htm{depth}"]
datasetType = DatasetType(config.dataset_config.ref_dataset_name,
dimensions,
@@ -95,8 +98,8 @@ class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
for dataRef in datasetRefs:
handles.append(DeferredDatasetHandle(butler=butler, ref=dataRef, parameters=None))
- self.datasetRefs = datasetRefs
- self.handles = handles
+ cls.datasetRefs = datasetRefs
+ cls.handles = handles
inTempDir.cleanup()
dataTempDir.cleanup()
@@ -270,10 +273,12 @@ class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
)
self.assertEqual(result.fluxField, 'a_flux')
- def tearDown(self):
+ @classmethod
+ def tearDownClass(cls):
+ super().tearDownClass()
# Clear this to clean up the temporary directory.
- self.repoTempDir.cleanup()
- del self.repoTempDir
+ cls.repoTempDir.cleanup()
+ del cls.repoTempDir
class TestMemory(lsst.utils.tests.MemoryTestCase):
and takes the non-multiprocessing testing from 16 seconds to 6 seconds and still works with xdist.
1110105
to
586906b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing those docstrings!
I'm concerned about MockLoadReferenceObjects
looking rather complicated. I think we should be able to pretty easily create a minimal butler or use unittest.mocks in e.g. meas_astrom
.
I fear that just looking at coverage HTML isn't enough for this: the refcat code is not well isolated, so measured coverage might not change but we might actually lose out on things being covered (like the version0 tests). There are also multiple different files that test refcat-related things, in redundant and overlapping ways.
tests/test_refObjLoader.py
Outdated
self.assertFloatsAlmostEqual(cat_pm["coord_raErr"], predictedRaErr) | ||
self.assertFloatsAlmostEqual(cat_pm["coord_decErr"], predictedDecErr) | ||
|
||
def test_requireProperMotion(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not at all convinced this is testing the same things as the testRequireProperMotion
from test_htmIndex
. That test had very specific cases, using a mocked butler to force the return of a specific file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is testing the same thing as the current code, unless I'm missing something. Looking at the current code path,
meas_algorithms/tests/test_htmIndex.py
Lines 155 to 156 in f36ae6a
IngestIndexedReferenceTask.parseAndRun(args=[cls.input_dir, "--output", cls.testRepoPath, | |
cls.skyCatalogFile], config=config) |
meas_algorithms/tests/ingestIndexTestBase.py
Line 203 in f36ae6a
cls.skyCatalogFile, cls.skyCatalogFileDelim, cls.skyCatalog = cls.makeSkyCatalog(cls.outPath) |
makeSkyCatalog()
.
tests/test_refObjLoader.py
Outdated
self.assertFloatsNotEqual(cat['coord_ra'], catWithEpoch['coord_ra'], rtol=1.0e-4) | ||
self.assertFloatsNotEqual(cat['coord_dec'], catWithEpoch['coord_dec'], rtol=1.0e-4) | ||
|
||
def test_filterMap(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd rather this test didn't exist, and we just relied on the ones in test_loadReferenceObjects.py
, but I don't know if this one provides any specific coverage that that doesn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm copying over the tests that were there before in gen2-only land. But the slight difference is that this is testing the return values of loadSkyCircle
which the tests in test_loadReferenceObjects.py
are not (because it has a dummy loadSkyCircle
). The second test (that the values are equal) is I guess redundant with the schema mapping tests, but it's not expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, this test hits
ReferenceObjectLoaderBase._addFluxAliases(refCat.schema, anyFilterMapsToThis, filterMap) |
np.rad2deg(self._cat['coord_dec']), | ||
radius.asDegrees() | ||
) | ||
sel = np.zeros(len(self._cat), dtype=bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sel
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to selected
.
|
||
|
||
class MockLoadReferenceObjects(ReferenceObjectLoader): | ||
"""A simple mock of LoadReferenceObjects for tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is a "simple" mock: it seems like there's a lot going on in here. You've effectively re-implemented most of the important parts of ReferenceObjectLoader
in here and I'm concerned that this hides some of the things we want to test. I wonder if we could get by with some unittest.mock
s of dataId/datasetRef instead? Or even have a very minimal butler live in meas_algorithms/tests/data
?
tests/test_refObjLoader.py
Outdated
lsst.utils.tests.TestCase): | ||
|
||
@classmethod | ||
def setUpClass(cls): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This setUpClass is very long. Is all of this really necessary, and/or could it be split out into free functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just following testIngestTwoFilesTwoCores
. This is what's needed to ingest and setup the tests. I could put it into a function that's called by setUpClass
I suppose but I don't know what that gains.
586906b
to
6aacfd3
Compare
@@ -193,7 +189,8 @@ def tearDownClass(cls): | |||
|
|||
@classmethod | |||
def setUpClass(cls): | |||
cls.outPath = tempfile.mkdtemp() | |||
cls.outDir = tempfile.TemporaryDirectory() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to avoid using this in tests because it can lead to test failures over NFS caused by TemporaryDirectory not using the ignore_errors flag in shutil when it automatically runs the cleanup (because of nfs lock files hanging around). You might get away with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh no! Okay, we had a mix before and I thought that mkdtemp
was old and TemporaryDirectory
was new. I can move everything over to mkdtemp
. The problem I was trying to solve was that the temporary directories aren't being cleaned up. (These might be fine because they all go in /tmp
and who mounts /tmp
over nfs?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other people may be excited about fancy case syntax, but ignore_cleanup_errors
is what I'm really looking forward to in Python 3.10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh. They've finally fixed it to allow you to say you don't care. If only someone would write a 3.10 migration RFC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Don't we have lsst.utils.tests.temporaryDirectory
that tries to fix this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: should I move this all back to mkdtemp and add the robust cleanup now, or let this slide until we get to 3.10? I think it's fine because /tmp
shouldn't be nfs mounted, but that doesn't mean it can't be nfs mounted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 to lsst.utils.tests.temporaryDirectory
as the solution for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. That's a context manager so does it work against setup/teardown which we need here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 You are probably right that /tmp is not going to mess us up. I think the problem we had before was from people creating temp directories within the tests/
directory (which is still common).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any notes about temporary directory usage in the dev guide, and I wasn't aware of that problem. Should we add a section about that to the python testing guide?
28d82ce
to
b1808c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A handful of new comments, but I think this is pretty close now. I filed an RFC about removing support for version0 refcats, but at least we know they should work in gen3 now.
""" | ||
def __init__(self, filenames, name='cal_ref_cat', config=None, htmLevel=4): | ||
if config is None: | ||
config = LoadReferenceObjectsConfig() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm... looking at the base class, I think how we manage the config in ReferenceObjectLoaderBase
is busted: if the config is None
in the base class, we should create a default config there. The base class is not a Task, so it doesn't do that automatically like the gen2 code did. Don't need to fix it on this ticket, but we should create one to deal with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an easy fix, I can just add it on this ticket.
@@ -55,11 +56,14 @@ def test_main_args(self): | |||
# Test with sets because the glob can come out in any order. | |||
self.assertEqual(set(run.call_args.args[0]), set(self.expected_files)) | |||
|
|||
outdir.cleanup() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're using TemporaryDirectory
, you don't need to call cleanup
: it should be cleaned up automatically when it goes out of scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. In some cases I see warnings, but I guess that's when it's not fully local?
tests/test_loadReferenceObjects.py
Outdated
@@ -235,6 +237,95 @@ def make_catalog(): | |||
self.assertIsNone(newRefCat) | |||
|
|||
|
|||
class ConvertReferenceCatalogConfigValidateTestCase(lsst.utils.tests.TestCase): | |||
"""Test validation of IngestIndexReferenceConfig.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IngestIndexReferenceConfig -> ConvertReferenceCatalogConfig
tests/test_referenceObjectLoader.py
Outdated
@classmethod | ||
def setUpClass(cls): | ||
super().setUpClass() | ||
np.random.seed(10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to seed here: makeSkyCatalog
does its own seeding.
|
||
|
||
class MockReferenceObjectLoaderFromFiles(ReferenceObjectLoader): | ||
"""A simple mock of ReferenceObjectLoader to use a set of files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, let'ss expand this docstring to something like: "A ReferenceObjectLoader that doesn't know about regions on the sky or the butler, and takes the list of files on disk to mock dataIds."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mock ReferenceObjectLoader uses a set of files on disk to create
mock dataIds and data reference handles that can be accessed
without a butler. The files must be afw catalog files in the reference
catalog format, sharded with HTM pixelization.
tests/test_referenceObjectLoader.py
Outdated
super().tearDownClass() | ||
# Clear this to clean up the temporary directory. | ||
cls.repoTempDir.cleanup() | ||
del cls.repoTempDir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure none of the above is necessary.
tests/test_referenceObjectLoader.py
Outdated
loaderConfig = ReferenceObjectLoader.ConfigClass() | ||
loader = ReferenceObjectLoader([dataRef.dataId for dataRef in self.datasetRefs], | ||
self.handles, | ||
config=loaderConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any empty config like this one should go away: thanks for fixing the base class to have it create a config!
@@ -233,6 +242,10 @@ def test_getCatalog(self): | |||
self.assertFloatsEqual(newcat[:len(catalog)]['coord_ra'], catalog['coord_ra']) | |||
self.assertFloatsEqual(newcat[:len(catalog)]['coord_dec'], catalog['coord_dec']) | |||
|
|||
def tearDown(self): | |||
self.tempDir.cleanup() | |||
self.tempDir2.cleanup() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More unnecessary cleanups
@@ -109,6 +112,10 @@ def runTest(withRaDecErr): | |||
self.checkAllRowsInRefcat(loader, skyCatalog1, config) | |||
self.checkAllRowsInRefcat(loader, skyCatalog2, config) | |||
|
|||
inTempDir1.cleanup() | |||
inTempDir2.cleanup() | |||
dataTempDir.cleanup() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just remove all the calls to cleanup
in this branch.
No description provided.