DM-34484: Update tests to remove dependency on obs_test. #278

erykoff · 2022-04-19T22:07:57Z

No description provided.

timj

Looks great. I have one performance comment in the tests and confusion about the return values of the mock not matching (I think) the base class.

python/lsst/meas/algorithms/testUtils.py

timj · 2022-04-19T23:58:16Z

tests/test_refObjLoader.py

+
+class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
+                       lsst.utils.tests.TestCase):
+    def setUp(self):


It's a shame that this gets run for every test when seemingly it's a read-only butler that is created and the tests aren't multi-threaded (at least I don't see any of the tests writing to a butler, since the butler is not stored anywhere apart from the deferred dataset refs). As it stands, there is a lot of overhead creating a butler and populating it and then doing it all again. Can you see if setUpClass works?

Running all the tests here only takes 10s or so, even with the overhead. I was having trouble with setUpClass following the same format as the other test this was based on. There seemed to be some sort of filename clashes when running with xdist. The performance didn't seem like a problem and I didn't want to spend too much time trying to optimize.

Hmm. Ok. It's a bit odd because all the temp dirs in here use tempfile.TemporaryDirectory(). Something to optimize in the future then.

The problem might have been before I fixed all the temporary directory calls. But once I got it working I didn't want to poke it any further...

This change works for me:

diff --git a/tests/test_refObjLoader.py b/tests/test_refObjLoader.py
index 1941b559..4519d4e2 100644
--- a/tests/test_refObjLoader.py
+++ b/tests/test_refObjLoader.py
@@ -39,15 +39,18 @@ import ingestIndexTestBase
 class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
                        lsst.utils.tests.TestCase):
-    def setUp(self):
+
+    @classmethod
+    def setUpClass(cls):
+        super().setUpClass()
         np.random.seed(10)
         # Generate a second catalog, with different ids
         inTempDir = tempfile.TemporaryDirectory()
         inPath = inTempDir.name
-        skyCatalogFile, _, skyCatalog = self.makeSkyCatalog(inPath, idStart=25, seed=123)
+        skyCatalogFile, _, skyCatalog = cls.makeSkyCatalog(inPath, idStart=25, seed=123)
-        self.skyCatalog = skyCatalog
+        cls.skyCatalog = skyCatalog
         # override some field names, and use multiple cores
         config = ingestIndexTestBase.makeConvertConfig(withRaDecErr=True, withMagErr=True,
@@ -59,8 +62,8 @@ class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
         config.file_reader.format = 'ascii.commented_header'
         config.n_processes = 1
         config.id_name = 'id'  # Use the ids from the generated catalogs
-        self.repoTempDir = tempfile.TemporaryDirectory()
-        repoPath = self.repoTempDir.name
+        cls.repoTempDir = tempfile.TemporaryDirectory()
+        repoPath = cls.repoTempDir.name
         # Convert the input data files to our HTM indexed format.
         dataTempDir = tempfile.TemporaryDirectory()
@@ -69,7 +72,7 @@ class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
         converter.run([skyCatalogFile])
         # Make a temporary butler to ingest them into.
-        butler = self.makeTemporaryRepo(repoPath, config.dataset_config.indexer.active.depth)
+        butler = cls.makeTemporaryRepo(repoPath, config.dataset_config.indexer.active.depth)
         dimensions = [f"htm{depth}"]
         datasetType = DatasetType(config.dataset_config.ref_dataset_name,
                                   dimensions,
@@ -95,8 +98,8 @@ class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
         for dataRef in datasetRefs:
             handles.append(DeferredDatasetHandle(butler=butler, ref=dataRef, parameters=None))
-        self.datasetRefs = datasetRefs
-        self.handles = handles
+        cls.datasetRefs = datasetRefs
+        cls.handles = handles
         inTempDir.cleanup()
         dataTempDir.cleanup()
@@ -270,10 +273,12 @@ class TestRefObjLoader(ingestIndexTestBase.ConvertReferenceCatalogTestBase,
         )
         self.assertEqual(result.fluxField, 'a_flux')
-    def tearDown(self):
+    @classmethod
+    def tearDownClass(cls):
+        super().tearDownClass()
         # Clear this to clean up the temporary directory.
-        self.repoTempDir.cleanup()
-        del self.repoTempDir
+        cls.repoTempDir.cleanup()
+        del cls.repoTempDir
 class TestMemory(lsst.utils.tests.MemoryTestCase):

and takes the non-multiprocessing testing from 16 seconds to 6 seconds and still works with xdist.
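For readers unfamiliar with the difference, here is a minimal stand-alone sketch of the setUpClass pattern discussed above. It is not the code from this PR: `SharedFixtureTestCase`, `build_count`, and `run_shared_fixture_tests` are hypothetical names, and the counter only stands in for the cost of creating and populating a butler repository.

```python
import os
import tempfile
import unittest


class SharedFixtureTestCase(unittest.TestCase):
    """Hypothetical test case: build_count stands in for the expensive
    work of creating and populating a repository."""

    build_count = 0

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # Runs once for the whole class, not once per test method.
        cls.repoTempDir = tempfile.TemporaryDirectory()
        cls.build_count += 1

    @classmethod
    def tearDownClass(cls):
        # Remove the shared directory after the last test in the class.
        cls.repoTempDir.cleanup()
        del cls.repoTempDir
        super().tearDownClass()

    def test_first(self):
        self.assertTrue(os.path.isdir(self.repoTempDir.name))

    def test_second(self):
        self.assertTrue(os.path.isdir(self.repoTempDir.name))


def run_shared_fixture_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(SharedFixtureTestCase)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With two test methods the fixture is built once rather than twice. The saving only holds as long as no test modifies the shared state.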

parejkoj

Thanks for fixing those docstrings!

I'm concerned about MockLoadReferenceObjects looking rather complicated. I think we should be able to pretty easily create a minimal butler or use unittest.mocks in e.g. meas_astrom.

I fear that just looking at coverage HTML isn't enough for this: the refcat code is not well isolated, so measured coverage might not change but we might actually lose out on things being covered (like the version0 tests). There are also multiple different files that test refcat-related things, in redundant and overlapping ways.

tests/test_refObjLoader.py

tests/test_htmIndex.py

tests/test_refObjLoader.py

parejkoj · 2022-04-22T23:44:30Z

tests/test_refObjLoader.py

+        self.assertFloatsAlmostEqual(cat_pm["coord_raErr"], predictedRaErr)
+        self.assertFloatsAlmostEqual(cat_pm["coord_decErr"], predictedDecErr)
+
+    def test_requireProperMotion(self):


I'm not at all convinced this is testing the same things as the testRequireProperMotion from test_htmIndex. That test had very specific cases, using a mocked butler to force the return of a specific file.

I believe this is testing the same thing as the current code, unless I'm missing something. Looking at the current code path,

meas_algorithms/tests/test_htmIndex.py

Lines 155 to 156 in f36ae6a

IngestIndexedReferenceTask.parseAndRun(args=[cls.input_dir, "--output", cls.testRepoPath,

cls.skyCatalogFile], config=config)

is what is ingested, and this comes from

meas_algorithms/tests/ingestIndexTestBase.py

Line 203 in f36ae6a

cls.skyCatalogFile, cls.skyCatalogFileDelim, cls.skyCatalog = cls.makeSkyCatalog(cls.outPath)

and this new test is also calling makeSkyCatalog().

tests/test_refObjLoader.py

parejkoj · 2022-04-23T00:33:04Z

tests/test_refObjLoader.py

+        self.assertFloatsNotEqual(cat['coord_ra'], catWithEpoch['coord_ra'], rtol=1.0e-4)
+        self.assertFloatsNotEqual(cat['coord_dec'], catWithEpoch['coord_dec'], rtol=1.0e-4)
+
+    def test_filterMap(self):


I think I'd rather this test didn't exist, and we just relied on the ones in test_loadReferenceObjects.py, but I don't know if this one provides any specific coverage that that doesn't.

I'm copying over the tests that were there before in gen2-only land. But the slight difference is that this is testing the return values of loadSkyCircle which the tests in test_loadReferenceObjects.py are not (because it has a dummy loadSkyCircle). The second test (that the values are equal) is I guess redundant with the schema mapping tests, but it's not expensive.

In particular, this test hits

meas_algorithms/python/lsst/meas/algorithms/loadReferenceObjects.py

Line 489 in f36ae6a

ReferenceObjectLoaderBase._addFluxAliases(refCat.schema, anyFilterMapsToThis, filterMap)

parejkoj · 2022-04-23T00:42:54Z

python/lsst/meas/algorithms/testUtils.py

+                np.rad2deg(self._cat['coord_dec']),
+                radius.asDegrees()
+            )
+        sel = np.zeros(len(self._cat), dtype=bool)


Renamed to selected.

parejkoj · 2022-04-23T00:47:33Z

python/lsst/meas/algorithms/testUtils.py

+
+
+class MockLoadReferenceObjects(ReferenceObjectLoader):
+    """A simple mock of LoadReferenceObjects for tests.


I'm not sure this is a "simple" mock: it seems like there's a lot going on in here. You've effectively re-implemented most of the important parts of ReferenceObjectLoader in here and I'm concerned that this hides some of the things we want to test. I wonder if we could get by with some unittest.mocks of dataId/datasetRef instead? Or even have a very minimal butler live in meas_algorithms/tests/data?
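A rough sketch of what the unittest.mock alternative might look like, under stated assumptions: `make_mock_handles` is a hypothetical helper, and the only behavior assumed of a handle is a `dataId` attribute and a `get()` method that returns the catalog. Whether that is enough for ReferenceObjectLoader is not established here.

```python
from unittest import mock


def make_mock_handles(catalogs_by_shard, depth=4):
    """Build mock (dataId, handle) pairs from {htm_index: catalog}.

    Hypothetical helper: each handle is a Mock exposing only ``dataId``
    and ``get()``; nothing here touches a real butler.
    """
    data_ids = []
    handles = []
    for shard, catalog in catalogs_by_shard.items():
        data_id = {f"htm{depth}": shard}
        handle = mock.Mock(name=f"handle_{shard}")
        handle.dataId = data_id
        # get() returns the pre-built catalog for this shard.
        handle.get.return_value = catalog
        data_ids.append(data_id)
        handles.append(handle)
    return data_ids, handles
```

A side benefit of mocks is that a test can assert which shards were actually read, for example with `handles[0].get.assert_called_once()`.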

parejkoj · 2022-04-23T00:49:13Z

tests/test_refObjLoader.py

+                       lsst.utils.tests.TestCase):
+
+    @classmethod
+    def setUpClass(cls):


This setUpClass is very long. Is all of this really necessary, and/or could it be split out into free functions?

This is just following testIngestTwoFilesTwoCores. This is what's needed to ingest and setup the tests. I could put it into a function that's called by setUpClass I suppose but I don't know what that gains.

timj · 2022-04-25T17:56:02Z

tests/ingestIndexTestBase.py

@@ -193,7 +189,8 @@ def tearDownClass(cls):

    @classmethod
    def setUpClass(cls):
-        cls.outPath = tempfile.mkdtemp()
+        cls.outDir = tempfile.TemporaryDirectory()


We try to avoid using this in tests because it can lead to test failures over NFS caused by TemporaryDirectory not using the ignore_errors flag in shutil when it automatically runs the cleanup (because of nfs lock files hanging around). You might get away with it.

Oh no! Okay, we had a mix before and I thought that mkdtemp was old and TemporaryDirectory was new. I can move everything over to mkdtemp. The problem I was trying to solve was that the temporary directories aren't being cleaned up. (These might be fine because they all go in /tmp and who mounts /tmp over nfs?)

Other people may be excited about fancy case syntax, but ignore_cleanup_errors is what I'm really looking forward to in Python 3.10.

Ooh. They've finally fixed it to allow you to say you don't care. If only someone would write a 3.10 migration RFC.

Hmm. Don't we have lsst.utils.tests.temporaryDirectory that tries to fix this?

Question: should I move this all back to mkdtemp and add the robust cleanup now, or let this slide until we get to 3.10? I think it's fine because /tmp shouldn't be nfs mounted, but that doesn't mean it can't be nfs mounted.

👍 to lsst.utils.tests.temporaryDirectory as the solution for now.

Hmmm. That's a context manager so does it work against setup/teardown which we need here?
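One way a context manager can be driven from class-level setup is `contextlib.ExitStack` combined with `addClassCleanup` (available since Python 3.8). The sketch below uses `tempfile.TemporaryDirectory` as a stand-in context manager; it makes no assumption about the signature of `lsst.utils.tests.temporaryDirectory`, and `ContextManagedDirTestCase` is a hypothetical name.

```python
import contextlib
import os
import tempfile
import unittest


class ContextManagedDirTestCase(unittest.TestCase):
    """Hypothetical test case that enters a context manager once per class."""

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        stack = contextlib.ExitStack()
        # Registered first, so it also runs if the rest of setUpClass fails.
        cls.addClassCleanup(stack.close)
        # enter_context() returns what ``with ... as`` would bind: the path.
        cls.tempDir = stack.enter_context(tempfile.TemporaryDirectory())

    def test_directory_exists(self):
        self.assertTrue(os.path.isdir(self.tempDir))


def run_context_managed_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(ContextManagedDirTestCase)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The stack is closed after the last test in the class, which exits the context manager and removes the directory.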

🤔 You are probably right that /tmp is not going to mess us up. I think the problem we had before was from people creating temp directories within the tests/ directory (which is still common).

I don't see any notes about temporary directory usage in the dev guide, and I wasn't aware of that problem. Should we add a section about that to the python testing guide?
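As a possible starting point for such a section, here is a sketch of the mkdtemp option with error-tolerant cleanup. `RobustTempDirTestCase` is a hypothetical name; the standard-library calls are real, and `ignore_cleanup_errors` is the `TemporaryDirectory` argument added in Python 3.10 that is mentioned above.

```python
import os
import shutil
import tempfile
import unittest


class RobustTempDirTestCase(unittest.TestCase):
    """Hypothetical test case using mkdtemp with tolerant cleanup."""

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.outPath = tempfile.mkdtemp()
        # ignore_errors=True keeps leftover files (such as NFS lock files)
        # from turning cleanup into a test failure.
        cls.addClassCleanup(shutil.rmtree, cls.outPath, ignore_errors=True)
        # On Python 3.10 or later a similar effect is available with
        # tempfile.TemporaryDirectory(ignore_cleanup_errors=True).

    def test_write(self):
        filename = os.path.join(self.outPath, "example.txt")
        with open(filename, "w") as stream:
            stream.write("data")
        self.assertTrue(os.path.exists(filename))


def run_robust_temp_dir_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(RobustTempDirTestCase)
    result = unittest.TestResult()
    suite.run(result)
    return result
```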

parejkoj

A handful of new comments, but I think this is pretty close now. I filed an RFC about removing support for version0 refcats, but at least we know they should work in gen3 now.

python/lsst/meas/algorithms/testUtils.py

parejkoj · 2022-05-02T22:16:09Z

python/lsst/meas/algorithms/testUtils.py

+    """
+    def __init__(self, filenames, name='cal_ref_cat', config=None, htmLevel=4):
+        if config is None:
+            config = LoadReferenceObjectsConfig()


hmmm... looking at the base class, I think how we manage the config in ReferenceObjectLoaderBase is busted: if the config is None in the base class, we should create a default config there. The base class is not a Task, so it doesn't do that automatically like the gen2 code did. Don't need to fix it on this ticket, but we should create one to deal with it.

It's an easy fix, I can just add it on this ticket.
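The fix being described is presumably along the following lines. This is a sketch with hypothetical stand-in classes (`ExampleConfig`, `ExampleLoaderBase`, `ExampleMockLoader`, `example_option`), not the actual ReferenceObjectLoader code: the base class builds a default config when none is given, so subclasses and callers no longer need to.

```python
class ExampleConfig:
    """Hypothetical stand-in for a loader configuration class."""

    def __init__(self):
        self.example_option = 1


class ExampleLoaderBase:
    """Hypothetical stand-in for the loader base class."""

    ConfigClass = ExampleConfig

    def __init__(self, config=None):
        # Create the default config here, once, instead of in every caller.
        if config is None:
            config = self.ConfigClass()
        self.config = config


class ExampleMockLoader(ExampleLoaderBase):
    """Subclass that simply forwards config, which may be None."""

    def __init__(self, filenames, config=None):
        super().__init__(config=config)
        self.filenames = list(filenames)
```

With this in place, `ExampleMockLoader(["a.fits"])` gets a default config, while an explicitly passed config is used unchanged.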

parejkoj · 2022-05-05T00:28:18Z

tests/test_convertReferenceCatalog.py

@@ -55,11 +56,14 @@ def test_main_args(self):
            # Test with sets because the glob can come out in any order.
            self.assertEqual(set(run.call_args.args[0]), set(self.expected_files))

+        outdir.cleanup()


Since you're using TemporaryDirectory, you don't need to call cleanup: it should be cleaned up automatically when it goes out of scope.

You're right. In some cases I see warnings, but I guess that's when it's not fully local?
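One possible source of such warnings, independent of where the directory lives: in CPython, a `TemporaryDirectory` that is garbage-collected without an explicit `cleanup()` still removes the directory, but emits a `ResourceWarning` ("Implicitly cleaning up ..."). The sketch below, with a hypothetical helper `cleanup_warnings`, records that warning; whether this is what was seen here is an assumption.

```python
import gc
import os
import tempfile
import warnings


def cleanup_warnings(explicit):
    """Create a TemporaryDirectory, dispose of it, and return its path
    together with any ResourceWarning that mentions that path."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        tmp = tempfile.TemporaryDirectory()
        path = tmp.name
        if explicit:
            # Explicit cleanup detaches the finalizer, so no warning later.
            tmp.cleanup()
        del tmp
        gc.collect()
    relevant = [
        w for w in caught
        if issubclass(w.category, ResourceWarning) and path in str(w.message)
    ]
    return path, relevant
```

ResourceWarning is ignored by Python's default filters, so it only becomes visible when the test runner or `-W` options enable it.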

parejkoj · 2022-05-05T00:33:34Z

tests/test_loadReferenceObjects.py

@@ -235,6 +237,95 @@ def make_catalog():
        self.assertIsNone(newRefCat)


+class ConvertReferenceCatalogConfigValidateTestCase(lsst.utils.tests.TestCase):
+    """Test validation of IngestIndexReferenceConfig."""


IngestIndexReferenceConfig -> ConvertReferenceCatalogConfig

parejkoj · 2022-05-05T00:41:09Z

tests/test_referenceObjectLoader.py

+    @classmethod
+    def setUpClass(cls):
+        super().setUpClass()
+        np.random.seed(10)


No need to seed here: makeSkyCatalog does its own seeding.
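To illustrate why an outer seed is redundant when the generating function seeds itself, here is a sketch using the standard-library `random` module. `make_sky_catalog` is a hypothetical stand-in, not the real makeSkyCatalog, which may seed differently (for example by re-seeding a global generator).

```python
import random


def make_sky_catalog(size, seed=123):
    """Hypothetical stand-in: return ``size`` (ra, dec) pairs in degrees,
    drawn from a generator seeded inside the function."""
    rng = random.Random(seed)
    return [
        (rng.uniform(0.0, 360.0), rng.uniform(-90.0, 90.0))
        for _ in range(size)
    ]
```

Because the generator is seeded from the `seed` argument, the global seed set by the caller has no effect on the output.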

parejkoj · 2022-05-05T00:50:50Z

python/lsst/meas/algorithms/testUtils.py

+
+
+class MockReferenceObjectLoaderFromFiles(ReferenceObjectLoader):
+    """A simple mock of ReferenceObjectLoader to use a set of files.


Oh, let's expand this docstring to something like: "A ReferenceObjectLoader that doesn't know about regions on the sky or the butler, and takes the list of files on disk to mock dataIds."

This mock ReferenceObjectLoader uses a set of files on disk to create mock dataIds and data reference handles that can be accessed without a butler. The files must be afw catalog files in the reference catalog format, sharded with HTM pixelization.

parejkoj · 2022-05-05T18:51:46Z

tests/test_referenceObjectLoader.py

+        super().tearDownClass()
+        # Clear this to clean up the temporary directory.
+        cls.repoTempDir.cleanup()
+        del cls.repoTempDir


I'm pretty sure none of the above is necessary.

parejkoj · 2022-05-05T18:52:40Z

tests/test_referenceObjectLoader.py

+        loaderConfig = ReferenceObjectLoader.ConfigClass()
+        loader = ReferenceObjectLoader([dataRef.dataId for dataRef in self.datasetRefs],
+                                       self.handles,
+                                       config=loaderConfig)


Any empty config like this one should go away: thanks for fixing the base class to have it create a config!

parejkoj · 2022-05-05T18:53:52Z

tests/nopytest_ingestIndexReferenceCatalog.py

@@ -233,6 +242,10 @@ def test_getCatalog(self):
        self.assertFloatsEqual(newcat[:len(catalog)]['coord_ra'], catalog['coord_ra'])
        self.assertFloatsEqual(newcat[:len(catalog)]['coord_dec'], catalog['coord_dec'])

+    def tearDown(self):
+        self.tempDir.cleanup()
+        self.tempDir2.cleanup()


More unnecessary cleanups

parejkoj · 2022-05-05T18:54:35Z

tests/nopytest_ingestIndexReferenceCatalog.py

@@ -109,6 +112,10 @@ def runTest(withRaDecErr):
            self.checkAllRowsInRefcat(loader, skyCatalog1, config)
            self.checkAllRowsInRefcat(loader, skyCatalog2, config)

+            inTempDir1.cleanup()
+            inTempDir2.cleanup()
+            dataTempDir.cleanup()


I think you can just remove all the calls to cleanup in this branch.

timj approved these changes Apr 20, 2022

erykoff force-pushed the tickets/DM-34484 branch 2 times, most recently from 1110105 to 586906b Compare April 20, 2022 15:43

parejkoj requested changes Apr 23, 2022

erykoff force-pushed the tickets/DM-34484 branch from 586906b to 6aacfd3 Compare April 25, 2022 17:53

timj reviewed Apr 25, 2022

erykoff force-pushed the tickets/DM-34484 branch 3 times, most recently from 28d82ce to b1808c2 Compare April 26, 2022 16:29

parejkoj reviewed May 5, 2022

erykoff force-pushed the tickets/DM-34484 branch from b1808c2 to 1502364 Compare May 5, 2022 15:31

erykoff added 5 commits May 5, 2022 11:48

Add gen3 ReferenceObjectLoader tests.

2348b1b

Remove old gen2 test_htmIndex tests.

947c0b7

Remove obs_test dependency.

87866af

Add MockReferenceObjectLoaderFromFiles for tests.

96abeed

Fix incorrect docstrings in ReferenceObjectLoader.

87c8c9d

erykoff force-pushed the tickets/DM-34484 branch from 1502364 to 3d4539a Compare May 5, 2022 18:48

parejkoj reviewed May 9, 2022

erykoff added 3 commits May 9, 2022 11:13

Update to use tempfile.TemporaryDirectory() for easy cleanup.

76a65f8

Properly reserve memory for refcat conversion.

741eb5e

Add default config to ReferenceObjectLoader.

a988c88

erykoff force-pushed the tickets/DM-34484 branch from 3d4539a to a988c88 Compare May 9, 2022 18:13

parejkoj approved these changes May 9, 2022

erykoff merged commit cdb44b6 into main May 9, 2022

erykoff deleted the tickets/DM-34484 branch May 9, 2022 20:19



TallJimbo Apr 25, 2022 • edited
