New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-13851: Speed up ap_verify unit tests #34
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few initial comments here. I sketched out how I would use MagicMock to replace your custom classes in a branch; diff it against this branch to see my changes:
https://github.com/lsst-dm/ap_verify/tree/u/parejkoj/MagicMock
The tests all pass. I think there may be an even nicer way to mock the butler, which I'm poking at. I don't know if the above is what you tried when you looked at using unittest.mock?
(`testDataset`), and skip all tests if the dataset is not available. | ||
|
||
Subclasses must call `DataTestCase.setUpClass()` if they override | ||
``setUpClass`` themselves. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Subclasses must call super().setUpClass()
..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: never mind; the encapsulation violations I was worried about are a consequence of how classes work in Python (specifically, the distinction between initialization and construction), not of super()
or the MRO. After thinking about it some more, I realized that they will also happen with explicitly named bases.
from lsst.ap.verify.config import Config | ||
|
||
|
||
class DataTestCase(lsst.utils.tests.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't have this derive from TestCase
unless you want it to potentially be picked up as a test. Subclasses that are tests would then derive from this and TestCase
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure? We used this pattern a lot in afw
and didn't run into any problems. The fact that it's in python/
(necessary to ensure it's on the path) should make it immune to that kind of bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the test classes themselves inherit from DataTestCase
so I don't think there is a problem. The problem is when you put a class in a test file that looks like a class of tests but isn't really.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised that pattern is in afw
, I'd have pushed back on it in review. DataTestCase
is not a TestCase
, it's essentially a mixin for things that are TestCase
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't agree that it's a mixin (though maybe that word means something different in Python than in Java); it's a specific type of test case rather than extra functionality.
Anyway, I propose we revisit this later (and maybe the aforementioned afw
test utilities at the same time), since this bit of code sharing is not relevant to the actual speed-up work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Maybe I should make a Community post to discuss it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe discuss with Russell first, because lsst.afw.geom.testUtils.TransformTestBaseClass
(the "lots" of uses in afw
I thought I remembered 😰) was his idea.
# Hack the config for testing purposes | ||
# Note that Config.instance is supposed to be immutable, so, depending on initialization order, | ||
# this modification may cause other tests to see inconsistent config values | ||
Config.instance._allInfo['datasets.' + cls.datasetKey] = cls.testDataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This worries me. Why do you have to do it this way? Can't you set the config values in an actual Config
instance in setUp()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying that instead of a Config
singleton, each class/function in ap_verify
should take a Config
object for testing convenience?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing that worries me is the "depending on initialization order... see inconsistent config values" bit. It also just feels hacky ("hack the config" afterall).
I absolutely wouldn't advocate the "config object for testing convenience" suggestion you propose. That's definitely clumsy.
Looking at it more, I thought Config
was a pex_config type of thing. Instead, it's a manager for those. I'm not entirely sure what its real purpose is, but I guess the above is the way to do what you want with a singleton.
tests/test_association.py
Outdated
"""An emulator for `lsst.daf.persistence.Butler.get` that can only handle test data. | ||
""" | ||
# No cleaner way to test if dict contains all key-value pairs in dataIdDict? | ||
if dataIdDict.items() <= dataId.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly how to check whether dataIdDict
is a subset of dataId
: dict.items()
returns a dict_items
, which behaves like a set. You can rephrase your comment to say "check whether dataIdDict is a subset of dataId", if you want to clarify it there.
Here's the original PEP about it, if you're curious:
tests/test_association.py
Outdated
|
||
def fetchall(self): | ||
"""An emulator for `sqlite3.Cursor.fetchall`, returns results for known queries. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably would be good to list the "known" queries here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a feeling the documentation would quickly get out of date, but ok...
tests/test_association.py
Outdated
metricName='association.numTotalUnassociatedDiaObjects') | ||
with sqlite3.connect(":memory:") as conn: | ||
cursor = conn.cursor() | ||
with self.assertRaises(sqlite3.OperationalError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it's just testing that sqlite3 raises an exception when you give it bad input. I don't understand what functionality it's testing in the code itself, other than exception pass-through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. @morriscb, was there some specific functionality in ap.verify.measurements
being tested in testInvalidDb
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. As I recall I was mostly just parroting other "assertRaises" tests the repository. I'm happy to see it gone if it's not thought to be necessary.
c070ee4
to
cce7d28
Compare
Making Dataset creation the argument parser's responsibility, while reducing duplicate program code, meant that test_args now depends on the --dataset argument being instantiable. To avoid test failures when no datasets are installed, I've shared the dummy dataset code from test_dataset.
This test significantly lengthened the test running time, but essentially tested functionality for the Butler rather than the Workspace.
cce7d28
to
98f721b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handful of further comments, mostly on docs. Thanks again for humoring me on this approach, and I'm glad you were able to make it work.
Per our conversation, it seems we both agree that this is as viable an approach as your previous Butler stubs (given the "immanent" replacement of the butler), and it is significantly less code.
The only other broad comment would be to add a couple of comments about why you're mocking what your mocking (e.g. "mock the butler, to avoid disk I/O and Mapper creation, to speed up the tests").
from lsst.ap.verify.config import Config | ||
|
||
|
||
class DataTestCase(lsst.utils.tests.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised that pattern is in afw
, I'd have pushed back on it in review. DataTestCase
is not a TestCase
, it's essentially a mixin for things that are TestCase
s.
# Hack the config for testing purposes | ||
# Note that Config.instance is supposed to be immutable, so, depending on initialization order, | ||
# this modification may cause other tests to see inconsistent config values | ||
Config.instance._allInfo['datasets.' + cls.datasetKey] = cls.testDataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing that worries me is the "depending on initialization order... see inconsistent config values" bit. It also just feels hacky ("hack the config" afterall).
I absolutely wouldn't advocate the "config object for testing convenience" suggestion you propose. That's definitely clumsy.
Looking at it more, I thought Config
was a pex_config type of thing. Instead, it's a manager for those. I'm not entirely sure what its real purpose is, but I guess the above is the way to do what you want with a singleton.
tests/test_association.py
Outdated
elif datasetType == 'deepDiff_diaSrc': | ||
return testDiaSources | ||
raise dafPersist.NoResults("Dataset not found:", datasetType, dataId) | ||
self.butler = NonCallableMagicMock(spec=dafPersist.Butler, get=mockGet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I think about it more, I think we don't want the butler mock to be a MagicMock
, but rather just a NonCallableMock
, since we don't need magic methods (e.g. iterators). I think the same holds true for our other mocks.
tests/test_ingestion.py
Outdated
|
||
This method initializes ``self._registerTask`` and ``self._registryHandle``. | ||
|
||
Behavior is undefined if more than one of `setUpRawRegistry`, `setUpCalibRegistry`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably put a WARNING:
in front of this, just to emphasize it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's that important -- I mean "undefined" in the sense of "I'm not supporting that case so I don't control what will happen" rather than "your hard drive will wipe itself."
self.assertFalse(butler.datasetExists('flat', filter='g')) | ||
# TODO: find a way to avoid having to know exact data ID expansion | ||
dataId = {'visit': datum['visit'], 'expTime': datum['exptime'], 'filter': datum['filter']} | ||
# TODO: I don't think we actually care about the keywords -- especially since they're defaults |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODOs should have a jira ticket attached to them, otherwise they'll get lost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree in general, but I'm also not comfortable filing tickets for work that I'm not sure can be done...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. I guess we can see how much it actually matters in practice: if these tests never break due to the things mentioned in the TODOs, it'll be ok if we forget them!
tests/test_ingestion.py
Outdated
self._task = ingestion.DatasetIngestTask(config=IngestionTestSuite.config) | ||
|
||
def setUpRawRegistry(self): | ||
"""Mock up the RegisterTask used for ingesting raw data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that it should be called at the start of a test that needs an X registry.
Since measureTotalUnassociatedDiaObjects does not guarantee any particular exception behavior, this test case was deemed irrelevant to the measurements module.
The test was previously impossible because obs_test did not allow defect ingestion, but _RepoStub doesn't know that.
98f721b
to
e46325b
Compare
This PR fixes several bugs that were introduced in previous commits, then replaces most tests' Butler and database interactions with stub classes. This change speeds up unit test execution from ~20 s to ~5 s and enables future tests of modules that were previously deemed too expensive.