DM-13851: Speed up ap_verify unit tests #34

Merged (10 commits, Jun 28, 2018)

Conversation

kfindeisen (Member):

This PR fixes several bugs that were introduced in previous commits, then replaces most tests' Butler and database interactions with stub classes. This change speeds up unit test execution from ~20 s to ~5 s and enables future tests of modules that were previously deemed too expensive.

kfindeisen requested a review from parejkoj on June 8, 2018 at 18:09.
parejkoj (Contributor) left a comment:

A few initial comments here. I sketched out how I would use MagicMock to replace your custom classes in a branch; diff it against this branch to see my changes:

https://github.com/lsst-dm/ap_verify/tree/u/parejkoj/MagicMock

The tests all pass. I think there may be an even nicer way to mock the butler, which I'm poking at. I don't know if the above is what you tried when you looked at using unittest.mock?
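
For illustration only, here is a stripped-down version of the kind of mock being discussed (the real changes are in the branch above); testCatalog and the handled dataset type are placeholders:

from unittest.mock import NonCallableMagicMock

import lsst.daf.persistence as dafPersist

testCatalog = object()  # placeholder for a real source catalog fixture


def mockGet(datasetType, dataId=None):
    # Return canned data instead of reading from a repository on disk.
    if datasetType == 'deepDiff_diaSrc':
        return testCatalog
    raise dafPersist.NoResults("Dataset not found:", datasetType, dataId)


# spec=dafPersist.Butler makes attribute access fail for anything the real
# Butler does not provide, so typos in tests are still caught.
butler = NonCallableMagicMock(spec=dafPersist.Butler, get=mockGet)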

(`testDataset`), and skip all tests if the dataset is not available.

Subclasses must call `DataTestCase.setUpClass()` if they override
``setUpClass`` themselves.
Contributor:

"Subclasses must call super().setUpClass() ..."

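A hypothetical subclass following that wording might look like (names are made up):

class ExampleIngestionTestSuite(DataTestCase):
    @classmethod
    def setUpClass(cls):
        super().setUpClass()  # let DataTestCase configure the shared dataset
        cls.extraFixture = object()  # subclass-specific setup (placeholder)
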
Member Author:

EDIT: never mind; the encapsulation violations I was worried about are a consequence of how classes work in Python (specifically, the distinction between initialization and construction), not of super() or the MRO. After thinking about it some more, I realized that they will also happen with explicitly named bases.

from lsst.ap.verify.config import Config


class DataTestCase(lsst.utils.tests.TestCase):
Contributor:

Don't have this derive from TestCase unless you want it to potentially be picked up as a test. Subclasses that are tests would then derive from both this and TestCase (see the sketch below).
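
A minimal sketch of that layout, with hypothetical names and a made-up dataset value:

import lsst.utils.tests


class DataTestCase:
    # Shared dataset setup; deliberately not a TestCase, so unittest
    # discovery never tries to run it on its own.

    @classmethod
    def setUpClass(cls):
        super().setUpClass()  # cooperates with TestCase in the subclass's MRO
        cls.testDataset = 'demo_dataset'  # hypothetical value


class ExampleTestSuite(DataTestCase, lsst.utils.tests.TestCase):
    def testDatasetIsConfigured(self):
        self.assertEqual(self.testDataset, 'demo_dataset')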

Member Author:

Are you sure? We used this pattern a lot in afw and didn't run into any problems. The fact that it's in python/ (necessary to ensure it's on the path) should make it immune to that kind of bug.

Member:

I assume the test classes themselves inherit from DataTestCase, so I don't think there is a problem. The problem is when you put a class in a test file that looks like a class of tests but isn't really.

Contributor:

I'm surprised that pattern is in afw, I'd have pushed back on it in review. DataTestCase is not a TestCase, it's essentially a mixin for things that are TestCases.

Member Author:

I don't agree that it's a mixin (though maybe that word means something different in Python than in Java); it's a specific type of test case rather than extra functionality.

Anyway, I propose we revisit this later (and maybe the aforementioned afw test utilities at the same time), since this bit of code sharing is not relevant to the actual speed-up work.

Contributor:

Ok. Maybe I should make a Community post to discuss it?

Member Author:

Maybe discuss with Russell first, because lsst.afw.geom.testUtils.TransformTestBaseClass (the "lots" of uses in afw I thought I remembered 😰) was his idea.

# Hack the config for testing purposes
# Note that Config.instance is supposed to be immutable, so, depending on initialization order,
# this modification may cause other tests to see inconsistent config values
Config.instance._allInfo['datasets.' + cls.datasetKey] = cls.testDataset
Contributor:

This worries me. Why do you have to do it this way? Can't you set the config values in an actual Config instance in setUp()?

Member Author:

Are you saying that instead of a Config singleton, each class/function in ap_verify should take a Config object for testing convenience?

Contributor:

The thing that worries me is the "depending on initialization order... see inconsistent config values" bit. It also just feels hacky ("hack the config", after all).

I absolutely wouldn't advocate the "config object for testing convenience" suggestion you propose. That's definitely clumsy.

Looking at it more, I thought Config was a pex_config type of thing. Instead, it's a manager for those. I'm not entirely sure what its real purpose is, but I guess the above is the way to do what you want with a singleton.
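
If the main concern is cross-test leakage, one way to contain the hack (a sketch only, assuming Config.instance._allInfo behaves like a plain dict; the key and value are made up) is to patch the entry per test and let the framework undo it:

from unittest.mock import patch

import lsst.utils.tests

from lsst.ap.verify.config import Config


class ConfigPatchingTestSuite(lsst.utils.tests.TestCase):
    def setUp(self):
        # patch.dict records the original contents and restores them when the
        # patcher stops, so other tests never observe the modified value.
        patcher = patch.dict(Config.instance._allInfo,
                             {'datasets.demo_dataset': '/path/to/demo'})
        patcher.start()
        self.addCleanup(patcher.stop)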

"""An emulator for `lsst.daf.persistence.Butler.get` that can only handle test data.
"""
# No cleaner way to test if dict contains all key-value pairs in dataIdDict?
if dataIdDict.items() <= dataId.items():
Contributor:

This is exactly how to check whether dataIdDict is a subset of dataId: dict.items() returns a dict_items, which behaves like a set. You can rephrase your comment to say "check whether dataIdDict is a subset of dataId", if you want to clarify it there.

Here's the original PEP about it, if you're curious:

https://www.python.org/dev/peps/pep-3106/
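
A quick demonstration of that set-like behavior, with throwaway data IDs:

dataId = {'visit': 42, 'ccd': 1, 'filter': 'g'}
dataIdDict = {'visit': 42, 'filter': 'g'}

# dict.items() returns a set-like view (PEP 3106), so <= asks whether every
# key-value pair of dataIdDict also appears in dataId.
assert dataIdDict.items() <= dataId.items()
assert not ({'visit': 99}.items() <= dataId.items())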


def fetchall(self):
"""An emulator for `sqlite3.Cursor.fetchall`, returns results for known queries.

Contributor:

Probably would be good to list the "known" queries here.

Member Author:

I have a feeling the documentation would quickly get out of date, but ok...

metricName='association.numTotalUnassociatedDiaObjects')
with sqlite3.connect(":memory:") as conn:
cursor = conn.cursor()
with self.assertRaises(sqlite3.OperationalError):
Contributor:

This feels like it's just testing that sqlite3 raises an exception when you give it bad input. I don't understand what functionality it's testing in the code itself, other than exception pass-through.

Member Author:

I agree. @morriscb, was there some specific functionality in ap.verify.measurements being tested in testInvalidDb?

Contributor:

Not really. As I recall, I was mostly just parroting other "assertRaises" tests in the repository. I'm happy to see it gone if it's not thought to be necessary.

Making Dataset creation the argument parser's responsibility, while
reducing duplicate program code, meant that test_args now depends on
the --dataset argument being instantiable. To avoid test failures when
no datasets are installed, I've shared the dummy dataset code from
test_dataset.
This test significantly lengthened the test running time, but
essentially tested functionality for the Butler rather than
the Workspace.
parejkoj (Contributor) left a comment:

Handful of further comments, mostly on docs. Thanks again for humoring me on this approach, and I'm glad you were able to make it work.

Per our conversation, it seems we both agree that this is as viable an approach as your previous Butler stubs (given the "imminent" replacement of the butler), and it is significantly less code.

The only other broad comment would be to add a couple of comments about why you're mocking what you're mocking (e.g. "mock the butler, to avoid disk I/O and Mapper creation, to speed up the tests").

elif datasetType == 'deepDiff_diaSrc':
return testDiaSources
raise dafPersist.NoResults("Dataset not found:", datasetType, dataId)
self.butler = NonCallableMagicMock(spec=dafPersist.Butler, get=mockGet)
Contributor:

As I think about it more, I think we don't want the butler mock to be a MagicMock, but rather just a NonCallableMock, since we don't need magic methods (e.g. iterators). I think the same holds true for our other mocks.
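
The suggested variant would look like this (mockGet stands in for the canned get() defined earlier in the test):

from unittest.mock import NonCallableMock

import lsst.daf.persistence as dafPersist


def mockGet(datasetType, dataId=None):
    # Placeholder for the canned get() above.
    raise dafPersist.NoResults("Dataset not found:", datasetType, dataId)


# spec still restricts the mock to the real Butler interface, but no magic
# methods (__iter__, __len__, ...) are auto-generated.
butler = NonCallableMock(spec=dafPersist.Butler, get=mockGet)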


This method initializes ``self._registerTask`` and ``self._registryHandle``.

Behavior is undefined if more than one of `setUpRawRegistry`, `setUpCalibRegistry`,
Contributor:

Probably put a WARNING: in front of this, just to emphasize it?

Member Author:

I don't think it's that important -- I mean "undefined" in the sense of "I'm not supporting that case so I don't control what will happen" rather than "your hard drive will wipe itself."

self.assertFalse(butler.datasetExists('flat', filter='g'))
# TODO: find a way to avoid having to know exact data ID expansion
dataId = {'visit': datum['visit'], 'expTime': datum['exptime'], 'filter': datum['filter']}
# TODO: I don't think we actually care about the keywords -- especially since they're defaults
Contributor:

TODOs should have a Jira ticket attached to them; otherwise they'll get lost.

Member Author:

I agree in general, but I'm also not comfortable filing tickets for work that I'm not sure can be done...

Contributor:

Fair point. I guess we can see how much it actually matters in practice: if these tests never break due to the things mentioned in the TODOs, it'll be ok if we forget them!

self._task = ingestion.DatasetIngestTask(config=IngestionTestSuite.config)

def setUpRawRegistry(self):
"""Mock up the RegisterTask used for ingesting raw data.
Contributor:

Note that it should be called at the start of a test that needs an X registry.

Since measureTotalUnassociatedDiaObjects does not guarantee any
particular exception behavior, this test case was deemed irrelevant to
the measurements module.
The test was previously impossible because obs_test did not allow defect
ingestion, but _RepoStub doesn't know that.