
DM-22599: Develop PipelineTask unit test framework #114

Closed · 10 commits

Conversation

@kfindeisen (Member) commented Jan 23, 2020

This PR adds a module, lsst.pipe.base.tests, with test utilities specific to PipelineTask subclasses. The intent of these tests is to enable unit testing of:

  • whether a task's Connections are correctly written and whether they match the inputs and outputs of the run method
  • any logic in a custom runQuantum method
  • configuration logic, such as optional or alternative inputs or outputs

Note that, because it depends on a real Butler, the test code has some performance limitations: each call to makeTestRepo takes 4-6 seconds on my machine and each call to makeUniqueButler takes an extra second.
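As a rough illustration of the intended usage pattern (pay the expensive setup cost once per test class, do cheap per-test isolation), here is a stand-alone analogue using only unittest. SlowRepo, make_collection, and the test names are invented for this sketch; they are not the API proposed in this PR:

```python
import unittest


class SlowRepo:
    """Stand-in for an expensive resource such as a Butler test repository."""

    def __init__(self):
        self.collections = {}  # collection name -> list of datasets

    def make_collection(self, name):
        # Cheap per-test operation, analogous to the one-collection-per-test
        # isolation described above.
        self.collections[name] = []
        return self.collections[name]


class ExampleTestSuite(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Pay the multi-second repository-creation cost once per class,
        # not once per test.
        cls.repo = SlowRepo()

    def setUp(self):
        # Each test gets its own collection, so tests cannot see each
        # other's outputs even though they share one repository.
        self.collection = self.repo.make_collection(self.id())

    def test_isolated_writes(self):
        self.collection.append("dataset")
        self.assertEqual(self.collection, ["dataset"])
```

The trade-off is the usual one for class-level fixtures: any state a test leaves in the shared repository outside its own collection can leak into later tests.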

Commit messages:

  • The code has been cleaned up, and tests added.
  • It's hard to explicitly provide correct keys without understanding how the Butler dimensions system works in detail. Moving key constraints to automated (if simple-minded) code greatly reduces the burden on callers.
  • Each test should have its own collection for isolation, but creating a completely new repository each time is impractical.
  • The code has been cleaned up, and minimal tests added.
@timj (Member) commented Jan 23, 2020

Is all the time in butler.registry.insertDimensionData?

@kfindeisen
Copy link
Member Author

No; about 80% of the time is in Butler.makeRepo. The entire loop of insertDimensionData calls contributes about 10%.

@timj (Member) commented Jan 23, 2020

Weird, because for me running makeButlerRepo on the command line takes 1.2 s on my laptop, and that includes the import overhead of all the butler code. I don't understand how Butler.makeRepo can take 5 seconds with everything pre-imported.

@timj (Member) commented Jan 23, 2020

The daf_butler tests create a great many new butler repos, and only two of those tests take longer than one second: an S3 test at just over a second, and a big registry test at 7 seconds. If making a repository had 5 seconds of overhead, the daf_butler tests would take an incredible amount of time. What do you get if you run them with pytest --durations=10 tests? Are they all taking a very long time, or mostly less than a second?

@kfindeisen (Member, Author) commented Jan 23, 2020

Looks like I slightly overestimated the time for makeUniqueButler, but it is enough to add up across multiple tests:

6.93s setup    tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testMakeQuantumCorruptedDataId
4.50s setup    tests/test_pipelineTaskTests.py::ButlerUtilsTestSuite::testButlerDimensions
1.11s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testRunQuantumPatchWithRun
0.87s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testRunQuantumVisitWithRun
0.85s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testMakeQuantumInvalidDimension
0.69s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testRunQuantumPatchMockRun
0.61s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testValidateOutputConnectionsSingle
0.58s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testMakeQuantumNoSuchDatatype
0.57s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testMakeQuantumMissingMultiple
0.56s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testMakeQuantumExtraMultiple

While I'm at it, here are the finer-grained timings:

Run 1:
  Total:               4.07 s
  Butler.makeRepo:     3.49 s  (85.7%)
  Butler:              0.20 s  ( 5.0%)
  _makeRecords:        0.0005 s ( 0.01%)
  insertDimensionData: 0.38 s  ( 9.3%)

Run 2:
  Total:               6.24 s
  Butler.makeRepo:     5.23 s  (83.9%)
  Butler:              0.31 s  ( 5.0%)
  _makeRecords:        0.0006 s ( 0.01%)
  insertDimensionData: 0.69 s  (11.1%)
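Per-step timings like these can be collected with a small helper; a minimal sketch using time.perf_counter (the context-manager name timed and the timings dict are my own, not part of the PR):

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Accumulate wall-clock time for the wrapped block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


# Example use; each expensive call would be wrapped like this.
with timed("total"):
    with timed("setup"):
        data = list(range(1000))
    with timed("work"):
        checksum = sum(data)

# Report each step as a fraction of the total, as in the numbers above.
for label, seconds in timings.items():
    if label != "total":
        print(f"{label}: {seconds:.6f} s ({seconds / timings['total']:.1%})")
```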

@timj (Member) commented Jan 23, 2020

Well, I checked out the branch and ran the tests myself and I see:

0.15s setup    tests/test_pipelineTaskTests.py::ButlerUtilsTestSuite::testButlerDimensions
0.14s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testRunQuantumPatchWithRun
0.14s setup    tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testMakeQuantumCorruptedDataId
0.13s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testMakeQuantumInvalidDimension
0.12s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testRunQuantumVisitMockRun
0.11s call     tests/test_pipelineTaskTests.py::ButlerUtilsTestSuite::testButlerDimensions
0.10s call     tests/test_pipelineTaskTests.py::PipelineTaskTestSuite::testRunQuantumVisitWithRun
0.10s call     tests/test_pipelineTaskTests.py::ButlerUtilsTestSuite::testExpandUniqueId
0.10s call     tests/test_pipelineTaskTests.py::ButlerUtilsTestSuite::testUniqueButler

so your machine is really really slow for some reason.

@kfindeisen (Member, Author)

Well, I know that my computer runs tests faster than Jenkins (at the level of individual packages), so I think there's still room for concern.

@timj (Member) commented Jan 23, 2020

Are you using a local SSD or an NFS mount?

@timj (Member) commented Jan 23, 2020

I do have a comment on the code itself. I really think the test code for creating butlers and dataset types should be moved to daf_butler. daf_butler already has some of these functions in helper packages inside its tests directory; they should be consolidated with your code here and moved to lsst.daf.butler.tests. pipe_base should only have support code for testing pipelines, not for testing butlers.

@kfindeisen (Member, Author)

A suggestion from @timj on Slack (possibly redundant with the merge proposed above): use an in-memory SQLite database for the registry to speed up the Butler operations.

@timj (Member) commented Jan 23, 2020

Since the config is not specified in your API, I think you can create your own Config to pass to makeRepo:

import lsst.daf.butler

c = lsst.daf.butler.Config()
c["registry", "db"] = "sqlite:///:memory:"  # in-memory registry database
lsst.daf.butler.Butler.makeRepo(root, config=c)  # root: repository path chosen by the caller

@TallJimbo (Member)

I can confirm that an in-memory SQLite database can make things go tremendously faster. I discovered on a recent ticket that one of our Registry tests does an absurdly large number of inserts (playing with spatial indexing on regions covering ~half the sphere), and it was totally fine until I tried running that test against an on-disk database.
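The size of that gap is easy to demonstrate with plain sqlite3, independent of the Butler: per-statement commits against a file-backed database pay a journal/sync cost on every insert that an in-memory database never pays. A self-contained comparison (nothing here uses the Butler API; insert_many and the row count are arbitrary choices for the sketch):

```python
import os
import sqlite3
import tempfile
import time


def insert_many(database):
    """Insert rows with one commit each; return (elapsed seconds, row count)."""
    conn = sqlite3.connect(database)
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    start = time.perf_counter()
    for i in range(200):
        conn.execute("INSERT INTO t (v) VALUES (?)", (str(i),))
        conn.commit()  # each commit forces journal work on a file-backed DB
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count


with tempfile.TemporaryDirectory() as tmp:
    disk_time, disk_count = insert_many(os.path.join(tmp, "test.sqlite3"))
mem_time, mem_count = insert_many(":memory:")

print(f"on disk:   {disk_time:.3f} s for {disk_count} rows")
print(f"in memory: {mem_time:.3f} s for {mem_count} rows")
```

The flip side, of course, is that an in-memory registry vanishes when the process exits, which is exactly what you want for tests and nothing else.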

@kfindeisen (Member, Author) commented Jan 23, 2020

> I really think that the test code for creating butlers and dataset types should be moved to daf_butler. daf_butler already has some of these functions in the helper packages inside daf_butler tests directory but it seems that they need to be consolidated with your code here and moved to lsst.daf.butler.tests.

Given the scope of this change, I'm closing this PR and will open a new one once the pipe_base code depends on lsst.daf.butler.tests.
