DM-41962: Use storage classes from QG in PreExecInit. #276

TallJimbo · 2023-12-01T18:41:54Z

PipelineDatasetTypes does not preserve storage classes (that's why it's being deprecated).

Checklist

ran Jenkins (failed due to something unrelated; main is broken in at least two ways right now, and this ticket fixes one of them)
added a release note for user-visible changes to doc/changes

PipelineDatasetTypes does not preserve storage classes (that's why it's being deprecated).

The first problem was just that the test assumed it could register a dataset type after the QG was made and not run into trouble when running that QG. The second problem was that the test expected something other than what its own code comments suggested, which was also contrary to correct behavior.
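A minimal sketch of why the registration order matters, using toy stand-ins rather than the real butler or QG classes. `ToyRegistry`, `build_graph`, and `pre_exec_init` are hypothetical names, and the storage class names are used only as illustrative strings; the assumption is that the graph snapshots storage classes when it is built and that initialization later checks them against the repo.

```python
class ToyRegistry:
    """Hypothetical registry mapping dataset type names to storage class names."""

    def __init__(self):
        self.storage_classes = {}

    def register(self, name, storage_class):
        # Registering an identical definition is a no-op; a conflicting one fails.
        existing = self.storage_classes.setdefault(name, storage_class)
        if existing != storage_class:
            raise ValueError(
                f"{name!r} is registered as {existing!r}, not {storage_class!r}"
            )


def build_graph(registry, task_defaults):
    """Snapshot storage classes: a registered definition wins over the task default."""
    return {
        name: registry.storage_classes.get(name, default)
        for name, default in task_defaults.items()
    }


def pre_exec_init(registry, graph):
    """Register every dataset type with the storage class recorded in the graph."""
    for name, storage_class in graph.items():
        registry.register(name, storage_class)


task_defaults = {"intermediate": "StructuredDataDict"}

# Safe order: register first, then build the graph.
safe_registry = ToyRegistry()
safe_registry.register("intermediate", "TaskMetadata")
safe_graph = build_graph(safe_registry, task_defaults)
pre_exec_init(safe_registry, safe_graph)

# Unsafe order: the graph is built before the repo definition exists.
unsafe_registry = ToyRegistry()
stale_graph = build_graph(unsafe_registry, task_defaults)
unsafe_registry.register("intermediate", "TaskMetadata")
try:
    pre_exec_init(unsafe_registry, stale_graph)
    conflict = None
except ValueError as err:
    conflict = str(err)
```

In the unsafe order the graph still carries the task's default, so initialization collides with the definition that was registered afterwards.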

codecov · 2023-12-01T20:41:20Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (643a063) 87.15% compared to head (2e33486) 87.05%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #276      +/-   ##
==========================================
- Coverage   87.15%   87.05%   -0.11%     
==========================================
  Files          49       49              
  Lines        4429     4433       +4     
  Branches      764      766       +2     
==========================================
- Hits         3860     3859       -1     
- Misses        413      421       +8     
+ Partials      156      153       -3


timj

I have questions before I can review ☹️

tests/test_simple_pipeline_executor.py

timj · 2023-12-01T21:09:44Z

tests/test_simple_pipeline_executor.py

@@ -243,14 +237,21 @@ def test_from_pipeline_output_differ(self):
            )
        )

+        executor = self._configure_pipeline(


Does this have to move to after the dataset type has been registered? If that is the case, shouldn't the other tests have this change too?

Good point. They probably don't matter, but they're at least unsafe. Will do.

timj · 2023-12-01T21:11:12Z

tests/test_simple_pipeline_executor.py

        # b returns a dict and that is converted to TaskMetadata on put.
        self._test_logs(cm.output, "dict", "lsst.pipe.base.TaskMetadata", "dict", "dict")

        self.assertEqual(len(quanta), 2)
-        self.assertEqual(self.butler.get("intermediate"), {"zero": 0, "one": 1})
+        self.assertEqual(self.butler.get("intermediate").to_dict(), {"zero": 0, "one": 1})
        self.assertEqual(self.butler.get("output").to_dict(), {"zero": 0, "one": 1, "two": 2})


So "output" is a TaskMetadata? How is that consistent with "b returned a dict" test above and the default connections being "dict"?

From the perspective of b, it returned a dict, and declared that it would return a dict, but since the dataset type was registered with the repo as TaskMetadataLike it was converted to TaskMetadataLike on put.
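The conversion described here can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTaskMetadata` and `ToyRepo` are hypothetical stand-ins, not the real `TaskMetadata` or butler classes, and the real conversion machinery is more general than a single `from_dict` call.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTaskMetadata:
    """Hypothetical stand-in for a metadata type that wraps a dict."""

    data: dict = field(default_factory=dict)

    @classmethod
    def from_dict(cls, d):
        return cls(dict(d))

    def to_dict(self):
        return dict(self.data)


class ToyRepo:
    """Hypothetical repo: the registered type decides what is actually stored."""

    def __init__(self):
        self._registered = {}
        self._stored = {}

    def register(self, name, storage_type):
        self._registered[name] = storage_type

    def put(self, name, obj):
        target = self._registered[name]
        # Convert only when the object is not already of the registered type.
        if not isinstance(obj, target):
            obj = target.from_dict(obj)
        self._stored[name] = obj

    def get(self, name):
        return self._stored[name]


repo = ToyRepo()
repo.register("intermediate", ToyTaskMetadata)
repo.put("intermediate", {"zero": 0, "one": 1})  # the task hands over a plain dict
result = repo.get("intermediate")
```

From the task's side nothing changes: it still produces a dict. Only the object read back from the repo has the registered type.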

Ok. It should likely use the more explicit TaskMetadata.from_dict() in the test to make it obvious.

This includes:
- Using TaskMetadata.from_dict instead of to_dict in comparisons to make the intent clearer.
- Documenting what the log-inspection utility method actually tests.
- Making one test (test_from_pipeline_intermediates_differ) register a dataset type before building a QG, because that's what's needed in general for correctness (even though it didn't matter here).
- Renaming and re-documenting another test (test_from_pipeline_inconsistent_dataset_types) where we were actually testing that building the QG and then changing a dataset type out from under it can be a problem.
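The difference between the two comparison styles can be sketched with a toy class. `ToyTaskMetadata` is a hypothetical stand-in defined here so the example runs on its own; the real tests use `TaskMetadata` and unittest assertions.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTaskMetadata:
    """Hypothetical stand-in for a metadata type that wraps a dict."""

    data: dict = field(default_factory=dict)

    @classmethod
    def from_dict(cls, d):
        return cls(dict(d))

    def to_dict(self):
        return dict(self.data)


stored = ToyTaskMetadata.from_dict({"zero": 0, "one": 1})
expected = {"zero": 0, "one": 1}

# Converting the stored object hides what type actually came back.
same_via_to_dict = stored.to_dict() == expected

# Converting the expected value states the stored type explicitly.
same_via_from_dict = stored == ToyTaskMetadata.from_dict(expected)

# A direct comparison with a plain dict does not match in this toy model.
raw_dict_equal = stored == expected
```

Both converted comparisons pass, but only the `from_dict` form makes it visible in the test that the stored object is expected to be the metadata type rather than a dict.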

TallJimbo · 2023-12-01T21:48:14Z

I've added another commit that cleans up the test a bit; ready for another look.

We have custom QG builders in the wild and they're not as well-behaved as the main one; guard against them until we can fix them.

Use storage classes from QG in PreExecInit.

dd2f147

PipelineDatasetTypes does not preserve storage classes (that's why it's being deprecated).

TallJimbo force-pushed the tickets/DM-41962 branch from 9ccb217 to dd2f147 Compare December 1, 2023 18:44

TallJimbo force-pushed the tickets/DM-41962 branch from 179fe1f to ae2ae85 Compare December 1, 2023 20:41

timj reviewed Dec 1, 2023

TallJimbo force-pushed the tickets/DM-41962 branch from c1d1bfd to 1a05997 Compare December 1, 2023 21:44

timj approved these changes Dec 1, 2023

Guard against QGs that don't populate registryDatasetTypes.

2e33486

We have custom QG builders in the wild and they're not as well-behaved as the main one; guard against them until we can fix them.
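The actual change is not shown in this thread. One plausible shape for such a guard, sketched with hypothetical names (`ToyGraph`, `dataset_types_for_init`, and the `fallback` callable are all assumptions), is to prefer the dataset types stored in the graph and fall back to another source only when the graph has none:

```python
class ToyGraph:
    """Hypothetical graph that may or may not carry registry dataset types."""

    def __init__(self, registry_dataset_types=()):
        self._types = list(registry_dataset_types)

    def registry_dataset_types(self):
        return list(self._types)


def dataset_types_for_init(graph, fallback):
    """Prefer the graph's dataset types; call ``fallback`` only if there are none."""
    from_graph = graph.registry_dataset_types()
    if from_graph:
        return from_graph
    # A graph from a less careful builder: derive the types some other way.
    return fallback()


def legacy_lookup():
    # Stand-in for the older, storage-class-unaware lookup.
    return ["intermediate (task default)"]


well_behaved = ToyGraph(["intermediate (from graph)"])
custom_built = ToyGraph()

preferred = dataset_types_for_init(well_behaved, legacy_lookup)
guarded = dataset_types_for_init(custom_built, legacy_lookup)
```

Passing the fallback as a callable keeps the older lookup from running at all when the graph already provides what is needed.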

TallJimbo merged commit e072a71 into main Dec 2, 2023
13 of 14 checks passed

TallJimbo deleted the tickets/DM-41962 branch December 2, 2023 20:54
