DM-33278: Improve storage class conversion #632

timj · 2022-01-14T22:24:36Z

Add a dataset type compatibility method.
Add support for specifying methods.

Checklist

ran Jenkins
added a release note for user-visible changes to doc/changes

codecov · 2022-01-14T22:39:58Z

Codecov Report

Merging #632 (457729f) into main (90f5a21) will increase coverage by 0.01%.
The diff coverage is 93.33%.

@@            Coverage Diff             @@
##             main     #632      +/-   ##
==========================================
+ Coverage   84.11%   84.12%   +0.01%     
==========================================
  Files         237      237              
  Lines       30143    30183      +40     
  Branches     5014     5018       +4     
==========================================
+ Hits        25354    25392      +38     
- Misses       3645     3646       +1     
- Partials     1144     1145       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| python/lsst/daf/butler/core/storageClass.py | `93.92% <0.00%> (-0.61%)` | ⬇️ |
| python/lsst/daf/butler/core/datasets/type.py | `83.41% <94.73%> (+1.14%)` | ⬆️ |
| ...on/lsst/daf/butler/datastores/inMemoryDatastore.py | `83.05% <100.00%> (+0.09%)` | ⬆️ |
| tests/test_datasets.py | `99.23% <100.00%> (+0.04%)` | ⬆️ |
| tests/test_storageClass.py | `98.88% <100.00%> (+0.05%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

andy-slac

Looks good, couple of minor comments/questions.

andy-slac · 2022-01-16T05:04:09Z

doc/lsst.daf.butler/formatters.rst

+In the first definition, the configuration says that if a ``PropertySet`` object is given then the ``toDict()`` method can be called on it and the returned value will be a `dict`.
+In the second definition a ``PropertySet`` is again specified but this time the ``from_metadata`` class method will be called with the ``PropertySet`` as the first parameter and a ``TaskMetadata`` will be returned.


This difference between `()` and no `()` confused me a lot when I first saw it and it was not obvious what it means. I had to read everything more than once to figure it out. Now I think that `toDict()` is basically the same as `lsst.daf.base.PropertySet.toDict` (non-bound method). Would it be better to drop `()` and use non-bound method syntax for symmetry?

timj · 2022-01-18

I've been pondering this and the `()` syntax has the advantage that it doesn't lead to duplicating the full class name on both sides:

lsst.daf.butler.tests.MetricsExample: exportAsDict()

vs

lsst.daf.butler.tests.MetricsExample: lsst.daf.butler.tests.MetricsExample.exportAsDict

The latter already works and needs no code changes from this ticket. Maybe that's a good enough argument for not bothering with the more succinct form.
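
For readers following the thread, here is a minimal sketch of why the fully qualified unbound-method form needs no special syntax. The `resolve_dotted_name` helper is a hypothetical stand-in, not the butler's own import machinery, and `datetime.date` is used only because it ships with Python.

```python
import importlib
from datetime import date


def resolve_dotted_name(name: str):
    """Return the object named by a fully qualified dotted path.

    Hypothetical helper: tries the longest importable module prefix
    first, then walks the remaining components with getattr.
    """
    parts = name.split(".")
    for split in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:split]))
        except ImportError:
            continue
        for attr in parts[split:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError(f"Could not resolve {name!r}")


# An unbound instance method: the object to convert becomes ``self``.
to_text = resolve_dotted_name("datetime.date.isoformat")
text = to_text(date(2022, 1, 14))

# A class method or plain function: the object is the first argument.
from_text = resolve_dotted_name("datetime.date.fromisoformat")
round_trip = from_text(text)
```

Both kinds of converter end up as a callable that takes the object to convert as its only argument, which is why the two forms can share one configuration syntax.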

andy-slac · 2022-01-16T05:12:43Z

python/lsst/daf/butler/core/datasets/type.py

-        if self._isCalibration != other._isCalibration:
+        return True
+
+    def is_compatible_with(self, other: "DatasetType") -> bool:


Do you need to quote DatasetType here?

timj · 2022-01-18

I think I'm confused between `from __future__ import annotations` being enabled and not being enabled. There was some other place where we did have to use strings if the type was used in the context of its own class, but maybe that was to do with pydantic.

TallJimbo · 2022-01-18

I believe we cannot use `from __future__ import annotations` with pydantic (which is really a shame, and I think it means we should mostly put pydantic classes in their own modules, so we can use it everywhere else). And without `__future__.annotations`, you do have to quote in places like this.
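
A small sketch of the quoting point, using a hypothetical `DatasetTypeSketch` class rather than the real `DatasetType`. Inside the class body the class name is not yet bound, so on Python versions that evaluate annotations eagerly the self-reference has to be written as a string unless `from __future__ import annotations` is in effect for the module.

```python
class DatasetTypeSketch:
    """Hypothetical stand-in used only to illustrate the annotation."""

    def __init__(self, name: str) -> None:
        self.name = name

    # The annotation is quoted because DatasetTypeSketch is still being
    # defined when this ``def`` statement runs.
    def is_compatible_with(self, other: "DatasetTypeSketch") -> bool:
        return self.name == other.name


# The quoted annotation is stored as a plain string, not as the class.
stored_hint = DatasetTypeSketch.is_compatible_with.__annotations__["other"]
same = DatasetTypeSketch("raw").is_compatible_with(DatasetTypeSketch("raw"))
```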

andy-slac · 2022-01-16T05:15:54Z

python/lsst/daf/butler/core/datasets/type.py

-        if self._isCalibration != other._isCalibration:
+        return True
+
+    def is_compatible_with(self, other: "DatasetType") -> bool:


Does "compatible" mean compatible for both get and put? Is it symmetrical (X.is_compatible_with(Y) === Y.is_compatible_with(X))?

timj · 2022-01-16

It really means `self_sc.can_convert(other_sc)`. You have made me realize that ctrl_mpexec is going to have to try to distinguish inputs (so can convert on get to the required type for the Task) and outputs (so can convert the type used in the Task to the required type in the butler).
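
The asymmetry described in this reply can be sketched as follows. Both classes are hypothetical, heavily simplified stand-ins (the field names and the converter mapping are assumptions), meant only to show that the check is directional.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class StorageClassSketch:
    """Hypothetical stand-in for a storage class."""

    name: str
    pytype: type
    # Maps a source python type to a callable that produces ``pytype``.
    converters: Dict[type, Callable[[Any], Any]] = field(
        default_factory=dict, compare=False
    )

    def can_convert(self, other: "StorageClassSketch") -> bool:
        """Return True if objects of ``other`` can be turned into ours."""
        return other.name == self.name or other.pytype in self.converters


@dataclass(frozen=True)
class DatasetTypeSketch:
    """Hypothetical stand-in for a dataset type."""

    name: str
    dimensions: Tuple[str, ...]
    storage_class: StorageClassSketch
    is_calibration: bool = False

    def is_compatible_with(self, other: "DatasetTypeSketch") -> bool:
        # Everything except the storage class must match exactly.
        mine = (self.name, self.dimensions, self.is_calibration)
        theirs = (other.name, other.dimensions, other.is_calibration)
        if mine != theirs:
            return False
        return self.storage_class.can_convert(other.storage_class)


as_list = StorageClassSketch("ListSketch", list, {tuple: list})
as_tuple = StorageClassSketch("TupleSketch", tuple)
dt_list = DatasetTypeSketch("example", ("visit",), as_list)
dt_tuple = DatasetTypeSketch("example", ("visit",), as_tuple)

forward = dt_list.is_compatible_with(dt_tuple)   # tuple -> list is known
backward = dt_tuple.is_compatible_with(dt_list)  # no list -> tuple converter
```

Because only one direction has a registered converter, `X.is_compatible_with(Y)` and `Y.is_compatible_with(X)` can disagree, which is why inputs and outputs need separate handling.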

timj force-pushed the tickets/DM-33278 branch from b287294 to 0f09851 on January 14, 2022 22:27

timj mentioned this pull request Jan 14, 2022

DM-33278: Check for dataset type compatibility before failing lsst/ctrl_mpexec#160

Closed

timj force-pushed the tickets/DM-33278 branch from 0f09851 to dca815c on January 14, 2022 22:58

Support methods in storage class converters

8e15cb7

If a converter ends with () it is assumed to be a method to run on the object to be converted and not a class method or function.
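
As an illustration of the rule in this commit message (the commit is reverted later in this pull request), a converter specification could be dispatched roughly like this. `apply_converter`, `MetricsStandIn`, and the `functions` lookup table are all hypothetical names introduced for the sketch.

```python
from typing import Any, Callable, Mapping


class MetricsStandIn:
    """Hypothetical class exposing an export method."""

    def __init__(self, summary: dict) -> None:
        self.summary = summary

    def exportAsDict(self) -> dict:
        return dict(self.summary)


def apply_converter(
    spec: str, obj: Any, functions: Mapping[str, Callable[[Any], Any]]
) -> Any:
    """Apply a converter specification to ``obj`` (hypothetical helper)."""
    if spec.endswith("()"):
        # Trailing "()" means: call this method on the object itself.
        return getattr(obj, spec[:-2])()
    # Otherwise the spec names a function or class method that takes
    # the object as its first argument.
    return functions[spec](obj)


metrics = MetricsStandIn({"answer": 42})
as_dict = apply_converter("exportAsDict()", metrics, {})
as_list = apply_converter("sorted", ["b", "a"], {"sorted": sorted})
```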

timj force-pushed the tickets/DM-33278 branch from 68a3b2c to 9d7a6a4 on January 15, 2022 00:38

timj requested a review from andy-slac January 15, 2022 14:58

andy-slac approved these changes Jan 16, 2022

timj added 11 commits January 18, 2022 15:54

Revert "Support methods in storage class converters"

6c24f85

This reverts commit 8e15cb7. We can do the same thing with a full name of an unbound method.

Convert python type on put in in-memory datastore

79635c8

This removes surprises when someone gets the dataset (there is no need to also add conversion on get), and provides consistent application of parameters.
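
A rough sketch of the convert-on-put idea, assuming a toy store keyed by string. `InMemoryStoreSketch` and its converter mapping are hypothetical and do not mirror the real in-memory datastore API.

```python
from typing import Any, Callable, Dict


class InMemoryStoreSketch:
    """Hypothetical store that normalizes the python type on put."""

    def __init__(
        self, pytype: type, converters: Dict[type, Callable[[Any], Any]]
    ) -> None:
        self._pytype = pytype
        self._converters = converters
        self._data: Dict[str, Any] = {}

    def put(self, key: str, obj: Any) -> None:
        if not isinstance(obj, self._pytype):
            converter = self._converters.get(type(obj))
            if converter is None:
                raise TypeError(
                    f"Cannot convert {type(obj).__name__} "
                    f"to {self._pytype.__name__}"
                )
            obj = converter(obj)
        # Everything stored is already of the expected type.
        self._data[key] = obj

    def get(self, key: str) -> Any:
        # No conversion needed here because put already did it.
        return self._data[key]


store = InMemoryStoreSketch(dict, {list: dict})
store.put("a", [("k", 1)])
fetched = store.get("a")
```

With the conversion done once on the way in, anything applied on the way out can assume a single stored python type.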

Add DatasetType.is_compatible_with method

47178c5

Add a converter for dict -> Packages

4801db9

Add news fragment

effd925

Add some short docs on storage class conversion.

6589e11

Include DatasetType equality test involving isCalibration

9a4be32

Improve conversion docs

438ab2f

Add explicit test of unbound method converter

94c842a

Add experimental warning to storage class conversion docs

228f380

Update the documentation to indicate that unbound methods are allowed

5287133

timj force-pushed the tickets/DM-33278 branch from 9d7a6a4 to 5287133 on January 18, 2022 22:55

Allow StorageClass.can_convert to return true if pytypes match

457729f

Sometimes a storage class can be defined with a different name but is referring to the same python type.
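
The extra check described in this commit can be pictured like this. `NamedStorageClass`, the free function `can_convert`, and the storage class names are assumptions made for illustration, not the actual `StorageClass.can_convert` implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class NamedStorageClass:
    """Hypothetical storage class reduced to a name and a python type."""

    name: str
    pytype: type


def can_convert(
    target: NamedStorageClass,
    source: NamedStorageClass,
    convertible_types: FrozenSet[type] = frozenset(),
) -> bool:
    """Return True if ``source`` objects are usable as ``target``."""
    if target.name == source.name:
        return True
    # Different names but the same python type: nothing to convert.
    if target.pytype is source.pytype:
        return True
    return source.pytype in convertible_types


dict_a = NamedStorageClass("DictA", dict)
dict_b = NamedStorageClass("DictB", dict)
as_list = NamedStorageClass("ListA", list)

same_pytype = can_convert(dict_a, dict_b)
different_pytype = can_convert(dict_a, as_list)
```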

timj merged commit 8be9c34 into main Jan 19, 2022

timj deleted the tickets/DM-33278 branch January 19, 2022 03:58
