DM-19988: support code (and removal of dead code) for QuantumGraph generation rewrite #169

TallJimbo · 2019-06-27T15:31:12Z

No description provided.

andy-slac

Looks OK, a few minor comments.

andy-slac · 2019-06-28T22:48:26Z

config/schema.yaml

+        primary_key: true
+        nullable: false
+        doc: >
+          Unique (with instrument) integer identifier for an visit.


andy-slac · 2019-06-28T22:59:25Z

python/lsst/daf/butler/core/quantum.py

        The Run this Quantum is a part of.
+    initInputs : collection of `DatasetRef`


optional?

andy-slac · 2019-06-28T23:13:10Z

python/lsst/daf/butler/core/quantum.py

+                 **kwargs):
+        super().__init__(**kwargs)
+        if taskClass is not None:
+            taskName = f"{taskClass.__module__}.{taskClass.__name__}"


Everyone seems to prefer f-strings for simple string operations; I still like the old-fashioned

taskClass.__module__ + "." + taskClass.__name__

Yeah, I do like the f-string version better. In this case there isn't much of a difference, but I like using the same pattern for simple and more complex cases that seem to be part of the same family of patterns, and I think this qualifies.
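
For reference, a minimal sketch showing that the two spellings discussed above produce the same fully qualified name. `ExampleTask` and both helper functions are hypothetical stand-ins for illustration and are not part of this PR.

```python
class ExampleTask:
    """Hypothetical stand-in for a task class; not from the PR."""


def name_with_fstring(taskClass):
    # f-string form, as used in the diff.
    return f"{taskClass.__module__}.{taskClass.__name__}"


def name_with_concat(taskClass):
    # Concatenation form, as suggested in the review comment.
    return taskClass.__module__ + "." + taskClass.__name__


# Both forms build the same "module.ClassName" string.
same = name_with_fstring(ExampleTask) == name_with_concat(ExampleTask)
```

The choice is purely stylistic here; the f-string form scales more uniformly once formatting specifiers or more pieces are involved.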

andy-slac · 2019-06-28T23:15:42Z

python/lsst/daf/butler/core/quantum.py

+        if initInputs is None:
+            initInputs = {}
+        elif not hasattr(initInputs, "keys"):
+            initInputs = {ref.datasetType: ref for ref in initInputs}


Could it be that there is more than one ref in initInputs with the same dataset type?

No, the PipelineTask declarations already assume all initInput and initOutput dataset types are scalar (and they have to be because they have empty data IDs).
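
A minimal sketch of the normalization under discussion, assuming only that each ref exposes a `datasetType` attribute. `FakeRef` and `normalize_init_inputs` are hypothetical names, and the duplicate check is an addition for illustration: the dict comprehension in the diff would instead keep the last ref seen for each dataset type.

```python
from collections import namedtuple

# Hypothetical stand-in for DatasetRef; only the attribute used below is modeled.
FakeRef = namedtuple("FakeRef", ["datasetType", "id"])


def normalize_init_inputs(initInputs=None):
    """Return a dict keyed by dataset type, following the logic in the diff."""
    if initInputs is None:
        return {}
    if hasattr(initInputs, "keys"):
        # Already a mapping: take a shallow copy.
        return dict(initInputs)
    result = {}
    for ref in initInputs:
        # Stricter than the diff: refuse to silently overwrite a duplicate.
        if ref.datasetType in result:
            raise ValueError(
                f"Multiple init-input refs for dataset type {ref.datasetType!r}"
            )
        result[ref.datasetType] = ref
    return result


refs = [FakeRef("schema", 1), FakeRef("config", 2)]
by_type = normalize_init_inputs(refs)
```

If the scalar assumption ever stopped holding, a check like this would turn a silent overwrite into an explicit error.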

andy-slac · 2019-06-28T23:23:15Z

python/lsst/daf/butler/core/quantum.py

+        self._initInputs = NamedKeyDict(initInputs)
+        self._predictedInputs = NamedKeyDict(predictedInputs if predictedInputs is not None else {})
+        self._actualInputs = NamedKeyDict(actualInputs if actualInputs is not None else {})
+        self._outputs = NamedKeyDict(outputs if outputs is not None else {})


Some serious typing could be saved if the NamedKeyDict constructor accepted None. Or can't you use the old `outputs or {}` pattern?

I don't think None makes sense for NamedKeyDict in general, so I'd prefer not to move the handling there. And I once asked about a variant of the `outputs or {}` pattern on Slack/#software-def, and it was not popular (people considered it a Perl pattern that was a Python antipattern). But I've come up with another solution: it's safe to pass an empty tuple to NamedKeyDict's constructor, so I'll just use that as the default for these arguments.
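
A rough sketch of the empty-tuple default idea. `NamedKeyDictSketch`, `QuantumSketch`, and `Key` are simplified, hypothetical stand-ins, not the classes in this PR; the sketch assumes only that the dictionary constructor accepts any iterable of key/value pairs, as the built-in `dict` does.

```python
from collections import namedtuple

# Hypothetical key type with a ``name`` attribute.
Key = namedtuple("Key", ["name"])


class NamedKeyDictSketch(dict):
    """Simplified stand-in: a dict whose keys must carry a ``name`` attribute."""

    def __init__(self, items=()):
        # dict(()) is an empty dict, so an empty tuple means "no items".
        super().__init__(items)
        for key in self:
            if not hasattr(key, "name"):
                raise TypeError(f"Key {key!r} has no 'name' attribute")


class QuantumSketch:
    """Stand-in showing only the constructor defaults under discussion."""

    def __init__(self, predictedInputs=(), actualInputs=(), outputs=()):
        # Empty tuples are immutable, so they are safe as default arguments
        # and need no ``is not None`` checks before construction.
        self._predictedInputs = NamedKeyDictSketch(predictedInputs)
        self._actualInputs = NamedKeyDictSketch(actualInputs)
        self._outputs = NamedKeyDictSketch(outputs)


first = QuantumSketch()
second = QuantumSketch()
first._outputs[Key("calexp")] = "some ref"
```

Because each instance builds its own dictionary from the shared immutable default, mutating one instance's outputs does not affect another's.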

MultipleDatasetQueryBuilder and its Registry-level interface were highly specialized for QuantumGraph generation but didn't actually do what we need. DataIdQueryBuilder is simpler, lower-level and hopefully more generally useful. The higher-level logic for QuantumGraph generation is being moved to pipe_base, which will now use DataIdQueryBuilder (and SingleDatasetQueryBuilder) directly.

andy-slac

Looks OK, one minor comment

andy-slac · 2019-07-02T18:24:33Z

python/lsst/daf/butler/sql/queryBuilder.py


        Parameters
        ----------
-        sqlExpression
-            SQLAlchemy boolean column expression.
+        expression : str


backticks around str?

In particular, do not begin transactions before read-only operations, and use nested transactions on write operations. This seems to be necessary in order to do a Datastore read inside a QueryBuilder result-iteration loop.
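
As an illustration of that policy only, here is a sketch using the standard-library `sqlite3` module, with SAVEPOINTs playing the role of nested transactions. This is an assumption-laden stand-in and not the butler's actual database code; `nested_transaction` and the `dataset` table are hypothetical.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def nested_transaction(conn, name):
    """Run writes inside a SAVEPOINT; undo only that savepoint on error."""
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield conn
    except Exception:
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        conn.execute(f"RELEASE SAVEPOINT {name}")


# isolation_level=None selects autocommit mode, so sqlite3 does not open
# a transaction implicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE dataset (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany("INSERT INTO dataset (id) VALUES (?)", [(1,), (2,), (3,)])

# Read-only operation: no transaction is begun explicitly.
ids = [row[0] for row in conn.execute("SELECT id FROM dataset ORDER BY id")]

# Write operation that succeeds inside its own savepoint.
with nested_transaction(conn, "ok"):
    conn.execute("UPDATE dataset SET note = 'seen' WHERE id = 1")

# Write operation that fails; only this savepoint is rolled back.
try:
    with nested_transaction(conn, "bad"):
        conn.execute("UPDATE dataset SET note = 'oops' WHERE id = 2")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

notes = dict(conn.execute("SELECT id, note FROM dataset"))
```

The failed write leaves the earlier successful write intact, which is the property that makes savepoint-style nesting attractive for writes issued from inside a larger operation.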

andy-slac approved these changes Jun 29, 2019

TallJimbo added 11 commits June 30, 2019 16:33

Find the best join table for indirection, not just the first one.

4494cde

Add optional data ID to Quantum (in Python).

5ca429b

Eventually we may want to add this to the database representation as well, but that's a bit tricky due to the variable number of dimensions, and if we partition the dataset table in the future that may provide a model.

Add custom dictionary class for keys with "name" attributes.

2b1f916

Remove Registry methods for dealing with Quantum objects.

af1edff

These weren't really thought through and are not currently in use, so right now they're just a maintenance burden.

Augment and improve Quantum attributes.

b49ade0

Add SQL-compiling __str__ to QueryBuilder.

2ba909c

Add matches method to DataId.

5858290

Add property to get empty DimensionGraph for a DimensionUniverse.

0dc977f

Fix typo in docs.

77f334f

Add temporal join between visit and calibration_label.

b4b5fd8

TallJimbo force-pushed the tickets/DM-19988 branch 2 times, most recently from 6db6cfb to eaeeae2 Compare July 1, 2019 22:51

andy-slac approved these changes Jul 2, 2019

TallJimbo added 3 commits July 12, 2019 11:05

Fix doc errors in QueryBuilder.

3643f90

Add QueryBuilder method for DataId-based WHERE expressions.

e2b8c28

TallJimbo force-pushed the tickets/DM-19988 branch from ed50c93 to d14b9c9 Compare July 12, 2019 15:05

TallJimbo merged commit 730ec69 into master Jul 12, 2019

TallJimbo deleted the tickets/DM-19988 branch July 12, 2019 15:17
