Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-39198: Ensure task connection storage classes are used in Quanta. #335

Merged
merged 3 commits into from May 17, 2023

Conversation

TallJimbo
Copy link
Member

@TallJimbo TallJimbo commented May 15, 2023

QuantumGraph should have the connection storage classes in its Quantum objects, and the registry storage classes for those dataset types in its "registryDatasetTypes" attribute. Apparently we were making no effort to ensure the former before.

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes

python/lsst/pipe/base/graphBuilder.py Outdated Show resolved Hide resolved
python/lsst/pipe/base/graphBuilder.py Outdated Show resolved Hide resolved
python/lsst/pipe/base/graphBuilder.py Outdated Show resolved Hide resolved
@@ -381,7 +399,10 @@ def __repr__(self) -> str:
quantum.
"""

def makeQuantum(self, datastore_records: Optional[Mapping[str, DatastoreRecordData]] = None) -> Quantum:
def makeQuantum(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this get reformatted? I can't see what changed. Was it reformatted because of an early API modification that was backed out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's exactly it. Black vs. the trailing commas.

QuantumGraph should have the connection storage classes in its Quantum
objects, and the registry storage classes for those dataset types in
its "registryDatasetTypes" attribute.  Apparently we were making no
effort to ensure the former before.
@codecov
Copy link

codecov bot commented May 15, 2023

Codecov Report

Patch coverage: 60.52% and project coverage change: -0.10 ⚠️

Comparison is base (943382d) 82.65% compared to head (a7ffabb) 82.55%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #335      +/-   ##
==========================================
- Coverage   82.65%   82.55%   -0.10%     
==========================================
  Files          62       62              
  Lines        6960     6978      +18     
  Branches     1381     1388       +7     
==========================================
+ Hits         5753     5761       +8     
- Misses        926      935       +9     
- Partials      281      282       +1     
Impacted Files Coverage Δ
...ython/lsst/pipe/base/script/transfer_from_graph.py 12.50% <0.00%> (-2.14%) ⬇️
python/lsst/pipe/base/graphBuilder.py 64.14% <82.14%> (+0.17%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.


# Get data repository definitions from the QuantumGraph; these can have
# different storage classes than those in the quanta.
dataset_types = {dstype.name: dstype for dstype in qgraph.registryDatasetTypes()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this line enough I'm starting to wonder if QuantumGraph could have a method that returned the dict.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 . I'm informally saving up a number of changes I'd like to make to QG.

for ref in original_output_refs:
internal_dataset_type = dataset_types.get(ref.datasetType.name, ref.datasetType)
if internal_dataset_type.storageClass_name != ref.datasetType.storageClass_name:
output_refs.add(ref.overrideStorageClass(internal_dataset_type.storageClass_name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
output_refs.add(ref.overrideStorageClass(internal_dataset_type.storageClass_name))
output_refs.add(ref.overrideStorageClass(internal_dataset_type.storageClass))

since the storage class should be attached and can be used directly without having to proxy through the name string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think that's safe, sure. I tend to default to using the name when I'm not sure because I don't want to accidentally trigger imports of things that might not be available, but that may just be a habit relevant for working down in the guts of registry, not here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at this point the storage class is going to be resolved into a real storage class almost immediately so it won't matter. I guess there's no harm in going via the name here and then deferring sotrage class creation as much as possible.

@TallJimbo TallJimbo merged commit b187eb6 into main May 17, 2023
8 of 10 checks passed
@TallJimbo TallJimbo deleted the tickets/DM-39198 branch May 17, 2023 03:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants