-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-39198: Ensure task connection storage classes are used in Quanta. #335
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
Fix handling of storage classes in QuantumGraph generation. | ||
|
||
This could lead to a failure downstream in execution butler creation, and would likely have led to problems with Quantum-Backed Butler usage as well. |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -62,13 +62,26 @@ | |||||
qgraph = QuantumGraph.loadUri(graph) | ||||||
|
||||||
# Collect output refs that could be created by this graph. | ||||||
output_refs: set[DatasetRef] = set(qgraph.globalInitOutputRefs()) | ||||||
original_output_refs: set[DatasetRef] = set(qgraph.globalInitOutputRefs()) | ||||||
for task_def in qgraph.iterTaskGraph(): | ||||||
if refs := qgraph.initOutputRefs(task_def): | ||||||
output_refs.update(refs) | ||||||
original_output_refs.update(refs) | ||||||
for qnode in qgraph: | ||||||
for refs in qnode.quantum.outputs.values(): | ||||||
output_refs.update(refs) | ||||||
original_output_refs.update(refs) | ||||||
|
||||||
# Get data repository definitions from the QuantumGraph; these can have | ||||||
# different storage classes than those in the quanta. | ||||||
dataset_types = {dstype.name: dstype for dstype in qgraph.registryDatasetTypes()} | ||||||
|
||||||
# Convert output_refs to the data repository storage classes, too. | ||||||
output_refs = set() | ||||||
for ref in original_output_refs: | ||||||
internal_dataset_type = dataset_types.get(ref.datasetType.name, ref.datasetType) | ||||||
if internal_dataset_type.storageClass_name != ref.datasetType.storageClass_name: | ||||||
output_refs.add(ref.overrideStorageClass(internal_dataset_type.storageClass_name)) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
since the storage class should be attached and can be used directly without having to proxy through the name string. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you think that's safe, sure. I tend to default to using the name when I'm not sure because I don't want to accidentally trigger imports of things that might not be available, but that may just be a habit relevant for working down in the guts of registry, not here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think at this point the storage class is going to be resolved into a real storage class almost immediately so it won't matter. I guess there's no harm in going via the name here and then deferring sotrage class creation as much as possible. |
||||||
else: | ||||||
output_refs.add(ref) | ||||||
|
||||||
# Make QBB, its config is the same as output Butler. | ||||||
qbb = QuantumBackedButler.from_predicted( | ||||||
|
@@ -77,6 +90,7 @@ | |||||
predicted_outputs=[], | ||||||
dimensions=qgraph.universe, | ||||||
datastore_records={}, | ||||||
dataset_types=dataset_types, | ||||||
) | ||||||
|
||||||
dest_butler = Butler(dest, writeable=True) | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this line enough I'm starting to wonder if QuantumGraph could have a method that returned the dict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 . I'm informally saving up a number of changes I'd like to make to QG.