Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM- 26629: switch to calibration collections instead of the calibration_label dimension #148

Merged
merged 5 commits into from Sep 26, 2020

Conversation

TallJimbo
Copy link
Member

No description provided.

This method was clearly intended for this purpose, but wasn't being
used, and couldn't actually be used due to a missing base-class version
and lack of support for parent storage classes.
QG generation and the executor classes already guarantee this class is
given resolved DatasetRefs, so this always saves an extra Registry
lookup, and it's now necessary for calibration lookups, which can no
longer be done on data ID alone (a timespan is needed as well).
Copy link
Contributor

@natelust natelust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it mostly looks fine

@@ -71,11 +71,11 @@ def __init__(self, butler: Butler, quantum: Quantum):
def _get(self, ref):
if isinstance(ref, DeferredDatasetRef):
self._checkMembership(ref.datasetRef, self.allInputs)
return butler.getDeferred(ref.datasetRef)
return butler.getDirectDeferred(ref.datasetRef)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not supper comfortable linking this to something that MUST have been done by some other class. At minimum you should throw an execption here if ref.id is none (or in the constructor), and add something to the documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getDirectDeferred will raise an exception if the ref.id is None.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks to Tim's comment on the daf_butler PR. As we discussed out-of-band, I'm also uncomfortable with this relying on logic in ctrl_mpexec, but I think the problem is where that code lives, not what it does or guarantees. But in addition to that check for not-None ID (which we can defer to butler), I'll go add some documentation to the ButlerQuantumContext class docs and anywhere else I can find that resolved DatasetRefs are a precondition.


else:
self._checkMembership(ref, self.allInputs)
return butler.get(ref)
return butler.getDirect(ref)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getDirect already gets upset if it's not a resolved ref.

refs = list(
lookupFunction(datasetType, registry, quantum.dataId, collections)
)
elif (datasetType.isCalibration()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only comment about this block is that I had always intended to go back and make the else a free function so it was easier to use in a custom lookupFunction that wanted to extend that behavior. Perhaps you could do that and the same for the calibration bits if you think it is worth it. Its not too much code to copy for someone, and its use will be small so if you dont think its a good use of time I am fine with you not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that's a good idea, but this ticket is too big and cumbersome for any more scope, and I definitely plan to be revisit this block in the next month anyway.

# temporal join on a non-dimension-based timespan yet.
timespan = quantum.dataId.timespan
try:
refs = [registry.findDataset(datasetType, quantum.dataId,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be sure that this will only ever return one, is an exception raised if more than one could be found for a timespan? Does the certification process prevent this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, findDataset will raise in that case. The certification process is supposed to guarantee that is impossible for any calibration I can think of, but it's not something we can guarantee at the level of database constraints. So the exception basically says, "someone probably put together a malformed calibration repo".

refs = list(registry.queryDatasets(datasetType,
collections=collections,
dataId=quantum.dataId,
deduplicate=True).expanded())
quantum.prerequisites[datasetType].update({ref.dataId: ref for ref in refs})
quantum.prerequisites[datasetType].update({ref.dataId: ref for ref in refs
if ref is not None})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my ticket lands first, this is broken, infact... looking at this makes me feel I need to go look back at my ticket...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, this is a QuantumScaffolding

newRefsForDatasetType.append(resolvedRef)
else:
newRefsForDatasetType.append(ref)
refsForDatasetType[:] = newRefsForDatasetType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes too with my ticket, the race is on, maybe I should have held back these comments? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least (I imagine) it'll be an easy change in either case.

@TallJimbo TallJimbo merged commit 1ce627f into master Sep 26, 2020
@TallJimbo TallJimbo deleted the tickets/DM-26629 branch September 26, 2020 05:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants