New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-25919: utilize new butler query functionality in QuantumGraph generation #139
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments
_LOG.debug("Iterating over query results to associate quanta with datasets.") | ||
# Iterate over query results, populating data IDs for datasets and | ||
# quanta and then connecting them to each other. | ||
n = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for typing you generally just write n: int
and not add other interpreter work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't just for typing; there is logging code after the loop body that uses n
, and we don't want that to crash if the loop is never executed.
@@ -495,6 +497,15 @@ def connectDataIds(self, registry, collections, userQuery): | |||
Object representing the collections to search for input datasets. | |||
userQuery : `str`, optional | |||
User-provided expression to limit the data IDs processed. | |||
|
|||
Returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not right, this function is not a function once the context manager hits it. The function call actually returns a object that can be used in a with statement. The result of returned_object.enter is what you are documenting here. I really don't know how to document that, but what you wrote here is potentially confusing to someone trying to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know. This seemed like the the least-bad way to document it, and I figured the docstring itself explained the relationship to the context manager well enough.
for datasetType, refs in itertools.chain(self.inputs.items(), self.intermediates.items(), | ||
self.outputs.items()): | ||
datasetDataId = commonDataId.subset(datasetType.dimensions) | ||
ref = refs.get(datasetDataId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider just biting the bullet and doing the double hash lookup that should be fast to make this cleaner to read (and below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm using get
here to avoid try...__getitem__...except
for control flow, not to avoid a double lookup. Or maybe I'm just not seeing what you're proposing as an alternative?
This avoids the many-single-row-queries problem in data ID expansion and regular dataset lookups. The biggest problem, in prerequisite dataset lookups, still exists, but will probably be deferred to another ticket that builds on this one.
Higher level logic can explicitly skip writing these if they already exist, so this code can't consider that an error.
This addresses a change to the default behavior of queryDatasets in daf_butler.
c504440
to
f33be83
Compare
@natelust, there are some conversations ongoing on this ticket (which I'd forgotten about when I asked you whether you'd signed off earlier), but Jenkins is green, this is a multi-package merge, and I don't want to miss my window and have to get through Jenkins again. I'm happy to fix anything that comes out of these ongoing conversations on my next ticket. |
No description provided.