Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-43479: Add ability to use where/bind clauses in LogBrowser #38

Merged
merged 1 commit into from Mar 27, 2024

Conversation

laurenam
Copy link
Contributor

No description provided.

Copy link
Contributor

@mfisherlevine mfisherlevine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, do with the one functional comment as you see fit, I'm fine with whatever.

@@ -82,7 +88,7 @@ def _getDataRefs(self):
dataRefs : `list` [`lsst.daf.butler.core.datasets.ref.DatasetRef`]
"""
results = self.butler.registry.queryDatasets(
f"{self.taskName}_log", collections=self.collection, findFirst=True
f"{self.taskName}_log", collections=self.collection, findFirst=True, where=self.whereStr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the default/null option here is an empty string and not None, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -50,7 +55,7 @@ class LogBrowser:
animal.

example usage:
logBrowser = LogBrowser(butler, taskName, collection)
logBrowser = LogBrowser(butler, taskName, collection, whereStr=whereStr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're supporting where then I wonder if we should also support passing in a bind option too. Don't worry if you don't think it's useful/can't be bothered, and as this is summit_extras (and will hopefully be absorbed into pipetask report anyway) I don't think it matters much, but just checking, especially given how you're using .format() in the example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of this binding option! Definitely worth adding, so I did (once I figured out what it was!) 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few reasons one should always use bind, I am told. There is a crazy foot-gun there though, where you mustn't use dimension names as the keys in the dict, which includes day_obs, but other than that, it's recommended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coolio...I added a note about the avoiding foot-gun :)

@laurenam laurenam force-pushed the tickets/DM-43479 branch 3 times, most recently from cba9f5d to dc99f23 Compare March 25, 2024 21:44
@laurenam laurenam changed the title DM-43479: Add ability to use a where clause in LogBrowser DM-43479: Add ability to use where/bind clauses in LogBrowser Mar 25, 2024
@@ -64,13 +77,23 @@ class LogBrowser:
"with gufunc signature (n?,k),(k,m?)->(n?,m?)",
]

def __init__(self, butler, taskName, collection):
def __init__(self, butler, taskName, collection, whereStr="", bindMap=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I'd call bindMap just bind, but then I'd probably also call whereStr just where tbh. I think these should be familiar terms for people, and also results in the common where=where, bind=bind pattern.

Comment on lines 93 to 94
f"Key '{key}' in bindMap is not in the whereStr provided:\n"
f"{self.whereStr}\nso no binding will take effect."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really want two newlines inside the single log message? That's unusual. I'm not even sure one is recommended tbh, though it's summit_extras, so I don't care much.

Loading in all logs associated with a given task in a given collection
can take a very long time (i.e. depending on how many dataRefs exist
for that task.)  This adds the ability to add a "where" clause to the
dataId search such that the log loading time can be greatly reduced.
A "bind" mapping can (optionally) be used with the where clause.  If
a given key provided in the bind mapping does not appear in the where
clause string, a warning will be logged (stating that no binding will
take effect on that key).
@laurenam laurenam merged commit 83f1fb8 into main Mar 27, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants