DM-26591: Include instrument data ID value when provided on pipetask command-line or Pipeline yaml file #144
The name of the instrument in the query string, or `None` if an
instrument is not named.
"""
instrumentRegex = r"instrument *="
As written this does not need to be a raw string. Saying that, in regexes I tend to prefer `\s*` rather than ` *` for whitespace matching. Also though, why can't this regex also include the instrument in it as a group so that you can ask for group 1 directly, rather than taking the unmatched part and splitting on space after?
Something like:
`instrumentRegex = r"instrument\s*=\s*[\"'](.*?)[\"']"`
works for me and the instrument is then `match.group(1)`.
Even better:
`r"\Winstrument\s*=\s*[\"'](.*?)[\"']"`
to avoid matching "NotAnInstrument = '1'".
I would actually suggest using an expression parser that knows about the query syntax to avoid surprises (ask me for more artificial examples if you want to know 🙂)
And in our query language string literals use single quotes, like in SQL.
I'd probably use `\binstrument` for word boundary...
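To make the difference between the `\W` and `\b` variants suggested above concrete, here is a minimal sketch. The helper name `instrument_from_query` and both compiled patterns are hypothetical and are not the code under review; the sketch assumes single-quoted literals only and case-insensitive matching of the keyword.

```python
import re

# Hypothetical patterns for illustration; neither is the code under review.
# \b asserts a word boundary without consuming a character.
BOUNDARY_RE = re.compile(r"\binstrument\s*=\s*'(.*?)'", re.IGNORECASE)
# \W must consume one non-word character before "instrument".
NONWORD_RE = re.compile(r"\Winstrument\s*=\s*'(.*?)'", re.IGNORECASE)


def instrument_from_query(query):
    """Return the first instrument literal found in ``query``, or None."""
    match = BOUNDARY_RE.search(query)
    return match.group(1) if match else None


# Both variants reject "NotAnInstrument = '1'", but only \b also matches
# when the instrument clause is the very first thing in the query.
leading = "instrument='HSC' and tract=42"
print(NONWORD_RE.search(leading) is None)   # True: nothing precedes the keyword
print(instrument_from_query(leading))       # HSC
```

A regex still cannot tell a keyword from text inside a string literal: `instrument_from_query("name = 'instrument = ' and band = 'r'")` returns `" and band = "`, which is the kind of surprise a real expression parser avoids.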
instrument = pipeline.getInstrument()
if instrument is not None:
    if isinstance(instrument, str):
        instrument = getInstrument(pipeline._pipelineIR.instrument, self.registry)
Isn't this `pipeline._pipelineIR.instrument` exactly what `pipeline.getInstrument()` returns above? I'm trying to work out why the argument here isn't `instrument`.
(edit: moved to the other thread, where I had intended to put this comment)
tests/test_graphBuilder.py
Outdated
"""Test getting the instrument from the query."""
queries = (("tract = 42 and instrument = 'HSC'", "HSC"),
           ("tract=42 and INSTRUMENT='HSC'", "HSC"),
           ('tract=42 and Instrument = "HSC"', "HSC"),
There need to be tests where instrument is first and then tract is second, and also one where instrument is in the middle of two other clauses.
And I guess also a case like `tract=42 and notinstrument = "HSC" and instrument = "LSSTCam"`.
Let's do the right thing and parse the user expression correctly. I'll give you an example of how to do it on JIRA, hopefully today.
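As a rough illustration of why tokenizing first handles the cases requested in this thread, here is a small self-contained sketch. It is a stand-in, not the butler expression parser: the token grammar, `tokenize`, and `find_instrument_literals` are all hypothetical, and it only recognizes the form `instrument = '<literal>'` with single-quoted strings.

```python
import re

# Hypothetical, deliberately tiny token grammar (not the real query syntax).
TOKEN_RE = re.compile(r"""
    (?P<string>'[^']*')           # single-quoted literal, as in SQL
  | (?P<ident>[A-Za-z_][\w.]*)    # identifier or keyword, possibly dotted
  | (?P<number>\d+(?:\.\d+)?)     # integer or decimal number
  | (?P<op>!=|<=|>=|[=<>(),])     # comparison operators and punctuation
  | (?P<ws>\s+)                   # whitespace, dropped below
""", re.VERBOSE)


def tokenize(query):
    """Split ``query`` into (kind, text) pairs, skipping whitespace."""
    pos = 0
    tokens = []
    while pos < len(query):
        match = TOKEN_RE.match(query, pos)
        if match is None:
            raise ValueError(f"Cannot tokenize query at position {pos}: {query[pos:]!r}")
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def find_instrument_literals(query):
    """Return every literal compared to ``instrument`` with ``=``."""
    tokens = tokenize(query)
    found = []
    # Slide a three-token window: identifier, "=", string literal.
    for (kind, name), (_, op), (value_kind, value) in zip(tokens, tokens[1:], tokens[2:]):
        if kind == "ident" and name.lower() == "instrument" and op == "=" and value_kind == "string":
            found.append(value[1:-1])  # strip the surrounding quotes
    return found
```

Because string literals become single tokens, text such as `'instrument = '` inside a literal is never mistaken for a clause, and `notinstrument` is a different identifier. The sketch ignores `OR`, `NOT`, `IN (...)`, and reversed comparisons, which a real parser and tree visitor would need to handle.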
if queryInstrument is None:
    # There is not an instrument in the query, add it:
    restriction = f"instrument = '{instrument.getName()}'"
    _LOG.info(f"Adding restriction \"{restriction}\" to query.")
Maybe `debug`, that is not super-useful as INFO, unless you like to annoy people 🙂
aefa31f to df50645
df50645 to a7e145e
Travis failed due to flake8 errors, and I'm not sure that test would run because it imports something that is not there anymore.
@@ -745,6 +746,78 @@ def makeQuantumGraph(self):
        return graph


from lsst.daf.butler.registry.queries.exprParser import TreeVisitor, ParserYacc, ParseError
Should go to the top
# Since there is an instrument in the query, it should match
# the instrument in the pipeline.
raise RuntimeError(f"The instrument named in the query (\"{queryInstruments[0]}\") does not "
f"match the instrument named by the pipeline (\"{instrumentName}\")")
Indentation.
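The two hunks reviewed above (adding a restriction when the query names no instrument, and raising when the query and pipeline disagree) can be sketched as one function. This is a minimal sketch under stated assumptions: `restrict_to_instrument` and its parameters are hypothetical, the way the restriction is combined with the user query is a guess, and the message is logged at debug level as suggested in review.

```python
import logging

_LOG = logging.getLogger(__name__)


def restrict_to_instrument(user_query, instrument_name, query_instruments):
    """Return a query consistent with the pipeline's instrument.

    ``query_instruments`` is the list of instrument names already found in
    ``user_query`` (hypothetical input; how it is extracted is not shown).
    """
    restriction = f"instrument = '{instrument_name}'"
    if not query_instruments:
        # No instrument in the query, so add one.
        _LOG.debug("Adding restriction \"%s\" to query.", restriction)
        # Assumption: parenthesize the user's query so any OR clauses in it
        # are not regrouped by the appended "and".
        return f"({user_query}) and {restriction}" if user_query else restriction
    mismatched = [name for name in query_instruments if name != instrument_name]
    if mismatched:
        # Since there is an instrument in the query, it should match
        # the instrument in the pipeline.
        raise RuntimeError(
            f"The instrument named in the query (\"{mismatched[0]}\") does not "
            f"match the instrument named by the pipeline (\"{instrument_name}\")"
        )
    return user_query
```

Passing `%s` arguments to the logger rather than an f-string defers formatting until a handler actually emits the record.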
tests/test_graphBuilder.py
Outdated
"""Tests of things related to the GraphBuilder class."""

import unittest
from unittest.mock import patch
flake8 says it's not used.
tests/test_graphBuilder.py
Outdated
from unittest.mock import patch

from lsst.pipe.base import GraphBuilder
from lsst.pipe.base.graphBuilder import findInstruments
This one too, and I think it does not exist.
a7e145e to 5fcde30
6f43a06 to f009dcb
Looks OK.
f009dcb to acf6da9