Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-26591: Include instrument data ID value when provided on pipetask command-line or Pipeline yaml file #144

Merged
merged 2 commits into from Sep 16, 2020

Conversation

n8pease
Copy link
Contributor

@n8pease n8pease commented Sep 11, 2020

No description provided.

The name of the instrument in the query string, or `None` if an
instrument is not named.
"""
instrumentRegex = r"instrument *="
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written this does not need to be a raw string. Saying that, in regexes I tend to prefer \s* rather than * for whitespace matching. Also though, why can't this regex also include the instrument in it as a group so that you can ask for group 1 directly, rather than taking the unmatched part and splitting on space after?

Something like:

instrumentRegex =r"instrument\s*=\s*[\"'](.*?)[\"']"

works for me and the instrument is then match.group(1).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better:

r"\Winstrument\s*=\s*[\"'](.*?)[\"']"

to avoid matching "NotAnInstrument = '1'".
I would actually suggest using expression parser that knows about query syntax to avoid surprises (ask me for more artificial examples if you want to know 🙂)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And in our query language string literals use single quotes, like in SQL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably use \binstrument for word boundary...

instrument = pipeline.getInstrument()
if instrument is not None:
if isinstance(instrument, str):
instrument = getInstrument(pipeline._pipelineIR.instrument, self.registry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this pipeline._pipelinesIR exactly what pipeline.getInstrument() does above? I'm trying to work out why the argument here isn't instrument.

Copy link
Member

@TallJimbo TallJimbo Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(edit: moved to the other thread, where I had intended to put this comment)

"""Test getting the instrument from the query."""
queries = (("tract = 42 and instrument = 'HSC'", "HSC"),
("tract=42 and INSTRUMENT='HSC'", "HSC"),
('tract=42 and Instrument = "HSC"', "HSC"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There need to be tests where instrument is first and then tract is second, and also one where instrument is in the middle of two other clauses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I guess also a tract=42 and notinstrument = "HSC" and instrument = "LSSTCam"

Copy link
Contributor

@andy-slac andy-slac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do the right thing and parse user expression correctly. I'll give you an example of how to do it on JIRA, hopefully today.

The name of the instrument in the query string, or `None` if an
instrument is not named.
"""
instrumentRegex = r"instrument *="
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better:

r"\Winstrument\s*=\s*[\"'](.*?)[\"']"

to avoid matching "NotAnInstrument = '1'".
I would actually suggest using expression parser that knows about query syntax to avoid surprises (ask me for more artificial examples if you want to know 🙂)

The name of the instrument in the query string, or `None` if an
instrument is not named.
"""
instrumentRegex = r"instrument *="
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And in our query language string literals use single quotes, like in SQL.

if queryInstrument is None:
# There is not an instrument in the query, add it:
restriction = f"instrument = '{instrument.getName()}'"
_LOG.info(f"Adding restriction \"{restriction}\" to query.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe debug, that is not super-useful as INFO, unless you like to annoy people 🙂

Copy link
Contributor

@andy-slac andy-slac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Travis failed due to flake8 errors, and I'm not sure that test would run because it imports something that is not there anymore.

@@ -745,6 +746,78 @@ def makeQuantumGraph(self):
return graph


from lsst.daf.butler.registry.queries.exprParser import TreeVisitor, ParserYacc, ParseError
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should go to the top

# Since there is an instrument in the query, it should match
# the instrument in the pipeline.
raise RuntimeError(f"The instrument named in the query (\"{queryInstruments[0]}\") does not "
f"match the instrument named by the pipeline (\"{instrumentName}\")")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation.

"""Tests of things related to the GraphBuilder class."""

import unittest
from unittest.mock import patch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flake8 says its not used

from unittest.mock import patch

from lsst.pipe.base import GraphBuilder
from lsst.pipe.base.graphBuilder import findInstruments
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one too, and I think it does not exist.

Copy link
Contributor

@andy-slac andy-slac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK.

@n8pease n8pease merged commit d6f3b0e into master Sep 16, 2020
@timj timj deleted the tickets/DM-26591 branch April 13, 2022 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants