Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-32048: Change tests to use the new pipetask API. #73

Merged
merged 1 commit into from Oct 12, 2021

Conversation

erykoff
Copy link
Contributor

@erykoff erykoff commented Oct 8, 2021

No description provided.

Copy link
Contributor

@parejkoj parejkoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handful of comments. It's nice to not need click here.

configFiles : `list` [`str`], optional
List of config files to use (with "-C").
List of config files to use. Each string will be of the form
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd maybe replace "to use" with "to load, in order".

Similarly for the configOptions below.

pipeline.addConfigFile(taskName, fileName)
for configOption in configOptions:
taskName, others = configOption.split(':')
option, value = others.split('=')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not keen on the string manipulation here: I feel like these could be dictionaries instead, so you don't have to muck with strings. I found that to be an improvement in jointcal: just pass a dict of the config field:value you want to apply. But you're also specifying different tasks, so maybe it's not as nice? Would a nested dict be bad?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love nested dicts, but they do the trick in this case.

@@ -92,16 +88,18 @@ def _runPipeline(self, repo, pipelineFile, queryString=None,
Pipeline definition file.
queryString : `str`, optional
String to use for "-d" data query.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be "where query that defines the data to use" or something like that, since -d is no longer relevant.

# We are reusing the outputCollection so we can't specify the input
inputCollections = None
# We need to know the correct input collections to be able to
# reuse the output collection with the pipetask API.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm having a little trouble parsing this comment. If I didn't know about the discussion you and Jim had on slack, I'd be a bit lost. Not sure how to reword it though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

            # In these tests we are running the fit cycle task multiple
            # times into the same output collection.  This code allows
            # us to find the correct chained input collections to use
            # so that we can both read from previous runs in the output
            # collection and write to a new run in the output collection.
            # Note that this behavior is handled automatically by the
            # pipetask command-line interface, but not by the python
            # API.

@@ -608,7 +602,7 @@ def _testFgcmMultiFit(self, instName, testName, queryString, visits, zpOffsets):
'pipelines',
f'fgcmFullPipeline{instCamel}.yaml'),
configFiles=configFiles,
inputCollections=f'{instName}/{testName}/lut,refcats/gen2',
inputCollections=[f'{instName}/{testName}/lut', 'refcats/gen2'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did the same thing above, but put a newline instead of a space before refcats. I'm not sure it matters, but the newline was maybe a bit more easy to parse.

@erykoff erykoff merged commit 89e8d3e into master Oct 12, 2021
@erykoff erykoff deleted the tickets/DM-32048 branch October 12, 2021 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants