New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-31251: Add execution butler tests #13
Conversation
Ingest with direct mode so that we can test that side of the butler. This required an improved test for checking to see if ingest had already happened.
|
||
refresh_exe | ||
node=2 | ||
pipetask --long-log run -b "$tmp_butler" --output-run "$exerun" --qgraph "$graph_file" --qgraph-node-id $node --skip-init-writes --extend-run --clobber-outputs --skip-existing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is important to keep in mind for me. This is going to be less strait forward with UUID, as they are randomly assigned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have been pondering this. I think we might need a graph introspection subcommand that can report the node IDs or something -- or at least a command that can dump the graph in json and that I can then strip out the nodes using jq
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can give you a loadHeader
method that dumps the json header out. It will only be a few lines on my ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. So ideally something like:
nodes=$(pipetask qgraph --dump-header -q graph.qgraph | jq SOME_QUERY)
knowing that this graph is simple enough that the three nodes are all in sequence.
# of a previous chain. | ||
# If --replace-run is required an extra line to do --mode=pop should be added. | ||
butler collection-chain DATA_REPO --mode=extend "$exeoutput" ${incoll//,/ } | ||
butler collection-chain DATA_REPO --mode=prepend "$exeoutput" "$exerun" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this how we do it in production? Does BPS produce these commands as it creates the work graphs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it creates a final shell script to run and calls butler collection-chain
. This is what we were discussing at last week's standup. One outcome of this ticket is that I think this version here is a bit more robust than what we currently do in ctrl_bps -- the "extend" line is not there. One constraint at the moment is that the final script doesn't support bash logic -- just some bash commands to run in series -- and so as written we can't support --replace-run
in execution butler. This is the motivation for having a new pipetask subcommand that takes all the run-related options and can be a single call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I didn't quite remember how it all fit together at the outcome of that
No description provided.