bidirectional mapping of inputs and outputs between original WDL and dx #191

Open
notestaff opened this issue Feb 13, 2019 · 8 comments

@notestaff

To compare the same workflow run locally on Cromwell versus remotely on dx, one needs to know the correspondence between input and output names. I know dxWDL has the -inputs option, but (1) it cannot be run without creating new files on dx, and (2) it does not cover intermediate or final outputs. Would it be possible to add an option that writes a file giving the full correspondence?

@orodeh
Contributor

orodeh commented Feb 13, 2019

Could you give an example?

@notestaff
Author

If you have a dx analysis that used a WDL workflow compiled by dxWDL, and now want to re-run it locally (maybe with some parameter change), what would be the steps? I have written some ad-hoc code to convert the JSON output of dx describe analysis-xxxxx into the corresponding Cromwell input file, but it relies, for example, on knowing that stage-0/common holds the workflow-level arguments. It would be better if dxWDL had more direct support for doing this.
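
For illustration, here is a minimal Python sketch of the kind of heuristic conversion described above. It is not dxWDL's or anyone's actual implementation. It assumes that the describe JSON has an "input" object, that workflow-level arguments are keyed with a stage prefix such as "stage-0." or "stage-common.", and that Cromwell expects keys of the form "<workflow name>.<input name>". The function name and the prefix list are hypothetical.

```python
import json


def dx_inputs_to_cromwell(describe, workflow_name,
                          stage_prefixes=("stage-0.", "stage-common.")):
    """Heuristically map dx analysis inputs to Cromwell-style input keys.

    `describe` is the parsed JSON of a dx analysis description.
    `stage_prefixes` is an assumption about how dxWDL names the stage
    that carries workflow-level arguments; it may differ across versions.
    """
    cromwell_inputs = {}
    for key, value in describe.get("input", {}).items():
        for prefix in stage_prefixes:
            if key.startswith(prefix):
                # Strip the stage prefix and qualify with the workflow name.
                name = key[len(prefix):]
                cromwell_inputs[f"{workflow_name}.{name}"] = value
                break
    return cromwell_inputs


if __name__ == "__main__":
    # Made-up example data, not real dx output.
    example = {"input": {"stage-0.ref_name": "hg38", "stage-1.threads": 4}}
    print(json.dumps(dx_inputs_to_cromwell(example, "my_wf"), indent=2))
```

File-valued inputs on dx are links to platform objects rather than local paths, so a real conversion would also have to translate or download them. This sketch passes values through unchanged, which is part of why such heuristics are fragile.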

@orodeh
Contributor

orodeh commented Feb 14, 2019

I see what you mean. It is somewhat like the --input flag which takes a JSON file of Cromwell inputs, but for workflow outputs. It is an interesting enhancement idea. I can think about it for the next compiler version.

@notestaff
Author

@orodeh Thanks for considering it. Ideally, dxWDL would provide a command to map the output of 'dx describe --json analysis-xxxxxx' to the corresponding metadata.json from Cromwell, and vice versa, for a workflow compiled by dxWDL. I've been doing it using heuristics, but that is fragile because it relies on assumptions about dxWDL's inner workings.

@jdidion
Contributor

jdidion commented Feb 12, 2020

I have actually already implemented this in pytest-wdl, in the dxWDL executor. I will write a stand-alone command-line tool that uses that code. Later we can look at re-implementing it in dxWDL.

@notestaff
Author

notestaff commented Feb 12, 2020

Great, thanks!

Another possible place to put this info is the details field, where womSourceCode now goes; alternatively, it could be added as workflow metadata to the WDL stored in womSourceCode. Then the mapping will be retrievable even if later versions of dxWDL change how the mapping is done.

@jdidion
Contributor

jdidion commented Feb 12, 2020

I have started on this here: https://github.com/dnanexus/dxWDL/tree/feat/191-input-mapping/contrib/io_mapping. For now it is a separate tool written in Python.

@jdidion
Contributor

jdidion commented Feb 13, 2020

The tool now works for both input mapping (Cromwell -> DNAnexus) and output mapping (DNAnexus -> Cromwell). @notestaff please test.


3 participants