secondaryFiles are not discovered? #211

Open
buchanae opened this issue Mar 15, 2017 · 7 comments
@buchanae
Contributor

buchanae commented Mar 15, 2017

I'm testing bunny with a TES service.

bunny + TES is failing a conformance test (#83) which includes secondaryFiles.

bunny without TES is passing that test, but curiously, when I carefully inspect the bunny logs, all of the logged data structures show secondaryFiles=[]. I suspect that the test is passing because it shares the same local filesystem.

I'm testing against this branch of bunny, which could also be a factor.

Are you aware of this behavior?
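
To illustrate the suspicion above, here is a minimal Python sketch of how an empty `secondaryFiles` list could go unnoticed on a shared local filesystem and still fail once inputs are staged elsewhere. It is not bunny or TES code: `stage`, `index_visible`, the file names, and the `.fai` suffix are all hypothetical stand-ins chosen for the example.

```python
import os
import shutil
import tempfile


def stage(file_value, dest_dir):
    """Copy a File value and only its declared secondaryFiles to dest_dir."""
    listed = [file_value["path"]] + [s["path"] for s in file_value["secondaryFiles"]]
    for src in listed:
        shutil.copy(src, dest_dir)
    return os.path.join(dest_dir, os.path.basename(file_value["path"]))


def index_visible(primary_path):
    """Stand-in for a tool that expects an index file next to its input."""
    return os.path.exists(primary_path + ".fai")


with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as remote:
    primary = os.path.join(local, "ref.fasta")
    for name in (primary, primary + ".fai"):
        open(name, "w").close()

    # Input as described in the logs: secondaryFiles was never populated.
    job_input = {"class": "File", "path": primary, "secondaryFiles": []}

    # Tool runs in place: the index sits next to the primary file anyway.
    local_ok = index_visible(job_input["path"])
    # Tool runs on a staged copy: only the declared files were transferred.
    remote_ok = index_visible(stage(job_input, remote))
    print(local_ok, remote_ok)
```

In this sketch the in-place run finds the index even though it was never declared, while the staged run does not, which would match a test that passes locally and fails through a remote backend.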

@StarvingMarvin
Contributor

Here is the line where it's checked that secondary files actually exist at a given location. Ideally it should be invoked after a tool has finished its job and the files are indeed present locally to the executor. It's possible that there is a bug in the way bunny invokes TES, especially since it was originally coded only to a "proof of concept" stage. We are wrapping up work on the bunny on SBG integration in the next week or two, and we will then shift our focus to TES once again.
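
The kind of existence check described above can be sketched roughly as follows. This is a Python illustration only, not the Java line referenced in the comment; `existing_secondary_files` is a hypothetical helper, and it handles plain suffix patterns such as `.fai` but not caret patterns or expressions.

```python
import os
import tempfile


def existing_secondary_files(primary_path, suffixes):
    """Return the candidate secondary file paths that are present on disk.

    Only plain suffix patterns are handled here (e.g. ".fai").
    """
    candidates = [primary_path + s for s in suffixes]
    return [c for c in candidates if os.path.exists(c)]


with tempfile.TemporaryDirectory() as workdir:
    primary = os.path.join(workdir, "ref.fasta")
    # Create the primary file and one of the two expected secondary files.
    for name in (primary, primary + ".fai"):
        open(name, "w").close()

    found = existing_secondary_files(primary, [".fai", ".amb"])
    found_names = [os.path.basename(p) for p in found]
    print(found_names)
```

A check like this only gives a meaningful answer when it runs where the files actually live, which is why its timing relative to tool execution and staging matters.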

@adamstruck
Contributor

We have confirmed that this issue is not specific to the TES backend.

CWLInputPort doesn't contain any references to secondary files at all. I think the getSecondaryFiles method needs to be called somewhere in this region.
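
For context, the CWL v1.0 rule for expanding a `secondaryFiles` pattern against a primary input path can be sketched as below. This is a minimal Python illustration, not bunny's Java implementation or its `getSecondaryFiles` method; `resolve_secondary` and `attach_secondary_files` are hypothetical names, and expression-valued patterns are deliberately left out.

```python
import re


def resolve_secondary(primary_path: str, pattern: str) -> str:
    """Expand one CWL v1.0 secondaryFiles pattern against a primary path.

    Each leading caret strips one extension from the primary path;
    the remainder of the pattern is then appended.
    """
    path = primary_path
    while pattern.startswith("^"):
        # Remove the last extension from the file name, if there is one.
        path = re.sub(r"\.[^/.]+$", "", path)
        pattern = pattern[1:]
    return path + pattern


def attach_secondary_files(file_value: dict, patterns: list) -> dict:
    """Return a copy of a CWL File value with secondaryFiles filled in.

    Hypothetical helper: roughly what an input-loading step would need to do.
    """
    resolved = dict(file_value)
    resolved["secondaryFiles"] = [
        {"class": "File", "path": resolve_secondary(file_value["path"], p)}
        for p in patterns
    ]
    return resolved


ref = {"class": "File", "path": "/data/ref.fasta"}
result = attach_secondary_files(ref, [".fai", "^.dict"])
print([f["path"] for f in result["secondaryFiles"]])
```

If an input port never runs a step like this, every downstream consumer sees an empty `secondaryFiles` list regardless of which backend executes the job.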

@buchanae
Contributor Author

To be clear, we think this is specific to loading an app's inputs, rather than evaluating files in between jobs.

@sivkovic sivkovic modified the milestone: v1.0 Apr 12, 2017
simonovic86 added a commit that referenced this issue Apr 13, 2017
Discover secondary files for inputs (#211)
@adamstruck
Contributor

Is there an ETA on when this will be fixed? The TES backend isn't very functional without this. Any workflow that has inputs with required secondary files will fail.

@milos-ljubinkovic
Contributor

Hi, sorry for the very late response; we had some team changes and some platform issues that we had to address first.

Basic staging of secondary input files should be available in the latest release of bunny, but I'm not sure it will work the same way on TES. In any case, there is a somewhat older separate branch where it should work, "bugfix/tes", which deals with TES in a different way: without the rabix docker image, using direct conversion and file access instead. Last I checked, it was passing most CWL 1.0 conformance tests when run on funnel, with maybe 10-12 failing.

In the following days, if I get the time, I will update the branch with the latest fixes and I'll contact you with the new binary and configs.

@adamstruck
Contributor

@milos-ljubinkovic that would be great, thank you!

@dleehr

dleehr commented Dec 19, 2017

> We have confirmed that this issue is not specific to the TES backend.

Just here to +1 this issue. We've been exploring rabix bunny among other implementations that support parallelism, but our workflows use secondaryFiles for things like index files on a reference genome:

https://github.com/Duke-GCB/bespin-cwl/blob/52457d925f9c232394c38d6ccb890cced9f93e2c/workflows/exomeseq.cwl#L21-L30

so this issue stops our workflow pretty quickly.


6 participants