TES backend doesn't seem to support the CWL allowed exit codes #408

Open
kmhernan opened this issue Jan 11, 2018 · 12 comments

@kmhernan

When using Bunny with TES, a task that exits with a non-zero code that is nonetheless acceptable under the CWL spec is still interpreted as failed by the Rabix engine. I have tested the same workflow with the local execution backend, and it runs as expected. While TES may return an error state, the exit code is stored in the TES TaskLog message and should be used in this case to override the error state reported by the TES backend.
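To make the requested behavior concrete, here is a minimal sketch of the decision being asked for: consult the recorded exit code first and fall back to the backend's state only when no exit code is available. The class and method names are hypothetical and are not taken from Bunny or Funnel, and the sketch deliberately ignores other reasons a backend might report an error (for example, a failed upload).

```java
import java.util.Set;

// Hypothetical helper, not part of Bunny or Funnel: decides whether a finished
// task should count as successful for a CWL tool.
final class ExitCodeStatus {

    private ExitCodeStatus() {}

    /**
     * @param backendReportedError true if the backend marked the task as failed
     * @param exitCode             exit code recorded for the command, or null if none was recorded
     * @param successCodes         exit codes the tool declares as acceptable (may be empty)
     */
    static boolean isSuccess(boolean backendReportedError, Integer exitCode, Set<Integer> successCodes) {
        if (exitCode == null) {
            // No exit code to consult, so fall back to the backend's own verdict.
            return !backendReportedError;
        }
        // Assumption: 0 always counts as success, plus any code the tool allows.
        return exitCode == 0 || successCodes.contains(exitCode);
    }
}
```

With this sketch, a task that the backend flags as failed but that exited with an allowed code, such as 1 when the allowed set contains 1, would be treated as successful.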

@milos-ljubinkovic
Contributor

Are the output files uploaded when the TES task errors-out?

@buchanae
Contributor

> Are the output files uploaded when the TES task errors-out?

Ah, I hadn't thought about this.

Currently, Funnel will stop processing on the first failed executor and will not upload output files. We've discussed changing this in order to provide a sort of "best effort" behavior, where Funnel tries to get you all the data it has. In other words, we could try to make Funnel upload any outputs it can find. There are some details to iron out, though. Currently it's an error if an output isn't found, which wouldn't be appropriate in this situation.

@milos-ljubinkovic
Contributor

Bunny could wrap tools that define successCodes in a command that always exits with 0 but stores the actual exit code somewhere, and then evaluate the success state in the post-processing stage. This makes sense because it's a CWL feature, so it should work independently of Funnel's support for it.
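A rough sketch of what such a wrapper could look like, assuming the task runs under a POSIX shell: the original command runs in a subshell, its exit code is written to a file, and the wrapper itself always exits 0. The class, the method names, and the exit code file are hypothetical and are not taken from Bunny's source.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the class, method, and file names are illustrative and
// are not taken from Bunny's source.
final class ExitCodeWrapper {

    private ExitCodeWrapper() {}

    /** Wraps a shell command so the wrapper exits 0 and the real exit code lands in exitCodeFile. */
    static List<String> wrap(String command, String exitCodeFile) {
        // The subshell keeps an `exit` inside the command from ending the wrapper early.
        String script = "( " + command + " ); echo $? > " + quote(exitCodeFile) + "; exit 0";
        return Arrays.asList("/bin/sh", "-c", script);
    }

    /** Parses the stored exit code during post-processing. */
    static int parseExitCode(String fileContents) {
        return Integer.parseInt(fileContents.trim());
    }

    /** Single-quotes a value for POSIX sh, escaping embedded single quotes. */
    static String quote(String value) {
        return "'" + value.replace("'", "'\\''") + "'";
    }
}
```

After the task finishes, the engine would read the file back, pass its contents to `parseExitCode`, and compare the result against the tool's allowed codes. One trade-off of this approach is that the command line has to be rebuilt around the wrapper, so unusual command lines may need extra care with quoting.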

@kmhernan
Author

Yeah, I was using a shared FS in these tests when I noticed it, but no, the output wasn’t copied/linked over to the Bunny directory structure from what I can see.

@milos-ljubinkovic
Contributor

I've made some quick changes on this branch: https://github.com/rabix/bunny/tree/tes/exitcodes

If the app defines allowed exit codes, the exit code is saved and overridden to 0 inside TES, and then evaluated independently after execution.

I changed the way the command line is built to accommodate this, so some side effects with unusual command lines might happen.

@kmhernan
Author

Awesome @milos-ljubinkovic ... my quick peek at the source suggests that this branch also supports the newer TES spec, correct? Since I'm testing with the newer Funnel versions that use the newer TES spec, I have had to edit the source of older Rabix versions... just double-checking so I can test it with my workflow.

@milos-ljubinkovic
Contributor

It should support the latest TES spec and was tested against funnel's master branch on 10th January I think. Some issues with s3 and endpoints were reported, though.

@kmhernan
Author

great... yeah we gave up on s3 for now and testing with ceph FS... will test this today thanks

@adamstruck
Contributor

We are tracking this issue in Funnel ohsu-comp-bio/funnel#425

@kmhernan
Author

@milos-ljubinkovic it seems like I can't get around this exception with this branch:

java.lang.IllegalArgumentException: Illegal character in scheme name at index 0: {
  "appFileLocation" : "/mnt/cephfs/cwls/jeremiah/gdc-dnaseq-cwl/workflows/dnaseq/metrics.cwl",
  "successCodes" : [ ],
  "cwlVersion" : "v1.0",
  "inputs" : [ {
    "id" : "bam",
    "type" : "File",
    "scatter" : true
  }, {
    "id" : "known_snp",
    "type" : "File"
  }, {

It happens on both local and TES backends.

@milos-ljubinkovic
Contributor

I made a quick revert on that branch of a change that had something to do with ignoring IllegalArgumentExceptions, so it might help, though I couldn't actually reproduce the issue. Could you upload your workflow or the full stack trace?

@kmhernan
Author

@milos-ljubinkovic I think that's where the issue is. Here is more of the stack trace that I can easily grep out... I'm running again with your changes right now.

java.lang.IllegalArgumentException: Illegal character in scheme name at index 0: {
	at java.net.URI.create(URI.java:852) ~[na:1.8.0_141]
	at org.rabix.bindings.cwl.resolver.CWLDocumentResolver.resolve(CWLDocumentResolver.java:100) ~[rabix-cli.jar:na]
	at org.rabix.bindings.cwl.helper.CWLJobHelper.getCWLJob(CWLJobHelper.java:20) ~[rabix-cli.jar:na]
	at org.rabix.bindings.cwl.CWLProcessor.transformInputs(CWLProcessor.java:519) ~[rabix-cli.jar:na]
	at org.rabix.bindings.cwl.CWLBindings.transformInputs(CWLBindings.java:175) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.handleTransform(JobStatusEventHandler.java:356) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.ready(JobStatusEventHandler.java:289) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.handle(JobStatusEventHandler.java:109) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.handle(JobStatusEventHandler.java:43) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.send(EventProcessorImpl.java:210) [rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.MultiEventProcessorImpl.send(MultiEventProcessorImpl.java:59) ~[rabix-cli.jar:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_141]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_141]
	at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.InputEventHandler.handle(InputEventHandler.java:99) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.InputEventHandler.handle(InputEventHandler.java:27) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.send(EventProcessorImpl.java:210) [rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.MultiEventProcessorImpl.send(MultiEventProcessorImpl.java:59) ~[rabix-cli.jar:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_141]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_141]
	at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.ScatterHandler.createScatteredJobs(ScatterHandler.java:222) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.ScatterHandler.scatterPort(ScatterHandler.java:115) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.ready(JobStatusEventHandler.java:277) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.handle(JobStatusEventHandler.java:109) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.handle(JobStatusEventHandler.java:43) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.send(EventProcessorImpl.java:210) [rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.MultiEventProcessorImpl.send(MultiEventProcessorImpl.java:59) ~[rabix-cli.jar:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_141]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_141]
	at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.InputEventHandler.handle(InputEventHandler.java:99) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.InputEventHandler.handle(InputEventHandler.java:27) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.send(EventProcessorImpl.java:210) [rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.MultiEventProcessorImpl.send(MultiEventProcessorImpl.java:59) ~[rabix-cli.jar:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_141]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_141]
	at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.OutputEventHandler.handle(OutputEventHandler.java:112) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.OutputEventHandler.handle(OutputEventHandler.java:34) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.send(EventProcessorImpl.java:210) [rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.MultiEventProcessorImpl.send(MultiEventProcessorImpl.java:59) ~[rabix-cli.jar:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_141]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_141]
	at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.OutputEventHandler.handle(OutputEventHandler.java:112) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.OutputEventHandler.handle(OutputEventHandler.java:34) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.send(EventProcessorImpl.java:210) [rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.MultiEventProcessorImpl.send(MultiEventProcessorImpl.java:59) ~[rabix-cli.jar:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_141]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_141]
	at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.handle(JobStatusEventHandler.java:160) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.handler.impl.JobStatusEventHandler.handle(JobStatusEventHandler.java:43) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.handle(EventProcessorImpl.java:175) [rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.lambda$doProcessEvent$3(EventProcessorImpl.java:108) [rabix-cli.jar:na]
	at org.rabix.engine.store.memory.InMemoryRepositoryRegistry.doInTransaction(InMemoryRepositoryRegistry.java:92) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.doProcessEvent(EventProcessorImpl.java:107) [rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.lambda$null$1(EventProcessorImpl.java:91) [rabix-cli.jar:na]
	at org.rabix.engine.metrics.impl.MetricsHelperImpl.time(MetricsHelperImpl.java:78) ~[rabix-cli.jar:na]
	at org.rabix.engine.processor.impl.EventProcessorImpl.lambda$start$2(EventProcessorImpl.java:91) [rabix-cli.jar:na]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_141]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_141]
Caused by: java.net.URISyntaxException: Illegal character in scheme name at index 0: {
	at java.net.URI$Parser.fail(URI.java:2848) ~[na:1.8.0_141]
	at java.net.URI$Parser.checkChars(URI.java:3021) ~[na:1.8.0_141]
	at java.net.URI$Parser.checkChar(URI.java:3031) ~[na:1.8.0_141]
	at java.net.URI$Parser.parse(URI.java:3047) ~[na:1.8.0_141]
	at java.net.URI.<init>(URI.java:588) ~[na:1.8.0_141]
	at java.net.URI.create(URI.java:850) ~[na:1.8.0_141]
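
The trace shows `URI.create` being handed a string that begins with `{`, which suggests (this is an inference from the message, not something confirmed in the thread) that an inlined JSON app description reached code that expected a path or URI. The sketch below reproduces the same message with plain `java.net.URI` and shows one possible guard. The class and method names are hypothetical and are not Bunny's `CWLDocumentResolver`.

```java
import java.net.URI;

// Hypothetical illustration; this is not Bunny's CWLDocumentResolver.
final class AppReferenceCheck {

    private AppReferenceCheck() {}

    /** Returns true if the reference looks like an inlined JSON document rather than a URI. */
    static boolean looksLikeInlineJson(String appReference) {
        return appReference.trim().startsWith("{");
    }

    /** Returns the message thrown by URI.create for the given input, or null if it parses. */
    static String uriErrorMessage(String appReference) {
        try {
            URI.create(appReference);
            return null;
        } catch (IllegalArgumentException e) {
            // URI.create wraps the underlying URISyntaxException and reuses its message.
            return e.getMessage();
        }
    }
}
```

Calling `uriErrorMessage` with a JSON string such as `{ "cwlVersion" : "v1.0" }` yields a message starting with "Illegal character in scheme name at index 0: {", while a plain path parses without error. Checking `looksLikeInlineJson` before calling `URI.create` would be one way to route inlined documents elsewhere.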
