jbang exceptions while running on cloud #45
Comments
Hi Abhinav, collaboration sounds great. We would love to improve the pipeline so that it runs smoothly on Azure.
Thanks @seppinho Yes, I'd appreciate some thoughts on the overall manner we get this done. As you might have noticed in the PR #46
However, these changes mean that any non-containerized (in this case docker) workload would not find the expected JARs there. As the pipeline isn't officially supporting
P.S. I have also explored the approach of staging the
Thanks for the detailed explanation. Right, conda is currently not supported, so the approach sounds good to me. Really appreciate your work.
Thanks! #46 has now been merged.
Hi genepi team 👋
Thanks for the neat pipeline!
I was able to run the pipeline successfully on my local machine using the -profile test,docker profiles. However, it seems that jbang's native behavior of downloading JARs on the fly might not be well suited to a cloud environment.
Issue encountered
When I tried to run the pipeline in a cloud setting (both Azure Batch and AWS Batch) via the Nextflow CLI, adding the cloud-specific configs (azure.config) and invoking the pipeline, I kept running into jbang exceptions in the initial caching process.
I suspect that this has to do with how jbang relies on downloading JAR dependencies into a local lib folder, which is not available to tasks on other nodes (or in other instances of the container) in a multi-node setting.
Suggestions
Allow me to share a couple of suggestions which might be worth considering.
1. Use a compiled shadow JAR/uber JAR (a single JAR with all dependencies baked in), so that the lib folder no longer needs to be available to downstream processes that rely on these cached JAR files.
2. Alternatively, and perhaps with less effort, bake the compiled JAR files into the container itself, since the tool is already available there. This would ensure that the dependencies (i.e. the lib folder) as well as the Regenie* JARs are all available within container instances across different nodes.
Collaboration
I'd be happy to test the pipeline on the cloud and to discuss any changes which might be necessary for making the pipeline optimal (hardware config etc) for the cloud setting.
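To check the suspicion described above on a given node or container instance, one could inspect whether the dependencies a JAR declares are actually present beside it. The following is a minimal Python sketch, not part of the pipeline. It assumes the JARs list their dependencies through a relative Class-Path entry in the manifest (as a JAR shipped alongside a lib folder typically would); whether the cached JARs here are built that way is an assumption, and the function name is hypothetical.

```python
import sys
import zipfile
from pathlib import Path


def missing_classpath_entries(jar_path):
    """Return manifest Class-Path entries that do not exist next to the JAR."""
    jar = Path(jar_path)
    with zipfile.ZipFile(jar) as zf:
        try:
            manifest = zf.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return []  # no manifest, so no dependencies are declared

    # Long manifest values wrap onto continuation lines starting with one space.
    unfolded = manifest.replace("\r\n", "\n").replace("\n ", "")

    entries = []
    for line in unfolded.split("\n"):
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "class-path":
            entries = value.split()  # entries are space-separated

    # Relative entries are resolved against the directory holding the JAR.
    return [entry for entry in entries if not (jar.parent / entry).exists()]


if __name__ == "__main__":
    # Usage (hypothetical file name): python check_jars.py /path/to/Some.jar
    for arg in sys.argv[1:]:
        print(arg, missing_classpath_entries(arg))
```

An empty list means every declared dependency was found relative to the JAR. A non-empty list on a worker node, where the same check passes locally, would be consistent with the lib folder not being staged to that node. With an uber JAR, there would be no external entries to resolve in the first place.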