Add support for Fusion file system to Slurm and LSF executors #3516
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Looks good so far 👍
Think the Fusion support is ready to go. Main changes:
I tested and it's really nice, it's magic! 😄
The only thing that was not intuitive to me is the need to explicitly set fusion.exportAwsAccessKeys = true to make it work.
I'm thinking that with the upcoming support for more data stores it will become a bit of a mess to always configure this explicitly. Maybe we can have a default way of sharing credentials (using standard env vars) that works out of the box, so that you only need to tune your configuration when you use a different approach.
I agree that's not great, but how can we handle it better? In principle, storage authentication should be delegated to the corresponding cloud infra. When using Fusion on-prem, I'm expecting the use of S3-compatible storage such as Minio or Ceph. I don't think the use of cloud-hosted object storage is a real use case in this scenario.
MinIO and CEPH also use
Included in version 23.02.0-edge
So great to see this, thank you all! Question for you @pditommaso: Another approach to supporting Fusion in grid-based executors (#3205) supported a wider array of grid-based executors. I am at an institution with a Condor-based HTC setup. Does this PR allow the use of Fusion in any executors beyond Slurm and LSF? If not, what limits it to those two executors, and what work would need to be done (by me) to allow the use of Fusion with Condor? If it matters, I intend to use it with our on-premises Minio-based S3-compatible storage.
Currently, it is not supported, but in principle it should be easy to add. To enable this capability, Nextflow pipes the launch command to the submit command's stdin instead of creating a wrapper script. For example, when using Slurm, instead of running
it does
Does Condor allow the same?
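The mechanism described above can be sketched in Python. This is only an illustration, not Nextflow's implementation: `submit_via_stdin` is a hypothetical helper, and a child Python interpreter stands in for the scheduler's submit command so the example runs without a cluster. With Slurm, the same helper could be given `["sbatch"]`, since `sbatch` reads the job script from standard input when no script file is named.

```python
import subprocess
import sys

# Stand-in for a scheduler submit command (assumption for this sketch):
# a child Python interpreter that runs whatever script arrives on stdin.
# On a Slurm cluster this could be ["sbatch"] instead.
SUBMIT_CMD = [sys.executable, "-"]


def submit_via_stdin(submit_cmd, launch_script):
    """Send the launch script to the submit command's stdin.

    No wrapper script is written to disk; the script text travels
    through the pipe only.
    """
    result = subprocess.run(
        submit_cmd,
        input=launch_script,   # becomes the child's standard input
        capture_output=True,
        text=True,
        check=True,            # raise if the submit command fails
    )
    return result.stdout.strip()


# Hypothetical launch script; a real one would start the task.
launch_script = 'print("task launched")\n'
output = submit_via_stdin(SUBMIT_CMD, launch_script)
print(output)
```

Whether a given scheduler fits this pattern comes down to whether its submit command accepts the job script on standard input, which is the question raised here for Condor.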
According to my research it does support stdin. Also some notes here that may be relevant to Condor support.
I guess so, but somebody needs to take care of implementing and validating it.
@JosephLalli if I draft a Fusion / Condor PR, can you test it in your environment? We don't have a Condor environment, which makes it hard to develop new features for it. As long as you can run Nextflow, you should be able to build and test it from a branch.
Absolutely @bentsherman.
I can’t promise any kind of automated testing setup, but I can absolutely run whatever you’d like here.
University of Wisconsin CHTC leadership is very eager to get Nextflow working on Condor, and my work has become their test case. I think Fusion paired with our local s3 storage setup should solve a lot of the headaches that the Condor folks have had in the past.
Excellent, I will tag you when we have a PR for it. We'll just need you to run a small pipeline (e.g. I also don't know if Fusion works with S3-compatible storage on HPC yet, or at least it hasn't been tested. So we might be waiting for that support to arrive.
Yes, it does. This is an example config
…ow-io#3516) This commit adds the support for Fusion file system to Slurm, LSF and Grid Engine batch schedulers Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
This PR adds support for Fusion file system to Slurm and LSF grid executors.
The PR implements the following changes:
- Use of the NXF_CHDIR variable as the common pattern to specify the job work directory, instead of relying on grid-specific directives
- Use of the stdin special file, instead of creating a temporary launcher file

For production usage, this feature requires #3513 or #3514.
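As a rough sketch of how the two changes fit together, the Python example below pipes a launcher through stdin and lets the launcher itself switch to the directory named by `NXF_CHDIR`. Only the variable name comes from this PR. Passing it through the process environment, the `submit_with_chdir` helper, the `LAUNCHER` text, and the child Python interpreter used as a stand-in submit command are all assumptions made for illustration. How Nextflow actually sets the variable is not shown in this thread.

```python
import os
import subprocess
import sys
import tempfile

# Launcher piped to the stand-in submit command. It changes into the
# directory named by NXF_CHDIR before doing any work, then reports it.
LAUNCHER = (
    "import os\n"
    "os.chdir(os.environ['NXF_CHDIR'])\n"
    "print(os.getcwd())\n"
)


def submit_with_chdir(work_dir):
    """Pipe LAUNCHER via stdin with NXF_CHDIR pointing at work_dir."""
    env = dict(os.environ, NXF_CHDIR=work_dir)  # assumption: set via env
    result = subprocess.run(
        [sys.executable, "-"],  # stand-in for the scheduler submit command
        input=LAUNCHER,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as work_dir:
    reported = submit_with_chdir(work_dir)
    # Compare resolved paths, since temp dirs may sit behind symlinks.
    same_dir = os.path.realpath(reported) == os.path.realpath(work_dir)
    print(same_dir)
```

Because the launcher resolves its own working directory, no scheduler-specific option for the work directory is needed on the submit command.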