s3 download retry #1164
Conversation
Thanks for this pull request. One thing I don't completely like is that the retry is triggered for any error condition, for example when the file does not exist, but I guess the exit code of awscli does not make that kind of distinction.
Anyhow, I would propose to keep this as an opt-in feature controlled by the number of attempts, which should be > 0 (default: 0).
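To make the proposal concrete, here is a minimal sketch, in the spirit of the S3Helper changes but not the actual merged code, of such an opt-in wrapper: the generated download command is wrapped in a Bash retry loop only when the configured number of attempts is greater than zero. The names `retryWrapper`, `maxAttempts`, and `delaySecs` are illustrative, not the merged option names.

```groovy
// Hedged sketch: wrap a download command in a Bash retry loop.
// Returns the command unchanged when retries are not enabled (opt-in).
static String retryWrapper(String cmd, int maxAttempts, int delaySecs) {
    if( maxAttempts <= 0 )
        return cmd
    """\
    |attempt=1
    |until $cmd; do
    |  if [ \$attempt -ge $maxAttempts ]; then
    |    echo "Download failed after $maxAttempts attempts: $cmd" >&2
    |    exit 1
    |  fi
    |  attempt=\$((attempt+1))
    |  sleep $delaySecs
    |done
    """.stripMargin().trim()
}
```

A real implementation would probably also want to distinguish retryable failures from permanent ones (e.g. a missing file), which, as noted above, the awscli exit code alone does not allow.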
I've made a minor comment. If you sign off the DCO, I think it could be merged.
The PR looks OK, thanks a lot. To proceed with the merge I need a commit with your sign-off.
When input files are copied into the task work dir, they should not be mounted in the container environment. Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
…-io#996 This commit enables the process echo when ANSI logging is enabled, which was turned off in a previous commit. It also redirects the console output to the ANSI renderer for the `print/ln` and `view` operators when ANSI logging is enabled. Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
The unit used by the LSF batch scheduler for memory limits depends on the LSF_UNIT_FOR_LIMITS attribute defined in the lsf.conf file. This commit adds to the LSF executor the ability to fetch this setting and apply it properly when a job is submitted for execution. Moreover, it fixes the default memory unit to KB (instead of MB) when the above setting is not defined. Solves nextflow-io#1124. See also PR nextflow-io#1035. Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
This commit disables the rich ANSI logging feature when using the `kuberun` command, since the output is not rendered correctly. This is likely due to wrong encoding of the ANSI escape codes when they are written/read over the HTTP API. Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
This commit implements the deletion of the K8s configMap created by the `kuberun` command to propagate the Nextflow configuration to the K8s cluster. Deletion takes place only when the pod terminates successfully. This policy can be overridden with the `k8s.cleanup=false|true` option. Bonus: the `kuberun` command now returns the exit status of the Nextflow driver pod. Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
…#1163 Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
…w-io#1163 Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
Thanks. One last thing: could you please add an entry in the documentation regarding the new setting options? See config.rst, line 512.
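For what it's worth, a minimal sketch of such a documentation entry, assuming the option names `aws.batch.maxTransferAttempts` and `aws.batch.delayBetweenAttempts` suggested by the commits in this PR (the final names and defaults are whatever landed in config.rst):

```groovy
// nextflow.config: hedged sketch, not the official documentation entry.
// Both option names and the defaults in the comments are assumptions.
aws {
    batch {
        maxTransferAttempts  = 3         // retry S3 downloads up to 3 times (assumed default: 1, i.e. no retry)
        delayBetweenAttempts = '10 sec'  // pause between attempts, expressed as a duration string
    }
}
```

Expressing the delay as a duration string matches the "Use duration unit" commit recorded below.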
OK, merged on testing branch. Thanks a lot for your contribution!
Use duration unit to define S3 delayBetweenAttemps config options.
Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
Hi Paolo,
Since commit 26d3755 didn't solve our problem (#1107), I implemented a retry mechanism for downloading files from S3. We tested it with 200 TN samples (15,536 processes). This way we are able to use the 'optimal' instance type setting for our compute environment and keep a single queue for all processes. What we also found really beneficial in this setup is that the wait time for instances is shorter and the cost is lower, since we are not limiting the AWS Batch scheduler to just a few instance types. If you think this is useful, I can also add a configurable number of retries or any cleanup you think is needed.
Best,