Task batching #3909
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
81f7cb7 to 8a43489
Link to the GNU parallel page at the NCI HPC centre (Canberra, AUS): https://opus.nci.org.au/display/Help/GNU%20parallel
Had a first quick look. Good! Slick design and implementation that builds on top of the job array effort. Will review in greater detail after finalising the Job Array work package.
I had a brief scare that arrays might not be supported in Bash 3.x, which is very old but still the default version on Macs. My memory was only half right: Bash 3.x does support indexed arrays (the ones used here in TaskArrayCollector), but it does not support associative arrays (maps).
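To make the distinction concrete, here is a minimal sketch of the two array kinds. The variable names are made up for illustration and are not taken from TaskArrayCollector; the version guard assumes that `declare -A` requires Bash 4.0 or later.

```shell
#!/usr/bin/env bash
# Illustrative only: names below are hypothetical, not from the PR.

# Indexed array: available in Bash 3.x and later.
tasks=("align" "sort" "index")
count=${#tasks[@]}      # number of elements
second=${tasks[1]}      # indices are zero-based

# Iterate by index, which also works in Bash 3.x.
for i in "${!tasks[@]}"; do
    echo "task $i: ${tasks[$i]}"
done

# Associative array (map): requires Bash 4.0+, so guard on the major version.
if (( BASH_VERSINFO[0] >= 4 )); then
    declare -A exit_codes
    exit_codes[align]=0
    echo "align exited with ${exit_codes[align]}"
else
    echo "associative arrays are not available in Bash ${BASH_VERSION}"
fi
```

A script that sticks to the first form should run on the stock macOS Bash; anything that needs a map has to fall back to parallel indexed arrays or a newer shell.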
I dreamt last night that this had been merged
Good morning Europe! 😄
This PR implements task grouping as described in #2527. The implementation is inspired by #3905.
Each task processor has a "task group collector" that collects tasks and submits them as groups. The task group itself is just a task run, but with a special wrapper script that executes each child task sequentially. See the docs for details. Basically, a task group should behave exactly like a task array, except the child tasks run sequentially on the same node.
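As a rough illustration of the idea (not the wrapper script in this PR), a group launcher that runs each child task one after another could look like the sketch below. The function name `run_task_group`, the per-child `task.sh` script name, and the choice to keep going after a failed child are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run_task_group and task.sh are illustrative names,
# not part of Nextflow or of this PR.

# Run the child task in each given work directory, one after another.
# Assumption: a failed child does not stop the remaining children;
# the group reports failure if any child failed.
run_task_group() {
    local status=0
    local dir
    for dir in "$@"; do
        # Subshell keeps each child's working directory and variables isolated.
        ( cd "$dir" && bash ./task.sh ) || status=1
    done
    return "$status"
}

# Usage: three throwaway child tasks, each in its own temporary work directory.
workroot=$(mktemp -d)
children=()
for name in a b c; do
    mkdir -p "$workroot/$name"
    printf 'echo "%s" > result.txt\n' "$name" > "$workroot/$name/task.sh"
    children+=("$workroot/$name")
done

run_task_group "${children[@]}"
group_status=$?
echo "group exit status: $group_status"
```

Running the children in sequence inside one launcher is what distinguishes this from a task array, where the scheduler would dispatch each child as a separate array element.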
So far it works on the local executor; I haven't tested anything beyond that. For now this PR is just a proof of concept to help us think through task arrays vs. task groups.