
[elan] Deploy Elan VM before April downtime #41

Closed
SamStudio8 opened this issue Mar 29, 2021 · 8 comments

Comments

@SamStudio8

No description provided.

@SamStudio8

RP has provisioned a machine to use during the downtime period. I'll work on packaging Elan up a bit better to deploy it there before next Thursday.

@SamStudio8

Will also need to sort out mqtt (#22) to ensure Asklepian runs.


SamStudio8 commented Mar 31, 2021

  • Elan removed from head node crontab
  • Elan added to Elan node crontab
  • go-full-elan script updated to parameterise the Nextflow configuration SamStudio8/elan-nextflow@3163e95
  • Nextflow temporarily pointed at a new configuration using 120 of the node's 128 cores with the local executor, i.e. no SLURM (see the sketch at the end of this comment)
  • Tested Ocarina -- we can reach Majora with OAuth credentials
  • mqtt-message script updated to allow the host to be overridden SamStudio8/elan-nextflow@2e922fc
  • go-full-elan script updated to parameterise the MQTT host SamStudio8/elan-nextflow@4d59dd8
  • Tested MQTT message sending

Committed to production -- tomorrow's Elan run should use the new node
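
For posterity, the no-SLURM setup is nothing exotic; here's a minimal sketch of what the parameterised launch looks like. The file name elan-local.config, the ELAN_CONFIG variable, and the elan.nf entry point are assumptions for illustration, not what go-full-elan actually contains; the config directives themselves are standard Nextflow.

```sh
# Sketch only: pick a node-specific Nextflow config at launch time
# rather than hard-coding it in the pipeline. The assumed file
# elan-local.config would hold plain Nextflow configuration, e.g.
#
#   process.executor = 'local'   # no SLURM on the Elan node
#   executor.cpus    = 120       # cap at 120 of the node's 128 cores
#
ELAN_CONFIG="${ELAN_CONFIG:-elan-local.config}"   # overridable, assumed name
nextflow run elan.nf -c "$ELAN_CONFIG" -resume
```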

@SamStudio8

Absolutely. Blazing.

[b5/f5d0c3] process > save_manifest            [100%] 1 of 1 ✔
[8c/e75c97] process > resolve_uploads          [100%] 1 of 1 ✔
[a5/5d8a66] process > samtools_quickcheck      [100%] 7122 of 7122 ✔
[0a/057692] process > fasta_quickcheck         [100%] 7122 of 7122 ✔
[8b/7e1ba3] process > save_uploads             [100%] 7122 of 7122, failed: 473 ✔
[45/19d2ca] process > rehead_bam               [100%] 6649 of 6649 ✔
[3f/ad37b5] process > samtools_filter_and_sort [100%] 6649 of 6649 ✔
[db/65c5e6] process > samtools_index           [100%] 6649 of 6649 ✔
[a6/17a025] process > samtools_depth           [100%] 6649 of 6649 ✔
[0e/570933] process > rehead_fasta             [100%] 6649 of 6649 ✔
[a0/56acd0] process > swell                    [100%] 6649 of 6649 ✔
[31/85558d] process > post_swell               [100%] 6649 of 6649 ✔
[b9/ea6754] process > ocarina_ls               [100%] 6649 of 6649 ✔
Completed at: 01-Apr-2021 13:12:47
Duration    : 4h 17m 37s
CPU hours   : 477.0 (0% failed)
Succeeded   : 74'087
Ignored     : 473
Failed      : 473

🔥

@SamStudio8

RP says the connection to Majora will be a little slower from this node, but we're able to blow twice as many Ocarinas at the post-Elan step. Publishing and MQTT still to come.
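
To spell that out: each request to Majora is a little slower, but throughput holds because twice as many requests run at once. A purely illustrative sketch of that trade follows; the real fan-out is orchestrated by Nextflow, and ocarina_put.sh, samples.ls, and the counts are made up for the example.

```sh
# Illustrative only -- not the actual pipeline plumbing.
# If each Ocarina call is slower from this node, doubling the number
# of concurrent calls keeps overall throughput up.
xargs -P 16 -n 1 ./ocarina_put.sh < samples.ls   # 16 vs 8 are example values
```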

@SamStudio8

Forgot that part of the post-Elan publish step is submitted to SLURM, but I've parameterised the publish mode and committed that change to Elan (SamStudio8/elan-nextflow@c4f9dcc). After a little conda faff (is it bioinformatics without it?), everything finished up in record time and emitted a message to MQTT without trouble. Will keep an eye out tomorrow morning, but I'm happy.
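
For reference, the host override from 2e922fc boils down to something like the sketch below, shown here with the stock mosquitto client. MQTT_HOST, the topic, and the payload are placeholders for illustration, not the real Asklepian trigger.

```sh
# Sketch of an overridable MQTT host in the spirit of the mqtt-message
# change above; all names here are placeholders.
MQTT_HOST="${MQTT_HOST:-localhost}"
mosquitto_pub -h "$MQTT_HOST" -t 'elan/status' -m '{"status": "finished"}'
```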

@SamStudio8

Encountered an issue early this morning caused by Nextflow exceeding the thread pool limit when resuming tasks with the local executor. This has been reported before (nextflow-io/nextflow#1871), and the proposed fix of adding -Dnxf.pool.type=sync to NXF_OPTS has been deployed and seems to be working.
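
Concretely, the deployed mitigation from that thread is a launch-side environment variable; no pipeline changes needed (elan.nf stands in for the real entry point):

```sh
# Workaround from nextflow-io/nextflow#1871: use the synchronous thread
# pool so a large -resume with the local executor doesn't exhaust the
# default pool.
export NXF_OPTS="-Dnxf.pool.type=sync"
nextflow run elan.nf -resume
```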


SamStudio8 commented Apr 5, 2021

Additionally, the swell step appears to have segfaulted for a very small number of samples (n=3), which seems to be related to a resource limit causing numpy's initialisation to fail, killing swell with it. RP has increased the process and open-file limits and we'll keep an eye on tomorrow's run.

Error below for posterity:

OpenBLAS blas_thread_init: pthread_create failed for thread 63 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 65535 max

Update: This error has gone away now. We might want to lower OPENBLAS_NUM_THREADS from the default of 64 anyway, but it isn't urgent.
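
If we do get around to it, that's a one-line environment change; the value below is an arbitrary example, and ulimit -u shows the RLIMIT_NPROC ceiling OpenBLAS was complaining about.

```sh
# Cap OpenBLAS's thread pool so numpy initialisation doesn't try to
# spawn one thread per core (64 here); 8 is an arbitrary example value.
export OPENBLAS_NUM_THREADS=8

# Check the per-user process limit (RLIMIT_NPROC) that was being hit:
ulimit -u
```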
