SLURM submit error #67

stubrownGal · 2018-09-27T13:43:37Z

I am trying to run MaSuRCA 3.2.8 on our new SLURM-based cluster. I am assembling 40 Gb of Nanopore reads plus Illumina PE and Jump libraries. MaSuRCA ran for a few hours and completed the read correction phase, but failed when trying to submit mega-reads pass 1 jobs to the grid.

The stderr output is below. There is no grid-specific information in the assemble.sh script itself, but it calls another script, masurca/3.2.8/bin//mega_reads_assemble_cluster.sh, which contains grid commands that are wrong for our system (it is set up for an SGE grid, not SLURM; see the second section below). I did include some SLURM information in config.txt.

How do I correctly set up MaSuRCA for our SLURM-based cluster?

[browns02@bigpurple-ln1 Masurca]$ ./assemble.sh
[Tue Sep 25 15:44:38 EDT 2018] Processing pe library reads
[Tue Sep 25 15:52:49 EDT 2018] Processing sj library reads
[Tue Sep 25 15:59:39 EDT 2018] Average PE read length 101
[Tue Sep 25 15:59:40 EDT 2018] Using kmer size of 67 for the graph
[Tue Sep 25 15:59:40 EDT 2018] MIN_Q_CHAR: 33
[Tue Sep 25 15:59:40 EDT 2018] Creating mer database for Quorum
[Tue Sep 25 16:22:05 EDT 2018] Error correct PE
[Tue Sep 25 16:48:36 EDT 2018] Error correct JUMP
[Tue Sep 25 17:07:41 EDT 2018] Estimating genome size
[Tue Sep 25 17:21:07 EDT 2018] Estimated genome size: 2360747251
[Tue Sep 25 17:21:07 EDT 2018] Creating k-unitigs with k=67
[Tue Sep 25 17:56:30 EDT 2018] Creating k-unitigs with k=31
[Tue Sep 25 18:47:29 EDT 2018] Filtering mate pairs
Assuming outtie orientation
Assuming outtie orientation
Chimeric/Redundant jump reads:
52378114 chimeric_sj.txt
38148468 redundant_sj.txt
90526582 total
[Tue Sep 25 23:57:14 EDT 2018] Creating FRG files
[Wed Sep 26 00:20:47 EDT 2018] Computing super reads from PE
[Wed Sep 26 01:43:09 EDT 2018] Using CABOG from /gpfs/share/apps/masurca/3.2.8/bin/../CA8/Linux-amd64/bin
[Wed Sep 26 01:43:09 EDT 2018] Running mega-reads correction/assembly
[Wed Sep 26 01:43:09 EDT 2018] Using mer size 15 for mapping, B=15, d=0.02
[Wed Sep 26 01:43:09 EDT 2018] Estimated Genome Size 2360747251
[Wed Sep 26 01:43:09 EDT 2018] Estimated Ploidy 1
[Wed Sep 26 01:43:09 EDT 2018] Using 32 threads
[Wed Sep 26 01:43:09 EDT 2018] Output prefix mr.41.15.15.0.02
[Wed Sep 26 01:43:09 EDT 2018] Using 25x of the longest ONT reads
[Wed Sep 26 01:46:29 EDT 2018] Reducing super-read k-mer size
[Wed Sep 26 02:21:04 EDT 2018] Mega-reads pass 1
[Wed Sep 26 02:21:04 EDT 2018] Running on the grid in 89 batches
[Wed Sep 26 02:29:01 EDT 2018] submitting SGE create_mega_reads jobs to the grid
[Wed Sep 26 02:29:01 EDT 2018] create_mega_reads failed on the grid
[Wed Sep 26 02:29:01 EDT 2018] mega-reads pass 1 on the grid failed or stopped, please re-run assemble.sh
[Wed Sep 26 02:29:01 EDT 2018] Assembly stopped or failed, see CA.mr.41.15.15.0.02.log

#this is the batch size for grid execution
PBATCH_SIZE=2000000000
GRID_ENGINE="SGE"
QUEUE=""
USE_SGE=0
PACBIO=""
NANOPORE=""
ONEPASS=0
GC=
RC=
NC=
if tty -s < /dev/fd/1 2> /dev/null; then
GC='\e[0;32m'
RC='\e[0;31m'
NC='\e[0m'
fi

zrlewis · 2018-10-22T18:06:54Z

@stubrownGal Any luck getting this running on SLURM? I just ran into a similar problem.

stubrownGal · 2018-10-22T20:42:42Z

I have made some more progress, but I can't say exactly what was changed. Basically, there are bits of SGE-specific code in many places. I tracked some of them down and got it to run through the SuperReads step, but it dies on the megaReads step. I found this code for create_mega_reads.sh in mr_pass1, the last output folder that was created. It is full of SGE-specific code that will not work on my SLURM system.

[browns02@bigpurple-ln4 mr_pass1]$ more create_mega_reads.sh
#!/bin/sh
if [ ! -e mr.batch$SGE_TASK_ID.success ];then
/gpfs/share/apps/masurca/3.2.8/bin/create_mega_reads -s 5967832386 -m 15 --psa-min 12 --stretch-cap 10000 -k 41 -u ../guillaumeKUnitigsAtLeast32bases_all.41.fasta -t 32 -B 15 --max-count 5000 -d 0.02 -r ../superReadSequences.named.fasta -p lr.batch$SGE_TASK_ID -o mr.batch$SGE_TASK_ID.tmp && mv mr.batch$SGE_TASK_ID.tmp mr.batch$SGE_TASK_ID.txt && touch mr.batch$SGE_TASK_ID.success
else
echo "job $SGE_TASK_ID previously completed successfully"
fi

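
For readers adapting this by hand: SLURM array jobs expose the task index as SLURM_ARRAY_TASK_ID, which plays the role that SGE_TASK_ID plays in the script above. The following is a minimal sketch, not MaSuRCA's own code, of a wrapper that reads whichever variable is set and keeps the same skip-if-successful logic. The function names batch_id and run_batch are hypothetical, and the command to run is passed in as arguments rather than hard-coded.

```shell
#!/bin/sh
# Sketch only (not part of MaSuRCA). batch_id and run_batch are hypothetical names.

# Print the array index from whichever scheduler launched the job.
# SLURM sets SLURM_ARRAY_TASK_ID for `sbatch --array` jobs; SGE sets SGE_TASK_ID.
batch_id() {
  echo "${SLURM_ARRAY_TASK_ID:-${SGE_TASK_ID:-}}"
}

# Run one batch unless its .success marker already exists.
# $1 is the batch id; the remaining arguments are the command to run.
run_batch() {
  id=$1
  shift
  if [ -e "mr.batch$id.success" ]; then
    echo "job $id previously completed successfully"
    return 0
  fi
  # Only mark the batch as done if the command succeeded.
  "$@" && touch "mr.batch$id.success"
}
```

Inside a job submitted with sbatch --array, this might be called as run_batch "$(batch_id)" followed by the create_mega_reads command line shown above. The rename of the .tmp output to .txt from the original script would have to be part of the command that is passed in.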

zrlewis · 2018-10-23T14:41:23Z

@stubrownGal Okay. It looks like SLURM support might not be fully integrated yet, judging from the mega_reads_assemble_cluster.sh script. I'll try running it locally, setting USE_GRID=0
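
As a sketch of that workaround, assuming USE_GRID is set in config.txt on a line of its own in the form USE_GRID=<value>: the helper below is a hypothetical convenience, not part of MaSuRCA, that rewrites an existing USE_GRID line to 0. It leaves every other line alone and changes nothing if no such line exists.

```shell
#!/bin/sh
# Sketch only (not part of MaSuRCA). set_use_grid_off is a hypothetical name.

# Rewrite any existing "USE_GRID=<value>" line in the given config file to
# "USE_GRID=0". Writes to a temporary file first so a failed sed does not
# truncate the original.
set_use_grid_off() {
  cfg=$1
  sed 's/^\([[:space:]]*USE_GRID\)[[:space:]]*=.*/\1=0/' "$cfg" > "$cfg.tmp" &&
    mv "$cfg.tmp" "$cfg"
}
```

After a change like this, assemble.sh would need to be regenerated from the edited config.txt before the run is started again.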

zrlewis · 2018-11-20T14:36:57Z

@alekseyzimin Any status update for SLURM support in v 3.2.9? Running mega-reads locally on a large genome is very slow in my case.

sunnycqcn · 2018-11-20T14:49:38Z

Yes. It took me about two months with 120 CPUs.


-- Fuyou Fu, Ph.D. Department of Botany and Plant Pathology Purdue University USA

alekseyzimin · 2018-11-20T16:01:15Z

Still working on SLURM support. Currently only SGE is supported. MaSuRCA has two parts: correction and assembly. Assembly runs on SLURM already, correction does not yet.


-- Dr. Alexey V. Zimin Associate Research Scientist Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA (301)-437-6260 http://www.genome.umd.edu http://masurca.blogspot.com

stubrownGal · 2018-11-22T13:37:48Z

Thanks for the update. I finally got it to run to completion without grid support, using 32 threads on our large-memory machine.

Stuart M. Brown, Ph.D. Center for Health Informatics and Bioinformatics New York University School of Medicine


alekseyzimin · 2018-11-26T14:48:47Z

Great to hear that!


zrlewis · 2018-11-27T18:16:01Z

@alekseyzimin Thanks for the update. Keep us posted on SLURM support. I'm limited to nodes with 20 CPUs, so I wouldn't be able to take the approach that @sunnycqcn did, unfortunately.

alekseyzimin · 2018-11-29T17:40:52Z

Hi,

I have a beta version with SLURM support. You run the main script on a single multi-threaded, high-memory machine. When it is time for the expensive steps, such as create_mega_reads or the overlapper in assembly, the code prepares the batch jobs, prints out the command to submit (sbatch.....), and exits. You then submit the jobs and, once they are all done, re-run assemble.sh. If a few jobs failed, it will stop again and ask you to re-submit them. Successful jobs will not have to be re-run. I am working on a better way to implement this by setting up dependencies.

You can get the 3.3.0b version with SLURM support here: https://github.com/alekseyzimin/masurca

Best,
Aleksey
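
The submit, wait, re-run cycle described above depends on knowing which batch jobs have finished. As a rough sketch, assuming the beta still writes mr.batch<ID>.success marker files like the 3.2.8 create_mega_reads.sh quoted earlier in this thread (an assumption, not confirmed here), the hypothetical helper below lists the batch numbers that have no marker yet.

```shell
#!/bin/sh
# Sketch only (not part of MaSuRCA). pending_batches is a hypothetical name,
# and the mr.batch<ID>.success naming is assumed from the 3.2.8 script.

# Print the ids in 1..N that have no success marker in the given directory.
# $1 is the number of batches; $2 is the directory (defaults to the current one).
pending_batches() {
  n=$1
  dir=${2:-.}
  i=1
  while [ "$i" -le "$n" ]; do
    [ -e "$dir/mr.batch$i.success" ] || echo "$i"
    i=$((i + 1))
  done
}
```

For the 89 batches reported in the original log, pending_batches 89 mr_pass1 would print the batches still outstanding, one per line. Empty output would suggest that every batch finished and assemble.sh can be re-run.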


zrlewis · 2018-11-29T18:01:57Z

@alekseyzimin Fantastic! I look forward to trying it out.

If I have already started a run, which has been having trouble getting past mega-reads, then do I need to restart or can I resume with the new version?

alekseyzimin · 2018-11-29T18:24:38Z

Hi, You can simply restart it with the new version.

