Job failed with exit value 1: 'vg_indexes' kind-vg_indexes/instance-a78rdbe4 v3 #810

@Lin-Yuying

Hi there,

I recently ran the Cactus pangenome pipeline (latest version) on our cluster, where it had already run successfully on your test dataset. When I ran cactus-graphmap-join, the errors below occurred (sorry, this is super long :D ). I currently have 25 samples + 1 reference genome, each with a genome size of ~750 MB, so I suspect this is due to low memory. If it is indeed memory-related, what would you recommend I do? I have 80 samples in total.

[2022-10-15T08:28:51-0700] [MainThread] [I] [toil-rt] 2022-10-15 08:28:51.959633: Running the command: "vg gbwt -G /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gfa -o /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gbwt -g /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gg --translation /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.trans"
[2022-10-15T08:37:44-0700] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2022-10-15T09:15:09-0700] [Thread-1 ] [E] [toil.batchSystems.singleMachine] Got exit code 1 (indicating failure) from job toil_worker vg_indexes file:/Archive/Cluster/SciBorg/array0/myname/3.SV/7.Cactus/RMasker_result/jobstore kind-vg_indexes/instance-a78rdbe4.
[2022-10-15T09:15:09-0700] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'vg_indexes' kind-vg_indexes/instance-a78rdbe4 v1
Exit reason: None
[2022-10-15T09:15:09-0700] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'vg_indexes' kind-vg_indexes/instance-a78rdbe4 v2
[2022-10-15T09:15:09-0700] [MainThread] [W] [toil.leader] Log from job "kind-vg_indexes/instance-a78rdbe4" follows:
=========>
[2022-10-15T08:27:35-0700] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2022-10-15T08:27:35-0700] [MainThread] [I] [toil] Running Toil version 5.7.1-b5cae9634820d76cb6c13b2a6312895122017d54 on host host01.
[2022-10-15T08:27:35-0700] [MainThread] [I] [toil.worker] Working on job 'vg_indexes' kind-vg_indexes/instance-a78rdbe4 v1
[2022-10-15T08:27:35-0700] [MainThread] [I] [toil.worker] Loaded body Job('vg_indexes' kind-vg_indexes/instance-a78rdbe4 v1) from description 'vg_indexes' kind-vg_indexes/instance-a78rdbe4 v1
[2022-10-15T08:27:48-0700] [MainThread] [I] [cactus.shared.common] Running the command ['grep', '-v', '^H\|^P\t_MINIGRAPH
', '/home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/sample.cactus.vg.gfa']
[2022-10-15T08:27:48-0700] [MainThread] [I] [toil-rt] 2022-10-15 08:27:48.372688: Running the command: "grep -v ^H|^P MINIGRAPH /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/sample.cactus.vg.gfa"
[2022-10-15T08:28:48-0700] [MainThread] [I] [toil-rt] 2022-10-15 08:28:48.175012: Successfully ran: "grep -v ^H|^P MINIGRAPH /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/sample.cactus.vg.gfa" in 59.7824 seconds
[2022-10-15T08:28:51-0700] [MainThread] [I] [toil-rt] 2022-10-15 08:28:51.959633: Running the command: "vg gbwt -G /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gfa -o /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gbwt -g /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gg --translation /home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.trans"
[2022-10-15T09:15:06-0700] [MainThread] [W] [toil.fileStores.abstractFileStore] Failed job accessed files:
[2022-10-15T09:15:06-0700] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_to_gfa/instance-_ii_dhud/file-67f82f7bd0ef4d2fa50cb91b1040c0ad/sample.cactus.vg.gfa' to path '/home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/sample.cactus.vg.gfa'
Traceback (most recent call last):
  File "/Archive/Cluster/SciBorg/array0/myname/bin/cactus-bin-v2.2.3/cactus_env/lib/python3.7/site-packages/toil/worker.py", line 407, in workerScript
    job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
  File "/Archive/Cluster/SciBorg/array0/myname/bin/cactus-bin-v2.2.3/cactus_env/lib/python3.7/site-packages/toil/job.py", line 2406, in _runner
    returnValues = self._run(jobGraph=None, fileStore=fileStore)
  File "/Archive/Cluster/SciBorg/array0/myname/bin/cactus-bin-v2.2.3/cactus_env/lib/python3.7/site-packages/toil/job.py", line 2324, in _run
    return self.run(fileStore)
  File "/Archive/Cluster/SciBorg/array0/myname/bin/cactus-bin-v2.2.3/cactus_env/lib/python3.7/site-packages/toil/job.py", line 2547, in run
    rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
  File "/Archive/Cluster/SciBorg/array0/myname/bin/cactus-bin-v2.2.3/cactus_env/lib/python3.7/site-packages/cactus/refmap/cactus_graphmap_join.py", line 516, in vg_indexes
    cactus_call(parameters=['vg', 'gbwt', '-G', merge_gfa_path, '-o', gbwt_path, '-g', gg_path, '--translation', trans_path])
  File "/Archive/Cluster/SciBorg/array0/myname/bin/cactus-bin-v2.2.3/cactus_env/lib/python3.7/site-packages/cactus/shared/common.py", line 796, in cactus_call
    raise RuntimeError("{}Command {} signaled {}: {}".format(sigill_msg, call, signal.Signals(-process.returncode).name, out))
RuntimeError: Command ['vg', 'gbwt', '-G', '/home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gfa', '-o', '/home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gbwt', '-g', '/home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.gg', '--translation', '/home/myname/tmp/848c46ecee435c6685110464b6b52787/acb3/0bfc/tmpa7m0lj26/merged.trans'] signaled SIGSEGV: stdout=None, stderr=vg [warning]: System's vm.overcommit_memory setting is 2 (never overcommit). vg does not work well under these conditions; you may appear to run out of memory with plenty of memory left. Attempting to unsafely reconfigure jemalloc to deal better with this situation.
[2022-10-15T09:15:08-0700] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host host01
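
In case it helps with diagnosis: the vg warning above refers to the kernel's vm.overcommit_memory setting. Below is a minimal sketch for reading that setting and the commit limits on a Linux node. It is not part of Cactus, vg, or Toil; the function names are hypothetical helpers of my own, and it assumes the standard /proc/sys/vm/overcommit_memory and /proc/meminfo files are readable.

```python
from pathlib import Path

# Meaning of the three kernel modes for vm.overcommit_memory.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "never overcommit (strict accounting)",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the integer overcommit mode, or None if it cannot be read.

    Hypothetical helper for illustration only.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def read_meminfo(path="/proc/meminfo"):
    """Parse a meminfo-style file into {field: integer value}.

    Memory fields in /proc/meminfo are reported in kB.
    Hypothetical helper for illustration only.
    """
    info = {}
    try:
        text = Path(path).read_text()
    except OSError:
        return info
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


if __name__ == "__main__":
    mode = read_overcommit_mode()
    print("vm.overcommit_memory:", mode, "-", OVERCOMMIT_MODES.get(mode, "unknown"))
    mem = read_meminfo()
    # CommitLimit and Committed_AS matter most when the mode is 2.
    for field in ("MemTotal", "MemAvailable", "CommitLimit", "Committed_AS"):
        if field in mem:
            print(f"{field}: {mem[field] / 1024**2:.1f} GiB")
```

If the mode is 2, allocations can fail once Committed_AS approaches CommitLimit even when MemAvailable still looks large, which would be consistent with the warning vg printed before the SIGSEGV. I cannot confirm from this log alone that this is the actual cause.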

Here are the commands I used:

cactus-minigraph ./jobstore sample.cactus.seq sample.cactus.gfa.gz --reference refGenome
cactus-graphmap ./jobstore sample.cactus.seq sample.cactus.gfa.gz sample.cactus.paf  --reference refGenome --outputFasta sample.cactus.gfa.fa.gz
cactus-align ./jobstore sample.cactus.seq sample.cactus.paf sample.cactus.hal --pangenome --outVG --reference refGenome
cactus-graphmap-join ./jobstore --vg sample.cactus.vg --outDir ./samples --outName sample.cactus --reference refGenome --vcf --giraffe --gfaffix  --wlineSep "." 

Do you know what happened? Thanks in advance!

Y. L.
