problem with longreads_only assembly #59
Hi @Guy2Horev,
I think your problem might be related to memory and to the fact that you have HiFi reads. I will first suggest a command line using the current released version. For that, here is some advice and a few questions:
Testing case so I can also try
The skip parameters and assemblers to try
Because you have hifi
For that, in version
Modifying memory for bigger genomes
One can modify it by passing in a custom config. For example, with a custom config like this:
process {
    withLabel:process_assembly {
        cpus = 20
        memory = '40 GB'
        time = '72 h'
    }
    // Quast sometimes can take too long
    withName:quast {
        cpus = 10
        memory = '20 GB'
        time = '72 h'
    }
}
With this config, every assembly step is allowed to run for up to 72 hours, using 40 GB of memory and 20 CPUs. The same idea applies to Quast. You can adjust these values to whatever you think is feasible. If you do not use this config, that is fine: on the second attempt, the pipeline will try to max out the execution within the limits you have allowed it to use.
Finally the command line
nextflow \
run fmalmeida/mpgap \
-r v3.1.4 -latest \
--output _ASSEMBLY \
--max_cpus 20 \
--skip_wtdbg2 \
--skip_unicycler \
--genome_size 800m \
--corrected_long_reads \
--input MPGAP_samplesheet1.yml \
-profile docker \
-c custom_config_for_resources.config # optional
Please let me know how it goes, because then we can properly assess whether there is a bug, or whether we should, for example, try the current dev branch, which already has some bug fixes and where I started to test some parameters.
Let me know if you have a public dataset similar to yours that I can try.
Best,
Hi Felipe, Thank you very much for the detailed response.
I am trying to run only canu to check if it works.
Hi @Guy2Horev,
The error message still complains about memory. Please let me know so we can get it to work and also, if necessary, properly test the dev branch mentioned earlier before releasing it 😄
Also, when increasing memory, it is good to keep the number of CPUs stable or low, so that you have more memory per thread.
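The memory-per-thread point can be checked with quick arithmetic. A minimal sketch in shell, assuming the values from the example config earlier in the thread (40 GB, 20 CPUs); `mem_per_cpu` is a hypothetical helper for illustration, not part of MpGAP or Nextflow:

```shell
#!/usr/bin/env bash
# Hypothetical helper: whole GB of memory available per CPU thread.
# $1 = total memory in GB, $2 = number of CPUs (integer division).
mem_per_cpu() {
  echo $(( $1 / $2 ))
}

mem_per_cpu 40 20   # values from the example config
mem_per_cpu 40 10   # same memory, half the CPUs
mem_per_cpu 80 20   # double the memory, same CPUs
```

Halving the CPUs or doubling the memory both double the memory available to each thread, which is why raising memory while also raising CPUs may not help a memory-bound assembler.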
I added some new parameters in the latest release that let users quickly modify the amount of memory the assembly steps start with, select different BUSCO databases, and state whether long reads are corrected or high quality. https://github.com/fmalmeida/MpGAP/releases/tag/v3.2.0
Hope it helps. If the error persists, we can open a new ticket to tackle it.
[intergalactic_knuth] Nextflow Workflow Report.pdf
Hi,
I am trying to assemble a plant genome (~800m) from PacBio Revio reads.
Here is the command I use:
nextflow -bg run fmalmeida/mpgap --output _ASSEMBLY --max_cpus 20 --skeep_wtdbg2 --genome_size 800m --input MPGAP_samplesheet1.yml -profile docker
here is the yml file contents
The process started, but at some point I get error messages similar to the following for all the assemblers:
[Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 5; name: LONGREADS_ONLY:canu (sample_5); status: COMPLETED; exit: 1; error: -; workDir: /mnt/data/guyh/Trifolium/Revio/work/a8/a8499e26430751241cde25981ce53b]
A PDF version of the mpgap report is attached.
Can you please advise?
Thank you in advance.
Guy
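For anyone hitting the same error: the DEBUG line above only reports `exit: 1`, and the assembler's own message is usually found in the task's work directory. A minimal sketch, assuming the hidden files Nextflow normally writes there (`.exitcode`, `.command.err`, `.command.log`) are present; `show_task_failure` is a hypothetical helper name, not an MpGAP or Nextflow command:

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize a failed Nextflow task from its work directory.
show_task_failure() {
  workdir=$1
  # Nextflow records the task's exit status in .exitcode
  if [ -f "$workdir/.exitcode" ]; then
    echo "exit code: $(cat "$workdir/.exitcode")"
  fi
  # Print the tail of stderr and of the combined log, where tools
  # typically report problems such as insufficient memory
  for f in .command.err .command.log; do
    if [ -s "$workdir/$f" ]; then
      echo "--- last lines of $f ---"
      tail -n 20 "$workdir/$f"
    fi
  done
}

# Usage, with the workDir copied from the log line in this issue:
# show_task_failure /mnt/data/guyh/Trifolium/Revio/work/a8/a8499e26430751241cde25981ce53b
```

Pasting that output into the issue alongside the report makes it easier to tell a memory failure from a parameter problem.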