Feature request: Add reads de-duplication step before assembly #104
Comments
Hello, can you check and share the contents of the file called ‘flye.log’ inside this work dir? The '.nextflow.log' file, which contains the command line used, would also be helpful. With only what has been shared so far, we cannot tell what caused the problem. |
Hello, thank you for your response. The contents of flye.log show duplicated IDs in the subreads:
[2023-09-14 11:29:48] INFO: Starting Flye 2.9-b1768
[2023-09-14 11:29:48] INFO: >>>STAGE: configure
[2023-09-14 11:29:48] INFO: Configuring run
[2023-09-14 11:30:01] INFO: Total read length: 407342569
[2023-09-14 11:30:01] INFO: Reads N50/N90: 5930 / 880
[2023-09-14 11:30:01] INFO: Minimum overlap set to 1000
[2023-09-14 11:30:01] INFO: >>>STAGE: assembly
[2023-09-14 11:30:01] INFO: Assembling disjointigs
[2023-09-14 11:30:01] INFO: Reading sequences
[2023-09-14 11:30:12] ERROR: The input contain reads with duplicated IDs. Make sure all reads have unique IDs and restart. The first problematic ID was: SRR3667790.17
[2023-09-14 11:30:12] ERROR: Command '['flye-modules', 'assemble', '--reads', '/home/centos/Michael/Bacannot/work/06/36060129688447e2db9077eb8d6650/SRR3667790_subreads.fastq.gz', '--out-asm', '/home/centos/Michael/Bacannot/work/06/36060129688447e2db9077eb8d6650/flye_SRR3667790/00-assembly/draft_assembly.fasta', '--config', '/usr/local/lib/python3.9/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg', '--log', '/home/centos/Michael/Bacannot/work/06/36060129688447e2db9077eb8d6650/flye_SRR3667790/flye.log', '--threads', '8', '--min-ovlp', '1000']' returned non-zero exit status 1.
[2023-09-14 11:30:12] ERROR: Pipeline aborted
I think it would be better to add dedupe.sh from the BBMap suite, or some other option, to remove duplicates from the subreads. |
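To confirm which read IDs are repeated before re-running the assembly, something like the following rough Python sketch could be used. It is not part of the pipeline or of Flye; the function names `read_ids` and `duplicated_ids` are made up for illustration, and it assumes a plain 4-line-per-record FASTQ (optionally gzipped) where the ID is the first whitespace-delimited token of the header line.

```python
import gzip
from collections import Counter


def read_ids(path):
    """Yield the ID of every record in a 4-line-per-record FASTQ file.

    Assumption: sequences and qualities are not wrapped over several lines,
    so every 4th line (starting from the first) is a header.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 0:
                # Drop the leading '@' and keep only the first token.
                parts = line[1:].split()
                yield parts[0] if parts else ""


def duplicated_ids(path):
    """Return a dict {read_id: count} for IDs that occur more than once."""
    counts = Counter(read_ids(path))
    return {read_id: n for read_id, n in counts.items() if n > 1}


if __name__ == "__main__":
    import sys

    # Usage: python find_dup_ids.py reads.fastq.gz
    for read_id, n in sorted(duplicated_ids(sys.argv[1]).items()):
        print(f"{read_id}\t{n}")
```

An empty output would suggest the IDs are already unique; any listed ID (such as the SRR3667790.17 reported in the log above) would need to be renamed or removed before Flye accepts the file.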
Hi @Michaelijesse,
In the meantime, I will rename this ticket and turn it into a feature request for adding such a step to the pipeline. Once again, thanks for sharing. |
Hi @fmalmeida, thank you for helping out. I will do that: I will dedup the reads and run it again. |
seqkit rename <fastq.gz> works fine. |
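For anyone without seqkit at hand, the same idea (making repeated read IDs unique rather than dropping reads) can be sketched in a few lines of Python. This is only an illustration, not seqkit's implementation: the function name `rename_duplicates` and the `_2`, `_3`, ... suffix scheme are assumptions of this sketch, and it expects a 4-line-per-record FASTQ where the ID ends at the first space.

```python
import gzip
from collections import defaultdict


def rename_duplicates(in_path, out_path):
    """Copy a FASTQ file, appending _2, _3, ... to repeated read IDs.

    Returns the number of records that were renamed.
    Limitations of this sketch: a generated ID such as 'x_2' is not checked
    against IDs already present in the input, and an ID repeated on the '+'
    line is left unchanged.
    """
    seen = defaultdict(int)
    renamed = 0
    open_in = gzip.open if str(in_path).endswith(".gz") else open
    open_out = gzip.open if str(out_path).endswith(".gz") else open
    with open_in(in_path, "rt") as src, open_out(out_path, "wt") as dst:
        for line_no, line in enumerate(src):
            if line_no % 4 == 0:  # header line of a 4-line record
                header = line.rstrip("\n")
                read_id, sep, rest = header[1:].partition(" ")
                seen[read_id] += 1
                if seen[read_id] > 1:
                    read_id = f"{read_id}_{seen[read_id]}"
                    renamed += 1
                line = f"@{read_id}{sep}{rest}\n"
            dst.write(line)
    return renamed


if __name__ == "__main__":
    import sys

    # Usage: python rename_dup_ids.py in.fastq.gz out.fastq.gz
    n = rename_duplicates(sys.argv[1], sys.argv[2])
    print(f"renamed {n} duplicated IDs", file=sys.stderr)
```

Renaming keeps every read in the input, whereas a sequence-level dedup step would discard reads; which of the two is appropriate depends on why the IDs were duplicated in the first place.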
Perfect, will add this. Thanks! |
All solved!! |