From your experience, do you have suggestions for combinations of parameters to use on a sample of raw paired-end reads with a mean read depth of 15x?
I have tried 13 combinations that vary in 1) the sample used (either a ~17x coverage or a 10x coverage sample), 2) the assembler used (megahit or spades), 3) the data size used for assembly (5, 25, 50 and 80 Gbp), 4) the kmers ("Large": 39 59 79 99 119 141, or "Small": 21 31 41 51 61 71 81 91) and 5) whether the reads were trimmed or not. The attached table summarises the combinations I have tried (MitoZ_combinations.xlsx).
The command that works best finds all genes, but the assembly is non-circular and produces two seq_ids (combo5summary.txt). The read depth across the genome looks OK apart from the beginning (combo5circos.depth.txt). The command is listed below under "The command you used?".
In my experience with mammals (your samples are birds), 2-5 Gbp or 8 Gbp of data is enough to assemble a circular mitogenome.
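To relate those data sizes to read counts, the arithmetic below may help. It is a minimal sketch: the helper name `gbp_to_read_pairs` is made up for illustration, and it assumes every read has the full length given to `--fastq_read_length` (151 in the command in this issue), so the counts are approximate for trimmed reads.

```shell
# Hypothetical helper: approximate number of read pairs in a given amount
# of paired-end data. Assumes all reads have the same length.
#   $1 = data size in Gbp, $2 = read length in bp
gbp_to_read_pairs() {
    awk -v gbp="$1" -v len="$2" 'BEGIN {
        # one pair contributes 2 * len bases; the result is truncated to an integer
        printf "%d\n", gbp * 1e9 / (2 * len)
    }'
}

# Data sizes mentioned in this thread, at 151 bp reads
for size in 2 5 8 25; do
    printf '%s Gbp is roughly %s read pairs\n' "$size" "$(gbp_to_read_pairs "$size" 151)"
done
```

With 151 bp reads, 5 Gbp works out to about 16.6 million read pairs, so the 25 Gbp used in combination 5 is roughly five times that.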
I have no better recommendations for now. But maybe you could map all the raw data to the mitogenomes of some closely related species, using a loose cutoff so that many alignable reads are kept, and then assemble the mitogenome from the mapped reads with MitoZ?
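One way this mapping step could look is sketched below, assuming `bwa` and `samtools` are installed. This is not part of MitoZ: the function name `bait_mito_reads`, the file names, and the `-k 15 -T 20` values (a shorter seed and a lower score threshold than the bwa defaults, as one possible "loose cutoff") are illustrative assumptions. `samtools view -G 12` drops only the pairs in which both mates are unmapped, so a pair is kept whenever at least one mate aligns.

```shell
# Sketch only: extract read pairs that align to related mitogenomes.
# Function name, file names and the -k/-T values are assumptions made for
# illustration; they are not MitoZ recommendations.
bait_mito_reads() {
    if [ "$#" -ne 4 ]; then
        echo "usage: bait_mito_reads ref.fasta R1.fastq.gz R2.fastq.gz out_prefix" >&2
        return 2
    fi
    ref=$1; fq1=$2; fq2=$3; out=$4

    # Index the reference mitogenomes, then map with relaxed settings
    bwa index "$ref" &&
    bwa mem -t 20 -k 15 -T 20 -o "$out.sam" "$ref" "$fq1" "$fq2" &&
    # Keep every pair in which at least one mate is mapped
    samtools view -b -G 12 -o "$out.kept.bam" "$out.sam" &&
    # Group mates together so they can be written out as pairs
    samtools collate -o "$out.collated.bam" "$out.kept.bam" &&
    # Write paired FASTQ files; discard singletons and unflagged reads
    samtools fastq -n -1 "${out}_R1.fastq" -2 "${out}_R2.fastq" \
        -0 /dev/null -s /dev/null "$out.collated.bam"
}
```

A call might look like `bait_mito_reads related_mitogenomes.fasta reads_R1.fastq.gz reads_R2.fastq.gz sw_baited`, where the reference file name is hypothetical. The resulting `sw_baited_R1.fastq` and `sw_baited_R2.fastq` could then be given to `--fq1` and `--fq2` in the `mitoz all` command. Writing an intermediate SAM file keeps the sketch simple but can take a lot of disk space for a full data set; piping `bwa mem` into `samtools view` avoids that.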
On which platform/server? (Windows? Windows Subsystem for Linux? macOS? Ubuntu? etc.)
Linux
MitoZ version?
3.6
How did you install MitoZ? (e.g. Docker, Udocker, Singularity, Conda-Pack, Conda, or source code)
Conda
Did you run a test after your installation, and was the test run okay?
Yes. OK.
How much data (roughly) did you use for mitogenome assembly? e.g. 5Gbp?
25 Gbp.
The command you used?
mitoz all \
  --outprefix sw \
  --thread_number 20 \
  --clade Chordata \
  --requiring_taxa Chordata \
  --genetic_code 2 \
  --species_name "Seychelles warbler" \
  --fq1 102_ACTTAGATCG-CGGAATTCTT_L002__trimmed_paired_R1.fastq.gz \
  --fq2 102_ACTTAGATCG-CGGAATTCTT_L002__trimmed_paired_R2.fastq.gz \
  --fastq_read_length 151 \
  --data_size_for_mt_assembly 25,0 \
  --assembler megahit \
  --kmers_megahit 39 59 79 99 119 141 \
  --memory 100 \
  --requiring_taxa Chordata \
  --min_abundance 0
Problem description
The raw paired-end reads can be found here:
102: https://cgr.liv.ac.uk/illum/LIMS26629_51a15827930a0b65/Raw/Sample_102/
53: https://cgr.liv.ac.uk/illum/LIMS25133_4f8b5ec41474a239/Raw/Sample_53-11998DH0147L01_4879/
Log messages from MitoZ (stdout and stderr, e.g., both m.log and m.err files)
Attached as combo5.log and combo5errorsummaryval.txt