Hi,
I am using PLASS (v4.687d7) on a set of metagenomes from ~100 cheese samples. It works very well, but I still have some questions.
In each dataset, a large fraction of the protein sequences (30% on average) are exact duplicates (100% identity and coverage). I understand that some duplication is expected (for example, sequences originating from closely related species), but 30% seems quite high.
Another issue is the total number of assembled amino acids. For example, from an initial dataset of 18 million reads (2x150 bp paired-end reads, 2.7 Gbp in total), 7 million proteins are assembled (2e+9 aa in total). That is almost as many amino acids as input nucleotides, which seems to me to be more than expected.
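To make the comparison concrete, here is a back-of-the-envelope check of the figures quoted above. It is only a sketch: the "naive" expectation assumes each input nucleotide contributes to at most one codon, which is my simplifying assumption and not something PLASS guarantees. The variable names are illustrative.

```python
# Figures quoted in the post.
reads = 18_000_000             # individual reads (2x150 bp paired-end)
read_len = 150                 # bp per read
assembled_proteins = 7_000_000
assembled_aa = 2_000_000_000   # 2e+9 aa in total

# Total input nucleotides: 18 million reads x 150 bp.
total_nt = reads * read_len

# Naive upper bound (assumption): every nucleotide is used once,
# three nucleotides per amino acid.
naive_max_aa = total_nt // 3

# How far the assembly exceeds that naive bound.
ratio = assembled_aa / naive_max_aa

# Mean length of an assembled protein.
mean_len = assembled_aa / assembled_proteins

print(f"input nucleotides : {total_nt:.2e}")
print(f"naive max aa      : {naive_max_aa:.2e}")
print(f"assembled / naive : {ratio:.2f}")
print(f"mean protein len  : {mean_len:.0f} aa")
```

Under that assumption the assembly contains roughly twice as many amino acids as the input could encode once, which is why the total looks too high.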
Is there an explanation for these results?
I am using PLASS with the following command (other parameters left at their defaults):
plass assemble METAG_R1.fastq.gz METAG_R2.fastq.gz METAG_out.fasta -e 0.001 --num-iterations 12 --filter-proteins 1 --remove-tmp-files 1
Thanks
Helene
Plass can reuse each read in every iteration, so it tends to create many variants that are not necessarily useful. We generally use mmseqs linclust to remove fragments afterwards.
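As a rough illustration of the simplest part of that cleanup, the sketch below drops exact duplicates (100% identity and coverage) from a protein FASTA. It is not a replacement for mmseqs linclust: it only catches identical sequences, not shorter fragments or near-identical variants. The helper names `read_fasta` and `drop_exact_duplicates` are hypothetical, and the example records are made up.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_exact_duplicates(records: Iterable[Record]) -> Iterator[Record]:
    """Keep only the first record for each distinct sequence string."""
    seen = set()
    for header, seq in records:
        if seq not in seen:
            seen.add(seq)
            yield header, seq


# Made-up example: p2 duplicates p1 exactly; p3 is a shorter fragment.
example = """>p1
MKTAYIAK
>p2
MKTAYIAK
>p3
MKTAYI
""".splitlines()

unique = list(drop_exact_duplicates(read_fasta(example)))
for header, seq in unique:
    print(f">{header}\n{seq}")
```

Note that the fragment `p3` survives here. Removing fragments like that is the job the reply assigns to linclust, which clusters sequences by identity and coverage rather than by exact match.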