How is multiple redundancy handled? #8

Closed
tkatsuki opened this issue Nov 3, 2015 · 1 comment

Comments

tkatsuki commented Nov 3, 2015

How does the reduction step handle redundancy that involves more than two sequences? When mapped to itself, my assembly had some contigs that aligned to multiple other, shorter contigs at high confidence (>99% identity, longer than 250 nt, as shown in the attached image, which is actually from after running fasta2homozygous.py). I want to remove all but the longest contig. When I ran redundans (only fasta2homozygous.py), I saw messages like "matched already removed contig". Does this mean that redundans removes only the first matched sequence?

Thanks,

Takeo
[Attached image: clipboard]

lpryszcz (Collaborator) commented Dec 7, 2015

Hi Takeo, in principle fasta2homozygous.py should remove all contigs that fulfill the identity & overlap criteria, except the longest one. Isn't that the case with your data?
Don't worry about this message; it just means that some of the contigs are similar to contigs that were previously marked for removal. This is expected when you remove multiple overlapping contigs.
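
To illustrate why such a message is harmless, here is a minimal Python sketch of a greedy reduction. It is not the actual fasta2homozygous.py code: the function name `reduce_contigs`, the hit tuple layout, and the thresholds (borrowed from the numbers in the question) are all assumptions made for illustration.

```python
def reduce_contigs(lengths, hits, min_identity=0.99, min_aligned=250):
    """Greedy sketch: keep the longest contig of each redundant group.

    lengths: dict mapping contig name -> length in nt
    hits:    iterable of (query, target, identity, aligned_length) tuples
    Hypothetical helper; thresholds are illustrative, not redundans defaults.
    """
    # Index qualifying hits by the longer contig of each pair.
    by_target = {}
    for query, target, identity, aligned in hits:
        if query == target or identity < min_identity or aligned < min_aligned:
            continue
        # The shorter contig of the pair is the removal candidate.
        if lengths[query] > lengths[target]:
            query, target = target, query
        by_target.setdefault(target, []).append(query)

    removed, skipped = set(), []
    # Visit contigs from longest to shortest.
    for target in sorted(lengths, key=lengths.get, reverse=True):
        for query in by_target.get(target, []):
            if target in removed:
                # Analogous to "matched already removed contig".
                skipped.append((query, target))
            else:
                removed.add(query)

    kept = [name for name in lengths if name not in removed]
    return kept, removed, skipped


lengths = {"A": 5000, "B": 1200, "C": 800}
hits = [
    ("B", "A", 0.995, 1200),  # B is redundant with A
    ("C", "A", 0.993, 800),   # C is redundant with A
    ("C", "B", 0.999, 700),   # C also matches B, which is already removed
]
kept, removed, skipped = reduce_contigs(lengths, hits)
print(kept, sorted(removed), skipped)
```

In this toy case only `A` is kept, and the `C` vs `B` hit is merely reported as skipped because `B` was already marked for removal. Note that in this sketch a contig that matches only an already-removed contig would be kept; whether redundans behaves the same way is not something this thread confirms.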

lpryszcz closed this as completed Dec 7, 2015