Isoseq collapse filtering out criteria #664

Closed
MengjunWu opened this issue Mar 20, 2024 · 2 comments

@MengjunWu

Hi,
I have a problem with isoseq collapse. Although most of my reads (90%) are mapped, almost half of them are filtered out by isoseq collapse. How do you calculate coverage and identity? I am using the mg tag to get the identity, and calculating coverage per read as the number of matches and mismatches in the CIGAR string divided by the read length, but with the same thresholds I get far fewer reads filtered out than isoseq collapse does. Is either coverage or identity calculated differently?

Many thanks
Mengjun

@armintoepfer
Member

Assigning to @jmattick

@jmattick
Contributor

Hi @MengjunWu,
collapse filters reads based on the following criteria:

  1. Read is mapped
  2. Read is a primary alignment
  3. Read meets the minimum coverage, computed as (aligned end - aligned start) / (read length)
  4. Read meets the minimum identity, computed as matches / (matches + mismatches + inserted bases + deleted bases)
  5. Optional: if using the single-cell workflow, the read must be marked as coming from a real cell via the rc tag.
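
To compare against your own numbers, the two formulas in criteria 3 and 4 can be sketched from a CIGAR string roughly as follows. This is only an illustration and not the actual isoseq code: the function name `alignment_metrics` is made up, it assumes an extended CIGAR with `=`/`X` operations, it takes the aligned start and end in read coordinates as the clipped lengths at each end, and it assumes `N` (intron) operations are left out of the deleted-base count.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def alignment_metrics(cigar):
    """Return (coverage, identity) for an extended CIGAR string (=/X ops)."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if not ops:
        raise ValueError("empty or invalid CIGAR")
    if any(op == "M" for _, op in ops):
        raise ValueError("M ops cannot separate matches from mismatches")

    # Total bases per operation type.
    counts = {}
    for n, op in ops:
        counts[op] = counts.get(op, 0) + n

    # Clipped bases (soft or hard) at the start and end of the read.
    i, lead = 0, 0
    while i < len(ops) and ops[i][1] in "SH":
        lead += ops[i][0]
        i += 1
    j, trail = len(ops) - 1, 0
    while j >= i and ops[j][1] in "SH":
        trail += ops[j][0]
        j -= 1

    # Read length: every op that consumes read bases, clips included.
    read_length = sum(n for n, op in ops if op in "=XISH")
    if read_length == 0:
        raise ValueError("CIGAR consumes no read bases")

    # Criterion 3: (aligned end - aligned start) / read length.
    aligned_start = lead
    aligned_end = read_length - trail
    coverage = (aligned_end - aligned_start) / read_length

    # Criterion 4: matches / (matches + mismatches + inserted + deleted bases).
    # Assumption: N (intron skip) is not counted as a deletion.
    matches = counts.get("=", 0)
    denom = (matches + counts.get("X", 0)
             + counts.get("I", 0) + counts.get("D", 0))
    if denom == 0:
        raise ValueError("no aligned bases in CIGAR")
    identity = matches / denom
    return coverage, identity


coverage, identity = alignment_metrics("10S480=5X3I2D500=2S")
print(f"coverage={coverage:.4f} identity={identity:.4f}")
```

For the example CIGAR, a coverage defined as (matches + mismatches) / read length would give 985 / 1000, whereas the span-based definition gives 988 / 1000, so the two approaches are not interchangeable.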

These minimum values can be changed using the following options.

Alignment Filter Options:
  --min-aln-coverage              FLOAT  Ignore alignments with less than minimum query read coverage. [0.99]
  --min-aln-identity              FLOAT  Ignore alignments with less than minimum alignment identity. [0.95]
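
As a rough check of which criterion drops a given read, the two thresholds can be applied like this. It is a sketch under assumptions, not the isoseq implementation: `failed_filters` is a hypothetical helper, the defaults are the values shown in brackets in the help text above, and a value exactly equal to the threshold is assumed to pass because the help text says "less than minimum".

```python
def failed_filters(coverage, identity,
                   min_aln_coverage=0.99, min_aln_identity=0.95):
    """Return the names of the alignment filters a read would fail."""
    failures = []
    if coverage < min_aln_coverage:
        failures.append("coverage")
    if identity < min_aln_identity:
        failures.append("identity")
    return failures


# A read with 98.8% coverage and ~99% identity fails only on coverage
# with the default thresholds.
print(failed_filters(0.988, 980 / 990))
# The same read passes if the coverage threshold is lowered to 0.95.
print(failed_filters(0.988, 980 / 990, min_aln_coverage=0.95))
```

Tallying the failure reasons over all reads should show whether coverage or identity accounts for most of the discarded reads.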
