PF variants in fade region are not totally filtered #32

Open
xiucz opened this issue May 9, 2022 · 2 comments

Comments

@xiucz

xiucz commented May 9, 2022

Hi,
After digging into your wonderful tool, I have one more question. After running my bam file (reacalibrate.bam, produced following GATK best practices) through fade annotate and fade out (without -c), most FP variants no longer show up in the bam (the first IGV panel, fade.bam).

The second bam, mt2.bam, comes from the mutect2 --bamout option; let's ignore it.

However, there are still some FP variants that fade cannot filter out. I know this is not a bug in fade.

We know that fade can filter or trim reads that meet fade's internal threshold. But what if a variant is supported both by reads that meet the threshold (read A) and by reads that do not (read B)? Should read B be removed as well?

[Image: IGV screenshot; first panel fade.bam, second panel mt2.bam]

Best,
xiucz

@charlesgregory
Collaborator

Thank you for your kind words about fade!

Currently, the only reads fade can remove are those it identifies as "artifact". It determines this by realigning the read to the local sequence of the original alignment. If you queryname-sort your bam/sam file before using fade out, fade will remove any read pair in which either mate is identified as "artifact". To assess variants as a way of determining artifact reads, fade would have to use a variant caller or perform some rudimentary variant calling as part of its analysis.

This would be quite out of fade's scope and would likely be a large undertaking.
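To illustrate why queryname sorting matters for the pair removal described above, here is a minimal sketch. It is not fade's implementation: the function name `drop_artifact_pairs` and the `(qname, is_artifact)` tuples are hypothetical stand-ins for real alignment records. It only shows that grouping mates by name works when mates are adjacent in the input.

```python
from itertools import groupby


def drop_artifact_pairs(reads):
    """Drop every group of adjacent same-named reads if any member is an artifact.

    `reads` is an iterable of (qname, is_artifact) tuples (hypothetical
    stand-ins for alignment records). Mates are only grouped together
    when they are adjacent, i.e. when the input is queryname-sorted.
    """
    kept = []
    for _, group in groupby(reads, key=lambda r: r[0]):
        mates = list(group)
        # Keep the group only if no mate was flagged as an artifact.
        if not any(is_artifact for _, is_artifact in mates):
            kept.extend(mates)
    return kept


# Queryname-sorted input: both mates of read1 are dropped together.
sorted_reads = [
    ("read1", False), ("read1", True),
    ("read2", False), ("read2", False),
]
kept_sorted = drop_artifact_pairs(sorted_reads)

# Input not sorted by name: the clean mate of "a" is not adjacent to its
# artifact mate, so it survives.
unsorted_reads = [("a", True), ("b", False), ("a", False)]
kept_unsorted = drop_artifact_pairs(unsorted_reads)
```

In the unsorted case the non-artifact mate of `a` is kept, which is the situation queryname sorting avoids.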

Currently, you could use fade's extract function to extract the artifact reads in bam format. You could then use that bam file to create a bed file of regions in which you wish to ignore variants.
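The bed-file step could look roughly like the sketch below. It assumes the alignment coordinates of the extracted artifact reads have already been exported as 0-based, half-open `(chrom, start, end)` tuples (for example with a tool such as `bedtools bamtobed`). The helper names `merge_intervals`, `to_bed`, and `in_regions` and the optional `pad` parameter are hypothetical and not part of fade.

```python
def merge_intervals(intervals, pad=0):
    """Merge overlapping or touching (chrom, start, end) intervals.

    Coordinates are 0-based, half-open, as in BED. `pad` optionally
    widens each interval on both sides before merging.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        start = max(0, start - pad)
        end = end + pad
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps (or touches) the previous region: extend it.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


def to_bed(regions):
    """Format regions as tab-separated BED lines."""
    return "".join(f"{c}\t{s}\t{e}\n" for c, s, e in regions)


def in_regions(regions, chrom, pos):
    """Return True if the 0-based position falls inside any region."""
    return any(c == chrom and s <= pos < e for c, s, e in regions)


# Made-up coordinates, for illustration only.
artifact_reads = [("chr1", 100, 250), ("chr1", 200, 300), ("chr2", 50, 120)]
regions = merge_intervals(artifact_reads)
bed_text = to_bed(regions)
```

A variant whose position satisfies `in_regions(regions, chrom, pos)` could then be ignored or flagged downstream. Note that VCF positions are 1-based, so subtract 1 before the lookup.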

Hopefully that helps answer your question.

@xiucz
Author

xiucz commented May 16, 2022

@charlesgregory

Thank you, using fade's extract is a good idea. So I started looking for the breakpoint where the reverse complementation happened.

image
The picture is taken from the article.

  1. The first length refers to 47 bp (soft clip) + 8 bp (inverted repeat), which should be trimmed with the -c option.
  2. BreakpointA refers to the reverse-complement point.
  3. The second length refers to 8 bp (inverted repeat) + 47 bp (possibly natural sequence). But it seems that fade uses this strategy to trim bases.

I want to know which strategy should be used: the first length or the second length?

Best,
xiucz
