Indel alignment #84

Open
shimbalama opened this issue Jul 20, 2020 · 4 comments

Comments

@shimbalama

Hi Devs,

The IGV screenshot below shows the same deletion in short reads (top, BWA-MEM) and ONT reads (bottom, NGMLR). The 22 bp deletion is strewn across ~50 bp of the reference and varies in length. I'm in the business of calling somatic variants, so this 'wobble' or 'fuzziness', or whatever you want to call it, makes this a difficult problem. I can't find any discussion of this in your docs, so I just thought I'd touch base and see whether you could offer any way to mitigate it. This example is over a TAAAA repeat, which always exacerbates the problem; however, I've looked at thousands of indels now and most have the same issue to some extent. Some of my larger indels wobble across more than 1 kb. There is a similar issue with SV breakpoints (which I call using soft clipping).

Thanks,
Liam
[Image: igv_snapshot_chr22_26951324_indel_problem]
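To make the ambiguity concrete, here is a minimal sketch (not part of NGMLR) that lists every start position at which a deletion of a given length produces the same resulting sequence. The toy reference, the 10 bp deletion, and the function name `equivalent_deletion_starts` are assumptions made for illustration; they are not the chr22 locus from the screenshot.

```python
def equivalent_deletion_starts(ref: str, start: int, length: int) -> list:
    """Return all 0-based starts whose deletion yields the same haplotype."""
    # Haplotype produced by the deletion as originally reported.
    target = ref[:start] + ref[start + length:]
    # Brute force: try every possible start and keep the ones that match.
    return [
        s
        for s in range(len(ref) - length + 1)
        if ref[:s] + ref[s + length:] == target
    ]


# Hypothetical toy reference: unique flanks around six TAAAA units.
ref = "GCGC" + "TAAAA" * 6 + "CGTG"

# Delete two repeat units (10 bp) starting at the second unit.
starts = equivalent_deletion_starts(ref, 9, 10)
print(starts[0], starts[-1], len(starts))
```

In this toy case an aligner could place the same 10 bp deletion at any of 21 consecutive positions without changing the read sequence it explains, and that is before any sequencing error is added.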

@fritzsedlazeck
Collaborator

Hey,
Sadly, this is quite common with noisy data plus these low-complexity repeats. I don't understand how the upper track can be BWA-MEM, as the reads look quite different and do not contain many sequencing errors.

Unfortunately, there is currently no approach to, e.g., left-align these deletions.
Variant callers such as ours or others will be able to cope with this.
Thanks
Fritz
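For readers unfamiliar with the term, left-aligning a deletion means shifting it to the leftmost position that still yields the same sequence. The sketch below shows the idea on a plain reference string; it is an illustration under assumed inputs (a hypothetical toy reference and a hypothetical helper `left_align_deletion`), not something NGMLR does, and it would not remove wobble caused by sequencing errors inside the repeat.

```python
def left_align_deletion(ref: str, start: int, length: int) -> int:
    """Shift a deletion of ref[start:start+length] as far left as possible."""
    # Moving the deletion one base left is sequence-neutral when the base
    # that enters the deleted span equals the base that leaves it.
    while start > 0 and ref[start - 1] == ref[start + length - 1]:
        start -= 1
    return start


# Hypothetical toy reference: unique flanks around six TAAAA units.
ref = "GCGC" + "TAAAA" * 6 + "CGTG"

# The same 10 bp deletion reported at three different starts
# collapses to a single left-aligned position.
aligned = {left_align_deletion(ref, s, 10) for s in (9, 19, 24)}
print(aligned)
```

Normalizing every read's deletion this way before comparing positions is one common convention for making equivalent placements comparable.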

@shimbalama
Author

Thanks for your fast response, Fritz. Just FYI the reads at the top with BWA-MEM are Illumina 150bp reads.

@fritzsedlazeck
Collaborator

Ah, I got confused... sorry, too much going on here.
Yes, please give e.g. Sniffles a try. I have implemented some procedures to make an educated guess about where the breakpoint most likely is. Given the nature of this particular region, we could of course argue the whole day about which of the repeat units it occurs in, and I agree that this is a remaining challenge.

Thanks
Fritz

@shimbalama
Author

Thanks, Fritz
