How does Circle-Map use hard clipped reads? #23

Closed
tchrisboles opened this issue Aug 12, 2019 · 1 comment

Comments

@tchrisboles

Inigo,
This is possibly more a question about the structure of SAM/BAM files (I am a beginner in bioinformatics). If so, I apologize in advance.

I have been using Circle-Map to analyze my mtDNA sequencing runs, and also to see if I can identify circular DNA from other chromosomes in my circle-enriched preparations.

In my last prep, Circle-Map assigned some very high circle scores to regions with no discordant pairs (see orange shading). Also coverage was highly variable across the region. So you might want to consider assigning more scoring weight to coverage uniformity of potential circles.

image

I did find one other region with very low coverage but which had both discordant reads and split reads (see yellow shaded entry above). I wanted to see the details of these reads and so I inspected the information for the individual reads with IGV. Here is an IGV image of one of the discordant (and split) reads. The read information for the read bracketed in red is shown in the next image.

image

image

My confusion is that IGV says that the read is hard-clipped. I had the impression that Circle-Map would ignore hard-clipped reads. Is that correct? Or is it the case that Circle-Map only looks for supplementary alignments, whether the read is hard or soft clipped?

Thanks,
Chris

@iprada
Owner

iprada commented Aug 14, 2019

Dear Chris,

Thanks a lot for your questions and feedback.

> In my last prep, Circle-Map assigned some very high circle scores to regions with no discordant pairs (see orange shading). Also coverage was highly variable across the region. So you might want to consider assigning more scoring weight to coverage uniformity of potential circles.

Regarding this question, I plan to substantially improve the scoring model during the next couple of months. The reason for the low priority is that we (in my lab) do not use the circle scores in our downstream analysis. However, if you (or anybody else reading this) consider this feature very important for the way you handle your data downstream of Circle-Map, let me know on this thread and I will try to speed up the development of the model.

> My confusion is that IGV says that the read is hard-clipped. I had the impression that Circle-Map would ignore hard-clipped reads. Is that correct? Or is it the case that Circle-Map only looks for supplementary alignments, whether the read is hard or soft clipped?

This is a very good question. The short answer: Circle-Map uses the hard-clipped reads to construct the graph and to guide the realignment of the soft-clipped reads, but it does not use hard-clipped reads for the detection itself.

The long answer: in some cases, BWA MEM is able to identify split reads by itself. In those cases, following the SAM format, BWA MEM represents a split read (a read crossing the circular DNA breakpoint) as two or more separate alignments of the same read:

  • The primary alignment will be a soft-clipped read that carries an extra tag (the SA tag; you can find the information about the tag on SAM optional fields page 3) indicating the location of the hard-clipped alignment.

  • The supplementary alignment(s) (marked as secondary when BWA MEM is run with -M) will be hard-clipped and will carry the same kind of tag, indicating the location of the soft-clipped primary alignment.

Hence, you can see that a single read can be represented by several alignments in the SAM file. If Circle-Map encounters such a case, it will skip the hard-clipped alignment, but it will use the primary alignment of that read for detecting the circle.
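
To make the distinction concrete, here is a minimal sketch in plain Python of how the two alignments of one split read can be told apart from the FLAG, CIGAR and SA fields of a SAM record. It is only an illustration and not Circle-Map's actual code: the function names `parse_sa_tag` and `classify_alignment`, the labels they return, and the two SAM records are all made up for this example.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_sa_tag(value):
    """Parse the value of an SA:Z tag.

    Per the SAM optional-fields spec, each entry is
    'rname,pos,strand,CIGAR,mapQ,NM;' and entries are concatenated.
    """
    hits = []
    for entry in value.rstrip(";").split(";"):
        rname, pos, strand, cigar, mapq, nm = entry.split(",")
        hits.append({"rname": rname, "pos": int(pos), "strand": strand,
                     "cigar": cigar, "mapq": int(mapq), "nm": int(nm)})
    return hits


def classify_alignment(sam_line):
    """Return (label, sa_hits) for one SAM record (hypothetical helper)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    ops = {op for _, op in CIGAR_OP.findall(fields[5])}

    sa_hits = None
    for tag in fields[11:]:          # optional fields start at column 12
        if tag.startswith("SA:Z:"):
            sa_hits = parse_sa_tag(tag[5:])

    # 0x100 = secondary, 0x800 = supplementary
    is_primary = not (flag & 0x100 or flag & 0x800)

    if "H" in ops:
        # Hard-clipped record: skipped for detection in this sketch.
        return "skipped_hard_clipped", sa_hits
    if is_primary and "S" in ops and sa_hits:
        # Soft-clipped primary alignment pointing at its other half.
        return "split_read_candidate", sa_hits
    return "other", sa_hits


# Two made-up records for the same 100 bp read spanning a breakpoint.
primary = "\t".join([
    "read1", "0", "chrM", "16500", "60", "60M40S", "*", "0", "0",
    "A" * 100, "I" * 100, "SA:Z:chrM,1,+,60H40M,60,0;"])
supplementary = "\t".join([
    "read1", "2048", "chrM", "1", "60", "60H40M", "*", "0", "0",
    "A" * 40, "I" * 40, "SA:Z:chrM,16500,+,60M40S,60,0;"])

for record in (primary, supplementary):
    label, hits = classify_alignment(record)
    print(label, hits[0]["rname"], hits[0]["pos"], hits[0]["cigar"])
```

Note that the hard-clipped record stores only the 40 aligned bases, while the soft-clipped primary record keeps the full 100 bp read sequence, which is why the primary alignment is the more useful one to work from.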

I hope this is helpful

Best wishes,

Iñigo
