How to get the identity of an indel in pileup #294

Open
kentwait opened this issue Feb 16, 2021 · 1 comment

@kentwait

Is there a way to get the inserted or deleted sequence while doing a pileup?

For example, the first line below is the reference, followed by 3 mapped reads.
The first read has an insertion of GGC and the third has a deletion of GGC.

ref:   GCGGGCGGCTCTTATATTATAAAGAGCTAGAATATCG
read1: GCGGGCGGC[GGC]TCTTATATTATAAAGAGCTAGAATATCG
read2: GCGGGCGGCTCTTATATTATAAAGAGCTAGAATATCG
read3: GCGGGC---TCTTATATTATAAAGAGCTAGAATATCG

The example code below gives the length of the indel (3), but I don't know how to get the inserted or deleted bases.

// mark indel start
match alignment.indel() {
    bam::pileup::Indel::Ins(len) => println!("Insertion of length {} between this and next position.", len),
    bam::pileup::Indel::Del(len) => println!("Deletion of length {} between this and next position.", len),
    bam::pileup::Indel::None => ()
}

In samtools mpileup this would be +3GGC for insertion or -3GGC for deletion. Is there an existing method for this or do I need to manually process this by reading the CIGAR of each read?

Thank you for the help.

@ebioman
ebioman commented Aug 10, 2021

Hi @kentwait,
I currently have the same problem. Did you solve it, by any chance?
Thanks
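
For anyone landing here: the thread does not confirm a built-in method, so below is only a minimal sketch of the manual route, not the library's API. It assumes the inserted bases sit in the read right after the current query position, and the deleted bases sit in the reference right after the current pileup position. The `Indel` enum is a hypothetical stand-in that mirrors `bam::pileup::Indel` so the snippet compiles without rust-htslib, and `indel_string` is a made-up helper name. With rust-htslib the inputs would presumably come from `alignment.qpos()`, `alignment.record().seq().as_bytes()`, `pileup.pos()`, and a reference sequence loaded separately (for example from a FASTA file).

```rust
/// Hypothetical stand-in mirroring `bam::pileup::Indel`,
/// so this sketch compiles without rust-htslib.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Indel {
    Ins(u32),
    Del(u32),
    None,
}

/// Hypothetical helper: builds an mpileup-style indel string such as
/// "+3GGC" or "-3GGC".
///
/// * `read_seq` - full sequence of the read
/// * `qpos`     - 0-based position in the read at the current pileup column
/// * `ref_seq`  - reference sequence of the contig
/// * `ref_pos`  - 0-based reference position of the current pileup column
///
/// Returns `None` when there is no indel or the slice would run out of bounds.
pub fn indel_string(
    indel: Indel,
    read_seq: &[u8],
    qpos: usize,
    ref_seq: &[u8],
    ref_pos: usize,
) -> Option<String> {
    let (sign, len, source, start) = match indel {
        // Inserted bases exist only in the read, right after `qpos`.
        Indel::Ins(len) => ('+', len as usize, read_seq, qpos + 1),
        // Deleted bases exist only in the reference, right after `ref_pos`.
        Indel::Del(len) => ('-', len as usize, ref_seq, ref_pos + 1),
        Indel::None => return None,
    };
    let bases = source.get(start..start + len)?;
    Some(format!("{}{}{}", sign, len, String::from_utf8_lossy(bases)))
}
```

With the reads from the example above, read1 at column 8 with `Indel::Ins(3)` yields `+3GGC`, and read3 at column 5 with `Indel::Del(3)` yields `-3GGC`. The sketch does not reproduce samtools' strand-dependent upper/lower casing.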
