paired-end mode calling #52

Open
avilella opened this issue Sep 7, 2017 · 2 comments

Comments


avilella commented Sep 7, 2017

Hi,

There is a new sample prep that generates read pairs where the CpG
modification needs to be read from both reads in a pair. For example:

ff = forward/forward read pairs
rr = reverse/reverse read pairs

RefC:Base1Base2 RefG:Base1Base2

C:..  G:A.  ff  =>  ModA
C:,t  G:,,  rr  =>  ModA
C:.T  G:A.  ff  =>  C
C:t,  G:,a  rr  =>  C
C:..  G:..  ff  =>  ModB
C:,,  G:,,  rr  =>  ModB
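As a rough illustration of the table above, the six example patterns could be encoded as a plain lookup. This is only a sketch and not part of MethylDackel: the names `PAIR_CALLS`, `orientation`, and `classify_pair` are hypothetical, and it assumes the table uses samtools mpileup notation (`.` or an uppercase letter for the forward strand, `,` or a lowercase letter for the reverse strand). Patterns not listed in the table are deliberately left unclassified.

```python
# Lookup built only from the six example rows in the table above.
# Key: (Base1Base2 over the reference C, Base1Base2 over the reference G).
PAIR_CALLS = {
    ("..", "A."): "ModA",  # ff
    (",t", ",,"): "ModA",  # rr
    (".T", "A."): "C",     # ff
    ("t,", ",a"): "C",     # rr
    ("..", ".."): "ModB",  # ff
    (",,", ",,"): "ModB",  # rr
}


def orientation(c_bases, g_bases):
    """Return 'ff', 'rr', or None when strands are mixed (assumed mpileup notation)."""
    chars = c_bases + g_bases
    if all(ch == "." or ch.isupper() for ch in chars):
        return "ff"
    if all(ch == "," or ch.islower() for ch in chars):
        return "rr"
    return None


def classify_pair(c_bases, g_bases):
    """Return 'ModA', 'C', 'ModB', or None for a pattern not in the table."""
    return PAIR_CALLS.get((c_bases, g_bases))
```

For instance, `classify_pair(".T", "A.")` gives `"C"` and `orientation(".T", "A.")` gives `"ff"`, matching the third row of the table.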

I couldn't find a way to obtain this in MethylDackel currently,
though maybe there is one...

Since it seems that MethylDackel is being adopted more and more in the
community, I thought I'd propose having this new sample prep
implemented in it (I wouldn't want to hack this together in the
"other" methylation caller, whose name starts with a b and ends
with a k, and be stuck with not being able to align the reads with
bwa-meth).

So far I have scripted this in a very slow way:

  • Take the coordinate-sorted BAM and sort it by read name, so that the
    reads of each pair are consecutive.

  • Create a mini-SAM entry for each pair of reads, run samtools
    mpileup, intersect it with the list of CpGs, and record which type
    each CpG is for this read pair in a tabular file.

  • Collate the tabular file for summary statistics, per-CpG
    percentages, etc.
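The last step above could look something like the following sketch. The layout of the tabular file is not given here, so the `(chrom, pos, call)` row shape and the `collate` function name are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def collate(rows):
    """Turn per-read-pair calls into per-CpG percentages.

    rows: iterable of (chrom, pos, call) tuples, one per read pair and CpG,
    where call is 'ModA', 'C', or 'ModB' (assumed layout of the tabular file).
    Returns {(chrom, pos): {call: percentage}}.
    """
    counts = defaultdict(Counter)
    for chrom, pos, call in rows:
        counts[(chrom, pos)][call] += 1

    summary = {}
    for site, calls in counts.items():
        total = sum(calls.values())
        summary[site] = {call: 100.0 * n / total for call, n in calls.items()}
    return summary
```

Counting per site in a single pass keeps memory proportional to the number of CpGs rather than the number of read pairs.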

On a scale of 1 to 10, how different would this be from the
current codebase, and how feasible would it be to implement?

dpryan79 (Owner) commented Sep 7, 2017

Is this method published anywhere? I'd be interested in reading exactly how all of the library prep is working.

The ModA and ModB case should be easy to implement, but I'm not sure I understand enough of what's going on in the C case to say how hard that would be to implement.

dpryan79 (Owner) commented Sep 7, 2017

Interesting. Feel free to email me directly so I can keep everything confidential.
