clmp

clmp is an R extension, mostly written in C, for extracting genetic clusters from a phylogeny using a Markov-modulated Poisson process (MMPP) to model variation in branching rates. Our paper describing and evaluating MMPP as a model-based clustering method for HIV epidemiology was recently accepted in PLOS Computational Biology.

Background

A genetic cluster is a subset of nucleotide or amino acid sequences that are more similar to each other than to the remaining sequences in the sample population. For infectious diseases, a genetic cluster may correspond to an outbreak of cases that are clustered in space and/or time, which can imply that the cases are related by a common source.
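
To make the idea of a Markov-modulated Poisson process more concrete, here is a minimal base R sketch that simulates event times from a two-state MMPP. It is an illustration only, not the code clmp uses. The function name `simulate_mmpp`, its arguments, and the rate values are hypothetical choices for this example. A hidden state switches between a slow and a fast regime, and events (think of branching events along a lineage) arrive at the Poisson rate of the current state.

```r
# Illustrative sketch only: simulate_mmpp() is a hypothetical helper, not part of clmp.
# lambda[s]      : Poisson event rate while the hidden chain is in state s (1 or 2)
# switch_rate[s] : rate at which the hidden chain leaves state s
simulate_mmpp <- function(t_end, lambda, switch_rate, state = 1L) {
  t <- 0
  events <- numeric(0)
  states <- integer(0)
  repeat {
    # Competing exponential clocks: the next thing to happen is an event or a switch.
    total <- lambda[state] + switch_rate[state]
    t <- t + rexp(1, rate = total)
    if (t > t_end) break
    if (runif(1) < lambda[state] / total) {
      events <- c(events, t)       # an event occurs in the current state
      states <- c(states, state)
    } else {
      state <- 3L - state          # the hidden chain switches (1 <-> 2)
    }
  }
  data.frame(time = events, state = states)
}

set.seed(1)
# Arbitrary example rates: state 2 produces events five times faster than state 1.
sim <- simulate_mmpp(t_end = 10, lambda = c(1, 5), switch_rate = c(0.2, 0.5))
table(sim$state)
```

Events generated in the fast state tend to bunch together in time. Under the assumption that clusters correspond to lineages with elevated branching rates, this bunching is the signal that a clustering method based on an MMPP would look for.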

Usage

> require(clmp)
Loading required package: clmp
Loading required package: ape

> t1 <- read.tree('examples/test.nwk')  # a simulated tree with 1000 tips, 100 in clusters
> res <- clmp(t1)  # returns an ape::phylo tree object 
log likelihood for 2 state model is 2238.543290
rates: 495.368085 1305.115860 
Q: [    *   2.526691 ]
   [ 23.309483   *    ]

> index <- match(t1$tip.label, names(res$clusters))  
> labels <- grepl("_1_", t1$tip.label)  # extract truth from the tip labels
> table(labels, res$clusters[index])
      
labels    0   1
  FALSE 860   3  # false positive rate, 3/(3+860)=0.35%
  TRUE   13  98  # true positive rate, 98/(98+13)=88.3%
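
The rates annotated in the table above can be recomputed in R. The sketch below hard-codes the four counts from that output so it runs without the package; the object names `confusion`, `fpr`, and `tpr` are chosen for this example only. With clmp loaded, the same indexing should work on the object returned by `table(labels, res$clusters[index])`, assuming its rows and columns are labelled as shown.

```r
# Counts copied from the confusion table above.
# Rows are the true labels, columns are the cluster assignments.
confusion <- matrix(
  c(860, 13, 3, 98), nrow = 2,   # filled column by column
  dimnames = list(truth = c("FALSE", "TRUE"), cluster = c("0", "1"))
)

# False positive rate: unclustered tips that were assigned to a cluster.
fpr <- confusion["FALSE", "1"] / sum(confusion["FALSE", ])
# True positive rate: clustered tips that were assigned to a cluster.
tpr <- confusion["TRUE", "1"] / sum(confusion["TRUE", ])

# Report both rates as percentages.
round(100 * c(FPR = fpr, TPR = tpr), 2)
```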

Installation

For step-by-step instructions to compile the clmp R package from source, please refer to our installation documentation.
