Skip to content
This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

Performance comparisons

Kevin R. Thornton edited this page Jul 18, 2014 · 6 revisions

Let's look at performance of calculating the "c-alpha" statistic.

The following R code compares buRden to AssotesteR, which is implemented in native R. The code does 100 permutations of the data and calculates the test statistic for each shuffling of case/control status:

library(buRden)
library(AssotesteR)

data(rec.ccdata)
#Generate control/case status vector
rec.ccdata.status = c( rep(0,rec.ccdata$ncontrols),rep(1,rec.ccdata$ncases))
#Filter sites: 0 <= MAF in cases < 0.05 && r^2 between pairs < 0.8
rec.ccdata.keep = filter_sites(rec.ccdata$genos,rec.ccdata.status,1e-3,5e-2,0.8)

#This is the function in the buRden package
print(system.time( replicate(100,cAlpha(rec.ccdata$genos[,which(rec.ccdata.keep==1)],
      sample(rec.ccdata.status),simplecounts = TRUE))) )

#This is the function in the AssotesteR package
print(system.time( replicate(100,CALPHA(sample(rec.ccdata.status),
       rec.ccdata$genos[,which(rec.ccdata.keep==1)]))) )

buRden.calpha = cAlpha(rec.ccdata$genos[,which(rec.ccdata.keep==1)],
                       rec.ccdata.status,simplecounts = TRUE)
AssotesteR.calpha = CALPHA(rec.ccdata.status,
                           rec.ccdata$genos[,which(rec.ccdata.keep==1)])

print(buRden.calpha)
print(AssotesteR.calpha)

The results (on my new iMac) showed a ~16x speed difference:

> print(system.time( replicate(100,cAlpha(rec.ccdata$genos[,which(rec.ccdata.keep==1)],sample(rec.ccdata.status),simplecounts = TRUE))) )
   user  system elapsed 
  1.663   0.018   1.681 

> print(system.time( replicate(100,CALPHA(sample(rec.ccdata.status),rec.ccdata$genos[,which(rec.ccdata.keep==1)],))) )
   user  system elapsed 
 26.008   1.823  27.844 

And the two packages give the same result for the test statistic. (The boolean simplecounts passed to my function makes the two packages do the calculation the same way. I'd argue that AssotesteR's calculation is incorrect, as it does not discriminate hets from homozygotes):

> print(buRden.calpha)
[1] 41771.5
> print(AssotesteR.calpha)

 	 CALPHA: c-alpha Test 

Info: 
   cases  controls  variants   n.perms  
    3000      3000       314         0  

    calpha.stat   calpha.stat.var   
        41771.5         1178265.8
Clone this wiki locally