# Speed of mean.difference and test statistics in general #6

Open
opened this Issue Jul 12, 2011 · 0 comments

So, I've discovered that `mean.difference` is much slower than `mann.whitney.u`:

```
> system.time(replicate(1000,mean.difference(R,Z,B)))
   user  system elapsed
  2.909   0.013   2.921
> system.time(replicate(1000,mann.whitney.u(R,Z,B)))
   user  system elapsed
  0.073   0.001   0.074
```

Part of the gap comes from the preprocessing of the data for blocks:

```
> system.time(replicate(1000,paired.sgnrank.sum(R,Z,B)))
   user  system elapsed
  1.484   0.004   1.489
```

But that does not account for all of the difference. Here are a couple of ideas:

```
mean.diff.lsfit<-function(ys,z,blocks){
  ## Try using something that calls compiled code.
  ## Gives the same answer as mean.difference for balanced blocks and
  ## should be like harmonic.mean.difference for unbalanced blocks.
  lsfit(x=model.matrix(ys~z+blocks),y=ys,intercept=FALSE)[["coefficients"]][["z"]]
}

> system.time(replicate(1000,mean.diff.lsfit(R,Z,B)))
   user  system elapsed
  1.793   0.004   1.797

mean.diff.vect<-function(ys,z,blocks){
  X<-model.matrix(ys~z+blocks)
  solve(qr(X, LAPACK=TRUE), ys)[2]
  ## qr.coef(qr(X,LAPACK=TRUE),ys) ## to handle near singular X
}

> system.time(replicate(1000,mean.diff.vect(R,Z,B)))
   user  system elapsed
  1.741   0.001   1.742
```

I suspect that as long as we allow `blocks` to be a factor and use `model.matrix`, we may not get much more speed. Any ideas are welcome, of course, since this is the function we call most often.
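One possible direction, sketched here without any timing claims, is to skip `model.matrix` altogether. By the Frisch-Waugh-Lovell result, the coefficient on `z` in a regression of `ys` on `z` plus block indicators equals the slope from regressing block-demeaned `ys` on block-demeaned `z`. The function name `mean.diff.fwl` is hypothetical and not part of the package; the sketch assumes `z` is a 0/1 (or logical) treatment indicator and that at least one block contains both treated and control units. The toy data at the bottom is simulated purely to check agreement with the `lsfit` approach.

```r
## Hypothetical helper (not in the package): block fixed-effects coefficient
## on z via within-block demeaning, avoiding model.matrix().
mean.diff.fwl <- function(ys, z, blocks) {
  z <- as.numeric(z)                  # rowsum() requires numeric input
  b <- as.integer(factor(blocks))     # block codes 1..K, unused levels dropped
  n <- tabulate(b)                    # number of units in each block
  ybar <- rowsum(ys, b)[, 1] / n      # block means of the outcome
  zbar <- rowsum(z, b)[, 1] / n       # block proportions treated
  yc <- ys - ybar[b]                  # outcome demeaned within block
  zc <- z - zbar[b]                   # treatment demeaned within block
  sum(zc * yc) / sum(zc^2)            # slope of demeaned ys on demeaned z
}

## Simulated toy data (made up for this check only): unbalanced blocks.
set.seed(1)
B.toy <- factor(rep(1:10, times = rep(c(4, 6), 5)))
Z.toy <- unlist(lapply(rep(c(4, 6), 5), function(k) sample(rep(0:1, c(k - 2, 2)))))
R.toy <- rnorm(length(Z.toy)) + Z.toy

## Reference value from the lsfit-based version.
ref <- lsfit(x = model.matrix(R.toy ~ Z.toy + B.toy), y = R.toy,
             intercept = FALSE)[["coefficients"]][["Z.toy"]]
stopifnot(isTRUE(all.equal(mean.diff.fwl(R.toy, Z.toy, B.toy), ref)))
```

Whether this is actually faster would need checking with the same `system.time(replicate(1000, ...))` harness on `R`, `Z`, and `B`; the factor conversion could also be hoisted out of the function if `blocks` never changes between calls.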