Switch to median instead of mean #242
Comments
|
Which functions did you have in mind for this? |
|
Sorry, I've never written an issue before; this is the avgdist.Rd function. |
|
This seems trivial to add. @Microbiology --- as you committed this function, do you have any input on whether this is a useful feature to add (& also #243)? (I'm not expecting you to implement it of course!) |
|
The |
|
Replacing the line findist <- Reduce("+", distlist)/length(distlist) with findist <- apply(do.call(rbind, distlist), 2, meanfun, ...) and having new argument |
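The replacement described above can be sketched in base R. This is only an illustration under stated assumptions: `distlist` here is a hypothetical stand-in built from random matrices with `dist()`, not the output of the actual subsampling inside avgdist, and the variable names other than `distlist` are made up for the example.

```r
# Hypothetical stand-in for the per-iteration dissimilarities:
# 5 iterations, each a "dist" object over 4 sites (6 pairwise values).
set.seed(1)
distlist <- lapply(1:5, function(i) dist(matrix(runif(12), nrow = 4)))

# Original approach: element-wise arithmetic mean over the list.
mean_old <- Reduce("+", distlist) / length(distlist)

# Generalised approach: stack iterations as rows, then summarise each
# column (one column per site pair) with an arbitrary function.
stacked  <- do.call(rbind, distlist)   # 5 x 6 matrix
mean_new <- apply(stacked, 2, mean)
med_new  <- apply(stacked, 2, median)

# With mean as the summary function, both approaches agree.
all.equal(as.vector(mean_old), mean_new)
```

Note that `apply()` returns a plain numeric vector in this sketch, so the "dist" attributes of the original objects would have to be restored separately.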
|
Closed with PR #244 |
|
Thank you for adding this feature to avgdist.R, Microbiology! However, when I ran the script I got the same results regardless of using the mean or median options. Is this implemented correctly? Also, can you please put in the documentation that Bray-Curtis is the default dissimilarity index? Thanks!
avgdist_7kmean <- avgdist(otu, 7000, meanfun = mean, transf = sqrt) |
|
Thanks for pointing this out @sydneyg ! :) I think you need to reopen the issue since you are the original poster. I'll look into this and follow up. |
|
Hey @sydneyg, so I was looking into the problem and things seem to be working as far as I can tell, but I may be missing something. I can switch the default to median and it still runs as expected, without any mention of mean, so I don't see how it would be doing the mean in that case. Thinking about it though, I would not expect the results to be massively different, since we are taking the mean and median of the random iterations. Could you provide a small test set where you would expect to see very different results between the mean and median of the random iterations? Or does it make sense that they would be very close results? Thanks again for your helpful feedback. :) |
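The point about mean and median being close can be illustrated with a small simulation. This is a rough sketch with made-up numbers, not data from avgdist: `symmetric` and `skewed` are hypothetical per-iteration values for a single pairwise dissimilarity, chosen only to contrast a roughly symmetric spread with a right-skewed one.

```r
set.seed(42)

# Hypothetical values of one pairwise dissimilarity across 1000 iterations.
symmetric <- 0.5 + rnorm(1000, sd = 0.04)    # roughly symmetric noise
skewed    <- 0.5 + rexp(1000, rate = 25)     # right-skewed noise

# Symmetric spread: mean and median nearly coincide.
c(mean = mean(symmetric), median = median(symmetric))

# Skewed spread: the mean is pulled towards the long tail.
c(mean = mean(skewed), median = median(skewed))
```

If the values across random iterations are roughly symmetric, switching between mean and median would be expected to change the result only slightly.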
|
@sydneyg @Microbiology : I had a quick look at the function, and in my inspection
> set.seed(4711); meand <- avgdist(BCI, 100)
> set.seed(4711); medid <- avgdist(BCI, 100, meanfun=median)
> range(meand-medid)
[1] -0.0114 0.0141
You must use This experiment shows that results change when you change mean function. My analysis of the code shows that the change happens when you switch to |
|
One way of inspecting the issue is to check the variability of simulations. You can do this by abusing the function like this:
> set.seed(4711); sdd <- avgdist(BCI, 100, meanfun=sd)
> summary(sdd)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.03319 0.04048 0.04266 0.04271 0.04478 0.05539
So you ask for
@Microbiology : I just wonder if some measure of variability would be useful diagnostic here. |
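The variability check can also be sketched outside the function, assuming the per-iteration dissimilarities are available as a list. Everything below is hypothetical: `distlist` is simulated with `dist()` on random matrices, and `pair_sd` and `pair_mean` are illustrative names, not part of avgdist.

```r
set.seed(4711)

# Hypothetical stand-in for 100 iterations over 8 sites (28 site pairs).
distlist <- lapply(1:100, function(i) dist(matrix(runif(40), nrow = 8)))
stacked  <- do.call(rbind, distlist)    # 100 x 28 matrix

# Spread and centre of each pairwise value across iterations.
pair_sd   <- apply(stacked, 2, sd)
pair_mean <- apply(stacked, 2, mean)

summary(pair_sd)

# Relative spread: large values flag pairs whose averaged
# dissimilarity is poorly determined by the iterations.
summary(pair_sd / pair_mean)
```

A summary like this gives a quick sense of how much the individual iterations disagree before they are collapsed to a single mean or median.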
|
Thanks @jarioksa, it sounds like we are on the same page. :) I like your idea of using |
@Microbiology Hi, can you please add an option to take the median of the subsampled OTU tables instead of the mean?