See about incorporating this code into a bootstrapping framework. It's BSD licensed.
[Edit: Updated repo https://github.com/cgevans/scikits-bootstrap/]
If you plan to work on this now: you may want to wait a few days. Someone just contacted me saying he had an updated version of that code that could be included in scipy. I also mentioned statsmodels and asked him to bring it up on the mailing list.
Sure. I likely won't be working on this anytime soon. There's plenty else to do right now.
There's another BSD library here: http://gcalmettes.github.com/bootstrap-tools/
(From the comments on this blog post: http://www.randalolson.com/2012/08/06/statistical-analysis-made-easy-in-python/)
As the author of the first package, I'd be happy to help, but I'm not really sure how the code or bootstrapping in general would fit into statsmodels. All I've been using the code for has been to get confidence limits for a function applied to independent data.
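For readers following along, the use case described above (confidence limits for a function applied to independent data) can be sketched roughly as follows. This is a minimal percentile-bootstrap illustration with NumPy, not the code from either package; the function name `percentile_ci` and its parameters are made up for this example.

```python
import numpy as np

def percentile_ci(data, statfunc=np.mean, alpha=0.05, n_boot=2000, seed=0):
    """Percentile bootstrap confidence interval for statfunc(data).

    Hypothetical helper for illustration only; assumes the observations
    in `data` are independent.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    # One row of resampled indices (drawn with replacement) per replicate.
    idx = rng.integers(0, n, size=(n_boot, n))
    stats = np.array([statfunc(data[i]) for i in idx])
    # The interval endpoints are empirical quantiles of the replicates.
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

sample = np.random.default_rng(1).normal(loc=5.0, scale=2.0, size=200)
low, high = percentile_ci(sample)
print(low, high)
```

The percentile interval is the simplest variant; the BCa and ABC intervals mentioned elsewhere in this thread adjust these quantiles and are not shown here.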
Like @cgevans, I would be happy to help incorporate the code into statsmodels. In addition to using the code for calculating confidence limits of a statistic or calculating effect size and associated confidence limits when comparing two datasets, I also use it to calculate confidence bands of linear/non-linear regressions (bootstrapping of the residuals).
I would appreciate it if someone would work more systematically on getting bootstrap into statsmodels. (Our current status is bits and pieces and examples.)
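As a rough illustration of what bootstrapping the residuals for a regression confidence band can look like, here is a sketch for a straight-line fit using only NumPy. It is not taken from bootstrap-tools; the function name `residual_bootstrap_band` and all parameters are assumptions made for this example, and it presumes the residuals are exchangeable (roughly homoscedastic errors).

```python
import numpy as np

def residual_bootstrap_band(x, y, x_grid, alpha=0.05, n_boot=1000, seed=0):
    """Pointwise confidence band for a linear fit via residual resampling.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_grid = np.asarray(x_grid, dtype=float)
    X = np.column_stack([np.ones_like(x), x])             # design matrix
    Xg = np.column_stack([np.ones_like(x_grid), x_grid])  # evaluation grid
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    curves = np.empty((n_boot, len(x_grid)))
    for b in range(n_boot):
        # Keep x fixed, rebuild y from the fit plus resampled residuals.
        y_star = fitted + rng.choice(resid, size=len(resid), replace=True)
        beta_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
        curves[b] = Xg @ beta_star
    lower, upper = np.percentile(
        curves, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0
    )
    return Xg @ beta, lower, upper

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)
grid = np.linspace(0, 10, 5)
fit, lower, upper = residual_bootstrap_band(x, y, grid)
```

For a non-linear model the same loop applies with the least-squares call swapped for the appropriate fitting routine.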
Bootstrap (or resampling in general including permutation tests) is a huge topic, and both packages could be pretty directly included with bootstrapping for univariate statistics and for one or two sample tests.
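To make the two-sample case concrete, here is a minimal sketch of a permutation test for a difference in means. It is only an illustration written with NumPy, not code from either package; the name `permutation_test_mean_diff` and its arguments are hypothetical.

```python
import numpy as np

def permutation_test_mean_diff(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Hypothetical helper for illustration only. Returns the observed
    difference and a Monte Carlo p-value.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels are exchangeable.
        perm = rng.permutation(pooled)
        diff = perm[:len(a)].mean() - perm[len(a):].mean()
        if abs(diff) >= abs(observed):
            count += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (count + 1) / (n_perm + 1)

gen = np.random.default_rng(3)
group_a = gen.normal(loc=0.0, size=30)
group_b = gen.normal(loc=2.0, size=30)
obs, pval = permutation_test_mean_diff(group_a, group_b)
```

Any other two-sample statistic could replace the difference in means, as long as it is recomputed on each relabelled split.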
I don't yet see how either package can be applied to multivariate statistics or used in support of the (regression) models. Vectorizing (the statistic is 1d with more than one element) wouldn't be very difficult, but I don't think I have seen much for BCa or ABC confidence intervals in the multivariate case.
Based on this we can work on extensions (or create separate code for other use cases).
Outsourcing this is good, but I think an iterator would be more efficient (in terms of memory).
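One possible reading of the iterator suggestion, assuming it refers to how the resampling indices are produced: instead of building an `(n_boot, n)` index array up front, a generator can yield one index vector at a time, so only a single replicate is held in memory. The name `bootstrap_indices` is hypothetical and this is a sketch, not the code under discussion.

```python
import numpy as np

def bootstrap_indices(n, n_boot, seed=0):
    """Lazily yield n_boot index arrays of length n, drawn with replacement.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_boot):
        yield rng.integers(0, n, size=n)

data = np.arange(10.0)
# Each index array is created, used, and discarded in turn.
means = [data[idx].mean() for idx in bootstrap_indices(len(data), 500)]
```

The trade-off is that a Python-level loop over replicates is slower than one vectorized call, so the generator mainly pays off when `n * n_boot` is too large to keep in memory.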
About unit tests: Constantine has already started. It isn't clear to me how to verify some of the code. Much of it is straightforward and regression tests should be enough, but more complicated methods like BCa and ABC could hide bugs.
My plan was to have pure bootstrap code (generic code and utilities, along with the code for Monte Carlo and Permutation tests) in statsmodels.resampling and application code in the corresponding directories. statsmodels.resampling is currently empty and the existing code is still spread out.
I'm busy with other things, but would be glad to review any pull requests.