Bootstrapping #420

Open
jseabold opened this Issue · 6 comments

5 participants

@jseabold
Owner

See about incorporating this code into a bootstrapping framework. It's BSD licensed.

https://bitbucket.org/cevans/bootstrap/src

[Edit: Updated repo https://github.com/cgevans/scikits-bootstrap/]

@rgommers
Collaborator

If you plan to work on this now, you may want to wait a few days. Someone just contacted me saying he has an updated version of that code that could be included in scipy. I also mentioned statsmodels and asked him to bring it up on the mailing list.

@jseabold
Owner

Sure. I likely won't be working on this anytime soon. There's plenty else to do right now.

@cgevans

As the author of the first package, I'd be happy to help, but I'm not really sure how the code or bootstrapping in general would fit into statsmodels. All I've been using the code for has been to get confidence limits for a function applied to independent data.
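For readers unfamiliar with the use case described above, here is a minimal sketch of percentile bootstrap confidence limits for a function applied to independent data. It is not the code from the linked package; the function name `percentile_ci` and all of its parameters are hypothetical, and only the simple percentile method is shown.

```python
import numpy as np

def percentile_ci(data, statfunc=np.mean, alpha=0.05, n_samples=2000, seed=None):
    """Percentile bootstrap confidence limits for statfunc(data).

    Hypothetical helper for illustration; not the API of any package
    mentioned in this thread.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    # Each row of idx selects one bootstrap sample (drawn with replacement).
    idx = rng.integers(0, n, size=(n_samples, n))
    stats = np.array([statfunc(data[row]) for row in idx])
    # The alpha/2 and 1 - alpha/2 quantiles are the confidence limits.
    low, high = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return low, high

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=200)
low, high = percentile_ci(sample, np.mean, seed=1)
print(low, high)
```

Any function that maps a 1-D sample to a scalar (median, standard deviation, and so on) can be passed as `statfunc`.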

@gcalmettes

Like @cgevans, I would be happy to help incorporate the code into statsmodels. In addition to using the code to calculate confidence limits of a statistic, or effect sizes and their confidence limits when comparing two datasets, I also use it to calculate confidence bands for linear and non-linear regressions (by bootstrapping the residuals).
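As an illustration of the residual bootstrap mentioned above, the following sketch computes a pointwise confidence band for a polynomial fit. It is an assumption about what such a procedure could look like, not the commenter's code; `residual_bootstrap_band` and its parameters are hypothetical names, and the fit uses `np.polyfit` purely for brevity.

```python
import numpy as np

def residual_bootstrap_band(x, y, x_grid, degree=1, alpha=0.05,
                            n_samples=2000, seed=None):
    """Pointwise percentile band for a polynomial fit (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    coefs = np.polyfit(x, y, degree)
    fitted = np.polyval(coefs, x)
    resid = y - fitted
    curves = np.empty((n_samples, len(x_grid)))
    for b in range(n_samples):
        # Build a synthetic response: fitted values plus residuals
        # resampled with replacement, then refit and evaluate on the grid.
        y_star = fitted + rng.choice(resid, size=resid.size, replace=True)
        curves[b] = np.polyval(np.polyfit(x, y_star, degree), x_grid)
    lower, upper = np.quantile(curves, [alpha / 2, 1 - alpha / 2], axis=0)
    return lower, upper

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.5 + 0.8 * x + rng.normal(scale=1.0, size=x.size)
x_grid = np.linspace(0.0, 10.0, 25)
lower, upper = residual_bootstrap_band(x, y, x_grid, seed=1)
center = np.polyval(np.polyfit(x, y, 1), x_grid)  # original fit on the grid
```

Resampling residuals keeps the design points fixed, which suits designed experiments; it assumes the residuals are exchangeable (for example, homoscedastic errors).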

@josef-pkt
Owner

I would appreciate it if someone worked more systematically on getting bootstrap into statsmodels. (Our current status is bits and pieces and examples.)

Bootstrap (or resampling in general, including permutation tests) is a huge topic, and both packages could be included fairly directly to provide bootstrapping for univariate statistics and for one- or two-sample tests.
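To make the two-sample case concrete, here is a minimal sketch of a permutation test for a difference in means. It is illustrative only and not taken from either package; `permutation_test_mean_diff` is a hypothetical name, and the add-one p-value correction is one common convention rather than something stated in this thread.

```python
import numpy as np

def permutation_test_mean_diff(x, y, n_perm=5000, seed=None):
    """Two-sided permutation test for mean(x) - mean(y) (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled data and split it at the original sizes.
        perm = rng.permutation(pooled)
        diff = perm[:x.size].mean() - perm[x.size:].mean()
        if abs(diff) >= abs(observed):
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    pvalue = (count + 1) / (n_perm + 1)
    return observed, pvalue

rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=50)
group_b = rng.normal(loc=2.0, scale=1.0, size=50)
observed, pvalue = permutation_test_mean_diff(group_a, group_b, seed=1)
print(observed, pvalue)
```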

I don't yet see how either package can be applied to multivariate statistics or used in support of the (regression) models. Vectorizing (handling a statistic that is 1-D with more than one element) wouldn't be very difficult, but I don't think I have seen much on BCa or ABC confidence intervals in the multivariate case.
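A sketch of what the vectorized case could look like: the statistic returns a 1-D array, whole rows are resampled, and percentile limits are taken per element. This covers only percentile intervals, not BCa or ABC, and the names `bootstrap_vector_stat` and `ols_coefs` are hypothetical.

```python
import numpy as np

def bootstrap_vector_stat(data, statfunc, n_samples=2000, alpha=0.05, seed=None):
    """Per-element percentile limits for a vector-valued statistic (hypothetical)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    n_elements = np.atleast_1d(statfunc(data)).size
    stats = np.empty((n_samples, n_elements))
    for b in range(n_samples):
        rows = rng.integers(0, n, size=n)  # resample whole rows (cases)
        stats[b] = np.atleast_1d(statfunc(data[rows]))
    # Row 0 holds the lower limits, row 1 the upper limits.
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2], axis=0)

def ols_coefs(xy):
    """Intercept and slope of a least-squares line; columns are (x, y)."""
    X = np.column_stack([np.ones(len(xy)), xy[:, 0]])
    beta, *_ = np.linalg.lstsq(X, xy[:, 1], rcond=None)
    return beta

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=100)
xy = np.column_stack([x, y])
limits = bootstrap_vector_stat(xy, ols_coefs, seed=1)
point = ols_coefs(xy)
```

Note that these are marginal intervals for each element; they say nothing about a joint confidence region.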

Based on this, we can work on extensions (or create separate code for other use cases).

One comment:
https://github.com/cgevans/scikits-bootstrap/blob/master/scikits/bootstrap/bootstrap.py#L181
Outsourcing this is good, but I think an iterator would be more memory-efficient.
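The linked line is not reproduced in this thread. Assuming it refers to a helper that builds every resampling index array at once, the iterator idea could be sketched with a generator that yields one index array at a time, so memory use stays proportional to a single sample rather than to `n_samples * n_obs`. The name `bootstrap_index_iter` is hypothetical.

```python
import numpy as np

def bootstrap_index_iter(n_obs, n_samples=10000, seed=None):
    """Yield one array of resampling indices per bootstrap replicate.

    Hypothetical sketch; not the helper from the linked package.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_samples):
        yield rng.integers(0, n_obs, size=n_obs)

data = np.arange(100.0)
# Consume the generator lazily; only one index array exists at a time.
means = np.fromiter(
    (data[idx].mean() for idx in bootstrap_index_iter(data.size, 1000, seed=0)),
    dtype=float,
    count=1000,
)
print(means.mean())
```

The trade-off is that the statistic is applied in a Python loop instead of on one large index matrix, which can be slower for cheap statistics.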

About unit tests: Constantine has already started. It isn't clear to me how to verify some of the code. Much of it is straightforward, and regression tests should be enough, but more complicated methods like BCa and ABC could hide bugs.
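One possible complement to regression tests, offered here as a suggestion rather than something proposed in the thread, is a Monte Carlo coverage check: simulate many datasets with a known parameter and count how often the interval contains it. The sketch below does this for a plain percentile interval of the mean; the function names are hypothetical, and the same harness could wrap a BCa or ABC implementation.

```python
import numpy as np

def percentile_ci_mean(sample, rng, n_boot=500, alpha=0.05):
    """Percentile bootstrap limits for the mean (hypothetical helper)."""
    n = sample.size
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = sample[idx].mean(axis=1)
    return np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])

def estimate_coverage(n_reps=300, n_obs=30, true_mean=0.0, seed=0):
    """Fraction of simulated datasets whose interval contains true_mean."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_reps):
        sample = rng.normal(loc=true_mean, scale=1.0, size=n_obs)
        low, high = percentile_ci_mean(sample, rng)
        hits += int(low <= true_mean <= high)
    return hits / n_reps

coverage = estimate_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

A test built this way needs a tolerance wide enough to absorb simulation noise, so it catches gross errors rather than subtle ones.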

My plan was to have pure bootstrap code (generic code and utilities, along with the code for Monte Carlo and Permutation tests) in statsmodels.resampling and application code in the corresponding directories. statsmodels.resampling is currently empty and the existing code is still spread out.

I'm busy with other things, but would be glad to review any pull requests.
