Adds a method to the Poisson results that predicts the probability of a given count. Compare Stata's predict varname, n(#) and predprob from R's pscl. Needs tests, but it works.
my thought was to return the distribution instance itself, which makes all stats.distribution methods directly available
(broadcasting would require users to use x[:,None] or x[..., None] if mean mu/lambda is 1d.)
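A minimal sketch of the frozen-distribution idea, assuming only numpy and scipy. The means in `mu` are made-up values standing in for the model's predicted means, and nothing here is the PR's actual code. It shows the `x[:, None]` broadcasting mentioned above when `mu` is 1d.

```python
import numpy as np
from scipy import stats

# Hypothetical predicted means, one per observation (not taken from the PR).
mu = np.array([0.5, 1.2, 3.0])

# Frozen distribution instance: all scipy.stats distribution methods
# (pmf, cdf, rvs, interval, ...) are available on it.
dist = stats.poisson(mu)

# Counts to evaluate. The extra axis from [:, None] makes the (k, 1) counts
# broadcast against the (n,) means.
counts = np.arange(0, 30)
probs = dist.pmf(counts[:, None])  # shape (k, n): one column per observation
```

Each column of `probs` is the probability mass function of one observation over the chosen counts.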
predict_prob might be a bit misleading: the name would normally refer to the estimate for the sample (as in binomial models and sklearn). Maybe it's not too misleading, though.
the default 0 to max(endog) could be a huge array. (dangerous as a default)
Mainly just did it because it's the default in R too.
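To make the size concern concrete, here is a rough sketch. The helper `predict_count_probs` and its `n` argument are hypothetical names for illustration, not the method added in this PR. With the default, the result has one row per observation and `max(endog) + 1` columns.

```python
import numpy as np
from scipy import stats


def predict_count_probs(mu, endog, n=None):
    """Hypothetical helper: P(Y = count) for each observation and count."""
    mu = np.asarray(mu, dtype=float)
    if n is None:
        # Default discussed above: every count from 0 to max(endog).
        counts = np.arange(0, int(np.max(endog)) + 1)
    else:
        counts = np.atleast_1d(n)
    # (nobs, 1) means broadcast against (k,) counts -> (nobs, k)
    return stats.poisson.pmf(counts, mu[:, None])


mu = np.array([0.5, 1.2, 3.0])   # made-up predicted means
endog = np.array([0, 2, 7])      # made-up observed counts

full = predict_count_probs(mu, endog)       # shape (3, 8)
zeros = predict_count_probs(mu, endog, n=0) # shape (3, 1)

# Memory needed by the default for a larger (made-up) problem, in MB:
nobs, max_count = 100_000, 5_000
approx_mb = nobs * (max_count + 1) * 8 / 1e6  # about 4 GB of float64
```

A single large outlier in `endog` is enough to blow up the default array, which is the danger raised above.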
What's 1064 supposed to be?
1064 is autocompletion in github
#106 written for GLM but applies to other models
one method I'd like to have available is rvs to simulate the process (besides examples also parametric bootstrap and Monte Carlo).
other nice ones: cdf and interval
some things like var might be more interesting for other distributions, but might not work vectorized until the next scipy release.
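A sketch of what those methods would give a user, again on a plain scipy frozen distribution with made-up means. Whether a results method would return exactly this object is an assumption.

```python
import numpy as np
from scipy import stats

mu = np.array([0.5, 1.2, 3.0])  # hypothetical predicted means
dist = stats.poisson(mu)

# rvs: simulate 1000 replicate samples, one column per observation,
# e.g. as the basis for a parametric bootstrap or Monte Carlo study.
sims = dist.rvs(size=(1000, mu.size), random_state=12345)

# cdf: P(Y <= 2) for each observation.
p_le_2 = dist.cdf(2)

# interval: central 95% interval for each observation.
lower, upper = dist.interval(0.95)
```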
I don't see much reason not to merge this. But I also don't see any use case for it (especially compared to predict_dist).
It's used in the Vuong test for zero-inflated vs. Poisson.
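For context, a rough sketch of how per-observation predicted probabilities enter a Vuong-type statistic. The data are simulated, and the zero-inflated model uses the true simulation parameters instead of fitted ones, so this only illustrates the formula and is not the test as implemented in pscl or statsmodels.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, lam, pi = 2000, 2.0, 0.3  # illustrative values, not from the PR

# Simulate zero-inflated Poisson data: a structural zero with probability pi.
structural_zero = rng.random(n) < pi
y = np.where(structural_zero, 0, rng.poisson(lam, size=n))

# Probability of each observed count under the two models.
p_pois = stats.poisson.pmf(y, y.mean())  # plain Poisson with its MLE mean
p_zip = (1 - pi) * stats.poisson.pmf(y, lam) + pi * (y == 0)

# Vuong statistic: scaled mean of the pointwise log-likelihood ratios.
m = np.log(p_zip) - np.log(p_pois)
vuong = np.sqrt(n) * m.mean() / m.std(ddof=1)
```

Large positive values favor the zero-inflated model, large negative values favor the plain Poisson.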
ENH: add predict_prob to poisson
Rebased. Added a test. We can get rid of it / change it before 0.6.0 release if there's a better alternative forthcoming.
Hmm, this installs correctly and passes for me locally. Not sure why travis fails.
This PR doesn't include an .npy file. And I think it needs to be added to MANIFEST.in
What's in the npy file? I don't really like to use "proprietary" formats in case there are format changes.
Ha, right. Need to add it. It's just some test data. It's smaller as a binary file than a csv.
Coverage remained the same when pulling ee90385 on jseabold:poisson-count-prob into 84e7607 on statsmodels:master.
The only time we had a .npz file that I can find now (in vector_ar), I needed to convert it to a python module, because we ran into some problems across numpy or python versions, AFAIR.
I'll switch it, but my understanding is that the .npy is architecture and python independent. That's the point of it. I don't see any issues on the numpy tracker about this.
The only explanation for the .npz conversion (*) I can find is https://groups.google.com/d/msg/pystatsmodels/JIp54_XZ66w/OxUf8tCQAJUJ
According to that thread, the problem in the Python 3.2 conversion was with .npz files, not .npy files.
Using only generic formats removes some possible headaches later on.
(But because I didn't think about it, my first test data file in GMM is a .dta not a csv file.)
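If the switch to a generic format goes ahead, a conversion along these lines should be enough. The array and file names are placeholders, not the actual test data from this PR.

```python
import os
import tempfile

import numpy as np

# Stand-in for the test data; the real file contents are not shown here.
data = np.arange(12, dtype=float).reshape(4, 3) / 7

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "probs.npy")  # placeholder file name
    csv_path = os.path.join(tmp, "probs.csv")  # placeholder file name

    np.save(npy_path, data)

    # Convert the binary file to csv; 17 significant digits keep float64
    # values intact through the text round trip.
    np.savetxt(csv_path, np.load(npy_path), delimiter=",", fmt="%.17g")

    roundtrip = np.loadtxt(csv_path, delimiter=",")
    sizes = {
        "npy": os.path.getsize(npy_path),
        "csv": os.path.getsize(csv_path),
    }
```

Comparing the entries of `sizes` shows the file-size trade-off mentioned above for a given data set.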
TST: Test predict probs vs. R pscl package
Coverage remained the same when pulling 9f6e2a0 on jseabold:poisson-count-prob into 84e7607 on statsmodels:master.