
ENH: add predict_prob to poisson #1088

Merged
merged 2 commits into from Oct 23, 2013

3 participants
@jseabold
Member

commented Sep 19, 2013

Adds a method to the Poisson results that predicts the probability of a given count. Compare to Stata's predict varname, n(#) and predprob from R's pscl. Needs tests but works.
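
To make the idea concrete, here is a minimal sketch of what predicting count probabilities involves, assuming a vector of fitted Poisson means. The helper name `poisson_count_probs` and the values in `mu_hat` are hypothetical and are not taken from this PR; the method added here may differ in signature and output layout.

```python
import numpy as np

def poisson_count_probs(mu, counts):
    """Return P(Y = k) for every mean in `mu` and every k in `counts`.

    Hypothetical helper, not the statsmodels method: rows index
    observations, columns index counts.
    """
    mu = np.asarray(mu, dtype=float)[:, None]    # shape (nobs, 1)
    counts = np.asarray(counts)[None, :]         # shape (1, ncounts), integers
    kmax = int(counts.max())
    # log(k!) for k = 0..kmax, built from a cumulative sum of logs
    log_fact = np.concatenate(
        ([0.0], np.cumsum(np.log(np.arange(1, kmax + 1))))
    )
    log_pmf = counts * np.log(mu) - mu - log_fact[counts]
    return np.exp(log_pmf)

mu_hat = np.array([0.5, 2.0, 4.0])               # made-up fitted means
probs = poisson_count_probs(mu_hat, np.arange(0, 6))
print(probs.shape)                               # (3, 6)
```

Each row holds the probabilities of the counts 0 through 5 for one observation, so a row sums to slightly less than one because larger counts are cut off.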

@josef-pkt

Member

commented Sep 19, 2013

my thought was to return the distribution instance itself, which makes all stats.distribution methods directly available
(broadcasting would require users to use x[:,None] or x[..., None] if mean mu/lambda is 1d.)

see #1064
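
A rough sketch of the distribution-instance idea, assuming SciPy's frozen `stats.poisson` stands in for whatever a model method would return. The means in `mu_hat` are made up; the point is only the broadcasting with `x[:, None]` when the mean is 1-D.

```python
import numpy as np
from scipy import stats

mu_hat = np.array([0.5, 2.0, 4.0])     # made-up 1-D fitted means
dist = stats.poisson(mu_hat)           # frozen distribution, one mean per observation

x = np.arange(0, 6)                    # counts to evaluate
# x[:, None] has shape (6, 1) and broadcasts against the (3,) means
pmf = dist.pmf(x[:, None])             # shape (6, 3): counts by observations
print(pmf.shape)
```

Without the extra axis, `dist.pmf(x)` would try to pair a length-6 array with a length-3 mean and fail to broadcast.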

predict_prob might be a bit misleading: in binomial models and in sklearn the name refers to the estimated probabilities for the sample (maybe not too misleading, though).
The default range of 0 to max(endog) could produce a huge array, which is dangerous as a default.
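
A back-of-the-envelope sketch of the size concern, assuming the result is a float64 array with one row per observation and one column per count from 0 to max(endog). That layout and the sample sizes are assumptions for illustration only.

```python
def default_prob_array_bytes(nobs, max_count, itemsize=8):
    """Bytes needed for an (nobs, max_count + 1) array of float64 values."""
    return nobs * (max_count + 1) * itemsize

# Made-up example: 100,000 observations with one outlier count of 5,000
size = default_prob_array_bytes(nobs=100_000, max_count=5_000)
print(f"{size / 1e9:.1f} GB")   # 4.0 GB
```

A single large outlier in endog is enough to drive the column count, which is why an explicit count range may be the safer choice.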

@jseabold

Member Author

commented Sep 19, 2013

Mainly just did it because it's the default in R too.

What's 1064 supposed to be?

@josef-pkt

Member

commented Sep 19, 2013

1064 was GitHub's autocompletion.
I meant #106, which was written for GLM but applies to other models.

one method I'd like to have available is rvs to simulate the process (besides examples also parametric bootstrap and Monte Carlo).
other nice ones: cdf and interval

some things like var might be more interesting for other distributions, but might not work vectorized until the next scipy release.
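
As an illustration of the methods mentioned above, the following sketch uses SciPy's frozen `stats.poisson` with made-up means. It is an assumption about how a returned distribution instance could be used, not code from this PR.

```python
import numpy as np
from scipy import stats

mu_hat = np.array([0.5, 2.0, 4.0])     # made-up fitted means
dist = stats.poisson(mu_hat)

# rvs: simulate 2000 replicate samples, one column per observation,
# e.g. as the resampling step of a parametric bootstrap
sims = dist.rvs(size=(2000, mu_hat.size), random_state=12345)

# cdf and interval come from the same frozen distribution
p_at_most_2 = dist.cdf(2)              # P(Y <= 2) for each observation
low, high = dist.interval(0.95)        # central 95% interval per observation
```

Each row of `sims` is one simulated data set that could be refit to obtain a bootstrap distribution of the parameter estimates.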

@josef-pkt

Member

commented Oct 23, 2013

I don't see much reason not to merge this. But I also don't see any use case for it (especially compared to predict_dist).

@jseabold

Member Author

commented Oct 23, 2013

It's used in the Vuong test for zero-inflated vs. Poisson models.
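
For context, a minimal sketch of the Vuong statistic, assuming each model supplies the predicted probability of the observed count for every observation. The function name and the probability values are hypothetical; how statsmodels wires this up is not shown in this thread.

```python
import numpy as np

def vuong_statistic(prob_model1, prob_model2):
    """Vuong statistic from per-observation probabilities of the observed counts."""
    # Pointwise log-likelihood ratios between the two models
    m = np.log(prob_model1) - np.log(prob_model2)
    return np.sqrt(m.size) * m.mean() / m.std(ddof=1)

# Made-up probabilities of the observed counts under two fitted models
p_zip = np.array([0.40, 0.22, 0.15, 0.35, 0.10])
p_poisson = np.array([0.30, 0.25, 0.12, 0.28, 0.11])
v = vuong_statistic(p_zip, p_poisson)
```

Under the null that the two models fit equally well, the statistic is asymptotically standard normal, so large positive values favor the first model and large negative values the second.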

@jseabold

Member Author

commented Oct 23, 2013

Rebased. Added a test. We can get rid of it / change it before 0.6.0 release if there's a better alternative forthcoming.

@jseabold

Member Author

commented Oct 23, 2013

Hmm, this installs correctly and passes for me locally. Not sure why Travis fails.

@josef-pkt

Member

commented Oct 23, 2013

This PR doesn't include an .npy file. And I think it needs to be added to MANIFEST.in

What's in the npy file? I don't really like to use "proprietary" formats in case there are format changes.

@jseabold

Member Author

commented Oct 23, 2013

Ha, right. Need to add it. It's just some test data. It's smaller as a binary file than a csv.
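
A small sketch of the size comparison, assuming float64 test data written with NumPy's default text format. The array and file names are made up, and the gap shrinks if the CSV is written with fewer digits.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 5))      # made-up stand-in for the test data

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "results.npy")   # hypothetical file names
    csv_path = os.path.join(tmp, "results.csv")
    np.save(npy_path, data)                       # binary, 8 bytes per value plus header
    np.savetxt(csv_path, data, delimiter=",")     # text, full-precision default format
    npy_size = os.path.getsize(npy_path)
    csv_size = os.path.getsize(csv_path)
    roundtrip = np.load(npy_path, allow_pickle=False)

print(npy_size, csv_size)
print(np.array_equal(roundtrip, data))
```

Loading with `allow_pickle=False` keeps the file restricted to plain array data, which sidesteps the pickle-related compatibility questions that object arrays can raise.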

@coveralls


commented Oct 23, 2013

Coverage Status

Coverage remained the same when pulling ee90385 on jseabold:poisson-count-prob into 84e7607 on statsmodels:master.

@josef-pkt

Member

commented Oct 23, 2013

The only time we had a .npz file that I can find now (in vector_ar), I needed to convert it to a python module, because we ran into some problems across numpy or python versions, AFAIR.

@jseabold

Member Author

commented Oct 23, 2013

I'll switch it, but my understanding is that the .npy is architecture and python independent. That's the point of it. I don't see any issues on the numpy tracker about this.

@josef-pkt

Member

commented Oct 23, 2013

The only explanation for the .npz conversion (*) I can find is https://groups.google.com/d/msg/pystatsmodels/JIp54_XZ66w/OxUf8tCQAJUJ
According to that thread, the problem in the Python 3.2 conversion was with .npz files, not .npy files.

(*) b88d9a3

Using only generic formats removes some possible headaches later on.
(But because I didn't think about it, my first test data file in GMM is a .dta not a csv file.)

@coveralls


commented Oct 23, 2013

Coverage Status

Coverage remained the same when pulling 9f6e2a0 on jseabold:poisson-count-prob into 84e7607 on statsmodels:master.

jseabold added a commit that referenced this pull request Oct 23, 2013

Merge pull request #1088 from jseabold/poisson-count-prob
ENH: add predict_prob to poisson

@jseabold jseabold merged commit da26462 into statsmodels:master Oct 23, 2013

1 check passed

default The Travis CI build passed

@jseabold jseabold deleted the jseabold:poisson-count-prob branch Oct 23, 2013

PierreBdR pushed a commit to PierreBdR/statsmodels that referenced this pull request Sep 2, 2014
