Zero inflated docs #725

Merged 6 commits on Sep 28, 2023

GStechschulte
Collaborator

This PR adds documentation for two classes of models for zero-inflated data: (1) the zero-inflated Poisson (ZIP) and (2) the hurdle Poisson.

First, I explain why zero-inflated data calls for this class of models. Next, I describe and implement the ZIP and hurdle models. Finally, I describe the differences between the two.

The notebook is not finished, but it is far enough along that I am opening this as a draft PR.
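
To make the difference between the two models concrete, here is a minimal sketch of both probability mass functions. It assumes the parameterization in which `psi` is the probability of the Poisson component (ZIP) or of a non-zero count (hurdle); the function names and the values of `psi` and `mu` are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(k, mu):
    """Plain Poisson probability of observing the count k."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def zip_pmf(k, psi, mu):
    """Zero-inflated Poisson: zeros come from two sources.

    With probability 1 - psi we get a structural zero; with probability
    psi we draw from a Poisson, which can itself produce a zero.
    """
    structural_zero = (1.0 - psi) if k == 0 else 0.0
    return structural_zero + psi * poisson_pmf(k, mu)


def hurdle_pmf(k, psi, mu):
    """Hurdle Poisson: zeros come from one source only.

    With probability 1 - psi the count is zero; otherwise it is drawn
    from a Poisson truncated at zero, so it is always positive.
    """
    if k == 0:
        return 1.0 - psi
    return psi * poisson_pmf(k, mu) / (1.0 - poisson_pmf(0, mu))


psi, mu = 0.7, 2.0  # illustrative values, not taken from the notebook
zip_zero = zip_pmf(0, psi, mu)
hurdle_zero = hurdle_pmf(0, psi, mu)
zip_total = sum(zip_pmf(k, psi, mu) for k in range(60))
hurdle_total = sum(hurdle_pmf(k, psi, mu) for k in range(60))

print(f"P(y = 0): ZIP = {zip_zero:.4f}, hurdle = {hurdle_zero:.4f}")
```

For the same `psi` and `mu`, the ZIP assigns more mass to zero than the hurdle model, because the Poisson component contributes zeros of its own.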

@tomicapretto
Collaborator

Notice how there are two linear models and two link functions: one for each process in the ZIPoisson

I would say "parameter" instead of "process"
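
The "two linear models and two link functions" in the quoted sentence can be sketched with plain NumPy. This is only an illustration: it assumes a log link for `mu` and a logit link for `psi`, and the predictor `x` and the coefficient vectors `beta` and `gamma` are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)  # a single hypothetical predictor

# Design matrix with an intercept column, shared by both linear models.
X = np.column_stack([np.ones_like(x), x])

# Hypothetical coefficients: one vector per distributional parameter.
beta = np.array([0.5, 0.3])    # linear model for mu (Poisson mean)
gamma = np.array([-0.2, 1.0])  # linear model for psi (mixing probability)

eta_mu = X @ beta    # linear predictor on the log scale
eta_psi = X @ gamma  # linear predictor on the logit scale

mu = np.exp(eta_mu)                   # inverse log link keeps mu positive
psi = 1.0 / (1.0 + np.exp(-eta_psi))  # inverse logit keeps psi in (0, 1)

print(np.round(mu, 3))
print(np.round(psi, 3))
```

Each link maps an unconstrained linear predictor onto the valid range of one parameter, which is why "parameter" describes the pairing more precisely than "process".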

@tomicapretto
Collaborator

I would say the example is already looking great! Will review again when it's done but so far I don't have any other suggestions.

@GStechschulte GStechschulte marked this pull request as ready for review September 22, 2023 14:00
@GStechschulte
Collaborator Author

Thank you! 👍🏼 If anything, I would like to have the hurdle Poisson section reviewed. In my opinion, the difference between the two models is subtle, and I want to make sure my writing makes sense and is intuitive for users. Thanks!

@codecov-commenter

codecov-commenter commented Sep 22, 2023

Codecov Report

Merging #725 (255ded3) into main (e53f8da) will not change coverage.
Report is 1 commits behind head on main.
The diff coverage is n/a.

❗ Current head 255ded3 differs from pull request most recent head a9ab9d3. Consider uploading reports for the commit a9ab9d3 to get more accurate results

@@           Coverage Diff           @@
##             main     #725   +/-   ##
=======================================
  Coverage   89.56%   89.56%           
=======================================
  Files          44       44           
  Lines        3525     3525           
=======================================
  Hits         3157     3157           
  Misses        368      368           
Files Coverage Δ
bambi/priors/scaler.py 96.70% <ø> (ø)

docs/notebooks/zero_inflated_regression.ipynb
@@ -0,0 +1,1905 @@
{
Collaborator

This is a little bit ambiguous: "the probability of the Poisson process, i.e., the probability of non-zero data, increases"

It is correct to say the probability of the Poisson process increases, but that doesn't mean we have no zeroes, since zeroes are also possible under the Poisson distribution.


Collaborator Author

 but that doesn't mean we have no zeroes, since zeroes are also possible under the Poisson distribution.

Yeah, that's why I added the "probability of non-zero data" since it implies that zeros are still plausible. I wanted to add an interpretation of the statement "the probability of the Poisson process" for the users. For some it may be obvious, but for others not. Nonetheless, I updated this paragraph so it is not as ambiguous.
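
The point under discussion can be checked numerically. The sketch below assumes a zero-inflated Poisson in which `psi` is the probability of the Poisson process; the value of `mu` and the grid of `psi` values are arbitrary.

```python
import numpy as np

mu = 1.5  # arbitrary Poisson mean
psi = np.array([0.2, 0.5, 0.8, 1.0])  # probability of the Poisson process

# Zeros come from the structural-zero part and from the Poisson itself.
p_zero = (1.0 - psi) + psi * np.exp(-mu)

# A non-zero count requires the Poisson process AND a positive draw from it.
p_nonzero = psi * (1.0 - np.exp(-mu))

for p, z, nz in zip(psi, p_zero, p_nonzero):
    print(f"psi = {p:.1f}: P(y = 0) = {z:.3f}, P(y > 0) = {nz:.3f}")
```

As `psi` increases, the probability of non-zero data increases with it, but the probability of a zero never drops below `exp(-mu)`, even at `psi = 1`.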

@tomicapretto
Collaborator

tomicapretto commented Sep 25, 2023

Only one minor last thing (this time it's true!), because all the rest is perfect.

Replace np.linspace(..., num=20) with np.arange(39). That way, each bar in the posterior predictive check visualization corresponds to a single count value, which makes the plot easier to read. Otherwise, the first bar represents more than just the zeros.

Edit: after that, feel free to merge :)
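
The effect of the suggested change can be seen with `np.histogram`. This is a sketch under assumptions: the original `np.linspace` arguments are elided in the comment, so the range `0` to `38` used here is a guess, and the counts are simulated rather than taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated zero-heavy counts (hypothetical data, not the notebook's).
counts = rng.poisson(3.0, size=500)
counts[:150] = 0

# Assumed original binning: 20 evenly spaced edges, so each bin spans 2 counts.
coarse_edges = np.linspace(0, 38, num=20)

# Suggested binning: integer edges 0, 1, ..., 38, so bins are one count wide.
unit_edges = np.arange(39)

coarse_hist, _ = np.histogram(counts, bins=coarse_edges)
unit_hist, _ = np.histogram(counts, bins=unit_edges)

n_zeros = int((counts == 0).sum())
n_ones = int((counts == 1).sum())

print("zeros:", n_zeros, "ones:", n_ones)
print("first coarse bin:", coarse_hist[0])  # zeros and ones together
print("first unit bin:", unit_hist[0])      # zeros only
```

With the coarse edges, the first bar covers the half-open interval from 0 to 2 and therefore mixes zeros and ones; with integer edges, the first bar counts zeros alone.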

@GStechschulte
Collaborator Author

Thanks for the code reviews on this docs PR. Much appreciated! 😄

@GStechschulte GStechschulte merged commit 0321981 into bambinos:main Sep 28, 2023
1 of 4 checks passed
GStechschulte added a commit to GStechschulte/bambi that referenced this pull request Sep 28, 2023
* zero inflated poisson and hurdle poisson models

* grammar fix and sort imports

* interpret coeff. and model comparison section

* code review changes

* change wording in hurdle Poisson section

* change posterior predictive bins to use np.arange
GStechschulte added a commit that referenced this pull request Sep 28, 2023
* ordinal model with cumulative link notebook

* ordinal model with cumulative link function

ordinal models (cumulative and sratio)

* unified explanation for cumulative and sequential models

* sratio model and data

* code review changes

* remove intercept in models

* zero mu vector prior for sratio family

* code review and add section on default priors

* explicit explanation of K and k and added summary section

* Zero inflated docs (#725)

* zero inflated poisson and hurdle poisson models

* grammar fix and sort imports

* interpret coeff. and model comparison section

* code review changes

* change wording in hurdle Poisson section

* change posterior predictive bins to use np.arange

* ordinal model with cumulative link function

ordinal models (cumulative and sratio)

* use plot_ppc_discrete for posterior predictive samples

* add plots explaining the ordinal outcome of the dataset

---------

Co-authored-by: Gabriel Stechschulte <gabriel.stechschulte@schindler.com>
GStechschulte added a commit to GStechschulte/bambi that referenced this pull request Oct 3, 2023
* zero inflated poisson and hurdle poisson models

* grammar fix and sort imports

* interpret coeff. and model comparison section

* code review changes

* change wording in hurdle Poisson section

* change posterior predictive bins to use np.arange
GStechschulte added a commit to GStechschulte/bambi that referenced this pull request Oct 3, 2023
* ordinal model with cumulative link notebook

* ordinal model with cumulative link function

ordinal models (cumulative and sratio)

* unified explanation for cumulative and sequential models

* sratio model and data

* code review changes

* remove intercept in models

* zero mu vector prior for sratio family

* code review and add section on default priors

* explicit explanation of K and k and added summary section

* Zero inflated docs (bambinos#725)

* zero inflated poisson and hurdle poisson models

* grammar fix and sort imports

* interpret coeff. and model comparison section

* code review changes

* change wording in hurdle Poisson section

* change posterior predictive bins to use np.arange

* ordinal model with cumulative link function

ordinal models (cumulative and sratio)

* use plot_ppc_discrete for posterior predictive samples

* add plots explaining the ordinal outcome of the dataset

---------

Co-authored-by: Gabriel Stechschulte <gabriel.stechschulte@schindler.com>
@GStechschulte GStechschulte deleted the zero-inflated-examples branch January 21, 2024 20:19
3 participants