Radon Example Implementation #440
Conversation
View / edit / reply to this conversation on ReviewNB tomicapretto commented on 2022-01-14T22:38:01Z We could convert the variable
Right now,
This is what I mean
juanitorduz commented on 2022-01-15T18:56:36Z Makes sense! Solved in https://github.com//pull/440/commits/1cf6df3fa259dde97921045f9504fbeba89a22d7
tomicapretto commented on 2022-01-14T22:38:02Z "Estimate a single radon level for every floor level" would be clearer about the meaning of the predictor "x", I think. juanitorduz commented on 2022-01-16T20:02:37Z Added!
tomicapretto commented on 2022-01-14T22:38:03Z If we decide to use the categorical version of floor, it would be good to remove the intercept from the formula so the first level of If we keep the intercept, the first level is dropped to ensure full-rankness of the underlying design matrix*
It would be
(*) We have priors, which are a kind of soft constraint, and the model would fit even with rank-deficient design matrices. But that can lead to other problems with the sampler. That's why Bambi sometimes drops columns for categorical variables (brms does the same thing). juanitorduz commented on 2022-01-15T18:57:50Z This was added in https://github.com//pull/440/commits/1cf6df3fa259dde97921045f9504fbeba89a22d7
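The full-rankness point can be illustrated with a small sketch (plain NumPy, not Bambi; the two-level floor factor is a toy stand-in for the radon data): with an intercept plus one dummy column per level of a categorical variable, the columns are linearly dependent, so one level must be dropped.

```python
import numpy as np

# Toy 'floor' factor with two levels (0 = basement, 1 = first floor)
floor = np.array([0, 0, 1, 1])
intercept = np.ones(floor.size)
d_basement = (floor == 0).astype(float)
d_first = (floor == 1).astype(float)

# Intercept plus one dummy per level: intercept == d_basement + d_first,
# so the design matrix is rank deficient (rank 2 despite having 3 columns).
X_full = np.column_stack([intercept, d_basement, d_first])
print(np.linalg.matrix_rank(X_full))  # 2

# Dropping the intercept ("0 + floor") keeps one column per level at full rank.
X_cellmeans = np.column_stack([d_basement, d_first])
print(np.linalg.matrix_rank(X_cellmeans))  # 2 == number of columns
```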
tomicapretto commented on 2022-01-14T22:38:04Z This is what I mean when I suggest using the categorical version of floor: https://imgur.com/a/KKQZ1eL
(I uploaded the image when I was using First floor instead of Floor)
tomicapretto commented on 2022-01-14T22:38:04Z If you decide to use
Note that the coordinate for
pooled_results_posterior_stacked = pooled_results.posterior.stack(sample=('chain', 'draw'))
juanitorduz commented on 2022-01-15T19:10:13Z Thanks for adding it to the notebook!
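For readers less familiar with the xarray call above, here is a NumPy analogue (toy shapes, not the actual posterior) of what stacking the chain and draw dimensions into a single sample dimension conceptually does:

```python
import numpy as np

# Toy posterior draws: 4 chains, 500 draws, 1 parameter column
posterior = np.zeros((4, 500, 1))

# stack(sample=('chain', 'draw')) conceptually merges the chain and draw
# axes into one long 'sample' axis of length chains * draws.
stacked = posterior.reshape(-1, posterior.shape[-1])
print(stacked.shape)  # (2000, 1)
```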
tomicapretto commented on 2022-01-14T22:38:05Z The equivalent formula for the model in the PyMC version is
If we do something like
This post from the Patsy documentation is very insightful. I confess I had to read it several times, on different days, before I started understanding it in more depth :D juanitorduz commented on 2022-01-15T19:01:14Z Thanks for the input!
tomicapretto commented on 2022-01-14T22:38:06Z With the new formula we would have https://imgur.com/iRaqmKi which is more like what is in the PyMC version
tomicapretto commented on 2022-01-14T22:38:07Z I also think you've found a bug, or something similar. I would have expected an error to be raised when you set the prior for
Here we should have something along these lines
partial_pooling_priors = {
    "Intercept": bmb.Prior(name="Normal", mu=0, sigma=10),
    "1|county": bmb.Prior(name="Normal", mu=0, sigma=bmb.Prior("Exponential", lam=1)),
    "sigma": bmb.Prior(name="Exponential", lam=1.0),
}
If you want to be closer to the PyMC specification, you could pass
If you're familiar with the PyMC way of using hierarchies, the Bambi way (borrowed from mixed effects models way) may be confusing in the beginning.
A good explanation is found in Chapter 16 of the Bayes Rules! book, specifically section 16.3.2
juanitorduz commented on 2022-01-15T19:06:32Z I would actually suggest adding some of these comments (and the reference) to the notebook. juanitorduz commented on 2022-01-15T19:07:44Z For future reference, the bug was solved in https://github.com//pull/441
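In mathematical notation, the hyperprior structure that the prior dictionary above encodes is roughly the following (notation is mine, not from the notebook; u_j is the county offset):

```latex
\begin{aligned}
y_i &\sim \mathrm{Normal}(\alpha + u_{j[i]},\ \sigma) \\
\alpha &\sim \mathrm{Normal}(0,\ 10) \\
u_j &\sim \mathrm{Normal}(0,\ \sigma_u) \\
\sigma_u &\sim \mathrm{Exponential}(1) \\
\sigma &\sim \mathrm{Exponential}(1)
\end{aligned}
```

The key point is that the standard deviation of the county offsets, sigma_u, is itself given a prior, which is what passing a `bmb.Prior` as the `sigma` of `1|county` expresses.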
tomicapretto commented on 2022-01-14T22:38:08Z Line #7. formula="log_radon ~ 0 + (0 + 1|county)",
The 0 in
Feel free to ask if there's something you'd like to double check in the formula specification :)
tomicapretto commented on 2022-01-14T22:38:08Z This is a similar issue to the one above
varying_intercept_priors = {
    "Intercept": bmb.Prior(name="Normal", mu=0, sigma=10),
    "floor": bmb.Prior(name="Normal", mu=0.0, sigma=10.0),
    "1|county": bmb.Prior(name="Normal", mu=0, sigma=bmb.Prior("Exponential", lam=1)),
    "sigma": bmb.Prior(name="Exponential", lam=1),
}
I think that updating the model solves your question about one-hot encoding for county?
Note: I removed the intercept so we have the two coefficients for floor. juanitorduz commented on 2022-01-15T19:08:26Z Now everything makes sense!
tomicapretto commented on 2022-01-14T22:38:09Z Here the model would be
varying_intercept_slope_priors = {
    "Intercept": bmb.Prior(name="Normal", mu=0, sigma=5),
    "1|county": bmb.Prior(name="Normal", mu=0, sigma=bmb.Prior("Exponential", lam=1)),
    "floor|county": bmb.Prior(name="Normal", mu=0.0, sigma=bmb.Prior("Exponential", lam=0.5)),
    "sigma": bmb.Prior(name="Exponential", lam=1.0),
}
but when I tried running it, it was taking hours, so I guess there is a bug when using categorical predictors in varying effects
I did the following and it worked like a charm
srrs_mn2 = srrs_mn.copy()
So I'll dig into this issue to see what's wrong in the first model and try to fix it.
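The exact transformation applied after `srrs_mn.copy()` is not shown above; purely as a hypothetical illustration of the kind of recoding involved (column names and labels are assumptions based on the radon example, not the author's code):

```python
import pandas as pd

# Hypothetical stand-in for the radon data frame (columns assumed)
srrs_mn = pd.DataFrame(
    {"floor": [0, 1, 0, 1], "county": ["AITKIN", "AITKIN", "ANOKA", "ANOKA"]}
)

# Copy the data and recode the numeric floor indicator as labelled
# categories before building the model again.
srrs_mn2 = srrs_mn.copy()
srrs_mn2["floor"] = srrs_mn2["floor"].map({0: "Basement", 1: "Floor"}).astype("category")
print(list(srrs_mn2["floor"].cat.categories))  # ['Basement', 'Floor']
```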
@juanitorduz thanks for this massive contribution! I've added several comments to the notebook. I think you've made a lot of progress! As you may see in the comments, I've found two bugs associated with Bambi. One is related to hyperpriors and the other is associated with a varying effect when using a categorical predictor. I think I have to fix the second issue before we can move on with this example. Again, thanks so much!
I have found the problem. It's a shape problem in the underlying PyMC model: we're basically adding arrays whose shapes broadcast into something unintended. I'll create a new PR with the fix and make a new patch release. I guess I introduced this bug accidentally when modifying other underlying code. I'll also add a test so this does not happen again 😅 For the record, this is what is happening in the underlying model:
x = np.array([1, 2, 3])
y = np.array([[1, 2, 3]]).T
x + y
when Bambi actually meant
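To make the broadcasting issue concrete, here is what the snippet above produces (plain NumPy semantics, not Bambi internals): adding a shape-(3,) row to a shape-(3, 1) column broadcasts to a (3, 3) matrix, rather than the elementwise shape-(3,) sum that was presumably intended.

```python
import numpy as np

x = np.array([1, 2, 3])          # shape (3,)
y = np.array([[1, 2, 3]]).T      # shape (3, 1), a column vector

print((x + y).shape)             # (3, 3): broadcasting builds a full matrix
print((x + y.flatten()).shape)   # (3,): an elementwise sum of two flat vectors
```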
The shape problem is solved in the new release, Bambi 0.7.1. I could push my changes to the notebook so you don't have to write them (if you want).
Hey @tomicapretto thank you very much for your feedback! Here are some comments regarding your points:
Feel free to push your changes to the notebook! Just push whatever you have ready and I will work on the open comments on top of it 💪 ! About the bug you discovered (I was also a bit surprised I could pass the prior as I did without getting an error... sometimes things are too good to be true 😆), the fix was merged in #441, right? Which means I would need to re-install
I've quickly made a new patch release to solve this issue because it may be affecting other people too 😨. I'm about to push my changes now.
Codecov Report
@@ Coverage Diff @@
## main #440 +/- ##
==========================================
- Coverage 89.38% 89.26% -0.12%
==========================================
Files 31 31
Lines 2420 2431 +11
==========================================
+ Hits 2163 2170 +7
- Misses 257 261 +4
Fixed in https://github.com//pull/440/commits/3c68d074c045ab63af7ded6bf3e73958ea0b630b
Oops! I think I have lost another comment about the model specification. Here it is for future reference: you mention that you are removing the intercept and then using one coefficient per floor level, but when you describe the mathematical model you called the coefficient beta the "Intercept for the Floor level j". The pattern repeats for the other models, so it is better to clarify the nomenclature.
I agree that saying we don't have an Intercept and then saying "intercept for the floor" may be confusing (even though we can consider those coefficients to be intercepts). Let's use "coefficient for the Floor level j". I'm fixing that now.
@juanitorduz it's sad you caught it too but I'm glad you're feeling better :D

I added several changes, with the aim to simplify the models but still be correct in what we're doing. I compared our Bambi models with the PyMC models and with the models in the original example in the book by Gelman and Hill. There are so many ways of writing the same underlying model, and so many terms that mean the same or something very similar, that it's been very painful. On top of that, I realized this particular example brings a lot of confusion. I'll try to explain in the following points:

Intercepts and slopes

Both the PyMC example and the original example in the book talk about the intercept and the floor slope. I think they do this because For example, Gelman and Hill use What I find confusing is that they talk about the "intercept" and the "slope" as if they were unrelated things. Here, Consequently, I think it is better to omit the global intercept when we use

County-level predictors

When the uranium level is added, both examples say that a county-level predictor is added. However, this does not mean that there's a county-specific slope for uranium (i.e. a hierarchical effect). What they mean is that there is a unique uranium measurement for each county, but the variable enters as a regular slope in the model.

Trying to mimic PyMC parametrizations

I think we should not try to replicate the same parameterizations that are used in the PyMC example. Two main reasons come to my mind:
I'll try to give this notebook a second read soon. I think I've left many "intercepts" written there that I could express in a different way. CC @aloctavodia feel free to add your opinion here.
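The two equivalent ways of writing the floor effect discussed above can be sketched as follows (my notation, assuming x_i is a 0/1 first-floor indicator and j[i] is the floor level of observation i):

```latex
\begin{aligned}
\text{intercept + slope:} \quad & y_i \sim \mathrm{Normal}(\alpha + \beta x_i,\ \sigma), \qquad x_i \in \{0, 1\} \\
\text{one coefficient per level:} \quad & y_i \sim \mathrm{Normal}(\beta_{j[i]},\ \sigma), \qquad j[i] \in \{\text{basement}, \text{first floor}\}
\end{aligned}
```

The two parameterizations imply the same means: beta_basement = alpha and beta_firstfloor = alpha + beta, which is why calling the per-level coefficients either "intercepts" or "slopes" invites confusion.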
Hey @tomicapretto , thank you very much for your input! Here are some comments:
Is there anything I could support you with? Or maybe I wait for your second read and feedback from others? |
According to what we've discussed yesterday, these are the points we need to complete to consider this finished
We're getting closer to wrapping this up!! 😄 🚀
I think this is done, if you want to have a last review @juanitorduz @aloctavodia
juanitorduz commented on 2022-02-13T07:17:51Z A small notation detail: above we used juanitorduz commented on 2022-02-13T07:28:17Z fixed
juanitorduz commented on 2022-02-13T07:17:51Z Here is a typo in y = ... juanitorduz commented on 2022-02-13T07:28:01Z fixed
juanitorduz commented on 2022-02-13T07:17:52Z y = ... is missing (log) juanitorduz commented on 2022-02-13T07:28:44Z fixed
juanitorduz commented on 2022-02-13T07:17:53Z here as well (log) is missing juanitorduz commented on 2022-02-13T07:28:58Z fixed
Great work @tomicapretto ! The change in the notation to make it lightweight really improves readability IMO. Also, the explanation of how to build the model is quite handy :) In 497b0f1 and a6b80d9 I fixed some small typos and harmonised some formulas (lower/upper case). From my side it is ready to go! We can always get feedback and keep improving this valuable notebook 🚀
This PR aims to solve #439.
I have pushed the first version of the models. Here are the formulas I used:
log_radon ~ floor
log_radon ~ 0 + county + county:floor
log_radon ~ 0 + (0 + 1|county)
log_radon ~ 0 + (1|county) + floor
# ! How to do complete one-hot-encoding for county?
log_radon ~ 0 + (floor|county)
# ! How to do complete one-hot-encoding for county?
I would like to have all the counties encoded (not removing the first one); also, I am not auto-scaling the data, to keep the results comparable with the original PyMC example. Does this look good? I am still getting used to the formula-like notation for hierarchical models (which is why I am working on this PR 😅 ).
After double-checking the model specification I will continue to add text and some additional plots.
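Regarding the one-hot-encoding question in the formula comments above, a small pandas sketch (county names are just examples) contrasting the complete encoding with the reduced one that keeps an intercept-bearing design matrix full rank:

```python
import pandas as pd

county = pd.Series(["AITKIN", "ANOKA", "BECKER", "AITKIN"])

# Complete one-hot encoding: one indicator column per county
full = pd.get_dummies(county)

# Reduced encoding: the first level is dropped (its effect is absorbed
# by the intercept), which is what formula machinery does by default
reduced = pd.get_dummies(county, drop_first=True)

print(list(full.columns))     # ['AITKIN', 'ANOKA', 'BECKER']
print(list(reduced.columns))  # ['ANOKA', 'BECKER']
```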