-
-
Notifications
You must be signed in to change notification settings - Fork 269
Conditional autoregressive priors #547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conditional autoregressive priors #547
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
What's the right technique to generate and then fix these merge conflicts locally? |
I've managed to pass all the pre-commit checks locally, so that's progress for this notebook. I ended up finding a half dozen other typos that were causing pre-commit to fail. However, the preview file is rendered terribly plus the branch conflicts. Not sure how to fix those yet.
|
…-saunders-phil/pymc-examples into conditional_autoregressive_priors
I think it might be done! @bwengals and I fought through a lot of pre-commit to get here 🫠 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments after going over the notebook. I haven't gone over the contents in depth yet though. Before going over the comments and further review, first note:
- I have commented everything on the myst file. Don't edit the myst file directly, apply the changes to the notebook and then run pre-commit.
- I am completely lost on what is the relation between this notebook and the existing one. Are they complementary, should this replace the other one, is the old one still relevant and salvageable (even if this one is somewhat complementary)?
# Conditional Autoregressive (CAR) Models for Spatial Data | ||
|
||
:::{post} Jul 29, 2022 | ||
:tags: spatial |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also an autoregressive tag https://www.pymc.io/projects/examples/en/latest/blog/tag/autoregressive.html which seems relevant for the notebook based on the title. I'd also recommend going over the list of tags (left sidebar of https://www.pymc.io/projects/examples/en/latest/gallery.html) and add other that might be relevant too.
:::{post} Jul 29, 2022 | ||
:tags: spatial | ||
:category: beginner, tutorial | ||
:author: Conor Hassan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add yourself here
substitutions: | ||
extra_dependencies: bambi seaborn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
substitutions: | |
extra_dependencies: bambi seaborn | |
myst: | |
substitutions: | |
extra_dependencies: bambi seaborn |
I need to update the docs, they recenly updated the metadata keys for this. Ref: https://myst-parser.readthedocs.io/en/latest/syntax/optional.html#substitutions-with-jinja2. It also looks like geopandas and pylibsal and the like should also be added here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything I should do on my end? The ipynb file just has:
:::{include} ../extra_installs.md
:::
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, only the metadata and these two lines are enough. You can see in the preview how this two lines add into the rendered notebook a predefined template with install instructions with pip, conda and from within jupyter: https://pymcio--547.org.readthedocs.build/projects/examples/en/547/case_studies/conditional_autoregressive_priors.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not clear on where I can add the metadata. The substitutions lines only appears in the myst file (as far as I can tell) and I thought that's entirely generated from the .ipynb.
Also, if it matters we don't need seaborn or bambi for this one. Just geopandas and pylibsal. Both should be installed with conda-forge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The section I linked of the style guide covers how to edit the metadata from jupyter. This (together with cell level metadata which we also use) is one of the main reasons we use myst files in addition to ipynb. The ipynb is needed to store outputs and render the website properly, myst ones are needed for proper reviews to be possible (this might change with the improvements in jupyer diffs github is doing but for now I think they are still needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still pretty unsure - the .ipynb metadata now reads:
"substitutions": {
"extra_dependencies": "geopandas libpysal"
},
but the document preview is unchanged. What might I be missing? Is there an example of a notebook in the pymc library that has used this myst substitutions trick successfully I can model?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has been updated, you might have been a bit too eager and not waited for the rtd check (has status mark below) finished. I do see the new libraries now.
This behaviour is the one that appears in the style guide (and still works) but was recently deprecated. It should be:
"myst": {
"substitutions": {
"extra_dependencies": "geopandas libpysal"
}
},
We need to load in the dataset to access the variables $\{y_i, x_i, E_i\}_{i=1}^N$. But more unique to the use of CAR models, is the creation of the necessary spatial adjacency matrix. For the models that we fit, all neighbours are weighted as $1$, circumventing the need for a weight matrix. The dataset can be accessed via the `pm.get_data` function. | ||
|
||
```{code-cell} ipython3 | ||
df_scot_cancer = pd.read_csv(pm.get_data("scotland_lips_cancer.csv")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be updated to https://www.pymc.io/projects/docs/en/latest/contributing/jupyter_style.html#reading-from-file
|
||
```{code-cell} ipython3 | ||
independent_stacked = az.extract(independent_idata) | ||
spat_df["INDEPENDENT_RES"] = independent_stacked.res.mean(axis=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spat_df["INDEPENDENT_RES"] = independent_stacked.res.mean(axis=1) | |
spat_df["INDEPENDENT_RES"] = independent_stacked.res.mean(dim="sample") |
or, given I don't see independent_stacked
variable being used later, it could even be:
spat_df["INDEPENDENT_RES"] = independent_data["posterior"]["res"].mean(dim=["chain", "draw"])
fixed_spatial_stacked = az.extract(fixed_spatial_idata) | ||
spat_df["SPATIAL_RES"] = fixed_spatial_stacked.res.mean(axis=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar comment as above, axis=...
should not be used with xarray objects
```{code-cell} ipython3 | ||
car_stacked = az.extract(car_idata) | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this cell is not used anywhere, should be removed
* Adapted from a previous PyMC example notebook, authored by Junpeng Lao {ref}`conditional_autoregressive_model` by Conor Hassan on July, 2022. | ||
* Re-executed by Daniel Saunders in May, 2023 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably say something like:
* Adapted from {ref}`another PyMC example notebook <conditional_autoregressive_model>` by Conor Hassan and Daniel Saunders ([pymc-examples#417](https://github.com/pymc-devs/pymc-examples/pull/417) and [pymc-examples#547](https://github.com/pymc-devs/pymc-examples/pull/547/)).
Authorship of the other notebook should be on the other notebook I think.
|
||
```{code-cell} ipython3 | ||
%load_ext watermark | ||
%watermark -n -u -v -iv -w -p pytensor,xarray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%watermark -n -u -v -iv -w -p pytensor,xarray | |
%watermark -n -u -v -iv -w -p xarray |
pytensor is imported, so adding it here duplicates it on the watermark
I think this notebook replaces the other one for most users. The old CAR notebook is great for explaining why PyMC's car prior is implemented the way it is. So users who are interested in understanding how to write and modify their own algorithms for similar models autoregressive models would benefit from reading the old notebook (albeit, with up dates to pymc 5). For everyone else, the new notebook is more useful - it explain a bit about why we should care about spatial autocorrelation and is far more concise. |
Thanks, this is great! Would you be interested in updating the other one too? If so we can continue the discussion once the time comes, otherwise (or in the meantime if it is expected to be a longish wait), what do you think about updating the title of the other notebook to be "About CAR models in PyMC"? I think it is more fitting given your description |
…ussion of divergences
Yeah I'd like to revise the other one at some point. I think I'll get to it after my next task (trying to develop the ICAR) so an interim title change sounds good. Would I update the A quick explanation of the other changes to the file: previously the notebook suggested model 3 was unidentifiable and trying to sample from it would typically result in chains with loads of divergences. I wasn't able to reliably reproduce these divergences and can't find evidence of degenerate geometries in the posterior distribution. I think the divergences discussed in the earlier drafts might be due to bad luck. To avoid discussion of divergences, I rewrote the bottom few cells to provide a different motivation for the ICAR. |
The main thing that needs to be updated is the title, the |
Hi @OriolAbril I think it might be ready to go again - I adjusted the metadata and the title of the old CAR notebook as you suggested. My read the docs preview doesn't seem to be up to date but the committed files are all where I think they should be. |
…utoregressive_priors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a comment about the categories for the "about" notebook, other than that I think it is good to go, thanks!
|
||
:::{post} Aug 14, 2020 | ||
:tags: spatial, autoregressive, count data | ||
:category: advanced, reference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the notebook is more explanation type than reference.
Thanks! |
#417 is got stuck and the original author isn't pursing the PR anymore. However, it would be nice to get this notebook finished. I just don't have the permissions to change #417 directly. So I've made a new PR to, at the very least, test out the preview of read the docs and see if it passes continuous integration checks on github.