# Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC by Vehtari et al. 2019

### Students: Tanveer Karim and Ian Weaver
### Course: APMTH 207
### GitHub Repository: https://github.com/icweaver/pyhat
### Paper: https://ui.adsabs.harvard.edu/abs/2019arXiv190308008V/abstract

## Problem Statement and Existing Work
$\newcommand{\rhat}{\widehat R}$
The traditional $\rhat$ statistic was first defined by [Gelman and Rubin (1992)](https://www.jstor.org/stable/2246093?seq=1#metadata_info_tab_contents), and the subsequent revision, which splits each chain in half (split-$\rhat$), was defined by [Gelman et al. (2013)](https://books.google.com/books/about/Bayesian_Data_Analysis.html?id=eSHSBQAAQBAJ). These statistics compare the variance between chains with the variance within chains of an MCMC run and quantify whether the chains have failed to mix, i.e. whether they have diverged from one another. Along with traceplots, these statistics serve as a powerful way to diagnose whether the MCMC samples can be used for subsequent analysis.
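
To make the between-chain and within-chain comparison concrete, here is a minimal sketch of split-$\rhat$ for a single parameter. It is an illustration written for this summary, not the implementation in `codes/utils.py`; the function name `split_rhat` and the synthetic independent draws are assumptions.

```python
import numpy as np

def split_rhat(chains):
    """Split-R-hat for draws of one parameter, shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    half = chains.shape[1] // 2
    # Treat the two halves of every chain as separate chains;
    # a trailing draw is dropped when n_draws is odd.
    split = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = split.shape[1]
    between = n * split.mean(axis=1).var(ddof=1)   # B: scaled variance of chain means
    within = split.var(axis=1, ddof=1).mean()      # W: mean of per-chain variances
    var_plus = (n - 1) / n * within + between / n  # pooled variance estimate
    return np.sqrt(var_plus / within)

# Synthetic stand-ins for MCMC output (independent draws, an assumption)
rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))                       # four chains, same target
stuck = mixed + np.array([[0.0], [0.0], [0.0], [3.0]])   # one chain offset by 3
print(split_rhat(mixed), split_rhat(stuck))
```

Values close to 1 indicate that the chains agree; the offset chain inflates the between-chain term and pushes the statistic well above 1.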

However, [Vehtari et al. (2019)](https://ui.adsabs.harvard.edu/abs/2019arXiv190308008V/abstract) showed that the well-known $\rhat$ statistic and traceplots have limitations: for example, $\rhat$ can fail to flag non-convergence when the chains have heavy tails or when they share a mean but differ in variance. They therefore propose an alternative definition of $\rhat$ based on rank normalization and folding, and introduce rankplots as an alternative to traceplots for visual diagnostics.
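
The rank-normalization and folding steps can be sketched as follows. This is a hedged illustration of the idea described in the paper, not the code in `codes/utils.py`: the function names are hypothetical, and the draws are synthetic and independent. Rank normalization replaces each draw by the normal score of its rank among all pooled draws, folding takes the absolute deviation from the pooled median, and the reported value is the larger of the two resulting split-$\rhat$ values.

```python
import numpy as np
from scipy import stats

def split_rhat(chains):
    """Split-R-hat for draws of one parameter, shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    half = chains.shape[1] // 2
    split = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = split.shape[1]
    between = n * split.mean(axis=1).var(ddof=1)
    within = split.var(axis=1, ddof=1).mean()
    return np.sqrt(((n - 1) / n * within + between / n) / within)

def rank_normalize(chains):
    """Replace draws by normal scores of their ranks pooled over all chains."""
    chains = np.asarray(chains, dtype=float)
    ranks = stats.rankdata(chains, method="average").reshape(chains.shape)
    # The fractional offset (r - 3/8) / (S + 1/4) keeps the scores finite
    return stats.norm.ppf((ranks - 0.375) / (chains.size + 0.25))

def fold(chains):
    """Absolute deviation from the pooled median, which exposes scale differences."""
    chains = np.asarray(chains, dtype=float)
    return np.abs(chains - np.median(chains))

def improved_rhat(chains):
    """Maximum of the rank-normalized and the folded rank-normalized split-R-hat."""
    bulk = split_rhat(rank_normalize(chains))
    tail = split_rhat(rank_normalize(fold(chains)))
    return max(bulk, tail)

# Four chains with the same mean; the last has three times the standard deviation
rng = np.random.default_rng(1)
draws = rng.normal(size=(4, 1000))
draws[3] *= 3.0
print(split_rhat(draws), improved_rhat(draws))
```

In this setup the chain means agree, so the plain split-$\rhat$ stays near 1, while the folded version picks up the difference in scale.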

## Context and Unique Contribution

For anyone doing Bayesian analysis in many dimensions, MCMC samplers are indispensable modelling tools. Beyond 2D, directly visualizing the convergence of a sampler in parameter space becomes impractical, and we have to resort to summary diagnostics instead. Hence, if our widely used statistics and diagnostics do not perform as we expect, almost any complex multidimensional MCMC analysis becomes suspect. For this reason, we chose this paper in order to understand how best to study convergence issues in MCMC samplers.

## Technical Content and Experiments
The high-level organization of our codebase [`pyhat`](https://github.com/icweaver/pyhat) is shown below and described in detail in the associated notebooks:
```
pyhat
├── Project_Summary_Notebook_ian.ipynb
├── Project_Summary_Notebook_tanveer.ipynb
├── README.md
├── codes
│   ├── plotutils.py
│   └── utils.py
└── examples
    ├── multiplanet
    │   ├── data
    │   │   ├── map_solution.npy
    │   │   ├── multiplanet_chain_1.pkl
    │   │   ├── multiplanet_chain_2.pkl
    │   │   ├── multiplanet_chain_3.pkl
    │   │   ├── multiplanet_chain_4.pkl
    │   │   └── trace.pkl
    │   └── multiplanet.ipynb
    ├── rhat_variance
    │   ├── data
    │   │   └── models.npy
    │   └── rhat_variance.ipynb
    └── toymodel_gaussian
        ├── toymodel_2DGaussian.ipynb
        ├── toymodel_gaussian.ipynb
        └── utils.py
```

The main implementation and notebooks detailing its use are:
* `codes/` - implementation of the modified $\rhat$ statistics proposed by the paper and tools to visualize them
* [`examples/multiplanet/multiplanet.ipynb`](examples/multiplanet/multiplanet.ipynb) - domain-specific application (astronomy) of the tools described above
* [`examples/rhat_variance/rhat_variance.ipynb`](examples/rhat_variance/rhat_variance.ipynb) - increased variance example
* [`examples/toymodel_gaussian/toymodel_2DGaussian.ipynb`](examples/toymodel_gaussian/toymodel_2DGaussian.ipynb) - presentation of $\rhat$ and visualization tools on a 2D Gaussian with changing correlation
* [`examples/toymodel_gaussian/toymodel_gaussian.ipynb`](examples/toymodel_gaussian/toymodel_gaussian.ipynb) - presentation of $\rhat$ and visualization tools on a simple Gaussian distribution with changing variance
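
Rankplots, which the paper introduces as an alternative to traceplots, are histograms of the pooled ranks of the draws, drawn separately for each chain. The sketch below computes only the counts behind such a plot; the function name `rank_histograms` and the bin count are hypothetical and are not taken from `codes/plotutils.py`.

```python
import numpy as np

def rank_histograms(chains, n_bins=20):
    """Per-chain histograms of ranks pooled over all chains.

    chains has shape (n_chains, n_draws); returns (counts, edges) where
    counts has shape (n_chains, n_bins).
    """
    chains = np.asarray(chains, dtype=float)
    # Zero-based ordinal ranks over the pooled draws (ties broken by position)
    ranks = chains.ravel().argsort().argsort().reshape(chains.shape)
    edges = np.linspace(0, chains.size, n_bins + 1)
    counts = np.stack([np.histogram(r, bins=edges)[0] for r in ranks])
    return counts, edges

# Synthetic example (independent draws, an assumption): the last chain is offset
rng = np.random.default_rng(3)
chains = rng.normal(size=(4, 1000))
chains[3] += 3.0
counts, edges = rank_histograms(chains, n_bins=20)
print(counts)
```

When all chains target the same distribution, every row of `counts` is roughly uniform; a chain that sits in a different region piles its draws into a few rank bins, as the offset chain does here.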

The `data` folder holds intermediate results that can be loaded into each notebook to avoid running time-intensive cells again. Note: `examples/multiplanet/data/trace.pkl` was too large to upload to our GitHub repo, but we are more than happy to share it upon request.

## Evaluation

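The modified $\rhat$ works well at detecting differences in variance between chains, as in the changing-variance examples above. It is much less effective for correlation: chains that differ only in the correlation between parameters are not flagged, as shown in [toymodel_2DGaussian](examples/toymodel_gaussian/toymodel_2DGaussian.ipynb).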

## Future Work

For future work, we would like to understand how to modify $\widehat{R}$ so that it captures the correlation terms of the covariance matrix. Right now, off-diagonal terms do not contribute at all, as shown in [toymodel_2DGaussian](examples/toymodel_gaussian/toymodel_2DGaussian.ipynb). We would like to explore whether there is a better way to quantify changes in the correlation terms. One possible avenue is to understand the relationship between $\widehat{R}$ and the autocorrelation function and see whether the two can be unified.
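
The insensitivity to off-diagonal terms can be reproduced with a small synthetic experiment. This is a minimal sketch under stated assumptions, not the code from the notebook: the draws are independent samples from 2D Gaussians rather than MCMC output, the correlation values are hypothetical, and plain split-$\widehat{R}$ is used for brevity. Because $\widehat{R}$ is computed separately for each parameter, it only sees the marginal distributions, which are identical across the chains here.

```python
import numpy as np

def split_rhat(chains):
    """Split-R-hat for draws of one parameter, shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    half = chains.shape[1] // 2
    split = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = split.shape[1]
    between = n * split.mean(axis=1).var(ddof=1)
    within = split.var(axis=1, ddof=1).mean()
    return np.sqrt(((n - 1) / n * within + between / n) / within)

rng = np.random.default_rng(2)
n_draws = 2000
# Hypothetical setup: the chains disagree only on the correlation
correlations = [0.0, 0.0, 0.9, 0.9]
draws = np.stack([
    rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n_draws)
    for rho in correlations
])  # shape (n_chains, n_draws, n_params)

# R-hat is evaluated one parameter at a time, so it only sees the marginals
rhat_per_param = [split_rhat(draws[:, :, p]) for p in range(draws.shape[2])]
# Sample correlation between the two parameters within each chain
corr_per_chain = [np.corrcoef(chain.T)[0, 1] for chain in draws]

print("R-hat per parameter:", rhat_per_param)
print("correlation per chain:", corr_per_chain)
```

Both per-parameter values stay close to 1 even though the chains clearly sample different joint distributions, which is the gap a correlation-aware diagnostic would need to close.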