# beta-dist-plot

Copyright © 2024 Erik Hanson

Interactive plots of a beta distribution, or of a weighted combination of two
beta distributions, given the mean and "variance ratio" parameters. The beta
distribution is convenient for generating bounded parameters in simulations
because its shape parameters are easy to compute from the mean and variance.
I use the "variance ratio" (the variance divided by `mean * (1 - mean)`)
since its limits do not change with the mean.
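
As a quick check of that claim, here is a minimal sketch in plain Python. The helper names `beta_params` and `beta_moments` are made up for this illustration; the first one mirrors the conversion used in the notebook code, and the second uses the standard closed-form mean and variance of a beta distribution. Because the variance of Beta(a, b) equals `mean * (1 - mean) / (a + b + 1)`, the ratio is `1 / (a + b + 1)`, which stays between 0 and 1 whatever the mean is.

```python
def beta_params(mean, variance_ratio):
    """Hypothetical helper: shape parameters from mean and variance ratio.

    Assumes both arguments lie strictly between 0 and 1.
    """
    nu = (1.0 - variance_ratio) / variance_ratio  # nu equals a + b
    return nu * mean, nu * (1.0 - mean)


def beta_moments(a, b):
    """Closed-form mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, variance


# Round trip: start from a mean and a variance ratio, recover both.
a, b = beta_params(0.2, 0.1)
m, v = beta_moments(a, b)
print(f"a = {a:.4f}, b = {b:.4f}")
print(f"mean = {m:.4f}, variance ratio = {v / (m * (1.0 - m)):.4f}")
```

The round trip should return the mean and variance ratio it started from, up to floating-point error.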

If we need to do a lot of sampling from the distribution during each step of
the simulations, it might be better to use the Kumaraswamy distribution, since
its inverse CDF has a closed form.

https://en.wikipedia.org/wiki/Kumaraswamy_distribution
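
To illustrate why a closed-form inverse CDF helps, here is a minimal sketch of inverse-transform sampling with NumPy. The function names are made up for this example and are not part of this project. Note that the Kumaraswamy shape parameters `a` and `b` are not interchangeable with the beta distribution's parameters, so they would need to be fitted separately.

```python
import numpy as np


def kumaraswamy_cdf(x, a, b):
    """CDF of the Kumaraswamy distribution on [0, 1]: 1 - (1 - x^a)^b."""
    return 1.0 - (1.0 - x ** a) ** b


def kumaraswamy_ppf(u, a, b):
    """Inverse CDF, obtained by solving u = 1 - (1 - x^a)^b for x."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def kumaraswamy_sample(a, b, size, rng=None):
    """Draw samples by pushing uniform variates through the inverse CDF."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(size)  # uniform on [0, 1)
    return kumaraswamy_ppf(u, a, b)


# Example shape parameters chosen arbitrarily for illustration.
samples = kumaraswamy_sample(2.0, 5.0, size=5, rng=np.random.default_rng(0))
print(samples)
```

Each sample costs one uniform draw and two power operations, with no iterative root finding.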


## Credits

To get started, I searched for "python jupyter interactive plot" and
found an example on Stack Overflow that was helpful,

https://stackoverflow.com/questions/44329068/jupyter-notebook-interactive-plot-with-widgets

along with a similar link on Geeks for Geeks.

https://www.geeksforgeeks.org/interactive-graphs-in-jupyter-notebook/

Then I wanted to plot the PDF and CDF on the same graph, so I searched for
"scipy second y axis" and found this link on Python Guides.

https://pythonguides.com/matplotlib-two-y-axes/

Then I searched for "scipy multiply graphs" to get examples with multiple
subplots. Again, there was a very detailed link on Stack Overflow and a
nice example on Geeks for Geeks (along with some distracting ads).

https://stackoverflow.com/questions/31726643/how-to-plot-in-multiple-subplots

https://www.geeksforgeeks.org/plot-multiple-plots-in-matplotlib/

The Geeks for Geeks article mentioned the `matplotlib.pyplot.subplot2grid`
function, which I ended up using.

With those examples, I started making my plots. I referred to the 
documentation to learn more about the `FloatSlider` options and might
have done another search to learn about the `rcParamsDefault['figure.figsize']`
option.

In [1]:
import matplotlib.pyplot as plt
from scipy.stats import beta

from ipywidgets import interact
from ipywidgets import Layout
from ipywidgets import FloatSlider

import numpy as np


In [2]:
def beta_params_from_mean_and_variance_ratio(mean, variance_ratio):
    """
    mean and variance_ratio are between 0 and 1, with
    variance = variance_ratio * mean * (1 - mean)

    Returns the alpha and beta parameters of the specified beta distribution.
    """
    if variance_ratio in (0, 1) or mean in (0, 1):
        return 1, 1  # Should we throw an exception if mean is 0 or 1?
    nu = (1.0 - variance_ratio) / variance_ratio
    a = nu * mean
    b = nu * (1.0 - mean)
    return a, b

def plot_pdf_cdf_on_axis(x, y_pdf, y_cdf, ax):
    """
    plot a distribution pdf and cdf on an axis
    """
    # PDF on the left-hand y axis
    color = 'tab:red'
    ax.set_xlabel('X')
    ax.set_ylabel('PDF', color=color)
    ax.plot(x, y_pdf, color=color)
    ax.tick_params(axis='y', labelcolor=color)

    # CDF on a second y axis that shares the same x axis
    ax2 = ax.twinx()
    color = 'tab:green'
    ax2.set_ylabel('CDF', color=color)
    ax2.plot(x, y_cdf, color=color)
    ax2.tick_params(axis='y', labelcolor=color)

def plot_beta_dist_with_cdf(mean=0.5, variance_ratio=0.1):
    """
    Plot a single beta distribution pdf and cdf.
    """
    a, b = beta_params_from_mean_and_variance_ratio(mean, variance_ratio)
    x = np.linspace(0, 1, 100)
    y_pdf = beta.pdf(x, a, b)
    y_cdf = beta.cdf(x, a, b)
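    # The mode formula only holds for a unimodal beta (a > 1 and b > 1).
    mode = (a - 1) / (a + b - 2) if a > 1 and b > 1 else float('nan')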
    
    fig, ax = plt.subplots()
    
    plot_pdf_cdf_on_axis(x, y_pdf, y_cdf, ax)

    label = f'a = {a:0.4f}, b = {b:0.4f}, mode = {mode:0.4f}'
    fig.text(.5, -0.05, label, ha='center', va='bottom')

    plt.title('Interactive beta plot')
    plt.grid(True)
    plt.show()

def plot_beta_dists_with_cdf(mean_1=0.2, variance_ratio_1=0.1, weight_1=0.5, mean_2=0.8, variance_ratio_2=0.1):
    """
    Plot a pair of beta distributions and their weighted sum.
    """    
    # Constrained layout avoids issues with overlapping axes
    plt.rcParams['figure.constrained_layout.use'] = True

    ax1 = plt.subplot2grid((3, 2), (0, 0))
    a, b = beta_params_from_mean_and_variance_ratio(mean_1, variance_ratio_1)
    x = np.linspace(0, 1, 100)
    y_pdf_1 = beta.pdf(x, a, b)
    y_cdf_1 = beta.cdf(x, a, b)
    plot_pdf_cdf_on_axis(x, y_pdf_1, y_cdf_1, ax1)
    
    ax2 = plt.subplot2grid((3, 2), (0, 1)) 
    a, b = beta_params_from_mean_and_variance_ratio(mean_2, variance_ratio_2)
    x = np.linspace(0, 1, 100)
    y_pdf_2 = beta.pdf(x, a, b)
    y_cdf_2 = beta.cdf(x, a, b)
    plot_pdf_cdf_on_axis(x, y_pdf_2, y_cdf_2, ax2)
    
    ax3 = plt.subplot2grid((3, 2), (1, 0), rowspan=2, colspan=2) 
    y_pdf = y_pdf_1 * weight_1 + y_pdf_2 * (1.0 - weight_1)
    y_cdf = y_cdf_1 * weight_1 + y_cdf_2 * (1.0 - weight_1)
    plot_pdf_cdf_on_axis(x, y_pdf, y_cdf, ax3)

    # plt.gcf().text(.5, -0.05, f'a = {a:0.4f}, b = {b:0.4f}', ha='center', va='bottom')

    plt.title('Interactive beta combination plot')
    plt.grid(True)
    plt.show()
   

In [3]:
# Interactive plot of a single beta dist.

step_size = 0.002
precision = len(str(step_size)) - 2
format_string = f'.{precision}f'
width_string = f"{plt.rcParamsDefault['figure.figsize'][0]}in"
common_params = {
    'min' : step_size,
    'max' : 1-step_size,
    'step' : step_size,
    'readout_format' : format_string,
    'layout' : Layout(width=width_string)
}

interact(
    plot_beta_dist_with_cdf,
    mean=FloatSlider(value=0.2, **common_params),
    variance_ratio=FloatSlider(value = 0.1, **common_params)
)


In [4]:
# Interactive plot of a weighted combination of two beta dists.

step_size = 0.002
precision = len(str(step_size)) - 2
format_string = f'.{precision}f'
width_string = f"{plt.rcParamsDefault['figure.figsize'][0]}in"
common_params = {
    'min' : step_size,
    'max' : 1-step_size,
    'step' : step_size,
    'readout_format' : format_string,
    'layout' : Layout(width=width_string)
}

interact(
    plot_beta_dists_with_cdf,
    mean_1=FloatSlider(value=0.2, **common_params),
    variance_ratio_1=FloatSlider(value = 0.1, **common_params),
    weight_1=FloatSlider(value = 0.7, **common_params),
    mean_2=FloatSlider(value=0.8, **common_params),
    variance_ratio_2=FloatSlider(value = 0.1, **common_params)
)


### Things to consider adding

- scaling with min and max values.

- try a parameterization with mean and mode. I think we get
  `a = mean * (1 - 2 * mode) / (mean - mode)`
  and
  `b = a * (1 - mean) / mean`.
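
A minimal sketch of the mean and mode parameterization suggested above, assuming the usual mode formula `(a - 1) / (a + b - 2)`, which only applies when both `a` and `b` are greater than 1. The function name and the error handling are assumptions made for this illustration, not existing project code.

```python
def beta_params_from_mean_and_mode(mean, mode):
    """Hypothetical helper: beta shape parameters from mean and mode.

    Assumes 0 < mean < 1 and a unimodal distribution (a > 1 and b > 1).
    """
    if mean == mode:
        # Mean equals mode only in the symmetric case (both 0.5), where
        # any a == b > 1 fits, so the pair does not pin down a unique answer.
        raise ValueError("mean and mode must differ")
    a = mean * (1.0 - 2.0 * mode) / (mean - mode)
    b = a * (1.0 - mean) / mean
    if a <= 1.0 or b <= 1.0:
        # Happens when the mode is not on the same side of 0.5 as the mean
        # and farther from 0.5 than the mean is.
        raise ValueError("no unimodal beta distribution matches these values")
    return a, b


# Beta(2, 5) has mean 2/7 and mode 1/5, so this should recover a = 2, b = 5.
a, b = beta_params_from_mean_and_mode(2.0 / 7.0, 0.2)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Rejecting results with `a <= 1` or `b <= 1` guards against mean and mode pairs that no unimodal beta distribution can produce, such as a mode that lies closer to 0.5 than the mean does.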