# The Beta Distribution

This notebook explores the beta distribution through real-world examples, data generating processes, and visualizations.

In [None]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from ipywidgets import interact, widgets

sns.set_theme(style="whitegrid")
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['figure.dpi'] = 100


## 1. Real-World Examples

The beta distribution models quantities that lie between 0 and 1, so it appears wherever a rate or proportion varies:

- Conversion rates in marketing
- Batting averages in baseball
- Proportion of defective items in manufacturing
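
For reference, the standard density behind all three examples can be written as follows. The shape parameters are named $a$ and $b$ to match the code in this notebook, and $B(a, b)$ is the beta function; the mean and variance are the same closed forms printed at the end of Example 1.

$$
f(x; a, b) = \frac{x^{a-1}(1-x)^{b-1}}{B(a, b)}, \quad 0 < x < 1, \qquad
B(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)}
$$

$$
\mathbb{E}[X] = \frac{a}{a+b}, \qquad
\operatorname{Var}(X) = \frac{ab}{(a+b)^2\,(a+b+1)}
$$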

Let's explore some examples in detail.

### Example 1: Conversion rates in marketing

Here we draw 1,000 campaign-level conversion rates from Beta(2, 8), then plot their histogram and empirical CDF:

In [None]:

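def simulate_conversion_rates(a=2, b=8, n_campaigns=1000):
    """Draw one conversion rate per campaign from a Beta(a, b) distribution."""
    # Every draw lies between 0 and 1, so it can be read directly as a rate.
    conversion_rates = np.random.beta(a, b, n_campaigns)
    return conversion_rates

# Beta(2, 8) has mean 2 / (2 + 8) = 0.2, so a typical simulated rate is about 20%.
conversion_rates = simulate_conversion_rates(2, 8, 1000)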

plt.figure(figsize=(12, 6))

# Left panel: histogram with a KDE overlay and the sample mean marked.
plt.subplot(1, 2, 1)
sns.histplot(conversion_rates, bins=30, kde=True)
plt.title("Distribution of Conversion Rates")
plt.xlabel("Conversion Rate")
plt.ylabel("Frequency")

plt.axvline(np.mean(conversion_rates), color='r', linestyle='--', label='Mean')
plt.legend()

# Right panel: empirical CDF of the same simulated rates.
plt.subplot(1, 2, 2)
sns.ecdfplot(conversion_rates)
plt.title("Empirical CDF of Conversion Rates")
plt.xlabel("Conversion Rate")
plt.ylabel("Cumulative Probability")

plt.tight_layout()
plt.show()

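# Compare the sample moments with the closed-form Beta(a, b) moments.
a, b = 2, 8  # same parameters passed to simulate_conversion_rates above
mean_rate = np.mean(conversion_rates)
var_rate = np.var(conversion_rates)
print(f"Mean conversion rate: {mean_rate:.4f}")
print(f"Variance of conversion rates: {var_rate:.4f}")
print(f"Theoretical mean (a/(a+b)): {a / (a + b):.4f}")
print(f"Theoretical variance (ab/((a+b)²(a+b+1))): {a * b / ((a + b)**2 * (a + b + 1)):.4f}")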
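
The mean and variance formulas printed above can also be run in reverse: given only a sample, they can be solved for `a` and `b` (the method of moments). The sketch below is an illustration rather than part of the original notebook. The helper name `beta_method_of_moments`, the seed, and the sample size are all assumptions chosen for the example.

```python
import numpy as np

def beta_method_of_moments(samples):
    """Estimate Beta(a, b) shape parameters from the sample mean and variance.

    Hypothetical helper for illustration. It inverts
    mean = a / (a + b) and var = ab / ((a + b)^2 (a + b + 1)).
    """
    samples = np.asarray(samples, dtype=float)
    m = samples.mean()
    v = samples.var()
    # Solving the two moment equations gives a + b = m(1 - m) / v - 1.
    total = m * (1 - m) / v - 1
    return m * total, (1 - m) * total

# Assumed setup: a seeded generator and a large sample from Beta(2, 8).
rng = np.random.default_rng(0)
samples = rng.beta(2, 8, size=100_000)

a_hat, b_hat = beta_method_of_moments(samples)
print(f"Estimated a: {a_hat:.2f}, estimated b: {b_hat:.2f}")
```

With a sample this large the estimates should land close to the true values of 2 and 8. With only a few dozen campaigns they would be noticeably noisier.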


## 5. Summary

In this notebook, we explored the beta distribution through:

1. **Real-world examples**: conversion rates in marketing, batting averages in baseball, and the proportion of defective items in manufacturing
2. **Data generating process**: Understanding how the distribution emerges
3. **Implementation & visualization**: Using NumPy for random sampling and seaborn for visualization
4. **Interactive exploration**: Examining how the distribution changes with different parameters
5. **Practical applications**: Real-world use cases

The beta distribution is a flexible model for rates and proportions: its two shape parameters control both where the probability mass sits and how concentrated it is.