# The Gaussian Distribution

This notebook explores the Gaussian (normal) distribution through real-world examples, data-generating processes, and visualizations.

In [None]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from ipywidgets import interact, widgets

sns.set_theme(style="whitegrid")
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['figure.dpi'] = 100


## 1. Real-World Examples

Many real-world quantities are approximately Gaussian:

- Heights of people
- Measurement errors
- Test scores

Let's explore some examples in detail.

### Example 1: Heights of people

Let's simulate and visualize this example:

In [None]:

from scipy import stats

def simulate_heights(mean=170, std=7, n_people=1000):
    """Draw n_people heights (cm) from a normal distribution with the given mean and SD."""
    return np.random.normal(mean, std, n_people)

heights = simulate_heights(170, 7, 1000)

# Compute the summary statistics once and reuse them below.
# Note: np.std defaults to the population SD (ddof=0).
mean_height = np.mean(heights)
std_height = np.std(heights)

fig, (ax_hist, ax_qq) = plt.subplots(1, 2, figsize=(12, 6))

# Left panel: histogram with a KDE overlay and markers at the mean and +/- 1 SD.
sns.histplot(heights, bins=30, kde=True, ax=ax_hist)
ax_hist.set_title("Distribution of Heights")
ax_hist.set_xlabel("Height (cm)")
ax_hist.set_ylabel("Count")
ax_hist.axvline(mean_height, color='r', linestyle='--', label='Mean')
ax_hist.axvline(mean_height + std_height, color='g', linestyle='--', label='+1 SD')
ax_hist.axvline(mean_height - std_height, color='g', linestyle='--', label='-1 SD')
ax_hist.legend()

# Right panel: Q-Q plot against the normal distribution.
# Points lying close to the fitted line are consistent with normality.
stats.probplot(heights, plot=ax_qq)
ax_qq.set_title("Q-Q Plot of Heights")

plt.tight_layout()
plt.show()

# Fraction of simulated heights within 1 and 2 SDs of the sample mean.
within_1sd = np.mean(np.abs(heights - mean_height) < std_height)
within_2sd = np.mean(np.abs(heights - mean_height) < 2 * std_height)
print(f"Mean height: {mean_height:.2f} cm")
print(f"Standard deviation: {std_height:.2f} cm")
print(f"Percentage within 1 SD of mean: {within_1sd:.1%}")
print(f"Percentage within 2 SD of mean: {within_2sd:.1%}")
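
The percentages printed above can be checked against what a Gaussian distribution predicts. The following is a minimal sketch, not part of the original analysis: the helper names `coverage_within_k_sd` and `theoretical_coverage` and the fixed seed are assumptions made for illustration. It relies on the identity P(|Z| < k) = erf(k / √2) for a standard normal Z, which gives roughly 68%, 95%, and 99.7% for k = 1, 2, 3.

```python
import math

import numpy as np


def coverage_within_k_sd(sample, k):
    """Fraction of the sample within k standard deviations of its own mean."""
    sample = np.asarray(sample, dtype=float)
    mean, std = sample.mean(), sample.std()
    return float(np.mean(np.abs(sample - mean) < k * std))


def theoretical_coverage(k):
    """P(|Z| < k) for a standard normal Z, computed as erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))


# Hypothetical setup: a fixed seed and the same parameters as the height simulation.
rng = np.random.default_rng(0)
sample = rng.normal(170, 7, 1000)

for k in (1, 2, 3):
    empirical = coverage_within_k_sd(sample, k)
    expected = theoretical_coverage(k)
    print(f"Within {k} SD: empirical {empirical:.1%}, theoretical {expected:.1%}")
```

With 1,000 draws, the empirical fractions should land within a few percentage points of the theoretical values; a larger gap would suggest the data are not well described by a Gaussian.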


## 5. Summary

In this notebook, we explored the Gaussian distribution through:

1. **Real-world examples**: heights of people, measurement errors, and test scores
2. **Data generating process**: Understanding how the distribution emerges
3. **Implementation & visualization**: Using NumPy for random sampling and seaborn for visualization
4. **Interactive exploration**: Examining how the distribution changes with different parameters
5. **Practical applications**: Real-world use cases and applications

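The Gaussian distribution is central to statistics: by the central limit theorem, sums and averages of many independent random effects tend toward it, which is why it approximates quantities such as heights, measurement errors, and test scores.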