# Statistical Distributions: Synthetic Data Generation

This notebook generates synthetic data from four common statistical distributions, each paired with a real-world example:
- **Normal Distribution**: Heights of adults
- **Log-Normal Distribution**: Income distribution
- **Binomial Distribution**: Product quality control
- **Poisson Distribution**: Customer arrivals per hour

## 1. Import Required Libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

# Set random seed for reproducibility
np.random.seed(42)

# Set style for better visualizations
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

print("Libraries imported successfully!")
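
As a side note on reproducibility: `np.random.seed` sets NumPy's global legacy state, while newer NumPy code often uses a local `Generator` instead. The sketch below is an optional alternative, not what the rest of this notebook uses; the names `rng`, `rng_again`, `sample_a`, and `sample_b` are illustrative assumptions.

```python
import numpy as np

# Assumption: a local Generator stands in for the global np.random.seed(42) call.
rng = np.random.default_rng(42)
sample_a = rng.normal(loc=175, scale=7, size=5)

# Re-creating the generator with the same seed reproduces the same draws.
rng_again = np.random.default_rng(42)
sample_b = rng_again.normal(loc=175, scale=7, size=5)

print(np.array_equal(sample_a, sample_b))  # True: identical seeds give identical samples
```

Passing `rng` explicitly to functions keeps each experiment's randomness isolated, which can matter once cells are run out of order.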

## 2. Normal Distribution (Gaussian)

**Real-World Example**: Heights of adult humans

The normal distribution is symmetric and bell-shaped, and is fully described by its mean and standard deviation. Many natural phenomena follow it approximately, including:
- Human heights and weights
- Blood pressure measurements
- IQ scores
- Measurement errors
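
One practical consequence of the bell shape is the 68-95-99.7 rule: the share of values within 1, 2, and 3 standard deviations of the mean is the same for every normal distribution. A minimal sketch using the standard normal CDF from SciPy (the name `coverage_by_k` is illustrative):

```python
from scipy import stats

# Probability mass within k standard deviations of the mean,
# computed on the standard normal (mean 0, SD 1).
coverage_by_k = {}
for k in (1, 2, 3):
    coverage_by_k[k] = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"Within {k} SD: {coverage_by_k[k]:.4f}")
# Within 1 SD: 0.6827
# Within 2 SD: 0.9545
# Within 3 SD: 0.9973
```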

In [None]:
# Generate data: Heights of adult males (in cm)
# Mean height: 175 cm, Standard deviation: 7 cm
mean_height = 175
std_height = 7
sample_size = 1000

heights = np.random.normal(mean_height, std_height, sample_size)

# Create visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Histogram with the theoretical normal PDF overlaid
axes[0].hist(heights, bins=30, density=True, alpha=0.7, color='skyblue', edgecolor='black')
x = np.linspace(heights.min(), heights.max(), 100)
axes[0].plot(x, stats.norm.pdf(x, mean_height, std_height), 'r-', linewidth=2, label='PDF')
axes[0].set_title('Normal Distribution: Adult Male Heights', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Height (cm)')
axes[0].set_ylabel('Density')
axes[0].legend()
axes[0].grid(alpha=0.3)

# Box plot
axes[1].boxplot(heights)  # vertical orientation is the default
axes[1].set_title('Box Plot of Heights', fontsize=14, fontweight='bold')
axes[1].set_ylabel('Height (cm)')
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()

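# Summary statistics of the generated sample
print(f"Mean: {heights.mean():.2f} cm")
# ddof=1 gives the sample standard deviation (NumPy defaults to the population formula)
print(f"Standard Deviation: {heights.std(ddof=1):.2f} cm")
print(f"Min: {heights.min():.2f} cm")
print(f"Max: {heights.max():.2f} cm")
# Theoretical range from the distribution parameters, not measured from the sample
print(f"About 68% of values expected within mean ± 1 SD: [{mean_height-std_height:.2f}, {mean_height+std_height:.2f}] cm")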
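
The interval printed above comes from the distribution parameters, so it can be useful to check how closely a generated sample matches it. The sketch below is self-contained and draws its own sample with a local generator (an assumption, so the exact values differ from the `heights` array above); `fraction_within` is an illustrative name.

```python
import numpy as np

# Assumption: a fresh sample drawn with a local Generator, same parameters as above.
rng = np.random.default_rng(42)
mean_height, std_height, sample_size = 175, 7, 1000
heights = rng.normal(mean_height, std_height, sample_size)

# Empirical share of heights within one standard deviation of the mean
lower, upper = mean_height - std_height, mean_height + std_height
fraction_within = np.mean((heights >= lower) & (heights <= upper))
print(f"Fraction within 1 SD: {fraction_within:.3f}")
```

With 1000 draws the fraction should land close to 0.68, with some sampling noise; larger samples tighten the match.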