# Data Distributions Notebook

### **Explanation of Various Data Distributions:**

1. **Normal Distribution**:
   - A continuous probability distribution characterized by a symmetric bell-shaped curve.
   - Defined by its mean (average) and standard deviation (spread).
   - Many natural phenomena (e.g., heights, test scores) approximate this distribution.

2. **Binomial Distribution**:
   - A discrete distribution that describes the number of successes in a fixed number of independent Bernoulli trials (each trial has two possible outcomes and the same probability of success).
   - Defined by the number of trials (n) and the probability of success (p).
   - Commonly used in scenarios like coin flips, quality control, etc.

3. **Poisson Distribution**:
   - A discrete distribution that expresses the probability of a given number of events occurring within a fixed interval of time or space, assuming events occur independently at a constant average rate.
   - Defined by a single parameter λ (lambda), which is the average number of occurrences in the interval.
   - Often used in queuing theory and for counting events over time (e.g., arrivals at a service point).

4. **Uniform Distribution**:
   - A continuous distribution where every value within a specified range is equally likely, so intervals of equal length have equal probability.
   - Defined by a lower bound (a) and an upper bound (b).
   - Used in scenarios where no value in the range is favored (e.g., a random number drawn between 0 and 1). Rolling a fair die is the discrete analogue.

5. **Beta Distribution**:
   - A continuous distribution defined on the interval [0, 1] characterized by two shape parameters, α (alpha) and β (beta).
   - Useful for modeling proportions and probabilities, particularly in Bayesian statistics, where it serves as the conjugate prior for a binomial success probability.

6. **Gamma Distribution**:
   - A continuous distribution often used to model waiting times and life durations.
   - Defined by two parameters: shape (k) and scale (θ).
   - Commonly applied in queuing models and in various fields such as insurance and reliability engineering.
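
Each distribution's parameters determine its mean and variance through standard closed-form formulas. The sketch below evaluates those formulas for the parameter values used in the code cell of this notebook. The function names are illustrative only and are not part of the notebook.

```python
# Theoretical mean and variance of each distribution, computed from its parameters.
# All function names here are hypothetical helpers for illustration.

def normal_moments(mu, sigma):
    return mu, sigma ** 2

def binomial_moments(n, p):
    return n * p, n * p * (1 - p)

def poisson_moments(lam):
    return lam, lam  # mean and variance are both lambda

def uniform_moments(a, b):
    return (a + b) / 2, (b - a) ** 2 / 12

def beta_moments(alpha, beta):
    total = alpha + beta
    return alpha / total, alpha * beta / (total ** 2 * (total + 1))

def gamma_moments(k, theta):
    return k * theta, k * theta ** 2

moments = {
    "Normal(0, 1)": normal_moments(0, 1),
    "Binomial(10, 0.5)": binomial_moments(10, 0.5),
    "Poisson(3)": poisson_moments(3),
    "Uniform(0, 10)": uniform_moments(0, 10),
    "Beta(2, 5)": beta_moments(2, 5),
    "Gamma(2, 2)": gamma_moments(2, 2),
}

for name, (mean, variance) in moments.items():
    print(f"{name:<18} mean={mean:.4f}  variance={variance:.4f}")
```

Comparing these values with the sample mean and variance of the generated data is a quick sanity check that the samples come from the intended distribution.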

### Usage Instructions:
1. **Run the code** to draw 1,000 random samples from each distribution and plot their histograms.
2. Feel free to modify the parameters of each distribution to see how the shape changes.
3. Explore other distributions by modifying the code as needed.
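
As one way to follow step 2 without plotting, the sketch below sweeps the shape parameter of the gamma distribution while holding the scale fixed and prints the sample mean and standard deviation for each setting. The helper `summarize_gamma`, the seed, and the sample size are assumptions made for this example.

```python
import numpy as np

# Fixed seed so the sketch is reproducible; the value itself is arbitrary.
rng = np.random.default_rng(42)

def summarize_gamma(shape, scale, size=100_000):
    """Draw gamma samples and return (sample mean, sample standard deviation)."""
    samples = rng.gamma(shape, scale, size)
    return samples.mean(), samples.std()

# Hold the scale at 2 and vary the shape parameter.
for shape in (1, 2, 5):
    mean, std = summarize_gamma(shape, scale=2)
    print(f"shape={shape}, scale=2 -> mean={mean:.2f}, std={std:.2f}")
```

For a gamma distribution the mean is shape × scale and the standard deviation is √shape × scale, so the printed values should land close to those figures. The same pattern applies to the other distributions by swapping in the matching generator method.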

In [None]:
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm, binom, poisson, uniform, beta, gamma

# Set style for seaborn
sns.set_theme(style="whitegrid")

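# Function to plot a histogram of sampled data
def plot_distribution(data, title):
    # Integer samples (binomial, Poisson) get one bar per value and no KDE curve,
    # because a smooth density estimate is misleading for discrete data
    discrete = np.issubdtype(np.asarray(data).dtype, np.integer)
    plt.figure(figsize=(10, 5))
    if discrete:
        sns.histplot(data, discrete=True, stat="probability")
    else:
        sns.histplot(data, bins=30, kde=True, stat="density", linewidth=0)
    plt.title(title)
    plt.xlabel("Value")
    plt.ylabel("Probability" if discrete else "Density")
    plt.grid(True)  # plt.grid() with no arguments toggles the grid, hiding the whitegrid lines
    plt.show()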

# 1. Normal Distribution
# Properties: Symmetrical, bell-shaped curve defined by mean and standard deviation
mu, sigma = 0, 1  # Mean and standard deviation
normal_data = np.random.normal(mu, sigma, 1000)
plot_distribution(normal_data, f"Normal Distribution (mean={mu}, std={sigma})")

# 2. Binomial Distribution
# Properties: Discrete distribution for the number of successes in a fixed number of trials
n, p = 10, 0.5  # Number of trials and probability of success
binomial_data = np.random.binomial(n, p, 1000)
plot_distribution(binomial_data, f"Binomial Distribution (n={n}, p={p})")

# 3. Poisson Distribution
# Properties: Discrete distribution that expresses the probability of a given number of events occurring in a fixed interval
lambda_poisson = 3  # Average rate (lambda)
poisson_data = np.random.poisson(lambda_poisson, 1000)
plot_distribution(poisson_data, f"Poisson Distribution (lambda={lambda_poisson})")

# 4. Uniform Distribution
# Properties: All outcomes are equally likely, defined by lower and upper bounds
lower, upper = 0, 10  # Bounds
uniform_data = np.random.uniform(lower, upper, 1000)
plot_distribution(uniform_data, f"Uniform Distribution (lower={lower}, upper={upper})")

# 5. Beta Distribution
# Properties: Continuous distribution defined on the interval [0, 1], characterized by two shape parameters
a, b = 2, 5  # Shape parameters
beta_data = np.random.beta(a, b, 1000)
plot_distribution(beta_data, f"Beta Distribution (a={a}, b={b})")

# 6. Gamma Distribution
# Properties: Continuous distribution that models waiting times, defined by shape and scale parameters
shape, scale = 2, 2  # Shape and scale parameters
gamma_data = np.random.gamma(shape, scale, 1000)
plot_distribution(gamma_data, f"Gamma Distribution (shape={shape}, scale={scale})")

# Side-by-side comparison: all six distributions in a 2x3 grid
plt.figure(figsize=(12, 8))

# Normal Distribution
plt.subplot(2, 3, 1)
sns.histplot(normal_data, bins=30, kde=True, stat="density", color="blue")
plt.title("Normal Distribution")
plt.xlabel("Value")
plt.ylabel("Density")

# Binomial Distribution
plt.subplot(2, 3, 2)
# discrete=True draws one bar per integer value; with unit-width bars, density equals probability
sns.histplot(binomial_data, discrete=True, stat="density", color="orange")
plt.title("Binomial Distribution")
plt.xlabel("Value")
plt.ylabel("Density")

# Poisson Distribution
plt.subplot(2, 3, 3)
# discrete=True draws one bar per integer value; with unit-width bars, density equals probability
sns.histplot(poisson_data, discrete=True, stat="density", color="green")
plt.title("Poisson Distribution")
plt.xlabel("Value")
plt.ylabel("Density")

# Uniform Distribution
plt.subplot(2, 3, 4)
sns.histplot(uniform_data, bins=30, kde=False, stat="density", color="red")
plt.title("Uniform Distribution")
plt.xlabel("Value")
plt.ylabel("Density")

# Beta Distribution
plt.subplot(2, 3, 5)
sns.histplot(beta_data, bins=30, kde=True, stat="density", color="purple")
plt.title("Beta Distribution")
plt.xlabel("Value")
plt.ylabel("Density")

# Gamma Distribution
plt.subplot(2, 3, 6)
sns.histplot(gamma_data, bins=30, kde=True, stat="density", color="brown")
plt.title("Gamma Distribution")
plt.xlabel("Value")
plt.ylabel("Density")

plt.tight_layout()
plt.show()
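
The notebook imports `scipy.stats` but never uses it. One possible use, sketched here under assumptions, is to check a histogram against the theoretical density. The example compares a normalized histogram of normal samples with `norm.pdf` evaluated at the bin centers. The larger sample size, the seed, and all variable names are choices made for this sketch, not part of the cell above.

```python
import numpy as np
from scipy.stats import norm

# Assumed parameters, matching the notebook's normal example
mu, sigma = 0, 1

# A larger sample than the notebook's 1000 so the comparison is less noisy
rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, 100_000)

# Empirical density: normalized histogram with 30 bins
density, edges = np.histogram(samples, bins=30, density=True)
centers = (edges[:-1] + edges[1:]) / 2

# Theoretical density evaluated at the bin centers
theoretical = norm.pdf(centers, loc=mu, scale=sigma)

max_abs_error = np.max(np.abs(density - theoretical))
print(f"Largest gap between histogram and PDF: {max_abs_error:.4f}")
```

To see the comparison rather than print it, `centers` and `theoretical` could be passed to `plt.plot` on top of the corresponding histogram.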