# Probability and Statistics for Machine Learning: Probability Distributions

## 3. Probability Distributions


### What are Probability Distributions?

A probability distribution specifies how likely each possible outcome of a random variable is. For a discrete variable it assigns a probability to each value; for a continuous variable it assigns a probability density, from which the probability of any range of values is obtained.

Probability distributions fall into two broad types:
1. **Discrete Probability Distributions**: Used for discrete random variables (variables that take values from a countable set, such as counts).
2. **Continuous Probability Distributions**: Used for continuous random variables (variables that can take on any value within a range).
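To make the distinction concrete, here is a minimal sketch using `scipy.stats`. The two distributions chosen (a fair six-sided die and a uniform variable on [0, 1]) are illustrative assumptions, not part of the text above. For the discrete case, probabilities attach to individual values and sum to 1; for the continuous case, probabilities attach to intervals and are read off the cumulative distribution function.

```python
from scipy.stats import randint, uniform

# Discrete: a fair six-sided die (illustrative choice).
# randint(low, high) covers the integers low, ..., high - 1.
die = randint(1, 7)
faces = range(1, 7)
pmf_values = [die.pmf(face) for face in faces]
total_probability = sum(pmf_values)  # PMF values sum to 1

# Continuous: a uniform variable on [0, 1] (illustrative choice).
# Probabilities are attached to intervals, via the CDF.
u = uniform(loc=0, scale=1)
prob_interval = u.cdf(0.75) - u.cdf(0.25)  # P(0.25 <= U <= 0.75)

print(pmf_values)
print(total_probability)
print(prob_interval)
```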

### Discrete Probability Distributions

- **Binomial Distribution**: Describes the number of successes in a fixed number of independent Bernoulli trials, where each trial results in a success or failure with the same success probability.
  
  Probability mass function (PMF):
  \[
  P(X = k) = inom{n}{k} p^k (1 - p)^{n - k}
  \]
  Where:
  - \( n \) is the number of trials.
  - \( k \) is the number of successes.
  - \( p \) is the probability of success in each trial.

### Example: Binomial Distribution

The probability of getting exactly 3 heads in 5 coin flips, with the probability of heads being 0.5, is calculated as:
    

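```python
from scipy.stats import binom

# Example: binomial distribution
n = 5    # Number of trials
p = 0.5  # Probability of success on each trial
k = 3    # Number of successes

# binom.pmf(k, n, p) evaluates P(X = k) for X ~ Binomial(n, p)
P_binomial = binom.pmf(k, n, p)
print(P_binomial)  # approximately 0.3125
```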
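
As a cross-check, the same probability can be computed directly from the PMF formula using only the standard library. This is a minimal sketch; the helper name `binomial_pmf` is hypothetical and introduced here only for illustration.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), straight from the PMF formula."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# 3 heads in 5 fair coin flips: C(5, 3) * 0.5^3 * 0.5^2 = 10 / 32
prob_three_heads = binomial_pmf(3, 5, 0.5)
print(prob_three_heads)  # 0.3125

# The PMF over all possible k (0 through 5) sums to 1
total = sum(binomial_pmf(k, 5, 0.5) for k in range(6))
print(total)  # 1.0
```

Working through the formula by hand gives the same value: there are 10 ways to choose which 3 of the 5 flips are heads, and each specific sequence has probability 1/32.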
    


### Continuous Probability Distributions

- **Normal Distribution**: One of the most widely used continuous distributions in machine learning, also known as the Gaussian distribution.

  Probability density function (PDF):
  \[
  f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
  \]
  Where:
  - \( \mu \) is the mean.
  - \( \sigma^2 \) is the variance (so \( \sigma \) is the standard deviation).
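
The density formula can be written out in NumPy and compared against `scipy.stats.norm.pdf`. This is a sketch under stated assumptions: the function name `normal_pdf` and the evaluation grid are hypothetical choices made for illustration. Note that the formula is parameterized by the variance, while SciPy's `scale` argument expects the standard deviation.

```python
import numpy as np
from scipy.stats import norm

def normal_pdf(x, mu, sigma2):
    """Normal density with mean mu and variance sigma2, from the PDF formula."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

x = np.linspace(-3, 3, 7)  # the points -3, -2, ..., 3
mu, sigma2 = 0.0, 1.0

manual = normal_pdf(x, mu, sigma2)
# SciPy parameterizes by standard deviation (scale), so pass sqrt(variance)
reference = norm.pdf(x, loc=mu, scale=np.sqrt(sigma2))

print(np.allclose(manual, reference))  # True
print(manual[3])  # density at x = 0, about 0.3989
```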

### Example: Normal Distribution

The PDF of a normal distribution with mean 0 and standard deviation 1 (standard normal distribution) can be plotted as follows:
    

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Example: standard normal distribution
x = np.linspace(-3, 3, 100)
mean = 0
std_dev = 1

# norm.pdf takes the standard deviation (scale), not the variance
plt.plot(x, norm.pdf(x, loc=mean, scale=std_dev))
plt.title('Normal Distribution (mean=0, std_dev=1)')
plt.xlabel('x')
plt.ylabel('Probability density')
plt.show()
```
    


### Other Probability Distributions

- **Poisson Distribution**: Describes the probability of a given number of events occurring in a fixed interval of time or space, assuming events occur independently at a constant average rate.
- **Exponential Distribution**: Models the time between consecutive events in a Poisson process.
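
The link between these two distributions can be sketched with `scipy.stats`. The rate of 2 events per unit time is an assumed value chosen purely for illustration. The check at the end relies on a standard fact: waiting longer than one unit of time for the first event is the same as observing zero events in that unit.

```python
from scipy.stats import poisson, expon

rate = 2.0  # assumed average number of events per unit time (illustrative)

# Poisson: probability of seeing exactly 3 events in one unit of time
p_three_events = poisson.pmf(3, mu=rate)

# Exponential: the waiting time between events has mean 1 / rate.
# SciPy's expon is parameterized by scale = 1 / rate.
waiting = expon(scale=1 / rate)
p_wait_over_one = waiting.sf(1.0)  # P(waiting time > 1)

# Consistency check: no events in one unit of time
p_zero_events = poisson.pmf(0, mu=rate)

print(p_three_events)
print(p_wait_over_one, p_zero_events)  # the two values agree
```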

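Probability distributions are widely used in machine learning for tasks such as:
- Modeling uncertainty in data and predictions.
- Estimating the likelihood of observed data under a model.
- Bayesian inference, where prior and posterior beliefs are expressed as distributions.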
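
As one illustration of likelihood estimation, the sketch below fits a normal distribution to synthetic data by maximum likelihood using `scipy.stats.norm.fit`. The "true" parameters, sample size, and random seed are assumptions made for the example, not values from the text.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Synthetic data: the "true" parameters below are illustrative assumptions
true_mean, true_std = 2.0, 1.5
data = rng.normal(loc=true_mean, scale=true_std, size=1000)

# Maximum likelihood estimates of the mean and standard deviation
mean_hat, std_hat = norm.fit(data)

# Log-likelihood of the data under the fitted model
log_likelihood = norm.logpdf(data, loc=mean_hat, scale=std_hat).sum()

print(mean_hat, std_hat)
print(log_likelihood)
```

For the normal distribution, these estimates coincide with the sample mean and the (biased, `ddof=0`) sample standard deviation, so no numerical optimization is needed in this case.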

    