**[Mushfikur R. Mahi](https://x.com/mushfikurahmaan)**  
Department of Economics  
Bangladesh National University  
The second week of October.

## Normal Distribution

<center>
    <img src="https://www.investopedia.com/thmb/Fni-D-yFwtaHBGKExU8v78yHNPc=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/The-Normal-Distribution1-51cb75a3e0a34eb6bbff7e966557757e.jpg" alt="description" width="auto">
</center>


### Introduction
Normal distribution, often referred to as the Gaussian distribution, is a fundamental concept in statistics and probability theory. It describes how data points are distributed in a symmetrical, bell-shaped curve, centered around the mean. This distribution is essential for various fields, including psychology, finance, natural and social sciences, and machine learning, as it helps in understanding and interpreting data trends.

### Definition
The normal distribution is a continuous probability distribution that is symmetric about its mean, with values near the mean occurring more frequently than values far from it. It is fully defined by two parameters: its mean and its standard deviation.

### Formula
The probability density function (PDF) of a normal distribution is defined by the following formula:

$$
f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}
$$

Where:
- $ f(x) $ is the probability density function.
- $ \mu $ is the mean of the distribution.
- $ \sigma $ is the standard deviation.
- $ e $ is the base of the natural logarithm (approximately equal to 2.71828).
- $ x $ represents the variable of interest.
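
As a minimal sketch, the density formula above can be evaluated directly with Python's standard library. The function name `normal_pdf` and the sample inputs are illustrative assumptions, not part of the original notes.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Evaluate the normal density f(x) for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Height of the standard normal curve (mu = 0, sigma = 1) at its peak
peak = normal_pdf(0.0, mu=0.0, sigma=1.0)
print(round(peak, 4))  # 0.3989
```

Note that $ f(x) $ is a density, not a probability: probabilities come from the area under the curve over an interval.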

### Key Characteristics
Normal distribution has several key characteristics that define its shape and behavior:

1. **Symmetry**: The distribution is symmetric around the mean ($ \mu $), meaning that the left and right sides of the curve are mirror images.

2. **Mean, Median, and Mode**: In a normal distribution, the mean, median, and mode are all equal and located at the center of the distribution.

3. **Bell-shaped Curve**: The graph of the normal distribution is bell-shaped, with its highest point at the mean and the density falling off smoothly on both sides.

4. **68-95-99.7 Rule**: Approximately 68% of the data falls within one standard deviation of the mean $ \mu \pm \sigma $, 95% falls within two standard deviations $ \mu \pm 2\sigma $, and 99.7% falls within three standard deviations $ \mu \pm 3\sigma $.

5. **Asymptotic**: The tails of the curve approach the horizontal axis but never actually touch it, indicating that there is a non-zero probability of extreme values.
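
The 68-95-99.7 rule can be checked numerically. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `coverage` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage(k):.4f}")
```

The three values come out near 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.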

### When to Use Normal Distribution
Normal distribution can be used in various scenarios, particularly when:

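- The quantity of interest is a sample mean and the sample size is large (a common rule of thumb is $ n > 30 $). By the Central Limit Theorem, the means of large independent samples drawn from a distribution with finite variance are approximately normally distributed, even when the underlying distribution is not normal.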
- The data is continuous and can be assumed to be symmetrically distributed around a central value.
- The underlying process generating the data is random, and the distribution of measurements can be described by a mean and standard deviation.
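
The Central Limit Theorem condition can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential source distribution, the sample size of 40, the number of repetitions, and the seed are all arbitrary choices made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 40     # n, chosen above the n > 30 rule of thumb
NUM_SAMPLES = 2000   # how many sample means to collect

# Draw from an exponential distribution (mean 1, standard deviation 1),
# which is strongly skewed and clearly not normal.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = 1.0 / SAMPLE_SIZE ** 0.5  # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (theory: 1.000)")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {expected_sd:.3f})")
```

The sample means cluster around the population mean with a spread close to $ \sigma / \sqrt{n} $, as the theorem predicts, even though individual draws are far from normal.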

### Real-World Applications
Normal distribution has a wide range of real-world applications across various fields:

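1. **Psychology**: Used in intelligence testing, where IQ scores are scaled to follow an approximately normal distribution.

2. **Finance**: Asset returns are often modeled as approximately normal, which simplifies risk assessment and portfolio management, although observed returns tend to have heavier tails than the normal distribution predicts.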

3. **Quality Control**: Manufacturers use normal distribution to monitor product quality, ensuring that measurements fall within specified limits.

4. **Natural and Social Sciences**: Many biological and social phenomena, such as heights, weights, and test scores, exhibit normal distribution patterns.

5. **Machine Learning**: Some methods rely on normality assumptions. For example, classical inference in linear regression assumes normally distributed errors, and Gaussian naive Bayes assumes normally distributed features within each class.

### Problem Example
**Problem**: A factory produces light bulbs with a mean lifespan of 1200 hours and a standard deviation of 100 hours. 

1. What percentage of light bulbs last between 1100 and 1300 hours?
   
**Solution**:
Using the 68-95-99.7 rule:
- The range of 1100 to 1300 hours extends one standard deviation (100 hours) on either side of the mean (1200 hours).
- Approximately 68% of the light bulbs will last between 1100 and 1300 hours.

For a more precise calculation, we can use the Z-score formula:

$$
Z = \frac{(X - \mu)}{\sigma}
$$

Calculating the Z-scores for 1100 and 1300 hours:
- For 1100 hours: 
  $$
  Z = \frac{(1100 - 1200)}{100} = -1
  $$
  
- For 1300 hours: 
  $$
  Z = \frac{(1300 - 1200)}{100} = 1
  $$

Using the standard normal distribution table, we find:
- The area to the left of $ Z = -1 $ is approximately 0.1587.
- The area to the left of $ Z = 1  $ is approximately 0.8413.

To find the area between $ Z = -1 $ and $ Z = 1 $:
$$
\text{Area} = 0.8413 - 0.1587 = 0.6826 \text{ or } 68.26\%
$$

Thus, approximately 68.26% of the light bulbs will last between 1100 and 1300 hours.
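
The same calculation can be reproduced without a table. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Lifespan model from the problem: mean 1200 hours, standard deviation 100 hours
lifespan = NormalDist(mu=1200, sigma=100)

# Z-scores of the two bounds
z_low = (1100 - lifespan.mean) / lifespan.stdev
z_high = (1300 - lifespan.mean) / lifespan.stdev

# Area under the curve between the bounds
share_between = lifespan.cdf(1300) - lifespan.cdf(1100)

print(z_low, z_high)            # -1.0 1.0
print(round(share_between, 4))  # 0.6827
```

The direct computation gives 0.6827 rather than 0.6826 because the table entries are themselves rounded to four decimal places.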

### Conclusion
The normal distribution is a critical concept in statistics, offering insights into data trends and variability. Its key characteristics, such as symmetry, the bell-shaped curve, and the 68-95-99.7 rule, provide a framework for understanding how data behaves in various fields. While the normal distribution is widely applicable to continuous data, discrete counts call for other models, such as the binomial and Poisson distributions covered in the following sections. Understanding these distributions equips researchers and practitioners with the tools needed to analyze and interpret data effectively.


---


## Binomial Distribution
<center>
    <img src="https://statisticsbyjim.com/wp-content/uploads/2022/10/negative_binomial_distribution_plot.png" alt="description" width="auto">
</center>

### Introduction

The **binomial distribution** is a foundational concept in statistics and probability theory. It is a discrete probability distribution that models the number of successes in a fixed number of independent trials, each with the same probability of success. This distribution is often encountered when dealing with scenarios where there are only two possible outcomes for each trial, such as success/failure, true/false, or yes/no. Due to its simplicity and wide applicability, the binomial distribution is crucial in fields like economics, biology, engineering, and data science.

### Definition

The binomial distribution describes the probability of obtaining exactly $  k $  successes in $  n $  independent trials of a binary experiment, where each trial has only two outcomes: success (with probability $  p $ ) and failure (with probability $  1 - p $ ).

The probability mass function (PMF) of the binomial distribution is defined as:

$$ 
P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}
$$ 

Where:
- $  P(X = k) $ : Probability of getting exactly $  k $  successes.
- $  n $ : Number of trials.
- $  k $ : Number of successes.
- $  p $ : Probability of success in each trial.
- $  1 - p $ : Probability of failure.
- $  \binom{n}{k} $ : The binomial coefficient, representing the number of ways to choose $  k $  successes from $  n $  trials, calculated as $  \frac{n!}{k!(n-k)!} $ .

### Formula

The PMF given in the definition is the general formula for the binomial distribution:

$$ 
P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}
$$ 

This formula allows us to calculate the probability of observing $  k $  successes in $  n $  trials when each trial has a success probability of $  p $ .

Some key quantities associated with the binomial distribution include:
- **Mean** ($  \mu $ ): $  \mu = np $ 
- **Variance** ($  \sigma^2 $ ): $  \sigma^2 = np(1 - p) $ 
- **Standard Deviation** ($  \sigma $ ): $  \sigma = \sqrt{np(1 - p)} $ 
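
A minimal sketch of the PMF in Python, using `math.comb` (available from Python 3.8) for the binomial coefficient. The function name `binomial_pmf` and the parameters $ n = 10 $, $ p = 0.3 $ are assumptions chosen only to check the mean and variance formulas numerically.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials with success probability p."""
    if not 0 <= k <= n:
        return 0.0
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative values
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                    # should be 1
mean = sum(k * prob for k, prob in enumerate(pmf))  # should equal n * p
variance = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))  # n * p * (1 - p)

print(f"total={total:.4f}, mean={mean:.4f}, variance={variance:.4f}")
```

Summing over all possible values of $ k $ recovers a total probability of 1, a mean of $ np = 3 $, and a variance of $ np(1 - p) = 2.1 $.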

### Key Characteristics

- **Discrete Distribution**: The binomial distribution is a discrete distribution, meaning it only takes integer values, corresponding to the number of successes.
- **Fixed Number of Trials**: The number of trials ($  n $ ) is fixed in advance.
- **Binary Outcomes**: Each trial results in either a success or failure.
- **Independence**: The outcome of each trial is independent of others.
- **Constant Probability**: The probability of success ($  p $ ) remains constant across all trials.

### When to Use Binomial Distribution

The binomial distribution is used when the following conditions are met:
1. **Binary outcomes**: There are only two possible outcomes for each trial.
2. **Fixed number of trials**: The number of trials ($  n $ ) is predetermined.
3. **Constant probability**: The probability of success ($  p $ ) remains the same for all trials.
4. **Independence**: Each trial is independent of the others.

This distribution is commonly used in situations where we are interested in counting the number of successes in a sequence of independent binary trials.

### Real-World Applications

The binomial distribution is applied in a wide range of real-world problems:
- **Business**: Predicting the number of successful sales calls in a given number of attempts.
- **Medicine**: Estimating the number of patients who will recover after a specific treatment.
- **Manufacturing**: Assessing the number of defective items in a batch of products.
- **Economics**: Analyzing voter behavior in elections where individuals either vote for or against a candidate.
- **Data Science**: Modeling the success rate of email campaigns or A/B testing outcomes.

### Problem Example

**Problem:**

Suppose a factory produces light bulbs, and the probability that a light bulb is defective is $  0.05 $ . If a random sample of 10 light bulbs is selected, what is the probability that exactly 2 bulbs will be defective?

**Solution:**

Here, we have:
- $  n = 10 $  (number of light bulbs tested),
- $  p = 0.05 $  (probability of a defective bulb),
- $  k = 2 $  (we want the probability of exactly 2 defective bulbs).

Using the binomial probability formula:

$$ 
P(X = 2) = \binom{10}{2} (0.05)^2 (1 - 0.05)^{8}
$$ 

First, calculate $  \binom{10}{2} $ :
$$ 
\binom{10}{2} = \frac{10!}{2!(10 - 2)!} = \frac{10 \times 9}{2} = 45
$$ 

Now, calculate $  P(X = 2) $ :
$$ 
P(X = 2) = 45 \times (0.05)^2 \times (0.95)^8
$$ 
$$ 
P(X = 2) \approx 45 \times 0.0025 \times 0.6634 \approx 0.0746
$$ 

Thus, the probability that exactly 2 light bulbs will be defective is approximately $  0.0746 $  or 7.46%.
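
As a quick check, the substitution can be carried out in Python without rounding the intermediate powers. This is a sketch using only the standard library; the variable names are illustrative.

```python
import math

n = 10     # bulbs sampled
p = 0.05   # probability that a single bulb is defective
k = 2      # number of defective bulbs of interest

ways = math.comb(n, k)  # binomial coefficient
probability = ways * p ** k * (1 - p) ** (n - k)

print(ways)                  # 45
print(f"{probability:.4f}")
```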

### Conclusion

The binomial distribution is a critical tool for modeling binary outcomes in a fixed number of trials. Its applications span across various industries, helping to solve practical problems such as quality control, medical studies, and business decisions. By understanding its characteristics, formula, and conditions for use, we can apply the binomial distribution to estimate probabilities in real-world scenarios effectively. Its simplicity, combined with broad applicability, makes it a core concept in statistics, especially in data-driven decision-making.


---

## Poisson Distribution
<center>
    <img src="https://www.scribbr.nl/wp-content/uploads/2022/08/Poisson-distribution-graph.webp" alt="description" width="auto">
</center>

### Introduction
The Poisson distribution is a fundamental probability distribution that models the number of events occurring within a fixed interval of time or space, provided these events occur independently. Named after the French mathematician Siméon Denis Poisson, it is widely used in various fields such as physics, finance, biology, and data science to model random processes. Understanding the Poisson distribution is essential for analyzing phenomena where events happen infrequently but are of great interest.

### Definition
The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events happening in a fixed interval of time or space, under the following conditions:
1. Events are independent.
2. The average rate (number of events per unit time or space) is constant.
3. Two events cannot occur simultaneously (i.e., no two events can happen at exactly the same time).

Mathematically, if the average number of events in a given time period is $\lambda$ (lambda), then the probability of observing $k$ events in that interval is given by the Poisson probability mass function (PMF).

### Formula
The probability mass function (PMF) of the Poisson distribution is:

$$
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}
$$

Where:
- $P(X = k)$ is the probability of observing $k$ events.
- $\lambda$ is the expected number of events (mean of the distribution).
- $k$ is the number of events (non-negative integer: $k = 0, 1, 2, \ldots$).
- $e$ is Euler’s number (approximately 2.71828).
- $k!$ is the factorial of $k$.
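
The PMF translates almost line for line into Python. The sketch below is an assumption-level illustration: the function name `poisson_pmf` and the choice $\lambda = 5$ are not from the original notes.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing exactly k events when the expected count is lam."""
    if k < 0:
        return 0.0
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Probabilities for k = 0..4 with an expected count of 5
for k in range(5):
    print(k, round(poisson_pmf(k, 5.0), 4))

# The probabilities over all k sum to 1 (the tail beyond k = 59 is negligible here)
total = sum(poisson_pmf(k, 5.0) for k in range(60))
print(round(total, 6))
```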

### Key Characteristics
The Poisson distribution has the following important characteristics:
1. **Mean and Variance**: Both the mean and the variance of a Poisson distribution are equal to $\lambda$:
   - Mean ($\mu$) = $\lambda$
   - Variance ($\sigma^2$) = $\lambda$
2. **Shape**: The shape of the distribution is positively skewed when $\lambda$ is small but becomes more symmetrical as $\lambda$ increases.
3. **Discrete Nature**: It is a discrete distribution, meaning it only applies to integer values for the number of events.
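4. **Memorylessness**: In a Poisson process, the number of events in one interval is independent of the number in any non-overlapping interval, so the probability of a future event does not depend on past events. Strictly speaking, the memoryless property belongs to the exponentially distributed waiting times between events rather than to the Poisson count itself.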
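
The first two characteristics can be checked numerically. The sketch below computes the mean, variance, and skewness from a truncated sum over the PMF; the helper names, the truncation point of 200, and the two values of $\lambda$ are assumptions made for illustration. The PMF is evaluated in log space so that large $k$ does not cause overflow.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson PMF computed in log space to avoid overflow for large k."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_moments(lam: float, k_max: int = 200):
    """Return (mean, variance, skewness) from a truncated sum over k = 0..k_max."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * q for k, q in zip(ks, probs))
    variance = sum((k - mean) ** 2 * q for k, q in zip(ks, probs))
    third = sum((k - mean) ** 3 * q for k, q in zip(ks, probs))
    skewness = third / variance ** 1.5
    return mean, variance, skewness

for lam in (1.0, 20.0):
    mean, variance, skewness = poisson_moments(lam)
    print(f"lambda={lam}: mean={mean:.3f}, variance={variance:.3f}, skewness={skewness:.3f}")
```

The mean and variance both match $\lambda$, and the skewness shrinks as $\lambda$ grows (it equals $1/\sqrt{\lambda}$), which is why the distribution looks more symmetrical for larger $\lambda$.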

### When to Use Poisson Distribution
The Poisson distribution is applicable when:
- You are interested in counting the number of times an event occurs within a specified interval of time or space.
- The events occur independently.
- The average rate $\lambda$ at which events happen is constant.
- Events occur randomly, with no simultaneous occurrences.

In simpler terms, if you are counting independent events that occur at a steady average rate (e.g., the number of earthquakes in a year or the number of cars passing through a toll booth in a minute), the Poisson distribution is appropriate.

### Real-World Applications
Poisson distribution is widely used across different fields:
1. **Business and Operations**: To model the number of customer arrivals at a service center or the number of phone calls received in a call center within an hour.
2. **Biology**: Modeling the distribution of mutations in a given length of DNA.
3. **Healthcare**: Estimating the number of patient admissions in emergency wards over a given period.
4. **Traffic Flow**: Predicting the number of cars passing through a checkpoint in a given time.
5. **Natural Disasters**: Modeling the occurrence of rare events like earthquakes, volcanic eruptions, or extreme weather conditions.
6. **E-commerce**: Estimating the number of website visits or transactions within a given timeframe.

### Problem Example
**Problem:**  
A call center receives an average of 5 calls per minute. What is the probability that the center will receive exactly 7 calls in the next minute?

**Solution:**  
Here, $\lambda = 5$ (the average number of calls per minute) and $k = 7$ (the number of calls we are interested in). Using the Poisson formula:

$$
P(X = 7) = \frac{5^7 e^{-5}}{7!}
$$

First, calculate the individual components:
- $5^7 = 78125$
- $e^{-5} \approx 0.00674$
- $7! = 5040$

Now, substitute these into the formula:

$$
P(X = 7) = \frac{78125 \times 0.00674}{5040} \approx 0.1044
$$

Thus, the probability that the call center will receive exactly 7 calls in the next minute is approximately 0.1044 or 10.44%.
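
The result can be confirmed with a few lines of Python. This is a sketch using only the standard library; the variable names are illustrative.

```python
import math

lam = 5  # average number of calls per minute
k = 7    # number of calls of interest

probability = lam ** k * math.exp(-lam) / math.factorial(k)

print(f"{probability:.4f}")  # 0.1044
```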

### Conclusion
The Poisson distribution provides an elegant and simple way to model the probability of rare events occurring over a fixed period of time or space. Its wide-ranging applications make it indispensable in fields as varied as business, healthcare, natural sciences, and more. By understanding the key characteristics and appropriate use cases of this distribution, one can make more informed decisions when analyzing and predicting the occurrence of events.
