<img src="./images/banner.png" width="800">

# Normal Distributions

Welcome to the lecture on Normal Distributions! In this Jupyter Notebook, we will explore one of the most important probability distributions in statistics: the **Normal Distribution**, also known as the **Gaussian Distribution**. 


Before we dive into the Normal Distribution, let's briefly review some fundamental concepts in probability theory. Probability is a measure of the likelihood that an event will occur. It is expressed as a number between 0 and 1, where 0 indicates impossibility and 1 indicates certainty. The probabilities of all mutually exclusive outcomes in a given scenario always sum to 1.


<img src="./images/probability.png" width="600">

A **probability distribution** is a function that describes the likelihood of different outcomes in a random experiment. There are two main types of probability distributions: discrete and continuous. Discrete probability distributions deal with random variables that can only take on specific, countable values, and they assign a probability to each possible value. Continuous probability distributions, like the Normal Distribution, deal with random variables that can take on any value within a specified range, and they assign probabilities to ranges of values rather than to individual values.


The Normal Distribution is a continuous probability distribution that is symmetrical about its mean, with data near the mean being more frequent in occurrence than data far from the mean. This distribution is widely used in various fields, including natural and social sciences, because many real-world phenomena can be approximated by the Normal Distribution.


Throughout this lecture, we will cover the following topics:

1. Properties of a Normal Distribution
2. Standard Normal Distribution
3. Probability Density Function (PDF) and Cumulative Distribution Function (CDF)
4. Empirical Rule (68-95-99.7 Rule)
5. Applications of Normal Distributions


By the end of this lecture, you will have a solid understanding of Normal Distributions and their applications in real-world scenarios. Let's dive in!


**Table of contents**<a id='toc0_'></a>    
- [Properties of a Normal Distribution](#toc1_)    
- [Standard Normal Distribution](#toc2_)    
  - [Standardizing Normal Distributions](#toc2_1_)    
  - [Properties of the Standard Normal Distribution](#toc2_2_)    
  - [Z-tables and Probability Calculations](#toc2_3_)    
- [Probability Density Function (PDF) and Cumulative Distribution Function (CDF)](#toc3_)    
  - [Probability Density Function (PDF)](#toc3_1_)    
  - [Cumulative Distribution Function (CDF)](#toc3_2_)    
- [Empirical Rule (68-95-99.7 Rule)](#toc4_)    
- [Applications of Normal Distributions](#toc5_)    
- [Exercise: Normal Distribution Properties and Applications](#toc6_)    
  - [Solution](#toc6_1_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=2
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

## <a id='toc1_'></a>[Properties of a Normal Distribution](#toc0_)

<img src="./images/normal-distribution.png" width="800">

<img src="./images/variance.png" width="800">

A Normal Distribution is characterized by several key properties that distinguish it from other probability distributions. Understanding these properties is essential for working with Normal Distributions and applying them to real-world problems.

1. **Bell-shaped curve**: The Normal Distribution is represented by a symmetric, bell-shaped curve known as the "Gaussian curve" or "bell curve." The peak of the curve represents the mean (μ) of the distribution, and the curve is symmetric about this mean.

2. **Mean, median, and mode**: In a Normal Distribution, the mean, median, and mode are all equal. This is a result of the distribution's symmetry.

3. **Symmetry**: The Normal Distribution is symmetric about its mean. This means that the left and right halves of the distribution are mirror images of each other.

4. **Asymptotes**: The tails of the Normal Distribution curve approach the x-axis but never touch it. These tails extend infinitely in both directions, meaning that the range of the Normal Distribution is from negative infinity to positive infinity.

5. **Area under the curve**: The total area under the Normal Distribution curve is equal to 1. This property allows us to calculate probabilities by finding the area under the curve between specific points.

6. **Parametric distribution**: The Normal Distribution is a parametric distribution, which means it is fully described by its parameters: the mean (μ) and the standard deviation (σ). The mean determines the location of the center of the distribution, while the standard deviation determines its spread: a larger σ produces a wider, flatter curve, and a smaller σ produces a narrower, taller one.

   - The mathematical formula for the Normal Distribution is:

     $f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}$

     where:
     - $f(x)$ is the probability density function (PDF)
     - $\mu$ is the mean
     - $\sigma$ is the standard deviation
     - $\pi$ is the mathematical constant pi (approximately 3.14159)
     - $e$ is the mathematical constant e (approximately 2.71828)

7. **Empirical rule**: The Empirical Rule, also known as the 68-95-99.7 Rule, states that for a Normal Distribution:
   - Approximately 68% of the data falls within one standard deviation of the mean (μ ± σ).
   - Approximately 95% of the data falls within two standard deviations of the mean (μ ± 2σ).
   - Approximately 99.7% of the data falls within three standard deviations of the mean (μ ± 3σ).

These properties make the Normal Distribution a valuable tool for modeling and analyzing real-world phenomena, as many natural processes and measurements tend to follow a Normal Distribution.
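
Several of the properties above can be checked numerically. The sketch below is illustrative only: it uses Python's standard-library `statistics.NormalDist` (Python 3.8+), the parameters μ = 100 and σ = 15 are arbitrary choices, and `normal_pdf` is a helper name introduced here, not part of any library.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Evaluate the Normal PDF formula at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coeff * math.exp(exponent)

mu, sigma = 100.0, 15.0  # arbitrary illustrative parameters
dist = NormalDist(mu, sigma)

# The hand-written formula agrees with the library implementation.
formula_matches = math.isclose(normal_pdf(110.0, mu, sigma), dist.pdf(110.0))

# Mean, median, and mode coincide.
centers_equal = dist.mean == dist.median == dist.mode

# Symmetry: points equally far from the mean have equal density.
is_symmetric = math.isclose(normal_pdf(mu - 20, mu, sigma),
                            normal_pdf(mu + 20, mu, sigma))

# Area under the curve: midpoint Riemann sum over mu ± 8 sigma.
n_steps = 24_000
lo = mu - 8 * sigma
step = 16 * sigma / n_steps
area = step * sum(normal_pdf(lo + (i + 0.5) * step, mu, sigma)
                  for i in range(n_steps))

print(formula_matches, centers_equal, is_symmetric, round(area, 6))
```

The integration range stops at μ ± 8σ because the tails beyond that point contribute a negligible amount to the total area.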

## <a id='toc2_'></a>[Standard Normal Distribution](#toc0_)

The Standard Normal Distribution, also known as the Z-distribution, is a special case of the Normal Distribution with a mean of 0 and a standard deviation of 1. It is denoted as:

$Z \sim N(0, 1)$

The Standard Normal Distribution is essential because it allows us to compare and standardize data from different Normal Distributions. By transforming data from any Normal Distribution into the Standard Normal Distribution, we can calculate probabilities, quantiles, and other statistical measures using a single, standardized scale.

<img src="./images/standard-normal-dist.png" width="800">

### <a id='toc2_1_'></a>[Standardizing Normal Distributions](#toc0_)


To convert a random variable X from a Normal Distribution with mean μ and standard deviation σ to a Standard Normal Distribution, we use the following formula:

$Z = \frac{X - \mu}{\sigma}$

This process is called standardization or Z-score normalization. The resulting Z-score represents the number of standard deviations an observation is away from the mean.


For example, suppose we have a Normal Distribution with a mean of 100 and a standard deviation of 15. If we observe a value of 115, we can calculate its Z-score as follows:

$Z = \frac{115 - 100}{15} = 1$

This means that the observation of 115 is 1 standard deviation above the mean.
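
The same calculation can be written as a small function. This is a minimal sketch; `z_score` is a name chosen here for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

z_above = z_score(115, mu=100, sigma=15)  # one standard deviation above the mean
z_below = z_score(85, mu=100, sigma=15)   # one standard deviation below the mean
print(z_above, z_below)
```

A negative Z-score simply indicates an observation below the mean.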


### <a id='toc2_2_'></a>[Properties of the Standard Normal Distribution](#toc0_)


1. **Mean**: The mean of the Standard Normal Distribution is always 0 (μ = 0).

2. **Standard Deviation**: The standard deviation of the Standard Normal Distribution is always 1 (σ = 1).
    - With μ = 0 and σ = 1, the probability density function (PDF) simplifies to:

      $f(z) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}z^2}$

      where:
      - $f(z)$ is the probability density function (PDF)
      - $z$ is the standard score (Z-score)
      - $\pi$ is the mathematical constant pi (approximately 3.14159)
      - $e$ is the mathematical constant e (approximately 2.71828)

3. **Symmetry**: The Standard Normal Distribution is symmetric about its mean (0).

4. **Area under the curve**: The total area under the Standard Normal Distribution curve is equal to 1.

5. **Probability calculations**: Because the Standard Normal Distribution is standardized, we can use pre-calculated tables or statistical software to find the probability of observing a value less than, greater than, or between specific Z-scores.

6. **Converting between Z-scores and raw scores**:
    - To convert a raw score (x) from a Normal Distribution to a Z-score, use the formula:
      $z = \frac{x - \mu}{\sigma}$
    - To convert a Z-score back to a raw score (x) in a Normal Distribution, use the formula:
      $x = \mu + z\sigma$
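
As a rough illustration of the two conversion formulas and of the standard normal PDF, the sketch below converts a Z-score to a raw score and back, and evaluates the density at its peak. The helper names (`to_z`, `to_raw`, `standard_normal_pdf`) and the parameters μ = 100, σ = 15 are assumptions made for this example.

```python
import math

def to_z(x, mu, sigma):
    """Raw score -> Z-score."""
    return (x - mu) / sigma

def to_raw(z, mu, sigma):
    """Z-score -> raw score."""
    return mu + z * sigma

def standard_normal_pdf(z):
    """PDF of the Standard Normal Distribution (mean 0, standard deviation 1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

mu, sigma = 100.0, 15.0        # illustrative parameters
raw = to_raw(1.5, mu, sigma)   # 1.5 standard deviations above the mean
back = to_z(raw, mu, sigma)    # converting back recovers the Z-score
peak = standard_normal_pdf(0.0)  # height of the curve at the mean, 1 / sqrt(2*pi)

print(raw, back, round(peak, 4))
```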

### <a id='toc2_3_'></a>[Z-tables and Probability Calculations](#toc0_)


Z-tables, also known as Standard Normal tables, are used to find the probability of observing a value less than or greater than a given Z-score. These tables list the cumulative probabilities for various Z-scores.


<img src="./images/z-table.webp" width="600">

<img src="./images/z-table-full.jpeg" width="800">

For example, to find the probability of observing a Z-score less than 1.5, we look up the entry for 1.50 in the Z-table. The entry, approximately 0.9332, is the probability of observing a value less than 1.5 in a Standard Normal Distribution.


Most statistical software packages, such as Python's SciPy library or R's base functions, provide built-in functions to calculate probabilities and quantiles for the Standard Normal Distribution, eliminating the need for manual table lookups.
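
As one possible illustration of a software lookup, the sketch below uses Python's standard-library `statistics.NormalDist` (Python 3.8+) rather than SciPy, purely to keep the example dependency-free. The variable names are chosen here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # defaults to mean 0 and standard deviation 1

# Forward lookup: probability of a Z-score below 1.5 (what a Z-table lists).
p_below = std_normal.cdf(1.5)        # about 0.9332

# Probability of a Z-score above 1.5 is the complement.
p_above = 1.0 - p_below              # about 0.0668

# Reverse lookup: the Z-score with 95% of the distribution below it.
z_95 = std_normal.inv_cdf(0.95)      # about 1.645

print(round(p_below, 4), round(p_above, 4), round(z_95, 3))
```

The `cdf` call plays the role of reading a table entry, and `inv_cdf` plays the role of searching the table body for a probability and reading off the Z-score.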


In the next section, we will discuss the Probability Density Function (PDF) and Cumulative Distribution Function (CDF) of the Normal Distribution, which are essential for understanding the properties and applications of the Normal Distribution.

## <a id='toc3_'></a>[Probability Density Function (PDF) and Cumulative Distribution Function (CDF)](#toc0_)

To fully understand the Normal Distribution, it is essential to familiarize ourselves with two key concepts: the Probability Density Function (PDF) and the Cumulative Distribution Function (CDF). These functions help us calculate probabilities and quantiles for the Normal Distribution.


<img src="./images/cdf-pdf-pmf.png" width="800">

<img src="./images/cdf-pdf-pmf-2.jpeg" width="800">

### <a id='toc3_1_'></a>[Probability Density Function (PDF)](#toc0_)


The Probability Density Function (PDF) of a continuous random variable X is a function that describes the relative likelihood of X taking on a specific value. For the Normal Distribution, the PDF is given by:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}$

where:
- $\mu$ is the mean of the distribution
- $\sigma$ is the standard deviation of the distribution
- $\pi$ is the mathematical constant pi (approximately 3.14159)
- $e$ is the mathematical constant e (approximately 2.71828)


The PDF has the following properties:

1. The total area under the PDF curve is equal to 1.
2. The PDF is non-negative everywhere, i.e., $f(x) \geq 0$ for all x.
3. The probability of observing a value between a and b is given by the area under the PDF curve between a and b.


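It is important to note that the PDF does not directly give the probability of observing a specific value. For a continuous random variable, the probability of any single exact value is 0, and $f(x)$ is a density that can even exceed 1. Probabilities come from the area under the PDF curve over a range of values.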
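
To make the distinction between density and probability concrete, here is a small sketch. It assumes a deliberately narrow distribution (σ = 0.1, chosen only for illustration) and approximates the area under the PDF with the trapezoidal rule; `interval_probability` is a helper name introduced for this example.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0.0, sigma=0.1)  # illustrative, deliberately narrow

# A density value is not a probability: here it is larger than 1.
density_at_mean = narrow.pdf(0.0)       # about 3.99

def interval_probability(dist, a, b, n=10_000):
    """Approximate P(a < X < b) as the area under the PDF (trapezoidal rule)."""
    h = (b - a) / n
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    total += sum(dist.pdf(a + i * h) for i in range(1, n))
    return total * h

p_numeric = interval_probability(narrow, -0.1, 0.1)  # area within one sigma
p_exact = narrow.cdf(0.1) - narrow.cdf(-0.1)         # same quantity via the CDF

print(round(density_at_mean, 2), round(p_numeric, 4), round(p_exact, 4))
```

Even though the density at the mean is close to 4, the area over any interval never exceeds 1.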


### <a id='toc3_2_'></a>[Cumulative Distribution Function (CDF)](#toc0_)


The Cumulative Distribution Function (CDF) of a random variable X is a function that gives the probability of observing a value less than or equal to a given value x. For the Normal Distribution, the CDF is given by:

$F(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}(\frac{t-\mu}{\sigma})^2} dt$

where:
- $\mu$ is the mean of the distribution
- $\sigma$ is the standard deviation of the distribution
- $\pi$ is the mathematical constant pi (approximately 3.14159)
- $e$ is the mathematical constant e (approximately 2.71828)
- $t$ is a dummy variable of integration


The CDF has the following properties:

1. The CDF is a non-decreasing function, i.e., if $a < b$, then $F(a) \leq F(b)$.
2. The CDF is bounded between 0 and 1, i.e., $0 \leq F(x) \leq 1$ for all x.
3. As x approaches negative infinity, the CDF approaches 0, and as x approaches positive infinity, the CDF approaches 1.


The CDF is particularly useful for calculating probabilities and quantiles. To find the probability of observing a value less than or equal to x, we simply evaluate the CDF at x. To find the probability of observing a value between a and b, we calculate $F(b) - F(a)$.


In practice, we often use statistical software or pre-calculated tables (such as the Z-table for the Standard Normal Distribution) to evaluate the CDF and calculate probabilities.
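
The integral above has no closed form in elementary functions, but it can be expressed with the error function as $F(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$. The sketch below uses that identity with `math.erf` from the standard library; the function names and the example parameters (μ = 100, σ = 15) are assumptions made for illustration.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a Normal Distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a < X < b) computed as F(b) - F(a)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

half = normal_cdf(100, mu=100, sigma=15)        # the mean splits the area in half
p = prob_between(85, 115, mu=100, sigma=15)     # within one sigma, about 0.6827

print(half, round(p, 4))
```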


In the next section, we will discuss the Empirical Rule (68-95-99.7 Rule), which provides a quick way to estimate the probability of observing values within certain ranges of the mean in a Normal Distribution.

## <a id='toc4_'></a>[Empirical Rule (68-95-99.7 Rule)](#toc0_)

The Empirical Rule, also known as the 68-95-99.7 Rule or the Three Sigma Rule, is a quick and easy way to estimate the probability of observing values within certain ranges of the mean in a Normal Distribution. This rule is based on the properties of the Standard Normal Distribution and the fact that the Normal Distribution is symmetric about its mean.


The Empirical Rule states that for a Normal Distribution:

1. Approximately 68% of the data falls within 1 standard deviation of the mean, i.e., within the range $(\mu - \sigma, \mu + \sigma)$.
2. Approximately 95% of the data falls within 2 standard deviations of the mean, i.e., within the range $(\mu - 2\sigma, \mu + 2\sigma)$.
3. Approximately 99.7% of the data falls within 3 standard deviations of the mean, i.e., within the range $(\mu - 3\sigma, \mu + 3\sigma)$.


Here's a visual representation of the Empirical Rule:


<img src="./images/68-95-99.7-rule.png" width="800">

To use the Empirical Rule, follow these steps:

1. Identify the mean (μ) and standard deviation (σ) of the Normal Distribution.
2. Determine the range of interest, i.e., within 1, 2, or 3 standard deviations of the mean.
3. Use the corresponding percentage from the Empirical Rule to estimate the probability of observing a value within that range.


For example, suppose we have a Normal Distribution with a mean of 100 and a standard deviation of 10. To estimate the probability of observing a value between 80 and 120, we first note that 80 is 2 standard deviations below the mean, and 120 is 2 standard deviations above the mean. Using the Empirical Rule, we know that approximately 95% of the data falls within 2 standard deviations of the mean. Therefore, the probability of observing a value between 80 and 120 is approximately 0.95 or 95%.


It is important to note that the Empirical Rule provides an approximation and is most accurate for distributions that are nearly normal. For exact probabilities or more complex problems, it is better to use the Probability Density Function (PDF), Cumulative Distribution Function (CDF), or statistical software.
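
To see how close the rule's rounded percentages are to the exact values, the sketch below computes the probability within k standard deviations using the standard-library `statistics.NormalDist`. The helper name `within_k_sigma` is introduced here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def within_k_sigma(k):
    """Exact probability of falling within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

coverage = {k: within_k_sigma(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} sigma: {p:.4f}")   # about 0.6827, 0.9545, 0.9973

# The worked example above: mean 100, standard deviation 10, range 80 to 120.
scores = NormalDist(mu=100, sigma=10)
p_80_120 = scores.cdf(120) - scores.cdf(80)   # about 0.9545
print(f"P(80 < X < 120) = {p_80_120:.4f}")
```

The exact two-sigma probability is slightly above 95%, which is why the rule is described as an approximation.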


In the next section, we will explore some applications of the Normal Distribution in various fields.

## <a id='toc5_'></a>[Applications of Normal Distributions](#toc0_)

The Normal Distribution is used across a wide range of fields. Some of the most common applications include:

1. **Natural and Social Sciences**:
   - In biology, the Normal Distribution can model the distribution of various physical characteristics, such as height, weight, or blood pressure, in a population.
   - In psychology, the Normal Distribution is used to model the distribution of intelligence quotient (IQ) scores or personality traits.
   - In physics, the Normal Distribution is used to model the distribution of measurement errors or the velocities of particles in a gas.

2. **Quality Control and Manufacturing**:
   - The Normal Distribution is used to model the variation in product dimensions, weights, or other quality characteristics.
   - By setting acceptable limits based on the properties of the Normal Distribution (e.g., within 2 standard deviations of the mean), manufacturers can ensure that their products meet quality standards.
   - The Six Sigma methodology, which aims to minimize defects and improve quality, relies heavily on the properties of the Normal Distribution.

3. **Financial Markets and Economics**:
   - In finance, the Normal Distribution is often used to model the returns of financial assets, such as stocks or bonds, over short time periods.
   - The Black-Scholes model, which is used for pricing options, assumes that the underlying asset's log returns follow a Normal Distribution (equivalently, that prices are log-normally distributed).
   - In economics, the Normal Distribution can be used to model some economic variables within a population. Strongly right-skewed quantities such as income are usually better described after a log transformation.

4. **Hypothesis Testing and Confidence Intervals**:
   - Many statistical tests, such as the t-test or the Z-test, assume that the data, or at least the sampling distribution of the test statistic, is approximately normal.
   - The properties of the Normal Distribution are used to construct confidence intervals for population parameters, such as the mean or the proportion.

5. **Machine Learning and Data Science**:
   - Some machine learning algorithms build in normality assumptions: Gaussian Naive Bayes assumes that each feature is normally distributed within each class, and classical inference for Linear Regression assumes normally distributed errors.
   - In data preprocessing, features are often standardized with the Z-score formula (subtract the mean, divide by the standard deviation), which can improve the performance of some machine learning models.

6. **Environmental Sciences and Climatology**:
   - The Normal Distribution can be used to model the distribution of temperature, precipitation, or other environmental variables over time or space.
   - Climate models often assume that certain variables, such as the concentration of greenhouse gases, follow a Normal Distribution.

7. **Telecommunications and Signal Processing**:
   - In signal processing, the Normal Distribution is used to model the distribution of noise in communication channels.
   - The properties of the Normal Distribution are used to design filters and estimate the signal-to-noise ratio in telecommunication systems.


These are just a few examples of the many applications of the Normal Distribution. Its versatility and well-understood properties make it a valuable tool in numerous fields, from the natural and social sciences to engineering and finance.


It is important to note that while the Normal Distribution is widely applicable, it is not always the most appropriate model for every situation. Researchers and practitioners should always consider the underlying assumptions and the nature of their data before applying the Normal Distribution or any other statistical model.
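
As a rough sketch of the quality-control idea mentioned above, the simulation below draws measurements from a Normal Distribution and checks what fraction falls within 2 standard deviations of the target. Everything here is hypothetical: the part dimension (50 mm), the standard deviation (0.2 mm), the sample size, and the seed are made up for illustration.

```python
import random
from statistics import NormalDist

rng = random.Random(0)          # fixed seed so the run is repeatable
target, sd = 50.0, 0.2          # hypothetical part dimension and spread, in mm

# Simulate measured dimensions of manufactured parts.
measurements = [rng.gauss(target, sd) for _ in range(100_000)]

# Acceptance limits set at two standard deviations around the target.
lower, upper = target - 2 * sd, target + 2 * sd
in_spec = sum(lower <= m <= upper for m in measurements) / len(measurements)

# Fraction predicted by the Normal model for the same limits.
model = NormalDist(target, sd)
expected = model.cdf(upper) - model.cdf(lower)   # about 0.9545

print(f"simulated: {in_spec:.4f}, predicted: {expected:.4f}")
```

With real data, a large gap between the simulated and predicted fractions would be one sign that the Normal model does not fit well.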

<img src="../images/exercise-banner.gif" width="800">

## <a id='toc6_'></a>[Exercise: Normal Distribution Properties and Applications](#toc0_)

In this exercise, you will apply your knowledge of Normal Distributions to solve various problems. Use the following information to answer the questions below:

A company manufactures light bulbs with a mean life of 1000 hours and a standard deviation of 100 hours. The lifespan of these light bulbs follows a Normal Distribution.

1. What is the probability that a randomly selected light bulb will last between 900 and 1100 hours? Use the Empirical Rule to solve this problem.

2. Calculate the Z-scores for the following light bulb lifespans:
   - a. 1200 hours
   - b. 850 hours

3. The company wants to identify the top 5% longest-lasting light bulbs for a premium product line. What is the minimum lifespan (in hours) a light bulb must have to be included in this top 5%? Use the Z-score table to solve this problem.

4. Suppose the company decides to offer a warranty for light bulbs that last less than 800 hours. What percentage of light bulbs will be covered under this warranty? Use the Empirical Rule to estimate this value.

5. The company plans to introduce a new line of energy-efficient light bulbs with a mean life of 1200 hours. The standard deviation is expected to be 20% less than the current light bulbs. Calculate the probability that a randomly selected energy-efficient light bulb will last between 1100 and 1300 hours. Use the Z-score table to solve this problem.


> *Hint: For questions 3 and 5, you can use the Z-score table to find the appropriate Z-score and then convert it back to the original scale using the mean and standard deviation.*

### <a id='toc6_1_'></a>[Solution](#toc0_)


1. Using the Empirical Rule, we know that approximately 68% of the data falls within one standard deviation of the mean (μ ± σ). In this case, one standard deviation is 100 hours, so the range is 900 to 1100 hours. Therefore, the probability that a randomly selected light bulb will last between 900 and 1100 hours is approximately 0.68 or 68%.

2. To calculate the Z-scores, we use the formula: $Z = \frac{x - \mu}{\sigma}$
   - a. For 1200 hours: $Z = \frac{1200 - 1000}{100} = 2$
   - b. For 850 hours: $Z = \frac{850 - 1000}{100} = -1.5$

3. To find the minimum lifespan for the top 5% longest-lasting light bulbs, we need to find the Z-score that corresponds to the 95th percentile (100% - 5% = 95%). Using the Z-score table, we find that the Z-score for the 95th percentile is approximately 1.645. Now, we can convert this Z-score back to the original scale using the formula: $x = \mu + Z\sigma$

   $x = 1000 + 1.645 \times 100 = 1164.5$

   Therefore, the minimum lifespan for a light bulb to be included in the top 5% is approximately 1164.5 hours.

4. Using the Empirical Rule, we know that approximately 95% of the data falls within two standard deviations of the mean (μ ± 2σ). The remaining 5% lies outside this range, and because the distribution is symmetric, it is split evenly between the two tails. This means that about 2.5% of the data falls more than two standard deviations below the mean. Two standard deviations below the mean is: $1000 - 2 \times 100 = 800$ hours. Therefore, approximately 2.5% of the light bulbs will be covered under the warranty for lasting less than 800 hours.

5. For the new line of energy-efficient light bulbs, the mean is 1200 hours, and the standard deviation is $100 \times 0.8 = 80$ hours. To find the probability that a light bulb will last between 1100 and 1300 hours, we first calculate the Z-scores for these values:

   $Z_{1100} = \frac{1100 - 1200}{80} = -1.25$

   $Z_{1300} = \frac{1300 - 1200}{80} = 1.25$

   Using the Z-score table, we find the cumulative probabilities for these Z-scores:
   
   $P(Z < -1.25) = 0.1056$

   $P(Z < 1.25) = 0.8944$

   The probability that a light bulb will last between 1100 and 1300 hours is the difference between these cumulative probabilities:

   $P(1100 < x < 1300) = 0.8944 - 0.1056 = 0.7888$

   Therefore, the probability that a randomly selected energy-efficient light bulb will last between 1100 and 1300 hours is approximately 0.7888 or 78.88%.
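
The answers above can be cross-checked in code. The sketch below is one way to do it with the standard-library `statistics.NormalDist`; the variable names are chosen here for illustration. It computes exact values, so results based on the Empirical Rule or on a rounded Z-table will differ slightly.

```python
from statistics import NormalDist

bulbs = NormalDist(mu=1000, sigma=100)

# Question 1: probability of lasting between 900 and 1100 hours.
q1 = bulbs.cdf(1100) - bulbs.cdf(900)            # about 0.6827

# Question 2: Z-scores for 1200 and 850 hours.
q2a = (1200 - bulbs.mean) / bulbs.stdev
q2b = (850 - bulbs.mean) / bulbs.stdev

# Question 3: lifespan at the 95th percentile (cutoff for the top 5%).
q3 = bulbs.inv_cdf(0.95)                         # about 1164.5 hours

# Question 4: fraction lasting less than 800 hours.
q4 = bulbs.cdf(800)                              # about 0.0228 (rule estimate: 0.025)

# Question 5: new line with mean 1200 and a standard deviation 20% smaller.
efficient = NormalDist(mu=1200, sigma=0.8 * bulbs.stdev)
q5 = efficient.cdf(1300) - efficient.cdf(1100)   # about 0.7887

print(round(q1, 4), q2a, q2b, round(q3, 1), round(q4, 4), round(q5, 4))
```

For Question 4, the exact tail probability is a little below the 2.5% estimate because the Empirical Rule rounds the two-sigma coverage down to 95%.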