<h6 style='color:red' align='center'>!! Om Shri Ganeshaya Namah (Salutations to Lord Ganesha) !!</h6>

<h1 style='color:#192841' align='center'>Inferential Statistics</h1>

## Statistical Inference
Using `data analysis` and `statistics` to draw conclusions about a population is called `statistical inference`.

The main types of statistical inference are:
* Estimation
* Hypothesis testing

### Estimation
Statistics from a sample are used to estimate population parameters.

The most likely value is called a **point estimate**.

There is always **uncertainty** when estimating.

The **uncertainty** is often expressed as **confidence intervals** defined by a likely lowest and highest value for the parameter.
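As a minimal sketch of how a point estimate and a confidence interval could be computed, the snippet below uses a small made-up sample (the values are purely illustrative, not real survey data). It assumes a 95% confidence level and uses the t-distribution, which suits small samples.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 10 measurements (made-up values for illustration)
sample = np.array([4.1, 5.3, 3.8, 6.0, 4.7, 5.5, 4.9, 3.6, 5.1, 4.4])

point_estimate = sample.mean()      # sample mean used as the point estimate
std_error = stats.sem(sample)       # standard error of the mean (ddof=1 by default)
df = len(sample) - 1                # degrees of freedom

# 95% confidence interval for the population mean, based on the t-distribution
t_crit = stats.t.ppf(0.975, df)
lower = point_estimate - t_crit * std_error
upper = point_estimate + t_crit * std_error

print(f"point estimate: {point_estimate:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The interval gets narrower when the sample is larger or less spread out, which reflects lower uncertainty about the parameter.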

An example could be a confidence interval for the number of bicycles a Dutch person owns:
"The average number of bikes a Dutch person owns is between 3.5 and 6."

### Hypothesis Testing
Hypothesis testing is a method to check if a claim about a population is true. More precisely, it checks how likely it is that a hypothesis is true, based on the sample data.

There are different types of hypothesis testing.

The steps of the test depend on:
* Type of data (categorical or numerical)
* If you are looking at:
 * A single group
 * Comparing one group to another
 * Comparing the same group before and after a change

Some examples of claims or questions that can be checked with hypothesis testing:
* 90% of Australians are left-handed
* Is the average weight of dogs more than 40kg?
* Do doctors make more money than lawyers?
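To illustrate one of these questions, here is a hedged sketch of a one-sample t-test for "Is the average weight of dogs more than 40 kg?". The weights are made-up values, and the `alternative` argument of `scipy.stats.ttest_1samp` assumes SciPy 1.6 or newer.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of dog weights in kg (made-up data for illustration)
weights = np.array([42.1, 39.5, 44.3, 41.0, 45.2, 38.7, 43.8, 40.9, 46.0, 42.5])

# One-sample t-test of H0: mean = 40 kg against H1: mean > 40 kg
result = stats.ttest_1samp(weights, popmean=40, alternative="greater")

print(f"sample mean: {weights.mean():.2f} kg")
print(f"t statistic: {result.statistic:.3f}")
print(f"p-value: {result.pvalue:.4f}")
```

A small p-value would suggest that a sample mean this far above 40 kg is unlikely if the true mean were 40 kg.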

### Probability Distributions
Statistical inference methods rely on **probability calculation** and **probability distributions**.


<h1 style='color:red' align='center'>Normal Distribution</h1>

The normal distribution is described by the mean (**µ**) and the standard deviation (**σ**).

The normal distribution is often referred to as a `bell curve` because of its shape:
 * Most of the values are around the center (**µ**)
 * **The median and mean are equal.**
 * **It has only one mode.**
 * It is symmetric, meaning it decreases the same amount on the left and the right of the center
 
**The area under the curve of the normal distribution represents probabilities for the data.**
* The area under the whole curve is equal to 1, or 100%

The graph of a normal distribution with probabilities between standard deviations (**σ**):
![image-3.png](attachment:image-3.png)
* Roughly 68.3% of the data is within 1 standard deviation of the average (from μ-1σ to μ+1σ)
* Roughly 95.5% of the data is within 2 standard deviations of the average (from μ-2σ to μ+2σ)
* Roughly 99.7% of the data is within 3 standard deviations of the average (from μ-3σ to μ+3σ)
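These percentages can be checked numerically. The sketch below uses the cumulative distribution function of the standard normal distribution from SciPy; the variable name `within` is just an illustrative choice.

```python
from scipy import stats

# Probability of falling within k standard deviations of the mean
within = {k: stats.norm.cdf(k) - stats.norm.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} standard deviation(s): {prob:.4f}")
```

Because every normal distribution has the same shape once standardized, the result does not depend on the particular mean or standard deviation.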

Note: **Probabilities of the normal distribution can only be calculated for intervals (between two values).**

### Different Mean and Standard Deviations
**Case 1: Constant Standard Deviation**

The mean describes where the center of the normal distribution is.

Here is a graph showing three different normal distributions with the same standard deviation but different means.
![image-2.png](attachment:image-2.png)

**Case 2: Constant Mean**

The standard deviation describes how spread out the normal distribution is.

Here is a graph showing three different normal distributions with the same mean but different standard deviations.
![image-4.png](attachment:image-4.png)
* The purple curve has the biggest standard deviation and the black curve has the smallest standard deviation.
* The area under each of the curves is still 1, or 100%.
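A rough numerical check of this: the sketch below evaluates three normal curves with the same mean and different standard deviations (the values 1, 2 and 4 are assumptions chosen for illustration) and approximates the area under each with a simple sum.

```python
import numpy as np
from scipy import stats

x = np.linspace(-40, 40, 8001)   # grid wide enough to cover every curve
dx = x[1] - x[0]

results = []
for sigma in (1, 2, 4):          # hypothetical standard deviations, mean fixed at 0
    pdf = stats.norm.pdf(x, loc=0, scale=sigma)
    area = pdf.sum() * dx        # simple numerical approximation of the area
    results.append((sigma, pdf.max(), area))
    print(f"sigma={sigma}: peak height={pdf.max():.3f}, area={area:.4f}")
```

A larger standard deviation gives a lower and wider curve, while the area stays approximately 1.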

### Real Data Example of Normally Distributed Data
![image-5.png](attachment:image-5.png)
The normal distribution drawn on top of the histogram is based on the population mean (**µ**) and standard deviation (**σ**) of the real data.

We can see that the histogram is close to a normal distribution.

Examples of real world variables that can be normally distributed:
 * Test scores
 * Height
 * Birth weight
 
## Probability Distributions
Probability distributions are functions that calculate the probabilities of the outcomes of random variables.

Typical examples of random variables are coin tosses and dice rolls.
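As a small illustration, the sketch below builds the probability distribution of the sum of two fair six-sided dice by counting outcomes. The choice of two dice is an assumption made for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how often each sum occurs among the 36 equally likely outcomes
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Map each possible sum to its exact probability
distribution = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, prob in distribution.items():
    print(f"P(sum = {total}) = {prob}")
```

The probabilities of all outcomes add up to 1, just like the area under a normal curve.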

<h2 style='color:green' align='center'>Standard Normal Distribution</h2>

Normally distributed data can be transformed into a standard normal distribution.

**The standard normal distribution is a normal distribution where the mean is 0 and the standard deviation is 1**.

Standardizing normally distributed data makes it easier to compare different sets of data.

* The standard normal distribution is used for:
 * Calculating confidence intervals
 * Hypothesis tests
 
Here is a graph of the standard normal distribution with probability values (**p-values**) between the standard deviations:
![image.png](attachment:image.png)

Standardizing makes it easier to calculate probabilities.

The functions for calculating probabilities are complex and difficult to calculate by hand.

The standard normal distribution is also called the '**Z-distribution**' and the values are called '**Z-values**' (or **Z-scores**).

### Z-Values
Z-values express how many standard deviations from the mean a value is.
![image-2.png](attachment:image-2.png)
 **x** is the value we are standardizing, **µ** is the mean, and **σ** is the standard deviation.
 
For example:
* The mean height of people in Germany is 170 cm (**µ**).
* The standard deviation of the height of people in Germany is 10 cm (**σ**).
* Bob is 200 cm tall (**x**).

Z = (x - µ)/σ = (200 - 170)/10 = 3

The Z-value of Bob's height (200 cm) is 3.
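The same calculation can be written as a small helper. The function name `z_score` is hypothetical and chosen only for this sketch.

```python
def z_score(x, mean, std):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std

# Bob's height from the example: x = 200, mean = 170, standard deviation = 10
bob_z = z_score(200, mean=170, std=10)
print(bob_z)
```

A negative result would mean the value lies below the mean.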

## Finding the P-Value from a Z-Value

In [1]:
# Find the probability of getting less than a Z-value of 3
import scipy.stats as stats
print(stats.norm.cdf(3))

0.9986501019683699


* The probability is ≈0.99865, or ≈ 99.87%.

This means that Bob is taller than 99.87% of the people in Germany.

Here is a graph of the standard normal distribution and a Z-value of 3 to visualize the probability:
![image.png](attachment:image.png)
This method finds the p-value below the particular z-value we have, that is, the area to its left.

To find the p-value above the z-value we can calculate 1 minus the probability.

So in Bob's example, we can calculate 1 - 0.9987 = 0.0013, or 0.13%.

This means that only 0.13% of Germans are taller than Bob.
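A short sketch of the upper-tail calculation for a Z-value of 3. It shows the subtraction described above next to SciPy's survival function `norm.sf`, which returns the same upper-tail probability directly.

```python
from scipy import stats

z = 3

# Probability above the z-value: 1 minus the cumulative probability
upper_tail = 1 - stats.norm.cdf(z)

# The survival function gives the same quantity in one call
upper_tail_sf = stats.norm.sf(z)

print(f"{upper_tail:.5f}")
print(f"{upper_tail_sf:.5f}")
```

For very large z-values, `norm.sf` tends to be numerically more accurate than subtracting from 1.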

### Finding the P-Value Between Z-Values
Using the same example, suppose we instead want to know what proportion of people in Germany are between 155 cm and 165 cm tall:

For x = 155cm ![image.png](attachment:image.png)
The Z-value of 155 cm is -1.5

For x = 165cm ![image-2.png](attachment:image-2.png)
The Z-value of 165 cm is -0.5

Using the Z-table or programming we can find the p-values for the two z-values:
* The probability of a z-value smaller than -0.5 (shorter than 165 cm) is 30.85%
* The probability of a z-value smaller than -1.5 (shorter than 155 cm) is 6.68%
* Subtract 6.68% from 30.85% to find the probability of getting a z-value between them.
     30.85% - 6.68% = 24.17%
     
  ![image-3.png](attachment:image-3.png)
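The subtraction above can be sketched in code, reusing the mean of 170 cm and standard deviation of 10 cm from the example:

```python
from scipy import stats

mean, std = 170, 10

# Standardize both heights
z_low = (155 - mean) / std     # -1.5
z_high = (165 - mean) / std    # -0.5

# Probability between the two z-values: difference of cumulative probabilities
p_between = stats.norm.cdf(z_high) - stats.norm.cdf(z_low)
print(f"{p_between:.4f}")
```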

## Finding the Z-Value from a P-Value

P-values (probabilities) can also be used to find z-values.

"How tall are you if you are taller than 90% of Germans?"
The p-value is 0.9, or 90%.

Using a Z-table or programming we can calculate the z-value:

In [2]:
# find the z-value separating the top 10% from the bottom 90%:
import scipy.stats as stats
print(stats.norm.ppf(0.9))

1.2815515655446004


The Z-value is ≈ 1.282

This means that a person who is 1.282 standard deviations taller than the mean height of Germans is taller than 90% of Germans.
![image.png](attachment:image.png)

Converting back to centimetres with x = µ + Z·σ gives 170 + 1.2815 × 10 ≈ 182.82 cm.

###### You have to be at least 182.82 cm tall to be taller than 90% of Germans
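A minimal sketch of how the z-value can be converted back into a height, assuming the mean of 170 cm and standard deviation of 10 cm used earlier. The `loc` and `scale` arguments of `norm.ppf` perform the same conversion in one step.

```python
from scipy import stats

mean, std = 170, 10

# z-value separating the bottom 90% from the top 10%
z = stats.norm.ppf(0.9)

# Undo the standardization: x = mean + z * std
height = mean + z * std

# Equivalent one-step calculation using loc (mean) and scale (standard deviation)
height_direct = stats.norm.ppf(0.9, loc=mean, scale=std)

print(f"{height:.2f} cm")
print(f"{height_direct:.2f} cm")
```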


<h1 style='color:red' align='center'>T Distribution</h1>

The t-distribution is used for estimation and hypothesis testing of a population mean (average).

The t-distribution is adjusted for the extra uncertainty of estimating the mean.

**If the sample is small, the t-distribution is wider. If the sample is big, the t-distribution is narrower.**

**The bigger the sample size is, the closer the t-distribution gets to the standard normal distribution.**

![image.png](attachment:image.png)

Notice how some of the curves have bigger tails.

This is due to **the uncertainty from a smaller sample size.**

The green curve has the smallest sample size.

**For the t-distribution this is expressed as `degrees of freedom` (df), which is calculated by subtracting 1 from the sample size (n).**

For example, a sample size of 30 gives 29 degrees of freedom for the t-distribution.

The t-distribution is used to find **critical t-values** and **p-values (probabilities) for estimation and hypothesis testing**.

Note: Finding the critical t-values and p-values of the t-distribution is similar to finding z-values and p-values of the standard normal distribution. But make sure to use the correct degrees of freedom.
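To see how the t-distribution approaches the standard normal distribution as the sample grows, the sketch below compares critical values at the 97.5th percentile. The sample sizes are hypothetical, picked only to show the trend.

```python
from scipy import stats

# Critical z-value at the 97.5th percentile of the standard normal distribution
z_crit = stats.norm.ppf(0.975)

t_crits = []
for n in (5, 30, 100, 1000):       # hypothetical sample sizes
    df = n - 1                     # degrees of freedom
    t_crit = stats.t.ppf(0.975, df)
    t_crits.append(t_crit)
    print(f"n={n:>4}, df={df:>3}: t = {t_crit:.3f}")

print(f"standard normal: z = {z_crit:.3f}")
```

The critical t-value shrinks toward the z-value as the degrees of freedom increase, which matches the narrower curves for bigger samples.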

### Finding the P-Value of a T-Value
We can find the p-values of a t-value by using a t-table or with programming.
https://www.w3schools.com/statistics/statistics_t_table.php

**Find the probability of getting less than a t-value of 2.1 with 29 degrees of freedom:**

In [3]:
# find the probability of getting less than a t-value of 2.1 with 29 degrees of freedom:
import scipy.stats as stats
print(stats.t.cdf(2.1, 29))

0.9777290209818548


### Finding the T-value of a P-Value
We can find the t-values of a p-value by using a t-table or with programming.

**Find the t-value separating the top 25% from the bottom 75% with 29 degrees of freedom:**

In [4]:
# find the t-value separating the top 25% from the bottom 75% with 29 degrees of freedom:
import scipy.stats as stats
print(stats.t.ppf(0.75, 29))

0.6830438592467807
