<center><font size=20>Introduction to Inferential Statistics</font></center>

### Topics Covered:
#### 1. Inferential Statistics
#### 2. Fundamental Terms
#### 3. Binomial Distribution
#### 4. Normal Distribution
#### 5. Uniform Distribution
#### 6. Poisson Distribution

## 1. Inferential Statistics

* Inferential statistics involves making inferences about a population using a sample of data drawn from that population.

* Instead of analyzing the entire population, we use a sample to make predictions or generalizations.

#### Steps Involved in Inferential Statistics:
* **1. Sample and Population:** Inferential statistics involves using a sample (a subset of the population) to make inferences or draw conclusions about the population as a whole.

* **2. Random Variable:** You start with a random variable, which is a numerical outcome of a random process.

* **3. Distribution:** You examine the distribution of this random variable within your sample. This could involve estimating the sample's mean, variance, or other statistics.

* **4. Comparison to Theoretical Distributions:** You then compare the sample distribution to a theoretical (standard) distribution, like the normal distribution, to make inferences about the population. This step often involves hypothesis testing, confidence intervals, or other statistical methods.

* **5. Conclusion:** Based on the comparison, you can draw conclusions about population parameters (like the population mean, proportion, etc.) with a certain level of confidence.
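
The five steps above can be sketched on simulated data. This is only an illustration under stated assumptions: the population (exponential with mean 50), the sample size of 200, and the seed are all made up for the example, and a t-based 95% confidence interval is just one of several possible inference methods.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 values from a skewed (exponential) distribution
population = rng.exponential(scale=50, size=100_000)

# Steps 1-2: draw a random sample; each sampled value is a realisation of the random variable
sample = rng.choice(population, size=200, replace=False)

# Step 3: summarise the sample
sample_mean = sample.mean()
standard_error = stats.sem(sample)  # sample std (ddof=1) divided by sqrt(n)

# Step 4: use a theoretical distribution (Student's t) to build a 95% interval
t_crit = stats.t.ppf(0.975, df=len(sample) - 1)
ci_low = sample_mean - t_crit * standard_error
ci_high = sample_mean + t_crit * standard_error

# Step 5: state a conclusion about the population mean
print(f"sample mean = {sample_mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"population mean = {population.mean():.2f}")
```

In practice the population mean is unknown; it is printed here only because the population was simulated, which makes it possible to see whether the interval happens to cover it.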

## Descriptive vs Inferential Statistics
![image.png](attachment:image.png)

![image.png](attachment:image.png)

## Business Problems
![image.png](attachment:image.png)

## 2. Fundamental Terms

### 2.1 Random Variables
![image.png](attachment:image.png)

* Random variables are essential because they allow us to model and quantify the uncertainty inherent in real-world data. 
* They provide the mathematical foundation for probability distributions, which are crucial for making inferences about populations based on sample data. 
* Without understanding random variables, it would be impossible to rigorously apply statistical methods to analyze and interpret data.

#### Example
* Define a new random variable Y: Let Y represent the number of times you roll a 4 in 10 rolls of a die.

* Y can take any value from 0 to 10 (since it’s possible to roll no 4s or all 4s, and anything in between).
* Range of Y: The possible values for Y are 0, 1, 2, ..., 10, each with a different probability.

* Probability Distribution: You can calculate the probability of each possible value of Y. For example, the probability of rolling exactly three 4s in 10 rolls is one of the values in this distribution.
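
The distribution of Y can be tabulated directly. The sketch below assumes a fair six-sided die (so each roll shows a 4 with probability 1/6) and independent rolls, in which case Y follows the binomial distribution covered in Section 3. The variable names are illustrative.

```python
import scipy.stats as stats

n_rolls = 10      # number of rolls
p_four = 1 / 6    # probability of a 4 on one roll (assumes a fair die)

# Probability of each possible value of Y = 0, 1, ..., 10
pmf = {y: stats.binom.pmf(y, n_rolls, p_four) for y in range(n_rolls + 1)}

for y, prob in pmf.items():
    print(f"P(Y = {y:2d}) = {prob:.4f}")

# Probability of rolling exactly three 4s, roughly 0.155
p_three_fours = pmf[3]
```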

#### 2.1.1 Discrete Random Variables
![image.png](attachment:image.png)

#### 2.1.2 Continuous Random Variables
![image.png](attachment:image.png)

### 2.2 Probability Distribution

![image.png](attachment:image.png)

#### Probability Distribution: Example
![image.png](attachment:image.png)

#### Commonly Occurring Distributions
![image.png](attachment:image.png)

## 3. Binomial Distribution

**Overview**

* The Binomial distribution is a discrete probability distribution that models the number of successes in a fixed number of independent and identically distributed (i.i.d.) Bernoulli trials, each with the same probability of success. 
* It is one of the most commonly used discrete distributions in statistics.

**Key Characteristics:**
* Discrete Distribution: The binomial distribution is concerned with discrete events, where each trial has only two possible outcomes: success or failure.
* Parameters: The distribution is defined by two parameters:
    * n: The number of trials.
    * p: The probability of success on each trial.
* Fixed Number of Trials: The number of trials n is fixed in advance.
* Independent Trials: The outcome of each trial is independent of the others.
* Binary Outcome: Each trial results in either a success (often coded as 1) or a failure (often coded as 0).

**When to Use the Binomial Distribution?** 

The binomial distribution is appropriate in scenarios where:

* You are performing a fixed number of trials or experiments.
* Each trial has only two possible outcomes.
* The probability of success remains constant across trials.
* The trials are independent of each other.

**Examples of Binomial-Distributed Events:**
* Flipping a coin a fixed number of times and counting the number of heads.
* Conducting a survey and counting the number of respondents who say "yes."
* Testing a batch of products and counting how many are defective.
* A basketball player making a fixed number of free throws and counting the successful shots.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

#### Example
![image.png](attachment:image.png)

**Visualizing the Binomial Distribution:**
* PMF: The PMF shows the probability of each possible number of successes in the n trials. 
* The shape of the distribution depends on the parameters n and p.
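* If p = 0.5, the distribution is symmetric; if p is close to 0.5 and n is large, it is approximately symmetric.
* If p is far from 0.5, the distribution is skewed (right-skewed for p < 0.5, left-skewed for p > 0.5), and the skew is most pronounced when n is small.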
* CDF: The CDF of the binomial distribution shows the cumulative probability of observing up to a certain number of successes.
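
The PMF and CDF described above can be computed with `scipy.stats.binom`. This sketch uses assumed values (n = 20 and two choices of p) to show how the shape changes with p; the arrays `pmf` and `cdf` could be passed to any plotting library.

```python
import numpy as np
import scipy.stats as stats

n = 20                  # number of trials (assumed for illustration)
k = np.arange(n + 1)    # possible numbers of successes: 0, 1, ..., n

for p in (0.5, 0.1):
    dist = stats.binom(n, p)
    pmf = dist.pmf(k)   # P(X = k) for each k
    cdf = dist.cdf(k)   # P(X <= k) for each k
    skew = float(dist.stats(moments="s"))  # 0 means symmetric, > 0 means right-skewed
    print(f"p={p}: most likely k={k[pmf.argmax()]}, "
          f"P(X<=5)={cdf[5]:.4f}, skewness={skew:.3f}")
```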

#### Binomial Distribution Assumptions
![image.png](attachment:image.png)

**Note: The Bernoulli distribution is a special case of the Binomial distribution with a single trial (n = 1).**
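
A quick numerical check of the Bernoulli special case, using an assumed success probability of 0.3: the Bernoulli PMF and the Binomial PMF with n = 1 should agree at both outcomes.

```python
import scipy.stats as stats

p = 0.3  # assumed success probability, for illustration only

# PMF at the two possible outcomes: 0 (failure) and 1 (success)
bernoulli_pmf = [stats.bernoulli.pmf(x, p) for x in (0, 1)]
binomial_pmf = [stats.binom.pmf(x, 1, p) for x in (0, 1)]

print("Bernoulli:      ", bernoulli_pmf)
print("Binomial (n=1): ", binomial_pmf)
```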

# Work on Case Study

## 4. Normal Distribution
![image.png](attachment:image.png)

**Overview**

The Normal distribution (also known as the Gaussian distribution) is one of the most important and widely used continuous probability distributions in statistics. It describes how the values of a random variable are distributed symmetrically around the mean, creating a bell-shaped curve.

**Key Characteristics:**
* Continuous Distribution: The normal distribution is continuous, meaning the variable can take any real value (its support is the entire real line).
* Symmetry: It is symmetric around its mean (μ), meaning that the left and right sides of the distribution are mirror images.
* Parameters: The normal distribution is characterized by two parameters:
    * Mean (μ): Determines the location of the center of the distribution.
    * Standard Deviation (σ): Determines the spread or width of the distribution.
* Bell-shaped Curve: The shape of the distribution is bell-shaped, with most of the data clustering around the mean.

**When to Use the Normal Distribution?**

The normal distribution is used when:

* The data is symmetrically distributed around the mean.
* The phenomenon under study is influenced by many small, independent factors.
* You are working with means (or sums) from a large sample: by the Central Limit Theorem, these are approximately normally distributed even when the underlying data are not.

**Examples of Normally Distributed Events:**
* Heights of adults in a population.
* IQ scores of individuals.
* Measurement errors in physical experiments.
* Blood pressure readings in a population.

![image.png](attachment:image.png)

##### Properties of Normal Distribution
![image.png](attachment:image.png)

**Properties of the Normal Distribution:**
* Mean, Median, and Mode: All three are equal and located at the center of the distribution (μ).
* 68-95-99.7 Rule: Also known as the empirical rule:
    * About 68% of the data lies within 1 standard deviation (μ±σ).
    * About 95% of the data lies within 2 standard deviations (μ±2σ).
    * About 99.7% of the data lies within 3 standard deviations (μ±3σ).
* Tail Behavior: The tails of the normal distribution approach, but never touch, the x-axis, indicating that extreme values are possible but increasingly unlikely.
* Standard Normal Distribution: A special case of the normal distribution where μ=0 and σ=1. This is often used to compute probabilities and z-scores.
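
The 68-95-99.7 rule can be verified from the standard normal CDF. Because the rule depends only on distance from the mean in standard deviations, checking it for μ=0 and σ=1 covers every normal distribution.

```python
import scipy.stats as stats

coverage = {}
for k in (1, 2, 3):
    # P(-k <= Z <= k) for a standard normal variable Z
    coverage[k] = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"within {k} standard deviation(s): {coverage[k]:.4f}")
```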

![image.png](attachment:image.png)

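```python
import scipy.stats as stats

# Upper-tail probability P(X > 185) for a normal distribution
# with mean 175 and standard deviation 10
1 - stats.norm.cdf(185, 175, 10)
# Output: 0.15865525393145707
```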

##### Example-2
![image.png](attachment:image.png)

**Visualizing the Normal Distribution:**
* PDF: The PDF is the bell curve, showing the likelihood of different values of the random variable. The peak occurs at the mean, and the curve tapers off symmetrically on both sides.
* CDF (Cumulative Distribution Function): The CDF of the normal distribution shows the probability that the random variable is less than or equal to a specific value. It is an S-shaped curve that approaches 0 and 1 at the extremes.
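
The PDF and CDF described above can be evaluated on a grid of values. This sketch reuses the mean of 175 and standard deviation of 10 from the earlier cell; the grid spacing is an arbitrary choice, and the resulting arrays could be handed to any plotting library.

```python
import numpy as np
import scipy.stats as stats

mu, sigma = 175, 10

# 81 evenly spaced points from mu - 4*sigma to mu + 4*sigma (step of 1)
x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 81)

pdf = stats.norm.pdf(x, mu, sigma)  # bell curve
cdf = stats.norm.cdf(x, mu, sigma)  # S-shaped curve rising from near 0 to near 1

peak_location = x[pdf.argmax()]                # the PDF peaks at the mean
cdf_at_mean = stats.norm.cdf(mu, mu, sigma)    # half the probability lies below the mean

print(f"PDF peaks at x = {peak_location}")
print(f"CDF at the mean = {cdf_at_mean}")
print(f"CDF at the grid ends = {cdf[0]:.5f}, {cdf[-1]:.5f}")
```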

### Area Under Density Curve 
![image.png](attachment:image.png)

### Standard Normal Distribution
![image.png](attachment:image.png)

### Z-score
![image.png](attachment:image.png)
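
A z-score expresses a value as the number of standard deviations it lies from the mean, z = (x − μ)/σ. As a small sketch, the numbers from the earlier cell (mean 175, standard deviation 10, value 185) give the same tail probability when looked up on the standard normal distribution.

```python
import scipy.stats as stats

mu, sigma = 175, 10
x = 185

# Standardise: how many standard deviations is x above the mean?
z = (x - mu) / sigma

# Upper-tail probability from the standard normal (mean 0, sd 1)
upper_tail = 1 - stats.norm.cdf(z)

print(f"z = {z}, P(Z > z) = {upper_tail}")
```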

## 5. Uniform Distribution
![image.png](attachment:image.png)

The Uniform distribution is a probability distribution in which all outcomes are equally likely. It is characterized by the fact that every interval of the same length on the distribution's support has an equal probability of being observed.

**Key Characteristics:**
* Two Types:
    * Discrete Uniform Distribution: The outcomes are finite and discrete (e.g., rolling a fair die).
    * Continuous Uniform Distribution: The outcomes are continuous and can take any value within a specified range (e.g., selecting a random number between 0 and 1).
* Parameters: The distribution is defined by two parameters, a and b, which represent the minimum and maximum values of the distribution.
* Support: For the discrete uniform distribution, the support is a set of discrete values; for the continuous uniform distribution, the support is a continuous interval [a,b].

**When to Use the Uniform Distribution?**

The Uniform distribution is appropriate in scenarios where:

* All outcomes within a specified range are equally likely.
* There is no preference or bias towards any specific outcome within the range.
* You need a simple model for random sampling within a finite or continuous range.

**Examples of Uniform-Distributed Events:**
* Discrete Uniform:
    * Rolling a fair die, where each face (1 to 6) has an equal chance of landing face up.
    * Drawing a card from a well-shuffled deck, where each card has an equal probability of being drawn.
* Continuous Uniform:
    * Randomly selecting a number between 0 and 1 from a continuous interval.
    * Measuring the waiting time for a bus that arrives at a random time within a 10-minute window.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

**Visualizing the Uniform Distribution:**
* PDF: For the continuous uniform distribution, the PDF is a flat line, indicating that each value in the interval is equally likely.
* CDF: The CDF is a straight line that increases linearly from 0 to 1 over the interval [a,b], reflecting the uniform accumulation of probability.
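
Both uniform variants are available in `scipy.stats`. The sketch below uses the fair-die and 10-minute bus-wait examples from this section. Note that SciPy parameterises the continuous uniform by `loc` and `scale` (so `loc=a`, `scale=b-a`) rather than by a and b directly, and that the upper bound of `randint` is exclusive.

```python
import scipy.stats as stats

# Discrete uniform: a fair six-sided die (faces 1 to 6; upper bound 7 is exclusive)
die = stats.randint(1, 7)
p_face = die.pmf(3)            # every face has probability 1/6

# Continuous uniform on [a, b]: bus arrives at a random time in a 10-minute window
a, b = 0, 10
wait = stats.uniform(loc=a, scale=b - a)

density = wait.pdf(4)          # flat PDF: 1 / (b - a) anywhere inside [a, b]
p_within_3 = wait.cdf(3)       # linear CDF: (3 - a) / (b - a)

print(f"P(die shows 3) = {p_face:.4f}")
print(f"density inside the window = {density}")
print(f"P(wait <= 3 minutes) = {p_within_3}")
```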

## 6. Poisson Distribution

* The Poisson distribution is a discrete probability distribution that describes the probability of a given number of events occurring within a fixed interval of time or space, provided these events happen with a known constant rate and are independent of the time since the last event.

**Key Characteristics:**
* Discrete Distribution: The Poisson distribution deals with discrete events (e.g., the number of arrivals, occurrences, or incidents).
* Parameter: The distribution is characterized by a single parameter, λ (lambda), which represents the average number of occurrences in the given interval.
* Support: The values of the random variable X (which represents the number of occurrences) are non-negative integers (0, 1, 2, ...).

**When to Use the Poisson Distribution?**
The Poisson distribution is applicable in situations where:

1. Events are rare or occur infrequently.
2. Events occur independently of each other.
3. The rate (λ) at which events occur is constant over the interval.
4. The number of trials is large, and the probability of success in each trial is small (leading to a relatively low average number of occurrences).
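
Condition 4 reflects the fact that a Binomial distribution with large n and small p is closely approximated by a Poisson distribution with λ = n·p. The sketch below compares the two PMFs for assumed values n = 1000 and p = 0.003.

```python
import numpy as np
import scipy.stats as stats

n, p = 1000, 0.003     # many trials, small success probability (assumed values)
lam = n * p            # matching Poisson rate: lambda = n * p = 3

k = np.arange(0, 16)   # counts to compare
binom_pmf = stats.binom.pmf(k, n, p)
poisson_pmf = stats.poisson.pmf(k, lam)

# Largest pointwise difference between the two PMFs over these counts
max_gap = np.abs(binom_pmf - poisson_pmf).max()
print(f"largest pointwise difference: {max_gap:.6f}")
```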

**Examples of Poisson-Distributed Events:**
* The number of emails received per hour.
* The number of earthquakes in a region within a year.
* The number of phone calls received by a call center in a minute.
* The number of defects found in a batch of products.

![image.png](attachment:image.png)

**Properties of the Poisson Distribution:**
* Mean and Variance: Both the mean and variance of the Poisson distribution are equal to λ.
* Skewness: The distribution is skewed to the right, especially when λ is small. As λ increases, the distribution becomes more symmetric.
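* Independence: In the underlying Poisson process, the numbers of events in non-overlapping intervals are independent. (Strictly, memorylessness is a property of the exponentially distributed waiting times between events, not of the Poisson counts themselves.)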
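
The mean, variance, and skewness properties can be checked numerically with `scipy.stats.poisson`. The three values of λ below are arbitrary choices for illustration; the skewness of a Poisson distribution equals 1/√λ, so it shrinks as λ grows.

```python
import scipy.stats as stats

for lam in (1, 4, 25):
    # Request the mean ("m"), variance ("v") and skewness ("s")
    mean, var, skew = stats.poisson.stats(lam, moments="mvs")
    print(f"lambda={lam:2d}: mean={float(mean):.1f}, "
          f"variance={float(var):.1f}, skewness={float(skew):.3f}")
```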

![image.png](attachment:image.png)

**Visualizing the Poisson Distribution:**
* The distribution is typically right-skewed, especially for low values of λ.
* As λ increases, the distribution begins to resemble a normal distribution.
