# Advanced Statistics: Probability Distribution and Density Function

## Probability Distribution:

### Definition:
A probability distribution describes how the values of a random variable are spread: it assigns a probability (or, for continuous variables, a probability density) to each possible outcome of an experiment.

### Probability Density Function (PDF):

#### Definition:
The Probability Density Function (PDF) is a function that describes the relative likelihood of a continuous random variable taking values near a given point. It is denoted as \(f(x)\). The value \(f(x)\) is not itself a probability, since the probability of any single exact value is zero; the probability of the random variable falling within a particular range is given by the integral of the PDF over that range.
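
To make the "integral over a range" idea concrete, here is a minimal sketch. The density \(f(x) = 2x\) on \([0, 1]\) and the helper `integrate` are assumptions chosen purely for illustration; they are not part of these notes.

```python
def pdf(x):
    """Hypothetical density f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


total_area = integrate(pdf, 0.0, 1.0)   # area over the whole support, close to 1
upper_half = integrate(pdf, 0.5, 1.0)   # P(0.5 <= X <= 1), exact value is 0.75
print(total_area, upper_half)
print(pdf(1.0))  # a density value can exceed 1; it is not a probability
```

Note that `pdf(1.0)` is 2, which would be impossible for a probability. Only areas under the curve are probabilities.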

#### Types of Probability Distributions:

1. **Uniform Distribution:**
   - All outcomes are equally likely.
   - Example: Rolling a fair die.

2. **Normal Distribution (Gaussian Distribution):**
   - Bell-shaped curve characterized by mean $(\mu)$ and standard deviation $(\sigma)$.
   - Many natural phenomena follow this distribution.
   - Example: Heights of a population.

3. **Binomial Distribution:**
   - Describes the number of successes in a fixed number of independent Bernoulli trials.
   - Two possible outcomes (success or failure).
   - Example: Flipping a coin multiple times.

4. **Poisson Distribution:**
   - Models the number of events occurring in a fixed interval of time or space.
   - Assumes events occur independently and at a constant average rate.
   - Example: Number of emails received in an hour.

5. **Exponential Distribution:**
   - Describes the time between events in a Poisson process.
   - Continuous analog to the geometric distribution.
   - Example: Time until the next arrival of a bus.

6. **Gamma Distribution:**
   - Generalization of the exponential distribution.
   - Used to model the time until \(k\) events in a Poisson process.
   - Example: Time until \(k\) goals are scored in a soccer match.

7. **Chi-Square Distribution:**
   - Distribution of the sum of squares of \(k\) independent standard normal random variables.
   - Used in hypothesis testing and confidence interval construction.
   - Example: Testing the variance of a sample.

8. **Student's t-Distribution:**
   - Distribution of a standard normal random variable divided by the square root of an independent chi-square random variable that has been divided by its degrees of freedom.
   - Used in t-tests for small sample sizes.
   - Example: Comparing means of two small samples.

9. **F-Distribution:**
   - Distribution of the ratio of two independent chi-square random variables, each divided by its degrees of freedom.
   - Used in analysis of variance (ANOVA) and regression analysis.
   - Example: Testing the equality of variances in two samples.
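
The chi-square entry in the list above (a sum of squares of \(k\) independent standard normal variables) can be checked by simulation. This is a rough sketch: the seed, the choice \(k = 5\), and the number of draws are arbitrary assumptions. It relies on the known fact that a chi-square variable with \(k\) degrees of freedom has mean \(k\).

```python
import random


def chi_square_sample(k, rng):
    """One draw built as the sum of squares of k independent standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)   # fixed seed so the run is repeatable
k = 5                    # degrees of freedom (arbitrary choice)
draws = [chi_square_sample(k, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)       # should land near k
```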

Understanding these probability distributions is crucial for various fields, including statistics, machine learning, and data science.


# Binomial and Bernoulli Distributions, PMF, CDF, and PDF

## Bernoulli Distribution:

### Definition:
- Represents a discrete random variable with two possible outcomes: success (1) or failure (0).
- Characterized by the probability of success, denoted as \( p \).

### Probability Mass Function (PMF):
- The PMF of a Bernoulli distribution is given by:
\[ P(X = k) = \begin{cases} p & \text{if } k = 1 \\ 1 - p & \text{if } k = 0 \end{cases} \]

### Cumulative Distribution Function (CDF):
- The CDF gives the probability that the random variable is less than or equal to a certain value.
- For a Bernoulli distribution:
\[ F(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 - p & \text{if } 0 \leq x < 1 \\ 1 & \text{if } x \geq 1 \end{cases} \]
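
The two piecewise definitions above translate almost line for line into code. The following is a small sketch; the function names and the value `p = 0.3` are illustrative assumptions.

```python
def bernoulli_pmf(k, p):
    """P(X = k) for a Bernoulli(p) variable; zero for any k other than 0 or 1."""
    if k == 1:
        return p
    if k == 0:
        return 1.0 - p
    return 0.0


def bernoulli_cdf(x, p):
    """F(x) = P(X <= x), a step function with jumps at 0 and 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return 1.0 - p
    return 1.0


p = 0.3  # hypothetical probability of success
print(bernoulli_pmf(0, p), bernoulli_pmf(1, p))
print(bernoulli_cdf(-0.5, p), bernoulli_cdf(0.5, p), bernoulli_cdf(2, p))
```

The CDF jumps by \(1 - p\) at 0 and by \(p\) at 1, which are exactly the PMF values.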

## Binomial Distribution:

### Definition:
- Represents the number of successes in a fixed number of independent Bernoulli trials.
- Characterized by the number of trials (\( n \)) and the probability of success in each trial (\( p \)).

### Probability Mass Function (PMF):
- The PMF of a binomial distribution is given by:
\[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \]
where \( \binom{n}{k} \) is the binomial coefficient.

### Cumulative Distribution Function (CDF):
- The CDF gives the probability that the number of successes is less than or equal to a certain value.
- For a binomial distribution, the CDF involves the summation of individual PMF values.
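
As a sketch of how the binomial PMF and its CDF (a running sum of PMF values) fit together, the snippet below uses only the standard library. The function names and the example values `n = 10`, `p = 0.5` are assumptions for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k): the sum of PMF values from 0 up to floor(k)."""
    upper = min(math.floor(k), n)
    return sum(binomial_pmf(i, n, p) for i in range(0, upper + 1))


n, p = 10, 0.5
print(binomial_pmf(5, n, p))   # 252 / 1024 = 0.24609375
print(binomial_cdf(5, n, p))   # 638 / 1024 = 0.623046875
print(binomial_cdf(n, n, p))   # all PMF values together sum to 1
```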

## Probability Density Function (PDF):
- A probability density function is used for continuous random variables.
- Unlike a PMF, a PDF does not give the probability of a single point (that probability is zero for a continuous variable); probabilities are obtained by integrating the PDF over a range.

Understanding these concepts is crucial in probability theory and statistical modeling.


# Probability Distributions, Z-Score, and Central Limit Theorem

## Poisson Distribution:

### Definition:
- Describes the number of events occurring in a fixed interval of time or space.
- Characterized by the average rate $( \lambda)$ of event occurrences.

### Probability Mass Function (PMF):
\[ P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!} \]
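
The Poisson PMF can be evaluated directly from the formula. In this sketch the rate `lam = 4.0` (for instance, an assumed average of 4 emails per hour) and the truncation at 60 terms are illustrative choices. The truncated sums should come out very close to 1 for the total probability and to \(\lambda\) for the mean.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with average rate lam (lam > 0)."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam**k / math.factorial(k)


lam = 4.0  # hypothetical average number of events per interval
probs = [poisson_pmf(k, lam) for k in range(60)]  # the tail beyond 60 is negligible here
total = sum(probs)
mean = sum(k * q for k, q in enumerate(probs))
print(total, mean)
```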

## Gaussian (Normal) Distribution:

### Definition:
- Represents a continuous probability distribution.
- Characterized by its mean $( \mu)$ and standard deviation $( \sigma)$.

### Probability Density Function (PDF):
\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}} \]
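
A direct transcription of the normal PDF, compared against `statistics.NormalDist` from the Python standard library (available since Python 3.8). The parameter values below (heights with mean 170 cm and standard deviation 10 cm) are made up for illustration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Normal density written term by term from the formula above."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma**2)
    return coeff * math.exp(exponent)


mu, sigma = 170.0, 10.0  # hypothetical heights in cm
reference = NormalDist(mu, sigma)
for x in (150.0, 170.0, 185.0):
    # The two columns should agree up to floating-point rounding.
    print(x, normal_pdf(x, mu, sigma), reference.pdf(x))
```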

## Uniform Distribution:

### Definition:
- All values within a given range are equally likely.
- Characterized by the minimum (\( a \)) and maximum (\( b \)) values.

### Probability Density Function (PDF):
\[ f(x) = \begin{cases} \frac{1}{b - a} & \text{if } a \leq x \leq b \\ 0 & \text{otherwise} \end{cases} \]
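
Because the density is constant, the probability of a sub-interval is just its length divided by \(b - a\). A small sketch follows; the helper names and the interval \([2, 6]\) are assumptions for illustration.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    if b <= a:
        raise ValueError("require a < b")
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_interval_prob(c, d, a, b):
    """P(c <= X <= d): length of the overlap of [c, d] with [a, b], times 1 / (b - a)."""
    lo, hi = max(a, c), min(b, d)
    return max(0.0, hi - lo) / (b - a)


a, b = 2.0, 6.0
print(uniform_pdf(3.0, a, b))                 # 0.25 everywhere inside [2, 6]
print(uniform_pdf(7.0, a, b))                 # 0.0 outside the interval
print(uniform_interval_prob(3.0, 5.0, a, b))  # 0.5: half of the range
```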

## Standard Normal Distribution:

### Definition:
- A specific instance of the Gaussian distribution with mean $( \mu)$ of 0 and standard deviation $(\sigma)$ of 1.

### Z-Score:
- Measures how many standard deviations a data point is from the mean.
\[ Z = \frac{X - \mu}{\sigma} \]
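
A short sketch of computing a z-score and looking up the matching standard normal probability with `statistics.NormalDist`. The scale with mean 100 and standard deviation 15 is an assumed example, not data from these notes.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mu) / sigma


z = z_score(130.0, 100.0, 15.0)   # (130 - 100) / 15 = 2.0
below = NormalDist().cdf(z)       # P(Z <= 2) under the standard normal, about 0.977
print(z, below)
```

A negative z-score means the observation is below the mean. The CDF lookup assumes the underlying data are normally distributed.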

## Central Limit Theorem:

### Definition:
- States that the distribution of the sum (or average) of a large number of independent, identically distributed random variables approaches a normal distribution.
- Holds irrespective of the shape of the original distribution, provided it has a finite variance.

### Application:
- Widely used in statistical inference and hypothesis testing.
- Allows approximating the distribution of sample means, making it easier to make statistical inferences.
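
The theorem can be illustrated with a quick simulation. This is only a sketch under stated assumptions: the source distribution (exponential with rate 1, which is strongly skewed and has mean 1 and standard deviation 1), the sample size, the number of trials, and the seed are all arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 50                    # size of each sample
trials = 5_000            # number of sample means to collect

# Draw many samples from a skewed distribution and record each sample's mean.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(trials)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = 1.0 / n**0.5   # sigma / sqrt(n) with sigma = 1

# Share of sample means within one predicted standard deviation of the true mean.
within_one = sum(abs(m - 1.0) <= predicted_sd for m in sample_means) / trials
print(mean_of_means, sd_of_means, predicted_sd, within_one)
```

If the sample means are roughly normal, `within_one` should be in the neighborhood of 0.68 even though individual exponential draws are far from bell-shaped.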


| Distribution / Theorem  | Description                                                          | Real-Life Example                                   | Application                                          | Type of Data            | Function    |
|-------------------------|----------------------------------------------------------------------|-----------------------------------------------------|------------------------------------------------------|-------------------------|-------------|
| Poisson Distribution    | Models the number of events in a fixed interval of time or space.    | Customer arrivals at a service point                | Queuing theory, traffic flow analysis                | Discrete                | PMF         |
| Gaussian (Normal)       | Symmetric, bell-shaped distribution defined by its mean and standard deviation. | Heights of a population                  | Statistical analysis, quality control                | Continuous              | PDF         |
| Uniform Distribution    | All values within a given range are equally likely.                  | Rolling a fair six-sided die (discrete case)        | Random number generation, statistical sampling       | Discrete or Continuous  | PMF or PDF  |
| Standard Normal         | Specific instance of Gaussian with mean 0 and standard deviation 1.  | IQ scores converted to z-scores                     | Z-tests, statistical inference                       | Continuous              | PDF         |
| Central Limit Theorem   | The sum/average of many independent random variables is approximately normal. | Distribution of sample means over repeated samples | Statistical hypothesis testing, confidence intervals | Any (finite variance)   | -           |
| Binomial Distribution   | Models the number of successes in a fixed number of trials.          | Coin flips, where success is getting heads          | Quality control, reliability analysis                | Discrete                | PMF         |
| Bernoulli Distribution  | Models a single trial with two outcomes, success or failure.         | Outcome of a single coin flip                       | Binary outcomes, risk assessment                     | Discrete                | PMF         |


| Distribution            | Mean ($\mu$) Formula                 | Median Formula                                                                  | Mode Formula                                                                | Variance Formula        | Standard Deviation Formula | Example                                    |
|-------------------------|--------------------------------------|---------------------------------------------------------------------------------|-----------------------------------------------------------------------------|-------------------------|----------------------------|--------------------------------------------|
| Poisson Distribution    | $\lambda$                            | No closed form; lies between $\lambda - \ln 2$ and $\lambda + \frac{1}{3}$      | $\lfloor \lambda \rfloor$ (also $\lambda - 1$ when $\lambda$ is an integer) | $\lambda$               | $\sqrt{\lambda}$           | Number of customer arrivals in an hour     |
| Gaussian (Normal)       | $\mu$                                | $\mu$                                                                           | $\mu$                                                                       | $\sigma^2$              | $\sigma$                   | Heights of a population                    |
| Uniform Distribution (continuous) | $\frac{a + b}{2}$          | $\frac{a + b}{2}$                                                               | Any value within the range $[a, b]$                                         | $\frac{(b - a)^2}{12}$  | $\frac{b - a}{\sqrt{12}}$  | Random real number drawn from $[a, b]$     |
| Standard Normal         | 0                                    | 0                                                                               | 0                                                                           | 1                       | 1                          | Z-scores in statistical analysis           |
| Central Limit Theorem (sample mean) | $\mu$ (of the original distribution) | Approximately $\mu$                                              | Approximately $\mu$                                                         | $\frac{\sigma^2}{n}$    | $\frac{\sigma}{\sqrt{n}}$  | Means of repeated samples of size $n$      |
| Binomial Distribution   | $np$                                 | $\lfloor np \rfloor$ or $\lceil np \rceil$                                      | $\lfloor (n+1)p \rfloor$                                                    | $np(1-p)$               | $\sqrt{np(1-p)}$           | Number of successes in coin flips          |
| Bernoulli Distribution  | $p$                                  | 0 if $p < \frac{1}{2}$, 1 if $p > \frac{1}{2}$ (any value in $[0, 1]$ if $p = \frac{1}{2}$) | 0 if $p < \frac{1}{2}$, 1 if $p > \frac{1}{2}$ (both if $p = \frac{1}{2}$) | $p(1-p)$     | $\sqrt{p(1-p)}$            | Outcome of a single coin flip              |
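
The mean and variance columns can be sanity-checked numerically by computing the moments straight from a PMF. The sketch below does this for the binomial row; the values `n = 12`, `p = 0.3` and the helper `moments` are assumptions for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def moments(pmf_values):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mean = sum(k * q for k, q in pmf_values.items())
    var = sum((k - mean) ** 2 * q for k, q in pmf_values.items())
    return mean, var


n, p = 12, 0.3
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
mean, var = moments(pmf)
print(mean, n * p)             # both should be close to 3.6
print(var, n * p * (1 - p))    # both should be close to 2.52
```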


| Distribution           | Mean ($\mu$) Formula                   | Variance Formula                                | Standard Deviation Formula                      |
|-------------------------|----------------------------------------|-------------------------------------------------|-------------------------------------------------|
| Poisson Distribution    | $\lambda \cdot t$                      | $\lambda \cdot t$                               | $\sqrt{\lambda \cdot t}$                       |
| Gaussian (Normal)       | $\mu$                                | $\sigma^2$                                      | $\sigma$                                       |
| Uniform Distribution    | $\frac{a + b}{2}$                     | $\frac{(b - a)^2}{12}$                           | $\frac{b - a}{\sqrt{12}}$                       |
| Standard Normal         | 0                                    | 1                                               | 1                                             |
| Central Limit Theorem    | $\mu$ (of the original distribution) | $\frac{\sigma^2}{n}$                            | $\frac{\sigma}{\sqrt{n}}$                      |
| Binomial Distribution    | $np$                                 | $np(1-p)$                                       | $\sqrt{np(1-p)}$                               |
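| Bernoulli Distribution   | $p$                                  | $p(1-p)$                                       | $\sqrt{p(1-p)}$                                |

In this table the Poisson row is written in terms of a rate: $\lambda$ is the average number of events per unit time and $t$ is the length of the interval, so the expected count over the interval is $\lambda \cdot t$.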


### Empirical Rule

The Empirical Rule, also known as the 68-95-99.7 Rule or the Three Sigma Rule, is a statistical guideline that provides a quick estimate of the spread of a normal distribution. The rule states that, for a normal distribution:

1. About 68% of the data falls within one standard deviation of the mean.
2. About 95% of the data falls within two standard deviations of the mean.
3. About 99.7% of the data falls within three standard deviations of the mean.

Mathematically, it can be expressed as:

1. **68% Rule:** $ P(\mu - \sigma \leq X \leq \mu + \sigma) \approx 0.68 $
2. **95% Rule:** $ P(\mu - 2\sigma \leq X \leq \mu + 2\sigma) \approx 0.95 $
3. **99.7% Rule:** $ P(\mu - 3\sigma \leq X \leq \mu + 3\sigma) \approx 0.997 $

Here:
- $ \mu $ is the mean of the distribution.
- $ \sigma $ is the standard deviation of the distribution.
- $ X $ is a normally distributed random variable, and each statement gives the probability that it falls within the stated number of standard deviations of the mean.

The Empirical Rule is particularly useful when dealing with normal distributions, providing a quick way to assess the probability of an observation falling within a certain range.

Keep in mind that the Empirical Rule is specifically applicable to normal distributions and may not be accurate for distributions that deviate significantly from normality.
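
The three percentages can be reproduced from the normal CDF. For any normal distribution, the probability of falling within \(k\) standard deviations of the mean equals \(\operatorname{erf}(k / \sqrt{2})\). The helper name `prob_within` below is an illustrative assumption.

```python
import math


def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```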

## Distributions

### Normal Distribution:

**Definition:**
The normal distribution, also known as the Gaussian distribution or bell curve, is a continuous probability distribution that is symmetric around its mean. It is characterized by its bell-shaped curve and is fully defined by its mean (μ) and standard deviation (σ).

**Properties:**
- The mean, median, and mode of a normal distribution are equal and located at the center of the distribution.
- The standard deviation determines the spread or dispersion of the distribution.
- Approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations (empirical rule).

**Applications:**
- Many natural and measured quantities, such as human height and IQ scores, are approximately normally distributed.
- It is widely used in statistical inference and hypothesis testing.

### Uniform Distribution:

**Definition:**
The uniform distribution is a type of probability distribution in which all outcomes or events are equally likely. For a continuous uniform distribution, the probability density function is constant within a given interval.

**Properties:**
- In the discrete case, every outcome has the same probability; in the continuous case, intervals of equal length within the range are equally likely.
- The probability density function is flat within the defined interval.

**Applications:**
- Modeling scenarios where each outcome in a range is equally likely, such as the result of rolling a fair die (discrete case) or drawing a random real number from an interval (continuous case).

### Discrete Distributions:

**Definition:**
Discrete distributions deal with random variables that take on distinct, separate values. The probability mass function (PMF) describes the probabilities of these specific outcomes.

**Properties:**
- The PMF gives the probability of each possible outcome.
- The sum of probabilities for all possible outcomes equals 1.

**Examples:**
- **Bernoulli Distribution:** Models two possible outcomes, often used for success/failure experiments.
- **Binomial Distribution:** Describes the number of successes in a fixed number of independent Bernoulli trials.
- **Poisson Distribution:** Models the number of events occurring within a fixed interval of time or space.

**Applications:**
- Counting problems where outcomes are discrete, such as the number of emails received in an hour or the number of defective items in a production batch.

Understanding these distributions is fundamental in statistics and is crucial for various applications in data analysis, hypothesis testing, and modeling real-world phenomena.