# Statistics Advance Part 1

# Theory Section

**Q 1.  What is a random variable in probability theory?**

**Ans.** In probability theory, a random variable is a function that assigns a numerical value to each outcome in a sample space of a random experiment.

- Two Key Types:
1. Discrete Random Variable:
    - Takes on a countable number of possible values.
    - Example: The result of rolling a die (1, 2, 3, 4, 5, 6).

2. Continuous Random Variable:
    - Takes on an uncountably infinite number of possible values within a given range.
    - Example: The exact height of a person (e.g., 170.25 cm).

Example:
- If you toss a fair coin:
    - Let X be a random variable defined as:
    - X = 1 if heads,
    - X = 0 if tails.
    - Then X is a discrete random variable.

- Why It Matters:
    - Random variables allow us to quantify uncertainty, enabling the use of tools like:
         - Probability distributions
         - Expected value
         - Variance
         - Standard deviation
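
The coin-toss example above can be sketched in code by treating the random variable as an ordinary function from outcomes to numbers. This is only an illustration: the names `SAMPLE_SPACE` and `X`, the seed, and the number of simulated tosses are assumptions chosen for the sketch.

```python
import random

# Sample space of one coin toss; X is a plain function on that space.
SAMPLE_SPACE = ["heads", "tails"]

def X(outcome: str) -> int:
    """Map each outcome to a number: 1 for heads, 0 for tails."""
    return 1 if outcome == "heads" else 0

rng = random.Random(0)  # fixed seed (arbitrary) so the run is reproducible
tosses = [rng.choice(SAMPLE_SPACE) for _ in range(10_000)]
values = [X(t) for t in tosses]

# For a fair coin, E[X] = 0.5 and Var(X) = 0.25; the estimates should be close.
sample_mean = sum(values) / len(values)
sample_var = sum((v - sample_mean) ** 2 for v in values) / len(values)
print(sample_mean, sample_var)
```

Once outcomes are numbers, quantities such as the expected value and variance can be estimated directly from simulated data.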

**Q 2.  What are the types of random variables?**

**Ans.** There are two main types of random variables in probability theory:

1. Discrete Random Variable
    - Takes countable values (finite or countably infinite).
    - Typical values: 0, 1, 2, 3, …
    - Often used in scenarios involving counts.

    - Examples:
        - Number of heads in 5 coin tosses.
        - Number of students present in a class.
        - Rolling a die: values = {1, 2, 3, 4, 5, 6}.

2. Continuous Random Variable
    - Takes uncountably infinite values within an interval.
    - Can take any value within a range.
    - Often used in scenarios involving measurements.

    - Examples:
        - Height or weight of a person.
        - Time taken to complete a task.
        - Temperature on a given day.

**Q 3. What is the difference between discrete and continuous distributions?**

**Ans.** The difference between discrete and continuous distributions lies in the type of values the random variable can take and how probabilities are assigned:

1. Discrete Distribution:
    - Deals with discrete random variables (countable outcomes).
    - Probabilities are assigned to specific values.
    - Represented by a probability mass function (PMF).
    - Example:
        - Tossing a coin 3 times → Number of heads: 0, 1, 2, or 3
        - Each outcome has a specific probability.
2. Continuous Distribution
    - Deals with continuous random variables (uncountably infinite values).
    - Probabilities are assigned over intervals, not individual values.
    - Represented by a probability density function (PDF).
    - Example:
        - Height of people: Values like 170.25 cm, 170.251 cm, etc.
        - Probability of exactly 170 cm = 0; but probability between 169.5 and 170.5 is positive.
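
A minimal sketch of both cases follows. The discrete part enumerates the three coin tosses exactly. The continuous part assumes, purely for illustration, that heights follow a normal distribution with mean 170 cm and standard deviation 7 cm; neither number comes from the text above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from statistics import NormalDist

# Discrete: enumerate all 2**3 equally likely outcomes of three fair tosses.
outcomes = list(product("HT", repeat=3))
counts = Counter(seq.count("H") for seq in outcomes)
pmf = {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}
print(pmf)  # probabilities 1/8, 3/8, 3/8, 1/8 for 0, 1, 2, 3 heads

# Continuous: heights modelled as Normal(170, 7); both parameters are assumed.
height = NormalDist(mu=170, sigma=7)
p_interval = height.cdf(170.5) - height.cdf(169.5)  # area over an interval
p_point = height.cdf(170) - height.cdf(170)         # zero-width interval
print(p_interval, p_point)
```

The PMF assigns a positive probability to each individual count, while the continuous model gives a positive probability only to the interval and exactly zero to the single point.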

**Q 4. What are probability distribution functions (PDF)?**

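**Ans.** In probability theory, a probability distribution function describes how the values of a random variable are distributed, that is, how likely different outcomes are. The abbreviation PDF is normally reserved for the probability *density* function of a continuous variable; the discrete counterpart is the probability mass function (PMF).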

There are two main types, depending on whether the random variable is discrete or continuous:

### 1. For Discrete Random Variables:

**Probability Mass Function (PMF)**  
- Gives the probability that a **discrete random variable** is exactly equal to some value.
- Notation:

$$
P(X = x)
$$

- Example: Tossing a fair die  
$$
P(X = 3) = \frac{1}{6}
$$

---

### 2. For Continuous Random Variables:

**Probability Density Function (PDF)**  
- Describes the *relative likelihood* for a **continuous random variable** to take on a value.
- Probability is found **over an interval**, not at a single point.
- The area under the curve between two values represents the probability:

$$
P(a \leq X \leq b) = \int_a^b f(x)\, dx
$$

- For any single value:
$$
P(X = x) = 0
$$
(because probability is the area under the curve, and a single point has zero width)

Example: Normal distribution with mean = 0 and standard deviation = 1
- Bell-shaped curve
- PDF gives higher values near the mean

**Q 5. How do cumulative distribution functions (CDF) differ from probability distribution functions (PDF)?**

**Ans.** 

| Feature            | Probability Distribution Function (PDF)                                                                                                  | Cumulative Distribution Function (CDF)                                                             |
| ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
| **Definition**     | Describes the **likelihood** of a random variable taking a specific value (for discrete) or falling within an interval (for continuous). | Describes the **probability** that a random variable is **less than or equal to** a certain value. |
| **Representation** | $f(x)$                                                                                                                                   | $F(x) = P(X \leq x)$                                                                               |
| **Values**         | For continuous: $f(x) \geq 0$, but not a probability directly. <br> For discrete: $P(X = x)$.                                            | Always between 0 and 1.                                                                            |
| **Use**            | Tells how **densely** the values are distributed.                                                                                        | Tells the **accumulated probability** up to a point.                                               |
| **Graph**          | Usually a curve that peaks and falls (like bell-shaped for normal distribution).                                                         | Always a **non-decreasing** curve from 0 to 1.                                                     |
| **Relation**       | CDF is the **integral** (area under curve) of the PDF for continuous variables:  <br> $F(x) = \int_{-\infty}^{x} f(t)\,dt$               | PDF is the **derivative** of the CDF (if it exists):  <br> $f(x) = \frac{d}{dx}F(x)$               |
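
The relation row of the table can be checked numerically. The sketch below is an illustration under stated choices: a fair die for the discrete case, the standard normal for the continuous case, and an arbitrary evaluation point `x` and step size `h`.

```python
from fractions import Fraction
from itertools import accumulate
from statistics import NormalDist

# Discrete case: the CDF is the running sum of the PMF (fair six-sided die).
faces = range(1, 7)
pmf = [Fraction(1, 6) for _ in faces]
cdf = list(accumulate(pmf))
print(cdf[2])  # P(X <= 3) = 1/2

# Continuous case: the PDF is the derivative of the CDF (standard normal).
z = NormalDist()  # mean 0, standard deviation 1
x, h = 0.7, 1e-5  # evaluation point and step size, both arbitrary
slope = (z.cdf(x + h) - z.cdf(x - h)) / (2 * h)  # central difference
print(slope, z.pdf(x))  # the two numbers should agree closely
```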


**Q 6.  What is a discrete uniform distribution?**

**Ans.**

A **discrete uniform distribution** is a probability distribution in which all outcomes are **equally likely**.

If a random variable \( X \) can take on \( n \) distinct values \( x_1, x_2, ..., x_n \), then:

$$
P(X = x_i) = \frac{1}{n}, \quad \text{for } i = 1, 2, ..., n
$$

---

### 🎲 Example:
- Rolling a fair 6-sided die:
  - Possible outcomes: \( \{1, 2, 3, 4, 5, 6\} \)
  - Probability of each outcome:

$$
P(X = x) = \frac{1}{6}
$$

---

### 📊 Properties:
- **Mean** (Expected Value):

$$
\mu = \frac{a + b}{2}
$$

- **Variance**:

$$
\sigma^2 = \frac{(b - a + 1)^2 - 1}{12}
$$

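Where \( a \) and \( b \) are the smallest and largest values that \( X \) can take. These two formulas assume the values are the consecutive integers \( a, a+1, \dots, b \), so that \( n = b - a + 1 \).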
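
As a quick check, the mean and variance formulas can be compared with a direct computation from the definition. The sketch uses the fair die (values 1 to 6) and exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

a, b = 1, 6  # fair six-sided die: consecutive integers a..b
values = range(a, b + 1)
n = len(values)

# Direct computation from the definition, each value having probability 1/n.
mean = Fraction(sum(values), n)
variance = sum((x - mean) ** 2 for x in values) / n

# Closed-form expressions for consecutive integers a..b.
mean_formula = Fraction(a + b, 2)
variance_formula = Fraction((b - a + 1) ** 2 - 1, 12)

print(mean, variance)  # 7/2 35/12
```

Both routes give a mean of 7/2 and a variance of 35/12.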

---

### 🧠 Use Cases:
- Fair dice rolls  
- Random draws where each value is equally likely


**Q 7. What are the key properties of a Bernoulli distribution?**

**Ans.** 

The **Bernoulli distribution** models a random experiment with only **two possible outcomes**: 1 (success) or 0 (failure).

---

## 🔑 Key Properties:

### 1. Binary Outcomes:
- The random variable \( X \in \{0, 1\} \)
- Typically:  
  - \( X = 1 \rightarrow \text{success} \)  
  - \( X = 0 \rightarrow \text{failure} \)

---
![image.png](attachment:0cc8fd1b-eef9-4be4-a29a-c5bb523fb1e7.png)

---

### 8. Memoryless Property:

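- The Bernoulli distribution **does not** have the memoryless property, since it describes a single trial. Among discrete distributions, memorylessness belongs to the geometric distribution, which counts Bernoulli trials until the first success.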

---

### 9. Relation to Other Distributions:

- A **Binomial distribution** with \( n = 1 \) is a Bernoulli distribution.
- It is a special case of the **Categorical distribution** with 2 classes.
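
The sketch below illustrates a Bernoulli variable using standard facts: P(X = 1) = p, P(X = 0) = 1 - p, mean p, and variance p(1 - p). It also checks the stated link to the binomial distribution with n = 1. The value p = 0.3 and the function names are arbitrary choices for the example.

```python
from math import comb

def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) for X ~ Bernoulli(p); zero for anything other than 0 or 1."""
    if x not in (0, 1):
        return 0.0
    return p if x == 1 else 1 - p

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 0.3  # arbitrary success probability
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
variance = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in (0, 1))
print(mean, variance)  # close to p and p * (1 - p)

# A binomial distribution with n = 1 gives the same probabilities.
same_as_binomial = all(
    abs(bernoulli_pmf(x, p) - binomial_pmf(x, 1, p)) < 1e-12 for x in (0, 1)
)
print(same_as_binomial)
```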

**Q 8. What is the binomial distribution, and how is it used in probability?**

**Ans.** The **Binomial Distribution** is a discrete probability distribution that models the number of successes in a fixed number of independent **Bernoulli trials**, each with the same probability of success.

---

### ✅ Key Characteristics:
- The experiment consists of **n independent trials**.
- Each trial has **only two outcomes**: Success (S) or Failure (F).
- The **probability of success (p)** remains constant for each trial.
- The **random variable X** represents the number of successes in n trials.

---

### 📐 Probability Mass Function (PMF):

The probability of observing exactly \( k \) successes in \( n \) trials is given by:

\[
P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}
\]

Where:
- \( \binom{n}{k} = \frac{n!}{k!(n-k)!} \) is the **binomial coefficient**,
- \( p \) is the probability of success,
- \( (1 - p) \) is the probability of failure,
- \( k = 0, 1, 2, ..., n \)

---

### 🧠 Usage in Probability:

The binomial distribution is used to:
- Find the probability of a **specific number of successes** (such as getting 3 heads in 5 coin tosses).
- Model **binary outcomes** such as pass/fail, yes/no, or success/failure in repeated experiments.

---

### 📊 Example:

If you flip a fair coin 10 times, what is the probability of getting exactly 6 heads?

Let \( n = 10 \), \( k = 6 \), and \( p = 0.5 \):

\[
P(X = 6) = \binom{10}{6} (0.5)^6 (0.5)^4 = \binom{10}{6} (0.5)^{10}
\]
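
The coin-flip example can be evaluated with a few lines of code. This is a minimal sketch using only the standard library; the helper name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

prob = binomial_pmf(6, 10, 0.5)
print(prob)  # 210 / 1024 = 0.205078125

# Sanity check: the probabilities over k = 0..10 add up to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(total)
```

So the probability of exactly 6 heads in 10 fair flips is about 20.5%.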


**Q 9. What is the Poisson distribution and where is it applied?**

**Ans.** The **Poisson Distribution** is a discrete probability distribution that expresses the probability of a given number of events occurring in a **fixed interval of time or space**, assuming the events occur with a known constant mean rate and independently of the time since the last event.

---

### ✅ Key Characteristics:
- It models the **number of occurrences** of an event in a fixed interval (time, area, volume, etc.).
- Events occur **independently**.
- The **average rate (λ or lambda)** of events is constant.
- Two events **cannot occur at the exact same instant**.

---

### 📐 Probability Mass Function (PMF):

The probability of observing exactly \( k \) events is given by:

\[
P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}
\]

Where:
- \( \lambda \) = expected number of occurrences in the interval (mean rate),
- \( k \) = actual number of occurrences (0, 1, 2, ...),
- \( e \) = Euler’s number (approximately 2.71828).

---

### 🧠 Applications of Poisson Distribution:

Poisson distribution is used when modeling:
- The number of **emails received per hour**.
- The number of **accidents at a traffic signal** per week.
- The number of **calls at a call center** in a given time frame.
- The number of **decay events** per second from a radioactive source.
- The number of **spelling errors per page** in typed text.

---

### 📊 Example:

If a call center receives an average of 5 calls per hour, what is the probability it receives exactly 3 calls in an hour?

Let \( \lambda = 5 \), \( k = 3 \):

\[
P(X = 3) = \frac{e^{-5} \cdot 5^3}{3!} = \frac{e^{-5} \cdot 125}{6}
\]
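
The call-center example can be evaluated numerically. The sketch below uses only the standard library; the helper name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

prob = poisson_pmf(3, 5)
print(round(prob, 4))  # 0.1404

# Sanity check: summing over a long range of k gives a total close to 1.
total = sum(poisson_pmf(k, 5) for k in range(60))
print(total)
```

So with an average of 5 calls per hour, the probability of exactly 3 calls in an hour is roughly 14%.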


**Q 10.  What is a continuous uniform distribution?**

**Ans.** The **Continuous Uniform Distribution** is a probability distribution with **constant probability density** over a continuous interval \([a, b]\), where:
- Subintervals of equal length within \([a, b]\) are **equally probable**.
- Any single value has probability 0, as with every continuous distribution.

---

### ✅ Key Properties:
- **Support**: \( a \leq x \leq b \)
- **Probability Density Function (PDF)**:

\[
f(x) = 
\begin{cases}
\frac{1}{b - a}, & \text{for } a \leq x \leq b \\
0, & \text{otherwise}
\end{cases}
\]

- **Mean** (Expected value):

\[
\mu = \frac{a + b}{2}
\]

- **Variance**:

\[
\sigma^2 = \frac{(b - a)^2}{12}
\]

---

### 🧠 Applications:

The continuous uniform distribution is used for:
- Quantities where no value in a given range is favored over another.
- **Random number generation** within a fixed range.
- **Modeling measurement errors** with no known bias.
- **Simulations** that need equally likely outcomes over a continuous range.

---

### 📊 Example:

If a bus arrives at a stop **every 30 minutes**, and you arrive at a **random time**, the waiting time is uniformly distributed between 0 and 30 minutes.

Let \( a = 0 \), \( b = 30 \):

\[
f(x) = \frac{1}{30}, \quad \text{for } 0 \leq x \leq 30
\]
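
Probabilities for the bus example follow from the constant density: the chance of landing in an interval is its length divided by \( b - a \). The sketch below uses exact fractions; the specific questions asked (waiting at most 10 minutes, or between 5 and 15 minutes) are illustrative additions, not part of the original example.

```python
from fractions import Fraction

a, b = 0, 30  # waiting time in minutes, from the bus example

def uniform_cdf(x: int, a: int, b: int) -> Fraction:
    """P(X <= x) for X uniform on [a, b]; integer inputs keep results exact."""
    if x <= a:
        return Fraction(0)
    if x >= b:
        return Fraction(1)
    return Fraction(x - a, b - a)

p_at_most_10 = uniform_cdf(10, a, b)                      # P(X <= 10)
p_5_to_15 = uniform_cdf(15, a, b) - uniform_cdf(5, a, b)  # P(5 <= X <= 15)
mean = Fraction(a + b, 2)
variance = Fraction((b - a) ** 2, 12)
print(p_at_most_10, p_5_to_15, mean, variance)  # 1/3 1/3 15 75
```

Both intervals are 10 minutes long, so they have the same probability, and the average wait is 15 minutes.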


**Q 11.  What are the characteristics of a normal distribution?**

**Ans.** The **Normal Distribution** (or Gaussian distribution) is a **bell-shaped**, **symmetric** probability distribution defined by its **mean (μ)** and **standard deviation (σ)**.

---

### ✅ Key Characteristics:
- **Symmetric** about the mean \( \mu \)
- **Mean = Median = Mode**
- Follows the **68-95-99.7 Rule**:
  - ~68% of data within \( \mu \pm 1\sigma \)
  - ~95% within \( \mu \pm 2\sigma \)
  - ~99.7% within \( \mu \pm 3\sigma \)
- Total area under the curve = **1**
- Tails extend infinitely in both directions

---

### 📐 Probability Density Function (PDF):

\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}\]
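
The density formula and the 68-95-99.7 rule can both be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the parameters \( \mu = 100 \) and \( \sigma = 15 \) are arbitrary, since the rule holds for any normal distribution.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density written out directly from the formula."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

mu, sigma = 100.0, 15.0  # arbitrary parameters for the illustration
dist = NormalDist(mu, sigma)

# The hand-written formula should match the library's density.
print(normal_pdf(110.0, mu, sigma), dist.pdf(110.0))

# Probability mass within k standard deviations of the mean.
coverage = {
    k: dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma) for k in (1, 2, 3)
}
for k, prob in coverage.items():
    print(k, round(prob, 4))  # 0.6827, 0.9545, 0.9973
```

Changing `mu` or `sigma` shifts or stretches the curve but leaves these three coverage values unchanged.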

---

### 🧠 Applications:
Used in **natural and social sciences**, **statistics**, **machine learning**, and **quality control**.
