## Random Variable
Before expected value or variance, we need to understand the building block: random variables.  

A random variable (RV) is not random in the sense of "wild chaos." Think of it as a function that assigns numbers to outcomes of a random event.  

- Example: Roll a fair die → your random variable X could be "the number showing on the die".  
- Example: Toss a coin → your random variable Y could be "1 if heads, 0 if tails".  

There are two main types:  
- **Discrete random variable** → takes on separate values (like dice rolls: 1, 2, 3, 4, 5, 6).  
- **Continuous random variable** → takes on any value within an interval (like the exact height of a student: 167.43 cm, 167.431 cm, etc.).  
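
To make the "function that assigns numbers to outcomes" idea concrete, here is a minimal sketch of the two examples above. The names `die_faces`, `X`, and `Y` are illustrative choices, and the fixed seed is only there so that reruns give the same draws.

```python
import random

# Sample space of a single die roll: the raw outcomes of the experiment.
die_faces = [1, 2, 3, 4, 5, 6]


def X(outcome: int) -> int:
    """Random variable X: the number showing on the die."""
    return outcome


def Y(outcome: str) -> int:
    """Random variable Y: 1 if the coin shows heads, 0 if tails."""
    return 1 if outcome == "heads" else 0


rng = random.Random(0)  # fixed seed (an assumption) for repeatable draws
roll = rng.choice(die_faces)
toss = rng.choice(["heads", "tails"])

print("die outcome:", roll, "-> X =", X(roll))
print("coin outcome:", toss, "-> Y =", Y(toss))
```

The randomness lives in the draw (`rng.choice`); `X` and `Y` themselves are ordinary deterministic functions.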


## Probability Distributions  
The random variable doesn’t “float freely.” It’s tied to a probability distribution that tells us how likely each value is.  

Because the values of a random variable come from a random phenomenon, its behavior is governed by probability, not certainty. The function that describes those probabilities is called a probability distribution:

- For discrete random variables, this function is called a **probability mass function (PMF)**.
- For continuous random variables, it is a **probability density function (PDF)**. Probabilities come from areas under the density, not from its height at a single point.

Rules for a distribution:  
- **No negative probabilities.** Every probability (or density) value must be ≥ 0. It does not make sense to have a "−20% chance."  
- **Total probability must be 1.**  
  - For discrete distributions: all probabilities add up to 1.  
    Example: For a fair die, P(X=k) = 1/6 for k = 1, 2, 3, 4, 5, 6, and 6 × 1/6 = 1.  
  - For continuous distributions: probabilities are described by a density function (PDF). The curve can go above 1, but the area under the curve across all possible values equals 1:  
    ∫(−∞ to ∞) f(x) dx = 1  
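
The two rules can be checked numerically. The sketch below is only an illustration: `is_valid_pmf` and `area_under` are hypothetical helper names, the integral is approximated with a simple midpoint rule, and the Uniform(0, 0.5) density is an assumed example chosen because its height (2) is above 1 while its area is still 1.

```python
from fractions import Fraction

# PMF of a fair die, stored as exact fractions.
die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}


def is_valid_pmf(pmf):
    """Check both rules: no negative values, and a total of exactly 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1


def area_under(pdf, a, b, steps=100_000):
    """Approximate the integral of pdf over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width


def narrow_uniform_pdf(x):
    """Assumed example: density of Uniform(0, 0.5), height 2 on [0, 0.5]."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0


print("die PMF valid:", is_valid_pmf(die_pmf))
print("density height at 0.25:", narrow_uniform_pdf(0.25))
print("area under density:", area_under(narrow_uniform_pdf, 0.0, 0.5))
```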

Think of it like this:  
Discrete = “probability mass stacked into boxes.”  
Continuous = “smooth probability liquid spread under a curve of area 1.”  


## Expected Value (Mean, Expectation)  
The expected value (EV), also called the mean or mathematical expectation, is the "center" of a probability distribution: the point where it would balance if it were a teeter-totter.  
It is the long-run average of a random variable's possible outcomes, each weighted by its probability. Imagine repeating the random process indefinitely: the average outcome you would see is the expected value.

Mathematical definitions:  
- For a discrete RV X:  
**E[X] = ∑ xi P(X=xi)** (sum of each value times its probability).  

- For a continuous RV X:  
**E[X] = ∫(−∞ to ∞) x f(x) dx** (integral of each value weighted by the density).  

Interpretation: Expected value = “long-run average.”  
Imagine repeating an experiment thousands of times. If you take the average of all outcomes, it will settle closer and closer to the expected value.  

Example 1: Dice Roll  
E[X] = ∑(k=1 to 6) k⋅(1/6) = 21/6 = 3.5  
Even though 3.5 is not a possible roll, it is the balance point of the distribution: the outcomes below it (1, 2, 3) sit exactly as far from 3.5 in total as the outcomes above it (4, 5, 6).  

Example 2: Continuous Uniform(0, 2)  
If every value between 0 and 2 is equally likely, the PDF is f(x) = 1/(2−0) = 0.5 on [0, 2] and 0 elsewhere.  
E[X] = ∫(0 to 2) x⋅0.5 dx = 0.25⋅2² − 0.25⋅0² = 1  
Mean = mid-point = 1.  
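
Both worked examples can be reproduced in a few lines. This is a sketch rather than a general-purpose routine: the function names are illustrative, the discrete case uses exact fractions, and the continuous case approximates the integral with a midpoint rule over an interval that is assumed to contain all of the density's mass.

```python
from fractions import Fraction


def expected_value_discrete(pmf):
    """E[X] = sum of each value times its probability."""
    return sum(x * p for x, p in pmf.items())


def expected_value_continuous(pdf, a, b, steps=100_000):
    """Approximate E[X] = integral of x * f(x) over [a, b] (midpoint rule)."""
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * width
        total += x * pdf(x)
    return total * width


die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}


def uniform_0_2_pdf(x):
    """Density of Uniform(0, 2): 0.5 on [0, 2], 0 elsewhere."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


die_mean = expected_value_discrete(die_pmf)
uniform_mean = expected_value_continuous(uniform_0_2_pdf, 0.0, 2.0)

print("E[X] for the die:", die_mean, "=", float(die_mean))
print("E[X] for Uniform(0, 2):", round(uniform_mean, 6))
```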

## Median vs Mean (Why not always use median?)  
**Mean (expected value)**: Balance point. Sensitive to outliers.  
**Median**: Middle value if you sort the data. Not affected by outliers.  

Example: Weekly salary of 5 people:  
Mean ≈ 2,710 → inflated by one outlier.  
Median = 700 → gives a “typical” sense.  
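
The effect of one outlier is easy to see in code. The five salaries below are hypothetical: the original values are not listed in the text, and these were picked only so that the summary figures land near the ones quoted above.

```python
import statistics

# Hypothetical weekly salaries (NOT the original data, which is not given).
hypothetical_salaries = [500, 600, 700, 750, 11_000]

mean_salary = statistics.mean(hypothetical_salaries)
median_salary = statistics.median(hypothetical_salaries)

# Drop the largest value to see how each summary reacts to the outlier.
without_outlier = sorted(hypothetical_salaries)[:-1]
mean_without = statistics.mean(without_outlier)
median_without = statistics.median(without_outlier)

print("with outlier:    mean =", mean_salary, " median =", median_salary)
print("without outlier: mean =", mean_without, " median =", median_without)
```

With these made-up numbers, removing the single large salary moves the mean by a factor of about four, while the median shifts only slightly.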

So why stick with mean in probability theory?  
Because the Law of Large Numbers links the mean to repeated trials — averages converge to the expected value, not the median. 

## Variance (and Standard Deviation)  
If expected value tells us center, variance tells us spread (how wide the distribution is).  

Definition  
Variance of X: **Var(X) = E[(X−E[X])²] = “Average squared distance from the mean.”**  

Discrete case: Var(X) = ∑ (xi−μ)² P(X=xi)  
Continuous case: Var(X) = ∫(−∞ to ∞) (x−μ)² f(x) dx  
where μ = E[X] is the expected value (mean).

Example: Dice roll variance  
Mean = 3.5. Compute the squared deviations (6.25, 2.25, 0.25, 0.25, 2.25, 6.25, which sum to 17.5) and weight each by 1/6:  
Var(X) = (1/6) ∑(k=1 to 6)(k−3.5)² = 17.5/6 ≈ 2.92  
So standard deviation: **σ = √Var(X)** ≈ 1.71  
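
The same arithmetic can be checked with a short script. This is a minimal sketch using exact fractions for the variance; the variable names are illustrative.

```python
import math
from fractions import Fraction

die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}

# Mean: probability-weighted sum of the values.
mu = sum(k * p for k, p in die_pmf.items())

# Variance: probability-weighted sum of squared deviations from the mean.
variance = sum((k - mu) ** 2 * p for k, p in die_pmf.items())

# Standard deviation: square root of the variance, in the units of X.
std_dev = math.sqrt(float(variance))

print("mean:", mu)
print("variance:", variance, "which is about", round(float(variance), 2))
print("standard deviation: about", round(std_dev, 2))
```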

Key intuition:  
Low variance: Values tightly clustered near mean.  
High variance: Values scattered widely.  
Units: If X is measured in meters, variance is measured in m². That is why people often prefer standard deviation (σ): it has the same units as the variable.  


## The Law of Large Numbers (Why the Mean Matters)  
This is a cornerstone theorem:  
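If you repeat an experiment many times, independently and under the same conditions, the empirical average of the outcomes will converge to the expected value.  

Formally, for independent, identically distributed X1, …, Xn with finite mean E[X]: (1/n) ∑(i=1 to n) Xi → E[X] as n→∞  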

Example: Toss a fair coin, coding heads = 1 and tails = 0, so E[X] = 0.5.  
At 10 tosses, average may be 0.6.  
At 1,000, it will hover near 0.5.  
At 1,000,000, it will be extremely close to 0.5.  

This property makes the mean reliable as the long-run description of a system.  
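
A small simulation shows this convergence in action. It is a sketch under stated assumptions: `coin_average` is a hypothetical helper, Python's pseudo-random generator stands in for a real coin, the seed is fixed only for repeatability, and the largest run is capped at 100,000 tosses to keep it fast.

```python
import random


def coin_average(n_tosses, seed=0):
    """Average of n_tosses simulated fair-coin flips (heads = 1, tails = 0)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    heads = sum(rng.randint(0, 1) for _ in range(n_tosses))
    return heads / n_tosses


for n in (10, 1_000, 100_000):
    average = coin_average(n)
    print(f"n = {n:>7}: average = {average:.4f}, "
          f"distance from 0.5 = {abs(average - 0.5):.4f}")
```

The exact numbers depend on the seed, but the distance from 0.5 typically shrinks as n grows, roughly in proportion to 1/√n.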


## Putting It All Together  

**Random variable**: assigns numbers to random outcomes.  
**Distribution**: describes likelihood of different outcomes.  
**Expected value (mean, μ)**: balance point, long-run average.  
**Variance** (σ²): average squared deviation from mean → spread/uncertainty.  
**Standard deviation** (σ): square root of variance, same units as variable.  
**Median vs Mean**: median is robust to outliers, but mean links to long-run averages.  
**Law of Large Numbers**: sample averages converge to expected value, justifying its importance.  

So, if I simplify:  
Expected value = "center" of gravity, long-term average  
Variance = "width" of the spread around that center  
