## Probability Model and Probability Law

*(Coding along with the Udemy course [Mastering Probability & Statistics Python (Theory & Projects)](https://www.udemy.com/course/mastering-probability-and-statistics-in-python/) by Sajjad Mustafa)*

#### __Probability Model__
> 
> A probability model consists of three components:
> 1. Sample space (Ω or S): The set of all possible outcomes
> 2. Events (subsets of S): Collections of outcomes we're interested in
> 3. Probability law/measure: A function that assigns probabilities to events
> 
> For example, for a fair coin flip:
> - Sample Space S = {Heads, Tails}
> - Events could be any subset: {}, {Heads}, {Tails}, {Heads, Tails}
> - Probability law: P(Heads) = P(Tails) = 1/2
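
The three components of the coin-flip model can be written out directly in Python. This is a minimal sketch, not code from the course: the names `sample_space`, `all_events`, `outcome_prob`, and `prob` are illustrative choices, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import chain, combinations

# 1. Sample space for a single coin flip
sample_space = frozenset({"Heads", "Tails"})

def all_events(space):
    """Return every subset of the sample space (its power set) as frozensets."""
    items = sorted(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

# 3. Probability law, given here by the probability of each single outcome
outcome_prob = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}

def prob(event):
    """P(event): add up the probabilities of the outcomes the event contains."""
    return sum((outcome_prob[o] for o in event), Fraction(0))

# 2. Events: {}, {Heads}, {Tails}, {Heads, Tails}
events = all_events(sample_space)
for event in events:
    print(sorted(event), prob(event))
```

The loop prints each of the four events next to its probability, with the empty event at 0 and the full sample space at 1.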

#### __Probability Law__
>
> The probability law (or probability measure) is a function P that assigns a real number P(A) to each event A, satisfying these axioms:
> 
> 1. Non-negativity: P(A) ≥ 0 for any event A
> 2. Normalization: P(S) = 1 for the sample space S
> 3. Additivity: For disjoint events A and B, P(A ∪ B) = P(A) + P(B)
> 
> The probability law tells us how likely each event is to occur. For example:
> - Rolling a fair die: P(getting a 6) = 1/6
> - Drawing from a deck: P(getting a heart) = 13/52
> - Tossing a biased coin: P(Heads) = 0.6, P(Tails) = 0.4
> 
> ***The probability law must be consistent with the axioms of probability and reflect the nature of the random experiment being modeled.***
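
For a finite sample space, the axioms can be checked mechanically. The sketch below is one possible way to do it, assuming the probability law is given as a dictionary of outcome probabilities; the helper names `is_valid_probability_law` and `prob` are illustrative, and `broken_coin` is a deliberately invalid table added for contrast.

```python
import math

def is_valid_probability_law(outcome_prob, tol=1e-9):
    """Check non-negativity and normalization for a finite table of outcome probabilities."""
    non_negative = all(p >= 0 for p in outcome_prob.values())
    normalized = math.isclose(sum(outcome_prob.values()), 1.0, abs_tol=tol)
    return non_negative and normalized

def prob(event, outcome_prob):
    """P(event) as the sum over its outcomes, so disjoint events add up by construction."""
    return sum(outcome_prob[o] for o in event)

biased_coin = {"Heads": 0.6, "Tails": 0.4}   # the biased coin from the text
broken_coin = {"Heads": 0.6, "Tails": 0.5}   # sums to 1.1, so normalization fails

print(is_valid_probability_law(biased_coin))
print(is_valid_probability_law(broken_coin))

# Additivity on a fair die: {1, 2} and {6} are disjoint events
fair_die = {face: 1 / 6 for face in range(1, 7)}
a, b = {1, 2}, {6}
additive = math.isclose(prob(a | b, fair_die), prob(a, fair_die) + prob(b, fair_die))
print(additive)
```

Floating-point sums are compared with `math.isclose` rather than `==`, since values such as 1/6 are not represented exactly.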

#### __The Axioms of Probability__

> The Axioms of Probability, also known as ***Kolmogorov's axioms***, are the fundamental rules that define probability mathematically.
> 
> 1. Non-negativity:
>    The probability of any event must be non-negative:
>    P(A) ≥ 0 for any event A ⊆ Ω
> 
> 2. Normalization:
>    The probability of the entire sample space (Ω) equals 1:
>    P(Ω) = 1
> 
> 3. Countable Additivity (or σ-additivity):
>    For a countable sequence of mutually exclusive (pairwise disjoint) events A₁, A₂, ..., the probability of their union equals the sum of their individual probabilities:
>    P(A₁ ∪ A₂ ∪ ...) = P(A₁) + P(A₂) + ...
> 
> From these three basic axioms, we can derive several important properties:
> 
> - Probability of impossible events: P(∅) = 0
> - Complement rule: P(A') = 1 - P(A)
> - Addition rule for any two events: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
> - Monotonicity: If A ⊆ B, then P(A) ≤ P(B)
> - Probability bounds: 0 ≤ P(A) ≤ 1 for any event A
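
The derived properties can be spot-checked on a small example. The sketch below assumes a fair six-sided die with a uniform probability law; the events `A`, `B`, and `C` are arbitrary choices made for illustration, and a numeric check on one example is not a proof.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))   # fair six-sided die

def prob(event):
    """Uniform probability law: |event| / |omega|."""
    return Fraction(len(event), len(omega))

A = frozenset({2, 4, 6})   # even roll
B = frozenset({4, 5, 6})   # roll greater than 3
C = frozenset({6})         # a subset of A

checks = {
    "impossible event": prob(frozenset()) == 0,
    "complement rule": prob(omega - A) == 1 - prob(A),
    "addition rule": prob(A | B) == prob(A) + prob(B) - prob(A & B),
    "monotonicity": C <= A and prob(C) <= prob(A),
    "bounds": all(0 <= prob(E) <= 1 for E in (A, B, C, omega)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Here `A` and `B` overlap in {4, 6}, so simply adding P(A) and P(B) would count those outcomes twice; subtracting P(A ∩ B) corrects for that.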


#### __Normalization in Probability Theory__
> Normalization (P(Ω) = 1) means that the probabilities of all possible outcomes in the sample space sum to 1 (100% when expressed as percentages); for a continuous variable the sum becomes an integral. The sample space (Ω) contains every possible outcome of the random experiment.
> 
> Here are some concrete examples to illustrate this:
>  
> 1. For a fair coin flip:
>     - Sample space Ω = {Heads, Tails}
>     - P(Heads) = 0.5, P(Tails) = 0.5
>     - P(Ω) = P(Heads) + P(Tails) = 0.5 + 0.5 = 1
> 
> 2. For a six-sided die:
>     - Sample space Ω = {1, 2, 3, 4, 5, 6}
>     - For a fair die, P(each number) = 1/6
>     - P(Ω) = P(1) + P(2) + P(3) + P(4) + P(5) + P(6) = 6 × (1/6) = 1
> 
> 3. For a continuous random variable like height:
>     - The total area under the probability density function must equal 1
>     - ∫ f(x) dx = 1 over the entire range, where f is the probability density function
> 
> This axiom ensures that we account for all possible outcomes and maintains consistency in probability calculations. It's a fundamental principle that helps us validate probability distributions – if the probabilities don't sum to 1, we know something is wrong with our model.
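
In practice, raw weights or counts are often turned into probabilities by dividing by their total, which enforces normalization. The sketch below shows one way to do this; the `normalize` helper is illustrative, and the `counts` table holds made-up tallies for 60 hypothetical die rolls, not real data.

```python
import math

def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {outcome: w / total for outcome, w in weights.items()}

# Hypothetical tallies from 60 die rolls (made-up numbers for illustration)
counts = {1: 8, 2: 11, 3: 9, 4: 12, 5: 10, 6: 10}
probs = normalize(counts)

for face, p in probs.items():
    print(face, round(p, 4))
print(math.isclose(sum(probs.values()), 1.0))
```

The final check uses `math.isclose` because the divided values are floats, so the sum may differ from 1 by a tiny rounding error.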

#### __Probability Density Function__

> A probability density function (PDF) describes the relative likelihood of a continuous random variable taking values near a given point. The density f(x) is not itself a probability and can exceed 1; probabilities come from integrating it over an interval.
> 
> Key characteristics of a PDF:
> 
> 1. It's a function f(x) that is always non-negative: f(x) ≥ 0
> 
> 2. The total area under the curve must equal 1:
>    ∫₋∞^∞ f(x)dx = 1
> 
> 3. The probability of the random variable falling within an interval [a,b] is:
>    P(a ≤ X ≤ b) = ∫ₐ^b f(x)dx
> 
> Important distinctions from discrete probability:
> - For a continuous random variable, the probability of it taking any exact value is 0
> - We can only find probabilities of ranges (intervals)
> 
> A classic example is the Normal (Gaussian) Distribution:
>
> f(x) = (1/√(2πσ²)) · e^(−(x−μ)²/(2σ²))
>
> where:
> - μ is the mean
> - σ is the standard deviation
> - e is Euler's number
> - π is pi
> 
> Other common examples include:
> - Uniform distribution
> - Exponential distribution
> - Chi-square distribution
> - Student's t-distribution
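
The PDF properties above can be checked numerically for the normal distribution. This is a rough sketch using only the standard library: `normal_pdf` and `integrate` are illustrative helpers, the trapezoidal rule is just one simple integration choice, and the range [-10, 10] stands in for (-∞, ∞) on the assumption that the tails beyond it are negligible for a standard normal. `statistics.NormalDist` is used as an independent reference.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density: (1/sqrt(2*pi*sigma^2)) * exp(-(x - mu)^2 / (2*sigma^2))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h

total_area = integrate(normal_pdf, -10, 10)        # property 2: total area is 1
p_within_one_sigma = integrate(normal_pdf, -1, 1)  # property 3: P(-1 <= X <= 1)
p_point = integrate(normal_pdf, 0.5, 0.5)          # zero-width interval

reference = NormalDist().cdf(1) - NormalDist().cdf(-1)

print(round(total_area, 6))
print(round(p_within_one_sigma, 6), round(reference, 6))
print(p_point)
```

The interval probability comes out at roughly 0.68, matching the value from `NormalDist`, and the zero-width interval has probability 0, which illustrates why only ranges carry probability for a continuous variable.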