# Content

[Constructing a probability distribution for a random variable](#constructing-a-probability-distribution-for-a-random-variable)

[Probability with a discrete random variable](#probability-with-a-discrete-random-variable)

## Constructing a Probability Distribution for a Random Variable

#### Theory
First, what is a **Random Variable**? It's a variable whose value is a numerical outcome of a random process. We typically denote it with a capital letter, like `X`.

A **Discrete Random Variable** is one that can only take on a finite or countably infinite number of distinct values (e.g., the numbers on a die: 1, 2, 3, 4, 5, 6; the number of emails you get in an hour: 0, 1, 2, ...).

A **Probability Distribution** for a discrete random variable is essentially a table, graph, or formula that links each possible value of the random variable with its probability of occurring.

For a distribution to be valid, it must follow two rules:
1.  The probability for every value of `X` must be between 0 and 1 (inclusive).
    *   `0 ≤ P(X=x) ≤ 1`
2.  The sum of all the probabilities for all possible values of `X` must equal 1.
    *   `Σ P(X=x) = 1`
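
The two rules can be checked mechanically. The snippet below is a minimal sketch, not a definitive implementation: the function name `is_valid_distribution` and the tolerance parameter `tol` are assumptions introduced here for illustration, and the distribution is assumed to be stored as a dictionary mapping each value of `X` to its probability.

```python
import math


def is_valid_distribution(probs, tol=1e-9):
    """Return True if `probs` (dict: value -> probability) satisfies both rules."""
    values = list(probs.values())
    # Rule 1: every probability lies between 0 and 1 (inclusive).
    in_range = all(0 <= p <= 1 for p in values)
    # Rule 2: the probabilities sum to 1, allowing for floating-point rounding.
    sums_to_one = math.isclose(sum(values), 1.0, abs_tol=tol)
    return in_range and sums_to_one


# A fair six-sided die: each face has probability 1/6.
fair_die = {face: 1 / 6 for face in range(1, 7)}
print(is_valid_distribution(fair_die))

# Not a valid distribution: the probabilities sum to 1.1.
print(is_valid_distribution({0: 0.5, 1: 0.6}))
```

The tolerance is there because sums of floating-point fractions such as `1/6` may differ from 1 by a tiny rounding error.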

#### Calculation Example
Let's construct a probability distribution for the number of heads when we flip **three** fair coins.

1.  **Define the Random Variable:**
    *   Let `X` = the number of heads that appear.

2.  **List all possible outcomes (the Sample Space):**
    *   HHH
    *   HHT, HTH, THH
    *   HTT, THT, TTH
    *   TTT
    *   Total number of equally likely outcomes = 8.

3.  **Link each outcome to a value of the random variable `X`:**
    *   `X=3`: (HHH) - 1 outcome
    *   `X=2`: (HHT, HTH, THH) - 3 outcomes
    *   `X=1`: (HTT, THT, TTH) - 3 outcomes
    *   `X=0`: (TTT) - 1 outcome

4.  **Calculate the probability for each value of `X`:**
    *   `P(X=3)` = (Number of ways to get 3 heads) / (Total outcomes) = 1/8
    *   `P(X=2)` = 3/8
    *   `P(X=1)` = 3/8
    *   `P(X=0)` = 1/8

5.  **Create the Probability Distribution Table:**

| `x` (Number of Heads) | `P(X=x)` |
| :---: | :---: |
| 0 | 1/8 = 0.125 |
| 1 | 3/8 = 0.375 |
| 2 | 3/8 = 0.375 |
| 3 | 1/8 = 0.125 |
| **Total** | **8/8 = 1.0** |

This table is the probability distribution. It fulfills both rules: all probabilities are between 0 and 1, and their sum is 1.
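
The same table can be rebuilt by enumerating the sample space instead of listing it by hand. This is a sketch using only the standard library; the variable names (`outcomes`, `heads_counts`, `coin_dist`) are chosen here for illustration, and exact fractions are used so the results match the `1/8` and `3/8` entries above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Step 2: enumerate all 2**3 = 8 equally likely outcomes, e.g. ('H', 'H', 'T').
outcomes = list(product("HT", repeat=3))

# Step 3: X = number of heads; count how many outcomes give each value of X.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Step 4: P(X=x) = (outcomes with x heads) / (total outcomes).
coin_dist = {
    x: Fraction(n, len(outcomes)) for x, n in sorted(heads_counts.items())
}

# Step 5: print the distribution table and check that it sums to 1.
for x, p in coin_dist.items():
    print(f"P(X={x}) = {p} = {float(p):.3f}")
print("Total =", sum(coin_dist.values()))
```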

#### Real-Life Usage
*   **Retail:** A bookstore manager can create a probability distribution for the number of copies of a bestseller sold per day. This helps in managing inventory and deciding when to reorder.
*   **Call Centers:** A company can model the number of calls received per hour. This distribution helps determine how many operators need to be staffed at different times of the day to meet service level goals.

***

## Probability with a Discrete Random Variable

#### Theory
Once you have a probability distribution, you can use it to find the probability of various events. This usually involves identifying the relevant values of the random variable `X` and summing their probabilities.

You can answer questions like:
*   The probability of an **exact** value: `P(X=a)`
*   The probability of being **less than** a value: `P(X < a)`
*   The probability of being **less than or equal to** a value: `P(X ≤ a)`
*   The probability of being **greater than** a value: `P(X > a)`
*   The probability of being **greater than or equal to** a value: `P(X ≥ a)`
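
Each of these questions reduces to the same operation: pick the values of `X` that belong to the event and add up their probabilities. The sketch below illustrates this with a single fair die; the helper name `event_probability` is an assumption made for this example, not an established API.

```python
from fractions import Fraction


def event_probability(dist, event):
    """Sum P(X=x) over every value x for which event(x) is True."""
    return sum((p for x, p in dist.items() if event(x)), Fraction(0))


# X = face shown by one fair six-sided die; each face has probability 1/6.
die_dist = {face: Fraction(1, 6) for face in range(1, 7)}
a = 3

p_eq = event_probability(die_dist, lambda x: x == a)  # P(X = a)
p_lt = event_probability(die_dist, lambda x: x < a)   # P(X < a)
p_le = event_probability(die_dist, lambda x: x <= a)  # P(X <= a)
p_gt = event_probability(die_dist, lambda x: x > a)   # P(X > a)
p_ge = event_probability(die_dist, lambda x: x >= a)  # P(X >= a)

print(p_eq, p_lt, p_le, p_gt, p_ge)
```

For a discrete variable, `P(X < a)` and `P(X ≤ a)` differ by exactly `P(X = a)`, so the strictness of the inequality matters.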

#### Calculation Example
Let's use the probability distribution we just created for flipping three coins.

| `x` | `P(X=x)` |
| :---: | :---: |
| 0 | 0.125 |
| 1 | 0.375 |
| 2 | 0.375 |
| 3 | 0.125 |

**Question 1: What is the probability of getting exactly two heads?**
*   This is `P(X=2)`.
*   We just read it from the table: `P(X=2) = 0.375`.

**Question 2: What is the probability of getting *fewer than* two heads?**
*   This is `P(X < 2)`, which means `X` can be 0 or 1.
*   We add the probabilities for those values: `P(X < 2) = P(X=0) + P(X=1)`.
*   `P(X < 2) = 0.125 + 0.375 = 0.500`.

**Question 3: What is the probability of getting *at least one* head?**
*   This is `P(X ≥ 1)`, which means `X` can be 1, 2, or 3.
*   We add the probabilities: `P(X ≥ 1) = P(X=1) + P(X=2) + P(X=3)`.
*   `P(X ≥ 1) = 0.375 + 0.375 + 0.125 = 0.875`.
*   **Shortcut using the complement:** `P(X ≥ 1) = 1 - P(X < 1) = 1 - P(X=0) = 1 - 0.125 = 0.875`.
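
The three answers can be reproduced in a few lines. This is a sketch that hard-codes the three-coin table as exact fractions (the name `coin_dist` is illustrative) and confirms that the direct sum and the complement shortcut agree.

```python
from fractions import Fraction

# Three-coin distribution from the table: x -> P(X=x).
coin_dist = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}

# Question 1: P(X = 2) is read directly from the table.
p_exactly_two = coin_dist[2]

# Question 2: P(X < 2) = P(X=0) + P(X=1).
p_fewer_than_two = sum(p for x, p in coin_dist.items() if x < 2)

# Question 3: P(X >= 1), computed directly and via the complement.
p_at_least_one = sum(p for x, p in coin_dist.items() if x >= 1)
p_via_complement = 1 - coin_dist[0]

print(float(p_exactly_two), float(p_fewer_than_two), float(p_at_least_one))
print("Complement agrees:", p_at_least_one == p_via_complement)
```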

#### Real-Life Usage
*   **Finance:** An analyst has a probability distribution for the number of times a stock's price will drop in a week. They can use this to calculate the probability that the stock will drop more than 3 times, helping them assess risk. If the count can be at most 5 (for example, one close per trading day), then `P(Drops > 3) = P(Drops=4) + P(Drops=5)`.
*   **Manufacturing:** A factory manager has a distribution for the number of defective items in a batch of 100. They can calculate the probability of a batch having 2 or fewer defects, `P(Defects ≤ 2) = P(Defects=0) + P(Defects=1) + P(Defects=2)`, which might be a condition for shipping the batch to a customer.

***

### Python Code Illustration



```python
import pandas as pd

# --- Part 1: Constructing a Probability Distribution ---
print("--- Part 1: Constructing a Probability Distribution ---")

# Let's model the sum of rolling two 6-sided dice.
# The random variable X is the sum.
# Possible values for X are 2, 3, 4, ..., 12.

# There are 6x6 = 36 total possible outcomes.
# We'll create a dictionary mapping the sum (value of X) to its probability.
prob_map = {
    2: 1/36,  # (1,1)
    3: 2/36,  # (1,2), (2,1)
    4: 3/36,  # (1,3), (2,2), (3,1)
    5: 4/36,  # (1,4), (2,3), (3,2), (4,1)
    6: 5/36,  # ...and so on
    7: 6/36,
    8: 5/36,
    9: 4/36,
    10: 3/36,
    11: 2/36,
    12: 1/36,
}

# A pandas Series is a great way to represent a probability distribution.
prob_dist = pd.Series(prob_map)

print("Probability Distribution for the Sum of Two Dice:")
print(prob_dist)
print("\nVerifying the rules:")
print(f"Sum of all probabilities: {prob_dist.sum():.2f}\n")


# --- Part 2: Probability with a Discrete Random Variable ---
print("--- Part 2: Using the Probability Distribution ---")

# Let's use the distribution we just created to answer some questions.

# Question 1: What is the probability that the sum is exactly 5?
p_exact_5 = prob_dist[5]
print(f"P(Sum = 5) = {p_exact_5:.4f}")

# Question 2: What is the probability that the sum is 10 or more?
# P(Sum >= 10) = P(Sum=10) + P(Sum=11) + P(Sum=12)
p_10_or_more = prob_dist[prob_dist.index >= 10].sum()
print(f"P(Sum >= 10) = {p_10_or_more:.4f}")

# Question 3: What is the probability that the sum is less than 5?
# P(Sum < 5) = P(Sum=2) + P(Sum=3) + P(Sum=4)
p_less_than_5 = prob_dist[prob_dist.index < 5].sum()
print(f"P(Sum < 5) = {p_less_than_5:.4f}")

# Question 4: What is the probability the sum is between 6 and 8, inclusive?
# P(6 <= Sum <= 8) = P(Sum=6) + P(Sum=7) + P(Sum=8)
p_6_to_8 = prob_dist[(prob_dist.index >= 6) & (prob_dist.index <= 8)].sum()
print(f"P(6 <= Sum <= 8) = {p_6_to_8:.4f}")
```

Output:

```text
--- Part 1: Constructing a Probability Distribution ---
Probability Distribution for the Sum of Two Dice:
2     0.027778
3     0.055556
4     0.083333
5     0.111111
6     0.138889
7     0.166667
8     0.138889
9     0.111111
10    0.083333
11    0.055556
12    0.027778
dtype: float64

Verifying the rules:
Sum of all probabilities: 1.00

--- Part 2: Using the Probability Distribution ---
P(Sum = 5) = 0.1111
P(Sum >= 10) = 0.1667
P(Sum < 5) = 0.1667
P(6 <= Sum <= 8) = 0.4444
```
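
The two-dice probabilities above were typed in by hand. As a cross-check, they can also be derived by enumerating all 36 outcomes. The sketch below uses only the standard library rather than pandas, and the names `rolls`, `sum_counts`, and `dice_dist` are illustrative; the resulting dictionary could presumably be passed to `pd.Series` after converting the fractions to floats.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 6 x 6 = 36 equally likely (die1, die2) pairs.
rolls = list(product(range(1, 7), repeat=2))

# X = sum of the two dice; count how many pairs produce each sum.
sum_counts = Counter(d1 + d2 for d1, d2 in rolls)

# P(Sum=s) = (pairs that add up to s) / 36, kept as exact fractions.
dice_dist = {s: Fraction(n, len(rolls)) for s, n in sorted(sum_counts.items())}

for s, p in dice_dist.items():
    print(f"P(Sum={s:>2}) = {float(p):.6f}")

# Same question as above: P(Sum >= 10) = P(10) + P(11) + P(12).
p_10_or_more = sum(p for s, p in dice_dist.items() if s >= 10)
print(f"P(Sum >= 10) = {float(p_10_or_more):.4f}")
```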
