# Basics of Probability Tutorial

This notebook demonstrates key concepts from probability theory including:
- Discrete probability and basic calculations
- Probability spaces and set theory
- Simple and compound events
- Independent and dependent events
- Combinatorics and counting principles

## 1. Import Required Libraries

First, we'll import the necessary Python libraries for probability calculations and visualizations.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from itertools import product, combinations
import matplotlib.patches as mpatches
from matplotlib_venn import venn2, venn3

# Set style for better visualizations
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (10, 6)
np.random.seed(42)

<details>
<summary>💡 Summary</summary>

We imported numpy for numerical operations, pandas for data manipulation, matplotlib and seaborn for visualizations, itertools for generating combinations and products, and matplotlib_venn for Venn diagrams. These libraries will help us explore probability concepts visually and computationally.
</details>

## 2. Introduction to Probability

### What is Probability?

Probability is a way of quantifying uncertainty. It represents how likely something is to happen, expressed as a number between 0 and 1:
- **P(A) close to 0**: Event A is very unlikely
- **P(A) close to 1**: Event A is very likely
- **P(A) > 0.5**: Event A is more likely to happen than not

In [None]:
# Example: Rolling a fair 6-sided die
def calculate_probability(favorable_outcomes, total_outcomes):
    """Calculate probability using the basic formula"""
    return favorable_outcomes / total_outcomes

# Probability of rolling a 5
die_sides = 6
desired_outcome = 1  # Only one way to roll a 5

prob_roll_5 = calculate_probability(desired_outcome, die_sides)

print("Rolling a Fair 6-Sided Die:")
print(f"P(rolling a 5) = {desired_outcome}/{die_sides} = {prob_roll_5:.4f}")
print(f"P(rolling a 5) = {prob_roll_5*100:.2f}%")
print(f"\nInterpretation: There is a 1 in {die_sides} chance of rolling a 5")

# Probability of rolling a number greater than 2
numbers_greater_than_2 = [3, 4, 5, 6]
prob_greater_than_2 = calculate_probability(len(numbers_greater_than_2), die_sides)

print(f"\nP(rolling > 2) = {len(numbers_greater_than_2)}/{die_sides} = {prob_greater_than_2:.4f}")
print(f"P(rolling > 2) = {prob_greater_than_2*100:.2f}%")

<details>
<summary>💡 Summary</summary>

The basic probability formula is: **P(A) = (number of favorable outcomes) / (total number of outcomes)**. For a fair die, each outcome has equal probability. Rolling a 5 has probability 1/6 (16.67%), while rolling a number greater than 2 has probability 4/6 (66.67%) since four outcomes (3,4,5,6) satisfy the condition.
</details>

### Visualizing Probability Distributions

In [None]:
# Create a probability distribution for a fair die
outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [1/6] * 6

# Create bar chart
plt.figure(figsize=(10, 6))
bars = plt.bar(outcomes, probabilities, color='steelblue', edgecolor='black', alpha=0.7)

# Highlight outcomes > 2
for i, outcome in enumerate(outcomes):
    if outcome > 2:
        bars[i].set_facecolor('coral')  # set_color would also overwrite the black edge

plt.xlabel('Die Outcome', fontsize=12, fontweight='bold')
plt.ylabel('Probability', fontsize=12, fontweight='bold')
plt.title('Probability Distribution of a Fair 6-Sided Die', fontsize=14, fontweight='bold')
plt.xticks(outcomes)
plt.ylim(0, 0.25)
plt.axhline(y=1/6, color='red', linestyle='--', alpha=0.5, label='Equal Probability (1/6)')

# Add legend
blue_patch = mpatches.Patch(color='steelblue', label='Outcomes ≤ 2')
coral_patch = mpatches.Patch(color='coral', label='Outcomes > 2')
plt.legend(handles=[blue_patch, coral_patch, plt.Line2D([0], [0], color='red', linestyle='--', label='P = 1/6')], 
           loc='upper right')

plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

print(f"Sum of all probabilities: {sum(probabilities):.1f}")
print("\nNote: The sum of all probabilities in a sample space must equal 1")

<details>
<summary>💡 Summary</summary>

The bar chart visualizes the probability distribution where each outcome has equal probability (1/6). The coral bars represent outcomes greater than 2, showing that 4 out of 6 outcomes meet this condition. An important property: **the sum of all probabilities in a sample space must equal 1**, representing certainty that one of the outcomes will occur.
</details>
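
As an empirical check on the uniform distribution described above, the following sketch simulates die rolls and compares the observed frequencies with 1/6. The seed and the sample size `n_rolls` are illustrative assumptions, not values from the original example, and the standard-library `random` module is used so the cell runs on its own.

```python
import random
from collections import Counter

# Assumption: a local generator with an arbitrary seed, separate from the notebook's global seed
rng = random.Random(0)
n_rolls = 60_000  # illustrative sample size

rolls = [rng.randint(1, 6) for _ in range(n_rolls)]  # randint includes both endpoints
counts = Counter(rolls)
empirical = {face: counts[face] / n_rolls for face in range(1, 7)}

for face, freq in empirical.items():
    print(f"P({face}) ≈ {freq:.4f} (theoretical {1/6:.4f})")

print(f"Sum of empirical frequencies: {sum(empirical.values()):.4f}")
```

With a sample this large, each observed frequency should sit close to 1/6, and the frequencies sum to 1 by construction.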

## 3. Probability and Sets

### Sample Spaces and Events

- **Sample Space (S)**: The set of all possible outcomes
- **Event (A)**: A subset of the sample space
- **Outcome**: A single element in the sample space

In [None]:
# Example: Days of the week
sample_space = {'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'}

# Define events
event_weekend = {'Saturday', 'Sunday'}
event_workday = {'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'}

print("Sample Space S:")
print(f"S = {sample_space}")
print(f"Cardinality |S| = {len(sample_space)}")

print("\nEvent A (Weekend):")
print(f"A = {event_weekend}")
print(f"Cardinality |A| = {len(event_weekend)}")
print(f"P(A) = |A| / |S| = {len(event_weekend)}/{len(sample_space)} = {len(event_weekend)/len(sample_space):.4f}")

print("\nEvent B (Workday):")
print(f"B = {event_workday}")
print(f"Cardinality |B| = {len(event_workday)}")
print(f"P(B) = |B| / |S| = {len(event_workday)}/{len(sample_space)} = {len(event_workday)/len(sample_space):.4f}")

# Check that events are mutually exclusive and exhaustive
print(f"\nA ∩ B (intersection): {event_weekend.intersection(event_workday)}")
print(f"A ∪ B (union): {event_weekend.union(event_workday)}")
print(f"Is A ∪ B = S? {event_weekend.union(event_workday) == sample_space}")

<details>
<summary>💡 Summary</summary>

Sets provide a mathematical framework for probability. The **sample space S** contains all possible outcomes. An **event A** is a subset of S. The probability P(A) = |A| / |S| when outcomes are equally likely. In this example, weekend and workday events are **mutually exclusive** (no overlap) and **exhaustive** (together they cover all outcomes in S).
</details>

### Venn Diagrams for Events

In [None]:
# Example: Survey about programming languages
# 1500 developers, 860 enjoy Python, 800 enjoy JavaScript, 230 enjoy both

total_respondents = 1500
python_total = 860
javascript_total = 800
both_languages = 230

# Calculate exclusive regions
python_only = python_total - both_languages
javascript_only = javascript_total - both_languages
neither = total_respondents - (python_only + javascript_only + both_languages)

print("Programming Language Survey (1500 developers):")
print(f"Python only: {python_only}")
print(f"JavaScript only: {javascript_only}")
print(f"Both Python and JavaScript: {both_languages}")
print(f"Neither: {neither}")

# Create Venn diagram
plt.figure(figsize=(10, 8))
venn = venn2(subsets=(python_only, javascript_only, both_languages), 
             set_labels=('Python', 'JavaScript'),
             set_colors=('skyblue', 'lightcoral'),
             alpha=0.7)

# Customize labels
venn.get_label_by_id('10').set_text(f'{python_only}\n({python_only/total_respondents*100:.1f}%)')
venn.get_label_by_id('01').set_text(f'{javascript_only}\n({javascript_only/total_respondents*100:.1f}%)')
venn.get_label_by_id('11').set_text(f'{both_languages}\n({both_languages/total_respondents*100:.1f}%)')

plt.title('Developer Programming Language Preferences\n(Venn Diagram)', fontsize=14, fontweight='bold')
plt.text(0, -0.8, f'Neither language: {neither} ({neither/total_respondents*100:.1f}%)', 
         ha='center', fontsize=11, bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))
plt.tight_layout()
plt.show()

# Calculate probabilities
print("\nProbabilities:")
print(f"P(Python only) = {python_only/total_respondents:.4f}")
print(f"P(JavaScript only) = {javascript_only/total_respondents:.4f}")
print(f"P(Both) = P(Python ∩ JavaScript) = {both_languages/total_respondents:.4f}")
print(f"P(At least one) = P(Python ∪ JavaScript) = {(python_only + javascript_only + both_languages)/total_respondents:.4f}")
print(f"P(Neither) = {neither/total_respondents:.4f}")

<details>
<summary>💡 Summary</summary>

Venn diagrams visually represent set relationships and overlaps. The intersection **P ∩ J** shows developers who enjoy both languages (230/1500 = 15.3%). The union **P ∪ J** represents developers who enjoy at least one language (1430/1500 = 95.3%). The complement (outside both circles) shows developers who enjoy neither (70/1500 = 4.7%). Venn diagrams make it easy to calculate probabilities for complex events using set operations.
</details>
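
The union probability can also be obtained without splitting the diagram into exclusive regions, using the addition rule P(A ∪ B) = P(A) + P(B) - P(A ∩ B). The sketch below applies it to the survey totals with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

total = 1500
p_python = Fraction(860, total)
p_javascript = Fraction(800, total)
p_both = Fraction(230, total)

# Addition rule: subtract the overlap so it is not counted twice
p_at_least_one = p_python + p_javascript - p_both
# Complement rule for the region outside both circles
p_neither = 1 - p_at_least_one

print(f"P(Python ∪ JavaScript) = {p_at_least_one} ≈ {float(p_at_least_one):.4f}")
print(f"P(Neither) = {p_neither} ≈ {float(p_neither):.4f}")
```

Both results agree with the region-by-region counts: 1430/1500 for the union and 70/1500 for neither.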

## 4. Calculating Probability of Simple Events

### General Formula for Event Probability

For event A with outcomes e₁, e₂, ..., eₖ:
- **General case**: P(A) = P(e₁) + P(e₂) + ... + P(eₖ)
- **Equally likely outcomes**: P(A) = |A| / |S|

In [None]:
# Example: Workforce diversity in a tech organization
workforce_data = {
    'Ethnicity': ['White', 'Latin', 'Asian', 'Black', 'Others'],
    'Count': [585, 330, 225, 255, 105],
    'Probability': [0.39, 0.22, 0.15, 0.17, 0.07]
}

df_workforce = pd.DataFrame(workforce_data)
total_employees = df_workforce['Count'].sum()

print("Tech Organization Workforce Breakdown:")
print(df_workforce)
print(f"\nTotal Employees: {total_employees}")

# Calculate probabilities for different events
print("\nEvent Probabilities:")

# Event A: Randomly selecting an Asian employee
prob_asian = df_workforce[df_workforce['Ethnicity'] == 'Asian']['Probability'].values[0]
print(f"\n1. P(Asian employee) = {prob_asian:.2f}")

# Event B: Randomly selecting a Latin or Asian employee
prob_latin_or_asian = df_workforce[df_workforce['Ethnicity'].isin(['Latin', 'Asian'])]['Probability'].sum()
print(f"2. P(Latin OR Asian) = P(Latin) + P(Asian) = 0.22 + 0.15 = {prob_latin_or_asian:.2f}")

# Event C: Randomly selecting a minority (not White)
prob_minority = df_workforce[df_workforce['Ethnicity'] != 'White']['Probability'].sum()
print(f"3. P(Minority, not White) = 1 - P(White) = 1 - 0.39 = {prob_minority:.2f}")

# Visualize
plt.figure(figsize=(12, 6))

# Pie chart
plt.subplot(1, 2, 1)
colors_diversity = ['#3498db', '#e74c3c', '#2ecc71', '#f39c12', '#9b59b6']
plt.pie(df_workforce['Probability'], labels=df_workforce['Ethnicity'], autopct='%1.1f%%',
        colors=colors_diversity, startangle=90)
plt.title('Workforce Distribution by Ethnicity', fontsize=12, fontweight='bold')

# Bar chart
plt.subplot(1, 2, 2)
bars = plt.bar(df_workforce['Ethnicity'], df_workforce['Probability'], 
               color=colors_diversity, edgecolor='black', alpha=0.7)
plt.xlabel('Ethnicity', fontsize=11, fontweight='bold')
plt.ylabel('Probability', fontsize=11, fontweight='bold')
plt.title('Probability Distribution', fontsize=12, fontweight='bold')
plt.xticks(rotation=45)
plt.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

<details>
<summary>💡 Summary</summary>

When outcomes have **different probabilities**, we calculate P(A) by summing the probabilities of all outcomes in event A. For example, P(Latin OR Asian) = P(Latin) + P(Asian) = 0.22 + 0.15 = 0.37. We can also use the complement rule: P(Minority) = 1 - P(White) = 1 - 0.39 = 0.61. The visualizations show both the proportional distribution (pie chart) and direct probability comparison (bar chart).
</details>
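
The probabilities in the table do not have to be typed in by hand: each one is the group's count divided by the total headcount. A minimal sketch, using plain dictionaries instead of the DataFrame so that it stands alone:

```python
counts = {'White': 585, 'Latin': 330, 'Asian': 225, 'Black': 255, 'Others': 105}
total = sum(counts.values())

# Each outcome's probability is its share of the workforce
probs = {group: n / total for group, n in counts.items()}

p_latin_or_asian = probs['Latin'] + probs['Asian']  # sum over outcomes in the event
p_not_white = 1 - probs['White']                    # complement rule

for group, p in probs.items():
    print(f"P({group}) = {counts[group]}/{total} = {p:.2f}")
print(f"P(Latin OR Asian) = {p_latin_or_asian:.2f}")
print(f"P(not White) = {p_not_white:.2f}")
```

Deriving the probabilities from the counts keeps the two columns consistent if the counts ever change.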

## 5. Compound Events and Set Operations

### Set Operations: Union (∪), Intersection (∩), and Complement

- **A ∪ B**: A or B (at least one occurs)
- **A ∩ B**: A and B (both occur)
- **S - A**: Not A (complement of A)

In [None]:
# Example: Rolling a 6-sided die
S = {1, 2, 3, 4, 5, 6}
A = {1, 2}  # Roll a number smaller than 3
B = {2, 4, 6}  # Roll an even number

# Calculate set operations
union_AB = A.union(B)
intersection_AB = A.intersection(B)
complement_A = S - A

print("Sample Space S:", S)
print("Event A (number < 3):", A)
print("Event B (even number):", B)
print()
print("Set Operations:")
print(f"A ∪ B (A or B): {union_AB}")
print(f"A ∩ B (A and B): {intersection_AB}")
print(f"S - A (not A): {complement_A}")
print()
print("Probabilities:")
print(f"P(A ∪ B) = |A ∪ B| / |S| = {len(union_AB)}/{len(S)} = {len(union_AB)/len(S):.4f}")
print(f"P(A ∩ B) = |A ∩ B| / |S| = {len(intersection_AB)}/{len(S)} = {len(intersection_AB)/len(S):.4f}")
print(f"P(S - A) = |S - A| / |S| = {len(complement_A)}/{len(S)} = {len(complement_A)/len(S):.4f}")
print(f"\nAlternatively, P(not A) = 1 - P(A) = 1 - {len(A)/len(S):.4f} = {1 - len(A)/len(S):.4f}")

# Summarize each operation in a text panel (no Venn circles are drawn here)
plt.figure(figsize=(14, 4))

# Create three diagrams for different operations
operations = [
    ('A ∪ B', A.union(B), 'Union (A or B)'),
    ('A ∩ B', A.intersection(B), 'Intersection (A and B)'),
    ('S - A', S - A, 'Complement (not A)')
]

for idx, (title, result_set, description) in enumerate(operations, 1):
    plt.subplot(1, 3, idx)
    plt.text(0.5, 0.7, description, ha='center', fontsize=12, fontweight='bold')
    plt.text(0.5, 0.5, f"Result: {result_set}", ha='center', fontsize=10)
    plt.text(0.5, 0.3, f"P = {len(result_set)}/{len(S)} = {len(result_set)/len(S):.3f}", 
             ha='center', fontsize=11, bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.5))
    plt.xlim(0, 1)
    plt.ylim(0, 1)
    plt.axis('off')

plt.suptitle('Set Operations on Events', fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

<details>
<summary>💡 Summary</summary>

Set operations enable calculation of compound event probabilities. **Union (A ∪ B)** contains outcomes in A or B or both: {1,2,4,6} with P = 4/6. **Intersection (A ∩ B)** contains outcomes in both A and B: {2} with P = 1/6. **Complement (S - A)** contains outcomes not in A: {3,4,5,6} with P = 4/6. The complement rule provides a shortcut: P(not A) = 1 - P(A) = 1 - 2/6 = 4/6.
</details>

## 6. Sample Sets for Compound Events

### Cartesian Product and Multiplication Rule

For compound events (sequences of events), the sample space is the **Cartesian product** of individual event spaces:
- |S| = |A| × |B| for two events
- |S| = |A₁| × |A₂| × ... × |Aₙ| for n events

In [None]:
# Example 1: Flipping two coins
coin = ['H', 'T']
two_coins = list(product(coin, coin))

print("Example 1: Flipping Two Coins")
print(f"Sample space for one coin: {coin}")
print(f"Sample space for two coins (Cartesian product):")
print(f"S = {two_coins}")
print(f"|S| = |coin₁| × |coin₂| = {len(coin)} × {len(coin)} = {len(two_coins)}")

# Display as table
print("\nOutcome Table:")
print("Coin 1 | Coin 2 | Outcome")
print("-" * 30)
for outcome in two_coins:
    print(f"  {outcome[0]}    |   {outcome[1]}    | {outcome}")

# Example 2: Rolling two dice
die = [1, 2, 3, 4, 5, 6]
two_dice = list(product(die, die))

print("\n" + "="*50)
print("Example 2: Rolling Two Dice")
print(f"Sample space for one die: {die}")
print(f"|S| = |die₁| × |die₂| = {len(die)} × {len(die)} = {len(two_dice)}")

# Create outcome table for two dice
df_dice = pd.DataFrame(index=range(1, 7), columns=range(1, 7))
for i in range(1, 7):
    for j in range(1, 7):
        df_dice.loc[i, j] = f"({i},{j})"

print("\nAll 36 Possible Outcomes (Die₁, Die₂):")
print(df_dice)

# Example 3: Bridge crossing game (10 steps, 2 choices each)
num_steps = 10
choices_per_step = 2
total_paths = choices_per_step ** num_steps

print("\n" + "="*50)
print("Example 3: Bridge Crossing Game")
print(f"Number of steps: {num_steps}")
print(f"Choices per step: {choices_per_step} (left or right)")
print(f"|S| = {choices_per_step}^{num_steps} = {total_paths:,} possible paths")
print(f"\nOnly 1 path is correct, so P(winning) = 1/{total_paths:,} = {1/total_paths:.6f}")

<details>
<summary>💡 Summary</summary>

The **multiplication rule** states that for compound events, |S| is the product of the cardinalities of each event. Flipping two coins gives |S| = 2 × 2 = 4 outcomes. Rolling two dice gives |S| = 6 × 6 = 36 outcomes. The bridge game with 10 binary choices has |S| = 2¹⁰ = 1,024 paths, making the probability of randomly choosing the correct path only 1/1024 ≈ 0.098%. This multiplication principle is fundamental for calculating probabilities in multi-stage experiments.
</details>
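
The bridge-game count can be confirmed by brute force, since 1,024 paths are few enough to enumerate. In this sketch the labels `'L'` and `'R'` and the choice of which path counts as correct are hypothetical; any single path gives the same probability.

```python
from itertools import product

num_steps = 10
# Every path is a sequence of 10 left/right choices
paths = list(product('LR', repeat=num_steps))

# Hypothetical: treat one arbitrary path as the winning one
correct_path = paths[0]
p_win = paths.count(correct_path) / len(paths)

print(f"Enumerated paths: {len(paths):,}")
print(f"P(winning) = 1/{len(paths):,} = {p_win:.6f}")
```

Enumeration agrees with the multiplication rule here, but it stops being practical quickly because the number of paths doubles with every added step.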

### Sampling With and Without Replacement

In [None]:
# Example: Removing balls from a bag
num_balls = 10

# Without replacement
first_ball_choices = num_balls
second_ball_choices_no_replacement = num_balls - 1
outcomes_no_replacement = first_ball_choices * second_ball_choices_no_replacement

# With replacement
second_ball_choices_with_replacement = num_balls
outcomes_with_replacement = first_ball_choices * second_ball_choices_with_replacement

print("Scenario: Drawing 2 balls from a bag with 10 distinct balls\n")

print("WITHOUT Replacement:")
print(f"  First draw: {first_ball_choices} choices")
print(f"  Second draw: {second_ball_choices_no_replacement} choices (one ball removed)")
print(f"  |S| = {first_ball_choices} × {second_ball_choices_no_replacement} = {outcomes_no_replacement} outcomes")

print("\nWITH Replacement:")
print(f"  First draw: {first_ball_choices} choices")
print(f"  Second draw: {second_ball_choices_with_replacement} choices (ball returned)")
print(f"  |S| = {first_ball_choices} × {second_ball_choices_with_replacement} = {outcomes_with_replacement} outcomes")

print(f"\nDifference: {outcomes_with_replacement - outcomes_no_replacement} more outcomes with replacement")

# Visualize
scenarios = ['Without\nReplacement', 'With\nReplacement']
outcome_counts = [outcomes_no_replacement, outcomes_with_replacement]

plt.figure(figsize=(10, 6))
bars = plt.bar(scenarios, outcome_counts, color=['#e74c3c', '#3498db'], 
               edgecolor='black', alpha=0.7, width=0.5)

# Add value labels
for bar, value in zip(bars, outcome_counts):
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height + 1,
             f'{value}', ha='center', va='bottom', fontsize=14, fontweight='bold')

plt.ylabel('Number of Possible Outcomes |S|', fontsize=12, fontweight='bold')
plt.title('Sample Space Size: Drawing 2 Balls from 10', fontsize=14, fontweight='bold')
plt.ylim(0, max(outcome_counts) + 10)
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

<details>
<summary>💡 Summary</summary>

**Replacement affects sample space size**. When drawing without replacement, the second event has fewer choices (10 √ó 9 = 90 outcomes). With replacement, both events have the same choices (10 √ó 10 = 100 outcomes). This distinction is crucial for calculating probabilities correctly. Without replacement creates **dependent events** where the first outcome affects the second. With replacement creates **independent events** where outcomes don't affect each other.
</details>
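
The two sample spaces can also be generated explicitly rather than only counted. A short sketch, assuming the balls are labeled 1 to 10: `itertools.permutations` produces ordered draws without replacement, while `itertools.product` allows the same ball to appear twice.

```python
from itertools import permutations, product

balls = range(1, 11)  # assumed labels for the 10 distinct balls

without_replacement = list(permutations(balls, 2))  # ordered pairs, no repeats
with_replacement = list(product(balls, repeat=2))   # ordered pairs, repeats allowed

# The extra outcomes are exactly the pairs where the same ball is drawn twice
repeats = [pair for pair in with_replacement if pair[0] == pair[1]]

print(f"Without replacement: {len(without_replacement)} outcomes")
print(f"With replacement:    {len(with_replacement)} outcomes")
print(f"Repeated-ball pairs: {len(repeats)}")
```

The difference of 10 between the two counts corresponds to the 10 pairs of the form (k, k).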

## 7. Calculating Probability of Compound Events

### Using the Cartesian Product Sample Space

In [None]:
# Example 1: Rolling two dice - sum equals 3
die = [1, 2, 3, 4, 5, 6]
two_dice_outcomes = list(product(die, die))
total_outcomes = len(two_dice_outcomes)

# Find outcomes where sum = 3
sum_equals_3 = [outcome for outcome in two_dice_outcomes if sum(outcome) == 3]

print("Example 1: Two Dice - Sum Equals 3")
print(f"Total possible outcomes: {total_outcomes}")
print(f"Outcomes with sum = 3: {sum_equals_3}")
print(f"Number of favorable outcomes: {len(sum_equals_3)}")
print(f"P(sum = 3) = {len(sum_equals_3)}/{total_outcomes} = {len(sum_equals_3)/total_outcomes:.4f} = {len(sum_equals_3)/total_outcomes*100:.2f}%")

# Example 2: Rolling two dice - sum greater than 8
sum_greater_8 = [outcome for outcome in two_dice_outcomes if sum(outcome) > 8]

print("\n" + "="*50)
print("Example 2: Two Dice - Sum Greater Than 8")
print(f"Outcomes with sum > 8: {sum_greater_8}")
print(f"Number of favorable outcomes: {len(sum_greater_8)}")
print(f"P(sum > 8) = {len(sum_greater_8)}/{total_outcomes} = {len(sum_greater_8)/total_outcomes:.4f} = {len(sum_greater_8)/total_outcomes*100:.2f}%")

# Visualize sum distribution
sums = [sum(outcome) for outcome in two_dice_outcomes]
sum_counts = pd.Series(sums).value_counts().sort_index()

plt.figure(figsize=(12, 6))

# Bar chart of sum frequencies
plt.subplot(1, 2, 1)
colors = ['coral' if s == 3 else 'steelblue' for s in sum_counts.index]
bars = plt.bar(sum_counts.index, sum_counts.values, color=colors, edgecolor='black', alpha=0.7)
plt.xlabel('Sum of Two Dice', fontsize=11, fontweight='bold')
plt.ylabel('Frequency (out of 36)', fontsize=11, fontweight='bold')
plt.title('Distribution of Sums When Rolling Two Dice', fontsize=12, fontweight='bold')
plt.xticks(range(2, 13))
plt.grid(axis='y', alpha=0.3)

# Probability distribution
plt.subplot(1, 2, 2)
probabilities = sum_counts / total_outcomes
colors2 = ['green' if s > 8 else 'lightblue' for s in probabilities.index]
bars2 = plt.bar(probabilities.index, probabilities.values, color=colors2, edgecolor='black', alpha=0.7)
plt.xlabel('Sum of Two Dice', fontsize=11, fontweight='bold')
plt.ylabel('Probability', fontsize=11, fontweight='bold')
plt.title('Probability Distribution of Sums', fontsize=12, fontweight='bold')
plt.xticks(range(2, 13))
plt.axhline(y=len(sum_greater_8)/total_outcomes, color='red', linestyle='--', 
            alpha=0.5, label=f'P(sum > 8) = {len(sum_greater_8)/total_outcomes:.3f}')
plt.legend()
plt.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

print(f"\nMost likely sum: {sum_counts.idxmax()} (occurs {sum_counts.max()} times, P = {sum_counts.max()/total_outcomes:.4f})")

<details>
<summary>💡 Summary</summary>

For compound events, we use: **P(event) = (number of favorable outcomes) / |S|**. Rolling two dice has |S| = 36 outcomes. Only 2 outcomes sum to 3: (1,2) and (2,1), giving P(sum=3) = 2/36 ≈ 5.6%. Ten outcomes sum to more than 8, giving P(sum>8) = 10/36 ≈ 27.8%. The distribution shows that sum = 7 is most likely (6/36 ≈ 16.7%) because it has the most combinations: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1).
</details>

## 8. Independent vs Dependent Events

### Understanding Independence

- **Independent events**: Event A does not affect the probability of event B
- **Dependent events**: Event A affects the probability of event B

For independent events: **P(A ∩ B) = P(A) × P(B)**

In [None]:
# Example 1: Independent Events - Flipping Two Coins
print("Example 1: INDEPENDENT EVENTS")
print("Scenario: Flipping two fair coins\n")

# Define events
coin_outcomes = ['H', 'T']
prob_tails_coin1 = 1/2
prob_tails_coin2 = 1/2

print("Event A: First coin shows Tails")
print(f"P(A) = {prob_tails_coin1}")
print("\nEvent B: Second coin shows Tails")
print(f"P(B) = {prob_tails_coin2}")

# Calculate P(A ∩ B) for independent events
prob_both_tails = prob_tails_coin1 * prob_tails_coin2

print("\nSince the two coin flips are independent events:")
print(f"P(A ∩ B) = P(A) × P(B)")
print(f"P(both tails) = {prob_tails_coin1} × {prob_tails_coin2} = {prob_both_tails}")
print(f"P(both tails) = {prob_both_tails*100:.0f}%")

# Verify by enumeration
all_outcomes = list(product(coin_outcomes, coin_outcomes))
both_tails_outcomes = [o for o in all_outcomes if o[0] == 'T' and o[1] == 'T']
print(f"\nVerification: {len(both_tails_outcomes)} out of {len(all_outcomes)} outcomes = {len(both_tails_outcomes)/len(all_outcomes)}")

print("\n" + "="*70)

# Example 2: Dependent Events - Drawing Balls Without Replacement
print("Example 2: DEPENDENT EVENTS")
print("Scenario: Drawing 2 balls from a bag with 10 balls (3 red, 4 blue, 3 green)\n")

total_balls = 10
blue_balls = 4
red_balls = 3

print("Event A: First ball drawn is Blue")
prob_blue_first = blue_balls / total_balls
print(f"P(A) = {blue_balls}/{total_balls} = {prob_blue_first}")

print("\nEvent B: Second ball drawn is Red (given first was Blue)")
remaining_balls = total_balls - 1
remaining_red = red_balls  # Red balls unaffected
prob_red_second_given_blue_first = remaining_red / remaining_balls
print(f"P(B|A) = {remaining_red}/{remaining_balls} = {prob_red_second_given_blue_first:.4f}")
print("(Note: Sample space changed from 10 to 9 balls)")

# Calculate P(A ∩ B) for dependent events
prob_blue_then_red = prob_blue_first * prob_red_second_given_blue_first

print("\nSince drawing without replacement creates dependent events:")
print(f"P(A ∩ B) = P(A) × P(B|A)")
print(f"P(Blue then Red) = {prob_blue_first} × {prob_red_second_given_blue_first:.4f}")
print(f"P(Blue then Red) = {prob_blue_then_red:.4f} = {prob_blue_then_red*100:.2f}%")

# Compare independent vs dependent
print("\n" + "="*70)
print("KEY DIFFERENCE:")
print("Independent: P(A ∩ B) = P(A) × P(B)")
print("             |S_A| = |S_B| (sample spaces unchanged)")
print("\nDependent:   P(A ∩ B) = P(A) × P(B|A)")
print("             |S_B| changes based on outcome of A")

<details>
<summary>💡 Summary</summary>

**Independent events** don't affect each other. Flipping two coins: P(both tails) = P(tail₁) × P(tail₂) = 1/2 × 1/2 = 1/4. The sample spaces remain constant (|S_A| = |S_B| = 2). **Dependent events** affect each other. Drawing balls without replacement: P(blue then red) = (4/10) × (3/9) = 0.133. After drawing the first ball, the sample space changes from 10 to 9 balls. This distinction is critical for correct probability calculations.
</details>
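
The conditional-probability product for the ball example can be cross-checked by listing every ordered pair of draws. This sketch assumes the bag composition stated in the example (4 blue, 3 red, 3 green) and identifies each ball by its index so that same-colored balls remain distinct outcomes.

```python
from fractions import Fraction
from itertools import permutations

bag = ['B'] * 4 + ['R'] * 3 + ['G'] * 3

# All ordered draws of two different balls (index pairs, no replacement)
ordered_draws = list(permutations(range(len(bag)), 2))
favorable = [(i, j) for i, j in ordered_draws if bag[i] == 'B' and bag[j] == 'R']

p_blue_then_red = Fraction(len(favorable), len(ordered_draws))

print(f"Favorable ordered draws: {len(favorable)} of {len(ordered_draws)}")
print(f"P(Blue then Red) = {p_blue_then_red} ≈ {float(p_blue_then_red):.4f}")
```

Counting gives 12 favorable draws out of 90, which is the same value as (4/10) × (3/9).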

### Visualizing Independence

In [None]:
# Create comparison visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
fig.suptitle('Independent vs Dependent Events', fontsize=16, fontweight='bold')

# Independent Events - Tree Diagram (Conceptual)
axes[0, 0].text(0.5, 0.9, 'INDEPENDENT EVENTS', ha='center', fontsize=14, fontweight='bold')
axes[0, 0].text(0.5, 0.8, 'Flipping Two Coins', ha='center', fontsize=11)
axes[0, 0].text(0.2, 0.6, 'Coin 1', ha='center', fontsize=10, bbox=dict(boxstyle='round', facecolor='lightblue'))
axes[0, 0].text(0.1, 0.4, 'H (1/2)', ha='center', fontsize=9)
axes[0, 0].text(0.3, 0.4, 'T (1/2)', ha='center', fontsize=9)
axes[0, 0].text(0.5, 0.6, 'Coin 2', ha='center', fontsize=10, bbox=dict(boxstyle='round', facecolor='lightcoral'))
axes[0, 0].text(0.45, 0.4, 'H (1/2)', ha='center', fontsize=9)
axes[0, 0].text(0.55, 0.4, 'T (1/2)', ha='center', fontsize=9)
axes[0, 0].text(0.5, 0.2, 'Sample space unchanged:\n|S₁| = 2, |S₂| = 2',
                ha='center', fontsize=10, bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.7))
axes[0, 0].text(0.5, 0.05, 'P(T,T) = 1/2 × 1/2 = 1/4', ha='center', fontsize=11, fontweight='bold',
                bbox=dict(boxstyle='round', facecolor='lightgreen', alpha=0.7))
axes[0, 0].set_xlim(0, 1)
axes[0, 0].set_ylim(0, 1)
axes[0, 0].axis('off')

# Independent Events - Probability Calculation
coin_outcomes_all = list(product(['H', 'T'], ['H', 'T']))
outcome_labels = [f"{o[0]},{o[1]}" for o in coin_outcomes_all]
outcome_probs = [0.25] * 4
colors_ind = ['green' if o == ('T', 'T') else 'steelblue' for o in coin_outcomes_all]

axes[0, 1].bar(outcome_labels, outcome_probs, color=colors_ind, edgecolor='black', alpha=0.7)
axes[0, 1].set_ylabel('Probability', fontsize=11, fontweight='bold')
axes[0, 1].set_title('All Outcomes (Each = 1/4)', fontsize=12, fontweight='bold')
axes[0, 1].set_ylim(0, 0.3)
axes[0, 1].grid(axis='y', alpha=0.3)

# Dependent Events - Tree Diagram (Conceptual)
axes[1, 0].text(0.5, 0.9, 'DEPENDENT EVENTS', ha='center', fontsize=14, fontweight='bold')
axes[1, 0].text(0.5, 0.8, 'Drawing Balls Without Replacement', ha='center', fontsize=11)
axes[1, 0].text(0.2, 0.6, 'Draw 1', ha='center', fontsize=10, bbox=dict(boxstyle='round', facecolor='lightblue'))
axes[1, 0].text(0.1, 0.45, 'Blue\n(4/10)', ha='center', fontsize=8)
axes[1, 0].text(0.3, 0.45, 'Red\n(3/10)', ha='center', fontsize=8)
axes[1, 0].text(0.5, 0.6, 'Draw 2', ha='center', fontsize=10, bbox=dict(boxstyle='round', facecolor='lightcoral'))
axes[1, 0].text(0.45, 0.45, 'Red\n(3/9)', ha='center', fontsize=8)
axes[1, 0].text(0.55, 0.45, 'Blue\n(3/9)', ha='center', fontsize=8)
axes[1, 0].text(0.5, 0.2, 'Sample space changes:\n|S₁| = 10, |S₂| = 9',
                ha='center', fontsize=10, bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.7))
axes[1, 0].text(0.5, 0.05, 'P(Blue,Red) = 4/10 × 3/9 = 0.133', ha='center', fontsize=11, fontweight='bold',
                bbox=dict(boxstyle='round', facecolor='lightgreen', alpha=0.7))
axes[1, 0].set_xlim(0, 1)
axes[1, 0].set_ylim(0, 1)
axes[1, 0].axis('off')

# Dependent Events - Sample Space Change
draw_scenarios = ['Draw 1\n(10 balls)', 'Draw 2 after Blue\n(9 balls)']
sample_sizes = [10, 9]
axes[1, 1].bar(draw_scenarios, sample_sizes, color=['#3498db', '#e74c3c'], 
               edgecolor='black', alpha=0.7, width=0.6)
axes[1, 1].set_ylabel('Sample Space Size |S|', fontsize=11, fontweight='bold')
axes[1, 1].set_title('Sample Space Reduction', fontsize=12, fontweight='bold')
axes[1, 1].set_ylim(0, 12)
for i, (scenario, size) in enumerate(zip(draw_scenarios, sample_sizes)):
    axes[1, 1].text(i, size + 0.3, f'{size}', ha='center', fontsize=12, fontweight='bold')
axes[1, 1].grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

<details>
<summary>💡 Summary</summary>

The visualizations highlight the key difference between independent and dependent events. For **independent events**, the sample space remains constant across trials (|S₁| = |S₂| = 2), and all outcomes have equal probability (1/4 each). For **dependent events**, the sample space shrinks (|S₁| = 10 → |S₂| = 9), affecting subsequent probabilities. Understanding this distinction is essential for correctly applying probability formulas in real-world scenarios.
</details>
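
The product rule also works as a test: two events are independent exactly when P(A ∩ B) = P(A) × P(B). The sketch below applies that check to events on a single die roll. Events `A` and `B` match the earlier die example; event `C` and the helper names `prob` and `are_independent` are hypothetical additions for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event under equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))

def are_independent(a, b):
    """Check the definition: P(a ∩ b) == P(a) × P(b)."""
    return prob(a & b) == prob(a) * prob(b)

A = {1, 2}      # roll smaller than 3
B = {2, 4, 6}   # roll an even number
C = {1, 2, 3}   # hypothetical event: roll smaller than 4

print(f"A and B independent? {are_independent(A, B)}")  # 1/6 == 1/3 × 1/2
print(f"C and B independent? {are_independent(C, B)}")  # 1/6 != 1/2 × 1/2
```

Exact fractions are used so the equality test is not affected by floating-point rounding.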

## 9. Real-World Applications

### Card Games and Probability

In [None]:
# Standard deck of cards
total_cards = 52
num_suits = 4
cards_per_value = 4  # One of each suit

# Example 1: Drawing a King
num_kings = 4
prob_king = num_kings / total_cards

print("Standard Deck: 52 cards, 4 suits (♠ ♥ ♦ ♣), 13 values each")
print("\nExample 1: Drawing a King")
print(f"Number of Kings: {num_kings}")
print(f"P(King) = {num_kings}/{total_cards} = {prob_king:.4f}")
print(f"As a fraction: {num_kings}/{total_cards} = 1/13")

# Example 2: Drawing two cards - pair of fives (without replacement)
num_fives = 4
prob_first_five = num_fives / total_cards
prob_second_five = (num_fives - 1) / (total_cards - 1)  # Dependent event
prob_pair_fives = prob_first_five * prob_second_five

print("\n" + "="*50)
print("Example 2: Drawing a Pair of Fives (without replacement)")
print(f"\nFirst card is a 5: P(A) = {num_fives}/{total_cards} = {prob_first_five:.4f}")
print(f"Second card is a 5 (given first was 5): P(B|A) = {num_fives-1}/{total_cards-1} = {prob_second_five:.4f}")
print(f"\nP(pair of 5s) = P(A) × P(B|A)")
print(f"P(pair of 5s) = {prob_first_five:.4f} × {prob_second_five:.4f} = {prob_pair_fives:.6f}")
print(f"As a fraction: {num_fives}/{total_cards} × {num_fives-1}/{total_cards-1} = {num_fives*(num_fives-1)}/{total_cards*(total_cards-1)}")

# Visualize probabilities for different card combinations
scenarios = [
    ('Single King', prob_king),
    ('Pair of Fives', prob_pair_fives),
    ('Any Pair\n(same value)', (13 * cards_per_value * (cards_per_value-1)) / (total_cards * (total_cards-1))),
    ('Royal Flush\n(poker)', 4 / 2598960)  # 4 ways out of C(52,5)
]

labels = [s[0] for s in scenarios]
probs = [s[1] for s in scenarios]

plt.figure(figsize=(12, 6))
bars = plt.bar(labels, probs, color=['steelblue', 'coral', 'lightgreen', 'gold'], 
               edgecolor='black', alpha=0.7)

# Add probability labels
for bar, prob in zip(bars, probs):
    height = bar.get_height()
    if prob > 0.001:
        plt.text(bar.get_x() + bar.get_width()/2., height + height*0.05,
                 f'{prob:.6f}\n({prob*100:.4f}%)', ha='center', va='bottom', fontsize=9)
    else:
        plt.text(bar.get_x() + bar.get_width()/2., height + height*10,
                 f'{prob:.8f}\n({prob*100:.6f}%)', ha='center', va='bottom', fontsize=9)

plt.ylabel('Probability', fontsize=12, fontweight='bold')
plt.title('Card Drawing Probabilities', fontsize=14, fontweight='bold')
plt.yscale('log')  # Log scale to show wide range
plt.grid(axis='y', alpha=0.3, which='both')
plt.tight_layout()
plt.show()

print(f"\nNote: Royal Flush is extremely rare - only {4 / 2598960 * 100:.6f}% chance!")

<details>
<summary>💡 Summary</summary>

Card games illustrate practical probability calculations. Drawing a King has probability 4/52 = 1/13 (7.69%). Drawing a pair of fives without replacement involves dependent events: (4/52) × (3/51) ≈ 0.45%. The visualization uses a logarithmic scale to compare probabilities spanning several orders of magnitude. A Royal Flush in poker is extraordinarily rare at 0.000154%, demonstrating how compound dependent events can create very unlikely outcomes.
</details>
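The sequential calculation P(A) × P(B|A) can be cross-checked by counting unordered hands instead. The following is a minimal sketch, assuming a standard 52-card deck with 4 cards per value; the constant names are introduced here for illustration and are not part of the cells above. It uses exact fractions so the two approaches can be compared without rounding.

```python
from fractions import Fraction
from math import comb

# Assumed deck layout (standard deck); names are illustrative only
TOTAL_CARDS = 52
CARDS_PER_VALUE = 4
NUM_VALUES = 13

# Sequential (dependent-event) form: P(first is a 5) * P(second is a 5 | first was a 5)
pair_fives_sequential = Fraction(4, 52) * Fraction(3, 51)

# Counting form: choose 2 of the 4 fives, out of all 2-card hands
pair_fives_counting = Fraction(comb(CARDS_PER_VALUE, 2), comb(TOTAL_CARDS, 2))

# Any pair: 13 values, each with C(4, 2) ways to pick two suits
any_pair = Fraction(NUM_VALUES * comb(CARDS_PER_VALUE, 2), comb(TOTAL_CARDS, 2))

# Royal flush: one per suit, out of all 5-card hands
royal_flush = Fraction(4, comb(TOTAL_CARDS, 5))

print(f"Pair of fives (sequential): {pair_fives_sequential}")
print(f"Pair of fives (counting):   {pair_fives_counting}")
print(f"Any pair:                   {any_pair}")
print(f"Royal flush:                {royal_flush}")
```

Both forms give 1/221 for the pair of fives, which suggests that multiplying conditional probabilities and counting combinations are two views of the same calculation.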

### Spinners and Multiple Events

In [None]:
# Two spinners with different values
spinner1 = [1, 3, 6, 8, 9]
spinner2 = [2, 4, 6, 8, 9, 10]

print("Spinner Setup:")
print(f"Spinner 1 values: {spinner1} (all equal probability)")
print(f"Spinner 2 values: {spinner2} (all equal probability)")

# Event A: Spinner 1 shows even number
spinner1_even = [v for v in spinner1 if v % 2 == 0]
prob_spinner1_even = len(spinner1_even) / len(spinner1)

# Event B: Spinner 2 shows number > 3
spinner2_greater_3 = [v for v in spinner2 if v > 3]
prob_spinner2_greater_3 = len(spinner2_greater_3) / len(spinner2)

# Combined probability (independent events)
prob_both = prob_spinner1_even * prob_spinner2_greater_3

print("\nEvent A: Spinner 1 shows an even number")
print(f"Even values in Spinner 1: {spinner1_even}")
print(f"P(A) = {len(spinner1_even)}/{len(spinner1)} = {prob_spinner1_even}")

print("\nEvent B: Spinner 2 shows a number > 3")
print(f"Values > 3 in Spinner 2: {spinner2_greater_3}")
print(f"P(B) = {len(spinner2_greater_3)}/{len(spinner2)} = {prob_spinner2_greater_3:.4f}")

print("\nSince the spinners are independent:")
print(f"P(A ∩ B) = P(A) × P(B)")
print(f"P(both events) = {prob_spinner1_even} × {prob_spinner2_greater_3:.4f} = {prob_both:.4f}")
print(f"As a fraction: {len(spinner1_even)}/{len(spinner1)} × {len(spinner2_greater_3)}/{len(spinner2)} = {len(spinner1_even)*len(spinner2_greater_3)}/{len(spinner1)*len(spinner2)}")

# Enumerate all possible outcomes
all_outcomes = list(product(spinner1, spinner2))
favorable_outcomes = [(s1, s2) for s1, s2 in all_outcomes if s1 % 2 == 0 and s2 > 3]

print(f"\nVerification by enumeration:")
print(f"Total possible outcomes: {len(all_outcomes)}")
print(f"Favorable outcomes: {favorable_outcomes}")
print(f"Count: {len(favorable_outcomes)}")
print(f"Probability: {len(favorable_outcomes)}/{len(all_outcomes)} = {len(favorable_outcomes)/len(all_outcomes):.4f} ✓")

# Visualize outcome space
df_outcomes = pd.DataFrame(index=spinner1, columns=spinner2)
for s1 in spinner1:
    for s2 in spinner2:
        if s1 % 2 == 0 and s2 > 3:
            df_outcomes.loc[s1, s2] = '✓'
        else:
            df_outcomes.loc[s1, s2] = ''

print("\nOutcome Grid (✓ = favorable outcome):")
print("Spinner 1 (rows) × Spinner 2 (columns)")
print(df_outcomes)

# Create heatmap
outcome_grid = np.zeros((len(spinner1), len(spinner2)))
for i, s1 in enumerate(spinner1):
    for j, s2 in enumerate(spinner2):
        if s1 % 2 == 0 and s2 > 3:
            outcome_grid[i, j] = 1

plt.figure(figsize=(10, 6))
sns.heatmap(outcome_grid, annot=True, fmt='.0f', cmap='RdYlGn', 
            xticklabels=spinner2, yticklabels=spinner1, 
            cbar_kws={'label': 'Favorable (1) / Not Favorable (0)'},
            linewidths=1, linecolor='black')
plt.xlabel('Spinner 2 Value', fontsize=12, fontweight='bold')
plt.ylabel('Spinner 1 Value', fontsize=12, fontweight='bold')
plt.title(f'Outcome Space: Even (Spinner 1) AND >3 (Spinner 2)\n{len(favorable_outcomes)} favorable out of {len(all_outcomes)} total', 
          fontsize=13, fontweight='bold')
plt.tight_layout()
plt.show()

<details>
<summary>💡 Summary</summary>

Multiple independent events follow the multiplication rule. Spinner 1 has P(even) = 2/5, Spinner 2 has P(>3) = 5/6. Since spinners are independent: P(both) = (2/5) × (5/6) = 10/30 = 1/3. The outcome grid visualizes all 30 possible combinations (5 × 6), with 10 favorable outcomes shown in green. This heatmap approach makes it easy to verify probability calculations by counting favorable outcomes in the sample space.
</details>
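The exact result of 1/3 can also be approximated by simulation. The sketch below is one possible check, not part of the original cells: it assumes each spinner value is equally likely, and the trial count and seed are arbitrary choices. It draws independent spins with NumPy's `Generator.choice` and compares the observed frequency of "Spinner 1 even AND Spinner 2 > 3" with the product of the individual frequencies.

```python
import numpy as np

spinner1 = [1, 3, 6, 8, 9]
spinner2 = [2, 4, 6, 8, 9, 10]

rng = np.random.default_rng(42)  # fixed seed for reproducibility (arbitrary choice)
n_trials = 200_000               # arbitrary; larger values reduce sampling noise

# Each spin picks one value uniformly at random; the two arrays are drawn independently
spins1 = rng.choice(spinner1, size=n_trials)
spins2 = rng.choice(spinner2, size=n_trials)

event_a = spins1 % 2 == 0  # Spinner 1 shows an even number
event_b = spins2 > 3       # Spinner 2 shows a number > 3

estimate_both = np.mean(event_a & event_b)
estimate_product = np.mean(event_a) * np.mean(event_b)

print(f"Simulated P(A ∩ B):    {estimate_both:.4f}")
print(f"Simulated P(A) × P(B): {estimate_product:.4f}")
print(f"Exact value:           {1/3:.4f}")
```

With independent spinners, the two simulated values should land close to each other and to 1/3, differing only by sampling noise.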

## 10. Summary: Key Probability Concepts

### Essential Formulas and Rules

In [None]:
# Create summary table
summary_data = {
    'Concept': [
        'Basic Probability',
        'Sample Space',
        'Event Probability (equal outcomes)',
        'Event Probability (general)',
        'Complement Rule',
        'Union (A or B)',
        'Intersection - Independent',
        'Intersection - Dependent',
        'Multiplication Rule (sample space)',
        'Sum of Probabilities'
    ],
    'Formula / Rule': [
        'P(A) = favorable / total',
        'S = {all possible outcomes}',
        'P(A) = |A| / |S|',
        'P(A) = P(e₁) + P(e₂) + ... + P(eₖ)',
        'P(not A) = 1 - P(A)',
        'P(A ∪ B) = P(A) + P(B) - P(A ∩ B)',
        'P(A ∩ B) = P(A) × P(B)',
        'P(A ∩ B) = P(A) × P(B|A)',
        '|S| = |A| × |B| × ... × |N|',
        'Σ P(all outcomes) = 1'
    ],
    'Example': [
        'P(rolling 5) = 1/6',
        'S_die = {1,2,3,4,5,6}',
        'P(even) = 3/6 = 0.5',
        'P(Latin or Asian) = 0.22 + 0.15',
        'P(not King) = 1 - 4/52 = 48/52',
        'Inclusion-exclusion principle',
        'P(HH) = 1/2 × 1/2 = 1/4',
        'P(two 5s) = 4/52 × 3/51',
        '|S_2dice| = 6 √ó 6 = 36',
        'All die outcomes sum to 1'
    ]
}

df_summary = pd.DataFrame(summary_data)

print("="*100)
print("PROBABILITY THEORY - KEY CONCEPTS AND FORMULAS")
print("="*100)
print()
print(df_summary.to_string(index=False))
print()
print("="*100)
print("IMPORTANT DISTINCTIONS:")
print("="*100)
print()
print("INDEPENDENT vs DEPENDENT Events:")
print("  • Independent: Outcome of A doesn't affect B")
print("                 Example: Flipping two coins, rolling dice")
print("                 Formula: P(A ∩ B) = P(A) × P(B)")
print()
print("  • Dependent: Outcome of A affects B")
print("               Example: Drawing cards without replacement")
print("               Formula: P(A ∩ B) = P(A) × P(B|A)")
print("               Note: Sample space changes (|S_B| ≠ |S_A|)")
print()
print("="*100)
print("KEY REMINDERS:")
print("="*100)
print("1. Probabilities range from 0 (impossible) to 1 (certain)")
print("2. All probabilities in a sample space must sum to 1")
print("3. Use sets to represent events and apply set operations")
print("4. Draw Venn diagrams to visualize relationships")
print("5. Enumerate outcomes when sample space is small")
print("6. For compound events, check if events are independent or dependent")
print("7. The multiplication rule applies to probabilities (P(A ∩ B)) and to counting outcomes (|S|)")
print("8. Verify calculations by checking if results make logical sense")
print("="*100)

<details>
<summary>💡 Summary</summary>

This notebook covered fundamental probability concepts:

- **Basic Probability**: Quantifying uncertainty with values between 0 and 1
- **Set Theory**: Sample spaces, events, and set operations (∪, ∩, complement)
- **Simple Events**: Calculating P(A) using favorable outcomes over total outcomes
- **Compound Events**: Using Cartesian products and multiplication rules
- **Independence**: Distinguishing between independent (P(A∩B) = P(A)×P(B)) and dependent events (P(A∩B) = P(A)×P(B|A))
- **Real Applications**: Card games, spinners, and practical scenarios

These concepts form the foundation for statistical inference, machine learning, and data-driven decision making.
</details>
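The union and complement rules from the summary table can be checked on a single die using Python sets. This is a minimal sketch under the assumption of a fair 6-sided die; the event `event_gt3` and the helper `prob` are hypothetical names chosen for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # fair 6-sided die
event_even = {2, 4, 6}
event_gt3 = {4, 5, 6}              # hypothetical second event: roll greater than 3

def prob(event):
    """P(event) for equally likely outcomes: |event| / |S|."""
    return Fraction(len(event & sample_space), len(sample_space))

# Union rule: P(A or B) = P(A) + P(B) - P(A and B)
union_direct = prob(event_even | event_gt3)
union_formula = prob(event_even) + prob(event_gt3) - prob(event_even & event_gt3)

# Complement rule: P(not A) = 1 - P(A)
complement_direct = prob(sample_space - event_even)
complement_formula = 1 - prob(event_even)

# Independence test: does P(A and B) equal P(A) * P(B)?
is_independent = prob(event_even & event_gt3) == prob(event_even) * prob(event_gt3)

print(f"P(even or >3): direct = {union_direct}, formula = {union_formula}")
print(f"P(not even):   direct = {complement_direct}, formula = {complement_formula}")
print(f"Even and >3 independent on one die? {is_independent}")
```

On a single die these two events are not independent, because P(even and >3) = 1/3 while P(even) × P(>3) = 1/4. This is why reminder 6 asks you to check independence before multiplying.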