code6: Probability, Distributions, and Bayes with SciPy + IRIS dataset    
MATH170 Calculus for Data Science I, Chapter 6    
Hands-on practice cells are included throughout and a final exercise set is provided at the end.   

In [None]:
import pandas as pd
from itertools import product

# -------------------------------
# Sample space: Flip 3 fair coins
# -------------------------------
coins = ['H', 'T']
S3 = pd.DataFrame(product(coins, repeat=3), columns=['C1', 'C2', 'C3'])
print("Sample space size:", len(S3))  # should be 8
print(S3.head())

# Define events
# A = "at least one head"
# B = "exactly two heads"
A = (S3 == 'H').any(axis=1)  # any head appears
B = (S3 == 'H').sum(axis=1) == 2  # exactly 2 heads

# Probabilities
p_A = A.mean()
p_B = B.mean()
p_AiB = (A & B).mean()

print("P(A) =", p_A)
print("P(B) =", p_B)
print("P(A∩B) =", p_AiB)
print("P(A)*P(B) =", p_A * p_B)
print("Independent?", abs(p_A * p_B - p_AiB) < 1e-12)


In [None]:
import pandas as pd
from itertools import product

# -------------------------------
# Sample space: Roll 2 fair dice
# -------------------------------
dice = range(1, 7)
S2 = pd.DataFrame(product(dice, repeat=2), columns=['D1', 'D2'])
print("Sample space size:", len(S2))  # should be 36

# Define events
# A = "sum is even"
# B = "at least one die shows 6"
A = ((S2['D1'] + S2['D2']) % 2 == 0)
B = (S2['D1'] == 6) | (S2['D2'] == 6)

# Probabilities
p_A = A.mean()
p_B = B.mean()
p_AiB = (A & B).mean()

print("P(A) =", p_A)
print("P(B) =", p_B)
print("P(A∩B) =", p_AiB)
print("P(A)*P(B) =", p_A * p_B)
print("Independent?", abs(p_A * p_B - p_AiB) < 1e-12)
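
The tolerance test `abs(p_A * p_B - p_AiB) < 1e-12` guards against floating-point rounding. As a minimal sketch of an alternative, the same two-dice events can be counted with `fractions.Fraction` so the comparison is exact. The helper `prob` and the names ending in `_exact` are hypothetical and not part of the original notebook.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of two fair dice (36 equally likely pairs)
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate on one outcome."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def sum_is_even(o):
    return (o[0] + o[1]) % 2 == 0

def has_six(o):
    return 6 in o

pA_exact = prob(sum_is_even)
pB_exact = prob(has_six)
pAB_exact = prob(lambda o: sum_is_even(o) and has_six(o))

print("P(A) =", pA_exact)
print("P(B) =", pB_exact)
print("P(A∩B) =", pAB_exact)
print("P(A)*P(B) =", pA_exact * pB_exact)
# Exact equality is safe here because no floating-point arithmetic is involved
print("Independent?", pA_exact * pB_exact == pAB_exact)
```

With exact fractions, P(A)P(B) = 11/72 while P(A∩B) = 5/36 = 10/72, so the events are not independent.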

In [None]:
# Hands-On Practice 1
# Task: In a deck of 52 cards, let A = event "spade", B = event "rank is 2".
# Compute P(A), P(B), P(A∩B), and test independence by comparing P(A)P(B) with P(A∩B).
# Write your calculations below.

In [None]:
# Write your code here

In [None]:
# ===============================
# 2. Conditional probability, independence, total probability (manual computation)
# ===============================
import pandas as pd

# ----------------------------
# (1) Contingency Table Example: Handedness by Gender
# ----------------------------
data = pd.DataFrame({
    'Gender': ['M']*52 + ['F']*48,
    'Hand': ['R']*43 + ['L']*9 + ['R']*44 + ['L']*4
})

ct = pd.crosstab(data['Gender'], data['Hand'])
print("Contingency Table:\n", ct)

N = len(data)

# Basic probabilities
p_M = (data['Gender'] == 'M').mean()
p_F = (data['Gender'] == 'F').mean()
p_R = (data['Hand'] == 'R').mean()
p_L = (data['Hand'] == 'L').mean()

# Joint probabilities
p_M_and_R = ((data['Gender'] == 'M') & (data['Hand'] == 'R')).mean()
p_F_and_L = ((data['Gender'] == 'F') & (data['Hand'] == 'L')).mean()
p_M_and_L = ((data['Gender'] == 'M') & (data['Hand'] == 'L')).mean()
p_F_and_R = ((data['Gender'] == 'F') & (data['Hand'] == 'R')).mean()

# Conditional probabilities (by definition)
p_R_given_M = p_M_and_R / p_M
p_R_given_F = p_F_and_R / p_F
p_L_given_M = p_M_and_L / p_M
p_M_given_L = p_M_and_L / p_L

print(f"P(M)={p_M:.3f}, P(F)={p_F:.3f}, P(R)={p_R:.3f}, P(L)={p_L:.3f}")
print(f"P(M and R)={p_M_and_R:.3f}, P(F and L)={p_F_and_L:.3f}")
print(f"P(R|M)={p_R_given_M:.3f}, P(R|F)={p_R_given_F:.3f}, P(L|M)={p_L_given_M:.3f}, P(M|L)={p_M_given_L:.3f}")

# Independence check
print("Are Hand and Gender independent with respect to R?",
      "Yes" if abs(p_R_given_M - p_R) < 1e-12 else "No")
print("Numeric gap =", abs(p_R_given_M - p_R))
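
The section title mentions total probability, which the cell above does not compute explicitly. The following is a minimal sketch, assuming the same handedness counts (43, 9, 44, 4), that checks the law of total probability for P(R) and recovers P(M|L) through Bayes' rule. The helpers `marginal` and `joint` and the short variable names are hypothetical; exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# Cell counts copied from the handedness table: (gender, hand) -> count
counts = {('M', 'R'): 43, ('M', 'L'): 9, ('F', 'R'): 44, ('F', 'L'): 4}
n = sum(counts.values())

def marginal(position, value):
    """Exact marginal probability of a label at key position 0 (gender) or 1 (hand)."""
    return Fraction(sum(c for key, c in counts.items() if key[position] == value), n)

def joint(gender, hand):
    """Exact joint probability of one (gender, hand) cell."""
    return Fraction(counts[(gender, hand)], n)

pM, pF = marginal(0, 'M'), marginal(0, 'F')
pR, pL = marginal(1, 'R'), marginal(1, 'L')

# Conditional probabilities by definition: P(X|Y) = P(X and Y) / P(Y)
pR_given_M = joint('M', 'R') / pM
pR_given_F = joint('F', 'R') / pF
pL_given_M = joint('M', 'L') / pM

# Law of total probability: P(R) = P(R|M)P(M) + P(R|F)P(F)
pR_total = pR_given_M * pM + pR_given_F * pF
print("P(R) from total probability =", pR_total, "| direct =", pR)

# Bayes' rule: P(M|L) = P(L|M)P(M) / P(L)
pM_given_L = pL_given_M * pM / pL
print("P(M|L) from Bayes =", pM_given_L, "| direct =", joint('M', 'L') / pL)
```

Both routes agree: P(R) = 87/100 and P(M|L) = 9/13, which matches the value printed as `P(M|L)` in the cell above.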

### Hands-On Practice 2

In this exercise, we will explore the relationship between **smoking** and **exercise habits** in a small group of people.  
We will compute joint, marginal, and conditional probabilities using a `pandas` DataFrame and test whether the two variables are independent.

---

### Problem Statement

A health survey of 100 people produced the following results:

| Smoke | Exercise = Yes | Exercise = No | Total |
|:------|:----------------:|:---------------:|:------:|
| Yes   | 10 | 20 | 30 |
| No    | 50 | 20 | 70 |
| **Total** | **60** | **40** | **100** |

Let  
- A = event “person smokes,”  
- B = event “person exercises.”

Answer the following questions:

1. Compute \(P(A)\), \(P(B)\), \(P(A \cap B)\).  
2. Compute \(P(B \mid A)\) and \(P(B \mid \bar{A})\).  
3. Compute \(P(A \mid B)\) and \(P(A \mid \bar{B})\).  
4. Are “Smoke” and “Exercise” independent? Justify numerically.  
4. Are “Smoke” and “Exercise” independent? Justify numerically.  
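
One possible way to organize these calculations is to work directly from the table of counts: dividing by the grand total gives joint probabilities, and dividing by row or column totals gives conditional probabilities. This is a sketch only, assuming the counts from the survey table; the names `counts`, `joint`, `ex_given_smoke`, and `smoke_given_ex` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Counts copied from the survey table: rows = Smoke, columns = Exercise
counts = pd.DataFrame(
    {'Yes': [10, 50], 'No': [20, 20]},
    index=pd.Index(['Yes', 'No'], name='Smoke'),
)
counts.columns.name = 'Exercise'

n = counts.to_numpy().sum()        # total number of people surveyed
joint = counts / n                 # P(Smoke = s and Exercise = e)
p_smoke = joint.sum(axis=1)        # marginal distribution of Smoke
p_exercise = joint.sum(axis=0)     # marginal distribution of Exercise

# Each row sums to 1: P(Exercise = e | Smoke = s)
ex_given_smoke = counts.div(counts.sum(axis=1), axis=0)
# Each column sums to 1: P(Smoke = s | Exercise = e)
smoke_given_ex = counts.div(counts.sum(axis=0), axis=1)

print("P(B|A)     =", ex_given_smoke.loc['Yes', 'Yes'])
print("P(B|not A) =", ex_given_smoke.loc['No', 'Yes'])
print("P(A|B)     =", smoke_given_ex.loc['Yes', 'Yes'])
print("P(A|not B) =", smoke_given_ex.loc['Yes', 'No'])

# Independence requires P(A and B) = P(A) * P(B)
p_AB = joint.loc['Yes', 'Yes']
p_A_times_p_B = p_smoke['Yes'] * p_exercise['Yes']
independent = abs(p_AB - p_A_times_p_B) < 1e-12
print("P(A∩B) =", p_AB, " P(A)P(B) =", p_A_times_p_B, " Independent?", independent)
```

When the data are stored one row per person, `pd.crosstab` with `normalize='index'` or `normalize='columns'` produces the same conditional tables.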

In [1]:
import pandas as pd

data = pd.DataFrame({
    'Smoke': ['Yes']*10 + ['Yes']*20 + ['No']*50 + ['No']*20,
    'Exercise': ['Yes']*10 + ['No']*20 + ['Yes']*50 + ['No']*20
})

ct = pd.crosstab(data['Smoke'], data['Exercise'])
print("Contingency Table:\n", ct)

N = len(data)

Contingency Table:
 Exercise  No  Yes
Smoke            
No        20   50
Yes       20   10
