# Probability and Statistics Tutorial

This notebook covers **Conditional Probability**, **Bayes' Theorem**, **Binomial Distribution**, and **Beta Distribution**.

## Table of Contents
- [Conditional Probability and Bayes' Theorem](#conditional-probability-and-bayes-theorem)
- [Binomial Distribution](#binomial-distribution)
- [Beta Distribution](#beta-distribution)
- [Practice Questions](#practice-questions)


## Conditional Probability and Bayes' Theorem

**Conditional Probability** is the probability of event A given that event B has occurred, written P(A|B). It is defined as P(A and B) / P(B), provided P(B) > 0.
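
As a minimal sketch of this definition, the snippet below enumerates two fair six-sided dice and computes P(A|B) by counting outcomes. The dice setup and the two events are assumptions chosen for illustration; they are not part of the notebook's examples.

```python
from fractions import Fraction
from itertools import product

# Illustrative sample space (assumption): all ordered rolls of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def event_a(o):
    """Hypothetical event A: the two dice sum to 8."""
    return o[0] + o[1] == 8


def event_b(o):
    """Hypothetical event B: the first die is even."""
    return o[0] % 2 == 0


p_b = prob(event_b)
p_a_and_b = prob(lambda o: event_a(o) and event_b(o))
p_a_given_b = p_a_and_b / p_b  # P(A|B) = P(A and B) / P(B)

print(f"P(B) = {p_b}, P(A and B) = {p_a_and_b}, P(A|B) = {p_a_given_b}")
```

Using `Fraction` keeps the arithmetic exact, so the ratio can be compared directly with a hand count.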

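**Bayes' Theorem** reverses the conditioning, expressing P(A|B) in terms of P(B|A):
\[ P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} \]
where:
- P(A|B): Posterior probability of A given B.
- P(B|A): Likelihood, the probability of B given A.
- P(A): Prior probability of A.
- P(B): Marginal probability of B (the evidence).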

### Example: Coffee and Cancer
Given:
- P(Coffee) = 0.65
- P(Cancer) = 0.005
- P(Coffee|Cancer) = 0.85

Calculate P(Cancer|Coffee).


In [None]:
# Calculate P(Cancer|Coffee) using Bayes' Theorem
P_coffee = 0.65                # P(Coffee)
P_cancer = 0.005               # P(Cancer)
P_coffee_given_cancer = 0.85   # P(Coffee|Cancer)

P_cancer_given_coffee = (P_coffee_given_cancer * P_cancer) / P_coffee
print(f"Probability of cancer given coffee drinker: {P_cancer_given_coffee:.6f}")


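The result (about 0.65%) is only slightly above the 0.5% base rate, so cancer remains rare among coffee drinkers. The calculation describes an association in the given numbers, not a causal effect of coffee.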
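
The example does not state P(Coffee | No Cancer), but the law of total probability fixes it from the three given numbers. The sketch below derives that implied value as a consistency check. It also shows that the posterior differs from the base rate by the factor P(Coffee|Cancer) / P(Coffee). The variable names are chosen here for illustration.

```python
p_coffee = 0.65               # P(Coffee), from the example
p_cancer = 0.005              # P(Cancer), from the example
p_coffee_given_cancer = 0.85  # P(Coffee|Cancer), from the example

# Law of total probability:
#   P(Coffee) = P(Coffee|Cancer) P(Cancer) + P(Coffee|No Cancer) P(No Cancer)
# Solve for the one unknown, P(Coffee|No Cancer).
p_no_cancer = 1 - p_cancer
p_coffee_given_no_cancer = (
    p_coffee - p_coffee_given_cancer * p_cancer
) / p_no_cancer

# Bayes' theorem, then the ratio of posterior to prior.
p_cancer_given_coffee = p_coffee_given_cancer * p_cancer / p_coffee
posterior_to_prior = p_cancer_given_coffee / p_cancer

print(f"Implied P(Coffee|No Cancer): {p_coffee_given_no_cancer:.4f}")
print(f"P(Cancer|Coffee):            {p_cancer_given_coffee:.6f}")
print(f"Posterior / prior:           {posterior_to_prior:.4f}")
```

The implied value lies between 0 and 1, so the three inputs are mutually consistent.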


## Binomial Distribution

The **Binomial Distribution** models the number of successes in n independent trials, each with the same success probability p.

**PMF**:
\[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \]

**Properties**:
- n: Number of trials.
- p: Success probability.
- Mean: np, Variance: np(1-p).
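
The mean and variance formulas can be checked numerically by summing over the PMF. The following is a minimal sketch using only the standard library; the helper `binom_pmf` and the values n = 10, p = 0.8 are assumptions chosen for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), written out from the PMF formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.8  # illustrative values (assumption)

pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)                                   # should be 1
mean = sum(k * pk for k, pk in enumerate(pmf))     # E[X]
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))  # Var[X]

print(f"Sum of PMF: {total:.6f}")
print(f"Mean:       {mean:.6f}  (np = {n * p:.6f})")
print(f"Variance:   {variance:.6f}  (np(1-p) = {n * p * (1 - p):.6f})")
```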

### Example: Classifier Accuracy
A classifier with 80% accuracy (p = 0.8) makes 10 independent predictions (n = 10). Compute the probability of each possible number of correct predictions, k = 0 to 10.


In [None]:
from scipy.stats import binom

# Parameters
n_trials = 10
p_success = 0.8

# Compute PMF for k = 0 to 10
print("Number of Successes (k) | Probability")
print("----------------------------------")
for k in range(n_trials + 1):
    probability = binom.pmf(k, n_trials, p_success)
    print(f"{k:>19} | {probability:.6f}")


| Function | Description | Python Code          |
|----------|-------------|----------------------|
| PMF      | P(X = k)    | `binom.pmf(k, n, p)` |
| CDF      | P(X <= k)   | `binom.cdf(k, n, p)` |
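
The CDF is a running sum of the PMF, which makes tail and interval probabilities simple differences. The sketch below rebuilds both with the standard library so the relationship is explicit. The helpers `binom_pmf` and `binom_cdf` are hypothetical names defined here, and with SciPy installed `binom.cdf(7, n, p)` should agree with `binom_cdf(7, n, p)`.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k), obtained by summing the PMF from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


n, p = 10, 0.8  # classifier example: 10 predictions, 80% accuracy

p_at_most_7 = binom_cdf(7, n, p)                       # P(X <= 7)
p_at_least_8 = 1 - p_at_most_7                         # P(X >= 8)
p_from_7_to_9 = binom_cdf(9, n, p) - binom_cdf(6, n, p)  # P(7 <= X <= 9)

print(f"P(X <= 7)      = {p_at_most_7:.6f}")
print(f"P(X >= 8)      = {p_at_least_8:.6f}")
print(f"P(7 <= X <= 9) = {p_from_7_to_9:.6f}")
```

Note the off-by-one in the tail: P(X >= 8) uses the CDF at 7, not at 8.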


## Beta Distribution

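The **Beta Distribution** is a continuous distribution on [0, 1], commonly used to model uncertainty about a probability such as the success probability p of a binomial experiment.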

**Properties**:
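- Parameters: alpha > 0 and beta > 0 (shape parameters). In Bayesian updating they act as pseudo-counts of successes and failures.
- PDF: \[ f(x) \propto x^{\alpha-1} (1-x)^{\beta-1} \]
- Mean: \[ \frac{\alpha}{\alpha + \beta} \]
- Posterior: with a Beta(alpha, beta) prior and a binomial likelihood, observing s successes and f failures gives Beta(alpha + s, beta + f).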
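
The PDF above is stated only up to proportionality. The constant that makes it integrate to 1 is the Beta function, B(alpha, beta) = Gamma(alpha) Gamma(beta) / Gamma(alpha + beta). The sketch below uses only the standard library to check the normalization and the mean by midpoint integration. The helper `beta_pdf`, the shape values, and the step count are assumptions chosen for illustration.

```python
from math import gamma


def beta_pdf(x, a, b):
    """Normalized Beta(a, b) density at x."""
    norm = gamma(a) * gamma(b) / gamma(a + b)  # Beta function B(a, b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / norm


a, b = 2.0, 5.0  # illustrative shape parameters (assumption)

# Midpoint rule on [0, 1]; midpoints avoid evaluating exactly at 0 or 1.
steps = 10_000
width = 1.0 / steps
midpoints = [(i + 0.5) * width for i in range(steps)]

area = sum(beta_pdf(x, a, b) for x in midpoints) * width
mean = sum(x * beta_pdf(x, a, b) for x in midpoints) * width

print(f"Area under PDF: {area:.6f}")
print(f"Numerical mean: {mean:.6f}  (alpha / (alpha + beta) = {a / (a + b):.6f})")
```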

### Example: Posterior Distribution
With 6 successes, 4 failures, and a uniform prior (Beta(1, 1)), compute and plot the posterior.


In [None]:
from scipy.stats import beta
import numpy as np
import matplotlib.pyplot as plt

# Observed data
successes = 6
failures = 4
alpha_prior = 1
beta_prior = 1

# Posterior parameters
alpha_post = alpha_prior + successes
beta_post = beta_prior + failures

# Plot posterior
x = np.linspace(0, 1, 100)
y = beta.pdf(x, alpha_post, beta_post)
plt.figure(figsize=(8, 5))
plt.plot(x, y, label=f'Beta({alpha_post}, {beta_post})', color='blue')
plt.fill_between(x, y, alpha=0.1, color='blue')
plt.title('Posterior Distribution of Success Probability')
plt.xlabel('Success Probability')
plt.ylabel('Density')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Compute mean
mean_estimate = alpha_post / (alpha_post + beta_post)
print(f'Mean of posterior: {mean_estimate:.3f}')


The posterior is Beta(1 + 6, 1 + 4) = Beta(7, 5), with mean 7/12 ≈ 0.583.
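
The posterior mean is one of several summaries. As a sketch, the snippet below compares it with the posterior mode and the raw success rate from the 6 successes and 4 failures, and adds the posterior standard deviation. It uses the standard closed-form Beta expressions: mode (alpha - 1) / (alpha + beta - 2) for alpha, beta > 1, and variance alpha beta / ((alpha + beta)^2 (alpha + beta + 1)). Variable names are chosen here for illustration.

```python
from math import sqrt

successes, failures = 6, 4
alpha_post, beta_post = 1 + successes, 1 + failures  # uniform Beta(1, 1) prior

total = alpha_post + beta_post
post_mean = alpha_post / total
post_mode = (alpha_post - 1) / (total - 2)           # valid for alpha, beta > 1
post_var = alpha_post * beta_post / (total**2 * (total + 1))
post_sd = sqrt(post_var)

# Raw success rate (the maximum-likelihood estimate of p).
mle = successes / (successes + failures)

print(f"Posterior mean: {post_mean:.3f}")
print(f"Posterior mode: {post_mode:.3f}")
print(f"Sample rate:    {mle:.3f}")
print(f"Posterior SD:   {post_sd:.3f}")
```

With a uniform prior the mode coincides with the sample rate, while the mean is pulled slightly toward 0.5.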


## Practice Questions

### Q1: Binomial Distribution
For the classifier (n = 10, p = 0.8), compute P(X = 8).


In [None]:
from scipy.stats import binom

n_trials = 10
p_success = 0.8
k_successes = 8
probability = binom.pmf(k_successes, n_trials, p_success)
print(f'Probability of exactly 8 correct predictions: {probability:.6f}')


### Q2: Beta Distribution
For the posterior Beta(7, 5), compute P(p <= 0.9).


In [None]:
from scipy.stats import beta

alpha_post = 7
beta_post = 5
prob = beta.cdf(0.9, alpha_post, beta_post)
print(f'Probability that success probability is at most 0.9: {prob:.3f}')
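
For positive integer parameters, the Beta CDF can be cross-checked without SciPy. It equals a binomial tail probability: the CDF of Beta(a, b) at x is P(Y >= a) for Y ~ Binomial(a + b - 1, x). The sketch below applies that identity to Beta(7, 5). The helper `beta_cdf_integer` is a hypothetical name defined here and assumes integer a and b.

```python
from math import comb


def beta_cdf_integer(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a, b, via a binomial tail sum."""
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


prob = beta_cdf_integer(0.9, 7, 5)
print(f"P(p <= 0.9) under Beta(7, 5): {prob:.3f}")
```

The value should match `beta.cdf(0.9, 7, 5)` up to floating-point error.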
