# Chapter 6 Toolkit: Random Variable Generation (Lecture Notes pp. 102–109)

This notebook is a **complete, reusable toolkit** for Chapter 6: Random Variable Generation.

---

## 📚 What You'll Learn

This toolkit helps you understand **how computers generate random numbers** and **how to sample from probability distributions**. Since computers are deterministic machines, they can't produce truly random numbers; instead, they create sequences that *look* random (called **pseudorandom**).

---

## 🎯 How to Use This Notebook

1. **Run cells sequentially from top to bottom** to set up all functions
2. Once functions are loaded, you can:
   - Rerun any example cell to see it work again
   - Modify parameters to experiment with different settings
   - Use the functions in your own analyses

3. **Take it slow**: Each section builds on previous concepts
4. **Run the demos**: They help visualize what's happening

---

## 📖 Sections Overview

### **6.1 Congruential Generators (LCG)**: Building Blocks of Randomness
Learn how computers generate sequences of "random" numbers using a simple mathematical formula. We'll explore:
- **What is an LCG?** A formula that generates numbers: u_i = (a × u_{i-1} + b) mod M
- **Why it matters**: Understanding when these generators work well (full period)
- **Quality checks**: How to verify if numbers are truly "random-looking"

**Key Concepts Covered:**
- **Definition 6.1**: Pseudorandomness (what makes numbers "random enough")
- **Definition 6.2**: Period (how long before the sequence repeats)
- **Lemma 6.6**: Full period → uniform distribution (good randomness!)
- **Lemma 6.8 & 6.10**: How to map to smaller ranges
- **Theorem 6.11** (Hull–Dobell): Conditions that guarantee full period
- **Lemma 6.12 & Corollary 6.13**: Expected statistical properties (mean/variance)
- **Lemma 6.14**: Quality bounds for intervals

---

### **6.2 Sampling from Distributions**: From Uniform to Any Distribution
Once we have uniform random numbers on [0,1], how do we get samples from other distributions like Normal, Exponential, etc.?

**Two Main Techniques:**

1. **Inversion Sampling** (Theorem 5.39)
   - If you know the inverse CDF: X = F⁻¹(U)
   - Simple and exact when inverse CDF is available

2. **Accept-Reject Sampling** (Algorithm 1, Lemma 6.15)
   - When inverse CDF is hard to compute
   - Generate from an easier proposal, then accept/reject strategically

**Special Focus:**
- **Normal Distribution**: Box–Muller transform (Theorem 6.16)
- **Optional**: Inverse CDF approximation for Gaussian

---

### **6.3 Practice Exercises**: Apply What You've Learned
Work through concrete examples to solidify understanding:
- **Exercise 6.18**: Testing uniformity with Kolmogorov-Smirnov test
- **Exercise 6.19**: Testing normality (with important caveats)
- **Exercise 6.20**: Sampling from p(x) = 0.5 cos(x) using both methods

---

## 💡 Learning Tips

- **Definitions tell you WHAT**: Understand the terminology
- **Lemmas/Theorems tell you WHY**: Understand why methods work
- **Code shows you HOW**: See the concepts in action
- **Plots help you SEE**: Visualize distributions and patterns

**Don't rush!** Each concept builds on the previous one. If something is unclear, review the previous section.

---

## 🚀 Quick Start

Run the import cell below to get started!

In [None]:
import numpy as np
import math
from dataclasses import dataclass
from typing import Callable, Tuple, Dict, Optional, List
import matplotlib.pyplot as plt


---

## 6.1 Linear Congruential Generators (LCG)

### 🤔 What is an LCG?

A **Linear Congruential Generator** is the simplest way to generate pseudorandom numbers. It uses a simple formula to create a sequence of integers:

$$u_i = (a \cdot u_{i-1} + b) \mod M$$

**Breaking it down:**
- **uᵢ**: The current random number
- **uᵢ₋₁**: The previous random number
- **a**: Multiplier (a constant we choose)
- **b**: Increment (another constant we choose)
- **M**: Modulus (defines the range: 0 to M-1)
- **mod M**: Take remainder after dividing by M

**Think of it like a clock:** If M=12 (like a clock) and you start at 3, then 5×3 + 1 = 16, and 16 mod 12 = 4. The next number depends only on the current number, so the sequence is deterministic even though it looks random!

**Key Insight:** The sequence will eventually repeat (cycle back to where it started). The length of this cycle is called the **period**. A good LCG should have a long period (ideally M, the maximum possible).
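
To make the clock picture concrete, here is a minimal sketch (not part of the original notes) that steps the recurrence by hand with the clock-style parameters a = 5, b = 1, M = 12 and starting value 3. The helper name `lcg_steps` is made up for this illustration.

```python
def lcg_steps(a, b, M, seed, n):
    """Return the first n values u_0, u_1, ... of u_i = (a*u_{i-1} + b) mod M."""
    values = []
    u = seed % M
    for _ in range(n):
        values.append(u)
        u = (a * u + b) % M
    return values

clock_values = lcg_steps(a=5, b=1, M=12, seed=3, n=9)
print(clock_values)  # [3, 4, 9, 10, 3, 4, 9, 10, 3]

# The period is the number of steps until the starting value comes back.
clock_period = clock_values.index(clock_values[0], 1)
print(clock_period)  # 4
```

With these particular parameters the sequence repeats after only 4 steps, far short of the maximum of 12, which is why the choice of a, b and M matters.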

### 📊 LCG Recurrence Formula

$$u_i = (a \cdot u_{i-1} + b) \mod M$$

All values are in the set $\{0, 1, 2, \ldots, M-1\}$

In [None]:
def lcg(a: int, b: int, M: int, seed: int):
    """
    Infinite Linear Congruential Generator stream.
    
    Generates pseudorandom integers in {0, 1, ..., M-1} using:
        u_i = (a * u_{i-1} + b) mod M
    
    Parameters:
    -----------
    a : int
        Multiplier (should be chosen carefully)
    b : int
        Increment 
    M : int
        Modulus (defines the range and maximum period)
    seed : int
        Starting value u_0 (initial state)
    
    Yields:
    -------
    int : Next pseudorandom value in the sequence
    
    Example:
    --------
    >>> gen = lcg(5, 1, 16, 1)
    >>> [next(gen) for _ in range(5)]
    [1, 6, 15, 12, 13]
    """
    if M <= 0:
        raise ValueError("M must be positive.")
    
    # Start with the seed value, brought into range [0, M-1]
    u = seed % M
    
    while True:
        yield u  # Return current value
        u = (a * u + b) % M  # Generate next value

def lcg_sequence(a: int, b: int, M: int, seed: int, n: int) -> np.ndarray:
    """
    Generate first n values from LCG as a NumPy array.
    
    This is a convenience wrapper around lcg() for getting
    a fixed number of values at once.
    
    Parameters:
    -----------
    n : int
        Number of values to generate
    
    Returns:
    --------
    np.ndarray : Array of n pseudorandom integers
    """
    gen = lcg(a, b, M, seed)
    return np.fromiter((next(gen) for _ in range(n)), dtype=np.int64, count=n)

def estimate_period_from_seed(a: int, b: int, M: int, seed: int, max_steps: int = 5_000_000) -> int:
    """
    Find the period (cycle length) of an LCG by detecting when it repeats.
    
    How it works:
    1. Generate the first value
    2. Keep generating until we see that first value again
    3. Count how many steps it took
    
    ⚠️ Warning: Only use this for small M (like M < 10,000).
    For large M the search is slow and gives up after max_steps.
    It also assumes the seed lies on the cycle, which holds when gcd(a, M) = 1.
    
    Parameters:
    -----------
    max_steps : int
        Maximum iterations before giving up
    
    Returns:
    --------
    int : The period (cycle length)
    
    Example:
    --------
    >>> estimate_period_from_seed(5, 1, 16, 1)
    16  # Full period!
    """
    gen = lcg(a, b, M, seed)
    first = next(gen)  # Remember the first value
    
    for t in range(1, max_steps + 1):
        if next(gen) == first:  # Found the cycle!
            return t
    
    # If we get here, period is longer than max_steps
    raise RuntimeError("Period not found within max_steps.")

---

### Definition 6.1: Pseudorandomness

**Question:** How do we know if our generated numbers are "random enough"?

**Answer:** We check if each value appears with **equal frequency** over the full period.

For M possible values {0, 1, ..., M-1}, each should appear approximately 1/M of the time.

**Why this matters:** If some numbers appear more often than others, your "random" numbers have a bias, which can ruin statistical simulations!

In [None]:
def frequency_table(seq: np.ndarray, M: int) -> np.ndarray:
    """
    Calculate how often each value appears in a sequence.
    
    For pseudorandom numbers in {0, ..., M-1}, we expect each
    value to appear with frequency ≈ 1/M.
    
    Parameters:
    -----------
    seq : np.ndarray
        Sequence of integers to analyze
    M : int
        Number of possible values (0 to M-1)
    
    Returns:
    --------
    np.ndarray : Frequency of each value (sums to 1.0)
    """
    counts = np.bincount(np.asarray(seq, dtype=np.int64), minlength=M)
    return counts / counts.sum()

def uniformity_score(freqs: np.ndarray) -> float:
    """
    Measure how far frequencies deviate from uniform (1/M for each value).
    
    Returns the maximum deviation from expected frequency.
    
    Lower score = better uniformity (closer to truly random)
    Score of 0 = perfect uniformity
    
    Parameters:
    -----------
    freqs : np.ndarray
        Frequency array from frequency_table()
    
    Returns:
    --------
    float : Maximum deviation from uniform (0 = perfect)
    
    Example:
    --------
    If M=10, ideal frequency is 0.1 for each value.
    If one value has frequency 0.15, deviation = |0.15 - 0.1| = 0.05
    """
    M = len(freqs)
    expected = 1.0 / M
    return float(np.max(np.abs(freqs - expected)))

def show_frequency_bar(freqs: np.ndarray, title: str = "Frequencies"):
    """
    Visualize frequency distribution as a bar chart.
    
    For good pseudorandomness, all bars should be approximately equal height.
    """
    plt.figure()
    plt.bar(np.arange(len(freqs)), freqs)
    plt.axhline(y=1.0/len(freqs), color='r', linestyle='--', 
                label=f'Expected (1/{len(freqs)})')
    plt.title(title)
    plt.xlabel("value")
    plt.ylabel("frequency")
    plt.legend()
    plt.show()

---

### Lemma 6.6: Full Period ⟹ Uniform Frequencies

**Key Result:** If an LCG has **full period** (period = M), then each value in {0, ..., M-1} appears **exactly once** per cycle!

**Why?** With period M, the sequence visits all M values before repeating. So over one full cycle:
- Each value appears exactly 1 time
- Frequency of each value = 1/M (perfectly uniform!)
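
As a quick check of this claim, the sketch below collects exactly one cycle (M consecutive values) of the small example a = 5, b = 1, M = 16 and counts how often each value occurs. It is an illustration written for this toolkit, not code from the lecture notes.

```python
import numpy as np

a, b, M, seed = 5, 1, 16, 1   # small full-period example used throughout this section

# Collect exactly one cycle: M consecutive values starting from the seed.
u = seed % M
one_cycle = []
for _ in range(M):
    one_cycle.append(u)
    u = (a * u + b) % M

counts = np.bincount(one_cycle, minlength=M)
print(sorted(one_cycle))   # every value 0..15 appears
print(counts)              # each count is 1
print(counts / M)          # each frequency is 1/16 = 0.0625
```

After M steps the state `u` is back at the seed, so the next cycle repeats the same M values in the same order.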

**Demo below:** We'll generate 200 values and check if frequencies are close to uniform.

In [None]:
# Demo: Test an LCG for full period and uniform frequencies
a, b, M, seed = 5, 1, 16, 1

# Generate 200 values (which is 200/16 = 12.5 full cycles)
seq = lcg_sequence(a, b, M, seed, n=200)

# Calculate frequencies
freqs = frequency_table(seq, M)

# Find the period
period = estimate_period_from_seed(a, b, M, seed)

print(f"Period = {period} (full period would be {M})")
print(f"Uniformity score = {uniformity_score(freqs):.6f} (lower is better, 0 is perfect)")
print(f"Expected frequency per value = {1/M:.4f}")

# Visualize: bars should all be approximately the same height
show_frequency_bar(freqs, title=f"LCG Frequencies (period={period}, M={M})")

---

### Lemma 6.8 and 6.10: Mapping to Smaller Ranges

**Problem:** Our LCG generates numbers in {0, ..., M-1}, but what if we want a smaller range {0, ..., K-1} where K < M?

**Two Solutions:**

#### **Method 1: Modulo Mapping** (Lemma 6.8)
$$v_i = u_i \mod K$$

**Pros:** Maintains good frequency distribution  
**Cons:** **Period becomes K** (much shorter than M!)

**When to use:** When you need good frequency balance and short period is okay

---

#### **Method 2: Scaled-Floor Mapping** (Lemma 6.10)
$$v_i = \lfloor \frac{u_i}{M} \cdot K \rfloor$$

**Pros:** **Period stays M** (doesn't shrink!)  
**Cons:** Slightly less uniform frequencies

**When to use:** When you need to maintain the long period

---

**Intuition:**
- **Mod:** Like cutting a clock from 12 hours to 4 hours (wraps every 4)
- **Scaled-floor:** Like rescaling 0-100 to 0-10 (proportional mapping)
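
The effect of each mapping on the period can be measured directly. The sketch below uses illustrative values (M = 16, K = 4, so K divides M) and a hypothetical helper `smallest_period`; it is meant as a sanity check of the two lemmas on one small case, not as a proof.

```python
def smallest_period(seq):
    """Smallest p with seq[i] == seq[i + p] for every valid i (seq spans two cycles)."""
    n = len(seq)
    for p in range(1, n // 2 + 1):
        if seq[:n - p] == seq[p:]:
            return p
    return None

a, b, M, K, seed = 5, 1, 16, 4, 1   # illustrative values; K divides M here

# Two full cycles of the underlying LCG (2*M values).
u_vals, u = [], seed % M
for _ in range(2 * M):
    u_vals.append(u)
    u = (a * u + b) % M

v_mod = [x % K for x in u_vals]            # modulo mapping (Lemma 6.8)
v_floor = [(x * K) // M for x in u_vals]   # scaled-floor mapping (Lemma 6.10), exact integers

print(smallest_period(u_vals))   # 16
print(smallest_period(v_mod))    # 4
print(smallest_period(v_floor))  # 16
```

Integer arithmetic `(x * K) // M` is used instead of floating-point division so the floor is exact.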

**Demo below:** Compare both methods on the same LCG

In [None]:
def map_mod_k(u: np.ndarray, K: int) -> np.ndarray:
    """
    Map values to {0, ..., K-1} using modulo operation.
    
    Formula: v_i = u_i mod K
    
    ⚠️ Warning: This reduces the period to K!
    """
    u = np.asarray(u, dtype=np.int64)
    return (u % K).astype(np.int64)

def map_scaled_floor(u: np.ndarray, M: int, K: int) -> np.ndarray:
    """
    Map values to {0, ..., K-1} using proportional scaling.
    
    Formula: v_i = floor((u_i / M) * K)
    
    ✓ Preserves the period M
    """
    u = np.asarray(u, dtype=np.float64)
    return np.floor((u / M) * K).astype(np.int64)

# Demo: Compare both mapping methods
M_demo = 2**10  # Original range: 0 to 1023
K_demo = 64     # Target range: 0 to 63

a2, b2, seed2 = 5, 1, 7

# Generate 50,000 values from LCG
u = lcg_sequence(a2, b2, M_demo, seed2, n=50000)

# Method 1: Modulo mapping
v_mod = map_mod_k(u, K_demo)

# Method 2: Scaled-floor mapping  
v_floor = map_scaled_floor(u, M_demo, K_demo)

# Compare uniformity
print("="*60)
print("Mapping Comparison:")
print("="*60)
print(f"Original range: [0, {M_demo-1}]")
print(f"Target range:   [0, {K_demo-1}]")
print()
print(f"Modulo mapping uniformity score:       {uniformity_score(frequency_table(v_mod, K_demo)):.6f}")
print(f"Scaled-floor mapping uniformity score: {uniformity_score(frequency_table(v_floor, K_demo)):.6f}")
print()
print("Lower score = better uniformity")
print("Both methods work well, but mod has slightly better uniformity")
print("However, mod reduces period to K, while scaled-floor keeps period M!")

---

### Theorem 6.11: Hull–Dobell Conditions for Full Period

**The Big Question:** How do we choose parameters (a, b, M) to guarantee full period?

**Answer: The Hull–Dobell Theorem**

An LCG has **full period M** if and only if ALL three conditions hold:

1. **gcd(b, M) = 1**  
   → b and M must be coprime (share no common factors except 1)

2. **For every prime p that divides M:**  
   → (a - 1) must be divisible by p  
   → In math: p | (a - 1) for all primes p | M

3. **If 4 divides M:**  
   → Then 4 must also divide (a - 1)  
   → In math: 4 | M  ⟹  4 | (a - 1)

---

**Why This Matters:**  
These conditions ensure the sequence cycles through ALL M values before repeating. If any condition fails, you'll get a shorter period (bad randomness!).

**Example:**  
- M = 16 = 2⁴ (so 4 | M)
- Prime factors of M: {2}
- Choose a = 5, b = 1
  - gcd(1, 16) = 1 ✓
  - (5-1) = 4, and 2 | 4 ✓
  - (5-1) = 4, and 4 | 4 ✓
- **Result:** Full period guaranteed!
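
Because the theorem is an "if and only if", it can be cross-checked by brute force on small moduli. The sketch below is an illustration under that assumption: the helper names (`prime_divisors`, `satisfies_hull_dobell`, `has_full_period`) and the list of moduli are chosen here, not taken from the lecture notes.

```python
import math

def prime_divisors(n):
    """Distinct prime divisors of n by trial division (fine for small n)."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def satisfies_hull_dobell(a, b, M):
    """Evaluate the three conditions of Theorem 6.11."""
    c1 = math.gcd(b, M) == 1
    c2 = all((a - 1) % p == 0 for p in prime_divisors(M))
    c3 = (M % 4 != 0) or ((a - 1) % 4 == 0)
    return c1 and c2 and c3

def has_full_period(a, b, M, seed=0):
    """Brute force: the first M values are all distinct and the next one is the seed."""
    seen, u = set(), seed % M
    for _ in range(M):
        seen.add(u)
        u = (a * u + b) % M
    return len(seen) == M and u == seed % M

# Compare the theorem with brute force for every (a, b) and a few small moduli.
mismatches = [
    (a, b, M)
    for M in (8, 9, 12, 16, 30)
    for a in range(M)
    for b in range(M)
    if satisfies_hull_dobell(a, b, M) != has_full_period(a, b, M)
]
print(mismatches)  # an empty list means the two checks agreed everywhere
```

The brute-force check only scales to small M, which is exactly why the three arithmetic conditions are useful for realistic moduli.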

**Demo below:** Test whether parameters satisfy Hull–Dobell

In [None]:
def prime_factors(n: int) -> List[int]:
    """
    Find all unique prime factors of n.
    
    Example:
    --------
    >>> prime_factors(16)
    [2]  # 16 = 2^4
    
    >>> prime_factors(30)
    [2, 3, 5]  # 30 = 2 × 3 × 5
    """
    n = abs(int(n))
    factors = []
    if n < 2:
        return factors
    
    # Check for factor of 2
    if n % 2 == 0:
        factors.append(2)
        while n % 2 == 0:
            n //= 2
    
    # Check odd factors from 3 upward
    p = 3
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 2
    
    # If n > 1, then it's a prime factor
    if n > 1:
        factors.append(n)
    
    return factors

def hull_dobell_conditions(a: int, b: int, M: int) -> Dict[str, object]:
    """
    Check if LCG parameters satisfy Hull–Dobell conditions for full period.
    
    Returns a detailed report of each condition.
    
    Example:
    --------
    >>> hull_dobell_conditions(5, 1, 16)
    {'gcd(b,M)=1': True,
     'primes(M)': [2],
     'p | (a-1) for all primes p|M': True,
     'if 4|M then 4|(a-1)': True,
     'FULL_PERIOD_EXPECTED': True}
    """
    if M <= 0:
        raise ValueError("M must be positive.")
    
    # Condition 1: gcd(b, M) = 1
    cond1 = math.gcd(b, M) == 1
    
    # Find prime factors of M
    primes = prime_factors(M)
    
    # Condition 2: For all primes p dividing M, p must divide (a-1)
    cond2 = all(((a - 1) % p == 0) for p in primes)
    
    # Condition 3: If 4 divides M, then 4 must divide (a-1)
    cond3 = True if (M % 4 != 0) else ((a - 1) % 4 == 0)
    
    return {
        "gcd(b,M)=1": cond1,
        "primes(M)": primes,
        "p | (a-1) for all primes p|M": cond2,
        "if 4|M then 4|(a-1)": cond3,
        "FULL_PERIOD_EXPECTED": (cond1 and cond2 and cond3),
    }

# Test: Does a=5, b=1, M=2^10 give full period?
print("Testing Hull–Dobell conditions for a=5, b=1, M=1024:")
print("="*60)
result = hull_dobell_conditions(a=5, b=1, M=2**10)
for key, value in result.items():
    print(f"{key}: {value}")
print("="*60)

if result["FULL_PERIOD_EXPECTED"]:
    print("✓ All conditions satisfied! Full period guaranteed.")
else:
    print("✗ Some conditions failed. Period will be less than M.")

---

### Lemma 6.12 + Corollary 6.13: Long-Run Statistical Properties

**Question:** What are the expected mean and variance of LCG values?

**Two Scenarios:**

#### **Scenario 1: Raw LCG values** (integers in {0, 1, ..., M-1})

For discrete uniform distribution:
- **Mean**: $\mu = \frac{M-1}{2}$
- **Variance**: $\sigma^2 = \frac{M^2 - 1}{12}$

**Intuition:** Values range from 0 to M-1, so the average is roughly M/2.

---

#### **Scenario 2: Scaled to [0,1]** (v = u/M)

When we scale values to [0, 1]:
- **Mean**: $\mu = \frac{1}{2} - \frac{1}{2M}$ (approaches 0.5 as M grows)
- **Variance**: $\sigma^2 = \frac{1}{12} - \frac{1}{12M^2}$ (approaches 1/12 as M grows)

**Why 1/12?** For continuous Uniform[0,1], variance = (1-0)²/12 = 1/12
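
Since a full-period LCG visits each of 0, ..., M-1 exactly once per cycle, the formulas above can be checked against the plain list of integers. The sketch below does this for an example modulus M = 1024; the variable names are chosen for this illustration only.

```python
import numpy as np

M_small = 1024                      # example modulus; any M >= 2 would do
u_all = np.arange(M_small)          # one full period, in sorted order
v_all = u_all / M_small             # scaled values in [0, 1)

# Raw integers: mean (M-1)/2 and variance (M^2 - 1)/12
print(u_all.mean(), (M_small - 1) / 2)
print(u_all.var(), (M_small**2 - 1) / 12)

# Scaled values: mean 1/2 - 1/(2M) and variance 1/12 - 1/(12 M^2)
print(v_all.mean(), 0.5 - 1 / (2 * M_small))
print(v_all.var(), 1 / 12 - 1 / (12 * M_small**2))
```

Each printed pair should agree up to floating-point rounding, because over one full period the empirical moments are the exact discrete-uniform moments.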

---

**Demo below:** Generate many values and verify these theoretical predictions

In [None]:
def theoretical_discrete_uniform_moments(M: int) -> Tuple[float, float]:
    """
    Calculate theoretical mean and variance for discrete uniform {0, ..., M-1}.
    
    Returns:
    --------
    (mean, variance)
    """
    mean = (M - 1) / 2.0
    variance = (M**2 - 1) / 12.0
    return mean, variance

def theoretical_scaled_moments(M: int) -> Tuple[float, float]:
    """
    Calculate theoretical mean and variance when scaled to [0, 1].
    
    Formula for v = u/M:
    - Mean = 1/2 - 1/(2M)
    - Variance = 1/12 - 1/(12M²)
    
    As M → ∞, these approach the continuous Uniform[0,1] moments:
    - Mean → 0.5
    - Variance → 1/12 ≈ 0.0833
    """
    mean = 0.5 - 1.0 / (2.0 * M)
    variance = 1.0 / 12.0 - 1.0 / (12.0 * M * M)
    return mean, variance

def empirical_moments(x: np.ndarray) -> Tuple[float, float]:
    """Calculate sample mean and variance from data."""
    x = np.asarray(x, dtype=np.float64)
    return float(np.mean(x)), float(np.var(x, ddof=0))

# Demo: Compare theoretical vs empirical moments
M3 = 2**16  # M = 65536
u3 = lcg_sequence(5, 1, M3, 123, n=200000)  # Generate 200,000 values
v3 = u3 / M3  # Scale to [0, 1]

print("="*70)
print("MOMENTS COMPARISON: Theoretical vs Empirical")
print("="*70)
print()
print("Raw integer values u (in {0, ..., M-1}):")
print("-"*70)
emp_mean_u, emp_var_u = empirical_moments(u3)
theo_mean_u, theo_var_u = theoretical_discrete_uniform_moments(M3)
print(f"  Empirical:   mean = {emp_mean_u:10.4f}  variance = {emp_var_u:12.4f}")
print(f"  Theoretical: mean = {theo_mean_u:10.4f}  variance = {theo_var_u:12.4f}")
print(f"  Difference:        = {abs(emp_mean_u - theo_mean_u):10.6f}           = {abs(emp_var_u - theo_var_u):12.6f}")
print()

print("Scaled values v (in [0, 1]):")
print("-"*70)
emp_mean_v, emp_var_v = empirical_moments(v3)
theo_mean_v, theo_var_v = theoretical_scaled_moments(M3)
print(f"  Empirical:   mean = {emp_mean_v:.6f}  variance = {emp_var_v:.6f}")
print(f"  Theoretical: mean = {theo_mean_v:.6f}  variance = {theo_var_v:.6f}")
print(f"  Difference:        = {abs(emp_mean_v - theo_mean_v):.8f}           = {abs(emp_var_v - theo_var_v):.8f}")
print()
print("Note: Empirical values should be very close to theoretical!")
print("      Small differences arise because n is not a whole number of periods.")

---

### Lemma 6.14: Interval Frequency Error Bound

**Question:** If we scale LCG values to [0, 1], how accurate are frequencies in any interval (a, b)?

**Answer:** The error is bounded by **1/M**

**Formal Statement:**  
Let v = u/M be scaled to [0, 1], counted over one full period (n = M) of a full-period LCG. For any interval (a, b) with 0 ≤ a < b ≤ 1:

$$\left| \frac{\#\{v_i \in (a,b)\}}{n} - (b - a) \right| \leq \frac{1}{M}$$

**What this means:**
- Left side: empirical frequency in interval (a, b)
- (b - a): theoretical probability for Uniform[0,1]
- The difference is at most 1/M

**Intuition:** Larger M → smaller error → better approximation to true uniform distribution

**Example:** If M = 1024 and (a,b) = (0.2, 0.7), theoretical probability = 0.5. Over a full period, the empirical frequency will be between 0.499 and 0.501 (error ≤ 1/1024 ≈ 0.001).
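
The bound can be probed numerically by trying many intervals on one full period. The sketch below assumes a = 5, b = 1, M = 1024 (which satisfies Hull–Dobell) and draws 1000 random intervals with NumPy's generator; the seed and the number of intervals are arbitrary choices for this illustration.

```python
import numpy as np

a_mult, b_inc, M_mod = 5, 1, 1024     # full-period parameters (illustrative)

# One full period, scaled to [0, 1).
u, vals = 0, []
for _ in range(M_mod):
    vals.append(u)
    u = (a_mult * u + b_inc) % M_mod
v = np.array(vals) / M_mod

rng = np.random.default_rng(0)        # fixed seed so the experiment is repeatable
worst = 0.0
for _ in range(1000):
    lo, hi = np.sort(rng.random(2))   # random interval (lo, hi) inside [0, 1]
    freq = np.mean((v > lo) & (v < hi))
    worst = max(worst, abs(freq - (hi - lo)))

# Compare the worst observed error with the 1/M bound.
print(worst, 1 / M_mod, worst <= 1 / M_mod)
```

The check uses exactly M values on purpose: over a partial period the frequencies can deviate by much more than 1/M.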

**Demo below:** Test this bound on an actual interval

In [None]:
def interval_frequency(x: np.ndarray, a: float, b: float) -> float:
    """
    Calculate the fraction of values that fall in the interval (a, b).
    
    For Uniform[0,1], we expect this to be (b - a).
    """
    x = np.asarray(x, dtype=np.float64)
    return float(np.mean((x > a) & (x < b)))

def lemma_614_bound(M: int) -> float:
    """
    Return the theoretical error bound from Lemma 6.14.
    
    This is the maximum possible deviation: 1/M
    """
    return 1.0 / M

# Demo: Test Lemma 6.14 on interval (0.2, 0.7)
print("="*70)
print("LEMMA 6.14 DEMONSTRATION: Interval Frequency Error Bound")
print("="*70)
print()

a_int, b_int = 0.2, 0.7
print(f"Testing interval: ({a_int}, {b_int})")
print(f"M = {M3}")
print()

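# Calculate empirical frequency over exactly one full period
# (the 1/M bound is a statement about a full cycle, not a partial one)
emp = interval_frequency(v3[:M3], a_int, b_int)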

# Theoretical probability for Uniform[0,1]
true = b_int - a_int

# Error bound from Lemma 6.14
bound = lemma_614_bound(M3)

# Actual error
error = abs(emp - true)

print(f"Theoretical probability:  {true:.6f}")
print(f"Empirical frequency:      {emp:.6f}")
print(f"Actual error:             {error:.6f}")
print(f"Lemma 6.14 bound:         {bound:.6f} (= 1/M)")
print()

if error <= bound:
    print(f"✓ Error ({error:.6f}) ≤ Bound ({bound:.6f})")
    print("  Lemma 6.14 is satisfied!")
else:
    print(f"✗ Error ({error:.6f}) > Bound ({bound:.6f})")
    print("  This shouldn't happen with a good LCG!")

---

## Exercise 6.18: Testing Uniformity with Kolmogorov-Smirnov Test

### 🎯 What We're Testing

**Goal:** Check whether the values our LCG produces are consistent with a uniform distribution on [0, 1]

**Method:** Kolmogorov-Smirnov (KS) test
- Compares empirical CDF vs theoretical CDF
- Measures maximum vertical distance between them

### 📊 The KS Statistic

$$D_n = \sup_x |F_n(x) - F(x)|$$

Where:
- **Fₙ(x)**: Empirical CDF from our sample
- **F(x)**: Theoretical CDF (for Uniform[0,1], F(x) = x)
- **sup**: supremum (least upper bound) over all x

### 🔍 Decision Rule

We reject the null hypothesis (data is uniform) if:

$$D_n > \sqrt{\frac{\log(2/\alpha)}{2n}}$$

This is a conservative **DKW (Dvoretzky–Kiefer–Wolfowitz) bound**

**Typical choice:** α = 0.05 (5% significance level)

**Interpretation:**
- **Reject (D_n > threshold):** Data doesn't look uniform  
- **Fail to reject (D_n ≤ threshold):** Data is consistent with uniform
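
To get a feel for the numbers in the decision rule, the sketch below evaluates the threshold for a few sample sizes and computes D_n for a perfectly even grid of points. The function name `dkw_threshold` and the grid example are assumptions made for this illustration.

```python
import math
import numpy as np

def dkw_threshold(n, alpha=0.05):
    """Rejection threshold sqrt(log(2/alpha) / (2n)) from the decision rule."""
    return math.sqrt(math.log(2 / alpha) / (2 * n))

# The threshold shrinks like 1/sqrt(n).
for n in (100, 1000, 5000):
    print(n, round(dkw_threshold(n), 4))

# KS statistic of a perfectly even grid against F(x) = x.
n_grid = 1000
grid = (np.arange(1, n_grid + 1) - 0.5) / n_grid     # midpoints of n equal cells
upper = np.arange(1, n_grid + 1) / n_grid - grid     # F_n just after each jump minus F
lower = grid - np.arange(0, n_grid) / n_grid         # F minus F_n just before each jump
D_grid = max(upper.max(), lower.max())
print(D_grid, D_grid < dkw_threshold(n_grid))
```

For the even grid the statistic is 0.5/n, far below the threshold, which illustrates that the KS test only looks at the marginal distribution and cannot flag a sequence for being "too regular".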

**Demo below:** Test our LCG and visualize the results

In [None]:
def ks_statistic(sample: np.ndarray, cdf: Callable[[np.ndarray], np.ndarray]) -> float:
    """
    Calculate Kolmogorov-Smirnov test statistic.
    
    Measures the maximum distance between empirical and theoretical CDFs.
    
    How it works:
    1. Sort the sample
    2. For each value x, compute:
       - Empirical CDF: Fₙ(x) = (# values ≤ x) / n
       - Theoretical CDF: F(x) from the cdf function
    3. Find the maximum absolute difference
    
    Parameters:
    -----------
    sample : np.ndarray
        Data to test
    cdf : Callable
        Theoretical cumulative distribution function
    
    Returns:
    --------
    float : KS statistic Dₙ ∈ [0, 1]
    """
    xs = np.sort(np.asarray(sample, dtype=np.float64))
    n = xs.size
    
    # Theoretical CDF values at sample points
    F = cdf(xs)
    
    # Empirical CDF: steps from 0 to 1
    Fn_right = np.arange(1, n+1) / n  # Right-continuous version
    Fn_left = np.arange(0, n) / n      # Left-continuous version
    
    # Maximum distance (check both sides of each jump)
    d1 = np.max(np.abs(Fn_right - F))
    d2 = np.max(np.abs(Fn_left - F))
    
    return float(max(d1, d2))

def uniform_cdf(x: np.ndarray) -> np.ndarray:
    """
    CDF of Uniform[0,1]: F(x) = x for x ∈ [0,1]
    """
    x = np.asarray(x, dtype=np.float64)
    return np.clip(x, 0.0, 1.0)

def dkw_critical_value(n: int, alpha: float = 0.05) -> float:
    """
    Calculate the DKW critical value for KS test.
    
    Conservative bound: sqrt(log(2/α) / (2n))
    
    Parameters:
    -----------
    n : int
        Sample size
    alpha : float
        Significance level (default 0.05 for 95% confidence)
    
    Returns:
    --------
    float : Critical value threshold
    """
    return math.sqrt(math.log(2/alpha) / (2*n))

def ks_test_dkw(sample: np.ndarray, alpha: float = 0.05) -> Dict[str, object]:
    """
    Perform KS test for Uniform[0,1] using DKW bound.
    
    Returns:
    --------
    dict with keys:
        - n: sample size
        - D_n: KS statistic
        - crit: critical value
        - reject: True if we reject uniformity hypothesis
    """
    n = len(sample)
    D = ks_statistic(sample, uniform_cdf)
    crit = dkw_critical_value(n, alpha=alpha)
    return {
        "n": n,
        "D_n": D,
        "crit": crit,
        "reject": (D > crit)
    }

def plot_uniform_diagnostics(u01: np.ndarray, bins: int = 50, title_prefix: str = ""):
    """
    Create diagnostic plots to visually assess uniformity.
    
    Plot 1: Histogram (should look flat)
    Plot 2: Successive pairs scatter (should fill square uniformly)
    """
    u01 = np.asarray(u01, dtype=np.float64)
    
    # Plot 1: Histogram
    plt.figure(figsize=(12, 5))
    
    plt.subplot(1, 2, 1)
    plt.hist(u01, bins=bins, edgecolor='black', alpha=0.7)
    plt.axhline(y=len(u01)/bins, color='r', linestyle='--', 
                label='Expected (n/bins)')
    plt.title(f"{title_prefix}Histogram of u in [0,1]")
    plt.xlabel("u")
    plt.ylabel("count")
    plt.legend()
    
    # Plot 2: Successive pairs (lag plot)
    plt.subplot(1, 2, 2)
    plt.scatter(u01[:-1], u01[1:], s=5, alpha=0.5)
    plt.title(f"{title_prefix}Successive pairs (uᵢ, uᵢ₊₁)")
    plt.xlabel("uᵢ")
    plt.ylabel("uᵢ₊₁")
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

# Demo: Test our LCG for uniformity
print("="*70)
print("KOLMOGOROV-SMIRNOV TEST FOR UNIFORMITY")
print("="*70)
print()

test_sample = v3[:5000]  # Use first 5000 values

result = ks_test_dkw(test_sample, alpha=0.05)

print(f"Sample size:       n = {result['n']}")
print(f"KS statistic:     Dₙ = {result['D_n']:.6f}")
print(f"Critical value:      = {result['crit']:.6f} (at α=0.05)")
print()

if result['reject']:
    print("✗ REJECT uniformity hypothesis")
    print("  Data does NOT appear to be uniform")
else:
    print("✓ FAIL TO REJECT uniformity hypothesis")
    print("  Data is consistent with Uniform[0,1]")

print()
print("-"*70)
print("Visual Diagnostics:")
print("-"*70)
plot_uniform_diagnostics(test_sample, title_prefix="LCG ")

---

## 6.2 Sampling from Probability Distributions

### 🎯 The Big Picture

**We now have:** Uniform[0,1] random numbers (from LCG or other generators)

**We want:** Samples from other distributions (Normal, Exponential, custom distributions, etc.)

**The Challenge:** How do we transform Uniform[0,1] into other distributions?

---

### Method 1: Inversion Sampling (Inverse Transform Method)

**Core Idea (Theorem 5.39):**  
If U ~ Uniform[0,1] and F is a CDF, then:

$$X = F^{-1}(U) \sim F$$

**In words:** Apply the inverse CDF to a uniform random variable!

**Step-by-step:**
1. Generate U ~ Uniform[0,1]
2. Compute X = F⁻¹(U)
3. X follows the distribution with CDF F

**When to use:**
- ✓ When inverse CDF F⁻¹ is easy to compute
- ✓ Gives exact samples (no approximation error)
- ✗ Many distributions don't have closed-form inverse CDF

**Example: Exponential Distribution**
- CDF: F(x) = 1 - e^(-λx)
- Inverse: F⁻¹(u) = -ln(1-u)/λ
- Simple to compute!
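
The same recipe applies to the density p(x) = 0.5 cos(x) on (-π/2, π/2) from Exercise 6.20. Integrating gives F(x) = (sin(x) + 1)/2, so F⁻¹(u) = arcsin(2u - 1). The sketch below is one possible way to code this; the function name `inv_cdf_half_cos`, the seed and the sample size are assumptions for this illustration.

```python
import numpy as np

def inv_cdf_half_cos(u):
    """Inverse CDF of p(x) = 0.5*cos(x) on (-pi/2, pi/2): F(x) = (sin(x) + 1) / 2."""
    u = np.asarray(u, dtype=np.float64)
    return np.arcsin(2.0 * u - 1.0)

rng_inv = np.random.default_rng(1)          # fixed seed, chosen arbitrarily
x_cos = inv_cdf_half_cos(rng_inv.random(20_000))

# For this density the mean is 0 and the variance is pi^2/4 - 2.
print(x_cos.mean(), x_cos.var(), np.pi**2 / 4 - 2)
```

The sample mean and variance should land close to 0 and π²/4 - 2, with the usual Monte Carlo fluctuation.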

**Demo below:** Sample from Exponential using inversion

In [None]:
def inversion_sampler(
    inv_cdf: Callable[[np.ndarray], np.ndarray], 
    n: int, 
    rng: Optional[np.random.Generator] = None
) -> np.ndarray:
    """
    Generate samples using the inversion (inverse transform) method.
    
    Algorithm:
    1. Generate n uniform random numbers U₁, ..., Uₙ ~ Uniform[0,1]
    2. Apply inverse CDF: Xᵢ = F⁻¹(Uᵢ)
    3. Return X₁, ..., Xₙ which follow distribution F
    
    Parameters:
    -----------
    inv_cdf : Callable
        The inverse CDF function F⁻¹(u)
    n : int
        Number of samples to generate
    rng : np.random.Generator, optional
        Random number generator (for reproducibility)
    
    Returns:
    --------
    np.ndarray : n samples from the target distribution
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(n)  # Step 1: Generate uniform samples
    return inv_cdf(u)   # Step 2: Transform via inverse CDF

def inv_cdf_exponential(u: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """
    Inverse CDF for Exponential(λ) distribution.
    
    Derivation:
    - CDF: F(x) = 1 - e^(-λx)
    - Set u = F(x): u = 1 - e^(-λx)
    - Solve for x: e^(-λx) = 1 - u
    - Take log: -λx = ln(1 - u)
    - Result: x = -ln(1 - u)/λ
    
    Note: We can also use -ln(u)/λ since (1-U) ~ Uniform[0,1]
          We use log1p(-u) for better numerical accuracy
    
    Parameters:
    -----------
    u : np.ndarray
        Uniform[0,1] random variables
    lam : float
        Rate parameter (λ > 0)
    
    Returns:
    --------
    np.ndarray : Exponential(λ) samples
    """
    u = np.asarray(u, dtype=np.float64)
    return -np.log1p(-u) / lam  # log1p(x) = log(1+x), numerically stable

# Demo: Sample from Exponential(λ=2) distribution
print("="*70)
print("INVERSION SAMPLING DEMO: Exponential Distribution")
print("="*70)
print()

rng = np.random.default_rng(0)  # Fixed seed for reproducibility
lambda_param = 2.0

# Generate 5000 samples
x_exp = inversion_sampler(lambda u: inv_cdf_exponential(u, lam=lambda_param), 
                          n=5000, 
                          rng=rng)

# Theoretical properties of Exponential(λ)
theo_mean = 1.0 / lambda_param
theo_var = 1.0 / (lambda_param ** 2)

# Empirical properties
emp_mean = float(np.mean(x_exp))
emp_var = float(np.var(x_exp))

print(f"Distribution: Exponential(λ={lambda_param})")
print()
print(f"Theoretical mean:     {theo_mean:.4f}")
print(f"Empirical mean:       {emp_mean:.4f}")
print(f"Difference:           {abs(emp_mean - theo_mean):.6f}")
print()
print(f"Theoretical variance: {theo_var:.4f}")
print(f"Empirical variance:   {emp_var:.4f}")
print(f"Difference:           {abs(emp_var - theo_var):.6f}")
print()
print("✓ Inversion method produces accurate samples!")

---

### Method 2: Accept-Reject Sampling (Algorithm 1, Lemma 6.15)

**When to use:** When the inverse CDF is hard or impossible to compute

**Core Idea:**  
Sample from an easier "proposal" distribution, then accept/reject strategically to get the target distribution.

---

**The Setup:**

We want to sample from **target density f(x)**, but we have:
- **Proposal density g(x)** that's easy to sample from
- **Constant M** such that: f(x) ≤ M·g(x) for all x

**Algorithm:**
1. Generate Y ~ g (from proposal)
2. Generate U ~ Uniform[0,1]
3. If U ≤ f(Y)/(M·g(Y)), **accept** Y (return X = Y)
4. Otherwise, **reject** Y and go back to step 1

---

**Key Properties:**

- **Acceptance rate**: 1/M on average (when f and g are both normalized densities)
  - Smaller M → higher acceptance rate → more efficient
  - M must satisfy f(x) ≤ M·g(x) everywhere
  
- **Output distribution**: Accepted samples follow f(x) exactly

---

**Intuition:**  
Think of it like a game:
- You propose a candidate Y
- You accept it with probability proportional to how likely it should be under f
- The "envelope" M·g(x) must cover f(x) everywhere

---

**Example: Sample from f(x) = 0.5 cos(x) on (-π/2, π/2)**  
Using Uniform[-π/2, π/2] as proposal with g(x) = 1/π, we need M = max f(x)/g(x) = π/2
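
The steps of the algorithm can be sketched for this example in a few vectorized lines. This version assumes the normalized target f(x) = 0.5 cos(x) and tests a fixed batch of proposals in one go rather than looping until a target count is reached; the names `f_target`, `g_const`, `M_env` and the seed are chosen for this illustration.

```python
import numpy as np

rng_ar = np.random.default_rng(2)            # fixed seed, chosen arbitrarily
half_pi = np.pi / 2

def f_target(x):
    """Target density f(x) = 0.5*cos(x) on (-pi/2, pi/2)."""
    return 0.5 * np.cos(x)

g_const = 1.0 / np.pi                        # Uniform[-pi/2, pi/2] proposal density
M_env = half_pi                              # envelope constant: max f/g = 0.5 * pi

n_prop = 100_000
y = rng_ar.uniform(-half_pi, half_pi, size=n_prop)   # step 1: proposals Y ~ g
u = rng_ar.random(n_prop)                            # step 2: U ~ Uniform[0, 1]
accepted = y[u <= f_target(y) / (M_env * g_const)]   # step 3: keep Y when U <= f/(M g)

print(accepted.size / n_prop, 1 / M_env)     # observed vs theoretical acceptance rate
print(accepted.var(), np.pi**2 / 4 - 2)      # sample variance vs the exact value
```

Batching keeps the code short but returns a random number of samples; a loop that keeps proposing until n samples are accepted is the natural choice when an exact sample count is required.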

**Demo below:** Implement accept-reject sampler

In [None]:
@dataclass
class AcceptRejectResult:
    """
    Container for accept-reject sampling results.
    
    Attributes:
    -----------
    samples : np.ndarray
        The accepted samples from target distribution
    n_proposals : int
        Total number of proposals generated (accepted + rejected)
    acceptance_rate : float
        Fraction of proposals that were accepted (= n / n_proposals)
    """
    samples: np.ndarray
    n_proposals: int
    acceptance_rate: float

def accept_reject_sampler(
    target_pdf: Callable[[np.ndarray], np.ndarray],
    proposal_sampler: Callable[[int], np.ndarray],
    proposal_pdf: Callable[[np.ndarray], np.ndarray],
    M: float,
    n: int,
    rng: Optional[np.random.Generator] = None,
    max_total_proposals: int = 50_000_000,
) -> AcceptRejectResult:
    """
    Generate samples using the accept-reject method.
    
    Algorithm (Lemma 6.15):
    -----------------------
    Repeat until we have n accepted samples:
        1. Generate Y ~ g (proposal distribution)
        2. Generate U ~ Uniform[0,1]
        3. Compute acceptance ratio: r = f(Y) / (M·g(Y))
        4. If U ≤ r, accept Y; otherwise reject and try again
    
    Parameters:
    -----------
    target_pdf : Callable
        Target density f(x) we want to sample from
    proposal_sampler : Callable
        Function that generates samples from g(x)
    proposal_pdf : Callable
        Proposal density g(x)
    M : float
        Envelope constant satisfying f(x) ≤ M·g(x) for all x
    n : int
        Number of samples to generate
    rng : np.random.Generator, optional
        Random number generator
    max_total_proposals : int
        Safety limit to prevent infinite loops
    
    Returns:
    --------
    AcceptRejectResult : Contains samples, proposal count, and acceptance rate
    
    Raises:
    -------
    ValueError : If M ≤ 0
    RuntimeError : If max_total_proposals is exceeded (check your M!)
    """
    if M <= 0:
        raise ValueError("M must be > 0.")
    
    rng = np.random.default_rng() if rng is None else rng
    accepted = []
    total = 0
    
    while len(accepted) < n:
        # Safety check
        if total >= max_total_proposals:
            raise RuntimeError(
                f"Exceeded max_total_proposals={max_total_proposals}. "
                f"Only got {len(accepted)}/{n} samples. "
                f"Check that M is correct and proposal matches target support."
            )
        
        # Step 1: Generate proposal
        x = proposal_sampler(1)[0]
        total += 1
        
        # Compute densities
        fx = float(target_pdf(np.array([x]))[0])
        gx = float(proposal_pdf(np.array([x]))[0])
        
        # Skip if proposal density is zero (shouldn't happen with good proposal)
        if gx <= 0:
            continue
        
        # Steps 2-4: Accept/reject decision
        r = fx / (M * gx)  # Acceptance ratio (r > 1 would mean M is too small)
        u = rng.random()    # Uniform on [0, 1)
        
        if u < r:  # Accept; strict "<" so a candidate with f(x) = 0 is never accepted
            accepted.append(x)
    
    samples = np.array(accepted, dtype=np.float64)
    # Guard against n = 0, where no proposals are drawn
    acceptance_rate = len(accepted) / total if total > 0 else float("nan")
    
    return AcceptRejectResult(
        samples=samples, 
        n_proposals=total, 
        acceptance_rate=acceptance_rate
    )

---

## Special Case: Generating Normal (Gaussian) Samples

### Why Normal Distribution is Special

The Normal distribution is **everywhere** in statistics, but:
- ✗ Inverse CDF has no closed form (requires numerical approximation)
- ✓ We have a clever trick: **Box-Muller transform** (Theorem 6.16)

---

### Box-Muller Transform (Theorem 6.16)

**Magic Formula:**  
If U₁ and U₂ are independent Uniform(0,1) random variables, then:

$$Z_0 = \sqrt{-2\ln U_1} \cos(2\pi U_2)$$
$$Z_1 = \sqrt{-2\ln U_1} \sin(2\pi U_2)$$

Both Z₀ and Z₁ are **independent** N(0,1) samples!

**Benefits:**
- Generate 2 independent standard normals from 2 uniform samples
- Exact (not approximate)
- Fast to compute
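
A brief sketch of why the transform yields independent standard normals. This is the standard polar-coordinates argument; the symbols R and Θ are introduced here for illustration and are not taken from the lecture notes.

$$R = \sqrt{-2\ln U_1}, \qquad \Theta = 2\pi U_2$$
$$P(R^2 > t) = P\left(U_1 < e^{-t/2}\right) = e^{-t/2}, \quad t \ge 0$$

So R² is exponential with rate 1/2, R has density r·exp(-r²/2) on r > 0, and Θ is uniform on [0, 2π) and independent of R. Changing variables from polar to Cartesian coordinates (Jacobian factor 1/r) gives

$$f_{Z_0, Z_1}(z_0, z_1) = \frac{1}{2\pi} e^{-(z_0^2 + z_1^2)/2} = \frac{1}{\sqrt{2\pi}} e^{-z_0^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-z_1^2/2}$$

which is the product of two N(0,1) densities, so Z₀ and Z₁ are independent standard normals.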

---

### Transforming to N(μ, σ²)

Once you have Z ~ N(0,1), get X ~ N(μ, σ²) using:

$$X = \mu + \sigma \cdot Z$$

where σ is the standard deviation, i.e. the square root of the variance σ²

---

**Demo below:** Generate normal samples using Box-Muller and transform them

In [None]:
def box_muller(n: int, rng: Optional[np.random.Generator] = None) -> np.ndarray:
    """
    Generate standard normal N(0,1) samples using Box-Muller transform.
    
    Algorithm (Theorem 6.16):
    -------------------------
    1. Generate pairs (U₁, U₂) ~ Uniform(0,1)
    2. Transform using:
       Z₀ = √(-2ln U₁) cos(2π U₂)
       Z₁ = √(-2ln U₁) sin(2π U₂)
    3. Both Z₀, Z₁ are independent N(0,1)
    
    This method generates 2 normals per pair of uniforms (efficient!).
    
    Parameters:
    -----------
    n : int
        Number of N(0,1) samples needed
    rng : np.random.Generator, optional
        Random number generator
    
    Returns:
    --------
    np.ndarray : n independent N(0,1) samples
    """
    rng = np.random.default_rng() if rng is None else rng
    
    # Generate enough pairs (we get 2 normals per pair)
    m = (n + 1) // 2  # Round up
    
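    # Generate uniform pairs. rng.random() returns values in [0, 1),
    # so use 1 - U for u1: it lies in (0, 1] and keeps log(u1) finite.
    u1 = 1.0 - rng.random(m)
    u2 = rng.random(m)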
    
    # Box-Muller transformation
    r = np.sqrt(-2.0 * np.log(u1))  # Radius component
    theta = 2.0 * np.pi * u2         # Angle component
    
    # Convert to Cartesian coordinates
    z0 = r * np.cos(theta)
    z1 = r * np.sin(theta)
    
    # Concatenate and return exactly n samples
    return np.concatenate([z0, z1])[:n]

def normal_from_standard(z: np.ndarray, mu: float = 0.0, var: float = 1.0) -> np.ndarray:
    """
    Transform standard normal Z ~ N(0,1) to X ~ N(μ, σ²).
    
    Formula: X = μ + σ·Z, where σ = √variance
    
    Parameters:
    -----------
    z : np.ndarray
        Standard normal samples N(0,1)
    mu : float
        Desired mean
    var : float
        Desired variance (not standard deviation!)
    
    Returns:
    --------
    np.ndarray : Samples from N(μ, σ²)
    """
    return mu + np.sqrt(var) * np.asarray(z, dtype=np.float64)

# Demo: Generate N(10, 5) samples
print("="*70)
print("BOX-MULLER DEMO: Normal Distribution Generation")
print("="*70)
print()

rng = np.random.default_rng(0)

# Step 1: Generate standard normals
z = box_muller(10000, rng=rng)

# Step 2: Transform to N(μ=10, σ²=5)
mu_target = 10.0
var_target = 5.0
x_norm = normal_from_standard(z, mu=mu_target, var=var_target)

# Check empirical properties
emp_mean = float(np.mean(x_norm))
emp_var = float(np.var(x_norm))

print(f"Target distribution: N(μ={mu_target}, σ²={var_target})")
print()
print(f"Theoretical mean:     {mu_target:.4f}")
print(f"Empirical mean:       {emp_mean:.4f}")
print(f"Difference:           {abs(emp_mean - mu_target):.6f}")
print()
print(f"Theoretical variance: {var_target:.4f}")
print(f"Empirical variance:   {emp_var:.4f}")
print(f"Difference:           {abs(emp_var - var_target):.6f}")
print()
print("✓ Small differences above indicate the Box-Muller samples match the target.")

# Visualize
plt.figure(figsize=(10, 5))
plt.hist(x_norm, bins=60, density=True, alpha=0.7, edgecolor='black')

# Overlay theoretical density
x_range = np.linspace(x_norm.min(), x_norm.max(), 200)
theoretical_pdf = (1/np.sqrt(2*np.pi*var_target)) * np.exp(-0.5*(x_range - mu_target)**2/var_target)
plt.plot(x_range, theoretical_pdf, 'r-', linewidth=2, label='Theoretical N(10,5)')

plt.title(f"Box-Muller samples: N({mu_target},{var_target}) approximation")
plt.xlabel("x")
plt.ylabel("density")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

---

### Optional: Inverse Normal CDF Approximation (Φ⁻¹)

**Alternative to Box-Muller:** Use a highly accurate rational-function approximation of Φ⁻¹(u)

**When to use:**
- When you specifically need the inverse CDF function
- For quantile calculations
- When you already have uniform samples and need normals

**How it works:**
- Uses different rational function approximations for different regions
- Accurate to ~10⁻⁹ for most of (0,1)

**Note:** Box-Muller is generally preferred for generating many samples (more efficient)
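
For comparison, the same inversion idea can be sketched with Python's standard library, which provides an inverse normal CDF as `statistics.NormalDist.inv_cdf`. This is an illustrative alternative, not the approximation used in this notebook; the seed, sample size, and clipping threshold are arbitrary choices made here.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(42)
std_normal = NormalDist(mu=0.0, sigma=1.0)

# inv_cdf requires 0 < p < 1, so keep the uniforms strictly inside (0, 1)
u = np.clip(rng.random(2000), 1e-12, 1.0 - 1e-12)

# Inversion: Z = Phi^{-1}(U) is standard normal when U is uniform
z = np.array([std_normal.inv_cdf(p) for p in u])

print("mean:", float(z.mean()), "variance:", float(z.var()))
print("97.5% quantile:", std_normal.inv_cdf(0.975))
```

The list comprehension is slow for large samples; it is used here only to keep the sketch dependency-free beyond NumPy.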

In [None]:
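def norm_ppf_approx(u: np.ndarray) -> np.ndarray:
    """
    Approximate the standard normal inverse CDF Φ⁻¹(u) for u in (0, 1).

    Uses one rational approximation in the central region
    [plow, 1 - plow] and another in each tail.
    Raises ValueError if any u lies outside (0, 1).
    """
    u = np.asarray(u, dtype=np.float64)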
    if np.any((u <= 0) | (u >= 1)):
        raise ValueError("u must be in (0,1)")

    a = np.array([-3.969683028665376e+01, 2.209460984245205e+02, -2.759285104469687e+02,
                  1.383577518672690e+02, -3.066479806614716e+01, 2.506628277459239e+00])
    b = np.array([-5.447609879822406e+01, 1.615858368580409e+02, -1.556989798598866e+02,
                  6.680131188771972e+01, -1.328068155288572e+01])
    c = np.array([-7.784894002430293e-03, -3.223964580411365e-01, -2.400758277161838e+00,
                  -2.549732539343734e+00, 4.374664141464968e+00, 2.938163982698783e+00])
    d = np.array([7.784695709041462e-03, 3.224671290700398e-01, 2.445134137142996e+00,
                  3.754408661907416e+00])

    plow = 0.02425
    phigh = 1 - plow
    x = np.empty_like(u)

    # Lower tail
    mask = u < plow
    q = np.sqrt(-2*np.log(u[mask]))
    x[mask] = ((((((c[0]*q + c[1])*q + c[2])*q + c[3])*q + c[4])*q + c[5])
               / ((((d[0]*q + d[1])*q + d[2])*q + d[3])*q + 1))

    # Central region
    mask = (u >= plow) & (u <= phigh)
    q = u[mask] - 0.5
    r = q*q
    x[mask] = ((((((a[0]*r + a[1])*r + a[2])*r + a[3])*r + a[4])*r + a[5]) * q
               / (((((b[0]*r + b[1])*r + b[2])*r + b[3])*r + b[4])*r + 1))

    # Upper tail (mirror image of the lower tail)
    mask = u > phigh
    q = np.sqrt(-2*np.log(1-u[mask]))
    x[mask] = (-(((((c[0]*q + c[1])*q + c[2])*q + c[3])*q + c[4])*q + c[5])
               / ((((d[0]*q + d[1])*q + d[2])*q + d[3])*q + 1))
    return x

rng = np.random.default_rng(1)
z2 = norm_ppf_approx(rng.random(10000))
print("approx Z mean/var:", float(np.mean(z2)), float(np.var(z2)))


---

## Exercise 6.20: Sampling from p(x) = 0.5 cos(x) on (-π/2, π/2)

### 🎯 Problem Setup

**Target density:**  
$$p(x) = 0.5 \cos(x) \text{ for } x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$$

This is a valid PDF because:
- p(x) ≥ 0 for all x in the domain (cos is positive on this interval)
- ∫ p(x) dx = 0.5·(sin(π/2) - sin(-π/2)) = 0.5·2 = 1 (integrates to 1 over the domain)

---

### Method 1: Inversion Sampling ✓

**Step 1: Find the CDF**

$$F(x) = \int_{-\pi/2}^{x} 0.5 \cos(t) dt = 0.5(\sin(x) + 1)$$

**Step 2: Find the inverse CDF**

Set u = F(x):
$$u = 0.5(\sin(x) + 1)$$
$$2u = \sin(x) + 1$$
$$\sin(x) = 2u - 1$$
$$x = \arcsin(2u - 1)$$

**Result:** F⁻¹(u) = arcsin(2u - 1)
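
A quick numerical round trip can confirm the derivation: applying F to F⁻¹(u) should return u. The sketch below is a minimal standalone check; the local names `cdf` and `inv_cdf` are chosen here for illustration and are separate from the toolkit functions defined in the demo cell.

```python
import numpy as np

def cdf(x):
    # F(x) = 0.5 * (sin(x) + 1) on (-pi/2, pi/2)
    return 0.5 * (np.sin(x) + 1.0)

def inv_cdf(u):
    # F^{-1}(u) = arcsin(2u - 1)
    return np.arcsin(2.0 * u - 1.0)

u = np.linspace(0.01, 0.99, 99)
x = inv_cdf(u)

max_roundtrip_error = float(np.max(np.abs(cdf(x) - u)))
print("max |F(F^-1(u)) - u| =", max_roundtrip_error)
print("x range:", float(x.min()), float(x.max()))
```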

---

### Method 2: Accept-Reject Sampling ✓

**Proposal:** Uniform[-π/2, π/2]
- Density: g(x) = 1/π for x ∈ [-π/2, π/2]

**Find M:** Need p(x) ≤ M·g(x) for all x

$$\frac{p(x)}{g(x)} = \frac{0.5 \cos(x)}{1/\pi} = \frac{\pi \cos(x)}{2}$$

Maximum occurs at x = 0 (where cos is largest):
$$M = \frac{\pi \cdot 1}{2} = \frac{\pi}{2}$$

**Expected acceptance rate:** 1/M = 2/π ≈ 0.637 (about 64%)
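
The predicted acceptance rate can be checked with a vectorized version of the accept-reject step. This is a minimal sketch under the setup above (uniform proposal, M = π/2); the seed and the number of proposals are arbitrary choices, and it processes all proposals in one batch rather than looping until a fixed number of samples is accepted.

```python
import numpy as np

rng = np.random.default_rng(123)
n_proposals = 200_000
M = np.pi / 2

y = rng.uniform(-np.pi / 2, np.pi / 2, size=n_proposals)  # proposals Y ~ g
u = rng.random(n_proposals)                                # U ~ Uniform[0, 1)
p_y = 0.5 * np.cos(y)                                      # target density at Y
g_y = 1.0 / np.pi                                          # proposal density (constant)

# Accept where U < p(Y) / (M * g(Y)); here the ratio simplifies to cos(Y)
accepted = y[u < p_y / (M * g_y)]
rate = accepted.size / n_proposals

print(f"empirical acceptance rate: {rate:.4f}  (theory 2/pi = {2 / np.pi:.4f})")
print(f"mean of accepted samples:  {accepted.mean():.4f}  (0 by symmetry)")
```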

---

**Demo below:** Implement both methods and compare results

In [None]:
def cos_target_pdf(x: np.ndarray) -> np.ndarray:
    """
    Target density: p(x) = 0.5·cos(x) on (-π/2, π/2)
    
    Returns 0 outside the support.
    """
    x = np.asarray(x, dtype=np.float64)
    out = np.zeros_like(x)
    mask = (x > -np.pi/2) & (x < np.pi/2)
    out[mask] = 0.5 * np.cos(x[mask])
    return out

def cos_target_cdf(x: np.ndarray) -> np.ndarray:
    """
    CDF: F(x) = 0.5·(sin(x) + 1)
    
    Derivation:
    F(x) = ∫_{-π/2}^x 0.5·cos(t) dt
         = 0.5·[sin(t)]_{-π/2}^x
         = 0.5·(sin(x) - sin(-π/2))
         = 0.5·(sin(x) - (-1))
         = 0.5·(sin(x) + 1)
    """
    x = np.asarray(x, dtype=np.float64)
    out = np.zeros_like(x)
    out[x <= -np.pi/2] = 0.0
    out[x >= np.pi/2] = 1.0
    mask = (x > -np.pi/2) & (x < np.pi/2)
    out[mask] = 0.5 * (np.sin(x[mask]) + 1.0)
    return out

def cos_target_inv_cdf(u: np.ndarray) -> np.ndarray:
    """
    Inverse CDF: F⁻¹(u) = arcsin(2u - 1)
    
    Derivation:
    u = 0.5·(sin(x) + 1)
    2u = sin(x) + 1
    sin(x) = 2u - 1
    x = arcsin(2u - 1)
    
    Valid for u ∈ (0, 1), gives x ∈ (-π/2, π/2)
    """
    u = np.asarray(u, dtype=np.float64)
    if np.any((u <= 0) | (u >= 1)):
        raise ValueError("u must be in (0,1)")
    return np.arcsin(2*u - 1)

# Visualize the CDF
print("="*70)
print("EXERCISE 6.20: Sampling from p(x) = 0.5·cos(x)")
print("="*70)
print()

grid = np.linspace(-np.pi/2, np.pi/2, 400)

plt.figure(figsize=(12, 5))

# Plot PDF
plt.subplot(1, 2, 1)
plt.plot(grid, cos_target_pdf(grid), linewidth=2)
plt.title("Target PDF: p(x) = 0.5·cos(x)")
plt.xlabel("x")
plt.ylabel("p(x)")
plt.grid(True, alpha=0.3)
plt.axhline(y=0, color='k', linewidth=0.5)

# Plot CDF
plt.subplot(1, 2, 2)
plt.plot(grid, cos_target_cdf(grid), linewidth=2, color='orange')
plt.title("CDF: F(x) = 0.5·(sin(x) + 1)")
plt.xlabel("x")
plt.ylabel("F(x)")
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print()
print("="*70)
print("METHOD 1: Inversion Sampling")
print("="*70)

rng = np.random.default_rng(0)
x_inv = inversion_sampler(cos_target_inv_cdf, n=10000, rng=rng)

print(f"Generated {len(x_inv)} samples using F⁻¹(u) = arcsin(2u - 1)")
print(f"Sample range: [{x_inv.min():.4f}, {x_inv.max():.4f}]")
print(f"Sample mean:  {np.mean(x_inv):.4f} (should be ≈ 0 by symmetry)")

# Visualize inversion samples
plt.figure(figsize=(10, 5))
plt.hist(x_inv, bins=80, density=True, alpha=0.7, edgecolor='black', label='Inversion samples')
plt.plot(grid, cos_target_pdf(grid), 'r-', linewidth=2, label='True density')
plt.title("Inversion Sampling: Histogram vs True Density")
plt.xlabel("x")
plt.ylabel("density")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

print()
print("="*70)
print("METHOD 2: Accept-Reject Sampling")
print("="*70)

def proposal_sampler_uniform_interval(n: int, rng: Optional[np.random.Generator] = None) -> np.ndarray:
    """
    Proposal: Uniform[-π/2, π/2]
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.uniform(-np.pi/2, np.pi/2, size=n)

def proposal_pdf_uniform_interval(x: np.ndarray) -> np.ndarray:
    """
    Proposal density: g(x) = 1/π for x ∈ [-π/2, π/2]
    """
    x = np.asarray(x, dtype=np.float64)
    out = np.zeros_like(x)
    mask = (x >= -np.pi/2) & (x <= np.pi/2)
    out[mask] = 1.0 / np.pi
    return out

# M calculation: max(p(x)/g(x)) = max(0.5·cos(x)·π) = π/2 (at x=0)
M_needed = np.pi / 2

print("Proposal: Uniform[-π/2, π/2]")
print(f"Envelope constant: M = π/2 ≈ {M_needed:.4f}")
print(f"Expected acceptance rate: 2/π ≈ {2/np.pi:.4f}")
print()

rng = np.random.default_rng(1)
ar_res = accept_reject_sampler(
    target_pdf=cos_target_pdf,
    proposal_sampler=lambda n: proposal_sampler_uniform_interval(n, rng=rng),
    proposal_pdf=proposal_pdf_uniform_interval,
    M=M_needed,
    n=5000,
    rng=rng,
)

print(f"Generated {len(ar_res.samples)} samples")
print(f"Total proposals: {ar_res.n_proposals}")
print(f"Acceptance rate: {ar_res.acceptance_rate:.4f}")
print(f"Expected rate:   {2/np.pi:.4f}")
print(f"Difference:      {abs(ar_res.acceptance_rate - 2/np.pi):.6f}")
print()
print("✓ Both methods successfully sample from p(x) = 0.5·cos(x)!")

---

## Exercise 6.19 Note: Testing Normality with KS Test

### ⚠️ Important Caveat

**Two scenarios for KS testing with Normal distribution:**

#### **Scenario 1: Known parameters (μ, σ² specified beforehand)**
- ✓ Standard KS test is valid
- Use the theoretical N(μ, σ²) CDF directly
- Critical values from KS tables apply

#### **Scenario 2: Estimated parameters (μ̂, σ̂² from the same data)**
- ✗ Standard KS test is **NOT valid**!
- Estimating parameters from the data makes the test "easier to pass"
- The KS statistic tends to be smaller than it would be against fixed parameters, so the standard critical values are too large and the test rejects too rarely
- Need adjusted critical values (not covered in basic KS tables)

**The Problem:**  
When you estimate μ and σ from the data, you're "fitting" the distribution to that specific sample. This makes the empirical CDF artificially closer to the theoretical CDF.

**Proper alternatives:**
- Shapiro-Wilk test (designed for normality testing with estimated parameters)
- Lilliefors test (KS modification that accounts for parameter estimation)
- Anderson-Darling test
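
The effect of estimating parameters can be seen in a small simulation: for the same samples, the KS distance against the fitted normal is typically smaller than against the true normal. The sketch below is illustrative only. `ks_stat_normal` is a hypothetical standalone helper (separate from the toolkit's `ks_statistic`), samples come from NumPy's built-in normal generator rather than Box-Muller, and the sample size, replication count, and seed are arbitrary.

```python
import math
import numpy as np

def ks_stat_normal(sample, mu, sigma):
    """KS distance between the empirical CDF and the N(mu, sigma^2) CDF."""
    x = np.sort(np.asarray(sample, dtype=np.float64))
    n = x.size
    cdf = np.array([0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))
                    for v in x])
    upper = np.arange(1, n + 1) / n - cdf   # ECDF just after each point
    lower = cdf - np.arange(0, n) / n       # ECDF just before each point
    return float(max(upper.max(), lower.max()))

rng = np.random.default_rng(7)
n, reps = 200, 300
true_mu, true_sigma = 10.0, math.sqrt(5.0)

d_known, d_fitted = [], []
for _ in range(reps):
    x = rng.normal(loc=true_mu, scale=true_sigma, size=n)
    d_known.append(ks_stat_normal(x, true_mu, true_sigma))        # Scenario 1
    d_fitted.append(ks_stat_normal(x, x.mean(), x.std(ddof=1)))   # Scenario 2

mean_known = float(np.mean(d_known))
mean_fitted = float(np.mean(d_fitted))
print(f"mean D_n, known parameters:  {mean_known:.4f}")
print(f"mean D_n, fitted parameters: {mean_fitted:.4f}")
```

Comparing the fitted-parameter statistic against standard KS critical values would therefore understate the evidence against normality.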

**Demo below:** Calculate KS statistic (but remember the caveat!)

In [None]:
def normal_cdf(x: np.ndarray, mu: float = 0.0, var: float = 1.0) -> np.ndarray:
    """
    CDF of Normal(μ, σ²) distribution.
    
    Uses the error function (erf) to compute:
    Φ(x) = 0.5·[1 + erf((x-μ)/(σ√2))]
    
    where σ = √variance
    """
    x = np.asarray(x, dtype=np.float64)
    z = (x - mu) / math.sqrt(var)  # Standardize
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

# Demo: Calculate KS statistic for our Box-Muller samples
# (Remember: we're using the TRUE parameters we specified, so this is valid)
print("="*70)
print("KS Test for Normality (with KNOWN parameters)")
print("="*70)
print()

test_sample = x_norm[:5000]

# These are the parameters we USED to generate the data (known beforehand)
known_mu = 10.0
known_var = 5.0

D_norm = ks_statistic(test_sample, lambda t: normal_cdf(t, mu=known_mu, var=known_var))

print(f"Testing against: N(μ={known_mu}, σ²={known_var})")
print(f"Sample size: n = {len(test_sample)}")
print(f"KS statistic: Dₙ = {D_norm:.6f}")
print()
print("Since we're testing against the TRUE parameters used for generation,")
print("this test is valid and should show good fit.")
print()
print(f"✓ KS statistic = {D_norm:.6f} (small value indicates good fit)")

---

## Utility: Generate Uniform[0,1] from LCG

**Quick wrapper** to scale LCG output to the interval [0, 1).

This combines:
1. LCG generation (integers 0 to M-1)
2. Scaling to [0, 1) by dividing by M (the largest value is (M-1)/M, so 1 is never reached)

**Use case:** When you need Uniform[0,1] samples for inversion or other methods
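
The scaling step can be illustrated on a tiny generator where every value is visible. The sketch below is self-contained and independent of the toolkit's `lcg_sequence`: `lcg_sketch` is a hypothetical helper, and its convention of returning the states after the seed (excluding the seed itself) is an assumption that may differ from the toolkit. The parameters a=5, b=1, M=16 satisfy the Hull-Dobell conditions, so one cycle visits every value in {0,...,15}.

```python
import numpy as np

def lcg_sketch(a, b, M, seed, n):
    """Illustrative LCG: returns the n states after the seed (seed excluded)."""
    out = np.empty(n, dtype=np.int64)
    u = seed % M
    for i in range(n):
        u = (a * u + b) % M   # u_i = (a * u_{i-1} + b) mod M
        out[i] = u
    return out

M = 16
states = lcg_sketch(a=5, b=1, M=M, seed=0, n=M)   # one full cycle
u01 = states / M                                   # scale to [0, 1)

print("states:", states.tolist())
print("distinct values:", len(set(states.tolist())))
print("min, max of u/M:", float(u01.min()), float(u01.max()))
print("mean of u/M:", float(u01.mean()), "vs (M-1)/(2M) =", (M - 1) / (2 * M))
```

Over a full period the scaled values have mean (M-1)/(2M), slightly below 0.5, which approaches 0.5 as M grows.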

In [None]:
def lcg_uniform01(a: int, b: int, M: int, seed: int, n: int) -> np.ndarray:
    """
    Generate Uniform[0,1] samples using LCG.
    
    Combines LCG generation with scaling: u/M
    
    Parameters:
    -----------
    a, b, M : int
        LCG parameters
    seed : int
        Initial seed
    n : int
        Number of samples
    
    Returns:
    --------
    np.ndarray : Values in [0, 1)
    """
    return lcg_sequence(a, b, M, seed, n) / M

# Demo: Generate and test Uniform[0,1] from LCG
print("="*70)
print("Generate Uniform[0,1] using LCG")
print("="*70)
print()

u01 = lcg_uniform01(a=5, b=1, M=2**20, seed=123, n=10000)

print(f"Generated {len(u01)} Uniform[0,1] samples")
print(f"Range: [{float(u01.min()):.6f}, {float(u01.max()):.6f}]")
print(f"Mean:  {float(np.mean(u01)):.6f} (should be ‚âà 0.5)")
print()

# KS test
ks_result = ks_test_dkw(u01, alpha=0.05)
print(f"KS statistic: {ks_result['D_n']:.6f}")
print(f"Critical val: {ks_result['crit']:.6f}")

if not ks_result['reject']:
    print("✓ Passes uniformity test")
else:
    print("✗ Fails uniformity test")

print()
print("Visual diagnostics:")
plot_uniform_diagnostics(u01, title_prefix="LCG u/M ")

---

## 📑 Function Index & Quick Reference

### 🔢 Congruential Generator & Pseudorandomness

#### Core Functions:
- **`lcg(a, b, M, seed)`**: Infinite LCG stream generator
- **`lcg_sequence(a, b, M, seed, n)`**: Generate n LCG values as array
- **`estimate_period_from_seed(...)`**: Find period by detecting cycle
- **`lcg_uniform01(a, b, M, seed, n)`**: LCG scaled to [0, 1)

#### Quality Assessment:
- **`frequency_table(seq, M)`**: Calculate value frequencies
- **`uniformity_score(freqs)`**: Measure deviation from uniform
- **`show_frequency_bar(freqs, title)`**: Visualize frequency distribution

#### Range Mapping:
- **`map_mod_k(u, K)`**: Map to {0,...,K-1} via modulo (Lemma 6.8)
- **`map_scaled_floor(u, M, K)`**: Map via scaling (Lemma 6.10)

#### Theoretical Properties:
- **`hull_dobell_conditions(a, b, M)`**: Check full period conditions (Thm 6.11)
- **`prime_factors(n)`**: Find prime factors (helper for Hull-Dobell)
- **`theoretical_discrete_uniform_moments(M)`**: Mean/var for {0,...,M-1}
- **`theoretical_scaled_moments(M)`**: Mean/var for [0,1] scaling
- **`empirical_moments(x)`**: Compute sample mean/variance
- **`interval_frequency(x, a, b)`**: Fraction in interval (a,b)
- **`lemma_614_bound(M)`**: Error bound 1/M (Lemma 6.14)

---

### 📊 Distribution Testing (Exercise 6.18)

- **`ks_statistic(sample, cdf)`**: Kolmogorov-Smirnov test statistic
- **`uniform_cdf(x)`**: CDF of Uniform[0,1]
- **`dkw_critical_value(n, alpha)`**: Critical value for KS test
- **`ks_test_dkw(sample, alpha)`**: Complete KS test for uniformity
- **`plot_uniform_diagnostics(u01, bins, title)`**: Visual checks for uniformity
- **`normal_cdf(x, mu, var)`**: CDF of Normal distribution

---

### 🎲 Sampling Methods (Section 6.2)

#### Inversion Sampling:
- **`inversion_sampler(inv_cdf, n, rng)`**: Generate samples via F⁻¹(U)
- **`inv_cdf_exponential(u, lam)`**: Inverse CDF for Exponential(λ)

#### Accept-Reject Sampling:
- **`accept_reject_sampler(...)`**: General accept-reject algorithm (Lemma 6.15)
- **`AcceptRejectResult`**: Data class for results (samples, acceptance rate, etc.)

#### Normal Distribution:
- **`box_muller(n, rng)`**: Generate N(0,1) samples (Theorem 6.16)
- **`normal_from_standard(z, mu, var)`**: Transform N(0,1) to N(μ,σ²)
- **`norm_ppf_approx(u)`**: Inverse normal CDF approximation (optional)

---

### 📝 Exercise 6.20 (Cosine Density)

- **`cos_target_pdf(x)`**: PDF: p(x) = 0.5·cos(x) on (-π/2, π/2)
- **`cos_target_cdf(x)`**: CDF: F(x) = 0.5·(sin(x) + 1)
- **`cos_target_inv_cdf(u)`**: Inverse CDF: F⁻¹(u) = arcsin(2u-1)
- **`proposal_sampler_uniform_interval(n, rng)`**: Uniform[-π/2, π/2] sampler
- **`proposal_pdf_uniform_interval(x)`**: Uniform[-π/2, π/2] density

---

## 🎓 Summary: What You've Learned

1. **How computers generate "random" numbers** using LCG
2. **When pseudorandom sequences are good** (Hull-Dobell conditions)
3. **How to test randomness quality** (uniformity, KS tests)
4. **Two fundamental sampling techniques**:
   - Inversion: Fast when F⁻¹ is available
   - Accept-Reject: Works for complex distributions
5. **Special tricks for Normal distribution** (Box-Muller)

---

**🎉 You're now equipped to:**
- Generate and test pseudorandom numbers
- Sample from arbitrary probability distributions
- Understand tradeoffs between different methods
- Apply these techniques in statistical simulations!

**Next steps:** Try these functions with your own distributions and parameters!