# Lecture Notes Toolkit - High-Dimensional Geometry (Chapter 10)

## 🌌 The Strange World of High Dimensions

Welcome to one of the most **counter-intuitive** topics in mathematics and data science: **high-dimensional geometry**!

### 🤔 Why Should You Care?

**Modern data is high-dimensional:**
- **Images:** A 100×100 grayscale image = 10,000 dimensions
- **Text:** Document vectors can have 10,000+ dimensions (one per word)
- **Genetics:** DNA sequencing produces thousands of features
- **Machine Learning:** Neural networks operate in spaces with millions of parameters

**The challenge:** Our intuition from 2D and 3D **completely breaks down** in high dimensions!

---

## 🎯 What This Notebook Covers

This toolkit provides complete implementations of Chapter 10 concepts (Lecture Notes pp. 152–163), with a focus on **understanding why high dimensions are weird**.

### 📚 Content Overview

**Part 1: Foundations (Sections 1-2)**
- ✅ **Geometric Definitions** (Def 10.1)
  - Balls and spheres in d dimensions
  - Distance and volume measures

- ✅ **Gaussian Models** (Models 10.3–10.4)
  - Spherical Gaussian: standard normal in d dimensions
  - Normalized Gaussian: scaled version for cleaner formulas

---

**Part 2: The Curse of Dimensionality (Sections 3-5)**
- ✅ **Vanishing Volume** (Lemma 10.5)
  - Unit ball volume shrinks exponentially with dimension
  - Most of the space is "outside" the unit ball!

- ✅ **Boundary Concentration** (Lemma 10.7)
  - In high dimensions, most volume is near the boundary
  - Interior becomes negligible

- ✅ **Exact Formulas** (Theorem 10.8)
  - Precise volume/surface area using Gamma functions
  - Numerical computation for large dimensions

---

**Part 3: Sampling in High Dimensions (Sections 6-7)**
- ✅ **Uniform Sampling on Spheres** (Lemma 10.16)
  - Why normalizing Gaussians works
  - Proof and implementation

- ✅ **Uniform Sampling in Balls** (Theorem 10.18)
  - Radial rescaling method
  - Why naive rejection fails

- ✅ **Projection Problems** (Section 10.4.1)
  - Why projecting from squares doesn't work
  - Visualization of the bias

---

**Part 4: Concentration Phenomena (Section 8)**
- ✅ **Annulus Theorem** (Theorem 10.20)
  - Norms concentrate in thin shells
  - Sub-Gaussian tail bounds
  - Empirical verification

---

## 🎓 Key Insights You'll Gain

1. **"Empty space" paradox:** In d=100, the unit ball contains virtually no volume compared to the cube [-1,1]^100

2. **"Surface" paradox:** Almost all volume of a ball is near its surface when d is large

3. **"Sampling" paradox:** Rejection sampling from cubes becomes impractical (acceptance rate ≈ 0)

4. **"Concentration" paradox:** Random vectors have almost deterministic lengths (thin shell phenomenon)

---

## 🚀 How to Use This Notebook

1. **Read explanations carefully** - Each concept has intuition + math + code
2. **Run all cells** - See the phenomena for yourself
3. **Experiment with parameters** - Try different dimensions (d), sample sizes (n)
4. **Reuse functions** - All functions are modular and documented
5. **Build intuition** - Visualizations show what formulas mean

**⚠️ Warning:** Your 3D intuition will be wrong. That's okay; everyone's is! This notebook will help you build new intuition.

Let's explore the fascinating geometry of high dimensions! 🎢

In [None]:
import numpy as np
import math
from dataclasses import dataclass
from typing import Callable, Tuple, Optional, Dict

import matplotlib.pyplot as plt


## 1) Geometric Foundations: Balls and Spheres (Definition 10.1)

### 🎯 What Are Balls and Spheres?

In d-dimensional space ℝ^d, we generalize familiar 2D and 3D concepts:

**Ball of radius r centered at x:**
$$B_r(x) = \{y \in \mathbb{R}^d : \|y - x\| < r\}$$

This is the **interior** (all points strictly closer than distance r from x).

**Sphere of radius r centered at x:**
$$S_r(x) = \{y \in \mathbb{R}^d : \|y - x\| = r\}$$

This is the **boundary** (all points exactly at distance r from x).

---

### 📊 Examples Across Dimensions

| Dimension | Ball $B_1(0)$ | Sphere $S_1(0)$ |
|-----------|-------------|---------------|
| $d=1$ | Open interval $(-1, 1)$ | Two points $\{-1, 1\}$ |
| $d=2$ | Open disk | Circle |
| $d=3$ | Open ball | Sphere (hollow surface) |
| $d=4+$ | **No direct visualization!** | **No direct visualization!** |

---

### 🔍 Key Notation

Throughout this notebook:
- $B_1$ = unit ball (radius 1, centered at origin)
- $S_1$ = unit sphere (radius 1, centered at origin)
- $\|\cdot\|$ = Euclidean norm (L2 distance): $\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_d^2}$

---

### 💡 Intuition: Interior vs. Boundary

In low dimensions:
- **$d=2$:** A circle has "length" (1D), the disk has "area" (2D)
- **$d=3$:** A sphere has "surface area" (2D), the ball has "volume" (3D)

In general:
- **Sphere $S_1$:** A $(d-1)$-dimensional object
- **Ball $B_1$:** A $d$-dimensional object

**The weird part:** In high dimensions, the sphere $S_1$ actually captures "most of the action". We'll see why!

In [None]:
def in_unit_ball(X: np.ndarray) -> np.ndarray:
    """Boolean mask: which rows of X are inside the unit ball."""
    X = np.asarray(X, dtype=float)
    return np.linalg.norm(X, axis=1) < 1.0

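def in_ball(X: np.ndarray, r: float = 1.0) -> np.ndarray:
    """Boolean mask: which rows of X lie in the open ball of radius r around the origin."""
    X = np.asarray(X, dtype=float)
    return np.linalg.norm(X, axis=1) < float(r)

def on_unit_sphere(X: np.ndarray, tol: float = 1e-8) -> np.ndarray:
    """Boolean mask: which rows of X have norm within tol of 1."""
    X = np.asarray(X, dtype=float)
    return np.abs(np.linalg.norm(X, axis=1) - 1.0) <= tol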


## 2) Gaussian Models in High Dimensions (Models 10.3–10.4)

### 🎯 Why Gaussians?

Gaussian distributions are **fundamental** to high-dimensional geometry because:
1. **Rotation invariance:** Distribution looks the same from all directions
2. **Independence:** Coordinates are independent → easy to analyze
3. **Central Limit Theorem:** Sums of random variables → Gaussian
4. **Mathematical tractability:** Nice formulas, concentration properties

---

### 📐 Model 10.3: Spherical Gaussian

A **spherical Gaussian** in ℝ^d has **independent standard normal coordinates**:
$$Z = (Z_1, Z_2, \ldots, Z_d) \quad \text{where each } Z_i \sim N(0,1) \text{ independently}$$

**Density function:**
$$f(x) = (2\pi)^{-d/2} \exp\left(-\frac{\|x\|^2}{2}\right)$$

**Key properties:**
- **Mean:** $\mathbb{E}[Z] = 0$ (center at origin)
- **Covariance:** $\text{Cov}(Z) = I_d$ (identity matrix, uncorrelated coordinates)
- **Expected squared norm:** $\mathbb{E}[\|Z\|^2] = d$ (sum of $d$ unit variances)
- **Typical norm:** $\|Z\| \approx \sqrt{d}$ (by concentration)

**Physical interpretation:** Like a "cloud" of points centered at origin, spreading out in all directions equally.

---

### 📐 Model 10.4: Normalized Gaussian

A **normalized Gaussian** rescales the spherical Gaussian:
$$Y = \frac{1}{\sqrt{2\pi}} Z$$

**Density function:**
$$f(y) = \exp(-\pi \|y\|^2)$$

Notice the cleaner form: each coordinate has variance $\frac{1}{2\pi}$, so the normalizing constant is exactly 1 and the exponent has no factor of $\frac{1}{2}$!

**Key properties:**
- **Expected squared norm:** $\mathbb{E}[\|Y\|^2] = \frac{d}{2\pi}$
- **Typical norm:** $\|Y\| \approx \sqrt{\frac{d}{2\pi}}$

**Why use this?** The notes use the normalized version in Lemma 10.5 because it gives cleaner expressions for volume arguments.

---

### 🔗 Relationship

```
Spherical Gaussian Z ~ N(0, I_d)
        ↓ scale by 1/√(2π)
Normalized Gaussian Y = Z/√(2π)
```

Both capture the same high-dimensional phenomena, just with different scaling constants.
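
As a quick sanity check of the two scalings, the sketch below estimates $\mathbb{E}[\|Z\|^2]$ and $\mathbb{E}[\|Y\|^2]$ by simulation, using only the standard library. The helper name `mean_squared_norm`, the dimension, and the sample size are illustrative choices, not something taken from the notes.

```python
import math
import random

def mean_squared_norm(d, n, scale=1.0, seed=0):
    """Monte Carlo estimate of E||scale * Z||^2 for Z ~ N(0, I_d)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += sum((scale * rng.gauss(0.0, 1.0)) ** 2 for _ in range(d))
    return total / n

d, n = 50, 5_000  # illustrative values
est_spherical = mean_squared_norm(d, n)                                      # theory: d
est_normalized = mean_squared_norm(d, n, scale=1 / math.sqrt(2 * math.pi))   # theory: d / (2*pi)
print(est_spherical, d)
print(est_normalized, d / (2 * math.pi))
```

Both estimates should land within a fraction of a percent of the theoretical values at this sample size.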

In [None]:
def sample_spherical_gaussian(d: int, n: int, rng: Optional[np.random.Generator] = None) -> np.ndarray:
    """Z ~ N(0,I_d). Returns array shape (n,d)."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(loc=0.0, scale=1.0, size=(n, d))

def sample_normalized_gaussian(d: int, n: int, rng: Optional[np.random.Generator] = None) -> np.ndarray:
    """Y = (2π)^(-1/2) Z where Z~N(0,I)."""
    rng = np.random.default_rng() if rng is None else rng
    Z = sample_spherical_gaussian(d, n, rng=rng)
    return Z / math.sqrt(2*math.pi)

def expected_norm_sq_spherical(d: int) -> float:
    return float(d)

def expected_norm_sq_normalized(d: int) -> float:
    return float(d/(2*math.pi))


## 3) The Vanishing Volume Phenomenon (Lemma 10.5)

### 🤯 The Most Counter-Intuitive Result

**Lemma 10.5:** For $d > 4\pi \approx 12.57$, the unit ball volume $|B_1|$ **shrinks toward zero** as $d$ increases!

### 📊 How Small?

Some actual values:
- $d=5$: $|B_1| \approx 5.26$
- $d=10$: $|B_1| \approx 2.55$
- $d=20$: $|B_1| \approx 0.026$ (smaller than in $d=2$!)
- $d=50$: $|B_1| \approx 1.7 \times 10^{-13}$ (essentially zero)
- $d=100$: $|B_1| \approx 2.4 \times 10^{-40}$ (unimaginably tiny)

**Compare to the d-dimensional cube $[-1,1]^d$:**
- Cube volume = $2^d$ (grows exponentially!)
- Ball volume → 0 (shrinks faster than exponentially!)

---

### 💡 Why Does This Happen?

**Intuitive explanation:** The unit ball is defined by $\|x\| < 1$, meaning:
$$x_1^2 + x_2^2 + \cdots + x_d^2 < 1$$

In high dimensions:
- Every point has $d$ coordinates, and each one contributes a nonnegative term $x_i^2$
- The squared terms add up quickly: for a uniform point in $[-1,1]^d$, the expected sum is $d/3$
- Very few combinations satisfy the constraint!

**Analogy:** Imagine trying to stay within a multidimensional "budget constraint": as you add more dimensions, it becomes harder and harder to satisfy.

---

### 🔍 Seeing It Through Gaussians

For a **normalized Gaussian** $Y$:
- Typical norm: $\|Y\| \approx \sqrt{d/(2\pi)}$
- Unit ball condition: $\|Y\| < 1$
- This is the typical case only when $\sqrt{d/(2\pi)} < 1$, i.e., $d < 2\pi \approx 6.28$

For $d > 6.28$:
- The Gaussian "cloud" is centered **outside** the unit ball!
- $\mathbb{P}(Y \in B_1)$ becomes exponentially small

**Conclusion:** The unit ball captures a vanishingly small fraction of probability mass.
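
For even $d$ this probability also has a closed form, which gives an independent check on any simulation. Since $2\pi\|Y\|^2 = \|Z\|^2$ follows a chi-square distribution with $d$ degrees of freedom, and the chi-square CDF with $d = 2k$ degrees of freedom is $1 - e^{-x/2}\sum_{j<k}(x/2)^j/j!$, setting $x = 2\pi$ gives $\mathbb{P}(Y \in B_1) = 1 - e^{-\pi}\sum_{j=0}^{k-1}\pi^j/j!$. The sketch below evaluates it; the function name is a placeholder, and the formula as written only covers even dimensions.

```python
import math

def prob_normalized_gaussian_in_unit_ball(d):
    """Exact P(||Y|| < 1) for Y = Z / sqrt(2*pi); closed form used here needs even d."""
    if d <= 0 or d % 2 != 0:
        raise ValueError("this closed form requires a positive even d")
    k = d // 2
    partial_sum = sum(math.pi ** j / math.factorial(j) for j in range(k))
    return 1.0 - math.exp(-math.pi) * partial_sum

for d in (2, 6, 10, 20):
    print(d, prob_normalized_gaussian_in_unit_ball(d))
```

The printed probabilities fall steadily with $d$, in line with the Gaussian cloud moving outside the unit ball once $d$ exceeds $2\pi$.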

---

### 📈 Simulation Strategy

Below, we verify this by:
1. Sampling from the normalized Gaussian
2. Checking what fraction lands in $B_1$
3. Measuring concentration of $\|Y\|$ around $\sqrt{d/(2\pi)}$

In [None]:
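def prob_in_unit_ball_under_distribution(
    sampler: Callable[..., np.ndarray],
    n: int = 200_000,
    seed: int = 0,
) -> float:
    """Estimate P(X in B_1); sampler is called as sampler(n) or sampler(n, rng)."""
    import inspect  # local import keeps this cell self-contained

    rng = np.random.default_rng(seed)
    # Pass the seeded generator only if the sampler accepts a second argument
    n_params = len(inspect.signature(sampler).parameters)
    X = sampler(n) if n_params == 1 else sampler(n, rng)
    return float(np.mean(in_unit_ball(X)))

def simulate_norm_concentration(
    sampler: Callable[..., np.ndarray],
    d: int,
    n: int = 100_000,
    seed: int = 0,
) -> Dict[str, float]:
    """Summary statistics of ||X|| for samples drawn as sampler(d, n, rng=rng)."""
    rng = np.random.default_rng(seed)
    X = sampler(d, n, rng=rng)  # expect signature (d,n,rng)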
    norms = np.linalg.norm(X, axis=1)
    return {
        "mean_norm": float(np.mean(norms)),
        "std_norm": float(np.std(norms, ddof=0)),
        "q05": float(np.quantile(norms, 0.05)),
        "q95": float(np.quantile(norms, 0.95)),
    }

# Quick demo: probability normalized Gaussian lies in B1 for various d
def demo_prob_gaussian_in_ball(ds=(2,3,5,10,20), n=200_000, seed=0):
    out = []
    for d in ds:
        rng = np.random.default_rng(seed)
        Y = sample_normalized_gaussian(d, n, rng=rng)
        out.append((d, float(np.mean(in_unit_ball(Y)))))
    return out

demo_prob_gaussian_in_ball(ds=(2,3,5,10,15,20), n=100_000, seed=0)


In [None]:
# Plot: P(normalized Gaussian in unit ball) vs dimension
vals = demo_prob_gaussian_in_ball(ds=range(1, 21), n=60_000, seed=1)
ds = np.array([v[0] for v in vals])
ps = np.array([v[1] for v in vals])

plt.figure()
plt.plot(ds, ps, marker='o')
plt.title("P(Y in unit ball) for normalized Gaussian Y (simulation)")
plt.xlabel("dimension d")
plt.ylabel("probability")
plt.yscale("log")
plt.show()


In [None]:
# Plot: concentration of ||Y|| around sqrt(d/(2π))
ds = np.arange(1, 21)
mean_norms = []
expected = []
for d in ds:
    stats = simulate_norm_concentration(sample_normalized_gaussian, d=d, n=50_000, seed=0)
    mean_norms.append(stats["mean_norm"])
    expected.append(math.sqrt(d/(2*math.pi)))

plt.figure()
plt.plot(ds, mean_norms, marker='o', label="empirical mean ||Y||")
plt.plot(ds, expected, marker='x', label="sqrt(d/(2π))")
plt.title("Normalized Gaussian norm concentrates near sqrt(d/(2π))")
plt.xlabel("dimension d")
plt.ylabel("value")
plt.legend()
plt.show()


## 4) Scaling and the Boundary Effect (Lemma 10.7)

### 🎯 The Boundary Concentration Principle

**Lemma 10.7:** If you shrink any set $E$ by a factor $(1-\varepsilon)$, its volume scales by $(1-\varepsilon)^d$:
$$|(1-\varepsilon)E| = (1-\varepsilon)^d |E|$$

---

### 📐 What Does This Mean?

Consider the unit ball $B_1$. Define:
- **Outer ball:** $B_1$ (radius 1)
- **Inner ball:** $(1-\varepsilon)B_1$ (radius $1-\varepsilon$)
- **Annulus (shell):** The region between them

**Volume of inner ball:**
$$|(1-\varepsilon)B_1| = (1-\varepsilon)^d |B_1|$$

**Volume of annulus:**
$$|B_1| - |(1-\varepsilon)B_1| = |B_1| \left[1 - (1-\varepsilon)^d\right]$$

---

### 💥 The Shocking Consequence

Let's use $\varepsilon = 0.1$ (10% shrinkage) and see what fraction of volume is in the **outer 10% shell**:

| Dimension | $(1-0.1)^d$ | Volume in shell | Where's the volume? |
|-----------|-----------|-----------------|---------------------|
| $d=2$ | 0.81 | 19% | Mostly interior |
| $d=5$ | 0.59 | 41% | Balanced |
| $d=10$ | 0.35 | 65% | **Mostly shell!** |
| $d=20$ | 0.12 | 88% | **Almost all shell!** |
| $d=50$ | 0.005 | 99.5% | **Essentially all shell!** |
| $d=100$ | 0.00003 | 99.997% | **Everything is surface!** |

---

### 🍊 The Orange Peel Analogy

Imagine a high-dimensional "orange":
- In $d=3$: The peel (outer 10%) is a thin layer holding about 27% of the volume
- In $d=100$: The peel (outer 10%) contains 99.997% of the volume!
- The "interior" becomes negligible

**Consequence:** In high dimensions, **almost all points are near the boundary**. The interior is essentially empty!

---

### 🔢 Mathematical Insight

For small $\varepsilon$, we can approximate:
$$(1-\varepsilon)^d \approx e^{-\varepsilon d}$$

This decays **exponentially fast** in $d$. Even a small shrinkage (small $\varepsilon$) leads to massive volume loss when $d$ is large.

**Key takeaway:** High-dimensional objects are "almost all surface, no interior."
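
The numbers in this section can be reproduced in a few lines. The sketch below computes the shell fraction $1-(1-\varepsilon)^d$, compares $(1-\varepsilon)^d$ with $e^{-\varepsilon d}$, and solves $(1-\varepsilon)^d = \tfrac12$ for the shell width that holds exactly half the volume. The function names are illustrative, not from the notes.

```python
import math

def shell_fraction(d, eps):
    """Fraction of a ball's volume lying in the outer shell of relative width eps."""
    return 1.0 - (1.0 - eps) ** d

def eps_for_half_volume(d):
    """Relative shell width eps for which the shell holds exactly half the volume."""
    return 1.0 - 0.5 ** (1.0 / d)

for d in (2, 10, 100, 1000):
    exact = (1.0 - 0.1) ** d       # inner-ball fraction for eps = 0.1
    approx = math.exp(-0.1 * d)    # exponential approximation (always an upper bound)
    print(d, shell_fraction(d, 0.1), exact, approx, eps_for_half_volume(d))
```

The last column shrinks roughly like $\ln 2 / d$: in $d = 1000$, half of the volume already sits in a shell well under 0.1% of the radius thick.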

In [None]:
def scaling_volume_ratio(d: int, eps: float) -> float:
    """|(1-eps)E|/|E| = (1-eps)^d."""
    if not (0 < eps <= 1):
        raise ValueError("eps in (0,1].")
    return float((1.0 - eps) ** d)

# Compare (1-0.1)^d vs exp(-0.1 d) like the note snippet
ds = np.arange(1, 101)
eps = 0.1
plt.figure()
plt.plot(ds, (1-eps)**ds, label="(1-ε)^d")
plt.plot(ds, np.exp(-eps*ds), label="exp(-ε d)")
plt.title("Scaling decay: (1-ε)^d -> 0 as d grows")
plt.xlabel("d")
plt.ylabel("ratio")
plt.legend()
plt.show()


## 5) Exact Volume and Surface Area Formulas (Theorem 10.8)

### 🎯 The Precise Mathematics

The notes derive **exact formulas** for the volume of the unit ball and surface area of the unit sphere using the **Gamma function**.

---

### 📐 The Formulas

**Volume of the unit ball:**
$$|B_1| = \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}$$

**Equivalent form** (using $\Gamma(z+1) = z\cdot\Gamma(z)$):
$$|B_1| = \frac{2\pi^{d/2}}{d\,\Gamma(d/2)}$$

**Surface area of the unit sphere:**
$$|S_1| = d \cdot |B_1| = \frac{2\pi^{d/2}}{\Gamma(d/2)}$$

**Relationship:** The sphere's "thickness" is infinitesimal, but its surface area relates to the ball's volume through a factor of $d$.
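
One way to see where the factor of $d$ comes from (a standard argument, not necessarily the derivation used in the notes): by the scaling rule of Lemma 10.7, a ball of radius $r$ has volume $|B_r| = r^d |B_1|$, and increasing the radius by a small amount adds a thin shell whose volume is approximately the surface area times the added thickness. Differentiating in $r$ therefore gives the surface area:

$$|S_r| = \frac{\mathrm{d}}{\mathrm{d}r}|B_r| = \frac{\mathrm{d}}{\mathrm{d}r}\left(r^d\,|B_1|\right) = d\,r^{d-1}\,|B_1| \quad\Longrightarrow\quad |S_1| = d\,|B_1|$$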

---

### 🔬 What is the Gamma Function?

The **Gamma function** generalizes factorials to non-integers:
$$\Gamma(n) = (n-1)! \quad \text{for positive integers } n$$

**Examples:**
- $\Gamma(1) = 0! = 1$
- $\Gamma(2) = 1! = 1$
- $\Gamma(3) = 2! = 2$
- $\Gamma(4) = 3! = 6$
- $\Gamma(1/2) = \sqrt{\pi}$ (special value!)

**Key identity:** $\Gamma(z+1) = z\cdot\Gamma(z)$

**For large $z$:** $\Gamma(z)$ grows faster than exponentially (roughly like $z^z e^{-z}$ by Stirling's approximation)
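
These identities are easy to confirm numerically with the standard library's `math.gamma`. The sketch below checks the factorial relation, the special value at $1/2$, and the recurrence at one non-integer point (the choice $z = 2.5$ is arbitrary).

```python
import math

# Gamma(n) = (n-1)! for positive integers n
for n in range(1, 6):
    print(n, math.gamma(n), math.factorial(n - 1))

# Special value Gamma(1/2) = sqrt(pi)
print(math.gamma(0.5), math.sqrt(math.pi))

# Recurrence Gamma(z+1) = z * Gamma(z), checked at a non-integer point
z = 2.5
print(math.gamma(z + 1), z * math.gamma(z))
```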

---

### 📊 Verifying in Low Dimensions

Let's check our formula matches known results:

| Dimension | Name | Formula | $\lvert B_1 \rvert$ value |
|-----------|------|---------|-----------|
| $d=1$ | Line segment | $2r = 2$ | 2.000 |
| $d=2$ | Disk | $\pi r^2 = \pi$ | 3.142 |
| $d=3$ | Ball | $\frac{4}{3}\pi r^3 = \frac{4\pi}{3}$ | 4.189 |
| $d=4$ | 4-ball | $\frac{\pi^2 r^4}{2} = \frac{\pi^2}{2}$ | 4.935 |

**Peak at $d\approx 5$:** The volume reaches its maximum around $d=5$, then starts decreasing!

---

### ⚠️ Numerical Stability

For large $d$, the formula involves large values:
- $\pi^{d/2}$ grows exponentially
- $\Gamma(d/2)$ grows even faster
- Their ratio → 0

**Solution:** Work in **log space**:
$$\log|B_1| = \frac{d}{2}\log(\pi) - \log\Gamma\left(\frac{d}{2} + 1\right)$$

Use `math.lgamma` (log-gamma function) to avoid overflow/underflow.

**Implementation strategy:**
1. Compute $\log(|B_1|)$ for any $d$
2. Exponentiate only when needed for display
3. For $d > 200$, just work with log values
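
To make the stability issue concrete, the sketch below evaluates the volume formula both directly and in log space. In CPython, `math.gamma` raises `OverflowError` once its result exceeds the float range, so the direct version stops working for large $d$ while the `math.lgamma` version keeps going. The function names are illustrative and deliberately differ from the notebook's own helpers.

```python
import math

def volume_unit_ball_direct(d):
    """Direct formula pi^(d/2) / Gamma(d/2 + 1); intermediate values overflow for large d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def log_volume_unit_ball_stable(d):
    """The same quantity in log space via math.lgamma."""
    return (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)

for d in (10, 100, 1000):
    try:
        direct = volume_unit_ball_direct(d)
    except OverflowError:
        direct = None  # an intermediate value exceeded the float range
    print(d, direct, log_volume_unit_ball_stable(d))
```

For $d = 1000$ the direct evaluation fails, while the log-space value is still an ordinary finite float.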

In [None]:
def log_volume_unit_ball(d: int) -> float:
    """log(|B1|) in R^d using log-gamma for stability."""
    d = int(d)
    return (d/2)*math.log(math.pi) - math.lgamma(d/2 + 1)

def volume_unit_ball(d: int) -> float:
    """|B1| in R^d."""
    return float(math.exp(log_volume_unit_ball(d)))

def log_area_unit_sphere(d: int) -> float:
    """log(|S1|) in R^d."""
    d = int(d)
    return math.log(2.0) + (d/2)*math.log(math.pi) - math.lgamma(d/2)

def area_unit_sphere(d: int) -> float:
    return float(math.exp(log_area_unit_sphere(d)))

# Quick sanity checks in low dimensions
for d in [1,2,3,4,5,10]:
    print(d, "B1 vol =", volume_unit_ball(d), " | S1 area =", area_unit_sphere(d))


In [None]:
# Plot volume of unit ball vs dimension (exact formula)
ds = np.arange(1, 51)
vols = np.array([volume_unit_ball(int(d)) for d in ds])
plt.figure()
plt.plot(ds, vols, marker='o')
plt.title("Exact volume of the unit ball |B1| vs dimension")
plt.xlabel("dimension d")
plt.ylabel("|B1|")
plt.yscale("log")
plt.show()


## 6) Uniform Sampling: Spheres and Balls

### 🎯 The Challenge

How do we generate **uniformly random** points:
1. **On the unit sphere** $S_1$ (surface)?
2. **Inside the unit ball** $B_1$ (interior)?

Naive approaches fail! We need mathematically correct methods.

---

### 🌐 Method 1: Uniform on the Sphere (Lemma 10.16)

**Key Theorem:** If $Z \sim N(0, I_d)$ (spherical Gaussian), then:
$$\theta = \frac{Z}{\|Z\|} \sim \text{Uniform}(S_1)$$

**Why this works:**
1. **Rotation invariance:** Gaussian distribution looks the same from all directions
2. **Normalization:** Dividing by $\|Z\|$ projects onto the unit sphere
3. **Symmetry:** Every direction is equally likely!

**Algorithm:**
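```python
import numpy as np

rng = np.random.default_rng(0)
d = 5                        # example dimension (any d >= 1 works)
Z = rng.normal(size=d)       # 1. sample Z ~ N(0, I_d): d independent standard normals
r = np.linalg.norm(Z)        # 2. compute the norm r = ||Z||
theta = Z / r                # 3. normalize; theta is uniformly distributed on S_1
```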

**Probability of $\|Z\| = 0$:** Zero (measure zero event, safe to ignore in floating point)
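
A light-weight check of the method: if $\theta$ is uniform on $S_1$, then by symmetry every coordinate has mean 0 and second moment $1/d$ (the squares sum to 1). The sketch below tests these two necessary conditions with the standard library only; they do not prove uniformity, and the helper name, dimension, and sample size are illustrative.

```python
import math
import random

def sample_unit_sphere(d, rng):
    """One point on the unit sphere in R^d: normalize a standard Gaussian vector."""
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in z))
    return [v / norm for v in z]

rng = random.Random(0)
d, n = 5, 20_000
points = [sample_unit_sphere(d, rng) for _ in range(n)]

# By symmetry, each coordinate should have mean 0 and second moment 1/d.
first_moments = [sum(p[i] for p in points) / n for i in range(d)]
second_moments = [sum(p[i] ** 2 for p in points) / n for i in range(d)]
print(first_moments)
print(second_moments, 1 / d)
```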

---

### 🎱 Method 2: Uniform in the Ball (Theorem 10.18)

**Key Theorem:** If $\theta \sim \text{Uniform}(S_1)$ and $U \sim \text{Uniform}([0,1])$ independent, then:
$$X = U^{1/d} \cdot \theta \sim \text{Uniform}(B_1)$$

**Why this works:**

**Naive (WRONG) approach:** Just use $\theta \cdot U$
- Problem: This concentrates points near the center!
- Reason: In high dimensions, "most volume is near the surface"

**Correct approach:** Use $\theta \cdot U^{1/d}$
- The exponent $1/d$ accounts for the volume scaling
- More points pushed toward the boundary (where the volume is!)

**Intuition via volume:**
- Volume of ball of radius $r$: $\propto r^d$
- To get uniform distribution, need CDF proportional to $r^d$
- Taking $U^{1/d}$ gives the right radial distribution
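
Spelled out, this is inverse-transform sampling applied to the radius. For $X \sim \text{Uniform}(B_1)$ and $0 \le r \le 1$, the scaling rule gives

$$\mathbb{P}(\|X\| \le r) = \frac{|B_r|}{|B_1|} = r^d,$$

and the rescaled uniform variable has exactly the same CDF:

$$\mathbb{P}\left(U^{1/d} \le r\right) = \mathbb{P}\left(U \le r^d\right) = r^d.$$

The naive choice $r = U$ instead has CDF $r$, which puts far too much mass at small radii whenever $d > 1$.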

**Algorithm:**
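```python
import numpy as np

rng = np.random.default_rng(0)
d = 5                                # example dimension
Z = rng.normal(size=d)
theta = Z / np.linalg.norm(Z)        # 1. theta ~ Uniform(S_1), via Method 1
U = rng.random()                     # 2. U ~ Uniform([0, 1])
X = U ** (1.0 / d) * theta           # 3-4. radius r = U^(1/d); X is uniform in B_1
```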

---

### 📊 Visualizing the Radial Distribution

For uniform points in $B_1$:
- **$d=2$:** Radius $R$ has PDF $f(r) = 2r$ (more points near edge)
- **$d=3$:** Radius $R$ has PDF $f(r) = 3r^2$ (even more near edge)
- **$d=100$:** Almost ALL points have radius $\approx 1$ (thin shell!)

**Expected radius:** $\mathbb{E}[R] = \frac{d}{d+1} \to 1$ as $d \to \infty$

This confirms: in high dimensions, uniform random points in the ball are **almost all near the surface**!
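
Because the direction $\theta$ is independent of the radius, the difference between the naive and the correct scheme shows up in the radius alone. The sketch below compares the average radius under $r = U$ and $r = U^{1/d}$ with the theoretical value $d/(d+1)$; the helper name and sample size are illustrative.

```python
import random

def mean_radius(d, n, naive, seed=0):
    """Average radius when the radius is drawn as U (naive) or U**(1/d) (correct)."""
    rng = random.Random(seed)
    exponent = 1.0 if naive else 1.0 / d
    return sum(rng.random() ** exponent for _ in range(n)) / n

n = 50_000
for d in (2, 10, 100):
    print(d, mean_radius(d, n, naive=True), mean_radius(d, n, naive=False), d / (d + 1))
```

The naive radius averages about $1/2$ in every dimension, while the correct one tracks $d/(d+1)$ and approaches 1.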

---

### ❌ What NOT to Do

**DON'T:**
1. Sample from cube and project → biased!
2. Sample coordinates uniformly → not uniform in ball!
3. Use naive radial scaling $r = U$ → concentrates at center!

**DO:**
- Use the Gaussian normalization method for spheres
- Use the radial rescaling $r = U^{1/d}$ for balls

In [None]:
def sample_uniform_sphere(d: int, n: int, rng: Optional[np.random.Generator] = None) -> np.ndarray:
    """Uniform on S1 in R^d via Gaussian normalization."""
    rng = np.random.default_rng() if rng is None else rng
    Z = rng.normal(size=(n, d))
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    # avoid division by zero (almost impossible for Gaussians)
    norms = np.where(norms == 0, 1.0, norms)
    return Z / norms

def sample_uniform_ball(d: int, n: int, rng: Optional[np.random.Generator] = None) -> np.ndarray:
    """Uniform in B1 in R^d via theta * U^(1/d)."""
    rng = np.random.default_rng() if rng is None else rng
    theta = sample_uniform_sphere(d, n, rng=rng)
    U = rng.random((n, 1))
    r = U ** (1.0 / d)
    return theta * r

def radii(x: np.ndarray) -> np.ndarray:
    return np.linalg.norm(np.asarray(x, dtype=float), axis=1)

# Demo: radii distribution (most mass near 1 for large d)
for d in [2, 5, 20, 100]:
    rng = np.random.default_rng(0)
    X = sample_uniform_ball(d, 50_000, rng=rng)
    R = radii(X)
    print("d=", d, "mean radius=", float(np.mean(R)), "q05/q95=", float(np.quantile(R,0.05)), float(np.quantile(R,0.95)))


In [None]:
# Visual: radius histogram for a few dimensions
dims = [2, 10, 50]
plt.figure()
for d in dims:
    rng = np.random.default_rng(d)
    R = radii(sample_uniform_ball(d, 40_000, rng=rng))
    plt.hist(R, bins=60, density=True, alpha=0.5, label=f"d={d}")
plt.title("Radius distribution for Uniform(B1): mass moves toward 1 as d grows")
plt.xlabel("radius ||X||")
plt.ylabel("density")
plt.legend()
plt.show()


## 7) Why Naive Projection Fails (Section 10.4.1)

### 🚫 The Common Mistake (2D Example)

**Bad idea:** To sample uniformly on the unit circle:
1. Sample $(X,Y) \sim \text{Uniform}([-1,1]^2)$ (uniform in square)
2. Project: $\theta = (X,Y) / \|(X,Y)\|$

**Why this fails:** The resulting **angle distribution is NOT uniform**!

---

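### 🔍 Why Does It Fail?

**Geometric reason:**
- The square extends farther from the origin toward its **corners** (distance $\sqrt{2}$) than toward the midpoints of its sides (distance 1)
- A thin wedge of directions pointing at a corner therefore contains more of the square's area, and hence more sample points
- The corners lie at angles $45°, 135°, 225°, 315°$, so after projection these angles are **over-represented**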

**Visual intuition:**
```
Square [-1,1]²:

    +-------+
    |   •   |    ← fewer points project to 0°, 90°, 180°, 270°
    | •   • |
    |   •   |    ← more points project to 45°, 135°, etc.
    +-------+

```

The corners "stretch out" when projected onto the circle!
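
The bias can be quantified exactly. The points with angle in $[\varphi, \varphi + \mathrm{d}\varphi]$ fill a wedge of area $\tfrac12 r_{\max}(\varphi)^2\,\mathrm{d}\varphi$, where $r_{\max}(\varphi) = 1/\max(|\cos\varphi|, |\sin\varphi|)$ is the distance from the origin to the square's edge. Dividing by the square's area 4 gives the angle density $r_{\max}(\varphi)^2/8$. The sketch below evaluates this density and checks numerically that it integrates to 1; the function name is illustrative.

```python
import math

def square_projection_angle_density(phi):
    """Density of the angle when Uniform([-1,1]^2) is projected onto the circle."""
    r_max = 1.0 / max(abs(math.cos(phi)), abs(math.sin(phi)))  # distance to the square's edge
    return r_max ** 2 / 8.0  # wedge area (r_max^2 / 2) per unit angle, over the square's area 4

uniform_density = 1.0 / (2.0 * math.pi)
print(square_projection_angle_density(0.0), uniform_density,
      square_projection_angle_density(math.pi / 4))

# Midpoint-rule check that the density integrates to 1 over [0, 2*pi)
m = 100_000
h = 2.0 * math.pi / m
total = sum(square_projection_angle_density((i + 0.5) * h) for i in range(m)) * h
print(total)
```

Under this density, directions toward a corner are twice as likely as directions toward the middle of a side, with the uniform value $1/(2\pi)$ lying in between.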

---

### ✅ The Correct Method

**Option 1: Sample from disk, then project**
1. Sample uniformly in the disk (using rejection or radial method)
2. Project: $\theta = (X,Y) / \|(X,Y)\|$
3. Result: **Uniform angles!**

**Why this works:** The disk is **rotationally symmetric**, so projection preserves uniformity.

**Option 2: Use Gaussian method**
1. Sample $Z \sim N(0, I_2)$
2. Normalize: $\theta = Z / \|Z\|$
3. Result: **Uniform on circle** (as per Lemma 10.16)

---

### 📊 Empirical Demonstration

The code below overlays two histograms, identified by their legend labels:
- **"project from square":** Angles from projecting square points (NOT uniform, with peaks at $\pm\pi/4$ and $\pm 3\pi/4$)
- **"disk->project":** Angles from projecting disk points (uniform, flat)

The difference is stark! The naive method creates visible bias.

---

### 🌍 Higher Dimensions

This problem gets **worse** in higher dimensions:
- In $d=3$: Projecting from the cube $[-1,1]^3$ onto the unit sphere $S_1 \subset \mathbb{R}^3$ creates bias
- In $d=100$: The bias is extreme
- **Always use the Gaussian normalization method for uniform sampling on spheres!**

**Lesson:** Low-dimensional intuition ("just project") breaks down. Use proven methods!

In [None]:
def project_to_circle(XY: np.ndarray) -> np.ndarray:
    XY = np.asarray(XY, dtype=float)
    norms = np.linalg.norm(XY, axis=1, keepdims=True)
    norms = np.where(norms == 0, 1.0, norms)
    return XY / norms

def sample_uniform_square_2d(n: int, rng: Optional[np.random.Generator] = None) -> np.ndarray:
    rng = np.random.default_rng() if rng is None else rng
    return rng.uniform(-1, 1, size=(n, 2))

def sample_uniform_disk_via_rejection(n: int, rng: Optional[np.random.Generator] = None) -> Tuple[np.ndarray, float]:
    """Uniform in unit disk by sampling from square and accepting norm<1."""
    rng = np.random.default_rng() if rng is None else rng
    accepted = []
    n_accepted = 0  # accepted points so far
    total = 0       # proposals drawn so far
    while n_accepted < n:
        batch = rng.uniform(-1, 1, size=(max(1000, n), 2))
        total += batch.shape[0]
        keep = batch[np.linalg.norm(batch, axis=1) < 1]
        accepted.append(keep)
        n_accepted += keep.shape[0]
    # Rate uses every accepted proposal, not only the n points returned
    return np.vstack(accepted)[:n], n_accepted / total

rng = np.random.default_rng(0)
XY = sample_uniform_square_2d(80_000, rng=rng)
proj_bad = project_to_circle(XY)
angles_bad = np.arctan2(proj_bad[:,1], proj_bad[:,0])

disk, acc = sample_uniform_disk_via_rejection(50_000, rng=rng)
proj_good = project_to_circle(disk)
angles_good = np.arctan2(proj_good[:,1], proj_good[:,0])

print("disk rejection acceptance rate ~ area(disk)/area(square) =", acc)

plt.figure()
# density=True puts the two samples (80,000 vs 50,000 points) on the same scale
plt.hist(angles_bad, bins=30, density=True, alpha=0.6, label="project from square (NOT uniform)")
plt.hist(angles_good, bins=30, density=True, alpha=0.6, label="disk->project (uniform)")
plt.title("Angle histograms on the unit circle")
plt.xlabel("angle")
plt.ylabel("density")
plt.legend()
plt.show()


## Exercise 10.15: Rejection sampling from cube to ball fails in high dimension

If you sample from the cube $[-1,1]^d$ and accept if $\|X\|<1$, acceptance probability is:
$$\frac{|B_1|}{|[-1,1]^d|} = \frac{|B_1|}{2^d}$$

Since $|B_1|$ shrinks rapidly and $2^d$ grows exponentially, acceptance becomes tiny.

Below: compute theoretical acceptance + simulate to feel it.

In [None]:
def acceptance_prob_ball_in_cube(d: int) -> float:
    return volume_unit_ball(d) / (2.0 ** d)

def rejection_sample_ball_from_cube(d: int, n: int, rng: Optional[np.random.Generator] = None, max_batches: int = 100000) -> Tuple[np.ndarray, float]:
    """Uniform(B1) by cube rejection (impractical for large d). Returns (samples, acceptance_rate)."""
    rng = np.random.default_rng() if rng is None else rng
    accepted = []
    n_accepted = 0  # accepted points so far
    total = 0       # proposals drawn so far
    while n_accepted < n:
        if max_batches <= 0:
            raise RuntimeError("Too many batches; d is too large for rejection sampling.")
        batch = rng.uniform(-1, 1, size=(max(5000, n), d))
        total += batch.shape[0]
        keep = batch[np.linalg.norm(batch, axis=1) < 1]
        accepted.append(keep)
        n_accepted += keep.shape[0]
        max_batches -= 1
    X = np.vstack(accepted)[:n]
    # Rate uses every accepted proposal, not only the n points returned
    return X, n_accepted / total

for d in [2, 5, 10, 20, 30]:
    print("d=", d, "theoretical acceptance |B1|/2^d =", acceptance_prob_ball_in_cube(d))


## 8) The High-Dimensional Annulus Theorem (Theorem 10.20)

### 🎯 The Concentration Phenomenon

**Theorem 10.20** is one of the most important results about high-dimensional geometry. It says:

**Random vectors concentrate in thin shells around their expected norm.**

---

### 📐 Precise Statement

Let $X \in \mathbb{R}^d$ have **independent sub-Gaussian coordinates** with variance $\sigma^2$.

For any $\beta \leq \sqrt{d}$:
$$\mathbb{P}\left(\sqrt{d}\,\sigma - \beta \leq \|X\| \leq \sqrt{d}\,\sigma + \beta\right) \geq 1 - 2e^{-\beta^2/128}$$

**Translation:** With high probability, $\|X\|$ is within $\beta$ of $\sqrt{d}\cdot\sigma$.

---

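### 💡 What Does This Mean?

**For a standard Gaussian** ($\sigma = 1$):

**Expected norm:** $\approx \sqrt{d}$

**Concentration:** $\|X\| \approx \sqrt{d} \pm \beta$ with probability $\geq 1 - 2e^{-\beta^2/128}$

**Example ($d=10{,}000, \beta=20$):**
- Expected norm: $\sqrt{10{,}000} = 100$
- Shell: $[100-20, 100+20] = [80, 120]$
- Probability in shell: $\geq 1 - 2e^{-400/128} \approx 91.2\%$

**At least 91% of vectors have norm between 80 and 120!** Because of the constant 128, the bound only becomes informative once $2e^{-\beta^2/128} < 1$, i.e. $\beta > \sqrt{128 \ln 2} \approx 9.4$. For $d=100, \beta=3$ it evaluates to $1 - 2e^{-9/128} \approx -0.86$, which says nothing, even though the simulation below shows that almost all Gaussian vectors do land in $[7, 13]$.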

---

### 🎱 The Annulus (Shell) Picture

Imagine concentric spheres:
- **Inner sphere:** radius $\sqrt{d} - \beta$
- **Outer sphere:** radius $\sqrt{d} + \beta$  
- **Shell (annulus):** region between them

**Thickness of shell:** $2\beta$ (fixed!)

**Radius of shell center:** $\sqrt{d}$ (grows with dimension!)

**Relative thickness:** $\frac{2\beta}{\sqrt{d}} \to 0$ as $d \to \infty$

**Conclusion:** The shell becomes **arbitrarily thin** (relative to its radius) as $d$ increases!

---

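### 📊 Probability Bound Analysis

The bound $2e^{-\beta^2/128}$ decays **exponentially** in $\beta^2$, but the constant 128 makes it vacuous (larger than 1) until $\beta > \sqrt{128 \ln 2} \approx 9.4$:

| $\beta$ | Tail bound $2e^{-\beta^2/128}$ | In-shell prob |
|---|------------|---------------|
| 4 | 1.765 | vacuous |
| 10 | 0.916 | $\geq 8.4\%$ |
| 16 | 0.271 | $\geq 72.9\%$ |
| 24 | 0.0222 | $\geq 97.8\%$ |
| 32 | 0.00067 | $\geq 99.93\%$ |

Since the theorem requires $\beta \leq \sqrt{d}$, each row only applies when $d \geq \beta^2$. The bound is loose: the simulations below show much tighter concentration for Gaussians.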
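
Inverting the tail bound answers a practical question: how wide must the shell be for the theorem to guarantee in-shell probability at least $1 - \delta$? Solving $2e^{-\beta^2/128} = \delta$ gives $\beta = \sqrt{128 \ln(2/\delta)}$, and the condition $\beta \le \sqrt{d}$ then gives the smallest dimension in which the guarantee applies. The function name below is a placeholder.

```python
import math

def min_beta_for_confidence(delta):
    """Smallest beta with 2*exp(-beta^2/128) <= delta, for 0 < delta <= 1."""
    if not (0.0 < delta <= 1.0):
        raise ValueError("delta must lie in (0, 1]")
    return math.sqrt(128.0 * math.log(2.0 / delta))

for delta in (1.0, 0.5, 0.1, 0.01):
    beta = min_beta_for_confidence(delta)
    min_d = math.ceil(beta ** 2)  # the theorem requires beta <= sqrt(d)
    print(delta, beta, min_d)
```

The case `delta = 1.0` marks the threshold below which the bound carries no information at all.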

---

### 🔬 Sub-Gaussian Coordinates

**What are sub-Gaussian random variables?**

A mean-zero random variable $X$ is **sub-Gaussian with parameter $\sigma$** if:
$$\mathbb{E}[e^{tX}] \leq e^{\sigma^2t^2/2} \quad \text{for all } t \in \mathbb{R}$$

**Examples:**
- **Gaussian $N(0,\sigma^2)$:** Exactly sub-Gaussian with parameter $\sigma$
- **Bounded variables:** If $|X| \leq C$ and $\mathbb{E}[X] = 0$, then $X$ is sub-Gaussian with parameter at most $C$
- **Other distributions:** Uniform on a bounded interval, Rademacher ($\pm 1$ coin flips), etc. (with appropriate constants). Heavier-tailed distributions such as Laplace are *not* sub-Gaussian.

**Why this matters:** The theorem applies to **many distributions beyond Gaussians**!
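
As a concrete non-Gaussian case, take a Rademacher variable ($X = \pm 1$ with probability $\tfrac12$ each). Its moment generating function is $\cosh(t)$, and the inequality $\cosh(t) \le e^{t^2/2}$ shows it is sub-Gaussian with parameter $\sigma = 1$. The sketch below checks the inequality on a grid; the grid range and function names are arbitrary choices.

```python
import math

def rademacher_mgf(t):
    """E[exp(t X)] for X = +1 or -1 with probability 1/2 each; this equals cosh(t)."""
    return 0.5 * (math.exp(t) + math.exp(-t))

def sub_gaussian_bound(t, sigma=1.0):
    """Right-hand side of the sub-Gaussian condition."""
    return math.exp(sigma ** 2 * t ** 2 / 2.0)

ts = [k / 10.0 for k in range(-50, 51)]
holds = all(rademacher_mgf(t) <= sub_gaussian_bound(t) for t in ts)
print(holds)
```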

---

### 🌊 Physical Interpretation

In high dimensions, random vectors are "**almost deterministic in length**":
- The norm $\|X\|$ concentrates tightly around $\sqrt{d}\cdot\sigma$
- Variability is $O(1)$, while mean is $O(\sqrt{d})$
- Relative spread: $O(1/\sqrt{d}) \to 0$

**Analogy:** Like measuring the distance of millions of randomly thrown darts:
- Individual throws are random
- But the average distance becomes very predictable
- In high dimensions, **even single vectors behave predictably!**
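
The claim that the spread stays $O(1)$ while the mean grows like $\sqrt{d}$ can be seen directly. The sketch below estimates the mean and standard deviation of $\|X\|$ for $X \sim N(0, I_d)$ in a few dimensions, using only the standard library; the helper name, the dimensions, and the sample size are illustrative.

```python
import math
import random

def norm_mean_and_std(d, n, seed=0):
    """Empirical mean and standard deviation of ||X|| for X ~ N(0, I_d)."""
    rng = random.Random(seed)
    norms = [math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d))) for _ in range(n)]
    mean = sum(norms) / n
    var = sum((r - mean) ** 2 for r in norms) / n
    return mean, math.sqrt(var)

for d in (10, 100, 400):
    mean, std = norm_mean_and_std(d, n=2_000)
    print(d, mean, math.sqrt(d), std, std / mean)
```

The standard deviation stays close to $1/\sqrt{2} \approx 0.71$ in every dimension, so the relative spread in the last column falls like $1/\sqrt{d}$.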

---

### 💻 Empirical Verification

The code below:
1. Samples vectors from $N(0, I_d)$
2. Computes their norms
3. Checks what fraction lie in $[\sqrt{d} - \beta, \sqrt{d} + \beta]$
4. Compares to the theoretical lower bound

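**Result:** Empirical probabilities consistently exceed the theoretical lower bound, as the theorem requires. For the small $\beta$ values used here the bound is vacuous (it does not exceed 0), so the simulation mainly shows how much tighter the true concentration is.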

In [None]:
def annulus_shell_probability_lower_bound(beta: float) -> float:
    """Lower bound 1 - 2 exp(-beta^2/128), clipped at 0 (vacuous for beta below about 9.42)."""
    return float(max(0.0, 1.0 - 2.0 * math.exp(-(beta*beta)/128.0)))

def annulus_shell_tail_bound(beta: float) -> float:
    """Upper bound 2 exp(-beta^2/128) on being outside the shell, clipped at 1."""
    return float(min(1.0, 2.0 * math.exp(-(beta*beta)/128.0)))

def empirical_shell_probability(d: int, beta: float, n: int = 200_000, seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    X = sample_spherical_gaussian(d, n, rng=rng)  # sigma = 1
    R = radii(X)
    lo = math.sqrt(d) - beta
    hi = math.sqrt(d) + beta
    return float(np.mean((R >= lo) & (R <= hi)))

# Demo: compare empirical vs bound for some (d, beta)
for d in [20, 50, 100, 200]:
    for beta in [0.5, 1.0, 2.0, 3.0]:
        if beta <= math.sqrt(d):
            emp = empirical_shell_probability(d, beta, n=80_000, seed=1)
            lb = annulus_shell_probability_lower_bound(beta)
            print(f"d={d:3d}, beta={beta:3.1f}  empirical={emp:.4f}  bound>={lb:.4f}")


In [None]:
# Visual: norm distribution of Gaussian vs shell center sqrt(d)
d = 100
rng = np.random.default_rng(0)
X = sample_spherical_gaussian(d, 80_000, rng=rng)
R = radii(X)
center = math.sqrt(d)

plt.figure()
plt.hist(R, bins=80, density=True)
plt.axvline(center, linestyle='--', label="sqrt(d)")
plt.title("||X|| for X~N(0,I_d) concentrates near sqrt(d)")
plt.xlabel("||X||")
plt.ylabel("density")
plt.legend()
plt.show()


## Extra: Monte Carlo estimate of $|B_1|$ (useful for intuition)

If $X\sim\text{Uniform}([-1,1]^d)$, then:
$$|B_1| = 2^d \cdot \mathbb{P}(\|X\|<1)$$

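This is fine for small $d$, but fails for large $d$ because the hit probability $|B_1|/2^d$ decays faster than exponentially: at $d=20$ it is already about $2.5 \times 10^{-8}$, so a few hundred thousand samples will typically produce no hits at all.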
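
To see how quickly the hit probability collapses, the sketch below evaluates $|B_1|/2^d$ in log space with `math.lgamma`, using the standard formula $|B_1| = \pi^{d/2}/\Gamma(d/2+1)$. The helper name `acceptance_probability` is made up for this example so that the block runs on its own; the relative standard error is the usual binomial one, $\sqrt{(1-p)/(np)}$.

```python
import math


def acceptance_probability(d: int) -> float:
    """P(||X|| < 1) for X ~ Uniform([-1, 1]^d), i.e. |B_1| / 2^d.

    Hypothetical helper; computed in log space to stay stable for larger d.
    """
    log_vol = 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1.0)
    return math.exp(log_vol - d * math.log(2.0))


n = 300_000  # sample size assumed for this illustration
for d in [2, 5, 10, 20, 30]:
    p = acceptance_probability(d)
    expected_hits = n * p
    # Relative standard error of the Monte Carlo estimate 2^d * p_hat
    rel_err = math.sqrt((1.0 - p) / (n * p))
    print(f"d={d:2d}  p={p:.3e}  expected hits={expected_hits:.3g}  "
          f"relative std error={rel_err:.3g}")
```

Once the expected number of hits drops below one, the estimate is almost always exactly zero, no matter how the seed is chosen.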

In [None]:
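def estimate_volume_unit_ball_mc(d: int, n: int = 2_000_000, seed: int = 0) -> Tuple[float, float]:
    """Estimate |B_1| by sampling the cube [-1, 1]^d; returns (volume estimate, hit fraction)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, size=(n, d))  # n points uniform in the cube (n*d floats in memory)
    p = float(np.mean(np.linalg.norm(X, axis=1) < 1.0))  # fraction landing inside the unit ball
    est = (2.0 ** d) * p  # cube volume 2^d times the hit fraction
    return est, p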

for d in [2,3,5,8,10]:
    est, p = estimate_volume_unit_ball_mc(d, n=300_000, seed=0)
    print(f"d={d:2d}  MC |B1|≈{est:.6f}  exact={volume_unit_ball(d):.6f}  accept p={p:.6f}")


---

## 📚 Summary: The Curse and Blessing of High Dimensions

### 🎯 Key Phenomena Recap

#### 1️⃣ **Vanishing Volume (Lemma 10.5)**
- Unit ball volume $|B_1| \to 0$ as $d \to \infty$
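- The volume peaks at $d=5$ ($|B_1| \approx 5.26$) and decreases for every larger $d$
- Writing $V_d$ for $|B_1|$ in dimension $d$, we have $V_d = \frac{2\pi}{d} V_{d-2}$, so once $d > 4\pi \approx 12.57$ the volume at least halves with every two added dimensions
- By $d=100$, $|B_1| \approx 2.4 \times 10^{-40}$ (essentially zero!)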

**Implication:** "Most of the space" is **outside** the unit ball.
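
The recap above can be reproduced with the two-step recursion $V_d = \frac{2\pi}{d} V_{d-2}$, which follows from the Gamma-function formula for the unit ball volume. This is a minimal sketch; `unit_ball_volumes` is a hypothetical helper written for this example, separate from the toolkit's `volume_unit_ball`.

```python
import math
from typing import List


def unit_ball_volumes(d_max: int) -> List[float]:
    """Unit ball volumes for d = 0..d_max via V_d = (2*pi/d) * V_{d-2}."""
    vols = [1.0, 2.0]  # V_0 = 1 (a single point), V_1 = 2 (the interval [-1, 1])
    for d in range(2, d_max + 1):
        vols.append(2.0 * math.pi / d * vols[d - 2])
    return vols[: d_max + 1]


vols = unit_ball_volumes(100)
peak = max(range(len(vols)), key=vols.__getitem__)  # dimension with the largest volume
print(f"largest volume at d={peak}: {vols[peak]:.4f}")
for d in [2, 3, 5, 10, 20, 50, 100]:
    print(f"d={d:3d}  |B_1| = {vols[d]:.3e}")
```

The recursion avoids evaluating large Gamma values directly, and plain floats are still sufficient at $d = 100$.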

---

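#### 2️⃣ **Boundary Concentration (Lemma 10.7)**
- The ball of radius $1-\varepsilon$ holds a fraction $(1-\varepsilon)^d \le e^{-\varepsilon d}$ of the unit ball's volume, which $\to 0$ exponentially fast in $d$
- Almost all volume is in the outer $\varepsilon$-shell
- The interior becomes negligible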

**Implication:** High-dimensional objects are "**all surface, no interior**."
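
A few concrete numbers make the "all surface" picture tangible. The sketch below evaluates the shell fraction $1 - (1-\varepsilon)^d$ and the lower bound $1 - e^{-\varepsilon d}$; the helper name `outer_shell_fraction` is an assumption for illustration.

```python
import math


def outer_shell_fraction(d: int, eps: float) -> float:
    """Fraction of the unit ball's volume lying farther than 1 - eps from the origin."""
    return 1.0 - (1.0 - eps) ** d


eps = 0.01  # a shell only 1% of the radius thick
for d in [3, 100, 1000]:
    frac = outer_shell_fraction(d, eps)
    # Since (1 - eps)^d <= exp(-eps * d), the shell holds at least this much:
    lower = 1.0 - math.exp(-eps * d)
    print(f"d={d:4d}  shell fraction={frac:.5f}  lower bound={lower:.5f}")
```

With $\varepsilon = 0.01$, the shell holds about 3% of the volume in $d = 3$ but well over half of it by $d = 100$.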

---

#### 3️⃣ **Norm Concentration (Theorem 10.20)**
- Random vectors have norms tightly concentrated around $\sqrt{d}\cdot\sigma$
- Shell thickness is $O(1)$, but radius is $O(\sqrt{d})$
- Relative uncertainty $\to 0$ as $d \to \infty$

**Implication:** In high dimensions, **randomness becomes predictable**.

---

### 🗺️ Concept Map

```
     HIGH-DIMENSIONAL GEOMETRY
              |
    ┌─────────┴─────────┐
    |                   |
VOLUME EFFECTS    CONCENTRATION
    |                   |
    ├─ Vanishing        ├─ Annulus Theorem
    │  (Lemma 10.5)     │  (Thm 10.20)
    │                   │
    ├─ Boundary         ├─ Norm concentration
    │  (Lemma 10.7)     │  (tight shells)
    │                   │
    └─ Exact formulas   └─ Sub-Gaussian
       (Thm 10.8)           variables
              |
    ┌─────────┴─────────┐
    |                   |
 SAMPLING           APPLICATIONS
    |                   |
    ├─ Spheres          ├─ ML algorithms
    │  (Lemma 10.16)    │
    │                   ├─ Dimensionality
    ├─ Balls            │  reduction
    │  (Thm 10.18)      │
    │                   └─ Nearest neighbors
    └─ Projection
       pitfalls
```

---

### 📊 The Curse of Dimensionality

**Problems caused by high dimensions:**

1. **Volume explosion:** $|[-1,1]^d| = 2^d$ grows exponentially
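2. **Sparse data:** Points become far apart (typical pairwise distance grows like $\sqrt{d}$, e.g. $\approx \sqrt{2d}$ for independent $N(0, I_d)$ points)
3. **Nearest neighbors fail:** "Near" and "far" lose meaning (all pairwise distances become nearly equal in relative terms)
4. **Curse of sampling:** Rejection sampling from the cube becomes impractical (the acceptance rate $|B_1|/2^d$ collapses)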
5. **Computational cost:** Algorithms scale poorly with $d$

**Examples:**
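- **k-NN algorithm:** Often degrades beyond a few tens of dimensions (rule of thumb: $d > 20$) unless the data has low-dimensional structure, because all neighbors become nearly equally far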
- **Grid search:** Need exponentially many grid points
- **Density estimation:** Need exponentially many samples
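
The loss of contrast between "near" and "far" is easy to observe. The sketch below measures the relative gap between the farthest and nearest point from a random query; the data model (i.i.d. uniform points in $[0,1]^d$) and the helper name `distance_contrast` are assumptions made only for this illustration.

```python
import numpy as np


def distance_contrast(d: int, n: int = 2_000, seed: int = 0) -> float:
    """Return (farthest - nearest) / nearest distance from one query to n points.

    Points and query are i.i.d. Uniform([0, 1]^d); an illustrative data model.
    """
    rng = np.random.default_rng(seed)
    points = rng.uniform(size=(n, d))
    query = rng.uniform(size=d)
    dist = np.linalg.norm(points - query, axis=1)
    return float((dist.max() - dist.min()) / dist.min())


for d in [2, 10, 100, 1000]:
    print(f"d={d:4d}  relative contrast = {distance_contrast(d):.3f}")
```

A contrast close to zero means the nearest neighbor is barely closer than the farthest point, which is what undermines distance-based methods on unstructured high-dimensional data.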

---

### 🎁 The Blessing of Dimensionality

**But high dimensions also help:**

1. **Concentration:** Randomness becomes predictable (law of large numbers in action)
2. **Orthogonality:** Random vectors are nearly orthogonal (useful for projections)
3. **Linear separability:** Patterns that are tangled in low dimensions often become linearly separable after mapping to a higher-dimensional feature space
4. **Signal representation:** Rich feature spaces for machine learning

**Examples:**
- **Kernel methods:** Map to high-d spaces for better separation
- **Random projections:** Johnson-Lindenstrauss lemma uses high-d concentration
- **Compressed sensing:** Sparse signals in high dimensions
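
Near-orthogonality of random vectors (item 2 above) can be checked directly. The cosine of the angle between two independent random directions has standard deviation $1/\sqrt{d}$, so it shrinks toward zero as $d$ grows. A minimal sketch, with the hypothetical helper `cosine_stats`:

```python
from typing import Tuple

import numpy as np


def cosine_stats(d: int, n_pairs: int = 2_000, seed: int = 0) -> Tuple[float, float]:
    """Return (mean |cos|, std of cos) over pairs of independent N(0, I_d) vectors."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n_pairs, d))
    v = rng.standard_normal((n_pairs, d))
    cos = np.sum(u * v, axis=1) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
    )
    return float(np.abs(cos).mean()), float(cos.std())


for d in [10, 100, 1000]:
    mean_abs, std = cosine_stats(d)
    # std * sqrt(d) should stay close to 1 in every dimension
    print(f"d={d:5d}  mean|cos|={mean_abs:.4f}  std={std:.4f}  "
          f"std*sqrt(d)={std * np.sqrt(d):.3f}")
```

This is the same concentration effect that random-projection methods rely on: independent random directions interfere with each other less and less as the dimension increases.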

---

### 🛠️ Practical Guidelines

#### When Working with High-Dimensional Data:

**✅ DO:**
- Use dimensionality reduction (PCA, t-SNE, UMAP)
- Leverage concentration phenomena (norms, distances)
- Use theoretically sound sampling methods (Gaussian normalization)
- Work in subspaces or manifolds when possible
- Use algorithms designed for high dimensions

**❌ DON'T:**
- Trust low-dimensional intuition
- Use rejection sampling from cubes
- Rely on k-NN for very high $d$
- Forget to normalize/standardize features
- Ignore the curse when designing algorithms

---

### 🔧 Function Reference

#### **Geometric Predicates:**
- `in_ball(X, r)`, `in_unit_ball(X)`, `on_unit_sphere(X)`, `radii(X)`

#### **Gaussian Sampling:**
- `sample_spherical_gaussian(d, n)` → $N(0, I_d)$
- `sample_normalized_gaussian(d, n)` → $(2\pi)^{-1/2} Z$

#### **Exact Formulas:**
- `volume_unit_ball(d)`, `area_unit_sphere(d)`
- `log_volume_unit_ball(d)` (stable for large $d$)

#### **Uniform Sampling:**
- `sample_uniform_sphere(d, n)` → $\text{Uniform}(S_1)$
- `sample_uniform_ball(d, n)` → $\text{Uniform}(B_1)$

#### **Rejection Sampling:**
- `acceptance_prob_ball_in_cube(d)` → theoretical rate
- `rejection_sample_ball_from_cube(d, n)` (use only for small $d$!)

#### **Concentration:**
- `annulus_shell_probability_lower_bound(beta)`
- `empirical_shell_probability(d, beta, n)`

---

### 🚀 Next Steps & Further Reading

#### **Advanced Topics:**

1. **Johnson-Lindenstrauss Lemma**
   - Random projections preserve distances
   - Applications in dimensionality reduction

2. **Concentration of Measure**
   - Isoperimetric inequalities
   - Measure concentration phenomenon

3. **Manifold Learning**
   - High-dimensional data on low-dimensional manifolds
   - t-SNE, UMAP, autoencoders

4. **Curse in Machine Learning**
   - Feature selection and regularization
   - Why deep learning works despite high dimensions

5. **Random Matrix Theory**
   - Eigenvalue distributions
   - Applications in statistics and ML

---

#### **Recommended Resources:**

**Books:**
- *High-Dimensional Probability* by Roman Vershynin
- *High-Dimensional Statistics* by Martin Wainwright  
- *Concentration Inequalities* by Boucheron et al.

**Papers:**
- Donoho: "High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality"
- Indyk & Motwani: "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality"

**Online:**
- 3Blue1Brown: High-dimensional sphere visualizations
- Distill.pub: Interactive ML visualizations
- Seeing Theory: Probability visualizations

---

## 🎓 Final Thoughts

High-dimensional geometry is **weird, wonderful, and essential** for modern data science:

- **Weird:** Our intuition fails completely
- **Wonderful:** Beautiful mathematical phenomena emerge
- **Essential:** Understanding it is crucial for ML/AI

**Remember:**
- In high dimensions, **almost everything is on the surface**
- **Random becomes predictable** through concentration
- **Your 3D intuition will mislead you**: trust the mathematics!

**Keep exploring, stay curious, and embrace the strange beauty of high dimensions!** 🌌🚀

---

## ✅ Complete Function Index (Copy/Paste Ready)

### Geometry predicates
- `in_ball(X, r)` / `in_unit_ball(X)` / `on_unit_sphere(X)` / `radii(X)`

### Gaussian models (Models 10.3–10.4)
- `sample_spherical_gaussian(d,n)` ($N(0,I)$)
- `sample_normalized_gaussian(d,n)` (scaled by $(2\pi)^{-1/2}$)
- `expected_norm_sq_spherical(d)` / `expected_norm_sq_normalized(d)`

### Scaling (Lemma 10.7)
- `scaling_volume_ratio(d, eps)`

### Exact volumes/areas (Theorem 10.8)
- `volume_unit_ball(d)` / `log_volume_unit_ball(d)`
- `area_unit_sphere(d)` / `log_area_unit_sphere(d)`

### Uniform sampling (Lemma 10.16, Thm 10.18)
- `sample_uniform_sphere(d,n)`
- `sample_uniform_ball(d,n)`

### Rejection sampling + projections (Sec 10.4.1, Ex 10.15)
- `sample_uniform_disk_via_rejection(n)`
- `project_to_circle(XY)`
- `acceptance_prob_ball_in_cube(d)`
- `rejection_sample_ball_from_cube(d,n)` (use only for small $d$)

### Annulus theorem (Thm 10.20)
- `annulus_shell_probability_lower_bound(beta)`
- `empirical_shell_probability(d, beta, n, seed)`

### Monte Carlo volume estimation
- `estimate_volume_unit_ball_mc(d, n, seed)`