In [None]:
# === Environment Setup ===
import os, sys, math, time, random, json, textwrap, warnings
from typing import Callable
import numpy as np, pandas as pd, matplotlib.pyplot as plt
from scipy.integrate import quad, nquad, simpson
from scipy.special import roots_legendre
from scipy.stats import norm
from IPython.display import Markdown, display

# --- Configuration ---
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams.update({'figure.dpi': 130, 'font.size': 12, 'axes.titlesize': 'x-large',
    'axes.labelsize': 'large', 'xtick.labelsize': 'medium', 'ytick.labelsize': 'medium'})
np.set_printoptions(suppress=True, linewidth=120, precision=8)

# --- Utility Functions ---
def note(msg, **kwargs):
    display(Markdown(f"<div class='alert alert-info'>📝 {textwrap.fill(msg, width=100)}</div>"))
def sec(title):
    print(f"\n{100*'='}\n| {title.upper()} |\n{100*'='}")

note("Environment initialized.")

# Part 2: Core Numerical Methods
## Chapter 2.7: Numerical Integration

### Table of Contents
1.  [Deterministic Quadrature](#1.-Deterministic-Quadrature)
    *   [1.1 Newton-Cotes Formulas](#1.1-Newton-Cotes-Formulas)
    *   [1.2 Gaussian Quadrature](#1.2-Gaussian-Quadrature)
    *   [1.3 Convergence Rate Comparison](#1.3-Convergence-Rate-Comparison)
2.  [High-Dimensional Integration: Monte Carlo Methods](#2.-High-Dimensional-Integration:-Monte-Carlo-Methods)
    *   [2.1 The Curse of Dimensionality](#2.1-The-Curse-of-Dimensionality)
    *   [2.2 Variance Reduction Techniques](#2.2-Variance-Reduction-Techniques)
    *   [2.3 Markov Chain Monte Carlo (MCMC)](#2.3-Markov-Chain-Monte-Carlo-(MCMC))
3.  [Application 1: Bayesian Inference](#3.-Application-1:-Bayesian-Inference)
4.  [Application 2: Consumption-Based Asset Pricing](#4.-Application-2:-Consumption-Based-Asset-Pricing)
5.  [Chapter Summary](#5.-Chapter-Summary)
6.  [Exercises](#6.-Exercises)

### Introduction: The Integral as an Expectation
Integration is a fundamental operation in economic theory, used to calculate expected utility, consumer surplus, present values, and posterior distributions in Bayesian econometrics. For any random variable $X$ with probability density function $p(x)$, the expected value of a function $g(X)$ is an integral:
$$ E[g(X)] = \int g(x) p(x) dx $$
While analytical integration is possible for simple functions, most real-world problems involve integrands that are too complex to be solved by hand. For these cases, we must turn to **numerical integration** (also called **numerical quadrature**).

This notebook explores the main families of methods:
1.  **Deterministic Quadrature**: A set of techniques (like Gaussian quadrature) that are highly accurate for low-dimensional integrals of smooth functions.
2.  **Monte Carlo Methods**: A powerful, simulation-based approach essential for the high-dimensional integration problems that frequently arise in modern economic modeling, including the workhorse of modern Bayesian econometrics, **Markov Chain Monte Carlo (MCMC)**.

### 1. Deterministic Quadrature
**Quadrature** refers to methods that approximate a definite integral by a weighted sum of function values at specified points (nodes): 
$$ \int_a^b f(x) dx \approx \sum_{i=1}^n w_i f(x_i) $$

#### 1.1 Newton-Cotes Formulas
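This family of methods is derived by integrating a low-degree polynomial interpolant of the function through evenly spaced nodes with spacing $h$.
- **Trapezoidal Rule:** A 1st-degree polynomial (a line) is fitted between two adjacent nodes. The error on a single panel is $O(h^3 f''(\xi))$; summed over the $(b-a)/h$ panels of the composite rule, the total error is $O(h^2)$.
- **Simpson's Rule:** A 2nd-degree polynomial (a parabola) is fitted over three adjacent nodes. Because the nodes are symmetric about the midpoint, the cubic error term cancels and the rule is exact for cubics. The error per pair of panels is therefore surprisingly small, $O(h^5 f^{(4)}(\xi))$, and the composite error is $O(h^4)$.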
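
The composite versions of both rules can be sketched in a few lines. This is a minimal illustration, not a library-grade implementation: the helper names `composite_trapezoid` and `composite_simpson` and the test integrand $e^x$ on $[0, 1]$ are choices made here for the example. Doubling the number of panels should cut the trapezoid error by roughly 4 and the Simpson error by roughly 16.

```python
import numpy as np

def composite_trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal panels."""
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

def composite_simpson(f, a, b, n):
    """Composite Simpson's rule with n equal panels (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    # Weights 1, 4, 2, 4, ..., 2, 4, 1 on the n + 1 nodes
    return h / 3 * (y[0] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum() + y[-1])

exact = np.e - 1.0  # integral of exp(x) over [0, 1]
for n in (8, 16, 32):
    err_t = abs(composite_trapezoid(np.exp, 0.0, 1.0, n) - exact)
    err_s = abs(composite_simpson(np.exp, 0.0, 1.0, n) - exact)
    print(f"n={n:3d}  trapezoid error={err_t:.2e}  Simpson error={err_s:.2e}")
```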

#### 1.2 Gaussian Quadrature
This powerful idea goes back to Carl Friedrich Gauss. Newton-Cotes rules fix the nodes on an even grid and choose only the weights. Gauss realized that by treating the node locations as free parameters too, and solving for the optimal ones, one could achieve a much higher order of accuracy for the same number of function evaluations.

**Gaussian Quadrature** optimally chooses *both* the nodes $x_i$ and the weights $w_i$ by linking the problem to the theory of **orthogonal polynomials**: the nodes are the roots of the degree-$n$ polynomial that is orthogonal with respect to the weight function (the Legendre polynomial for a constant weight on $[-1, 1]$). This allows an $n$-point rule to integrate polynomials of degree up to $2n-1$ exactly, making it extremely efficient for functions that are well approximated by polynomials.
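
The $2n-1$ exactness property is easy to check numerically. The sketch below uses `scipy.special.roots_legendre` for the nodes and weights; the wrapper `gauss_legendre` and the monomial test integrands are illustrative choices, not part of the original notebook. With $n = 3$ nodes, $x^5$ should be integrated to rounding error while $x^6$ should not.

```python
import numpy as np
from scipy.special import roots_legendre

def gauss_legendre(f, a, b, n):
    """n-point Gauss-Legendre rule on [a, b]."""
    x, w = roots_legendre(n)                  # nodes and weights on [-1, 1]
    t = 0.5 * (b - a) * x + 0.5 * (b + a)     # map the nodes to [a, b]
    return 0.5 * (b - a) * np.sum(w * f(t))

n = 3
# Degree 2n - 1 = 5: exact up to rounding (integral of x^5 on [0, 1] is 1/6)
err_deg5 = abs(gauss_legendre(lambda x: x**5, 0.0, 1.0, n) - 1 / 6)
# Degree 2n = 6: no longer exact (integral of x^6 on [0, 1] is 1/7)
err_deg6 = abs(gauss_legendre(lambda x: x**6, 0.0, 1.0, n) - 1 / 7)
print(f"error for x^5: {err_deg5:.2e}, error for x^6: {err_deg6:.2e}")
```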

#### 1.3 Convergence Rate Comparison
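For smooth integrands, the superior theoretical properties of Gaussian Quadrature translate into much faster convergence in practice: for a given number of function evaluations, its error is typically orders of magnitude smaller than that of the trapezoidal or Simpson's rule.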

![Convergence Rate of Quadrature Rules](../images/02-Numerical-Methods/quadrature_convergence.png)
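
A comparison of this kind can be reproduced with a short script. The following is a sketch under assumed settings: the integrand $1/(1+x^2)$ on $[0, 1]$ (exact value $\pi/4$) and the node counts are chosen here for illustration and are not necessarily those behind the figure. Each rule is given the same number of function evaluations.

```python
import numpy as np
from scipy.integrate import simpson, trapezoid
from scipy.special import roots_legendre

f = lambda x: 1.0 / (1.0 + x**2)   # illustrative integrand; integral on [0, 1] is pi/4
exact = np.pi / 4

errors = {}
for n in (3, 5, 9):                # number of nodes (odd, so Simpson's rule applies directly)
    x = np.linspace(0.0, 1.0, n)
    y = f(x)
    nodes, weights = roots_legendre(n)                     # Gauss-Legendre on [-1, 1]
    gauss = 0.5 * np.sum(weights * f(0.5 * nodes + 0.5))   # mapped to [0, 1]
    errors[n] = (abs(trapezoid(y, x=x) - exact),
                 abs(simpson(y, x=x) - exact),
                 abs(gauss - exact))
    print(f"n={n}: trapezoid={errors[n][0]:.1e}  Simpson={errors[n][1]:.1e}  Gauss={errors[n][2]:.1e}")
```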

### 2. High-Dimensional Integration: Monte Carlo Methods

#### 2.1 The Curse of Dimensionality
Quadrature methods suffer from the **curse of dimensionality**. A tensor-product rule with $m$ nodes per dimension needs $m^d$ function evaluations, so the cost of maintaining accuracy grows exponentially with the number of dimensions $d$.

**Monte Carlo integration** breaks this curse. It relies on the Law of Large Numbers for consistency, and the Central Limit Theorem tells us that its error decreases at a rate of $O(1/\sqrt{N})$, **regardless of the dimension** (the dimension can change the constant, through the variance of the integrand, but not the rate). While this is slow in 1D, its immunity to dimensionality often makes it the only feasible method for high-dimensional problems.

**Intuition:** Imagine trying to estimate the average height of people in a country. A quadrature method would be like dividing the population into a grid and measuring one person in each grid cell. With one characteristic (just age), this is easy. With ten (age, income, location, etc.), the number of grid cells explodes. A Monte Carlo method is like picking people at random and averaging their heights. The accuracy of this random average depends on the number of people you pick, not on how many characteristics (dimensions) they have.
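
A minimal sketch of plain Monte Carlo integration over the unit cube $[0, 1]^d$ follows. The helper `mc_integrate`, the integrand $e^{-\lVert x \rVert^2}$ (chosen because its exact integral is available through `math.erf`), the dimension, and the seed are all assumptions made for this example. Increasing $N$ by a factor of 100 should shrink the standard error by roughly a factor of 10.

```python
import math
import numpy as np

def mc_integrate(f, d, n, rng):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]^d.

    Returns (estimate, standard_error)."""
    x = rng.random((n, d))          # n uniform draws in the unit cube
    vals = f(x)
    return vals.mean(), vals.std(ddof=1) / math.sqrt(n)

f = lambda x: np.exp(-np.sum(x**2, axis=1))
d = 10
exact = (0.5 * math.sqrt(math.pi) * math.erf(1.0)) ** d   # product of d one-dimensional integrals

rng = np.random.default_rng(0)
results = []
for n in (1_000, 100_000):
    est, se = mc_integrate(f, d, n, rng)
    results.append((n, est, se))
    print(f"N={n:>7,}  estimate={est:.5f}  std. error={se:.5f}  exact={exact:.5f}")

# For comparison: a tensor grid with 10 nodes per dimension
print(f"Grid evaluations needed with 10 nodes per dimension: {10**d:,}")
```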

#### 2.2 Variance Reduction Techniques
The convergence rate of standard MC is slow. **Variance reduction techniques** can dramatically reduce the error for a given number of draws, mostly by shrinking the constant in front of the $O(1/\sqrt{N})$ rate.
- **Quasi-Monte Carlo (QMC):** Uses low-discrepancy sequences (like Sobol sequences) that fill the integration space more evenly than pseudo-random numbers.
- **Antithetic Variates:** Exploits symmetries in the distribution. If we draw `z`, we also use `-z`.
- **Control Variates:** Uses a correlated variable $X$ with a known mean $E[X]$ to reduce the variance of the estimate of an unknown mean $E[Y]$. The improved estimator is $Y_{CV} = Y - \lambda(X - E[X])$.
- **Importance Sampling:** Draws from a different proposal distribution $q(x)$ and re-weights the results to focus sampling on important regions.
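
Two of these techniques can be sketched on a toy problem: estimating $E[e^Z]$ for $Z \sim N(0, 1)$, whose exact value is $e^{1/2}$. The target, the sample size, and the seed are assumptions made for this illustration. The control variate is $X = Z$ with known mean 0, and $\lambda$ is estimated from the same sample as $\text{Cov}(Y, X)/\text{Var}(X)$, which introduces a small bias that vanishes as $N$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
z = rng.standard_normal(n)
y = np.exp(z)                                   # plain MC draws of g(Z) = exp(Z)

# Antithetic variates: pair each z with -z (n/2 pairs, so n evaluations in total)
z_half = z[: n // 2]
y_anti = 0.5 * (np.exp(z_half) + np.exp(-z_half))

# Control variate: X = Z, E[X] = 0, lambda = Cov(Y, X) / Var(X)
lam = np.cov(y, z)[0, 1] / np.var(z, ddof=1)
y_cv = y - lam * (z - 0.0)

se_plain = y.std(ddof=1) / np.sqrt(n)
se_anti = y_anti.std(ddof=1) / np.sqrt(n // 2)
se_cv = y_cv.std(ddof=1) / np.sqrt(n)

print(f"exact        : {np.exp(0.5):.5f}")
print(f"plain MC     : {y.mean():.5f}  (s.e. {se_plain:.5f})")
print(f"antithetic   : {y_anti.mean():.5f}  (s.e. {se_anti:.5f})")
print(f"control var. : {y_cv.mean():.5f}  (s.e. {se_cv:.5f})")
```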

A Sobol sequence, for example, is constructed to fill space as evenly as possible, avoiding the clustering and gaps that can occur with pseudo-random numbers.

![Comparison of Sobol Sequence and Pseudo-Random Points](../images/02-Numerical-Methods/sobol_vs_random.png)
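
The effect on an actual integral can be checked with `scipy.stats.qmc.Sobol`. The sketch below is illustrative only: the integrand $e^{-\lVert x \rVert^2}$ on $[0, 1]^5$, the sample size $2^{14}$, and the seeds are assumptions made for this example. For a smooth integrand like this one, the scrambled Sobol estimate is typically much closer to the exact value than the pseudo-random one.

```python
import math
import numpy as np
from scipy.stats import qmc

d, m = 5, 14                                   # dimension and log2 of the sample size
f = lambda x: np.exp(-np.sum(x**2, axis=1))
exact = (0.5 * math.sqrt(math.pi) * math.erf(1.0)) ** d

sobol = qmc.Sobol(d=d, scramble=True, seed=0)
x_qmc = sobol.random_base2(m=m)                # 2**m scrambled Sobol points in [0, 1)^d
x_mc = np.random.default_rng(0).random((2**m, d))   # same number of pseudo-random points

err_qmc = abs(f(x_qmc).mean() - exact)
err_mc = abs(f(x_mc).mean() - exact)
print(f"N = {2**m:,}:  |error| Sobol = {err_qmc:.2e},  |error| pseudo-random = {err_mc:.2e}")
```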

#### 2.3 Markov Chain Monte Carlo (MCMC)
What if the distribution we want to integrate over is too complex to sample from directly? This is the central problem in Bayesian econometrics. **MCMC methods** solve this by constructing a **Markov chain** whose stationary distribution is the target distribution (e.g., the posterior distribution). After a "burn-in" period, the states of the chain are valid (though correlated) draws from the target distribution.

The Metropolis-Hastings algorithm is one of the most important algorithms of the 20th century. It was developed by Nicholas Metropolis, Arianna Rosenbluth, Marshall Rosenbluth, Augusta Teller, and Edward Teller in 1953 to solve problems in statistical physics, and later generalized by W. K. Hastings in 1970.

The **Metropolis-Hastings algorithm** is a general MCMC algorithm:
1.  Start at a point $\theta_t$.
2.  Propose a new point $\theta_{prop}$ from a proposal distribution $q(\theta_{prop} | \theta_t)$.
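3.  Calculate the acceptance probability: $\alpha = \min \left(1, \frac{p(\theta_{prop}) q(\theta_t|\theta_{prop})}{p(\theta_t) q(\theta_{prop}|\theta_t)}\right)$, where $p$ is the target density. Because $p$ enters only as a ratio, it needs to be known only up to a normalizing constant, and for a symmetric proposal the $q$ terms cancel.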
4.  Accept the new point (set $\theta_{t+1} = \theta_{prop}$) with probability $\alpha$; otherwise, reject and set $\theta_{t+1} = \theta_t$.
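
The four steps above can be sketched as a random-walk Metropolis sampler, the special case of Metropolis-Hastings with a symmetric Gaussian proposal (so the $q$ terms cancel). The function name `random_walk_metropolis`, the unnormalized $N(1, 2^2)$ target, the step size, and the burn-in length are all illustrative assumptions. Densities are handled in logs for numerical stability.

```python
import numpy as np

def random_walk_metropolis(log_target, theta0, n_draws, step, rng):
    """Random-walk Metropolis with proposal N(theta_t, step^2).

    log_target is the log of the target density, up to an additive constant."""
    draws = np.empty(n_draws)
    theta, logp = theta0, log_target(theta0)
    n_accept = 0
    for t in range(n_draws):
        prop = theta + step * rng.standard_normal()      # step 2: propose
        logp_prop = log_target(prop)
        # Steps 3-4: accept with probability min(1, p(prop) / p(theta))
        if np.log(rng.random()) < logp_prop - logp:
            theta, logp = prop, logp_prop
            n_accept += 1
        draws[t] = theta                                 # on rejection, repeat theta_t
    return draws, n_accept / n_draws

log_target = lambda th: -0.5 * ((th - 1.0) / 2.0) ** 2   # unnormalized N(1, 2^2)
rng = np.random.default_rng(0)
draws, acc_rate = random_walk_metropolis(log_target, theta0=0.0, n_draws=50_000, step=4.0, rng=rng)
kept = draws[5_000:]                                     # discard burn-in
print(f"acceptance rate={acc_rate:.2f}  mean={kept.mean():.3f}  std={kept.std():.3f}")
```

The post-burn-in mean and standard deviation should be close to 1 and 2. Because successive draws are correlated, the effective sample size is smaller than the number of retained draws.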

### 3. Application 1: Bayesian Inference
In Bayesian statistics, we update our prior beliefs about a parameter $\theta$ after observing data $D$ using Bayes' Rule to get a posterior distribution:
$$ p(\theta | D) = \frac{p(D | \theta) \cdot p(\theta)}{p(D)} = \frac{\text{Likelihood} \cdot \text{Prior}}{\text{Marginal Likelihood}} $$
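The marginal likelihood $p(D) = \int p(D | \theta) p(\theta) d\theta$ is itself an integral, as is every posterior expectation $E[g(\theta) | D]$. We can calculate these using quadrature (if the dimension is low) or MCMC (for harder problems).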

![Bayesian Posterior Estimation: MCMC vs. Quadrature](../images/02-Numerical-Methods/mcmc_posterior.png)
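
For a one-parameter problem, the quadrature route takes only a few lines. The sketch below uses hypothetical data (7 successes in 10 Bernoulli trials) and a flat prior on $\theta \in [0, 1]$; neither comes from the figure above. This conjugate setup is chosen because the posterior is known to be $\text{Beta}(8, 4)$ with mean $2/3$, which gives an exact benchmark.

```python
from scipy.integrate import quad

k, n = 7, 10   # hypothetical data: k successes in n trials

def unnorm_posterior(theta):
    """Likelihood kernel times a flat prior on [0, 1]."""
    return theta**k * (1.0 - theta) ** (n - k)

# Normalizing constant (proportional to the marginal likelihood)
marginal, _ = quad(unnorm_posterior, 0.0, 1.0)
# Posterior mean: E[theta | D] = (integral of theta * kernel) / (integral of kernel)
numerator, _ = quad(lambda th: th * unnorm_posterior(th), 0.0, 1.0)
post_mean = numerator / marginal

print(f"posterior mean via quadrature: {post_mean:.6f}  (exact: {2/3:.6f})")
```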

### 4. Application 2: Consumption-Based Asset Pricing
A central problem in finance is to price an asset. The price of any asset is its expected discounted payoff. In a consumption-based model, the discount factor is not constant but is related to the agent's marginal utility. The price of a simple one-period bond that pays 1 unit of consumption next period is given by:
$$ P = E_t \left[ \beta \frac{u'(c_{t+1})}{u'(c_t)} \right] $$
where the expectation is taken over the distribution of next period's consumption, $c_{t+1}$. If we assume log utility $u(c) = \ln(c)$ and that consumption growth $g = c_{t+1}/c_t$ is log-normally distributed, we can price the bond via Monte Carlo integration.
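
This particular case also has a closed form, which is useful as a benchmark for any simulation. Writing $\ln g \sim N(\mu_g, \sigma_g^2)$ (the symbols $\mu_g$ and $\sigma_g$ are notation introduced here), log utility gives $u'(c) = 1/c$, so the discount factor is $\beta g^{-1}$. Since $\ln(g^{-1}) = -\ln g \sim N(-\mu_g, \sigma_g^2)$, the log-normal mean formula yields:
$$ P = \beta E_t\left[ g^{-1} \right] = \beta \exp\left( -\mu_g + \tfrac{1}{2}\sigma_g^2 \right) $$
A Monte Carlo estimate of $P$ should agree with this expression up to simulation error of order $1/\sqrt{N}$.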

In [None]:
sec("Pricing a Risk-Free Bond with a Stochastic Discount Factor")
beta = 0.99
# Assume log consumption growth is N(mu, sigma^2)
mu_g, sigma_g = 0.02, 0.015
n_sims = 1_000_000

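# Generate draws for consumption growth g = c_{t+1} / c_t
# (a seeded generator makes the result reproducible; the seed value is arbitrary)
rng = np.random.default_rng(42)
g_draws = rng.lognormal(mean=mu_g, sigma=sigma_g, size=n_sims)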

# The stochastic discount factor (SDF) for log utility is M = beta * g**(-1)
sdf_draws = beta * g_draws**(-1)

# The bond price is the expected value of the SDF
bond_price = np.mean(sdf_draws)
note(f"The price of the one-period risk-free bond is: {bond_price:.4f}")
note(f"This implies an interest rate of {(1/bond_price - 1)*100:.2f}%")

### 5. Chapter Summary
- **Low-Dimensional Problems:** For smooth, low-dimensional integrals, deterministic quadrature is preferred. **Gaussian Quadrature** is superior to Newton-Cotes methods (like Trapezoid or Simpson's) as it achieves higher accuracy with fewer function evaluations. `scipy.integrate.quad` is the standard tool.
- **High-Dimensional Problems:** Quadrature suffers from the **curse of dimensionality**: the number of nodes in a tensor-product rule grows exponentially with the dimension. **Monte Carlo integration** is often the only feasible method for high-dimensional problems. Its error rate is $O(1/\sqrt{N})$ regardless of dimension.
- **Variance Reduction:** Standard Monte Carlo converges slowly. Its efficiency can be dramatically improved using variance reduction techniques like **Quasi-MC (Sobol sequences)**, **Antithetic Variates**, and **Control Variates**.
- **Complex Distributions:** For integrals where the probability distribution is too complex to sample from directly, **Markov Chain Monte Carlo (MCMC)** methods like Metropolis-Hastings are the primary tool. They are the foundation of modern Bayesian econometrics.

### 6. Exercises

1.  **Producer Surplus:** Using an inverse demand curve $P_D(Q) = (Q-10)^2+10$ and an inverse supply curve $P_S(Q) = Q^2 + 2Q + 6$, find the equilibrium price and quantity by finding the root of the excess demand function $P_D(Q) - P_S(Q)$. Then, calculate the **producer surplus** (the area above the supply curve and below the market price, from $Q = 0$ to the equilibrium quantity) using `scipy.integrate.quad`.

2.  **Importance Sampling:** You want to estimate the probability that a standard normal random variable $Z$ is greater than 5, i.e., $P(Z > 5)$. This is a rare event, so standard Monte Carlo is inefficient. Estimate this probability using importance sampling with an exponential distribution (shifted to start at 5) as your proposal. Compare the variance of your estimate to that of a standard MC estimate.

3.  **Pricing a Put Option with Control Variates:** A European put option has a payoff of $\max(K - S_T, 0)$. Price this option using a Monte Carlo simulation, but improve its efficiency by using the stock price $S_T$ as a control variate. We know analytically that, under the risk-neutral measure, $E[e^{-rT}S_T] = S_0$. Compare the variance of the control variate estimate to that of a standard MC estimate.

4.  **Gibbs Sampling:** The Gibbs sampler is an MCMC algorithm for multivariate distributions. It works by iteratively sampling from the *conditional* distribution of each variable, holding the others fixed. Consider a bivariate normal distribution. The conditional distribution $p(x|y)$ is also normal. Write a Gibbs sampler that generates draws from a bivariate normal distribution by only sampling from the two conditional normal distributions. Plot a scatter plot of your samples.

5.  **Multidimensional Integration:** Calculate the volume of a sphere of radius 1, which is $\frac{4}{3}\pi$. The integral is $\int_{-1}^{1} \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \int_{-\sqrt{1-x^2-y^2}}^{\sqrt{1-x^2-y^2}} dz dy dx$. Use `scipy.integrate.nquad` to compute this value.