Below are two different—yet complementary—ways to see why the variance estimate from my simulations converges to the “true” variance of the process being simulated.



---

## 1. Large \(N\) and the Law of Large Numbers



Wouldn't it be nice to just say that “running the simulation many times” converges on the correct variance? Here is why that works:

1. **Numerical experiments rely on random draws.** Each run of the simulation produces an outcome $X_i$.
2. **Variance in a random sample.** If you collect $N$ outcomes (i.e., run the simulation $N$ times), the sample variance $s^2$ is computed in the usual way:

   $$
   s^2 = \frac{1}{N-1} \sum_{i=1}^{N} \bigl(X_i - \bar{X}\bigr)^2,
   \quad
   \text{where}
   \quad
   \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i.
   $$

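3. **Law of Large Numbers for variance.** As $N$ grows large, the sample variance $s^2$ will, *with high probability*, get closer and closer to the true variance $\sigma^2$. This follows from the Law of Large Numbers applied to both $X_i$ and $X_i^2$ (assuming the $X_i$ are i.i.d. with finite variance), since $s^2$ is a continuous function of their sample averages.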

Basically, if we throw enough random draws into our estimate, the fluctuations shrink, and we get a stable estimate for the variance.
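To see this shrinking in practice, here is a minimal sketch. The exponential distribution (scale 2, true variance 4), the seed, and the sample sizes are assumptions chosen for illustration; they are not the simulation from these notes.

```python
import numpy as np

# Assumption for illustration: the "simulation" is a draw from an
# exponential distribution with scale 2, whose true variance is 2**2 = 4.
rng = np.random.default_rng(seed=0)
true_variance = 4.0
draws = rng.exponential(scale=2.0, size=1_000_000)

# Sample variance (ddof=1 gives the 1/(N-1) form) on growing prefixes.
sample_sizes = [100, 10_000, 1_000_000]
estimates = {n: np.var(draws[:n], ddof=1) for n in sample_sizes}

for n, s2 in estimates.items():
    error = abs(s2 - true_variance)
    print(f"N = {n:>9,}: s^2 = {s2:.4f}, |error| = {error:.4f}")
```

The error does not have to fall at every step, but its typical size shrinks as $N$ grows.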



## Note on ways of obtaining variance via simulation

##### Expressing the Sample Variance via Sample Moments

Recall that the sample variance $s^2$ can be written in terms of the first and second sample moments:

$$
s^2
\;=\;
\frac{1}{N - 1}\sum_{i=1}^{N} \bigl(X_i - \bar{X}\bigr)^2
\;=\;
\frac{N}{N - 1}\Bigl(\overline{X^2} \;-\; \bigl(\bar{X}\bigr)^2\Bigr)
\;\approx\;
\overline{X^2} \;-\; \bigl(\bar{X}\bigr)^2
\quad \text{for large } N,
$$

where

$$
\bar{X}
=
\frac{1}{N}\sum_{i=1}^N X_i,
\quad
\overline{X^2}
=
\frac{1}{N}\sum_{i=1}^N X_i^2.
$$

The distribution and convergence of $\bar{X}$ and $\overline{X^2}$
is key to the convergence of $s^2$.
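A small check of the moment form, as a sketch on made-up normal data (the distribution, seed, and sample size are assumptions). The plain difference $\overline{X^2} - (\bar{X})^2$ matches the $1/N$ variance exactly, and the $1/(N-1)$ version differs from it only by the factor $N/(N-1)$.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
x = rng.normal(loc=3.0, scale=2.0, size=1_000)  # illustrative data only
n = x.size

mean_x = x.mean()            # first sample moment
mean_x2 = (x ** 2).mean()    # second sample moment

moment_form = mean_x2 - mean_x ** 2          # 1/N version of the variance
unbiased_form = n / (n - 1) * moment_form    # 1/(N-1) version

# np.var defaults to ddof=0 (the 1/N form); ddof=1 gives the 1/(N-1) form.
print(np.isclose(moment_form, np.var(x)))
print(np.isclose(unbiased_form, np.var(x, ddof=1)))
```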

---

This is something I got wrong on the final, so I need a richer understanding of it.

#### CLT for Means and Second Moments

#### CLT for the Sample Mean

The CLT states that if
$X_1, X_2, \dots, X_N$
are i.i.d. with mean $\mu$ and variance $\sigma^2$, then for large $N$,

$$
\sqrt{N}\,\Bigl(\bar{X} - \mu\Bigr)
\;\xrightarrow{d}\;
\mathcal{N}(0,\,\sigma^2).
$$

In other words, $\bar{X}$ is approximately normally distributed around $\mu$, with a standard deviation shrinking like $1/\sqrt{N}$.
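The statement can be checked by repeating a small experiment many times. This is a sketch under assumed inputs: $X \sim \mathrm{Uniform}(0,1)$, so $\mu = 1/2$ and $\sigma^2 = 1/12$; the seed, $N$, and number of replications are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n, replications = 1_000, 5_000

# Assumption for illustration: X ~ Uniform(0, 1), so mu = 1/2, sigma^2 = 1/12.
mu, sigma = 0.5, np.sqrt(1 / 12)

# Each row is one independent experiment of n draws.
sample_means = rng.uniform(0, 1, size=(replications, n)).mean(axis=1)
scaled = np.sqrt(n) * (sample_means - mu)

# If the CLT holds, `scaled` should look like Normal(0, sigma^2).
observed_std = scaled.std(ddof=1)
coverage = np.mean(np.abs(scaled) < 1.96 * sigma)

print(f"std of sqrt(N)(Xbar - mu): {observed_std:.4f} (sigma = {sigma:.4f})")
print(f"fraction within 1.96 sigma: {coverage:.3f} (normal theory: 0.95)")
```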

#### CLT for the Sample Second Moment

Similarly, if the underlying $X_i$ have a finite fourth moment, one can apply the CLT to $X_i^2$. Writing $\mu_2 = \mathbb{E}[X^2]$ for the true second moment, this gives


$$
\sqrt{N}\,\Bigl(\overline{X^2} - \mu_2\Bigr)
\;\xrightarrow{d}\;
\mathcal{N}\!\bigl(0,\,\mathrm{Var}(X^2)\bigr).
$$



This basically means $\overline{X^2}$ clusters around the true second moment $\mu_2$ for large $N$, with fluctuations of order $1/\sqrt{N}$.

---

## 3. Putting It All Together

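Because

$$
s^2 \;\approx\; \overline{X^2} - (\bar{X})^2,
$$

we can think of $s^2$ as a smooth function of two random variables, $\overline{X^2}$ and $\bar{X}$, each obeying its own CLT (and jointly a bivariate one). The delta method then gives a CLT for $s^2$ itself, provided the fourth moment is finite:

$$
\sqrt{N}\,\bigl(s^2 - \sigma^2\bigr)
\;\xrightarrow{d}\;
\mathcal{N}\!\bigl(0,\,\mu_4 - \sigma^4\bigr),
\qquad
\mu_4 = \mathbb{E}\bigl[(X - \mu)^4\bigr].
$$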

Basically, the CLT tells us that both $\bar{X}$ and $\overline{X^2}$ vary normally around their true means $\mu$ and $\mu_2$. The sample variance $s^2$ is just a combination of these, so *it, too*, converges on $\sigma^2$ (the true variance) and fluctuates normally for large $N$.
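A numerical check of the normal fluctuation of $s^2$, as a sketch under assumed inputs. For normal data the fourth central moment is $3\sigma^4$, so the delta-method limit variance $\mu_4 - \sigma^4$ equals $2\sigma^4$. The distribution, seed, and sizes below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n, replications = 2_000, 4_000

# Assumption for illustration: X ~ Normal(1, 2^2), so sigma^2 = 4 and the
# fourth central moment is mu_4 = 3 * sigma^4 = 48.
sigma2 = 4.0
mu4 = 3 * sigma2 ** 2
limit_std = np.sqrt(mu4 - sigma2 ** 2)  # sqrt(32), about 5.66

samples = rng.normal(loc=1.0, scale=2.0, size=(replications, n))
s2 = samples.var(axis=1, ddof=1)        # one sample variance per row
scaled = np.sqrt(n) * (s2 - sigma2)

print(f"mean of sqrt(N)(s^2 - sigma^2): {scaled.mean():.3f} (expect about 0)")
print(f"std  of sqrt(N)(s^2 - sigma^2): {scaled.std(ddof=1):.3f} "
      f"(limit: {limit_std:.3f})")
```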

```python
import numpy as np

size = 10_000_000

# With probability 0.25 the outcome is a loss of 100; otherwise it is a game payoff.
random_values = np.random.uniform(0, 1, size)
is_loss = random_values < 0.25

# Loss outcomes: a constant array of -100, the same length as the sample.
loss = np.full(size, -100)

# Game payoff: a die roll (1 to 6) raised to a Binomial(4, 0.5) power.
game1Vector = np.random.randint(1, 7, size) ** np.random.binomial(4, 0.5, size)

# Select (not multiply) the outcome for each run:
# -100 where is_loss is True, the game payoff otherwise.
final_vector = np.where(is_loss, loss, game1Vector)

sample_mean_squared = np.mean(final_vector) ** 2

# Per-draw terms X_i^2 - Xbar^2; their mean is the 1/N variance estimate.
squared_terms = final_vector ** 2 - sample_mean_squared

print(f"way one : {np.mean(squared_terms)}")
print(f"way two : {np.var(final_vector)}")  # ddof=0, i.e. the 1/N form
print(f"way three : {np.mean(final_vector ** 2) - sample_mean_squared}")
```

Output:

```
way one : 21273.95718171061
way two : 21273.957181710968
way three : 21273.957181710975
```
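For comparison, the "true" variance of this process can be computed exactly by enumerating outcomes. This is a sketch that assumes the simulation above is read as follows: with probability 1/4 the outcome is -100, otherwise it is a fair die roll raised to a Binomial(4, 1/2) power. The variable names are mine.

```python
from fractions import Fraction
from math import comb

# Assumed reading of the simulation above:
#   with probability 1/4 the outcome is -100;
#   with probability 3/4 it is D ** B, D a fair die, B ~ Binomial(4, 1/2).
p_loss = Fraction(1, 4)

first_moment = p_loss * (-100)
second_moment = p_loss * (-100) ** 2

for b in range(5):                       # possible values of B
    p_b = Fraction(comb(4, b), 2 ** 4)   # Binomial(4, 1/2) probability
    for d in range(1, 7):                # possible die faces
        p = (1 - p_loss) * p_b * Fraction(1, 6)
        outcome = d ** b
        first_moment += p * outcome
        second_moment += p * outcome ** 2

exact_variance = second_moment - first_moment ** 2
print(f"exact mean     : {float(first_moment)}")
print(f"exact variance : {float(exact_variance):.4f}")
```

Under that reading the exact variance is roughly 21,315, so the simulated value of about 21,274 is within about 0.2% of it.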


## A note on batching (averaging variances over subsets)

Another technique I used in simulation is *batch means*.

1. **Run simulations in batches.** Instead of running one massive block of $N$ simulations, break the runs into $k$ smaller blocks, each with $M = \tfrac{N}{k}$ simulations.
2. **Compute variance within each batch.** For each batch $b$, compute its sample variance $\hat{\sigma}_b^2$.


3. **Average batch variances.** Then combine these $k$ variances by taking an average:

   $$
   \hat{\sigma}_{\mathrm{avg}}^2 
   \;=\; 
   \frac{1}{k} \sum_{b=1}^k \hat{\sigma}_b^2.
   $$

4. **Convergence of the average.** As $k$ and $M$ get large, the average settles to a stable estimate of the variance. This is neat because each batch variance is an independent (or nearly independent) estimate, so averaging them smooths out short-term fluctuations.

This method is helpful because the spread of the $\hat{\sigma}_b^2$ values estimates the variance of the variance estimate itself, which gives you a confidence interval.
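The batching steps above can be sketched as follows. The exponential draws (true variance 4), the seed, and the choice of $k = 100$ batches are assumptions for illustration, and the interval uses a normal approximation across batches.

```python
import numpy as np

rng = np.random.default_rng(seed=4)

# Assumption for illustration: exponential draws with scale 2 (true variance 4).
n_total, k = 1_000_000, 100
draws = rng.exponential(scale=2.0, size=n_total)

# Split into k batches of M = N / k draws each (N must be divisible by k here).
batches = draws.reshape(k, n_total // k)
batch_variances = batches.var(axis=1, ddof=1)   # one variance per batch

avg_variance = batch_variances.mean()

# The spread of the batch variances gives a standard error for their average.
std_error = batch_variances.std(ddof=1) / np.sqrt(k)
ci_low = avg_variance - 1.96 * std_error
ci_high = avg_variance + 1.96 * std_error

print(f"averaged batch variance: {avg_variance:.4f}")
print(f"approx. 95% interval   : ({ci_low:.4f}, {ci_high:.4f})")
```

Independent draws make the batches independent by construction; with correlated simulation output the batches would need to be long enough for that to hold approximately.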


### In conclusion

- **First perspective (large $N$)**: Just keep simulating until the sample variance converges to the true variance via the Law of Large Numbers.
- **Second perspective (batching)**: Subdivide a large sample into independent batches, compute a variance in each batch, and then combine these estimates. This often gives more direct, more robust evidence that the variance estimates are converging, and it lets you quantify the variability of the variance estimate itself.