# Universe Model — Metrics and Validation

This notebook validates the behaviour of the universe simulation by:

- verifying that all clustering metrics behave as expected,
- checking limiting cases (random vs. clustered configurations),
- analysing how spatial structure emerges as model parameters change.

The goal is to ensure that the simulation produces meaningful and reproducible
results before it is used for systematic parameter sweeps and analysis.

## Metrics implemented (as specified in the project plan)

We quantify spatial structure using the following observables:

1. **Grid-based density variance**  
   Measures spatial inhomogeneity by binning particles on a grid.

2. **Mean nearest-neighbour distance (PBC)**  
   Captures typical inter-particle spacing under periodic boundary conditions.

3. **Number of clusters**  
   Number of connected components of the graph that links any two particles closer than a distance threshold ε.

4. **Largest cluster fraction (LCF)**  
   Fraction of particles belonging to the largest cluster.  
   This serves as the main *order parameter* for structure formation.
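
The distance-based observables (2 to 4) can be illustrated with a minimal NumPy sketch. This is not the code in `src.metrics`, whose implementation is not shown in this notebook. The function names ending in `_sketch` are hypothetical, and the treatment of periodic boundaries via the minimum-image convention is an assumption.

```python
import numpy as np


def pairwise_distances_pbc_sketch(positions, box_size):
    """All pairwise distances under the minimum-image convention (assumed)."""
    delta = positions[:, None, :] - positions[None, :, :]
    # Wrap every coordinate difference into [-box_size/2, box_size/2].
    delta = delta - box_size * np.round(delta / box_size)
    return np.sqrt((delta ** 2).sum(axis=-1))


def mean_nn_distance_sketch(positions, box_size):
    """Mean distance from each particle to its nearest neighbour."""
    dist = pairwise_distances_pbc_sketch(positions, box_size)
    np.fill_diagonal(dist, np.inf)  # ignore self-distances
    return dist.min(axis=1).mean()


def cluster_sizes_sketch(positions, eps, box_size):
    """Sizes of the connected components of the 'closer than eps' graph."""
    adjacency = pairwise_distances_pbc_sketch(positions, box_size) < eps
    n = len(positions)
    labels = np.full(n, -1)
    sizes = []
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to a cluster
        # Depth-first search starting from an unlabelled particle.
        labels[start] = len(sizes)
        stack = [start]
        count = 0
        while stack:
            i = stack.pop()
            count += 1
            neighbours = np.flatnonzero(adjacency[i] & (labels == -1))
            labels[neighbours] = len(sizes)
            stack.extend(neighbours.tolist())
        sizes.append(count)
    return np.array(sizes)


def largest_cluster_fraction_sketch(positions, eps, box_size):
    """Fraction of particles in the largest connected component."""
    return cluster_sizes_sketch(positions, eps, box_size).max() / len(positions)


def number_of_clusters_sketch(positions, eps, box_size, min_size=1):
    """Number of components with at least min_size members."""
    sizes = cluster_sizes_sketch(positions, eps, box_size)
    return int((sizes >= min_size).sum())


# Two particles on opposite edges of the box plus one in the centre.
pts = np.array([[0.01, 0.5], [0.99, 0.5], [0.5, 0.5]])
print(mean_nn_distance_sketch(pts, box_size=1.0))
print(largest_cluster_fraction_sketch(pts, eps=0.05, box_size=1.0))
print(number_of_clusters_sketch(pts, eps=0.05, box_size=1.0, min_size=2))
```

In the example, the two edge particles are only 0.02 apart across the periodic boundary, so they form one cluster of size 2 while the central particle stays isolated. The dense distance matrix costs O(N²) memory, which is acceptable for the few hundred particles used here but would need a neighbour search for larger systems.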

In [7]:
import sys
from pathlib import Path

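# Put the parent of the current working directory on sys.path
# so that the `src` package can be imported from the notebook.
PROJECT_ROOT = Path.cwd().resolve().parents[0]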
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

In [8]:
import numpy as np

from src.metrics import (
    nearest_neighbor_distance,
    largest_cluster_fraction,
    density_variance_grid,
    number_of_clusters,
)

box_size = 1.0
eps = 0.05

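# Random baseline: five independent, uniformly distributed configurations.
# The generator is unseeded, so the exact numbers change between runs.
for i in range(5):
    positions = np.random.rand(100, 2) * box_size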
    nn = nearest_neighbor_distance(positions, box_size=box_size)
    lcf = largest_cluster_fraction(positions, eps=eps, box_size=box_size)
    print(f"Run {i}: NN={nn:.3f}, LCF={lcf:.2f}")

Run 0: NN=0.052, LCF=0.05
Run 1: NN=0.049, LCF=0.05
Run 2: NN=0.051, LCF=0.06
Run 3: NN=0.055, LCF=0.06
Run 4: NN=0.046, LCF=0.05
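
The random baseline above can be compared with a known analytic result: for points placed uniformly at random in two dimensions with number density ρ, the mean nearest-neighbour distance is 1/(2√ρ). The short sketch below evaluates this for the parameters used in the loop. The helper name `expected_random_nn_distance` is hypothetical, and the formula ignores finite-size corrections, so it should be read as an approximate reference value only.

```python
import math


def expected_random_nn_distance(n_particles, box_size):
    """Mean nearest-neighbour distance of a 2D Poisson point process."""
    density = n_particles / box_size ** 2
    return 0.5 / math.sqrt(density)


print(expected_random_nn_distance(100, 1.0))  # 0.05
```

The five measured values (0.046 to 0.055) scatter around this reference of 0.05, which is the behaviour expected of `nearest_neighbor_distance` in the unclustered limit.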


In [9]:
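# NOTE: `run_simulation` is not imported in any cell shown in this notebook;
# import it from the project's simulation module before running this cell.
h_cluster = run_simulation(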
    N=200,
    steps=800,
    seed=0,
    attraction=0.02,
    noise=0.01,
    interaction_range=0.6,
    repulsion_radius=0.05,
    save_every=20,
)

# Use the last saved snapshot as the clustered test configuration.
pos = h_cluster[-1]

### Consistency check of clustering metrics

We evaluate all implemented observables on a clearly clustered configuration
to verify that they provide consistent and physically meaningful results.

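In a clustered state, we expect:
- a mean nearest-neighbour distance well below the value for a uniform random configuration,
- a largest-cluster fraction close to one (LCF ≈ 1),
- a density variance well above that of a uniform configuration,
- and only a few clusters, ideally a single dominant one.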

This check confirms that all metrics respond coherently to the same underlying structure.
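
To illustrate why the density variance should be high in a clustered state, the sketch below bins a uniform and a clumped synthetic configuration on the same grid. It is not the implementation of `density_variance_grid`: the name `density_variance_sketch` is hypothetical, and normalising the variance of the cell counts by the squared mean count is an assumption, so the values are not directly comparable to the `dens_var` output in this notebook.

```python
import numpy as np


def density_variance_sketch(positions, box_size, bins=20, normalized=True):
    """Variance of per-cell particle counts on a bins x bins grid."""
    counts, _, _ = np.histogram2d(
        positions[:, 0],
        positions[:, 1],
        bins=bins,
        range=[[0.0, box_size], [0.0, box_size]],
    )
    variance = counts.var()
    if normalized:
        # Assumed normalisation: divide by the squared mean count per cell.
        variance = variance / counts.mean() ** 2
    return variance


rng = np.random.default_rng(0)
uniform = rng.random((200, 2))
# Gaussian blob around the box centre, wrapped back into the unit box.
clumped = np.mod(0.5 + 0.03 * rng.standard_normal((200, 2)), 1.0)

print("uniform:", density_variance_sketch(uniform, box_size=1.0))
print("clumped:", density_variance_sketch(clumped, box_size=1.0))
```

Under this normalisation the upper bound is reached when all particles share one cell, giving bins² − 1 (399 for a 20 × 20 grid), while a uniform configuration stays far lower.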

In [11]:
from src.metrics import (
    nearest_neighbor_distance,
    largest_cluster_fraction,
    density_variance_grid,
    number_of_clusters,
)

pos = h_cluster[-1]  
print("nn:", nearest_neighbor_distance(pos, 1.0))
print("LCF:", largest_cluster_fraction(pos, eps=0.06, box_size=1.0))
print("dens_var:", density_variance_grid(pos, 1.0, bins=20, normalized=True))
print("n_clusters:", number_of_clusters(pos, eps=0.06, box_size=1.0, min_size=3))

nn: 0.011046733410187066
LCF: 0.995
dens_var: 14.899999999940402
n_clusters: 1
