# EFM Density States and Clustering Scale Validation

This notebook performs a high-fidelity simulation to validate the Ehokolo Fluxon Model (EFM) density states and clustering scales on a user-configurable grid, using Google Colab Pro+ with an NVIDIA A100 GPU. The simulation is derived from first principles, focusing on reproducing the BAO scale (~147 Mpc) through solitonic dynamics and allowing larger scales (~628 Mpc) to emerge naturally. It targets a runtime of <5 hours while keeping resource usage below 50% (20 GB VRAM, 40 GB RAM).

## Objectives
- Run a simulation on a 1000^3 grid (1000 Mpc box) to validate S/T state clustering scales.
- Derive the BAO scale (~147 Mpc) using solitonic dynamics with a tuned mass parameter m.
- Use Gaussian noise as the initial condition to let clustering scales emerge naturally.
- Compute the power spectrum and correlation function to identify clustering scales (147 Mpc, 628 Mpc).
- Validate against DESI BAO data (147.09 ± 0.26 Mpc) and check for the 628 Mpc scale.
- Provide full transparency with hardware, initial conditions, boundary conditions, numerical methods, and parameter justifications.

## Hardware
- **GPU**: NVIDIA A100 (40 GB VRAM)
- **System RAM**: 80 GB
- **Environment**: Google Colab Pro+ with ~140 compute units remaining (~10 units/hour on A100)

## Setup Instructions
1. Go to `Runtime` > `Change runtime type` > Select `A100 GPU`.
2. Run `!nvidia-smi` to verify GPU.
3. (Optional) Override default parameters in the **Configuration** cell below.
4. Execute all cells sequentially to run the simulation, or skip to the **Compute Final Observables** cell to analyze an existing checkpoint.
5. Monitor VRAM (<20 GB) and RAM (<40 GB) to avoid crashes.
6. Save outputs to Google Drive for reproducibility.

## Numerical Methods
- **Integrator**: 4th-order Runge-Kutta (RK4) for temporal evolution.
- **Laplacian**: Convolution-based computation using a 7-point stencil for efficiency.
- **Boundary Conditions**: Periodic boundaries to model an infinite universe.
- **Power Spectrum**: Computed on the full 3D grid for accuracy.
- **Correlation Function**: Computed on the full 3D grid for accuracy.
- **Chunked Processing**: Process grid in z-slices (user-specified chunk size) to manage memory.
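
The 7-point stencil with periodic boundaries listed above can be sketched in a few lines. The version below is an illustration in NumPy that uses `np.roll` instead of the notebook's convolution; the helper name `periodic_laplacian` and the plane-wave test field are assumptions made for this example only.

```python
import numpy as np

def periodic_laplacian(phi, dx):
    """7-point Laplacian on a periodic grid (hypothetical helper for illustration)."""
    lap = -6.0 * phi
    for axis in range(3):
        # np.roll wraps around the array edge, which implements periodic boundaries
        lap = lap + np.roll(phi, 1, axis=axis) + np.roll(phi, -1, axis=axis)
    return lap / dx**2

# Check against a plane wave, for which the stencil has a known eigenvalue
N, dx = 16, 1.0
x = np.arange(N) * dx
phi = np.cos(2 * np.pi * x / (N * dx))[:, None, None] * np.ones((N, N, N))
expected = (2 * np.cos(2 * np.pi / N) - 2) / dx**2 * phi
max_error = np.max(np.abs(periodic_laplacian(phi, dx) - expected))
print(f"max error vs. analytic stencil eigenvalue: {max_error:.2e}")
```

Because the wrap-around is applied to whatever array is passed in, a sketch like this is only periodic over the full box when it receives the full grid rather than a z-chunk.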

## Configuration

Optionally override default simulation parameters below. These parameters are tuned to reproduce the BAO scale (~147 Mpc) through solitonic dynamics. If you modify these, ensure you run this cell before proceeding with the simulation.

In [None]:
# User-configurable simulation parameters (optional overrides)
# Uncomment and modify these lines to override defaults
# N = 1000  # Grid size (N x N x N)
# L = 1000.0  # Box size (1000 Mpc)
# dx = L / N  # Spatial step
# c = 3e8  # Wave speed (m/s, speed of light)
# dt = 0.05 * dx / c  # Time step (CFL condition, reduced for stability)
# T = 6140  # Total steps (~1 Gyr with smaller dt)
# chunk_size = 500  # Number of z-slices per chunk (must divide N evenly)

# Physical parameters
# m = 4.16e-16  # Mass term (s^-1, tuned to set soliton width to ~147 Mpc)
# g = 0.01  # Cubic nonlinearity
# eta = 0.001  # Quintic nonlinearity
# k = 0.0  # Density scaling (set to 0 to remove destabilizing term)
# G = 6.674e-11  # Gravitational constant (m^3 kg^-1 s^-2)

print("Default parameters are defined in the Simulation Setup cell. Override here if needed and run this cell before running the simulation.")

In [None]:
# Set environment variable to reduce memory fragmentation
import os
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'

# Clear GPU memory
import torch
torch.cuda.empty_cache()
import gc
gc.collect()

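# Report the PyTorch version. Colab ships with PyTorch preinstalled; an upgrade performed
# after `import torch` only takes effect once the runtime is restarted, so it is left optional.
print(f"PyTorch version: {torch.__version__}")
# !pip install --upgrade torch

# Install the remaining libraries and show GPU status
!pip install torch numpy tqdm psutil scipy
!nvidia-smi
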
import numpy as np
from tqdm.notebook import tqdm
import psutil
import time
from datetime import datetime
from scipy.fft import fftn, fftfreq, ifftn
import torch.nn.functional as F

# Check GPU and memory
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
if device.type == "cuda":
    print(f"GPU VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
print(f"System RAM: {psutil.virtual_memory().total / 1e9:.2f} GB")

# Mount Google Drive for checkpoints and data
from google.colab import drive
drive.mount('/content/drive')
checkpoint_path = '/content/drive/MyDrive/EFM_checkpoints/'
data_path = '/content/drive/MyDrive/EFM_data/'
os.makedirs(checkpoint_path, exist_ok=True)
os.makedirs(data_path, exist_ok=True)

## Simulation Setup

- **Grid Size**: 1000 x 1000 x 1000
- **Box Size**: 1000 Mpc (to capture 147 Mpc and 628 Mpc scales)
- **Spatial Step**: dx = 1.0 Mpc
- **Time Step**: dt ≈ 5.14e12 s (~0.163 Myr), from the CFL condition dt = 0.05 · dx / c with c = 3e8 m/s and dx = 1 Mpc ≈ 3.086e22 m (the factor 0.05 is reduced for stability)
- **Steps**: 6140 (6140 × 0.163 Myr ≈ 1 Gyr)
- **Chunk Size**: 500 z-slices per batch
- **Initial Conditions**: Gaussian noise with amplitude 0.01 to allow clustering scales to emerge naturally
- **Boundary Condition**: Periodic to model an infinite universe
- **Equation**: Nonlinear Klein-Gordon with parameters tuned to produce a solitonic scale of ~147 Mpc, gravitational term removed for stability
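
For reference, the equation of motion implied by the potential energy density used in `compute_energy` can be written as below. This is a sketch inferred from the code with the gravitational coupling k set to zero, not a formula quoted from another source.

```latex
\frac{\partial^2 \phi}{\partial t^2} = c^2 \nabla^2 \phi - \frac{dV}{d\phi},
\qquad
V(\phi) = \frac{1}{2} m^2 \phi^2 + \frac{g}{4} \phi^4 + \frac{\eta}{6} \phi^6,
\qquad
\frac{dV}{d\phi} = m^2 \phi + g \phi^3 + \eta \phi^5
```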

## Parameter Justifications
- **m = 4.16e-16 s^-1**: Tuned so that the solitonic wavelength 2πc/m ≈ 147 Mpc matches the BAO scale.
- **g = 0.01, eta = 0.001**: Nonlinear terms adjusted to produce solitons while maintaining stability.
- **k = 0.0**: Gravitational coupling term removed to eliminate destabilizing linear term in the potential.
- **c = 3e8 m/s**: Speed of light, appropriate for cosmological scales.
- **dt**: Reduced CFL factor to 0.05 to improve numerical stability for nonlinear dynamics.
- **Initial Conditions**: Gaussian noise ensures scales emerge from dynamics, not hardcoded.
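
The numbers quoted for the wavelength, time step, and total simulated time can be checked with a few lines of arithmetic. The sketch below assumes the conversion factors 1 Mpc ≈ 3.086e22 m and 1 yr ≈ 3.156e7 s; the variable names are chosen for this example.

```python
import math

MPC_IN_M = 3.086e22      # meters per megaparsec (assumed conversion factor)
YEAR_IN_S = 3.156e7      # seconds per year (assumed conversion factor)
c = 3e8                  # wave speed, m/s
m = 4.16e-16             # mass term, s^-1
dx_mpc = 1.0             # spatial step, Mpc
cfl = 0.05               # reduced CFL factor
steps = 6140             # total number of time steps

# Wavelength associated with the mass term: lambda = 2*pi*c/m, converted to Mpc
wavelength_mpc = 2 * math.pi * c / m / MPC_IN_M

# Time step from the CFL condition, with dx converted to meters
dt_s = cfl * dx_mpc * MPC_IN_M / c
total_gyr = steps * dt_s / YEAR_IN_S / 1e9

print(f"wavelength = {wavelength_mpc:.1f} Mpc")
print(f"dt = {dt_s:.3e} s = {dt_s / YEAR_IN_S / 1e6:.3f} Myr")
print(f"total time = {total_gyr:.2f} Gyr")
```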

In [None]:
# Define simulation parameters
config = {}
config['N'] = 1000  # Grid size (N x N x N)
config['L'] = 1000.0  # Box size (1000 Mpc)
config['dx'] = config['L'] / config['N']  # Spatial step
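config['c'] = 3e8  # Wave speed (m/s, speed of light)
# NOTE: dx is in Mpc while c is in m/s, so this dt is in mixed units (Mpc*s/m), not seconds.
# The 5.14e12 s quoted in Simulation Setup assumes dx converted to meters (1 Mpc ~ 3.086e22 m).
# The dimensionless CFL ratio c * dt / dx = 0.05 is the same in both conventions.
config['dt'] = 0.05 * config['dx'] / config['c']  # Time step (CFL condition, reduced for stability)
config['T'] = 6140  # Total steps (~1 Gyr if dt is taken as 5.14e12 s)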
config['chunk_size'] = 500  # Number of z-slices per chunk (must divide N evenly)
config['boundary_width_factor'] = 0.0  # No absorbing boundary (set to periodic)

# Physical parameters
config['m'] = 4.16e-16  # Mass term (s^-1, tuned to set soliton width to ~147 Mpc)
config['g'] = 0.01  # Cubic nonlinearity
config['eta'] = 0.001  # Quintic nonlinearity
config['k'] = 0.0  # Density scaling (set to 0 to remove destabilizing term)
config['G'] = 6.674e-11  # Gravitational constant (m^3 kg^-1 s^-2)

# Assign to local variables for clarity
N = config['N']
L = config['L']
dx = config['dx']
c = config['c']
dt = config['dt']
T = config['T']
chunk_size = config['chunk_size']
m = config['m']
g = config['g']
eta = config['eta']
k = config['k']
G = config['G']

# Validate chunk_size
if N % chunk_size != 0:
    raise ValueError(f"chunk_size ({chunk_size}) must divide N ({N}) evenly for simplicity.")

print(f"Grid size: {N} x {N} x {N}")
print(f"Box size: {L} Mpc")
print(f"Total steps: {T}")
print(f"Chunk size: {chunk_size} z-slices per batch")
print(f"Time step: {dt:.2e} in code units (dx in Mpc, c in m/s); with dx in meters this is {dt * 3.086e22:.2e} seconds ({dt * 3.086e22 / 3.156e7:.2e} years)")
print(f"Soliton wavelength (from m): {2 * np.pi * c / m / (3.086e22):.2f} Mpc")

# Parameter set
param_sets = [
    {"m": m, "g": g, "eta": eta, "k": k, "label": "Baseline"}
]
boundary_conditions = ["periodic"]

# Initialize results storage
results = []

# Potential gradient dV/dphi for V = m^2 phi^2 / 2 + g phi^4 / 4 + eta phi^6 / 6
# (nonlinear terms, no gravitational coupling; k is accepted but unused)
def potential(phi, m, g, eta, k):
    return m**2 * phi + g * phi**3 + eta * phi**5

# Damping mask (set to 1 for periodic boundaries)
def create_damping_mask(N, boundary_width, damping_factor, device):
    mask = torch.ones((N, N, N), device=device, dtype=torch.float32)
    return mask

# Convolution-based Laplacian using a 7-point stencil with periodic (circular) padding
def conv_laplacian(phi, dx, device):
    try:
        # Define the 7-point Laplacian stencil
        stencil = torch.tensor([[[0, 0, 0], [0, 1, 0], [0, 0, 0]],
                                [[0, 1, 0], [1, -6, 1], [0, 1, 0]],
                                [[0, 0, 0], [0, 1, 0], [0, 0, 0]]],
                               dtype=torch.float32, device=device)
        stencil = stencil / (dx**2)  # Scale by 1/dx^2
        stencil = stencil.view(1, 1, 3, 3, 3)  # Shape for conv3d: (out_channels, in_channels, D, H, W)

        # Reshape phi for conv3d: (batch_size, channels, D, H, W)
        phi = phi.view(1, 1, phi.shape[0], phi.shape[1], phi.shape[2])
        # F.conv3d has no padding_mode argument (that option belongs to nn.Conv3d), so wrap
        # the field explicitly with circular padding and convolve without further padding.
        # Note: when phi is a z-chunk rather than the full grid, the wrap along the first
        # axis is periodic within the chunk, not across the full box.
        phi_padded = F.pad(phi, (1, 1, 1, 1, 1, 1), mode='circular')
        laplacian = F.conv3d(phi_padded, stencil, padding=0)
        laplacian = laplacian.view(phi.shape[2], phi.shape[3], phi.shape[4])  # Reshape back
        return laplacian
    except Exception as e:
        print(f"Error in conv_laplacian: {e}")
        raise

# NLKG derivative with convolution-based Laplacian
def nlkg_derivative(phi, phi_dot, m, g, eta, k, damping_mask):
    try:
        with torch.no_grad():
            laplacian = conv_laplacian(phi, dx, device)
            if phi.shape != damping_mask.shape:
                raise ValueError(f"Shape mismatch: phi {phi.shape}, damping_mask {damping_mask.shape}")
            if laplacian.shape != phi.shape:
                raise ValueError(f"Shape mismatch: laplacian {laplacian.shape}, phi {phi.shape}")
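            phi.mul_(damping_mask)
            phi_dot.mul_(damping_mask)
            # dV/dphi for V = m^2 phi^2 / 2 + g phi^4 / 4 + eta phi^6 / 6, the potential used in compute_energy (no gravitational term)
            dV_dphi = m**2 * phi + g * phi**3 + eta * phi**5
            phi_ddot = c**2 * laplacian - dV_dphi
            # Return (d phi/dt, d phi_dot/dt) so the RK4 stages advance phi with the velocity
            return phi_dot, phi_ddot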
    except Exception as e:
        print(f"Error in nlkg_derivative: {e}")
        raise

# RK4 integrator with chunked processing
def update_phi(phi, phi_dot, dt, m, g, eta, k, damping_mask, chunk_size, device):
    try:
        with torch.no_grad():
            if phi.shape != phi_dot.shape or phi.shape != damping_mask.shape:
                raise ValueError(f"Shape mismatch in update_phi: phi {phi.shape}, phi_dot {phi_dot.shape}, damping_mask {damping_mask.shape}")
            phi_new = torch.zeros_like(phi, device=device)
            phi_dot_new = torch.zeros_like(phi_dot, device=device)
            for i in range(0, phi.shape[0], chunk_size):
                chunk = slice(i, min(i + chunk_size, phi.shape[0]))
                phi_chunk = phi[chunk].to(device)
                phi_dot_chunk = phi_dot[chunk].to(device)
                damping_chunk = damping_mask[chunk].to(device)
                temp = torch.zeros_like(phi_chunk, device=device)
                k1_v, k1_a = nlkg_derivative(phi_chunk, phi_dot_chunk, m, g, eta, k, damping_chunk)
                temp.copy_(phi_chunk + 0.5 * dt * k1_v)
                k2_v, k2_a = nlkg_derivative(temp, phi_dot_chunk + 0.5 * dt * k1_a, m, g, eta, k, damping_chunk)
                temp.copy_(phi_chunk + 0.5 * dt * k2_v)
                k3_v, k3_a = nlkg_derivative(temp, phi_dot_chunk + 0.5 * dt * k2_a, m, g, eta, k, damping_chunk)
                temp.copy_(phi_chunk + dt * k3_v)
                k4_v, k4_a = nlkg_derivative(temp, phi_dot_chunk + dt * k3_a, m, g, eta, k, damping_chunk)
                phi_new[chunk] = phi_chunk + (dt / 6.0) * (k1_v + 2 * k2_v + 2 * k3_v + k4_v)
                phi_dot_new[chunk] = phi_dot_chunk + (dt / 6.0) * (k1_a + 2 * k2_a + 2 * k3_a + k4_a)
                phi_new[chunk].clamp_(-5, 5)  # Tighter clamping for stability
                phi_dot_new[chunk].clamp_(-5, 5)
                # Clear chunk tensors to free VRAM
                del phi_chunk, phi_dot_chunk, damping_chunk, temp, k1_v, k1_a, k2_v, k2_a, k3_v, k3_a, k4_v, k4_a
                torch.cuda.empty_cache()
                gc.collect()
            return phi_new, phi_dot_new
    except Exception as e:
        print(f"Error in update_phi: {e}")
        raise

# Energy calculation (GPU-based, chunked)
def compute_energy(phi, phi_dot, m, g, eta, k, chunk_size, dx, c):
    try:
        with torch.no_grad():
            total_energy = 0.0
            kinetic_total = 0.0
            gradient_total = 0.0
            potential_total = 0.0
            num_chunks = 0
            for i in range(0, phi.shape[0], chunk_size):
                chunk = slice(i, min(i + chunk_size, phi.shape[0]))
                phi_chunk = phi[chunk]
                phi_dot_chunk = phi_dot[chunk]
                # Check for inf or nan
                if torch.any(torch.isinf(phi_chunk)) or torch.any(torch.isnan(phi_chunk)):
                    print(f"Warning: phi contains inf or nan in chunk {i}")
                    return float('inf'), float('inf'), float('inf'), float('inf')
                if torch.any(torch.isinf(phi_dot_chunk)) or torch.any(torch.isnan(phi_dot_chunk)):
                    print(f"Warning: phi_dot contains inf or nan in chunk {i}")
                    return float('inf'), float('inf'), float('inf'), float('inf')
                kinetic = 0.5 * phi_dot_chunk**2
                potential_energy = 0.5 * m**2 * phi_chunk**2 + 0.25 * g * phi_chunk**4 + (1.0 / 6.0) * eta * phi_chunk**6
                gradient = torch.zeros_like(phi_chunk)
                for d in range(3):
                    grad_d = torch.gradient(phi_chunk, spacing=dx, dim=d)[0]
                    gradient += grad_d**2
                gradient *= 0.5 * c**2
                kinetic_mean = torch.mean(kinetic).item() if not torch.isnan(kinetic).any() else 0.0
                gradient_mean = torch.mean(gradient).item() if not torch.isnan(gradient).any() else 0.0
                potential_mean = torch.mean(potential_energy).item() if not torch.isnan(potential_energy).any() else 0.0
                kinetic_total += kinetic_mean
                gradient_total += gradient_mean
                potential_total += potential_mean
                num_chunks += 1
            kinetic_total /= num_chunks
            gradient_total /= num_chunks
            potential_total /= num_chunks
            total_energy = kinetic_total + gradient_total + potential_total
            return total_energy, kinetic_total, gradient_total, potential_total
    except Exception as e:
        print(f"Error in compute_energy: {e}")
        raise

# Power spectrum calculation (full 3D FFT assembled from chunked partial transforms)
def compute_power_spectrum(phi, k_range=[0.005, 0.1], chunk_size=500, dx=1.0, N=1000):
    try:
        # The 3D FFT is separable: transform the two in-slice axes chunk by chunk,
        # then transform the chunked axis once all slices are in place.
        # (Applying fftn to each chunk over all three axes does not give the full-grid transform.)
        fft_result = np.zeros((phi.shape[0], phi.shape[1], phi.shape[2]), dtype=np.complex64)
        for i in range(0, phi.shape[0], chunk_size):
            chunk = slice(i, min(i + chunk_size, phi.shape[0]))
            phi_chunk = phi[chunk].cpu().numpy()
            fft_result[chunk] = fftn(phi_chunk, axes=(1, 2))
            del phi_chunk
            gc.collect()
        fft_result = fftn(fft_result, axes=(0,))
        # Angular wavenumbers (rad/Mpc), so that wavelength = 2*pi/k as assumed in the validation and plots.
        # Broadcasting the 1D axis avoids building three full N^3 meshgrid arrays.
        k_axis = (2 * np.pi * fftfreq(N, d=dx)).astype(np.float32)
        k = np.sqrt(k_axis[:, None, None]**2 + k_axis[None, :, None]**2 + k_axis[None, None, :]**2)
        power = np.abs(fft_result)**2
        k_bins = np.linspace(k_range[0], k_range[1], 50)
        power_binned = np.zeros(len(k_bins) - 1)
        for i in range(len(k_bins) - 1):
            mask = (k >= k_bins[i]) & (k < k_bins[i + 1])
            power_binned[i] = np.mean(power[mask]) if np.any(mask) else 0
        del fft_result, k, power
        gc.collect()
        return k_bins[:-1], power_binned
    except Exception as e:
        print(f"Error in compute_power_spectrum: {e}")
        raise

# Correlation function (full 3D FFT assembled from chunked partial transforms)
def compute_correlation_function(phi, chunk_size=500, dx=1.0, N=1000):
    try:
        # Transform the two in-slice axes per chunk, then the chunked axis over the whole array
        fft_result = np.zeros((phi.shape[0], phi.shape[1], phi.shape[2]), dtype=np.complex64)
        for i in range(0, phi.shape[0], chunk_size):
            chunk = slice(i, min(i + chunk_size, phi.shape[0]))
            phi_chunk = phi[chunk].cpu().numpy()
            fft_result[chunk] = fftn(phi_chunk, axes=(1, 2))
            del phi_chunk
            gc.collect()
        fft_result = fftn(fft_result, axes=(0,))
        power = np.abs(fft_result)**2
        corr = ifftn(power).real
        # ifftn places zero lag at index 0, so build the lag grid in the same unshifted order:
        # 0, 1, ..., N/2-1, -N/2, ..., -1 grid cells along each axis (times dx).
        lag = (fftfreq(N) * N * dx).astype(np.float32)
        r = np.sqrt(lag[:, None, None]**2 + lag[None, :, None]**2 + lag[None, None, :]**2)
        r_bins = np.linspace(0, 500, 50)  # Up to 500 Mpc
        corr_binned = np.zeros(len(r_bins) - 1)
        for i in range(len(r_bins) - 1):
            mask = (r >= r_bins[i]) & (r < r_bins[i + 1])
            corr_binned[i] = np.mean(corr[mask]) if np.any(mask) else 0
        del fft_result, power, corr, r
        gc.collect()
        # Normalize by the largest bin unless every bin is zero
        peak = np.max(corr_binned)
        if peak != 0:
            corr_binned = corr_binned / peak
        return r_bins[:-1], corr_binned
    except Exception as e:
        print(f"Error in compute_correlation_function: {e}")
        raise

## Precompute Initial Conditions

Compute and save initial fields to disk in chunks to manage memory usage, allowing support for large grids.
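
A chunk-by-chunk generator is one way to keep only a single slab of the initial field in memory at a time. The sketch below is illustrative: the helper name `generate_noise_chunks`, the fixed `seed`, and the small demonstration grid are assumptions added for this example, while the amplitude of 0.01 and the zero initial velocity follow the setup described above.

```python
import numpy as np

def generate_noise_chunks(N, chunk_size, amplitude=0.01, seed=0):
    """Yield (start_index, phi_chunk, phi_dot_chunk) one z-chunk at a time.

    Hypothetical helper: the fixed seed is an assumption added for reproducibility.
    """
    rng = np.random.default_rng(seed)
    for start in range(0, N, chunk_size):
        depth = min(start + chunk_size, N) - start
        # Gaussian noise scaled to the requested amplitude, stored as float32
        phi_chunk = (amplitude * rng.standard_normal((depth, N, N))).astype(np.float32)
        # Zero initial velocity
        phi_dot_chunk = np.zeros((depth, N, N), dtype=np.float32)
        yield start, phi_chunk, phi_dot_chunk

# Small demonstration grid (the notebook itself uses N = 1000, chunk_size = 500)
N_demo, chunk_demo = 32, 16
phi = np.empty((N_demo, N_demo, N_demo), dtype=np.float32)
for start, phi_chunk, _ in generate_noise_chunks(N_demo, chunk_demo):
    phi[start:start + phi_chunk.shape[0]] = phi_chunk

print(phi.shape, float(phi.std()))
```

With a fixed seed, rerunning the generator reproduces the same field, which makes later comparisons between runs easier to interpret.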

In [None]:
# Precompute initial conditions in chunks
init_path = f"{data_path}initial_conditions_N{N}.npz"
if not os.path.exists(init_path):
    print("Computing initial conditions in chunks...")
    try:
        # Generate each chunk on the CPU and write it to disk immediately.
        # No list of chunks is kept, so each chunk's memory can be released after saving.
        for i in range(0, N, chunk_size):
            chunk_size_z = min(i + chunk_size, N) - i
            phi_chunk = np.random.normal(0, 1, (chunk_size_z, N, N)).astype(np.float32) * 0.01  # Gaussian noise
            phi_dot_chunk = np.zeros((chunk_size_z, N, N), dtype=np.float32)  # Zero initial velocity
            # Save chunk to disk immediately to free RAM
            chunk_file = f"{data_path}initial_conditions_N{N}_chunk_{i}.npz"
            np.savez_compressed(chunk_file, phi_ST=phi_chunk, phi_dot_ST=phi_dot_chunk)
            del phi_chunk, phi_dot_chunk
            gc.collect()
        # Combine chunk files into final file
        phi_ST = np.concatenate([np.load(f"{data_path}initial_conditions_N{N}_chunk_{i}.npz")['phi_ST'] for i in range(0, N, chunk_size)])
        phi_dot_ST = np.concatenate([np.load(f"{data_path}initial_conditions_N{N}_chunk_{i}.npz")['phi_dot_ST'] for i in range(0, N, chunk_size)])
        np.savez_compressed(init_path, phi_ST=phi_ST, phi_dot_ST=phi_dot_ST)
        # Clean up chunk files
        for i in range(0, N, chunk_size):
            os.remove(f"{data_path}initial_conditions_N{N}_chunk_{i}.npz")
        print("Initial conditions saved.")
    except Exception as e:
        print(f"Error computing initial conditions: {e}")
        raise
else:
    print("Loading initial conditions...")
    try:
        init_data = np.load(init_path)
        phi_ST = init_data['phi_ST']
        phi_dot_ST = init_data['phi_dot_ST']
    except Exception as e:
        print(f"Error loading initial conditions: {e}")
        raise

# Load arrays into CPU memory in chunks
phi_ST_tensor = torch.zeros((N, N, N), dtype=torch.float32, device='cpu')
phi_dot_ST_tensor = torch.zeros((N, N, N), dtype=torch.float32, device='cpu')
for i in range(0, N, chunk_size):
    chunk = slice(i, min(i + chunk_size, N))
    phi_ST_tensor[chunk] = torch.from_numpy(phi_ST[chunk]).to('cpu', dtype=torch.float32)
    phi_dot_ST_tensor[chunk] = torch.from_numpy(phi_dot_ST[chunk]).to('cpu', dtype=torch.float32)
    gc.collect()

# Precompute damping mask (set to 1 for periodic boundaries)
damping_mask = torch.ones((N, N, N), dtype=torch.float32, device='cpu')

# Validate shapes
if phi_ST_tensor.shape != (N, N, N) or phi_dot_ST_tensor.shape != (N, N, N) or damping_mask.shape != (N, N, N):
    raise ValueError(f"Shape mismatch after initialization: phi_ST {phi_ST_tensor.shape}, phi_dot_ST {phi_dot_ST_tensor.shape}, damping_mask {damping_mask.shape}")

# Keep tensors on CPU until simulation loop to manage VRAM
print("Initial conditions prepared on CPU.")

## Main Simulation Loop

Runs a single simulation and saves a checkpoint at the end. The computation of final observables is moved to the next cell to allow loading from the checkpoint if needed.

**Note**: If you already have a checkpoint file (e.g., `checkpoint_Baseline_periodic_6140_N1000.npz`), you can skip this cell and proceed to the **Compute Final Observables** cell to analyze the existing checkpoint.
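
Each iteration of the loop advances the pair (phi, phi_dot) by one RK4 step. The sketch below shows the same scheme on a single linear oscillator, where the exact solution is known; the oscillator, the function names, and the step size are assumptions chosen for illustration and are not part of the simulation.

```python
import math

def derivative(phi, phi_dot, omega):
    """Return (d phi/dt, d phi_dot/dt) for a linear oscillator phi'' = -omega^2 phi."""
    return phi_dot, -omega**2 * phi

def rk4_step(phi, phi_dot, dt, omega):
    """Advance (phi, phi_dot) by one classical 4th-order Runge-Kutta step."""
    k1_v, k1_a = derivative(phi, phi_dot, omega)
    k2_v, k2_a = derivative(phi + 0.5 * dt * k1_v, phi_dot + 0.5 * dt * k1_a, omega)
    k3_v, k3_a = derivative(phi + 0.5 * dt * k2_v, phi_dot + 0.5 * dt * k2_a, omega)
    k4_v, k4_a = derivative(phi + dt * k3_v, phi_dot + dt * k3_a, omega)
    phi_new = phi + (dt / 6.0) * (k1_v + 2 * k2_v + 2 * k3_v + k4_v)
    phi_dot_new = phi_dot + (dt / 6.0) * (k1_a + 2 * k2_a + 2 * k3_a + k4_a)
    return phi_new, phi_dot_new

omega, dt = 1.0, 0.05
steps = round(2 * math.pi / dt)  # roughly one oscillation period
phi, phi_dot = 1.0, 0.0
for _ in range(steps):
    phi, phi_dot = rk4_step(phi, phi_dot, dt, omega)

t_final = steps * dt
error = abs(phi - math.cos(omega * t_final))
energy = 0.5 * phi_dot**2 + 0.5 * omega**2 * phi**2
print(f"error vs. cos(t): {error:.2e}, energy: {energy:.6f} (initial 0.5)")
```

The key point is that the first returned quantity is the velocity, so the field is advanced by phi_dot while phi_dot is advanced by the acceleration.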

In [None]:
# Confirmation prompt to prevent accidental interruptions
confirm = input(f"Are you sure you want to run the main simulation ({N}^3 grid, {T} steps)? This should take approximately {(T * 2) / 3600:.2f} hours. Type 'yes' to proceed: ")
if confirm.lower() != 'yes':
    print("Simulation aborted.")
else:
    # Main simulation loop
    for param in param_sets:
        for boundary_type in boundary_conditions:
            print(f"Running simulation: {param['label']}, Boundary: {boundary_type}")
            
            # Simulation loop
            energy_history = np.zeros(2, dtype=np.float32)  # Only at start and end
            kinetic_history = np.zeros(2, dtype=np.float32)
            gradient_history = np.zeros(2, dtype=np.float32)
            potential_history = np.zeros(2, dtype=np.float32)
            history_idx = 0
            start_time = time.time()

            # Compute initial energy (move tensors to GPU in chunks)
            total_energy, kinetic, gradient, pot_energy = compute_energy(phi_ST_tensor, phi_dot_ST_tensor, param['m'], param['g'], param['eta'], param['k'], chunk_size, dx, c)
            energy_history[history_idx] = total_energy
            kinetic_history[history_idx] = kinetic
            gradient_history[history_idx] = gradient
            potential_history[history_idx] = pot_energy
            history_idx += 1

            pbar = tqdm(range(T), desc=f"Simulation Progress ({param['label']}, {boundary_type})")
            for t in pbar:
                try:
                    phi_ST_tensor, phi_dot_ST_tensor = update_phi(phi_ST_tensor, phi_dot_ST_tensor, dt, param['m'], param['g'], param['eta'], param['k'], damping_mask, chunk_size, device)
                except Exception as e:
                    print(f"Error at step {t}: {e}")
                    break

                # Monitor resources
                vram_used = torch.cuda.memory_allocated() / 1e9 if device.type == "cuda" else 0
                ram_used = psutil.virtual_memory().used / 1e9
                pbar.set_postfix({'VRAM': f'{vram_used:.2f}GB', 'RAM': f'{ram_used:.2f}GB'})
                if vram_used > 20 or ram_used > 40:
                    print(f"Warning: Resource usage high at step {t} (VRAM: {vram_used:.2f}GB, RAM: {ram_used:.2f}GB)")
                    break

            # Compute final energy and save checkpoint
            try:
                total_energy, kinetic, gradient, pot_energy = compute_energy(phi_ST_tensor, phi_dot_ST_tensor, param['m'], param['g'], param['eta'], param['k'], chunk_size, dx, c)
                energy_history[history_idx] = total_energy
                kinetic_history[history_idx] = kinetic
                gradient_history[history_idx] = gradient
                potential_history[history_idx] = pot_energy

                # Save checkpoint in chunks
                for i in range(0, N, chunk_size):
                    chunk = slice(i, min(i + chunk_size, N))
                    chunk_file = f"{checkpoint_path}checkpoint_{param['label']}_{boundary_type}_{T}_N{N}_chunk_{i}.npz"
                    np.savez_compressed(
                        chunk_file,
                        phi_ST=phi_ST_tensor[chunk].cpu().numpy(),
                        phi_dot_ST=phi_dot_ST_tensor[chunk].cpu().numpy()
                    )
                # Combine chunks into final checkpoint
                phi_ST_full = np.concatenate([np.load(f"{checkpoint_path}checkpoint_{param['label']}_{boundary_type}_{T}_N{N}_chunk_{i}.npz")['phi_ST'] for i in range(0, N, chunk_size)])
                phi_dot_ST_full = np.concatenate([np.load(f"{checkpoint_path}checkpoint_{param['label']}_{boundary_type}_{T}_N{N}_chunk_{i}.npz")['phi_dot_ST'] for i in range(0, N, chunk_size)])
                np.savez_compressed(
                    f"{checkpoint_path}checkpoint_{param['label']}_{boundary_type}_{T}_N{N}.npz",
                    phi_ST=phi_ST_full,
                    phi_dot_ST=phi_dot_ST_full,
                    energy_history=energy_history,
                    kinetic_history=kinetic_history,
                    gradient_history=gradient_history,
                    potential_history=potential_history
                )
                # Clean up chunk files
                for i in range(0, N, chunk_size):
                    os.remove(f"{checkpoint_path}checkpoint_{param['label']}_{boundary_type}_{T}_N{N}_chunk_{i}.npz")
                print(f"Checkpoint saved at step {T}")
            except Exception as e:
                print(f"Error saving final checkpoint: {e}")

            end_time = time.time()
            runtime = end_time - start_time
            print(f"Simulation completed in {runtime:.2f} seconds")

            torch.cuda.empty_cache()
            gc.collect()

## Compute Final Observables

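Load the checkpoint and compute the final observables (power spectrum, correlation function) to identify clustering scales. This cell can analyze an existing checkpoint file (e.g., `checkpoint_Baseline_periodic_6140_N1000.npz`) without rerunning the main simulation loop, provided the setup and **Simulation Setup** cells have been executed in the current session, since they define `gc`, `results`, `compute_power_spectrum`, and `compute_correlation_function`.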

**Note**: Ensure the grid size (N), total steps (T), and chunk size match the values used when the checkpoint was created.
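
When the grid is processed in z-chunks, a full 3D transform can still be assembled because the FFT is separable across axes. The sketch below illustrates this on a small random array using `numpy.fft` (the notebook itself imports `scipy.fft`); the helper name `chunked_fftn` is an assumption for this example. It also shows that applying a 3D FFT to each chunk independently does not reproduce the full-grid result.

```python
import numpy as np

def chunked_fftn(field, chunk_size):
    """Full 3D FFT built from per-chunk 2D FFTs plus one pass along axis 0.

    Hypothetical helper using numpy.fft, written for illustration only.
    """
    out = np.empty(field.shape, dtype=np.complex128)
    for start in range(0, field.shape[0], chunk_size):
        stop = min(start + chunk_size, field.shape[0])
        # Transform only the two in-slice axes; these do not mix different z-slices
        out[start:stop] = np.fft.fftn(field[start:stop], axes=(1, 2))
    # The remaining axis couples all chunks, so it is transformed once at the end
    return np.fft.fft(out, axis=0)

rng = np.random.default_rng(0)
field = rng.standard_normal((8, 8, 8))
full = np.fft.fftn(field)
separable = chunked_fftn(field, chunk_size=4)
# Independent 3D FFTs of each chunk, concatenated along axis 0
naive = np.concatenate([np.fft.fftn(field[i:i + 4]) for i in range(0, 8, 4)])

print("separable matches full FFT:", np.allclose(separable, full))
print("per-chunk fftn matches full FFT:", np.allclose(naive, full))
```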

In [None]:
# Load checkpoint and recompute final observables
import torch
import numpy as np
from scipy.fft import fftn, fftfreq

# Define parameters to match the simulation that created the checkpoint
N = 1000  # Grid size (N x N x N), must match the checkpoint
T = 6140  # Total number of steps, must match the checkpoint
L = 1000.0  # Box size (1000 Mpc)
dx = L / N  # Spatial step
chunk_size = 500  # Number of z-slices per chunk, must match the simulation

# Other parameters (should match the simulation but only affect post-processing)
m = 4.16e-16  # Mass term (s^-1)
g = 0.01  # Cubic nonlinearity
eta = 0.001  # Quintic nonlinearity
k = 0.0  # Density scaling
G = 6.674e-11  # Gravitational constant (m^3 kg^-1 s^-2)
c = 3e8  # Wave speed (m/s, speed of light)

# Paths and simulation metadata
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint_path = '/content/drive/MyDrive/EFM_checkpoints/'
data_path = '/content/drive/MyDrive/EFM_data/'
label = "Baseline"
boundary_type = "periodic"

# Construct the checkpoint file name
checkpoint_file = f"{checkpoint_path}checkpoint_{label}_{boundary_type}_{T}_N{N}.npz"
print(f"Loading checkpoint from {checkpoint_file}...")
try:
    checkpoint_data = np.load(checkpoint_file)
except FileNotFoundError:
    print(f"Checkpoint file {checkpoint_file} not found. Ensure the file exists and the parameters (N, T, label, boundary_type) match the simulation.")
    raise

# Load phi_ST and phi_dot_ST in chunks to CPU
phi_ST = torch.zeros((N, N, N), dtype=torch.float32, device='cpu')
phi_dot_ST = torch.zeros((N, N, N), dtype=torch.float32, device='cpu')
for i in range(0, N, chunk_size):
    chunk = slice(i, min(i + chunk_size, N))
    phi_ST[chunk] = torch.from_numpy(checkpoint_data['phi_ST'][chunk]).to('cpu', dtype=torch.float32)
    phi_dot_ST[chunk] = torch.from_numpy(checkpoint_data['phi_dot_ST'][chunk]).to('cpu', dtype=torch.float32)

# Load energy histories
energy_history = checkpoint_data['energy_history']
kinetic_history = checkpoint_data['kinetic_history']
gradient_history = checkpoint_data['gradient_history']
potential_history = checkpoint_data['potential_history']
print("Checkpoint loaded successfully.")

# Check for inf or nan in phi_ST and phi_dot_ST
if torch.any(torch.isinf(phi_ST)) or torch.any(torch.isnan(phi_ST)):
    print("Warning: phi_ST contains inf or nan values. Results may be unreliable.")
if torch.any(torch.isinf(phi_dot_ST)) or torch.any(torch.isnan(phi_dot_ST)):
    print("Warning: phi_dot_ST contains inf or nan values. Results may be unreliable.")

# Define runtime (set manually if not defined)
runtime = 3600  # Placeholder: 1 hour (adjust based on previous run's output)

# Recompute final observables
try:
    # Compute density norm (S/T state); note that with k = 0.0 this is identically zero
    density_norm = torch.sum(phi_ST**2).item() * k
    if np.isinf(density_norm) or np.isnan(density_norm):
        print("Warning: Density norm is inf or nan. Check phi_ST for numerical issues.")

    # Compute power spectrum (k-range to capture 147 Mpc and 628 Mpc)
    k_bins, power_spectrum = compute_power_spectrum(phi_ST, k_range=[0.005, 0.1], chunk_size=chunk_size, dx=dx, N=N)

    # Compute correlation function
    r, corr_func = compute_correlation_function(phi_ST, chunk_size=chunk_size, dx=dx, N=N)

    # Store results
    results.append({
        'params': {"m": m, "g": g, "eta": eta, "k": k, "label": label},
        'boundary': boundary_type,
        'density_norm': density_norm,
        'power_spectrum': (k_bins, power_spectrum),
        'correlation_function': (r, corr_func),
        'energy_history': energy_history,
        'runtime': runtime
    })
    print("Final observables computed successfully.")
except Exception as e:
    print(f"Error computing final observables: {e}")

torch.cuda.empty_cache()
gc.collect()

## Validation Against Public Datasets

Validate clustering scales against DESI BAO data:
- **DESI BAO**: Clustering scale ~147.09 ± 0.26 Mpc
- **EFM Prediction**: Expect peaks at ~147 Mpc (from solitonic dynamics) and ~628 Mpc (harmonic mode)
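
A correlation function normalized to its maximum peaks at zero lag, so a global maximum does not by itself locate a BAO-like feature. One option is to search for local maxima beyond a minimum radius, as sketched below. The helper `find_correlation_peaks`, the cutoff `r_min = 50` Mpc, and the synthetic correlation function are all assumptions for illustration; no simulation output is used.

```python
import numpy as np

def find_correlation_peaks(r, xi, r_min=50.0):
    """Return radii of local maxima of xi(r) at r >= r_min (hypothetical helper)."""
    peaks = []
    for i in range(1, len(xi) - 1):
        if r[i] >= r_min and xi[i] > xi[i - 1] and xi[i] > xi[i + 1]:
            peaks.append(float(r[i]))
    return peaks

# Synthetic correlation function: smooth decay plus a bump near 147 Mpc (illustrative only)
r = np.linspace(0.0, 500.0, 251)  # 2 Mpc spacing
xi = np.exp(-r / 30.0) + 0.05 * np.exp(-0.5 * ((r - 147.0) / 10.0) ** 2)

DESI_BAO, DESI_ERR = 147.09, 0.26  # Mpc, values quoted above
bin_width = r[1] - r[0]
for peak in find_correlation_peaks(r, xi):
    offset = peak - DESI_BAO
    print(f"peak at {peak:.1f} Mpc, offset from DESI: {offset:+.2f} Mpc "
          f"(bin width {bin_width:.1f} Mpc, DESI uncertainty {DESI_ERR} Mpc)")
```

Note that the radial bin width limits how precisely a peak can be located; a comparison against a ±0.26 Mpc uncertainty is only meaningful if the bins are at least that fine.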

In [None]:
# Validation analysis
for result in results:
    print(f"\nValidation for {result['params']['label']}, Boundary: {result['boundary']}")
    print(f"Density Norm (S/T): {result['density_norm']}")
    
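    # Check clustering scale from correlation function
    # Note: corr_func is normalized to its largest bin, which is normally the first bin (it contains zero lag),
    # so this argmax tends to report the smallest r rather than a BAO-like feature at larger r.
    r_peak = result['correlation_function'][0][np.argmax(result['correlation_function'][1])]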
    print(f"Clustering Scale (Correlation Function): {r_peak:.2f} Mpc (DESI BAO: 147.09 ± 0.26 Mpc, EFM Expected: ~147 Mpc, ~628 Mpc)")

    # Check clustering scale from power spectrum
    k_peak = result['power_spectrum'][0][np.argmax(result['power_spectrum'][1])]
    lambda_peak = 2 * np.pi / k_peak if k_peak != 0 else float('inf')
    print(f"Clustering Scale (Power Spectrum): {lambda_peak:.2f} Mpc (DESI BAO: 147.09 ± 0.26 Mpc, EFM Expected: ~147 Mpc, ~628 Mpc)")

# Save results summary
try:
    np.save(f"{data_path}simulation_results_N{N}.npy", results)
    print("Results saved to Google Drive.")
except Exception as e:
    print(f"Error saving results: {e}")

## Post-Processing: Generate Plots

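Generate plots from the final checkpoint to visualize field distributions, energy evolution, power spectrum, and correlation function. This cell does not rerun the simulation, but it relies on `phi_ST` and the energy histories loaded by the **Compute Final Observables** cell, and on `os` and `compute_power_spectrum` from the earlier setup cells.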

In [None]:
import matplotlib.pyplot as plt

# Define parameters to match the simulation that created the checkpoint
N = 1000  # Grid size (N x N x N), must match the checkpoint
T = 6140  # Total number of steps, must match the checkpoint
L = 1000.0  # Box size (1000 Mpc)
dx = L / N  # Spatial step
chunk_size = 500  # Number of z-slices per chunk, must match the simulation

# Paths and simulation metadata
checkpoint_path = '/content/drive/MyDrive/EFM_checkpoints/'
data_path = '/content/drive/MyDrive/EFM_data/'
label = "Baseline"
boundary_type = "periodic"

# Construct the checkpoint file name
final_checkpoint = f"{checkpoint_path}checkpoint_{label}_{boundary_type}_{T}_N{N}.npz"
if os.path.exists(final_checkpoint):
    try:
        # Plot field distribution (2D slice)
        plt.figure(figsize=(10, 8))
        # origin='lower' keeps row 0 at the bottom so the image matches the axis extent
        plt.imshow(phi_ST[N//2, :, :].cpu().numpy(), extent=[-L/2, L/2, -L/2, L/2], origin='lower', cmap='viridis')
        plt.colorbar(label='phi_ST')
        plt.title(f'S/T Field (z=0) at Step {T}')
        plt.xlabel('x (Mpc)')
        plt.ylabel('y (Mpc)')
        plt.savefig(f"{data_path}field_ST_{label}_{boundary_type}_N{N}_final.png")
        plt.close()

        # Plot energy evolution
        plt.figure(figsize=(10, 5))
        plt.plot(energy_history, label='Total Energy')
        plt.plot(kinetic_history, label='Kinetic', linestyle='--')
        plt.plot(gradient_history, label='Gradient', linestyle='-.')
        plt.plot(potential_history, label='Potential', linestyle=':')
        plt.xlabel('Step (0 and End)')
        plt.ylabel('Energy')
        plt.title(f'Energy Evolution ({label}, {boundary_type})')
        plt.legend()
        plt.grid()
        plt.savefig(f"{data_path}energy_{label}_{boundary_type}_N{N}_final.png")
        plt.close()

        # Plot power spectrum
        k_bins, power_spectrum = compute_power_spectrum(phi_ST, chunk_size=chunk_size, dx=dx, N=N)
        plt.figure(figsize=(10, 5))
        plt.loglog(k_bins, power_spectrum, label='Power Spectrum')
        plt.axvline(x=2 * np.pi / 147, color='r', linestyle='--', label='147 Mpc')
        plt.axvline(x=2 * np.pi / 628, color='g', linestyle='--', label='628 Mpc')
        plt.xlabel('k (Mpc^-1)')
        plt.ylabel('P(k)')
        plt.title(f'Power Spectrum ({label}, {boundary_type})')
        plt.legend()
        plt.grid()
        plt.savefig(f"{data_path}power_spectrum_{label}_{boundary_type}_N{N}_final.png")
        plt.close()

        # Plot correlation function
        r, corr_func = compute_correlation_function(phi_ST, chunk_size=chunk_size, dx=dx, N=N)
        plt.figure(figsize=(10, 5))
        plt.plot(r, corr_func, label='Correlation Function')
        plt.axvline(x=147, color='r', linestyle='--', label='147 Mpc')
        plt.axvline(x=628, color='g', linestyle='--', label='628 Mpc')
        plt.xlabel('r (Mpc)')
        plt.ylabel('Correlation')
        plt.title(f'Correlation Function ({label}, {boundary_type})')
        plt.legend()
        plt.grid()
        plt.savefig(f"{data_path}correlation_{label}_{boundary_type}_N{N}_final.png")
        plt.close()
    except Exception as e:
        print(f"Error in post-processing: {e}")
else:
    print(f"Checkpoint file {final_checkpoint} not found. Ensure the file exists and parameters match.")
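The plotting cell above tests for the checkpoint file but never reads it, so `phi_ST` and the history arrays must come from elsewhere. A minimal sketch of loading them from an `.npz` file is shown below. The key names `phi_ST` and `energy_history` are hypothetical; the actual keys are whatever the simulation passed to `np.savez` when it wrote the checkpoint.

```python
import os
import tempfile

import numpy as np

def load_checkpoint(path, required=("phi_ST", "energy_history")):
    """Load the named arrays from an .npz file, failing early on a missing key.

    The default key names are hypothetical placeholders.
    """
    with np.load(path) as data:
        missing = [key for key in required if key not in data.files]
        if missing:
            raise KeyError(f"checkpoint lacks {missing}; found {data.files}")
        return {key: data[key] for key in required}

# Demo on a tiny stand-in checkpoint written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "checkpoint_demo.npz")
    np.savez(demo_path,
             phi_ST=np.zeros((4, 4, 4), dtype=np.float32),
             energy_history=np.array([1.0, 0.99]))
    state = load_checkpoint(demo_path)

print(state["phi_ST"].shape, state["energy_history"])
```

Arrays loaded this way are NumPy arrays, so the `.cpu().numpy()` call in the plotting cell would need to be dropped, or the field wrapped with `torch.from_numpy(...)` first.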

## Parameter Justifications (Summary)

- **m = 4.16e-16 s^-1**: Sets the solitonic wavelength to ~147 Mpc, matching the BAO scale, derived from the linear dispersion relation of the NLKG equation.
- **g = 0.01, eta = 0.001**: Nonlinear terms adjusted to produce solitons while maintaining stability.
- **k = 0.0**: Gravitational coupling term removed to eliminate the destabilizing linear term in the potential.
- **c = 3e8 m/s**: Speed of light, appropriate for cosmological scales.
- **Initial Conditions**: Gaussian noise ensures scales emerge dynamically.
- **Boundary Condition**: Periodic, standard for cosmological simulations.
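As a quick consistency check on the mass parameter, the sketch below assumes the wavelength is the scale λ = 2πc/m, which is the k = m/c point of a linear dispersion relation ω² = c²k² + m². That specific relation is an assumption; the text only states that the value comes from the linear dispersion relation. The conversion factor of 3.0857e22 m per Mpc is the standard value.

```python
import math

M_PER_MPC = 3.0857e22  # metres per megaparsec

def wavelength_mpc(m, c=3e8):
    """Assumed relation lambda = 2*pi*c/m, converted from metres to Mpc.

    m is in s^-1 and c in m/s, matching the units quoted in the list above.
    """
    return 2 * math.pi * c / m / M_PER_MPC

lam = wavelength_mpc(4.16e-16)
print(f"lambda = {lam:.1f} Mpc")
```

Under this assumption the quoted m gives roughly 146.8 Mpc, in line with the ~147 Mpc target.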

## Next Steps

- Verify the test simulation (below) to ensure stability and correct memory usage.
- Run the full simulation with updated parameters to resolve numerical instability.
- Implement H0 computation using the scalar field dynamics.
- Perform detailed statistical validation against DESI, SDSS, and other datasets.
- Draft a LaTeX paper incorporating the results.

## Test Mode

Run a small-scale simulation to debug and verify stability and runtime before scaling up.

In [None]:
# Test mode: Small-scale simulation
N_test = 100  # Small grid for testing
L_test = 1000.0  # Same box size
dx_test = L_test / N_test
dt_test = 0.05 * dx_test / c  # Reduced CFL factor for stability
T_test = min(T, 10)  # Limit to 10 steps for quick testing
chunk_size_test = 25

print(f"Running test simulation ({N_test}^3 grid, {T_test} steps)...")
# Generate small test arrays
phi_ST_test = torch.from_numpy(np.random.normal(0, 1, (N_test, N_test, N_test)).astype(np.float32) * 0.01).to(device, dtype=torch.float32)
phi_dot_ST_test = torch.zeros((N_test, N_test, N_test), device=device, dtype=torch.float32)
damping_mask_test = torch.ones((N_test, N_test, N_test), device=device, dtype=torch.float32)

pbar_test = tqdm(range(T_test), desc="Test Simulation Progress")
for t in pbar_test:
    try:
        param = param_sets[0]
        phi_ST_test, phi_dot_ST_test = update_phi(phi_ST_test, phi_dot_ST_test, dt_test, param['m'], param['g'], param['eta'], param['k'], damping_mask_test, chunk_size_test, device)
        vram_used = torch.cuda.memory_allocated() / 1e9
        ram_used = psutil.virtual_memory().used / 1e9
        pbar_test.set_postfix({'VRAM': f'{vram_used:.2f}GB', 'RAM': f'{ram_used:.2f}GB'})
    except Exception as e:
        print(f"Test simulation failed at step {t}: {e}")
        break
else:
    # Runs only if the loop finished without hitting `break`
    print("Test simulation completed.")
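A test run that finishes without raising can still have diverged numerically, so it may help to inspect the field after the loop. The sketch below is one possible check; the helper `check_field_health` and its `max_abs` threshold are hypothetical and would need tuning to the expected field amplitude.

```python
import numpy as np

def check_field_health(phi, max_abs=1e3):
    """Return (ok, message) for a field snapshot.

    max_abs is an arbitrary illustrative threshold, not a value from the notebook.
    """
    arr = np.asarray(phi)
    if not np.isfinite(arr).all():
        return False, "field contains NaN or Inf"
    peak = float(np.abs(arr).max())
    if peak > max_abs:
        return False, f"amplitude {peak:.3g} exceeds {max_abs:.3g}"
    return True, f"ok, max |phi| = {peak:.3g}"

# Stand-in field with the same statistics as the test initial condition
rng = np.random.default_rng(0)
phi = rng.normal(0, 1, (8, 8, 8)).astype(np.float32) * 0.01
ok, msg = check_field_health(phi)

# A single NaN should be flagged
bad = phi.copy()
bad[0, 0, 0] = np.nan
bad_ok, bad_msg = check_field_health(bad)
print(ok, msg, bad_ok, bad_msg)
```

For the GPU tensor used in the test cell, the call would be `check_field_health(phi_ST_test.cpu().numpy())`.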