-
Notifications
You must be signed in to change notification settings - Fork 0
Architecture Overview
GRANITE v0.6.8 | ← AMR Design | Benchmarks & Validation →
This page documents every layer of the GRANITE engine — from the top-level data flow through each subsystem's internal design, memory layout, and coupling interfaces. Implementation details reference the source directly.
- High-Level System Diagram
- Data Flow: From YAML to Waveform
- Core Data Structure: GridBlock
- The CCZ4 ↔ GRMHD Interface: GRMetric3
- CCZ4 Variable Layout — 22 Variables
- GRMHD Variable Layout — 9 Variables
- Subsystem Coupling Map
- The RK3 Time Loop — Step by Step
- Memory Architecture
- Namespace Structure
┌─────────────────────────────────────────────────────────────────────┐
│ GRANITE ENGINE │
│ │
│ params.yaml ──► Parameter Parser ──► Initial Data Generator │
│ │ │
│ ┌────────────▼────────────┐ │
│ │ AMR Grid Hierarchy │ │
│ │ (Berger-Oliger, ≤12 L) │ │
│ └────────────┬────────────┘ │
│ │ │
│ ┌─────────────────────────────▼──────────────────┐ │
│ │ SSP-RK3 Time Loop │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────┐ │ │
│ │ │ RK3 Stage k (k = 1, 2, 3) │ │ │
│ │ │ │ │ │
│ │ │ 1. Ghost zone exchange (MPI_Isend/Irecv)│ │ │
│ │ │ 2. Apply outer BCs (Sommerfeld / Copy) │ │ │
│ │ │ 3. CCZ4 RHS (4th-order FD + KO diss.) │ │ │
│ │ │ 4. GRMHD RHS (MP5/HLLD/CT) │ │ │
│ │ │ 5. M1 Radiation RHS [built, not wired] │ │ │
│ │ │ 6. SSP-RK3 combine │ │ │
│ │ │ 7. Apply floors (χ, α) │ │ │
│ │ │ 8. Enforce algebraic constraints │ │ │
│ │ └──────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ◄── AMR regrid (every N steps) [partial] │ │
│ │ ◄── Ghost zone prolongation (coarse → fine) │ │
│ │ ◄── Ghost zone restriction (fine → coarse) │ │
│ │ ◄── Diagnostics: ‖H‖₂, α_center, NaN scan │ │
│ │ ◄── Checkpoint write (HDF5, every N steps) │ │
│ │ ◄── GW extraction (Ψ₄) [infrastructure ready]│ │
│ └────────────────────────────────────────────────┘ │
│ │ │
│ Python post-processing │
│ (granite_analysis: sim_tracker CLI, gw, plotting) │
└─────────────────────────────────────────────────────────────────────┘
Step 1: Input
└─ params.yaml parsed by yaml-cpp
└─ All parameters validated and stored in SimParams struct
Step 2: Initial Data
└─ ID type selected: Brill-Lindquist / Bowen-York / Two-Punctures / TOV
└─ Conformal factor computed on base grid
└─ CCZ4 variables initialized (χ, γ̃_ij, K, Ã_ij, Γ̃^i, α, β^i, Θ)
└─ GRMHD variables initialized (D, S_i, τ, B^i, Y_e)
Step 3: AMR Initialization
└─ AMRHierarchy constructed with max_levels
└─ Tracking spheres registered for each BH puncture
└─ Initial refinement cascade to min_level around punctures
└─ GridBlock patches created at each level
└─ Level timesteps set: dt_ℓ = dt_0 / 2^ℓ
Step 4: Time Loop (t = 0 → t_final)
└─ For each RK3 stage:
└─ MPI ghost zone exchange (non-blocking Isend/Irecv)
└─ Outer boundary conditions applied
└─ CCZ4 RHS computed (OpenMP parallel, field-major loops)
└─ GRMHD RHS computed (MP5 reconstruction → HLLD → CT)
└─ Stage update: U^(k+1) = α_k U^n + β_k (U^(k) + dt L(U^(k)))
└─ Floors applied: χ ∈ [1e-4, 1.5], α ≥ 1e-6
└─ Algebraic constraints enforced: det(γ̃)=1, tr(Ã)=0
└─ Every diagnostic_every steps:
└─ Compute ‖H‖₂, ‖M^i‖₂ over entire domain
└─ Scan for NaN events
└─ Write telemetry line (parsed by `granite_analysis.cli.sim_tracker`)
└─ Every checkpoint_every steps:
└─ Write HDF5 checkpoint (parallel MPI-IO)
Step 5: Post-Processing
└─ granite_analysis: load HDF5, plot ‖H‖₂, extract Ψ₄
└─ gw.py: strain recovery, spherical harmonic decomposition
GridBlock is the fundamental compute unit — a single uniform Cartesian patch covering a rectangular region of the simulation domain, including ghost zones.
// Field-major, flat contiguous allocation
// data_[var * stride + flat(i,j,k)]
//
// stride = (nx + 2*nghost) * (ny + 2*nghost) * (nz + 2*nghost)
// flat(i,j,k) = (i + nghost) * ny_stride
// + (j + nghost) * nz_stride
// + (k + nghost)
//
// i ∈ [-nghost, nx+nghost-1] (negative = ghost)
// j ∈ [-nghost, ny+nghost-1]
// k ∈ [-nghost, nz+nghost-1]
class GridBlock {
std::vector<Real> data_; // SINGLE flat allocation — never vector<vector<>>
int nx_, ny_, nz_; // Interior cell counts
int nghost_ = 4; // 4 ghost zones for 4th-order FD
int nvar_; // Number of evolved variables
Real dx_, dy_, dz_; // Cell spacing
Real x0_, y0_, z0_; // Physical origin (ghost-inclusive)
};
// FIELD-MAJOR (our layout):
// data_[var=0, all cells], data_[var=1, all cells], ...
//
// RHS inner loop: for(var...) rhs(var,i,j,k) += ...
// → consecutive var writes are 8 bytes apart (adjacent in memory)
// → L1 cache hit rate: ~99%
// CELL-MAJOR (wrong for our use case):
// data_[i,j,k, all vars], ...
//
// RHS inner loop: for(var...) rhs(var,i,j,k) += ...
// → each var write jumps by stride_ words
// → TLB pressure for large grids
// Canonical accessor — use this everywhere
inline Real& GridBlock::at(int var, int i, int j, int k) {
return data_[var * stride_
+ (i + nghost_) * ny_stride_
+ (j + nghost_) * nz_stride_
+ (k + nghost_)];
}
// i,j,k are interior-relative: 0 → nx-1 are valid interior indices
// i = -1 is the first ghost layer toward -x
// i = -nghost is the outermost ghost layernghost=4:
[g][g][g][g] | [i][i][i]...[i] | [g][g][g][g]
← ghosts → ← interior → ← ghosts →
4 nx 4
Ghost zones filled by:
- MPI exchange: same-level neighbors on different ranks
- Prolongation: from parent AMR level (coarser)
- Boundary conditions: outermost ghosts (Sommerfeld / copy)
| Decision | Rationale | Reference |
|---|---|---|
| Field-major layout | RHS loop vectorization: all cells of one variable are contiguous | Fix H2 |
Single flat vector<Real>
|
No pointer-chasing; enables SIMD; 1 malloc per block | Fix H2 |
nghost = 4 always |
4th-order FD requires 2 ghost layers; 4 provides safety margin for reconstruction | Architecture |
| Spatial outer, var inner in RHS | Matches field-major memory order; eliminates cache thrashing | Fix H3 |
GRMetric3 is the only permitted interface between the spacetime evolution module and the GRMHD matter module. Direct access to CCZ4 internal variables from GRMHD code is forbidden.
// include/granite/grmhd/grmhd.hpp
namespace granite::grmhd {
struct GRMetric3 {
// Conformal decomposition: g_ij = χ^{-1} * γ̃_ij
Real chi; // Conformal factor (χ)
Real gamma_tilde[3][3]; // Conformal spatial metric (γ̃_ij)
Real K; // Trace of extrinsic curvature
Real A_tilde[3][3]; // Traceless conformal extrinsic curvature (Ã_ij)
// Gauge quantities
Real alpha; // Lapse (α)
Real beta[3]; // Shift vector (β^i)
Real Gamma_tilde[3]; // Contracted conformal Christoffel (Γ̃^i)
// Derived (computed on demand from above)
Real gamma[3][3]; // Physical spatial metric (γ_ij = γ̃_ij / χ)
Real gamma_inv[3][3]; // Inverse physical metric (γ^ij)
Real sqrt_gamma; // √(det γ)
Real K_ij[3][3]; // Full extrinsic curvature
// Factory: flat Minkowski spacetime (for tests)
static GRMetric3 flat() {
GRMetric3 g{};
g.chi = 1.0; g.alpha = 1.0; g.sqrt_gamma = 1.0;
g.gamma_tilde[0][0] = g.gamma_tilde[1][1] = g.gamma_tilde[2][2] = 1.0;
g.gamma[0][0] = g.gamma[1][1] = g.gamma[2][2] = 1.0;
g.gamma_inv[0][0] = g.gamma_inv[1][1] = g.gamma_inv[2][2] = 1.0;
return g;
}
};
} // namespace granite::grmhdCoupling rule: At each RK3 stage, after the CCZ4 RHS is computed, readMetric() extracts a GRMetric3 from the current CCZ4 GridBlock and passes it to computeHLLEFlux() / computeHLLDFlux(). The GRMHD solver never reads raw CCZ4 variables directly.
| Index | Symbol | Name | Physical Meaning |
|---|---|---|---|
| 0 | χ | chi |
Conformal factor — encodes the volume element of the physical metric |
| 1 | γ̃_xx | gtxx |
Conformal metric xx-component |
| 2 | γ̃_xy | gtxy |
Conformal metric xy-component (symmetric) |
| 3 | γ̃_xz | gtxz |
Conformal metric xz-component |
| 4 | γ̃_yy | gtyy |
Conformal metric yy-component |
| 5 | γ̃_yz | gtyz |
Conformal metric yz-component |
| 6 | γ̃_zz | gtzz |
Conformal metric zz-component |
| 7 | K | K |
Trace of extrinsic curvature — expansion of spatial slices |
| 8 | Ã_xx | Atxx |
Traceless conformal extrinsic curvature xx |
| 9 | Ã_xy | Atxy |
Traceless conformal extrinsic curvature xy |
| 10 | Ã_xz | Atxz |
Traceless conformal extrinsic curvature xz |
| 11 | Ã_yy | Atyy |
Traceless conformal extrinsic curvature yy |
| 12 | Ã_yz | Atyz |
Traceless conformal extrinsic curvature yz |
| 13 | Ã_zz | Atzz |
Traceless conformal extrinsic curvature zz |
| 14 | Γ̃^x | Ghatx |
Contracted conformal Christoffel — x-component |
| 15 | Γ̃^y | Ghaty |
Contracted conformal Christoffel — y-component |
| 16 | Γ̃^z | Ghatz |
Contracted conformal Christoffel — z-component |
| 17 | α | alpha |
Lapse — rate of proper time vs. coordinate time |
| 18 | β^x | betax |
Shift vector — x-component |
| 19 | β^y | betay |
Shift vector — y-component |
| 20 | β^z | betaz |
Shift vector — z-component |
| 21 | Θ | Theta |
CCZ4 constraint variable — actively damped |
Algebraic constraints (enforced after each full RK3 step):
-
det(γ̃) = 1— conformal metric normalization -
tr(Ã) = 0— traceless condition on conformal extrinsic curvature
| Index | Symbol | Name | Physical Meaning |
|---|---|---|---|
| 0 | D | dens |
Conserved baryon number density: D = ρ W √γ |
| 1 | S_x | momx |
Conserved x-momentum density |
| 2 | S_y | momy |
Conserved y-momentum density |
| 3 | S_z | momz |
Conserved z-momentum density |
| 4 | τ | ener |
Conserved energy density (excluding rest mass) |
| 5 | B^x | Bx |
Magnetic field x-component (staggered on x-faces for CT) |
| 6 | B^y | By |
Magnetic field y-component (staggered on y-faces for CT) |
| 7 | B^z | Bz |
Magnetic field z-component (staggered on z-faces for CT) |
| 8 | D·Y_e | DYe |
Conserved electron fraction (advected passively: D·Y_e) |
Primitive variables (recovered from conserved at each RK3 stage via Newton-Raphson C2P):
| Symbol | Name | Description |
|---|---|---|
| ρ | rho |
Rest-mass density |
| v^i | vx, vy, vz |
3-velocity |
| ε | eps |
Specific internal energy |
| P | press |
Thermal pressure (from EOS) |
| T | temp |
Temperature (from EOS inversion) |
| Y_e | Ye |
Electron fraction |
| W | — | Lorentz factor: W = 1/√(1 - v²) |
┌─────────────────────────────────────────────────────────────────┐
│ Subsystem Interfaces │
│ │
│ CCZ4Solver ──────────────► GRMetric3 ──────────────► GRMHD │
│ (spacetime) (read-only copy (matter) │
│ of metric at cell) │
│ │
│ GRMHDEvolution ──────────► StressEnergyTensor3 ──► CCZ4 │
│ (matter) (T_μν source terms (spacetime) │
│ in CCZ4 RHS) │
│ │
│ M1Radiation ─────────────► Source terms ──────────► GRMHD │
│ (radiation) [built, NOT wired (matter) │
│ in v0.6.8] │
│ │
│ AMRHierarchy ─────────────► GridBlock* ──────────── All │
│ (grid mgmt) (provides patches subsystems │
│ to all subsystems) │
│ │
│ ConstraintMonitor ────────► DiagnosticOutput ─────► Dashboard │
│ (observables) (‖H‖₂, ‖M^i‖₂, α_center (Python) │
│ NaN scan, GW extraction) │
└─────────────────────────────────────────────────────────────────┘
Interface rules:
| From | To | Interface | Rule |
|---|---|---|---|
| CCZ4 | GRMHD | GRMetric3 |
ONLY interface — no direct CCZ4 var access from GRMHD |
| GRMHD | CCZ4 | StressEnergyTensor3 |
Source terms in ∂_t K and ∂_t Ã_ij equations |
| M1 | GRMHD | Source terms | Built but not called in v0.6.8 RK3 loop |
| AMR | All |
GridBlock hierarchy |
All physics operates on GridBlock* patches |
| Diagnostics | Output | Telemetry lines | Parsed by granite_analysis.cli.sim_tracker via regex |
The Shu-Osher SSP-RK3 scheme evolves all variables simultaneously:
U^(1) = U^n + Δt · L(U^n)
U^(2) = (3/4) U^n + (1/4) [U^(1) + Δt · L(U^(1))]
U^(n+1) = (1/3) U^n + (2/3) [U^(2) + Δt · L(U^(2))]
For each stage k, L(U^(k)) is computed as follows:
Stage k computation (src/core/evolution_loop.cpp):
1. syncGhostZones(level)
├── MPI_Isend ghost data to ±x, ±y, ±z neighbors (non-blocking)
├── MPI_Irecv ghost data from ±x, ±y, ±z neighbors (non-blocking)
├── Prolongate ghost cells from parent AMR level
└── MPI_Waitall (overlap with local computation)
2. applyOuterBC(level)
├── Sommerfeld (Two-Punctures data): f_ghost = f∞ + (f_int - f∞)·r_int/r_ghost
└── Copy (BL data): f_ghost = f_interior
3. computeCCZ4RHS(state, rhs)
├── 4th-order centered FD derivatives (d1, d2)
├── Chi-weighted selective advection (d1up near-field, d1 near puncture)
├── Ricci tensor components
├── Γ̃^i evolution (T7 term centered always)
├── KO dissipation (single OpenMP region, all 22 vars)
└── Stress-energy coupling (T_μν from GRMHD primitives)
4. computeGRMHDRHS(state, rhs, metric)
├── readMetric() → GRMetric3 at each cell
├── MP5/PPM/PLM reconstruction → left/right interface states
├── HLLE or HLLD Riemann solver (with actual GRMetric3)
├── Flux differencing: RHS += (F_{i+1/2} - F_{i-1/2}) / dx
├── Constrained Transport: CT update for B^i
└── conservedToPrimitive (Newton-Raphson C2P)
5. applyFloors(state)
├── chi: clamp to [1e-4, 1.5] (NaN-safe: uses !isfinite check)
└── alpha: clamp to [1e-6, ∞)
6. SSP-RK3 combine (stage-dependent coefficients)
[After all 3 stages complete:]
7. applyAlgebraicConstraints(state)
├── Enforce det(γ̃) = 1
└── Enforce tr(Ã) = 0
Memory per GridBlock = nvar × (nx+8) × (ny+8) × (nz+8) × 8 bytes
where nghost=4, so 8 cells of padding per dimension
CCZ4: 22 vars
GRMHD: 9 vars
RHS buffers: 31 vars (copies for each RK3 stage)
Total vars: ~93 per GridBlock (with 3 RK3 stage buffers)
For 128³ base grid, 4 AMR levels:
Level 0: 128³ = 2,097,152 cells × 93 vars × 8 bytes ≈ 1.56 GB
Level 1: 64³ ≈ 0.20 GB
Level 2: 32³ ≈ 0.025 GB
Level 3: 16³ ≈ 0.003 GB
Total: ~1.8 GB (fits in 16 GB desktop RAM with headroom)
| Configuration | AMR Levels | RAM Estimate |
|---|---|---|
| 64³ (development) | 4 | ~0.5 GB |
| 128³ (desktop) | 4 | ~2 GB |
| 256³ (workstation) | 6 | ~16 GB |
| 512³ (cluster) | 8 | ~128 GB |
| B5_star production | 12 | ~2 TB |
granite:: # Top-level namespace
├── GridBlock # core/grid.hpp
├── Real # core/types.hpp (= double)
├── NUM_SPACETIME_VARS = 22 # core/types.hpp
├── NUM_HYDRO_VARS = 9 # core/types.hpp
├── spacetime:: # Spacetime subsystem
│ ├── CCZ4Solver # CCZ4 evolution equations
│ ├── GaugeDriver # 1+log + Gamma-driver
│ └── ConstraintMonitor # ‖H‖₂, ‖M^i‖₂
├── grmhd:: # GRMHD subsystem
│ ├── GRMetric3 # CCZ4 ↔ GRMHD interface
│ ├── GRMHDEvolution # Valencia formulation
│ ├── EquationOfState # EOS base class
│ ├── IdealGasEOS # Γ-law EOS
│ └── TabulatedEOS # Nuclear EOS with NR inversion
├── amr:: # AMR subsystem
│ └── AMRHierarchy # Berger-Oliger grid hierarchy
├── initial_data:: # Initial data generators
│ ├── BrillLindquistID # Conformal factor sum
│ ├── BowenYorkID # Extrinsic curvature
│ ├── TwoPuncturesBBH # Spectral BBH initial data
│ └── TOVSolver # Neutron star initial data
├── radiation:: # Radiation transport
│ ├── M1Closure # Moment equations (grey)
│ └── NeutrinoLeakage # Leakage scheme
└── postprocess:: # Observables
├── Psi4Extractor # GW extraction
└── ApparentHorizonFinder # AH finder
See also: Physics Formulations | Parameter Reference | Developer Guide
v0.6.8 · Repository · Issues
- 🩺 Simulation Health & Debugging
- 📊 Benchmarks & Validation
- 🗂️ AMR Design
- 🖥️ HPC Deployment
- 🌀 VORTEX Engine
"Simulate the unimaginable."