Testing and Benchmarking

DESKTOP-5KM6RA3\david edited this page Apr 7, 2026 · 3 revisions

This page explains how GridForge validates behavior and performance today.

The short version is:

  • use the xUnit suite to protect correctness
  • use the benchmark project when changing performance-sensitive or allocation-sensitive code
  • remember that GridForge relies on global static state, so setup and reset boundaries are part of every test

The Two Validation Tracks

GridForge keeps correctness and performance validation in separate projects:

Track       Project                      Best for
Tests       tests/GridForge.Tests        Behavioral correctness, regressions, edge cases, and event semantics
Benchmarks  tests/GridForge.Benchmarks   Allocation trends, warm-vs-cold pool effects, and throughput-sensitive changes

Use both when you touch tracing, pooling, registration, blocker workflows, scan queries, or neighbor caching.

Test Harness Shape

The test suite uses xUnit v3 and a shared collection fixture:

  • GridForgeFixture sets GridForgeLogger.MinimumLevel to Error
  • GridForgeFixture calls GlobalGridManager.Setup()
  • [Collection("GridForgeCollection")] is used to group tests that rely on shared static state

Many test classes still defensively call Setup() or Reset() in their own constructors and Dispose() methods. This repetition is deliberate: GridForge is built around global mutable state, so tests need strong cleanup boundaries.

What The Tests Cover

The test folders already mirror the main subsystem boundaries fairly well:

  • tests/GridForge.Tests/Grids covers manager behavior, voxel/grid state, scan cells, and neighbor logic
  • tests/GridForge.Tests/Blockers covers blocker application, stacking, removal, and reapply behavior
  • tests/GridForge.Tests/Utility covers tracing and logging
  • tests/GridForge.Tests/Spatial covers index and provider semantics

That layout is the best place to look when you want to understand the intended contract for a subsystem.

Standard Test Commands

The normal repo flow is:

dotnet restore GridForge.sln
dotnet build GridForge.sln --configuration Debug
dotnet test GridForge.sln --configuration Debug --no-build

If you want to run only the test project directly:

dotnet test tests/GridForge.Tests/GridForge.Tests.csproj --configuration Debug

Targeted Test Runs

When you only need one subsystem:

dotnet test tests/GridForge.Tests/GridForge.Tests.csproj --configuration Debug --filter FullyQualifiedName~GridTracer

Swap the filter to match the subsystem or test name you are iterating on. This is especially useful when you are iterating quickly on tracing, blockers, or occupancy behavior.
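As a rough illustration of swapping and combining filters, the sketch below builds a targeted command from one or more search terms. The helper name gf_test_cmd is hypothetical and not part of the repo, and it assumes test class or namespace names contain the terms you pass. It relies on the standard dotnet test filter syntax, where ~ means "contains" and | means "or". The filter is quoted because an unquoted | would be read by the shell as a pipe.

```shell
# Hypothetical helper (not part of the repo): print a targeted test command.
# Each argument becomes one FullyQualifiedName~<term> clause, joined with "|" (OR).
gf_test_cmd() {
  filter=""
  for term in "$@"; do
    # Prepend "<existing>|" only when a clause has already been added.
    filter="${filter:+$filter|}FullyQualifiedName~$term"
  done
  printf 'dotnet test %s --configuration Debug --filter "%s"\n' \
    "tests/GridForge.Tests/GridForge.Tests.csproj" "$filter"
}

# Print the command for tests whose names contain GridTracer or Blocker.
gf_test_cmd GridTracer Blocker
```

The helper only prints the command, so you can inspect it before pasting it into a terminal.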

Coverage Notes

The test project points at coverlet.runsettings, and the repo includes coverlet.collector.

That means coverage collection can be layered onto a normal test run, for example:

dotnet test tests/GridForge.Tests/GridForge.Tests.csproj --configuration Debug --collect:"XPlat Code Coverage"

The runsettings currently exclude generated MemoryPack files from coverage so the report stays focused on authored code.
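After a coverage run, the report file still has to be located. The sketch below assumes the collector writes its default Cobertura output, a file named coverage.cobertura.xml inside a per-run subfolder of the results directory. If coverlet.runsettings selects a different format, the file name will differ. The helper name gf_coverage_reports is hypothetical.

```shell
# Hypothetical helper (not part of the repo): list coverage reports.
# Assumption: coverlet.collector emits its default coverage.cobertura.xml
# somewhere under the given results directory (default: ./TestResults).
gf_coverage_reports() {
  find "${1:-TestResults}" -type f -name 'coverage.cobertura.xml'
}
```

One way to use it is to add --results-directory TestResults to the dotnet test command above and then call gf_coverage_reports TestResults.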

Test Design Expectations

Good GridForge tests usually do a few things on purpose:

  • call GlobalGridManager.Setup() explicitly, and rely on the fixture alone only when the world state is known
  • call GlobalGridManager.Reset() when a test needs a clean boundary
  • use deterministic positions and explicit expectations
  • assert exact snapped values instead of fuzzy tolerances
  • clean up after changing setup-wide values like voxel size

If a test changes global setup assumptions, treat restoring the default world as part of the test itself.

When To Reach For Benchmarks

The benchmark project matters most when a change affects:

  • grid registration
  • tracing coverage
  • scan radius queries
  • blocker application or removal
  • occupancy waves
  • voxel-grid allocation patterns
  • neighbor cache behavior
  • pool warmup or reuse characteristics

If the change could affect allocations or hot-path loop counts, benchmarks are the right second validation step after tests.
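The "tests first, benchmarks second" order can be sketched as a small shell function. This is an illustration rather than a script that ships with the repo: gf_validate and the DOTNET override are hypothetical names, and the two commands are the solution-level test run and the benchmark runner's all selection documented on this page.

```shell
# Sketch only: DOTNET is a hypothetical override so the flow can be dry-run
# (for example DOTNET=echo); it defaults to the real dotnet CLI.
gf_validate() {
  dotnet_cmd="${DOTNET:-dotnet}"
  # Step 1: correctness. Stop here if any test fails.
  "$dotnet_cmd" test GridForge.sln --configuration Debug || return 1
  # Step 2: performance. Only reached when the tests passed.
  "$dotnet_cmd" run --project tests/GridForge.Benchmarks/GridForge.Benchmarks.csproj \
    -c Release -f net8.0 -- all
}
```

Running every benchmark can be slow, so during iteration it may be more practical to replace all with a single selection.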

Benchmark Runner Basics

The benchmark project has a small command surface on top of BenchmarkDotNet:

dotnet run --project tests/GridForge.Benchmarks/GridForge.Benchmarks.csproj -c Release -f net8.0 -- list
dotnet run --project tests/GridForge.Benchmarks/GridForge.Benchmarks.csproj -c Release -f net8.0 -- all

list prints the available benchmark selections. all runs every benchmark type.

You can also name specific selections by alias; any regular BenchmarkDotNet arguments that follow are passed through.

Benchmark Selections

The current benchmark project includes scenarios around:

  • grid-registration
  • grid-tracer
  • blocker-memory
  • blocker-apply-vs-remove
  • scan-radius-memory
  • occupant-wave
  • neighbor-cache
  • voxel-grid-memory

The runner resolves aliases from benchmark class names, so list is the safest way to discover the exact available selections before a run.
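A single selection might be run along these lines. Treat the argument layout as an assumption drawn from the wording on this page: the alias comes first after the -- separator, and anything after it is forwarded to BenchmarkDotNet. The helper name gf_bench and the DOTNET override are hypothetical. Confirm the alias with list before relying on it.

```shell
# Hypothetical helper (not part of the repo): run one benchmark selection.
# Assumptions: the alias is the first argument after "--", and the runner
# forwards every later argument to BenchmarkDotNet unchanged.
gf_bench() {
  if [ "$#" -lt 1 ]; then
    echo "usage: gf_bench <alias> [BenchmarkDotNet args...]" >&2
    return 2
  fi
  selection="$1"
  shift
  # DOTNET can be overridden (for example DOTNET=echo) to preview the command.
  "${DOTNET:-dotnet}" run --project tests/GridForge.Benchmarks/GridForge.Benchmarks.csproj \
    -c Release -f net8.0 -- "$selection" "$@"
}
```

For example, gf_bench grid-tracer would run the tracer selection, and gf_bench grid-tracer --exporters json would additionally request BenchmarkDotNet's JSON exporter, provided the runner forwards that argument.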

Benchmark Environment Behavior

BenchmarkEnvironment is worth understanding because it shapes the benchmark numbers.

It:

  • suppresses logging by setting GridForgeLogger.MinimumLevel to None
  • resets the global world between iterations
  • optionally clears GridForge and shared SwiftCollections pools
  • calls GlobalGridManager.Setup(...) with the benchmark's requested world settings

This makes benchmarks much more explicit about whether they are measuring:

  • cold pools vs warm pools
  • cached coverage vs uncached coverage
  • apply vs remove paths
  • allocation-heavy setup vs steady-state reuse

Benchmark Configuration Notes

The benchmark classes use:

  • [MemoryDiagnoser] to surface allocation behavior
  • InProcessShortRunConfig for relatively fast local iteration

That makes them useful during development, not just in long-running lab conditions.

Where Benchmark Results Go

BenchmarkDotNet emits reports under:

BenchmarkDotNet.Artifacts/results/

That is where to look when you want the generated markdown, CSV, or summary output after a run.
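A quick way to find the reports after a run is sketched below. It assumes the benchmark configuration keeps BenchmarkDotNet's default exporters, which write files ending in -report-github.md and -report.csv, and that the artifacts folder is created relative to the directory the run was started from. The helper name gf_bench_reports is hypothetical.

```shell
# Hypothetical helper (not part of the repo): list markdown and CSV reports.
# Assumption: default BenchmarkDotNet exporters and the default artifacts path.
gf_bench_reports() {
  dir="${1:-BenchmarkDotNet.Artifacts/results}"
  find "$dir" -type f \( -name '*-report-github.md' -o -name '*-report.csv' \)
}
```

Calling gf_bench_reports with no argument searches the default path; pass a directory if the run was started from somewhere else.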

Common Validation Pitfalls

  • Running tests without respecting global setup and reset boundaries
  • Forgetting to re-run tests after changing voxel size or snapping rules
  • Skipping benchmarks after changing pooling, tracing, or scan logic
  • Reading warm-pool results as if they were cold-start behavior
  • Letting logger noise pollute benchmark runs instead of suppressing diagnostics intentionally
