Changelog

All notable changes to this project will be documented in this file.

The format follows Keep a Changelog.
This project adheres to Semantic Versioning.

0.2.0 – 2026-06-07

Breaking changes

DIM → RANK — the const generic parameter on State<D>, Observation<D>, Action<D>, and Environment<D, SD, AD> is renamed to RANK (or R, SR, AR at usage sites) across all crates. Update any downstream impl State<D> / impl Environment<D, …> declarations accordingly.
fn new removed from Environment trait (ADR 0011) — construction is no longer part of the shared trait contract. Replace call sites with the new ConstructableEnv factory trait or a concrete new method.

New crates

rlevo-examples — heavy visualisation, recording, and report examples extracted from the rlevo umbrella (ADR 0012). Lightweight environment/algorithm examples stay in crates/rlevo/.
rlevo-metrics-registry — wasm32-compatible leaf crate that owns the canonical metric descriptor list (CANONICAL_METRICS, MetricDescriptor, domain grouping). Eliminates the hand-copied duplicate that previously existed between rlevo-benchmarks and rlevo-benchmarks-report-client (ADR 0015).
rlevo-benchmarks-report-client — Leptos/WASM static-HTML post-run report viewer. Served from an embedded axum server. Shares the metric registry with rlevo-benchmarks without pulling in burn or rand.

Dependency upgrades

burn 0.20.0 → 0.21.0; migrated ndarray backend to the new flex backend.
rand 0.9.x → 0.10.1, rand_distr → 0.6.0.

`rlevo-core`

Added

ConstructableEnv factory trait — standalone fn new(render: bool) -> Self replacement for the removed Environment::new (ADR 0011).
StyledFrame, StyledLine, StyledSpan, SpanStyle, Color, Modifier, semantic palette module, and AsciiRenderable/AsciiRenderer hoisted from rlevo-environments::render into rlevo-core::render (ADR 0009). Import paths inside rlevo-environments are preserved via a re-export shim.

Changed

AsciiRenderable demoted from a required library invariant to an optional debug helper; implementing it is no longer implied by Environment (ADR 0013).

`rlevo-environments`

Changed

Render types (StyledFrame, AsciiRenderable, etc.) re-exported from rlevo-core::render; originals removed (ADR 0009).
Environment::new removed; each environment exposes its own new constructor and may opt into ConstructableEnv (ADR 0011).

`rlevo-evolution`

Changed

All EA algorithms and shared ops (selection, crossover, mutation, replacement) now draw random values through seed_stream on the host CPU rather than calling B::seed + Tensor::random, eliminating the process-wide RNG mutex contention that caused non-determinism in parallel tests.
SharedPopulationObserver unified to parking_lot::Mutex (was split between std::sync and parking_lot lock types, causing type mismatches in recording examples) (ADR 0010).

`rlevo-reinforcement-learning`

Added

polyak_update hoisted as a shared utility function available to all RL algorithm crates.

`rlevo-benchmarks`

Added

Record schema v6 (FORMAT_VERSION bumped 5 → 6, ADR 0014):
- Expanded CANONICAL_METRICS (explained variance, per-iteration episode-return statistics, DQN/SAC loss terms) — list now owned by rlevo-metrics-registry.
- Typed run-provenance fields on RunManifest: algorithm name, crate versions, git ref, device, seed count, success threshold.
- EpisodeKind { Training, Evaluation } field in episode headers.
- Episode wall-clock duration as a terminal metric.
- checkpoints: Vec<CheckpointRef> seam for deep-RL Burn-Recorder model files (EA runs unaffected).
Metrics-only live ratatui TUI replaces the earlier three-tier visualisation plan (ADR 0013 supersedes ADR 0008); no environment render panel in the TUI.

Changed

CANONICAL_METRICS constant moved to rlevo-metrics-registry; rlevo-benchmarks re-exports it for back-compat.

`rlevo-benchmarks-report-client`

Added

Interactive post-run static HTML report (Leptos + WASM):
- Min/max downsampling for long metric series (ADR 0013 / M8.2).
- Multi-seed mean ± std band aggregation.
- Hover crosshair with exact raw-value tooltip.
- Per-panel SVG export buttons.
- Step / episode / wall-clock x-axis toggle for episode panels.
- Eval/training split via EpisodeKind in the episode index and table badge.
- Landscape heatmap background for EA optimisation landscape runs.
- Diversity-threshold guideline line with breach-pulse highlight.
- Strip-plot overlay toggle on the population box-plot panel.

`rlevo` (umbrella)

Changed

Lightweight examples retained; heavy viz/record/report examples migrated to rlevo-examples (ADR 0012).

Infrastructure

Added

GitHub Actions CI: integration-test matrix (Linux × stable toolchain) and weekly full-workspace test run.
BACKEND_LOCK per-binary synchronisation for wgpu-backed integration tests; removes the previous --test-threads=1 requirement.

0.1.0 – 2026-04-28

Initial alpha release. All crates are published together at the same version.

`rlevo-core`

Added

State<D> / Observation<D> traits for typed, const-generic environment state and agent perception.
Action<D> trait hierarchy (DiscreteAction, ContinuousAction, MultiDiscreteAction, MultiBinaryAction) for compile-time action-space safety.
Environment<D, SD, AD> trait with reset / step / render contract; Snapshot<D> trait with SnapshotBase<D, O, R> concrete type.
Reward trait with ScalarReward and VectorReward implementations.
TensorConvertible<B, D> bridge trait for lifting state/action types onto Burn tensors.
Agent and BenchableAgent traits for uniform agent interaction.
FitnessEvaluable and Landscape traits for benchmarking evolutionary algorithms.
BenchEnv, BenchError, BenchStep, Metric, MetricsProvider, and SeedStream (moved from rlevo-benchmarks per ADR-0004).
util::seed — deterministic SeedStream for reproducible multi-run experiments.
EnvironmentError and StateError error types with thiserror derives.

`rlevo-environments`

Added

Classic control — CartPole, MountainCar, MountainCarContinuous, Pendulum, Acrobot.
Bandits — KArmedBandit<K>, ContextualBandit, NonStationaryBandit, AdversarialBandit.
Toy text — Blackjack, CliffWalking, FrozenLake, Taxi.
Gridworlds (MiniGrid-style) — Empty, DoorKey, Memory, FourRooms, Crossing, LavaGap, MultiRoom, Unlock, UnlockPickup, GoToDoor, DistShift, DynamicObstacles; shared GridCore (grid, entity, action, direction, observation, render, reward, dynamics).
Box2D physics (box2d feature, rapier2d) — BipedalWalker, LunarLander (discrete and continuous action spaces), CarRacing.
Locomotion (locomotion feature, rapier3d) — Reacher, Swimmer, InvertedPendulum, InvertedDoublePendulum.
Games — Chess (full move generation and board state), ConnectFour.
Optimisation landscapes — Sphere, Ackley, Rastrigin for benchmarking evolutionary algorithms.
Wrappers — TimeLimit wraps any Environment with an episode step cap.
Bench adapter (bench feature) — BenchAdapter and preset Suite factories to drive any environment from rlevo-benchmarks.
ASCII render backend for text-based environments.

`rlevo-evolution`

Added

Strategy<B> pure trait (init / ask / tell / best) — stateless, parallelism-friendly, trivially checkpointable.
EvolutionaryHarness<B, S, F> — wraps any Strategy as a BenchEnv.
BatchFitnessFn trait with FromFitnessEvaluable adapter.
GenomeKind enum (RealValued, Binary, Integer, Program).
Classical families — GeneticAlgorithm (real-valued, SBX crossover + polynomial mutation), BinaryGeneticAlgorithm (one-point/uniform crossover + bit-flip mutation), EvolutionStrategy ((1+1), (1+λ), (μ,λ), (μ+λ) with self-adaptive σ), EvolutionaryProgramming (Gaussian perturbation + tournament), DifferentialEvolution (Rand/1/Bin, Best/1/Bin, CurrentToBest/1/Bin), CartesianGeneticProgramming (symbolic regression via CGP graph).
Metaheuristics — ParticleSwarmOptimization, AntColonyOptimizationReal, AntColonyOptimizationPermutation, ArtificialBeeColony, FireflyAlgorithm, BatAlgorithm, CuckooSearch (Lévy flights via Mantegna), GreyWolfOptimizer, SalpSwarmAlgorithm, WhaleOptimizationAlgorithm.
Genetic operators (ops) — selection (tournament, roulette, rank, SUS, elitism, NSGA-II crowding), crossover (uniform, one-point, multi-point, SBX, BLX-α, intermediate), mutation (Gaussian, uniform, polynomial, bit-flip, inversion), replacement (generational, steady-state, elitist, comma, plus).
Custom CubeCL kernels (custom-kernels feature) — fused pairwise-attract (Firefly large-N path) and fused Lévy-flight (Cuckoo/Bat) kernels; pure-tensor fallbacks used when feature is off.
PopulationState tensor wrapper; ShapingFn fitness shaping (linear rank, exponential rank, truncation).

`rlevo-reinforcement-learning`

Added

Replay memory — PrioritizedExperienceReplay (uniform-sampling mode in v0.1.0); TrainingBatch typed container.
Experience — ExperienceTuple (s, a, r, s', done), History trajectory buffer.
Metrics — AgentStats (per-step), PerformanceRecord (per-episode).
DQN — DqnModel, DqnAgent, DqnTrainingConfig; ε-greedy exploration schedule; Double-DQN target option.
C51 — C51Model, C51Agent, C51TrainingConfig; Bellman projection onto N-atom support; categorical cross-entropy loss.
QR-DQN — QrDqnModel, QrDqnAgent, QrDqnTrainingConfig; quantile Huber loss; no [v_min, v_max] required.
PPO — PpoAgent, PpoTrainingConfig, RolloutBuffer, GAE advantages, CategoricalPolicyHead (discrete), TanhGaussianPolicyHead (continuous), clipped surrogate + value loss, early-stop on approx_kl.
PPG — PpgAgent, PpgConfig, AuxBuffer, PpgCategoricalPolicyHead; interleaved policy-phase + auxiliary-phase with KL distillation.
DDPG — DdpgAgent, DdpgTrainingConfig; deterministic actor + Q-critic; Polyak target sync; Gaussian exploration noise.
TD3 — Td3Agent, Td3TrainingConfig; twin-critic min-bootstrap; delayed actor updates; target-policy smoothing.
SAC — SacAgent, SacTrainingConfig; squashed-Gaussian stochastic actor; twin critics; learnable temperature α with auto-tuning toward -|A|.
Shared EpsilonGreedy schedule (DQN / C51 / QR-DQN) and GaussianNoise exploration (DDPG / TD3).

`rlevo-benchmarks`

Added

Evaluator — drives any BenchEnv for N episodes, collecting per-step and per-episode metrics.
Suite — ordered sequence of (env, evaluator) pairs with shared reporter.
Metrics — EaMetrics (best fitness, population diversity, convergence rate), RlMetrics (episode return, episode length, sample efficiency).
Reporters — JsonReporter (newline-delimited JSON), LoggingReporter (tracing spans), TuiReporter (ratatui live dashboard, tui feature).
Checkpoint and Storage traits for saving/resuming benchmark state.
rayon-parallel episode evaluation for multi-seed sweeps.

`rlevo-hybrid`

Added

Stub crate establishing the dependency wiring between rlevo-evolution and rlevo-reinforcement-learning. No hybrid strategies are implemented in v0.1.0; see the crate README for the v0.2.0 roadmap.

`rlevo` (umbrella)

Added

Re-exports all public APIs from every workspace crate behind a single rlevo entry point.
keywords: reinforcement-learning, evolutionary, deep-learning, burn, neural-network.
categories: science, algorithms, simulation.
Full example suite (35 examples across gridworlds, classic control, Box2D, locomotion, evolutionary showcases, RL algorithms, and benchmarks harness).
Cross-crate integration tests.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v0.2.0 — Metrics registry, schema v6, ConstructableEnv

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Changelog

0.2.0 – 2026-06-07

Breaking changes

New crates

Dependency upgrades

`rlevo-core`

`rlevo-environments`

`rlevo-evolution`

`rlevo-reinforcement-learning`

`rlevo-benchmarks`

`rlevo-benchmarks-report-client`

`rlevo` (umbrella)

Infrastructure

0.1.0 – 2026-04-28

`rlevo-core`

`rlevo-environments`

`rlevo-evolution`

`rlevo-reinforcement-learning`

`rlevo-benchmarks`

`rlevo-hybrid`

`rlevo` (umbrella)

Uh oh!

v0.2.0 — Metrics registry, schema v6, ConstructableEnv

Changelog

0.2.0 – 2026-06-07

Breaking changes

New crates

Dependency upgrades

rlevo-core

rlevo-environments

rlevo-evolution

rlevo-reinforcement-learning

rlevo-benchmarks

rlevo-benchmarks-report-client

rlevo (umbrella)

Infrastructure

0.1.0 – 2026-04-28

rlevo-core

rlevo-environments

rlevo-evolution

rlevo-reinforcement-learning

rlevo-benchmarks

rlevo-hybrid

rlevo (umbrella)

Uh oh!

`rlevo-core`

`rlevo-environments`

`rlevo-evolution`

`rlevo-reinforcement-learning`

`rlevo-benchmarks`

`rlevo-benchmarks-report-client`

`rlevo` (umbrella)

`rlevo-core`

`rlevo-environments`

`rlevo-evolution`

`rlevo-reinforcement-learning`

`rlevo-benchmarks`

`rlevo-hybrid`

`rlevo` (umbrella)