An agentic feedback loop that evolves Rust functions: an LLM generates type-safe code, the harness compiles and hot-swaps it into your running binary, evaluates the results, and feeds performance back to the LLM for the next iteration — bare-metal execution, zero interpreter overhead.
flowchart LR
A["LLM writes\nRust function"] --> B["Constrained\nGeneration"]
B -->|"validate + compile"| C["Native .so"]
C -->|"hot-swap"| D["Running\nBinary"]
D -->|"bare-metal\nexecution"| E["Evaluate"]
E -->|"feedback"| A
style A fill:#1a1a2e,stroke:#e94560,color:#eee
style B fill:#16213e,stroke:#e94560,color:#eee
style C fill:#0f3460,stroke:#e94560,color:#eee
style D fill:#0f3460,stroke:#e94560,color:#eee
style E fill:#1a1a2e,stroke:#e94560,color:#eee
Declare function signatures with the evolvable! macro and provide an evaluation function.
The agent autonomously implements, and refines the code each iteration — the harness
validates, compiles, and hot-swaps the native code into the running process without a restart.
Constrained generation is what makes this reliable: the harness enforces that LLM output is valid Rust, matches the declared function signature, and compiles successfully. When any check fails, the specific error (parse failure, signature mismatch, or compiler diagnostics) is appended to the prompt and the LLM retries automatically until it produces correct code.
symbiont::evolvable! {
fn step(counter: &mut usize) {
// default implementation body, entirely evolved by the LLM
*counter += 1;
println!("doing stuff in iteration {}", counter);
}
}
#[tokio::main]
async fn main() -> symbiont::Result<()> {
let runtime = symbiont::Runtime::init(SYMBIONT_DECLS, symbiont::Profile::Debug).await?;
let agent = symbiont::inference::init_agent()?;
let fn_sigs = runtime.fn_sigs();
let base_prompt = format!(
"Give a concise implementation for this function signature: ```{}```, \
that increments the counter by a constant in the range (5..20). \
Give Rust Code Only.",
fn_sigs[0]
);
let mut counter = 0;
let mut last_evolution = std::time::Instant::now();
loop {
step(&mut counter); // bare-metal: calls into the hot-loaded native dylib
println!("counter: {counter}");
if last_evolution.elapsed() >= std::time::Duration::from_secs(10) {
// LLM rewrites the function, harness validates + compiles + hot-swaps
runtime.evolve(&agent, &base_prompt).await?;
last_evolution = std::time::Instant::now();
// New Agent written code is available next time `step` is called and executed natively.
}
}
}The example shows a basic counter function where the Agent evolves the implementation, based on a user-defined prompt. The compiled dylib (of the function) gets hot-swapped in the evaluation loop, achieving bare-metal performance. This is agentic code mode in action. The harness provides constrained generation and nudges the LLM prompt if necessary.
symbiont-counter-example.mp4
See the Development setup section and the examples/ directory for more.
- Type-safe agentic code: Agents express intent as Rust functions with enforced signatures.
- Constrained generation: Parse errors, signature mismatches, and compiler diagnostics steer the LLM until it produces valid code.
- Hot-swap dylibs:
Functions are compiled to native shared libraries and swapped in-place via
libloading— no process restart. - Bare-metal performance: Evolved functions run as native compiled code. The dispatch overhead is ~1 ns per call (a single atomic pointer load + indirect call). The hot path is fully lock-free and multi-thread safe.
- Plug-in inference: Any Inference provider is supported via rig.
- Tiny Core: Only ~1000 LOC for the Agent harness and constrained generation part.
- Catches Agent Code Panics
Any LLM code that generate a runtime panic will be caught using
catch_unwind, and the panic message is used to provide backpressure in the prompt. See unwind.rs for details.
Current-generation Agent harnesses such as Agentica achieve SOTA on complex long-running tasks like ARC-AGI-3 by providing a persistent Python REPL that the agent lives in. This is known as CodeMode — it allows the agent to leverage the entire Python ecosystem natively, without MCP.
However, Python's interpreter overhead becomes the bottleneck for compute-heavy workloads. If the agent's task is to optimize a well-typed function, evaluation in Python can be 10-100x slower than native execution, directly limiting how many iterations the agent can explore in a given time budget.
Symbiont brings a similar agentic code evolution paradigm to Rust.
Agents write type-safe function bodies that get compiled to native code and hot-swapped into the running binary.
The Rust compiler enforces memory safety and type correctness,
while symbiont's constrained generation loop ensures the LLM output always compiles before it reaches execution.
- Typed function body search (e.g., find an implementation that satisfies a test suite).
- See fizzbuzz-example
- See rastrigin-example
- Performance Optimization under functional equivalence
- See sort-example
- Game AI / strategic reasoning through evolved code
- Auto-research workflows with native-speed evaluation.
- Black-box optimization of inputs that produce desired outputs, e.g. Parameter Search.
- Self-evolving feature processing pipelines.
- Agentic code evolution generally.
The project uses Nix for reproducible builds and devenv to manage a local inference server.
Prerequisites: Nix with flakes enabled.
Setup your .env file like this for the next steps (or use your desired inference provider):
export API_KEY=""
export BASE_URL="http://127.0.0.1:8321/v1"
export MODEL="google/gemma-4-E2B-it"Then execute the following:
# Enter the development shell (provides Rust nightly, cargo tools, formatters)
nix develop
# Start a local llama-cpp server with gemma-4-E2B-it (auto-downloads on first run)
devenv up
# In another terminal, run the counter example
cargo run -p counter-exampleFunction pointers are cached in AtomicPtr statics after each load — callers never touch a lock or perform a symbol lookup.
| Time per call | |
|---|---|
| Direct function call | 0.91 ns |
evolvable! dispatch |
1.64 ns |
Benchmark: cargo bench -p symbiont --bench dispatch_overhead
On reload, the runtime updates the atomic pointers and drops the old library.
This is safe because the feedback loop contract guarantees no evolvable functions are executing during evolution — only one .so is loaded at any time.
A typical evolution cycle (LLM inference → constrained generation → compilation -> fn evaluation) highly depends on:
- The model being used (Inference latency)
- Size of the generated Rust code.
- Optimization level for the compiled dylib.
- Did the LLM make a misstake? -> Repeat cycle again with new steering prompt.
Example timings for fizzbuzz-example using unsloth/Qwen3.6-35B-A3B-GGUF:UD-Q4_K_M on an RTX Pro 6000 Blackwell and llama-cpp (~150TPS):
| Stage | Time |
|---|---|
| LLM inference | 4852 ms |
| Harness checks | 0 ms |
| Compilation | 118 ms |
| Function Eval | <3ns |
The function evaluation pipeline should be built to keep these proportions in mind.
The fizzbuzz-example can be oneshot and function evaluation is super cheap, so its not representative, just a toy example.
These constraints arise from the binary/dylib interaction boundary. The harness mitigates most of them, but users should be aware:
- Static function signatures:
The LLM can only rewrite function bodies — the signature declared in
evolvable!is fixed at compile time and enforced on every evolution. This is by design (it's what makes constrained generation possible), but it means the agent cannot add parameters, change return types, or introduce new functions at runtime. It would be UB to hot-swap a different function signature in, when the main binary expects a certain memory layout. - Sequential feedback loop:
All evolvable function calls must have returned before
evolve()is called. The old library is dropped on reload, so in-flight calls through stale pointers would be UB. This matches the intended usage pattern (run functions, collect results, evolve, repeat) and is enforced with an assertion in debug builds at zero cost in release. Multi-threading is possible, but requires extra care. - Same toolchain required:
Rust has no stable ABI. The binary and dylib must be compiled with the same
rustcversion to guarantee matching calling conventions and memory layouts. The harness ensures this by compiling the dylib on the same machine with the same toolchain. - Primitive types only:
The generated dylib has no dependencies, so evolvable function signatures are limited to
stdtypes (usize,f64,&[u8], etc.). Custom types across the boundary will require shared dependency support (not yet implemented). unsafeat the boundary: Dynamic symbol lookup is inherentlyunsafe. The harness validates function signatures against theevolvable!declaration and only loads code that parses, type-checks, and compiles — but theextern "Rust"pointer cast remains an unsafe invariant.
See CAVEATS.md for more details.
- slopc for function body implementations at compile time, but no evolution or feedback cycles there.
- hot-lib-reloader for the idea of hot-swapping functions at runtime.
- GEPA for optimizing any system with textual parameters against any evaluation metric. But not Rust :(
- Agentica for a Python Agent SDK, providing persistent REPL and sub-agents.
Also checkout the TODOs file for what might come next for symbiont. Stay tuned!
Copyright (C) 2026 MathisWellmann
This project is licensed under the Mozilla Public License 2.0 — see LICENSE for details.