Self-modeling cybernetic manifold — the PLATO agent loop as categorical trace, curvature = compute cost.
104 tests · Rust · nalgebra + serde
This crate implements a self-modeling agent that operates on a cybernetic manifold — a state space where the metric encodes computational cost (FLOPs) and curvature measures how GPU compute paths diverge. The agent runs a 9-step categorical trace loop to observe itself, predict itself, and control itself.
The system is called PLATO (Philosophical Learning Adaptive Transcendent Observer). It is:
- Self-referential: it can construct fixed points (Y combinator for agents)
- Homeostatic: it enforces conservation laws on its own state
- Adaptive: it modifies its own structure based on feedback
- Meta-learning: it optimizes its own learning rate via second-order gradients
In a compact closed category, every endomorphism f: A → A, and in particular the identity morphism id_A, has a canonical trace: a loop obtained by composing with the unit and counit of the duality. The 9-step PLATO loop is this trace, instantiated as a concrete operational cycle:
Observe → Represent → Decompose → Optimize → Classify → Predict → Control → Adapt → ObserveAgain → (back to Observe)
Each step is a morphism in the category. The trace connects the last output to the first input, forming a self-referential loop. The agent's manifold curvature (Ricci tensor) measures how expensive it is to move between states — curvature = compute cost.
```toml
[dependencies]
lau-self-modeling = { git = "https://github.com/SuperInstance/lau-self-modeling" }
```

```bash
git clone https://github.com/SuperInstance/lau-self-modeling.git
cd lau-self-modeling
cargo test
```

```rust
use lau_self_modeling::PlatoAgent;

let mut agent = PlatoAgent::new("demo", 4, 1000.0);

// Set a goal state
agent.goal = vec![1.0, 0.0, 1.0, 0.0];

// Run 10 cycles of the 9-step loop
for _ in 0..10 {
    agent.cycle().unwrap();
}

// Check how many FLOPs were consumed
println!("FLOPs used: {}", agent.curvature.total_flops());
println!("Ricci scalar: {:.4}", agent.manifold.points.last().unwrap().ricci_scalar);
```

```rust
use lau_self_modeling::{CyberneticManifold, ManifoldPoint};

let mut m = CyberneticManifold::named(3, "test");
let p1 = ManifoldPoint::new(vec![1.0, 0.0, 0.0]);
let p2 = ManifoldPoint::new(vec![0.0, 1.0, 0.0]);
println!("Distance: {:.4}", p1.distance_to(&p2)); // √2
println!("Midpoint ricci: {:.4}", p1.midpoint(&p2).ricci_scalar);
```

```rust
use lau_self_modeling::SelfReference;

let mut sr = SelfReference::new(3);
let f = |x: &[f64]| vec![x[0] * 0.5, x[1] * 0.5, x[2] * 0.5]; // contractive
let fp = sr.find_fixed_point(&f, &[1.0, 1.0, 1.0]);
assert!(fp.is_valid(1e-6));
assert!(fp.stable); // contractive → stable
```

| Module | Key Types | Tests | Purpose |
|---|---|---|---|
| manifold | CyberneticManifold, ManifoldPoint | 14 | State space with FLOPs-encoded metric |
| loop9 | Loop9, LoopStep, StepResult | 12 | 9-step categorical trace cycle |
| curvature | ComputationalCurvature, CurvatureTensor, FlopsBudget | 11 | Ricci curvature from GPU dispatch |
| feedback | FeedbackLoop, FeedbackKind, FeedbackSignal | 12 | Positive/negative feedback control |
| homeostasis | Homeostat, ConservationLaw, DeviationReport | 12 | Conservation law enforcement |
| adaptation | AdaptationEngine, StructuralChange | 10 | Self-modification based on feedback |
| meta_learning | MetaLearner, GradientOfGradient | 12 | Second-order learning rate optimization |
| self_reference | SelfReference, FixedPoint | 11 | Y combinator / Gödel fixed points |
| plato | PlatoAgent | 10 | Complete self-modeling agent |

The loop is the categorical trace of the identity morphism:
| Step | Morphism | What it does |
|---|---|---|
| 0. Observe | O: World → State | Read current state from the world |
| 1. Represent | R: State → Manifold | Embed state onto the cybernetic manifold |
| 2. Decompose | D: Manifold → Spectrum | Spectral decomposition of the state |
| 3. Optimize | ∇: Spectrum → Gradient | Compute gradient on the manifold |
| 4. Classify | C: Gradient → Category | Categorize the current regime |
| 5. Predict | P: Category → Future | Predict next state |
| 6. Control | U: Future → Action | Compute control action |
| 7. Adapt | A: (State, Action) → Params | Modify internal parameters |
| 8. ObserveAgain | O': Action → World | Apply action, return to step 0 |

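
The cyclic ordering in the table above can be sketched as a plain enum whose successor function wraps around. This is a minimal illustration only: `Step`, `ALL`, `index`, and `next` are hypothetical names chosen for this example and are not taken from the crate's `LoopStep` type, whose actual shape may differ.

```rust
/// Hypothetical stand-in for the nine loop steps (not the crate's `LoopStep`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Step {
    Observe,
    Represent,
    Decompose,
    Optimize,
    Classify,
    Predict,
    Control,
    Adapt,
    ObserveAgain,
}

impl Step {
    /// All steps in loop order, matching the numbering 0..=8 in the table.
    const ALL: [Step; 9] = [
        Step::Observe,
        Step::Represent,
        Step::Decompose,
        Step::Optimize,
        Step::Classify,
        Step::Predict,
        Step::Control,
        Step::Adapt,
        Step::ObserveAgain,
    ];

    /// Zero-based position of this step in the loop.
    fn index(self) -> usize {
        self as usize
    }

    /// Successor step; `ObserveAgain` wraps back to `Observe`, closing the loop.
    fn next(self) -> Step {
        Step::ALL[(self.index() + 1) % Step::ALL.len()]
    }
}

fn main() {
    let mut step = Step::Observe;
    for _ in 0..Step::ALL.len() {
        println!("{}. {:?}", step.index(), step);
        step = step.next();
    }
    // After nine transitions the walk is back at the start.
    assert_eq!(step, Step::Observe);
}
```

The modular increment is what plays the role of the trace here: it is the only place where the last output is wired back to the first input.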
The manifold M is equipped with a metric tensor g where gᵢⱼ = FLOP cost of moving from state i to state j. The Ricci curvature Rᵢⱼ measures the rate at which nearby compute paths diverge:
- Positive curvature (R > 0): paths converge — cheap computation
- Negative curvature (R < 0): paths diverge — expensive computation
- Flat (R = 0): paths stay parallel — uniform cost
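The Y combinator, which satisfies Y(f) = f(Y(f)), enables the agent to apply its own policy to itself. In practice, this is fixed-point iteration: start with state x₀, apply f repeatedly, and check convergence. If f is a contraction (‖f(x) − f(y)‖ ≤ α‖x − y‖ for all x, y and some α < 1), the Banach fixed-point theorem guarantees that a unique fixed point exists and that the iteration converges to it from any starting state, so the fixed point is stable.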
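
The fixed-point iteration described above can be sketched with the standard library alone. This is an illustrative sketch, not the crate's `SelfReference::find_fixed_point`; the function name `fixed_point`, its parameters, and the stopping rule (Euclidean distance between successive iterates) are assumptions made for this example.

```rust
/// Iterate x <- f(x) until two successive iterates are closer than `tol`
/// in Euclidean norm. Returns `None` if `max_iter` is exhausted first.
/// (Hypothetical helper; name and stopping rule are assumptions.)
fn fixed_point<F>(f: F, x0: &[f64], tol: f64, max_iter: usize) -> Option<Vec<f64>>
where
    F: Fn(&[f64]) -> Vec<f64>,
{
    let mut x = x0.to_vec();
    for _ in 0..max_iter {
        let next = f(&x);
        // Distance between the new iterate and the previous one.
        let step = next
            .iter()
            .zip(&x)
            .map(|(a, b)| (a - b).powi(2))
            .sum::<f64>()
            .sqrt();
        x = next;
        if step < tol {
            return Some(x);
        }
    }
    None
}

fn main() {
    // Affine contraction with factor 0.5: f(x) = 0.5 * x + c, c = (1, -2).
    // Solving x = 0.5 * x + c gives the fixed point (2, -4).
    let f = |x: &[f64]| vec![0.5 * x[0] + 1.0, 0.5 * x[1] - 2.0];
    let fp = fixed_point(f, &[0.0, 0.0], 1e-12, 200).expect("contraction should converge");
    println!("fixed point: {:?}", fp);
}
```

With a contraction factor of 0.5 the error halves on every pass, so the tolerance is reached well inside the iteration budget; a non-contractive map such as x ↦ 2x + 1 never satisfies the stopping rule and yields `None`.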
The Homeostat enforces conservation laws (sum-to-one, energy conservation, etc.) at every step. When a deviation is detected, it triggers corrective feedback to bring the system back within tolerance.
The MetaLearner tracks the Hessian diagonal (∂²L/∂θᵢ²) of the loss landscape and computes the optimal learning rate:
η* = ‖∇L‖ / (‖∂²L/∂θ²‖ + ε)
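This is a scalar, curvature-scaled step size, analogous to a natural-gradient or Newton step adapted to the cybernetic manifold's curvature: large second derivatives shrink the rate, and ε keeps the denominator away from zero.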
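
As a worked instance of the formula for η*, the sketch below evaluates it for a small gradient and Hessian diagonal. It assumes both norms are Euclidean (the text does not say which norm is used), and the names `l2_norm` and `curvature_scaled_lr` are hypothetical, not part of the crate's `MetaLearner` API.

```rust
/// Euclidean norm of a slice.
fn l2_norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// eta* = ||grad|| / (||hess_diag|| + eps), assuming Euclidean norms.
/// (Hypothetical helper illustrating the formula, not the crate's API.)
fn curvature_scaled_lr(grad: &[f64], hess_diag: &[f64], eps: f64) -> f64 {
    l2_norm(grad) / (l2_norm(hess_diag) + eps)
}

fn main() {
    let grad = [3.0, 4.0]; // ||grad|| = 5
    let hess_diag = [6.0, 8.0]; // ||hess_diag|| = 10
    let eta = curvature_scaled_lr(&grad, &hess_diag, 1e-8);
    println!("eta* = {:.4}", eta); // roughly 5 / 10

    // A flat landscape (zero curvature) is where eps matters:
    // the denominator is eps alone, so the result stays finite.
    let eta_flat = curvature_scaled_lr(&grad, &[0.0, 0.0], 1e-8);
    println!("eta* on flat landscape = {:e}", eta_flat);
}
```

The second call shows why ε needs to be chosen with care: with no curvature signal the rate becomes very large, so a practical implementation would likely clamp it.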
In a compact closed category 𝒞, the trace of a morphism f: A → A is:
Tr(f) = ε_A ∘ (f ⊗ id_{A*}) ∘ η_{A*}
where η_A: I → A* ⊗ A is the unit and ε_A: A ⊗ A* → I is the counit; η_{A*}: I → A ⊗ A* uses the canonical isomorphism A** ≅ A. The result is a morphism I → I, that is, a scalar. For the PLATO loop, this gives a self-referential cycle where the agent's output feeds back into its input.
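
To make the trace formula concrete, the sketch below evaluates it in one familiar compact closed category: finite-dimensional real vector spaces, where the unit inserts the sum of basis and dual-basis pairs and the counit evaluates a covector on a vector. Choosing this category is an assumption made for illustration (the text does not say which category the crate works in), and `trace_via_duality` is a hypothetical name.

```rust
/// Trace of a square matrix computed the "categorical" way:
///   unit:   1 |-> sum_i e_i (x) e^i   (basis vector paired with its dual)
///   f (x) id: apply f to the vector leg, giving f(e_i) (x) e^i
///   counit: evaluate e^i on f(e_i), i.e. read off component i
/// The input is assumed to be square, stored row-major as f[row][col].
fn trace_via_duality(f: &[Vec<f64>]) -> f64 {
    let n = f.len();
    let mut total = 0.0;
    for i in 0..n {
        // f(e_i) is column i of the matrix.
        let f_ei: Vec<f64> = (0..n).map(|row| f[row][i]).collect();
        // Pairing with the dual basis vector e^i selects component i.
        total += f_ei[i];
    }
    total
}

fn main() {
    let identity = vec![
        vec![1.0, 0.0, 0.0],
        vec![0.0, 1.0, 0.0],
        vec![0.0, 0.0, 1.0],
    ];
    // The trace of the identity is the dimension of the space.
    println!("Tr(id) = {}", trace_via_duality(&identity));

    let m = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    println!("Tr(m)  = {}", trace_via_duality(&m));
}
```

In this setting the trace of the identity morphism is just the dimension of the space, which is the scalar obtained by closing the identity wire into a loop.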
The Ricci curvature tensor Rᵢⱼ on the cybernetic manifold is approximated as:
Rᵢⱼ ≈ (½(FLOPs(i) + FLOPs(j)) − FLOPs(midpoint(i,j))) / Δx²
This measures how much cheaper the midpoint is than the average of the two endpoint costs. Positive curvature means the midpoint is cheaper than expected (convergent GPU dispatch); negative means it is more expensive (divergent dispatch).
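
A midpoint-based estimate of this kind can be sketched as a second difference of a cost function. The sketch follows the sign convention stated in the prose (positive when the midpoint is cheaper than the endpoint average) and assumes Δx² is the squared Euclidean distance between the two states. `curvature_estimate` and the toy cost functions are hypothetical and do not reflect how the crate's `ComputationalCurvature` measures FLOPs.

```rust
/// Second-difference curvature estimate between states `a` and `b`.
/// Positive when the midpoint costs less than the endpoint average.
/// Assumes `a != b` (otherwise the denominator is zero) and that
/// delta-x squared is the squared Euclidean distance (an assumption).
fn curvature_estimate<C>(cost: C, a: &[f64], b: &[f64]) -> f64
where
    C: Fn(&[f64]) -> f64,
{
    let mid: Vec<f64> = a.iter().zip(b).map(|(x, y)| 0.5 * (x + y)).collect();
    let dx2: f64 = a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum();
    (0.5 * (cost(a) + cost(b)) - cost(mid.as_slice())) / dx2
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];

    // Toy cost models standing in for measured FLOPs.
    let convex = |p: &[f64]| p.iter().map(|v| v * v).sum::<f64>();
    let linear = |p: &[f64]| p.iter().sum::<f64>();
    let concave = |p: &[f64]| p.iter().map(|v| v.abs().sqrt()).sum::<f64>();

    println!("convex cost:  {:+.4}", curvature_estimate(convex, &a, &b));
    println!("linear cost:  {:+.4}", curvature_estimate(linear, &a, &b));
    println!("concave cost: {:+.4}", curvature_estimate(concave, &a, &b));
}
```

A cost that is convex along the segment gives a cheaper midpoint and a positive estimate, a linear cost gives zero (the flat case), and a concave cost gives a negative estimate.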
Gödel's fixed-point lemma (the diagonal lemma) states that for any formula φ(x) with one free variable in a sufficiently strong formal system, there exists a sentence G such that the system proves G ↔ φ(⌜G⌝). Its computational counterpart is the Y combinator (more generally, Kleene's recursion theorem). For agents, it means the agent can construct a state that represents its own properties, which is the formal basis for its self-model.
For a conservation law C(x) = c, the deviation δ = C(x) − c triggers a feedback correction:
x_corrected = x − (δ / ‖∇C‖²) · ∇C
This is a negative feedback projection: it restores the constraint to first order (exactly, when C is linear) while minimizing the L² perturbation of the state.
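
For the sum-to-one law mentioned earlier, the correction formula has a closed form that is easy to check by hand: C(x) = Σxᵢ, so ∇C = (1, …, 1) and ‖∇C‖² = n. The sketch below applies it; `project_sum_to` is a hypothetical helper written for this example and is not the crate's `Homeostat` API.

```rust
/// Project `x` onto the hyperplane sum(x) = target using
///   x_corrected = x - (delta / ||grad C||^2) * grad C
/// with C(x) = sum(x), grad C = (1, ..., 1), ||grad C||^2 = n.
/// Assumes `x` is non-empty. Does not enforce non-negativity.
fn project_sum_to(x: &[f64], target: f64) -> Vec<f64> {
    let delta = x.iter().sum::<f64>() - target; // deviation C(x) - c
    let grad_norm_sq = x.len() as f64; // n ones, squared and summed
    let shift = delta / grad_norm_sq;
    // Every component moves by the same amount along grad C.
    x.iter().map(|v| v - shift).collect()
}

fn main() {
    let x = [0.5, 0.4, 0.3]; // sums to 1.2, so delta = 0.2
    let corrected = project_sum_to(&x, 1.0);
    println!("corrected: {:?}", corrected);
    println!("sum: {:.12}", corrected.iter().sum::<f64>());
}
```

Because this constraint is linear, a single correction step lands on it up to rounding; a nonlinear law such as energy conservation would generally need the step repeated until the deviation falls within tolerance.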
MIT