Intelligence Pipeline

ruv edited this page May 25, 2026 · 1 revision

Ruflo's self-learning system (RuVector) runs a 4-step pipeline on every task outcome.


The 4-Step Pipeline

Task Completes
      ↓
[1] RETRIEVE ← Search memory for relevant patterns (HNSW)
      ↓
[2] JUDGE ← Evaluate success/failure + quality score
      ↓
[3] DISTILL ← Extract learnings via low-rank adapters (LoRA)
      ↓
[4] CONSOLIDATE ← Prevent catastrophic forgetting (EWC++)
      ↓
Updated Agent Behavior

Step 1: RETRIEVE

When a task starts, retrieve relevant past patterns from memory:

npx ruflo@latest memory search \
  --query "REST API with authentication" \
  --namespace patterns \
  --limit 5

Uses HNSW indexing, with a reported 150x to 12,500x speedup over brute-force search.

Output: Top-k most similar patterns with confidence scores.
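
For intuition about what the search returns, here is a minimal sketch of the brute-force baseline that an HNSW index approximates: score every stored pattern by cosine similarity and keep the top k. The `StoredPattern` shape, the function names, and the 3-dimensional demo vectors are illustrative assumptions, not Ruflo's actual schema or API.

```typescript
// Hypothetical record shape; not the actual RuVector schema.
interface StoredPattern {
  id: string;
  embedding: number[];
}

interface ScoredPattern {
  id: string;
  score: number; // cosine similarity in [-1, 1]
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / Math.sqrt(normA * normB);
}

// Exhaustive scan: score every pattern, sort descending, keep the first k.
function topKPatterns(
  query: number[],
  patterns: StoredPattern[],
  k: number
): ScoredPattern[] {
  return patterns
    .map((p) => ({ id: p.id, score: cosineSimilarity(query, p.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const demoPatterns: StoredPattern[] = [
  { id: "rest-auth", embedding: [1, 0, 0] },
  { id: "pagination", embedding: [0, 1, 0] },
  { id: "jwt-flow", embedding: [0.8, 0.6, 0] },
];
const demoHits = topKPatterns([1, 0, 0], demoPatterns, 2);
```

The exhaustive scan costs one similarity computation per stored pattern. An HNSW index avoids most of those comparisons by walking a layered neighbor graph, at the price of approximate results.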


Step 2: JUDGE

After the task completes, evaluate the outcome:

npx ruflo@latest hooks post-task \
  --task-id "task-123" \
  --success true \
  --quality 0.92  # 0.0 = failure, 1.0 = perfect

The quality score feeds the learning loop:

  • 0.9–1.0: High confidence; use this agent again for similar tasks
  • 0.6–0.89: Medium confidence; consider alternatives
  • Below 0.6: Low confidence; avoid this approach

Output: Verdict stored in feedback namespace.
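
The score bands above can be sketched as a small classifier. The verdict labels and the function name are hypothetical, and the sketch assumes that a score sitting exactly on a boundary (0.9 or 0.6) belongs to the higher band.

```typescript
// Hypothetical verdict labels; the real feedback namespace may store something else.
type Verdict = "reuse" | "consider-alternatives" | "avoid";

function verdictFor(quality: number): Verdict {
  // The negated range check also rejects NaN.
  if (!(quality >= 0 && quality <= 1)) {
    throw new RangeError("quality must be between 0.0 and 1.0");
  }
  if (quality >= 0.9) return "reuse";
  if (quality >= 0.6) return "consider-alternatives";
  return "avoid";
}

const demoVerdict = verdictFor(0.92);
```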


Step 3: DISTILL

Extract the key insights from successful (or failed) tasks using LoRA:

npx ruflo@latest neural train \
  --pattern-type task-routing \
  --epochs 10 \
  --learning-rate 0.01

What it learns:

  • Which agent types excel at which task domains
  • What code patterns predict success/failure
  • Optimal model selection (Haiku vs. Sonnet vs. Opus)

Output: Low-rank weight updates (LoRA, 0.1% of model size).
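
To see why a low-rank update stays small, consider a d x k weight matrix. A rank-r adapter stores two factors, B (d x r) and A (r x k), and the update is their product, so it trains r * (d + k) values instead of d * k. The sketch below is illustrative only: the function names and the 4096 x 4096, rank-2 example are assumptions chosen to land near the 0.1% figure, not Ruflo's actual configuration.

```typescript
// Trainable values in a rank-r adapter for a d x k weight matrix.
function loraParamCount(d: number, k: number, r: number): number {
  return r * (d + k);
}

// Adapter size as a fraction of the full matrix.
function loraParamRatio(d: number, k: number, r: number): number {
  return loraParamCount(d, k, r) / (d * k);
}

// Multiply B (d x r) by A (r x k) to get the dense update deltaW (d x k).
function loraDelta(B: number[][], A: number[][]): number[][] {
  const firstRow = A[0] ?? [];
  return B.map((bRow) =>
    firstRow.map((_, j) =>
      bRow.reduce((sum, bVal, t) => sum + bVal * ((A[t] ?? [])[j] ?? 0), 0)
    )
  );
}

// Rank 2 on a 4096 x 4096 layer: 16,384 trainable values vs 16,777,216.
const demoRatio = loraParamRatio(4096, 4096, 2);

// Rank-1 example: a 2 x 1 factor times a 1 x 2 factor gives a 2 x 2 update.
const demoDelta = loraDelta([[1], [2]], [[3, 4]]);
```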


Step 4: CONSOLIDATE

Prevent catastrophic forgetting using Elastic Weight Consolidation (EWC++):

npx ruflo@latest hooks intelligence \
  --force-training true

EWC++ protects old knowledge while learning new patterns:

  • Each weight has an importance score (Fisher information)
  • High-importance weights resist change
  • Low-importance weights adapt quickly

Output: Consolidated model weights, ready for next task.
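
The three bullets above correspond to a quadratic penalty added to the training loss: (λ/2) · Σ Fᵢ · (θᵢ − θ*ᵢ)², where θ* are the weights after the previous consolidation and Fᵢ is the importance of weight i. The sketch below shows that penalty plus a running-average importance update of the kind online EWC variants use. All names and the `decay` parameter are assumptions for illustration, not Ruflo's implementation.

```typescript
// Quadratic EWC penalty: moving a weight costs more when its importance is high.
function ewcPenalty(
  weights: number[],
  anchor: number[], // weights saved at the last consolidation
  fisher: number[], // per-weight importance
  lambda: number
): number {
  if (weights.length !== anchor.length || weights.length !== fisher.length) {
    throw new Error("weights, anchor and fisher must have the same length");
  }
  let sum = 0;
  weights.forEach((w, i) => {
    const diff = w - (anchor[i] ?? 0);
    sum += (fisher[i] ?? 0) * diff * diff;
  });
  return (lambda / 2) * sum;
}

// Running-average importance estimate from squared gradients.
// `decay` close to 1 keeps more of the old estimate (hypothetical parameter).
function updateFisher(fisher: number[], grads: number[], decay: number): number[] {
  return fisher.map((f, i) => {
    const g = grads[i] ?? 0;
    return decay * f + (1 - decay) * g * g;
  });
}

// Same displacement of 1.0 on each weight, very different cost.
const importantMove = ewcPenalty([1, 0], [0, 0], [4, 0.1], 0.5);
const cheapMove = ewcPenalty([0, 1], [0, 0], [4, 0.1], 0.5);
```

With these numbers, moving the high-importance weight costs 40 times more than moving the low-importance one, which is what makes old knowledge resist change while unused capacity adapts.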


SONA: Self-Optimizing Neural Architecture

SONA adapts in <0.05ms per task:

const sona = new SONAManager();

// Before spawning an agent
const config = await sona.adaptFor("authentication module");
// Returns: {agent: "security-auditor", model: "sonnet", temperature: 0.3}

await spawnAgent(config);

SONA learns from:

  • Task complexity (token count, subtask count)
  • Previous agent performance (quality scores)
  • Domain similarity (via embeddings)
  • Time pressure (deadline)

Mixture of Experts (MoE) Routing

Route tasks to specialized experts based on learned patterns:

npx ruflo@latest hooks route --task "Implement OAuth2 flow"

Output:

Task: Implement OAuth2 flow
Complexity: 0.78 (medium-high)
Recommended agent: security-architect
Model: sonnet
Estimated time: 45 minutes
Confidence: 0.89

MoE learns which experts (agents) are best for which domains:

  • Expert 1 (backend-dev): REST APIs, databases
  • Expert 2 (security-architect): Auth, encryption, threat modeling
  • Expert 3 (mobile-dev): iOS/Android SDKs
  • etc.
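
A common way to implement this kind of routing is a gating step: each expert gets a learned score for the task, a softmax turns the scores into probabilities, and the highest-probability expert wins. The sketch below assumes that scheme. The `ExpertScore` shape and the demo scores are hypothetical, and how Ruflo actually computes scores is not shown here.

```typescript
// Hypothetical gating input: one learned score per expert for the current task.
interface ExpertScore {
  expert: string;
  score: number;
}

// Numerically stable softmax: subtract the max before exponentiating.
function softmax(scores: number[]): number[] {
  const maxScore = scores.reduce((m, s) => Math.max(m, s), -Infinity);
  const exps = scores.map((s) => Math.exp(s - maxScore));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Pick the expert with the highest gate probability.
function routeToExpert(
  candidates: ExpertScore[]
): { expert: string; confidence: number } {
  const probs = softmax(candidates.map((c) => c.score));
  let best = 0;
  probs.forEach((p, i) => {
    if (p > (probs[best] ?? 0)) best = i;
  });
  const chosen = candidates[best];
  if (chosen === undefined) {
    throw new Error("no experts to route to");
  }
  return { expert: chosen.expert, confidence: probs[best] ?? 0 };
}

const demoRoute = routeToExpert([
  { expert: "backend-dev", score: 1.0 },
  { expert: "security-architect", score: 3.0 },
  { expert: "mobile-dev", score: 0.5 },
]);
```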

Trajectory Tracking

Every step in a task is recorded:

npx ruflo@latest hooks intelligence trajectory-start \
  --task "Add user authentication"

npx ruflo@latest hooks intelligence trajectory-step \
  --trajectory-id "traj-123" \
  --action "designed JWT flow" \
  --quality 0.95

npx ruflo@latest hooks intelligence trajectory-end \
  --trajectory-id "traj-123" \
  --success true

Trajectories capture:

  • Action sequence (what steps did the agent take?)
  • Quality at each step (intermediate feedback)
  • Decision points (where did the agent have a choice?)
  • Final outcome (success/failure/partial)

Used for inverse RL (inferring a reward signal from recorded example trajectories, then learning a policy from it).
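
As a rough picture of what one recorded trajectory might hold, the sketch below models the four captured items as a plain data structure. Every type, field, and function name here is a hypothetical illustration; the actual on-disk format is not documented on this page.

```typescript
type TrajectoryOutcome = "success" | "failure" | "partial";

interface TrajectoryStep {
  action: string;          // what the agent did
  quality: number;         // intermediate feedback, 0.0 to 1.0
  alternatives: string[];  // other options at a decision point (empty if none)
}

interface Trajectory {
  id: string;
  task: string;
  steps: TrajectoryStep[];
  outcome: TrajectoryOutcome | null; // null until the trajectory ends
}

function startTrajectory(id: string, task: string): Trajectory {
  return { id: id, task: task, steps: [], outcome: null };
}

// Returns a new trajectory with the step appended; the input is not mutated.
function recordStep(t: Trajectory, step: TrajectoryStep): Trajectory {
  return { id: t.id, task: t.task, steps: t.steps.concat([step]), outcome: t.outcome };
}

function endTrajectory(t: Trajectory, outcome: TrajectoryOutcome): Trajectory {
  return { id: t.id, task: t.task, steps: t.steps, outcome: outcome };
}

// Average of the per-step quality scores; 0 for an empty trajectory.
function meanStepQuality(t: Trajectory): number {
  if (t.steps.length === 0) return 0;
  return t.steps.reduce((sum, s) => sum + s.quality, 0) / t.steps.length;
}

let demoTrajectory = startTrajectory("traj-123", "Add user authentication");
demoTrajectory = recordStep(demoTrajectory, {
  action: "designed JWT flow",
  quality: 0.95,
  alternatives: ["session cookies"],
});
demoTrajectory = recordStep(demoTrajectory, {
  action: "implemented token refresh",
  quality: 0.85,
  alternatives: [],
});
demoTrajectory = endTrajectory(demoTrajectory, "success");
```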


Thompson Sampling Model Router

The 3-tier model selector (Haiku/Sonnet/Opus) is a cost-adjusted bandit:

// After each task, record outcome
await hooks.modelOutcome({
  task: "Refactor auth module",
  model: "sonnet",
  outcome: "success"
})

// Next time a similar task arrives, sample from learned distribution
const model = await hooks.modelRoute({ task: "..." })
// Biased toward Sonnet (higher win rate) but still explores Haiku

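Keeps a Beta(α, β) distribution per model, starting from a prior and updated after each task:

  • α = successes with this model (plus the prior's pseudo-count)
  • β = failures with this model (plus the prior's pseudo-count)
  • Sample θ ~ Beta(α, β) for each model; pick the model with the highest cost-adjusted sample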

Self-corrects against tier overuse after ~50 outcomes.
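
A minimal sketch of this selection loop follows, assuming a uniform Beta(1, 1) prior and a flat per-model `costPenalty` subtracted from the sampled win rate. Both are assumptions for illustration; the page does not say how the cost adjustment is computed. The sampler relies on a standard identity: for integer a and b, the a-th smallest of (a + b − 1) uniform draws follows Beta(a, b).

```typescript
type Rng = () => number; // returns a value in [0, 1)

interface ModelArm {
  model: string;
  successes: number;
  failures: number;
  costPenalty: number; // hypothetical cost adjustment, in win-rate units
}

// Beta(a, b) for integer a, b >= 1 via the order-statistic identity.
function sampleBeta(a: number, b: number, rng: Rng): number {
  const draws: number[] = [];
  for (let i = 0; i < a + b - 1; i++) {
    draws.push(rng());
  }
  draws.sort((x, y) => x - y);
  return draws[a - 1] ?? 0;
}

// Returns a new arm with the outcome counted.
function recordOutcome(arm: ModelArm, success: boolean): ModelArm {
  return {
    model: arm.model,
    successes: arm.successes + (success ? 1 : 0),
    failures: arm.failures + (success ? 0 : 1),
    costPenalty: arm.costPenalty,
  };
}

// Thompson sampling: draw one plausible win rate per model, subtract its
// cost penalty, and pick the best resulting value.
function chooseModel(arms: ModelArm[], rng: Rng): string {
  let bestModel = "";
  let bestValue = -Infinity;
  arms.forEach((arm) => {
    // Uniform Beta(1, 1) prior plus observed counts (an assumption).
    const theta = sampleBeta(1 + arm.successes, 1 + arm.failures, rng);
    const value = theta - arm.costPenalty;
    if (value > bestValue) {
      bestValue = value;
      bestModel = arm.model;
    }
  });
  return bestModel;
}

const demoArms: ModelArm[] = [
  { model: "haiku", successes: 8, failures: 2, costPenalty: 0.0 },
  { model: "sonnet", successes: 9, failures: 1, costPenalty: 0.05 },
];
const demoChoice = chooseModel(demoArms, Math.random);
```

Because each choice is a random draw rather than the posterior mean, a model with few recorded outcomes still wins occasionally, which is where the exploration comes from. As counts grow, the draws concentrate around the observed win rate.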


Pattern Learning

Beyond task routing, agents learn transferable patterns:

# Store a learned pattern
npx ruflo@latest agentdb pattern-store \
  --pattern "Pagination cursors beat offset for large result sets" \
  --type "performance" \
  --confidence 0.94

# Retrieve on similar task
npx ruflo@latest agentdb pattern-search \
  --query "large dataset pagination" \
  --min-confidence 0.8

Patterns are:

  • Mutable (confidence updates with new evidence)
  • Queryable (semantic search)
  • Temporal (confidence decays if unused)
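
The page does not specify the update or decay rule. One simple way to get both behaviors is an exponential half-life decay for unused patterns plus a fixed-rate step toward each new piece of evidence. The functions, the half-life, and the learning rate below are all assumptions for illustration.

```typescript
// Confidence halves every `halfLifeDays` days the pattern goes unused.
function decayedConfidence(
  confidence: number,
  daysUnused: number,
  halfLifeDays: number
): number {
  return confidence * Math.pow(0.5, daysUnused / halfLifeDays);
}

// Move confidence toward new evidence (1 = pattern held, 0 = pattern failed)
// by a fraction `rate` of the remaining gap.
function updatedConfidence(
  confidence: number,
  evidence: number,
  rate: number
): number {
  return confidence + rate * (evidence - confidence);
}

// A 0.94-confidence pattern left unused for 60 days with a 30-day half-life.
const staleConfidence = decayedConfidence(0.94, 60, 30);

// One confirming observation then nudges it back up.
const refreshedConfidence = updatedConfidence(staleConfidence, 1, 0.2);
```

With a rule like this, a stale pattern can drop below the `--min-confidence` threshold used in a search and then return once fresh evidence confirms it.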

Neural Statistics

Check the learning progress:

npx ruflo@latest hooks intelligence stats

Output:

SONA Patterns: 342
  - Active: 298 (confidence > 0.7)
  - Warming up: 44 (< 10 examples)
MoE Expert Distribution:
  - backend-dev: 0.35 utilization
  - coder: 0.28 utilization
  - security-architect: 0.22 utilization
  - reviewer: 0.15 utilization
Model Router:
  - Haiku: 45% chosen, 0.82 win-rate
  - Sonnet: 50% chosen, 0.91 win-rate
  - Opus: 5% chosen, 0.94 win-rate
EWC++ Consolidations: 7 (last: 2 hours ago)
Trajectory Storage: 1.2 GB (1,847 tasks recorded)

Performance Targets

Metric | Target | Status
SONA adaptation latency | <0.05ms | Achieved
HNSW search speedup | 150x-12,500x | Achieved (persistent, 3.92x Int8)
Memory reduction | 50-75% | Achieved (RaBitQ 1-bit quantization)
Flash Attention speedup | 2.49x-7.47x | In progress
Model routing win-rate | >90% | Achieved with Thompson sampling
EWC++ consolidation time | <500ms | Achieved

Controlling Learning

Enable/Disable

npx ruflo@latest hooks intelligence --enable true
npx ruflo@latest hooks intelligence --enable false

Clear History

npx ruflo@latest hooks intelligence-reset

Wipes all patterns and trajectories. Use before major refactors.

Export Patterns

npx ruflo@latest hooks transfer store --pattern "my-pattern"

Share successful patterns with teammates or other projects.


Advanced: Custom Pattern Stores

Integrate your own pattern database:

import { ReasoningBank } from "@claude-flow/neural";

const bank = new ReasoningBank();

// Add custom pattern
await bank.storePattern({
  id: "oauth2-flow",
  embedding: [0.1, 0.2 /* , ... */],  // 384-dim; remaining values elided
  metadata: {
    type: "security",
    domain: "auth",
    quality: 0.95,
    source: "security-audit-123"
  }
})

// Query
const similar = await bank.search({
  embedding: queryEmbedding,
  topK: 5
})

Witness Integration

Every learned pattern can be signed (ADR-103):

npx ruflo@latest witness sign \
  --manifest learned-patterns.json \
  --private-key ~/.ruflo/signing-key

Enables:

  • Auditability (who verified this pattern?)
  • Cross-org trust (certified by external auditors)
  • Compliance (immutable proof of learning)

Ruflo v3.10.1 · GitHub
