Releases · robot-accomplice/roboticus-rust

06 Apr 10:03

github-actions

v0.11.4

7e5f0d6

v0.11.4 Latest

Latest

Theme: Adaptive Intelligence

The router learns from every turn. The pipeline owns its own crate.

Added

Pipeline decomposition: run_pipeline(), core inference engine, stage functions, decomposition, heuristics, and task state moved to roboticus-pipeline crate (50 source files, ~6,500 lines).
Capability traits: PipelineSecurity, InferenceRunner, PipelineDeps — trait-scoped dependency injection replaces &AppState across the entire inference path.
Shadow routing predictions: Live recording of metascore vs heuristic routing decisions for offline outcome evaluation.
Metascore routing fitness tests: 16 profile-level + 4 end-to-end routing tests verifying metascore overrides, breaker exclusion, cost weight, session penalty, accuracy floor.
Startup baselining fitness tests: 5 tests verifying cold-start model quality seeding and its influence on routing.
Architecture fitness tests: 14 tests enforcing connector thinness, dependency direction, no AppState in pipeline/core/stages.
Dashboard markdown rendering: Consistent ##, ###, *italic*, **bold** rendering across memory page entries.

Changed

PipelineRequest uses PipelineDeps (trait refs) instead of &AppState.
PipelineError uses u16 status codes (framework-agnostic).
Bot command dispatch moved from pipeline to connector pre-pipeline step.
Nickname refinement uses InferenceRunner trait instead of direct provider key resolution.

Fixed

Degraded response on follow-up requests: Cache discard falls through to fresh inference; protocol rescue tries stripping before fallback.
Migration 31+33 crash on fresh databases: Guard has_table() before ALTER TABLE pipeline_traces.
All #[ignore] classifier tests: Fixed to pass with n-gram fallback.
MSRV compliance: is_multiple_of() replaced with modulo for Rust 1.85.

Assets 9

03 Apr 08:41

github-actions

v0.11.3

44d9c85

v0.11.3

Theme: Complete the Operating Picture

The agent can perceive everything, act everywhere, and explain every decision.

Added

Matrix E2EE (Olm/Megolm): vodozemac-based encryption for Matrix adapter — device key lifecycle, Olm session establishment, Megolm group encrypt/decrypt, to-device key exchange, JSON-file key persistence.
Windows AppContainer sandboxing: Job Object confinement for script execution — memory limits, kill-on-close, process count caps with graceful degradation.
Cognitive scaffold architecture principle (ARCHITECTURE.md §4): durable context injection, structured self-knowledge, continuity preservation, learning from failure, referenceability.
Topic segmentation: Embedding-free topic detection at message storage, topic-aware context assembly (current-topic full, off-topic summarized), topic_tag column on session_messages.
Context compaction: Deduplication, compression, budget enforcement between retrieval and context assembly — 25% L0 budget cap for memory.
Ambient recency injection: Last 2 hours of episodic memories injected into every turn regardless of query similarity.
SVG pipeline flow graph: Interactive flow visualization with connected nodes, directional arrows, click-to-inspect floating popovers. Full pipeline trace from input_validation through guard_chain.
Guard stage in pipeline traces: Guard outcomes visible in flow visualization with per-guard annotations.
TASK_DEFERRAL + FALSE_COMPLETION semantic banks: 10 exemplars each for detecting narrated-future-action and unverified completion claims.
EXECUTION intent exemplars expanded: Task-management verbs (close out, implement, carry out, handle, process, complete).
DELEGATION intent exemplars expanded: Indirect delegation patterns (implement recommendations using agents, direct agents to carry out plan).
Delegation workflow Report step: Mandatory step 6 — orchestrator must present delegation results to user since subagents run in isolated sessions.
FIRMWARE.toml rules migration: roboticus update auto-migrates [[rules]] array to [rules] table format.
Generic channel poll loop: Single channel_poll_loop replaces 4 near-identical platform loops.
Database indexes: tasks(status), cron_jobs(enabled, next_run_at), transactions(created_at DESC).
sandbox_required config flag: Abort script execution if OS-level sandboxing unavailable.
#[instrument] on pipeline functions: Automatic span tracing on run_pipeline, agent_message_stream, process_channel_message.
CORS layer: tower-http CorsLayer on the API router.
CSP Google Fonts: fonts.googleapis.com + fonts.gstatic.com in Content-Security-Policy.
God file splits: main.rs 3408→1454, run.rs 3113→2432, update.rs 3058→973, transform.rs 2876→289.

Fixed

Dashboard JS syntax error: Orphaned renderWallet in efficiency.js + duplicate IIFE close in websocket.js — killed all dashboard interactivity since 2.27 decomposition.
InternalJargonGuard: Migrated from 11-string word list to NARRATED_DELEGATION semantic classifier (threshold 0.8). Subagent leak check respects user prompt context.
TaskDeferralGuard: Migrated from "let me"/"I'll" word list to TASK_DEFERRAL semantic bank.
ExecutionTruthGuard: Uses FALSE_COMPLETION score, not bare intent trigger. Recommendations no longer treated as false delegation claims.
ModelIdentityTruth: || → && — stops replacing 1,649-char responses that have ≤3 lines. Redacts model name in substantive responses instead of replacing entire content.
Deterministic fallback: Preserves user's topic snippet instead of generic "did not meet quality standards."
Tool retry loop: Surfaces actual tool error instead of "same tool call kept repeating."
Behavioral contract §1.5: Latest user message takes priority over stale plans.
Cron/subagent session isolation: Dedicated sessions per invocation prevent pollution of user conversations.
Cron query optimization: WHERE enabled=1 pushed to SQL; uses new index.
Delivery queue in-flight recovery: Stale in-flight items auto-recovered after 5 minutes.
CapacityTracker bounded: Hard cap at 10,000 events per vector.
Wallet HTTP timeout: 30s + 5s connect_timeout (was unbounded).
Matrix crypto file permissions: 0o600 on crypto_state.json.
FIRMWARE.toml schema: Accepts both [[rules]] array and [rules] table formats.
Codex CLI plugin: Removed invalid --non-interactive flag, fail-fast error handling.
Channel health recovery: Removed stale recompute override in channel_status().
Theme dedup: Reversed retain order so catalog overrides built-in.
prune_old_backups dedup: Moved to roboticus-core, imported in roboticus-api.
Rate-limit headers: .unwrap() → .expect("numeric header value").
Embedding/classifier log levels: DEBUG → TRACE for per-request embedding and centroid computation.
Streaming guard dedup: GuardContext::for_streaming() replaces triplicated construction.
Discord intents: Magic numbers replaced with named constants.
Intent classification: "close out those 23 revenue tasks" now correctly matches EXECUTION.

Assets 9

30 Mar 09:45

github-actions

v0.11.2+hotfix.1

b84bf26

v0.11.2+hotfix.1

v0.11.2 hotfix 1

Assets 9

29 Mar 13:53

github-actions

v0.11.1

b0d28b3

v0.11.1

Added

TaskOperatingState: Planner as sole authority for action dispatch — replaces 10 retired shortcuts with structured decision-making.
Streaming guard chain: Buffer-first + post-stream guard checks with client-side recovery (stream_retry, stream_replace, stream_blocked).
SemanticClassifier hardening: Typed category constants, per-bank thresholds, TrustLevel (Neural/NGram), AbstainPolicy, ClassificationTrace.
Platform behavioral contract: User intent sovereignty, voice boundaries, output originality, behavioral self-awareness.
App installer: roboticus apps install/uninstall/list commands with manifest-driven profile creation, skill/theme/subagent deployment.
Profile system: roboticus profile create/switch/list/delete for isolated agent configurations.
GM soak framework: 100-turn soak tests with recommendation engine for FIRMWARE, OS, and context health diagnostics.
Adaptive retrieval strategy: Decision layer for cache-only, indexed, live discovery, and direct verification modes.
Post-turn observer dispatch: Subagents can observe conversation turns for background processing.
Dashboard: Pipeline Traces, Flight Recorder, Delegation Outcomes tabs; theme skinning; skill create/edit with markdown rendering.

Changed

Semantic intent classification replaces keyword matching across intent registry, guard detection, and specialist workflow routing.
File decomposition: guard_registry (2399→940), tests (10205→1119), core (2450→submodules), shortcuts (2264→submodules).
Shortcut retirement: 10 high-damage shortcuts removed; CronTool replaces CronShortcut; AcknowledgementShortcut retained.

Fixed

SubagentRegistry startup race (WEB-174): atomic start_agent transition eliminates agents stuck in "booting".
Keystore test flakes: mutex serialization for tests sharing global machine-id file.
Dashboard blank bubble: provider errors now preserve error state instead of showing empty message.
Specialist workflow dead code: case mismatch in category lookup (lowercase vs uppercase) fixed via typed constants.
Infrastructure leaks: guard chain catches subagent name exposure, capability dumps, and narrated delegation.
Stream finalization: preserves interception notices instead of overwriting with empty content.

Assets 9

25 Mar 08:09

github-actions

v0.11.0

7dacadb

v0.11.0

Added

Agent efficacy push: Memory introspection, selective forgetting, relationship-memory automation, and task operating state are now first-class parts of the shared pipeline.
Introspection-driven execution: Task-oriented turns now begin from introspected runtime truth, can compose specialists from a clean slate, and preserve executed state across normalization retries.
MCP release-grade management: Shared /api/mcp/servers management surface aligned across dashboard and CLI.
Skill and subagent utilization telemetry: Usage count and last-used signals exposed for skills and subagents.
Release vetting automation: Added scripts/run-v0110-vetting.sh and docs/testing/v0110-vetting-matrix.md to lock in the v0.11.0 regression contract.

Changed

Delegation and composition flow: Empty-roster specialist requests now progress into composition and delegation instead of collapsing into narrated intent or centralization.
Task-path truthfulness: Filesystem/runtime blockers now surface as real blockers instead of canned fallback prose.
Prompt and planner behavior: Introspection is now treated as the first operational step for task work and feeds a shared task operating state/action planner.
Release documentation: Active docs, architecture notes, roadmap entries, and release gates now reflect shipped v0.11.0 behavior.

Fixed

Pipeline trace drift: Legacy databases missing pipeline_traces.session_id and related fields are repaired on boot.
Prompt Performance persistence: Routing weights and context budget now persist and rehydrate correctly.
Dashboard regressions: Repaired session archive, raw TOML editing, Observability nav, semantic memory navigation, roster skill drill-down, workspace symmetry, and related operator-surface gaps.
Analysis/recommendation soak failures: Live deep-analysis surfaces validated against a stronger provider path during soak.

Assets 9

23 Mar 08:30

robot-accomplice

v0.10.0

676ac62

v0.10.0: Correctness, Safety & Operational Maturity

Highlights

Model Categorization (Phase 1): 10 task categories × 29 model profiles × 8 providers → category-aware routing with category_fit metascore dimension
Skill Authoring API: Create, validate, and publish Markdown instruction skills via POST /api/skills/author
Landlock & Job Object Confinement: Linux filesystem sandboxing (landlock) and Windows process isolation (Job Objects) for script execution
Typestate Sessions: Compile-time session lifecycle enforcement — Session<Created> → Session<Active> → Session<Closed>
Delegation Scoring Engine: score_agent_fit(), composite_fit_ratio(), and utility_margin_for_delegation() for principled decomposition decisions
TUI Client: Interactive terminal interface via roboticus tui with streaming responses and WebSocket transport
Codex CLI Plugin: Delegate coding tasks to OpenAI Codex CLI with structured output

Critical Fixes

Signal adapter: replaced std::sync::Mutex in async context with bounded tokio::sync::mpsc — eliminates runtime thread blocking
Signal adapter: added governor::RateLimiter (5 req/s default) to prevent signal-cli daemon DoS
Delivery queue: proper HTTP status code extraction for permanent vs transient error classification
Formatter: 3-allocation chain → single-pass clean_content() across all channel formatters

Infrastructure

DedupTracker moved from LlmService (requiring write lock) to AppState (standard Mutex) — reduced lock contention
L0 context budget increased from 4,000 → 8,000 tokens
Codecov upload made best-effort (quality gates still enforced at 80% minimum + regression check)
Cross-platform CI: Windows HANDLE type migration for windows-sys 0.59+
Dependabot, Docker Compose, backup/restore runbook, observability guide

Stats

302 files changed across 12 crates
36,542 additions, 4,809 deletions
83.84% test coverage (above 80% gate)
All 28 CI jobs green (Linux, macOS, Windows)

Full changelog · Release notes

Assets 8

20 Mar 15:41

github-actions

v0.9.9

ca600f2

v0.9.9

Added

Terminal user interface (roboticus tui): New roboticus-tui crate providing an interactive terminal UI with chat, log viewer, and status bar. Connects to a running server via REST API + WebSocket, supports session create/resume, streaming responses, and keyboard navigation.
Context budget tuning: Configurable L0-L3 token budgets via [context_budget] config section, dashboard range sliders, and per-channel minimum complexity level. Replaces hardcoded 4k/8k/16k/32k defaults.
Integrations management: Unified channel health monitoring with POST /api/channels/{platform}/test probe endpoint, dashboard integrations panel with per-channel test buttons, and roboticus integrations CLI command group.
Tool output noise filter: New ToolOutputFilterChain with four filters (AnsiStripper, ProgressLineFilter, DuplicateLineDeduper, WhitespaceNormalizer) applied to tool output before LLM observation, reducing token waste from ANSI escapes, progress bars, and duplicate lines.
Dashboard config exposure: All configuration sections now appear in the web dashboard form even when not yet configured. Unconfigured channels show "Enable" buttons. Powered by a CONFIG_SCHEMA merge before rendering.
Routing profile polish: Validation warning when slider weights exceed 1.0, persistence toast on apply, and default profile display (0.33/0.33/0.33) when no routing config exists.

Changed

Default bind address: Changed from 127.0.0.1 to localhost across all defaults, config generation, documentation, and justfile. OAuth loopback callbacks and test ephemeral ports remain as IP literals per RFC 8252.

Fixed

Script runner tests on Windows: Added #[cfg(unix)] guard on script_runner::tests module that uses std::os::unix::fs::PermissionsExt, preventing compilation errors on Windows.

Assets 9

14 Mar 18:53

github-actions

v0.9.7

a66184a

v0.9.7

Added

DB fitness hardening (DF-1–DF-18): 18-item SQLite performance audit resolved — retention pruning for 5 high-growth tables, orphan cleanup sweeps (working memory + embeddings), auto_vacuum=INCREMENTAL, 6 missing indexes, episodic dead-entry pruning, cache NULL-expiry fix, PRAGMA synchronous=NORMAL under WAL, CHECK constraints on 11 columns, and dead proxy_stats table removal.
Memory hygiene mechanic: ironclad mechanic detects and (with --repair) purges contaminated memory entries using 7 deterministic LIKE-prefix patterns across 3 tiers, with JSON-structured findings.
Sandbox boundary management: Filesystem confinement for skill scripts (skills_dir + $IRONCLAD_WORKSPACE, no traversal/symlink escape), configurable network isolation (unshare(CLONE_NEWNET) on Linux), memory ceiling via RLIMIT_AS, interpreter allowlist via absolute-path resolution, and mechanic sandbox health reporting.
Filesystem security overhaul: FilesystemSecurityConfig with workspace_only mode, ~25 default protected path patterns, tool_allowed_paths whitelist (auto-populated from Obsidian vault path), macOS sandbox-exec write-denial confinement, and dashboard UI toggles.
Unified pipeline architecture: IntentRegistry (22-variant Intent enum), GuardChain (12 guards with full()/cached()/streaming() presets), ShortcutDispatcher (15 handlers replacing 983-line god function), PipelineConfig (4 presets: api/streaming/channel/cron), and DedupGuard RAII replacing 11 manual release patterns. Net ~653 lines removed.
ChannelFormatter trait: Per-platform output formatting with static dispatch registry — TelegramFormatter (Markdown→MarkdownV2), DiscordFormatter, WhatsAppFormatter, SignalFormatter, WebFormatter, EmailFormatter — wired into channel_message.rs delivery path. 31 unit tests.
Configurable inference timeouts: Per-provider timeout_seconds setting ([providers.*.timeout_seconds]) with 300-second default, surfaced in dashboard provider configuration.
Dashboard session ID copy button: One-click copy-to-clipboard for session IDs in the Sessions panel.

Fixed

Circuit breaker window reset: record_failure() now tracks window_start for rolling-window accumulation — failures spaced ~60s apart correctly accumulate instead of resetting.
Embedding auth for local providers: EmbeddingConfig.is_local skips API key resolution and auth headers for Ollama/llama.cpp.
Cron schedule_kind: "once" support: Runtime maps "once" → "at" dispatch, calls DurableScheduler::evaluate_at(), auto-disables after single execution.
Vault path whitelisting: tool_allowed_paths auto-populated from obsidian.vault_path during config normalization — workspace-only mode no longer blocks configured external paths.
Fleet activity chart capacity model: Stacked area normalizes per-agent scores by 1/agentCount with fixedMax: 1.0.
Cache guard parity: cached() guard set now includes SubagentClaim + LiteraryQuoteRetry (previously missing).
ExecutionTruthGuard: Tool-results bypass bug removed.
Collapsible if lint: Updated impl_core.rs to use if let chain (edition 2024).
Wallet RPC rate-limit backoff: get_all_balances() detects rate-limit error codes (-32016, -32005, 429) and stops iterating remaining tokens instead of repeatedly hitting the provider.
Cron once-type orphan jobs: Jobs with schedule_kind: "once" and no schedule_expr are now auto-disabled on first encounter instead of emitting a warning every 60s.
Dashboard sidebar footer: Navigation bar footer now stays pinned to the bottom of the viewport (added height: 100% to sidebar container).
Dashboard custom model Add button: Custom model text input row now has its own Add button; both Add buttons use a shared class selector.
Telegram double-underscore italic: __text__ was incorrectly emitted as Telegram underline instead of italic — formatter now maps to _text_.
Config hot-reload path divergence: normalize_paths() and merge_bundled_providers() were skipped during hot-reload — reloaded configs now match boot-time normalization.
Routing audit fixes: Attempt counter not incrementing on retry, u32 truncation on cost metrics, misleading timeout error message wording.
Dashboard UI stall during inference: 4 RwLock guard-scope fixes release locks before async I/O, preventing cascading reader starvation.
Cron semaphore hot-reload race: Semaphore not released when cron runtime reloads config, causing phantom permit exhaustion. Dead LlmService method removed, lock consolidation in admin routes.
Agent audit fixes: Tautological always-true test condition, timeout hint parsing edge case, unreachable branch removal.

Assets 9

12 Mar 14:56

github-actions

v0.9.6

6c81069

v0.9.6

Added

Compliance-first self-funding control plane: Complete revenue opportunity lifecycle (intake → qualify → score → plan → fulfill → settle) with DB-backed restart safety, strategy-level scoring (confidence/effort/risk/priority/recommendation), feedback persistence per opportunity and summary by strategy, configurable post-settlement asset routing (default PALM_USD), EVM swap submission with tx-hash tracking and on-chain receipt reconciliation, tax payout lifecycle mirroring swap tasks, and operator-visible accounting (net profit, attributable costs, retained earnings, tax allocation) across API, CLI, and mechanic surfaces.
Revenue mechanic integration: Mechanic can probe, reconcile, and repair orphaned or stale revenue jobs and swap/tax reconciliation mismatches via run_gateway_provider_and_revenue_checks and run_gateway_integrated_repair_sweep.
Skills catalog: PluginCatalog with CLI flows (ironclad skills catalog list/install/activate) and API endpoints (GET/POST /api/skills/catalog, /install, /activate). Registry manifest fetch from remote URL.
Skill registry protocol: Migration 022 adds version, author, registry_source columns to skills table. Multi-registry support via RegistrySource { name, url, priority, enabled } with backward-compatible fallback from legacy single-URL registry_url.
Multi-registry fetch: Registry sync iterates all configured sources, namespaces skills as {registry_name}/{skill_name} for non-local sources, applies semver comparison to skip redundant downloads, and resolves conflicts by registry priority.
Learning loop closure: Agent now detects repeating multi-step tool sequences on session close and synthesizes reusable SKILL.md procedure files. learned_skills table (migration 021) tracks reinforcement history (success/failure counts, priority). LearningConfig exposes tuneable thresholds for minimum sequence length, success ratio, priority boost/decay, and skill cap. Inspired by recent work on autonomous tool-use learning in LLM agents (arXiv:2603.05344).
Procedural failure recording: record_procedural_failure() (previously dead code in the DB layer) is now called from ingest_turn() when tool results indicate failure, closing the procedural memory feedback loop.
Skill priority adjustment: Governor tick() now runs adjust_learned_skill_priorities() after episodic decay — learned skills with high success ratios get priority boosts; those with poor ratios get decayed.
Skill subdirectory loading: SkillLoader now recurses into learned/ subdirectory, loading machine-synthesized skills alongside hand-authored ones.
Progressive context compaction: 5-stage compaction (Trim → Summarize → Archive → Evict → Emergency) in compact_before_archive() with CompactionStage::from_excess() selector.
Decay-weighted episodic retrieval: rerank_episodic_by_decay() applies time-based decay at retrieval time, preventing stale context from dominating memory budget.
Instruction anti-fade micro-reminders: Event-driven system prompt reinforcement at agent decision points to combat instruction-following drift.
x402 autonomous payment: LLM HTTP client now handles 402 Payment Required responses with autonomous on-chain payment and request retry.
Homebrew & Winget packaging: release.yml contains complete update-homebrew (SHA256 extraction, formula generation, tap push) and update-winget (vedantmgoyal9/winget-releaser@v2) jobs. Activation requires tap repo creation and secrets provisioning.

Assets 9

06 Mar 22:06

github-actions

v0.9.5

ea076fb

v0.9.5

Changed

Behavior soak hardening: scripts/run-agent-behavior-soak.py now includes regression checks for filesystem capability truthfulness, subagent capability response quality, and affirmative continuation quality, with rubric updates to score substantive outcomes over brittle phrase matching.
Roadmap/release traceability: docs/releases/v0.9.5.md and docs/ROADMAP.md updated with current v0.9.5 prep status for speculative execution, browser runtime support, CLI skill roadmap slice, and behavior continuity validation.
Architecture documentation: Added explicit v0.9.5-prep control/dataflow coverage for deterministic execution shortcuts and guarded response sanitization in docs/architecture/ironclad-dataflow.md and docs/architecture/ironclad-sequences.md.
Browser runtime continuity: Browser action execution now attempts a single stop/start session recovery when CDP disconnect/closed-socket errors are detected, limited to idempotent actions to avoid duplicate side effects on replay.
Autonomy turn-budget controls: Added configurable agent-level ReAct budget controls (autonomy_max_react_turns, autonomy_max_turn_duration_seconds) and wired enforcement into the runtime loop.
CLI adapter response contract: run_script now emits stable typed metadata (adapter, schema_version, status, error_class) and normalized script error classes for downstream handling.
Speculative policy invariants: Added explicit test coverage enforcing Safe-only speculative eligibility (Caution/Dangerous/Forbidden remain excluded from speculative execution).

Fixed

Internal protocol fallback leakage: response sanitization no longer surfaces protocol-placeholder fallback text; empty/degraded sanitized content now resolves through deterministic user-facing quality fallback.
Markdown count execution reliability: execution shortcut path now handles recursive markdown-file count prompts deterministically, including strict numeric-only responses when requested (count only / only the number style prompts).
Delegation shortcut boundary: markdown-count shortcut no longer hijacks explicitly delegated prompts, preserving delegation intent handling.
Speculative branch cleanup safety: introduced RAII speculation slot guards and abort-path tests to guarantee no slot leakage when speculative tasks are canceled.
CLI skill sandbox isolation coverage: added explicit tests that secret env vars are stripped while only allowlisted runtime vars are propagated under skills.sandbox_env=true.

Assets 9

Releases: robot-accomplice/roboticus-rust

v0.11.4

Theme: Adaptive Intelligence

Added

Changed

Fixed

Uh oh!

v0.11.3

Theme: Complete the Operating Picture

Added

Fixed

Uh oh!

v0.11.2+hotfix.1

Uh oh!

v0.11.1

Added

Changed

Fixed

Uh oh!

v0.11.0

Added

Changed

Fixed

Uh oh!

v0.10.0: Correctness, Safety & Operational Maturity

Highlights

Critical Fixes

Infrastructure

Stats

Uh oh!

v0.9.9

Added

Changed

Fixed

Uh oh!

v0.9.7

Added

Fixed

Uh oh!

v0.9.6

Added

Uh oh!

v0.9.5

Changed

Fixed

Uh oh!