v0.6.0

github-actions released this 19 May 01:12
· 85 commits to main since this release
b8efa7c

Breaking

  • Native subprocess capsules refuse to launch when the OS-level sandbox is unavailable (security, fixes #655). Previously, when bwrap failed its user-namespace probe — most commonly on Ubuntu 24.04+ where kernel.apparmor_restrict_unprivileged_userns=1 ships enabled — Astrid silently fell through to an unsandboxed launch with a single tracing::warn! line as the only signal. This contradicted the README's "subprocess capsules are sandboxed" promise: a Node.js MCP server (or OpenClaw Tier 2 plugin) could read ~/.ssh/id_rsa, write to ~/.bashrc, or punch through the ~/.astrid tmpfs overlay without any capability check firing. The new default policy is Required: ProcessSandboxConfig::sandbox_prefix() returns an actionable Err instead of Ok(None) when the sandbox can't be applied, and the MCP server-startup path propagates the error so the daemon refuses to launch the subprocess. The only escape hatch is ASTRID_SANDBOX_POLICY=off, which silently launches without a sandbox — for trusted dev environments and CI runners where the kernel can't be configured. There is no "warn and fall through" middle state: a soft fallback hides the security gap and that is the bug. The error message names the sysctl remediation (sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0) and the env-var escape hatch directly. Breaking for any Ubuntu 24.04+ install that was tacitly relying on the silent fallback — those deployments now need to either flip the sysctl (recommended) or set ASTRID_SANDBOX_POLICY=off (only if the sandbox bypass is intentional).
  • caps grant <agent> "*" now requires --unsafe-admin acknowledgement. This mirrors the long-standing rail on group create --caps "*". Without it, the group-level safety check was trivially bypassable: instead of creating a custom admin-equivalent group (which the kernel rejects without unsafe_admin = true), an operator could run caps grant bob "*" and silently promote bob to universal admin via direct grant. Layer 6's mutate_caps now refuses any Grant that includes the literal bare * capability unless unsafe_admin is true on the request; the CLI surfaces this as caps grant bob "*" --unsafe-admin and rejects the call client-side before any IPC round-trip. Multi-segment wildcards (network:egress:*, self:capsule:*) are unaffected: they are inherently scoped, and the gate triggers only on the universal pattern. Breaking for automation: the new AdminRequestKind::CapsGrant.unsafe_admin: bool field defaults to false via #[serde(default)] so wire-format clients that omit it land on the safe side, but any tooling that previously assumed a bare * grant would succeed must now pass unsafe_admin = true (or set --unsafe-admin on the CLI).
  • PrincipalProfile files moved out of the principal home directory. Per-principal profile.toml now lives at ~/.astrid/etc/profiles/{principal}.toml instead of ~/.astrid/home/{principal}/.config/profile.toml. Profile contents are 100% system policy (enabled, groups, grants, revokes, quotas, auth public keys, egress, process allowlist) — keeping them inside the principal's home directory let any capsule with fs_read = ["home://"] read its own policy file (and fs_write would have let it self-elevate). The new location sits outside the home:// VFS scheme entirely. PrincipalProfile::path_for(&PrincipalHome) is now PrincipalProfile::path_for(&AstridHome, &PrincipalId); same for load/save. A one-shot migration in seed_default_principal_admin_profile moves any legacy home/{principal}/.config/profile.toml to the new location on next boot. (#672)
  • AdminKernelRequest and AdminKernelResponse are now wrapper structs with the typed body on a kind/body field, plus an optional request_id for client-side correlation. Pre-existing test fixtures using AdminKernelRequest::AgentCreate { ... } should construct the variant on AdminRequestKind and convert (AdminRequestKind::AgentCreate { ... }.into() or AdminKernelRequest::new(...)). The wire format is forward-compatible: request_id is omitted when None. (#672)
  • AuditAction::AdminRequest gained a params: Option<serde_json::Value> field. The change is forward-compatible (#[serde(default)] + skip_serializing_if), but external consumers that parse audit entries against strict schemas may need to add the field. The field captures the request payload for forensic replay. (#672)
  • PermissionError gained a PrincipalDisabled variant, returned by the Layer 5 enforcement preamble when the caller's profile has enabled = false. Existing exhaustive match blocks on the enum need a new arm. (#672)
  • Kernel.groups field type changed from Arc<GroupConfig> to Arc<ArcSwap<GroupConfig>> (issue #672, Layer 6). The boot-loaded group config is now hot-reloadable through the admin IPC topics (astrid.v1.admin.group.*); every authorization check clones the current Arc via load_full(). Enforcement preambles hold their own Arc snapshot per check, so in-flight checks observe a consistent config even during a swap. Any direct consumer matching on Arc<GroupConfig> must migrate to kernel.groups.load_full().
  • Built-in agent group gains self:quota:get and self:agent:list capabilities (issue #672). self:* already subsumed both, but operators inspecting or matching on the exact capability vector see a minor widening. This makes agent self-service visibility into their own row and quotas an explicit contract rather than an incidental consequence of self:*.
  • Default principal's profile now carries groups = ["admin"] after boot (issue #670). bootstrap_cli_root_user writes groups = ["admin"] to ~/.astrid/home/default/.config/profile.toml on any boot where the profile is absent or has groups=[] && grants=[] && revokes=[]. Operators who previously ran with an explicitly empty groups list — and no grants/revokes — will see the default principal gain full management-API capabilities on the next boot. Either edit the profile to add a non-admin group (e.g. restricted), or add an explicit grants/revokes entry, to block the auto-seed. Profiles that already name any group, grant, or revoke are left untouched. (#670)
  • CapabilityToken signing format bumped v1 → v2 with principal signed into the payload (Layer 4 multi-tenancy, issue #668). Existing v1 persistent tokens on disk fail verify_signature() after upgrade and get rejected with InvalidSignature plus a tracing::error! pointing operators at re-mint. There is no silent migration path — changing the signing payload is a cryptographic break, not a data migration. CapabilityToken::create and create_with_options now take a required principal: PrincipalId argument. (#668)
  • Allowance.principal: PrincipalId is now required at construction. AllowanceStore keys on (principal, id); find_matching, find_matching_and_consume, consume_use, export_session_allowances, export_workspace_allowances, and clear_session_allowances now take &PrincipalId. A new clear_all_session_allowances retains the global sweep for kernel-initiated shutdown. (#668)
  • CapabilityStore::has_capability / find_capability now take &PrincipalId. Tokens whose principal does not match the caller are rejected up front, even if the resource pattern matches — fail-closed cross-principal check. Revocation and single-use consumption stay global (they are about the token's identity, not the caller). Persistent KV keys changed from caps:tokens/{token_id} to caps:tokens/{principal}/{token_id}. CapabilityValidator::check and validate_by_id also thread &PrincipalId. (#668)
  • Kernel.active_connections is now per-principal (DashMap<PrincipalId, AtomicUsize>). connection_opened(&PrincipalId) / connection_closed(&PrincipalId) take the connecting principal; only that principal's session allowances are cleared on last-disconnect. New total_connection_count() sums across principals for ephemeral-shutdown. (#668)
  • Kernel.overlay_vfs replaced by Kernel.overlay_registry: Arc<OverlayVfsRegistry>. Each invoking principal resolves their own OverlayVfs on first use; Agent A's workspace writes never reach Agent B's view of the same tree. The registry is bounded (default 1024 principals, tunable via ASTRID_OVERLAY_REGISTRY_MAX_PRINCIPALS) with idle-eviction. Kernel.vfs now points at a plain workspace HostVfs — kernel-internal paths that do not know a principal keep using that field. (#668)
  • SecurityInterceptor::intercept and ApprovalManager::check_approval take &PrincipalId. Single-tenant callers pass PrincipalId::default(). (#668)
  • ApprovalDecision::ApproveWithAllowance now boxes Allowance (Box<Allowance>) — the added principal field pushed the enum past clippy's large_enum_variant threshold. (#668)
  • WASM engine migrated from Extism to wasmtime Component Model. The kernel now loads Component Model binaries via Component::from_binary, not Extism modules. Existing capsules compiled with extism-pdk will not load — they must be rebuilt with the migrated SDK targeting wasm32-wasip2. This is a coordinated multi-repo migration (SDK + 16 capsule repos). (#632)
  • WIT host function signatures retyped. All 49 functions now use proper typed params/returns (result<T, string>, WIT records, u64 handles) instead of string-based JSON blobs. The HostResult 0x00/0x01 prefix encoding is removed — errors are returned via WIT result types. (#632)
  • Guest export astrid-hook-trigger signature changed. Was func(input: list<u8>) -> list<u8>. Now func(action: string, payload: list<u8>) -> capsule-result. The action name and payload are separate typed parameters; the return is the typed capsule-result record. (#632)
  • capsule_abi module removed from astrid-core. Types (CapsuleAbiContext, CapsuleAbiResult, LogLevel, etc.) are replaced by wasmtime::component::bindgen! generated types. (#632)
  • Approval API simplified. risk-level removed from approval-request WIT record. decision removed from approval-response. Capsules declare action + resource, get back approved/denied. Risk classification was speculative complexity — the kernel manages allowance-based approval without risk levels. (#641)
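
The bare * grant gate described in the caps grant entry above can be sketched as follows. This is a minimal illustration, not the kernel's mutate_caps code: the names GrantError, is_universal, and check_grant are hypothetical, and only the rule itself (the literal bare * needs unsafe_admin, scoped wildcards do not) comes from the notes.

```rust
/// Hypothetical error type for this sketch; not Astrid's real error enum.
#[derive(Debug, PartialEq)]
enum GrantError {
    UniversalWithoutAck,
}

/// True only for the literal bare `*`. Scoped wildcards such as
/// `network:egress:*` have more than one segment and do not match.
fn is_universal(cap: &str) -> bool {
    cap == "*"
}

/// Refuse a grant that contains the bare `*` unless the request
/// explicitly sets `unsafe_admin` (assumed to default to false).
fn check_grant(caps: &[&str], unsafe_admin: bool) -> Result<(), GrantError> {
    if !unsafe_admin && caps.iter().any(|c| is_universal(c)) {
        return Err(GrantError::UniversalWithoutAck);
    }
    Ok(())
}

fn main() {
    println!("{:?}", check_grant(&["*"], false));
    println!("{:?}", check_grant(&["*"], true));
    println!("{:?}", check_grant(&["network:egress:*"], false));
}
```

Running the same check client-side and kernel-side, as the notes describe, means a wire-format client that omits the field still lands on the refusing branch.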

Added

  • astrid -p ... --session <ID_OR_NAME> accepts either a UUID or a name. A string that parses as a UUID is used as-is for resume; anything else is treated as a stable session name and hashed via UUID v5 (NAMESPACE_URL). Operators can copy the UUID printed by --print-session straight into --session <that UUID> for the next turn and round-trip works, matching the cargo / gh / claude convention on the same flag. Help text uses value name ID_OR_NAME.

  • agent create inherits default's per-principal state out-of-the-box. After the home tree is provisioned, the kernel handler copies env JSON (~/.astrid/home/default/.config/env/<capsule>.env.json → home/<new>/.config/env/...), per-capsule KV namespaces (default:capsule:<id>:* → <new>:capsule:<id>:*), and per-capsule secret files (secrets/default/<capsule>/<key> → secrets/<new>/<capsule>/<key>) for every env_type = "secret" field every installed capsule declares. The new agent reaches the chat REPL with the same model selection, base URLs, and credentials the operator already configured for default, so there is no need to rerun astrid secret set for every key on every new agent. Best-effort: a copy failure logs warn and leaves the agent in a "needs manual setup" state, but doesn't roll back the profile or home tree (those already succeeded; the confidentiality boundary is intact regardless). The registry read lock is held only while extracting capsule IDs + per-capsule secret-key lists; the async KV copy and blocking file-system copy run after the lock is dropped so concurrent install/update/remove isn't blocked for the duration of the inheritance.

  • agent create rich provisioning flags now chain end-to-end (no new IPC required). --group, --memory, --timeout, --storage, --processes, --egress, --process-allow used to all bail with "needs kernel-side IPC that has not shipped" and fall back to a copy-pasteable three-line script. The flags now compose using the existing admin handlers: --group passes through to AgentCreate.groups; the quota flags fetch the new agent's default Quotas and write them back via a single QuotaSet; --egress / --process-allow translate to network:egress:<label> and process:spawn:<cmd> capability patterns. All inputs (quota specs, capability labels) are validated client-side before any IPC and the capability grants ride along on AgentCreate.grants so a malformed pattern can't leave a half-provisioned agent on disk. --bare, --distro (non-default), and --link still bail with a clear "needs distro / admin.agent.link IPC" message — those genuinely need new admin topics. Tested live: agent create alice --group admin --memory 32M --timeout 60s --storage 512M --processes 4 --egress openai,internal --process-allow git,npm lands all four quota fields, four capability grants, and the admin group membership atomically.

  • agent modify --add-group / --remove-group now actually works. Previously the CLI parsed the flags but emitted "needs admin.agent.modify IPC that has not shipped" and bailed at exit 2; operators worked around it by hand-editing etc/profiles/{p}.toml and rebooting the daemon. The new AdminRequestKind::AgentModify { principal, add_groups, remove_groups } variant ships the partial-update path end-to-end: per-group idempotent (set-based comparison so a no-op modify reports changed = false regardless of Vec ordering), profile-validate before save, cache-invalidate after, audit-logged as admin.agent.modify. Empty (no flags) is rejected with a clear error so a script that forgot to populate either list can't silently no-op.
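
The set-based idempotency rule for agent modify can be sketched as below. This assumes groups are plain strings and uses a hypothetical modify_groups helper; the real handler also validates the profile, invalidates the cache, and writes an audit entry, none of which is shown.

```rust
use std::collections::BTreeSet;

/// Apply add/remove lists to a principal's groups.
/// Returns Ok(changed), where `changed` is decided by comparing sets,
/// so the ordering of the stored Vec or of the inputs never matters.
/// An empty request is rejected instead of silently doing nothing.
fn modify_groups(
    groups: &mut Vec<String>,
    add: &[&str],
    remove: &[&str],
) -> Result<bool, String> {
    if add.is_empty() && remove.is_empty() {
        return Err("modify needs at least one group to add or remove".to_string());
    }
    let before: BTreeSet<String> = groups.iter().cloned().collect();
    let mut after = before.clone();
    for g in add {
        after.insert(g.to_string());
    }
    for g in remove {
        after.remove(*g);
    }
    let changed = after != before;
    if changed {
        // Only rewrite the stored list when membership actually differs.
        *groups = after.into_iter().collect();
    }
    Ok(changed)
}

fn main() {
    let mut groups = vec!["agent".to_string(), "ops".to_string()];
    // Re-adding existing groups in a different order is a no-op.
    println!("{:?}", modify_groups(&mut groups, &["ops", "agent"], &[]));
    println!("{:?}", modify_groups(&mut groups, &["admin"], &["ops"]));
    println!("{:?}", groups);
}
```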

  • Cargo-like manifest schema (forward-compatible parser side). Capsule.toml now accepts the new [publish] / [subscribe] tables, the flat [exports] / [imports] form with quoted "namespace:interface" keys, and [[tool]] blocks with the operator-reviewable description_for_llm field — all per the cargo-like-manifest RFC. Each [publish] / [subscribe] entry carries a required wit = "@scope/repo/iface/record" (or bare-name self-reference) plus optional version / tag / rev / branch / path source pinning and handler = "..." to bind to a #[astrid::interceptor("...")] export. Keys in those tables also serve as the IPC publish/subscribe ACL — when present, they supersede [capabilities].ipc_publish / ipc_subscribe. The legacy nested [exports.<ns>] foo = "1.0" form, the [[topic]] blocks, the [[interceptor]] blocks, and the [capabilities].ipc_publish / ipc_subscribe arrays continue to parse and behave exactly as before — both forms coexist during the migration window. Resolver, registry index, and lockfile (BLAKE3 verification) plumbing land in a follow-up; today the parser stores wit refs as strings without resolving them. The parser rejects entries that set more than one of version / tag / rev / branch / path at deserialize time so ambiguous manifests fail fast at install instead of producing inconsistent resolver behaviour.
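
The "at most one source pin" rule can be illustrated with a small validator. This is a sketch only: the SourcePin struct and its validate method are hypothetical stand-ins, the real parser performs the check during deserialization, and the error text here is invented.

```rust
/// Hypothetical mirror of the optional source-pinning keys on a
/// [publish] / [subscribe] entry. Field names follow the release notes.
#[derive(Default)]
struct SourcePin {
    version: Option<String>,
    tag: Option<String>,
    rev: Option<String>,
    branch: Option<String>,
    path: Option<String>,
}

impl SourcePin {
    /// Reject entries that set more than one pinning key, so an
    /// ambiguous manifest fails before any resolver sees it.
    fn validate(&self) -> Result<(), String> {
        let set: Vec<&str> = [
            ("version", &self.version),
            ("tag", &self.tag),
            ("rev", &self.rev),
            ("branch", &self.branch),
            ("path", &self.path),
        ]
        .iter()
        .filter(|(_, value)| value.is_some())
        .map(|(name, _)| *name)
        .collect();

        if set.len() > 1 {
            return Err(format!("conflicting source pins: {}", set.join(", ")));
        }
        Ok(())
    }
}

fn main() {
    let ambiguous = SourcePin {
        tag: Some("v1.2.0".to_string()),
        branch: Some("main".to_string()),
        ..Default::default()
    };
    println!("{:?}", ambiguous.validate());
}
```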

  • EnvScope { Agent, Shared } enum on EnvDef. Operator-only sharing model for env_type = "secret" env vars: per-agent (default, fail-closed) or host-wide (shared, kernel falls through on per-agent miss). The capsule manifest does NOT declare scope — capsules come from external sources and cannot be trusted to mark their own credentials as host-shared (a malicious capsule could otherwise mark its bot token "shared" and inherit every agent's invocation budget). The operator decides scope at astrid secret set --scope time. Additive: the consuming routing change ships in a separate PR.

  • astrid setup subcommand + bundled AppArmor profile (follow-up to #655). Ships dist/apparmor/astrid — a narrow flags=(unconfined) profile whose only addition is the userns, rule — and an astrid setup subcommand that diagnoses the Linux sandbox prerequisites and prints the exact commands (or runs them via sudo with --apply) to install the profile at /etc/apparmor.d/astrid. With the profile loaded, bwrap regains the ability to call unshare(CLONE_NEWUSER) on Ubuntu 23.10+ / 24.04 hosts that keep kernel.apparmor_restrict_unprivileged_userns=1 enabled, so native subprocess capsules (MCP servers, OpenClaw Tier 2 plugins) launch under the default SandboxPolicy::Required without the operator having to flip a kernel-wide sysctl (which would weaken every other unprivileged process on the host, not just Astrid). astrid setup --print-apparmor emits the profile alone — distro packagers can pipe it straight into a .deb / .rpm build without invoking the diagnostic path. The subcommand is a no-op on non-Linux hosts and on Linux hosts where the sandbox probe already passes.

  • ipc-publish-as host function + uplink principal propagation. New WIT entry (astrid:host/ipc.ipc-publish-as) and SDK wrappers (publish_as / publish_json_as) stamp the outgoing IPC envelope with a caller-supplied principal instead of the capsule's own. Used by uplinks (CLI proxy, future Telegram/Discord bridges) to relay external client traffic while preserving the operator's claimed identity through to the kernel's caller resolution and Layer 5/6 capability enforcement. Gated host-side on uplink = true in Capsule.toml [capabilities] (the gate uses manifest.capabilities.uplink — the matching has_uplink_capability field in HostState was previously bound to !manifest.uplinks.is_empty(), which is the unrelated [[uplink]] declaration list). Before this, every uplink-forwarded message landed at the kernel as principal = <uplink's principal> and astrid agent switch was purely cosmetic — admin commands as alice executed as default and bypassed RBAC. The IPC quota bucket is also keyed by the relayed principal so two principals routed through the same uplink cannot starve each other's per-tenant budget. Trust-the-uplink model: the kernel does not verify the uplink authenticated the claimed principal — same trust level as today's "anyone with the daemon socket token can act as default-admin", just extended to the per-principal axis. Cryptographic per-connection auth lives in #658.

  • CLI redesign — modern noun-verb structure with multi-tenancy admin surface (issue #657).

    • The CLI is now modelled after gh and fly: astrid agent <verb>, astrid capsule <verb>, astrid quota <verb>, etc.
    • System-level operations (status, start, stop, restart, ps, top, who, logs) stay as bare verbs for speed.
    • astrid with no subcommand still drops the operator into an interactive agent session — the unchanged self-hosting path.
    • Phase 1 covers the admin-surface bulk unblocked by Layer 6 (#672); sub-agent delegation, vouchers, audit, trust, budget, and remote A2A surfaces are registered in the clap tree but stubbed with tracking-issue references (#656, #658, #675, #653) and exit code 2 so CI scripts can pattern-match.
    • New verbs implemented end-to-end against the Layer 6 admin IPC: agent create / list / show / delete / enable / disable / switch / current, group create / list / show / delete / modify, caps show / grant / revoke / check, quota show / set, secret set / list / delete, capsule install / update / list / remove / tree / build / config / show, distro apply, gc, restart, logs, doctor, version, completions, update, ps, top, who.
    • Every list/show command supports --format pretty|json|yaml|toml.
    • Per-agent commands accept --agent <name>/-a to override the active context; without the flag they fall back to ~/.astrid/run/cli-context.toml written by astrid agent switch.
    • Migration aliases retained: session info → session show, self-update → update, wit gc → gc, top-level build → capsule build.
    • New admin_client module wraps SocketClient with request_id correlation (UUID v4 per call, echoed on astrid.v1.admin.response.<topic>) so multiple in-flight admin commands can disambiguate responses, and tolerates broadcast frames (astrid.v1.capsules_loaded) whose payloads do not round-trip through IpcPayload::RawJson's tag.
    • New read_until_topic helper on SocketClient propagates the same skip-and-retry pattern to the bare verbs. (#657)
  • PrincipalProfile.enabled is now enforced by the Layer 5 management-API preamble. Pre-Layer-6 the flag was set on disk by agent.disable but never consulted by authorize_request — operators who disabled an agent saw the flag persist while the agent kept passing authz checks. The preamble now resolves the caller's profile, and if enabled = false returns the new PermissionError::PrincipalDisabled variant before the capability check. (#672)

  • PrincipalProfile.enabled is also now enforced at Layer 3 (WasmEngine::invoke_interceptor). Pre-fix, only the management API honored the flag; capsule invocations bypassed it entirely. The Layer 3 gate runs right after profile cache resolution and returns CapsuleError::WasmError("principal '{p}' is disabled") with a security_event = true log. In-flight invocations finish under the old value (we only check at entry); new invocations after agent.disable are refused. Together with the Layer 5 gate, agent.disable now denies every surface a principal can drive. (#672)

  • Phantom-principal pre-condition on every mutating admin handler. caps.grant, caps.revoke, quota.set, agent.enable, and agent.disable now require the target's profile.toml to already exist on disk. Without this gate, a typo'd principal name (alic vs alice) silently materialized a phantom principal — PrincipalProfile::load_from_path returns Default on NotFound, the handler then saved the mutated default to disk, and any future traffic claiming that principal inherited the phantom permissions. quota.get got the same gate so a typo doesn't return Default-shaped quotas without revealing the mistake. (#672)
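
A toy model of the phantom-principal bug and its gate, assuming an in-memory map in place of the profile files on disk. Profile, Disk, load_or_default, and grant are hypothetical names; only the behavior (load falls back to Default on a missing file, mutating handlers must require the file to exist first) is taken from the notes.

```rust
use std::collections::HashMap;

#[derive(Default, Clone, Debug, PartialEq)]
struct Profile {
    grants: Vec<String>,
}

/// Stand-in for the on-disk profile directory.
struct Disk(HashMap<String, Profile>);

impl Disk {
    /// Mirrors the "missing file means Default" load behavior.
    fn load_or_default(&self, principal: &str) -> Profile {
        self.0.get(principal).cloned().unwrap_or_default()
    }

    /// Mutating handler with the existence gate applied first.
    /// Without the gate, a typo'd name would be saved as a new profile.
    fn grant(&mut self, principal: &str, cap: &str) -> Result<(), String> {
        if !self.0.contains_key(principal) {
            return Err(format!("unknown principal '{}'", principal));
        }
        let mut profile = self.load_or_default(principal);
        profile.grants.push(cap.to_string());
        self.0.insert(principal.to_string(), profile);
        Ok(())
    }
}

fn main() {
    let mut disk = Disk(HashMap::new());
    disk.0.insert("alice".to_string(), Profile::default());
    println!("{:?}", disk.grant("alic", "quota:get")); // typo is refused
    println!("{:?}", disk.grant("alice", "quota:get"));
}
```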

  • agent.delete removes profile.toml. Pre-fix the handler unlinked the identity and invalidated the cache but left the policy file on disk; subsequent traffic claiming that principal would re-load the old policy. Home-directory data (capsule KV, audit chain) is still left intact — that's an ops-managed concern. (#672)

  • agent.disable and caps.revoke against the default principal are rejected. The default principal is the bootstrap admin anchor; either operation could lock the operator out of the management API entirely (the new enabled gate denies every request from a disabled principal, and caps.revoke self:* would clear the operator's grants). caps.grant and quota.set on default remain allowed — they only add/adjust, never remove. (#672)

  • AdminKernelRequest now carries an optional request_id echoed back on AdminKernelResponse. Lets clients with multiple in-flight requests on the shared astrid.v1.admin.response.<topic> channel disambiguate responses. The wire shape is { "request_id": "...", "method": "...", "params": {...} }; request_id is skip_serializing_if = Option::is_none, so single-client deployments emit no extra field. The typed body lives on AdminKernelRequest.kind: AdminRequestKind (the previous enum, renamed). (#672)

  • AuditAction::AdminRequest.params: Option<serde_json::Value> captures the request payload (capabilities granted, quotas set, group definition, etc.) so forensic replay doesn't require diffing profile.toml/groups.toml snapshots. None for legacy KernelRequest entries with no params struct. Forward-compatible add (#[serde(default)] + skip_serializing_if). (#672)

  • Layer 6 management IPC for agent lifecycle, quotas, groups, and capability grants (issue #672). New astrid.v1.admin.* IPC surface with 13 topics covering agent (create, delete, enable, disable, list), quotas (set, get), groups (create, delete, modify, list), and per-principal capabilities (grant, revoke). Every topic runs through the existing Layer 5 CapabilityCheck::require preamble — there is no new authz mechanism. Capability mappings:

    • admin.agent.create|delete|enable|disable → agent:create|delete|enable|disable
    • admin.agent.list → self:agent:list (self) / agent:list (cross-tenant)
    • admin.quota.set|get → self:quota:set|get / quota:set|get
    • admin.group.create|delete|modify|list → group:create|delete|modify|list
    • admin.caps.grant|revoke → caps:grant|revoke

    Mutating topics acquire a new tokio::sync::Mutex<()> (Kernel.admin_write_lock) so concurrent writers never interleave on profile.toml / groups.toml. Every profile-mutating handler (quota.set, caps.grant, caps.revoke, agent.enable, agent.disable) calls profile_cache.invalidate(&target) after the atomic write so subsequent authz checks reflect the new state without waiting on kernel restart. Group admin topics rewrite groups.toml atomically (tempfile + rename + 0o600 on Unix) and then kernel.groups.store(Arc::new(new_config)) the ArcSwap — in-flight checks holding the old Arc finish under the old config, the next load_full sees the new one. Built-in groups (admin, agent, restricted) cannot be deleted or modified; default principal cannot be deleted. caps.grant never clears a matching revoke — Layer 5 precedence (revoke > grant > group) is preserved. Every allow and deny writes AuditAction::AdminRequest with method set to the topic wire name (admin.agent.create, etc.) and target_principal set when operating on another principal. agent.delete removes the AstridUserId and CLI identity link but does not scrub the home directory — reclamation is an ops concern. (#672)

  • AdminKernelRequest / AdminKernelResponse wire types in astrid-types::kernel with tagged serde (#[serde(tag = "method", content = "params")]), mirroring the shape of KernelRequest / KernelResponse. Response variants include Success(Value), AgentList(Vec<AgentSummary>), GroupList(Vec<GroupSummary>), Quotas, and Error. (#672)

  • Atomic GroupConfig::save_to_path + save in astrid-core::groups::io_impl — tempfile + rename + 0o600 on Unix, mirroring PrincipalProfile::save_to_path. On rename failure the tempfile is removed so no secret-adjacent stale state is left on disk. (#672)
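
The tempfile + rename + 0o600 pattern can be sketched with the standard library alone. This is an assumption-laden illustration, not the save_to_path implementation: save_atomic is a hypothetical name, the sibling ".tmp" file name is invented, and the sketch removes the temp file on any failure rather than only on rename failure.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Write `contents` to `dest` via a sibling temp file and a rename, so
/// readers see either the old file or the new one, never a partial write.
fn save_atomic(dest: &Path, contents: &[u8]) -> io::Result<()> {
    // Hypothetical temp name; it must live on the same filesystem as `dest`.
    let tmp = dest.with_extension("tmp");

    let mut opts = OpenOptions::new();
    opts.write(true).create(true).truncate(true);
    #[cfg(unix)]
    {
        use std::os::unix::fs::OpenOptionsExt;
        // Owner read/write only. The mode applies when the file is created.
        opts.mode(0o600);
    }

    let result = (|| -> io::Result<()> {
        let mut file = opts.open(&tmp)?;
        file.write_all(contents)?;
        file.sync_all()?;
        fs::rename(&tmp, dest)
    })();

    if result.is_err() {
        // Best-effort cleanup so no stale temp file is left behind.
        let _ = fs::remove_file(&tmp);
    }
    result
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("astrid_sketch_{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let dest = dir.join("groups.toml");
    save_atomic(&dest, b"# example contents\n")?;
    println!("wrote {}", dest.display());
    fs::remove_dir_all(&dir)
}
```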

  • GroupConfig::insert_custom_group, modify_custom_group, remove_group, is_builtin_name — immutable-value mutators that return a new GroupConfig with the requested change applied. Built-in names are rejected; modify/remove of unknown custom groups returns the new GroupConfigError::UnknownGroup variant (vs DuplicateName for insert collisions). (#672)

  • IdentityStore::delete_user and list_users on the trait and KvIdentityStore. These are required by admin.agent.delete (delete the user record plus all links pointing at it; idempotent for unknown UUIDs) and admin.agent.list (enumerate user records). The name-index entry is cleared only when it still points at the deleted UUID, so a later writer that claimed the same name keeps its entry (last-writer-wins is preserved). (#672)

  • arc-swap workspace dependency for lock-free hot-reloadable GroupConfig. (#672)

  • Capability/group enforcement on the management API (issue #670). Every arm of kernel_router::handle_request now runs through a CapabilityCheck enforcement preamble before reaching the handler body. Each KernelRequest variant maps to a required capability (Shutdown → system:shutdown, ReloadCapsules → self:capsule:reload, InstallCapsule → self:capsule:install, ListCapsules/GetCommands/GetCapsuleMetadata → self:capsule:list, GetStatus → system:status, ApproveCapability → self:approval:respond); no default-allow branch. The caller's principal is resolved from IpcMessage.principal (falling back to PrincipalId::default() for pre-#658 single-token socket traffic), their PrincipalProfile comes from the profile cache, and the new GroupConfig is boot-loaded from $ASTRID_HOME/etc/groups.toml (missing file → built-ins only). Precedence follows revoke > grant > group-inherited, so operators can revoke specific capabilities from admins without dropping the admin membership. Built-in groups (admin, agent, restricted) cannot be redefined at load time; custom groups must opt-in via unsafe_admin = true to grant the universal *. Every allow and deny outcome writes a new AuditAction::AdminRequest { method, required_capability, target_principal } audit entry (chain-linked, signed). The default principal is seeded with groups = ["admin"] on first boot so single-tenant deployments keep full access with no config. CapabilityCheck is a pure function — no I/O, no locking — and is a distinct namespace from the runtime CapabilityToken system (which continues to gate capsule-level sensitive actions). (#670)

  • GroupConfig + Group in astrid-core::groups with a TOML loader at $ASTRID_HOME/etc/groups.toml, baked-in admin/agent/restricted built-ins, fail-closed handling for missing/malformed files, rejection of built-in redefinition, rejection of capability strings containing ** or shell metacharacters, and a Group::unsafe_admin opt-in for custom groups that need the universal * capability. (#670)

  • PrincipalProfile.grants and PrincipalProfile.revokes (per-principal capability overrides) validated at load/save time via the new astrid_core::capability_grammar::validate_capability helper. Revokes have strict precedence over grants and group-inherited capabilities. (#670)

  • CapabilityCheck + PermissionError in astrid-capabilities — the policy-evaluation primitive. Borrowed evaluator over (&PrincipalProfile, &GroupConfig) with has(&str) -> bool and require(&str) -> Result<(), PermissionError>. Pure, thread-safe, zero-alloc on the hot path. (#670)
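
The revoke > grant > group precedence can be sketched as a pure function over a profile and a group table. This is a simplified model rather than the CapabilityCheck API: Profile, GroupConfig, matches, has, and require are hypothetical here, the sketch allocates where the real evaluator is described as zero-alloc, and the wildcard rule (a trailing :* matches any deeper suffix) is an assumption inferred from self:* subsuming self:quota:get.

```rust
use std::collections::HashMap;

struct Profile {
    groups: Vec<String>,
    grants: Vec<String>,
    revokes: Vec<String>,
}

/// Group name -> capability patterns (stand-in for the real GroupConfig).
type GroupConfig = HashMap<String, Vec<String>>;

/// Assumed grammar: bare `*` matches everything, `prefix:*` matches any
/// capability under `prefix:`, anything else must match exactly.
fn matches(pattern: &str, cap: &str) -> bool {
    if pattern == "*" || pattern == cap {
        return true;
    }
    match pattern.strip_suffix(":*") {
        Some(prefix) => cap.starts_with(prefix) && cap[prefix.len()..].starts_with(':'),
        None => false,
    }
}

/// Revokes win over grants, grants win over group-inherited capabilities.
fn has(profile: &Profile, groups: &GroupConfig, cap: &str) -> bool {
    if profile.revokes.iter().any(|p| matches(p, cap)) {
        return false;
    }
    if profile.grants.iter().any(|p| matches(p, cap)) {
        return true;
    }
    profile
        .groups
        .iter()
        .filter_map(|g| groups.get(g))
        .flatten()
        .any(|p| matches(p, cap))
}

fn require(profile: &Profile, groups: &GroupConfig, cap: &str) -> Result<(), String> {
    if has(profile, groups, cap) {
        Ok(())
    } else {
        Err(format!("missing capability '{}'", cap))
    }
}

fn main() {
    let mut groups = GroupConfig::new();
    groups.insert("admin".to_string(), vec!["*".to_string()]);
    let operator = Profile {
        groups: vec!["admin".to_string()],
        grants: vec![],
        revokes: vec!["system:shutdown".to_string()],
    };
    // An admin with one capability revoked keeps everything else.
    println!("{:?}", require(&operator, &groups, "system:shutdown"));
    println!("{:?}", require(&operator, &groups, "agent:create"));
}
```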

  • AuditAction::AdminRequest { method, required_capability, target_principal } — new audit action for management-API requests, written with AuthorizationProof::System on allow and AuthorizationProof::Denied on deny. (#670)

  • Principal-scoped AllowanceStore, CapabilityStore, OverlayVfs, and connection counter (Layer 4). Allowance.principal and CapabilityToken.principal are now required construction-time fields. All store lookups are principal-filtered up front; Agent A's approvals and capabilities never match Agent B. A new astrid_vfs::OverlayVfsRegistry gives each invoking principal a fresh OverlayVfs on first use, with a bounded (default 1024) LRU-evicting cache keyed by PrincipalId. Kernel.active_connections became a per-principal DashMap; only the disconnecting principal's session allowances are cleared, while ephemeral shutdown still waits on the global total_connection_count(). WasmEngine::invoke_interceptor resolves the invoking principal's overlay on every call and installs it on HostState.invocation_overlay_vfs. Single-tenant deployments pass PrincipalId::default() and see no behavior change. Extends invariants #6 and #7 from issue #653 (Agent A cannot use Agent B's approvals or capabilities). (#668)
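
The per-principal connection counter can be modelled in a few lines. This sketch swaps in a plain HashMap<String, usize> behind &mut self for the DashMap<PrincipalId, AtomicUsize> named in the notes, so it is single-threaded; Connections, opened, closed, and total are hypothetical names standing in for connection_opened, connection_closed, and total_connection_count.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct Connections {
    per_principal: HashMap<String, usize>,
}

impl Connections {
    fn opened(&mut self, principal: &str) {
        *self.per_principal.entry(principal.to_string()).or_insert(0) += 1;
    }

    /// Returns true when this was the principal's last open connection,
    /// which is the point where only that principal's session
    /// allowances would be cleared.
    fn closed(&mut self, principal: &str) -> bool {
        let remaining = match self.per_principal.get_mut(principal) {
            Some(n) => {
                *n = n.saturating_sub(1);
                *n
            }
            None => return false,
        };
        if remaining == 0 {
            self.per_principal.remove(principal);
            true
        } else {
            false
        }
    }

    /// Global total, the figure an ephemeral shutdown would wait on.
    fn total(&self) -> usize {
        self.per_principal.values().sum()
    }
}

fn main() {
    let mut conns = Connections::default();
    conns.opened("alice");
    conns.opened("bob");
    let last_for_alice = conns.closed("alice");
    println!("alice last disconnect: {}, total left: {}", last_for_alice, conns.total());
}
```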

  • Per-invocation quota enforcement from PrincipalProfile (Layer 3). WasmEngine::invoke_interceptor now resolves the invoking principal's PrincipalProfile through a new kernel-scoped PrincipalProfileCache (astrid_capsule::profile_cache) and applies per-invocation: max_memory_bytes via a rebuilt StoreLimits, max_timeout_secs via the epoch deadline (non-daemon capsules only), and max_ipc_throughput_bytes via the rate limiter. IpcRateLimiter is rekeyed from Uuid to (Uuid, PrincipalId) so two principals sharing a single capsule instance never starve each other's throughput. ManagedProcess and active HTTP streams are tagged with the creator principal and counted per-principal against the existing per-capsule hard ceilings (MAX_BACKGROUND_PROCESSES = 8, MAX_ACTIVE_HTTP_STREAMS = 4) — the effective cap is always min(profile, hard_cap). Profile load failures (malformed TOML, unknown fields, invalid values, future profile_version) fail the invocation closed; there is no fallback to the capsule owner's limits. max_storage_bytes is read but not enforced (no storage accountant yet; deferred). Single-tenant deployments without a profile.toml get Layer 2's Default profile and see no behavior change. (#666)
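
The min(profile, hard_cap) rule is small enough to show directly. The constant value comes from the notes; effective_cap and may_spawn are hypothetical helpers, and treating the profile limit as a simple process count is an assumption.

```rust
/// Hard ceiling quoted in the release notes.
const MAX_BACKGROUND_PROCESSES: usize = 8;

/// A profile can lower the limit but never raise it past the hard cap.
fn effective_cap(profile_limit: usize, hard_cap: usize) -> usize {
    profile_limit.min(hard_cap)
}

/// Whether a principal that already runs `running` background processes
/// may start one more under its profile limit.
fn may_spawn(running: usize, profile_limit: usize) -> bool {
    running < effective_cap(profile_limit, MAX_BACKGROUND_PROCESSES)
}

fn main() {
    println!("{}", effective_cap(4, MAX_BACKGROUND_PROCESSES));
    println!("{}", effective_cap(100, MAX_BACKGROUND_PROCESSES));
    println!("{}", may_spawn(4, 4));
}
```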

  • Per-principal PrincipalProfile + profile.toml. New astrid_core::profile module with the per-principal policy struct (enablement, groups, auth methods, network egress, process spawn, resource quotas) plus loader and atomic writer at ~/.astrid/home/{principal}/.config/profile.toml. Missing file falls back to Default; malformed TOML, unknown fields, failed validation, or a future profile_version are hard errors. Save is atomic on Unix (temp write at 0o600, then rename). Validation fires on both load and save. Pure data plumbing — Layer 3 enforcement in invoke_interceptor, hot-reload, management IPC, and CLI are separate follow-ups. (#663)

  • Content-addressed WIT store at ~/.astrid/wit/{blake3}.wit. Install-time WIT files (including deps/) are recursively hashed, deduped, and stored with atomic writes. Per-capsule wit/ is removed after addressing; meta.json.wit_files is the authoritative manifest. Append-only by design for replay preservation. (#649)

  • astrid wit gc — admin-only mark-sweep GC for the WIT content store. Dry-run by default, --force to delete. Scans all principal homes + workspace. (#649)

  • Per-invocation home:// and /tmp/ VFS scoping. A shared capsule serving multiple agents now resolves home:// and /tmp/ against the invoking agent's home directory (~/.astrid/home/{principal}/) instead of the capsule owner's. The security gate accepts a principal_home parameter and WasmEngine::invoke_interceptor builds a per-principal VFS bundle when the invocation principal differs from the capsule's. Unregistered principals (no home directory on disk) receive a clean denial — the kernel does not auto-create principal homes. Single-tenant installs (all traffic under default) see no behavior change. Precursor to multi-tenancy (#653). (#549)

  • Per-invocation SecretStore and capsule log re-scoping. has_secret now reads secrets from the invoking agent's KV namespace (and OS keychain scope), and astrid_log writes to the invoking agent's ~/.astrid/home/{principal}/.local/log/{capsule}/{date}.log. HostState gains invocation_secret_store / invocation_capsule_log fields plus effective_secret_store() / effective_capsule_log() accessors; WasmEngine::invoke_interceptor installs per-principal resources alongside the existing invocation VFS bundle and clears them on exit. Unregistered principals receive None — no attacker home is auto-created. Finishes #653 Layer 1's side-channel isolation started in #659. (#661)

  • WIT-driven IPC topic schemas. Capsules declare wit_type = "record-name" on [[topic]] entries in Capsule.toml. At install time, wit-parser reads the record from the capsule's wit/ directory, extracts field names, types, and /// doc comments into JSON Schema, and bakes it into meta.json. At runtime, WasmEngine::load() populates the SchemaCatalog from baked schemas. The LLM sees typed field descriptions without capsule authors writing JSON Schema by hand. (#643)

  • astrid-build::wit_schema module — converts WIT records to JSON Schema. Handles primitives, option<T>, list<T>, tuple, enum, flags, variant, result, nested records, and type aliases. (#643)

  • wit_type: Option<String> field on TopicDef in Capsule.toml — references a WIT record by kebab-case name. (#643)

  • Schema catalog (SchemaCatalog) for A2UI Track 2 — maps IPC topics to schema definitions. Populated at capsule load time from baked meta.json schemas. (#632, #643)

  • Epoch-based WASM timeout with EpochTickerGuard RAII type — replaces Extism wall-clock timeout. 5-minute deadline for interceptors, u64::MAX for daemons/run-loops, 10-minute safety net for lifecycle hooks. (#632)

  • 64MB per-capsule WASM memory limit via StoreLimitsBuilder (matches old Extism setting). Global budget for multi-tenant hosting is a follow-up (#639). (#632)

  • New WIT record types: spawn-request, interceptor-handle, net-read-status (variant), capability-check-request/response, identity-*-request, elicit-request. (#632)
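
The per-invocation quota entry above says the effective cap is always min(profile, hard_cap) and that limits are tracked per principal rather than per capsule alone. The sketch below illustrates that rule under stated assumptions: `effective_cap`, `ProcessCounter`, the `u128` stand-in for the capsule `Uuid`, and the string principal IDs are all hypothetical names, and counting per (capsule, principal) pair is a simplification of whatever accounting the kernel actually does.

```rust
use std::collections::HashMap;

/// Hard ceiling quoted in the release notes.
const MAX_BACKGROUND_PROCESSES: usize = 8;

/// A profile may lower the ceiling but can never raise it.
fn effective_cap(profile_limit: usize, hard_cap: usize) -> usize {
    profile_limit.min(hard_cap)
}

/// Hypothetical counter keyed by (capsule id, principal), echoing the
/// (Uuid, PrincipalId) rekeying described for IpcRateLimiter.
#[derive(Default)]
struct ProcessCounter {
    running: HashMap<(u128, String), usize>,
}

impl ProcessCounter {
    /// Reserve one slot for `principal`; returns false once its cap is reached.
    fn try_spawn(&mut self, capsule: u128, principal: &str, profile_limit: usize) -> bool {
        let cap = effective_cap(profile_limit, MAX_BACKGROUND_PROCESSES);
        let count = self
            .running
            .entry((capsule, principal.to_string()))
            .or_insert(0);
        if *count >= cap {
            return false;
        }
        *count += 1;
        true
    }
}
```

Because the map key includes the principal, one agent exhausting its slots on a shared capsule leaves another agent's count untouched.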

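The profile.toml entry above describes an atomic save on Unix: a temp write at 0o600, then a rename. A minimal sketch of that pattern with the standard library is shown here; the function name `atomic_write_private` and the `.tmp` suffix are assumptions for illustration, not the names used in astrid_core::profile.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::{Path, PathBuf};

/// Write `contents` to `target` so readers see either the old file or the
/// complete new one, never a partial write. Unix only.
fn atomic_write_private(target: &Path, contents: &[u8]) -> io::Result<()> {
    // Hypothetical temp naming; a production writer would likely add a
    // random suffix to avoid collisions between concurrent writers.
    let mut tmp_name = target.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    // create_new guarantees the file is freshly created, so the 0o600 mode
    // is applied rather than inherited from a stale file.
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .mode(0o600)
        .open(&tmp)?;
    file.write_all(contents)?;
    // Flush data to disk before the rename makes it visible.
    file.sync_all()?;
    drop(file);

    // rename replaces the destination atomically within one filesystem.
    fs::rename(&tmp, target)
}
```

Keeping the temp file next to the target matters: a rename across filesystems is not atomic and would fail or fall back to a copy.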
Changed

  • Persistent daemon no longer auto-shuts down after 5 minutes idle. The non-ephemeral mode (astrid start) used to default ASTRID_IDLE_TIMEOUT_SECS to 300s and kill operator daemons mid-session. Idle shutdown is now opt-in for persistent mode — ASTRID_IDLE_TIMEOUT_SECS=<secs> re-enables it for housekeeping flows that genuinely want auto-shutdown. --ephemeral mode is unchanged (30s after last disconnect, env-var override still honoured).
  • astrid capsule install/update/list/remove no longer accept --agent/-a or --group/-g. Capsules are deployed once and shared across every principal — per-invocation isolation is provided by the kernel's caller-context scoping (KV namespace, home, secrets, log, quotas, audit), not by duplicating the WASM. The --agent flag previously parsed and was discarded (let _ = agent; in dispatch.rs); the silent no-op misled operators into thinking they were installing per-tenant. Removed in line with the NEAR-style "one contract, many callers" model. Capsule config and secrets remain per-agent: astrid capsule config -a <agent> and astrid secret set ... -a <agent> still scope to a principal because they hold per-tenant data, not capsule code.
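
The idle-shutdown change above can be summarised as a small decision table: persistent mode has no timeout unless ASTRID_IDLE_TIMEOUT_SECS is set, while ephemeral mode defaults to 30s and still honours the override. The sketch below encodes that table; `idle_timeout` is a hypothetical helper, and treating an unparseable value as unset is an assumption, not documented behaviour.

```rust
use std::time::Duration;

/// Ephemeral default quoted in the release notes (30s after last disconnect).
const EPHEMERAL_DEFAULT_SECS: u64 = 30;

/// Resolve the idle-shutdown timeout. `env_value` stands in for the raw
/// ASTRID_IDLE_TIMEOUT_SECS string; `None` means the variable is unset.
/// A `None` result means idle shutdown is disabled.
fn idle_timeout(ephemeral: bool, env_value: Option<&str>) -> Option<Duration> {
    // Assumption: values that do not parse as seconds are ignored.
    let parsed = env_value.and_then(|v| v.trim().parse::<u64>().ok());
    match (ephemeral, parsed) {
        // An explicit override applies in both modes.
        (_, Some(secs)) => Some(Duration::from_secs(secs)),
        // Ephemeral mode keeps its built-in default.
        (true, None) => Some(Duration::from_secs(EPHEMERAL_DEFAULT_SECS)),
        // Persistent mode is now opt-in: no variable, no auto-shutdown.
        (false, None) => None,
    }
}
```

A caller would pass `std::env::var("ASTRID_IDLE_TIMEOUT_SECS").ok().as_deref()` as `env_value`.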

Removed

  • Raw .wasm release asset install paths. Capsule distribution is now .capsule archive or clone+build. Raw WASM assets can't carry WIT dependencies. (#649)

  • install_standard_wit() from init. Fetched stale per-interface WIT from upstream repo; shared contracts are now bundled by astrid-build into each capsule archive. (#649)

  • extism dependency — replaced by direct wasmtime 43 + wasmtime-wasi 43. (#632)

  • capsule_abi.rs (252 lines) — hand-written WIT type mirrors. (#632)

  • host/shim.rs (430 lines) — Extism dispatch shim, WasmHostFunction enum, register_host_functions(), manual memory helpers. (#632)

  • RiskLevel enum and all references — removed from WIT, IPC payloads, approval engine, audit entries, CLI renderers, policy engine, and test fixtures. Approval prompts now render with a single style. The allowance store handles "don't ask again" patterns without risk classification. (#641)

Fixed

  • astrid who now attributes connections to the actual agent holding the socket. The kernel's per-principal active_connections counter was previously incremented by client.v1.connected events fired from net_accept with no principal stamp — at accept time the host has no idea which principal will eventually claim the socket, so the unstamped publish landed under default. The uplink (cli capsule) now publishes the lifecycle event with the claimed principal once the first principal-stamped ingress message arrives. DaemonStatus gains connections_by_principal: Vec<PrincipalConnectionCount> (optional / skip-if-empty) so older daemons stay wire-compatible. The CLI uses the shared SocketClient::extract_kernel_response to decode the response (so any future envelope change is picked up uniformly across ps / daemon / who) and falls back to the bare count + default attribution when the new field is absent.

  • astrid caps check <self> works for non-admin queriers. Previously failed because the CLI calls admin.group.list to resolve group-inherited caps and GroupList required the admin-tier group:list. Now self-scoped: (GroupList, AuthorityScope::Self_) requires self:group:list, which the agent builtin satisfies via self:*. The mutating group operations (group create / delete / modify) keep their dedicated caps (group:create, group:delete, group:modify) and remain AuthorityScope::Global — read-only widening only.

  • ipc::recv and ipc::poll install per-message invocation context and clear on empty. Run-loop capsules that consume IPC via subscribe + recv (prompt-builder, registry, context-engine) used to silently fall back to the capsule owner's principal (default), so non-default agent chat hung when the chain bounced through a run+recv capsule. The host now mirrors the dispatcher's per-interceptor invocation-context setup onto the recv path: caller_context, invocation_kv, and invocation_capsule_log are set from the first message of a recv'd batch, so subsequent publishes / KV reads stamp the publisher's principal correctly. Mixed-principal batches are truncated to the first publisher's contiguous prefix with a security_event = true warn, so trailing messages from a different principal can't be silently mis-stamped under the first's context. Empty drains (timeout, cancellation, idle poll) explicitly clear the recv context so a previous publisher's principal doesn't leak into the next guest host call. Same-principal fast path: when the new message's principal matches the currently installed one, the KV namespace and capsule log are reused instead of re-opened, dropping per-tick I/O for the steady-state chat run loop.

  • astrid secret set routes secret-typed values through the file-per-secret store (security). Previously every secret set invocation wrote plaintext JSON to <principal_home>/.config/env/<capsule>.env.json regardless of whether the manifest declared the key as env_type = "secret". OPENAI_API_KEY and friends sat in cleartext on disk even though the manifest correctly marked them secret. The CLI now reads the installed Capsule.toml, looks up the EnvDef for the key, and writes secret-typed values through a new astrid_storage::FileSecretStore rooted at ~/.astrid/secrets/<scope>/<capsule>/<key> with 0600 perms (atomic via tempfile+rename, 64 KiB per-file read cap, null-byte and path-separator rejection in keys). --scope agent|shared controls per-agent vs host-wide; the manifest does NOT declare scope — capsules can't elevate their own credentials to host-shared. Kernel-side get_config consults the same store at invocation time, per-agent first with host-wide fall-through. Plaintext secret-typed env JSON entries are stripped at capsule load with a security_event = true warning so legacy on-disk state heals on next boot. secret list cross-references the manifest's [env] declarations to surface each row with a STORAGE column — file (green) for the new path, env-json (dim) for non-secret config, env-json (LEGACY!) (red) for secret-typed keys still on plaintext, awaiting the load-time heal. secret delete checks the file store first when the manifest declares the key secret. Pivoted away from an earlier OS-keychain implementation because headless containers and CI environments often lack a DBus secret-service / keyring; the file path works uniformly across Linux, macOS, Windows without backend gymnastics.

  • caps check now resolves group-inherited capabilities. Previously the CLI answered "indeterminate: 'bob' belongs to groups agent — group capabilities not enumerated by Layer 6" whenever the requested capability could only be satisfied through group membership. The new implementation runs the same resolution order the kernel's Layer 5 enforcement preamble uses (explicit revokes → direct grants → group-inherited patterns), calling astrid_core::capability_grammar::capability_matches (the kernel-side matcher) on the data returned by admin.agent.list + admin.group.list. It reports the matching pattern in the output so operators can trace why a capability resolved the way it did. Verified live: caps check bob self:capsule:install returns allowed: 'bob' inherits from group 'agent' (pattern: self:*).

  • caps grant / caps revoke / caps check help text updated to a syntactically valid example. The previous example was network:egress:api.openai.com, but the capability grammar disallows dots in segments (segments are [a-zA-Z0-9_-]+ or bare *, no shell metacharacters so strings round-trip through TOML and the audit log without escaping). Operators following the example got capability segment "api.openai.com" contains invalid character '.'. Help text now shows network:egress:openai and similar dot-free examples, and explicitly names the grammar.

  • caps show no longer mislabels revoked grants as active (precedence display). Layer 5 evaluates revoke > grant; a grant shadowed by any revoke pattern is dead at check time even though both rows persist on disk. The pretty renderer now uses the kernel's own capability_matches(revoke, cap) to detect shadowing (so pattern revokes like sys:* correctly mark sys:status grants as shadowed by revoke), instead of exact string equality which missed pattern-based suppression. Fixes column alignment in the table at the same time — ANSI escape codes inside {:<N} format specs were counted toward width but didn't render visually, shifting every subsequent column.

  • agent create provisions the new principal's home directory tree, fail-closed. The kernel handler now calls principal_home(&p).ensure() after writing etc/profiles/{p}.toml, so ~/.astrid/home/{p}/.local/{kv,log,audit,tmp,tokens,capsules} and ~/.astrid/home/{p}/.config/env exist before the first interceptor scoped to the new principal fires. Without this, caller_context.principal resolved to a directory that didn't exist and per-invocation KV/secret/log/tmp/audit overrides had nowhere to land — silent fallback to default's namespace would defeat multi-tenancy at the data plane. If provisioning fails, the handler rolls back the identity link and profile file and returns an error rather than leaving a half-provisioned agent whose invocations would leak into another principal's namespace.

  • CLI read_message skips unparseable broadcast frames instead of dying on the first one. The kernel publishes astrid.v1.capsules_loaded with an IpcPayload::RawJson whose inner JSON is emitted via to_guest_bytes — the type discriminator is stripped before the proxy forwards it to the socket. Strict serde_json::from_slice::<IpcMessage> propagated the resulting missing field type error up through the TUI run loop on the very first tick, broke out, restored the terminal, and dropped the operator back to a shell prompt with no visible cause. read_message now logs unparseable frames at debug and reads the next valid frame instead. The crash was masked while the wasip2 stub-run misclassification was suppressing every interceptor — once that fix landed, the chat-stack capsules actually started publishing the broadcast and the silent TUI exit became reachable on every boot.

  • IpcMessage now tolerates wire frames that omit timestamp and signature. Both fields are serde-defaulted (timestamp falls back to Utc::now() on deserialize, signature to None). The CLI proxy capsule (astrid-capsule-cli) forwards bus messages to socket clients using only the fields exposed by the SDK's ipc::Message: {topic, payload, source_id}. The SDK does not surface the original timestamp or signature, so every CLI proxy frame previously omitted them and the headless client's strict from_slice::<IpcMessage> silently failed on every frame — astrid -p and the chat REPL never received agent.v1.response and either timed out at 120s or hung indefinitely. Defaulting at the deserialization boundary preserves semantics for in-process publishers (which always set the fields via IpcMessage::new) while letting forwarding capsules pass partial frames through. Round-trip behaviour for already-complete frames is unchanged.

  • Interceptor dispatch restored for capsules without a #[astrid::run] loop. After the wasmtime Component Model migration, the kernel's run-loop pre-scan misclassified every Component Model capsule as a live run-loop daemon, zeroed out the store/instance, and returned NotSupported("plugin handles interceptors internally via IPC auto-subscribe") for every direct interceptor invocation. End-to-end chat REPL was non-functional. The wasm32-wasip2 toolchain auto-synthesizes a single shared nop function for every mandatory WIT func() export the source crate doesn't implement (run, astrid-install, astrid-upgrade) and aliases all unimplemented exports to that one function. The pre-scan now treats a run export aliased to astrid-install or astrid-upgrade as a stub and routes the capsule through the direct-invoke path. The same fix applies to lifecycle pre-scans, so capsules without real #[astrid::install] / #[astrid::upgrade] no longer compile a transient component just to run a nop. Real #[astrid::run] daemons (prompt-builder, context-engine, cli) keep using the auto-subscribe path introduced in #343.

  • [[topic]] declarations now accept trailing-suffix wildcards (e.g. llm.v1.request.generate.*). The previous validator rejected every wildcard in topic names, which broke fan-out topic families where the trailing segment names a provider, source, or recipient that can't be enumerated at manifest-author time (multiple LLM providers, multiple session callbacks, hook fan-out targets). Every member of the family shares the same envelope, so a pattern is the genuine schema declaration. Mid-segment (a.*.b) and leading (*.b) wildcards are still rejected — the bus matcher only supports trailing-suffix wildcards, so those would silently never fire. Bare * is rejected as too broad. Mirrors ipc_subscribe's host-side check.

  • bwrap mount ordering hides capsule directory on Linux. Hidden --tmpfs overlays (e.g. ~/.astrid) were applied before writable --bind mounts, erasing capsule directories inside the hidden path. Reordered so bind-mounts come after tmpfs and punch through. Mirrors the ancestor check already present in the macOS Seatbelt path (PR #534). (#648)

  • bwrap silently fails on Ubuntu 24.04+ (AppArmor). kernel.apparmor_restrict_unprivileged_userns=1 blocks user namespaces required by bwrap. Added a cached probe at startup that detects this and falls back to unsandboxed execution with a clear warning and distro-specific install instructions. This unsandboxed fallback is superseded in this release by the Required sandbox policy described under Breaking (#655). (#648)

  • capsule remove no longer deletes env config by default. User configuration (API keys, secrets) in env.json is preserved across uninstall/reinstall cycles. Use --purge to explicitly delete saved configuration. (#647)

  • astrid-build targets wasm32-wasip2 for Component Model capsules. Was still targeting wasm32-wasip1, producing plain WASM modules. (#649)

  • astrid-build bundles SDK shared WIT (astrid-contracts.wit) into capsule archives as a WIT dep, so wit_type references in Capsule.toml resolve at install time without manual WIT duplication. (#649)

  • JSON Schema field names converted to snake_case in wit_schema to match serde(rename_all = "snake_case") wire convention. (#649)

  • reqwest::blocking inside #[tokio::main] panics on first run. All HTTP call sites in self_update.rs, init.rs, and capsule/install.rs used reqwest::blocking::Client, which creates an internal tokio runtime that panics on drop inside the outer async context. Converted to async reqwest. Only manifested on fresh installs (no cache/lock files to short-circuit). (#645)

  • INTERNAL_SUBSCRIBER_COUNT debug_assert race. EventDispatcher subscribed to the event bus inside tokio::spawn(dispatcher.run()), so the assert could fire before the spawned task started. Moved subscription into EventDispatcher::new(). (#645)
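
Several caps entries above rely on the same three ideas: the segment grammar ([a-zA-Z0-9_-]+ or bare *), pattern matching, and the resolution order of revokes, then direct grants, then group patterns. The sketch below puts them together. It is not astrid_core::capability_grammar: the function names are hypothetical, and the wildcard semantics (a trailing * matches any non-empty remainder) are an assumption inferred from the self:* and sys:* examples in the notes.

```rust
/// Check colon-separated segments: each is a bare `*` or [a-zA-Z0-9_-]+.
fn validate_capability(cap: &str) -> Result<(), String> {
    for seg in cap.split(':') {
        if seg == "*" {
            continue;
        }
        if seg.is_empty() {
            return Err("empty capability segment".to_string());
        }
        let bad = seg
            .chars()
            .find(|c| !(c.is_ascii_alphanumeric() || *c == '_' || *c == '-'));
        if let Some(bad) = bad {
            return Err(format!(
                "capability segment \"{seg}\" contains invalid character '{bad}'"
            ));
        }
    }
    Ok(())
}

/// Assumed semantics: a trailing `*` matches one or more remaining segments.
fn pattern_matches(pattern: &str, cap: &str) -> bool {
    let p: Vec<&str> = pattern.split(':').collect();
    let c: Vec<&str> = cap.split(':').collect();
    for (i, seg) in p.iter().enumerate() {
        if *seg == "*" && i == p.len() - 1 {
            return c.len() > i;
        }
        match c.get(i) {
            Some(s) if s == seg || *seg == "*" => {}
            _ => return false,
        }
    }
    p.len() == c.len()
}

#[derive(Debug, PartialEq)]
enum Decision {
    Denied,
    Allowed { via: String },
}

/// Resolution order from the notes: revokes, then grants, then groups.
fn check(
    cap: &str,
    revokes: &[&str],
    grants: &[&str],
    groups: &[(&str, Vec<&str>)],
) -> Decision {
    if revokes.iter().any(|r| pattern_matches(r, cap)) {
        return Decision::Denied;
    }
    if let Some(g) = grants.iter().find(|g| pattern_matches(g, cap)) {
        return Decision::Allowed { via: format!("direct grant {g}") };
    }
    for (name, patterns) in groups {
        if let Some(p) = patterns.iter().find(|p| pattern_matches(p, cap)) {
            return Decision::Allowed {
                via: format!("group '{name}' (pattern: {p})"),
            };
        }
    }
    Decision::Denied
}
```

Checking revokes first is what makes a grant "shadowed": a sys:status grant is never reached once a sys:* revoke matches, which is the display case the caps show fix addresses.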

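The [[topic]] wildcard entry above accepts only trailing-suffix wildcards and rejects mid-segment, leading, and bare * patterns. A minimal sketch of such a validator and a matching helper follows. Both function names are hypothetical, and the choice that the wildcard must cover a non-empty suffix is an assumption; the notes do not say whether * spans one segment or several.

```rust
/// Accept plain dotted names and names ending in `.*`; reject everything
/// else that contains a wildcard.
fn validate_topic(name: &str) -> Result<(), &'static str> {
    if name == "*" {
        return Err("bare * is too broad");
    }
    let segments: Vec<&str> = name.split('.').collect();
    let last = segments.len() - 1;
    for (i, seg) in segments.iter().enumerate() {
        if seg.is_empty() {
            return Err("empty segment");
        }
        // A wildcard must be the whole segment and must be the final one.
        if seg.contains('*') && (*seg != "*" || i != last) {
            return Err("wildcard is only allowed as the trailing segment");
        }
    }
    Ok(())
}

/// Trailing-suffix match: `a.b.*` matches `a.b.<anything non-empty>`.
fn topic_matches(pattern: &str, topic: &str) -> bool {
    match pattern.strip_suffix(".*") {
        Some(prefix) => topic
            .strip_prefix(prefix)
            .and_then(|rest| rest.strip_prefix('.'))
            .map_or(false, |rest| !rest.is_empty()),
        // No wildcard: require an exact match.
        None => pattern == topic,
    }
}
```

Rejecting a.*.b at validation time, rather than letting it through, avoids the failure mode the entry warns about: a declaration that parses but never fires.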
Install

From source (requires Rust 1.94+):

cargo install astrid

Pre-built binaries:
Download the archive for your platform, extract, and add to PATH:

tar xzf astrid-*-$(uname -m)-*.tar.gz
sudo mv astrid-*/astrid astrid-*/astrid-daemon astrid-*/astrid-build /usr/local/bin/

Then run astrid init to set up capsules.


With many thanks to the following Astrinauts 🚀

  • Joshua J. Bouw
  • Pavel Grigorenko