release: v0.2.1 — close audit-flagged completeness gaps + Bedrock LLM migration
Replaces the legacy Azure / remote Qwen / local Ollama / OpenAI-compat LLM
backends with a single Anthropic SDK transport that defaults to AWS Bedrock.
Closes the 28 gaps flagged by the v0.2.0 multi-agent audit, surfacing
v0.2.0 fields the daemon already produced but never forwarded, fixing
visible client bugs (breathing pacer, fold restore, stop button, camera
onboarding), and wiring the previously-dead session_report, ML classifier,
recovery credit, and retention sweep into production.
LLM / transport (Bedrock primary):
- New cortex.libs.llm.anthropic_client: build_anthropic_sdk_client +
resolve_anthropic_model_id select between AsyncAnthropicBedrock,
AsyncAnthropicVertex, and AsyncAnthropic via ANTHROPIC_PROVIDER.
- New cortex.services.llm_engine.anthropic_planner.AnthropicPlanner:
per-template model tier (Haiku / Sonnet / Opus), forced tool-use for
structured InterventionPlan output, ephemeral prompt caching on the
system block, bounded retry with jittered backoff, async semaphore
concurrency cap, consecutive-failure circuit breaker, structured
llm.request observability logs.
- BYOK migrated to AWS Bedrock bearer token in macOS Keychain
(cortex.bedrock / bearer_token); onboarding step 3 collects it with a
region picker. Daemon startup surfaces the token via
AWS_BEARER_TOKEN_BEDROCK.
- Deleted azure_openai.py, remote_qwen.py, local_ollama.py,
setup_ssh_tunnel.sh, run_llm_server.py. LLMConfig rewritten with
legacy-value validators so 0.1.x .env files don't crash on first
launch after upgrade.
Detection / state engine:
- Reverted StateEstimate.confidence to the raw smoothed dominant score
so the spec's 0.85 trigger gate is mathematically reachable (softmax
over 4 saturated scores capped at ~0.475).
- PerUserLogisticClassifier now instantiates when state.ml_enabled and a
serialized model exists; smoother blends ml_p_hyper into HYPER with
alpha ramped by ml_min_labeled_episodes / ml_alpha_full_at_episodes.
- StressIntegralTracker receives hrv_sigma from
baselines.metric_distributions['hrv_rmssd'].std so the integral
matches the AMIP safety-floor z-score scale.
- apply_recovery_credit called on FLOW recovery in
_handle_restore_updates.
- mic_active / fullscreen_active sourced from
cortex.libs.utils.receptivity (CoreAudio + Quartz) instead of being
hard-coded False.
- Unified the two hyper_dwell_seconds config fields; TriggerPolicy now
reads StateConfig only.
WebSocket contract:
- _make_state_update payload adds stress_integral,
calibrated_probabilities, classifier_source, classifier_alpha, ISO
timestamp. _make_intervention_trigger adds causal_explanation,
consent_level, plan_warnings — VS Code "Why this?" panel and popup
transparency section are no longer starved.
- New INTERVENTION_APPLIED ack: clients send
{intervention_id, phase, success, applied_actions, errors} after
apply / restore. Daemon updates Mutation.success so
InterventionOutcome.workspace_restored is truthful.
- Dedicated 500ms _broadcast_loop independent of the pipeline; previous
cadence drifted to 2-3s during LLM trigger work.
- broadcast_settings now emits intervention_dismiss_cooldown_ms /
url_dismiss_cooldown_ms (W-16 daemon-side half).
- Fixed parser.py:280 digit regex (r"\\d+..." -> r"\d+...") so
LLM-grounded causal explanations are retained.
VS Code extension:
- panel-provider no longer rebuilds HTML on every STATE_UPDATE; host
posts diff messages and the inlined script patches DOM in place, so
the breathing pacer animation actually completes its 19-second cycle.
- restoreFoldState sets editor.selection before each editor.fold;
snapshot tracks the ranges we explicitly folded (instead of every
foldable range), so dismiss leaves the file in its pre-intervention
fold state.
- COPILOT_THROTTLE emitted from daemon via send_message(...,
target_client_types=["vscode"]); previously the VS Code handler was
orphaned.
- handleIntervention reads ui_plan.max_visible_lines instead of the
hard-coded ±20-line window.
- context-provider populates recent_edits from
onDidChangeTextDocument (file/line/length/kind, privacy-preserving).
- ws-client.sendInterventionApplied wired; extension.ts acks apply.
Desktop shell:
- Dashboard _stop_btn and _goal_input wired through new stop_requested /
goal_set signals to _shutdown_daemon and USER_ACTION.
- SettingsDialog persists every widget via QSettings("Cortex","Desktop")
and reconciles SETTINGS_SYNC payloads via apply_payload(). LLM combo
box rebuilt for Bedrock / Vertex / direct / rule_based.
- Onboarding "Grant Camera Access" calls
cortex.libs.utils.platform.request_camera_permission() so the
AVFoundation prompt actually fires; falls back to System Settings on
pyobjc-missing.
- WS-mode CortexApp now wires show_connections_requested and the
dashboard Connect button (in-process mode was already wired).
- Onboarding step 4 has a real "Open Connections" button.
- Legacy WS reconnect uses 3->30s exponential backoff matching the
browser / VS Code clients.
Browser extension:
- tab-manager classifyTabType output collapsed to the 9-type Python
vocabulary so daemon-side parser keeps the values.
- TabInfo.last_activated_ago_seconds added to the schema.
- newtab.tsx injects an inline <style> with
html,body,#__plasmo{background:...} so the new-tab page no longer
white-flashes before page-reset.css attaches.
- Deleted orphan content.tsx (never auto-injected) and stale
manifest.json (Plasmo authoritative).
- W-06 regression fixed: build/create/implement/make/write removed
from rabbit_hole.py stop_words; regression test added.
Lifecycle + retention:
- SessionReportGenerator wired into runtime_daemon: start in __init__,
record_state/hr/hrv/stress in the state loop, finish() in stop()
writes storage/sessions/session_<id>.json.
- New cortex.services.janitor.retention.sweep_once runs daily;
StorageConfig.{session,feature,error}_retention_days are finally
enforced.
Build & packaging:
- build_macos_app.sh switched from grep denylist to an explicit allowlist
for bundled .env keys, with a defence-in-depth grep that aborts the
build if any forbidden pattern slips through.
- pyproject.toml bumped to 0.2.1; cortex.spec reads CFBundleVersion
dynamically via tomllib.
- Dropped fakeredis (declared, never imported).
- redis_store.py asserts converted to explicit raises so they survive
python -O.
- ruff check is 0 errors after auto-fix + targeted per-file-ignores for
test patterns.
Docs:
- cortex/docs/apis.md aligned with the real /api/stress-integral and
/api/helpfulness/summary payload shapes.
Tests:
- pytest cortex/ : 1024 passed, 2 skipped, 0 failed.
- New tests/unit/test_anthropic_planner.py (15 tests covering tool-use
extraction, model-tier routing, retry+jitter, circuit breaker,
cache, fallback chain).
- W-06 regression guard in test_rabbit_hole.py.