Skip to content

feat(agents): per-agent API tokens + bearer auth + REST endpoints (audit Pass 1, Phase 1)#450

Draft
jaylfc wants to merge 39 commits into
masterfrom
feat/agent-friendliness-pass-1
Draft

feat(agents): per-agent API tokens + bearer auth + REST endpoints (audit Pass 1, Phase 1)#450
jaylfc wants to merge 39 commits into
masterfrom
feat/agent-friendliness-pass-1

Conversation

@jaylfc
Copy link
Copy Markdown
Owner

@jaylfc jaylfc commented May 12, 2026

Summary

Pass 1 of the agent-friendliness audit. Establishes the framework through the AgentsApp surface, lands the first agent-to-UI primitive (`ui.notify`), ships the `taosctl` CLI, and gates everything with B/C-tier conformance tests that run on every PR.

What's in

  • Per-agent opaque API tokens (`taos_agent_*`) — SHA-256 hashed at rest, one active token per agent with a unique partial index, atomic re-issue, immediate revoke (incl. cascade on agent delete).
  • Bearer-token path in `AuthMiddleware` — case-insensitive on the `Bearer` keyword (RFC 6750), session-cookie auth untouched.
  • Canonical error shape on every error path: `{error, detail, fix, doc_url}`. Global `RequestValidationError` handler covers Pydantic 422s. 27 ad-hoc `JSONResponse({"error": ...})` sites in `routes/agents.py` converted.
  • AgentsApp REST audit — see `docs/agents/audit-results.md`. OpenAPI completeness (summary + responses= on every route), action-slug scope enforcement, Idempotency-Key on create/deploy, `GET /api` discovery index.
  • `taosctl` CLI as a sibling top-level package — `agents *` (list/get/create/update/start/stop/pause/restart/logs/delete + `token issue/revoke` subgroup), `ui notify`, `auth login/status/whoami`. `GET /api/auth/whoami` server endpoint added so `taosctl auth whoami` works for bearer + session callers.
  • `docs/agents/` — README, getting-started, concepts/permissions, recipes/managing-agents, recipes/notifying-the-user. Ingested into `knowledge_store` on startup as `category=agent-docs` with `source_id='taos-core'` (forward-prep for community app guides — see Pass 2: per-app agent guides ingestion (App Builder integration) #451).
  • In-OS taOS helper agent swapped from `_load_manual()` to `knowledge_store` (`build_system_prompt` queries `agent-docs` items per-request; legacy manual is fallback only).
  • `ui.notify` — first agent-to-UI primitive. POST /api/ui/notify accepts `{title, body, priority, app_origin?}`, lands in NotificationStore with `source='agent:'` for audit. HTML-escaped in the HTMX renderer.
  • One-shot token migration on startup — walks `config.agents`, issues + injects `TAOS_TOKEN` via `backend.set_env` when the runtime is LXC; short-circuits on Docker/Apple/no-runtime.
  • Conformance tests — B-tier (`tests/conformance/test_agents.py`, `test_ui_notify.py`) + C-tier (recipe-replay runner). A-tier nightly smoke scaffold (`scripts/conformance-smoke.py` + `conformance-nightly.yml` workflow_dispatch).
  • Tracking epic: epic(agent-friendliness): Pass 2+ follow-ups #453. Side issues: Pass 2: per-app agent guides ingestion (App Builder integration) #451 (per-app guides), Pass 2: concurrency hardening on per-agent tokens + idempotency cache #452 (multi-worker concurrency).

CodeRabbit reviews addressed inline: 3 actionable findings on the first round (touch_last_used best-effort, set_env-failure rollback, RUF059), 8 actionable findings on the second round (Critical: XSS in notif HTMX renderer, Major: unbounded idempotency cache, Major: action_url accept-but-discard, Major: unique-active-token DB constraint, Minor: status-code assertions, default-priority verification, scope.py F541, http_client.put round-out).

Test plan

  • Full Pass 1 scope suite locally: 781 passed
  • `test_routes_agents.py` regression: clean
  • CI: lint ✅ dependency-audit ✅ spa-build ✅ test (3.12/3.13) — pending final run
  • CodeRabbit: two review rounds, all actionable findings addressed
  • Manual smoke (Jay): `taosctl ui notify` from inside an LXC agent surfaces in the desktop notification panel
  • Manual smoke (Jay): issue + revoke flow end-to-end against the Pi controller
  • Pi rollout: deploy to LXC, exercise create/start/token endpoints

Out-of-Pass-1 scope (tracked in #453)

  • Pagination cursor on list endpoints, response_model= success types, multi-user NotificationStore, Docker/Apple token injection, more A-tier smoke tasks, taosctl shell completions.

Spec: `docs/superpowers/specs/2026-05-12-agent-friendliness-audit-design.md` (gitignored, local).
Plan: `docs/superpowers/plans/2026-05-12-agent-friendliness-audit-pass-1.md` (gitignored, local).

Summary by CodeRabbit

Release Notes

  • New Features

    • Agent bearer token authentication with scope-based permission control
    • Token management endpoints for issuing and revoking agent API tokens
    • Agent-to-UI notification API endpoint
    • API discovery endpoint and dynamic system prompts built from knowledge base
    • CLI tool (taosctl) for agent and authentication management
    • Request idempotency support via Idempotency-Key header
  • Bug Fixes

    • Standardized error response shape across all endpoints
    • Improved container operation resilience
  • Documentation

    • Agent onboarding and permissions guides
    • API audit results documentation
  • Tests

    • Comprehensive coverage for token authentication, scope enforcement, idempotency, and API conformance

Review Change Stack

jaylfc added 11 commits May 12, 2026 13:44
Adds tinyagentos.errors.error_response() — canonical {error, detail, fix,
doc_url} JSON shape. fix and doc_url turn 403s/404s into self-healing hints
for agents (concrete next step + link to the relevant recipe).

First building block for the agent-friendliness Pass 1 audit.
Code-quality nits from review: imports belong at top of file (matches
project style), and the '# tests/test_errors.py' header comment isn't
a convention used elsewhere in tests/.
Opaque random tokens (taos_agent_<43 b64url chars>), SHA-256 hashed at
rest. Issuing a new token for an agent atomically revokes any prior
active token; lookup matches the hash and gates on revoked_at IS NULL.
…get_metadata

Round out the token lifecycle. revoke_for_agent returns rowcount so
callers can detect 'nothing to revoke.' touch_last_used is best-effort
and updates the last_used_at column on every successful lookup. get_metadata
surfaces non-secret fields for the API's has_token response.
Initialised in lifespan with data_dir/agent_tokens.db, closed cleanly on
shutdown. Available to routes and middleware via request.app.state.agent_tokens_store.
Set eagerly outside lifespan to match the existing pattern so the store
is available to tests that don't start the lifespan.
Recognises 'Authorization: Bearer taos_agent_*' ahead of the existing
session-cookie path. Looks up in AgentTokensStore, populates
request.state.user_id / agent_id / token_scope, touches last_used_at
best-effort. Existing desktop session auth unaffected.

Invalid tokens get the standardised 401 with a 'fix' pointing at the
reissue endpoint. conftest.client fixture now inits agent_tokens_store
so tests can issue real tokens.
POST /api/agents/{name}/token/issue returns plaintext once, revokes any
prior token atomically. DELETE /api/agents/{name}/token revokes. GET on
the agent surfaces has_token + issued_at but never the plaintext.

404s use the standardised error_response with fix + doc_url.
Three small follow-ups from in-progress review:

1. AgentTokensStore: direct unit tests for get_metadata (active /
   no-token / post-revoke). Previously only indirectly covered via the
   API-level test that checks has_token after issue.

2. routes/agents.py: hoist 'from fastapi import Response' to the
   top-level imports rather than the inline import inside
   revoke_agent_token. Matches the existing fastapi import line.

3. auth_middleware.py: make the 'Bearer' keyword check case-insensitive
   (RFC 6750) when matching agent tokens. The token body remains
   exact-case-matched (taos_agent_ prefix is lowercase by construction).
   Added a test asserting Bearer / bearer / BEARER / BeArEr all
   authenticate.
container_exists docstring promises 'errors (incus not installed, daemon
down, malformed output) are treated as unknown and return False' but the
code only handled non-zero exit codes. asyncio.create_subprocess_exec
raises FileNotFoundError (an OSError subclass) when the binary itself is
missing, which leaked through to callers — observable when DELETE
/api/agents/{name} ran in an environment without incus.

Wraps the _run call in try/except OSError to match the documented
contract.
When an agent is deleted, its token is revoked immediately so no
lingering credential can authenticate after the agent itself is gone.
Applies on the success path of _archive_agent_fully (orphan hard-delete,
tombstone, and full archive); if archiving errors, the token survives.
Walks config.agents, issues tokens for agents that don't have one (when
the global container runtime is LXC), and injects TAOS_TOKEN via
backend.set_env on the matching taos-agent-<name> container. Idempotent
— agents with has_token are skipped. When the runtime is Docker, Apple
Containerization, or absent, the migration short-circuits with no
side effects; those env-injection paths land in Pass 2.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

Warning

Rate limit exceeded

@jaylfc has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 38 minutes and 47 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 2d465157-ebd9-4ef6-b773-81f69daa0bf9

📥 Commits

Reviewing files that changed from the base of the PR and between ce5e025 and de80490.

📒 Files selected for processing (8)
  • docs/agents/concepts/permissions.md
  • taosctl/cmd_auth.py
  • taosctl/cmd_ui.py
  • taosctl/config.py
  • tests/cli/test_taosctl_agents.py
  • tests/cli/test_taosctl_ui.py
  • tests/conformance/_recipe_parser.py
  • tests/conformance/test_recipes.py
📝 Walkthrough

Walkthrough

End-to-end per-agent bearer-token system: SQLite-backed AgentTokensStore, middleware bearer validation, startup migration for LXC agents, token issue/revoke endpoints, scope and idempotency utilities, knowledge ingestion, CLI, OpenAPI/docs, and a broad test suite.

Changes

Agent Token Authentication System

Layer / File(s) Summary
Token store implementation & schema
tinyagentos/stores/agent_tokens_store.py
Adds AgentTokensStore: SQLite-backed token table, plaintext token generation with fixed prefix, SHA-256 storage, issue/lookup/revoke/has_token/touch/get_metadata APIs.
AgentTokensStore tests & fixture
tests/stores/test_agent_tokens_store.py
Async fixture with temp SQLite DB and tests validating issuance, token-hash != plaintext, revocation-on-reissue, revoke_for_agent counts, has_token, touch_last_used, and get_metadata behavior.
Standardized error model & helper
tinyagentos/errors.py, tests/test_errors.py
Adds Pydantic ErrorResponse and error_response(...) returning canonical {error, detail, fix, doc_url} JSON; tests cover required and optional fields and status codes.
Scope enforcement & Idempotency cache
tinyagentos/scope.py, tinyagentos/idempotency.py, tests/test_idempotency.py, tests/routes/test_agents_scope.py
Adds scope_matches and require_scope decorator for token-scope enforcement; introduces in-memory IdempotencyCache (24h TTL) and wires idempotency into deploy flows; tests validate scope matching and idempotency semantics.
Auth middleware bearer token branch
tinyagentos/auth_middleware.py, tests/test_auth_middleware_bearer.py
Adds early branch to validate Authorization: Bearer taos_agent_* via agent_tokens_store; returns 503 if store missing, 401 for unknown token, populates request.state for valid tokens and best-effort updates last-used. Tests validate valid/invalid flows and Bearer case-insensitivity.
App init, lifespan wiring & state exposure
tinyagentos/app.py, tests/conftest.py, tests/test_app_state.py
create_app constructs AgentTokensStore and IdempotencyCache, assigns to app.state eagerly and initializes/closes store in lifespan; runs run_agent_token_migration at startup (logged non-fatal), starts knowledge monitor/ingest, and registers a RequestValidationError handler.
Agent token migration implementation & tests
tinyagentos/migrations/agent_tokens_migration.py, tests/migrations/test_agent_tokens_migration.py
Adds run_agent_token_migration: LXC-only issuance for agents missing tokens/names, issues tokens with defaults, sets TAOS_TOKEN in containers, and revokes issuance on set_env failure. Tests cover issuance, skip logic, backend skipping, empty-agent no-op, and failure revocation.
Agent routes: token endpoints & integration tests
tinyagentos/routes/agents.py, tests/routes/test_agent_token_endpoints.py
Adds POST /api/agents/{name}/token/issue (returns plaintext once + issued_at), DELETE /api/agents/{name}/token (revokes active token, 204), enhances GET /api/agents/{name} to include token metadata, and ensures agent deletion revokes tokens; integration tests validate issue/revoke/auth flows and unknown-agent errors.
Notifications, API index & OpenAPI checks
tinyagentos/routes/notifications.py, tinyagentos/routes/api_index.py, tests/routes/*
Adds POST /api/ui/notify guarded by ui.notify scope with UiNotifyRequest and persistence into app notifications store; adds /api discovery index and OpenAPI validation tests.
Knowledge ingestion & system prompt
tinyagentos/knowledge_monitor.py, tinyagentos/routes/taos_agent.py, tests/test_knowledge_monitor_agent_docs.py, tests/test_taos_agent_system_prompt.py
Adds ingest_agent_docs to populate knowledge store from docs and build_system_prompt to assemble per-request system prompt from ingested agent docs with legacy fallback; tests validate ingestion, idempotency, title fallback, and prompt composition.
CLI, packaging, and conformance tooling
taosctl/*, scripts/conformance-smoke.py, tests/cli/*, tests/conformance/*
Adds taosctl CLI package (commands for agents/ui/auth), conformance smoke script, recipe parser, recipe replay conformance tests, and CLI unit tests.
Container tooling robustness & misc tests/docs
tinyagentos/containers/__init__.py, many test and docs files
container_exists() handles OSError gracefully; many tests assert canonical error shapes, scope/idempotency semantics, OpenAPI docs, and added docs/recipes/audit-results.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

  • jaylfc/tinyagentos#274: Modifies auth middleware and request authentication flow; overlaps with middleware changes.

Poem

🐰 A token store, opaque and bright,
I burrow hashes through the night—
Middleware hops, migrations run,
Tokens issued, revocations done.
Tests nibble bugs till none remain.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/agent-friendliness-pass-1

Per-endpoint checklist tracker for AgentsApp. Cells start as _pending_;
remediation PRs (Tasks 10-15) update them to ✅ or ❌-with-reason as
each row's underlying remediation lands.
@jaylfc
Copy link
Copy Markdown
Owner Author

jaylfc commented May 12, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
tests/migrations/test_agent_tokens_migration.py (1)

22-96: ⚡ Quick win

Add a failure-path test for set_env errors.

Current cases don’t cover injection failure. Add one case where set_env returns {"success": False, ...} and assert the migration does not increment issued (and asserts rollback behavior once implemented).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/migrations/test_agent_tokens_migration.py` around lines 22 - 96, Add a
new async test in tests/migrations/test_agent_tokens_migration.py that calls
run_agent_token_migration with an LXC-like backend whose set_env is an AsyncMock
returning {"success": False, "error": "reason"}; assert the returned summary has
"issued": 0 and appropriate skipped counts, and assert agent_tokens_store.issue
was not called (or not incremented) while container_backend.set_env was awaited
once for that agent; reference run_agent_token_migration,
container_backend.set_env, and agent_tokens_store.issue in the assertions so the
failure-path for set_env is covered (and leave a TODO/assertion placeholder for
rollback behavior once implemented).
tinyagentos/app.py (1)

378-393: ⚡ Quick win

Consider catching broader exceptions when detecting container backend.

The inner try-except at lines 381-384 only catches RuntimeError, but get_backend() might raise other exceptions (e.g., ImportError for missing modules, OSError for command execution failures). Consider catching Exception to ensure the migration runs with container_backend=None in all failure cases, matching the outer block's resilience pattern.

♻️ Proposed fix
         try:
             from tinyagentos.containers.backend import get_backend
             from tinyagentos.migrations.agent_tokens_migration import run_agent_token_migration
             try:
                 _container_backend = get_backend()
-            except RuntimeError:
+            except Exception:
                 _container_backend = None
             await run_agent_token_migration(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/app.py` around lines 378 - 393, The inner try/except around
get_backend() only catches RuntimeError but other exceptions (ImportError,
OSError, etc.) can occur; change the except clause in the block that calls
get_backend() to catch Exception so any failure sets _container_backend = None
and allows run_agent_token_migration(agents=config.agents,
agent_tokens_store=agent_tokens_store, container_backend=_container_backend) to
proceed; locate the try that imports tinyagentos.containers.backend and calls
get_backend() and update its except RuntimeError to except Exception.
tinyagentos/routes/agents.py (1)

470-472: ⚡ Quick win

Consider making token revocation best-effort on agent deletion.

Token revocation at lines 470-472 is not wrapped in try-except, so a store error would propagate and potentially fail an otherwise-successful archive operation. Other cleanup steps in _archive_agent_fully (LiteLLM key revocation, chat export, etc.) are wrapped in try-except to ensure they don't block the archive. Consider the same pattern here for consistency and resilience.

♻️ Proposed fix
     # Cascade: revoke the agent's token so it can't authenticate post-delete.
     store = request.app.state.agent_tokens_store
-    await store.revoke_for_agent(name)
+    try:
+        await store.revoke_for_agent(name)
+    except Exception:
+        logger.warning("token revocation failed during agent delete for %s", name, exc_info=True)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 470 - 472, Wrap the token
revocation call in _archive_agent_fully in a best-effort try/except: when
calling await request.app.state.agent_tokens_store.revoke_for_agent(name) catch
any exceptions, log the error (including the agent name and exception) and
continue without re-raising so token store failures won't block the archive
flow; keep it async and ensure the log uses the existing request/app logger or
the module logger for consistency with other cleanup steps.
tinyagentos/auth_middleware.py (1)

76-79: 💤 Low value

Consider logging failed last_used updates for observability.

The best-effort pattern here is appropriate—authentication should succeed even if the timestamp update fails. However, silently swallowing exceptions makes debugging issues with the token store difficult. Consider logging at WARNING level so operators can detect persistent failures.

♻️ Proposed enhancement
             try:
                 await store.touch_last_used(plaintext)
-            except Exception:
-                pass
+            except Exception:
+                logger.warning("agent token: failed to update last_used timestamp", exc_info=True)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/auth_middleware.py` around lines 76 - 79, Replace the silent
except block that swallows errors from store.touch_last_used with logging at
WARNING level: catch Exception as e and call the module logger (or
logging.getLogger(__name__)) to emit a warning that includes context (e.g.,
token plaintext or identifier) and the exception details (exc_info=True) so
persistent token-store failures are observable; keep the best-effort behavior
(do not re-raise).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/stores/test_agent_tokens_store.py`:
- Around line 43-44: The test unpacks values from store.issue into variables
that aren't used (row_a and plaintext_b), triggering a linter RUF059; change the
unpacking to ignore unused parts using a throwaway variable (e.g., `_`) so only
the needed values are bound — for the first call keep plaintext_a and discard
row_a, and for the second call keep row_b and discard plaintext_b (referencing
store.issue, plaintext_a, row_a, plaintext_b, row_b).

In `@tinyagentos/migrations/agent_tokens_migration.py`:
- Around line 54-60: The code currently increments issued and logs success
immediately after issuing a token (agent_tokens_store.issue) without verifying
container_backend.set_env succeeded, which risks leaving an active DB token if
set_env fails. Change the flow to call container_backend.set_env and await it
inside a try/except; only on success increment issued and call logger.info; on
failure catch the exception, revoke or remove the issued token via the token
store API (e.g., agent_tokens_store.revoke or delete using the returned
token/metadata from agent_tokens_store.issue), log the failure, and do not
increment issued so subsequent runs can retry.

In `@tinyagentos/stores/agent_tokens_store.py`:
- Around line 138-147: The touch_last_used method currently opens the DB and
executes an UPDATE which can raise and propagate errors; make it truly
best-effort by wrapping the DB access (the aiosqlite.connect, execute, commit
sequence in touch_last_used) in a try/except that catches exceptions (broad
Exception) and swallows them (optionally logging via self._logger or similar) so
transient SQLite failures do not break auth flows; keep using _hash(plaintext)
and _now_iso() and still attempt the UPDATE on agent_tokens but do not re-raise
on failure.

---

Nitpick comments:
In `@tests/migrations/test_agent_tokens_migration.py`:
- Around line 22-96: Add a new async test in
tests/migrations/test_agent_tokens_migration.py that calls
run_agent_token_migration with an LXC-like backend whose set_env is an AsyncMock
returning {"success": False, "error": "reason"}; assert the returned summary has
"issued": 0 and appropriate skipped counts, and assert agent_tokens_store.issue
was not called (or not incremented) while container_backend.set_env was awaited
once for that agent; reference run_agent_token_migration,
container_backend.set_env, and agent_tokens_store.issue in the assertions so the
failure-path for set_env is covered (and leave a TODO/assertion placeholder for
rollback behavior once implemented).

In `@tinyagentos/app.py`:
- Around line 378-393: The inner try/except around get_backend() only catches
RuntimeError but other exceptions (ImportError, OSError, etc.) can occur; change
the except clause in the block that calls get_backend() to catch Exception so
any failure sets _container_backend = None and allows
run_agent_token_migration(agents=config.agents,
agent_tokens_store=agent_tokens_store, container_backend=_container_backend) to
proceed; locate the try that imports tinyagentos.containers.backend and calls
get_backend() and update its except RuntimeError to except Exception.

In `@tinyagentos/auth_middleware.py`:
- Around line 76-79: Replace the silent except block that swallows errors from
store.touch_last_used with logging at WARNING level: catch Exception as e and
call the module logger (or logging.getLogger(__name__)) to emit a warning that
includes context (e.g., token plaintext or identifier) and the exception details
(exc_info=True) so persistent token-store failures are observable; keep the
best-effort behavior (do not re-raise).

In `@tinyagentos/routes/agents.py`:
- Around line 470-472: Wrap the token revocation call in _archive_agent_fully in
a best-effort try/except: when calling await
request.app.state.agent_tokens_store.revoke_for_agent(name) catch any
exceptions, log the error (including the agent name and exception) and continue
without re-raising so token store failures won't block the archive flow; keep it
async and ensure the log uses the existing request/app logger or the module
logger for consistency with other cleanup steps.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: fbe47d94-71f5-40af-809d-ac64edf1a485

📥 Commits

Reviewing files that changed from the base of the PR and between 3c2d2a5 and 7497eb8.

📒 Files selected for processing (17)
  • tests/conftest.py
  • tests/migrations/__init__.py
  • tests/migrations/test_agent_tokens_migration.py
  • tests/routes/test_agent_token_endpoints.py
  • tests/stores/__init__.py
  • tests/stores/test_agent_tokens_store.py
  • tests/test_app_state.py
  • tests/test_auth_middleware_bearer.py
  • tests/test_errors.py
  • tinyagentos/app.py
  • tinyagentos/auth_middleware.py
  • tinyagentos/containers/__init__.py
  • tinyagentos/errors.py
  • tinyagentos/migrations/agent_tokens_migration.py
  • tinyagentos/routes/agents.py
  • tinyagentos/stores/__init__.py
  • tinyagentos/stores/agent_tokens_store.py

Comment thread tests/stores/test_agent_tokens_store.py Outdated
Comment thread tinyagentos/migrations/agent_tokens_migration.py
Comment thread tinyagentos/stores/agent_tokens_store.py
jaylfc added 8 commits May 12, 2026 14:37
Walks routes/agents.py and replaces ~27 ad-hoc JSONResponse error returns
with the canonical error_response helper. Adds an app-level
RequestValidationError handler so 422s from pydantic also follow the
same shape (keeps the standard HTTP code; doesn't remap to 400).

Existing test_routes_agents.py assertions updated to match new shape:
slug in 'error', prose moved to 'detail'.

Agents hitting a 4xx now get a 'fix' (concrete next step) and a
'doc_url' pointing at the relevant recipe — the agentic equivalent of
'see man <command>'.
Applies the 3 actionable findings + 4 nitpicks from the CodeRabbit
review of #450:

Major fixes:
  - migrations/agent_tokens_migration.py: when set_env fails after a
    successful issue(), revoke the issuance so subsequent runs retry
    instead of skipping the agent (which would leave a DB token with
    no TAOS_TOKEN env var, unauthenticatable forever).
  - stores/agent_tokens_store.py: touch_last_used is documented as
    'best-effort — failures are non-fatal' but the DB errors actually
    propagated. Wrapped in try/except so transient SQLite issues can't
    break auth flows. Also added 'AND revoked_at IS NULL' so we never
    touch a revoked token's last_used_at.

Nitpicks:
  - app.py: broaden the get_backend() failure catch from RuntimeError
    to Exception so any container detection failure routes the migration
    through its 'no LXC' no-op path.
  - routes/agents.py: wrap the cascade revoke on agent delete in
    try/except so a token-store error can't undo a successful archive
    (matches the LiteLLM-key-revoke pattern in _archive_agent_fully).
  - auth_middleware.py: log (warn) the swallowed touch_last_used
    exception for observability instead of silently passing.
  - tests/stores/test_agent_tokens_store.py: prefix unused unpacked
    vars with _ (RUF059).

New test:
  - tests/migrations/test_agent_tokens_migration.py: covers the
    set_env-failure path — asserts revoke_for_agent is called and
    issued stays at 0.
Every routes/agents.py endpoint now has a summary + multi-line docstring
(becomes the OpenAPI description AND will be reused as the source for
taosctl --help in a later task). 4xx responses are annotated with the
canonical ErrorResponse schema so generated clients know the shape.
POST /api/agents gets a request body example using the documentation
TEST-NET-1 range.

Skipped: response_model= for success — agent dicts remain loose-typed
in Pass 1, locking the schema is deferred.
Wraps POST /api/agents and POST /api/agents/deploy with an
Idempotency-Key cache (24h TTL, in-memory, per-app-instance). Same key
+ endpoint + user_id returns the cached response — agent retries after
a network blip don't create duplicates.

The cache lives on app.state.idempotency_cache so test apps get their
own; production gets one cache per worker. Persistence across restarts
is a Pass 2 concern (in-memory is fine for typical retry windows).
Task 10's sweep updated test_routes_agents.py but missed this one. The
endpoint lives in routes/agents.py and now returns the canonical
{error, detail, fix, doc_url} shape, so the slug 'agent_not_found'
lands in 'error' instead of the prose 'not found'.
Hand-maintained list of top-level route prefixes with titles and
doc_urls. Lets an agent discover the surface in one round-trip;
per-endpoint OpenAPI spec at /openapi.json carries the detail.

Pass 2+ entries land here as each app's audit completes.
Glob-style scope matching ("*", "agents.*", "agents.list"). Decorator
returns the standardised 403 with a 'fix' (reissue with wider scope) and
'doc_url' (permissions concept) when the bearer's scope doesn't cover
the action. Session-cookie callers bypass scope checks (no regression
on the desktop — scope is a delegated-access mechanism for tokens).

Action slug taxonomy: agents.list / agents.read / agents.logs /
agents.create / agents.update / agents.delete / agents.lifecycle /
agents.deploy / agents.token.issue / agents.token.revoke.
Production endpoint, not a demo. Agents with ui.notify (or *) scope can
send notifications to the user's notification panel. The calling agent
name lands in the source field (agent:<name>) for audit.

Pass 1 uses the existing single-user NotificationStore — the agent-facing
contract is the spec, the storage layer grows into multi-user in Pass 2
without breaking the API. notification_id is a client-side tracking
token, not a DB row identifier yet.
@jaylfc
Copy link
Copy Markdown
Owner Author

jaylfc commented May 12, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

jaylfc added 2 commits May 12, 2026 15:32
README documents the layout, conventions, and the ingestion-into-knowledge
mechanism. getting-started.md walks an agent from token → first API call
→ discovery → error handling. Examples use the real agent dict shape
(name/host/qmd_index/color) and the canonical error envelope landed in
Phase 2.
Explains the user-mirroring ceiling + tool-set scoping model. Documents
the scope syntax (wildcard, namespace glob, exact match), default-permissive
behaviour, the canonical scope_denied error, and the session-cookie vs
bearer-token distinction. Action slug taxonomy table matches what landed
in Task 14.
jaylfc added 2 commits May 12, 2026 15:33
Task-oriented walkthrough of every AgentsApp verb: list, get, create,
issue/revoke token, lifecycle (start/stop/pause/restart), logs, update,
delete. API + CLI side-by-side per the recipe template. Notes auto-suffix
behaviour on POST /api/agents (no 409 from the user-facing route) and
the cascade token-revoke that fires on DELETE.
Recipe for the ui.notify primitive — API + CLI side-by-side, priority/
app_origin/action_url documented, common errors with fix references.
Verification points at GET /api/notifications and looks for the agent:<name>
source attribution. Notes that per-user routing lands in Pass 2.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

♻️ Duplicate comments (1)
tinyagentos/migrations/agent_tokens_migration.py (1)

54-69: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Rollback issuance when set_env raises, not only when it returns success: false.

If set_env throws, the token remains active and this run aborts early, so retries may skip the agent due to has_token.

💡 Suggested fix
         plaintext, _ = await agent_tokens_store.issue(
             agent_id=name, user_id=user_id, scope=scope
         )
         container = f"taos-agent-{name}"
-        env_result = await container_backend.set_env(container, "TAOS_TOKEN", plaintext)
+        try:
+            env_result = await container_backend.set_env(container, "TAOS_TOKEN", plaintext)
+        except Exception as exc:
+            await agent_tokens_store.revoke_for_agent(name)
+            logger.warning(
+                "agent_token_migration: set_env raised for %s, revoked issuance: %s",
+                name,
+                exc,
+            )
+            continue
         if not env_result.get("success", False):
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/migrations/agent_tokens_migration.py` around lines 54 - 69, The
code only revokes the issued token when container_backend.set_env returns
success=false but not when set_env raises an exception; wrap the await
container_backend.set_env(...) call in a try/except around the call in the same
scope where plaintext was issued (after agent_tokens_store.issue) and on any
Exception call await agent_tokens_store.revoke_for_agent(name), log the
exception with logger.warning/error including the exception details and
container/name/context, then continue so the migration can retry; keep the
existing revoke path for env_result.get("success", False) intact and reference
agent_tokens_store.issue, container_backend.set_env, and
agent_tokens_store.revoke_for_agent when making the change.
🧹 Nitpick comments (5)
tinyagentos/routes/agents.py (5)

1875-1887: ⚡ Quick win

Same canonical-shape drift here.

model_not_reachable reuses the inline JSONResponse pattern instead of error_response(...). Aligning this with the helper keeps the agents-app 4xx/5xx responses uniform per the PR's "Standardized error shape" goal.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 1875 - 1887, The inline
JSONResponse returned when location.kind == "not_found" should be replaced with
the standardized error_response helper to maintain canonical error shape; locate
the block checking location.kind == "not_found" and call error_response(...)
passing error="model_not_reachable", detail (the same f"model '{model_id}' is
not reachable..." string), fix, doc_url, and model=model_id and set
status_code=409 so the response fields and status match the original but use the
shared error_response function.

337-352: ⚡ Quick win

Duplicate inline 404 payloads — reuse _agent_not_found.

Both token endpoints inline their own 404 error_response(...) instead of calling the new _agent_not_found(name) helper that the rest of this module uses (lines 176, 271, 309, 1171, 1197, 1256, 1706, 1738, 1783, 1806, 1838). The bodies differ only in the fix string (one says "taosctl agents list", the other "List agents"), which is itself an inconsistency worth flattening.

♻️ Proposed fix
@@ async def issue_agent_token(request: Request, name: str):
     config = request.app.state.config
     agent = find_agent(config, name)
     if not agent:
-        return error_response(
-            status_code=404,
-            error="agent_not_found",
-            detail=f"No agent named {name!r} exists.",
-            fix="List agents with `taosctl agents list` to see valid names.",
-            doc_url="/docs/agents/recipes/managing-agents#list-agents",
-        )
+        return _agent_not_found(name)
@@ async def revoke_agent_token(request: Request, name: str):
     config = request.app.state.config
     agent = find_agent(config, name)
     if not agent:
-        return error_response(
-            status_code=404,
-            error="agent_not_found",
-            detail=f"No agent named {name!r} exists.",
-            fix="List agents to see valid names.",
-            doc_url="/docs/agents/recipes/managing-agents#list-agents",
-        )
+        return _agent_not_found(name)

Also applies to: 370-382

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 337 - 352, Replace the duplicated
inline 404 payload in the token issuance endpoints with the shared helper by
calling _agent_not_found(name) instead of the inline error_response; locate the
token endpoints around where you call store.issue (e.g., the block that
retrieves config, agent = find_agent(config, name), user_id/scope, and then
calls await store.issue) and replace the inline return error_response(...) with
return _agent_not_found(name); apply the same change to the other token endpoint
that follows the same pattern so both endpoints use the unified helper and
eliminate the inconsistent "fix" wording.

824-843: ⚡ Quick win

Inconsistent error response shape for model_not_on_worker (409).

This branch hand-rolls a JSONResponse instead of using error_response(...) like every other 4xx in this module. The canonical keys are still present, but bypassing the helper means it ships as a plain JSONResponse rather than the typed ErrorResponse model declared in responses={...}, and OpenAPI/tests that key off the canonical shape (tests/routes/test_agents_error_shape.py per the AI summary) won't see this path validated. The extra fields (model, pinned_worker, available_on) can be passed through to error_response via additional kwargs or merged after the call.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 824 - 843, The branch that returns
JSONResponse for the model_not_on_worker case bypasses the module's
error_response helper and therefore emits an inconsistent shape; replace the
JSONResponse(...) block with a call to error_response(...) (same message,
doc_url, fix, detail) and pass the extra fields (model=body.model,
pinned_worker=body.target_worker, available_on=location.hosts) as kwargs and
ensure the status_code is 409 so the response matches the ErrorResponse model
used elsewhere; update the code around the location.kind == "worker" check to
call error_response(...) instead of constructing JSONResponse directly.

177-181: 💤 Low value

{**agent, **meta} lets store metadata overwrite agent fields on collision.

If get_metadata ever returns a key that also exists on the agent dict (e.g. name, user_id, scope), the store wins silently. Right now the store presumably returns has_token, issued_at, last_used_at — none of which collide — but this is a footgun the next time someone adds a field to either side. Prefer either a fixed projection ({"has_token": ..., "issued_at": ..., "last_used_at": ...}) or {**meta, **agent, "has_token": True, ...} so the agent record stays authoritative.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 177 - 181, The current merge
return ({**agent, **meta}) in the agents route lets token-store metadata
overwrite agent fields on key collisions; change the merge so the agent record
remains authoritative and token fields are explicit. Update the code that reads
from request.app.state.agent_tokens_store and get_metadata(name) to either
return a fixed projection like {"has_token": True, "issued_at":
meta.get("issued_at"), "last_used_at": meta.get("last_used_at")} merged with
agent, or merge as {**meta, **agent, "has_token": True, "issued_at":
meta.get(...), "last_used_at": meta.get(...)} to ensure agent keys (e.g., name,
user_id, scope) cannot be silently overwritten.

350-351: ⚡ Quick win

Reuse the issued_at from the second return value instead of calling _now_iso() again.

The store.issue() method returns tuple[str, dict[str, Any]] where the metadata dict includes issued_at set at the time of issuance. Discarding this with _ and then synthesizing a new timestamp via _now_iso() creates a timestamp divergence: the client receives one value in the POST response, but get_metadata() surfaces the database-recorded issued_at from the store's metadata. These timestamps will differ by at least microseconds, and any future changes to the store (e.g., truncating to seconds precision) will make this discrepancy more visible.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 350 - 351, Unpack and use the
issuance metadata returned by store.issue instead of synthesizing a new
timestamp: change the call that currently does `plaintext, _ = await
store.issue(agent_id=name, user_id=user_id, scope=scope)` to capture the
metadata (e.g., `plaintext, metadata = ...`) and return `{"token": plaintext,
"issued_at": metadata["issued_at"]}` so the response uses the exact issued_at
from store.issue rather than calling _now_iso().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/routes/test_api_index.py`:
- Around line 24-27: Add an assertion that the HTTP response succeeded before
parsing the body: in the test where `resp = await client.get("/api")` and `body
= resp.json()`, insert `assert resp.status_code == 200` immediately after the
GET so the test fails with an explicit status code when the request fails
(referencing `resp`, `client.get("/api")`, and the subsequent `body =
resp.json()`).

In `@tests/routes/test_ui_notify.py`:
- Around line 70-80: The test test_post_ui_notify_priority_default_normal
currently only checks resp.status_code; update it to also fetch the stored
notification from the app.state.agent_tokens_store (same store variable used in
the test and patterned after test_post_ui_notify_writes_to_store) after the POST
and assert the stored entry's priority/level field equals "normal" to verify the
defaulting behavior; use the store's listing/read method (the same access
pattern used in test_post_ui_notify_writes_to_store) to locate the notification
for agent_id="def-agent" and assert its priority/level is "normal".

In `@tinyagentos/idempotency.py`:
- Around line 37-39: The cache in async def set(...) is unbounded; update set to
prune stale entries and/or enforce a max size while holding self._lock: iterate
self._entries and remove items older than the TTL (e.g., 24*3600) based on the
stored timestamp, and if needed enforce a hard cap (e.g., MAX_ENTRIES) by
evicting oldest entries (use timestamps to determine LRU) before inserting the
new (key, endpoint, user_id) -> (value, time.time()) entry; reference the set
method, self._entries, and self._lock when implementing the eviction logic.

In `@tinyagentos/routes/agents.py`:
- Around line 202-211: The idempotency flow races because cache.get() only
returns stored responses and releases its internal lock before the create/write
path (config.agents.append + save_config_locked) runs, allowing concurrent
requests with the same idempotency_key to both proceed; modify the logic to
reserve the key before performing the write by using the IdempotencyCache to
mark an "in-flight" reservation or acquire a per-key lock/lease tied to
idempotency_key, so subsequent requests either wait on that reservation or
return the pending reservation result; concretely, add a reserve/in_progress API
on IdempotencyCache (or a per-key lock map) and call it before
config.agents.append/save_config_locked in the POST /api/agents flow (and
similarly in the /api/agents/deploy handlers), ensure reserve is released and
cache.set() is stored with the final response or error on completion so
concurrent retries are serialized or deduplicated.

In `@tinyagentos/routes/notifications.py`:
- Around line 71-74: The UiNotifyRequest fields (title, body) are rendered into
HTML in list_notifications without escaping, enabling stored XSS; update the
render-time code that interpolates UiNotifyRequest.title and
UiNotifyRequest.body (and any other places in the 114-150 range that render
notifications) to HTML-escape/encode those values before insertion, e.g., call a
safe-escape utility or framework-provided HTML-escape function right where
list_notifications builds the HTML output so raw tokens cannot inject markup or
scripts.
- Around line 80-87: UiNotifyRequest's action_url is accepted but never
persisted — update the ui_notify handler to pass the action_url through to the
NotificationStore and notification model so it isn't dropped; locate ui_notify
and the code that constructs/persists notifications (referencing
UiNotifyRequest, action_url, NotificationStore, and whatever notification
model/class is used) and include the action_url field when creating the
notification record and when calling NotificationStore.save/create; mirror the
same change where similar construction occurs (the second occurrence noted
around the other block ~145-150) so both code paths persist action_url
consistently.

In `@tinyagentos/scope.py`:
- Around line 50-53: The string fragments assigned to the fix variable in
scope.py are prefixed with f-strings but contain no placeholders, causing Ruff
F541 errors; update the fix assignment (the fix= (...) expression) by removing
the leading f from each quoted fragment so they are plain regular strings (e.g.,
change f"Reissue the token..." to "Reissue the token..." and likewise for the
other two fragments) ensuring the concatenated message content is unchanged.

In `@tinyagentos/stores/agent_tokens_store.py`:
- Around line 53-80: The issue: relying on asyncio.Lock in issue(...) allows
race conditions across processes; fix by enforcing DB-level uniqueness and using
an immediate transaction. Modify the issue function to (1) ensure the schema has
a unique partial index like UNIQUE INDEX on agent_id WHERE revoked_at IS NULL
(apply once during DB initialization for the agent_tokens table) and (2) open a
transaction with "BEGIN IMMEDIATE" on the aiosqlite connection before running
the UPDATE/INSERT so the connection acquires an exclusive write lock for the
transaction; keep the UPDATE to set revoked_at and the INSERT for the new token
inside that same immediate transaction and handle unique constraint errors from
the INSERT to surface a clear failure instead of creating duplicate active
tokens.

---

Duplicate comments:
In `@tinyagentos/migrations/agent_tokens_migration.py`:
- Around line 54-69: The code only revokes the issued token when
container_backend.set_env returns success=false but not when set_env raises an
exception; wrap the await container_backend.set_env(...) call in a try/except
around the call in the same scope where plaintext was issued (after
agent_tokens_store.issue) and on any Exception call await
agent_tokens_store.revoke_for_agent(name), log the exception with
logger.warning/error including the exception details and container/name/context,
then continue so the migration can retry; keep the existing revoke path for
env_result.get("success", False) intact and reference agent_tokens_store.issue,
container_backend.set_env, and agent_tokens_store.revoke_for_agent when making
the change.

---

Nitpick comments:
In `@tinyagentos/routes/agents.py`:
- Around line 1875-1887: The inline JSONResponse returned when location.kind ==
"not_found" should be replaced with the standardized error_response helper to
maintain canonical error shape; locate the block checking location.kind ==
"not_found" and call error_response(...) passing error="model_not_reachable",
detail (the same f"model '{model_id}' is not reachable..." string), fix,
doc_url, and model=model_id and set status_code=409 so the response fields and
status match the original but use the shared error_response function.
- Around line 337-352: Replace the duplicated inline 404 payload in the token
issuance endpoints with the shared helper by calling _agent_not_found(name)
instead of the inline error_response; locate the token endpoints around where
you call store.issue (e.g., the block that retrieves config, agent =
find_agent(config, name), user_id/scope, and then calls await store.issue) and
replace the inline return error_response(...) with return
_agent_not_found(name); apply the same change to the other token endpoint that
follows the same pattern so both endpoints use the unified helper and eliminate
the inconsistent "fix" wording.
- Around line 824-843: The branch that returns JSONResponse for the
model_not_on_worker case bypasses the module's error_response helper and
therefore emits an inconsistent shape; replace the JSONResponse(...) block with
a call to error_response(...) (same message, doc_url, fix, detail) and pass the
extra fields (model=body.model, pinned_worker=body.target_worker,
available_on=location.hosts) as kwargs and ensure the status_code is 409 so the
response matches the ErrorResponse model used elsewhere; update the code around
the location.kind == "worker" check to call error_response(...) instead of
constructing JSONResponse directly.
- Around line 177-181: The current merge return ({**agent, **meta}) in the
agents route lets token-store metadata overwrite agent fields on key collisions;
change the merge so the agent record remains authoritative and token fields are
explicit. Update the code that reads from request.app.state.agent_tokens_store
and get_metadata(name) to either return a fixed projection like {"has_token":
True, "issued_at": meta.get("issued_at"), "last_used_at":
meta.get("last_used_at")} merged with agent, or merge as {**meta, **agent,
"has_token": True, "issued_at": meta.get(...), "last_used_at": meta.get(...)} to
ensure agent keys (e.g., name, user_id, scope) cannot be silently overwritten.
- Around line 350-351: Unpack and use the issuance metadata returned by
store.issue instead of synthesizing a new timestamp: change the call that
currently does `plaintext, _ = await store.issue(agent_id=name, user_id=user_id,
scope=scope)` to capture the metadata (e.g., `plaintext, metadata = ...`) and
return `{"token": plaintext, "issued_at": metadata["issued_at"]}` so the
response uses the exact issued_at from store.issue rather than calling
_now_iso().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 08d40c68-aca1-4b73-9d73-802e3bb305e1

📥 Commits

Reviewing files that changed from the base of the PR and between 7497eb8 and 866955c.

📒 Files selected for processing (22)
  • docs/agents/audit-results.md
  • tests/migrations/test_agent_tokens_migration.py
  • tests/routes/test_agents_error_shape.py
  • tests/routes/test_agents_openapi.py
  • tests/routes/test_agents_scope.py
  • tests/routes/test_api_index.py
  • tests/routes/test_ui_notify.py
  • tests/stores/test_agent_tokens_store.py
  • tests/test_banner_dismiss.py
  • tests/test_idempotency.py
  • tests/test_routes_agents.py
  • tests/test_scope.py
  • tinyagentos/app.py
  • tinyagentos/auth_middleware.py
  • tinyagentos/errors.py
  • tinyagentos/idempotency.py
  • tinyagentos/migrations/agent_tokens_migration.py
  • tinyagentos/routes/agents.py
  • tinyagentos/routes/api_index.py
  • tinyagentos/routes/notifications.py
  • tinyagentos/scope.py
  • tinyagentos/stores/agent_tokens_store.py
✅ Files skipped from review due to trivial changes (2)
  • docs/agents/audit-results.md
  • tests/test_idempotency.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/stores/test_agent_tokens_store.py
  • tests/migrations/test_agent_tokens_migration.py

Comment thread tests/routes/test_api_index.py
Comment thread tests/routes/test_ui_notify.py
Comment thread tinyagentos/idempotency.py Outdated
Comment on lines +202 to +211
idempotency_key = request.headers.get("idempotency-key")
user_id = getattr(request.state, "user_id", "default")
cache = request.app.state.idempotency_cache
if idempotency_key:
cached = await cache.get(
key=idempotency_key, endpoint="POST /api/agents", user_id=user_id
)
if cached is not None:
return cached

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Locate the idempotency cache implementation and inspect its public surface.
fd -t f idempotency
rg -nP -C3 '\bclass\s+IdempotencyCache\b|\bidempotency_cache\b'
rg -nP -C2 '\b(reserve|claim|in_flight|inflight|begin)\s*\(' --type=py

Repository: jaylfc/tinyagentos

Length of output: 3751


🏁 Script executed:

cat -n tinyagentos/idempotency.py

Repository: jaylfc/tinyagentos

Length of output: 1819


🏁 Script executed:

cat -n tinyagentos/routes/agents.py | sed -n '200,250p'

Repository: jaylfc/tinyagentos

Length of output: 2346


🏁 Script executed:

cat -n tinyagentos/routes/agents.py | sed -n '718,740p'

Repository: jaylfc/tinyagentos

Length of output: 1160


🏁 Script executed:

cat -n tinyagentos/routes/agents.py | sed -n '1048,1060p'

Repository: jaylfc/tinyagentos

Length of output: 605


The idempotency cache cannot prevent duplicate agents during concurrent retries within the same request window.

The cache.get() at lines 206–210 releases its lock immediately. The actual write (config.agents.append + save_config_locked) happens without any coordination, and cache.set() doesn't run until after line 243. A client retrying the same Idempotency-Key while the first request is mid-execution will find the cache empty, proceed with its own write, and create a second agent. The -2 slug suffix loop masks this as a silent duplicate rather than an error.

The IdempotencyCache class (in tinyagentos/idempotency.py) provides no in-flight reservation semantics—it is purely a response cache with no mechanism to mark a key as "reserved" or "in progress." The lock protects only the internal _entries dict, not the request flow.

The same pattern affects /api/agents/deploy (lines 724–728 and 1053–1055), where the issue is more severe because the endpoint spawns background tasks.

A fix requires either reserving the key before the write (e.g., with an in-flight marker) or serializing requests per idempotency_key with a dedicated lock.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 202 - 211, The idempotency flow
races because cache.get() only returns stored responses and releases its
internal lock before the create/write path (config.agents.append +
save_config_locked) runs, allowing concurrent requests with the same
idempotency_key to both proceed; modify the logic to reserve the key before
performing the write by using the IdempotencyCache to mark an "in-flight"
reservation or acquire a per-key lock/lease tied to idempotency_key, so
subsequent requests either wait on that reservation or return the pending
reservation result; concretely, add a reserve/in_progress API on
IdempotencyCache (or a per-key lock map) and call it before
config.agents.append/save_config_locked in the POST /api/agents flow (and
similarly in the /api/agents/deploy handlers), ensure reserve is released and
cache.set() is stored with the final response or error on completion so
concurrent retries are serialized or deduplicated.

Comment thread tinyagentos/routes/notifications.py
Comment thread tinyagentos/routes/notifications.py Outdated
Comment thread tinyagentos/scope.py Outdated
Comment on lines +53 to +80
await db.execute("CREATE INDEX IF NOT EXISTS idx_agent_id ON agent_tokens(agent_id)")
await db.commit()

async def close(self) -> None:
pass

async def issue(
self,
*,
agent_id: str,
user_id: str,
scope: list[str],
) -> tuple[str, dict[str, Any]]:
"""Issue a new token for an agent. Atomically revokes any prior active token."""
async with self._lock:
plaintext = _generate_plaintext()
token_hash = _hash(plaintext)
now = _now_iso()
async with aiosqlite.connect(self._db_path) as db:
await db.execute(
"UPDATE agent_tokens SET revoked_at = ? WHERE agent_id = ? AND revoked_at IS NULL",
(now, agent_id),
)
await db.execute(
"INSERT INTO agent_tokens (token_hash, agent_id, user_id, scope_json, issued_at) VALUES (?, ?, ?, ?, ?)",
(token_hash, agent_id, user_id, json.dumps(scope), now),
)
await db.commit()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '1,240p' tinyagentos/stores/agent_tokens_store.py
echo
echo "---- Indexes / transaction markers ----"
rg -n "CREATE (UNIQUE )?INDEX|BEGIN IMMEDIATE|revoked_at IS NULL|INSERT INTO agent_tokens" tinyagentos/stores/agent_tokens_store.py

Repository: jaylfc/tinyagentos

Length of output: 7237


🏁 Script executed:

rg -n "autocommit|isolation_level|journal_mode|PRAGMA|aiosqlite.connect" tinyagentos/stores/agent_tokens_store.py

Repository: jaylfc/tinyagentos

Length of output: 512


🏁 Script executed:

# Check if there are tests that might show concurrent usage patterns
find . -type f -name "*.py" | xargs rg -l "agent_tokens_store|AgentTokensStore" | grep -i test

Repository: jaylfc/tinyagentos

Length of output: 308


🏁 Script executed:

# Check SQLite default isolation behavior documentation context
rg -B5 -A5 "aiosqlite.connect" tinyagentos/stores/agent_tokens_store.py | head -40

Repository: jaylfc/tinyagentos

Length of output: 1791


🏁 Script executed:

cat tests/stores/test_agent_tokens_store.py

Repository: jaylfc/tinyagentos

Length of output: 3502


🏁 Script executed:

# Check if there are any multi-process or concurrent test scenarios
rg -n "asyncio\.|concurrent|threading|Process|Worker" tests/stores/test_agent_tokens_store.py

Repository: jaylfc/tinyagentos

Length of output: 88


🏁 Script executed:

# Check deployment/app context to see if multi-worker is used
rg -n "uvicorn|gunicorn|worker|process" . --type py | head -20

Repository: jaylfc/tinyagentos

Length of output: 1858


🏁 Script executed:

# Check SQLite PRAGMA settings if any are set
rg -n "PRAGMA" . --type py

Repository: jaylfc/tinyagentos

Length of output: 1054


🏁 Script executed:

# Look for any documentation or README about deployment
find . -type f \( -name "README*" -o -name "*.md" \) | xargs cat 2>/dev/null | grep -i "worker\|process\|deploy\|uvicorn\|gunicorn" | head -10

Repository: jaylfc/tinyagentos

Length of output: 3233


🏁 Script executed:

# Check aiosqlite configuration and transaction handling
rg -n "autocommit|isolation" . --type py | head -10

Repository: jaylfc/tinyagentos

Length of output: 1177


🏁 Script executed:

# Check if there are any integration tests that might use multiple processes or simulated concurrent access
find tests -name "*.py" -exec grep -l "asyncio.gather\|create_task\|gather" {} \;

Repository: jaylfc/tinyagentos

Length of output: 728


🌐 Web query:

aiosqlite transaction isolation level default autocommit SQLite

💡 Result:

In aiosqlite, which proxies the standard sqlite3.Connection, the default transaction isolation level is 'DEFERRED' (equivalent to an empty string ""), as passed via kwargs to sqlite3.connect [1][2]. This controls the type of transaction started implicitly on the first write statement under legacy transaction control (the current default autocommit mode) [2][3]. Autocommit mode defaults to sqlite3.LEGACY_TRANSACTION_CONTROL, meaning transactions are not fully automatic; explicit await db.commit is typically required after writes, matching sqlite3 behavior [2][4][5]. SQLite itself provides serializable isolation across connections [6]. To enable true autocommit (no implicit transactions), pass isolation_level=None or (in Python 3.12+) autocommit=True to aiosqlite.connect [2][7].

Citations:


Enforce "one active token per agent" with database-level constraints, not just in-process lock.

The asyncio.Lock protects only a single process. In multi-worker deployments, concurrent requests across workers can bypass the in-process lock and produce multiple revoked_at IS NULL rows for the same agent_id.

Example scenario:

  • Worker A acquires its asyncio.Lock, issues new token (UPDATE old to revoked, INSERT new)
  • Worker B simultaneously does the same for same agent_id
  • Both transactions complete → 2 active tokens for one agent, violating the stated invariant

Fix requires two changes:

  1. Add a unique partial index to prevent multiple active tokens at the database level:
Schema change
 await db.execute("CREATE INDEX IF NOT EXISTS idx_agent_id ON agent_tokens(agent_id)")
+await db.execute(
+    "CREATE UNIQUE INDEX IF NOT EXISTS idx_one_active_token_per_agent "
+    "ON agent_tokens(agent_id) WHERE revoked_at IS NULL"
+)
 await db.commit()
  1. Use BEGIN IMMEDIATE to ensure exclusive transaction lock from the start:
Transaction isolation
 async with self._lock:
     plaintext = _generate_plaintext()
     token_hash = _hash(plaintext)
     now = _now_iso()
     async with aiosqlite.connect(self._db_path) as db:
+        await db.execute("BEGIN IMMEDIATE")
         await db.execute(
             "UPDATE agent_tokens SET revoked_at = ? WHERE agent_id = ? AND revoked_at IS NULL",
             (now, agent_id),
         )
         await db.execute(
             "INSERT INTO agent_tokens (token_hash, agent_id, user_id, scope_json, issued_at) VALUES (?, ?, ?, ?, ?)",
             (token_hash, agent_id, user_id, json.dumps(scope), now),
         )
         await db.commit()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
await db.execute("CREATE INDEX IF NOT EXISTS idx_agent_id ON agent_tokens(agent_id)")
await db.commit()
async def close(self) -> None:
pass
async def issue(
self,
*,
agent_id: str,
user_id: str,
scope: list[str],
) -> tuple[str, dict[str, Any]]:
"""Issue a new token for an agent. Atomically revokes any prior active token."""
async with self._lock:
plaintext = _generate_plaintext()
token_hash = _hash(plaintext)
now = _now_iso()
async with aiosqlite.connect(self._db_path) as db:
await db.execute(
"UPDATE agent_tokens SET revoked_at = ? WHERE agent_id = ? AND revoked_at IS NULL",
(now, agent_id),
)
await db.execute(
"INSERT INTO agent_tokens (token_hash, agent_id, user_id, scope_json, issued_at) VALUES (?, ?, ?, ?, ?)",
(token_hash, agent_id, user_id, json.dumps(scope), now),
)
await db.commit()
await db.execute("CREATE INDEX IF NOT EXISTS idx_agent_id ON agent_tokens(agent_id)")
await db.execute(
"CREATE UNIQUE INDEX IF NOT EXISTS idx_one_active_token_per_agent "
"ON agent_tokens(agent_id) WHERE revoked_at IS NULL"
)
await db.commit()
async def close(self) -> None:
pass
async def issue(
self,
*,
agent_id: str,
user_id: str,
scope: list[str],
) -> tuple[str, dict[str, Any]]:
"""Issue a new token for an agent. Atomically revokes any prior active token."""
async with self._lock:
plaintext = _generate_plaintext()
token_hash = _hash(plaintext)
now = _now_iso()
async with aiosqlite.connect(self._db_path) as db:
await db.execute("BEGIN IMMEDIATE")
await db.execute(
"UPDATE agent_tokens SET revoked_at = ? WHERE agent_id = ? AND revoked_at IS NULL",
(now, agent_id),
)
await db.execute(
"INSERT INTO agent_tokens (token_hash, agent_id, user_id, scope_json, issued_at) VALUES (?, ?, ?, ?, ?)",
(token_hash, agent_id, user_id, json.dumps(scope), now),
)
await db.commit()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/stores/agent_tokens_store.py` around lines 53 - 80, The issue:
relying on asyncio.Lock in issue(...) allows race conditions across processes;
fix by enforcing DB-level uniqueness and using an immediate transaction. Modify
the issue function to (1) ensure the schema has a unique partial index like
UNIQUE INDEX on agent_id WHERE revoked_at IS NULL (apply once during DB
initialization for the agent_tokens table) and (2) open a transaction with
"BEGIN IMMEDIATE" on the aiosqlite connection before running the UPDATE/INSERT
so the connection acquires an exclusive write lock for the transaction; keep the
UPDATE to set revoked_at and the INSERT for the new token inside that same
immediate transaction and handle unique constraint errors from the INSERT to
surface a clear failure instead of creating duplicate active tokens.

jaylfc added 3 commits May 12, 2026 15:39
Adds ingest_agent_docs() — walks docs/agents/*.md and (re-)ingests each
as a KnowledgeItem with source_type='agent-docs' and categories=['agent-docs'].
Title is the first H1 line if present, else the filename. Idempotent:
each call wipes prior agent-docs items then re-adds, so doc edits land
cleanly on the next boot.

Wired into the lifespan so taos_agent can query knowledge_store for
recipes / concepts / getting-started content instead of reading docs
files from disk.
ingest_agent_docs now takes a source_id kwarg (default 'taos-core'). Each
ingested item is tagged with that id, and the re-ingest wipe only touches
items whose source_id matches — so a future per-app guides ingester can
call the same function with source_id='app:<slug>' and its guides won't
collide with the canonical taOS docs tree.

Forward-prep for the app store: community-submitted apps ship their own
agent guides, which all land in category=agent-docs alongside the core
docs but stay independently re-ingestible. Per-app ingester wiring lands
in Pass 2 when the app builder system goes live.
Critical:
  - notifications.py list_notifications: HTML-escape title and message in
    the HTMX renderer. ui.notify lets token-supplied input land in
    those fields; without escaping, an agent could inject script tags
    into the desktop notification panel.

Major:
  - idempotency.py: bounded LRU cache (max 1024 entries). Previous
    implementation could grow memory unboundedly with high-cardinality
    keys over a 24h window. Uses OrderedDict + move_to_end on get/set,
    popitem(last=False) on eviction.
  - agent_tokens_store.py: add a unique partial index over agent_id WHERE
    revoked_at IS NULL. The in-process asyncio.Lock only coordinates
    within a single worker; this constraint catches cross-worker races.
    (BEGIN IMMEDIATE on the issue txn is a Pass 2 follow-up — tracked.)
  - notifications.py UiNotifyRequest: drop action_url from the schema.
    The existing single-user NotificationStore has no column for it, so
    accepting the field would silently discard caller intent. action_url
    returns alongside the multi-user store migration in Pass 2.

Minor:
  - scope.py: drop unneeded f-prefix on placeholder-less strings (ruff
    F541 / lint CI).
  - test_api_index.py: assert status_code == 200 in the ui.notify-row test
    so failures surface explicitly.
  - test_ui_notify.py: assert the stored notification's level field
    reflects the default 'normal' priority; previous test only checked
    status code so a regression that swallowed the field would pass.

Docs:
  - recipes/notifying-the-user.md: move action_url to a Pass 2 note since
    it is no longer in the Pass 1 request schema.
jaylfc added 10 commits May 12, 2026 15:56
build_system_prompt(knowledge_store) replaces the module-level
SYSTEM_PROMPT global. The helper agent's system prompt is now assembled
on each request from the agent-docs category in knowledge_store, which
gives it access to community-app guides (source_id='app:<slug>') as
they install, not just the canonical taOS docs tree.

Falls back to docs/taos-agent-manual.md when the store has no agent-docs
items (fresh install before first ingest, or ingest failure) so the
assistant keeps working through the transition. The legacy fallback
retires when the ingest path is universal.
Click-based CLI at taosctl/ as a sibling to tinyagentos/. Root command
exposes three subgroups (agents, ui, auth) — full subcommands land in
Tasks 23-25. Config helpers resolve TAOS_URL and TAOS_TOKEN from env
or ~/.config/taos/credentials.

Installable as a console script via pyproject.toml — agents inside
LXC containers and human operators get the same binary on PATH.
… endpoint

CLI:
  - taosctl auth login --token … writes {token: …} to
    ~/.config/taos/credentials with mode 0600.
  - taosctl auth status hits /api/auth/whoami; exit 2 + stderr message
    when the token is missing or rejected.
  - taosctl auth whoami prints the user_id / agent_id / scope as JSON.

Server:
  - GET /api/auth/whoami returns the calling principal's identity. Reads
    request.state.{user_id,agent_id,token_scope} when the AuthMiddleware
    bearer branch populated them; falls back to session-cookie lookup
    otherwise. The middleware itself returns 401 for unauthenticated
    callers — the endpoint body only handles the authenticated cases.
Subcommands map 1:1 to /api/agents: list, get, create, update, start,
stop, pause, restart, logs, delete (with --yes), and token issue/revoke
under a 'token' subgroup. Human-readable output by default; --json for
piping. create auto-generates an Idempotency-Key so retries are safe.
token issue warns the operator the plaintext is shown once.

Adds http_client.put() to round out the verbs.
Wraps POST /api/ui/notify. --priority validated client-side via
click.Choice; --app-origin optional. Output is the JSON response
(notification_id + delivered).

--action-url is intentionally not exposed in Pass 1 — the server-side
UiNotifyRequest dropped that field after the CodeRabbit review (it
would have been accepted-but-discarded against the single-user
NotificationStore). It returns alongside the multi-user store
migration in Pass 2.
Extracts fenced bash/http blocks from docs/agents/recipes/*.md for
C-tier conformance testing. bash-skip and http-skip language tags are
explicitly excluded so destructive or placeholder examples don't run.
Line numbers are 1-indexed at the opening fence so failure messages
deep-link into the recipe.
Parametrises over every http block in docs/agents/recipes/*.md and
replays each against the test app's client fixture. Doc rot in a
recipe (e.g. a renamed endpoint, a schema change) now surfaces as a
named test failure pointing at the recipe file + line number.

Bash blocks in the recipes are now bash-skip — the ASGI test
transport can't satisfy curl, so the bash blocks are documentation
for human readers and the matching http block right below each
bash-skip is what the runner replays. Both stay because the curl
form is what an operator types interactively.
The CI 'pip install -e .' installs the tinyagentos package only. taosctl
is a sibling top-level dir with its own pyproject.toml (separately
installable, but not part of the parent install). Local pytest auto-
discovers it via the implicit rootdir behaviour; CI doesn't, so
test_taosctl_* fails with ModuleNotFoundError at collection time.

Adding pythonpath = ['.'] to [tool.pytest.ini_options] makes the
worktree root importable during test collection without changing the
install layout. taosctl can still ship as its own console-script
package via taosctl/pyproject.toml.
Covers the canonical contract every agent relies on: happy path, scope
denial, canonical 4xx error shape, Idempotency-Key replay, the
issue/revoke token loop, and (for ui.notify) the source='agent:<name>'
audit trail. Every PR runs these.

ui_notify multi-user-isolation is NOT covered — Pass 1 uses the
single-user NotificationStore by design; per-user routing lands in
Pass 2 along with the store migration.
scripts/conformance-smoke.py walks a small set of canonical agent
tasks (create-agent / send-notification / issue-and-use-token), gives
each to an agent runtime in plain English, then verifies the
documented side effect via the REST API. Outcomes are succeeded /
wrong_path / gave_up; JSON summary written to --report-json (- for
stdout).

run_agent_task is intentionally a NotImplementedError hook — wiring
to an LLM provider + agent harness is operator-supplied. Per the
test-agent policy, kilo-auto/free is the expected throwaway provider;
any agent runtime that honours TAOS_URL + TAOS_TOKEN env + can read
docs/agents/ would work.

.github/workflows/conformance-nightly.yml ships as workflow_dispatch
only — no cron until the runtime + secrets land. Per-PR conformance
(B-tier + C-tier) already runs in standard CI; this workflow is the
'real-world agent can actually use this' check.
All _pending_ cells resolved — every AgentsApp row is now ✅ or
❌-with-reason-tagged-to-Pass-2. Honest about which 'List shape' rows
fail (no pagination cursor yet) and which restart-style endpoints
aren't Idempotency-Key gated. All deferred items cross-reference
the Pass 2+ tracking epic #453.
@jaylfc
Copy link
Copy Markdown
Owner Author

jaylfc commented May 14, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
taosctl/cmd_auth.py (1)

26-36: ⚡ Quick win

Consider atomic file creation with restrictive permissions.

There's a brief window between write_text() (line 30) and chmod() (line 32) where the token file has default permissions and could be world-readable. Using os.open() with mode flags creates the file atomically with 0o600.

🔒 Proposed atomic creation
 def login_cmd(token: str) -> None:
     """Save the bearer token for future taosctl invocations."""
     path = config.CREDENTIALS_PATH
     path.parent.mkdir(parents=True, exist_ok=True)
-    path.write_text(json.dumps({"token": token}), encoding="utf-8")
+    
+    # Atomic creation with restrictive permissions
+    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
     try:
-        os.chmod(path, 0o600)
+        os.write(fd, json.dumps({"token": token}).encode("utf-8"))
+    finally:
+        os.close(fd)
+    
+    # chmod no longer needed but keep for explicit documentation
+    try:
+        os.chmod(path, 0o600)
     except OSError:
-        # Windows / unusual filesystems — env-var fallback recommended there.
         pass
     click.echo(f"Token saved to {path}.")

Or more simply, remove the chmod entirely since os.open already set the mode:

 def login_cmd(token: str) -> None:
     """Save the bearer token for future taosctl invocations."""
     path = config.CREDENTIALS_PATH
     path.parent.mkdir(parents=True, exist_ok=True)
-    path.write_text(json.dumps({"token": token}), encoding="utf-8")
-    try:
-        os.chmod(path, 0o600)
-    except OSError:
-        # Windows / unusual filesystems — env-var fallback recommended there.
-        pass
+    
+    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
+    try:
+        os.write(fd, json.dumps({"token": token}).encode("utf-8"))
+    finally:
+        os.close(fd)
+    
     click.echo(f"Token saved to {path}.")

Note: On Windows, file modes are ignored, but the pattern is still safe.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@taosctl/cmd_auth.py` around lines 26 - 36, Replace the non-atomic write+chmod
in login_cmd with an atomic file create/write using os.open and explicit mode
0o600: after ensuring path.parent exists, open the credentials file with
os.open(path, os.O_WRONLY|os.O_CREAT|os.O_TRUNC, 0o600), write the JSON bytes
and close the fd (or use os.fdopen for a file object) so the file is created
with restrictive permissions atomically; remove the subsequent os.chmod call and
keep a small fallback catch for platforms that ignore chmod/mode (e.g., Windows)
to avoid raising on failure. Ensure you still use config.CREDENTIALS_PATH and
produce the same click.echo message.
tests/cli/test_taosctl_agents.py (1)

62-77: ⚡ Quick win

Assert result.exit_code in these invocation tests.

These tests verify API calls but don’t assert successful command completion, so post-call failures can be missed.

Proposed fix
 def test_agents_stop_posts():
     with patch("taosctl.http_client.post", return_value={"ok": True}) as p:
-        CliRunner().invoke(cli, ["agents", "stop", "x"])
+        result = CliRunner().invoke(cli, ["agents", "stop", "x"])
+    assert result.exit_code == 0, result.output
     p.assert_called_once_with("/api/agents/x/stop")

 def test_agents_restart_posts():
     with patch("taosctl.http_client.post", return_value={"ok": True}) as p:
-        CliRunner().invoke(cli, ["agents", "restart", "x"])
+        result = CliRunner().invoke(cli, ["agents", "restart", "x"])
+    assert result.exit_code == 0, result.output
     p.assert_called_once_with("/api/agents/x/restart")

 def test_agents_pause_posts():
     with patch("taosctl.http_client.post", return_value={"ok": True}) as p:
-        CliRunner().invoke(cli, ["agents", "pause", "x"])
+        result = CliRunner().invoke(cli, ["agents", "pause", "x"])
+    assert result.exit_code == 0, result.output
     p.assert_called_once_with("/api/agents/x/pause")

 def test_agents_logs_passes_lines_param():
     with patch("taosctl.http_client.get", return_value={"lines": []}) as p:
-        CliRunner().invoke(cli, ["agents", "logs", "x", "--lines", "50"])
+        result = CliRunner().invoke(cli, ["agents", "logs", "x", "--lines", "50"])
+    assert result.exit_code == 0, result.output
     p.assert_called_once_with("/api/agents/x/logs", params={"lines": 50})

Also applies to: 88-91

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/cli/test_taosctl_agents.py` around lines 62 - 77, Add an assertion that
the CLI invocation returned a successful exit code in each test that currently
only checks the HTTP post call: capture the result from CliRunner().invoke(...)
in test_agents_stop_posts, test_agents_restart_posts, test_agents_pause_posts
(and the analogous tests around lines 88-91) and assert result.exit_code == 0
after the post call assertion; update the tests that patch
taosctl.http_client.post to store the invoke return value (e.g., result) and add
the exit code assertion to ensure the command completed successfully.
tests/cli/test_taosctl_ui.py (1)

25-40: ⚡ Quick win

Add exit-code assertions for these CLI invocations.

Both tests assert request payloads but not command success, which can mask failures after the mocked call.

Proposed fix
 def test_ui_notify_with_priority_high():
     with patch("taosctl.http_client.post", return_value={"delivered": True}) as p:
-        CliRunner().invoke(
+        result = CliRunner().invoke(
             cli, ["ui", "notify", "--title", "T", "--body", "B", "--priority", "high"]
         )
+    assert result.exit_code == 0, result.output
     assert p.call_args.kwargs["json"]["priority"] == "high"

 def test_ui_notify_with_app_origin():
     with patch("taosctl.http_client.post", return_value={"delivered": True}) as p:
-        CliRunner().invoke(
+        result = CliRunner().invoke(
             cli,
             ["ui", "notify", "--title", "T", "--body", "B", "--app-origin", "my-agent"],
         )
+    assert result.exit_code == 0, result.output
     assert p.call_args.kwargs["json"]["app_origin"] == "my-agent"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/cli/test_taosctl_ui.py` around lines 25 - 40, Both tests
(test_ui_notify_with_priority_high and test_ui_notify_with_app_origin) currently
call CliRunner().invoke(...) but don't assert the command exit status; capture
the invoke result into a variable (e.g., result = CliRunner().invoke(cli, ...))
and add an assertion that result.exit_code == 0 after the invoke to ensure the
CLI command succeeded in addition to verifying the HTTP payload via
p.call_args.kwargs["json"].
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/agents/concepts/permissions.md`:
- Line 39: The table cell for `agents.lifecycle` contains unescaped pipe
characters in the string "POST /api/agents/{name}/start|stop|pause|restart"
which breaks Markdown tables; update the `agents.lifecycle` cell to either
escape each pipe as "\|" or rewrite the endpoint list without inline pipes
(e.g., use commas or separate endpoints) so the table renders correctly and the
text still references the same routes.

In `@taosctl/cmd_ui.py`:
- Line 41: The current echo always prints pretty JSON because both branches call
json.dumps(result, indent=2); make the --json flag actually change output by
printing compact JSON when json_out is True and pretty-printed JSON when False:
update the click.echo call that references json_out and result so it uses
json.dumps(result) for the json_out branch and json.dumps(result, indent=2) for
the non-json_out branch (locate the call where click.echo(json.dumps(...)) is
used).

In `@taosctl/config.py`:
- Around line 26-27: The code currently returns data.get("token") directly from
the parsed JSON; update the logic in the function that reads the credential file
(where data = json.loads(...) and token = data.get("token")) to validate the
token's type: if isinstance(token, str) return token, otherwise return None (or
explicitly cast only when appropriate) so the function's str | None contract is
preserved; reference the data variable and the token retrieval expression
(data.get("token")) when making the change.

In `@tests/conformance/_recipe_parser.py`:
- Around line 44-45: Replace the loose startswith("```") check with an exact
closing-fence regex to avoid truncating code blocks: add a module-level compiled
pattern named _FENCE_CLOSE = re.compile(r"^```\s*$") and update the loop in the
recipe parsing logic (the loop that currently tests lines[i].startswith("```"))
to use _FENCE_CLOSE.match(lines[i]) instead; this ensures only a line that is
exactly a triple-backtick fence (optionally followed by whitespace) ends the
code block.

In `@tests/conformance/test_recipes.py`:
- Around line 46-50: The code currently assumes the HTTP block has at least one
non-empty line and directly accesses request_line = lines[0], causing IndexError
for empty/whitespace-only blocks; before splitting/using lines, strip and check
if code is empty or if lines is empty or contains only blank lines (e.g., after
lines = code.strip().splitlines() verify lines and/or code.strip()), and raise
the existing ValueError with the same message (or a clear parse error) instead
of letting an IndexError propagate; update the logic around variables
request_line, lines, and code to perform this guard and raise
ValueError(f"unparseable http request line: {request_line!r}") when appropriate.

---

Nitpick comments:
In `@taosctl/cmd_auth.py`:
- Around line 26-36: Replace the non-atomic write+chmod in login_cmd with an
atomic file create/write using os.open and explicit mode 0o600: after ensuring
path.parent exists, open the credentials file with os.open(path,
os.O_WRONLY|os.O_CREAT|os.O_TRUNC, 0o600), write the JSON bytes and close the fd
(or use os.fdopen for a file object) so the file is created with restrictive
permissions atomically; remove the subsequent os.chmod call and keep a small
fallback catch for platforms that ignore chmod/mode (e.g., Windows) to avoid
raising on failure. Ensure you still use config.CREDENTIALS_PATH and produce the
same click.echo message.

In `@tests/cli/test_taosctl_agents.py`:
- Around line 62-77: Add an assertion that the CLI invocation returned a
successful exit code in each test that currently only checks the HTTP post call:
capture the result from CliRunner().invoke(...) in test_agents_stop_posts,
test_agents_restart_posts, test_agents_pause_posts (and the analogous tests
around lines 88-91) and assert result.exit_code == 0 after the post call
assertion; update the tests that patch taosctl.http_client.post to store the
invoke return value (e.g., result) and add the exit code assertion to ensure the
command completed successfully.

In `@tests/cli/test_taosctl_ui.py`:
- Around line 25-40: Both tests (test_ui_notify_with_priority_high and
test_ui_notify_with_app_origin) currently call CliRunner().invoke(...) but don't
assert the command exit status; capture the invoke result into a variable (e.g.,
result = CliRunner().invoke(cli, ...)) and add an assertion that
result.exit_code == 0 after the invoke to ensure the CLI command succeeded in
addition to verifying the HTTP payload via p.call_args.kwargs["json"].
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6fd90744-f03b-4f94-bdfb-1080046d8b28

📥 Commits

Reviewing files that changed from the base of the PR and between 866955c and ce5e025.

📒 Files selected for processing (41)
  • .github/workflows/conformance-nightly.yml
  • docs/agents/README.md
  • docs/agents/audit-results.md
  • docs/agents/concepts/permissions.md
  • docs/agents/getting-started.md
  • docs/agents/recipes/managing-agents.md
  • docs/agents/recipes/notifying-the-user.md
  • pyproject.toml
  • scripts/conformance-smoke.py
  • taosctl/__init__.py
  • taosctl/__main__.py
  • taosctl/cli.py
  • taosctl/cmd_agents.py
  • taosctl/cmd_auth.py
  • taosctl/cmd_ui.py
  • taosctl/config.py
  • taosctl/http_client.py
  • taosctl/pyproject.toml
  • tests/cli/test_taosctl_agents.py
  • tests/cli/test_taosctl_auth.py
  • tests/cli/test_taosctl_scaffold.py
  • tests/cli/test_taosctl_ui.py
  • tests/conformance/__init__.py
  • tests/conformance/_recipe_parser.py
  • tests/conformance/test_agents.py
  • tests/conformance/test_recipe_parser.py
  • tests/conformance/test_recipes.py
  • tests/conformance/test_ui_notify.py
  • tests/routes/test_api_index.py
  • tests/routes/test_ui_notify.py
  • tests/test_auth_whoami.py
  • tests/test_knowledge_monitor_agent_docs.py
  • tests/test_taos_agent_system_prompt.py
  • tinyagentos/app.py
  • tinyagentos/idempotency.py
  • tinyagentos/knowledge_monitor.py
  • tinyagentos/routes/auth.py
  • tinyagentos/routes/notifications.py
  • tinyagentos/routes/taos_agent.py
  • tinyagentos/scope.py
  • tinyagentos/stores/agent_tokens_store.py
✅ Files skipped from review due to trivial changes (6)
  • pyproject.toml
  • taosctl/init.py
  • docs/agents/recipes/notifying-the-user.md
  • docs/agents/audit-results.md
  • docs/agents/README.md
  • docs/agents/recipes/managing-agents.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/routes/test_api_index.py
  • tests/routes/test_ui_notify.py

Comment thread docs/agents/concepts/permissions.md Outdated
Comment thread taosctl/cmd_ui.py Outdated
Comment thread taosctl/config.py Outdated
Comment thread tests/conformance/_recipe_parser.py Outdated
Comment thread tests/conformance/test_recipes.py
All Minor, mostly defensive hardening:

Actionable:
  - docs/agents/concepts/permissions.md: escape pipes in markdown table
    cell so 'start|stop|pause|restart' renders correctly.
  - taosctl/cmd_ui.py: --json flag now actually changes output. Previous
    code printed the formatted JSON in both branches. Default now emits
    a short human-readable 'delivered: ntf_xxx' line; --json keeps the
    raw indented JSON for piping.
  - taosctl/config.py: validate token type before returning. Corrupt
    credentials files (non-dict JSON, non-string 'token') now return
    None instead of leaking a non-string into the auth header.
  - tests/conformance/_recipe_parser.py: exact closing-fence check
    (rstrip() == '```') so 4+ backticks or backtick-prefixed prose
    don't terminate a block early.
  - tests/conformance/test_recipes.py: guard empty http block before
    indexing into lines[0] — raises a clean ValueError instead of
    IndexError.

Nitpicks:
  - taosctl/cmd_auth.py login_cmd: atomic os.open(...0o600) create
    instead of write_text+chmod, closing the TOCTOU window where the
    credentials file could be world-readable.
  - tests/cli/test_taosctl_agents.py: assert exit_code == 0 in
    stop/restart/pause/logs-with-param invocation tests so post-call
    failures don't get masked.
  - tests/cli/test_taosctl_ui.py: same exit_code asserts on the
    priority + app_origin invocation tests.
@jaylfc
Copy link
Copy Markdown
Owner Author

jaylfc commented May 14, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

1 participant