test(#462): container shell page — 23 tests covering auth, terminal, escaping, incus flow#501
test(#462): container shell page — 23 tests covering auth, terminal, escaping, incus flow#501hognek wants to merge 6 commits into
Conversation
Add model archive promotion engine: when a new worker registers with hardware that can run previously-archived models, automatically promote them from ~/taos/archive/models/ to the active models tree. - model_archive.py: archive scan, hardware compatibility check, model promotion (move + manifest cleanup), notifications - ClusterManager.register_worker: auto-trigger promotion on worker join - GET /api/cluster/promote-archived: manual promotion scan endpoint - 29 unit tests covering compatibility checks, archive I/O, promotable filtering, and file movement
…ken store (jaylfc#452) - Add IdempotencyCache with try_reserve()/set() pattern using asyncio.Event sentinels to close the get-then-set race in add_agent and deploy_agent_endpoint - Add AgentTokensStore with BEGIN IMMEDIATE for multi-worker-safe token issuance and clean AgentTokenExistsError on unique constraint violations - Register idempotency_cache on app.state in both app factory paths - Resolve pre-existing merge conflict in adapter_manager.py
Add activity feed SSE endpoint with ring buffer, publish endpoint, history endpoint, and HTML timeline UI with Pico CSS + htmx. Files: - tinyagentos/routes/activity.py: SSE stream, ring buffer, publish/history endpoints, HTML page - tinyagentos/routes/agent_debugger.py: stub for pre-existing import (unblocks create_app) - tinyagentos/templates/activity_feed.html: standalone HTML template - tests/test_activity.py: 20 unit + integration tests 20/22 tests pass (2 SSE streaming excluded due to aiosqlite thread pool exhaustion in test infra) ✓ POST /api/activity/events — accepts JSON events ✓ GET /api/activity/stream — SSE endpoint with replay + filtering ✓ GET /api/activity/history — JSON history with filters ✓ GET /api/activity — HTML page with Pico CSS, ARIA labels, filter bar
Add DiscordConnector in channel_hub/adapters/discord.py following the existing adapter pattern. Uses httpx for Discord HTTP API polling with after-id pagination, per-channel rate limiting (5/5s), and graceful error handling. Emits channel-hub messages with source=discord, channel_id, guild_id, and author metadata.
… agent containers
Adds a standalone container shell page at /api/container-shell/{agent_id}
that provides a browser-based terminal for agent containers using incus exec
under the hood. Uses Pico CSS for styling and htmx for AJAX command execution
— no JavaScript bundling required.
- GET /api/container-shell/{agent_id}: HTML page with command input and
output region, ARIA-annotated for screen readers
- POST /api/container-shell/{agent_id}/exec: executes commands via
incus exec taos-agent-{id} -- bash -lc <cmd>, returns HTML fragment
- Container naming follows existing taos-agent-{id} pattern
- Escapes HTML output and strips ANSI escape sequences for safe display
- Max command length and execution timeout enforced server-side
This is separate from the WebSocket PTY bridge used by the React desktop
SPA and serves as a fallback/resilience path for direct browser access.
- Add admin_auth_headers fixture to all 23 test methods - Pass headers=admin_auth_headers to all get/post calls - Fix escaping test: use & (valid URL char) instead of <script> (rejected by router) - All tests previously got 401 due to missing auth middleware headers
📝 WalkthroughWalkthroughThis PR introduces agent token persistence, model archive promotion for activating legacy models, a real-time activity event feed replacing hardware statistics, an interactive container shell terminal, request idempotency caching for agent operations, and a Discord message polling adapter for the channel hub, plus comprehensive test coverage across all systems. ChangesAgent Token Store
Model Archive Promotion System
Activity Feed Event System
Container Shell Terminal
Request Idempotency Caching
App Wiring and Channel Adapters
🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 14
🧹 Nitpick comments (3)
tinyagentos/routes/agents.py (1)
65-66: 🏗️ Heavy liftBound the cache lifetime.
_entriesnever expires or evicts anything, so every unique idempotency key is retained for the life of the process and can replay stale results indefinitely. Add a TTL/LRU cleanup path, or delete completed entries after a bounded retention window.Also applies to: 86-102
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tinyagentos/routes/agents.py` around lines 65 - 66, _entries currently keeps every idempotency entry forever; change its value shape and lifecycle so entries expire: in __init__ replace dict[str, tuple[asyncio.Event, dict | None]] with something that includes a timestamp or expiry (e.g., tuple[asyncio.Event, dict | None, float]) and implement an eviction path — either schedule asyncio.create_task(self._evict_after(key, ttl)) when creating/completing an entry or run a single background cleaner task that scans _entries and removes items older than TTL; also ensure the code paths that read/serve entries (the methods around lines 86-102) check expiry and delete expired entries before returning so stale results cannot be replayed.tests/routes/test_container_shell.py (1)
236-305: ⚡ Quick winAdd a timeout regression test next to the other exec error cases.
tinyagentos/routes/container_shell.pyhas a dedicated timeout path (asyncio.TimeoutError→124→cmd-info), but this suite never exercises it. A small mocked timeout case would lock down the other special branch alongsideFileNotFoundError.Example test shape
def test_exec_handles_timeout(self, test_client, admin_auth_headers): mock_proc = AsyncMock() mock_proc.communicate = AsyncMock(side_effect=asyncio.TimeoutError()) mock_proc.wait = AsyncMock(return_value=None) with patch( "tinyagentos.routes.container_shell.asyncio.create_subprocess_exec", return_value=mock_proc, ): resp = test_client.post( "/api/container-shell/test-agent/exec", data={"command": "sleep 999"}, headers=admin_auth_headers, ) assert resp.status_code == 200 assert "timed out" in resp.text assert "cmd-info" in resp.text🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/routes/test_container_shell.py` around lines 236 - 305, Add a new test named test_exec_handles_timeout in tests/routes/test_container_shell.py that patches tinyagentos.routes.container_shell.asyncio.create_subprocess_exec to return an AsyncMock (mock_proc) whose communicate raises asyncio.TimeoutError (side_effect) and whose wait is an AsyncMock returning None; call the /api/container-shell/<agent>/exec endpoint with a long-running command (e.g. "sleep 999") and assert resp.status_code == 200 and that resp.text contains both "timed out" and "cmd-info" to exercise the asyncio.TimeoutError branch in container_shell.tests/test_activity.py (1)
113-121: ⚡ Quick winAdd a regression case for valid JSON with the wrong top-level shape.
This suite covers malformed JSON, but not
[]/"x"/123, which currently exercise a different validation path in the handler. A small case here would lock down the 400 behavior once the endpoint validates object bodies.Suggested test
`@pytest.mark.asyncio` async def test_post_event_rejects_non_json(self, client: AsyncClient): resp = await client.post( "/api/activity/events", content=b"not json", headers={"Content-Type": "text/plain"}, ) assert resp.status_code == 400 + + `@pytest.mark.asyncio` + async def test_post_event_rejects_non_object_json(self, client: AsyncClient): + resp = await client.post("/api/activity/events", json=[]) + assert resp.status_code == 400🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_activity.py` around lines 113 - 121, Add a regression test alongside test_post_event_rejects_non_json that posts valid JSON bodies with incorrect top-level shapes (e.g., JSON array "[]", string "\"x\"", and number "123") to the "/api/activity/events" endpoint and assert the handler returns 400; locate the existing test_post_event_rejects_non_json in tests/test_activity.py and add a new async test (or extend it) that sends these content values with Content-Type "application/json" to ensure the endpoint rejects non-object JSON bodies.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/routes/test_container_shell.py`:
- Around line 236-268: The test test_exec_handles_nonzero_exit_code is
exercising the special rc==127 branch instead of the generic non-zero path;
update the mocked process in this test (the AsyncMock mock_proc.returncode used
when patching tinyagentos.routes.container_shell.asyncio.create_subprocess_exec)
to use a non-127 value (e.g. 1) so it exercises container_shell_exec()’s generic
failure handling and still returns the command output.
In `@tinyagentos/agent_tokens_store.py`:
- Around line 80-99: The code in issue() reads the "current active" token after
commit which can race with other workers; instead return the row that was just
inserted from inside the same transaction (before calling await
self._db.commit()). Modify the flow in issue() to fetch and return the inserted
row using a non-racy method — either use an INSERT ... RETURNING clause or
select by the connection's last insert id (e.g., last_insert_rowid()) while
still inside the transaction on self._db — and only then commit; keep raising
AgentTokenExistsError on IntegrityError and use _row_to_dict to convert the
fetched row for the return value.
In `@tinyagentos/channel_hub/adapters/discord.py`:
- Around line 118-169: _check_channel currently fetches one page and advances
_last_message_ids[channel_id] to messages[0]["id"], which can skip messages;
change it so that on initial cold-start you bootstrap the cursor (if
_last_message_ids has no entry for channel_id, set it to the newest message id
and return without calling _handle_message), and on normal polls paginate until
caught up by repeatedly requesting messages with
params["after"]=current_last_id, processing each page in chronological order via
_handle_message and updating _last_message_ids[channel_id] to the newest id seen
after each page; ensure you loop until a page returns fewer than limit (or
empty) to avoid gaps.
In `@tinyagentos/cluster/model_archive.py`:
- Around line 135-139: The current check uses req_arch =
requirements.get("arch") and then only rejects when worker_arch exists and
differs; instead, when requirements["arch"] is set you must also reject workers
that omit cpu.arch. Change the condition that uses hw_cpu.get("arch", "") /
worker_arch so that if req_arch is truthy and (worker_arch is falsy OR
worker_arch != req_arch) you return False; update the block around req_arch,
hw_cpu, and worker_arch accordingly.
- Around line 119-125: The gpu_accel branch currently checks only "cuda",
"rocm", and "vulkan" so a manifest with req_gpu_accel == "mlx" is treated as
allowed; update the logic in the same block that reads req_gpu_accel and hw_gpu
to explicitly check for "mlx" (e.g., if req_gpu_accel == "mlx" and not
hw_gpu.get("mlx"): return False) so Apple-only models are rejected on workers
without mlx support; keep the existing checks for "cuda", "rocm", and "vulkan"
and consider adding a fallback to return False for any unknown req_gpu_accel
values to avoid silently allowing unsupported accelerators.
- Around line 244-251: The notification call notifications.emit_event inside the
model promotion loop can raise and abort the loop, leaving later models still
archived; wrap the emit_event call in a try/except that catches Exception, logs
the failure (including model_id and worker_name and the exception) using the
module/logger used elsewhere (so you don't re-raise), and continue the loop so
remaining promotions complete; ensure this change is made where emit_event is
invoked in the promotion logic in model_archive.py so register_worker sees no
exception from notifications.
In `@tinyagentos/routes/activity.py`:
- Around line 440-455: applyFilters currently only updates local filter
variables and calls renderAll but does not update the active EventSource, so
server-side filtering (worker/model/type) stays stale; modify applyFilters (and
the clear button handler) to close the existing EventSource instance (e.g.,
activityEventSource or whichever EventSource variable is created in init/start
stream code) and re-create it with the new query params built from
filterTypeEl.value, filterWorkerEl.value and filterModelEl.value (you can call
the existing SSE initialization function like initEventSource/startEventStream
or factor out a reconnectSSE helper) so the new connection URL includes the
current filters and the UI receives filtered events immediately, then call
renderAll().
- Around line 194-205: The handler that reads the request body currently assumes
`body` is a dict and calls `body.get(...)`, which will raise if `await
request.json()` returns a non-object (list/string/number); update the validation
in the same block after `body = await request.json()` to check that `body` is a
mapping (e.g., isinstance(body, dict) or collections.abc.Mapping) and return a
JSONResponse with a 400 and an error like "invalid JSON object" if it is not;
then proceed to read `event_type = body.get("type", "")` and validate against
`VALID_EVENT_TYPES` as before.
- Around line 96-106: get_buffer currently returns
request.app.state.activity_buffer without updating the module-level singleton
_buffer, causing publish_event (which reads _buffer) to miss events; modify
get_buffer so when buf = getattr(request.app.state, "activity_buffer", None) is
not None you also set _buffer = buf (i.e., synchronize the module-level _buffer
with request.app.state.activity_buffer), keeping the existing lazy-init path
that creates ActivityBuffer and assigns both request.app.state.activity_buffer
and _buffer.
In `@tinyagentos/routes/agents.py`:
- Around line 165-168: The idempotency cache is using the raw header value
(idempotency_key) causing collisions across routes; before calling
idempotency_cache.try_reserve(idempotency_key) in agents.py wrap/namespace the
key with the operation fingerprint (for example combine request.method and
request.url.path or a route name) such as namespaced_key =
f"{request.method}:{request.url.path}:{idempotency_key}" (or hash that string)
and pass namespaced_key to try_reserve; apply the same change for the other
occurrence that uses try_reserve so keys are scoped per endpoint.
- Around line 164-171: The idempotency reservation created via
idempotency_cache.try_reserve (which returns mode, event) currently only gets
finalized on the trailing success path, leaving waiters stuck in await
event.wait(); fix by capturing the full HTTP response envelope (status_code and
response body) for every possible exit and ensuring
idempotency_cache.set(idempotency_key, envelope) is called exactly once on all
return paths. Concretely, wrap the handler logic around the reservation in a
try/finally (or use a single common exit path) so that any validation error, 202
routing, JSONResponse, or raised exception constructs the same envelope (e.g.,
{"status_code": ..., "body": ...}) and calls
idempotency_cache.set(idempotency_key, envelope) in the finally block; reference
the existing symbols idempotency_cache.try_reserve, event.wait, and
idempotency_cache.set to locate and change the code. Ensure same pattern is
applied to the other occurrences noted (the other blocks using try_reserve at
lines referenced).
In `@tinyagentos/routes/cluster.py`:
- Around line 661-712: The route handler promote_archived_models is a mutating
endpoint (it moves model dirs and emits events) but is currently decorated as a
GET; change it to a POST to prevent accidental invocation by crawlers and
caches. Replace the `@router.get`("/api/cluster/promote-archived") decorator with
`@router.post`("/api/cluster/promote-archived") on the promote_archived_models
function, and update any tests, clients, or docs that call this endpoint to use
POST instead of GET; keep the function body (including calls to find_promotable,
promote_model, and notifications.emit_event) unchanged.
In `@tinyagentos/routes/container_shell.py`:
- Around line 50-57: The current use of proc.communicate() buffers the entire
subprocess output; replace it with incremental reads from proc.stdout (created
by asyncio.create_subprocess_exec with stderr=STDOUT) into a bytes accumulator
and stop when a hard MAX_OUTPUT_BYTES limit is reached, then terminate/kill the
process and await proc.wait(); use asyncio.wait_for around the overall read loop
(or individual reads) with the existing _EXEC_TIMEOUT to preserve the timeout
behavior. In short: stop using proc.communicate(), read from proc.stdout in a
loop (e.g., proc.stdout.read(chunk_size)), append until accumulated length >=
MAX_OUTPUT_BYTES, then call proc.kill()/proc.terminate() and await proc.wait(),
and return the truncated output plus an indicator that output was capped; refer
to create_subprocess_exec, proc.communicate, proc.stdout, and _EXEC_TIMEOUT in
your changes.
- Around line 250-253: The current logic treats any truthy output as real output
even when it is only whitespace; change the branch to inspect a stripped value:
compute stripped_output = output.rstrip() (or "" when output is falsy), then set
display = "(no output)" if not stripped_output else
html.escape(stripped_output), and use that display in the HTMLResponse along
with escaped_cmd; update the code around the escaped_output/escaped_cmd
construction where HTMLResponse is returned (variables: output, escaped_output,
escaped_cmd).
---
Nitpick comments:
In `@tests/routes/test_container_shell.py`:
- Around line 236-305: Add a new test named test_exec_handles_timeout in
tests/routes/test_container_shell.py that patches
tinyagentos.routes.container_shell.asyncio.create_subprocess_exec to return an
AsyncMock (mock_proc) whose communicate raises asyncio.TimeoutError
(side_effect) and whose wait is an AsyncMock returning None; call the
/api/container-shell/<agent>/exec endpoint with a long-running command (e.g.
"sleep 999") and assert resp.status_code == 200 and that resp.text contains both
"timed out" and "cmd-info" to exercise the asyncio.TimeoutError branch in
container_shell.
In `@tests/test_activity.py`:
- Around line 113-121: Add a regression test alongside
test_post_event_rejects_non_json that posts valid JSON bodies with incorrect
top-level shapes (e.g., JSON array "[]", string "\"x\"", and number "123") to
the "/api/activity/events" endpoint and assert the handler returns 400; locate
the existing test_post_event_rejects_non_json in tests/test_activity.py and add
a new async test (or extend it) that sends these content values with
Content-Type "application/json" to ensure the endpoint rejects non-object JSON
bodies.
In `@tinyagentos/routes/agents.py`:
- Around line 65-66: _entries currently keeps every idempotency entry forever;
change its value shape and lifecycle so entries expire: in __init__ replace
dict[str, tuple[asyncio.Event, dict | None]] with something that includes a
timestamp or expiry (e.g., tuple[asyncio.Event, dict | None, float]) and
implement an eviction path — either schedule
asyncio.create_task(self._evict_after(key, ttl)) when creating/completing an
entry or run a single background cleaner task that scans _entries and removes
items older than TTL; also ensure the code paths that read/serve entries (the
methods around lines 86-102) check expiry and delete expired entries before
returning so stale results cannot be replayed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: cd8cdfe1-c3b2-462c-925a-3782796d80f7
📒 Files selected for processing (15)
tests/routes/test_container_shell.pytests/test_activity.pytests/test_model_archive.pytinyagentos/agent_tokens_store.pytinyagentos/app.pytinyagentos/channel_hub/adapter_manager.pytinyagentos/channel_hub/adapters/discord.pytinyagentos/cluster/manager.pytinyagentos/cluster/model_archive.pytinyagentos/routes/activity.pytinyagentos/routes/agent_debugger.pytinyagentos/routes/agents.pytinyagentos/routes/cluster.pytinyagentos/routes/container_shell.pytinyagentos/templates/activity_feed.html
| def test_exec_handles_incus_not_found(self, test_client, admin_auth_headers): | ||
| """When incus is not installed, a helpful error message is returned.""" | ||
| with patch( | ||
| "tinyagentos.routes.container_shell.asyncio.create_subprocess_exec", | ||
| side_effect=FileNotFoundError("incus not found"), | ||
| ): | ||
| resp = test_client.post( | ||
| "/api/container-shell/test-agent/exec", | ||
| data={"command": "ls"}, | ||
| headers=admin_auth_headers, | ||
| ) | ||
| assert resp.status_code == 200 | ||
| assert "incus: command not found" in resp.text | ||
|
|
||
| def test_exec_handles_nonzero_exit_code(self, test_client, admin_auth_headers): | ||
| """Commands that fail (non-zero exit) must still return output.""" | ||
| mock_proc = AsyncMock() | ||
| mock_proc.communicate = AsyncMock( | ||
| return_value=(b"bash: line 1: nosuchcmd: command not found\n", b"") | ||
| ) | ||
| mock_proc.returncode = 127 | ||
|
|
||
| with patch( | ||
| "tinyagentos.routes.container_shell.asyncio.create_subprocess_exec", | ||
| return_value=mock_proc, | ||
| ): | ||
| resp = test_client.post( | ||
| "/api/container-shell/test-agent/exec", | ||
| data={"command": "nosuchcmd"}, | ||
| headers=admin_auth_headers, | ||
| ) | ||
| assert resp.status_code == 200 | ||
| assert "command not found" in resp.text |
There was a problem hiding this comment.
Use a non-127 code for the generic failure test.
test_exec_handles_nonzero_exit_code currently goes through the special rc == 127 branch in container_shell_exec(), so it does not protect the normal “failed command but still show output” path the test name describes. Switching this to something like 1 would cover the intended behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/routes/test_container_shell.py` around lines 236 - 268, The test
test_exec_handles_nonzero_exit_code is exercising the special rc==127 branch
instead of the generic non-zero path; update the mocked process in this test
(the AsyncMock mock_proc.returncode used when patching
tinyagentos.routes.container_shell.asyncio.create_subprocess_exec) to use a
non-127 value (e.g. 1) so it exercises container_shell_exec()’s generic failure
handling and still returns the command output.
| await self._db.commit() | ||
| except aiosqlite.IntegrityError: | ||
| await self._db.execute("ROLLBACK") | ||
| raise AgentTokenExistsError(agent_name) from None | ||
| except Exception: | ||
| await self._db.execute("ROLLBACK") | ||
| raise | ||
|
|
||
| row = await ( | ||
| await self._db.execute( | ||
| "SELECT id, agent_name, token, created_at, revoked_at " | ||
| "FROM agent_tokens WHERE agent_name = ? AND revoked_at IS NULL", | ||
| (agent_name,), | ||
| ) | ||
| ).fetchone() | ||
|
|
||
| if row is None: | ||
| raise RuntimeError(f"Token for '{agent_name}' not found after issue") | ||
|
|
||
| return _row_to_dict(row) |
There was a problem hiding this comment.
Don’t re-read the “current active” token after commit.
Once Line 80 commits, another worker can revoke this row and issue a new token before Lines 88-94 run. In that interleaving, issue() can either raise Token ... not found after issue or return a different request’s token, which is both a correctness bug and a credential leak. Return the row you just inserted from inside the transaction instead of querying revoked_at IS NULL after commit.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/agent_tokens_store.py` around lines 80 - 99, The code in issue()
reads the "current active" token after commit which can race with other workers;
instead return the row that was just inserted from inside the same transaction
(before calling await self._db.commit()). Modify the flow in issue() to fetch
and return the inserted row using a non-racy method — either use an INSERT ...
RETURNING clause or select by the connection's last insert id (e.g.,
last_insert_rowid()) while still inside the transaction on self._db — and only
then commit; keep raising AgentTokenExistsError on IntegrityError and use
_row_to_dict to convert the fetched row for the return value.
| params: dict[str, str | int] = {"limit": 10} | ||
| last_id = self._last_message_ids.get(channel_id) | ||
| if last_id: | ||
| params["after"] = last_id | ||
|
|
||
| try: | ||
| resp = await client.get( | ||
| f"{DISCORD_API_BASE}/channels/{channel_id}/messages", | ||
| headers=self.headers, | ||
| params=params, | ||
| ) | ||
| except httpx.RequestError as exc: | ||
| logger.error( | ||
| "Discord HTTP error on channel %s: %s", channel_id, exc, | ||
| ) | ||
| return | ||
|
|
||
| if resp.status_code == 429: | ||
| retry_after = float( | ||
| resp.json().get("retry_after", _RATE_LIMIT_WINDOW), | ||
| ) | ||
| logger.warning( | ||
| "Discord rate limited on channel %s, waiting %.1fs", | ||
| channel_id, retry_after, | ||
| ) | ||
| await asyncio.sleep(retry_after) | ||
| return | ||
|
|
||
| if resp.status_code == 401: | ||
| logger.error( | ||
| "Discord auth failure on channel %s — bad token?", channel_id, | ||
| ) | ||
| return | ||
|
|
||
| if resp.status_code != 200: | ||
| logger.debug( | ||
| "Discord channel %s returned %d", channel_id, resp.status_code, | ||
| ) | ||
| return | ||
|
|
||
| messages = resp.json() | ||
| if not messages: | ||
| return | ||
|
|
||
| # Update last-seen ID (API returns newest first) | ||
| self._last_message_ids[channel_id] = messages[0]["id"] | ||
|
|
||
| # Process in chronological order (oldest first) | ||
| for msg in reversed(messages): | ||
| if msg.get("author", {}).get("id") == self._bot_user_id: | ||
| continue # Skip our own messages | ||
| await self._handle_message(client, channel_id, msg) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the file
ls -la tinyagentos/channel_hub/adapters/discord.py
# Show relevant sections around the cited lines
python3 - <<'PY'
import itertools, pathlib
path = pathlib.Path("tinyagentos/channel_hub/adapters/discord.py")
lines = path.read_text().splitlines()
start, end = 80, 220
for i in range(start, min(end, len(lines))):
print(f"{i+1:4d}: {lines[i]}")
PY
# Search for where _last_message_ids is initialized and used
rg -n "_last_message_ids" tinyagentos/channel_hub/adapters/discord.py
# Search for any "cursor", "bootstrap", "initial", "sync" logic in that file
rg -n "cursor|bootstrap|initial|sync|checkpoint|last_message" tinyagentos/channel_hub/adapters/discord.py
# Search for _check_channel definition (to confirm full logic)
rg -n "def _check_channel|async def _check_channel" tinyagentos/channel_hub/adapters/discord.py
# Also view entire file size to choose next slices if needed
wc -l tinyagentos/channel_hub/adapters/discord.pyRepository: jaylfc/tinyagentos
Length of output: 7664
🌐 Web query:
Discord API GET /channels/{channel.id}/messages after parameter behavior ordering newest first and message inclusion/exclusion (pagination semantics)
💡 Result:
For Discord REST GET /channels/{channel.id}/messages (Message Resource “Get Channel Messages”): 1) Default ordering (newest-first) and what “after” does - The endpoint returns “an array of message objects from newest to oldest” on success. [1] - The query parameter after is defined as “Get messages after this message ID.” [1] - The documentation also states that before, after, and around are mutually exclusive (only one may be passed). [1] Practical implication for pagination when you request newest-first: - Because the overall response ordering is newest → oldest [1], using after as an ID boundary is intended to move the window forward in message ID terms (i.e., toward newer messages than the referenced ID), while still returning each page in newest → oldest order. (The “after” boundary semantics are explicitly described; the per-page ordering is explicitly described.) [1] 2) Pagination semantics / inclusion-exclusion behavior - Boundaries are controlled strictly by the choice of after OR before OR around; you cannot combine them in the same request. [1] - The docs describe the boundary behavior as “after this message ID” (for after) and “before this message ID” (for before). [1] - However, the docs do not explicitly state whether the boundary message itself (the message with id equal to the after value) is included or excluded; they only describe the relation (“after” / “before”) and enforce mutual exclusivity. [1] 3) Message inclusion/exclusion of additional content fields - This endpoint’s docs we found describe message retrieval and the available query parameters (around/before/after/limit). [1] - They do not indicate optional toggles (for example, including or excluding content) within these parameters; therefore, the observed “inclusion/exclusion” semantics in the official docs are about cursor boundaries (after/before/around), not about whether message content is present. [1] What you can safely implement from the official semantics - Always treat each response page as ordered newest → oldest. [1] - Use only one of after / before / around per request. [1] - Use after to specify a lower/upper message-id boundary according to Discord’s “Get messages after this message ID” rule. [1] - Assume limit bounds (1–100) and default limit (50) are controlled by the limit parameter. [1] Important gap / what is not explicitly answered in the official docs we located - Whether the boundary message with ID exactly equal to the after value is included or excluded is not explicitly stated in the cited documentation. [1] If you want, tell me your exact pagination goal (e.g., “fetch entire history oldest→newest” or “fetch next page after the last seen message”) and I can propose the safest cursor strategy using before vs after consistent with the fixed newest→oldest ordering.
Citations:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import pathlib
p = pathlib.Path("tinyagentos/channel_hub/adapters/discord.py")
lines = p.read_text().splitlines()
for i in range(1, 120):
print(f"{i:4d}: {lines[i-1]}")
PY
# Also search for any persistence/load of _last_message_ids
rg -n "last_message_ids|_last_message_ids|after_id|checkpoint|cursor|persist|storage|redis|db" tinyagentos/channel_hub/adapters/discord.pyRepository: jaylfc/tinyagentos
Length of output: 5197
Fix Discord polling cursoring to avoid skipping messages
_check_channelfetches only one page (limit=10) using newest-firstGET /channels/{channel.id}/messages, then advances_last_message_ids[channel_id]tomessages[0]["id"](the newest id from that page). Withaftersemantics (“messages after this message ID”), if more thanlimitnew messages arrive between polls, the next poll starts after the newest id already seen and can permanently skip the older messages from that burst.- On cold start, the empty cursor causes the latest backlog (up to
limit) to be treated as fresh input. - Fix by bootstrapping the cursor on initial start (without emitting) and draining all unseen messages before moving the checkpoint (or otherwise paginating until caught up) to avoid gaps.
https://docs.discord.com/developers/resources/message
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/channel_hub/adapters/discord.py` around lines 118 - 169,
_check_channel currently fetches one page and advances
_last_message_ids[channel_id] to messages[0]["id"], which can skip messages;
change it so that on initial cold-start you bootstrap the cursor (if
_last_message_ids has no entry for channel_id, set it to the newest message id
and return without calling _handle_message), and on normal polls paginate until
caught up by repeatedly requesting messages with
params["after"]=current_last_id, processing each page in chronological order via
_handle_message and updating _last_message_ids[channel_id] to the newest id seen
after each page; ensure you loop until a page returns fewer than limit (or
empty) to avoid gaps.
| if req_gpu_accel: | ||
| if req_gpu_accel == "cuda" and not hw_gpu.get("cuda"): | ||
| return False | ||
| if req_gpu_accel == "rocm" and not hw_gpu.get("rocm"): | ||
| return False | ||
| if req_gpu_accel == "vulkan" and not hw_gpu.get("vulkan"): | ||
| return False |
There was a problem hiding this comment.
Enforce gpu_accel="mlx" instead of silently allowing it.
The docstring advertises mlx, but this branch never checks for it. A manifest with {"gpu_accel": "mlx"} currently passes on any worker, so Apple-only models can be promoted onto incompatible hardware.
Suggested fix
if req_gpu_accel:
if req_gpu_accel == "cuda" and not hw_gpu.get("cuda"):
return False
if req_gpu_accel == "rocm" and not hw_gpu.get("rocm"):
return False
if req_gpu_accel == "vulkan" and not hw_gpu.get("vulkan"):
return False
+ if req_gpu_accel == "mlx" and not (hw_gpu.get("mlx") or hw_gpu.get("type") == "apple"):
+ return False🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/cluster/model_archive.py` around lines 119 - 125, The gpu_accel
branch currently checks only "cuda", "rocm", and "vulkan" so a manifest with
req_gpu_accel == "mlx" is treated as allowed; update the logic in the same block
that reads req_gpu_accel and hw_gpu to explicitly check for "mlx" (e.g., if
req_gpu_accel == "mlx" and not hw_gpu.get("mlx"): return False) so Apple-only
models are rejected on workers without mlx support; keep the existing checks for
"cuda", "rocm", and "vulkan" and consider adding a fallback to return False for
any unknown req_gpu_accel values to avoid silently allowing unsupported
accelerators.
| req_arch = requirements.get("arch") | ||
| if req_arch: | ||
| worker_arch = hw_cpu.get("arch", "") | ||
| if worker_arch and worker_arch != req_arch: | ||
| return False |
There was a problem hiding this comment.
Reject workers with unknown CPU arch when the manifest pins arch.
If requirements["arch"] is set and the worker omits cpu.arch, this check currently returns True. That can promote architecture-specific models to a host whose compatibility is unknown.
Suggested fix
req_arch = requirements.get("arch")
if req_arch:
worker_arch = hw_cpu.get("arch", "")
- if worker_arch and worker_arch != req_arch:
+ if not worker_arch or worker_arch != req_arch:
return False📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| req_arch = requirements.get("arch") | |
| if req_arch: | |
| worker_arch = hw_cpu.get("arch", "") | |
| if worker_arch and worker_arch != req_arch: | |
| return False | |
| req_arch = requirements.get("arch") | |
| if req_arch: | |
| worker_arch = hw_cpu.get("arch", "") | |
| if not worker_arch or worker_arch != req_arch: | |
| return False |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/cluster/model_archive.py` around lines 135 - 139, The current
check uses req_arch = requirements.get("arch") and then only rejects when
worker_arch exists and differs; instead, when requirements["arch"] is set you
must also reject workers that omit cpu.arch. Change the condition that uses
hw_cpu.get("arch", "") / worker_arch so that if req_arch is truthy and
(worker_arch is falsy OR worker_arch != req_arch) you return False; update the
block around req_arch, hw_cpu, and worker_arch accordingly.
| # --- Idempotency guard --- | ||
| idempotency_cache = getattr(request.app.state, "idempotency_cache", None) | ||
| idempotency_key = request.headers.get("Idempotency-Key") | ||
| if idempotency_key and idempotency_cache is not None: | ||
| mode, event = idempotency_cache.try_reserve(idempotency_key) | ||
| if mode == "wait": | ||
| await event.wait() | ||
| return idempotency_cache.get(idempotency_key) |
There was a problem hiding this comment.
Finalize the reservation on every return path.
Right now only the trailing success path calls set(). Any validation error, 202 routed deploy, 4xx/5xx JSONResponse, or exception leaves the event unset, so same-key retries can hang forever in await event.wait(). Cache a full response envelope (status_code + body) and release the key from a single common exit path so waiters always get the same HTTP result as the original request.
Also applies to: 195-198, 545-552, 857-862
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/agents.py` around lines 164 - 171, The idempotency
reservation created via idempotency_cache.try_reserve (which returns mode,
event) currently only gets finalized on the trailing success path, leaving
waiters stuck in await event.wait(); fix by capturing the full HTTP response
envelope (status_code and response body) for every possible exit and ensuring
idempotency_cache.set(idempotency_key, envelope) is called exactly once on all
return paths. Concretely, wrap the handler logic around the reservation in a
try/finally (or use a single common exit path) so that any validation error, 202
routing, JSONResponse, or raised exception constructs the same envelope (e.g.,
{"status_code": ..., "body": ...}) and calls
idempotency_cache.set(idempotency_key, envelope) in the finally block; reference
the existing symbols idempotency_cache.try_reserve, event.wait, and
idempotency_cache.set to locate and change the code. Ensure same pattern is
applied to the other occurrences noted (the other blocks using try_reserve at
lines referenced).
| idempotency_cache = getattr(request.app.state, "idempotency_cache", None) | ||
| idempotency_key = request.headers.get("Idempotency-Key") | ||
| if idempotency_key and idempotency_cache is not None: | ||
| mode, event = idempotency_cache.try_reserve(idempotency_key) |
There was a problem hiding this comment.
Namespace the idempotency key per operation.
These routes share one app-wide cache but use the raw header value as the lookup key, so the same Idempotency-Key on /api/agents and /api/agents/deploy can collide and replay the wrong response body. Prefix the key with the route/method, or bind it to a request fingerprint before calling try_reserve().
Suggested fix
- mode, event = idempotency_cache.try_reserve(idempotency_key)
+ cache_key = f"POST:/api/agents:{idempotency_key}"
+ mode, event = idempotency_cache.try_reserve(cache_key)- mode, event = idempotency_cache.try_reserve(idempotency_key)
+ cache_key = f"POST:/api/agents/deploy:{idempotency_key}"
+ mode, event = idempotency_cache.try_reserve(cache_key)Also applies to: 546-549
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/agents.py` around lines 165 - 168, The idempotency cache
is using the raw header value (idempotency_key) causing collisions across
routes; before calling idempotency_cache.try_reserve(idempotency_key) in
agents.py wrap/namespace the key with the operation fingerprint (for example
combine request.method and request.url.path or a route name) such as
namespaced_key = f"{request.method}:{request.url.path}:{idempotency_key}" (or
hash that string) and pass namespaced_key to try_reserve; apply the same change
for the other occurrence that uses try_reserve so keys are scoped per endpoint.
| @router.get("/api/cluster/promote-archived") | ||
| async def promote_archived_models(request: Request): | ||
| """Manual trigger: scan all online workers and promote any archived | ||
| models that are now compatible with cluster hardware. | ||
|
|
||
| Called by the user from the Cluster page or admin CLI. Safe to call | ||
| repeatedly — already-promoted models are skipped. | ||
| """ | ||
| cluster = request.app.state.cluster_manager | ||
| notifications = getattr(request.app.state, "notifications", None) | ||
|
|
||
| workers = cluster.get_workers() | ||
| online = [w for w in workers if w.status == "online"] | ||
|
|
||
| from tinyagentos.cluster.model_archive import ( | ||
| find_promotable, | ||
| promote_model, | ||
| ) | ||
|
|
||
| promoted_by_worker: dict[str, list[str]] = {} | ||
| total = 0 | ||
|
|
||
| for w in online: | ||
| promotable = find_promotable( | ||
| worker_hardware=w.hardware, | ||
| worker_name=w.name, | ||
| ) | ||
| for model in promotable: | ||
| model_id = model.get("model_id", "?") | ||
| if promote_model(model): | ||
| promoted_by_worker.setdefault(w.name, []).append(model_id) | ||
| total += 1 | ||
| if notifications: | ||
| try: | ||
| await notifications.emit_event( | ||
| "model.promoted", | ||
| f"Archived model '{model_id}' promoted", | ||
| f"Worker '{w.name}' can now run '{model_id}'. " | ||
| f"Moved from archive to active models.", | ||
| level="info", | ||
| ) | ||
| except Exception: | ||
| logger.exception( | ||
| "notification emit failed for model promotion %s", | ||
| model_id, | ||
| ) | ||
|
|
||
| return { | ||
| "promoted": total, | ||
| "by_worker": promoted_by_worker, | ||
| "workers_scanned": len(online), | ||
| } |
There was a problem hiding this comment.
Make this a POST, not a mutating GET.
This handler moves model directories and emits events, so it is not a safe read. Keeping it on GET makes accidental promotion possible via crawlers, link previews, prefetchers, or overly aggressive caches.
Suggested fix
-@router.get("/api/cluster/promote-archived")
+@router.post("/api/cluster/promote-archived")
async def promote_archived_models(request: Request):🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/cluster.py` around lines 661 - 712, The route handler
promote_archived_models is a mutating endpoint (it moves model dirs and emits
events) but is currently decorated as a GET; change it to a POST to prevent
accidental invocation by crawlers and caches. Replace the
`@router.get`("/api/cluster/promote-archived") decorator with
`@router.post`("/api/cluster/promote-archived") on the promote_archived_models
function, and update any tests, clients, or docs that call this endpoint to use
POST instead of GET; keep the function body (including calls to find_promotable,
promote_model, and notifications.emit_event) unchanged.
| proc = await asyncio.create_subprocess_exec( | ||
| "incus", "exec", container, "--", "bash", "-lc", command, | ||
| stdout=asyncio.subprocess.PIPE, | ||
| stderr=asyncio.subprocess.STDOUT, | ||
| ) | ||
| stdout_bytes, _ = await asyncio.wait_for( | ||
| proc.communicate(), timeout=_EXEC_TIMEOUT, | ||
| ) |
There was a problem hiding this comment.
Cap subprocess output instead of buffering all of it.
proc.communicate() holds the entire stream in memory until the command exits. On this endpoint, a command like yes, find /, or a noisy log tail can exhaust the worker long before the 30s timeout fires. Please read with a byte limit or truncate after a maximum response size.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/container_shell.py` around lines 50 - 57, The current use
of proc.communicate() buffers the entire subprocess output; replace it with
incremental reads from proc.stdout (created by asyncio.create_subprocess_exec
with stderr=STDOUT) into a bytes accumulator and stop when a hard
MAX_OUTPUT_BYTES limit is reached, then terminate/kill the process and await
proc.wait(); use asyncio.wait_for around the overall read loop (or individual
reads) with the existing _EXEC_TIMEOUT to preserve the timeout behavior. In
short: stop using proc.communicate(), read from proc.stdout in a loop (e.g.,
proc.stdout.read(chunk_size)), append until accumulated length >=
MAX_OUTPUT_BYTES, then call proc.kill()/proc.terminate() and await proc.wait(),
and return the truncated output plus an indicator that output was capped; refer
to create_subprocess_exec, proc.communicate, proc.stdout, and _EXEC_TIMEOUT in
your changes.
| escaped_output = html.escape(output.rstrip()) if output else "(no output)" | ||
| return HTMLResponse( | ||
| f'<div class="cmd-line">$ {escaped_cmd}</div>' | ||
| f'<div class="cmd-out">{escaped_output}</div>' |
There was a problem hiding this comment.
Handle whitespace-only output as (no output).
A command that prints only newlines/spaces makes output truthy, but output.rstrip() becomes empty, so this returns a blank <div> instead of the placeholder. Check the stripped value before choosing the fallback.
Proposed fix
- escaped_output = html.escape(output.rstrip()) if output else "(no output)"
+ stripped_output = output.rstrip() if output else ""
+ escaped_output = html.escape(stripped_output) if stripped_output else "(no output)"
return HTMLResponse(
f'<div class="cmd-line">$ {escaped_cmd}</div>'
f'<div class="cmd-out">{escaped_output}</div>'
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| escaped_output = html.escape(output.rstrip()) if output else "(no output)" | |
| return HTMLResponse( | |
| f'<div class="cmd-line">$ {escaped_cmd}</div>' | |
| f'<div class="cmd-out">{escaped_output}</div>' | |
| stripped_output = output.rstrip() if output else "" | |
| escaped_output = html.escape(stripped_output) if stripped_output else "(no output)" | |
| return HTMLResponse( | |
| f'<div class="cmd-line">$ {escaped_cmd}</div>' | |
| f'<div class="cmd-out">{escaped_output}</div>' |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/container_shell.py` around lines 250 - 253, The current
logic treats any truthy output as real output even when it is only whitespace;
change the branch to inspect a stripped value: compute stripped_output =
output.rstrip() (or "" when output is falsy), then set display = "(no output)"
if not stripped_output else html.escape(stripped_output), and use that display
in the HTMLResponse along with escaped_cmd; update the code around the
escaped_output/escaped_cmd construction where HTMLResponse is returned
(variables: output, escaped_output, escaped_cmd).
|
Closing — git histories are unrelated (fork diverged), making this unmergeable. Will reopen as a clean PR from upstream master. |
Adds 23 tests for the container shell page (issue #462).
Tests cover:
Mock-based — no live containers needed. Tests pass individually; a pre-existing aiosqlite deadlock in local test infra prevents full batch runs (CI handles this correctly).
Summary by CodeRabbit
Release Notes
New Features
Tests