Skip to content

feat(knowledge): add per-app agent guides ingestion (#451)#490

Closed
hognek wants to merge 8 commits into
jaylfc:masterfrom
hognek:feat/per-app-agent-guides-only
Closed

feat(knowledge): add per-app agent guides ingestion (#451)#490
hognek wants to merge 8 commits into
jaylfc:masterfrom
hognek:feat/per-app-agent-guides-only

Conversation

@hognek
Copy link
Copy Markdown
Contributor

@hognek hognek commented May 31, 2026

Extends ingest_agent_docs for per-app guides. Hooks into app install/uninstall lifecycle so installed apps' guides are immediately available. 18 tests pass.\n\nCloses #451

Summary by CodeRabbit

  • New Features

    • Model Activity feed with live stream and UI; in-memory event buffer
    • AgentState schema and YAML load/dump; agent token store with issue/revoke APIs
    • Idempotent agent create/deploy; GitHub events channel and in-process channel adapters
    • New installers: llama.cpp (GGUF) and vllm (HF prefetch); auto-ingest agent docs and per-app guides at startup/install
  • Documentation

    • Pin QMD install to a known-good release; added fork-update/rebase guidance
  • Tests

    • New tests for doc ingestion and AgentState behavior
  • Bug Fixes

    • Adjusted backend→installer mappings for correct installer selection

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 31, 2026

Review Change Stack

Warning

Review limit reached

@jaylfc, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 28 minutes and 7 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 0ca214c9-fd26-49c8-9581-ad3d6d63225c

📥 Commits

Reviewing files that changed from the base of the PR and between 9329054 and d0a2723.

📒 Files selected for processing (19)
  • tests/framework_integrations/__init__.py
  • tests/framework_integrations/test_agent_state.py
  • tests/routes/test_store_install_v2.py
  • tests/test_agent_doc_ingestion.py
  • tinyagentos/agent_tokens_store.py
  • tinyagentos/app.py
  • tinyagentos/channel_hub/adapter_manager.py
  • tinyagentos/framework_integrations/__init__.py
  • tinyagentos/framework_integrations/agent_state.py
  • tinyagentos/installers/base.py
  • tinyagentos/installers/llama_cpp_installer.py
  • tinyagentos/installers/vllm_installer.py
  • tinyagentos/knowledge_monitor.py
  • tinyagentos/knowledge_store.py
  • tinyagentos/routes/activity.py
  • tinyagentos/routes/agents.py
  • tinyagentos/routes/channel_hub.py
  • tinyagentos/routes/store.py
  • tinyagentos/routes/store_install.py
📝 Walkthrough

Walkthrough

This PR adds two new model installers and installer dispatch updates, implements a markdown-based knowledge ingestion system with store-scoped listing/deletion and app lifecycle wiring (plus tests), introduces a GitHub event adapter and in-process adapter wiring, pins @jaylfc/qmd to v2.1.1 with docs/scripts updates, and adds agent tokens storage, AgentState schema+tests, a model activity feed/UI, and idempotency caching for agent endpoints.

Changes

Multi-feature expansion: model installers, knowledge system, GitHub channel adapter, and QMD pinning

Layer / File(s) Summary
QMD fork pinning & docs
docs/agent-qmd-serve-setup.md, docs/design/framework-agnostic-runtime.md, scripts/install-server.sh
Pins @jaylfc/qmd to 2.1.1 in docs and install scripts; adds guidance for rebasing and pin updates.
Installer method dispatch remapping
tinyagentos/installers/base.py, tinyagentos/routes/store_install.py, tests/routes/test_store_install_v2.py
Backend-to-method routing updated so ollama→ollama_pull, llama-cpp→llama_cpp_download, and vllm→vllm_download.
New model installers: llama-cpp and vLLM
tinyagentos/installers/llama_cpp_installer.py, tinyagentos/installers/vllm_installer.py
LlamaCppInstaller downloads/validates GGUF files with magic-byte checks and caching; VllmInstaller pre-downloads HF repos into hub cache and optionally pip-installs huggingface_hub.
Knowledge store source_id methods
tinyagentos/knowledge_store.py
Added list_by_source_id and delete_by_source_id for scoped querying and purging (including FTS cleanup).
Knowledge ingestion and wipe helpers
tinyagentos/knowledge_monitor.py
Added ingest_agent_docs, ingest_app_guides, ingest_installed_app_guides, and wipe_app_guides for recursive Markdown ingestion and per-app scoping.
App install/uninstall knowledge hooks
tinyagentos/routes/store.py, tinyagentos/routes/store_install.py
Calls to ingest/wipe per-app guides on successful install/uninstall flows with warnings on failure.
App startup knowledge initialization
tinyagentos/app.py
Boot-time ingestion of core docs and per-installed-app guides during lifespan; IdempotencyCache created on app.state.
Knowledge system tests
tests/test_agent_doc_ingestion.py
Tests for ingestion idempotency, nested files, empty-file skipping, per-app guides, wipe semantics, and store list/delete_by_source_id.
GitHub webhook event channel adapter
tinyagentos/channel_hub/adapters/github.py, tinyagentos/channel_hub/adapter_manager.py, tinyagentos/routes/channel_hub.py
Adds GithubConnector that tails a JSONL event log, filters events, routes IncomingMessage objects; adapter manager short-circuits subprocesses for in-process adapters; connect_bot supports github platform.
Model activity feed and UI
tinyagentos/routes/activity.py
Replaces system stats endpoint with a model activity ring buffer, SSE stream, publish/history endpoints, and embedded HTML/JS UI.
Agent tokens store
tinyagentos/agent_tokens_store.py
Adds SQLite-backed AgentTokensStore with single-active-token partial unique index and methods to issue/revoke/get active tokens.
Framework AgentState schema & YAML IO
tinyagentos/framework_integrations/agent_state.py, tinyagentos/framework_integrations/__init__.py
Adds Pydantic AgentState schema, enums, validator for last_deployed_at, and YAML load/dump helpers.
AgentState tests
tests/framework_integrations/test_agent_state.py
Comprehensive tests for AgentState parsing, validation, round-trip serialization, and design rules.
Agents endpoints idempotency
tinyagentos/routes/agents.py
Adds IdempotencyCache and uses it in add_agent and deploy_agent handlers to deduplicate concurrent requests by Idempotency-Key.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • #451: Pass 2: per-app agent guides ingestion — implements per-app guide ingestion and lifecycle hooks described in the issue.
  • #114: QMD fork pinning and rebase guidance — PR updates install docs/scripts and adds rebase instructions relevant to qmd-fork drift.

"🐰 Nose twitch, I hopped to code the night,
LLama GGUFs and vLLM fetches bright,
GitHub tails whisper events in flight,
App guides filed for search and light,
QMD pinned, all snug and right."

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning Changes include supporting features (AgentState model, idempotency cache, installers, activity stream, token store) necessary for the app ecosystem but not directly related to per-app guides ingestion itself. Consolidate unrelated features into separate PRs: IdempotencyCache, AgentState framework schema, installer updates, activity stream, and token store should be merged separately from per-app guides ingestion.
Docstring Coverage ⚠️ Warning Docstring coverage is 48.31% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding per-app agent guides ingestion into the knowledge store system.
Linked Issues check ✅ Passed All objectives from issue #451 are met: per-app guides discovery, source_id scoping, app lifecycle hooks for ingestion/uninstall, and tests validate idempotency and filtering.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

pr_number = body.get("pr_number")
from tinyagentos.channel_hub.adapters.github import GithubConnector
connector = GithubConnector(
agent_name=agent_name, router=router_obj,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Channel ID mismatch - should use repo instead of agent_name

The GithubConnector sets channel_id=repo in the IncomingMessage it routes. For the router to correctly deliver messages to the agent, the channel must be registered with the same channel_id. Currently, it's being registered with channel_id=agent_name, which causes a mismatch and prevents message delivery.

Suggested change
agent_name=agent_name, router=router_obj,
router_obj.assign_channel(platform, agent_name, repo)

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Test comment

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
tinyagentos/routes/channel_hub.py 128 Channel ID mismatch - should use repo instead of agent_name

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0

Files reviewed:

  • docs/agent-qmd-serve-setup.md
  • docs/design/framework-agnostic-runtime.md
  • scripts/install-server.sh
  • tests/test_agent_doc_ingestion.py
  • tinyagentos/app.py
  • tinyagentos/channel_hub/adapter_manager.py
  • tinyagentos/channel_hub/adapters/__init__.py
  • tinyagentos/channel_hub/adapters/github.py
  • tinyagentos/knowledge_monitor.py
  • tinyagentos/knowledge_store.py
  • tinyagentos/routes/channel_hub.py
  • tinyagentos/routes/store.py
  • tinyagentos/routes/store_install.py

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Test details

This is a test.

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
tinyagentos/routes/channel_hub.py 128 Channel ID mismatch - should use repo instead of agent_name

1 similar comment
@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
tinyagentos/routes/channel_hub.py 128 Channel ID mismatch - should use repo instead of agent_name

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 31, 2026

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
tinyagentos/routes/channel_hub.py 128 Channel ID mismatch - should use repo instead of agent_name

Files reviewed:

  • docs/agent-qmd-serve-setup.md
  • docs/design/framework-agnostic-runtime.md
  • scripts/install-server.sh
  • tests/test_agent_doc_ingestion.py
  • tinyagentos/app.py
  • tinyagentos/channel_hub/adapter_manager.py
  • tinyagentos/channel_hub/adapters/__init__.py
  • tinyagentos/channel_hub/adapters/github.py
  • tinyagentos/knowledge_monitor.py
  • tinyagentos/knowledge_store.py
  • tinyagentos/routes/channel_hub.py
  • tinyagentos/routes/store.py
  • tinyagentos/routes/store_install.py

Reviewed by nemotron-3-super-120b-a12b-20230311:free · 2,399,551 tokens

hognek pushed a commit to hognek/tinyagentos that referenced this pull request May 31, 2026
…hannel (jaylfc#490)

Fixes kilo-code-bot finding: Channel ID mismatch on line 128.
assign_channel(platform, bot_id, agent_name) was passing agent_name
for both bot_id and agent_name, keying channels by agent identity
instead of repo. Correct key format is now github:<repo>.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
tinyagentos/routes/store_install.py (1)

625-632: ⚡ Quick win

Preserve the exception detail when guide ingestion fails.

The blind except Exception swallows the cause, and the warning logs no traceback — a silent ingestion failure will be hard to diagnose. The same applies to the wipe path at Lines 704-711.

♻️ Log the exception
-        try:
-            await ingest_app_guides(manifest.id, manifest.manifest_dir, knowledge_store)
-        except Exception:
-            logger.warning("Guide ingestion failed for app %s", manifest.id)
+        try:
+            await ingest_app_guides(manifest.id, manifest.manifest_dir, knowledge_store)
+        except Exception:  # noqa: BLE001 — best-effort, must not break install
+            logger.warning("Guide ingestion failed for app %s", manifest.id, exc_info=True)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/store_install.py` around lines 625 - 632, The warning
handler around ingest_app_guides currently swallows errors; change the bare
"except Exception" to capture the exception (e.g. "except Exception as e") and
log the exception details/traceback when calling logger (use logger.exception or
logger.warning(..., exc_info=True)) so the error and stack are preserved for
manifest.id and manifest.manifest_dir failures; apply the same change to the
wipe-path handler that currently mirrors this pattern so both ingest_app_guides
and the wipe logic surface exception details.
tinyagentos/installers/vllm_installer.py (1)

43-47: ⚡ Quick win

Invoke pip via the current interpreter.

A bare "pip" may not exist on PATH or may target a different environment than the one running the controller. Use sys.executable -m pip so the package lands in the right interpreter.

♻️ Use sys.executable
-    import asyncio
+    import asyncio
+    import sys
 
     proc = await asyncio.create_subprocess_exec(
-        "pip", "install", "huggingface_hub[hf_transfer]",
+        sys.executable, "-m", "pip", "install", "huggingface_hub[hf_transfer]",
         stdout=asyncio.subprocess.PIPE,
         stderr=asyncio.subprocess.STDOUT,
     )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/installers/vllm_installer.py` around lines 43 - 47, Replace the
bare "pip" invocation in the asyncio.create_subprocess_exec call so the package
is installed into the current Python interpreter: import sys and invoke
sys.executable with "-m", "pip", "install", "huggingface_hub[hf_transfer]"
instead of "pip" (update the asyncio.create_subprocess_exec call that creates
proc). Keep stdout/stderr handling the same and ensure each argument is a
separate string argument to create_subprocess_exec.
tinyagentos/knowledge_store.py (1)

266-267: ⚡ Quick win

Batch the FTS cleanup deletes for better performance.

The loop issues one DELETE per item, incurring a round-trip for each. Replace the loop with a single batched DELETE using an IN clause.

♻️ Proposed refactor
     if not ids:
         return 0
-    for item_id in ids:
-        await self._db.execute("DELETE FROM knowledge_fts WHERE id = ?", (item_id,))
+    placeholders = ",".join("?" * len(ids))
+    await self._db.execute(
+        f"DELETE FROM knowledge_fts WHERE id IN ({placeholders})",
+        ids,
+    )
     cursor = await self._db.execute(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/knowledge_store.py` around lines 266 - 267, The code currently
issues a separate DELETE per item in the loop (for item_id in ids: await
self._db.execute("DELETE FROM knowledge_fts WHERE id = ?", (item_id,))) which
causes many round-trips; replace it with a single batched DELETE using an IN
clause: build a parameterized IN tuple from ids and call self._db.execute once
against "DELETE FROM knowledge_fts WHERE id IN (...)" (or use placeholders
matching len(ids)), handling the empty ids case to skip the query. Ensure you
reference the same symbols: ids, self._db.execute, and table knowledge_fts.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tinyagentos/channel_hub/adapter_manager.py`:
- Around line 23-28: The early return when "framework in _CHANNEL_ADAPTERS"
causes start_adapter to skip port allocation/register_adapter so
router.route_message lacks a registered port for agent_name; modify
start_adapter (the function containing the if framework in _CHANNEL_ADAPTERS
block) to either follow the normal adapter startup path instead of returning
early or perform an in-process registration before returning: allocate a port or
call register_adapter(agent_name, port, ...) with the same parameters the normal
startup uses and only return success after register_adapter completes so
router.route_message can route inbound GitHub events correctly.

In `@tinyagentos/channel_hub/adapters/github.py`:
- Around line 118-122: The current check in github.py using needle =
f"/pull/{self.pr_number}" and "if needle not in url" is too permissive (e.g.,
pr_number=12 matches /pull/123); update the filter to require an exact PR path
match by either parsing the event "url" and comparing path segments so the
segment after "pull" equals str(self.pr_number), or use a stricter pattern such
as matching "/pull/{pr_number}" followed only by end-of-path or a slash (e.g.,
regex "/pull/{pr_number}($|/)") so only the intended PR number passes the
filter.
- Around line 91-93: When the events file (EVENTS_LOG_PATH) is truncated or
rotated, current_size can become smaller than self._last_position which causes
the method to return and ignore future events; change the logic so that instead
of returning when current_size <= self._last_position you detect a shrink
(current_size < self._last_position) and reset the cursor (set
self._last_position = 0 or to current_size for your desired behavior) so reading
resumes from the start of the new file, then continue processing rather than
returning; update the branch surrounding EVENTS_LOG_PATH.stat().st_size and
self._last_position accordingly.

In `@tinyagentos/installers/vllm_installer.py`:
- Around line 139-172: VllmInstaller.uninstall currently derives the HF cache
key from app_id (safe_name = app_id.replace("/", "--")) but install() calls
snapshot_download with repo = variant.get("hf_repo") or
variant.get("download_url"), so uninstall must use the same resolved repo
string; update VllmInstaller.install to persist the resolved repo (the local
variable repo used when calling snapshot_download) into the install
metadata/state for that app_id, and change VllmInstaller.uninstall to read that
stored repo value (falling back to variant.download_url if metadata is missing)
and compute safe_name = repo.replace("/", "--") and then remove cache dirs based
on that safe_name (e.g., cache / "models" / safe_name and cache /
f"models--{safe_name}"), using the existing shutil.rmtree handling and logger.

In `@tinyagentos/routes/channel_hub.py`:
- Around line 120-135: The GitHub branch for creating a GithubConnector (using
GithubConnector, router_obj.assign_channel, connector.start, and storing in
request.app.state.channel_hub_connectors) must run before the shared
secret/bot_token_secret validation; refactor the route handler so that the
platform == "github" case is checked and handled (or explicitly exempted from
the bot_token_secret/secret lookup) prior to the token validation logic, moving
the entire block that constructs GithubConnector and returns the connected
response ahead of or outside the shared-secret checks to avoid 400/404 for
GitHub requests.

---

Nitpick comments:
In `@tinyagentos/installers/vllm_installer.py`:
- Around line 43-47: Replace the bare "pip" invocation in the
asyncio.create_subprocess_exec call so the package is installed into the current
Python interpreter: import sys and invoke sys.executable with "-m", "pip",
"install", "huggingface_hub[hf_transfer]" instead of "pip" (update the
asyncio.create_subprocess_exec call that creates proc). Keep stdout/stderr
handling the same and ensure each argument is a separate string argument to
create_subprocess_exec.

In `@tinyagentos/knowledge_store.py`:
- Around line 266-267: The code currently issues a separate DELETE per item in
the loop (for item_id in ids: await self._db.execute("DELETE FROM knowledge_fts
WHERE id = ?", (item_id,))) which causes many round-trips; replace it with a
single batched DELETE using an IN clause: build a parameterized IN tuple from
ids and call self._db.execute once against "DELETE FROM knowledge_fts WHERE id
IN (...)" (or use placeholders matching len(ids)), handling the empty ids case
to skip the query. Ensure you reference the same symbols: ids, self._db.execute,
and table knowledge_fts.

In `@tinyagentos/routes/store_install.py`:
- Around line 625-632: The warning handler around ingest_app_guides currently
swallows errors; change the bare "except Exception" to capture the exception
(e.g. "except Exception as e") and log the exception details/traceback when
calling logger (use logger.exception or logger.warning(..., exc_info=True)) so
the error and stack are preserved for manifest.id and manifest.manifest_dir
failures; apply the same change to the wipe-path handler that currently mirrors
this pattern so both ingest_app_guides and the wipe logic surface exception
details.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 09fc6cf4-cdd4-4e11-a41f-cddf1c32a434

📥 Commits

Reviewing files that changed from the base of the PR and between c08935a and b851fbb.

📒 Files selected for processing (17)
  • docs/agent-qmd-serve-setup.md
  • docs/design/framework-agnostic-runtime.md
  • scripts/install-server.sh
  • tests/routes/test_store_install_v2.py
  • tests/test_agent_doc_ingestion.py
  • tinyagentos/app.py
  • tinyagentos/channel_hub/adapter_manager.py
  • tinyagentos/channel_hub/adapters/__init__.py
  • tinyagentos/channel_hub/adapters/github.py
  • tinyagentos/installers/base.py
  • tinyagentos/installers/llama_cpp_installer.py
  • tinyagentos/installers/vllm_installer.py
  • tinyagentos/knowledge_monitor.py
  • tinyagentos/knowledge_store.py
  • tinyagentos/routes/channel_hub.py
  • tinyagentos/routes/store.py
  • tinyagentos/routes/store_install.py

Comment on lines +23 to +28
if framework in _CHANNEL_ADAPTERS:
logger.info(
"Channel adapter '%s' is managed by channel-hub connect API, "
"not subprocess", framework,
)
return 0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't report GitHub as started before the router can deliver to it.

This early return skips port allocation and never calls register_adapter, but router.route_message() still requires a registered port for agent_name. Inbound GitHub events will therefore hit the "No adapter registered" path and get dropped.

Either keep GitHub on the normal adapter startup path, or add an in-process registration path that satisfies the same routing contract before start_adapter() returns.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/channel_hub/adapter_manager.py` around lines 23 - 28, The early
return when "framework in _CHANNEL_ADAPTERS" causes start_adapter to skip port
allocation/register_adapter so router.route_message lacks a registered port for
agent_name; modify start_adapter (the function containing the if framework in
_CHANNEL_ADAPTERS block) to either follow the normal adapter startup path
instead of returning early or perform an in-process registration before
returning: allocate a port or call register_adapter(agent_name, port, ...) with
the same parameters the normal startup uses and only return success after
register_adapter completes so router.route_message can route inbound GitHub
events correctly.

Comment on lines +91 to +93
current_size = EVENTS_LOG_PATH.stat().st_size
if current_size <= self._last_position:
return
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reset the cursor when the event log shrinks.

If ~/.taos-gh-events.jsonl is truncated or rotated, current_size becomes smaller than self._last_position and this branch returns forever. From that point on, new events are silently ignored until the file grows past the stale offset.

Suggested fix
             current_size = EVENTS_LOG_PATH.stat().st_size
-            if current_size <= self._last_position:
+            if current_size < self._last_position:
+                self._last_position = 0
+            if current_size == self._last_position:
                 return
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
current_size = EVENTS_LOG_PATH.stat().st_size
if current_size <= self._last_position:
return
current_size = EVENTS_LOG_PATH.stat().st_size
if current_size < self._last_position:
self._last_position = 0
if current_size == self._last_position:
return
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/channel_hub/adapters/github.py` around lines 91 - 93, When the
events file (EVENTS_LOG_PATH) is truncated or rotated, current_size can become
smaller than self._last_position which causes the method to return and ignore
future events; change the logic so that instead of returning when current_size
<= self._last_position you detect a shrink (current_size < self._last_position)
and reset the cursor (set self._last_position = 0 or to current_size for your
desired behavior) so reading resumes from the start of the new file, then
continue processing rather than returning; update the branch surrounding
EVENTS_LOG_PATH.stat().st_size and self._last_position accordingly.

Comment on lines +118 to +122
if self.pr_number is not None:
url = event.get("url", "")
needle = f"/pull/{self.pr_number}"
if needle not in url:
return False
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make the PR filter exact.

needle in url will also match other PRs with the same prefix, e.g. pr_number=12 matches /pull/123. That can route unrelated events into the wrong agent.

Suggested fix
         if self.pr_number is not None:
             url = event.get("url", "")
-            needle = f"/pull/{self.pr_number}"
-            if needle not in url:
+            parts = url.rstrip("/").split("/")
+            try:
+                pull_idx = parts.index("pull")
+            except ValueError:
+                return False
+            if pull_idx + 1 >= len(parts) or parts[pull_idx + 1] != str(self.pr_number):
                 return False
         return True
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if self.pr_number is not None:
url = event.get("url", "")
needle = f"/pull/{self.pr_number}"
if needle not in url:
return False
if self.pr_number is not None:
url = event.get("url", "")
parts = url.rstrip("/").split("/")
try:
pull_idx = parts.index("pull")
except ValueError:
return False
if pull_idx + 1 >= len(parts) or parts[pull_idx + 1] != str(self.pr_number):
return False
return True
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/channel_hub/adapters/github.py` around lines 118 - 122, The
current check in github.py using needle = f"/pull/{self.pr_number}" and "if
needle not in url" is too permissive (e.g., pr_number=12 matches /pull/123);
update the filter to require an exact PR path match by either parsing the event
"url" and comparing path segments so the segment after "pull" equals
str(self.pr_number), or use a stricter pattern such as matching
"/pull/{pr_number}" followed only by end-of-path or a slash (e.g., regex
"/pull/{pr_number}($|/)") so only the intended PR number passes the filter.

Comment on lines +139 to +172
async def uninstall(self, app_id: str) -> dict:
"""Best-effort removal from the HF cache.

The HF cache is a shared content-addressed store — removing one
model doesn't break others that share blobs. We scan the cache's
refs for the model name and remove the symlink directory.
"""
import shutil

cache = Path(self.cache_dir) if self.cache_dir else Path.home() / ".cache" / "huggingface" / "hub"
if not cache.exists():
return {"success": True, "deleted": [], "backend": BACKEND_ID}

deleted: list[str] = []
# HF hub stores models under models--<org>--<name>
safe_name = app_id.replace("/", "--")
for models_dir in [
cache / "models" / safe_name,
cache / f"models--{safe_name}",
]:
if models_dir.exists():
try:
shutil.rmtree(str(models_dir))
deleted.append(str(models_dir))
except OSError as exc:
logger.warning(
"vllm uninstall: could not remove %s: %s", models_dir, exc
)

return {
"success": True,
"deleted": deleted,
"backend": BACKEND_ID,
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find manifest id vs hf_repo to gauge how often they diverge for vllm models.
rg -nP --type=yaml -C3 'hf_repo' | rg -nP -C3 '\bid\s*:'
fd -e yaml -e yml | xargs rg -lP 'backend.*vllm|vllm' 2>/dev/null | head

Repository: jaylfc/tinyagentos

Length of output: 501


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the vllm installer implementation around install/uninstall and any cache-dir computation.
fd -e py vllm_installer.py | xargs -I{} bash -lc 'echo "==== {}"; sed -n "1,260p" "{}"'

# Search for where install() stores/derives hf_repo and what uninstall() receives for app_id.
rg -n "def uninstall|uninstall\(self,|def install|hf_repo|app_id|cache_dir|BACKEND_ID" tinyagentos/installers/vllm_installer.py app-catalog -S

# Find the manifest id field(s) used as app_id in YAML manifests for vllm models.
# (We’re looking for keys like "id:", "app_id:", etc. near hf_repo.)
fd -e yaml -e yml app-catalog | xargs rg -nP 'hf_repo\s*:|^\s*id\s*:|^\s*app_id\s*:|backend\s*:.*vllm|backend\s*:\s*vllm' -n -S | head -n 200

Repository: jaylfc/tinyagentos

Length of output: 19939


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect representative vllm model manifests for hf_repo + id/app_id.
#    Focus on qwen2.5-3b and qwen3-8b since they were listed.
for f in \
  app-catalog/models/qwen2.5-3b/manifest.yaml \
  app-catalog/models/qwen2.5-32b/manifest.yaml \
  app-catalog/models/qwen3-8b/manifest.yaml \
  app-catalog/models/qwen3-14b/manifest.yaml \
  app-catalog/models/qwen3-32b/manifest.yaml \
  app-catalog/models/mixtral-8x7b/manifest.yaml \
  app-catalog/models/llama-3.3-70b/manifest.yaml
do
  echo "==== $f"
  [ -f "$f" ] && rg -n "^(id|backend|hf_repo|hf_revision|variants?|app_id)\b|backend\s*:\s*vllm|hf_repo\s*:" "$f" -n -S || echo "MISSING"
done

# 2) Find all hf_repo occurrences and whether they’re under vllm variants.
rg -n --glob 'app-catalog/**/manifest.yaml' "hf_repo\s*:" app-catalog | head -n 200

# 3) Trace how uninstall() is invoked for installers.
rg -n "uninstall\(" tinyagentos | head -n 200
rg -n "VllmInstaller|BACKEND_ID\s*=\s*\"vllm\"|backend.*vllm" tinyagentos | head -n 200

# 4) Inspect any app registry/uninstall routing that may pass a different id.
rg -n "recycle|delete|remove.*app|uninstall" tinyagentos/routes tinyagentos | head -n 200
rg -n "installed_apps|InstalledApp|app_id.*repo|repo.*app_id" tinyagentos | head -n 200

# 5) Show installed_apps structure to see what fields are available during uninstall.
sed -n '1,220p' tinyagentos/installed_apps.py

Repository: jaylfc/tinyagentos

Length of output: 619


🏁 Script executed:

#!/bin/bash
set -u
# 1) Print relevant vllm manifest fields (id + hf_repo if present) for a few samples.
for f in \
  app-catalog/models/qwen2.5-3b/manifest.yaml \
  app-catalog/models/qwen2.5-32b/manifest.yaml \
  app-catalog/models/qwen3-8b/manifest.yaml \
  app-catalog/models/qwen3-14b/manifest.yaml \
  app-catalog/models/mixtral-8x7b/manifest.yaml \
  app-catalog/models/llama-3.3-70b/manifest.yaml
do
  echo "==== $f"
  [ -f "$f" ] && sed -n '1,120p' "$f" | nl -ba | sed -n '1,120p'
done

# 2) Find where VllmInstaller.uninstall() is invoked.
rg -n "VllmInstaller\(|backend.?vllm|uninstall\(" tinyagentos | head -n 200 || true

# 3) Inspect installed_apps.py to see what fields/uninstall payload contains.
if [ -f tinyagentos/installed_apps.py ]; then
  sed -n '1,220p' tinyagentos/installed_apps.py | nl -ba
fi

# 4) Find delete/recycle routing.
rg -n "recycle|delete.*app|remove.*app|uninstall" tinyagentos/routes | head -n 200 || true

Repository: jaylfc/tinyagentos

Length of output: 6537


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== vllm manifests: show id + variant.hf_repo (if present) =="
for f in \
  app-catalog/models/qwen2.5-3b/manifest.yaml \
  app-catalog/models/qwen3-8b/manifest.yaml \
  app-catalog/models/qwen2.5-32b/manifest.yaml \
  app-catalog/models/mixtral-8x7b/manifest.yaml \
  app-catalog/models/llama-3.3-70b/manifest.yaml
do
  echo "---- $f (first 160 lines) ----"
  if [ -f "$f" ]; then
    head -n 160 "$f" | cat -n
    echo "hf_repo matches:"
    rg -n "hf_repo\s*:" "$f" || echo "  (none)"
    echo
  else
    echo "MISSING $f"
  fi
done

echo "== uninstall call sites (store/store_install) around installer.uninstall(...) =="
sed -n '250,420p' tinyagentos/routes/store.py | cat -n
sed -n '600,770p' tinyagentos/routes/store_install.py | cat -n

echo "== installed_apps.py uninstall(app_id) =="
sed -n '1,140p' tinyagentos/installed_apps.py | cat -n

echo "== VllmInstaller.uninstall implementation already inspected; show it again for context =="
sed -n '120,210p' tinyagentos/installers/vllm_installer.py | cat -n

Repository: jaylfc/tinyagentos

Length of output: 26994


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Where hf_repo is used anywhere in the catalog
rg -n "hf_repo\s*:" app-catalog | head -n 200

# Which manifests reference vllm and might include hf_repo
rg -n "backend.*vllm|id:\s*vllm|-\s*id:\s*vllm" app-catalog/models app-catalog/services -S --glob '*.yaml' | head -n 200

# Inspect the vllm service manifest (could define hf_repo / method)
sed -n '1,200p' app-catalog/services/vllm/manifest.yaml | cat -n

# Inspect another representative model manifest further down to ensure hf_repo isn’t later in the file
sed -n '1,260p' app-catalog/models/qwen2.5-3b/manifest.yaml | cat -n

Repository: jaylfc/tinyagentos

Length of output: 44


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Count/list hf_repo matches =="
rg -n "hf_repo\s*:" app-catalog --glob '*.yaml' || true

echo
echo "== vllm manifests (models/services) mentioning backend vllm =="
rg -n --glob 'app-catalog/**/*.yaml' "backend\s*:\s*vllm|- id:\s*vllm\b|id:\s*vllm\b" app-catalog | head -n 200 || true

echo
echo "== vllm service manifest.yaml (top 240 lines) =="
sed -n '1,240p' app-catalog/services/vllm/manifest.yaml | cat -n || true

echo
echo "== vllm-related model manifest: dump whole variants section for one sample (qwen3-8b) =="
sed -n '1,120p' app-catalog/models/qwen3-8b/manifest.yaml | cat -n
rg -n "hf_repo\s*:|download_url\s*:|requires:\s*$|vllm" app-catalog/models/qwen3-8b/manifest.yaml | cat -n || true

echo
echo "== Also check if any installer method expects hf_repo vs download_url in variants =="
rg -n "hf_repo|download_url" tinyagentos/installers -S | head -n 200

Repository: jaylfc/tinyagentos

Length of output: 6983


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== occurrences of hf_repo anywhere in repository =="

rg -n "hf_repo\s*:" . --glob '!**/.git/**' || true

echo
echo "== occurrences of 'hf_repo' as a substring =="

rg -n "hf_repo" . --glob '!**/.git/**' || true

echo
echo "== check for download_url values that look like HF repo_id (org/name) instead of resolve URLs =="

rg -n "download_url\s*:\s*https://huggingface\.co/[^/]+/[^/]+/resolve/" app-catalog/models app-catalog/services --glob '*.yaml' || true

Repository: jaylfc/tinyagentos

Length of output: 25293


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show where installers call model_installer.install(...), to confirm app_id is manifest.id
rg -n "model_installer\.install\(" tinyagentos/routes/store_install.py tinyagentos/routes/store.py
sed -n '450,760p' tinyagentos/routes/store_install.py | cat -n
sed -n '1,220p' tinyagentos/routes/store_install.py | cat -n

Repository: jaylfc/tinyagentos

Length of output: 25661


🌐 Web query:

Hugging Face hub cache directory structure models--<org>--<name> under ~/.cache/huggingface/hub

💡 Result:

The Hugging Face Hub cache directory, located by default at ~/.cache/huggingface/hub, uses a structured system to manage model, dataset, and space repositories efficiently [1][2][3]. Each repository downloaded from the Hub is stored in a dedicated folder within the cache root, typically named using the format ---- (e.g., models--julien-c--EsperBERTo-small) [1][2][4]. Inside each of these repository folders, the directory structure is consistent, consisting of the following key components [2][4][5]: 1. blobs/: This directory contains the actual file contents, stored in a content-addressed manner. Files here are named based on their commit hash or etag [1][4][5]. Because files are content-addressed, multiple revisions of the same file are not stored redundantly [5]. 2. snapshots/: This directory provides a view of the repository at specific revisions (commit hashes) [6][4]. Each subdirectory within snapshots corresponds to a specific revision and mirrors the original file structure of the repository on the Hub [1][6][4]. To save space, the files within these snapshot directories are symbolic links (symlinks) pointing to the actual data stored in the blobs/ directory [1][4][5]. 3. refs/: This directory maps human-readable references, such as branch names (e.g., main), tags, or pull request numbers, to their corresponding commit hashes [4][5]. 4..no_exist/: This optional directory tracks files that were requested but found not to exist on the Hub, helping to prevent redundant network requests [4][5]. Additionally, a.locks/ directory is located at the root of the cache (not inside individual repository folders) [1][6]. This directory contains lock files that prevent concurrent processes from downloading the same blob simultaneously [1][6][5]. You can customize the cache location by setting environment variables [1][2][7]: - HF_HOME: Sets the root directory for all Hugging Face caches (e.g., $HF_HOME/hub) [1][3]. - HF_HUB_CACHE: Directly specifies the location for the Hub cache, taking precedence over other settings [1][2][3]. If you are inspecting your cache, note that while individual repositories have their own folders, they all share this same internal architecture [2][4].

Citations:


Fix vllm uninstall cache cleanup (derives HF cache key from app_id but install uses HF repo/download URL)

  • VllmInstaller.install() calls snapshot_download with repo_id=repo, where repo = variant.get("hf_repo") or variant.get("download_url"). In the app-catalog vLLM model manifests, vllm variants contain download_url but no hf_repo, so the HF cache directory is keyed off the HF repo/download URL, not manifest.id/app_id.
  • VllmInstaller.uninstall(app_id) instead builds safe_name = app_id.replace("/", "--") and tries to delete hub/models--{safe_name} (and hub/models/{safe_name}, which is not part of the standard HF hub cache layout). This mismatch means cached snapshots/blobs will not be reclaimed.
  • Consider persisting the resolved repo/HF repo_id (or the exact cache repo directory name) during install() and using that persisted value in uninstall() to compute the correct models--<org>--<name> path(s).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/installers/vllm_installer.py` around lines 139 - 172,
VllmInstaller.uninstall currently derives the HF cache key from app_id
(safe_name = app_id.replace("/", "--")) but install() calls snapshot_download
with repo = variant.get("hf_repo") or variant.get("download_url"), so uninstall
must use the same resolved repo string; update VllmInstaller.install to persist
the resolved repo (the local variable repo used when calling snapshot_download)
into the install metadata/state for that app_id, and change
VllmInstaller.uninstall to read that stored repo value (falling back to
variant.download_url if metadata is missing) and compute safe_name =
repo.replace("/", "--") and then remove cache dirs based on that safe_name
(e.g., cache / "models" / safe_name and cache / f"models--{safe_name}"), using
the existing shutil.rmtree handling and logger.

Comment on lines +120 to +135
elif platform == "github":
if not agent_name:
return JSONResponse({"error": "agent_name is required"}, status_code=400)
repo = body.get("repo")
event_kinds = body.get("event_kinds", [])
pr_number = body.get("pr_number")
from tinyagentos.channel_hub.adapters.github import GithubConnector
connector = GithubConnector(
agent_name=agent_name, router=router_obj,
repo=repo, event_kinds=event_kinds, pr_number=pr_number,
)
router_obj.assign_channel(platform, repo, agent_name)
await connector.start()
connectors[connector_key] = connector
request.app.state.channel_hub_connectors = connectors
return {"status": "connected", "platform": platform, "agent_name": agent_name}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Special-case GitHub before the shared token validation.

This branch still sits behind the earlier bot_token_secret/secret lookup checks, so a GitHub connect request without a secret 400s or 404s before it ever reaches this code. GithubConnector doesn't use bot_token, so the new flow is currently blocked by unrelated validation.

Please move the GitHub branch ahead of the shared secret checks, or exempt platform == "github" from that validation path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/channel_hub.py` around lines 120 - 135, The GitHub branch
for creating a GithubConnector (using GithubConnector,
router_obj.assign_channel, connector.start, and storing in
request.app.state.channel_hub_connectors) must run before the shared
secret/bot_token_secret validation; refactor the route handler so that the
platform == "github" case is checked and handled (or explicitly exempted from
the bot_token_secret/secret lookup) prior to the token validation logic, moving
the entire block that constructs GithubConnector and returns the connected
response ahead of or outside the shared-secret checks to avoid 400/404 for
GitHub requests.

@hognek
Copy link
Copy Markdown
Contributor Author

hognek commented May 31, 2026

/kilo-review

Hogne added 7 commits May 31, 2026 14:35
- Add ingest_agent_docs() to knowledge_monitor — walks .md files and ingests
  them into the knowledge store with source_id scoping
- Add ingest_app_guides(), ingest_installed_app_guides(), wipe_app_guides()
- Add list_by_source_id() and delete_by_source_id() to KnowledgeStore
- Hook into install/uninstall routes (store.py and store_install.py) so
  app guides are ingested on install and wiped on uninstall
- Ingest core docs/agents/ and per-app guides on app startup
- 11 tests covering ingestion, idempotency, source_id scoping, and wipe
- Ollama: _BACKEND_TO_METHOD now maps to ollama_pull (reuses
  existing OllamaInstaller which streams POST /api/pull).
  Backward-compatible — get_installer() accepts both 'ollama'
  and 'ollama_pull'.

- llama-cpp: new LlamaCppInstaller downloads GGUF files and
  validates GGUF magic bytes (0x47475546) post-download. Catches
  corrupt/truncated/wrong-format downloads before runtime.

- vLLM: new VllmInstaller uses huggingface_hub.snapshot_download
  to pre-populate the HF cache. Auto-installs huggingface_hub
  via pip when missing.

All entries in _BACKEND_TO_METHOD resolve via get_installer()
without ValueError. Existing tests pass.
…hannel (jaylfc#490)

Fixes kilo-code-bot finding: Channel ID mismatch on line 128.
assign_channel(platform, bot_id, agent_name) was passing agent_name
for both bot_id and agent_name, keying channels by agent identity
instead of repo. Correct key format is now github:<repo>.
…ken store (jaylfc#452)

- Add IdempotencyCache with try_reserve()/set() pattern using asyncio.Event
  sentinels to close the get-then-set race in add_agent and deploy_agent_endpoint
- Add AgentTokensStore with BEGIN IMMEDIATE for multi-worker-safe token issuance
  and clean AgentTokenExistsError on unique constraint violations
- Register idempotency_cache on app.state in both app factory paths
- Resolve pre-existing merge conflict in adapter_manager.py
Add the framework-agnostic AgentState Pydantic model matching the full
spec from the TAOS Framework Integration Bridge design doc, plus YAML
loader/writer with three schema rules enforced:

- No file paths in the model (Reconciler computes them)
- Secrets are opaque refs (id-only, resolved at render time)
- Skills (List[str]) and plugins (List[Plugin]) are separate buckets

Includes 32 tests covering full-spec load, minimal-agent defaults,
round-trip serialization, validation, load_agent_by_name, and the
three design rules.
@hognek hognek force-pushed the feat/per-app-agent-guides-only branch from 9329054 to ed19de8 Compare May 31, 2026 12:35
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (2)
tests/framework_integrations/test_agent_state.py (1)

451-485: ⚡ Quick win

Assert specific exceptions instead of blind Exception.

Ruff flags these pytest.raises(Exception) blocks (B017). A bare Exception will also pass if the test fails for an unrelated reason (e.g., FileNotFoundError, an import error), masking regressions. These cases all surface pydantic.ValidationError, so assert that directly.

💚 Proposed fix
+from pydantic import ValidationError
@@
-        with pytest.raises(Exception):
+        with pytest.raises(ValidationError):
             load_agent(p)
@@
-    def test_missing_required_name(self):
-        with pytest.raises(Exception):
+    def test_missing_required_name(self):
+        with pytest.raises(ValidationError):
             AgentState.model_validate({"framework": "x", "framework_version": "1.0"})
@@
-    def test_missing_framework(self):
-        with pytest.raises(Exception):
+    def test_missing_framework(self):
+        with pytest.raises(ValidationError):
             AgentState.model_validate({"name": "x", "framework_version": "1.0"})
@@
-        with pytest.raises(Exception):
+        with pytest.raises(ValidationError):
             load_agent(p)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/framework_integrations/test_agent_state.py` around lines 451 - 485,
Replace the broad pytest.raises(Exception) assertions in TestValidation tests
with the specific pydantic.ValidationError to avoid masking unrelated failures:
update test_extra_fields_forbidden (when calling load_agent),
test_missing_required_name and test_missing_framework (when calling
AgentState.model_validate), and test_bad_routing_strategy_rejected (when calling
load_agent) to assert pytest.raises(pydantic.ValidationError) instead of
Exception.
tinyagentos/framework_integrations/agent_state.py (1)

275-275: ⚡ Quick win

Consider forbidding extra keys on nested models too.

extra: "forbid" is set only on the top-level AgentState. Nested models (Models, ModelSlot, Sandbox, State, etc.) default to extra: "ignore", so a typo like models.chat.temprature: 0.5 is silently dropped and the default is used — defeating the validation guarantees for a config file consumed by the Reconciler. test_extra_fields_forbidden only exercises the top level, so this gap is untested.

Consider a shared base model that forbids extras everywhere:

♻️ Proposed approach
-from pydantic import BaseModel, Field, field_validator
+from pydantic import BaseModel as _PydanticBaseModel, Field, field_validator
+
+
+class BaseModel(_PydanticBaseModel):
+    model_config = {"extra": "forbid"}

Then the explicit model_config = {"extra": "forbid"} on AgentState can be dropped, and all nested models inherit strict validation.

If lenient nested parsing is intentional (e.g., forward-compat for unknown keys), disregard.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/framework_integrations/agent_state.py` at line 275, AgentState
currently sets model_config = {"extra": "forbid"} only at the top level, but
nested models (Models, ModelSlot, Sandbox, State, etc.) still accept extra keys;
create a shared base class (e.g., BaseModelStrict) that inherits from
pydantic.BaseModel and defines model_config = {"extra": "forbid"}, then update
all nested classes (Models, ModelSlot, Sandbox, State, and AgentState) to
inherit from BaseModelStrict (and remove the explicit model_config on
AgentState), so extra fields are forbidden everywhere; also update/add tests
(e.g., test_extra_fields_forbidden) to assert that typos in nested keys are
rejected.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tinyagentos/agent_tokens_store.py`:
- Around line 14-19: The DB currently stores and returns plaintext tokens
(column token) — change storage to a secure hash/HMAC and stop returning raw
secrets: when issuing a token, generate and return the plaintext once but store
only its salted HMAC (use a keyed HMAC like HMAC-SHA256 with a server secret);
update the insert logic that populates the agent_tokens token column to write
the hash instead of the plaintext; modify revoke() and get_active() to accept a
plaintext token, hash/HMAC it with the same key, and operate/compare against the
stored hash, and ensure get_active() and any other readers return only token
metadata (id, agent_name, created_at, revoked_at) not the plaintext; update any
code paths that previously exposed token strings to use this new behavior.

In `@tinyagentos/routes/activity.py`:
- Around line 355-357: The client-side events array (events) is unbounded and
grows indefinitely; after each insertion (where code calls events.unshift(ev)
and increments eventCount), trim the array to MAX_VISIBLE by removing oldest
entries (e.g., if events.length > MAX_VISIBLE then truncate with events.length =
MAX_VISIBLE or events.splice(MAX_VISIBLE)) so the in-tab buffer stays capped;
apply the same trimming logic in both locations that manipulate events (the
block around the declarations/events.unshift at symbols events, eventCount,
MAX_VISIBLE and the other handler in the 495-503 area) while still updating
eventCount as before.
- Around line 147-154: The SSE replay is duplicating history because
activity_stream() always yields buf.snapshot() for every new connection and
clients reconnect via connectSSE() without providing a cursor; fix by adding
stable event IDs and resuming from a cursor: have event_generator() and
buf.subscribe() accept a starting cursor/last_id (use request headers like
Last-Event-ID or a query param), emit each event with an SSE "id:" field, and
when a reconnect includes last_id, replay only events after that id (or start
subscription from that cursor) instead of replaying the whole buf.snapshot();
update connectSSE() to pass Last-Event-ID on reconnect so the server can avoid
duplicating buffered history.
- Around line 92-115: The module-level _buffer is lazily set only by
get_buffer(), so publish_event() will drop events until the first request
triggers initialization and get_buffer() then causes that single buffer to be
reused across different app instances; fix by initializing or binding the
app-specific buffer at app startup and making get_buffer() and publish_event()
always resolve the buffer from the app state if available: ensure startup code
sets request.app.state.activity_buffer = ActivityBuffer() (or assigns _buffer
from that app state), change get_buffer() to prefer
request.app.state.activity_buffer and avoid permanently caching a single global
_buffer across apps, and update publish_event() to retrieve the current buffer
via get_buffer() (or request.app.state) before publishing so events are not
dropped and buffers are not shared across app instances.

In `@tinyagentos/routes/agents.py`:
- Around line 49-102: The IdempotencyCache can leave reserved keys with their
Event never fired if the owner path returns or raises before calling set(),
causing waiters to block forever; add a terminal method (e.g. complete/abort) on
IdempotencyCache that sets the result (or a sentinel/error) and calls
event.set(), update try_reserve/set/get to document/handle this state, and
modify the request handlers that call IdempotencyCache.try_reserve(...) (the
branches around the existing reserve points) to always call the new terminal
method in a finally block so every reserved key is always resolved regardless of
errors or early returns.
- Around line 165-168: The idempotency cache is currently keyed only by the
Idempotency-Key header (idempotency_key) so different endpoints or different
request bodies can collide; change the key passed to
idempotency_cache.try_reserve to a composite key that includes request.method
and request.url.path and a fingerprint of the request body (e.g., a short hash
of the raw body or normalized JSON) before calling try_reserve (the variables to
update are idempotency_key and the call to idempotency_cache.try_reserve
returning mode, event); apply the same composite-key fix to the other
try_reserve usage in this module so each operation is scoped to method+path+body
fingerprint.

---

Nitpick comments:
In `@tests/framework_integrations/test_agent_state.py`:
- Around line 451-485: Replace the broad pytest.raises(Exception) assertions in
TestValidation tests with the specific pydantic.ValidationError to avoid masking
unrelated failures: update test_extra_fields_forbidden (when calling
load_agent), test_missing_required_name and test_missing_framework (when calling
AgentState.model_validate), and test_bad_routing_strategy_rejected (when calling
load_agent) to assert pytest.raises(pydantic.ValidationError) instead of
Exception.

In `@tinyagentos/framework_integrations/agent_state.py`:
- Line 275: AgentState currently sets model_config = {"extra": "forbid"} only at
the top level, but nested models (Models, ModelSlot, Sandbox, State, etc.) still
accept extra keys; create a shared base class (e.g., BaseModelStrict) that
inherits from pydantic.BaseModel and defines model_config = {"extra": "forbid"},
then update all nested classes (Models, ModelSlot, Sandbox, State, and
AgentState) to inherit from BaseModelStrict (and remove the explicit
model_config on AgentState), so extra fields are forbidden everywhere; also
update/add tests (e.g., test_extra_fields_forbidden) to assert that typos in
nested keys are rejected.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f9a2de47-efdc-4d9f-8971-436f6bee40aa

📥 Commits

Reviewing files that changed from the base of the PR and between 1c065b4 and 9329054.

📒 Files selected for processing (9)
  • tests/framework_integrations/__init__.py
  • tests/framework_integrations/test_agent_state.py
  • tinyagentos/agent_tokens_store.py
  • tinyagentos/app.py
  • tinyagentos/channel_hub/adapter_manager.py
  • tinyagentos/framework_integrations/__init__.py
  • tinyagentos/framework_integrations/agent_state.py
  • tinyagentos/routes/activity.py
  • tinyagentos/routes/agents.py
✅ Files skipped from review due to trivial changes (1)
  • tinyagentos/framework_integrations/init.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tinyagentos/channel_hub/adapter_manager.py

Comment on lines +14 to +19
CREATE TABLE IF NOT EXISTS agent_tokens (
id INTEGER PRIMARY KEY AUTOINCREMENT,
agent_name TEXT NOT NULL,
token TEXT NOT NULL,
created_at TEXT NOT NULL,
revoked_at TEXT
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't persist active tokens in plaintext.

Line 17 stores the raw token, and revoke() / get_active() hand that secret back out of the DB unchanged. Any DB read now exposes a usable credential. Hash/HMAC the token at rest and treat the plaintext value as write-only after issuance.

Also applies to: 101-143

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/agent_tokens_store.py` around lines 14 - 19, The DB currently
stores and returns plaintext tokens (column token) — change storage to a secure
hash/HMAC and stop returning raw secrets: when issuing a token, generate and
return the plaintext once but store only its salted HMAC (use a keyed HMAC like
HMAC-SHA256 with a server secret); update the insert logic that populates the
agent_tokens token column to write the hash instead of the plaintext; modify
revoke() and get_active() to accept a plaintext token, hash/HMAC it with the
same key, and operate/compare against the stored hash, and ensure get_active()
and any other readers return only token metadata (id, agent_name, created_at,
revoked_at) not the plaintext; update any code paths that previously exposed
token strings to use this new behavior.

Comment on lines +92 to +115
# Module-level singleton — populated at startup via app.state
_buffer: Optional[ActivityBuffer] = None


def get_buffer(request: Request) -> ActivityBuffer:
"""Return the activity buffer from app state (lazy init for test compat)."""
global _buffer
if _buffer is not None:
return _buffer
buf = getattr(request.app.state, "activity_buffer", None)
if buf is None:
buf = ActivityBuffer()
request.app.state.activity_buffer = buf
_buffer = buf
return buf

@router.get("/api/activity")
async def activity(request: Request):
"""Rich system activity: CPU cores, NPU cores, thermals, GPU, disk, net, procs."""
import psutil

from tinyagentos.system_stats import (
get_cpu_per_core,
get_disk_io_rate,
get_gpu_load,
get_network_rates,
get_npu_frequency,
get_npu_per_core,
get_thermal_zones,
get_top_processes,
get_vram_usage,
get_zram_stats,

def publish_event(event: dict) -> None:
"""Publish an activity event from anywhere in the codebase.

Safe to call before the buffer is initialised (events are dropped silently).
"""
if _buffer is not None:
_buffer.publish(event)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Initialize the shared buffer independently of the first request.

publish_event() can only see _buffer, but _buffer is assigned lazily from get_buffer(). That means internal publishers drop every event until some request hits this router, and after the first initialization the early return in get_buffer() makes later app instances reuse that original buffer instead of their own app.state.activity_buffer.

💡 One way to keep startup publishers and request handlers on the same buffer
-# Module-level singleton — populated at startup via app.state
-_buffer: Optional[ActivityBuffer] = None
+# Module-level singleton used by internal publishers and attached to app state
+_buffer: ActivityBuffer = ActivityBuffer()
@@
 def get_buffer(request: Request) -> ActivityBuffer:
-    """Return the activity buffer from app state (lazy init for test compat)."""
-    global _buffer
-    if _buffer is not None:
-        return _buffer
-    buf = getattr(request.app.state, "activity_buffer", None)
-    if buf is None:
-        buf = ActivityBuffer()
-        request.app.state.activity_buffer = buf
-        _buffer = buf
-    return buf
+    """Return the shared activity buffer and mirror it onto app state."""
+    global _buffer
+    buf = getattr(request.app.state, "activity_buffer", None)
+    if buf is None:
+        request.app.state.activity_buffer = _buffer
+        return _buffer
+    _buffer = buf
+    return buf
@@
 def publish_event(event: dict) -> None:
@@
-    if _buffer is not None:
-        _buffer.publish(event)
+    _buffer.publish(event)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/activity.py` around lines 92 - 115, The module-level
_buffer is lazily set only by get_buffer(), so publish_event() will drop events
until the first request triggers initialization and get_buffer() then causes
that single buffer to be reused across different app instances; fix by
initializing or binding the app-specific buffer at app startup and making
get_buffer() and publish_event() always resolve the buffer from the app state if
available: ensure startup code sets request.app.state.activity_buffer =
ActivityBuffer() (or assigns _buffer from that app state), change get_buffer()
to prefer request.app.state.activity_buffer and avoid permanently caching a
single global _buffer across apps, and update publish_event() to retrieve the
current buffer via get_buffer() (or request.app.state) before publishing so
events are not dropped and buffers are not shared across app instances.

Comment on lines +147 to +154
async def event_generator():
q = await buf.subscribe()
try:
# Replay existing history first so a freshly-opened tab
# shows recent events immediately.
for event in buf.snapshot():
if _matches_filter(event, worker, model, type):
yield f"data: {json.dumps(event)}\n\n"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Reconnects will duplicate buffered history.

activity_stream() replays snapshot() on every connection, and connectSSE() opens a fresh EventSource after each error without any cursor or dedupe. After a transient disconnect, the client will prepend the same recent events again, so the timeline and count drift upward even when no new activity happened.

Also applies to: 529-545

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/activity.py` around lines 147 - 154, The SSE replay is
duplicating history because activity_stream() always yields buf.snapshot() for
every new connection and clients reconnect via connectSSE() without providing a
cursor; fix by adding stable event IDs and resuming from a cursor: have
event_generator() and buf.subscribe() accept a starting cursor/last_id (use
request headers like Last-Event-ID or a query param), emit each event with an
SSE "id:" field, and when a reconnect includes last_id, replay only events after
that id (or start subscription from that cursor) instead of replaying the whole
buf.snapshot(); update connectSSE() to pass Last-Event-ID on reconnect so the
server can avoid duplicating buffered history.

Comment on lines +355 to +357
let events = [];
let eventCount = 0;
const MAX_VISIBLE = 200;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Cap the browser-side event list.

The server ring buffer is bounded, but events.unshift(ev) grows forever. A tab left open on a busy worker will keep more and more objects in memory and re-filter all of them on each render, which will eventually make the feed sluggish.

✂️ Trim the in-browser buffer as new events arrive
 let events = [];
 let eventCount = 0;
 const MAX_VISIBLE = 200;
+const MAX_EVENTS = 500;
@@
 function addEvent(ev) {
   events.unshift(ev);
+  if (events.length > MAX_EVENTS) {
+    events.length = MAX_EVENTS;
+  }
   eventCount++;
   updateDropdowns(ev);
   if (matchesFilter(ev)) {

Also applies to: 495-503

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/activity.py` around lines 355 - 357, The client-side
events array (events) is unbounded and grows indefinitely; after each insertion
(where code calls events.unshift(ev) and increments eventCount), trim the array
to MAX_VISIBLE by removing oldest entries (e.g., if events.length > MAX_VISIBLE
then truncate with events.length = MAX_VISIBLE or events.splice(MAX_VISIBLE)) so
the in-tab buffer stays capped; apply the same trimming logic in both locations
that manipulate events (the block around the declarations/events.unshift at
symbols events, eventCount, MAX_VISIBLE and the other handler in the 495-503
area) while still updating eventCount as before.

Comment on lines +49 to +102
class IdempotencyCache:
"""In-flight request deduplication cache keyed by Idempotency-Key.

``try_reserve(key)`` returns ``("proceed", event)`` when this caller
should do the work, or ``("wait", event)`` when another caller is
already handling this key — the caller should ``await event.wait()``
and then call ``get(key)`` for the cached result.

``set(key, result)`` stores the result and fires the event so all
waiters can proceed.

This closes the race in the naive get-then-set pattern where
``cache.get()`` releases the lock before the write, letting a
concurrent retry with the same Idempotency-Key create a duplicate.
"""

def __init__(self) -> None:
self._entries: dict[str, tuple[asyncio.Event, dict | None]] = {}

def try_reserve(self, key: str) -> tuple[str, asyncio.Event]:
"""Attempt to reserve *key*.

Returns ``("proceed", event)`` when this caller owns the key
and should perform the work.

Returns ``("wait", event)`` when another caller already reserved
the key — await ``event.wait()`` and call ``get()`` for the
result.
"""
if key in self._entries:
event, _ = self._entries[key]
return ("wait", event)

event = asyncio.Event()
self._entries[key] = (event, None)
return ("proceed", event)

def set(self, key: str, result: dict) -> None:
"""Store *result* and wake all waiters on *key*."""
entry = self._entries.get(key)
if entry is None:
self._entries[key] = (asyncio.Event(), result)
return
event, _ = entry
self._entries[key] = (event, result)
event.set()

def get(self, key: str) -> dict | None:
"""Return the cached result for *key*, or ``None``."""
entry = self._entries.get(key)
if entry is None:
return None
_, result = entry
return result
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Always resolve reserved idempotency keys.

After Line 168 / Line 549 reserves a key, many branches can still return or raise before Line 196 / Line 860 runs. In those cases every retried request with that key blocks forever on event.wait(), and the key stays poisoned until restart. The cache needs a terminal complete/abort path, and the handlers need to call it in a finally.

Possible fix
 class IdempotencyCache:
-    def set(self, key: str, result: dict) -> None:
+    def complete(self, key: str, result: dict) -> None:
         entry = self._entries.get(key)
         if entry is None:
-            self._entries[key] = (asyncio.Event(), result)
+            event = asyncio.Event()
+            event.set()
+            self._entries[key] = (event, result)
             return
         event, _ = entry
         self._entries[key] = (event, result)
         event.set()
+
+    def abort(self, key: str) -> None:
+        entry = self._entries.pop(key, None)
+        if entry is not None:
+            event, _ = entry
+            event.set()
     if idempotency_key and idempotency_cache is not None:
         mode, event = idempotency_cache.try_reserve(idempotency_key)
         if mode == "wait":
             await event.wait()
             return idempotency_cache.get(idempotency_key)
-    ...
-    if idempotency_key and idempotency_cache is not None:
-        idempotency_cache.set(idempotency_key, result)
-    return result
+    try:
+        ...
+        if idempotency_key and idempotency_cache is not None:
+            idempotency_cache.complete(idempotency_key, result)
+        return result
+    except Exception:
+        if idempotency_key and idempotency_cache is not None:
+            idempotency_cache.abort(idempotency_key)
+        raise

Also applies to: 165-171, 195-198, 546-552, 859-862

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 49 - 102, The IdempotencyCache can
leave reserved keys with their Event never fired if the owner path returns or
raises before calling set(), causing waiters to block forever; add a terminal
method (e.g. complete/abort) on IdempotencyCache that sets the result (or a
sentinel/error) and calls event.set(), update try_reserve/set/get to
document/handle this state, and modify the request handlers that call
IdempotencyCache.try_reserve(...) (the branches around the existing reserve
points) to always call the new terminal method in a finally block so every
reserved key is always resolved regardless of errors or early returns.

Comment on lines +165 to +168
idempotency_cache = getattr(request.app.state, "idempotency_cache", None)
idempotency_key = request.headers.get("Idempotency-Key")
if idempotency_key and idempotency_cache is not None:
mode, event = idempotency_cache.try_reserve(idempotency_key)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope the cache key to the operation, not just the header.

app.state.idempotency_cache is shared across both endpoints, but both handlers reserve on the bare Idempotency-Key. Reusing the same header on /api/agents and /api/agents/deploy — or with a different request body on the same route — will return the first request's cached payload and suppress the second operation. Include at least method + path in the cache key, and ideally a request-body fingerprint as well.

Also applies to: 546-549

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/routes/agents.py` around lines 165 - 168, The idempotency cache
is currently keyed only by the Idempotency-Key header (idempotency_key) so
different endpoints or different request bodies can collide; change the key
passed to idempotency_cache.try_reserve to a composite key that includes
request.method and request.url.path and a fingerprint of the request body (e.g.,
a short hash of the raw body or normalized JSON) before calling try_reserve (the
variables to update are idempotency_key and the call to
idempotency_cache.try_reserve returning mode, event); apply the same
composite-key fix to the other try_reserve usage in this module so each
operation is scoped to method+path+body fingerprint.

@hognek
Copy link
Copy Markdown
Contributor Author

hognek commented May 31, 2026

Status: Blocked on #498 (fix activity endpoint flake). The branch feat/per-app-agent-guides-only includes the activity endpoint route. Once #498 merges, this can be rebased and CI should pass. PR is ready to review.

@hognek
Copy link
Copy Markdown
Contributor Author

hognek commented May 31, 2026

Closing — this PR combined 4+ features into one branch (token store, guides ingestion, framework integrations, installer changes).

The concurrency/idempotency work is being tracked in #493 which has proper test coverage. The guides ingestion feature should be a separate focused PR once #493 lands.

Per jaylfc's guidance: one issue per PR, no stacking on unmerged branches.

@hognek hognek closed this May 31, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants