feat(knowledge): add per-app agent guides ingestion (#451)#490
Conversation
|
Warning Review limit reached
More reviews will be available in 28 minutes and 7 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (19)
📝 WalkthroughWalkthroughThis PR adds two new model installers and installer dispatch updates, implements a markdown-based knowledge ingestion system with store-scoped listing/deletion and app lifecycle wiring (plus tests), introduces a GitHub event adapter and in-process adapter wiring, pins ChangesMulti-feature expansion: model installers, knowledge system, GitHub channel adapter, and QMD pinning
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| pr_number = body.get("pr_number") | ||
| from tinyagentos.channel_hub.adapters.github import GithubConnector | ||
| connector = GithubConnector( | ||
| agent_name=agent_name, router=router_obj, |
There was a problem hiding this comment.
WARNING: Channel ID mismatch - should use repo instead of agent_name
The GithubConnector sets channel_id=repo in the IncomingMessage it routes. For the router to correctly deliver messages to the agent, the channel must be registered with the same channel_id. Currently, it's being registered with channel_id=agent_name, which causes a mismatch and prevents message delivery.
| agent_name=agent_name, router=router_obj, | |
| router_obj.assign_channel(platform, agent_name, repo) |
|
Test comment |
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge |
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge Overview
|
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
|
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge Overview
Files reviewed:
|
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge Overview
Test detailsThis is a test. |
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
|
1 similar comment
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
|
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
Files reviewed:
Reviewed by nemotron-3-super-120b-a12b-20230311:free · 2,399,551 tokens |
…hannel (jaylfc#490) Fixes kilo-code-bot finding: Channel ID mismatch on line 128. assign_channel(platform, bot_id, agent_name) was passing agent_name for both bot_id and agent_name, keying channels by agent identity instead of repo. Correct key format is now github:<repo>.
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (3)
tinyagentos/routes/store_install.py (1)
625-632: ⚡ Quick winPreserve the exception detail when guide ingestion fails.
The blind
except Exceptionswallows the cause, and the warning logs no traceback — a silent ingestion failure will be hard to diagnose. The same applies to the wipe path at Lines 704-711.♻️ Log the exception
- try: - await ingest_app_guides(manifest.id, manifest.manifest_dir, knowledge_store) - except Exception: - logger.warning("Guide ingestion failed for app %s", manifest.id) + try: + await ingest_app_guides(manifest.id, manifest.manifest_dir, knowledge_store) + except Exception: # noqa: BLE001 — best-effort, must not break install + logger.warning("Guide ingestion failed for app %s", manifest.id, exc_info=True)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tinyagentos/routes/store_install.py` around lines 625 - 632, The warning handler around ingest_app_guides currently swallows errors; change the bare "except Exception" to capture the exception (e.g. "except Exception as e") and log the exception details/traceback when calling logger (use logger.exception or logger.warning(..., exc_info=True)) so the error and stack are preserved for manifest.id and manifest.manifest_dir failures; apply the same change to the wipe-path handler that currently mirrors this pattern so both ingest_app_guides and the wipe logic surface exception details.tinyagentos/installers/vllm_installer.py (1)
43-47: ⚡ Quick winInvoke pip via the current interpreter.
A bare
"pip"may not exist on PATH or may target a different environment than the one running the controller. Usesys.executable -m pipso the package lands in the right interpreter.♻️ Use sys.executable
- import asyncio + import asyncio + import sys proc = await asyncio.create_subprocess_exec( - "pip", "install", "huggingface_hub[hf_transfer]", + sys.executable, "-m", "pip", "install", "huggingface_hub[hf_transfer]", stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.STDOUT, )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tinyagentos/installers/vllm_installer.py` around lines 43 - 47, Replace the bare "pip" invocation in the asyncio.create_subprocess_exec call so the package is installed into the current Python interpreter: import sys and invoke sys.executable with "-m", "pip", "install", "huggingface_hub[hf_transfer]" instead of "pip" (update the asyncio.create_subprocess_exec call that creates proc). Keep stdout/stderr handling the same and ensure each argument is a separate string argument to create_subprocess_exec.tinyagentos/knowledge_store.py (1)
266-267: ⚡ Quick winBatch the FTS cleanup deletes for better performance.
The loop issues one DELETE per item, incurring a round-trip for each. Replace the loop with a single batched DELETE using an
INclause.♻️ Proposed refactor
if not ids: return 0 - for item_id in ids: - await self._db.execute("DELETE FROM knowledge_fts WHERE id = ?", (item_id,)) + placeholders = ",".join("?" * len(ids)) + await self._db.execute( + f"DELETE FROM knowledge_fts WHERE id IN ({placeholders})", + ids, + ) cursor = await self._db.execute(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tinyagentos/knowledge_store.py` around lines 266 - 267, The code currently issues a separate DELETE per item in the loop (for item_id in ids: await self._db.execute("DELETE FROM knowledge_fts WHERE id = ?", (item_id,))) which causes many round-trips; replace it with a single batched DELETE using an IN clause: build a parameterized IN tuple from ids and call self._db.execute once against "DELETE FROM knowledge_fts WHERE id IN (...)" (or use placeholders matching len(ids)), handling the empty ids case to skip the query. Ensure you reference the same symbols: ids, self._db.execute, and table knowledge_fts.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tinyagentos/channel_hub/adapter_manager.py`:
- Around line 23-28: The early return when "framework in _CHANNEL_ADAPTERS"
causes start_adapter to skip port allocation/register_adapter so
router.route_message lacks a registered port for agent_name; modify
start_adapter (the function containing the if framework in _CHANNEL_ADAPTERS
block) to either follow the normal adapter startup path instead of returning
early or perform an in-process registration before returning: allocate a port or
call register_adapter(agent_name, port, ...) with the same parameters the normal
startup uses and only return success after register_adapter completes so
router.route_message can route inbound GitHub events correctly.
In `@tinyagentos/channel_hub/adapters/github.py`:
- Around line 118-122: The current check in github.py using needle =
f"/pull/{self.pr_number}" and "if needle not in url" is too permissive (e.g.,
pr_number=12 matches /pull/123); update the filter to require an exact PR path
match by either parsing the event "url" and comparing path segments so the
segment after "pull" equals str(self.pr_number), or use a stricter pattern such
as matching "/pull/{pr_number}" followed only by end-of-path or a slash (e.g.,
regex "/pull/{pr_number}($|/)") so only the intended PR number passes the
filter.
- Around line 91-93: When the events file (EVENTS_LOG_PATH) is truncated or
rotated, current_size can become smaller than self._last_position which causes
the method to return and ignore future events; change the logic so that instead
of returning when current_size <= self._last_position you detect a shrink
(current_size < self._last_position) and reset the cursor (set
self._last_position = 0 or to current_size for your desired behavior) so reading
resumes from the start of the new file, then continue processing rather than
returning; update the branch surrounding EVENTS_LOG_PATH.stat().st_size and
self._last_position accordingly.
In `@tinyagentos/installers/vllm_installer.py`:
- Around line 139-172: VllmInstaller.uninstall currently derives the HF cache
key from app_id (safe_name = app_id.replace("/", "--")) but install() calls
snapshot_download with repo = variant.get("hf_repo") or
variant.get("download_url"), so uninstall must use the same resolved repo
string; update VllmInstaller.install to persist the resolved repo (the local
variable repo used when calling snapshot_download) into the install
metadata/state for that app_id, and change VllmInstaller.uninstall to read that
stored repo value (falling back to variant.download_url if metadata is missing)
and compute safe_name = repo.replace("/", "--") and then remove cache dirs based
on that safe_name (e.g., cache / "models" / safe_name and cache /
f"models--{safe_name}"), using the existing shutil.rmtree handling and logger.
In `@tinyagentos/routes/channel_hub.py`:
- Around line 120-135: The GitHub branch for creating a GithubConnector (using
GithubConnector, router_obj.assign_channel, connector.start, and storing in
request.app.state.channel_hub_connectors) must run before the shared
secret/bot_token_secret validation; refactor the route handler so that the
platform == "github" case is checked and handled (or explicitly exempted from
the bot_token_secret/secret lookup) prior to the token validation logic, moving
the entire block that constructs GithubConnector and returns the connected
response ahead of or outside the shared-secret checks to avoid 400/404 for
GitHub requests.
---
Nitpick comments:
In `@tinyagentos/installers/vllm_installer.py`:
- Around line 43-47: Replace the bare "pip" invocation in the
asyncio.create_subprocess_exec call so the package is installed into the current
Python interpreter: import sys and invoke sys.executable with "-m", "pip",
"install", "huggingface_hub[hf_transfer]" instead of "pip" (update the
asyncio.create_subprocess_exec call that creates proc). Keep stdout/stderr
handling the same and ensure each argument is a separate string argument to
create_subprocess_exec.
In `@tinyagentos/knowledge_store.py`:
- Around line 266-267: The code currently issues a separate DELETE per item in
the loop (for item_id in ids: await self._db.execute("DELETE FROM knowledge_fts
WHERE id = ?", (item_id,))) which causes many round-trips; replace it with a
single batched DELETE using an IN clause: build a parameterized IN tuple from
ids and call self._db.execute once against "DELETE FROM knowledge_fts WHERE id
IN (...)" (or use placeholders matching len(ids)), handling the empty ids case
to skip the query. Ensure you reference the same symbols: ids, self._db.execute,
and table knowledge_fts.
In `@tinyagentos/routes/store_install.py`:
- Around line 625-632: The warning handler around ingest_app_guides currently
swallows errors; change the bare "except Exception" to capture the exception
(e.g. "except Exception as e") and log the exception details/traceback when
calling logger (use logger.exception or logger.warning(..., exc_info=True)) so
the error and stack are preserved for manifest.id and manifest.manifest_dir
failures; apply the same change to the wipe-path handler that currently mirrors
this pattern so both ingest_app_guides and the wipe logic surface exception
details.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 09fc6cf4-cdd4-4e11-a41f-cddf1c32a434
📒 Files selected for processing (17)
docs/agent-qmd-serve-setup.mddocs/design/framework-agnostic-runtime.mdscripts/install-server.shtests/routes/test_store_install_v2.pytests/test_agent_doc_ingestion.pytinyagentos/app.pytinyagentos/channel_hub/adapter_manager.pytinyagentos/channel_hub/adapters/__init__.pytinyagentos/channel_hub/adapters/github.pytinyagentos/installers/base.pytinyagentos/installers/llama_cpp_installer.pytinyagentos/installers/vllm_installer.pytinyagentos/knowledge_monitor.pytinyagentos/knowledge_store.pytinyagentos/routes/channel_hub.pytinyagentos/routes/store.pytinyagentos/routes/store_install.py
| if framework in _CHANNEL_ADAPTERS: | ||
| logger.info( | ||
| "Channel adapter '%s' is managed by channel-hub connect API, " | ||
| "not subprocess", framework, | ||
| ) | ||
| return 0 |
There was a problem hiding this comment.
Don't report GitHub as started before the router can deliver to it.
This early return skips port allocation and never calls register_adapter, but router.route_message() still requires a registered port for agent_name. Inbound GitHub events will therefore hit the "No adapter registered" path and get dropped.
Either keep GitHub on the normal adapter startup path, or add an in-process registration path that satisfies the same routing contract before start_adapter() returns.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/channel_hub/adapter_manager.py` around lines 23 - 28, The early
return when "framework in _CHANNEL_ADAPTERS" causes start_adapter to skip port
allocation/register_adapter so router.route_message lacks a registered port for
agent_name; modify start_adapter (the function containing the if framework in
_CHANNEL_ADAPTERS block) to either follow the normal adapter startup path
instead of returning early or perform an in-process registration before
returning: allocate a port or call register_adapter(agent_name, port, ...) with
the same parameters the normal startup uses and only return success after
register_adapter completes so router.route_message can route inbound GitHub
events correctly.
| current_size = EVENTS_LOG_PATH.stat().st_size | ||
| if current_size <= self._last_position: | ||
| return |
There was a problem hiding this comment.
Reset the cursor when the event log shrinks.
If ~/.taos-gh-events.jsonl is truncated or rotated, current_size becomes smaller than self._last_position and this branch returns forever. From that point on, new events are silently ignored until the file grows past the stale offset.
Suggested fix
current_size = EVENTS_LOG_PATH.stat().st_size
- if current_size <= self._last_position:
+ if current_size < self._last_position:
+ self._last_position = 0
+ if current_size == self._last_position:
return📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| current_size = EVENTS_LOG_PATH.stat().st_size | |
| if current_size <= self._last_position: | |
| return | |
| current_size = EVENTS_LOG_PATH.stat().st_size | |
| if current_size < self._last_position: | |
| self._last_position = 0 | |
| if current_size == self._last_position: | |
| return |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/channel_hub/adapters/github.py` around lines 91 - 93, When the
events file (EVENTS_LOG_PATH) is truncated or rotated, current_size can become
smaller than self._last_position which causes the method to return and ignore
future events; change the logic so that instead of returning when current_size
<= self._last_position you detect a shrink (current_size < self._last_position)
and reset the cursor (set self._last_position = 0 or to current_size for your
desired behavior) so reading resumes from the start of the new file, then
continue processing rather than returning; update the branch surrounding
EVENTS_LOG_PATH.stat().st_size and self._last_position accordingly.
| if self.pr_number is not None: | ||
| url = event.get("url", "") | ||
| needle = f"/pull/{self.pr_number}" | ||
| if needle not in url: | ||
| return False |
There was a problem hiding this comment.
Make the PR filter exact.
needle in url will also match other PRs with the same prefix, e.g. pr_number=12 matches /pull/123. That can route unrelated events into the wrong agent.
Suggested fix
if self.pr_number is not None:
url = event.get("url", "")
- needle = f"/pull/{self.pr_number}"
- if needle not in url:
+ parts = url.rstrip("/").split("/")
+ try:
+ pull_idx = parts.index("pull")
+ except ValueError:
+ return False
+ if pull_idx + 1 >= len(parts) or parts[pull_idx + 1] != str(self.pr_number):
return False
return True📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if self.pr_number is not None: | |
| url = event.get("url", "") | |
| needle = f"/pull/{self.pr_number}" | |
| if needle not in url: | |
| return False | |
| if self.pr_number is not None: | |
| url = event.get("url", "") | |
| parts = url.rstrip("/").split("/") | |
| try: | |
| pull_idx = parts.index("pull") | |
| except ValueError: | |
| return False | |
| if pull_idx + 1 >= len(parts) or parts[pull_idx + 1] != str(self.pr_number): | |
| return False | |
| return True |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/channel_hub/adapters/github.py` around lines 118 - 122, The
current check in github.py using needle = f"/pull/{self.pr_number}" and "if
needle not in url" is too permissive (e.g., pr_number=12 matches /pull/123);
update the filter to require an exact PR path match by either parsing the event
"url" and comparing path segments so the segment after "pull" equals
str(self.pr_number), or use a stricter pattern such as matching
"/pull/{pr_number}" followed only by end-of-path or a slash (e.g., regex
"/pull/{pr_number}($|/)") so only the intended PR number passes the filter.
| async def uninstall(self, app_id: str) -> dict: | ||
| """Best-effort removal from the HF cache. | ||
|
|
||
| The HF cache is a shared content-addressed store — removing one | ||
| model doesn't break others that share blobs. We scan the cache's | ||
| refs for the model name and remove the symlink directory. | ||
| """ | ||
| import shutil | ||
|
|
||
| cache = Path(self.cache_dir) if self.cache_dir else Path.home() / ".cache" / "huggingface" / "hub" | ||
| if not cache.exists(): | ||
| return {"success": True, "deleted": [], "backend": BACKEND_ID} | ||
|
|
||
| deleted: list[str] = [] | ||
| # HF hub stores models under models--<org>--<name> | ||
| safe_name = app_id.replace("/", "--") | ||
| for models_dir in [ | ||
| cache / "models" / safe_name, | ||
| cache / f"models--{safe_name}", | ||
| ]: | ||
| if models_dir.exists(): | ||
| try: | ||
| shutil.rmtree(str(models_dir)) | ||
| deleted.append(str(models_dir)) | ||
| except OSError as exc: | ||
| logger.warning( | ||
| "vllm uninstall: could not remove %s: %s", models_dir, exc | ||
| ) | ||
|
|
||
| return { | ||
| "success": True, | ||
| "deleted": deleted, | ||
| "backend": BACKEND_ID, | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find manifest id vs hf_repo to gauge how often they diverge for vllm models.
rg -nP --type=yaml -C3 'hf_repo' | rg -nP -C3 '\bid\s*:'
fd -e yaml -e yml | xargs rg -lP 'backend.*vllm|vllm' 2>/dev/null | headRepository: jaylfc/tinyagentos
Length of output: 501
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the vllm installer implementation around install/uninstall and any cache-dir computation.
fd -e py vllm_installer.py | xargs -I{} bash -lc 'echo "==== {}"; sed -n "1,260p" "{}"'
# Search for where install() stores/derives hf_repo and what uninstall() receives for app_id.
rg -n "def uninstall|uninstall\(self,|def install|hf_repo|app_id|cache_dir|BACKEND_ID" tinyagentos/installers/vllm_installer.py app-catalog -S
# Find the manifest id field(s) used as app_id in YAML manifests for vllm models.
# (We’re looking for keys like "id:", "app_id:", etc. near hf_repo.)
fd -e yaml -e yml app-catalog | xargs rg -nP 'hf_repo\s*:|^\s*id\s*:|^\s*app_id\s*:|backend\s*:.*vllm|backend\s*:\s*vllm' -n -S | head -n 200Repository: jaylfc/tinyagentos
Length of output: 19939
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect representative vllm model manifests for hf_repo + id/app_id.
# Focus on qwen2.5-3b and qwen3-8b since they were listed.
for f in \
app-catalog/models/qwen2.5-3b/manifest.yaml \
app-catalog/models/qwen2.5-32b/manifest.yaml \
app-catalog/models/qwen3-8b/manifest.yaml \
app-catalog/models/qwen3-14b/manifest.yaml \
app-catalog/models/qwen3-32b/manifest.yaml \
app-catalog/models/mixtral-8x7b/manifest.yaml \
app-catalog/models/llama-3.3-70b/manifest.yaml
do
echo "==== $f"
[ -f "$f" ] && rg -n "^(id|backend|hf_repo|hf_revision|variants?|app_id)\b|backend\s*:\s*vllm|hf_repo\s*:" "$f" -n -S || echo "MISSING"
done
# 2) Find all hf_repo occurrences and whether they’re under vllm variants.
rg -n --glob 'app-catalog/**/manifest.yaml' "hf_repo\s*:" app-catalog | head -n 200
# 3) Trace how uninstall() is invoked for installers.
rg -n "uninstall\(" tinyagentos | head -n 200
rg -n "VllmInstaller|BACKEND_ID\s*=\s*\"vllm\"|backend.*vllm" tinyagentos | head -n 200
# 4) Inspect any app registry/uninstall routing that may pass a different id.
rg -n "recycle|delete|remove.*app|uninstall" tinyagentos/routes tinyagentos | head -n 200
rg -n "installed_apps|InstalledApp|app_id.*repo|repo.*app_id" tinyagentos | head -n 200
# 5) Show installed_apps structure to see what fields are available during uninstall.
sed -n '1,220p' tinyagentos/installed_apps.pyRepository: jaylfc/tinyagentos
Length of output: 619
🏁 Script executed:
#!/bin/bash
set -u
# 1) Print relevant vllm manifest fields (id + hf_repo if present) for a few samples.
for f in \
app-catalog/models/qwen2.5-3b/manifest.yaml \
app-catalog/models/qwen2.5-32b/manifest.yaml \
app-catalog/models/qwen3-8b/manifest.yaml \
app-catalog/models/qwen3-14b/manifest.yaml \
app-catalog/models/mixtral-8x7b/manifest.yaml \
app-catalog/models/llama-3.3-70b/manifest.yaml
do
echo "==== $f"
[ -f "$f" ] && sed -n '1,120p' "$f" | nl -ba | sed -n '1,120p'
done
# 2) Find where VllmInstaller.uninstall() is invoked.
rg -n "VllmInstaller\(|backend.?vllm|uninstall\(" tinyagentos | head -n 200 || true
# 3) Inspect installed_apps.py to see what fields/uninstall payload contains.
if [ -f tinyagentos/installed_apps.py ]; then
sed -n '1,220p' tinyagentos/installed_apps.py | nl -ba
fi
# 4) Find delete/recycle routing.
rg -n "recycle|delete.*app|remove.*app|uninstall" tinyagentos/routes | head -n 200 || trueRepository: jaylfc/tinyagentos
Length of output: 6537
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== vllm manifests: show id + variant.hf_repo (if present) =="
for f in \
app-catalog/models/qwen2.5-3b/manifest.yaml \
app-catalog/models/qwen3-8b/manifest.yaml \
app-catalog/models/qwen2.5-32b/manifest.yaml \
app-catalog/models/mixtral-8x7b/manifest.yaml \
app-catalog/models/llama-3.3-70b/manifest.yaml
do
echo "---- $f (first 160 lines) ----"
if [ -f "$f" ]; then
head -n 160 "$f" | cat -n
echo "hf_repo matches:"
rg -n "hf_repo\s*:" "$f" || echo " (none)"
echo
else
echo "MISSING $f"
fi
done
echo "== uninstall call sites (store/store_install) around installer.uninstall(...) =="
sed -n '250,420p' tinyagentos/routes/store.py | cat -n
sed -n '600,770p' tinyagentos/routes/store_install.py | cat -n
echo "== installed_apps.py uninstall(app_id) =="
sed -n '1,140p' tinyagentos/installed_apps.py | cat -n
echo "== VllmInstaller.uninstall implementation already inspected; show it again for context =="
sed -n '120,210p' tinyagentos/installers/vllm_installer.py | cat -nRepository: jaylfc/tinyagentos
Length of output: 26994
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Where hf_repo is used anywhere in the catalog
rg -n "hf_repo\s*:" app-catalog | head -n 200
# Which manifests reference vllm and might include hf_repo
rg -n "backend.*vllm|id:\s*vllm|-\s*id:\s*vllm" app-catalog/models app-catalog/services -S --glob '*.yaml' | head -n 200
# Inspect the vllm service manifest (could define hf_repo / method)
sed -n '1,200p' app-catalog/services/vllm/manifest.yaml | cat -n
# Inspect another representative model manifest further down to ensure hf_repo isn’t later in the file
sed -n '1,260p' app-catalog/models/qwen2.5-3b/manifest.yaml | cat -nRepository: jaylfc/tinyagentos
Length of output: 44
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Count/list hf_repo matches =="
rg -n "hf_repo\s*:" app-catalog --glob '*.yaml' || true
echo
echo "== vllm manifests (models/services) mentioning backend vllm =="
rg -n --glob 'app-catalog/**/*.yaml' "backend\s*:\s*vllm|- id:\s*vllm\b|id:\s*vllm\b" app-catalog | head -n 200 || true
echo
echo "== vllm service manifest.yaml (top 240 lines) =="
sed -n '1,240p' app-catalog/services/vllm/manifest.yaml | cat -n || true
echo
echo "== vllm-related model manifest: dump whole variants section for one sample (qwen3-8b) =="
sed -n '1,120p' app-catalog/models/qwen3-8b/manifest.yaml | cat -n
rg -n "hf_repo\s*:|download_url\s*:|requires:\s*$|vllm" app-catalog/models/qwen3-8b/manifest.yaml | cat -n || true
echo
echo "== Also check if any installer method expects hf_repo vs download_url in variants =="
rg -n "hf_repo|download_url" tinyagentos/installers -S | head -n 200Repository: jaylfc/tinyagentos
Length of output: 6983
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== occurrences of hf_repo anywhere in repository =="
rg -n "hf_repo\s*:" . --glob '!**/.git/**' || true
echo
echo "== occurrences of 'hf_repo' as a substring =="
rg -n "hf_repo" . --glob '!**/.git/**' || true
echo
echo "== check for download_url values that look like HF repo_id (org/name) instead of resolve URLs =="
rg -n "download_url\s*:\s*https://huggingface\.co/[^/]+/[^/]+/resolve/" app-catalog/models app-catalog/services --glob '*.yaml' || trueRepository: jaylfc/tinyagentos
Length of output: 25293
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show where installers call model_installer.install(...), to confirm app_id is manifest.id
rg -n "model_installer\.install\(" tinyagentos/routes/store_install.py tinyagentos/routes/store.py
sed -n '450,760p' tinyagentos/routes/store_install.py | cat -n
sed -n '1,220p' tinyagentos/routes/store_install.py | cat -nRepository: jaylfc/tinyagentos
Length of output: 25661
🌐 Web query:
Hugging Face hub cache directory structure models--<org>--<name> under ~/.cache/huggingface/hub
💡 Result:
The Hugging Face Hub cache directory, located by default at ~/.cache/huggingface/hub, uses a structured system to manage model, dataset, and space repositories efficiently [1][2][3]. Each repository downloaded from the Hub is stored in a dedicated folder within the cache root, typically named using the format ---- (e.g., models--julien-c--EsperBERTo-small) [1][2][4]. Inside each of these repository folders, the directory structure is consistent, consisting of the following key components [2][4][5]: 1. blobs/: This directory contains the actual file contents, stored in a content-addressed manner. Files here are named based on their commit hash or etag [1][4][5]. Because files are content-addressed, multiple revisions of the same file are not stored redundantly [5]. 2. snapshots/: This directory provides a view of the repository at specific revisions (commit hashes) [6][4]. Each subdirectory within snapshots corresponds to a specific revision and mirrors the original file structure of the repository on the Hub [1][6][4]. To save space, the files within these snapshot directories are symbolic links (symlinks) pointing to the actual data stored in the blobs/ directory [1][4][5]. 3. refs/: This directory maps human-readable references, such as branch names (e.g., main), tags, or pull request numbers, to their corresponding commit hashes [4][5]. 4..no_exist/: This optional directory tracks files that were requested but found not to exist on the Hub, helping to prevent redundant network requests [4][5]. Additionally, a.locks/ directory is located at the root of the cache (not inside individual repository folders) [1][6]. This directory contains lock files that prevent concurrent processes from downloading the same blob simultaneously [1][6][5]. You can customize the cache location by setting environment variables [1][2][7]: - HF_HOME: Sets the root directory for all Hugging Face caches (e.g., $HF_HOME/hub) [1][3]. - HF_HUB_CACHE: Directly specifies the location for the Hub cache, taking precedence over other settings [1][2][3]. If you are inspecting your cache, note that while individual repositories have their own folders, they all share this same internal architecture [2][4].
Citations:
- 1: https://huggingface.co/docs/hub/main/local-cache
- 2: https://huggingface.co/docs/huggingface_hub/en/guides/manage-cache
- 3: https://huggingface.co/docs/datasets/cache
- 4: https://huggingface.co/docs/huggingface_hub/guides/manage-cache
- 5: https://huggingface.co/docs/hub/local-cache
- 6: https://github.com/huggingface/hub-docs/blob/main/docs/hub/local-cache.md
- 7: https://stackoverflow.com/questions/63312859/how-to-change-huggingface-transformers-default-cache-directory
Fix vllm uninstall cache cleanup (derives HF cache key from app_id but install uses HF repo/download URL)
VllmInstaller.install()callssnapshot_downloadwithrepo_id=repo, whererepo = variant.get("hf_repo") or variant.get("download_url"). In theapp-catalogvLLM model manifests, vllm variants containdownload_urlbut nohf_repo, so the HF cache directory is keyed off the HF repo/download URL, notmanifest.id/app_id.VllmInstaller.uninstall(app_id)instead buildssafe_name = app_id.replace("/", "--")and tries to deletehub/models--{safe_name}(andhub/models/{safe_name}, which is not part of the standard HF hub cache layout). This mismatch means cached snapshots/blobs will not be reclaimed.- Consider persisting the resolved
repo/HFrepo_id(or the exact cache repo directory name) duringinstall()and using that persisted value inuninstall()to compute the correctmodels--<org>--<name>path(s).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/installers/vllm_installer.py` around lines 139 - 172,
VllmInstaller.uninstall currently derives the HF cache key from app_id
(safe_name = app_id.replace("/", "--")) but install() calls snapshot_download
with repo = variant.get("hf_repo") or variant.get("download_url"), so uninstall
must use the same resolved repo string; update VllmInstaller.install to persist
the resolved repo (the local variable repo used when calling snapshot_download)
into the install metadata/state for that app_id, and change
VllmInstaller.uninstall to read that stored repo value (falling back to
variant.download_url if metadata is missing) and compute safe_name =
repo.replace("/", "--") and then remove cache dirs based on that safe_name
(e.g., cache / "models" / safe_name and cache / f"models--{safe_name}"), using
the existing shutil.rmtree handling and logger.
| elif platform == "github": | ||
| if not agent_name: | ||
| return JSONResponse({"error": "agent_name is required"}, status_code=400) | ||
| repo = body.get("repo") | ||
| event_kinds = body.get("event_kinds", []) | ||
| pr_number = body.get("pr_number") | ||
| from tinyagentos.channel_hub.adapters.github import GithubConnector | ||
| connector = GithubConnector( | ||
| agent_name=agent_name, router=router_obj, | ||
| repo=repo, event_kinds=event_kinds, pr_number=pr_number, | ||
| ) | ||
| router_obj.assign_channel(platform, repo, agent_name) | ||
| await connector.start() | ||
| connectors[connector_key] = connector | ||
| request.app.state.channel_hub_connectors = connectors | ||
| return {"status": "connected", "platform": platform, "agent_name": agent_name} |
There was a problem hiding this comment.
Special-case GitHub before the shared token validation.
This branch still sits behind the earlier bot_token_secret/secret lookup checks, so a GitHub connect request without a secret 400s or 404s before it ever reaches this code. GithubConnector doesn't use bot_token, so the new flow is currently blocked by unrelated validation.
Please move the GitHub branch ahead of the shared secret checks, or exempt platform == "github" from that validation path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/channel_hub.py` around lines 120 - 135, The GitHub branch
for creating a GithubConnector (using GithubConnector,
router_obj.assign_channel, connector.start, and storing in
request.app.state.channel_hub_connectors) must run before the shared
secret/bot_token_secret validation; refactor the route handler so that the
platform == "github" case is checked and handled (or explicitly exempted from
the bot_token_secret/secret lookup) prior to the token validation logic, moving
the entire block that constructs GithubConnector and returns the connected
response ahead of or outside the shared-secret checks to avoid 400/404 for
GitHub requests.
|
/kilo-review |
- Add ingest_agent_docs() to knowledge_monitor — walks .md files and ingests them into the knowledge store with source_id scoping - Add ingest_app_guides(), ingest_installed_app_guides(), wipe_app_guides() - Add list_by_source_id() and delete_by_source_id() to KnowledgeStore - Hook into install/uninstall routes (store.py and store_install.py) so app guides are ingested on install and wiped on uninstall - Ingest core docs/agents/ and per-app guides on app startup - 11 tests covering ingestion, idempotency, source_id scoping, and wipe
- Ollama: _BACKEND_TO_METHOD now maps to ollama_pull (reuses existing OllamaInstaller which streams POST /api/pull). Backward-compatible — get_installer() accepts both 'ollama' and 'ollama_pull'. - llama-cpp: new LlamaCppInstaller downloads GGUF files and validates GGUF magic bytes (0x47475546) post-download. Catches corrupt/truncated/wrong-format downloads before runtime. - vLLM: new VllmInstaller uses huggingface_hub.snapshot_download to pre-populate the HF cache. Auto-installs huggingface_hub via pip when missing. All entries in _BACKEND_TO_METHOD resolve via get_installer() without ValueError. Existing tests pass.
…hannel (jaylfc#490) Fixes kilo-code-bot finding: Channel ID mismatch on line 128. assign_channel(platform, bot_id, agent_name) was passing agent_name for both bot_id and agent_name, keying channels by agent identity instead of repo. Correct key format is now github:<repo>.
…ken store (jaylfc#452) - Add IdempotencyCache with try_reserve()/set() pattern using asyncio.Event sentinels to close the get-then-set race in add_agent and deploy_agent_endpoint - Add AgentTokensStore with BEGIN IMMEDIATE for multi-worker-safe token issuance and clean AgentTokenExistsError on unique constraint violations - Register idempotency_cache on app.state in both app factory paths - Resolve pre-existing merge conflict in adapter_manager.py
Add the framework-agnostic AgentState Pydantic model matching the full spec from the TAOS Framework Integration Bridge design doc, plus YAML loader/writer with three schema rules enforced: - No file paths in the model (Reconciler computes them) - Secrets are opaque refs (id-only, resolved at render time) - Skills (List[str]) and plugins (List[Plugin]) are separate buckets Includes 32 tests covering full-spec load, minimal-agent defaults, round-trip serialization, validation, load_agent_by_name, and the three design rules.
9329054 to
ed19de8
Compare
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (2)
tests/framework_integrations/test_agent_state.py (1)
451-485: ⚡ Quick winAssert specific exceptions instead of blind
Exception.Ruff flags these
pytest.raises(Exception)blocks (B017). A bareExceptionwill also pass if the test fails for an unrelated reason (e.g.,FileNotFoundError, an import error), masking regressions. These cases all surfacepydantic.ValidationError, so assert that directly.💚 Proposed fix
+from pydantic import ValidationError @@ - with pytest.raises(Exception): + with pytest.raises(ValidationError): load_agent(p) @@ - def test_missing_required_name(self): - with pytest.raises(Exception): + def test_missing_required_name(self): + with pytest.raises(ValidationError): AgentState.model_validate({"framework": "x", "framework_version": "1.0"}) @@ - def test_missing_framework(self): - with pytest.raises(Exception): + def test_missing_framework(self): + with pytest.raises(ValidationError): AgentState.model_validate({"name": "x", "framework_version": "1.0"}) @@ - with pytest.raises(Exception): + with pytest.raises(ValidationError): load_agent(p)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/framework_integrations/test_agent_state.py` around lines 451 - 485, Replace the broad pytest.raises(Exception) assertions in TestValidation tests with the specific pydantic.ValidationError to avoid masking unrelated failures: update test_extra_fields_forbidden (when calling load_agent), test_missing_required_name and test_missing_framework (when calling AgentState.model_validate), and test_bad_routing_strategy_rejected (when calling load_agent) to assert pytest.raises(pydantic.ValidationError) instead of Exception.tinyagentos/framework_integrations/agent_state.py (1)
275-275: ⚡ Quick winConsider forbidding extra keys on nested models too.
extra: "forbid"is set only on the top-levelAgentState. Nested models (Models,ModelSlot,Sandbox,State, etc.) default toextra: "ignore", so a typo likemodels.chat.temprature: 0.5is silently dropped and the default is used — defeating the validation guarantees for a config file consumed by the Reconciler.test_extra_fields_forbiddenonly exercises the top level, so this gap is untested.Consider a shared base model that forbids extras everywhere:
♻️ Proposed approach
-from pydantic import BaseModel, Field, field_validator +from pydantic import BaseModel as _PydanticBaseModel, Field, field_validator + + +class BaseModel(_PydanticBaseModel): + model_config = {"extra": "forbid"}Then the explicit
model_config = {"extra": "forbid"}onAgentStatecan be dropped, and all nested models inherit strict validation.If lenient nested parsing is intentional (e.g., forward-compat for unknown keys), disregard.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tinyagentos/framework_integrations/agent_state.py` at line 275, AgentState currently sets model_config = {"extra": "forbid"} only at the top level, but nested models (Models, ModelSlot, Sandbox, State, etc.) still accept extra keys; create a shared base class (e.g., BaseModelStrict) that inherits from pydantic.BaseModel and defines model_config = {"extra": "forbid"}, then update all nested classes (Models, ModelSlot, Sandbox, State, and AgentState) to inherit from BaseModelStrict (and remove the explicit model_config on AgentState), so extra fields are forbidden everywhere; also update/add tests (e.g., test_extra_fields_forbidden) to assert that typos in nested keys are rejected.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tinyagentos/agent_tokens_store.py`:
- Around line 14-19: The DB currently stores and returns plaintext tokens
(column token) — change storage to a secure hash/HMAC and stop returning raw
secrets: when issuing a token, generate and return the plaintext once but store
only its salted HMAC (use a keyed HMAC like HMAC-SHA256 with a server secret);
update the insert logic that populates the agent_tokens token column to write
the hash instead of the plaintext; modify revoke() and get_active() to accept a
plaintext token, hash/HMAC it with the same key, and operate/compare against the
stored hash, and ensure get_active() and any other readers return only token
metadata (id, agent_name, created_at, revoked_at) not the plaintext; update any
code paths that previously exposed token strings to use this new behavior.
In `@tinyagentos/routes/activity.py`:
- Around line 355-357: The client-side events array (events) is unbounded and
grows indefinitely; after each insertion (where code calls events.unshift(ev)
and increments eventCount), trim the array to MAX_VISIBLE by removing oldest
entries (e.g., if events.length > MAX_VISIBLE then truncate with events.length =
MAX_VISIBLE or events.splice(MAX_VISIBLE)) so the in-tab buffer stays capped;
apply the same trimming logic in both locations that manipulate events (the
block around the declarations/events.unshift at symbols events, eventCount,
MAX_VISIBLE and the other handler in the 495-503 area) while still updating
eventCount as before.
- Around line 147-154: The SSE replay is duplicating history because
activity_stream() always yields buf.snapshot() for every new connection and
clients reconnect via connectSSE() without providing a cursor; fix by adding
stable event IDs and resuming from a cursor: have event_generator() and
buf.subscribe() accept a starting cursor/last_id (use request headers like
Last-Event-ID or a query param), emit each event with an SSE "id:" field, and
when a reconnect includes last_id, replay only events after that id (or start
subscription from that cursor) instead of replaying the whole buf.snapshot();
update connectSSE() to pass Last-Event-ID on reconnect so the server can avoid
duplicating buffered history.
- Around line 92-115: The module-level _buffer is lazily set only by
get_buffer(), so publish_event() will drop events until the first request
triggers initialization and get_buffer() then causes that single buffer to be
reused across different app instances; fix by initializing or binding the
app-specific buffer at app startup and making get_buffer() and publish_event()
always resolve the buffer from the app state if available: ensure startup code
sets request.app.state.activity_buffer = ActivityBuffer() (or assigns _buffer
from that app state), change get_buffer() to prefer
request.app.state.activity_buffer and avoid permanently caching a single global
_buffer across apps, and update publish_event() to retrieve the current buffer
via get_buffer() (or request.app.state) before publishing so events are not
dropped and buffers are not shared across app instances.
In `@tinyagentos/routes/agents.py`:
- Around line 49-102: The IdempotencyCache can leave reserved keys with their
Event never fired if the owner path returns or raises before calling set(),
causing waiters to block forever; add a terminal method (e.g. complete/abort) on
IdempotencyCache that sets the result (or a sentinel/error) and calls
event.set(), update try_reserve/set/get to document/handle this state, and
modify the request handlers that call IdempotencyCache.try_reserve(...) (the
branches around the existing reserve points) to always call the new terminal
method in a finally block so every reserved key is always resolved regardless of
errors or early returns.
- Around line 165-168: The idempotency cache is currently keyed only by the
Idempotency-Key header (idempotency_key) so different endpoints or different
request bodies can collide; change the key passed to
idempotency_cache.try_reserve to a composite key that includes request.method
and request.url.path and a fingerprint of the request body (e.g., a short hash
of the raw body or normalized JSON) before calling try_reserve (the variables to
update are idempotency_key and the call to idempotency_cache.try_reserve
returning mode, event); apply the same composite-key fix to the other
try_reserve usage in this module so each operation is scoped to method+path+body
fingerprint.
---
Nitpick comments:
In `@tests/framework_integrations/test_agent_state.py`:
- Around line 451-485: Replace the broad pytest.raises(Exception) assertions in
TestValidation tests with the specific pydantic.ValidationError to avoid masking
unrelated failures: update test_extra_fields_forbidden (when calling
load_agent), test_missing_required_name and test_missing_framework (when calling
AgentState.model_validate), and test_bad_routing_strategy_rejected (when calling
load_agent) to assert pytest.raises(pydantic.ValidationError) instead of
Exception.
In `@tinyagentos/framework_integrations/agent_state.py`:
- Line 275: AgentState currently sets model_config = {"extra": "forbid"} only at
the top level, but nested models (Models, ModelSlot, Sandbox, State, etc.) still
accept extra keys; create a shared base class (e.g., BaseModelStrict) that
inherits from pydantic.BaseModel and defines model_config = {"extra": "forbid"},
then update all nested classes (Models, ModelSlot, Sandbox, State, and
AgentState) to inherit from BaseModelStrict (and remove the explicit
model_config on AgentState), so extra fields are forbidden everywhere; also
update/add tests (e.g., test_extra_fields_forbidden) to assert that typos in
nested keys are rejected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: f9a2de47-efdc-4d9f-8971-436f6bee40aa
📒 Files selected for processing (9)
tests/framework_integrations/__init__.pytests/framework_integrations/test_agent_state.pytinyagentos/agent_tokens_store.pytinyagentos/app.pytinyagentos/channel_hub/adapter_manager.pytinyagentos/framework_integrations/__init__.pytinyagentos/framework_integrations/agent_state.pytinyagentos/routes/activity.pytinyagentos/routes/agents.py
✅ Files skipped from review due to trivial changes (1)
- tinyagentos/framework_integrations/init.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tinyagentos/channel_hub/adapter_manager.py
| CREATE TABLE IF NOT EXISTS agent_tokens ( | ||
| id INTEGER PRIMARY KEY AUTOINCREMENT, | ||
| agent_name TEXT NOT NULL, | ||
| token TEXT NOT NULL, | ||
| created_at TEXT NOT NULL, | ||
| revoked_at TEXT |
There was a problem hiding this comment.
Don't persist active tokens in plaintext.
Line 17 stores the raw token, and revoke() / get_active() hand that secret back out of the DB unchanged. Any DB read now exposes a usable credential. Hash/HMAC the token at rest and treat the plaintext value as write-only after issuance.
Also applies to: 101-143
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/agent_tokens_store.py` around lines 14 - 19, The DB currently
stores and returns plaintext tokens (column token) — change storage to a secure
hash/HMAC and stop returning raw secrets: when issuing a token, generate and
return the plaintext once but store only its salted HMAC (use a keyed HMAC like
HMAC-SHA256 with a server secret); update the insert logic that populates the
agent_tokens token column to write the hash instead of the plaintext; modify
revoke() and get_active() to accept a plaintext token, hash/HMAC it with the
same key, and operate/compare against the stored hash, and ensure get_active()
and any other readers return only token metadata (id, agent_name, created_at,
revoked_at) not the plaintext; update any code paths that previously exposed
token strings to use this new behavior.
| # Module-level singleton — populated at startup via app.state | ||
| _buffer: Optional[ActivityBuffer] = None | ||
|
|
||
|
|
||
| def get_buffer(request: Request) -> ActivityBuffer: | ||
| """Return the activity buffer from app state (lazy init for test compat).""" | ||
| global _buffer | ||
| if _buffer is not None: | ||
| return _buffer | ||
| buf = getattr(request.app.state, "activity_buffer", None) | ||
| if buf is None: | ||
| buf = ActivityBuffer() | ||
| request.app.state.activity_buffer = buf | ||
| _buffer = buf | ||
| return buf | ||
|
|
||
| @router.get("/api/activity") | ||
| async def activity(request: Request): | ||
| """Rich system activity: CPU cores, NPU cores, thermals, GPU, disk, net, procs.""" | ||
| import psutil | ||
|
|
||
| from tinyagentos.system_stats import ( | ||
| get_cpu_per_core, | ||
| get_disk_io_rate, | ||
| get_gpu_load, | ||
| get_network_rates, | ||
| get_npu_frequency, | ||
| get_npu_per_core, | ||
| get_thermal_zones, | ||
| get_top_processes, | ||
| get_vram_usage, | ||
| get_zram_stats, | ||
|
|
||
| def publish_event(event: dict) -> None: | ||
| """Publish an activity event from anywhere in the codebase. | ||
|
|
||
| Safe to call before the buffer is initialised (events are dropped silently). | ||
| """ | ||
| if _buffer is not None: | ||
| _buffer.publish(event) |
There was a problem hiding this comment.
Initialize the shared buffer independently of the first request.
publish_event() can only see _buffer, but _buffer is assigned lazily from get_buffer(). That means internal publishers drop every event until some request hits this router, and after the first initialization the early return in get_buffer() makes later app instances reuse that original buffer instead of their own app.state.activity_buffer.
💡 One way to keep startup publishers and request handlers on the same buffer
-# Module-level singleton — populated at startup via app.state
-_buffer: Optional[ActivityBuffer] = None
+# Module-level singleton used by internal publishers and attached to app state
+_buffer: ActivityBuffer = ActivityBuffer()
@@
def get_buffer(request: Request) -> ActivityBuffer:
- """Return the activity buffer from app state (lazy init for test compat)."""
- global _buffer
- if _buffer is not None:
- return _buffer
- buf = getattr(request.app.state, "activity_buffer", None)
- if buf is None:
- buf = ActivityBuffer()
- request.app.state.activity_buffer = buf
- _buffer = buf
- return buf
+ """Return the shared activity buffer and mirror it onto app state."""
+ global _buffer
+ buf = getattr(request.app.state, "activity_buffer", None)
+ if buf is None:
+ request.app.state.activity_buffer = _buffer
+ return _buffer
+ _buffer = buf
+ return buf
@@
def publish_event(event: dict) -> None:
@@
- if _buffer is not None:
- _buffer.publish(event)
+ _buffer.publish(event)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/activity.py` around lines 92 - 115, The module-level
_buffer is lazily set only by get_buffer(), so publish_event() will drop events
until the first request triggers initialization and get_buffer() then causes
that single buffer to be reused across different app instances; fix by
initializing or binding the app-specific buffer at app startup and making
get_buffer() and publish_event() always resolve the buffer from the app state if
available: ensure startup code sets request.app.state.activity_buffer =
ActivityBuffer() (or assigns _buffer from that app state), change get_buffer()
to prefer request.app.state.activity_buffer and avoid permanently caching a
single global _buffer across apps, and update publish_event() to retrieve the
current buffer via get_buffer() (or request.app.state) before publishing so
events are not dropped and buffers are not shared across app instances.
| async def event_generator(): | ||
| q = await buf.subscribe() | ||
| try: | ||
| # Replay existing history first so a freshly-opened tab | ||
| # shows recent events immediately. | ||
| for event in buf.snapshot(): | ||
| if _matches_filter(event, worker, model, type): | ||
| yield f"data: {json.dumps(event)}\n\n" |
There was a problem hiding this comment.
Reconnects will duplicate buffered history.
activity_stream() replays snapshot() on every connection, and connectSSE() opens a fresh EventSource after each error without any cursor or dedupe. After a transient disconnect, the client will prepend the same recent events again, so the timeline and count drift upward even when no new activity happened.
Also applies to: 529-545
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/activity.py` around lines 147 - 154, The SSE replay is
duplicating history because activity_stream() always yields buf.snapshot() for
every new connection and clients reconnect via connectSSE() without providing a
cursor; fix by adding stable event IDs and resuming from a cursor: have
event_generator() and buf.subscribe() accept a starting cursor/last_id (use
request headers like Last-Event-ID or a query param), emit each event with an
SSE "id:" field, and when a reconnect includes last_id, replay only events after
that id (or start subscription from that cursor) instead of replaying the whole
buf.snapshot(); update connectSSE() to pass Last-Event-ID on reconnect so the
server can avoid duplicating buffered history.
| let events = []; | ||
| let eventCount = 0; | ||
| const MAX_VISIBLE = 200; |
There was a problem hiding this comment.
Cap the browser-side event list.
The server ring buffer is bounded, but events.unshift(ev) grows forever. A tab left open on a busy worker will keep more and more objects in memory and re-filter all of them on each render, which will eventually make the feed sluggish.
✂️ Trim the in-browser buffer as new events arrive
let events = [];
let eventCount = 0;
const MAX_VISIBLE = 200;
+const MAX_EVENTS = 500;
@@
function addEvent(ev) {
events.unshift(ev);
+ if (events.length > MAX_EVENTS) {
+ events.length = MAX_EVENTS;
+ }
eventCount++;
updateDropdowns(ev);
if (matchesFilter(ev)) {Also applies to: 495-503
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/activity.py` around lines 355 - 357, The client-side
events array (events) is unbounded and grows indefinitely; after each insertion
(where code calls events.unshift(ev) and increments eventCount), trim the array
to MAX_VISIBLE by removing oldest entries (e.g., if events.length > MAX_VISIBLE
then truncate with events.length = MAX_VISIBLE or events.splice(MAX_VISIBLE)) so
the in-tab buffer stays capped; apply the same trimming logic in both locations
that manipulate events (the block around the declarations/events.unshift at
symbols events, eventCount, MAX_VISIBLE and the other handler in the 495-503
area) while still updating eventCount as before.
| class IdempotencyCache: | ||
| """In-flight request deduplication cache keyed by Idempotency-Key. | ||
|
|
||
| ``try_reserve(key)`` returns ``("proceed", event)`` when this caller | ||
| should do the work, or ``("wait", event)`` when another caller is | ||
| already handling this key — the caller should ``await event.wait()`` | ||
| and then call ``get(key)`` for the cached result. | ||
|
|
||
| ``set(key, result)`` stores the result and fires the event so all | ||
| waiters can proceed. | ||
|
|
||
| This closes the race in the naive get-then-set pattern where | ||
| ``cache.get()`` releases the lock before the write, letting a | ||
| concurrent retry with the same Idempotency-Key create a duplicate. | ||
| """ | ||
|
|
||
| def __init__(self) -> None: | ||
| self._entries: dict[str, tuple[asyncio.Event, dict | None]] = {} | ||
|
|
||
| def try_reserve(self, key: str) -> tuple[str, asyncio.Event]: | ||
| """Attempt to reserve *key*. | ||
|
|
||
| Returns ``("proceed", event)`` when this caller owns the key | ||
| and should perform the work. | ||
|
|
||
| Returns ``("wait", event)`` when another caller already reserved | ||
| the key — await ``event.wait()`` and call ``get()`` for the | ||
| result. | ||
| """ | ||
| if key in self._entries: | ||
| event, _ = self._entries[key] | ||
| return ("wait", event) | ||
|
|
||
| event = asyncio.Event() | ||
| self._entries[key] = (event, None) | ||
| return ("proceed", event) | ||
|
|
||
| def set(self, key: str, result: dict) -> None: | ||
| """Store *result* and wake all waiters on *key*.""" | ||
| entry = self._entries.get(key) | ||
| if entry is None: | ||
| self._entries[key] = (asyncio.Event(), result) | ||
| return | ||
| event, _ = entry | ||
| self._entries[key] = (event, result) | ||
| event.set() | ||
|
|
||
| def get(self, key: str) -> dict | None: | ||
| """Return the cached result for *key*, or ``None``.""" | ||
| entry = self._entries.get(key) | ||
| if entry is None: | ||
| return None | ||
| _, result = entry | ||
| return result |
There was a problem hiding this comment.
Always resolve reserved idempotency keys.
After Line 168 / Line 549 reserves a key, many branches can still return or raise before Line 196 / Line 860 runs. In those cases every retried request with that key blocks forever on event.wait(), and the key stays poisoned until restart. The cache needs a terminal complete/abort path, and the handlers need to call it in a finally.
Possible fix
class IdempotencyCache:
- def set(self, key: str, result: dict) -> None:
+ def complete(self, key: str, result: dict) -> None:
entry = self._entries.get(key)
if entry is None:
- self._entries[key] = (asyncio.Event(), result)
+ event = asyncio.Event()
+ event.set()
+ self._entries[key] = (event, result)
return
event, _ = entry
self._entries[key] = (event, result)
event.set()
+
+ def abort(self, key: str) -> None:
+ entry = self._entries.pop(key, None)
+ if entry is not None:
+ event, _ = entry
+ event.set() if idempotency_key and idempotency_cache is not None:
mode, event = idempotency_cache.try_reserve(idempotency_key)
if mode == "wait":
await event.wait()
return idempotency_cache.get(idempotency_key)
- ...
- if idempotency_key and idempotency_cache is not None:
- idempotency_cache.set(idempotency_key, result)
- return result
+ try:
+ ...
+ if idempotency_key and idempotency_cache is not None:
+ idempotency_cache.complete(idempotency_key, result)
+ return result
+ except Exception:
+ if idempotency_key and idempotency_cache is not None:
+ idempotency_cache.abort(idempotency_key)
+ raiseAlso applies to: 165-171, 195-198, 546-552, 859-862
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/agents.py` around lines 49 - 102, The IdempotencyCache can
leave reserved keys with their Event never fired if the owner path returns or
raises before calling set(), causing waiters to block forever; add a terminal
method (e.g. complete/abort) on IdempotencyCache that sets the result (or a
sentinel/error) and calls event.set(), update try_reserve/set/get to
document/handle this state, and modify the request handlers that call
IdempotencyCache.try_reserve(...) (the branches around the existing reserve
points) to always call the new terminal method in a finally block so every
reserved key is always resolved regardless of errors or early returns.
| idempotency_cache = getattr(request.app.state, "idempotency_cache", None) | ||
| idempotency_key = request.headers.get("Idempotency-Key") | ||
| if idempotency_key and idempotency_cache is not None: | ||
| mode, event = idempotency_cache.try_reserve(idempotency_key) |
There was a problem hiding this comment.
Scope the cache key to the operation, not just the header.
app.state.idempotency_cache is shared across both endpoints, but both handlers reserve on the bare Idempotency-Key. Reusing the same header on /api/agents and /api/agents/deploy — or with a different request body on the same route — will return the first request's cached payload and suppress the second operation. Include at least method + path in the cache key, and ideally a request-body fingerprint as well.
Also applies to: 546-549
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tinyagentos/routes/agents.py` around lines 165 - 168, The idempotency cache
is currently keyed only by the Idempotency-Key header (idempotency_key) so
different endpoints or different request bodies can collide; change the key
passed to idempotency_cache.try_reserve to a composite key that includes
request.method and request.url.path and a fingerprint of the request body (e.g.,
a short hash of the raw body or normalized JSON) before calling try_reserve (the
variables to update are idempotency_key and the call to
idempotency_cache.try_reserve returning mode, event); apply the same
composite-key fix to the other try_reserve usage in this module so each
operation is scoped to method+path+body fingerprint.
|
Closing — this PR combined 4+ features into one branch (token store, guides ingestion, framework integrations, installer changes). The concurrency/idempotency work is being tracked in #493 which has proper test coverage. The guides ingestion feature should be a separate focused PR once #493 lands. Per jaylfc's guidance: one issue per PR, no stacking on unmerged branches. |
Extends ingest_agent_docs for per-app guides. Hooks into app install/uninstall lifecycle so installed apps' guides are immediately available. 18 tests pass.\n\nCloses #451
Summary by CodeRabbit
New Features
Documentation
Tests
Bug Fixes