docs(audit): comprehensive codebase audit report (fixes #250) #251
xlabtg merged 4 commits into xlabtg:main
Adding .gitkeep for PR creation (default mode). This file will be removed when the task is complete. Issue: #250
Delivers the audit asked for in issue #250: a read-only sweep of the autonomous core, policy engine, WebUI surface, lifecycle, and memory layer, ranked by severity with concrete remediations.

Four critical findings, most security-relevant:

- Default `restrictedTools` list never matches real tool names, so TON sends bypass escalation.
- `AutonomousTaskManager` is a local variable; never stopped on agent shutdown.
- Pause/resume resets rate-limits and loop-detection (policy bypass).
- Full WebUI auth token printed to stdout at startup.

See AUDIT_REPORT.md for the full list, evidence, and action plan.
Solution summary

Audit delivered. PR #251 is now open (non-draft, mergeable).

Summary of what was delivered

The report catalogs 23 findings across critical/high/medium/low tiers, each with file:line evidence, impact, and remediation. The four critical items are the ones to flag.

Recommendation in the report: 🟠 Conditional Go. Safe read-only, but land the P1 fixes before enabling autonomous mode against a real TON wallet.

This summary was automatically extracted from the AI working session output.
🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost: $3.846644

📊 Context and tokens usage:

- Claude Opus 4.7: total (162.1K + 3.1M cached) input tokens, 23.4K output tokens, $3.163384 cost
- Claude Haiku 4.5: total (194.3K + 3.4M cached) input tokens, 20.5K output tokens, $0.683261 cost

📎 Log file uploaded as Gist (3170KB)

The working session has now ended; feel free to review and add any feedback on the solution draft.
🤖 AI Work Session Started

Starting automated work session at 2026-04-22T19:01:30.137Z

The PR has been converted to draft mode while work is in progress. This comment marks the beginning of an AI work session. Please wait for the session to finish, and provide your feedback.
Solution summary

PR #251 is now back to ready-for-review, mergeable, and clean.

Summary of this session:

Verified the audit deliverable is complete and accurate. The existing

The PR had been auto-converted to draft when this session started; it is now marked ready-for-review again.

PR: #251
🤖 Solution Draft Log

💰 Cost: $0.564772

📊 Context and tokens usage: total (41.9K + 407.0K cached) input tokens, 4.0K output tokens, $0.564772 cost

📎 Log file uploaded as Gist (511KB)
✅ Ready to merge

This pull request is now ready to be merged.

Monitored by hive-mind with --auto-restart-until-mergeable flag
This reverts commit d6b09c0.
Okay, create a ready-made document for each problem, so that an issue can be created from it, and put these in the improvements folder. Create a work folder inside it and put all of the ready-made templates there.
🤖 AI Work Session Started

Starting automated work session at 2026-04-22T19:29:10.604Z

The PR has been converted to draft mode while work is in progress. This comment marks the beginning of an AI work session. Please wait for the session to finish, and provide your feedback.
Per owner request on PR #251, add one self-contained GitHub Issue template per audit finding from AUDIT_REPORT.md under improvements/work/. 23 templates total (4 critical, 7 high, 8 medium, 4 low) plus a README index. Each template carries YAML front-matter (title, labels, milestone, severity, category, effort, priority) and a body with source, description, location, impact, proposed remediation, and acceptance criteria — matching the format requested in issue #250. Refs: #250
Done: the 23 ready-to-file issue templates are now in place. Each file follows the structure requested in #250.

The README also includes a ready-to-run shell loop.
🤖 Solution Draft Log

💰 Cost: $2.257797

📊 Context and tokens usage: total (106.4K + 1.5M cached) input tokens, 33.9K output tokens, $2.257797 cost

📎 Log file uploaded as Gist (973KB)
Mirror the precedent set by AUDIT_REPORT.md (issue xlabtg#250 / PR xlabtg#251) which lives at the repo root. The 24 templates in improvements/work2/ already reference ../../FULL_AUDIT_REPORT.md, and the PR description (and README) link to ./FULL_AUDIT_REPORT.md — both expected the file at the repo root.
Summary
Delivers the audit requested in #250: a thorough, read-only sweep of the
repository looking for real bugs, logical inconsistencies, and conflicts
— with concrete, evidence-backed remediations.
The full report lives at AUDIT_REPORT.md.

Per owner request on this PR, every finding also ships as a
ready-to-file GitHub Issue template under
improvements/work/ (23 templates + index).

Scope covered

- src/autonomous/**: loop, manager, policy engine, goal parser, integration
- src/memory/agent/autonomous-tasks.ts + migration 1.20.0.sql
- src/webui/**: server, setup-server, middleware (auth, CSRF), all routes
- src/index.ts start/stop lifecycle

Headline findings (full list + remediations in the report)
🔴 Critical (4)

- `DEFAULT_POLICY_CONFIG.restrictedTools` uses names ("wallet:send", "contract:deploy", "system:exec") that don't match any real tool. The real tools are `ton_send`, `jetton_send`, `exec`. Result: autonomous tasks can spend TON up to budget without any escalation gate. src/autonomous/policy-engine.ts:34.
- `AutonomousTaskManager` is a local variable inside `startAgent()`; `stopAgent()` never calls `stopAll()`. Autonomous loops outlive shutdown and race the DB close. src/index.ts:333-414 vs src/index.ts:1487-1583.
- Pause/resume re-creates `AutonomousLoop` (and `PolicyEngine`), wiping `recentActions`, rate-limit timestamps, and the uncertain counter. The `toolCallsPerHour` cap, loop-detection, and uncertainty escalation can all be bypassed via pause/resume. src/autonomous/manager.ts:84-126, src/autonomous/loop.ts:68-80.
- The full WebUI auth token is printed to stdout at startup: log.info(`URL: ${url}/auth/exchange?token=${this.authToken}`). A 7-day session token ends up in any aggregated log. src/webui/server.ts:503.

🟠 High (7)

- `JSON.parse` in `rowToTask`/`Checkpoint`/`LogEntry` has no try/catch, so one bad row DoSes the listing (src/memory/agent/autonomous-tasks.ts:119-163).
- `notify()` only logs; escalations never reach the user via Telegram or WebUI (src/autonomous/integration.ts:110-115).
- `deps_planWithTimeout` leaks a `setTimeout` per iteration (src/autonomous/loop.ts:359-370).
- `.finally` handling: paused tasks can land in `failed` (src/autonomous/manager.ts:84-118).
- `task_checkpoints` grows unbounded for active tasks (src/autonomous/loop.ts:306-320).
- `admin_ids[0] ?? 0` silently runs autonomous tools as user 0 when `admin_ids` is empty (src/autonomous/integration.ts:91).
- Setup server (`POST /api/setup/launch`, port 7777) writes the auth token to `config.yaml` unauthenticated and unrate-limited (src/webui/setup-server.ts:132-161).

🟡 Medium (8) / 🟢 Low (4): see the report; includes missing global iteration cap, unbounded rate-limit timestamp arrays, aggressive default TON budget, empty-reason escalations, paused-forever tasks, inconsistent path-traversal checks, management API default host 0.0.0.0, and misc.

Ready-to-file issue templates
Each of the 23 findings now has its own ready-made template under
improvements/work/, with the structure requested in #250.

- Index: improvements/work/README.md
- Critical: audit-c1..c4-*.md
- High: audit-h1..h7-*.md
- Medium: audit-m1..m8-*.md
- Low: audit-l1..l4-*.md

Each file is self-contained: `gh issue create --body-file` (after
stripping the front-matter) will open the corresponding issue. See
improvements/work/README.md for a ready-to-run shell loop.

Risk assessment
🟠 Conditional Go. Safe in read-only configurations today; autonomous
mode touching real TON balances should wait for the P1 fixes
(C1, C2, C3, C4, H1, H2, H7), which are all small changes.
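
To illustrate why C1 counts as a small change, here is a minimal sketch of a startup check that reports restricted-tool names with no matching registered tool. The tool names are the ones quoted in the finding; `findUnmatchedRestrictedTools`, the flat string arrays, and the corrected list are hypothetical stand-ins, not the repository's actual API or the fix chosen in the report.

```typescript
// Hypothetical helper (not from the repository): returns every restricted
// name that does not correspond to a registered tool.
function findUnmatchedRestrictedTools(
  restrictedTools: readonly string[],
  registeredTools: readonly string[],
): string[] {
  const registered = new Set(registeredTools);
  return restrictedTools.filter((toolName) => !registered.has(toolName));
}

// Names as quoted in finding C1.
const defaultRestricted: readonly string[] = [
  "wallet:send",
  "contract:deploy",
  "system:exec",
];
const realTools: readonly string[] = ["ton_send", "jetton_send", "exec"];

// None of the defaults equals a real tool name, so all three are reported.
const unmatched = findUnmatchedRestrictedTools(defaultRestricted, realTools);
console.log(unmatched.length); // prints 3

// A corrected list (assumed here purely for illustration) passes the check.
const correctedRestricted: readonly string[] = ["ton_send", "jetton_send", "exec"];
console.log(findUnmatchedRestrictedTools(correctedRestricted, realTools).length); // prints 0
```

A check of this shape could run once at startup or inside a unit test, failing loudly whenever the returned list is non-empty.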
Methodology
Three parallel focused sub-audits (WebUI/security, autonomous core,
CLI/lifecycle/config). Each finding was spot-checked directly by
reading the referenced lines before inclusion. No runtime execution:
this is static analysis against commit
d6b09c0 on branch issue-250-294bf2f3de08.

Follow-up
Each P1 finding should land as its own PR with a regression test so
the list stays verifiable. The Validation plan section at the bottom
of the report enumerates the concrete checks (unit tests, integration
tests, log grep). The improvements/work/ templates are the ready
inputs for those issues.
Test plan
- improvements/work/ contains 23 templates + README, one per finding

Fixes #250