App Builder - Unit and Integration tests by eshurakov · Pull Request #7 · Kilo-Org/cloud

eshurakov · 2026-02-04T09:32:19Z

This pull request introduces improvements to the Cloudflare App Builder project focusing on enhanced test coverage.

Testing & CI Improvements

Added integration tests for the App Builder.

API Schema Additions

Introduced new API schemas for the /tree and /blob endpoints, defining types and validation for tree and blob responses.

Migrated from Kilo-Org/OLD-UNUSED-kilocode-dnekcab#4482

marius-kilocode · 2026-02-09T12:14:07Z

+   * @param path - Optional path to a subdirectory (defaults to root)
+   * @returns Array of tree entries with name, type, oid, and mode
+   */
+  async getTree(


This code is duplicated, can we reuse the git implementation from here?

marius-kilocode · 2026-02-09T12:15:19Z

+   * @param path - Path to the file
+   * @returns File content (utf-8 or base64 encoded), encoding type, size, and blob SHA
+   */
+  async getBlob(


Also duplicated and the return types are mismatching?

marius-kilocode · 2026-02-09T12:22:10Z

+        message: error instanceof Error ? error.message : 'Path not found',
+      }),
+      {
+        status: 404,


This will return 404 also for 'Error(Path not found: ${normalizedPath})' and 'Error(Path is not a directory: ${normalizedPath})'. It depends on the quality we want for this API, but an optional improvement would be to handle not found cases explicitly by cascading the error from getTree

marius-kilocode · 2026-02-09T12:22:31Z

+        message: error instanceof Error ? error.message : 'File not found',
+      }),
+      {
+        status: 404,


See comment above. Same thing here.

…g in handlers

kilo-code-bot · 2026-02-09T14:17:22Z

+      // Handle tree requests (GET /apps/{app_id}/tree/{ref})
+      const treeMatch = pathname.match(TREE_PATTERN);
+      if (treeMatch && request.method === 'GET') {
+        const decodedRef = decodeURIComponent(treeMatch[2]);


WARNING: decodeURIComponent() can throw on malformed percent-encoding

If a client sends an invalid URL escape sequence in the {ref} segment (e.g. %E0), decodeURIComponent() throws URIError: URI malformed, which will bubble up as an unhandled exception (500). Consider guarding decoding and returning a 400 for invalid encoding.

kilo-code-bot · 2026-02-09T14:17:36Z

Code Review Summary

Status: 1 Issues Found | Recommendation: Address before merge

Overview

Severity	Count
CRITICAL	0
WARNING	1
SUGGESTION	0

Issue Details (click to expand)

WARNING

File	Line	Issue
`cloudflare-app-builder/src/index.ts`	143	`decodeURIComponent()` can throw on malformed percent-encoding (unhandled 500 risk)

Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

File	Line	Issue
`cloudflare-app-builder/src/git/git.ts`	432	`console.log` usage in library code (may be noisy in production logs)

Files Reviewed (6 files)

cloudflare-app-builder/src/index.ts - 1 issue
cloudflare-app-builder/src/handlers/files.ts - 0 issues
cloudflare-app-builder/src/handlers/git-protocol.ts - 0 issues
cloudflare-app-builder/src/git-repository-do.ts - 0 issues
cloudflare-app-builder/src/types.ts - 0 issues
cloudflare-app-builder/package.json - 0 issues

Fix these issues in Kilo Cloud

eshurakov · 2026-02-09T21:01:44Z

thank you @marius-kilocode, PR is ready for another pass

marius-kilocode · 2026-02-09T12:24:12Z

+    "@types/better-sqlite3": "^7.6.13",
+    "@types/jsonwebtoken": "^9.0.9",
    "@types/node": "^22.10.1",
+    "better-sqlite3": "^12.6.0",


We can already use: 12.6.2, same applies for the other new deps and their patch releases.

Still the old version?

marius-kilocode · 2026-02-09T12:25:14Z

+    try {
+      db.close();
+    } catch {
+      // Ignore - might already be closed


Should we not log this?

marius-kilocode · 2026-02-09T12:26:02Z

@@ -0,0 +1,72 @@
+/**
+ * Test utilities for GitVersionControl unit tests


The code does't appears really domain specific to git. It rather looks like common code that should be sharable and reusable?

marius-kilocode · 2026-02-09T12:31:00Z

+        run: pnpm --filter app-builder typecheck
+
+      - name: Run app-builder tests
+        run: pnpm --filter app-builder test


I am a bit confused here should we not run the integration tests?

marius-kilocode · 2026-02-10T10:37:32Z

+  echo ""
+
+  if pnpm exec tsx "$test_file"; then
+    TOTAL_PASSED=$((TOTAL_PASSED + 1))


Is there a reason why we do all of this manually? Should we not use a proper test runner instead?

marius-kilocode · 2026-02-10T10:39:36Z

+
+    if (result.status === 201 && result.body.success === true) {
+      logSuccess('Init with valid special chars (underscore, hyphen) succeeds');
+      return true;


I don't get why we are returning true or false here? We need assertions here or not and a proper test framework to detect failures.

marius-kilocode · 2026-02-10T10:42:27Z

+import { logger } from '../utils/logger';
+import type { Env } from '../types';
+
+const notFoundPatterns = [


This is a really hacky model reward hack, once any error string changes the logic changes. We should introduce either proper error codes or error implementations.

## Root cause AI agent was interleaving analysis and commenting — it would find an issue, post it, then move on. The new prompt enforces a strict two-phase approach: complete all analysis first, then post everything at once. This should eliminate the drip-feed pattern across revisions. ## The fix 1. **New hard constraint #7** — `"ALL issues in ONE pass"`: Explicitly tells the AI to report every issue it finds in a single review and frames finding new issues on unchanged code in subsequent revisions as a failure of the first review. 2. **Restructured workflow** — Two key changes: - **Step 1** renamed from "Analyze the PR/MR" to "Analyze ALL Changed Files (complete this BEFORE posting any comments)" with a bold instruction: *"Do NOT post any comments until you have reviewed EVERY changed file. Analyze ALL files first, THEN comment."* - The per-file instructions now explicitly say *"Check for ALL issue types"* and *"Note every issue you find — do NOT stop at the first issue per file"* - Steps 2-3 are renamed to emphasize "ALL Issues" rather than individual verification

github-actions · 2026-03-13T00:53:23Z

This Pull Request has been inactive for 30 days and will be closed in 7 days if no further activity occurs. Please update or close this PR if it is no longer relevant.

github-actions · 2026-03-20T00:54:19Z

This Pull Request has been closed due to inactivity.

…lations Each invariant violation now reports to Sentry with structured context (invariant number, message, townId). Existing console.error logging and metrics.invariantViolations counter are preserved. For invariant #7 (working agent with no hook), auto-recovery transitions the agent back to idle via applyAction.

…lations Each invariant violation now reports to Sentry with structured context (invariant number, message, townId). For invariant #7 (working agent with no hook), auto-recovery transitions the agent back to idle. Uses structured context field on Violation type instead of fragile regex extraction from message strings, making the coupling type-safe.

Each invariant violation now triggers Sentry.captureMessage with structured context (invariant number, message, townId) as both extra data and tags. Existing analytics event emission is preserved. Added TODO for future auto-recovery of invariant #7 (working agent with no hook).

#1373) * fix: skip container_status events for running containers (#1368) Filter out 'running' status in the alarm pre-phase before calling upsertContainerStatus(). Running is the steady-state for healthy agents and a no-op in applyEvent(), so recording it just bloats the event table (~720 events/hour/agent). Non-running statuses (stopped, error, unknown) still get inserted for reconciler detection. * feat(gastown): add POST /debug/reconcile-dry-run endpoint (#1367) Add a debug endpoint that runs the reconciler against current live state and returns the actions it would emit without applying them. This enables inspecting what the reconciler thinks should happen at any given moment. - Add debugDryRun() method to TownDO that calls reconciler.reconcile() and returns actions + metrics without calling applyAction() - Add POST /debug/towns/:townId/reconcile-dry-run route following the same unauthenticated debug pattern as GET /debug/towns/:townId/status - Response includes actions array, actionsEmitted count, actionsByType breakdown, and pendingEventCount * feat(gastown): add debug dry-run endpoint with event draining (#1370) * feat(claw): evaluate button-vs-card feature flag for PostHog experiment tracking * fix(claw): move button-vs-card flag eval to CreateInstanceCard Moves useFeatureFlagVariantKey('button-vs-card') from ClawDashboard (which renders for all users including those with existing instances) to CreateInstanceCard (which only renders for users who haven't provisioned yet). This scopes the experiment exposure to users who can actually see the create CTA, avoiding population dilution. * feat(gastown): add POST /debug/reconcile-dry-run endpoint Add a debug endpoint that runs the reconciler against current live state and returns the actions it would emit without applying them. This enables inspecting what the reconciler thinks should happen at any given moment. - Add debugDryRun() method to TownDO that calls reconciler.reconcile() and returns actions + metrics without calling applyAction() - Add POST /debug/towns/:townId/reconcile-dry-run route following the same unauthenticated debug pattern as GET /debug/towns/:townId/status - Response includes actions array, actionsEmitted count, actionsByType breakdown, and pendingEventCount * fix(gastown): drain pending events in debugDryRun() before reconciling Wrap debugDryRun() in a SQLite savepoint so it can drain and apply pending town_events (Phase 0) before running reconcile (Phase 1), matching the real alarm loop behavior. The savepoint is rolled back in a finally block so the endpoint remains fully side-effect-free. Adds eventsDrained to the returned metrics. --------- Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com> Co-authored-by: Pedro Heyerdahl <pedro@kilocode.ai> Co-authored-by: Pedro Heyerdahl <61753986+pedroheyerdahl@users.noreply.github.com> * feat(gastown): add POST /debug/replay-events endpoint for event replay debugging Adds debugReplayEvents(from, to) method to Town.do.ts that queries all town_events in a time range (regardless of processed_at), applies them to reconstruct state transitions, runs the reconciler, and returns the computed actions and a state snapshot. Uses a SQLite SAVEPOINT that is rolled back so the endpoint remains fully side-effect-free. Route: POST /debug/towns/:townId/replay-events Body: { from: ISO, to: ISO } Response: { eventsReplayed, actions, stateSnapshot } * feat(gastown): emit reconciler metrics to Analytics Engine and add Grafana dashboard panels (#1372) - Extend writeEvent() to support double3-double10 fields for reconciler metrics - Emit reconciler_tick event after each alarm tick with all 9 metrics - Add Reconciler row to Grafana dashboard with 6 panels: 1. Events drained per tick (timeseries) 2. Actions emitted per tick by type (stacked bar) 3. Side effects attempted/succeeded/failed (timeseries) 4. Invariant violations (stat with >0 alert threshold) 5. Reconciler wall clock time (timeseries with >500ms threshold) 6. Pending event queue depth (gauge with >50 threshold) * fix(gastown): add replay caveat and fix Grafana pending-events gauge query Add a caveat comment and response field to debugReplayEvents explaining that events are re-applied on top of live state, not from a pre-window snapshot — results are approximate, useful for debugging event flow but not faithful historical reconstruction. Fix the Grafana 'Pending Event Queue Depth' gauge to show the latest row's double8 value instead of averaging across the time window. * feat(gastown): add Sentry capture for reconciler invariant violations Each invariant violation now triggers Sentry.captureMessage with structured context (invariant number, message, townId) as both extra data and tags. Existing analytics event emission is preserved. Added TODO for future auto-recovery of invariant #7 (working agent with no hook). --------- Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com> Co-authored-by: Pedro Heyerdahl <pedro@kilocode.ai> Co-authored-by: Pedro Heyerdahl <61753986+pedroheyerdahl@users.noreply.github.com>

…review fixes - Fix #1: Add WHERE clause to updateConfig (config.ts accepts wastelandId param) - Fix #2: Update router.ts and ownership.ts imports from stub to Wasteland.do - Fix #3: initializeWasteland/storeCredential/updateConfig now return records - Fix #4: Implement all 6 missing RPC methods (updateMember, connectTown, disconnectTown, listConnectedTowns, getWantedBoard, refreshWantedBoard) - Fix #5: Synchronous initializeDatabase(), no ensureInitialized/initPromise - Fix #7: Delete WastelandDO.stub.ts New files: - 5 table definitions under db/tables/ (config, members, credentials, connected-towns, wanted-cache) - 5 sub-modules under dos/wasteland/ with business logic - Wasteland.do.ts: thin DO class delegating to sub-modules

App Builder - Unit and Integration tests

20d1235

Migrated from Kilo-Org/OLD-UNUSED-kilocode-dnekcab#4482

marius-kilocode reviewed Feb 9, 2026

View reviewed changes

Reuse GitVersionControl for tree/blob in DO and improve error handlin…

400dccd

…g in handlers

kilo-code-bot bot reviewed Feb 9, 2026

View reviewed changes

eshurakov requested a review from marius-kilocode February 9, 2026 21:01

marius-kilocode reviewed Feb 10, 2026

View reviewed changes

alex-alecu mentioned this pull request Feb 18, 2026

feat: Force exhaustive first-pass analysis in code review prompts #323

Merged

github-actions bot added the stale label Mar 13, 2026

github-actions bot closed this Mar 20, 2026

jrf0110 mentioned this pull request Apr 3, 2026

feat(gastown): Workflows — visual DAG builder with scheduled and event-triggered execution #1943

Open

42 tasks

kilo-code-bot bot mentioned this pull request Apr 8, 2026

feat(exa): add $10/month free allowance with credit billing for overages #2169

Merged

		@@ -0,0 +1,72 @@
		/**
		* Test utilities for GitVersionControl unit tests

Conversation

eshurakov commented Feb 4, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot commented Feb 9, 2026

Code Review Summary

Overview

WARNING

Uh oh!

eshurakov commented Feb 9, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 13, 2026

Uh oh!

github-actions bot commented Mar 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants