Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
177 changes: 156 additions & 21 deletions apps/backend/src/lib/ai/prompts.ts
Original file line number Diff line number Diff line change
Expand Up @@ -517,33 +517,64 @@ follow these rules without exception:
5. Before finishing the code, mentally re-order your hooks and confirm the count and order are
identical on every possible render path.

CANONICAL BAD EXAMPLE (crashes with React error #310):
CANONICAL SHAPE OF EVERY Dashboard COMPONENT:
function Dashboard() {
const [users, setUsers] = React.useState(null);
if (!users) {
return <Loading />; // ← early return BEFORE the next hook
}
const [filter, setFilter] = React.useState(""); // ← this hook is skipped on first render
React.useEffect(() => { ... }, []); // ← and this one
return <div>...</div>;
}

CANONICAL GOOD EXAMPLE:
function Dashboard() {
// All hooks first. Unconditional. Same count every render.
// 1) ALL hooks first. Unconditional. Same count every render.
// Includes React.useState / useEffect / useCallback / useMemo / useRef
// AND every DashboardUI.use* (useDataSource, etc).
const [users, setUsers] = React.useState(null);
const [filter, setFilter] = React.useState("");
const [error, setError] = React.useState(null);
React.useEffect(() => { ... }, []);

// Conditional rendering AFTER all hooks:
// 2) Conditional rendering happens ONLY AFTER every hook has run.
if (error) return <ErrorState message={error} />;
if (!users) return <Loading />;
return <div>...</div>;
}

If you catch yourself writing \`if (...) return ...\` anywhere above a \`React.useXxx\` call,
STOP and move the return below every hook.
Mental check before you emit: scan your Dashboard function top-to-bottom. If ANY line
starting with \`use\` (React.useX, DashboardUI.useX, or any custom useX) sits BELOW a
\`return\`, an \`if\`, a ternary, or a loop, the code is wrong — move the hook up.

CUSTOM HOOKS COUNT TOO — \`DashboardUI.useDataSource\` IS A HOOK
─────────────────────────────────────────────────────────────────
Anything starting with \`use\` is a hook, regardless of namespace. \`DashboardUI.useDataSource\`
and any other \`use*\` from \`DashboardUI\` follow the SAME rules as \`React.useState\` — they
MUST be called unconditionally at the top of the component, before any \`if\` / early
\`return\` / ternary / loop. The \`DashboardUI\` namespace does NOT exempt them.

The most common crash in generated dashboards: \`useDataSource\` gets placed AFTER a
loading or error guard, so on the first render (guard hits) it isn't called, and on the
next render (guard passes) it suddenly is. Hook count changes between renders → React
error #310. NEVER put \`useDataSource\` below an early return.

// ✅ CORRECT PATTERN for a DataGrid:
function Dashboard() {
const [rows, setRows] = React.useState(null);
const [error, setError] = React.useState(null);
const [gridState, setGridState] = React.useState(DashboardUI.createDefaultDataGridState());
React.useEffect(() => {
stackServerApp.listUsers({ includeAnonymous: true, limit: 500 })
.then(setRows)
.catch((e) => setError(String(e)));
}, []);
const gridData = DashboardUI.useDataSource({ // ← always called
data: rows ?? [], // ← tolerate null pre-load
columns: [/* ... */],
getRowId: (u) => u.id,
sorting: gridState.sorting,
quickSearch: gridState.quickSearch,
pagination: gridState.pagination,
paginationMode: "client",
});
if (error) return <div className="p-6 text-red-500">{error}</div>;
if (rows == null) return <DashboardUI.DesignSkeleton />;
return <DashboardUI.DataGrid rows={gridData.rows} state={gridState} onChange={setGridState} />;
}

The rule: ALL hooks first (including \`useDataSource\`), THEN the conditional returns.
Pass \`data: rows ?? []\` so \`useDataSource\` is safe to call while \`rows\` is still null.

────────────────────────────────────────
EDITING BEHAVIOR (when existing code is provided)
Expand All @@ -552,7 +583,108 @@ EDITING BEHAVIOR (when existing code is provided)
- Always preserve parts of the dashboard the user didn't ask to change.
- If the user asks to add something, add it without removing existing content.
- If the user asks to change styling, colors, or layout, make those changes while preserving functionality.
- Always call the updateDashboard tool with the COMPLETE updated source code — no partial code or diffs.

CHOOSING THE RIGHT TOOL — patchDashboard vs. updateDashboard
You have TWO write tools. Pick the right one. Wrong choice wastes tokens and time —
or worse, breaks the layout.

The decision is NOT just about size. It's about whether the change is LOCAL to one
element's own attributes, or whether it ripples through surrounding layout / sibling
positioning / shared state.

- patchDashboard — for changes that are LOCAL to one element and don't affect siblings.
Use it for:
* Rename a label or heading
* Change a color, className, or style on one element
* Swap an icon
* Tweak one prop value (e.g. limit: 100 → 500)
* Adjust one chart's config (axis label, color, format)
* Fix a single bug in one function or hook body
* Add or remove ONE self-contained leaf component (a badge, an icon)

How it works: \`{ edits: [{ oldText, newText, occurrenceIndex? }, ...] }\` — each edit
is a literal find-and-replace on the CURRENT source.
* \`oldText\` MUST appear verbatim in the current source — copy it character-for-character
including whitespace and line breaks. No paraphrasing, no normalization.
* Include enough surrounding context in \`oldText\` to make the match UNIQUE. If the same
snippet appears more than once, either expand \`oldText\` to disambiguate OR set
\`occurrenceIndex\` (0-indexed) to pick the Nth match.
* \`newText\` is the replacement. Use \`""\` to delete.
* Batch related edits in ONE call (up to 20). Edits apply in order; later edits see the
result of earlier ones.
* If a single \`oldText\` block can hold all your changes, prefer one large edit over
many small ones — fewer matching failures.

- updateDashboard — for changes that affect LAYOUT, ORDERING, or REGIONS, even if the
user pointed at a single component. Re-emit the full source.
Use it for:
* REORDERING components — moving a card up/down, swapping two charts, changing the
sequence of rows. The grid's child order matters; sibling indices shift.
* RESIZING — making a card wider/taller, changing grid-cols-2 to grid-cols-3,
adjusting a chart's height, changing col-span/row-span. Affects siblings'
positioning in the same grid.
* MOVING a component to a different parent or section.
* Adding a NEW row or column to a grid (sibling positions change).
* Layout overhauls (flex → grid, single column → two-column).
* Replacing the data model or switching the entire theme.
* Initial creation of a brand-new dashboard.

The rule of thumb: ask "would this change break or shift any neighboring component?"
If yes → updateDashboard. If the change is purely cosmetic on the targeted element
itself → patchDashboard.

Why: patchDashboard is a literal text replacement. It can't reason about JSX siblings,
grid templates, or array order. A "move this card up" patch that just swaps two JSX
blocks usually breaks because the surrounding structure (commas, fragment boundaries,
conditional renders) doesn't survive a naive swap.

Do NOT use updateDashboard for purely cosmetic edits. Rewriting 3000 tokens to change
one className is wrong; emit a patchDashboard with one edit instead.

WIDGET CONTEXT FROM THE USER
When the user prefixes their message with a block like:
[Widget: User Signups]
Path: div.grid > div:nth-of-type(2) > h3
HTML: <h3 class="text-lg font-semibold">User Signups</h3>
they have clicked a specific element in the running dashboard. The HTML is a verbatim
slice of the rendered DOM and almost always contains text that also appears in the JSX
source — use it as your \`oldText\` anchor when patching. The Path describes the
element's position in the render tree, useful for disambiguation when the HTML alone
repeats.

A widget pointer narrows the TARGET, but it does NOT decide the tool. Apply the same
rule above: if the user wants to restyle/rename the targeted widget → patchDashboard.
If the user wants to reorder, resize, move, or restructure the layout around the widget
→ updateDashboard. "Make this card bigger" affects the grid → updateDashboard, even
though the user only pointed at one card.

ACTION INTENTS FROM THE USER
The user may include a block like:
[Action: Add a new component to the dashboard]
This is a structural intent — the user wants something NEW added. Adding a component
shifts sibling positions in the layout grid and is exactly the case that updateDashboard
handles (per the rule above: structural change → full rewrite). Do NOT try to express
"add a card" as a patchDashboard with a JSX fragment insertion — JSX commas, parent
containers, and grid template-cols may all need adjustment together. Re-emit the full
source via updateDashboard with the new component placed sensibly in the existing
layout, preserving everything else.

The user's typed text describes WHAT to add ("a metric card for active users",
"a chart of weekly signups"). Combine the action intent with the typed text to decide
what to build.

RUNTIME ERROR REPORTS FROM THE USER
The user may include a block like:
[Error: The dashboard crashed at runtime — please diagnose and fix.]
Message: <error message>
Stack: <stack trace>
Component stack: <react component stack>
The dashboard threw at runtime. Localize the bug from the stack and component stack,
identify the smallest possible fix in the source, and apply it. If the fix is a
one-line change (a typo, a wrong prop name, a missing null check), use patchDashboard.
If the fix requires restructuring (a hook ordering bug, a malformed JSX tree), use
updateDashboard. Either way, preserve the rest of the dashboard. Don't strip out
features the user didn't ask to remove just because they're near the crash site.

────────────────────────────────────────
CORE DATA FETCHING RULES (STACK)
Expand Down Expand Up @@ -961,11 +1093,14 @@ PRE-EMIT CHECKLIST (RUN THIS IN YOUR HEAD BEFORE CALLING updateDashboard)
Before you call updateDashboard, silently walk through these four checks. If any fails, fix it
FIRST and re-run the list.

[1] HOOK ORDER — Are all \`React.useState\` / \`React.useEffect\` / \`React.useCallback\` calls at
[1] HOOK ORDER — Is EVERY call starting with \`use\` (React.useState / useEffect /
useCallback / useMemo / useRef AND every DashboardUI.use* including useDataSource) at
the top of the Dashboard component, before every \`if\` / early \`return\` / conditional?
If no, move them up. This prevents React error #310. Also check that any variable
referenced inside a hook initializer (e.g. \`useState(() => foo(columns))\`) is declared
ABOVE that hook — a TDZ error looks like a hook-order crash but isn't one.
If no, move them up — this is the #1 cause of React error #310 in generated dashboards.
For \`useDataSource\` specifically: pass \`data: rows ?? []\` so the hook is safe to call
while data is still loading, then guard on \`rows == null\` AFTER the hook. Also check
that any variable referenced inside a hook initializer (e.g. \`useState(() => foo(columns))\`)
is declared ABOVE that hook — a TDZ error looks like a hook-order crash but isn't one.

[2] DATA HONESTY — Does every field the code references actually exist in the SDK types or
ClickHouse schema shown in context? No made-up field names, no hardcoded sample arrays,
Expand Down
21 changes: 20 additions & 1 deletion apps/backend/src/lib/ai/tools/create-dashboard.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,9 +10,28 @@ import { z } from "zod";
*/
export function updateDashboardTool(auth: SmartRequestAuth | null) {
return tool({
description: "Update the dashboard with new source code. The source code must define a React functional component named 'Dashboard' (no props). It runs inside a sandboxed iframe with React, Recharts, DashboardUI, and stackServerApp available as globals. No imports, exports, or require statements.",
description: "Replace the entire dashboard source. Use ONLY for initial creation or large structural rewrites that touch most of the file. For any change smaller than ~30% of the file, use patchDashboard instead. The source must define a React functional component named 'Dashboard' (no props). Runs in a sandboxed iframe with React, Recharts, DashboardUI, and stackServerApp as globals. No imports, exports, or require statements.",
inputSchema: z.object({
content: z.string().describe("The complete updated JSX source code for the Dashboard component"),
}),
});
}

/**
* Tool for surgical edits to the existing dashboard source.
*
* Like updateDashboardTool, this is inert server-side - the call streams back to the
* client, which applies the patches against currentTsxSource and updates state.
*/
export function patchDashboardTool(auth: SmartRequestAuth | null) {
return tool({
description: "Apply one or more surgical text edits to the existing dashboard source. Prefer this over updateDashboard for any change smaller than ~30% of the file (rename, restyle, add/remove a single component, fix one bug). Each edit is a literal find-and-replace on the current source. Returns nothing - the client applies the patch.",
inputSchema: z.object({
edits: z.array(z.object({
oldText: z.string().min(1).describe("Exact substring to find in the current source. Must match verbatim including whitespace. Include enough surrounding context to make the match unique, OR set occurrenceIndex when oldText repeats."),
newText: z.string().describe("Replacement text. Empty string deletes the match."),
occurrenceIndex: z.number().int().min(0).optional().describe("0-indexed match to replace when oldText appears multiple times. Omit when oldText is unique in the source."),
})).min(1).max(20).describe("Edits applied in order against the running source. Later edits see the result of earlier ones."),
}),
});
}
11 changes: 9 additions & 2 deletions apps/backend/src/lib/ai/tools/index.ts
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
import { SmartRequestAuth } from "@/route-handlers/smart-request";
import { ToolSet } from "ai";
import { updateDashboardTool } from "./create-dashboard";
import { patchDashboardTool, updateDashboardTool } from "./create-dashboard";
import { createEmailDraftTool } from "./create-email-draft";
import { createEmailTemplateTool } from "./create-email-template";
import { createEmailThemeTool } from "./create-email-theme";
Expand All @@ -13,7 +13,8 @@ export type ToolName =
| "create-email-theme"
| "create-email-template"
| "create-email-draft"
| "update-dashboard";
| "update-dashboard"
| "patch-dashboard";

export type ToolContext = {
auth: SmartRequestAuth | null,
Expand Down Expand Up @@ -62,6 +63,11 @@ export async function getTools(
break;
}

case "patch-dashboard": {
tools["patchDashboard"] = patchDashboardTool(context.auth);
break;
}

default: {
// TypeScript will ensure this is unreachable if we handle all cases
const _exhaustive: never = toolName;
Expand Down Expand Up @@ -89,6 +95,7 @@ export function validateToolNames(toolNames: unknown): toolNames is ToolName[] {
"create-email-template",
"create-email-draft",
"update-dashboard",
"patch-dashboard",
];

return toolNames.every((name) => validToolNames.includes(name as ToolName));
Expand Down
Loading
Loading