Conversation
- Added a new internal API endpoint for documentation tools, allowing actions such as listing available docs, searching, and fetching specific documentation by ID.
- Updated environment configuration to support an optional internal secret for enhanced security.
- Refactored existing search functionality to use the new docs tools API instead of the previous MCP server.
- Improved error handling and response parsing for documentation-related requests.
- Expanded documentation to clarify the relationship between the new tools and existing API functionality.

This update streamlines documentation access and improves the overall developer experience.
- Introduced error capturing for failed HTTP requests in the docs tools API, improving debugging.
- Updated the API response for unsupported methods to include an `Allow` header, clarifying the expected request type.

These changes make the documentation tools integration more robust and improve the developer experience.
- Updated the key name in the capabilities section of the API documentation to follow a consistent naming convention, improving clarity and maintainability.
The .gitmodules was updated in d22593d to point at apps/backend/src/private/implementation, but the gitlink entry (mode 160000) was never added to the tree. This caused `git clone --recurse-submodules` to silently skip the private submodule. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on for docs tools
- Added `STACK_DOCS_INTERNAL_BASE_URL` to backend `.env` and `.env.development` files for AI tool bundle configuration.
- Removed references to `STACK_INTERNAL_DOCS_TOOLS_SECRET` from backend and docs environment files, and removed its validation logic from the docs tools API route.
- Introduced a new `.env` file for the docs app with essential configuration variables.
… into llm-mcp-flow
nams1570
left a comment
This is a partial review. I identified some potential bugs and started some discussions. Also, maybe we should log the system prompt to our internal tool? Otherwise it's a bit hard to debug.
nams1570
left a comment
Quick partial review. It looks like internal-tool is a new project, which means it doesn't get rolled into the existing instrumentation in instrumentation.ts. This means no Sentry, no Jaeger, no OTel. If things go wrong, we won't be able to see it.
| systemPromptId: apiKey === "stack-auth-proxy" ? "stack-cli" : apiKey, | ||
| quality: "unknown", | ||
| speed: "unknown", | ||
| modelId: String(parsed.model ?? OPENROUTER_DEFAULT_MODEL), |
Why cast this here? Isn't it already validated earlier?
| if (!res.ok) { | ||
| const preview = (await res.text()).slice(0, 200); | ||
| throw new StackAssertionError(`Reducer ${reducer} failed (${res.status}): ${preview}`); | ||
| } |
This would mean that any SpacetimeDB failure would be treated as a 500 (internal server error), no? But if a row was deleted while someone was looking at it, that's a 404, not a 500, right?
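To illustrate the distinction being raised here, a minimal sketch of mapping a failed reducer response to a client-facing status could look like the following. The type and function names are hypothetical, not from the codebase:

```typescript
// Hypothetical sketch: map a failed SpacetimeDB reducer HTTP response to a
// client-facing status instead of treating every failure as a 500.
type ReducerFailure = { status: number; preview: string };

function mapReducerFailureToClientStatus(failure: ReducerFailure): number {
  // A missing row (e.g. deleted while someone was viewing it) is the
  // client's 404, not our internal error.
  if (failure.status === 404) return 404;
  // Other 4xx responses from the reducer indicate a bad request on our
  // side of the call, not a server fault.
  if (failure.status >= 400 && failure.status < 500) return 400;
  // Everything else is a genuine internal failure.
  return 500;
}
```

The exact mapping is a design choice; the point is that throwing `StackAssertionError` unconditionally collapses all of these cases into one.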
| STACK_SPACETIMEDB_URL=# SpacetimeDB host URL; default empty (logging disabled) | ||
| STACK_SPACETIMEDB_DB_NAME=# SpacetimeDB database name | ||
| STACK_MCP_LOG_TOKEN=# shared secret gating the log_mcp_call reducer; must match EXPECTED_LOG_TOKEN in apps/internal-tool/spacetimedb/src/index.ts | ||
| STACK_SPACETIMEDB_SERVICE_TOKEN=# optional; pre-provisioned SpacetimeDB identity token for the backend service. To mint: `curl -X POST <STACK_SPACETIMEDB_URL>/v1/identity` and copy the `token` field. |
Could you clarify? This isn't super clear
yep, changed the comment
| conversationId: string | undefined, | ||
| }; | ||
|
|
||
| export async function logAiQuery(entry: AiQueryLogEntry): Promise<void> { |
Nit: what do you think about adding some validation for the entry fields here to make sure they're not NaN or Infinity or something like that? Asking because even though it's not a possibility right now, if there were a bug, we would end up silently dropping logs.
Good idea; added validation and captureError.
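The validation being discussed could be sketched roughly as follows. This is an assumed shape, not the actual fix that was committed, and the field names are illustrative:

```typescript
// Hypothetical sketch of the validation idea: surface log entries whose
// numeric fields are NaN or Infinity before they reach SpacetimeDB, so an
// upstream bug produces a captured error rather than a silently dropped row.
function findNonFiniteFields(entry: Record<string, unknown>): string[] {
  return Object.entries(entry)
    .filter(([, value]) => typeof value === "number" && !Number.isFinite(value))
    .map(([key]) => key);
}
```

A caller would check the returned list and route any offenders to captureError instead of letting the insert fail silently.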
nams1570
left a comment
End of review. Some more bugs.
| while ((boundary = buffer.indexOf("\n\n")) !== -1) { | ||
| const block = buffer.slice(0, boundary); |
Potential bug: I think this would break. Per https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream, SSE lines can also terminate with carriage returns (\r), so an event boundary can be \n\n, \r\n\r\n, or \r\r. This is a pretty high-impact bug: if the stream uses \r-based terminators, the boundary is never found and the buffer keeps growing.
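One way to handle all three terminator styles, sketched here with hypothetical names and not the project's actual parser, is to normalize terminators before searching for the blank-line boundary:

```typescript
// Hypothetical sketch of terminator-tolerant SSE block splitting: the WHATWG
// event-stream grammar allows lines to end with \r\n, \n, or \r, so an event
// boundary is any two consecutive line terminators, not just "\n\n".
function splitSseBuffer(buffer: string): { blocks: string[]; rest: string } {
  // Normalize all line terminators to "\n" so a blank line is always "\n\n".
  const normalized = buffer.replace(/\r\n/g, "\n").replace(/\r/g, "\n");
  const blocks: string[] = [];
  let rest = normalized;
  let boundary: number;
  while ((boundary = rest.indexOf("\n\n")) !== -1) {
    blocks.push(rest.slice(0, boundary));
    rest = rest.slice(boundary + 2);
  }
  return { blocks, rest };
}
```

One caveat for a streaming parser: a chunk may end mid-"\r\n", so a trailing lone "\r" should be held back until the next chunk arrives before normalizing.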
| } finally { | ||
| reader.releaseLock(); | ||
| } |
Potential bug: the issue here is that this finally covers the loop, but not the few statements above the loop. In general, you want the ability to release a lock on failed states as soon as the lock is acquired; think Go's `defer`.
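The defer-style pattern being suggested can be sketched like this. The function is illustrative, not the code under review:

```typescript
// Hypothetical sketch of the "defer"-style pattern: enter try/finally
// immediately after acquiring the reader, so any statement between
// acquisition and the read loop still releases the lock on failure.
async function consumeStream(
  stream: ReadableStream<Uint8Array>,
  onChunk: (chunk: Uint8Array) => void,
): Promise<void> {
  const reader = stream.getReader();
  try {
    // Everything that runs after acquisition lives inside the try block.
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      if (value) onChunk(value);
    }
  } finally {
    reader.releaseLock();
  }
}
```

If setup statements sit between `getReader()` and the `try`, a throw there leaves the stream locked forever.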
| humanCorrectedQuestion: args.correctedQuestion, | ||
| humanCorrectedAnswer: args.correctedAnswer, | ||
| humanReviewedAt: row.humanReviewedAt ?? ctx.timestamp, | ||
| humanReviewedBy: row.humanReviewedBy ?? args.reviewedBy, |
Design question/bug: this looks a little weird. You fall back to whoever issued the correction/update, but keep the original. That means you could end up in a position where it says "xyz reviewed this" but abc was the last person to update it. I guess this is also a design question: do we want an audit log for this?
Yeah, it was a bug; fixed it. Now we keep the latest reviewer.
| humanReviewedAt: row.humanReviewedAt ?? ctx.timestamp, | ||
| humanReviewedBy: row.humanReviewedBy ?? args.reviewedBy, | ||
| publishedToQa: args.publish, | ||
| publishedAt: args.publish ? (row.publishedAt ?? ctx.timestamp) : undefined, |
Is it possible for something to be unpublished and then republished? If so, we lose the info on when something was first published, if that matters.
Yes, it's possible. I feel like we want to keep the latest publishedAt (the field should basically reflect the most recent time it was published).
| } | ||
| ctx.db.operators.insert({ | ||
| identity: args.identity, | ||
| addedAt: ctx.timestamp, |
Does ctx change based on where someone logs in from? If so, this would keep getting rewritten.
| ctx.db.operators.insert({ | ||
| identity: ctx.sender, | ||
| addedAt: ctx.timestamp, | ||
| stackUserId: '__service__', |
This is fine for now since it's purely internal, but it might be good to have a guard that rejects any stackUserIds created with the name `__service__`.
| if (row == null) { | ||
| throw new SenderError('Call log not found for correlationId: ' + args.correlationId); | ||
| } | ||
| ctx.db.mcpCallLog.id.delete(row.id); |
ai_query_log has mcpCorrelationId, right? Would this delete then orphan the ai_query_log rows? Should we have some sort of cascading delete?
This PR introduces a context chip bar for the AI dashboard chat panel — users can click widgets in the sandbox iframe, trigger "Add a component", or auto-capture runtime errors as chips that are prepended to the next AI request. It also adds a patchDashboard tool for surgical find-and-replace edits alongside the existing updateDashboard, streaming dashboard generation, and an improved system prompt with hook-ordering guidance for generated components.
This update includes a comment explaining the rationale behind appending the verified QA corpus to the system prompt for each request.
aadesh18
left a comment
made changes based on your comments
| content: systemPrompt, | ||
| ...(isAnthropic && { | ||
| providerOptions: { openrouter: { cacheControl: { type: "ephemeral" } } }, | ||
| }), | ||
| }; |
As we discussed, most of the routes are hit at least once every 5 minutes. However, in the future we should think about not caching the routes that aren't hit often. Added a comment explaining this.
|
@greptile-ai review |
| const qaId = BigInt(body.qaId); | ||
| await callReducerStrict("update_qa_entry", [token, qaId, body.question, body.answer, editor]); |
The handler issues two separate HTTP reducer calls without any rollback mechanism. If update_qa_entry succeeds but set_qa_published fails (e.g. a network blip or SpacetimeDB 502), the entry's text is saved with the new question/answer but its published flag is left in the old state. The client receives a 502 and the user's intent (e.g. "save and publish") is only half-executed, creating a silent inconsistency.
Consider consolidating both mutations into a single SpacetimeDB reducer (e.g. update_qa_entry_and_publish) that applies both changes atomically within one transaction.
Path: apps/backend/src/app/api/latest/internal/mcp-review/update-qa-entry/route.ts
Line: 36-37
| correlationId, | ||
| mode, |
mcpCorrelationId is always equal to correlationId
correlationId is the UUID freshly minted for this AI query, and mcpCorrelationId is set to that same value when mcpCallMetadata is present. In logIfMcpToolCall, the MCP call log row is also written with the same correlationId. So ai_query_log.mcpCorrelationId and ai_query_log.correlationId carry identical values, making the field redundant. The field name implies a different ID (the MCP log's own correlation key), which will confuse future maintainers. Either give the MCP call log its own distinct UUID and store that here, or remove the mcpCorrelationId column and rely on the shared correlationId.
Path: apps/backend/src/app/api/latest/ai/query/[mode]/route.ts
Line: 89-90
| return { | ||
| correlationId, | ||
| mode: parsed.stream === true ? "stream" : "generate", | ||
| systemPromptId: apiKey === "stack-auth-proxy" ? "stack-cli" : apiKey, |
The "stack-auth-proxy" string is silently coerced to "stack-cli" with no documentation. If the caller passes a different key prefix in the future, or if this sentinel value is ever renamed, logs will accumulate under the raw API-key value with no warning. Extracting the mapping into a named function makes the contract explicit.
| systemPromptId: apiKey === "stack-auth-proxy" ? "stack-cli" : apiKey, | |
| systemPromptId: resolveSystemPromptId(apiKey), |
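The suggested `resolveSystemPromptId` helper might be defined along these lines. This is a sketch of the reviewer's suggestion, not code from the PR, and the lookup-table name is invented:

```typescript
// Hypothetical sketch of the suggested extraction: the sentinel-to-prompt-id
// mapping lives in one named, documented place instead of an inline ternary.
const SYSTEM_PROMPT_ID_OVERRIDES: Record<string, string> = {
  // Requests through the Stack Auth proxy are CLI traffic; log them under a
  // stable prompt id rather than the raw sentinel API key.
  "stack-auth-proxy": "stack-cli",
};

function resolveSystemPromptId(apiKey: string): string {
  return SYSTEM_PROMPT_ID_OVERRIDES[apiKey] ?? apiKey;
}
```

Adding a new sentinel later then means adding one documented table entry rather than another inline special case.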
Path: apps/backend/src/lib/ai/loggers/ai-proxy-logger.ts
Line: 28
| const rawCost = extractCostFromUsage(usage); | ||
| runAsynchronouslyAndWaitUntil(logAiQuery({ | ||
| ...common, | ||
| stepsJson: serializeSteps(steps), |
BUG: logging-only serialization can break a successful AI response.
Walkthrough: the AI request can finish successfully, then logAiQuerySuccess tries to prepare the log row. Because serializeSteps(steps) is evaluated before the fire-and-forget logging promise is handed to runAsynchronouslyAndWaitUntil, any JSON.stringify failure here throws synchronously. I reproduced this with a tool result containing a circular object: the answer path is otherwise successful, but logging throws on the circular structure before the async boundary.
Expected behavior: logging should be isolated from the user-facing success path. Build/serialize the logging payload inside the async logging task, or catch serialization failures and log a safe placeholder.
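The "catch serialization failures and log a safe placeholder" option could look roughly like this hypothetical helper, which is a sketch rather than the project's fix:

```typescript
// Hypothetical sketch of isolating logging serialization from the success
// path: a serialization failure (e.g. a circular tool result) yields a safe
// placeholder instead of a synchronous throw before the async boundary.
function safeSerialize(value: unknown): string {
  try {
    // JSON.stringify returns undefined for undefined/functions; map to "null".
    return JSON.stringify(value) ?? "null";
  } catch (error) {
    return JSON.stringify({
      serializationError: error instanceof Error ? error.message : String(error),
    });
  }
}
```

Alternatively, moving the whole payload construction inside the async logging task achieves the same isolation without a wrapper.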
| token, | ||
| correlationId: body.correlationId, | ||
| }); | ||
| BigInt(body.qaId), |
BUG: invalid qaId becomes a server exception instead of a clean 400.
Walkthrough: the request schema only says qaId is a string, so "abc" passes validation. The handler then calls BigInt(body.qaId), and BigInt("abc") throws during request handling. I reproduced this by invoking the route handler with reviewer auth and { qaId: "abc" }; the failure comes from conversion, not validation.
Expected behavior: validate the decimal/u64 shape before conversion and return a normal schema/client error for malformed IDs.
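A minimal sketch of the suggested shape check, with a hypothetical helper name, could be:

```typescript
// Hypothetical sketch: validate the decimal/u64 shape before conversion so a
// malformed id produces a clean client error rather than a BigInt throw.
const U64_MAX = 18446744073709551615n;

function parseQaId(raw: string): bigint | null {
  // Digits only: rejects "abc", signs, whitespace, and empty strings.
  if (!/^\d+$/.test(raw)) return null;
  const id = BigInt(raw);
  // The id must also fit in SpacetimeDB's u64 column.
  return id <= U64_MAX ? id : null;
}
```

The route handler would return a 400 when this yields null, keeping `BigInt` conversion off the error path entirely.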
| const editor = user.display_name ?? user.primary_email ?? user.id; | ||
| const qaId = BigInt(body.qaId); | ||
| await callReducerStrict("update_qa_entry", [token, qaId, body.question, body.answer, editor]); | ||
| await callReducerStrict("set_qa_published", [token, qaId, body.publish]); |
BUG: this update is not atomic.
Walkthrough: editing text and changing publish state are two separate reducer calls. I reproduced the failure by mocking update_qa_entry to succeed and set_qa_published to fail. The route returns an error, but the question/answer mutation has already been committed.
Expected behavior: use a single reducer for “update text plus publish state” so the whole operation either commits or fails together. Partial review edits are hard for reviewers to reason about because the UI reports failure while the data has changed.
|
|
||
| const token = getEnvVariable("STACK_MCP_LOG_TOKEN"); | ||
| await conn.reducers.updateHumanCorrection({ | ||
| await callReducerStrict("upsert_qa_from_call", [ |
BUG: submitting a correction no longer marks the MCP call reviewed.
Walkthrough: this route only calls upsert_qa_from_call. I reproduced that by mocking callReducerStrict and invoking the handler; the only reducer call was upsert_qa_from_call, with no mark_human_reviewed. If the old correction flow implied review completion, corrected calls will still appear unreviewed unless the reviewer separately clicks mark reviewed.
Expected behavior: either make correction submission mark the source call reviewed in the same logical operation, or intentionally reflect the two-step workflow in the UI/API.
| - If the user asks to change styling, colors, or layout, make those changes while preserving functionality. | ||
| - Always call the updateDashboard tool with the COMPLETE updated source code — no partial code or diffs. | ||
|
|
||
| CHOOSING THE RIGHT TOOL — patchDashboard vs. updateDashboard |
BUG: the prompt now introduces patchDashboard, but the same prompt still later forces the old full-rewrite tool.
Walkthrough: this new section teaches the model how to choose between patchDashboard and updateDashboard, but the prompt still ends with "You MUST call the updateDashboard tool with the complete source code." I reproduced the conflict by checking that the create-dashboard prompt contains both the patchDashboard guidance and that forced updateDashboard instruction.
Expected behavior: remove or update the later forced updateDashboard instruction so the strongest instruction matches this new tool-choice guidance.
| const edits = parsePatchEdits(item.args); | ||
| if (!edits) continue; | ||
| const result = applyDashboardPatches(runningSource, edits); | ||
| runningSource = result.updatedSource; |
BUG: patch application commits partial results when later edits fail.
Walkthrough: I reproduced this with two edits: the first replaces alpha with ALPHA, the second searches for missing text. The result reports a failure for edit 2, but updatedSource still contains edit 1. For AI-generated code edits, partial source is often worse than no source because the user sees a failed operation while the code was still changed.
Expected behavior: apply a patch batch against a draft source and only assign runningSource = draft if every edit succeeds.
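The draft-then-commit pattern described here can be sketched as follows; names and the edit shape are assumptions, not the PR's actual types:

```typescript
// Hypothetical sketch of all-or-nothing patch application: edits run against
// a draft copy, and the running source is only replaced when every edit in
// the batch succeeds.
type PatchEdit = { search: string; replace: string };

function applyAllOrNothing(
  source: string,
  edits: PatchEdit[],
): { source: string; ok: boolean } {
  let draft = source;
  for (const edit of edits) {
    if (!draft.includes(edit.search)) {
      // One failed edit aborts the whole batch; the caller keeps the original.
      return { source, ok: false };
    }
    draft = draft.replace(edit.search, edit.replace);
  }
  return { source: draft, ok: true };
}
```

With this shape, `runningSource` is only reassigned on `ok === true`, so a failed operation never leaves the dashboard half-edited.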
| onSave(row.correlationId, question, answer, publish); | ||
| Promise.resolve(onSave(row.id, question, answer, publish)) | ||
| .catch(err => captureError("knowledge-base-save", err)); | ||
| setEditingId(null); |
BUG: failed saves look successful to the reviewer.
Walkthrough: the save starts Promise.resolve(onSave(...)).catch(...), then immediately closes the editor with setEditingId(null). I reproduced this from the component source: a rejected save is only sent to captureError, while the reviewer sees the editor disappear as if the save worked.
Expected behavior: await the save/delete action, keep the editor/dialog open on failure, and show an inline error. Capturing the error is useful for us, but it is not enough feedback for the reviewer performing the action.
| { name: "mark_human_reviewed", args: [wrong, "corr", "reviewer"] }, | ||
| { name: "unmark_human_reviewed", args: [wrong, "corr"] }, | ||
| { | ||
| name: "update_human_correction", |
BUG: this auth test still references a removed reducer.
Walkthrough: the current correction path uses the newer QA reducers, but this test still calls update_human_correction. I reproduced this statically by checking the test source still contains that reducer name. Against the real module this either fails for the wrong reason or no longer tests the current write surface.
Expected behavior: replace this with the current mutating reducer(s), especially the correction/upsert path that actually ships now.
| args: [wrong, "corr", "q", "a", false, "reviewer"], | ||
| }, | ||
| { name: "add_manual_qa", args: [wrong, "q", "a", false, "reviewer"] }, | ||
| { name: "delete_qa_entry", args: [wrong, "corr"] }, |
BUG: this does not prove delete_qa_entry enforces auth.
Walkthrough: delete_qa_entry expects a numeric/u64 QA id, but the test passes "corr". That can fail while decoding the reducer arguments before the token check is reached. So a failing reducer call here does not necessarily mean the wrong token was rejected.
Expected behavior: pass a valid numeric QA id shape with the wrong token, then assert the failure message is the invalid-token path.
| expect(enroll.status).toBe(200); | ||
| const seedMarker = uniqueMarker("private-mcp-seed"); | ||
| scope.trackMcpQuestion(seedMarker); | ||
| const seed = await niceBackendFetch("/api/latest/internal/mcp-review/add-manual", { |
BUG: this seeds the wrong table for the private-table assertion.
Walkthrough: the test wants to prove a stranger cannot see mcp_call_log, but it seeds via /mcp-review/add-manual. Manual Q&A now writes qa_entries, not mcp_call_log. I reproduced this by checking the seed block calls add-manual and does not call log_mcp_call, so the privacy assertion can pass simply because mcp_call_log is empty.
Expected behavior: seed mcp_call_log directly, or first assert the private table contains the seeded row for an authorized caller before checking stranger visibility.
This PR adds an AI analytics dashboard to the internal tool and centralises AI query observability. Every call through the /ai/query and /integrations/ai-proxy endpoints is now logged to SpacetimeDB with token counts, cost, latency, and full conversation replay. The internal tool gains a new "Unified AI Endpoint Analytics" tab backed by a new useAiQueryLogs SpacetimeDB subscription, plus reviewer enrollment and unmark-reviewed flows.
How to review this PR: