feat(examples): add rag-chat init-style example + scaffolding for databricks apps init#51
Merged
andrelandgraf merged 7 commits into main on Apr 18, 2026
Adds a new RAG Chat example that proves DevHub can ship examples scaffolded through the AppKit template system (not just `git clone`). The generated app does streaming RAG against a pgvector store in Lakebase, persists chat history per session, and ships a one-shot deploy script.

DevHub-side changes:
- Extend the Example schema with optional `agentPrereqSteps` / `agentDeploySteps` so init-style examples can inject per-example prerequisite and post-init commands into the Copy prompt output.
- Extend `buildFullPrompt` / `example-detail.tsx` to emit an init-style flow (CLI auth check, prereqs, scaffold, deploy) for any example whose `initCommand` starts with `databricks apps init`. Clone-style examples render byte-identical output.

Template-side highlights (examples/rag-chat/template/):
- `appkit.plugins.json` declares Lakebase Postgres as a required resource so `databricks apps init` auto-resolves PGHOST / PGDATABASE / LAKEBASE_ENDPOINT into the scaffolded .env.
- `.env.tmpl` handles the remaining knobs (chat / embedding endpoints, numeric workspace id for AI Gateway).
- `databricks.yml` binds the bundle's `app` resource to the Lakebase branch / database via variables, and excludes `package-lock.json` from bundle uploads to avoid macOS-lockfile issues in the Apps Linux build container.
- `scripts/sync-bundle-vars.mjs` derives the bundle variable overrides from the generated .env + the Lakebase API before first `bundle deploy`, so the one-shot `npm run deploy` works on a cold start where the app does not yet exist.
- UX: retrieved sources render before the streaming assistant reply so users see retrieval grounding in real time.

Verified end-to-end via a cold start: teardown of prior Lakebase project + Databricks app, then fresh agent recipe (create Lakebase project, `databricks apps init`, patch .env workspace id, `npm install && npm run deploy`) reaches a live app with populated RAG corpus.

Tests: 87 pass, typecheck clean, build clean.
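The init-vs-clone branching described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual `buildFullPrompt` code: the `name` field, the `buildPromptSteps` helper, and the wording of the auth-check step are hypothetical; only `initCommand`, `agentPrereqSteps`, `agentDeploySteps`, and the `databricks apps init` prefix check come from the commit message.

```typescript
// Trimmed-down Example shape. Only initCommand, agentPrereqSteps and
// agentDeploySteps are named in the commit message; `name` is a placeholder.
interface Example {
  name: string;
  initCommand: string;
  agentPrereqSteps?: string[];
  agentDeploySteps?: string[];
}

const INIT_PREFIX = "databricks apps init";

// An example counts as init-style when its initCommand starts with the prefix.
function isInitStyle(example: Example): boolean {
  return example.initCommand.startsWith(INIT_PREFIX);
}

// Hypothetical helper: returns the ordered steps an agent prompt would list.
function buildPromptSteps(example: Example): string[] {
  if (!isInitStyle(example)) {
    // Clone-style examples get nothing injected, so their output is unchanged.
    return [example.initCommand];
  }
  return [
    "Verify Databricks CLI authentication", // assumed wording of the auth check
    ...(example.agentPrereqSteps ?? []),
    example.initCommand,
    ...(example.agentDeploySteps ?? []),
  ];
}
```

Keeping the clone branch as an early return is one way to make the "byte-identical output" guarantee easy to test: the new fields are never read on that path.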
…e diff

Two small follow-ups to the rag-chat example:

1. Rewrite the Lakebase prereq collision paragraph in the generated agent prompt. The old guidance told the agent to delete an existing project if the chosen id collided. Replaced with a defensive version: prefer picking a different PROJECT_ID; never delete an existing project without explicit user confirmation (Lakebase projects may hold data other apps depend on); and wait-and-retry for the control-plane eventual-consistency case a validation subagent hit after a fresh teardown.
2. Add a top-level .gitattributes marking examples/*/template/package-lock.json as linguist-generated, so GitHub collapses the ~15k-line lockfile in PR diffs and excludes it from language stats. The lockfile still ships in the repo because `databricks apps init` runs `npm ci` (confirmed against databricks/cli cmd/apps/init.go and libs/apps/initializer/nodejs.go), and is excluded from bundle uploads via databricks.yml sync.exclude so the macOS-generated lockfile never reaches the Linux build container.
LakebasePage.tsx and lakebase/todo-routes.ts were leftover scaffolding from the AppKit reference template we copied from. Neither is wired in (App.tsx only mounts ChatPage; server.ts only registers chat + chat-persistence routes), so they were dead code shipping to users.
Six fixes from code review:

1. Enforce chat ownership on read/write. chat-store getChatMessages and appendMessage now JOIN on user_id, and both chat route files fetch the chat via a new getChatForUser guard before operating. Previously any user with a chat UUID could read or write another user's messages.
2. Stop silently collapsing users into "local-dev-user" in production. Identity comes from x-forwarded-email (injected + stripped at the Databricks Apps gateway); when DATABRICKS_APP_NAME is set and the header is missing, we now 401. Local dev still falls back to local-dev-user. Extracted into server/lib/auth.ts.
3. Stream RAG sources once, not twice. Replaced the separate GET /api/chat/sources endpoint + client-side double-fetch with a createUIMessageStream that writes a data-sources part before the LLM tokens. The client reads sources directly off the assistant message's parts, which also fixes the sourcesMap-keyed-by-raw-text race the reviewer flagged. Transport swapped from TextStreamChatTransport to DefaultChatTransport.
4. Drop broken /resources/rag-chat-app-template link from the README.
5. Split the agent recipe: .env workspace-id patching is only needed for local `npm run dev`. Deploys get DATABRICKS_WORKSPACE_ID from the auto-injected runtime env and Lakebase vars from the bound postgres resource via app.yaml, so the deploy flow no longer asks the agent to sed the .env file. Matching split in the template README.
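The ownership guard from fix 1 can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions: the real chat-store runs SQL with a JOIN on user_id against Lakebase, whereas the `InMemoryChatStore` class, its `createChat` method, and the `Chat` / `Message` shapes below are hypothetical. Only the names `getChatForUser`, `getChatMessages`, and `appendMessage` come from the commit message.

```typescript
interface Chat {
  id: string;
  userId: string;
  title: string;
}

interface Message {
  chatId: string;
  role: "user" | "assistant";
  content: string;
}

// In-memory stand-in for the SQL-backed chat store (assumption).
class InMemoryChatStore {
  private chats = new Map<string, Chat>();
  private messages: Message[] = [];

  createChat(chat: Chat): void {
    this.chats.set(chat.id, chat);
  }

  // Returns the chat only when it exists AND belongs to userId.
  getChatForUser(chatId: string, userId: string): Chat | undefined {
    const chat = this.chats.get(chatId);
    return chat !== undefined && chat.userId === userId ? chat : undefined;
  }

  // Reads are gated on ownership; undefined means "not found for this user".
  getChatMessages(chatId: string, userId: string): Message[] | undefined {
    if (this.getChatForUser(chatId, userId) === undefined) return undefined;
    return this.messages.filter((m) => m.chatId === chatId);
  }

  // Writes are gated the same way; false means the write was rejected.
  appendMessage(userId: string, message: Message): boolean {
    if (this.getChatForUser(message.chatId, userId) === undefined) return false;
    this.messages.push(message);
    return true;
  }
}
```

Treating "wrong owner" and "no such chat" identically means a caller holding only a chat UUID cannot tell whether that chat exists.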
…guard

DATABRICKS_APP_NAME is pre-populated in .env by `databricks apps init`, so a presence check couldn't distinguish local dev from production and would have 401'd local `npm run dev`. DATABRICKS_CLIENT_ID is the service-principal credential auto-injected only at runtime, so it's the correct signal. Caught during end-to-end local validation.
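The identity rule after this fix can be sketched as a pure function. This is an illustration rather than the contents of server/lib/auth.ts: the `resolveUser` name, the `Identity` type, and passing headers and env as plain records are assumptions. The header name, the `DATABRICKS_CLIENT_ID` signal, the 401, and the `local-dev-user` fallback come from the commit messages.

```typescript
type Identity =
  | { ok: true; userId: string }
  | { ok: false; status: 401 };

// Hypothetical helper: decides who the caller is from request headers and
// process environment, both passed in explicitly so the rule is testable.
function resolveUser(
  headers: Record<string, string | undefined>,
  env: Record<string, string | undefined>,
): Identity {
  // The Databricks Apps gateway injects this header for authenticated users.
  const email = headers["x-forwarded-email"];
  if (email) return { ok: true, userId: email };

  // DATABRICKS_CLIENT_ID is only present at runtime in a deployed app, so a
  // missing header there is an error rather than a reason to fall back.
  if (env["DATABRICKS_CLIENT_ID"]) return { ok: false, status: 401 };

  // Local `npm run dev`: no gateway, no service principal.
  return { ok: true, userId: "local-dev-user" };
}
```

Note that `DATABRICKS_APP_NAME` is deliberately not consulted, since it is already present in the scaffolded .env during local development.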
- Render assistant responses as markdown with streamdown (animated={false}
for linear streaming), wired via @source + styles.css import
- Add DELETE /api/chats/:id with ownership enforcement and sidebar delete
UI that falls back to empty state when the active chat is removed
- ChatPage polish: autofocus + instant input clear on submit, auto-scroll
that sticks to bottom unless the user scrolls up, auto-sizing textarea
(Enter to send, Shift+Enter for newline) with horizontal overflow fix,
and a Radix ScrollArea display:table override so long chat titles no
longer push the delete button off-screen
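
The auto-scroll and Enter-key behaviour listed above reduce to two small decisions, sketched here as pure functions. The function names, the `ScrollMetrics` shape, and the 32px threshold are assumptions for illustration; the actual ChatPage code is not shown in this commit message.

```typescript
interface ScrollMetrics {
  scrollTop: number;    // pixels scrolled from the top
  clientHeight: number; // visible height of the scroll container
  scrollHeight: number; // total height of the content
}

// True when the viewport is at (or within thresholdPx of) the bottom, i.e.
// the user has not scrolled up and new tokens should keep the view pinned.
function isNearBottom(m: ScrollMetrics, thresholdPx = 32): boolean {
  return m.scrollHeight - m.scrollTop - m.clientHeight <= thresholdPx;
}

type KeyAction = "send" | "newline" | "ignore";

// Enter sends, Shift+Enter inserts a newline, every other key is left alone.
function keyAction(key: string, shiftKey: boolean): KeyAction {
  if (key !== "Enter") return "ignore";
  return shiftKey ? "newline" : "send";
}
```

In a component, `isNearBottom` would typically be evaluated in the scroll handler and its result consulted before scrolling to the bottom on each streamed update.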
Resolve conflict in src/lib/recipes/recipes.ts by keeping both the inventory-intelligence example (from main) and the rag-chat example (from this branch).
Summary

DevHub currently only surfaces `git clone`-based examples. I propose that we also support a `databricks apps init` flow. The PR includes one example (`rag-chat`) end-to-end. Treat the implementation as a starting point/proof of concept; I am happy to refactor or hand this off.

What's in the PR

Example registry + agent prompt plumbing (`src/lib/recipes/recipes.ts`, `src/lib/examples/build-example-markdown.ts`, `src/components/examples/example-detail.tsx`)
- `agentPrereqSteps` / `agentDeploySteps` fields on `Example`. The markdown builder and detail page inject them around the init command only when present, so every existing `git clone` example is rendered byte-identically.
- `tests/build-example-markdown.test.ts` covers the init vs. clone branches.

New `rag-chat` example (`examples/rag-chat/template/`)
- `.env.tmpl`, `appkit.plugins.json`, `databricks.yml`, `app.yaml`, server + client source, seeded pgvector retrieval.
- `scripts/sync-bundle-vars.mjs` bridges `databricks apps init` and `databricks bundle deploy` by deriving `postgres_branch` / `postgres_database` from the generated `.env` plus the Lakebase API and writing `variable-overrides.json`.
- `package-lock.json` is committed (needed because `databricks apps init` runs `npm ci`, confirmed in `databricks/cli` `cmd/apps/init.go` + `libs/apps/initializer/nodejs.go`) and excluded from bundle uploads via `databricks.yml`: `sync.exclude` so the macOS-generated lockfile never reaches the Linux build container.
- `.gitattributes` marks it as `linguist-generated` to collapse it in the PR view.

Agent recipe
- `rag-chat` guides an agent through CLI auth → Lakebase project creation → `databricks apps init` → `.env` patching (numeric `DATABRICKS_WORKSPACE_ID` from `unity-catalog/current-metastore-assignment`) → `npm run deploy`.

Open questions / alternatives
- `rag-chat` carries its own `sync-bundle-vars.mjs`, `agentPrereqSteps`, and lockfile decisions. For a second init-style example we'd probably want a shared `sync-bundle-vars` helper and typed agent-step helpers. Happy to factor that out now if preferred, or defer until a second example arrives.
- The lockfile is committed to satisfy `databricks apps init`'s `npm ci`. If we want a smaller diff we could (a) keep it out and accept that `apps init` fails until the user runs `npm install`, or (b) push upstream on `databricks/cli` to fall back to `npm install` when no lockfile is present. Current PR picks the path that keeps `apps init` working out of the box.

Test plan
- `npm run fmt && npm run typecheck && npm run build && npm run test` pass locally
- `rag-chat` example page and agent prompt
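
As a rough illustration of the bridging step that `scripts/sync-bundle-vars.mjs` is described as performing, the sketch below parses `.env` text and builds the overrides object. It is not the script itself: `parseDotEnv` and `buildVariableOverrides` are hypothetical names, the branch is passed in as a parameter standing in for the Lakebase API lookup, and mapping `PGDATABASE` directly to `postgres_database` is an assumption. Only the variable names `postgres_branch` / `postgres_database` and the `.env` keys come from the PR text.

```typescript
// Minimal .env parser: KEY=VALUE lines, `#` comments, optional quotes.
function parseDotEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip lines without a key
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const first = value.charAt(0);
    // Strip one pair of matching surrounding quotes, if present.
    if (value.length >= 2 && (first === '"' || first === "'") && value.endsWith(first)) {
      value = value.slice(1, -1);
    }
    out[key] = value;
  }
  return out;
}

interface VariableOverrides {
  postgres_branch: string;
  postgres_database: string;
}

// `branch` stands in for whatever the Lakebase API lookup returns (assumption).
function buildVariableOverrides(
  env: Record<string, string>,
  branch: string,
): VariableOverrides {
  const database = env["PGDATABASE"];
  if (!database) throw new Error("PGDATABASE is missing from .env");
  return { postgres_branch: branch, postgres_database: database };
}
```

The resulting object would then be serialized with `JSON.stringify` and written to `variable-overrides.json` before the first `bundle deploy`.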