
Spine

What if a novel had a git log?

Spine turns a manuscript into a navigable graph. Scenes are commits. Plotlines are branches. Characters are contributors with line-level blame. Subplots branch off main, weave back together at merges, or quietly die in the margins, visibly, like abandoned feature branches.

Built by Ufuk Karaca for Built with Opus 4.7 — a Claude Code hackathon (Cerebral Valley × Anthropic, virtual, April 2026).

The stack: Claude (Opus 4.7 / Sonnet 4.6 / Haiku 4.5) · Pioneer (Fastino GLiNER2) · Tavily.


The metaphor

Both fiction and code are causal sequences of decisions made by agents. The primitives map cleanly. A literary critic's mental model and a senior engineer's mental model are the same shape.

| Git primitive | Spine equivalent |
| --- | --- |
| Commit | Scene — the atomic narrative unit |
| Author / blame | Character involvement, weighted by speaking and agency |
| Branch | Plotline / subplot — a divergent thread from the main arc |
| Merge | Plot convergence — when two threads return to the same scene |
| Abandoned branch | Dropped subplot — threads that never resolve |
| HEAD / main | The dominant arc (usually the protagonist's) |
| Diff | Scene-to-scene change in narrative state |

This is not a metaphor bolted on for a hackathon. It is the load-bearing insight. The whole product follows from it.


How Claude shows up in the product

Spine isn't a basic Claude integration. Every layer of the pipeline maps to a different Claude tier on purpose, and uses Claude features that wouldn't make sense with another model.

| Tier | Used for | Why this tier |
| --- | --- | --- |
| Opus 4.7 with adaptive thinking | Spine Agent loop, editorial brief, Ask synthesis, character voice chat, contradiction scan | 1M-token context holds the whole manuscript; adaptive thinking lets the model weigh evidence visibly, with output_config.effort: 'high' |
| Sonnet 4.6 | Plotline naming, editorial diff insights, voice / audience / comparable-titles passes | Balanced cost / depth for mid-weight reasoning |
| Haiku 4.5 with prompt caching | Per-scene extraction, per-mention character disambiguation, ask-triage, chapter-segment fallback | Identical system prompt across thousands of calls — cache makes ingest ~10× cheaper |

Every Claude call goes through lib/ai/claude.ts → callClaude(template, input). Each call uses tool-use JSON — a single emit_result tool whose input schema is auto-derived from a Zod schema, so the response is structurally guaranteed before validation. No fragile fence parsing.
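The extraction step can be sketched like this — a minimal, illustrative version of pulling the emit_result payload out of a response. (Shapes here are hypothetical simplifications; the real code auto-derives the tool's JSON schema from a Zod schema and validates the payload with it afterwards.)

```typescript
// Simplified content-block shapes standing in for the SDK's types.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: unknown };

// Pull the structured payload out of a Claude response. Because the model
// is forced to answer via the emit_result tool, the payload is already
// JSON by construction -- no fence parsing needed.
function extractEmitResult(content: ContentBlock[]): unknown {
  const block = content.find(
    (b): b is Extract<ContentBlock, { type: "tool_use" }> =>
      b.type === "tool_use" && b.name === "emit_result",
  );
  if (!block) throw new Error("model did not call emit_result");
  return block.input; // in the real code this crosses a Zod schema next
}
```

The payoff is that a malformed response fails loudly at one choke point instead of leaking half-parsed JSON downstream.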

The agentic surface (lib/ai/agent.ts) is a custom multi-turn loop on top of client.messages.create with 12 tools: 8 read-only book tools (find_scenes, get_scene, character_arc, list_plotlines, plotline_scenes, chapter_summary, co_occurrence, list_characters) and 4 UI-control tools (highlight_character, focus_plotline, open_scene, clear_filters) that emit SSE ui_directive events so the model literally drives the workspace as it answers.
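The dispatch step of that loop can be sketched as follows — a pure, illustrative version of routing a tool call either to a read-only handler or out to the browser as an SSE ui_directive event (names of the UI tools are from the list above; everything else is a simplification of lib/ai/agent.ts):

```typescript
type ToolCall = { name: string; input: Record<string, unknown> };
type SseEvent = { event: string; data: unknown };

// The four UI-control tools never touch the database -- they are forwarded
// to the browser, which applies them to the workspace.
const UI_TOOLS = new Set([
  "highlight_character", "focus_plotline", "open_scene", "clear_filters",
]);

function dispatchTool(
  call: ToolCall,
  handlers: Record<string, (input: Record<string, unknown>) => unknown>,
  emit: (e: SseEvent) => void,
): unknown {
  if (UI_TOOLS.has(call.name)) {
    emit({ event: "ui_directive", data: call }); // browser reacts
    return { ok: true };                         // model just gets an ack
  }
  const handler = handlers[call.name];
  if (!handler) throw new Error(`unknown tool: ${call.name}`);
  return handler(call.input); // read-only book tools return data to the model
}
```

This split is what lets the model "drive the workspace": a UI tool call costs the model one turn but moves the page the user is looking at.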

Caching is double-layered. Anthropic prompt caching reduces server-side input tokens. Disk caching at data/.llm-cache/ (committed to the repo) means a clone-and-run hits warm.


Spine, by the numbers

Real counts from this repo, today.

| Metric | Value |
| --- | --- |
| Lines of TypeScript / TSX (lib + components + app) | 25,890 |
| TS / TSX files | 106 |
| Node + Python pipeline scripts | 8 |
| Prompt templates (Zod-typed PromptTemplate records under lib/ai/prompts/) | 16 |
| — on Haiku 4.5 (ingest hot path) | 6 |
| — on Sonnet 4.6 (mid-weight reasoning) | 6 |
| — on Opus 4.7 with adaptive thinking | 4 |
| Spine Agent tools (Opus 4.7 loop) | 12 (8 read-only + 4 UI-control) |
| Cached Claude responses committed for warm-clone (data/.llm-cache/) | 6,655 |
| Cached Tavily web-research responses (data/.tavily-cache/) | 13 |
| Pre-ingested manuscripts in data/spine-demo.db | 2 |
| — Pride & Prejudice — words / scenes / characters / plotlines | 130,415 / 321 / 53 / 98 |
| — War & Peace — words / scenes / characters / plotlines | 566,333 / 1,538 / 489 / 715 |
| Total scene→character involvements (the "blame" graph) | 7,860 |
| Token footprint of the full P&P manuscript at agent start | ~165k (fits in 1M context) |
| Token footprint of the full W&P manuscript at agent start | ~580k (fits in 1M context) |

Adaptive thinking is the Opus 4.7 default on synthesis surfaces. Manual thinking: { type: 'enabled', budget_tokens: N } is rejected on 4.7; the codebase uses thinking: { type: 'adaptive', display: 'summarized' } paired with output_config: { effort: 'high' }.

Token totals across the full build are not retained because the disk cache is keyed by application output (parsed-and-validated tool input), not API response metadata, by design — model-independent caches survive a tier upgrade. The agent loop itself does emit per-turn usage over SSE for the live UI.


Features

Workspace IDE (/p/[bookId]/read)

A six-route editorial environment.

| Route | Surface |
| --- | --- |
| / | Landing — what Spine is, the metaphor, demo books |
| /workspace | Project list, upload, recent activity |
| /p/[bookId] | Overview — stats strip, editorial letter, spotlight scene, manuscript versions, intern insights teaser |
| /p/[bookId]/draft | Manuscript diff — clip-by-chapter slider, structural diff, Claude prose-level insights |
| /p/[bookId]/read | Full workspace — tension ribbon · manuscript prose · AskDock · overlays |
| /p/[bookId]/chat | Chat with characters in their own voice — 1:1 and group chats grounded on each character's scenes |

The reader is a single canvas: the TensionRibbon spine of the book runs across the top, color-coded by plotline, scroll-coupled to the visible scene via IntersectionObserver. Prose centers down the middle with inline character-mention chips that link to the cast drawer. The AskDock sits as a bottom pill, types directly into the manuscript, and the model's UI-control tool calls (highlight_character, focus_plotline, open_scene) move the page as the answer streams in.

URL state is the cursor. ?character=, ?plotline=, ?scene=, ?cast= are read and written by AskDock and the overlays so every selection is a stable, shareable link.
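A minimal sketch of the URL-as-cursor pattern, assuming only the param names listed above (the helper itself is illustrative, not the app's actual code):

```typescript
type Selection = { character?: string; plotline?: string; scene?: string };

// Write the current selection into a query string. Clearing a filter
// deletes its param, so the URL never accumulates stale state.
function writeSelection(search: string, sel: Selection): string {
  const params = new URLSearchParams(search);
  for (const key of ["character", "plotline", "scene"] as const) {
    const value = sel[key];
    if (value) params.set(key, value);
    else params.delete(key);
  }
  return params.toString();
}

// Read it back -- any component (AskDock, overlays) can reconstruct the
// full selection from the URL alone, which is what makes links shareable.
function readSelection(search: string): Selection {
  const params = new URLSearchParams(search);
  return {
    character: params.get("character") ?? undefined,
    plotline: params.get("plotline") ?? undefined,
    scene: params.get("scene") ?? undefined,
  };
}
```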

Overlays summoned by keys:

/  Ask              c  Cast (list + network)
g  Plot graph       i  Insights
⌘K Search           ?  Help

Ask — Claude + Tavily research

POST /api/ask/[bookId] runs a two-pass chain:

  1. Claude Haiku triage — decides whether the question needs real-world research.
  2. Tavily (if warranted) — external citation search, results grounded in scene text.
  3. Claude Opus 4.7 synthesis with adaptive thinking — weaves book scenes with external sources into a single answer.
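The chain above can be sketched as an orchestration skeleton. The triage / search / synthesize steps are injected so only the control flow is shown; in the real route they call Haiku, Tavily, and Opus respectively (this is a sketch of the described flow, not the actual handler):

```typescript
type Source = { url: string; snippet: string };
type AskDeps = {
  triage: (q: string) => Promise<{ needsResearch: boolean }>;   // pass 1: Haiku
  search: (q: string) => Promise<Source[]>;                      // Tavily
  synthesize: (q: string, sources: Source[]) => Promise<string>; // pass 2: Opus
};

async function ask(question: string, deps: AskDeps) {
  const { needsResearch } = await deps.triage(question);
  // Tavily is only called when triage says the question needs grounding.
  const sources = needsResearch ? await deps.search(question) : [];
  const answer = await deps.synthesize(question, sources);
  return { answer, sources }; // non-empty sources power the "via tavily" badge
}
```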

The UI renders external sources under the answer with a via tavily badge. Provenance is always visible.

Example routing:

  • "How does Pemberley's architecture match Derbyshire of 1813?" — routes to Tavily, cites 5 external sources.
  • "What does Lizzy think of Darcy?" — book-only, no Tavily call.

Cold research call ≈ 12s. Cache-warm (committed at data/.tavily-cache/) ≈ 80ms.

Editorial brief — the Spine Agent

POST /api/agent/[bookId] with {"mode":"editor-brief"} runs the Opus 4.7 agent loop end-to-end on the whole manuscript. The model calls the 8 read-only book tools to verify each claim against the literal scene text before writing it. SSE streams thinking, tool_use, tool_result, text, and final done events so the UI shows the model reading the book in real time. Cold ≈ 2 minutes, warm ≈ 5–10s via the disk cache. Output is a structured editorial letter, the kind a senior dev editor would hand back.

Plot graph (full-screen overlay via g)

  • Lane chart with frozen plotline labels — labels stay visible when scrolling
  • Scene nodes sized by importance; branch / merge / abandon markers
  • Character filter ribbons — click to filter, hover to highlight that character's arc across all lanes
  • Hover indicator line that snaps to the closest plotline at cursor x — like an NLE playhead
  • Orphan badge flags a thread that never connects to the main tree; unresolved badge flags one that trails off
  • War & Peace render cap: top-200 scenes by importance (1,538 total remain queryable via Ask and search)

Tension ribbon and cast network

Two compact graph lenses inside the workspace: the TensionRibbon at the top of every read view (importance over ordinal, color-coded per plotline) and the CastNetwork in the cast drawer — a D3 force-directed graph of the top characters by centrality, edges weighted by shared-scene co-occurrence.
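Shared-scene co-occurrence weighting can be sketched as follows — an illustrative pure function, not the actual graph builder: each scene contributes one count to every pair of characters it contains, and the resulting counts become the edge weights.

```typescript
// scenes: each entry is the list of character IDs appearing in one scene.
// Returns a map from "a|b" (sorted pair) to the number of shared scenes.
function coOccurrence(scenes: string[][]): Map<string, number> {
  const edges = new Map<string, number>();
  for (const cast of scenes) {
    const unique = [...new Set(cast)].sort(); // dedupe + canonical pair order
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const key = `${unique[i]}|${unique[j]}`;
        edges.set(key, (edges.get(key) ?? 0) + 1);
      }
    }
  }
  return edges;
}
```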

Search palette (⌘K)

Pure-SQL search across scenes, plotlines, characters, places, and chapters. Alias-aware: "lizzy" resolves to Elizabeth Bennet at score 1.0. No LLM on the hot path.
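The alias-resolution behavior ("lizzy" → Elizabeth Bennet at score 1.0) can be sketched as a pure function. This is illustrative only — the real lookup is a SQL query over an alias table — and the prefix-match fallback score here is an assumption:

```typescript
type Character = { id: string; name: string; aliases: string[] };

function resolveAlias(
  query: string,
  cast: Character[],
): { id: string; score: number } | null {
  const q = query.trim().toLowerCase();
  // Exact alias or canonical-name hit scores 1.0.
  for (const c of cast) {
    if (c.name.toLowerCase() === q || c.aliases.some((a) => a.toLowerCase() === q)) {
      return { id: c.id, score: 1.0 };
    }
  }
  // Prefix match as a weaker fallback (score here is illustrative).
  for (const c of cast) {
    if (c.name.toLowerCase().startsWith(q)) return { id: c.id, score: 0.5 };
  }
  return null;
}
```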

Pioneer NER — character disambiguation

Pioneer (Fastino's GLiNER2) runs at ingest to cluster aliases: "Lizzy", "Miss Bennet", "Eliza" all collapse to pride-and-prejudice-elizabeth-bennet. The client supports four modes: local (the default; Python sidecar holding the model in memory across the Next.js worker's lifetime), fine-tuned (Fastino HTTP), zero-shot (Hugging Face Inference), and disabled — with a kill-switch that auto-disables after 3 consecutive failures and falls back to Claude Haiku disambiguation. Attribution surfaces in the UI as resolved by pioneer · gliner2 ner under alias chips on every character sheet.

Editorial intern insights — Claude + Tavily

POST /api/insights/[bookId] produces a structured pre-editorial read: comparable titles (Tavily-sourced — Longbourn by Jo Baker, etc.), voice fingerprint ("reminiscent of Henry James"), pacing assessment, audience demographic guess, and a contradiction scan. The contradiction subsection runs on Opus 4.7 with adaptive thinking so it can hold many scenes in mind and cite specific scene IDs for each contradiction. Cached at data/.insights-cache/. Surfaces as a teaser on the project overview and a full panel inside the Insights overlay.

Chat with characters — 1:1 and group, in voice

/p/[bookId]/chat opens a conversation menu. Pick any one or more characters from the cast — the backend loads each character's top-importance scenes and a quote-bearing dialogue excerpt, locks the prompt to first-person era-appropriate voice, and (in group mode) sequences responses so the second character literally sees the first character's reply. Powered by Opus 4.7 with adaptive thinking so the model thinks through the character's emotional state before responding.

Demo prompts that land:

  • Lizzy on Hunsford: "His manner was so presumptuous, and his words so little calculated to win affection…"
  • Lizzy + Darcy, day after the proposal: both reply in voice; they address each other.

Demo books

| Book | Scenes | Characters | Plotlines |
| --- | --- | --- | --- |
| Pride & Prejudice | 321 | 53 | 98 |
| War & Peace | 1,538 | 489 | 715 |

Both ship pre-ingested. No API key needed to browse them.

Source & copyright. Both demo manuscripts are sourced from Project Gutenberg, which preserves and distributes works in the public domain. Pride and Prejudice (Jane Austen, 1813) and War and Peace (Leo Tolstoy, 1869, English translation by Aylmer & Louise Maude) are unambiguously public domain in the United States and most jurisdictions. Spine ships only the source .txt files at data/books/ and structural data derived from them (scene boundaries, character mentions, plotline clusters); no copyrighted content of any kind is included.

If you ingest your own manuscript, you retain all rights — Spine writes only to your local data/spine.db (gitignored at the file level) and data/.llm-cache/. Don't commit a private manuscript without checking your .gitignore.


The stack

Claude — extraction, synthesis, voice, agent

lib/ai/claude.ts · lib/ai/agent.ts · lib/ai/prompts/ · app/api/agent/[bookId]/route.ts · app/api/ask/[bookId]/route.ts · app/api/chat/[bookId]/route.ts · app/api/insights/[bookId]/route.ts · app/api/diff/route.ts

Three Claude tiers do different jobs:

  • Haiku 4.5 with prompt caching drives the ingest hot path: chapter split, scene segmentation, event and entity extraction, character disambiguation, ask-triage. The system block is identical across thousands of calls; the cache cuts input tokens ~10×.
  • Sonnet 4.6 drives mid-weight reasoning: plotline naming, editorial diff insights, voice / audience / comparable-titles passes for the intern report.
  • Opus 4.7 with adaptive thinking drives synthesis and the agent loop: editorial brief, Ask answer composition, character voice chat, contradiction scan in the intern report.

Every call uses tool-use JSON for guaranteed-shape output, then crosses a Zod schema for runtime validation, with one retry-on-parse-failure path. Disk-cached under data/.llm-cache/ keyed by (system + user + schema-shape) (model excluded — upgrading from Sonnet to Opus on a synthesis path doesn't cold-start the cache).
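A minimal sketch of that model-independent key, assuming the inputs named above (the exact field list and serialization are illustrative):

```typescript
import { createHash } from "node:crypto";

// Key on what the call *means* (system prompt, user input, schema shape),
// not on the model ID -- so swapping Sonnet for Opus on a path reuses the
// same cache entries instead of cold-starting.
function cacheKey(system: string, user: string, schemaShape: string): string {
  return createHash("sha256")
    .update(JSON.stringify([system, user, schemaShape]))
    .digest("hex");
}
```

Because the model is excluded, cacheKey(...) is identical before and after a tier upgrade; changing the prompt or the output schema is what invalidates an entry.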

Tavily — agentic research, only when warranted

lib/ai/tavily.ts · lib/ai/prompts/ask-triage.ts · lib/ai/prompts/ask-synthesize.ts · components/AskExternalCitations.tsx

Tavily is invoked only when Claude triage decides the question needs real-world grounding. The result is woven into the Opus 4.7 synthesis pass and surfaced with explicit via tavily provenance in the UI. Research questions get external citations; in-book questions stay book-only. Both paths share the same POST /api/ask/[bookId] endpoint.

Pioneer / Fastino — GLiNER2 NER

lib/ai/pioneer.ts · scripts/bench-disambiguation.ts · scripts/pioneer/serve.py · scripts/pioneer/train.py · components/CharacterSheet.tsx

Pioneer (Fastino's GLiNER2) is the character-disambiguation backbone at ingest. Four modes: local (Python sidecar holding urchade/gliner_multi-v2.1 in memory across the Next.js worker's lifetime), fine-tuned (Fastino HTTP), zero-shot (Hugging Face Inference), and disabled. The client auto-falls-back after 3 HTTP failures.

The local mode is the production default. First request pays the ~13s warmup; every subsequent call is ~80ms — about 6.6× faster than the equivalent Claude Haiku disambiguation round-trip, with no API cost. When the local model says null, the request falls through to Claude.
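The kill-switch-plus-fallthrough behavior can be sketched like this (class and method names are illustrative; the real logic lives in lib/ai/pioneer.ts): a thrown error counts toward the kill-switch, while a null answer is a normal miss that falls through to Claude.

```typescript
class KillSwitchedResolver {
  private failures = 0;
  private disabled = false;

  constructor(
    private primary: (mention: string) => string | null, // Pioneer
    private fallback: (mention: string) => string,       // Claude Haiku
    private maxFailures = 3,
  ) {}

  resolve(mention: string): string {
    if (!this.disabled) {
      try {
        const hit = this.primary(mention);
        this.failures = 0; // any successful round-trip resets the counter
        // null is not a failure -- the model just declined to answer
        if (hit !== null) return hit;
      } catch {
        // consecutive hard failures trip the kill-switch permanently
        if (++this.failures >= this.maxFailures) this.disabled = true;
      }
    }
    return this.fallback(mention);
  }
}
```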


Quick start

Local Node (development)

git clone https://github.com/ufukkaraca/spine.git
cd spine
nvm use                              # Node 20 (pinned in .nvmrc)
npm install
cp data/spine-demo.db data/spine.db  # seed runtime DB from committed snapshot
cp .env.local.example .env.local     # fill in ANTHROPIC_API_KEY to enable Ask + insights
npm run dev

Open:

  • http://localhost:3000/ — landing
  • http://localhost:3000/workspace — project list
  • http://localhost:3000/p/pride-and-prejudice/read — workspace IDE
  • http://localhost:3000/p/war-and-peace/read — stress test (1,538 scenes)
  • http://localhost:3000/p/pride-and-prejudice/chat — chat with Lizzy + Darcy

No API key needed for read-only browsing. Both demo books ship pre-ingested in data/spine-demo.db, which is committed to the repo. Claude and Tavily caches are also committed (data/.llm-cache/, data/.tavily-cache/), so every Ask answer that ships in the demo is warm.

Local Docker

docker build -t spine . && docker run --rm -p 3000:3000 spine

The image (~395 MB) bakes in both demo books and all caches, and requires no env vars. Add --env-file .env.local to enable live Claude + Tavily calls.

Environment variables

Copy .env.local.example to .env.local and fill in:

| Variable | Required for | Notes |
| --- | --- | --- |
| ANTHROPIC_API_KEY | Spine Agent, Ask, ingest, diff insights, editorial letter, chat | Get a key at https://console.anthropic.com/settings/keys |
| TAVILY_API_KEY | Agentic research in Ask | Optional — without it, Ask stays book-only |
| PIONEER_API_KEY | Pioneer fine-tuned disambiguation | Optional — ingest falls back to Claude Haiku |
| SPINE_DB_PATH | — | Default ./data/spine.db |

Deploy

Full guide (Dokploy walkthrough, Vercel, Render, Fly.io, troubleshooting) in docs/DEPLOY.md.

The Docker image is self-contained — it boots to a live demo at http://localhost:3000 with no env vars. The seeded data/spine-demo.db is renamed to data/spine.db at image build time, and all warm caches are baked in.

docker build -t spine . && docker run --rm -p 3000:3000 spine

Dokploy: Create Application → Dockerfile → port 3000. Source the repo (ufukkaraca/spine, branch main). No persistent volume needed — the DB is read-only.


Architecture

Next.js 15 (App Router) + React 19 + Tailwind + TypeScript strict
│
├── app/
│   ├── page.tsx                     Landing
│   ├── workspace/page.tsx           Project list / upload / activity
│   ├── p/[bookId]/{page,read,draft,chat} Project routes: overview, read, draft, chat
│   └── api/
│       ├── agent/[bookId]           Spine Agent loop (SSE) — Opus 4.7 + 12 tools
│       ├── ask/[bookId]             Claude triage → Tavily → Opus synthesis
│       ├── chat/[bookId]            Opus 4.7 character voice + adaptive thinking
│       ├── insights/[bookId]        5 sub-passes; contradictions on Opus
│       ├── graph/[bookId]           BookGraph JSON (scenes, plotlines, chars)
│       ├── diff                     Structural diff + Sonnet prose insights
│       ├── projects/[bookId]/letter Editorial letter (pure-TS, no LLM call)
│       ├── search                   ⌘K — pure-SQL, alias-aware
│       ├── ingest                   .txt / .md / .docx → pipeline
│       └── annotations/[bookId]/…   Threaded editorial annotations
│
├── components/
│   ├── TensionRibbon.tsx            Scroll-coupled spine of the book
│   ├── BookGraph.tsx                D3 plot graph, lanes, ribbons, markers
│   ├── overlays/{CastDrawer,GraphExpansion,InsightsOverlay,ShortcutHelp,…}
│   ├── AskDock.tsx                  Bottom-pill Spine Agent UI (SSE)
│   ├── ReaderView.tsx               Manuscript prose + inline mention chips
│   └── …
│
├── lib/
│   ├── types.ts                     Single source of truth for shared types
│   ├── ai/
│   │   ├── claude.ts                @anthropic-ai/sdk wrapper — caching, thinking, tool use
│   │   ├── agent.ts                 Spine Agent multi-turn loop, 12 tools, SSE events
│   │   ├── pioneer.ts               GLiNER2 NER (4 modes, kill-switched)
│   │   ├── tavily.ts                Web research client
│   │   └── prompts/                 16 Zod-typed PromptTemplate records
│   ├── ingest/{parse,extract,classify,formats}   Text → scenes → entities
│   ├── graph/{build,plotlines,merges,diff}       Graph construction
│   └── db/{schema.sql,client.ts}    better-sqlite3, raw SQL, prepared statements
│
├── data/
│   ├── books/                       Public-domain source texts (Gutenberg)
│   ├── spine-demo.db                Pre-seeded SQLite (committed; 17 MB, 2 books)
│   ├── .llm-cache/                  Committed warm Claude cache (6,655 entries)
│   ├── .tavily-cache/               Committed warm Tavily cache (13 entries)
│   └── .insights-cache/             Committed warm Insights cache
│
└── scripts/
    ├── ingest-book.ts               CLI ingest
    ├── seed-demo.ts                 Pre-process both demo books
    ├── pioneer/{serve,train}.py     Local GLiNER2 sidecar + training entry
    └── bench-disambiguation.ts      Pioneer vs Claude benchmark

  • Storage: better-sqlite3, raw SQL, no ORM. Single SQLite file at data/spine.db (seeded from committed data/spine-demo.db).
  • Type boundary: every LLM response crosses a Zod schema; tool-use JSON guarantees structural validity, and parse failures retry automatically.
  • Caches: SHA-256 keyed (model-independent), committed to the repo so a fresh clone hits warm.
  • Fonts: self-hosted via next/font/local (no Google CDN at build).
  • Node: 20 (pinned in .nvmrc); tsx in devDeps for scripts.


Limitations and known issues

  • Pioneer fine-tuned and zero-shot HTTP endpoints are unreachable in the current Fastino state (expired SSL + DEPLOYMENT_NOT_FOUND from Vercel). The local mode (in-process GLiNER2 via Python sidecar) is the production default and the path used by both demo books. Runtime auto-falls-back to Claude Haiku, then to local fuzzy-match if Claude is also unavailable.
  • W&P minor character clustering is weaker on the long tail — 489 characters includes unmerged minor variants. Frontend hides this by filtering to top-N by centrality on the rails.
  • Cold Tavily-research call ≈ 12s (Claude triage + Tavily + Opus synthesis with adaptive thinking). Cache-warm ≈ 80ms. Committed cache warms the demo examples.
  • Cold editorial-brief on War & Peace ≈ $3 because of 580k input tokens at the Opus 4.7 rate. Pride & Prejudice cold ≈ $0.80. Warm ≈ free (disk cache).

Built with Opus 4.7


License

MIT — 2026 Ufuk Karaca. See LICENSE.
