Gloss

Persistent memory and recall for Claude Code. Every conversation compounds.

Why

Claude throws 40 things at you and there's no way to respond bullet by bullet. You can't highlight the 3 that matter, push back on the 1 you disagree with, and ignore the rest — not in a terminal. And when context compaction kicks in, the good stuff vanishes. Next turn, Claude doesn't remember what it said. Next session, you don't remember which conversation it happened in.

Gloss fixes both problems. It gives you the full uncompacted record of every Claude Code session across every project, with tools to curate (highlight, comment, tag) and recall (semantic search, MCP tools) — so the knowledge compounds instead of evaporating.

Highlights are a response mechanism. Select the parts that matter, annotate the parts you disagree with, tag decisions as settled. Those curated signals feed back into Claude as structured context — not next session, but this session, via the MCP server.

Semantic search is cross-project recall. Ask "what did we decide about the DuckDB architecture?" and Gloss searches 12,000+ conversations using hybrid FTS + vector embeddings. Claude finds the answer itself, mid-conversation, without you copy-pasting from old sessions.

The core loop: Claude generates → you curate in Gloss → curation feeds back to Claude → Claude generates better. Every conversation makes every future conversation — and the current one — smarter.

Quick Start

bun install
bun src/cli.ts serve

Opens http://localhost:3456 — all your conversations from ~/.claude/projects/ are discovered automatically, indexed in SQLite, and available to browse. On first launch, Gloss will start building the full-text and vector search indexes in the background.

Features

Server (default mode)

  • Multi-session browsing at localhost:3456/c/<session-id>
  • Index page with search, Recent/By-project views, project filter, and min-turns filter
  • Live updates via WebSocket — new turns appear as the JSONL grows
  • Annotation API — highlights persist in SQLite, sync across tabs
  • Session discovery — scans ~/.claude/projects/ on startup, rescans periodically with adaptive backoff
  • Semantic search — hybrid FTS + vector search with AI-powered answers at /ask
  • Copy resume — one-click copy of claude --resume <uuid> from the index page

Semantic Search (Ask)

Type a natural language question in the search bar on the index page. Gloss uses a three-stage retrieval pipeline:

1. Retrieval (FTS + Vector, ~50ms)

  • FTS5 full-text search — per-token queries ensure each concept gets proper representation instead of being drowned by generic words in a combined OR query
  • Vector similarity — 256-dimensional Snowflake Arctic Embed embeddings, cosine similarity search across all indexed turns
  • Metadata matching — project names and session titles
  • RRF fusion (k=60) — combines all ranking signals fairly so no single retriever dominates
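
RRF can be sketched in a few lines. This is a hypothetical TypeScript helper, not Gloss's actual implementation:

```typescript
// Reciprocal Rank Fusion: each retriever contributes 1 / (k + rank)
// per item; summing across retrievers rewards items that rank well in
// several lists, so no single retriever can dominate the final order.
function rrfFuse(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return scores;
}

// "s1" is ranked well everywhere; "s2" tops only one list.
const fused = rrfFuse([
  ["s2", "s1", "s3"], // FTS ranking
  ["s1", "s3", "s2"], // vector ranking
  ["s1", "s2"],       // metadata ranking
]);
const top = [...fused.entries()].sort((a, b) => b[1] - a[1])[0][0]; // "s1"
```

With k=60, consistent mid-list placement ("s1") beats a single first-place finish ("s2"), which is the fairness property the pipeline relies on.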

2. Context assembly

Top-ranked sessions are loaded and the most relevant turns are extracted with surrounding context windows. This produces a focused evidence set for the LLM.
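
Window extraction might look like this (a hypothetical sketch; the radius and merging behavior are assumptions, not Gloss's actual parameters):

```typescript
// Expand each matched turn index into a [start, end] window of
// surrounding turns, then merge overlapping or adjacent windows so
// the same turn is never included in the evidence set twice.
function contextWindows(
  matched: number[],
  totalTurns: number,
  radius = 1,
): [number, number][] {
  const windows = matched
    .map((i): [number, number] => [
      Math.max(0, i - radius),
      Math.min(totalTurns - 1, i + radius),
    ])
    .sort((a, b) => a[0] - b[0]);
  const merged: [number, number][] = [];
  for (const w of windows) {
    const last = merged[merged.length - 1];
    if (last && w[0] <= last[1] + 1) last[1] = Math.max(last[1], w[1]);
    else merged.push([w[0], w[1]]);
  }
  return merged;
}
```

For matched turns 2, 3, and 10 in a 20-turn session, this yields two evidence spans: turns 1-4 and 9-11.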

3. Answer synthesis (~5-15s)

Claude Haiku reads the evidence and generates a direct answer with numbered source citations. The answer streams in real-time. Source cards below the answer link directly to the referenced turns in their conversations.

Vector indexing

Embeddings are generated locally using @huggingface/transformers — no API calls, no external services. The model runs in a subprocess to avoid blocking the server.

First-run expectations:

  • Model download: ~100MB on first launch (cached after that)
  • Indexing speed: ~50 sessions/minute depending on conversation length
  • A typical collection of ~800 sessions takes 15-20 minutes to fully index
  • Indexing runs in the background — the server is usable immediately, search quality improves as more sessions get indexed

What gets indexed:

  • Sessions with 3+ turns by default (configurable via Settings > Min turns on the index page)
  • Files between 10KB and 50MB
  • Each turn is truncated to 2,000 characters before embedding
  • Embeddings are stored as 1KB BLOBs in SQLite (256 × float32)
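
Packing and unpacking that BLOB is nearly a one-liner with typed arrays: a minimal sketch, not Gloss's actual storage code:

```typescript
// 256 float32 values occupy exactly 256 * 4 = 1024 bytes.
function packEmbedding(vec: number[]): Uint8Array {
  return new Uint8Array(new Float32Array(vec).buffer);
}

function unpackEmbedding(blob: Uint8Array): Float32Array {
  return new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4);
}

const embedding = Array.from({ length: 256 }, (_, i) => i / 256);
const blob = packEmbedding(embedding); // 1024-byte BLOB for SQLite
const restoredVec = unpackEmbedding(blob);
```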

Recommendation: Set the min turns filter to 5-7 if you have many short test/debug sessions. This avoids wasting indexing time on throwaway conversations and keeps the vector index focused on substantive sessions. You can change this in Settings on the index page — it applies to both the visible session list and what gets vectorized.

Disabling embeddings:

bun src/cli.ts serve --no-embeddings    # Skip vector indexing entirely

FTS search still works without embeddings. Vector search adds recall for semantic/synonym queries (e.g., finding "database" when the conversation says "SQLite") but FTS handles exact keyword matches well on its own.
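
The vector side boils down to cosine similarity between the query embedding and each stored turn embedding. A generic sketch:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1].
// 1 means identical direction; 0 means orthogonal (unrelated).
function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

This is why "database" can match a turn that only says "SQLite": their embeddings point in similar directions even though the tokens never overlap.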

Viewer

  • Dark/light mode (follows system preference)
  • Collapsible tool calls, results, and thinking blocks
  • Toggle checkboxes for tools, thinking, and tags/kinds
  • Rendered markdown tables, code blocks, inline formatting
  • Clickable file paths and URLs
  • Slash commands shown as styled pills
  • Session continuations as expandable dividers

Annotations

  • Select text and press h to highlight
  • Add comments, assign kinds (decision, bug, constraint, todo, question, insight)
  • Tag highlights for organization
  • Three-tier restore: precise (char offsets) > fuzzy (prefix/suffix) > legacy (text search)
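
The tiers degrade gracefully as the underlying text drifts. A simplified sketch (hypothetical shape, not Gloss's actual schema):

```typescript
interface StoredHighlight {
  start: number;  // original character offsets
  end: number;
  quote: string;  // the highlighted text
  prefix: string; // text immediately before the quote
}

// Tier 1: stored offsets still match. Tier 2: relocate the quote via
// its prefix context. Tier 3: plain text search. Null if it's gone.
function restoreOffsets(text: string, h: StoredHighlight): [number, number] | null {
  if (text.slice(h.start, h.end) === h.quote) return [h.start, h.end];
  const ctx = text.indexOf(h.prefix + h.quote);
  if (ctx !== -1) {
    const start = ctx + h.prefix.length;
    return [start, start + h.quote.length];
  }
  const i = text.indexOf(h.quote);
  return i === -1 ? null : [i, i + h.quote.length];
}

// Two chars were inserted before the quote: the precise tier fails,
// and the fuzzy tier relocates "world" to [8, 13].
const moved = restoreOffsets("X hello world foo", {
  start: 6, end: 11, quote: "world", prefix: "hello ",
});
```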

Export formats

| Format | What | Use case |
| --- | --- | --- |
| For Claude | XML `<context_bundle>` with `<highlight>`, `<trigger>`, `<quote>`, `<note>` | Paste into Claude Code to give it context from a previous session |
| Markdown | Numbered list with speaker, timestamp, quoted text, and comments | Documentation, notes, sharing |
| JSONL Slice | Full turn text for annotated exchanges + their conversation partner | Raw material for further processing |
| Download | Raw annotations JSON with all metadata and offsets | Backup, portability |

Static export

Self-contained HTML files with CSS/JS inlined — works via file:// with no server needed.

bun src/cli.ts export <session.jsonl>
bun src/cli.ts export --no-tools --no-thinking <session.jsonl>
bun src/cli.ts export -o output.html <session.jsonl>

CLI

bun src/cli.ts serve                       # Start the server (default)
bun src/cli.ts serve --port 8080           # Custom port
bun src/cli.ts serve --no-embeddings       # Disable vector indexing
bun src/cli.ts export <file>               # Export to self-contained HTML
bun src/cli.ts export -o out.html          # Custom output path
bun src/cli.ts highlights --json           # Query highlights from SQLite
bun src/cli.ts highlights --tags           # List all tags with counts
bun src/cli.ts import                      # Import sidecar .annotations.json files
bun src/cli.ts search-exclude list         # Show excluded project patterns
bun src/cli.ts search-exclude add "foo*"   # Exclude projects matching pattern
bun src/cli.ts search-exclude remove "foo*"

Slash Commands

When working in the Gloss repo, these skills are available:

| Command | Description |
| --- | --- |
| /gloss:convo | Start server or export a conversation |
| /gloss:index | Browse all conversations |
| /gloss:highlights | Pull highlights from the current session |
| /gloss:search | Search highlights across all sessions |
| /gloss:auto-tag | AI-powered auto-tagging of highlights |

MCP Server (Claude Code integration)

Gloss exposes a Model Context Protocol server so Claude Code can search and read past conversations directly during a session.

claude mcp add --transport stdio --scope user gloss -- bun /path/to/gloss/src/mcp-server.ts

Requires the Gloss server running on :3456. The MCP server talks to it over HTTP — no duplicate embedding engine.

Tools:

| Tool | What it does |
| --- | --- |
| search_conversations | Hybrid FTS + vector search across all sessions. Returns sources with relevance scores, matched tokens, and turn ranges. |
| read_conversation | Read turns from a session by ID + range (server-side slicing, max 30 turns/call). |
| get_highlights | Query annotations by session, tag, text search, or recency. Filters compose. |
| list_sessions | Browse sessions by project or recency. |

Search results include RRF relevance scores per source so Claude can weight strong matches over weak ones, plus startTurnIndex/endTurnIndex for precise follow-up reads. Text truncation is semantic-aware — won't break mid-code-block.
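
The fence-aware truncation could be approximated like this (a hypothetical sketch; Gloss's actual rules may differ):

```typescript
// Truncate at the last paragraph break before `limit`; if that cut
// would land inside an open code fence, back up to before the fence
// so a code block is never split mid-way.
function truncateSemantic(text: string, limit: number): string {
  if (text.length <= limit) return text;
  let cut = text.lastIndexOf("\n\n", limit);
  if (cut === -1) cut = limit;
  const fencesBeforeCut = text.slice(0, cut).split("```").length - 1;
  if (fencesBeforeCut % 2 === 1) cut = text.lastIndexOf("```", cut);
  return text.slice(0, cut).trimEnd();
}
```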

Architecture

~/.claude/projects/       JSONL session logs (source of truth)
        |
   discovery.ts           Scans for sessions, extracts metadata from first 32KB
        |
   ~/.convo/db.sqlite     Session index + annotations + embeddings
        |
   server.ts              HTTP routes + WebSocket live updates
        |                  Background: turn counts, FTS indexing, vector indexing
        |
   localhost:3456          Index page, conversation viewer, Ask, annotation API

Key modules:

| File | Role |
| --- | --- |
| src/server.ts | Multi-session HTTP + WebSocket server |
| src/discovery.ts | JSONL scanning, SQLite sync, turn counting |
| src/db.ts | SQLite schema, session/annotation/embedding/FTS CRUD |
| src/cli.ts | CLI entry point (serve, export, highlights, search-exclude) |
| src/ask.ts | Hybrid search pipeline: FTS + vector + RRF fusion + Haiku synthesis |
| src/ask-page.ts | Streaming Ask UI with answer + source cards |
| src/mcp-server.ts | MCP stdio server — bridges Claude Code to the Gloss HTTP API |
| src/embeddings.ts | Embedding engine (subprocess) + in-memory vector index |
| src/indexer.ts | Background embedding backfill with batching and progress logging |
| src/index-page.ts | Server index page with search/filter/grouping |
| src/incremental-parser.ts | Streaming JSONL parser for live updates |
| src/parser.ts | Full JSONL-to-conversation parser |
| src/renderer.ts | Turn-to-HTML renderer |
| src/convert.ts | JSONL-to-HTML pipeline (export path) |
| src/templates/html-template.ts | Dual-mode HTML (server vs inline) |
| src/templates/client-js.ts | Client JS (annotations, WS, exports) |
| src/templates/css.ts | Shared styles |

How Ask works (detailed)

User question
     |
     v
 Per-token FTS queries ──────────┐
 (each keyword searched           |
  individually for coverage)      |
                                  |── RRF fusion (k=60) ──> Top N sessions
 Vector cosine search ───────────┤
 (256-dim Arctic Embed,           |
  "query:" prefix encoding)       |
                                  |
 Metadata LIKE matching ─────────┘
 (project names, titles)
     |
     v
 Load source turns + context windows
     |
     v
 Claude Haiku (-p --model haiku)
 reads evidence, streams answer
 with numbered citations

The -p flag runs the Claude CLI in non-interactive print mode, with the prompt piped in via stdin. Haiku was chosen for synthesis because it's fast (~5-15s) and the retrieval pipeline has already done the hard work of finding relevant content; the LLM just needs to read and summarize.
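
Mechanically, the synthesis step amounts to spawning the CLI with the assembled evidence on stdin. A stand-alone sketch, where `cat` stands in for the real `claude -p --model haiku` invocation so the example runs anywhere:

```typescript
import { spawnSync } from "node:child_process";

// Pipe a prompt to a CLI via stdin and capture its output. In Gloss
// the command would be the Claude CLI; `cat` simply echoes stdin
// back, which is enough to demonstrate the plumbing.
const prompt = "Evidence:\n[1] ...\n\nAnswer the question with citations.";
const result = spawnSync("cat", [], { input: prompt, encoding: "utf8" });
const answer = result.stdout; // equals the prompt when using `cat`
```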

Development

bun install
bun test              # Run tests
bunx tsc --noEmit     # Type check

About

Convert Claude Code JSONL conversation logs to annotated, readable HTML
