Carta

Maps, connects, and remembers your documentation.

Carta is a Claude Code plugin that keeps your project docs honest — auditing for contradictions, embedding reference material into a searchable knowledge base, and surfacing the right context exactly when you need it.


The problem (or: how this got built)

Fast-moving projects accumulate documentation debt quietly. You write a spec. An AI agent writes a dozen more files based on it. The spec changes. Three weeks later, four different documents describe the same API endpoint four different ways, and nobody — human or AI — knows which one is right.

This problem gets worse the more you lean on AI agents to help you work. Agents are only as good as the context they can see, and when your docs/ folder is a fog of contradictions and stale frontmatter, you're giving your agent a map that leads off a cliff.

Carta started as a happy accident. While working through a project with a lot of PDFs, datasheets, and fast-changing markdown — the kind of repo where the hardware changes on Thursday and the docs are still describing Wednesday — we built a small structural scanner to flag stale and broken cross-references. Then we added a semantic pass. Then a vector store. Then a /doc-search skill so Claude could query the embedded knowledge directly.

At some point we looked at what we had and realized: this is a thing. It works. It's small, it runs locally, it requires no new services beyond what an LLM-augmented developer already has running. So we generalized it.


What Carta does

Three things, tightly integrated:

1. Audit

A two-pass system that runs on a schedule or on demand:

  • Structural scanner (zero LLM calls) — detects stale docs, broken related: links, homeless markdown files, and orphaned content. Runs fast, runs often.
  • Semantic audit (Claude) — reads the scanner output and checks changed doc pairs for contradictions: version numbers, API endpoints, config values, whatever matters in your domain. Writes a rolling AUDIT_REPORT.md with stable AUDIT-NNN issue IDs that persist across runs.
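To make the structural pass concrete, here is a minimal Python sketch of one of its checks: flagging broken related: links without any LLM call. It is an illustration under assumptions, not Carta's actual scanner. The helper names (parse_related, find_broken_links) are hypothetical, the frontmatter parser handles only the simple list form shown later in this README, and related: targets are assumed to be repo-relative paths.

```python
from pathlib import Path


def parse_related(text: str) -> list[str]:
    """Return the entries of a `related:` list in leading frontmatter.

    Hypothetical helper: handles only the simple `- item` list form.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter block
    related, in_related = [], False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of frontmatter
        if stripped == "related:":
            in_related = True
        elif in_related and stripped.startswith("- "):
            related.append(stripped[2:].strip())
        elif stripped:
            in_related = False  # a different key ends the list
    return related


def find_broken_links(repo_root: Path, docs_root: str = "docs") -> dict[str, list[str]]:
    """Map each markdown file to the `related:` targets that do not exist."""
    broken: dict[str, list[str]] = {}
    for doc in sorted((repo_root / docs_root).rglob("*.md")):
        missing = [
            target
            for target in parse_related(doc.read_text(encoding="utf-8"))
            if not (repo_root / target).exists()  # assumes repo-relative paths
        ]
        if missing:
            broken[doc.relative_to(repo_root).as_posix()] = missing
    return broken
```

Because a check like this only reads files and compares paths, it is cheap enough to run on every commit, which is the property the "runs fast, runs often" bullet relies on.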

2. Embed

Ingests your reference material — PDFs, datasheets, manuals, audio transcripts — into a local Qdrant vector store via Ollama. Generates spec_summary blocks for dense documents so the audit agent can cross-reference them without re-reading 200 pages.
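The ingest path can be sketched roughly as: split extracted text into chunks, embed each chunk with Ollama, and upsert the vectors into Qdrant. The Python below is a sketch under assumptions rather than Carta's pipeline. The chunk sizes, the payload fields (source, chunk, text), and the helper names are hypothetical, the target collection is assumed to exist already with a matching vector size, and the services are assumed to listen on their default local ports. The two HTTP calls use Ollama's /api/embeddings endpoint and Qdrant's points upsert endpoint.

```python
import json
import urllib.request
import uuid

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port
QDRANT_URL = "http://localhost:6333"   # same URL as qdrant_url in the config


def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def _send_json(url: str, payload: dict, method: str = "POST") -> dict:
    """Send a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Ask the local Ollama server for an embedding of `text`."""
    return _send_json(f"{OLLAMA_URL}/api/embeddings", {"model": model, "prompt": text})["embedding"]


def upsert_chunks(collection: str, source: str, chunks: list[str]) -> None:
    """Embed each chunk and upsert it into an existing Qdrant collection."""
    points = [
        {
            "id": str(uuid.uuid4()),
            "vector": embed(chunk),
            # Hypothetical payload layout, kept so search results can cite a source.
            "payload": {"source": source, "chunk": i, "text": chunk},
        }
        for i, chunk in enumerate(chunks)
    ]
    _send_json(f"{QDRANT_URL}/collections/{collection}/points?wait=true", {"points": points}, method="PUT")
```

With both services running, a call such as upsert_chunks("my-project_doc", "docs/reference/manual.pdf", chunk_text(extracted_text)) would store one point per chunk; extracted_text stands in for whatever text extraction step precedes embedding.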

3. Search

Natural language recall over everything that's been embedded. Ask Claude what the docs say about rate limiting, authentication flows, power supply constraints, sample naming conventions — whatever's in your knowledge base — and get cited answers back.
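As a rough picture of what "cited answers" could look like underneath, the Python sketch below queries Qdrant's REST search endpoint with an already-embedded question and renders the hits as numbered, source-attributed snippets. This is a hedged illustration, not the /doc-search skill itself: the payload fields (source, text) and both function names are hypothetical, and the query vector is assumed to come from the same embedding model used at ingest time.

```python
import json
import urllib.request

QDRANT_URL = "http://localhost:6333"  # same URL as qdrant_url in the config


def search(collection: str, query_vector: list[float], limit: int = 5) -> list[dict]:
    """Query Qdrant's REST search endpoint and return the raw hits."""
    body = json.dumps({"vector": query_vector, "limit": limit, "with_payload": True})
    req = urllib.request.Request(
        f"{QDRANT_URL}/collections/{collection}/points/search",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["result"]


def format_citations(hits: list[dict]) -> str:
    """Render hits as numbered, source-cited snippets for the agent to quote."""
    lines = []
    for n, hit in enumerate(hits, start=1):
        payload = hit.get("payload", {})
        source = payload.get("source", "unknown")  # hypothetical payload field
        snippet = " ".join(payload.get("text", "").split())[:200]  # collapse whitespace
        lines.append(f"[{n}] {source} (score {hit['score']:.2f}): {snippet}")
    return "\n".join(lines)
```

Keeping the source path next to each snippet is what lets the agent answer with a citation instead of an unattributed paraphrase.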


Good fits

Carta shines in projects where:

  • Docs outnumber the people who maintain them. Research repos, hardware projects, API platforms — anywhere the documentation surface area is large relative to the team.
  • AI agents are generating or editing docs. Agents don't track contradictions between files. Carta does.
  • Reference material lives outside version control. PDFs, datasheets, vendor manuals, meeting transcripts — Carta pulls them into the same queryable knowledge base as your markdown.
  • The project changes fast. Embedded firmware, evolving APIs, active research — anything where a doc written last Tuesday might already be wrong by Friday.

Less useful for: simple single-repo projects with a handful of docs, or projects where the docs are already the source of truth and rarely change.


Quickstart

Version history: CHANGELOG.md. Install (pipx, venv, PATH): docs/install.md.

Claude Code plugin (recommended)

Add the Carta marketplace to your ~/.claude/settings.json:

{
  "extraKnownMarketplaces": {
    "carta-cc": {
      "source": {
        "source": "github",
        "repo": "Ian-q/Carta"
      }
    }
  }
}

Then install and enable the carta-cc plugin via /plugins in Claude Code. That's it — hooks and skills are registered automatically. Run /carta-init in any project to bootstrap Carta there.

CLI install (pip / uvx / curl)

For use without the Claude Code plugin, or if you want the carta command available directly:

# One-shot (no install required)
uvx --from carta-cc carta init

# Install as a CLI tool (recommended on macOS)
pipx install carta-cc
carta init

# Install directly (may require --user or a venv on macOS/PEP 668 systems)
python3 -m pip install carta-cc
carta init

# Or via curl
curl -fsSL https://raw.githubusercontent.com/Ian-q/Carta/main/carta/install/install.sh | bash

See docs/install.md for pipx vs venv, PATH, PlatformIO conflicts, and --pip-args syntax.


Setup (5 minutes)

Prerequisites:

# 1. Qdrant — run with persistence so collections survive restarts
docker run -d -p 6333:6333 -v ~/.carta/qdrant_storage:/qdrant/storage --name qdrant qdrant/qdrant

# 2. Ollama — install from ollama.ai, then pull required models
ollama pull nomic-embed-text   # text embeddings
ollama pull qwen3.5:0.8b       # hook judge (swap for larger model if preferred)
ollama pull llava               # optional: visual embedding only

Both services are optional if you only want the structural audit, without embedding or search. See docs/install.md for the full setup walkthrough, and run carta doctor to verify your environment.
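If you want to check the prerequisites by hand, the Python sketch below shows the kind of probing involved: is Qdrant answering, is Ollama answering, and are the required models pulled. It is an assumption-laden illustration and does not describe how carta doctor works internally. The function names are hypothetical, and the URLs assume the default local ports used in the commands above. Ollama's /api/tags endpoint lists locally available models.

```python
import json
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a successful HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and HTTP errors
        return False


def _normalize(name: str) -> str:
    """Treat a bare model name as the `:latest` tag, as Ollama does."""
    return name if ":" in name else f"{name}:latest"


def missing_models(tags: dict, required: list[str]) -> list[str]:
    """Compare an /api/tags response against the models this setup needs."""
    installed = {_normalize(m["name"]) for m in tags.get("models", [])}
    return [name for name in required if _normalize(name) not in installed]


def report() -> None:
    """Print a short status summary for the two local services."""
    print("Qdrant reachable:", is_reachable("http://localhost:6333/collections"))
    ollama_ok = is_reachable("http://localhost:11434/api/tags")
    print("Ollama reachable:", ollama_ok)
    if ollama_ok:
        with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=5) as resp:
            tags = json.load(resp)
        print("Missing models:", missing_models(tags, ["nomic-embed-text", "qwen3.5:0.8b"]))
```

Calling report() after the docker run and ollama pull steps should show both services reachable and an empty list of missing models.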

After init:

  1. Edit .carta/config.yaml — set your project_name, docs_root, and excluded_paths
  2. Add frontmatter to a few key docs:
---
related:
  - CLAUDE.md
  - docs/api/endpoints.md
last_reviewed: 2026-03-20
---
  3. Run your first audit: /doc-audit in Claude Code (or carta scan)
  4. Drop your reference PDFs into docs/reference/, then embed them: /doc-embed
  5. Query: /doc-search what do the docs say about authentication?

Skills

| Skill | What it does |
| --- | --- |
| /carta-init | Bootstrap Carta in a new project (generates .carta/config.yaml) |
| /doc-audit | Structural + semantic audit, generates AUDIT_REPORT.md |
| /doc-embed | Ingest PDFs, manuals, and audio transcripts into Qdrant |
| /doc-search | Natural language search over the embedded knowledge base |

Configuration

All settings live in .carta/config.yaml (generated by carta init from the template). Key fields:

project_name: my-project           # namespaces your Qdrant collections
qdrant_url: http://localhost:6333   # required — where Qdrant is running
docs_root: docs/
stale_threshold_days: 30
contradiction_types:
  - version numbers
  - API endpoints
  - configuration values
  # add domain-specific ones: pin numbers, CAN IDs, SQL table names, etc.
anchor_doc: CLAUDE.md              # fallback comparison anchor
modules:
  doc_embed: true                  # set false to skip embed layer
  doc_search: true                 # set false to skip search
embed:
  ollama_model: nomic-embed-text:latest
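As a worked example of one of these fields, the Python sketch below shows how stale_threshold_days could be applied to the last_reviewed date from a doc's frontmatter. This is an assumption about the rule, stated for illustration: the real scanner may also consult git history or file modification times, and the function name is hypothetical.

```python
from datetime import date


def is_stale(last_reviewed: str, today: date, stale_threshold_days: int = 30) -> bool:
    """Return True when the doc was last reviewed more than the threshold ago.

    `last_reviewed` is an ISO date string such as "2026-03-20".
    """
    reviewed = date.fromisoformat(last_reviewed)
    return (today - reviewed).days > stale_threshold_days


# A doc reviewed on 2026-03-20 is exactly 30 days old on 2026-04-19,
# so under this rule it first counts as stale on 2026-04-20.
print(is_stale("2026-03-20", date(2026, 4, 19)))  # False
print(is_stale("2026-03-20", date(2026, 4, 20)))  # True
```

Whether the boundary day itself counts as stale is a design choice; the strict comparison here treats a doc reviewed exactly stale_threshold_days ago as still fresh.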

Visual Embedding (ColPali/ColQwen2)

Carta supports multimodal embedding of visually-rich PDF pages (datasheets, register maps, timing diagrams) using ColPali and ColQwen2 late-interaction retrieval.

Instead of converting visual content to text (lossy), this pathway:

  1. Embeds each PDF page as 1,024 patch vectors (128-dim) directly into Qdrant's multi-vector collection
  2. Stores the raw page PNG as a sidecar payload in .carta/visual_cache/
  3. Enables visual search that returns actual page images alongside text results
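Late-interaction retrieval scores a page by matching every query-token vector against every patch vector, keeping the best match per query token and summing those maxima (often called MaxSim). The pure-Python sketch below shows that scoring rule on toy 2-dimensional vectors. It is only meant to illustrate the idea: in practice the comparison runs inside the vector store over 128-dimensional vectors, and the function names here are hypothetical.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: list[list[float]], page_vecs: list[list[float]]) -> float:
    """Sum, over query vectors, of the best dot product with any page patch."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(query_vecs: list[list[float]], pages: dict[str, list[list[float]]]) -> list[str]:
    """Return page ids ordered from highest to lowest MaxSim score."""
    return sorted(pages, key=lambda pid: maxsim_score(query_vecs, pages[pid]), reverse=True)


query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page-1": [[1.0, 0.0], [0.5, 0.5]],  # scores 1.0 + 0.5 = 1.5
    "page-2": [[0.0, 1.0], [0.0, 0.9]],  # scores 0.0 + 1.0 = 1.0
}
print(rank_pages(query, pages))  # ['page-1', 'page-2']
```

Because each query token picks its own best patch, a small region of a datasheet page (one register row, one timing label) can drive the match even when the rest of the page is unrelated.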

Enable visual embedding:

# Install with visual dependencies
pip install 'carta-cc[visual]'

Then set in .carta/config.yaml:

embed:
  colpali_enabled: true              # opt-in flag (default: false)
  colpali_model: "vidore/colqwen2-v1.0"  # or colpali-v1.3 for lower VRAM
  colpali_device: "cpu"              # "cpu", "cuda", or "mps"
  colpali_batch_size: 1              # pages per batch (1 for CPU)
  colpali_sidecar_path: ".carta/visual_cache/"

Model Selection:

| Model | VRAM | Speed | Quality | Best For |
| --- | --- | --- | --- | --- |
| vidore/colqwen2-v1.0 | ~8GB | Slow | Highest | GPU servers |
| vidore/colpali-v1.3 | ~6GB | Medium | High | Balanced GPU |
| vidore/colSmol-500M | ~3GB | Medium | Good | CPU workstations |
| vidore/colSmol-256M | ~2GB | Fast | Fair | CPU-only/laptops |

Notes:

  • Visual embedding is additive: the existing text embedding pipeline is unchanged
  • Pages are classified automatically; only visually-rich pages are embedded
  • Visual collections are separate: {project_name}_visual (multi-vector) vs {project_name}_doc (text)
  • Search returns both text and visual results; visual hits include base64-encoded PNGs

Issue lifecycle

Carta assigns stable AUDIT-NNN IDs that survive across audit runs:

new → persisting → needs-input → resolved → archived

After needs_input_at_audit_count consecutive audits without resolution, an issue is escalated to needs-input and added to docs/BACKLOG/TRIAGE.md as a DOC-NNN item. The audit report is the single source of truth — no separate state file.
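The lifecycle can be read as a small state machine that advances once per audit run. The Python sketch below models it under stated assumptions: the default threshold of 3, the open_audits counter, and the rule that a resolved issue is archived on the following run are all hypothetical choices made for illustration, and the real state lives in AUDIT_REPORT.md rather than in objects like these.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    issue_id: str          # stable ID such as "AUDIT-001"
    status: str = "new"    # new, persisting, needs-input, resolved, archived
    open_audits: int = 1   # consecutive audits that reported the issue


def advance(issue: Issue, still_present: bool, needs_input_at_audit_count: int = 3) -> Issue:
    """Apply the outcome of one audit run to an issue and return it."""
    if issue.status in ("resolved", "archived"):
        # Assumption: a resolved issue is archived on the next run.
        issue.status = "archived"
    elif not still_present:
        issue.status = "resolved"
    else:
        issue.open_audits += 1
        if issue.open_audits >= needs_input_at_audit_count:
            issue.status = "needs-input"  # escalate for human triage
        else:
            issue.status = "persisting"
    return issue


issue = Issue("AUDIT-001")
for seen in (True, True, False, False):
    advance(issue, still_present=seen)
    print(issue.issue_id, issue.status)
```

With the illustrative threshold of 3, the loop walks the issue through persisting, needs-input, resolved, and archived in that order.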


What Carta doesn't do

  • It doesn't replace your wiki or CMS.
  • It doesn't auto-fix contradictions (it surfaces them; you or your agent decides what to do).
  • It doesn't require a cloud service — everything runs locally by default.
  • It doesn't add much overhead to projects with simple, stable docs.

Contributing

Issues and PRs welcome. The scanner, embed pipeline, and skill files are all designed to be readable and hackable.


License

MIT
