Split Cursor into cursor-cli and cursor-ide, add missing stores #27

Merged
tony merged 4 commits into master from cursor-split-cli-ide
May 31, 2026

@tony tony commented May 30, 2026

Summary

  • Replace the single cursor agent with two — cursor-cli (the cursor-agent terminal binary) and cursor-ide (the desktop app) — because they are distinct applications with disjoint data homes and on-disk formats, and one-agent modelling was the only place the catalogue broke its "one agent = one product = one root family" shape. This is breaking; see Breaking change.
  • Add cursor-cli.prompt_history: Cursor's typed-prompt recall buffer at ~/.config/cursor/prompt_history.json, giving Cursor parity with the Claude/Codex/Grok prompt-history stores. The catalogue never knew this path because Cursor moved its CLI home to the lowercase ~/.config/cursor/ while only ~/.cursor/ was watched. As with those backends, the store's records are kind="history" (with role="user"), so it surfaces under --type history/--type all rather than the default --type prompts; a bare --agent cursor-cli search still returns typed prompts via the transcripts' user turns. See #37 ("prompt_history stores parse as kind='history', so the default --type prompts search skips them").
  • Add cursor-ide.workspace_state — per-workspace workspaceStorage/<hash>/state.vscdb databases, complementing the global store. Both now surface the aiService.prompts history, which prior builds advertised but never actually extracted (those entries carry no role field, so the generic walker skipped them).
  • Add cursor-cli.chats — a best-effort, schema-less protobuf reader for the CLI's content-addressed chats/<hash>/<uuid>/store.db blob stores. Inspectable (opt-in), not searched by default, since the extraction overlaps the cleaner JSONL transcripts.
  • Drop the now-redundant cli/ide infix from Cursor store and adapter IDs so the first store_id segment matches the agent.
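
To make the prompt-history behaviour concrete, here is a minimal sketch rather than the shipped parse_cursor_prompt_history. It assumes the file is a flat JSON array of strings (as the commit notes for this change describe) and that entries carry no timestamps of their own, so the file mtime is used. The PromptRecord type and the read_prompt_history name are hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class PromptRecord:
    """Hypothetical record shape; the real agentgrep record type may differ."""

    agent: str
    store: str
    kind: str
    role: str
    text: str
    timestamp: datetime


def read_prompt_history(path: Path) -> list:
    """Turn a flat JSON string array into deduplicated user-prompt records."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return []
    if not isinstance(data, list):
        return []
    # Assumption: no per-entry timestamps exist, so stamp with the file mtime.
    stamp = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    seen = set()
    records = []
    for entry in data:
        if not isinstance(entry, str):
            continue  # ignore anything that is not a typed prompt string
        text = entry.strip()
        if not text or text in seen:
            continue  # drop blanks and repeated prompts
        seen.add(text)
        records.append(
            PromptRecord(
                agent="cursor-cli",
                store="cursor-cli.prompt_history",
                kind="history",  # why the default --type prompts search skips it
                role="user",
                text=text,
                timestamp=stamp,
            )
        )
    return records
```

Because every record is kind="history", a caller filtering on prompts only would see nothing from this store, which matches the --type history/--type all behaviour described above.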

Breaking change

The cursor agent is retired. Anywhere cursor was referenced — the --agent flag, agent: query predicates, the agentgrep://sources/{agent} MCP resource, and record agent fields — choose the surface you mean:

$ agentgrep search foo --agent cursor
$ agentgrep search foo --agent cursor-cli
$ agentgrep search foo --agent cursor-ide

Store and adapter identifiers were renamed to match:

Old                             New
cursor.cli.transcripts          cursor-cli.transcripts
cursor.ai_tracking              cursor-cli.ai_tracking
cursor.ide.state_vscdb          cursor-ide.state_vscdb
cursor.cli_jsonl.v1             cursor_cli.transcripts_jsonl.v1
cursor.state_vscdb_modern.v1    cursor_ide.state_vscdb_modern.v1
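
For callers that persisted the old identifiers (saved queries, scripts, config), a small caller-side shim may ease the migration. This is an illustrative sketch only: the mapping is copied from the rename table above, while RENAMED_IDS, migrate_id, and expand_agent are hypothetical names that agentgrep does not provide.

```python
# Old -> new store and adapter identifiers, copied from the rename table.
RENAMED_IDS = {
    "cursor.cli.transcripts": "cursor-cli.transcripts",
    "cursor.ai_tracking": "cursor-cli.ai_tracking",
    "cursor.ide.state_vscdb": "cursor-ide.state_vscdb",
    "cursor.cli_jsonl.v1": "cursor_cli.transcripts_jsonl.v1",
    "cursor.state_vscdb_modern.v1": "cursor_ide.state_vscdb_modern.v1",
}


def migrate_id(identifier: str) -> str:
    """Return the post-split identifier, leaving unknown ids untouched."""
    return RENAMED_IDS.get(identifier, identifier)


def expand_agent(agent: str) -> tuple:
    """Expand the retired `cursor` agent into both successors.

    The retired name has no single replacement, so a caller that wants the
    old "everything Cursor" behaviour has to query both surfaces.
    """
    if agent == "cursor":
        return ("cursor-cli", "cursor-ide")
    return (agent,)
```

Whether to expand `cursor` into both agents or force an explicit choice is up to the caller; the PR itself retires the name outright.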

Changes by area

  • src/agentgrep/store_catalog.py: Split _CURSOR_STORES into _CURSOR_CLI_STORES / _CURSOR_IDE_STORES; add the three new descriptors; bump catalog_version.
  • src/agentgrep/__init__.py: New parsers parse_cursor_prompt_history, parse_cursor_cli_chats_db, and the schema-less iter_protobuf_text_fields walker; teach parse_cursor_state_db the aiService.prompts shape via iter_cursor_prompt_candidates; split discover_cursor_sources into CLI/IDE discovery with a platform-resolved ide_workspace root; rename dispatch arms and the AgentName literal.
  • MCP / CLI / query surface (mcp/_library.py, mcp/models.py, query/registry.py, ui/app.py): Update agent literals, the adapter registries, the query enum, and TUI agent colors.
  • docs/: Split the backend page into cursor-cli.md + cursor-ide.md, update the grid/toctree and storage-catalog dev page, and refresh agent-name references across the CLI/library/MCP docs.
  • tests/: Update fixtures and assertions to the split names; add parametrized and fixture-based coverage for the new parsers.
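
The "platform-resolved ide_workspace root" could look roughly like the sketch below. Only the Linux location (~/.config/Cursor/User) and the workspaceStorage/<hash>/state.vscdb layout come from this PR; the macOS and Windows directories are assumptions based on the usual VS Code conventions, and ide_user_dir and find_workspace_dbs are hypothetical names, not the functions in src/agentgrep/__init__.py.

```python
import os
import sys
from pathlib import Path


def ide_user_dir(home: Path, platform: str = sys.platform) -> Path:
    """Resolve the Cursor IDE "User" directory for a given platform."""
    if platform == "darwin":
        # Assumption: macOS follows the VS Code layout.
        return home / "Library" / "Application Support" / "Cursor" / "User"
    if platform.startswith("win"):
        # Assumption: Windows keeps it under %APPDATA%.
        appdata = os.environ.get("APPDATA")
        base = Path(appdata) if appdata else home / "AppData" / "Roaming"
        return base / "Cursor" / "User"
    # Linux location, as named in the commit message for the split.
    return home / ".config" / "Cursor" / "User"


def find_workspace_dbs(user_dir: Path) -> list:
    """List per-workspace databases under workspaceStorage/<hash>/."""
    root = user_dir / "workspaceStorage"
    if not root.is_dir():
        return []  # IDE not installed locally, or it runs on a remote host
    return sorted(root.glob("*/state.vscdb"))
```

Returning an empty list when the directory is absent keeps per-surface availability honest: a machine with only the CLI installed simply reports no IDE sources.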

Design decisions

  • Two agents, not a sub-facet: Each record already carries a store field, so CLI-vs-IDE provenance was distinguishable without splitting. But the two surfaces have disjoint roots and formats (JSONL/protobuf-SQLite vs state.vscdb), and per-surface availability is only honest when they are separate agents (e.g. an IDE that runs on a remote host has no local state.vscdb). Splitting also makes Cursor obey the same "one agent = one product binary" shape as the other four backends.
  • Best-effort protobuf over no parser: Cursor publishes no schema for the chat blobs, and they form a content-addressed protobuf graph. Rather than skip them, the walker decodes the wire format generically — text fields begin with a printable byte, nested messages begin with a low tag byte — and surfaces readable runs. The adapter is date-versioned (cursor_cli.chats_protobuf.v1) precisely because the layout is unofficial and may drift.
  • chats is opt-in: The protobuf extraction is noisier than the JSONL transcripts and overlaps them, so the store is INSPECTABLE to keep default search clean; it still parses fully when explicitly included.
  • Drop the infix: With no backward-compat constraint, collapsing cursor.cli.transcripts → cursor-cli.transcripts keeps store_id aligned with the agent, the invariant that fixture_path and discovery both rely on.
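
The schema-less decoding idea can be sketched as follows. This is not the shipped iter_protobuf_text_fields; it is a minimal illustration of walking the protobuf wire format without a schema, using the heuristic described above (a payload that starts with a printable byte and decodes as UTF-8 is treated as text; otherwise it is tried as a nested message). The function names, MIN_TEXT_LEN, and the depth limit are assumptions.

```python
from collections.abc import Iterator
from typing import Optional

MIN_TEXT_LEN = 4  # hypothetical threshold for dropping very short fields


def read_varint(buf: bytes, pos: int) -> tuple:
    """Decode a base-128 varint at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(buf) or shift > 63:
            raise ValueError("truncated or oversized varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def decode_text(payload: bytes) -> Optional[str]:
    """Return the payload as text if it looks like readable UTF-8."""
    if not payload or payload[0] < 0x20:
        return None  # a low first byte suggests a nested message tag
    try:
        text = payload.decode("utf-8")
    except UnicodeDecodeError:
        return None
    if all(ch.isprintable() or ch in "\n\r\t" for ch in text):
        return text
    return None


def iter_text_fields_sketch(buf: bytes, depth: int = 0, max_depth: int = 8) -> Iterator[str]:
    """Yield readable strings from length-delimited fields, recursing into nested ones."""
    pos = 0
    while pos < len(buf):
        try:
            tag, pos = read_varint(buf, pos)
            wire_type = tag & 0x07
            if wire_type == 0:  # varint value: decode and discard
                _, pos = read_varint(buf, pos)
                continue
            if wire_type == 1:  # fixed 64-bit
                pos += 8
                continue
            if wire_type == 5:  # fixed 32-bit
                pos += 4
                continue
            if wire_type != 2:  # groups or invalid: stop walking this buffer
                return
            length, pos = read_varint(buf, pos)
        except ValueError:
            return
        end = pos + length
        if end > len(buf):
            return  # truncated field
        payload = buf[pos:end]
        pos = end
        text = decode_text(payload)
        if text is not None:
            if len(text) >= MIN_TEXT_LEN:
                yield text
        elif depth < max_depth:
            yield from iter_text_fields_sketch(payload, depth + 1, max_depth)
```

A walker like this cannot tell field meaning apart, so it will also surface identifiers and other incidental strings; that noise is one reason to keep such a store opt-in.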

Verification

No bare cursor agent literal remains in source:

$ rg '"cursor"' src/

The retired agent is rejected by the CLI:

$ agentgrep find --agent cursor

The new adapters are registered and dispatched:

$ rg 'cursor_cli\.(prompt_history|chats)|cursor_ide\.state_vscdb' src/agentgrep/__init__.py src/agentgrep/mcp/_library.py

Test plan

  • test_iter_protobuf_text_fields: schema-less walker recovers leaf/nested text and skips varint/truncated/short fields
  • test_iter_cursor_prompt_candidates: aiService.prompts shapes surface as user prompts; non-prompt shapes ignored
  • test_cursor_cli_prompt_history_surfaces_user_prompts: prompt_history.json becomes deduped user records with mtime fallback
  • test_cursor_cli_chats_db_is_opt_in_and_extracts_protobuf_text: chats store is inspectable-only and yields readable blob text
  • test_cursor_ide_workspace_state_extracts_aiservice_prompts: per-workspace state.vscdb surfaces its prompt history
  • test_actual_cursor_discovery_splits_main_and_subagent_transcripts: discovery splits under the new names
  • Full gate: uv run ruff check ., uv run ruff format ., uv run ty check, uv run py.test --reruns 0, just build-docs
  • End-to-end on a real $HOME: find cursor-cli surfaces cursor-cli.prompt_history; search --agent cursor-cli --type all returns its prompt-history records (the store is kind="history", so it needs --type history/all, not the default --type prompts; see #37, "prompt_history stores parse as kind='history', so the default --type prompts search skips them"); the chats walker extracts readable text from real store.db files

tony added 3 commits May 31, 2026 07:06
… cursor-ide

why: Cursor is two distinct applications with disjoint data homes and
formats — the cursor-agent terminal CLI (~/.cursor, ~/.config/cursor)
and the desktop IDE (~/.config/Cursor/User/state.vscdb). Modelling both
under one `cursor` agent was the only place the catalogue violated its
"one agent = one product = one root family" invariant, and it made
per-surface availability ambiguous. Backward compatibility is not a
concern, so the bare `cursor` agent is retired.

what:
- Replace the `cursor` AgentName with `cursor-cli` and `cursor-ide`
  across every literal: AgentName (stores, __init__, mcp/_library),
  AgentSelector, AGENT_CHOICES, the mcp/models payload Literals, the
  query registry agent enum, and the TUI agent colours.
- Split _CURSOR_STORES into _CURSOR_CLI_STORES and _CURSOR_IDE_STORES,
  dropping the now-redundant cli/ide infix so store_ids and runtime
  store keys collapse to one value (cursor-cli.transcripts,
  cursor-ide.state_vscdb); rename adapter ids to cursor_cli.* /
  cursor_ide.*; bump catalog_version to 10.
- Split discover_cursor_sources into discover_cursor_cli_sources and
  discover_cursor_ide_sources; update the discover_sources dispatch,
  the iter_source_records adapter arms, and both adapter registries.
- Rename the docs page to backends/cursor-cli.md, add cursor-ide.md,
  update the backends grid/toctree and the storage directive labels.
- Update the test suite and sample fixture dirs to the split names;
  repair the shipped CHANGES {doc} link that pointed at the moved page.

…orkspace stores

why: The split surfaced three on-disk stores the catalogue never knew,
all under the lowercase ~/.config/cursor home or the IDE's per-workspace
storage. The Cursor CLI prompt-history file gives Cursor the same
prompt-recall store the other backends already expose, and the IDE
workspace databases hold the aiService.prompts history that only the
global state.vscdb was catalogued for.

what:
- cursor-cli.prompt_history: parse_cursor_prompt_history reads the flat
  JSON string array at ~/.config/cursor/prompt_history.json into
  deduplicated user-prompt records, stamped with the file mtime.
- cursor-cli.chats: best-effort parser for the protobuf store.db blobs.
  iter_protobuf_text_fields walks the wire format schema-lessly and
  yields readable UTF-8 runs; the store is inspectable (opt-in) since
  the extraction overlaps the cleaner transcripts JSONL. Date-versioned
  (cursor_cli.chats_protobuf.v1) because the layout is unofficial.
- cursor-ide.workspace_state: discovers per-workspace
  workspaceStorage/<hash>/state.vscdb via a platform-resolved
  ide_workspace root and reuses the state.vscdb adapter.
- Teach parse_cursor_state_db the aiService.prompts shape
  ({"prompts": [{"text", "commandType"}]}) via
  iter_cursor_prompt_candidates, so both the global and per-workspace
  IDE stores surface typed prompts that carry no role field.
- Register the new adapters, bump catalog_version, and cover the
  parsers with NamedTuple+test_id parametrized and fixture-based tests.
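
The aiService.prompts handling described above can be illustrated with a small sketch. It assumes the decoded value has the {"prompts": [{"text", "commandType"}]} shape quoted in the commit message; iter_prompt_texts is a hypothetical stand-in, not the actual iter_cursor_prompt_candidates, and it ignores commandType because its meaning is not documented here.

```python
from collections.abc import Iterator
from typing import Any


def iter_prompt_texts(value: Any) -> Iterator[str]:
    """Yield typed-prompt text from an aiService.prompts-style payload.

    The entries carry no role field, so a walker that keys on role would
    skip them; here every non-empty "text" is treated as a user prompt.
    """
    if not isinstance(value, dict):
        return
    prompts = value.get("prompts")
    if not isinstance(prompts, list):
        return
    for entry in prompts:
        if not isinstance(entry, dict):
            continue  # non-prompt shapes are ignored
        text = entry.get("text")
        if isinstance(text, str) and text.strip():
            yield text
```

A caller would tag each yielded string as role="user" itself, since the payload provides no role to copy.
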
…d new stores

why: The backend docs still described Cursor as a single agent and
listed the old `cursor` name in agent enumerations, query examples, and
MCP resource routes. They also did not mention the prompt-history,
chat-blob, or per-workspace stores added with the split.

what:
- Expand the cursor-cli and cursor-ide backend pages with record-schema
  sections for cursor-cli.prompt_history, cursor-cli.chats (best-effort
  protobuf, opt-in), and cursor-ide.workspace_state.
- Rewrite the storage-catalog dev section to describe the two agents,
  their disjoint home directories, and the new adapter IDs.
- Update agent enumerations and query/CLI/MCP examples in the
  getting-started, CLI, library, and MCP pages to the split names.

why: Record the breaking Cursor backend split and the new
prompt-history, per-workspace, and best-effort chat-blob stores for the
next release.

what:
- Add a Breaking changes entry for retiring the cursor agent in favour
  of cursor-cli and cursor-ide, with a --agent migration block and the
  store/adapter id renames.
- Add What's new deliverables for the Cursor CLI prompt-history store,
  Cursor IDE per-workspace history (and the aiService.prompts fix), and
  the best-effort protobuf chat parser.
@tony tony force-pushed the cursor-split-cli-ide branch from ffcb916 to 81d88a6 May 31, 2026 12:58
@tony tony merged commit 5d5971e into master May 31, 2026
3 checks passed