feat(core): project identity for grouping sessions across sources #122

Merged
graydawnc merged 12 commits into main from feat/project-identity on Apr 29, 2026
@graydawnc graydawnc commented Apr 29, 2026

Why

Spool indexes sessions from Claude Code, Codex CLI, and Gemini CLI. Each source records a project's working directory in its own way, and the same physical project (e.g. ~/code/spool) can show up under different project_id rows depending on which CLI you used. Without a stable identity, the upcoming library-first UI — Sidebar grouping, single-project page, in:project search scope — has no way to merge them.

What this PR does

Introduces a deterministic project identity for every session, computed from canonical signals in priority order: the git remote URL for repos with an origin remote, the git common dir for repos without one, the manifest path (package.json / Cargo.toml / pyproject.toml / go.mod / Gemfile / pom.xml / build.gradle) for non-git projects, and the raw path as a last resort. The identity is split into a kind discriminator and a stable key that survives renames, branch checkouts, and moves between home directories (so a workspace cloned to a different machine resolves to the same identity).

A backfill step runs once during migration so every pre-v6 row gets an identity immediately. Identity computation is idempotent — re-running on the same project always yields the same key — which is what lets new sessions slot into existing groups without UI churn.

Concrete example: two agents, one project

A user runs claude in ~/code/spool and later runs codex in the same directory. Each CLI writes its session log under its own ~/.<agent>/sessions/... tree, so the syncer creates two projects rows with cwd = /Users/chen/code/spool but different source values. Without identity grouping, the sidebar would show two entries for the same repo.

When a new session is inserted, computeIdentity(cwd, fs) runs:

  1. cwd is not the home directory or a known loose dir → not loose
  2. Walks up looking for .git → finds it at /Users/chen/code/spool (the gitRoot)
  3. Runs git config --get remote.origin.url → git@github.com:spool-lab/spool.git
  4. normalizeGitRemote strips the git@…: SSH prefix, the .git suffix, and lower-cases → github.com/spool-lab/spool
  5. Reads package.json for the display name → spool

Both rows persist with the same identity:

projects
┌────┬──────────────┬──────────────────────────┬───────────────┬──────────────────────────────┬──────────────┐
│ id │ source       │ cwd                      │ identity_kind │ identity_key                 │ display_name │
├────┼──────────────┼──────────────────────────┼───────────────┼──────────────────────────────┼──────────────┤
│ 1  │ claude-code  │ /Users/chen/code/spool   │ git_remote    │ github.com/spool-lab/spool   │ spool        │
│ 2  │ codex-cli    │ /Users/chen/code/spool   │ git_remote    │ github.com/spool-lab/spool   │ spool        │
└────┴──────────────┴──────────────────────────┴───────────────┴──────────────────────────────┴──────────────┘

project_groups_v (aggregate, message_count > 0 only)
┌────────────────────────────────┬──────────────┬───────────────┬────────────────────────┐
│ identity_key                   │ display_name │ session_count │ sources                │
├────────────────────────────────┼──────────────┼───────────────┼────────────────────────┤
│ github.com/spool-lab/spool     │ spool        │ 47            │ claude-code, codex-cli │
└────────────────────────────────┴──────────────┴───────────────┴────────────────────────┘

Cloning the repo to a second machine at ~/work/spool produces the same identity_key because it's derived from the remote URL, not the path. Switching git branches doesn't change it either. If the project has no remote configured, identity falls back to git_common_dir (a stable absolute path inside .git); for non-git projects it falls back to the manifest directory; only as a last resort does it use the raw cwd.
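
The fallback order described above can be read as a simple precedence chain. The sketch below assumes the signals have already been probed from the filesystem. The Signals shape, the pickIdentity name, and the choice of cwd as the key for loose directories are all hypothetical; only the kind names and their ordering come from this description.

```typescript
type IdentityKind =
  | "git_remote"
  | "git_common_dir"
  | "manifest_path"
  | "path"
  | "loose";

// Hypothetical container for signals already read from disk.
interface Signals {
  cwd: string;
  isLoose: boolean;      // home directory or a known loose dir
  remoteUrl?: string;    // normalized origin URL, if a remote exists
  gitCommonDir?: string; // absolute path, if inside a git repo
  manifestDir?: string;  // directory holding package.json, Cargo.toml, ...
}

interface Identity {
  kind: IdentityKind;
  key: string;
}

// First available signal wins, from most portable to least portable.
function pickIdentity(s: Signals): Identity {
  if (s.isLoose) return { kind: "loose", key: s.cwd }; // key choice is an assumption
  if (s.remoteUrl) return { kind: "git_remote", key: s.remoteUrl };
  if (s.gitCommonDir) return { kind: "git_common_dir", key: s.gitCommonDir };
  if (s.manifestDir) return { kind: "manifest_path", key: s.manifestDir };
  return { kind: "path", key: s.cwd };
}
```

Keeping the decision a pure function of its inputs is one way to get the idempotence the backfill relies on: the same signals always produce the same kind and key.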

Schema (user_version 6)

Adds three columns to projects:

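┌─────────────────────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ column                      │ purpose                                                                                              │
├─────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ identity_kind TEXT NOT NULL │ discriminator that selects the resolver (git_remote / git_common_dir / manifest_path / path / loose) │
│ identity_key TEXT NOT NULL  │ stable grouping key                                                                                  │
│ display_name TEXT NOT NULL  │ derived label rendered in UI                                                                         │
└─────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────┘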

And a new view project_groups_v that aggregates session counts and last-activity timestamps per identity. Sessions with message_count = 0 are filtered out of the aggregate because the watcher creates a session row when a CLI process starts, before the first user message is written — those rows would otherwise appear as empty entries in every group.
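
To make the aggregation concrete, here is an in-memory TypeScript approximation of what the view computes. It is not the view's SQL: the row shape, the field names, and grouping on identity_key alone are assumptions made for illustration.

```typescript
// Illustrative only: field names are hypothetical, not the actual schema.
interface SessionRow {
  identityKey: string;
  displayName: string;
  source: string;
  messageCount: number;
  lastActivity: number; // epoch milliseconds
}

interface ProjectGroup {
  identityKey: string;
  displayName: string;
  sessionCount: number;
  sources: string[];
  lastActivity: number;
}

function groupSessions(rows: SessionRow[]): ProjectGroup[] {
  const groups = new Map<string, ProjectGroup>();
  for (const row of rows) {
    // Skip rows the watcher created before the first user message arrived.
    if (row.messageCount <= 0) continue;
    const group = groups.get(row.identityKey);
    if (group === undefined) {
      groups.set(row.identityKey, {
        identityKey: row.identityKey,
        displayName: row.displayName,
        sessionCount: 1,
        sources: [row.source],
        lastActivity: row.lastActivity,
      });
      continue;
    }
    group.sessionCount += 1;
    if (group.sources.indexOf(row.source) === -1) group.sources.push(row.source);
    group.lastActivity = Math.max(group.lastActivity, row.lastActivity);
  }
  return Array.from(groups.values());
}
```

A project whose only sessions are empty never enters the map, so it produces no group at all rather than a group with a zero count.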

How it connects

This PR is the data contract; everything downstream in the stack consumes it:

  • Sidebar (next PR) reads project_groups_v directly
  • ProjectView resolves by identity_key
  • in:project search scope filters by identity_key

Submitting it as the first PR keeps the schema migration reviewable on its own.

Test plan

  • pnpm --filter @spool-lab/core test — identity computation, homedir portability, git common-dir absolutization, remote URL normalization
  • e2e: index a sample with claude + codex sessions in the same git repo; confirm one Sidebar row, correct session count, no duplicates

@graydawnc graydawnc marked this pull request as ready for review April 29, 2026 16:16
@graydawnc graydawnc merged commit f721bb8 into main Apr 29, 2026
4 checks passed
@graydawnc graydawnc deleted the feat/project-identity branch April 29, 2026 16:16
graydawnc added a commit that referenced this pull request Apr 30, 2026
* docs(design): rewrite for library-first shell

Realigns DESIGN.md with the shipped library-first product (PRs #122#133).
Reverses the 2026-03-27 "search box is the product" framing now that the
Sidebar + Project View + ⌘K overlay are the home, and adds a decisions-log
row that references the reversal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(design): clean up two stale references missed in first pass

Spacing's "Search bar padding (home / results bar)" and the First-Person
Language Do/Don't row "You starred this" both predated the library-first
shell. Updated to "⌘K overlay / results page" and "You pinned this" so
the doc reads consistently end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Chen <99816898+donteatfriedrice@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>