feat: multi-agent support, unified CLI, and extractors by ccf · Pull Request #37 · ccf/primer

ccf · 2026-02-28T04:00:57Z

Summary

Multi-agent data model: Add agent_type field to sessions/facets, with OpenAI and Google model pricing alongside Claude
Codex & Gemini extractors: New extractors for OpenAI Codex CLI and Google Gemini CLI session data, with a registry-based discovery system
Unified primer CLI: Full CLI (primer init, setup, server, hook, mcp, sync, doctor, configure) built with Click, including server process management (launchd/systemd/pidfile)
95% CLI test coverage: 82 tests across server_manager, setup, server, sync, doctor, configure, and config modules
Fix flaky test: Prevent PRIMER_ADMIN_API_KEY env var leak between test files

Test plan

pytest tests/ -v — 469 tests pass
ruff check . — clean
ruff format --check . — clean
bandit -r src/ -c pyproject.toml — clean
Frontend lint and type checks pass
Manual: primer init && primer server start && primer setup && primer doctor

🤖 Generated with Claude Code

Note

High Risk
High risk due to database schema migrations/field renames (claude_* → agent_*) and changes to ingest/sync/extraction paths that affect how session data is discovered, normalized, and stored (plus new background server management on host OS).

Overview
Adds multi-agent support across the stack. Introduces agent_type (default claude_code) on sessions and daily stats, renames claude_version→agent_version and claude_helpfulness→agent_helpfulness, expands pricing to OpenAI/Gemini models, and extends analytics with agent_type filtering plus agent_type_counts in overview stats.

Refactors local session ingestion to be extractor-driven. Claude extraction is wrapped as ClaudeCodeExtractor, new CodexExtractor and GeminiExtractor are added, reader now aggregates sessions via an extractor registry, and sync dispatches extraction per-session agent_type while only loading facets where available.

Introduces a Click-based primer CLI and packaging updates. Adds primer entrypoint, config.toml management/loading, init (generates config + runs Alembic), setup, server (launchd/systemd/pidfile manager + logs), hook, mcp, sync, and doctor; scripts/install_hook.py becomes a thin wrapper.

Also updates frontend API types/tests/UI labels to use agent_* fields, enhances seed data generation for multi-agent distributions, bumps ingest rate limit, and adds GitHub Pages deployment for the website/ build plus shared brand/tokens.css.

Written by Cursor Bugbot for commit 8105c96. This will update automatically on new commits.

Introduce agent_type dimension across the data model to support Codex CLI and Gemini CLI alongside Claude Code. Rename claude_version → agent_version and claude_helpfulness → agent_helpfulness with backward-compatible Pydantic validators. Add pricing for 9 OpenAI/Google models and agent_type filter to analytics queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
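The backward-compatible rename described above can be sketched without Pydantic. This is a plain-Python illustration of the idea only: `SessionPayload`, `LEGACY_FIELD_ALIASES`, and `normalize_legacy_fields` are hypothetical names, the field types are guesses, and the actual schema in this PR uses Pydantic validators rather than a pre-processing function.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical mapping of legacy field names to their renamed equivalents.
LEGACY_FIELD_ALIASES = {
    "claude_version": "agent_version",
    "claude_helpfulness": "agent_helpfulness",
}


@dataclass
class SessionPayload:
    """Illustrative stand-in for a session schema; not the real Primer model."""

    session_id: str
    agent_type: str = "claude_code"  # default named in the PR overview
    agent_version: str | None = None
    agent_helpfulness: str | None = None  # type is an assumption


def normalize_legacy_fields(raw: dict) -> dict:
    """Return a copy of raw with legacy claude_* keys mapped to agent_* keys.

    If both the old and the new key are present, the new key wins.
    """
    data = dict(raw)
    for old, new in LEGACY_FIELD_ALIASES.items():
        if old in data:
            legacy_value = data.pop(old)
            data.setdefault(new, legacy_value)
    return data


# A payload from an older client that still sends claude_version.
old_client = {"session_id": "s1", "claude_version": "1.0.0"}
session = SessionPayload(**normalize_legacy_fields(old_client))
```

Older clients keep working because their payloads are normalized before validation, while new clients can send `agent_*` fields directly.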

…covery

Introduce a SessionExtractor protocol and three implementations (Claude Code, Codex CLI, Gemini CLI) behind an extractor registry. list_local_sessions() and sync now dispatch through the registry, enabling multi-agent session discovery and ingestion from ~/.codex/ and ~/.gemini/ alongside ~/.claude/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
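A minimal sketch of the protocol-plus-registry shape this commit describes. Only `agent_type` and `get_data_dir()` appear in the diff further down; `list_sessions()`, `register()`, `get_extractor()`, `DirectoryExtractor`, and `list_sessions_for_all_agents()` are hypothetical names used for illustration, and the `*.jsonl` layout is an assumption.

```python
from __future__ import annotations

from pathlib import Path
from typing import Protocol


class SessionExtractor(Protocol):
    """Structural interface every agent-specific extractor satisfies."""

    agent_type: str

    def get_data_dir(self) -> Path: ...

    def list_sessions(self) -> list[Path]: ...  # hypothetical method


# Registry keyed by agent_type, so callers never import a concrete extractor.
_REGISTRY: dict[str, SessionExtractor] = {}


def register(extractor: SessionExtractor) -> None:
    _REGISTRY[extractor.agent_type] = extractor


def get_extractor(agent_type: str) -> SessionExtractor | None:
    return _REGISTRY.get(agent_type)


class DirectoryExtractor:
    """Toy extractor: every *.jsonl file under data_dir counts as a session."""

    def __init__(self, agent_type: str, data_dir: Path) -> None:
        self.agent_type = agent_type
        self._data_dir = data_dir

    def get_data_dir(self) -> Path:
        return self._data_dir

    def list_sessions(self) -> list[Path]:
        # A missing directory just means that agent is not installed.
        if not self._data_dir.is_dir():
            return []
        return sorted(self._data_dir.rglob("*.jsonl"))


def list_sessions_for_all_agents() -> list[tuple[str, Path]]:
    """Aggregate sessions from every registered extractor."""
    found: list[tuple[str, Path]] = []
    for agent_type, extractor in _REGISTRY.items():
        found.extend((agent_type, path) for path in extractor.list_sessions())
    return found
```

With this shape, sync can look up the extractor by a session's `agent_type` via `get_extractor()` instead of branching on agent names.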

…or commands

Replaces ~8 manual steps (python -m, scripts/, raw uvicorn, manual settings edits) with a single `pip install . && primer init && primer server start` workflow. Adds click-based CLI with commands: init, setup, server {start,stop,status,logs}, hook {install,uninstall,status}, mcp {install,uninstall,serve}, sync, doctor, and configure {get,set,list}. Includes config.toml bridge, launchd/systemd/pidfile server management, and refactored hook installer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 40+ tests across server_manager, setup, server, sync, doctor, configure, and config modules using monkeypatch, with no real infrastructure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The init test's load_config_into_env() set env vars without monkeypatch, causing test_admin_headers to see a stale admin key. Fix both the source (delenv in init test) and the victim (patch ADMIN_API_KEY in mcp fixture).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
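The leak pattern can be illustrated with the standard library alone. This sketch swaps in `unittest.mock.patch.dict` where the PR uses pytest's monkeypatch; the `load_config_into_env` below is a hypothetical stand-in that only mimics the "writes to os.environ" behaviour described above, and `run_isolated` is an invented helper.

```python
import os
from unittest import mock


def load_config_into_env(config: dict) -> None:
    """Hypothetical stand-in: copy config values straight into os.environ."""
    for key, value in config.items():
        os.environ[key] = value


def run_isolated(config: dict):
    """Apply config inside a patched environment and report what was visible.

    patch.dict snapshots os.environ on entry and restores it on exit, so
    anything load_config_into_env writes cannot leak into later tests.
    """
    with mock.patch.dict(os.environ, clear=False):
        load_config_into_env(config)
        return os.environ.get("PRIMER_ADMIN_API_KEY")


before = os.environ.get("PRIMER_ADMIN_API_KEY")
seen = run_isolated({"PRIMER_ADMIN_API_KEY": "test-key"})
after = os.environ.get("PRIMER_ADMIN_API_KEY")
```

Inside the patched block the key is visible; afterwards the environment is back to whatever it was before the call.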

src/primer/cli/server_manager.py

src/primer/hook/codex_extractor.py

src/primer/common/schemas.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/primer/cli/config.py

- Fix invalid systemd directive `StandardErrorOutput` → `StandardError`
- Close log file descriptor after spawning background server process
- Key cumulative token tracker per model to avoid cross-model deltas
- Only count `ExecCommandBegin` events (not paired End) for tool calls
- Populate `agent_type_counts` in overview stats from session data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
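To show why the cumulative token tracker needs to be keyed per model, here is a small sketch. It assumes each event reports a running total for one model; `CumulativeTokenTracker`, the event shape, and the model names are hypothetical and not taken from the Codex extractor.

```python
from collections import defaultdict


class CumulativeTokenTracker:
    """Turn per-model running totals into per-event deltas."""

    def __init__(self) -> None:
        # Last cumulative total seen for each model; unseen models start at 0.
        self._last_seen: dict[str, int] = defaultdict(int)

    def delta(self, model: str, cumulative_total: int) -> int:
        previous = self._last_seen[model]
        self._last_seen[model] = cumulative_total
        # Clamp at zero in case a counter resets mid-session.
        return max(cumulative_total - previous, 0)


# Hypothetical event stream: (model, cumulative input tokens so far).
events = [("model-a", 100), ("model-b", 40), ("model-a", 150)]

tracker = CumulativeTokenTracker()
deltas = [tracker.delta(model, total) for model, total in events]
# deltas == [100, 40, 50]
```

With a single shared "last total", the third event would be compared against model-b's 40 and report 110 instead of 50, which is the cross-model delta the commit avoids.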

src/primer/cli/commands/setup.py

src/primer/cli/server_manager.py

src/primer/mcp/reader.py

src/primer/hook/gemini_extractor.py

- Escape special chars in TOML string serializer (`\`, `"`)
- Read engineer ID from nested `engineer` key in setup response
- Escape XML special chars in launchd plist env values
- Fix operator precedence in Gemini extractor usage metadata access
- Remove dead `_project_path_to_dir_name` and `_find_transcript` from reader

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
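The two escaping fixes in this commit can be sketched as follows. `toml_basic_string` and `plist_string` are hypothetical helper names; the sketch covers only the two TOML characters named above (a complete serializer would also need to handle control characters such as newlines) and uses the standard library's `xml.sax.saxutils.escape` for the plist side.

```python
from xml.sax.saxutils import escape as xml_escape


def toml_basic_string(value: str) -> str:
    """Render value as a TOML basic string, escaping backslash and quote.

    Backslashes are escaped first so the backslashes added while escaping
    quotes are not doubled a second time.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def plist_string(value: str) -> str:
    """Wrap value in a plist <string> element with &, < and > escaped."""
    return f"<string>{xml_escape(value)}</string>"
```

Without the escaping, a value containing `"` would end the TOML string early, and a value containing `&` or `<` would make the generated plist invalid XML.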

src/primer/cli/config.py

src/primer/cli/commands/configure.py

pyproject.toml

src/primer/hook/gemini_extractor.py

…yment

Implements the marketing website plan using Astro 5 with Tailwind v4, React islands for interactive components, and MDX content collections for the blog. Includes dark hero, feature grid, comparison table, pricing page, and GitHub Pages deployment workflow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

frontend/src/types/api.ts

- Preserve comments in config.toml by doing line-level replacement in set_value instead of full round-trip through tomllib
- Coerce string values to int/float/bool before storing in TOML
- Mask sensitive values in `configure set` output (consistent with get/list)
- Scope S603 suppression to cli/ and hook/ via per-file-ignores
- Use exact session_id match in Gemini telemetry lookup instead of substring matching on raw log lines

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
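A rough sketch of the line-level replacement idea, under simplifying assumptions: keys live in single-level `[section]` tables, section headers carry no trailing comment, and an inline comment on the replaced line itself is dropped. `set_value_in_text` and the `server.port` key are hypothetical; the real `set_value` may differ.

```python
import re


def set_value_in_text(text: str, section: str, key: str, rendered: str) -> str:
    """Replace `key = ...` inside [section], leaving all other lines untouched.

    rendered is the already-serialized TOML value (for example 9000 or "abc").
    """
    key_re = re.compile(rf"^(\s*){re.escape(key)}\s*=")
    out = []
    current_section = None
    replaced = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            current_section = stripped[1:-1].strip()
        elif current_section == section and not replaced:
            match = key_re.match(line)
            if match:
                # Keep the original indentation; swap only the value.
                line = f"{match.group(1)}{key} = {rendered}"
                replaced = True
        out.append(line)
    if not replaced:
        raise KeyError(f"{section}.{key} not found")
    return "\n".join(out) + "\n"
```

Because untouched lines are copied verbatim, comments and blank lines survive, which a parse-and-reserialize round trip would discard.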

src/primer/hook/codex_extractor.py

src/primer/cli/server_manager.py

Aligns the TypeScript interface with the backend schema that now returns agent type distribution data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Merge _extract_session_id and _extract_project_path into single _extract_session_meta to avoid reading each rollout file twice
- Use check=False in systemd stop to handle gracefully when server isn't running instead of crashing with CalledProcessError

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
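The `check=False` change can be illustrated with a generic helper. `stop_service` is a hypothetical name, and the demo commands use the Python interpreter as a stand-in for the real stop command, which is not shown here.

```python
import subprocess
import sys


def stop_service(command: list) -> bool:
    """Run a stop command and report success instead of raising.

    With check=False a non-zero exit status (for example because the
    service is not running) is returned in result.returncode rather than
    raised as subprocess.CalledProcessError.
    """
    result = subprocess.run(command, check=False, capture_output=True, text=True)
    return result.returncode == 0


# Stand-in commands: one exits cleanly, one fails like a stop on a dead unit.
stopped = stop_service([sys.executable, "-c", "raise SystemExit(0)"])
not_running = stop_service([sys.executable, "-c", "raise SystemExit(5)"])
```

The caller can then print a friendly "server is not running" message when the helper returns `False`.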

src/primer/cli/commands/doctor.py

src/primer/cli/config.py

src/primer/cli/server_manager.py

- Preserve leading whitespace in config line replacement
- Guard API key truncation for short keys in doctor command
- XML-escape log path in launchd plist generation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/primer/cli/commands/configure.py

cursor · 2026-02-28T17:37:01Z

src/primer/cli/config.py

+        return float(value)
+    except ValueError:
+        pass
+    return value


Numeric-like config values silently coerced to wrong TOML types

Low Severity

_coerce_value converts all string values through int()/float()/bool() before writing to TOML. A value like "true" passed to set_value("auth.api_key", "true") becomes a TOML boolean (api_key = true), and an all-digit string becomes an integer. While get_value converts back via str(), the TOML file itself contains a type mismatch that could confuse external tools or future reads expecting a string.
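To make the concern concrete, here is a sketch of value coercion along the lines described, plus one possible mitigation. `coerce_value` is a reconstruction for illustration and not the actual `_coerce_value`; `STRING_KEYS` and `coerce_for_key` are hypothetical.

```python
from __future__ import annotations


def coerce_value(value: str) -> bool | int | float | str:
    """Guess a TOML type from a string: bool, then int, then float, else str."""
    lowered = value.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        pass
    return value


# Hypothetical allow-list of keys whose values must always stay strings.
STRING_KEYS = {"auth.api_key"}


def coerce_for_key(key: str, value: str) -> bool | int | float | str:
    """Skip type guessing for keys that are known to hold strings."""
    if key in STRING_KEYS:
        return value
    return coerce_value(value)
```

With plain guessing, an API key of `"true"` becomes a TOML boolean; keying the decision on the setting name keeps it a string while numeric settings are still coerced.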

Extend seed_data.py with multi-agent support so the dashboard shows realistic Codex/Gemini data alongside Claude Code. Also bump the ingest rate limit from 120/min to 300/min to avoid 429s during seeding.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Short sensitive values (<=12 chars) were displayed in full by `configure get`, `configure set`, and `configure list`. Now they show "***" consistent with the doctor command.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
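A small sketch of the masking rule described above. The 12-character threshold and the `"***"` placeholder come from the commit message; the helper name `mask_secret` and the prefix/suffix format used for longer values are assumptions.

```python
def mask_secret(value: str, visible: int = 4, min_length: int = 12) -> str:
    """Mask a sensitive value for display.

    Values of min_length characters or fewer are hidden entirely, because
    showing a prefix and suffix would reveal most of a short secret.
    """
    if len(value) <= min_length:
        return "***"
    # Assumed format for longer values: first and last few characters only.
    return f"{value[:visible]}...{value[-visible:]}"
```

For example, `mask_secret("short")` returns `"***"`, while an 18-character key keeps only its first and last four characters.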

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

✅ Fixed: Truncated Codex CLI version string in seed data
- Fixed the truncated version string from '0.1.2025051' (7 digits) to '0.1.20250513' (8 digits) to match the YYYYMMDD convention used by the other entries.
✅ Fixed: Telemetry log fully scanned for each Gemini session
- Replaced per-session full file reads with a class-level _build_telemetry_index method that parses telemetry.log once and indexes all token data by session_id, reducing complexity from O(N*M) to O(N+M).

Or push these changes by commenting:

@cursor push 3373e9f38b

Preview (3373e9f38b)

diff --git a/scripts/seed_data.py b/scripts/seed_data.py
--- a/scripts/seed_data.py
+++ b/scripts/seed_data.py
@@ -128,7 +128,7 @@
             "file_write",
             "function_call",
         ],
-        "versions": ["0.1.20250613", "0.1.20250530", "0.1.2025051"],
+        "versions": ["0.1.20250613", "0.1.20250530", "0.1.20250513"],
         "permission_modes": ["auto", "suggest", "ask"],
     },
     "gemini_cli": {

diff --git a/src/primer/hook/gemini_extractor.py b/src/primer/hook/gemini_extractor.py
--- a/src/primer/hook/gemini_extractor.py
+++ b/src/primer/hook/gemini_extractor.py
@@ -28,6 +28,7 @@
     """Extractor for Gemini CLI sessions (~/.gemini/)."""
 
     agent_type = "gemini_cli"
+    _telemetry_index: dict[str, dict[str, dict[str, int]]] | None = None
 
     def get_data_dir(self) -> Path:
         return Path.home() / ".gemini"
@@ -268,15 +269,13 @@
         meta.output_tokens += output_t
         meta.cache_read_tokens += cached_t
 
-    def _load_telemetry_tokens(self, session_path: Path) -> dict[str, dict[str, int]] | None:
-        """Try to load token counts from ~/.gemini/telemetry.log (OpenTelemetry)."""
-        telemetry_path = self.get_data_dir() / "telemetry.log"
-        if not telemetry_path.exists():
-            return None
+    @classmethod
+    def _build_telemetry_index(cls, telemetry_path: Path) -> dict[str, dict[str, dict[str, int]]]:
+        """Parse telemetry.log once and index all token data by session_id."""
+        if cls._telemetry_index is not None:
+            return cls._telemetry_index
 
-        session_id = session_path.stem
-        model_tokens: dict[str, dict[str, int]] = {}
-
+        index: dict[str, dict[str, dict[str, int]]] = {}
         try:
             with open(telemetry_path) as f:
                 for line in f:
@@ -289,29 +288,45 @@
                         continue
 
                     attrs = event.get("attributes", event.get("resourceAttributes", {}))
-                    # Verify this event belongs to our session via exact match
-                    event_session = attrs.get("session_id", event.get("session_id", ""))
-                    if event_session != session_id:
-                        continue
                     if not attrs:
                         continue
+                    session_id = attrs.get("session_id", event.get("session_id", ""))
+                    if not session_id:
+                        continue
 
                     model = attrs.get("model", "gemini-2.5-flash")
-                    if model not in model_tokens:
-                        model_tokens[model] = {
+                    if session_id not in index:
+                        index[session_id] = {}
+                    if model not in index[session_id]:
+                        index[session_id][model] = {
                             "input": 0,
                             "output": 0,
                             "cache_read": 0,
                             "cache_creation": 0,
                         }
 
-                    model_tokens[model]["input"] += attrs.get("input_token_count", 0)
-                    model_tokens[model]["output"] += attrs.get("output_token_count", 0)
-                    model_tokens[model]["cache_read"] += attrs.get("cached_content_token_count", 0)
+                    index[session_id][model]["input"] += attrs.get("input_token_count", 0)
+                    index[session_id][model]["output"] += attrs.get("output_token_count", 0)
+                    index[session_id][model]["cache_read"] += attrs.get(
+                        "cached_content_token_count", 0
+                    )
         except OSError:
+            cls._telemetry_index = {}
+            return cls._telemetry_index
+
+        cls._telemetry_index = index
+        return index
+
+    def _load_telemetry_tokens(self, session_path: Path) -> dict[str, dict[str, int]] | None:
+        """Try to load token counts from ~/.gemini/telemetry.log (OpenTelemetry)."""
+        telemetry_path = self.get_data_dir() / "telemetry.log"
+        if not telemetry_path.exists():
             return None
 
-        return model_tokens if model_tokens else None
+        session_id = session_path.stem
+        index = self._build_telemetry_index(telemetry_path)
+        tokens = index.get(session_id)
+        return tokens if tokens else None
 
     @staticmethod
     def _infer_project_path(session_file: Path) -> str | None:


cursor · 2026-02-28T19:03:53Z

scripts/seed_data.py

+            "file_write",
+            "function_call",
+        ],
+        "versions": ["0.1.20250613", "0.1.20250530", "0.1.2025051"],


Truncated Codex CLI version string in seed data

Low Severity

The Codex CLI version list contains "0.1.2025051", which appears to be a truncated date string. The other entries follow the pattern 0.1.YYYYMMDD (8-digit dates: "0.1.20250613", "0.1.20250530"), but this entry has only 7 digits after the final dot and is likely missing a trailing digit (e.g., "0.1.20250510" or "0.1.20250513").

cursor · 2026-02-28T19:03:53Z

src/primer/hook/gemini_extractor.py

+        except OSError:
+            return None
+
+        return model_tokens if model_tokens else None


Telemetry log fully scanned for each Gemini session

Medium Severity

_load_telemetry_tokens reads and parses the entire ~/.gemini/telemetry.log file for every single session during extraction. When syncing N Gemini sessions, this results in N full file reads of the telemetry log, producing O(N × M) total work where M is the number of telemetry lines. This can be very slow for users with many sessions and a large telemetry log.

ccf and others added 5 commits February 27, 2026 18:19

test: increase CLI test coverage to 95% (was ~55%)

486980d
