feat: add Logfire phased instrumentation by phernandez · Pull Request #692 · basicmachines-co/basic-memory

phernandez · 2026-03-24T21:14:18Z

Summary

Add Logfire telemetry integration with a config-gated bootstrap (setting logfire_enabled = true in config enables it; it is false by default, with zero overhead when off)
Instrument key paths: MCP tool execution, project routing, sync service, search, and API endpoints with structured spans and failure details
Improve MCP log clarity by binding loguru context to telemetry scopes
logfire is an optional dependency (pip install basic-memory[telemetry])

Details

Phased instrumentation strategy following docs/logfire-instrumentation-strategy.md:

Bootstrap & config gating — telemetry.py module with configure_telemetry() and span helpers
Root span boundaries — MCP server lifecycle, sync coordinator, API startup
Routing & sync spans — project context routing, sync service phases (discover → parse → save → link)
Failure-focused details — error attributes on spans, search scoring, tool input capture
Loguru ↔ telemetry binding — structured log context flows into OpenTelemetry spans
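The sync phases and failure-focused attributes listed above can be illustrated with a minimal sketch. It does not use Logfire or the PR's telemetry.py; the `span` helper, the `recorded` list, the `sync.*` span names, and the `error.*` attribute keys are hypothetical stand-ins chosen only to show the shape of nested phase spans that capture error details.

```python
from contextlib import contextmanager
from typing import Any, Iterator

# Hypothetical in-memory sink standing in for a telemetry backend.
recorded: list[dict[str, Any]] = []


@contextmanager
def span(name: str, **attrs: Any) -> Iterator[dict[str, Any]]:
    """Record a span and attach error attributes if the body raises."""
    record: dict[str, Any] = {"name": name, **attrs}
    try:
        yield record
    except Exception as exc:
        # Failure-focused details: keep the exception type and message.
        record["error.type"] = type(exc).__name__
        record["error.message"] = str(exc)
        raise
    finally:
        # Spans are appended when they close, so children land before parents.
        recorded.append(record)


def run_sync(paths: list[str]) -> None:
    """Walk the four sync phases, each inside its own child span."""
    with span("sync.run", file_count=len(paths)):
        for phase in ("discover", "parse", "save", "link"):
            with span(f"sync.{phase}"):
                if phase == "parse" and not paths:
                    raise ValueError("nothing to parse")
```

Because a span is recorded on exit, a failing phase shows up with its error attributes and the enclosing root span carries the same error, which is what makes the trace useful for diagnosing where a sync run stopped.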

Setup:

pip install basic-memory[telemetry] (logfire is already included in the dev dependencies)
Set logfire_enabled = true in ~/.basic-memory/config.json
Optionally set LOGFIRE_TOKEN env var for backend export to Logfire

Includes comprehensive test coverage (telemetry tests for each instrumented layer) and a smoke workflow (just test-smoke-logfire).

Test plan

just fast-check passes
just test-smoke-logfire smoke test validates spans are emitted when enabled
Verify no performance impact when logfire_enabled is false (telemetry is fully no-op)
Verify graceful degradation when logfire package is not installed

🤖 Generated with Claude Code

Signed-off-by: phernandez <paul@basicmachines.co>

github-actions · 2026-03-24T21:14:37Z

Claude finished @phernandez's task in 3m 31s

PR Review: feat/logfire-phased-instrumentation

Summary

This is a well-structured telemetry integration. The config-gated bootstrap, zero-overhead no-op path when disabled, and layered span / scope / operation abstractions are clean. The test coverage via fake context managers is thorough. A few issues need attention before merge.

Issues

1. `# pragma: no cover` on functions that ARE tested

init_cli_logging, init_mcp_logging, and init_api_logging are all marked # pragma: no cover in config.py (lines 992, 1003, 1014), but tests/test_telemetry.py::test_init_logging_functions_configure_telemetry_and_logging directly calls all three. This means the test provides coverage but it's silently excluded from the coverage report, undermining the 100% coverage requirement.

Since these functions are tested (via monkeypatched dependencies), the pragmas should be removed. If there's a reason they can't be tested fully (e.g., the log_to_file=True path causing real file I/O), that specific branch should carry the pragma — not the entire function.
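A small sketch of the branch-level exclusion suggested here, assuming coverage.py's default behavior in which a pragma on a line that introduces a block excludes that whole block. The function name and return values are hypothetical and only stand in for the real init functions.

```python
def init_logging_sketch(log_to_file: bool = False) -> str:
    """Return the name of the sink that would be configured."""
    if log_to_file:  # pragma: no cover
        # Real file I/O would happen here; only this branch is excluded.
        return "file"
    # The default path stays visible in the coverage report.
    return "console"
```

With the pragma on the `if` line only, the rest of the function still counts toward coverage, so a regression in the tested path would show up in the report.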

2. PR description says "LOGFIRE_TOKEN enables" — code says `logfire_enabled = true`

The PR summary states "LOGFIRE_TOKEN enables, absent disables — zero overhead when off" but the actual gate is logfire_enabled: bool in BasicMemoryConfig (config.py line 146). A user who sets LOGFIRE_TOKEN in their environment but leaves logfire_enabled at its default (false) will see no telemetry, with no warning.

LOGFIRE_TOKEN is consumed by Logfire's own configure() for the backend connection — it doesn't auto-enable the integration. The docs/README and strategy doc should be updated to clarify the two-step setup:

Set logfire_enabled = true in ~/.basic-memory/config.json
Set LOGFIRE_TOKEN for backend export (optional if only local inspection)

3. Duplicate helper functions: `_canonicalize_project_name` vs `_configured_project_name`

project_context.py contains two nearly-identical helpers:

_canonicalize_project_name (lines 67–85): returns the configured name if found, else returns project_name unchanged
_configured_project_name (lines 88–101): returns the configured name if found, else returns None

The only difference is the fallback. Both iterate config.projects with the same permalink logic. The names are confusingly similar and their distinction ("canonicalize" vs "configured") isn't obvious. Consider consolidating into one with a clear name or at minimum adding a comment explaining why both exist.
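One possible shape for the consolidation, offered as a sketch only: a single lookup with an explicit fallback switch. The `_permalink` helper below is a simplified stand-in for the project's real permalink logic, and the function name and `fallback_to_input` parameter are hypothetical.

```python
from typing import Iterable, Optional


def _permalink(name: str) -> str:
    """Simplified stand-in for the project's permalink normalization."""
    return name.strip().lower().replace(" ", "-")


def find_configured_project_name(
    project_name: str,
    configured_names: Iterable[str],
    *,
    fallback_to_input: bool,
) -> Optional[str]:
    """Return the configured spelling of a project name.

    When no configured project matches, return the input unchanged if
    fallback_to_input is True, otherwise return None.
    """
    target = _permalink(project_name)
    for name in configured_names:
        if _permalink(name) == target:
            return name
    return project_name if fallback_to_input else None
```

Making the fallback a required keyword argument forces each call site to state which behavior it wants, which removes the need to remember what two similar names mean.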

4. `telemetry.operation` vs `telemetry.scope` are functionally identical

telemetry.py lines 178–180:

@contextmanager
def operation(name: str, **attrs: Any) -> Iterator[None]:
    with scope(name, **attrs):
        yield

operation is a pure pass-through to scope with no behavioral difference. The semantic distinction ("root operation" vs "nested scope") is documented but not enforced, which makes it easy to misuse (scope at root level, operation nested). Either add enforcement (e.g., track nesting depth via a ContextVar and warn) or collapse them into one function with an is_root kwarg. At minimum, add a docstring note stating that the two are aliases and the distinction is convention only.

5. Sync context managers wrapping `yield` in `@asynccontextmanager` — span lifecycle covers full tool execution

In get_project_client (project_context.py):

with telemetry.scope("routing.resolve_client", ...):
    async with get_client(...) as client:
        active_project = await get_active_project(...)
        yield client, active_project   # span stays open until caller exits

The yield is inside the with telemetry.scope(...) block, so the span remains open for the entire duration of the caller's async with get_project_client(...) block — including all tool I/O. This means routing.resolve_client spans can last seconds rather than milliseconds. The span name implies it's measuring routing resolution time, not tool execution time. Consider closing the routing span before yielding, or renaming it to routing.tool_execution to accurately reflect what's being measured.
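The first option, closing the routing span before yielding, can be sketched as follows. The `scope` and `get_client` helpers here are simplified stand-ins that only record events, and the resolved values are placeholders; the point is the ordering of span close relative to the caller's work.

```python
import asyncio
from contextlib import asynccontextmanager, contextmanager
from typing import AsyncIterator, Iterator

# Hypothetical event log standing in for emitted spans.
events: list[str] = []


@contextmanager
def scope(name: str) -> Iterator[None]:
    events.append(f"open {name}")
    try:
        yield
    finally:
        events.append(f"close {name}")


@asynccontextmanager
async def get_client() -> AsyncIterator[str]:
    # Stand-in for the real client context manager.
    yield "client"


@asynccontextmanager
async def get_project_client_sketch() -> AsyncIterator[tuple[str, str]]:
    async with get_client() as client:
        # Only the resolution work sits inside the routing span.
        with scope("routing.resolve_client"):
            active_project = "demo-project"  # placeholder for the real lookup
        # The span is already closed when control passes to the caller.
        yield client, active_project


async def run_tool() -> None:
    async with get_project_client_sketch():
        events.append("tool work")


asyncio.run(run_tool())
```

The client context still spans the caller's block, since the client must stay usable, but the routing span now measures only resolution time.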

6. `logfire` added as a hard dependency in `pyproject.toml`

pyproject.toml adds "logfire>=4.19.0" to the core dependencies. Since telemetry is optional (config-gated), this adds a mandatory install dependency for all users. The _load_logfire() lazy import guard handles the missing-package case gracefully, but users who don't want telemetry still pay the install cost (several MB).

Consider moving logfire to an optional extras group (e.g., [telemetry]) and documenting the install step in the telemetry setup guide. This aligns with the "zero overhead when off" design goal.

Team Checklist

| Item | Status | Notes |
|---|---|---|
| Follows Basic Memory coding conventions | ✅ | Literate comments, section headers, decision-point comments |
| Python 3.12+ type annotations and async patterns | ✅ | |
| SQLAlchemy 2.0 best practices | ✅ | No DB changes |
| FastAPI and Typer conventions | ✅ | `ctx.with_resource()` for CLI span is idiomatic |
| 100-character line length | ✅ | |
| No commented-out code | ✅ | |
| Unit tests for new functions | ✅ | Comprehensive fake-logfire coverage |
| Integration tests for MCP tools | ✅ | `test_tool_telemetry.py` covers all major tools |
| Test coverage edge cases | ✅ | Missing dependency, disabled, retry without `send_to_logfire` |
| 100% test coverage maintained | ⚠️ | `# pragma: no cover` on tested init functions (see Issue 1) |
| Documentation updated | ✅ | `docs/logfire-instrumentation-strategy.md` added |
| CLAUDE.md updated if conventions change | N/A | No convention changes |
| MCP tools follow atomic, composable design | ✅ | |
| Database changes include Alembic migrations | ✅ | No DB changes |
| Preserves local-first architecture | ✅ | Zero overhead path when disabled |
| No hardcoded secrets | ✅ | |
| Input validation for MCP tools | ✅ | |
| Proper error handling and logging | ✅ | `pop_telemetry_warnings()` surfaces errors after logger init |
| Performance when disabled | ✅ | `telemetry_enabled()` short-circuits all paths |
| No sensitive data in logs | ✅ | Only structural metadata in span attributes |
| Windows compatibility | ✅ | No path operations in telemetry code |
| No emojis/unicode in console/log output | ✅ | |

Minor Observations

telemetry.py lines 120–123 use # pragma: no cover on three adjacent lines. Standard practice is a single pragma on the except Exception line only; the pragmas on the body lines are redundant but harmless.
_load_logfire() is called on every span() invocation (when enabled). Python's import cache makes this fast, but a module-level cache of the resolved logfire object after configure_telemetry would be marginally cleaner.
The with telemetry.scope(...) blocks in resolve_workspace_parameter and get_active_project are synchronous context managers used in async functions. This is correct with asyncio/contextvars and works as expected.
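For the caching observation above, one lightweight option is to memoize the loader itself. This is a sketch with a hypothetical function name, not the PR's `_load_logfire()`; it relies only on the standard library.

```python
import importlib
from functools import lru_cache
from types import ModuleType
from typing import Optional


@lru_cache(maxsize=None)
def load_optional_module(module_name: str) -> Optional[ModuleType]:
    """Resolve an optional module once and reuse the result on later calls."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # The miss is cached too, so a missing package is only probed once.
        return None
```

One trade-off to keep in mind: because a failed import is cached as well, a package installed while the process is running would not be picked up until `load_optional_module.cache_clear()` is called or the process restarts.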

Overall this is a solid, well-tested implementation. Issues 1 (coverage pragma) and 6 (hard dependency) are the most important to address before merge.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23e0073ce6

chatgpt-codex-connector · 2026-03-24T21:22:25Z

src/basic_memory/mcp/project_context.py

+        # project. Resolving it through /v2/projects/resolve creates noisy false-negative
+        # logs before the tool recovers with the active project anyway.
+        # Outcome: skip project resolution and keep the path within the active project.
+        if project_prefix and (configured_prefix is not None or project is None):


Resolve memory URL prefixes even when project is preselected

The new guard if project_prefix and (configured_prefix is not None or project is None) skips /v2/projects/resolve whenever a project is already fixed and the prefix is not present in local config. In cloud routing, local config may not contain all accessible projects, so valid inputs like memory://other-project/note can be misinterpreted as a path inside the active project (for example <active>/other-project/note) instead of routing to other-project. This regresses cross-project memory URL behavior in any flow that has already resolved project.
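To make the routing concern concrete, here is a simplified sketch of unconditional prefix resolution with a fallback to the active project. It is not the project's `resolve_project_and_path`; the function names are hypothetical, and `resolve_remote` is a stand-in for the call to /v2/projects/resolve that returns a project name or None.

```python
from typing import Callable, Optional

MEMORY_SCHEME = "memory://"


def split_prefix(identifier: str) -> tuple[Optional[str], str]:
    """Split 'memory://prefix/rest' into (prefix, rest); prefix is None if absent."""
    path = identifier.removeprefix(MEMORY_SCHEME).strip("/")
    prefix, sep, rest = path.partition("/")
    if sep and prefix and rest:
        return prefix, rest
    return None, path


def resolve_project_and_path_sketch(
    identifier: str,
    active_project: str,
    resolve_remote: Callable[[str], Optional[str]],
) -> tuple[str, str]:
    """Always try to resolve the prefix as a project before treating it as a folder."""
    project_prefix, remainder = split_prefix(identifier)
    full_path = identifier.removeprefix(MEMORY_SCHEME).strip("/")
    if project_prefix:
        resolved = resolve_remote(project_prefix)
        if resolved is not None:
            # The prefix names a real project, even if local config lacks it.
            return resolved, remainder
    # No project matched: keep the whole path inside the active project.
    return active_project, full_path
```

With a resolver that knows about `other-project`, the URL `memory://other-project/note` routes to that project, while `memory://folder/note` stays inside the active project because resolution returns None.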

The telemetry PR introduced a guard that skipped /v2/projects/resolve when a project was already fixed and the URL prefix wasn't in local config. This broke cloud routing where remote projects aren't locally configured: valid URLs like memory://other-project/note would be misinterpreted as paths within the active project.

Revert to the original unconditional `if project_prefix:` guard and remove the now-unused `_configured_project_name()` helper.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>

github-actions · 2026-03-24T22:28:33Z

Claude finished @phernandez's task in 1m 10s

PR Review Update: feat/logfire-phased-instrumentation

Tasks:

Identify new commits since last review
Review new changes (fe32ac5: fix cross-project memory URL resolution)
Check if previous review issues were addressed
Post final review update

What changed in `fe32ac5`

The new commit directly addresses the Codex P1 regression: cross-project memory URL resolution was broken when a project was already pre-selected. The fix is clean and correct:

Removes _configured_project_name helper (which was used only in the now-reverted guard)
Reverts the conditional in resolve_project_and_path from if project_prefix and (configured_prefix is not None or project is None): back to the simpler if project_prefix: — restoring unconditional resolution via /v2/projects/resolve
Removes the test that validated the reverted optimization (appropriate — it tested behavior that no longer exists)

This is the right call. The optimization was trading correctness for noise reduction. The "noisy false-negative logs" concern can be addressed at the logging level (e.g., log at DEBUG instead of WARNING on project-not-found), not by skipping resolution.

This also addresses Issue 3 from the prior review: _configured_project_name is gone, so the duplicate-helper concern is resolved. _canonicalize_project_name (lines 67–85) remains and is still used, which is correct.

Remaining open issues from prior review

These were not addressed in this push:

| # | Issue | Status |
|---|---|---|
| 1 | `# pragma: no cover` on `init_cli_logging`, `init_mcp_logging`, `init_api_logging` in `config.py`, although these functions are fully tested | Still open |
| 2 | PR description says "LOGFIRE_TOKEN enables" but actual gate is `logfire_enabled: bool` in config; docs/README need clarification | Still open |
| 4 | `telemetry.operation` is a pure pass-through to `telemetry.scope`; semantic distinction is convention only, not enforced | Still open (minor) |
| 5 | `routing.resolve_client` span in `get_project_client` wraps the `yield`, so it stays open for the entire tool execution rather than just routing resolution time | Still open (minor) |
| 6 | `logfire` is a hard dependency in `pyproject.toml`; should be an optional extras group given the "zero overhead when off" design goal | Still open |

Issues 1 and 6 remain the most important before merge. The fix in this commit is solid — the cross-project routing regression is the kind of subtle correctness bug that would be painful to track down in production.

- Remove `# pragma: no cover` from init_cli/mcp/api_logging since they are fully tested (issue 1)
- Move logfire to optional dependency `pip install basic-memory[telemetry]` while keeping it in dev deps for testing (issue 6)
- Collapse `telemetry.operation` into an alias for `telemetry.scope` with a comment explaining the convention (issue 4)
- Rename `routing.resolve_client` span to `routing.client_session` to accurately reflect that it covers the full tool execution lifetime, not just routing resolution (issue 5)
- Update PR description to clarify two-step setup: logfire_enabled config flag + optional LOGFIRE_TOKEN (issue 2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>

github-actions · 2026-03-25T00:23:01Z

Claude finished @phernandez's task in 2m 3s

PR Review Update: feat/logfire-phased-instrumentation

Tasks:

Gather context from PR and prior reviews
Read new commits since last review (02b42c6 refactor: address PR review feedback)
Check if prior issues were addressed
Review key files for outstanding issues
Post final review

What changed in `02b42c6`

This commit addresses the five issues left open after the prior review; Issue 3 was already fixed in `fe32ac5`. Here is the status of all six:

| # | Prior Issue | Status |
|---|---|---|
| 1 | `# pragma: no cover` on `init_cli/mcp/api_logging` in `config.py`, although these are fully tested | ✅ Fixed: pragmas removed |
| 2 | PR description said "LOGFIRE_TOKEN enables" but actual gate is `logfire_enabled: bool` | ✅ Fixed: PR description updated; `logfire_enabled = true` is now the documented first step |
| 3 | Duplicate `_canonicalize_project_name` / `_configured_project_name` helpers | ✅ Already fixed in `fe32ac5` (prior review cycle) |
| 4 | `telemetry.operation` was a pure pass-through to `telemetry.scope` with no enforcement | ✅ Fixed: collapsed to a module-level alias with a comment documenting the convention-only distinction |
| 5 | `routing.resolve_client` span stayed open for the full tool execution lifetime, not just routing | ✅ Fixed: renamed to `routing.client_session`, accurately reflecting what it measures |
| 6 | `logfire` was a hard dependency for all users | ✅ Fixed: moved to `[project.optional-dependencies] telemetry = ["logfire>=4.19.0"]`; kept in `[dependency-groups] dev` for tests |

Remaining `# pragma: no cover` (all appropriate)

The remaining pragmas in telemetry.py are on genuinely hard-to-test paths:

Lines 120–123: except Exception as exc inside logfire.configure() — requires failure injection into a third-party call
Lines 190–192: if logfire is None inside span() — only reachable if logfire is importable at configure_telemetry time but disappears afterward (can't happen in practice)

These are appropriate uses per the project's # pragma: no cover policy.

Team Checklist

| Item | Status | Notes |
|---|---|---|
| Follows Basic Memory coding conventions | ✅ | Literate comments, section headers, decision-point comments throughout |
| Python 3.12+ type annotations and async patterns | ✅ | |
| SQLAlchemy 2.0 best practices | ✅ | No DB changes |
| FastAPI and Typer conventions | ✅ | |
| 100-character line length | ✅ | |
| No commented-out code | ✅ | |
| Unit tests for new functions | ✅ | Comprehensive fake-logfire coverage in `tests/test_telemetry.py` |
| Integration tests for MCP tools | ✅ | `test_tool_telemetry.py` covers all major tools |
| Test coverage edge cases | ✅ | Missing dependency, disabled path, retry without `send_to_logfire` |
| 100% test coverage maintained | ✅ | Pragmas removed from `init_cli/mcp/api_logging` (Issue 1 resolved) |
| Documentation updated | ✅ | `docs/logfire-instrumentation-strategy.md` added; span name updated in docs to match rename |
| CLAUDE.md updated if conventions change | N/A | No convention changes |
| MCP tools follow atomic, composable design | ✅ | |
| Database changes include Alembic migrations | N/A | No DB changes |
| Preserves local-first architecture | ✅ | Zero overhead path when disabled |
| No hardcoded secrets | ✅ | |
| Input validation for MCP tools | ✅ | |
| Proper error handling and logging | ✅ | `pop_telemetry_warnings()` surfaces errors after logger init |
| Performance when disabled | ✅ | `telemetry_enabled()` short-circuits all paths |
| No sensitive data in logs | ✅ | Only structural metadata in span attributes |
| Windows compatibility | ✅ | No path operations in telemetry code |
| No emojis/unicode in console/log output | ✅ | |
| `logfire` optional dependency | ✅ | Moved to `[telemetry]` extras; dev deps retain it for tests |

Summary

All prior issues are resolved. The PR is clean:

logfire is an optional install (pip install basic-memory[telemetry])
Telemetry is fully no-op when logfire_enabled = false (default)
Coverage pragmas are only on genuinely untestable paths
Span names accurately reflect their lifetime scope
telemetry.operation / telemetry.scope distinction is documented as convention-only with an alias

This is ready to merge.

phernandez added 8 commits March 20, 2026 14:41

feat: add logfire bootstrap and config gating

738ce9d

Signed-off-by: phernandez <paul@basicmachines.co>

feat: add root span telemetry boundaries

54b5572

Signed-off-by: phernandez <paul@basicmachines.co>

feat: add routing and sync telemetry spans

f0b4437

Signed-off-by: phernandez <paul@basicmachines.co>

feat: add failure-focused telemetry details

1639bff

Signed-off-by: phernandez <paul@basicmachines.co>

feat: bind loguru context to telemetry scopes

396abd0

Signed-off-by: phernandez <paul@basicmachines.co>

docs: add local logfire smoke playbook

3600419

Signed-off-by: phernandez <paul@basicmachines.co>

feat: extend telemetry smoke workflow

b1e8b9e

Signed-off-by: phernandez <paul@basicmachines.co>

feat: improve mcp log clarity

23e0073

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector bot reviewed Mar 24, 2026

phernandez merged commit 4791e19 into main Mar 25, 2026
25 checks passed

phernandez deleted the feat/logfire-phased-instrumentation branch March 25, 2026 01:39

phernandez merged 10 commits into main from feat/logfire-phased-instrumentation