fix: recent_activity dedup + pagination across MCP tools#595
fix: recent_activity dedup + pagination across MCP tools#595phernandez merged 4 commits intomainfrom
Conversation
A user reported that recent_activity returns only 1 distinct note even
though 9 were modified. Root cause: the API defaults to returning ALL
types (entity, observation, relation), so a single well-connected entity
fills the entire page with its observations and relations.
Changes:
- 🎯 Default to entity-only in recent_activity MCP tool when no type specified
- 📄 Expose page/page_size params on recent_activity (were hardcoded)
- ✅ Add has_more field to GraphContext and SearchResponse (N+1 trick)
- 📝 Show pagination guidance ("Use page=2 to see more") in text output
The API layer stays unchanged — it still accepts all types. This is
purely an MCP tool UX default so LLMs see distinct notes by default.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 2m 14s —— View job PR Review: fix/recent-activity-dedup-pagination
|
| Item | Status |
|---|---|
| Follows Basic Memory conventions | ✅ Mostly — see speculative getattr below |
| Python 3.12+ type annotations & async | Union[str, List[str]] at line 39 vs str | list[dict] at line 48 |
| SQLAlchemy 2.0 best practices | ✅ No direct DB changes |
| FastAPI/Typer conventions | ✅ |
| 100-char line limit | ✅ |
| No commented-out code | ✅ |
Speculative getattr/hasattr — violates CLAUDE.md convention
_format_project_output (recent_activity.py:464–511) and _extract_recent_rows (line 354) use unnecessary hasattr/getattr guards on typed objects. After the type-based grouping loop already separates results into entities, observations, and relations lists, the code still does defensive field access like:
# recent_activity.py:464 — entity.title is never None on EntitySummary if hasattr(entity, "title") and entity.title else "Untitled"
# recent_activity.py:479 — obs.category always exists on ObservationSummary
category = getattr(obs, "category", "general") if hasattr(obs, "category") else "general"
# recent_activity.py:354 — all three summary types have created_at
"created_at": (primary.created_at.isoformat() if getattr(primary, "created_at", None) else None),All three summary types (EntitySummary, RelationSummary, ObservationSummary) define these fields. CLAUDE.md says: "Never use getattr(obj, "attr", default) when unsure about attribute names. Check the class definition or source code first." After the type-filtering loop, the field is always present and can be accessed directly.
Compatibility — Windows path splitting ⚠️
_get_project_activity (recent_activity.py:327) and _format_project_output (line 468) split file_path using /:
# recent_activity.py:327 — breaks on Windows
folder = "/".join(result.primary_result.file_path.split("/")[:-1])
# recent_activity.py:468-470 — same issue
folder_path = "/".join(entity.file_path.split("/")[:-1])CLAUDE.md checklist says "File path comparisons must be windows compatible." Use pathlib.Path(file_path).parent instead:
from pathlib import Path
folder = str(Path(result.primary_result.file_path).parent)Testing & Documentation
| Item | Status |
|---|---|
| Unit tests for new functions | ✅ Good coverage of formatting helpers |
| Integration tests for MCP tools | ✅ Entity-only default, explicit type filtering, pagination |
| Edge cases | ✅ has_more True/False, no results, observation truncation |
| 100% test coverage | ✅ Appears maintained |
| Tool contract test updated | ✅ page/page_size added to recent_activity signature |
| Documentation updated | ✅ Docstring explains new params and entity-only default |
Architecture
| Item | Status |
|---|---|
| MCP tools atomic/composable | ✅ |
| No DB migrations needed | ✅ Only schema + service changes |
| Local-first principles preserved | ✅ |
| Knowledge graph consistency | ✅ |
| API layer unchanged (types still accepted) | ✅ Confirmed in PR description |
N+1 pagination trick is cleanly applied in both context_service.py:107 (fetch limit + 1) and search_router.py:52 (fetch page_size + 1). Both correctly trim before returning. This is the right approach to avoid COUNT queries.
has_more in discovery mode — when iterating multiple projects, per-project has_more values are silently discarded. This is acceptable for a summary view, but worth noting in a follow-up if users page through discovery results.
Security & Performance
| Item | Status |
|---|---|
| No hardcoded secrets | ✅ |
| Input validation | ✅ SearchItemType enum validation raises ValueError |
| Error handling | ✅ |
| Performance | ✅ N+1 trick avoids COUNT queries |
| No sensitive data in logs | ✅ |
Minor Observations
-
Type annotation style inconsistency —
Union[str, List[str]]on line 39 vsstr | list[dict]on line 48. Prefer the Python 3.10+|style throughout since the project targets 3.12+. -
SearchItemTypestring comparison —_extract_recent_rowscomparesprimary.type == "entity"using string literals. Minor — usingSearchItemType.ENTITY.valuewould be more defensible but is low priority. -
Discovery mode
has_morenot surfaced — if a project returnshas_more=Truein discovery mode, it's not visible to the user. Low priority since discovery mode is a summary view.
Checklist Results
- Follows Basic Memory coding conventions (mostly — speculative
getattris a violation) - Python 3.12+ type annotations and async patterns (inconsistent style, not blocking)
- SQLAlchemy 2.0 best practices
- FastAPI and Typer conventions followed
- 100-character line length maintained
- No commented-out code blocks
- Unit tests for new functions/methods
- Integration tests for new MCP tools
- Test coverage for edge cases
- 100% test coverage maintained
- Documentation updated (docstrings)
- CLAUDE.md conventions — speculative
getattr/hasattrshould be removed - MCP tools follow atomic, composable design
- No database migrations needed
- Preserves local-first architecture
- Knowledge graph operations consistent
- No hardcoded secrets
- Input validation in MCP tools
- Proper error handling
- No sensitive data in logs
- Windows compatibility — path splitting with
/should usepathlib.Path - No emojis in console/log output (MCP text output is LLM-facing, not console)
Verdict
Two issues should be fixed before merge:
- Windows path compatibility in
_get_project_activityand_format_project_output - Speculative
getattr/hasattrpatterns on typed objects — replace with direct attribute access
Everything else looks good. The core changes (entity-only default, has_more, exposed pagination) are correct and well-tested.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14eb35e82e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "page": page, | ||
| "page_size": page_size, |
There was a problem hiding this comment.
Validate pagination arguments before calling recent API
The new page/page_size inputs are forwarded as-is, but this tool does not enforce page >= 1 and page_size >= 1. In project mode these values are passed to /v2/projects/{id}/memory/recent, where offset is derived from page and can become negative; Postgres rejects negative OFFSET, so values like page=0 now surface as server/tool errors instead of a clear validation error. This regression appears only after exposing pagination params here, since the previous implementation always used safe constants.
Useful? React with 👍 / 👎.
Prevents negative OFFSET errors from reaching the database when invalid pagination values (page=0, page_size=-1, etc.) are passed. Raises clear ValueError messages at the tool boundary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 2m 20s —— View job PR Review Update: fix/recent-activity-dedup-paginationTasks
New Commit: e3f8c83 — validate page/page_size before API callThe new commit directly addresses the Codex review comment: What was added (recent_activity.py:114–121): if page < 1:
raise ValueError(f"page must be >= 1, got {page}")
if page_size < 1:
raise ValueError(f"page_size must be >= 1, got {page_size}")
if page_size > 100:
raise ValueError(f"page_size must be <= 100, got {page_size}")The upper bound ( Remaining Issues from Previous ReviewThe two items flagged in the first review are still unresolved: 1. Speculative
|
| Location | Guarded field | Actual schema |
|---|---|---|
recent_activity.py:335 |
file_path (hasattr) |
All three types have it |
recent_activity.py:363 |
created_at (getattr) |
All three types have it |
recent_activity.py:388 |
title (hasattr) |
All three types have it |
recent_activity.py:417 |
title (hasattr) |
All three types have it |
recent_activity.py:473 |
title (hasattr) |
EntitySummary.title: str |
recent_activity.py:476 |
file_path (hasattr) |
EntitySummary.file_path: str |
recent_activity.py:489 |
category (getattr) |
ObservationSummary.category: str |
recent_activity.py:499 |
content (getattr) |
ObservationSummary.content: str |
recent_activity.py:513 |
relation_type (getattr) |
RelationSummary.relation_type: str |
recent_activity.py:518 |
from_entity (getattr) |
RelationSummary.from_entity |
recent_activity.py:520 |
to_entity (getattr) |
RelationSummary.to_entity |
2. Windows path splitting — still uses / string split
Two locations still split file_path with / instead of pathlib.Path:
# recent_activity.py:336
folder = "/".join(result.primary_result.file_path.split("/")[:-1])
# recent_activity.py:477-478
folder_path = "/".join(entity.file_path.split("/")[:-1])Should be:
from pathlib import Path
folder = str(Path(result.primary_result.file_path).parent)Checklist
- Follows Basic Memory coding conventions — partial (speculative getattr still present)
- Python 3.12+ type annotations and async patterns — minor style inconsistency (
Union[str, List[str]]vsstr | list[str]on same function, non-blocking) - SQLAlchemy 2.0 best practices
- FastAPI and Typer conventions followed
- 100-character line length maintained
- No commented-out code blocks
- Unit tests for new functions/methods
- Integration tests for new MCP tools
- Test coverage for edge cases — new pagination validation tests are thorough
- 100% test coverage maintained
- Documentation updated (docstrings)
- CLAUDE.md conventions — speculative
getattr/hasattrshould be removed - MCP tools follow atomic, composable design
- No database migrations needed
- Preserves local-first architecture
- Knowledge graph operations consistent
- No hardcoded secrets
- Input validation in MCP tools — improved (page/page_size validation added)
- Proper error handling
- No sensitive data in logs
- Windows compatibility — path splitting with
/should usepathlib.Path
Verdict
The new commit cleanly resolves the Codex-flagged pagination validation issue. Two items from the first review remain open before merge: the speculative getattr/hasattr patterns (CLAUDE.md violation) and the Windows-incompatible / path splitting. Both are mechanical fixes.
…tting
All three summary types (EntitySummary, ObservationSummary, RelationSummary)
define title, file_path, created_at as required fields. Replace defensive
getattr/hasattr with direct attribute access per CLAUDE.md guidelines.
Replace string-based path splitting ("/".join(path.split("/")[:-1])) with
PurePosixPath().parent for cross-platform correctness.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 2m 9s —— View job PR Review Update: fix/recent-activity-dedup-paginationTasks
New Commit: 024bb04 — remove speculative getattr/hasattr and use pathlib for path splittingThis commit directly resolves both remaining issues from the previous reviews. Issue 1: Speculative getattr/hasattr — RESOLVED ✅All eleven guarded field accesses have been replaced with direct attribute access. Verified against the schemas in
All guards removed correctly. The Issue 2: Windows path splitting — RESOLVED ✅Both locations replaced with # Before
folder = "/".join(result.primary_result.file_path.split("/")[:-1])
# After
folder = str(PurePosixPath(result.primary_result.file_path).parent)
if folder and folder != ".": # correctly filters root-level files
active_folders.add(folder)
Minor Observations (non-blocking, unchanged from previous reviews)
Neither blocks merge. Checklist Results
VerdictAll blocking issues from both previous reviews are resolved. The three commits together deliver the entity-only default, exposed pagination, N+1 |
5 new tests in test-int/mcp/test_pagination_integration.py: - search_notes: page 1 and page 2 return disjoint result sets - search_notes: has_more transitions from True to False on last page - recent_activity: page 1 and page 2 return disjoint result sets - recent_activity: has_more text guidance appears/disappears correctly - recent_activity: entity-only default filters out observations/relations All tests go through the full MCP → API → service → DB stack using write_note to seed real data, then verify pagination offsets produce correct non-overlapping windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 2m 15s —— View job PR Review Update: fix/recent-activity-dedup-paginationTasks
New Commit: 5eb6db3 — integration tests for pagination through to the databaseThis commit adds 5 integration tests in Tests added:
Test isolation is correct — Text assertions are correct — JSON output assertions are correct — Minor Observation (non-blocking)
SummaryAll four commits together deliver:
All blocking issues from the three prior reviews are resolved. Ready to merge. |

Summary
recent_activityMCP tool — prevents a single well-connected entity from filling the page with its observations/relations when no type is specifiedpage/page_sizeparams onrecent_activity(were hardcoded to 1/10)has_morefield toGraphContextandSearchResponseusing N+1 trick — no expensive COUNT queriesThe API layer stays unchanged — it still accepts all types. This is purely an MCP tool UX default so LLMs see distinct notes by default.
Fixes the issue where a user reported
recent_activityreturning only 1 distinct note even though 9 were modified.Test plan
just fast-check— lint, format, typecheck, impacted tests all passjust test-sqlite— full SQLite test suite (275 passed)just test-unit-sqlite— all 1915 unit tests passhas_morein search and memory router responses_format_project_outputpage/page_sizeparams🤖 Generated with Claude Code