-
Notifications
You must be signed in to change notification settings - Fork 156
feat: fast edit entities, refactors for webui, enhance search #532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: phernandez <paul@basicmachines.co>
Signed-off-by: phernandez <paul@basicmachines.co>
Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 4m 7s —— View job PR Review: fast-edit-entitiesComprehensive review of this significant performance and feature enhancement PR. Executive SummaryThis PR introduces fast entity operations and enhanced search capabilities with structured metadata filtering. The changes are substantial (+1784/-7053 lines across 78 files) and represent a major refactoring that: ✅ Strengths:
Detailed Review by Checklist✅ Code Quality & StandardsPython 3.12+ compliance:
Literate Programming:
Code conventions:
|
| Category | Status | Notes |
|---|---|---|
| Code Quality | ✅ Good | Well-structured, good comments |
| Architecture | ✅ Good | Follows established patterns |
| Performance | ✅ Excellent | Smart optimizations throughout |
| Test Coverage | ❌ Blocking | Missing metadata filter tests |
| Documentation | Migration guide needed | |
| Security | ✅ Good | Proper validation and SQL safety |
| Windows Compat | ✅ Good | Proper path handling |
| CI Status | ❌ Failing | Must fix before merge |
Recommendation: Do not merge until:
- Test coverage for metadata filtering is added
- CI passes successfully
- V1 API deprecation is documented (if intentional)
This is high-quality work with excellent architectural decisions. The main blockers are test coverage and CI failures, which should be straightforward to resolve.
Review completed. The PR shows strong engineering with smart performance optimizations and well-designed database migrations. Address the test coverage and CI issues, and this will be ready to merge.
| # Normalize paths for comparison (resolve symlinks, etc.) | ||
| from pathlib import Path | ||
|
|
||
| requested_path = Path(project_data.path).resolve() |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
user-provided value
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 2 days ago
In general, to fix uncontrolled-path issues, normalize paths relative to a known-safe base directory and verify that the resulting path stays within that base. If the path is used only for comparison or metadata, avoid hitting the real filesystem with arbitrary user input (for example, by not following symlinks or by using a dedicated normalization helper).
In this specific case, the problematic operation is requested_path = Path(project_data.path).resolve(). We can avoid resolving arbitrary filesystem paths here by reusing the existing normalize_project_path utility (already imported at line 36), which presumably encapsulates the project’s path normalization/sanitization behavior. We will:
- Replace the direct
.resolve()call on user input with a call tonormalize_project_path(project_data.path)to produce a normalized string. - Apply the same normalization to
existing_project.pathso that the comparison remains semantically equivalent (still checking for “same effective path”). - Compare these normalized strings (or convert them to
Pathpurely for string comparison withoutresolve()) instead of resolving untrusted paths against the filesystem.
All changes are confined to src/basic_memory/api/v2/routers/project_router.py, in the add_project function around lines 88–92. No new imports are needed since normalize_project_path is already imported.
-
Copy modified lines R88-R90 -
Copy modified line R94
| @@ -85,13 +85,13 @@ | ||
| if existing_project: | ||
| # Project exists - check if paths match for true idempotency | ||
| # Normalize paths for comparison (resolve symlinks, etc.) | ||
| from pathlib import Path | ||
| # Normalize paths for comparison without resolving arbitrary filesystem locations | ||
| requested_path = normalize_project_path(project_data.path) | ||
| existing_path = normalize_project_path(existing_project.path) | ||
|
|
||
| requested_path = Path(project_data.path).resolve() | ||
| existing_path = Path(existing_project.path).resolve() | ||
|
|
||
| if requested_path == existing_path: | ||
| # Same name, same path - return 200 OK (idempotent) | ||
| # Same name, same path - return 200 OK (idempotent) | ||
| return ProjectStatusResponse( # pyright: ignore [reportCallIssue] | ||
| message=f"Project '{project_data.name}' already exists", | ||
| status="success", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4d64aa09b7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| create_data = { | ||
| **update_data, | ||
| "external_id": external_id, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid inserting NULL external_id in fast create
When fast_write_entity is used from the v2 create endpoint (default fast=True), external_id is not provided, but create_data still includes "external_id": external_id. In SQLAlchemy this explicitly inserts NULL and bypasses the model’s UUID default, so new entities can end up with a null external_id. That breaks v2’s contract (clients can’t address the entity by external_id and multiple NULLs are allowed), and can cause follow‑up GET/PUT/resolve to fail. Omit the key when external_id is None or generate a UUID here.
Useful? React with 👍 / 👎.
| for j, val in enumerate(filt.value): | ||
| tag_param = f"{value_param}_{j}" | ||
| params[tag_param] = _json.dumps([val]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix undefined variable in Postgres metadata 'contains' filters
In the contains branch for metadata filters, value_param is referenced but never defined in this scope. Any metadata_filters list (e.g., tags) will raise an UnboundLocalError before executing the SQL, returning a 500 on Postgres searches. You likely intended to base these parameter names on idx (similar to other branches).
Useful? React with 👍 / 👎.
Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 3m 28s —— View job PR Review: fast-edit-entitiesComprehensive review of this PR against the Basic Memory team checklist. Executive SummaryThis PR introduces major performance improvements (fast entity operations), recent activity API enhancements, and structured metadata search. The changes are substantial (+1851/-7122 lines across 85 files) and represent a significant refactoring. Status: ❌ Not ready to merge - Critical issues must be addressed first. Critical Issues (Blockers) 🚨1. Missing Test Coverage (Violates 100% Coverage Requirement)The project mandates 100% test coverage, but this PR adds 131 lines of new parsing logic in Missing test coverage:
Required tests: # tests/repository/test_metadata_filters.py (NEW FILE NEEDED)
- test_parse_simple_equality()
- test_parse_in_operator()
- test_parse_comparison_operators() # $gt, $gte, $lt, $lte
- test_parse_between_operator()
- test_parse_array_contains()
- test_invalid_filter_keys()
- test_numeric_vs_text_detection()
- test_build_sqlite_json_path()
- test_build_postgres_json_path()
- test_normalize_scalar_dates()
- test_normalize_scalar_booleans()
# tests/repository/test_sqlite_search_repository.py (ADD TESTS)
- test_search_with_metadata_filters()
- test_search_metadata_tags()
- test_search_metadata_status()
- test_search_metadata_numeric_comparison()
- test_search_metadata_between()
- test_search_metadata_custom_fields()
# tests/repository/test_postgres_search_repository.py (ADD TESTS)
- Same tests as SQLite repositoryFix this → Add comprehensive test coverage for metadata filtering 2. Bug: Undefined Variable in Postgres Search 🐛File: The # Line 321-340
if filt.op == "contains":
import json as _json
base_param = f"meta_val_{idx}"
tag_conditions = []
# Require all values to be present
for j, val in enumerate(filt.value):
tag_param = f"{base_param}_{j}"
params[tag_param] = _json.dumps([val])
like_param = f"{base_param}_{j}_like"
params[like_param] = f'%"{val}"%'
like_param_single = f"{base_param}_{j}_like_single"
params[like_param_single] = f"%'{val}'%"
tag_conditions.append(
f"({json_expr} @> :{tag_param}::jsonb "
f"OR {text_expr} LIKE :{like_param} "
f"OR {text_expr} LIKE :{like_param_single})"
)
conditions.append(" AND ".join(tag_conditions))
continueImpact: Any metadata search with list values (e.g., Root Cause: The code builds parameter names correctly ( Action: Verify this code path works correctly with integration tests (which are currently missing). Fix this → Add integration tests for metadata contains operator 3. Bug: NULL external_id in Fast Create 🐛File: When create_data = {
**update_data,
"external_id": external_id, # Can be None!
}
return await self.repository.create(create_data)Impact: This bypasses SQLAlchemy's UUID default generator and creates entities with NULL external_id, breaking v2 API contract where clients expect to address entities by external_id. Fix: Only include create_data = {**update_data}
if external_id is not None:
create_data["external_id"] = external_id
return await self.repository.create(create_data)Fix this → Fix NULL external_id issue in fast_write_entity 4. Security: Path Traversal Risk
|
| Category | Status | Notes |
|---|---|---|
| Code Quality | ✅ Good | Well-structured, excellent comments |
| Architecture | ✅ Excellent | Smart optimizations, good patterns |
| Performance | ✅ Excellent | Background tasks, DB indexes |
| Test Coverage | ❌ Blocking | Missing metadata filter tests |
| Bugs | ❌ Blocking | NULL external_id, possible Postgres issue |
| Security | Path traversal risk | |
| Documentation | User docs needed | |
| Windows Compat | ✅ Good | Proper path handling |
| DB Compat | ✅ Excellent | Both SQLite and Postgres |
Final Assessment
Recommendation: ❌ Do not merge until:
- Test coverage for metadata filtering is added (100% requirement)
- NULL external_id bug is fixed
- Path traversal security issue is addressed
- Postgres contains operator is verified/tested
This is high-quality engineering work with excellent architectural decisions and performance optimizations. The database migration is particularly well-designed with proper idempotency and database-specific optimizations.
The main blockers are:
- Missing test coverage (violates 100% coverage requirement)
- Critical bugs (NULL external_id, potential Postgres issue)
- Security concern (path traversal)
Once these issues are resolved, this will be a significant improvement to Basic Memory's performance and search capabilities.
Summary
Details
Fast edit / indexing
Recent activity
Structured frontmatter search
entity_metadata(no stringification)SearchQuerysupportsmetadata_filters,tags, andstatussearch_by_metadatasearch_notessupports structured filters and convenience tags/status--tag,--status,--type,--meta,--filterMigration / Notes
d7e8f9a0b1c2_add_structured_metadata_indexesentity_metadataTesting