fix(sync): preserve bmignore rclone filters#827
Conversation
|
Claude finished @phernandez's task in 2m 12s —— View job PR Review:
|
| Area | Status |
|---|---|
| Follows CLAUDE.md coding conventions | ✅ |
| Python 3.12+ type annotations | ✅ |
| 100-char line length maintained | ✅ |
| No commented-out code | ✅ |
| Unit/integration tests for new code | ✅ |
| Edge case coverage | ✅ (minor gap: negation patterns, pre-existing) |
| 100% test coverage maintained | ✅ |
| Documentation updated | ✅ |
| No DB/migration changes needed | ✅ |
| Local-first architecture preserved | ✅ |
| No hardcoded secrets | ✅ |
| Error handling preserved (except block) | ✅ |
| Windows path compatibility | ✅ (string ops, not Path, for rclone filter syntax) |
| No emojis/unicode in output | ✅ |
Verdict
Looks good to merge. The fix is correct, minimal, and well-tested. The one pre-existing gap (negation patterns) and the cosmetic nit (superfluous *.pyc/**) can be addressed in a follow-up if needed.
|
For the negation note: I am leaving that out of this PR. This patch fixes the two concrete #819 failures by preserving direct + recursive excludes. Gitignore-style negation needs a separate rclone include/filter-order design; simply skipping |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c0cc67faad
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
c0cc67f to
324dc6f
Compare
|
Claude finished @phernandez's task in 1m 30s —— View job PR Review:
|
| Input pattern | Branch | Output |
|---|---|---|
cache/ |
endswith("/") |
- cache/ + - cache/** |
node_modules/** |
removesuffix("/**") → node_modules |
- node_modules + - node_modules/** |
config.json, .*, .git |
default | - {pattern} + - {pattern}/** |
✅ The Trigger/Why/Outcome comments match CLAUDE.md literate style.
Regarding the Codex P2 review comment ("Do not strip the trailing /")
The Codex comment describes a concern that is not present in the actual code. The concern was that rstrip("/") would erase directory-only semantics from cache/. But _rclone_exclude_filters has an explicit endswith("/") guard (line 19) that returns early before any normalization, preserving - cache/ and - cache/** exactly. The removesuffix("/**") on line 25 is only reached for patterns that do not end in /. The Codex comment can be dismissed.
Minor cosmetic — superfluous /** entry for file-glob patterns
*.pyc generates ["- *.pyc", "- *.pyc/**"]. The - *.pyc/** line is valid rclone syntax but a no-op (no real directory is named foo.pyc). It doubles the line count of .bmignore.rclone for every *-containing file pattern. Harmless today, but worth revisiting if the filter file grows large.
Testing & Documentation
✅ test_convert_bmignore_to_rclone_filters_creates_and_converts — updated to assert both direct and recursive forms for plain (node_modules), glob (*.pyc), and dotfile (.git) patterns.
✅ test_convert_bmignore_to_rclone_filters_excludes_files_and_hidden_directory_contents — new test covers config.json (plain filename), .* (glob hidden), and node_modules/** (pre-suffixed). These are the exact cases from issue #819.
✅ test_convert_bmignore_to_rclone_filters_preserves_directory_only_patterns — new test verifies cache/ produces - cache/ and - cache/** but not - cache (correctly asserts the negative). Good edge-case discipline.
✅ The documentation's default pattern list now matches create_default_bmignore() in ignore_utils.py exactly, and the fix to the "Add custom patterns" example (*.tmp instead of *.tmp/**) corrects a misleading user-facing example.
Checklist Results
| Area | Status |
|---|---|
| Follows CLAUDE.md coding conventions | ✅ |
| Python 3.12+ type annotations | ✅ |
| 100-char line length maintained | ✅ |
| No commented-out code | ✅ |
| Unit/integration tests for new code | ✅ |
| Edge case coverage | ✅ (minor: negation patterns, explicitly deferred per comment) |
| 100% test coverage maintained | ✅ |
| Documentation updated | ✅ |
| No DB/migration changes needed | ✅ |
| Local-first architecture preserved | ✅ |
| No hardcoded secrets | ✅ |
| Windows path compatibility | ✅ (string ops, not Path, for rclone filter syntax) |
| No emojis/unicode in output | ✅ |
Verdict
Approved. The fix is correct, minimal, and well-tested. The Codex P2 concern doesn't apply to the actual code. The one pre-existing gap (negation patterns) and the cosmetic nit (*.pyc/**) can be addressed in separate follow-ups if needed.
Signed-off-by: phernandez <paul@basicmachines.co>
324dc6f to
5d93b72
Compare
|
Claude finished @phernandez's task in 2m 3s —— View job PR Review:
|
.bmignore input |
Branch | rclone output |
|---|---|---|
cache/ |
endswith("/") guard |
- cache/ + - cache/** |
node_modules/** |
removesuffix("/**") |
- node_modules + - node_modules/** |
config.json, .*, .git, *.pyc |
default | - {pattern} + - {pattern}/** |
✅ Trigger/Why/Outcome comment blocks match CLAUDE.md literate style.
✅ _rclone_exclude_filters is properly extracted as a helper — it has a clear single responsibility and is easy to unit-test indirectly through the integration tests.
Cosmetic nit — superfluous /** entry for wildcard file patterns
*.pyc → ["- *.pyc", "- *.pyc/**"]. The - *.pyc/** line is valid rclone syntax but a no-op in practice (no real directory is named foo.pyc). Harmless, but it doubles lines for every *-containing file pattern. Not a blocker.
Testing & Documentation
✅ test_convert_bmignore_to_rclone_filters_creates_and_converts — updated to assert both direct and recursive forms for plain (node_modules), glob (*.pyc), and dotfile (.git) patterns.
✅ test_convert_bmignore_to_rclone_filters_excludes_files_and_hidden_directory_contents — new test directly covers the #819 cases: config.json (plain filename → must exclude file, not just a phantom directory), .* (glob hidden files/dirs), node_modules/** (pre-suffixed → normalised to direct+recursive).
✅ test_convert_bmignore_to_rclone_filters_preserves_directory_only_patterns — correctly asserts the negative: "- cache" is NOT in the output (as a list element), confirming directory-only semantics are preserved for cache/. Good edge-case discipline.
✅ Docs default pattern list in cloud-cli.md matches create_default_bmignore() in ignore_utils.py exactly — verified line-by-line. No drift.
✅ Conversion explanation paragraph added (lines 649–652) — clearly describes direct+recursive behaviour for file patterns and directory-only semantics, with concrete examples.
✅ echo "*.tmp" (not "*.tmp/**") in the "Add custom patterns" example is now correct.
Architecture Considerations
✅ No changes to local-first architecture or the knowledge graph.
✅ No MCP tool changes — this is purely a cloud-sync filter concern.
✅ No DB or Alembic migration needed.
✅ .* in the default .bmignore already covers .git, .obsidian, etc. via glob — the explicit .git and .svn entries that follow are redundant but intentional for clarity and to ensure both file and directory exclusion regardless of pattern expansion order.
Security & Performance
✅ No hardcoded secrets or credentials.
✅ mtime-based caching logic (rclone_mtime >= bmignore_mtime) is preserved — no re-generation on every sync.
✅ The bare except Exception: fallback in convert_bmignore_to_rclone_filters is pre-existing and intentional: it ensures bisync can proceed even if the .bmignore file becomes unreadable. The fallback now correctly emits both - .git and - .git/** (fixed from - .git/** only).
Compatibility
✅ String operations used throughout _rclone_exclude_filters (not Path), keeping rclone filter syntax platform-independent (/ as separator is correct for rclone regardless of OS).
✅ No emojis or unicode in any output path.
CI Timeout Increases (.github/workflows/test.yml)
✅ SQLite unit: 30 → 45 min; Postgres unit: 30 → 60 min. Reasonable adjustments given Postgres uses testcontainers with Docker startup overhead.
Checklist Results
| Area | Status |
|---|---|
| Follows CLAUDE.md coding conventions | ✅ |
| Python 3.12+ type annotations | ✅ |
| 100-char line length maintained | ✅ |
| No commented-out code | ✅ |
| Unit/integration tests for new code | ✅ |
| Edge case coverage | ✅ (negation patterns pre-existing, explicitly deferred) |
| 100% test coverage maintained | ✅ |
| Documentation updated and matches code | ✅ |
| No DB/migration changes needed | ✅ |
| Local-first architecture preserved | ✅ |
| No hardcoded secrets | ✅ |
| Windows path compatibility | ✅ |
| No emojis/unicode in output | ✅ |
Verdict
Approved. The fix is correct, minimal, and well-tested. The Codex trailing-slash concern doesn't apply to the actual code. The one pre-existing gap (negation patterns, explicitly deferred by the author) and the cosmetic *.pyc/** nit can be addressed in follow-ups.
Summary
.bmignoreto rclone filter conversion so file patterns exclude both the direct match and recursive directory contents.cache/as rclone directory rules while still excluding their recursive contents.config.json,.*, patterns already written with/**, and trailing-slash directory patterns..bmignoreas gitignore-style input and document the generated rclone filters.Closes #819.
Issue #740 investigation
basic-memory --helpmeasured 4.660s,basic-memory --versionmeasured 0.871s,import basic_memory.cli.mainmeasured 3.449s, andimport basic_memory.cli.appmeasured 0.756s.Validation
uv run pytest -q tests/cli/cloud/test_rclone_config_and_bmignore_filters.pyjust fast-checkjust doctorruby -e 'require "yaml"; ARGV.each { |f| YAML.load_file(f); puts "#{f} ok" }' .github/workflows/*.ymlgit diff --check