
fix(sync): preserve bmignore rclone filters#827

Merged
phernandez merged 1 commit into main from codex/issue-819-740-bug-investigation
May 15, 2026

Conversation

@phernandez
Member

@phernandez phernandez commented May 15, 2026

Summary

  • Fix .bmignore to rclone filter conversion so file patterns exclude both the direct match and recursive directory contents.
  • Preserve directory-only patterns such as cache/ as rclone directory rules while still excluding their recursive contents.
  • Add regression coverage for config.json, .*, patterns already written with /**, and trailing-slash directory patterns.
  • Update the cloud CLI docs to show .bmignore as gitignore-style input and document the generated rclone filters.
  • Give the unit-test CI jobs enough timeout headroom after the dependency update made the Postgres unit matrix exceed the old 30-minute cap.

Closes #819.

Issue #740 investigation

  • Reproduced the current slow root help path on this branch: basic-memory --help measured 4.660s, basic-memory --version measured 0.871s, import basic_memory.cli.main measured 3.449s, and import basic_memory.cli.app measured 0.756s.
  • The import timing still points at the broader eager CLI command graph/API/MCP/embedding import path, so I did not mix that multi-file lazy-import refactor into this sync-filter PR.
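A rough way to reproduce the per-module import numbers above, as a sketch rather than the method actually used for this investigation. The helper name `time_import` is hypothetical, and `json` stands in for `basic_memory.cli.main` so the snippet runs anywhere. The interpreter's own `python -X importtime` flag gives a finer per-module breakdown.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return wall-clock seconds spent importing module_name in this process."""
    # Drop the cached module so the import statement does real work again.
    # Submodules that stay cached make this a lower bound, not a cold start.
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in module; swap in basic_memory.cli.main or basic_memory.cli.app
    # in an environment where the package is installed.
    for name in ("json",):
        print(f"import {name}: {time_import(name):.3f}s")
```

For a true cold-start figure, each measurement would need a fresh interpreter process.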

Validation

  • uv run pytest -q tests/cli/cloud/test_rclone_config_and_bmignore_filters.py
  • just fast-check
  • just doctor
  • ruby -e 'require "yaml"; ARGV.each { |f| YAML.load_file(f); puts "#{f} ok" }' .github/workflows/*.yml
  • git diff --check

@github-actions
Contributor

github-actions Bot commented May 15, 2026

Claude finished @phernandez's task in 2m 12s —— View job


PR Review: fix(sync): preserve bmignore rclone filters

  • Gather context (read changed files)
  • Review code quality & standards
  • Review testing & documentation
  • Review architecture considerations
  • Review security & performance
  • Review compatibility

Summary

This is a clean, well-targeted fix. The core logic change is small and correct, tests cover the specific cases raised in issue #819, and the docs now accurately reflect actual .bmignore syntax. Approved with a few minor notes.


Code Quality & Standards

bisync_commands.py: _rclone_exclude_filters (lines 17–26)

✅ The fix is correct. The old logic had two separate branches (wildcard → simple exclude, no-wildcard → /** only), which meant:

  • config.json (no *) became - config.json/** — excluded a hypothetical directory named config.json, not the file
  • .* (has *) became - .* — excluded hidden files but not hidden directories' recursive contents

The new approach of always emitting both - {pattern} and - {pattern}/** handles all cases correctly.

✅ The removesuffix("/**").rstrip("/") normalization correctly handles patterns already written as node_modules/** or dist/ so they don't double up.
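The difference can be sketched as follows. This is a reconstruction from the description in this review, not the code in bisync_commands.py; the names `old_filters` and `first_revision_filters` are hypothetical.

```python
def old_filters(pattern: str) -> list[str]:
    """Pre-fix behavior as described above (reconstructed, not the real code)."""
    if "*" in pattern:
        # Wildcard patterns were passed through as a single exclude.
        return [f"- {pattern}"]
    # Everything else only got the recursive form.
    return [f"- {pattern}/**"]


def first_revision_filters(pattern: str) -> list[str]:
    """Reviewed revision: normalize, then always emit direct + recursive."""
    base = pattern.removesuffix("/**").rstrip("/")
    return [f"- {base}", f"- {base}/**"]


if __name__ == "__main__":
    for pattern in ("config.json", ".*", "node_modules/**", "dist/"):
        print(pattern, old_filters(pattern), first_revision_filters(pattern))
```

Under the old logic `config.json` never gets a direct exclude and `.*` never gets a recursive one, which matches the two failures listed above. Note that this normalization also turns `dist/` into `dist`, dropping the trailing slash.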

✅ The Trigger/Why/Outcome comment format matches the CLAUDE.md literate programming style.

Minor observation — wildcard file patterns generate a superfluous /** entry:

# Pattern: *.pyc
# Generates: ["- *.pyc", "- *.pyc/**"]

- *.pyc/** means "exclude contents of directories matching *.pyc". This is effectively a no-op for file patterns (no real directory is named foo.pyc), and it's harmless. But it doubles the line count of .bmignore.rclone for every wildcard file pattern. Not a blocker, just worth noting — if the filter file ever grows large this could be revisited.

Missing edge case — gitignore negation patterns (!pattern)

The .bmignore reader passes non-comment, non-empty lines through to _rclone_exclude_filters without checking for !. A .bmignore line like !important.db would generate - !important.db and - !important.db/**, which is invalid rclone syntax. This is pre-existing behavior (the old code had the same gap), so it's not introduced by this PR, but it's worth a follow-up to either document "negation patterns are not supported" or add a guard:

# Skip negation patterns (rclone doesn't support gitignore-style negation)
if line.startswith("!"):
    continue



Testing & Documentation

test_convert_bmignore_to_rclone_filters_creates_and_converts — updated to assert both direct and recursive forms for all pattern types.

test_convert_bmignore_to_rclone_filters_excludes_files_and_hidden_directory_contents — new test specifically covers config.json (plain file), .* (glob matching hidden items), and node_modules/** (pre-suffixed pattern). These are exactly the cases from the issue.

_rclone_exclude_filters is fully exercised through the integration tests; no separate unit test needed for a two-line helper.

✅ The documentation default pattern list in cloud-cli.md now matches create_default_bmignore() in ignore_utils.py exactly — good alignment between docs and code.

✅ The docs correctly show .bmignore using gitignore-style syntax (no /**) and explains the conversion. The fix to the "Add custom patterns" example (*.tmp instead of *.tmp/**) is a nice touch.


Checklist Results

Area Status
Follows CLAUDE.md coding conventions
Python 3.12+ type annotations
100-char line length maintained
No commented-out code
Unit/integration tests for new code
Edge case coverage ✅ (minor gap: negation patterns, pre-existing)
100% test coverage maintained
Documentation updated
No DB/migration changes needed
Local-first architecture preserved
No hardcoded secrets
Error handling preserved (except block)
Windows path compatibility ✅ (string ops, not Path, for rclone filter syntax)
No emojis/unicode in output

Verdict

Looks good to merge. The fix is correct, minimal, and well-tested. The one pre-existing gap (negation patterns) and the cosmetic nit (superfluous *.pyc/**) can be addressed in a follow-up if needed.

@phernandez
Member Author

For the negation note: I am leaving that out of this PR. This patch fixes the two concrete #819 failures by preserving direct + recursive excludes. Gitignore-style negation needs a separate rclone include/filter-order design; simply skipping !pattern would avoid a literal pattern but would not restore gitignore unignore semantics after an earlier exclude. I recorded #740 separately and kept this PR scoped to the sync-filter bug.
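To illustrate why skipping is not enough, here is a toy model of the two evaluation orders. It assumes rclone applies filter rules in order with the first match deciding, while gitignore lets the last matching line decide. `fnmatch` is only a stand-in for both tools' real glob matching, and both function names are hypothetical.

```python
from fnmatch import fnmatch


def gitignore_ignored(lines: list[str], name: str) -> bool:
    """Last matching line wins; a leading '!' un-ignores."""
    ignored = False
    for line in lines:
        negated = line.startswith("!")
        pattern = line[1:] if negated else line
        if fnmatch(name, pattern):
            ignored = not negated
    return ignored


def rclone_excluded(rules: list[tuple[str, str]], name: str) -> bool:
    """First matching rule wins; '+' includes, '-' excludes."""
    for sign, pattern in rules:
        if fnmatch(name, pattern):
            return sign == "-"
    return False  # no rule matched, so the path is included


if __name__ == "__main__":
    bmignore = ["*.db", "!important.db"]
    skipped = [("-", "*.db")]                              # drop the negation
    same_order = [("-", "*.db"), ("+", "important.db")]    # naive translation
    include_first = [("+", "important.db"), ("-", "*.db")]

    print(gitignore_ignored(bmignore, "important.db"))       # False
    print(rclone_excluded(skipped, "important.db"))          # True
    print(rclone_excluded(same_order, "important.db"))       # True
    print(rclone_excluded(include_first, "important.db"))    # False
```

Only the variant that moves the include ahead of the exclude keeps `important.db`. That is the ordering question a follow-up would have to settle, and this toy model does not claim reordering alone is a complete translation.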


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c0cc67faad


Comment thread src/basic_memory/cli/commands/cloud/bisync_commands.py Outdated
@phernandez phernandez force-pushed the codex/issue-819-740-bug-investigation branch from c0cc67f to 324dc6f Compare May 15, 2026 21:23
@github-actions
Contributor

github-actions Bot commented May 15, 2026

Claude finished @phernandez's task in 1m 30s —— View job


PR Review: fix(sync): preserve bmignore rclone filters

  • Gather context
  • Read changed files
  • Review code quality & standards
  • Review testing & documentation
  • Review architecture considerations
  • Review security & performance
  • Review compatibility

Summary

Clean, well-scoped fix. The core logic change is small and correct, all three pattern classes are now covered by tests, and the docs accurately reflect actual .bmignore syntax. One point below addresses the Codex review comment, which describes behaviour that isn't in the actual code.


Code Quality & Standards

bisync_commands.py: _rclone_exclude_filters (lines 17–32)

✅ The three-branch design is correct:

| Input pattern | Branch | Output |
| --- | --- | --- |
| `cache/` | `endswith("/")` | `- cache/` + `- cache/**` |
| `node_modules/**` | `removesuffix("/**")` → `node_modules` | `- node_modules` + `- node_modules/**` |
| `config.json`, `.*`, `.git` | default | `- {pattern}` + `- {pattern}/**` |

✅ The Trigger/Why/Outcome comments match CLAUDE.md literate style.

Regarding the Codex P2 review comment ("Do not strip the trailing /")

The Codex comment describes a concern that is not present in the actual code. The concern was that rstrip("/") would erase directory-only semantics from cache/. But _rclone_exclude_filters has an explicit endswith("/") guard (line 19) that returns early before any normalization, preserving - cache/ and - cache/** exactly. The removesuffix("/**") on line 25 is only reached for patterns that do not end in /. The Codex comment can be dismissed.
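A minimal sketch of that ordering, written from the description in this review rather than copied from the source file. The function name below is hypothetical and the real helper may differ in detail.

```python
def rclone_exclude_filters(pattern: str) -> list[str]:
    """Convert one .bmignore pattern into rclone exclude rules (sketch)."""
    # Directory-only pattern: return before any normalization so the
    # trailing slash, and with it the directory-only meaning, survives.
    if pattern.endswith("/"):
        return [f"- {pattern}", f"- {pattern}**"]

    # Patterns already written as "name/**" are reduced to "name" so the
    # recursive suffix is not appended twice.
    base = pattern.removesuffix("/**")
    return [f"- {base}", f"- {base}/**"]


if __name__ == "__main__":
    for pattern in ("cache/", "node_modules/**", "config.json", ".*"):
        print(pattern, "->", rclone_exclude_filters(pattern))
```

Because the guard returns first, `cache/` yields `- cache/` and `- cache/**` and never a bare `- cache`.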

Minor cosmetic — superfluous /** entry for file-glob patterns

*.pyc generates ["- *.pyc", "- *.pyc/**"]. The - *.pyc/** line is valid rclone syntax but a no-op (no real directory is named foo.pyc). It doubles the line count of .bmignore.rclone for every *-containing file pattern. Harmless today, but worth revisiting if the filter file grows large.



Testing & Documentation

test_convert_bmignore_to_rclone_filters_creates_and_converts — updated to assert both direct and recursive forms for plain (node_modules), glob (*.pyc), and dotfile (.git) patterns.

test_convert_bmignore_to_rclone_filters_excludes_files_and_hidden_directory_contents — new test covers config.json (plain filename), .* (glob hidden), and node_modules/** (pre-suffixed). These are the exact cases from issue #819.

test_convert_bmignore_to_rclone_filters_preserves_directory_only_patterns — new test verifies cache/ produces - cache/ and - cache/** but not - cache (correctly asserts the negative). Good edge-case discipline.

✅ The documentation's default pattern list now matches create_default_bmignore() in ignore_utils.py exactly, and the fix to the "Add custom patterns" example (*.tmp instead of *.tmp/**) corrects a misleading user-facing example.


Checklist Results

Area Status
Follows CLAUDE.md coding conventions
Python 3.12+ type annotations
100-char line length maintained
No commented-out code
Unit/integration tests for new code
Edge case coverage ✅ (minor: negation patterns, explicitly deferred per comment)
100% test coverage maintained
Documentation updated
No DB/migration changes needed
Local-first architecture preserved
No hardcoded secrets
Windows path compatibility ✅ (string ops, not Path, for rclone filter syntax)
No emojis/unicode in output

Verdict

Approved. The fix is correct, minimal, and well-tested. The Codex P2 concern doesn't apply to the actual code. The one pre-existing gap (negation patterns) and the cosmetic nit (*.pyc/**) can be addressed in separate follow-ups if needed.

Signed-off-by: phernandez <paul@basicmachines.co>
@phernandez phernandez force-pushed the codex/issue-819-740-bug-investigation branch from 324dc6f to 5d93b72 Compare May 15, 2026 21:57
@github-actions
Contributor

github-actions Bot commented May 15, 2026

Claude finished @phernandez's task in 2m 3s —— View job


PR Review: fix(sync): preserve bmignore rclone filters

  • Gather context (read changed files)
  • Review code quality & standards
  • Review testing & documentation
  • Review architecture considerations
  • Review security & performance
  • Review compatibility

Summary

Clean, well-targeted fix. The core logic change in _rclone_exclude_filters is correct and minimal, the three new/updated tests precisely cover the issue #819 failures, and the docs now exactly match create_default_bmignore() in ignore_utils.py. The Codex P2 concern about trailing-slash semantics was correctly addressed — the endswith("/") guard fires before any normalization, so cache/ is never stripped.


Code Quality & Standards

bisync_commands.py: _rclone_exclude_filters (lines 17–32)

✅ The three-branch design handles all pattern classes correctly:

| .bmignore input | Branch | rclone output |
| --- | --- | --- |
| `cache/` | `endswith("/")` guard | `- cache/` + `- cache/**` |
| `node_modules/**` | `removesuffix("/**")` | `- node_modules` + `- node_modules/**` |
| `config.json`, `.*`, `.git`, `*.pyc` | default | `- {pattern}` + `- {pattern}/**` |

✅ Trigger/Why/Outcome comment blocks match CLAUDE.md literate style.

_rclone_exclude_filters is properly extracted as a helper — it has a clear single responsibility and is easy to unit-test indirectly through the integration tests.

Cosmetic nit — superfluous /** entry for wildcard file patterns

*.pyc → ["- *.pyc", "- *.pyc/**"]. The - *.pyc/** line is valid rclone syntax but a no-op in practice (no real directory is named foo.pyc). Harmless, but it doubles lines for every *-containing file pattern. Not a blocker.



Testing & Documentation

test_convert_bmignore_to_rclone_filters_creates_and_converts — updated to assert both direct and recursive forms for plain (node_modules), glob (*.pyc), and dotfile (.git) patterns.

test_convert_bmignore_to_rclone_filters_excludes_files_and_hidden_directory_contents — new test directly covers the #819 cases: config.json (plain filename → must exclude file, not just a phantom directory), .* (glob hidden files/dirs), node_modules/** (pre-suffixed → normalised to direct+recursive).

test_convert_bmignore_to_rclone_filters_preserves_directory_only_patterns — correctly asserts the negative: "- cache" is NOT in the output (as a list element), confirming directory-only semantics are preserved for cache/. Good edge-case discipline.

✅ Docs default pattern list in cloud-cli.md matches create_default_bmignore() in ignore_utils.py exactly — verified line-by-line. No drift.

✅ Conversion explanation paragraph added (lines 649–652) — clearly describes direct+recursive behaviour for file patterns and directory-only semantics, with concrete examples.

echo "*.tmp" (not "*.tmp/**") in the "Add custom patterns" example is now correct.


Architecture Considerations

✅ No changes to local-first architecture or the knowledge graph.

✅ No MCP tool changes — this is purely a cloud-sync filter concern.

✅ No DB or Alembic migration needed.

.* in the default .bmignore already covers .git, .obsidian, etc. via glob — the explicit .git and .svn entries that follow are redundant but intentional for clarity and to ensure both file and directory exclusion regardless of pattern expansion order.


Security & Performance

✅ No hardcoded secrets or credentials.

✅ mtime-based caching logic (rclone_mtime >= bmignore_mtime) is preserved — no re-generation on every sync.
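The freshness check amounts to something like the sketch below. Only the `rclone_mtime >= bmignore_mtime` comparison comes from the code under review; the function name, and treating a missing filter file as stale, are assumptions.

```python
import os
import tempfile
from pathlib import Path


def filters_are_fresh(bmignore: Path, rclone_filter: Path) -> bool:
    """True when the generated filter file is at least as new as .bmignore."""
    if not rclone_filter.exists():
        # Assumption: a missing filter file always triggers regeneration.
        return False
    rclone_mtime = rclone_filter.stat().st_mtime
    bmignore_mtime = bmignore.stat().st_mtime
    return rclone_mtime >= bmignore_mtime


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / ".bmignore"
        dst = Path(tmp) / ".bmignore.rclone"
        src.write_text("*.pyc\n", encoding="utf-8")
        dst.write_text("- *.pyc\n- *.pyc/**\n", encoding="utf-8")
        # Pin mtimes explicitly so the demo does not depend on clock resolution.
        os.utime(src, (100, 100))
        os.utime(dst, (200, 200))
        print(filters_are_fresh(src, dst))  # filter newer than source
        os.utime(src, (300, 300))
        print(filters_are_fresh(src, dst))  # source edited afterwards
```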

✅ The bare except Exception: fallback in convert_bmignore_to_rclone_filters is pre-existing and intentional: it ensures bisync can proceed even if the .bmignore file becomes unreadable. The fallback now correctly emits both - .git and - .git/** (fixed from - .git/** only).
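The shape of that fallback, sketched under assumptions: `filters_or_fallback` and its `convert` parameter are hypothetical names, and only the two fallback rules are taken from this review.

```python
from collections.abc import Callable
from pathlib import Path

# Per the review, the fallback now carries both the direct and recursive rule.
FALLBACK_FILTERS = ["- .git", "- .git/**"]


def filters_or_fallback(
    bmignore: Path, convert: Callable[[str], list[str]]
) -> list[str]:
    """Convert .bmignore lines, or return a minimal rule set if that fails."""
    try:
        lines = bmignore.read_text(encoding="utf-8").splitlines()
        stripped = (line.strip() for line in lines)
        # Blank lines and comments carry no pattern.
        patterns = [s for s in stripped if s and not s.startswith("#")]
        return [rule for pattern in patterns for rule in convert(pattern)]
    except Exception:
        # Deliberately broad: sync should proceed even with an unreadable file.
        return list(FALLBACK_FILTERS)


if __name__ == "__main__":
    missing = Path("does-not-exist") / ".bmignore"
    print(filters_or_fallback(missing, lambda p: [f"- {p}", f"- {p}/**"]))
```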


Compatibility

✅ String operations used throughout _rclone_exclude_filters (not Path), keeping rclone filter syntax platform-independent (/ as separator is correct for rclone regardless of OS).

✅ No emojis or unicode in any output path.


CI Timeout Increases (.github/workflows/test.yml)

✅ SQLite unit: 30 → 45 min; Postgres unit: 30 → 60 min. Reasonable adjustments given Postgres uses testcontainers with Docker startup overhead.


Checklist Results

Area Status
Follows CLAUDE.md coding conventions
Python 3.12+ type annotations
100-char line length maintained
No commented-out code
Unit/integration tests for new code
Edge case coverage ✅ (negation patterns pre-existing, explicitly deferred)
100% test coverage maintained
Documentation updated and matches code
No DB/migration changes needed
Local-first architecture preserved
No hardcoded secrets
Windows path compatibility
No emojis/unicode in output

Verdict

Approved. The fix is correct, minimal, and well-tested. The Codex trailing-slash concern doesn't apply to the actual code. The one pre-existing gap (negation patterns, explicitly deferred by the author) and the cosmetic *.pyc/** nit can be addressed in follow-ups.

@phernandez phernandez merged commit 4d22c39 into main May 15, 2026
26 checks passed
@phernandez phernandez deleted the codex/issue-819-740-bug-investigation branch May 15, 2026 22:48

Development

Successfully merging this pull request may close these issues.

[BUG] bmignore weirdness

1 participant