fix(dynamodb): BETWEEN-aware + whitespace-tolerant AND/OR tokenizer #670

Merged: vieiralucas merged 1 commit into main from fix/dynamodb-tokenizer-between (Apr 22, 2026)
@vieiralucas (Member) commented Apr 22, 2026

Summary

  • The DynamoDB expression splitter used a literal " AND " / " OR " match. That broke for BETWEEN :lo AND :hi (the inner AND got treated as a top-level separator) and for any whitespace other than a single space around the keyword (tabs, newlines, multiple spaces).
  • Rewrite split_on_top_level_keyword around a keyword-boundary helper (match_keyword). Alphanumeric keywords (AND/OR/BETWEEN) now require ASCII-whitespace word boundaries; punctuation keywords (,) still match literally.
  • Track a top-level BETWEEN skip counter so x BETWEEN :lo AND :hi no longer has its inner AND consumed as a top-level separator.
  • Unignores 3 tests that were parked for this bug:
    • key_condition_between
    • key_condition_whitespace_variations
    • filter_between
  • Adds direct unit tests for split_on_top_level_keyword covering BETWEEN skip, whitespace variants, case-insensitive keyword matching, and identifier substrings (e.g. land must not match AND).
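
The failure mode described in the first bullet can be reproduced with a plain literal split. The sketch below is only an illustration: `naive_split` is a hypothetical stand-in for the old behaviour, not the code that was removed in this PR.

```rust
/// Hypothetical stand-in for the old splitter (an assumption, not the
/// project's real code): split on a literal separator such as " AND ".
fn naive_split<'a>(expr: &'a str, sep: &str) -> Vec<&'a str> {
    // `str::split` only matches the exact separator text, so it knows
    // nothing about BETWEEN and nothing about tabs or newlines.
    expr.split(sep).map(str::trim).collect()
}
```

With this stand-in, `naive_split("pk = :pk AND sk BETWEEN :lo AND :hi", " AND ")` yields three pieces (`pk = :pk`, `sk BETWEEN :lo`, `:hi`) because the AND that belongs to BETWEEN is split as well, while `naive_split("pk = :pk\tAND\tsk = :sk", " AND ")` yields the whole input as a single piece because the tab-delimited AND never matches.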

Test plan

  • cargo test -p fakecloud-dynamodb --lib (213 pass; 3 remain ignored for follow-up PRs)
  • cargo test -p fakecloud-e2e --test dynamodb (all 46 pass)
  • cargo clippy -p fakecloud-dynamodb --all-targets -- -D warnings
  • cargo fmt --all

Summary by cubic

Fixes DynamoDB expression splitting so top-level AND/OR no longer break on BETWEEN’s inner AND and tolerate tabs/newlines/multi-space. Re-enables impacted tests and adds targeted unit tests for the tokenizer.

  • Bug Fixes
    • Make AND/OR splitting BETWEEN-aware so x BETWEEN :lo AND :hi doesn’t split on the inner AND.
    • Support ASCII whitespace around keywords (tabs, newlines, multiple spaces) and case-insensitive matching.
    • Enforce word boundaries to avoid matching inside identifiers (e.g., land), and respect parentheses.
    • Add match_keyword helper and refactor split_on_top_level_keyword.
    • Unignore key_condition_between, key_condition_whitespace_variations, filter_between; add new tokenizer unit tests.

Written for commit 6fbf67a. Summary will update on new commits.

- Rewrite split_on_top_level_keyword around a keyword-boundary helper
  (match_keyword). Alphanumeric keywords (AND, OR, BETWEEN) require
  ASCII-whitespace word boundaries so ':s\tAND\t:o' and ':s\nAND\n:o'
  split like ':s AND :o'. Punctuation keywords (',') keep matching
  literally.
- Track a top-level BETWEEN skip counter so 'x BETWEEN :lo AND :hi'
  no longer has its inner AND mistaken for a top-level separator.
- Unignore 3 tests: key_condition_between,
  key_condition_whitespace_variations, filter_between.
- Add unit tests for split_on_top_level_keyword covering BETWEEN skip,
  whitespace variants, case-insensitivity, and identifier substrings.
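
The approach in this commit message could look roughly like the sketch below. Only the names `match_keyword` and `split_on_top_level_keyword` come from the PR; the signatures, the parenthesis-depth tracking details, and the choice to treat the start or end of the string as a valid boundary are assumptions made for illustration, and the merged code may differ.

```rust
/// Sketch (assumed signature): does `kw` start at byte offset `i` of `s`?
/// Matching is ASCII-case-insensitive. Alphanumeric keywords also need
/// ASCII whitespace, or the string edge (an assumption), on both sides.
fn match_keyword(s: &str, i: usize, kw: &str) -> bool {
    let b = s.as_bytes();
    let end = i + kw.len();
    if end > b.len() || !b[i..end].eq_ignore_ascii_case(kw.as_bytes()) {
        return false;
    }
    if !kw.bytes().all(|c| c.is_ascii_alphanumeric()) {
        return true; // punctuation such as "," matches literally
    }
    let before = i == 0 || b[i - 1].is_ascii_whitespace();
    let after = end == b.len() || b[end].is_ascii_whitespace();
    before && after
}

/// Sketch (assumed signature): split `expr` on `kw` outside parentheses,
/// skipping the AND that closes a top-level `BETWEEN lo AND hi`.
fn split_on_top_level_keyword<'a>(expr: &'a str, kw: &str) -> Vec<&'a str> {
    let bytes = expr.as_bytes();
    let mut parts = Vec::new();
    let (mut start, mut i) = (0usize, 0usize);
    let (mut depth, mut pending_between) = (0usize, 0usize);
    let splitting_on_and = kw.eq_ignore_ascii_case("AND");
    while i < bytes.len() {
        match bytes[i] {
            b'(' => depth += 1,
            b')' => depth = depth.saturating_sub(1),
            _ if depth == 0 => {
                if match_keyword(expr, i, "BETWEEN") {
                    pending_between += 1; // the next top-level AND is not a separator
                    i += "BETWEEN".len();
                    continue;
                }
                if match_keyword(expr, i, kw) {
                    if splitting_on_and && pending_between > 0 {
                        pending_between -= 1; // this AND belongs to a BETWEEN
                    } else {
                        parts.push(expr[start..i].trim());
                        start = i + kw.len();
                    }
                    i += kw.len();
                    continue;
                }
            }
            _ => {}
        }
        i += 1;
    }
    parts.push(expr[start..].trim());
    parts
}
```

Under these assumptions, splitting `pk = :pk AND sk BETWEEN :lo AND :hi` on `AND` gives two parts (`pk = :pk` and `sk BETWEEN :lo AND :hi`), `:s\tAND\t:o` splits the same way as `:s AND :o`, and an identifier such as `land` is left alone because the embedded `and` has no whitespace before it.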

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 98.87640% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/fakecloud-dynamodb/src/service/mod.rs | 98.87% | 1 Missing ⚠️ |


@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 2 files

@vieiralucas vieiralucas merged commit 5befb6a into main Apr 22, 2026
48 checks passed
@vieiralucas vieiralucas deleted the fix/dynamodb-tokenizer-between branch April 22, 2026 16:17