feat: eng-71 recursive glob#2089
Conversation
…r brace-expansion 5.0.6
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughEnable recursive ChangesBucket Path Glob Expansion with Recursive Pattern Support
Sequence Diagram(s)sequenceDiagram
participant CLI
participant GlobSync
participant Mapper
participant SegmentBuilder
CLI->>GlobSync: call glob.sync(pattern, ignore/follow tuned for **)
GlobSync->>Mapper: return matched file paths
Mapper->>SegmentBuilder: align pattern segments to path segments (DFS + memo)
SegmentBuilder->>CLI: emit restored `[locale]` paths or throw ambiguousPathPattern
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/cli/src/cli/utils/buckets.ts`:
- Around line 263-270: The code currently chooses the first successful k in the
isDoubleStar branch (inside dfs), which can silently pick the wrong alignment
when multiple k values match; change the loop to detect ambiguity: iterate all k
from j..sourceLength, collect/count successful dfs(i+1,k) results, if count==0
leave matched false (return false), if count==1 set parent.set(memoKey, { i2:
i+1, j2: kFound }) and matched=true, if count>1 throw an explicit Error (or
return a hard-failure) indicating an ambiguous `**` alignment for the current
memoKey/pattern segment so callers know the match is ambiguous rather than
arbitrarily picking the first. Ensure you still memoize only the unique mapping
when setting parent.
- Around line 317-336: The function buildLocalePlaceholderSegment currently only
handles a single placeholder occurrence because it computes leftGlob/rightGlob
from the first placeholder and only restores the first matching locale in
sourceChunk; detect multiple occurrences of the placeholder by counting
occurrences in patternChunk (e.g., split or indexOf loop) and either (a)
explicitly reject them with a clear thrown Error mentioning
buildLocalePlaceholderSegment, placeholder and patternChunk, or (b) implement
full support by iterating over each placeholder instance: for each placeholder
occurrence compute its leftGlob/rightGlob relative to that position, scan
sourceChunk for matching locale occurrences (using leftMatches/rightMatches
logic) and replace them with placeholder while preserving other matches; choose
one approach and add/update tests accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d466f1b9-3a86-4ecc-826c-781ab15275d3
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (5)
packages/cli/package.jsonpackages/cli/src/cli/utils/buckets.spec.tspackages/cli/src/cli/utils/buckets.tspackages/cli/src/cli/utils/errors.tspackages/spec/src/config.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/cli/src/cli/utils/buckets.spec.ts`:
- Around line 461-503: Reset the shared glob mock before each test and tighten
assertions to only check the current test call: add a beforeEach that
clears/reset vi.mocked(glob.sync) (or call mockGlobSync reset helper) and then
replace toHaveBeenCalledWith assertions with either
expect(vi.mocked(glob.sync)).toHaveBeenLastCalledWith(...) or assert
expect(vi.mocked(glob.sync)).toHaveBeenCalledTimes(1) plus toHaveBeenCalledWith,
keeping references to mockGlobSync and getBuckets so you clear the mock before
invoking getBuckets and then validate the glob.sync call in that single test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f8b7f873-f7eb-4021-8603-c8610190cace
📒 Files selected for processing (2)
packages/cli/src/cli/utils/buckets.spec.tspackages/cli/src/cli/utils/buckets.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/cli/src/cli/utils/buckets.ts
|
Caution Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted. Error details |
Summary
Adds recursive glob (
**) support to bucketinclude/excludeini18n.json. Pattern likeconfig/locales/**/[locale].ymlnow matches files at any depth, replacing the previous manual enumeration of each depth level.Implementation highlights:
mapPatternToSourcealigns variable-length**segments against source paths.restoreLocaleInSegment) restores[locale]placeholders when a segment contains multiple[locale]tokens (e.g.[locale]-fixed-[locale].json).**admits multiple valid[locale]positions, or when[locale]cannot be unambiguously restored inside a segment. Replaces the silent fallback that closed previous attempts (PRs feat: recursive glob patterns #1178, feat: recursive glob patterns #1179, feat: more flexible glob patterns #1180).node_modules,.git,dist,build,.next,.turbo) applied only for patterns containing**. Concrete and single-*patterns keep the previous traversal behavior exactly.follow: falsefor**patterns to avoid symlink cycles.Manual verification (real filesystem, not mocks)
Run via
lingo.dev show fileson a scratch fixture with locale files at depths 0/1/2, plus decoys innode_modules/anddist/.Main scenarios
config/locales/[locale].yml(legacy concrete)config/locales/*/[locale].yml(legacy*)config/locales/**/[locale].yml**/[locale].ymlfrom repo rootnode_modulesanddistfiltered outconfig/locales/**/[locale].yml+exclude: admin/**/[locale].ymlconfig/locales/**/**/[locale].yml(consecutive**)[locale][locale].txtagainstenen.txt(no separator)[locale][locale].txt**/[locale]/**/dummy.txtagainsten/x/en/dummy.txt(locale value appears twice as a directory)[locale]has two valid positionsEdge cases
multi/**/[locale]/[locale].json[locale]segments separated by**locales/**/[locale]/strings.json, sourceen-US→ targetde-DE**{"path":"locales/**/[locale]/file.json","delimiter":"-"}(config uses_, on-disk uses-)resolveOverriddenLocaleflows through**.config/locales/**/[locale].yml**dot: truelets minimatch descendRound-trip (Moses' check)
Created
fr.ymlandes.ymlnext to everyen.ymlin the fixture, then ran the CLI. Every CLI-resolved path matches a real file on disk —diffbetween CLI output andfindoutput is empty.Cross-platform end-to-end verification (macOS + Windows)
Standalone fixture project (
config/locales/**/[locale].ymlat depths 0/1/2, decoys innode_modules/,dist/,admin/excluded, plus legacy patterns for backward compat) translated end-to-end vialingo.dev runagainst the real API on both platforms with the locally-built CLI from this branch.**)Both shapes produced the same set of 12 target files. Windows-side checks:
\) round-trip correctly throughmapPatternToSourceandrestoreLocaleInSegment— depth-0/1/2 yaml files all resolved.exclude: ["config/locales/admin/**/[locale].yml"]filteredadmin/on Windows (this was the candidate Windows-specific risk).DEFAULT_GLOB_IGNOREfilterednode_modules\some-pkg\locales\en.ymlanddist\locales\en.ymlon Windows.legacy/[locale]/messages.json,legacy/single-star/*/[locale].json) unchanged.Backward compatibility guarantees (non-
**patterns)To ensure existing customer configs are not affected by the algorithmic rewrite, the new path-resolution code is gated behind a strict
**-segment check; non-**patterns are routed through a near-verbatim copy of the pre-PR implementation.pathPatternChunks.includes("**")— checks the segment array, not the joined string, so a literalfoo**barinside a segment is not mistaken for a globstar.**segment, restoration uses the original regex with/i(added in this PR to fix mixed-case locales on macOS / Windows) and the original silent-fallback behavior.glob.syncis invoked withfollow: true, ignore: undefined— identical to pre-PR.differenceBykeys onpathPatternonly (not delimiter), preserving the pre-PR contract that an exclude entry cancels an include with a matching path regardless of delimiter shape.This combination guarantees that any existing valid
i18n.jsonproduces identical bucket output before and after this PR. The new strict mapping algorithm, ambiguity probes, andDEFAULT_GLOB_IGNOREonly activate when the user explicitly opts in by writing**in their pattern.Backward-compat coverage in the test suite:
does not treat literal ** inside a segment as a recursive globstar(foo**barstays on the legacy path)silently leaves the source chunk untouched when [locale] cannot be matched (legacy fallback)leaves source chunk untouched when legacy regex cannot match (multi-[locale] segment)exclude subtracts include by pathPattern regardless of delimiter mismatchAlgorithmic safeguards
Two ambiguity layers, both with explicit
CLIErrorinstead of the silent fallback that closed PR #1180:mapPatternToSource+forbidprobe). When**admits two source-index positions for the[locale]segment, the second placement is detected and the CLI refuses to pick arbitrarily. Example:**/[locale]/**/dummy.txtagainsten/x/en/dummy.txt.restoreLocaleInSegment). When the locale value cannot be located inside a matched segment such that the rest of the segment still matches the pattern, the CLI throws rather than returning the literal pattern as a fallback. Deterministic multi-[locale]patterns ([locale]-fixed-[locale].json,[locale][locale].txt) resolve correctly thanks to recursive matching.Rollback plan
Single-file revert is sufficient:
git revert <merge-sha>onpackages/cli/src/cli/utils/buckets.tsandpackages/cli/package.json(to dropminimatch). No data migrations or lockfile changes are involved (lock file keys are MD5 of the pattern string, which stays stable across the change).Summary by CodeRabbit
New Features
**) in bucket include/exclude (e.g.,config/locales/**/[locale].yml) with clear errors when a match can't unambiguously map to the locale placeholder.**.Documentation
**behavior and safety rules.Tests