Skip to content

fix: handle filtering better#220

Closed
luxass wants to merge 14 commits intomainfrom
filter-fix
Closed

fix: handle filtering better#220
luxass wants to merge 14 commits intomainfrom
filter-fix

Conversation

@luxass
Copy link
Member

@luxass luxass commented Aug 24, 2025

🔗 Linked issue

resolves #215

📚 Description

Summary by CodeRabbit

  • New Features

    • Enhanced path filtering with default exclusions (archives, PDFs, .DS_Store), user overrides, directory-pattern expansion, POSIX normalization, and ordered last-match-wins rules.
    • Filtered tree rebuilding and flattening with an option to include directory paths.
  • Changes

    • Public API expanded: optional filter creation, read-only pattern access, and new tree/flattening options and exports.
  • Tests

    • Added and expanded tests for tree filtering, path flattening, file-paths, file retrieval, and capability scenarios.
  • Chores

    • Added changeset marking a minor shared-package release.

luxass added 2 commits August 24, 2025 06:59
… matches

* Introduced a new test case to verify that directories containing matching files are included in the file tree, even if their names do not match the filter.
* This addresses the bug described in GitHub Issue #215, where certain directories were incorrectly excluded.
@changeset-bot
Copy link

changeset-bot bot commented Aug 24, 2025

🦋 Changeset detected

Latest commit: a9a7edb

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
@ucdjs/shared Minor
@ucdjs/schema-gen Patch
@ucdjs/ucd-store Patch
@ucdjs/utils Patch
@ucdjs/cli Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 24, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Replaces local tree-filtering in the UCD store with a shared rule-based path filter and tree-filtering utility; expands shared filtering API (defaults, overrides, directory pattern expansion, introspection) and adds focused UCD store file-operation tests, import fixes, and flattening options.

Changes

Cohort / File(s) Summary
Shared filter engine & tree API
packages/shared/src/filter.ts, packages/shared/src/index.ts
Adds DEFAULT_EXCLUSIONS, InternalPathFilterRule/RuleSet, directory-pattern expansion, POSIX normalization, last-match-wins rule evaluation, pattern-override semantics, TreeEntry type, filterTreeStructure, and re-exports from index.
Shared flatten util
packages/shared/src/flatten.ts, packages/shared/test/flatten.test.ts
Adds FlattenFilePathsOptions with includeDirectories, updates flattenFilePaths signature/behavior, adds input validation, and test coverage for directory inclusion.
Shared tests: filter
packages/shared/test/filter.test.ts
Expands tests to cover ordered rules, default exclusions and overrides, directory expansion, tree filtering, extraFilters, and edge cases.
UCD store: use shared filter & store changes
packages/ucd-store/src/store.ts
Replaces local recursive filtering in getFileTree with filterTreeStructure(this.#filter, entries, extraFilters), removing local helper recursion.
UCD store tests: restructure & imports
packages/ucd-store/test/file-operations/*.test.ts, packages/ucd-store/test/core/*.test.ts, packages/ucd-store/test/maintenance/*.test.ts
Deletes monolithic file-operations test, adds focused tests (capability requirements, file paths, file tree, getFile), adjusts import paths to local src, and minor assertion updates.
Changeset
.changeset/shy-roses-admire.md
Adds changeset noting enhanced PathFilter and tree support for @ucdjs/shared.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant PF as PathFilter (createPathFilter)
  participant RS as InternalRuleSet
  participant FT as filterTreeStructure
  participant Store as UCDStore.getFileTree

  Caller->>PF: createPathFilter(filters?)
  activate PF
  PF->>RS: buildRules(filters + defaults)
  RS-->>PF: ordered rules
  deactivate PF

  Caller->>Store: getFileTree(version, extraFilters?)
  activate Store
  Store->>FT: filterTreeStructure(this.#filter, entries, extraFilters)
  activate FT
  loop per entry
    FT->>PF: test(path)
    activate PF
    PF->>RS: evaluate ordered rules (last-match-wins)
    RS-->>PF: include/exclude
    deactivate PF
    alt directory with matching descendants
      FT->>FT: include directory and filtered children
    else file included
      FT->>FT: include file
    else
      FT->>FT: exclude
    end
  end
  FT-->>Store: filtered entries
  deactivate FT
  Store-->>Caller: result
  deactivate Store
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Assessment against linked issues

Objective Addressed Explanation
Fix directory filtering to include directories when any child matches, regardless of directory path match (#215)

Assessment against linked issues: Out-of-scope changes

Code Change Explanation
Add DEFAULT_EXCLUSIONS, rule engine and pattern-override logic (packages/shared/src/filter.ts) Broad enhancement to shared filtering semantics; not required by #215 which targets directory-inclusion behavior in getFileTree.
Add includeDirectories option and signature change for flattenFilePaths (packages/shared/src/flatten.ts) New flattening option unrelated to the directory-inclusion fix in getFileTree.
Changeset metadata (.changeset/shy-roses-admire.md) Release metadata entry; not functionally related to the specific getFileTree directory-filtering bug.

Possibly related PRs

Suggested labels

pkg: utils

Poem

I hop through paths with nimble paws,
I guard the roots and honor laws.
If leaves within the branches sing,
I keep the branch — let children spring.
A little rabbit trims the tree, and sets the matching free. 🐇🌿

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch filter-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbit in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbit in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbit gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbit read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbit help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbit ignore or @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbit summary or @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbit or @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the pkg: ucd-store Changes related to the UCD Store package. label Aug 24, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Aug 24, 2025

Preview Deployment for Web

The Web worker has been deployed successfully.

Preview URL: https://preview.ucdjs.dev

This preview was built from commit a9a7edb


🤖 This comment will be updated automatically when you push new commits to this PR.

@github-actions
Copy link
Contributor

github-actions bot commented Aug 24, 2025

Preview Deployment for Api

The Api worker has been deployed successfully.

Preview URL: https://preview.api.ucdjs.dev

This preview was built from commit a9a7edb


🤖 This comment will be updated automatically when you push new commits to this PR.

@codecov
Copy link

codecov bot commented Aug 24, 2025

Codecov Report

❌ Patch coverage is 90.75630% with 11 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
packages/shared/src/filter.ts 90.75% 10 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

luxass added 7 commits August 24, 2025 10:10
* Updated the `patterns` method in the `PathFilter` interface to return a `readonly string[]` instead of a mutable array.
* This change enhances type safety and prevents unintended modifications to the filter patterns.
… structure support

- Added `DEFAULT_EXCLUSIONS` to include common file types in exclusion patterns.
- Updated `createPathFilter` to allow disabling default exclusions via options.
- Introduced `filterTreeStructure` function to filter tree entries based on path filters.
- Updated tests to cover new functionality and ensure correct behavior with various patterns.
- Reformatted the `filters` array for better readability.
- Updated the expectation for `store.filter.patterns()` to use `expect.arrayContaining()` for more robust matching.
- Introduced new patterns and options for improved path filtering.
- Added `expandDirectoryPattern` function to handle directory-like patterns.
- Updated `createPathFilter` to support additional rules and user-defined filters.
- Enhanced `filterTreeStructure` to accurately build tree from filtered paths.
- Adjusted tests to reflect changes in filter behavior and expectations.
- Added `~__internal__rules` method to `PathFilter` for accessing internal rules.
- Updated `buildRules` function to return `InternalPathFilterRuleSet` instead of `RuleSet`.
- Improved type definitions for better clarity and maintainability.
@luxass luxass marked this pull request as ready for review August 27, 2025 05:00
Copilot AI review requested due to automatic review settings August 27, 2025 05:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves the file filtering functionality in the UCD store by reorganizing tests and addressing issues with directory pattern handling. The changes help resolve GitHub issue #215 related to better filter handling.

  • Refactors large test files into smaller, focused test modules for better maintainability
  • Updates import paths to use relative imports from the source code instead of the published package
  • Adds a todo test case that highlights the specific issue mentioned in #215 regarding directory filtering

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/ucd-store/test/maintenance/*.test.ts Updated imports to use relative paths instead of package imports
packages/ucd-store/test/file-operations/*.test.ts Split large test file into focused modules with corrected imports
packages/ucd-store/test/file-operations.test.ts Removed original large test file that was split into smaller modules
packages/ucd-store/test/core/*.test.ts Updated imports to use relative paths
packages/shared/test/filter.test.ts Enhanced filter tests and added tree structure filtering tests
packages/shared/src/index.ts Exported new filterTreeStructure function
packages/shared/src/filter.ts Major refactor of filtering logic with improved pattern handling

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

const filter2 = createPathFilter(["!test.js", "*.js"]);
expect(filter2("index.js")).toBe(true);
expect(filter2("test.js")).toBe(false);
expect(filter2("test.js")).toBe(true);
Copy link

Copilot AI Aug 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This expectation appears incorrect. According to the comment on line 164, 'same patterns in different order should work the same', but this test expects different behavior from filter1 which returns false for 'test.js'. Both filters should have the same result for the same input.

Suggested change
expect(filter2("test.js")).toBe(true);
expect(filter2("test.js")).toBe(false);

Copilot uses AI. Check for mistakes.
for (const match of matches) {
const [root, ...rest] = match.split("/");

if (root == null) {
Copy link

Copilot AI Aug 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The null check should use strict equality (===) instead of loose equality (==) for better type safety and to avoid potential issues with falsy values like empty strings.

Suggested change
if (root == null) {
if (root === null || root === undefined) {

Copilot uses AI. Check for mistakes.

const [next, ...remaining] = rest;

if (next == null) {
Copy link

Copilot AI Aug 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The null check should use strict equality (===) instead of loose equality (==) for better type safety and to avoid potential issues with falsy values like empty strings.

Suggested change
if (next == null) {
if (next === null || next === undefined) {

Copilot uses AI. Check for mistakes.
@github-actions github-actions bot added the pkg: shared Changes related to the Shared package. label Aug 27, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/shared/src/filter.ts (1)

68-74: Fix docs: precedence is last-match-wins (order matters), not “exclude overrides include.”

Current implementation evaluates rules in order and the last matching rule sets the decision. Update docs to prevent confusion.

Apply this diff:

- * - **Precedence**: Exclude patterns override include patterns
+ * - **Precedence**: Last matching rule wins — pattern order matters.
+ *   Place excludes after includes to override them, or includes after excludes to re-include.
♻️ Duplicate comments (3)
packages/shared/test/filter.test.ts (1)

164-168: Fix expectation: order matters; last matching rule wins.

The comment says “same patterns in different order should work the same,” but the implementation is last-match-wins. This test should expect false for test.js to mirror filter1 and the documented semantics we adopt in code/tests elsewhere.

Apply this diff:

-      // same patterns in different order should work the same
+      // same patterns in different order do NOT work the same (last match wins)
       const filter2 = createPathFilter(["!test.js", "*.js"]);
       expect(filter2("index.js")).toBe(true);
-      expect(filter2("test.js")).toBe(true);
+      expect(filter2("test.js")).toBe(false);
packages/shared/src/filter.ts (2)

377-379: Use strict null checks.

Prefer explicit null/undefined checks for clarity.

Apply this diff:

-    if (root == null) {
+    if (root === null || root === undefined) {

408-411: Use strict null checks.

Same rationale as above.

Apply this diff:

-      if (next == null) {
+      if (next === null || next === undefined) {
🧹 Nitpick comments (15)
.changeset/shy-roses-admire.md (2)

1-3: Also bump @ucdjs/ucd-store (patch) in this changeset.

The filtering bug fix affects the store package behavior; please include a patch bump for it.

Apply:

 ---
-"@ucdjs/shared": minor
+"@ucdjs/shared": minor
+"@ucdjs/ucd-store": patch
 ---

5-5: Tighten the release note and reference the linked issue.

Clarify the behavioral fix so consumers know dirs are kept when children match, and link #215.

Apply:

-Enhanced PathFilter interface with internal ruleset and improved path filtering capabilities with default exclusions and tree structure support
+Enhanced PathFilter with default exclusions, tree-structure support, and internal ruleset. Fix directory filtering so parent dirs are retained when any child matches (#215).
packages/ucd-store/test/core/factory.test.ts (1)

208-208: Order-agnostic assertion: confirm API contract.

If filter.patterns() no longer guarantees order (due to defaults/rule expansion), consider documenting that in JSDoc. If order should be stable, assert exact sequence.

packages/ucd-store/test/file-operations/get-file.test.ts (3)

1-7: Import node:path join for cross-platform paths.

Prepares for the absolute-path test change below.

Apply:

-import { assert, beforeEach, describe, expect, it, vi } from "vitest";
+import { assert, beforeEach, describe, expect, it, vi } from "vitest";
+import { join } from "node:path";

72-91: Optional: assert error type in addition to message.

If UCDStoreFileNotFoundError is exported, asserting its class strengthens the test.

Apply:

-    expect(fileData).toBe(null);
-    assert(fileError != null, "Expected error for nonexistent file");
-    expect(fileError.message).toBe("File not found: ./nonexistent.txt in version 15.0.0");
+    expect(fileData).toBe(null);
+    assert(fileError != null, "Expected error for nonexistent file");
+    // expect(fileError).toBeInstanceOf(UCDStoreFileNotFoundError);
+    expect(fileError.message).toBe("File not found: ./nonexistent.txt in version 15.0.0");

8-20: Add coverage for directory filtering behavior (issue #215).

Consider a new test in file-tree tests that asserts parents are kept when any descendant matches the filter.

I can draft a minimal case: a non-matching dir containing a matching file; expect the dir and file to appear in getFileTree results when filters exclude the dir name but include the file pattern.

Also applies to: 93-113

packages/ucd-store/test/file-operations/file-paths.test.ts (2)

45-51: Avoid brittle ordering assertion unless API guarantees sort

If getFilePaths doesn’t guarantee order, this can be flaky across platforms. Either guarantee sort in impl or relax the test.

Option A (keep strictness, assert on sorted copy):

-    expect(filePaths).toEqual([
+    expect(filePaths?.slice().sort()).toEqual([
       "ArabicShaping.txt",
       "BidiBrackets.txt",
       "extracted/DerivedBidiClass.txt",
       "extracted/nested/DeepFile.txt",
     ]);

If the API guarantees sorted output, ignore this; otherwise consider the change above or sort in the implementation.


77-81: Same note on ordering for filtered paths

Mirror the ordering approach here to avoid flakes.

-    expect(filePaths).toEqual([
+    expect(filePaths?.slice().sort()).toEqual([
       "ArabicShaping.txt",
       "BidiBrackets.txt",
       "extracted/DerivedBidiClass.txt",
     ]);
packages/shared/src/index.ts (1)

1-3: Add missing re-exports for public API

In packages/shared/src/index.ts, re-export the TreeEntry type and DEFAULT_EXCLUSIONS constant so they’re available to consumers of the shared package. Both symbols are already exported from filter.ts (lines 29 and 345), so you can safely include them in the public surface.

/packages/shared/src/index.ts

-export { createPathFilter, filterTreeStructure, PRECONFIGURED_FILTERS } from "./filter";
-export type { FilterOptions, PathFilter } from "./filter";
+export { createPathFilter, filterTreeStructure, PRECONFIGURED_FILTERS, DEFAULT_EXCLUSIONS } from "./filter";
+export type { FilterOptions, PathFilter, TreeEntry } from "./filter";
packages/shared/test/filter.test.ts (3)

128-137: Clarify the case-sensitivity comment to match expectations.

Tests assume case-sensitive matching (default picomatch on POSIX). The current “case should always match” phrasing is confusing.

Apply this diff:

-      // case should always match
+      // picomatch is case-sensitive by default (on POSIX)

201-215: Add a test proving user includes can override DEFAULT_EXCLUSIONS.

We exercise disableDefaultExclusions and extras, but not explicit override via positive patterns (e.g., *.zip). Add a targeted assertion.

Apply this diff:

   describe("disableDefaultExclusions option", () => {
@@
   });
+
+  it("allows overriding DEFAULT_EXCLUSIONS with a positive pattern", () => {
+    const filter = createPathFilter(["**/*.zip"]); // explicit include
+    expect(filter("archive.zip")).toBe(true);
+    expect(filter("docs.pdf")).toBe(false); // still excluded by default
+  });

Also applies to: 505-512


515-525: Consider asserting the presence of the internal rules accessor for completeness.

Since PathFilter exposes "~__internal__rules", a minimal check adds confidence without coupling to internals.

Apply this diff:

   it("should have patterns method", () => {
     const filter = createPathFilter(["*.txt"]);
     expect(typeof filter.patterns).toBe("function");
   });
+
+  it("should expose internal rules accessor", () => {
+    const filter = createPathFilter(["*.txt"]);
+    // @ts-expect-error runtime property check
+    expect(typeof filter["~__internal__rules"]).toBe("function");
+  });
packages/shared/src/filter.ts (3)

130-150: Avoid recompiling default exclusion matchers on every call.

Compiling picomatch per invocation adds overhead in tight loops. Precompile once.

Apply this diff:

+const DEFAULT_EXCLUSIONS_EXPANDED = DEFAULT_EXCLUSIONS.map((p) => expandDirectoryPattern(`!${p}`).slice(1));
+const DEFAULT_EXCLUSION_MATCHERS = DEFAULT_EXCLUSIONS_EXPANDED.map((p) => picomatch(p, PICOMATCH_DEFAULT_OPTIONS));
@@
-      for (const defaultPattern of DEFAULT_EXCLUSIONS) {
+      for (let i = 0; i < DEFAULT_EXCLUSIONS.length; i++) {
+        const defaultPattern = DEFAULT_EXCLUSIONS[i]!;
@@
-          const expandedDefault = expandDirectoryPattern(`!${defaultPattern}`);
-          const cleanDefault = expandedDefault.slice(1);
-          const defaultTest = picomatch(cleanDefault, PICOMATCH_DEFAULT_OPTIONS);
-          if (defaultTest(normalizedPath)) {
+          if (DEFAULT_EXCLUSION_MATCHERS[i](normalizedPath)) {
             return false;
           }

381-401: Optional: reduce reconstruction to O(n) with maps.

Repeated array finds scale poorly with large trees. Consider Map<string, TreeEntry> per level to speed insertions.

I can provide a targeted refactor if you want this optimized now.

Also applies to: 417-436


403-406: Nit: comment phrasing.

“use the rest to run recursively append it” reads awkwardly.

Apply this diff:

-    // use the rest to run recursively append it
+    // recursively append remaining path segments
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ee9840c and d2b3929.

📒 Files selected for processing (16)
  • .changeset/shy-roses-admire.md (1 hunks)
  • packages/shared/src/filter.ts (4 hunks)
  • packages/shared/src/index.ts (1 hunks)
  • packages/shared/test/filter.test.ts (11 hunks)
  • packages/ucd-store/test/core/errors.test.ts (1 hunks)
  • packages/ucd-store/test/core/factory.test.ts (2 hunks)
  • packages/ucd-store/test/core/init.test.ts (1 hunks)
  • packages/ucd-store/test/file-operations.test.ts (0 hunks)
  • packages/ucd-store/test/file-operations/capability-requirements.test.ts (1 hunks)
  • packages/ucd-store/test/file-operations/file-paths.test.ts (1 hunks)
  • packages/ucd-store/test/file-operations/file-tree.test.ts (1 hunks)
  • packages/ucd-store/test/file-operations/get-file.test.ts (1 hunks)
  • packages/ucd-store/test/maintenance/analyze.test.ts (1 hunks)
  • packages/ucd-store/test/maintenance/clean.test.ts (1 hunks)
  • packages/ucd-store/test/maintenance/mirror.test.ts (1 hunks)
  • packages/ucd-store/test/maintenance/repair.test.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • packages/ucd-store/test/file-operations.test.ts
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: luxass
PR: ucdjs/ucd#6
File: packages/cli/src/cmd/codegen/fields.ts:68-82
Timestamp: 2025-05-04T11:52:22.858Z
Learning: User prefers to address the issue with Node.js API compatibility (`Dirent.parentPath` not being part of Node's API and the `recursive: true` option for `readdir` being only available in Node 21+) in a separate PR rather than the current one.
Learnt from: luxass
PR: ucdjs/ucd#45
File: packages/ucd-store/src/download.ts:24-24
Timestamp: 2025-06-09T05:10:32.105Z
Learning: In the ucd-store package refactor, picomatch was moved from direct usage in download.ts to internal usage within the createPathFilter function in filter.ts. The pattern format is still picomatch-compatible, so JSDoc comments referencing picomatch pattern format remain correct.
📚 Learning: 2025-06-09T05:10:32.105Z
Learnt from: luxass
PR: ucdjs/ucd#45
File: packages/ucd-store/src/download.ts:24-24
Timestamp: 2025-06-09T05:10:32.105Z
Learning: In the ucd-store package refactor, picomatch was moved from direct usage in download.ts to internal usage within the createPathFilter function in filter.ts. The pattern format is still picomatch-compatible, so JSDoc comments referencing picomatch pattern format remain correct.

Applied to files:

  • packages/shared/src/index.ts
  • .changeset/shy-roses-admire.md
  • packages/ucd-store/test/maintenance/clean.test.ts
  • packages/ucd-store/test/file-operations/file-paths.test.ts
  • packages/ucd-store/test/core/errors.test.ts
  • packages/ucd-store/test/core/factory.test.ts
  • packages/ucd-store/test/maintenance/mirror.test.ts
  • packages/ucd-store/test/maintenance/analyze.test.ts
  • packages/ucd-store/test/core/init.test.ts
  • packages/ucd-store/test/maintenance/repair.test.ts
  • packages/shared/src/filter.ts
📚 Learning: 2025-07-13T09:23:43.820Z
Learnt from: luxass
PR: ucdjs/ucd#107
File: apps/api/src/routes/v1_files.ts:18-32
Timestamp: 2025-07-13T09:23:43.820Z
Learning: UNICODE_STABLE_VERSION from luxass/unicode-utils-new refers to a specific Unicode version string (not a dynamic value), and the validation logic in the files API correctly checks the normalized version (after "latest" substitution) against UNICODE_VERSION_METADATA rather than the mappedVersion from resolveUCDVersion().

Applied to files:

  • packages/ucd-store/test/file-operations/file-paths.test.ts
  • packages/ucd-store/test/maintenance/mirror.test.ts
  • packages/ucd-store/test/maintenance/analyze.test.ts
🧬 Code graph analysis (4)
packages/ucd-store/test/file-operations/file-paths.test.ts (1)
tooling/test-utils/src/store.ts (1)
  • setupMockStore (65-122)
packages/ucd-store/test/file-operations/capability-requirements.test.ts (1)
tooling/test-utils/src/store.ts (1)
  • setupMockStore (65-122)
packages/ucd-store/test/file-operations/file-tree.test.ts (2)
tooling/test-utils/src/store.ts (1)
  • setupMockStore (65-122)
packages/shared/src/flatten.ts (1)
  • flattenFilePaths (23-43)
packages/ucd-store/test/file-operations/get-file.test.ts (1)
tooling/test-utils/src/store.ts (1)
  • setupMockStore (65-122)
🔇 Additional comments (16)
.changeset/shy-roses-admire.md (1)

1-5: Remove lingering isFiltered guards in filterDirectoryChildren

The core filter-relax fix hasn’t been fully applied: there are still two instances in packages/ucd-store/src/store.ts where isFiltered is gating non-leaf directories. These must be changed so that presence of filtered children alone determines inclusion.

• packages/ucd-store/src/store.ts:157–161
• packages/ucd-store/src/store.ts:189–193

Suggested diff:

--- a/packages/ucd-store/src/store.ts
+++ b/packages/ucd-store/src/store.ts
@@ -157,7 +157,7 @@
           // handle directories with children
           if (child.type === "directory" && child.children) {
             const filteredGrandChildren = filterDirectoryChildren(child.children, childPath);

-            if (isFiltered && filteredGrandChildren.length > 0) {
+            if (filteredGrandChildren.length > 0) {
               result.push({
                 name: child.name,
                 path: childPath,
@@ -189,7 +189,7 @@
           // handle directories with children
           if (entry.type === "directory" && entry.children) {
             const filteredChildren = filterDirectoryChildren(entry.children, entry.path);

-            if (isFiltered && filteredChildren.length > 0) {
+            if (filteredChildren.length > 0) {
               result.push({
                 name: entry.name,
                 path: entry.path,

Please update these checks and verify no other isFiltered && filtered…Children.length > 0 patterns remain.

⛔ Skipped due to learnings
Learnt from: luxass
PR: ucdjs/ucd#45
File: packages/ucd-store/src/download.ts:24-24
Timestamp: 2025-06-09T05:10:32.105Z
Learning: In the ucd-store package refactor, picomatch was moved from direct usage in download.ts to internal usage within the createPathFilter function in filter.ts. The pattern format is still picomatch-compatible, so JSDoc comments referencing picomatch pattern format remain correct.
packages/ucd-store/test/core/init.test.ts (1)

14-14: Import path correction looks good.

Consistent with the repo’s updated test import strategy.

packages/ucd-store/test/core/errors.test.ts (1)

11-11: Import path update LGTM.

Keeps tests aligned with local src exports.

packages/ucd-store/test/core/factory.test.ts (2)

4-7: Import path moves are correct.

Matches other core tests and avoids package boundary indirection.


196-201: Array formatting change is non-semantic.

Readability improved; no behavior change.

packages/ucd-store/test/file-operations/capability-requirements.test.ts (1)

132-136: Do not remove toHaveBeenCalledAfter matchers
Vitest’s official API includes toHaveBeenCalledAfter(mock, …) as a core matcher since version 3.0.0, so these assertions are valid and won’t break in a standard Vitest environment. (vitest.dev, main.vitest.dev)

Likely an incorrect or invalid review comment.

packages/ucd-store/test/maintenance/repair.test.ts (1)

8-8: Import path switch to local source looks good

Keeps tests closer to in-repo sources; avoids package resolution drift. No behavior change.

packages/ucd-store/test/maintenance/mirror.test.ts (1)

10-10: Import path switch to local source looks good

Consistent with other maintenance tests; no functional impact.

packages/ucd-store/test/maintenance/clean.test.ts (1)

10-10: Import path migration to local source looks good.

Switching to ../../src/factory improves test stability against packaging changes. No functional impact observed.

packages/ucd-store/test/maintenance/analyze.test.ts (1)

9-11: Consistent internal import realignment.

Imports now point to ../../src/* and ../__shared, aligning with other maintenance tests. No behavior change; tests remain coherent.

packages/ucd-store/test/file-operations/file-tree.test.ts (4)

22-95: LGTM: baseline file tree and flatten verification.

The expectations match the contract of getFileTree and flattenFilePaths. Good coverage for nested directories.


167-185: LGTM: directory exclusion via multiple equivalent patterns.

Validates "!/extracted" vs "!/extracted/**" equivalence with expandDirectoryPattern behavior.


187-235: LGTM: nested directory filtering parity across patterns.

Good assertions proving order- and form-equivalence of directory-targeted excludes.


403-437: LGTM: preservation of empty directories post-filter.

Confirms directories with all children filtered remain as empty nodes, which is important for tree structure fidelity.

packages/shared/src/filter.ts (2)

215-241: Directory expansion logic: solid and matches tests.

expandDirectoryPattern correctly handles trailing slashes, already-expanded patterns, and glob-token segments.


314-339: OK: override detection for default exclusions.

Extension, exact, and .DS_Store cases are handled predictably.

Comment on lines +53 to +82
it("should respect filters in getFilePaths", async () => {
const storePath = await testdir({
"15.0.0": {
"ArabicShaping.txt": "Arabic shaping data",
"BidiBrackets.txt": "Bidi brackets data",
"extracted": {
"DerivedBidiClass.txt": "Derived bidi class data",
"nested": {
"DeepFile.txt": "Deep nested file",
},
},
},
".ucd-store.json": JSON.stringify({
"15.0.0": "15.0.0",
}),
});

const store = await createNodeUCDStore({
basePath: storePath,
});

await store.init();
const [filePaths, filePathsError] = await store.getFilePaths("15.0.0", ["!**/nested/**"]);
assert(filePathsError === null, "Expected getFilePaths to succeed");
expect(filePaths).toEqual([
"ArabicShaping.txt",
"BidiBrackets.txt",
"extracted/DerivedBidiClass.txt",
]);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Add regression test for #215: include parent directory when only children match

Current tests don’t exercise the bug described in #215. Add a case where the parent dir doesn’t match an include, but a deep child does; the directory must still be traversed and the child included.

Example test to add:

it("should include directories when any descendant matches include filters (#215)", async () => {
  const storePath = await testdir({
    "15.0.0": {
      "extracted": {
        "nested": { "DeepFile.txt": "Deep nested file" },
      },
      "ArabicShaping.txt": "x",
    },
    ".ucd-store.json": JSON.stringify({ "15.0.0": "15.0.0" }),
  });

  const store = await createNodeUCDStore({ basePath: storePath });
  await store.init();

  // Include only the deep file; parent "extracted" itself does not match.
  const [paths, err] = await store.getFilePaths("15.0.0", ["extracted/**/DeepFile.txt"]);
  assert(err === null, "Expected getFilePaths to succeed");

  expect(paths).toEqual(["extracted/nested/DeepFile.txt"]);
});

Optional: add a complementary negative test that ensures a directory is excluded when none of its descendants match.


Automation to confirm the code fix in store.ts:

---

🏁 Script executed:

```shell
#!/bin/bash
# Verify that directory inclusion no longer depends on `isFiltered`
rg -nP 'filterDirectoryChildren|getFileTree' packages/ucd-store/src/store.ts -C4
rg -nP 'isFiltered\s*&&\s*filtered(Grand)?Children\.length\s*>\s*0' packages -n || echo "OK: no legacy condition found"

Length of output: 2137


Legacy directory-filtering logic still present in getFileTree must be updated

The grep output shows that the old isFiltered && filtered…Children.length > 0 checks remain at lines 159 and 191 in packages/ucd-store/src/store.ts. These must be removed or refactored so that directories are traversed whenever any descendant matches—even if the directory itself is filtered out.

• In packages/ucd-store/src/store.ts

  • Line 159: if (isFiltered && filteredGrandChildren.length > 0) { … }
  • Line 191: if (isFiltered && filteredChildren.length > 0) { … }

Each of these should be changed to push the directory node whenever filtered…Children.length > 0 (regardless of isFiltered), ensuring parent dirs are included when only descendants match.

Additionally, please add the regression test for issue #215 as suggested:

it("should include directories when any descendant matches include filters (#215)", async () => {
  const storePath = await testdir({
    "15.0.0": {
      extracted: { nested: { "DeepFile.txt": "Deep nested file" } },
      "ArabicShaping.txt": "x",
    },
    ".ucd-store.json": JSON.stringify({ "15.0.0": "15.0.0" }),
  });

  const store = await createNodeUCDStore({ basePath: storePath });
  await store.init();

  const [paths, err] = await store.getFilePaths("15.0.0", ["extracted/**/DeepFile.txt"]);
  assert(err === null, "Expected getFilePaths to succeed");
  expect(paths).toEqual(["extracted/nested/DeepFile.txt"]);
});

And, optionally, a negative test confirming that a directory with no matching descendants is excluded.

🤖 Prompt for AI Agents
In packages/ucd-store/src/store.ts around lines 159 and 191, the legacy checks
`if (isFiltered && filteredGrandChildren.length > 0)` and `if (isFiltered &&
filteredChildren.length > 0)` prevent parent directories from being emitted when
only descendants match; update both conditions to push the directory node
whenever the corresponding `filtered…Children.length > 0` is true (remove the
`isFiltered &&` part) so directories are included if any descendant matches,
ensure traversal still recurses into those directories even if the directory
itself is filtered out, and add the provided regression test in
packages/ucd-store/test/file-operations/file-paths.test.ts (plus an optional
negative test) to verify directories are included only when descendants match;
run tests to confirm behavior.

Comment on lines +439 to +506
it.todo("should include directories that don't match filter but contain matching children (GitHub Issue #215)", async () => {
const storePath = await testdir({
"15.0.0": {
"root-file.txt": "root content",
"nonMatching": {
"matching.txt": "should be included",
"also-matching.txt": "should also be included",
"nested": {
"deep-matching.txt": "should be included too",
"non-matching.log": "should be excluded",
},
},
"alsoNonMatching": {
"exclude.log": "should be excluded",
},
},
".ucd-store.json": JSON.stringify({
"15.0.0": "15.0.0",
}),
});

const store = await createNodeUCDStore({
basePath: storePath,
});

await store.init();
const [fileTree, fileTreeError] = await store.getFileTree("15.0.0", ["**/*.txt"]);
assert(fileTreeError === null, "Expected getFileTree to succeed");

const expected = [
{
name: "root-file.txt",
path: "root-file.txt",
type: "file",
},
{
name: "nonMatching",
path: "nonMatching",
type: "directory",
children: [
{
name: "also-matching.txt",
path: "also-matching.txt",
type: "file",
},
{
name: "matching.txt",
path: "matching.txt",
type: "file",
},
{
name: "nested",
path: "nested",
type: "directory",
children: [
{
name: "deep-matching.txt",
path: "deep-matching.txt",
type: "file",
},
],
},
],
},
];

expect(fileTree).toEqual(expected);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enable the regression test for Issue #215 (directories with matching descendants must be kept).

This is the core objective of the PR. Converting this from todo to an active test will prevent regressions and validate the new filtering semantics end-to-end.

Apply this diff to activate the test:

-  it.todo("should include directories that don't match filter but contain matching children (GitHub Issue #215)", async () => {
+  it("should include directories that don't match filter but contain matching children (GitHub Issue #215)", async () => {
@@
-    expect(fileTree).toEqual(expected);
+    expect(fileTree).toEqual(expected);
   });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
it.todo("should include directories that don't match filter but contain matching children (GitHub Issue #215)", async () => {
const storePath = await testdir({
"15.0.0": {
"root-file.txt": "root content",
"nonMatching": {
"matching.txt": "should be included",
"also-matching.txt": "should also be included",
"nested": {
"deep-matching.txt": "should be included too",
"non-matching.log": "should be excluded",
},
},
"alsoNonMatching": {
"exclude.log": "should be excluded",
},
},
".ucd-store.json": JSON.stringify({
"15.0.0": "15.0.0",
}),
});
const store = await createNodeUCDStore({
basePath: storePath,
});
await store.init();
const [fileTree, fileTreeError] = await store.getFileTree("15.0.0", ["**/*.txt"]);
assert(fileTreeError === null, "Expected getFileTree to succeed");
const expected = [
{
name: "root-file.txt",
path: "root-file.txt",
type: "file",
},
{
name: "nonMatching",
path: "nonMatching",
type: "directory",
children: [
{
name: "also-matching.txt",
path: "also-matching.txt",
type: "file",
},
{
name: "matching.txt",
path: "matching.txt",
type: "file",
},
{
name: "nested",
path: "nested",
type: "directory",
children: [
{
name: "deep-matching.txt",
path: "deep-matching.txt",
type: "file",
},
],
},
],
},
];
expect(fileTree).toEqual(expected);
});
it("should include directories that don't match filter but contain matching children (GitHub Issue #215)", async () => {
const storePath = await testdir({
"15.0.0": {
"root-file.txt": "root content",
"nonMatching": {
"matching.txt": "should be included",
"also-matching.txt": "should also be included",
"nested": {
"deep-matching.txt": "should be included too",
"non-matching.log": "should be excluded",
},
},
"alsoNonMatching": {
"exclude.log": "should be excluded",
},
},
".ucd-store.json": JSON.stringify({
"15.0.0": "15.0.0",
}),
});
const store = await createNodeUCDStore({
basePath: storePath,
});
await store.init();
const [fileTree, fileTreeError] = await store.getFileTree("15.0.0", ["**/*.txt"]);
assert(fileTreeError === null, "Expected getFileTree to succeed");
const expected = [
{
name: "root-file.txt",
path: "root-file.txt",
type: "file",
},
{
name: "nonMatching",
path: "nonMatching",
type: "directory",
children: [
{
name: "also-matching.txt",
path: "also-matching.txt",
type: "file",
},
{
name: "matching.txt",
path: "matching.txt",
type: "file",
},
{
name: "nested",
path: "nested",
type: "directory",
children: [
{
name: "deep-matching.txt",
path: "deep-matching.txt",
type: "file",
},
],
},
],
},
];
expect(fileTree).toEqual(expected);
});
🤖 Prompt for AI Agents
In packages/ucd-store/test/file-operations/file-tree.test.ts around lines 439 to
506, the test for Issue #215 is marked as todo so it is not executed; change the
test declaration from it.todo(...) to it(...) (i.e., activate the test) so the
suite runs it, keeping the async callback and all existing assertions intact;
run the test suite to confirm it now executes and passes (or adjust expectations
if needed).

luxass added 3 commits August 28, 2025 05:32
- Replaced custom filtering logic with `filterTreeStructure` function for improved readability and maintainability.
- Streamlined the process of filtering file paths and directories in the `getFileTree` method.
- Updated `filterTreeStructure` function to include an optional `extraFilters` parameter.
- Modified the filtering logic to utilize the new parameter for improved path filtering capabilities.
- Added `includeDirectories` option to `FlattenFilePathsOptions` to control whether directory paths are included in the output.
- Updated `flattenFilePaths` function to handle the new option.
- Added tests to verify behavior for including and excluding directory paths.
- Included error handling for invalid input types.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/shared/src/filter.ts (1)

68-74: Docs vs. behavior mismatch: code is last-match-wins, docs say “exclude overrides include”.

The evaluator sets allowed by the last matching rule; excludes don’t always win. Update the docs (or change the logic). Also list all default exclusions.

Apply this doc fix:

- * - **Precedence**: Exclude patterns override include patterns
- * - **Default exclusions**: `.zip` and `.pdf` files are excluded by default (unless disabled)
+ * - **Resolution**: Last-match-wins between user patterns; list later patterns to override earlier ones.
+ * - **Default exclusions**: `*.zip`, `*.pdf`, and `.DS_Store` are excluded by default (unless disabled or overridden by a positive pattern)

If you prefer “exclude wins” semantics, I can provide a small evaluator change to enforce that deterministically.

♻️ Duplicate comments (2)
packages/shared/src/filter.ts (2)

378-380: Use strict null checks.

Align with previous feedback; avoids accidental truthiness pitfalls.

-    if (root == null) {
+    if (root === null || root === undefined) {

409-411: Use strict null checks.

Same here for the child segment.

-      if (next == null) {
+      if (next === null || next === undefined) {
🧹 Nitpick comments (5)
packages/shared/src/flatten.ts (1)

51-56: Guard against undefined children (defensive).

If a provider ever omits children on a directory node, this will throw due to the input-array check in the recursive call. Consider a safe fallback.

Apply:

-      paths.push(...flattenFilePaths(file.children, fullPath, options));
+      paths.push(...flattenFilePaths(file.children ?? [], fullPath, options));
packages/shared/test/flatten.test.ts (1)

256-261: Invalid-input cases covered; add one more edge if you want.

Optionally add an empty-directory-with-includeDirectories=true case to lock in inclusion of empty dirs (or their absence, depending on desired behavior).

packages/shared/src/filter.ts (3)

166-169: De-dupe when extending patterns.

Avoid rule bloat and preserve deterministic order.

-  filter.extend = (additionalFilters: string[]) => {
-    basePatterns = [...basePatterns, ...additionalFilters];
-    baseRules = buildRules(basePatterns);
-  };
+  filter.extend = (additionalFilters: string[]) => {
+    basePatterns = Array.from(new Set([...basePatterns, ...additionalFilters]));
+    baseRules = buildRules(basePatterns);
+  };

365-366: Type the matches array.

Minor readability and prevents accidental non-string pushes.

-  const matches = [];
+  const matches: string[] = [];

356-445: Tree rebuild logic is correct; consider small perf wins.

For very large match sets, replacing linear .find lookups with Maps per level will drop worst-case behavior from O(n²) to near O(n).

I can provide a Map-backed version if this becomes hot.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d2b3929 and 6422aaf.

📒 Files selected for processing (5)
  • packages/shared/src/filter.ts (4 hunks)
  • packages/shared/src/flatten.ts (3 hunks)
  • packages/shared/src/index.ts (1 hunks)
  • packages/shared/test/flatten.test.ts (1 hunks)
  • packages/ucd-store/src/store.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/shared/src/index.ts
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: luxass
PR: ucdjs/ucd#6
File: packages/cli/src/cmd/codegen/fields.ts:68-82
Timestamp: 2025-05-04T11:52:22.858Z
Learning: User prefers to address the issue with Node.js API compatibility (`Dirent.parentPath` not being part of Node's API and the `recursive: true` option for `readdir` being only available in Node 21+) in a separate PR rather than the current one.
Learnt from: luxass
PR: ucdjs/ucd#45
File: packages/ucd-store/src/download.ts:24-24
Timestamp: 2025-06-09T05:10:32.105Z
Learning: In the ucd-store package refactor, picomatch was moved from direct usage in download.ts to internal usage within the createPathFilter function in filter.ts. The pattern format is still picomatch-compatible, so JSDoc comments referencing picomatch pattern format remain correct.
📚 Learning: 2025-06-14T05:20:20.149Z
Learnt from: luxass
PR: ucdjs/ucd#56
File: packages/utils/src/ucd-files/validate.ts:115-130
Timestamp: 2025-06-14T05:20:20.149Z
Learning: Node.js 22 supports the `recursive` option in `fs.readdir()` (introduced in Node.js 18.17.0), which returns a flattened string array of all descendant files and directories when `recursive: true` is passed.

Applied to files:

  • packages/shared/src/flatten.ts
📚 Learning: 2025-06-09T05:10:32.105Z
Learnt from: luxass
PR: ucdjs/ucd#45
File: packages/ucd-store/src/download.ts:24-24
Timestamp: 2025-06-09T05:10:32.105Z
Learning: In the ucd-store package refactor, picomatch was moved from direct usage in download.ts to internal usage within the createPathFilter function in filter.ts. The pattern format is still picomatch-compatible, so JSDoc comments referencing picomatch pattern format remain correct.

Applied to files:

  • packages/ucd-store/src/store.ts
  • packages/shared/src/filter.ts
🧬 Code graph analysis (4)
packages/shared/src/flatten.ts (1)
packages/fetch/src/components.ts (1)
  • UnicodeTreeNode (12-12)
packages/shared/test/flatten.test.ts (2)
packages/fetch/src/components.ts (1)
  • UnicodeTree (13-13)
packages/shared/src/flatten.ts (1)
  • flattenFilePaths (34-62)
packages/ucd-store/src/store.ts (1)
packages/shared/src/filter.ts (1)
  • filterTreeStructure (356-445)
packages/shared/src/filter.ts (2)
packages/ucd-store/src/store.ts (1)
  • filter (97-99)
packages/shared/src/flatten.ts (1)
  • flattenFilePaths (34-62)
🔇 Additional comments (8)
packages/shared/src/flatten.ts (2)

4-12: Nice, clear option surface for directory inclusion.

Good docs and a sensible default (false). Name reads well in call sites.


34-38: Signature + destructuring look solid.

Defaulting prefix and options keeps callers terse while remaining explicit.

packages/shared/test/flatten.test.ts (1)

207-230: Good coverage for includeDirectories=true/false.

These assertions pin the option’s behavior and protect against regressions in output ordering.

Also applies to: 232-254

packages/ucd-store/src/store.ts (3)

18-18: Import changes align with shared utilities.

Centralizing filtering and flattening in @ucdjs/shared reduces duplication.


139-139: Delegating to filterTreeStructure fixes the “prune-parent, drop-children” bug.

This matches the objective in #215 by rebuilding the minimal tree of matched files and their ancestor dirs.


122-140: Confirm type compatibility: TreeEntry[] vs UnicodeTreeNode[].

filterTreeStructure returns TreeEntry[], while the method signature expects UnicodeTreeNode[]. If these diverge later, this could become a hidden break. Either tighten the return type or make filterTreeStructure generic over the node shape.

Would you like me to open a follow-up to make filterTreeStructure generic or add a shared “tree node” type alias used by both packages?

packages/shared/src/filter.ts (2)

104-110: Evaluator is clear and predictable (last-match-wins).

The normalized-path handling and base/extra composition are straightforward.

Also applies to: 152-160


215-241: Directory-pattern expansion may over-expand file-like patterns.

Heuristic expands any last segment without glob tokens, which could treat a bare filename (e.g., “LICENSE”) as a directory. Either document this explicitly or refine the heuristic (e.g., only expand when the original pattern ends with “/” or when the last segment has no dot and no glob tokens).

I can patch the heuristic to: expand if endsWith("/") OR (no glob tokens AND no ".")—let me know your preference.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
packages/shared/test/filter.test.ts (4)

127-131: Clarify case-sensitivity wording to match expectations.

The comments/descriptions imply matching across cases, but the assertions verify case-sensitive behavior. Tighten the wording for clarity.

-      // case should always match
-      [["*.TXT"], "file.txt", false, "doesn't match different case"],
-      [["*.txt"], "FILE.TXT", false, "doesn't match different case reverse"],
-      [["SRC/**"], "src/file.js", false, "matches directory different case"],
+      // case-sensitive: patterns are case-sensitive
+      [["*.TXT"], "file.txt", false, "does not match different case"],
+      [["*.txt"], "FILE.TXT", false, "does not match different case (reverse)"],
+      [["SRC/**"], "src/file.js", false, "does not match directory with different case"],

391-400: Avoid brittle error-message assertions; assert error type instead.

Exact V8 error messages can change across Node versions. Assert TypeError rather than the string.

-      expect(() => {
-        // @ts-expect-error typescript gives type-errors since we are using readonly.
-        patterns.push("*.js"); // try to mutate the returned array
-      }).toThrow("Cannot add property 1, object is not extensible");
+      expect(() => {
+        // @ts-expect-error typescript gives type-errors since we are using readonly.
+        patterns.push("*.js"); // try to mutate the returned array
+      }).toThrow(TypeError);

335-346: Duplicate-pattern test doesn’t actually verify multiplicity.

arrayContaining ignores counts. Explicitly assert the number of occurrences.

-      expect(filter.patterns()).toEqual(expect.arrayContaining([
-        "*.txt",
-        "*.txt",
-        "*.js",
-        "*.js",
-      ]));
+      const patterns = filter.patterns();
+      expect(patterns.filter((p) => p === "*.txt")).toHaveLength(2);
+      expect(patterns.filter((p) => p === "*.js")).toHaveLength(2);

528-579: Add a tree-filter test exercising extraFilters.

filterTreeStructure accepts extraFilters, but tests don’t cover it. Add a focused case to ensure combined rules behave as intended.

@@
   describe("basic filtering", () => {
@@
     it("should handle exclude patterns with include patterns", () => {
@@
       expect(flattened).toEqual([
         "root-file.txt",
         "extracted/DerivedBidiClass.txt",
         "extracted/nested/DeepFile.txt",
       ]);
     });
+
+    it("should respect extraFilters in filterTreeStructure", () => {
+      const filter = createPathFilter([
+        "!**/extracted/**",
+      ]);
+      const result = filterTreeStructure(filter, tree, ["**/*.txt"]);
+      const flattened = flattenFilePaths(result);
+      expect(flattened).toEqual([
+        "root-file.txt",
+        "extracted/DerivedBidiClass.txt",
+        "extracted/nested/DeepFile.txt",
+      ]);
+    });

Also applies to: 664-679

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6422aaf and a9a7edb.

📒 Files selected for processing (1)
  • packages/shared/test/filter.test.ts (11 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: luxass
PR: ucdjs/ucd#6
File: packages/cli/src/cmd/codegen/fields.ts:68-82
Timestamp: 2025-05-04T11:52:22.858Z
Learning: User prefers to address the issue with Node.js API compatibility (`Dirent.parentPath` not being part of Node's API and the `recursive: true` option for `readdir` being only available in Node 21+) in a separate PR rather than the current one.
Learnt from: luxass
PR: ucdjs/ucd#45
File: packages/ucd-store/src/download.ts:24-24
Timestamp: 2025-06-09T05:10:32.105Z
Learning: In the ucd-store package refactor, picomatch was moved from direct usage in download.ts to internal usage within the createPathFilter function in filter.ts. The pattern format is still picomatch-compatible, so JSDoc comments referencing picomatch pattern format remain correct.
🧬 Code graph analysis (1)
packages/shared/test/filter.test.ts (2)
packages/shared/src/filter.ts (3)
  • createPathFilter (104-178)
  • TreeEntry (345-354)
  • filterTreeStructure (356-445)
packages/shared/src/flatten.ts (1)
  • flattenFilePaths (34-62)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: lint
🔇 Additional comments (3)
packages/shared/test/filter.test.ts (3)

164-168: Order-specific precedence: nice catch; last rule wins here.

["!test.js", "*.js"] correctly allows test.js since the last matching include overrides the earlier exclude. This fixes the earlier inconsistency flagged in past reviews.


664-678: Good: tree filtering includes parent dirs when children match.

This aligns with the PR objective to avoid dropping directories whose own path doesn’t match but contain matching children.


636-649: No action: buildRules maps user patterns through expandDirectoryPattern (line 259), covering directory-pattern expansion for user rules.

@luxass
Copy link
Member Author

luxass commented Aug 29, 2025

closing in favour of #223

@luxass luxass closed this Aug 29, 2025
@luxass luxass deleted the filter-fix branch September 29, 2025 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pkg: shared Changes related to the Shared package. pkg: ucd-store Changes related to the UCD Store package.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix: Directory filtering drops valid children when parent path doesn't match filter

1 participant