feat: add filter parameter to file_browser for flexible filtering #8751

Open
by22Jy wants to merge 2 commits into marimo-team:main from by22Jy:feat-file-browser-filter

by22Jy commented Mar 18, 2026

Summary

Adds a filter parameter to mo.ui.file_browser() for more flexible file filtering beyond simple file extensions.

Fixes #8399

Changes

  • Added filter parameter to file_browser.__init__() that accepts:
    • Regex pattern string (matched against file.name using re.match)
    • Callable that takes a Path and returns bool
  • Added _should_include_file() method to handle filtering logic
  • Updated _list_directory() to use the new unified filtering method
  • Added examples to docstring demonstrating regex and callback filters
  • Added import re at the top of the file

Implementation Details

Filter Priority: filter > filetypes

When filter is provided, it takes precedence over filetypes. If filter is None, the existing filetypes behavior is used (backward compatible).

Error Handling:

  • Invalid regex patterns: logged as warning, file excluded
  • Callback exceptions: logged as warning, file excluded

Backward Compatibility: ✅

  • filetypes parameter retained
  • Default behavior unchanged (filter=None)

Examples

Regex Filter

# Only show Python files
file_browser = mo.ui.file_browser(
    initial_path=Path("."),
    filter=r".*\.py$"
)

# Only show files starting with "test_"
file_browser = mo.ui.file_browser(
    initial_path=Path("."),
    filter=r"^test_.*"
)

Callback Filter

# Only show files larger than 1MB
def large_files(path: Path) -> bool:
    return path.is_file() and path.stat().st_size > 1_000_000

file_browser = mo.ui.file_browser(
    initial_path=Path("."),
    filter=large_files
)

Testing

This PR includes only the core functionality and does not yet add tests. Tests should cover:

  • Regex filter with valid patterns
  • Regex filter with invalid patterns (error handling)
  • Callback filter with normal functions
  • Callback filter with exceptions (error handling)
  • Filter priority over filetypes
  • Backward compatibility (filetypes still works)
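The cases listed above could be sketched roughly as follows. This is not the PR's test code: `should_include` is a hypothetical stand-in that mirrors the behavior described in this PR (filter first, then filetypes, errors treated as "exclude"), so the checks run without marimo installed. In the repository they would presumably become pytest cases against `file_browser` itself.

```python
import re
from pathlib import Path
from typing import Callable, Optional, Sequence, Union

FilterType = Union[str, Callable[[Path], bool]]


def should_include(
    path: Path,
    is_directory: bool,
    filter: Optional[FilterType] = None,
    filetypes: Sequence[str] = (),
) -> bool:
    """Hypothetical stand-in mirroring the behavior described in this PR."""
    if filter is not None:
        try:
            if isinstance(filter, str):
                return bool(re.match(filter, path.name))
            return bool(filter(path))
        except Exception:
            # Invalid regex or failing callback: exclude the entry.
            return False
    if filetypes and not is_directory:
        return path.suffix.lower() in filetypes
    return True


def check_filter_cases() -> None:
    py, txt = Path("test_a.py"), Path("notes.txt")

    # Regex filter with a valid pattern
    assert should_include(py, False, filter=r".*\.py$")
    assert not should_include(txt, False, filter=r".*\.py$")

    # Regex filter with an invalid pattern: the entry is excluded
    assert not should_include(py, False, filter="[")

    # Callback filter, normal and raising
    assert should_include(py, False, filter=lambda p: p.suffix == ".py")

    def boom(p: Path) -> bool:
        raise RuntimeError("callback failed")

    assert not should_include(py, False, filter=boom)

    # Filter takes priority over filetypes
    assert should_include(py, False, filter=r".*\.py$", filetypes=[".txt"])

    # Backward compatibility: filetypes alone still works
    assert should_include(txt, False, filetypes=[".txt"])
    assert not should_include(py, False, filetypes=[".txt"])


check_filter_cases()
```

Each assertion group maps to one bullet in the list, which may make it easier to split them into separate test functions later.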

Notes

  • Does not modify frontend code (all filtering is backend-side)
  • Recursive directory checking (_has_files_recursive) still uses filetypes only, not the filter parameter (for simplicity and consistency)
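  • Uses re.match() for regex matching (not re.search()), so the pattern is anchored at the start of file.name; a pattern without a trailing $ also accepts names that merely begin with a match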
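For readers less familiar with the difference, here is a small standard-library illustration of how `re.match`, `re.search`, and `re.fullmatch` treat the same filenames. The filenames are made up for the example.

```python
import re

names = ["test_utils.py", "my_test_data.py", "notes.py.bak"]

# re.match anchors only at the start of the string.
match_hits = [n for n in names if re.match(r"test_", n)]

# re.search finds the pattern anywhere in the string.
search_hits = [n for n in names if re.search(r"test_", n)]

# Without a trailing "$", re.match accepts a prefix match,
# so "notes.py.bak" slips through an extension-style pattern.
prefix_hits = [n for n in names if re.match(r".*\.py", n)]

# re.fullmatch requires the whole name to match.
full_hits = [n for n in names if re.fullmatch(r".*\.py", n)]

print(match_hits)   # ['test_utils.py']
print(search_hits)  # ['test_utils.py', 'my_test_data.py']
print(prefix_hits)  # all three names
print(full_hits)    # ['test_utils.py', 'my_test_data.py']
```

This is why the regex examples in this description end with `$` when they are meant to match an extension.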

Related

vercel bot commented Mar 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| marimo-docs | Ready | Preview, Comment | Mar 18, 2026 4:09am |

@github-actions

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

Contributor

Copilot AI left a comment

Pull request overview

Adds a new backend-side filter parameter to mo.ui.file_browser() to support flexible file inclusion rules (regex or callback) beyond the existing filetypes extension filtering.

Changes:

  • Introduces a filter parameter to file_browser.__init__ (regex string via re.match or Callable[[Path], bool]).
  • Adds _should_include_file() and updates _list_directory() to use unified filtering logic.
  • Expands the file_browser docstring with usage examples for regex and callback filtering.

Comment on lines +278 to +316
def _should_include_file(self, file: Path, is_directory: bool) -> bool:
    """Determine if a file should be included based on filter/filetypes.

    Args:
        file: The file path to check
        is_directory: Whether the path is a directory

    Returns:
        bool: True if the file should be included, False otherwise
    """
    # Priority: filter > filetypes
    if self._filter is not None:
        # Apply custom filter
        if isinstance(self._filter, str):
            # Regex filter (match against file.name)
            try:
                return bool(re.match(self._filter, file.name))
            except re.error:
                # Invalid regex pattern, treat as no match
                LOGGER.warning(
                    f"Invalid regex pattern in filter: {self._filter}"
                )
                return False
        else:
            # Callback filter
            try:
                return self._filter(file)
            except Exception as e:
                # Callback raised exception, log and exclude file
                LOGGER.warning(
                    f"Filter callback raised exception for {file}: {e}"
                )
                return False

    # Fall back to filetypes
    if self._filetypes and not is_directory:
        return file.suffix.lower() in self._filetypes

    return True

Copilot AI Mar 20, 2026

New filter behavior is not covered by the existing tests/_plugins/ui/_impl/test_file_browser.py suite (which currently focuses on filetypes and ignore_empty_dirs). Adding tests for regex filters (valid/invalid), callback filters (normal/exception), and filter precedence over filetypes would help prevent regressions—especially around directory visibility/navigation when filter is set.

Copilot uses AI. Check for mistakes.
Comment on lines +288 to +295
# Priority: filter > filetypes
if self._filter is not None:
    # Apply custom filter
    if isinstance(self._filter, str):
        # Regex filter (match against file.name)
        try:
            return bool(re.match(self._filter, file.name))
        except re.error:

Copilot AI Mar 20, 2026

When filter is provided, _should_include_file applies it to directories too (no is_directory guard). This can hide folders and prevent users from navigating into them, unlike the existing filetypes behavior which only filters non-directories. Consider always including directories (or only applying filter when not is_directory, unless selection_mode intentionally wants directory-name filtering).
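One way to read this suggestion is a guard that lets directories bypass the filter entirely. The sketch below is a standalone illustration, not the PR's code: `include_entry` is a hypothetical function, and it deliberately ignores the `selection_mode` caveat mentioned above.

```python
import re
from pathlib import Path
from typing import Callable, Union


def include_entry(
    path: Path,
    is_directory: bool,
    filter: Union[str, Callable[[Path], bool]],
) -> bool:
    """Hypothetical guard: apply the filter to files only."""
    # Directories bypass the filter so users can still navigate into them.
    if is_directory:
        return True
    if isinstance(filter, str):
        return bool(re.match(filter, path.name))
    return bool(filter(path))


# A directory named "src" stays visible even though it does not end in ".py".
print(include_entry(Path("src"), True, r".*\.py$"))      # True
print(include_entry(Path("notes.txt"), False, r".*\.py$"))  # False
```

If directory-name filtering is ever wanted for directory selection, the guard would need an extra condition rather than an unconditional `return True`.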

Comment on lines +293 to +300
try:
    return bool(re.match(self._filter, file.name))
except re.error:
    # Invalid regex pattern, treat as no match
    LOGGER.warning(
        f"Invalid regex pattern in filter: {self._filter}"
    )
    return False

Copilot AI Mar 20, 2026

Regex filtering currently calls re.match(self._filter, file.name) for every entry, and an invalid pattern will raise/log on every file, potentially spamming logs and adding avoidable overhead. Consider compiling/validating the regex once in __init__ (store a compiled re.Pattern or disable filtering after one warning) and then using pattern.match(...) during listing.

Suggested change
- try:
-     return bool(re.match(self._filter, file.name))
- except re.error:
-     # Invalid regex pattern, treat as no match
-     LOGGER.warning(
-         f"Invalid regex pattern in filter: {self._filter}"
-     )
-     return False
+ # Compile the pattern once and cache it; if invalid, log once.
+ invalid = getattr(self, "_filter_invalid", False)
+ if invalid:
+     # Previously determined to be invalid; treat as no match.
+     return False
+ pattern = getattr(self, "_compiled_filter", None)
+ if pattern is None:
+     try:
+         pattern = re.compile(self._filter)
+     except re.error:
+         # Invalid regex pattern, treat as no match and log once.
+         LOGGER.warning(
+             f"Invalid regex pattern in filter: {self._filter}"
+         )
+         setattr(self, "_filter_invalid", True)
+         return False
+     else:
+         setattr(self, "_compiled_filter", pattern)
+ return bool(pattern.match(file.name))
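The comment also mentions compiling in `__init__` as an alternative to the lazy caching in the suggested change. A minimal sketch of that eager variant, assuming a hypothetical `NameFilter` helper class (not part of marimo or this PR), might look like this:

```python
import logging
import re

LOGGER = logging.getLogger(__name__)


class NameFilter:
    """Hypothetical helper: compile the regex once, reuse it per entry."""

    def __init__(self, pattern: str) -> None:
        try:
            self._pattern = re.compile(pattern)
        except re.error:
            # Log a single warning at construction time instead of per file.
            LOGGER.warning("Invalid regex pattern in filter: %s", pattern)
            self._pattern = None

    def matches(self, name: str) -> bool:
        # An invalid pattern matches nothing, as in the PR's current behavior.
        if self._pattern is None:
            return False
        return bool(self._pattern.match(name))


py_only = NameFilter(r".*\.py$")
print(py_only.matches("app.py"))   # True
print(py_only.matches("app.txt"))  # False
```

Compared with the `getattr`/`setattr` approach, this keeps the state explicit on the object and avoids re-checking the cache on every call.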

Comment on lines +301 to +310
else:
    # Callback filter
    try:
        return self._filter(file)
    except Exception as e:
        # Callback raised exception, log and exclude file
        LOGGER.warning(
            f"Filter callback raised exception for {file}: {e}"
        )
        return False

Copilot AI Mar 20, 2026

If filter is neither a str nor a callable, the code falls into the callback branch and will raise TypeError at call time, which gets caught and logged as a warning and silently excludes the file. Since filter is part of the public API, it would be better to validate the type in __init__ and raise a ValueError for unsupported values so misconfiguration is detected early.
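A rough sketch of such early validation follows. `validate_filter` is a hypothetical helper name; the choice of `ValueError` follows the comment above, although `TypeError` would be an equally conventional choice for a wrong argument type.

```python
import re
from pathlib import Path
from typing import Callable, Optional, Union


def validate_filter(
    filter: object,
) -> Optional[Union["re.Pattern[str]", Callable[[Path], bool]]]:
    """Hypothetical helper: normalize a filter value or fail fast."""
    if filter is None:
        return None
    if isinstance(filter, str):
        # re.compile raises re.error for an invalid pattern.
        return re.compile(filter)
    if callable(filter):
        return filter
    raise ValueError(
        "filter must be a regex string or a callable taking a Path, "
        f"got {type(filter).__name__}"
    )


try:
    validate_filter(42)
except ValueError as exc:
    print(exc)
```

Calling this from the constructor would surface a misconfigured filter when the widget is created, not as a stream of warnings while listing a directory.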

@dmadisetti
Collaborator

@by22Jy Were you still interested in contributing this? Happy to give a deeper look once you sign the CLA

Development

Successfully merging this pull request may close these issues.

file browser should have more flexible filtering

3 participants