
native support for grep #4368

Open
lorenzejay wants to merge 19 commits into main from lorenze/feat/grep-tool

Conversation

@lorenzejay (Collaborator) commented Feb 4, 2026

Note

Medium Risk
Introduces a new filesystem-search capability that touches path handling and regex execution; while guarded (path restriction, sensitive-file filtering, size/binary limits), bugs could still leak data or cause performance issues if misconfigured.

Overview
Adds a new GrepTool that performs recursive regex search over local files with glob filtering, context lines, and multiple output modes, plus safety controls (cwd-only path restriction by default, sensitive-file exclusions, binary/size skipping, output truncation, and basic ReDoS mitigations).

Exposes GrepTool via the package/tool __init__ exports, adds comprehensive unit tests, and regenerates tool.specs.json to include the new tool while also removing some previously listed init/required fields from other tool schemas (e.g. Couchbase cluster and Oxylabs/Tavily client params).

Written by Cursor Bugbot for commit 2c78e60.


pattern = glob_pattern or "*"
files: list[Path] = []
for p in search_path.rglob(pattern):
I don't think pathlib.rglob supports brace expansion. Any use of {*.py,*.txt} will yield empty results.
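The comment is correct: pathlib's glob/rglob implements `*`, `?`, `**`, and character classes, but not shell-style brace expansion, so `{*.py,*.txt}` is matched literally. A minimal sketch of a pre-expansion helper (the name `expand_braces` is hypothetical, not from the PR) that turns one pattern into several plain glob patterns:

```python
import re

def expand_braces(pattern: str) -> list[str]:
    # Expand the first {a,b,c} group into separate patterns, then recurse
    # so that multiple or nested groups are handled as well.
    m = re.search(r"\{([^{}]*)\}", pattern)
    if not m:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    results: list[str] = []
    for alt in m.group(1).split(","):
        results.extend(expand_braces(head + alt + tail))
    return results
```

Each returned pattern can then be passed to `rglob` individually and the results merged, which is roughly what the later "brace expansion support" commit describes.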

# Compile regex
flags = re.IGNORECASE if case_insensitive else 0
try:
compiled = re.compile(pattern, flags)

This is a ReDoS risk: a malicious or careless pattern like (a+)+$, (.*a){20}, or (\w+\s?)*$ may cause catastrophic backtracking when matched against certain lines.

Formatted search results as a string.
"""
# Resolve search path
search_path = Path(path) if path else Path(os.getcwd())

This allows traversal to any absolute path on the system.


Along with the comments below, there is a risk of sensitive content leakage: any .env, .netrc, .npmrc, etc. can appear in a result.
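The later commit adds exactly this kind of exclusion. A sketch of a filename-based filter (this blocklist is illustrative; the PR's actual list may differ):

```python
from pathlib import Path

# Assumed blocklist of credential-bearing files.
SENSITIVE_NAMES = {".env", ".netrc", ".npmrc", ".pypirc", "id_rsa", "id_ed25519"}
SENSITIVE_SUFFIXES = {".pem", ".key"}

def is_sensitive(path: Path) -> bool:
    # Match exact well-known filenames, .env variants, and key/cert extensions.
    name = path.name
    return (
        name in SENSITIVE_NAMES
        or name.startswith(".env.")          # .env.local, .env.production, ...
        or path.suffix in SENSITIVE_SUFFIXES
    )
```

Name-based filtering is best-effort only; secrets in arbitrarily named files (e.g. `config.yaml`) still require the path-restriction guard above.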


pattern = glob_pattern or "*"
files: list[Path] = []
for p in search_path.rglob(pattern):

There is also a risk of symlink traversal here.


try:
with open(file_path, encoding="utf-8", errors="replace") as f:
lines = f.readlines()

We need a size check here as a fast follow; a multi-GB file will be read fully into memory.

default=False,
description="Whether to perform case-insensitive matching",
)
context_lines: int = Field(

We should set an upper bound here.
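Since the field is a Pydantic `Field`, the declarative fix would be a `ge=0, le=MAX_CONTEXT_LINES` constraint; the equivalent logic as a plain clamp (the cap value of 10 is an assumption, the PR only names the MAX_CONTEXT_LINES constant):

```python
MAX_CONTEXT_LINES = 10  # assumed cap; the PR's actual value may differ

def clamp_context_lines(requested: int) -> int:
    # Keep the value in [0, MAX_CONTEXT_LINES]: an unbounded context_lines
    # multiplies output size per match and can blow past truncation limits.
    return max(0, min(requested, MAX_CONTEXT_LINES))
```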

lorenzejay and others added 3 commits February 11, 2026 20:44
…d brace expansion support

- Added MAX_REGEX_LENGTH to limit regex pattern length and prevent ReDoS.
- Introduced allow_unrestricted_paths option to enable searching outside the current working directory.
- Implemented brace expansion for glob patterns to support multiple file types.
- Enhanced error handling for path traversal and regex compilation.
- Updated tests to cover new features and ensure robustness.
lorenzejay and others added 10 commits February 12, 2026 09:24
- Added MAX_CONTEXT_LINES to define the upper limit for context lines shown in search results.
- Introduced MAX_FILE_SIZE_BYTES to skip files larger than 10 MB during searches.
- Implemented logic to exclude sensitive files (e.g., .env, .netrc) from search results to prevent accidental leakage of credentials.
- Updated tests to validate sensitive file exclusion and file size limits, ensuring robustness in handling sensitive content.
@cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.


return None
finally:
signal.alarm(0)
signal.signal(signal.SIGALRM, old_handler)

SIGALRM crashes when tool runs in worker threads

High Severity

_safe_search calls signal.signal(signal.SIGALRM, ...) which raises ValueError: signal only works in main thread when invoked from a non-main thread. CrewAI's crew_agent_executor.py runs tool calls inside a ThreadPoolExecutor, so this will crash in normal usage. The except TimeoutError block doesn't catch ValueError, so the exception propagates up unhandled and kills the tool invocation entirely.
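The standard workaround is to only arm SIGALRM when it is actually usable, i.e. in the main thread of a Unix process, and degrade gracefully otherwise. A sketch under that assumption (mirroring the `_safe_search` shape shown in the snippet above, but not the PR's actual code):

```python
import re
import signal
import threading

def safe_search(compiled: re.Pattern, line: str, timeout_s: int = 1):
    # signal.signal raises ValueError outside the main thread, and SIGALRM
    # does not exist on Windows; in both cases fall back to an unguarded
    # search instead of crashing the tool invocation.
    if (threading.current_thread() is not threading.main_thread()
            or not hasattr(signal, "SIGALRM")):
        return compiled.search(line)

    def _on_alarm(signum, frame):
        raise TimeoutError

    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_s)
    try:
        return compiled.search(line)
    except TimeoutError:
        return None
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
```

Note the fallback path has no timeout at all, which is exactly the gap the third Bugbot issue below points out; a linear-time regex engine would close it properly.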


continue
files.append(p)
if len(files) >= MAX_FILES:
break

Symlink traversal bypasses path restriction boundary

Medium Severity

_collect_files uses rglob which follows symlinks, and never verifies that discovered files' real paths remain within the search boundary. A symlink inside the search directory pointing to an arbitrary location (e.g., /etc/shadow) would be followed, read, and its contents returned. Neither is_symlink() nor resolve() is called on individual file paths before reading them.
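A containment check on each candidate's resolved path closes this hole. A hypothetical sketch (the helper name is illustrative):

```python
from pathlib import Path

def within_boundary(candidate: Path, root: Path) -> bool:
    # Resolve symlinks on the individual file before reading it: a link
    # inside the tree can point anywhere (e.g. /etc/shadow), so checking
    # the unresolved path is not enough.
    try:
        return candidate.resolve().is_relative_to(root.resolve())
    except OSError:
        return False
```

Calling this on every path yielded by `rglob`, before the file is opened, rejects symlinks whose targets escape the search root while still allowing links that stay inside it.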


if self._safe_search(compiled_pattern, line):
match_line_nums.append(i)
if len(match_line_nums) >= MAX_MATCHES_PER_FILE:
break

ReDoS mitigation absent — regex runs on untruncated lines

Medium Severity

The _safe_search docstring claims Windows is "bounded by MAX_LINE_LENGTH truncation applied earlier in the pipeline," but _search_file calls _safe_search on full, untruncated lines. Truncation only occurs later during output formatting. On Windows (no SIGALRM), and in non-main threads on Unix (where SIGALRM crashes), there is no effective ReDoS protection — a malicious pattern against a long line can hang indefinitely.
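The fix is to make the docstring's claim true by truncating before matching rather than only at output-formatting time. A sketch (the constant value and helper name are assumptions):

```python
import re

MAX_LINE_LENGTH = 2000  # assumed value of the PR's truncation constant

def search_line(compiled: re.Pattern, line: str):
    # Truncate before matching, not just before output formatting, so the
    # input-length bound actually limits regex work on platforms without
    # SIGALRM (Windows) and in worker threads.
    return compiled.search(line[:MAX_LINE_LENGTH])
```

This changes semantics slightly: a match that begins beyond MAX_LINE_LENGTH is no longer reported, which is usually an acceptable trade for bounded worst-case matching cost.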

Additional Locations (1)

@github-actions (Contributor)

This PR is stale because it has been open for 45 days with no activity.

