Prevent spacegrep from skipping files #3418
Merged
The spacegrep command line can take a single file or folders. It silently skips files based on a heuristic that guesses whether they look like programs, using average line length. This causes markdown files with very long lines to be skipped. The --force option of spacegrep disables this heuristic.

I added a test to check performance on big and/or binary files. Performance is not great and depends on the pattern. Semgrep uses a 10MB limit for all targets, which should be fine for programs with proper indentation but will cause problems on minified files or some binary files that don't contain newlines. Rule authors must ensure that paths are filtered by extension, like this rule. If that's already the case in practice, then we're good.
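For readers unfamiliar with this kind of check, here is a minimal Python sketch of an average-line-length heuristic with a force override. It is not spacegrep's actual implementation: the function names, the `force` parameter, and the threshold of 200 characters are assumptions chosen for illustration only.

```python
# Hypothetical threshold; the real value used by spacegrep is not stated here.
MAX_AVG_LINE_LENGTH = 200.0


def average_line_length(text: str) -> float:
    """Return the mean number of characters per line (0.0 for empty input)."""
    lines = text.splitlines()
    if not lines:
        return 0.0
    return sum(len(line) for line in lines) / len(lines)


def should_scan(text: str, force: bool = False,
                max_avg: float = MAX_AVG_LINE_LENGTH) -> bool:
    """Decide whether a file's contents look enough like a program to scan.

    With force=True the heuristic is bypassed and the file is always scanned.
    """
    if force:
        return True
    return average_line_length(text) <= max_avg
```

Under these assumptions, a markdown file consisting of one 1000-character line would be skipped by default and scanned when `force=True`, which is the situation described above.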
Fixes #2987
PR checklist: