Search has no file-size or binary-file cap #27

@willwashburn

Problem

crates/wash/src/search.rs:50 builds the Searcher with no size or binary limits:

let mut searcher = SearcherBuilder::new()
    .before_context(opts.context_lines as usize)
    .after_context(opts.context_lines as usize)
    .line_number(true)
    .build();

If a repo contains a 500MB log, a vendored minified bundle, or a stray binary that isn't gitignored, Search reads it in full. ripgrep guards against this with binary detection that stops at the first NUL byte and an optional --max-filesize; we should too.

Proposed fix

On SearcherBuilder:

  • binary_detection(BinaryDetection::quit(b'\x00')) — stop scanning on the first NUL.
  • Add a configurable max file size (default ~10MB), surfaced as a maxFileBytes arg in the Search schema. Skip files over the limit and record them in a skipped field of the response so the agent isn't surprised.
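
The binary-detection half is the one-line builder call shown above. The size guard has to be written by hand, so here is a minimal std-only sketch of how it might look. The names `DEFAULT_MAX_FILE_BYTES`, `SkippedFile`, `check_file_size`, and `partition_by_size` are hypothetical and not taken from the wash codebase, and treating unreadable metadata as "still search it" is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical default for the proposed `maxFileBytes` option (~10MB).
pub const DEFAULT_MAX_FILE_BYTES: u64 = 10 * 1024 * 1024;

/// Hypothetical entry for the proposed `skipped` field of the response.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SkippedFile {
    pub path: PathBuf,
    pub size_bytes: u64,
}

/// Returns `Ok(Some(..))` when the file is over the limit and should be
/// skipped, and `Ok(None)` when it is small enough to search.
pub fn check_file_size(path: &Path, max_file_bytes: u64) -> io::Result<Option<SkippedFile>> {
    // Only metadata is read here, so a 500MB file costs one stat call.
    let size_bytes = fs::metadata(path)?.len();
    if size_bytes > max_file_bytes {
        Ok(Some(SkippedFile {
            path: path.to_path_buf(),
            size_bytes,
        }))
    } else {
        Ok(None)
    }
}

/// Splits candidate paths into files to search and files skipped for size.
pub fn partition_by_size(
    paths: &[PathBuf],
    max_file_bytes: u64,
) -> (Vec<PathBuf>, Vec<SkippedFile>) {
    let mut to_search = Vec::new();
    let mut skipped = Vec::new();
    for path in paths {
        match check_file_size(path, max_file_bytes) {
            Ok(Some(skip)) => skipped.push(skip),
            // Under the limit, or metadata unreadable (assumption: let the
            // searcher surface that error itself rather than hide the file).
            Ok(None) | Err(_) => to_search.push(path.clone()),
        }
    }
    (to_search, skipped)
}
```

Checking the size from metadata before handing the path to the searcher keeps the guard independent of the SearcherBuilder config, so the two changes can land and be tested separately.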

Bonus: also cap total response bytes across all files, separate from maxResults. Today one giant file with one match can return a huge snippet block even though the hit count is 1. A byte budget prevents that.
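
One way the byte budget could work is a small accumulator that every snippet passes through before it is added to the response. This is only a sketch: `ByteBudget`, `take`, and `truncated` are hypothetical names, and cutting the last snippet at a UTF-8 character boundary (rather than dropping it whole) is an assumption about the desired behavior.

```rust
/// Hypothetical accumulator that caps total snippet bytes across all files,
/// independent of how many matches were found.
pub struct ByteBudget {
    remaining: usize,
    truncated: bool,
}

impl ByteBudget {
    pub fn new(max_response_bytes: usize) -> Self {
        ByteBudget {
            remaining: max_response_bytes,
            truncated: false,
        }
    }

    /// Returns the part of `snippet` that still fits in the budget, or
    /// `None` if nothing fits.
    pub fn take<'a>(&mut self, snippet: &'a str) -> Option<&'a str> {
        if snippet.len() <= self.remaining {
            self.remaining -= snippet.len();
            return Some(snippet);
        }
        // The snippet does not fit in full: back up to a char boundary so a
        // multi-byte UTF-8 sequence is never split, then close the budget.
        let mut end = self.remaining;
        while !snippet.is_char_boundary(end) {
            end -= 1;
        }
        self.remaining = 0;
        self.truncated = true;
        let cut = &snippet[..end];
        if cut.is_empty() {
            None
        } else {
            Some(cut)
        }
    }

    /// True once any snippet was cut short or dropped, so the response can
    /// tell the agent that output was limited.
    pub fn truncated(&self) -> bool {
        self.truncated
    }
}
```

With a budget of 10 bytes, taking "hello" and then "wonderful" yields "hello" followed by "wonde", and any later snippet is dropped with `truncated()` reporting true.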

Files

  • crates/wash/src/search.rs — SearcherBuilder config, file-size guard.
  • crates/wash/src/tools/search.rs — schema + plumbing for the new option.
  • Tests for both the binary-skip and size-skip paths.
