[ENHANCEMENT] Provider‑aware large file reads to prevent context overload #8038

@hannesrudolph

Description

Type

Enhancement

Problem

Reading large files can exceed the model’s context window, leading to timeouts, failures, or stalled requests. A recent attempt to mitigate this (#6319) helped in some cases, but it adds complexity, misses edge cases, and can produce inconsistent behavior.

Context

This affects users who ask the assistant to read very large or minified files, extremely long single‑line documents, or multiple files in one run. In some scenarios, the guidance shown to the assistant suggests using partial line ranges even when partial reads are turned off, which is confusing and inconsistent. Current behavior can also trigger repeated condensing attempts that loop and ultimately stall or fail the task.

Desired behavior

  • Provider‑aware cap: Respect the actual available space in the current conversation and the model’s window to avoid overload.
  • Predictable reads: Use a chunked approach that doesn’t rely on newline boundaries so it works for minified and very long‑line files without special‑casing.
  • Consistent UX: When content can’t fit, show a clear, human‑readable notice and do not prompt the assistant to use features that are disabled.
  • Fast failure with guidance: If a file is too large to read as requested, fail quickly with suggestions (e.g., specify a range or smaller sections).
  • Simple controls: Provide a straightforward option to choose behavior when files are too large (truncate with notice, require explicit ranges, or summarize).
  • Performance‑minded: Avoid heavy pre‑counting heuristics in favor of efficient streaming/chunking with an early stop.
  • Coverage: Include tests for long single‑line files, minified content, disabled partial reads, multi‑range usage, and provider/model variations.
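A minimal sketch of how the provider‑aware cap and newline‑independent chunked read could fit together. Everything here is an assumption for illustration, not the project's implementation: the names `available_tokens` and `read_capped`, the 10% safety margin, and the rough 4‑characters‑per‑token estimate (real tokenizers vary by provider and model).

```python
import io
from dataclasses import dataclass

# Hypothetical heuristic: about 4 characters per token.
# A real implementation would use the provider's own tokenizer or estimate.
CHARS_PER_TOKEN = 4


@dataclass
class ReadResult:
    text: str
    truncated: bool
    chars_read: int


def available_tokens(context_window: int, used_tokens: int,
                     reserved_for_output: int,
                     safety_margin: float = 0.1) -> int:
    """Tokens left for file content in the current conversation."""
    margin = int(context_window * safety_margin)
    return max(0, context_window - used_tokens - reserved_for_output - margin)


def read_capped(stream, token_budget: int, chunk_chars: int = 8192) -> ReadResult:
    """Read fixed-size character chunks until the budget is spent.

    Chunks are counted in characters, not lines, so a minified or
    single-line file is handled the same way as any other file.
    """
    char_budget = token_budget * CHARS_PER_TOKEN
    parts = []
    total = 0
    while total < char_budget:
        chunk = stream.read(min(chunk_chars, char_budget - total))
        if not chunk:
            # Reached end of file before the budget ran out.
            return ReadResult("".join(parts), False, total)
        parts.append(chunk)
        total += len(chunk)
    # Budget spent: peek one character to see whether content remains.
    truncated = bool(stream.read(1))
    return ReadResult("".join(parts), truncated, total)


if __name__ == "__main__":
    budget = available_tokens(context_window=1000, used_tokens=400,
                              reserved_for_output=200)
    one_long_line = io.StringIO("x" * 5000)  # no newlines at all
    result = read_capped(one_long_line, budget)
    print(budget, result.chars_read, result.truncated)
```

Because the loop stops as soon as the budget is spent, the file is never counted or tokenized in full up front, which is one way to read the "early stop" item in the list above.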

Reproduction (current pain)

  1. Turn off partial reads and ask the assistant to read a very large or minified file.
  2. Expected: A clear, immediate message explaining the limit with guidance to read targeted ranges.
  3. Actual: The request may stall, fail, or enter a condensing loop that never resolves, with mixed guidance about using partial ranges.
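One possible shape for the "clear, immediate message" in the expected result, sketched under assumptions: the function name `oversize_notice`, its parameters, and the wording of the message are all hypothetical. The sketch only directs the reader to request a range when partial reads are enabled, which is one interpretation of the consistency requirement in this issue.

```python
def oversize_notice(path: str, file_chars: int, char_budget: int,
                    partial_reads_enabled: bool) -> str:
    """Build a human-readable notice for a file that cannot fit.

    Hypothetical helper: the wording and parameters are illustrative only.
    """
    lines = [
        f"{path} is too large to read in full: about {file_chars:,} "
        f"characters, but only about {char_budget:,} fit in the "
        f"remaining context."
    ]
    if partial_reads_enabled:
        # Only suggest ranges when the feature is actually available.
        lines.append("Request a specific line range or a smaller section.")
    else:
        lines.append(
            "Partial reads are turned off. Turn them on to read a "
            "targeted range, or choose a smaller file."
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(oversize_notice("dist/app.min.js", 2_500_000, 120_000,
                          partial_reads_enabled=False))
```

Returning the notice immediately, before any condensing is attempted, is what would turn the stalled request in step 3 into a fast failure.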

Labels: Issue - Needs Approval (ready to move forward, but waiting on maintainer or team sign-off); enhancement (new feature or request)
