
Security & Performance Issue: Unlimited Code Output in LLM Context #1588


Issue Description

When executing code that produces large amounts of output (e.g., directory listings, file contents, system information), all output is sent to the LLM in its entirety before being truncated in the final response. This raises both security and performance concerns:

  1. Security Risk:

    • Sensitive information in large outputs (logs, system info, file contents) is sent to the LLM
    • Even if truncated in the final response, the LLM has already processed the complete output
    • This could lead to unintended data exposure
  2. Performance Impact:

    • Unnecessary token consumption when sending large outputs to the LLM
    • Increased API costs
    • Potential context window overflow

Example

# Simple code that generates large output
import os
for root, dirs, files in os.walk("/"):
    print(f"Directory: {root}")
    for file in files:
        print(f"  File: {file}")

Current behavior:

  1. Code executes and generates complete output
  2. Complete output is sent to LLM
  3. LLM processes all output
  4. Response is truncated for display
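For illustration, here is a self-contained sketch of that flow. Every name is hypothetical; none of this is Open Interpreter's actual code.

# Hypothetical, self-contained sketch of the current flow described above.
# None of these names come from Open Interpreter's codebase.

def execute(code: str) -> str:
    """Stand-in for code execution; here we just fabricate a large output."""
    return "\n".join(f"File: /tmp/file_{i}" for i in range(100_000))

def send_to_llm(context: str) -> str:
    """Stand-in for the LLM call; note that `context` is the full output."""
    return f"(model received {len(context):,} characters of output)"

def truncate_for_display(text: str, limit: int = 2_000) -> str:
    return text if len(text) <= limit else text[:limit] + "\n[...truncated...]"

output = execute("os.walk('/')")    # 1. complete output is generated
reply = send_to_llm(output)         # 2.-3. the LLM receives and processes all of it
print(truncate_for_display(reply))  # 4. only the displayed response is truncated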

Proposed Solution

Add output limiting at the source (code execution) level:

  1. Add a configurable max_output_lines or max_output_bytes parameter
  2. Implement truncation during code execution, before sending to LLM
  3. Add clear indicators when output is truncated

This aligns with the project's philosophy of simplicity and security while maintaining core functionality; a rough sketch of such truncation follows.
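A minimal sketch of truncation at the execution layer, assuming the hypothetical max_output_lines / max_output_bytes parameters named above (this is not an existing Open Interpreter API):

# Minimal sketch, assuming hypothetical max_output_lines / max_output_bytes
# parameters; not an existing Open Interpreter API.

def truncate_output(output: str,
                    max_output_lines: int | None = 100,
                    max_output_bytes: int | None = 10_000) -> str:
    """Clip captured output before it is added to the LLM context."""
    truncated = False
    if max_output_lines is not None:
        lines = output.splitlines()
        if len(lines) > max_output_lines:
            output = "\n".join(lines[:max_output_lines])
            truncated = True
    if max_output_bytes is not None and len(output.encode("utf-8")) > max_output_bytes:
        output = output.encode("utf-8")[:max_output_bytes].decode("utf-8", errors="ignore")
        truncated = True
    if truncated:
        # Clear indicator that the model sees a clipped view (point 3 above).
        output += "\n[output truncated before reaching the LLM]"
    return output

Applied at the point where output is captured, the model only ever sees the clipped text, which addresses both the exposure and the token-cost concerns.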

Questions

  1. Would this feature align with the project's scope?
  2. Should this be configurable per execution or as a global setting?
  3. What would be a reasonable default limit?
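On question 2, one possible shape (all names hypothetical) is a global default with a per-execution override:

# Hypothetical configuration sketch: a global default plus a per-execution
# override; names mirror the parameters proposed above.
from dataclasses import dataclass

@dataclass
class OutputLimits:
    max_output_lines: int = 100    # illustrative default; question 3 asks what this should be
    max_output_bytes: int = 10_000

GLOBAL_LIMITS = OutputLimits()     # the global setting

def effective_limits(override: OutputLimits | None = None) -> OutputLimits:
    """A per-execution override, when given, wins over the global default."""
    return override or GLOBAL_LIMITS

print(effective_limits(OutputLimits(max_output_lines=20)))  # per-execution override
print(effective_limits())                                   # falls back to the global default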

Additional Context

This issue was discovered while building a service on top of Open Interpreter's API; debug logs and token-usage metrics showed the complete output being sent to the LLM.

