Benchmarks and Token Savings

Rob Joy edited this page Apr 1, 2026 · 1 revision

This page preserves the benchmark-oriented material removed from the README.

Use this page for

  • understanding where SymForge saves context and where it does not

What was measured

Two benchmark styles were used:

  • per-tool comparisons
  • end-to-end workflow comparisons

The goal was to compare typical raw-file agent workflows against indexed SymForge workflows.

Headline result

On large-file and multi-hop code understanding tasks, SymForge usually reduces context consumption dramatically because the server resolves structure first and returns only the focused result.

Representative outcomes from the original measurements:

  • file outline understanding: token reduction often around 90% or more
  • edit-preparation workflows: token reduction often around 90% or more
  • search/reference lookups: smaller but still meaningful savings

Why the savings happen

Traditional agents often do this:

  1. read a large file
  2. grep for a symbol
  3. read a second file
  4. read a third file
  5. repeat until enough context is assembled

SymForge moves that traversal into the server:

  • symbol lookup is indexed
  • reference tracing is indexed
  • related types can be bundled
  • the response omits unrelated file content
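
The difference can be illustrated with a toy model. The sketch below is not SymForge's implementation: `build_index`, `lookup`, and the sample repository are hypothetical names chosen for illustration. It assumes a tiny in-memory repository and a naive index that only recognizes top-level Python `def` lines, and it contrasts the text an agent would load by reading whole files with the text an indexed lookup would return.

```python
# Toy repository: file name -> file contents (hypothetical sample data).
REPO = {
    "billing.py": "def charge(user):\n    return total(user) * 1.2\n\n"
                  "def refund(user):\n    return -total(user)\n",
    "totals.py": "def total(user):\n    return sum(user.items)\n\n"
                 "def unused():\n    return None\n",
}


def build_index(repo: dict[str, str]) -> dict[str, tuple[str, int, int]]:
    """Map each top-level function name to (file, start_line, end_line)."""
    index = {}
    for path, source in repo.items():
        lines = source.splitlines()
        # Line numbers where a top-level function begins.
        starts = [i for i, line in enumerate(lines) if line.startswith("def ")]
        for n, start in enumerate(starts):
            # A function ends where the next one starts, or at end of file.
            end = starts[n + 1] if n + 1 < len(starts) else len(lines)
            name = lines[start][4:].split("(")[0]
            index[name] = (path, start, end)
    return index


def lookup(repo: dict[str, str], index: dict, name: str) -> str:
    """Return only the source lines of the requested symbol."""
    path, start, end = index[name]
    return "\n".join(repo[path].splitlines()[start:end]).strip()


index = build_index(REPO)
focused = lookup(REPO, index, "total")
raw = "".join(REPO.values())  # what a read-everything workflow would load
print(len(focused), "characters returned instead of", len(raw))
```

The index is built once, so each later lookup returns a slice of one file rather than the full contents of every file that might be relevant. On a repository this small the gap is modest, which is consistent with the caveat that small files do not benefit much.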

Caveats

  • Small files do not benefit much.
  • When you need the exact text of a config or documentation file, a raw read is still the right tool.
  • An agent that already knows exact line ranges will see smaller savings.
  • Token counts are approximate and model-dependent.

Original benchmark examples

Representative comparisons from the previous README:

  • get_file_context(outline) vs reading a large source file directly
  • find_references(compact=true) vs grep output
  • get_symbol_context(bundle=true) vs reading several related files
  • explore plus get_symbol_context vs broad repo exploration through repeated reads

Recommended interpretation

Treat these numbers as directional, not contractual.

The durable takeaway is:

  • SymForge saves the most when the agent does not already know where to look
  • SymForge saves less when the task is already narrowed to exact lines
  • SymForge is strongest on orientation, tracing, and edit preparation

Caution

Exact percentages depend on the codebase, file sizes, and the client model. Use the numbers here as a decision aid, not as a guarantee.

If you want to reproduce the original measurements, compare the total output sizes of:

  • your normal read/grep workflow
  • the equivalent SymForge workflow

on the same target and task.
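
One way to make that comparison concrete is to save everything each workflow returned to the agent and compare the totals. The sketch below is a minimal example and not part of SymForge: the function names are hypothetical, the sample strings are synthetic placeholders rather than measured results, and the four-characters-per-token ratio is a rough assumption. A tokenizer matching your client model would give more accurate counts.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the chars-per-token ratio is an assumption."""
    return round(len(text) / chars_per_token)


def reduction_percent(baseline_outputs: list[str], indexed_outputs: list[str]) -> float:
    """Percentage of estimated tokens saved relative to the baseline workflow."""
    baseline = sum(estimate_tokens(out) for out in baseline_outputs)
    indexed = sum(estimate_tokens(out) for out in indexed_outputs)
    if baseline == 0:
        raise ValueError("baseline workflow produced no output")
    return 100.0 * (baseline - indexed) / baseline


# Synthetic placeholders: replace these with the captured tool outputs
# from each workflow, run on the same target and task.
raw_workflow = ["x" * 4000, "y" * 2000]  # stands in for two full-file reads
symforge_workflow = ["z" * 600]          # stands in for one focused response

saved = reduction_percent(raw_workflow, symforge_workflow)
print(f"{saved:.1f}% fewer estimated tokens")
```

Because the estimate is proportional to character count, the percentage is mostly insensitive to the chosen ratio. The absolute token counts are not, so treat them as approximate.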
