Integrate export size and token estimation into the selection workflow

@jkjitendra jkjitendra released this 28 Jun 15:04
9710022

Release v0.1.6 adds estimates of total export size and token count, so users can gauge a codebase against LLM context window limits before starting an export.

Key Features & Enhancements:

  1. Token & Size Estimation Engine (apps/desktop/src/renderer/lib/tokenEstimate.ts)
    • Implemented a dependency-free token estimation utility using the common heuristic of roughly 1 token per 4 bytes.
    • Defined export wrapper overhead constants: a 500-byte header overhead and a 120-byte per-file Markdown formatting overhead.
    • Added state classifiers to calculate context status against common LLM context window thresholds: 128K, 200K, and 1M tokens.
    • Added utility functions to format raw bytes and token counts into human-readable strings.
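A minimal sketch of how such an estimator could look. Only the numeric constants (4 bytes per token, 500-byte header, 120 bytes per file) and the three thresholds come from these notes. Every function and constant name is hypothetical, and the rounding rule, the decimal reading of "128K", and the 1024-based byte units are assumptions rather than the actual contents of tokenEstimate.ts.

```typescript
// Sketch only: names are hypothetical, not the real tokenEstimate.ts exports.
// Values below are taken from the release notes.
const BYTES_PER_TOKEN = 4;
const HEADER_OVERHEAD_BYTES = 500;
const PER_FILE_OVERHEAD_BYTES = 120;

// Assumption: "128K", "200K" and "1M" are read as decimal token counts.
const CONTEXT_LIMITS = [
  { label: "128K", tokens: 128000 },
  { label: "200K", tokens: 200000 },
  { label: "1M", tokens: 1000000 },
];

// Raw file bytes plus one header and one Markdown wrapper per file.
function estimateExportBytes(rawBytes: number, fileCount: number): number {
  return rawBytes + HEADER_OVERHEAD_BYTES + fileCount * PER_FILE_OVERHEAD_BYTES;
}

// Assumption: a partial token rounds up to a whole token.
function estimateTokens(bytes: number): number {
  return Math.ceil(bytes / BYTES_PER_TOKEN);
}

// Label of the smallest listed window that holds the estimate, or null if none does.
function smallestFittingLimit(tokens: number): string | null {
  const hit = CONTEXT_LIMITS.find((limit) => tokens <= limit.tokens);
  return hit ? hit.label : null;
}

// Assumption: binary units (1 KB = 1024 bytes), one decimal place.
function formatBytes(bytes: number): string {
  if (bytes < 1024) return `${bytes} B`;
  if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`;
  return `${(bytes / (1024 * 1024)).toFixed(1)} MB`;
}

function formatTokens(tokens: number): string {
  if (tokens < 1000) return `${tokens}`;
  if (tokens < 1000000) return `${(tokens / 1000).toFixed(1)}K`;
  return `${(tokens / 1000000).toFixed(1)}M`;
}
```

Under these assumptions, three files totalling 10,000 raw bytes give 10,000 + 500 + 3 × 120 = 10,860 export bytes, or about 2,715 tokens.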

  2. Selection Summary Calculation (apps/desktop/src/renderer/lib/selection.ts)
    • Extended the selection model summary to aggregate raw size measurements of all checked file nodes.
    • Added the export header and per-file Markdown wrapper overhead to the total byte count.
    • Integrated the estimation engine to expose estimated total bytes and token counts dynamically.
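The aggregation described above might be sketched as follows. The `FileNode` and `SelectionSummary` shapes and the `summarizeSelection` name are hypothetical, chosen for illustration; the real selection model in selection.ts may represent nodes and checked state differently. The sketch assumes that only checked leaf files count and that the header overhead is always added.

```typescript
// Hypothetical node shape; the real selection model may differ.
interface FileNode {
  path: string;
  size: number; // raw bytes; ignored for directories
  checked: boolean;
  children?: FileNode[]; // present only on directories (assumption)
}

interface SelectionSummary {
  fileCount: number;
  rawBytes: number;
  estimatedBytes: number;
  estimatedTokens: number;
}

// Constants from the release notes.
const HEADER_BYTES = 500;
const PER_FILE_BYTES = 120;
const TOKEN_BYTES = 4;

function summarizeSelection(roots: FileNode[]): SelectionSummary {
  let fileCount = 0;
  let rawBytes = 0;
  const stack: FileNode[] = [...roots];

  // Iterative walk: descend into directories, count checked files.
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.children) {
      stack.push(...node.children);
    } else if (node.checked) {
      fileCount += 1;
      rawBytes += node.size;
    }
  }

  const estimatedBytes = rawBytes + HEADER_BYTES + fileCount * PER_FILE_BYTES;
  // Assumption: partial tokens round up.
  const estimatedTokens = Math.ceil(estimatedBytes / TOKEN_BYTES);
  return { fileCount, rawBytes, estimatedBytes, estimatedTokens };
}

// Example tree with two checked files (1000 + 300 raw bytes).
const exampleSummary = summarizeSelection([
  {
    path: "src",
    size: 0,
    checked: true,
    children: [
      { path: "src/a.ts", size: 1000, checked: true },
      { path: "src/b.ts", size: 2000, checked: false },
    ],
  },
  { path: "README.md", size: 300, checked: true },
]);
```

Here the two checked files contribute 1,300 raw bytes, and the overhead adds 500 + 2 × 120, for an estimated 2,040 bytes.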

  3. UI-level Statistics & Context Badges (apps/desktop/src/renderer/App.tsx)
    • Added user interface rows for rendering human-readable totals for export size and estimated tokens.
    • Rendered dynamic, color-coded badges (green, amber, or red status flags) corresponding to 128K, 200K, and 1M token limits.
    • Added styles for the statistics row, alignment details, and limits state alerts.
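One way the badge colors could be derived is sketched below. These notes do not state the rule that maps an estimate to green, amber, or red, so the 80% amber cutoff, the `contextBadges` name, and the `ContextBadge` type are all assumptions made for illustration. The sketch is framework-agnostic and leaves rendering to the component.

```typescript
type BadgeColor = "green" | "amber" | "red";

interface ContextBadge {
  label: string;
  color: BadgeColor;
}

// Limits from the release notes, read as decimal token counts (assumption).
const BADGE_LIMITS = [
  { label: "128K", tokens: 128000 },
  { label: "200K", tokens: 200000 },
  { label: "1M", tokens: 1000000 },
];

// Assumption: a badge turns amber once the estimate reaches 80% of its limit.
const AMBER_RATIO = 0.8;

// One badge per limit: red if the estimate exceeds it, amber if close, else green.
function contextBadges(estimatedTokens: number): ContextBadge[] {
  return BADGE_LIMITS.map(({ label, tokens: limit }) => {
    let color: BadgeColor = "green";
    if (estimatedTokens > limit) {
      color = "red";
    } else if (estimatedTokens >= limit * AMBER_RATIO) {
      color = "amber";
    }
    return { label, color };
  });
}
```

With this rule, an estimate of 170,000 tokens would show the 128K badge in red, the 200K badge in amber, and the 1M badge in green.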

  4. Testing & Verification (apps/desktop/tests/tokenEstimate.test.ts, apps/desktop/tests/selection.test.ts)
    • Implemented a dedicated suite of unit tests checking boundary conditions, ceiling rules, formatting, and badge transitions.
    • Updated the selection model tests to verify correct total byte accumulation and metadata overhead insertion.