Releases: teriyakichild/mcp-condenser

v0.8.4

26 Mar 02:18

v0.8.4 (2026-03-26)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Use original tool name for upstream calls when prefix_tools is enabled (73152fb)

ProxyTool.run() uses self.name to call the upstream MCP server, but when prefix_tools is enabled, self.name gets changed to the prefixed version (e.g. server_name_tool_name). This causes "tool not found" errors from the upstream server which only knows the original name.

Introduces _make_patched_run() which calls the upstream client directly with the original remote name, bypassing ProxyTool.run() entirely so no shared state is mutated and concurrent async calls are safe.
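The closure-based approach can be sketched roughly as follows. This is a minimal illustration, not the project's code: FakeUpstreamClient, its call_tool() method, and make_patched_run() are hypothetical stand-ins, and the real _make_patched_run() signature may differ.

```python
import asyncio


class FakeUpstreamClient:
    """Hypothetical stand-in for the upstream MCP client."""

    def __init__(self):
        self.calls = []

    async def call_tool(self, name, arguments):
        self.calls.append(name)
        return {"tool": name, "arguments": arguments}


def make_patched_run(client, original_name):
    """Return a coroutine function bound to the original upstream tool name."""

    async def run(arguments):
        # original_name is captured by the closure, so renaming the
        # proxy-side tool never changes what is sent upstream.
        return await client.call_tool(original_name, arguments)

    return run


async def demo():
    client = FakeUpstreamClient()
    run = make_patched_run(client, "get_pods")
    # Concurrent calls share no mutable name state.
    results = await asyncio.gather(run({"ns": "a"}), run({"ns": "b"}))
    return client.calls, results


calls, results = asyncio.run(demo())
```

Because the name lives in the closure rather than on the tool object, there is nothing to swap and restore around each call.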


Detailed Changes: v0.8.3...v0.8.4

v0.8.3

26 Mar 01:14

v0.8.3 (2026-03-26)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Address PR review feedback (2a40b53)

Gate _log_residual() diagnostic work behind logger.isEnabledFor(DEBUG) to avoid scanning all_flat when debug logging is disabled.

Fix test docstring/name mismatch — test checks for valid output, not identical output.
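The debug gating described above might look like this sketch. The logger name, the function body, and the returned row count are assumptions made so the effect is observable; only the isEnabledFor() check comes from the release note.

```python
import logging

logger = logging.getLogger("condenser_sketch")  # hypothetical logger name


def log_residual(all_flat):
    """Log residual array fields, scanning rows only when DEBUG is enabled.

    Returns the number of rows scanned so the gating can be observed.
    """
    if not logger.isEnabledFor(logging.DEBUG):
        return 0  # skip the scan entirely
    scanned = 0
    fields = set()
    for row in all_flat:
        scanned += 1
        fields.update(k for k, v in row.items() if isinstance(v, list))
    logger.debug("residual fields: %s", sorted(fields))
    return scanned
```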

  • Eliminate silent data loss for nested arrays in render_table (4d79b4f)

render_table() had no fallback for array fields that couldn't be inlined or extracted as sub-tables — data silently vanished. This affected Prometheus time-series values, Istio routing rules, and any nested arrays that didn't fit the narrow inline/sub-table constraints.

Add residual field tracking: after inlining and sub-table extraction, any remaining array fields are rendered via recursive condense() (for arrays of dicts) or json.dumps() (for primitives/mixed). Sub-tables now also recurse through render_table() so their nested arrays get the same treatment.

Adds 4 new benchmark fixtures (Prometheus, Elasticsearch, Istio CRDs, JSONL access logs) with 60 questions and diagnostic logging for residual fields to guide future heuristic work.

  • Improve identity column detection for compound names like InstanceId (d4bfe90)

Identity column matching used exact last-segment comparison, so compound names like InstanceId, NodeName, SubnetId never matched keywords "id" or "name". This caused poor identity column selection in split-mode wide tables — IamInstanceProfile.Id was chosen over InstanceId, and State.Name over Tags.Name.

Changes:

  • Add _col_matches_keyword() with CamelCase/separator boundary detection
    to avoid false positives (valid, liquid, filename rejected)
  • Score candidates by (cardinality, avg_value_length, depth) via shared
    _identity_score() helper
  • In render_split, limit repeated identity columns to best 2-3 per
    keyword with lazy stats computation
  • Update find_identity_column, _find_identity_from_cleaned, order_columns,
    and render_split to use suffix matching consistently
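A rough sketch of boundary-aware suffix matching, assuming the keyword is lowercase and only the last dotted segment is inspected. The function name mirrors _col_matches_keyword() but the body is an illustration, not the shipped implementation.

```python
def col_matches_keyword(column, keyword):
    """True if the last dotted segment ends with keyword at a word boundary."""
    segment = column.rsplit(".", 1)[-1]
    lowered = segment.lower()
    if lowered == keyword:
        return True
    if not lowered.endswith(keyword):
        return False
    prefix = segment[: -len(keyword)]
    suffix = segment[-len(keyword):]
    # Separator boundary: instance_id, node-name
    if prefix[-1] in "_-":
        return True
    # CamelCase boundary: lowercase letter or digit, then an uppercase
    # letter that starts the keyword (InstanceId, NodeName)
    return suffix[0].isupper() and (prefix[-1].islower() or prefix[-1].isdigit())
```

With this rule InstanceId and subnet_id match "id", while valid and liquid do not, because no boundary precedes the trailing "id".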

Documentation

  • Update token reduction benchmarks with accurate numbers (0558347)

Add 5 new fixtures to the table (App performance, Prometheus, Elasticsearch, Istio, JSONL access logs). Update EC2 from 86.9% to 56.3% — the old number was inflated by silently dropped sub-table data.


Detailed Changes: v0.8.2...v0.8.3

v0.8.2

16 Mar 21:24

v0.8.2 (2026-03-16)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Address review feedback on inline nested arrays (5740d3f)

  • Fix docstring to match actual output format (space-separated, not comma)

  • Cache inline results during feasibility scan to avoid double computation

  • Guard against non-list values being silently overwritten with empty string

  • Inline small nested arrays instead of exploding into sub-tables (b3435bc)

Nested arrays with ≤10 simple items per parent (e.g. topErrorApps) were being flattened into massive sub-tables with non-unique join keys. Now they render inline as compact "name:value" strings in the parent row.

  • Normalize empty lists in inlined fields, add edge-case tests (a361ccf)

  • Empty lists for inlined fields now become "" instead of staying as []

  • Added tests for: >10 items fallback to sub-table, nested dicts not
    inlined, empty array rows get blank, no key column skipped, multi-value
    column format

  • Require consistent keys across items before inlining (f9c7e5f)

Bail out to sub-table when nested array items have different key sets, preventing silent data loss. Added regression test for heterogeneous keys.

  • Sort val_cols for stable output, skip all-empty inlining, improve test assertions (cf6c28f)

  • Sort value columns alphabetically for deterministic inline rendering

  • Require at least one non-empty cached result before inlining a field

  • Assert sub-table presence (not just absence of inline format) in tests
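The inlining rules from this release can be sketched together as below. This is an assumption-laden illustration: inline_items(), the candidate key-column names, and the comma used to join multiple values are hypothetical; the ≤10 item limit, consistent-key check, sorted value columns, empty-list normalization, and space-separated "name:value" output come from the notes above.

```python
def inline_items(items, max_items=10):
    """Render a small list of flat dicts as 'name:value' pairs.

    Returns None when the caller should fall back to a sub-table.
    """
    if not isinstance(items, list):
        return None  # never overwrite non-list values
    if not items:
        return ""  # empty lists normalize to an empty string
    if len(items) > max_items:
        return None
    if not all(isinstance(item, dict) for item in items):
        return None
    keys = set(items[0])
    if any(set(item) != keys for item in items):
        return None  # heterogeneous keys would lose data
    if any(isinstance(v, (dict, list)) for item in items for v in item.values()):
        return None  # nested values are not inlined
    # Hypothetical choice of key column
    key_col = next((k for k in ("name", "key", "id") if k in keys), None)
    if key_col is None:
        return None
    val_cols = sorted(keys - {key_col})  # sorted for deterministic output
    parts = []
    for item in items:
        values = ",".join(str(item[col]) for col in val_cols)
        parts.append(f"{item[key_col]}:{values}")
    return " ".join(parts)
```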

Chores

  • Update uv.lock to match v0.8.0 (068eb34)

Detailed Changes: v0.8.1...v0.8.2

v0.8.1

04 Mar 15:28

v0.8.1 (2026-03-04)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Use isfinite/is_integer guard in fmt(), add edge-case tests (041687d)

Address Copilot review feedback:

  • Use math.isfinite() + val.is_integer() instead of val == int(val) to
    avoid OverflowError/ValueError on inf/nan before the magnitude check
  • Use <= 2**53 to include the largest exactly-representable integer
  • Add TestFmt covering None, bool, int, string, whole floats, fractional
    floats, boundary values (2**53, 2**54), inf, and nan
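A minimal sketch of the guard, assuming fmt() renders None as an empty string and booleans as lowercase words (both assumptions; only the float branch is described above):

```python
import math


def fmt(val):
    """Format a scalar, rendering whole floats as integers when exact."""
    if val is None:
        return ""  # assumed rendering
    if isinstance(val, bool):
        return "true" if val else "false"  # assumed rendering
    if isinstance(val, float):
        # isfinite() runs first, so inf/nan never reach int(), and the
        # magnitude check keeps only exactly representable integers.
        if math.isfinite(val) and val.is_integer() and abs(val) <= 2**53:
            return str(int(val))
        return str(val)
    return str(val)
```

The older val == int(val) comparison raises OverflowError for inf and ValueError for nan, which is why the finiteness check has to come before any conversion.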

Continuous Integration

  • Add prerelease workflow for testing docker images (d781660)

Triggered by maintainers commenting /prerelease on a PR, or via workflow_dispatch. Builds and pushes images tagged as pr- to DockerHub. Restricted to OWNER/MEMBER/COLLABORATOR roles.

Performance Improvements

  • Pre-compile is_iso_ts regex at module level (bdb4218)

Moves the ISO timestamp regex from inline re.match() to a module-level compiled pattern. Benchmarks show ~2x speedup (3.3M → 6.9M calls/s), which matters since is_iso_ts is called per-value in timestamp columns.
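The change amounts to the following pattern. The regex itself is a hypothetical approximation of an ISO timestamp matcher, not the one shipped in the module.

```python
import re

# Compiled once at import time instead of on every call.
# The pattern is an assumed approximation of an ISO 8601 timestamp.
_ISO_TS_RE = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")


def is_iso_ts(value):
    """True if value is a string that starts with an ISO-style timestamp."""
    return isinstance(value, str) and _ISO_TS_RE.match(value) is not None
```

Module-level re.match() already caches compiled patterns, so the gain comes mainly from skipping the cache lookup on each call.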

Refactoring

  • Remove dead code, fix float precision, replace print with logging (3122b6d)

  • Remove unused extract_array_fields function and its no-op loop

  • Simplify redundant None/bool branching in preprocess_table row builder

  • Guard fmt() float-to-int conversion with abs(val) < 2**53 for large floats

  • Replace all print(stderr) calls in proxy.py with structured logging


Detailed Changes: v0.8.0...v0.8.1

v0.8.0

23 Feb 21:26

v0.8.0 (2026-02-23)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Install project at build time instead of runtime (4850036)

Add a second uv sync --frozen after copying source to install the project package during the Docker build. Replace uv run entrypoint with direct .venv/bin invocation so nothing is resolved at container start.

  • Remove stale QUESTIONS and match functions from accuracy.py (435e9fe)

The local QUESTIONS dict shadowed the import from fixtures.py, limiting standalone accuracy.py to only 2 fixtures instead of all 8. The local match functions also referenced re without importing it.

  • Split accuracy into separate JSON/TOON tables, remove raw backup (dbe7119)

Split the combined accuracy matrix into separate JSON (baseline) and TOON (balanced profile) tables for readability. Remove raw_results_pre_profiles.json from repo.

Documentation

  • Add CSV/XML to README, note on CSV token behavior (b1305fe)

Update supported formats, CLI examples, benchmark table, and question count. Add note explaining that CSV already being tabular means TOON adds minimal overhead — the value is auto-detection, type inference, and heuristic column elision rather than token reduction.

  • Rename JSON to Raw in README benchmark tables (4d4aaa6)

Features

  • Add app_performance.csv benchmark fixture (ff03d7d)

A 30-microservice x 25-column APM dashboard export designed to exercise TOON heuristics: 4 all-zero columns, 3 all-null columns, and 1 constant column are elided, and the latency and error columns are tuple-grouped (25→13 effective columns). Includes 15 accuracy questions, among them annotation-reading tests. qwen3:4b scores 93% with TOON vs 87% with raw input at 1.6x speed.

  • Add CSV and XML benchmark fixtures with accuracy questions (de3770a)

Add server_metrics.csv (25 servers x 10 columns) and deploy_inventory.xml (20 deployments across 3 environments) as accuracy benchmark fixtures. Update load_sample to handle non-JSON formats via parse_input, add 15 questions each covering direct lookups, filtering, cross-reference, and multi-hop queries.

XML fixture shows 65.6% token reduction from eliminating duplicated tags — exactly the enterprise API use case the condenser targets.

  • Add CSV/TSV parser with type inference (a35492c)

Register a new CSV parser after JSON/YAML in the parser registry. Uses csv.Sniffer for dialect detection (comma, tab, pipe, semicolon), requires 2+ columns and at least one data row. Type normalization converts numeric strings to int/float and empty values to None.
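The detection and normalization steps might be sketched like this. parse_csv() and _coerce() are hypothetical names, and returning None to mean "not CSV" is an assumption about how the registry signals a miss.

```python
import csv
import io


def _coerce(cell):
    """Convert numeric strings to int/float and empty cells to None."""
    if cell == "":
        return None
    for cast in (int, float):
        try:
            return cast(cell)
        except ValueError:
            pass
    return cell


def parse_csv(text):
    """Return a list of row dicts, or None if text does not look like CSV."""
    try:
        dialect = csv.Sniffer().sniff(text, delimiters=",\t|;")
    except csv.Error:
        return None  # no consistent delimiter found
    rows = list(csv.reader(io.StringIO(text), dialect))
    # Require a header plus at least one data row, and 2+ columns.
    if len(rows) < 2 or len(rows[0]) < 2:
        return None
    header, *data = rows
    return [dict(zip(header, (_coerce(c) for c in row))) for row in data]
```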

  • Add format_hint config for per-tool format override (1d8dea4)

ServerConfig gains format_hint and tool_format_hints fields, parsed from FORMAT_HINT / TOOL_FORMAT_HINTS env vars and JSON config. The proxy passes the resolved hint through to parse_input().

  • Add XML parser with tree-to-dict conversion (17ca4db)

Uses xml.etree.ElementTree to parse XML into nested dicts/lists that the existing condense pipeline handles natively. Attributes become @attr keys, repeated child elements become lists, and text values get int/float/bool coercion. Registered after CSV (lowest auto-detect priority).
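A compact sketch of the tree-to-dict conversion under a few assumptions: attribute values stay as strings, mixed text is stored under a hypothetical "#text" key, and empty elements become None. element_to_value() is an illustrative name.

```python
import xml.etree.ElementTree as ET


def _coerce(text):
    """Coerce text to bool, int, or float where it parses cleanly."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def element_to_value(elem):
    """Convert an Element into nested dicts, lists, and scalars."""
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_value(child)
        if child.tag in node:
            # A repeated tag is promoted to a list on its second occurrence.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return _coerce(text) if text else None
    if text:
        node["#text"] = _coerce(text)  # assumed key for mixed content
    return node
```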

Refactoring

  • Extract parser registry into parsers.py (8ba88e5)

Move parse_input() from condenser.py into a new parsers module with an extensible registry (Parser NamedTuple, PARSER_REGISTRY, register_parser). Existing importers continue to work via re-export from condenser.py.
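The registry shape could look roughly like the sketch below. The Parser fields, the format_hint parameter, and the (name, result) return value are assumptions; only the names Parser, PARSER_REGISTRY, register_parser, and parse_input come from the release note.

```python
import json
from typing import Any, Callable, NamedTuple, Optional


class Parser(NamedTuple):
    name: str
    parse: Callable[[str], Optional[Any]]  # returns None when it cannot parse


PARSER_REGISTRY = []


def register_parser(name, parse):
    """Append a parser; registration order is auto-detect priority."""
    PARSER_REGISTRY.append(Parser(name, parse))


def parse_input(text, format_hint=None):
    """Try parsers in order, optionally restricted to one format."""
    for parser in PARSER_REGISTRY:
        if format_hint and parser.name != format_hint:
            continue
        result = parser.parse(text)
        if result is not None:
            return parser.name, result
    return None, None


def _parse_json(text):
    try:
        return json.loads(text)
    except ValueError:
        return None


register_parser("json", _parse_json)
# Hypothetical second parser, only to show ordering.
register_parser("lines", lambda text: text.splitlines() if "\n" in text else None)
```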

  • Rename benchmark format label from json to raw (ad6057f)

The baseline format is the original input (JSON, CSV, XML, etc.), not always JSON. Rename format field, display labels, variable names, and help text throughout accuracy.py and matrix.py. Also add app_performance.csv to matrix DEFAULT_FIXTURES.

  • Rename condense_json/toon_encode_json to format-agnostic names (7d24b54)

condense_json() → condense_text(), toon_encode_json() → toon_encode(). Old names remain as deprecated aliases that emit DeprecationWarning. All internal callers updated to use new names.
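A deprecated alias of this kind is typically a thin wrapper. In this sketch the body of condense_text() is a placeholder and the warning text is an assumption.

```python
import warnings


def condense_text(text):
    """Placeholder body; the real function condenses structured text."""
    return text.strip()


def condense_json(text):
    """Deprecated alias kept for existing callers."""
    warnings.warn(
        "condense_json() is deprecated; use condense_text()",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller
    )
    return condense_text(text)
```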


Detailed Changes: v0.7.0...v0.8.0

v0.7.0

23 Feb 13:58

v0.7.0 (2026-02-23)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Update GitHub URLs from logdna to teriyakichild (4613815)

Documentation

  • Add shields.io badges to README (8890266)

  • Update benchmark reports with balanced profile results (ef06aa7)

Rerun full accuracy matrix (5 models × 4 fixtures) using the balanced profile with TOON-only mode. EC2 accuracy improved from 0% to 80-100% on qwen3 models thanks to wide_table_format=split.

Features

  • Add --profile and --toon-only flags to benchmark matrix (c72ad7c)

Support heuristics profiles and TOON-only mode in the matrix runner, passing resolved heuristics through to token/context table generation. Back up pre-profile raw results for historical comparison.

  • Add heuristics profiles (balanced, compact, precise) (8eb8ba0)

Named profiles provide preset heuristic configurations selectable via config, env var (CONDENSER_PROFILE), or benchmark CLI (--profile). Resolution order: profile defaults → server heuristics → tool_heuristics.
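The resolution order can be illustrated with plain dict merging. The profile contents below are hypothetical values chosen for the example, and resolve_heuristics() is an illustrative name; only the layering order is taken from the note above.

```python
# Hypothetical preset values; the real profiles may set different keys.
PROFILES = {
    "balanced": {"wide_table_format": "split", "max_tuple_size": 4},
    "compact": {"wide_table_format": "split", "max_tuple_size": 8},
    "precise": {"wide_table_format": "vertical", "max_tuple_size": 2},
}


def resolve_heuristics(profile, server_heuristics=None, tool_heuristics=None, tool_name=None):
    """Layer settings: profile defaults, then server, then per-tool."""
    resolved = dict(PROFILES[profile])
    resolved.update(server_heuristics or {})
    resolved.update((tool_heuristics or {}).get(tool_name, {}))
    return resolved
```

Later layers win, so a per-tool override beats a server-level value, which in turn beats the profile default.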

  • Add wide table rendering and per-tool heuristics (05ff812)

Resolve merge conflicts from feat/wide-table-rendering branch. Adds two alternative renderers for tables that exceed a column threshold:

  • vertical: key-value blocks per row, labeled by identity column
  • split: multiple narrow sub-tables grouped by column prefix, with
    identity columns repeated in each

New heuristics: wide_table_threshold (column count trigger), wide_table_format ("vertical" or "split"). Per-tool heuristic overrides via tool_heuristics config. Expanded benchmark questions for K8s fixtures.


Detailed Changes: v0.6.0...v0.7.0

v0.6.0

23 Feb 01:22

Choose a tag to compare

v0.6.0 (2026-02-23)

This release is published under the Apache-2.0 License.

Features

  • Add live progress output to accuracy benchmark (17c854d)

Print per-question status lines to stderr as the benchmark runs, showing pass/fail/skip, format, fixture, question, and elapsed time.

  • Add max_table_columns and elide_mostly_zero_pct heuristics for wide table readability (21ad4bd)

Add two new experimental heuristics to help small LLMs parse wide tables:

  • max_table_columns: caps table width, dropping rightmost columns (identity columns survive via ordering)
  • elide_mostly_zero_pct: removes columns where most values are zero, annotating outliers with identity labels

Also adds --heuristics flag to benchmarks/accuracy.py for testing strategies without code changes, and updates config.py to support float-valued heuristic parameters.

  • Add multi-hop, arithmetic, and ranking benchmark questions (eb47680)

Add 12 harder questions requiring multi-step reasoning: multi-hop lookups, percentage calculations, inverse filtering, ranking beyond top-1, cross-section joins, and reading elided annotation values.

  • Add per-tool heuristic overrides and harder benchmark questions (880e371)

  • tool_heuristics config allows per-tool heuristic overrides that merge
    on top of base server heuristics

  • Fix string value parsing in CONDENSER_HEURISTICS env var (previously
    coerced non-bool strings to True)

  • Add 12 harder cross-reference/comparison/aggregation benchmark questions

  • Expand benchmark suite with multi-model matrix and new fixtures (cda4274)

Add multi-model accuracy benchmark infrastructure:

  • benchmarks/fixtures.py: shared questions (90 across 5 fixtures), match
    functions, and fixture metadata extracted from accuracy.py
  • benchmarks/matrix.py: multi-model orchestrator with resume support,
    incremental saves, and markdown report generation
  • benchmarks/accuracy.py: refactored to import from fixtures.py, added
    per-question error handling and 600s timeout

Add synthetic test fixtures:

  • tests/fixtures/aws_ec2_instances.json: 20 EC2 instances (33K tokens, 87%
    reduction) with deterministic generator script
  • tests/fixtures/db_query_results.json: 150 SQL order rows (26K tokens, 57%
    reduction) with deterministic generator script

Benchmark results across 5 models (qwen3:1.7b/4b, llama3.1:8b, qwen3:14b/30b) show TOON matches or beats JSON accuracy on Kubernetes fixtures (100% TOON on both K8s fixtures for all models 4b+) while achieving 57-87% token reduction.

Document EC2 Tags condensing gap in docs/ec2-tags-fix.md — nested tag arrays are silently dropped during sub-table rendering, making 5 of 15 EC2 questions impossible to answer from TOON output.

  • Pivot Key-Value arrays (AWS Tags) into scalar columns (e0c14fd)

Detect [{Key, Value}] arrays (AWS tag convention) and pivot them into scalar columns on the parent row (e.g. Tags.Name, Tags.Environment) instead of extracting them as cross-referenced sub-tables.
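The pivot can be sketched as follows, assuming detection requires every item to have exactly the keys Key and Value. pivot_key_value() is a hypothetical helper name.

```python
def pivot_key_value(row, field):
    """Pivot a [{Key, Value}] array into scalar columns on the parent row."""
    items = row.get(field)
    if not isinstance(items, list) or not items:
        return row
    if not all(isinstance(i, dict) and set(i) == {"Key", "Value"} for i in items):
        return row  # not the tag convention; leave the row unchanged
    pivoted = {k: v for k, v in row.items() if k != field}
    for item in items:
        pivoted[f"{field}.{item['Key']}"] = item["Value"]
    return pivoted
```

For example, a row with Tags of [{"Key": "Name", "Value": "web-1"}] gains a Tags.Name column with the value "web-1".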

Refactoring

  • Split benchmark reports into separate JSON and TOON tables (eb0cbc4)

The combined JSON/TOON accuracy cells were hard to scan. Split into two independent tables and move context window enablement under a "Local Models" heading since frontier models don't have those limits.


Detailed Changes: v0.5.1...v0.6.0

v0.5.1

21 Feb 04:15

v0.5.1 (2026-02-21)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Improve TOON accuracy with cardinality-aware identity columns and tuple size cap (7becc62)

  • find_identity_column now picks the highest-cardinality column when
    multiple match the same keyword (e.g. podRef.name over network.name)

  • Add max_tuple_size heuristic (default 4) to prevent large positional
    tuple groups that small LLMs misparse

  • Support int-valued heuristics in env config parser

  • Add --num-ctx flag to accuracy benchmark for Ollama context control

  • Add failure logging and detail printing to accuracy benchmark

Documentation

  • Add comprehensive configuration reference (b6fe86b)

Move inline config tables from README to docs/CONFIGURATION.md covering all env vars, config file schema, condensing heuristics, and Helm values.

  • Use latest tag instead of hardcoded version in README (280ec8d)

Detailed Changes: v0.5.0...v0.5.1

v0.5.0

21 Feb 02:17

v0.5.0 (2026-02-21)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Address review feedback on heuristics config (b115c0a)

  • Wrap Heuristics(**cfg.heuristics) in try/except to surface helpful
    error on typos listing valid key names

  • Add config parsing tests for CONDENSER_HEURISTICS env var and
    from_file heuristics dict

  • Add test for invalid heuristic key error message

  • List valid heuristic keys in helm values.yaml comment

Chores

  • Sync uv.lock with pyproject.toml version 0.4.2 (db34015)

Documentation

  • Add Docker Compose and Helm deployment examples (1f63415)

Add examples/ directory with quick-start files for both Docker Compose (single and multi-upstream) and Helm (values files and Helmfile). Link to the new examples from the README.

  • Improve README accuracy and clarity (6b1927a)

Fix broken quick start (add -p 9000:9000, note about host.docker.internal on Linux). Rewrite subheading and How it works for accuracy: drop "JSON objects" since YAML is also supported, replace jargon (elide, homogeneous arrays, numeric tuples) with plain language, explain clustered-timestamp condensing. Clarify TOON_FALLBACK description, add transition between Docker and source-based usage, restore benchmark summary sentence, and document key config file options inline.

  • Revamp README and bump chart appVersion to 0.4.2 (4399dff)

Rewrite README with a leaner get-started-fast structure: docker run quick start, brief proxy usage sections, and env var reference table. Move verbose config examples, header-forwarding docs, and per-server tables out of README in favor of links to examples/ and values.yaml. Update Helm chart appVersion from 0.2.0 to 0.4.2.

Features

  • Add tunable condensing heuristics (6f50782)

Make each preprocessing heuristic in preprocess_table() individually toggleable via config, so users can disable specific elisions (e.g. timestamp clustering) without switching to toon_only mode.

New Heuristics dataclass with 5 boolean fields (all default true): elide_all_zero, elide_all_null, elide_timestamps, elide_constants, group_tuples. Configurable via CONDENSER_HEURISTICS env var or per-server "heuristics" dict in the config file. Helm chart updated with the new config value.
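A sketch of the dataclass together with the typo-friendly error handling from the review fix in this release. The five field names come from the note above; build_heuristics() and the error wording are assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class Heuristics:
    elide_all_zero: bool = True
    elide_all_null: bool = True
    elide_timestamps: bool = True
    elide_constants: bool = True
    group_tuples: bool = True


def build_heuristics(overrides):
    """Build Heuristics from a config dict, listing valid keys on typos."""
    try:
        return Heuristics(**overrides)
    except TypeError as exc:
        valid = ", ".join(f.name for f in fields(Heuristics))
        raise ValueError(
            f"invalid heuristics {overrides!r}; valid keys: {valid}"
        ) from exc
```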

Testing

  • Add accuracy benchmark for TOON condensed output (66912b9)

Ollama-based benchmark that verifies an LLM can answer factual questions from condensed TOON output vs raw JSON. Includes two fixture sets (toolresult.json, toolresult2_small.json) with 21 total questions and configurable model/context settings.


Detailed Changes: v0.4.2...v0.5.0

v0.4.2

20 Feb 23:02

v0.4.2 (2026-02-20)

This release is published under the Apache-2.0 License.

Bug Fixes

  • Commit uv.lock for Docker build (d37c485)

The Dockerfile copies uv.lock but it was gitignored, causing the Docker build to fail with "not found".


Detailed Changes: v0.4.1...v0.4.2