
Releases: amahi2001/python-token-killer

v0.1.1 — Bug fixes, test refactor, improved examples

12 Apr 06:29


What changed since v0.1.0

Bug Fixes

  • Silent data loss in `_shorten_keys`: `timestamp` and `created_at` both mapped to `ts`. When both existed in the same dict, aggressive mode silently dropped one value. Fixed with collision detection.
  • URL corruption in `CodeMinimizer`: `https://example.com` inside string literals was stripped because the `//` comment regex had a broken single-char lookbehind. Replaced with a string-aware regex that matches quoted strings first.
  • `_shorten_dotted_keys` crash on non-string keys: `"." in 1` raised `TypeError`. Fixed with an `isinstance(k, str)` guard.
  • `DiffMinimizer` silently folded `\ No newline at end of file`: the backslash marker was treated as context and lost in large diffs. Fixed by adding `"\\ "` to the significant prefixes.
  • Markdown misdetected as diff: a `---` horizontal rule plus any `@@` mention triggered diff detection. Tightened to require a `diff --git` header or `---` followed by `+++`.
  • `_sample(n=1)` caused `ZeroDivisionError`: `step = (len-1) / (n-1)`. Fixed with early-return guards.
  • ALL-CAPS words abbreviated with the wrong case: `IMPLEMENTATION` became `Impl` instead of `IMPL`. Fixed with a `word.isupper()` check.
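The collision-detection fix can be sketched roughly as follows. This is an illustration only: `shorten_keys` and its tiny abbreviation table are hypothetical stand-ins for the library's internal `_shorten_keys`, which is not shown here.

```python
def shorten_keys(d: dict) -> dict:
    """Abbreviate dict keys, keeping the original key whenever the
    short form would collide with a key already emitted."""
    # Hypothetical abbreviation table; the real one is larger.
    abbrev = {"timestamp": "ts", "created_at": "ts", "description": "desc"}
    out: dict = {}
    for key, value in d.items():
        short = abbrev.get(key, key)
        if short in out:
            # Collision: a different key already claimed this short form,
            # so fall back to the original key instead of overwriting.
            out[key] = value
        else:
            out[short] = value
    return out

shorten_keys({"timestamp": 1, "created_at": 2})
# → {"ts": 1, "created_at": 2}
```

With the pre-fix behavior, the second assignment would have overwritten `out["ts"]` and one value would vanish; the guard trades a slightly longer key for losslessness.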

Test Suite

Refactored 3 monolithic files (1000+ lines each) into 19 focused modules:

tests/
  unit/          # one file per minimizer
  adversarial/   # one file per concern (types, unicode, regex, contracts, mutation, concurrency, performance)
  real_world/    # tool output tests (pytest, cargo, go test, ruff, git, docker, build errors, pipelines)

New Makefile targets: make test-unit, make test-adversarial, make test-real-world

Examples

Replaced abstract examples with runnable demos that show their output:

  • examples/rag_pipeline.py — 3 realistic wiki docs, naive vs ptk context, per-query cost + monthly savings
  • examples/langgraph_agent.py — simulated 3-step agent loop with token savings per step
  • examples/log_triage.py — 400-line CI log → 19-line error triage (96.5% saved)

Other

  • README: lead with before/after, real token counts, cost math
  • uv-only CI: uv sync --locked --only-group <group> per job, built-in caching
  • Dependabot: switched from pip to uv ecosystem (picks up uv.lock + [dependency-groups])
  • SECURITY.md added
  • CodeQL continue-on-error removed (repo is now public)
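The Dependabot switch above corresponds to a config along these lines; the schedule and directory values here are illustrative, not copied from the repo.

```yaml
# .github/dependabot.yml (sketch)
version: 2
updates:
  - package-ecosystem: "uv"   # reads uv.lock and [dependency-groups]
    directory: "/"
    schedule:
      interval: "weekly"
```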

v0.1.0 — Initial Release

12 Apr 04:58


ptk — Python Token Killer

Minimize LLM tokens from Python objects in one call. Zero dependencies. 361 tests.

import ptk

ptk.minimize({"users": [{"name": "Alice", "bio": None}]})
# → {"users":[{"name":"Alice"}]}

ptk(big_api_response, aggressive=True)  # max compression

Install

pip install python-token-killer
# or
uv add python-token-killer

Benchmarks (tiktoken cl100k_base)

| Benchmark | Original | Default | Saved | Aggressive | Saved |
|---|---|---|---|---|---|
| API response (JSON) | 1,450 | 792 | 45.4% | 782 | 46.1% |
| Python module (code) | 2,734 | 2,113 | 22.7% | 309 | 88.7% |
| Server log (58 lines) | 1,389 | 1,388 | 0.1% | 231 | 83.4% |
| 50 user records (list) | 2,774 | 922 | 66.8% | 922 | 66.8% |
| Verbose paragraph (text) | 101 | 96 | 5.0% | 74 | 26.7% |
| Total | 11,182 | 7,424 | 33.6% | 2,627 | 76.5% |
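Figures like the ones above can be reproduced by token-counting before/after strings. A minimal sketch, assuming tiktoken (which the project treats as optional) may or may not be installed; the character-count fallback is a rough proxy, not part of the library.

```python
import json

def count_tokens(text: str) -> int:
    """Count tokens with tiktoken's cl100k_base if available,
    else fall back to a crude ~4-chars-per-token estimate."""
    try:
        import tiktoken  # optional dependency
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        return max(1, len(text) // 4)

original = json.dumps({"users": [{"name": "Alice", "bio": None}]})
minimized = '{"users":[{"name":"Alice"}]}'  # output from the quick-start example

before, after = count_tokens(original), count_tokens(minimized)
print(f"{before} -> {after} tokens ({100 * (before - after) / before:.1f}% saved)")
```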

6 Minimizers

  • Dict — null stripping, key shortening, single-child flattening, kv/tabular formats
  • List — schema-once tabular, dedup with counts, deterministic sampling
  • Code — comment stripping with pragma preservation (noqa, type: ignore, TODO, eslint-disable), docstring collapse, signature extraction (Python, JS, Rust, Go)
  • Log — duplicate line collapse, error-only filtering, stack trace + test runner output preservation
  • Diff — context folding, noise stripping, \ No newline preservation
  • Text — 20+ word abbreviations, 16 phrase abbreviations, filler removal, stopword removal
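The Dict minimizer's null stripping, the technique behind the quick-start example, can be sketched in a few lines. This is an independent illustration of the idea, not the library's implementation:

```python
def strip_nulls(obj):
    """Recursively drop None-valued keys from dicts, descending
    into nested dicts and lists along the way."""
    if isinstance(obj, dict):
        return {k: strip_nulls(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        return [strip_nulls(v) for v in obj]
    return obj

strip_nulls({"users": [{"name": "Alice", "bio": None}]})
# → {"users": [{"name": "Alice"}]}
```

Serialized without whitespace, the stripped dict yields exactly the compact form shown in the quick-start snippet.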

Quality

  • 361 tests (153 feature + 169 adversarial + 39 real-world tool outputs), 0.55s
  • mypy --strict clean across all source files
  • ruff check clean
  • Python 3.10–3.13
  • Zero required dependencies (tiktoken optional)
  • CodeQL security scanning
  • uv lockfile for reproducible installs