Releases: amahi2001/python-token-killer
v0.1.1 — Bug fixes, test refactor, improved examples
What changed since v0.1.0
Bug Fixes
- Silent data loss in `_shorten_keys` — `timestamp` and `created_at` both mapped to `ts`. When both existed in the same dict, aggressive mode silently dropped one value. Fixed with collision detection.
- URL corruption in CodeMinimizer — `https://example.com` inside string literals was stripped because the `//` comment regex had a broken single-char lookbehind. Replaced with a string-aware regex that matches quoted strings first.
- `_shorten_dotted_keys` crash on non-string keys — `"." in 1` raised `TypeError`. Fixed with an `isinstance(k, str)` guard.
- DiffMinimizer silently folded `\ No newline at end of file` — the backslash marker was treated as context and lost in large diffs. Fixed by adding `"\\ "` to significant prefixes.
- Markdown misdetected as diff — a `---` horizontal rule plus any `@@` mention triggered diff detection. Tightened to require a `diff --git` header or `---` followed by `+++`.
- `_sample(n=1)` caused `ZeroDivisionError` — `step = (len - 1) / (n - 1)`. Fixed with early-return guards.
- ALL CAPS words abbreviated with the wrong case — `IMPLEMENTATION` became `Impl` instead of `IMPL`. Fixed with a `word.isupper()` check.
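The collision fix can be sketched in isolation. This is an illustrative stand-in, not ptk's actual internals — `ABBREVIATIONS` and `shorten_keys` are simplified assumptions:

```python
# Illustrative sketch of collision-aware key shortening; the ABBREVIATIONS
# map and shorten_keys are simplified stand-ins, not ptk's real code.
ABBREVIATIONS = {"timestamp": "ts", "created_at": "ts", "description": "desc"}

def shorten_keys(d: dict) -> dict:
    """Shorten keys, keeping the original key when the short form collides."""
    out: dict = {}
    for key, value in d.items():
        short = ABBREVIATIONS.get(key, key)
        # Collision detection: if another key already claimed this short
        # form, fall back to the original key instead of dropping a value.
        out[short if short not in out else key] = value
    return out

shorten_keys({"timestamp": 1, "created_at": 2})
# → {"ts": 1, "created_at": 2}
```

Before the fix, the second write would have overwritten `out["ts"]`, which is exactly the silent data loss described above.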
Test Suite
Refactored 3 monolithic files (1000+ lines each) into 19 focused modules:
tests/
unit/ # one file per minimizer
adversarial/ # one file per concern (types, unicode, regex, contracts, mutation, concurrency, performance)
real_world/ # tool output tests (pytest, cargo, go test, ruff, git, docker, build errors, pipelines)
New Makefile targets: `make test-unit`, `make test-adversarial`, `make test-real-world`
Examples
Replaced abstract examples with runnable, output-showing demos:
- `examples/rag_pipeline.py` — 3 realistic wiki docs, naive vs ptk context, per-query cost + monthly savings
- `examples/langgraph_agent.py` — simulated 3-step agent loop with token savings per step
- `examples/log_triage.py` — 400-line CI log → 19-line error triage (96.5% saved)
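The monthly-savings figure in the RAG example reduces to simple arithmetic. A minimal sketch — the $3.00 per million input tokens and the query volume below are placeholder assumptions, not figures from the examples:

```python
# Back-of-envelope cost-savings math, as used conceptually in the examples.
# PRICE_PER_MTOK is an assumed placeholder price, not from the repo.
PRICE_PER_MTOK = 3.00  # USD per million input tokens (assumption)

def monthly_savings(orig_tokens: int, min_tokens: int, queries_per_month: int) -> float:
    """Dollars saved per month from sending fewer input tokens per query."""
    saved_per_query = (orig_tokens - min_tokens) * PRICE_PER_MTOK / 1_000_000
    return saved_per_query * queries_per_month

# e.g. a 1,450-token context minimized to 792 tokens, 100k queries/month:
round(monthly_savings(1450, 792, 100_000), 2)
# → 197.4
```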
Other
- README: lead with before/after, real token counts, cost math
- uv-only CI: `uv sync --locked --only-group <group>` per job, built-in caching
- Dependabot: switched from the `pip` to the `uv` ecosystem (picks up `uv.lock` + `[dependency-groups]`)
- SECURITY.md added
- CodeQL `continue-on-error` removed (repo is now public)
v0.1.0 — Initial Release
ptk — Python Token Killer
Minimize LLM tokens from Python objects in one call. Zero dependencies. 361 tests.
import ptk
ptk.minimize({"users": [{"name": "Alice", "bio": None}]})
# → {"users":[{"name":"Alice"}]}
ptk(big_api_response, aggressive=True)  # max compression
Install
pip install python-token-killer
# or
uv add python-token-killer
Benchmarks (tiktoken cl100k_base)
| Benchmark | Original | Default | Saved | Aggressive | Saved |
|---|---|---|---|---|---|
| API response (JSON) | 1,450 | 792 | 45.4% | 782 | 46.1% |
| Python module (code) | 2,734 | 2,113 | 22.7% | 309 | 88.7% |
| Server log (58 lines) | 1,389 | 1,388 | 0.1% | 231 | 83.4% |
| 50 user records (list) | 2,774 | 922 | 66.8% | 922 | 66.8% |
| Verbose paragraph (text) | 101 | 96 | 5.0% | 74 | 26.7% |
| Total | 11,182 | 7,424 | 33.6% | 2,627 | 76.5% |
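The Saved columns are plain token-count ratios; a quick sketch reproducing the figures from the first two rows:

```python
def pct_saved(original: int, minimized: int) -> float:
    """Percent of tokens removed, as reported in the Saved columns."""
    return round(100 * (1 - minimized / original), 1)

pct_saved(1450, 792)   # API response, default    → 45.4
pct_saved(2734, 309)   # Python module, aggressive → 88.7
```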
6 Minimizers
- Dict — null stripping, key shortening, single-child flattening, kv/tabular formats
- List — schema-once tabular, dedup with counts, deterministic sampling
- Code — comment stripping with pragma preservation (`noqa`, `type: ignore`, `TODO`, `eslint-disable`), docstring collapse, signature extraction (Python, JS, Rust, Go)
- Log — duplicate line collapse, error-only filtering, stack trace + test runner output preservation
- Diff — context folding, noise stripping, `\ No newline` preservation
- Text — 20+ word abbreviations, 16 phrase abbreviations, filler removal, stopword removal
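The List minimizer's deterministic sampling can be sketched as follows — an illustrative stand-in, not ptk's actual code, with the early-return guards that the v0.1.1 release added to fix the `n=1` division by zero:

```python
def sample(items: list, n: int) -> list:
    """Deterministically pick n evenly spaced items, first and last included.

    Illustrative sketch. The early returns mirror the v0.1.1 guards that
    fixed the ZeroDivisionError in step = (len - 1) / (n - 1) when n == 1.
    """
    if n <= 0:
        return []
    if n == 1:
        return [items[0]] if items else []
    if len(items) <= n:
        return list(items)
    step = (len(items) - 1) / (n - 1)
    return [items[round(i * step)] for i in range(n)]

sample(list(range(10)), 4)
# → [0, 3, 6, 9]
```

Using evenly spaced indices rather than `random.sample` keeps the output stable across runs, which matters when minimized context feeds an LLM cache.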
Quality
- 361 tests (153 feature + 169 adversarial + 39 real-world tool outputs), 0.55s
- `mypy --strict` clean across all source files
- `ruff check` clean
- Python 3.10–3.13
- Zero required dependencies (tiktoken optional)
- CodeQL security scanning
- uv lockfile for reproducible installs