Wanna see how AI chops your text into tiny weird pieces? TokenLens shows you.
AI doesn’t read text the way humans do. It breaks it into tokens: sometimes whole words, sometimes word fragments, sometimes punctuation.
TokenLens lets you:
- See why prompts cost $$$
- Figure out why AI freaks out sometimes
- Compare how different models slice your text
- Color-code tokens (because why not)
- Inspect token IDs, indexes, counts, ratios
- See API cost for GPT-4o, GPT-4, GPT-3.5 💸
- Flip between 4 encoders
- Watch it update as you type ⚡
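
To make the coloring and the count/ratio stats concrete, here is a minimal sketch. It is not TokenLens's actual code: `colorize`, `token_stats`, and `PALETTE` are hypothetical names, and the token pieces are split by hand instead of coming from a real encoder.

```python
from itertools import cycle

# ANSI background color codes (an assumed palette, not TokenLens's own).
PALETTE = [41, 42, 43, 44, 45, 46]
RESET = "\x1b[0m"


def colorize(pieces):
    """Wrap each token string in a cycling ANSI background color."""
    colors = cycle(PALETTE)
    return "".join(f"\x1b[{next(colors)}m{piece}{RESET}" for piece in pieces)


def token_stats(pieces):
    """Return the token count, character count, and characters per token."""
    n_tokens = len(pieces)
    n_chars = sum(len(piece) for piece in pieces)
    ratio = n_chars / n_tokens if n_tokens else 0.0
    return {"tokens": n_tokens, "chars": n_chars, "chars_per_token": ratio}


# Hand-written split for illustration; a real encoder may split differently.
pieces = ["Token", "Lens", " is", " fun", "!"]
print(colorize(pieces))
print(token_stats(pieces))
```

Cycling through a short palette keeps neighboring tokens visually distinct without needing one color per token.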
| Encoding | Models | Vocab |
|---|---|---|
| cl100k_base | GPT-4, GPT-3.5-turbo | 100k-ish |
| o200k_base | GPT-4o | 200k-ish |
| p50k_base | text-davinci-002/003 | 50k-ish |
| r50k_base | GPT-2, GPT-3 | 50k-ish |
git clone https://github.com/your-username/TokenLens.git
cd TokenLens
python -m venv venv
# Mac/Linux
source venv/bin/activate
# Windows
venv\Scripts\activate
pip install -r requirements.txt
streamlit run app.py

Open http://localhost:8501 and watch the magic happen ✨
For terminal lovers, TokenLens has a built-in CLI module.
# Tokenize direct text input
python -m tokenlens.cli "This is a test of the tokenlens engine."
# Pipe data from other tools
echo "Pipe me!" | python -m tokenlens.cli
# Check options
python -m tokenlens.cli --help

CLI Flags:
- -e, --encoder: Choose your encoder (e.g., o200k_base)
- -q, --quiet: Output just the integer token count for scripts
- --no-color: Disable ANSI color background output
- -s, --stats: Force show detailed stats/costs
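
For the curious, here is a rough sketch of how flags like these could be wired up with `argparse`. It is not the real `tokenlens.cli`: the program name, the default encoder, and the whitespace-splitting `count_tokens` placeholder are all assumptions made for illustration.

```python
import argparse
import sys


def build_parser():
    """Build a parser with the four flags listed above."""
    parser = argparse.ArgumentParser(prog="tokenlens-sketch")  # hypothetical name
    parser.add_argument("text", nargs="?", help="text to tokenize; stdin if omitted")
    # The default encoder here is an assumption, not TokenLens's documented default.
    parser.add_argument("-e", "--encoder", default="o200k_base")
    parser.add_argument("-q", "--quiet", action="store_true")
    parser.add_argument("--no-color", action="store_true")
    parser.add_argument("-s", "--stats", action="store_true")
    return parser


def count_tokens(text):
    """Placeholder tokenizer: splits on whitespace, not a real BPE encoder."""
    return len(text.split())


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Fall back to piped input when no positional text is given.
    text = args.text if args.text is not None else sys.stdin.read()
    count = count_tokens(text)
    if args.quiet:
        print(count)  # bare integer, easy to consume from scripts
    else:
        print(f"{count} tokens ({args.encoder})")
    return count


main(["-q", "Pipe me!"])
```

Passing `argv` explicitly keeps `main` easy to call from tests, while leaving it as `None` makes `argparse` read the real command line.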
| Model | Cost / 1M tokens |
|---|---|
| GPT-4o | $5 |
| GPT-4 | $30 |
| GPT-3.5-turbo | $0.50 |
These prices may be out of date. Check OpenAI's pricing page if the exact numbers matter.
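
The cost math is just token count divided by one million, times the price. A minimal sketch, assuming the prices in the table above (`PRICE_PER_MILLION` and `estimate_cost` are hypothetical names, not part of TokenLens):

```python
# Prices in USD per 1M tokens, copied from the table above.
# They may be out of date; treat them as placeholders.
PRICE_PER_MILLION = {
    "GPT-4o": 5.00,
    "GPT-4": 30.00,
    "GPT-3.5-turbo": 0.50,
}


def estimate_cost(token_count, model):
    """Return the estimated USD cost of `token_count` tokens for `model`."""
    return token_count / 1_000_000 * PRICE_PER_MILLION[model]


# Compare the same 1,500-token prompt across models.
for model in PRICE_PER_MILLION:
    print(f"{model}: ${estimate_cost(1500, model):.6f}")
```

Keeping the prices in one dictionary means a price change is a one-line edit.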
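- Tokens = text chunks (~4 chars each on average for English text)
- BPE (byte pair encoding) = merging the most frequent adjacent byte pairs over and over until the vocab hits its target size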
- Costs, context windows, weird splits = all token stuff
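
To see the BPE idea in action, here is a toy trainer. It is a sketch of the general technique only: real tokenizers learn from huge corpora, usually pre-split text with a regex first, and end up with very different merges. `train_bpe` is a hypothetical name, not something from TokenLens.

```python
from collections import Counter


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges over the UTF-8 bytes of `text`."""
    # Start with one token per byte.
    tokens = [bytes([b]) for b in text.encode("utf-8")]
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair of tokens.
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break  # nothing repeats, so merging would not help
        merges.append((left, right))
        # Rewrite the sequence, fusing each occurrence of the chosen pair.
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return merges, tokens


merges, tokens = train_bpe("low lower lowest", num_merges=3)
print(merges)
print(tokens)
```

Each merge adds one entry to the vocabulary, so frequent fragments like `low` become single tokens while rare text stays split into smaller pieces.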
TokenLens/
├── tokenlens/
│ ├── __init__.py ← package marker
│ ├── core.py ← tokenization heart
│ └── cli.py ← the cool terminal tool
├── app.py ← web app
├── requirements.txt ← boring dependencies
└── README.md ← this lazy thing
MIT. Do what you want. Seriously.
Built to learn. Inspired by tiktokenizer.vercel.app