TokenLens is a Python‑based web app that helps you visualize how large language models tokenize text right in your browser. Instead of guessing how a prompt is broken up into tokens (which drives cost and behavior in services like OpenAI), TokenLens shows it live with colors and counts.

🔬 TokenLens

Wanna see how AI chops your text into tiny weird pieces? This shows it.

Watch it do its thing 🎥:

*(demo video: Screen.Recording.2026-02-27.at.10.47.03.PM.mov)*

What even is TokenLens?

AI doesn’t read like humans. It breaks stuff into tokens — sometimes words, sometimes weird fragments, sometimes punctuation.

TokenLens lets you:

  • See why prompts cost $$$
  • Figure out why AI freaks out sometimes
  • Compare how different models slice your text

Features (or whatever)

  • Colors for tokens (because why not)
  • Token IDs, indexes, counts, ratios
  • Shows API cost for GPT-4o, GPT-4, GPT-3.5 💸
  • Flip between 4 encoders
  • Updates as you type ⚡

Supported Encoders

| Encoding | Models | Vocab |
|---|---|---|
| `cl100k_base` | GPT-4, GPT-3.5-turbo | ~100k |
| `o200k_base` | GPT-4o | ~200k |
| `p50k_base` | text-davinci-002/003 | ~50k |
| `r50k_base` | GPT-2, GPT-3 | ~50k |

Run the Web App (super easy)

```bash
git clone https://github.com/NinjaOfNeurons/TokenLens.git
cd TokenLens
python -m venv venv
# Mac/Linux
source venv/bin/activate
# Windows
venv\Scripts\activate
pip install -r requirements.txt
streamlit run app.py
```

Open http://localhost:8501 and watch the magic happen ✨


Run the CLI (also super easy)

For terminal lovers, TokenLens has a built-in CLI module.

```bash
# Tokenize direct text input
python -m tokenlens.cli "This is a test of the tokenlens engine."

# Pipe data from other tools
echo "Pipe me!" | python -m tokenlens.cli

# Check options
python -m tokenlens.cli --help
```

CLI Flags:

  • `-e, --encoder`: Choose your encoder (e.g., `o200k_base`)
  • `-q, --quiet`: Output just the integer token count for scripts
  • `--no-color`: Disable ANSI color background output
  • `-s, --stats`: Force show detailed stats/costs

Token money stuff

| Model | Cost / 1M tokens |
|---|---|
| GPT-4o | $5 |
| GPT-4 | $30 |
| GPT-3.5-turbo | $0.50 |

Check OpenAI's pricing page if the numbers matter; they change often.
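
The math behind the cost display is just tokens times rate. A sketch with the prices from the table above hard-coded (the function name is mine, not TokenLens's API):

```python
# Rough cost estimate from a token count, using the rates above.
# Prices are illustrative; check OpenAI's pricing page for current numbers.
PRICE_PER_1M = {
    "gpt-4o": 5.00,
    "gpt-4": 30.00,
    "gpt-3.5-turbo": 0.50,
}

def estimate_cost(n_tokens: int, model: str) -> float:
    """Dollar cost of n_tokens at the per-1M-token rate for `model`."""
    return n_tokens / 1_000_000 * PRICE_PER_1M[model]

print(f"${estimate_cost(1500, 'gpt-4'):.4f}")  # → $0.0450
```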


Why tokens even matter

  • Tokens = text chunks (~4 English characters each, on average)
  • BPE (Byte Pair Encoding) = repeatedly merging the most frequent adjacent pairs until the vocab is big enough
  • Costs, context windows, weird splits = all token stuff
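
To make the BPE bullet concrete, here's a toy single merge step. Real tokenizers work on bytes and replay a trained merge table; this stripped-down version (all names mine) just shows the core "merge the most frequent pair" idea:

```python
from collections import Counter

def bpe_merge_step(tokens: list[str]) -> list[str]:
    """One BPE step: merge the most frequent adjacent pair into one token."""
    pairs = Counter(zip(tokens, tokens[1:]))  # count adjacent pairs
    if not pairs:
        return tokens
    (a, b), _ = pairs.most_common(1)[0]       # winning pair
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            merged.append(a + b)              # fuse the pair
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

seq = list("abcabcab")        # start from single characters
seq = bpe_merge_step(seq)     # ('a', 'b') occurs most often
print(seq)                    # → ['ab', 'c', 'ab', 'c', 'ab']
```

Run it a few more times and frequent fragments keep fusing into bigger tokens, which is why common words become one token while rare words shatter into weird pieces.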

Project tree (looks organized, kinda)

```
TokenLens/
├── tokenlens/
│   ├── __init__.py   ← package marker
│   ├── core.py       ← tokenization heart
│   └── cli.py        ← the cool terminal tool
├── app.py            ← web app
├── requirements.txt  ← boring dependencies
└── README.md         ← this lazy thing
```

License

MIT. Do what you want. Seriously.


Built to learn. Inspired by tiktokenizer.vercel.app
