UltraToken CLI tool for token cost analysis

TrintechResearch/UltraToken

UltraToken v1.0.3

UltraToken is a CLI utility that replicates tiktoken's BPE tokenizer and uses OpenAI's o200k Harmony encoding for fast, precise token cost estimation.

Features

  • Zero external dependencies - entirely self-contained
  • Complete BPE Implementation - Full byte-pair encoding algorithm
  • Embedded o200k Vocabulary - 200k token vocabulary included
  • Accurate Regex Splitting - Matches OpenAI's text segmentation
  • Optimized Token Mapping - Fast lookups using Map structures
  • Special Token Support - Handles control tokens correctly
  • Batch Processing - Appends token counts to each line of a word list file
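
To make the byte-pair encoding and Map lookup features concrete, here is a minimal sketch of a BPE merge loop. It is not UltraToken's source code: the function name `bpeMerge` and the merge table `toyRanks` are hypothetical, the ranks are invented for illustration, and the sketch works on characters, whereas a real tokenizer merges bytes using the o200k rank table.

```javascript
// Toy merge table: "left,right" -> rank (lower rank merges first).
// These ranks are invented for illustration; they are NOT the o200k vocabulary.
const toyRanks = new Map([
  ["l,l", 0],
  ["h,e", 1],
  ["ll,o", 2],
  ["he,llo", 3],
]);

// Repeatedly merge the adjacent pair with the lowest rank until no
// mergeable pair remains. Returns the final list of symbols (tokens).
function bpeMerge(symbols, ranks) {
  const parts = [...symbols];
  while (parts.length > 1) {
    let bestRank = Infinity;
    let bestIndex = -1;
    for (let i = 0; i < parts.length - 1; i++) {
      const rank = ranks.get(`${parts[i]},${parts[i + 1]}`);
      if (rank !== undefined && rank < bestRank) {
        bestRank = rank;
        bestIndex = i;
      }
    }
    if (bestIndex === -1) break; // no known pair left to merge
    parts.splice(bestIndex, 2, parts[bestIndex] + parts[bestIndex + 1]);
  }
  return parts;
}

console.log(bpeMerge([..."hello"], toyRanks)); // one token: "hello"
console.log(bpeMerge([..."help"], toyRanks)); // three tokens: "he", "l", "p"
```

The token count for a piece of text is the length of the returned array. Keeping the ranks in a `Map` makes each pair lookup constant time on average, which is presumably the motivation for the Map-based lookups listed above.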

Installation

Install UltraToken with npm:

  npm i ultratoken

Or clone and run locally:

git clone https://github.com/TrintechResearch/UltraToken.git
cd UltraToken
npm install -g .

Usage

Command Overview

Command                        Description
ultratoken                     Start interactive mode
ultratoken <text>              Get token count for text
ultratoken economy <file.md>   Process word list file
ultratoken jump                Exit the program
ultratoken --help              Show help information
ultratoken --version           Show version information

Interactive Mode

Start the interactive session:

ultratoken
🚀 UltraToken TikToken Utility
Interactive Mode - Type words to get token counts
Commands: "jump" to exit, "help" for help

ultratoken hello world
"hello world" = 2 tokens

ultratoken programming
"programming" = 2 tokens

ultratoken The quick brown fox
"The quick brown fox" = 4 tokens

ultratoken jump
UltraToken terminated. Goodbye!

Single Text Analysis

Get token count for any text:

ultratoken "machine learning"
# Output:
# Word: machine learning
# Tokens: 2

ultratoken "The quick brown fox jumps over the lazy dog"
# Output:
# Word: The quick brown fox jumps over the lazy dog
# Tokens: 9

Economy Mode - Batch Processing

Process a file containing one word or phrase per line. UltraToken appends the token count to each line and writes the result back to the same file:

Input file (words.txt):

hello world
artificial intelligence
machine learning
programming
natural language processing

Command:

ultratoken economy words.txt

Output file (words.txt) after processing:

hello world 2
artificial intelligence 3
machine learning 2
programming 2
natural language processing 4
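
The transformation above can be sketched in a few lines. This is an illustration of the idea, not UltraToken's implementation: `annotateLines` and `wordCount` are hypothetical names, leaving blank lines unchanged is an assumption, and the stand-in counter simply counts whitespace-separated words, so its numbers will generally differ from real BPE token counts.

```javascript
// Append a token count to every non-empty line of `text`.
// `countTokens` is supplied by the caller; blank lines are left as they are
// (an assumption, the README does not say how UltraToken treats them).
function annotateLines(text, countTokens) {
  return text
    .split(/\r?\n/)
    .map((line) => (line.trim() === "" ? line : `${line} ${countTokens(line)}`))
    .join("\n");
}

// Stand-in counter for the demo: counts whitespace-separated words.
// A real BPE tokenizer would be plugged in here instead.
const wordCount = (line) => line.trim().split(/\s+/).length;

console.log(annotateLines("hello world\nprogramming\n", wordCount));
```

To mirror the in-place behavior shown above, the result would be written back to the input path, for example with Node's `fs.readFileSync` and `fs.writeFileSync`.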

Documentation

For detailed documentation and advanced features, see Documentation.md.

License

MIT

This project is licensed under the MIT License - see the LICENSE file for details.

About

UltraToken is developed by TrinityAI Research, specializing in advanced LLM development.
