Standardize on u64 for token counts #32869

Merged
merged 1 commit into from
Jun 17, 2025
rtfeldman
Contributor

Previously we were using a mix of u32 and usize, e.g. max_tokens: usize and max_output_tokens: Option<u32> in the same struct.
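To make the inconsistency concrete, here is a minimal sketch of what such a struct might look like before and after the change. The struct names `ModelLimitsBefore` and `ModelLimitsAfter` and the `From` conversion are hypothetical and exist only for illustration; only the two field names and their original types come from the description above.

```rust
// Hypothetical "before" shape: two token counts with two different widths.
// Only the field names and types are taken from the PR description.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelLimitsBefore {
    pub max_tokens: usize,
    pub max_output_tokens: Option<u32>,
}

// Hypothetical "after" shape: every token count is a u64.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelLimitsAfter {
    pub max_tokens: u64,
    pub max_output_tokens: Option<u64>,
}

impl From<ModelLimitsBefore> for ModelLimitsAfter {
    fn from(old: ModelLimitsBefore) -> Self {
        Self {
            // usize is at most 64 bits on the targets Rust supports today,
            // so this widening cast does not lose information.
            max_tokens: old.max_tokens as u64,
            // u32 -> u64 is always lossless, so `From` is available.
            max_output_tokens: old.max_output_tokens.map(u64::from),
        }
    }
}
```

With a single integer type, arithmetic between the two fields (for example, subtracting the output budget from the context window) no longer needs casts at each call site.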

Although tiktoken uses usize, token counts should be consistent across targets (e.g. the same model shouldn't suddenly get a smaller context window because you're compiling for wasm32). These token counts could also end up being serialized using a binary protocol, so usize is not the right choice for them.
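A rough sketch of the two concerns above: converting a usize count to u64 at the tokenizer boundary, and encoding a count with a fixed width that does not depend on the compilation target. All function names here are hypothetical, and `count_tokens_usize` is a whitespace-splitting stand-in, not tiktoken's actual API.

```rust
/// Stand-in for a tokenizer that reports counts as usize.
/// Hypothetical: it just splits on whitespace and is not tiktoken's API.
fn count_tokens_usize(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Convert to u64 once, at the boundary, so the rest of the code
/// never sees a target-dependent integer width.
pub fn count_tokens(text: &str) -> u64 {
    count_tokens_usize(text) as u64
}

/// Encode a token count as exactly 8 little-endian bytes.
/// A usize would produce 4 bytes on wasm32 and 8 on x86_64,
/// which a binary protocol cannot tolerate.
pub fn encode_token_count(count: u64) -> [u8; 8] {
    count.to_le_bytes()
}

/// Decode a token count written by `encode_token_count`.
pub fn decode_token_count(bytes: [u8; 8]) -> u64 {
    u64::from_le_bytes(bytes)
}
```

Because the wire representation is always 8 bytes, a count written by a 64-bit build can be read back unchanged by a wasm32 build, and vice versa.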

I chose to standardize on u64 over u32 because we don't store many of them (so the extra size should be insignificant) and future models may exceed u32::MAX tokens.
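The trade-off can be put in numbers with a small sketch. The helper names below are hypothetical; the constants come from the Rust standard library.

```rust
use std::mem::size_of;

/// Returns true if a token count would still fit in a u32.
/// u32::MAX is 4_294_967_295, so anything larger needs a wider type.
pub fn fits_in_u32(count: u64) -> bool {
    count <= u64::from(u32::MAX)
}

/// Extra bytes needed for each stored count when using u64 instead of u32.
pub fn extra_bytes_per_count() -> usize {
    size_of::<u64>() - size_of::<u32>()
}
```

Each stored count costs 4 extra bytes, which stays negligible as long as only a handful of counts exist per model, while the usable range grows from about 4.29 billion tokens to about 1.8 × 10^19.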

Release Notes:

  • N/A

@cla-bot cla-bot bot added the cla-signed The user has signed the Contributor License Agreement label Jun 17, 2025
@rtfeldman rtfeldman merged commit 5405c2c into main Jun 17, 2025
20 checks passed
@rtfeldman rtfeldman deleted the u64-tokens branch June 17, 2025 14:43