Standardize on u64 for token counts #32869

rtfeldman · 2025-06-17T14:06:50Z

Previously we were using a mix of u32 and usize, e.g. max_tokens: usize, max_output_tokens: Option<u32> in the same struct.

Although tiktoken uses usize, token counts should be consistent across targets (e.g. the same model shouldn't suddenly get a smaller context window when you compile for wasm32). These counts could also end up being serialized using a binary protocol, so usize is not the right choice for them.

I chose to standardize on u64 over u32 because we don't store many of them (so the extra size should be insignificant) and future models may exceed u32::MAX tokens.
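To make the reasoning above concrete, here is a minimal sketch of the pattern. It is not the code from this PR: the `ModelLimits` struct and the helper functions are hypothetical names chosen for illustration, and only the field names `max_tokens` and `max_output_tokens` come from the description.

```rust
/// Hypothetical struct for illustration; only the field names are taken
/// from the PR description. Both counts share one fixed-width type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ModelLimits {
    max_tokens: u64,
    max_output_tokens: Option<u64>,
}

/// Convert a target-dependent `usize` (such as a tokenizer's `Vec::len()`)
/// into a `u64` at the boundary. `try_from` avoids assuming anything about
/// the pointer width; on overflow the count saturates at `u64::MAX`.
fn token_count_from_len(len: usize) -> u64 {
    u64::try_from(len).unwrap_or(u64::MAX)
}

/// Remaining context budget, saturating at zero instead of underflowing.
fn remaining_tokens(limits: &ModelLimits, used: u64) -> u64 {
    limits.max_tokens.saturating_sub(used)
}

/// Encode a count as 8 little-endian bytes. The width is the same on every
/// target, which is what a binary protocol needs; `usize::to_le_bytes`
/// would yield 4 bytes on wasm32 and 8 on x86_64.
fn encode_count(count: u64) -> [u8; 8] {
    count.to_le_bytes()
}

/// Decode a count written by `encode_count`.
fn decode_count(bytes: [u8; 8]) -> u64 {
    u64::from_le_bytes(bytes)
}
```

With this shape, a tokenizer result is converted once via `token_count_from_len(tokens.len())`, and everything downstream (budget arithmetic, serialization) works with `u64` regardless of the compilation target.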

Release Notes:

N/A

cla-bot bot added the cla-signed label (The user has signed the Contributor License Agreement) Jun 17, 2025

Standardize on u64 for token counts

6a12025

rtfeldman force-pushed the u64-tokens branch from 1e7464b to 6a12025 June 17, 2025 14:14

rtfeldman merged commit 5405c2c into main Jun 17, 2025
20 checks passed

rtfeldman deleted the u64-tokens branch June 17, 2025 14:43
