
Conversation


@m42a commented Jan 19, 2024

This is useful if your text is too large to fit into the cache all at once. Previously, removing tokens from the beginning or middle of the context required re-evaluating every token after the removed range. This change shifts the already-evaluated tokens so they stay in the cache, leaving less work to do before generation of new tokens can start.
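
For illustration, here is a minimal sketch of the kind of cache manipulation described above, assuming the llama.cpp C API as it existed in early 2024 (`llama_kv_cache_seq_rm` and `llama_kv_cache_seq_shift`). The `drop_and_shift` helper and its parameters are hypothetical and are not code from this PR:

```c
#include <llama.h>

// Hypothetical helper: drop the tokens at positions [p0, p1) of sequence 0
// and slide the later tokens back, instead of re-evaluating everything
// after p0.
static void drop_and_shift(struct llama_context *ctx, llama_pos p0, llama_pos p1) {
    const llama_seq_id seq = 0;

    // Remove the unwanted span from the KV cache.
    llama_kv_cache_seq_rm(ctx, seq, p0, p1);

    // Move every remaining token at position >= p1 back by (p1 - p0) so the
    // cached context is contiguous again; a p1 argument of -1 means "to the end".
    llama_kv_cache_seq_shift(ctx, seq, p1, -1, -(p1 - p0));
}
```

After the shift, only genuinely new tokens need to be decoded; the cache entries for the surviving prefix and suffix are reused as-is.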
