feat(cache): use optimized StaticCache class for XLA #70

Merged
merged 1 commit into from
Jul 9, 2024

tengomucho
Collaborator

This is essentially a port of work originally contributed to transformers:

huggingface/transformers#31129

The original contribution has not been merged yet, but it shows lower memory usage and better performance on XLA, so I think it's worth adding here for integration into optimum-tpu.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

tengomucho marked this pull request as ready for review on July 8, 2024, 14:00
tengomucho merged commit 77bebf8 into main on Jul 9, 2024
3 of 4 checks passed
tengomucho deleted the lower-memory-static-cache branch on July 9, 2024, 09:32