T5 Static Cache #30845

Open
wants to merge 5 commits into
base: main


@huseinzol05 commented May 16, 2024
Contributor

This enables torch.compile for T5 generation, which makes generation faster.

With the compiled static cache, T5 Small (60M parameters) reaches 905.746397513013 tokens/sec, while the non-compiled version reaches 259.57796367162115 tokens/sec.

  1. Notebook for the non-compiled run: https://github.com/mesolitica/t5-static-cache/blob/main/t5-static-cache-non-compile.ipynb
  2. Notebook for the compiled run: https://github.com/mesolitica/t5-static-cache/blob/main/t5-static-cache.ipynb
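For reference, here is a minimal sketch of how a tokens/sec figure and the resulting speedup could be computed. The helper names `tokens_per_second` and `speedup`, and the zero-argument `generate` callable, are hypothetical and are not taken from the notebooks; the notebooks may measure throughput differently. For a compiled model, warm-up runs would normally be excluded from the timing.

```python
import time


def tokens_per_second(generate, n_runs=3):
    """Return generated tokens per second over `n_runs` calls.

    `generate` is a hypothetical zero-argument callable that returns
    the sequence of generated token ids for one run.
    """
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(n_runs):
        total_tokens += len(generate())
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very coarse timer.
    return total_tokens / max(elapsed, 1e-12)


def speedup(compiled_tps, baseline_tps):
    """Ratio of compiled throughput to non-compiled throughput."""
    return compiled_tps / baseline_tps


# Figures reported in this PR description.
ratio = speedup(905.746397513013, 259.57796367162115)
print(f"speedup: {ratio:.2f}x")
```

With the two reported figures, the ratio comes out at roughly 3.5x.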

This is still a work in progress:

  1. The current fork only works with the static cache; it still needs to follow the same caching steps as Llama.
  2. Many conditions still need to be fulfilled first.
  3. It only works on PyTorch 2.4.0.dev20240508+cu121; support for custom functions under torch.compile's reduce-overhead mode has not yet been released in a stable version.
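To illustrate why a static cache matters for compilation, here is a minimal, framework-free sketch of the idea. It is not the code in this PR: the class `StaticKVCache` and its methods are hypothetical, and plain Python lists stand in for tensors. The point is that storage is allocated once at a fixed capacity and written in place, so every decoding step sees the same shapes. A cache that grows by concatenation changes shape on every step, which works against torch.compile's reduce-overhead mode, since that mode relies on CUDA graphs and therefore on fixed shapes.

```python
class StaticKVCache:
    """Fixed-capacity key/value cache (illustrative only).

    Storage is allocated once in __init__ and never resized, so the
    cache has the same shape at every decoding step.
    """

    def __init__(self, max_len, head_dim):
        self.max_len = max_len
        self.head_dim = head_dim
        # Preallocate every slot up front; unused slots stay zero.
        self.keys = [[0.0] * head_dim for _ in range(max_len)]
        self.values = [[0.0] * head_dim for _ in range(max_len)]

    def update(self, position, key, value):
        """Write the key/value for one decoding step in place."""
        if not 0 <= position < self.max_len:
            raise IndexError("position is outside the preallocated cache")
        self.keys[position] = list(key)
        self.values[position] = list(value)

    def valid_mask(self, position):
        """Mark which slots hold real entries up to and including `position`."""
        return [i <= position for i in range(self.max_len)]


cache = StaticKVCache(max_len=4, head_dim=2)
cache.update(0, [1.0, 2.0], [3.0, 4.0])
cache.update(1, [5.0, 6.0], [7.0, 8.0])
mask = cache.valid_mask(1)
```

Because the unused slots still exist, attention over such a cache needs a mask like `valid_mask` so that empty positions are ignored.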

@amyeroberts
Collaborator

cc @ArthurZucker

@ArthurZucker
Collaborator

Once this is ready, feel free to ping either @fxmarty or @gante.
