Support left padding to forward batch prompts in a single step #1349

Open
5 tasks
guillaumekln opened this issue Jul 18, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

Collaborator

guillaumekln commented Jul 18, 2023

Batches of variable-length prompts are currently not forwarded in a single step into the decoder. Only the tokens up to the minimum length in the batch are forwarded at once, and the remaining tokens are force-decoded in the decoding loop.
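To make the current behavior concrete, here is a minimal Python sketch of the split described above. It is only an illustration, not the library's actual code: the function name `split_at_min_length` and the token IDs are hypothetical.

```python
def split_at_min_length(prompts):
    """Split prompts into the part forwarded at once and the leftover tokens.

    Hypothetical helper: it only mirrors the behavior described in this issue.
    """
    min_len = min(len(p) for p in prompts)
    # Tokens up to the minimum length are forwarded in a single step.
    prefix = [p[:min_len] for p in prompts]
    # The rest is force-decoded, one token per step, in the decoding loop.
    remainder = [p[min_len:] for p in prompts]
    return prefix, remainder


prompts = [[5, 6, 7, 8], [9, 10], [11, 12, 13]]
prefix, remainder = split_at_min_length(prompts)
# Number of decoding steps spent only on consuming prompt tokens.
extra_steps = max(len(r) for r in remainder)
```

In this example the longest prompt needs two extra decoding steps before any new token is generated for it, which is the cost a single-step forward would avoid.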

It could be more efficient to forward the whole batch in a single step, but this requires supporting padding positions on the left. With left padding, each entry in the batch starts at a different offset, and the following position-aware modules need to take that offset into account:

  • softmax
  • position encodings
  • rotary embeddings
  • padder
  • MHA values "mask"
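
The per-entry offset mentioned above could be sketched in NumPy as follows. This is a rough illustration under stated assumptions, not a proposed implementation: all function names, the `pad_id` value, and the choice to clamp padding positions to 0 are hypothetical, and every prompt is assumed to be non-empty.

```python
import numpy as np


def left_pad(prompts, pad_id=0):
    """Left-pad prompts to a common length and return the per-entry offsets."""
    max_len = max(len(p) for p in prompts)
    offsets = np.array([max_len - len(p) for p in prompts])
    tokens = np.full((len(prompts), max_len), pad_id, dtype=np.int64)
    for i, p in enumerate(prompts):
        tokens[i, offsets[i]:] = p
    return tokens, offsets


def position_ids(offsets, max_len):
    """Positions for position encodings or rotary embeddings.

    The first real token of each entry gets position 0. Padding positions
    are clamped to 0 (an arbitrary choice, since they are masked anyway).
    """
    pos = np.arange(max_len)[None, :] - offsets[:, None]
    return np.maximum(pos, 0)


def key_mask(offsets, max_len):
    """Boolean mask of shape (batch, keys): True for real tokens."""
    return np.arange(max_len)[None, :] >= offsets[:, None]


def masked_softmax(scores, mask):
    """Softmax over keys that ignores left padding.

    scores: (batch, queries, keys), mask: (batch, keys).
    Assumes each entry has at least one real token.
    """
    scores = np.where(mask[:, None, :], scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)
```

The same `offsets` array drives both the position computation and the attention mask, which is why a single per-entry offset would be enough to cover the modules listed above.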