Support left padding to forward batch prompts in a single step #1349

guillaumekln · 2023-07-18T08:34:40Z

Batches of variable-length prompts are currently not forwarded in a single step into the decoder. Only the tokens up to the minimum length in the batch are forwarded at once, and the remaining tokens are force-decoded in the decoding loop.
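To make the current behavior concrete, here is a minimal pure-Python sketch of how a batch might be split. The function name `split_prompts` and the list-of-token-ids representation are assumptions for illustration only, not the library's actual code.

```python
def split_prompts(prompts):
    """Split a batch into the part forwarded at once and the forced tokens.

    prefix[i] holds the first min_len tokens of prompt i (forwarded in one
    step); remainders[i] holds the tokens left to force-decode in the loop.
    """
    min_len = min(len(p) for p in prompts)
    prefix = [p[:min_len] for p in prompts]
    remainders = [p[min_len:] for p in prompts]
    return prefix, remainders


# Hypothetical token ids for three prompts of different lengths.
prompts = [[11, 12, 13, 14, 15], [21, 22], [31, 32, 33]]
prefix, remainders = split_prompts(prompts)

# The longest remainder sets how many extra decoding steps are spent
# consuming prompt tokens instead of generating new ones.
extra_steps = max(len(r) for r in remainders)
```

In this example only two tokens per prompt are forwarded together, and the longest prompt costs three additional loop iterations before generation can start.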

Forwarding the whole batch in a single step could be more efficient, but it requires supporting padding positions on the left. Each entry in the batch then has its own offset in position-aware modules, such as the masked softmax operator, the rotary embeddings layer, and the position encodings.
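The per-entry offsets can be illustrated with a small pure-Python sketch. The helper names (`left_pad`, `position_ids`, `causal_mask`) and the `pad_id` value are hypothetical, and the layout shown is only one plausible way to apply the offsets, not the project's implementation.

```python
def left_pad(prompts, pad_id=0):
    """Left-pad prompts to a common length and return the per-entry offsets."""
    max_len = max(len(p) for p in prompts)
    offsets = [max_len - len(p) for p in prompts]
    padded = [[pad_id] * off + list(p) for p, off in zip(prompts, offsets)]
    return padded, offsets


def position_ids(offsets, length):
    """Positions start at 0 on the first real token of each entry.

    Padding slots are clamped to 0; they are masked out of attention anyway.
    """
    return [[max(t - off, 0) for t in range(length)] for off in offsets]


def causal_mask(offset, length):
    """mask[q][k] is True when query q may attend to key k.

    A key is visible if it is not left padding (k >= offset) and not in the
    future (k <= q). Rows for padding queries are all False, so a masked
    softmax has to handle fully masked rows.
    """
    return [[offset <= k <= q for k in range(length)] for q in range(length)]


prompts = [[11, 12, 13, 14, 15], [21, 22], [31, 32, 33]]
padded, offsets = left_pad(prompts)
positions = position_ids(offsets, len(padded[0]))
mask = causal_mask(offsets[1], len(padded[0]))
```

With this layout the last column of every row is the final prompt token, so a single forward pass leaves all entries ready to generate their first new token.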

guillaumekln added the "enhancement" (New feature or request) label on Jul 18, 2023

This was referenced Jul 27, 2023

Accept left offsets in the masked softmax operator #1370 (Draft)

Accept left offsets in the rotary embeddings layer #1372 (Draft)

Accept left offsets when applying position encodings #1374 (Draft)
