Max sequence length for inference #73

Answered by Zymrael
JunboShen asked this question in Q&A

Prompting with longer sequences requires sharding the model across multiple GPUs, which is currently not supported. However, you can generate much longer sequences, up to 500k tokens and beyond, on a single 80 GB GPU.
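
For reference, a minimal generation sketch, assuming the checkpoint is published on the Hugging Face Hub as a causal LM with custom modeling code and supports generate(); the checkpoint name below is an assumption, so substitute the one you are actually using:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint name -- substitute the model you are using.
model_name = "togethercomputer/evo-1-131k-base"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # bf16 keeps memory low for long generations
    trust_remote_code=True,
).to("cuda")
model.eval()

# Start from a short prompt; the memory ceiling applies to the *generated*
# length, which per the answer above can reach ~500k tokens on one 80 GB GPU.
inputs = tokenizer("ACGT", return_tensors="pt").to("cuda")
with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=10_000,  # raise toward 500k as memory allows
        do_sample=True,
        temperature=1.0,
    )
print(tokenizer.decode(out[0]))
```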

If you'd like to test the model with longer prompts, I recommend Together's API.
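
A minimal sketch using the Together Python SDK (`pip install together`); the model name here is an assumption, and `TOGETHER_API_KEY` must be set in your environment:

```python
from together import Together

client = Together()  # reads the TOGETHER_API_KEY environment variable

# Hypothetical model name -- pick the matching long-context model
# from Together's model catalog.
response = client.completions.create(
    model="togethercomputer/evo-1-131k-base",
    prompt="ACGT" * 4096,  # long prompts are handled server-side
    max_tokens=128,
)
print(response.choices[0].text)
```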

Answer selected by JunboShen
Category: Q&A
Labels: none yet
4 participants
This discussion was converted from issue #24 on June 21, 2024 at 02:31.