
Inference Speed for Long Articles #68

Open
saptarshi059 opened this issue Jan 13, 2023 · 2 comments
@saptarshi059

Is there any way to increase the generation speed for extremely long articles, e.g., around 5000 tokens? I've been trying several optimization tricks, but none seems to work. Or is it simply the case that text generation over such long spans will be slow and there's no way around it?

@magicknight

> Is there any way to increase the generation speed for extremely long articles, e.g., around 5000 tokens? I've been trying several optimization tricks, but none seems to work. Or is it simply the case that text generation over such long spans will be slow and there's no way around it?

How do you generate 5000 tokens in the first place?

@saptarshi059
Author

So there's no way to directly generate ~5000 tokens; that's a limitation of any decoder-based model, since it can only process tokens up to its maximum input length, which in this case is 2048. What I did instead was generate 2048 tokens and then use the last N (say 150) tokens as the input for generating the next chunk, almost like a sliding window. With this approach, the final text was reasonably coherent; a sketch of the loop is below.
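
For anyone trying the same thing, here is a minimal sketch of that sliding-window loop using the Hugging Face transformers API. The model name, window size, per-step length, and target length are placeholder assumptions for illustration, not this repo's actual code; substitute whichever decoder-only model you are using.

```python
# Hypothetical sketch of the sliding-window generation described above.
# MODEL_NAME, WINDOW, and TARGET_TOKENS are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"       # placeholder; any causal LM works the same way
CONTEXT_LIMIT = 2048      # the model's maximum input length
WINDOW = 150              # trailing tokens carried into the next step
TARGET_TOKENS = 5000      # total tokens we want to end up with

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def generate_long(prompt: str) -> str:
    generated = tokenizer(prompt, return_tensors="pt").input_ids[0]
    while generated.size(0) < TARGET_TOKENS:
        # Feed only the last WINDOW tokens so the input never exceeds the
        # context limit, then extend until the sequence reaches that limit.
        context = generated[-WINDOW:].unsqueeze(0)
        with torch.no_grad():
            output = model.generate(
                context,
                max_length=CONTEXT_LIMIT,
                do_sample=True,
                top_p=0.9,
                pad_token_id=tokenizer.eos_token_id,
            )
        # Keep only the newly generated continuation (drop the re-fed window).
        new_tokens = output[0, context.size(1):]
        generated = torch.cat([generated, new_tokens])
    return tokenizer.decode(generated, skip_special_tokens=True)

print(generate_long("In a distant future, "))
```

The trade-off is that each step only conditions on the last WINDOW tokens, so long-range coherence depends on how much carry-over you keep versus how much fresh context budget you leave for generation.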
