lookahead-prompt : add example #4226

Closed
ggerganov opened this issue Nov 26, 2023 · 5 comments
Labels: good first issue, performance

Comments

@ggerganov (Owner)

Add an example implementing the "Prompt Lookup Decoding" technique:

https://github.com/apoorvumang/prompt-lookup-decoding

This should be a great exercise for people looking to become familiar with llama.cpp's KV cache management and batched decoding API; a rough sketch of the core lookup follows the list below. Looking for contributions.

The following examples can be used as starting points:

  • speculative
  • lookahead
  • batched
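
For orientation, here is a minimal sketch of the n-gram lookup at the heart of prompt lookup decoding, written as plain C++ over token IDs. It assumes nothing from the llama.cpp API beyond the `llama_token` type; the function name and parameters (`find_draft_tokens`, `ngram_size`, `max_draft`) are illustrative, not an existing interface:

```cpp
// Sketch of the core idea of "Prompt Lookup Decoding": scan the tokens
// processed so far for an earlier occurrence of the trailing ngram_size
// tokens, and propose the tokens that followed that occurrence as a draft.
#include <cstdint>
#include <vector>

using llama_token = int32_t; // matches llama.cpp's token type

// Returns draft tokens guessed from the context, or an empty vector if the
// trailing n-gram does not reappear earlier in the sequence.
std::vector<llama_token> find_draft_tokens(
        const std::vector<llama_token> & tokens, // prompt + generated so far
        size_t ngram_size,                       // e.g. 3
        size_t max_draft) {                      // e.g. 10
    std::vector<llama_token> draft;
    if (ngram_size == 0 || tokens.size() <= ngram_size) {
        return draft;
    }
    const size_t start_of_ngram = tokens.size() - ngram_size;
    // Search backwards so the most recent match wins (a design choice for
    // this sketch; the reference implementation takes the first match).
    for (size_t i = start_of_ngram; i-- > 0; ) {
        bool match = true;
        for (size_t j = 0; j < ngram_size; ++j) {
            if (tokens[i + j] != tokens[start_of_ngram + j]) {
                match = false;
                break;
            }
        }
        if (!match) {
            continue;
        }
        // Copy up to max_draft tokens that followed the earlier occurrence,
        // stopping before the trailing n-gram itself. A fuller version would
        // keep searching (or retry with a smaller n-gram) if this is empty.
        for (size_t j = i + ngram_size;
             j < start_of_ngram && draft.size() < max_draft; ++j) {
            draft.push_back(tokens[j]);
        }
        break;
    }
    return draft;
}
```

In a complete example, the returned draft would be evaluated together with the sampled token in a single `llama_decode` call, and on a mismatch the KV cache entries past the last accepted token would be removed (e.g. with `llama_kv_cache_seq_rm`), mirroring what the speculative and lookahead examples already do.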
@0xdevalias

@apoorvumang FYI

@wsxiaoys (Contributor)

I just implemented this for tabby in https://github.com/TabbyML/tabby/pull/916/files. It's a slightly more complicated implementation (since tabby runs on continuous batching), but it should be usable as a reference.

@LeonEricsson (Contributor)

I'd love to give this a try; it would be my first time contributing.

@bullno1 (Contributor) commented Dec 19, 2023

Somewhat related: https://arxiv.org/abs/2312.11462

It seems someone looked at lookup decoding (n-gram) and speculative decoding and asked: "Why not both?" This paper is the result.

I'm still reading through the paper.
