Qs on Understanding Lookahead and Jacobi #37

Closed
RonanKMcGovern opened this issue Dec 20, 2023 · 7 comments

Comments

@RonanKMcGovern

Thanks for putting this blog together.

  1. Regarding simple Jacobi decoding:
  • The process starts with some random guesses for future tokens, correct?
  • As the process continues, the guesses improve a little, but on average they remain quite poor (because each guess is conditioned on a previous token that is itself wrong). So only occasionally is a guess correct, in which case the following token can also be accepted.
  2. Regarding lookahead:
  • Basically, a bank of n-grams is built up, and those n-grams come from decoding the guessed input tokens. Correct?
  • I suppose adding the prompt itself (and any confirmed generated tokens) as a source of n-grams would probably also improve performance?
  3. Jacobi:
    Jacobi is mentioned a lot in the blog, but I don't really see it as central... Basically we're just randomly guessing a token that is W tokens away, using previous forward passes to improve the quality of guesses within the window, and then using those guesses as an n-gram database?

To further improve the quality of guesses, would it be worth masking out the guessed positions entirely, rather than guessing tokens for them? My sense is that, because attention to nearby tokens is so strong, guessing the tokens is worse than passing through blank information at those positions. That would allow the decoded output to be based purely on the information from tokens we know 100%.
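
To check my own understanding of the Jacobi part, here is a rough sketch of what I think a single window of simple Jacobi decoding looks like (purely my own pseudo-implementation for discussion, not taken from the blog; `model` is assumed to be a Hugging Face-style causal LM exposing `.logits` and `config.vocab_size`):

```python
import torch

def jacobi_decode_window(model, prefix_ids, window=5, max_iters=10):
    """Sketch: refine a window of guessed tokens by fixed-point iteration.

    prefix_ids: the confirmed tokens so far, shape (1, prefix_len).
    Returns the guessed window once it stops changing (a fixed point)
    or after max_iters parallel forward passes.
    """
    # Start from arbitrary guesses for the next `window` tokens.
    guesses = torch.randint(0, model.config.vocab_size, (1, window))

    for _ in range(max_iters):
        # One parallel forward pass over prefix + current guesses.
        input_ids = torch.cat([prefix_ids, guesses], dim=-1)
        logits = model(input_ids).logits

        # The prediction at each guessed position becomes the new guess
        # for that position (greedy decoding).
        new_guesses = logits[:, prefix_ids.shape[1] - 1 : -1, :].argmax(dim=-1)

        if torch.equal(new_guesses, guesses):  # fixed point reached
            break
        guesses = new_guesses

    return guesses
```

Is that roughly the right mental model?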

@shermansiu

I'm not one of the authors, but I can answer.

  1. Yes, random guesses are used initially. See Question on Initial guess tokens #8. But by the time we've gone through enough passes (i.e. the length of the window), the guesses are no longer completely random (though still noisy).
  2. Yes, a bank of n-grams is built up while decoding the guessed input tokens.
  • Generally, prompts are not regenerated during inference... you start generating tokens after the prompt, so you don't get any speedup from that. Using template-guided generation (e.g. see https://github.com/guidance-ai/guidance) is known to speed up inference.
  3. Jacobi is central to the idea because we are using the context from the known prefix to generate the subsequent token guesses. The idea is that, because the contribution from the known prefix outweighs that of the guessed tokens after the first few passes, tokens closer to the known prefix will be guessed more accurately. There are no guarantees, though.
  • What you're saying is more or less correct, except that only recent n-grams are kept.

As for your idea, it's certainly viable to use either [MASK] tokens or 0-embeddings. Even if it improves performance, I don't think the improvement will be huge, but you're welcome to try it out.
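
To make the n-gram bank concrete, here is a very rough sketch of how n-grams gathered from those Jacobi passes could be pooled and looked up (my own simplification for illustration, not the authors' implementation; the real code handles the lookahead and verification branches in a single forward pass with a custom attention mask):

```python
from collections import defaultdict

def build_ngram_pool(trajectories, n=3):
    """Collect n-grams from the token trajectories of past Jacobi passes.

    The pool is keyed by the first token of each n-gram so candidates can
    be looked up from the most recently confirmed token.
    """
    pool = defaultdict(set)
    for traj in trajectories:
        for i in range(len(traj) - n + 1):
            ngram = tuple(traj[i : i + n])
            pool[ngram[0]].add(ngram[1:])
    return pool

def propose_continuations(pool, last_confirmed_token):
    """Return candidate continuations for the last confirmed token.

    Each candidate still has to be verified: only the longest prefix that
    matches what the model itself would have generated is accepted.
    """
    return pool.get(last_confirmed_token, set())
```

For example, `propose_continuations(pool, 42)` would return every stored continuation that previously followed token 42 in a Jacobi trajectory.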

@RonanKMcGovern
Author

RonanKMcGovern commented Dec 22, 2023

Thanks very much. As an aside, apparently TGI tried lookahead but found little speedup for the added compute.

@shermansiu

You mean Huggingface's transformers, right? Not TGI. But yeah, both Joao Gante and Louis-y-nlp in #19 noticed that you don't get much of a speedup if you don't have the FLOPS to spare.

@RonanKMcGovern
Author

Yeah makes sense. The comment I was referencing is this one: huggingface/text-generation-inference#1169 (comment)

Thanks

@shermansiu

I'm assuming that when Olivier Dehaene mentioned it was tested internally, he was referring to Joao Gante's test (Gante works at Huggingface). See huggingface/transformers#27649 (comment) for details.

@RonanKMcGovern
Author

Wow, yeah, that's a great post from Joao, thanks for sharing it. I didn't appreciate that FA2 compatibility was a consideration too.

@shermansiu

Incidentally, it seems like the original Jacobi decoding paper uses [PAD] tokens instead of random vocabulary tokens.
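
In other words, the only change is the initialization, roughly like this (a sketch; the vocab size and pad token id below are just placeholder values):

```python
import torch

window = 5

# Random initialization, as discussed above (placeholder vocab size):
vocab_size = 32000
random_guesses = torch.randint(0, vocab_size, (1, window))

# [PAD] initialization, as in the original Jacobi decoding paper
# (placeholder pad token id):
pad_token_id = 0
pad_guesses = torch.full((1, window), pad_token_id)
```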
