Hi, thank you so much for your awesome work!

I notice that when running `equal.py` to compare the decoded tokens of speculative decoding methods (PLD/EAGLE/Hydra) against vanilla decoding, the output is always "Not Equal!". The decoded tokens from the different methods are almost identical, though. What causes these slight differences between the decoding results?
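For context, a token-level comparison like this typically reports the first position where the two sequences diverge. The following is a minimal sketch of such a check (the token IDs and the `first_divergence` helper are hypothetical, not taken from `equal.py`):

```python
# Hypothetical sketch: compare a speculative-decoding output against a
# vanilla autoregressive output and report the first differing position.

def first_divergence(tokens_a, tokens_b):
    """Return the index of the first differing token, or None if equal."""
    for i, (a, b) in enumerate(zip(tokens_a, tokens_b)):
        if a != b:
            return i
    if len(tokens_a) != len(tokens_b):
        return min(len(tokens_a), len(tokens_b))
    return None

vanilla = [101, 2023, 2003, 1037, 3231, 102]       # made-up token IDs
speculative = [101, 2023, 2003, 1037, 3232, 102]   # one token differs

idx = first_divergence(vanilla, speculative)
print("Equal!" if idx is None else f"Not Equal! First mismatch at {idx}")
# → Not Equal! First mismatch at 4
```

Because decoding is autoregressive, a single mismatched token early on makes every later token differ as well, so sequences that are "almost the same" usually agree up to one divergence point.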
We have noticed similar discrepancies in our experiments when comparing multiple speculative decoding methods against vanilla autoregressive (AR) decoding. Specifically, under float32 precision and greedy decoding, only SpS and PLD produce results that exactly match AR decoding. Other methods show minor differences, especially toward the end of long sequences.
We believe these slight variations stem from small floating-point errors that accumulate across decoding steps, becoming noticeable in longer sequences.
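The mechanism behind such divergence can be illustrated without any model at all: floating-point addition is not associative, so the different computation orders used by a speculative verification pass versus step-by-step AR decoding can yield slightly different logits, and greedy argmax can then pick different tokens. A minimal sketch with synthetic logit values (not from any of the methods discussed):

```python
import numpy as np

# Float32 addition is not associative: the same mathematical sum,
# accumulated in two different orders, gives different results.
big, small = np.float32(1e8), np.float32(1.0)

order_a = (big + small) - big   # small is absorbed: result is 0.0
order_b = small + (big - big)   # result is 1.0
print(order_a, order_b)         # → 0.0 1.0

# If two candidate tokens' logits differ by less than this kind of
# error, greedy argmax flips between the two computation paths.
# (Synthetic logits for illustration only.)
logits_path1 = np.array([2.5, 2.5 + order_a], dtype=np.float32)
logits_path2 = np.array([2.5, 2.5 + order_b], dtype=np.float32)
print(np.argmax(logits_path1), np.argmax(logits_path2))  # → 0 1
```

Once one token flips, all subsequent tokens are conditioned on a different prefix, which matches the observation that outputs diverge more often toward the end of long sequences.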
We plan to investigate this issue further in the coming days. If you have any insights or ideas, please feel free to share them!
BTW, we did not modify the code of the specific algorithms, which means the original implementations also have this issue. You can also reach out to the authors of the corresponding methods 😊.