The effect of Clustering via Pooling may be greater? #2

HarryWu99 · 2024-04-27T12:50:19Z

Just a guess.

What would happen if H2O also used Clustering via Pooling in the comparison? It seems that Clustering via Pooling could improve the effectiveness of token-dropping methods like this one.

leeyeehoo · 2024-04-27T22:05:49Z

As we stated in the paper, the generated answers are very query-dependent, so evicting KV entries during generation may lose information. As a high-level example: suppose a user gives the model a book and the first question is about the first chapter, so the model evicts the other parts. If the user then asks about the last chapter, the model will have very limited knowledge to answer with.
Pooling is a very interesting observation: on easier tasks like the original haystack task, the model performs perfectly even without pooling. But when you switch to more challenging tasks, the method with pooling is significantly better than the one without.
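
For readers following along, here is a minimal sketch of what pooling before selection could look like. It is an illustration under assumptions, not the repository's implementation: the function names (`max_pool_1d`, `select_positions`), the toy scores, the kernel size, and the choice of max pooling are all hypothetical. The idea shown is that smoothing per-position importance scores with a same-length 1D pool before a top-k pick tends to keep contiguous neighborhoods around a strong position rather than isolated tokens.

```python
import numpy as np


def max_pool_1d(scores, kernel_size):
    """Same-length 1D max pooling with stride 1 (hypothetical helper).

    The edges are padded with -inf so padding never wins the max.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd to keep the length unchanged")
    scores = np.asarray(scores, dtype=float)
    pad = kernel_size // 2
    padded = np.pad(scores, pad, mode="constant", constant_values=-np.inf)
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel_size)
    return windows.max(axis=1)


def select_positions(scores, budget, kernel_size=None):
    """Return the sorted indices of the `budget` highest-scoring positions.

    If `kernel_size` is given, the scores are pooled first, so neighbors of a
    high-scoring position inherit its score and tend to be kept with it.
    """
    scores = np.asarray(scores, dtype=float)
    if kernel_size is not None:
        scores = max_pool_1d(scores, kernel_size)
    # Stable sort: ties are broken in favor of earlier positions.
    top = np.argsort(-scores, kind="stable")[:budget]
    return np.sort(top)


# Toy importance scores for 10 prompt positions (made-up numbers).
scores = [0.05, 0.10, 0.90, 0.10, 0.05, 0.30, 0.05, 0.35, 0.05, 0.05]

print(select_positions(scores, budget=3).tolist())                 # [2, 5, 7]
print(select_positions(scores, budget=3, kernel_size=3).tolist())  # [1, 2, 3]
```

Without pooling, the three kept positions are scattered single tokens. With pooling, the budget goes to the strongest position together with its immediate neighbors, which is one plausible reading of why pooling helps on tasks where the answer spans more than one token.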
