
How to handle the newly added item? #7

Open
hgzjy25 opened this issue Mar 22, 2024 · 1 comment

Comments

hgzjy25 commented Mar 22, 2024

You have been experimenting with academic datasets, where the entire item set serves as the candidate set. The question arises, though: how do you handle newly added items? RQ-VAE is known to be able to handle zero-shot or newly added items given their embeddings. But when a new item is assigned an ID by RQ-VAE that the LLM has never seen before, how does the LLM generate those new IDs and retrieve the new item?

zhengbw0324 (Collaborator)

@hgzjy25
Hello, thank you for your interest in our work!

Indeed, as you said, RQ-VAE can assign indices to zero-shot items when their embeddings are provided. Moreover, we do not restrict generation to seen items during inference; we only restrict which index level may be generated at each step. So, in theory, the LLM can directly generate the indices of unseen items.

In practice, however, a fully trained LLM tends to generate seen items (unseen items are effectively "illegal" index combinations), so it may be difficult to expect the LLM to generate unseen items directly. You can try adjusting the get_prefix_allowed_tokens_fn function to constrain what the LLM generates, or use some policy rules to increase the proportion of unseen items.
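To make the constrained-decoding idea concrete, here is a minimal, hedged sketch (not the repo's actual code) of a prefix-constraint function in the style of Hugging Face's `prefix_allowed_tokens_fn`. It builds a trie over the RQ-VAE index sequences of seen items: when the current prefix matches a known item, only its continuations are allowed; otherwise it falls back to all tokens valid at the current index level, which is what permits unseen (zero-shot) items. All names and the toy token IDs are illustrative assumptions.

```python
# Hypothetical sketch: constraining LLM decoding over RQ-VAE semantic IDs.
# Each item is identified by a short tuple of codebook tokens (one per level).

def build_prefix_trie(index_sequences):
    """Map each prefix of a seen item's index sequence to its allowed next tokens."""
    trie = {}
    for seq in index_sequences:
        for i in range(len(seq)):
            trie.setdefault(tuple(seq[:i]), set()).add(seq[i])
    return trie

def make_prefix_allowed_tokens_fn(trie, level_tokens):
    """Return a function shaped like HF's prefix_allowed_tokens_fn.

    trie         -- prefix -> allowed next tokens for *seen* items
    level_tokens -- level index -> all codebook tokens valid at that level
                    (the per-step level restriction the maintainers describe)
    """
    def allowed(batch_id, generated_ids):
        prefix = tuple(generated_ids)
        if prefix in trie:
            # Strict mode for this prefix: only continuations of seen items.
            return sorted(trie[prefix])
        # Fallback: any token of the current level, so unseen items stay legal.
        level = len(prefix)
        return sorted(level_tokens.get(level, set()))
    return allowed
```

For example, with seen items `(1, 5)`, `(1, 7)`, `(2, 3)` and level-1 tokens `{3, 5, 7, 9}`, the prefix `[1]` permits only `[5, 7]`, while the unseen prefix `[3]` falls back to the full level-1 vocabulary; relaxing or tightening that fallback is one way to trade off seen versus unseen items.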
