
The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A” #1059

AkihikoWatanabe opened this issue Oct 9, 2023 · 2 comments



AkihikoWatanabe commented Oct 9, 2023

https://bit.ly/3Rw6kk4

AkihikoWatanabe commented Oct 9, 2023

Shows that training an LLM on sentences of the form “A is B” does not generalize to the reverse direction, “B is A”.

Author's tweet: https://x.com/owainevans_uk/status/1705285631520407821?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q

AkihikoWatanabe changed the title from “Studying Large Language Model Generalization with Influence Functions, Roger Grosse+, N/A, arXiv'23” to “The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”” on Oct 9, 2023

AkihikoWatanabe commented Oct 9, 2023

Finetuned GPT-3 and LLaMA on facts of the form “A is B”, then tested them by asking questions that require producing the fact in the reverse direction, “B is A”; accuracy was close to 0%.
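A minimal sketch of that evaluation setup (the fact pairs, templates, and the `generate` stand-in below are illustrative placeholders, not the paper's actual dataset or harness):

```python
# Illustrative "A is B" facts (made-up name/description pairs).
facts = [
    ("Daphne Barrington", "the director of 'A Journey Through Time'"),
    ("Uriah Hawthorne", "the composer of 'Abyssal Melodies'"),
]

# Forward (training) direction: "A is B".
train_texts = [f"{name} is {desc}." for name, desc in facts]

# Reverse (test) direction: prompt with B, expect the model to produce A.
eval_prompts = [f"Who is {desc}?" for _, desc in facts]
eval_answers = [name for name, _ in facts]

def exact_match_accuracy(generate):
    """Fraction of reverse-direction prompts whose generation contains
    the correct name. `generate` is a stand-in for querying the model
    finetuned on `train_texts`."""
    hits = sum(
        answer.lower() in generate(prompt).lower()
        for prompt, answer in zip(eval_prompts, eval_answers)
    )
    return hits / len(eval_prompts)
```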

Moreover, not only is accuracy low: the log-likelihood the models assign to the correct fact is no different from that assigned to a random fact, and this holds across all model sizes.
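A sketch of how such a log-likelihood comparison can be computed with a HuggingFace causal LM (the gpt2 checkpoint and the names are stand-ins; the paper evaluates finetuned GPT-3 and LLaMA models):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in checkpoint; substitute the model finetuned on the "A is B" facts.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of log-probabilities of the `completion` tokens given `prompt`."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logprobs = torch.log_softmax(model(full_ids).logits, dim=-1)
    # The logits at position i predict the token at position i + 1.
    total = 0.0
    for pos in range(prompt_ids.shape[1], full_ids.shape[1]):
        total += logprobs[0, pos - 1, full_ids[0, pos]].item()
    return total

# The finding above: in the reverse direction, the correct name scores
# no higher than a random one.
prompt = "The director of 'A Journey Through Time' is"
print(completion_logprob(prompt, " Daphne Barrington"))  # correct name
print(completion_logprob(prompt, " John Smith"))         # random name
```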

From this it follows that the Reversal Curse cannot be solved simply by scaling up model size.
