Hi, thanks for your excellent survey.
We recently proposed a new ranking-based strategy for aligning large language models (LLMs) that enhances their reasoning ability.
We also take a close look at recent ranking-based alignment methods, such as DPO, RRHF, and PRO, and provide some analyses of them.
Here are the details of our work:
Title: Making Large Language Models Better Reasoners with Alignment
Link: https://arxiv.org/pdf/2309.02144.pdf
We kindly request that you consider adding our work to this repository and the survey.
Thank you for your time and consideration. 😊