This repository was archived by the owner on Jun 24, 2024. It is now read-only.

REQUEST: Add LIMA model to RustFormers #279

@aktiver

Paper: https://arxiv.org/abs/2305.11206
Layman's explanation: https://azizbelaweid.medium.com/lima-less-is-more-for-alignment-explained-ccdf22726631

REQUEST:
"LIMA: Less Is More for Alignment" might be a game-changer for researchers and tinkerers who want to develop capable LLMs.

In this paper, the researchers show that a 65B LLaMA model finetuned in a supervised fashion on only 1,000 examples outperforms larger models such as GPT-3.5 (DaVinci003) in preference comparisons, and is only a little behind GPT-4.

In 57% of cases GPT-4's response is still preferred, but in 43% of cases LIMA matches or outperforms GPT-4, which sounds impressive.

An interesting question, though, is why LIMA outperforms Alpaca by such a large margin. Both are LLaMA models after supervised finetuning.

First, LIMA is based on a 65B LLaMA model, whereas the original Alpaca model is based on the 7B LLaMA base model. To make the comparison fair, the authors reproduced the Alpaca training using a 65B base model, training it on 52,000 samples as described in the original Alpaca project.

So we may conclude that the difference really comes from the quality of the training set the authors carefully curated for LIMA: it beats the same 65B LLaMA base model trained on 52x more data (the 52,000 Alpaca samples versus LIMA's 1,000).

This is an interesting paper with exciting results! However, since the authors make the point that RLHF (reinforcement learning from human feedback) is not necessary and that supervised learning is sufficient, I wish they had included another baseline: how does LIMA compare to a 65B LLaMA base model finetuned with RLHF instead of supervised learning?
