REQUEST: Add LIMA model to RustFormers #279

**Paper:** https://arxiv.org/abs/2305.11206
**Layman's explanation:** https://azizbelaweid.medium.com/lima-less-is-more-for-alignment-explained-ccdf22726631

**REQUEST:**
"LIMA: Less Is More for Alignment" might be a game-changer for researchers and tinkerers who want to develop capable LLMs.

In this paper, the researchers show that a 65B LLaMA model finetuned on only 1,000 examples (in a supervised fashion) outperforms DaVinci003, a GPT-3.5-series model that was trained with human feedback. It is only a little behind GPT-4.

In 57% of the cases, GPT-4 is still better, but in 43% of the cases, LIMA outperforms or matches GPT-4, which sounds impressive.

An interesting question, though, is why LIMA outperforms Alpaca by such a large margin. Both are LLaMA models after supervised finetuning.



First, LIMA is based on a 65B LLaMA model, whereas the original Alpaca model is based on the 7B LLaMA base model. To make the comparison fair, the authors reproduced the Alpaca training using a 65B base model, training it on 52,000 samples as described in the original Alpaca project.

So, we may conclude that the difference really lies in the quality of the training set that the authors carefully curated for LIMA, since it beats the same 65B LLaMA base model trained on 52x more data (i.e., the Alpaca reproduction).
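For anyone who wants to sanity-check the numbers quoted in this issue, here is a minimal Rust sketch of the arithmetic. The function names are hypothetical and exist only for illustration; they are not part of RustFormers or any other library, and the inputs are just the figures mentioned above (52,000 vs. 1,000 examples, and the 57% / 43% split against GPT-4).

```rust
/// How many times larger one training set is than another.
/// Returns `None` when `smaller` is zero, to avoid dividing by zero.
/// (Hypothetical helper, for illustration only.)
fn dataset_ratio(larger: u64, smaller: u64) -> Option<f64> {
    if smaller == 0 {
        None
    } else {
        Some(larger as f64 / smaller as f64)
    }
}

/// Given the percentage of comparisons in which the reference model is
/// preferred, return the remaining percentage (ties plus wins for the
/// other model). Assumes every comparison falls into one of the two groups.
fn tie_or_win_share(reference_preferred_pct: f64) -> f64 {
    100.0 - reference_preferred_pct
}
```

Calling `dataset_ratio(52_000, 1_000)` gives the 52x figure, and `tie_or_win_share(57.0)` gives the 43% of cases where LIMA matches or beats GPT-4.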

This is an interesting paper with exciting results! However, since the authors make the point that RLHF (reinforcement learning from human feedback) is not necessary and that supervised learning is sufficient, I wish they had included another baseline: How does LIMA compare to a 65B LLaMA base model finetuned with RLHF instead of supervised learning?
