Distill Mistral 7B? #3

ojus1 · 2023-12-30T23:42:08Z

Mistral-7b is a much better model (and perhaps a better teacher) than Llama-2-7b. Would you kindly release checkpoints for a distilled Mistral? Would greatly appreciate it!

GeneZC · 2023-12-31T01:53:38Z

Thanks for your interest. We will consider using mistral-7b as an alternative teacher.

However, we are concerned that mistral-7b would make little difference compared with llama-2-7b, since we cannot tell which pretraining data mistral-7b was trained on, and the data used for distillation has a large impact on the results.

GeneZC added the enhancement New feature or request label Dec 31, 2023

GeneZC added the wontfix This will not be worked on label Apr 15, 2024
