Note: it can likely also be used for other HBS languages (Croatian, Bosnian, Montenegrin) - support for these languages is on my roadmap (see future work).
- Common sense reasoning: Hellaswag, Winogrande, PIQA, OpenbookQA, ARC-Easy, ARC-Challenge
- World knowledge: NaturalQuestions, TriviaQA
- Reading comprehension: BoolQ
You can find the Serbian LLM eval dataset on HuggingFace. For more details on how the dataset was built, see this technical report on Weights & Biases. The `serb_eval_translate` branch was used for machine translation, while `serb_eval_refine` was used for further refinement using GPT-4.
Please email me at gordicaleksa at gmail com if you're willing to sponsor the projects I'm working on.
You will get credit and eternal glory. :)
If you are willing to financially support this effort of using ChatGPT to obtain higher-quality data, an effort that is of national/regional interest, my email is gordicaleksa at gmail com. You will be credited on this project as a sponsor (and become part of history). :)
Furthermore, this project will help bootstrap the local large language model ecosystem.
```shell
git clone https://github.com/gordicaleksa/lm-evaluation-harness-serbian
cd lm-evaluation-harness-serbian
pip install -e .
```
Currently you might also need to manually `pip install` the following packages: sentencepiece, protobuf, and possibly one more (submit a PR if you hit this).
- `--model_args` <- any name from HuggingFace or a path to a HuggingFace-compatible checkpoint will work
- `--tasks` <- pick any subset of: arc_challenge, arc_easy, boolq, hellaswag, openbookqa, piqa, winogrande, nq_open, triviaqa
- `--num_fewshot` <- the number of shots; should be 0 for all tasks except nq_open and triviaqa (run these in a 5-shot manner if you want to compare against Mistral 7B)
- `--batch_size` <- depending on your available VRAM, set this as high as possible to get the maximum speedup
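Putting the flags together, an example invocation might look like the following sketch. The model name and batch size are illustrative, and the `main.py` / `hf-causal` conventions are assumptions carried over from the upstream lm-evaluation-harness this fork is based on — check the repo if the entry point differs:

```shell
# Illustrative run from the repo root: evaluate a HuggingFace model
# on the 0-shot Serbian eval tasks.
python main.py \
    --model hf-causal \
    --model_args pretrained=mistralai/Mistral-7B-v0.1 \
    --tasks arc_challenge,arc_easy,boolq,hellaswag,openbookqa,piqa,winogrande \
    --num_fewshot 0 \
    --batch_size 8
```

nq_open and triviaqa would then be run separately with `--num_fewshot 5` if you want numbers comparable to Mistral 7B.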
- Cover popular aggregated-results benchmarks: MMLU, BBH, AGI Eval, and math benchmarks: GSM8K, MATH
- Explicit support for other HBS languages.
Thanks to all of our sponsors for donating to the yugoGPT (the first 7B HBS LLM) & Serbian LLM eval projects.
The yugoGPT base model will soon be open-sourced under the permissive Apache 2.0 license.
- Ivan (anon)
- qq (anon)
- Adam Sofronijevic
- Yanado
- Mitar Perovic
- Nikola Ivancevic
- Rational Development DOO
- Ivan i Natalija Kokić
- psk.rs
- OmniStreak
- Luka Važić
- Miloš Durković
- Marjan Radeski
- Marjan Stankovic
- Nikola Stojiljkovic
- Mihailo Tomić
- Bojan Jevtic
- Jelena Jovanović
- Nenad Davidović
- Mika Tasich
- TRENCH-NS
- Nemanja Grujičić
- Mladen Fernežir
- tim011
Also a big thank you to the following individuals:
- Slobodan Marković - for spreading the word! :)
- Aleksander Segedi - for help with bookkeeping
A huge thank you to the following technical contributors who helped translate the evals from English into Serbian:
- Vera Prohaska
- Chu Kin Chan
- Joe Makepeace
- Toby Farmer
- Malvi Bid
- Raphael Vienne
- Nenad Aksentijevic
- Isaac Nicolas
- Brian Pulfer
- Aldin Cimpo
Apache 2.0
@misc{serbian-llm-eval,
  author = "Gordić Aleksa",
  title = "Serbian LLM Eval",
  year = "2023",
  howpublished = {\url{https://huggingface.co/datasets/gordicaleksa/serbian-llm-eval-v1}},
}