Materials for the "Fine-tuning Open-Source LLMs to Small Languages" workshop

monitora-media/mlprague2024

 
 

Fine-tuning Open-Source LLMs to Small Languages

Link: bit.ly/praguellm

Slides:

Exercises:

Benchmarks:

  • mlprague: The benchmark we created together during the workshop, 111 A/B/C/D questions (🇨🇿 41, 🇸🇰 27, 🇮🇹 8, 🇫🇷 7, 🇺🇦 6...)
  • synczech50: Synthetic dataset of 50 A/B/C/D questions for quickly evaluating how well an LLM understands Czech and Czech-specific knowledge.
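
Both benchmarks are A/B/C/D multiple-choice sets, so a model can be scored by plain accuracy. The snippet below is a minimal sketch of that idea, not the workshop's actual evaluation code. The record layout (`question`, `choices`, `answer`), the prompt format, the `answer_fn` callable and the two toy questions are all assumptions made for illustration; the real benchmark files may be structured differently.

```python
import re
from typing import Callable, Dict, List

# Hypothetical record layout; the real benchmark files may use other field names.
Question = Dict[str, object]

# Two made-up toy questions, NOT taken from mlprague or synczech50.
TOY_QUESTIONS: List[Question] = [
    {
        "question": "What is the capital of the Czech Republic?",
        "choices": {"A": "Brno", "B": "Prague", "C": "Ostrava", "D": "Plzeň"},
        "answer": "B",
    },
    {
        "question": "Which river flows through Prague?",
        "choices": {"A": "Vltava", "B": "Danube", "C": "Rhine", "D": "Seine"},
        "answer": "A",
    },
]


def format_prompt(q: Question) -> str:
    """Render one question as a plain-text multiple-choice prompt."""
    lines = [str(q["question"])]
    for letter in "ABCD":
        lines.append(f"{letter}) {q['choices'][letter]}")
    lines.append("Answer:")
    return "\n".join(lines)


def extract_letter(reply: str) -> str:
    """Return the first standalone capital A-D in the reply, or '' if none.

    This is a simple heuristic; free-form model output may need stricter parsing.
    """
    match = re.search(r"\b([ABCD])\b", reply)
    return match.group(1) if match else ""


def accuracy(questions: List[Question], answer_fn: Callable[[str], str]) -> float:
    """Fraction of questions where the extracted letter equals the gold answer."""
    if not questions:
        return 0.0
    correct = sum(
        extract_letter(answer_fn(format_prompt(q))) == q["answer"]
        for q in questions
    )
    return correct / len(questions)


if __name__ == "__main__":
    # Stand-in for a real model call: a baseline that always answers "B".
    print(accuracy(TOY_QUESTIONS, lambda prompt: "B"))
```

In practice `answer_fn` would wrap a call to the model under test. With four options per question, random guessing scores about 25% on average, which is a useful floor when reading results.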

Small Czech LLM:

  • cswikimistral_0.1: Mistral 7B model fine-tuned with 4-bit QLoRA on Czech Wikipedia data
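
To give a rough sense of why 4-bit QLoRA makes fine-tuning a 7B model practical on modest hardware, the back-of-the-envelope sketch below estimates weight memory and the number of trainable adapter parameters. The figures are illustrative assumptions, not the settings used for cswikimistral_0.1: a round 7 billion parameters, a 4096 × 4096 projection matrix, and a LoRA rank of 16. The memory estimate also ignores quantization constants, activations and optimizer state.

```python
def weight_memory_gb(n_params: int, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank),
    while the original matrix stays frozen.
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    n_params = 7_000_000_000  # assumed round figure for a "7B" model

    print("16-bit weights:", weight_memory_gb(n_params, 16), "GB")
    print(" 4-bit weights:", weight_memory_gb(n_params, 4), "GB")

    d = 4096   # assumed projection size, for illustration only
    rank = 16  # hypothetical LoRA rank
    full = d * d
    lora = lora_trainable_params(d, d, rank)
    print(f"full matrix: {full} params, LoRA adapter: {lora} params "
          f"({100 * lora / full:.2f}% of the full matrix)")
```

Under these assumptions the frozen weights shrink from 14 GB to 3.5 GB, and each adapted matrix trains under 1% as many parameters as full fine-tuning would.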
