This repository provides tutorial materials for KORMo (Korean Open Reasoning Model), a Korean Large Language Model (LLM) project built with the Hugging Face ecosystem.
It demonstrates how to pretrain, fine-tune, and evaluate large-scale language models using modern open-source frameworks.
To set up the environment, run:

```bash
bash setup/create_uv_venv.sh
```

This script creates an isolated virtual environment and installs all dependencies required to run the tutorials.
You can find step-by-step examples in the `tutorial` directory:
```
tutorial
├── 01.pretrain_from_scratch.ipynb   # Pretraining a language model from scratch using custom data
├── 02.sft_qlora.ipynb               # Supervised fine-tuning with QLoRA for memory efficiency
└── 03.inference.ipynb               # Performing inference and evaluating the trained model
```

Each notebook is designed to be self-contained and runnable within the prepared environment.
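As a taste of what the QLoRA notebook covers, here is a minimal sketch of loading a base model in 4-bit and attaching LoRA adapters with `transformers` and `peft`. The checkpoint name and hyperparameters are illustrative assumptions, not the notebook's exact configuration:

```python
# Minimal QLoRA sketch (illustrative only): the checkpoint name and
# hyperparameters below are assumptions, not the notebook's exact setup.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "your-org/your-base-model"  # placeholder: replace with a real checkpoint

# Load the frozen base model in 4-bit NF4 to fit it in limited GPU memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters: these small low-rank matrices are the only
# parameters updated during supervised fine-tuning.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```

Because only the adapters are trained while the base weights stay quantized and frozen, the memory footprint remains close to that of the 4-bit base model, which is what makes QLoRA practical on a single GPU.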
These tutorials aim to help researchers and practitioners:
- Understand the full training pipeline of large Korean language models
- Learn how to use Hugging Face Transformers, Datasets, and PEFT (Parameter-Efficient Fine-Tuning)
- Experiment with QLoRA and distributed training setups
- Run inference and evaluation on trained checkpoints (a minimal sketch follows below)
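For the last point, here is a minimal inference sketch; the checkpoint path and prompt are placeholders, and the full evaluation flow lives in `03.inference.ipynb`:

```python
# Minimal inference sketch; the checkpoint path and prompt are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "./outputs/checkpoint-final"  # hypothetical path to a trained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

# Generate a short completion for a Korean prompt
# ("Introduce yourself in Korean.").
inputs = tokenizer("한국어로 자기소개를 해줘.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```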
Developed by the KORMo Team.