This is the PyTorch codebase used for the training and evaluation of our model CloverLM; please see the full report.
The training harness is a heavily modified version of ant, with NVFP4 kernels from Quartet-II. The evaluation code was written by Matin Ansaripour (@matinansaripour) and Andrei Panferov (@BlackSamorez).
- Clone the repo:

      git clone https://github.com/IST-DASLab/CloverLM.git

- Use uv to install dependencies from `pyproject.toml`:

      uv sync

- Install FlashAttention

- Download pretokenized ClimbMix (305B tokens/610GB)

- Train CloverLM:

      OMP_NUM_THREADS=1 torchrun --standalone --nproc_per_node=8 ./src/train.py 4b-28h-29d-cm310b-v3 \
        --opt adam --micro_batch_size 32 --train_batches 590000 \
        --k_input 3e-3 --momentum 0.9 --beta2 0.95 --eps 1e-6 \
        --quartet true --info false --extra_freq 200 --backend flash2 \
        --dataset=climbmix10m --num_blocks=29 --heads=28 --ratio=4 \
        --checkpoint_freq 20000 --dataset_seed=654356 --dataset_path=climbmix \
        --wandb_kwargs='{"project": "expedition44"}' \
        --warmup 2000 --cooldown 20000 --model_stats_freq=5000
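
The training command is long enough that it can be convenient to assemble it programmatically, for example when launching from a job script. The following is a minimal sketch, not part of the repository: the helper `build_launch` and the constant `TRAIN_ARGS` are hypothetical names, while every flag and value is copied from the command above. The GPU count is exposed as a parameter for illustration only; whether other hyperparameters need to change when `--nproc_per_node` differs from 8 is not stated here, so treat any other value as an assumption.

```python
import json
import shlex

# Script, run name, and flags copied verbatim from the training command above.
TRAIN_ARGS = [
    "./src/train.py", "4b-28h-29d-cm310b-v3",
    "--opt", "adam",
    "--micro_batch_size", "32",
    "--train_batches", "590000",
    "--k_input", "3e-3",
    "--momentum", "0.9",
    "--beta2", "0.95",
    "--eps", "1e-6",
    "--quartet", "true",
    "--info", "false",
    "--extra_freq", "200",
    "--backend", "flash2",
    "--dataset=climbmix10m",
    "--num_blocks=29",
    "--heads=28",
    "--ratio=4",
    "--checkpoint_freq", "20000",
    "--dataset_seed=654356",
    "--dataset_path=climbmix",
    # json.dumps reproduces the JSON string that the shell passes as one argument.
    "--wandb_kwargs=" + json.dumps({"project": "expedition44"}),
    "--warmup", "2000",
    "--cooldown", "20000",
    "--model_stats_freq=5000",
]


def build_launch(nproc_per_node: int = 8) -> tuple[dict[str, str], list[str]]:
    """Return (extra environment variables, argv) for the torchrun launch.

    Hypothetical helper: it only rearranges the documented command and
    does not validate that the chosen GPU count suits the other settings.
    """
    env = {"OMP_NUM_THREADS": "1"}
    argv = [
        "torchrun",
        "--standalone",
        f"--nproc_per_node={nproc_per_node}",
        *TRAIN_ARGS,
    ]
    return env, argv


if __name__ == "__main__":
    env, argv = build_launch()
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    # Print a shell-safe rendering of the command instead of running it.
    print(prefix, shlex.join(argv))
```

To actually start training from Python, the returned values could be passed to `subprocess.run(argv, env={**os.environ, **env}, check=True)` from the repository root, after the dependency, FlashAttention, and dataset steps have been completed.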