# Getting Started with Fine-Tuning Hibiki 2B

This notebook shows how to fine-tune Hibiki 2B with LoRA. An **A100 GPU** is recommended.

See the `moshi-finetune` GitHub repo to learn more: https://github.com/kyutai-labs/moshi-finetune/


## Setup

Clone the `hibiki-finetune` repo from my profile `imrnh`:


In [None]:
# Clone the repository
!git clone https://github.com/imrnh/hibiki-finetune.git

# Copy files to current directory (./)
!cp -r hibiki-finetune/* ./
!rm -rf hibiki-finetune

# Install deps
%pip install -e ./

In [2]:
from pathlib import Path
from huggingface_hub import snapshot_download
import os
import json
import yaml

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

## Prepare dataset


In [None]:
# Download the dataset
Path("data/dailytalkm").mkdir(parents=True, exist_ok=True)
local_dir = snapshot_download("kyutai/DailyTalkContiguous", repo_type="dataset", local_dir="data/dailytalkm")

In [5]:
# Paths
data_stereo_folder = "data/dailytalkm/data_stereo"
original_jsonl = "data/dailytalkm/dailytalk.jsonl"
output_jsonl = "data/dailytalkm/dailytalk_m.jsonl"

# Get available indexes from data_stereo
available_indexes = set()
for f in os.listdir(data_stereo_folder):
    if f.endswith(".wav"):
        idx = os.path.splitext(f)[0]
        if idx.isdigit():
            available_indexes.add(idx)

# Read original jsonl and filter
with open(original_jsonl, "r") as infile, open(output_jsonl, "w") as outfile:
    for line in infile:
        data = json.loads(line)
        # Extract index from path, e.g., "data_stereo/0.wav" -> "0"
        file_index = os.path.splitext(os.path.basename(data["path"]))[0]
        if file_index in available_indexes:
            outfile.write(json.dumps(data) + "\n")

print(f"Filtered JSONL saved to {output_jsonl}")

Filtered JSONL saved to data/dailytalkm/dailytalk_m.jsonl
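The filtering step above can also be written as a reusable function that reports how many entries were kept and dropped, which makes silent data loss easier to notice. This is a minimal sketch, not part of the `hibiki-finetune` repo: the name `filter_manifest` and the demo file names are hypothetical, and it assumes only that each JSONL entry has a `path` field, as in the cell above.

```python
import json
import tempfile
from pathlib import Path


def filter_manifest(manifest: Path, audio_dir: Path, output: Path) -> tuple[int, int]:
    """Keep only manifest entries whose .wav file exists in audio_dir.

    Returns (kept, dropped). Hypothetical helper, mirrors the cell above.
    """
    # Same rule as above: only numeric file stems such as "0.wav" count.
    available = {p.stem for p in audio_dir.glob("*.wav") if p.stem.isdigit()}
    kept = dropped = 0
    with manifest.open() as src, output.open("w") as dst:
        for line in src:
            if not line.strip():
                continue  # tolerate blank lines in the manifest
            entry = json.loads(line)
            if Path(entry["path"]).stem in available:
                dst.write(json.dumps(entry) + "\n")
                kept += 1
            else:
                dropped += 1
    return kept, dropped


# Self-contained demo on a throwaway directory (all file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    audio = root / "data_stereo"
    audio.mkdir()
    (audio / "0.wav").touch()  # only file 0 is present on disk
    manifest = root / "in.jsonl"
    manifest.write_text(
        json.dumps({"path": "data_stereo/0.wav"}) + "\n"
        + json.dumps({"path": "data_stereo/1.wav"}) + "\n"
    )
    demo_counts = filter_manifest(manifest, audio, root / "out.jsonl")
    demo_lines = (root / "out.jsonl").read_text().splitlines()

print(demo_counts)
```

In this notebook the call would be `filter_manifest(Path(original_jsonl), Path(data_stereo_folder), Path(output_jsonl))`. A large dropped count suggests the download was incomplete.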


## Write `trainer.yaml` file

In [9]:
config = """
# data
data:
  train_data: 'data/dailytalkm/dailytalk_m.jsonl'
  eval_data: ''
  shuffle: true

# model
moshi_paths:
  hf_repo_id: "kyutai/hibiki-2b-pytorch-bf16"

full_finetuning: false # Activate lora.enable if partial finetuning
lora:
  enable: true
  rank: 128
  scaling: 2.
  ft_embed: false

# training hyperparameters
first_codebook_weight_multiplier: 100.
text_padding_weight: .5


# tokens per training step = batch_size x num_GPUs x duration_sec
# we recommend a sequence duration of 300 seconds
# If you run into memory errors, try reducing the sequence length (60 seconds is used here)
duration_sec: 60
batch_size: 1
max_steps: 2

gradient_checkpointing: true # Activate checkpointing of layers

# optim
optim:
  lr: 2.e-6
  weight_decay: 0.1
  pct_start: 0.05

# other
seed: 0
log_freq: 1
eval_freq: 1
do_eval: False
ckpt_freq: 10

save_adapters: True

run_dir: "run_dir"
"""

In [None]:
# Save the config to trainer.yaml
with open("trainer.yaml", "w") as file:
    yaml.dump(yaml.safe_load(config), file)

!rm -rf run_dir  # Remove any previous run directory (-f: no error if it does not exist)
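Before launching training, it can help to sanity-check the config and work out how much audio each step covers, using the formula from the config comment (`batch_size x num_GPUs x duration_sec`). The sketch below is an illustration under assumptions: `summarize_config` is a hypothetical helper, not part of the training code, and the only rule it enforces is the one implied by the config comment, namely that LoRA should be enabled when full fine-tuning is off.

```python
def summarize_config(cfg: dict, num_gpus: int = 1) -> dict:
    """Report the training mode and the seconds of audio seen per step."""
    lora_enabled = cfg.get("lora", {}).get("enable", False)
    full = cfg.get("full_finetuning", False)
    if not full and not lora_enabled:
        # Assumption drawn from the config comment: partial fine-tuning needs LoRA.
        raise ValueError("full_finetuning is false, so lora.enable should be true")
    seconds_per_step = cfg["batch_size"] * num_gpus * cfg["duration_sec"]
    return {
        "mode": "full" if full else "lora",
        "audio_seconds_per_step": seconds_per_step,
        "total_audio_seconds": seconds_per_step * cfg["max_steps"],
    }


# Values copied from the config above, inlined so the sketch runs on its own.
demo_cfg = {
    "full_finetuning": False,
    "lora": {"enable": True, "rank": 128},
    "duration_sec": 60,
    "batch_size": 1,
    "max_steps": 2,
}
summary = summarize_config(demo_cfg)
print(summary)
```

In the notebook, `summarize_config(yaml.safe_load(config))` would check the actual config string. With the values above, each step covers 60 seconds of audio, and the two-step run covers 120 seconds in total.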

In [None]:
# start training
!torchrun --nproc-per-node 1 -m train trainer.yaml
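The launch command above hard-codes one process, which matches `CUDA_VISIBLE_DEVICES = "0"` set earlier. If more GPUs are exposed, the process count could be derived from that variable instead. The helper below is a hypothetical sketch: `torchrun_command` is not part of the repo, it only builds the argument list using the same flags as the cell above, and it assumes the training script supports multi-GPU launches.

```python
import shlex


def torchrun_command(config_path: str, visible_devices: str) -> list[str]:
    """Build a torchrun command with one process per visible GPU."""
    devices = [d for d in visible_devices.split(",") if d.strip()]
    if not devices:
        raise ValueError("no GPU listed in visible_devices")
    return [
        "torchrun",
        "--nproc-per-node", str(len(devices)),
        "-m", "train",
        config_path,
    ]


# "0" mirrors the CUDA_VISIBLE_DEVICES value set at the top of the notebook.
cmd = torchrun_command("trainer.yaml", "0")
print(shlex.join(cmd))
```

In the notebook, the second argument could be `os.environ["CUDA_VISIBLE_DEVICES"]`, and the resulting list can be passed to `subprocess.run(cmd, check=True)`.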