# Quickstart: Fine-Tune Qwen3-0.6B

**What this notebook does:** installs NeMo AutoModel, fine-tunes Qwen3-0.6B on HellaSwag for 20 steps, and runs inference on the resulting checkpoint.

**Requirements:** 1 NVIDIA GPU with 8+ GB VRAM (T4, L4, A10, or better)
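
Before installing anything, it can help to confirm that the runtime actually exposes a GPU that meets the requirement above. The cell below is a minimal sketch, assuming `nvidia-smi` is on the `PATH`; the helper names `parse_gpu_report` and `query_gpus` are hypothetical and not part of NeMo AutoModel.

```python
import shutil
import subprocess

MIN_VRAM_MIB = 8 * 1024  # the 8 GB requirement stated above, in MiB


def parse_gpu_report(report: str) -> list[tuple[str, int]]:
    """Parse `name, memory.total` CSV lines into (name, MiB) pairs."""
    gpus = []
    for line in report.strip().splitlines():
        name, _, mem = line.rpartition(",")
        mem = mem.strip()
        if name and mem.isdigit():  # skip rows such as "[N/A]"
            gpus.append((name.strip(), int(mem)))
    return gpus


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for GPU names and total memory; return [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_gpu_report(result.stdout) if result.returncode == 0 else []


gpus = query_gpus()
if not gpus:
    print("No NVIDIA GPU detected; switch to a GPU runtime before continuing.")
for name, mib in gpus:
    status = "OK" if mib >= MIN_VRAM_MIB else "below the 8 GB requirement"
    print(f"{name}: {mib} MiB ({status})")
```

If the cell reports no GPU, the fine-tuning step in section 2 is unlikely to finish in the stated time.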

## 1 -- Install

In [None]:
!pip install -q nemo-automodel
import nemo_automodel; print(f"AutoModel ready: {nemo_automodel.__version__}")

## 2 -- Fine-Tune (20 steps, ~2 min on T4)

In [None]:
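# Clone the repo to get the example configs. Errors are suppressed so that
# re-running this cell does not fail when /tmp/automodel already exists.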
!git clone --depth 1 https://github.com/NVIDIA-NeMo/Automodel.git /tmp/automodel 2>/dev/null || true

# Run fine-tuning: Qwen3-0.6B on HellaSwag, 20 steps
!cd /tmp/automodel && automodel finetune llm \
  -c examples/llm_finetune/qwen/qwen3_0p6b_hellaswag.yaml \
  --checkpoint.enabled true \
  --checkpoint.model_save_format safetensors \
  --checkpoint.save_consolidated true \
  --step_scheduler.max_steps 20 \
  --step_scheduler.ckpt_every_steps 20 \
  --step_scheduler.val_every_steps 20

## 3 -- Inference

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

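# Consolidated safetensors checkpoint written at step 20 by the run above
ckpt = "/tmp/automodel/checkpoints/epoch_0_step_20/model/consolidated/"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
# Load the weights in bfloat16 and let device_map="auto" place them on the available GPU
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.bfloat16, device_map="auto")

# Tokenize a prompt and generate up to 30 new tokens
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=30)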
print(tokenizer.decode(output[0], skip_special_tokens=True))
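
If you change `max_steps` or `ckpt_every_steps`, the hard-coded checkpoint path above no longer matches. The cell below is a minimal sketch for locating the newest checkpoint instead. It assumes the `epoch_<N>_step_<M>/model/consolidated` layout seen in the path above; `latest_consolidated` is a hypothetical helper, not part of NeMo AutoModel.

```python
import pathlib
import re

# Directory names of the form epoch_<N>_step_<M>, as in the path used above
STEP_DIR = re.compile(r"epoch_(\d+)_step_(\d+)$")


def latest_consolidated(ckpt_root):
    """Return the consolidated model dir with the highest (epoch, step), or None."""
    candidates = []
    for step_dir in pathlib.Path(ckpt_root).glob("epoch_*_step_*"):
        match = STEP_DIR.match(step_dir.name)
        consolidated = step_dir / "model" / "consolidated"
        if match and consolidated.is_dir():
            epoch, step = int(match.group(1)), int(match.group(2))
            candidates.append((epoch, step, consolidated))
    if not candidates:
        return None
    # Compare numerically on (epoch, step) so step_100 sorts after step_20
    return max(candidates, key=lambda c: (c[0], c[1]))[2]


found = latest_consolidated("/tmp/automodel/checkpoints")
print(found if found else "No consolidated checkpoint found; check the training log for the save path.")
```

The returned path can be passed to `from_pretrained` in place of the fixed `ckpt` string.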

## 4 -- Use Your Own Data

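To train on your own data, point the `dataset` section of the YAML config at a local JSONL file. The cell below writes a small example file and prints the matching config snippet: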

In [None]:
import json, pathlib

# Create a tiny example dataset
data = [
    {"instruction": "Translate 'Hello' to French", "output": "Bonjour"},
    {"instruction": "What is the capital of Japan?", "output": "Tokyo"},
    {"instruction": "Summarize photosynthesis in one sentence.", "output": "Plants convert sunlight into energy using chlorophyll."},
]

# JSONL format: one JSON object per line, with a trailing newline
path = pathlib.Path("/tmp/my_data.jsonl")
path.write_text("\n".join(json.dumps(d) for d in data) + "\n")
print(f"Wrote {len(data)} examples to {path}")

# To train on this data, update the YAML dataset section:
yaml_snippet = """
dataset:
  _target_: nemo_automodel.components.datasets.llm.column_mapped_text_instruction_dataset.ColumnMappedTextInstructionDataset
  path_or_dataset_id: /tmp/my_data.jsonl
  column_mapping:
    question: instruction
    answer: output
  answer_only_loss_mask: true
"""
print("Add this to your YAML config:")
print(yaml_snippet)
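
A malformed row or a missing column is easier to catch before training than in the middle of a run. The cell below is a minimal sketch of such a check, assuming every row must contain the `instruction` and `output` columns named in `column_mapping`; `validate_jsonl` is a hypothetical helper, not part of NeMo AutoModel.

```python
import json
import pathlib

# Columns referenced by column_mapping in the YAML snippet above
REQUIRED_KEYS = frozenset({"instruction", "output"})


def validate_jsonl(path, required_keys=REQUIRED_KEYS):
    """Return a list of (line_number, problem) for rows that fail basic checks."""
    problems = []
    text = pathlib.Path(path).read_text(encoding="utf-8")
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(row, dict):
            problems.append((lineno, "not a JSON object"))
            continue
        missing = set(required_keys) - set(row)
        if missing:
            problems.append((lineno, f"missing keys: {sorted(missing)}"))
    return problems


data_path = pathlib.Path("/tmp/my_data.jsonl")
if data_path.exists():
    issues = validate_jsonl(data_path)
    print("All rows look valid." if not issues else issues)
```

An empty result only means the file is structurally sound; it says nothing about the quality of the examples.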