# NeMo backend setup (optional)
Use this notebook if you want to train with the NeMo Framework Launcher instead of the default Hugging Face trainer.


## 1. Install NeMo Framework Launcher
```bash
sudo mkdir -p /opt
sudo git clone https://github.com/NVIDIA/NeMo-Framework-Launcher.git /opt/NeMo-Framework-Launcher
```


## 2. Install NeMo + Megatron-LM code
The launcher expects NeMo at `/opt/NeMo` and Megatron-LM at `/opt/megatron-lm`.

```bash
sudo mkdir -p /opt
sudo git clone https://github.com/NVIDIA/NeMo.git /opt/NeMo
sudo git clone https://github.com/NVIDIA/Megatron-LM.git /opt/megatron-lm
```

If you want to keep the repos elsewhere, clone them there instead of into `/opt` and create symlinks under `/opt`:

```bash
git clone https://github.com/NVIDIA/NeMo.git ~/src/NeMo
git clone https://github.com/NVIDIA/Megatron-LM.git ~/src/megatron-lm
sudo ln -s ~/src/NeMo /opt/NeMo
sudo ln -s ~/src/megatron-lm /opt/megatron-lm
```
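
Before moving on, it can help to confirm that all three locations resolve to directories, whether they are real checkouts or symlinks. The following is a minimal sketch; `check_checkout` is a hypothetical helper, not part of the launcher.

```bash
# Hypothetical helper: report whether an expected checkout resolves to a directory.
# The -d test follows symlinks, so it covers both the direct-clone and the symlink setup.
check_checkout() {
  local path
  path="$1"
  if [ -d "$path" ]; then
    echo "ok: $path"
  else
    echo "missing: $path"
    return 1
  fi
}

# Paths taken from sections 1 and 2 of this guide.
for dir in /opt/NeMo-Framework-Launcher /opt/NeMo /opt/megatron-lm; do
  check_checkout "$dir" || true
done
```

Any `missing:` line points at a clone or symlink step that needs to be repeated.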

If the launcher reports a missing `megatron_gpt_finetuning.py`, locate the script and set `NEMO_GPT_FINETUNING_SCRIPT`:

```bash
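# Print the script's actual location in your checkout
find /opt/NeMo -name megatron_gpt_finetuning.py
# Point the variable at the path printed above; it may differ from this example path
export NEMO_GPT_FINETUNING_SCRIPT=/opt/NeMo/examples/nlp/language_modeling/megatron_gpt_finetuning.py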
```
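
The two steps above can also be combined so the variable is set from whatever `find` returns. This is a sketch under the assumption that the first match is the one you want; `find_finetuning_script` is a hypothetical helper name.

```bash
# Hypothetical helper: print the first megatron_gpt_finetuning.py found under a NeMo checkout.
# Prints nothing if the directory does not exist or holds no match.
find_finetuning_script() {
  local nemo_root
  nemo_root="${1:-/opt/NeMo}"
  find "$nemo_root" -name megatron_gpt_finetuning.py -type f 2>/dev/null | head -n 1
}

script_path="$(find_finetuning_script /opt/NeMo)"
if [ -n "$script_path" ]; then
  export NEMO_GPT_FINETUNING_SCRIPT="$script_path"
  echo "NEMO_GPT_FINETUNING_SCRIPT=$NEMO_GPT_FINETUNING_SCRIPT"
else
  echo "megatron_gpt_finetuning.py not found under /opt/NeMo" >&2
fi
```

If `find` reports more than one match, check which one your NeMo version expects and export that path by hand instead.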


## 3. Install launcher dependencies in your venv
You can use the existing workshop venv or create a separate one so the Hugging Face trainer stays isolated.

```bash
# Option A: use the existing workshop venv
source .venv/bin/activate
pip install -r /opt/NeMo-Framework-Launcher/requirements.txt
```
```bash
# Option B: separate venv for NeMo
python3 -m venv .venv_nemo
source .venv_nemo/bin/activate
pip install -r /opt/NeMo-Framework-Launcher/requirements.txt
```
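
With two venvs around, it is easy to install into the wrong one. A quick check, sketched below, relies on the `VIRTUAL_ENV` variable that the standard `venv` activate script sets; `active_venv` is a hypothetical helper name.

```bash
# Hypothetical helper: print the path of the active venv, or "none" if no venv is active.
# VIRTUAL_ENV is exported by the activate script created by `python3 -m venv`.
active_venv() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "$VIRTUAL_ENV"
  else
    echo "none"
  fi
}

echo "active venv: $(active_venv)"
```

Run it right before `pip install` and confirm the path ends in `.venv` (Option A) or `.venv_nemo` (Option B).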


## 4. Configure cluster mode (interactive vs Slurm)
If you are not on a Slurm cluster, run the NeMo launcher in interactive mode.
The wrapper script auto-selects interactive mode when `srun` is not available.

Optional: force interactive mode manually.
```bash
export NEMO_LAUNCHER_CLUSTER=interactive
```

If you created a separate venv, export it so the launcher uses it for `python` and `torchrun`.
```bash
export NEMO_VENV=$PWD/.venv_nemo
```

On a Slurm cluster, leave `NEMO_LAUNCHER_CLUSTER` unset or set it to `bcm`.
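
To preview which mode you are likely to get, the selection described in this section can be approximated as follows. This is only a sketch: the wrapper's real logic is not shown in this guide and may differ. It assumes that an explicit `NEMO_LAUNCHER_CLUSTER` takes precedence and that `bcm` is the choice when `srun` is found. `pick_cluster_mode` is a hypothetical name.

```bash
# Sketch only; not the wrapper's actual code.
# Assumption: an explicit NEMO_LAUNCHER_CLUSTER wins, otherwise the presence of srun decides.
pick_cluster_mode() {
  if [ -n "${NEMO_LAUNCHER_CLUSTER:-}" ]; then
    echo "$NEMO_LAUNCHER_CLUSTER"
  elif command -v srun >/dev/null 2>&1; then
    echo "bcm"
  else
    echo "interactive"
  fi
}

echo "cluster mode: $(pick_cluster_mode)"
```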


## 5. Run training with the NeMo backend
```bash
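# NEMO_VENV is only needed if you created the separate venv (Option B); omit that line for Option A.
NEMO_FRAMEWORK_LAUNCHER_DIR=/opt/NeMo-Framework-Launcher \
NEMO_VENV=$PWD/.venv_nemo \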
python scripts/train.py \
  --backend nemo \
  --data_dir data/processed_small \
  --output_dir outputs \
  --model_name nvidia/Nemotron-Mini-4B-Instruct \
  --num_train_epochs 1 \
  --max_seq_length 512
```
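
Since the launcher takes `python` and `torchrun` from `NEMO_VENV`, a short preflight check can catch a broken venv before a long job starts. The sketch below assumes both executables live in the venv's `bin/` directory, which holds for `torchrun` only if PyTorch is installed there. `check_venv_bins` is a hypothetical helper.

```bash
# Hypothetical helper: confirm a venv directory provides the executables the launcher calls.
# Returns non-zero if either python or torchrun is missing or not executable.
check_venv_bins() {
  local venv status bin
  venv="$1"
  status=0
  for bin in python torchrun; do
    if [ -x "$venv/bin/$bin" ]; then
      echo "ok: $venv/bin/$bin"
    else
      echo "missing: $venv/bin/$bin"
      status=1
    fi
  done
  return "$status"
}

# Falls back to the Option B venv path when NEMO_VENV is not set.
check_venv_bins "${NEMO_VENV:-$PWD/.venv_nemo}" || echo "fix the venv before launching training"
```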
