# Running InstructLab with a GPU

<ul>
<li>Contributors: InstructLab team and IBM Research Technology Education team
<li>Questions and support: kochel@us.ibm.com, IBM.Research.JupyterLab@ibm.com
<li>Release date: 2025-07-03
</ul>

# Overview
This Jupyter notebook demonstrates InstructLab, an open source AI project that facilitates knowledge and skills contributions to Large Language Models (LLMs). InstructLab uses a novel synthetic data-based alignment tuning method for Large Language Models introduced in this [paper](https://arxiv.org/abs/2403.01081). The open source InstructLab repository is available [here](https://github.com/instructlab/instructlab) and provides additional documentation on using InstructLab.

This notebook demonstrates the high-level steps in the InstructLab processing, but does not run the full-fidelity processing due to the limited GPU capability. The full InstructLab processing can be run via the [Red Hat AI InstructLab](https://cloud.ibm.com/instructlab/overview) service.

InstructLab can take the form of an open source installation or a Red Hat AI InstructLab installation. In this notebook, we will demonstrate the open source version of InstructLab running on Colab with a GPU, broken into the following major sequential steps:

1. Accepts one or more Question and Answer (QNA) files as input
1. Performs `yamllint` checks on the QNA files to verify their format
1. Places the QNA files in the desired structure in a taxonomy
1. Verifies the taxonomy by running the `ilab taxonomy diff` command
1. Generates synthetic data 
1. Trains with locally installed InstructLab
1. Inferences with the newly trained model
1. Downloads the trained model

## Running this Notebook

This notebook must be run within a Colab GPU runtime. To check that you are running with a GPU, select Runtime->Change runtime type and confirm that a GPU runtime is selected. While this notebook can be started on a free Colab account, the GPUs available with free access do not have sufficient memory to run InstructLab training.

You can run this notebook either:
- Running All Cells by selecting Runtime->Run all
- Cell by cell by selecting the arrow on each code cell and running them sequentially.

Once the Configure the InstructLab Environment step (Step 4) has been run, the other sections of this notebook can be rerun on other data sets.

# Step 1. Clone the InstructLab Environment and Select Run Options

The cell replicates an `ilab` data repository containing the pip requirements and data files, and then presents options for running the notebook.

After selecting the parameters, the remainder of this notebook can be run by either:
- Running all cells by selecting `Runtime`->`Run cell and below`.
- Running each cell sequentially by clicking <img src="./refs/run-cell.png" width=23> **Run cell** by each code cell.

Run the following cell, select from the following parameters, and then follow the directions in the cell to run the rest of this notebook.

We've provided question-and-answer files for these datasets:
- "2024 Oscar Awards Ceremony"
- "Quantum Roadmap and Patterns"
- "Artificial Intelligence Agents"
- "Multi-QNA Example": Contains QNA files for Oscars, Quantum, and Agentic AI data sets to show how multiple QNA files can be provided and processed.
- "Your Content 1" or "Your Content 2": Follow the instructions in Step 2.1 to provide your own data.

In [None]:
#Install these items first to avoid a later reset
!pip install psutil==7.0.0 pillow==10.4.0 numpy==1.26.4 torch==2.6.0 importlib_metadata==8.0.0 --quiet
import os
os.chdir('/content/')
if not os.path.exists("ilab"):
    !git clone https://github.com/KenOcheltree/ilab-open.git --quiet --recurse-submodules ilab
#Remove the colab sample_data
if os.path.exists("sample_data"):
    !rm -rf sample_data

#Display run options
import ipywidgets as widgets
#See instructions on placing your hf_token in colab userdata
from google.colab import userdata
hf_token=userdata.get('hf_token')
data_set = widgets.ToggleButtons(
    options=['2024 Oscars', 'Quantum', 'Agentic AI', 'Multi-QNA Example', 'Your Content 1', 'Your Content 2'],
    description='Dataset:', style={"button_width": "auto"}
)
sdg_pipe = widgets.ToggleButtons(
    options=['Simple', 'Full with GPU'], description='Processing:', style={"button_width": "auto"}
)
instr=widgets.ToggleButtons(
    options=['Default (>450)','>15', '>50', '>200', '>500', '>1000'],
    description='# of QNAs:', style={"button_width": "auto"}
)
train_pipe = widgets.ToggleButtons(
    options=['Simple with GPU','Accelerated GPU'],description='Processing',style={"button_width":"auto"}
)
epoch=widgets.ToggleButtons(
    options=['1', '2', '3', '4', '5', '10', '15'],description='Epochs:',style={"button_width":"auto"}
)
it=widgets.ToggleButtons(
    options=['1', '3', '5','10','20','50','100','200'],description='Iterations:',style={"button_width":"auto"}
)
questions=widgets.ToggleButtons(options=['Yes','No'],description='Live Q&A:',style={"button_width":"auto"})
download=widgets.ToggleButtons(options=['Yes','No'],description='Download:',style={"button_width":"auto"}
)
print("\nSelect the Dataset for this run:")
display(data_set)
print("Select the Synthetic Data Generation parameter to use:")
sdg_pipe.value='Simple'
display(sdg_pipe)
instr.value = 'Default (>450)'
display(instr)
print("Select the Training parameters to use:")
train_pipe.value='Simple with GPU'
#display(train_pipe)
epoch.value="3"
display(epoch)
it.value="5"
display(it)
print("Select what to do with the model after training:")
questions.value="Yes"
display(questions)
download.value="No"
display(download)
print("After selecting the parameters, select the next cell and then choose Runtime->Run cell and below")
print("When that run completes, you can come here, choose different parameters and rerun at the next cell with Runtime->Run cell and below")
print("Note: You can also go back and rerun individual sections of the notebook with different parameters.")

# Step 2. Prepare to Create the Taxonomy

## 2.1 Provide the Taxonomy data

You might want to run this notebook once with an existing data set before creating your own to understand the taxonomy creation flow.

You can provide your own InstructLab QNA file for processing in this step.
1. Create your own `qna.yaml` file by following the directions in the InstructLab taxonomy [readme](https://github.com/instructlab/taxonomy).
1. After creating your `qna.yaml` file, add a comment in the first line that starts with `# location:` (lowercase) and specifies the location of the file in the taxonomy. For example, a quantum computing `qna.yaml` file has the following path for the location:
    ```
    # location: /knowledge/information/computer_science/quantum_computing
    ```
1. Add your `qna.yaml` to the `/content/ilab/data/your_content_1` folder or the `/content/ilab/data/your_content_2` folder by dragging and dropping it into the folder.
1. To include multiple `qna.yaml` files in your taxonomy, add a unique identifier `NNN` to the name so it is of the format `qnaNNN.yaml`. Any number of QNA files can be included as long as they have unique names.
1. You can use your own data by selecting **Your Content 1** or **Your Content 2** in the code cell.
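The location comment described above can be checked before a file is added. The following is a minimal sketch, not the notebook's own code; `parse_location` is a hypothetical helper name, and it assumes the same rule the placement cell in Step 7 applies: the first line must split into exactly three tokens, `#`, `location:` (lowercase), and the path.

```python
from typing import Optional

def parse_location(first_line: str) -> Optional[str]:
    """Return the taxonomy path from a '# location: <path>' comment, or None."""
    words = first_line.split()
    # Expect exactly three tokens: '#', 'location:', and the path itself
    if len(words) == 3 and words[0] == "#" and words[1] == "location:":
        return words[2]
    return None

print(parse_location("# location: /knowledge/information/computer_science/quantum_computing"))
print(parse_location("# Location: /knowledge/x"))  # capital L is rejected, so this prints None
```

Running a check like this on the first line of each `qnaNNN.yaml` file can catch a missing or misspelled comment before the taxonomy is built.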

## 2.2 Complete the Environment Set Up

This code cell installs the remainder of the required pip packages and configures InstructLab. The InstructLab configuration is captured in the `config.yaml` file. The `config.yaml` file is created for you and `taxonomy_path = taxonomy` is set. The root location of the taxonomy is set to the taxonomy folder in `instructlab-latest`.

**Note:** 
- This step can take a few minutes to run. If you are running all of the cells at the same time, it can take 10 minutes to run.
- Ignore any pip inconsistency errors or warnings in the installation. They do not affect the running of this notebook.

In [None]:
# Run the rest of the notebook by selecting this third cell and choosing "Runtime->Run cell and below"

# Wrap Code cell output
from IPython.display import HTML, display
def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

# Install the rest of the requirements
!pip cache remove llama_cpp_python
!pip install -r ilab/requirements_gpu.txt --quiet
#Check the starting configuration
!ilab system info

# Step 3. Perform Imports and Check for a GPU

This code cell checks for a GPU in the configuration. This notebook requires a GPU in the configuration to run properly.

In [None]:
import os
import torch
from IPython.display import Image, display
from datasets import load_dataset
from langchain.llms import LlamaCpp
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
import json
import subprocess
import shutil
import ruamel.yaml
os.environ['NUMEXPR_MAX_THREADS'] = '64'
Norm = "<p style='font-family:IBM Plex Sans;font-size:20px'>"

notebook_dir='/content/ilab/'
os.chdir(notebook_dir)

## torch and cuda version check
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)

# Count the GPUs up front so `gpus` is always defined for later cells
gpus = torch.cuda.device_count()
if not torch.cuda.is_available():
    print("No GPU in configuration")
else:
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
    print("GPU(s) are Available")
    if gpus == 1:
        gpu_type = torch.cuda.get_device_name(0)
        print("One GPU of Type: ", gpu_type)
    else:
        print("ERROR: More than 1 GPU in configuration: ", gpus)
print("Starting directory: "+ os.getcwd())

# Step 4. Configure the InstructLab Environment

## 4.1 Run InstructLab initialization

The InstructLab configuration is captured in the *config.yaml* file. This step creates the config.yaml file and sets:
- **taxonomy_path = taxonomy** - the root location of the taxonomy is set to the taxonomy folder in instructlab-latest
- **model_path = models/granite-7b-lab-Q4_K_M.gguf** - the default model is set to granite

**Note:** The default directories for InstructLab are the following. If you initialize InstructLab on your own system, it will default to the following:
* **Downloaded Models:**  ~/.cache/instructlab/models/ - Contains all downloaded large language models, including the saved output of ones you generate with ilab.
* **Synthetic Data:** ~/.local/share/instructlab/datasets/ - Contains data output from the SDG phase, built on modifications to the taxonomy repository.
* **Taxonomy:** ~/.local/share/instructlab/taxonomy/ - Contains the skill and knowledge data.
* **Training Output:** ~/.local/share/instructlab/checkpoints/ - Contains the output of the training process.
* **config.yaml:** ~/.config/instructlab/config.yaml - Contains the config.yaml file
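To see which of these default locations exist on a given system, the list above can be expressed as a small lookup. This is a minimal sketch using only the paths stated above; `default_ilab_dirs` is a hypothetical helper, not part of the `ilab` CLI.

```python
from pathlib import Path

def default_ilab_dirs(home: Path) -> dict:
    """Map each InstructLab artifact to its default location under `home`."""
    return {
        "models": home / ".cache" / "instructlab" / "models",
        "datasets": home / ".local" / "share" / "instructlab" / "datasets",
        "taxonomy": home / ".local" / "share" / "instructlab" / "taxonomy",
        "checkpoints": home / ".local" / "share" / "instructlab" / "checkpoints",
        "config": home / ".config" / "instructlab" / "config.yaml",
    }

# Report which default locations are present for the current user
for name, path in default_ilab_dirs(Path.home()).items():
    status = "exists" if path.exists() else "missing"
    print(f"{name:12} {path} ({status})")
```

On Colab the home directory is `/root`, which is why the cleanup code in the next cell removes directories under `/root/`.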

In [None]:
#Remove Colab Sample directory
if os.path.exists("sample_data"):
    print("removing sample_data")
    shutil.rmtree("sample_data")
    os.chdir("ilab")

#Initialize ilab
base_dir="/root/"
##Choose the base model as granite or mixtral
model_dir="models"
model_name="granite-7b-lab-Q4_K_M.gguf"
model_path = os.path.join(model_dir, model_name)

taxonomy_path='taxonomy'

## Define the file name
file_name = "config.yaml"
if os.path.exists(file_name):
    os.remove(file_name)
    print(f"ilab was already initialized. {file_name} has been deleted. Reinitialized")
else:
    print(f"ilab was not initialized yet. {file_name} does not exist.")

##Remove old data
if os.path.exists("taxonomy"):
    print("removing taxonomy")
    shutil.rmtree("taxonomy")
if os.path.exists(base_dir+".cache/instructlab"):
    print("removing " + base_dir+".cache/instructlab")
    shutil.rmtree(base_dir+".cache/instructlab")
if os.path.exists(base_dir+".config/instructlab"):
    print("removing " + base_dir+".config/instructlab")
    shutil.rmtree(base_dir+".config/instructlab")
if os.path.exists(base_dir+".local/share/instructlab"):
    print("removing " + base_dir+".local/share/instructlab")
    shutil.rmtree(base_dir+".local/share/instructlab")

print(f"ilab model is {model_path}.")
print('#############################################################')
print(' ')

command = f"""
ilab config init<<EOF
{taxonomy_path}
Y
{model_path}
0
EOF
"""

## Using the ! operator to run the command
!echo "Running ilab config init"
!{command}
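The cell above answers the `ilab config init` prompts through a shell heredoc. As an alternative sketch, the same answers could be passed through `subprocess.run` with its `input` argument. The helper names `build_init_answers` and `run_init` are hypothetical, and the prompt order is an assumption copied from the heredoc (taxonomy path, `Y`, model path, `0`); it may differ between InstructLab versions.

```python
import shutil
import subprocess

def build_init_answers(taxonomy_path: str, model_path: str) -> str:
    """Return the newline-separated answers in the order the heredoc supplies them."""
    # Order copied from the heredoc: taxonomy path, 'Y', model path, '0'
    return "\n".join([taxonomy_path, "Y", model_path, "0"]) + "\n"

def run_init(answers: str) -> int:
    """Feed the answers to `ilab config init`; return its exit code, or -1 if ilab is absent."""
    if shutil.which("ilab") is None:
        return -1
    result = subprocess.run(["ilab", "config", "init"], input=answers, text=True, check=False)
    return result.returncode

answers = build_init_answers("taxonomy", "models/granite-7b-lab-Q4_K_M.gguf")
print(repr(answers))
# Call run_init(answers) only when you intend to (re)initialize the configuration.
```

Keeping the answer string separate from the call makes it easy to print and inspect what will be sent before any configuration is changed.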

## 4.2 Display the config.yaml file
We examine the base configuration to identify the parameters to change in the next step.

In [None]:
##to copy config.yaml to local directory
!cp /root/.config/instructlab/config.yaml .
!cat config.yaml

## 4.3 Select LLM Models and Download

This cell changes the models used in the generate stage. The Mistral model is set as the teacher model for the generate step, and the Granite model is the student model to be trained.

If you want to customize other models for generation or the training phase, you would specify the models in this step.

This step specifies that the models to be used will be from this notebook's models directory.

### LLM models
The models that will be used in the InstructLab processing are downloaded in this step. Additional steps can be added if other models are used in processing.

- The merlinite model will be used as the teacher model for the simple pipeline in the **Training with InstructLab** section.
- The mistral-7b-instruct-v0.2.Q4_K_M model will be used as the teacher model for the full pipeline in that section.
- The granite-7b-lab.gguf model is a quantized version of the granite-7b-lab model.

In [None]:
##Use ruamel.yaml to load the yaml file to preserve comments
yaml = ruamel.yaml.YAML()
with open('config.yaml', 'r') as file:
    config = yaml.load(file)

##Update to use the same models and just change the directory
teacher_model_path = "models/mistral-7b-instruct-v0.2.Q4_K_M.gguf"
base_model_path = "models/instructlab/granite-7b-lab"
##judge_model_path = "models/prometheus-eval/prometheus-8x7b-v2.0"

##config['evaluate']['mt_bench']['judge_model'] = judge_model_path
##config['evaluate']['mt_bench_branch']['judge_model'] = judge_model_path
config['generate']['model'] = teacher_model_path
config['generate']['teacher']['model_path']= teacher_model_path
##config['train']['phased_mt_bench_judge']=judge_model_path

#Update GPU information
config['evaluate']['gpus']=gpus
config['generate']['teacher']['vllm']['gpus']=gpus
config['serve']['vllm']['gpus']=gpus
config['train']['nproc_per_node']=gpus
config['metadata']['gpu_count']=gpus
if gpus==1:
  config['train']['device']="cuda"
  if gpu_type[:6]=="NVIDIA":
    config['metadata']['gpu_manufacturer']="Nvidia"
    config['metadata']['gpu_family']=gpu_type[7:]

## Save the updated config.yaml file
yaml.default_flow_style=False
with open('config.yaml', 'w') as file:
    yaml.dump(config, file)

##copy the config file to the .config/instructlab/ where it is used by InstructLab
!cp config.yaml {base_dir}.config/instructlab/

print("Updated config.yaml successfully.\n")
!cat config.yaml

models_dir="models"

# Download models
!ilab model download --hf-token {hf_token} --model-dir {models_dir}
!ilab model download --repository instructlab/granite-7b-lab --hf-token {hf_token} --model-dir {models_dir}

# Step 5. Specify the Data for this Run

We've provided question-and-answer files for these datasets: "2024 Oscar Awards Ceremony", "Quantum Roadmap and Patterns" and "Artificial Intelligence Agents". Feel free to choose one of these datasets, or select your own custom dataset in the cell below.

### Optionally, Create your own data set for InstructLab

You can optionally provide your own InstructLab QNA file for processing in this step.

Follow these steps to add your own dataset:
1. Create your own qna.yaml file following the directions on the InstructLab taxonomy [readme](https://github.com/instructlab/taxonomy).
1. Create a questions.txt file with related sample questions to use on inferencing.
1. Add your qna.yaml and sample questions.txt files to the /content/ilab/data/your_content_1 folder or the /content/ilab/data/your_content_2 folder by dragging and dropping them in the desired folder.
1. Double click on the /content/ilab/config.json file to edit and specify the qna_location where your data resides within the Dewey Decimal classification system. Close and save the config.json file.
1. You can now specify to run with your own data by selecting **Your Content 1** or **Your Content 2** in the next code cell.

In [None]:
print("\nSelect the QNA dataset to add:")
display(data_set)
print("After choosing your dataset, please select and run the following cell")

# Step 6. Check the Format of the QNA YAML Files

Running this cell checks the format of the YAML files before they are placed in the taxonomy, to ensure that line lengths are within the allowed limit and that there are no trailing blanks.

Important: Rerun the following cell until all of the QNA files pass the yamllint test. Otherwise the file will fail in the Synthetic Data Generation step.
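To illustrate the two kinds of problems mentioned above, here is a minimal pure-Python sketch that flags trailing whitespace and overlong lines. It is not a replacement for `yamllint`; `check_lines` is a hypothetical helper, and the 120-character limit is an assumption, since the actual limits come from the `yamlrules.yaml` file used by the cell below.

```python
def check_lines(text: str, max_length: int = 120) -> list:
    """Return (line_number, problem) tuples for trailing blanks and long lines."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        if len(line) > max_length:
            problems.append((number, f"longer than {max_length} characters"))
    return problems

# Illustrative content only: the first line ends with a stray space
sample = "version: 3 \ndomain: quantum\n"
print(check_lines(sample))
```

A quick check like this can point to the offending line numbers while you edit, before running the full `yamllint` pass.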

In [None]:
import yamllint
# Select the folder of the dataset
use_cases = {"2024 Oscars": "oscars", "Quantum": "quantum", "Agentic AI": "agentic_ai",
            "Multi-QNA Example": "example","Your Content 1": "your_content_1", "Your Content 2": "your_content_2"}
use_case = use_cases[data_set.value]
qna_dir = "data/" + use_case + "/"
print("Running yaml checker on " + data_set.value + " data in folder " + qna_dir)
for f in os.listdir(qna_dir):
    # Match case-insensitively but keep the original name so the path stays valid
    if f.lower().startswith('qna'):
        print("Checking File: " + f)
        yaml_file = qna_dir + f
        shell_command = f"yamllint /content/ilab/{yaml_file} -c /content/ilab/yamlrules.yaml"
        !{shell_command}

# Step 7. Create the Taxonomy Data Repository
Running this next cell places the QNA files in the proper directories of the taxonomy.

If you want to add additional QNA files to the taxonomy after the following cell is run, you can create the necessary levels of directories and add the qna.yaml named file directly to the taxonomy.
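The placement described above (create the directory levels, then copy the file in as `qna.yaml`) can also be done without shell commands. The following is a minimal sketch under that assumption; `place_qna` is a hypothetical helper, and unlike the cell below it checks for an existing `qna.yaml` file rather than an existing directory.

```python
import shutil
import tempfile
from pathlib import Path

def place_qna(qna_file: Path, taxonomy_root: Path, location: str) -> Path:
    """Copy qna_file to <taxonomy_root>/<location>/qna.yaml and return the new path."""
    # Strip the leading slash so the location is joined under taxonomy_root
    target_dir = taxonomy_root / location.lstrip("/")
    target = target_dir / "qna.yaml"
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(qna_file, target)
    return target

# Demonstrate in a throwaway directory so no real taxonomy is touched
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "qna001.yaml"
    source.write_text("# location: /knowledge/example\n")
    placed = place_qna(source, Path(tmp) / "taxonomy", "/knowledge/example")
    print(placed.relative_to(tmp))
```

Raising an error on a duplicate makes the conflict explicit; the notebook cell instead prints a warning and skips the file.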

In [None]:
# List all of the files in the use_case directory that begin with QNA
print_lines=30
for f in os.listdir(qna_dir):
    # Match case-insensitively but keep the original name so the path stays valid
    if f.lower().startswith('qna'):
        qna_file = qna_dir + f
        qna_location = None
        print("Show the QNA file: " + qna_file)
        with open(qna_file, 'r') as input_file:
            for line_number, line in enumerate(input_file):
                if line_number == 0:
                    words = line.split()
                    print("Checking first line of QNA file for placement location: " + line)
                    # Check the length first so a short first line cannot raise an IndexError
                    if len(words) == 3 and words[0] == "#" and words[1] == "location:":
                        qna_location = words[2]
                    else:
                        print("ERROR: Placement location not specified in QNA File: " + qna_file)
                        break
                if line_number > print_lines:  # line_number starts at 0.
                    break
                print(line_number, line, end="")
        # Skip files that do not declare a valid placement location
        if qna_location is None:
            continue
        # Place the QNA file in the proper taxonomy directory if it does not already exist
        new_qna_dir = "/taxonomy" + qna_location
        if os.path.exists(os.getcwd()+new_qna_dir):
            print("\nWARNING: QNA file already exists in the taxonomy at duplicate location, not inserting")
        else:
            print("\nPlace QNA file in taxonomy as: /taxonomy"+qna_location+"/qna.yaml")
            shell_command1 = f"mkdir -p ./taxonomy{qna_location}"
            shell_command2 = f"cp ./{qna_file} ./taxonomy{qna_location}/qna.yaml"
            !{shell_command1}
            !{shell_command2}

# Step 8. Verify the Taxonomy Data Repository
Run `ilab taxonomy diff` to verify the taxonomy. Note any errors reported in this step, correct them in your QNA files, and then rerun the notebook with the corrected QNA files.

In [None]:
print("Verify the taxonomy")
!ilab -vvv taxonomy diff --taxonomy-path taxonomy --taxonomy-base empty

# Step 9. Synthetic Data Generation

## 9.1 Select parameters

### Select pipeline

This notebook offers two InstructLab pipelines for synthetic data generation: simple and full.
- The **simple pipeline** runs fast and can be used for initial model and data testing.
- The **full pipeline** runs all of the InstructLab steps and takes more time but produces a better tuned model.

**Note:** If you are running with a new or modified dataset, you may want to use the **simple pipeline** for the first run to verify the configuration.

### Select number of samples to generate

Generating 15 synthetic data samples can take around 19 minutes, depending on your computing resources. You may wish to generate a small number on your first run to verify the QNA dataset format.

To produce **sufficient synthetic data** to focus training on the new material, **about 30 synthetic question-and-answer pairs need to be generated** for each question-and-answer pair provided. This will require a proportionally longer time to generate, but will provide better training.
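As a worked example of the ratio above, the short sketch below multiplies the number of provided pairs by 30. The helper name `synthetic_pairs_needed` and the 15-pair QNA file are hypothetical, chosen only to illustrate the arithmetic.

```python
PAIRS_PER_SEED = 30  # "about 30" synthetic pairs per provided pair, per the text above

def synthetic_pairs_needed(seed_pairs: int, per_seed: int = PAIRS_PER_SEED) -> int:
    """Estimate how many synthetic pairs to generate for a QNA file."""
    return seed_pairs * per_seed

# Hypothetical QNA file with 15 seed question-and-answer pairs
print(synthetic_pairs_needed(15))  # 450
```

For a file of that size, the estimate lines up with the "Default (>450)" option offered in the run options.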

Before following these instructions, ensure the existing model you are adding skills or knowledge to is still running. Alternatively, `ilab data generate` can start a server for you if you provide a fully qualified model path via `--model`.

To generate a synthetic dataset based on your newly added knowledge or skill set in taxonomy repository, run the following command:

    ilab data generate

### **Simple Pipeline**

The Simple Pipeline works solely with Merlinite 7b Lab as the teacher model. The Simple Pipeline is called without GPU acceleration as follows:

    ilab data generate --pipeline simple

### **Full Pipeline**

The Full Pipeline runs the full processing with a GPU. Currently, the Full Pipeline only supports the Mixtral and Mistral Instruct Family models as the teacher model.  This is due to only supporting specific model prompt templates.

To generate data with the Full Pipeline using a non-default model (such as Mixtral-8x7B-Instruct-v0.1):

    ilab data generate --model ~/.cache/instructlab/models/mistralai/mixtral-8x7b-instruct-v0.1 --pipeline full --gpus 4

**Note:** Synthetic Data Generation can take from 2 minutes to 1+ hours to complete, depending on your computing resources.

In [None]:
print("Select Pipeline to use")
display(sdg_pipe)
display(instr)
print("After making your selections for data generation, please select and run the following cell")

## 9.2 Run data generation
Data generation with a GPU can take 2 minutes or more to generate 15 synthetic data samples. It takes proportionately longer to generate more samples.

In [None]:
gen_directory = "data/"+ use_case+"/ilab_generated/"
if instr.value == "Default (>450)":
    sdg_factor=""
elif instr.value == '>15':
    sdg_factor="--sdg-scale-factor 1"
elif instr.value == '>50':
    sdg_factor="--sdg-scale-factor 3"
elif instr.value == '>200':
    sdg_factor="--sdg-scale-factor 13"
elif instr.value == '>500':
    sdg_factor="--sdg-scale-factor 33"
else:
    sdg_factor="--sdg-scale-factor 67"
# Options: 'Simple', 'Full with GPU'
# Use gpu_flag here so the integer `gpus` count from Step 3 is not overwritten
if sdg_pipe.value == 'Simple':
    pipeline = 'simple'
    model = '--model models/instructlab/granite-7b-lab'
    gpu_flag = '--gpus 1'
elif sdg_pipe.value == 'Full with GPU':
    pipeline = 'full'
    model = ''
#   model = '--model models/instructlab/granite-7b-lab'
    gpu_flag = '--gpus 1'
else:
    print("ERROR: Undefined pipeline")

il_data_path= '/root/.local/share/instructlab/datasets/'
#Remove old data so there is only one test_merlinite and train_merlinite after generation
print("Remove old datasets")
!rm -rf {il_data_path}*
#shell_command = f"ilab --verbose data generate {model} --num-cpus 10 {gpu_flag} {sdg_factor} --taxonomy-path taxonomy --pipeline {pipeline} --max-num-tokens 512"
shell_command = f"ilab data generate {model} --num-cpus 10 {gpu_flag} {sdg_factor} --taxonomy-path taxonomy --pipeline {pipeline} --max-num-tokens 512"

print("Generating data")
print("Running: !"+shell_command)
!{shell_command}

#Rename results to  test_gen.jsonl and train_gen.jsonl and move to local data directory
if not os.path.exists(gen_directory):
    print("Create directory: " + gen_directory)
    !mkdir {gen_directory}
file_cnt=0
try:
    for dirname in os.listdir(il_data_path):
        date_path=il_data_path+'/'+ dirname + '/'
        for filename in os.listdir(date_path):
            if filename[:6]=='train_':
                train_name= 'train_gen.jsonl'
                print('Renaming '+ filename+ ' to ' + train_name)
                !mv {date_path+filename} {gen_directory+train_name}
                file_cnt+=1
            elif filename[:5]=='test_':
                test_name= 'test_gen.jsonl'
                print('Renaming '+ filename+ ' to ' + test_name)
                !mv {date_path+filename} {gen_directory+test_name}
                file_cnt+=1
    if file_cnt < 2:
        print("ERROR: train_gen.jsonl and/or test_gen.jsonl not created")
    elif os.path.getsize(gen_directory+train_name) == 0:
        print("ERROR: train_gen.jsonl file is empty")
    elif os.path.getsize(gen_directory+test_name) == 0:
        print("ERROR: test_gen.jsonl file is empty")
    else:
        print("Training and test files successfully created in: " + gen_directory)
# Catch Exception rather than everything so an interrupt still stops the cell
except Exception as err:
    print("Error running ilab generate, no synthetic data generated:", err)
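The cell above maps each "# of QNAs" option to a `--sdg-scale-factor` value through an if/elif chain. The same mapping could be written as a lookup table, as in this minimal sketch. The values are copied from the cell; `SCALE_FACTORS` and `sdg_factor_flag` are hypothetical names. One difference: an unknown label raises a `KeyError` here, whereas the cell falls through to the largest factor.

```python
# Widget label -> value passed to --sdg-scale-factor (None means use the ilab default)
SCALE_FACTORS = {
    "Default (>450)": None,
    ">15": 1,
    ">50": 3,
    ">200": 13,
    ">500": 33,
    ">1000": 67,
}

def sdg_factor_flag(label: str) -> str:
    """Return the CLI fragment for a widget label, or '' for the default."""
    factor = SCALE_FACTORS[label]
    return "" if factor is None else f"--sdg-scale-factor {factor}"

print(sdg_factor_flag(">200"))
```

A table keeps the labels and factors side by side, which makes it easier to add or adjust an option in one place.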

## 9.3 Show examples of generated data

In [None]:
print("9.3 Show examples of generated data")
for filename in os.listdir(gen_directory):
    if filename[:9]=='train_gen':
        with open(gen_directory+filename, 'r') as syn_file:
            cnt=0
            for line_number, line in enumerate(syn_file):
                if cnt >= 8:
                    break
                jsonLine= json.loads(line)
                syn_user=jsonLine["user"]
                syn_assist=jsonLine["assistant"]
                #Remove the "Question: " and "Answer: " prefixes for displaying
                if syn_user[:10]=="Question: ":
                    syn_user=syn_user[10:]
                if syn_assist[:8]=="Answer: ":
                    syn_assist=syn_assist[8:]
                cnt+=1
                print("\nQuestion: "+syn_user+"\nAnswer: "+syn_assist)
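The display logic above can be separated into small reusable pieces. The following is a minimal sketch, assuming (as the cell does) that each JSONL record has `user` and `assistant` keys; `strip_prefix` and `read_examples` are hypothetical helpers, and the sample record is made up for illustration.

```python
import json

def strip_prefix(text: str, prefix: str) -> str:
    """Remove prefix from text when present; otherwise return text unchanged."""
    return text[len(prefix):] if text.startswith(prefix) else text

def read_examples(lines, limit=8):
    """Yield (question, answer) tuples from JSONL lines with 'user' and 'assistant' keys."""
    for count, line in enumerate(lines):
        if count >= limit:
            break
        record = json.loads(line)
        yield (strip_prefix(record["user"], "Question: "),
               strip_prefix(record["assistant"], "Answer: "))

# Made-up record for illustration; real records come from train_gen.jsonl
sample = ['{"user": "Question: Who hosted?", "assistant": "Answer: A host."}']
for question, answer in read_examples(sample):
    print("Question:", question, "| Answer:", answer)
```

Because `read_examples` accepts any iterable of lines, an open file handle for `train_gen.jsonl` can be passed to it directly.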

# Step 10. Train with InstructLab

## 10.1 Select the model training pipeline

InstructLab has three primary model training pipelines: simple, full (default), and accelerated. For all of the pipelines, the training time can be limited by adjusting the `--num-epochs` parameter. The maximum number of epochs for running the InstructLab end-to-end workflow is 10.

#### **Simple pipeline**

The simple pipeline uses an SFT Trainer on Linux and MLX on macOS. This type of training takes roughly an hour and produces the lowest fidelity model but should indicate if your data is being picked up by the training process. The simple pipeline only works with Merlinite 7b Lab as the teacher model. For this Linux system, the trained model is saved in the models directory as ggml-model-f16.gguf.

The command form is:

    ilab model train --pipeline simple

**Note:** This process will take a little while to complete (time can vary based on hardware and output of ilab data generate but on the order of 5 to 15 minutes)

#### **Accelerated pipeline**

The accelerated pipeline uses the instructlab-training library, which supports GPU-accelerated and distributed training. The full loop and data processing functions are either pulled directly from or based on the work in this library. For the accelerated pipeline, the models are saved in the ~/.local/share/instructlab/checkpoints directory. The InstructLab command `ilab model evaluate` can be used to choose the best one. Training supports GPU acceleration with Nvidia CUDA or AMD ROCm. Please see the GPU acceleration documentation for more details. At present, hardware acceleration requires a data center GPU or high-end consumer GPU with at least 18 GB free memory.

The command form is:

    ilab model train --pipeline accelerated --device cuda --data-path <path-to-sdg-data>
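Before choosing the accelerated pipeline, it can help to compare the free GPU memory against the 18 GB figure mentioned above. This is a minimal sketch; `has_enough_memory` is a hypothetical helper, and treating 1 GB as 1024**3 bytes is an assumption. The check uses `torch.cuda.mem_get_info()`, which returns free and total bytes for the current device, and is skipped when `torch` or a CUDA device is not available.

```python
def has_enough_memory(free_bytes: int, required_gb: float = 18.0) -> bool:
    """Return True when free_bytes covers required_gb (1 GB taken as 1024**3 bytes)."""
    return free_bytes >= required_gb * 1024**3

try:
    import torch  # optional here; the notebook installs it in Step 1
    if torch.cuda.is_available():
        free_bytes, total_bytes = torch.cuda.mem_get_info()
        print("Enough GPU memory for accelerated training:", has_enough_memory(free_bytes))
    else:
        print("No CUDA device available")
except ImportError:
    print("torch is not installed; skipping the GPU check")
```

If the check reports insufficient memory, the simple pipeline is the safer choice for a first run.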

In [None]:
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
torch.cuda.empty_cache()
print("Select to Continue or to Train the model")
display(train_pipe)
display(epoch)
display(it)
print("After choosing your training options, please select and run the following cell")

## 10.2 Run the model training

Model training can take 30 minutes or more for 1 epoch and 1 iteration and takes 1 hour for the default parameter values. This minimal training can be used to test generation and training for a new set of data.

To produce a higher quality model, more epochs and iterations are needed for refining the model. This will require a proportionally longer time to train the model.

In [None]:
data_path="data/"+ use_case+"/ilab_generated/"
train_data=data_path+"train_gen.jsonl"
model_path="models/instructlab/granite-7b-lab"
##model_path='/root/.cache/instructlab/models/instructlab/granite-7b-lab'
trained_model_path="data/"+ use_case+"/new_model/"

##Options: 'Simple with GPU', 'Accelerated GPU'
# Check the generated files directly so this cell does not depend on variables from Step 9.2
test_data=data_path+"test_gen.jsonl"
data_ok = all(os.path.isfile(p) and os.path.getsize(p) >= 5 for p in (train_data, test_data))
if not data_ok:
    print("ERROR: train_gen.jsonl and/or test_gen.jsonl are not present or too small")

if not os.path.exists(trained_model_path):
    print("Create directory: " + trained_model_path)
    !mkdir {trained_model_path}
ep=int(epoch.value)
its=int(it.value)
if train_pipe.value=='Simple with GPU':
    print("Train with simple pipeline with a GPU")
    shell_command = f"ilab model train --pipeline simple --device cuda --model-path {model_path} --data-path {data_path} --num-epochs {ep} --iters {its}"
elif train_pipe.value=='Accelerated GPU':
    print("Train accelerated with a GPU")
    #shell_command = f"ilab model train --pipeline accelerated --device cuda --model-path {model_path} --data-path {train_data} --num-epochs {ep} --iters {its}"
    shell_command = f"ilab -v -v model train --pipeline accelerated --device cuda --model-path {'/content/ilab/'+model_path} --data-path {'/content/ilab/'+train_data}"

print("Running: !"+shell_command)
!{shell_command}
if train_pipe.value=='Accelerated GPU':
    print("Run ilab model evaluate")
    !ilab model evaluate --benchmark mmlu
#Move the model to the use_case/new_model directory
print("Moving the trained model to the directory: "+trained_model_path)
!mv /root/.local/share/instructlab/checkpoints/ggml-model-f16.gguf {trained_model_path}

# Step 11. Utilize the Newly Trained Model

## 11.1 Run Interactive Q&A Session with Base and Trained Models to Evaluate Performance

You have now completed InstructLab training. You can run this section to ask questions to both the base and InstructLab trained models and to compare answers.

Run both base and trained models to compare results with interactive questions and answers.


In [None]:
print("Do you want to run interactive questions on the base and trained models?")
display(questions)
print("After making your choice, please select and run the following cell")

The following are sample questions derived from the data used to generate synthetic data, which was then employed to train the language model.

In [None]:
# Define a Function to Perform Inference on Base and Trained Models
def model_inference(base_model_path, trained_model_path):
    _DEFAULT_TEMPLATE = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

    Current conversation:
    Human: {input}
    AI:"""
    base_llm = LlamaCpp(model_path=base_model_path,
                   verbose=False,
                   n_gpu_layers=25,
                   max_tokens=90,
                   temperature=0,
                   top_k=1
                  )
    trained_llm = LlamaCpp(model_path=trained_model_path,
                   verbose=False,
                   n_gpu_layers=25,
                   max_tokens=90,
                   temperature=0,
                   top_k=1
                  )
    PROMPT = PromptTemplate( input_variables=["input"],
                            template=_DEFAULT_TEMPLATE
                            )
    chain1 = PROMPT | base_llm | StrOutputParser()
    chain2 = PROMPT | trained_llm | StrOutputParser()
    while True:
        question = input("Ask me a question (type 'exit' to end): ")
        if question.lower() == 'exit':
            print("Exiting this Q&A session.")
            break
        else:
            print("You asked: ", question)
            answer1 = chain1.invoke(question)
            answer1= answer1.split('Human',1)[0]
            print ("Base Model Answer: ",answer1)
            answer2 = chain2.invoke(question)
            answer2= answer2.split('Human',1)[0]
            print ("Trained Model Answer: ",answer2)

##Display Sample Questions
base_model = notebook_dir +"/models/granite-7b-lab-Q4_K_M.gguf"
trained_model = trained_model_path + "ggml-model-f16.gguf"
if questions.value=='Yes':
  with open(notebook_dir+'/data/' + use_case + '/questions.txt') as f:
      for line in f.readlines():
          display(widgets.HTML(Norm+line))
  print("Processing may take several minutes on the first run...")
  model_inference(base_model, trained_model)
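The inference function above truncates each completion at the first occurrence of `Human`, so that only the model's first reply is shown if it starts generating another conversation turn. The following is a minimal sketch of that step in isolation; `trim_answer` is a hypothetical helper, the sample completion is made up, and unlike the cell it also strips surrounding whitespace.

```python
def trim_answer(raw: str, stop_word: str = "Human") -> str:
    """Keep only the text before the first occurrence of stop_word."""
    return raw.split(stop_word, 1)[0].strip()

# Made-up completion in which the model starts a new conversation turn
raw = " The ceremony was held in March.\nHuman: And who won?\nAI: ..."
print(trim_answer(raw))
```

One limitation of this approach is that an answer which legitimately contains the word "Human" is also cut at that point.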

## 11.2 Download the Newly Trained Model
### Show download option
Now that we have a model trained on our dataset, we can download the trained model for further testing and use.

In [None]:
print("Do you want to download the trained model to your local machine?")
display(download)
print("After making your selection, please select and run the following cell")

### Download the model
Select and run the next cell to download the model if you chose that option. Please make sure the download has completed before closing this notebook.

In [None]:
from google.colab import files
if download.value=='Yes':
  files.download(trained_model)

<a id="IL3_learn"></a>
# Learn More
This notebook demonstrates the high-level steps in the InstructLab processing, but does not run the full-fidelity processing due to the limited GPU capability. The full InstructLab processing can be run via the [Red Hat AI InstructLab](https://cloud.ibm.com/instructlab/overview) service.

InstructLab uses a novel synthetic data-based alignment tuning method for Large Language Models introduced in this [paper](https://arxiv.org/abs/2403.01081).

This notebook is based on the InstructLab CLI repository available [here](https://github.com/instructlab/instructlab).

Contact us by email to ask questions, discuss potential use cases, or schedule a technical deep dive. The contact email is IBM.Research.JupyterLab@ibm.com.

© 2025 IBM Corporation