# MLX Model Fine-Tuning with LoRA

This notebook walks through loading a pre-trained model, adding LoRA layers to it, and training it on a specific dataset. This process lets you adapt large pre-trained models to new tasks with relatively small datasets and modest computational resources.

---

# Setting Up JupyterLab for MLX Fine-Tuning

## Installation
Before you can start fine-tuning models with MLX, you need to set up your environment. We recommend using JupyterLab for this tutorial as it provides a robust, interactive development environment for Jupyter notebooks.

### Install JupyterLab
If you haven't already installed JupyterLab, you can do so using Conda, a popular package and environment management system. Run the following command in your terminal:

```bash
conda install jupyterlab
```

*This command will install JupyterLab and all required dependencies in your Conda environment.*

## Launch JupyterLab
Once the installation is complete, you can launch JupyterLab by running:

```bash
jupyter lab
```


*This command starts the JupyterLab server and opens JupyterLab in your default web browser. To create a new notebook, click the "Python 3" tile under "Notebook" in the Launcher, or choose File > New > Notebook from the menu.*

## Next Steps
With JupyterLab running, you can now proceed to the tutorial sections in this notebook to start fine-tuning your MLX model with LoRA layers.

---

## Importing Necessary Libraries and Modules
Before we start, we need to install and import everything used throughout this notebook: standard libraries for handling files and JSON data, plus the MLX modules for model loading, modification, and training.

*The first cell below clones the `mlx-examples` repository and installs the dependencies listed in the `requirements.txt` files. The second cell imports the libraries and modules.*

In [None]:
# Clone mlx-examples repo
!git clone https://github.com/ml-explore/mlx-examples
!python3 -m pip install -r ./mlx-examples/lora/requirements.txt

# Install the necessary libraries from the requirements.txt file
!python3 -m pip install -r MLX_Fine-Tuning/requirements.txt

In [None]:
# Importing necessary libraries and modules
import random
from typing import Tuple
from mlx_lm import load  # Load function to load models
from mlx_lm.tuner.lora import LoRALinear  # LoRA module for linear transformations
from mlx.utils import tree_flatten  # Utility to flatten model parameters
from mlx_lm.tuner.trainer import TrainingArgs, train  # Training utilities
import mlx.optimizers as optim  # Optimizers for model training
import json  # Module to work with JSON data
from pathlib import Path  # Module for handling filesystem paths

---

## Dataset Class Definition
Here we define a `Dataset` class to handle data operations. This class will be responsible for loading and accessing our dataset. It takes a list of data items and a key under which text data is stored. This abstraction allows us to easily fetch data by index and get its length, which are essential operations during training.

In [None]:
# Definition of the Dataset class to handle data operations
class Dataset:
    def __init__(self, data, key: str = "text"):
        self._data = data
        self._key = key

    def __getitem__(self, idx: int):
        return self._data[idx][self._key]

    def __len__(self):
        return len(self._data)

## Loading the Dataset
To train our model, we first need to load our training and validation datasets. The `load_dataset` function takes a file path, checks that the file exists, and reads it as JSON Lines (one JSON object per line). It returns a `Dataset` instance wrapping the loaded records.

In [None]:
# Function to load a dataset from a specified path
def load_dataset(path: str):
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"File not found: {path}")
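    with open(path, "r", encoding="utf-8") as fid:
        # One JSON object per line; skip blank lines such as a trailing newline
        data = [json.loads(line) for line in fid if line.strip()]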
    return Dataset(data)
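
The loader above expects a JSON Lines file in which every line is an object with a `"text"` key (the default key of the `Dataset` class). As a minimal sketch of that format, the snippet below writes two made-up records to a temporary file and parses them line by line. The record contents and the file name `sample.jsonl` are placeholders, not part of the tutorial's data.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records; only the "text" key matters to the Dataset class.
records = [
    {"text": "First training example."},
    {"text": "Second training example."},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.jsonl"  # hypothetical file name

    # Write one JSON object per line.
    with open(path, "w", encoding="utf-8") as fid:
        for record in records:
            fid.write(json.dumps(record) + "\n")

    # Read it back line by line, mirroring the parsing in load_dataset.
    with open(path, "r", encoding="utf-8") as fid:
        data = [json.loads(line) for line in fid]

# Each item is a dict; Dataset.__getitem__ would return item["text"].
texts = [item["text"] for item in data]
print(texts)
```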

---

## Setting Up the Model and Data
In this cell, we define the `setup` function, which initializes and returns the components needed for training: the model, the tokenizer, and the datasets. It loads a pre-trained model and tokenizer from `model_path`, then loads the training and validation datasets with the previously defined `load_dataset` function. The paths below are specific to the original setup; replace them with your own.

In [None]:
# Main function setup
def setup():
    train_dataset_path = "./data/dorian_training_dataset.jsonl"
    val_dataset_path = "./data/dorian_tvalid_dataset.jsonl"
    model_path = "/Users/anima/DorainGray-Phi3-4k-MLX"
    model, tokenizer = load(model_path)
    train_dst, valid_dst = load_dataset(train_dataset_path), load_dataset(val_dataset_path)
    return model, tokenizer, train_dst, valid_dst

---

## Modifying the Model with LoRA
In this section, we will modify the pre-trained model by integrating LoRA layers. LoRA allows us to adapt large pre-trained models with minimal additional parameters, making it efficient for fine-tuning on specific tasks. Below, we will freeze the original model parameters and add LoRA layers where necessary.
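
To make "minimal additional parameters" concrete, here is a rough back-of-the-envelope sketch. It assumes the usual LoRA formulation, in which a frozen `d_in x d_out` weight is augmented by two low-rank factors of shapes `d_in x r` and `r x d_out`. The layer size of 3072 is an assumption chosen for illustration, and `lora_param_counts` is a hypothetical helper, not an MLX function.

```python
def lora_param_counts(d_in: int, d_out: int, r: int) -> tuple[int, int]:
    """Return (full, lora) weight counts for one linear layer, ignoring biases."""
    full = d_in * d_out        # frozen dense weight matrix
    lora = r * (d_in + d_out)  # two low-rank factors: (d_in x r) and (r x d_out)
    return full, lora


# Assumed layer size for illustration; r=64 matches the rank used in this notebook.
full, lora = lora_param_counts(d_in=3072, d_out=3072, r=64)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Under these assumptions the trainable LoRA factors amount to roughly 4% of the weights in the layer they wrap. Lowering `r` shrinks that share further.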

In [None]:
# Modify the model with LoRA layers
def modify_model_with_lora(model):
    # Freeze the model to prevent updating weights of non-LoRA layers
    model.freeze()
    # Projections to wrap: attention (first four), then MLP/MoE (last three)
    projections = [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ]
    # Iterate through each transformer layer in the model
    for l in model.model.layers:
        
        # Update self_attn projections if they exist
        for proj in projections[:4]:  # For q_proj, k_proj, v_proj, o_proj
            if hasattr(l.self_attn, proj):
                # Replace existing linear layers with LoRALinear layers
                setattr(l.self_attn, proj, LoRALinear.from_linear(
                    getattr(l.self_attn, proj), r=64, alpha=128
                ))
        
        # Update block_sparse_moe projections if they exist
        if hasattr(l, "block_sparse_moe"):
            for proj in projections[4:]:  # For gate_proj, up_proj, down_proj
                if hasattr(l.block_sparse_moe, proj):
                    # Replace existing linear layers with LoRALinear layers
                    setattr(l.block_sparse_moe, proj, LoRALinear.from_linear(
                        getattr(l.block_sparse_moe, proj), r=64, alpha=128
                    ))
            
            # Update experts within block_sparse_moe
            for e in l.block_sparse_moe.experts:
                for proj in projections:  # Check all projections for each expert
                    if hasattr(e, proj):
                        # Replace existing linear layers with LoRALinear layers
                        setattr(e, proj, LoRALinear.from_linear(
                            getattr(e, proj), r=64, alpha=128
                        ))
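
The function above relies on a `hasattr`/`setattr` swap: a projection is wrapped only if the module has an attribute with exactly that name. One consequence is that a model that names its projections differently (for example, a fused `qkv_proj`) is skipped silently. The sketch below illustrates this on stand-in objects, with no MLX required. `FakeLinear`, `FakeLoRA`, `FakeAttention`, and `wrap_projections` are hypothetical names used only for this illustration.

```python
class FakeLinear:
    """Stand-in for a linear layer (hypothetical, for illustration only)."""


class FakeLoRA:
    """Stand-in for a LoRA wrapper that keeps a reference to the wrapped layer."""

    def __init__(self, base):
        self.base = base


class FakeAttention:
    """Stand-in attention block that uses a fused projection name."""

    def __init__(self):
        self.qkv_proj = FakeLinear()  # fused name: not in the target list
        self.o_proj = FakeLinear()


def wrap_projections(module, names):
    """Wrap each named attribute that exists; return (wrapped, skipped) names."""
    wrapped, skipped = [], []
    for name in names:
        if hasattr(module, name):
            setattr(module, name, FakeLoRA(getattr(module, name)))
            wrapped.append(name)
        else:
            skipped.append(name)
    return wrapped, skipped


attn = FakeAttention()
wrapped, skipped = wrap_projections(attn, ["q_proj", "k_proj", "v_proj", "o_proj"])
print("wrapped:", wrapped)
print("skipped:", skipped)
```

Reporting the skipped names, as this sketch does, is one way to confirm that the layers you intended to adapt were actually replaced.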

---

## Training Configuration and Execution
Now that our model has been modified to include LoRA layers, we need to set up the training configuration. This includes defining the training arguments, learning rate schedule, and optimizer. We will then proceed to train the model using the specified training and validation datasets. The training process is monitored by evaluating the model periodically and saving the model at specified intervals.

In [None]:
# Configure and execute training
def train_model(model, tokenizer, train_dst, valid_dst):
    trainingArgs = TrainingArgs(
        batch_size=1,
        iters=5000,
        val_batches=25,
        steps_per_report=10,
        steps_per_eval=200,
        steps_per_save=200,
        adapter_file="adapters.npz",
        max_seq_length=4096,
    )
    decay_steps = trainingArgs.iters
    lr_schedule = optim.cosine_decay(1e-5, decay_steps)
    opt = optim.AdamW(learning_rate=lr_schedule)

    
    train(model=model, 
          tokenizer=tokenizer, 
          args=trainingArgs, 
          optimizer=opt, 
          train_dataset=train_dst, 
          val_dataset=valid_dst)
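
To get a feel for the learning rate schedule used above, the following pure-Python sketch evaluates the standard cosine decay formula at a few steps. It is an approximation written for illustration, not the MLX implementation. `cosine_decay_sketch` is a hypothetical helper, and the assumption is that the rate decays from the initial value to zero over `decay_steps` and stays there afterwards.

```python
import math


def cosine_decay_sketch(init: float, decay_steps: int, end: float = 0.0):
    """Return a function mapping a step number to a learning rate."""

    def schedule(step: int) -> float:
        s = min(step, decay_steps)  # hold the final value after decay_steps
        decay = 0.5 * (1.0 + math.cos(math.pi * s / decay_steps))
        return end + decay * (init - end)

    return schedule


# Same initial rate and horizon as the training cell: 1e-5 over 5000 iterations.
lr = cosine_decay_sketch(1e-5, 5000)
for step in (0, 1250, 2500, 3750, 5000):
    print(step, f"{lr(step):.3e}")
```

With these settings the rate starts at 1e-5, passes through half that value at the midpoint, and reaches zero at the final iteration, so the last few hundred steps make only very small updates.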

---

## Executing the Main Function
Finally, we execute the main function which orchestrates the setup, model modification, and training process. This cell will trigger all the defined functions and start the model training process. Watch the outputs for progress and any potential issues that might need debugging.

### The saved adapters will appear in your directory as training progresses

In [None]:
# Execute main function
model, tokenizer, train_dst, valid_dst = setup()
modify_model_with_lora(model)
train_model(model, tokenizer, train_dst, valid_dst)

---

## Fuse Trained Adapters to the Base Model

After training adapters for specific tasks, you can fuse these adapters to the base model. This step integrates the specialized capabilities of the adapters directly into the model, which in turn creates a single model that can be used for inference. This model will be used as the starting point for conversion into GGUF format. This allows us to interact with it locally!

### Breakdown

- `python3`: This invokes the Python interpreter to run the script.

- `./mlx-examples/lora/fuse.py`: This is the path to the Python script that handles the fusion of adapters to the base model.

- `--model ./path/to/model`: Specifies the path to the base model. This should be the location where the pre-trained or previously fine-tuned model is stored.

- `--save-path ./new-fused-model-name`: This option sets the path and name for the output model file after the fusion process. This file will contain the base model with the adapters integrated.

- `--de-quantize`: This flag indicates that if the model is quantized, it should be de-quantized before fusion. This is often necessary to ensure compatibility between the model and the adapters.

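- `--adapter-file ./adapters.npz.safetensors`: Specifies the path to the adapter file, which contains the trained adapter parameters that will be fused with the base model. Training was configured with `adapter_file="adapters.npz"`, while this command points at `adapters.npz.safetensors`, so check your working directory for the exact name of the saved file.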

### Customization Options

- **Model Path (`--model`)**: You can specify different models to which you want to apply the adapters, allowing for flexibility in experimenting with various base models.

- **Output Path (`--save-path`)**: Adjust this path based on where you want to store the fused model. This is useful for organizing different versions or types of fused models.

- **De-quantization (`--de-quantize`)**: This option can be toggled based on whether the input model is quantized. If your workflow involves models that are not quantized, this flag can be omitted.

- **Adapter File (`--adapter-file`)**: This path can be changed to point to different adapter files, allowing you to fuse various adapters with the base model depending on the specific enhancements or customizations you've developed.

In [None]:
!python3 ./mlx-examples/lora/fuse.py --model ./path/to/model \
    --save-path ./new-fused-model-name \
    --de-quantize \
    --adapter-file ./adapters.npz.safetensors
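
If you fuse several adapter files or switch between quantized and non-quantized base models, it can help to assemble the command in Python rather than editing the shell line by hand. The sketch below only builds the argument list from the flags described above. `build_fuse_command` is a hypothetical helper, and the paths are the same placeholders used in the command.

```python
import shlex


def build_fuse_command(model, save_path, adapter_file, de_quantize=True,
                       script="./mlx-examples/lora/fuse.py"):
    """Assemble the fuse.py invocation as an argument list."""
    cmd = ["python3", script, "--model", model, "--save-path", save_path]
    if de_quantize:
        # Omit this flag when the base model is not quantized.
        cmd.append("--de-quantize")
    cmd += ["--adapter-file", adapter_file]
    return cmd


cmd = build_fuse_command(
    model="./path/to/model",
    save_path="./new-fused-model-name",
    adapter_file="./adapters.npz.safetensors",
)
print(shlex.join(cmd))
# To run it from Python: subprocess.run(cmd, check=True)
```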