# MLX Ultralytics Object Detection Training (Colab)

This notebook mirrors the MLX object detection workflow in Google Colab. It runs the following steps, numbered to match the section headings:

0. Clone the MLX repository.
1. Install the custom Ultralytics fork.
2. Install MLX and its supporting dependencies.
3. Capture the required API credentials.
4. Download and extract the DroneFace YOLO12 dataset from Roboflow.
5. Write the provided YOLO12 model configuration.
6. Launch training through the MLX CLI.

> ⚠️ Switch your Colab runtime to **GPU** for reasonable training speed (`Runtime → Change runtime type → GPU`).
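
To confirm that the runtime change took effect before installing anything, a check along these lines may help. It is only a sketch: `list_gpu_names` is a hypothetical helper, and it assumes the `nvidia-smi` tool that Colab GPU runtimes normally provide.

```python
import shutil
import subprocess


def list_gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    if shutil.which('nvidia-smi') is None:
        return []  # no NVIDIA driver tools on PATH (likely a CPU or TPU runtime)
    result = subprocess.run(
        ['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpu_names()
print('GPUs visible:', gpus if gpus else 'none (switch the runtime type to GPU)')
```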


## Step 0 · Clone the MLX repository
Clones the public repository (unless an `mlx` directory already exists) and switches the working directory into it so subsequent installs run in place.


In [None]:
import os
import subprocess
from pathlib import Path

REPO_URL = 'https://github.com/ralampay/mlx'
cwd = Path.cwd()
if cwd.name != 'mlx':
    repo_dir = cwd / 'mlx'
    if not repo_dir.exists():
        subprocess.run(['git', 'clone', REPO_URL], check=True)
    os.chdir(repo_dir)
print('Working directory:', Path.cwd())


## Step 1 · Install the ralampay Ultralytics fork
Installs the YOLO implementation that MLX expects for object detection.


In [None]:
import subprocess
import sys

subprocess.run([sys.executable, '-m', 'pip', 'install', '--upgrade', 'pip'], check=True)
subprocess.run([sys.executable, '-m', 'pip', 'install', 'git+https://github.com/ralampay/ultralytics'], check=True)


## Step 2 · Install MLX and supporting dependencies
The editable install exposes the `mlx` CLI. The Roboflow SDK handles dataset downloads.


In [None]:
import subprocess
import sys

subprocess.run([sys.executable, '-m', 'pip', 'install', '-e', '.'], check=True)
subprocess.run([sys.executable, '-m', 'pip', 'install', 'roboflow'], check=True)
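
Before moving on, it can be worth confirming that both installs actually landed. The sketch below uses only the standard library. `installed_version` is a hypothetical helper, and the distribution names `ultralytics` and `roboflow` are assumptions (the fork could publish under a different name).

```python
import shutil
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution names; adjust if the fork uses a different one.
for name in ('ultralytics', 'roboflow'):
    print(f'{name}:', installed_version(name) or 'not installed')

# The editable install is expected to put the `mlx` entry point on PATH.
print('mlx CLI:', shutil.which('mlx') or 'not found on PATH')
```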


## Step 3 · Supply API credentials
Roboflow requires an API key to download datasets. The OpenAI key is optional and only needed for the chat module. Both keys are stored in environment variables for the current Colab session only.


In [None]:
import os
import getpass

if not os.environ.get('ROBOFLOW_API_KEY'):
    os.environ['ROBOFLOW_API_KEY'] = getpass.getpass('Enter your Roboflow API Key: ')

if not os.environ.get('OPENAI_API_KEY'):
    optional_key = getpass.getpass('Optional: Enter your OpenAI API Key (leave blank to skip): ')
    if optional_key:
        os.environ['OPENAI_API_KEY'] = optional_key

print('Roboflow key loaded:', bool(os.environ.get('ROBOFLOW_API_KEY')))
print('OpenAI key detected.' if os.environ.get('OPENAI_API_KEY') else 'OpenAI key not set (only needed for the chat module).')
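
If retyping the key in every session becomes tedious, one alternative is to read it from Colab's Secrets panel. The sketch below is an assumption-laden variant of the cell above, not part of the MLX workflow: `load_secret` is a hypothetical helper, and it relies on `google.colab.userdata`, which is only importable inside Colab.

```python
import os


def load_secret(name):
    """Return a secret from the environment, falling back to Colab Secrets if available."""
    value = os.environ.get(name)
    if value:
        return value
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        return None
    try:
        return userdata.get(name) or None
    except Exception:
        # The secret is missing or notebook access to it was not granted.
        return None


key = load_secret('ROBOFLOW_API_KEY')
if key:
    os.environ['ROBOFLOW_API_KEY'] = key
print('Roboflow key loaded:', bool(key))
```

When this returns nothing, the `getpass` prompt above remains the fallback.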


## Step 4 · Download and extract the DroneFace YOLO12 dataset
The Roboflow SDK pulls version 8 of the dataset in the `yolo12` export format and extracts it to `datasets/droneface-yolo12`.


In [None]:
from pathlib import Path
from roboflow import Roboflow
import os

api_key = os.environ.get('ROBOFLOW_API_KEY')
if not api_key:
    raise RuntimeError('ROBOFLOW_API_KEY is not set. Please rerun the previous cell.')

rf = Roboflow(api_key=api_key)
project = rf.workspace('face-detection-and-recognition-dataset').project('droneface')
dataset = project.version(8).download('yolo12', location='datasets/droneface-yolo12')
dataset_dir = Path(dataset.location)
print('Dataset ready at:', dataset_dir)
print('Sample contents:', [p.name for p in dataset_dir.iterdir()][:5])
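
A quick sanity check on the download can catch an empty or partial export before training starts. The sketch below assumes the layout Roboflow YOLO exports commonly use (`train/`, `valid/`, `test/`, each with an `images/` subfolder, plus a top-level `data.yaml`). `count_images` is a hypothetical helper, and the folder names are assumptions rather than something this notebook guarantees.

```python
from pathlib import Path

IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.bmp'}


def count_images(dataset_root, splits=('train', 'valid', 'test')):
    """Count image files under <root>/<split>/images for each split that exists."""
    counts = {}
    for split in splits:
        image_dir = Path(dataset_root) / split / 'images'
        if image_dir.is_dir():
            counts[split] = sum(
                1 for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


root = Path('datasets/droneface-yolo12')
print('Images per split:', count_images(root))
print('data.yaml present:', (root / 'data.yaml').is_file())
```

Splits that do not exist are simply left out of the result, so an empty dictionary suggests the download location or layout differs from the assumption.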


## Step 5 · Write the `cad_yolo12.yaml` model configuration
Stores the provided YAML under `ultralytics/cfg/models/ext/cad_yolo12.yaml` so the MLX CLI can reference it directly.


In [None]:
from pathlib import Path

model_yaml_path = Path('ultralytics/cfg/models/ext/cad_yolo12.yaml')
model_yaml_path.parent.mkdir(parents=True, exist_ok=True)
yaml_content = "# Ultralytics \U0001f680 AGPL-3.0 License - https://ultralytics.com/license\n\n# YOLO12 object detection model with P3/8 - P5/32 outputs\n# Model docs: https://docs.ultralytics.com/models/yolo12\n# Task docs: https://docs.ultralytics.com/tasks/detect\n\n# Parameters\nnc: 1 # number of classes\nscales: # model compound scaling constants, i.e. 'model=yolo12n.yaml' will call yolo12.yaml with scale 'n'\n  # [depth, width, max_channels]\n  n: [0.50, 0.25, 1024] # summary: 272 layers, 2,602,288 parameters, 2,602,272 gradients, 6.7 GFLOPs\n  s: [0.50, 0.50, 1024] # summary: 272 layers, 9,284,096 parameters, 9,284,080 gradients, 21.7 GFLOPs\n  m: [0.50, 1.00, 512] # summary: 292 layers, 20,199,168 parameters, 20,199,152 gradients, 68.1 GFLOPs\n  l: [1.00, 1.00, 512] # summary: 488 layers, 26,450,784 parameters, 26,450,768 gradients, 89.7 GFLOPs\n  x: [1.00, 1.50, 512] # summary: 488 layers, 59,210,784 parameters, 59,210,768 gradients, 200.3 GFLOPs\n\n# YOLO12n backbone\nbackbone:\n  # [from, repeats, module, args]\n  - [-1, 1, ConvAttnDeform, [64, 3, 2]] # 0-P1/2\n  - [-1, 1, ConvAttnDeform, [128, 3, 2]] # 1-P2/4\n  - [-1, 2, C3k2, [256, False, 0.25]]\n  - [-1, 1, ConvAttnDeform, [256, 3, 2]] # 3-P3/8\n  - [-1, 2, C3k2, [512, False, 0.25]]\n  - [-1, 1, ConvAttnDeform, [512, 3, 2]] # 5-P4/16\n  - [-1, 4, A2C2f, [512, True, 4]]\n  - [-1, 1, ConvAttnDeform, [1024, 3, 2]] # 7-P5/32\n  - [-1, 4, A2C2f, [1024, True, 1]] # 8\n\n# YOLO12n head\nhead:\n  - [-1, 1, nn.Upsample, [None, 2, \"nearest\"]]\n  - [[-1, 6], 1, Concat, [1]] # cat backbone P4\n  - [-1, 2, A2C2f, [512, False, -1]] # 11\n\n  - [-1, 1, nn.Upsample, [None, 2, \"nearest\"]]\n  - [[-1, 4], 1, Concat, [1]] # cat backbone P3\n  - [-1, 2, A2C2f, [256, False, -1]] # 14\n\n  - [-1, 1, Conv, [256, 3, 2]]\n  - [[-1, 11], 1, Concat, [1]] # cat head P4\n  - [-1, 2, A2C2f, [512, False, -1]] # 17\n\n  - [-1, 1, Conv, [512, 3, 2]]\n  - [[-1, 8], 1, Concat, [1]] # cat head P5\n  - [-1, 2, C3k2, [1024, True]] # 20 (P5/32-large)\n\n  - [[14, 17, 20], 1, Detect, [nc]] # Detect(P3, P4, P5)\n"
# Write as UTF-8 explicitly because the license header contains a non-ASCII character.
model_yaml_path.write_text(yaml_content, encoding='utf-8')
print('Model config saved to:', model_yaml_path.resolve())
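
The `scales` rows in the YAML are `[depth, width, max_channels]` multipliers. The sketch below shows how such constants are typically applied to the channel counts and repeat counts listed in `backbone` and `head`. It follows the rule used by the upstream Ultralytics model parser as best understood here. The helper names are hypothetical, and the fork (in particular its custom `ConvAttnDeform` module) may handle scaling differently.

```python
import math


def scaled_channels(channels, width, max_channels, divisor=8):
    """Cap at max_channels, apply the width multiplier, round up to a multiple of divisor."""
    return math.ceil(min(channels, max_channels) * width / divisor) * divisor


def scaled_repeats(repeats, depth):
    """Apply the depth multiplier to blocks that repeat more than once."""
    return max(round(repeats * depth), 1) if repeats > 1 else repeats


depth, width, max_channels = 0.50, 0.25, 1024  # the 'n' row from `scales`
for channels, repeats in [(64, 1), (256, 2), (512, 4), (1024, 4)]:
    print(
        f'channels {channels} -> {scaled_channels(channels, width, max_channels)}, '
        f'repeats {repeats} -> {scaled_repeats(repeats, depth)}'
    )
```

Under these assumptions the `n` scale shrinks every channel count to a quarter and halves the repeated blocks, which is why it is the lightest variant in the table.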


## Step 6 · Launch MLX object detection training
Adjust the hyperparameters (`--epochs`, `--batch-size`) as needed. The command streams its Rich/Typer output directly into the notebook.


In [None]:
import subprocess
from pathlib import Path

try:
    dataset_dir
except NameError as exc:
    raise RuntimeError('Dataset directory not found. Please run the previous steps first.') from exc

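# Prefer the first CUDA device when PyTorch can see one; otherwise train on CPU.
try:
    import torch
    device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
except ImportError:
    # PyTorch is normally installed as an Ultralytics dependency.
    # If the import fails, assume the Colab GPU runtime.
    device = 'cuda:0'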

command = [
    'mlx',
    '--module', 'obj-detect',
    '--platform', 'ultralytics',
    '--action', 'train',
    '--dataset-path', str(dataset_dir),
    '--model-path', 'ultralytics/cfg/models/ext/cad_yolo12.yaml',
    '--epochs', '100',
    '--batch-size', '16',
    '--device', device,
]

print('Running:', ' '.join(command))
subprocess.run(command, check=True)
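
Once training finishes, the weights need to be located before the Colab session ends. Upstream Ultralytics writes checkpoints such as `best.pt` under a `runs/` directory by default. Whether the MLX CLI keeps that location is an assumption here, so the sketch below simply searches for the newest matching file. `latest_checkpoint` is a hypothetical helper.

```python
from pathlib import Path


def latest_checkpoint(runs_root='runs', filename='best.pt'):
    """Return the most recently modified checkpoint under runs_root, or None."""
    root = Path(runs_root)
    if not root.is_dir():
        return None
    candidates = list(root.rglob(filename))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


print('Latest checkpoint:', latest_checkpoint())
```

If this prints `None`, check the training log for the save directory and pass it as `runs_root`.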
