# Object Detection with Synthetic Data using Blender and YOLO

## Overview
In this lesson, we will explore how to generate synthetic image data using **Blender**, an open-source 3D creation suite, and use it to train an object detection model with **YOLO (You Only Look Once)**.

By the end of this notebook, you will:
- Understand how to automate Blender for dataset generation.
- Learn how to create annotations in YOLO format.
- Convert images to grayscale to force the model to rely less on the object's color.
- Train and test a YOLOv11 object detection model.

### What We'll Do
We’ll use **Blender** to create synthetic datasets for object detection tasks in machine learning.  
Synthetic data allows us to generate large, labeled datasets without the time-consuming process of manual image collection and annotation.  

For the detection model, we’ll use the **Ultralytics YOLO Python library**, which provides tools for both training and inference.

## 1. Setting Up Blender and Python Environment
To begin, ensure you have Blender installed on your system. You can download it from [blender.org](https://www.blender.org/download/).

Download or create your own HDRIs (environment textures) to use as random backgrounds inside Blender.
[Download free, CC0, high-resolution HDRIs](https://polyhaven.com/hdris) (use the HDR format).

Additionally, set up your Python environment with the necessary libraries, such as `opencv-python` (imported as `cv2`) and `ultralytics`.


Finally, if you have a dedicated GPU in your system, make sure to set up Blender and PyTorch to use it for rendering and model training.

Use the cell below to check your environment and make sure all the packages are working properly.


In [None]:
# Install required packages
# Note: Run this in a Jupyter environment with internet access

# YOLOv11 (Ultralytics)
%pip install ultralytics --quiet

# OpenCV for image processing
%pip install opencv-python --quiet


# Install torch with CUDA support if a compatible GPU is available
# https://pytorch.org/get-started/locally/


# Blender's Python API is included in Blender itself.
# If you want to run Blender scripts from outside Blender, you need to call Blender in background mode:
# blender --background --python your_script.py

# Check versions
import sys
import cv2
import ultralytics
import os
import torch
from ultralytics import YOLO
import numpy as np

print("Python version:", sys.version)
print("OpenCV version:", cv2.__version__)
print("Ultralytics YOLO version:", ultralytics.__version__)
print("PyTorch version:", torch.__version__)


In [None]:
# Check if GPU is available for PyTorch training
# This is important for performance, especially with large models and datasets
# For more information, visit: https://pytorch.org/docs/stable/notes/cuda.html
# If you encounter issues, ensure that your CUDA drivers are correctly installed and compatible with your PyTorch version.
print("Is GPU available for PyTorch:")
torch.cuda.is_available()

## 2. Automating Blender for Synthetic Data Generation

In this section, we will use **Blender's Python API** to automate the rendering of synthetic images.  
This includes randomizing environments, objects, and camera angles to generate diverse training data for machine learning applications.
<br>

### ⚠️ <span style="color:red">Important Notice</span>

Before running the automation script and rendering the animation, make sure the following settings are correctly configured in Blender:

- **Render Engine:** Set to `Cycles`
- **Device:** Set to `GPU Compute`  
  *(If it is grayed out, go to: `Edit` → `Preferences` → `System` → `Cycles Render Devices`)*

<img src="attachments/System.png" alt="System Preferences" width="650"/>
<img src="attachments/RenderPreferences.png" alt="Render Preferences" width="480"/>

- **Output Settings:**
  - Choose the correct **output folder**
  - The **Frame Range** defines how many images you will get
  - Set **file format** to `JPEG`

<img src="attachments/OutputPreferences.png" alt="Output Preferences" width="400"/>

---

### ✅ **<span style="color:green">Run the script and Render the animation</span>**
The cell below contains a basic Blender automation script to get you started. To run it, open the `Scripting` tab in Blender and press *Run Script*.

After running the script, scroll through the timeline at the bottom and check that the camera is moving as expected.

Finally, press `Render` → `Render Animation`.

Be sure to configure your scene as described above before rendering.

<img src="attachments/RunScript.png" alt="Run Script" width="1000"/>


In [None]:
import bpy
import mathutils
import random
import os
import math
from math import pi

# -------- CONFIGURATION --------
# Camera settings
# Camera distance from the world origin (spherical radius)
radius_range = (1, 10)  # in Blender units (meters)

# Camera azimuth range: angle around the world Z axis
theta_range = (0, 2 * pi)  # in radians

# Camera polar angle range: measured from the +Z axis, so 10° to 90° keeps the camera at or above the ground plane
phi_range = (10 * (pi / 180), pi / 2)  # in radians

# Camera x rotation offset range (in radians)
x_rotation_offset_range = (-6 / (180 / pi), 6 / (180 / pi))  # in radians

# Camera z rotation offset range (in radians)
z_rotation_offset_range = (-18 / (180 / pi), 18 / (180 / pi))  # in radians

# Name of the camera object in the scene
camera_name = "Camera"

# Name of the target object the camera should point at
target_object_name = "FRC_Coral"

# Environment texture settings
# Folder containing .hdr files
hdr_folder = "Path/HDRIs/8K_HDRI"

# Name of the object whose material has the image texture
object_name = "Sphere"

# Name of the image texture node in the object's material
object_texture_node_name = "Image Texture"

# Name of the environment texture node in the world
world_texture_node_name = "Environment Texture"

# How often to change the environment texture (in frames)
change_interval = 10
# --------------------------------

#region checks
# -------- CHECKS --------
# Check that the Blender scene is configured correctly

# Get list of all .hdr files
hdr_files = [f for f in os.listdir(hdr_folder) if f.lower().endswith(".hdr")]
hdr_files.sort()  # ensure consistent order

if not hdr_files:
    raise FileNotFoundError(f"No .hdr files found in {hdr_folder}")

# Ensure object and world exist
obj = bpy.data.objects.get(object_name)
if not obj:
    raise ValueError(f"Object '{object_name}' not found")

if not obj.data.materials:
    raise ValueError(f"Object '{object_name}' has no materials")

material = obj.data.materials[0]
if not material.use_nodes:
    raise ValueError(f"Material on '{object_name}' is not using nodes")

# Get image texture node for object
obj_tex_node = material.node_tree.nodes.get(object_texture_node_name)
if not obj_tex_node or obj_tex_node.type != "TEX_IMAGE":
    raise ValueError(f"Image texture node '{object_texture_node_name}' not found in object material")

# Get environment texture node in world
world = bpy.context.scene.world
if not world or not world.use_nodes:
    raise ValueError("World material not found or not using nodes")

world_tex_node = world.node_tree.nodes.get(world_texture_node_name)
if not world_tex_node or world_tex_node.type != "TEX_ENVIRONMENT":
    raise ValueError(f"Environment texture node '{world_texture_node_name}' not found in world material")
# --------------------------------
#endregion

# -------- UTILITY FUNCTIONS --------
def convert_old_value_to_new_range(newMin, newMax, oldValue):
    oldRange = (1 - 0)
    newRange = (newMax - newMin)
    newValue = (((oldValue - 0) * newRange) / oldRange) + newMin
    return newValue
# --------------------------------


# -------- FRAME UPDATE --------
# Handler function to change background texture
def randomize_environment_texture(scene):
    frame = scene.frame_current
    if frame % change_interval != 0:
        return  # only change on frames divisible by change_interval

    index = (frame // change_interval) % len(hdr_files)
    hdr_path = os.path.join(hdr_folder, hdr_files[index])

    print(f"Frame {frame}: Loading {hdr_path}")

    # Load or reuse image
    img = bpy.data.images.get(hdr_files[index])
    if not img:
        img = bpy.data.images.load(hdr_path)

    # Update object image texture
    obj_tex_node.image = img

    # Update world environment texture
    world_tex_node.image = img

# Handler function to change camera position and rotation
def randomize_camera_transform(scene):
    current_frame = scene.frame_current
    random.seed(current_frame)  # Seed random number generator for reproducibility

    # Get scene objects
    target_obj = bpy.data.objects[target_object_name]
    camera_obj = bpy.data.objects[camera_name]
    
    # Generate random position
    random_radius = random.random()
    random_theta = random.random()
    random_phi = random.random()

    random_radius = convert_old_value_to_new_range(radius_range[0], radius_range[1], random_radius)
    random_theta = convert_old_value_to_new_range(theta_range[0], theta_range[1], random_theta)
    random_phi = convert_old_value_to_new_range(phi_range[0], phi_range[1], random_phi)

    # Generate random rotation offsets
    random_rotation_x = random.random()
    random_rotation_z = random.random()

    # Camera position in world space (spherical to Cartesian coordinates)
    x_position = random_radius * math.sin(random_phi) * math.cos(random_theta)
    y_position = random_radius * math.sin(random_phi) * math.sin(random_theta)
    z_position = random_radius * math.cos(random_phi)

    # Set camera location
    camera_obj.location = mathutils.Vector((x_position, y_position, z_position))

    # First point the camera at the target, then apply the random offsets
    # Get direction vector from camera to target
    direction = target_obj.location - camera_obj.location
    direction.normalize()

    # Build a rotation that points the camera's -Z axis toward the target with +Y as up (Blender's camera looks down -Z)
    rotation = direction.to_track_quat('-Z', 'Y').to_euler()

    # Apply the rotation to the camera directly
    camera_obj.rotation_euler = rotation

    # Add random camera offsets AFTER the camera is pointing at the target
    camera_obj.rotation_euler[0] += convert_old_value_to_new_range(x_rotation_offset_range[0], x_rotation_offset_range[1], random_rotation_x)
    camera_obj.rotation_euler[1] = 0
    camera_obj.rotation_euler[2] += convert_old_value_to_new_range(z_rotation_offset_range[0], z_rotation_offset_range[1], random_rotation_z)
# --------------------------------

# -------- SET UP FRAME UPDATES --------
bpy.app.handlers.frame_change_pre.clear()
bpy.app.handlers.frame_change_pre.append(randomize_camera_transform)
bpy.app.handlers.frame_change_pre.append(randomize_environment_texture)
# --------------------------------
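
The camera placement in the script above samples a point in spherical coordinates (radius, azimuth `theta`, polar angle `phi`) and converts it to Cartesian coordinates. The sketch below reproduces only that sampling step in plain Python, outside Blender, so you can inspect the positions before rendering. The helper name `sample_camera_position` is hypothetical, and `random.uniform` stands in for the script's `convert_old_value_to_new_range`, which performs the same linear mapping.

```python
import math
import random

# Ranges copied from the configuration block of the Blender script.
radius_range = (1, 10)                        # distance from the world origin
theta_range = (0, 2 * math.pi)                # azimuth around the Z axis
phi_range = (math.radians(10), math.pi / 2)   # polar angle measured from +Z


def sample_camera_position(frame):
    """Return an (x, y, z) camera position for a frame (hypothetical helper)."""
    rng = random.Random(frame)  # per-frame seed, so each frame is reproducible
    radius = rng.uniform(*radius_range)
    theta = rng.uniform(*theta_range)
    phi = rng.uniform(*phi_range)
    # Spherical to Cartesian conversion, as in the Blender handler.
    x = radius * math.sin(phi) * math.cos(theta)
    y = radius * math.sin(phi) * math.sin(theta)
    z = radius * math.cos(phi)
    return x, y, z


for frame in range(1, 4):
    x, y, z = sample_camera_position(frame)
    distance = math.sqrt(x * x + y * y + z * z)
    print(f"frame {frame}: ({x:.2f}, {y:.2f}, {z:.2f}), distance {distance:.2f}")
```

Because `phi` never exceeds 90°, the sampled height `z` stays non-negative, so the camera should not drop below the ground plane.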

After generating the image data, go to Render Properties and set `Max Samples` under `Sampling` → `Render` to 1, then enable the `Transparent` option under `Film`.

Next, change both materials of the Coral object to the `RED_Mat` material and delete the `Sphere` object.

Finally, select a different output folder and render the animation again to get the masks for your image data.
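
The annotation step in Section 4 pairs each mask with its image by file name, so both renders need the same frame range and naming. As a quick sanity check, a sketch along these lines can list files that have no counterpart. The helper `unmatched_files` and the example file names are hypothetical; in practice you would pass `os.listdir(...)` of your two output folders.

```python
def unmatched_files(image_names, mask_names):
    """Return (images without a mask, masks without an image), both sorted."""
    images = set(image_names)
    masks = set(mask_names)
    return sorted(images - masks), sorted(masks - images)


# Hypothetical file names, for illustration only.
images = ["0001.jpg", "0002.jpg", "0003.jpg"]
masks = ["0001.jpg", "0003.jpg", "0004.jpg"]

missing_masks, extra_masks = unmatched_files(images, masks)
print(missing_masks)  # ['0002.jpg']
print(extra_masks)    # ['0004.jpg']
```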

## 3. Converting Images to Grayscale
To force the model to rely less on color data, we can convert the rendered RGB images to grayscale using OpenCV or similar libraries.

Use the cell below to perform the conversion.


In [67]:
# Folder containing image data from Blender renders
image_data_folder = "Path/To/Image/Data"

# Output folder for grayscale images
output_folder = "Output/Path/For/GrayScale/Images"

os.makedirs(output_folder, exist_ok=True)
os.makedirs(os.path.join(output_folder, "Train"), exist_ok=True)
os.makedirs(os.path.join(output_folder, "Val"), exist_ok=True)

# Convert all images in the image data folder to grayscale and save to output folder, train, and val subfolders
image_files = [f for f in os.listdir(image_data_folder) if f.lower().endswith('.jpg')]
image_files.sort()  # for reproducibility

num_val = max(1, int(0.1 * len(image_files)))
val_files = image_files[:num_val]
train_files = image_files[num_val:]

for fname in train_files:
    img = cv2.imread(os.path.join(image_data_folder, fname))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imwrite(os.path.join(output_folder, "Train", fname), gray)

for fname in val_files:
    img = cv2.imread(os.path.join(image_data_folder, fname))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imwrite(os.path.join(output_folder, "Val", fname), gray)
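
The cell above takes the first 10% of the sorted file names as the validation set. Those are consecutive frames, and because the Blender script only swaps the HDRI every `change_interval` frames, the validation images may share just a few backgrounds. One possible alternative, sketched below under that assumption, is a seeded shuffle before splitting. The function name `split_train_val` and its parameters are hypothetical.

```python
import random


def split_train_val(file_names, val_fraction=0.1, seed=0):
    """Shuffle file names with a fixed seed, then split into (train, val)."""
    names = sorted(file_names)          # stable starting order
    random.Random(seed).shuffle(names)  # seeded, so the split is reproducible
    num_val = max(1, int(val_fraction * len(names)))
    return names[num_val:], names[:num_val]


# Hypothetical frame names, for illustration only.
frames = [f"{i:04d}.jpg" for i in range(1, 101)]
train_files, val_files = split_train_val(frames)
print(len(train_files), len(val_files))  # 90 10
```

The returned lists can replace `train_files` and `val_files` in the conversion loops without any other change.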

## 4. Generating Annotations in YOLOv11 Format
Once images are rendered, we need to generate bounding box annotations in YOLOv11 format. This format includes class ID and normalized coordinates for each object.

Use the cell below to generate annotation files.

In this example, we generate annotation files for YOLOv11 [Object Detection](https://docs.ultralytics.com/tasks/detect/). With some small modifications, you can also generate annotations for [Oriented Bounding Boxes Object Detection](https://docs.ultralytics.com/tasks/obb/) and [Instance Segmentation](https://docs.ultralytics.com/tasks/segment/).
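
To make the label format concrete, here is a small worked example of the normalization: a pixel-space box `(x, y, w, h)` with its origin at the top-left corner becomes one line of `class x_center y_center width height`, with every value divided by the image size. The helper `to_yolo_line` and the example numbers are illustrative assumptions; rounding to six decimals is a choice made here, not a requirement stated in this notebook.

```python
def to_yolo_line(class_id, x, y, w, h, img_w, img_h):
    """Convert a top-left pixel box to one YOLO detection label line."""
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    width = w / img_w
    height = h / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x100 px box whose top-left corner is at (100, 50) in a 640x480 image.
print(to_yolo_line(0, 100, 50, 200, 100, 640, 480))
# 0 0.312500 0.208333 0.312500 0.208333
```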


In [11]:
# Path of the grayscale images
gray_scale_images_folder_train = os.path.join(output_folder, "Train")
gray_scale_images_folder_val = os.path.join(output_folder, "Val")

# Path of the mask images from Blender
mask_images_folder = "Path/To/Mask/Images"

# Get train/val split from grayscale images
train_files = set(os.listdir(gray_scale_images_folder_train))
val_files = set(os.listdir(gray_scale_images_folder_val))

# Define range of red color in HSV
lower_red = np.array([0, 50, 50])
upper_red = np.array([10, 255, 255])

for filename in os.listdir(mask_images_folder):
    if filename.endswith(".jpg"):
        img = cv2.imread(os.path.join(mask_images_folder, filename))

        # Convert BGR to HSV
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

        # Threshold the HSV image to get only red colors
        mask = cv2.inRange(hsv, lower_red, upper_red)

        # Find contours in the binary image
        contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

        if contours:
            # Find the bounding box of the largest contour
            x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))

            # Calculate normalized bounding box coordinates
            x_center = (x + w/2) / img.shape[1]
            y_center = (y + h/2) / img.shape[0]
            width = w / img.shape[1]
            height = h / img.shape[0]

            # Determine output folder (train or val)
            if filename in train_files:
                txt_folder = gray_scale_images_folder_train
            elif filename in val_files:
                txt_folder = gray_scale_images_folder_val
            else:
                continue  # skip if not found in either set

            # Save bounding box coordinates in YOLO format: class x_center y_center width height
            with open(os.path.join(txt_folder, filename.replace(".jpg", ".txt")), 'w') as f:
                f.write(f'0 {x_center} {y_center} {width} {height}')


# Create YOLOv11 YAML configuration file
yaml_content = f"""
train: {gray_scale_images_folder_train}
val: {gray_scale_images_folder_val}
nc: 1
names: ['coral']
"""

yaml_path = os.path.join(output_folder, "yolov11_config.yaml")
with open(yaml_path, 'w') as f:
    f.write(yaml_content)

print(f"YOLOv11 configuration file saved to {yaml_path}")
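
One detail worth checking in the mask thresholding above: OpenCV's 8-bit HSV stores hue as degrees divided by two (0 to 179), and red sits at the wrap-around point. A red pixel with a slight blue tint, which JPEG compression could plausibly introduce, lands near hue 179 and falls outside the `[0, 10]` band. The sketch below illustrates this with the standard-library `colorsys` module as a stand-in for OpenCV's conversion. The second band starting at hue 170 and the helper names are assumptions for illustration.

```python
import colorsys

LOW_BAND = ((0, 50, 50), (10, 255, 255))      # band used in the cell above
HIGH_BAND = ((170, 50, 50), (179, 255, 255))  # assumed second band for wrapped reds


def opencv_hsv(r, g, b):
    """Approximate OpenCV's 8-bit HSV (H 0..179, S and V 0..255) for an RGB pixel."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))


def in_band(hsv, band):
    """Check whether every HSV channel lies inside the inclusive band."""
    lower, upper = band
    return all(lo <= c <= hi for c, lo, hi in zip(hsv, lower, upper))


def is_red(hsv):
    """Accept a pixel if it falls in either red band."""
    return in_band(hsv, LOW_BAND) or in_band(hsv, HIGH_BAND)


print(opencv_hsv(255, 0, 0))                     # (0, 255, 255)
print(opencv_hsv(255, 0, 8))                     # (179, 255, 255)
print(in_band(opencv_hsv(255, 0, 8), LOW_BAND))  # False
print(is_red(opencv_hsv(255, 0, 8)))             # True
```

If your masks show gaps, the same idea in OpenCV would be to call `cv2.inRange` once per band and combine the two results with `cv2.bitwise_or`.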

## 5. Training YOLOv11 Object Detection Model
With the dataset and annotations ready, we can now train a YOLOv11 object detection model. This step involves configuring the model, loading the dataset, and running the training process.

Use the cell below to initiate training.

For more information, see the [Ultralytics documentation](https://docs.ultralytics.com/tasks/detect/).


In [12]:
# Load a model
model = YOLO("yolo11n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data=yaml_path, epochs=100, imgsz=640, device=0)  # device=0 for GPU
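
The training call above hard-codes `device=0`, which assumes a CUDA GPU is present. A minimal sketch for falling back to the CPU is shown below. The helper name `pick_device` is hypothetical, and the `try`/`except` is only there so the snippet also runs where PyTorch is not installed.

```python
def pick_device(cuda_available):
    """Return 0 (first CUDA GPU) when CUDA is available, otherwise "cpu"."""
    return 0 if cuda_available else "cpu"


try:
    import torch
    cuda_available = torch.cuda.is_available()
except ImportError:
    # PyTorch is missing; training would not run, but the check still works.
    cuda_available = False

device = pick_device(cuda_available)
print("Training device:", device)
```

You can then pass `device=device` to `model.train(...)` instead of the fixed `device=0`. Expect CPU training to be much slower.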

## 6. Using the Trained Model for Detection
After training, we can use the model to detect objects in new images. This step involves loading the trained weights and running predictions on test images.

Use the cell below to perform prediction on test images.


In [13]:
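# Replace with your test image path
test_image = cv2.imread("Path/To/Test/Image.jpg")

# The model was trained on grayscale images, so convert the test image the same way.
# The gray channel is copied back to 3 channels because the predictor expects a BGR image.
gray = cv2.cvtColor(test_image, cv2.COLOR_BGR2GRAY)
test_image = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)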

# Replace with your best.pt path
model = YOLO("runs/detect/train1/weights/best.pt")

# Test model
result = model.predict(source=test_image, save=True, show=True, conf=0.5, save_txt=False)

# Display results
result[0].show()
print(result[0].boxes.xywh)  # x_center, y_center, width, height