# Object Detection with Synthetic Data using Blender and YOLO

## 1. Setting Up Blender
To begin, ensure you have Blender installed on your system. You can download it from [blender.org](https://www.blender.org/download/).

Download or create your own HDRIs (environment textures) to use as random backgrounds in Blender.
[Download free, CC0, high-resolution HDRIs](https://polyhaven.com/hdris) (use the HDR format)

Additionally, set up your Python environment with the required libraries: `opencv-python` (imported as `cv2`) and `ultralytics`.


Finally, if you have a dedicated GPU in your system, make sure to set up Blender and PyTorch to use it for rendering and model training.

Use the cell below to check your environment and make sure all the packages are working properly.


In [None]:
# Install required packages
# Note: Run this in a Jupyter environment with internet access

# Ultralytics YOLO (includes YOLO11)
%pip install ultralytics --quiet

# OpenCV for image processing
%pip install opencv-python --quiet


# Install torch with CUDA support if a compatible GPU is available
# https://pytorch.org/get-started/locally/


# Blender's Python API is included in Blender itself.
# If you want to run Blender scripts from outside Blender, you need to call Blender in background mode:
# blender --background --python your_script.py

# Check versions
import sys
import cv2
import ultralytics
import os
import torch
from ultralytics import YOLO
import numpy as np

print("Python version:", sys.version)
print("OpenCV version:", cv2.__version__)
print("Ultralytics YOLO version:", ultralytics.__version__)
print("PyTorch version:", torch.__version__)


In [None]:
# Check if GPU is available for PyTorch training
# This is important for performance, especially with large models and datasets
# For more information, visit: https://pytorch.org/docs/stable/notes/cuda.html
# If you encounter issues, ensure that your CUDA drivers are correctly installed and compatible with your PyTorch version.
# Print the result explicitly so it also shows when the cell is run as a script
print("Is GPU available for PyTorch:", torch.cuda.is_available())
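
The result of this check can also be turned into a device string for later steps. The following is a minimal sketch, not code from this repository: the helper name `pick_device` is hypothetical, and it simply falls back to the CPU when PyTorch or a CUDA GPU is not available.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so nothing can run on the GPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# Hypothetical usage: decide once and reuse the string later
device = pick_device()
print("Selected device:", device)
```

If this prints `cpu` on a machine with a dedicated GPU, the installed PyTorch build or the CUDA drivers are the first things to check.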

### Open the Blender file that is included in this repository: `DatasetGenerator.blend`
It includes a Coral model and an Algae model. By default only the Coral is visible. To create a dataset for the Algae game piece instead, hide the Coral in the viewport, disable it in renders, and enable the Algae.

## 2. Automating Blender for Synthetic Data Generation

I use **Blender's Python API** to automate the rendering of synthetic images.  
This includes randomizing environments, objects, and camera angles to generate diverse training data for machine learning applications.
<br>
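
As an illustration of the camera randomization described above, here is a minimal sketch. It is not the script shipped in `DatasetGenerator.blend`. The function names, the radius range, the camera object name `Camera`, and the assumption that the object sits at the world origin are all hypothetical. The sampling part is plain Python. `keyframe_camera` only works inside Blender, where `bpy` and `mathutils` are available.

```python
import math
import random


def sample_camera_position(rng, min_radius=1.0, max_radius=3.0):
    """Sample a point on the upper half of a spherical shell around the origin."""
    radius = rng.uniform(min_radius, max_radius)
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    # Uniform in cos(polar angle) gives an even spread over the hemisphere
    cos_polar = rng.uniform(0.0, 1.0)
    sin_polar = math.sqrt(1.0 - cos_polar ** 2)
    return (
        radius * sin_polar * math.cos(azimuth),
        radius * sin_polar * math.sin(azimuth),
        radius * cos_polar,
    )


def keyframe_camera(positions, camera_name="Camera"):
    """Keyframe one camera pose per frame. Only works inside Blender."""
    import bpy
    from mathutils import Vector

    camera = bpy.data.objects[camera_name]  # assumed object name
    for frame, position in enumerate(positions, start=1):
        camera.location = position
        # Aim the camera's -Z axis at the origin, where the object is assumed to sit
        look_direction = -Vector(position)
        camera.rotation_euler = look_direction.to_track_quat("-Z", "Y").to_euler()
        camera.keyframe_insert(data_path="location", frame=frame)
        camera.keyframe_insert(data_path="rotation_euler", frame=frame)


rng = random.Random(0)  # fixed seed so the same poses can be regenerated
positions = [sample_camera_position(rng) for _ in range(5)]
```

Inside Blender, calling `keyframe_camera(positions)` would store one pose per frame, so rendering the animation produces one image per sampled viewpoint.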

### <span style="color:red">Important Notice</span>

Before running the automation script and rendering the animation, make sure the following settings are correctly configured in Blender:

- **Render Engine:** Set to `Cycles`
- **Device:** Set to `GPU Compute`  
  *(If it is grayed out, go to: `Edit` → `Preferences` → `System` → `Cycles Render Devices`)*

<img src="media/System.png" alt="System Preferences" width="650"/>
<img src="media/RenderPreferences.png" alt="Render Preferences" width="480"/>

- **Output Settings:**
  - Choose the correct **output folder**
  - The **Frame Range** defines how many images you will get (one image per frame)
  - Set **file format** to `JPEG`

<img src="media/OutputPreferences.png" alt="Output Preferences" width="400"/>
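
The same settings can also be applied from a script. The sketch below is an assumption-laden illustration, not part of the repository: the function name `configure_render`, the placeholder output path, and the image count of 500 are hypothetical, and a frame step of 1 is assumed. The attribute names are those of Blender's scene API. A stand-in object is used so the snippet also runs outside Blender.

```python
from types import SimpleNamespace


def configure_render(scene, output_dir, num_images, first_frame=1):
    """Apply the render settings listed above to a Blender scene object."""
    scene.render.engine = "CYCLES"      # Render Engine: Cycles
    scene.cycles.device = "GPU"         # Device: GPU Compute
    scene.render.filepath = output_dir  # output folder
    scene.render.image_settings.file_format = "JPEG"
    # One image is rendered per frame, so the range length equals the image count
    scene.frame_start = first_frame
    scene.frame_end = first_frame + num_images - 1
    return scene.frame_end - scene.frame_start + 1


# Stand-in object so the sketch also runs outside Blender
dummy_scene = SimpleNamespace(
    render=SimpleNamespace(image_settings=SimpleNamespace()),
    cycles=SimpleNamespace(),
)
image_count = configure_render(dummy_scene, "Path/To/Output/Folder/", num_images=500)
print("Images that will be rendered:", image_count)
```

Inside Blender the call would be `configure_render(bpy.context.scene, ...)`. The GPU still has to be enabled under `Cycles Render Devices` in the preferences for `GPU Compute` to take effect.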

---

### **<span style="color:green">Run the script and Render the animation</span>**
To run the script, navigate to the `Scripting` tab and press *Run Script*.

Before running the script, make sure the `hdr_folder` path is specified correctly. You can download HDRIs [here](https://polyhaven.com/hdris).

After running the script, scroll through the timeline at the bottom and check that the camera is moving as expected.

Finally, press `Render` → `Render Animation`.

Be sure to configure your scene as described above before rendering.

<img src="media/RunScript.png" alt="Run Script" width="800"/>


After generating the image data, go to the render properties, set `Max Samples` under `Sampling` → `Render` to 1, and check the `Transparent` option under `Film`.

Then change both materials of the Coral object to the `RED` material and delete the `Sphere` object.

Finally, select a different output folder and render the animation again to get the masks for your image data.
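
The render-property changes for the mask pass can be sketched in script form as well. This is only an illustration under assumptions: the function name `configure_mask_pass` and the mask folder path are hypothetical, and the material swap and the deletion of the `Sphere` object are left as manual steps. A stand-in object is used so the snippet runs outside Blender.

```python
from types import SimpleNamespace


def configure_mask_pass(scene, mask_output_dir):
    """Switch a Blender scene from the image pass to the mask pass."""
    scene.cycles.samples = 1                 # Sampling -> Render -> Max Samples
    scene.render.film_transparent = True     # Film -> Transparent
    scene.render.filepath = mask_output_dir  # keep masks apart from the images


# Stand-in object so the sketch also runs outside Blender
dummy_scene = SimpleNamespace(render=SimpleNamespace(), cycles=SimpleNamespace())
configure_mask_pass(dummy_scene, "Path/To/Blender/Output/Folder/Masks")
```

Inside Blender the call would be `configure_mask_pass(bpy.context.scene, ...)`. Keeping the frame range unchanged between the two passes means each mask lines up with the image rendered for the same frame.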

## 3. Convert Images to Grayscale and Organize Them for YOLO
To force the model to rely less on color data, we can convert the rendered RGB images to grayscale using OpenCV or similar libraries.

Use the cell below to perform the conversion.


In [None]:
# Folder containing image data from Blender renders
image_data_folder = "Path/To/Blender/Output/Folder/Images"

# Output folder for grayscale images
output_folder = "Path/To/Folder/Where/To/Store/Training/Data"

# Create output folder and subfolders if they don't exist
os.makedirs(output_folder, exist_ok=True)
os.makedirs(os.path.join(output_folder, "Train"), exist_ok=True)
os.makedirs(os.path.join(output_folder, "Val"), exist_ok=True)

# Convert all images in the image data folder to grayscale and save to output folder, train, and val subfolders
image_files = [f for f in os.listdir(image_data_folder) if f.lower().endswith('.jpg')]
image_files.sort()  # for reproducibility

# Split into training and validation sets (80% train, 20% val)
num_val = max(1, int(0.2 * len(image_files)))
val_files = image_files[:num_val]
train_files = image_files[num_val:]

# Convert each split to grayscale and write it to its subfolder
for subfolder, files in (("Train", train_files), ("Val", val_files)):
    for fname in files:
        img = cv2.imread(os.path.join(image_data_folder, fname))
        if img is None:  # cv2.imread returns None for unreadable files
            print("Skipping unreadable image:", fname)
            continue
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        cv2.imwrite(os.path.join(output_folder, subfolder, fname), gray)
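
The cell above takes the first 20% of the sorted file names as the validation set. If neighbouring frames turn out to look alike, a seeded shuffle before splitting may give a more representative validation set. The following is a minimal sketch of that alternative, not the method used in this notebook. The helper name `split_train_val` and the example file names are hypothetical.

```python
import random


def split_train_val(file_names, val_fraction=0.2, seed=0):
    """Shuffle file names with a fixed seed, then split into (train, val)."""
    shuffled = sorted(file_names)  # sort first so the input order does not matter
    random.Random(seed).shuffle(shuffled)
    num_val = max(1, int(val_fraction * len(shuffled))) if shuffled else 0
    return shuffled[num_val:], shuffled[:num_val]


# Hypothetical file names standing in for the Blender renders
example_files = [f"{i:04d}.jpg" for i in range(1, 11)]
example_train, example_val = split_train_val(example_files)
print(len(example_train), "training files,", len(example_val), "validation files")
```

Because the seed is fixed, running the split again on the same folder yields the same two lists, which keeps the split reproducible.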