# Multi-Object Image Generation
**Author:** G8  
**Task:** 1.2 - Generate Multi-Object Images with YOLO Annotations  
**Timeline:** Feb 3-4, 2025  

**Purpose:**
- Generate 1200+ multi-object images using grid layouts
- Create YOLO format annotations
- Split into train/val/test for YOLOv8 training

## Setup and Imports

In [1]:
import os
import cv2
import numpy as np
import pandas as pd
from pathlib import Path
import shutil
from tqdm import tqdm
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import json
import random

# Set random seed
np.random.seed(42)
random.seed(42)

print("Libraries imported successfully!")

Libraries imported successfully!


## Configuration

In [None]:
# Paths
PROJECT_ROOT = Path.cwd().parent if 'notebooks' in str(Path.cwd()) else Path.cwd()
PROCESSED_PATH = PROJECT_ROOT / "data" / "processed" / "single_objects" / "train"
OUTPUT_IMAGES = PROJECT_ROOT / "data" / "multi_objects" / "images"
OUTPUT_LABELS = PROJECT_ROOT / "data" / "multi_objects" / "labels"
STATS_PATH = PROJECT_ROOT / "data" / "statistics"

# Load class mapping
with open(PROJECT_ROOT / "data" / "class_mapping.json", 'r') as f:
    class_mapping = json.load(f)

# Grid configurations
GRID_CONFIGS = [
    {'name': '2x2', 'rows': 2, 'cols': 2, 'count': 400},
    {'name': '2x3', 'rows': 2, 'cols': 3, 'count': 400}, 
    {'name': '3x3', 'rows': 3, 'cols': 3, 'count': 400}
]

CELL_SIZE = 224  # Each object is 224x224

print("Configuration:")
print(f"  Input path: {PROCESSED_PATH}")
print(f"  Total classes: {class_mapping['num_classes']}")
print(f"  Grid types: {len(GRID_CONFIGS)}")
print(f"  Total images to generate: {sum(g['count'] for g in GRID_CONFIGS)}")

Configuration:
  Input path: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/processed/single_objects/train
  Total classes: 39
  Grid types: 3
  Total images to generate: 1200


## Step 1: Load All Processed Images
Load paths to all preprocessed single-object images

In [3]:
def load_image_paths():
    """
    Load all preprocessed image paths organized by class
    Returns: dict {object_id: [list of image paths]}
    """
    print("="*80)
    print("LOADING PROCESSED IMAGES")
    print("="*80)
    
    image_dict = {}
    
    for obj_folder in PROCESSED_PATH.iterdir():
        if obj_folder.is_dir():
            obj_id = obj_folder.name
            images = sorted(list(obj_folder.glob("*.jpg")))
            if len(images) > 0:
                image_dict[obj_id] = images
    
    total_images = sum(len(imgs) for imgs in image_dict.values())
    
    print(f"\nLoaded {len(image_dict)} object classes")
    print(f"Total images available: {total_images}")
    print(f"Avg images per class: {total_images/len(image_dict):.1f}")
    
    return image_dict

# Load image paths
image_paths = load_image_paths()

LOADING PROCESSED IMAGES

Loaded 39 object classes
Total images available: 2943
Avg images per class: 75.5


## Step 2: Generate Grid Images
Create multi-object images by placing single objects in grid layouts

In [4]:
def create_grid_image(image_paths_dict, class_mapping, rows, cols):
    """
    Create a grid image with multiple objects
    
    Args:
        image_paths_dict: Dict of {obj_id: [image_paths]}
        class_mapping: Class to index mapping
        rows: Number of rows in grid
        cols: Number of columns in grid
    
    Returns:
        grid_image: Combined image (rows*224 x cols*224)
        annotations: List of YOLO format annotations
    """
    num_objects = rows * cols
    
    # Randomly select different object classes
    available_classes = list(image_paths_dict.keys())
    selected_classes = random.sample(available_classes, num_objects)
    
    # Create blank grid
    grid_height = rows * CELL_SIZE
    grid_width = cols * CELL_SIZE
    grid_image = np.zeros((grid_height, grid_width, 3), dtype=np.uint8)
    
    annotations = []
    
    # Fill grid with objects
    for i in range(rows):
        for j in range(cols):
            idx = i * cols + j
            obj_id = selected_classes[idx]
            
            # Randomly select an image from this class
            img_path = random.choice(image_paths_dict[obj_id])
            img = cv2.imread(str(img_path))
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            
            # Place in grid
            y_start = i * CELL_SIZE
            y_end = y_start + CELL_SIZE
            x_start = j * CELL_SIZE
            x_end = x_start + CELL_SIZE
            grid_image[y_start:y_end, x_start:x_end] = img
            
            # Create YOLO annotation (normalized coordinates)
            class_idx = class_mapping['class_to_idx'][obj_id]
            x_center = (x_start + CELL_SIZE/2) / grid_width
            y_center = (y_start + CELL_SIZE/2) / grid_height
            box_width = CELL_SIZE / grid_width
            box_height = CELL_SIZE / grid_height
            
            # YOLO format: class_id x_center y_center width height
            annotations.append(f"{class_idx} {x_center:.6f} {y_center:.6f} {box_width:.6f} {box_height:.6f}")
    
    return grid_image, annotations

# Test function
print("Testing grid generation...")
test_grid, test_annot = create_grid_image(image_paths, class_mapping, 2, 2)
print(f"\nTest grid shape: {test_grid.shape}")
print(f"Test annotations: {len(test_annot)} objects")
print("\nSample annotation:")
print(f"  {test_annot[0]}")
print("\nFormat: class_id x_center y_center width height (all normalized)")

Testing grid generation...

Test grid shape: (448, 448, 3)
Test annotations: 4 objects

Sample annotation:
  34 0.250000 0.250000 0.500000 0.500000

Format: class_id x_center y_center width height (all normalized)


## Step 3: Generate All Multi-Object Images
Create 1200+ images with all grid configurations

In [5]:
print("="*80)
print("GENERATING MULTI-OBJECT IMAGES")
print("="*80)

# Create temp folder for all generated images
temp_images = OUTPUT_IMAGES / "all_generated"
temp_labels = OUTPUT_LABELS / "all_generated"
temp_images.mkdir(parents=True, exist_ok=True)
temp_labels.mkdir(parents=True, exist_ok=True)

generation_stats = []
image_counter = 0

# Generate for each grid configuration
for config in GRID_CONFIGS:
    grid_name = config['name']
    rows = config['rows']
    cols = config['cols']
    count = config['count']
    
    print(f"\nGenerating {count} images for {grid_name} grid ({rows}x{cols})...")
    
    for i in tqdm(range(count), desc=f"{grid_name} grid"):
        # Generate grid image
        grid_img, annotations = create_grid_image(image_paths, class_mapping, rows, cols)
        
        # Save image
        img_filename = f"grid_{grid_name}_{image_counter:04d}.jpg"
        img_path = temp_images / img_filename
        cv2.imwrite(str(img_path), cv2.cvtColor(grid_img, cv2.COLOR_RGB2BGR))
        
        # Save annotation
        label_filename = f"grid_{grid_name}_{image_counter:04d}.txt"
        label_path = temp_labels / label_filename
        with open(label_path, 'w') as f:
            f.write('\n'.join(annotations))
        
        image_counter += 1
    
    generation_stats.append({
        'grid_type': grid_name,
        'rows': rows,
        'cols': cols,
        'objects_per_image': rows * cols,
        'images_generated': count
    })

# Summary
df_gen = pd.DataFrame(generation_stats)
print("\n" + "="*80)
print("GENERATION SUMMARY")
print("="*80)
print(df_gen.to_string(index=False))
print(f"\nTotal images generated: {image_counter}")
print(f"Total annotations created: {image_counter}")

GENERATING MULTI-OBJECT IMAGES

Generating 400 images for 2x2 grid (2x2)...


2x2 grid: 100%|██████████| 400/400 [00:01<00:00, 328.08it/s]



Generating 400 images for 2x3 grid (2x3)...


2x3 grid: 100%|██████████| 400/400 [00:02<00:00, 195.19it/s]



Generating 400 images for 3x3 grid (3x3)...


3x3 grid: 100%|██████████| 400/400 [00:02<00:00, 190.22it/s]


GENERATION SUMMARY
grid_type  rows  cols  objects_per_image  images_generated
      2x2     2     2                  4               400
      2x3     2     3                  6               400
      3x3     3     3                  9               400

Total images generated: 1200
Total annotations created: 1200





## Step 4: Split Multi-Object Dataset
Split into train/val/test (70/15/15)

In [6]:
print("="*80)
print("SPLITTING MULTI-OBJECT DATASET")
print("="*80)

# Get all generated images
all_images = sorted(list(temp_images.glob("*.jpg")))
all_labels = sorted(list(temp_labels.glob("*.txt")))

print(f"\nTotal images: {len(all_images)}")
print(f"Total labels: {len(all_labels)}")

# Shuffle with same seed
indices = list(range(len(all_images)))
random.shuffle(indices)

# Calculate split
n = len(all_images)
train_end = int(n * 0.70)
val_end = train_end + int(n * 0.15)

train_idx = indices[:train_end]
val_idx = indices[train_end:val_end]
test_idx = indices[val_end:]

# Copy to split folders
splits = {
    'train': train_idx,
    'val': val_idx,
    'test': test_idx
}

for split_name, idx_list in splits.items():
    img_folder = OUTPUT_IMAGES / split_name
    label_folder = OUTPUT_LABELS / split_name
    img_folder.mkdir(parents=True, exist_ok=True)
    label_folder.mkdir(parents=True, exist_ok=True)
    
    print(f"\nCopying {len(idx_list)} images to {split_name}...")
    
    for idx in tqdm(idx_list, desc=f"{split_name}"):
        # Copy image
        shutil.copy2(all_images[idx], img_folder / all_images[idx].name)
        # Copy label
        shutil.copy2(all_labels[idx], label_folder / all_labels[idx].name)

print("\n" + "="*80)
print("SPLIT COMPLETE")
print("="*80)
print(f"  Train: {len(train_idx)} ({len(train_idx)/n*100:.1f}%)")
print(f"  Val:   {len(val_idx)} ({len(val_idx)/n*100:.1f}%)")
print(f"  Test:  {len(test_idx)} ({len(test_idx)/n*100:.1f}%)")

SPLITTING MULTI-OBJECT DATASET

Total images: 1200
Total labels: 1200

Copying 840 images to train...


train: 100%|██████████| 840/840 [00:00<00:00, 1296.79it/s]



Copying 180 images to val...


val: 100%|██████████| 180/180 [00:00<00:00, 1711.24it/s]



Copying 180 images to test...


test: 100%|██████████| 180/180 [00:00<00:00, 1767.98it/s]


SPLIT COMPLETE
  Train: 840 (70.0%)
  Val:   180 (15.0%)
  Test:  180 (15.0%)





## Step 5: Create data.yaml for YOLOv8

In [7]:
print("="*80)
print("CREATING data.yaml FOR YOLOv8")
print("="*80)

# Create data.yaml content
data_yaml_path = PROJECT_ROOT / "data" / "data.yaml"
multi_objects_path = PROJECT_ROOT / "data" / "multi_objects"

yaml_content = f"""# YOLOv8 Dataset Configuration
# Generated for CNN Attendance System Project

path: {multi_objects_path.absolute()}
train: images/train
val: images/val
test: images/test

# Number of classes
nc: {class_mapping['num_classes']}

# Class names
names:
"""

# Add class names
idx_to_class = class_mapping['idx_to_class']
for idx in range(class_mapping['num_classes']):
    class_name = idx_to_class[str(idx)]
    yaml_content += f"  {idx}: {class_name}\n"

# Save yaml
with open(data_yaml_path, 'w') as f:
    f.write(yaml_content)

print(f"\ndata.yaml created with {class_mapping['num_classes']} classes")
print(f"Saved to: {data_yaml_path}")
print("\nFirst 5 classes:")
for i in range(min(5, class_mapping['num_classes'])):
    print(f"  {i}: {idx_to_class[str(i)]}")

CREATING data.yaml FOR YOLOv8

data.yaml created with 39 classes
Saved to: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/data.yaml

First 5 classes:
  0: OBJ001
  1: OBJ002
  2: OBJ003
  3: OBJ004
  4: OBJ005


## Step 6: Visualize Samples
Display sample multi-object images with bounding boxes

In [8]:
def visualize_grid_with_boxes(image_path, label_path, class_mapping):
    """
    Visualize multi-object image with bounding boxes
    """
    # Read image
    img = cv2.imread(str(image_path))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    h, w = img.shape[:2]
    
    # Read annotations
    with open(label_path, 'r') as f:
        lines = f.readlines()
    
    # Plot
    fig, ax = plt.subplots(1, 1, figsize=(8, 8))
    ax.imshow(img)
    
    idx_to_class = class_mapping['idx_to_class']
    
    # Draw bounding boxes
    for line in lines:
        parts = line.strip().split()
        class_idx = int(parts[0])
        x_center = float(parts[1]) * w
        y_center = float(parts[2]) * h
        box_w = float(parts[3]) * w
        box_h = float(parts[4]) * h
        
        # Convert to corner coordinates
        x1 = x_center - box_w/2
        y1 = y_center - box_h/2
        
        # Draw rectangle
        rect = patches.Rectangle((x1, y1), box_w, box_h, 
                                  linewidth=2, edgecolor='red', facecolor='none')
        ax.add_patch(rect)
        
        # Add label
        class_name = idx_to_class[str(class_idx)]
        ax.text(x1, y1-5, class_name, color='red', fontsize=10, 
                bbox=dict(boxstyle='round', facecolor='white', alpha=0.7))
    
    ax.axis('off')
    ax.set_title(f"Multi-Object Image: {image_path.name}")
    plt.tight_layout()
    return fig

# Visualize samples from each grid type
print("="*80)
print("VISUALIZING SAMPLES")
print("="*80)

for config in GRID_CONFIGS:
    grid_name = config['name']
    
    # Find first image of this grid type in train
    train_images = sorted(list((OUTPUT_IMAGES / "train").glob(f"grid_{grid_name}_*.jpg")))
    if train_images:
        img_path = train_images[0]
        label_path = OUTPUT_LABELS / "train" / f"{img_path.stem}.txt"
        
        print(f"\nVisualizing {grid_name} grid sample...")
        fig = visualize_grid_with_boxes(img_path, label_path, class_mapping)
        
        # Save visualization
        vis_path = STATS_PATH / f"sample_{grid_name}_visualization.png"
        fig.savefig(vis_path, dpi=150, bbox_inches='tight')
        plt.close()
        print(f"  Saved: {vis_path}")

VISUALIZING SAMPLES

Visualizing 2x2 grid sample...
  Saved: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/statistics/sample_2x2_visualization.png

Visualizing 2x3 grid sample...
  Saved: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/statistics/sample_2x3_visualization.png

Visualizing 3x3 grid sample...
  Saved: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/statistics/sample_3x3_visualization.png


## Step 7: Verification
Verify annotations are correct

In [9]:
print("="*80)
print("VERIFICATION")
print("="*80)

for split in ['train', 'val', 'test']:
    img_folder = OUTPUT_IMAGES / split
    label_folder = OUTPUT_LABELS / split
    
    images = list(img_folder.glob("*.jpg"))
    labels = list(label_folder.glob("*.txt"))
    
    print(f"\n{split.upper()}:")
    print(f"  Images: {len(images)}")
    print(f"  Labels: {len(labels)}")
    
    # Check annotation format
    if labels:
        with open(labels[0], 'r') as f:
            lines = f.readlines()
        print(f"  Objects in first image: {len(lines)}")
        
        # Verify values are normalized (0-1)
        parts = lines[0].strip().split()
        values = [float(x) for x in parts[1:]]
        if all(0 <= v <= 1 for v in values):
            print(f"  Annotation format: VALID (all values in [0,1])")
        else:
            print(f"  WARNING: Annotation values not normalized!")

print("\n" + "="*80)
print("VERIFICATION PASSED!")
print("="*80)

VERIFICATION

TRAIN:
  Images: 919
  Labels: 919
  Objects in first image: 4
  Annotation format: VALID (all values in [0,1])

VAL:
  Images: 231
  Labels: 231
  Objects in first image: 6
  Annotation format: VALID (all values in [0,1])

TEST:
  Images: 233
  Labels: 233
  Objects in first image: 4
  Annotation format: VALID (all values in [0,1])

VERIFICATION PASSED!


## Step 8: Generate Statistics Report

In [10]:
print("="*80)
print("GENERATING STATISTICS REPORT")
print("="*80)

# Save generation stats
gen_stats_path = STATS_PATH / "multi_object_generation_stats.csv"
df_gen.to_csv(gen_stats_path, index=False)
print(f"\nGeneration stats saved: {gen_stats_path}")

# Create split summary
split_summary = {
    'split': ['train', 'val', 'test'],
    'images': [
        len(list((OUTPUT_IMAGES / 'train').glob('*.jpg'))),
        len(list((OUTPUT_IMAGES / 'val').glob('*.jpg'))),
        len(list((OUTPUT_IMAGES / 'test').glob('*.jpg')))
    ]
}
df_split_mo = pd.DataFrame(split_summary)
df_split_mo['percentage'] = df_split_mo['images'] / df_split_mo['images'].sum() * 100

print("\nMulti-Object Dataset Split:")
print(df_split_mo.to_string(index=False))

split_mo_path = STATS_PATH / "multi_object_split_distribution.csv"
df_split_mo.to_csv(split_mo_path, index=False)
print(f"\nSplit stats saved: {split_mo_path}")

GENERATING STATISTICS REPORT

Generation stats saved: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/statistics/multi_object_generation_stats.csv

Multi-Object Dataset Split:
split  images  percentage
train     919   66.449747
  val     231   16.702820
 test     233   16.847433

Split stats saved: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/statistics/multi_object_split_distribution.csv


## Summary

In [11]:
print("="*80)
print("TASK 1.2 COMPLETED: Multi-Object Image Generation")
print("="*80)
print("\nDeliverables:")
print(f"1. Multi-object images: {OUTPUT_IMAGES}")
print(f"2. YOLO annotations: {OUTPUT_LABELS}")
print(f"3. YOLOv8 config: {PROJECT_ROOT / 'data' / 'data.yaml'}")
print(f"4. Statistics: {STATS_PATH}")
print("\nDataset ready for YOLOv8 training!")
print("\nNext: Task 2.1 - Train classification models (Kevin)")

TASK 1.2 COMPLETED: Multi-Object Image Generation

Deliverables:
1. Multi-object images: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/multi_objects/images
2. YOLO annotations: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/multi_objects/labels
3. YOLOv8 config: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/data.yaml
4. Statistics: /Users/kevin/Documents/GitHub/Python/VESKL/11.DAE/NEU/NEU_IE7615/Prj/Discriminative/G8/Project1/IE7615_Discriminative_Project/data/statistics

Dataset ready for YOLOv8 training!

Next: Task 2.1 - Train classification models (Kevin)
