# Data Splitting and Data Augmentation Summary

## 1. Data Splitting Process
   - Split each dataset into train, validation, and test sets using multiple ratios: 60/20/20, 70/15/15, and 80/10/10.
   - The splitting was performed on all three individual datasets as well as the combined dataset.
   - Used the same directory layout for every split and assigned each image to exactly one of train, validation, or test.
   - **Details**:
     - Shuffled each class's file list with `random.shuffle` before slicing it, so assignment to a split is random. No seed is set, so reruns produce different splits.
     - **Directory Structure**:
       - Created one subdirectory per class inside `train`, `val`, and `test` for each dataset split.
   - **Outcome**:
     - Generated separate training, validation, and testing datasets for each split ratio.
     - Reported per-class and overall image counts for every split.
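
The shuffle-and-slice step described above can be sketched as follows. This is a minimal illustration rather than the notebook's exact code: `split_files` and its `seed` parameter are names introduced here, and seeding the shuffle is an assumption added for reproducibility (the notebook shuffles without a seed). The file names are placeholders.

```python
import random

def split_files(files, train_ratio, val_ratio, seed=0):
    """Shuffle a copy of `files` and slice it into train/val/test lists.

    The train and val sizes are truncated with int(), so any remainder
    from rounding ends up in the test list.
    """
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)  # local RNG; global state untouched
    train_end = int(len(shuffled) * train_ratio)
    val_end = train_end + int(len(shuffled) * val_ratio)
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]

# Hypothetical file names for a 36-image class, split 70/15/15
files = [f"img_{i:03d}.jpg" for i in range(36)]
train, val, test = split_files(files, 0.70, 0.15)
print(len(train), len(val), len(test))  # 25 5 6
```

The 25/5/6 result matches what the notebook reports for class `1` (36 images) at the 70/15/15 ratio, which shows how integer truncation pushes leftover images into the test set.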

## 2. Data Augmentation Process
   - Applied data augmentations to the pre-split datasets across all splits (60/20/20, 70/15/15, and 80/10/10).
   - Augmentation was performed independently for train, validation, and test splits.
   - **Augmentation Techniques**:
     - **Transformations Used**:
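       - `RandomResizedCrop`: Crops a random region covering 80–100% of the image area and resizes it to 224×224.
       - `RandomRotation`: Rotates by a random angle within ±15°.
       - `RandomHorizontalFlip`: Flips the image horizontally with the default probability of 0.5.
       - `ColorJitter`: Randomly adjusts brightness, contrast, and saturation (factor 0.2 each) and hue (0.1).
       - `RandomAffine`: Applies random translation (up to 10% of width/height), scaling (0.9–1.1), and shear (±10°).
       - `RandomErasing`: With probability 0.5, erases a random rectangle covering 2–15% of the image area.
     - **Augmentation Count**:
       - Generated 10 augmented images per original image; the originals are copied alongside, so each augmented split holds 11 times as many images as its source split.
   - **Outcome**:
     - Created augmented datasets containing each original image plus its 10 augmented variants, for every split and dataset combination.
     - Reported per-class and overall counts of original and augmented images.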
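
The counts implied by "10 augmented images per original" can be checked with a short sketch. The helper names `expected_counts` and `count_on_disk` are introduced here for illustration and are not part of the notebook; `count_on_disk` assumes augmented files carry `_aug_` in their name (the notebook's `<name>_aug_<i>.jpg` scheme) and that original file names do not.

```python
import os
import tempfile

NUM_AUGMENTATIONS = 10  # value used in the notebook

def expected_counts(n_original, num_aug=NUM_AUGMENTATIONS):
    """Expected image counts when every augmentation succeeds."""
    augmented = n_original * num_aug
    return {"original": n_original, "augmented": augmented,
            "total": n_original + augmented}

def count_on_disk(class_dir, marker="_aug_"):
    """Count original and augmented files in one class directory.

    Assumption: augmented files contain `marker` in their name and
    original files do not.
    """
    names = [f for f in os.listdir(class_dir)
             if os.path.isfile(os.path.join(class_dir, f))]
    augmented = sum(1 for f in names if marker in f)
    return {"original": len(names) - augmented, "augmented": augmented,
            "total": len(names)}

print(expected_counts(25))  # {'original': 25, 'augmented': 250, 'total': 275}

# Demo on a throwaway directory: 2 empty "originals", 10 "augmentations" each
with tempfile.TemporaryDirectory() as tmp:
    for stem in ("a", "b"):
        open(os.path.join(tmp, f"{stem}.jpg"), "w").close()
        for i in range(1, NUM_AUGMENTATIONS + 1):
            open(os.path.join(tmp, f"{stem}_aug_{i}.jpg"), "w").close()
    demo_counts = count_on_disk(tmp)

print(demo_counts == expected_counts(2))  # True
```

Comparing the on-disk count with the expected count is one way to notice augmentations that failed silently, since a multiplication-based summary would still report the full number.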

## 3. Tools and Libraries Utilized
   - **Data Splitting**: Utilized Python's `os`, `shutil`, and `random` libraries for file handling and directory management.
   - **Image Augmentation**: Used `PIL` for image handling and `torchvision.transforms` for augmentations.
   - **Progress Monitoring**: Employed `tqdm` for tracking file operations and augmentation processes.

## 4. Final Results
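   - Delivered train, validation, and test splits for each split ratio. Because every class is split separately with the same ratios, class proportions are approximately preserved in each split; small classes deviate most (for example, class `200` with 19 images splits 13/2/4 at 70/15/15).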
   - Generated augmented datasets with comprehensive transformations to increase data diversity.
   - Processed all three individual datasets as well as the combined dataset.
   - Provided detailed documentation and summaries, including per-class statistics and overall dataset statistics for each stage.
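
The cells below show a single run (one dataset, 70/15/15). One way to enumerate every dataset and ratio combination mentioned in this summary is sketched here. Everything in it is hypothetical: the dataset names are placeholders (the notebook does not list them), and `plan_split_jobs` and the output directory naming are illustrative choices, not the layout actually used.

```python
import os
from itertools import product

# Ratios from the summary, as (train, val, test)
SPLIT_RATIOS = [(0.60, 0.20, 0.20), (0.70, 0.15, 0.15), (0.80, 0.10, 0.10)]

def plan_split_jobs(dataset_names, ratios=SPLIT_RATIOS, base_dir="dataset"):
    """Return one job description per (dataset, ratio) combination."""
    jobs = []
    for name, (tr, va, te) in product(dataset_names, ratios):
        # e.g. "70_15_15"; round() guards against float noise such as 69.999...
        tag = f"{round(tr * 100)}_{round(va * 100)}_{round(te * 100)}"
        jobs.append({
            "dataset": name,
            "ratios": (tr, va, te),
            "out_dir": os.path.join(base_dir, f"{name}_split_{tag}"),
        })
    return jobs

# Placeholder names for three individual datasets plus the combined one
jobs = plan_split_jobs(["dataset_a", "dataset_b", "dataset_c", "dataset_combined"])
print(len(jobs))  # 12
```

Each job could then be passed to a split function that takes the ratios as arguments instead of reading module-level constants.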


In [1]:
# Create the virtual environment named 'dmp'
# !python3 -m venv /scratch/movi/dmp
# Install ipykernel inside the 'dmp' environment
# !pip install ipykernel
# Add 'dmp' as a kernel for Jupyter Notebook
# !python -m ipykernel install --user --name=dmp --display-name "Python (dmp)"
# Upgrade pip in the 'dmp' environment
# !pip install --upgrade pip
# Install necessary packages (NumPy, PyTorch, etc.) inside 'dmp'
# !pip install numpy torch torchvision torchaudio pandas matplotlib scikit-learn
# Install the PyTorch build for CUDA 11.8
# !pip uninstall torch torchvision torchaudio
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# !pip uninstall -y tensorflow
# !pip install numpy==1.21.4 scikit-learn==1.0.2
# import tensorflow as tf
# print("TensorFlow version:", tf.__version__)

Looking in indexes: https://download.pytorch.org/whl/cu118
Collecting torch
  Using cached https://download.pytorch.org/whl/cu118/torch-2.4.1%2Bcu118-cp38-cp38-win_amd64.whl (2695.5 MB)
Collecting torchvision
  Using cached https://download.pytorch.org/whl/cu118/torchvision-0.19.1%2Bcu118-cp38-cp38-win_amd64.whl (5.0 MB)
Collecting torchaudio
  Using cached https://download.pytorch.org/whl/cu118/torchaudio-2.4.1%2Bcu118-cp38-cp38-win_amd64.whl (4.0 MB)
Collecting filelock (from torch)
  Downloading https://download.pytorch.org/whl/filelock-3.13.1-py3-none-any.whl.metadata (2.8 kB)
Collecting sympy (from torch)
  Downloading https://download.pytorch.org/whl/sympy-1.13.3-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch)
  Downloading https://download.pytorch.org/whl/networkx-3.2.1-py3-none-any.whl.metadata (5.2 kB)
Collecting jinja2 (from torch)
  Downloading https://download.pytorch.org/whl/Jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting fsspec (from torch)
 



In [2]:
import torch

print("Torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Device name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "No GPU found")


Torch version: 2.4.1+cu118
CUDA available: True
Device name: NVIDIA GeForce RTX 4070


In [3]:
# Prints the installed versions of Python, NumPy, and PyTorch libraries

import sys
import numpy as np
import torch

print(f"Python Version: {sys.version}")
print(f"NumPy Version: {np.__version__}")
print(f"PyTorch Version: {torch.__version__}")



# Function to check GPU availability and display memory statistics using PyTorch's CUDA interface

def check_gpu_status():
    # Check if GPU is available
    if torch.cuda.is_available():
        print("CUDA is available. PyTorch is using GPU.\n")

        # Get the number of available GPUs
        num_gpus = torch.cuda.device_count()
        print(f"Number of GPUs available: {num_gpus}")

        # Loop through each GPU and display its details
        for gpu_id in range(num_gpus):
            gpu_name = torch.cuda.get_device_name(gpu_id)
            gpu_memory_allocated = torch.cuda.memory_allocated(gpu_id) / (1024 ** 3)  # In GB
            gpu_memory_cached = torch.cuda.memory_reserved(gpu_id) / (1024 ** 3)      # In GB
            gpu_memory_total = torch.cuda.get_device_properties(gpu_id).total_memory / (1024 ** 3)  # In GB

            print(f"\nGPU {gpu_id}: {gpu_name}")
            print(f"  Total Memory: {gpu_memory_total:.2f} GB")
            print(f"  Memory Allocated: {gpu_memory_allocated:.2f} GB")
            print(f"  Memory Reserved (Cached): {gpu_memory_cached:.2f} GB")
    else:
        print("CUDA is not available. PyTorch is using the CPU.")

# Run the GPU status check
check_gpu_status()

Python Version: 3.8.9 (tags/v3.8.9:a743f81, Apr  6 2021, 14:02:34) [MSC v.1928 64 bit (AMD64)]
NumPy Version: 1.24.1
PyTorch Version: 2.4.1+cu118
CUDA is available. PyTorch is using GPU.

Number of GPUs available: 1

GPU 0: NVIDIA GeForce RTX 4070
  Total Memory: 11.99 GB
  Memory Allocated: 0.00 GB
  Memory Reserved (Cached): 0.00 GB


In [6]:
# Code to split dataset into train, validation, and test sets, with detailed analysis and logging

import os
import random
import shutil

# Paths
original_dataset = 'dataset/dataset_combined_unique'  # Replace with the path to your original dataset
split_base_dir = 'dataset/dataset_split'        # Base directory to store train/val/test splits

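# Split ratios (must sum to 1). TEST_RATIO is only used for reporting:
# the test set receives whatever remains after the train and val slices.
TRAIN_RATIO = 0.7
VAL_RATIO = 0.15
TEST_RATIO = 0.15
assert abs(TRAIN_RATIO + VAL_RATIO + TEST_RATIO - 1.0) < 1e-9, "Split ratios must sum to 1"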

def create_dir_structure(base_dir, class_names):
    """Create train, val, and test directories for each class."""
    for split in ['train', 'val', 'test']:
        for class_name in class_names:
            os.makedirs(os.path.join(base_dir, split, class_name), exist_ok=True)

def analyze_and_split_dataset(original_dataset, split_base_dir):
    """Analyze dataset and split into train, val, and test sets."""
    # Keep only class folders (skip stray files), sorted as strings
    class_names = sorted(d for d in os.listdir(original_dataset)
                         if os.path.isdir(os.path.join(original_dataset, d)))
    create_dir_structure(split_base_dir, class_names)   # Create the necessary directory structure

    total_images = 0  # Track the total number of images across all classes
    split_summary = {}  # Dictionary to store per-class split details

    # Loop through each class folder
    for class_name in class_names:
        class_path = os.path.join(original_dataset, class_name)

        if os.path.isdir(class_path):  # Ensure it's a folder
            # List all images in the class folder
            image_files = [f for f in os.listdir(class_path) if os.path.isfile(os.path.join(class_path, f))]
            random.shuffle(image_files)  # Shuffle images to ensure randomness

            # Calculate split indices. int() truncates, so images left over
            # from rounding fall into the test set.
            total_images_in_class = len(image_files)
            train_end = int(total_images_in_class * TRAIN_RATIO)
            val_end = train_end + int(total_images_in_class * VAL_RATIO)

            # Split the image files into train, val, and test sets
            train_files = image_files[:train_end]
            val_files = image_files[train_end:val_end]
            test_files = image_files[val_end:]

            # Copy files to the respective split directories
            for file in train_files:
                shutil.copy(os.path.join(class_path, file), os.path.join(split_base_dir, 'train', class_name, file))
            for file in val_files:
                shutil.copy(os.path.join(class_path, file), os.path.join(split_base_dir, 'val', class_name, file))
            for file in test_files:
                shutil.copy(os.path.join(class_path, file), os.path.join(split_base_dir, 'test', class_name, file))

            # Store the split summary for this class
            split_summary[class_name] = {
                'Total': total_images_in_class,
                'Train': len(train_files),
                'Validation': len(val_files),
                'Test': len(test_files)
            }

            # Update total image count
            total_images += total_images_in_class

            # Print per-class summary
            print(f"{class_name}: {len(train_files)} train, {len(val_files)} val, {len(test_files)} test (Total: {total_images_in_class})")

    # Print overall summary
    print("\nOverall Dataset Summary:")
    print(f"Total Images: {total_images}")
    print(f"Train Ratio: {TRAIN_RATIO}, Validation Ratio: {VAL_RATIO}, Test Ratio: {TEST_RATIO}\n")

    # Print detailed split summary for all classes
    print("Detailed Split Summary:")
    for class_name, counts in split_summary.items():
        print(f"{class_name} - Total: {counts['Total']}, Train: {counts['Train']}, Val: {counts['Validation']}, Test: {counts['Test']}")

    return split_summary

# Run the split function and store the summary
dataset_summary = analyze_and_split_dataset(original_dataset, split_base_dir)


1: 25 train, 5 val, 6 test (Total: 36)
10: 248 train, 53 val, 54 test (Total: 355)
100: 172 train, 36 val, 38 test (Total: 246)
1000: 175 train, 37 val, 38 test (Total: 250)
2: 184 train, 39 val, 41 test (Total: 264)
20: 253 train, 54 val, 55 test (Total: 362)
200: 13 train, 2 val, 4 test (Total: 19)
5: 207 train, 44 val, 46 test (Total: 297)
50: 248 train, 53 val, 54 test (Total: 355)
500: 208 train, 44 val, 46 test (Total: 298)

Overall Dataset Summary:
Total Images: 2482
Train Ratio: 0.7, Validation Ratio: 0.15, Test Ratio: 0.15

Detailed Split Summary:
1 - Total: 36, Train: 25, Val: 5, Test: 6
10 - Total: 355, Train: 248, Val: 53, Test: 54
100 - Total: 246, Train: 172, Val: 36, Test: 38
1000 - Total: 250, Train: 175, Val: 37, Test: 38
2 - Total: 264, Train: 184, Val: 39, Test: 41
20 - Total: 362, Train: 253, Val: 54, Test: 55
200 - Total: 19, Train: 13, Val: 2, Test: 4
5 - Total: 297, Train: 207, Val: 44, Test: 46
50 - Total: 355, Train: 248, Val: 53, Test: 54
500 - Total: 298, Train: 208, Val: 44, Test: 46
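
The per-class numbers printed above can be cross-checked with a few lines of Python. This is a sketch only: the counts are copied by hand from the first listing in the output, and `check_split_counts` is a helper name introduced here, not part of the notebook.

```python
# Per-class counts copied from the output above: (train, val, test, total)
split_counts = {
    "1": (25, 5, 6, 36),
    "10": (248, 53, 54, 355),
    "100": (172, 36, 38, 246),
    "1000": (175, 37, 38, 250),
    "2": (184, 39, 41, 264),
    "20": (253, 54, 55, 362),
    "200": (13, 2, 4, 19),
    "5": (207, 44, 46, 297),
    "50": (248, 53, 54, 355),
    "500": (208, 44, 46, 298),
}

def check_split_counts(counts):
    """Return overall (train, val, test, total); raise if a class is inconsistent."""
    overall = [0, 0, 0, 0]
    for name, (train, val, test, total) in counts.items():
        if train + val + test != total:
            raise ValueError(f"class {name}: {train}+{val}+{test} != {total}")
        for i, n in enumerate((train, val, test, total)):
            overall[i] += n
    return tuple(overall)

overall = check_split_counts(split_counts)
print(overall)  # (1733, 367, 382, 2482)
```

The overall total of 2482 agrees with the "Total Images" line, and the train share comes out slightly below 0.7 because every class rounds its train and validation sizes down.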

In [None]:
import os
from PIL import Image
from torchvision import transforms
import torch
import shutil
from tqdm import tqdm

# Paths - update these paths
split_base_dir = 'dataset/dataset_split'  # Your already split dataset
augmented_data_dir = 'dataset/dataset_aug'  # Where to save augmented data

# Number of augmentations per image
NUM_AUGMENTATIONS = 10

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Define augmentation transformations (tensor-based)
augmentation_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.RandomRotation(degrees=15),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=(0.9, 1.1), shear=10),
    transforms.RandomErasing(p=0.5, scale=(0.02, 0.15), ratio=(0.3, 3.3)),
])

def apply_augmentations():
    print("\n=== Starting Augmentation Process ===")

    # Create destination directory structure
    for split in ['train', 'val', 'test']:
        split_path = os.path.join(augmented_data_dir, split)
        os.makedirs(split_path, exist_ok=True)
        for class_name in os.listdir(os.path.join(split_base_dir, split)):
            class_path = os.path.join(split_path, class_name)
            os.makedirs(class_path, exist_ok=True)

    stats = {'train': {}, 'val': {}, 'test': {}}

    for split in ['train', 'val', 'test']:
        print(f"\nProcessing {split.upper()} split:")
        split_source = os.path.join(split_base_dir, split)
        split_dest = os.path.join(augmented_data_dir, split)

        for class_name in sorted(os.listdir(split_source)):
            class_source = os.path.join(split_source, class_name)
            class_dest = os.path.join(split_dest, class_name)

            if os.path.isdir(class_source):
                original_files = [f for f in os.listdir(class_source) 
                                  if os.path.isfile(os.path.join(class_source, f))]

                print(f"\nProcessing class {class_name}:")
                print(f"Found {len(original_files)} original images")

                # Copy originals
                for file in tqdm(original_files, desc="Copying originals"):
                    shutil.copy2(os.path.join(class_source, file),
                                 os.path.join(class_dest, file))

                # Generate augmentations
                print("Generating augmented images...")
                for file in tqdm(original_files, desc="Generating augmentations"):
                    img_path = os.path.join(class_source, file)
                    try:
                        with Image.open(img_path) as img:
                            # Convert to RGB if needed
                            if img.mode != 'RGB':
                                img = img.convert('RGB')

                            # Convert to tensor and send to GPU
                            img_tensor = transforms.ToTensor()(img).to(device)

                            for i in range(NUM_AUGMENTATIONS):
                                try:
                                    # Apply the augmentation pipeline to the CHW tensor (stays on `device`)
                                    augmented_tensor = augmentation_transforms(img_tensor)

                                    # Move back to CPU and save
                                    augmented_img = transforms.ToPILImage()(augmented_tensor.cpu())

                                    base_name = os.path.splitext(file)[0]
                                    aug_name = f"{base_name}_aug_{i+1}.jpg"
                                    augmented_img.save(os.path.join(class_dest, aug_name))
                                except Exception as e:
                                    print(f"Error generating augmentation {i+1} for {file}: {str(e)}")
                    except Exception as e:
                        print(f"Error processing file {file}: {str(e)}")

                total_augmented = len(original_files) * NUM_AUGMENTATIONS
                stats[split][class_name] = {
                    'original': len(original_files),
                    'augmented': total_augmented,
                    'total': len(original_files) + total_augmented
                }

    # Print summary
    print("\n=== Augmentation Summary ===")
    for split in ['train', 'val', 'test']:
        print(f"\n{split.upper()} Split:")
        for class_name, counts in sorted(stats[split].items()):
            print(f"Class {class_name}:")
            print(f"  Original: {counts['original']}")
            print(f"  Augmented: {counts['augmented']}")
            print(f"  Total: {counts['total']}")
        print("-" * 50)

    total_orig = sum(sum(c['original'] for c in s.values()) for s in stats.values())
    total_aug = sum(sum(c['augmented'] for c in s.values()) for s in stats.values())
    print("\nOverall Dataset Statistics:")
    print(f"Total Original Images: {total_orig}")
    print(f"Total Augmented Images: {total_aug}")
    print(f"Total Images: {total_orig + total_aug}")

if __name__ == "__main__":
    apply_augmentations()


Using device: cuda

=== Starting Augmentation Process ===

Processing TRAIN split:

Processing class 1:
Found 25 original images


Copying originals: 100%|██████████| 25/25 [00:00<00:00, 2379.34it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 25/25 [00:01<00:00, 17.12it/s]



Processing class 10:
Found 248 original images


Copying originals: 100%|██████████| 248/248 [00:00<00:00, 2582.96it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 248/248 [00:12<00:00, 19.67it/s]



Processing class 100:
Found 172 original images


Copying originals: 100%|██████████| 172/172 [00:00<00:00, 2547.66it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 172/172 [00:09<00:00, 18.48it/s]



Processing class 1000:
Found 175 original images


Copying originals: 100%|██████████| 175/175 [00:00<00:00, 2243.12it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 175/175 [00:09<00:00, 18.37it/s]



Processing class 2:
Found 184 original images


Copying originals: 100%|██████████| 184/184 [00:00<00:00, 1999.68it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 184/184 [00:10<00:00, 18.05it/s]



Processing class 20:
Found 253 original images


Copying originals: 100%|██████████| 253/253 [00:00<00:00, 2555.38it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 253/253 [00:13<00:00, 19.42it/s]



Processing class 200:
Found 13 original images


Copying originals: 100%|██████████| 13/13 [00:00<00:00, 1999.70it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 13/13 [00:00<00:00, 15.60it/s]



Processing class 5:
Found 207 original images


Copying originals: 100%|██████████| 207/207 [00:00<00:00, 1815.64it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 207/207 [00:11<00:00, 18.25it/s]



Processing class 50:
Found 248 original images


Copying originals: 100%|██████████| 248/248 [00:00<00:00, 2455.31it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 248/248 [00:13<00:00, 18.86it/s]



Processing class 500:
Found 208 original images


Copying originals: 100%|██████████| 208/208 [00:00<00:00, 2513.52it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 208/208 [00:10<00:00, 19.36it/s]



Processing VAL split:

Processing class 1:
Found 5 original images


Copying originals: 100%|██████████| 5/5 [00:00<00:00, 1998.05it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 5/5 [00:00<00:00, 18.25it/s]



Processing class 10:
Found 53 original images


Copying originals: 100%|██████████| 53/53 [00:00<00:00, 2254.77it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 53/53 [00:02<00:00, 19.08it/s]



Processing class 100:
Found 36 original images


Copying originals: 100%|██████████| 36/36 [00:00<00:00, 1845.75it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 36/36 [00:01<00:00, 18.35it/s]



Processing class 1000:
Found 37 original images


Copying originals: 100%|██████████| 37/37 [00:00<00:00, 2386.90it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 37/37 [00:01<00:00, 19.14it/s]



Processing class 2:
Found 39 original images


Copying originals: 100%|██████████| 39/39 [00:00<00:00, 2599.90it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 39/39 [00:01<00:00, 19.92it/s]



Processing class 20:
Found 54 original images


Copying originals: 100%|██████████| 54/54 [00:00<00:00, 2399.13it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 54/54 [00:02<00:00, 18.94it/s]



Processing class 200:
Found 2 original images


Copying originals: 100%|██████████| 2/2 [00:00<00:00, 1334.70it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 2/2 [00:00<00:00, 18.52it/s]



Processing class 5:
Found 44 original images


Copying originals: 100%|██████████| 44/44 [00:00<00:00, 2512.41it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 44/44 [00:02<00:00, 20.09it/s]



Processing class 50:
Found 53 original images


Copying originals: 100%|██████████| 53/53 [00:00<00:00, 2119.37it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 53/53 [00:02<00:00, 17.91it/s]



Processing class 500:
Found 44 original images


Copying originals: 100%|██████████| 44/44 [00:00<00:00, 2512.93it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 44/44 [00:02<00:00, 18.51it/s]



Processing TEST split:

Processing class 1:
Found 6 original images


Copying originals: 100%|██████████| 6/6 [00:00<00:00, 1997.60it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 6/6 [00:00<00:00, 18.90it/s]



Processing class 10:
Found 54 original images


Copying originals: 100%|██████████| 54/54 [00:00<00:00, 2249.94it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 54/54 [00:03<00:00, 17.77it/s]



Processing class 100:
Found 38 original images


Copying originals: 100%|██████████| 38/38 [00:00<00:00, 2302.66it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 38/38 [00:02<00:00, 18.45it/s]



Processing class 1000:
Found 38 original images


Copying originals: 100%|██████████| 38/38 [00:00<00:00, 2620.79it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 38/38 [00:01<00:00, 20.05it/s]



Processing class 2:
Found 41 original images


Copying originals: 100%|██████████| 41/41 [00:00<00:00, 2516.34it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 41/41 [00:02<00:00, 17.72it/s]



Processing class 20:
Found 55 original images


Copying originals: 100%|██████████| 55/55 [00:00<00:00, 2000.16it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 55/55 [00:02<00:00, 19.04it/s]



Processing class 200:
Found 4 original images


Copying originals: 100%|██████████| 4/4 [00:00<00:00, 2001.10it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 4/4 [00:00<00:00, 18.60it/s]



Processing class 5:
Found 46 original images


Copying originals: 100%|██████████| 46/46 [00:00<00:00, 2555.40it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 46/46 [00:02<00:00, 19.39it/s]



Processing class 50:
Found 54 original images


Copying originals: 100%|██████████| 54/54 [00:00<00:00, 2347.75it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 54/54 [00:02<00:00, 18.15it/s]



Processing class 500:
Found 46 original images


Copying originals: 100%|██████████| 46/46 [00:00<00:00, 2421.02it/s]


Generating augmented images...


Generating augmentations: 100%|██████████| 46/46 [00:02<00:00, 17.85it/s]


=== Augmentation Summary ===

TRAIN Split:
Class 1:
  Original: 25
  Augmented: 250
  Total: 275
Class 10:
  Original: 248
  Augmented: 2480
  Total: 2728
Class 100:
  Original: 172
  Augmented: 1720
  Total: 1892
Class 1000:
  Original: 175
  Augmented: 1750
  Total: 1925
Class 2:
  Original: 184
  Augmented: 1840
  Total: 2024
Class 20:
  Original: 253
  Augmented: 2530
  Total: 2783
Class 200:
  Original: 13
  Augmented: 130
  Total: 143
Class 5:
  Original: 207
  Augmented: 2070
  Total: 2277
Class 50:
  Original: 248
  Augmented: 2480
  Total: 2728
Class 500:
  Original: 208
  Augmented: 2080
  Total: 2288
--------------------------------------------------

VAL Split:
Class 1:
  Original: 5
  Augmented: 50
  Total: 55
Class 10:
  Original: 53
  Augmented: 530
  Total: 583
Class 100:
  Original: 36
  Augmented: 360
  Total: 396
Class 1000:
  Original: 37
  Augmented: 370
  Total: 407
Class 2:
  Original: 39
  Augmented: 390
  Total: 429
Class 20:
  Original: 54
  Augmented: 540
  




# END of Data Split and Augmentation

---

