This step-by-step guide builds a custom PyTorch dataset for training a classifier on two groups of images: images with moiré patterns and clean images. All images are stored in the same folder; moiré images have filenames ending in _moire.jpg, and their clean counterparts end in _gt.jpg. The images have a resolution of 4032x3024. The guide also covers GPU support for training.

1. Importing Required Libraries
You need to start by importing the necessary libraries for data processing, model creation, and GPU handling.

In [17]:
import os
import torch
from torch.utils.data import Dataset, DataLoader
from PIL import Image
from torchvision import transforms
import torch.optim as optim
import torch.nn as nn
from tqdm import tqdm  # Optional, for progress bars
import matplotlib.pyplot as plt


2. Creating the Custom Dataset Class
We will create a custom dataset class that loads the images and assigns labels. In this case:

Images with '_moire.jpg' will be assigned label 1 (moire present).
Images with '_gt.jpg' will be assigned label 0 (clean).

In [24]:
class MoireCleanDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        """
        Args:
            root_dir (str): Directory containing all the images.
            transform (callable, optional): Optional transform to be applied on a sample.
        """
        self.root_dir = root_dir
        self.transform = transform

        # List all JPEG images in the directory (sorted for a reproducible order)
        self.images = sorted(f for f in os.listdir(root_dir) if f.endswith('.jpg'))
        
    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img_name = self.images[idx]
        img_path = os.path.join(self.root_dir, img_name)

        # Open the image and force 3 channels so the 3-channel Normalize always applies
        img = Image.open(img_path).convert('RGB')

        # Label: 1 for moire images (those with '_moire' in the filename), 0 for clean images
        if '_moire' in img_name:
            label = 1  # Moire image
        else:
            label = 0  # Clean image

        # Apply transformations if provided
        if self.transform:
            img = self.transform(img)

        return img, label


Explanation:

root_dir: The folder where your images are stored.
images: A list of all .jpg filenames in that folder, covering both the _moire.jpg and the _gt.jpg files.
The __getitem__ method returns a single image together with its label (1 for moiré, 0 for clean). It does not return the moiré/clean pair.
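
The labeling rule can be checked without opening any image, since it depends only on the filename. The sketch below is a minimal illustration: label_from_filename and unpaired_ids are hypothetical helpers that are not part of the dataset class, and the filenames are made-up examples that follow the naming scheme described earlier.

```python
def label_from_filename(name: str) -> int:
    """Return 1 for a moiré image and 0 for a clean one (same substring test as the dataset)."""
    return 1 if '_moire' in name else 0


def unpaired_ids(filenames):
    """Return the IDs that have only one of the two expected files (_moire.jpg / _gt.jpg)."""
    moire = {f[:-len('_moire.jpg')] for f in filenames if f.endswith('_moire.jpg')}
    clean = {f[:-len('_gt.jpg')] for f in filenames if f.endswith('_gt.jpg')}
    return sorted(moire ^ clean)  # symmetric difference: present in one set only


# Made-up filenames following the naming scheme; 0002 has no clean counterpart
example_files = ['0000_gt.jpg', '0000_moire.jpg', '0001_gt.jpg', '0001_moire.jpg', '0002_moire.jpg']

labels = [label_from_filename(f) for f in example_files]
print(labels)                       # [0, 1, 0, 1, 1]
print(unpaired_ids(example_files))  # ['0002']
```

Missing counterparts would skew the class balance, so a check like this could be run on os.listdir(root_dir) before training.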

3. Transformations
In order to preprocess the images, you can apply transformations such as resizing, normalization, and conversion to tensors.

In [25]:
transform = transforms.Compose([
    transforms.Resize((256, 256)),  # Resize images to 256x256 (you can change this)
    transforms.ToTensor(),          # Convert images to PyTorch tensors
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])  # Normalize images
])

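Resize((256, 256)) shrinks the images to a manageable size; you can change this to any resolution you prefer. Because the originals are 4032x3024 (a 4:3 aspect ratio), a square target distorts the image slightly. To work at the original size, drop the Resize() transform, but expect much higher memory use, and note that the input size of the final Linear layer in the model depends on the image size chosen here.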
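
To make the effect of these transforms concrete, the sketch below works through the arithmetic in plain Python. ToTensor scales pixel values to [0, 1], and Normalize with mean 0.5 and std 0.5 then computes (x - 0.5) / 0.5, which maps that range to [-1, 1]. The helper names (normalize, denormalize, keep_aspect_size) are hypothetical and only illustrate the calculation. keep_aspect_size shows one possible way to pick a (height, width) for Resize that preserves the 4:3 shape, assuming the images are 4032 pixels wide and 3024 pixels tall.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-channel arithmetic applied by Normalize: (x - mean) / std."""
    return (x - mean) / std


def denormalize(y, mean=0.5, std=0.5):
    """Inverse mapping, useful before displaying a normalized image."""
    return y * std + mean


def keep_aspect_size(width, height, short_side):
    """Return (new_height, new_width) with the shorter side equal to short_side."""
    scale = short_side / min(width, height)
    return round(height * scale), round(width * scale)


print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0
print(denormalize(normalize(0.25)))                    # 0.25
print(keep_aspect_size(4032, 3024, 192))               # (192, 256)
```

Under these assumptions, transforms.Resize((192, 256)) would keep the aspect ratio; the model's final Linear layer would then need a matching input size.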

4. Creating Dataset and DataLoader
Now that we’ve defined the dataset class, let's instantiate it and load the data using the DataLoader.

In [28]:
# Directory where images are stored
root_dir = 'Dataset/train/train/pair_00'  # Update this with the actual folder path

# Create the dataset instance
dataset = MoireCleanDataset(root_dir, transform=transform)

# Print every file the dataset picked up (both moiré and clean images)
for img_name in dataset.images:
    print(f"Image: {img_name}")


# Create the DataLoader instance for batching
batch_size = 16
train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)


Image: 0000_gt.jpg
Image: 0000_moire.jpg
Image: 0001_gt.jpg
Image: 0001_moire.jpg
Image: 0002_gt.jpg
Image: 0002_moire.jpg
Image: 0004_gt.jpg
Image: 0004_moire.jpg
Image: 0005_gt.jpg
Image: 0005_moire.jpg
Image: 0006_gt.jpg
Image: 0006_moire.jpg
Image: 0007_gt.jpg
Image: 0007_moire.jpg
Image: 0009_gt.jpg
Image: 0009_moire.jpg
Image: 0010_gt.jpg
Image: 0010_moire.jpg
Image: 0011_gt.jpg
Image: 0011_moire.jpg
Image: 0012_gt.jpg
Image: 0012_moire.jpg
Image: 0013_gt.jpg
Image: 0013_moire.jpg
Image: 0014_gt.jpg
Image: 0014_moire.jpg
Image: 0017_gt.jpg
Image: 0017_moire.jpg
Image: 0018_gt.jpg
Image: 0018_moire.jpg
Image: 0020_gt.jpg
Image: 0020_moire.jpg
Image: 0021_gt.jpg
Image: 0021_moire.jpg
Image: 0022_gt.jpg
Image: 0022_moire.jpg
Image: 0023_gt.jpg
Image: 0023_moi
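
The DataLoader groups the dataset into batches, and the last batch may be smaller than batch_size when the dataset size is not a multiple of it. The sketch below computes the batch sizes for one epoch in plain Python. The dataset size of 150 is a made-up number for illustration (the listing above is truncated, so the real count is not known), and batch_sizes is a hypothetical helper.

```python
import math


def batch_sizes(num_samples, batch_size, drop_last=False):
    """Return the size of every batch a DataLoader would yield in one epoch."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # final, smaller batch
    return sizes


sizes = batch_sizes(150, 16)  # 150 is an assumed dataset size
print(len(sizes), sizes[-1])               # 10 6
print(len(sizes) == math.ceil(150 / 16))   # True
```

The drop_last flag in this sketch mirrors the DataLoader argument of the same name, which discards the final partial batch when set to True.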

5. Using the GPU (Optional)
If you have a GPU available, you can move the model and each batch of tensors to the GPU to accelerate training. The dataset itself stays on the CPU; batches are moved as they come out of the DataLoader.

First, check if CUDA is available:

In [30]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)


Using device: cpu


When iterating over the DataLoader, move each batch to the selected device. The dataset returns (image, label) pairs, so each batch unpacks into exactly two values:

In [31]:
# Debugging: print the shapes of the first batch
for img, label in train_loader:
    print("Image batch shape:", img.shape)  # Should print torch.Size([batch_size, 3, 256, 256])
    print("Labels:", label)  # 1 = moiré, 0 = clean
    break  # Only print the first batch


# Iterate through the DataLoader and send each batch to the device
for img, label in train_loader:
    # Move to the GPU if one is available; otherwise the tensors stay on the CPU
    img, label = img.to(device), label.to(device)
    
    # Now you can pass the images and labels through the model
    print(f"Batch of images shape: {img.shape}")
    print(f"Labels: {label}")


7. Training Loop (Example)
Let’s assume you have a simple model to train on these images. Here’s an example of how to set up a basic model and training loop.

In [16]:
# Define a simple CNN model
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # Convolutional layer
    nn.ReLU(),
    nn.MaxPool2d(2),  # Pooling layer
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # Another convolution
    nn.ReLU(),
    nn.MaxPool2d(2),  # Pooling layer
    nn.Flatten(),
    nn.Linear(32 * 64 * 64, 1)  # Fully connected layer: 32 channels x 64 x 64 for 256x256 inputs (adjust if you change the resize)
)

# Move model to GPU
model = model.to(device)

# Loss function (Binary Cross-Entropy for 2 classes) and optimizer
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
num_epochs = 5
for epoch in range(num_epochs):
    model.train()  # Set model to training mode
    running_loss = 0.0
    for img, label in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs}"):
        img, label = img.to(device), label.to(device)

        # Zero the gradients
        optimizer.zero_grad()

        # Forward pass: outputs has shape [batch_size, 1]
        outputs = model(img)
        
        # Calculate the loss; squeeze(1) keeps the batch dimension even for a batch of one,
        # and BCEWithLogitsLoss expects float targets
        loss = criterion(outputs.squeeze(1), label.float())

        # Backward pass and optimization
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {running_loss/len(train_loader)}")


Epoch 1/5:  80%|████████  | 8/10 [01:01<00:15,  7.70s/it]


KeyboardInterrupt: 
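
The input size of the final Linear layer follows from the image size: each 3x3 convolution with padding=1 keeps the height and width, and each MaxPool2d(2) halves them (rounding down). The sketch below reproduces that arithmetic in plain Python so the value can be recomputed after changing the Resize target. flattened_features is a hypothetical helper, not a PyTorch function.

```python
def flattened_features(height, width, out_channels=32, num_pools=2):
    """Number of features after Flatten for the CNN above.

    Convolutions with kernel_size=3 and padding=1 preserve height and width;
    each MaxPool2d(2) floor-divides both by 2.
    """
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return out_channels * height * width


print(flattened_features(256, 256))  # 131072, i.e. 32 * 64 * 64
print(flattened_features(192, 256))  # 98304, i.e. 32 * 48 * 64
```

The first value matches the 32 * 64 * 64 used in the model; the second is what the Linear layer would need for a hypothetical 192x256 input.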

8. Saving and Loading the Model
Once the model is trained, you can save it and load it again later for inference or further training.

In [None]:
# Save the model
torch.save(model.state_dict(), 'moire_model.pth')

# Load the model
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 64 * 64, 1)
)
# map_location lets a checkpoint saved on a GPU load on a CPU-only machine
model.load_state_dict(torch.load('moire_model.pth', map_location=device))
model.to(device)


Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Flatten(start_dim=1, end_dim=-1)
  (7): Linear(in_features=131072, out_features=1, bias=True)
)
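
Because the model was trained with BCEWithLogitsLoss, its single output is a raw logit rather than a probability. For inference, a sigmoid turns the logit into the estimated probability of moiré (label 1), and thresholding at 0.5 is equivalent to checking whether the logit is non-negative. The plain-Python sketch below illustrates this along with the per-sample loss. sigmoid, bce_with_logits, and predict_label are hypothetical helpers, and the logit values are made up.

```python
import math


def sigmoid(z):
    """Logistic function: maps a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_with_logits(z, y):
    """Numerically stable binary cross-entropy for one logit z and target y in {0, 1}."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


def predict_label(z, threshold=0.5):
    """Return 1 (moiré) if the predicted probability reaches the threshold, else 0 (clean)."""
    return 1 if sigmoid(z) >= threshold else 0


for z in (-2.0, 0.3, 4.0):  # made-up logits
    print(z, round(sigmoid(z), 3), predict_label(z))

# A confident correct prediction gives a small loss; a confident wrong one gives a large loss
print(bce_with_logits(4.0, 1), bce_with_logits(4.0, 0))
```

With the real model, the same idea would apply to the tensor returned by model(img), for example by applying torch.sigmoid to it and comparing against 0.5.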

Summary
In this notebook guide:

You’ve learned how to create a custom dataset in PyTorch that loads moiré and clean images and labels each one from its filename.
You’ve applied transformations to preprocess the images.
You’ve seen how to move the model and each batch to the GPU for faster training.
You’ve set up a simple CNN model and a training loop, and seen how to save and load models.
This setup allows you to train a model that classifies whether an individual image contains moiré or is clean. Adjust the network architecture, batch size, and other parameters as needed for your specific task.