# Tree Species Image Classification with PyTorch

In this notebook, we build a Convolutional Neural Network (CNN) using PyTorch to classify images of five tree species: birch, maple, pine, rowan, and spruce. We train the model on a dataset of images and evaluate its performance on a validation set.

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
from torchvision.transforms import Compose, Resize, ToTensor
from torch.utils.tensorboard import SummaryWriter

In [2]:
# Check for GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f'Using device: {device}')

Using device: cpu


In [3]:
# TensorBoard writer
writer = SummaryWriter('runs/tree_recognition_experiment')

## Load and Preprocess the Dataset

We will use `ImageFolder` to load our dataset, which expects images to be organized in subdirectories named after their class labels. Each image is resized to 28×28 pixels and converted to a tensor.

In [4]:
# Transformations
transform = Compose([Resize((28, 28)), ToTensor()])

# Load the dataset
train_data = ImageFolder(root='train_data', transform=transform)
val_data = ImageFolder(root='val_data', transform=transform)

# Print class names
print('Classes:', train_data.classes)

Classes: ['birch', 'maple', 'pine', 'rowan', 'spruce']
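
The class order printed above is not arbitrary: `ImageFolder` sorts the subdirectory names and numbers them from 0, and those numbers are the labels the model is trained on. The sketch below mimics that mapping in plain Python on a temporary directory, so it runs without the real dataset. The helper name `find_classes_sketch` and the temporary folder layout are assumptions made for illustration; they are not part of the notebook or of torchvision.

```python
import tempfile
from pathlib import Path


def find_classes_sketch(root):
    """Mimic how ImageFolder derives labels: sorted subdirectory names -> 0..N-1."""
    classes = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one subdirectory per class, created in arbitrary order
    for name in ["spruce", "birch", "rowan", "maple", "pine"]:
        (Path(root) / name).mkdir()
    classes, class_to_idx = find_classes_sketch(root)

print(classes)
print(class_to_idx)
```

Because the mapping depends only on folder names, `train_data` and `val_data` agree on the label indices as long as both folders contain the same class subdirectories.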


In [5]:
# Create data loaders
train_loader = DataLoader(train_data, batch_size=16, shuffle=True)
val_loader = DataLoader(val_data, batch_size=16, shuffle=False)

#### Display some samples from the dataset

In [None]:
from PIL import Image
import matplotlib.pyplot as plt

# Paths to the images
image_paths = [
    "rps/paper/paper02-089.png",
    "rps/rock/rock06ck02-100.png",
    "rps/scissors/testscissors02-006.png"
]

# Load the images
images = [Image.open(image_path) for image_path in image_paths]
titles = ['Paper', 'Rock', 'Scissors']

# Display the images
fig, axs = plt.subplots(1, 3, figsize=(12, 5))
for ax, image, title in zip(axs, images, titles):
    ax.imshow(image)
    ax.axis('off')  # Hide axis ticks
    ax.set_title(title)

plt.tight_layout()
plt.show()


## Define a Simple CNN Model

We will define a simple Convolutional Neural Network (CNN) with two convolutional layers, each followed by a ReLU activation and 2×2 max pooling, and then two fully connected layers.

In [6]:
# Define the CNN model
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, len(train_data.classes))  # Number of classes

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = torch.flatten(x, 1)  # Flatten everything except the batch dimension: (N, 32, 7, 7) -> (N, 1568)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x
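
The `32 * 7 * 7` input size of `fc1` follows from the 28×28 input: the padded 3×3 convolutions keep the spatial size, and each 2×2 pooling step halves it. The snippet below is a small sketch of the usual output-size formula for convolution and pooling layers, assuming square inputs and dilation 1. The helper name `conv_out` is introduced here for illustration only.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv or pooling layer (square input, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                     # side length after Resize((28, 28))
size = conv_out(size, kernel=3, padding=1)    # conv1 keeps 28
size = conv_out(size, kernel=2, stride=2)     # pool halves it to 14
size = conv_out(size, kernel=3, padding=1)    # conv2 keeps 14
size = conv_out(size, kernel=2, stride=2)     # pool halves it to 7

flat_features = 32 * size * size              # conv2 produces 32 channels
print(size, flat_features)
```

If the `Resize` target changes, this number changes too, and `fc1` has to be updated to match.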

## Initialize the Model, Loss Function, and Optimizer

We will create an instance of the CNN model, define the loss function as cross-entropy loss, and use the SGD optimizer with a learning rate of 0.01 and momentum of 0.9.

In [7]:
# Initialize the model
model = SimpleCNN().to(device)  # Move the model to the appropriate device

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
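
`nn.CrossEntropyLoss` applies log-softmax internally, which is why `forward` returns the raw outputs of `fc2` without a softmax layer. As a rough illustration, the sketch below computes the same quantity for a single sample in plain Python. The logit values are made up, and `cross_entropy` is a helper defined here, not the PyTorch function.

```python
import math


def cross_entropy(logits, target):
    """Cross entropy of one sample: -log(softmax(logits)[target])."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[target]


# No preference among 5 classes: loss equals ln(5)
uniform = cross_entropy([0.0] * 5, target=2)

# Strong preference for the correct class: loss is close to 0
confident = cross_entropy([0.0, 0.0, 4.0, 0.0, 0.0], target=2)

print(f"uniform: {uniform:.4f}, confident: {confident:.4f}")
```

With five classes, a loss near ln(5) ≈ 1.61 therefore suggests the model is still guessing, which is a useful baseline when reading the training log.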

## Train the Model

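We will define a function to train the model and monitor its performance on the validation set after each epoch. Whenever the validation loss improves on the best value seen so far, the model weights are saved to `best_model_weights.pth`. The function uses the `device` and `writer` objects defined earlier in the notebook.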

In [8]:
# Training function
def train_model(model, train_loader, val_loader, criterion, optimizer, num_epochs=10):
    best_val_loss = float('inf')

    for epoch in range(num_epochs):
        running_train_loss = 0.0
        model.train()
        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_train_loss += loss.item()

        train_loss = running_train_loss / len(train_loader)

        # Validation
        running_val_loss = 0.0
        correct = 0
        total = 0
        model.eval()
        with torch.no_grad():
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                loss = criterion(outputs, labels)
                running_val_loss += loss.item()

                # Calculate validation accuracy
                _, predicted = torch.max(outputs, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        val_loss = running_val_loss / len(val_loader)
        val_accuracy = 100 * correct / total

        print(f'Epoch [{epoch+1}/{num_epochs}], Train Loss: {train_loss:.4f}, Val Loss: {val_loss:.4f}, Val Accuracy: {val_accuracy:.2f}%')

        # Log losses and accuracy to TensorBoard
        writer.add_scalar('Loss/train', train_loss, epoch)
        writer.add_scalar('Loss/validation', val_loss, epoch)
        writer.add_scalar('Accuracy/validation', val_accuracy, epoch)

        # Save the best model
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            torch.save(model.state_dict(), 'best_model_weights.pth')
            print(f"Model saved at epoch {epoch+1}")

    writer.close()
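
Note that `train_model` averages the per-batch mean losses, so a smaller final batch counts as much as a full one. The difference is usually small, but it is worth knowing when comparing numbers across batch sizes. The sketch below compares the two averages using made-up batch losses and batch sizes (hypothetical values, not taken from the run).

```python
# Hypothetical per-batch mean losses and batch sizes (the last batch is smaller)
batch_losses = [0.80, 0.70, 1.20]
batch_sizes = [16, 16, 9]

# What train_model computes: every batch gets equal weight
mean_of_batches = sum(batch_losses) / len(batch_losses)

# Per-sample average: weight each batch by the number of samples it contains
weighted_sum = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
per_sample_mean = weighted_sum / sum(batch_sizes)

print(f"{mean_of_batches:.4f} vs {per_sample_mean:.4f}")
```

In a PyTorch loop, the per-sample version is commonly obtained by accumulating `loss.item() * inputs.size(0)` and dividing by `len(loader.dataset)`.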

In [9]:
# Train the model
train_model(model, train_loader, val_loader, criterion, optimizer, num_epochs=10)

Epoch [1/10], Train Loss: 1.4963, Val Loss: 1.9551, Val Accuracy: 24.39%
Model saved at epoch 1
Epoch [2/10], Train Loss: 1.4882, Val Loss: 1.5479, Val Accuracy: 24.39%
Model saved at epoch 2
Epoch [3/10], Train Loss: 1.3927, Val Loss: 1.6174, Val Accuracy: 24.39%
Epoch [4/10], Train Loss: 1.2537, Val Loss: 1.5275, Val Accuracy: 31.71%
Model saved at epoch 4
Epoch [5/10], Train Loss: 1.0986, Val Loss: 1.6746, Val Accuracy: 29.27%
Epoch [6/10], Train Loss: 1.1976, Val Loss: 1.2727, Val Accuracy: 48.78%
Model saved at epoch 6
Epoch [7/10], Train Loss: 0.8468, Val Loss: 0.9692, Val Accuracy: 63.41%
Model saved at epoch 7
Epoch [8/10], Train Loss: 0.7193, Val Loss: 0.8144, Val Accuracy: 70.73%
Model saved at epoch 8
Epoch [9/10], Train Loss: 0.5858, Val Loss: 0.7422, Val Accuracy: 75.61%
Model saved at epoch 9
Epoch [10/10], Train Loss: 0.4881, Val Loss: 0.8509, Val Accuracy: 65.85%
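
To see which checkpoint ends up on disk, the checkpointing rule from `train_model` can be replayed on the validation losses from the log above. This is only a sketch that copies the logged numbers into a list; it does not retrain anything.

```python
# Validation losses copied from the training log above, epochs 1..10
val_losses = [1.9551, 1.5479, 1.6174, 1.5275, 1.6746,
              1.2727, 0.9692, 0.8144, 0.7422, 0.8509]

best = float("inf")
saved_epochs = []
for epoch, loss in enumerate(val_losses, start=1):
    if loss < best:  # same rule as in train_model
        best = loss
        saved_epochs.append(epoch)

print(saved_epochs, best)
```

The last save happens at epoch 9, so `best_model_weights.pth` holds the epoch-9 weights rather than the final epoch-10 ones.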


## Load the Best Model for Inference

After training, we load the weights with the lowest validation loss (the last checkpoint saved in the run above, at epoch 9) and switch the model to evaluation mode for inference.

In [10]:
# Load the best model for inference
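# map_location lets weights saved on a GPU load on a CPU-only machine;
# weights_only=True restricts unpickling to tensors and avoids the torch.load FutureWarning
model.load_state_dict(torch.load('best_model_weights.pth', map_location=device, weights_only=True))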
model.eval()

SimpleCNN(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=1568, out_features=128, bias=True)
  (fc2): Linear(in_features=128, out_features=5, bias=True)
)
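
Saving a `state_dict` stores only the parameters, so loading requires an instance of the same architecture to receive them. The round trip can be sketched with a small stand-in model and an in-memory buffer. The `nn.Linear` model and the buffer are assumptions for illustration (the notebook uses `SimpleCNN` and a `.pth` file), and the `weights_only` argument assumes a reasonably recent PyTorch release.

```python
import io

import torch
import torch.nn as nn

# Small stand-in model (hypothetical; the notebook uses SimpleCNN)
source = nn.Linear(4, 2)

# Save only the parameters, here to an in-memory buffer instead of a .pth file
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# Load into a fresh instance of the same architecture
restored = nn.Linear(4, 2)
state = torch.load(buffer, map_location="cpu", weights_only=True)
restored.load_state_dict(state)
restored.eval()  # disable training-only behavior before inference

print(torch.equal(source.weight, restored.weight))
```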

## Perform Inference on a Single Image

We will define a function to perform inference on a single image and predict its class label.

In [11]:
# Function to perform inference on a single image
from PIL import Image

def infer_single_image(model, image_path, transform, classes):
    # Load and preprocess the image
    image = Image.open(image_path).convert('RGB')
    image = transform(image)
    image = image.unsqueeze(0)  # Add batch dimension

    # Move to device
    image = image.to(device)

    # Forward pass
    with torch.no_grad():
        output = model(image)
        _, predicted = torch.max(output, 1)
        predicted_class = classes[predicted.item()]
    return predicted_class
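
`infer_single_image` returns only the most likely label. When a confidence estimate is also useful, the logits can be passed through a softmax and the top few classes read off. The sketch below shows this on made-up logits; the values are hypothetical, and the class list is copied from the `train_data.classes` output earlier in the notebook.

```python
import torch

# Class names as printed by train_data.classes earlier in the notebook
classes = ['birch', 'maple', 'pine', 'rowan', 'spruce']

# Hypothetical logits for one image, shape (batch=1, num_classes=5)
logits = torch.tensor([[0.2, 2.1, -0.5, 0.9, 0.1]])

probs = torch.softmax(logits, dim=1)        # convert logits to probabilities
top_probs, top_idx = probs.topk(3, dim=1)   # three most likely classes

top_names = [classes[i] for i in top_idx[0].tolist()]
for name, p in zip(top_names, top_probs[0].tolist()):
    print(f"{name}: {p:.2%}")
```

Softmax probabilities from a small model trained on limited data tend to be poorly calibrated, so they are better read as a ranking than as true likelihoods.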

## Just to Make It More Interactive, I Added Gradio Flavor!

*Warning:* This notebook now contains traces of Gradio. Side effects may include uncontrollable excitement and a sudden urge to classify everything you see!



In [12]:
# Install Gradio if not already installed
!pip install gradio

# Import Gradio
import gradio as gr

# Ensure the model is in evaluation mode
model.eval()

# Define the prediction function
def predict(image):
    # Force 3 channels (uploads may be RGBA or grayscale), then apply the training transformations
    image = transform(image.convert('RGB')).unsqueeze(0).to(device)
    with torch.no_grad():
        output = model(image)
        _, predicted = torch.max(output, 1)
        predicted_class = train_data.classes[predicted.item()]
    return predicted_class

# Create the Gradio interface
iface = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="pil"),
    outputs="text",
    title="Tree recognition",
    description="Upload an image of birch, maple, pine, rowan or spruce, and the model will predict its class."
)

# Launch the Gradio app
iface.launch(share=True)


Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://c02beb79d04f0f0faa.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


