# Week 14: CNN Lab - Rock, Paper, Scissors

**Objective:** Build, train, and test a Convolutional Neural Network (CNN) to classify images of hands playing Rock, Paper, or Scissors.

### Step 1: Setup and Data Download

This notebook downloads the Rock, Paper, Scissors dataset from Kaggle using `kagglehub`.

**Prerequisites:**
1. Install kagglehub: `pip install kagglehub`
2. Create a Kaggle account at https://www.kaggle.com
3. Generate API credentials (Settings → API → Create New Token)
4. Save the `kaggle.json` file to `~/.kaggle/kaggle.json`

**On Windows:**
- Place `kaggle.json` at `C:\Users\<YourUsername>\.kaggle\kaggle.json`

In [16]:
import kagglehub
import os

print("Step 1: Downloading Rock, Paper, Scissors dataset from Kaggle...")
print("=" * 60)

try:
    # Download the dataset using kagglehub
    # This requires kaggle.json credentials to be set up
    downloaded_path = kagglehub.dataset_download("drgfreeman/rockpaperscissors")
    print("✓ Dataset downloaded successfully!")
    print(f"Path: {downloaded_path}")
except Exception as e:
    print(f"✗ Error downloading dataset: {e}")
    print("\nTo fix this, ensure you have:")
    print("1. A Kaggle account (https://www.kaggle.com)")
    print("2. Kaggle API credentials file at ~/.kaggle/kaggle.json")
    print("3. Installed kagglehub: pip install kagglehub")
    downloaded_path = None

print("=" * 60)

Step 1: Downloading Rock, Paper, Scissors dataset from Kaggle...
✓ Dataset downloaded successfully!
Path: C:\Users\nanda\.cache\kagglehub\datasets\drgfreeman\rockpaperscissors\versions\2

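Before running the download, it can help to confirm that the credentials file from the prerequisites is where it is expected. The following is a minimal sketch, assuming the `~/.kaggle/kaggle.json` location listed above. The helper name `find_kaggle_credentials` is hypothetical and is not part of `kagglehub`.

```python
from pathlib import Path


def find_kaggle_credentials(home=None):
    """Return the path to kaggle.json under <home>/.kaggle, or None if it is missing.

    Hypothetical helper for illustration: it only checks that the file exists,
    not that the token inside it is valid.
    """
    home = Path(home) if home is not None else Path.home()
    candidate = home / ".kaggle" / "kaggle.json"
    return candidate if candidate.is_file() else None


found = find_kaggle_credentials()
if found:
    print(f"Credentials file found at: {found}")
else:
    print("No kaggle.json found; see the prerequisites above.")
```

`Path.home()` resolves to `C:\Users\<YourUsername>` on Windows, so the same check covers both locations mentioned in the prerequisites.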

In [17]:
import shutil
import os
from pathlib import Path

print("Step 2: Organizing downloaded dataset...")
print("=" * 60)

# Define local dataset directory (Windows-compatible)
# Using a local path instead of /content/dataset
local_dataset_dir = os.path.join(os.path.expanduser("~"), "rockpaperscissors_dataset")

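# Prefer the path returned by kagglehub; otherwise fall back to its default cache location
if downloaded_path:
    src_root = downloaded_path
else:
    # Fallback: default kagglehub cache path, including the versions/2 subfolder
    # reported in the download output above (adjust if your version differs)
    src_root = os.path.join(os.path.expanduser("~"), ".cache", "kagglehub", "datasets", "drgfreeman", "rockpaperscissors", "versions", "2")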

# Ensure destination directory exists
os.makedirs(local_dataset_dir, exist_ok=True)

folders_to_copy = ["rock", "paper", "scissors"]
copy_count = 0
total_files = 0

for folder in folders_to_copy:
    src_path = os.path.join(src_root, folder)
    dst_path = os.path.join(local_dataset_dir, folder)

    if os.path.exists(src_path):
        # Copy the folder
        if os.path.exists(dst_path):
            shutil.rmtree(dst_path)  # Remove existing folder
        shutil.copytree(src_path, dst_path)
        
        # Count files
        num_files = len(os.listdir(dst_path))
        total_files += num_files
        copy_count += 1
        print(f"✓ Copied: {folder} ({num_files} images)")
    else:
        print(f"✗ Folder not found: {src_path}")

print("=" * 60)
if copy_count == 3:
    print("✓ Dataset organized successfully!")
    print(f"Total images: {total_files}")
    print(f"Dataset location: {local_dataset_dir}")
    DATA_DIR = local_dataset_dir  # Set DATA_DIR for later use
else:
    print(f"⚠ Warning: Only {copy_count}/3 folders were found")
    print("The dataset may not be complete.")

Step 2: Organizing downloaded dataset...
✓ Copied: rock (726 images)
✓ Copied: paper (712 images)
✓ Copied: scissors (750 images)
✓ Dataset organized successfully!
Total images: 2188
Dataset location: C:\Users\nanda\rockpaperscissors_dataset

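The copy step counts every entry in each folder with `os.listdir`, so a stray non-image file would be counted as an image. The following is a minimal sketch of a stricter count that filters by file extension. The helper `count_images` and the extension set are assumptions for illustration, not part of the original notebook.

```python
import os

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}  # assumed set of image extensions


def count_images(root, classes=("rock", "paper", "scissors")):
    """Count image files in each class subfolder of root, ignoring other files."""
    counts = {}
    for name in classes:
        folder = os.path.join(root, name)
        if not os.path.isdir(folder):
            counts[name] = 0  # a missing folder is reported as zero images
            continue
        counts[name] = sum(
            1
            for entry in os.listdir(folder)
            if os.path.splitext(entry)[1].lower() in IMAGE_EXTENSIONS
        )
    return counts


# Per-class totals reported by the copy step above, used as a cross-check
reported = {"rock": 726, "paper": 712, "scissors": 750}
print("Reported total:", sum(reported.values()))  # 726 + 712 + 750 = 2188
```

In the notebook, `count_images(local_dataset_dir)` could be compared against `reported` to confirm that the copy is complete.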

### Step 2: Imports and Device Setup

Import the necessary libraries and check if a GPU is available.

In [18]:
import os
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, random_split
from PIL import Image
import numpy as np

# Set the device variable
# Check if CUDA (GPU) is available, otherwise use CPU
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

print("Using device:", device)

Using device: cpu


### Step 3: Data Loading and Preprocessing

Here we will define our image transformations, load the dataset, split it, and create DataLoaders.

In [19]:
import os
from pathlib import Path

print("Step 3: Loading and Preprocessing Data...")
print("=" * 60)

# Use DATA_DIR from the previous cell (set during download)
# If it doesn't exist, check for alternative paths or create dummy data
if 'DATA_DIR' not in globals() or not os.path.exists(DATA_DIR):
    print("⚠ DATA_DIR not set from download. Checking for dataset...")
    
    # Try to find the dataset in common locations
    potential_paths = [
        os.path.join(os.path.expanduser("~"), "rockpaperscissors_dataset"),
        "/content/dataset",
        "./rockpaperscissors_dataset"
    ]
    
    DATA_DIR = None
    for path in potential_paths:
        if os.path.exists(path) and len(os.listdir(path)) > 0:
            DATA_DIR = path
            print(f"✓ Found dataset at: {DATA_DIR}")
            break
    
    if not DATA_DIR:
        print("✗ Dataset not found. Creating dummy data for testing...")
        DATA_DIR = os.path.join(os.path.expanduser("~"), "rockpaperscissors_dataset")
        os.makedirs(DATA_DIR, exist_ok=True)
        
        classes = ["rock", "paper", "scissors"]
        for class_name in classes:
            class_dir = os.path.join(DATA_DIR, class_name)
            os.makedirs(class_dir, exist_ok=True)
            
            # Create dummy images (Image and np were already imported in Step 2)
            for i in range(10):
                img_array = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)
                img = Image.fromarray(img_array)
                img.save(os.path.join(class_dir, f"dummy_{i}.png"))
        
        print(f"✓ Created dummy dataset at {DATA_DIR}")
else:
    print(f"✓ Using dataset from: {DATA_DIR}")

print("=" * 60)

# Define the image transforms
# 1. Resize all images to 128x128
# 2. Convert them to Tensors
# 3. Normalize them (mean=0.5, std=0.5)
transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

# Load dataset using ImageFolder
full_dataset = datasets.ImageFolder(DATA_DIR, transform=transform)

class_names = full_dataset.classes
print("Classes:", class_names)

# Split the dataset
# We want 80% for training and 20% for testing
train_size = int(0.8 * len(full_dataset))
test_size = len(full_dataset) - train_size

# Use random_split to create train_dataset and test_dataset
train_dataset, test_dataset = random_split(full_dataset, [train_size, test_size])

# Create the DataLoaders
# Use a batch_size of 32
# Shuffle the training loader, but not the test loader
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

print("\nDataset Summary:")
print(f"  Total images: {len(full_dataset)}")
print(f"  Training images: {len(train_dataset)}")
print(f"  Test images: {len(test_dataset)}")

Step 3: Loading and Preprocessing Data...
✓ Using dataset from: C:\Users\nanda\rockpaperscissors_dataset
Classes: ['paper', 'rock', 'scissors']

Dataset Summary:
  Total images: 2188
  Training images: 1750
  Test images: 438

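The split sizes and the effect of the normalization can be reproduced with plain arithmetic. The sketch below mirrors the `int(0.8 * len(full_dataset))` computation and the `(x - mean) / std` formula that `transforms.Normalize` applies per channel. The function names `split_sizes` and `normalize` are hypothetical.

```python
def split_sizes(n_total, train_fraction=0.8):
    """Mirror the notebook's split: truncate the training size, give the rest to test."""
    train = int(train_fraction * n_total)
    return train, n_total - train


def normalize(pixel, mean=0.5, std=0.5):
    """Apply (x - mean) / std to a pixel value already scaled to [0, 1]."""
    return (pixel - mean) / std


train_size, test_size = split_sizes(2188)
print(train_size, test_size)  # 1750 438

# ToTensor scales 8-bit values to [0, 1]; Normalize then maps that range to [-1, 1]
for raw in (0, 128, 255):
    scaled = raw / 255
    print(raw, round(normalize(scaled), 3))
```

With `mean=0.5` and `std=0.5`, black pixels map to -1 and white pixels to 1, which centers the inputs around zero.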

### Step 4: Define the CNN Model

Fill in the `conv_block` and `fc` blocks with the correct layers.

In [20]:
class RPS_CNN(nn.Module):
    def __init__(self):
        super().__init__()

        # Define the convolutional block
        # 1. Conv2d(3 -> 16 channels, kernel=3, padding=1), ReLU, MaxPool2d(2)
        # 2. Conv2d(16 -> 32 channels, kernel=3, padding=1), ReLU, MaxPool2d(2)
        # 3. Conv2d(32 -> 64 channels, kernel=3, padding=1), ReLU, MaxPool2d(2)
        self.conv_block = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),

            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),

            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )

        # After 3 MaxPool(2) layers, our 128x128 image becomes:
        # 128 -> 64 -> 32 -> 16
        # So the flattened size is 64 * 16 * 16

        # Define the fully-connected (classifier) block
        # 1. Flatten the input
        # 2. Linear layer (64 * 16 * 16 -> 256)
        # 3. ReLU
        # 4. Dropout (p=0.3)
        # 5. Linear layer (256 -> 3) (3 classes: rock, paper, scissors)
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 256),
            nn.ReLU(),
            nn.Dropout(p=0.3),
            nn.Linear(256, 3)
        )

    def forward(self, x):
        x = self.conv_block(x)
        x = self.fc(x)
        return x

# Initialize the model, criterion, and optimizer
# 1. Create an instance of RPS_CNN and move it to the 'device'
model = RPS_CNN().to(device)

# 2. Define the loss function (Criterion). Use CrossEntropyLoss for classification.
criterion = nn.CrossEntropyLoss()

# 3. Define the optimizer. Use Adam with a learning rate of 0.001
optimizer = optim.Adam(model.parameters(), lr=0.001)

print(model)

RPS_CNN(
  (conv_block): Sequential(
    (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (4): ReLU()
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU()
    (8): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=16384, out_features=256, bias=True)
    (2): ReLU()
    (3): Dropout(p=0.3, inplace=False)
    (4): Linear(in_features=256, out_features=3, bias=True)
  )
)

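The `in_features=16384` value in the printed model follows from the layer arithmetic described in the code comments. The sketch below recomputes the feature-map size and the number of trainable parameters using the standard output-size formula `(size + 2 * padding - kernel) // stride + 1`. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(c_in, c_out, kernel):
    """Weights plus biases of a Conv2d layer with a square kernel."""
    return c_in * c_out * kernel * kernel + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a Linear layer."""
    return n_in * n_out + n_out


size = 128
channels = [3, 16, 32, 64]
total_params = 0
for c_in, c_out in zip(channels, channels[1:]):
    size = conv_out(size, kernel=3, padding=1)  # 3x3 conv with padding=1 keeps the size
    size = conv_out(size, kernel=2, stride=2)   # MaxPool2d(2) halves it
    total_params += conv_params(c_in, c_out, 3)

flat = channels[-1] * size * size
total_params += linear_params(flat, 256) + linear_params(256, 3)

print("Final feature map:", size, "x", size)  # Final feature map: 16 x 16
print("Flattened size:", flat)                # Flattened size: 16384
print("Trainable parameters:", total_params)  # Trainable parameters: 4218915
```

Almost all of the parameters sit in the first `Linear` layer (16384 × 256 weights), which is typical when a large feature map is flattened directly into a dense layer.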

### Step 5: Train the Model

Fill in the core training steps inside the loop.

In [21]:
EPOCHS = 10

for epoch in range(EPOCHS):
    model.train() # Set the model to training mode
    total_loss = 0

    for images, labels in train_loader:
        # Move data to the correct device
        images, labels = images.to(device), labels.to(device)

        # Implement the training steps
        # 1. Clear the gradients (optimizer.zero_grad())
        optimizer.zero_grad()

        # 2. Perform a forward pass (get model outputs)
        outputs = model(images)

        # 3. Calculate the loss (using criterion)
        loss = criterion(outputs, labels)

        # 4. Perform a backward pass (loss.backward())
        loss.backward()

        # 5. Update the weights (optimizer.step())
        optimizer.step()

        total_loss += loss.item()

    print(f"Epoch {epoch+1}/{EPOCHS}, Loss = {total_loss/len(train_loader):.4f}")

print("Training complete!")

Epoch 1/10, Loss = 0.6363
Epoch 2/10, Loss = 0.1840
Epoch 3/10, Loss = 0.0961
Epoch 4/10, Loss = 0.0468
Epoch 5/10, Loss = 0.0223
Epoch 6/10, Loss = 0.0107
Epoch 7/10, Loss = 0.0235
Epoch 8/10, Loss = 0.0124
Epoch 9/10, Loss = 0.0049
Epoch 10/10, Loss = 0.0222
Training complete!

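The loss printed per epoch is `total_loss / len(train_loader)`, that is, the mean of the per-batch mean losses. With 1750 training images and a batch size of 32, the last batch is smaller than the others, so this differs slightly from a true per-sample average. The sketch below illustrates the difference. The helper names and the toy loss values are hypothetical and are not taken from the training run.

```python
import math


def batch_sizes(n_samples, batch_size):
    """Sizes of the batches a DataLoader yields when the last partial batch is kept."""
    full, remainder = divmod(n_samples, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


def mean_of_batch_means(batch_losses):
    """What the training loop prints: total_loss / len(train_loader)."""
    return sum(batch_losses) / len(batch_losses)


def per_sample_mean(batch_losses, sizes):
    """Alternative: weight each batch's mean loss by its number of samples."""
    return sum(loss * n for loss, n in zip(batch_losses, sizes)) / sum(sizes)


sizes = batch_sizes(1750, 32)
print(len(sizes), sizes[-1])  # 55 22
assert len(sizes) == math.ceil(1750 / 32)

# Hypothetical per-batch losses, chosen only to show that the two averages can differ
toy_sizes = [32, 32, 22]
toy_losses = [0.5, 0.3, 0.1]
print(round(mean_of_batch_means(toy_losses), 4))
print(round(per_sample_mean(toy_losses, toy_sizes), 4))
```

For this lab the difference is small, since only one of the 55 batches is short. The simpler average used in the loop is a reasonable choice for monitoring training.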

### Step 6: Evaluate the Model

Measure the model's accuracy on the held-out test set (the 20% of images not used for training).

In [22]:
model.eval() # Set the model to evaluation mode
correct = 0
total = 0

# Disable gradient tracking: evaluation needs no backward pass
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)

        # Get model predictions
        # 1. Get the raw model outputs (logits)
        outputs = model(images)

        # 2. Get the predicted class (the one with the highest score)
        #    Hint: use torch.max(outputs, 1)
        _, predicted = torch.max(outputs, 1)

        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f"Test Accuracy: {100 * correct / total:.2f}%")

Test Accuracy: 98.17%

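A single accuracy figure hides which classes are confused with each other. The reported 98.17% on 438 test images is consistent with 430 correct predictions, but it does not say where the remaining errors fall. The sketch below builds a confusion matrix in plain Python. The function names and the toy labels are hypothetical and are not real model output.

```python
def confusion_matrix(labels, predictions, n_classes):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true, pred in zip(labels, predictions):
        matrix[true][pred] += 1
    return matrix


def per_class_accuracy(matrix):
    """Fraction of each true class predicted correctly (None for empty classes)."""
    return [row[i] / sum(row) if sum(row) else None for i, row in enumerate(matrix)]


# Cross-check of the reported figure: 430 correct out of 438 test images
print(f"{100 * 430 / 438:.2f}%")  # 98.17%

# Hypothetical labels and predictions (0=paper, 1=rock, 2=scissors)
toy_labels = [0, 0, 1, 1, 2, 2, 2]
toy_preds = [0, 1, 1, 1, 2, 2, 0]

cm = confusion_matrix(toy_labels, toy_preds, n_classes=3)
print(cm)  # [[1, 1, 0], [0, 2, 0], [1, 0, 2]]
print(per_class_accuracy(cm))
```

In the evaluation loop, the inputs could be collected by extending two lists with `labels.tolist()` and `predicted.tolist()` for each batch.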

### Step 7: Test on a Single Image

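Let's see how the model performs on a single image. The image is taken from the dataset folder, so it may have been part of the training split. Treat this as a sanity check rather than an evaluation.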

In [23]:
def predict_image(model, img_path):
    model.eval()

    img = Image.open(img_path).convert("RGB")
    # Apply the same transforms as training, and add a batch dimension (unsqueeze)
    img = transform(img).unsqueeze(0).to(device)

    with torch.no_grad():
        # Get the model prediction
        # 1. Get the raw model outputs (logits)
        output = model(img)

        # 2. Get the predicted class index
        _, pred = torch.max(output, 1)

    return class_names[pred.item()]

# Test the function - find an actual image from the dataset
import os
import glob

test_img_path = None
for class_name in class_names:
    class_folder = os.path.join(DATA_DIR, class_name)
    images = glob.glob(os.path.join(class_folder, "*.png"))
    if images:
        test_img_path = images[0]
        break

if test_img_path:
    prediction = predict_image(model, test_img_path)
    print(f"Model prediction for {test_img_path}: {prediction}")
else:
    print("No images found in dataset to test.")

Model prediction for C:\Users\nanda\rockpaperscissors_dataset\paper\04l5I8TqdzF9WDMJ.png: paper

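`predict_image` returns only the top class. To see how confident the model is, the logits can be converted to probabilities with a softmax. The sketch below shows the computation in plain Python. The logits are hypothetical values, not real model output.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 (numerically stable form)."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["paper", "rock", "scissors"]  # order printed by ImageFolder above
toy_logits = [2.0, 0.5, -1.0]                # hypothetical logits

probs = softmax(toy_logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(class_names[best], round(probs[best], 3))
```

Inside `predict_image`, the equivalent PyTorch call would be `torch.softmax(output, dim=1)`, applied to the logits before taking the maximum.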

### Step 8: Play the Game!

This code is complete. Once your model is trained, run this cell to have the model play against itself: it picks two random images, classifies each one, and applies the game rules to the two predictions.

In [24]:
import random
import os

def pick_random_image(class_name):
    folder = os.path.join(DATA_DIR, class_name)
    files = os.listdir(folder)
    img = random.choice(files)
    return os.path.join(folder, img)

def rps_winner(move1, move2):
    if move1 == move2:
        return "Draw"

    rules = {
        "rock": "scissors",
        "paper": "rock",
        "scissors": "paper"
    }

    if rules[move1] == move2:
        return f"Player 1 wins! {move1} beats {move2}"
    else:
        return f"Player 2 wins! {move2} beats {move1}"


# -----------------------------------------------------------
# 1. Choose any two random classes
# -----------------------------------------------------------

choices = ["rock", "paper", "scissors"]
c1 = random.choice(choices)
c2 = random.choice(choices)

img1_path = pick_random_image(c1)
img2_path = pick_random_image(c2)

print("Randomly selected images:")
print("Image 1:", img1_path)
print("Image 2:", img2_path)


# -----------------------------------------------------------
# 2. Predict their labels using the model
# -----------------------------------------------------------

p1 = predict_image(model, img1_path)
p2 = predict_image(model, img2_path)

print("\nPlayer 1 shows:", p1)
print("Player 2 shows:", p2)

# -----------------------------------------------------------
# 3. Decide the winner
# -----------------------------------------------------------

print("\nRESULT:", rps_winner(p1, p2))

Randomly selected images:
Image 1: C:\Users\nanda\rockpaperscissors_dataset\rock\y0ZTIzS3rpKagERb.png
Image 2: C:\Users\nanda\rockpaperscissors_dataset\scissors\aMAVOdimraDSK6P1.png

Player 1 shows: rock
Player 2 shows: scissors

RESULT: Player 1 wins! rock beats scissors

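The rules table in `rps_winner` can be checked exhaustively, since there are only nine possible pairs of moves. The following is a minimal sketch. `BEATS` and `outcome` are hypothetical names that restate the same rules with short return codes so the tally is easy to inspect.

```python
from itertools import product

# Each key beats its value, as in the rules dictionary of rps_winner
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def outcome(move1, move2):
    """Return 'draw', 'p1', or 'p2' using the same rules table as rps_winner."""
    if move1 == move2:
        return "draw"
    return "p1" if BEATS[move1] == move2 else "p2"


tally = {"draw": 0, "p1": 0, "p2": 0}
for m1, m2 in product(BEATS, repeat=2):
    tally[outcome(m1, m2)] += 1

print(tally)  # {'draw': 3, 'p1': 3, 'p2': 3}
```

A balanced tally of three draws, three wins for Player 1, and three wins for Player 2 confirms that every move beats exactly one other move and loses to exactly one.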