# Food Classification with CNN - Building a Restaurant Recommendation System

This assignment focuses on developing a deep learning-based food classification system using Convolutional Neural Networks (CNNs). You will build a model that can recognize different food categories and use it to return the food preferences of a user.

## Learning Objectives
- Implement CNNs for image classification
- Work with real-world food image datasets
- Build a preference-detector system

## Background: AI-Powered Food Preference Discovery

The system's core idea is simple:

1. Users upload 10 photos of dishes they enjoy
2. Your CNN classifies these images into the 91 categories
3. Based on these categories, the system returns the user's taste profile

Your task is to develop the core computer vision component that will power this detection engine.
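
The three steps above can be sketched as follows. This is a minimal illustration, assuming the classifier has already produced one predicted class name per uploaded photo. The function `build_taste_profile` and the example predictions are hypothetical and not part of the assignment code.

```python
from collections import Counter


def build_taste_profile(predicted_classes):
    """Turn per-image class predictions into (class, share) pairs, most frequent first."""
    counts = Counter(predicted_classes)
    total = len(predicted_classes)
    return [(name, count / total) for name, count in counts.most_common()]


# Hypothetical predictions for 10 uploaded photos (class names taken from the dataset)
predictions = ["hamburger"] * 4 + ["french_fries"] * 3 + ["ice_cream"] * 2 + ["nachos"]
profile = build_taste_profile(predictions)
for name, share in profile:
    print(f"{name}: {share:.0%}")
```

In the real pipeline the list of predictions would come from the trained CNN rather than being written by hand.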

You are given a training dataset ("train" folder) and a test dataset ("test" folder) with ~45k and ~22k samples, respectively. Each of the 91 classes has its own subdirectory containing the images of that class.

## Assignment Requirements

### Technical Requirements
- Implement your own PyTorch CNN architecture for food image classification
- Use only the provided training dataset split for training
- Train the network from scratch; no pretrained weights may be used
- Report the test accuracy after every epoch
- Report all hyperparameters of the final model
- Use a fixed seed and do not use any CUDA features that break reproducibility
- Use PyTorch 2.6
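
One possible way to approach the fixed-seed requirement is sketched below. The helper name `seed_everything` is an assumption for illustration, and the sketch does not by itself guarantee bit-identical results on every machine. `warn_only=True` is used because some CUDA operations have no deterministic implementation and would otherwise raise an error.

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Seed the Python, NumPy and PyTorch random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)            # seeds the CPU and all CUDA devices
    torch.cuda.manual_seed_all(seed)   # explicit; silently ignored without a GPU
    # cuDNN: disable autotuning and request deterministic kernels
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    # Required by some cuBLAS operations in deterministic mode;
    # should be set before the first CUDA call
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    # Warn (instead of raising) when an operation has no deterministic variant
    torch.use_deterministic_algorithms(True, warn_only=True)


seed_everything(42)
first = torch.rand(3)
seed_everything(42)
second = torch.rand(3)
print(torch.equal(first, second))
```

Calling the helper at the top of the training cell, and not only once at the start of the notebook, keeps repeated runs of that cell comparable.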

### Deliverables
1. Jupyter Notebook with CNN implementation, training code etc.
2. README file
3. Report (max 3 pages)

Submit your report, README and all code files as a single zip file named GROUP_[number]_NC2425_PA. The names and IDs of all group members must be listed in the README.
Do not include the dataset in your submission.

### Grading

1. Correct CNN implementation; training runs on the university DSLab computers according to the README.MD instructions without ANY exceptions: 3pt
2. Perfect 1:1 reproducibility on DSLab machines: 1pt
3. Very clear GitHub-repo-style README.MD with instructions for running the code: 1pt
4. Report: 1pt
5. Model performance on the test set: interpolated between 30% and 80% test accuracy: 0-3pt
6. Pick 10 random pictures of the test set to simulate a user uploading images and report which categories occur how often in these: 1pt
7. Bonus point: use an LLM (API) to generate a short description / profile of the preferences of the simulated user
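
To make item 5 concrete, the sketch below maps a test accuracy to points. It assumes the interpolation is linear and clamped at both ends, which the grading text does not state explicitly. The function name `accuracy_to_points` is hypothetical.

```python
def accuracy_to_points(test_accuracy, low=30.0, high=80.0, max_points=3.0):
    """Linearly map a test accuracy (in percent) to points, clamped to [0, max_points]."""
    fraction = (test_accuracy - low) / (high - low)
    return max_points * min(max(fraction, 0.0), 1.0)


for accuracy in (25.0, 30.0, 55.0, 80.0):
    print(f"{accuracy:.0f}% -> {accuracy_to_points(accuracy):.2f}pt")
```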

**If anything about this assignment is unclear, please post your question in the Brightspace discussions forum or send an email.**


# Loading the datasets
The dataset is already split into a train and test set in the directories "train" and "test". 

In [1]:
import torch
import numpy as np
from torchvision import datasets, transforms
from torch.utils.data import DataLoader


random_seed = 42
np.random.seed(random_seed)
torch.manual_seed(random_seed) 
torch.cuda.manual_seed(random_seed)
torch.cuda.manual_seed_all(random_seed)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


transform = transforms.Compose([
    transforms.Resize((256, 256)),  # not all images are exactly 256x256
    transforms.ToTensor(),     

    # TO DO: understand/explain why these parameters are suggested
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # Normalize with ImageNet stats
])

# Load the images from disk; ImageFolder uses each subdirectory name as the class label
train_dataset = datasets.ImageFolder(root='train', transform=transform)
test_dataset = datasets.ImageFolder(root='test', transform=transform)


class_names = train_dataset.classes
print("Class names:", class_names)

class_to_idx = train_dataset.class_to_idx
print("Class to index mapping:", class_to_idx)

# Create DataLoaders for the train and test sets

train_loader = DataLoader(dataset=train_dataset, batch_size=32, shuffle=True, num_workers=16)
# The test set is only evaluated, so it does not need to be shuffled
test_loader = DataLoader(dataset=test_dataset, batch_size=32, shuffle=False, num_workers=16)


for images, labels in train_loader:
    print("Labels:", labels)  # Print the labels for the batch
    print("Labels as class names:", [class_names[label] for label in labels])  # Convert labels to class names
    break


Class names: ['beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito', 'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad', 'carrot_cake', 'ceviche', 'cheese_plate', 'cheesecake', 'chicken_curry', 'chicken_quesadilla', 'chicken_wings', 'chocolate_cake', 'chocolate_mousse', 'churros', 'clam_chowder', 'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame', 'cup_cakes', 'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict', 'escargots', 'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras', 'french_fries', 'french_onion_soup', 'french_toast', 'fried_calamari', 'fried_rice', 'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad', 'grilled_cheese_sandwich', 'grilled_salmon', 'guacamole', 'gyoza', 'hamburger', 'hot_and_sour_soup', 'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream', 'lasagna', 'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', 'macarons', 'miso_soup', 'mussels', 'nachos', 'omelette', 'onion_rings', 'oysters', 
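
Regarding the `Normalize` step in the cell above: it subtracts a per-channel mean and divides by a per-channel standard deviation, and the suggested values are the ImageNet statistics. Because the network is trained from scratch, an alternative is to compute the statistics from the training images themselves. The sketch below shows one way to do this. `channel_mean_std` is a hypothetical helper, and the random tensors are stand-in data, not the food dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def channel_mean_std(loader):
    """Per-channel mean and std over all pixels of all images yielded by loader."""
    n_pixels = 0
    channel_sum = torch.zeros(3, dtype=torch.float64)
    channel_sq_sum = torch.zeros(3, dtype=torch.float64)
    for images, _ in loader:
        images = images.double()  # shape: (batch, 3, height, width)
        n_pixels += images.shape[0] * images.shape[2] * images.shape[3]
        channel_sum += images.sum(dim=(0, 2, 3))
        channel_sq_sum += (images ** 2).sum(dim=(0, 2, 3))
    mean = channel_sum / n_pixels
    std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()
    return mean, std


# Stand-in data: 64 random RGB "images" of size 8x8 with values in [0, 1)
generator = torch.Generator().manual_seed(0)
fake_images = torch.rand(64, 3, 8, 8, generator=generator)
fake_labels = torch.zeros(64, dtype=torch.long)
demo_loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=16)

mean, std = channel_mean_std(demo_loader)
print("mean:", mean.tolist())
print("std:", std.tolist())
```

To apply this to the real data, the loader would have to use a transform without `Normalize`, and the resulting values would then replace the ImageNet ones.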

# CNN Implementation

In [2]:
# Your code here

import torch.nn as nn
import torch.nn.functional as F

class FoodCNN(nn.Module):
    def __init__(self):
        super().__init__()

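        # Input: 256x256x3
        self.conv1 = nn.Conv2d(3, 32, kernel_size=6, stride=2)
        # After conv1: 126x126x32; after max_pool (kernel 3, stride 2): 62x62x32
        # conv2 must output 192 channels to match the input channels of conv3
        self.conv2 = nn.Conv2d(32, 192, kernel_size=5)
        # After conv2: 58x58x192; after max_pool: 28x28x192
        self.conv3 = nn.Conv2d(192, 384, kernel_size=3)  # -> 26x26x384
        self.conv4 = nn.Conv2d(384, 256, kernel_size=3)  # -> 24x24x256
        self.conv5 = nn.Conv2d(256, 256, kernel_size=3)  # -> 22x22x256; after max_pool: 10x10x256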

        self.max_pool = nn.MaxPool2d(kernel_size=3, stride=2)
        self.avg_pool = nn.AdaptiveAvgPool2d((6, 6))

        self.fc1 = nn.Linear(256 * 6 * 6, 4096)
        self.fc2 = nn.Linear(4096, 4096)
        self.fc3 = nn.Linear(4096, len(class_names))

    def forward(self, x):
        x = self.max_pool(F.relu(self.conv1(x)))
        x = self.max_pool(F.relu(self.conv2(x)))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = self.max_pool(F.relu(self.conv5(x)))

        x = self.avg_pool(x)
        x = torch.flatten(x, 1)

        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
       
        return x
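
The spatial size after each layer of `FoodCNN` can be checked with the standard output-size formula for convolution and pooling layers. The sketch below traces a 256x256 input through the layers in the order used by `forward`, using the kernel sizes and strides defined in `__init__` (the shared `max_pool` has kernel size 3 and stride 2). The helper `conv_out` is a hypothetical name.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    """Output side length of a Conv2d/MaxPool2d layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# (layer name, kernel_size, stride) in the order used by FoodCNN.forward
layers = [
    ("conv1", 6, 2), ("max_pool", 3, 2),
    ("conv2", 5, 1), ("max_pool", 3, 2),
    ("conv3", 3, 1), ("conv4", 3, 1),
    ("conv5", 3, 1), ("max_pool", 3, 2),
]

size = 256
trace = []
for name, kernel_size, stride in layers:
    size = conv_out(size, kernel_size, stride)
    trace.append(size)
    print(f"{name}: {size}x{size}")
```

The final `AdaptiveAvgPool2d((6, 6))` then reduces the last feature map to 6x6 regardless of its input size, which is why `fc1` expects `256 * 6 * 6` inputs.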


# Training the model
Implement your training process below. Report the test-accuracy after every epoch for the training run of the final model.

Hint: before training your model, reset the seed in the training cell; otherwise the random state may already have changed due to previous training runs in the notebook.

Note: If you implement automatic hyperparameter tuning, split the train set into train and validation subsets for the objective function.
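
A train/validation split as mentioned in the note could look like the sketch below. It assumes a 90/10 split and uses a seeded generator so that the split is the same on every run. `split_train_val` is a hypothetical helper, and the `TensorDataset` is stand-in data in place of `train_dataset`.

```python
import torch
from torch.utils.data import TensorDataset, random_split


def split_train_val(dataset, val_fraction=0.1, seed=42):
    """Split dataset into train/validation subsets using a seeded generator."""
    n_val = int(len(dataset) * val_fraction)
    n_train = len(dataset) - n_val
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, n_val], generator=generator)


# Stand-in dataset with 100 samples
demo_dataset = TensorDataset(
    torch.arange(100).float().unsqueeze(1),
    torch.zeros(100, dtype=torch.long),
)
train_subset, val_subset = split_train_val(demo_dataset)
print(len(train_subset), len(val_subset))
```

The validation subset can then be wrapped in its own `DataLoader` and used as the objective for the tuning procedure, leaving the test set untouched.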

In [3]:
import datetime as dt
import time


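def calculate_test_accuracy(model):
    """Return the accuracy (in percent) of the model on the test set."""
    was_training = model.training
    model.eval()  # disable training-only behaviour such as dropout
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)  # index of the highest logit
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    model.train(was_training)  # restore the previous mode
    return (correct / total) * 100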


# Set the variables for training
batch_size = 32
num_classes = len(class_names)
learning_rate = 0.001
num_epochs = 20


# Reset the seed so that repeated runs of this cell start from the same state
torch.manual_seed(random_seed)

# Create the CNN model
model = FoodCNN().to(device)
# print(model)

# Set the loss function
loss_fn = nn.CrossEntropyLoss()

# Set the optimizer (Adam takes no momentum argument; weight_decay is optional)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    # Load the data in batches
    print(f"Epoch {epoch+1} begin: {dt.datetime.now()}.")
    start = time.time()
    for i, (images, labels) in enumerate(train_loader):

        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = loss_fn(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    acc = calculate_test_accuracy(model)
    print("Took {:.2f} seconds.".format((time.time() - start)))

    # Note: the loss shown is that of the last training batch of the epoch
    print('Epoch [{}/{}], Loss: {:.4f}, Accuracy: {:.2f}%'.format(epoch+1, num_epochs, loss.item(), acc))

Epoch 1 begin: 2025-04-29 18:39:10.786897.
Took 56.61 seconds.
Epoch [1/20], Loss: 4.5212, Accuracy: 1.05%
Epoch 2 begin: 2025-04-29 18:40:07.399797.
Took 56.13 seconds.
Epoch [2/20], Loss: 4.5107, Accuracy: 0.97%
Epoch 3 begin: 2025-04-29 18:41:03.532948.
Took 56.84 seconds.
Epoch [3/20], Loss: 4.5136, Accuracy: 0.97%
Epoch 4 begin: 2025-04-29 18:42:00.377389.
Took 57.04 seconds.
Epoch [4/20], Loss: 4.5199, Accuracy: 0.95%
Epoch 5 begin: 2025-04-29 18:42:57.418279.
Took 56.81 seconds.
Epoch [5/20], Loss: 4.5047, Accuracy: 0.95%
Epoch 6 begin: 2025-04-29 18:43:54.230396.
Took 57.13 seconds.
Epoch [6/20], Loss: 4.5108, Accuracy: 0.95%
Epoch 7 begin: 2025-04-29 18:44:51.362373.
Took 56.89 seconds.
Epoch [7/20], Loss: 4.5181, Accuracy: 0.95%
Epoch 8 begin: 2025-04-29 18:45:48.255769.
Took 56.73 seconds.
Epoch [8/20], Loss: 4.5053, Accuracy: 0.96%
Epoch 9 begin: 2025-04-29 18:46:44.984624.
Took 56.91 seconds.
Epoch [9/20], Loss: 4.5082, Accuracy: 0.96%
Epoch 10 begin: 2025-04-29 18:47:41.8

In [4]:
print("Overwrite? \"Yes\" / else: ")
user_input = input().lower()
if user_input == "yes" or user_input == 'y':
    PATH = './cnn.pth'
    torch.save(model.state_dict(), PATH)
    print("Save complete.")
else:
    PATH = './cnn_backup.pth'
    torch.save(model.state_dict(), PATH)
    print("Original unchanged, Backup overwritten.")

Overwrite? "Yes" / else: 


KeyboardInterrupt: Interrupted by user

# Calculating model performance
Load the best version of your model (which should be produced and saved by the previous cells), then calculate and report the test accuracy.
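
If the accuracy fluctuates between epochs, the weights after the last epoch are not necessarily the best ones. A minimal sketch of keeping the best snapshot during training is shown below. `BestTracker` is a hypothetical helper, the `nn.Linear` model is a stand-in for `FoodCNN`, and the accuracy values are made up for illustration.

```python
import copy

import torch
import torch.nn as nn


class BestTracker:
    """Remember the model weights that achieved the highest accuracy so far."""

    def __init__(self):
        self.best_acc = float("-inf")
        self.best_state = None

    def update(self, model, acc):
        """Store a copy of the weights if acc improves; return True if it did."""
        if acc > self.best_acc:
            self.best_acc = acc
            # deepcopy: state_dict() returns references to the live parameters
            self.best_state = copy.deepcopy(model.state_dict())
            return True
        return False


demo_model = nn.Linear(2, 2)  # stand-in for FoodCNN
tracker = BestTracker()
improved_first = tracker.update(demo_model, 10.0)
best_weight = demo_model.weight.detach().clone()
with torch.no_grad():
    demo_model.weight.add_(1.0)  # simulate a further training step
improved_second = tracker.update(demo_model, 5.0)
print(improved_first, improved_second, tracker.best_acc)
```

The stored `best_state` can be written to disk with `torch.save` and restored with `load_state_dict`, in the same way as the final weights.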

In [None]:
# Load the best model weights (map_location also allows loading on a CPU-only machine)
model2 = FoodCNN().to(device)
model2.load_state_dict(torch.load("cnn_backup.pth", map_location=device))

final_test_acc = calculate_test_accuracy(model2)
print(f"Final Test Accuracy: {final_test_acc:.2f}%")


# Summary of hyperparameters
Report the hyperparameters (learning rate, etc.) that you used in your final model for reproducibility.
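
One simple way to report them is to collect the values in a single dictionary and print it. The sketch below uses the values that appear in the cells of this notebook (seed, image size, batch size, optimizer, learning rate, loss, number of epochs). The dictionary layout itself is only a suggestion.

```python
import json

# Values as set in the notebook cells above
hyperparameters = {
    "seed": 42,
    "image_size": [256, 256],
    "batch_size": 32,
    "optimizer": "Adam",
    "learning_rate": 0.001,
    "loss": "CrossEntropyLoss",
    "num_epochs": 20,
}

report = json.dumps(hyperparameters, indent=2)
print(report)
```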

# Simulation of random user
Pick 10 random pictures of the test set to simulate a user uploading images and report which categories occur how often in these: 1pt
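
The selection and counting step might be sketched as follows. `simulate_user` is a hypothetical helper and the labels are stand-in data. With `ImageFolder`, the ground-truth labels are available as `test_dataset.targets`, and in the full pipeline the counted categories would be the model's predictions for the picked images.

```python
import random
from collections import Counter


def simulate_user(labels, class_names, n_images=10, seed=0):
    """Pick n_images distinct indices at random and count the classes among them."""
    rng = random.Random(seed)  # local generator: leaves the global RNG state untouched
    picked = rng.sample(range(len(labels)), n_images)
    counts = Counter(class_names[labels[i]] for i in picked)
    return picked, counts


# Stand-in data: 300 samples spread evenly over three classes
demo_classes = ["hamburger", "ice_cream", "nachos"]
demo_labels = [i % 3 for i in range(300)]

picked, counts = simulate_user(demo_labels, demo_classes, seed=0)
for name, count in counts.most_common():
    print(f"{name}: {count}")
```

Changing `seed` selects a different set of 10 images while keeping each individual run repeatable.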

In [None]:
# Your code here
# Below an example showing the format of the code output

# Bonus point
Use an LLM (API) to generate a description of the food preference of a user based on 10 images that a potential user could provide. 
Please include an example of the output of your code, especially if you used an API other than the OpenAI API.

The code should also work well for different sets of test images, which can be obtained by setting a different random seed for the image selector.
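
Independently of which LLM API is used, the category counts first have to be turned into a text prompt. The sketch below covers only that step and makes no API call. `build_preference_prompt` and the prompt wording are assumptions for illustration.

```python
def build_preference_prompt(category_counts):
    """Format category counts as a text prompt asking for a preference profile."""
    # Most frequent categories first; ties are broken alphabetically
    ordered = sorted(category_counts.items(), key=lambda item: (-item[1], item[0]))
    lines = [f"- {name.replace('_', ' ')}: {count} photo(s)" for name, count in ordered]
    return (
        "A user uploaded photos of dishes they enjoy. "
        "An image classifier detected these categories:\n"
        + "\n".join(lines)
        + "\nDescribe this user's food preferences in two or three sentences."
    )


# Hypothetical counts for one simulated user
demo_counts = {"hamburger": 4, "french_fries": 3, "ice_cream": 3}
prompt = build_preference_prompt(demo_counts)
print(prompt)
```

The resulting string can be passed as the user message to whichever LLM API is chosen.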