# Food Image Classification with Hugging Face

<p align="center">
  <img src="Food class_2.png" alt="Food class_2" width="500">
</p>

A popular social media platform dedicated to food enthusiasts wants to improve user engagement by adding advanced image recognition features. As a machine learning engineer, you are tasked with developing a food image classification system using Hugging Face's state-of-the-art models. This system will automatically identify and categorize food items in user-uploaded photos, allowing for better content organization and personalized food content recommendations.

Your task is to fine-tune a pre-trained Hugging Face model into a robust food-category image classifier.

Accurate classification helps users find and engage with content about their favorite foods, improving the overall experience on the platform.

The solution is built on PyTorch and the Hugging Face `transformers` library, with an open-source pre-trained model from the Hugging Face Hub as its backbone.

In [3]:
# Install required libraries
!pip install matplotlib
!pip install pillow
!pip install scikit-learn
!pip install transformers datasets evaluate
!pip install torchvision


In [4]:
# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
import evaluate
from PIL import Image
from datasets import load_dataset
from transformers import pipeline
from torchvision.transforms import RandomHorizontalFlip, RandomRotation, RandomResizedCrop, ColorJitter, ToTensor, CenterCrop, Compose, Normalize
from transformers.utils import logging
# Only show error messages from the transformers library to reduce the amount of log output
logging.set_verbosity_error()

import warnings
# Ignore all Python warnings to keep the output clean
warnings.filterwarnings("ignore")

In [5]:
# Helper function to convert image to RGB format
def convert_to_rgb(image):
    """
    Converts an image to RGB format.

    Parameters:
    image (PIL.Image): An image object.

    Returns:
    PIL.Image: Image object in RGB format.
    """
    return image.convert('RGB')

In [6]:
# Load the first 5,000 training images of Food-101 and hold out 20% for evaluation
food = load_dataset("ethz/food101", split="train[:5000]")
food = food.train_test_split(test_size=0.2)



In [7]:
# Preprocess the dataset
from transformers import AutoImageProcessor

model_name = "google/vit-base-patch16-224-in21k"
image_processor = AutoImageProcessor.from_pretrained(model_name)

# Define the size based on the image processor
size = (
    image_processor.size["shortest_edge"]
    if "shortest_edge" in image_processor.size
    else (image_processor.size["height"], image_processor.size["width"])
)

# Define the transformations
# Note: CenterCrop crops without resizing, so only the central region of a larger image is kept
_transforms = Compose([
    CenterCrop(size),
    ToTensor(),
    Normalize(mean=image_processor.image_mean, std=image_processor.image_std),
])
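The size lookup and the normalization step above can be illustrated in plain Python. This is a minimal sketch: `resolve_size` and `normalize_pixel` are hypothetical helpers, and the example dictionaries and the mean/std of 0.5 are assumed values, not ones read from the actual image processor. Inspect `image_processor.size`, `image_processor.image_mean`, and `image_processor.image_std` for the real ones.

```python
def resolve_size(size_dict):
    """Mirror the notebook's size logic on a plain dictionary (hypothetical helper)."""
    if "shortest_edge" in size_dict:
        return size_dict["shortest_edge"]
    return (size_dict["height"], size_dict["width"])


def normalize_pixel(value, mean, std):
    """Apply the per-channel formula used by Normalize: (value - mean) / std."""
    return (value - mean) / std


# Assumed example configurations; a real processor may report different values.
square_config = {"height": 224, "width": 224}
edge_config = {"shortest_edge": 256}

resolved_square = resolve_size(square_config)
resolved_edge = resolve_size(edge_config)

# ToTensor scales pixel values to [0, 1]; with an assumed mean and std of 0.5,
# normalization maps that range onto [-1, 1].
normalized = [normalize_pixel(v, mean=0.5, std=0.5) for v in (0.0, 0.5, 1.0)]

print(resolved_square, resolved_edge, normalized)
```

A processor that reports `height` and `width` yields a tuple, while one that reports `shortest_edge` yields a single integer. `CenterCrop` accepts either form.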

In [8]:
# Apply the transformations 
def transforms(samples): 
    samples["pixel_values"] = [_transforms(convert_to_rgb(img)) for img in samples["image"]] 
    del samples["image"] 
    return samples

food = food.with_transform(transforms)

In [9]:
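# Data augmentation for the training set only; the test set keeps the
# deterministic transforms applied above
train_data_augmentation = Compose([
    RandomHorizontalFlip(),
    RandomRotation(degrees=(-10, 10)),
    # scale is a fraction of the image area, so its upper bound is 1.0
    RandomResizedCrop(size=size, scale=(0.8, 1.0)),
    ToTensor(),
    Normalize(mean=image_processor.image_mean, std=image_processor.image_std),
])

def train_transforms(samples):
    samples["pixel_values"] = [train_data_augmentation(convert_to_rgb(img)) for img in samples["image"]]
    del samples["image"]
    return samples

# Replace the on-the-fly transform of the training split with the augmented version
food["train"] = food["train"].with_transform(train_transforms)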


In [10]:
labels = food["train"].features["label"].names
label2id, id2label = dict(), dict()
for i, label in enumerate(labels):
    label2id[label] = str(i)
    id2label[str(i)] = label
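As a small illustration of how the two mappings built above are used, the sketch below rebuilds them for a hypothetical three-class label list (the real list comes from the dataset features) and looks up a predicted class index.

```python
# Hypothetical three-class label list; the real one comes from the dataset features.
example_labels = ["apple_pie", "baby_back_ribs", "baklava"]

example_label2id, example_id2label = {}, {}
for i, label in enumerate(example_labels):
    example_label2id[label] = str(i)
    example_id2label[str(i)] = label

# A prediction arrives as an integer class index, so cast it to str
# before looking it up in a mapping keyed by strings.
predicted_index = 2
predicted_label = example_id2label[str(predicted_index)]

# The two mappings are inverses of each other.
round_trip = example_id2label[example_label2id["apple_pie"]]

print(predicted_label, round_trip)
```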

In [11]:
from transformers import AutoModelForImageClassification


model = AutoModelForImageClassification.from_pretrained(
    model_name,
    num_labels=len(labels),
    id2label=id2label,
    label2id=label2id,
)

In [12]:
from sklearn.metrics import accuracy_score
import numpy as np
from transformers import DefaultDataCollator

data_collator = DefaultDataCollator()


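def compute_metrics(eval_pred):
    # eval_pred holds the model logits and the true label ids for the evaluation set
    predictions, labels = eval_pred
    # The predicted class is the index of the largest logit in each row
    predictions = np.argmax(predictions, axis=1)
    return {"accuracy": accuracy_score(labels, predictions)}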
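To show what the metric function computes, here is a toy example with made-up logits for four images and three classes. It uses a plain NumPy comparison in place of `accuracy_score`, which should give the same number for a simple case like this one.

```python
import numpy as np

# Toy logits for four images and three classes (made-up numbers).
toy_logits = np.array([
    [2.0, 0.1, -1.0],  # largest value at index 0
    [0.2, 1.5, 0.3],   # largest value at index 1
    [0.0, 0.4, 0.9],   # largest value at index 2
    [1.2, 0.8, 0.1],   # largest value at index 0
])
toy_labels = np.array([0, 1, 1, 0])

# Pick the highest-scoring class per row, then compare with the true labels.
toy_predictions = np.argmax(toy_logits, axis=1)
toy_accuracy = float((toy_predictions == toy_labels).mean())

print(toy_predictions.tolist(), toy_accuracy)
```

Three of the four predictions match the labels, so the accuracy is 0.75.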


In [0]:
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir="./food_model_result",
    remove_unused_columns=False,
    evaluation_strategy="epoch",  # called eval_strategy in newer transformers releases
    save_strategy="epoch",
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    logging_steps=10,
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=food["train"],
    eval_dataset=food["test"],
    tokenizer=image_processor,  # newer transformers releases call this argument processing_class
    compute_metrics=compute_metrics,
)

trainer.train()
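The batch settings above interact through gradient accumulation. The arithmetic below is a rough sketch that assumes training on a single device. The update count is approximate, because how the last partial accumulation group in an epoch is handled can differ between library versions.

```python
import math

# Values taken from the notebook's configuration.
subset_size = 5000
test_fraction = 0.2
per_device_train_batch_size = 16
gradient_accumulation_steps = 4
num_train_epochs = 3

# Assumption: training runs on a single device.
num_devices = 1

test_size = round(subset_size * test_fraction)
train_size = subset_size - test_size

# Number of examples that contribute to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Forward/backward passes per epoch, and an approximate optimizer update count.
batches_per_epoch = math.ceil(train_size / (per_device_train_batch_size * num_devices))
approx_updates_per_epoch = batches_per_epoch // gradient_accumulation_steps
approx_total_updates = approx_updates_per_epoch * num_train_epochs

print(train_size, effective_batch_size, batches_per_epoch, approx_total_updates)
```

With these settings each optimizer update sees 64 images, and the whole run amounts to fewer than 200 updates.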

In [None]:
local_path = "./fine_tuned_food_model"
trainer.save_model(local_path)
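Once the model is saved, it could be loaded back for inference with the `pipeline` API imported earlier. The sketch below is one possible way to do that, not part of the original notebook: `classify_image` and `format_predictions` are hypothetical helpers, the mock predictions are made up, and the code assumes the image processor was saved in the same directory as the model.

```python
def classify_image(image_path, model_dir="./fine_tuned_food_model", top_k=3):
    """Load the saved model and return the top_k predictions for one image."""
    # Imported lazily so that defining this helper does not load the library.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_dir)
    return classifier(image_path, top_k=top_k)


def format_predictions(predictions):
    """Turn a list of {"label", "score"} dicts into readable lines."""
    return [f"{p['label']}: {p['score']:.1%}" for p in predictions]


# Made-up predictions in the list-of-dicts shape the pipeline returns.
mock_predictions = [
    {"label": "pizza", "score": 0.91},
    {"label": "lasagna", "score": 0.06},
]
formatted = format_predictions(mock_predictions)
print(formatted)
```

A call such as `format_predictions(classify_image("food.jpg"))`, where `food.jpg` is a placeholder path, would then print the top predictions for a real image.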