# HW04

### Setup

Install Pillow and import the libraries used throughout the notebook.

In [2]:
# Install Pillow first so the PIL import below succeeds
!pip install Pillow

# Import libraries
from pathlib import Path

import pandas as pd
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTForImageClassification



### Load the Model and Processor

Load the pretrained emotion classifier and its image processor from the Hugging Face Hub.

In [3]:
# Load model and processor
EMOTION_MODEL = "dima806/facial_emotions_image_detection"
processor = ViTImageProcessor.from_pretrained(EMOTION_MODEL)
model = ViTForImageClassification.from_pretrained(EMOTION_MODEL)

### Test Zero
Run the model on a single image as a sanity check before processing whole folders.

#### Uploading the Image
Locate the first test image and confirm that it opens.

In [4]:
# Define the images path
image_folder = Path("./TestImage0/")
image_files = list(image_folder.glob("*.*"))
print("Found images:", image_files)

# Check the image is really there
image_path = image_files[0]
image = Image.open(image_path)
print(type(image))

Found images: [PosixPath('TestImage0/Happy1.png')]
<class 'PIL.PngImagePlugin.PngImageFile'>


#### Running the Model
Preprocess the image, run inference, and map the predicted class index to its label.

In [6]:
# Open the image
image = Image.open(image_path)

# The processor expects three channels, so convert to RGB if needed
if image.mode != "RGB":
    image = image.convert("RGB")

# Resize and normalize the image into model-ready tensors
inputs = processor(images=image, return_tensors="pt")

# Inference without gradient tracking
with torch.no_grad():
    outputs = model(**inputs)

# Get the index of the highest-scoring class
logits = outputs.logits
predicted_class = logits.argmax(-1).item()

# Map the class index to its label
label = model.config.id2label[predicted_class]
print("Predicted Emotion:", label)


Predicted Emotion: happy


#### Figuring out the possible labels

In [7]:
#List model labels
labels = model.config.id2label
print(labels)

{0: 'sad', 1: 'disgust', 2: 'angry', 3: 'neutral', 4: 'fear', 5: 'surprise', 6: 'happy'}
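
The cells above keep only the top class via `argmax`. To see how confident the model is across all seven labels, the logits can be passed through a softmax. The following is a minimal, dependency-free sketch: `softmax_scores` is a hypothetical helper, and `example_logits` are made-up numbers, not output from this model.

```python
import math

# Label mapping as printed by model.config.id2label above.
ID2LABEL = {0: "sad", 1: "disgust", 2: "angry", 3: "neutral",
            4: "fear", 5: "surprise", 6: "happy"}


def softmax_scores(logits, id2label=ID2LABEL):
    """Convert raw logits into (label, probability) pairs, highest first."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    scores = [(id2label[i], e / total) for i, e in enumerate(exps)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up logits for illustration only; in the notebook they would come
# from outputs.logits[0].tolist().
example_logits = [0.1, -1.2, 0.4, 1.0, -0.3, 0.2, 3.5]
for label, prob in softmax_scores(example_logits):
    print(f"{label:>8}: {prob:.3f}")
```

A low top probability would suggest that the single predicted label should be read with caution.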


### Defining Image Processing Function

In [8]:
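def process_images(folder_path):
    """Classify every file in folder_path and return the results as a DataFrame."""
    folder = Path(folder_path)
    # Note: "*.*" matches any file name containing a dot, not only images
    image_files = list(folder.glob("*.*"))

    results = []

    for image_path in image_files:
        image = Image.open(image_path)

        # The processor expects three channels, so convert to RGB if needed
        if image.mode != "RGB":
            image = image.convert("RGB")
        inputs = processor(images=image, return_tensors="pt")

        # Inference without gradient tracking
        with torch.no_grad():
            outputs = model(**inputs)
            logits = outputs.logits
            predicted_class = logits.argmax(-1).item()

        # Map the class index to its label
        label = model.config.id2label[predicted_class]

        results.append((image_path.name, label))

    return pd.DataFrame(results, columns=["Image Name", "Predicted Label"])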
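
The `"*.*"` pattern picks up every file with a dot in its name, so a stray non-image file in a test folder would make `Image.open` fail. One possible alternative, sketched below, filters by extension without regard to case (the test folders mix `.jpg`, `.JPG`, and `.PNG`) and sorts the result so the row order is stable. `list_image_files` and `IMAGE_SUFFIXES` are hypothetical names, and the extension set is an assumption.

```python
from pathlib import Path

# Assumed set of extensions; adjust to the formats actually in use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_image_files(folder_path, suffixes=IMAGE_SUFFIXES):
    """Return the image files in folder_path, sorted by path."""
    folder = Path(folder_path)
    return sorted(
        path for path in folder.iterdir()
        # Compare lowercased suffixes so ".JPG" and ".jpg" both match
        if path.is_file() and path.suffix.lower() in suffixes
    )
```

Inside `process_images`, a call such as `image_files = list_image_files(folder_path)` could stand in for the `glob` line.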


### Test One
Every image in this set shows a single, front-facing human face. The subjects vary in race, gender, and age, and there is one image for each of the seven emotions.

In [9]:
process_images("./TestImages1/")

|   | Image Name | Predicted Label |
|---|---|---|
| 0 | Scared1.jpg | happy |
| 1 | Happy1.JPG | angry |
| 2 | Disgusted1.jpg | happy |
| 3 | Surprise1.JPG | surprise |
| 4 | Angry1.jpg | happy |
| 5 | Sad1.JPG | neutral |
| 6 | Neutral1.PNG | neutral |
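
Since each file name encodes the intended emotion, the Test One table can be scored automatically. This is a rough sketch: the `EXPECTED` mapping from file-name prefixes to model labels is an assumption about the naming scheme, and `expected_label` and `score` are hypothetical helpers.

```python
# Assumed mapping from file-name prefixes to the model's label names.
EXPECTED = {
    "Scared": "fear", "Happy": "happy", "Disgusted": "disgust",
    "Surprise": "surprise", "Angry": "angry", "Sad": "sad",
    "Neutral": "neutral",
}


def expected_label(file_name):
    """Derive the intended label from a name such as 'Scared1.jpg'."""
    stem = file_name.rsplit(".", 1)[0]      # drop the extension
    prefix = stem.rstrip("0123456789")      # drop the trailing number
    return EXPECTED[prefix]


def score(results):
    """Return (number correct, total) for (file name, predicted label) pairs."""
    correct = sum(1 for name, pred in results if expected_label(name) == pred)
    return correct, len(results)


# Rows copied from the Test One output above.
test_one = [
    ("Scared1.jpg", "happy"), ("Happy1.JPG", "angry"),
    ("Disgusted1.jpg", "happy"), ("Surprise1.JPG", "surprise"),
    ("Angry1.jpg", "happy"), ("Sad1.JPG", "neutral"),
    ("Neutral1.PNG", "neutral"),
]
correct, total = score(test_one)
print(f"{correct} of {total} predictions match the intended emotion")
```

Under this mapping, only `Surprise1.JPG` and `Neutral1.PNG` match their intended emotion.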


### Test Two
'How deterministic is the model? Does it always produce the same result if you repeat the input?'
I repeat the same input five times.

In [11]:
process_images("./TestImages2/")

|   | Image Name | Predicted Label |
|---|---|---|
| 0 | Happy5.JPG | angry |
| 1 | Happy1.JPG | angry |
| 2 | Happy2.JPG | angry |
| 3 | Happy3.JPG | angry |
| 4 | Happy4.JPG | angry |
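
Another way to probe determinism is to call the model several times on one image instead of reading several files. The sketch below tallies the labels returned by repeated calls. `repeat_prediction` and `is_deterministic` are hypothetical helpers, and `fake_predict` is a stand-in that always answers `"angry"`; in the notebook it would be replaced by a function that wraps the processor and model.

```python
from collections import Counter


def repeat_prediction(predict, item, runs=5):
    """Call predict(item) `runs` times and tally the labels returned."""
    return Counter(predict(item) for _ in range(runs))


def is_deterministic(tally):
    """True when every run produced the same label."""
    return len(tally) == 1


# Stand-in predictor for illustration only; it does not run the model.
def fake_predict(path):
    return "angry"


tally = repeat_prediction(fake_predict, "Happy1.JPG")
print(dict(tally), "deterministic:", is_deterministic(tally))
```

A tally with more than one key would indicate that repeated runs disagreed.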


### Test Three
'Consider doing some quantitative analysis: If you run the model 20 times with similar inputs, how many times does it give a "bad" answer?'
All 20 inputs here show the emotion 'happy', so any other predicted label counts as a bad answer.

In [27]:
process_images("./TestImages3/")

|   | Image Name | Predicted Label |
|---|---|---|
| 0 | Happy16.png | happy |
| 1 | Happy14.png | happy |
| 2 | Happy13.png | happy |
| 3 | Happy15.jpg | happy |
| 4 | Happy19.png | happy |
| 5 | Happy17.png | happy |
| 6 | Happy5.JPG | happy |
| 7 | Happy7.jpg | happy |
| 8 | Happy8.JPG | happy |
| 9 | Happy1.JPG | fear |
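
The count of bad answers can be computed from the predicted labels instead of being read off by eye. The sketch below uses only the rows displayed in the output above; `bad_answer_rate` is a hypothetical helper, and in the notebook the label list could come from the `"Predicted Label"` column of the returned DataFrame.

```python
from collections import Counter


def bad_answer_rate(predictions, expected):
    """Return (bad count, total, tally) for a list of predicted labels."""
    tally = Counter(predictions)
    bad = sum(count for label, count in tally.items() if label != expected)
    return bad, len(predictions), tally


# Predicted labels from the rows displayed above: nine "happy", one "fear".
shown = ["happy"] * 9 + ["fear"]
bad, total, tally = bad_answer_rate(shown, expected="happy")
print(f"{bad} bad answer(s) out of {total}; label counts: {dict(tally)}")
```

The tally also shows which wrong labels the model falls back on, which a bare error count hides.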


### Test Four
In this test, images contain more than one person.

In [24]:
process_images("./TestImages4/")

|   | Image Name | Predicted Label |
|---|---|---|
| 0 | Many1.JPG | happy |
| 1 | Many3.JPG | fear |
| 2 | Many4.JPG | neutral |
| 3 | Many2.JPG | happy |
| 4 | Many5.JPG | happy |


### Test Five
In this test, images are of art pieces, sketches and sculptures of human faces.

In [25]:
process_images("./TestImages5/")

|   | Image Name | Predicted Label |
|---|---|---|
| 0 | HappyArt3.jpg | fear |
| 1 | AngryArt.jpg | sad |
| 2 | HappyArt.jpg | fear |
| 3 | NeutralArt.jpg | neutral |
| 4 | HappyArt2.jpg | neutral |


### Test Six
In this test, images do not include human faces.

In [26]:
process_images("./TestImages6/")

|   | Image Name | Predicted Label |
|---|---|---|
| 0 | tile.jpg | happy |
| 1 | dog2.jpg | fear |
| 2 | dog1.jpg | fear |
| 3 | bottle.JPG | fear |
| 4 | cake.jpg | surprise |
