## <p style="font-family: Georgia; font-weight: normal; letter-spacing: 2px; color: #fabd2f;     font-size: 140%; text-align: left; padding: 0px; border-bottom: 3px solid #fabd2f;">Webcam Proof of Concept</p>

Real-time emotion detection on a live webcam feed using a pre-trained CNN. The model distinguishes three emotions: happy, sad, and neutral.

In [None]:
import csv
import time
from datetime import datetime
from pathlib import Path

import cv2
import torch
import torch.nn.functional as F
from playsound3 import playsound  # only needed once the audio alerts are enabled

from model import MyCNN


device = torch.device('mps' if torch.backends.mps.is_available() else 'cpu')
print(f'Using device: {device}')

Using device: mps


## <p style="font-family: Georgia; font-weight: normal; letter-spacing: 2px; color: #fabd2f;     font-size: 140%; text-align: left; padding: 0px; border-bottom: 3px solid #fabd2f;">Setup: Initialize Model and Device</p>

Load the pre-trained MyCNN model onto the GPU through MPS when it is available (this notebook was run on a Mac), otherwise onto the CPU. Initialize the Haar Cascade classifier for face detection.

In [None]:
model = MyCNN().to(device)
# strict=False skips missing or unexpected keys instead of raising an error,
# so any layer without a matching key keeps its randomly initialized weights
model.load_state_dict(torch.load('models/best_model.pth', map_location=device), strict=False)
model.eval()

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Emotion labels
class_names = ['happy', 'sad', 'neutral']
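
Because `strict=False` suppresses key-mismatch errors, it can help to check which parameters were actually matched. The following is a minimal sketch using a hypothetical helper, `diff_state_keys` (not part of PyTorch), that compares the key names of the model and the checkpoint. The key names in the example are made up for illustration and are not taken from MyCNN.

```python
def diff_state_keys(model_keys, checkpoint_keys):
    """Hypothetical helper: report keys that exist on only one side."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # in the model, absent from the checkpoint
    unexpected = sorted(checkpoint_keys - model_keys)   # in the checkpoint, unknown to the model
    return missing, unexpected


# Illustrative key names only; real ones would come from model.state_dict().keys()
# and from the dictionary returned by torch.load(...).
missing, unexpected = diff_state_keys(
    ["conv1.weight", "conv1.bias", "fc.weight"],
    ["conv1.weight", "conv1.bias", "classifier.weight"],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

PyTorch's `load_state_dict` also returns an object with `missing_keys` and `unexpected_keys` fields, which carries the same information for the real model.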

## <p style="font-family: Georgia; font-weight: normal; letter-spacing: 2px; color: #fabd2f;     font-size: 140%; text-align: left; padding: 0px; border-bottom: 3px solid #fabd2f;">Face Detection</p> 


Use a Haar Cascade to detect faces in each frame, and define the emotion class labels for output. The order of the labels must match the class indices the model was trained with.

In [3]:
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
# labels
class_names = ['happy', 'sad', 'neutral']
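
`detectMultiScale` returns one `(x, y, w, h)` box per detected face. When only the person closest to the camera should be classified, one option is to keep the largest box. This is a small sketch with a hypothetical helper, `largest_face`; the notebook itself classifies every detected face.

```python
def largest_face(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if there are no boxes."""
    boxes = [tuple(int(v) for v in box) for box in boxes]
    if not boxes:
        return None
    return max(boxes, key=lambda b: b[2] * b[3])


# Example boxes in the (x, y, w, h) layout used by the detection loop.
print(largest_face([(10, 20, 50, 50), (200, 80, 120, 120), (5, 5, 30, 30)]))
```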

## <p style="font-family: Georgia; font-weight: normal; letter-spacing: 2px; color: #fabd2f;     font-size: 140%; text-align: left; padding: 0px; border-bottom: 3px solid #fabd2f;">Real-time Emotion Detection Loop</p>

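Capture frames from the webcam, detect faces, classify the emotion of each face, log the predictions, and display the results. After a happy or sad prediction, classification pauses for 1.5 seconds. Press 'q' to quit.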
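
Before classification, each face crop is resized to 48x48, scaled to [0, 1], normalized to [-1, 1] with a mean and standard deviation of 0.5, and given batch and channel dimensions. The NumPy sketch below mirrors the scaling and reshaping steps on a synthetic image, assuming the resize has already happened. The name `preprocess_face` is hypothetical and does not appear in the notebook.

```python
import numpy as np


def preprocess_face(face_48x48):
    """Map a uint8 grayscale 48x48 crop to a float32 array of shape (1, 1, 48, 48) in [-1, 1]."""
    scaled = face_48x48.astype(np.float32) / 255.0   # [0, 255] -> [0, 1]
    normalized = (scaled - 0.5) / 0.5                 # [0, 1] -> [-1, 1]
    return normalized[np.newaxis, np.newaxis, :, :]   # add batch and channel axes


# Synthetic crop: left half black, right half white.
crop = np.zeros((48, 48), dtype=np.uint8)
crop[:, 24:] = 255
batch = preprocess_face(crop)
print(batch.shape, batch.min(), batch.max())
```

Whatever normalization is applied here should match the one used when the model was trained; otherwise the predicted confidences are not meaningful.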

In [4]:
log_path = Path('predictions_log.csv')
if not log_path.exists():
    with open(log_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "emotion", "confidence"])
        
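cap = cv2.VideoCapture(0)  # 0 selects the default camera
if not cap.isOpened():
    raise RuntimeError("Could not open the default camera")
pause_until = 0  # timestamp (seconds since the epoch) until which predictions are paused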

while True:
    ret, frame = cap.read()
    if not ret:
        break
    
    # While paused, keep showing the live feed but skip detection and classification
    if time.time() < pause_until:
        cv2.imshow("FER Live", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        continue

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    
    for (x, y, w, h) in faces:
        face = gray[y:y+h, x:x+w]
        face_resized = cv2.resize(face, (48, 48))
        face_tensor = torch.tensor(face_resized).unsqueeze(0).unsqueeze(0).float() / 255.0
        face_tensor = (face_tensor - 0.5) / 0.5
        face_tensor = face_tensor.to(device)
        
        with torch.no_grad():
            output = model(face_tensor)
            probs = F.softmax(output, dim=1)
            conf, pred = torch.max(probs, 1)
            emotion = class_names[pred.item()]
            confidence = conf.item()
            
        # Log
        with open(log_path, "a", newline="") as f:
            writer = csv.writer(f)
            writer.writerow([datetime.now().isoformat(), emotion, round(confidence, 4)])
            
        # Draw
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
        text = f"{emotion} ({confidence*100:.1f}%)"
        cv2.putText(frame, text, (x, max(y - 10, 20)), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 255), 2)  # keep the label on screen for faces near the top edge
        
        if emotion == "sad":
            # TODO: replace the print with the audio alert below
            # playsound("alert.mp3", block=False)
            print("Sad detected!")
            pause_until = time.time() + 1.5  # pause predictions for 1.5 s
        elif emotion == "happy":
            # playsound("happy.mp3", block=False)
            print("Happy detected!")
            pause_until = time.time() + 1.5
            
    cv2.imshow("FER Live", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
        
cap.release()
cv2.destroyAllWindows()
            

Happy detected!


Happy detected!
Happy detected!
Happy detected!
Sad detected!
Sad detected!
Sad detected!
Sad detected!
Happy detected!
Happy detected!
Sad detected!
Sad detected!
Sad detected!
Sad detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Happy detected!
Sad detected!
Sad detected!
Sad detected!
Sad detected!
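
Each prediction is also appended to `predictions_log.csv` with the columns `timestamp`, `emotion`, and `confidence`. The following is a minimal sketch of summarizing such a log with the standard library. `summarize_log` is a hypothetical helper, and the sample rows are made up for illustration rather than taken from a real run.

```python
import csv
import io
from collections import defaultdict


def summarize_log(lines):
    """Return {emotion: (count, mean confidence)} from CSV lines that start with a header row."""
    totals = defaultdict(lambda: [0, 0.0])
    for row in csv.DictReader(lines):
        entry = totals[row["emotion"]]
        entry[0] += 1                          # number of predictions for this emotion
        entry[1] += float(row["confidence"])   # running sum of confidences
    return {emotion: (n, total / n) for emotion, (n, total) in totals.items()}


# Made-up sample in the same column layout as predictions_log.csv.
sample = io.StringIO(
    "timestamp,emotion,confidence\n"
    "2025-01-01T12:00:00,happy,0.90\n"
    "2025-01-01T12:00:02,happy,0.70\n"
    "2025-01-01T12:00:04,sad,0.60\n"
)
summary = summarize_log(sample)
for emotion, (count, mean_conf) in summary.items():
    print(f"{emotion}: {count} predictions, mean confidence {mean_conf:.2f}")
```

For the real file, an open handle can be passed in place of the sample, for example `with open('predictions_log.csv', newline='') as f: summarize_log(f)`.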
