<h1 style="text-align: center;">Facial Recognition for Seamless Student Identification</h1>
<h3 style="text-align: center;">Ghaisan Rabbani<br>Melin Ayu Safitri<br>Rachmawati Hapsari Putri</h3>

---

In [13]:
# Import library
from sklearn.neighbors import KNeighborsClassifier
import cv2 # for computer vision tasks
import pickle # for saving and loading data
import numpy as np # for numerical operations and array handling
import os # for interacting with the operating system
import csv 
import time # for time-related functions
import pyttsx3 # for text-to-speech (TTS) functionality
from datetime import datetime # for manipulating date and time
from deepface import DeepFace # for facial recognition and emotion analysis
from threading import Thread # for concurrent execution of tasks
# Others
import warnings

# Configure Settings
warnings.simplefilter("ignore")  # Ignore warnings


In [14]:
import sys
print(f"Python version: {sys.version}")

Python version: 3.12.5 | packaged by Anaconda, Inc. | (main, Sep 12 2024, 18:18:29) [MSC v.1929 64 bit (AMD64)]


In [3]:
# Initialize text-to-speech
engine = pyttsx3.init()

# Variables for tracking
last_emotion_check = 0
emotion_check_interval = 2.0
current_emotion = "Unknown"
current_emotion_accuracy = 0
face_detection_confidence = 0
last_speech_time = 0
speech_interval = 5.0
is_speaking = False

# Column definitions for attendance CSV file
COL_NAMES = ['NAME', 'TIME', 'EMOTION', 'EMOTION_ACCURACY', 'FACE_CONFIDENCE']

**Code Explanation**

- `pyttsx3.init()` creates an instance of the TTS engine, which is used later to convert text to speech. `pyttsx3` works offline and supports multiple platforms.

- **last_emotion_check:** Timestamp of the most recent emotion analysis. Comparing it with `emotion_check_interval` avoids running the analysis on every frame.
- **emotion_check_interval:** Controls how often the system checks for emotions from the face, in seconds. With a value of 2.0, the program checks emotions at most once every 2 seconds.
- **current_emotion:** Keeps track of the most recently detected emotion, such as "happy", "sad", or "angry", which is used later for feedback and in the attendance record.
- **current_emotion_accuracy:** Represents how confident the model is in the current emotion, as a percentage. It could be used to filter out low-confidence results or to display feedback based on the certainty of the emotion.
- **face_detection_confidence:** A score for how reliable the face detection is, as a percentage. It can be used to filter out false detections before tasks like emotion analysis or speech output.
- **last_speech_time:** Timestamp of the last spoken emotion message. The system speaks again only after `speech_interval` has passed since this time.
- **speech_interval:** Prevents the system from speaking too frequently. With a value of 5.0, at least 5 seconds pass between emotion messages.
- **is_speaking:** Prevents overlapping speech. While this flag is `True`, new speech requests are ignored, so only one speech instance runs at a time.
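
The interval variables above all follow the same pattern: remember when something last happened and act again only after enough time has passed. As a minimal sketch of that pattern, the hypothetical helper `IntervalGate` below (not part of the notebook) wraps the comparison in a small class. The `clock` parameter is an assumption added so the behavior can be checked without waiting.

```python
import time


class IntervalGate:
    """Allow an action at most once per `interval` seconds (hypothetical helper)."""

    def __init__(self, interval, clock=time.time):
        self.interval = interval
        self.clock = clock
        self.last_time = 0.0  # same starting value as last_emotion_check

    def ready(self):
        """Return True and restart the timer if the interval has elapsed."""
        now = self.clock()
        if now - self.last_time > self.interval:
            self.last_time = now
            return True
        return False


# Usage: one gate per throttled task
emotion_gate = IntervalGate(2.0)  # plays the role of emotion_check_interval
speech_gate = IntervalGate(5.0)   # plays the role of speech_interval
```

Calling `emotion_gate.ready()` once per frame would then replace the manual comparison against `last_emotion_check`.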

In [4]:
def speak_async(text):
    global is_speaking
    if not is_speaking:
        is_speaking = True
        engine.say(text)
        engine.runAndWait()
        is_speaking = False

**Code Explanation:**

- The `speak_async` function performs a text-to-speech operation. The call itself blocks until the speech finishes, so the program runs it in a separate `Thread` to keep the main loop responsive. It works by:
    - Checking if the system is already speaking (using the `is_speaking` flag).
    - If not speaking, starting the speech process and waiting for it to complete.
    - Once done, setting the `is_speaking` flag to `False`, allowing further speech commands to be processed.
- Use Case:
    This function keeps the TTS engine from overlapping with itself when several parts of the program request speech (such as feedback or notifications). A request that arrives while the engine is still speaking is skipped, not queued.
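
One caveat with a plain boolean flag is that two threads can both read `is_speaking` as `False` before either sets it to `True`. A possible alternative, sketched here under assumptions, is a `threading.Lock` acquired without blocking. `ExclusiveSpeaker` and its `speak_fn` parameter are hypothetical names; `speak_fn` stands in for the `engine.say()` / `engine.runAndWait()` pair so the sketch runs without a TTS engine.

```python
import threading


class ExclusiveSpeaker:
    """Run a blocking speak function so only one call is active at a time."""

    def __init__(self, speak_fn):
        self._speak_fn = speak_fn  # hypothetical stand-in for the TTS calls
        self._lock = threading.Lock()

    def speak(self, text):
        """Speak `text` if idle; return False if another call is in progress."""
        # A non-blocking acquire tests and sets the "busy" state in one atomic step
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self._speak_fn(text)
            return True
        finally:
            self._lock.release()


# Usage with a stand-in that just prints the message
speaker = ExclusiveSpeaker(print)
threading.Thread(target=speaker.speak, args=("Hello!",)).start()
```

As in the notebook, a request made while speech is in progress is dropped; queueing messages would need a different design.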

In [10]:
def record_attendance(name, emotion, emotion_acc, face_conf):
    try:
        ts = time.time()
        date = datetime.fromtimestamp(ts).strftime("%d-%m-%Y")
        timestamp = datetime.fromtimestamp(ts).strftime("%H:%M:%S")
        
        if not os.path.exists("Attendance"):
            os.makedirs("Attendance")
            
        attendance_file = f"Attendance/Attendance_{date}.csv"
        file_exists = os.path.isfile(attendance_file)
        
        # Add accuracy to attendance data
        attendance_data = [
            str(name), 
            str(timestamp), 
            str(emotion),
            f"{emotion_acc:.2f}%",
            f"{face_conf:.2f}%"
        ]
        
        with open(attendance_file, "a", newline='') as csvfile:
            writer = csv.writer(csvfile)
            if not file_exists:
                writer.writerow(COL_NAMES)
            writer.writerow(attendance_data)
            
        return True
    except Exception as e:
        print(f"Error recording attendance: {e}")
        return False

**Code Explanation:**

- **Function:** `record_attendance(name, emotion, emotion_acc, face_conf)`
- **Purpose:** This function records a person's attendance together with their detected emotion, the accuracy of the emotion detection, and the confidence of the face detection. It appends this data to a per-day CSV file in a folder named `Attendance`, creating the folder and the file if they don't exist. It returns `True` on success and `False` if an error occurs.
- **Uses:**
    - It ensures proper organization by creating an "Attendance" folder.
    - It allows adding new records of attendance throughout the day, appending data to the file for each person detected.
    - The function stores the time and emotion for each attendance record, along with the emotion accuracy and face detection confidence as percentages.
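
To illustrate the file layout this produces, the sketch below writes two rows with the same header-on-first-write logic and reads them back with `csv.DictReader`. The helper names `append_row` and `read_attendance`, the temporary folder, and the sample values are all made up for the demonstration.

```python
import csv
import os
import tempfile

COL_NAMES = ['NAME', 'TIME', 'EMOTION', 'EMOTION_ACCURACY', 'FACE_CONFIDENCE']


def append_row(path, row):
    """Append one row, writing the header first if the file is new."""
    file_exists = os.path.isfile(path)
    with open(path, "a", newline='') as csvfile:
        writer = csv.writer(csvfile)
        if not file_exists:
            writer.writerow(COL_NAMES)
        writer.writerow(row)


def read_attendance(path):
    """Return the attendance rows as a list of dicts keyed by column name."""
    with open(path, newline='') as csvfile:
        return list(csv.DictReader(csvfile))


# Demo with made-up values in a temporary folder
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "Attendance_demo.csv")
    append_row(demo_path, ["Alice", "08:00:00", "happy", "91.50%", "42.00%"])
    append_row(demo_path, ["Bob", "08:01:30", "neutral", "77.25%", "35.10%"])
    rows = read_attendance(demo_path)

print([row["NAME"] for row in rows])
```

Because the header is written only when the file does not exist yet, repeated calls on the same day keep a single header row.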

In [11]:
def analyze_emotion_async(frame, name):
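    global current_emotion, current_emotion_accuracy, last_speech_time
    try:
        analysis = DeepFace.analyze(frame, actions=['emotion'], enforce_detection=False)
        new_emotion = analysis[0]['dominant_emotion']
        # Score of the dominant emotion (DeepFace reports emotion scores as percentages)
        new_accuracy = float(analysis[0]['emotion'][new_emotion])
        
        # If the emotion has changed and the speech interval has passed
        current_time = time.time()
        if (new_emotion != current_emotion and 
            current_time - last_speech_time > speech_interval):
            
            current_emotion = new_emotion
            current_emotion_accuracy = new_accuracy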
            # Create messages based on emotions
            if current_emotion == 'happy':
                message = f"Hi {name}, your mood today is happy! Keep up the positive energy!"
            elif current_emotion == 'sad':
                message = f"Hi {name}, your mood today is sad. Stay strong and positive!"
            elif current_emotion == 'angry':
                message = f"Hi {name}, your mood today is angry. Take a deep breath and relax."
            else:
                message = f"Hi {name}, your mood is {current_emotion}. Stay positive!"
            
            # Run speech in a separate thread
            Thread(target=speak_async, args=(message,)).start()
            last_speech_time = current_time
            
    except Exception as e:
        print("Error analyzing emotion:", e)

The `analyze_emotion_async` function is a crucial part of the emotion-based feedback system.<br>
**It:**
- Analyzes emotions from a video frame using the DeepFace library.
- Provides verbal feedback based on the detected emotion, but only when the emotion has changed and `speech_interval` has passed since the last message.
- Sends the speech to a separate thread so the analysis does not wait for it to finish.
- Is itself started in a separate thread by the main loop, keeping the main program responsive.
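
The function indexes the analysis result as `analysis[0]['dominant_emotion']`. As a sketch of how the matching score could be read from the same result, the hypothetical helper below assumes each per-face entry also carries an `emotion` mapping from label to percentage score. That layout is an assumption about the installed DeepFace version, and the sample data is hand-written, not real model output.

```python
def dominant_emotion_with_score(analysis):
    """Return (label, score) for the first face in a DeepFace-style result."""
    first_face = analysis[0]
    label = first_face['dominant_emotion']
    # Assumed layout: 'emotion' maps each label to its score
    score = float(first_face['emotion'][label])
    return label, score


# Hand-written sample in the assumed result layout
sample = [{
    'dominant_emotion': 'happy',
    'emotion': {'happy': 92.3, 'neutral': 5.1, 'sad': 2.6},
}]
label, score = dominant_emotion_with_score(sample)
print(f"{label} ({score:.1f}%)")
```

A score obtained this way is the kind of value `current_emotion_accuracy` is meant to hold.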

In [7]:
# Starting video capture
video = cv2.VideoCapture(0)

# Load classifier and data
facedetect = cv2.CascadeClassifier(r'data\haarcascade_frontalface_default.xml')
with open(r'data\names.pkl', 'rb') as f:
    LABELS = pickle.load(f)
with open(r'data\faces_data.pkl', 'rb') as f:
    FACES = pickle.load(f)

# Setup KNN
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(FACES, LABELS)

**Code Explanation:**
- `cv2.VideoCapture(0)`: Opens the default webcam. This is the starting point for capturing live frames for processing, such as face detection and recognition.
- The `haarcascade_frontalface_default.xml` file contains the trained model for detecting frontal faces.
- `cv2.CascadeClassifier()` initializes the classifier using the provided XML file. This classifier is used later to detect the location of faces in the video frames.
- `names.pkl` and `faces_data.pkl` are loaded into `LABELS` and `FACES`, the names and face samples used for recognition.
- The code is part of a real-time face recognition system: detected faces are matched against the stored face data using the KNN classifier.
- Applications:
    - Attendance systems.
    - Access control systems.

**How It Works Together**
- **Video Capture:** Starts the webcam feed for real-time video processing.
- **Face Detection:** Uses the Haar Cascade classifier to locate faces in the video frames.
- **Data Loading:** Loads the stored face data and corresponding labels for recognition.
- **KNN Training:** Fits a KNN classifier (`n_neighbors=5`) to recognize faces based on the loaded data.
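
To make the recognition step more concrete, here is a minimal pure-Python sketch of the k-nearest-neighbours idea: find the `k` stored samples closest to a query and return the most common label among them. This is a conceptual illustration, not scikit-learn's implementation, and the 2-D points and names are toy data. In the notebook each sample is a flattened face image instead.

```python
from collections import Counter
import math


def knn_predict(samples, labels, query, k=5):
    """Return the majority label among the k samples nearest to `query`."""
    distances = [
        (math.dist(sample, query), label)
        for sample, label in zip(samples, labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two well-separated clusters, five samples per person
samples = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels = ["Ana"] * 5 + ["Budi"] * 5

print(knn_predict(samples, labels, (0.2, 0.3)))
print(knn_predict(samples, labels, (10.2, 10.4)))
```

One consequence of this approach is that the classifier always returns one of the known names, even for a face it has never seen.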

In [8]:
# Variable for tracking attendance
last_attendance_name = None
last_attendance_time = 0
attendance_cooldown = 30  # 30 second cooldown between attendance for the same name

- **Tracking Attendance:**<br>
    - `last_attendance_name` stores the name from the most recent attendance record, so repeated entries for the same person in quick succession can be detected.
    - If the recognized name matches `last_attendance_name`, the system checks the cooldown before allowing another entry. A different name is recorded immediately.
- **Cooldown Mechanism:**<br>
    - `last_attendance_time` is updated every time a person's attendance is recorded.
    - If the current time minus `last_attendance_time` is less than `attendance_cooldown`, the system skips recording attendance for that same person.
- **Prevention of Duplicates:**<br>
    - The combination of these variables avoids creating multiple entries for the same person within a short time.
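
The duplicate check described above can be written as a single condition. The sketch below isolates it in a hypothetical function, `should_record`, with timestamps passed in explicitly so the cases are easy to try out. The names and times are illustrative.

```python
def should_record(name, now, last_name, last_time, cooldown=30):
    """Record if the name is new, or if the same name is past the cooldown."""
    return last_name != name or now - last_time > cooldown


# First record of the session: no previous name
print(should_record("Ana", now=100, last_name=None, last_time=0))
# Same person 10 seconds later: inside the cooldown
print(should_record("Ana", now=110, last_name="Ana", last_time=100))
# Different person 10 seconds later: recorded immediately
print(should_record("Budi", now=110, last_name="Ana", last_time=100))
```

Only the most recent name is remembered, so two people alternating in front of the camera would each be recorded every time.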

In [9]:
while True:
    ret, frame = video.read()
    if not ret:
        # Stop if the camera did not return a frame
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    # Face detection with confidence
    faces = facedetect.detectMultiScale(
        gray,
        scaleFactor=1.3,
        minNeighbors=5,
        minSize=(30, 30),
        flags=cv2.CASCADE_SCALE_IMAGE
    )
    
    # Estimate a confidence score for face detection
    if len(faces) > 0:
        # Heuristic: the relative size of the first detected face serves as the confidence indicator
        x, y, w, h = faces[0]
        face_area = w * h
        frame_area = frame.shape[0] * frame.shape[1]
        face_detection_confidence = min((face_area / frame_area) * 400, 100)  # Scale and cap at 100%
    
    for (x,y,w,h) in faces:
        crop_img = frame[y:y+h, x:x+w, :]
        resized_img = cv2.resize(crop_img, (50,50)).flatten().reshape(1,-1)
        output = knn.predict(resized_img)
        current_name = str(output[0])
        
        current_time = time.time()
        if current_time - last_emotion_check > emotion_check_interval:
            Thread(target=analyze_emotion_async, args=(frame.copy(), current_name)).start()
            last_emotion_check = current_time
        
        # Draw rectangles and information
        cv2.rectangle(frame, (x,y), (x+w, y+h), (0,0,255), 1)
        cv2.rectangle(frame,(x,y-40),(x+w,y),(50,50,255),-1)
        cv2.putText(frame, current_name, (x,y-15), cv2.FONT_HERSHEY_COMPLEX, 1, (255,255,255), 1)
        
        # Display emotion and accuracies
        y_offset = y + h + 25
        emotion_text = f"Emotion: {current_emotion} ({current_emotion_accuracy:.1f}%)"
        cv2.putText(frame, emotion_text, (x, y_offset), 
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0,255,0), 2)
        
        face_conf_text = f"Face Detection Confidence: {face_detection_confidence:.1f}%"
        cv2.putText(frame, face_conf_text, (x, y_offset + 25), 
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0,255,0), 2)
        
        # Add instruction text
        cv2.putText(frame, "Press 'o' to mark attendance", (10, 30), 
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

    cv2.imshow("Frame", frame)
    
    k = cv2.waitKey(1)
    
    # Handle attendance recording
    if k == ord('o') and len(faces) > 0:
        current_time = time.time()
        if (last_attendance_name != current_name or 
            current_time - last_attendance_time > attendance_cooldown):
            
            Thread(target=speak_async, args=("Recording attendance...",)).start()
            
            if record_attendance(current_name, current_emotion, 
                               current_emotion_accuracy, face_detection_confidence):
                last_attendance_name = current_name
                last_attendance_time = current_time
                Thread(target=speak_async, args=("Attendance recorded successfully!",)).start()
            else:
                Thread(target=speak_async, args=("Failed to record attendance.",)).start()
        break
    
    if k == ord('q'):
        break

video.release()
cv2.destroyAllWindows()

This code integrates multiple functionalities into a cohesive real-time system:<br>

- Detecting faces and calculating detection confidence.
- Recognizing faces using a pre-trained KNN model.
- Analyzing emotions with periodic checks.
- Recording attendance with checks to prevent duplicate entries.
- Providing a user-friendly interface with visual information and text-to-speech notifications.
- Processing video frames continuously until the user presses `q`, or presses `o` to mark attendance while a face is in view.
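
The detection confidence shown on screen is a size-based heuristic, not a score reported by the Haar cascade. The sketch below reproduces the same arithmetic in a standalone, hypothetical function so the scaling is easier to see. The 640x480 frame size is an assumed example.

```python
def size_based_confidence(face_w, face_h, frame_w, frame_h, scale=400, cap=100):
    """Map the face's share of the frame area to a 0-100 score."""
    face_area = face_w * face_h
    frame_area = frame_w * frame_h
    return min((face_area / frame_area) * scale, cap)


# A 160x160 face in a 640x480 frame covers about 8.3% of the frame
print(f"{size_based_confidence(160, 160, 640, 480):.1f}%")
# A 320x320 face covers about 33% of the frame, so the score is capped
print(f"{size_based_confidence(320, 320, 640, 480):.1f}%")
```

With a scale factor of 400, any face covering a quarter of the frame or more reaches the 100% cap, so the value mostly reflects how close the person is to the camera.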