Technological Institute of the Philippines | Quezon City - Computer Engineering
--- | ---
Course Code: | CPE 313
Course Title: |  Advanced Machine Learning and Deep Learning
1st Semester | AY 2024-2025
<hr> | <hr>
**ACTIVITY NO. 7** | **Performing Face Recognition**
**Name** | Quibral, Juliann Vincent
**Section** | CPE32S3
**Date Performed**: | 2/21/2025
**Date Submitted**: | 2/21/2025
**Instructor**: | Engr. Roman M. Richard

<hr>

## 1. Objectives

This activity aims to enable students to perform data preparation and face recognition on their own generated dataset.

## 2. Intended Learning Outcomes (ILOs)
After this activity, the students should be able to:
* Utilize data preparation techniques for images.
* Perform Face Recognition using multiple algorithms.
* Evaluate the performance of different algorithms.

## 3. Procedures and Outputs

### Preparing the training data

Now that we have our data, we need to load these sample pictures into our face recognition algorithms. All face recognition algorithms take two parameters in their `train()` method: an array of images and an array of labels. What do these labels represent? They are the IDs of a certain individual/face so that when face recognition is performed, we not only know the person was recognized but also who—among the many people available in our database—the person is.

To do that, we need to create a comma-separated value (CSV) file, which will contain the path to a sample picture followed by the ID of that person.
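As a minimal sketch of how such a CSV file could be generated, the helper below walks a dataset folder and writes one `path;label` row per image. The function name `write_label_csv`, the semicolon delimiter, the list of image extensions, and the one-subfolder-per-person layout are assumptions made for illustration; they are not prescribed by the activity.

```python
import csv
import os

def write_label_csv(dataset_dir, csv_path, delimiter=";"):
    """Write one `image_path;label` row per image found under dataset_dir.

    Assumes (hypothetically) one subfolder per person, e.g. dataset_dir/alice/1.jpg.
    Returns a dict mapping each folder name to its integer ID.
    """
    image_exts = {".jpg", ".jpeg", ".png", ".pgm", ".bmp"}
    label_ids = {}
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=delimiter)
        # Sort so that IDs are assigned in a reproducible order
        for person in sorted(os.listdir(dataset_dir)):
            person_dir = os.path.join(dataset_dir, person)
            if not os.path.isdir(person_dir):
                continue
            label_ids[person] = len(label_ids)
            for name in sorted(os.listdir(person_dir)):
                if os.path.splitext(name)[1].lower() in image_exts:
                    writer.writerow([os.path.join(person_dir, name), label_ids[person]])
    return label_ids
```

Returning the name-to-ID mapping makes it possible to translate a predicted ID back into a person's name later on.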

**Include a Screenshot of Your Dataset Here**

---

### Loading the data and recognizing faces

Next up, we need to load these two resources (the array of images and CSV file) into the face recognition algorithm, so it can be trained to recognize our face. To do this, we build a function that reads the CSV file and—for each line of the file—loads the image at the corresponding path into the images array and the ID into the labels array.
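The CSV-based loading described above could look roughly like the sketch below. It assumes rows of the form `path;label`; the names `load_from_csv` and `_load_gray_200` and the fixed 200x200 size are illustrative choices, not part of the activity's sample code. The image loader is passed in as a parameter so that the parsing logic can be exercised without a camera or dataset.

```python
import csv

def _load_gray_200(path):
    """Default loader: grayscale read plus resize, using OpenCV."""
    import cv2  # imported lazily so the CSV parsing can be tested without OpenCV
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return None if img is None else cv2.resize(img, (200, 200))

def load_from_csv(csv_path, loader=_load_gray_200, delimiter=";"):
    """Return (images, labels) built from rows of `path;label`."""
    images, labels = [], []
    with open(csv_path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if len(row) != 2:
                continue  # skip blank or malformed lines
            path, label = row[0].strip(), int(row[1])
            img = loader(path)
            if img is None:
                continue  # unreadable image: keep images and labels aligned
            images.append(img)
            labels.append(label)
    return images, labels
```

Skipping unreadable images before appending the label keeps the two lists the same length, which `train()` relies on.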

In [1]:
import numpy as np
import os
import errno
import sys
import cv2

def read_images(path, sz=None):
  """Load grayscale images from each subject subfolder of `path`."""
  c = 0  # numeric label assigned to the current subject folder
  X, y = [], []

  for dirname, dirnames, filenames in os.walk(path):
    for subdirname in dirnames:
      subject_path = os.path.join(dirname, subdirname)
      for filename in os.listdir(subject_path):
        if filename == ".directory":
          continue
        filepath = os.path.join(subject_path, filename)
        # cv2.imread returns None (it does not raise) when a file cannot be read
        im = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
        if im is None:
          print(f"Skipping unreadable file: {filepath}")
          continue

        # Resize the images to the prescribed size, given as (width, height)
        if sz is not None:
          im = cv2.resize(im, sz)

        X.append(np.asarray(im, dtype=np.uint8))
        y.append(c)
      c = c+1
  return [X, y]

**Question: Run the function above on your generated dataset. Provide an analysis and note all the challenges you have encountered running this code.**

When I first ran the function on my generated dataset, I encountered a number of directory-related problems. The directory paths in the script did not match my local directory structure, so the code kept failing whenever it tried to access or save files. Once I modified the paths to fit my local setup, the function operated as planned. The primary lesson here is that file paths must be set up correctly before running the code in order to prevent such problems.
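One way to catch path problems like these early is to summarize the dataset folder before loading any images. The sketch below is an illustrative helper, not part of the activity's code; the name `summarize_dataset` and the one-subfolder-per-subject layout are assumptions.

```python
import os

def summarize_dataset(root):
    """Return {subject_folder: number_of_files}; fail early if root is missing."""
    if not os.path.isdir(root):
        # Showing the absolute path makes a wrong working directory easy to spot
        raise FileNotFoundError(f"Dataset folder not found: {os.path.abspath(root)}")
    counts = {}
    for name in sorted(os.listdir(root)):
        subject_dir = os.path.join(root, name)
        if os.path.isdir(subject_dir):
            counts[name] = len([f for f in os.listdir(subject_dir)
                                if os.path.isfile(os.path.join(subject_dir, f))])
    return counts
```

A subject that shows up with zero files, or a missing root folder, points to a path issue before any recognizer is involved.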

---

### Performing Face Recognition Algorithms

Here is a sample script for testing the Face Recognition Algorithm. In this section, we're going to follow the same process but with different face recognition algorithms, namely:
- Eigenface Recognition
- Fisherface Recognition
- Local Binary Pattern Histograms (LBPH) Recognition

In [9]:
import sys
import cv2
import numpy as np

def face_rec():
  names = ['Friend1', 'Friend2'] # Put your names here for faces to recognize
  if len(sys.argv) < 2:
    print("USAGE: facerec_demo.py </path/to/images> [</path/to/store/images/at>]")
    sys.exit()

  [X, y] = read_images(sys.argv[1])
  y = np.asarray(y, dtype=np.int32)

  model = cv2.face.EigenFaceRecognizer_create()
  model.train(X, y)

  camera = cv2.VideoCapture(0)
  face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

  while True:
    ret, img = camera.read()
    if not ret:
      break

    faces = face_cascade.detectMultiScale(img, 1.3, 5)

    for (x, y, w, h) in faces:
      cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
      gray = cv2.cvtColor(img[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
      roi = cv2.resize(gray, (200, 200), interpolation=cv2.INTER_LINEAR)

      try:
        params = model.predict(roi)
        label = names[params[0]]
        cv2.putText(img, label + ", " + str(params[1]), (x, y - 20), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2)
      except:
        continue

    cv2.imshow("camera", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):
      break

  camera.release()
  cv2.destroyAllWindows()

if __name__ == "__main__":
  face_rec()


error: OpenCV(4.11.0) D:\bld\libopencv_1739279475736\work\opencv_contrib\modules\face\src\eigen_faces.cpp:62: error: (-5:Bad argument) Empty training data was given. You'll need more than one sample to learn a model. in function 'cv::face::Eigenfaces::train'
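The error above reports that the recognizer received no training images. One possible cause, offered here as an assumption rather than a confirmed diagnosis, is that `sys.argv[1]` does not hold the dataset path when the script runs inside a notebook, so `read_images()` returns empty lists. A small check before `model.train()` can turn this into a clearer message. The helper name `validate_training_data` and its `source` parameter are hypothetical.

```python
def validate_training_data(images, labels, source="<unknown path>"):
    """Raise a descriptive ValueError before calling model.train()."""
    if len(images) != len(labels):
        raise ValueError(f"{len(images)} images but {len(labels)} labels")
    if len(images) < 2:
        raise ValueError(
            f"Only {len(images)} image(s) loaded from {source!r}; "
            "check that the path points at the dataset folder."
        )
    # Eigenface and Fisherface training expects every image to have the same size
    shapes = {getattr(img, "shape", None) for img in images}
    if len(shapes) > 1:
        raise ValueError(f"Images have different sizes: {sorted(map(str, shapes))}")
    return True
```

In the sample script, this could be called as `validate_training_data(X, y, sys.argv[1])` right after `read_images()`.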


**Question: Provide an analysis of the sample script for the process using the Eigenface Model. What is the sample code doing? Are you able to troubleshoot any problems encountered?**

The sample script uses the Eigenface model, which applies Principal Component Analysis (PCA) to perform face recognition. PCA reduces the dimensionality of the face images to a set of "eigenfaces" that can then be used to recognize or verify faces. The script loads the dataset of grayscale face images with `read_images()`, trains an Eigenface recognizer on them, and then reads frames from the webcam. For each face detected by the Haar cascade, it converts the face region to grayscale, resizes it to 200x200, calls `predict()`, and draws the predicted name and score on the frame.

I also ran into problems with the OpenCV library and its dependencies during this exercise, such as the errors "cv2 has no face" and "cv2 has no data." After some investigation, I found that my environment, rather than the code, was the issue. The script's directory structure also did not match my local setup, which I fixed by changing the paths to match my local directory.
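Errors of this kind can be checked for before running the recognition code. In general, `cv2.face` is provided by the opencv_contrib modules (the traceback above points into `opencv_contrib\modules\face`), and `cv2.data` may not be present in every OpenCV build. The sketch below only reports what the current environment exposes; the function name `check_opencv_environment` is hypothetical.

```python
import importlib.util

def check_opencv_environment():
    """Report whether cv2 is importable and whether it exposes `face` and `data`."""
    report = {"cv2": False, "cv2.face": False, "cv2.data": False}
    if importlib.util.find_spec("cv2") is None:
        return report  # OpenCV is not installed in this environment
    try:
        import cv2
    except ImportError:
        return report  # installed, but it cannot be loaded
    report["cv2"] = True
    report["cv2.face"] = hasattr(cv2, "face")
    report["cv2.data"] = hasattr(cv2, "data")
    return report

print(check_opencv_environment())
```

If `cv2.data` is reported as missing, one workaround is to pass the path of a local copy of `haarcascade_frontalface_default.xml` to `cv2.CascadeClassifier`.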

---
Perform the remaining face recognition techniques by using the same (or modified) process from the sample code:

- `model = cv2.face.FisherFaceRecognizer_create()`
- `model = cv2.face.LBPHFaceRecognizer_create()`

**Question: The `predict()` method returns a two-element array. Provide your analysis of the two returned values and their importance in this application.**

The Eigenface model's `predict()` method returns two values. The first is the predicted label, that is, the ID of the person the model believes is in the picture. The second is a distance score, often called the confidence, which measures how far the input face is from its closest match in the training data. This value is essential because it indicates how reliable the prediction is: a smaller distance generally means a closer match and higher confidence, while a larger distance suggests uncertainty or a possible mismatch.
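As a minimal sketch of how the two values could be used together, the helper below accepts a prediction only when its distance stays under a cutoff. The name `interpret_prediction` and the cutoff values are hypothetical; a suitable cutoff has to be found experimentally, and the scale of the distance differs between Eigenface, Fisherface, and LBPH.

```python
def interpret_prediction(label, distance, names, max_distance):
    """Map a (label, distance) pair to a name, or 'Unknown' when the match is weak."""
    # A larger distance means a weaker match, so reject anything above the cutoff
    if distance > max_distance or not (0 <= label < len(names)):
        return "Unknown"
    return names[label]

names = ["Friend1", "Friend2"]
print(interpret_prediction(0, 35.0, names, max_distance=50.0))  # Friend1
print(interpret_prediction(1, 80.0, names, max_distance=50.0))  # Unknown
```

Without such a cutoff, the recognizer always returns the closest known person, even for a face that is not in the training set.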

## 4. Supplementary Activity

Your accomplishment of the tasks below contributes to the achievement of ILO1, ILO2, and ILO3 for this module.

---

Tasks:
1. Create a new dataset for testing. This dataset must include the following:
   - The same person/s that the model has to recognize.
   - Different person/s that the model should not recognize.
2. For each model, perform 20 tests. Document the testing performed and provide observations.
3. Conclude on the performed tests by providing your evaluation of the performance of the models.
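One possible way to document the 20 tests per model is to record each test as an `(actual, predicted)` pair and tally the outcomes. The sketch below assumes that people outside the training set are recorded as `"Unknown"`; that convention, the function name `evaluate`, and the category names are introduced here for illustration only.

```python
def evaluate(results, unknown="Unknown"):
    """Tally outcomes for a list of (actual_name, predicted_name) pairs."""
    summary = {"total": len(results), "correct": 0,
               "false_accepts": 0, "false_rejects": 0, "misidentified": 0}
    for actual, predicted in results:
        if actual == predicted:
            summary["correct"] += 1
        elif actual == unknown:
            summary["false_accepts"] += 1   # stranger labeled as a known person
        elif predicted == unknown:
            summary["false_rejects"] += 1   # known person rejected
        else:
            summary["misidentified"] += 1   # known person given the wrong name
    summary["accuracy"] = summary["correct"] / summary["total"] if results else 0.0
    return summary
```

Separating false accepts from false rejects shows whether a model fails mostly on strangers or on the people it was trained on, which a single accuracy figure hides.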

facerec_train.py

In [None]:
import cv2
import numpy as np
import os

def load_images_from_folder(folder_path):
    X, y = [], []
    label_dict = {}
    label_id = 0

    for person_name in os.listdir(folder_path):
        person_path = os.path.join(folder_path, person_name)
        if not os.path.isdir(person_path):
            continue
        
        if person_name not in label_dict:
            label_dict[person_name] = label_id
            label_id += 1

        for image_name in os.listdir(person_path):
            image_path = os.path.join(person_path, image_name)
            img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
            if img is None:
                continue
            img = cv2.resize(img, (200, 200))
            X.append(img)
            y.append(label_dict[person_name])
    
    # OpenCV recognizers expect 32-bit integer labels
    return X, np.array(y, dtype=np.int32), label_dict

# Load training data
train_path = "images_train"
X_train, y_train, label_dict = load_images_from_folder(train_path)

# Train the LBPH Face Recognizer
model = cv2.face.LBPHFaceRecognizer_create()
model.train(X_train, y_train)

# Save the trained model
model.save("face_model.yml")
np.save("label_dict.npy", label_dict)
print("Model trained and saved successfully!")


facerec_test.py

In [None]:
import cv2
import numpy as np
import os

# Load trained model
model = cv2.face.LBPHFaceRecognizer_create()
model.read("face_model.yml")
label_dict = np.load("label_dict.npy", allow_pickle=True).item()

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')


# Testing dataset
test_path = "images_test"
correct_recognitions = 0
total_tests = 0

for person_name in os.listdir(test_path):
    person_path = os.path.join(test_path, person_name)
    if not os.path.isdir(person_path):
        continue
    
    for image_name in os.listdir(person_path):
        image_path = os.path.join(person_path, image_name)
        img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        if img is None:
            continue
        img = cv2.resize(img, (200, 200))

        predicted_label, confidence = model.predict(img)
        predicted_name = [name for name, label in label_dict.items() if label == predicted_label][0]

        if person_name == predicted_name:
            correct_recognitions += 1
        
        total_tests += 1
        print(f"Test Image: {image_name}, Actual: {person_name}, Predicted: {predicted_name}, Confidence: {confidence:.2f}")

# Avoid a division by zero when no test image could be read
accuracy = (correct_recognitions / total_tests) * 100 if total_tests else 0.0
print(f"\nFinal Accuracy: {accuracy:.2f}% ({correct_recognitions}/{total_tests} correct recognitions)")

# Real-time face recognition
camera = cv2.VideoCapture(0)

while True:
    ret, img = camera.read()
    if not ret:
        break

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)

    for (x, y, w, h) in faces:
        face_roi = gray[y:y+h, x:x+w]
        face_roi = cv2.resize(face_roi, (200, 200))

        predicted_label, confidence = model.predict(face_roi)
        predicted_name = [name for name, label in label_dict.items() if label == predicted_label][0]

        cv2.putText(img, f"{predicted_name}, {confidence:.2f}", (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2)
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)

    cv2.imshow("Face Recognition", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

camera.release()
cv2.destroyAllWindows()

## 5. Summary, Conclusions and Lessons Learned

**Summary / Conclusion**

I had difficulties with directory paths and OpenCV dependencies while building the Eigenface model for face recognition in this exercise. The program applies PCA to the face images to create the eigenfaces it uses for predictions. I got errors like "cv2 has no face" and "cv2 has no data" because the script's directory structure didn't match my local environment. After troubleshooting, I fixed these issues by modifying the file locations and making sure everything was set up correctly. The `predict()` method returned a two-element array containing the predicted label and a confidence metric, which is essential for assessing how reliable a prediction is. This experience made it clear how crucial it is to understand model results and to match directory structures with the current environment. The most important lesson is to confirm the environment configuration and paths before executing code in order to prevent problems and ensure smooth execution.



**Lesson Learned**

My main takeaway is how crucial it is to make sure the environment and directory structures are set up properly before executing code. Interpreting and evaluating the results of machine learning models also requires understanding the output of functions like `predict()`. This experience made it clear that working with external libraries and datasets requires careful troubleshooting and flexibility.


<hr/>

***Proprietary Clause***

*Property of the Technological Institute of the Philippines (T.I.P.). No part of the materials made and uploaded in this learning management system by T.I.P. may be copied, photographed, printed, reproduced, shared, transmitted, translated, or reduced to any electronic medium or machine-readable form, in whole or in part, without the prior consent of T.I.P.*