<a href="https://colab.research.google.com/github/acewolfag/modelFaceNet/blob/main/Face_Pytorch30/10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# IMPORTANT: SOME KAGGLE DATA SOURCES ARE PRIVATE
# RUN THIS CELL IN ORDER TO IMPORT YOUR KAGGLE DATA SOURCES.
import kagglehub
kagglehub.login()


In [None]:
# IMPORTANT: RUN THIS CELL IN ORDER TO IMPORT YOUR KAGGLE DATA SOURCES,
# THEN FEEL FREE TO DELETE THIS CELL.
# NOTE: THIS NOTEBOOK ENVIRONMENT DIFFERS FROM KAGGLE'S PYTHON
# ENVIRONMENT SO THERE MAY BE MISSING LIBRARIES USED BY YOUR
# NOTEBOOK.

rawatjitesh_avengers_face_recognition_path = kagglehub.dataset_download('rawatjitesh/avengers-face-recognition')
acewolfag_facenet_dataset_path = kagglehub.dataset_download('acewolfag/facenet-dataset')
acewolfag_endgame_path = kagglehub.dataset_download('acewolfag/endgame')
acewolfag_model_train_path = kagglehub.dataset_download('acewolfag/model-train')

print('Data source import complete.')



# Face Recognition using Facenet and SVM with PCA Visualization

This notebook implements a face recognition system with the [facenet-pytorch](https://www.kaggle.com/datasets/timesler/facenet-pytorch-vggface2) library: MTCNN detects faces, Inception ResNet V1 converts each face into an embedding, and an SVM classifier is trained on those embeddings. Here's a summary of the code:

1. Data Split: The images of each person are copied into separate training and validation folders using an 80/20 split.
2. Installation: The code installs the facenet-pytorch library with pip.
3. Importing Required Libraries: Necessary libraries and modules are imported for various functionalities, including image processing, machine learning, face detection, and feature extraction.
4. Helper Functions: Four helper functions are defined. The whitens function normalizes an input image tensor, the extract_features function detects faces in an image and extracts their feature embeddings, the dataset_to_embeddings function applies this to a whole dataset, and the train function fits the classifier.
5. Device Configuration: The code sets the device to either GPU or CPU based on availability.
6. Model Initialization: The MTCNN and Inception ResNet V1 models are initialized with pre-trained weights (vggface2 for Inception ResNet V1). The models are moved to the selected device.
7. Dataset Preparation: The code defines the path to the dataset folder and creates ImageFolder datasets for the training and validation sets.
8. Embedding Extraction: The dataset_to_embeddings function is called to extract face embeddings from the training and validation sets.
9. Model Training: The train function is called to fit an SVM classifier on the extracted training embeddings.
10. Model Saving: The trained model is serialized with joblib and saved as a .pkl file; the embeddings, labels, and class mapping are saved as JSON in data.txt.
11. Model Evaluation: The code generates a classification report from the training embeddings and calculates the accuracy of the trained model on the validation embeddings.
12. Visualization: The code uses PCA to reduce the dimensionality of the embeddings and plots a 2D scatter plot of the embeddings with different colors representing different labels.
13. Image Testing: Random images from the validation set are selected and their predicted labels are compared with the true labels. The images, along with the predicted and actual labels, are displayed.
14. Single Image Testing: A single image is tested for face recognition, and the predicted label is displayed along with the image. A face whose highest class probability is below 0.6 is labelled "Intruder".
15. Video Testing: The same detection, embedding, and classification steps are applied to every frame of a video, and the annotated frames are written to an output video.

## Importing Required Libraries

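The first cell below splits the source images into training and validation folders (80/20 per class). The cells after it install facenet-pytorch and import the libraries used in the rest of the notebook: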

In [None]:
import os
import shutil
from sklearn.model_selection import train_test_split
path_folder = "/kaggle/input/facenet-dataset/Original Images"
# Root path of the folder that contains the data
source_folder = path_folder
train_folder = "/kaggle/working/train"
val_folder = "/kaggle/working/val"

# Create the train and val folders if they do not exist yet
os.makedirs(train_folder, exist_ok=True)
os.makedirs(val_folder, exist_ok=True)

# Loop over all folders (one folder per class)
for person in os.listdir(source_folder):
    person_path = os.path.join(source_folder, person)

    # Only process entries that are folders
    if os.path.isdir(person_path):
        # List the files in this person's folder
        images = os.listdir(person_path)

        # Split the files into train and val (80/20)
        train_images, val_images = train_test_split(images, test_size=0.2, random_state=42)

        # Create the class folder inside train and val
        train_class_folder = os.path.join(train_folder, person)
        val_class_folder = os.path.join(val_folder, person)
        os.makedirs(train_class_folder, exist_ok=True)
        os.makedirs(val_class_folder, exist_ok=True)

        # Copy the files into the matching folder
        for image in train_images:
            shutil.copy(os.path.join(person_path, image), os.path.join(train_class_folder, image))

        for image in val_images:
            shutil.copy(os.path.join(person_path, image), os.path.join(val_class_folder, image))

        print(f"Finished splitting class {person}")

print("Dataset split complete!")


Finished splitting class Alia Bhatt
Finished splitting class Charlize Theron
Finished splitting class Zac Efron
Finished splitting class Billie Eilish
Finished splitting class Jessica Alba
Finished splitting class Priyanka Chopra
Finished splitting class Natalie Portman
Finished splitting class Hrithik Roshan
Finished splitting class Tom Cruise
Finished splitting class Roger Federer
Finished splitting class Henry Cavill
Finished splitting class Amitabh Bachchan
Finished splitting class Brad Pitt
Finished splitting class Dwayne Johnson
Finished splitting class Kashyap
Finished splitting class Elizabeth Olsen
Finished splitting class Camila Cabello
Finished splitting class Vijay Deverakonda
Finished splitting class Courtney Cox
Finished splitting class Ellen Degeneres
Finished splitting class Margot Ro
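
The per-class 80/20 split above can be illustrated without touching the file system. The sketch below is a minimal stand-in that uses only the standard library. The helper `split_per_class` and the file names are hypothetical, and it shuffles with `random.Random` instead of calling scikit-learn's `train_test_split`, so the exact assignment of files will differ from the notebook's.

```python
import random

def split_per_class(files_by_class, val_fraction=0.2, seed=42):
    """Split each class's file list into train and validation parts."""
    rng = random.Random(seed)
    train, val = {}, {}
    for name, files in files_by_class.items():
        shuffled = sorted(files)
        rng.shuffle(shuffled)
        # Keep at least one validation file when the class has more than one file.
        n_val = max(1, round(len(shuffled) * val_fraction)) if len(shuffled) > 1 else 0
        val[name] = shuffled[:n_val]
        train[name] = shuffled[n_val:]
    return train, val

# Hypothetical file names, ten per class.
files_by_class = {
    "Tom Cruise": [f"Tom Cruise_{i}.jpg" for i in range(10)],
    "Brad Pitt": [f"Brad Pitt_{i}.jpg" for i in range(10)],
}
train, val = split_per_class(files_by_class)
for name in files_by_class:
    print(name, len(train[name]), "train /", len(val[name]), "val")
```

Splitting inside each class keeps every identity represented in both folders, which `ImageFolder` later relies on to assign the same label index to a class in the training and validation sets.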

In [None]:
!pip install facenet-pytorch

Collecting facenet-pytorch
  Downloading facenet_pytorch-2.6.0-py3-none-any.whl.metadata (12 kB)
Collecting Pillow<10.3.0,>=10.2.0 (from facenet-pytorch)
  Downloading pillow-10.2.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (9.7 kB)
Collecting torch<2.3.0,>=2.2.0 (from facenet-pytorch)
  Downloading torch-2.2.2-cp310-cp310-manylinux1_x86_64.whl.metadata (26 kB)
Collecting torchvision<0.18.0,>=0.17.0 (from facenet-pytorch)
  Downloading torchvision-0.17.2-cp310-cp310-manylinux1_x86_64.whl.metadata (6.6 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch<2.3.0,>=2.2.0->facenet-pytorch)
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch<2.3.0,>=2.2.0->facenet-pytorch)
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch<2.3.0,>=2.2.0->facenet-pytorch)
  Downloading nvidia_cuda_

In [None]:
import os
import random

import joblib
import matplotlib.pyplot as plt
import numpy as np
import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1, extract_face
from sklearn import metrics, svm
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from torchvision import datasets, transforms

## Utility Functions

The following utility functions are defined:

* **whitens(img):** Performs image whitening by subtracting the mean and dividing by the standard deviation of the image.
* **extract_features(mtcnn, facenet, img):** Extracts facial features from an image using the MTCNN face detection and the InceptionResnetV1 model. Returns the bounding boxes and embeddings of detected faces.
* **dataset_to_embeddings(dataset, mtcnn, facenet):** Converts a dataset of images into a list of embeddings and labels using the MTCNN and InceptionResnetV1 models.
* **train(embeddings, labels):** Fits an SVM classifier (svm.SVC with probability=True, which enables predict_proba) on the embeddings and labels and returns the fitted model.

In [None]:
import torch
import numpy as np
from torchvision import transforms
from PIL import Image
from sklearn import svm
from facenet_pytorch import extract_face


def whitens(img):
    """Normalize a tensor to zero mean and unit variance, staying on its device."""
    mean = img.mean()
    std = img.std()
    # Lower-bound the divisor so a constant image does not divide by zero
    std_adj = std.clamp(min=1.0 / (float(img.numel()) ** 0.5))
    return (img - mean) / std_adj


def extract_features(mtcnn, facenet, img, device=None):
    """Detect faces in an image tensor; return (boxes, embeddings) or (None, None)."""
    if device is None:
        # Default to the device the embedding model already lives on
        device = next(facenet.parameters()).device
    # Convert the tensor back to a PIL image for detection and cropping
    img = transforms.ToPILImage()(img.squeeze(0).cpu())
    bbs, _ = mtcnn.detect(img)
    if bbs is None:
        # If no face is detected
        return None, None

    # Crop every detected face and batch the crops on the target device
    faces = torch.stack([extract_face(img, bb) for bb in bbs]).to(device)
    whitened_faces = whitens(faces)
    with torch.no_grad():  # inference only, no gradients needed
        embeddings = facenet(whitened_faces).cpu().numpy()

    return bbs, embeddings


def dataset_to_embeddings(dataset, mtcnn, facenet, device=None):
    """Return (embeddings, labels) for every image in an ImageFolder dataset."""
    transform = transforms.Compose([
        transforms.Resize(1024),
        transforms.ToTensor()
    ])

    embeddings = []
    labels = []
    for img_path, label in dataset.samples:
        print(img_path)

        img = transform(Image.open(img_path).convert('RGB')).unsqueeze_(0)
        _, embedding = extract_features(mtcnn, facenet, img, device=device)

        if embedding is None:
            print("Could not find face on {}".format(img_path))
            continue
        if embedding.shape[0] > 1:
            print("Multiple faces detected for {}, taking one with highest probability".format(img_path))
            # Keep only the first detected face
            embedding = embedding[0, :]
        embeddings.append(embedding.flatten())
        labels.append(label)

    return np.stack(embeddings), labels


def train(embeddings, labels):
    """Fit an SVM classifier with probability estimates enabled."""
    clf = svm.SVC(probability=True)
    clf.fit(embeddings, labels)
    return clf
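
To make the whitening step concrete, here is a minimal NumPy sketch of the same computation as `whitens`, applied to a random array standing in for a cropped face. The name `whiten` and the random input are illustrative assumptions. `ddof=1` is used because `torch.Tensor.std()` applies the unbiased estimator by default.

```python
import numpy as np

def whiten(x):
    """NumPy counterpart of whitens(): zero mean, unit variance."""
    mean = x.mean()
    std = x.std(ddof=1)  # match torch's default unbiased estimator
    # Lower bound on the divisor so a constant image does not divide by zero.
    std_adj = max(std, 1.0 / np.sqrt(x.size))
    return (x - mean) / std_adj

rng = np.random.default_rng(0)
face = rng.uniform(0, 255, size=(3, 160, 160))  # stand-in for one cropped face
white = whiten(face)
print(float(white.mean()), float(white.std(ddof=1)))

flat = whiten(np.full((3, 160, 160), 128.0))  # constant image stays finite
print(bool(np.isfinite(flat).all()))
```

The clamp only matters for nearly constant inputs. For ordinary images the divisor is simply the standard deviation.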


## Model Initialization
The code initializes the MTCNN and InceptionResnetV1 models for face detection and feature extraction, respectively:

In [None]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

mtcnn = MTCNN(keep_all=True, thresholds=[0.6, 0.7, 0.9], device=device)
facenet = InceptionResnetV1(pretrained='vggface2').eval()

facenet = facenet.to(device)

## Dataset Preparation
The code defines the path to the dataset folder and creates ImageFolder datasets for training and validation:

In [None]:

data_path = path_folder
train_path = "/kaggle/working/train"
val_path = "/kaggle/working/val"

dataset_train = datasets.ImageFolder(root=train_path)
dataset_val = datasets.ImageFolder(root=val_path)

## Generating Embeddings
The code generates embeddings and labels for the training and validation datasets using the dataset_to_embeddings function:

In [None]:
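# Pass the selected device explicitly so the cell also runs on CPU-only machines
X_train, y_train = dataset_to_embeddings(dataset_train, mtcnn, facenet, device=device)
X_test, y_test = dataset_to_embeddings(dataset_val, mtcnn, facenet, device=device)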

X_train_class_idx = dataset_train.class_to_idx
X_test_class_idx = dataset_val.class_to_idx

embeddings, labels, class_to_idx = X_train, y_train, X_train_class_idx

/kaggle/working/train/Akshay Kumar/Akshay Kumar_0.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_1.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_10.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_11.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_12.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_15.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_16.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_17.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_18.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_2.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_20.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_23.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_24.jpg
Multiple faces detected for /kaggle/working/train/Akshay Kumar/Akshay Kumar_24.jpg, taking one with highest probability
/kaggle/working/train/Akshay Kumar/Akshay Kumar_25.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_26.jpg
/kaggle/working/train/Akshay Kumar/Akshay Kumar_28.jpg
/ka

## Training
The code trains a classification model using the training embeddings and labels:

In [None]:
clf = train(embeddings, labels)

import json
import numpy as np

# Convert each value to a plain list if it is an ndarray so it can be serialized as JSON
data = {
    "embeddings": embeddings.tolist() if isinstance(embeddings, np.ndarray) else embeddings,
    "labels": labels.tolist() if isinstance(labels, np.ndarray) else labels,
    "class_to_idx": class_to_idx.tolist() if isinstance(class_to_idx, np.ndarray) else class_to_idx
}

# Save the data as JSON in a .txt file
file_path = 'data.txt'
with open(file_path, 'w', encoding='utf-8') as file:
    json.dump(data, file)
print("Data successfully saved to", file_path)

# Save the trained model
joblib.dump(clf, 'face_recognition_model.pkl')
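
The training step can be tried in isolation on synthetic data. The sketch below is an illustration under stated assumptions: the 512-dimensional clusters are random stand-ins for real face embeddings, and the `demo_*` names are hypothetical. Only the estimator, `svm.SVC(probability=True)`, is taken from the notebook.

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)
n_classes, per_class, dim = 3, 20, 512

# Synthetic stand-ins for face embeddings: one tight cluster per identity.
centers = rng.normal(size=(n_classes, dim))
demo_X = np.vstack([c + 0.05 * rng.normal(size=(per_class, dim)) for c in centers])
demo_y = np.repeat(np.arange(n_classes), per_class)

demo_clf = svm.SVC(probability=True, random_state=0)  # same estimator as train()
demo_clf.fit(demo_X, demo_y)

# A new sample near the second cluster, shaped (1, dim) as scikit-learn expects.
query = (centers[1] + 0.05 * rng.normal(size=dim)).reshape(1, -1)
pred = demo_clf.predict(query)
proba = demo_clf.predict_proba(query)
print(pred[0], proba.shape)
```

`predict_proba` returns one column per class in the order of `demo_clf.classes_`, which is what the later thresholding on the maximum probability relies on.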


## Model Evaluation
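The code evaluates the trained model by printing a classification report on the training embeddings and calculating the accuracy on the validation set. Because the report is computed on the same data the classifier was fitted on, only the validation accuracy estimates performance on unseen images: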

In [None]:
idx_to_class = {v: k for k, v in class_to_idx.items()}
print(idx_to_class)

target_names = list(map(lambda i: i[1], sorted(idx_to_class.items(), key=lambda i: i[0])))
print(metrics.classification_report(labels, clf.predict(embeddings), target_names=target_names))

# Predict labels for validation set and calculate accuracy
y_val_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_val_pred)
print('Validation Accuracy: {:.2f}%'.format(accuracy*100))
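
For readers unfamiliar with the numbers in the classification report, the sketch below computes accuracy and per-class precision and recall from scratch using only the standard library. The helper `per_class_report` and the label lists are hypothetical. The notebook itself uses `metrics.classification_report` and `accuracy_score`.

```python
from collections import Counter

def per_class_report(y_true, y_pred):
    """Return accuracy and {label: (precision, recall)} computed from scratch."""
    tp, pred_count, true_count = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            tp[t] += 1
    accuracy = sum(tp.values()) / len(y_true)
    report = {}
    for label in sorted(true_count):
        # Precision: share of predictions for this label that were right.
        precision = tp[label] / pred_count[label] if pred_count[label] else 0.0
        # Recall: share of this label's true samples that were found.
        recall = tp[label] / true_count[label]
        report[label] = (precision, recall)
    return accuracy, report

# Hypothetical labels for six validation images across three classes.
demo_true = [0, 0, 1, 1, 2, 2]
demo_pred = [0, 1, 1, 1, 2, 0]
accuracy, report = per_class_report(demo_true, demo_pred)
print(f"accuracy = {accuracy:.2f}")
for label, (precision, recall) in report.items():
    print(label, round(precision, 2), round(recall, 2))
```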

## Visualization
The code visualizes the embeddings in a 2D space using PCA and plots a scatter plot with colored points representing different classes:



In [None]:
import os
import random
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Generate a random color
def generate_random_color():
    """Return a random color as a hex string that matplotlib accepts."""
    return "#{:06x}".format(random.randint(0, 0xFFFFFF))

# Assign a random color to each class in the data folder
def assign_colors(data_folder):
    """
    Assign a random color to each class in the data folder.
    Returns a dictionary with the class name as key and the color as value.
    """
    class_labels = [d for d in os.listdir(data_folder) if os.path.isdir(os.path.join(data_folder, d))]
    colors = {label: generate_random_color() for label in class_labels}
    return colors

# Your data folder
data_folder = path_folder
color_mapping = assign_colors(data_folder)

# Print the assigned colors
for label, color in color_mapping.items():
    print(f"'{label}' : '{color}',")

# Run PCA to reduce the embeddings (X_train and y_train must already exist)
pca = PCA(n_components=2)  # Reduce to 2 dimensions for plotting
embeddings_2d = pca.fit_transform(X_train)

# Map numeric labels to class names (idx_to_class must already exist)
mapped_labels = [idx_to_class[label] for label in y_train]

# Look up the color of a label in color_mapping.
# Named differently from assign_colors above so that function is not overwritten.
def color_for_label(label):
    return color_mapping.get(label, "#000000")  # Default to black if the label is not in color_mapping

# Build the list of colors for all points
colors = list(map(color_for_label, mapped_labels))

# Draw the scatter plot of the 2D embeddings
plt.scatter(embeddings_2d[:, 0], embeddings_2d[:, 1], c=colors)
plt.xlabel("PCA Dimension 1")
plt.ylabel("PCA Dimension 2")
plt.title("Visualization of Embeddings with PCA")
plt.show()
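
As a rough illustration of what the PCA step computes, the sketch below projects synthetic 512-dimensional points onto their first two principal components using a plain SVD. The helper `pca_2d` and the two random clusters are assumptions made for the example. scikit-learn's `PCA` may return components with flipped signs, which does not change the shape of the scatter plot.

```python
import numpy as np

def pca_2d(X):
    """Project rows of X onto their first two principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return X_centered @ Vt[:2].T, explained[:2]

rng = np.random.default_rng(0)
# Synthetic stand-in for 512-d embeddings of two identities.
demo_X = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(30, 512)),
    rng.normal(loc=1.0, scale=0.1, size=(30, 512)),
])
points, explained = pca_2d(demo_X)
print(points.shape, explained.round(3))
```

When the identities are well separated in embedding space, most of the variance falls on the first component and the two groups land on opposite sides of the plot.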


## Face Recognition

The code performs face recognition on randomly selected images from the validation set. It predicts the labels for the images and displays the predicted and actual labels along with the image:

In [None]:

transform_img = transforms.Compose([transforms.Resize(1024)])
transform = transforms.Compose([
        transforms.Resize(1024),
        transforms.ToTensor()
    ])



dataset_val = datasets.ImageFolder(root="/kaggle/working/val")
val_samples = dataset_val.samples


random_samples = random.choices(val_samples, k=5)

for img_path, true_label in random_samples:

    img_ = transform_img(Image.open(img_path).convert('RGB'))
    img = transform(Image.open(img_path).convert('RGB'))

    # Extract features
    _, embedding = extract_features(mtcnn, facenet, img, device=device)
    if embedding is None:
        print("Could not find face on {}".format(img_path))
        continue
    if embedding.shape[0] > 1:
        print("Multiple faces detected for {}, taking one with highest probability".format(img_path))
    # Keep the first face as a (1, n_features) row, the 2-D shape scikit-learn expects
    embedding = embedding[0, :].reshape(1, -1)

    predicted_label = clf.predict(embedding)
    print(clf.predict_proba(embedding))

    predicted_class = idx_to_class[predicted_label[0]]
    true_class = idx_to_class[true_label]

    plt.imshow(img_)
    plt.title(f'Predicted: {predicted_class}, Actual: {true_class}')
    plt.axis('off')
    plt.show()
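
One detail in the loop above is easy to trip over: indexing a single face out of the embeddings array yields a 1-D vector, while scikit-learn's `predict` and `predict_proba` expect a 2-D array with one row per sample. The short sketch below shows the shapes involved, using a zero array as a stand-in for real embeddings (the `demo_*` names are hypothetical).

```python
import numpy as np

# Stand-in for the (n_faces, 512) array returned by extract_features.
demo_embeddings = np.zeros((2, 512))

first_face = demo_embeddings[0, :]     # 1-D vector, one face
as_row = first_face.reshape(1, -1)     # 2-D array, one sample per row
print(first_face.shape, as_row.shape)
```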


## Intruder Recognition

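The code performs face recognition on a single image specified by img_path and displays the predicted label along with the image. If the highest class probability returned by predict_proba is below the 0.6 threshold, the face is labelled "Intruder" instead of one of the known classes. The first cell reloads the saved embeddings and the trained model: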

In [None]:
import json
import numpy as np
import joblib

# File paths
data_file_path = '/kaggle/input/model-train/data.txt'
model_file_path = '/kaggle/input/model-train/face_recognition_model.pkl'

# Reload the data
with open(data_file_path, 'r', encoding='utf-8') as file:
    data = json.load(file)

# Convert the saved data back to its original types
embeddings = np.array(data["embeddings"])
labels = np.array(data["labels"])
class_to_idx = data["class_to_idx"]
# Rebuild the index-to-name mapping so this cell also works in a fresh session
idx_to_class = {v: k for k, v in class_to_idx.items()}

print("Data successfully reloaded from", data_file_path)

# Load the trained model
clf = joblib.load(model_file_path)
print("Model successfully reloaded from", model_file_path)
transform_img = transforms.Compose([transforms.Resize(1024)])
transform = transforms.Compose([
        transforms.Resize(1024),
        transforms.ToTensor()
    ])


In [None]:
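import os

img_path = '/kaggle/input/avengers-face-recognition/cropped_images/robert_downey_jr/robert_downey_jr13.png'

img_ = transform_img(Image.open(img_path).convert('RGB'))
img = transform(Image.open(img_path).convert('RGB'))
_, embedding = extract_features(mtcnn, facenet, img, device=device)
if embedding is None:
    raise ValueError("Could not find face on {}".format(img_path))
# Keep the first detected face as a (1, n_features) row for scikit-learn
embedding = embedding[0, :].reshape(1, -1)

predicted_label = clf.predict(embedding)
# Convert the label index back to the original class name
predicted_class = idx_to_class[predicted_label[0]]
probabilities = clf.predict_proba(embedding)
print(probabilities)

# Treat low-confidence predictions as unknown faces
thres = 0.6
if np.max(probabilities) < thres:
    predicted_class = 'Intruder'

# Use the parent folder name as the actual identity of the test image
actual_class = os.path.basename(os.path.dirname(img_path))

plt.imshow(img_)
plt.title(f'Predicted: {predicted_class}, Actual: {actual_class}')
plt.axis('off')
plt.show()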
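
The rejection rule used above can be isolated into a small helper. This is a minimal sketch, not part of the notebook: `label_or_intruder`, the three-class mapping, and the probability rows are hypothetical, and it assumes that column `i` of the probability row corresponds to label index `i` (true when the classifier's `classes_` are `0..n-1`, as with `ImageFolder` labels).

```python
def label_or_intruder(probabilities, idx_to_name, threshold=0.6):
    """Return (class name, probability), or ('Intruder', probability) below the threshold."""
    best_idx = max(range(len(probabilities)), key=lambda i: probabilities[i])
    best_prob = probabilities[best_idx]
    if best_prob < threshold:
        return "Intruder", best_prob
    return idx_to_name[best_idx], best_prob

# Hypothetical three-class mapping and probability rows.
demo_idx_to_class = {0: "Brad Pitt", 1: "Tom Cruise", 2: "Zac Efron"}
confident = label_or_intruder([0.05, 0.90, 0.05], demo_idx_to_class)
uncertain = label_or_intruder([0.40, 0.35, 0.25], demo_idx_to_class)
print(confident)
print(uncertain)
```

The threshold trades false alarms against missed intruders: raising it rejects more unknown faces but also more genuine ones, so 0.6 is best treated as a starting point to tune on held-out images.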

In [None]:
# Paths to the input video and the annotated output video
video_path = '/kaggle/input/endgame/AVENGERS- ENDGAME All Movie Clips - Final Battle (2019).mp4'
output_video_path = '/kaggle/working/output_video.mp4'

import cv2
import torch
import numpy as np
from facenet_pytorch import MTCNN, InceptionResnetV1
from tqdm import tqdm

# Load the pre-trained InceptionResnetV1 model and the MTCNN detector
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
mtcnn = MTCNN(keep_all=True, device=device)
facenet = InceptionResnetV1(pretrained='vggface2').eval().to(device)

# Check that the video can be opened
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
    raise IOError("Cannot open video: " + video_path)

# Total number of frames, used to size the progress bar
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

# Set up the video writer; reuse the source frame rate (20 fps if it is not reported)
fps = cap.get(cv2.CAP_PROP_FPS) or 20.0
frame_size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_video_path, fourcc, fps, frame_size)

# Create the progress bar
pbar = tqdm(total=total_frames, desc="Processing video frames", position=0, leave=True)

# Read the video frame by frame
for _ in range(total_frames):
    ret, frame = cap.read()
    if not ret:
        break

    # Convert BGR to RGB because OpenCV reads images as BGR
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Detect all faces in the frame
    boxes, _ = mtcnn.detect(frame_rgb)

    # If at least one face was detected
    if boxes is not None:
        h, w, _ = frame.shape
        for box in boxes:
            x1, y1, x2, y2 = map(int, box)
            x1, y1, x2, y2 = max(0, x1), max(0, y1), min(w, x2), min(h, y2)
            cropped_face = frame_rgb[y1:y2, x1:x2]

            # Skip invalid (empty) face crops
            if cropped_face.size == 0:
                continue

            # Prepare the face for prediction; whiten it as in extract_features
            # so the input matches what the classifier was trained on
            face = cv2.resize(cropped_face, (160, 160))
            face = torch.tensor(face).permute(2, 0, 1).float().unsqueeze(0).to(device)
            face = whitens(face)

            # Compute the embedding and predict the identity
            with torch.no_grad():
                embedding = facenet(face).cpu().numpy()
                predicted_label = clf.predict(embedding.reshape(1, -1))
                predicted_class = idx_to_class[predicted_label[0]]

                # Get the prediction probabilities
                probabilities = clf.predict_proba(embedding)
                max_prob = np.max(probabilities)

                # If the confidence is below the threshold, label the face as "Intruder"
                if max_prob < 0.6:
                    predicted_class = 'Intruder'

                # Draw the box and the label on the frame
                label = f"{predicted_class} ({max_prob * 100:.1f}%)"
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2)

    # Write the processed frame; it is still in BGR order, so no color conversion is needed
    out.write(frame)

    # Update the progress bar
    pbar.update(1)

# Close the progress bar and release resources
pbar.close()
cap.release()
out.release()
print("Video processed and saved successfully.")
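
The bounding boxes returned by the detector are floating-point and can extend past the frame, which is why the loop above converts and clips them before cropping. The helper below is a hypothetical, standalone sketch of that step (`clamp_box` is not part of the notebook). It returns `None` when nothing of the box remains inside the frame.

```python
def clamp_box(box, width, height):
    """Convert a float (x1, y1, x2, y2) box to ints clipped to the frame."""
    x1, y1, x2, y2 = (int(v) for v in box)
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(width, x2), min(height, y2)
    if x2 <= x1 or y2 <= y1:
        return None  # nothing left to crop
    return x1, y1, x2, y2

partly_outside = clamp_box((-12.7, 30.2, 200.9, 500.0), width=640, height=360)
fully_outside = clamp_box((700.0, 10.0, 720.0, 40.0), width=640, height=360)
print(partly_outside)
print(fully_outside)
```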
