# CELEBRITY IDENTIFICATION
---
Welcome to this notebook on celebrity identification. In this project we use the Face API, a face detection and recognition service from Microsoft Azure Cognitive Services, to recognize the faces of celebrities. Our dataset is the Celebrity Face Recognition Dataset, which contains images of famous people from fields such as acting, music, sport, and politics.

We use the Face API to detect faces in images and match them against a group of known celebrities. This kind of identification has applications in security systems, social media analytics, and marketing campaigns. The notebook walks through each step in turn: preparing the images, creating a PersonGroup, registering faces for each person, training the group, and identifying faces against it.

So, let's get started and dive into the exciting world of Celebrity Identification with the Face API!

## Installing required packages

In [None]:
!pip install --upgrade azure-cognitiveservices-vision-face

## Collecting Data

In [None]:
!gdown 0B5G8pYUQMNZnN0tPSi16RzYtMGM

## Importing required packages

In [67]:
import asyncio
import io
import os
import sys
import zipfile
import shutil
import time
import uuid
import requests
from urllib.parse import urlparse
from io import BytesIO
# To install this module, run:
# python -m pip install Pillow
from PIL import Image, ImageDraw
from azure.cognitiveservices.vision.face import FaceClient
from msrest.authentication import CognitiveServicesCredentials
from azure.cognitiveservices.vision.face.models import TrainingStatusType, Person, QualityForRecognition


## Data pre-processing

In [None]:
zip_files_dir = '/content/'
extract_dir = '/content/celebrity_images_extract'
destination_dir = '/content/celebrity_images'

# Create extract and destination directories if they don't exist
os.makedirs(extract_dir, exist_ok=True)
os.makedirs(destination_dir, exist_ok=True)

# Extract zip files to extract_dir
for zip_file in os.listdir(zip_files_dir):
    if zip_file.endswith('.zip'):
        with zipfile.ZipFile(os.path.join(zip_files_dir, zip_file), 'r') as zip_ref:
            zip_ref.extractall(extract_dir)

# Move subfolders from extract_dir to destination_dir
for folder in os.listdir(extract_dir):
    folder_path = os.path.join(extract_dir, folder)
    if os.path.isdir(folder_path):
        for sub_folder in os.listdir(folder_path):
            sub_folder_path = os.path.join(folder_path, sub_folder)
            if os.path.isdir(sub_folder_path):
                shutil.move(sub_folder_path, destination_dir)
        #os.rmdir(folder_path)
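
After the move, it can help to check how many images ended up in each celebrity folder before calling the API. The following is a minimal sketch, not part of the original notebook: `count_images_per_folder` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension list is an assumption about which files in the dataset are images.

```python
import os

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def count_images_per_folder(root_dir):
    """Return {folder_name: image_count} for each direct subfolder of root_dir."""
    counts = {}
    for entry in sorted(os.listdir(root_dir)):
        folder_path = os.path.join(root_dir, entry)
        if not os.path.isdir(folder_path):
            continue  # ignore loose files such as the downloaded zip
        counts[entry] = sum(
            1
            for name in os.listdir(folder_path)
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_folder(destination_dir)` would then show which folders are empty or unusually small.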

## Initializing parameters ...

In [None]:
# Set up the Face API client
face_key = "<MY FACE KEY>"
face_endpoint = "<MY FACE API ENDPOINT>"
face_client = FaceClient(face_endpoint, CognitiveServicesCredentials(face_key))
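
Rather than pasting the key into the notebook, the credentials could be read from the environment. This is a minimal sketch under stated assumptions: `load_face_credentials` is a hypothetical helper, and `FACE_KEY` and `FACE_ENDPOINT` are variable names chosen for illustration, not names required by the SDK.

```python
import os


def load_face_credentials(env=os.environ):
    """Read the Face key and endpoint from environment variables.

    FACE_KEY and FACE_ENDPOINT are hypothetical names chosen for this sketch.
    """
    key = env.get("FACE_KEY", "").strip()
    endpoint = env.get("FACE_ENDPOINT", "").strip()
    if not key or not endpoint:
        raise RuntimeError("Set FACE_KEY and FACE_ENDPOINT before creating the client.")
    return key, endpoint
```

The returned pair can be passed on as `FaceClient(endpoint, CognitiveServicesCredentials(key))`, exactly as in the cell above.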

In [None]:
# Set the path to the directory containing the celebrity images
images_dir = destination_dir

# Set the name for the celebrity recognition model
model_name = 'celebrity-recognition'

## Create the PersonGroup

In [None]:
# Create a new PersonGroup for the celebrities
face_client.person_group.create(person_group_id=model_name, name=model_name)
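
The Face API documentation restricts a PersonGroup ID to lowercase letters, digits, `-` and `_`, with a maximum length of 64 characters. A small check such as the one below could catch an invalid ID before the request is sent. It is a sketch only; `is_valid_person_group_id` is a hypothetical helper, not part of the SDK.

```python
import re

# Lowercase letters, digits, '-' and '_', between 1 and 64 characters.
_PERSON_GROUP_ID = re.compile(r"[a-z0-9_-]{1,64}")


def is_valid_person_group_id(candidate):
    """Return True if candidate matches the documented PersonGroup ID format."""
    return _PERSON_GROUP_ID.fullmatch(candidate) is not None
```

The ID used in this notebook, `'celebrity-recognition'`, passes this check.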

## Detect faces and register them to each person

In [None]:
# Loop through all the celebrity folders in the images directory
for celebrity_folder in os.listdir(images_dir):
    celebrity_id = celebrity_folder.lower().replace(' ', '-')
    celebrity_name = celebrity_folder.title()
    
    # Create a new Person for the celebrity; create() takes the display name,
    # and the slug is stored as user data
    person = face_client.person_group_person.create(model_name, name=celebrity_name, user_data=celebrity_id)
    
    # Loop through all the image files in the celebrity folder
    for file in os.listdir(os.path.join(images_dir, celebrity_folder)):
        image_path = os.path.join(images_dir, celebrity_folder, file)
        
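        # Add the image to the Person's face list if it is good enough for recognition
        with open(image_path, 'rb') as image_file:
            # Detect faces and request the quality-for-recognition attribute,
            # which needs detection_01 or detection_03 and recognition_03 or recognition_04.
            detected_faces = face_client.face.detect_with_stream(
                image_file,
                detection_model='detection_03',
                recognition_model='recognition_04',
                return_face_attributes=['qualityForRecognition'])
            # Skip images with no face or several faces (the target would be ambiguous)
            if len(detected_faces) != 1:
                continue
            # Skip images whose face is not of sufficient quality for recognition
            if detected_faces[0].face_attributes.quality_for_recognition != QualityForRecognition.high:
                continue
            # Rewind the stream: detection has already read it to the end
            image_file.seek(0)
            face_client.person_group_person.add_face_from_stream(model_name, person.person_id, image_file)
            print("face {} added to person {}".format(detected_faces[0].face_id, person.person_id))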
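
The registration loop only keeps images whose face is of high quality for recognition. That decision can be isolated in a small pure function, which makes it easy to test without calling the service. This is a sketch under assumptions: `should_register` is a hypothetical helper, it takes one plain-string quality label per detected face, and it also rejects images with zero or several faces, which is a design choice of this sketch.

```python
def should_register(face_qualities, required="high"):
    """Decide whether an image is usable as a training face.

    face_qualities: one quality label per detected face, as plain strings
    such as "low", "medium" or "high" (case-insensitive).
    """
    # Exactly one face is required so the registered face is unambiguous.
    if len(face_qualities) != 1:
        return False
    return face_qualities[0].lower() == required
```

For example, `should_register(["High"])` accepts the image, while `should_register(["high", "high"])` rejects it because two faces were found.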


## Train PersonGroup

In [None]:
# Train the person group
print("pg resource is {}".format(model_name))
rawresponse = face_client.person_group.train(model_name, raw=True)
print(rawresponse)

# Poll every 5 seconds until training succeeds or fails
while True:
    training_status = face_client.person_group.get_training_status(model_name)
    print("Training status: {}.".format(training_status.status))
    print()
    # Compare by value rather than identity so a plain-string status also matches
    if training_status.status == TrainingStatusType.succeeded:
        break
    elif training_status.status == TrainingStatusType.failed:
        face_client.person_group.delete(person_group_id=model_name)
        sys.exit('Training the person group has failed.')
    time.sleep(5)
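
The polling loop above runs until the service reports success or failure, so it never ends if training stalls. A bounded variant might look like the sketch below. `wait_for_training` is a hypothetical helper, the timeout value is an arbitrary choice, and it assumes the status compares equal to the strings `"succeeded"` and `"failed"`.

```python
import time


def wait_for_training(get_status, poll_seconds=5.0, timeout_seconds=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a final status or the timeout expires.

    get_status: zero-argument callable returning the current training status.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in ("succeeded", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError("Training did not finish within the timeout.")
        sleep(poll_seconds)
```

In this notebook, `get_status` could be `lambda: face_client.person_group.get_training_status(model_name).status`; the caller then decides what to do when the result is `"failed"`.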

## Identify a face against a defined PersonGroup

In [None]:
# Loop through all the image files in the directory again
for root, dirs, files in os.walk(images_dir):
    for file in files:
        # Load the image file
        image_path = os.path.join(root, file)
        with open(image_path, 'rb') as image_file:
            image_data = image_file.read()

        # Use the Face API to detect faces in the image (the SDK expects a stream, not raw bytes)
        detected_faces = face_client.face.detect_with_stream(BytesIO(image_data))

        # Check if any faces were detected
        if detected_faces:
            # Identify the celebrity in the image (identify accepts at most 10 face IDs per call)
            face_ids = [f.face_id for f in detected_faces][:10]
            identified_celebrities = face_client.face.identify(face_ids, model_name)
            # Keep only results that have at least one candidate
            matches = [result for result in identified_celebrities if result.candidates]
            if matches:
                celebrity_id = matches[0].candidates[0].person_id
                celebrity_name = face_client.person_group_person.get(model_name, celebrity_id).name
                print(f'{file} is a picture of {celebrity_name}')
            else:
                print(f'No celebrities identified in {file}')
        else:
            print(f'No faces detected in {file}')
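
Two details of identification are worth isolating: the service accepts at most 10 face IDs per identify call, and each result carries candidates with a confidence score. The sketch below shows one way to batch IDs and pick a candidate. `chunked` and `best_candidate` are hypothetical helpers, candidates are represented as plain `(person_id, confidence)` pairs, and the 0.5 threshold is an arbitrary value for illustration.

```python
def chunked(items, size=10):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def best_candidate(candidates, min_confidence=0.5):
    """Return the person_id with the highest confidence, or None.

    candidates: list of (person_id, confidence) pairs for one detected face.
    Returns None when the list is empty or the best score is below the threshold.
    """
    if not candidates:
        return None
    person_id, confidence = max(candidates, key=lambda pair: pair[1])
    return person_id if confidence >= min_confidence else None
```

Each batch from `chunked(face_ids)` could be passed to `face_client.face.identify`, and each result's candidates converted to pairs before calling `best_candidate`.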


## Conclusion & Summary
This notebook used the Face API from Microsoft Azure Cognitive Services for celebrity identification on the Celebrity Face Recognition Dataset. We created a PersonGroup, registered high-quality face images for each celebrity, trained the group, and identified faces in images against it. The same workflow could support security systems, social media analytics, and marketing campaigns.

In conclusion, the Face API is a practical tool for celebrity identification. Because the identification step here runs on the same images that were used for registration, measuring its accuracy would require a separate set of held-out images. As facial recognition technology continues to develop, we can expect it to open new opportunities across industries.