DeepFace (Keras) doesn't work with the current conda environment. The parser uses DeepFace, the same model used by the previous semester's face similarity parser. Please load the conda environment from the previous semester's face similarity parser. The face similarity ipynb file has also been added as a separate file for convenience.

# Initial Setup

This project uses a modified copy of the code from this GitHub repo [https://github.com/taesungp/swapping-autoencoder-pytorch], so it does not embed the original repo.

Download the FFHQ512 pretrained model for the Swapping Autoencoder Model from this link. [http://efrosgans.eecs.berkeley.edu/SwappingAutoencoder/swapping_autoencoder_models_and_test_images.zip]
(Note: this is an http (not https) address. In some browsers, such as Chrome, you may need to paste the URL directly into the address bar, or you can download the file using wget.)

There will be multiple folders once this zip file is uncompressed. The one that should be kept is "ffhq512_pretrained". Drag this folder into "spring24/swapping-autoencoder-pytorch/experiments/checkpoints".
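
As a scripted alternative to dragging the folder by hand, the step above can be sketched roughly as follows. This is a minimal sketch, not part of the original project: the helper name `extract_checkpoint` is hypothetical, and it assumes only that the archive contains a folder named `ffhq512_pretrained` somewhere in its tree.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_checkpoint(zip_path, checkpoints_dir, folder="ffhq512_pretrained"):
    """Copy only the `folder` subtree of the archive into checkpoints_dir.

    Hypothetical helper: the archive layout is an assumption, so members are
    matched by path component rather than by a fixed prefix.
    """
    checkpoints_dir = Path(checkpoints_dir)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            parts = PurePosixPath(member.filename).parts
            if member.is_dir() or folder not in parts:
                continue
            # Keep the path from `folder` downward and drop any parent folders.
            relative = Path(*parts[parts.index(folder):])
            target = checkpoints_dir / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(member))
            extracted.append(target)
    return extracted
```

Calling `extract_checkpoint("swapping_autoencoder_models_and_test_images.zip", "spring24/swapping-autoencoder-pytorch/experiments/checkpoints")` would leave the other folders in the archive untouched.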

In the spring24 folder, run `conda env create -f environment.yml`, then `conda activate swap-auto-pytorch` to enable the conda environment.

As noted in the block above, we were unable to get DeepFace to work with this swap-auto-pytorch conda environment. We're aware that a previous group uses this model for their project. Please load their conda environment if you want to attempt any code under the tag "Deepface". For any code under "Swapping Autoencoder", use the swap-auto-pytorch conda environment.
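
Because the two sections need different environments, a small guard at the top of each section can catch a wrong environment early. The sketch below is an assumption-laden illustration: `check_conda_env` is a hypothetical helper, and it relies on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets, which may not reflect the kernel's environment if the notebook kernel was launched another way.

```python
import os


def check_conda_env(expected, environ=None):
    """Return True when the active conda environment is named `expected`."""
    environ = os.environ if environ is None else environ
    # `conda activate` stores the name of the active environment here.
    active = environ.get("CONDA_DEFAULT_ENV")
    if active != expected:
        print(f"Active environment is {active!r}; expected {expected!r}.")
        return False
    return True
```

For example, `check_conda_env("swap-auto-pytorch")` could be called before any cell under "Swapping Autoencoder".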

Variable names:

`candidate_directory`: A full path to where the image of the textures should be (candidate images that are hand picked)

`image_directory`: A full path to where the image of the structure should be (images of missing people that we want to apply homelessness to)

`output_file_name`: A file name of the csv that DeepFace will output after matching faces from image_directory to candidate_directory.

`parsed_output_file_name`: A file name of the csv that the notebook will output after parsing through output_file_name.

`output_dir`: A full path to where the output_file_name is located

`parsed_output_dir`: A full path to where the parsed_output_file_name is located
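
The relationship between the file names and the full paths above can be sketched as follows. This is only an illustration: `build_paths` and `work_dir` are hypothetical names, not part of the notebook. The point is that both file names must be defined before the paths that use them.

```python
import os


def build_paths(work_dir, output_file_name, parsed_output_file_name):
    """Derive the two full csv paths from a working folder and the file names."""
    return {
        # Full path of the csv written by the DeepFace step.
        "output_dir": os.path.join(work_dir, output_file_name),
        # Full path of the csv written by the parsing step.
        "parsed_output_dir": os.path.join(work_dir, parsed_output_file_name),
    }
```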

After the autoencoder runs, run the matplotlib visualizer to see the results.

The matplotlib visualizer will print the structure image file name and texture image file name. Identify which alpha value looks good and the result will be in the folder spring24/swapping-autoencoder-pytorch/results/ffhq512_pretrained/simpleswapping/.

Once you've determined which image looks good, find the image in the folder specified above. The file will be named {structure_image_file_name}\_{texture_image_file_name}\_{alpha_value}.png
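
The naming pattern above can be sketched as a small helper for locating a chosen result. This is a hedged sketch: `result_image_path` is a hypothetical name, and it assumes the file extension is dropped from both input names and that alpha is written with two decimals, mirroring how the display cell in this notebook builds its file names.

```python
import os


def result_image_path(results_dir, structure_file, texture_file, alpha):
    """Build the expected result path for one structure/texture/alpha combination."""
    # Assumption: result names use the input file names without their extension.
    structure_stem = os.path.splitext(os.path.basename(structure_file))[0]
    texture_stem = os.path.splitext(os.path.basename(texture_file))[0]
    return os.path.join(results_dir, f"{structure_stem}_{texture_stem}_{alpha:.2f}.png")


# The display cell steps alpha from 0.00 to 1.00 in increments of 0.25.
alphas = [i * 0.25 for i in range(5)]
```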

NOTE: When adding to candidate_directory, do not add AI generated images of homeless people. Almost all of them perform extremely poorly.

In [6]:
# CHANGE THIS DIRECTORY TO WHERE TEXTURES ARE
candidate_directory = '/projectnb/sparkgrp/ml-atfal-mafkoda-grp/google_candidate/'
# CHANGE THIS DIRECTORY TO NEW SET OF IMAGES FOR STRUCTURE
image_directory = '/projectnb/sparkgrp/ml-atfal-mafkoda-grp/missing_children_johndoe_reunited_images_bounding_box/'

# Define both file names first; the full paths below are built from them
output_file_name = 'google-trial.csv'
parsed_output_file_name = 'google-parsed_trial.csv'
output_dir = "/projectnb/sparkgrp/ml-atfal-mafkoda-grp/simonkye/" + output_file_name
parsed_output_dir = "/projectnb/sparkgrp/ml-atfal-mafkoda-grp/simonkye/" + parsed_output_file_name

# Deepface

In [1]:
# Import for both DeepFace & Autoencoder
import os
import subprocess
from tqdm import tqdm
import glob
import argparse
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from PIL import Image

In [2]:
# Import for DeepFace
from deepface import DeepFace

DeepFace keeps a checkpoint of previous inputs, so it sometimes still has access to photos that no longer exist in the directories specified above. This block attempts to remove those checkpoints.

In [4]:
directory = '/projectnb/sparkgrp/ml-atfal-mafkoda-grp/candidate_repository'
file_list = os.listdir(directory)

for file_name in file_list:
    if file_name.endswith('.pkl'):
        os.remove(os.path.join(directory, file_name))

This block runs the actual DeepFace model.

In [None]:
class FaceRecognition:
    def __init__(self, missing_dir, unknown_dir, output_path, model='VGG-Face', metric='euclidean_l2', detector_backend='mtcnn'):
        # Initialize the FaceRecognition class with directories, output path, model, metric, and detector backend
        self.missing_dir = missing_dir  # Directory of missing persons' images
        self.unknown_dir = unknown_dir  # Directory of unknown persons' images
        self.output_path = output_path  # Path for the output CSV file
        self.model = model  # Face recognition model
        self.metric = metric  # Distance metric for comparison
        self.detector_backend = detector_backend  # Backend for face detection
        # self.metric_col = self.model + "_" + self.metric  # Column name for metric in the output DataFrame

    def detect_and_match_faces(self):
        # Detect and match faces in images from the missing directory
        image_paths = glob.glob(os.path.join(self.missing_dir, "*"))  # Get all image paths
        results = []  # List to store the results
        for path in image_paths:
            # Check the file extension
            ext = os.path.splitext(path)[1].lower()
            if ext in [".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"]:
                # Extract faces from the image
                detected_img_objs = DeepFace.extract_faces(path, enforce_detection=False, detector_backend=self.detector_backend)
                
                if detected_img_objs:
                    # Find matches in the unknown directory using DeepFace
                    dfs_list = DeepFace.find(
                        img_path=path,
                        db_path=self.unknown_dir, 
                        model_name=self.model,
                        distance_metric=self.metric,
                        enforce_detection=False,
                        detector_backend=self.detector_backend
                    )

                    dfs = dfs_list[0]

                    
                    dfs['identity'] = dfs['identity'].str.split('/').str.get(-1)
                    print(dfs.head())

                    # Check if the DataFrame is not empty and append the results
                    if not dfs.empty:
                        matched_filenames = ', '.join(dfs['identity'].tolist())
                        print(matched_filenames)
                        # metric_values = ', '.join(map(str, dfs[self.metric_col].tolist()))

                        results.append({
                            'missing_filename': path,
                            'unknowns_matched_filenames': matched_filenames,
                            'distance': dfs['distance']
                        })

        df = pd.DataFrame(results)  # Create DataFrame from results
        df.to_csv(self.output_path, index=False)  # Save the results to a CSV file

def parse_args():
    # Define command line arguments for the script
    parser = argparse.ArgumentParser(description='Face recognition')
    parser.add_argument('--model', type=str, default='VGG-Face', help='Model to use for Face Recognition')
    parser.add_argument('--metric', type=str, default='euclidean_l2', help='Metrics to use for Face Recognition')
    parser.add_argument('--missing_dir', required=True, type=str, help='Directory containing images of missing persons')
    parser.add_argument('--unknown_dir', required=True, type=str, help='Directory containing images of unknown persons')
    parser.add_argument('--output_path', required=True, type=str, help='Path to the output CSV file')
    parser.add_argument('--detector_backend', type=str, default='mtcnn', help='Detector backend to use')
    return parser.parse_args()

if __name__ == '__main__':
    fr = FaceRecognition(
        missing_dir=image_directory,
        unknown_dir=candidate_directory,
        output_path=output_dir  # full path, so the parsing cell can read the same file
    )
    fr.detect_and_match_faces()


## (After DeepFace Face Similarity)
This code block assumes you already have the outputs from FaceSimilarity. For each image it picks the closest match that isn't the image itself (that's why the `< 0.001` check exists). The image from the image repository is used as the structure, and its most similar image from the candidate repository is used as the texture.

The DeepFace model keeps checkpoints, which can lead it to match faces that no longer exist in candidate_directory or image_directory. The code below therefore skips any match whose name contains '-checkpoint', so do not name any input file with the string '-checkpoint'.

In [4]:
df = pd.read_csv(output_dir)

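for index, row in df.iterrows():
    row['unknowns_matched_filenames'] = row['unknowns_matched_filenames'].split(', ')
    row['missing_filename'] = row['missing_filename'].split('/')[-1]
    # iterrows() may yield a copy of the row, so write the base file name back explicitly
    df.loc[index, 'missing_filename'] = row['missing_filename']
    
    # The 'distance' cell holds the printed form of a pandas Series, so splitting on
    # whitespace yields row labels and distances interleaved (hence the i - 1 below)
    data = row['distance']
    datapoints = data.split()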
    df.loc[index, 'valid_found'] = False
    best_match = None

    for i in range(len(datapoints)):
        try:
            if float(datapoints[i]) < 0.001:
                continue
            filename = row['unknowns_matched_filenames'][i - 1]
            if '-checkpoint' not in filename:
                best_match = filename
                df.loc[index, 'valid_found'] = True
                break
        except ValueError:
            break

    if best_match is None:
        df.loc[index, 'best_match'] = row['missing_filename']
    else:
        df.loc[index, 'best_match'] = best_match
        
df.to_csv(parsed_output_dir, index=False)  # full path, so the autoencoder cells can read it back
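
The selection rule described above (skip near-zero distances, skip names containing '-checkpoint') can also be sketched without pandas. This is a minimal sketch under stated assumptions: both function names are hypothetical, the 'distance' cell is assumed to be the printed form of a pandas Series (one "label value" pair per line plus a "Name: ..." footer), and rows are assumed to be ordered from closest to farthest match.

```python
def parse_distance_cell(cell):
    """Return (row_label, distance) pairs parsed from the text of a printed Series."""
    pairs = []
    for line in cell.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skips the "Name: ..." footer and any "..." truncation row
        try:
            pairs.append((int(parts[0]), float(parts[1])))
        except ValueError:
            continue
    return pairs


def pick_best_match(filenames, pairs, min_distance=0.001):
    """Return the first acceptable file name, or None when nothing qualifies."""
    for row_label, distance in pairs:
        if distance < min_distance:
            continue  # a near-zero distance is treated as the image matching itself
        if row_label >= len(filenames):
            continue  # no file name is available for this row
        name = filenames[row_label]
        if "-checkpoint" in name:
            continue  # stale checkpoint copy
        return name
    return None
```

Using the row label to look up the file name keeps the two lists aligned even when the printed Series was truncated.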

# Swapping Autoencoder

In [None]:
# Import for both DeepFace & Autoencoder
import os
import subprocess
from tqdm import tqdm
import glob
import argparse
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from PIL import Image

## Without DeepFace Similarity
The code block below assumes a manually made csv is being used. The only columns needed are 'missing_filename' (structure image file name), 'valid_found' (set to True), and 'best_match' (texture image file name).

In [None]:
df = pd.read_csv(parsed_output_dir)
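
A manually made csv with the three required columns could be produced along these lines. This is a minimal sketch: `write_manual_pairs` is a hypothetical helper, and the file names in the usage note are placeholders.

```python
import csv


def write_manual_pairs(csv_path, pairs):
    """Write (structure file name, texture file name) pairs in the expected layout."""
    fieldnames = ["missing_filename", "valid_found", "best_match"]
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for structure_name, texture_name in pairs:
            writer.writerow({
                "missing_filename": structure_name,
                "valid_found": True,  # every hand-picked pair counts as valid
                "best_match": texture_name,
            })
```

For example, `write_manual_pairs(parsed_output_dir, [("structure_1.png", "texture_1.png")])` writes one pair to the path that the cell above reads.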

## Auto Swapping Encoder
This code block simply runs the Python commands needed to run the model, since the model is provided as Python files rather than ipynb files. To run the model without generating a csv, cd to the swapping-autoencoder-pytorch directory and run the following command.

`python -m experiments ffhq512_pretrained test simple_interpolation --input_structure_image [full path to structure image] --input_texture_image [full path to texture image]`
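
The same command can be assembled as an argument list, which is the form `subprocess.run` expects and which avoids quoting problems when paths contain spaces. This is a small sketch: `build_swap_command` is a hypothetical helper, and the arguments are exactly the ones shown in the command above.

```python
import shlex


def build_swap_command(structure_image, texture_image):
    """Return the argument list for one structure/texture swap."""
    return [
        "python", "-m", "experiments",
        "ffhq512_pretrained", "test", "simple_interpolation",
        "--input_structure_image", structure_image,
        "--input_texture_image", texture_image,
    ]


def preview_command(structure_image, texture_image):
    """Return the command as a single shell-quoted string for logging."""
    return shlex.join(build_swap_command(structure_image, texture_image))
```

Printing `preview_command(...)` before calling `subprocess.run(build_swap_command(...), check=True)` makes it easy to copy a failing command into a terminal.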

In [None]:
os.chdir("/projectnb/sparkgrp/ml-atfal-mafkoda-grp/swapping-autoencoder-pytorch")
for index, row in tqdm(df.iterrows(), total=len(df)):
    
    missing_filename = row['missing_filename'] 
    input_structure_image = image_directory + missing_filename
    directory_exists = os.path.exists(os.path.dirname(input_structure_image))
    best_match_filename = row['best_match']
    input_texture_image = candidate_directory + best_match_filename
    directory_exists = os.path.exists(os.path.dirname(input_texture_image)) and directory_exists
    if not directory_exists:
        print("invalid directory")
    elif not row['valid_found']:
        print("no valid image found for " + missing_filename)
    else:
        command = [
            "python",
            "-m",
            "experiments",
            "ffhq512_pretrained",
            "test",
            "simple_interpolation",
            "--input_structure_image",
            input_structure_image,
            "--input_texture_image",
            input_texture_image
        ]
        
        try:
            subprocess.run(command, check=True)
        except subprocess.CalledProcessError as e:
            print(f"Error executing command: {e}")

## Displaying results of Autoencoder

In [None]:
missing_structure = 0
missing_texture = 0
missing_results = 0
def display_images(image_folder, image_prefix, image_suffix, input_structure_image, input_texture_image):
    global missing_structure, missing_texture, missing_results
    # Create a figure and grid layout
    fig = plt.figure(figsize=(12, 3))
    gs = gridspec.GridSpec(1, 7)  # 1 row, 7 columns

    # Display the input_structure_image
    try:
        ax1 = plt.subplot(gs[0, 0])
        ax1.imshow(Image.open(input_structure_image))
        ax1.axis('off')
        ax1.set_title('Structure')
    except OSError:  # covers missing and unreadable image files
        print(f"File not found: {input_structure_image}")
        missing_structure += 1

    # Display the interpolated images (alpha = 0.00, 0.25, 0.50, 0.75, 1.00)
    try:
        for i in range(5):
            image_path = os.path.join(image_folder, f"{image_prefix}_{image_suffix}_{i*0.25:.2f}.png")
            ax = plt.subplot(gs[0, i+1])
            ax.imshow(Image.open(image_path))
            ax.axis('off')
            ax.set_title(f"{i*0.25:.2f}")
    except OSError:
        print(f"File not found: {image_prefix}_{image_suffix}_{i*0.25:.2f}.png")
        missing_results += 1  # a result image is missing

    # Display the input_texture_image
    try:
        ax7 = plt.subplot(gs[0, 6])
        ax7.imshow(Image.open(input_texture_image))
        ax7.axis('off')
        ax7.set_title('Texture')
    except OSError:
        print(f"File not found: {input_texture_image}")
        missing_texture += 1  # the texture image is missing
    
    plt.tight_layout()
    plt.show()

for index, row in tqdm(df.iterrows()):
    if not row['valid_found']:
        continue
    missing_filename = row['missing_filename'] 
    input_structure_image = image_directory + missing_filename
    directory_exists = os.path.exists(os.path.dirname(input_structure_image))
    best_match_filename = row['best_match']
    input_texture_image = candidate_directory + best_match_filename
    print(input_structure_image)
    print(input_texture_image)
    directory_exists = os.path.exists(os.path.dirname(input_texture_image)) and directory_exists
    if not directory_exists or not row['valid_found']:
        print("Missing person's file not found or no valid candidate image found!")
    else:
        image_folder = "/projectnb/sparkgrp/ml-atfal-mafkoda-grp/swapping-autoencoder-pytorch/results/ffhq512_pretrained/simpleswapping/"
        # Result file names use the input file names without the .png extension
        image_prefix = row['missing_filename'].replace('.png', '')
        image_suffix = row['best_match'].replace('.png', '')
        
        display_images(image_folder, image_prefix, image_suffix, input_structure_image, input_texture_image)

print(f"Missing Structures: {missing_structure}")
print(f"Missing Textures: {missing_texture}")
print(f"Missing Results: {missing_results}")