# **Similarity calculations**

This notebook calculates mean Average Precision (mAP) scores and top-k accuracy to evaluate the performance of generative and discriminative character identification models on movie frames. Its structure follows the objectives outlined in the thesis.



## **1. Environment Setup**
Install necessary libraries and clone the required repository.

In [None]:
!git clone https://github.com/Reouth/Movie-Character-Identification-With-Perosnalized-Generative-Models.git
%pip install -qq git+https://github.com/huggingface/diffusers.git
%pip install -q accelerate
%pip install bitsandbytes
%pip install git+https://github.com/openai/CLIP.git

## **2. Import Libraries**
Load necessary Python libraries and scripts.

In [None]:
import os
import pandas as pd

# Change directory to cloned repository
os.chdir('/content/Movie-Character-Identification-With-Perosnalized-Generative-Models')

from metrics import MetricsCalc
from handlers import CSVHandler

## **3. Mount Google Drive**
Store and retrieve files from Google Drive.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## **4. Upload CSV files**
* csvs_folder: folder containing the raw CSV score files used to calculate the metrics.
* results_folder: output folder for the CSV results.
* clip_model: True for the CLIP identification model, False for the diffusion identification model.

In [None]:
csvs_folder = "/content/drive/MyDrive/thesis_OO_SD/ex_machina/csv_results/red_filter/CLIP_imagic_embeds/a_red_filtered_photo" #@param {type:"string"}
results_folder = '/content/drive/MyDrive/thesis_OO_SD/ex_machina/similarity_results/red_filter/a_red_filtered_photo/CLIP_imagic_embeds' #@param {type:"string"}
clip_model = True #@param {type:"boolean"}
os.makedirs(results_folder, exist_ok=True)

# Load the raw score CSVs from the folder
csvs = CSVHandler.upload_csvs(csvs_folder)
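
For readers without access to the repository, the loading step can be approximated with plain pandas. This is only a sketch under stated assumptions: the helper `load_csvs` is hypothetical, and the actual return type of `CSVHandler.upload_csvs` is not shown in this notebook, so the (name, DataFrame) pairs below are an assumption.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csvs(folder):
    """Hypothetical helper: return (file stem, DataFrame) pairs for each CSV in folder."""
    return [(p.stem, pd.read_csv(p)) for p in sorted(Path(folder).glob("*.csv"))]


# Demonstrate on a throwaway folder with two made-up score files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "scores_a.csv").write_text("frame,score\n1,0.9\n2,0.4\n")
    Path(tmp, "scores_b.csv").write_text("frame,score\n1,0.7\n")
    loaded = load_csvs(tmp)

loaded_names = [name for name, _ in loaded]
loaded_rows = [len(df) for _, df in loaded]
```

Sorting the paths keeps the load order stable between runs, which makes the later result files easier to compare.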


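## **5. Top-k results**
* avg: True to average the scores per class, False to use all inputs individually.
* k: upper bound of the top-k range; with `range(1, k, 1)` the values 1 to k-1 are evaluated.
* pred_column_name: name of the prediction column in the CSV files.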

In [None]:
avg = False #@param {type:"boolean"}
k = 4 #@param {type:"integer"}
pred_column_name = 'cls_predicted' #@param {type:"string"}
# range(1, k) covers top-1 through top-(k-1); use range(1, k + 1) to include top-k itself
k_range = range(1, k, 1)

results = MetricsCalc.csv_to_topk_results(avg, clip_model, k_range, csvs, pred_column_name, results_folder)
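
As a reminder of what the metric measures, top-k accuracy is the fraction of inputs whose true class appears among the k highest-ranked predictions. The sketch below illustrates the idea on made-up data; it is not the implementation inside `MetricsCalc.csv_to_topk_results`, and the column names `cls_true` and `ranked_preds` as well as the class labels are hypothetical.

```python
import pandas as pd


def topk_accuracy(df, k, true_col="cls_true", ranked_col="ranked_preds"):
    """Fraction of rows whose true label is among the first k ranked predictions."""
    hits = [true in ranked[:k] for true, ranked in zip(df[true_col], df[ranked_col])]
    return sum(hits) / len(hits) if hits else 0.0


# Toy data (made up): each row holds the true class and a best-first prediction list.
toy_df = pd.DataFrame({
    "cls_true": ["char_a", "char_b", "char_c", "char_d"],
    "ranked_preds": [
        ["char_a", "char_d", "char_b"],
        ["char_c", "char_b", "char_a"],
        ["char_a", "char_d", "char_c"],
        ["char_b", "char_a", "char_c"],
    ],
})

# Evaluate top-1, top-2 and top-3 accuracy.
toy_topk = {k: topk_accuracy(toy_df, k) for k in range(1, 4)}
print(toy_topk)  # {1: 0.25, 2: 0.5, 3: 0.75}
```

Top-k accuracy can only stay equal or grow as k increases, so a curve that drops between consecutive k values usually points to a ranking or parsing problem.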


## **6. mAP results**


In [None]:
ap_results, mean_ap = MetricsCalc.calculate_average_precision(csvs_folder, results_folder, clip_model)
print(f'AP results: {ap_results}')
print(f'Mean Average Precision (mAP): {mean_ap:.4f}')
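
For reference, average precision (AP) for one ranked list is the mean of the precision values taken at each rank where a correct item appears, and mAP is the mean of the per-class AP values. The sketch below shows one common formulation on made-up relevance lists. It normalizes by the number of correct items found in the list, which is an assumption; `MetricsCalc.calculate_average_precision` may use a different definition or input format.

```python
def average_precision(relevance):
    """AP for one ranked list; relevance[i] is 1 if the item at rank i + 1 is correct."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(relevance_by_class):
    """Return the per-class AP values and their mean."""
    aps = {name: average_precision(rel) for name, rel in relevance_by_class.items()}
    mean_value = sum(aps.values()) / len(aps) if aps else 0.0
    return aps, mean_value


# Toy relevance lists (made up), ordered from highest to lowest score.
toy_relevance = {"class_a": [1, 0, 1, 0], "class_b": [0, 1, 0, 0]}
toy_aps, toy_map = mean_average_precision(toy_relevance)
print(toy_aps, round(toy_map, 4))
```

Here `class_a` has correct items at ranks 1 and 3, giving AP = (1/1 + 2/3) / 2, while `class_b` has a single correct item at rank 2, giving AP = 1/2.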

## **7. Move all files (optional)**
* Moves all files from the subfolders into one main folder, naming each file with its concatenated subfolder names.

In [None]:
source_folder = '/content/drive/MyDrive/thesis_OO_SD/ex_machina/similarity_results/red_filter'  #@param {type:"string"}
destination_folder = '/content/drive/MyDrive/thesis_OO_SD/ex_machina/similarity_results/red_filter_all'  #@param {type:"string"}
CSVHandler.move_csv_files(source_folder, destination_folder)
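
The flattening step can be pictured as follows. This is a minimal sketch, not the code behind `CSVHandler.move_csv_files`: the helper `flatten_csvs` is hypothetical, the underscore separator is an assumption, and it copies files rather than moving them so the originals stay in place.

```python
import shutil
import tempfile
from pathlib import Path


def flatten_csvs(source, destination):
    """Copy every CSV under source into destination, prefixing the subfolder names."""
    destination.mkdir(parents=True, exist_ok=True)
    new_names = []
    for csv_path in sorted(source.rglob("*.csv")):
        # e.g. prompt_a/clip/topk.csv -> prompt_a_clip_topk.csv
        flat_name = "_".join(csv_path.relative_to(source).parts)
        shutil.copy2(csv_path, destination / flat_name)
        new_names.append(flat_name)
    return new_names


# Demonstrate on a throwaway directory tree with made-up folder names.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    (src / "prompt_a" / "clip").mkdir(parents=True)
    (src / "prompt_a" / "clip" / "topk.csv").write_text("k,acc\n1,0.5\n")
    (src / "prompt_b").mkdir()
    (src / "prompt_b" / "topk.csv").write_text("k,acc\n1,0.7\n")
    flat_names = flatten_csvs(src, Path(tmp) / "dst")

print(flat_names)  # ['prompt_a_clip_topk.csv', 'prompt_b_topk.csv']
```

Building the new name from the relative path keeps files with identical base names (here two `topk.csv` files) from overwriting each other.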

## **8. Compare all results (optional)**
* Compares all CSV results and saves them to one file.

In [None]:
folder_path = '/content/drive/MyDrive/thesis_OO_SD/ex_machina/similarity_results/red_filter_all'  #@param {type:"string"}
output_folder = '/content/drive/MyDrive/thesis_OO_SD/ex_machina/similarity_results/red_filter_merged'  #@param {type:"string"}
os.makedirs(output_folder, exist_ok=True)
folders_names = os.listdir(folder_path)  # os.listdir already returns a list
CSVHandler.merge_csv_results(folder_path, folders_names, output_folder)
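
One way to picture the comparison is to stack the individual result tables with a column naming their origin and then pivot so each model gets its own column. The sketch below uses made-up numbers; the helper `merge_results` and the column names `k`, `accuracy` and `source` are hypothetical, and the layout written by `CSVHandler.merge_csv_results` may differ.

```python
import io

import pandas as pd


def merge_results(named_frames):
    """Stack result tables, tagging each row with the name of its source."""
    frames = [df.assign(source=name) for name, df in named_frames.items()]
    return pd.concat(frames, ignore_index=True)


# Toy result tables (made up) standing in for two result CSV files.
toy_results = {
    "clip": pd.read_csv(io.StringIO("k,accuracy\n1,0.50\n2,0.75\n")),
    "diffusion": pd.read_csv(io.StringIO("k,accuracy\n1,0.40\n2,0.60\n")),
}

merged = merge_results(toy_results)
# One row per k, one column per source, for side-by-side comparison.
comparison = merged.pivot(index="k", columns="source", values="accuracy")
print(comparison)
```

The long format in `merged` is convenient for saving to a single CSV, while the pivoted `comparison` table is easier to read when checking which model wins at each k.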