#**AI Saru: a tool for macaque detection and identification**

This is the tool developed for the Primates article *Deep Learning for Automatic Detection and Facial Recognition in Japanese Macaques: Illuminating Social Networks* by Paulet et al. (2024). You can find the article here: https://doi.org/10.1007/s10329-024-01137-5 and the preprint here: https://doi.org/10.48550/arXiv.2310.06489 .

Please note that this is still a work in progress. We will do our best to provide simple, beginner-friendly code so that anyone can use it and expand on it. If you need any help running the code or if you have any questions, feel free to contact Axel Molina by email at either molina.axel@ens.psl.eu or molina.axel.pro@gmail.com . You should not need any specialist knowledge to run this code.

#**What this will do**

This tool will take your **Japanese macaque** videos, **detect the faces** of any Japanese macaques in them, **track the faces** across each video and **identify the individuals**.

For the moment this tool can identify the 2023 **Koshima** population (famous for its potato washing!) and the 2024 **Shodoshima** population (famous for its "sarudango"!).

It will also give you simple **co-occurrence matrices** showing how many times two individuals appear together on video. There are different options for the matrices. Details are provided at the end of this colab.

These matrices can then be used easily with Gephi to create visual representations of social networks, or with R for statistics.

This code also generates annotated videos, in case you want to see the model track your macaques' faces. In future versions this will be optional.

# **What you will need**

#A Google Drive

To use this tool you will need to have, in a Google Drive:
- the weights of a **trained model for macaque detection** in .pt format
- the weights of a **trained model for macaque identification** in .pt format
- **videos** of your Japanese macaques in .mp4 format

You can find the .pt files for our AIs on the **GitHub** of the project here, along with some example videos: https://github.com/AxelCodaeMolina/AI-Saru . You can also use your own.

This code assumes that your Drive is organized as follows:
- a folder in your drive named "**AISaru**"
- a subfolder in the "AISaru" folder named "**models**"
- another subfolder in the "AISaru" folder named "**videos**"
- (the code should also create a folder named "output")
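
Before running anything, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the original tool: the helper name `check_layout` is hypothetical, and it assumes the Drive is mounted at `/drive` as in Part 0.

```python
from pathlib import Path


def check_layout(base_dir):
    """Return a list of problems found in the expected AISaru folder layout.

    Hypothetical helper: it only checks that the "models" and "videos"
    folders exist and contain .pt and .mp4 files respectively.
    """
    base = Path(base_dir)
    problems = []
    for sub, pattern in (("models", "*.pt"), ("videos", "*.mp4")):
        folder = base / sub
        if not folder.is_dir():
            problems.append(f"missing folder: {folder}")
        elif not any(folder.glob(pattern)):
            problems.append(f"no {pattern} file in {folder}")
    return problems


# Assumed mount point from Part 0; adjust if your Drive is mounted elsewhere
for problem in check_layout("/drive/MyDrive/AISaru"):
    print(problem)
```

If nothing is printed, both folders exist and each contains at least one file of the expected type.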

#A few clicks

Once you have that, the code will be easy to run: simply press the arrow on the left of each block of code. I recommend waiting until you see a little green arrow on the side of a block before clicking on the next one. If anything crashes, simply refresh the page and re-run everything.

The code may take a few hours to run, depending on the number of videos.

With a few adjustments this can also run on a computer using Python.
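
As one possible adjustment for a local run, the hard-coded `/drive/MyDrive/AISaru` paths could be derived from a single base folder. This sketch is an assumption about how one might organize it, not something the project provides; `BASE_DIR` and `aisaru_paths` are hypothetical names, and the file names `DET.pt` and `ID.pt` are the ones used later in this colab.

```python
from pathlib import Path


def aisaru_paths(base_dir):
    """Build the paths used throughout this notebook from one base folder."""
    base = Path(base_dir)
    return {
        "detector": base / "models" / "DET.pt",
        "classifier": base / "models" / "ID.pt",
        "videos": base / "videos",
        "output": base / "output",
    }


# Hypothetical local folder that mirrors the Drive layout described above
BASE_DIR = Path.home() / "AISaru"
paths = aisaru_paths(BASE_DIR)
print(paths["detector"])
```

On a local machine, the `google.colab` imports and the `!pip` and `!apt-get` lines would also need to be replaced by a regular installation of the packages.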

#**Part 0: connect the Drive and load the packages needed**

In [None]:
# first let's connect your well organized drive to this code, just click on the left grey arrow !

from google.colab import drive
drive.mount('/drive')


In [None]:
# now just click on the arrow here, this will load all the packages we need

# Install supervision for the tracker
!pip install supervision
import supervision as sv

# Install ultralytics to get the models
!pip install ultralytics
from ultralytics import YOLO


# Import numpy
import numpy as np
# Import pandas
import pandas as pd
# Import os
import os
# Import itertools combinations
from itertools import combinations
# Import files (used to download the results from Colab)
from google.colab import files

# Install FFmpeg
!apt-get update -qq && apt-get install -y ffmpeg

Perfect ! Now let's dive into it.

#**Part 1: processing the videos**

This is the longest and hardest part. First we will define some variables and functions for the rest of the code. Then we will use them on the videos in the drive.

If you are using a custom model, this is where you should edit the code: simply replace the "/drive/MyDrive/AISaru/models/ID.pt" in the first block of code with the path to the .pt file you will use.

In [None]:
# This is where we define the variables and functions needed for
# both the tracking and creation of a matrix with the results

# First we specify where to find the models we want to use for detection and classification
detector = YOLO("/drive/MyDrive/AISaru/models/DET.pt")
classifier = YOLO("/drive/MyDrive/AISaru/models/ID.pt")

# Then we define the tracker and annotator from the supervision package
byte_tracker = sv.ByteTrack()
annotator = sv.BoxAnnotator()

# And some variables for the following functions
data = []
detection_index = 0
video = ""

In [None]:
# This is the "aisaru" callback function that will be used to detect, track
# and identify the macaques, and to collect the results for the matrices

def aisaru(frame: np.ndarray, index: int) -> np.ndarray:

    global video, detection_index  # Specify that you are using the global variables
    results = detector(frame)[0]
    detections = sv.Detections.from_ultralytics(results)
    detections = byte_tracker.update_with_detections(detections)
    labels = [
        f"#{tracker_id} {results.names[int(class_id)]} {confidence:0.2f}"
        for confidence, class_id, tracker_id
        in zip(detections.confidence, detections.class_id, detections.tracker_id)
    ]

    # The classifier only runs on tracked faces; nothing happens if there is no detection
    for bbox_item, det_confidence, tracker_id in zip(detections.xyxy, detections.confidence, detections.tracker_id):
        x1, y1, x2, y2 = map(int, bbox_item)  # First we define the region of interest for classification
        roi = frame[max(y1, 0):y2, max(x1, 0):x2]
        if roi.size == 0:  # Skip boxes that fall outside the frame
            continue
        results_classification = classifier(roi)[0]  # Identify the cropped face
        classifications = sv.Classifications.from_ultralytics(results_classification)
        top_k_class_ids, top_k_confidences = classifications.get_top_k(5)  # Keep the 5 best predictions
        class_names = results_classification.names
        # One row per candidate identity; the detection and identification confidences
        # use separate variables so that one does not overwrite the other
        for class_id, id_confidence in zip(top_k_class_ids, top_k_confidences):
            data.append({
                "DetectionID": detection_index,
                "Video": video,
                "Track": int(tracker_id),
                "Frame": index,
                "Confidence": float(det_confidence),
                "Box": bbox_item,
                "NameID": class_names[int(class_id)],
                "ConfidenceID": float(id_confidence),
                "ClassID": int(class_id),
            })
        detection_index += 1  # Increment the detection index for the next detection

    return annotator.annotate(scene=frame.copy(), detections=detections, labels=labels)
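
The crop `frame[y1:y2, x1:x2]` relies on the box lying inside the image, because negative indices in a NumPy slice count from the end of the array. As a small illustration on a synthetic frame (no model involved), the sketch below clamps a box to the frame before cropping; `crop_box` is a hypothetical helper, not part of supervision or ultralytics.

```python
import numpy as np


def crop_box(frame, box):
    """Crop an (x1, y1, x2, y2) box from a frame, clamped to the frame borders."""
    height, width = frame.shape[:2]
    x1, y1, x2, y2 = (int(v) for v in box)
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    return frame[y1:y2, x1:x2]


# Synthetic 100x200 RGB frame standing in for a video frame
frame = np.zeros((100, 200, 3), dtype=np.uint8)

inside = crop_box(frame, (10, 20, 60, 80))    # box fully inside the frame
overflow = crop_box(frame, (-5, -5, 30, 30))  # box partly outside the frame
print(inside.shape, overflow.shape)
```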

In [None]:
# This is the function using the aisaru callback
# and processing each video file of the video Drive folder

def aisaru_allvideos(directory):
    global video
    # Results go to an "output" folder next to the video folder (AISaru/output),
    # which is where Part 2 looks for the Excel files
    output_dir = os.path.join(os.path.dirname(os.path.normpath(directory)), "output")
    os.makedirs(output_dir, exist_ok=True)
    for filename in sorted(os.listdir(directory)):
        if filename.endswith(".mp4"):
            print(filename)
            video = os.path.splitext(filename)[0]
            source_path = os.path.join(directory, filename)
            target_path = os.path.join(output_dir, video + "_output.mp4")  # This is the new version of the video, with the detections

            # Process the video
            sv.process_video(source_path=source_path, target_path=target_path, callback=aisaru)

            # Save the detections of this video to Excel
            # (data keeps growing across videos, so we only keep the rows of the current one)
            df = pd.DataFrame([row for row in data if row["Video"] == video])
            df.to_excel(os.path.join(output_dir, video + ".xlsx"), index=False)

# This should be the path to your directory containing the videos
directory_path = "/drive/MyDrive/AISaru/videos"

Great, now it is time to try our tool on the videos.

In [None]:
# This is the code for the actual video processing, this will probably take time
aisaru_allvideos(directory_path)

#**Part 2: creating the co-occurrence matrices**

Now that we have the results of the video processing, we will clean and edit the output Excel files to build a good co-occurrence matrix.

But first we need an empty co-occurrence matrix to fill.

In [None]:
# This will create an empty co-occurrence matrix for the individuals of your classifier

# The classifier stores the names of the individuals it was trained on,
# so there is no need to run a prediction on an image to get them
class_names = list(classifier.names.values())

# Create a DataFrame with class names as both rows and columns
co_occurrence_matrix = pd.DataFrame(0, index=class_names, columns=class_names)

# Save it (the output folder is created if it does not exist yet)
os.makedirs("/drive/MyDrive/AISaru/output", exist_ok=True)
co_occurrence_matrix.to_excel("/drive/MyDrive/AISaru/output/co_occurrence_matrix.xlsx")

Perfect, now let's edit the Excel files of the results.

In [None]:
# This is the function that compiles all the Excel files from the videos in the directory
# into one (potentially very) long Excel file

def compile_excel_files(directory):
    # Columns written by the video processing step; other Excel files found
    # in the directory (such as the matrices) do not have them and are skipped
    expected_columns = {"DetectionID", "Video", "Track", "Frame", "Box", "NameID", "ConfidenceID"}
    frames = []

    # Iterate over each file in the directory
    for filename in sorted(os.listdir(directory)):
        if filename.endswith(".xlsx"):
            file_path = os.path.join(directory, filename)

            # Read the Excel file into a DataFrame
            df = pd.read_excel(file_path)
            if not expected_columns.issubset(df.columns):
                continue

            frames.append(df)
            print(file_path)

    if not frames:
        print("No result file found in", directory)
        return

    # Combine everything at once and drop the rows that appear in several files
    all_data = pd.concat(frames, ignore_index=True).drop_duplicates()

    # Save all_data to a single Excel file, one level above the output folder
    compiled_file_path = os.path.join(directory.replace("/output", ""), "compiled_output.xlsx")
    all_data.to_excel(compiled_file_path, index=False)

# This is the path to your directory containing all the Excel files
directory_path = "/drive/MyDrive/AISaru/output"

# This is the execution of the function
compile_excel_files(directory_path)

In [None]:
# This will give us the best ID for each track

df = pd.read_excel("/drive/MyDrive/AISaru/compiled_output.xlsx")

# Group by 'Track', 'NameID' and 'Video' and calculate the sum of 'ConfidenceID' for each group
sum_confidence_per_name = df.groupby(['Track', 'NameID', 'Video'])['ConfidenceID'].sum().reset_index()

# Get the index of the row with the highest summed 'ConfidenceID' for each track of each video
# (grouping by 'Video' as well keeps tracks with the same number in different videos apart)
max_confidence_idx = sum_confidence_per_name.groupby(['Video', 'Track'])['ConfidenceID'].idxmax()

# Filter the DataFrame to keep only the rows with the highest summed 'ConfidenceID' for each track
df_best_id_per_track = sum_confidence_per_name.loc[max_confidence_idx]

# Group by 'Track' and 'Video' and find the first and last 'Frame' of each track
first_last_frame_per_track = df.groupby(['Track', 'Video'])['Frame'].agg(['min', 'max']).reset_index()

# Merge the two DataFrames on the 'Track' and 'Video' columns
combined_df = pd.merge(df_best_id_per_track, first_last_frame_per_track, on=['Track', 'Video'])

# Save the resulting DataFrame
combined_df.to_excel("/drive/MyDrive/AISaru/TracksID.xlsx", index=False)
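
To make this voting step concrete, here is a toy version run on a hand-made table rather than on real output. The names "Aoi" and "Beni" and all confidence values are invented for illustration only.

```python
import pandas as pd

# Invented predictions: one track seen on two frames, with the classifier hesitating
toy = pd.DataFrame({
    "Video": ["clip1"] * 4,
    "Track": [1, 1, 1, 1],
    "Frame": [0, 0, 1, 1],
    "NameID": ["Aoi", "Beni", "Aoi", "Beni"],
    "ConfidenceID": [0.40, 0.55, 0.70, 0.20],
})

# Sum the confidence of each candidate name within each track
summed = toy.groupby(["Video", "Track", "NameID"])["ConfidenceID"].sum().reset_index()

# Keep the candidate with the highest summed confidence for each track
best = summed.loc[summed.groupby(["Video", "Track"])["ConfidenceID"].idxmax()]
print(best)
```

Here "Beni" is the top prediction on frame 0, but "Aoi" has the larger total over the whole track, so the track is assigned to "Aoi".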

In [None]:
# Here we clean and simplify the compiled output to be able to use it in the next steps
# (This file is actually small and clear, you can open it !)

df = pd.read_excel("/drive/MyDrive/AISaru/compiled_output.xlsx")

cleaned_output = df.drop(['Box', 'Confidence', 'ClassID'], axis=1)
cleaned_output = cleaned_output.drop_duplicates(subset=['DetectionID', 'Video'])

# Replace 'NameID' and 'ConfidenceID' in cleaned_output with the best ID of each track,
# matching on both 'Video' and 'Track' so that tracks from different videos are never mixed up
best_ids = combined_df[['Video', 'Track', 'NameID', 'ConfidenceID']]
cleaned_output = cleaned_output.drop(columns=['NameID', 'ConfidenceID'])
cleaned_output = cleaned_output.merge(best_ids, on=['Video', 'Track'], how='left')

# Save the cleaned DataFrame
cleaned_output.to_excel("/drive/MyDrive/AISaru/output/cleaned_output.xlsx", index=False)

Now let's fill and edit the matrix based on the compiled output file! There are multiple methods that you can use.

You can count co-occurrences by frame (if your videos are filmed in such a way that some macaques are visible without being in social contact) or by video (if the videos are close-ups centered on one individual, as ours were).
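
The difference between the two options can be seen on a toy table. This is only a sketch with invented detections, and `count_pairs` is a hypothetical helper, not a function of this tool: individuals A and B share a frame, while C only appears later in the same video.

```python
from itertools import combinations

import pandas as pd

# Invented detections: A and B share frame 0, C only appears at frame 50
toy = pd.DataFrame({
    "Video": ["clip1", "clip1", "clip1"],
    "Frame": [0, 0, 50],
    "NameID": ["A", "B", "C"],
})


def count_pairs(df, keys):
    """Count, for each pair of names, the number of groups (videos or frames) they share."""
    counts = {}
    for _, group in df.groupby(keys):
        for pair in combinations(sorted(group["NameID"].unique()), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


by_video = count_pairs(toy, ["Video"])
by_frame = count_pairs(toy, ["Video", "Frame"])
print(by_video)
print(by_frame)
```

Counting by video links all three individuals, while counting by frame only links A and B.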

This is the code to do it by videos:

In [None]:
# This will create an updated co-occurrence matrix
# using the cleaned output file, counting co-occurrences by video

# Load the cleaned DataFrame
cleaned_output = pd.read_excel("/drive/MyDrive/AISaru/output/cleaned_output.xlsx")

# Load the co-occurrence matrix DataFrame
co_occurrence_matrix = pd.read_excel("/drive/MyDrive/AISaru/output/co_occurrence_matrix.xlsx", index_col=0)

# Iterate through each unique video
for video_name, group in cleaned_output.groupby('Video'):
    # Extract the unique NameIDs seen in this video (rows without an ID are ignored)
    name_ids = group['NameID'].dropna().unique()

    # Update both symmetric cells of the matrix for each pair of NameIDs
    # (nothing happens if the video contains fewer than two individuals)
    for name_id1, name_id2 in combinations(name_ids, 2):
        co_occurrence_matrix.at[name_id1, name_id2] += 1
        co_occurrence_matrix.at[name_id2, name_id1] += 1

# Save the updated co-occurrence matrix to a new Excel file
co_occurrence_matrix.to_excel("/drive/MyDrive/AISaru/output/co_occurrence_matrix_updatedVID.xlsx")
files.download("/drive/MyDrive/AISaru/output/co_occurrence_matrix_updatedVID.xlsx")

In [None]:
# This will normalize the co-occurrence matrix

# Count the number of unique videos in which each NameID appears
cleaned_output_df = pd.read_excel("/drive/MyDrive/AISaru/output/cleaned_output.xlsx")
videos_per_nameid = cleaned_output_df.groupby('NameID')['Video'].nunique().to_dict()

# Load the by-video co-occurrence counts saved by the previous block (as floats, so they can hold ratios)
co_occurrence_matrix_df = pd.read_excel("/drive/MyDrive/AISaru/output/co_occurrence_matrix_updatedVID.xlsx", index_col=0).astype(float)

# Calculate the ratio
for name_id1 in co_occurrence_matrix_df.index:
    for name_id2 in co_occurrence_matrix_df.columns:
        if name_id1 != name_id2:
            num_co_occurrences = co_occurrence_matrix_df.at[name_id1, name_id2]
            num_videos_nameid1 = videos_per_nameid.get(name_id1, 0)
            num_videos_nameid2 = videos_per_nameid.get(name_id2, 0)
            total_videos = num_videos_nameid1 + num_videos_nameid2
            # Pairs of individuals that never appear in any video keep a ratio of 0
            ratio = num_co_occurrences / total_videos if total_videos > 0 else 0.0
            co_occurrence_matrix_df.at[name_id1, name_id2] = ratio

# Save the normalized co-occurrence matrix
co_occurrence_matrix_df.to_excel("/drive/MyDrive/AISaru/output/matrix_normalized_VID.xlsx")

# Download it
files.download("/drive/MyDrive/AISaru/output/matrix_normalized_VID.xlsx")
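
The normalization above divides each count by the sum of the numbers of videos in which each of the two individuals appears. A tiny worked example, with invented names and numbers, shows the arithmetic:

```python
import pandas as pd

names = ["A", "B", "C"]

# Invented raw counts: A and B were seen together in 2 videos, A and C in 1
counts = pd.DataFrame(0.0, index=names, columns=names)
counts.loc["A", "B"] = counts.loc["B", "A"] = 2
counts.loc["A", "C"] = counts.loc["C", "A"] = 1

# Invented number of videos in which each individual appears
videos_per_name = {"A": 3, "B": 5, "C": 1}

normalized = counts.copy()
for first in names:
    for second in names:
        if first != second:
            total = videos_per_name[first] + videos_per_name[second]
            normalized.loc[first, second] = counts.loc[first, second] / total

print(normalized)
```

With this definition, two individuals that always appear together in videos get a value of 0.5 rather than 1, since each shared video is counted once for each of them in the denominator.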


This is the code to do it by frames:

In [None]:
# This will create an updated co-occurrence matrix
# using the cleaned output file, counting co-occurrences by frame

# Load the cleaned DataFrame
cleaned_output = pd.read_excel("/drive/MyDrive/AISaru/output/cleaned_output.xlsx")

# Load the co-occurrence matrix DataFrame
co_occurrence_matrix = pd.read_excel("/drive/MyDrive/AISaru/output/co_occurrence_matrix.xlsx", index_col=0)

# Iterate through each unique combination of Video and Frame number
for (video_name, frame_number), group in cleaned_output.groupby(['Video', 'Frame']):
    # Extract the unique NameIDs for this frame (rows without an ID are ignored)
    name_ids = group['NameID'].dropna().unique()

    # Update both symmetric cells of the matrix for each pair of NameIDs
    # (the NameIDs are unique, so an individual is never paired with itself)
    for name_id1, name_id2 in combinations(name_ids, 2):
        co_occurrence_matrix.at[name_id1, name_id2] += 1
        co_occurrence_matrix.at[name_id2, name_id1] += 1

# Save the updated co-occurrence matrix
co_occurrence_matrix.to_excel("/drive/MyDrive/AISaru/output/co_occurrence_matrix_updatedFRAMES.xlsx")

In [None]:
# This will normalize the co-occurrence matrix

# Count the number of unique videos in which each NameID appears
cleaned_output_df = pd.read_excel("/drive/MyDrive/AISaru/output/cleaned_output.xlsx")
videos_per_nameid = cleaned_output_df.groupby('NameID')['Video'].nunique().to_dict()

# Load the by-frame co-occurrence counts saved by the previous block (as floats, so they can hold ratios)
co_occurrence_matrix_df = pd.read_excel("/drive/MyDrive/AISaru/output/co_occurrence_matrix_updatedFRAMES.xlsx", index_col=0).astype(float)

# Calculate the ratio
# (note: the counts are numbers of frames while the totals are numbers of videos,
# so these ratios are not bounded by 1)
for name_id1 in co_occurrence_matrix_df.index:
    for name_id2 in co_occurrence_matrix_df.columns:
        if name_id1 != name_id2:
            num_co_occurrences = co_occurrence_matrix_df.at[name_id1, name_id2]
            num_videos_nameid1 = videos_per_nameid.get(name_id1, 0)
            num_videos_nameid2 = videos_per_nameid.get(name_id2, 0)
            total_videos = num_videos_nameid1 + num_videos_nameid2
            # Pairs of individuals that never appear in any video keep a ratio of 0
            ratio = num_co_occurrences / total_videos if total_videos > 0 else 0.0
            co_occurrence_matrix_df.at[name_id1, name_id2] = ratio

# Save the normalized co-occurrence matrix
co_occurrence_matrix_df.to_excel("/drive/MyDrive/AISaru/output/matrix_normalized_FRAMES.xlsx")


You're done! You should now have:
- normalized matrices to use in Gephi or R for social network modeling (matrix_normalized_VID.xlsx and matrix_normalized_FRAMES.xlsx)
- a list of detections and identifications (cleaned_output.xlsx)
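
Gephi can import an edge list as a CSV file with `Source`, `Target` and `Weight` columns. The sketch below shows one way to turn a symmetric matrix into such a list; the matrix values are invented and `matrix_to_edges` is a hypothetical helper, not part of this tool.

```python
import pandas as pd


def matrix_to_edges(matrix):
    """Convert a symmetric co-occurrence matrix into an undirected edge list."""
    rows = []
    names = list(matrix.index)
    for i, source in enumerate(names):
        for target in names[i + 1:]:  # upper triangle only: one row per pair
            weight = matrix.loc[source, target]
            if weight > 0:  # skip pairs never seen together
                rows.append({"Source": source, "Target": target, "Weight": weight})
    return pd.DataFrame(rows, columns=["Source", "Target", "Weight"])


# Invented normalized matrix for three individuals
names = ["A", "B", "C"]
matrix = pd.DataFrame(
    [[0.0, 0.25, 0.1], [0.25, 0.0, 0.0], [0.1, 0.0, 0.0]],
    index=names, columns=names,
)
edges = matrix_to_edges(matrix)
print(edges.to_csv(index=False))
```

With real data, the matrix would be loaded with `pd.read_excel(..., index_col=0)` from one of the normalized files, and the edge list written to disk with `edges.to_csv("edges.csv", index=False)`.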

#**Credits**

If you want to use this code for your work, please cite the article by Paulet et al. mentioned at the start of this colab and provide a link to the GitHub of this project, where this colab can be found: (SOON)

Contact me if you need anything ! Good luck !