<a href="https://colab.research.google.com/github/AxelCodaeMolina/AI-Saru/blob/main/Molina_Paulet_AISaru.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Face detection and classification of Koshima macaques

This is the code used in the 10/10/2023 version of the preprint **Deep Learning for Automatic Detection and Facial Recognition in Japanese Macaques: Illuminating Social Networks** by Paulet et al., 2023.

We suggest reading the methods provided in the article if anything is unclear in this notebook.

You can find the preprint here: https://doi.org/10.48550/arXiv.2310.06489


If you want to reproduce our results you will also need the annotated datasets we used. You can find them here: (this will soon be provided).

Please note that this is still a work in progress. We will do our best to provide simple, user-friendly code so that anyone can reproduce our results and expand on them.

You can also contact me by email at either molina.axel@ens.psl.eu or molina.axel.pro@gmail.com if you have any questions, corrections or comments.

## Preparation of the dataset

In this first part we create a dataset of frames from field videos of macaques. You can use your own videos if you want!

In [None]:
# connect to your Google Drive, this is where all the following datasets should be
from google.colab import drive
drive.mount('/drive')

In [None]:
# upgrade FFmpeg to v5.0
import os, re, sys, time, uuid
import urllib.request
from glob import glob

import IPython
import ipywidgets as widgets
from google.colab import output, drive
from IPython.display import clear_output

HOME = os.path.expanduser("~")
pathDoneCMD = f'{HOME}/doneCMD.sh'
# download the small animation helper used below (only the first time)
if not os.path.exists(f"{HOME}/.ipython/ttmg.py"):
    hCode = "https://raw.githubusercontent.com/yunooooo/gcct/master/res/ttmg.py"
    urllib.request.urlretrieve(hCode, f"{HOME}/.ipython/ttmg.py")

from ttmg import (
    loadingAn,
    textAn,
)

loadingAn(name="lds")
textAn("Cloning Repositories...", ty='twg')
!git clone https://github.com/XniceCraft/ffmpeg-colab.git
!chmod 755 ./ffmpeg-colab/install
textAn("Installing FFmpeg...", ty='twg')
!./ffmpeg-colab/install
clear_output()
print('Installation finished!')
!rm -fr /content/ffmpeg-colab
!ffmpeg -version

In [None]:
# this will extract the frames from all videos in the folder,
# here we use an AISARUFULL folder with a subfolder named VIDEOS, edit the paths if needed
# (the Drive was mounted at /drive in the first cell, so the paths start with /drive/MyDrive)

folder_path = "/drive/MyDrive/AISARUFULL/VIDEOS"
os.makedirs(os.path.join(folder_path, "Done"), exist_ok=True) # folder that receives the processed videos
video_files = [file for file in os.listdir(folder_path) if file.endswith((".mp4", ".avi", ".mkv"))]

for video_file in video_files:

  VIDEO_FILE = os.path.join(folder_path, video_file)
  os.environ['inputFile'] = VIDEO_FILE
  file_name = os.path.splitext(VIDEO_FILE)[0]
  os.environ['inputName'] = file_name

  # we are taking 1 frame / second (-r 1/1), you can change this, for example -r 3/1 for 3 frames / second
  !ffmpeg -hide_banner -i "$inputFile" -r 1/1 "$inputName"_frame%04d.png
  # this puts all the videos that have been extracted in another folder
  !mv "$inputFile" "/drive/MyDrive/AISARUFULL/VIDEOS/Done"
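
The frame-extraction command can also be assembled in plain Python, which makes the sampling rate an explicit parameter. This is only a sketch: the helper name `build_ffmpeg_command` and the path `videos/example.mp4` are hypothetical, and the command is printed rather than executed.

```python
import os
import shlex

def build_ffmpeg_command(video_path, fps=1):
    """Return the ffmpeg argument list that extracts `fps` frames per second."""
    base, _ = os.path.splitext(video_path)
    return [
        "ffmpeg", "-hide_banner",
        "-i", video_path,
        "-r", f"{fps}/1",          # output frame rate, same option as in the cell above
        f"{base}_frame%04d.png",   # frames are written next to the video
    ]

# hypothetical video path, 3 frames per second
cmd = build_ffmpeg_command("videos/example.mp4", fps=3)
print(shlex.join(cmd))
# to actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids quoting problems when a video name contains spaces.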

## Detection

In this part we train a machine learning (ML) model that detects macaque faces in images.

In the following section we will use the previous dataset that was:
- annotated using VGG Image Annotator (note that Roboflow can also be used)
- data-augmented and split into train and val folders using Roboflow (see the methods in our preprint)

In [None]:
# please note that most of the following code from "Detection" is copied from the "Detectron2 Beginner's Tutorial",
# some of the comments are not my own, refer to the original notebook for more details

# this is everything we will need to run the detection part

# some basic setup:
# setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import os, json, cv2, random
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
import torch
import torch.cuda
import torchvision

In [None]:
# just set UTF-8 as default, can be changed
import locale
def getpreferredencoding(do_setlocale = True):
    return "UTF-8"
locale.getpreferredencoding = getpreferredencoding

In [None]:
# since our dataset is in a custom format, we need to create a function to parse it
# and prepare it into detectron2's standard format.
# edit the paths in the next cell to fit the name of your dataset
from detectron2.structures import BoxMode # defines the XYWH_ABS box format used below

def get_saru_dicts(img_dir):
    # img_dir is the folder of one split (train or val), its annotations are expected in json/trainval.json
    json_file = os.path.join(img_dir, "json", "trainval.json") # no leading "/", otherwise os.path.join discards img_dir
    with open(json_file) as f:
        imgs_anns = json.load(f)
    # explanation:
    # imgs_anns["images"] contains the information of the IMAGES (including id)
    # imgs_anns["annotations"] contains the information of the bounding boxes
    # (these are entries 3 and 4 of a standard COCO export), we link the two using the id
    dataset_dicts = []
    images = imgs_anns["images"]
    annotations = imgs_anns["annotations"]
    annot_ids = np.array([val['image_id'] for val in annotations])
    for idx, v in enumerate(images):
        #print(idx) # if you want to make sure it is working!
        img_id = v['id'] # for this image, we extract the id
        record = {}
        filename = os.path.join(img_dir, v["file_name"])
        record["image_id"] = img_id
        record["file_name"] = filename # name of each image
        record["height"] = 540
        record["width"] = 960
        #height, width = cv2.imread(filename).shape[:2] # if your images are not standardized
        annos_idx = np.where(annot_ids == img_id)[0] # to obtain the annotations of the corresponding image
        objs = []
        for anno_idx in annos_idx: # for each annotation of this image
            anno = annotations[anno_idx]
            x, y, w, h = [int(value) for value in anno['bbox']] # boxes are stored as [x, y, width, height]
            obj = {
                "bbox": [x, y, w, h],
                "bbox_mode": BoxMode.XYWH_ABS,
                "category_id": 0,
            }
            objs.append(obj) # objs is, for an image, a list of detected macaques (bbox, bbox_mode, category_id)
        record["annotations"] = objs # record contains file_name, image_id, height, width, annotations. annotations = list of detections (bbox, bbox_mode, category_id)
        dataset_dicts.append(record)
    return dataset_dicts # without this, DatasetCatalog would receive None

In [None]:
# here we use the function get_saru_dicts to then be able to train our AI
# edit the paths of the different folders if needed, here we use the "AISARUFULL" folder of the Drive

for d in ["train", "val"]:
    DatasetCatalog.register("saru_" + d, lambda d=d: get_saru_dicts("/drive/MyDrive/AISARUFULL/" + d))
    MetadataCatalog.get("saru_" + d).set(thing_classes=["macaque"])
saru_metadata = MetadataCatalog.get("saru_train") # must match a registered name
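
The parsing function links each image to its bounding boxes through the image id. The same linking can be sketched without detectron2 on a tiny hand-written annotation dictionary. The file names and box values below are made up, and the key names `images` and `annotations` assume a COCO-style export.

```python
from collections import defaultdict

# tiny stand-in for the annotation file (hypothetical values)
coco = {
    "images": [
        {"id": 0, "file_name": "a_frame0001.png"},
        {"id": 1, "file_name": "a_frame0002.png"},
    ],
    "annotations": [
        {"id": 10, "image_id": 0, "bbox": [12.4, 30.0, 50.9, 60.2]},
        {"id": 11, "image_id": 0, "bbox": [200.0, 80.5, 40.0, 45.0]},
        {"id": 12, "image_id": 1, "bbox": [5.0, 5.0, 20.0, 20.0]},
    ],
}

def group_boxes_by_image(coco_dict):
    """Map each image id to its list of integer [x, y, w, h] boxes."""
    boxes = defaultdict(list)
    for anno in coco_dict["annotations"]:
        boxes[anno["image_id"]].append([int(v) for v in anno["bbox"]])
    return boxes

boxes_by_image = group_boxes_by_image(coco)
records = [
    {
        "image_id": img["id"],
        "file_name": img["file_name"],
        # one entry per face, a single class (macaque) with id 0
        "annotations": [{"bbox": b, "category_id": 0} for b in boxes_by_image[img["id"]]],
    }
    for img in coco["images"]
]
print(len(records), [len(r["annotations"]) for r in records])
```

Grouping the annotations once in a dictionary avoids scanning the whole annotation list for every image, which may matter on larger datasets.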

In [None]:
torch.cuda.empty_cache()

# here we define all the hyper-parameters of the model
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")) # you can use other models
cfg.DATASETS.TRAIN = ("saru_train",) # name registered with DatasetCatalog.register
cfg.DATASETS.TEST = ("saru_val",)
cfg.DATALOADER.NUM_WORKERS = 1
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 8  # This is the real "batch size" commonly known to deep learning people
cfg.SOLVER.BASE_LR = 0.001  # pick a good LR (note: we agree that this is a bit vague, we hope it is a good one!)
cfg.SOLVER.MAX_ITER = 6000    # 300 iterations is good enough for a toy dataset; you will need to train longer for practical datasets
cfg.SOLVER.STEPS = []        # do not decay learning rate
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512   # The "RoIHead batch size". 128 is faster, and good enough for a toy dataset (default: 512)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1   # only has one class (macaque).
# NOTE: this config means the number of classes, but a few popular unofficial tutorials incorrectly use num_classes+1 here.

# and now we train the model
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

# and copy the output on our drive
!cp --recursive /content/output /drive/MyDrive/AISARUFULL/
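
To judge whether `MAX_ITER` is reasonable for a given dataset, it can help to convert iterations into approximate passes over the training set. The sketch below uses the 6000 iterations and batch size of 8 set in the config. The number of training images (2000) is a made-up figure, not the size of our dataset.

```python
def approx_epochs(max_iter, ims_per_batch, num_train_images):
    """Approximate number of passes over the training set."""
    return max_iter * ims_per_batch / num_train_images

# 6000 iterations and 8 images per batch come from the config above,
# 2000 training images is a hypothetical count, replace it with your own
print(approx_epochs(max_iter=6000, ims_per_batch=8, num_train_images=2000))
```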

In [None]:
# look at training curves in tensorboard:
%load_ext tensorboard
%tensorboard --logdir output

In [None]:
# we can now run inferences with our model
# inference should use the config with parameters that are used in training
# cfg now already contains everything we've set previously. We changed it a little bit for inference:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")  # path to the model we just trained
print(cfg.dump())  # print formatted configs
with open("final6000Weights.yaml", "w") as f:
  f.write(cfg.dump())   # save config to file

cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.9   # set a custom testing threshold
predictor = DefaultPredictor(cfg)

In [None]:
# this part copies the model weights and the config file to your Drive

# we keep the cfg used for training (calling get_cfg() again here would reset it to detectron2's defaults)
cfg.merge_from_list(["MODEL.WEIGHTS", "/content/output/model_final.pth"])   # merge_from_list expects key/value pairs
print(cfg.dump())  # print formatted configs
with open("final6000Weights.yaml", "w") as f:
  f.write(cfg.dump())   # save config to file

!mkdir -p /drive/MyDrive/AISARUFULL/essaioutput
!cp /content/output/model_final.pth /drive/MyDrive/AISARUFULL/essaioutput
!cp /content/final6000Weights.yaml /drive/MyDrive/AISARUFULL/essaioutput

In [None]:
# we can evaluate the performance of the model using the AP metric implemented in the COCO API.

from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
evaluator = COCOEvaluator("saru_val", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "saru_val")
print(inference_on_dataset(predictor.model, val_loader, evaluator))
# another equivalent way to evaluate the model is to use `trainer.test`
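
The COCO AP metric matches predicted boxes to annotated boxes using their intersection over union (IoU), evaluated at thresholds from 0.50 to 0.95. As an illustration only (this is not the COCO API implementation), IoU for two boxes in the `[x, y, w, h]` format used in this notebook can be computed like this:

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # width and height of the overlapping rectangle (0 if the boxes do not overlap)
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

# two 10x10 boxes shifted by 5 pixels horizontally
print(iou_xywh([0, 0, 10, 10], [5, 0, 10, 10]))
```

A prediction that overlaps only half of an annotated face, as in this example, has an IoU of one third and would not count as a match at the 0.50 threshold.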

## Classification

In this part we train an ML model that identifies the individual macaques of the population in images.

In [None]:
# this is everything we will need to run the classification part

# roboflow to get the datasets
%pip install roboflow
from roboflow import Roboflow

# ultralytics to get the model and train it
%pip install ultralytics
from ultralytics import YOLO

# Image from the Pillow library for image processing
from PIL import Image

Before the next step we:
- created a new version of our Roboflow project where 50% of the images were in a train folder and 50% in a validation folder (this was a trick to randomly split the dataset in half)
- took the new train folder and uploaded it to a new Roboflow identification project, adding new annotated macaque images from other sources to diversify it (this was an easy way to create a standardized dataset with id annotations from two datasets with detection annotations)
- downloaded the resulting dataset, from which all images are then cropped in the next *section*
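
For readers who do not use Roboflow, a random split in half can be sketched in plain Python. This is an assumption-laden stand-in, not the procedure Roboflow applies internally. The helper name `split_in_half` and the file names are hypothetical.

```python
import random

def split_in_half(items, seed=0):
    """Shuffle a copy of `items` reproducibly and cut it into two halves."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # a fixed seed makes the split repeatable
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]

file_names = [f"frame{i:04d}.png" for i in range(10)]  # hypothetical file names
first_half, second_half = split_in_half(file_names)
print(len(first_half), len(second_half))
```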

In [None]:
# we constructed a dataset of cropped images using OpenCV (cv2), to achieve this we modified the code used in the detection part
# please edit the paths like img_dir and box_dir to select the right folders in your Drive
from detectron2.structures import BoxMode # defines the XYWH_ABS box format used below

def get_crop_dicts(img_dir):
    box_dir = "/drive/MyDrive/AISARUFULL/box/" # folder that receives the cropped faces
    os.makedirs(box_dir, exist_ok=True) # cv2.imwrite does not create missing folders
    json_file = os.path.join(img_dir, "json", "trainval.json") # no leading "/", otherwise os.path.join discards img_dir
    with open(json_file) as f:
        imgs_anns = json.load(f)
    # explanation:
    # imgs_anns["images"] contains the information of the IMAGES (including id)
    # imgs_anns["annotations"] contains the information of the bounding boxes
    # (these are entries 3 and 4 of a standard COCO export), we link the two using the id
    dataset_dicts = []
    images = imgs_anns["images"]
    annotations = imgs_anns["annotations"]
    annot_ids = np.array([val['image_id'] for val in annotations])
    for idx, v in enumerate(images):
        #print(idx) # if you want to make sure it is working!
        img_id = v['id'] # for this image, we extract the id
        record = {}
        filename = os.path.join(img_dir, v["file_name"])
        record["image_id"] = img_id
        record["file_name"] = filename # name of each image
        record["height"] = 540
        record["width"] = 960
        #height, width = cv2.imread(filename).shape[:2] # if your images are not standardized
        annos_idx = np.where(annot_ids == img_id)[0] # to obtain the annotations of the corresponding image
        objs = []
        for anno_idx in annos_idx: # for each annotation of this image
            anno = annotations[anno_idx]
            x, y, w, h = [int(value) for value in anno['bbox']] # boxes are stored as [x, y, width, height]
            # CV2 modification start
            idn = anno['id']
            imn = anno['image_id']
            namen = v["file_name"]
            # each crop is saved under the name of its source image, so an image with several boxes keeps only its last crop
            # use the commented line instead if you want one file per box
            #newname = box_dir + str(imn) + "_" + str(idn) + ".jpg"
            newname = os.path.join(box_dir, namen)
            im = cv2.imread(filename)
            roi = im[y:y+h, x:x+w]
            down_points = (224, 224)
            resize_down = cv2.resize(roi, down_points, interpolation=cv2.INTER_LINEAR)
            cv2.imwrite(newname, resize_down)
            #print(newname) # if you want to make sure it is working!
            # CV2 modification end
            obj = {
                "bbox": [x, y, w, h],
                "bbox_mode": BoxMode.XYWH_ABS,
                "category_id": 0,
            }
            objs.append(obj) # objs is, for an image, a list of detected macaques (bbox, bbox_mode, category_id)
        record["annotations"] = objs # record contains file_name, image_id, height, width, annotations. annotations = list of detections (bbox, bbox_mode, category_id)
        dataset_dicts.append(record)
    return dataset_dicts

# then we can crop (the path is passed explicitly, there is no global img_dir variable)
get_crop_dicts('/drive/MyDrive/AISARUFULL/train/')
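
The crop `im[y:y+h, x:x+w]` assumes that every box lies inside the image. If an annotation extends past the border, a negative coordinate is interpreted by NumPy as counting from the end of the array, and the crop can come out empty, which makes `cv2.resize` fail. A minimal guard, assuming the 960x540 frames used in this notebook (the helper name `clamp_box` is hypothetical), could look like this:

```python
def clamp_box(x, y, w, h, img_width=960, img_height=540):
    """Clip an [x, y, w, h] box to the image.

    Returns (x0, y0, x1, y1) slice bounds, or None if nothing is left.
    """
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(img_width, x + w), min(img_height, y + h)
    if x1 <= x0 or y1 <= y0:
        return None
    return x0, y0, x1, y1

# a box that sticks out on the left and at the bottom of the frame
print(clamp_box(-5, 500, 50, 80))
```

With the returned bounds the crop becomes `im[y0:y1, x0:x1]`, and boxes for which the helper returns `None` can simply be skipped.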

Before the next step we:
- uploaded the newly cropped annotated frames to a new Roboflow identification project, this is "macaquesid" in the next section

In [None]:
# here we take the files from our Roboflow project to create a local dataset,
# please edit the workspace, project and version below if you use a different dataset

rf = Roboflow(api_key="2cTPBi7uvIenK2K9kgB0")
project = rf.workspace("aisaruid").project("macaquesid")
dataset = project.version(1).download("folder")

In [None]:
# our dataset had a few problems: due to errors in the annotation process,
# a few files had 2 macaque ids on 1 face. We deleted such files with this code:

!rm -rf /content/macaquesID-1/train/Kuro-Tsuwa
!rm -rf /content/macaquesID-1/train/Kuro-Tsutsuji-Tsuwa
!rm -rf /content/macaquesID-1/valid/Baby_Hiba-Hiba
!rm -rf /content/macaquesID-1/valid/Chimaki-Hado
!rm -rf /content/macaquesID-1/valid/Tsutsuji-Tsuwa
# we also remove the checkpoint folders that Jupyter may have created (-f: no error if they do not exist)
!rm -rf /content/macaquesID-1/train/.ipynb_checkpoints
!rm -rf /content/macaquesID-1/valid/.ipynb_checkpoints
!rm -rf /content/macaquesID-1/test/.ipynb_checkpoints
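
Rather than listing the faulty class folders by hand, they can be searched for. The sketch below assumes, based on the folder names removed above, that a class folder with two ids has a name containing a hyphen. Please check this convention against your own dataset before deleting anything. The helper name `find_multi_id_folders` is hypothetical and the demo runs on a temporary directory.

```python
import tempfile
from pathlib import Path

def find_multi_id_folders(split_dir, separator="-"):
    """Return the class folders of one split whose name contains `separator`."""
    return sorted(p for p in Path(split_dir).iterdir()
                  if p.is_dir() and separator in p.name)

# demo on a throw-away directory that mimics the layout of one split
with tempfile.TemporaryDirectory() as tmp:
    for name in ["Kuro", "Tsuwa", "Kuro-Tsuwa"]:
        (Path(tmp) / name).mkdir()
    flagged = [p.name for p in find_multi_id_folders(tmp)]
    print(flagged)
```

Once the list has been reviewed, each flagged folder can be removed with `shutil.rmtree`.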

In [None]:
# we select and train the model for 100 epochs, you can change the model you wish to train

model = YOLO("yolov8n-cls.pt")
model.train(data="/content/macaquesID-1", epochs=100)

# check the classify folder that this process created, you will find useful things like the confusion matrix

In [None]:
# this gives us some basic metrics to see how the model performed

metrics = model.val()

# this is for testing the model on an image

model = YOLO('yolov8n-cls.pt')  # load an official model
model = YOLO('/content/runs/best.pt')  # load a custom model. note: make sure you select the path to the best weights of the model you just trained

# predict with the model
results = model('/content/runs/yourfile.png')  # predict on an image, please change the path to the path of the image you want to test the model on

for r in results:
    im_array = r.plot()  # plot a BGR numpy array of predictions
    im = Image.fromarray(im_array[..., ::-1])  # RGB PIL image
    im.show()  # show image
    im.save('results.jpg')  # save image

In [None]:
# we used this code to download our results
!zip -r IDSaru.zip /content/runs/classify/train8/

## Credits

We used a lot of pre-existing code for this project. Here is a list of what we used (feel free to contact us if we have missed something!)

The FFmpeg code for extracting frames was based on this code https://github.com/yunooooo/FFmpeg-for-Google-Drive/blob/master/FFmpeg.ipynb by Yuno on GitHub

The detection section was based on the Detectron2 Beginner's Tutorial: https://gist.github.com/indigoviolet/d49b84e153bb58bee809b55dc8d47ee5 by Venky Iyer on GitHub

The classification section was based on the code provided in the Ultralytics tutorial on using YOLOv8 for classification tasks: https://docs.ultralytics.com/fr/tasks/classify/

Paulet, J., Molina, A., Beltzung, B., Suzumura, T., Yamamoto, S., & Sueur, C. (2023). Deep Learning for Automatic Detection and Facial Recognition in Japanese Macaques: Illuminating Social Networks. ArXiv. https://doi.org/10.48550/arXiv.2310.06489

This Notebook was a group effort by Paulet, Beltzung and Molina. Please contact Axel Molina if needed.

You can find more information on the GitHub of the project:
https://github.com/AxelCodaeMolina/AI-Saru