## Cleaner Script

This notebook demonstrates how to create mini-scenes from raw drone footage. Mini-scenes are sub-videos centered on each individual animal.

#### Inputs: 
- Drone video footage of animals
- Tracklets of each individual animal in CVAT format

#### Output:
- Mini-scene video clip for each animal in the video
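
As a rough illustration of what "centered on each individual animal" means, the sketch below computes a fixed-size crop window around a bounding box and keeps it inside the frame. The helper `centered_window` and the 400x300 window size are hypothetical, chosen for illustration only; this is not necessarily how `tracks_extractor.py` crops its mini-scenes.

```python
def centered_window(box, frame_size, window_size=(400, 300)):
    """Return (x0, y0, x1, y1) of a window centered on `box` (hypothetical helper).

    box:         (xtl, ytl, xbr, ybr) in pixels
    frame_size:  (frame_width, frame_height)
    window_size: (window_width, window_height), an assumed value
    """
    xtl, ytl, xbr, ybr = box
    frame_w, frame_h = frame_size
    win_w, win_h = window_size

    # Center of the bounding box.
    cx = (xtl + xbr) // 2
    cy = (ytl + ybr) // 2

    # Clamp the top-left corner so the window stays inside the frame.
    x0 = min(max(cx - win_w // 2, 0), max(frame_w - win_w, 0))
    y0 = min(max(cy - win_h // 2, 0), max(frame_h - win_h, 0))
    return x0, y0, x0 + win_w, y0 + win_h


window = centered_window((900, 500, 1000, 600), frame_size=(1920, 1080))
corner = centered_window((0, 0, 50, 50), frame_size=(1920, 1080))
```

With a frame loaded through OpenCV, the crop itself would then be `frame[y0:y1, x0:x1]`.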

#### Purpose 
Mini-scenes may be used to train a behavior recognition model.
They may also be labelled using a pre-trained behavior recognition model, such as KABR, to generate a time-budget analysis.
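
A time budget can be read as the fraction of labelled frames an animal spends in each behavior. The sketch below assumes one behavior label per frame of a mini-scene; the function `time_budget` and the label names are hypothetical and do not reflect the actual output format of KABR.

```python
from collections import Counter


def time_budget(labels):
    """Return the fraction of labelled frames spent in each behavior."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {behavior: n / total for behavior, n in counts.items()}


# Hypothetical per-frame labels for one mini-scene.
labels = ["graze"] * 6 + ["walk"] * 3 + ["head up"] * 1
budget = time_budget(labels)
```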

### Step 1: Import packages

In [3]:
import numpy as np
import os
import json
from lxml import etree
import cv2
from collections import OrderedDict
from tqdm import tqdm

### Step 2: Define file locations for inputs and outputs 

In [6]:
session = '20_01_2023_session_3'
videos = ['DJI_0142', 'DJI_0143', 'DJI_0144', 'DJI_0145', 'DJI_0146', 'DJI_0147']
video_location = f"/session_data/{session}/drone/"
annotation_location = "/cvat_annotations/"
tracks_location = "/tracks/"
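
The locations above are later combined with the session and video names by string formatting, which depends on each location ending in a slash. As an alternative, here is a minimal sketch that builds the same file names with `pathlib`. It uses `PurePosixPath` so that it runs without touching the file system, and the `videos/` subfolder follows the pattern this notebook uses for `video_path`.

```python
from pathlib import PurePosixPath

session = "20_01_2023_session_3"
video = "DJI_0142"

# Same locations as in the cell above, expressed as path objects.
video_location = PurePosixPath("/session_data") / session / "drone"
annotation_location = PurePosixPath("/cvat_annotations")
tracks_location = PurePosixPath("/tracks")

# Joining with "/" inserts exactly one separator, with or without a trailing slash.
video_path = video_location / "videos" / f"{video}.MP4"
annotation_path = annotation_location / f"{session}-{video}.xml"
tracks_path = tracks_location / f"{session}-{video}.xml"
```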

### Step 3: Generate and save a tracks XML file for each video

In [7]:
for video in videos:
    video_path = f"{video_location}videos/{video}.MP4"
    annotation_path = f"{annotation_location}{session}-{video}.xml"

    root = etree.parse(annotation_path).getroot()
    annotated = dict()
    track2end = {}

    # First pass: record the frame of the last keyframe box in each track.
    for track in root.iterfind("track"):
        track_id = int(track.attrib["id"])

        for box in track.iter("box"):
            frame_id = int(box.attrib["frame"])
            keyframe = int(box.attrib["keyframe"])

            if keyframe == 1:
                track2end[track_id] = frame_id

    # Second pass: keep every box up to and including the track's last keyframe,
    # multiplying its coordinates by scaling_factor.
    scaling_factor = 3

    for track in root.iterfind("track"):
        track_id = int(track.attrib["id"])

        for box in track.iter("box"):
            frame_id = int(box.attrib["frame"])
            keyframe = int(box.attrib["keyframe"])

            if frame_id <= track2end[track_id]:
                if annotated.get(track_id) is None:
                    annotated[track_id] = OrderedDict()

                annotated[track_id][frame_id] = [int(float(box.attrib["xtl"])*scaling_factor),
                                                 int(float(box.attrib["ytl"])*scaling_factor),
                                                 int(float(box.attrib["xbr"])*scaling_factor),
                                                 int(float(box.attrib["ybr"])*scaling_factor), keyframe]
                
    xml_page = etree.Element("annotations")
    etree.SubElement(xml_page, "version").text = "1.1"

    for track_id in annotated.keys():
        xml_track = etree.Element("track", id=str(track_id), label="Grevy", source="manual")

        # The final frame of each track is marked as outside, which ends the track.
        last_frame = max(annotated[track_id])

        for frame_id in annotated[track_id].keys():
            outside = "1" if frame_id == last_frame else "0"

            xml_box = etree.Element("box", frame=str(frame_id), outside=outside, occluded="0",
                                    keyframe=str(annotated[track_id][frame_id][4]),
                                    xtl=f"{annotated[track_id][frame_id][0]:.2f}",
                                    ytl=f"{annotated[track_id][frame_id][1]:.2f}",
                                    xbr=f"{annotated[track_id][frame_id][2]:.2f}",
                                    ybr=f"{annotated[track_id][frame_id][3]:.2f}", z_order="0")

            xml_track.append(xml_box)

        if len(annotated[track_id].keys()) > 0:
            xml_page.append(xml_track)

    xml_document = etree.ElementTree(xml_page)

    xml_document.write(f"{tracks_location}{session}-{video}.xml", xml_declaration=True, pretty_print=True, encoding="utf-8")
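
To make the two passes above easier to follow, here is a small sketch that applies the same idea to an in-memory example: find the last keyframe of each track, keep only the boxes up to that frame, and scale the coordinates. It uses the standard-library `xml.etree.ElementTree` instead of `lxml` so that it runs without extra packages. The XML snippet and the function `scaled_boxes` are hypothetical, and unlike the cell above, tracks without any keyframe are skipped.

```python
import xml.etree.ElementTree as ET

# Hypothetical CVAT-style snippet: keyframes at frames 0 and 2,
# followed by one non-keyframe box at frame 3.
CVAT_XML = """
<annotations>
  <track id="0" label="Grevy">
    <box frame="0" keyframe="1" xtl="10.0" ytl="20.0" xbr="30.0" ybr="40.0"/>
    <box frame="1" keyframe="0" xtl="11.0" ytl="21.0" xbr="31.0" ybr="41.0"/>
    <box frame="2" keyframe="1" xtl="12.0" ytl="22.0" xbr="32.0" ybr="42.0"/>
    <box frame="3" keyframe="0" xtl="13.0" ytl="23.0" xbr="33.0" ybr="43.0"/>
  </track>
</annotations>
"""


def scaled_boxes(xml_text, scaling_factor=3):
    """Map track id -> {frame id -> scaled [xtl, ytl, xbr, ybr]}."""
    root = ET.fromstring(xml_text.strip())
    result = {}

    for track in root.iterfind("track"):
        track_id = int(track.attrib["id"])
        boxes = list(track.iter("box"))

        # Frame of the last keyframe box, in document order.
        keyframes = [int(b.attrib["frame"]) for b in boxes if b.attrib["keyframe"] == "1"]
        if not keyframes:
            continue  # assumption of this sketch: skip tracks without keyframes
        last_keyframe = keyframes[-1]

        # Keep boxes up to and including the last keyframe, with scaled coordinates.
        result[track_id] = {
            int(b.attrib["frame"]): [
                int(float(b.attrib[key]) * scaling_factor)
                for key in ("xtl", "ytl", "xbr", "ybr")
            ]
            for b in boxes
            if int(b.attrib["frame"]) <= last_keyframe
        }

    return result


tracks = scaled_boxes(CVAT_XML)
```

In this example the box at frame 3 is dropped because it comes after the last keyframe at frame 2.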

### Step 4: Create mini-scenes with the tracks_extractor.py script
The script creates a mini-scene for each animal in the original video, using the tracks file generated in the previous step. Depending on the length of the video, processing may take some time.

In [8]:
# define location of the tracks_extractor.py script
script_location="/kabr-tools/tracks_extractor.py"

In [None]:
# run the tracks_extractor.py script once per video
# inputs: original video, xml file with annotations
# outputs: mini-scene for each annotated track

for video in videos:
    os.system(f"python {script_location} /session_data/{session}/{video}.MP4 {tracks_location}{session}-{video}.xml")
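
`os.system` passes a single string to the shell and only returns an exit status, so a failed run can go unnoticed. As an alternative, here is a minimal sketch using `subprocess.run` with an argument list. The helpers `build_command` and `run_extractor` are hypothetical. The argument order (video first, then tracks file) follows the call above, and `sys.executable` is used in place of `python` on the assumption that the script should run in the same environment as the notebook.

```python
import subprocess
import sys


def build_command(script_location, video_path, tracks_path):
    """Build the argument list for one tracks_extractor.py call."""
    return [sys.executable, str(script_location), str(video_path), str(tracks_path)]


def run_extractor(script_location, video_path, tracks_path):
    """Run the script for one video; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(script_location, video_path, tracks_path), check=True)


command = build_command(
    "/kabr-tools/tracks_extractor.py",
    "/session_data/20_01_2023_session_3/DJI_0142.MP4",
    "/tracks/20_01_2023_session_3-DJI_0142.xml",
)
```

Passing the arguments as a list avoids shell quoting problems if a path ever contains spaces.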