# Final Project - Data Collection

This is a **project group assignment** with the teams already established.

# Project E: Simple Object Tracking

## Step 1: Data Collection

Each team must **record at least 100 videos, each 3 seconds long**. We recommend collecting an equal number of samples per class, in this case **20 videos per class (ball, mug, pen, spoon, notebook)**. Videos should be stored in *.mov* or *.mp4* format, and 15 frames from each video must be annotated with a bounding box and its label. After data collection, all teams’ data will be merged, preprocessed, and split into training and test sets.

* **Note:** We will assume that all videos were recorded with a 30 fps (frames per second) camera.

The **object class labels** are:
* ball - Label 1
* mug - Label 2
* pen - Label 3
* spoon - Label 4 
* notebook - Label 5

We recommend saving your files using a **coding system**, e.g. **ID-trial-label**. First, assign each team member a number from 1 to 3; this is the ID. Then, for example, when the team member with ID 2 records their 3rd video of a moving notebook (label 5), the file name should read "2-3-5.mov".
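The naming scheme above can be checked programmatically before annotation. The following is a minimal sketch, not part of the assignment code: the helper names `parse_name` and `count_per_class` are hypothetical, and it assumes every file name follows the **ID-trial-label** pattern exactly.

```python
from collections import Counter

# Label numbers as listed above.
LABELS = {1: "ball", 2: "mug", 3: "pen", 4: "spoon", 5: "notebook"}

def parse_name(filename):
    """Split 'ID-trial-label.ext' into (member_id, trial, label) integers."""
    stem = filename.rsplit(".", 1)[0]
    member_id, trial, label = (int(part) for part in stem.split("-"))
    if label not in LABELS:
        raise ValueError(f"unknown label {label} in {filename!r}")
    return member_id, trial, label

def count_per_class(filenames):
    """Count how many videos were recorded for each class name."""
    return Counter(LABELS[parse_name(name)[2]] for name in filenames)

print(parse_name("2-3-5.mov"))  # (2, 3, 5)
print(count_per_class(["1-1-1.mov", "2-3-5.mov", "3-1-5.mp4"]))
```

Calling `count_per_class(os.listdir(mydir))` on only the video files would show whether each class is close to the recommended 20 videos.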

## Step 3: Save frames for Annotation

The code below reads a folder containing all your .mov or .mp4 files and saves each video’s 15 extracted frames into a separate folder.

You will then use these frames to annotate the object’s bounding box in each image (see step 4 below).
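To see where the 15 frames come from, here is a small sketch of the sampling arithmetic, assuming a 30 fps recording, a 3-second clip, and a 5 fps target rate. The function name `sampled_indices` is hypothetical and used only for illustration.

```python
def sampled_indices(fps=30, target_fps=5, seconds=3):
    """Return the source-frame indices kept when downsampling fps -> target_fps."""
    step = fps // target_fps          # 30 // 5 = 6, so keep every 6th frame
    n_frames = seconds * target_fps   # 3 * 5 = 15 frames in total
    return [i * step for i in range(n_frames)]

indices = sampled_indices()
print(len(indices), indices[0], indices[-1])  # 15 0 84
```

Under these assumptions the last sampled frame is index 84, so a clip that is slightly shorter than 90 frames still yields all 15 frames.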

Before running the code, make sure to update the variable ```mydir``` with the directory path where your video recordings are stored as well as the csv annotations.

In [None]:
import numpy as np
import os
import cv2
import pandas as pd

# CHANGE ME!
# mydir = 'change-this-to-your-data-directory-local-path'
mydir = 'videos'

In [None]:
# Sampling parameters (videos are assumed to be recorded at 30 fps)
fps = 30
target_fps = 5
frame_interval = fps // target_fps  # keep 1 out of every 6 frames
max_frames = 3 * target_fps         # 3 seconds × 5 fps = 15 frames

videos = []  # one (15, 100, 100, 3) RGB array per video
for file in sorted(os.listdir(mydir)):
    if file.lower().endswith((".mp4", ".mov")):  # only read video files
        filename = os.path.join(mydir, file)
        cap = cv2.VideoCapture(filename)

        # Create output folder for this video
        video_name = os.path.splitext(file)[0]
        output_dir = os.path.join(mydir, video_name)
        os.makedirs(output_dir, exist_ok=True)

        frame_count = 0   # frames read so far
        saved_frames = 0  # frames kept so far
        frames = []
        while cap.isOpened() and saved_frames < max_frames:
            ret, frame = cap.read()
            if not ret:
                break
            if frame_count % frame_interval == 0:
                frame = cv2.resize(frame, (100, 100))
                frame_filename = os.path.join(output_dir, f"frame_{saved_frames:02d}.png")
                cv2.imwrite(frame_filename, frame)  # OpenCV reads and writes BGR
                frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # store RGB
                saved_frames += 1
            frame_count += 1
        cap.release()
        print(f"Saved {saved_frames} frames from {file} to {output_dir}")

        if saved_frames == max_frames:
            # Array of shape (frames, height, width, channels)
            videos.append(np.array(frames, dtype=np.uint8))
        else:
            print(f"WARNING: {file} is too short and was left out of data.npy")

print('----------------------DONE-----------------------------')

# Stack all videos into one array of shape (N, 15, 100, 100, 3)
data = np.stack(videos, axis=0)

# Saves the file data.npy to your current directory
np.save('data', data)

In [None]:
# Confirm that the data saved all 100 videos. 
# The shape of data should be Nx15x100x100x3
# where N is the number of samples, N=100.

data.shape
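The shape check can also be written as an explicit test instead of reading the output by eye. This is a minimal sketch under the assumptions stated in the comments above (100 videos, 15 frames, 100x100 pixels, 3 channels); `check_shape` is a hypothetical helper, not part of the assignment code.

```python
def check_shape(shape, n_videos=100, n_frames=15, size=100):
    """Return a list of problems found in a data array's shape tuple."""
    expected = (n_videos, n_frames, size, size, 3)
    if len(shape) != len(expected):
        return [f"expected {len(expected)} dimensions, got {len(shape)}"]
    names = ("videos", "frames", "height", "width", "channels")
    problems = []
    for name, got, want in zip(names, shape, expected):
        if got != want:
            problems.append(f"{name}: expected {want}, got {got}")
    return problems

print(check_shape((100, 15, 100, 100, 3)))  # []
print(check_shape((98, 15, 100, 100, 3)))   # ['videos: expected 100, got 98']
```

In the notebook you would call `check_shape(data.shape)`; an empty list means the array has the expected dimensions.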

## Step 4: Data Annotation

Use [makesense.ai](https://www.makesense.ai/) to annotate each frame in the video.
1. Click on "Get Started" (bottom right corner).
2. Import the frames from one video folder (this must be repeated for all 100 videos).
3. Select "Object Detection".
4. Click "Start Project".
5. For each frame, use the cursor to draw a bounding box around the object, and select the appropriate label.
    * For ball use 1, mug use 2, pen use 3, spoon use 4, and notebook use 5.
6. Click "Actions" (top left), then "Export Annotations" as a **csv** file. **Important:** rename the csv file so that it has the same name as the video file and its frames folder (e.g. 1-2-4.mov, 1-2-4.csv, and 1-2-4 folder).
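Because the naming rule in the last step is easy to get wrong across 100 videos, a quick consistency check may help. The sketch below is illustrative only: `missing_annotation_files` is a hypothetical helper, and it assumes the frames folders sit inside the video directory, as the frame extraction code in Step 3 creates them.

```python
import os

def missing_annotation_files(video_dir, csv_dir):
    """Map each video's base name to the items (csv, frames folder) that are missing."""
    missing = {}
    for name in sorted(os.listdir(video_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in (".mov", ".mp4"):
            continue  # not a video file
        gaps = []
        if not os.path.isfile(os.path.join(csv_dir, stem + ".csv")):
            gaps.append(stem + ".csv")
        if not os.path.isdir(os.path.join(video_dir, stem)):
            gaps.append(stem + " folder")
        if gaps:
            missing[stem] = gaps
    return missing
```

An empty dictionary from `missing_annotation_files(mydir, csv_dir)` means every video has both a matching csv file and a frames folder.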

## Step 5: Prepare Your Data for Submission

Run the following code to merge the annotation labels from all videos into a single ```csv``` file.

In [None]:
# CHANGE ME!

csv_dir = 'change-this-to-your-csv-directory-local-path'

In [None]:
df_list = []

# Expected columns in each CSV from makesense.ai
columns = ["label_name", "bbox_x", "bbox_y", "bbox_width", "bbox_height",
    "image_name", "image_width", "image_height"]

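for file in sorted(os.listdir(csv_dir)):
    # Skip the merged output so that re-running this cell does not include it
    if file.endswith(".csv") and file != "team_annotations.csv":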
        filepath = os.path.join(csv_dir, file)
        df = pd.read_csv(filepath, usecols=columns)  # load only expected columns
        df["filename"] = os.path.splitext(file)[0]
        df_list.append(df)

merged_df = pd.concat(df_list, ignore_index=True)

merged_df.to_csv(os.path.join(csv_dir, "team_annotations.csv"), index=False)

In [None]:
merged_df
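Before submitting, it may be worth checking the merged annotations for obvious mistakes. The following is a rough sketch, assuming one bounding box per frame (15 per video) and the column names used in the merge code; `check_annotations` is a hypothetical helper that works on plain dictionaries.

```python
from collections import Counter

def check_annotations(rows, frames_per_video=15):
    """Return a list of human-readable problems found in annotation rows.

    Each row is a dict keyed by the merged CSV's column names.
    """
    problems = []
    # Every video should contribute exactly one box per annotated frame.
    per_video = Counter(row["filename"] for row in rows)
    for video, n in sorted(per_video.items()):
        if n != frames_per_video:
            problems.append(f"{video}: {n} boxes, expected {frames_per_video}")
    # Every box should lie inside its image.
    for row in rows:
        x, y = float(row["bbox_x"]), float(row["bbox_y"])
        w, h = float(row["bbox_width"]), float(row["bbox_height"])
        too_wide = x + w > float(row["image_width"])
        too_tall = y + h > float(row["image_height"])
        if x < 0 or y < 0 or too_wide or too_tall:
            problems.append(f"{row['filename']}/{row['image_name']}: box outside image")
    return problems

sample = [{"filename": "1-2-4", "image_name": "frame_00.png",
           "bbox_x": 10, "bbox_y": 20, "bbox_width": 30, "bbox_height": 40,
           "image_width": 100, "image_height": 100}]
print(check_annotations(sample, frames_per_video=1))  # []
```

With the notebook's DataFrame, the rows can be obtained with `merged_df.to_dict("records")`.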

## Step 6: Upload Your Data in Canvas

To receive full credit for this assignment, submit all of the following files to Canvas:

1. Compressed folder (.zip) with the videos from all team members. (100 videos per team should be included.)
2. File "data.npy"
3. File "team_annotations.csv"
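The compressed folder in item 1 can be created by hand or with a few lines of Python. This is one possible sketch using the standard `zipfile` module; the function name `zip_videos` and the archive name in the usage note are hypothetical.

```python
import os
import zipfile

def zip_videos(video_dir, zip_path):
    """Write every .mov/.mp4 file in video_dir into zip_path; return the count."""
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(os.listdir(video_dir)):
            if name.lower().endswith((".mov", ".mp4")):
                # arcname keeps only the file name, not the full local path
                archive.write(os.path.join(video_dir, name), arcname=name)
                count += 1
    return count
```

For example, `zip_videos(mydir, "team_videos.zip")` should return 100 once all team members' videos are in the same folder.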

## Submit your Solution

Confirm that you've successfully completed the assignment.

```add``` and ```commit``` the final version of your work, and ```push``` your code/data to your GitHub repository. **You may run into memory issues. If this happens, skip this step and only submit the data files to Canvas.**

Submit the URL of your GitHub Repository along with all data as your assignment submission on Canvas.