### Small-Scale Composite Dataset Generation

We generate a small-scale video dataset of 518 human-action classes and 24,927 videos in total. Each video is a 10-second clip at a resolution of 64x64 pixels.

The code below walks through our dataset and annotation generation process. We combine the [Mini-Kinetics-200](https://github.com/s9xie/Mini-Kinetics-200), [Kinetics-400](https://www.kaggle.com/datasets/rohanmallick/kinetics-train-5per?select=kinetics400_5per), and [TinyVIRAT](https://www.crcv.ucf.edu/research/projects/tinyvirat-low-resolution-video-action-recognition/) human-object video datasets to create this composite dataset.

To preprocess the raw annotations, we compile multiple JSON annotation files into a single ```train.json``` and a single ```test.json``` for the training and testing splits of the composite dataset. Each JSON object in an annotation file has the following format:

```
{'id': <ID stored as a string (e.g. "mBIgVnm995E")>,
 'video_id': <name of video>,
 'path': <filepath in directory>,
 'dim': <dimension of a single frame in video (e.g. [64, 64])>,
 'label': <str label of action class (e.g. "counting money")>}
```
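
As a quick sanity check on this format, the sketch below validates one record against the five fields listed above. `validate_annotation` and `REQUIRED_FIELDS` are hypothetical names, not part of the original notebook, and the check assumes IDs are stored as strings, which is what the notebook's `str(item['id'])` cast suggests.

```python
# Hypothetical helper (not from the original notebook): check that one
# annotation record carries the fields described in the format above.
REQUIRED_FIELDS = {
    "id": str,        # assumption: IDs are stored as strings
    "video_id": str,
    "path": str,
    "dim": list,
    "label": str,
}


def validate_annotation(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    dim = record.get("dim")
    if isinstance(dim, list) and dim != [64, 64]:
        problems.append(f"unexpected dim: {dim}")
    return problems


example = {
    "id": "mBIgVnm995E",
    "video_id": "counting money",
    "path": "counting money/mBIgVnm995E.mp4",
    "dim": [64, 64],
    "label": "counting money",
}
print(validate_annotation(example))  # []
```

Running this over every record before writing the JSON files would surface entries that lack a field, for example raw records that have no `video_id`.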

For the video data, we use ```opencv``` to downscale each video to 64x64 px, starting either from its native YouTube resolution or from the standardized 224x224 px format in TinyVIRAT.

---

In [None]:
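import json

# TinyVIRAT annotations (raw)
# Use context managers so both files are closed after loading
with open("tiny_train.json") as f:
    train = json.load(f)

with open("tiny_test.json") as f:
    test = json.load(f)

# The per-clip annotation records of each split live under the 'tubes' key
newTrain = train['tubes']
newTest = test['tubes']

# Keep only the first label, cast the ID to a string, and record the
# target frame size. The dicts are modified in place, so newTrain and
# newTest both see the changes.
for item in newTrain + newTest:
    item['label'] = item['label'][0]
    item['id'] = str(item['id'])
    item['dim'] = [64, 64]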

In [62]:
len(newTrain)

7663
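
To make the effect of the flattening step concrete, here is a minimal sketch applied to a single made-up record. `flatten_tube` is a hypothetical helper, and the sample ID, path, and label names are invented for illustration; only the three field updates mirror the cell above.

```python
import copy


def flatten_tube(tube, dim=(64, 64)):
    """Return a copy of one raw record in the composite annotation format."""
    flat = copy.deepcopy(tube)
    flat["label"] = tube["label"][0]  # keep only the first listed label
    flat["id"] = str(tube["id"])      # IDs are stored as strings
    flat["dim"] = list(dim)           # target frame size after resizing
    return flat


# Invented sample record; real TinyVIRAT entries may carry other fields.
raw = {
    "id": 17,
    "path": "train/example/17.mp4",
    "label": ["activity_walking", "activity_carrying"],
}
print(flatten_tube(raw))
```

One consequence of indexing with `[0]` is that any additional labels on a clip are dropped, so the composite dataset treats every clip as single-label.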

In [None]:
import cv2
import os

# Reducing image resolution with OpenCV
# https://www.geeksforgeeks.org/how-to-change-video-resolution-in-opencv-in-python/
dataPath = "compositeDataset"
newDim = [64,64]

for dirName in os.listdir(dataPath):
    subPath = os.path.join(dataPath, dirName)
    for fname in os.listdir(subPath):
        if fname.endswith(".mp4"):
            fPath = os.path.join(subPath, fname)
            vidcap = cv2.VideoCapture(fPath)
            success, image = vidcap.read()
            while success:
                # Resize the current frame before reading the next one, so a
                # failed read (image is None) is never passed to cv2.resize.
                # Note: this cell does not write the resized frame anywhere.
                resize = cv2.resize(image, (newDim[0], newDim[1]))
                success, image = vidcap.read()
            vidcap.release()
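
The cell above resizes each frame but does not show how the result is saved. One possible way to persist the downscaled clips is sketched below with `cv2.VideoWriter`. The function names, the `mp4v` codec, the `INTER_AREA` interpolation, and the 30 FPS fallback are all assumptions for illustration, not the notebook's actual settings.

```python
import os


def resized_output_path(src_path, src_root, dst_root):
    """Mirror src_path (located under src_root) into dst_root."""
    rel = os.path.relpath(src_path, src_root)
    return os.path.join(dst_root, rel)


def resize_video(src_path, dst_path, size=(64, 64)):
    """Write a copy of src_path with every frame resized to (width, height)."""
    import cv2  # imported here so the path helper works without OpenCV

    cap = cv2.VideoCapture(src_path)
    # CAP_PROP_FPS is 0.0 when unknown; fall back to an assumed 30 FPS
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    os.makedirs(os.path.dirname(dst_path) or ".", exist_ok=True)
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # assumed codec
    writer = cv2.VideoWriter(dst_path, fourcc, fps, size)
    frames = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of stream or unreadable file
                break
            small = cv2.resize(frame, size, interpolation=cv2.INTER_AREA)
            writer.write(small)
            frames += 1
    finally:
        cap.release()
        writer.release()
    return frames  # number of frames written
```

Inside the loop above, a call such as `resize_video(fPath, resized_output_path(fPath, dataPath, "compositeDataset64"))` would write each clip to a parallel directory tree, where `"compositeDataset64"` is a hypothetical output folder name.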

In [64]:
# Kinetics annotations (raw)

dataPath = "kinetics200"

# Each subdirectory of dataPath is named after an action class
for dirName in os.listdir(dataPath):
    subPath = os.path.join(dataPath, dirName)
    for fname in os.listdir(subPath):
        if fname.endswith(".mp4"):
            s = dict()
            s['id'] = fname.replace(".mp4", "")  # file name without extension
            s['video_id'] = str(dirName)          # set to the class directory name
            s['path'] = dirName + "/" + fname     # path relative to dataPath
            s['dim'] = [64,64]
            s['label'] = dirName                  # class label is the directory name

            newTrain.append(s)

In [65]:
# example JSON annotation
newTrain[-1]

{'id': 'mBIgVnm995E',
 'video_id': 'counting money',
 'path': 'counting money/mBIgVnm995E.mp4',
 'dim': [64, 64],
 'label': 'counting money'}

In [68]:
# Total number of samples
len(newTrain) + len(newTest)

24927
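
The number of action classes can be checked in the same way as the sample count. The sketch below tallies labels across both splits with `collections.Counter`. `summarize_labels` is a hypothetical helper and the toy records are invented; with the real data it would be called as `summarize_labels(newTrain, newTest)`.

```python
from collections import Counter


def summarize_labels(*splits):
    """Return (number of classes, number of videos, per-label counts)."""
    counts = Counter(rec["label"] for split in splits for rec in split)
    return len(counts), sum(counts.values()), counts


# Toy records standing in for newTrain / newTest
toy_train = [
    {"label": "counting money"},
    {"label": "counting money"},
    {"label": "archery"},
]
toy_test = [{"label": "archery"}]

n_classes, n_videos, counts = summarize_labels(toy_train, toy_test)
print(n_classes, n_videos)  # 2 4
```

The per-label counts also make it easy to spot classes with very few clips before training.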

In [67]:
with open('train.json', 'w') as fp:
    json.dump(newTrain, fp)

In [None]:
with open('test.json', 'w') as fp:
    json.dump(newTest, fp)
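
After writing the files, a round-trip check confirms that the JSON loads back unchanged. This is a minimal sketch that writes to a temporary directory so it does not touch the real `train.json`. `load_annotations` is a hypothetical helper, and the single record is the example annotation shown earlier.

```python
import json
import os
import tempfile


def load_annotations(path):
    """Load a list of annotation records from a JSON file."""
    with open(path) as fp:
        return json.load(fp)


records = [
    {
        "id": "mBIgVnm995E",
        "video_id": "counting money",
        "path": "counting money/mBIgVnm995E.mp4",
        "dim": [64, 64],
        "label": "counting money",
    }
]

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "train.json")
    with open(out_path, "w") as fp:
        json.dump(records, fp)
    reloaded = load_annotations(out_path)

print(reloaded == records)  # True
```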