# Automated Visual Weak Supervision for Object Recognition in Videos

In [33]:
%load_ext autoreload
%autoreload 2

import sys
import warnings
if not sys.warnoptions:
    warnings.simplefilter("ignore")

from process_tubes import *
from reef_label_tubes import reef_label
import ClassifierLoader
from tube_classifier import *

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## 1. Split the Dataset

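Here, we split our dataset into an unlabeled train set (30 objects), a labeled validation set (300), and a labeled test set (70). The cell below runs on a reduced split of 3 labeled, 10 unlabeled, and 7 test objects; the 3 labeled objects, which the printout calls "labeled train examples", serve as the validation set in the following steps. Each object maps to an action tube consisting of 30 frames.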

In [49]:
labeled_vehicles, unlabeled_vehicles, test_vehicles = get_objects((3, 10, 7))

print(labeled_vehicles.shape[0], 'labeled train examples')
print(unlabeled_vehicles.shape[0], 'unlabeled train examples')
print(test_vehicles.shape[0], 'labeled test examples')

3 labeled train examples
10 unlabeled train examples
7 labeled test examples
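
The implementation of `get_objects` is not shown in this notebook. As a rough sketch of what a three-way split like the one above could look like, the snippet below shuffles a pool of object ids and cuts it into disjoint groups. The helper name `split_objects`, the `seed` argument, and the pool of 5000 ids are assumptions made for illustration only.

```python
import numpy as np


def split_objects(object_ids, sizes, seed=0):
    """Hypothetical helper: shuffle ids and cut them into three disjoint groups."""
    n_labeled, n_unlabeled, n_test = sizes
    total = n_labeled + n_unlabeled + n_test
    if total > len(object_ids):
        raise ValueError("requested more objects than are available")

    # Shuffle once so that the three groups cannot overlap.
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(object_ids))

    labeled = shuffled[:n_labeled]
    unlabeled = shuffled[n_labeled:n_labeled + n_unlabeled]
    test = shuffled[n_labeled + n_unlabeled:total]
    return labeled, unlabeled, test


# Assumed pool of 5000 object ids, split with the same sizes as the cell above.
labeled, unlabeled, test = split_objects(range(5000), (3, 10, 7))
print(labeled.shape[0], unlabeled.shape[0], test.shape[0])
```

Fixing the seed keeps the split reproducible across runs, which matters when the validation set is as small as it is here.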


## 2. Create Action Tubes for Weak Supervision

In this step, we read in our frames, encode each frame with our optimized autoencoder, and concatenate the encodings to make action tubes.

In [50]:
val_primitive_matrix, val_tubes = tube_loader(labeled_vehicles, label = True)
val_ground = [label for tube in val_tubes for label in tube.tube_true_labels]

Producing action tubes...
Creating action tube for object index 3899
Creating action tube for object index 3760
Creating action tube for object index 1434


In [51]:
train_primitive_matrix, train_tubes = tube_loader(unlabeled_vehicles)

Using device: cuda
Producing action tubes...
Creating action tube for object index 927
Creating action tube for object index 1087
Creating action tube for object index 4009
Creating action tube for object index 3872
Creating action tube for object index 3642
Creating action tube for object index 1485
Creating action tube for object index 2136
Creating action tube for object index 1179
Creating action tube for object index 2162
Creating action tube for object index 931
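
To make the tube construction more concrete, here is a minimal sketch of stacking per-frame encodings into a tube and then stacking tubes into a frame-level matrix. It is not the code behind `tube_loader`: `encode_frame` is a placeholder that mean-pools pixels instead of calling the autoencoder, and the 16x16 frame size, the code size of 8, and the one-row-per-frame matrix layout are all assumptions.

```python
import numpy as np

FRAMES_PER_TUBE = 30  # tube length stated in Section 1


def encode_frame(frame, code_size=8):
    """Placeholder encoder (NOT the notebook's autoencoder): mean-pool pixels into code_size bins."""
    flat = frame.astype(np.float32).ravel()
    bins = np.array_split(flat, code_size)
    return np.array([b.mean() for b in bins], dtype=np.float32)


def build_tube(frames):
    """Encode every frame and stack the codes into a (n_frames, code_size) array."""
    return np.stack([encode_frame(f) for f in frames])


def build_primitive_matrix(tubes):
    """Concatenate tubes along the frame axis: one row per frame (assumed layout)."""
    return np.concatenate(tubes, axis=0)


# Two synthetic objects with 30 random 16x16 frames each.
rng = np.random.default_rng(0)
frames_a = rng.random((FRAMES_PER_TUBE, 16, 16))
frames_b = rng.random((FRAMES_PER_TUBE, 16, 16))

tube_a, tube_b = build_tube(frames_a), build_tube(frames_b)
primitive_matrix = build_primitive_matrix([tube_a, tube_b])
print(tube_a.shape, primitive_matrix.shape)
```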


## 3. Apply Visual Weak Supervision

To perform visual weak supervision, we pass the frame encodings from the unlabeled train set and the labeled validation set into Reef. Because we have already evaluated various weak supervision methods, the `train_ground` parameter is not used here. Please refer to the `generate*.ipynb` notebooks for those experiments.

In [52]:
unlabeled_frame_nums = [frame_num for tube in train_tubes for frame_num in tube.sampled_frames]
weak_train = reef_label(train_primitive_matrix, val_primitive_matrix, val_ground, None, unlabeled_frame_nums)

In [53]:
print(weak_train)

Using device: cuda
{58470.0: 0.00291669906996321, 58524.0: 0.00291669906996321, 58462.0: 0.00291669906996321, 58464.0: 0.00291669906996321, 58509.0: 0.00291669906996321, 58540.0: 0.00291669906996321, 58503.0: 0.00291669906996321, 58445.0: 0.00291669906996321, 58521.0: 0.00291669906996321, 58453.0: 0.00291669906996321, 58523.0: 0.00291669906996321, 58463.0: 0.00291669906996321, 58458.0: 0.00291669906996321, 58478.0: 0.00291669906996321, 58485.0: 0.00291669906996321, 58467.0: 0.00291669906996321, 58459.0: 0.00291669906996321, 58483.0: 0.00291669906996321, 58514.0: 0.00291669906996321, 58522.0: 0.00291669906996321, 58533.0: 0.00291669906996321, 58456.0: 0.00291669906996321, 58499.0: 0.00291669906996321, 58472.0: 0.00291669906996321, 58544.0: 0.00291669906996321, 58520.0: 0.00291669906996321, 58479.0: 0.00291669906996321, 58548.0: 0.00291669906996321, 58516.0: 0.00291669906996321, 58489.0: 0.00291669906996321, 67170.0: 0.00291669906996321, 67160.0: 0.00291669906996321, 67242.0: 0.002916699

## 4. Weakly Label Training Tubes

In the previous step, we generated probabilistic weak labels for each frame. Here, we aggregate the frame-level labels within each tube and assign the tube the majority label. This step is necessary for the tube-level classification that follows.

In [54]:
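# Assign each unlabeled tube the majority of its frame-level weak labels.
for tube in train_tubes:
    tube.assign_label(weak_train)
    print(tube.pred_vehicle, tube.true_vehicle)

# Map a validation ground-truth label of -1 to 0 and any other value to 1.
for tube in val_tubes:
    tube.true_vehicle = 0 if tube.true_vehicle == -1 else 1
    print(tube.true_vehicle)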

0 None
0 None
0 None
0 None
0 None
0 None
0 None
0 None
0 None
0 None
0
0
0
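
The body of `assign_label` is not shown here. The majority vote described in this section can be sketched as follows, assuming the weak labels are a dictionary from frame number to the probability of class 1, that each frame is binarized at a threshold of 0.5, and that ties fall back to class 0. The function name `majority_label`, the threshold, and the tie rule are assumptions.

```python
def majority_label(frame_probs, sampled_frames, threshold=0.5):
    """Binarize each frame's probability, then return the majority class (ties go to 0)."""
    votes = [1 if frame_probs[f] >= threshold else 0 for f in sampled_frames]
    positives = sum(votes)
    # Class 1 wins only with a strict majority of the frames.
    return 1 if positives * 2 > len(votes) else 0


# Frame probabilities shaped like the Reef output above (values partly made up).
weak = {58470.0: 0.0029, 58524.0: 0.0029, 58462.0: 0.0029}
print(majority_label(weak, [58470.0, 58524.0, 58462.0]))

# A tube where two of three frames lean towards class 1.
mixed = {1: 0.9, 2: 0.8, 3: 0.1}
print(majority_label(mixed, [1, 2, 3]))
```

Under these assumptions, a tube whose frames all carry a probability near 0.003 receives label 0, which matches the predictions printed above.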


## 5. Run Classifier

Here, we demonstrate hyperparameter tuning with our final 3D-CNN classifier, trained on our weakly labeled tubes. This classifier assigns a given tube either a 1 (for car) or a 0 (for truck). We have conducted several experiments that vary the volume of weakly labeled data passed into the classifier and monitor the relative performance.
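
The internals of `tune` are not shown in this notebook. As a hedged illustration of a learning-rate sweep in general, the sketch below runs one training job per candidate rate and keeps the rate with the lowest final loss. The toy objective in `train_once` stands in for the 3D-CNN, and the names `train_once` and `sweep` as well as the candidate rates are assumptions.

```python
def train_once(learning_rate, epochs=10):
    """Stand-in for one training run: gradient descent on f(w) = (w - 3)^2."""
    w = 0.0
    for _ in range(epochs):
        grad = 2.0 * (w - 3.0)
        w -= learning_rate * grad
    return (w - 3.0) ** 2  # final loss after the last epoch


def sweep(learning_rates, train_fn):
    """Run train_fn once per learning rate and return the best rate with all losses."""
    results = {lr: train_fn(lr) for lr in learning_rates}
    best_lr = min(results, key=results.get)
    return best_lr, results


best_lr, results = sweep([0.001, 0.01, 0.1], train_once)
for lr, loss in results.items():
    print(f"Learning Rate = {lr:.5f}, loss = {loss:.4f}")
print("Best learning rate:", best_lr)
```

With only a handful of validation tubes, the final loss is a noisy selection criterion, so a sweep like this is best read as a smoke test rather than a reliable model comparison.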

In [55]:
# val_data = ClassifierLoader.ClassifierLoader(val_primitive_matrix, val_tubes, 'val')
# val_data[0]

# train_data = ClassifierLoader.ClassifierLoader(train_primitive_matrix, train_tubes, 'train')
# train_data[0]

tune(train_primitive_matrix, train_tubes, val_primitive_matrix, val_tubes)

Epoch 0
Epoch Loss = 0.69520
Classified 3 / 3 correctly (100.00)
Epoch 1
Epoch Loss = 0.56109
Classified 3 / 3 correctly (100.00)
Epoch 2
Epoch Loss = 0.31614
Classified 3 / 3 correctly (100.00)
Epoch 3
Epoch Loss = 0.32248
Classified 3 / 3 correctly (100.00)
Epoch 4
Epoch Loss = 0.31326
Classified 3 / 3 correctly (100.00)
Epoch 5
Epoch Loss = 0.31326
Classified 3 / 3 correctly (100.00)
Epoch 6
Epoch Loss = 0.31326
Classified 3 / 3 correctly (100.00)
Epoch 7
Epoch Loss = 0.31326
Classified 3 / 3 correctly (100.00)
Epoch 8
Epoch Loss = 0.31326
Classified 3 / 3 correctly (100.00)
Epoch 9
Epoch Loss = 0.31326
Classified 3 / 3 correctly (100.00)
Final Loss = 0.31326
Learning Rate = 0.01000, loss = 0.3133
Epoch 0
Epoch Loss = 0.83397
Classified 0 / 3 correctly (0.00)
Epoch 1
Epoch Loss = 0.80788
Classified 3 / 3 correctly (100.00)
Epoch 2
Epoch Loss = 0.63181
Classified 3 / 3 correctly (100.00)
Epoch 3
Epoch Loss = 0.54802
Classified 3 / 3 correctly (100.00)
Epoch 4
Epoch Loss = 0.48217
Cla