# Prediction

This notebook shows off the prediction methods we use in this project, in addition to the evaluation scheme.

In [1]:
import sys, os

import plotly
plotly.offline.init_notebook_mode(connected=True)

sys.path.append('..')
import planet.predict, planet.util

import sklearn.metrics
import numpy
import tqdm

data_dir = '../data'

First, we load all the label data.

In [2]:
all_tags = planet.util.read_tags(os.path.join(data_dir, 'train_v2.csv'))
tag_indices = planet.util.get_tag_indices(all_tags)
all_labels = planet.util.tags_to_labels(all_tags, tag_indices)
(num_all, num_labels) = all_labels.shape

## Random Classifier

In order to establish a baseline for performance, we first use a classifier that assigns labels at random by flipping an unbiased coin for each label.

In [3]:
pred_labels_rand = planet.predict.random(num_all, num_labels)

rand_fig = planet.predict.make_scores_plot(pred_labels_rand, all_labels, tag_indices.keys(), 'Random')
plotly.offline.iplot(rand_fig, filename=os.path.join(data_dir, 'rand_scores.html'))

First, note that the recall (`tp / (tp + fn)`) of this classifier is roughly `0.5` because both the number of true positives (`tp`) and the number of false negatives (`fn`) should be about half the number of positive labels (`p/2`). Also note that there is a bit more fluctuation for rarer labels like *conventional_mine*. The average shown here, and elsewhere in these analyses, is a micro-average: it is computed from the total `tp` and `fn` counts pooled across all samples and labels.
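To make the pooling concrete, it can be contrasted with a per-label (macro) average. This is a minimal sketch on made-up data, assuming binary label matrices of shape `(num_samples, num_labels)`; the helpers `micro_recall` and `macro_recall` are illustrative names and are not part of the `planet` package.

```python
import numpy

def micro_recall(pred_labels, true_labels):
    """Pool tp and fn over all samples and labels, then divide once."""
    pred = numpy.asarray(pred_labels, dtype=bool)
    true = numpy.asarray(true_labels, dtype=bool)
    tp = numpy.sum(pred & true)
    fn = numpy.sum(~pred & true)
    return tp / (tp + fn)

def macro_recall(pred_labels, true_labels):
    """Compute recall per label, then take the unweighted mean."""
    pred = numpy.asarray(pred_labels, dtype=bool)
    true = numpy.asarray(true_labels, dtype=bool)
    tp = numpy.sum(pred & true, axis=0)
    fn = numpy.sum(~pred & true, axis=0)
    return numpy.mean(tp / (tp + fn))

# Toy data (assumed): label 0 is common (4 positives), label 1 is rare (1 positive).
toy_true = numpy.array([[1, 0], [1, 0], [1, 0], [1, 1]])
toy_pred = numpy.array([[1, 0], [1, 0], [1, 0], [0, 0]])

print(micro_recall(toy_pred, toy_true))  # 3 of the 5 pooled positives are found
print(macro_recall(toy_pred, toy_true))  # mean of 3/4 and 0/1
```

In the pooled version a frequent label contributes more counts, so it has more influence on the average than a rare one.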

Next, note that the precision (`tp / (tp + fp)`) of this classifier roughly follows the empirical distribution of the labels (see the `Data Exploration` notebook for comparison). That's because the number of false positives should be roughly half the number of negative labels (`n/2`), leading to a precision of `(p/2) / (p/2 + n/2) = p / (p + n)`, which is the empirical probability of the label.
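The argument can be checked with a quick simulation. This sketch does not use the project data: the label frequencies in `sim_probs` are invented, and the true labels are drawn independently per label, which is an assumption made only for illustration.

```python
import numpy

rng = numpy.random.default_rng(0)  # fixed seed for repeatability

# Hypothetical label frequencies; the real ones would come from all_labels.
sim_probs = numpy.array([0.9, 0.3, 0.05])
sim_num_samples = 200_000

# Synthetic ground truth and unbiased coin-flip predictions.
sim_true = rng.random((sim_num_samples, sim_probs.size)) < sim_probs
sim_pred = rng.random(sim_true.shape) < 0.5

tp = numpy.sum(sim_pred & sim_true, axis=0)
fp = numpy.sum(sim_pred & ~sim_true, axis=0)
fn = numpy.sum(~sim_pred & sim_true, axis=0)

sim_precision = tp / (tp + fp)  # expected to track sim_probs
sim_recall = tp / (tp + fn)     # expected to sit near 0.5

print(numpy.round(sim_precision, 2))
print(numpy.round(sim_recall, 2))
```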

Finally, note that the F2 score of this classifier is a little closer to the recall than to the precision, which is expected because it is a weighted harmonic mean of recall and precision that weights recall more heavily.
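For reference, the F-beta score is conventionally defined as `(1 + beta^2) * P * R / (beta^2 * P + R)`, with `beta = 2` for F2. The sketch below is a standalone illustration with made-up inputs; `fbeta_from_scores` is a hypothetical helper and is not the implementation behind `planet.predict.f2_score`.

```python
def fbeta_from_scores(precision, recall, beta=2.0):
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta = 1 gives the usual F1 score.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # convention chosen here to avoid dividing by zero
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative values, not results from this notebook.
example_precision, example_recall = 0.1, 0.5
f1 = fbeta_from_scores(example_precision, example_recall, beta=1.0)
f2 = fbeta_from_scores(example_precision, example_recall, beta=2.0)
print(f1, f2)
```

With these inputs F2 lies between F1 and the recall, which matches the pull toward recall described above.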

## Empirical Random Classifier

Rather than predicting each label with probability 0.5, we can predict it with its empirical probability, estimated as the fraction of images that carry the label.

In [4]:
label_probs = numpy.mean(all_labels, axis=0, keepdims=True)
pred_labels_emp_rand = planet.predict.empirical_random(num_all, label_probs)

emp_rand_fig = planet.predict.make_scores_plot(pred_labels_emp_rand, all_labels, tag_indices.keys(), 'Empirical Random')
plotly.offline.iplot(emp_rand_fig, filename=os.path.join(data_dir, 'emp_rand_scores.html'))
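The source of `planet.predict.empirical_random` is not shown here. One plausible way to implement such a classifier, offered only as a sketch, is to compare uniform random draws against the per-label probabilities. The function name `empirical_random_sketch`, its `seed` parameter, and the example frequencies are assumptions.

```python
import numpy

def empirical_random_sketch(num_samples, label_probs, seed=None):
    """Predict label j as positive with probability label_probs[..., j]."""
    rng = numpy.random.default_rng(seed)
    label_probs = numpy.asarray(label_probs, dtype=float)
    draws = rng.random((num_samples, label_probs.shape[-1]))
    # Broadcasting compares every row of draws against the per-label threshold.
    return (draws < label_probs).astype(int)

# Hypothetical frequencies with the same (1, num_labels) shape as label_probs above.
example_probs = numpy.array([[0.9, 0.3, 0.05]])
example_pred = empirical_random_sketch(100_000, example_probs, seed=0)
print(example_pred.mean(axis=0))  # column means should land near example_probs
```

Replacing `label_probs` with a constant `0.5` in this sketch gives the unbiased coin-flip baseline from the previous section.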

Introducing these probabilities increases the average recall a little, increases the average precision a lot, and balances the two scores. Note that even though recall decreased for many labels, it increased overall because some labels occur much more frequently than others, so predicting those labels more often boosts the pooled score.
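A short expected-value calculation shows why the pooled scores behave this way. If label `j` occurs with frequency `p_j` and is predicted independently with probability `q_j`, the expected true positives per sample are `p_j * q_j`. The sketch below evaluates this for invented frequencies; `expected_micro_scores` is an illustrative helper, and the independence of predictions and labels is an assumption.

```python
import numpy

def expected_micro_scores(label_probs, predict_probs):
    """Expected micro precision and recall for independent random guessing.

    label_probs[j]   -- fraction of samples where label j is present
    predict_probs[j] -- probability that the classifier predicts label j
    """
    label_probs = numpy.asarray(label_probs, dtype=float)
    predict_probs = numpy.broadcast_to(
        numpy.asarray(predict_probs, dtype=float), label_probs.shape)
    tp = numpy.sum(label_probs * predict_probs)  # expected true positives per sample
    precision = tp / numpy.sum(predict_probs)    # tp / predicted positives
    recall = tp / numpy.sum(label_probs)         # tp / actual positives
    return precision, recall

# Hypothetical frequencies: one common label and two rarer ones.
freqs = numpy.array([0.9, 0.3, 0.05])

coin_precision, coin_recall = expected_micro_scores(freqs, 0.5)
emp_precision, emp_recall = expected_micro_scores(freqs, freqs)
print(coin_precision, coin_recall)
print(emp_precision, emp_recall)
```

When `q = p`, predicted and actual positives have the same expected count, so precision and recall coincide. Both equal `sum(p_j^2) / sum(p_j)`, which is dominated by the most frequent labels.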

## Nearest Neighbors Classifier

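One of the simplest supervised learning methods is a nearest neighbors classifier. Here we try it on a small subset of the data (300 training and 100 test images, downsampled to 32x32 pixels) and sweep the number of neighbors `k` from 1 to 31, scoring each setting by its micro-averaged F2 score.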
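The internals of `planet.predict.NearestNeighbors` are not shown in this notebook. As a rough sketch of how a multi-label nearest neighbors classifier can work, the code below uses squared Euclidean distance and a per-label majority vote. Both choices, and the name `knn_predict_sketch`, are assumptions. It builds a full distance tensor, so it is only suitable for small arrays.

```python
import numpy

def knn_predict_sketch(train_features, train_labels, test_features, num_neighbors):
    """Multi-label k-NN: a label is predicted when most neighbors carry it."""
    train_features = numpy.asarray(train_features, dtype=float)
    test_features = numpy.asarray(test_features, dtype=float)
    train_labels = numpy.asarray(train_labels)

    # Squared Euclidean distance between every test row and every train row.
    diffs = test_features[:, None, :] - train_features[None, :, :]
    dists = numpy.sum(diffs ** 2, axis=2)

    # Indices of the k closest training rows for each test row.
    nearest = numpy.argsort(dists, axis=1)[:, :num_neighbors]

    # Fraction of neighbors carrying each label, thresholded at one half.
    votes = train_labels[nearest].mean(axis=1)
    return (votes > 0.5).astype(int)

# Tiny made-up example: two clusters with different label sets.
toy_train_x = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
toy_train_y = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
toy_test_x = [[0.05, 0.05], [0.95, 0.95]]

print(knn_predict_sketch(toy_train_x, toy_train_y, toy_test_x, num_neighbors=3))
```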

In [31]:
#import importlib
#importlib.reload(planet.util)
#importlib.reload(planet.predict)

#num_test = int(num_all / 4)
#num_train = num_all - num_test
num_test = 100
num_train = 300
num_images = num_train + num_test
image_names = list(all_tags.keys())[0:num_images]

image_size = (32, 32)
image_dir = os.path.join(data_dir, 'train-jpg')
all_images = planet.util.read_images(image_dir, image_names, out_size=image_size)
print('Loaded all images.')
num_samples = all_images[0, :, :, :].size
print(all_images.shape)

train_labels = all_labels[0:num_train, :]
train_images = all_images[0:num_train, :, :, :]
train_images_flat = train_images.reshape((num_train, num_samples))

test_labels = all_labels[num_train:num_images, :]
test_images = all_images[num_train:num_images, :, :, :]
test_images_flat = test_images.reshape((num_test, num_samples))

nbors = range(1, 32)
num_nbors = len(nbors)
nnbor_f2_scores = []
# Wrapping the iterable lets tqdm advance the progress bar on each iteration.
for k in tqdm.tqdm(nbors):
    nnbor_classifier = planet.predict.NearestNeighbors(train_labels, train_images_flat, num_neighbors=k)
    nnbor_pred_labels = nnbor_classifier.predict(test_images_flat)
    nnbor_f2_scores.append(planet.predict.f2_score(nnbor_pred_labels, test_labels, 'micro'))
        
knn_scores_fig = planet.util.make_bar_plot(nbors, nnbor_f2_scores, 'F2 Scores of KNN')
plotly.offline.iplot(knn_scores_fig, filename=os.path.join(data_dir, 'knn_f2_scores.html'))

Loaded all images.
(400, 32, 32, 3)
100%|██████████| 31/31 [00:04<00:00,  7.31it/s]

In [32]:
nnbor_classifier = planet.predict.NearestNeighbors(train_labels, train_images_flat, num_neighbors=12)
pred_labels_knn = nnbor_classifier.predict(test_images_flat)

knn_fig = planet.predict.plot_scores(pred_labels_knn, test_labels, tag_indices.keys(), '12 Nearest Neighbors')
plotly.offline.iplot(knn_fig, filename=os.path.join(data_dir, 'knn_scores.html'))


Recall is ill-defined and being set to 0.0 in labels with no true samples.
Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
F-score is ill-defined and being set to 0.0 in labels with no true samples.



[0.55594405594405594, 0.88826815642458101, 0.60090702947845798]
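
With only 100 test images, a rare label can be absent from the test split or never predicted, which leaves a zero denominator in its recall or precision. The warnings above report that the score is set to 0.0 in that case. A minimal sketch of the same convention for recall, on made-up data, using an illustrative helper `per_label_recall` that is not part of the `planet` package:

```python
import numpy

def per_label_recall(pred_labels, true_labels):
    """Recall per label, with 0.0 where a label has no true samples."""
    pred = numpy.asarray(pred_labels, dtype=bool)
    true = numpy.asarray(true_labels, dtype=bool)
    tp = numpy.sum(pred & true, axis=0).astype(float)
    positives = numpy.sum(true, axis=0).astype(float)  # equals tp + fn
    recall = numpy.zeros_like(tp)
    # Divide only where the denominator is non-zero; elsewhere keep 0.0.
    numpy.divide(tp, positives, out=recall, where=positives > 0)
    return recall

# Made-up test split in which label 1 never occurs.
split_true = numpy.array([[1, 0], [1, 0], [0, 0]])
split_pred = numpy.array([[1, 0], [0, 1], [0, 0]])
print(per_label_recall(split_pred, split_true))
```

Using a larger test split makes it less likely that a label has no true or predicted samples.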