# Areal Project

<div>
<img src="logo.jpg" width=150 ALIGN="left" border="20">
<h1> Starting Kit for raw data (images)</h1>
<br>This code was tested with <br>
Python 3.6.7 <br>
Created by Areal Team <br><br>
ALL INFORMATION, SOFTWARE, DOCUMENTATION, AND DATA ARE PROVIDED "AS-IS". The CDS, CHALEARN, AND/OR OTHER ORGANIZERS OR CODE AUTHORS DISCLAIM ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ANY PARTICULAR PURPOSE, AND THE WARRANTY OF NON-INFRIGEMENT OF ANY THIRD PARTY'S INTELLECTUAL PROPERTY RIGHTS. IN NO EVENT SHALL AUTHORS AND ORGANIZERS BE LIABLE FOR ANY SPECIAL, 
INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF SOFTWARE, DOCUMENTS, MATERIALS, PUBLICATIONS, OR INFORMATION MADE AVAILABLE FOR THE CHALLENGE. 
</div>

<div>
    <h2>Introduction </h2>
     <br>
Aerial imagery has been a primary source of geographic data for quite a long time. With technology progress, aerial imagery became really practical for remote sensing : the science of obtaining information about an object, area or phenomenon.
Nowadays, there are many uses of image recognition spanning from robotics/drone vision to autonomous driving vehicules or face detection.
<br>
In this challenge, we will use pre-processed data, coming from landscape images. The goal is to learn to differentiate common and uncommon landscapes such as a beach, a lake or a meadow.
    Data comes from part of the data set (NWPU-RESISC45) originally used in <a href="https://arxiv.org/pdf/1703.00121.pdf?fbclid=IwAR16qo-EX_Z05ZpxvWG8F-oBU0SlnY-3BPCWBVVOGPyJcVy7BBqCKjnsvJo">Remote Sensing Image Scene Classification</a>. This data set contains 45 categories while we only kept 13 out of them.

References and credits: 
Yuliya Tarabalka, Guillaume Charpiat, Nicolas Girard for the data sets presentation.<br>
Gong Cheng, Junwei Han, and Xiaoqiang Lu, for the original article on the chosen data set.
</div>

### Requirements 

The next cell will install all the required dependencies on your computer. You should consider replacing pip with pip3 if pip is related to python2.7 on your computer, or comment it if you already have the dependencies/are running in the docker of the challenge (runnable with the name areal/codalab:pytorch if you know how to run a docker).

In [1]:
#!pip install --user -r requirements.txt

In [2]:
import numpy as np
import random
import re

In [3]:
model_dir = "sample_code_submission"
result_dir = 'sample_result_submission/' 
problem_dir = 'ingestion_program/'  
score_dir = 'scoring_program/'

In [4]:
from sys import path; path.append(model_dir); path.append(problem_dir); path.append(score_dir);

Go through the challenge website and watch the trailer video.

#### Question 1: Briefly explain the problem.

We are to classify images of certain landscapes to 13 different classes(beach,chaparral,cloud,desert,forest,island,lake,meadow,mountain,river,sea,snowberg,wetland).

#### Question 2: What is the scoring metric used to evaluate submissions?

The accuracy that is used to evaluate submission is the percentage of rightly identified(prediction==solution) classes.

(1 - (error / len(solution)))

<div>
    <h1> Step 1: Exploratory data analysis </h1>
<p>
We provide sample_data with the starting kit, but to prepare your submission, you must fetch the public_data from the challenge website and point to it.
</div>

In [5]:
#data_dir = 'sample_data'
data_dir = '../public_data_tp5' # download "public_data" from the challenge website
data_name = 'Areal'

<h2 style="color:red " >Warning</h2>

<p style="font-style:italic"> In case you want to load the full data </p> 
Files being big, your computer needs to have enough space available in your RAM. It should take about 3-4GB while loading and 1.5GB in the end.

In [None]:
from ingestion_program.data_io import read_as_df
data = read_as_df(data_dir  + '/' + data_name)

In [None]:
data.shape

In [None]:
data.head()

In [None]:
# data.describe()

In [None]:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
%matplotlib inline

num_toshow = 6
fig, _axs = plt.subplots(nrows=2, ncols=3, figsize=(10,10))
fig.subplots_adjust(hspace=0.3)
axs = _axs.flatten()


for i in range(num_toshow):
    img = data.iloc[i].values[:-1].reshape(128,128,3)
    label = data.iloc[i].values[-1:]
    axs[i].set_title('Example of {}'.format(label))
    axs[i].imshow(img.astype(float) / 255)

plt.show()

We can use a library to extract features with different algorithms. Here we print the example pictures from before with the different extracted features on top of them. We can see that the algorithm espacially finds the edges of the pictures(dessert has no features on the bleak surfaces etc.)
To go through the pcitures use any key of the keyboard.

In [None]:
import cv2 as cv

for i in range(num_toshow):
    img = data.iloc[i].values[:-1].reshape(128,128,3)
    new_img = (img.astype(float) / 255)
    gray= cv.cvtColor(new_img.astype('float32'),cv.COLOR_BGR2GRAY)
    sift = cv.SIFT_create()
    image8bit = cv.normalize(gray, None, 0, 255, cv.NORM_MINMAX).astype('uint8')
    kp, des = sift.detectAndCompute(image8bit,None)
    img=cv.drawKeypoints(image8bit,kp,new_img)
    cv.imshow('sift_keypoints.jpg',img)
    cv.waitKey(0)

In [None]:
data.head()

In [None]:
#np.array(data)
#np.array(features)

In [None]:
print(data.iloc[:, -1:])
X = data.iloc[:, :-1]
y = data.iloc[:, -1:]

In [None]:
np.unique(data["target"])#.shape

In [None]:
data[data["target"]=="island"].shape

#### Code 1: compute statistics of the dataset.

* How many features?
* How many data points?
* How many classes?
* What is the most represented class?
* What is the least represented class?

In [None]:
#Features: 128*128*3
#Data points: 5200
#Classes: 13
#They are equally represented: all by 400 samples

# Step 2 : Building a predictive model

<h2 style="color:red " >Warning</h2>

<p style="font-style:italic"> In case you want to load the full data </p> 
This time, also, still make sure that your RAM has at least 2-3GB available.

In [6]:
from data_manager import DataManager
D = DataManager(data_name, data_dir, replace_missing=False, verbose=True)
print(D)

Info file found : /home/paavo/Saclay/OPT9/public_data_tp5/Areal_public.info
[+] Success in  0.00 sec
[+] Success in 61.83 sec
[+] Success in  0.02 sec
[+] Success in 22.63 sec
[+] Success in  0.00 sec
[+] Success in 26.99 sec
[+] Success in  0.00 sec
DataManager : Areal
info:
	usage = Sample dataset Areal data
	name = areal
	task = multiclass.classification
	target_type = Categorical
	feat_type = Numerical
	metric = accuracy
	time_budget = 12000
	feat_num = 49152
	target_num = 13
	label_num = 13
	train_num = 5200
	valid_num = 1950
	test_num = 1950
	has_categorical = 0
	has_missing = 0
	is_sparse = 0
	format = dense
data:
	X_train = array(5200, 49152)
	Y_train = array(5200, 1)
	X_valid = array(1950, 49152)
	Y_valid = array(0,)
	X_test = array(1950, 49152)
	Y_test = array(0,)
feat_type:	array(0,)
feat_idx:	array(0,)



In [7]:
X_train = D.data['X_train']
Y_train = D.data['Y_train']

### Processing

Basically, there are two approaches:

* Use raw data as input. This may be the good way to go with, for instance, deep learning models.
* Do feature engineering: process the data to create features. You can then use this features as the input of your classifier (Random forest, SVM, etc.). An example of feature is the number of blue pixel in the image. Feature extraction can also be done by a CNN.

In [48]:
import cv2 as cv
def extract_features(image, vector_size=32):
    try:
        #alg = cv.KAZE_create()
        #image = (image.astype(float) / 255)
        image = image.reshape(128,128,3)
        image= cv.cvtColor(image.astype('float32'),cv.COLOR_BGR2GRAY)
        #alg = cv.SIFT_create()
        alg = cv.KAZE_create()
        kps = alg.detect(image)
        # Sorting them based on keypoint response value(bigger is better)
        kps = sorted(kps, key=lambda x: -x.response)[:vector_size]
        kps, dsc = alg.compute(image, kps)
        dsc = dsc.flatten()
        needed_size = (vector_size * 64)
        if dsc.size < needed_size:
            dsc = np.concatenate([dsc, np.zeros(needed_size - dsc.size)])
    except cv.error as e:
        print('Error: ', e)
        return None
    return dsc

In [16]:
import mahotas

def fd_hu_moments(image):
    image = cv.cvtColor(image.reshape(128,128,3), cv.COLOR_BGR2GRAY)
    feature = cv.HuMoments(cv.moments(image)).flatten()
    return feature

def fd_haralick(image):    # convert the image to grayscale
    #image = (image.astype(float) / 255)
    gray = cv.cvtColor(image.reshape(128,128,3), cv.COLOR_BGR2GRAY)
    #print(gray)
    # compute the haralick texture feature vector
    haralick = mahotas.features.haralick(gray).mean(axis=0)
    return haralick
 
def fd_histogram(image, mask=None):
    # convert the image to HSV color-space
    image = cv.cvtColor(image.reshape(128,128,3), cv.COLOR_BGR2HSV)
    # compute the color histogram
    hist  = cv.calcHist([image], [0, 1, 2], None, [256], [0, 256, 0, 256, 0, 256])
    # normalize the histogram
    cv.normalize(hist, hist)
    return hist.flatten()

In [10]:
def create_features(images):
    kaze_features = np.array([extract_features(X) for X in images])
    hu_features = np.array([fd_hu_moments(X) for X in images])
    haralick_features = np.array([fd_haralick(X) for X in images])
    #hist_features = np.array([fd_histogram(X) for X in X_train])
    return np.hstack((kaze_features, hu_features,haralick_features))

In [None]:
from sklearn.metrics import confusion_matrix
con = confusion_matrix(preds, Y_train)
con

In [43]:
#split into training and validation set
X_train = D.data['X_train'][:1500, :]
X_validation = D.data['X_train'][1500:, :]
y_train = D.data['Y_train'][:1500]
y_validation = D.data['Y_train'][1500:]

In [49]:
clf = RandomForestClassifier(class_weight="balanced")
clf.fit(create_features(X_train), y_train)
Y_hat_train = clf.predict(create_features(X_train))
Y_hat_valid = clf.predict(create_features(X_validation))

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!=CV_8U) in function 'detectAndCompute'

Error:  OpenCV(4.4.0) /tmp/pip-req-build-99ib2vsi/opencv/modules/features2d/src/sift.dispatch.cpp:465: error: (-5:Bad argument) image is empty or has incorrect depth (!

ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)

In [45]:
scoring_function(y_validation, Y_hat_valid)

0.5248648648648648

In [36]:
cl = RandomForestClassifier(class_weight="balanced")
cl.fit(create_features(D.data['X_train']), D.data['Y_train'])
Y_hat_train = cl.predict(create_features(D.data['X_train']))
Y_hat_valid = cl.predict(create_features(D.data['X_valid']))
Y_hat_test = cl.predict(create_features(D.data['X_test']))

  cl.fit(create_features(D.data['X_train']), D.data['Y_train'])


In [None]:
scoring_function(D.data['Y_train'], Y_hat_train)

In [40]:
from sklearn.svm import SVC
scv = SVC()
scv.fit(create_features(D.data['X_train']), D.data['Y_train'])
Y_hat_train = scv.predict(create_features(D.data['X_train']))
Y_hat_valid = scv.predict(create_features(D.data['X_valid']))
Y_hat_test = scv.predict(create_features(D.data['X_test']))

  return f(**kwargs)


In [41]:
scoring_function(D.data['Y_train'], Y_hat_train)

0.3486538461538462

In [32]:
#cl.save(trained_model_name)                 
result_name = result_dir + data_name
from data_io import write
write(result_name + '_train.predict', Y_hat_train)
write(result_name + '_valid.predict', Y_hat_valid)
write(result_name + '_test.predict', Y_hat_test)
!ls $result_name*

sample_result_submission/Areal_test.predict
sample_result_submission/Areal_train.predict
sample_result_submission/Areal_valid.predict


In [None]:
Y_hat_valid = clf.predict(features)
con = confusion_matrix(preds, y_validation)
con

In [None]:
D.data['Y_valid'].shape

### Use of the baseline model

Using our BasicCNN model needs PyTorch libraries installed.

In case you have them but still encounter errors related to them, you should probably do an upgrade : 

    pip install -U torch

Our model is a simple implementation of a Convolutional Neural Network (CNN).

More information on CNN:
* [Convolutional neural network on Wikipedia](https://en.wikipedia.org/wiki/Convolutional_neural_network)
* [A Comprehensive Guide to Convolutional Neural Networks (blog)](https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53)

In [None]:
import gc

# del m.model_conv

gc.collect()

In [None]:
from model import BasicCNN, SimpleConvModel
import numpy as np

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch
torch.cuda.empty_cache()
import math
import numpy as np
import pickle
from torchvision import transforms
from sklearn.base import BaseEstimator
from sklearn.preprocessing import normalize
from PIL import Image
from os.path import isfile

def requires_grad(p):
    return p.requires_grad


class SimpleConvModel(nn.Module):
    
    def __init__(self, block, layers, num_classes=13):
        self.inplanes = 64
        super(SimpleConvModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.LeakyReLU(0.1, inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=1)
        self.avgpool = nn.AvgPool2d(7)
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)

        return x

class BasicCNN(BaseEstimator):
    def __init__(self, learning_rate=1e-3, nb_epoch = 5, batch_size = 48, verbose=False, use_cuda=False):
        super(BasicCNN, self).__init__()
        if learning_rate is None:
            learning_rate = 1e-3
        if nb_epoch is None:
            nb_epoch = 10
        if batch_size is None:
            batch_size = 32
        if verbose is None:
            verbose = False
        if use_cuda is None:
            use_cuda = False
        self.nb_epoch = nb_epoch
        self.batch_size = batch_size
        self.verbose = verbose
        self.use_cuda = use_cuda
        self.model_conv = SimpleConvModel(Bottleneck, [2,2,2,2])
        # Loss function
        self.criterion = nn.CrossEntropyLoss()
        if self.use_cuda:
            self.model_conv.cuda()
            self.criterion.cuda()
        # Optimizer
        self.optim = optim.Adagrad(self.model_conv.parameters(), lr=1e-3, weight_decay=0.05)

    def fit(self, X, Y):
        '''
            param X: numpy.ndarray
                shape = (num_sample, C * W * H)
                with C = 3, W = H = 128
            param Y: numpy.ndarray
                shape = (num_sample, 1)
        '''
        X = self.process_data(X)
        Y = self.process_label(Y)
        self.model_conv.train()
        nb_batch = int(X.shape[0] / self.batch_size)
        for e in range(self.nb_epoch):
            sum_loss = 0
            for i in range(nb_batch):
                print(i, 'out of', nb_batch)
                self.optim.zero_grad()
                beg = i * self.batch_size
                end = min(X.shape[0], (i + 1) * self.batch_size)
                x = X[beg:end]
                y = Y[beg:end]
                if self.use_cuda:
                    x, y = x.cuda(), y.cuda()
#                     print(x.isnan().any(), y.isnan().any(), np.isnan(x.cpu().numpy()).any())
                out = self.model_conv(x)
                loss = self.criterion(out, y)
                del x
                del y
                loss.backward()
                self.optim.step()
#                 print(loss.item())
                sum_loss += loss.item()
            sum_loss /= nb_batch
            if self.verbose:
                print("Epoch %d : loss = %f" % (e, sum_loss))

    def process_data(self, X):
        n_sample = X.shape[0]
        mean = np.mean(X, axis=1)[:, np.newaxis]
        std = np.std(X, axis=1)[:, np.newaxis]
        X = (X - mean) / (std+1e-8)
        X = X.reshape(n_sample, 3, 128, 128)
        X = X.astype(np.float)# / 255.
        #print(X[0])
        isnan = np.isnan(X).any()
        if isnan:
            raise Exception()

        return torch.Tensor(X)

    def process_label(self, y):
        res = torch.zeros(1)
        for i in range(y.shape[0]):
            l = torch.Tensor([y[i,0]])
            res = torch.cat((res, l))
        return res[1:].type(torch.long)
    
    #def predict(self, X):
      #  self.model_conv.eval()
      #  X = self.process_data(X)
      #  if self.use_cuda:
      #      X = X.cuda()
       # pred = self.model_conv(X).argmax(dim=1).cpu().numpy()
       # return pred
    
    def predict(self, X):
        '''
            param X: numpy.ndarray
                shape = (num_sample, C * W * H)
                with C = 3, W = H = 128
            return: numpy.ndarray
                of int with shape (num_sample) ?
                of float with shape (num_sample, num_class) ?
                of string with shape (num_sample) ?
        '''
        # inverted_dico = {v:k for k,v in self.label_dico.items()}
        self.model_conv.eval()
        X = self.process_data(X)

        nb_batch = int(X.shape[0] / self.batch_size)
        pred = []
        for i in range(nb_batch):
            beg = i * self.batch_size
            end = min(X.shape[0], (i + 1) * self.batch_size)
            x = X[beg:end]
            
            if self.use_cuda:
                x = x.cuda()
            preds = self.model_conv(x).argmax(dim=1).cpu().numpy()
            pred.append(preds)
            
        x = X[end:]
            
        if self.use_cuda:
            x = x.cuda()
        preds = self.model_conv(x).argmax(dim=1).cpu().numpy()
        pred.append(preds)
        pred = np.array([item for sublist in pred for item in sublist]).reshape((-1,1))
        return pred

    def save(self, path="./"):
        pickle.dump(self, open(path + '_model.pickle', "wb"))

    def load(self, path="./"):
        modelfile = path + '_model.pickle'
        if isfile(modelfile):
            with open(modelfile, 'rb') as f:
                self = pickle.load(f)
            print("Model reloaded from: " + modelfile)
        return self
        
        
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.LeakyReLU(0.1, inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x
        print(x.shape)
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4)
        self.relu = nn.LeakyReLU(0.1, inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out

In [None]:
m = BasicCNN(verbose=True, use_cuda=True)
# print(m.model_conv)
trained_model_name = model_dir + '/' + data_name

In [None]:
# import 
# torch.cuda.is_available()

In [None]:
print(next(m.model_conv.parameters()).device)

In [None]:
# torch.autograd.set_detect_anomaly(True)
# for i in range(10):
m.fit(X_train.reshape((-1,128,128,3)), Y_train)

In [None]:
import torch
torch.cuda.empty_cache()

In [None]:
D.data['X_train'].shape

In [None]:
# Y_hat_train = m.predict_batch(D.data['X_train'])
# Y_hat_valid = m.predict_batch(D.data['X_valid'],)
# Y_hat_test = m.predict_batch(D.data['X_test'])

In [None]:
Y_hat_train = m.predict(D.data['X_train'])
Y_hat_valid = m.predict(D.data['X_valid'],)
Y_hat_test = m.predict(D.data['X_test'])

In [None]:
np.array(Y_hat_train).shape, Y_train.shape

In [None]:
Y_hat_train[0:10], Y_hat_train

In [30]:
m.save(trained_model_name)                 
result_name = result_dir + data_name
from data_io import write
write(result_name + '_train.predict', Y_hat_train)
write(result_name + '_valid.predict', Y_hat_valid)
write(result_name + '_test.predict', Y_hat_test)
!ls $result_name*

NameError: name 'm' is not defined

#### Question 3: What are the hyperparameters of a CNN?

Apart from learning parameters such as number of layers similar to a normal neural net the CNN specifically has the size of the kernels that are used to detect features, the step size with which the kernel is applied as well as padding of the sides of the image.

#### Code 2: Edit model.py to vary the CNN's hyperparameter

In [None]:
#TODO in model.py

#### Code 3: Try another model (e.g. Random Forest, SVM, etc.)

In [None]:
#TODO in another model.py file

# Scoring the result

Obviously, since it is made with sample_data, which has too few samples, results won't be really good

In [19]:
from libscores import get_metric
import libscores
type(libscores)

module

In [20]:
from libscores import get_metric
metric_name, scoring_function = get_metric()
print('Using scoring metric:', metric_name)

Using scoring metric: accuracy


In [21]:
len(D.data['Y_valid']), len(D.data['Y_test'])

(0, 0)

In [22]:
print('Ideal score for the', metric_name, 'metric = %5.4f' % scoring_function(Y_train, Y_train))
print('Training score for the', metric_name, 'metric = %5.4f' % scoring_function(Y_train, Y_hat_train))
if len(D.data['Y_valid']) > 0 and len(D.data['Y_test']) > 0:
    print('Valid score for the', metric_name, 'metric = %5.4f' % scoring_function(D.data['Y_valid'], Y_hat_valid))
    print('Test score for the', metric_name, 'metric = %5.4f' % scoring_function(D.data['Y_test'], Y_hat_test))

Ideal score for the accuracy metric = 1.0000
Training score for the accuracy metric = 1.0000


## Confusion matrix

In [23]:
from sklearn.metrics import confusion_matrix
con = confusion_matrix(Y_train[:Y_hat_train.shape[0]], Y_hat_train)

#### Question 4: what does the confusion matrix represent?

On one axis the confusion matrix shows the predictions and on one side the solutions. The middle diagonal shows where both are the same. To the side there are cases where may a beach is classified as mountains. If there is a large number of missclassification between two classes that means that the model has difficulties distinguishing the two(so one could create features to differentiate the two, or change the model architecture)

#### Code 4: display the confusion matrix with a colored heatmap

In [24]:
# TODO
#sns.heatmap(con)

## Cross validation

CV scores on sample_data doesn't have enough data, and so isn't meaningful.
Run it with the full data to see meaningful values.

In [25]:
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score

In [26]:
scores = cross_val_score(BasicCNN(), X_train, Y_train, cv=3, scoring=make_scorer(scoring_function))
print('\nCV score (95 perc. CI): %0.2f (+/- %0.2f)' % (scores.mean(), scores.std() * 2))

NameError: name 'BasicCNN' is not defined

#### Question 5: Why is there a standard deviation associated with the cross-validation score?

The cross validation uses several folds(or particitions) of the data. Each of these folds hasits accuracy score, in the end the scores of all folds are added. The variations thus shows the variation of the scores. A low variation shows that the model did similarily well on all folds it is therefore more consistent and might be preferable to a different model with similar score and higher variability because it is more consistent.

# Submission

## Example

Example needs to have python3 installed

Test to see whether submission with ingestion program is working

In [33]:
!python3 $problem_dir/ingestion.py $data_dir $result_dir $problem_dir $model_dir

Using input_dir: /home/paavo/Saclay/OPT9/public_data_tp5
Using output_dir: /home/paavo/Saclay/OPT9/TP5/sample_result_submission
Using program_dir: /home/paavo/Saclay/OPT9/TP5/ingestion_program
Using submission_dir: /home/paavo/Saclay/OPT9/TP5/sample_code_submission


************************************************
******** Processing dataset Areal ********
************************************************
Info file found : /home/paavo/Saclay/OPT9/public_data_tp5/Areal_public.info
[+] Success in  0.00 sec
[+] Success in 60.49 sec
[+] Success in  0.01 sec
[+] Success in 22.36 sec
[+] Success in  0.00 sec
[+] Success in 22.93 sec
[+] Success in  0.00 sec
DataManager : Areal
info:
	usage = Sample dataset Areal data
	name = areal
	task = multiclass.classification
	target_type = Categorical
	feat_type = Numerical
	metric = accuracy
	time_budget = 12000
	feat_num = 49152
	target_num = 13
	label_num = 13
	train_num = 5200
	valid_num = 1950
	test_num = 1950
	has_categorical = 0
	has_missing = 0

### Test scoring program

In [34]:
scoring_output_dir = 'scoring_output'
!python3 $score_dir/score.py $data_dir $result_dir $scoring_output_dir



# Prepare the submission

In [35]:
import datetime 
from data_io import zipdir
the_date = datetime.datetime.now().strftime("%y-%m-%d-%H-%M")
sample_code_submission = './sample_code_submission_' + the_date + '.zip'
sample_result_submission = './sample_result_submission_' + the_date + '.zip'
zipdir(sample_code_submission, model_dir)
zipdir(sample_result_submission, result_dir)
print("Submit one of these files:\n" + sample_code_submission + "\n" + sample_result_submission)

Submit one of these files:
./sample_code_submission_20-12-30-15-39.zip
./sample_result_submission_20-12-30-15-39.zip


# Try to submit your submissions on Codalab!