# Notebook summary
<p style="font-size:15px; font-family:verdana; line-height: 1.7em">   
Throughout this notebook I'll be using the Tez framework to train neural networks with PyTorch. The idea behind it seems great to me: it allowed me to build a simple model out of linear layers with very little work, despite my inexperience with the subject, and the resulting code is quite Pythonic. The link to the repository is below. Happy Kaggling! </p>

> ## Github: https://github.com/abhishekkrthakur/tez

In [None]:
! pip install tez -q

In [None]:
import numpy as np
import pandas as pd

import tez
from tez.datasets import GenericDataset
from tez.callbacks import EarlyStopping

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch.utils.data as data
from sklearn import preprocessing, metrics, model_selection

# Data loading


In [None]:
train = pd.read_parquet('../input/d/maxdiazbattan/playgroundkfold/train_kfold_play_dic.parquet')
test = pd.read_parquet('../input/d/maxdiazbattan/playgroundkfold/test_play_dic.parquet')
submission = pd.read_parquet('../input/d/maxdiazbattan/playgroundkfold/submission_play_dic.parquet')

In [None]:
feat = [feature for feature in train.columns if feature not in ('Id', 'kfold', 'Cover_Type', 'Weights')]

In [None]:
n_features = len(feat)

# Dataset
<p style="font-size:15px; font-family:verdana; line-height: 1.7em">   
Tez ships a generic dataset class that you can use if you want. I don't use it here because I wanted to make a few changes to it, so I wrote the class below instead.</p>

In [None]:
class GenDataset:
    def __init__(self, data, targets):
        self.data = data
        self.targets = targets
        
    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        
        if self.targets is not None:
            data = self.data[idx]
            targets = self.targets[idx]
            return {
                    "features": torch.tensor(data, dtype=torch.float),
                    "target": torch.tensor(targets, dtype=torch.long),
                   }
        else:
            data = self.data[idx]
            return {
                    "features": torch.tensor(data, dtype=torch.float),
                   }
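
As far as I understand Tez, each batch dictionary is unpacked into the model's `forward` as keyword arguments, so the keys returned by `__getitem__` have to match the parameter names of `forward`. The sketch below illustrates that contract in plain Python, with lists instead of tensors. `ToyDataset` and `toy_forward` are hypothetical names used only for illustration; they are not part of Tez.

```python
class ToyDataset:
    """Plain-Python stand-in for GenDataset (hypothetical, lists instead of tensors)."""

    def __init__(self, data, targets=None):
        self.data = data
        self.targets = targets

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sample = {"features": self.data[idx]}
        # The "target" key is only present when labels exist (train/valid).
        if self.targets is not None:
            sample["target"] = self.targets[idx]
        return sample


def toy_forward(features, target=None):
    """Hypothetical forward: its parameter names mirror the dictionary keys."""
    return {"n_features": len(features), "has_target": target is not None}


train_ds = ToyDataset([[0.1, 0.2], [0.3, 0.4]], targets=[0, 1])
test_ds = ToyDataset([[0.5, 0.6]])

train_out = toy_forward(**train_ds[0])  # receives features and target
test_out = toy_forward(**test_ds[0])    # receives features only
```

Because `target` defaults to `None`, the same function handles both labeled and unlabeled samples, which is the same trick the dataset above relies on for the test set.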

# Tez Model Class
<p style="font-size:15px; font-family:verdana; line-height: 1.7em">   
This model inherits from tez.Model. It's a very simple model, so there is plenty of room to change it.</p>

In [None]:
class TPSModel(tez.Model):
    def __init__(self, n_features, hidd_layers, num_classes):
        super().__init__()
        # You can play with the layers.
        self.layer_1 = nn.Linear(n_features, hidd_layers)
        self.relu = nn.ReLU()
        self.layer_2 = nn.Linear(hidd_layers, num_classes)
        
        self.step_scheduler_after = "batch"
        
    def monitor_metrics(self, outputs, target):
        # You can monitor several metrics at the same time  if you want.
        if target is None:
            return {}
        outputs = torch.argmax(outputs, dim=1).cpu().detach().numpy() 
        target = target.cpu().detach().numpy()
        accuracy = metrics.accuracy_score(target, outputs)
        return {"accuracy": accuracy
               }
    
    def loss(self, outputs, target):
        if target is None:
            return None
        return nn.CrossEntropyLoss()(outputs, target)
    
    def fetch_optimizer(self):
        # You can change the optimizer, link about this down below.
        opt = torch.optim.Adam(self.parameters(), lr=1e-4)
        return opt
    
    def fetch_scheduler(self):
        # You can also change the scheduler, down below I share a link about this.
        sch = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(self.optimizer, T_0=10, T_mult=1, eta_min=1e-5, last_epoch=-1)
        return sch

    def forward(self, features, target=None):
        outputs = self.layer_1(features)
        outputs = self.relu(outputs)
        outputs = self.layer_2(outputs)
        
        if target is not None:
            loss = self.loss(outputs, target)
            # Named batch_metrics so it doesn't shadow the sklearn metrics module.
            batch_metrics = self.monitor_metrics(outputs, target)
            return outputs, loss, batch_metrics
        return outputs, None, None

> ## Optimizer & Scheduler link: https://pytorch.org/docs/stable/optim.html
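
To get a feel for the scheduler settings used in the model, here is a small plain-Python sketch of the cosine annealing formula from the PyTorch documentation, $\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\pi \, T_{cur} / T_0\right)\right)$, with `T_0=10`, `T_mult=1`, `eta_min=1e-5` and the base learning rate `1e-4`. The helper `cosine_warm_restart_lr` is a hypothetical name, not a PyTorch function. Assuming the scheduler is stepped once per batch (which is how I read `step_scheduler_after = "batch"`), the learning rate would restart every 10 batches.

```python
import math


def cosine_warm_restart_lr(step, base_lr=1e-4, eta_min=1e-5, t_0=10):
    """Learning rate after `step` scheduler steps, for T_mult=1 (fixed cycle length)."""
    t_cur = step % t_0  # position inside the current cycle
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_0))


# Two full cycles plus the first step of the third one.
schedule = [cosine_warm_restart_lr(s) for s in range(21)]
```

The rate starts at `1e-4`, decays towards `1e-5` within each cycle, and jumps back to `1e-4` at steps 10 and 20.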

# Data Split, Preprocessing (with oversampling) & Model Train
<p style="font-size:15px; font-family:verdana; line-height: 1.7em">   
This section is where everything happens. First, since I'm going to use CrossEntropyLoss, which expects class indices in the range 0 to n-1, I need to shift the target variable so that it starts at 0; a simple subtraction does it. Second, to save time I train the model on a single fold, fold #4, which is the one that contains the only sample with Cover_Type = 5. Finally, I oversample by giving each sample a weight inversely proportional to the frequency of its class, and I apply a simple scaling with StandardScaler.</p>
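
The label shift and the inverse-frequency weights can be sketched in plain Python. The labels below are made up for illustration and are not the competition data.

```python
from collections import Counter

# Toy labels in the original 1..n encoding (made-up values).
raw_labels = [1, 1, 1, 1, 2, 2, 3]

# Shift to 0..n-1, the class-index range that CrossEntropyLoss expects.
labels = [y - 1 for y in raw_labels]

# Inverse-frequency weight per class, then one weight per sample.
class_counts = Counter(labels)
class_weights = {c: 1.0 / n for c, n in class_counts.items()}
sample_weights = [class_weights[y] for y in labels]

# Total sampling mass per class: every class ends up with the same share.
mass_per_class = {c: class_weights[c] * n for c, n in class_counts.items()}
```

With these weights, a sampler that draws with replacement picks every class with roughly equal probability, so the few samples of a rare class get repeated many times.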

In [None]:
train.Cover_Type.value_counts()

In [None]:
train.Cover_Type = train.Cover_Type - 1 

In [None]:
train.Cover_Type.value_counts()

In [None]:
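# Data split: train on fold 4 (it holds the only sample with original Cover_Type = 5), validate on the remaining folds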
fold = 4   
X_train = train[train.kfold == fold].reset_index(drop=True)
X_valid = train[train.kfold != fold].reset_index(drop=True)

y_train = X_train.Cover_Type 
y_valid = X_valid.Cover_Type 

# Sampling the data
train_labels_unique, class_counts = np.unique(X_train['Cover_Type'], return_counts=True)
weights = 1. / class_counts
samples_weights = np.array([weights[t] for t in y_train])
samples_weights = torch.from_numpy(samples_weights)
sampler = data.WeightedRandomSampler(samples_weights, len(samples_weights))

# Scaling
feat = [feature for feature in train.columns if feature not in ('Id', 'kfold','Cover_Type','Weights')]

scl = preprocessing.StandardScaler()
X_train = scl.fit_transform(X_train[feat].values)
X_valid = scl.transform(X_valid[feat].values)
X_test = scl.transform(test[feat].values)

# Creating the datasets
train_dataset = GenDataset(X_train, y_train.values)
valid_dataset = GenDataset(X_valid, y_valid.values)
test_dataset = GenDataset(X_test, None)
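
The scaling step fits the mean and standard deviation on the training split only and reuses them for the validation and test data. Here is a minimal sketch of that idea for a single feature, in plain Python with made-up numbers; `standardize` is a hypothetical helper, not a scikit-learn function.

```python
from statistics import fmean, pstdev

# Made-up single-feature columns, for illustration only.
train_col = [1.0, 2.0, 3.0, 4.0, 5.0]
valid_col = [3.0, 6.0]

# Statistics come from the training split only.
mu = fmean(train_col)
sigma = pstdev(train_col)  # population std (ddof=0), which is what StandardScaler uses


def standardize(values, mean, std):
    """Center and scale values with the given mean and standard deviation."""
    return [(v - mean) / std for v in values]


train_scaled = standardize(train_col, mu, sigma)
valid_scaled = standardize(valid_col, mu, sigma)  # reuses the training mean/std
```

The scaled training column has mean 0 and standard deviation 1, while the validation column generally does not, because it is expressed in the training split's units.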

In [None]:
n_features = len(feat)

In [None]:
model = TPSModel(n_features=n_features, hidd_layers=100, num_classes=train['Cover_Type'].nunique())

In [None]:
# valid_loss should go down, so the callback has to track the minimum.
e_stop = EarlyStopping(monitor="valid_loss", model_path="Model.bin", patience=5, mode="min")
model.fit(train_dataset,valid_dataset=valid_dataset,
          train_sampler=sampler,train_shuffle=False,
          train_bs=32,valid_bs=64,device="cuda",epochs=5, 
          callbacks=[e_stop], n_jobs=2)
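
A loss improves when it goes down, so a loss-based monitor is normally paired with a "min" mode, while a metric like accuracy is paired with "max". The toy class below is a hypothetical, minimal early stopper written only to show the difference; it is not Tez's implementation, and the loss values are made up.

```python
class ToyEarlyStopping:
    """Hypothetical minimal early stopper; not Tez's implementation."""

    def __init__(self, patience=5, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.patience = patience
        self.mode = mode
        self.best = None
        self.bad_epochs = 0

    def step(self, value):
        """Record a new metric value; return True when training should stop."""
        improved = (
            self.best is None
            or (self.mode == "min" and value < self.best)
            or (self.mode == "max" and value > self.best)
        )
        if improved:
            self.best = value
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


valid_losses = [0.90, 0.80, 0.85, 0.86, 0.87]  # made-up values

stopper = ToyEarlyStopping(patience=3, mode="min")
stop_flags = [stopper.step(v) for v in valid_losses]
```

In "min" mode the stopper keeps 0.80 as the best value. Running the same losses through `mode="max"` would keep 0.90, the worst loss, as the "best" one.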

# Predictions
<p style="font-size:15px; font-family:verdana; line-height: 1.7em">
To compensate a bit for training the model on a single fold, I run the prediction step 5 times and average the results. I don't know if that's the right thing to do, but it gave me a slightly better result. Then, at submission time, I add 1 to the predicted classes to restore the original values.</p>

In [None]:
n_runs = 5
final_preds = None
for _ in range(n_runs):
    # Stack the chunks returned by model.predict into a single array.
    temp_preds = np.vstack(list(model.predict(test_dataset, batch_size=32, n_jobs=-1)))
    if final_preds is None:
        final_preds = temp_preds
    else:
        final_preds += temp_preds
final_preds /= n_runs

In [None]:
final_preds = final_preds.argmax(axis=1)
    
submission.Cover_Type = final_preds
submission.Cover_Type = submission.Cover_Type + 1
submission.to_csv("submission.csv", index=False)
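
The averaging, argmax and label restoration steps can be followed by hand on a tiny example. The scores below are made up (3 samples, 3 classes, 2 prediction passes) and plain Python lists stand in for the NumPy arrays.

```python
# Made-up class scores for 3 samples and 3 classes from two prediction passes.
pass_a = [[0.2, 0.7, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
pass_b = [[0.4, 0.5, 0.1], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6]]

passes = [pass_a, pass_b]

# Element-wise mean over the passes.
averaged = [
    [sum(p[i][j] for p in passes) / len(passes) for j in range(len(pass_a[0]))]
    for i in range(len(pass_a))
]

# Argmax gives a 0-based class index; adding 1 restores the original labels.
zero_based = [row.index(max(row)) for row in averaged]
cover_type = [k + 1 for k in zero_based]
```

For the second sample the two passes disagree (class index 0 versus 1), and the average settles on index 1, which becomes label 2 after adding 1.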