# SFC PAYGo Solar Credit Repayment Competition

File name: PAYGoAI.ipynb

Author: kogni7

Date: August 2021

## Contents
* 1 Preparation
* 2 Data
* 3 Training
* 4 Prediction and Submission

This notebook uses only the data sets provided by ZINDI, which describe PAYGo SHS contracts. No external features are used. The task is to predict the repayments of each PAYGo SHS contract for the next six months.

The file system for this project is:

* PAYGoAI (root)
    * PAYGoAI.ipynb (this notebook)
    * Data
        * Train.csv
        * metadata.csv
        * Test.csv
        * SampleSubmission.csv
    * Submission
        * 1 - x: Submission directories named by the version number
            * submission.csv

This Jupyter notebook runs in Google Colab without special configuration, with the GPU enabled.

This notebook uses a Transformer-based approach.

## 1 Preparation
### Time

In [1]:
import time
start_time = time.time()

### Libraries and Seed

In [2]:
# Seed, Libraries
SEED = 42

# Math
import numpy as np
print("Numpy Version: " + str(np.__version__))

import random
import os
os.environ['PYTHONHASHSEED'] = str(SEED)

np.random.seed(SEED)

random.seed(SEED)

# PyTorch
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import optim
from torch.utils.data import DataLoader
print("PyTorch Version: " + str(torch.__version__))
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)
torch.backends.cudnn.benchmark = False
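# NOTE: False allows non-deterministic cuDNN kernels; set to True for stricter run-to-run reproducibility
torch.backends.cudnn.deterministic = False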

# CSV
import pandas as pd
print("Pandas Version: " + str(pd.__version__))

# Machine Learning
import sklearn
from sklearn.model_selection import KFold
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
print("SciKit-Learn Version: " + str(sklearn.__version__))

from tqdm import tqdm
import gc

Numpy Version: 1.19.5
PyTorch Version: 1.9.0+cu102
Pandas Version: 1.1.5
SciKit-Learn Version: 0.22.2.post1


For reproducibility, use the same GPU model as shown below (Tesla K80).

In [3]:
!nvidia-smi

Sun Aug 29 15:15:32 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   37C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

### Parameters

In [4]:
EPOCHS = 7

BATCH_SIZE = 32

TEST_BATCH_SIZE = 128

CV = 5

VERBOSE = True
VERBOSE_TRAIN = True

LEARNING_RATE = 0.01
SCHEDULER = False

DROPOUT = 0.15

PAD = 60
LAYERS = 16
HIDDEN = 500

# The Version
VERSION = "2"

# for use in Google Colab
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [5]:
WD = os.getcwd() + "/drive/My Drive/PAYGoAI"
print(WD)

/content/drive/My Drive/PAYGoAI


## 2 Data

In [6]:
train_csv = pd.read_csv(WD + "/Data/Train.csv")
train_csv.head()

Unnamed: 0,ID,TransactionDates,PaymentsHistory,m1,m2,m3,m4,m5,m6
0,ID_MR53LEX,"['04-2018', '05-2018', '06-2018', '07-2018', '...","[3600.0, 750.0, 350.0, 65.0, 95.0, 135.0, 85.0...",880.0,930.0,495.0,715.0,220.0,385.0
1,ID_3D7NQUH,"['04-2018', '05-2018', '06-2018', '07-2018', '...","[2940.0, 970.0, 380.0, 880.0, 385.0, 440.0, 11...",660.0,935.0,935.0,825.0,770.0,935.0
2,ID_0IWQNPI,"['02-2020', '03-2020', '04-2020', '05-2020', '...","[2850.0, 1500.0, 1350.0, 610.0, 200.0, 250.0]",700.0,1350.0,1550.0,1400.0,1450.0,1200.0
3,ID_IY8SYB9,"['09-2017', '10-2017', '11-2017', '12-2017', '...","[2200.0, 1420.0, 1180.0, 900.0, 1400.0, 780.0,...",580.0,480.0,800.0,1260.0,1650.0,530.0
4,ID_9XHL7VZ,"['09-2017', '10-2017', '11-2017', '12-2017', '...","[2640.0, 910.0, 480.0, 280.0, 200.0, 180.0, 33...",40.0,440.0,460.0,360.0,80.0,330.0


In [7]:
metadata_csv = pd.read_csv(WD + "/Data/metadata.csv")
metadata_csv.head()


Unnamed: 0,ID,RegistrationDate,Deposit,UpsellDate,AccessoryRate,PaymentMethod,rateTypeEntity,RatePerUnit,DaysOnDeposit,MainApplicantGender,Age,Region,Town,Occupation,SupplierName,Term,TotalContractValue,ExpectedTermDate,FirstPaymentDate,LastPaymentDate
0,ID_K00S4N4,2015-12-10 00:00:00,2000,,0.0,FINANCED,DAILY,35,7,Male,41.0,Mount Kenya Region,Embu,Other,d_light,364,14740.0,2016-12-08 00:00:00,2015-12-10 09:52:35,2016-10-23 04:52:30
1,ID_6L67PAA,2015-12-09 00:00:00,2000,,0.0,FINANCED,DAILY,35,7,Male,33.0,Coast Region,Kilifi,Other,d_light,364,14740.0,2016-12-07 00:00:00,2015-12-09 13:14:03,2020-05-24 15:32:18
2,ID_102CV85,2015-12-18 00:00:00,2000,2018-03-29 10:14:58,35.0,FINANCED,DAILY,35,7,Female,48.0,Nairobi Region,Makueni,Business,d_light,392,29480.0,2017-01-13 00:00:00,2015-12-18 06:22:34,2017-02-01 15:23:44
3,ID_HXBJFHB,2015-11-25 00:00:00,2000,,0.0,FINANCED,DAILY,35,7,Female,43.0,,UNKNOWN,Teacher,d_light,364,14740.0,2016-11-23 00:00:00,2015-11-25 13:25:57,2017-05-22 16:46:54
4,ID_3K9VZ5J,2015-12-02 00:00:00,2000,,0.0,FINANCED,DAILY,35,7,Female,56.0,Mount Kenya Region,Kirinyaga,Other,d_light,364,14740.0,2016-11-30 00:00:00,2015-12-05 10:34:32,2017-05-12 16:50:52


How many distinct values does each feature have?

In [8]:
print(len(metadata_csv))
print("RegistrationDate " + str(len(set(metadata_csv.RegistrationDate))))
print("Deposit " + str(len(set(metadata_csv.Deposit))))
print("UpsellDate " + str(len(set(metadata_csv.UpsellDate))))
print("AccessoryRate " + str(len(set(metadata_csv.AccessoryRate))))
print("PaymentMethod " + str(len(set(metadata_csv.PaymentMethod))))
print("rateTypeEntity " + str(len(set(metadata_csv.rateTypeEntity))))
print("RatePerUnit " + str(len(set(metadata_csv.RatePerUnit))))
print("DaysOnDeposit " + str(len(set(metadata_csv.DaysOnDeposit))))
print("MainApplicantGender " + str(len(set(metadata_csv.MainApplicantGender))))
print("Age " + str(len(set(metadata_csv.Age))))
print("Region " + str(len(set(metadata_csv.Region))))
print("Town " + str(len(set(metadata_csv.Town))))
print("Occupation " + str(len(set(metadata_csv.Occupation))))
print("SupplierName " + str(len(set(metadata_csv.SupplierName))))
print("Term " + str(len(set(metadata_csv.Term))))
print("TotalContractValue " + str(len(set(metadata_csv.TotalContractValue))))
print("ExpectedTermDate " + str(len(set(metadata_csv.ExpectedTermDate))))
print("FirstPaymentDate " + str(len(set(metadata_csv.FirstPaymentDate))))

37343
RegistrationDate 37211
Deposit 11
UpsellDate 4764
AccessoryRate 18
PaymentMethod 1
rateTypeEntity 3
RatePerUnit 11
DaysOnDeposit 6
MainApplicantGender 2
Age 7022
Region 8
Town 48
Occupation 7
SupplierName 1
Term 48
TotalContractValue 33
ExpectedTermDate 37206
FirstPaymentDate 37279
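
One caveat about the counts above: `len(set(...))` keeps every NaN of a float column as a separate value, because NaN never compares equal to itself. That is a likely reason why Age, which has missing entries, reports 7022 distinct values. The sketch below reproduces the effect on a toy list; `count_distinct` is a hypothetical helper, not part of the notebook. In pandas, `Series.nunique()` returns the NaN-free count directly.

```python
import math

def count_distinct(values):
    """Count distinct non-missing values (hypothetical helper)."""
    seen = set()
    for v in values:
        if isinstance(v, float) and math.isnan(v):
            continue  # skip missing values instead of counting each one
        seen.add(v)
    return len(seen)

# Toy column: two real ages and two missing entries, each its own NaN object
ages = [41.0, 33.0, float("nan"), 41.0, float("nan")]

naive_count = len(set(ages))        # every NaN object is kept separately
clean_count = count_distinct(ages)  # NaNs ignored
print(naive_count, clean_count)
```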


We keep only features that have more than one distinct value, but not too many.

In [9]:
metadata_csv = metadata_csv[["ID", "Deposit", "AccessoryRate", "rateTypeEntity", "RatePerUnit", "DaysOnDeposit", "MainApplicantGender",
                             "Age", "Region", "Town", "Occupation", "Term", "TotalContractValue"]]
metadata_csv.head()

Unnamed: 0,ID,Deposit,AccessoryRate,rateTypeEntity,RatePerUnit,DaysOnDeposit,MainApplicantGender,Age,Region,Town,Occupation,Term,TotalContractValue
0,ID_K00S4N4,2000,0.0,DAILY,35,7,Male,41.0,Mount Kenya Region,Embu,Other,364,14740.0
1,ID_6L67PAA,2000,0.0,DAILY,35,7,Male,33.0,Coast Region,Kilifi,Other,364,14740.0
2,ID_102CV85,2000,35.0,DAILY,35,7,Female,48.0,Nairobi Region,Makueni,Business,392,29480.0
3,ID_HXBJFHB,2000,0.0,DAILY,35,7,Female,43.0,,UNKNOWN,Teacher,364,14740.0
4,ID_3K9VZ5J,2000,0.0,DAILY,35,7,Female,56.0,Mount Kenya Region,Kirinyaga,Other,364,14740.0


In [10]:
metadata_csv.isna().sum()

ID                        0
Deposit                   0
AccessoryRate             0
rateTypeEntity            0
RatePerUnit               0
DaysOnDeposit             0
MainApplicantGender       0
Age                    6939
Region                 1934
Town                      0
Occupation                0
Term                      0
TotalContractValue        0
dtype: int64

We replace NAs in Age with the mean of Age, rounded up to the next integer.

In [11]:
mean_Age = np.ceil(np.mean(metadata_csv.Age))
metadata_csv.Age = metadata_csv.Age.fillna(mean_Age)
metadata_csv.Age.isna().sum()

0

We replace NAs in Region with UNKNOWN.

In [12]:
metadata_csv.Region = metadata_csv.Region.fillna("UNKNOWN")
metadata_csv.Region.isna().sum()

0

We label-encode the categorical text features.

In [13]:
enc = LabelEncoder()

metadata_csv.rateTypeEntity = enc.fit_transform(metadata_csv.rateTypeEntity)
metadata_csv.MainApplicantGender = enc.fit_transform(metadata_csv.MainApplicantGender)
metadata_csv.Region = enc.fit_transform(metadata_csv.Region)
metadata_csv.Town = enc.fit_transform(metadata_csv.Town)
metadata_csv.Occupation = enc.fit_transform(metadata_csv.Occupation)
metadata_csv.head()

Unnamed: 0,ID,Deposit,AccessoryRate,rateTypeEntity,RatePerUnit,DaysOnDeposit,MainApplicantGender,Age,Region,Town,Occupation,Term,TotalContractValue
0,ID_K00S4N4,2000,0.0,0,35,7,1,41.0,1,5,5,364,14740.0
1,ID_6L67PAA,2000,0.0,0,35,7,1,33.0,0,13,5,364,14740.0
2,ID_102CV85,2000,35.0,0,35,7,0,48.0,2,22,0,392,29480.0
3,ID_HXBJFHB,2000,0.0,0,35,7,0,43.0,6,43,6,364,14740.0
4,ID_3K9VZ5J,2000,0.0,0,35,7,0,56.0,1,14,5,364,14740.0
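
`LabelEncoder` sorts the distinct labels and assigns each one its index in that order, which is why Female becomes 0 and Male becomes 1 above. A minimal pure-Python sketch of that mapping on illustrative values (`label_encode` is a hypothetical name):

```python
def label_encode(values):
    """Map each label to its index in the sorted list of distinct labels."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[v] for v in values], mapping

genders = ["Male", "Male", "Female", "Female", "Female"]
codes, mapping = label_encode(genders)
print(codes)    # [1, 1, 0, 0, 0]
print(mapping)  # {'Female': 0, 'Male': 1}
```

The resulting integers carry no meaning beyond identity. For nominal features such as Region or Town, the numeric order is an artifact of alphabetical sorting, which is a known limitation of label encoding.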


We min-max scale all features to the range [0, 1].

In [14]:
scaler = MinMaxScaler()

metadata_csv.Deposit = scaler.fit_transform(np.array(metadata_csv.Deposit).reshape(-1, 1))
metadata_csv.AccessoryRate = scaler.fit_transform(np.array(metadata_csv.AccessoryRate).reshape(-1, 1))
metadata_csv.rateTypeEntity = scaler.fit_transform(np.array(metadata_csv.rateTypeEntity).reshape(-1, 1))
metadata_csv.RatePerUnit = scaler.fit_transform(np.array(metadata_csv.RatePerUnit).reshape(-1, 1))
metadata_csv.DaysOnDeposit = scaler.fit_transform(np.array(metadata_csv.DaysOnDeposit).reshape(-1, 1))
metadata_csv.MainApplicantGender = scaler.fit_transform(np.array(metadata_csv.MainApplicantGender).reshape(-1, 1))
metadata_csv.Age = scaler.fit_transform(np.array(metadata_csv.Age).reshape(-1, 1))
metadata_csv.Region = scaler.fit_transform(np.array(metadata_csv.Region).reshape(-1, 1))
metadata_csv.Town = scaler.fit_transform(np.array(metadata_csv.Town).reshape(-1, 1))
metadata_csv.Occupation = scaler.fit_transform(np.array(metadata_csv.Occupation).reshape(-1, 1))
metadata_csv.Term = scaler.fit_transform(np.array(metadata_csv.Term).reshape(-1, 1))
metadata_csv.TotalContractValue = scaler.fit_transform(np.array(metadata_csv.TotalContractValue).reshape(-1, 1))
metadata_csv.head()

Unnamed: 0,ID,Deposit,AccessoryRate,rateTypeEntity,RatePerUnit,DaysOnDeposit,MainApplicantGender,Age,Region,Town,Occupation,Term,TotalContractValue
0,ID_K00S4N4,0.25,0.0,0.0,0.0,0.116667,1.0,0.217822,0.142857,0.106383,0.833333,0.574675,0.073392
1,ID_6L67PAA,0.25,0.0,0.0,0.0,0.116667,1.0,0.138614,0.0,0.276596,0.833333,0.574675,0.073392
2,ID_102CV85,0.25,0.145833,0.0,0.0,0.116667,0.0,0.287129,0.285714,0.468085,0.0,0.62013,0.347962
3,ID_HXBJFHB,0.25,0.0,0.0,0.0,0.116667,0.0,0.237624,0.857143,0.914894,1.0,0.574675,0.073392
4,ID_3K9VZ5J,0.25,0.0,0.0,0.0,0.116667,0.0,0.366337,0.142857,0.297872,0.833333,0.574675,0.073392
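
Min-max scaling maps each column linearly onto [0, 1] via (x - min) / (max - min). The sketch below applies the formula to made-up deposit values; the numbers are illustrative and not taken from metadata.csv, and `min_max_scale` is a hypothetical helper.

```python
def min_max_scale(values):
    """Scale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

deposits = [1000, 2000, 3000, 5000]  # illustrative values only
scaled = min_max_scale(deposits)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```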


In [15]:
test_csv = pd.read_csv(WD + "/Data/Test.csv")
test_csv.head()

Unnamed: 0,ID,TransactionDates,PaymentsHistory
0,ID_6L67PAA,"['12-2015', '01-2016', '02-2016', '03-2016', '...","[4000.0, 1050.0, 1050.0, 1050.0, 1050.0, 400.0]"
1,ID_VJ80SX2,"['12-2015', '01-2016', '02-2016', '03-2016', '...","[3000.0, 850.0, 750.0, 1500.0, 650.0, 1250.0, ..."
2,ID_7OU9HLK,"['12-2015', '01-2016', '03-2016', '05-2016', '...","[2400.0, 300.0, 500.0, 450.0, 675.0, 700.0, 87..."
3,ID_WVWTPGK,"['12-2015', '01-2016', '02-2016', '03-2016', '...","[4700.0, 1200.0, 950.0, 1200.0, 900.0, 1110.0,..."
4,ID_04DSDQS,"['12-2015', '01-2016', '02-2016', '03-2016', '...","[4800.0, 750.0, 995.0, 995.0, 1300.0, 750.0, 1..."


In [16]:
train_csv = pd.merge(left=metadata_csv, right=train_csv, on="ID")
test_csv = pd.merge(left=metadata_csv, right=test_csv, on="ID")

In [17]:
sample_submission_csv = pd.read_csv(WD + "/Data/SampleSubmission.csv")
sample_submission_csv.head()

Unnamed: 0,ID,Target
0,ID_6L67PAA x m1,0.0
1,ID_6L67PAA x m2,0.0
2,ID_6L67PAA x m3,0.0
3,ID_6L67PAA x m4,0.0
4,ID_6L67PAA x m5,0.0


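We encode the TransactionDates feature: each distinct date token is mapped to an integer code starting at 1, which leaves 0 for padded positions.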

In [18]:
dates_train = []
for i in range(len(train_csv)):
    dates_train += train_csv.TransactionDates.iloc[i][1:len(train_csv.TransactionDates.iloc[i])-1].split(",")

dates_test = []
for i in range(len(test_csv)):
    dates_test += test_csv.TransactionDates.iloc[i][1:len(test_csv.TransactionDates.iloc[i])-1].split(",")

dates = list(dict.fromkeys(dates_train + dates_test))

dates_codes = {}
code = 1
for i in dates:
    dates_codes[i] = code
    code += 1
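
The loop above strips the outer brackets and splits on commas, so each token keeps its quotes and, for every element but the first, a leading space. The same month can therefore end up under two different keys, and the codes follow first appearance rather than calendar order. A possible alternative, sketched here under the assumption that the column holds Python-style list literals as in the preview above, parses each cell with `ast.literal_eval` and assigns codes chronologically. `parse_dates` and `build_date_codes` are hypothetical names.

```python
import ast

def parse_dates(cell):
    """Parse a cell such as "['04-2018', '05-2018']" into a list of strings."""
    return ast.literal_eval(cell)

def build_date_codes(cells):
    """Assign codes 1..N in calendar order; 0 stays free for padding."""
    months = {m for cell in cells for m in parse_dates(cell)}
    # 'MM-YYYY' is ordered chronologically by (year, month)
    ordered = sorted(months, key=lambda m: (int(m[3:]), int(m[:2])))
    return {m: code for code, m in enumerate(ordered, start=1)}

cells = ["['04-2018', '05-2018']", "['12-2017', '04-2018']"]
codes = build_date_codes(cells)
print(codes)  # {'12-2017': 1, '04-2018': 2, '05-2018': 3}

# The naive split keeps quotes and leading spaces, so keys differ by position
naive = cells[0][1:-1].split(",")
print(naive)  # ["'04-2018'", " '05-2018'"]
```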

In [19]:
class MakeDataSet(torch.utils.data.Dataset):
    def __init__(self, data, mode):
        self.data = data
        self.mode = mode

    def __getitem__(self, idx):

        sample_tensor = torch.zeros((PAD, 2))

        # PaymentsHistory
        sample_csv = self.data.PaymentsHistory.iloc[idx][1:len(self.data.PaymentsHistory.iloc[idx])-1].split(",")
        sample = torch.tensor([float(i) for i in sample_csv])
       
        if sample.shape[0] > PAD:
            stop = PAD
        else:
            stop = sample.shape[0]
        sample_tensor[:stop, 0] = sample[:stop]

        # TransactionDates
        sample_csv = self.data.TransactionDates.iloc[idx][1:len(self.data.TransactionDates.iloc[idx])-1].split(",")
        sample = torch.tensor([dates_codes[i] for i in sample_csv])

        sample_tensor[:stop, 1] = sample[:stop]

        # Meta
        meta_tensor = torch.zeros((1, 12))
        meta_tensor[0, 0] = torch.tensor(self.data.Deposit.iloc[idx])
        meta_tensor[0, 1] = torch.tensor(self.data.AccessoryRate.iloc[idx])
        meta_tensor[0, 2] = torch.tensor(self.data.rateTypeEntity.iloc[idx])
        meta_tensor[0, 3] = torch.tensor(self.data.RatePerUnit.iloc[idx])
        meta_tensor[0, 4] = torch.tensor(self.data.DaysOnDeposit.iloc[idx])
        meta_tensor[0, 5] = torch.tensor(self.data.MainApplicantGender.iloc[idx])
        meta_tensor[0, 6] = torch.tensor(self.data.Age.iloc[idx])
        meta_tensor[0, 7] = torch.tensor(self.data.Region.iloc[idx])
        meta_tensor[0, 8] = torch.tensor(self.data.Town.iloc[idx])
        meta_tensor[0, 9] = torch.tensor(self.data.Occupation.iloc[idx])
        meta_tensor[0, 10] = torch.tensor(self.data.Term.iloc[idx])
        meta_tensor[0, 11] = torch.tensor(self.data.TotalContractValue.iloc[idx])

        # Target
        if self.mode != "test":
            target_csv = [self.data.m1.iloc[idx], self.data.m2.iloc[idx], self.data.m3.iloc[idx], self.data.m4.iloc[idx], self.data.m5.iloc[idx], self.data.m6.iloc[idx]]
            target_tensor = torch.tensor(target_csv)

            return {'sample' : sample_tensor, "meta": meta_tensor, 'target' : target_tensor}
 
        else:
 
            return {'sample': sample_tensor, "meta": meta_tensor}

    def __len__(self):
        return len(self.data)
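
`__getitem__` zero-pads or truncates each history to `PAD` time steps; histories longer than `PAD` keep their first `PAD` entries. A minimal sketch of that step on plain lists follows. `pad_or_truncate` is a hypothetical helper; the notebook does the same with tensor slicing.

```python
PAD = 60  # same value as in the Parameters cell

def pad_or_truncate(seq, length=PAD, fill=0.0):
    """Return exactly `length` items: the first `length` of seq, right-padded with `fill`."""
    clipped = list(seq[:length])
    return clipped + [fill] * (length - len(clipped))

short = pad_or_truncate([2850.0, 1500.0, 1350.0])
long = pad_or_truncate(range(100))
print(len(short), short[:4])  # 60 [2850.0, 1500.0, 1350.0, 0.0]
print(len(long), long[-1])    # 60 59
```

Keeping the most recent `PAD` payments instead (`seq[-length:]`) is an alternative that may be worth testing; it is a suggestion, not something this notebook evaluates.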

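The TransformerNet processes the sequential data (payment amounts and date codes) with a Transformer encoder of `LAYERS` stacked layers, and the metadata with a single linear layer followed by ReLU. Both outputs are concatenated, passed through dropout, and mapped by a final linear layer to the six monthly predictions.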

In [20]:
class TransformerNet(nn.Module):
    def __init__(self):
        super(TransformerNet, self).__init__()
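        # NOTE: batch_first defaults to False, so the encoder interprets its input as (seq, batch, feature).
        # forward() passes (batch, PAD, 2); pass batch_first=True (PyTorch >= 1.9) to attend over time steps instead.
        self.encoder_layer = nn.TransformerEncoderLayer(d_model=2, nhead=2, dropout=DROPOUT)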
        self.transformer_encoder = nn.TransformerEncoder(self.encoder_layer, num_layers=LAYERS)
        self.flatten = nn.Flatten()
        self.transformer_ffn = nn.Linear(PAD * 2, HIDDEN)

        self.meta_ffn = nn.Linear(12, HIDDEN)
        self.relu = nn.ReLU()

        self.drop = nn.Dropout(p=DROPOUT)
        self.ffn = nn.Linear(HIDDEN * 2, 6)

    def forward(self, x, meta):
        x = self.transformer_encoder(x)
        x = self.flatten(x)
        x = self.transformer_ffn(x)

        meta = self.meta_ffn(meta)
        meta = self.relu(meta)
        meta = self.flatten(meta)

        x = torch.cat((x, meta), dim=1)
        x = self.drop(x)
        x = self.ffn(x)

        return x
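
To make the tensor shapes in `forward` explicit, the following sketch traces them with plain tuples for a batch of 32. It performs no computation and only mirrors the layer dimensions defined above; the Transformer encoder itself leaves the shape unchanged.

```python
PAD, HIDDEN, META, TARGETS = 60, 500, 12, 6  # values from the notebook
batch = 32

x = (batch, PAD, 2)              # payment amount + date code per time step
x = (x[0], x[1] * x[2])          # nn.Flatten       -> (32, 120)
x = (x[0], HIDDEN)               # transformer_ffn  -> (32, 500)

meta = (batch, 1, META)          # meta_tensor has shape (1, 12) per sample
meta = (batch, 1, HIDDEN)        # meta_ffn + ReLU  -> (32, 1, 500)
meta = (meta[0], meta[1] * meta[2])  # nn.Flatten   -> (32, 500)

combined = (x[0], x[1] + meta[1])    # torch.cat(dim=1) -> (32, 1000)
out = (combined[0], TARGETS)         # ffn              -> (32, 6)
print(combined, out)
```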

## 3 Training

In [21]:
def training(data_loader, model, optimizer):
    
    model.train()
    
    if VERBOSE_TRAIN:
        losses = []

    for data in tqdm(data_loader):
        sample = data['sample']
        meta = data['meta']
        target = data['target']

        sample = sample.cuda()
        meta = meta.cuda()
        target = target.cuda()

        optimizer.zero_grad()

        output = model(sample.float(), meta.float())

        loss = nn.MSELoss()(output.squeeze().float(), target.float())

        loss.backward()
        optimizer.step()

        if VERBOSE_TRAIN:
            losses.append(loss.data.cpu())

        del output, loss, sample, meta, target
        gc.collect()

    if VERBOSE_TRAIN:
        return np.sqrt(np.mean(losses))


def evaluation(data_loader, model):
    
    model.eval()
    
    losses = []

    with torch.no_grad():
        for data in tqdm(data_loader):
            sample = data['sample']
            meta = data['meta']
            target = data['target']

            sample = sample.cuda()
            meta = meta.cuda()
            target = target.cuda()

            output = model(sample.float(), meta.float())
            loss = nn.MSELoss()(output.squeeze().float(), target.float())

            losses.append(loss.data.cpu())

            del output, loss, sample, meta, target
            gc.collect()

    return np.sqrt(np.mean(losses))


def prediction(data_loader, model):
    model.eval()
    
    outputs = []

    with torch.no_grad():
        for data in tqdm(data_loader):
            sample = data['sample']
            meta = data['meta']

            sample = sample.cuda()
            meta = meta.cuda()

            output = model(sample.float(), meta.float())
            outputs.append(output.squeeze().data.cpu())

            del output, sample, meta
            gc.collect()

        return torch.cat(outputs)
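
Both `training` and `evaluation` report the square root of the unweighted mean of the per-batch MSE values. When the last batch is smaller than the others, its samples weigh more than they would in a per-sample RMSE. The sketch below shows the difference on made-up numbers; `batch_mean_rmse` and `weighted_rmse` are hypothetical helpers.

```python
import math

def batch_mean_rmse(batch_mses):
    """RMSE as computed in the notebook: sqrt of the unweighted mean of batch MSEs."""
    return math.sqrt(sum(batch_mses) / len(batch_mses))

def weighted_rmse(batch_mses, batch_sizes):
    """RMSE over all samples: weight each batch MSE by its number of samples."""
    total = sum(m * n for m, n in zip(batch_mses, batch_sizes))
    return math.sqrt(total / sum(batch_sizes))

mses = [100.0, 100.0, 400.0]   # illustrative per-batch MSE values
sizes = [128, 128, 64]         # last batch is only half full

unweighted = batch_mean_rmse(mses)      # sqrt(200)
weighted = weighted_rmse(mses, sizes)   # sqrt(160)
print(unweighted, weighted)
```

With equally sized batches the two quantities coincide, so the effect is confined to the final partial batch of each loader.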

In [22]:
def seed_worker(worker_id):
    """
    https://pytorch.org/docs/stable/notes/randomness.html
    """
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


cv = 1

STATISTICS = {}
if VERBOSE_TRAIN:
    STATISTICS['TRAIN'] = {}
    STATISTICS['TRAIN']['LOSS'] = np.zeros((EPOCHS, CV))
STATISTICS['VALIDATION'] = {}
STATISTICS['VALIDATION']['LOSS'] = np.zeros((EPOCHS, CV))

for train, val in KFold(n_splits=CV, shuffle=True, random_state=SEED).split(train_csv.ID):
    print("Run {} of {}.".format(cv, CV))

    # Data
    g = torch.Generator()
    g.manual_seed(SEED)
    TRAIN_SET = MakeDataSet(train_csv.iloc[train], "train")
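    # NOTE: shuffle is left at its default (False), so samples are visited in the same order every epoch
    train_data_loader = torch.utils.data.DataLoader(TRAIN_SET, batch_size=BATCH_SIZE, worker_init_fn=seed_worker, generator=g, num_workers=2)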

    g = torch.Generator()
    g.manual_seed(SEED)
    VALIDATION_SET = MakeDataSet(train_csv.iloc[val], "val")
    val_data_loader = torch.utils.data.DataLoader(VALIDATION_SET, batch_size=TEST_BATCH_SIZE, worker_init_fn=seed_worker, generator=g, num_workers=2)

    # Model
    model = TransformerNet()
    model = model.cuda()

    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

    if SCHEDULER:
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)

    BEST_RMSE = np.inf

    for epoch in range(EPOCHS):
        if VERBOSE_TRAIN:
            train_loss = training(train_data_loader, model, optimizer)
        else:
            training(train_data_loader, model, optimizer)
        val_loss = evaluation(val_data_loader, model)

        if SCHEDULER:
            scheduler.step()

        if BEST_RMSE > val_loss:
            torch.save(model.state_dict(), '/content/model_' + str(cv) + '.pt')
            BEST_RMSE = val_loss

        if VERBOSE:
            if VERBOSE_TRAIN:
                print("Epoch: {}; TRAINING: {:.1f}; VALIDATION: {:.1f}".format(epoch, train_loss, val_loss))
            else:
                print("Epoch: {}; VALIDATION: {:.1f}".format(epoch, val_loss))

        if VERBOSE_TRAIN:
            STATISTICS['TRAIN']['LOSS'][epoch, cv-1] = train_loss
        STATISTICS['VALIDATION']['LOSS'][epoch, cv-1] = val_loss

    if VERBOSE_TRAIN:
        del train_loss
    del BEST_RMSE, val_loss, model, TRAIN_SET, train_data_loader, VALIDATION_SET, val_data_loader
    gc.collect()
    torch.cuda.synchronize()
    torch.cuda.empty_cache()

    print("\n")
    cv += 1

print("Result:")
for epoch in range(EPOCHS):
    if VERBOSE_TRAIN:
        print("Epoch: {}; TRAINING: {:.1f}; VALIDATION: {:.1f}".format(epoch, np.mean(STATISTICS['TRAIN']['LOSS'][epoch, :]), np.mean(STATISTICS['VALIDATION']['LOSS'][epoch, :])))
    else:
        print("Epoch: {}; VALIDATION: {:.1f}".format(epoch, np.mean(STATISTICS['VALIDATION']['LOSS'][epoch, :])))

Run 1 of 5.


100%|██████████| 701/701 [02:19<00:00,  5.01it/s]
100%|██████████| 44/44 [00:10<00:00,  4.21it/s]


Epoch: 0; TRAINING: 898.9; VALIDATION: 784.1


100%|██████████| 701/701 [02:19<00:00,  5.01it/s]
100%|██████████| 44/44 [00:10<00:00,  4.20it/s]


Epoch: 1; TRAINING: 886.4; VALIDATION: 799.0


100%|██████████| 701/701 [02:17<00:00,  5.09it/s]
100%|██████████| 44/44 [00:10<00:00,  4.09it/s]


Epoch: 2; TRAINING: 885.8; VALIDATION: 800.2


100%|██████████| 701/701 [02:18<00:00,  5.07it/s]
100%|██████████| 44/44 [00:10<00:00,  4.17it/s]


Epoch: 3; TRAINING: 885.4; VALIDATION: 798.3


100%|██████████| 701/701 [02:18<00:00,  5.04it/s]
100%|██████████| 44/44 [00:10<00:00,  4.16it/s]


Epoch: 4; TRAINING: 885.1; VALIDATION: 794.8


100%|██████████| 701/701 [02:18<00:00,  5.08it/s]
100%|██████████| 44/44 [00:10<00:00,  4.16it/s]


Epoch: 5; TRAINING: 885.0; VALIDATION: 792.2


100%|██████████| 701/701 [02:20<00:00,  4.99it/s]
100%|██████████| 44/44 [00:10<00:00,  4.15it/s]


Epoch: 6; TRAINING: 884.6; VALIDATION: 790.3


Run 2 of 5.


100%|██████████| 701/701 [02:19<00:00,  5.01it/s]
100%|██████████| 44/44 [00:10<00:00,  4.17it/s]


Epoch: 0; TRAINING: 900.3; VALIDATION: 770.2


100%|██████████| 701/701 [02:18<00:00,  5.07it/s]
100%|██████████| 44/44 [00:10<00:00,  4.20it/s]


Epoch: 1; TRAINING: 889.6; VALIDATION: 778.3


100%|██████████| 701/701 [02:19<00:00,  5.03it/s]
100%|██████████| 44/44 [00:10<00:00,  4.20it/s]


Epoch: 2; TRAINING: 889.2; VALIDATION: 779.6


100%|██████████| 701/701 [02:19<00:00,  5.03it/s]
100%|██████████| 44/44 [00:10<00:00,  4.20it/s]


Epoch: 3; TRAINING: 888.8; VALIDATION: 777.2


100%|██████████| 701/701 [02:20<00:00,  5.01it/s]
100%|██████████| 44/44 [00:10<00:00,  4.17it/s]


Epoch: 4; TRAINING: 888.7; VALIDATION: 773.6


100%|██████████| 701/701 [02:18<00:00,  5.05it/s]
100%|██████████| 44/44 [00:10<00:00,  4.17it/s]


Epoch: 5; TRAINING: 888.5; VALIDATION: 772.1


100%|██████████| 701/701 [02:20<00:00,  5.00it/s]
100%|██████████| 44/44 [00:10<00:00,  4.21it/s]


Epoch: 6; TRAINING: 888.2; VALIDATION: 769.5


Run 3 of 5.


100%|██████████| 701/701 [02:19<00:00,  5.04it/s]
100%|██████████| 44/44 [00:10<00:00,  4.18it/s]


Epoch: 0; TRAINING: 879.4; VALIDATION: 887.0


100%|██████████| 701/701 [02:21<00:00,  4.95it/s]
100%|██████████| 44/44 [00:10<00:00,  4.12it/s]


Epoch: 1; TRAINING: 864.6; VALIDATION: 888.0


100%|██████████| 701/701 [02:17<00:00,  5.08it/s]
100%|██████████| 44/44 [00:10<00:00,  4.27it/s]


Epoch: 2; TRAINING: 863.5; VALIDATION: 885.1


100%|██████████| 701/701 [02:20<00:00,  5.01it/s]
100%|██████████| 44/44 [00:10<00:00,  4.22it/s]


Epoch: 3; TRAINING: 863.2; VALIDATION: 885.4


100%|██████████| 701/701 [02:17<00:00,  5.10it/s]
100%|██████████| 44/44 [00:10<00:00,  4.36it/s]


Epoch: 4; TRAINING: 863.0; VALIDATION: 884.5


100%|██████████| 701/701 [02:14<00:00,  5.20it/s]
100%|██████████| 44/44 [00:10<00:00,  4.32it/s]


Epoch: 5; TRAINING: 862.9; VALIDATION: 881.5


100%|██████████| 701/701 [02:16<00:00,  5.14it/s]
100%|██████████| 44/44 [00:10<00:00,  4.31it/s]


Epoch: 6; TRAINING: 862.7; VALIDATION: 879.5


Run 4 of 5.


100%|██████████| 701/701 [02:15<00:00,  5.17it/s]
100%|██████████| 44/44 [00:10<00:00,  4.21it/s]


Epoch: 0; TRAINING: 896.7; VALIDATION: 793.7


100%|██████████| 701/701 [02:19<00:00,  5.04it/s]
100%|██████████| 44/44 [00:10<00:00,  4.25it/s]


Epoch: 1; TRAINING: 883.3; VALIDATION: 805.3


100%|██████████| 701/701 [02:20<00:00,  4.97it/s]
100%|██████████| 44/44 [00:10<00:00,  4.15it/s]


Epoch: 2; TRAINING: 882.7; VALIDATION: 806.7


100%|██████████| 701/701 [02:20<00:00,  4.97it/s]
100%|██████████| 44/44 [00:10<00:00,  4.15it/s]


Epoch: 3; TRAINING: 882.2; VALIDATION: 803.6


100%|██████████| 701/701 [02:21<00:00,  4.94it/s]
100%|██████████| 44/44 [00:10<00:00,  4.19it/s]


Epoch: 4; TRAINING: 881.8; VALIDATION: 800.1


100%|██████████| 701/701 [02:20<00:00,  4.98it/s]
100%|██████████| 44/44 [00:10<00:00,  4.10it/s]


Epoch: 5; TRAINING: 881.5; VALIDATION: 797.3


100%|██████████| 701/701 [02:20<00:00,  5.00it/s]
100%|██████████| 44/44 [00:10<00:00,  4.15it/s]


Epoch: 6; TRAINING: 881.2; VALIDATION: 794.1


Run 5 of 5.


100%|██████████| 701/701 [02:22<00:00,  4.93it/s]
100%|██████████| 44/44 [00:10<00:00,  4.17it/s]


Epoch: 0; TRAINING: 772.4; VALIDATION: 1084.1


100%|██████████| 701/701 [02:21<00:00,  4.96it/s]
100%|██████████| 44/44 [00:10<00:00,  4.16it/s]


Epoch: 1; TRAINING: 759.2; VALIDATION: 1084.0


100%|██████████| 701/701 [02:20<00:00,  5.01it/s]
100%|██████████| 44/44 [00:10<00:00,  4.15it/s]


Epoch: 2; TRAINING: 757.9; VALIDATION: 1080.6


100%|██████████| 701/701 [02:20<00:00,  4.98it/s]
100%|██████████| 44/44 [00:10<00:00,  4.16it/s]


Epoch: 3; TRAINING: 756.7; VALIDATION: 1080.3


100%|██████████| 701/701 [02:21<00:00,  4.96it/s]
100%|██████████| 44/44 [00:10<00:00,  4.12it/s]


Epoch: 4; TRAINING: 757.6; VALIDATION: 1083.2


100%|██████████| 701/701 [02:20<00:00,  4.98it/s]
100%|██████████| 44/44 [00:10<00:00,  4.14it/s]


Epoch: 5; TRAINING: 757.6; VALIDATION: 1076.7


100%|██████████| 701/701 [02:20<00:00,  5.01it/s]
100%|██████████| 44/44 [00:10<00:00,  4.13it/s]


Epoch: 6; TRAINING: 758.2; VALIDATION: 1078.2


Result:
Epoch: 0; TRAINING: 869.5; VALIDATION: 863.8
Epoch: 1; TRAINING: 856.6; VALIDATION: 870.9
Epoch: 2; TRAINING: 855.8; VALIDATION: 870.4
Epoch: 3; TRAINING: 855.3; VALIDATION: 869.0
Epoch: 4; TRAINING: 855.2; VALIDATION: 867.2
Epoch: 5; TRAINING: 855.1; VALIDATION: 863.9
Epoch: 6; TRAINING: 854.9; VALIDATION: 862.3
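
The summary above averages the validation RMSE per epoch across folds. The checkpoints that are saved, however, come from the best epoch of each fold, so the score of the models used for prediction is the mean of the per-fold minima. The snippet below recomputes both figures from the logged values, which are copied from the output above and therefore rounded to one decimal.

```python
# Validation RMSE per fold and epoch, copied from the training log
val_rmse = {
    1: [784.1, 799.0, 800.2, 798.3, 794.8, 792.2, 790.3],
    2: [770.2, 778.3, 779.6, 777.2, 773.6, 772.1, 769.5],
    3: [887.0, 888.0, 885.1, 885.4, 884.5, 881.5, 879.5],
    4: [793.7, 805.3, 806.7, 803.6, 800.1, 797.3, 794.1],
    5: [1084.1, 1084.0, 1080.6, 1080.3, 1083.2, 1076.7, 1078.2],
}

# Epoch (0-based) with the lowest validation RMSE in each fold
best_epoch = {fold: scores.index(min(scores)) for fold, scores in val_rmse.items()}
best_mean = sum(min(scores) for scores in val_rmse.values()) / len(val_rmse)
last_mean = sum(scores[-1] for scores in val_rmse.values()) / len(val_rmse)

print(best_epoch)           # {1: 0, 2: 6, 3: 6, 4: 0, 5: 5}
print(round(best_mean, 1))  # 860.7
print(round(last_mean, 1))  # 862.3
```

Fold 5 stands out with a validation RMSE near 1080, so the fold average is sensitive to how the data are split.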


## 4 Prediction and Submission

In [23]:
predictions = torch.zeros(len(sample_submission_csv))

for cv in range(CV):
    # Data
    g = torch.Generator()
    g.manual_seed(SEED)
    TEST_SET = MakeDataSet(test_csv, "test")
    test_data_loader = torch.utils.data.DataLoader(TEST_SET, batch_size = TEST_BATCH_SIZE, worker_init_fn=seed_worker, generator=g, num_workers=2)

    # Model
    model = TransformerNet()
    model.load_state_dict(torch.load('/content/model_' + str(cv + 1) + '.pt'))
    model = model.cuda()

    # Prediction
    outputs = prediction(test_data_loader, model)

    predictions += outputs.reshape(outputs.shape[0] * 6)

    del model, TEST_SET, test_data_loader, outputs

    gc.collect()
    torch.cuda.synchronize()
    torch.cuda.empty_cache()

sample_submission_csv['Target'] = predictions / CV

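# makedirs also creates the parent directory and does not fail if the version directory already exists
os.makedirs(WD + '/Submission/' + str(VERSION), exist_ok=True)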
sample_submission_csv.to_csv(WD + '/Submission/' + str(VERSION) + '/submission.csv', index=False)

100%|██████████| 73/73 [00:16<00:00,  4.47it/s]
100%|██████████| 73/73 [00:16<00:00,  4.49it/s]
100%|██████████| 73/73 [00:16<00:00,  4.50it/s]
100%|██████████| 73/73 [00:16<00:00,  4.51it/s]
100%|██████████| 73/73 [00:16<00:00,  4.49it/s]
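
The loop above averages the five fold models by summing their flattened outputs and dividing by `CV`. Writing the result straight into `sample_submission_csv['Target']` assumes that the rows of `test_csv` (after the merge with the metadata) are in the same order as the submission file, with months m1 to m6 consecutive per contract. A defensive variant, sketched below on toy data, averages per contract and then looks up each value by its `ID x mK` key. All names and numbers here are illustrative.

```python
def average_folds(fold_outputs):
    """Average per-contract 6-month predictions over folds, elementwise."""
    n_folds = len(fold_outputs)
    ids = fold_outputs[0].keys()
    return {
        cid: [sum(f[cid][k] for f in fold_outputs) / n_folds for k in range(6)]
        for cid in ids
    }

def to_submission(avg, submission_ids):
    """Look up each 'ID x mK' row by contract ID and month index."""
    targets = []
    for row_id in submission_ids:
        cid, month = row_id.split(" x m")
        targets.append(avg[cid][int(month) - 1])
    return targets

# Two toy folds whose contract order differs from the submission order
fold_a = {"ID_B": [10, 20, 30, 40, 50, 60], "ID_A": [1, 2, 3, 4, 5, 6]}
fold_b = {"ID_B": [30, 40, 50, 60, 70, 80], "ID_A": [3, 4, 5, 6, 7, 8]}
avg = average_folds([fold_a, fold_b])

submission_ids = ["ID_A x m1", "ID_A x m6", "ID_B x m2"]
targets = to_submission(avg, submission_ids)
print(targets)  # [2.0, 7.0, 30.0]
```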


In [24]:
sample_submission_csv

Unnamed: 0,ID,Target
0,ID_6L67PAA x m1,977.485535
1,ID_6L67PAA x m2,891.397644
2,ID_6L67PAA x m3,838.236816
3,ID_6L67PAA x m4,878.855164
4,ID_6L67PAA x m5,885.994324
...,...,...
56011,ID_WKQPWF3 x m2,833.090515
56012,ID_WKQPWF3 x m3,760.155090
56013,ID_WKQPWF3 x m4,810.684448
56014,ID_WKQPWF3 x m5,809.381226


In [25]:
drive.flush_and_unmount()

In [26]:
end_time = time.time()
print("Runtime of the Notebook: {} min".format(np.round((end_time - start_time) / 60, 2)))

Runtime of the Notebook: 89.79 min
