Now that I have this nifty dataset, time to see if I can do predictions on it. 

I could just do what I did in the past few notebooks and throw some neural network at it. 
However, why not have some fun first?

What I'm going to do is a "pachinko" type of system.
What is pachinko? IRL, Pachinko is a Japanese gambling game that resembles a cross between a pinball machine 
and the coin drop seen at carnival fairs in the West.  

The concept is simple: a pachinko machine is a box. A ball is launched so that it reaches the top of the box 
and then falls under gravity, navigating obstacles and mazes along the way. 

When it reaches the bottom, it inevitably ends up in another box. Depending on where the ball lands, you either get more balls (which can be exchanged for prizes) or nothing. 

See this YouTube video for a better explanation:
https://www.youtube.com/watch?v=-tBy2jemw4s

In our case, the ball will be a datum. The mazes and obstacles will be different clusters. And the ending boxes will be very simple neural networks (one or two hidden layers), each trained specifically on the data that lands in its box. It'll make more sense when we get to the code.

But first some libraries:

In [2]:
import torch
from torch import nn

from sklearn.utils import shuffle
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder

from sklearn.decomposition import FastICA
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
import os

import pandas as pd

In [3]:
from tqdm.notebook import tqdm

import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, WeightedRandomSampler


from sklearn.preprocessing import MinMaxScaler    
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.datasets import load_iris

Learning rate modifier downstairs

https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html

https://pytorch.org/tutorials/beginner/basics/data_tutorial.html

https://medium.com/@shashikachamod4u/excel-csv-to-pytorch-dataset-def496b6bcc1

https://realpython.com/pandas-groupby/

https://towardsdatascience.com/pytorch-tabular-multiclass-classification-9f8211a123ab

First we'll set up the preliminaries:

In [4]:
torch.manual_seed(0)

# Use 3 decimal places in output display
pd.set_option("display.precision", 3)

# Don't wrap repr(DataFrame) across additional lines
pd.set_option("display.expand_frame_repr", False)

# Set max rows displayed in output to 25
pd.set_option("display.max_rows", 25)

In [5]:
class ClassifierDataset(Dataset):
    
    def __init__(self, X_data, y_data):
        self.X_data = X_data
        self.y_data = y_data
        
    def __getitem__(self, index):
        return self.X_data[index], self.y_data[index]
        
    def __len__ (self):
        return len(self.X_data)
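
To see how a dataset like this gets consumed, here is a minimal sketch that wraps a few made-up rows and batches them with a DataLoader. The toy tensors and the class name `PairDataset` are assumptions for illustration, not part of the notebook.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class PairDataset(Dataset):
    """Same shape as ClassifierDataset above: one (features, label) pair per index."""
    def __init__(self, X_data, y_data):
        self.X_data = X_data
        self.y_data = y_data

    def __getitem__(self, index):
        return self.X_data[index], self.y_data[index]

    def __len__(self):
        return len(self.X_data)

# Five made-up samples with three features each, and binary labels.
X_toy = torch.arange(15, dtype=torch.float32).reshape(5, 3)
y_toy = torch.tensor([0, 1, 0, 1, 1])

loader = DataLoader(PairDataset(X_toy, y_toy), batch_size=2)
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]
print(batch_shapes)  # batches of 2, 2 and 1 samples
```

The DataLoader only needs `__getitem__` and `__len__`, which is why such a small class is enough.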


In [6]:
# Get cpu or gpu device for training.
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self,num_features,num_classes):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(num_features, num_features*2-1),
            nn.ReLU(),
            nn.Linear(num_features*2-1, num_features),
            nn.ReLU(),
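            nn.Linear(num_features, num_classes),
            # NOTE: this final ReLU clips the class scores at zero; nn.CrossEntropyLoss
            # normally expects the raw, unbounded output of the last Linear layer.
            nn.ReLU()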
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits


Using cuda device
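
One detail worth checking in the model above is the ReLU after the last Linear layer. nn.CrossEntropyLoss applies log-softmax itself and works on unbounded scores, so clipping them at zero throws information away. A small sketch with made-up scores (the values are assumptions for illustration):

```python
import torch
from torch import nn

criterion = nn.CrossEntropyLoss()

# Made-up raw scores for two samples and two classes; the true class is 0 for both.
logits = torch.tensor([[-3.0, -1.0],
                       [ 2.0, -4.0]])
target = torch.tensor([0, 0])

loss_raw = criterion(logits, target).item()
loss_clipped = criterion(torch.relu(logits), target).item()

# After ReLU the first row becomes [0, 0]: the difference between the two
# negative scores is lost, so the loss no longer reflects the wrong preference.
print(f"raw: {loss_raw:.4f}, clipped: {loss_clipped:.4f}")
```

Whether dropping that last ReLU helps on this dataset is something to test, not something this toy example proves.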


In [7]:
# Returns the percentage of predictions that match the true labels, rounded to a whole number
def multi_acc(y_pred, y_test):
    y_pred_softmax = torch.log_softmax(y_pred, dim = 1)
    _, y_pred_tags = torch.max(y_pred_softmax, dim = 1)    
    
    correct_pred = (y_pred_tags == y_test).float()
    acc = correct_pred.sum() / len(correct_pred)
    
    acc = torch.round(acc * 100)
    
    return acc
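
As a quick sanity check of the accuracy helper, the sketch below feeds it made-up scores. Because log-softmax is monotonic, taking the argmax of the raw scores gives the same predicted tags. The tensors here are illustrative assumptions.

```python
import torch

def multi_acc(y_pred, y_test):
    # Same logic as the helper above: predicted tag = index of the largest score.
    y_pred_softmax = torch.log_softmax(y_pred, dim=1)
    _, y_pred_tags = torch.max(y_pred_softmax, dim=1)
    correct_pred = (y_pred_tags == y_test).float()
    return torch.round(correct_pred.sum() / len(correct_pred) * 100)

# Made-up scores for four samples and two classes.
scores = torch.tensor([[2.0, 0.5],
                       [0.1, 1.2],
                       [3.0, 2.9],
                       [0.0, 4.0]])
labels = torch.tensor([0, 1, 1, 1])

acc = multi_acc(scores, labels)  # 3 of the 4 predictions match
same_tags = torch.equal(
    scores.argmax(dim=1),
    torch.log_softmax(scores, dim=1).argmax(dim=1),
)
print(acc.item(), same_tags)
```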

accuracy_stats = {
                    'train': [],
                    "val": []}

loss_stats = {
    
                'train': [],
                "val": []}

In [8]:
#Not really needed right now but I might use it in some other projects
###############------- CUSTOM LOSS FUNCTIONS ---------#########################################

class Monte_fn(nn.Module):
    def __init__(self, weight=None, size_average=True):
        super(Monte_fn, self).__init__()

    def forward(self, output, target, smooth=1): 
        
        #neural network output
        def add(loss , output, target, u_left):
            if u_left < 0:
                return loss
            else:

                #split NN output into two
                a1,a2 = output.split(split_size = 1,dim=1)
                pred = (a2-a1).floor() + 1

                pred_target = torch.column_stack((pred,target))


                #for each values of output_target that are == to indexed member of unique
                class_rows = pred_target[(pred_target[:, 1] == target.unique()[u_left])]
                pred,class_target = class_rows.split(1,dim=1)

                percentage_class_correct =torch.sum(1-(pred-class_target))/len(class_target)
                proportion_class_k = len(class_rows)/len(target)

                return add(loss + percentage_class_correct/proportion_class_k,output,target,u_left-1)

        u_left = len(target.unique())-1

        
        return add(0, output, target, u_left)

    

In [9]:
################################------ Function to make NEURAL NETWORKS -------###########################################

def make_nn(df , verbose):
    print("Making Neural Network")
    
    X = df.iloc[:, 0:-1]
    y = df.iloc[:, -1]
        
        
#############--- PARAMETERS ------#####################

    NUM_FEATURES = len(X.columns)
    NUM_CLASSES = len(np.unique(y.values))
    LEARNING_RATE = 0.0007
    EPOCHS = 50
    BATCH_SIZE = 16
    
    OBJECTIVE_FUNCTION = nn.CrossEntropyLoss() #Monte_fn() #

######################################################
    
    y = np.ravel(y)
    le = LabelEncoder()
    le.fit(y)
    y=le.transform(y)
        
    nn_error= False

    try:
        # Split into train+val and test
        X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.2, 
                                                                  # If the classes are imbalanced, use stratify!
                                                                  #stratify=y,
                                                                  random_state=69)

        # Split train into train-val
        X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.1,
                                                          #stratify=y_trainval,
                                                          random_state=21)
    except Exception as e:
        print(e)
        nn_error= True

    if nn_error == True:
        return False

    #Normalize everything
    scaler = MinMaxScaler()
    X_train = scaler.fit_transform(X_train)
    X_val = scaler.transform(X_val)
    X_test = scaler.transform(X_test)
    X_train, y_train = np.array(X_train), np.array(y_train)
    X_val, y_val = np.array(X_val), np.array(y_val)
    X_test, y_test = np.array(X_test), np.array(y_test)


    train_dataset = ClassifierDataset(torch.from_numpy(X_train).float(), torch.from_numpy(y_train).long())
    val_dataset = ClassifierDataset(torch.from_numpy(X_val).float(), torch.from_numpy(y_val).long())
    test_dataset = ClassifierDataset(torch.from_numpy(X_test).float(), torch.from_numpy(y_test).long())
    
    
    train_loader = DataLoader(dataset=train_dataset,
                          batch_size=BATCH_SIZE,
                          #sampler=weighted_sampler <--- In this example the classes are balanced, no need for weighted sampler
                         )
    val_loader = DataLoader(dataset=val_dataset, batch_size=1)
    test_loader = DataLoader(dataset=test_dataset, batch_size=1)

    model = NeuralNetwork(NUM_FEATURES,NUM_CLASSES).to(device)

    #loss function
    criterion = OBJECTIVE_FUNCTION 
    optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)

    accuracy_stats = {
                        'train': [],
                        "val": []}

    loss_stats = {
                    'train': [],
                    "val": []}

    print("Begin training.")

    for e in tqdm(range(1, EPOCHS+1)):

        # TRAINING
        train_epoch_loss = 0
        train_epoch_acc = 0

        model.train()
        for X_train_batch, y_train_batch in train_loader:
            X_train_batch, y_train_batch = X_train_batch.to(device), y_train_batch.to(device)
            optimizer.zero_grad()

            y_train_pred = model(X_train_batch)

            train_loss = criterion(y_train_pred, y_train_batch)
            train_acc = multi_acc(y_train_pred, y_train_batch)

            train_loss.backward()
            optimizer.step()

            train_epoch_loss += train_loss.item()
            train_epoch_acc += train_acc.item()


        # VALIDATION    
        with torch.no_grad():

            val_epoch_loss = 0
            val_epoch_acc = 0

            model.eval()
            for X_val_batch, y_val_batch in val_loader:
                #print(X_val_batch)
                X_val_batch, y_val_batch = X_val_batch.to(device), y_val_batch.to(device)

                y_val_pred = model(X_val_batch)
                #print(y_val_pred)

                val_loss = criterion(y_val_pred, y_val_batch)
                val_acc = multi_acc(y_val_pred, y_val_batch)

                val_epoch_loss += val_loss.item()
                val_epoch_acc += val_acc.item()
        # Record the epoch averages once per epoch (not once per validation batch)
        loss_stats['train'].append(train_epoch_loss/len(train_loader))
        loss_stats['val'].append(val_epoch_loss/len(val_loader))
        accuracy_stats['train'].append(train_epoch_acc/len(train_loader))
        accuracy_stats['val'].append(val_epoch_acc/len(val_loader))

        if verbose:
            print(f'Epoch {e:03}: | Train Loss: {train_epoch_loss/len(train_loader):.5f} | Val Loss: {val_epoch_loss/len(val_loader):.5f} | Train Acc: {train_epoch_acc/len(train_loader):.3f}% | Val Acc: {val_epoch_acc/len(val_loader):.3f}%')
    
    print(f"Final Val. Accuracy:{accuracy_stats['val'][-1]}")
    return model
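
The commented-out `stratify` arguments matter when the classes are imbalanced. The sketch below uses made-up labels (an assumption: 15% positives) to show how a stratified split keeps the class ratio in both parts, and how the scaler is fit on the training part only, as in make_nn.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))        # made-up features
y_demo = np.array([0] * 170 + [1] * 30)   # made-up imbalanced labels (15% positives)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=69
)

# Fit the scaler on the training rows only, then reuse it for the held-out rows.
scaler = MinMaxScaler()
X_tr_scaled = scaler.fit_transform(X_tr)
X_te_scaled = scaler.transform(X_te)

print(y_tr.mean(), y_te.mean())  # share of positives in each part
```

The same fitted scaler would also have to be applied to any row that is passed to the trained network later on.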


In [10]:
df = pd.read_csv("encoded_data.csv") #,header=None)

In [11]:
df

Unnamed: 0,"('Divorced',)","('Married',)","('Single',)","('Female',)","('Male',)","('Healthcare Representative',)","('Human Resources',)","('Laboratory Technician',)","('Manager',)","('Manufacturing Director',)",...,StandardHours,StockOptionLevel,TotalWorkingYears,TrainingTimesLastYear,WorkLifeBalance,YearsAtCompany,YearsInCurrentRole,YearsSinceLastPromotion,YearsWithCurrManager,Attrition
0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,80,0,8,0,1,6,4,0,5,1
1,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,80,1,10,3,3,10,7,1,7,0
2,0.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,...,80,0,7,3,3,0,0,0,0,1
3,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,80,0,8,3,3,8,7,3,0,0
4,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,...,80,1,6,3,3,2,2,2,2,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1465,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,...,80,1,17,3,3,5,2,0,3,0
1466,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,...,80,1,9,5,3,7,7,1,7,0
1467,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,...,80,1,6,0,3,6,2,0,3,0
1468,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,80,0,17,3,2,9,6,0,8,0


In [12]:
le = LabelEncoder()
# Pass a 1-D Series (not a one-column DataFrame) so sklearn does not warn about the shape of y
df[4] = le.fit_transform(df.iloc[:, -1])
df


Unnamed: 0,"('Divorced',)","('Married',)","('Single',)","('Female',)","('Male',)","('Healthcare Representative',)","('Human Resources',)","('Laboratory Technician',)","('Manager',)","('Manufacturing Director',)",...,StockOptionLevel,TotalWorkingYears,TrainingTimesLastYear,WorkLifeBalance,YearsAtCompany,YearsInCurrentRole,YearsSinceLastPromotion,YearsWithCurrManager,Attrition,4
0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0,8,0,1,6,4,0,5,1,1
1,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,1,10,3,3,10,7,1,7,0,0
2,0.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,...,0,7,3,3,0,0,0,0,1,1
3,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0,8,3,3,8,7,3,0,0,0
4,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,...,1,6,3,3,2,2,2,2,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1465,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,...,1,17,3,3,5,2,0,3,0,0
1466,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,...,1,9,5,3,7,7,1,7,0,0
1467,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,...,1,6,0,3,6,2,0,3,0,0
1468,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0,17,3,2,9,6,0,8,0,0


In [13]:
############### Preprocessing#######################

df.columns = [*df.columns[:-1], 'class']

np.random.seed(1)  
df = shuffle(df)

split = round(len(df) * 0.75)

# 75/25 split; iloc[:, 1:] drops the first column of each part
train = df[:split].iloc[:, 1:]
test = df[split:].iloc[:, 1:]

X = train.iloc[:, 0:-1]
y = train.iloc[:, -1]

#Normalize
#min_max_scaler = MinMaxScaler()
#x_scaled = min_max_scaler.fit_transform(X.values)
#X =pd.DataFrame(x_scaled)


This is where we actually train our tree, or pachinko machine. 

In more concrete terms, what it does is this:
it takes a dataframe and splits it into groups ("clusters") according to the class that a k-nearest-neighbours (KNN) classifier predicts for each row. 

THEN, if the predictions pass a certain threshold of homogeneity (the share of rows that fall in the largest group), it creates a new neural network as a leaf of the tree and trains it on that node's data. 

If not, it applies KNN again inside each group and repeats the process. 
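
A minimal sketch of the split-or-leaf decision described above, on made-up data. It measures homogeneity the same way the Tree class does (share of rows in the most common predicted group); the toy arrays and the free-standing helper `homogeneity` are assumptions for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def homogeneity(pred):
    """Share of rows that fall in the most common predicted group."""
    counts = np.unique(pred, return_counts=True)[1]
    return counts.max() / len(pred)

# Made-up 1-D data: two well-separated groups.
X_toy = np.array([[0.0], [0.1], [0.2], [0.3],
                  [5.0], [5.1], [5.2], [5.3], [5.4], [5.5]])
y_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3).fit(X_toy, y_toy)
pred = knn.predict(X_toy)

HOMO_THRESHOLD = 0.99
homo = homogeneity(pred)
if homo > HOMO_THRESHOLD:
    action = "train a leaf network"
else:
    action = "split into one child per predicted group"
print(homo, action)
```

Here the largest group holds 6 of 10 rows, so the node is split; each child then contains a single predicted group and would become a leaf.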


In [14]:

KNN_PARAM = 3
HOMO_THRESHOLD =0.99

################################################################
#Tree class
################################################################

class Tree:
    def __init__(self,name,df,homo_threshold):
        self.name = name
        self.df = df
        self.homo_threshold = homo_threshold
        self.X = df.iloc[:, 0:-1]
        self.y = df.iloc[:, -1]
        self.KNN = KNeighborsClassifier(n_neighbors=KNN_PARAM).fit(self.X,self.y)
        self.pred = self.KNN.predict(self.X)
        
        # if self.homo == 0 then KNN failed and you should self destruct it
        self.homo = self.homogeneity(self.pred, df)
        self.error = self.Error(self.pred,self.y)
        self.cluster_names = np.unique(self.pred)
        self.kids = self.create_kids() if self.homo <= homo_threshold else None
        self.kid_rank = None

        self.model = make_nn(df,verbose=False) if self.kids == None else None
        
    def Error(self,pred,y):
        correct= np.where(pred==y,True,False)
        perc = correct.sum()/len(correct)
        return perc
    
    def homogeneity(self, pred, y):
        #most_pop_class = self.pred.value_counts() <----this is good for dfs
        most_pop_class = max(np.unique(pred, return_counts = True)[1])
        homo = most_pop_class/len(y)
        return homo
        
    def check_thresh(self,cluster_name, X , y ,pred , homo_threshold):
        new_name = self.name + str(cluster_name) + "-"
        
        print(f"------------{new_name}----------")
        df_ = X.copy()
        df_['cluster'] = pred
        df_['class'] = y
        # Use the keyword form: the positional axis argument of drop() is deprecated
        df_ = df_[df_['cluster'] == cluster_name].drop(columns='cluster')
        
        try:
            new_node = Tree(new_name,
                            df_,
                           homo_threshold)
            new_node.kid_rank=cluster_name 
            
        except Exception as e:
            print(e)
            new_node = None

        return new_node

    def create_kids(self):
        kids = [self.check_thresh(i, self.X, self.y , self.pred, self.homo_threshold) for i in self.cluster_names]
        kids = list(filter(lambda kid: (kid != None ), kids))  #filter out "None/Ghost Kids"
        
        return kids
    
        #################################################
        
    def get_kids(self):
        for kid in self.kids:
            print(kid.kid_rank)

    
    def pachinko_test(self,x):
        if self.kids == None:
            print("No kids! Checking NN!")
            
            #Makes sure everything starts from 0 --- pytorch hates it when it doesn't
            y = np.ravel(self.y)
            le = LabelEncoder()
            le.fit(self.y)
            y=le.transform(self.y)
            
            with torch.no_grad():
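                # Build a (1, n_features) float tensor directly from the NumPy row
                x = torch.as_tensor(np.asarray(x, dtype=np.float32)).unsqueeze(0).to(device)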
                y_pred = self.model(x)

                y_pred_softmax = torch.log_softmax(y_pred, dim = 1)
                _, y_pred_tags = torch.max(y_pred_softmax, dim = 1)    
                
            y_pred_tags = le.inverse_transform(y_pred_tags.to("cpu"))
            print(f"NN pred:{y_pred_tags}")

            
            return y_pred_tags
        else:
            indy_pred = self.KNN.predict([x])[0]
            print(f"Pachinko down we go!:{indy_pred}")
            
            chosen_one = list(filter(lambda kid: (kid.kid_rank == indy_pred ), self.kids))
            
            if chosen_one == []:
                return indy_pred
            else:
                return chosen_one[0].pachinko_test(x)
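
In pachinko_test a single NumPy row has to become a batch of one before it can go through the network. A sketch of one way to do that without wrapping the array in a Python list (the row values are made up):

```python
import numpy as np
import torch

row = np.array([0.0, 1.0, 35.0, 80.0])  # made-up feature row

# from_numpy wraps the array, .float() converts it to float32,
# and unsqueeze(0) adds the batch dimension the network expects.
x = torch.from_numpy(row).float().unsqueeze(0)
print(x.shape, x.dtype)
```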
            


In [15]:
df

Unnamed: 0,"('Divorced',)","('Married',)","('Single',)","('Female',)","('Male',)","('Healthcare Representative',)","('Human Resources',)","('Laboratory Technician',)","('Manager',)","('Manufacturing Director',)",...,StockOptionLevel,TotalWorkingYears,TrainingTimesLastYear,WorkLifeBalance,YearsAtCompany,YearsInCurrentRole,YearsSinceLastPromotion,YearsWithCurrManager,Attrition,class
1291,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,...,0,10,4,1,10,3,0,8,1,1
1153,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0,0,2,4,0,0,0,0,1,1
720,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0,7,2,3,5,2,0,1,1,1
763,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1,1,2,3,1,1,0,0,0,0
976,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,...,1,33,0,3,19,16,15,9,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
715,0.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,...,1,6,2,3,6,5,1,2,0,0
905,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,2,9,2,2,7,7,1,7,0,0
1096,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,...,0,21,2,3,21,7,7,7,0,0
235,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,...,1,22,3,3,17,13,1,9,0,0


In [16]:
df.columns

Index(['('Divorced',)', '('Married',)', '('Single',)', '('Female',)',
       '('Male',)', '('Healthcare Representative',)', '('Human Resources',)',
       '('Laboratory Technician',)', '('Manager',)',
       '('Manufacturing Director',)', '('Research Director',)',
       '('Research Scientist',)', '('Sales Executive',)',
       '('Sales Representative',)', '('Human Resources',).1',
       '('Life Sciences',)', '('Marketing',)', '('Medical',)', '('Other',)',
       '('Technical Degree',)', '('Human Resources',).2',
       '('Research & Development',)', '('Sales',)', 'Age', 'BusinessTravel',
       'DailyRate', 'DistanceFromHome', 'Education', 'EmployeeCount',
       'EmployeeNumber', 'EnvironmentSatisfaction', 'HourlyRate',
       'JobInvolvement', 'JobLevel', 'JobSatisfaction', 'MonthlyIncome',
       'MonthlyRate', 'NumCompaniesWorked', 'OverTime', 'PercentSalaryHike',
       'PerformanceRating', 'RelationshipSatisfaction', 'StandardHours',
       'StockOptionLevel', 'TotalWorkingYear

In [17]:
df.apply(lambda row: row.astype(str).str.contains('Yes').any(), axis=1).sum()

0

In [18]:

root = Tree("root-",train,
            HOMO_THRESHOLD)


------------root-0-----------
Making Neural Network
Begin training.
Final Val. Accuracy:80.24691358024691
------------root-1-----------
------------root-1-0-----------
Making Neural Network
Begin training.
Final Val. Accuracy:100.0
------------root-1-1-----------
Making Neural Network
Begin training.
Final Val. Accuracy:100.0


In [19]:
X = test.iloc[:, 0:-1]
y = test.iloc[:, -1]

y_preds = [int(root.pachinko_test(x)) for x in test.iloc[:,0:-1].values]

Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:1
Pachinko down we go!:1
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:1
Pachinko down we go!:1
No kids! Checking NN!
NN pred:[1]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking N

Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:1
Pachinko down we go!:1
No kids! Checking NN!
NN pred:[1]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko down we go!:0
No kids! Checking NN!
NN pred:[0]
Pachinko



In [20]:
confusion_matrix(y,y_preds)

array([[295,  16],
       [ 53,   4]])

To interpret these results we refer to:
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html

The i-th row and j-th column entry indicates the number of samples with true label being i-th class and predicted label being j-th class.

So... as you can see, the system misclassified almost all of the second class: only 4 of its 57 samples were predicted correctly! 
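
To make the row/column reading concrete, this sketch rebuilds the matrix above by hand and divides each row by its total, which gives the fraction of each true class that ended up under each predicted label. Only the four counts come from the notebook; the variable names are assumptions.

```python
import numpy as np

# Rows are true labels, columns are predicted labels (counts copied from the output above).
cm = np.array([[295, 16],
               [ 53,  4]])

row_totals = cm.sum(axis=1, keepdims=True)  # true samples per class: 311 and 57
per_class = cm / row_totals                 # each row now sums to 1

print(per_class.round(2))
# The diagonal is the per-class recall: about 0.95 for class 0 and 0.07 for class 1.
```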


We can gather more stats:

In [59]:
print(classification_report(y,y_preds))

              precision    recall  f1-score   support

           0       0.85      0.95      0.90       311
           1       0.20      0.07      0.10        57

    accuracy                           0.81       368
   macro avg       0.52      0.51      0.50       368
weighted avg       0.75      0.81      0.77       368



https://medium.com/@kohlishivam5522/understanding-a-classification-report-for-your-machine-learning-model-88815e2ce397

There are four ways to check if the predictions are right or wrong:

TN / True Negative: the case was negative and predicted negative
TP / True Positive: the case was positive and predicted positive
FN / False Negative: the case was positive but predicted negative
FP / False Positive: the case was negative but predicted positive
    

Precision = TP/(TP + FP): What percent of your positive predictions were correct?

Recall = TP/(TP + FN): What percent of the positive cases did you catch? Fraction of positives that were correctly identified.

F1 score = 2*(Recall * Precision) / (Recall + Precision): a single number that balances precision and recall.

The F1 score is the harmonic mean of precision and recall, such that the best score is 1.0 and the worst is 0.0. F1 scores are generally lower than accuracy measures as they embed precision and recall into their computation. As a rule of thumb, the weighted average of F1 should be used to compare classifier models, not global accuracy.

Support: the number of actual occurrences of the class in the specified dataset.
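
Plugging the confusion matrix from earlier into these formulas reproduces the row for class 1 in the report. A sketch, treating class 1 as the positive class (the variable names are assumptions):

```python
# Counts from the confusion matrix above, with class 1 as "positive".
tn, fp, fn, tp = 295, 16, 53, 4

precision = tp / (tp + fp)                           # 4 / 20
recall = tp / (tp + fn)                              # 4 / 57
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two
accuracy = (tp + tn) / (tp + tn + fp + fn)           # 299 / 368

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.2f}")
# precision=0.20 recall=0.07 f1=0.10 accuracy=0.81
```

Accuracy looks decent only because class 0 dominates the test set; the class 1 numbers show how little of the minority class is caught.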


But we already know we'll have to do better.

https://stackoverflow.com/questions/27474921/compare-two-columns-using-pandas

https://pytorch.org/docs/stable/nn.html#loss-functions


https://pytorch.org/docs/stable/optim.html

https://ruder.io/optimizing-gradient-descent/index.html#gradientdescentvariants

https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/

https://analyticsindiamag.com/ultimate-guide-to-pytorch-optimizers/

https://towardsdatascience.com/pytorch-tabular-multiclass-classification-9f8211a123ab