## Folds on SST Sentiment Dataset

This notebook measures how much the test results vary when the training input is varied using stratified K-fold splits of the Stanford Sentiment Treebank (SST).

BERT is used as the 5-way classifier.


###### This is the dataset of the following paper:

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher Manning, Andrew Ng and Christopher Potts

Conference on Empirical Methods in Natural Language Processing (EMNLP 2013)

In [4]:
# Importing necessary libraries
import pandas as pd
import numpy as np
from datetime import datetime
import random
import sklearn
import torch
import torch.nn as nn
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device

from simpletransformers.classification import ClassificationModel

In [22]:
# procedure for getting the data sets and formatting them for the transformer
 
train=pd.read_csv('./SST_data/sst5_train_sentences.csv', names=['text', 'labels'] )

Eval=pd.read_csv('./SST_data/sst5_dev.csv' , names=['text', 'labels'] )

test=pd.read_csv('./SST_data/sst5_test.csv', names=['text', 'labels']  )

train

Unnamed: 0,text,labels
0,Reno himself can take credit for most of the m...,pos
1,"Despite the film 's shortcomings , the stories...",pos
2,"Despite its dry wit and compassion , the film ...",neg
3,The central character is n't complex enough to...,neu
4,Rifkin no doubt fancies himself something of a...,very neg
...,...,...
8529,A conventional but heartwarming tale .,very pos
8530,It has the air of a surprisingly juvenile lark...,neg
8531,The culmination of everyone 's efforts is give...,neu
8532,Overcomes its visual hideousness with a sharp ...,pos


In [19]:
Eval

Unnamed: 0,text,labels
0,( director ) O'Fallon manages to put some love...,very neg
1,A thinly veiled look at different aspects of C...,neu
2,If your taste runs to ` difficult ' films you ...,pos
3,( Leigh ) lays it on so thick this time that i...,neu
4,"A full world has been presented onscreen , not...",pos
...,...,...
1095,"Just as moving , uplifting and funny as ever .",pos
1096,Davis ... is so enamored of her own creation t...,neg
1097,"An exhilarating futuristic thriller-noir , Min...",very pos
1098,I got a headache watching this meaningless dow...,very neg


In [20]:
test

Unnamed: 0,text,labels
0,Maybe I found the proceedings a little bit too...,neg
1,"As with too many studio pics , plot mechanics ...",very neg
2,"Beers , who , when she 's given the right line...",neu
3,"Cute , funny , heartwarming digitally animated...",very pos
4,So what is the point ?,very neg
...,...,...
2205,It 's a glorious groove that leaves you wantin...,very pos
2206,It 's getting harder and harder to ignore the ...,neg
2207,"A real movie , about real people , that gives ...",pos
2208,"Sharp , lively , funny and ultimately sobering...",very pos


To stay closer to the standard way of doing K-folds, I am merging the validation and training sets.

The data will be split into 5 folds. In every trial, 4 folds are used for training and the remaining one is used for validation.
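For comparison, scikit-learn offers this kind of split directly. The sketch below is not the code used in this notebook: it runs `StratifiedKFold` on toy labels (an assumption, standing in for the five SST classes) to show that each of the 5 validation folds keeps the class proportions while the other 4 folds form the training set.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy data (assumption): 10 examples for each of the 5 classes, 0 = very neg ... 4 = very pos.
labels = np.array([0, 1, 2, 3, 4] * 10)
texts = np.array([f"sentence {i}" for i in range(len(labels))])

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

folds = []
for train_idx, valid_idx in skf.split(texts, labels):
    folds.append((train_idx, valid_idx))
    # Report the fold sizes and the number of examples per class in the validation fold.
    print(len(train_idx), len(valid_idx), np.bincount(labels[valid_idx]))
```

Each iteration yields index arrays, so the matching rows can be selected from a DataFrame with `.iloc`.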

In [21]:
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
train=pd.concat([train, Eval], ignore_index=True)
train

Unnamed: 0,text,labels
0,Reno himself can take credit for most of the m...,pos
1,"Despite the film 's shortcomings , the stories...",pos
2,"Despite its dry wit and compassion , the film ...",neg
3,The central character is n't complex enough to...,neu
4,Rifkin no doubt fancies himself something of a...,very neg
...,...,...
9629,"Just as moving , uplifting and funny as ever .",pos
9630,Davis ... is so enamored of her own creation t...,neg
9631,"An exhilarating futuristic thriller-noir , Min...",very pos
9632,I got a headache watching this meaningless dow...,very neg


##### We now change the text labels to numeric values (0 to 4)
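The mapping used is very neg = 0, neg = 1, neu = 2, pos = 3, very pos = 4. As a minimal sketch, the same conversion can be written with a dictionary and pandas `Series.map`. The `LABEL_TO_ID` name and the three-row frame are illustrative only; the sentences and labels are taken from the tables above.

```python
import pandas as pd

# Illustrative mapping from text label to class number (name is hypothetical).
LABEL_TO_ID = {'very neg': 0, 'neg': 1, 'neu': 2, 'pos': 3, 'very pos': 4}

toy = pd.DataFrame({
    'text': ['A conventional but heartwarming tale .',
             'So what is the point ?',
             'Just as moving , uplifting and funny as ever .'],
    'labels': ['very pos', 'very neg', 'pos'],
})

# map() looks every label up in the dictionary; unknown labels would become NaN.
toy['labels'] = toy['labels'].map(LABEL_TO_ID)
print(toy['labels'].tolist())  # [4, 0, 3]
```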

In [24]:

def labelsToNumbers(df):
    # Replace the text label in column 1 ('labels') with its class number, in place:
    # very neg=0, neg=1, neu=2, pos=3, very pos=4
    for row in range(len(df)):
        if df.iloc[row,1]=='very pos': df.iloc[row,1]=4
        if df.iloc[row,1]=='pos': df.iloc[row,1]=3
        if df.iloc[row,1]=='neu': df.iloc[row,1]=2
        if df.iloc[row,1]=='neg': df.iloc[row,1]=1
        if df.iloc[row,1]=='very neg': df.iloc[row,1]=0

    return df

train=labelsToNumbers(train)
train

Unnamed: 0,text,labels
0,Reno himself can take credit for most of the m...,3
1,"Despite the film 's shortcomings , the stories...",3
2,"Despite its dry wit and compassion , the film ...",1
3,The central character is n't complex enough to...,2
4,Rifkin no doubt fancies himself something of a...,0
...,...,...
8529,A conventional but heartwarming tale .,4
8530,It has the air of a surprisingly juvenile lark...,1
8531,The culmination of everyone 's efforts is give...,2
8532,Overcomes its visual hideousness with a sharp ...,3


In [25]:
test=labelsToNumbers(test)
test

Unnamed: 0,text,labels
0,Maybe I found the proceedings a little bit too...,1
1,"As with too many studio pics , plot mechanics ...",0
2,"Beers , who , when she 's given the right line...",2
3,"Cute , funny , heartwarming digitally animated...",4
4,So what is the point ?,0
...,...,...
2205,It 's a glorious groove that leaves you wantin...,4
2206,It 's getting harder and harder to ignore the ...,1
2207,"A real movie , about real people , that gives ...",3
2208,"Sharp , lively , funny and ultimately sobering...",4


In [26]:
# first we randomise the order of train

import random
 
randNum=[]

for row in range(len(train)):
    randNum.append(random.random())

train['RandNum']=randNum

train=train.sort_values(by=['RandNum'] )
train=train.drop(['RandNum'],axis=1)
del(randNum)
train
    

Unnamed: 0,text,labels
760,the phone rings and a voice tells you you 've ...,2
5169,"An often watchable , though goofy and lurid , ...",3
7343,Hawn and Sarandon form an acting bond that mak...,4
5695,"An unremarkable , modern action\/comedy buddy ...",2
3680,A hard look at one man 's occupational angst a...,3
...,...,...
5065,Not many movies have that kind of impact on me...,4
782,Intimate and panoramic .,3
2306,"Suspend your disbelief here and now , or you '...",1
1217,`` 13 Conversations About One Thing '' is an i...,4
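
pandas can also shuffle rows directly. As a hedged alternative to the random-key sort above (not what this notebook ran), `DataFrame.sample(frac=1)` returns all rows in random order and keeps the original index labels. The toy frame and the `random_state` value are assumptions for illustration.

```python
import pandas as pd

# Toy frame (assumption) with the same two columns as the notebook's data.
toy = pd.DataFrame({'text': list('abcdef'), 'labels': [0, 1, 2, 3, 4, 0]})

# frac=1 samples every row exactly once, i.e. a full shuffle;
# random_state makes the order reproducible.
shuffled = toy.sample(frac=1, random_state=42)
print(shuffled)
```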


In [27]:
count0=0
count1=0
count2=0
count3=0
count4=0
 

train0=[] #all statements that are very neg  (class 0)
train1=[]
train2=[]
train3=[]
train4=[] 


for row in range(len(train)):
        if train.iloc[row,1]==0: 
            count0+=1
            train0.append(train.iloc[row,:])
        if train.iloc[row,1]==1: 
            count1+=1
            train1.append(train.iloc[row,:])
        if train.iloc[row,1]==2: 
            count2+=1
            train2.append(train.iloc[row,:])
        if train.iloc[row,1]==3: 
            count3+=1
            train3.append(train.iloc[row,:])
        if train.iloc[row,1]==4: 
            count4+=1
            train4.append(train.iloc[row,:])


print('0s ', count0)
print('1s ', count1)
print('2s ', count2)
print('3s ', count3)
print('4s ', count4) 

            

0s  1090
1s  2215
2s  1623
3s  2319
4s  1287


In [29]:
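def div5(myinteger):
    # Returns (fold size, leftover): the size of each of the 5 equal folds
    # and the number of rows left over when myinteger is not a multiple of 5.
    size_m5 =myinteger-(myinteger%5)
    QuantityToRemove=size_m5/5   # true division, so the fold size is a float
    
    return QuantityToRemove, myinteger%5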

C0div5=div5(count0)
C1div5=div5(count1)
C2div5=div5(count2)
C3div5=div5(count3)
C4div5=div5(count4)
 



print('To omit from class 0s ', C0div5)
print('To omit from class 1s ', C1div5)
print('To omit from class 2s ', C2div5)
print('To omit from class 3s ', C3div5)
print('To omit from class 4s ', C4div5) 

To omit from class 0s  (218.0, 0)
To omit from class 1s  (443.0, 0)
To omit from class 2s  (324.0, 3)
To omit from class 3s  (463.0, 4)
To omit from class 4s  (257.0, 2)


In [30]:
train0

[text      The characters are so generic and the plot so ...
 labels                                                    0
 Name: 1769, dtype: object,
 text      ... plays like a badly edited , 91-minute trai...
 labels                                                    0
 Name: 6037, dtype: object,
 text      How inept is Serving Sara ?
 labels                              0
 Name: 5551, dtype: object,
 text      With a completely predictable plot , you 'll s...
 labels                                                    0
 Name: 5763, dtype: object,
 text      Unfortunately , it 's also not very good .
 labels                                             0
 Name: 5446, dtype: object,
 text      The movie has a script ( by Paul Pender ) made...
 labels                                                    0
 Name: 104, dtype: object,
 text      Most of the problems with the film do n't deri...
 labels                                                    0
 Name: 2660, dtype: object,
 text    

The order within each set is already randomised.
We will now split each of the sets train0 to train4 into 5 roughly equal parts.
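
The per-class split can be sketched as a single helper. This is an illustrative rewrite, not the notebook's code: `split_into_folds` is a hypothetical name, and it follows the same rule as the cells below, where each fold gets `len(rows) // 5` items and any leftover rows are added to the first fold.

```python
def split_into_folds(rows, k=5):
    """Split a list into k folds of equal size; leftovers go to the first fold."""
    fold_size = len(rows) // k
    folds = [rows[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    # Rows beyond k * fold_size do not fit evenly, so append them to fold 1.
    folds[0] = folds[0] + rows[k * fold_size:]
    return folds

# Class 2 has 1623 rows in the counts above: 5 folds of 324 with 3 left over.
folds = split_into_folds(list(range(1623)))
print([len(f) for f in folds])  # [327, 324, 324, 324, 324]
```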

In [31]:
#train[class][fold]
train01=[]
train02=[]
train03=[]
train04=[]
train05=[] 
   

for row in range(len(train0)):
    
    if row<C0div5[0]: train01.append(train0[row])
        
    if row>=C0div5[0] and row<(C0div5[0]*2): train02.append(train0[row])
        
    if row>=(C0div5[0]*2) and row<(C0div5[0]*3): train03.append(train0[row])
    
    if row>=(C0div5[0]*3) and row<(C0div5[0]*4): train04.append(train0[row])
        
    if row>=(C0div5[0]*4) and row<(C0div5[0]*5): train05.append(train0[row])
        
    if row>=(C0div5[0]*5):
        train01.append(train0[row])
         
     

     



In [32]:
train01

[text      The characters are so generic and the plot so ...
 labels                                                    0
 Name: 1769, dtype: object,
 text      ... plays like a badly edited , 91-minute trai...
 labels                                                    0
 Name: 6037, dtype: object,
 text      How inept is Serving Sara ?
 labels                              0
 Name: 5551, dtype: object,
 text      With a completely predictable plot , you 'll s...
 labels                                                    0
 Name: 5763, dtype: object,
 text      Unfortunately , it 's also not very good .
 labels                                             0
 Name: 5446, dtype: object,
 text      The movie has a script ( by Paul Pender ) made...
 labels                                                    0
 Name: 104, dtype: object,
 text      Most of the problems with the film do n't deri...
 labels                                                    0
 Name: 2660, dtype: object,
 text    

In [33]:
train02

[text      The film has a nearly terminal case of the cut...
 labels                                                    0
 Name: 4259, dtype: object,
 text      The most offensive thing about the movie is th...
 labels                                                    0
 Name: 1608, dtype: object,
 text      This is the case of a pregnant premise being w...
 labels                                                    0
 Name: 2485, dtype: object,
 text      There 's already been too many of these films ...
 labels                                                    0
 Name: 3775, dtype: object,
 text      A half-assed film .
 labels                      0
 Name: 1488, dtype: object,
 text      It 's mired in a shabby script that piles laye...
 labels                                                    0
 Name: 2910, dtype: object,
 text      director Hoffman , his writer and Kline 's age...
 labels                                                    0
 Name: 644, dtype: object,
 text      

In [34]:
        
train11=[]
train12=[]
train13=[]
train14=[]
train15=[]
train1remaining=[]

train21=[]
train22=[]
train23=[]
train24=[]
train25=[]
train2remaining=[]

train31=[]
train32=[]
train33=[]
train34=[]
train35=[]
train3remaining=[]

train41=[]
train42=[]
train43=[]
train44=[]
train45=[]
train4remaining=[]

 



In [35]:

for row in range(len(train1)):
    if row<C1div5[0]: train11.append(train1[row])
        
    if row>=C1div5[0] and row<(C1div5[0]*2): train12.append(train1[row])
        
    if row>=(C1div5[0]*2) and row<(C1div5[0]*3): train13.append(train1[row])
    
    if row>=(C1div5[0]*3) and row<(C1div5[0]*4): train14.append(train1[row])
        
    if row>=(C1div5[0]*4) and row<(C1div5[0]*5): train15.append(train1[row])
        
    if row>=(C1div5[0]*5):
        train11.append(train1[row])
         



In [36]:
for row in range(len(train2)):
    if row<C2div5[0]: train21.append(train2[row])
        
    if row>=C2div5[0] and row<(C2div5[0]*2): train22.append(train2[row])
        
    if row>=(C2div5[0]*2) and row<(C2div5[0]*3): train23.append(train2[row])
    
    if row>=(C2div5[0]*3) and row<(C2div5[0]*4): train24.append(train2[row])
        
    if row>=(C2div5[0]*4) and row<(C2div5[0]*5): train25.append(train2[row])
        
    if row>=(C2div5[0]*5):
        train21.append(train2[row])
         


In [37]:

for row in range(len(train3)):
    if row<C3div5[0]: train31.append(train3[row])
        
    if row>=C3div5[0] and row<(C3div5[0]*2): train32.append(train3[row])
        
    if row>=(C3div5[0]*2) and row<(C3div5[0]*3): train33.append(train3[row])
    
    if row>=(C3div5[0]*3) and row<(C3div5[0]*4): train34.append(train3[row])
        
    if row>=(C3div5[0]*4) and row<(C3div5[0]*5): train35.append(train3[row])
        
    if row>=(C3div5[0]*5):
        train31.append(train3[row])
         

In [38]:
for row in range(len(train4)):
    if row<C4div5[0]: train41.append(train4[row])
        
    if row>=C4div5[0] and row<(C4div5[0]*2): train42.append(train4[row])
        
    if row>=(C4div5[0]*2) and row<(C4div5[0]*3): train43.append(train4[row])
    
    if row>=(C4div5[0]*3) and row<(C4div5[0]*4): train44.append(train4[row])
        
    if row>=(C4div5[0]*4) and row<(C4div5[0]*5): train45.append(train4[row])
        
    if row>=(C4div5[0]*5):
        train41.append(train4[row])
         

In [39]:
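# Naming: pants*, fake*, Mfake*, half* and Mreal* hold classes 0, 1, 2, 3 and 4; the digit is the fold number
pants1=pd.DataFrame(train01, columns=['text','labels'])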
pants2=pd.DataFrame(train02, columns=['text','labels'])
pants3=pd.DataFrame(train03, columns=['text','labels'])
pants4=pd.DataFrame(train04, columns=['text','labels'])
pants5=pd.DataFrame(train05, columns=['text','labels'])


fake1=pd.DataFrame(train11, columns=['text','labels'])
fake2=pd.DataFrame(train12, columns=['text','labels'])
fake3=pd.DataFrame(train13, columns=['text','labels'])
fake4=pd.DataFrame(train14, columns=['text','labels'])
fake5=pd.DataFrame(train15, columns=['text','labels'])


Mfake1=pd.DataFrame(train21, columns=['text','labels'])
Mfake2=pd.DataFrame(train22, columns=['text','labels'])
Mfake3=pd.DataFrame(train23, columns=['text','labels'])
Mfake4=pd.DataFrame(train24, columns=['text','labels'])
Mfake5=pd.DataFrame(train25, columns=['text','labels'])


half1=pd.DataFrame(train31, columns=['text','labels'])
half2=pd.DataFrame(train32, columns=['text','labels'])
half3=pd.DataFrame(train33, columns=['text','labels'])
half4=pd.DataFrame(train34, columns=['text','labels'])
half5=pd.DataFrame(train35, columns=['text','labels'])


Mreal1=pd.DataFrame(train41, columns=['text','labels'])
Mreal2=pd.DataFrame(train42, columns=['text','labels'])
Mreal3=pd.DataFrame(train43, columns=['text','labels'])
Mreal4=pd.DataFrame(train44, columns=['text','labels'])
Mreal5=pd.DataFrame(train45, columns=['text','labels'])

 

frames1 = [pants2, pants3, pants4, pants5, fake2, fake3, fake4, fake5, Mfake2, Mfake3, Mfake4, Mfake5, half2, half3, half4, half5, Mreal2, Mreal3, Mreal4, Mreal5  ]

frames2 = [ pants1, pants3, pants4, pants5,fake1,  fake3, fake4, fake5,Mfake1, Mfake3, Mfake4, Mfake5,half1, half3, half4, half5,Mreal1, Mreal3, Mreal4, Mreal5 ]


frames3 = [pants1, pants2,  pants4, pants5,fake1, fake2, fake4, fake5,Mfake1, Mfake2, Mfake4, Mfake5,half1, half2,  half4, half5,Mreal1, Mreal2,  Mreal4, Mreal5 ]


frames4 = [ pants1, pants2, pants3, pants5,fake1, fake2, fake3, fake5,Mfake1, Mfake2, Mfake3, Mfake5,half1, half2, half3, half5,Mreal1, Mreal2, Mreal3,  Mreal5]

frames5 = [pants1, pants2, pants3, pants4, fake1, fake2, fake3, fake4,Mfake1, Mfake2, Mfake3, Mfake4, half1, half2, half3, half4,Mreal1, Mreal2, Mreal3, Mreal4]




train_fold1 = pd.concat(frames1)
train_fold2 = pd.concat(frames2)
train_fold3 = pd.concat(frames3)
train_fold4 = pd.concat(frames4)
train_fold5 = pd.concat(frames5)

#we set the omitted fold as the validation set

frames1=[pants1,fake1,Mfake1,half1,Mreal1 ]
valid1 = pd.concat(frames1)

frames2=[pants2,fake2,Mfake2,half2,Mreal2 ]
valid2 = pd.concat(frames2)

frames3=[pants3,fake3,Mfake3,half3,Mreal3 ]
valid3 = pd.concat(frames3)

frames4=[pants4,fake4,Mfake4,half4,Mreal4]
valid4 = pd.concat(frames4)

frames5=[pants5,fake5,Mfake5,half5,Mreal5 ]
valid5 = pd.concat(frames5)
 

In [40]:
def randomiseSet(df):
    # Shuffle the rows of df by sorting on a temporary column of random numbers.
    # Order should not matter, but a shuffled set is closer to realistic input.
 
    randNum=[]

    for row in range(len(df)):
        randNum.append(random.random())

    df['RandNum']=randNum
    df=df.sort_values(by=['RandNum'] )
    df=df.drop(['RandNum'],axis=1)
    
    return df

In [41]:
train_fold1=randomiseSet(train_fold1)
train_fold2=randomiseSet(train_fold2)
train_fold3=randomiseSet(train_fold3)
train_fold4=randomiseSet(train_fold4)
train_fold5=randomiseSet(train_fold5)
valid1=randomiseSet(valid1)
valid2=randomiseSet(valid2)
valid3=randomiseSet(valid3)
valid4=randomiseSet(valid4)
valid5=randomiseSet(valid5)

In [44]:
train_fold1.to_excel('./folds/train_fold1.xls',index=False)
train_fold2.to_excel('./folds/train_fold2.xls',index=False)
train_fold3.to_excel('./folds/train_fold3.xls',index=False)
train_fold4.to_excel('./folds/train_fold4.xls',index=False)
train_fold5.to_excel('./folds/train_fold5.xls',index=False)

valid1.to_excel('./folds/valid1.xls',index=False)
valid2.to_excel('./folds/valid2.xls',index=False)
valid3.to_excel('./folds/valid3.xls',index=False)
valid4.to_excel('./folds/valid4.xls',index=False)
valid5.to_excel('./folds/valid5.xls',index=False)

# We can now run the tests

## Fold1 training and capturing predictions

In [45]:
fold_number='1'

train=pd.read_excel('./folds/train_fold'+fold_number+'.xls')
Eval =pd.read_excel('./folds/valid'+fold_number+'.xls') #evaluation set


In [52]:
#Set the model being used here
model_class='bert'  # bert or roberta or albert
model_version='bert-base-cased' #bert-base-cased, roberta-base, roberta-large, albert-base-v2 OR albert-large-v2
labels_count=5  # the number of classification classes


output_folder='./folds/fold'+fold_number+'/'+model_class+'/'+model_version+"/"
cache_directory= "./folds/fold"+fold_number+'/'+model_class+"/"+model_version+"/cache/"


print('model variables were set up: ')

 
save_every_steps=1285
# With a training batch size of 8, any number above 1284 saves the model only at every epoch.
# Note that train_batch_size below is 64, so an epoch has far fewer steps and 1285 still exceeds them.
# Saving the model very often mid-training consumes disk space fast

train_args={
    "output_dir":output_folder,
    "cache_dir":cache_directory,
    'reprocess_input_data': True,
    'overwrite_output_dir': True,
    'num_train_epochs': 2,
    "save_steps": save_every_steps, 
    "learning_rate": 2e-5,
    "train_batch_size": 64,
    "eval_batch_size": 16,
    "evaluate_during_training_steps": 312,
    "max_seq_length": 100,
    "n_gpu": 1,
}

# Create a ClassificationModel
model = ClassificationModel(model_class, model_version, num_labels=labels_count, args=train_args) 

# loading a previously saved ClassificationModel model based on this particular Transformer Class and model_name



model variables were set up: 
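
The relationship between batch size, steps per epoch and `save_steps` is simple arithmetic. A small sketch, assuming one optimizer step per batch (no gradient accumulation) and the 6,820-row fold-1 training set; `steps_per_epoch` is a helper name introduced here for illustration.

```python
import math

def steps_per_epoch(num_examples, batch_size):
    # The last batch may be partial, so round up.
    return math.ceil(num_examples / batch_size)

train_rows = 6820          # fold-1 training set size
per_epoch = steps_per_epoch(train_rows, 64)
total_steps = per_epoch * 2  # num_train_epochs is 2

print(per_epoch)    # 107
print(total_steps)  # 214
```

Under these assumptions the run has 214 steps in total, so a `save_steps` value of 1285 is never reached.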


In [53]:
# loading the checkpoint that gave the best result
'''
CheckPoint='checkpoint-254-epoch-2'   


preSavedCheckpoint=output_folder+CheckPoint

print('Loading model, please wait...')
model = ClassificationModel( model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args) 
print('model in use is :', preSavedCheckpoint )
''' 

"\nCheckPoint='checkpoint-254-epoch-2'   \n\n\npreSavedCheckpoint=output_folder+CheckPoint\n\nprint('Loading model, please wait...')\nmodel = ClassificationModel( model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args) \nprint('model in use is :', preSavedCheckpoint )\n"

In [54]:
# Train the model
current_time = datetime.now()
model.train_model(train)
print("Training time taken: ", datetime.now() - current_time, ' at:',datetime.now())

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6820.0), HTML(value='')))


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


HBox(children=(FloatProgress(value=0.0, description='Epoch', max=2.0, style=ProgressStyle(description_width='i…

HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.240288


HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 0.927448

Training of bert model complete. Saved to ./folds/fold1/bert/bert-base-cased/.
Training time taken:  0:04:33.200231  at: 2020-04-27 13:09:17.946907


In [55]:
TrainResult, TrainModel_outputs, wrong_predictions = model.eval_model(train, acc=sklearn.metrics.accuracy_score)
 
EvalResult, EvalModel_outputs, wrong_predictions = model.eval_model(Eval, acc=sklearn.metrics.accuracy_score)

TestResult, TestModel_outputs, wrong_predictions = model.eval_model(test, acc=sklearn.metrics.accuracy_score)

print('Training Result:', TrainResult['acc'])
#print('Model Out:', TrainModel_outputs)

print('Eval Result:', EvalResult['acc'])
#print('Model Out:', EvalModel_outputs)

print('Test Set Result:', TestResult['acc'])
#print('Model Out:', TestModel_outputs)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6820.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=427.0), HTML(value='')))


{'mcc': 0.47169901372247147, 'acc': 0.5825513196480938, 'eval_loss': 0.9814901188609192}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=1714.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=108.0), HTML(value='')))


{'mcc': 0.37498423596792646, 'acc': 0.5134189031505251, 'eval_loss': 1.1189885597538065}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=2210.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=139.0), HTML(value='')))


{'mcc': 0.36198617975242603, 'acc': 0.5063348416289593, 'eval_loss': 1.1242049113452006}
Training Result: 0.5825513196480938
Eval Result: 0.5134189031505251
Test Set Result: 0.5063348416289593
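
The next cell turns each row of `TestModel_outputs` into a predicted class by picking the position of the largest score. A compact sketch of the same idea with `numpy.argmax`, using four score rows and their targets copied from that cell's printed output:

```python
import numpy as np

# Four rows of raw model scores and their true labels, copied from the test output.
outputs = np.array([
    [ 0.68603516,  1.6220703,   0.3876953,  -1.1123047,  -1.8847656],
    [ 0.78808594,  1.6962891,   0.41723633, -1.2841797,  -2.0273438],
    [-1.59375,    -1.4960938,  -0.3239746,   1.7753906,   2.7011719],
    [-1.7421875,  -0.96875,     0.38867188,  2.4609375,   1.3701172],
])
targets = np.array([1, 0, 4, 3])

# argmax over axis 1 gives the index of the largest score in each row;
# on ties it returns the first index, like the chain of strict '<' comparisons.
pred = outputs.argmax(axis=1)
accuracy = (pred == targets).mean()

print(pred)      # [1 1 4 3]
print(accuracy)  # 0.75
```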


In [69]:
Pred=[]
Targets=[]

countCorrect=0

for row in range(TestModel_outputs.shape[0]):
    outputs=TestModel_outputs[row]
    #print(test.iloc[row,0])
    print(outputs, end=' ')
    
    result=0
    if outputs[0]<outputs[1]:result=1
    if outputs[result]<outputs[2]:result=2
    if outputs[result]<outputs[3]:result=3
    if outputs[result]<outputs[4]:result=4
    
    Pred.append(result)
    Targets.append(test.iloc[row,1])
    print(result, ' ',test.iloc[row,1], end=' ')
    if result==test.iloc[row,1]:
        countCorrect+=1
        print('Match',countCorrect)
    print('')

print(countCorrect)
#Pred

[ 0.68603516  1.6220703   0.3876953  -1.1123047  -1.8847656 ] 1   1 Match 1

[ 0.78808594  1.6962891   0.41723633 -1.2841797  -2.0273438 ] 1   0 
[ 0.7451172   1.4824219   0.49316406 -1.1816406  -2.0976562 ] 1   2 
[-1.59375   -1.4960938 -0.3239746  1.7753906  2.7011719] 4   4 Match 2

[ 0.04043579  1.2314453   0.6425781  -0.47827148 -1.5195312 ] 1   0 
[-1.7421875  -0.96875     0.38867188  2.4609375   1.3701172 ] 3   3 Match 3

[ 1.2675781  1.3359375  0.2076416 -1.3466797 -1.9296875] 1   0 
[-1.7294922  -1.4238281  -0.14343262  2.2089844   2.3632812 ] 4   4 Match 4

[ 1.3066406   1.4951172   0.09570312 -1.4677734  -1.9287109 ] 1   0 
[ 1.4052734   1.3125      0.06286621 -1.4199219  -1.8837891 ] 0   0 Match 5

[ 0.6533203   1.6855469   0.55810547 -1.0820312  -1.9951172 ] 1   2 
[-1.8222656 -1.1132812  0.5180664  2.5175781  1.4101562] 3   4 
[-0.734375    0.6791992   0.96191406  0.48510742 -0.9355469 ] 2   1 
[ 1.2119141   1.3935547   0.15344238 -1.4003906  -1.9980469 ] 1   0 
[-1.58300

[-1.7099609 -1.2070312  0.4387207  2.3222656  1.7207031] 3   4 
[-1.6787109  -1.5048828  -0.15661621  2.0839844   2.4902344 ] 4   4 Match 67

[-0.1697998  1.0166016  1.0390625 -0.2446289 -1.6044922] 2   2 Match 68

[ 0.8808594   1.2089844   0.44995117 -1.0488281  -1.9179688 ] 1   1 Match 69

[-1.3007812  -0.18066406  0.6479492   1.3349609   0.3017578 ] 3   2 
[-1.6992188 -1.4541016 -0.1661377  1.9316406  2.515625 ] 4   4 Match 70

[-1.8427734  -1.2988281   0.09844971  2.1308594   2.0644531 ] 3   3 Match 71

[-1.8056641  -1.4462891   0.00797272  2.2324219   2.3652344 ] 4   3 
[-0.2529297   0.94433594  0.87597656  0.00984192 -1.4482422 ] 1   2 
[-1.5244141  -1.3632812  -0.32983398  1.5019531   2.6191406 ] 4   4 Match 72

[ 1.0478516  1.4140625  0.5058594 -1.1806641 -1.9785156] 1   1 Match 73

[-1.3027344  -0.3798828   0.70947266  1.6787109   0.4855957 ] 3   2 
[ 1.1523438   1.4384766   0.28759766 -1.3125     -2.0527344 ] 1   1 Match 74

[-0.23779297  0.88671875  1.0068359   0.05831909 -1

[-0.15283203  0.92333984  0.88134766  0.02966309 -1.3251953 ] 1   1 Match 132

[-1.0644531   0.27124023  0.89208984  1.2285156  -0.5629883 ] 3   1 
[ 0.98291016  1.3730469   0.35009766 -1.1035156  -1.8388672 ] 1   1 Match 133

[-1.0546875   0.06365967  0.91064453  1.328125   -0.28076172] 3   4 
[ 1.0839844   1.5449219   0.13000488 -1.3222656  -1.921875  ] 1   1 Match 134

[-1.1035156   0.1048584   0.90478516  1.2587891  -0.43579102] 3   2 
[ 1.0224609   1.359375    0.31323242 -1.2705078  -1.9306641 ] 1   2 
[ 0.99365234  1.6601562   0.3383789  -1.3369141  -2.0097656 ] 1   1 Match 135

[ 0.9526367   1.6298828   0.16540527 -1.4296875  -2.0566406 ] 1   3 
[-0.95410156  0.5888672   1.0214844   1.15625    -0.7915039 ] 3   3 Match 136

[-0.40527344  0.7246094   0.9667969   0.4753418  -1.1279297 ] 2   3 
[ 0.45874023  1.5732422   0.5214844  -1.0058594  -2.0527344 ] 1   1 Match 137

[-1.3642578  -0.31811523  0.7661133   1.7412109   0.3359375 ] 3   3 Match 138

[ 1.2177734   1.5771484   0.25439

[ 1.3867188   1.3818359   0.00461578 -1.5439453  -1.8964844 ] 0   2 
[-1.5898438 -1.0322266  0.3947754  2.3144531  1.2880859] 3   3 Match 211

[ 0.07208252  1.4521484   0.8701172  -0.42504883 -1.8740234 ] 1   1 Match 212

[-1.7363281  -1.3691406   0.03079224  2.25        2.0820312 ] 3   4 
[-0.1361084   0.99902344  1.0341797   0.03085327 -1.4580078 ] 2   2 Match 213

[-1.6386719  -1.3984375  -0.29663086  1.8398438   2.4628906 ] 4   2 
[ 0.97753906  1.4414062   0.3581543  -1.2001953  -1.9550781 ] 1   1 Match 214

[ 0.15759277  1.3613281   0.7241211  -0.33789062 -1.6132812 ] 1   2 
[-0.7631836   0.44995117  1.0019531   0.8149414  -0.72314453] 2   4 
[-1.6142578  -1.6259766  -0.16662598  1.9277344   2.6738281 ] 4   4 Match 215

[ 0.5439453  1.6269531  0.5913086 -1.0771484 -2.0820312] 1   2 
[ 1.0195312   1.6240234   0.33251953 -1.265625   -2.        ] 1   1 Match 216

[-1.6503906  -1.5517578  -0.34155273  1.8291016   2.6992188 ] 4   3 
[-1.5664062 -1.5351562 -0.4182129  1.640625   2.85546

[-0.02345276  1.1953125   1.0410156  -0.1899414  -1.6660156 ] 1   1 Match 284

[ 0.42944336  1.3222656   0.80029297 -0.8017578  -1.8330078 ] 1   1 Match 285

[-0.39331055  0.9824219   1.1601562   0.09509277 -1.3935547 ] 2   3 
[-1.7734375  -1.3916016  -0.02366638  2.265625    2.2734375 ] 4   4 Match 286

[ 1.3076172   1.4648438   0.14880371 -1.4160156  -1.8867188 ] 1   1 Match 287

[-1.6220703  -1.4482422  -0.37524414  1.6943359   2.6679688 ] 4   4 Match 288

[-1.6582031  -1.5400391  -0.12988281  1.9580078   2.3515625 ] 4   4 Match 289

[ 0.77490234  1.7294922   0.50390625 -1.1669922  -1.9863281 ] 1   1 Match 290

[-1.7705078  -1.2373047   0.17810059  2.4453125   1.9716797 ] 3   3 Match 291

[ 0.7216797   1.0224609   0.40893555 -0.7861328  -1.5136719 ] 1   1 Match 292

[-1.1630859   0.02227783  0.97216797  1.4101562  -0.3227539 ] 3   2 
[-0.7861328  -0.62646484  0.30615234  0.86816406  1.2685547 ] 4   1 
[-1.3046875  -1.0917969  -0.17028809  1.3613281   2.1484375 ] 4   3 
[ 1.2646484  

[-1.6181641 -1.5146484 -0.2097168  1.9716797  2.5644531] 4   4 Match 355

[ 0.8208008   1.6513672   0.36010742 -1.1582031  -2.0117188 ] 1   2 
[ 0.7973633   1.6464844   0.43359375 -1.1298828  -1.9648438 ] 1   1 Match 356

[-1.8085938  -1.3359375  -0.12890625  1.9736328   2.3417969 ] 4   4 Match 357

[ 0.66259766  1.2167969   0.5053711  -0.9433594  -1.8789062 ] 1   1 Match 358

[ 0.5751953  1.6552734  0.6411133 -1.0634766 -2.0058594] 1   1 Match 359

[-1.7509766  -1.3886719   0.02404785  1.984375    2.2148438 ] 4   3 
[ 0.7763672   1.1914062   0.57177734 -0.76660156 -1.6689453 ] 1   2 
[ 0.828125    1.4765625   0.43017578 -1.1835938  -2.0039062 ] 1   2 
[-1.4052734 -0.4951172  0.7626953  2.0507812  0.3918457] 3   3 Match 360

[ 0.33911133  1.2744141   0.79833984 -0.76220703 -1.7910156 ] 1   3 
[-1.3837891  -1.4482422  -0.46142578  1.4033203   2.7363281 ] 4   4 Match 361

[ 1.0253906  1.4677734  0.3474121 -1.2363281 -1.8916016] 1   0 
[-1.5400391  -0.8051758   0.43017578  2.3789062   1.0

[ 1.0449219  1.3251953  0.4560547 -1.1376953 -1.9462891] 1   0 
[ 0.38549805  1.3564453   0.68115234 -0.7753906  -1.7929688 ] 1   1 Match 420

[-0.6972656   0.71435547  1.1845703   0.46655273 -1.2050781 ] 2   1 
[-1.3798828  -0.6123047   0.48828125  1.9550781   0.62597656] 3   3 Match 421

[-1.6435547  -1.4921875  -0.04364014  1.9873047   2.4257812 ] 4   3 
[-0.9794922  -0.00585175  0.7392578   1.1416016   0.00173187] 3   1 
[-1.6816406 -1.2539062 -0.0041275  2.1503906  1.8740234] 3   3 Match 422

[-1.4873047  -0.46826172  0.7973633   2.1054688   0.28051758] 3   3 Match 423

[-0.6791992   0.4169922   0.8774414   0.48901367 -0.71728516] 2   1 
[ 1.3330078   1.2861328   0.06481934 -1.4023438  -1.8984375 ] 0   0 Match 424

[-0.36621094  0.94091797  0.8520508   0.17114258 -1.2519531 ] 1   3 
[-0.4794922   0.7836914   1.125       0.46557617 -1.125     ] 2   0 
[-1.6855469 -1.0166016  0.5883789  2.3632812  1.3154297] 3   4 
[-1.5771484 -1.3769531 -0.3947754  1.6318359  2.6328125] 4   4 Match

[-1.6416016  -1.5234375  -0.34277344  1.8183594   2.6914062 ] 4   4 Match 491

[ 0.546875   1.4287109  0.6621094 -0.9277344 -2.0273438] 1   0 
[-1.1826172   0.07781982  0.8613281   1.2792969  -0.26342773] 3   3 Match 492

[ 0.9892578   1.5556641   0.45532227 -1.2060547  -1.9970703 ] 1   2 
[-1.7285156  -1.4150391  -0.03491211  2.109375    2.3378906 ] 4   4 Match 493

[ 0.8125     1.3232422  0.5541992 -1.046875  -1.9404297] 1   2 
[-0.63427734  0.3149414   0.69921875  0.59375    -0.52685547] 2   1 
[-1.7080078  -1.3173828   0.35888672  2.3847656   1.8388672 ] 3   3 Match 494

[-1.7119141 -1.390625  -0.3125     1.7822266  2.4882812] 4   4 Match 495

[-1.2626953  -1.3740234  -0.29223633  1.2714844   2.4394531 ] 4   3 
[ 0.3671875   1.5849609   0.69384766 -0.7871094  -2.0214844 ] 1   1 Match 496

[ 0.9584961   1.6308594   0.47070312 -1.2441406  -2.0488281 ] 1   0 
[ 1.2226562   1.0878906   0.04443359 -1.2714844  -1.8017578 ] 0   1 
[-1.8183594  -1.0009766   0.23669434  2.2890625   1.660156

[-1.0771484   0.15686035  0.89746094  1.3574219  -0.39941406] 3   3 Match 557

[-1.7226562 -0.9033203  0.5698242  2.4082031  1.2021484] 3   2 
[-1.7587891  -1.1425781   0.21459961  2.3886719   1.7460938 ] 3   4 
[-1.7695312  -1.2695312   0.09387207  2.3046875   1.9296875 ] 3   3 Match 558

[-1.6044922  -1.5019531  -0.36279297  1.7695312   2.6484375 ] 4   4 Match 559

[-1.4892578  -1.3671875  -0.45263672  1.4873047   2.7226562 ] 4   4 Match 560

[-1.5537109  -0.7026367   0.43969727  2.2675781   0.9111328 ] 3   4 
[-1.7841797 -1.0019531  0.5019531  2.3632812  1.2509766] 3   4 
[-1.4560547 -0.7607422  0.3046875  2.125      1.0205078] 3   4 
[-1.7832031  -1.1611328   0.23046875  2.4648438   1.5849609 ] 3   3 Match 561

[-0.48901367  0.5883789   0.8852539   0.6381836  -0.86865234] 2   2 Match 562

[ 1.0537109   1.3408203   0.29760742 -1.2441406  -1.8320312 ] 1   1 Match 563

[ 0.5488281  1.6533203  0.6904297 -0.9428711 -2.0488281] 1   1 Match 564

[-1.2958984  -0.27075195  0.8510742   1.572

[-1.7548828  -1.2304688   0.37939453  2.3066406   1.5224609 ] 3   2 
[-1.5820312  -1.4433594  -0.31323242  1.6826172   2.703125  ] 4   4 Match 631

[ 0.44799805  1.2294922   0.6660156  -0.65185547 -1.7226562 ] 1   2 
[ 1.2910156   1.4863281   0.11834717 -1.4951172  -1.890625  ] 1   1 Match 632

[ 1.3164062   1.4287109   0.11547852 -1.2978516  -1.8408203 ] 1   1 Match 633

[ 0.00971222  1.2392578   0.7807617  -0.34350586 -1.7568359 ] 1   1 Match 634

[-1.6953125  -1.2158203   0.15063477  2.3300781   1.7666016 ] 3   3 Match 635

[-0.60498047  0.6904297   0.90722656  0.39697266 -1.1015625 ] 2   1 
[ 0.20141602  1.5195312   0.89990234 -0.6411133  -1.8759766 ] 1   1 Match 636

[-8.6545944e-04  1.0683594e+00  8.9160156e-01 -1.4538574e-01
 -1.3496094e+00] 1   1 Match 637

[-1.1005859  -0.20446777  0.7885742   1.4482422   0.2927246 ] 3   2 
[-1.5332031  -1.3857422  -0.47558594  1.4707031   2.6953125 ] 4   4 Match 638

[ 0.82470703  1.2265625   0.5761719  -1.0742188  -1.8818359 ] 1   0 
[ 1.320

[ 0.91308594  1.6855469   0.3581543  -1.2607422  -2.1015625 ] 1   1 Match 707

[-1.6542969  -1.1269531   0.12255859  2.28125     1.7412109 ] 3   3 Match 708

[-1.7451172  -1.4794922  -0.15979004  2.1699219   2.5546875 ] 4   3 
[-1.6699219  -1.2490234   0.18591309  2.2285156   1.8662109 ] 3   3 Match 709

[-1.7666016  -1.2597656   0.19482422  2.4140625   1.8828125 ] 3   3 Match 710

[-0.22058105  0.9926758   1.0429688  -0.01557922 -1.3847656 ] 2   0 
[ 0.1665039   0.93847656  0.7373047  -0.2734375  -1.5195312 ] 1   1 Match 711

[-1.6201172  -1.3603516   0.02183533  2.1171875   2.0390625 ] 3   3 Match 712

[ 0.04779053  1.4541016   0.9370117  -0.44458008 -1.8486328 ] 1   1 Match 713

[-1.609375   -1.5419922  -0.10614014  2.0800781   2.5332031 ] 4   3 
[-1.4697266  -0.39819336  0.86572266  2.078125    0.10675049] 3   3 Match 714

[ 1.1855469  1.2314453  0.1227417 -1.1845703 -1.765625 ] 1   0 
[-0.1973877   0.76416016  0.86572266 -0.04846191 -1.4121094 ] 2   2 Match 715

[-1.7822266  -1.46

[-1.7626953  -1.3896484  -0.12414551  2.2558594   2.2148438 ] 3   4 
[-1.6943359  -1.4345703  -0.27172852  1.7919922   2.5175781 ] 4   4 Match 782

[ 0.80078125  1.6513672   0.5180664  -1.1464844  -1.9990234 ] 1   0 
[-1.6630859  -1.3798828  -0.06463623  1.9707031   2.1972656 ] 4   4 Match 783

[ 1.3339844   1.3085938   0.24865723 -1.3359375  -1.8867188 ] 0   1 
[-1.1679688  -0.2388916   0.8540039   1.5224609   0.13317871] 3   1 
[-0.6323242   0.55029297  0.89746094  0.49389648 -1.0654297 ] 2   3 
[ 1.4199219   1.5244141   0.11572266 -1.5117188  -1.9960938 ] 1   1 Match 784

[-1.4589844  -1.3964844  -0.46972656  1.4462891   2.6777344 ] 4   4 Match 785

[-1.7265625  -1.1962891   0.14990234  2.3886719   1.6914062 ] 3   3 Match 786

[-1.15625    -0.11743164  0.8144531   1.4560547  -0.03933716] 3   3 Match 787

[ 0.5        1.4394531  0.6269531 -0.9404297 -2.0703125] 1   1 Match 788

[ 0.09069824  1.5322266   0.7944336  -0.60058594 -1.8916016 ] 1   2 
[-0.21130371  1.0625      1.0634766   

[ 0.77978516  1.6240234   0.54052734 -1.109375   -2.0605469 ] 1   0 
[-0.6098633   0.6533203   0.9873047   0.60058594 -1.0029297 ] 2   3 
[-1.7421875  -1.5107422  -0.17749023  2.0507812   2.5175781 ] 4   4 Match 852

[ 0.62158203  1.2851562   0.5673828  -0.7910156  -1.8642578 ] 1   1 Match 853

[-1.3935547  -0.31811523  0.84716797  1.8662109   0.02383423] 3   3 Match 854

[ 1.0263672  1.4472656  0.328125  -1.1455078 -1.9033203] 1   0 
[-0.8198242   0.49951172  0.93652344  0.71875    -0.83935547] 2   3 
[ 1.0341797   1.6083984   0.15136719 -1.2509766  -1.9335938 ] 1   1 Match 855

[ 0.91503906  1.5732422   0.5131836  -1.2753906  -2.        ] 1   1 Match 856

[ 1.0039062   1.5117188   0.28100586 -1.2558594  -1.9384766 ] 1   0 
[ 0.5205078   1.5185547   0.71240234 -0.9458008  -2.0878906 ] 1   3 
[ 1.2373047  1.4189453  0.2364502 -1.4824219 -2.03125  ] 1   1 Match 857

[ 0.0871582   1.3886719   0.8574219  -0.47827148 -1.7011719 ] 1   1 Match 858

[ 0.78759766  1.3945312   0.38427734 -1.062

[-1.7246094  -1.3310547  -0.07836914  1.9873047   2.2324219 ] 4   4 Match 917

[ 0.18908691  1.3886719   0.7397461  -0.43017578 -1.6787109 ] 1   1 Match 918

[-0.99316406  0.07159424  0.98291016  1.5234375  -0.25195312] 3   1 
[ 0.41430664  1.3076172   0.70214844 -0.6513672  -1.7578125 ] 1   2 
[-1.3535156  -0.24890137  0.9433594   1.6953125  -0.15197754] 3   4 
[ 0.05001831  1.3945312   0.80371094 -0.43579102 -1.8349609 ] 1   3 
[-0.29614258  0.93408203  1.0302734  -0.03128052 -1.46875   ] 2   2 Match 919

[ 0.5854492   1.1699219   0.58984375 -0.9003906  -1.8164062 ] 1   2 
[-1.6875     -0.69873047  0.5678711   2.2617188   0.87158203] 3   4 
[-1.71875    -1.3115234   0.04760742  2.2617188   2.0488281 ] 3   3 Match 920

[ 1.0488281  1.5722656  0.4177246 -1.3193359 -2.0664062] 1   2 
[ 1.1953125   1.4306641   0.17370605 -1.4130859  -2.0136719 ] 1   1 Match 921

[ 1.1796875   1.5195312   0.30395508 -1.3203125  -1.8955078 ] 1   1 Match 922

[ 1.2773438   1.4335938   0.06958008 -1.4228516 

[-1.6464844  -0.828125    0.63378906  2.4589844   0.9536133 ] 3   4 
[-1.7197266  -1.4306641  -0.16723633  2.1835938   2.3808594 ] 4   4 Match 987

[-1.6240234  -1.4140625  -0.16418457  1.6074219   2.5488281 ] 4   4 Match 988

[-0.5234375   0.79345703  1.1777344   0.36791992 -1.2255859 ] 2   1 
[ 0.57470703  1.1972656   0.6816406  -0.65185547 -1.8300781 ] 1   1 Match 989

[ 0.46533203  1.4609375   0.63134766 -0.81396484 -1.8740234 ] 1   1 Match 990

[-1.7324219  -1.4248047   0.00563812  2.2636719   2.21875   ] 3   4 
[ 0.9941406   1.6025391   0.37817383 -1.3369141  -2.0625    ] 1   1 Match 991

[ 0.7495117  1.5498047  0.4765625 -1.1386719 -1.9853516] 1   0 
[ 0.29858398  1.2460938   0.70996094 -0.46948242 -1.6357422 ] 1   1 Match 992

[-1.1972656  -0.00827026  0.91064453  1.3808594  -0.25146484] 3   3 Match 993

[ 0.21936035  1.3496094   0.6542969  -0.6411133  -1.8183594 ] 1   2 
[ 0.26708984  1.2900391   0.78125    -0.57421875 -1.7695312 ] 1   2 
[-1.4375     -1.3261719  -0.27172852  

[-1.5654297  -0.58984375  0.7011719   2.203125    0.5644531 ] 3   3 Match 1058

[ 1.0341797   1.7011719   0.29077148 -1.3994141  -2.0058594 ] 1   0 
[-0.4777832   0.77978516  1.0419922   0.5439453  -0.9511719 ] 2   0 
[ 1.28125     1.2646484   0.08190918 -1.2548828  -1.8154297 ] 0   1 
[-1.5380859 -1.3916016 -0.3544922  1.4912109  2.6582031] 4   4 Match 1059

[ 0.7368164   1.5136719   0.3696289  -0.98535156 -1.8232422 ] 1   1 Match 1060

[-0.5053711   0.7324219   0.9614258   0.18518066 -1.2490234 ] 2   2 Match 1061

[ 1.2255859   1.1640625   0.10552979 -1.3251953  -1.9091797 ] 0   0 Match 1062

[-0.45239258  1.0839844   1.1044922   0.22509766 -1.546875  ] 2   3 
[-0.4416504   0.88916016  0.92578125  0.51464844 -1.2050781 ] 2   4 
[ 0.99853516  1.5390625   0.23266602 -1.1328125  -1.8720703 ] 1   1 Match 1063

[-1.2519531  -0.4350586   0.73876953  1.7958984   0.4404297 ] 3   1 
[ 0.5234375   1.5849609   0.54589844 -0.7758789  -1.859375  ] 1   2 
[-1.6171875  -0.8886719   0.47729492  2.38

In [72]:
from sklearn import metrics
 
print(metrics.confusion_matrix(Targets,Pred))

[[ 36 212  15  16   0]
 [ 20 480  75  55   3]
 [  1 194  58 126  10]
 [  2  36  45 314 113]
 [  0   8  15 145 231]]
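
The matrix above can be read row by row: each row is a true class and each column a predicted class, following the `confusion_matrix(y_true, y_pred)` convention. As a minimal sketch (the matrix values are copied by hand from the output above; `cm`, `support`, `recall` and `accuracy` are names introduced only for this illustration), per-class recall and overall accuracy can be recovered directly from it:

```python
import numpy as np

# Fold 1 confusion matrix, copied from the cell output above
# (rows: true class, columns: predicted class)
cm = np.array([
    [36, 212, 15, 16, 0],
    [20, 480, 75, 55, 3],
    [1, 194, 58, 126, 10],
    [2, 36, 45, 314, 113],
    [0, 8, 15, 145, 231],
])

support = cm.sum(axis=1)          # number of test sentences per true class
recall = cm.diagonal() / support  # fraction of each true class predicted correctly
accuracy = cm.trace() / cm.sum()  # correct predictions over all predictions

print(support)
print(np.round(recall, 2))
print(round(float(accuracy), 2))
```

Read this way, the first row shows that 212 of the 279 Very Neg sentences were predicted as Negative, which is why the recall for that class is so low.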


In [74]:
target_names = ['Very Neg', 'Negative', 'Neutral','Positive','Very Pos']

print(metrics.classification_report(Targets, Pred,target_names =target_names))

              precision    recall  f1-score   support

    Very Neg       0.61      0.13      0.21       279
    Negative       0.52      0.76      0.61       633
     Neutral       0.28      0.15      0.19       389
    Positive       0.48      0.62      0.54       510
    Very Pos       0.65      0.58      0.61       399

    accuracy                           0.51      2210
   macro avg       0.51      0.45      0.43      2210
weighted avg       0.50      0.51      0.47      2210



In [75]:
Fold1_Predictions=pd.DataFrame(Pred, columns=['Pred1'])
Fold1_Predictions

Unnamed: 0,Pred1
0,1
1,1
2,1
3,4
4,1
...,...
2205,4
2206,1
2207,3
2208,4


In [76]:
Fold1_Predictions.to_excel(output_folder+'/Saves/fold1_Predictions.xls')

In [77]:
# Free the model and the cached outputs, then release cached GPU memory

del model
del TrainResult, TrainModel_outputs, EvalResult, EvalModel_outputs, TestResult, TestModel_outputs, wrong_predictions
torch.cuda.empty_cache()

## Fold 2: training & capturing predictions

In [78]:
fold_number='2'

train=pd.read_excel('./folds/train_fold'+fold_number+'.xls')
Eval=pd.read_excel('./folds/valid'+fold_number+'.xls') #evaluation set


In [79]:

output_folder='./folds/fold'+fold_number+'/'+model_class+'/'+model_version+"/"
cache_directory= "./folds/fold"+fold_number+'/'+model_class+"/"+model_version+"/cache/"


print('model variables were set up: ')

 
save_every_steps=1285
# The 1285 threshold assumed a training batch size of 8; train_batch_size below is 64,
# so an epoch has far fewer steps and this value still saves the model only at every epoch.
# Saving the model mid-training very often will consume disk space fast.

train_args={
    "output_dir":output_folder,
    "cache_dir":cache_directory,
    'reprocess_input_data': True,
    'overwrite_output_dir': True,
    'num_train_epochs': 2,
    "save_steps": save_every_steps, 
    "learning_rate": 2e-5,
    "train_batch_size": 64,
    "eval_batch_size": 16,
    "evaluate_during_training_steps": 312,
    "max_seq_length": 100,
    "n_gpu": 1,
}

# Create a ClassificationModel
model = ClassificationModel(model_class, model_version, num_labels=labels_count, args=train_args) 

model variables were set up: 


In [80]:
# loading the checkpoint that gave the best result
'''
CheckPoint='checkpoint-143-epoch-1'  #epoch 1


preSavedCheckpoint=output_folder+CheckPoint

print('Loading model, please wait...')
model = ClassificationModel( model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args) 
print('model in use is :', preSavedCheckpoint )
'''

"\nCheckPoint='checkpoint-143-epoch-1'  #epoch 1\n\n\npreSavedCheckpoint=output_folder+CheckPoint\n\nprint('Loading model, please wait...')\nmodel = ClassificationModel( model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args) \nprint('model in use is :', preSavedCheckpoint )\n"

In [81]:
# Train the model
current_time = datetime.now()
model.train_model(train)
print("Training time: ", datetime.now() - current_time)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6829.0), HTML(value='')))


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


HBox(children=(FloatProgress(value=0.0, description='Epoch', max=2.0, style=ProgressStyle(description_width='i…

HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.342487
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 1.340764


HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.104371

Training of bert model complete. Saved to ./folds/fold2/bert/bert-base-cased/.
Training time:  0:04:39.321206
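
The progress bars above report 6829 training rows and 107 iterations per epoch. A small sketch of the arithmetic, assuming one optimizer step per batch (no gradient accumulation, which is an assumption and not something the log states), shows how that figure relates to `train_batch_size` and `save_steps`:

```python
import math

train_rows = 6829         # rows reported by the feature-conversion progress bar
train_batch_size = 64     # value set in train_args
num_train_epochs = 2      # value set in train_args
save_every_steps = 1285   # value passed as "save_steps"

# Assumption: one optimizer step per batch (no gradient accumulation)
steps_per_epoch = math.ceil(train_rows / train_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print("steps per epoch:", steps_per_epoch)
print("total steps:", total_steps)
print("save_steps exceeds total steps:", save_every_steps > total_steps)
```

Under that assumption the whole run is shorter than `save_steps`, so a step-based checkpoint would not be triggered during training.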


In [82]:
TrainResult, TrainModel_outputs, wrong_predictions = model.eval_model(train, acc=sklearn.metrics.accuracy_score)
 
EvalResult, EvalModel_outputs, wrong_predictions = model.eval_model(Eval, acc=sklearn.metrics.accuracy_score)

TestResult, TestModel_outputs, wrong_predictions = model.eval_model(test, acc=sklearn.metrics.accuracy_score)

print('Training Result:', TrainResult['acc'])
#print('Model Out:', TrainModel_outputs)

print('Eval Result:', EvalResult['acc'])
#print('Model Out:', EvalModel_outputs)

print('Test Set Result:', TestResult['acc'])
#print('Model Out:', TestModel_outputs)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6829.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=427.0), HTML(value='')))


{'mcc': 0.45524357847040087, 'acc': 0.5681651779177039, 'eval_loss': 1.016062933471778}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=1705.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=107.0), HTML(value='')))


{'mcc': 0.3559989200245821, 'acc': 0.4967741935483871, 'eval_loss': 1.132854224922501}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=2210.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=139.0), HTML(value='')))


{'mcc': 0.3509900985807728, 'acc': 0.4972850678733032, 'eval_loss': 1.1453220664168433}
Training Result: 0.5681651779177039
Eval Result: 0.4967741935483871
Test Set Result: 0.4972850678733032


In [83]:
Pred=[]      # predicted class index for each test sentence
Targets=[]   # gold label for each test sentence

countCorrect=0

for row in range(TestModel_outputs.shape[0]):
    outputs=TestModel_outputs[row]   # raw scores (logits) for the 5 classes
    #print(test.iloc[row,0])
    print(outputs, end=' ')
    
    # manual argmax: keep the index of the largest score (ties go to the lower index)
    result=0
    if outputs[0]<outputs[1]:result=1
    if outputs[result]<outputs[2]:result=2
    if outputs[result]<outputs[3]:result=3
    if outputs[result]<outputs[4]:result=4
    
    Pred.append(result)
    Targets.append(test.iloc[row,1])
    print(result, ' ',test.iloc[row,1], end=' ')
    if result==test.iloc[row,1]:
        countCorrect+=1
        print('Match',countCorrect)   # running count of correct predictions
    print('')

print(countCorrect)

[ 0.5341797   1.0703125   0.01020813 -1.2958984  -2.3261719 ] 1   1 Match 1

[ 0.4387207   1.2148438   0.18432617 -1.1650391  -2.4492188 ] 1   0 
[ 0.37402344  1.2353516   0.28833008 -1.1337891  -2.4042969 ] 1   2 
[-2.1582031  -1.8671875  -0.95214844  1.7255859   2.5625    ] 4   4 Match 2

[-0.17980957  0.7714844   0.30371094 -0.9316406  -1.9716797 ] 1   0 
[-2.59375    -1.8427734  -0.18347168  2.0761719   1.2900391 ] 3   3 Match 3

[ 0.7416992   0.9765625  -0.02658081 -1.2832031  -2.3417969 ] 1   0 
[-2.5351562  -2.0664062  -0.62402344  2.0039062   2.3085938 ] 4   4 Match 4

[ 0.7182617  1.1064453  0.0635376 -1.3759766 -2.3125   ] 1   0 
[ 0.9423828   1.0136719  -0.09899902 -1.2802734  -2.3183594 ] 1   0 
[ 0.42895508  1.1972656   0.28735352 -1.1054688  -2.3476562 ] 1   2 
[-2.6171875 -1.4863281 -0.2590332  2.0058594  1.1816406] 3   4 
[-1.0332031   0.52197266  0.3552246  -0.11669922 -1.9658203 ] 1   1 Match 5

[ 0.6982422   1.1259766   0.08319092 -1.2695312  -2.3398438 ] 1   0 
[-1.

[-2.6054688 -1.9804688 -0.5957031  2.1230469  1.9609375] 3   3 Match 57

[-2.2460938 -1.9082031 -0.5605469  2.2148438  2.3652344] 4   4 Match 58

[-1.2792969  -0.10882568  0.4267578   0.68652344 -1.3623047 ] 3   2 
[ 2.6249886e-04  1.1474609e+00  4.2089844e-01 -9.6093750e-01
 -2.3476562e+00] 1   0 
[-1.9814453  -1.3632812  -0.39624023  1.4355469   0.5307617 ] 3   3 Match 59

[-1.9091797 -1.8085938 -0.8876953  1.5419922  2.7871094] 4   4 Match 60

[-0.29467773  0.9707031   0.56103516 -0.4753418  -2.2285156 ] 1   1 Match 61

[-2.109375   -1.5546875  -0.20349121  2.2675781   1.5419922 ] 3   3 Match 62

[-2.4355469  -1.4892578  -0.28833008  1.9091797   0.94921875] 3   3 Match 63

[-1.8652344  -0.8833008   0.11529541  1.5400391   0.4711914 ] 3   4 
[-0.15307617  1.0693359   0.4399414  -0.7817383  -2.2851562 ] 1   0 
[ 0.42700195  1.2929688   0.22961426 -1.0576172  -2.4335938 ] 1   1 Match 64

[ 0.27734375  1.2158203   0.31762695 -1.0175781  -2.34375   ] 1   1 Match 65

[ 0.86035156  1.02441

[ 0.5341797   1.2324219   0.27001953 -1.1845703  -2.4550781 ] 1   2 
[-0.6723633   0.28295898  0.36376953 -0.13415527 -1.8935547 ] 2   1 
[-0.63623047  0.6743164   0.6220703  -0.10943604 -1.9472656 ] 1   1 Match 127

[ 0.36083984  0.9667969  -0.08544922 -1.1494141  -2.2011719 ] 1   1 Match 128

[-1.7148438  -0.80859375  0.16516113  1.1367188  -0.62841797] 3   4 
[ 0.5317383   1.2324219   0.13427734 -1.1728516  -2.4023438 ] 1   1 Match 129

[-1.1337891   0.2565918   0.5493164   0.34423828 -1.7431641 ] 2   2 Match 130

[ 0.5073242   1.1826172   0.14709473 -1.2236328  -2.4082031 ] 1   2 
[ 0.58740234  1.1396484   0.08728027 -1.3037109  -2.4042969 ] 1   1 Match 131

[ 0.45874023  1.3056641   0.31860352 -1.0361328  -2.3359375 ] 1   3 
[-1.8642578  -0.35913086  0.56884766  0.99365234 -0.66845703] 3   3 Match 132

[-1.1494141   0.09899902  0.6308594   0.8120117  -1.34375   ] 3   3 Match 133

[ 0.1529541   1.1982422   0.39648438 -0.82373047 -2.3945312 ] 1   1 Match 134

[-1.8144531  -0.7700195

[-2.3964844  -1.8212891  -0.37963867  2.3261719   1.9433594 ] 3   4 
[-0.9760742   0.23339844  0.5605469   0.41601562 -1.5693359 ] 2   2 Match 207

[-2.4414062  -1.8310547  -0.79833984  1.9804688   2.25      ] 4   2 
[ 0.48242188  1.0761719   0.2097168  -1.1298828  -2.4023438 ] 1   1 Match 208

[ 0.3256836   1.0195312   0.31958008 -0.7480469  -2.21875   ] 1   2 
[-1.5205078  -0.7050781   0.03237915  0.92871094 -0.64501953] 3   4 
[-1.8730469 -1.8066406 -0.7524414  1.7695312  2.7890625] 4   4 Match 209

[ 0.16174316  1.2539062   0.42114258 -0.69189453 -2.2636719 ] 1   2 
[ 0.70166016  1.1318359  -0.01011658 -1.2734375  -2.4296875 ] 1   1 Match 210

[-2.2246094 -2.0214844 -0.8730469  1.8730469  2.3652344] 4   3 
[-1.8544922 -1.7929688 -0.8745117  1.5722656  2.7890625] 4   4 Match 211

[ 0.5654297  1.0810547  0.1418457 -1.0087891 -2.2714844] 1   2 
[-1.4091797  -0.05722046  0.30566406  0.63671875 -1.3242188 ] 3   2 
[-2.0078125  -0.92333984  0.06109619  1.4472656  -0.19592285] 3   3 Match

[ 0.4189453   0.9819336   0.03808594 -1.2119141  -2.3222656 ] 1   1 Match 278

[-2.4726562  -1.5673828   0.05175781  2.0527344   1.1699219 ] 3   3 Match 279

[-0.13952637  0.5415039   0.06292725 -0.5649414  -2.0234375 ] 1   1 Match 280

[-0.72265625  0.42919922  0.3154297  -0.32714844 -1.8193359 ] 1   2 
[-1.6298828  -1.3642578  -0.78515625  1.4570312   2.2792969 ] 4   1 
[-1.9765625  -1.5185547  -0.89453125  1.3232422   1.8476562 ] 4   3 
[ 0.90771484  0.8642578  -0.03555298 -1.1855469  -2.3476562 ] 0   1 
[-0.15759277  0.5883789   0.37573242 -0.32226562 -1.7138672 ] 1   0 
[ 0.02151489  0.50927734 -0.13671875 -0.43847656 -1.7666016 ] 1   1 Match 281

[-2.5820312  -1.3691406   0.05801392  1.8945312   0.8564453 ] 3   4 
[-1.8369141  -0.6455078   0.5649414   1.1767578  -0.57177734] 3   3 Match 282

[ 0.40649414  1.2324219   0.32763672 -1.0917969  -2.3925781 ] 1   1 Match 283

[-2.1191406  -0.73876953  0.37280273  1.6171875  -0.26879883] 3   3 Match 284

[ 0.24316406  1.0419922   0.19616

[-2.2363281 -1.7246094 -0.8017578  1.5957031  1.7041016] 4   4 Match 344

[ 0.0395813   0.77001953  0.42382812 -0.8457031  -2.0292969 ] 1   1 Match 345

[ 0.2548828  1.0673828  0.4987793 -0.7504883 -2.2714844] 1   1 Match 346

[-2.2910156  -1.6982422  -0.70703125  1.796875    1.8232422 ] 4   3 
[-0.05264282  0.45874023  0.08728027 -0.5708008  -1.6972656 ] 1   2 
[ 0.45361328  1.2685547   0.28271484 -1.0419922  -2.3613281 ] 1   2 
[-2.0722656  -1.2041016   0.27490234  1.7001953   0.42700195] 3   3 Match 347

[ 0.01354218  0.7866211   0.41381836 -0.61865234 -2.0625    ] 1   3 
[-1.5185547  -1.5078125  -0.83691406  1.2753906   2.7207031 ] 4   4 Match 348

[ 0.5151367   1.0361328   0.05426025 -1.0820312  -2.3945312 ] 1   0 
[-2.1660156  -1.6142578   0.09606934  1.953125    0.89404297] 3   2 
[-2.1484375  -1.6855469  -0.03387451  1.9814453   1.6621094 ] 3   2 
[ 0.8076172   1.0849609  -0.08508301 -1.2050781  -2.2988281 ] 1   0 
[-2.2597656 -2.0117188 -0.7998047  1.8115234  2.1425781] 4   4 

[-0.92333984 -0.0181427  -0.00561142  0.14355469 -1.359375  ] 3   1 
[ 0.76171875  0.9243164  -0.23535156 -1.1015625  -2.1484375 ] 1   0 
[ 0.15808105  1.0849609   0.30908203 -0.8701172  -2.4042969 ] 1   3 
[-0.3881836   0.74365234  0.45361328 -0.26416016 -2.1757812 ] 1   0 
[-2.3925781 -1.609375  -0.1784668  2.171875   1.4658203] 3   4 
[-2.1914062  -1.7871094  -0.79052734  1.6865234   2.6113281 ] 4   4 Match 410

[-0.58984375  0.91064453  0.6743164  -0.1685791  -2.1503906 ] 1   1 Match 411

[-1.65625    -1.1542969   0.0440979   1.1982422  -0.16430664] 3   1 
[-0.5385742   0.33081055  0.39916992 -0.3630371  -1.5585938 ] 2   2 Match 412

[ 0.32470703  1.2861328   0.25512695 -1.1474609  -2.4941406 ] 1   1 Match 413

[-0.47387695  0.87060547  0.4946289  -0.33129883 -2.1210938 ] 1   1 Match 414

[-1.9648438 -1.8808594 -0.9370117  1.6083984  2.6621094] 4   3 
[-1.9082031  -0.45507812  0.39697266  1.1386719  -0.77685547] 3   3 Match 415

[-2.2773438  -1.8740234  -0.85498047  1.7763672   1.9

[ 0.24267578  0.8432617  -0.04953003 -1.0429688  -2.15625   ] 1   2 
[-0.39819336  0.2993164  -0.07696533 -0.44335938 -1.5068359 ] 1   1 Match 481

[-2.2539062 -1.8339844 -0.3527832  1.8476562  1.2167969] 3   3 Match 482

[-2.4589844 -1.9677734 -0.8803711  1.8349609  2.4335938] 4   4 Match 483

[-2.0722656 -1.6171875 -0.6723633  1.6591797  2.1601562] 4   3 
[-0.15197754  1.0058594   0.4074707  -0.9321289  -2.3359375 ] 1   1 Match 484

[ 0.27490234  1.1796875   0.41723633 -0.8989258  -2.3496094 ] 1   0 
[ 0.8144531   0.9140625  -0.24829102 -1.2294922  -2.1328125 ] 1   1 Match 485

[-2.1582031  -1.1445312  -0.1194458   1.6005859   0.09265137] 3   3 Match 486

[-2.09375   -1.8203125 -0.6279297  2.0214844  2.5859375] 4   4 Match 487

[-1.8486328 -1.5888672 -1.0732422  1.2402344  2.0234375] 4   4 Match 488

[ 0.64990234  1.2001953   0.11566162 -1.2529297  -2.4355469 ] 1   1 Match 489

[-1.4980469  -0.9770508   0.07574463  0.94384766 -0.47509766] 3   3 Match 490

[ 0.78466797  1.0712891   0.

[ 0.5878906   1.0830078   0.06921387 -1.3505859  -2.3144531 ] 1   0 
[-2.0800781  -1.5107422  -0.38476562  1.6689453   1.0166016 ] 3   1 
[-0.8359375   0.5698242   0.609375    0.13598633 -1.8349609 ] 2   0 
[ 0.76416016  0.8017578  -0.2783203  -1.2148438  -2.0195312 ] 1   0 
[ 0.04241943  0.9394531   0.2902832  -1.0175781  -2.140625  ] 1   0 
[-1.8789062  -1.7324219  -0.44970703  1.9746094   2.4902344 ] 4   2 
[-2.5449219 -1.8886719 -0.3552246  2.1621094  1.8828125] 3   3 Match 548

[ 0.23522949  1.3515625   0.3647461  -0.96484375 -2.4023438 ] 1   1 Match 549

[-1.8271484  -1.4580078   0.07440186  2.109375    1.1289062 ] 3   3 Match 550

[-2.0820312  -0.9326172   0.17175293  1.4228516  -0.30908203] 3   3 Match 551

[-2.3417969  -1.5146484  -0.25805664  2.0097656   0.9638672 ] 3   2 
[-2.5644531 -1.8164062 -0.2734375  2.0605469  1.6738281] 3   4 
[-2.2714844  -1.4052734  -0.15551758  1.9912109   0.8823242 ] 3   3 Match 552

[-2.0390625  -1.8955078  -0.88964844  1.6357422   2.6777344 ] 4

[-2.0957031 -1.7705078 -0.3544922  2.0214844  2.0664062] 4   3 
[ 0.60791016  1.2431641   0.24365234 -1.21875    -2.3515625 ] 1   1 Match 616

[-0.24267578  0.93115234  0.51123047 -0.69677734 -2.3164062 ] 1   1 Match 617

[-2.0097656  -0.8066406   0.12036133  1.1259766  -0.41748047] 3   2 
[-2.0839844  -1.4765625  -0.3190918   1.7666016   0.65771484] 3   4 
[ 0.6069336   1.0283203   0.01480103 -1.3710938  -2.3457031 ] 1   1 Match 618

[ 0.8588867   0.9399414  -0.11352539 -1.2841797  -2.2929688 ] 1   0 
[-0.9916992  -0.36035156  0.41845703  0.5625     -0.7470703 ] 3   3 Match 619

[-1.4072266   0.01641846  0.55029297  0.24829102 -1.5410156 ] 2   4 
[-1.7880859  -1.5400391  -0.93310547  1.2363281   2.4042969 ] 4   4 Match 620

[ 0.79296875  1.1132812   0.11669922 -1.1621094  -2.3828125 ] 1   2 
[-1.3828125  -0.23962402  0.3544922   0.67578125 -1.3994141 ] 3   4 
[-0.76220703  0.2800293   0.44433594  0.10229492 -1.8984375 ] 2   1 
[-1.8896484  -0.5151367   0.57373047  1.0849609  -0.681152

[-2.2734375  -1.7089844  -0.22351074  2.1796875   1.8212891 ] 3   3 Match 689

[-0.52978516  0.1751709   0.13171387 -0.00428391 -1.5576172 ] 1   3 
[ 0.7397461   1.0615234  -0.05255127 -1.0703125  -2.3125    ] 1   0 
[-0.11505127  0.9550781   0.3618164  -0.7832031  -2.375     ] 1   2 
[-1.0380859   0.3149414   0.39038086 -0.03448486 -1.7089844 ] 2   0 
[-0.40527344  0.54052734  0.38623047 -0.7236328  -1.8115234 ] 1   0 
[ 0.74853516  1.1396484   0.03204346 -1.2480469  -2.3613281 ] 1   1 Match 690

[ 0.33007812  1.1875      0.5229492  -1.0791016  -2.2304688 ] 1   0 
[-2.2246094  -1.6542969  -0.34545898  1.6875      0.83251953] 3   3 Match 691

[-0.18237305  0.75878906  0.24975586 -0.8251953  -2.1445312 ] 1   1 Match 692

[ 0.32885742  1.1582031   0.34936523 -0.99609375 -2.3632812 ] 1   0 
[-1.671875   -0.92578125 -0.15319824  1.1884766  -0.17749023] 3   2 
[-0.72216797  0.44091797  0.58203125  0.1373291  -1.765625  ] 2   1 
[-1.1132812  -0.3256836   0.04064941  0.5942383  -1.0732422 ] 3

[-1.2773438  -0.26513672  0.29541016  0.5864258  -1.1044922 ] 3   1 
[-2.25      -1.8955078 -0.828125   1.8222656  2.59375  ] 4   3 
[-2.1972656  -1.2236328  -0.15100098  1.6855469   0.40576172] 3   3 Match 754

[ 0.11993408  1.0361328   0.484375   -0.5620117  -2.234375  ] 1   1 Match 755

[-2.34375    -1.6865234  -0.33618164  1.9189453   1.5273438 ] 3   3 Match 756

[-2.5527344 -1.9472656 -0.5913086  2.1621094  2.1875   ] 4   3 
[-0.16040039  0.9692383   0.35009766 -0.61572266 -2.3164062 ] 1   3 
[-2.1972656 -1.6621094 -0.8105469  1.4316406  1.3193359] 3   4 
[ 0.6538086   1.1083984   0.05285645 -1.2988281  -2.3574219 ] 1   1 Match 757

[-2.09375   -1.84375   -0.6489258  1.9951172  2.7148438] 4   4 Match 758

[-0.4621582   0.17041016  0.32714844 -0.55371094 -1.5908203 ] 2   2 Match 759

[-2.296875   -1.7138672  -0.16955566  2.2285156   1.4433594 ] 3   3 Match 760

[ 1.0380859   1.0683594  -0.05966187 -1.3007812  -2.2578125 ] 1   0 
[ 0.9345703   1.0019531  -0.16894531 -1.3369141  -2.2

[-2.4375     -1.7783203  -0.44995117  1.8779297   1.1455078 ] 3   3 Match 827

[-2.203125  -1.7441406 -0.53125    2.0644531  2.3027344] 4   4 Match 828

[ 0.66015625  0.9116211  -0.14318848 -1.2158203  -2.1523438 ] 1   0 
[ 0.4958496   1.1201172   0.01754761 -1.2421875  -2.421875  ] 1   1 Match 829

[ 0.85546875  1.046875   -0.09967041 -1.3339844  -2.3085938 ] 1   0 
[-1.8085938  -0.8408203   0.01065826  1.1777344  -0.37670898] 3   3 Match 830

[-1.5292969  -0.5229492   0.26513672  1.1054688  -0.46899414] 3   3 Match 831

[ 0.27783203  1.2333984   0.35473633 -0.8222656  -2.4179688 ] 1   1 Match 832

[-0.11566162  1.1953125   0.47192383 -0.54833984 -2.4003906 ] 1   1 Match 833

[ 0.5678711   0.8730469   0.2277832  -0.91503906 -2.2246094 ] 1   2 
[-0.8173828   0.5214844   0.42333984 -0.03540039 -1.9511719 ] 1   0 
[ 0.26611328  1.0625      0.36987305 -1.0351562  -2.3496094 ] 1   0 
[-2.078125   -1.0957031  -0.0592041   1.6835938   0.34277344] 3   3 Match 834

[-0.3400879   0.48266602  0.

[-2.5761719 -1.8632812 -0.4819336  2.0390625  1.8066406] 3   3 Match 894

[ 0.71533203  1.0146484   0.05181885 -1.15625    -2.3339844 ] 1   1 Match 895

[ 0.40478516  0.91748047 -0.03146362 -1.1142578  -2.2578125 ] 1   1 Match 896

[-0.6401367   0.46972656  0.43188477 -0.28710938 -1.8037109 ] 1   2 
[-1.9794922 -1.7148438 -0.8769531  1.4257812  2.6601562] 4   4 Match 897

[-1.7441406 -1.6787109 -0.5576172  1.8876953  2.7285156] 4   4 Match 898

[-1.0957031  -0.2619629   0.06140137  0.20703125 -1.3828125 ] 3   1 
[ 0.64160156  1.0185547  -0.06829834 -1.3671875  -2.3066406 ] 1   0 
[ 0.5078125   1.203125    0.27929688 -1.1542969  -2.3007812 ] 1   1 Match 899

[-1.9023438 -1.8232422 -0.8222656  1.4628906  2.7578125] 4   4 Match 900

[-2.4199219  -1.8505859  -0.75878906  1.7167969   1.9589844 ] 4   4 Match 901

[ 0.29003906  0.9604492   0.31469727 -0.9267578  -2.328125  ] 1   1 Match 902

[-1.2080078  -0.06787109  0.73779297  0.5620117  -1.1855469 ] 2   1 
[ 0.73339844  0.8334961  -0.08398

[-0.28076172  1.0488281   0.40356445 -0.50439453 -2.3632812 ] 1   2 
[-1.8085938 -1.7685547 -0.8911133  1.5019531  2.7363281] 4   4 Match 963

[-0.4008789   0.5883789   0.20825195 -0.38842773 -2.1660156 ] 1   1 Match 964

[ 0.86083984  0.9589844  -0.0165863  -1.2880859  -2.25      ] 1   1 Match 965

[-1.5273438  -0.2902832   0.28881836  0.70996094 -1.1953125 ] 3   2 
[-2.3554688  -1.5214844   0.18188477  2.1230469   0.97558594] 3   4 
[-2.4902344 -2.0351562 -0.6508789  2.1386719  2.1972656] 4   4 Match 966

[-2.0625    -1.6464844 -0.7001953  1.7763672  2.2949219] 4   4 Match 967

[-0.90234375  0.22351074  0.6381836   0.18615723 -1.8535156 ] 2   1 
[-0.14746094  0.74316406  0.33520508 -0.66064453 -2.2558594 ] 1   1 Match 968

[ 0.4399414   1.0644531   0.20080566 -0.93359375 -2.3710938 ] 1   1 Match 969

[-2.453125  -1.8789062 -0.6621094  2.0683594  2.2363281] 4   4 Match 970

[ 0.4633789   1.1279297   0.08398438 -1.3701172  -2.3808594 ] 1   1 Match 971

[ 0.36621094  1.1171875   0.09863

[-0.06390381  1.1962891   0.48339844 -0.5336914  -2.3984375 ] 1   1 Match 1034

[-2.0058594  -1.0302734   0.01093292  1.2314453  -0.07714844] 3   3 Match 1035

[ 0.50146484  1.0947266   0.16699219 -1.0205078  -2.359375  ] 1   1 Match 1036

[ 0.18945312  1.1044922   0.3413086  -0.90283203 -2.3808594 ] 1   2 
[-2.1152344  -0.97509766  0.4338379   1.5947266   0.02786255] 3   3 Match 1037

[ 0.5761719   1.0927734  -0.00705338 -1.2167969  -2.2753906 ] 1   0 
[-1.9248047  -0.5058594   0.38842773  1.1269531  -0.4321289 ] 3   0 
[ 0.37548828  0.72998047 -0.19433594 -1.0429688  -2.0195312 ] 1   1 Match 1038

[-1.9824219 -1.5898438 -0.7294922  1.5888672  2.5351562] 4   4 Match 1039

[ 0.5488281   1.0751953   0.03839111 -1.1738281  -2.3671875 ] 1   1 Match 1040

[-0.421875    0.77978516  0.3317871  -0.66796875 -2.234375  ] 1   2 
[ 0.70214844  1.0253906   0.07067871 -1.1289062  -2.3652344 ] 1   0 
[-0.88623047  0.6352539   0.66845703  0.07659912 -2.09375   ] 2   3 
[-1.2988281  -0.12469482  0.543
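
Each logged row pairs the five raw scores with the predicted class, chosen by the chain of `if` comparisons in the loop. A minimal sketch of the same selection with `np.argmax` is shown here; the two score rows are copied from the first and fourth lines of the Fold 2 output, and `manual_argmax` is a hypothetical helper written only to mirror the loop's logic:

```python
import numpy as np

def manual_argmax(outputs):
    """Mirror of the if-chain: keep the index of the largest score."""
    result = 0
    for i in range(1, len(outputs)):
        if outputs[result] < outputs[i]:
            result = i
    return result

# Two rows of model outputs copied from the Fold 2 log
logits = np.array([
    [0.5341797, 1.0703125, 0.01020813, -1.2958984, -2.3261719],
    [-2.1582031, -1.8671875, -0.95214844, 1.7255859, 2.5625],
])

vectorized = np.argmax(logits, axis=1)          # one call for every row
looped = [manual_argmax(row) for row in logits]  # row-by-row equivalent

print(vectorized.tolist(), looped)
```

Both approaches return the first index of the maximum on ties, so they should agree row for row; the vectorized form simply avoids the explicit Python loop.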

In [84]:
from sklearn import metrics
print(metrics.confusion_matrix(Targets,Pred))

[[ 16 236  14  13   0]
 [ 10 513  38  67   5]
 [  2 210  46 120  11]
 [  0  55  32 305 118]
 [  0  15   5 160 219]]


In [85]:
target_names = ['Very Neg', 'Negative', 'Neutral','Positive','Very Pos']
print(metrics.classification_report(Targets, Pred,target_names =target_names))

              precision    recall  f1-score   support

    Very Neg       0.57      0.06      0.10       279
    Negative       0.50      0.81      0.62       633
     Neutral       0.34      0.12      0.18       389
    Positive       0.46      0.60      0.52       510
    Very Pos       0.62      0.55      0.58       399

    accuracy                           0.50      2210
   macro avg       0.50      0.43      0.40      2210
weighted avg       0.49      0.50      0.45      2210
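
The two averaged rows of the report differ only in how the classes are weighted: `macro avg` treats every class equally, while `weighted avg` weights each class by its support. A rough sketch using the per-class F1 scores from the Fold 2 report above (already rounded to two decimals, so the results are approximate) illustrates this:

```python
# Per-class F1 and support copied from the Fold 2 report above
f1 = [0.10, 0.62, 0.18, 0.52, 0.58]
support = [279, 633, 389, 510, 399]

# Macro average: plain mean over the five classes
macro_f1 = sum(f1) / len(f1)

# Weighted average: each class contributes in proportion to its support
weighted_f1 = sum(f * s for f, s in zip(f1, support)) / sum(support)

print(round(macro_f1, 2), round(weighted_f1, 2))
```

The weighted figure is higher here because the largest class, Negative, is also one of the best-scoring ones, while the poorly scoring Very Neg class is the smallest.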



In [86]:
Fold_Predictions=pd.DataFrame(Pred, columns=['Pred2'] )
Fold_Predictions

Unnamed: 0,Pred2
0,1
1,1
2,1
3,4
4,1
...,...
2205,3
2206,1
2207,4
2208,4


In [87]:
Fold_Predictions.to_excel(output_folder+'/Saves/fold2_Predictions.xls')
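
Since the predictions are saved per fold in order to compare them, one simple check is how often two folds agree on the same test sentence. The sketch below is only illustrative: it uses the nine rows visible in the `Pred1` and `Pred2` previews above (rows 0 to 4 and 2205 to 2208), not the full 2210-row columns, and the variable names are introduced here for the example.

```python
# Rows 0-4 and 2205-2208 as shown in the two DataFrame previews
pred1 = [1, 1, 1, 4, 1, 4, 1, 3, 4]   # Fold 1 predictions (Pred1)
pred2 = [1, 1, 1, 4, 1, 3, 1, 4, 4]   # Fold 2 predictions (Pred2)

# Count positions where both folds predict the same class
agree_count = sum(a == b for a, b in zip(pred1, pred2))
agreement_rate = agree_count / len(pred1)

print(agree_count, "of", len(pred1), "visible rows agree")
```

On the full columns the same comparison could be run on the saved files; the rate computed from these nine rows should not be read as a result for the whole test set.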

In [88]:
# Free the model and the cached outputs, then release cached GPU memory

del model
del TrainResult, TrainModel_outputs, EvalResult, EvalModel_outputs, TestResult, TestModel_outputs, wrong_predictions
torch.cuda.empty_cache()

## Fold 3: training & capturing predictions

In [89]:
fold_number='3'

train=pd.read_excel('./folds/train_fold'+fold_number+'.xls')
Eval=pd.read_excel('./folds/valid'+fold_number+'.xls') #evaluation set


In [90]:
 
output_folder='./folds/fold'+fold_number+'/'+model_class+'/'+model_version+"/"
cache_directory= "./folds/fold"+fold_number+'/'+model_class+"/"+model_version+"/cache/"


print('model variables were set up: ')

 
save_every_steps=1285
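# This threshold was picked assuming a training batch size of 8:
# any number above 1284 saves the model only at every epoch.
# With train_batch_size=64 (set below) an epoch is only 107 steps, so the same holds here.
# Saving the model mid-training very often will consume disk space fast.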

train_args={
    "output_dir":output_folder,
    "cache_dir":cache_directory,
    'reprocess_input_data': True,
    'overwrite_output_dir': True,
    'num_train_epochs': 2,
    "save_steps": save_every_steps, 
    "learning_rate": 2e-5,
    "train_batch_size": 64,
    "eval_batch_size": 16,
    "evaluate_during_training_steps": 312,
    "max_seq_length": 100,
    "n_gpu": 1,
}

# Create a ClassificationModel
model = ClassificationModel(model_class, model_version, num_labels=labels_count, args=train_args) 

model variables were set up: 
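
The relationship between the step-based settings and the batch size can be checked with a little arithmetic. The sketch below assumes the 6829 training rows reported by the progress bar in this fold's training output and no gradient accumulation; the variable names are illustrative and are not part of simpletransformers.

```python
import math

train_rows = 6829                      # size of one training fold (from the progress bar)
train_batch_size = 64                  # value passed in train_args
num_train_epochs = 2
save_steps = 1285
evaluate_during_training_steps = 312

# One optimizer step per batch (assumes no gradient accumulation)
steps_per_epoch = math.ceil(train_rows / train_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(steps_per_epoch, total_steps)
# Both thresholds are larger than the total number of steps in this run
print(save_steps > total_steps, evaluate_during_training_steps > total_steps)
```

Under these assumptions neither step threshold is reached within the run, which agrees with the 107 iterations per epoch shown in the training output.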


In [91]:
# loading the checkpoint that gave the best result
'''
CheckPoint='checkpoint-130-epoch-1'  #epoch 1


preSavedCheckpoint=output_folder+CheckPoint

print('Loading model, please wait...')
model = ClassificationModel( model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args) 
print('model in use is :', preSavedCheckpoint )
'''


In [92]:
# Train the model
current_time = datetime.now()
model.train_model(train)
print("Training time: ", datetime.now() - current_time)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6829.0), HTML(value='')))


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


HBox(children=(FloatProgress(value=0.0, description='Epoch', max=2.0, style=ProgressStyle(description_width='i…

HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.165925


HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.042245
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 1.034553

Training of bert model complete. Saved to ./folds/fold3/bert/bert-base-cased/.
Training time:  0:04:50.527827


In [93]:
TrainResult, TrainModel_outputs, wrong_predictions = model.eval_model(train, acc=sklearn.metrics.accuracy_score)
 
EvalResult, EvalModel_outputs, wrong_predictions = model.eval_model(Eval, acc=sklearn.metrics.accuracy_score)

TestResult, TestModel_outputs, wrong_predictions = model.eval_model(test, acc=sklearn.metrics.accuracy_score)

print('Training Result:', TrainResult['acc'])
#print('Model Out:', TrainModel_outputs)

print('Eval Result:', EvalResult['acc'])
#print('Model Out:', EvalModel_outputs)

print('Test Set Result:', TestResult['acc'])
#print('Model Out:', TestModel_outputs)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6829.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=427.0), HTML(value='')))


{'mcc': 0.48572544266546536, 'acc': 0.5930590130326548, 'eval_loss': 0.973277285171616}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=1705.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=107.0), HTML(value='')))


{'mcc': 0.37261878329760767, 'acc': 0.5096774193548387, 'eval_loss': 1.13340771253978}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=2210.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=139.0), HTML(value='')))


{'mcc': 0.3600734226320592, 'acc': 0.5049773755656108, 'eval_loss': 1.1278813820091083}
Training Result: 0.5930590130326548
Eval Result: 0.5096774193548387
Test Set Result: 0.5049773755656108


In [94]:
Pred=[]
Targets=[]

countCorrect=0

for row in range(TestModel_outputs.shape[0]):
    outputs=TestModel_outputs[row]
    #print(test.iloc[row,0])
    print(outputs, end=' ')
    
    result=0
    if outputs[0]<outputs[1]:result=1
    if outputs[result]<outputs[2]:result=2
    if outputs[result]<outputs[3]:result=3
    if outputs[result]<outputs[4]:result=4
    
    Pred.append(result)
    Targets.append(test.iloc[row,1])
    print(result, ' ',test.iloc[row,1], end=' ')
    if result==test.iloc[row,1]:
        countCorrect+=1
        print('Match',countCorrect)
    print('')

print(countCorrect)

[ 1.0849609   1.6826172   0.48413086 -0.8408203  -1.7958984 ] 1   1 Match 1

[ 0.92626953  2.0644531   0.578125   -0.9560547  -1.9677734 ] 1   0 
[ 0.90527344  1.8730469   0.69970703 -0.84277344 -2.046875  ] 1   2 
[-1.828125   -2.09375    -0.47265625  1.6298828   2.6367188 ] 4   4 Match 2

[ 0.4111328   1.1777344   0.83251953 -0.3684082  -1.3007812 ] 1   0 
[-2.1289062  -1.4404297   0.31323242  2.2851562   0.8378906 ] 3   3 Match 3

[ 1.5292969   1.7412109   0.30297852 -1.0732422  -1.9169922 ] 1   0 
[-2.0761719  -2.1035156  -0.27441406  2.1269531   2.4550781 ] 4   4 Match 4

[ 1.7226562   1.7353516   0.27416992 -1.3251953  -1.890625  ] 1   0 
[ 1.7792969   1.7304688   0.19335938 -1.5390625  -1.9580078 ] 0   0 Match 5

[ 0.7084961   1.8945312   0.68359375 -0.73876953 -2.0078125 ] 1   2 
[-2.1679688  -1.1298828   0.5073242   2.1777344   0.48461914] 3   4 
[-0.27685547  1.1796875   1.0498047   0.11358643 -1.2734375 ] 1   1 Match 6

[ 1.3398438   1.4375      0.42407227 -0.82958984 -1.672

[ 0.70458984  1.5224609   0.5761719  -0.67089844 -1.5927734 ] 1   0 
[-2.0273438  -1.5673828  -0.11181641  1.7421875   1.3320312 ] 3   3 Match 56

[-1.6582031 -2.1191406 -0.5732422  1.4638672  2.5957031] 4   4 Match 57

[-0.6557617   1.0224609   1.0048828   0.20898438 -1.515625  ] 1   1 Match 58

[-2.0644531  -1.7490234   0.15100098  2.3105469   1.4990234 ] 3   3 Match 59

[-2.1582031 -1.9248047 -0.1381836  2.2011719  2.0644531] 3   3 Match 60

[-1.8798828  -0.8598633   0.7270508   1.6513672   0.25830078] 3   4 
[ 0.9199219   1.6669922   0.7373047  -0.80810547 -1.8203125 ] 1   0 
[ 1.2919922   2.0761719   0.45581055 -1.2304688  -2.0839844 ] 1   1 Match 61

[ 1.4736328   1.9023438   0.29858398 -1.1748047  -2.0800781 ] 1   1 Match 62

[ 1.8076172   1.9179688   0.21386719 -1.4707031  -2.109375  ] 1   0 
[ 1.4755859   1.9541016   0.21130371 -1.3017578  -2.0019531 ] 1   3 
[ 0.17736816  1.4765625   0.94140625 -0.43969727 -1.7851562 ] 1   2 
[-1.3925781  -0.11639404  0.7998047   1.0820312  -

[ 0.59375     1.8652344   0.78222656 -0.6479492  -1.9482422 ] 1   2 
[ 0.39257812  1.4658203   0.77490234 -0.4958496  -1.5361328 ] 1   1 Match 127

[-0.6791992   0.63134766  0.97802734  0.60791016 -1.2617188 ] 2   1 
[ 0.9980469   1.6191406   0.7397461  -0.75878906 -1.7324219 ] 1   1 Match 128

[-1.1025391   0.39331055  1.0117188   0.68066406 -0.77734375] 2   4 
[ 1.3779297   1.8974609   0.36865234 -1.0908203  -1.9892578 ] 1   1 Match 129

[-0.85058594  0.62353516  0.9013672   0.8144531  -1.0556641 ] 2   2 Match 130

[ 0.6538086   1.5771484   0.76171875 -0.6645508  -1.6904297 ] 1   2 
[ 1.4931641  1.9716797  0.2824707 -1.2412109 -1.8798828] 1   1 Match 131

[ 1.1279297   1.8847656   0.54833984 -1.2099609  -2.0253906 ] 1   3 
[-1.3740234   0.09466553  1.0273438   1.1230469  -0.984375  ] 3   3 Match 132

[-1.0585938   0.46875     1.0810547   0.68652344 -1.2880859 ] 2   3 
[ 0.62890625  1.7441406   0.7675781  -0.7763672  -1.8896484 ] 1   1 Match 133

[-1.5644531  -0.6464844   0.5463867   

[-1.5556641  -0.23779297  1.0498047   1.3046875  -0.47485352] 3   3 Match 203

[-2.140625  -1.2373047  0.3635254  2.0566406  0.8984375] 3   2 
[ 1.2685547   1.5458984   0.40112305 -1.0976562  -1.7060547 ] 1   2 
[-2.203125   -1.7626953   0.10614014  2.2636719   1.4824219 ] 3   3 Match 204

[-0.5161133   1.0292969   1.0400391   0.11608887 -1.5830078 ] 2   1 
[-2.1777344  -2.1699219  -0.22302246  2.2851562   2.4160156 ] 4   4 Match 205

[ 0.21777344  1.4042969   0.89501953 -0.37768555 -1.6845703 ] 1   2 
[-2.09375    -2.0546875  -0.41796875  2.1113281   2.5117188 ] 4   2 
[ 1.1572266  1.7851562  0.6298828 -0.96875   -1.8720703] 1   1 Match 206

[ 0.3215332   1.5126953   0.90185547 -0.61279297 -1.8779297 ] 1   2 
[-1.5097656  -0.4050293   0.91259766  1.3486328  -0.35083008] 3   4 
[-1.9345703  -2.1464844  -0.44677734  1.8242188   2.609375  ] 4   4 Match 207

[ 0.86816406  1.9775391   0.68408203 -0.75341797 -2.0195312 ] 1   2 
[ 1.5224609   1.9433594   0.36669922 -1.1337891  -1.9628906 ] 1

[-2.1503906  -2.0253906  -0.24133301  2.3359375   2.1269531 ] 3   3 Match 273

[ 1.7207031   1.6181641   0.08416748 -1.6269531  -1.8027344 ] 0   0 Match 274

[ 0.61816406  1.6923828   0.7553711  -0.5571289  -1.875     ] 1   1 Match 275

[-1.9902344  -2.2285156  -0.31933594  1.96875     2.6210938 ] 4   4 Match 276

[-0.5498047   0.57373047  0.86035156  0.26293945 -1.0380859 ] 2   1 
[ 0.6040039  1.515625   0.8364258 -0.5751953 -1.5546875] 1   1 Match 277

[-0.13977051  1.1933594   0.8417969   0.0223999  -1.4755859 ] 1   3 
[-2.2460938  -2.0839844  -0.05233765  2.3574219   2.1171875 ] 3   4 
[ 1.6914062   1.6416016   0.11633301 -1.5566406  -1.8339844 ] 0   1 
[-1.9384766  -2.1074219  -0.48535156  1.7832031   2.7109375 ] 4   4 Match 278

[-1.6689453  -1.4306641   0.11431885  1.9033203   0.8100586 ] 3   4 
[ 1.0917969   1.8310547   0.46435547 -1.015625   -1.8642578 ] 1   1 Match 279

[-2.1035156  -2.1503906  -0.09564209  2.2167969   2.2480469 ] 4   3 
[ 0.6220703   1.1220703   0.77246094 -

[ 0.4951172   1.5595703   0.91845703 -0.44555664 -1.8134766 ] 1   0 
[-1.6982422  -0.50634766  0.7363281   1.7402344  -0.33447266] 3   2 
[ 1.4638672   1.6767578   0.43139648 -1.3095703  -1.9160156 ] 1   1 Match 343

[-2.109375   -1.5947266   0.17248535  2.296875    1.3603516 ] 3   4 
[-0.16394043  1.4648438   0.9394531  -0.02276611 -1.7519531 ] 1   1 Match 344

[-2.2558594  -1.9404297  -0.17663574  2.3066406   2.0878906 ] 3   4 
[-1.9873047 -2.1054688 -0.3852539  2.0019531  2.484375 ] 4   4 Match 345

[ 0.9477539   1.8867188   0.5341797  -0.94189453 -1.9277344 ] 1   2 
[ 0.9848633   1.9033203   0.41674805 -0.9941406  -1.8652344 ] 1   1 Match 346

[-1.9628906  -2.0585938  -0.43579102  1.8056641   2.5742188 ] 4   4 Match 347

[ 0.49169922  1.2607422   0.8222656  -0.47875977 -1.4267578 ] 1   1 Match 348

[ 0.99072266  1.9521484   0.5654297  -0.9194336  -1.9921875 ] 1   1 Match 349

[-2.0820312  -1.9970703  -0.14685059  1.9873047   2.4785156 ] 4   3 
[ 0.10522461  1.0585938   0.86328125 -

[ 0.9399414   1.4941406   0.81689453 -0.77246094 -1.7421875 ] 1   1 Match 409

[-0.44360352  1.1142578   0.9897461   0.1595459  -1.5019531 ] 1   1 Match 410

[-2.0703125  -0.9506836   0.46411133  2.0820312   0.39526367] 3   3 Match 411

[-2.1171875  -2.1054688  -0.20275879  2.2285156   2.3847656 ] 4   3 
[-1.0644531  -0.1986084   0.81933594  0.8510742  -0.3984375 ] 3   1 
[-2.1855469  -1.8701172  -0.15661621  2.2480469   1.9316406 ] 3   3 Match 412

[-1.53125     0.08959961  0.9296875   1.2265625  -0.83691406] 3   3 Match 413

[-0.6816406   0.64746094  0.96533203  0.73779297 -1.0439453 ] 2   1 
[ 1.7490234   1.8154297   0.27905273 -1.4248047  -2.03125   ] 1   0 
[-0.90625     0.56152344  1.1083984   0.5385742  -1.203125  ] 2   3 
[-0.46557617  0.9868164   1.0478516   0.37719727 -1.4208984 ] 2   0 
[-2.328125   -1.7900391   0.16125488  2.3710938   1.4599609 ] 3   4 
[-1.8544922  -2.1230469  -0.55029297  1.5966797   2.734375  ] 4   4 Match 414

[ 0.12792969  1.5439453   0.89208984 -0.350

[-1.4667969  -0.32104492  0.7397461   1.3388672  -0.37548828] 3   3 Match 482

[ 0.41430664  1.5478516   0.88916016 -0.390625   -1.7636719 ] 1   2 
[-2.1816406  -2.0859375  -0.22912598  2.1503906   2.3671875 ] 4   4 Match 483

[ 0.9140625   1.7197266   0.57128906 -1.0175781  -1.8105469 ] 1   2 
[-0.27563477  0.75341797  0.60302734  0.3486328  -1.0292969 ] 1   1 Match 484

[-2.1152344  -1.4814453   0.26513672  2.1757812   0.9580078 ] 3   3 Match 485

[-2.1386719  -2.1503906  -0.36791992  2.0722656   2.5664062 ] 4   4 Match 486

[-1.1035156  -1.6845703  -0.4038086   0.85595703  1.9785156 ] 4   3 
[ 0.02906799  1.5498047   0.7993164  -0.27807617 -1.7724609 ] 1   1 Match 487

[ 0.59472656  1.7568359   0.7416992  -0.6635742  -1.8242188 ] 1   0 
[ 1.7158203  1.4765625  0.2019043 -1.6044922 -1.6699219] 0   1 
[-1.8623047  -1.0195312   0.6010742   1.8632812   0.19067383] 3   3 Match 488

[-1.8701172 -2.1640625 -0.3725586  1.8134766  2.5058594] 4   4 Match 489

[-1.7041016 -2.0078125 -0.5673828

[-2.2207031 -1.9101562 -0.1104126  2.1757812  1.8310547] 3   3 Match 554

[-2.1132812 -2.1035156 -0.4675293  2.0585938  2.734375 ] 4   4 Match 555

[-1.765625   -2.0917969  -0.53271484  1.578125    2.6738281 ] 4   4 Match 556

[-1.7929688  -0.69628906  0.6298828   1.8037109   0.01779175] 3   4 
[-2.2832031  -1.6289062   0.25585938  2.2128906   1.3193359 ] 3   4 
[-1.9189453  -1.0185547   0.45410156  1.9560547   0.33862305] 3   4 
[-2.1308594  -1.8496094  -0.21643066  2.1914062   2.0019531 ] 3   3 Match 557

[-2.1503906  -1.5302734   0.5175781   2.1445312   0.86816406] 3   2 
[ 1.2421875   1.7744141   0.4645996  -0.99316406 -1.7451172 ] 1   1 Match 558

[ 0.8886719  1.9316406  0.5576172 -0.7944336 -1.9980469] 1   1 Match 559

[-1.5439453  -1.0820312   0.4729004   1.5361328   0.47558594] 3   2 
[ 0.02703857  1.4589844   0.9423828  -0.12036133 -1.6191406 ] 1   1 Match 560

[-0.0927124   0.8227539   0.9238281   0.15539551 -1.1503906 ] 2   2 Match 561

[-0.3125      1.0751953   0.9716797   

[ 1.5244141   1.7207031   0.27270508 -1.1699219  -1.6982422 ] 1   1 Match 630

[ 0.04837036  1.4619141   0.79052734 -0.31274414 -1.6523438 ] 1   1 Match 631

[-2.3183594  -1.515625    0.14013672  2.28125     1.1474609 ] 3   3 Match 632

[-0.43969727  1.0146484   0.9375      0.13195801 -1.4101562 ] 1   1 Match 633

[ 0.4296875  1.6689453  0.6611328 -0.6748047 -1.7207031] 1   1 Match 634

[-0.45581055  0.7182617   0.97021484  0.07177734 -1.1689453 ] 2   1 
[-1.1044922  -0.3635254   0.86328125  1.0097656  -0.3918457 ] 3   2 
[-1.8544922 -2.1894531 -0.6035156  1.6113281  2.796875 ] 4   4 Match 635

[ 1.2333984   1.6552734   0.5698242  -0.97753906 -1.78125   ] 1   0 
[ 1.5097656   1.8896484   0.30444336 -1.2265625  -1.9580078 ] 1   0 
[-2.0839844  -2.0917969  -0.25073242  2.0390625   2.46875   ] 4   4 Match 636

[ 1.8671875   1.8486328   0.10699463 -1.4814453  -2.0488281 ] 0   1 
[ 0.7890625   1.6289062   0.75097656 -0.7246094  -1.8613281 ] 1   1 Match 637

[-0.79052734 -0.1463623   0.49829

[-2.1328125 -1.6269531  0.1953125  2.3300781  1.3740234] 3   3 Match 705

[-2.2050781  -2.0410156  -0.23461914  2.3652344   2.3300781 ] 3   3 Match 706

[ 0.13537598  1.3203125   0.9511719  -0.09350586 -1.5576172 ] 1   0 
[-0.13415527  0.93359375  0.90771484  0.17553711 -1.2490234 ] 1   1 Match 707

[-2.2519531  -1.9765625  -0.14343262  2.2558594   2.0253906 ] 3   3 Match 708

[-0.28149414  1.0634766   1.0722656   0.2993164  -1.4423828 ] 2   1 
[-2.0214844 -2.1992188 -0.4248047  1.9365234  2.671875 ] 4   3 
[-1.7998047  -0.43139648  0.8095703   1.7792969  -0.33764648] 3   3 Match 709

[ 1.7138672  1.375      0.1361084 -1.4111328 -1.5830078] 0   0 Match 710

[-0.01870728  1.1044922   0.94091797  0.05627441 -1.2587891 ] 1   2 
[-2.2050781  -1.9130859  -0.12841797  2.2773438   2.0390625 ] 3   3 Match 711

[ 1.3291016   1.9316406   0.5292969  -0.98291016 -2.0195312 ] 1   1 Match 712

[ 1.4111328   1.8173828   0.32666016 -1.0478516  -1.8759766 ] 1   1 Match 713

[ 1.2089844   1.609375    0.

[ 0.24633789  1.3613281   0.8100586  -0.2409668  -1.6464844 ] 1   1 Match 779

[-1.4199219  -0.26757812  0.77978516  1.1835938  -0.49267578] 3   1 
[-0.50097656  0.78125     0.94091797  0.44702148 -1.1728516 ] 2   3 
[ 1.765625    1.6347656   0.10412598 -1.6523438  -1.8896484 ] 0   1 
[-1.6738281 -2.109375  -0.5673828  1.4443359  2.6777344] 4   4 Match 780

[-2.3242188  -1.7480469   0.09655762  2.2109375   1.4921875 ] 3   3 Match 781

[-1.8730469 -1.0166016  0.6113281  1.7910156  0.4099121] 3   3 Match 782

[ 0.18774414  1.4433594   1.0585938  -0.33203125 -1.8486328 ] 1   1 Match 783

[ 0.53564453  1.8144531   0.8017578  -0.6357422  -1.9130859 ] 1   2 
[-0.7871094   0.7841797   1.0566406   0.41186523 -1.2119141 ] 2   4 
[-1.8320312  -2.1289062  -0.39746094  1.8271484   2.53125   ] 4   2 
[-2.1679688  -1.9980469  -0.10461426  1.9775391   2.2070312 ] 4   4 Match 784

[ 0.8017578   1.8486328   0.7963867  -0.83740234 -1.9755859 ] 1   1 Match 785

[-1.7353516 -2.0878906 -0.5234375  1.375976

[ 1.4609375   1.8183594   0.30444336 -1.1298828  -1.9765625 ] 1   1 Match 852

[ 1.3007812  1.890625   0.3708496 -1.1347656 -1.9033203] 1   0 
[ 0.58691406  1.75        0.7729492  -0.6645508  -1.9052734 ] 1   3 
[ 1.5146484   1.8417969   0.35229492 -1.1611328  -2.0039062 ] 1   1 Match 853

[-0.44018555  1.0166016   0.91259766  0.16870117 -1.4560547 ] 1   1 Match 854

[ 1.1220703   1.6181641   0.51904297 -0.8808594  -1.6152344 ] 1   2 
[-1.921875   -2.0449219  -0.36132812  1.7294922   2.5527344 ] 4   2 
[-1.7138672  -1.1044922   0.11187744  1.7275391   0.6166992 ] 3   3 Match 855

[-1.9091797  -1.4960938   0.12231445  2.1582031   0.9951172 ] 3   2 
[-1.9140625  -2.1757812  -0.42919922  1.8398438   2.671875  ] 4   3 
[-1.5927734  -0.25634766  0.92333984  1.5175781  -0.5864258 ] 3   3 Match 856

[-0.24853516  1.2431641   0.9819336   0.04663086 -1.6347656 ] 1   2 
[-1.6328125  -0.3515625   0.82958984  1.2783203  -0.34179688] 3   0 
[-1.0537109   0.50146484  0.9326172   0.74121094 -1.003906

[ 1.7441406   1.765625    0.14147949 -1.5556641  -1.984375  ] 1   1 Match 918

[ 1.734375    1.8574219   0.24768066 -1.3447266  -1.9277344 ] 1   1 Match 919

[ 1.5849609  1.9453125  0.3244629 -1.2333984 -2.1113281] 1   1 Match 920

[-2.2871094  -1.8095703   0.05291748  2.4042969   1.7392578 ] 3   4 
[-1.9667969  -1.1552734   0.29296875  2.0214844   0.62597656] 3   3 Match 921

[-0.80078125  0.49121094  0.9873047   0.6850586  -1.1308594 ] 2   0 
[-0.47851562  1.0019531   0.94677734  0.48364258 -1.2324219 ] 1   1 Match 922

[-1.2822266   0.2163086   0.9692383   1.2314453  -0.84814453] 3   3 Match 923

[ 0.6411133   1.6396484   0.55859375 -0.48217773 -1.7832031 ] 1   1 Match 924

[ 1.7060547   1.9892578   0.13647461 -1.5087891  -2.0234375 ] 1   1 Match 925

[ 1.4072266   1.8515625   0.29711914 -1.2539062  -1.8730469 ] 1   1 Match 926

[-2.2539062 -1.6142578  0.1149292  2.4941406  1.2929688] 3   3 Match 927

[-1.3701172   0.02290344  1.0546875   1.1191406  -0.54345703] 3   1 
[ 0.67041016 

[ 0.47509766  1.5068359   0.69433594 -0.55029297 -1.3789062 ] 1   2 
[ 0.1920166  1.0507812  0.8803711 -0.1763916 -1.3945312] 1   2 
[-1.6533203  -2.0566406  -0.36010742  1.3574219   2.5214844 ] 4   3 
[-2.2265625  -1.5224609   0.19726562  2.1699219   1.1269531 ] 3   4 
[ 1.3798828   1.9335938   0.35839844 -1.1679688  -2.0273438 ] 1   0 
[-1.7294922  -0.16442871  0.86279297  1.4667969  -0.7055664 ] 3   2 
[-2.3398438  -1.8173828   0.06359863  2.4042969   1.6435547 ] 3   3 Match 989

[ 1.2949219   1.8740234   0.41918945 -1.125      -2.0390625 ] 1   1 Match 990

[ 1.4091797   2.0136719   0.38330078 -1.3349609  -2.125     ] 1   1 Match 991

[-0.12939453  0.9536133   0.8989258   0.2142334  -1.1367188 ] 1   1 Match 992

[ 0.81884766  1.4990234   0.72265625 -0.6904297  -1.6171875 ] 1   2 
[-1.8798828  -2.1425781  -0.41479492  1.7021484   2.6054688 ] 4   4 Match 993

[-2.3457031  -1.6708984   0.11810303  2.1386719   1.2099609 ] 3   4 
[-1.8779297  -0.76171875  0.62841797  1.9921875  -0.040374

[ 1.609375    1.5878906   0.17358398 -1.4169922  -1.8681641 ] 0   0 Match 1059

[ 0.02931213  1.2255859   1.0693359  -0.53222656 -1.4804688 ] 1   0 
[-1.6777344  -2.0097656  -0.51171875  1.3662109   2.578125  ] 4   4 Match 1060

[-2.0097656  -2.2265625  -0.33276367  1.8857422   2.6914062 ] 4   4 Match 1061

[-1.4179688  -0.29614258  0.87158203  1.3847656  -0.52734375] 3   1 
[-1.5322266  -0.35229492  0.78564453  1.2695312  -0.32739258] 3   3 Match 1062

[-1.8730469 -2.1875    -0.4411621  1.640625   2.7441406] 4   4 Match 1063

[-0.54248047  0.7290039   1.0732422   0.5        -1.2070312 ] 2   2 Match 1064

[ 0.66308594  1.7548828   0.6323242  -0.7338867  -1.9287109 ] 1   1 Match 1065

[-0.9819336   0.2163086   0.9916992   0.8901367  -0.88134766] 2   0 
[ 1.2607422  1.8007812  0.5419922 -0.9555664 -1.9277344] 1   0 
[ 1.1210938   1.8486328   0.546875   -0.85009766 -1.9150391 ] 1   0 
[-1.015625    0.49536133  0.96777344  0.90722656 -1.0195312 ] 2   2 Match 1066

[ 0.8618164   1.9257812  
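
The chain of comparisons in the cell above picks the index of the largest logit in each row. Assuming `TestModel_outputs` is a 2-D NumPy array of shape (examples, classes), the same result can be obtained in one call with `np.argmax`. The sketch below uses three rows copied from the printed output as stand-in data and checks both approaches against each other; `manual_argmax` is an illustrative helper name.

```python
import numpy as np

# Three logit rows copied from the output above (stand-in for TestModel_outputs)
sample_outputs = np.array([
    [ 1.0849609,  1.6826172,  0.48413086, -0.8408203, -1.7958984],
    [-1.828125,  -2.09375,   -0.47265625,  1.6298828,  2.6367188],
    [ 1.7792969,  1.7304688,  0.19335938, -1.5390625, -1.9580078],
])

def manual_argmax(outputs):
    """Comparison chain as in the notebook: keeps the first largest value."""
    result = 0
    for i in range(1, len(outputs)):
        if outputs[result] < outputs[i]:
            result = i
    return result

manual = [manual_argmax(row) for row in sample_outputs]
vectorized = np.argmax(sample_outputs, axis=1).tolist()

print(manual, vectorized)
```

On the full array this would read `Pred = np.argmax(TestModel_outputs, axis=1).tolist()`. Both forms return the first index when two logits tie.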

In [95]:
from sklearn import metrics
print(metrics.confusion_matrix(Targets,Pred))

[[ 42 209  16  12   0]
 [ 28 488  62  51   4]
 [  3 198  70 108  10]
 [  1  48  51 293 117]
 [  0   9  18 149 223]]


In [96]:
target_names = ['Very Neg', 'Negative', 'Neutral','Positive','Very Pos']
print(metrics.classification_report(Targets, Pred,target_names =target_names))

              precision    recall  f1-score   support

    Very Neg       0.57      0.15      0.24       279
    Negative       0.51      0.77      0.62       633
     Neutral       0.32      0.18      0.23       389
    Positive       0.48      0.57      0.52       510
    Very Pos       0.63      0.56      0.59       399

    accuracy                           0.50      2210
   macro avg       0.50      0.45      0.44      2210
weighted avg       0.50      0.50      0.47      2210



In [97]:
Fold_Predictions=pd.DataFrame(Pred, columns=['Pred3'] )
Fold_Predictions

Unnamed: 0,Pred3
0,1
1,1
2,1
3,4
4,1
...,...
2205,4
2206,1
2207,3
2208,4


In [98]:
Fold_Predictions.to_excel(output_folder+'/Saves/fold3_Predictions.xls')
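
The per-fold columns (`Pred2`, `Pred3`, ...) can be placed side by side to see how often two folds agree on the same test sentence. The sketch below uses only the rows visible in the two truncated tables above as stand-in data, so the resulting figure is not the agreement over the full test set; `all_folds` is an illustrative name.

```python
import pandas as pd

# Rows visible in the truncated displays of the fold 2 and fold 3 predictions
visible_rows = [0, 1, 2, 3, 4, 2205, 2206, 2207, 2208]
all_folds = pd.DataFrame(
    {
        "Pred2": [1, 1, 1, 4, 1, 3, 1, 4, 4],
        "Pred3": [1, 1, 1, 4, 1, 4, 1, 3, 4],
    },
    index=visible_rows,
)

# Fraction of rows on which both folds predict the same class
agreement = (all_folds["Pred2"] == all_folds["Pred3"]).mean()
print(agreement)
```

With the complete prediction lists of every fold in one frame, the same comparison gives a direct measure of how much the choice of fold changes individual predictions.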

In [99]:
#clearing GPU cache

del(model)
del(TrainResult, TrainModel_outputs, EvalResult, EvalModel_outputs, TestResult, TestModel_outputs, wrong_predictions)
torch.cuda.empty_cache()

## Fold 4: training & capturing predictions

In [100]:
fold_number='4'

train=pd.read_excel('./folds/train_fold'+fold_number+'.xls')
Eval=pd.read_excel('./folds/valid'+fold_number+'.xls') #evaluation set


In [101]:

output_folder='./folds/fold'+fold_number+'/'+model_class+'/'+model_version+"/"
cache_directory= "./folds/fold"+fold_number+'/'+model_class+"/"+model_version+"/cache/"


print('model variables were set up: ')

 
save_every_steps=1285
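# This threshold was picked assuming a training batch size of 8:
# any number above 1284 saves the model only at every epoch.
# With train_batch_size=64 (set below) an epoch is only 107 steps, so the same holds here.
# Saving the model mid-training very often will consume disk space fast.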

train_args={
    "output_dir":output_folder,
    "cache_dir":cache_directory,
    'reprocess_input_data': True,
    'overwrite_output_dir': True,
    'num_train_epochs': 2,
    "save_steps": save_every_steps, 
    "learning_rate": 2e-5,
    "train_batch_size": 64,
    "eval_batch_size": 16,
    "evaluate_during_training_steps": 312,
    "max_seq_length": 100,
    "n_gpu": 1,
}

# Create a ClassificationModel
model = ClassificationModel(model_class, model_version, num_labels=labels_count, args=train_args) 

model variables were set up: 


In [102]:
# loading the checkpoint that gave the best result
'''
CheckPoint='checkpoint-127-epoch-3' 


preSavedCheckpoint=output_folder+CheckPoint

print('Loading model, please wait...')
model = ClassificationModel( model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args) 
print('model in use is :', preSavedCheckpoint )
 '''


In [103]:
# Train the model
current_time = datetime.now()
model.train_model(train)
print("Training time: ", datetime.now() - current_time, 'at: ' ,datetime.now())

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6829.0), HTML(value='')))


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


HBox(children=(FloatProgress(value=0.0, description='Epoch', max=2.0, style=ProgressStyle(description_width='i…

HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.320088
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 1.180273


HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.076541

Training of bert model complete. Saved to ./folds/fold4/bert/bert-base-cased/.
Training time:  0:05:01.216839 at:  2020-04-27 13:49:58.465705


In [104]:
TrainResult, TrainModel_outputs, wrong_predictions = model.eval_model(train, acc=sklearn.metrics.accuracy_score)
 
EvalResult, EvalModel_outputs, wrong_predictions = model.eval_model(Eval, acc=sklearn.metrics.accuracy_score)

TestResult, TestModel_outputs, wrong_predictions = model.eval_model(test, acc=sklearn.metrics.accuracy_score)

print('Training Result:', TrainResult['acc'])
#print('Model Out:', TrainModel_outputs)

print('Eval Result:', EvalResult['acc'])
#print('Model Out:', EvalModel_outputs)

print('Test Set Result:', TestResult['acc'])
#print('Model Out:', TestModel_outputs)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6829.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=427.0), HTML(value='')))


{'mcc': 0.46045301729440213, 'acc': 0.5765119343974228, 'eval_loss': 0.9898124037079287}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=1705.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=107.0), HTML(value='')))


{'mcc': 0.3459723336490942, 'acc': 0.4950146627565982, 'eval_loss': 1.1550438465359054}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=2210.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=139.0), HTML(value='')))


{'mcc': 0.3515925973287839, 'acc': 0.497737556561086, 'eval_loss': 1.138335683362947}
Training Result: 0.5765119343974228
Eval Result: 0.4950146627565982
Test Set Result: 0.497737556561086
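
Since the goal of the notebook is to see how much the test result moves between folds, the per-fold test accuracies can be summarised by their mean and spread. This is a minimal sketch using only the fold 3 and fold 4 test accuracies printed above; the dictionary name is illustrative, and the remaining folds would be added in the same way.

```python
import statistics

# Test-set accuracies copied from the outputs of folds 3 and 4 above
test_acc_by_fold = {
    3: 0.5049773755656108,
    4: 0.497737556561086,
}

values = list(test_acc_by_fold.values())
mean_acc = statistics.mean(values)
spread = max(values) - min(values)    # gap between best and worst fold
std_acc = statistics.stdev(values)    # sample standard deviation (needs >= 2 folds)

print(f"mean={mean_acc:.4f} range={spread:.4f} std={std_acc:.4f}")
```

With only two folds the standard deviation is a rough indication at best; it becomes more informative once all folds are included.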


In [105]:
Pred=[]
Targets=[]

countCorrect=0

for row in range(TestModel_outputs.shape[0]):
    outputs=TestModel_outputs[row]
    #print(test.iloc[row,0])
    print(outputs, end=' ')
    
    result=0
    if outputs[0]<outputs[1]:result=1
    if outputs[result]<outputs[2]:result=2
    if outputs[result]<outputs[3]:result=3
    if outputs[result]<outputs[4]:result=4
    
    Pred.append(result)
    Targets.append(test.iloc[row,1])
    print(result, ' ',test.iloc[row,1], end=' ')
    if result==test.iloc[row,1]:
        countCorrect+=1
        print('Match',countCorrect)
    print('')

print(countCorrect)

[ 0.44091797  1.0117188  -0.25073242 -1.4580078  -2.3398438 ] 1   1 Match 1

[ 0.33740234  1.3916016  -0.06057739 -1.6210938  -2.3417969 ] 1   0 
[ 0.16308594  1.2910156   0.31958008 -1.3945312  -2.2070312 ] 1   2 
[-1.6630859  -1.3740234  -0.20373535  2.5195312   2.7714844 ] 4   4 Match 2

[ 0.04568481  0.7211914   0.25463867 -0.890625   -2.1640625 ] 1   0 
[-1.9335938  -1.203125    0.17492676  2.8261719   2.0175781 ] 3   3 Match 3

[ 0.6401367   0.9995117  -0.47998047 -1.4667969  -2.21875   ] 1   0 
[-1.9541016  -1.2480469   0.01133728  2.6640625   2.1113281 ] 3   4 
[ 0.82177734  1.1396484  -0.27124023 -1.8925781  -2.3652344 ] 1   0 
[ 0.76220703  1.1787109  -0.3251953  -1.7509766  -2.359375  ] 1   0 
[ 5.2197266e-01  1.2226562e+00  2.2029877e-03 -1.5371094e+00
 -2.2167969e+00] 1   2 
[-1.7197266  -0.9350586   0.58691406  2.6855469   1.3701172 ] 3   4 
[-1.0292969   0.32543945  0.9501953   0.5029297  -1.2548828 ] 2   1 
[ 0.54003906  1.2294922  -0.10168457 -1.7539062  -2.2207031 ] 1

[-1.1376953   0.08276367  1.0820312   1.2265625  -0.87109375] 3   2 
[-0.02249146  1.0458984   0.32348633 -1.0556641  -2.1191406 ] 1   0 
[-1.8242188 -1.1992188  0.2331543  1.7041016  0.8364258] 3   3 Match 58

[-1.6386719 -1.5634766 -0.3894043  1.7597656  3.0898438] 4   4 Match 59

[ 0.05334473  1.1787109   0.4543457  -1.1064453  -2.1972656 ] 1   1 Match 60

[-1.6689453  -1.0820312   0.22875977  2.8085938   1.9365234 ] 3   3 Match 61

[-1.6191406  -0.80322266  0.59716797  2.4550781   0.8828125 ] 3   3 Match 62

[-1.6162109  -0.69384766  0.7421875   2.1933594   0.7451172 ] 3   4 
[-0.45654297  0.5673828   0.7246094  -0.21569824 -1.6953125 ] 2   0 
[ 0.36035156  1.2949219  -0.01707458 -1.5732422  -2.1933594 ] 1   1 Match 63

[ 0.4013672   1.3496094  -0.09686279 -1.6181641  -2.2226562 ] 1   1 Match 64

[ 0.6074219   1.3564453  -0.23669434 -1.78125    -2.1523438 ] 1   0 
[ 0.28881836  1.3349609  -0.12304688 -1.6611328  -1.9814453 ] 1   3 
[-0.17565918  1.1835938   0.70410156 -0.7714844  -

[ 0.22753906  1.1582031   0.15527344 -1.4160156  -2.3652344 ] 1   2 
[-0.79052734  0.12237549  0.7368164   0.54003906 -1.3037109 ] 2   1 
[-0.7036133  0.6074219  1.0263672  0.3972168 -1.4335938] 2   1 
[ 0.27368164  0.8544922   0.01924133 -1.3720703  -2.046875  ] 1   1 Match 131

[-1.484375   -0.48486328  0.8330078   1.3710938  -0.4272461 ] 3   4 
[ 0.54833984  1.2460938  -0.04681396 -1.7050781  -2.2675781 ] 1   1 Match 132

[-0.8339844  0.5029297  0.9892578  0.4645996 -1.3955078] 2   2 Match 133

[ 0.17749023  1.2080078   0.12310791 -1.4638672  -2.2089844 ] 1   2 
[ 0.5371094   1.3085938  -0.19519043 -1.6943359  -2.3085938 ] 1   1 Match 134

[ 0.16503906  1.3896484   0.3479004  -1.5478516  -1.9433594 ] 1   3 
[-1.4091797   0.16455078  1.2109375   1.0087891  -0.8466797 ] 2   3 
[-1.2460938   0.14880371  1.0771484   0.91845703 -1.0527344 ] 2   3 
[-0.02496338  1.3076172   0.41625977 -1.3476562  -2.0117188 ] 1   1 Match 135

[-1.4306641 -0.5317383  0.5073242  1.2529297 -0.2479248] 3   3 

[-1.2099609   0.06414795  1.2099609   1.4462891  -0.7402344 ] 3   1 
[-1.6650391 -1.0292969  0.3334961  2.7070312  1.7353516] 3   4 
[-0.7709961   0.26367188  0.8564453   0.46362305 -1.4179688 ] 2   2 Match 202

[-1.9199219  -1.3740234  -0.09320068  2.6054688   2.4140625 ] 3   2 
[ 0.35595703  1.1689453  -0.09625244 -1.6689453  -2.1992188 ] 1   1 Match 203

[ 0.13305664  1.1875      0.26904297 -1.3261719  -2.0996094 ] 1   2 
[-1.2148438   0.27075195  0.84375     0.23925781 -1.0195312 ] 2   4 
[-1.6650391  -1.515625   -0.16662598  2.4316406   2.9570312 ] 4   4 Match 204

[-0.06106567  1.2275391   0.5522461  -0.92822266 -1.828125  ] 1   2 
[ 0.5415039   1.1582031  -0.13671875 -1.6289062  -2.3691406 ] 1   1 Match 205

[-1.8896484  -1.5107422  -0.31274414  2.3984375   2.9199219 ] 4   3 
[-1.5527344  -1.5400391  -0.24499512  2.2109375   3.1367188 ] 4   4 Match 206

[ 0.3552246   1.0605469   0.06628418 -1.3037109  -2.3300781 ] 1   2 
[-1.5644531  -0.3552246   0.80615234  1.0419922  -0.428710

[ 0.37548828  1.1396484  -0.06677246 -1.4833984  -2.2480469 ] 1   1 Match 268

[-1.6943359  -0.85595703  0.48486328  2.6328125   1.3486328 ] 3   3 Match 269

[-0.71728516  0.30395508  0.63671875  0.04037476 -1.1708984 ] 2   1 
[-1.28125    -0.00631332  1.0244141   1.0576172  -0.7685547 ] 3   2 
[-1.4287109  -1.5390625  -0.13366699  1.0507812   1.9345703 ] 4   1 
[-1.5761719  -1.2744141  -0.19494629  1.3427734   2.3359375 ] 4   3 
[ 0.72216797  1.1328125  -0.30371094 -1.859375   -2.28125   ] 1   1 Match 270

[-0.75634766  0.43652344  1.0488281   1.1171875  -0.73583984] 3   0 
[ 0.41308594  0.7705078  -0.04837036 -1.25       -2.0722656 ] 1   1 Match 271

[-1.8671875  -0.74609375  0.515625    2.6621094   1.3388672 ] 3   4 
[-1.5390625  -0.34838867  1.0693359   1.9453125  -0.23217773] 3   3 Match 272

[-0.06817627  1.2744141   0.50439453 -1.1464844  -2.0585938 ] 1   1 Match 273

[-1.6962891 -0.6660156  0.8652344  2.5136719  0.5366211] 3   3 Match 274

[ 0.04632568  1.109375    0.39648438 -

[ 0.14086914  1.1689453   0.1303711  -1.4160156  -2.2167969 ] 1   2 
[-1.7695312  -0.93652344  0.67041016  2.0292969   0.48413086] 3   3 Match 339

[-0.1463623   1.0458984   0.44580078 -0.84765625 -2.1601562 ] 1   3 
[-1.3496094 -1.4042969 -0.3215332  1.5488281  3.0585938] 4   4 Match 340

[ 0.01333618  0.89453125  0.31176758 -1.0214844  -2.0234375 ] 1   0 
[-1.5986328 -0.671875   0.6621094  2.5761719  0.8613281] 3   2 
[-1.5644531  -0.75390625  0.5683594   2.5566406   1.2578125 ] 3   2 
[ 0.6035156   1.1474609  -0.25024414 -1.8291016  -2.4042969 ] 1   0 
[-1.9160156  -1.2744141  -0.24536133  2.5996094   2.4882812 ] 3   4 
[-1.6357422  -1.4755859  -0.20166016  2.2304688   3.0195312 ] 4   3 
[ 0.1194458   0.84228516  0.2607422  -1.1601562  -2.0078125 ] 1   0 
[-1.6005859  -1.0908203   0.24157715  2.7851562   2.        ] 3   3 Match 341

[ 0.34423828  1.2548828  -0.05776978 -1.6464844  -2.3046875 ] 1   1 Match 342

[-1.7148438 -1.0429688  0.5332031  2.1132812  0.9272461] 3   0 
[-0.25512

[-1.296875    0.32739258  1.0908203   1.1855469  -0.70996094] 3   1 
[-1.6152344  -0.70166016  0.5854492   1.4384766   0.2331543 ] 3   1 
[-0.44970703  0.4975586   0.16296387 -0.30688477 -1.6064453 ] 1   2 
[-0.02763367  1.0898438   0.3527832  -1.2578125  -2.2109375 ] 1   1 Match 408

[-0.8232422   0.4506836   0.88623047  0.35742188 -1.4404297 ] 2   1 
[-1.8007812  -1.5712891  -0.37060547  2.0917969   3.1191406 ] 4   3 
[-1.4785156  -0.30151367  1.0585938   1.78125    -0.3779297 ] 3   3 Match 409

[-1.8476562  -1.5478516  -0.08624268  2.2890625   2.5683594 ] 4   3 
[ 0.68066406  1.15625    -0.23986816 -1.7412109  -2.3574219 ] 1   0 
[ 0.4975586   1.2822266  -0.05764771 -1.7773438  -2.3359375 ] 1   2 
[ 0.15673828  1.2529297   0.2878418  -1.3662109  -2.2617188 ] 1   1 Match 410

[-1.5205078  -0.7368164   0.7446289   2.0253906   0.11260986] 3   2 
[ 0.2541504  1.3310547  0.105896  -1.4140625 -2.2363281] 1   1 Match 411

[-0.79296875  0.23388672  0.7475586   0.38989258 -1.3085938 ] 2   3 

[ 0.41357422  1.2773438  -0.04559326 -1.6875     -2.1933594 ] 1   1 Match 478

[-1.8623047  -0.73535156  0.7919922   2.5429688   0.77978516] 3   3 Match 479

[-1.3876953  -0.48535156  0.8544922   2.1542969   0.3569336 ] 3   3 Match 480

[-1.6660156  -0.75878906  0.71484375  2.6367188   0.98583984] 3   2 
[-1.2304688  -0.27514648  1.0400391   2.1074219   0.04006958] 3   3 Match 481

[ 0.7783203   1.2392578  -0.35083008 -1.7929688  -2.2753906 ] 1   1 Match 482

[-1.9521484  -1.4296875  -0.24731445  2.5703125   2.8046875 ] 4   3 
[-1.5224609  -0.64990234  0.67822266  1.7099609  -0.06781006] 3   3 Match 483

[ 0.31225586  1.0478516   0.16015625 -1.3457031  -2.2441406 ] 1   0 
[-0.8701172   0.21008301  0.90283203  0.25048828 -1.2197266 ] 2   2 Match 484

[ 0.1821289   0.8120117   0.23962402 -0.9560547  -2.0214844 ] 1   3 
[-1.8154297  -0.9746094   0.52734375  2.6816406   1.3125    ] 3   3 Match 485

[ 0.78515625  1.1611328  -0.40698242 -1.8417969  -2.3046875 ] 1   0 
[ 0.02758789  1.0078125

[ 0.05493164  1.0429688   0.37329102 -1.0537109  -2.0917969 ] 1   1 Match 548

[-0.37402344  0.55566406  0.6508789  -0.44726562 -1.8027344 ] 2   2 Match 549

[-0.7368164   0.7885742   0.86279297 -0.15332031 -1.7470703 ] 2   2 Match 550

[-1.7138672  -1.5419922  -0.26782227  2.0839844   2.9511719 ] 4   3 
[-1.5917969  -0.5493164   0.93408203  2.34375     0.45166016] 3   4 
[ 0.22290039  0.8066406  -0.056427   -1.3378906  -1.9707031 ] 1   1 Match 551

[-1.6347656  -0.37597656  1.0322266   1.8076172  -0.36889648] 3   3 Match 552

[ 0.52685547  1.15625    -0.04019165 -1.6044922  -2.2421875 ] 1   1 Match 553

[-0.19665527  1.0136719   0.7348633  -0.88720703 -2.046875  ] 1   2 
[-1.5527344 -1.5654297 -0.2680664  1.2734375  2.4472656] 4   4 Match 554

[-0.8432617   0.33081055  0.8955078   0.81884766 -1.3398438 ] 2   1 
[-1.1972656  -0.47680664  0.13452148  0.37963867 -0.33129883] 3   2 
[-0.00490952  1.2783203   0.33569336 -1.2460938  -2.1972656 ] 1   1 Match 555

[-1.6162109  -1.4101562  -0.

[ 0.85595703  1.1796875  -0.37646484 -1.9091797  -2.3066406 ] 1   1 Match 622

[ 0.20495605  1.1171875   0.17272949 -1.2226562  -2.203125  ] 1   1 Match 623

[-0.7661133   0.11279297  0.44726562  0.24963379 -1.2255859 ] 2   2 Match 624

[-0.12719727  1.1435547   0.48120117 -0.96875    -2.046875  ] 1   1 Match 625

[ 0.17456055  0.9633789   0.25463867 -1.3681641  -2.1875    ] 1   0 
[-0.5698242   0.7421875   0.67041016 -0.3154297  -1.7480469 ] 1   2 
[-1.6396484 -1.03125    0.7265625  2.7246094  1.1044922] 3   3 Match 626

[-1.8222656  -1.3154297   0.10992432  2.7382812   2.2050781 ] 3   4 
[ 0.3540039   1.0576172   0.08898926 -1.3710938  -2.1152344 ] 1   2 
[-1.8320312  -0.72558594  0.7207031   2.0175781   0.37939453] 3   3 Match 627

[-1.8945312  -1.3896484  -0.01085663  1.5488281   1.6123047 ] 4   4 Match 628

[ 0.4453125   1.1640625  -0.14038086 -1.6357422  -2.34375   ] 1   1 Match 629

[-1.90625    -1.4628906  -0.27148438  2.3378906   2.9199219 ] 4   4 Match 630

[-1.1191406   0.35


[-1.7871094  -1.0078125   0.5029297   1.7353516   0.47216797] 3   2 
[ 0.16870117  0.9355469   0.12731934 -1.3164062  -2.2402344 ] 1   1 Match 704

[-1.9042969  -1.5761719  -0.14819336  2.03125     2.4707031 ] 4   3 
[-1.7402344  -1.53125    -0.28295898  2.4082031   3.0722656 ] 4   4 Match 705

[ 0.20935059  1.2041016   0.26757812 -1.15625    -2.21875   ] 1   0 
[-0.36108398  1.1054688   0.8071289  -0.57714844 -1.7402344 ] 1   2 
[ 0.7578125   1.2207031  -0.39648438 -1.8212891  -2.3203125 ] 1   0 
[ 0.14233398  0.94384766  0.21362305 -1.0664062  -2.1875    ] 1   3 
[-1.9052734  -1.3261719  -0.12695312  2.6953125   2.5703125 ] 3   3 Match 706

[ 0.45458984  1.2177734  -0.04708862 -1.4521484  -2.1484375 ] 1   1 Match 707

[-1.3525391  -0.14709473  0.9423828   1.4404297  -0.6665039 ] 3   0 
[ 0.61279297  1.2783203  -0.19238281 -1.8388672  -2.3417969 ] 1   0 
[-1.7382812 -1.0703125  0.3984375  2.7246094  1.5224609] 3   3 Match 708

[ 0.20861816  0.99560547  0.14782715 -1.1113281  -2.20312

[-0.26245117  0.9995117   0.43530273 -0.9633789  -1.8271484 ] 1   4 
[-1.4775391  -1.4287109  -0.17749023  2.0664062   3.171875  ] 4   3 
[-1.5068359  -0.68896484  0.7011719   2.3652344   1.0488281 ] 3   3 Match 773

[-0.7402344   0.14111328  0.859375    0.31152344 -1.3876953 ] 2   2 Match 774

[-1.5966797  -1.3974609  -0.35058594  1.7089844   2.9785156 ] 4   4 Match 775

[ 0.7026367   0.9746094  -0.30859375 -1.7714844  -2.25      ] 1   1 Match 776

[-0.7084961   0.42529297  0.7714844   0.6357422  -1.0664062 ] 2   0 
[ 0.04888916  0.99121094  0.17053223 -1.1337891  -2.        ] 1   0 
[-1.4433594  -0.34301758  1.0332031   1.9433594  -0.13647461] 3   2 
[ 0.6791992   1.1435547  -0.22351074 -1.7177734  -2.3417969 ] 1   0 
[-0.33789062  0.6347656   0.60253906 -0.234375   -1.7119141 ] 1   0 
[ 0.36914062  0.95751953  0.08013916 -1.2958984  -2.2421875 ] 1   1 Match 777

[-1.8134766 -1.4160156 -0.3347168  2.3945312  3.0761719] 4   4 Match 778

[-1.9863281  -1.1757812   0.14147949  2.6777344 

[-2.0390625 -1.5068359 -0.3791504  2.4433594  2.8398438] 4   3 
[-0.7998047   0.55126953  1.0595703   0.41430664 -1.4189453 ] 2   3 
[-0.39941406  0.93652344  0.52197266 -0.7729492  -2.0644531 ] 1   1 Match 844

[-1.5634766  -0.54589844  0.6484375   1.4326172  -0.17822266] 3   4 
[-2.0117188  -1.2236328   0.02787781  2.6113281   2.1074219 ] 3   4 
[ 0.19470215  1.2734375   0.03585815 -1.3974609  -2.1953125 ] 1   0 
[ 0.58740234  1.0175781  -0.19519043 -1.546875   -2.2050781 ] 1   1 Match 845

[ 0.4897461   1.1884766  -0.08807373 -1.5351562  -2.1953125 ] 1   0 
[-1.5859375  -0.8828125   0.57470703  2.6835938   1.4609375 ] 3   3 Match 846

[-2.0019531  -1.25        0.38134766  2.5097656   1.3564453 ] 3   3 Match 847

[ 0.5751953   0.82421875 -0.21154785 -1.5322266  -2.0644531 ] 1   1 Match 848

[-1.4814453  -0.09570312  0.90185547  1.2460938  -0.7109375 ] 3   1 
[ 0.6220703   1.3144531  -0.16381836 -1.8251953  -2.2734375 ] 1   1 Match 849

[-1.9921875  -1.0361328   0.33642578  2.5175781 

[-0.5048828   0.76171875  0.98291016  0.22607422 -1.4638672 ] 2   3 
[-0.23376465  0.63964844  0.4831543  -0.63964844 -1.7294922 ] 1   1 Match 918

[ 0.09552002  0.8574219   0.19873047 -1.0800781  -2.2597656 ] 1   1 Match 919

[ 0.07025146  0.5463867   0.0567627  -1.0869141  -1.9580078 ] 1   0 
[-1.4697266  -0.45092773  0.7348633   2.4414062   0.90283203] 3   4 
[ 0.484375    1.3984375  -0.02229309 -1.6855469  -2.0214844 ] 1   1 Match 920

[ 0.6118164   1.1503906  -0.22888184 -1.6835938  -2.3925781 ] 1   0 
[-1.6376953  -1.1337891   0.19250488  2.8125      2.0175781 ] 3   4 
[ 0.46948242  1.0488281  -0.18273926 -1.5488281  -2.3535156 ] 1   1 Match 921

[-0.58740234  0.7753906   0.7451172  -0.36572266 -1.8984375 ] 1   0 
[-0.38085938  0.9326172   0.6801758  -0.33398438 -1.9746094 ] 1   2 
[-1.7001953  -0.6850586   0.75927734  1.7783203  -0.03930664] 3   3 Match 922

[-2.0253906 -1.0683594  0.6401367  2.4375     1.1953125] 3   4 
[ 0.7988281  1.2734375 -0.2088623 -1.8193359 -2.2753906] 1

[-0.08056641  0.8154297   0.4206543  -0.76953125 -1.7685547 ] 1   1 Match 990

[-0.88134766  0.30908203  1.0019531   0.59228516 -1.2451172 ] 2   3 
[-1.6523438  -0.50341797  0.9433594   2.1269531  -0.08074951] 3   3 Match 991

[-1.8310547  -0.97998047  0.3569336   2.6875      1.5478516 ] 3   3 Match 992

[ 0.60009766  1.2958984  -0.12585449 -1.6914062  -2.2773438 ] 1   0 
[ 0.5317383   1.265625   -0.06744385 -1.7861328  -2.3789062 ] 1   1 Match 993

[ 0.5566406   1.1435547  -0.12359619 -1.6503906  -2.2167969 ] 1   0 
[-0.95166016  0.5361328   0.98291016  0.5527344  -1.3662109 ] 2   3 
[ 0.41479492  1.2246094   0.06054688 -1.5986328  -2.2382812 ] 1   1 Match 994

[-1.7324219  -1.03125     0.42358398  2.7109375   1.7900391 ] 3   3 Match 995

[-0.03509521  1.1992188   0.3059082  -1.1953125  -2.2167969 ] 1   1 Match 996

[-1.6855469 -1.546875  -0.4650879  1.7792969  3.1464844] 4   4 Match 997

[-1.8984375  -1.3574219   0.00604248  2.578125    2.5839844 ] 4   2 
[-0.40454102  0.8691406   0.

[-1.7822266  -0.89941406  0.6118164   2.5429688   0.92578125] 3   3 Match 1065

[-1.6835938  -0.7602539   0.82470703  2.3085938   0.67578125] 3   4 
[-1.7382812  -1.2119141   0.08197021  2.6191406   2.3730469 ] 3   3 Match 1066

[-1.5878906  -0.16967773  1.0019531   1.6005859  -0.1484375 ] 3   2 
[-1.7802734  -1.1679688   0.44726562  2.6171875   1.1904297 ] 3   3 Match 1067

[ 0.5942383   1.2314453  -0.23266602 -1.6982422  -2.2929688 ] 1   1 Match 1068

[-1.796875   -1.1220703   0.21276855  2.6464844   1.859375  ] 3   4 
[-1.0205078   0.39526367  0.8432617   0.27514648 -1.1650391 ] 2   1 
[-1.8515625  -1.5400391  -0.31347656  2.515625    2.8671875 ] 4   4 Match 1069

[-1.578125   -0.38085938  0.8623047   2.3769531   0.19921875] 3   4 
[-1.7626953  -1.0117188   0.34057617  2.6875      1.9677734 ] 3   3 Match 1070

[-1.6621094  -1.5087891  -0.43310547  1.7832031   3.0507812 ] 4   4 Match 1071

[-0.7988281  0.3918457  0.7451172  0.7050781 -1.3242188] 2   2 Match 1072

[-0.11437988  1.1318
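
Each row printed above shows the five raw model outputs for one test sentence, followed by the predicted label and the true label. As a minimal sketch, assuming these outputs are unnormalised logits, a softmax turns them into class probabilities and makes low-confidence predictions easier to spot. The three rows are copied from the output above; `softmax` is a helper written only for this illustration.

```python
import numpy as np

# Three rows copied from the fold 4 output above (assumed to be raw logits)
logits = np.array([
    [-1.6855469, -1.546875, -0.4650879, 1.7792969, 3.1464844],    # true label 4
    [-1.8984375, -1.3574219, 0.00604248, 2.578125, 2.5839844],    # true label 2
    [0.5942383, 1.2314453, -0.23266602, -1.6982422, -2.2929688],  # true label 1
])

def softmax(x):
    """Row-wise softmax; subtracting the row maximum keeps exp() numerically stable."""
    shifted = x - x.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

probs = softmax(logits)
preds = probs.argmax(axis=1)  # same result as taking the argmax of the logits

for p, row in zip(preds, probs):
    print(p, np.round(row, 3))
```

In the second row the probabilities for classes 3 and 4 are almost equal, so the prediction of 4 is close to a coin flip between the two positive classes.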

In [106]:
from sklearn import metrics

# Rows are true labels, columns are predicted labels
print(metrics.confusion_matrix(Targets,Pred))

[[  5 241  16  17   0]
 [  1 486  79  65   2]
 [  0 180  74 129   6]
 [  0  33  45 363  69]
 [  0   6  14 207 172]]
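
With the scikit-learn convention that rows are true labels and columns are predicted labels, per-class recall and precision can be read directly from this matrix. A minimal sketch using the values printed above:

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted)
cm = np.array([
    [5, 241, 16, 17, 0],
    [1, 486, 79, 65, 2],
    [0, 180, 74, 129, 6],
    [0, 33, 45, 363, 69],
    [0, 6, 14, 207, 172],
])

correct = np.diag(cm)
recall = correct / cm.sum(axis=1)      # correct / number of true examples per class
precision = correct / cm.sum(axis=0)   # correct / number of predictions per class
accuracy = correct.sum() / cm.sum()

print(np.round(recall, 2))
print(np.round(precision, 2))
print(round(accuracy, 4))
```

Only 5 of the 279 'Very Neg' sentences are recovered, while 241 of them are predicted as 'Negative', which explains the very low recall for that class.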


In [107]:
target_names = ['Very Neg', 'Negative', 'Neutral','Positive','Very Pos']
print(metrics.classification_report(Targets, Pred,target_names =target_names))

              precision    recall  f1-score   support

    Very Neg       0.83      0.02      0.04       279
    Negative       0.51      0.77      0.62       633
     Neutral       0.32      0.19      0.24       389
    Positive       0.46      0.71      0.56       510
    Very Pos       0.69      0.43      0.53       399

    accuracy                           0.50      2210
   macro avg       0.57      0.42      0.40      2210
weighted avg       0.54      0.50      0.45      2210
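
The two averaged rows differ only in how the per-class scores are combined: the macro average is the plain mean over the five classes, while the weighted average weights each class by its support. A small sketch that recomputes the f1-score column from the rounded values in the report (so the last digit may differ slightly from scikit-learn's own unrounded computation):

```python
# f1-score and support per class, copied from the report above
f1 = [0.04, 0.62, 0.24, 0.56, 0.53]
support = [279, 633, 389, 510, 399]

macro_f1 = sum(f1) / len(f1)
weighted_f1 = sum(f * s for f, s in zip(f1, support)) / sum(support)

print(round(macro_f1, 2), round(weighted_f1, 2))
```

The macro average comes out lower than the weighted one because the weak 'Very Neg' and 'Neutral' classes count as much as the larger, better-predicted classes.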



In [108]:
Fold_Predictions=pd.DataFrame(Pred, columns=['Pred4'] )
Fold_Predictions

Unnamed: 0,Pred4
0,1
1,1
2,1
3,4
4,1
...,...
2205,3
2206,1
2207,3
2208,4


In [109]:
Fold_Predictions.to_excel(output_folder+'/Saves/fold4_Predictions.xls')

In [110]:
# Free the model and the stored outputs, then release cached GPU memory

del model
del TrainResult, TrainModel_outputs, EvalResult, EvalModel_outputs, TestResult, TestModel_outputs, wrong_predictions
torch.cuda.empty_cache()

## Fold 5: training & capturing predictions

In [111]:
fold_number='5'

train=pd.read_excel('./folds/train_fold'+fold_number+'.xls')
Eval=pd.read_excel('./folds/valid'+fold_number+'.xls') #evaluation set


In [112]:
 
output_folder='./folds/fold'+fold_number+'/'+model_class+'/'+model_version+"/"
cache_directory= "./folds/fold"+fold_number+'/'+model_class+"/"+model_version+"/cache/"


print('model variables were set up: ')

 
save_every_steps=1285
# This threshold was chosen assuming a training batch size of 8.
# With train_batch_size=64 (set below) the whole 2-epoch run has fewer steps than this,
# so no step-based checkpoints are written.
# Saving the model often during training consumes disk space quickly.

train_args={
    "output_dir":output_folder,
    "cache_dir":cache_directory,
    'reprocess_input_data': True,
    'overwrite_output_dir': True,
    'num_train_epochs': 2,
    "save_steps": save_every_steps,
    "learning_rate": 2e-5,
    "train_batch_size": 64,
    "eval_batch_size": 16,
    "evaluate_during_training_steps": 312,
    "max_seq_length": 100,
    "n_gpu": 1,
}

# Create a ClassificationModel
model = ClassificationModel(model_class, model_version, num_labels=labels_count, args=train_args) 

model variables were set up: 
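
To see how `save_steps` relates to this run, it helps to count the optimizer steps. A minimal sketch, assuming one step per batch (no gradient accumulation) and using the 6829 training examples that the progress bar reports in the training cell:

```python
import math

n_train = 6829          # training examples in this fold
train_batch_size = 64   # from train_args
num_train_epochs = 2    # from train_args
save_steps = 1285       # save_every_steps

steps_per_epoch = math.ceil(n_train / train_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(steps_per_epoch, total_steps)
print('step checkpoint reached:', total_steps >= save_steps)
```

The step count per epoch matches the `max=107.0` shown by the 'Current iteration' progress bar.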


In [113]:
# Loading the checkpoint that gave the best result (disabled; uncomment to use)
#
# CheckPoint='checkpoint-286-epoch-2'
# preSavedCheckpoint=output_folder+CheckPoint
#
# print('Loading model, please wait...')
# model = ClassificationModel(model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args)
# print('model in use is :', preSavedCheckpoint)

In [114]:
# Train the model
current_time = datetime.now()
model.train_model(train)
print("Training time: ", datetime.now() - current_time)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6829.0), HTML(value='')))


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


HBox(children=(FloatProgress(value=0.0, description='Epoch', max=2.0, style=ProgressStyle(description_width='i…

HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.272975Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 1.220581


HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=107.0, style=ProgressStyle(descri…

Running loss: 1.190074

Training of bert model complete. Saved to ./folds/fold5/bert/bert-base-cased/.
Training time:  0:05:07.726797


In [115]:
TrainResult, TrainModel_outputs, wrong_predictions = model.eval_model(train, acc=sklearn.metrics.accuracy_score)
 
EvalResult, EvalModel_outputs, wrong_predictions = model.eval_model(Eval, acc=sklearn.metrics.accuracy_score)

TestResult, TestModel_outputs, wrong_predictions = model.eval_model(test, acc=sklearn.metrics.accuracy_score)

print('Training Result:', TrainResult['acc'])
#print('Model Out:', TrainModel_outputs)

print('Eval Result:', EvalResult['acc'])
#print('Model Out:', EvalModel_outputs)

print('Test Set Result:', TestResult['acc'])
#print('Model Out:', TestModel_outputs)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=6829.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=427.0), HTML(value='')))


{'mcc': 0.46635581124869785, 'acc': 0.5825157416898521, 'eval_loss': 1.0197946434836198}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=1705.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=107.0), HTML(value='')))


{'mcc': 0.35504301558947843, 'acc': 0.5008797653958944, 'eval_loss': 1.1580638088912607}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=2210.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=139.0), HTML(value='')))


{'mcc': 0.3525443930824588, 'acc': 0.49638009049773757, 'eval_loss': 1.1444769956225114}
Training Result: 0.5825157416898521
Eval Result: 0.5008797653958944
Test Set Result: 0.49638009049773757
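
Since the notebook is about how the test result varies across folds, the per-fold test accuracies can be summarised with a mean and a standard deviation. The sketch below uses only the two folds whose numbers appear in this part of the notebook: fold 4 (1100 correct out of 2210, taken from its confusion matrix) and fold 5 (the test result above). The dictionary name `test_acc` is chosen for illustration; the remaining folds would be added the same way.

```python
import statistics

# Test accuracy per fold; only the folds visible in this part of the notebook
test_acc = {
    'fold4': 1100 / 2210,           # diagonal sum of the fold 4 confusion matrix
    'fold5': 0.49638009049773757,   # TestResult['acc'] for fold 5
}

values = list(test_acc.values())
mean_acc = statistics.mean(values)
std_acc = statistics.stdev(values)   # sample standard deviation

print(f'mean={mean_acc:.4f} std={std_acc:.4f} over {len(values)} folds')
```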


In [116]:
Pred=[]
Targets=[]

countCorrect=0

for row in range(TestModel_outputs.shape[0]):
    outputs=TestModel_outputs[row]
    #print(test.iloc[row,0])
    print(outputs, end=' ')

    # predicted class = index of the largest output (the first one wins on ties)
    result=int(np.argmax(outputs))
    target=test.iloc[row,1]

    Pred.append(result)
    Targets.append(target)
    print(result, ' ', target, end=' ')
    if result==target:
        countCorrect+=1
        print('Match',countCorrect)
    print('')

print(countCorrect)

[ 1.0253906   1.7871094   0.5493164  -0.98339844 -1.7402344 ] 1   1 Match 1

[ 0.5463867   1.8837891   0.85791016 -0.63964844 -1.7949219 ] 1   0 
[ 0.4074707   1.734375    0.9819336  -0.66308594 -1.8183594 ] 1   2 
[-1.0341797  -0.5859375   0.13867188  1.5283203   1.6621094 ] 4   4 Match 2

[ 0.97265625  1.4746094   0.5644531  -0.76416016 -1.484375  ] 1   0 
[-1.3613281  -0.8017578   0.19580078  1.8330078   1.3642578 ] 3   3 Match 3

[ 1.4697266   1.2636719   0.02351379 -1.5615234  -2.1972656 ] 0   0 Match 4

[-1.1835938  -0.67626953  0.10040283  1.6503906   1.5537109 ] 3   4 
[ 1.7539062   1.3261719  -0.05227661 -1.5605469  -2.1972656 ] 0   0 Match 5

[ 1.6591797   0.9189453  -0.10003662 -1.6630859  -2.2890625 ] 0   0 Match 6

[ 0.73535156  1.9785156   0.66845703 -0.9941406  -1.8867188 ] 1   2 
[-1.3759766 -0.7285156  0.2668457  1.8720703  1.1289062] 3   4 
[-0.71972656  0.9067383   1.2763672   0.66015625 -0.7841797 ] 2   1 
[ 1.3623047   1.1904297   0.07788086 -1.5078125  -2.0683594 

[ 1.3896484   1.6787109   0.26416016 -1.546875   -2.2636719 ] 1   2 
[-0.16442871  1.1875      1.2089844   0.24743652 -1.2304688 ] 2   3 
[ 1.0537109   1.7099609   0.60058594 -0.96533203 -1.8759766 ] 1   2 
[-0.46411133  0.6166992   1.0869141   0.6713867  -0.8911133 ] 2   2 Match 91

[-1.0712891   0.29492188  0.9916992   1.3789062  -0.20129395] 3   2 
[-1.4375     -0.7441406   0.27807617  1.8642578   1.1259766 ] 3   1 
[-0.6699219  1.3916016  1.125      0.1463623 -1.0644531] 1   1 Match 92

[ 0.9765625  1.9101562  0.5258789 -0.9301758 -1.7011719] 1   1 Match 93

[ 0.00745392  1.5732422   1.1103516  -0.11126709 -1.2363281 ] 1   2 
[-0.92089844 -0.5151367   0.18737793  1.4414062   1.7382812 ] 4   4 Match 94

[ 1.2753906   1.3671875   0.26245117 -1.0302734  -1.7011719 ] 1   2 
[-1.0419922  -0.5878906   0.15014648  1.5253906   1.6542969 ] 4   3 
[ 0.9423828   1.703125    0.64746094 -0.7661133  -1.8046875 ] 1   1 Match 95

[-1.4082031 -0.5341797  0.4958496  1.8525391  0.8828125] 3   3 Match

[-1.2373047  -0.7158203   0.20166016  1.8076172   1.3740234 ] 3   3 Match 152

[ 1.5673828   0.93066406 -0.04095459 -1.6982422  -2.2519531 ] 0   0 Match 153

[ 1.7050781   1.4023438   0.06134033 -1.6289062  -2.3027344 ] 0   0 Match 154

[-1.3935547 -0.3334961  0.9970703  1.8193359  0.2298584] 3   3 Match 155

[-1.46875    -0.79785156  0.38085938  1.9785156   1.0839844 ] 3   4 
[-1.3867188  -0.79248047  0.2565918   1.8388672   1.2685547 ] 3   3 Match 156

[-1.3125     -0.7602539   0.14868164  1.7910156   1.3535156 ] 3   2 
[ 1.3476562  1.8271484  0.2775879 -1.2158203 -2.1367188] 1   1 Match 157

[-1.1591797  -0.6386719   0.08557129  1.5947266   1.5751953 ] 3   4 
[ 1.8720703   1.2509766  -0.09991455 -1.6865234  -2.296875  ] 0   0 Match 158

[-1.0039062  -0.5488281   0.14892578  1.4902344   1.6953125 ] 4   4 Match 159

[ 0.90283203  1.7197266   0.8149414  -0.7651367  -1.7099609 ] 1   3 
[-0.97216797 -0.5395508   0.16796875  1.4746094   1.7119141 ] 4   3 
[ 1.4316406   1.6025391   0.21459


[ 0.35473633  1.8769531   1.0322266  -0.53564453 -1.5927734 ] 1   2 
[ 0.20690918  1.6611328   1.0712891  -0.39282227 -1.3759766 ] 1   0 
[ 1.3359375  1.8554688  0.53125   -1.1142578 -1.9628906] 1   1 Match 223

[-1.3896484  -0.46704102  0.82421875  1.7705078   0.47460938] 3   3 Match 224

[-1.2617188  -0.7578125   0.15881348  1.8271484   1.3701172 ] 3   4 
[ 1.3066406   1.7636719   0.46020508 -1.1435547  -1.9824219 ] 1   1 Match 225

[-0.15759277  1.4951172   1.1875     -0.09838867 -1.390625  ] 1   4 
[ 1.4648438   1.6142578   0.44799805 -1.0820312  -1.9658203 ] 1   3 
[ 1.1513672  1.8417969  0.4580078 -1.1689453 -2.1367188] 1   2 
[-1.3818359  -0.7529297   0.25585938  1.8916016   1.1455078 ] 3   3 Match 226

[ 1.3320312  1.8173828  0.3527832 -1.2402344 -2.3125   ] 1   2 
[-1.3554688  -0.7866211   0.25512695  1.8671875   1.3388672 ] 3   3 Match 227

[-0.7138672   1.3554688   1.1484375   0.14379883 -1.0576172 ] 1   2 
[ 0.57373047  1.9023438   0.8652344  -0.90771484 -1.9648438 ] 1   1

[ 1.5849609   1.4052734   0.21032715 -1.5810547  -2.3164062 ] 0   1 
[ 0.05889893  1.6796875   1.0253906  -0.10583496 -1.2998047 ] 1   0 
[ 1.3017578   1.78125     0.50390625 -1.0361328  -1.8964844 ] 1   1 Match 282

[-1.4951172  -0.42578125  0.64501953  1.6757812   0.57666016] 3   4 
[-1.4765625  -0.48095703  0.84228516  1.9013672   0.5703125 ] 3   3 Match 283

[ 0.01036835  1.8779297   0.9536133  -0.5395508  -1.5830078 ] 1   1 Match 284

[-1.5410156  -0.64990234  0.8173828   1.9833984   0.51123047] 3   3 Match 285

[ 0.06402588  1.7431641   1.0214844  -0.28051758 -1.4335938 ] 1   0 
[-1.1269531  -0.640625    0.07617188  1.6044922   1.5917969 ] 3   4 
[ 1.4755859   1.7441406   0.09576416 -1.5019531  -2.2558594 ] 1   1 Match 286

[ 0.6489258  1.7050781  0.9350586 -0.7001953 -1.7148438] 1   1 Match 287

[ 1.2021484  1.6582031  0.5410156 -0.9736328 -1.7519531] 1   0 
[ 0.2746582   1.7714844   1.1181641  -0.36694336 -1.4580078 ] 1   0 
[-0.7324219   1.1074219   1.1542969   0.40795898 -0.7

[-1.4189453  -0.7890625   0.37060547  1.9189453   1.1601562 ] 3   2 
[ 1.8388672   1.4628906   0.04974365 -1.5693359  -2.140625  ] 0   0 Match 354

[-1.1513672  -0.6323242   0.08056641  1.5751953   1.5751953 ] 3   4 
[-1.2451172  -0.73046875  0.1151123   1.7353516   1.4853516 ] 3   3 Match 355

[ 0.90283203  1.8769531   0.7294922  -0.7446289  -1.5224609 ] 1   0 
[-1.3710938 -0.8041992  0.1829834  1.9052734  1.2558594] 3   3 Match 356

[ 1.5625      1.7265625   0.20092773 -1.4345703  -2.1914062 ] 1   1 Match 357

[-1.2832031 -0.7285156  0.1439209  1.7539062  1.3291016] 3   0 
[ 0.44482422  1.6240234   1.0107422  -0.28833008 -1.6855469 ] 1   0 
[ 0.08526611  1.5273438   1.0039062  -0.16845703 -1.3173828 ] 1   1 Match 358

[ 0.18237305  1.6171875   1.1386719  -0.36157227 -1.7958984 ] 1   2 
[ 0.97998047  1.8681641   0.7636719  -0.8208008  -1.90625   ] 1   2 
[ 0.45532227  1.5214844   1.1035156  -0.35473633 -1.5087891 ] 1   2 
[ 1.5166016   1.296875    0.02671814 -1.5214844  -2.1464844 ] 0

[-1.1660156  -0.6435547   0.11138916  1.6582031   1.5224609 ] 3   3 Match 423

[ 1.8408203   1.1572266  -0.04345703 -1.7197266  -2.3554688 ] 0   0 Match 424

[ 0.98535156  1.9101562   0.5913086  -1.0478516  -1.9335938 ] 1   2 
[ 0.34716797  1.703125    1.0439453  -0.6586914  -1.8544922 ] 1   1 Match 425

[-1.3818359  -0.69628906  0.60498047  1.9501953   0.7192383 ] 3   2 
[ 1.4824219   1.7167969   0.22546387 -1.5419922  -2.4179688 ] 1   1 Match 426

[-0.5283203   1.0605469   1.1152344   0.47998047 -0.8833008 ] 2   3 
[-1.3173828  -0.78271484  0.24694824  1.8935547   1.2919922 ] 3   4 
[ 1.1982422  1.6845703  0.4116211 -1.2109375 -2.1367188] 1   2 
[ 0.27807617  1.3330078   1.0039062   0.0133667  -1.1230469 ] 1   3 
[-1.2939453 -0.7519531  0.3269043  1.8525391  1.2333984] 3   3 Match 427

[ 1.2646484  1.7431641  0.5292969 -1.3945312 -2.203125 ] 1   1 Match 428

[-1.0673828  -0.5839844   0.13989258  1.5166016   1.6396484 ] 4   3 
[-0.06964111  1.6435547   1.0263672  -0.22888184 -1.540039

[ 0.79296875  1.8134766   0.8330078  -0.6875     -1.6572266 ] 1   0 
[-0.81396484  0.81640625  1.28125     0.7285156  -0.90185547] 2   2 Match 498

[-1.0136719   0.15039062  0.7583008   1.2675781   0.31811523] 3   3 Match 499

[-1.4707031  -0.75341797  0.4790039   1.9628906   1.0615234 ] 3   3 Match 500

[ 1.6054688   0.8256836  -0.04318237 -1.7050781  -2.2246094 ] 0   0 Match 501

[ 0.9921875   1.9658203   0.77978516 -0.8808594  -2.0214844 ] 1   1 Match 502

[-1.1572266   0.61279297  1.2666016   0.8803711  -0.47216797] 2   3 
[ 0.3942871   1.4433594   1.0175781  -0.11199951 -1.3623047 ] 1   1 Match 503

[-1.046875   -0.6064453   0.11877441  1.5419922   1.6396484 ] 4   3 
[ 1.1494141  1.546875   0.5131836 -0.8510742 -1.6083984] 1   0 
[ 0.43579102  1.5         0.86035156 -0.35986328 -1.3134766 ] 1   2 
[ 1.6806641   1.3876953   0.11706543 -1.6103516  -2.2167969 ] 0   2 
[-1.4610291e-03  1.7412109e+00  9.5263672e-01 -2.2766113e-01
 -1.5341797e+00] 1   2 
[-1.2880859  -0.7338867   0.4016

[-1.265625   -0.7080078   0.09265137  1.6962891   1.4375    ] 3   3 Match 562

[-0.16662598  1.5634766   1.1025391  -0.28735352 -1.5146484 ] 1   2 
[-0.65625     0.9169922   1.1669922   0.47387695 -0.9946289 ] 2   1 
[-1.1113281  -0.6152344   0.10113525  1.5390625   1.6201172 ] 4   3 
[ 0.20678711  1.6806641   1.0371094  -0.21606445 -1.3886719 ] 1   2 
[ 0.20678711  1.8701172   0.8979492  -0.65283203 -1.8476562 ] 1   2 
[-1.0341797  -0.5786133   0.14111328  1.5595703   1.6337891 ] 4   2 
[-0.01594543  1.5380859   1.1621094  -0.11157227 -1.4160156 ] 1   3 
[-0.9526367   0.73291016  1.1298828   0.796875   -0.7753906 ] 2   3 
[ 1.1884766  1.7363281  0.5371094 -1.0107422 -1.9667969] 1   1 Match 563

[-0.9628906  -0.52978516  0.18383789  1.4599609   1.7148438 ] 4   4 Match 564

[-0.76123047  0.8520508   1.0351562   0.76171875 -0.8442383 ] 2   3 
[-1.4609375  -0.6074219   0.62109375  1.9082031   0.81884766] 3   1 
[-0.9345703   0.7246094   1.2089844   0.82910156 -0.79248047] 2   2 Match 565


[-0.9638672  -0.53564453  0.171875    1.4667969   1.7138672 ] 4   3 
[-1.1894531  -0.6586914   0.09649658  1.6894531   1.4980469 ] 3   3 Match 637

[-1.21875    -0.5263672   0.22814941  1.6679688   1.1660156 ] 3   3 Match 638

[ 1.3740234   1.5126953   0.16003418 -1.2392578  -1.8652344 ] 1   1 Match 639

[ 1.2392578   1.7695312   0.47802734 -1.0869141  -2.0058594 ] 1   1 Match 640

[-1.4785156  -0.6591797   0.5722656   1.8056641   0.80029297] 3   2 
[ 1.5693359  1.1318359 -0.0319519 -1.6367188 -2.3046875] 0   0 Match 641

[ 0.5410156   1.8535156   0.98535156 -0.4873047  -1.5820312 ] 1   2 
[ 1.0488281   1.5302734   0.50683594 -0.99609375 -1.6191406 ] 1   2 
[ 1.5410156   1.375       0.04983521 -1.6914062  -2.3359375 ] 0   1 
[ 0.59521484  1.4736328   0.8227539  -0.3552246  -1.4501953 ] 1   2 
[ 1.2246094   1.5966797   0.16882324 -1.1220703  -1.8212891 ] 1   1 Match 642

[ 1.6347656   1.1240234   0.01564026 -1.5732422  -2.1914062 ] 0   1 
[-1.1582031  -0.62353516  0.08978271  1.5683594 

[-0.14904785  1.1455078   1.1962891   0.21643066 -0.7885742 ] 2   2 Match 715

[ 1.7207031   1.4306641   0.03738403 -1.6367188  -2.3847656 ] 0   0 Match 716

[ 1.5585938   1.0048828  -0.07171631 -1.6757812  -2.2988281 ] 0   0 Match 717

[-1.4072266  -0.55908203  0.55322266  1.8330078   0.93115234] 3   2 
[-0.06799316  1.2216797   1.0195312   0.1463623  -1.2529297 ] 1   2 
[-1.3085938  -0.34472656  0.79833984  1.6328125   0.54833984] 3   2 
[-0.7944336   1.0039062   1.0126953   0.58447266 -0.9008789 ] 2   3 
[-0.02349854  1.8886719   0.8544922  -0.42529297 -1.5800781 ] 1   1 Match 718

[-1.4257812  -0.7524414   0.59716797  1.9267578   0.8544922 ] 3   4 
[ 1.0214844   1.4296875   0.5859375  -0.71484375 -1.4208984 ] 1   1 Match 719

[-1.3261719  -0.7841797   0.18615723  1.8261719   1.3007812 ] 3   3 Match 720

[-1.2402344  -0.65283203  0.08209229  1.640625    1.4765625 ] 3   1 
[-1.1689453  -0.6123047   0.12176514  1.5859375   1.4736328 ] 3   4 
[-1.4785156  -0.32348633  0.9501953   1.838

[ 1.3583984   1.8222656   0.29077148 -1.4550781  -2.21875   ] 1   1 Match 788

[-1.3193359  -0.73535156  0.17285156  1.7919922   1.3388672 ] 3   3 Match 789

[ 1.8134766   1.5097656   0.13220215 -1.6357422  -2.1992188 ] 0   0 Match 790

[ 0.36987305  1.5888672   1.0126953  -0.3400879  -1.4296875 ] 1   1 Match 791

[-1.5175781  -0.5410156   0.7597656   1.8945312   0.68066406] 3   1 
[-1.3076172  -0.65966797  0.23168945  1.7763672   1.2568359 ] 3   4 
[-1.1923828  -0.6738281   0.06835938  1.6513672   1.5478516 ] 3   4 
[-1.1171875  -0.6425781   0.09368896  1.6357422   1.5820312 ] 3   4 
[-1.4648438  -0.67529297  0.5541992   1.9111328   0.9082031 ] 3   4 
[-0.94189453  0.25708008  1.1123047   1.0273438  -0.41503906] 2   2 Match 792

[ 0.74072266  1.6630859   0.9711914  -0.80566406 -1.9433594 ] 1   0 
[ 1.2402344   1.5048828   0.25195312 -1.1699219  -1.7832031 ] 1   0 
[-0.85546875  0.38452148  1.2138672   0.94384766 -0.6533203 ] 2   1 
[-1.4140625  -0.7397461   0.30859375  1.8564453   1.0

[-1.1572266  -0.3935547   0.65185547  1.6796875   0.9458008 ] 3   1 
[ 0.06982422  1.6435547   1.1005859  -0.31347656 -1.4902344 ] 1   1 Match 859

[-1.125      -0.640625    0.09661865  1.5634766   1.5986328 ] 4   4 Match 860

[-0.9506836  -0.53027344  0.16589355  1.4541016   1.7148438 ] 4   4 Match 861

[-1.3818359  -0.20373535  0.9399414   1.7216797   0.17236328] 3   1 
[-1.3857422  -0.61279297  0.42871094  1.9082031   0.79296875] 3   3 Match 862

[ 1.5371094   1.6015625   0.05703735 -1.6914062  -2.4316406 ] 1   1 Match 863

[-0.7211914   1.0224609   1.2041016   0.45751953 -0.7392578 ] 2   3 
[-1.2626953  -0.17626953  1.1582031   1.4863281   0.01433563] 3   4 
[-0.1529541   1.4042969   0.8930664   0.26635742 -0.8510742 ] 1   1 Match 864

[-1.2724609  -0.6567383   0.14404297  1.6972656   1.3964844 ] 3   3 Match 865

[-1.3994141  -0.56591797  0.3791504   1.7626953   1.0244141 ] 3   4 
[ 1.2353516   1.6787109   0.29003906 -1.1962891  -1.9951172 ] 1   1 Match 866

[-1.3642578  -0.7065429


[-0.17993164  1.3046875   1.1894531   0.22839355 -1.2304688 ] 1   0 
[-1.1201172  -0.63623047  0.12756348  1.6054688   1.5605469 ] 3   3 Match 934

[ 0.4765625  1.8554688  0.9433594 -0.7998047 -1.7529297] 1   1 Match 935

[ 0.875       1.8486328   0.64697266 -0.87841797 -1.9169922 ] 1   3 
[-1.5449219  -0.37182617  0.7451172   1.7333984   0.23937988] 3   3 Match 936

[-1.3134766  -0.69873047  0.2364502   1.8417969   1.2597656 ] 3   3 Match 937

[ 0.5419922   1.5898438   1.1298828  -0.36499023 -1.5839844 ] 1   2 
[-1.0322266  -0.5761719   0.14099121  1.5107422   1.6708984 ] 4   3 
[-1.4121094  -0.78808594  0.28735352  1.9619141   1.1376953 ] 3   2 
[ 0.453125    1.9082031   0.8881836  -0.61816406 -1.7011719 ] 1   1 Match 938

[ 0.2734375   1.5703125   0.9394531   0.08807373 -1.4443359 ] 1   1 Match 939

[ 0.4091797   1.1181641   0.8203125  -0.18322754 -1.0527344 ] 1   2 
[ 1.1015625   1.5048828   0.23937988 -0.99853516 -1.7890625 ] 1   2 
[ 1.3916016   1.5722656   0.14538574 -1.3378906

[ 1.0820312  1.6572266  0.5854492 -0.9975586 -1.9238281] 1   1 Match 1006

[ 1.2304688  1.6748047  0.1817627 -1.2451172 -1.9160156] 1   1 Match 1007

[ 1.9179688  1.5527344 -0.0221405 -1.6220703 -2.2929688] 0   0 Match 1008

[-0.2902832  1.1142578  1.0224609  0.3647461 -0.9042969] 1   1 Match 1009

[ 1.2919922  1.7021484  0.3540039 -1.2304688 -1.9404297] 1   1 Match 1010

[-0.27441406  1.4794922   1.1875      0.06939697 -1.2900391 ] 1   2 
[-1.1386719  -0.6464844   0.10748291  1.5849609   1.5800781 ] 3   4 
[ 0.83935547  1.5087891   0.62060547 -0.71435547 -1.5644531 ] 1   1 Match 1011

[-1.1513672 -0.6645508  0.0803833  1.5917969  1.5761719] 3   4 
[ 1.0830078   1.7822266   0.44726562 -1.0039062  -1.8251953 ] 1   2 
[-1.4912109  -0.6621094   0.47070312  1.859375    0.9428711 ] 3   2 
[-1.4960938  -0.26367188  1.0507812   1.6474609   0.12023926] 3   4 
[-1.2607422   0.07446289  1.0771484   1.2265625  -0.15258789] 3   1 
[ 1.2451172   1.7724609   0.37036133 -1.265625   -2.0664062 ] 1   0

[ 0.7192383   1.8964844   0.8203125  -0.74560547 -1.8027344 ] 1   1 Match 1076

[-1.1572266 -0.6665039  0.0692749  1.6240234  1.5380859] 3   4 
[-0.95703125  0.29907227  1.1679688   1.2050781  -0.43945312] 3   1 
[-1.2216797  -0.734375    0.16296387  1.7880859   1.4824219 ] 3   4 
[ 1.3310547   1.7753906   0.13647461 -1.2050781  -2.0878906 ] 1   1 Match 1077

[ 0.4350586   1.4814453   0.9003906  -0.24328613 -1.4316406 ] 1   4 
[ 1.2138672   1.5595703   0.37597656 -1.1289062  -1.7695312 ] 1   0 
[-1.0742188  0.5859375  1.1533203  0.9223633 -0.4387207] 2   4 
[-1.5507812  -0.5136719   0.8520508   1.8330078   0.55908203] 3   3 Match 1078

[-1.3173828   0.3815918   1.0966797   1.1962891  -0.23754883] 3   3 Match 1079

[-0.6616211   0.8071289   1.1230469   0.6640625  -0.94677734] 2   3 
[ 1.7675781   1.4726562   0.09368896 -1.3876953  -2.1328125 ] 0   1 
[ 1.1621094   1.7714844   0.49072266 -1.3857422  -2.2558594 ] 1   1 Match 1080

[ 1.7763672   1.1591797  -0.02186584 -1.6064453  -2.263671

In [117]:
from sklearn import metrics

# Rows are true classes, columns are predicted classes
print(metrics.confusion_matrix(Targets, Pred))

[[ 87 167  17   8   0]
 [ 75 442  54  62   0]
 [  9 182  66 128   4]
 [  0  41  52 358  59]
 [  0  10  12 233 144]]
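
The per-class figures in a classification report can be recovered directly from this matrix. The sketch below is illustrative only: it copies the matrix printed above into a NumPy array (the name `cm` is an assumption, not a variable from the notebook) and derives support, recall, precision and overall accuracy from it.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows = true class, columns = predicted class).
cm = np.array([
    [87, 167, 17, 8, 0],
    [75, 442, 54, 62, 0],
    [9, 182, 66, 128, 4],
    [0, 41, 52, 358, 59],
    [0, 10, 12, 233, 144],
])

support = cm.sum(axis=1)      # number of true examples per class
predicted = cm.sum(axis=0)    # number of predictions per class
correct = np.diag(cm)         # correctly classified examples per class

recall = correct / support
precision = correct / predicted
accuracy = correct.sum() / cm.sum()

print("support  :", support)
print("recall   :", recall.round(2))
print("precision:", precision.round(2))
print("accuracy :", round(accuracy, 2))
```

Read this way, most of the off-diagonal mass sits in cells next to the diagonal, so the errors are largely confusions between neighbouring sentiment classes.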


In [118]:
target_names = ['Very Neg', 'Negative', 'Neutral', 'Positive', 'Very Pos']
print(metrics.classification_report(Targets, Pred, target_names=target_names))

              precision    recall  f1-score   support

    Very Neg       0.51      0.31      0.39       279
    Negative       0.52      0.70      0.60       633
     Neutral       0.33      0.17      0.22       389
    Positive       0.45      0.70      0.55       510
    Very Pos       0.70      0.36      0.48       399

    accuracy                           0.50      2210
   macro avg       0.50      0.45      0.45      2210
weighted avg       0.50      0.50      0.47      2210



In [119]:
Fold_Predictions=pd.DataFrame(Pred, columns=['Pred5'] )
Fold_Predictions

Unnamed: 0,Pred5
0,1
1,1
2,1
3,4
4,1
...,...
2205,4
2206,1
2207,3
2208,4


In [120]:
Fold_Predictions.to_excel(output_folder+'/Saves/fold5_Predictions.xls')

In [121]:
# Free the model and its outputs, then release cached GPU memory

del model
del TrainResult, TrainModel_outputs, EvalResult, EvalModel_outputs, TestResult, TestModel_outputs, wrong_predictions
torch.cuda.empty_cache()

# Comparing the Predictions

In [122]:
Pred1=pd.read_excel('./folds/fold1/bert/bert-base-cased/'+'Saves/fold1_Predictions.xls')
Pred2=pd.read_excel('./folds/fold2/bert/bert-base-cased/'+'Saves/fold2_Predictions.xls')
Pred3=pd.read_excel('./folds/fold3/bert/bert-base-cased/'+'Saves/fold3_Predictions.xls')
Pred4=pd.read_excel('./folds/fold4/bert/bert-base-cased/'+'Saves/fold4_Predictions.xls')
Pred5=pd.read_excel('./folds/fold5/bert/bert-base-cased/'+'Saves/fold5_Predictions.xls')


In [123]:
# Print the five folds' predictions for each test sentence side by side
for row in range(len(Pred1)):
    print(Pred1.iloc[row, 1], Pred2.iloc[row, 1], Pred3.iloc[row, 1],
          Pred4.iloc[row, 1], Pred5.iloc[row, 1], sep=',')
    

1,1,1,1,1
1,1,1,1,1
1,1,1,1,1
4,4,4,4,4
1,1,1,1,1
3,3,3,3,3
1,1,1,1,0
4,4,4,3,3
1,1,1,1,0
0,1,0,1,0
1,1,1,1,1
3,3,3,3,3
2,1,1,2,2
1,1,1,1,0
4,4,4,4,4
2,3,3,3,3
3,3,3,3,3
3,3,3,3,3
1,1,1,1,1
3,3,3,3,3
4,4,4,4,4
4,3,4,3,3
1,3,2,2,2
1,1,1,2,1
1,1,1,1,1
3,3,3,3,3
1,1,1,1,0
1,1,1,1,1
2,1,2,2,2
1,1,1,1,1
1,1,1,1,1
1,1,2,2,1
2,2,3,2,2
1,1,1,1,1
0,1,0,1,0
1,1,1,1,3
3,3,4,3,3
4,4,4,4,3
3,2,2,2,3
1,1,1,1,1
4,4,4,4,4
1,1,1,1,1
4,3,3,3,3
1,2,1,1,1
1,1,1,1,1
1,1,0,1,0
4,4,4,4,4
1,1,1,1,1
3,3,3,3,3
1,1,1,1,1
1,1,0,1,0
1,1,1,1,1
1,1,1,1,1
3,3,3,3,3
3,3,3,3,3
2,3,2,3,1
1,1,1,1,1
3,3,2,3,3
3,1,2,3,3
1,1,1,1,1
3,3,3,3,3
1,1,1,1,1
3,3,3,3,3
3,4,3,4,3
2,1,1,1,1
3,3,2,3,2
1,1,1,1,1
0,1,0,1,0
3,3,2,3,3
3,2,2,2,2
1,1,1,1,1
1,1,1,1,1
4,4,4,4,4
1,1,1,1,1
4,4,4,4,4
1,1,1,1,1
1,1,1,2,1
4,4,4,4,4
1,1,0,1,0
1,1,1,1,1
1,1,1,1,1
0,1,1,1,0
3,3,3,3,3
3,3,3,3,3
4,4,4,3,3
1,1,1,1,1
4,4,4,3,3
3,3,4,3,3
3,3,2,3,3
1,1,1,1,1
2,2,2,3,3
2,2,1,3,1
1,1,1,1,1
4,4,4,4,4
0,1,0,1,0
1,1,1,1,1
3,3,3,3,3
4,4,4,4,4
1,1,1,3,1
3,3,4,3,4


3,3,3,3,3
0,1,0,1,0
3,4,3,3,3
1,1,1,1,1
3,3,3,3,2
3,3,3,3,3
3,1,2,2,2
3,3,3,3,3
4,4,4,4,4
4,4,4,4,4
1,1,1,1,1
4,3,4,3,3
3,3,3,3,3
3,2,3,3,2
1,1,1,1,1
1,1,1,1,1
3,3,3,3,3
1,1,1,1,2
4,4,3,3,3
3,3,3,3,3
1,1,1,1,1
1,1,1,1,1
4,4,4,4,4
2,2,1,2,2
3,3,3,3,3
2,1,1,2,1
1,1,1,1,1
3,3,3,3,3
1,1,1,1,1
4,4,4,4,3
3,3,3,3,3
4,3,3,3,3
3,4,3,3,3
1,1,1,1,1
3,3,3,3,3
3,3,3,3,3
1,1,1,1,1
4,4,4,4,4
2,1,2,1,2
1,1,0,1,0
2,1,1,2,1
1,2,2,2,1
1,1,1,1,1
4,4,4,4,3
4,4,4,4,4
1,2,2,1,2
1,1,1,1,1
3,3,3,3,3
2,1,2,2,2
4,4,4,4,4
1,1,1,1,1
2,3,2,2,1
4,4,4,4,4
1,1,1,1,0
3,3,3,3,3
3,2,3,2,3
3,3,3,3,3
3,3,3,3,3
1,1,1,1,1
2,1,2,2,1
3,3,3,3,3
1,1,1,1,1
1,1,1,1,1
3,3,3,3,3
1,1,1,1,1
1,1,1,1,1
1,1,1,1,1
1,1,1,1,1
3,4,4,3,3
3,3,3,3,3
1,1,1,1,1
1,1,1,1,1
1,1,1,1,1
1,1,2,1,1
1,1,1,1,1
3,3,4,3,3
2,2,3,2,1
4,4,4,3,3
3,3,3,3,3
1,1,1,1,1
3,4,4,3,3
1,1,1,1,2
1,1,1,1,1
4,4,4,3,3
3,3,3,3,3
1,1,1,1,0
1,1,1,1,1
1,1,1,1,1
4,4,4,4,4
1,1,1,1,1
4,4,4,4,3
3,3,3,3,3
1,1,1,1,1
1,1,1,1,1
4,4,4,4,4
0,0,0,1,0
1,2,2,2,2
1,1,1,1,1
1,1,1,1,1
0,1,1,1,0


3,3,3,3,3
1,1,1,1,1
3,3,3,3,3
4,4,4,4,4
4,4,4,4,4
1,1,1,2,1
1,1,1,1,1
2,1,1,3,1
3,3,3,3,3
1,1,1,1,1
1,1,1,1,0
1,1,1,1,1
4,4,3,3,3
3,3,3,3,3
0,0,0,1,0
2,3,3,3,3
1,1,1,1,1
3,3,3,3,3
3,2,2,2,2
4,4,4,4,4
3,3,3,3,3
3,3,3,3,3
3,3,2,3,3
2,1,1,2,2
4,4,4,4,4
3,3,3,3,3
3,3,3,3,3
1,1,1,1,1
4,4,4,4,4
4,4,4,4,4
3,3,3,3,3
3,3,3,3,3
1,1,1,1,1
3,1,3,3,2
3,3,3,3,3
1,1,1,1,1
4,4,3,3,3
3,4,4,3,3
1,1,1,1,1
3,3,3,3,3
2,3,2,3,3
3,3,3,3,3
1,1,1,1,1
3,3,3,3,3
3,3,3,3,3
3,3,3,3,2
3,3,3,3,3
3,3,3,3,3
3,3,3,3,3
1,1,1,1,1
1,1,1,1,1
3,2,3,3,3
1,1,1,1,1
1,1,1,1,1
3,3,2,3,3
4,4,4,4,4
3,3,3,3,3
3,3,4,3,3
2,2,2,3,2
1,1,1,1,1
2,3,2,3,2
4,4,4,3,3
3,3,3,3,3
4,4,4,4,4
4,4,4,4,3
2,1,2,2,1
1,1,1,1,1
4,4,4,4,4
2,2,2,3,2
4,4,4,3,3
3,3,3,3,3
1,1,1,1,1
4,3,3,3,3
3,3,3,3,3
1,1,1,1,1
3,4,4,4,3
1,1,1,1,1
4,4,4,4,3
1,1,1,1,1
3,3,3,3,3
3,3,3,3,3
1,1,1,1,1
3,3,2,2,2
1,1,1,1,1
1,1,1,1,1
3,3,3,3,3
1,1,1,1,1
1,1,1,1,1
4,4,4,4,3
2,3,3,3,2
1,1,1,1,1
4,4,4,4,4
3,3,3,3,3
4,3,4,3,3
1,1,1,1,1
1,1,1,1,1
2,1,2,2,2
4,4,4,4,4
4,4,4,4,4
3,3,2,3,3
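
A long printout like this is hard to scan for disagreement, so one option is to count how many distinct classes the five folds predict for each sentence. The sketch below is a minimal illustration, not part of the original run: `preds` is a hypothetical stand-in DataFrame whose five rows are copied from the printout above, and the column names are assumed.

```python
import pandas as pd

# Hypothetical stand-in for the five per-fold prediction columns;
# the rows are copied from the printout above.
preds = pd.DataFrame(
    [
        [1, 1, 1, 1, 1],
        [4, 4, 4, 4, 4],
        [1, 1, 1, 1, 0],
        [4, 4, 4, 3, 3],
        [0, 1, 0, 1, 0],
    ],
    columns=["fold1", "fold2", "fold3", "fold4", "fold5"],
)

# Number of distinct predicted classes per sentence (1 = all folds agree).
distinct = preds.nunique(axis=1)

# Fraction of sentences on which all five folds agree.
unanimous = distinct.eq(1)
agreement_rate = unanimous.mean()

print(distinct.tolist())
print(agreement_rate)
```

Applied to the full set of predictions, the same two lines would give the share of the 2210 test sentences on which the folds are unanimous.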


In [124]:
# Collect the test sentences, their gold labels and each fold's predictions in one table
results = pd.DataFrame(columns=['text', 'label', 'fold1', 'fold2', 'fold3', 'fold4', 'fold5'])

results['text'] = test['text']
results['label'] = test['labels']
results['fold1'] = Pred1['Pred1']
results['fold2'] = Pred2['Pred2']
results['fold3'] = Pred3['Pred3']
results['fold4'] = Pred4['Pred4']
results['fold5'] = Pred5['Pred5']

results

Unnamed: 0,text,label,fold1,fold2,fold3,fold4,fold5
0,Maybe I found the proceedings a little bit too...,1,1,1,1,1,1
1,"As with too many studio pics , plot mechanics ...",0,1,1,1,1,1
2,"Beers , who , when she 's given the right line...",2,1,1,1,1,1
3,"Cute , funny , heartwarming digitally animated...",4,4,4,4,4,4
4,So what is the point ?,0,1,1,1,1,1
...,...,...,...,...,...,...,...
2205,It 's a glorious groove that leaves you wantin...,4,4,3,4,3,4
2206,It 's getting harder and harder to ignore the ...,1,1,1,1,1,1
2207,"A real movie , about real people , that gives ...",3,3,4,3,3,3
2208,"Sharp , lively , funny and ultimately sobering...",4,4,4,4,4,4
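
With labels and fold predictions in one table, per-fold accuracy and a simple majority vote across folds are each a one-liner. The sketch below is an illustration under stated assumptions: `demo` is a hypothetical stand-in for `results` holding only the nine rows displayed above (text column omitted), and majority voting is an added comparison, not something the notebook itself runs.

```python
import pandas as pd

fold_cols = ["fold1", "fold2", "fold3", "fold4", "fold5"]

# Hypothetical stand-in for `results`: label and fold values copied
# from the nine rows displayed above, text column omitted.
demo = pd.DataFrame(
    [
        [1, 1, 1, 1, 1, 1],
        [0, 1, 1, 1, 1, 1],
        [2, 1, 1, 1, 1, 1],
        [4, 4, 4, 4, 4, 4],
        [0, 1, 1, 1, 1, 1],
        [4, 4, 3, 4, 3, 4],
        [1, 1, 1, 1, 1, 1],
        [3, 3, 4, 3, 3, 3],
        [4, 4, 4, 4, 4, 4],
    ],
    columns=["label"] + fold_cols,
)

# Accuracy of each fold's model: share of rows where prediction == label.
fold_accuracy = demo[fold_cols].eq(demo["label"], axis=0).mean()

# Majority vote across folds. mode(axis=1) lists the most frequent values
# per row in ascending order; column 0 therefore breaks ties in favour of
# the lowest class index.
majority = demo[fold_cols].mode(axis=1)[0].astype(int)
majority_accuracy = (majority == demo["label"]).mean()

print(fold_accuracy)
print(majority_accuracy)
```

The tie-breaking rule matters with five folds and five classes, since a 2-2-1 split is possible; a different rule (for example, averaging the raw model outputs) could be swapped in.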


In [125]:
now=datetime.now()

results.to_excel('./folds/results'+now.strftime("%m-%d-%Y %H-%M")+'.xls' ,index=False)