In this notebook we first classify the statements using the transformer. This is our first classification task.

The output classification vector from the transformer is then saved so that it can be used by the fully connected neural network (FCNN). This is our second classification task.

In [1]:
# Importing necessary libraries
import pandas as pd
from datetime import datetime
import sklearn
import torch
import torch.nn as nn
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device

device(type='cuda', index=0)

In [2]:
from simpletransformers.classification import ClassificationModel

## Preparing the dataset

Some pre-processing of the dataset has already been done in preparation for various tests, so the processing here does not start from scratch.

In [3]:
# Load a dataset and format it for the transformer
def prepareDataset(filename):
    ReadSet = pd.read_excel(filename)

    # copy the statement and its label into the 'text' and 'labels' columns
    ReadSet['text'] = ReadSet['Statement']
    ReadSet['labels'] = ReadSet['Label']

    # drop every column the transformer does not use
    ReadSet = ReadSet.drop(['ID', 'Label', 'Statement', 'Subject', 'Speaker', 'Job', 'From',
                            'Affiliation', 'PantsTotal', 'NotRealTotal', 'BarelyTotal',
                            'HalfTotal', 'MostlyTotal', 'Truths', 'Context'], axis=1)

    return ReadSet
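
As a minimal sketch of the same idea, the function below selects and renames the two needed columns instead of dropping all the others. The name `prepare_frame` and the tiny in-memory frame are hypothetical stand-ins for the spreadsheet; they are not part of the original notebook.

```python
import pandas as pd


def prepare_frame(raw: pd.DataFrame) -> pd.DataFrame:
    """Keep only the statement and its label, renamed to 'text' and 'labels'."""
    return raw[["Statement", "Label"]].rename(
        columns={"Statement": "text", "Label": "labels"}
    )


# Hypothetical rows standing in for the contents of train.xlsx.
raw = pd.DataFrame({
    "ID": [1, 2],
    "Label": [0, 3],
    "Statement": ["first claim", "second claim"],
    "Speaker": ["a", "b"],
})

prepared = prepare_frame(raw)
print(list(prepared.columns))
```

Selecting the wanted columns keeps working if the spreadsheet gains extra columns, whereas an explicit drop list would have to be extended.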


In [4]:
# preparing the training dataset
train=prepareDataset( 'train.xlsx')
# and display for inspecting
train

Unnamed: 0,text,labels
0,The attorney general requires that rape victim...,0
1,President Clinton reduced the scale of our mil...,3
2,"I used tax cuts to help create over 80,000 job...",4
3,"New Mexico moved ""up to"" sixth in the nation i...",4
4,"Corporate profits are up, CEO pay is up, but a...",5
...,...,...
10264,"Under Obamacare, premiums have doubled and tri...",4
10265,We adopted the modern Social Security system a...,5
10266,More than two months ago President Barack Obam...,3
10267,"We had a massive landslide victory, as you kno...",1


In [5]:
# preparing the evaluation/validation dataset
Eval=prepareDataset('valid.xlsx')
# and display for inspecting
Eval

Unnamed: 0,text,labels
0,The president is brain-dead.,0
1,"Barack Obama supported keeping troops in Iraq,...",3
2,"He's leading by example, refusing contribution...",3
3,I'm the first person who really took up the is...,4
4,I built that border fence in San Diego...and i...,4
...,...,...
1279,CNN accidentally aired 30 minutes of pornograp...,1
1280,President Obamas American Recovery and Reinves...,2
1281,We (in Illinois) have the fifth-highest tax bu...,4
1282,Says Donald Trump won more counties than any c...,4


In [6]:
# preparing the test set dataset
test=prepareDataset('test.xlsx')
test

Unnamed: 0,text,labels
0,New Mexico was 46th in teacher pay (when he wa...,4
1,Barack Obama and Hillary Clinton have changed ...,3
2,I'll tell you what I can tell this country: If...,1
3,Tommy Thompson created the first school choice...,5
4,Fifty-six percent decline in overall crime. A ...,5
...,...,...
1278,"We have trade agreements with 20 countries, an...",1
1279,On Donald Trumps plan to cut federal funding t...,4
1280,"Black Lives Matter, who are attacking law enfo...",2
1281,Latina who enthusiastically supported Donald T...,0


## Setting up the transformer for fine tuning

This is where changes are made to optimise the model.

The simpletransformers library is the quickest way to do this at the time of writing.
For more information on the settings and their default values, see:
https://github.com/ThilinaRajapakse/simpletransformers#default-settings

###### Please do read that reference before changing any parameters. Don't try to be a hero!

In [7]:
#Set the model being used here
model_class='roberta'  # bert or roberta or albert
model_version='roberta-large' #bert-base-cased, roberta-base, roberta-large, albert-base-v2 OR albert-large-v2


output_folder='./TunedModels/'+model_class+'/'+model_version+"/"
cache_directory= "./TunedModels/"+model_class+"/"+model_version+"/"+"/cache/"
labels_count=6  # the number of classification classes

print('model variables were set up: ')

model variables were set up: 


In [8]:
# use this to test if writing to the directories is working

import os
print(os.getcwd())
print(output_folder)
print(cache_directory)

testWrite=train.head(30)
 
testWrite.to_csv(output_folder+'DeleteThisToo.tsv', sep='\t')
testWrite.to_csv(cache_directory+'DeleteThisToo.tsv', sep='\t')

del(testWrite)

G:\0 finalThesis\LIAR_Text
./TunedModels/roberta/roberta-large/
./TunedModels/roberta/roberta-large//cache/


In [10]:
 
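save_every_steps=1285
# With a training batch size of 8, one epoch is 1284 steps (10269 rows / 8),
# so any number above 1284 saves the model only at the end of each epoch.
# With the batch size of 64 set below, one epoch is only 161 steps.
# Saving the model mid-training very often will consume disk space fast.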

train_args={
    "output_dir":output_folder,
    "cache_dir":cache_directory,
    'reprocess_input_data': True,
    'overwrite_output_dir': True,
    'num_train_epochs': 2,
    "save_steps": save_every_steps, 
    "learning_rate": 2.2e-5,
    "train_batch_size": 64,
    "eval_batch_size": 16,
    "adam_epsilon": 1e-7,
    "evaluate_during_training_steps": 5,
    "max_seq_length": 100,
    "n_gpu": 1,
}

# Create a ClassificationModel
model = ClassificationModel(model_class, model_version, num_labels=labels_count, args=train_args) 

# You can set class weights by using the optional weight argument
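
The relation between batch size and the number of steps per epoch can be checked with a few lines of arithmetic. This is only a sketch: `steps_per_epoch` is a hypothetical helper, and the row count of 10269 is taken from the progress bar shown during feature conversion.

```python
import math


def steps_per_epoch(num_rows: int, batch_size: int) -> int:
    """Number of batches needed to see every row once (the last batch may be partial)."""
    return math.ceil(num_rows / batch_size)


train_rows = 10269  # size of the training set in this notebook

print(steps_per_epoch(train_rows, 8))   # batch size assumed in the save_steps comment
print(steps_per_epoch(train_rows, 64))  # batch size actually set in train_args
```

A `save_steps` value larger than the result for the batch size in use means no checkpoint is written in the middle of an epoch.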

### Loading a saved model (based on above args{})

If you stopped training, you can continue from a previously saved checkpoint.
The next cell loads a model from any checkpoint.
Training then runs for the number of epochs set in train_args{}, continuing the tuning from your checkpoint.

###### HOWEVER
It will overwrite previous checkpoints!
Example: If you load an epoch-3 checkpoint and train again, the first new epoch is saved under the epoch-1 name and overwrites the old epoch-1 checkpoint. That checkpoint is then equivalent to a 4th epoch, even though its name still says epoch-1.
###### SO BE CAREFUL
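
One way to guard against losing a checkpoint is to copy it to a separate folder before resuming training. The sketch below uses only the standard library; `backup_checkpoint`, the `backups` folder and the placeholder file are hypothetical and only illustrate the idea on a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def backup_checkpoint(checkpoint_dir: str, backup_root: str) -> Path:
    """Copy a checkpoint folder into backup_root and return the new location."""
    source = Path(checkpoint_dir)
    target = Path(backup_root) / source.name
    # copytree raises FileExistsError if the target folder already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(source, target)
    return target


# Demonstration on a throwaway directory with a placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "output" / "checkpoint-161-epoch-2"
    ckpt.mkdir(parents=True)
    (ckpt / "weights.bin").write_text("placeholder")

    copied = backup_checkpoint(str(ckpt), str(Path(tmp) / "backups"))
    backup_ok = (copied / "weights.bin").exists()

print(backup_ok)
```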

In [10]:
# loading a previously saved model based on this particular Transformer Class and model_name

# loading the checkpoint that gave the best result
CheckPoint='checkpoint-161-epoch-2'  


preSavedCheckpoint=output_folder+CheckPoint

print('Loading model, please wait...')
model = ClassificationModel( model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args) 
print('model in use is :', preSavedCheckpoint )

Loading model, please wait...
model in use is : ./TunedModels/roberta/roberta-large/checkpoint-161-epoch-2


## Training the Transformer

Skip the next cell if you want to skip the training and go directly to the evaluation.

In [11]:
# Train the model
start_time = datetime.now()
model.train_model(train)
print("Training time: ", datetime.now() - start_time)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=10269.0), HTML(value='')))


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


HBox(children=(FloatProgress(value=0.0, description='Epoch', max=2.0, style=ProgressStyle(description_width='i…

HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=161.0, style=ProgressStyle(descri…

Running loss: 1.887999



Running loss: 1.849229Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 1.703963Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Running loss: 1.803489



Running loss: 1.803763Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 8192.0
Running loss: 1.722979


HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=161.0, style=ProgressStyle(descri…

Running loss: 1.703236Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 4096.0


Training of roberta model complete. Saved to ./TunedModels/roberta/roberta-large/.
Training time:  0:12:25.826038


## Evaluating the training

In [12]:
TrainResult, TrainModel_outputs, wrong_predictions = model.eval_model(train, acc=sklearn.metrics.accuracy_score)

EvalResult, EvalModel_outputs, wrong_predictions = model.eval_model(Eval, acc=sklearn.metrics.accuracy_score)


print('Training Result:', TrainResult['acc'])
#print('Model Out:', TrainModel_outputs)

print('Eval Result:', EvalResult['acc'])
#print('Model Out:', EvalModel_outputs)

print("Training & Evaluation time taken: ", datetime.now() - start_time)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=10269.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=642.0), HTML(value='')))


{'mcc': 0.22271499230748143, 'acc': 0.3658584088031941, 'eval_loss': 1.541503066585814}
Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=1284.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=81.0), HTML(value='')))


{'mcc': 0.12517664675137755, 'acc': 0.2842679127725857, 'eval_loss': 1.6382164130976171}
Training Result: 0.3658584088031941
Eval Result: 0.2842679127725857
Training & Evaluation time taken:  0:16:38.946984


In [13]:
TestResult, TestModel_outputs, wrong_predictions = model.eval_model(test, acc=sklearn.metrics.accuracy_score)

print('Test Set Result:', TestResult['acc'])
#print('Model Out:', TestModel_outputs)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=1283.0), HTML(value='')))




HBox(children=(FloatProgress(value=0.0, max=81.0), HTML(value='')))


{'mcc': 0.10461072020967066, 'acc': 0.2727981293842556, 'eval_loss': 1.63450613728276}
Test Set Result: 0.2727981293842556


In [14]:
Pred=[]

countCorrect=0

for row in range(TestModel_outputs.shape[0]):
    outputs=TestModel_outputs[row]
    #print(test.iloc[row,0])
    print(outputs, end=' ')

    # the predicted class is the index of the largest output score
    result=int(outputs.argmax())
    Pred.append(result)
    print(result, ' ',test.iloc[row,1], end=' ')
    if result==test.iloc[row,1]:
        countCorrect+=1
        print('Match',countCorrect)
    print('')

print(countCorrect)

[-2.5820312  -0.9086914   0.08953857  1.1494141   1.3085938   0.32763672] 4   4 Match 1

[-1.0683594  -0.12902832  0.13452148  0.39819336  0.01693726 -0.1652832 ] 3   3 Match 2

[-0.3725586   0.17175293 -0.08422852  0.00453186 -0.25708008 -0.20959473] 1   1 Match 3

[-1.5673828  -0.33276367  0.5341797   1.1835938   0.56591797 -0.36499023] 3   5 
[-2.4023438  -0.6591797  -0.4663086   0.8261719   1.1064453   0.57128906] 4   5 
[-2.0253906  -0.29858398 -0.28710938  0.80322266  0.65966797  0.22766113] 3   2 
[-2.5742188  -0.19689941 -0.41186523  0.9501953   1.2851562   0.7241211 ] 4   4 Match 4

[-0.67871094  0.35791016  0.30322266  0.79248047 -0.13330078 -0.46069336] 3   5 
[-2.0585938  -0.43408203 -0.43945312  0.34301758  0.84814453  0.9980469 ] 5   4 
[-1.9697266  -0.0147171   0.21386719  0.9267578   0.7158203   0.13220215] 3   5 
[-0.8510742   0.09527588  0.5727539   1.046875    0.45214844 -0.93115234] 3   5 
[-1.5361328   0.2783203   0.1973877   0.76123047 -0.07617188 -0.52685547] 3  

[ 0.07275391  0.9692383   0.41308594  0.06634521 -0.6904297  -1.5107422 ] 1   2 
[-2.3457031  -0.6015625  -0.2331543   0.8598633   1.1728516   0.75439453] 4   4 Match 38

[-1.5527344   0.14758301 -0.05941772  0.42016602  0.49536133  0.38793945] 4   4 Match 39

[-1.265625    0.12866211 -0.02493286  0.40649414  0.1517334   0.10675049] 3   5 
[-1.0859375   0.04574585  0.30029297  0.55029297  0.28930664 -0.39013672] 3   1 
[-1.2421875   0.05041504 -0.4230957   0.4765625   0.23046875  0.11743164] 3   1 
[-2.421875   -0.23754883  0.09454346  1.1201172   0.7338867   0.14477539] 3   4 
[-2.1679688  -0.5136719  -0.2368164   0.7910156   0.92041016  0.6801758 ] 4   5 
[-1.3505859  -0.33203125  0.04882812  0.38745117  0.49365234 -0.22607422] 4   1 
[-2.4765625  -0.45166016  0.06750488  1.0478516   0.86865234 -0.12780762] 3   2 
[-1.1777344   0.27026367 -0.64501953  0.12225342  0.05993652  1.0605469 ] 5   2 
[ 0.32714844  0.79248047 -0.04220581  0.22265625 -0.9316406  -0.65625   ] 1   1 Match 40

[

[-1.9208984  -0.5239258  -0.18481445  0.71240234  0.74072266  0.29785156] 4   3 
[-0.6821289  -0.17749023  0.15856934  0.32739258 -0.01053619 -0.27563477] 3   0 
[-2.0078125  -0.40527344 -0.21350098  0.5463867   0.7890625   0.83203125] 5   5 Match 78

[-0.70410156  0.5209961  -0.2175293   0.05517578 -0.12493896  0.14111328] 1   0 
[-1.90625    -0.19042969 -0.38208008  0.7285156   0.6098633   0.57470703] 3   3 Match 79

[-1.8193359  -0.18762207  0.18188477  0.69384766  0.5371094  -0.16638184] 3   2 
[-1.4824219  -0.0586853   0.12493896  0.34204102  0.03970337  0.39868164] 5   0 
[-0.11993408  0.2824707   0.92871094  0.1875     -0.55908203 -1.0097656 ] 2   0 
[-1.9677734  -0.67822266  0.02638245  1.2431641   0.60791016 -0.05941772] 3   4 
[-0.61865234 -0.02648926  0.4086914   0.25708008 -0.17675781 -0.37719727] 2   1 
[-1.0917969   0.18676758  1.0380859   0.98046875  0.01512909 -1.1376953 ] 2   2 Match 80

[-1.1542969  -0.05111694 -0.32128906  0.10931396  0.11352539  0.8466797 ] 5   3 
[


[-1.6884766  -0.10510254 -0.10614014  0.39501953  0.5913086   0.7988281 ] 5   3 
[-1.2666016  -0.21252441 -0.4189453   0.14648438  0.53515625  0.5136719 ] 4   3 
[-1.6396484  -0.5576172  -0.5991211   0.47827148  0.87890625  0.92089844] 5   4 
[-1.5195312   0.38891602  0.2211914   0.8911133   0.18408203  0.05734253] 3   3 Match 125

[ 0.1739502   0.6455078   0.78564453  0.49194336 -0.5551758  -1.4453125 ] 2   1 
[-0.8208008   0.36645508 -0.22314453  0.12017822 -0.1953125   0.1784668 ] 1   4 
[-1.9208984  -0.40356445 -0.4169922   0.68310547  1.125       0.25512695] 4   2 
[-0.03860474  0.6533203   0.35058594  0.1352539  -0.3942871  -0.7680664 ] 1   0 
[ 0.00460815  0.7363281   0.42773438 -0.02037048 -0.56640625 -0.9433594 ] 1   1 Match 126

[-1.7460938   0.20361328 -0.5703125   0.34423828  0.5732422   0.6767578 ] 5   1 
[-1.8798828  -0.42260742 -0.38867188  0.73779297  0.5600586   0.43481445] 3   5 
[-2.5136719  -0.33642578 -0.12182617  1.2529297   0.7636719  -0.03475952] 3   4 
[-1.149

[-0.9243164  -0.16845703 -0.16430664 -0.03933716  0.29296875  0.44433594] 5   3 
[-0.3371582   0.42993164  0.68310547  0.6591797  -0.35839844 -1.6787109 ] 2   1 
[ 0.64501953  0.98291016  0.3552246   0.04360962 -0.79833984 -1.3544922 ] 1   1 Match 162

[-0.34399414  0.4663086  -0.00399017 -0.03549194 -0.56689453  0.05715942] 1   2 
[-1.8027344   0.27807617  0.26953125  0.48046875 -0.09857178 -0.14807129] 3   3 Match 163

[-0.95166016  0.17541504  0.51660156  0.53125    -0.08648682 -1.0976562 ] 3   3 Match 164

[-0.3017578   0.22094727  0.5395508   0.61865234 -0.5229492  -0.71728516] 3   2 
[-1.7949219   0.21691895 -0.22827148  0.21887207  0.35107422  0.7402344 ] 5   2 
[-0.68115234 -0.0914917  -0.16564941  0.06652832  0.21716309 -0.15307617] 4   1 
[-0.6826172   1.8964844  -0.34594727  0.6376953  -1.3828125  -0.28857422] 1   5 
[-2.5410156  -0.51904297 -0.3803711   0.6748047   1.0419922   1.1044922 ] 5   5 Match 165

[-0.62060547  0.76171875  0.59228516  1.0234375  -0.28808594 -1.39453

[-1.9511719  -0.2401123  -0.2232666   0.5083008   1.1650391   0.95703125] 4   4 Match 197

[-1.2060547   0.13256836  0.06451416  0.5551758   0.38354492 -0.14050293] 3   2 
[-1.2675781   0.28515625  0.21972656  0.74365234  0.01625061 -0.14404297] 3   4 
[-1.8125      0.18334961  0.1505127   0.68847656  0.3408203   0.31323242] 3   4 
[-1.5283203  -0.06604004 -0.5366211   0.3161621   0.49658203  0.5449219 ] 5   0 
[-1.2519531   0.5996094   0.30566406  0.62597656 -0.46020508 -0.59814453] 3   2 
[-1.2050781   0.18334961  0.10198975  0.5161133   0.484375    0.15270996] 3   5 
[-1.8388672  -0.30200195 -0.56884766  0.2331543   0.78125     0.8652344 ] 5   5 Match 198

[-0.8129883   0.3486328   0.08190918  0.00503159 -0.28515625 -0.46020508] 1   2 
[-1.3974609   0.23706055 -0.03314209  0.3256836  -0.07025146  0.06082153] 3   3 Match 199

[-0.8989258   0.4152832  -0.20214844  0.3203125  -0.01902771 -0.09155273] 1   1 Match 200

[-1.84375    -0.6010742  -0.40161133  0.42529297  0.97021484  0.88085


[-0.6567383   0.3395996   1.0830078   0.70166016 -0.26513672 -1.0351562 ] 2   0 
[-0.7519531   0.22436523  1.0917969   0.69677734 -0.10412598 -1.2314453 ] 2   2 Match 241

[-2.0507812  -0.41455078 -0.3383789   0.95703125  1.0322266   0.57470703] 4   4 Match 242

[-0.32666016  0.4794922  -0.14367676  0.36743164  0.00372124 -0.1809082 ] 1   4 
[-2.3886719  -0.6274414  -0.17675781  0.6020508   1.0244141   0.84033203] 4   2 
[-2.3359375  -0.27148438 -0.33374023  0.55322266  0.8383789   0.91748047] 5   3 
[-2.8671875  -0.71435547 -0.34936523  0.8457031   1.6660156   1.1650391 ] 4   5 
[-1.0429688  -0.19348145  0.69091797  0.59472656  0.3293457  -0.44360352] 2   4 
[-0.94970703 -0.03421021 -0.0723877   0.9433594   0.44799805 -0.4169922 ] 3   4 
[-0.20422363 -0.16040039  0.20288086  0.6538086  -0.07232666 -0.39624023] 3   0 
[-1.5087891  -0.49951172  0.23205566  0.7114258   0.43164062 -0.09112549] 3   4 
[-0.2854004   0.13500977 -0.27075195 -0.11230469 -0.15258789  0.13916016] 5   5 Match 24

[-1.7353516  -0.18127441 -0.5751953   0.4321289   0.69628906  0.6074219 ] 4   4 Match 275

[-2.3769531  -0.4326172   0.07299805  0.70947266  0.95214844  0.87109375] 4   3 
[-1.5507812   0.2919922   0.05947876  0.62841797  0.2331543  -0.15625   ] 3   3 Match 276

[ 0.04092407  0.42578125  0.61621094  0.46826172 -0.8466797  -1.4501953 ] 2   3 
[-1.1289062   0.30371094 -0.10656738  0.04394531  0.07116699  0.28564453] 1   2 
[-2.0722656  -0.41064453 -0.25976562  0.67089844  0.87646484  0.67578125] 4   3 
[-1.6982422  -0.43115234  0.04052734  0.44360352  1.0947266   0.04522705] 4   3 
[-0.64208984  0.3930664   0.25        0.26904297 -0.0838623  -0.31079102] 1   4 
[-2.6699219  -1.0976562  -0.14526367  1.0048828   1.5214844   0.52246094] 4   1 
[-1.7333984  -0.14733887 -0.5410156   0.10510254  0.45532227  0.92089844] 5   4 
[-2.6035156  -0.63378906 -0.54541016  0.9135742   1.1962891   0.96875   ] 4   3 
[-1.6210938  -0.00211525 -0.47216797  0.03353882  0.52197266  0.8691406 ] 5   5 Match 277

 -6.1082840e-04  4.0722656e-01] 5   3 
[-1.6904297  -0.27978516 -0.29541016  0.29248047  0.5864258   0.8105469 ] 5   4 
[-0.72265625  2.0371094  -0.51708984  1.0488281  -1.0722656  -0.35546875] 1   3 
[ 1.0224609   0.8432617   0.23925781 -0.14282227 -0.80566406 -1.1640625 ] 0   3 
[-1.4326172   0.39086914  0.24645996  0.6333008   0.01599121  0.05102539] 3   4 
[-1.9169922  -0.3618164  -0.5126953   0.24194336  1.1054688   1.2089844 ] 5   5 Match 316

[-0.31274414  0.64453125 -0.4111328  -0.26416016 -0.36767578 -0.10626221] 1   5 
[ 1.5117188   0.6333008  -0.00377655 -0.36206055 -0.8520508  -0.79296875] 0   5 
[-1.1484375   0.04470825 -0.10083008  0.2685547   0.21166992  0.08422852] 3   2 
[-2.4375     -0.44018555 -0.11669922  0.5957031   1.1796875   0.7792969 ] 4   4 Match 317

[-1.3730469   0.03289795 -0.6772461   0.18652344  0.6894531   1.2207031 ] 5   2 
[-0.47827148 -0.0244751  -0.01940918  0.10028076 -0.01705933 -0.37841797] 3   4 
[-0.68066406  0.31835938 -0.24780273  0.14868164  

[ 0.36621094  0.5991211  -0.30444336 -0.34228516 -0.68408203 -0.59765625] 1   1 Match 348

[-2.03125    -0.22937012  0.55322266  0.7265625   0.5361328  -0.11694336] 3   2 
[ 0.6816406   0.33544922 -0.14892578 -0.25830078 -0.73779297 -0.7866211 ] 0   0 Match 349

[-1.0615234   0.06201172 -0.20715332  0.11572266  0.38867188  0.07757568] 4   0 
[-0.7314453   0.4494629  -0.1472168   0.11578369  0.10473633  0.32470703] 1   3 
[-1.3193359  -0.1619873  -0.03546143  0.50683594  0.05413818 -0.29711914] 3   2 
[-2.6699219  -0.69140625 -0.6220703   0.78515625  1.3730469   1.3105469 ] 4   1 
[-0.9321289   0.6411133   0.25976562  0.68066406 -0.25610352 -0.39990234] 3   4 
[ 1.1464844   0.4790039   0.33276367  0.33984375 -0.5029297  -1.25      ] 0   2 
[-0.09283447  0.27392578 -0.16467285 -0.03109741 -0.2541504  -0.14770508] 1   0 
[-0.52001953  0.11218262 -0.23120117 -0.1381836  -0.18054199 -0.0736084 ] 1   1 Match 350

350
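
The row-by-row comparison above can also be done for all rows at once with NumPy. The sketch below uses the first three output vectors and labels printed above as sample data; the variable names are illustrative only.

```python
import numpy as np

# First three output vectors from the listing above, with their true labels.
logits = np.array([
    [-2.5820312, -0.9086914, 0.08953857, 1.1494141, 1.3085938, 0.32763672],
    [-1.0683594, -0.12902832, 0.13452148, 0.39819336, 0.01693726, -0.1652832],
    [-0.3725586, 0.17175293, -0.08422852, 0.00453186, -0.25708008, -0.20959473],
])
true_labels = np.array([4, 3, 1])

# The predicted class of each row is the index of its largest score.
predicted = logits.argmax(axis=1)
accuracy = (predicted == true_labels).mean()

print(predicted, accuracy)
```

Applied to the full output array, the same two lines would replace the loop without changing which class is chosen.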


In [17]:
from sklearn import metrics

print(metrics.confusion_matrix(test['labels'],Pred))

[[16 36 12 16 10  2]
 [ 5 83 21 68 39 34]
 [ 4 51 27 78 29 25]
 [ 5 49 29 90 68 26]
 [ 0 30  5 91 83 40]
 [ 3 27  6 55 69 51]]
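
The overall accuracy and the per-class precision and recall can be read directly off the confusion matrix: rows are true classes, columns are predicted classes, and the diagonal holds the correct predictions. A short sketch using the matrix printed above:

```python
import numpy as np

# Confusion matrix from the test set (rows: true class, columns: predicted class).
cm = np.array([
    [16, 36, 12, 16, 10,  2],
    [ 5, 83, 21, 68, 39, 34],
    [ 4, 51, 27, 78, 29, 25],
    [ 5, 49, 29, 90, 68, 26],
    [ 0, 30,  5, 91, 83, 40],
    [ 3, 27,  6, 55, 69, 51],
])

correct = int(np.trace(cm))                   # sum of the diagonal
total = int(cm.sum())                         # number of test statements
accuracy = correct / total
recall = cm.diagonal() / cm.sum(axis=1)       # correct / all rows of that true class
precision = cm.diagonal() / cm.sum(axis=0)    # correct / all predictions of that class

print(correct, total, round(accuracy, 4))
```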


In [18]:
target_names = ['Pants', 'False', 'Barely-True','Half-True','Mostly-True','True']

print(metrics.classification_report(test['labels'], Pred,target_names =target_names))

              precision    recall  f1-score   support

       Pants       0.48      0.17      0.26        92
       False       0.30      0.33      0.32       250
 Barely-True       0.27      0.13      0.17       214
   Half-True       0.23      0.34      0.27       267
 Mostly-True       0.28      0.33      0.30       249
        True       0.29      0.24      0.26       211

    accuracy                           0.27      1283
   macro avg       0.31      0.26      0.26      1283
weighted avg       0.29      0.27      0.27      1283



In [19]:
# saving the output of the models to CSVs
#these are 1X6 classification vectors

SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/Saves/"
print('Saving...')
trainOut = pd.DataFrame(data= TrainModel_outputs )
trainOut.to_csv(SavesDirectory+'trainOut.tsv', sep='\t',  index=False)

evalOut = pd.DataFrame(data= EvalModel_outputs )
evalOut.to_csv(SavesDirectory+'evalOut.tsv', sep='\t',  index=False)

testOut = pd.DataFrame(data= TestModel_outputs )
testOut.to_csv(SavesDirectory+'testOut.tsv', sep='\t',  index=False)

print('Saving Complete on',datetime.now() ,'in:', SavesDirectory)

Saving...
Saving Complete on 2020-05-01 08:32:11.747996 in: ./TunedModels/roberta/roberta-large/Saves/


In [20]:
del(model)
#del(train,Eval,test)
del(trainOut,evalOut,testOut)
torch.cuda.empty_cache()

# Adding the reputation vector

This section takes the outputs of the transformer used above and combines them with the speaker's reputation to improve the classification.

Before running this section, it is suggested that you restart the kernel and run the notebook again from this cell. Otherwise the neural net will likely raise an error caused by some unreleased variable used by the simpletransformers library.

In [1]:
import pandas as pd
import torch
import torch.nn as nn
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device

device(type='cuda', index=0)

In [2]:

train=pd.read_excel('trainReputation.xlsx' )
train=train.iloc[:,:-2].astype(float)
train=train/200  #for scaling
#train

model_class='roberta'  # bert or roberta or albert
model_version='roberta-large' #bert-base-cased, roberta-base, roberta-large, albert-base-v2 OR albert-large-v2
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/Saves/"
TF_Output=pd.read_csv( SavesDirectory+'trainOut.tsv', sep='\t')

train=pd.concat([train,TF_Output], axis=1)

train

Unnamed: 0,PantsTotal,NotRealTotal,BarelyTotal,HalfTotal,MostlyTotal,Truths,0,1,2,3,4,5
0,0.005,0.000,0.000,0.000,0.000,0.000,-0.875000,-0.035553,-0.036926,0.124512,0.091125,0.248901
1,0.095,0.160,0.170,0.290,0.165,0.165,-1.904297,-0.231323,0.364746,1.080078,0.403564,-0.042603
2,0.005,0.010,0.005,0.015,0.040,0.010,-2.021484,-0.625000,0.489990,1.277344,0.593262,-0.539062
3,0.005,0.010,0.005,0.015,0.040,0.010,-2.273438,-0.583496,-0.375000,1.092773,0.999512,0.554688
4,0.035,0.145,0.200,0.345,0.380,0.365,-2.255859,-0.787598,0.012428,1.327148,0.995117,-0.192749
...,...,...,...,...,...,...,...,...,...,...,...,...
10264,0.005,0.030,0.070,0.050,0.050,0.020,-0.273926,0.841309,0.604980,0.637695,-0.615723,-1.586914
10265,0.055,0.075,0.080,0.100,0.050,0.035,-2.171875,-0.200439,-0.195557,0.686523,1.147461,1.179688
10266,0.035,0.115,0.140,0.190,0.170,0.075,-1.827148,-0.359619,-0.008209,0.834473,0.503418,0.132935
10267,0.305,0.570,0.315,0.255,0.185,0.070,-0.827637,0.247314,-0.009705,0.440430,0.225342,-0.443848
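
The feature table above is built by scaling the reputation counts and placing the transformer outputs next to them. The sketch below repeats that on two hypothetical rows with only two columns from each source, to show that `pd.concat(..., axis=1)` matches rows by index.

```python
import pandas as pd

# Two illustrative rows: raw reputation counts, divided by 200 as in the notebook.
reputation = pd.DataFrame({"PantsTotal": [1, 19], "Truths": [0, 33]}, dtype=float) / 200

# Two of the six transformer outputs; read_csv gives the columns string names.
tf_output = pd.DataFrame([[-0.875, 0.2489], [-1.9043, -0.0426]], columns=["0", "5"])

# Rows are matched on the index, so both frames need the same 0..n-1 index.
features = pd.concat([reputation, tf_output], axis=1)

print(features)
```

If one frame had a different index, for example after filtering rows, the unmatched cells would be filled with NaN; calling `reset_index(drop=True)` on both frames first avoids that.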


In [3]:
TrainLables=pd.read_excel('trainReputation.xlsx' )
TrainLables=TrainLables.iloc[:,-1] 

TrainLables=pd.get_dummies(TrainLables)
TrainLables

Unnamed: 0,0,1,2,3,4,5
0,1,0,0,0,0,0
1,0,0,0,1,0,0
2,0,0,0,0,1,0
3,0,0,0,0,1,0
4,0,0,0,0,0,1
...,...,...,...,...,...,...
10264,0,0,0,0,1,0
10265,0,0,0,0,0,1
10266,0,0,0,1,0,0
10267,0,1,0,0,0,0
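
`pd.get_dummies` creates one column per class that actually occurs in the data. That works here because all six classes appear in the training set, but on a small subset a class can be missing. As a hedged alternative sketch, `torch.nn.functional.one_hot` with an explicit `num_classes` always yields six columns. The sample labels are the first five training labels shown earlier.

```python
import pandas as pd
import torch
import torch.nn.functional as F

labels = pd.Series([0, 3, 4, 4, 5])  # class 1 and class 2 do not occur in this sample

# get_dummies only creates columns for the classes it sees.
one_hot_pandas = pd.get_dummies(labels)

# one_hot with num_classes fixes the width regardless of the sample.
one_hot_torch = F.one_hot(torch.tensor(labels.values), num_classes=6)

print(one_hot_pandas.shape, tuple(one_hot_torch.shape))
```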


In [4]:
input=torch.tensor(train.values)
 
input

tensor([[ 0.0050,  0.0000,  0.0000,  ...,  0.1245,  0.0911,  0.2489],
        [ 0.0950,  0.1600,  0.1700,  ...,  1.0801,  0.4036, -0.0426],
        [ 0.0050,  0.0100,  0.0050,  ...,  1.2773,  0.5933, -0.5391],
        ...,
        [ 0.0350,  0.1150,  0.1400,  ...,  0.8345,  0.5034,  0.1329],
        [ 0.3050,  0.5700,  0.3150,  ...,  0.4404,  0.2253, -0.4438],
        [ 0.0000,  0.0050,  0.0000,  ...,  0.1836, -0.9189, -0.9209]],
       dtype=torch.float64)

In [5]:
targets=torch.tensor(TrainLables.astype(float).values)
 
targets

tensor([[1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 1., 0.],
        ...,
        [0., 0., 0., 1., 0., 0.],
        [0., 1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0.]], dtype=torch.float64)

In [6]:
 
# number of input features and number of output classes
InputSize=input.shape[1]
OutputSize=targets.shape[1]

print('input size:', InputSize)
print('output size:', OutputSize)

input size: 12
output size: 6


In [8]:

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        
         
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(InputSize, 24)  # input size 12: 6 reputation values + 6 transformer outputs
        self.fc2 = nn.Linear(24, 12)
        self.fc3 = nn.Linear(12, OutputSize)  #classifies 'outputsize' different classes

    def forward(self, x):
        x = torch.tanh(self.fc1(x))
        x = torch.tanh(self.fc2(x)) 
        x = torch.tanh(self.fc3(x)).double()
        return x

    

#now we use it

net = Net()
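
A quick way to check such a network before training is to pass a random batch through it and inspect the output shape. The sketch below rebuilds the same three-layer architecture with the sizes written out (12 inputs, 6 outputs); `SmallNet` and the random batch are illustrative and not part of the notebook.

```python
import torch
import torch.nn as nn


class SmallNet(nn.Module):
    """Same layer sizes as the network above, with the sizes passed explicitly."""

    def __init__(self, input_size: int = 12, output_size: int = 6):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 24)
        self.fc2 = nn.Linear(24, 12)
        self.fc3 = nn.Linear(12, output_size)

    def forward(self, x):
        x = torch.tanh(self.fc1(x))
        x = torch.tanh(self.fc2(x))
        return torch.tanh(self.fc3(x))


torch.manual_seed(0)
batch = torch.randn(4, 12)  # 4 hypothetical samples with 12 features each

with torch.no_grad():
    scores = SmallNet()(batch)

print(tuple(scores.shape))
```

Because the last layer also goes through `tanh`, every score lies between -1 and 1, which is the range the MSE loss compares against the 0/1 targets.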

In [39]:
# here we set up the neural network parameters

learning_rate = 1e-4
criterion = nn.MSELoss()  # mean squared error between the outputs and the one-hot targets

import torch.optim as optim

# creating the optimizer (Adam; plain SGD is left commented out as an alternative)
#optimizer = optim.SGD(net.parameters(), lr=learning_rate)
optimizer = optim.Adam(net.parameters(), lr=learning_rate)


In [44]:
for epoch in range(100):  
        
    optimizer.zero_grad()   # zero the gradient buffers
    output = net(input.float())

    loss = criterion(output, targets)
    print('Loss:', loss, ' at epoch:', epoch)

    loss.backward()  #backprop
    optimizer.step()    # Does the update

Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 0
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 1
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 2
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 3
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 4
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 5
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 6
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 7
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 8
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 9
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 10
Loss: tensor(0.0931, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 11
Loss: tensor(0

In [25]:
#save the FCNN model

stage='NNetwork/'
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/"+stage
# PATH must be defined before saving; adjust the file name as needed
PATH = SavesDirectory+'Tanh_MSE_adam4793.pth'

torch.save(net.state_dict(), PATH)

# more on saving pytorch networks: https://pytorch.org/docs/stable/notes/serialization.html
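
The save and load steps can be tried out on a small layer first. This is a minimal sketch of a `state_dict` round trip using a temporary file; the layer, the file name `example_net.pth` and the variable names are hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(12, 6)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_net.pth")

    # Save only the parameters, not the whole Python object.
    torch.save(model.state_dict(), path)

    # Loading needs a model with the same architecture to receive the parameters.
    restored = nn.Linear(12, 6)
    restored.load_state_dict(torch.load(path))

# Compare every saved tensor with its restored counterpart.
same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```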

In [38]:
#load previously saved FCNN model

stage='NNetwork/'
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/"+stage
# PATH must point to an existing saved model; adjust the file name as needed
PATH = SavesDirectory+'Tanh_MSE_adam4731.pth'

net = Net()
net.load_state_dict(torch.load(PATH))

<All keys matched successfully>

In [26]:
correct = 0
total = 0

labels=pd.read_excel('trainReputation.xlsx' )

Y=[]  #target
Pred=[]  #predicted

with torch.no_grad():
    for row in range(len(input)):
        outputs = net(input[row,:].float())
        total+=1
        # the predicted class is the index of the largest output
        result=int(torch.argmax(outputs))

        if TrainLables.iloc[row,result]==1: correct+=1

        Y.append(labels.iloc[row,-1])  # the label is in the last column
        Pred.append(result)

        print(result, end=' ')


print('Correct:', correct, 'out of:', total )
print('Accuracy of the network : ',( 100 * correct / total))


0 3 3 4 3 1 4 5 5 5 4 3 4 5 3 5 4 5 3 3 3 2 5 1 3 3 3 1 1 1 4 4 5 3 5 3 5 5 5 5 3 4 3 4 1 5 2 3 5 5 5 5 3 1 5 3 3 3 1 3 3 4 5 1 3 3 4 5 5 3 4 3 5 4 5 4 4 3 4 4 3 3 4 1 2 4 3 1 1 5 3 3 3 0 0 5 5 2 2 5 5 1 4 4 3 5 3 3 5 4 4 3 4 5 4 3 3 3 3 1 4 2 5 5 0 3 4 3 1 4 3 3 5 5 4 2 1 5 4 1 4 3 3 3 3 3 5 4 4 5 1 3 1 5 3 4 4 3 5 3 5 1 3 5 1 0 3 5 3 3 3 5 5 3 3 5 4 5 0 2 4 5 3 5 5 2 5 5 3 5 3 5 1 3 5 4 5 4 4 1 1 3 0 5 1 1 5 1 5 3 2 0 0 0 1 3 5 4 4 5 5 3 5 1 1 1 0 3 0 4 3 4 2 3 5 3 5 4 4 2 5 5 5 5 4 3 3 5 4 4 3 3 5 3 3 5 2 5 4 4 5 0 5 4 5 3 4 3 5 3 5 3 3 3 5 0 0 0 3 5 3 0 3 5 5 3 3 3 3 3 4 2 5 3 2 3 3 4 1 3 4 2 3 2 1 5 3 4 4 1 0 0 0 1 4 5 5 5 3 0 3 1 3 1 5 5 5 4 1 5 3 1 1 3 1 5 1 5 1 3 3 5 5 4 3 4 4 3 3 3 3 3 3 5 1 0 1 1 5 4 0 5 3 5 4 5 3 4 4 3 4 4 5 4 5 1 1 5 3 5 3 5 3 3 5 1 5 5 5 1 5 5 0 0 5 1 1 4 5 4 5 5 5 1 3 3 3 2 5 5 5 3 1 3 3 2 3 5 4 3 5 1 5 4 3 0 4 1 5 4 3 3 1 0 1 1 2 0 3 2 5 5 5 5 5 3 5 5 2 4 5 5 3 3 3 1 1 1 1 1 0 5 0 5 1 1 3 4 1 1 1 1 3 4 1 3 3 3 2 4 3 1 3 1 5 5 2 0 5 1 4 1 1 5 5 1 3 1 5 4 

 4 0 4 3 5 3 5 3 1 3 3 0 5 5 3 1 5 0 1 1 1 5 1 3 1 1 3 1 2 1 4 5 3 1 3 2 3 2 1 3 1 1 4 5 4 4 3 4 5 4 1 3 3 1 5 1 3 3 4 5 1 3 1 4 2 1 4 3 1 4 5 1 5 5 4 1 5 1 4 3 1 3 2 4 4 4 2 3 1 4 4 4 5 3 3 5 1 1 0 4 1 2 1 5 3 0 4 1 2 4 0 3 3 4 4 3 3 1 4 3 1 3 5 3 4 1 4 0 3 3 4 5 4 5 1 4 5 5 5 4 4 2 1 4 3 5 4 5 4 1 3 5 3 3 1 3 2 2 3 2 3 0 0 2 5 1 1 1 5 1 1 3 1 0 3 4 4 4 1 2 3 2 4 2 5 4 5 3 2 5 5 3 1 1 0 5 1 1 4 3 3 1 4 2 2 3 1 2 5 3 3 5 1 3 3 5 3 3 5 2 3 0 2 5 2 1 4 4 1 1 1 1 2 1 1 1 3 1 3 1 2 5 2 5 4 5 4 5 1 3 1 1 1 4 5 1 0 4 1 4 5 3 2 5 3 5 5 3 4 2 1 3 1 2 5 3 3 3 4 3 5 1 3 1 5 4 2 5 1 1 1 0 4 5 4 3 2 0 5 4 0 5 3 3 2 0 3 4 4 3 5 1 5 5 5 2 1 1 1 3 4 4 2 4 2 3 1 1 5 1 5 4 1 1 5 4 5 3 3 4 4 5 5 5 4 3 5 1 2 3 5 2 4 1 1 4 3 2 4 1 2 1 1 3 2 4 3 2 5 4 2 4 1 1 1 2 2 1 1 0 4 2 1 3 1 5 3 4 1 5 1 0 5 4 2 2 3 2 5 3 2 1 3 2 1 2 2 4 2 3 1 1 3 2 3 1 1 4 1 2 3 3 4 1 2 1 1 2 2 2 5 4 3 5 0 5 1 3 4 5 1 5 4 3 3 3 4 2 5 5 5 3 5 3 1 2 4 1 4 2 1 3 1 4 4 3 4 3 0 3 1 4 2 4 1 4 1 3 5 1 2 5 3 1 4 4 5 5 3 1 5 5 2 3 3 3 3 5 3 4

 4 4 2 4 4 3 5 1 5 4 1 2 1 1 1 4 0 1 4 4 3 2 4 2 4 1 4 3 2 3 3 3 4 5 4 5 4 1 2 1 1 3 4 3 2 2 3 4 1 1 0 2 4 1 3 4 3 5 1 4 3 2 4 1 3 3 1 1 4 1 5 5 2 4 0 3 3 4 5 4 5 4 5 4 1 0 4 4 2 4 2 4 3 2 1 3 2 5 1 1 5 4 3 4 3 4 5 1 4 5 1 2 0 0 2 5 1 3 3 1 2 2 1 3 4 5 2 1 4 1 0 1 1 3 2 1 0 3 2 1 3 4 1 4 4 1 3 3 5 1 4 1 4 1 1 5 4 4 3 1 1 2 3 4 4 1 1 1 3 4 5 3 3 3 4 3 3 1 4 2 3 4 1 1 5 1 1 4 0 2 5 1 5 4 1 0 3 4 3 2 2 4 1 4 1 1 1 4 1 5 5 1 4 3 4 3 3 2 5 5 5 3 1 5 1 1 4 1 1 3 4 5 3 2 3 3 2 2 4 2 1 4 4 1 2 1 2 4 5 5 2 1 4 1 1 1 1 4 1 4 5 4 3 2 4 1 1 4 4 1 2 3 3 5 5 1 5 2 5 2 2 2 3 1 4 1 3 1 2 1 5 1 2 2 4 1 5 3 1 0 4 4 5 1 5 4 5 4 1 4 1 4 3 5 4 2 1 1 5 3 1 5 5 1 4 5 5 5 1 4 3 2 3 1 2 4 1 1 1 2 4 0 5 1 3 1 4 3 5 1 5 1 5 4 1 3 4 5 1 3 3 5 1 3 1 5 5 1 4 1 1 3 5 3 4 4 1 4 2 5 5 5 4 4 2 4 5 3 1 4 2 5 0 4 4 2 1 1 1 3 1 4 1 3 4 3 5 2 1 3 5 4 4 5 1 4 1 4 1 4 2 0 3 5 1 3 4 1 5 1 4 1 1 2 2 4 3 2 1 5 4 5 4 3 2 1 1 3 2 1 4 3 4 1 1 5 3 4 4 1 1 3 1 1 3 3 3 2 3 2 4 1 1 1 3 1 2 4 0 2 1 1 4 3 1 1 1 2 4 5 3 4 4 4 4 5 2 3 4 5

In [27]:
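# Load the validation features: reputation columns plus the transformer outputs

ValidData=pd.read_excel('validReputation.xlsx' )
# Drop the last two columns (the final one holds the label) and scale the rest by 200
ValidData=ValidData.iloc[:,:-2].astype(float)
ValidData=ValidData/200

# Classification vectors saved by the transformer on the validation set
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/Saves/"
TF_Output=pd.read_csv( SavesDirectory+'evalOut.tsv', sep='\t')

# Append the transformer outputs as extra feature columns
ValidData=pd.concat([ValidData,TF_Output], axis=1)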


ValidData=torch.tensor(ValidData.values)
ValidData

tensor([[ 0.0200,  0.0500,  0.0550,  ..., -0.3562, -1.0098, -0.8691],
        [ 0.0100,  0.0300,  0.0300,  ...,  1.0410,  0.2646, -0.4180],
        [ 0.0450,  0.3550,  0.3500,  ...,  0.5498,  0.1101,  0.2856],
        ...,
        [ 0.0000,  0.0000,  0.0000,  ...,  0.4402,  1.1455,  1.0537],
        [ 0.0000,  0.0500,  0.0400,  ...,  0.4883,  1.1436,  0.5972],
        [ 0.0000,  0.0000,  0.0000,  ...,  1.3447,  1.0322,  0.3281]],
       dtype=torch.float64)
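
The feature assembly in the cell above can be sketched as a small reusable function. This is a minimal sketch, not the notebook's own code: `build_features` and its `scale` parameter are hypothetical names, and the tiny frames stand in for `validReputation.xlsx` and `evalOut.tsv`. It resets both indexes first because `pd.concat(..., axis=1)` aligns rows by index label, and it returns `float32` so that no later `.float()` cast is needed.

```python
import pandas as pd
import torch


def build_features(reputation, tf_output, scale=200.0):
    """Scale the reputation columns and append the transformer outputs.

    Hypothetical helper: mirrors the cell above on in-memory frames.
    """
    # Drop the last two columns (as the notebook does) and scale the rest.
    rep = reputation.iloc[:, :-2].astype(float) / scale
    if len(rep) != len(tf_output):
        raise ValueError("row counts differ: %d vs %d" % (len(rep), len(tf_output)))
    # concat(axis=1) aligns on index labels, so make both indexes 0..n-1 first.
    combined = pd.concat(
        [rep.reset_index(drop=True), tf_output.reset_index(drop=True)], axis=1
    )
    return torch.tensor(combined.values, dtype=torch.float32)


# Stand-in data (made up for illustration only).
reputation = pd.DataFrame(
    {"a": [4, 2, 0], "b": [10, 6, 0], "extra": [1, 1, 1], "label": [0, 3, 3]}
)
tf_output = pd.DataFrame({"s0": [-0.5, 1.0, 0.5], "s1": [0.25, -1.0, 0.0]})
features = build_features(reputation, tf_output)
print(features.shape)
```

The length check guards against the silent `NaN` rows that a column-wise concat produces when the two files do not have the same number of rows.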

In [28]:
labels=pd.read_excel('validReputation.xlsx' )

labels=labels.iloc[:,-1] 
labelsOneHot=pd.get_dummies(labels)
labelsOneHot

Unnamed: 0,0,1,2,3,4,5
0,1,0,0,0,0,0
1,0,0,0,1,0,0
2,0,0,0,1,0,0
3,0,0,0,0,1,0
4,0,0,0,0,1,0
...,...,...,...,...,...,...
1279,0,1,0,0,0,0
1280,0,0,1,0,0,0
1281,0,0,0,0,1,0
1282,0,0,0,0,1,0


In [29]:
ValidLables =torch.tensor(labelsOneHot.values)
ValidLables

tensor([[1, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0, 0],
        ...,
        [0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1, 0],
        [0, 0, 0, 1, 0, 0]], dtype=torch.uint8)
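
`pd.get_dummies` only creates a column for each label value that actually occurs in the file, so a split that happens to miss one class would yield fewer than six columns. As an alternative sketch (not what the notebook does), `torch.nn.functional.one_hot` with an explicit `num_classes` always returns six columns. The label values below are made up for illustration.

```python
import torch
import torch.nn.functional as F

NUM_CLASSES = 6  # the six truthfulness classes used in this notebook

# Made-up integer labels; classes 2 and 5 are deliberately absent.
labels = torch.tensor([0, 3, 3, 4, 1])

# one_hot expects an integer (int64) tensor of class indices.
one_hot = F.one_hot(labels, num_classes=NUM_CLASSES)
print(one_hot)
```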

In [30]:
correct = 0
total = 0

Y=[]  #target
Pred=[]  #predicted

with torch.no_grad():
    for row in range(len(ValidData)):
        outputs = net(ValidData[row,:].float())
        total+=1
        # manual argmax: keep the index of the largest of the six outputs
        result=0
        if outputs[0]<outputs[1]:result=1
        if outputs[result]<outputs[2]:result=2
        if outputs[result]<outputs[3]:result=3
        if outputs[result]<outputs[4]:result=4
        if outputs[result]<outputs[5]:result=5
        
        # count a hit when the predicted class matches the one-hot label
        if labelsOneHot.iloc[row,result]==1: correct+=1
        
        Y.append(labels.iloc[row])
        Pred.append(result)
        
        print(result, end=' ')
        
    
print('Correct:', correct, 'out of:', total )
print('Accuracy of the network : ',( 100 * correct / total))

1 3 5 5 4 3 5 5 5 4 5 3 1 1 5 3 3 5 0 2 3 3 1 2 3 3 2 5 3 3 4 4 4 1 0 0 3 4 1 0 0 2 1 0 5 3 5 3 4 3 5 1 0 5 5 1 5 1 3 5 3 5 1 3 1 3 0 1 1 5 3 2 4 0 5 3 3 1 3 1 1 2 1 1 3 3 1 4 1 1 1 2 5 3 3 1 3 1 0 1 0 1 1 4 1 1 0 0 1 1 1 1 4 1 1 1 1 2 1 1 4 2 3 0 3 3 5 4 1 1 1 3 2 4 4 2 3 2 5 3 3 3 3 3 5 3 1 5 3 4 1 4 1 1 5 2 5 1 3 2 4 5 2 4 5 4 2 1 1 3 3 3 3 2 1 4 5 4 2 5 1 5 5 2 2 5 4 5 4 4 0 2 3 3 4 5 2 1 4 5 2 4 3 1 5 2 4 4 4 2 2 1 1 1 1 3 3 5 2 4 4 2 1 3 3 2 2 1 2 3 3 2 2 2 1 2 3 2 1 1 3 5 2 5 3 4 3 4 1 3 2 3 1 1 2 5 4 3 3 3 4 5 3 2 5 2 1 3 3 1 2 1 4 2 4 1 5 1 5 5 4 1 1 4 3 1 2 0 3 1 4 1 1 4 4 1 3 1 1 3 5 3 3 5 0 3 4 5 3 1 5 2 4 2 4 3 5 1 5 2 5 2 1 3 1 1 4 5 1 4 4 2 3 5 5 2 4 3 1 5 1 1 1 1 4 0 1 3 2 3 1 4 3 5 2 1 3 1 2 1 2 4 3 2 2 4 1 2 3 1 4 3 1 1 2 1 5 5 2 1 5 1 1 1 4 3 5 3 5 4 1 1 5 0 4 3 5 1 1 3 4 4 3 4 5 5 4 1 4 3 2 1 3 3 3 5 4 3 4 4 4 2 4 4 1 3 2 1 1 1 3 0 5 3 1 1 1 5 5 1 3 4 4 5 4 3 1 1 3 5 5 2 3 1 3 5 1 5 1 1 3 5 3 1 1 4 5 4 5 3 3 4 4 5 4 3 5 1 3 4 1 0 1 3 3 3 2 3 1 3 1 3 4 1 2 2 5 1 1 2 
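
The chained comparisons in the loop above pick the index of the largest network output, one row at a time. The same prediction can be sketched in batched form with `argmax`. This is only a sketch under two assumptions: that the network accepts a 2-D batch of rows (true for stacks of `nn.Linear` layers, but not verified for `net` here), and that the targets are integer class indices. `predict_classes`, `accuracy_percent`, and the identity stand-in model are hypothetical names introduced for illustration.

```python
import torch
import torch.nn as nn


def predict_classes(model, features):
    """Return the index of the largest output for every row."""
    model.eval()
    with torch.no_grad():
        scores = model(features.float())
    return scores.argmax(dim=1)


def accuracy_percent(predicted, targets):
    """Percentage of rows whose predicted class equals the target class."""
    return 100.0 * (predicted == targets).sum().item() / len(targets)


# Stand-in model (hypothetical): an identity layer, so the scores equal the inputs.
model = nn.Linear(3, 3, bias=False)
with torch.no_grad():
    model.weight.copy_(torch.eye(3))

features = torch.tensor([[0.1, 0.9, 0.0], [0.7, 0.2, 0.1]], dtype=torch.float64)
targets = torch.tensor([1, 2])

predicted = predict_classes(model, features)
print(predicted.tolist(), accuracy_percent(predicted, targets))
```

One behavioural difference to keep in mind: the loop resolves exact ties in favour of the lowest class index, whereas the index `argmax` returns for tied maxima should not be relied on.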

In [31]:
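# Load the test features: reputation columns plus the transformer outputs

TestData=pd.read_excel('testReputation.xlsx' )
# Drop the last two columns (the final one holds the label) and scale the rest by 200
TestData=TestData.iloc[:,:-2].astype(float)
TestData=TestData/200

# Classification vectors saved by the transformer on the test set
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/Saves/"
TF_Output=pd.read_csv( SavesDirectory+'testOut.tsv', sep='\t')

# Append the transformer outputs as extra feature columns
TestData=pd.concat([TestData,TF_Output], axis=1)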


TestData=torch.tensor(TestData.values)
TestData

tensor([[ 0.0050,  0.0100,  0.0050,  ...,  1.1494,  1.3086,  0.3276],
        [ 0.0000,  0.0000,  0.0000,  ...,  0.3982,  0.0169, -0.1653],
        [ 0.0200,  0.0250,  0.0600,  ...,  0.0045, -0.2571, -0.2096],
        ...,
        [ 0.0100,  0.0050,  0.0250,  ...,  0.3398, -0.5029, -1.2500],
        [ 0.2200,  0.0950,  0.0350,  ..., -0.0311, -0.2542, -0.1477],
        [ 0.0050,  0.0600,  0.0100,  ..., -0.1382, -0.1805, -0.0736]],
       dtype=torch.float64)

In [32]:
labels=pd.read_excel('testReputation.xlsx' )

labels=labels.iloc[:,-1] 
labelsOneHot=pd.get_dummies(labels)
labelsOneHot

Unnamed: 0,0,1,2,3,4,5
0,0,0,0,0,1,0
1,0,0,0,1,0,0
2,0,1,0,0,0,0
3,0,0,0,0,0,1
4,0,0,0,0,0,1
...,...,...,...,...,...,...
1278,0,1,0,0,0,0
1279,0,0,0,0,1,0
1280,0,0,1,0,0,0
1281,1,0,0,0,0,0


In [33]:
TestLables =torch.tensor(labelsOneHot.values)
TestLables

tensor([[0, 0, 0, 0, 1, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 1, 0, 0, 0, 0],
        ...,
        [0, 0, 1, 0, 0, 0],
        [1, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0]], dtype=torch.uint8)

In [45]:
correct = 0
total = 0

Y=[]  #target
Pred=[]  #predicted

with torch.no_grad():
    for row in range(len(TestData)):
        outputs = net(TestData[row,:].float())
        total+=1
        # manual argmax: keep the index of the largest of the six outputs
        result=0
        if outputs[0]<outputs[1]:result=1
        if outputs[result]<outputs[2]:result=2
        if outputs[result]<outputs[3]:result=3
        if outputs[result]<outputs[4]:result=4
        if outputs[result]<outputs[5]:result=5
        
        # count a hit when the predicted class matches the one-hot label
        if labelsOneHot.iloc[row,result]==1: correct+=1
        
        Y.append(labels.iloc[row])
        Pred.append(result)
        
        print(result, end=' ')
        
       
print('Correct:', correct, 'out of:', total )
print('Accuracy of the network : ',( 100 * correct / total))

4 3 2 3 4 3 4 3 5 3 3 3 3 1 4 3 1 3 1 4 5 4 5 3 5 5 1 4 3 5 4 1 0 3 5 3 5 3 5 4 5 5 3 1 5 3 5 2 3 5 4 3 3 5 1 4 5 2 5 1 1 2 1 5 1 5 5 5 1 3 5 5 1 5 5 1 5 1 5 5 2 5 5 4 1 2 1 5 5 4 3 1 1 1 2 1 4 5 3 1 4 0 3 1 1 1 1 1 1 5 1 3 4 3 1 1 0 3 1 2 3 2 4 3 1 4 1 4 5 3 3 3 1 3 3 2 5 1 2 3 4 4 1 3 3 3 3 3 4 3 0 3 1 4 5 3 5 5 3 3 4 1 1 1 3 5 4 5 3 1 1 3 5 5 1 3 5 3 3 2 5 4 3 1 1 0 2 4 4 5 2 5 5 0 1 1 2 2 3 1 1 3 5 4 4 1 3 5 3 3 2 2 3 1 4 1 2 4 2 1 1 1 3 1 1 4 3 2 3 0 4 1 0 1 3 2 2 0 5 2 2 1 2 2 4 1 2 1 3 3 0 4 4 5 3 3 3 0 4 1 2 1 1 2 0 5 1 3 4 2 2 3 1 2 5 2 5 1 1 3 5 1 3 4 3 5 5 1 3 0 1 4 1 3 1 1 3 3 3 4 1 4 4 5 5 3 2 5 3 3 1 4 1 3 3 4 5 3 3 3 5 4 4 3 1 0 5 4 5 3 2 3 3 1 5 1 1 3 3 2 4 3 3 3 3 5 2 3 1 1 1 0 4 1 4 1 1 3 1 2 4 1 1 1 2 0 1 4 5 1 0 1 5 5 4 3 4 5 3 1 5 1 3 4 1 3 3 2 1 5 3 2 1 1 2 3 3 1 1 4 3 1 0 1 4 5 2 5 3 5 3 2 1 4 2 1 1 5 3 5 2 4 4 1 2 3 3 3 4 3 5 1 3 3 2 4 3 3 5 1 4 5 1 1 3 4 3 2 3 5 1 5 1 3 4 4 4 3 1 1 5 2 3 4 4 3 1 3 1 3 3 4 1 4 1 2 3 1 5 3 2 3 3 1 2 1 2 0 5 5 1 3 4 3 2 1 3 4 3 4 

In [35]:
from sklearn import metrics
 
print(metrics.confusion_matrix(Y,Pred))

[[ 40  26   8  13   4   1]
 [  3 143  27  38  21  18]
 [  3  35  79  52  27  18]
 [  1  40  27 125  47  27]
 [  0  21  13  64 117  34]
 [  3  21   9  25  51 102]]
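
In scikit-learn's convention, row i of the confusion matrix is the true class and column j is the predicted class, so the diagonal holds the correct predictions. As a short worked sketch, the per-class recall and precision and the overall accuracy can be recomputed directly from the matrix printed above (the variable names are illustrative only).

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted).
cm = np.array([
    [40, 26, 8, 13, 4, 1],
    [3, 143, 27, 38, 21, 18],
    [3, 35, 79, 52, 27, 18],
    [1, 40, 27, 125, 47, 27],
    [0, 21, 13, 64, 117, 34],
    [3, 21, 9, 25, 51, 102],
])

true_positives = np.diag(cm)
recall = true_positives / cm.sum(axis=1)     # correct / all rows of that true class
precision = true_positives / cm.sum(axis=0)  # correct / all rows predicted as that class
accuracy = true_positives.sum() / cm.sum()

print("recall:   ", np.round(recall, 2))
print("precision:", np.round(precision, 2))
print("accuracy: ", round(float(accuracy), 2))
```

For the first class this gives a recall of 40/92 and a precision of 40/50.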


In [36]:
target_names = ['Pants', 'False', 'Barely-True','Half-True','Mostly-True','True']

print(metrics.classification_report(Y, Pred,target_names =target_names))

              precision    recall  f1-score   support

       Pants       0.80      0.43      0.56        92
       False       0.50      0.57      0.53       250
 Barely-True       0.48      0.37      0.42       214
   Half-True       0.39      0.47      0.43       267
 Mostly-True       0.44      0.47      0.45       249
        True       0.51      0.48      0.50       211

    accuracy                           0.47      1283
   macro avg       0.52      0.47      0.48      1283
weighted avg       0.49      0.47      0.47      1283
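
When the scores are needed programmatically rather than as printed text, `classification_report` also accepts `output_dict=True` and returns a nested dictionary. The sketch below uses made-up labels and class names, not the notebook's `Y` and `Pred`.

```python
from sklearn import metrics

# Made-up targets and predictions for three classes.
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 2, 1]
names = ["a", "b", "c"]

report = metrics.classification_report(
    y_true, y_pred, target_names=names, output_dict=True
)

# Per-class entries are keyed by class name; averages have their own keys.
print(report["b"]["precision"], report["c"]["recall"], report["accuracy"])
print(report["macro avg"]["f1-score"])
```

The dictionary form makes it easy to log a single number, such as the macro-averaged F1, across different runs.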



In [37]:
#save the FCNN model

stage='NNetwork/'
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/"+stage
# PATH must be defined before torch.save runs; uncomment the line below (or set another file name)
#PATH = SavesDirectory+'Tanh_MSE_adam4731.pth'

torch.save(net.state_dict(), PATH)

# more on saving pytorch networks: https://pytorch.org/docs/stable/notes/serialization.html
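
Because only the `state_dict` is saved, loading it back requires an instance of the same network class first. The round trip can be sketched as follows. `TinyNet`, its layer sizes, and the temporary file name are hypothetical stand-ins, since the notebook's own network class is defined elsewhere.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the notebook's FCNN (layer sizes are made up)."""

    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 6))

    def forward(self, x):
        return self.layers(x)


net = TinyNet()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tiny_net.pth")
    # Save only the learned parameters, as the cell above does.
    torch.save(net.state_dict(), path)
    # Rebuild the architecture, then load the parameters into it.
    restored = TinyNet()
    restored.load_state_dict(torch.load(path, map_location="cpu"))

restored.eval()  # switch to inference mode before evaluating

x = torch.rand(2, 4)
with torch.no_grad():
    same = torch.equal(net(x), restored(x))
print(same)
```

Passing `map_location="cpu"` lets a model saved on a GPU be reloaded on a machine without one.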