In this notebook we first classify the statements with the transformer. This is our first classification task.

The transformer's output classification vectors are then saved for use by the fully connected neural network (FCNN). This is our second classification task.


In [1]:
# Importing necessary libraries
import pandas as pd
from datetime import datetime
import sklearn
import torch
import torch.nn as nn
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device

device(type='cuda', index=0)

In [2]:
from simpletransformers.classification import ClassificationModel

## Preparing the dataset

Some pre-processing to the dataset has already been done in preparation for various tests, so this processing is not from scratch.

In [3]:
# procedure for getting the data sets and formatting them for the transformer
 

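def prepareDataset(filename):
    # Read the pre-cleaned spreadsheet
    ReadSet = pd.read_excel(filename)

    # simpletransformers expects the columns to be named 'text' and 'labels'
    ReadSet['text'] = ReadSet['Statement']
    ReadSet['labels'] = ReadSet['Label']

    # Drop every other column, including the two originals just copied
    ReadSet = ReadSet.drop(['ID', 'Label', 'Statement', 'Subject', 'Speaker', 'Job', 'From',
                            'Affiliation', 'PantsTotal', 'NotRealTotal', 'BarelyTotal',
                            'HalfTotal', 'MostlyTotal', 'RealTotal', 'Context'], axis=1)

    return ReadSet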


In [4]:
# preparing the training dataset
train=prepareDataset( 'train-clean.xlsx')
# and display for inspecting
train

Unnamed: 0,text,labels
0,President Obama is a Muslim.,0
1,An independent payment advisory board created ...,0
2,U.S. Sen. Bill Nelson was the deciding vote fo...,2
3,Large phone companies and their trade associat...,4
4,RIPTA has really some of the fullest buses for...,4
...,...,...
10094,The Georgia Dome has returned $10 billion in e...,1
10095,Then-Gov. Carl Sanders put 56 percent of the s...,4
10096,Nathan Deal saved the HOPE scholarship program.,4
10097,John Faso took money from fossil fuel companie...,3


In [5]:
# preparing the evaluation/validation dataset
Eval=prepareDataset('valid-clean.xlsx')
# and display for inspecting
Eval

Unnamed: 0,text,labels
0,New Jerseys once-broken pension system is now ...,3
1,The new health care law will cut $500 billion ...,2
2,"For thousands of public employees, Wisconsin G...",3
3,Because as a Senator Toomey stood up for Wall ...,4
4,The governors budget proposal reduces the stat...,5
...,...,...
1267,You can import as many hemp products into this...,5
1268,Says when Republicans took over the state legi...,3
1269,Wisconsin's laws ranked the worst in the world...,2
1270,"There currently are 825,000 student stations s...",4


In [6]:
# preparing the test set dataset
test=prepareDataset('test-clean.xlsx')
test

Unnamed: 0,text,labels
0,"In a lawsuit between private citizens, a Flori...",4
1,Obama-Nelson economic record: Job creation a...,4
2,Says George LeMieux even compared Marco Rubio ...,2
3,Gene Green is the NRAs favorite Democrat in Co...,2
4,"In labor negotiations with city employees, Mil...",2
...,...,...
1250,Says Milwaukee County Executive Chris Abele sp...,1
1251,"The words subhuman mongrel, which Ted Nugent c...",5
1252,California's Prop 55 prevents $4 billion in ne...,2
1253,Says One of the states largest governments mad...,0


## Setting up the transformer for fine tuning

This is where the settings are changed to optimise the model.

The simpletransformers library is the quickest way to do this at the time of writing.
For more information on the settings and their default values, see:
https://github.com/ThilinaRajapakse/simpletransformers#default-settings

###### Please do read that reference before changing any parameters. Don't try to be a hero!

In [7]:
#Set the model being used here
model_class='albert'  # bert or roberta or albert
model_version='albert-large-v2' #bert-base-cased, roberta-base, roberta-large, albert-base-v2 OR albert-large-v2


output_folder='./TunedModels/'+model_class+'/'+model_version+"/"
cache_directory= "./TunedModels/"+model_class+"/"+model_version+"/cache/"
labels_count=6  # the number of classification classes

print('model variables were set up: ')

model variables were set up: 


In [8]:
# use this to test if writing to the directories is working

import os
print(os.getcwd())
print(output_folder)
print(cache_directory)

testWrite=train.head(30)
 
testWrite.to_csv(output_folder+'DeleteThisToo.tsv', sep='\t')
testWrite.to_csv(cache_directory+'DeleteThisToo.tsv', sep='\t')

del(testWrite)

G:\0 finalThesis\CleanedText
./TunedModels/albert/albert-large-v2/
./TunedModels/albert/albert-large-v2/cache/


In [15]:
 
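save_every_steps=1285
# The figure of 1284 assumes a training batch size of 8.
# Any save_steps value above the number of steps in one epoch saves the model only at the end of each epoch.
# With train_batch_size 64 (set below) the training log shows 158 steps per epoch, so 1285 is still above it.
# Saving the model very often during training consumes disk space fast.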

train_args={
    "output_dir":output_folder,
    "cache_dir":cache_directory,
    'reprocess_input_data': True,
    'overwrite_output_dir': True,
    'num_train_epochs': 1,
    "save_steps": save_every_steps, 
    "learning_rate": 1.2e-5,
    "train_batch_size": 64,
    "eval_batch_size": 16,
    "evaluate_during_training_steps": 5,
    "max_seq_length": 100,
    "n_gpu": 1,
}

# Create a ClassificationModel
model = ClassificationModel(model_class, model_version, num_labels=labels_count, args=train_args) 

# You can set class weights by using the optional weight argument
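
The relationship between batch size, steps per epoch and `save_steps` can be checked with a little arithmetic. The sketch below is only illustrative: `steps_per_epoch` is a hypothetical helper, not part of simpletransformers, and it assumes one optimisation step per batch with no gradient accumulation.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    # Hypothetical helper: one step per batch, the last partial batch included
    return math.ceil(num_examples / batch_size)


train_rows = 10099  # row count reported by the feature-conversion progress bar

for batch_size in (8, 64):
    # Any save_steps value above this number only triggers the end-of-epoch save
    print(batch_size, steps_per_epoch(train_rows, batch_size))
```

For a batch size of 64 this gives 158, the same maximum that the "Current iteration" progress bar reports in the training log.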

### Loading a saved model (based on above args{})

If you stopped training, you can continue from a previously saved checkpoint.
The next cell lets you load a model from any checkpoint.
Training then runs for the number of epochs set in train_args{}, continuing the tuning from your checkpoint.

###### HOWEVER
It will overwrite previous checkpoints!
Example: if you load an epoch-3 checkpoint and train for one more epoch, the result is saved under the epoch-1 name and overwrites the old epoch-1 checkpoint, even though it is equivalent to a 4th epoch.
###### SO BE CAREFUL
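
One way to guard against the overwriting described above is to copy a checkpoint folder somewhere safe before training again. This is a minimal sketch using only the standard library; `backup_checkpoint` and the `./CheckpointBackups/` folder are hypothetical names, not something simpletransformers provides.

```python
import os
import shutil


def backup_checkpoint(checkpoint_dir, backup_root):
    # Hypothetical helper: copy a checkpoint folder to a separate location
    # so that a later training run cannot overwrite it.
    name = os.path.basename(os.path.normpath(checkpoint_dir))
    target = os.path.join(backup_root, name)
    if os.path.exists(target):
        # Refuse to replace an existing backup
        raise FileExistsError("backup already exists: " + target)
    shutil.copytree(checkpoint_dir, target)
    return target


# Example call (paths are assumptions; adjust to your own folders):
# backup_checkpoint(output_folder + 'checkpoint-316-epoch-2', './CheckpointBackups/')
```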

In [22]:
# loading a previously saved model based on this particular Transformer Class and model_name

# loading the checkpoint that gave the best result
CheckPoint='checkpoint-316-epoch-2'  #epoch 2


preSavedCheckpoint=output_folder+CheckPoint

print('Loading model, please wait...')
model = ClassificationModel( model_class, preSavedCheckpoint, num_labels=labels_count, args=train_args) 
print('model in use is :', preSavedCheckpoint )

Loading model, please wait...
model in use is : ./TunedModels/albert/albert-large-v2/checkpoint-316-epoch-2


## Training the Transformer

Skip the next cell if you want to go straight to the evaluation without training.

In [23]:
# Train the model
current_time = datetime.now()
model.train_model(train)
print("Training time: ", datetime.now() - current_time)

Converting to features started. Cache is not used.


HBox(children=(FloatProgress(value=0.0, max=10099.0), HTML(value='')))


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


HBox(children=(FloatProgress(value=0.0, description='Epoch', max=1.0, style=ProgressStyle(description_width='i…

HBox(children=(FloatProgress(value=0.0, description='Current iteration', max=158.0, style=ProgressStyle(descri…

Running loss: 1.658109Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Running loss: 1.735714Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Running loss: 1.625402Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 8192.0
Running loss: 1.748201

Training of albert model complete. Saved to ./TunedModels/albert/albert-large-v2/.
Training time:  0:03:07.161133


## Evaluating the training

In [24]:
TrainResult, TrainModel_outputs, wrong_predictions = model.eval_model(train, acc=sklearn.metrics.accuracy_score)

EvalResult, EvalModel_outputs, wrong_predictions = model.eval_model(Eval, acc=sklearn.metrics.accuracy_score)

TestResult, TestModel_outputs, wrong_predictions = model.eval_model(test, acc=sklearn.metrics.accuracy_score)

print('Training Result:', TrainResult['acc'])
#print('Model Out:', TrainModel_outputs)

print('Eval Result:', EvalResult['acc'])
#print('Model Out:', EvalModel_outputs)

print('Test Set Result:', TestResult['acc'])
#print('Model Out:', TestModel_outputs)

Features loaded from cache at ./TunedModels/albert/albert-large-v2/cache/cached_dev_albert_100_6_10099


HBox(children=(FloatProgress(value=0.0, max=632.0), HTML(value='')))


{'mcc': 0.14302343923293323, 'acc': 0.30300029705911474, 'eval_loss': 1.6455843840973288}
Features loaded from cache at ./TunedModels/albert/albert-large-v2/cache/cached_dev_albert_100_6_1272


HBox(children=(FloatProgress(value=0.0, max=80.0), HTML(value='')))


{'mcc': 0.07267173158242483, 'acc': 0.24842767295597484, 'eval_loss': 1.7028664216399192}
Features loaded from cache at ./TunedModels/albert/albert-large-v2/cache/cached_dev_albert_100_6_1255


HBox(children=(FloatProgress(value=0.0, max=79.0), HTML(value='')))


{'mcc': 0.07538753483067408, 'acc': 0.250996015936255, 'eval_loss': 1.6997072953212111}
Training Result: 0.30300029705911474
Eval Result: 0.24842767295597484
Test Set Result: 0.250996015936255


In [25]:
Pred=[]

countCorrect=0

for row in range(TestModel_outputs.shape[0]):
    outputs=TestModel_outputs[row]
    #print(test.iloc[row,0])
    print(outputs, end=' ')
    
    result=0
    if outputs[0]<outputs[1]:result=1
    if outputs[result]<outputs[2]:result=2
    if outputs[result]<outputs[3]:result=3
    if outputs[result]<outputs[4]:result=4
    if outputs[result]<outputs[5]:result=5
    Pred.append(result)
    print(result, ' ',test.iloc[row,1], end=' ')
    if result==test.iloc[row,1]:
        countCorrect+=1
        print('Match',countCorrect)
    print('')

print(countCorrect)

[-0.01067352  0.47070312  0.32421875 -0.16882324 -0.38891602 -0.03646851] 1   4 
[-1.1787109   0.37060547 -0.06860352  0.4416504   0.5527344   0.3251953 ] 4   4 Match 1

[-0.60498047  0.35058594  0.19274902  0.13439941  0.05429077 -0.19616699] 1   2 
[-0.6972656   0.03170776  0.06051636  0.609375    0.16479492  0.00113297] 3   2 
[-0.10852051  0.15368652  0.2565918   0.2607422  -0.38989258 -0.42163086] 3   2 
[-1.4775391   0.11663818 -0.15234375  0.6484375   1.0341797   0.578125  ] 4   5 
[-0.19213867  0.27661133  0.3256836   0.16809082 -0.4038086  -0.47485352] 2   3 
[-0.32055664  0.06652832  0.26782227  0.3515625  -0.23376465 -0.5761719 ] 3   2 
[-0.22595215  0.42041016  0.18652344 -0.01853943 -0.28076172 -0.16564941] 1   1 Match 2

[-0.05224609  0.11279297  0.31567383  0.22253418 -0.37231445 -0.6254883 ] 2   0 
[-1.5654297  -0.0082016  -0.27661133  0.57128906  0.9892578   0.85839844] 4   5 
[-0.11102295  0.38256836  0.29125977 -0.03808594 -0.3791504  -0.33569336] 1   2 
[-1.6376953 

[-0.5600586   0.35351562  0.23461914  0.2878418  -0.04327393 -0.04568481] 1   5 
[-0.5698242   0.4091797   0.15148926  0.05392456  0.01408386  0.06030273] 1   4 
[-1.6152344  -0.03231812 -0.26733398  0.7421875   1.1738281   0.66308594] 4   5 
[-1.5820312   0.0440979  -0.29296875  0.671875    0.8510742   0.7919922 ] 4   5 
[-0.68310547  0.16101074  0.16186523  0.31103516 -0.0305481  -0.03083801] 3   4 
[-1.6083984  -0.11114502 -0.32836914  0.26049805  0.94433594  0.83984375] 4   4 Match 32

[-1.4199219   0.05966187 -0.24157715  0.42651367  1.046875    0.8979492 ] 4   2 
[ 0.01751709  0.31030273  0.26171875  0.06848145 -0.5209961  -0.61035156] 1   2 
[-1.5576172   0.12103271 -0.17749023  0.5571289   0.97753906  0.66845703] 4   4 Match 33

[ 0.05511475  0.14318848  0.30297852  0.26464844 -0.45166016 -0.58984375] 2   0 
[-0.11395264  0.24743652  0.39233398  0.06719971 -0.4345703  -0.5361328 ] 2   0 
[ 0.18920898  0.26171875  0.2685547   0.01145935 -0.5649414  -0.7519531 ] 2   0 
[-0.077148


[-1.2509766   0.37402344 -0.05151367  0.5102539   0.61572266  0.23095703] 4   5 
[-1.2539062   0.27978516  0.10839844  0.7001953   0.5888672  -0.01829529] 3   4 
[-0.9506836   0.06726074  0.32861328  0.7944336   0.29248047 -0.26782227] 3   3 Match 68

[-1.3652344  -0.15759277 -0.02818298  1.0556641   0.75097656  0.50878906] 3   2 
[-1.5175781   0.02452087 -0.03079224  0.8613281   0.9501953   0.62841797] 4   4 Match 69

[-0.6879883   0.20080566  0.34057617  0.47509766  0.03158569 -0.30615234] 3   4 
[-1.5615234   0.07592773 -0.31274414  0.18811035  0.90722656  1.0126953 ] 5   1 
[-1.4248047  -0.06573486 -0.0473938   0.71972656  0.5571289   0.5629883 ] 3   4 
[ 0.1541748   0.21643066  0.24743652 -0.0066452  -0.52978516 -0.6401367 ] 2   1 
[-0.14343262  0.53759766  0.12017822 -0.37280273 -0.3359375   0.25756836] 1   1 Match 70

[-1.5527344  -0.0592041  -0.3479004   0.37182617  0.9458008   1.0195312 ] 5   4 
[-0.37817383  0.21875     0.37597656  0.23730469 -0.30249023 -0.40356445] 2   5 


[ 0.16625977  0.5288086   0.34960938 -0.2697754  -0.43774414  0.0166626 ] 1   4 
[-0.33325195  0.15014648  0.26049805  0.35180664 -0.21508789 -0.5126953 ] 3   3 Match 107

[-0.72021484  0.33520508  0.26293945  0.3017578   0.01560974 -0.25195312] 1   5 
[-0.48217773  0.19824219  0.3474121   0.49194336 -0.10888672 -0.40161133] 3   4 
[-1.5673828  -0.15185547 -0.27270508  0.5371094   1.0166016   0.9296875 ] 4   5 
[-0.16320801  0.22460938  0.30200195  0.12054443 -0.4609375  -0.47729492] 2   0 
[-1.0175781   0.34375     0.13623047  0.5600586   0.3305664  -0.09289551] 3   4 
[-1.7392578  -0.16125488 -0.44458008  0.83935547  1.1386719   0.71972656] 4   5 
[ 0.14916992  0.3479004   0.18237305 -0.14587402 -0.57373047 -0.5336914 ] 1   4 
[-0.60791016  0.38232422  0.20507812  0.09613037  0.00637054  0.20239258] 1   3 
[-0.01229095  0.39672852  0.24414062 -0.10919189 -0.42773438 -0.20031738] 1   2 
[-1.4570312   0.07067871 -0.14807129  0.71533203  0.8803711   0.7192383 ] 4   4 Match 108

[-0.7099

[-0.93652344  0.3334961   0.05453491  0.27319336  0.28857422  0.203125  ] 1   1 Match 148

[-1.5683594   0.24060059 -0.44604492  0.20654297  0.9428711   0.7602539 ] 4   4 Match 149

[-0.9477539   0.2800293   0.12988281  0.35302734  0.32885742  0.23083496] 3   3 Match 150

[-0.06048584  0.28295898  0.265625    0.06192017 -0.39770508 -0.46801758] 1   5 
[-0.70751953  0.34375     0.10137939  0.35058594  0.06542969 -0.19616699] 3   1 
[-1.0859375   0.22827148  0.21142578  0.6171875   0.37231445  0.007164  ] 3   3 Match 151

[-0.1194458   0.51220703  0.21398926 -0.2211914  -0.4543457  -0.0189209 ] 1   3 
[-0.59375     0.5932617   0.03277588 -0.25854492 -0.02330017  0.37231445] 1   5 
[ 0.04257202  0.26171875  0.3815918   0.13244629 -0.50341797 -0.56591797] 2   2 Match 152

[-0.21032715  0.13916016  0.30419922  0.34594727 -0.36279297 -0.56689453] 3   1 
[-0.77978516  0.4206543  -0.04702759  0.00408554  0.08380127  0.23547363] 1   5 
[-0.2788086   0.29663086  0.4404297   0.31054688 -0.2543945

[-1.1796875   0.45922852  0.01660156  0.10687256  0.5229492   0.65185547] 5   3 
[ 0.2553711   0.390625    0.27978516 -0.14453125 -0.65722656 -0.57666016] 1   3 
[-0.49365234  0.5620117   0.12176514 -0.08538818 -0.01138306  0.00963593] 1   2 
[-0.28320312  0.3894043   0.30932617  0.21398926 -0.19848633 -0.38427734] 1   1 Match 187

[-1.0214844   0.32080078  0.07983398  0.5126953   0.3605957   0.13049316] 3   3 Match 188

[-0.00250244  0.34936523  0.28881836 -0.02270508 -0.43481445 -0.54248047] 1   4 
[-1.2353516   0.3720703  -0.07659912  0.22790527  0.5751953   0.359375  ] 4   4 Match 189

[-1.28125     0.17736816  0.05169678  0.5307617   0.49853516  0.5839844 ] 5   4 
[-1.2109375   0.20727539 -0.13647461  0.0531311   0.45898438  0.9501953 ] 5   2 
[-1.6123047   0.11309814 -0.27441406  0.34277344  1.0683594   0.86083984] 4   5 
[-1.1083984   0.36132812 -0.15454102  0.17236328  0.5083008   0.47021484] 4   4 Match 190

[ 0.17236328  0.5053711   0.25952148 -0.32470703 -0.51220703 -0.25781

[-1.0576172   0.2861328   0.13696289  0.6435547   0.42504883  0.02970886] 3   3 Match 222

[-1.6142578  -0.20629883 -0.1229248   0.6977539   1.0185547   0.73876953] 4   2 
[-0.80566406  0.01686096  0.12963867  0.8647461   0.22814941 -0.1661377 ] 3   2 
[-1.4853516  -0.1673584  -0.19262695  1.0048828   0.92578125  0.63427734] 3   4 
[-1.0244141   0.4128418   0.10821533  0.47973633  0.3515625  -0.00804138] 3   2 
[-1.0654297   0.32055664  0.03009033  0.5234375   0.4477539   0.02850342] 3   5 
[-0.7080078   0.6591797   0.05465698 -0.19421387  0.0802002   0.51171875] 1   1 Match 223

[-0.22485352  0.2277832   0.30419922  0.40527344 -0.37426758 -0.6323242 ] 3   4 
[-0.13879395  0.2824707   0.28833008  0.17150879 -0.46044922 -0.46679688] 2   2 Match 224

[-0.33691406  0.484375    0.08258057 -0.14099121 -0.20117188 -0.07470703] 1   4 
[-0.84375     0.3322754  -0.04031372  0.20141602  0.43676758  0.63623047] 5   5 Match 225

[-0.44360352  0.3317871   0.16381836  0.20227051 -0.07196045  0.02046

[-1.3046875   0.21008301  0.00323486  0.47192383  0.72216797  0.6645508 ] 4   4 Match 252

[-1.2060547   0.11273193  0.08459473  0.65966797  0.61279297  0.5161133 ] 3   4 
[-1.5810547  -0.06030273 -0.2746582   0.6381836   1.0595703   0.6958008 ] 4   2 
[-1.5283203  -0.28076172 -0.07025146  1.046875    1.0390625   0.69677734] 3   3 Match 253

[-1.5576172  -0.11279297 -0.3959961   0.2434082   1.0097656   1.0527344 ] 5   4 
[-1.4541016  -0.08612061 -0.11938477  0.95751953  1.0927734   0.7216797 ] 4   3 
[-1.5488281  -0.12817383 -0.40283203  0.49121094  1.0517578   0.9165039 ] 4   2 
[-0.6123047   0.30273438  0.05001831  0.3791504   0.21679688 -0.24365234] 3   2 
[-0.5751953   0.1694336   0.28173828  0.5288086  -0.06768799 -0.5649414 ] 3   2 
[-1.4521484   0.18286133  0.07147217  0.79785156  0.7553711   0.2758789 ] 3   4 
[-0.6972656   0.30810547  0.03092957  0.23156738 -0.05999756  0.04559326] 1   2 
[-0.23474121  0.1583252   0.29296875  0.36572266 -0.3076172  -0.49658203] 3   3 Match 254

[-1.0810547   0.35888672  0.03646851  0.3774414   0.4033203   0.34814453] 4   1 
[-0.88671875  0.23010254  0.2692871   0.5727539   0.21716309 -0.04107666] 3   2 
[-0.79833984  0.32836914  0.1505127   0.32861328  0.20214844 -0.04464722] 3   3 Match 288

[ 0.31323242  0.23303223  0.36157227  0.05444336 -0.6196289  -0.5493164 ] 2   1 
[-0.6640625   0.24975586  0.20349121  0.43676758 -0.03083801 -0.2614746 ] 3   3 Match 289

[ 0.03799438  0.3046875   0.2668457  -0.0103302  -0.5698242  -0.59472656] 1   4 
[-0.6176758   0.31079102  0.34838867  0.25805664 -0.03222656 -0.08831787] 2   2 Match 290

[-0.7739258   0.23474121  0.3959961   0.46728516  0.13000488 -0.25512695] 3   5 
[-0.0736084   0.37817383  0.2763672  -0.06536865 -0.3869629  -0.21166992] 1   1 Match 291

[-1.1054688   0.39672852  0.07867432  0.28466797  0.40063477  0.29174805] 4   4 Match 292

[-0.98876953  0.08197021  0.23706055  0.81396484  0.41577148 -0.21679688] 3   4 
[-0.7861328   0.2998047   0.14172363  0.27661133  0.1336669
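
The chain of `if` comparisons in the cell above picks the index of the largest score in each row. Assuming the model outputs are a NumPy array (they expose `.shape` and print like one), the same result can be sketched with `np.argmax`. The two rows and their labels below are copied from the first lines of the printed output; the variable names are illustrative.

```python
import numpy as np

# Two rows copied from the printed test output, with their true labels
sample_outputs = np.array([
    [-0.01067352, 0.47070312, 0.32421875, -0.16882324, -0.38891602, -0.03646851],
    [-1.1787109, 0.37060547, -0.06860352, 0.4416504, 0.5527344, 0.3251953],
])
sample_labels = np.array([4, 4])

# Index of the largest score per row; ties resolve to the first index,
# the same as the strict '<' comparisons in the loop
predictions = np.argmax(sample_outputs, axis=1)
matches = int((predictions == sample_labels).sum())
print(predictions, matches)
```

On the full data the equivalent call would be `np.argmax(TestModel_outputs, axis=1)`, which avoids the per-row Python loop.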

In [26]:
from sklearn import metrics
print(metrics.confusion_matrix(test['labels'],Pred))

[[ 3 38 18 20  8  4]
 [ 2 83 19 68 50 11]
 [ 2 63 18 90 35 13]
 [ 0 57 24 96 70  8]
 [ 0 39  8 83 97 22]
 [ 1 45  9 60 73 18]]
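
The accuracy, recall and precision figures can be recovered directly from this matrix, which is a useful sanity check. The sketch below copies the matrix printed above and relies on the scikit-learn convention that rows are true labels and columns are predictions.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true label, columns: prediction)
cm = np.array([
    [3, 38, 18, 20, 8, 4],
    [2, 83, 19, 68, 50, 11],
    [2, 63, 18, 90, 35, 13],
    [0, 57, 24, 96, 70, 8],
    [0, 39, 8, 83, 97, 22],
    [1, 45, 9, 60, 73, 18],
])

accuracy = np.trace(cm) / cm.sum()          # correct predictions / all predictions
recall = np.diag(cm) / cm.sum(axis=1)       # per class: correct / rows (true counts)
precision = np.diag(cm) / cm.sum(axis=0)    # per class: correct / columns (predicted counts)

print(accuracy)
print(np.round(recall, 2))
print(np.round(precision, 2))
```

The diagonal sums to 315 out of 1255 statements, which reproduces the test set accuracy reported by `eval_model`.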


In [27]:
target_names = ['Pants', 'False', 'Barely-True','Half-True','Mostly-True','True']

print(metrics.classification_report(test['labels'], Pred,target_names =target_names))

              precision    recall  f1-score   support

       Pants       0.38      0.03      0.06        91
       False       0.26      0.36      0.30       233
 Barely-True       0.19      0.08      0.11       221
   Half-True       0.23      0.38      0.29       255
 Mostly-True       0.29      0.39      0.33       249
        True       0.24      0.09      0.13       206

    accuracy                           0.25      1255
   macro avg       0.26      0.22      0.20      1255
weighted avg       0.25      0.25      0.22      1255



In [28]:
# saving the output of the models to CSVs
#these are 1X6 classification vectors

SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/Saves/"
print('Saving...')
trainOut = pd.DataFrame(data= TrainModel_outputs )
trainOut.to_csv(SavesDirectory+'trainOut.tsv', sep='\t',  index=False)

evalOut = pd.DataFrame(data= EvalModel_outputs )
evalOut.to_csv(SavesDirectory+'evalOut.tsv', sep='\t',  index=False)

testOut = pd.DataFrame(data= TestModel_outputs )
testOut.to_csv(SavesDirectory+'testOut.tsv', sep='\t',  index=False)

print('Saving Complete on',datetime.now() ,'in:', SavesDirectory)

Saving...
Saving Complete on 2020-05-01 11:01:00.724276 in: ./TunedModels/albert/albert-large-v2/Saves/
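
Two details of this save step are easy to trip over: `to_csv` does not create a missing folder, and reading the file back turns the integer column names into strings. The sketch below shows both on made-up stand-in values written to a temporary folder; `demoOut.tsv` is a hypothetical file name.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for the model outputs: 3 rows of 6 class scores (made-up values)
outputs = np.array([
    [0.1, -0.2, 0.3, 0.0, 0.5, -0.4],
    [1.0, 0.2, -0.3, 0.7, 0.1, 0.0],
    [-1.5, 0.4, 0.2, 0.9, 1.1, 0.6],
])

with tempfile.TemporaryDirectory() as tmp:
    saves_dir = os.path.join(tmp, "Saves")
    os.makedirs(saves_dir, exist_ok=True)   # create the folder before writing
    path = os.path.join(saves_dir, "demoOut.tsv")
    pd.DataFrame(outputs).to_csv(path, sep="\t", index=False)
    loaded = pd.read_csv(path, sep="\t")

print(loaded.shape)
print(list(loaded.columns))   # column names come back as strings
```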


In [29]:
del(model)
#del(train,Eval,test)
del(trainOut,evalOut,testOut)
torch.cuda.empty_cache()

#  Adding the reputation vector

This section takes the output from the transformer used above and uses it together with the speaker's reputation to enhance the classification.

Before running this section, it is suggested that you restart the notebook kernel and run again from this cell. Otherwise the neural net will likely raise an error caused by some unreleased variable used by the simpletransformers library.

In [1]:
import pandas as pd
import torch
import torch.nn as nn
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device

device(type='cuda', index=0)

In [2]:

train=pd.read_excel('train-clean-Reputation.xlsx' )
train=train.iloc[:,:-1].astype(float)
train=train/200  #for scaling
#train

model_class='albert'  # bert or roberta or albert
model_version='albert-large-v2' #bert-base-cased, roberta-base, roberta-large, albert-base-v2 OR albert-large-v2
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/Saves/"
TF_Output=pd.read_csv( SavesDirectory+'trainOut.tsv', sep='\t')

train=pd.concat([train,TF_Output], axis=1)

train

Unnamed: 0,PantsTotal,NotRealTotal,BarelyTotal,HalfTotal,MostlyTotal,RealTotal,0,1,2,3,4,5
0,0.005,0.000,0.00,0.000,0.000,0.0,0.251709,0.507812,0.165771,-0.336426,-0.629395,-0.468018
1,0.005,0.000,0.01,0.000,0.000,0.0,-0.285400,0.240112,0.230591,0.284180,-0.336182,-0.511719
2,0.005,0.000,0.01,0.000,0.000,0.0,-0.137207,0.454590,0.310059,-0.156128,-0.430908,-0.191650
3,0.000,0.000,0.00,0.000,0.005,0.0,-1.269531,0.150269,0.098450,0.658691,0.732422,0.282959
4,0.000,0.000,0.00,0.000,0.005,0.0,-1.560547,-0.055054,-0.488525,0.243530,1.099609,0.950684
...,...,...,...,...,...,...,...,...,...,...,...,...
10094,0.000,0.005,0.00,0.000,0.010,0.0,-1.088867,0.264404,0.211792,0.831543,0.363037,-0.169678
10095,0.000,0.005,0.00,0.000,0.010,0.0,-1.488281,0.141235,0.070374,0.853027,0.795898,0.498535
10096,0.000,0.005,0.00,0.000,0.010,0.0,0.198120,0.395020,0.268311,-0.198730,-0.600586,-0.487793
10097,0.000,0.000,0.00,0.005,0.000,0.0,-0.303223,0.244751,0.422363,0.350830,-0.297852,-0.516602


In [3]:
TrainLables=pd.read_excel('train-clean-Reputation.xlsx' )
TrainLables=TrainLables.iloc[:,-1] 

TrainLables=pd.get_dummies(TrainLables)
TrainLables

Unnamed: 0,0,1,2,3,4,5
0,1,0,0,0,0,0
1,1,0,0,0,0,0
2,0,0,1,0,0,0
3,0,0,0,0,1,0
4,0,0,0,0,1,0
...,...,...,...,...,...,...
10094,0,1,0,0,0,0
10095,0,0,0,0,1,0
10096,0,0,0,0,1,0
10097,0,0,0,1,0,0


In [4]:
input=torch.tensor(train.values)
 
input

tensor([[ 0.0050,  0.0000,  0.0000,  ..., -0.3364, -0.6294, -0.4680],
        [ 0.0050,  0.0000,  0.0100,  ...,  0.2842, -0.3362, -0.5117],
        [ 0.0050,  0.0000,  0.0100,  ..., -0.1561, -0.4309, -0.1917],
        ...,
        [ 0.0000,  0.0050,  0.0000,  ..., -0.1987, -0.6006, -0.4878],
        [ 0.0000,  0.0000,  0.0000,  ...,  0.3508, -0.2979, -0.5166],
        [ 0.0000,  0.0000,  0.0000,  ...,  0.3975,  0.5186,  0.3503]],
       dtype=torch.float64)

In [5]:
targets=torch.tensor(TrainLables.astype(float).values)
 
targets

tensor([[1., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0.],
        ...,
        [0., 0., 0., 0., 1., 0.],
        [0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 1., 0.]], dtype=torch.float64)

In [6]:
 
InputSize = input.shape[1]     # number of features per row

OutputSize = targets.shape[1]  # number of classes

print('input size:', InputSize)
print('output size:', OutputSize)

input size: 12
output size: 6


In [24]:

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        
         
        # three affine layers: y = Wx + b
        self.fc1 = nn.Linear(InputSize, 24)  # InputSize is 12 here: 6 reputation counts + 6 transformer outputs
        self.fc2 = nn.Linear(24, 12)
        self.fc3 = nn.Linear(12, OutputSize)  # classifies 'OutputSize' different classes

    def forward(self, x):
        x = torch.tanh(self.fc1(x))
        x = torch.tanh(self.fc2(x)) 
        x = torch.tanh(self.fc3(x)).double()
        return x

    

#now we use it

net = Net()
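
To get a feel for how small this network is, the number of trainable parameters can be worked out by hand. The sketch below assumes the layer sizes shown above (12 inputs, hidden layers of 24 and 12, 6 outputs); `linear_params` is a hypothetical helper.

```python
def linear_params(n_in, n_out):
    # A fully connected layer has n_in * n_out weights plus one bias per output unit
    return n_in * n_out + n_out


layer_sizes = [12, 24, 12, 6]  # input, two hidden layers, output
per_layer = [linear_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)
```

The total can be compared against `sum(p.numel() for p in net.parameters())` on the instantiated network.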

In [50]:
# here we set up the neural network parameters
# pick an optimizer (Adam is used; plain SGD is left commented out below)

learning_rate = 1e-4
criterion = nn.MSELoss()  #computes the loss Function

import torch.optim as optim

# creating optimizer
#optimizer = optim.SGD(net.parameters(), lr=learning_rate)
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)


In [66]:
for epoch in range(100):  
        
    optimizer.zero_grad()   # zero the gradient buffers
    output = net(input.float())

    loss = criterion(output, targets)
    print('Loss:', loss, ' at epoch:', epoch)

    loss.backward()  #backprop
    optimizer.step()    # Does the update

Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 0
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 1
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 2
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 3
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 4
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 5
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 6
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 7
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 8
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 9
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 10
Loss: tensor(0.0947, dtype=torch.float64, grad_fn=<MseLossBackward>)  at epoch: 11
Loss: tensor(0

In [65]:
#load previously saved FCNN model 

stage='NNetwork6WayClassification/'
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/"+stage
PATH = SavesDirectory+'Tanh_MSE_adam4781.pth'  # must be defined before torch.load(PATH) below

net = Net()
net.load_state_dict(torch.load(PATH))

<All keys matched successfully>

In [27]:
correct = 0
total = 0

countCorrect0=0
countCorrect1=0
count0=0
count1=0
labels=pd.read_excel('train-clean-Reputation.xlsx' ).iloc[:,-1]  # keep only the label column so Y collects labels, not whole rows

Y=[]  #target
Pred=[]  #predicted

with torch.no_grad():
    for row in range(len(input)):
        outputs = net(input[row,:].float())
        result=0
        total+=1
        if outputs[0]<outputs[1]:result=1
        if outputs[result]<outputs[2]:result=2
        if outputs[result]<outputs[3]:result=3
        if outputs[result]<outputs[4]:result=4
        if outputs[result]<outputs[5]:result=5
        
        if TrainLables.iloc[row,result]==1: correct+=1
        
        Y.append(labels.iloc[row])
        Pred.append(result)
        
        print(result, end=' ')
        
    
print('Correct:', correct, 'out of:', total )
print('Accuracy of the network : ',( 100 * correct / total))

0 2 2 4 4 5 3 3 4 3 5 5 4 4 3 3 3 3 5 3 3 1 4 1 1 1 5 1 4 4 5 1 5 4 1 0 3 5 3 4 2 2 3 3 2 2 4 1 1 2 4 4 5 4 4 4 4 1 1 2 1 4 4 4 4 4 4 4 4 1 4 1 4 4 4 1 4 4 4 4 2 2 2 5 4 5 3 1 5 5 3 5 3 4 1 0 0 0 0 5 2 5 5 1 5 5 5 5 4 5 5 5 5 5 5 5 5 5 5 5 5 3 0 3 4 3 3 3 3 1 5 4 1 3 1 3 3 3 3 3 5 1 4 1 1 4 1 1 1 4 5 4 4 4 5 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 3 3 2 2 2 2 2 2 5 4 4 4 4 0 1 2 0 3 3 2 4 1 1 4 4 2 2 2 3 3 1 3 3 1 2 3 2 3 1 3 0 2 2 2 2 4 4 4 4 1 3 0 0 4 4 3 3 5 2 3 3 2 2 2 2 1 1 2 2 1 2 2 1 1 2 2 2 3 3 3 3 4 4 4 2 2 2 2 2 2 3 1 3 3 3 1 3 1 1 2 3 3 2 2 3 3 3 5 2 4 4 4 3 1 3 3 4 3 4 0 1 5 5 5 5 5 0 3 0 2 3 3 1 4 3 4 1 3 2 4 4 1 3 3 3 4 1 1 2 1 3 0 3 3 5 3 0 3 3 0 1 5 4 4 4 2 2 5 3 3 3 3 3 2 3 3 4 2 1 4 4 1 4 3 4 1 3 2 3 3 0 4 3 3 4 4 3 3 1 3 4 2 4 2 2 4 5 3 3 3 4 1 4 3 4 4 4 1 4 4 4 1 2 3 0 5 5 5 5 4 5 5 1 4 4 4 4 1 4 2 5 5 1 5 5 3 1 1 4 1 4 2 4 5 3 3 3 3 3 3 4 2 3 5 3 3 3 3 3 4 5 5 3 4 4 4 4 5 4 4 4 5 5 3 3 1 5 4 5 3 5 4 2 3 3 4 4 3 4 4 3 3 3 5 3 4 3 5 2 3 3 5 5 5 5 5 3 4 4 3 3 5 4 2 3 5 

 5 5 5 5 3 5 3 3 3 3 3 5 3 3 3 4 5 4 4 5 5 3 5 4 3 4 4 2 4 1 3 4 5 4 5 4 3 3 3 3 5 3 4 4 3 3 3 5 5 3 3 4 5 4 4 5 4 3 4 5 3 5 5 4 5 3 3 4 5 3 3 5 3 5 5 3 3 1 4 4 4 4 1 1 2 2 3 3 3 1 3 3 3 1 1 4 1 5 1 1 1 5 1 4 2 5 5 4 5 1 1 3 3 4 3 2 4 1 5 1 1 1 3 1 4 2 5 4 4 5 4 5 4 4 4 5 3 3 3 5 4 4 3 1 5 5 5 5 5 3 5 1 4 2 5 5 4 4 5 2 1 5 1 3 2 2 2 2 1 1 2 3 4 4 5 5 1 3 1 4 5 5 5 5 3 5 5 1 3 3 1 5 5 4 3 5 3 2 2 2 2 2 2 4 1 4 1 2 2 4 3 4 3 3 3 4 0 4 3 3 3 5 4 1 3 5 3 1 4 4 3 2 5 3 2 2 5 5 2 3 1 2 1 1 2 3 2 5 5 0 2 5 4 5 5 4 2 4 5 1 1 4 4 0 0 0 0 0 4 1 4 3 3 4 3 2 2 3 1 1 2 4 4 4 4 4 2 4 4 4 4 4 4 4 4 4 4 2 4 4 4 4 2 2 2 2 4 4 5 4 4 2 4 2 2 4 4 3 2 3 4 5 2 4 4 2 2 4 4 4 4 4 4 4 4 4 2 4 2 2 2 4 4 1 1 3 1 3 4 4 3 3 4 5 4 3 1 3 4 4 4 4 4 4 4 4 0 4 2 2 3 4 5 5 5 3 3 1 2 3 3 1 1 4 5 5 5 5 0 0 0 3 4 4 4 4 3 5 4 4 4 4 4 4 2 4 4 4 3 3 1 2 3 4 2 3 5 5 5 5 1 3 3 5 0 0 0 3 4 5 1 1 1 3 3 3 4 3 2 2 2 2 2 4 4 4 4 5 1 4 1 4 3 3 5 4 3 3 5 1 1 4 4 5 5 4 4 4 4 4 1 0 2 5 0 1 1 0 4 3 3 2 2 1 5 1 1 5 2 1 1 2 5 2 3 1 3 5 2 1

 5 5 5 3 3 5 5 2 1 1 1 1 1 5 3 1 4 2 1 3 5 3 5 1 4 5 1 5 5 4 1 2 4 3 3 5 4 5 4 2 0 2 1 4 4 4 3 4 4 4 4 4 4 5 3 5 3 5 4 3 3 3 5 3 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 3 5 5 5 5 5 5 5 5 5 1 1 5 4 1 4 4 3 3 1 3 1 4 4 4 2 4 4 3 3 4 5 4 0 3 5 2 1 2 1 3 3 2 3 1 1 5 2 5 3 2 2 4 5 4 4 2 4 2 2 2 2 2 4 2 2 4 3 4 2 4 3 3 4 2 2 4 2 4 4 2 2 2 2 2 2 2 2 2 2 3 2 3 1 1 1 1 1 4 2 3 5 4 1 4 5 4 5 3 1 4 4 4 1 4 1 4 3 1 5 1 4 1 4 4 1 5 5 1 1 4 1 1 1 4 4 5 4 4 4 1 5 4 4 5 1 4 2 5 0 3 3 3 3 3 4 2 4 2 4 3 3 3 3 3 3 5 3 4 1 3 1 1 5 3 5 1 5 3 3 3 3 1 3 1 1 1 1 3 1 5 3 5 1 1 1 2 1 4 1 1 3 2 1 1 1 4 5 3 5 3 5 3 0 2 2 1 0 1 2 1 1 1 0 0 1 1 0 2 0 2 0 0 0 1 0 0 2 1 0 0 1 0 2 0 0 0 2 1 5 3 3 3 4 3 3 3 3 4 3 3 3 3 3 3 1 4 3 4 3 1 3 5 2 5 5 5 5 5 5 4 5 1 4 3 3 3 3 5 3 3 5 1 4 3 5 5 5 5 1 3 0 1 5 5 2 4 1 0 0 1 1 3 3 4 4 4 1 5 5 3 4 1 1 1 1 1 1 1 1 1 1 1 1 1 5 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 5 3 1 1 1 1 5 5 1 1 1 1 1 5 1 4 3 3 2 3 3 5 4 3 3 4 4 4 4 4 3 3 3 2 4 1 1 1 1 4 2 4 2 2 2 4 2 1 1 5 1 4 5 3 3 0 0 4 3 4 1 1 4

In [28]:
# load the validation data

ValidData=pd.read_excel('valid-clean-Reputation.xlsx' )
ValidData=ValidData.iloc[:,:-1].astype(float)
ValidData=ValidData/200

SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/Saves/"
TF_Output=pd.read_csv( SavesDirectory+'evalOut.tsv', sep='\t')

ValidData=pd.concat([ValidData,TF_Output], axis=1)


ValidData=torch.tensor(ValidData.values)
ValidData

tensor([[ 0.0000,  0.0000,  0.0000,  ...,  0.4324, -0.1462, -0.5625],
        [ 0.0050,  0.0000,  0.0100,  ...,  0.2032, -0.3833, -0.3286],
        [ 0.0000,  0.0000,  0.0050,  ...,  0.2717, -0.2191, -0.4355],
        ...,
        [ 0.0000,  0.0000,  0.0150,  ...,  0.3020,  1.0312,  0.9219],
        [ 0.0000,  0.0000,  0.0000,  ...,  0.3862,  0.9517,  1.0928],
        [ 0.0000,  0.0000,  0.0000,  ...,  0.6538,  0.9268,  0.3442]],
       dtype=torch.float64)

In [29]:
labels=pd.read_excel('valid-clean-Reputation.xlsx' )

labels=labels.iloc[:,-1] 
labelsOneHot=pd.get_dummies(labels)
labelsOneHot

Unnamed: 0,0,1,2,3,4,5
0,0,0,0,1,0,0
1,0,0,1,0,0,0
2,0,0,0,1,0,0
3,0,0,0,0,1,0
4,0,0,0,0,0,1
...,...,...,...,...,...,...
1267,0,0,0,0,0,1
1268,0,0,0,1,0,0
1269,0,0,1,0,0,0
1270,0,0,0,0,1,0


In [30]:
ValidLables =torch.tensor(labelsOneHot.values)
ValidLables

tensor([[0, 0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0],
        ...,
        [0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 1, 0],
        [0, 0, 0, 1, 0, 0]], dtype=torch.uint8)

In [31]:
# evaluate the FCNN on the validation set
correct = 0
total = 0

Y=[]  #target
Pred=[]  #predicted

with torch.no_grad():
    for row in range(len(ValidData)):
        outputs = net(ValidData[row,:].float())
        # manual argmax: result ends up as the index of the largest of the six outputs
        result=0
        total+=1
        if outputs[0]<outputs[1]:result=1
        if outputs[result]<outputs[2]:result=2
        if outputs[result]<outputs[3]:result=3
        if outputs[result]<outputs[4]:result=4
        if outputs[result]<outputs[5]:result=5
        
        if labelsOneHot.iloc[row,result]==1: correct+=1
        
        Y.append(labels.iloc[row])
        Pred.append(result)
        
        print(result, end=' ')
        
    
print('Correct:', correct, 'out of:', total )
print('Accuracy of the network : ',( 100 * correct / total))

3 2 3 2 5 4 4 4 5 2 3 3 1 1 1 4 2 2 2 1 2 0 1 2 4 2 3 1 2 5 4 3 3 0 1 3 3 3 2 3 4 4 3 1 3 3 2 5 3 3 4 3 3 3 5 3 3 4 5 1 3 3 3 4 3 3 5 3 3 4 3 3 5 4 4 4 4 3 2 1 4 3 4 3 3 3 3 4 3 1 3 5 3 4 3 5 4 3 5 3 2 5 5 3 3 3 4 3 1 3 1 1 2 4 4 4 4 3 4 2 2 4 3 1 1 5 4 1 1 4 5 4 5 4 3 4 3 2 3 4 3 1 1 1 3 3 2 4 4 0 0 0 0 0 0 0 0 0 0 0 0 4 4 5 1 5 2 5 3 1 5 1 4 4 5 5 1 4 4 5 5 5 3 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 2 2 5 3 1 1 4 5 2 4 5 2 2 1 3 3 3 2 3 5 0 1 1 2 3 3 3 5 3 5 3 5 3 5 5 3 3 3 3 5 5 3 3 3 1 4 4 4 1 4 3 4 4 3 3 1 1 4 5 3 1 1 3 4 2 1 2 2 2 3 2 2 4 5 4 1 2 0 4 2 4 1 3 0 4 2 2 1 4 5 2 0 3 3 3 3 2 2 1 3 2 1 2 4 2 4 4 1 4 3 1 2 4 4 1 1 4 4 5 1 2 0 2 2 5 3 2 3 4 5 3 3 3 2 4 4 2 5 2 1 1 1 1 3 1 1 1 3 1 0 3 0 3 0 4 4 1 1 1 1 1 1 4 4 1 1 1 1 1 4 1 1 1 4 3 1 0 4 4 4 2 0 4 5 4 1 5 4 4 5 3 2 2 2 2 3 2 5 5 1 2 2 1 2 3 0 4 2 5 3 3 4 4 0 0 4 0 0 3 0 5 3 4 2 4 2 1 2 2 2 5 3 5 1 1 5 3 3 1 4 3 2 3 2 5 1 5 3 3 4 5 5 1 1 1 3 3 2 2 2 1 5 1 4 3 1 1 5 1 4 3 1 0 2 5 2 3 1 1 4 4 3 1 4 3 3 0 4 4 4 3 5 1 
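
The chained comparisons in the loop above select the index of the largest output, which is an argmax. As a minimal sketch, the same predictions could be obtained in one batched call with `torch.argmax`. Here `toy_net`, `toy_inputs`, and `toy_targets` are hypothetical stand-ins for `net`, `ValidData`, and the integer labels, and the sketch assumes the network accepts a whole batch of rows at once.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins (not the notebook's network or data)
toy_net = nn.Linear(12, 6)                          # tiny 6-way classifier
toy_inputs = torch.randn(8, 12, dtype=torch.float64)
toy_targets = torch.randint(0, 6, (8,))

with torch.no_grad():
    outputs = toy_net(toy_inputs.float())           # shape (8, 6)
    batch_pred = torch.argmax(outputs, dim=1)       # index of the largest output per row

# Reference: the chained comparison from the notebook, applied row by row
loop_pred = []
for row_out in outputs:
    result = 0
    for k in range(1, 6):
        if row_out[result] < row_out[k]:
            result = k
    loop_pred.append(result)

correct = int((batch_pred == toy_targets).sum())
accuracy = 100 * correct / len(toy_targets)
print('Correct:', correct, 'out of:', len(toy_targets))
print('Accuracy of the network : ', accuracy)
```

Both approaches give the same predictions; the batched version avoids one forward pass per row.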

In [32]:
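# load the test data

TestData=pd.read_excel('test-clean-Reputation.xlsx' )
# drop the last column (the label) and keep the remaining columns as float features
TestData=TestData.iloc[:,:-1].astype(float)
# scale the features down by a constant factor of 200
TestData=TestData/200

# the transformer's saved output vectors for the test set
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/Saves/"
TF_Output=pd.read_csv( SavesDirectory+'testOut.tsv', sep='\t')

# append the transformer outputs as extra feature columns
TestData=pd.concat([TestData,TF_Output], axis=1)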


TestData=torch.tensor(TestData.values)
TestData

tensor([[ 0.0000,  0.0050,  0.0100,  ..., -0.1688, -0.3889, -0.0365],
        [ 0.0000,  0.0050,  0.0100,  ...,  0.4417,  0.5527,  0.3252],
        [ 0.0000,  0.0050,  0.0100,  ...,  0.1344,  0.0543, -0.1962],
        ...,
        [ 0.0000,  0.0000,  0.0050,  ...,  0.5479,  0.1997, -0.2791],
        [ 0.0050,  0.0000,  0.0000,  ...,  0.2302,  0.9248,  0.8755],
        [ 0.0000,  0.0000,  0.0000,  ...,  0.4470,  0.0788, -0.3301]],
       dtype=torch.float64)

In [33]:
labels=pd.read_excel('test-clean-Reputation.xlsx' )

labels=labels.iloc[:,-1] 
labelsOneHot=pd.get_dummies(labels)
labelsOneHot

TestLables =torch.tensor(labelsOneHot.values)
TestLables

tensor([[0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1, 0],
        [0, 0, 1, 0, 0, 0],
        ...,
        [0, 0, 1, 0, 0, 0],
        [1, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0]], dtype=torch.uint8)

In [67]:
correct = 0
total = 0


Y=[]  #target
Pred=[]  #predicted

with torch.no_grad():
    for row in range(len(TestData)):
        outputs = net(TestData[row,:].float())
        # manual argmax: result ends up as the index of the largest of the six outputs
        result=0
        total+=1
        if outputs[0]<outputs[1]:result=1
        if outputs[result]<outputs[2]:result=2
        if outputs[result]<outputs[3]:result=3
        if outputs[result]<outputs[4]:result=4
        if outputs[result]<outputs[5]:result=5
        
        if labelsOneHot.iloc[row,result]==1: correct+=1
        
        Y.append(labels.iloc[row])
        Pred.append(result)
        
        print(result, end=' ')
        
       
print('Correct:', correct, 'out of:', total )
print('Accuracy of the network : ',( 100 * correct / total))

4 4 4 2 2 5 2 4 2 3 5 5 5 3 3 4 1 1 1 2 1 1 4 2 0 5 2 3 0 2 3 2 2 2 1 2 2 3 3 2 3 0 1 2 4 2 1 1 3 1 2 4 1 5 0 5 3 3 3 5 3 3 3 3 4 4 4 3 1 3 4 1 3 3 1 4 1 5 3 3 4 3 5 4 5 3 3 3 4 3 4 3 3 4 3 3 3 4 4 1 3 3 5 5 4 4 3 5 4 3 3 4 4 1 4 5 4 2 3 4 3 3 1 5 1 4 4 4 4 4 3 4 4 3 1 3 3 2 2 3 0 2 4 3 1 1 3 4 4 4 4 4 4 4 4 4 2 1 0 0 0 0 0 1 1 0 0 3 3 3 3 5 3 5 5 1 5 2 4 3 5 0 0 3 4 0 5 5 1 5 1 1 0 4 4 3 3 5 4 0 0 1 0 1 0 0 0 0 0 0 0 0 1 2 3 4 0 3 3 5 5 5 5 3 2 3 5 4 4 1 3 3 5 5 5 4 3 3 5 3 5 3 1 3 3 5 4 3 2 3 1 4 5 0 0 4 3 3 5 5 1 5 1 3 4 3 4 2 0 4 4 2 2 2 1 1 1 4 4 2 3 4 3 4 4 1 1 4 2 5 5 3 3 3 3 1 5 4 2 4 0 5 5 5 1 5 4 1 4 3 3 0 2 2 2 0 2 2 3 4 3 3 3 3 2 2 2 3 5 3 5 4 4 4 2 3 1 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 4 4 1 0 1 3 2 1 1 3 1 1 4 1 1 4 4 1 1 4 4 4 1 3 3 2 1 2 1 5 2 0 3 3 1 1 3 5 3 2 2 2 2 5 2 2 2 3 2 3 3 3 1 2 4 1 3 4 0 3 3 4 0 3 0 3 0 3 3 5 3 4 1 2 3 1 2 5 1 1 2 3 2 4 2 4 4 4 5 1 1 5 4 3 3 5 5 4 5 5 3 1 2 2 5 4 4 4 2 0 4 3 3 1 0 1 1 4 5 3 1 1 5 5 2 2 2 5 3 2 5 5 2 3 3 5 3 3 2 4 1 5 1 1 1 4 3 

In [68]:
from sklearn import metrics 
print(metrics.confusion_matrix(Y,Pred))

[[ 49  17   4  14   5   2]
 [  6 130  29  29  26  13]
 [ 13  36  76  45  28  23]
 [  2  35  24 126  45  23]
 [  1  21  21  58 126  22]
 [  2  23  13  34  41  93]]
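
The overall accuracy can be read directly off the confusion matrix: the diagonal holds the correctly classified statements and the sum of all entries is the test set size. A small sketch using the values printed above (rows are true classes, columns are predicted classes, as in scikit-learn); the variable names are illustrative only.

```python
# Confusion matrix copied from the output above
cm = [
    [49,  17,  4,  14,   5,  2],
    [ 6, 130, 29,  29,  26, 13],
    [13,  36, 76,  45,  28, 23],
    [ 2,  35, 24, 126,  45, 23],
    [ 1,  21, 21,  58, 126, 22],
    [ 2,  23, 13,  34,  41, 93],
]

correct = sum(cm[i][i] for i in range(len(cm)))   # diagonal = correct predictions
total = sum(sum(row) for row in cm)               # all test statements
accuracy = 100 * correct / total

print('Correct:', correct, 'out of:', total)      # Correct: 600 out of: 1255
print('Accuracy:', round(accuracy, 2))            # Accuracy: 47.81
```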


In [69]:

target_names = ['Pants', 'False', 'Barely-True','Half-True','Mostly-True','True']

print(metrics.classification_report(Y, Pred,target_names =target_names))


              precision    recall  f1-score   support

       Pants       0.67      0.54      0.60        91
       False       0.50      0.56      0.53       233
 Barely-True       0.46      0.34      0.39       221
   Half-True       0.41      0.49      0.45       255
 Mostly-True       0.46      0.51      0.48       249
        True       0.53      0.45      0.49       206

    accuracy                           0.48      1255
   macro avg       0.50      0.48      0.49      1255
weighted avg       0.48      0.48      0.48      1255
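
The per-class figures in this report follow from the confusion matrix of the previous cell: precision divides each diagonal entry by its column sum (everything predicted as that class), and recall divides it by its row sum (everything that truly belongs to that class). A short sketch with illustrative variable names that reproduces the rounded values:

```python
# Confusion matrix from the previous cell (rows: true class, columns: predicted class)
cm = [
    [49,  17,  4,  14,   5,  2],
    [ 6, 130, 29,  29,  26, 13],
    [13,  36, 76,  45,  28, 23],
    [ 2,  35, 24, 126,  45, 23],
    [ 1,  21, 21,  58, 126, 22],
    [ 2,  23, 13,  34,  41, 93],
]
target_names = ['Pants', 'False', 'Barely-True', 'Half-True', 'Mostly-True', 'True']

n = len(cm)
row_sums = [sum(cm[i]) for i in range(n)]                       # support per true class
col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # count per predicted class

precision = [round(cm[k][k] / col_sums[k], 2) for k in range(n)]
recall = [round(cm[k][k] / row_sums[k], 2) for k in range(n)]

for name, p, r, s in zip(target_names, precision, recall, row_sums):
    print(f'{name:>12}  precision={p:.2f}  recall={r:.2f}  support={s}')
```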



In [61]:
#save the FCNN model

stage='NNetwork6WayClassification/'
SavesDirectory='./TunedModels/'+model_class+'/'+model_version+"/"+stage
# PATH must already be defined when this cell runs; uncomment the next line to set it here
#PATH = SavesDirectory+'Tanh_MSE_adam4781.pth'

torch.save(net.state_dict(), PATH)

# more on saving pytorch networks: https://pytorch.org/docs/stable/notes/serialization.html
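
A saved `state_dict` holds only the parameters, so restoring it requires first building a network of the same architecture and then calling `load_state_dict`. The following is a minimal sketch of that round trip; `TinyNet` and `demo_path` are hypothetical and do not reflect the notebook's actual FCNN or file names.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the notebook's FCNN."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(12, 6)

    def forward(self, x):
        return self.fc(x)


net_demo = TinyNet()
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.pth')  # hypothetical location

# Save only the parameters, as in the cell above
torch.save(net_demo.state_dict(), demo_path)

# Restore: build the same architecture, then load the saved parameters into it
restored = TinyNet()
restored.load_state_dict(torch.load(demo_path, map_location='cpu'))
restored.eval()  # switch to inference mode before evaluating
```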