# TRAINING TEXT CLASSIFIERS WITH SPACY

In this lab we will train different text classifiers with spaCy.

1. Read through the code and try to add more inline documentation as you work out what each part does.

2. We will adapt the code to train two different fake news classifiers: one on general fake news from 6 different domains, and another on celebrity news, where there are legitimate stories but also stories that are false gossip.



In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
%cd /content/drive/MyDrive/LAP/Subjects/AP1/labs

/content/drive/MyDrive/LAP/Subjects/AP1/labs


## Load language modules

In [3]:
# We will be using spacy v2, so no need to upgrade to v3

In [4]:
# TODO install and test the language models of your choice following https://spacy.io/usage
!python -m spacy download en_core_web_sm
!python -m spacy download en_core_web_md
!python -m spacy download en_core_web_lg

Collecting en_core_web_sm==2.2.5
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz (12.0 MB)
✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_sm')
Collecting en_core_web_md==2.2.5
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_md-2.2.5/en_core_web_md-2.2.5.tar.gz (96.4 MB)
Building wheels for collected packages: en-core-web-md
  Building wheel for en-core-web-md (setup.py) ... done
  Created wheel for en-core-web-md: filename=en_core_web_md-2.2.5-py3-none-any.whl size=98051301 sha256=3c9a84434bccf0a0869ff8128636c78c9f7606434ebfa0aca57490f886bf7e0e
  Stored in directory: /tmp/pip-ephem-wheel-cache-f6jzqwb4/wheels/69/c5/b8/4f1c029d89238734311b3269762ab2ee325a42da2ce8edb997
Successfully bu

In [5]:
import en_core_web_sm
import en_core_web_md
import en_core_web_lg

## Load stance data

In [6]:
import spacy
import csv
import random
import time
import numpy as np
import pandas as pd
import re
import string

from spacy.util import minibatch, compounding
import sys
from spacy import displacy
from itertools import chain

from sklearn.metrics import classification_report

# TODO add inline documentation describing the functionality of each function
# load data
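def load_data(fnames):
    """Read one or more tab-separated files and concatenate them.

    Returns the combined DataFrame and the list of distinct 'Target' values.
    """
    data = []
    for fname in fnames:
        data.append(pd.read_csv(fname, sep='\t', encoding='utf-8'))
    data = pd.concat(data)
    targets = set(data['Target'])
    return data, list(targets)

# pre-process tweets
def cleanup(tweet):
    """Strip the '#' and '@' symbols (the tag and user names stay),
    turn newlines and tabs into spaces, and remove URLs."""
    tweet = re.sub(r"http\S+", "", tweet.replace("#", "").replace("@", "").replace('\n', ' ').replace('\t', ' '))
    return tweet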
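
Note that `cleanup` only strips the `#` and `@` characters, so the hashtag and user names themselves remain in the text, while URLs are removed entirely. A minimal sketch on a made-up tweet (the example string is hypothetical, not taken from the dataset):

```python
import re

def cleanup(tweet):
    """Same logic as the lab's cleanup function, repeated so the example is self-contained."""
    tweet = tweet.replace("#", "").replace("@", "").replace("\n", " ").replace("\t", " ")
    return re.sub(r"http\S+", "", tweet)

# Hypothetical tweet, made up for illustration.
sample = "@user check #tag http://t.co/abc now"
cleaned = cleanup(sample)
print(repr(cleaned))
```

The removed URL leaves a double space behind, which may be worth collapsing if exact whitespace matters for a later step.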

In [7]:
# data path. trial data used as training too.
folder = "stance-semeval2016"
labels = ['AGAINST', 'FAVOR', 'NONE']
trial_file = f"../datasets/{folder}/semeval2016-task6-trialdata.utf-8.txt"
train_file = f"../datasets/{folder}/semeval2016-task6-trainingdata.utf-8.txt"
test_file = f"../datasets/{folder}/SemEval2016-Task6-subtaskA-testdata-gold.txt"

training_data, targets = load_data([trial_file, train_file])
training_data['Clean_tweet'] = training_data['Tweet'].apply(cleanup)

test_data, _ = load_data([test_file])
test_data['Clean_tweet'] = test_data['Tweet'].apply(cleanup)
display(training_data)

Unnamed: 0,ID,Target,Tweet,Stance,Clean_tweet
0,1,Hillary Clinton,"@tedcruz And, #HandOverTheServer she wiped cle...",AGAINST,"tedcruz And, HandOverTheServer she wiped clean..."
1,2,Hillary Clinton,Hillary is our best choice if we truly want to...,FAVOR,Hillary is our best choice if we truly want to...
2,3,Hillary Clinton,@TheView I think our country is ready for a fe...,AGAINST,TheView I think our country is ready for a fem...
3,4,Hillary Clinton,I just gave an unhealthy amount of my hard-ear...,AGAINST,I just gave an unhealthy amount of my hard-ear...
4,5,Hillary Clinton,@PortiaABoulger Thank you for adding me to you...,NONE,PortiaABoulger Thank you for adding me to your...
...,...,...,...,...,...
2809,2910,Legalization of Abortion,"There's a law protecting unborn eagles, but no...",AGAINST,"There's a law protecting unborn eagles, but no..."
2810,2911,Legalization of Abortion,I am 1 in 3... I have had an abortion #Abortio...,AGAINST,I am 1 in 3... I have had an abortion Abortion...
2811,2912,Legalization of Abortion,How dare you say my sexual preference is a cho...,AGAINST,How dare you say my sexual preference is a cho...
2812,2913,Legalization of Abortion,"Equal rights for those 'born that way', no rig...",AGAINST,"Equal rights for those 'born that way', no rig..."


In [8]:
for target in targets:
  training_data[training_data['Target'] == target][['Stance', 'Clean_tweet']].to_csv(f"../datasets/{folder}/train.{target}.tsv",
          sep="\t", index=False, quoting=csv.QUOTE_NONE, quotechar="", escapechar="")
  test_data[test_data['Target'] == target][['Stance', 'Clean_tweet']].to_csv(f"../datasets/{folder}/test.{target}.tsv",
          sep="\t", index=False, quoting=csv.QUOTE_NONE, quotechar="", escapechar="")

## Load target data

In [9]:
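def load_data_spacy(fname):
    """Read a per-target TSV file and convert it to spaCy's textcat training format.

    Returns a list of (text, {"cats": {...}}) pairs, plus the raw texts and the raw stance labels.
    """
    training_data = pd.read_csv(fname, sep='\t', encoding='utf-8')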
    #train_data.dropna(axis = 0, how ='any',inplace=True)
    #train_data['Num_words_text'] = train_data['text'].apply(lambda x:len(str(x).split())) 
    #mask = train_data['Num_words_text'] >2
    #train_data = train_data[mask]
    print(training_data['Stance'].value_counts())
    
    train_texts = training_data['Clean_tweet'].tolist()
    train_cats = training_data['Stance'].tolist()
    final_train_cats=[]
    for cat in train_cats:
        cat_list = {}
        if cat == 'AGAINST':
            cat_list['AGAINST'] =  1
            cat_list['FAVOR'] =  0
            cat_list['NONE'] =  0
        elif cat == 'FAVOR':
            cat_list['AGAINST'] =  0
            cat_list['FAVOR'] =  1
            cat_list['NONE'] =  0
        else:
            cat_list['AGAINST'] =  0
            cat_list['FAVOR'] =  0
            cat_list['NONE'] =  1
        final_train_cats.append(cat_list)
        
    train_data = list(zip(train_texts, [{"cats": cats} for cats in final_train_cats]))
    return train_data, train_texts, train_cats
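
The `if`/`elif`/`else` chain above builds a one-hot dictionary per tweet, which is the shape spaCy's text categorizer expects under the `"cats"` key. The same conversion could be sketched more compactly as follows; the helper name `to_cats` and its `fallback` parameter are hypothetical, and the fallback mimics the original `else` branch, which maps anything that is not AGAINST or FAVOR to NONE:

```python
LABELS = ["AGAINST", "FAVOR", "NONE"]

def to_cats(stance, labels=LABELS, fallback="NONE"):
    """Return a one-hot dict over `labels`; unknown stances fall back to `fallback`."""
    if stance not in labels:
        stance = fallback
    return {label: int(label == stance) for label in labels}

# One training example in the (text, annotations) format used by the lab.
example = ("Some cleaned tweet text", {"cats": to_cats("FAVOR")})
print(example)
```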

In [10]:
target = "Feminist Movement"
training_data, train_texts, train_cats = load_data_spacy(f"../datasets/{folder}/train.{target}.tsv")
print(training_data[:10])
print(len(training_data))
test_data, test_texts, test_cats = load_data_spacy(f"../datasets/{folder}/test.{target}.tsv")
print(len(test_data))

AGAINST    328
FAVOR      210
NONE       126
Name: Stance, dtype: int64
[('Always a delight to see chest-drumming alpha males hiss and scuttle backwards up the wall when a feminist enters the room. manly SemST', {'cats': {'AGAINST': 0, 'FAVOR': 1, 'NONE': 0}}), ("Sometimes I overheat and want to take off my shirt but can't because of social expectations of people with breasts. ;n; SemST", {'cats': {'AGAINST': 0, 'FAVOR': 1, 'NONE': 0}}), ('If feminists spent 1/2 as much time reading papers as they do tumblr they would be real people, not ignorant sexist bigots. SemST', {'cats': {'AGAINST': 1, 'FAVOR': 0, 'NONE': 0}}), ('Stupid Feminists, the civilization you take for granted was built with the labour, blood sweat and tears of men. SemST', {'cats': {'AGAINST': 1, 'FAVOR': 0, 'NONE': 0}}), ("YOU'RE A GIRL AND HAVE A SEX DRIVE!? YOU MUST BE A SLUT! feminist SemST", {'cats': {'AGAINST': 0, 'FAVOR': 1, 'NONE': 0}}), ("Suns out....  Dresses out...  StreetHarassment out...  This shouldn't be 

## Train

In [11]:
def Sort(sub_li):
    # Sort (label, score) pairs by their second element (the score),
    # in descending order (reverse=True), so the best label comes first.
    return sorted(sub_li, key=lambda x: x[1], reverse=True)

# run the predictions on each sentence in the evaluation  dataset, and return the metrics
def evaluate(tokenizer, textcat, test_texts, test_cats, labels):
    docs = (tokenizer(text) for text in test_texts)
    preds = []
    for i, doc in enumerate(textcat.pipe(docs)):
        #print(doc.cats.items())
        scores = Sort(doc.cats.items())
        #print(scores)
        catList=[]
        for score in scores:
            catList.append(score[0])
        preds.append(catList[0])
    print(classification_report(test_cats, preds, labels=labels)) 
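
When `evaluate` receives only a subset of the labels (the training loop passes `labels[:2]`, i.e. AGAINST and FAVOR), `classification_report` leaves NONE out of the table and prints a `micro avg` row in which precision and recall can differ. A rough pure-Python sketch of that arithmetic on made-up labels; the helper name `micro_precision_recall` is hypothetical and this is not scikit-learn's own code:

```python
def micro_precision_recall(gold, pred, labels):
    """Micro-averaged precision and recall restricted to `labels`."""
    labels = set(labels)
    # True positives: correct predictions whose label is one of the reported labels.
    tp = sum(1 for g, p in zip(gold, pred) if g == p and g in labels)
    n_pred = sum(1 for p in pred if p in labels)  # predictions counted for precision
    n_gold = sum(1 for g in gold if g in labels)  # gold items counted for recall
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    return precision, recall

# Made-up data: the model predicts AGAINST for everything.
gold = ["AGAINST", "AGAINST", "FAVOR", "NONE"]
pred = ["AGAINST", "AGAINST", "AGAINST", "AGAINST"]
precision, recall = micro_precision_recall(gold, pred, ["AGAINST", "FAVOR"])
print(precision, recall)
```

The NONE tweet still counts as a wrong AGAINST prediction in the precision denominator, but it is not part of the recall denominator, which is why the two numbers diverge.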

In [12]:
def train_spacy(train_data, iterations, test_texts, test_cats, model_arch, dropout = 0.3, model=en_core_web_sm, init_tok2vec=None):
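    ''' Train a spaCy text classifier (textcat), evaluating on the test data after every iteration

    train_data : training data in the format of (text, {"cats": {"AGAINST": 0|1, "FAVOR": 0|1, "NONE": 0|1}})
    iterations : number of training iterations
    test_texts, test_cats : texts and gold labels used for evaluation
    model_arch : textcat architecture name, e.g. "bow" or "simple_cnn"
    dropout : dropout proportion for training
    model : imported spaCy model package used as the base pipeline
    init_tok2vec : optional path to pretrained tok2vec weights
    Note: labels, target and folder are read from the notebook's global scope.
    '''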
    
    nlp = model.load()

    # add the text classifier to the pipeline if it doesn't exist
    # nlp.create_pipe works for built-ins that are registered with spaCy
    if "textcat" not in nlp.pipe_names:
        textcat = nlp.create_pipe(
            "textcat", config={"exclusive_classes": True, "architecture": model_arch}
        )
        nlp.add_pipe(textcat, last=True)
        
    # otherwise, get it, so we can add labels to it
    else:
        textcat = nlp.get_pipe("textcat")

    # add label to text classifier
    for label in labels:
        textcat.add_label(label)

    # get names of other pipes to disable them during training
    pipe_exceptions = ["textcat", "trf_wordpiecer", "trf_tok2vec"]
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe not in pipe_exceptions]
    with nlp.disable_pipes(*other_pipes):  # only train textcat
        optimizer = nlp.begin_training()
        if init_tok2vec is not None:
            with init_tok2vec.open("rb") as file_:
                textcat.model.tok2vec.from_bytes(file_.read())
        print("Training the model...")
        print("{:^5}\t{:^5}\t{:^5}\t{:^5}".format("LOSS", "P", "R", "F"))
        batch_sizes = compounding(16.0, 64.0, 1.5)
        for i in range(iterations):
            print('Iteration: '+str(i))
            start_time = time.perf_counter()  # time.clock() was removed in Python 3.8
            losses = {}
            # batch up the examples using spaCy's minibatch
            random.shuffle(train_data)
            batches = minibatch(train_data, size=batch_sizes)
            for batch in batches:
                texts, annotations = zip(*batch)
                nlp.update(texts, annotations, sgd=optimizer, drop=dropout, losses=losses)
            with textcat.model.use_params(optimizer.averages):
                # evaluate on the test data; labels[:2] reports only AGAINST and FAVOR
                evaluate(nlp.tokenizer, textcat, test_texts, test_cats, labels[:2])
            print ('Elapsed time'+str(time.perf_counter() - start_time)+  "seconds")
        with nlp.use_params(optimizer.averages):
            model_name = model_arch + "_" + target
            filepath = "../resources/" + folder + "/" + model_name 
            nlp.to_disk(filepath)
    return nlp
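
The `compounding(16.0, 64.0, 1.5)` call in `train_spacy` produces a batch-size schedule that starts at 16 and is multiplied by 1.5 for each new batch until it is capped at 64. Because `batch_sizes` is created once before the loop, the schedule does not restart at each iteration. The generator below is a small re-implementation for illustration only, assumed to mirror the behaviour of `spacy.util.compounding` in spaCy v2; the name `compounding_sketch` is hypothetical:

```python
from itertools import islice

def compounding_sketch(start, stop, compound):
    """Yield start, start*compound, start*compound**2, ... clipped at stop."""
    curr = float(start)
    while True:
        # Clip towards `stop`, whichever direction the series is moving in.
        yield max(curr, stop) if start > stop else min(curr, stop)
        curr *= compound

first_sizes = list(islice(compounding_sketch(16.0, 64.0, 1.5), 6))
print(first_sizes)
```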

In [13]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "bow")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

     AGAINST       0.64      0.99      0.78       183
       FAVOR       0.50      0.02      0.03        58

   micro avg       0.64      0.76      0.70       241
   macro avg       0.57      0.51      0.41       241
weighted avg       0.61      0.76      0.60       241

Elapsed time0.9187440000000002seconds
Iteration: 1




              precision    recall  f1-score   support

     AGAINST       0.64      0.99      0.78       183
       FAVOR       0.50      0.02      0.03        58

   micro avg       0.64      0.76      0.70       241
   macro avg       0.57      0.51      0.41       241
weighted avg       0.61      0.76      0.60       241

Elapsed time0.44260399999999933seconds
Iteration: 2




              precision    recall  f1-score   support

     AGAINST       0.64      0.99      0.78       183
       FAVOR       0.50      0.02      0.03        58

   micro avg       0.64      0.76      0.70       241
   macro avg       0.57      0.51      0.41       241
weighted avg       0.61      0.76      0.60       241

Elapsed time0.43525099999999917seconds
Iteration: 3




              precision    recall  f1-score   support

     AGAINST       0.65      0.98      0.78       183
       FAVOR       0.33      0.03      0.06        58

   micro avg       0.64      0.76      0.69       241
   macro avg       0.49      0.51      0.42       241
weighted avg       0.57      0.76      0.61       241

Elapsed time0.42650000000000077seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.65      0.97      0.78       183
       FAVOR       0.38      0.09      0.14        58

   micro avg       0.64      0.76      0.70       241
   macro avg       0.52      0.53      0.46       241
weighted avg       0.59      0.76      0.63       241

Elapsed time0.44603399999999915seconds
Iteration: 5




              precision    recall  f1-score   support

     AGAINST       0.66      0.95      0.78       183
       FAVOR       0.38      0.14      0.20        58

   micro avg       0.64      0.76      0.69       241
   macro avg       0.52      0.54      0.49       241
weighted avg       0.59      0.76      0.64       241

Elapsed time0.4291900000000002seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.65      0.91      0.76       183
       FAVOR       0.31      0.16      0.21        58

   micro avg       0.62      0.73      0.67       241
   macro avg       0.48      0.53      0.48       241
weighted avg       0.57      0.73      0.63       241

Elapsed time0.4455809999999989seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.67      0.90      0.77       183
       FAVOR       0.33      0.22      0.27        58

   micro avg       0.62      0.73      0.67       241
   macro avg       0.50      0.56      0.52       241
weighted avg       0.59      0.73      0.65       241

Elapsed time0.4437219999999993seconds
Iteration: 8




              precision    recall  f1-score   support

     AGAINST       0.68      0.89      0.77       183
       FAVOR       0.35      0.28      0.31        58

   micro avg       0.62      0.74      0.68       241
   macro avg       0.51      0.58      0.54       241
weighted avg       0.60      0.74      0.66       241

Elapsed time0.42777000000000065seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.69      0.88      0.77       183
       FAVOR       0.35      0.31      0.33        58

   micro avg       0.63      0.74      0.68       241
   macro avg       0.52      0.60      0.55       241
weighted avg       0.61      0.74      0.67       241

Elapsed time0.44025099999999995seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.70      0.87      0.77       183
       FAVOR       0.37      0.36      0.37        58

   micro avg       0.63      0.75      0.68       241
   macro avg       0.53      0.62      0.57       241
weighted avg       0.62      0.75      0.68       241

Elapsed time0.4404609999999991seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.70      0.86      0.77       183
       FAVOR       0.38      0.38      0.38        58

   micro avg       0.64      0.75      0.69       241
   macro avg       0.54      0.62      0.58       241
weighted avg       0.62      0.75      0.68       241

Elapsed time0.44039399999999773seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.70      0.86      0.77       183
       FAVOR       0.37      0.38      0.38        58

   micro avg       0.63      0.74      0.68       241
   macro avg       0.54      0.62      0.57       241
weighted avg       0.62      0.74      0.68       241

Elapsed time0.4430170000000011seconds
Iteration: 13




              precision    recall  f1-score   support

     AGAINST       0.71      0.86      0.78       183
       FAVOR       0.39      0.41      0.40        58

   micro avg       0.64      0.75      0.69       241
   macro avg       0.55      0.64      0.59       241
weighted avg       0.63      0.75      0.69       241

Elapsed time0.444110000000002seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.71      0.84      0.77       183
       FAVOR       0.40      0.47      0.43        58

   micro avg       0.64      0.75      0.69       241
   macro avg       0.55      0.65      0.60       241
weighted avg       0.64      0.75      0.69       241

Elapsed time0.45056200000000146seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.71      0.81      0.76       183
       FAVOR       0.38      0.48      0.43        58

   micro avg       0.63      0.73      0.68       241
   macro avg       0.55      0.65      0.59       241
weighted avg       0.63      0.73      0.68       241

Elapsed time0.5516550000000002seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.72      0.80      0.76       183
       FAVOR       0.38      0.50      0.43        58

   micro avg       0.62      0.73      0.67       241
   macro avg       0.55      0.65      0.59       241
weighted avg       0.64      0.73      0.68       241

Elapsed time0.45568400000000153seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.73      0.79      0.76       183
       FAVOR       0.37      0.53      0.44        58

   micro avg       0.62      0.73      0.67       241
   macro avg       0.55      0.66      0.60       241
weighted avg       0.64      0.73      0.68       241

Elapsed time0.4543539999999986seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.73      0.77      0.74       183
       FAVOR       0.36      0.53      0.43        58

   micro avg       0.61      0.71      0.66       241
   macro avg       0.54      0.65      0.59       241
weighted avg       0.64      0.71      0.67       241

Elapsed time0.43504000000000076seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.72      0.76      0.74       183
       FAVOR       0.36      0.53      0.43        58

   micro avg       0.61      0.71      0.65       241
   macro avg       0.54      0.65      0.58       241
weighted avg       0.64      0.71      0.67       241

Elapsed time0.44492199999999826seconds




In [14]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

     AGAINST       0.64      1.00      0.78       183
       FAVOR       0.00      0.00      0.00        58

   micro avg       0.64      0.76      0.70       241
   macro avg       0.32      0.50      0.39       241
weighted avg       0.49      0.76      0.59       241

Elapsed time2.0073100000000004seconds
Iteration: 1
              precision    recall  f1-score   support

     AGAINST       0.64      1.00      0.78       183
       FAVOR       0.00      0.00      0.00        58

   micro avg       0.64      0.76      0.70       241
   macro avg       0.32      0.50      0.39       241
weighted avg       0.49      0.76      0.60       241

Elapsed time1.635318999999999seconds
Iteration: 2




              precision    recall  f1-score   support

     AGAINST       0.67      0.91      0.77       183
       FAVOR       0.18      0.10      0.13        58

   micro avg       0.61      0.71      0.66       241
   macro avg       0.43      0.51      0.45       241
weighted avg       0.55      0.71      0.62       241

Elapsed time1.5722519999999989seconds
Iteration: 3




              precision    recall  f1-score   support

     AGAINST       0.70      0.73      0.72       183
       FAVOR       0.27      0.40      0.32        58

   micro avg       0.57      0.65      0.61       241
   macro avg       0.49      0.56      0.52       241
weighted avg       0.60      0.65      0.62       241

Elapsed time1.5843260000000008seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.71      0.72      0.71       183
       FAVOR       0.26      0.38      0.31        58

   micro avg       0.56      0.64      0.60       241
   macro avg       0.48      0.55      0.51       241
weighted avg       0.60      0.64      0.62       241

Elapsed time1.5769990000000007seconds
Iteration: 5




              precision    recall  f1-score   support

     AGAINST       0.73      0.69      0.71       183
       FAVOR       0.28      0.48      0.36        58

   micro avg       0.57      0.64      0.60       241
   macro avg       0.50      0.59      0.53       241
weighted avg       0.62      0.64      0.62       241

Elapsed time1.6183300000000003seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.74      0.67      0.70       183
       FAVOR       0.31      0.53      0.39        58

   micro avg       0.58      0.63      0.60       241
   macro avg       0.53      0.60      0.55       241
weighted avg       0.64      0.63      0.63       241

Elapsed time1.576594seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.72      0.70      0.71       183
       FAVOR       0.31      0.41      0.35        58

   micro avg       0.60      0.63      0.62       241
   macro avg       0.52      0.56      0.53       241
weighted avg       0.62      0.63      0.63       241

Elapsed time1.5915430000000015seconds
Iteration: 8




              precision    recall  f1-score   support

     AGAINST       0.76      0.63      0.69       183
       FAVOR       0.31      0.59      0.40        58

   micro avg       0.57      0.62      0.59       241
   macro avg       0.53      0.61      0.55       241
weighted avg       0.65      0.62      0.62       241

Elapsed time1.5916509999999988seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.77      0.61      0.68       183
       FAVOR       0.32      0.66      0.43        58

   micro avg       0.57      0.62      0.60       241
   macro avg       0.55      0.63      0.56       241
weighted avg       0.66      0.62      0.62       241

Elapsed time1.5863239999999976seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.74      0.64      0.68       183
       FAVOR       0.30      0.55      0.39        58

   micro avg       0.56      0.62      0.59       241
   macro avg       0.52      0.60      0.54       241
weighted avg       0.63      0.62      0.61       241

Elapsed time1.5669930000000036seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.72      0.65      0.68       183
       FAVOR       0.31      0.55      0.40        58

   micro avg       0.56      0.63      0.59       241
   macro avg       0.52      0.60      0.54       241
weighted avg       0.62      0.63      0.61       241

Elapsed time1.5696849999999998seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.73      0.64      0.68       183
       FAVOR       0.34      0.55      0.42        58

   micro avg       0.59      0.62      0.60       241
   macro avg       0.53      0.60      0.55       241
weighted avg       0.64      0.62      0.62       241

Elapsed time1.5751050000000006seconds
Iteration: 13




              precision    recall  f1-score   support

     AGAINST       0.73      0.63      0.68       183
       FAVOR       0.32      0.57      0.41        58

   micro avg       0.57      0.62      0.59       241
   macro avg       0.53      0.60      0.55       241
weighted avg       0.63      0.62      0.61       241

Elapsed time1.5755070000000018seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.76      0.62      0.68       183
       FAVOR       0.35      0.66      0.45        58

   micro avg       0.58      0.63      0.60       241
   macro avg       0.55      0.64      0.57       241
weighted avg       0.66      0.63      0.63       241

Elapsed time1.6023420000000002seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.74      0.63      0.68       183
       FAVOR       0.33      0.60      0.43        58

   micro avg       0.58      0.63      0.60       241
   macro avg       0.54      0.62      0.56       241
weighted avg       0.64      0.63      0.62       241

Elapsed time1.5757569999999959seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.75      0.61      0.67       183
       FAVOR       0.34      0.62      0.44        58

   micro avg       0.58      0.61      0.59       241
   macro avg       0.54      0.61      0.55       241
weighted avg       0.65      0.61      0.61       241

Elapsed time1.5704610000000017seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.74      0.60      0.66       183
       FAVOR       0.34      0.62      0.44        58

   micro avg       0.57      0.61      0.59       241
   macro avg       0.54      0.61      0.55       241
weighted avg       0.64      0.61      0.61       241

Elapsed time1.5694250000000025seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.74      0.61      0.66       183
       FAVOR       0.34      0.60      0.43        58

   micro avg       0.57      0.61      0.59       241
   macro avg       0.54      0.61      0.55       241
weighted avg       0.64      0.61      0.61       241

Elapsed time1.566676000000001seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.74      0.61      0.67       183
       FAVOR       0.34      0.62      0.44        58

   micro avg       0.58      0.61      0.59       241
   macro avg       0.54      0.62      0.55       241
weighted avg       0.64      0.61      0.61       241

Elapsed time1.5603999999999942seconds




## Test

In [15]:
textcat_bow = spacy.load(f"../resources/{folder}/bow_{target}")
tweets = textcat_bow(test_texts[10])
print("Text: "+ test_texts[10])
print("Gold Label:"+ test_cats[10])
print("Predicted Label:") 
print(tweets.cats)

Text: sometiimes you just feel like punching a feminist in the face SemST
Gold Label:AGAINST
Predicted Label:
{'AGAINST': 0.3934006094932556, 'FAVOR': 0.3548700511455536, 'NONE': 0.2517293393611908}


In [16]:
textcat_simple_cnn = spacy.load(f"../resources/{folder}/simple_cnn_{target}")
tweets = textcat_simple_cnn(test_texts[10])
print("Text: "+ test_texts[10])
print("Gold Label:"+ test_cats[10])
print("Predicted Label:") 
print(tweets.cats)

Text: sometiimes you just feel like punching a feminist in the face SemST
Gold Label:AGAINST
Predicted Label:
{'AGAINST': 0.014153940603137016, 'FAVOR': 0.9340554475784302, 'NONE': 0.05179070308804512}
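
`doc.cats` holds one score per label, so the predicted label is simply the one with the highest score. A short sketch using the two score dictionaries printed above (the helper `top_label` is a hypothetical name):

```python
def top_label(cats):
    """Return the label with the highest score in a doc.cats-style dict."""
    return max(cats, key=cats.get)

# Scores copied from the two outputs above for the same test tweet.
bow_cats = {'AGAINST': 0.3934006094932556, 'FAVOR': 0.3548700511455536, 'NONE': 0.2517293393611908}
cnn_cats = {'AGAINST': 0.014153940603137016, 'FAVOR': 0.9340554475784302, 'NONE': 0.05179070308804512}
gold = "AGAINST"

for name, cats in [("bow", bow_cats), ("simple_cnn", cnn_cats)]:
    pred = top_label(cats)
    print(name, pred, "correct" if pred == gold else "wrong")
```

On this tweet the bag-of-words model picks the gold label, though by a narrow margin, while the CNN assigns a high score to the wrong label.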


# ASSIGNMENTS

1. TODO Train the classifiers for the other 4 targets in the Stance SemEval 2016 dataset.

2. TODO Reuse the above code to train a new classifier for fake news using the celebrity and the fake news datasets: 

  Data: "/content/drive/My Drive/Colab Notebooks/2022-ILTAPP/datasets/fake_rada"

  2.1 HINT: You need to (i) load the data into a pandas dataframe; (ii) modify the labels from the converter and training functions.

  2.2 HINT: Once you have a pandas dataframe, it is easy to split the data into 80% for training and 20% for testing.

3. TODO Try the different spacy language models to see the difference in performance.
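
For hint 2.2, one possible way to get an 80/20 split from a dataframe is to sample 80% of the rows and drop them from the original to obtain the rest. This is only a sketch on toy data: the column names `text` and `label` and the label values are assumptions, since the layout of the fake news files is not shown here.

```python
import pandas as pd

# Hypothetical toy data; the real column names in the fake news files may differ.
df = pd.DataFrame({
    "text": [f"article {i}" for i in range(10)],
    "label": ["fake", "legit"] * 5,
})

train_df = df.sample(frac=0.8, random_state=42)  # 80% of the rows, shuffled
test_df = df.drop(train_df.index)                # the remaining 20%

print(len(train_df), len(test_df))
```

Fixing `random_state` keeps the split reproducible between runs, which makes results from the different language models easier to compare.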

## Atheism

In [17]:
target = "Atheism"
training_data, train_texts, train_cats = load_data_spacy(f"../datasets/{folder}/train.{target}.tsv")
print("TOTAL    ", len(training_data))
test_data, test_texts, test_cats = load_data_spacy(f"../datasets/{folder}/test.{target}.tsv")
print("TOTAL    ", len(test_data))

AGAINST    304
NONE       117
FAVOR       92
Name: Stance, dtype: int64
TOTAL     513
AGAINST    160
FAVOR       32
NONE        28
Name: Stance, dtype: int64
TOTAL     220
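
The class counts above are heavily skewed towards AGAINST, which gives a useful baseline: a classifier that always predicts the majority class. A quick sketch of that arithmetic from the printed counts (the helper `majority_share` is a hypothetical name):

```python
# Counts copied from the output above for the Atheism target.
train_counts = {"AGAINST": 304, "NONE": 117, "FAVOR": 92}
test_counts = {"AGAINST": 160, "FAVOR": 32, "NONE": 28}

def majority_share(counts):
    """Return the most frequent label and its share of all examples."""
    label = max(counts, key=counts.get)
    return label, counts[label] / sum(counts.values())

train_label, train_share = majority_share(train_counts)
test_label, test_share = majority_share(test_counts)
print(train_label, round(train_share, 2))
print(test_label, round(test_share, 2))
```

Always predicting AGAINST would already be right for about 73% of the Atheism test tweets, so a trained model is only informative to the extent that it does better than this on the minority classes.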


In [18]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "bow")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time0.4554169999999971seconds
Iteration: 1


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time0.24215600000000137seconds
Iteration: 2


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time0.2569429999999997seconds
Iteration: 3


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time0.23718600000000123seconds
Iteration: 4


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time0.22846899999999692seconds
Iteration: 5


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time0.24385300000000143seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time0.25467100000000187seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time0.22726500000000271seconds
Iteration: 8
              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       1.00      0.03      0.06        32

   micro avg       0.73      0.84      0.78       192
   macro avg       0.87      0.52      0.45       192
weighted avg       0.78      0.84      0.71       192

Elapsed time0.2237650000000002seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.85       160
       FAVOR       1.00      0.03      0.06        32

   micro avg       0.74      0.84      0.78       192
   macro avg       0.87      0.52      0.45       192
weighted avg       0.78      0.84      0.72       192

Elapsed time0.23212499999999636seconds
Iteration: 10
              precision    recall  f1-score   support

     AGAINST       0.73      0.99      0.84       160
       FAVOR       0.50      0.03      0.06        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.62      0.51      0.45       192
weighted avg       0.69      0.83      0.71       192

Elapsed time0.21884499999999463seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.74      0.98      0.85       160
       FAVOR       0.40      0.06      0.11        32

   micro avg       0.74      0.83      0.78       192
   macro avg       0.57      0.52      0.48       192
weighted avg       0.69      0.83      0.72       192

Elapsed time0.2348169999999996seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.75      0.97      0.85       160
       FAVOR       0.40      0.06      0.11        32

   micro avg       0.74      0.82      0.78       192
   macro avg       0.57      0.52      0.48       192
weighted avg       0.69      0.82      0.72       192

Elapsed time0.23582299999999634seconds
Iteration: 13




              precision    recall  f1-score   support

     AGAINST       0.75      0.97      0.84       160
       FAVOR       0.33      0.06      0.11        32

   micro avg       0.74      0.82      0.78       192
   macro avg       0.54      0.52      0.47       192
weighted avg       0.68      0.82      0.72       192

Elapsed time0.2281770000000023seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.75      0.97      0.84       160
       FAVOR       0.33      0.06      0.11        32

   micro avg       0.74      0.82      0.78       192
   macro avg       0.54      0.52      0.47       192
weighted avg       0.68      0.82      0.72       192

Elapsed time0.24690900000000227seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.75      0.96      0.84       160
       FAVOR       0.29      0.06      0.10        32

   micro avg       0.74      0.81      0.77       192
   macro avg       0.52      0.51      0.47       192
weighted avg       0.67      0.81      0.72       192

Elapsed time0.2254519999999971seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.75      0.96      0.84       160
       FAVOR       0.29      0.06      0.10        32

   micro avg       0.74      0.81      0.77       192
   macro avg       0.52      0.51      0.47       192
weighted avg       0.67      0.81      0.72       192

Elapsed time0.2507579999999976seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.75      0.96      0.84       160
       FAVOR       0.29      0.06      0.10        32

   micro avg       0.74      0.81      0.77       192
   macro avg       0.52      0.51      0.47       192
weighted avg       0.68      0.81      0.72       192

Elapsed time0.2416070000000019seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.77      0.96      0.85       160
       FAVOR       0.29      0.06      0.10        32

   micro avg       0.75      0.81      0.78       192
   macro avg       0.53      0.51      0.48       192
weighted avg       0.69      0.81      0.73       192

Elapsed time0.2023850000000067seconds
Iteration: 19
              precision    recall  f1-score   support

     AGAINST       0.77      0.94      0.85       160
       FAVOR       0.22      0.06      0.10        32

   micro avg       0.74      0.80      0.77       192
   macro avg       0.49      0.50      0.47       192
weighted avg       0.68      0.80      0.72       192

Elapsed time0.21679699999999968seconds
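
The `macro avg` and `weighted avg` rows of each report can be recomputed from the per-class rows, which is a handy sanity check when one class scores 0.00. The sketch below uses the rounded values printed for the early iterations of this run (AGAINST 0.73 / 1.00 / 0.84 with support 160, FAVOR all zeros with support 32). The names `rows`, `macro_avg` and `weighted_avg` are illustrative and not part of the lab code.

```python
# Per-class rows copied (already rounded) from the report:
# label -> (precision, recall, f1-score, support)
rows = {
    "AGAINST": (0.73, 1.00, 0.84, 160),
    "FAVOR": (0.00, 0.00, 0.00, 32),
}


def macro_avg(rows, idx):
    """Unweighted mean of one metric column over the classes."""
    return sum(r[idx] for r in rows.values()) / len(rows)


def weighted_avg(rows, idx):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[idx] * r[3] for r in rows.values()) / total


for name, idx in (("precision", 0), ("recall", 1), ("f1-score", 2)):
    print(f"{name:>9}  macro={macro_avg(rows, idx):.3f}  "
          f"weighted={weighted_avg(rows, idx):.3f}")
```

The macro F1 comes out at 0.42 and the weighted F1 at 0.70, as in the report. Because the inputs are already rounded to two decimals, other cells can differ from the printed report in the last digit.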


In [19]:
# Train for 20 iterations with the "simple_cnn" text-classifier architecture.
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time1.5226989999999958seconds
Iteration: 1




              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time1.2765780000000007seconds
Iteration: 2




              precision    recall  f1-score   support

     AGAINST       0.73      1.00      0.84       160
       FAVOR       0.00      0.00      0.00        32

   micro avg       0.73      0.83      0.78       192
   macro avg       0.36      0.50      0.42       192
weighted avg       0.61      0.83      0.70       192

Elapsed time1.2817050000000023seconds
Iteration: 3
              precision    recall  f1-score   support

     AGAINST       0.76      0.94      0.84       160
       FAVOR       0.18      0.06      0.09        32

   micro avg       0.73      0.79      0.76       192
   macro avg       0.47      0.50      0.47       192
weighted avg       0.66      0.79      0.72       192

Elapsed time1.2954159999999888seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.76      0.91      0.83       160
       FAVOR       0.16      0.09      0.12        32

   micro avg       0.71      0.77      0.74       192
   macro avg       0.46      0.50      0.47       192
weighted avg       0.66      0.77      0.71       192

Elapsed time1.2855759999999918seconds
Iteration: 5




              precision    recall  f1-score   support

     AGAINST       0.85      0.76      0.80       160
       FAVOR       0.30      0.47      0.37        32

   micro avg       0.71      0.71      0.71       192
   macro avg       0.57      0.62      0.58       192
weighted avg       0.76      0.71      0.73       192

Elapsed time1.2861699999999985seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.82      0.78      0.80       160
       FAVOR       0.33      0.12      0.18        32

   micro avg       0.79      0.67      0.72       192
   macro avg       0.58      0.45      0.49       192
weighted avg       0.74      0.67      0.70       192

Elapsed time1.284567999999993seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.83      0.77      0.80       160
       FAVOR       0.31      0.12      0.18        32

   micro avg       0.78      0.66      0.72       192
   macro avg       0.57      0.45      0.49       192
weighted avg       0.74      0.66      0.69       192

Elapsed time1.262701000000007seconds
Iteration: 8




              precision    recall  f1-score   support

     AGAINST       0.81      0.79      0.80       160
       FAVOR       0.24      0.16      0.19        32

   micro avg       0.74      0.68      0.71       192
   macro avg       0.53      0.47      0.49       192
weighted avg       0.72      0.68      0.70       192

Elapsed time1.2610379999999992seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.85      0.74      0.79       160
       FAVOR       0.28      0.22      0.25        32

   micro avg       0.76      0.66      0.71       192
   macro avg       0.56      0.48      0.52       192
weighted avg       0.76      0.66      0.70       192

Elapsed time1.2523089999999968seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.84      0.78      0.81       160
       FAVOR       0.27      0.22      0.24        32

   micro avg       0.76      0.69      0.72       192
   macro avg       0.56      0.50      0.53       192
weighted avg       0.75      0.69      0.72       192

Elapsed time1.2389779999999888seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.84      0.78      0.81       160
       FAVOR       0.27      0.22      0.24        32

   micro avg       0.75      0.69      0.72       192
   macro avg       0.55      0.50      0.53       192
weighted avg       0.74      0.69      0.71       192

Elapsed time1.2575230000000062seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.83      0.80      0.81       160
       FAVOR       0.29      0.25      0.27        32

   micro avg       0.74      0.71      0.73       192
   macro avg       0.56      0.53      0.54       192
weighted avg       0.74      0.71      0.72       192

Elapsed time1.268392999999989seconds
Iteration: 13




              precision    recall  f1-score   support

     AGAINST       0.85      0.78      0.81       160
       FAVOR       0.29      0.28      0.29        32

   micro avg       0.75      0.70      0.72       192
   macro avg       0.57      0.53      0.55       192
weighted avg       0.76      0.70      0.73       192

Elapsed time1.256540000000001seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.83      0.81      0.82       160
       FAVOR       0.26      0.22      0.24        32

   micro avg       0.74      0.71      0.73       192
   macro avg       0.54      0.52      0.53       192
weighted avg       0.73      0.71      0.72       192

Elapsed time1.2536889999999943seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.84      0.81      0.83       160
       FAVOR       0.31      0.28      0.30        32

   micro avg       0.76      0.72      0.74       192
   macro avg       0.58      0.55      0.56       192
weighted avg       0.76      0.72      0.74       192

Elapsed time1.2289849999999944seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.83      0.81      0.82       160
       FAVOR       0.28      0.25      0.26        32

   micro avg       0.75      0.72      0.73       192
   macro avg       0.55      0.53      0.54       192
weighted avg       0.74      0.72      0.73       192

Elapsed time1.2461199999999906seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.84      0.79      0.82       160
       FAVOR       0.29      0.28      0.29        32

   micro avg       0.75      0.71      0.73       192
   macro avg       0.57      0.54      0.55       192
weighted avg       0.75      0.71      0.73       192

Elapsed time1.2345630000000085seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.86      0.75      0.80       160
       FAVOR       0.29      0.28      0.29        32

   micro avg       0.75      0.67      0.71       192
   macro avg       0.57      0.52      0.54       192
weighted avg       0.76      0.67      0.71       192

Elapsed time1.2455810000000014seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.85      0.79      0.82       160
       FAVOR       0.27      0.28      0.28        32

   micro avg       0.74      0.71      0.73       192
   macro avg       0.56      0.54      0.55       192
weighted avg       0.75      0.71      0.73       192

Elapsed time1.2503339999999952seconds
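
Macro F1 does not improve monotonically in this `simple_cnn` run: it peaks at iteration 5 and then settles slightly lower. A minimal sketch of picking the best iteration from the logged values follows. The list is copied from the `macro avg` rows of the reports above, and `best_iteration` is a hypothetical helper, not something `train_spacy` is known to provide.

```python
# Macro-averaged F1 per iteration (0-19), copied from the simple_cnn reports.
macro_f1 = [0.42, 0.42, 0.42, 0.47, 0.47, 0.58, 0.49, 0.49, 0.49, 0.52,
            0.53, 0.53, 0.54, 0.55, 0.53, 0.56, 0.54, 0.55, 0.54, 0.55]


def best_iteration(scores):
    """Return (index, score) of the first iteration with the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores[best]


iteration, score = best_iteration(macro_f1)
print(f"best macro F1 {score:.2f} at iteration {iteration}")
```

Inside a training loop the same comparison could decide when to save the model, so that the best-scoring checkpoint is kept rather than the last one. Ideally that choice would be made on a separate development split, since selecting on the test set makes the reported score optimistic.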




## Climate Change is a Real Concern

In [20]:
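# Stance target for this experiment; it selects the train/test TSV files below.
target = "Climate Change is a Real Concern"
# load_data_spacy also prints the per-label counts shown in the output.
training_data, train_texts, train_cats = load_data_spacy(f"../datasets/{folder}/train.{target}.tsv")
print("TOTAL    ", len(training_data))
# Load the held-out test split for the same target.
test_data, test_texts, test_cats = load_data_spacy(f"../datasets/{folder}/test.{target}.tsv")
print("TOTAL    ", len(test_data))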

FAVOR      212
NONE       168
AGAINST     15
Name: Stance, dtype: int64
TOTAL     395
FAVOR      123
NONE        35
AGAINST     11
Name: Stance, dtype: int64
TOTAL     169
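
The label counts printed above show how skewed this target is: AGAINST is a small minority in both splits. A quick sketch of the class shares, and of the accuracy a majority-class baseline would reach on the test split, using only the printed counts (`train_counts`, `test_counts` and `shares` are illustrative names):

```python
# Label counts copied from the output of load_data_spacy above.
train_counts = {"FAVOR": 212, "NONE": 168, "AGAINST": 15}
test_counts = {"FAVOR": 123, "NONE": 35, "AGAINST": 11}


def shares(counts):
    """Map each label to its fraction of the split."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


for name, counts in (("train", train_counts), ("test", test_counts)):
    summary = ", ".join(f"{label}={share:.1%}"
                        for label, share in shares(counts).items())
    print(f"{name}: {summary}")

# Accuracy obtained by always predicting the most frequent training label.
majority = max(train_counts, key=train_counts.get)
baseline_acc = test_counts[majority] / sum(test_counts.values())
print(f"majority label: {majority}, test accuracy: {baseline_acc:.3f}")
```

With only 15 AGAINST examples in the training split, a classifier that rarely or never predicts that label is to be expected, so the per-class rows of the reports are more informative here than any single averaged score.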


In [21]:
# Train for 20 iterations with the "bow" (bag-of-words) text-classifier architecture.
nlp = train_spacy(training_data, 20, test_texts, test_cats, "bow")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.91      0.83       123

   micro avg       0.77      0.84      0.80       134
   macro avg       0.38      0.46      0.42       134
weighted avg       0.70      0.84      0.76       134

Elapsed time0.3953389999999928seconds
Iteration: 1
              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.98      0.87       123

   micro avg       0.78      0.90      0.83       134
   macro avg       0.39      0.49      0.43       134
weighted avg       0.71      0.90      0.80       134

Elapsed time0.17195700000000613seconds
Iteration: 2




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.98      0.86       123

   micro avg       0.77      0.90      0.83       134
   macro avg       0.39      0.49      0.43       134
weighted avg       0.71      0.90      0.79       134

Elapsed time0.1815950000000015seconds
Iteration: 3
              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.99      0.87       123

   micro avg       0.78      0.91      0.84       134
   macro avg       0.39      0.50      0.44       134
weighted avg       0.71      0.91      0.80       134

Elapsed time0.1804169999999914seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.98      0.86       123

   micro avg       0.77      0.90      0.83       134
   macro avg       0.39      0.49      0.43       134
weighted avg       0.71      0.90      0.79       134

Elapsed time0.18875400000000297seconds
Iteration: 5
              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.98      0.86       123

   micro avg       0.77      0.90      0.83       134
   macro avg       0.39      0.49      0.43       134
weighted avg       0.71      0.90      0.79       134

Elapsed time0.1870619999999974seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.94      0.85       123

   micro avg       0.77      0.87      0.81       134
   macro avg       0.38      0.47      0.42       134
weighted avg       0.71      0.87      0.78       134

Elapsed time0.18963999999999714seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.93      0.85       123

   micro avg       0.77      0.86      0.81       134
   macro avg       0.39      0.47      0.42       134
weighted avg       0.71      0.86      0.78       134

Elapsed time0.23517699999999309seconds
Iteration: 8




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.92      0.84       123

   micro avg       0.77      0.84      0.81       134
   macro avg       0.39      0.46      0.42       134
weighted avg       0.71      0.84      0.77       134

Elapsed time0.21599799999999902seconds
Iteration: 9
              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.92      0.84       123

   micro avg       0.77      0.84      0.81       134
   macro avg       0.39      0.46      0.42       134
weighted avg       0.71      0.84      0.77       134

Elapsed time0.18554500000000473seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.91      0.84       123

   micro avg       0.77      0.84      0.80       134
   macro avg       0.39      0.46      0.42       134
weighted avg       0.71      0.84      0.77       134

Elapsed time0.20054299999999614seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.92      0.84       123

   micro avg       0.78      0.84      0.81       134
   macro avg       0.39      0.46      0.42       134
weighted avg       0.72      0.84      0.77       134

Elapsed time0.25885900000000106seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.92      0.84       123

   micro avg       0.78      0.84      0.81       134
   macro avg       0.39      0.46      0.42       134
weighted avg       0.72      0.84      0.77       134

Elapsed time0.19825499999998897seconds
Iteration: 13
              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.91      0.84       123

   micro avg       0.78      0.84      0.81       134
   macro avg       0.39      0.46      0.42       134
weighted avg       0.71      0.84      0.77       134

Elapsed time0.15354800000000068seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.90      0.84       123

   micro avg       0.78      0.83      0.80       134
   macro avg       0.39      0.45      0.42       134
weighted avg       0.72      0.83      0.77       134

Elapsed time0.19134699999999327seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.89      0.83       123

   micro avg       0.78      0.82      0.80       134
   macro avg       0.39      0.45      0.42       134
weighted avg       0.72      0.82      0.76       134

Elapsed time0.25175400000000536seconds
Iteration: 16
              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.89      0.83       123

   micro avg       0.78      0.82      0.80       134
   macro avg       0.39      0.45      0.42       134
weighted avg       0.72      0.82      0.76       134

Elapsed time0.17965800000000343seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.89      0.83       123

   micro avg       0.78      0.81      0.80       134
   macro avg       0.39      0.44      0.41       134
weighted avg       0.71      0.81      0.76       134

Elapsed time0.17929800000000284seconds
Iteration: 18
              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.89      0.83       123

   micro avg       0.78      0.81      0.80       134
   macro avg       0.39      0.44      0.42       134
weighted avg       0.72      0.81      0.76       134

Elapsed time0.19227199999998845seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.79      0.88      0.83       123

   micro avg       0.79      0.81      0.80       134
   macro avg       0.39      0.44      0.42       134
weighted avg       0.72      0.81      0.76       134

Elapsed time0.17414999999999736seconds
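
AGAINST scores 0.00 precision in every iteration of this run. That number can mean two different things: the label was predicted but never correctly, or it was never predicted at all, in which case precision TP / (TP + FP) has a zero denominator and is undefined. scikit-learn reports 0.0 in the undefined case and issues an `UndefinedMetricWarning`; its `classification_report` accepts a `zero_division` argument to control that fallback. The sketch below reimplements the idea in plain Python on made-up toy labels. `precision_for` is a hypothetical helper, not the code used by `train_spacy`.

```python
def precision_for(label, y_true, y_pred, zero_division=0.0):
    """Precision of `label`; falls back to `zero_division` if it was never predicted."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t != label)
    if tp + fp == 0:
        return zero_division  # undefined: no prediction carries this label
    return tp / (tp + fp)


# Toy data (made up for illustration): the model only ever predicts FAVOR.
y_true = ["FAVOR", "FAVOR", "AGAINST", "FAVOR"]
y_pred = ["FAVOR", "FAVOR", "FAVOR", "FAVOR"]

print(precision_for("FAVOR", y_true, y_pred))    # 3 correct out of 4 predictions
print(precision_for("AGAINST", y_true, y_pred))  # never predicted: fallback value
```

Keeping the two cases apart helps when reading the reports: a fallback zero says the model ignores the class entirely, while a computed zero says it tries and fails.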


In [22]:
# Train for 20 iterations with the "simple_cnn" text-classifier architecture.
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.73      1.00      0.84       123

   micro avg       0.73      0.92      0.81       134
   macro avg       0.36      0.50      0.42       134
weighted avg       0.67      0.92      0.77       134

Elapsed time1.1849349999999959seconds
Iteration: 1




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.73      1.00      0.85       123

   micro avg       0.73      0.92      0.81       134
   macro avg       0.37      0.50      0.42       134
weighted avg       0.67      0.92      0.78       134

Elapsed time0.9184599999999961seconds
Iteration: 2




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.73      1.00      0.85       123

   micro avg       0.73      0.92      0.81       134
   macro avg       0.37      0.50      0.42       134
weighted avg       0.67      0.92      0.78       134

Elapsed time0.9287160000000085seconds
Iteration: 3




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.77      0.94      0.85       123

   micro avg       0.77      0.87      0.82       134
   macro avg       0.39      0.47      0.42       134
weighted avg       0.71      0.87      0.78       134

Elapsed time0.936981000000003seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.82      0.84      0.83       123

   micro avg       0.82      0.77      0.80       134
   macro avg       0.41      0.42      0.42       134
weighted avg       0.76      0.77      0.76       134

Elapsed time0.9399259999999998seconds
Iteration: 5




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.84      0.72      0.78       123

   micro avg       0.84      0.66      0.74       134
   macro avg       0.42      0.36      0.39       134
weighted avg       0.77      0.66      0.71       134

Elapsed time0.9328450000000004seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.82      0.76      0.78       123

   micro avg       0.82      0.69      0.75       134
   macro avg       0.41      0.38      0.39       134
weighted avg       0.75      0.69      0.72       134

Elapsed time0.9235989999999958seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.81      0.78      0.80       123

   micro avg       0.81      0.72      0.76       134
   macro avg       0.41      0.39      0.40       134
weighted avg       0.75      0.72      0.73       134

Elapsed time0.9108360000000033seconds
Iteration: 8
              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.82      0.76      0.79       123

   micro avg       0.82      0.70      0.76       134
   macro avg       0.41      0.38      0.40       134
weighted avg       0.76      0.70      0.73       134

Elapsed time0.9264809999999954seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.80      0.80      0.80       123

   micro avg       0.79      0.73      0.76       134
   macro avg       0.40      0.40      0.40       134
weighted avg       0.74      0.73      0.73       134

Elapsed time0.9173089999999888seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.82      0.78      0.80       123

   micro avg       0.81      0.72      0.76       134
   macro avg       0.41      0.39      0.40       134
weighted avg       0.75      0.72      0.73       134

Elapsed time0.9134999999999991seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.79      0.81      0.80       123

   micro avg       0.78      0.75      0.76       134
   macro avg       0.40      0.41      0.40       134
weighted avg       0.73      0.75      0.74       134

Elapsed time0.9112109999999944seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.79      0.81      0.80       123

   micro avg       0.78      0.75      0.76       134
   macro avg       0.40      0.41      0.40       134
weighted avg       0.73      0.75      0.74       134

Elapsed time0.8965089999999947seconds
Iteration: 13




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.85      0.81       123

   micro avg       0.76      0.78      0.77       134
   macro avg       0.39      0.42      0.40       134
weighted avg       0.71      0.78      0.74       134

Elapsed time0.9065899999999942seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.79      0.82      0.80       123

   micro avg       0.78      0.75      0.77       134
   macro avg       0.39      0.41      0.40       134
weighted avg       0.72      0.75      0.74       134

Elapsed time0.9049710000000033seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.79      0.84      0.81       123

   micro avg       0.77      0.77      0.77       134
   macro avg       0.39      0.42      0.41       134
weighted avg       0.72      0.77      0.74       134

Elapsed time0.9298079999999942seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.78      0.83      0.81       123

   micro avg       0.77      0.76      0.77       134
   macro avg       0.39      0.41      0.40       134
weighted avg       0.72      0.76      0.74       134

Elapsed time0.9199240000000088seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.79      0.81      0.80       123

   micro avg       0.78      0.75      0.76       134
   macro avg       0.40      0.41      0.40       134
weighted avg       0.73      0.75      0.74       134

Elapsed time0.9317110000000071seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.79      0.80      0.79       123

   micro avg       0.78      0.73      0.75       134
   macro avg       0.40      0.40      0.40       134
weighted avg       0.73      0.73      0.73       134

Elapsed time0.9032359999999926seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.00      0.00      0.00        11
       FAVOR       0.79      0.82      0.80       123

   micro avg       0.78      0.75      0.77       134
   macro avg       0.39      0.41      0.40       134
weighted avg       0.72      0.75      0.74       134

Elapsed time0.9142149999999987seconds
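
The reports above list both a macro and a weighted average, and the gap between them comes from the class imbalance (11 AGAINST vs. 123 FAVOR). A minimal sketch of how the two averages are formed, using the rounded per-class scores from the last report; the helper names are hypothetical, and the results match the report only to within the rounding of these inputs:

```python
# Per-class scores copied from the last report above (iteration 19):
# label -> (precision, recall, f1, support)
per_class = {
    "AGAINST": (0.00, 0.00, 0.00, 11),
    "FAVOR": (0.79, 0.82, 0.80, 123),
}

def macro_average(scores):
    """Unweighted mean of precision, recall and F1 over the classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))

def weighted_average(scores):
    """Mean of precision, recall and F1 weighted by class support."""
    total = sum(s[3] for s in scores.values())
    return tuple(sum(s[i] * s[3] for s in scores.values()) / total for i in range(3))

macro = macro_average(per_class)
weighted = weighted_average(per_class)
print("macro   ", macro)
print("weighted", weighted)
```

Because AGAINST is never predicted correctly, it halves the macro scores, while the weighted scores stay close to those of FAVOR.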




## Hillary Clinton

In [23]:
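# Select the target and load its train/test splits
# (the per-label counts in the output appear to be printed by load_data_spacy).
target = "Hillary Clinton"
training_data, train_texts, train_cats = load_data_spacy(f"../datasets/{folder}/train.{target}.tsv")
print("TOTAL    ", len(training_data))  # number of training examples
test_data, test_texts, test_cats = load_data_spacy(f"../datasets/{folder}/test.{target}.tsv")
print("TOTAL    ", len(test_data))  # number of test examples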

AGAINST    393
NONE       178
FAVOR      118
Name: Stance, dtype: int64
TOTAL     689
AGAINST    172
NONE        78
FAVOR       45
Name: Stance, dtype: int64
TOTAL     295
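
With counts this skewed, a useful reference point is the majority-class baseline, i.e. the accuracy of always predicting the most frequent label. A minimal sketch using the test counts printed above; `majority_baseline` is a hypothetical helper, not part of the notebook:

```python
from collections import Counter

# Test-set label counts for "Hillary Clinton", copied from the output above.
test_counts = Counter({"AGAINST": 172, "NONE": 78, "FAVOR": 45})

def majority_baseline(counts):
    """Return the most frequent label and the accuracy of always predicting it."""
    label, freq = counts.most_common(1)[0]
    return label, freq / sum(counts.values())

label, acc = majority_baseline(test_counts)
print(f"Always predicting {label}: accuracy = {acc:.3f}")
# prints: Always predicting AGAINST: accuracy = 0.583
```

Any classifier trained on this target should be judged against that 0.583 figure rather than against 0.5.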


In [24]:
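# Train for 20 iterations with the bag-of-words ("bow") architecture; a test-set report is printed after each iteration.
nlp = train_spacy(training_data, 20, test_texts, test_cats, "bow")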

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

     AGAINST       0.58      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.58      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time0.6265099999999961seconds
Iteration: 1




              precision    recall  f1-score   support

     AGAINST       0.58      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.58      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time0.2821909999999974seconds
Iteration: 2




              precision    recall  f1-score   support

     AGAINST       0.58      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.58      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time0.2936809999999923seconds
Iteration: 3




              precision    recall  f1-score   support

     AGAINST       0.58      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.58      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time0.3005860000000098seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.58      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.58      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time0.30452099999999405seconds
Iteration: 5




              precision    recall  f1-score   support

     AGAINST       0.58      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.58      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time0.2820860000000067seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.59      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.59      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.47      0.79      0.59       217

Elapsed time0.2830199999999934seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.59      0.99      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.59      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time0.32111000000000445seconds
Iteration: 8




              precision    recall  f1-score   support

     AGAINST       0.59      0.99      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.59      0.78      0.68       217
   macro avg       0.30      0.49      0.37       217
weighted avg       0.47      0.78      0.59       217

Elapsed time0.31667699999999854seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.60      0.98      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.60      0.78      0.68       217
   macro avg       0.30      0.49      0.37       217
weighted avg       0.48      0.78      0.59       217

Elapsed time0.3163080000000065seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.61      0.98      0.75       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.61      0.78      0.68       217
   macro avg       0.31      0.49      0.38       217
weighted avg       0.48      0.78      0.60       217

Elapsed time0.30236999999999625seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.61      0.98      0.75       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.61      0.78      0.69       217
   macro avg       0.31      0.49      0.38       217
weighted avg       0.49      0.78      0.60       217

Elapsed time0.3209849999999932seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.62      0.98      0.76       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.62      0.77      0.69       217
   macro avg       0.31      0.49      0.38       217
weighted avg       0.49      0.77      0.60       217

Elapsed time0.2973100000000102seconds
Iteration: 13
              precision    recall  f1-score   support

     AGAINST       0.62      0.98      0.76       172
       FAVOR       1.00      0.02      0.04        45

   micro avg       0.62      0.78      0.69       217
   macro avg       0.81      0.50      0.40       217
weighted avg       0.70      0.78      0.61       217

Elapsed time0.2856790000000018seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.62      0.98      0.76       172
       FAVOR       1.00      0.04      0.09        45

   micro avg       0.62      0.78      0.69       217
   macro avg       0.81      0.51      0.42       217
weighted avg       0.70      0.78      0.62       217

Elapsed time0.31307999999999936seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.62      0.97      0.76       172
       FAVOR       0.67      0.04      0.08        45

   micro avg       0.62      0.78      0.69       217
   macro avg       0.64      0.51      0.42       217
weighted avg       0.63      0.78      0.62       217

Elapsed time0.3035899999999998seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.62      0.97      0.76       172
       FAVOR       0.67      0.04      0.08        45

   micro avg       0.62      0.78      0.69       217
   macro avg       0.64      0.51      0.42       217
weighted avg       0.63      0.78      0.62       217

Elapsed time0.285191999999995seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.63      0.97      0.76       172
       FAVOR       0.75      0.07      0.12        45

   micro avg       0.63      0.78      0.70       217
   macro avg       0.69      0.52      0.44       217
weighted avg       0.65      0.78      0.63       217

Elapsed time0.3077300000000065seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.62      0.96      0.76       172
       FAVOR       0.67      0.09      0.16        45

   micro avg       0.63      0.78      0.69       217
   macro avg       0.65      0.52      0.46       217
weighted avg       0.63      0.78      0.63       217

Elapsed time0.3188949999999977seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.63      0.96      0.76       172
       FAVOR       0.71      0.11      0.19        45

   micro avg       0.63      0.78      0.70       217
   macro avg       0.67      0.54      0.48       217
weighted avg       0.65      0.78      0.64       217

Elapsed time0.2876410000000078seconds
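
The support of 217 in these reports equals 172 AGAINST + 45 FAVOR, which suggests that the evaluation scores only those two labels while the 78 NONE items still receive predictions. Under that assumption (the evaluation code is not shown here), the micro average can be reproduced by hand. The sketch below is plain Python with a hypothetical `micro_prf` helper and reconstructs iteration 0, where every tweet is labelled AGAINST:

```python
def micro_prf(y_true, y_pred, labels):
    """Micro-averaged precision, recall and F1 restricted to `labels`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p and t in labels)
    n_pred = sum(1 for p in y_pred if p in labels)  # predictions of a scored label
    n_true = sum(1 for t in y_true if t in labels)  # gold items with a scored label
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Reconstruct iteration 0: gold counts from the test split, every prediction AGAINST.
y_true = ["AGAINST"] * 172 + ["NONE"] * 78 + ["FAVOR"] * 45
y_pred = ["AGAINST"] * len(y_true)

p, r, f = micro_prf(y_true, y_pred, labels={"AGAINST", "FAVOR"})
print(f"micro P={p:.2f} R={r:.2f} F={f:.2f}")
# prints: micro P=0.58 R=0.79 F=0.67
```

This explains why micro precision and micro recall differ: NONE tweets predicted as AGAINST count against precision (172/295) but are excluded from the recall denominator (172/217).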




In [25]:
# Same setup as above, but with the "simple_cnn" architecture (slower per iteration than "bow").
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

     AGAINST       0.58      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.58      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time1.9372079999999983seconds
Iteration: 1




              precision    recall  f1-score   support

     AGAINST       0.58      1.00      0.74       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.58      0.79      0.67       217
   macro avg       0.29      0.50      0.37       217
weighted avg       0.46      0.79      0.58       217

Elapsed time1.6306849999999997seconds
Iteration: 2
              precision    recall  f1-score   support

     AGAINST       0.60      0.98      0.75       172
       FAVOR       0.00      0.00      0.00        45

   micro avg       0.60      0.78      0.68       217
   macro avg       0.30      0.49      0.37       217
weighted avg       0.48      0.78      0.59       217

Elapsed time1.6216849999999994seconds
Iteration: 3




              precision    recall  f1-score   support

     AGAINST       0.64      0.92      0.75       172
       FAVOR       0.50      0.02      0.04        45

   micro avg       0.64      0.73      0.68       217
   macro avg       0.57      0.47      0.40       217
weighted avg       0.61      0.73      0.61       217

Elapsed time1.5739820000000009seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.65      0.86      0.74       172
       FAVOR       0.86      0.13      0.23        45

   micro avg       0.66      0.71      0.68       217
   macro avg       0.75      0.50      0.49       217
weighted avg       0.69      0.71      0.64       217

Elapsed time1.5921790000000016seconds
Iteration: 5




              precision    recall  f1-score   support

     AGAINST       0.66      0.87      0.75       172
       FAVOR       0.64      0.20      0.31        45

   micro avg       0.66      0.73      0.69       217
   macro avg       0.65      0.53      0.53       217
weighted avg       0.66      0.73      0.66       217

Elapsed time1.5770090000000039seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.66      0.87      0.75       172
       FAVOR       0.59      0.22      0.32        45

   micro avg       0.66      0.74      0.69       217
   macro avg       0.62      0.55      0.54       217
weighted avg       0.65      0.74      0.66       217

Elapsed time1.5863879999999995seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.69      0.84      0.76       172
       FAVOR       0.54      0.29      0.38        45

   micro avg       0.68      0.73      0.70       217
   macro avg       0.62      0.57      0.57       217
weighted avg       0.66      0.73      0.68       217

Elapsed time1.563022999999987seconds
Iteration: 8




              precision    recall  f1-score   support

     AGAINST       0.69      0.87      0.77       172
       FAVOR       0.60      0.27      0.37        45

   micro avg       0.68      0.74      0.71       217
   macro avg       0.64      0.57      0.57       217
weighted avg       0.67      0.74      0.69       217

Elapsed time1.642725999999982seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.68      0.80      0.73       172
       FAVOR       0.52      0.31      0.39        45

   micro avg       0.66      0.70      0.68       217
   macro avg       0.60      0.55      0.56       217
weighted avg       0.65      0.70      0.66       217

Elapsed time1.5742840000000058seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.69      0.84      0.76       172
       FAVOR       0.56      0.33      0.42        45

   micro avg       0.67      0.73      0.70       217
   macro avg       0.62      0.59      0.59       217
weighted avg       0.66      0.73      0.69       217

Elapsed time1.555991000000006seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.67      0.84      0.74       172
       FAVOR       0.58      0.31      0.41        45

   micro avg       0.66      0.73      0.69       217
   macro avg       0.63      0.57      0.57       217
weighted avg       0.65      0.73      0.67       217

Elapsed time1.5562400000000025seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.68      0.83      0.75       172
       FAVOR       0.50      0.29      0.37        45

   micro avg       0.66      0.71      0.69       217
   macro avg       0.59      0.56      0.56       217
weighted avg       0.64      0.71      0.67       217

Elapsed time1.5805090000000064seconds
Iteration: 13




              precision    recall  f1-score   support

     AGAINST       0.66      0.88      0.76       172
       FAVOR       0.59      0.29      0.39        45

   micro avg       0.66      0.76      0.71       217
   macro avg       0.63      0.59      0.57       217
weighted avg       0.65      0.76      0.68       217

Elapsed time1.5787190000000066seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.67      0.90      0.76       172
       FAVOR       0.62      0.29      0.39        45

   micro avg       0.66      0.77      0.71       217
   macro avg       0.64      0.59      0.58       217
weighted avg       0.66      0.77      0.69       217

Elapsed time1.5742639999999994seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.66      0.87      0.75       172
       FAVOR       0.54      0.29      0.38        45

   micro avg       0.65      0.75      0.70       217
   macro avg       0.60      0.58      0.56       217
weighted avg       0.64      0.75      0.67       217

Elapsed time1.5856339999999989seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.66      0.88      0.75       172
       FAVOR       0.65      0.29      0.40        45

   micro avg       0.66      0.76      0.70       217
   macro avg       0.65      0.58      0.58       217
weighted avg       0.66      0.76      0.68       217

Elapsed time1.548043000000007seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.68      0.86      0.76       172
       FAVOR       0.54      0.31      0.39        45

   micro avg       0.66      0.75      0.70       217
   macro avg       0.61      0.59      0.58       217
weighted avg       0.65      0.75      0.68       217

Elapsed time1.587931999999995seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.67      0.88      0.76       172
       FAVOR       0.65      0.29      0.40        45

   micro avg       0.67      0.76      0.71       217
   macro avg       0.66      0.58      0.58       217
weighted avg       0.67      0.76      0.69       217

Elapsed time1.5549760000000106seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.68      0.87      0.76       172
       FAVOR       0.59      0.29      0.39        45

   micro avg       0.67      0.75      0.71       217
   macro avg       0.63      0.58      0.57       217
weighted avg       0.66      0.75      0.68       217

Elapsed time1.558952000000005seconds
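
The scores in this run fluctuate from one iteration to the next, so the last iteration is not necessarily the best one. A minimal sketch of picking the best iteration by macro F1, using the values from the 20 reports above; `best_iteration` is a hypothetical helper, and in practice the selection should be made on a held-out development set rather than on the test set used here:

```python
# Macro-averaged F1 per iteration, copied from the 20 reports above (simple_cnn run).
macro_f1 = [0.37, 0.37, 0.37, 0.40, 0.49, 0.53, 0.54, 0.57, 0.57, 0.56,
            0.59, 0.57, 0.56, 0.57, 0.58, 0.56, 0.58, 0.58, 0.58, 0.57]

def best_iteration(scores):
    """Index and value of the highest score; ties go to the earliest iteration."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores[best]

it, score = best_iteration(macro_f1)
print(f"best macro F1 = {score:.2f} at iteration {it}")
# prints: best macro F1 = 0.59 at iteration 10
```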




## Legalization of Abortion

In [26]:
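# Select the target and load its train/test splits
# (the per-label counts in the output appear to be printed by load_data_spacy).
target = "Legalization of Abortion"
training_data, train_texts, train_cats = load_data_spacy(f"../datasets/{folder}/train.{target}.tsv")
print("TOTAL    ", len(training_data))  # number of training examples
test_data, test_texts, test_cats = load_data_spacy(f"../datasets/{folder}/test.{target}.tsv")
print("TOTAL    ", len(test_data))  # number of test examples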

AGAINST    355
NONE       177
FAVOR      121
Name: Stance, dtype: int64
TOTAL     653
AGAINST    189
FAVOR       46
NONE        45
Name: Stance, dtype: int64
TOTAL     280
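
The label distribution differs between the two splits: AGAINST is noticeably more frequent in the test set than in the training set. A minimal sketch that turns the counts printed above into proportions so the shift is easy to see; `proportions` is a hypothetical helper:

```python
# Label counts for "Legalization of Abortion", copied from the output above.
train_counts = {"AGAINST": 355, "NONE": 177, "FAVOR": 121}
test_counts = {"AGAINST": 189, "FAVOR": 46, "NONE": 45}

def proportions(counts):
    """Convert absolute label counts into fractions of the total."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

train_p = proportions(train_counts)
test_p = proportions(test_counts)
for label in train_counts:
    print(f"{label:8s} train={train_p[label]:.1%}  test={test_p[label]:.1%}")
```

A shift like this means that scores tuned on the training distribution may not carry over unchanged to the test split.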


In [27]:
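# Train for 20 iterations with the bag-of-words ("bow") architecture; a test-set report is printed after each iteration.
nlp = train_spacy(training_data, 20, test_texts, test_cats, "bow")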

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

     AGAINST       0.68      1.00      0.81       189
       FAVOR       0.00      0.00      0.00        46

   micro avg       0.68      0.80      0.73       235
   macro avg       0.34      0.50      0.40       235
weighted avg       0.54      0.80      0.65       235

Elapsed time0.5710590000000195seconds
Iteration: 1




              precision    recall  f1-score   support

     AGAINST       0.68      1.00      0.81       189
       FAVOR       0.00      0.00      0.00        46

   micro avg       0.68      0.80      0.73       235
   macro avg       0.34      0.50      0.40       235
weighted avg       0.54      0.80      0.65       235

Elapsed time0.282227000000006seconds
Iteration: 2




              precision    recall  f1-score   support

     AGAINST       0.68      1.00      0.81       189
       FAVOR       0.00      0.00      0.00        46

   micro avg       0.68      0.80      0.73       235
   macro avg       0.34      0.50      0.40       235
weighted avg       0.54      0.80      0.65       235

Elapsed time0.3039620000000127seconds
Iteration: 3




              precision    recall  f1-score   support

     AGAINST       0.68      1.00      0.81       189
       FAVOR       0.00      0.00      0.00        46

   micro avg       0.68      0.80      0.73       235
   macro avg       0.34      0.50      0.40       235
weighted avg       0.54      0.80      0.65       235

Elapsed time0.2934669999999926seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.68      1.00      0.81       189
       FAVOR       0.00      0.00      0.00        46

   micro avg       0.68      0.80      0.74       235
   macro avg       0.34      0.50      0.40       235
weighted avg       0.54      0.80      0.65       235

Elapsed time0.292792999999989seconds
Iteration: 5




              precision    recall  f1-score   support

     AGAINST       0.68      1.00      0.81       189
       FAVOR       0.00      0.00      0.00        46

   micro avg       0.68      0.80      0.74       235
   macro avg       0.34      0.50      0.40       235
weighted avg       0.54      0.80      0.65       235

Elapsed time0.30529999999998836seconds
Iteration: 6
              precision    recall  f1-score   support

     AGAINST       0.68      0.99      0.81       189
       FAVOR       0.50      0.02      0.04        46

   micro avg       0.68      0.80      0.74       235
   macro avg       0.59      0.51      0.42       235
weighted avg       0.64      0.80      0.66       235

Elapsed time0.2732990000000086seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.68      0.99      0.81       189
       FAVOR       0.67      0.04      0.08        46

   micro avg       0.68      0.81      0.74       235
   macro avg       0.67      0.52      0.45       235
weighted avg       0.68      0.81      0.67       235

Elapsed time0.29655599999998117seconds
Iteration: 8




              precision    recall  f1-score   support

     AGAINST       0.69      0.98      0.81       189
       FAVOR       0.40      0.04      0.08        46

   micro avg       0.68      0.80      0.73       235
   macro avg       0.54      0.51      0.44       235
weighted avg       0.63      0.80      0.66       235

Elapsed time0.29268100000001596seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.69      0.96      0.80       189
       FAVOR       0.56      0.11      0.18        46

   micro avg       0.68      0.80      0.74       235
   macro avg       0.62      0.54      0.49       235
weighted avg       0.66      0.80      0.68       235

Elapsed time0.30544499999999175seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.69      0.96      0.81       189
       FAVOR       0.60      0.13      0.21        46

   micro avg       0.69      0.80      0.74       235
   macro avg       0.65      0.55      0.51       235
weighted avg       0.68      0.80      0.69       235

Elapsed time0.27664899999999193seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.70      0.95      0.81       189
       FAVOR       0.64      0.15      0.25        46

   micro avg       0.70      0.80      0.74       235
   macro avg       0.67      0.55      0.53       235
weighted avg       0.69      0.80      0.70       235

Elapsed time0.2931080000000179seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.71      0.93      0.80       189
       FAVOR       0.60      0.20      0.30        46

   micro avg       0.70      0.79      0.74       235
   macro avg       0.65      0.56      0.55       235
weighted avg       0.69      0.79      0.70       235

Elapsed time0.29744600000000787seconds
Iteration: 13




              precision    recall  f1-score   support

     AGAINST       0.72      0.93      0.81       189
       FAVOR       0.62      0.22      0.32        46

   micro avg       0.71      0.79      0.75       235
   macro avg       0.67      0.57      0.57       235
weighted avg       0.70      0.79      0.71       235

Elapsed time0.28132700000000455seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.71      0.91      0.80       189
       FAVOR       0.61      0.24      0.34        46

   micro avg       0.71      0.78      0.74       235
   macro avg       0.66      0.57      0.57       235
weighted avg       0.69      0.78      0.71       235

Elapsed time0.3064189999999769seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.71      0.90      0.80       189
       FAVOR       0.58      0.24      0.34        46

   micro avg       0.70      0.77      0.74       235
   macro avg       0.65      0.57      0.57       235
weighted avg       0.69      0.77      0.71       235

Elapsed time0.3049489999999935seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.71      0.89      0.79       189
       FAVOR       0.55      0.24      0.33        46

   micro avg       0.70      0.77      0.73       235
   macro avg       0.63      0.57      0.56       235
weighted avg       0.68      0.77      0.70       235

Elapsed time0.31049600000000055seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.71      0.89      0.79       189
       FAVOR       0.52      0.24      0.33        46

   micro avg       0.70      0.76      0.73       235
   macro avg       0.62      0.56      0.56       235
weighted avg       0.68      0.76      0.70       235

Elapsed time0.2906340000000114seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.71      0.87      0.78       189
       FAVOR       0.48      0.26      0.34        46

   micro avg       0.69      0.75      0.72       235
   macro avg       0.59      0.56      0.56       235
weighted avg       0.66      0.75      0.69       235

Elapsed time0.2956299999999885seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.72      0.87      0.79       189
       FAVOR       0.52      0.30      0.38        46

   micro avg       0.70      0.76      0.73       235
   macro avg       0.62      0.59      0.59       235
weighted avg       0.68      0.76      0.71       235

Elapsed time0.29749000000001047seconds
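
The elapsed-time lines in these logs run the label, the number and the unit together and print the full floating-point value. A minimal sketch of a more readable timing report follows; it uses `time.perf_counter` and a hypothetical `timed` wrapper, neither of which is confirmed to be what the notebook's training function uses:

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return its result together with the wall-clock time in seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Stand-in workload; in the notebook this would be one training iteration.
result, elapsed = timed(sum, range(1_000_000))
print(f"Elapsed time: {elapsed:.2f} seconds")
```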




In [28]:
# Same setup as above, but with the "simple_cnn" architecture (slower per iteration than "bow").
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

     AGAINST       0.68      1.00      0.81       189
       FAVOR       0.00      0.00      0.00        46

   micro avg       0.68      0.80      0.73       235
   macro avg       0.34      0.50      0.40       235
weighted avg       0.54      0.80      0.65       235

Elapsed time1.897892000000013seconds
Iteration: 1




              precision    recall  f1-score   support

     AGAINST       0.68      1.00      0.81       189
       FAVOR       0.00      0.00      0.00        46

   micro avg       0.68      0.80      0.73       235
   macro avg       0.34      0.50      0.40       235
weighted avg       0.54      0.80      0.65       235

Elapsed time1.5802309999999977seconds
Iteration: 2
              precision    recall  f1-score   support

     AGAINST       0.69      0.91      0.78       189
       FAVOR       0.17      0.02      0.04        46

   micro avg       0.67      0.74      0.70       235
   macro avg       0.43      0.47      0.41       235
weighted avg       0.58      0.74      0.64       235

Elapsed time1.5569500000000005seconds
Iteration: 3




              precision    recall  f1-score   support

     AGAINST       0.72      0.80      0.76       189
       FAVOR       0.45      0.22      0.29        46

   micro avg       0.70      0.69      0.69       235
   macro avg       0.59      0.51      0.53       235
weighted avg       0.67      0.69      0.67       235

Elapsed time1.5987550000000113seconds
Iteration: 4




              precision    recall  f1-score   support

     AGAINST       0.74      0.75      0.74       189
       FAVOR       0.44      0.35      0.39        46

   micro avg       0.69      0.67      0.68       235
   macro avg       0.59      0.55      0.57       235
weighted avg       0.68      0.67      0.67       235

Elapsed time1.5476390000000038seconds
Iteration: 5




              precision    recall  f1-score   support

     AGAINST       0.78      0.66      0.71       189
       FAVOR       0.47      0.50      0.48        46

   micro avg       0.71      0.63      0.67       235
   macro avg       0.63      0.58      0.60       235
weighted avg       0.72      0.63      0.67       235

Elapsed time1.5736279999999851seconds
Iteration: 6




              precision    recall  f1-score   support

     AGAINST       0.76      0.69      0.73       189
       FAVOR       0.48      0.52      0.50        46

   micro avg       0.70      0.66      0.68       235
   macro avg       0.62      0.61      0.61       235
weighted avg       0.71      0.66      0.68       235

Elapsed time1.5683670000000234seconds
Iteration: 7




              precision    recall  f1-score   support

     AGAINST       0.77      0.68      0.72       189
       FAVOR       0.51      0.50      0.51        46

   micro avg       0.71      0.65      0.68       235
   macro avg       0.64      0.59      0.61       235
weighted avg       0.72      0.65      0.68       235

Elapsed time1.883371000000011seconds
Iteration: 8




              precision    recall  f1-score   support

     AGAINST       0.77      0.68      0.72       189
       FAVOR       0.50      0.48      0.49        46

   micro avg       0.71      0.64      0.67       235
   macro avg       0.63      0.58      0.60       235
weighted avg       0.71      0.64      0.67       235

Elapsed time1.5636440000000107seconds
Iteration: 9




              precision    recall  f1-score   support

     AGAINST       0.76      0.70      0.73       189
       FAVOR       0.49      0.52      0.51        46

   micro avg       0.70      0.67      0.68       235
   macro avg       0.62      0.61      0.62       235
weighted avg       0.71      0.67      0.69       235

Elapsed time1.554395999999997seconds
Iteration: 10




              precision    recall  f1-score   support

     AGAINST       0.77      0.73      0.75       189
       FAVOR       0.48      0.46      0.47        46

   micro avg       0.71      0.68      0.69       235
   macro avg       0.62      0.59      0.61       235
weighted avg       0.71      0.68      0.69       235

Elapsed time1.5610929999999996seconds
Iteration: 11




              precision    recall  f1-score   support

     AGAINST       0.77      0.67      0.72       189
       FAVOR       0.41      0.43      0.42        46

   micro avg       0.69      0.63      0.65       235
   macro avg       0.59      0.55      0.57       235
weighted avg       0.70      0.63      0.66       235

Elapsed time1.546073000000007seconds
Iteration: 12




              precision    recall  f1-score   support

     AGAINST       0.77      0.65      0.70       189
       FAVOR       0.40      0.54      0.46        46

   micro avg       0.67      0.63      0.65       235
   macro avg       0.59      0.60      0.58       235
weighted avg       0.70      0.63      0.66       235

Elapsed time1.5546870000000013seconds
Iteration: 13




              precision    recall  f1-score   support

     AGAINST       0.76      0.66      0.70       189
       FAVOR       0.39      0.48      0.43        46

   micro avg       0.67      0.62      0.64       235
   macro avg       0.58      0.57      0.57       235
weighted avg       0.69      0.62      0.65       235

Elapsed time1.5648260000000107seconds
Iteration: 14




              precision    recall  f1-score   support

     AGAINST       0.77      0.67      0.71       189
       FAVOR       0.39      0.54      0.45        46

   micro avg       0.66      0.64      0.65       235
   macro avg       0.58      0.61      0.58       235
weighted avg       0.69      0.64      0.66       235

Elapsed time1.5699500000000057seconds
Iteration: 15




              precision    recall  f1-score   support

     AGAINST       0.76      0.67      0.71       189
       FAVOR       0.41      0.54      0.47        46

   micro avg       0.67      0.64      0.66       235
   macro avg       0.59      0.61      0.59       235
weighted avg       0.69      0.64      0.66       235

Elapsed time1.5683070000000043seconds
Iteration: 16




              precision    recall  f1-score   support

     AGAINST       0.77      0.65      0.70       189
       FAVOR       0.41      0.54      0.47        46

   micro avg       0.67      0.63      0.65       235
   macro avg       0.59      0.59      0.59       235
weighted avg       0.70      0.63      0.66       235

Elapsed time1.550896000000023seconds
Iteration: 17




              precision    recall  f1-score   support

     AGAINST       0.76      0.66      0.70       189
       FAVOR       0.40      0.50      0.44        46

   micro avg       0.67      0.63      0.64       235
   macro avg       0.58      0.58      0.57       235
weighted avg       0.69      0.63      0.65       235

Elapsed time1.5351030000000208seconds
Iteration: 18




              precision    recall  f1-score   support

     AGAINST       0.77      0.65      0.70       189
       FAVOR       0.39      0.48      0.43        46

   micro avg       0.67      0.61      0.64       235
   macro avg       0.58      0.56      0.57       235
weighted avg       0.69      0.61      0.65       235

Elapsed time1.5803410000000042seconds
Iteration: 19




              precision    recall  f1-score   support

     AGAINST       0.75      0.67      0.71       189
       FAVOR       0.40      0.46      0.43        46

   micro avg       0.67      0.63      0.65       235
   macro avg       0.58      0.56      0.57       235
weighted avg       0.69      0.63      0.65       235

Elapsed time1.5713640000000169seconds




## Fake News

In [29]:
from sklearn.model_selection import train_test_split

In [30]:
# Data path. Trial data used as training too.
# The TSV has no header row: column 0 holds the label, column 1 the text.
folder = "fake_rada"
target = "fake_news_full"
labels = ['legit', 'fake']
fake_news_file = f"../datasets/{folder}/{target}.tsv"
df = pd.read_csv(fake_news_file, sep='\t', header=None, encoding='utf-8')
df = df.rename(columns={0: "Class", 1: "Text"})
# Hold out 20% of the rows for testing (random split, no fixed seed).
training_data, test_data = train_test_split(df, test_size=0.2)
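
The split above is drawn without a fixed seed, so the class counts and the scores reported below change from run to run. scikit-learn's `train_test_split` accepts `random_state` and `stratify` arguments for this purpose. As a minimal standard-library sketch of the same idea (the helper `stratified_split` and the toy rows are hypothetical, not part of the lab code), a seeded, per-class split could look like this:

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, test_size=0.2, seed=42):
    """Split rows into (train, test), keeping class proportions in both parts."""
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    train, test = [], []
    for label in sorted(by_label):  # sorted for a deterministic class order
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = round(len(group) * test_size)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Toy data: 10 'legit' and 10 'fake' rows as (label, text) tuples.
rows = [("legit", f"legit text {i}") for i in range(10)]
rows += [("fake", f"fake text {i}") for i in range(10)]

train, test = stratified_split(rows, label_of=lambda r: r[0])
print(len(train), len(test))
```

With a fixed seed the same rows end up in the test set on every run, which makes the per-iteration reports comparable across architectures.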

In [31]:
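def load_data_spacy(data):
    """Convert a DataFrame with 'Class' and 'Text' columns into textcat training data."""
    # Show the class distribution of this split.
    print(data['Class'].value_counts())
    texts = data['Text'].tolist()
    cats = data['Class'].tolist()
    # Build one {'fake': 0/1, 'legit': 0/1} dict per document.
    final_cats = []
    for cat in cats:
        is_fake = int(cat == 'fake')
        final_cats.append({'fake': is_fake, 'legit': 1 - is_fake})
    # Pair each text with its annotation dict: (text, {"cats": {...}}).
    data = list(zip(texts, [{"cats": doc_cats} for doc_cats in final_cats]))
    # Also return the raw texts and string labels, which are used for evaluation.
    return data, texts, cats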
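
The label-to-dict step in `load_data_spacy` is hard-coded for the two classes `fake` and `legit`. As a minimal sketch, assuming the same `{"cats": {...}}` layout is wanted for other label sets, the hypothetical helpers `to_cats` and `to_training_pairs` (not part of the lab code) build the same structure for any list of labels:

```python
def to_cats(label, labels):
    """Return a one-hot {label_name: 0/1} dict for a single gold label."""
    if label not in labels:
        raise ValueError(f"unknown label: {label!r}")
    return {name: int(name == label) for name in labels}


def to_training_pairs(texts, gold_labels, labels):
    """Pair each text with its {"cats": {...}} annotation dict."""
    return [(text, {"cats": to_cats(gold, labels)})
            for text, gold in zip(texts, gold_labels)]


pairs = to_training_pairs(
    ["Aliens endorse candidate", "Council approves budget"],  # made-up toy texts
    ["fake", "legit"],
    labels=["fake", "legit"],
)
print(pairs[0])
# ('Aliens endorse candidate', {'cats': {'fake': 1, 'legit': 0}})
```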

In [32]:
training_data, train_texts, train_cats = load_data_spacy(training_data)
print("TOTAL    ", len(training_data))
test_data, test_texts, test_cats = load_data_spacy(test_data)
print("TOTAL    ", len(test_data))

legit    191
fake     190
Name: Class, dtype: int64
TOTAL     381
legit    49
fake     47
Name: Class, dtype: int64
TOTAL     96


In [33]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "bow")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

       legit       0.56      0.45      0.50        49
        fake       0.53      0.64      0.58        47

    accuracy                           0.54        96
   macro avg       0.55      0.54      0.54        96
weighted avg       0.55      0.54      0.54        96

Elapsed time0.8482089999999971seconds
Iteration: 1




              precision    recall  f1-score   support

       legit       0.52      0.45      0.48        49
        fake       0.50      0.57      0.53        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.37719199999997954seconds
Iteration: 2




              precision    recall  f1-score   support

       legit       0.53      0.53      0.53        49
        fake       0.51      0.51      0.51        47

    accuracy                           0.52        96
   macro avg       0.52      0.52      0.52        96
weighted avg       0.52      0.52      0.52        96

Elapsed time0.36866699999998787seconds
Iteration: 3




              precision    recall  f1-score   support

       legit       0.51      0.55      0.53        49
        fake       0.49      0.45      0.47        47

    accuracy                           0.50        96
   macro avg       0.50      0.50      0.50        96
weighted avg       0.50      0.50      0.50        96

Elapsed time0.35603799999998387seconds
Iteration: 4




              precision    recall  f1-score   support

       legit       0.51      0.47      0.49        49
        fake       0.49      0.53      0.51        47

    accuracy                           0.50        96
   macro avg       0.50      0.50      0.50        96
weighted avg       0.50      0.50      0.50        96

Elapsed time0.4717249999999922seconds
Iteration: 5




              precision    recall  f1-score   support

       legit       0.52      0.45      0.48        49
        fake       0.50      0.57      0.53        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.3652580000000114seconds
Iteration: 6




              precision    recall  f1-score   support

       legit       0.52      0.51      0.52        49
        fake       0.50      0.51      0.51        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.36406900000000064seconds
Iteration: 7




              precision    recall  f1-score   support

       legit       0.52      0.51      0.52        49
        fake       0.50      0.51      0.51        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.37922000000000367seconds
Iteration: 8




              precision    recall  f1-score   support

       legit       0.52      0.51      0.52        49
        fake       0.50      0.51      0.51        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.36904499999999985seconds
Iteration: 9




              precision    recall  f1-score   support

       legit       0.53      0.51      0.52        49
        fake       0.51      0.53      0.52        47

    accuracy                           0.52        96
   macro avg       0.52      0.52      0.52        96
weighted avg       0.52      0.52      0.52        96

Elapsed time0.3652189999999962seconds
Iteration: 10




              precision    recall  f1-score   support

       legit       0.52      0.49      0.51        49
        fake       0.50      0.53      0.52        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.3675389999999936seconds
Iteration: 11




              precision    recall  f1-score   support

       legit       0.52      0.49      0.51        49
        fake       0.50      0.53      0.52        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.3686079999999947seconds
Iteration: 12




              precision    recall  f1-score   support

       legit       0.52      0.49      0.51        49
        fake       0.50      0.53      0.52        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.3705050000000085seconds
Iteration: 13




              precision    recall  f1-score   support

       legit       0.52      0.49      0.51        49
        fake       0.50      0.53      0.52        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.33930999999998335seconds
Iteration: 14




              precision    recall  f1-score   support

       legit       0.52      0.49      0.51        49
        fake       0.50      0.53      0.52        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.3650329999999826seconds
Iteration: 15




              precision    recall  f1-score   support

       legit       0.52      0.49      0.51        49
        fake       0.50      0.53      0.52        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.36219299999999066seconds
Iteration: 16




              precision    recall  f1-score   support

       legit       0.52      0.49      0.51        49
        fake       0.50      0.53      0.52        47

    accuracy                           0.51        96
   macro avg       0.51      0.51      0.51        96
weighted avg       0.51      0.51      0.51        96

Elapsed time0.3487619999999936seconds
Iteration: 17




              precision    recall  f1-score   support

       legit       0.53      0.51      0.52        49
        fake       0.51      0.53      0.52        47

    accuracy                           0.52        96
   macro avg       0.52      0.52      0.52        96
weighted avg       0.52      0.52      0.52        96

Elapsed time0.3599179999999933seconds
Iteration: 18




              precision    recall  f1-score   support

       legit       0.53      0.51      0.52        49
        fake       0.51      0.53      0.52        47

    accuracy                           0.52        96
   macro avg       0.52      0.52      0.52        96
weighted avg       0.52      0.52      0.52        96

Elapsed time0.3596239999999966seconds
Iteration: 19




              precision    recall  f1-score   support

       legit       0.53      0.51      0.52        49
        fake       0.51      0.53      0.52        47

    accuracy                           0.52        96
   macro avg       0.52      0.52      0.52        96
weighted avg       0.52      0.52      0.52        96

Elapsed time0.3482869999999991seconds




In [34]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


              precision    recall  f1-score   support

       legit       0.51      1.00      0.68        49
        fake       0.00      0.00      0.00        47

    accuracy                           0.51        96
   macro avg       0.26      0.50      0.34        96
weighted avg       0.26      0.51      0.34        96

Elapsed time4.915413000000001seconds
Iteration: 1
              precision    recall  f1-score   support

       legit       0.67      0.04      0.08        49
        fake       0.49      0.98      0.66        47

    accuracy                           0.50        96
   macro avg       0.58      0.51      0.37        96
weighted avg       0.58      0.50      0.36        96

Elapsed time4.314307999999983seconds
Iteration: 2




              precision    recall  f1-score   support

       legit       0.70      0.65      0.67        49
        fake       0.66      0.70      0.68        47

    accuracy                           0.68        96
   macro avg       0.68      0.68      0.68        96
weighted avg       0.68      0.68      0.68        96

Elapsed time4.282456999999994seconds
Iteration: 3




              precision    recall  f1-score   support

       legit       0.64      0.71      0.67        49
        fake       0.66      0.57      0.61        47

    accuracy                           0.65        96
   macro avg       0.65      0.64      0.64        96
weighted avg       0.65      0.65      0.64        96

Elapsed time4.410654999999991seconds
Iteration: 4




              precision    recall  f1-score   support

       legit       0.64      0.78      0.70        49
        fake       0.70      0.55      0.62        47

    accuracy                           0.67        96
   macro avg       0.67      0.66      0.66        96
weighted avg       0.67      0.67      0.66        96

Elapsed time4.2757600000000195seconds
Iteration: 5




              precision    recall  f1-score   support

       legit       0.65      0.80      0.72        49
        fake       0.72      0.55      0.63        47

    accuracy                           0.68        96
   macro avg       0.69      0.67      0.67        96
weighted avg       0.69      0.68      0.67        96

Elapsed time4.30085600000001seconds
Iteration: 6




              precision    recall  f1-score   support

       legit       0.63      0.69      0.66        49
        fake       0.64      0.57      0.61        47

    accuracy                           0.64        96
   macro avg       0.64      0.63      0.63        96
weighted avg       0.64      0.64      0.63        96

Elapsed time4.285679000000016seconds
Iteration: 7




              precision    recall  f1-score   support

       legit       0.63      0.78      0.70        49
        fake       0.69      0.53      0.60        47

    accuracy                           0.66        96
   macro avg       0.66      0.65      0.65        96
weighted avg       0.66      0.66      0.65        96

Elapsed time4.226956999999999seconds
Iteration: 8




              precision    recall  f1-score   support

       legit       0.63      0.59      0.61        49
        fake       0.60      0.64      0.62        47

    accuracy                           0.61        96
   macro avg       0.62      0.62      0.61        96
weighted avg       0.62      0.61      0.61        96

Elapsed time4.190550999999999seconds
Iteration: 9




              precision    recall  f1-score   support

       legit       0.60      0.69      0.64        49
        fake       0.62      0.51      0.56        47

    accuracy                           0.60        96
   macro avg       0.61      0.60      0.60        96
weighted avg       0.61      0.60      0.60        96

Elapsed time4.2201670000000036seconds
Iteration: 10




              precision    recall  f1-score   support

       legit       0.59      0.61      0.60        49
        fake       0.58      0.55      0.57        47

    accuracy                           0.58        96
   macro avg       0.58      0.58      0.58        96
weighted avg       0.58      0.58      0.58        96

Elapsed time4.234679999999997seconds
Iteration: 11




              precision    recall  f1-score   support

       legit       0.63      0.55      0.59        49
        fake       0.58      0.66      0.62        47

    accuracy                           0.60        96
   macro avg       0.61      0.61      0.60        96
weighted avg       0.61      0.60      0.60        96

Elapsed time4.207504999999998seconds
Iteration: 12




              precision    recall  f1-score   support

       legit       0.62      0.53      0.57        49
        fake       0.57      0.66      0.61        47

    accuracy                           0.59        96
   macro avg       0.60      0.60      0.59        96
weighted avg       0.60      0.59      0.59        96

Elapsed time4.187690000000032seconds
Iteration: 13




              precision    recall  f1-score   support

       legit       0.61      0.55      0.58        49
        fake       0.58      0.64      0.61        47

    accuracy                           0.59        96
   macro avg       0.60      0.59      0.59        96
weighted avg       0.60      0.59      0.59        96

Elapsed time4.245648000000017seconds
Iteration: 14




              precision    recall  f1-score   support

       legit       0.56      0.57      0.57        49
        fake       0.54      0.53      0.54        47

    accuracy                           0.55        96
   macro avg       0.55      0.55      0.55        96
weighted avg       0.55      0.55      0.55        96

Elapsed time4.183709999999962seconds
Iteration: 15




              precision    recall  f1-score   support

       legit       0.56      0.57      0.57        49
        fake       0.54      0.53      0.54        47

    accuracy                           0.55        96
   macro avg       0.55      0.55      0.55        96
weighted avg       0.55      0.55      0.55        96

Elapsed time4.238297999999986seconds
Iteration: 16




              precision    recall  f1-score   support

       legit       0.56      0.61      0.58        49
        fake       0.55      0.49      0.52        47

    accuracy                           0.55        96
   macro avg       0.55      0.55      0.55        96
weighted avg       0.55      0.55      0.55        96

Elapsed time4.238778000000025seconds
Iteration: 17




              precision    recall  f1-score   support

       legit       0.55      0.59      0.57        49
        fake       0.53      0.49      0.51        47

    accuracy                           0.54        96
   macro avg       0.54      0.54      0.54        96
weighted avg       0.54      0.54      0.54        96

Elapsed time4.282365999999968seconds
Iteration: 18




              precision    recall  f1-score   support

       legit       0.54      0.67      0.60        49
        fake       0.54      0.40      0.46        47

    accuracy                           0.54        96
   macro avg       0.54      0.54      0.53        96
weighted avg       0.54      0.54      0.53        96

Elapsed time4.203459000000009seconds
Iteration: 19




              precision    recall  f1-score   support

       legit       0.55      0.63      0.59        49
        fake       0.55      0.47      0.51        47

    accuracy                           0.55        96
   macro avg       0.55      0.55      0.55        96
weighted avg       0.55      0.55      0.55        96

Elapsed time4.184121000000005seconds
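
Iteration 0 of the `simple_cnn` run above is a useful worked case for reading these reports. `legit` has recall 1.00 and `fake` has recall 0.00, which means all 96 test documents were predicted `legit` (assuming one predicted label per document). The sketch below recomputes that report from the resulting counts. The helper name `prf` is hypothetical; it returns 0.0 for an undefined ratio, which is also the value scikit-learn reports when it emits the `_warn_prf` warnings for a class with no predicted samples.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts; undefined ratios become 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Every document predicted 'legit': 49 are correct, 47 are actually 'fake'.
legit = prf(tp=49, fp=47, fn=0)
fake = prf(tp=0, fp=0, fn=47)

macro_f1 = (legit[2] + fake[2]) / 2                 # unweighted mean over classes
weighted_f1 = (49 * legit[2] + 47 * fake[2]) / 96   # mean weighted by support

print([round(x, 2) for x in legit])                 # [0.51, 1.0, 0.68]
print([round(x, 2) for x in fake])                  # [0.0, 0.0, 0.0]
print(round(macro_f1, 2), round(weighted_f1, 2))    # 0.34 0.34
```

The printed values match the `legit`, `fake`, `macro avg` and `weighted avg` rows of that report, so an accuracy of 0.51 there reflects only the class balance of the test set.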




## Fake News Celebrity

In [35]:
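# Same loading steps for the celebrity dataset; `folder` is reused from the Fake News section.
target = "celebrity_full"
celebrity_file = f"../datasets/{folder}/{target}.tsv"
df = pd.read_csv(celebrity_file, sep='\t', header=None, encoding='utf-8')
df = df.rename(columns={0: "Class", 1: "Text"})
# Hold out 20% of the rows for testing (random split, no fixed seed).
training_data, test_data = train_test_split(df, test_size=0.2)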

In [36]:
training_data, train_texts, train_cats = load_data_spacy(training_data)
print("TOTAL    ", len(training_data))
test_data, test_texts, test_cats = load_data_spacy(test_data)
print("TOTAL    ", len(test_data))

legit    203
fake     197
Name: Class, dtype: int64
TOTAL     400
fake     53
legit    47
Name: Class, dtype: int64
TOTAL     100


In [37]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "bow")

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

       legit       0.53      0.87      0.66        47
        fake       0.73      0.30      0.43        53

    accuracy                           0.57       100
   macro avg       0.63      0.59      0.54       100
weighted avg       0.63      0.57      0.53       100

Elapsed time2.3798380000000066seconds
Iteration: 1




              precision    recall  f1-score   support

       legit       0.59      0.57      0.58        47
        fake       0.63      0.64      0.64        53

    accuracy                           0.61       100
   macro avg       0.61      0.61      0.61       100
weighted avg       0.61      0.61      0.61       100

Elapsed time1.6119659999999953seconds
Iteration: 2




              precision    recall  f1-score   support

       legit       0.64      0.81      0.72        47
        fake       0.78      0.60      0.68        53

    accuracy                           0.70       100
   macro avg       0.71      0.71      0.70       100
weighted avg       0.72      0.70      0.70       100

Elapsed time1.6241150000000175seconds
Iteration: 3




              precision    recall  f1-score   support

       legit       0.63      0.70      0.67        47
        fake       0.71      0.64      0.67        53

    accuracy                           0.67       100
   macro avg       0.67      0.67      0.67       100
weighted avg       0.67      0.67      0.67       100

Elapsed time1.583264999999983seconds
Iteration: 4




              precision    recall  f1-score   support

       legit       0.64      0.74      0.69        47
        fake       0.73      0.62      0.67        53

    accuracy                           0.68       100
   macro avg       0.68      0.68      0.68       100
weighted avg       0.69      0.68      0.68       100

Elapsed time1.5874389999999607seconds
Iteration: 5




              precision    recall  f1-score   support

       legit       0.67      0.68      0.67        47
        fake       0.71      0.70      0.70        53

    accuracy                           0.69       100
   macro avg       0.69      0.69      0.69       100
weighted avg       0.69      0.69      0.69       100

Elapsed time1.5827390000000037seconds
Iteration: 6




              precision    recall  f1-score   support

       legit       0.65      0.64      0.65        47
        fake       0.69      0.70      0.69        53

    accuracy                           0.67       100
   macro avg       0.67      0.67      0.67       100
weighted avg       0.67      0.67      0.67       100

Elapsed time1.5702269999999885seconds
Iteration: 7




              precision    recall  f1-score   support

       legit       0.66      0.74      0.70        47
        fake       0.74      0.66      0.70        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.71      0.70      0.70       100

Elapsed time1.5910370000000285seconds
Iteration: 8




              precision    recall  f1-score   support

       legit       0.65      0.70      0.67        47
        fake       0.71      0.66      0.69        53

    accuracy                           0.68       100
   macro avg       0.68      0.68      0.68       100
weighted avg       0.68      0.68      0.68       100

Elapsed time1.5861289999999713seconds
Iteration: 9




              precision    recall  f1-score   support

       legit       0.65      0.72      0.69        47
        fake       0.73      0.66      0.69        53

    accuracy                           0.69       100
   macro avg       0.69      0.69      0.69       100
weighted avg       0.69      0.69      0.69       100

Elapsed time1.6724570000000085seconds
Iteration: 10




              precision    recall  f1-score   support

       legit       0.65      0.72      0.69        47
        fake       0.73      0.66      0.69        53

    accuracy                           0.69       100
   macro avg       0.69      0.69      0.69       100
weighted avg       0.69      0.69      0.69       100

Elapsed time1.593291000000022seconds
Iteration: 11




              precision    recall  f1-score   support

       legit       0.66      0.74      0.70        47
        fake       0.74      0.66      0.70        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.71      0.70      0.70       100

Elapsed time1.580329000000006seconds
Iteration: 12




              precision    recall  f1-score   support

       legit       0.66      0.74      0.70        47
        fake       0.74      0.66      0.70        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.71      0.70      0.70       100

Elapsed time1.635355000000004seconds
Iteration: 13




              precision    recall  f1-score   support

       legit       0.67      0.85      0.75        47
        fake       0.82      0.62      0.71        53

    accuracy                           0.73       100
   macro avg       0.75      0.74      0.73       100
weighted avg       0.75      0.73      0.73       100

Elapsed time1.6267869999999789seconds
Iteration: 14




              precision    recall  f1-score   support

       legit       0.66      0.83      0.74        47
        fake       0.80      0.62      0.70        53

    accuracy                           0.72       100
   macro avg       0.73      0.73      0.72       100
weighted avg       0.74      0.72      0.72       100

Elapsed time1.6224260000000186seconds
Iteration: 15




              precision    recall  f1-score   support

       legit       0.66      0.79      0.72        47
        fake       0.77      0.64      0.70        53

    accuracy                           0.71       100
   macro avg       0.72      0.71      0.71       100
weighted avg       0.72      0.71      0.71       100

Elapsed time1.6438770000000318seconds
Iteration: 16




              precision    recall  f1-score   support

       legit       0.65      0.77      0.71        47
        fake       0.76      0.64      0.69        53

    accuracy                           0.70       100
   macro avg       0.71      0.70      0.70       100
weighted avg       0.71      0.70      0.70       100

Elapsed time1.6069089999999733seconds
Iteration: 17




              precision    recall  f1-score   support

       legit       0.65      0.77      0.71        47
        fake       0.76      0.64      0.69        53

    accuracy                           0.70       100
   macro avg       0.71      0.70      0.70       100
weighted avg       0.71      0.70      0.70       100

Elapsed time1.6013740000000212seconds
Iteration: 18




              precision    recall  f1-score   support

       legit       0.63      0.68      0.65        47
        fake       0.69      0.64      0.67        53

    accuracy                           0.66       100
   macro avg       0.66      0.66      0.66       100
weighted avg       0.66      0.66      0.66       100

Elapsed time1.7078610000000367seconds
Iteration: 19




              precision    recall  f1-score   support

       legit       0.63      0.68      0.65        47
        fake       0.69      0.64      0.67        53

    accuracy                           0.66       100
   macro avg       0.66      0.66      0.66       100
weighted avg       0.66      0.66      0.66       100

Elapsed time1.5967820000000188seconds
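In the run above, accuracy falls from 0.72 at iteration 14 to 0.66 at iteration 19, so the model left after the last iteration is not the best one seen during training. The following is a minimal sketch of picking the best iteration, assuming the per-iteration scores have been collected in a dictionary. `best_iteration` is a hypothetical helper and is not part of the notebook's `train_spacy`.

```python
# Accuracy per iteration, copied from the last six reports of the run above.
accuracy_by_iteration = {14: 0.72, 15: 0.71, 16: 0.70, 17: 0.70, 18: 0.66, 19: 0.66}


def best_iteration(scores):
    """Return (iteration, score) for the highest score; ties go to the earliest iteration."""
    best_it = min(scores, key=lambda it: (-scores[it], it))
    return best_it, scores[best_it]


best_it, best_acc = best_iteration(accuracy_by_iteration)
last_it = max(accuracy_by_iteration)
print(f"best: iteration {best_it} (accuracy {best_acc:.2f})")
print(f"last: iteration {last_it} (accuracy {accuracy_by_iteration[last_it]:.2f})")
```

Choosing the iteration on the same 100 test documents makes the reported score optimistic. A separate development split would avoid that.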




In [38]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn", model=en_core_web_sm)

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

       legit       0.47      1.00      0.64        47
        fake       0.00      0.00      0.00        53

    accuracy                           0.47       100
   macro avg       0.23      0.50      0.32       100
weighted avg       0.22      0.47      0.30       100

Elapsed time17.67125599999997seconds
Iteration: 1




              precision    recall  f1-score   support

       legit       0.47      1.00      0.64        47
        fake       0.00      0.00      0.00        53

    accuracy                           0.47       100
   macro avg       0.23      0.50      0.32       100
weighted avg       0.22      0.47      0.30       100

Elapsed time16.994145000000003seconds
Iteration: 2
              precision    recall  f1-score   support

       legit       0.66      0.74      0.70        47
        fake       0.74      0.66      0.70        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.71      0.70      0.70       100

Elapsed time17.499482999999998seconds
Iteration: 3




              precision    recall  f1-score   support

       legit       0.57      0.96      0.71        47
        fake       0.90      0.36      0.51        53

    accuracy                           0.64       100
   macro avg       0.74      0.66      0.61       100
weighted avg       0.75      0.64      0.61       100

Elapsed time16.905857000000026seconds
Iteration: 4




              precision    recall  f1-score   support

       legit       0.65      0.68      0.67        47
        fake       0.71      0.68      0.69        53

    accuracy                           0.68       100
   macro avg       0.68      0.68      0.68       100
weighted avg       0.68      0.68      0.68       100

Elapsed time16.967750999999964seconds
Iteration: 5




              precision    recall  f1-score   support

       legit       0.66      0.74      0.70        47
        fake       0.74      0.66      0.70        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.71      0.70      0.70       100

Elapsed time17.040886seconds
Iteration: 6




              precision    recall  f1-score   support

       legit       0.64      0.79      0.70        47
        fake       0.76      0.60      0.67        53

    accuracy                           0.69       100
   macro avg       0.70      0.70      0.69       100
weighted avg       0.70      0.69      0.69       100

Elapsed time16.82744299999996seconds
Iteration: 7




              precision    recall  f1-score   support

       legit       0.74      0.55      0.63        47
        fake       0.68      0.83      0.75        53

    accuracy                           0.70       100
   macro avg       0.71      0.69      0.69       100
weighted avg       0.71      0.70      0.69       100

Elapsed time16.96626600000002seconds
Iteration: 8




              precision    recall  f1-score   support

       legit       0.69      0.72      0.71        47
        fake       0.75      0.72      0.73        53

    accuracy                           0.72       100
   macro avg       0.72      0.72      0.72       100
weighted avg       0.72      0.72      0.72       100

Elapsed time17.05890099999999seconds
Iteration: 9




              precision    recall  f1-score   support

       legit       0.68      0.68      0.68        47
        fake       0.72      0.72      0.72        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.70      0.70      0.70       100

Elapsed time16.92929300000003seconds
Iteration: 10




              precision    recall  f1-score   support

       legit       0.65      0.79      0.71        47
        fake       0.77      0.62      0.69        53

    accuracy                           0.70       100
   macro avg       0.71      0.70      0.70       100
weighted avg       0.71      0.70      0.70       100

Elapsed time16.94512800000001seconds
Iteration: 11




              precision    recall  f1-score   support

       legit       0.67      0.64      0.65        47
        fake       0.69      0.72      0.70        53

    accuracy                           0.68       100
   macro avg       0.68      0.68      0.68       100
weighted avg       0.68      0.68      0.68       100

Elapsed time17.215191000000004seconds
Iteration: 12




              precision    recall  f1-score   support

       legit       0.66      0.79      0.72        47
        fake       0.77      0.64      0.70        53

    accuracy                           0.71       100
   macro avg       0.72      0.71      0.71       100
weighted avg       0.72      0.71      0.71       100

Elapsed time16.90057999999999seconds
Iteration: 13




              precision    recall  f1-score   support

       legit       0.69      0.70      0.69        47
        fake       0.73      0.72      0.72        53

    accuracy                           0.71       100
   macro avg       0.71      0.71      0.71       100
weighted avg       0.71      0.71      0.71       100

Elapsed time16.969431999999983seconds
Iteration: 14




              precision    recall  f1-score   support

       legit       0.65      0.72      0.69        47
        fake       0.73      0.66      0.69        53

    accuracy                           0.69       100
   macro avg       0.69      0.69      0.69       100
weighted avg       0.69      0.69      0.69       100

Elapsed time17.370791999999938seconds
Iteration: 15




              precision    recall  f1-score   support

       legit       0.65      0.72      0.69        47
        fake       0.73      0.66      0.69        53

    accuracy                           0.69       100
   macro avg       0.69      0.69      0.69       100
weighted avg       0.69      0.69      0.69       100

Elapsed time17.264425000000074seconds
Iteration: 16




              precision    recall  f1-score   support

       legit       0.67      0.77      0.71        47
        fake       0.76      0.66      0.71        53

    accuracy                           0.71       100
   macro avg       0.71      0.71      0.71       100
weighted avg       0.72      0.71      0.71       100

Elapsed time17.14454999999998seconds
Iteration: 17




              precision    recall  f1-score   support

       legit       0.66      0.70      0.68        47
        fake       0.72      0.68      0.70        53

    accuracy                           0.69       100
   macro avg       0.69      0.69      0.69       100
weighted avg       0.69      0.69      0.69       100

Elapsed time17.22563600000001seconds
Iteration: 18




              precision    recall  f1-score   support

       legit       0.67      0.72      0.69        47
        fake       0.73      0.68      0.71        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.70      0.70      0.70       100

Elapsed time17.02709600000003seconds
Iteration: 19




              precision    recall  f1-score   support

       legit       0.68      0.68      0.68        47
        fake       0.72      0.72      0.72        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.70      0.70      0.70       100

Elapsed time17.036455000000046seconds
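In iterations 0 and 1 of this run the model labels every test document as legit. Precision for fake is then undefined (0/0) and the report shows it as 0.00. The sketch below recomputes both rows from raw counts in plain Python. `prf` is a hypothetical helper, and treating 0/0 as 0.0 is an assumption chosen to match the report.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from counts; an undefined ratio (0/0) is reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Test set from the reports above: 47 legit and 53 fake documents.
n_legit, n_fake = 47, 53

# Case: every document is predicted as legit.
legit = prf(tp=n_legit, fp=n_fake, fn=0)  # all 47 legit found, 53 fake mislabelled as legit
fake = prf(tp=0, fp=0, fn=n_fake)         # no document was ever predicted as fake

for name, (p, r, f) in (("legit", legit), ("fake", fake)):
    print(f"{name:>5}  {p:.2f}  {r:.2f}  {f:.2f}")
```

The legit row comes out as 0.47 / 1.00 / 0.64, matching the report. The 0.47 accuracy is the share of legit documents in the test set, so the model has not learned anything yet at that point.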




In [39]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn", model=en_core_web_md)

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

       legit       1.00      0.02      0.04        47
        fake       0.54      1.00      0.70        53

    accuracy                           0.54       100
   macro avg       0.77      0.51      0.37       100
weighted avg       0.75      0.54      0.39       100

Elapsed time17.23968000000002seconds
Iteration: 1




              precision    recall  f1-score   support

       legit       0.69      0.53      0.60        47
        fake       0.66      0.79      0.72        53

    accuracy                           0.67       100
   macro avg       0.68      0.66      0.66       100
weighted avg       0.67      0.67      0.66       100

Elapsed time16.968157999999903seconds
Iteration: 2




              precision    recall  f1-score   support

       legit       0.60      0.79      0.68        47
        fake       0.74      0.53      0.62        53

    accuracy                           0.65       100
   macro avg       0.67      0.66      0.65       100
weighted avg       0.67      0.65      0.65       100

Elapsed time16.641706999999997seconds
Iteration: 3




              precision    recall  f1-score   support

       legit       0.57      0.89      0.69        47
        fake       0.81      0.40      0.53        53

    accuracy                           0.63       100
   macro avg       0.69      0.64      0.61       100
weighted avg       0.69      0.63      0.61       100

Elapsed time16.74393699999996seconds
Iteration: 4




              precision    recall  f1-score   support

       legit       0.83      0.40      0.54        47
        fake       0.64      0.92      0.75        53

    accuracy                           0.68       100
   macro avg       0.73      0.66      0.65       100
weighted avg       0.73      0.68      0.65       100

Elapsed time16.545286999999917seconds
Iteration: 5




              precision    recall  f1-score   support

       legit       0.57      0.98      0.72        47
        fake       0.95      0.34      0.50        53

    accuracy                           0.64       100
   macro avg       0.76      0.66      0.61       100
weighted avg       0.77      0.64      0.60       100

Elapsed time16.627117999999996seconds
Iteration: 6




              precision    recall  f1-score   support

       legit       0.72      0.49      0.58        47
        fake       0.65      0.83      0.73        53

    accuracy                           0.67       100
   macro avg       0.68      0.66      0.65       100
weighted avg       0.68      0.67      0.66       100

Elapsed time16.61999800000001seconds
Iteration: 7




              precision    recall  f1-score   support

       legit       0.66      0.87      0.75        47
        fake       0.84      0.60      0.70        53

    accuracy                           0.73       100
   macro avg       0.75      0.74      0.73       100
weighted avg       0.76      0.73      0.73       100

Elapsed time16.89749000000006seconds
Iteration: 8




              precision    recall  f1-score   support

       legit       0.69      0.72      0.71        47
        fake       0.75      0.72      0.73        53

    accuracy                           0.72       100
   macro avg       0.72      0.72      0.72       100
weighted avg       0.72      0.72      0.72       100

Elapsed time17.137474999999995seconds
Iteration: 9




              precision    recall  f1-score   support

       legit       0.70      0.66      0.68        47
        fake       0.71      0.75      0.73        53

    accuracy                           0.71       100
   macro avg       0.71      0.71      0.71       100
weighted avg       0.71      0.71      0.71       100

Elapsed time17.610584000000017seconds
Iteration: 10




              precision    recall  f1-score   support

       legit       0.69      0.77      0.73        47
        fake       0.77      0.70      0.73        53

    accuracy                           0.73       100
   macro avg       0.73      0.73      0.73       100
weighted avg       0.73      0.73      0.73       100

Elapsed time17.413449999999898seconds
Iteration: 11




              precision    recall  f1-score   support

       legit       0.66      0.85      0.74        47
        fake       0.82      0.60      0.70        53

    accuracy                           0.72       100
   macro avg       0.74      0.73      0.72       100
weighted avg       0.74      0.72      0.72       100

Elapsed time17.55959900000005seconds
Iteration: 12




              precision    recall  f1-score   support

       legit       0.68      0.85      0.75        47
        fake       0.83      0.64      0.72        53

    accuracy                           0.74       100
   macro avg       0.75      0.75      0.74       100
weighted avg       0.76      0.74      0.74       100

Elapsed time16.88738699999999seconds
Iteration: 13




              precision    recall  f1-score   support

       legit       0.68      0.85      0.75        47
        fake       0.83      0.64      0.72        53

    accuracy                           0.74       100
   macro avg       0.75      0.75      0.74       100
weighted avg       0.76      0.74      0.74       100

Elapsed time16.851299999999924seconds
Iteration: 14




              precision    recall  f1-score   support

       legit       0.67      0.83      0.74        47
        fake       0.81      0.64      0.72        53

    accuracy                           0.73       100
   macro avg       0.74      0.74      0.73       100
weighted avg       0.75      0.73      0.73       100

Elapsed time16.79320000000007seconds
Iteration: 15




              precision    recall  f1-score   support

       legit       0.66      0.83      0.74        47
        fake       0.80      0.62      0.70        53

    accuracy                           0.72       100
   macro avg       0.73      0.73      0.72       100
weighted avg       0.74      0.72      0.72       100

Elapsed time16.869541000000027seconds
Iteration: 16




              precision    recall  f1-score   support

       legit       0.66      0.87      0.75        47
        fake       0.84      0.60      0.70        53

    accuracy                           0.73       100
   macro avg       0.75      0.74      0.73       100
weighted avg       0.76      0.73      0.73       100

Elapsed time16.706118999999944seconds
Iteration: 17




              precision    recall  f1-score   support

       legit       0.66      0.87      0.75        47
        fake       0.84      0.60      0.70        53

    accuracy                           0.73       100
   macro avg       0.75      0.74      0.73       100
weighted avg       0.76      0.73      0.73       100

Elapsed time16.930451999999946seconds
Iteration: 18




              precision    recall  f1-score   support

       legit       0.66      0.87      0.75        47
        fake       0.84      0.60      0.70        53

    accuracy                           0.73       100
   macro avg       0.75      0.74      0.73       100
weighted avg       0.76      0.73      0.73       100

Elapsed time16.890479000000028seconds
Iteration: 19




              precision    recall  f1-score   support

       legit       0.67      0.87      0.76        47
        fake       0.85      0.62      0.72        53

    accuracy                           0.74       100
   macro avg       0.76      0.75      0.74       100
weighted avg       0.76      0.74      0.74       100

Elapsed time16.900120000000015seconds
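The reports are printed as free text, which makes the runs hard to compare side by side. The sketch below pulls the accuracy and elapsed time of each iteration out of such a log with regular expressions. `parse_log` is a hypothetical helper, and the sample text is copied from iterations 18 and 19 above with the per-class rows left out.

```python
import re

# Two iterations copied from the output above (class rows omitted for brevity).
sample_log = """\
Iteration: 18
    accuracy                           0.73       100
Elapsed time16.890479000000028seconds
Iteration: 19
    accuracy                           0.74       100
Elapsed time16.900120000000015seconds
"""

ITERATION_RE = re.compile(r"Iteration:\s*(\d+)")
ACCURACY_RE = re.compile(r"\s*accuracy\s+([\d.]+)\s+\d+")
ELAPSED_RE = re.compile(r"Elapsed time\s*([\d.]+)\s*seconds")


def parse_log(text):
    """Return one dict per iteration with its accuracy and elapsed seconds."""
    rows = []
    for line in text.splitlines():
        match = ITERATION_RE.match(line)
        if match:
            rows.append({"iteration": int(match.group(1))})
            continue
        if not rows:
            continue  # ignore anything printed before the first iteration
        match = ACCURACY_RE.match(line)
        if match:
            rows[-1]["accuracy"] = float(match.group(1))
            continue
        match = ELAPSED_RE.match(line)
        if match:
            rows[-1]["seconds"] = float(match.group(1))
    return rows


for row in parse_log(sample_log):
    print(row)
```

Returning the scores from the training function directly would be more robust than parsing printed text. This approach is only useful when the log is all that remains.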




In [40]:
nlp = train_spacy(training_data, 20, test_texts, test_cats, "simple_cnn", model=en_core_web_lg)

Training the model...
LOSS 	  P  	  R  	  F  
Iteration: 0




              precision    recall  f1-score   support

       legit       0.00      0.00      0.00        47
        fake       0.53      1.00      0.69        53

    accuracy                           0.53       100
   macro avg       0.27      0.50      0.35       100
weighted avg       0.28      0.53      0.37       100

Elapsed time17.567030999999815seconds
Iteration: 1
              precision    recall  f1-score   support

       legit       0.47      1.00      0.64        47
        fake       1.00      0.02      0.04        53

    accuracy                           0.48       100
   macro avg       0.74      0.51      0.34       100
weighted avg       0.75      0.48      0.32       100

Elapsed time17.0897030000001seconds
Iteration: 2




              precision    recall  f1-score   support

       legit       0.85      0.23      0.37        47
        fake       0.59      0.96      0.73        53

    accuracy                           0.62       100
   macro avg       0.72      0.60      0.55       100
weighted avg       0.71      0.62      0.56       100

Elapsed time17.195531999999957seconds
Iteration: 3




              precision    recall  f1-score   support

       legit       0.77      0.51      0.62        47
        fake       0.67      0.87      0.75        53

    accuracy                           0.70       100
   macro avg       0.72      0.69      0.68       100
weighted avg       0.72      0.70      0.69       100

Elapsed time17.47464600000012seconds
Iteration: 4




              precision    recall  f1-score   support

       legit       0.60      0.89      0.72        47
        fake       0.83      0.47      0.60        53

    accuracy                           0.67       100
   macro avg       0.72      0.68      0.66       100
weighted avg       0.72      0.67      0.66       100

Elapsed time17.343699000000015seconds
Iteration: 5




              precision    recall  f1-score   support

       legit       0.64      0.64      0.64        47
        fake       0.68      0.68      0.68        53

    accuracy                           0.66       100
   macro avg       0.66      0.66      0.66       100
weighted avg       0.66      0.66      0.66       100

Elapsed time17.195130999999947seconds
Iteration: 6




              precision    recall  f1-score   support

       legit       0.65      0.64      0.65        47
        fake       0.69      0.70      0.69        53

    accuracy                           0.67       100
   macro avg       0.67      0.67      0.67       100
weighted avg       0.67      0.67      0.67       100

Elapsed time17.370769999999993seconds
Iteration: 7




              precision    recall  f1-score   support

       legit       0.70      0.64      0.67        47
        fake       0.70      0.75      0.73        53

    accuracy                           0.70       100
   macro avg       0.70      0.70      0.70       100
weighted avg       0.70      0.70      0.70       100

Elapsed time17.820760000000064seconds
Iteration: 8




              precision    recall  f1-score   support

       legit       0.70      0.70      0.70        47
        fake       0.74      0.74      0.74        53

    accuracy                           0.72       100
   macro avg       0.72      0.72      0.72       100
weighted avg       0.72      0.72      0.72       100

Elapsed time18.07469900000001seconds
Iteration: 9




              precision    recall  f1-score   support

       legit       0.72      0.70      0.71        47
        fake       0.74      0.75      0.75        53

    accuracy                           0.73       100
   macro avg       0.73      0.73      0.73       100
weighted avg       0.73      0.73      0.73       100

Elapsed time17.842689999999948seconds
Iteration: 10




              precision    recall  f1-score   support

       legit       0.63      0.77      0.69        47
        fake       0.74      0.60      0.67        53

    accuracy                           0.68       100
   macro avg       0.69      0.68      0.68       100
weighted avg       0.69      0.68      0.68       100

Elapsed time17.964079000000083seconds
Iteration: 11




              precision    recall  f1-score   support

       legit       0.65      0.85      0.73        47
        fake       0.82      0.58      0.68        53

    accuracy                           0.71       100
   macro avg       0.73      0.72      0.71       100
weighted avg       0.74      0.71      0.71       100

Elapsed time17.398777999999993seconds
Iteration: 12




              precision    recall  f1-score   support

       legit       0.63      0.81      0.71        47
        fake       0.78      0.58      0.67        53

    accuracy                           0.69       100
   macro avg       0.70      0.70      0.69       100
weighted avg       0.71      0.69      0.69       100

Elapsed time17.1616190000002seconds
Iteration: 13




              precision    recall  f1-score   support

       legit       0.68      0.77      0.72        47
        fake       0.77      0.68      0.72        53

    accuracy                           0.72       100
   macro avg       0.72      0.72      0.72       100
weighted avg       0.73      0.72      0.72       100

Elapsed time17.224954000000025seconds
Iteration: 14




              precision    recall  f1-score   support

       legit       0.66      0.85      0.74        47
        fake       0.82      0.60      0.70        53

    accuracy                           0.72       100
   macro avg       0.74      0.73      0.72       100
weighted avg       0.74      0.72      0.72       100

Elapsed time17.188654000000042seconds
Iteration: 15




              precision    recall  f1-score   support

       legit       0.68      0.85      0.75        47
        fake       0.83      0.64      0.72        53

    accuracy                           0.74       100
   macro avg       0.75      0.75      0.74       100
weighted avg       0.76      0.74      0.74       100

Elapsed time17.19825199999991seconds
Iteration: 16




              precision    recall  f1-score   support

       legit       0.70      0.79      0.74        47
        fake       0.79      0.70      0.74        53

    accuracy                           0.74       100
   macro avg       0.74      0.74      0.74       100
weighted avg       0.75      0.74      0.74       100

Elapsed time17.11533499999996seconds
Iteration: 17




              precision    recall  f1-score   support

       legit       0.68      0.64      0.66        47
        fake       0.70      0.74      0.72        53

    accuracy                           0.69       100
   macro avg       0.69      0.69      0.69       100
weighted avg       0.69      0.69      0.69       100

Elapsed time17.29797399999984seconds
Iteration: 18




              precision    recall  f1-score   support

       legit       0.72      0.66      0.69        47
        fake       0.72      0.77      0.75        53

    accuracy                           0.72       100
   macro avg       0.72      0.72      0.72       100
weighted avg       0.72      0.72      0.72       100

Elapsed time17.21510400000011seconds
Iteration: 19




              precision    recall  f1-score   support

       legit       0.68      0.72      0.70        47
        fake       0.74      0.70      0.72        53

    accuracy                           0.71       100
   macro avg       0.71      0.71      0.71       100
weighted avg       0.71      0.71      0.71       100

Elapsed time17.19644600000015seconds
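After 20 iterations the three "simple_cnn" runs end at accuracies of 0.70 (en_core_web_sm), 0.74 (en_core_web_md) and 0.71 (en_core_web_lg). With only 100 test documents, differences of this size may well be noise. A rough way to check is a normal-approximation interval around each accuracy, sketched below. `accuracy_interval` is a hypothetical helper, and the approximation is only a coarse guide at this sample size.

```python
import math


def accuracy_interval(accuracy, n, z=1.96):
    """Approximate 95% interval for an accuracy measured on n documents (normal approximation)."""
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / n)
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# Accuracy at iteration 19 of each "simple_cnn" run above, on the same 100 test documents.
final_accuracy = {"en_core_web_sm": 0.70, "en_core_web_md": 0.74, "en_core_web_lg": 0.71}

for model_name, acc in final_accuracy.items():
    low, high = accuracy_interval(acc, n=100)
    print(f"{model_name}: {acc:.2f}  (approx. 95% interval {low:.2f} to {high:.2f})")
```

Each interval is roughly nine points wide on either side and all three overlap, so these runs alone do not show that the larger vector models classify better on this test set. A larger test set or repeated runs with different seeds would give a firmer comparison.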


