# Assignment 2

Date: 29-10-2020 <br>
Nick Radunovic (s2072724) <br>
Cheyenne Heath (s1647865) <br>

Tasks:
1. Download W-NUT_data.zip from the Brightspace assignment and unzip the directory. It
contains 3 IOB files: wnut17train.conll (train), emerging.dev.conll (dev),
emerging.test.annotated (test)
2. The IOB files do not contain POS tags yet. Add a function to your CRFsuite script that reads
the IOB files and adds POS tags (using an existing package for linguistic processing such as
Spacy or NLTK). The data needs to be stored in the same way as the benchmark data from
the tutorial (an array of triples (word,pos,biotag)).
3. Run a baseline run (train -> test) with the features directly copied from the tutorial.
4. Set up hyperparameter optimization using the dev set and evaluate the result on the test set.
5. Extend the features: add a larger context (-2 .. +2 or more) and engineer a few other features
that might be relevant for this task. Have a look at the train/dev data to get inspiration on
potentially relevant features.
6. Experiment with the effect of different feature sets on the quality of the labelling.

In [118]:
#Imports
import random
random.seed(30) # set random seed for reproducibility

import numpy as np
from itertools import chain
from collections import Counter
import eli5

import nltk
nltk.download('averaged_perceptron_tagger')
import sklearn
import scipy.stats
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RandomizedSearchCV

import sklearn_crfsuite
from sklearn_crfsuite import scorers
from sklearn_crfsuite import metrics

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\Stand\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


Function to read in the data per sentence. 

In [119]:
def parse_data(file):
    """Read a tab-separated IOB file into a list of sentences of (word, biotag) tuples."""
    sents = []
    new_sent = []
    with open(file, encoding='utf-8') as fp:
        for line in fp:
            if not line.strip():
                # blank line marks the end of a sentence; skip repeated blank lines
                if new_sent:
                    sents.append(new_sent)
                    new_sent = []
            else:
                # create a (word, biotag) tuple and add it to the current sentence
                new_sent.append(tuple(line.strip().split('\t')))
    # keep the last sentence when the file does not end with a blank line
    if new_sent:
        sents.append(new_sent)
    return sents

In [120]:
#parse all files
train_sents = parse_data('wnut17train.conll')
dev_sents = parse_data('emerging.dev.conll')
test_sents = parse_data('emerging.test.annotated')

dev_sents[0]

[('Stabilized', 'O'),
 ('approach', 'O'),
 ('or', 'O'),
 ('not', 'O'),
 ('?', 'O'),
 ('That', 'O'),
 ('´', 'O'),
 ('s', 'O'),
 ('insane', 'O'),
 ('and', 'O'),
 ('good', 'O'),
 ('.', 'O')]

Next we define a function that merges the tagger output (word,pos) with the original (word,biotag) tuples into (word,pos,biotag) triples for the words of each sentence.

In [121]:
#def add_POS_tag(word_tuple):
#    #convert tuple to list
#    l = list(word_tuple)
#    
#    #insert new value at index 1
#    new_val = nltk.pos_tag(word_tuple)
#    l.insert(1, new_val[0][1])
#    
#    #convert list again to tuple
#    new_word_tuple = tuple(l)
#    return new_word_tuple 

def add_POS_tag(word_pos, sent):
    # word_pos: [(word, pos), ...] from the tagger; sent: [(word, biotag), ...]
    return [(word, pos, token[1]) for (word, pos), token in zip(word_pos, sent)]

We now add POS tags to the words of each sentence, storing the data in the format (word,pos,biotag).
Note that nltk's pos_tag function takes the whole sentence as input and assigns a POS tag to each word based on both the word itself and the context it appears in.

In [122]:
#add pos tag to each dataset, can take a few minutes
train_sents = [add_POS_tag(nltk.pos_tag([word[0] for word in sentence]), sentence) for sentence in train_sents]
dev_sents = [add_POS_tag(nltk.pos_tag([word[0] for word in sentence]), sentence) for sentence in dev_sents]
test_sents = [add_POS_tag(nltk.pos_tag([word[0] for word in sentence]), sentence) for sentence in test_sents]
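
The merge of tagger output and BIO labels can be sketched in isolation as follows. The `tagger` argument and the `toy_tagger` helper are assumptions made for illustration: `tagger` is any callable that maps a list of tokens to (token, tag) pairs, such as `nltk.pos_tag`, and the toy version only exists so the example runs without downloading a model.

```python
def add_pos_tags(sentence, tagger):
    """Turn [(word, biotag), ...] into [(word, pos, biotag), ...].

    `tagger` receives the whole token list at once, so it can use context.
    """
    words = [word for word, _ in sentence]
    tagged = tagger(words)
    return [(word, pos, bio) for (word, pos), (_, bio) in zip(tagged, sentence)]


def toy_tagger(words):
    # Hypothetical stand-in for nltk.pos_tag: capitalised words become NNP.
    return [(w, "NNP" if w[:1].isupper() else "NN") for w in words]


example = [("Tesla", "B-corporation"), ("rocks", "O")]
result = add_pos_tags(example, toy_tagger)
print(result)
```

Passing the tagger in as a parameter makes it easy to swap NLTK for another package (for example spaCy) without touching the merge logic.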

3. Run a baseline run (train -> test) with the features directly copied from the tutorial.

In [123]:
def word2features(sent, i):
    word = sent[i][0]
    postag = sent[i][1]

    features = {
        'bias': 1.0,
        'word.lower()': word.lower(),
        'word[-3:]': word[-3:],
        'word[-2:]': word[-2:],
        'word.isupper()': word.isupper(),
        'word.istitle()': word.istitle(),
        'word.isdigit()': word.isdigit(),
        'postag': postag,
        'postag[:2]': postag[:2],
    }
    if i > 0:
        word1 = sent[i-1][0]
        postag1 = sent[i-1][1]
        features.update({
            '-1:word.lower()': word1.lower(),
            '-1:word.istitle()': word1.istitle(),
            '-1:word.isupper()': word1.isupper(),
            '-1:postag': postag1,
            '-1:postag[:2]': postag1[:2],
        })
    else:
        features['BOS'] = True

    if i < len(sent)-1:
        word1 = sent[i+1][0]
        postag1 = sent[i+1][1]
        features.update({
            '+1:word.lower()': word1.lower(),
            '+1:word.istitle()': word1.istitle(),
            '+1:word.isupper()': word1.isupper(),
            '+1:postag': postag1,
            '+1:postag[:2]': postag1[:2],
        })
    else:
        features['EOS'] = True

    return features

def sent2features(sent):
    return [word2features(sent, i) for i in range(len(sent))]

def sent2labels(sent):
    return [label for token, postag, label in sent]

def sent2tokens(sent):
    return [token for token, postag, label in sent]

In [124]:
X_train = [sent2features(s) for s in train_sents]
y_train = [sent2labels(s) for s in train_sents]

X_dev = [sent2features(s) for s in dev_sents]
y_dev = [sent2labels(s) for s in dev_sents]

X_test = [sent2features(s) for s in test_sents]
y_test = [sent2labels(s) for s in test_sents]

In [125]:
crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1,
    c2=0.1,
    max_iterations=100,
    all_possible_transitions=True
)
crf.fit(X_train, y_train)

CRF(algorithm='lbfgs', all_possible_states=None, all_possible_transitions=True,
    averaging=None, c=None, c1=0.1, c2=0.1, calibration_candidates=None,
    calibration_eta=None, calibration_max_trials=None, calibration_rate=None,
    calibration_samples=None, delta=None, epsilon=None, error_sensitive=None,
    gamma=None, keep_tempfiles=None, linesearch=None, max_iterations=100,
    max_linesearch=None, min_freq=None, model_filename=None, num_memories=None,
    pa_type=None, period=None, trainer_cls=None, variance=None, verbose=False)

In [126]:
labels = list(crf.classes_)
labels.remove('O')
labels

['B-location',
 'I-location',
 'B-group',
 'B-corporation',
 'B-person',
 'B-creative-work',
 'B-product',
 'I-person',
 'I-creative-work',
 'I-corporation',
 'I-group',
 'I-product']

In [127]:
y_pred = crf.predict(X_test)

sorted_labels = sorted(
    labels,
    key=lambda name: (name[1:], name[0])
)
print(metrics.flat_classification_report(
    y_test, y_pred, labels=sorted_labels, digits=3
))

                 precision    recall  f1-score   support

  B-corporation      0.000     0.000     0.000        66
  I-corporation      0.000     0.000     0.000        22
B-creative-work      0.333     0.035     0.064       142
I-creative-work      0.296     0.037     0.065       218
        B-group      0.300     0.036     0.065       165
        I-group      0.357     0.071     0.119        70
     B-location      0.385     0.233     0.290       150
     I-location      0.231     0.064     0.100        94
       B-person      0.551     0.138     0.220       429
       I-person      0.547     0.221     0.315       131
      B-product      0.600     0.024     0.045       127
      I-product      0.375     0.048     0.085       126

      micro avg      0.430     0.093     0.153      1740
      macro avg      0.331     0.076     0.114      1740
   weighted avg      0.401     0.093     0.142      1740



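4. Set up hyperparameter optimization using the dev set and evaluate the result on the test set.

Note that the search in the next cell runs 3-fold cross-validation on the training set, so the dev set is not used at this point.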

In [128]:
# define fixed parameters and parameters to search
crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs', 
    max_iterations=100, 
    all_possible_transitions=True
)
params_space = {
    'c1': scipy.stats.expon(scale=0.5),
    'c2': scipy.stats.expon(scale=0.05),
}

# use the same metric for evaluation
f1_scorer = make_scorer(metrics.flat_f1_score, 
                        average='weighted', labels=labels)

# search
rs = RandomizedSearchCV(crf, params_space, 
                        cv=3, 
                        verbose=1, 
                        n_jobs=-1, 
                        n_iter=50, 
                        scoring=f1_scorer)
rs.fit(X_train, y_train)

Fitting 3 folds for each of 50 candidates, totalling 150 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 12 concurrent workers.
[Parallel(n_jobs=-1)]: Done  26 tasks      | elapsed:  1.6min
[Parallel(n_jobs=-1)]: Done 150 out of 150 | elapsed:  9.3min finished


RandomizedSearchCV(cv=3, error_score=nan,
                   estimator=CRF(algorithm='lbfgs', all_possible_states=None,
                                 all_possible_transitions=True, averaging=None,
                                 c=None, c1=None, c2=None,
                                 calibration_candidates=None,
                                 calibration_eta=None,
                                 calibration_max_trials=None,
                                 calibration_rate=None,
                                 calibration_samples=None, delta=None,
                                 epsilon=None, error_sensitive=None, gamma=None,
                                 keep_...
                                        'c2': <scipy.stats._distn_infrastructure.rv_frozen object at 0x000001C5B81AFC48>},
                   pre_dispatch='2*n_jobs', random_state=None, refit=True,
                   return_train_score=False,
                   scoring=make_scorer(flat_f1_score, average=weighte

In [129]:
print('best params:', rs.best_params_)
print('best CV score:', rs.best_score_)
print('model size: {:0.2f}M'.format(rs.best_estimator_.size_ / 1000000))

print("\nweighted avg:")
crf = rs.best_estimator_
y_pred = crf.predict(X_test)
weighted_avg = metrics.flat_classification_report(
    y_test, y_pred, labels=sorted_labels, digits=3, output_dict=True)['weighted avg']
for k in weighted_avg.keys():
    print("%s: %s" % (k, round(weighted_avg[k], 3)))

best params: {'c1': 0.04628662290756564, 'c2': 0.03332856133145488}
best CV score: 0.3981227042835333
model size: 0.92M

weighted avg:
precision: 0.383
recall: 0.092
f1-score: 0.141
support: 1740
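
One way to let the dev set drive model selection is scikit-learn's `PredefinedSplit`, which treats samples marked `-1` as training-only and samples marked `0` as the single validation fold. The sketch below is a possible setup, not the method used in this notebook; the helper names `dev_fold_markers` and `build_dev_search` and the `seed` value are assumptions. It also passes `random_state`, because `random.seed(30)` only seeds Python's own generator and does not affect how `RandomizedSearchCV` samples from the `scipy.stats` distributions.

```python
def dev_fold_markers(n_train, n_dev):
    """-1 keeps a sample in the training part, 0 puts it in the validation fold."""
    return [-1] * n_train + [0] * n_dev


def build_dev_search(crf, params_space, scorer, n_train, n_dev, n_iter=50, seed=30):
    # Imported here so the helper above stays usable without scikit-learn.
    from sklearn.model_selection import PredefinedSplit, RandomizedSearchCV

    split = PredefinedSplit(test_fold=dev_fold_markers(n_train, n_dev))
    return RandomizedSearchCV(
        crf,
        params_space,
        cv=split,            # one split: fit on train, score on dev
        n_iter=n_iter,
        scoring=scorer,
        random_state=seed,   # makes the sampled c1/c2 values reproducible
        refit=False,         # retrain explicitly with best_params_ afterwards
    )


markers = dev_fold_markers(3, 2)
print(markers)
```

A possible usage would be `rs = build_dev_search(crf, params_space, f1_scorer, len(X_train), len(X_dev))` followed by `rs.fit(X_train + X_dev, y_train + y_dev)`, after which a final model can be trained on `X_train` with `rs.best_params_`.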


#### Extend the features

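This is the extended feature function, word2features_extended. It widens the context window to -3 .. +3 and adds 'word.isdigit()' for the context tokens. A further candidate feature, 'word.starts_with_uppercase', is currently commented out.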

In [130]:
def word2features_extended(sent, i):
    word = sent[i][0]
    postag = sent[i][1]

    features = {
        'bias': 1.0,
        'word.lower()': word.lower(),
        'word[-3:]': word[-3:],
        'word[-2:]': word[-2:],
        'word.isupper()': word.isupper(),
        'word.istitle()': word.istitle(),
        'word.isdigit()': word.isdigit(),
        #'word.starts_with_uppercase': word[:1].isupper(),
        'postag': postag,
        'postag[:2]': postag[:2],
    }
    if i > 2:
        word3 = sent[i-3][0]
        postag3 = sent[i-3][1]
        features.update({
            '-3:word.lower()': word3.lower(),
            '-3:word.istitle()': word3.istitle(),
            '-3:word.isupper()': word3.isupper(),
            '-3:word.isdigit()': word3.isdigit(),
            '-3:postag': postag3,
            '-3:postag[:2]': postag3[:2],
        })
    if i > 1:
        word2 = sent[i-2][0]
        postag2 = sent[i-2][1]
        features.update({
            '-2:word.lower()': word2.lower(),
            '-2:word.istitle()': word2.istitle(),
            '-2:word.isupper()': word2.isupper(),
            '-2:word.isdigit()': word2.isdigit(),
            '-2:postag': postag2,
            '-2:postag[:2]': postag2[:2],
        })
    if i > 0:
        word1 = sent[i-1][0]
        postag1 = sent[i-1][1]
        features.update({
            '-1:word.lower()': word1.lower(),
            '-1:word.istitle()': word1.istitle(),
            '-1:word.isupper()': word1.isupper(),
            #'-1:word.starts_with_uppercase': word1[:1].isupper(),
            '-1:word.isdigit()': word1.isdigit(),
            '-1:postag': postag1,
            '-1:postag[:2]': postag1[:2],
        })
    else:
        features['BOS'] = True

    if i < len(sent)-3:
        word3 = sent[i+3][0]
        postag3 = sent[i+3][1]
        features.update({
            '+3:word.lower()': word3.lower(),
            '+3:word.istitle()': word3.istitle(),
            '+3:word.isupper()': word3.isupper(),
            '+3:word.isdigit()': word3.isdigit(),
            '+3:postag': postag3,
            '+3:postag[:2]': postag3[:2],
        })
    if i < len(sent)-2:
        word2 = sent[i+2][0]
        postag2 = sent[i+2][1]
        features.update({
            '+2:word.lower()': word2.lower(),
            '+2:word.istitle()': word2.istitle(),
            '+2:word.isupper()': word2.isupper(),
            '+2:word.isdigit()': word2.isdigit(),
            '+2:postag': postag2,
            '+2:postag[:2]': postag2[:2],
        })
    if i < len(sent)-1:
        word1 = sent[i+1][0]
        postag1 = sent[i+1][1]
        features.update({
            '+1:word.lower()': word1.lower(),
            '+1:word.istitle()': word1.istitle(),
            '+1:word.isupper()': word1.isupper(),
            '+1:word.isdigit()': word1.isdigit(),
            '+1:postag': postag1,
            '+1:postag[:2]': postag1[:2],
        })
    else:
        features['EOS'] = True

    return features

def sent2features_extended(sent):
    return [word2features_extended(sent, i) for i in range(len(sent))]
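
The repeated per-offset blocks in word2features_extended can also be generated with a loop over the window. The sketch below is intended to produce the same keys for `window=3`; the names `token_features` and `word2features_window` are hypothetical and not part of the tutorial code.

```python
def token_features(word, postag, prefix=""):
    """Features of one token; `prefix` marks its offset, e.g. '-2:'."""
    return {
        prefix + "word.lower()": word.lower(),
        prefix + "word.istitle()": word.istitle(),
        prefix + "word.isupper()": word.isupper(),
        prefix + "word.isdigit()": word.isdigit(),
        prefix + "postag": postag,
        prefix + "postag[:2]": postag[:2],
    }


def word2features_window(sent, i, window=3):
    word, postag = sent[i][0], sent[i][1]
    features = {"bias": 1.0, "word[-3:]": word[-3:], "word[-2:]": word[-2:]}
    features.update(token_features(word, postag))
    for offset in range(-window, window + 1):
        j = i + offset
        if offset == 0 or not 0 <= j < len(sent):
            continue  # skip the token itself and positions outside the sentence
        # f"{offset:+d}:" yields prefixes such as '-3:' and '+1:'
        features.update(token_features(sent[j][0], sent[j][1], f"{offset:+d}:"))
    if i == 0:
        features["BOS"] = True
    if i == len(sent) - 1:
        features["EOS"] = True
    return features


sent = [("Nick", "NNP", "B-person"), ("visits", "VBZ", "O"), ("Leiden", "NNP", "B-location")]
first = word2features_window(sent, 0, window=1)
last = word2features_window(sent, 2, window=2)
```

With the window size as a parameter, comparing -1 .. +1, -2 .. +2 and -3 .. +3 (task 6) only requires changing one argument.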

In [131]:
X_dev = [sent2features_extended(s) for s in dev_sents]
y_dev = [sent2labels(s) for s in dev_sents]

X_test = [sent2features_extended(s) for s in test_sents]
y_test = [sent2labels(s) for s in test_sents]

In [132]:
crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1,
    c2=0.1,
    max_iterations=100,
    all_possible_transitions=True
)
crf.fit(X_dev, y_dev)  # NOTE: fits on the dev set; X_train was not rebuilt with the extended features

y_pred = crf.predict(X_test)

sorted_labels = sorted(
    labels,
    key=lambda name: (name[1:], name[0])
)

print("weighted avg:")
weighted_avg = metrics.flat_classification_report(
    y_test, y_pred, labels=sorted_labels, digits=3, output_dict=True)['weighted avg']
for k in weighted_avg.keys():
    print("%s: %s" % (k, round(weighted_avg[k], 3)))

weighted avg:
precision: 0.288
recall: 0.156
f1-score: 0.187
support: 1740


  _warn_prf(average, modifier, msg_start, len(result))


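Note that the weighted F1 score rose from 0.142 (baseline) to 0.187 with the extended features, which is only a small gain. The comparison is not like-for-like, though: this model was fit on the dev set (crf.fit(X_dev, y_dev)), whereas the baseline was fit on the training set.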

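## TO-DO
Adjust the features so that the F1 score is as high as possible.
It is still unclear why the F1 score is so much lower than the one obtained in the tutorial. One likely factor is that the tutorial uses a different benchmark dataset, so the scores are not directly comparable.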
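
One diagnostic that may help with this TO-DO is to check how many entity tokens in the test set were ever seen as entity tokens during training. If, as the file names (`emerging.dev.conll`, `emerging.test.annotated`) suggest, the evaluation data focuses on new entities, word-identity features would carry little signal and recall would stay low. The sketch below is a rough token-level check on toy data; `entity_tokens` and `unseen_entity_rate` are hypothetical helper names.

```python
def entity_tokens(sents):
    """Lowercased tokens that carry a non-'O' label in (word, pos, biotag) sentences."""
    return [word.lower() for sent in sents for word, _, tag in sent if tag != "O"]


def unseen_entity_rate(train_sents, test_sents):
    """Share of test entity tokens never seen as an entity token in training."""
    seen = set(entity_tokens(train_sents))
    test = entity_tokens(test_sents)
    if not test:
        return 0.0
    return sum(tok not in seen for tok in test) / len(test)


toy_train = [[("Tesla", "NNP", "B-corporation"), ("rocks", "VBZ", "O")]]
toy_test = [[("tesla", "NN", "B-corporation"), ("beats", "VBZ", "O"),
             ("SpaceX", "NNP", "B-corporation")]]
rate = unseen_entity_rate(toy_train, toy_test)
print(rate)
```

On the real data this would be called as `unseen_entity_rate(train_sents, test_sents)` after the POS tags have been added.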

In [133]:
# define fixed parameters and parameters to search
crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs', 
    max_iterations=100, 
    all_possible_transitions=True
)
params_space = {
    'c1': scipy.stats.expon(scale=0.5),
    'c2': scipy.stats.expon(scale=0.05),
}

# use the same metric for evaluation
f1_scorer = make_scorer(metrics.flat_f1_score, 
                        average='weighted', labels=labels)

# search
rs = RandomizedSearchCV(crf, params_space, 
                        cv=3, 
                        verbose=1, 
                        n_jobs=-1, 
                        n_iter=10, 
                        scoring=f1_scorer)
rs.fit(X_dev, y_dev)

print('best params:', rs.best_params_)
print("\nweighted avg:")
crf = rs.best_estimator_
y_pred = crf.predict(X_test)
weighted_avg = metrics.flat_classification_report(
    y_test, y_pred, labels=sorted_labels, digits=3, output_dict=True)['weighted avg']
for k in weighted_avg.keys():
    print("%s: %s" % (k, round(weighted_avg[k], 3)))

Fitting 3 folds for each of 10 candidates, totalling 30 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 12 concurrent workers.
[Parallel(n_jobs=-1)]: Done  30 out of  30 | elapsed:   56.9s finished


best params: {'c1': 0.0038698964579690896, 'c2': 0.02263680390337462}

weighted avg:
precision: 0.282
recall: 0.15
f1-score: 0.177
support: 1740


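The results are slightly worse with hyperparameter optimization (weighted F1 0.177 versus 0.187). The difference is small, and this search tried only 10 random candidates with 3-fold cross-validation on the dev set, so it may simply not have sampled a better setting than c1=0.1, c2=0.1.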

In [134]:
# For interpretation see: https://eli5.readthedocs.io/en/latest/tutorials/sklearn_crfsuite.html
eli5.show_weights(crf, top=30)



From \ To,O,B-corporation,I-corporation,B-creative-work,I-creative-work,B-group,I-group,B-location,I-location,B-person,I-person,B-product,I-product
O,2.973,-0.242,-2.119,0.596,-3.779,1.154,-2.936,1.012,-2.714,1.605,-3.304,0.866,-3.024
B-corporation,-0.075,1.044,3.135,-0.183,-0.38,-0.179,-0.263,-0.2,-0.252,-0.823,-0.475,0.056,-0.515
I-corporation,-0.062,-0.044,2.155,-0.272,-0.295,-0.078,-0.12,-0.112,-0.061,-0.208,-0.062,-0.18,-0.18
B-creative-work,-0.799,-0.292,-0.264,-0.671,4.935,-0.409,-0.482,-0.133,-0.432,-1.333,-0.756,-0.46,-0.65
I-creative-work,-0.236,-0.194,-0.283,0.0,4.652,-0.344,-0.66,-0.39,-0.564,-1.171,-0.562,-0.519,-0.565
B-group,-0.414,-0.067,-0.047,-0.313,-0.542,-0.41,4.435,-0.207,-0.247,-0.717,-0.335,-0.301,-0.342
I-group,-0.668,-0.05,-0.077,-0.133,-0.336,-0.221,3.456,-0.013,-0.207,-0.45,-0.147,-0.073,-0.088
B-location,-0.424,-0.189,-0.25,0.333,-0.77,-0.105,-0.392,-0.342,4.4,-1.158,-0.668,-0.424,-0.576
I-location,-0.567,-0.101,-0.062,-0.067,-0.3,-0.012,-0.085,-0.577,3.612,-0.421,-0.108,-0.082,-0.081
B-person,0.67,-0.49,-0.963,-0.799,-1.257,-0.45,-0.719,-0.374,-0.755,-1.256,5.137,-1.115,-1.216

y=O top features
Weight?,Feature
+4.326,BOS
+3.131,bias
+2.321,EOS
+2.304,word.lower():man
+2.282,word[-3:]:ril
+2.282,word.lower():april
+2.055,+3:word.lower():as
+1.929,word[-2:]:om
+1.900,word.lower():great
+1.883,word[-3:]:zer

y=B-corporation top features
Weight?,Feature
+2.071,word[-2:]:vo
+1.648,word.lower():spacex
+1.612,word[-3:]:rge
+1.586,+3:word.lower():rocket
+1.586,word.lower():herge
+1.568,word.lower():atwoodgames
+1.421,word.lower():united
+1.414,word.lower():tesla
+1.412,+3:word.lower():.
+1.396,+3:word.lower():be

y=I-corporation top features
Weight?,Feature
+1.434,-1:word.istitle()
+1.006,+3:postag:VBN
+0.995,-2:word.istitle()
+0.736,+1:postag:NNS
+0.646,+2:postag[:2]:VB
+0.609,-3:postag:NN
+0.598,-1:word.isupper()
+0.541,-2:postag[:2]:VB
+0.536,-3:postag[:2]:NN
+0.533,word[-3:]:ews

y=B-creative-work top features
Weight?,Feature
+1.972,-1:word.lower():from
+1.865,word[-3:]:man
+1.857,word.lower():koops
+1.848,+3:word.lower():cold
+1.848,word.lower():beat
+1.817,word.lower():minecraft
+1.791,word[-3:]:aft
+1.766,-3:word.lower():so
+1.705,+2:postag:JJS
+1.679,word.lower():spiderman

y=I-creative-work top features
Weight?,Feature
+1.885,-2:word.lower():pool
+1.712,-1:word.lower():party
+1.518,-1:word.lower():first
+1.355,"+1:word.lower():"""
+1.216,-3:word.lower():much
+1.196,-1:word.lower():battlefield
+1.194,word[-2:]:sk
+1.174,-1:word.lower():la
+1.009,-1:word.isupper()
+0.987,word.lower():morty

y=B-group top features
Weight?,Feature
+2.576,+1:word.lower():fan
+1.630,-2:word.lower():perfect
+1.616,word.lower():choice
+1.598,word.lower():warriors
+1.588,word.lower():chiefs
+1.588,word[-2:]:FS
+1.588,word[-3:]:EFS
+1.571,-1:word.lower():rip
+1.561,-1:word.lower():song
+1.545,+3:word.lower():dolphin

y=I-group top features
Weight?,Feature
+1.224,postag:NNS
+1.126,word[-2:]:ls
+0.888,word.lower():kings
+0.888,word[-3:]:NGS
+0.867,+2:word.lower():back
+0.865,word[-2:]:GS
+0.861,word[-3:]:and
+0.801,word[-2:]:nd
+0.796,-3:word.lower():father
+0.795,-1:word.istitle()

y=B-location top features
Weight?,Feature
+2.912,-1:word.lower():in
+2.413,word[-2:]:ca
+2.099,word.lower():zweibrucken
+2.099,word[-3:]:KEN
+2.099,+3:word.lower():lived
+1.945,word[-2:]:EN
+1.912,word.lower():universe
+1.862,word[-3:]:ica
+1.862,word.lower():jamaica
+1.805,word.lower():britian

y=I-location top features
Weight?,Feature
+1.244,word[-3:]:and
+1.164,word[-2:]:nd
+1.088,-3:word.lower():emily
+0.979,word[-2:]:ns
+0.957,-1:word.lower():discovery
+0.953,-1:word.lower():milton
+0.953,-3:word.lower():academy
+0.953,word.lower():keyons
+0.914,-3:word.lower():go
+0.836,word[-3:]:ons

y=B-person top features
Weight?,Feature
+3.519,word.lower():tanner
+3.292,word[-3:]:ily
+3.119,word.lower():trump
+2.796,word[-2:]:na
+2.716,word[-2:]:ma
+2.540,word.lower():kendrick
+2.452,word.lower():sterling
+2.425,+2:word.lower():gone
+2.241,word[-2:]:an
+2.184,word[-3:]:ick

y=I-person top features
Weight?,Feature
+1.997,-1:word.lower():lil
+1.743,-1:word.isupper()
+1.310,-1:word.lower():jai
+1.296,word[-2:]:rt
+1.280,word[-3:]:son
+1.233,word[-2:]:le
+1.224,+3:word.lower():get
+1.177,word[-2:]:or
+1.148,word.lower():censor
+1.146,word[-3:]:sor

y=B-product top features
Weight?,Feature
+2.280,-1:word.lower():back
+2.179,word.lower():youtube
+2.058,-1:word.lower():play
+1.988,-2:word.lower():version
+1.977,word[-3:]:ube
+1.963,word.lower():bord
+1.963,-1:word.lower():ougie
+1.795,word.lower():photoshop
+1.728,word.lower():monopoly
+1.657,-1:word.lower():your

y=I-product top features
Weight?,Feature
+1.692,-3:word.lower():for
+1.196,-2:word.lower():kids
+1.195,word.isdigit()
+1.166,word.lower():children
+1.113,-1:word.lower():audi
+1.093,+2:word.lower():it
+1.087,word[-3:]:ren
+1.007,postag:CD
+1.007,postag[:2]:CD
+1.005,-3:word.lower():loader


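Inspecting the per-label feature weights above may help identify which features contribute to the low F1 score. For example, many of the highest-weighted features are specific words or suffixes (such as word.lower():april or word[-3:]:KEN), which suggests that the model largely memorises tokens from the data it was fit on.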
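
The per-label ranking can also be produced programmatically, which makes it easier to compare feature sets. The sketch below assumes the fitted model exposes a `state_features_` dictionary keyed by (feature, label) pairs, as used in the sklearn-crfsuite tutorial; `top_features_per_label` is a hypothetical helper, and the toy weights are copied from the tables above.

```python
from collections import defaultdict


def top_features_per_label(state_features, k=5):
    """Group {(feature, label): weight} by label and keep the k largest weights."""
    by_label = defaultdict(list)
    for (feature, label), weight in state_features.items():
        by_label[label].append((weight, feature))
    return {
        label: [(feature, weight) for weight, feature in sorted(pairs, reverse=True)[:k]]
        for label, pairs in by_label.items()
    }


toy_weights = {
    ("BOS", "O"): 4.326,
    ("bias", "O"): 3.131,
    ("word.lower():trump", "B-person"): 3.119,
    ("word.lower():tanner", "B-person"): 3.519,
}
top = top_features_per_label(toy_weights, k=1)
print(top)
```

With a fitted model this would be called as `top_features_per_label(crf.state_features_, k=10)`.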