## sklearn-crfsuite

In this notebook we train a basic CRF model for Named Entity Recognition on CoNLL2002 data (following https://github.com/TeamHG-Memex/sklearn-crfsuite/blob/master/docs/CoNLL2002.ipynb) and inspect its weights to see what it learned.

In [1]:
import sys
sys.path.insert(0, '..')

import nltk
import sklearn_crfsuite

from eli5 import explain_weights, format_as_text
from eli5.formatters import fields

Load training and test data:

In [2]:
%%time
train_sents = list(nltk.corpus.conll2002.iob_sents('esp.train'))
test_sents = list(nltk.corpus.conll2002.iob_sents('esp.testb'))

CPU times: user 2.86 s, sys: 136 ms, total: 2.99 s
Wall time: 3.06 s


Extract features: the lowercased word, word suffixes, POS tags, title/upper/digit flags, and similar features of the previous and next words:

In [3]:
def word2features(sent, i):
    """Build the feature dict for token i of sent, a list of (token, postag, label) tuples."""
    word = sent[i][0]
    postag = sent[i][1]

    features = {
        'bias': 1.0,
        'word.lower()': word.lower(),
        'word[-3:]': word[-3:],
        'word[-2:]': word[-2:],
        'word.isupper()': word.isupper(),
        'word.istitle()': word.istitle(),
        'word.isdigit()': word.isdigit(),
        'postag': postag,
        'postag[:2]': postag[:2],        
    }
    if i > 0:
        word1 = sent[i-1][0]
        postag1 = sent[i-1][1]
        features.update({
            '-1:word.lower()': word1.lower(),
            '-1:word.istitle()': word1.istitle(),
            '-1:word.isupper()': word1.isupper(),
            '-1:postag': postag1,
            '-1:postag[:2]': postag1[:2],
        })
    else:
        features['BOS'] = True
        
    if i < len(sent)-1:
        word1 = sent[i+1][0]
        postag1 = sent[i+1][1]
        features.update({
            '+1:word.lower()': word1.lower(),
            '+1:word.istitle()': word1.istitle(),
            '+1:word.isupper()': word1.isupper(),
            '+1:postag': postag1,
            '+1:postag[:2]': postag1[:2],
        })
    else:
        features['EOS'] = True
                
    return features


def sent2features(sent):
    return [word2features(sent, i) for i in range(len(sent))]

def sent2labels(sent):
    return [label for token, postag, label in sent]

def sent2tokens(sent):
    return [token for token, postag, label in sent]

In [4]:
sent2features(train_sents[0])[0]

{'+1:postag': 'Fpa',
 '+1:postag[:2]': 'Fp',
 '+1:word.istitle()': False,
 '+1:word.isupper()': False,
 '+1:word.lower()': '(',
 'BOS': True,
 'bias': 1.0,
 'postag': 'NP',
 'postag[:2]': 'NP',
 'word.isdigit()': False,
 'word.istitle()': True,
 'word.isupper()': False,
 'word.lower()': 'melbourne',
 'word[-2:]': 'ne',
 'word[-3:]': 'rne'}

In [5]:
%%time
X_train = [sent2features(s) for s in train_sents]
y_train = [sent2labels(s) for s in train_sents]

X_test = [sent2features(s) for s in test_sents]
y_test = [sent2labels(s) for s in test_sents]

CPU times: user 1.55 s, sys: 150 ms, total: 1.7 s
Wall time: 1.73 s


Train a CRF model:

In [6]:
%%time
crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1, 
    c2=0.1, 
    max_iterations=20, 
    all_possible_transitions=False,
)
crf.fit(X_train, y_train)

CPU times: user 15.1 s, sys: 269 ms, total: 15.4 s
Wall time: 16 s


Check CRF weights (transition weights and state weights):

In [7]:
explain_weights(crf, top=20)

From \ To,B-LOC,O,B-ORG,B-PER,I-PER,B-MISC,I-ORG,I-LOC,I-MISC
B-LOC,-0.06,0.03,0.0,-0.221,0.0,0.0,0.0,4.345,0.0
O,2.488,3.704,2.857,2.262,0.0,2.24,0.0,0.0,0.0
B-ORG,-0.217,0.265,-0.955,-0.381,0.0,-0.535,5.029,0.0,0.0
B-PER,-0.832,-0.336,-1.224,-1.169,4.528,0.0,0.0,0.0,0.0
I-PER,-0.277,-0.81,0.0,-0.728,3.611,0.0,0.0,0.0,0.0
B-MISC,-0.326,-0.651,-0.408,-0.354,0.0,0.0,0.0,0.0,5.87
I-ORG,-1.926,-0.42,-1.703,-0.622,0.0,-0.794,5.551,0.0,0.0
I-LOC,-0.608,-0.297,0.0,0.0,0.0,0.0,0.0,3.744,0.0
I-MISC,-0.986,-0.724,-0.929,-0.612,0.0,-0.421,0.0,0.0,5.874

y=B-LOC  top features,y=B-LOC  top features
Weight,Feature
+2.308,word.istitle()
+2.215,-1:word.lower():en
+0.968,word[-2:]:id
+0.921,word[-3:]:rid
+0.920,word.lower():madrid
+0.786,word[-2:]:ia
+0.686,word[-2:]:ña
+0.675,word[-2:]:ís
+0.665,word[-3:]:ona
+0.646,word.lower():españa

y=O  top features,y=O  top features
Weight,Feature
+3.644,postag[:2]:Fp
+3.577,BOS
+3.140,bias
+1.998,postag:CC
+1.998,postag[:2]:CC
+1.838,"word[-2:]:,"
+1.838,"word.lower():,"
+1.838,"word[-3:]:,"
+1.838,postag:Fc
+1.838,postag[:2]:Fc

y=B-ORG  top features,y=B-ORG  top features
Weight,Feature
+2.413,word.lower():efe
+2.156,word.isupper()
+1.133,word[-2:]:FE
+1.117,word[-3:]:EFE
+1.103,word.lower():gobierno
+0.960,-1:word.lower():del
+0.882,word.istitle()
+0.850,word[-3:]:rno
+0.778,-1:word.lower():al
+0.741,word[-2:]:PP

y=B-PER  top features,y=B-PER  top features
Weight,Feature
+2.805,word.istitle()
+0.785,postag:NP
+0.785,postag[:2]:NP
+0.602,+1:postag:VMI
+0.591,-1:word.lower():a
+0.590,+1:postag[:2]:VM
… 4353 more positive …,… 4353 more positive …
… 429 more negative …,… 429 more negative …
-0.564,word[-2:]:ia
-0.578,word.lower():la

y=I-PER  top features,y=I-PER  top features
Weight,Feature
+1.837,-1:word.istitle()
+1.267,word[-2:]:ez
+1.015,word.istitle()
+0.692,-1:word.lower():josé
+0.536,-1:postag[:2]:AQ
+0.536,-1:postag:AQ
+0.505,-1:postag[:2]:VM
+0.498,-1:word.lower():juan
+0.425,-1:word.lower():maría
+0.353,-1:postag:VMI

y=B-MISC  top features,y=B-MISC  top features
Weight,Feature
+1.894,word.isupper()
+0.804,word.istitle()
+0.664,-1:word.lower():la
+0.519,-1:postag[:2]:Fe
+0.519,-1:postag:Fe
+0.519,"-1:word.lower():"""
+0.460,"word[-3:]:"""
+0.460,postag:Fe
+0.460,postag[:2]:Fe
+0.460,"word.lower():"""

y=I-ORG  top features,y=I-ORG  top features
Weight,Feature
+1.515,-1:word.istitle()
+0.938,-1:word.lower():de
+0.723,-1:postag[:2]:SP
+0.723,-1:postag:SP
+0.486,word[-2:]:id
+0.481,word[-3:]:rid
+0.471,word.lower():madrid
+0.435,-1:word.lower():real
+0.390,+1:postag:Fpa
+0.390,+1:word.lower():(

y=I-LOC  top features,y=I-LOC  top features
Weight,Feature
+1.302,-1:word.istitle()
+0.801,-1:word.lower():de
+0.641,word[-2:]:de
+0.615,word[-3:]:de
+0.538,-1:word.lower():san
+0.403,-1:word.lower():la
+0.364,-1:postag[:2]:SP
+0.364,-1:postag:SP
+0.316,word[-2:]:la
+0.296,word.istitle()

y=I-MISC  top features,y=I-MISC  top features
Weight,Feature
+0.729,-1:word.istitle()
+0.562,+1:postag:Fe
+0.562,"+1:word.lower():"""
+0.562,+1:postag[:2]:Fe
+0.424,word[-2:]:es
+0.343,-1:word.lower():liga
+0.341,-1:word.lower():de
+0.313,word[-2:]:el
+0.274,-1:word.lower():copa
+0.252,+1:postag:Z


Let's check how CRF's all_possible_transitions argument affects the result:

In [8]:
%%time
crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1, 
    c2=0.1, 
    max_iterations=20, 
    all_possible_transitions=True,
)
crf.fit(X_train, y_train)

CPU times: user 15.6 s, sys: 134 ms, total: 15.7 s
Wall time: 15.8 s


In [9]:
expl = explain_weights(crf, top=20)
expl

From \ To,B-LOC,O,B-ORG,B-PER,I-PER,B-MISC,I-ORG,I-LOC,I-MISC
B-LOC,-0.133,-0.019,-0.753,-0.178,-1.331,-0.45,-1.462,3.305,-1.193
O,1.503,2.573,2.066,1.792,-5.007,1.333,-6.43,-5.12,-5.563
B-ORG,-0.187,0.225,-0.993,-0.226,-1.901,-0.585,4.358,-1.342,-1.954
B-PER,-0.824,-0.575,-0.933,-0.855,4.06,-0.556,-1.699,-1.104,-1.503
I-PER,-0.377,-0.879,-0.833,-0.592,3.053,-0.461,-1.235,-0.835,-0.985
B-MISC,-0.396,-0.56,-0.422,-0.351,-0.992,-0.372,-1.143,-0.759,4.372
I-ORG,-1.741,-0.31,-1.247,-0.27,-1.567,-0.723,4.737,-0.935,-1.481
I-LOC,-0.678,-0.198,-0.623,-0.486,-0.678,-0.385,-0.842,2.789,-0.593
I-MISC,-1.134,-0.511,-0.912,-0.414,-1.152,-0.455,-1.378,-0.719,4.218

y=B-LOC  top features,y=B-LOC  top features
Weight,Feature
+2.175,-1:word.lower():en
+1.925,word.istitle()
+0.995,word[-2:]:id
+0.949,word[-3:]:rid
+0.949,word.lower():madrid
+0.694,word[-2:]:ís
+0.685,+1:postag[:2]:Fp
+0.674,word[-3:]:ona
+0.659,word[-2:]:ña
+0.654,word.lower():españa

y=O  top features,y=O  top features
Weight,Feature
+3.840,postag[:2]:Fp
+3.413,BOS
+2.915,bias
+2.181,postag:CC
+2.181,postag[:2]:CC
+1.923,EOS
+1.878,word.lower():y
+1.816,"word[-3:]:,"
+1.816,"word.lower():,"
+1.816,postag[:2]:Fc

y=B-ORG  top features,y=B-ORG  top features
Weight,Feature
+3.318,word.isupper()
+2.310,word.lower():efe
+1.384,word[-3:]:EFE
+1.321,word.istitle()
+1.231,word[-2:]:FE
+1.203,word.lower():gobierno
+0.869,word[-3:]:rno
+0.717,-1:word.lower():al
+0.712,-1:word.lower():el
+0.683,-1:word.lower():del

y=B-PER  top features,y=B-PER  top features
Weight,Feature
+1.826,word.istitle()
+0.845,postag:NP
+0.845,postag[:2]:NP
+0.708,+1:postag:VMI
+0.649,-1:postag:VMI
+0.600,-1:word.lower():a
… 4320 more positive …,… 4320 more positive …
… 449 more negative …,… 449 more negative …
-0.590,"-1:word.lower():"""
-0.590,-1:postag[:2]:Fe

y=I-PER  top features,y=I-PER  top features
Weight,Feature
+2.816,-1:word.istitle()
+1.261,word[-2:]:ez
+1.062,word.istitle()
+0.651,-1:word.lower():josé
+0.571,-1:postag:AQ
+0.571,-1:postag[:2]:AQ
+0.519,-1:word.lower():juan
+0.507,-1:postag[:2]:VM
+0.461,-1:word.lower():maría
+0.365,-1:word.lower():de

y=B-MISC  top features,y=B-MISC  top features
Weight,Feature
+2.347,word.isupper()
+0.634,word.istitle()
+0.504,postag:Fe
+0.504,"word[-2:]:"""
+0.504,"word[-3:]:"""
+0.504,"word.lower():"""
+0.504,postag[:2]:Fe
+0.474,postag:Z
+0.474,postag[:2]:Z
+0.466,"-1:word.lower():"""

y=I-ORG  top features,y=I-ORG  top features
Weight,Feature
+1.598,-1:word.istitle()
+0.926,-1:word.lower():de
+0.568,-1:postag:SP
+0.568,-1:postag[:2]:SP
+0.469,word[-2:]:id
+0.464,word[-3:]:rid
+0.454,word.lower():madrid
+0.449,-1:word.lower():real
+0.434,word[-3:]:de
+0.417,word.lower():de

y=I-LOC  top features,y=I-LOC  top features
Weight,Feature
+0.794,-1:word.lower():de
+0.785,-1:word.istitle()
+0.598,-1:word.lower():san
+0.516,word.istitle()
+0.499,word[-2:]:de
+0.480,word[-3:]:de
+0.476,word.lower():de
+0.430,-1:postag[:2]:NC
+0.430,-1:postag:NC
+0.392,-1:word.lower():la

y=I-MISC  top features,y=I-MISC  top features
Weight,Feature
+1.141,-1:word.istitle()
+0.639,+1:postag[:2]:Fe
+0.639,"+1:word.lower():"""
+0.639,+1:postag:Fe
+0.587,-1:word.lower():de
+0.447,word[-2:]:de
+0.432,word[-3:]:de
+0.431,word.lower():de
+0.389,-1:postag[:2]:SP
+0.389,-1:postag:SP


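With `all_possible_transitions=True` the CRF learned large negative weights for transitions that are impossible in the IOB encoding, such as O -> I-ORG (-6.43). With `all_possible_transitions=False` the same cells were 0.0, because CRFsuite then generates transition features only for label pairs that occur in the training data.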
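
This can be checked against the table by encoding the IOB constraint directly: an `I-X` label may only follow `B-X` or `I-X` of the same entity type. The sketch below copies the `O` and `B-ORG` rows of the transition table above by hand; the helper `is_valid_iob_transition` is a hypothetical name, not part of eli5 or sklearn-crfsuite.

```python
def is_valid_iob_transition(prev_label, next_label):
    """I-X may only follow B-X or I-X; B-X and O may follow anything."""
    if not next_label.startswith('I-'):
        return True
    entity = next_label[2:]
    return prev_label in ('B-' + entity, 'I-' + entity)


to_labels = ['B-LOC', 'O', 'B-ORG', 'B-PER', 'I-PER',
             'B-MISC', 'I-ORG', 'I-LOC', 'I-MISC']
# Two rows copied from the transition table (all_possible_transitions=True).
rows = {
    'O': [1.503, 2.573, 2.066, 1.792, -5.007, 1.333, -6.43, -5.12, -5.563],
    'B-ORG': [-0.187, 0.225, -0.993, -0.226, -1.901, -0.585, 4.358, -1.342, -1.954],
}

# Collect the weights of every transition that violates the IOB constraint.
invalid = {
    (prev, nxt): weight
    for prev, weights in rows.items()
    for nxt, weight in zip(to_labels, weights)
    if not is_valid_iob_transition(prev, nxt)
}

# Print from most negative to least negative.
for (prev, nxt), weight in sorted(invalid.items(), key=lambda kv: kv[1]):
    print(f'{prev} -> {nxt}: {weight:+.3f}')
```

All seven invalid transitions in these two rows have negative weights, and the ones leaving `O` are the most negative. With a fitted model, the full set of weights should be available as the dict `crf.transition_features_`, keyed by `(label_from, label_to)`.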

The explanation can also be formatted as plain text, which is useful in a console:

In [10]:
print(format_as_text(expl))

Explained as: CRF

Transition features:
          B-LOC       O    B-ORG    B-PER    I-PER    B-MISC    I-ORG    I-LOC    I-MISC
------  -------  ------  -------  -------  -------  --------  -------  -------  --------
B-LOC    -0.133  -0.019   -0.753   -0.178   -1.331    -0.450   -1.462    3.305    -1.193
O         1.503   2.573    2.066    1.792   -5.007     1.333   -6.430   -5.120    -5.563
B-ORG    -0.187   0.225   -0.993   -0.226   -1.901    -0.585    4.358   -1.342    -1.954
B-PER    -0.824  -0.575   -0.933   -0.855    4.060    -0.556   -1.699   -1.104    -1.503
I-PER    -0.377  -0.879   -0.833   -0.592    3.053    -0.461   -1.235   -0.835    -0.985
B-MISC   -0.396  -0.560   -0.422   -0.351   -0.992    -0.372   -1.143   -0.759     4.372
I-ORG    -1.741  -0.310   -1.247   -0.270   -1.567    -0.723    4.737   -0.935    -1.481
I-LOC    -0.678  -0.198   -0.623   -0.486   -0.678    -0.385   -0.842    2.789    -0.593
I-MISC   -1.134  -0.511   -0.912   -0.414   -1.152    -0.455   -1.378 