<a href="https://colab.research.google.com/github/jbishop45/CS-7650/blob/project-2/project_2_NER_release_sp23.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [15]:
# Licensing Information:  You are free to use or extend this project for
# educational purposes provided that (1) you do not distribute or publish
# solutions, (2) you retain this notice, and (3) you provide clear
# attribution to The Georgia Institute of Technology, including a link to  https://aritter.github.io/CS-7650-sp22/

# Attribution Information: This assignment was developed at The Georgia Institute of Technology
# by Alan Ritter (alan.ritter@cc.gatech.edu)
# Contributors: Xurui Zhang (Spring 2022)

# Project #2: Named Entity Recognition

In this assignment, you will implement a bidirectional LSTM-CNN-CRF for sequence labeling, following [this paper by Xuezhe Ma and Ed Hovy](https://www.aclweb.org/anthology/P16-1101.pdf), on the CoNLL named entity recognition dataset.  Before starting the assignment, we recommend reading the Ma and Hovy paper.

First, let's import some libraries and make sure the runtime has access to a GPU.


In [2]:
import torch
import torch.nn as nn
import torch.optim as optim

gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
    print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
    print('and then re-execute this cell.')
else:
    print(gpu_info)

print(f'GPU available: {torch.cuda.is_available()}')

Fri Mar 24 13:26:18 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   57C    P8    10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Download the Data

Run the following code to download the English part of the CoNLL 2003 dataset, the evaluation script and pre-filtered GloVe embeddings we are providing for this data.

In [3]:
#CoNLL 2003 data
!wget https://raw.githubusercontent.com/patverga/torch-ner-nlp-from-scratch/master/data/conll2003/eng.train
!wget https://raw.githubusercontent.com/patverga/torch-ner-nlp-from-scratch/master/data/conll2003/eng.testa
!wget https://raw.githubusercontent.com/patverga/torch-ner-nlp-from-scratch/master/data/conll2003/eng.testb
!cat eng.train | awk '{print $1 "\t" $4}' > train
!cat eng.testa | awk '{print $1 "\t" $4}' > dev
!cat eng.testb | awk '{print $1 "\t" $4}' > test

#Evaluation Script
!wget https://raw.githubusercontent.com/aritter/twitter_nlp/master/data/annotated/wnut16/conlleval.pl

#Pre-filtered GloVe embeddings
!wget https://raw.githubusercontent.com/aritter/aritter.github.io/master/files/glove.840B.300d.conll_filtered.txt

--2023-03-24 13:26:18--  https://raw.githubusercontent.com/patverga/torch-ner-nlp-from-scratch/master/data/conll2003/eng.train
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3283420 (3.1M) [text/plain]
Saving to: ‘eng.train’


2023-03-24 13:26:18 (45.1 MB/s) - ‘eng.train’ saved [3283420/3283420]

--2023-03-24 13:26:18--  https://raw.githubusercontent.com/patverga/torch-ner-nlp-from-scratch/master/data/conll2003/eng.testa
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 827443 (808K) [text/plain]
Saving to: ‘eng.testa’


2023-03-24

## CoNLL Data Format

Run the following cell to see a sample of the data in CoNLL format.  As you can see, each line in the file represents a word and its labeled named entity tag in BIO format.  A blank line is used to seperate sentences.

In [4]:
!head -n 20 train

-DOCSTART-	O
	
EU	I-ORG
rejects	O
German	I-MISC
call	O
to	O
boycott	O
British	I-MISC
lamb	O
.	O
	
Peter	I-PER
Blackburn	I-PER
	
BRUSSELS	I-LOC
1996-08-22	O
	
The	O
European	I-ORG


## Reading in the Data

Below we proivide a bit of code to read in data in the CoNLL format.  This also reads in the filtered GloVe embeddings, to save you some effort - we will discuss this more later.

In [5]:
#Read in the training data
def read_conll_format(filename):
    (words, tags, currentSent, currentTags) = ([],[],['-START-'],['START'])
    for line in open(filename).readlines():
        line = line.strip()
        #print(line)
        if line == "":
            currentSent.append('-END-')
            currentTags.append('END')
            words.append(currentSent)
            tags.append(currentTags)
            (currentSent, currentTags) = (['-START-'], ['START'])
        else:
            (word, tag) = line.split()
            currentSent.append(word)
            currentTags.append(tag)
    return (words, tags)

def sentences2char(sentences):
    return [[['start'] + [c for c in w] + ['end'] for w in l] for l in sentences]


(sentences_train, tags_train) = read_conll_format("train")
(sentences_dev, tags_dev)     = read_conll_format("dev")

print(sentences_train[2])
print(tags_train[2])

sentencesChar = sentences2char(sentences_train)

print(sentencesChar[2])

['-START-', 'Peter', 'Blackburn', '-END-']
['START', 'I-PER', 'I-PER', 'END']
[['start', '-', 'S', 'T', 'A', 'R', 'T', '-', 'end'], ['start', 'P', 'e', 't', 'e', 'r', 'end'], ['start', 'B', 'l', 'a', 'c', 'k', 'b', 'u', 'r', 'n', 'end'], ['start', '-', 'E', 'N', 'D', '-', 'end']]


In [6]:
#Read GloVe embeddings.
def read_GloVe(filename):
    embeddings = {}
    for line in open(filename).readlines():
        #print(line)
        fields = line.strip().split(" ")
        word = fields[0]
        embeddings[word] = [float(x) for x in fields[1:]]
    return embeddings

GloVe = read_GloVe("glove.840B.300d.conll_filtered.txt")

print(GloVe["the"])
print("dimension of glove embedding:", len(GloVe["the"]))

[0.27204, -0.06203, -0.1884, 0.023225, -0.018158, 0.0067192, -0.13877, 0.17708, 0.17709, 2.5882, -0.35179, -0.17312, 0.43285, -0.10708, 0.15006, -0.19982, -0.19093, 1.1871, -0.16207, -0.23538, 0.003664, -0.19156, -0.085662, 0.039199, -0.066449, -0.04209, -0.19122, 0.011679, -0.37138, 0.21886, 0.0011423, 0.4319, -0.14205, 0.38059, 0.30654, 0.020167, -0.18316, -0.0065186, -0.0080549, -0.12063, 0.027507, 0.29839, -0.22896, -0.22882, 0.14671, -0.076301, -0.1268, -0.0066651, -0.052795, 0.14258, 0.1561, 0.05551, -0.16149, 0.09629, -0.076533, -0.049971, -0.010195, -0.047641, -0.16679, -0.2394, 0.0050141, -0.049175, 0.013338, 0.41923, -0.10104, 0.015111, -0.077706, -0.13471, 0.119, 0.10802, 0.21061, -0.051904, 0.18527, 0.17856, 0.041293, -0.014385, -0.082567, -0.035483, -0.076173, -0.045367, 0.089281, 0.33672, -0.22099, -0.0067275, 0.23983, -0.23147, -0.88592, 0.091297, -0.012123, 0.013233, -0.25799, -0.02972, 0.016754, 0.01369, 0.32377, 0.039546, 0.042114, -0.088243, 0.30318, 0.087747, 0.1634

## Mapping Tokens to Indices

As in the last project, we will need to convert words in the dataset to numeric indices, so they can be presented as input to a neural network.  Code to handle this for you with sample usage is provided below.

In [7]:
#Create mappings between tokens and indices.

from collections import Counter
import random

#Will need this later to remove 50% of words that only appear once in the training data from the vocabulary (and don't have GloVe embeddings).
wordCounts = Counter([w for l in sentences_train for w in l])
charCounts = Counter([c for l in sentences_train for w in l for c in w])
singletons = set([w for (w,c) in wordCounts.items() if c == 1 and not w in GloVe.keys()])
charSingletons = set([w for (w,c) in charCounts.items() if c == 1])

#Build dictionaries to map from words, characters to indices and vice versa.
#Save first two words in the vocabulary for padding and "UNK" token.
word2i = {w:i+2 for i,w in enumerate(set([w for l in sentences_train for w in l] + list(GloVe.keys())))}
char2i = {w:i+2 for i,w in enumerate(set([c for l in sentencesChar for w in l for c in w]))}
i2word = {i:w for w,i in word2i.items()}
i2char = {i:w for w,i in char2i.items()}

vocab_size = max(word2i.values()) + 1
char_vocab_size = max(char2i.values()) + 1

#Tag dictionaries.
tag2i = {w:i for i,w in enumerate(set([t for l in tags_train for t in l]))}
i2tag = {i:t for t,i in tag2i.items()}

#When training, randomly replace singletons with UNK tokens sometimes to simulate situation at test time.
def getDictionaryRandomUnk(w, dictionary, train=False):
    if train and (w in singletons and random.random() > 0.5):
        return 1
    else:
        return dictionary.get(w, 1)

#Map a list of sentences from words to indices.
def sentences2indices(words, dictionary, train=False):
    #1.0 => UNK
    return [[getDictionaryRandomUnk(w,dictionary, train=train) for w in l] for l in words]

#Map a list of sentences containing to indices (character indices)
def sentences2indicesChar(chars, dictionary):
    #1.0 => UNK
    return [[[dictionary.get(c,1) for c in w] for w in l] for l in chars]

#Indices
X       = sentences2indices(sentences_train, word2i, train=True)
X_char  = sentences2indicesChar(sentencesChar, char2i)
Y       = sentences2indices(tags_train, tag2i)

print("vocab size:", vocab_size)
print("char vocab size:", char_vocab_size)
print()

print("index of word 'the':", word2i["the"])
print("word of index 253:", i2word[253])
print()

#Print out some examples of what the dev inputs will look like
for i in range(10):
    print(" ".join([i2word.get(w,'UNK') for w in X[i]]))

vocab size: 29148
char vocab size: 88

index of word 'the': 7265
word of index 253: 15:07.57

-START- -DOCSTART- -END-
-START- EU rejects German call to boycott British lamb . -END-
-START- Peter Blackburn -END-
-START- BRUSSELS 1996-08-22 -END-
-START- The European Commission said on Thursday it disagreed with German advice to consumers to shun British lamb until scientists determine whether mad cow disease can be transmitted to sheep . -END-
-START- Germany 's representative to the European Union 's veterinary committee Werner Zwingmann said on Wednesday consumers should buy sheepmeat from countries other than Britain until the scientific advice was clearer . -END-
-START- " We do n't support any such recommendation because we do n't see any grounds for it , " the Commission 's chief spokesman Nikolaus van der Pas told a news briefing . -END-
-START- He said further scientific study was required and if it was found that action was needed it should be taken by the European Union . -EN

In [8]:
[print(i,i2word[i],word2i[i2word[i]]) for i in range(2,10)]

2 location 2
3 4.252 3
4 turning 4
5 8982 5
6 +27 6
7 Preston 7
8 1-1 8
9 Cowboys 9


[None, None, None, None, None, None, None, None]

## Padding and Batching

In this assignment, you should train your models using minibatched SGD, rather than using a batch size of 1 as we did in the previous project.  When presenting multiple sentences to the network at the same time, we will need to pad them to be of the same length. We use [torch.nn.utils.rnn.pad_sequence](https://pytorch.org/docs/stable/generated/torch.nn.utils.rnn.pad_sequence.html) to do so.

Below we provide some code to prepare batches of data to present to the network.  pad the sequence so that all the sequences have the same length.

**Side Note:** PyTorch includes utilities in [`torch.utils.data`](https://pytorch.org/docs/stable/data.html) to help with padding, batching, shuffling and some other things, but for this assignment we will do everything from scratch to help you see exactly how this works.

In [9]:
#Pad inputs to max sequence length (for batching)
def prepare_input(X_list):
    X_padded = torch.nn.utils.rnn.pad_sequence([torch.as_tensor(l) for l in X_list], batch_first=True).type(torch.LongTensor) # padding the sequences with 0
    X_mask   = torch.nn.utils.rnn.pad_sequence([torch.as_tensor([1.0] * len(l)) for l in X_list], batch_first=True).type(torch.FloatTensor) # consisting of 0 and 1, 0 for padded positions, 1 for non-padded positions
    return (X_padded, X_mask)

#Maximum word length (for character representations)
MAX_CLEN=32

def prepare_input_char(X_list):
    MAX_SLEN = max([len(l) for l in X_list])
    X_padded  = [l + [[]]*(MAX_SLEN-len(l))  for l in X_list]
    X_padded  = [[w[0:MAX_CLEN] for w in l] for l in X_padded]
    X_padded  = [[w + [1]*(MAX_CLEN-len(w)) for w in l] for l in X_padded]
    return torch.as_tensor(X_padded).type(torch.LongTensor)

#Pad outputs using one-hot encoding
def prepare_output_onehot(Y_list, NUM_TAGS=max(tag2i.values())+1):
    Y_onehot = [torch.zeros(len(l), NUM_TAGS) for l in Y_list]
    for i in range(len(Y_list)):
        for j in range(len(Y_list[i])):
            Y_onehot[i][j,Y_list[i][j]] = 1.0
    Y_padded = torch.nn.utils.rnn.pad_sequence(Y_onehot, batch_first=True).type(torch.FloatTensor)
    return Y_padded

print("max slen:", max([len(x) for x in X_char]))  #Max sequence length in the training data is 39.

(X_padded, X_mask) = prepare_input(X)
X_padded_char      = prepare_input_char(X_char)
Y_onehot           = prepare_output_onehot(Y)

print("X_padded:", X_padded.shape)
print("X_mask:", X_mask.shape)
print("X_padded_char:", X_padded_char.shape)
print("Y_onehot:", Y_onehot.shape)

max slen: 115
X_padded: torch.Size([14987, 115])
X_mask: torch.Size([14987, 115])
X_padded_char: torch.Size([14987, 115, 32])
Y_onehot: torch.Size([14987, 115, 10])


In [10]:
input = prepare_input(X[0:5])[0]
print("input: ", input.shape)

input:  torch.Size([5, 32])


## **Your code starts here:** Basic LSTM Tagger (10 points)

OK, now you should have everything you need to get started.

Recall that your goal is to to implement the BiLSTM-CNN-CRF, as described in [(Ma and Hovy, 2016)](https://www.aclweb.org/anthology/P16-1101.pdf).  This is a relatively complex network with various components.  Below we provide starter code to break down your implementation into increasingly complex versions of the final model, starting with a Basic LSTM tagger.  This way you can be confident that each part is working correctly before incrementally increasing the complexity of your implementation.  This is generally a good approach to take when implementing complex models, since buggy PyTorch code is often partially working, but produces worse results than a correct implementation, so it's hard to know whether added complexities are helping or hurting.  Also, if you aren't able to match published results it's hard to know which component of your model has the problem (or even whether or not it is a problem in the published result!)

Fill in the functions marked as `TODO` in the code block below.  If everything is working correctly, you should be able to achieve an **F1 score of 0.87 on the dev set and 0.83 on the test set (with GloVe embeddings)**. You are required to initialize word embeddings with GloVe later, but you can randomly initialize the word embeddings in the beginning.

In [11]:
class BasicLSTMtagger(nn.Module):
    def __init__(self, DIM_EMB=10, DIM_HID=10, VOCAB_SIZE=29148, debug=False):
        super(BasicLSTMtagger, self).__init__()
        NUM_TAGS = max(tag2i.values())+1
        (self.DIM_EMB, self.NUM_TAGS) = (DIM_EMB, NUM_TAGS)

        #TODO: initialize parameters - embedding layer, nn.LSTM, nn.Linear and nn.LogSoftmax
        bidirectional=True
        in_features = DIM_HID*2 if bidirectional else DIM_HID

        self.word_embeddings = nn.Embedding(num_embeddings=VOCAB_SIZE,embedding_dim=DIM_EMB)
        self.lstm = nn.LSTM(input_size=DIM_EMB, hidden_size=DIM_HID, batch_first=True, bidirectional=bidirectional)
        self.hidden2tag = nn.Linear(in_features=in_features, out_features=NUM_TAGS)

        self.debug = debug
        if self.debug:
          print('VOCAB_SIZE: ' + str(VOCAB_SIZE))
          print('NUM_TAGS: '  +str(NUM_TAGS))
          print('DIM_EMB: ' + str(DIM_EMB))
          print('DIM_HID: ' + str(DIM_HID))
          print('bidirectional?: ' + str(bidirectional) + '\n')

    def forward(self, embeds, train=False):
        # X is X_padded from prepare_input()
        #TODO: Implement the forward computation.
        # embeds = self.word_embeddings(X)
        # if self.debug:
        #   print('embeds: ' + str(embeds.shape))

        lstm_out,_ = self.lstm(embeds)
        if self.debug:
          print('lstm_out: ' + str(lstm_out.shape))

        tag_space = self.hidden2tag(lstm_out)
        if self.debug:
          print('tag_space: ' + str(tag_space.shape))

        tag_scores = torch.nn.functional.log_softmax(tag_space,dim=2)
        if self.debug:
          print('tag_scores: ' + str(tag_scores.shape) + '\n')

        return tag_scores
        #return torch.randn((X.shape[0], X.shape[1], self.NUM_TAGS))  #Random baseline.

    def init_glove(self, GloVe):
        #TODO: initialize word embeddings using GloVe (you can skip this part in your first version, if you want, see instructions below).
        for i in range(2,self.word_embeddings.num_embeddings):
          word = i2word[i]
          if word not in singletons:
            try: 
              self.word_embeddings.weight[i,:] = torch.tensor(GloVe[word])
            except KeyError:
              if self.debug:
                print(i,word,'neither in singletons nor GloVe')
        pass

    def inference(self, sentences):
        X       = prepare_input(sentences2indices(sentences, word2i))[0].cuda()
        embed = self.word_embeddings(X)
        pred = self.forward(embed).argmax(dim=2)
        return [[i2tag[pred[i,j].item()] for j in range(len(sentences[i]))] for i in range(len(sentences))]

    def print_predictions(self, words, tags):
        Y_pred = self.inference(words)
        for i in range(len(words)):
            print("----------------------------")
            print(" ".join([f"{words[i][j]}/{Y_pred[i][j]}/{tags[i][j]}" for j in range(len(words[i]))]))
            print("Predicted:\t", Y_pred[i])
            print("Gold:\t\t", tags[i])

    def write_predictions(self, sentences, outFile):
        fOut = open(outFile, 'w')
        for s in sentences:
            y = self.inference([s])[0]
            #print("\n".join(y[1:len(y)-1]))
            fOut.write("\n".join(y[1:len(y)-1]))  #Skip start and end tokens
            fOut.write("\n\n")

#The following code will initialize a model and test that your forward computation runs without errors.
lstm_test   = BasicLSTMtagger(DIM_HID=7, DIM_EMB=300)
word_embeds = lstm_test.word_embeddings(prepare_input(X[0:5])[0])
lstm_output = lstm_test.forward(word_embeds) # torch.Size([5, 32])
Y_onehot    = prepare_output_onehot(Y[0:5])

#Check the shape of the lstm_output and one-hot label tensors.
print("lstm output shape:", lstm_output.shape)
print("Y onehot shape:", Y_onehot.shape, "\n")

lstm output shape: torch.Size([5, 32, 10])
Y onehot shape: torch.Size([5, 32, 10]) 



In [12]:
#Read in the data

(sentences_dev, tags_dev)     = read_conll_format('dev')
(sentences_train, tags_train) = read_conll_format('train')
(sentences_test, tags_test)   = read_conll_format('test')

# Train your Model (10 points)

Next, implement the function below to train your basic BiLSTM tagger.  See [torch.nn.lstm](https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html).  Make sure to save your predictions on the test set (`test_pred_lstm.txt`) for submission to GradeScope. Feel free to change number of epochs, optimizer, learning rate and batch size.

In [19]:
from random import sample
import tqdm
import os
import subprocess
import random

def shuffle_sentences(sentences, tags):
    shuffled_sentences = []
    shuffled_tags      = []
    indices = list(range(len(sentences)))
    random.shuffle(indices)
    for i in indices:
        shuffled_sentences.append(sentences[i])
        shuffled_tags.append(tags[i])
    return (shuffled_sentences, shuffled_tags)

In [106]:
#Training
nEpochs = 100

def train_basic_lstm(sentences, tags, lstm, glove):
  #TODO: initialize optimizer
    #loss_function = nn.NLLLoss()
    with torch.no_grad():
      lstm.init_glove(glove)

    optimizer = optim.Adadelta(lstm.parameters(),lr=0.5) #lr=0.1)
    #optimizer = optim.SGD(lstm.parameters(),lr=0.1)
    batchSize = 50

    for epoch in range(nEpochs):
        totalLoss = 0.0

        (sentences_shuffled, tags_shuffled) = shuffle_sentences(sentences, tags)
        if lstm.debug:
          print('sentences shuffled: ' + str(len(sentences_shuffled)))
          print('tags shuffled: ' + str(len(tags_shuffled)))
        for batch in tqdm.notebook.tqdm(range(0, len(sentences), batchSize), leave=False):
            lstm.zero_grad()
            #TODO: Implement gradient update.
              # https://pytorch.org/tutorials/beginner/nlp/sequence_models_tutorial.html

            X_batch = sentences2indices(sentences_shuffled[batch:batch+batchSize], word2i, train=True) #train=True
            Y_batch = sentences2indices(tags_shuffled[batch:batch+batchSize], tag2i)
            if lstm.debug:
              print('len(X_batch): ' + str(len(X_batch)))

            X_batch_prepared = prepare_input(X_batch)[0].cuda() # [0] returns padded sequence, [1] returns mask
            Y_batch_onehot   = prepare_output_onehot(Y_batch).cuda() #.argmax(dim=1)
            if lstm.debug:
              print('X_batch_prepared: ' + str(X_batch_prepared.shape))
              print('Y_onehot: ' + str(Y_batch_onehot.shape))

            word_embeds = lstm.word_embeddings(X_batch_prepared)
            if lstm.debug:
              print('word_embeds: ' + str(word_embeds.shape))
            pred = lstm.forward(word_embeds)
            if lstm.debug:
              print('pred: ' + str(pred.shape))
            
            #loss = loss_function(pred, Y_batch_onehot)
            loss = torch.einsum('bij,bij',torch.neg(pred),Y_batch_onehot) / batchSize
            loss.backward()
            optimizer.step()
            totalLoss += loss

        print(f"loss on epoch {epoch} = {totalLoss}")
        lstm.write_predictions(sentences_dev, 'dev_pred')   #Performance on dev set
        print('conlleval:')
        print(subprocess.Popen('paste dev dev_pred | perl conlleval.pl -d "\t"', shell=True, stdout=subprocess.PIPE,stderr=subprocess.STDOUT).communicate()[0].decode('UTF-8'))

        if epoch % 10 == 0:
            s = sample(range(len(sentences_dev)), 5)
            lstm.print_predictions([sentences_dev[i] for i in s], [tags_dev[i] for i in s])

torch.manual_seed(2)

lstm = BasicLSTMtagger(DIM_HID=500, DIM_EMB=300, debug=False).cuda()
train_basic_lstm(sentences_train, tags_train, lstm, GloVe)

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 0 = 871.011474609375
conlleval:
processed 51578 tokens with 5942 phrases; found: 6081 phrases; correct: 4922.
accuracy:  97.05%; precision:  80.94%; recall:  82.83%; FB1:  81.88
              LOC: precision:  90.16%; recall:  86.34%; FB1:  88.21  1759
             MISC: precision:  78.01%; recall:  73.10%; FB1:  75.48  864
              ORG: precision:  64.40%; recall:  75.69%; FB1:  69.59  1576
              PER: precision:  87.51%; recall:  89.41%; FB1:  88.45  1882

----------------------------
-START-/START/START Chinese/I-MISC/I-MISC police/O/O have/O/O detained/O/O veteran/O/O dissident/O/O Wang/I-PER/I-PER Donghai/I-PER/I-PER ,/O/O the/O/O New/I-ORG/I-MISC York-based/I-ORG/I-MISC pressure/O/O group/O/O Human/I-ORG/I-ORG Rights/I-ORG/I-ORG in/O/I-ORG China/I-LOC/I-ORG said/O/O on/O/O Saturday/O/O ./O/O -END-/END/END
Predicted:	 ['START', 'I-MISC', 'O', 'O', 'O', 'O', 'O', 'I-PER', 'I-PER', 'O', 'O', 'I-ORG', 'I-ORG', 'O', 'O', 'I-ORG', 'I-ORG', 'O', 'I-LOC', 'O', 'O

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 1 = 334.00885009765625
conlleval:
processed 51578 tokens with 5942 phrases; found: 5861 phrases; correct: 5063.
accuracy:  97.75%; precision:  86.38%; recall:  85.21%; FB1:  85.79
              LOC: precision:  91.36%; recall:  89.82%; FB1:  90.58  1806
             MISC: precision:  80.17%; recall:  81.13%; FB1:  80.65  933
              ORG: precision:  77.54%; recall:  74.65%; FB1:  76.06  1291
              PER: precision:  90.88%; recall:  90.34%; FB1:  90.61  1831



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 2 = 235.54644775390625
conlleval:
processed 51578 tokens with 5942 phrases; found: 5964 phrases; correct: 5257.
accuracy:  98.22%; precision:  88.15%; recall:  88.47%; FB1:  88.31
              LOC: precision:  93.27%; recall:  90.53%; FB1:  91.88  1783
             MISC: precision:  83.26%; recall:  81.45%; FB1:  82.35  902
              ORG: precision:  79.25%; recall:  83.15%; FB1:  81.15  1407
              PER: precision:  92.31%; recall:  93.81%; FB1:  93.05  1872



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 3 = 178.70046997070312
conlleval:
processed 51578 tokens with 5942 phrases; found: 6043 phrases; correct: 5344.
accuracy:  98.42%; precision:  88.43%; recall:  89.94%; FB1:  89.18
              LOC: precision:  92.47%; recall:  93.58%; FB1:  93.02  1859
             MISC: precision:  80.04%; recall:  84.38%; FB1:  82.15  972
              ORG: precision:  82.84%; recall:  82.10%; FB1:  82.47  1329
              PER: precision:  92.72%; recall:  94.79%; FB1:  93.74  1883



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 4 = 135.9063262939453
conlleval:
processed 51578 tokens with 5942 phrases; found: 6040 phrases; correct: 5339.
accuracy:  98.32%; precision:  88.39%; recall:  89.85%; FB1:  89.12
              LOC: precision:  94.05%; recall:  92.11%; FB1:  93.07  1799
             MISC: precision:  81.98%; recall:  83.41%; FB1:  82.69  938
              ORG: precision:  79.62%; recall:  84.79%; FB1:  82.12  1428
              PER: precision:  92.85%; recall:  94.52%; FB1:  93.68  1875



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 5 = 103.84879302978516
conlleval:
processed 51578 tokens with 5942 phrases; found: 6003 phrases; correct: 5328.
accuracy:  98.30%; precision:  88.76%; recall:  89.67%; FB1:  89.21
              LOC: precision:  95.18%; recall:  90.36%; FB1:  92.71  1744
             MISC: precision:  78.12%; recall:  85.57%; FB1:  81.68  1010
              ORG: precision:  83.09%; recall:  83.52%; FB1:  83.30  1348
              PER: precision:  92.53%; recall:  95.49%; FB1:  93.99  1901



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 6 = 78.50308990478516
conlleval:
processed 51578 tokens with 5942 phrases; found: 6055 phrases; correct: 5394.
accuracy:  98.35%; precision:  89.08%; recall:  90.78%; FB1:  89.92
              LOC: precision:  94.04%; recall:  94.50%; FB1:  94.27  1846
             MISC: precision:  84.09%; recall:  84.82%; FB1:  84.45  930
              ORG: precision:  79.82%; recall:  85.23%; FB1:  82.44  1432
              PER: precision:  93.83%; recall:  94.08%; FB1:  93.96  1847



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 7 = 58.689613342285156
conlleval:
processed 51578 tokens with 5942 phrases; found: 6071 phrases; correct: 5360.
accuracy:  98.24%; precision:  88.29%; recall:  90.21%; FB1:  89.24
              LOC: precision:  93.63%; recall:  92.87%; FB1:  93.25  1822
             MISC: precision:  83.82%; recall:  83.73%; FB1:  83.78  921
              ORG: precision:  77.33%; recall:  87.25%; FB1:  81.99  1513
              PER: precision:  94.33%; recall:  92.94%; FB1:  93.63  1815



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 8 = 40.538150787353516
conlleval:
processed 51578 tokens with 5942 phrases; found: 6091 phrases; correct: 5365.
accuracy:  98.24%; precision:  88.08%; recall:  90.29%; FB1:  89.17
              LOC: precision:  93.68%; recall:  92.05%; FB1:  92.86  1805
             MISC: precision:  86.72%; recall:  84.27%; FB1:  85.48  896
              ORG: precision:  77.57%; recall:  87.70%; FB1:  82.32  1516
              PER: precision:  91.84%; recall:  93.43%; FB1:  92.63  1874



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 9 = 29.8436279296875
conlleval:
processed 51578 tokens with 5942 phrases; found: 6046 phrases; correct: 5325.
accuracy:  98.14%; precision:  88.07%; recall:  89.62%; FB1:  88.84
              LOC: precision:  95.29%; recall:  90.36%; FB1:  92.76  1742
             MISC: precision:  79.76%; recall:  86.33%; FB1:  82.92  998
              ORG: precision:  78.36%; recall:  82.62%; FB1:  80.44  1414
              PER: precision:  93.08%; recall:  95.60%; FB1:  94.32  1892



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 10 = 20.353050231933594
conlleval:
processed 51578 tokens with 5942 phrases; found: 6018 phrases; correct: 5357.
accuracy:  98.39%; precision:  89.02%; recall:  90.15%; FB1:  89.58
              LOC: precision:  93.94%; recall:  92.76%; FB1:  93.34  1814
             MISC: precision:  85.94%; recall:  83.51%; FB1:  84.71  896
              ORG: precision:  79.50%; recall:  85.61%; FB1:  82.44  1444
              PER: precision:  93.08%; recall:  94.19%; FB1:  93.63  1864

----------------------------
-START-/START/START Prime/O/O Minister/O/O John/I-PER/I-PER Major/I-PER/I-PER says/O/O the/O/O 300-year-old/I-MISC/O union/O/O of/O/O the/O/O Scottish/I-MISC/I-MISC and/O/O English/I-MISC/I-MISC parliaments/O/O will/O/O be/O/O a/O/O main/O/O plank/O/O in/O/O his/O/O Conservative/I-ORG/I-ORG Party/I-ORG/I-ORG 's/O/O election/O/O platform/O/O ./O/O -END-/END/END
Predicted:	 ['START', 'O', 'O', 'I-PER', 'I-PER', 'O', 'O', 'I-MISC', 'O', 'O', 'O', 'I-MISC', 'O', 'I-MISC', 'O', 'O

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 11 = 14.847429275512695
conlleval:
processed 51578 tokens with 5942 phrases; found: 6029 phrases; correct: 5384.
accuracy:  98.36%; precision:  89.30%; recall:  90.61%; FB1:  89.95
              LOC: precision:  93.94%; recall:  93.74%; FB1:  93.84  1833
             MISC: precision:  85.73%; recall:  85.36%; FB1:  85.54  918
              ORG: precision:  80.98%; recall:  84.12%; FB1:  82.52  1393
              PER: precision:  92.68%; recall:  94.84%; FB1:  93.75  1885



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 12 = 12.1641845703125
conlleval:
processed 51578 tokens with 5942 phrases; found: 6029 phrases; correct: 5340.
accuracy:  98.25%; precision:  88.57%; recall:  89.87%; FB1:  89.22
              LOC: precision:  94.80%; recall:  92.27%; FB1:  93.52  1788
             MISC: precision:  84.15%; recall:  84.06%; FB1:  84.10  921
              ORG: precision:  78.03%; recall:  85.01%; FB1:  81.37  1461
              PER: precision:  93.06%; recall:  93.92%; FB1:  93.49  1859



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 13 = 8.204129219055176
conlleval:
processed 51578 tokens with 5942 phrases; found: 5961 phrases; correct: 5322.
accuracy:  98.39%; precision:  89.28%; recall:  89.57%; FB1:  89.42
              LOC: precision:  95.12%; recall:  91.24%; FB1:  93.14  1762
             MISC: precision:  84.11%; recall:  83.84%; FB1:  83.98  919
              ORG: precision:  80.80%; recall:  85.98%; FB1:  83.31  1427
              PER: precision:  92.82%; recall:  93.38%; FB1:  93.10  1853



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 14 = 6.6463303565979
conlleval:
processed 51578 tokens with 5942 phrases; found: 6070 phrases; correct: 5365.
accuracy:  98.36%; precision:  88.39%; recall:  90.29%; FB1:  89.33
              LOC: precision:  92.72%; recall:  92.22%; FB1:  92.47  1827
             MISC: precision:  82.42%; recall:  84.38%; FB1:  83.39  944
              ORG: precision:  80.39%; recall:  85.91%; FB1:  83.06  1433
              PER: precision:  93.30%; recall:  94.52%; FB1:  93.91  1866



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 15 = 5.4600114822387695
conlleval:
processed 51578 tokens with 5942 phrases; found: 5985 phrases; correct: 5365.
accuracy:  98.48%; precision:  89.64%; recall:  90.29%; FB1:  89.96
              LOC: precision:  94.24%; recall:  92.60%; FB1:  93.41  1805
             MISC: precision:  86.64%; recall:  84.38%; FB1:  85.49  898
              ORG: precision:  80.86%; recall:  85.98%; FB1:  83.34  1426
              PER: precision:  93.37%; recall:  94.08%; FB1:  93.73  1856



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 16 = 4.230954647064209
conlleval:
processed 51578 tokens with 5942 phrases; found: 5980 phrases; correct: 5360.
accuracy:  98.43%; precision:  89.63%; recall:  90.21%; FB1:  89.92
              LOC: precision:  94.53%; recall:  92.27%; FB1:  93.39  1793
             MISC: precision:  86.18%; recall:  83.84%; FB1:  84.99  897
              ORG: precision:  80.35%; recall:  85.68%; FB1:  82.93  1430
              PER: precision:  93.71%; recall:  94.63%; FB1:  94.17  1860



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 17 = 4.07953405380249
conlleval:
processed 51578 tokens with 5942 phrases; found: 6083 phrases; correct: 5399.
accuracy:  98.37%; precision:  88.76%; recall:  90.86%; FB1:  89.80
              LOC: precision:  94.72%; recall:  92.81%; FB1:  93.76  1800
             MISC: precision:  83.00%; recall:  85.25%; FB1:  84.11  947
              ORG: precision:  79.14%; recall:  86.58%; FB1:  82.69  1467
              PER: precision:  93.47%; recall:  94.84%; FB1:  94.15  1869



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 18 = 3.3729612827301025
conlleval:
processed 51578 tokens with 5942 phrases; found: 6068 phrases; correct: 5377.
accuracy:  98.34%; precision:  88.61%; recall:  90.49%; FB1:  89.54
              LOC: precision:  94.36%; recall:  92.87%; FB1:  93.61  1808
             MISC: precision:  82.74%; recall:  85.25%; FB1:  83.97  950
              ORG: precision:  79.12%; recall:  85.61%; FB1:  82.23  1451
              PER: precision:  93.44%; recall:  94.30%; FB1:  93.87  1859



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 19 = 2.98640513420105
conlleval:
processed 51578 tokens with 5942 phrases; found: 6022 phrases; correct: 5389.
accuracy:  98.46%; precision:  89.49%; recall:  90.69%; FB1:  90.09
              LOC: precision:  94.71%; recall:  92.60%; FB1:  93.64  1796
             MISC: precision:  83.03%; recall:  84.92%; FB1:  83.97  943
              ORG: precision:  81.82%; recall:  85.91%; FB1:  83.81  1408
              PER: precision:  93.49%; recall:  95.17%; FB1:  94.32  1875



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 20 = 2.5329885482788086
conlleval:
processed 51578 tokens with 5942 phrases; found: 5980 phrases; correct: 5364.
accuracy:  98.41%; precision:  89.70%; recall:  90.27%; FB1:  89.98
              LOC: precision:  94.33%; recall:  93.36%; FB1:  93.84  1818
             MISC: precision:  88.45%; recall:  82.21%; FB1:  85.22  857
              ORG: precision:  79.82%; recall:  86.13%; FB1:  82.86  1447
              PER: precision:  93.43%; recall:  94.25%; FB1:  93.84  1858

----------------------------
-START-/START/START MADRID/I-LOC/I-LOC 1996-08-30/O/O -END-/END/END
Predicted:	 ['START', 'I-LOC', 'O', 'END']
Gold:		 ['START', 'I-LOC', 'O', 'END']
----------------------------
-START-/START/START At/O/O California/I-LOC/I-LOC ,/O/O Tino/I-PER/I-PER Martinez/I-PER/I-PER 's/O/O two-run/O/O homer/O/O keyed/O/O a/O/O three-run/O/O first/O/O and/O/O Andy/I-PER/I-PER Pettitte/I-PER/I-PER became/O/O the/O/O league/O/O 's/O/O first/O/O 19-game/O/O winner/O/O as/O/O the/O/O New/I-O

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 21 = 2.3236911296844482
conlleval:
processed 51578 tokens with 5942 phrases; found: 5991 phrases; correct: 5370.
accuracy:  98.46%; precision:  89.63%; recall:  90.37%; FB1:  90.00
              LOC: precision:  95.40%; recall:  92.60%; FB1:  93.98  1783
             MISC: precision:  82.17%; recall:  83.95%; FB1:  83.05  942
              ORG: precision:  82.22%; recall:  86.58%; FB1:  84.34  1412
              PER: precision:  93.53%; recall:  94.14%; FB1:  93.83  1854



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 22 = 1.84418523311615
conlleval:
processed 51578 tokens with 5942 phrases; found: 6038 phrases; correct: 5375.
accuracy:  98.39%; precision:  89.02%; recall:  90.46%; FB1:  89.73
              LOC: precision:  94.49%; recall:  92.49%; FB1:  93.48  1798
             MISC: precision:  83.39%; recall:  83.84%; FB1:  83.61  927
              ORG: precision:  80.01%; recall:  86.28%; FB1:  83.03  1446
              PER: precision:  93.52%; recall:  94.79%; FB1:  94.15  1867



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 23 = 1.6688551902770996
conlleval:
processed 51578 tokens with 5942 phrases; found: 6037 phrases; correct: 5367.
accuracy:  98.38%; precision:  88.90%; recall:  90.32%; FB1:  89.61
              LOC: precision:  94.23%; recall:  92.43%; FB1:  93.32  1802
             MISC: precision:  83.76%; recall:  84.49%; FB1:  84.13  930
              ORG: precision:  80.08%; recall:  85.46%; FB1:  82.68  1431
              PER: precision:  93.06%; recall:  94.68%; FB1:  93.86  1874



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 24 = 1.6579712629318237
conlleval:
processed 51578 tokens with 5942 phrases; found: 6022 phrases; correct: 5376.
accuracy:  98.42%; precision:  89.27%; recall:  90.47%; FB1:  89.87
              LOC: precision:  94.51%; recall:  92.81%; FB1:  93.66  1804
             MISC: precision:  83.59%; recall:  83.41%; FB1:  83.50  920
              ORG: precision:  81.25%; recall:  86.28%; FB1:  83.69  1424
              PER: precision:  93.12%; recall:  94.73%; FB1:  93.92  1874



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 25 = 1.5711638927459717
conlleval:
processed 51578 tokens with 5942 phrases; found: 5985 phrases; correct: 5369.
accuracy:  98.46%; precision:  89.71%; recall:  90.36%; FB1:  90.03
              LOC: precision:  95.14%; recall:  92.71%; FB1:  93.91  1790
             MISC: precision:  84.80%; recall:  83.51%; FB1:  84.15  908
              ORG: precision:  81.25%; recall:  86.28%; FB1:  83.69  1424
              PER: precision:  93.34%; recall:  94.41%; FB1:  93.87  1863



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 26 = 1.8655462265014648
conlleval:
processed 51578 tokens with 5942 phrases; found: 6009 phrases; correct: 5363.
accuracy:  98.40%; precision:  89.25%; recall:  90.26%; FB1:  89.75
              LOC: precision:  95.02%; recall:  92.38%; FB1:  93.68  1786
             MISC: precision:  84.68%; recall:  83.95%; FB1:  84.31  914
              ORG: precision:  79.42%; recall:  86.35%; FB1:  82.74  1458
              PER: precision:  93.68%; recall:  94.14%; FB1:  93.91  1851



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 27 = 1.504820466041565
conlleval:
processed 51578 tokens with 5942 phrases; found: 5979 phrases; correct: 5366.
accuracy:  98.48%; precision:  89.75%; recall:  90.31%; FB1:  90.03
              LOC: precision:  94.21%; recall:  92.92%; FB1:  93.56  1812
             MISC: precision:  85.32%; recall:  83.84%; FB1:  84.57  906
              ORG: precision:  81.96%; recall:  85.38%; FB1:  83.64  1397
              PER: precision:  93.40%; recall:  94.52%; FB1:  93.96  1864



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 28 = 1.2352855205535889
conlleval:
processed 51578 tokens with 5942 phrases; found: 6034 phrases; correct: 5372.
accuracy:  98.42%; precision:  89.03%; recall:  90.41%; FB1:  89.71
              LOC: precision:  94.03%; recall:  92.65%; FB1:  93.34  1810
             MISC: precision:  83.07%; recall:  84.60%; FB1:  83.83  939
              ORG: precision:  80.67%; recall:  86.20%; FB1:  83.35  1433
              PER: precision:  93.63%; recall:  94.14%; FB1:  93.88  1852



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 29 = 1.466690182685852
conlleval:
processed 51578 tokens with 5942 phrases; found: 5969 phrases; correct: 5363.
accuracy:  98.43%; precision:  89.85%; recall:  90.26%; FB1:  90.05
              LOC: precision:  95.39%; recall:  92.38%; FB1:  93.86  1779
             MISC: precision:  84.70%; recall:  84.06%; FB1:  84.38  915
              ORG: precision:  81.64%; recall:  85.91%; FB1:  83.72  1411
              PER: precision:  93.29%; recall:  94.41%; FB1:  93.85  1864



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 30 = 1.4561488628387451
conlleval:
processed 51578 tokens with 5942 phrases; found: 6042 phrases; correct: 5373.
accuracy:  98.38%; precision:  88.93%; recall:  90.42%; FB1:  89.67
              LOC: precision:  94.65%; recall:  92.49%; FB1:  93.56  1795
             MISC: precision:  83.60%; recall:  84.06%; FB1:  83.83  927
              ORG: precision:  79.32%; recall:  85.83%; FB1:  82.45  1451
              PER: precision:  93.53%; recall:  94.90%; FB1:  94.21  1869

----------------------------
-START-/START/START Swansea/I-ORG/I-ORG 4/O/O 1/O/O 0/O/O 3/O/O 4/O/O 9/O/O 3/O/O -END-/END/END
Predicted:	 ['START', 'I-ORG', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'END']
Gold:		 ['START', 'I-ORG', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'END']
----------------------------
-START-/START/START "/O/O I/O/O 'm/O/O an/O/O emotional/O/O player/O/O ,/O/O "/O/O said/O/O the/O/O 104th-ranked/I-MISC/O Tarango/I-PER/I-PER ./O/O "/O/O -END-/END/END
Predicted:	 ['START', 'O', 'O', 'O', 'O', 'O'

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 31 = 1.2785120010375977
conlleval:
processed 51578 tokens with 5942 phrases; found: 6007 phrases; correct: 5374.
accuracy:  98.40%; precision:  89.46%; recall:  90.44%; FB1:  89.95
              LOC: precision:  93.76%; recall:  93.25%; FB1:  93.50  1827
             MISC: precision:  84.62%; recall:  84.16%; FB1:  84.39  917
              ORG: precision:  81.96%; recall:  85.01%; FB1:  83.46  1391
              PER: precision:  93.22%; recall:  94.73%; FB1:  93.97  1872



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 32 = 1.3364728689193726
conlleval:
processed 51578 tokens with 5942 phrases; found: 6012 phrases; correct: 5376.
accuracy:  98.42%; precision:  89.42%; recall:  90.47%; FB1:  89.94
              LOC: precision:  94.84%; recall:  93.09%; FB1:  93.96  1803
             MISC: precision:  82.78%; recall:  84.49%; FB1:  83.63  941
              ORG: precision:  81.44%; recall:  85.09%; FB1:  83.22  1401
              PER: precision:  93.52%; recall:  94.79%; FB1:  94.15  1867



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 33 = 1.370385766029358
conlleval:
processed 51578 tokens with 5942 phrases; found: 5988 phrases; correct: 5357.
accuracy:  98.43%; precision:  89.46%; recall:  90.15%; FB1:  89.81
              LOC: precision:  94.40%; recall:  92.60%; FB1:  93.49  1802
             MISC: precision:  83.87%; recall:  84.06%; FB1:  83.97  924
              ORG: precision:  81.24%; recall:  84.94%; FB1:  83.05  1402
              PER: precision:  93.66%; recall:  94.57%; FB1:  94.11  1860



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 34 = 0.9559427499771118
conlleval:
processed 51578 tokens with 5942 phrases; found: 6040 phrases; correct: 5367.
accuracy:  98.38%; precision:  88.86%; recall:  90.32%; FB1:  89.58
              LOC: precision:  94.59%; recall:  92.27%; FB1:  93.41  1792
             MISC: precision:  83.88%; recall:  83.51%; FB1:  83.70  918
              ORG: precision:  79.12%; recall:  85.91%; FB1:  82.37  1456
              PER: precision:  93.38%; recall:  95.01%; FB1:  94.19  1874



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 35 = 1.0217775106430054
conlleval:
processed 51578 tokens with 5942 phrases; found: 5972 phrases; correct: 5373.
accuracy:  98.49%; precision:  89.97%; recall:  90.42%; FB1:  90.20
              LOC: precision:  94.70%; recall:  93.30%; FB1:  94.00  1810
             MISC: precision:  84.63%; recall:  84.82%; FB1:  84.72  924
              ORG: precision:  82.71%; recall:  85.23%; FB1:  83.95  1382
              PER: precision:  93.43%; recall:  94.14%; FB1:  93.78  1856



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 36 = 1.2006782293319702
conlleval:
processed 51578 tokens with 5942 phrases; found: 5989 phrases; correct: 5362.
accuracy:  98.45%; precision:  89.53%; recall:  90.24%; FB1:  89.88
              LOC: precision:  94.72%; recall:  92.81%; FB1:  93.76  1800
             MISC: precision:  83.08%; recall:  83.62%; FB1:  83.35  928
              ORG: precision:  81.55%; recall:  85.68%; FB1:  83.56  1409
              PER: precision:  93.79%; recall:  94.30%; FB1:  94.04  1852



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 37 = 0.8655781149864197
conlleval:
processed 51578 tokens with 5942 phrases; found: 6005 phrases; correct: 5356.
accuracy:  98.40%; precision:  89.19%; recall:  90.14%; FB1:  89.66
              LOC: precision:  94.54%; recall:  92.32%; FB1:  93.42  1794
             MISC: precision:  83.51%; recall:  83.51%; FB1:  83.51  922
              ORG: precision:  80.51%; recall:  86.28%; FB1:  83.30  1437
              PER: precision:  93.57%; recall:  94.08%; FB1:  93.83  1852



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 38 = 1.1220191717147827
conlleval:
processed 51578 tokens with 5942 phrases; found: 6007 phrases; correct: 5364.
accuracy:  98.43%; precision:  89.30%; recall:  90.27%; FB1:  89.78
              LOC: precision:  94.37%; recall:  93.09%; FB1:  93.72  1812
             MISC: precision:  83.15%; recall:  83.51%; FB1:  83.33  926
              ORG: precision:  81.11%; recall:  85.16%; FB1:  83.08  1408
              PER: precision:  93.61%; recall:  94.57%; FB1:  94.09  1861



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 39 = 0.9642558693885803
conlleval:
processed 51578 tokens with 5942 phrases; found: 6000 phrases; correct: 5353.
accuracy:  98.39%; precision:  89.22%; recall:  90.09%; FB1:  89.65
              LOC: precision:  94.35%; recall:  92.65%; FB1:  93.49  1804
             MISC: precision:  83.84%; recall:  83.84%; FB1:  83.84  922
              ORG: precision:  80.96%; recall:  85.61%; FB1:  83.22  1418
              PER: precision:  93.21%; recall:  93.92%; FB1:  93.56  1856



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 40 = 0.9040539860725403
conlleval:
processed 51578 tokens with 5942 phrases; found: 6012 phrases; correct: 5361.
accuracy:  98.40%; precision:  89.17%; recall:  90.22%; FB1:  89.69
              LOC: precision:  94.66%; recall:  92.65%; FB1:  93.65  1798
             MISC: precision:  82.92%; recall:  84.27%; FB1:  83.59  937
              ORG: precision:  80.52%; recall:  85.09%; FB1:  82.74  1417
              PER: precision:  93.60%; recall:  94.52%; FB1:  94.06  1860

----------------------------
-START-/START/START 8-198/I-ORG/O 9-203/I-ORG/O ./O/O -END-/END/END
Predicted:	 ['START', 'I-ORG', 'I-ORG', 'O', 'END']
Gold:		 ['START', 'O', 'O', 'O', 'END']
----------------------------
-START-/START/START Suu/I-PER/I-PER Kyi/I-PER/I-PER said/O/O at/O/O least/O/O 61/O/O democracy/O/O supporters/O/O had/O/O been/O/O arrested/O/O since/O/O May/O/O ,/O/O and/O/O about/O/O 30/O/O of/O/O them/O/O had/O/O been/O/O sentenced/O/O ,/O/O most/O/O to/O/O long/O/O prison/O/O terms/O/O

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 41 = 1.4405461549758911
conlleval:
processed 51578 tokens with 5942 phrases; found: 5992 phrases; correct: 5358.
accuracy:  98.43%; precision:  89.42%; recall:  90.17%; FB1:  89.79
              LOC: precision:  94.56%; recall:  92.81%; FB1:  93.68  1803
             MISC: precision:  84.56%; recall:  83.73%; FB1:  84.14  913
              ORG: precision:  80.80%; recall:  86.28%; FB1:  83.45  1432
              PER: precision:  93.49%; recall:  93.59%; FB1:  93.54  1844



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 42 = 0.9048216938972473
conlleval:
processed 51578 tokens with 5942 phrases; found: 5971 phrases; correct: 5347.
accuracy:  98.44%; precision:  89.55%; recall:  89.99%; FB1:  89.77
              LOC: precision:  94.84%; recall:  92.11%; FB1:  93.45  1784
             MISC: precision:  83.93%; recall:  83.84%; FB1:  83.88  921
              ORG: precision:  81.26%; recall:  85.38%; FB1:  83.27  1409
              PER: precision:  93.54%; recall:  94.30%; FB1:  93.92  1857



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 43 = 0.8540564179420471
conlleval:
processed 51578 tokens with 5942 phrases; found: 5985 phrases; correct: 5360.
accuracy:  98.42%; precision:  89.56%; recall:  90.21%; FB1:  89.88
              LOC: precision:  94.96%; recall:  92.22%; FB1:  93.57  1784
             MISC: precision:  85.46%; recall:  83.51%; FB1:  84.48  901
              ORG: precision:  80.55%; recall:  85.53%; FB1:  82.97  1424
              PER: precision:  93.23%; recall:  94.95%; FB1:  94.08  1876



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 44 = 1.3882941007614136
conlleval:
processed 51578 tokens with 5942 phrases; found: 6011 phrases; correct: 5360.
accuracy:  98.38%; precision:  89.17%; recall:  90.21%; FB1:  89.68
              LOC: precision:  94.23%; recall:  92.49%; FB1:  93.35  1803
             MISC: precision:  83.50%; recall:  83.95%; FB1:  83.72  927
              ORG: precision:  80.42%; recall:  86.06%; FB1:  83.14  1435
              PER: precision:  93.88%; recall:  94.08%; FB1:  93.98  1846



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 45 = 1.182590365409851
conlleval:
processed 51578 tokens with 5942 phrases; found: 6021 phrases; correct: 5367.
accuracy:  98.40%; precision:  89.14%; recall:  90.32%; FB1:  89.73
              LOC: precision:  94.40%; recall:  92.60%; FB1:  93.49  1802
             MISC: precision:  82.85%; recall:  83.84%; FB1:  83.34  933
              ORG: precision:  81.01%; recall:  86.20%; FB1:  83.53  1427
              PER: precision:  93.44%; recall:  94.30%; FB1:  93.87  1859



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 46 = 0.9830508828163147
conlleval:
processed 51578 tokens with 5942 phrases; found: 6004 phrases; correct: 5367.
accuracy:  98.44%; precision:  89.39%; recall:  90.32%; FB1:  89.85
              LOC: precision:  94.33%; recall:  93.36%; FB1:  93.84  1818
             MISC: precision:  83.86%; recall:  83.41%; FB1:  83.63  917
              ORG: precision:  81.29%; recall:  85.53%; FB1:  83.36  1411
              PER: precision:  93.43%; recall:  94.25%; FB1:  93.84  1858



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 47 = 0.9439864754676819
conlleval:
processed 51578 tokens with 5942 phrases; found: 6048 phrases; correct: 5365.
accuracy:  98.36%; precision:  88.71%; recall:  90.29%; FB1:  89.49
              LOC: precision:  94.12%; recall:  93.30%; FB1:  93.71  1821
             MISC: precision:  82.92%; recall:  83.73%; FB1:  83.32  931
              ORG: precision:  79.54%; recall:  85.23%; FB1:  82.29  1437
              PER: precision:  93.38%; recall:  94.25%; FB1:  93.81  1859



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 48 = 0.876814067363739
conlleval:
processed 51578 tokens with 5942 phrases; found: 5982 phrases; correct: 5340.
accuracy:  98.38%; precision:  89.27%; recall:  89.87%; FB1:  89.57
              LOC: precision:  94.70%; recall:  92.32%; FB1:  93.50  1791
             MISC: precision:  84.14%; recall:  83.41%; FB1:  83.77  914
              ORG: precision:  80.17%; recall:  85.31%; FB1:  82.66  1427
              PER: precision:  93.57%; recall:  93.97%; FB1:  93.77  1850



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 49 = 0.9531188607215881
conlleval:
processed 51578 tokens with 5942 phrases; found: 5990 phrases; correct: 5358.
accuracy:  98.39%; precision:  89.45%; recall:  90.17%; FB1:  89.81
              LOC: precision:  94.98%; recall:  92.71%; FB1:  93.83  1793
             MISC: precision:  83.80%; recall:  83.62%; FB1:  83.71  920
              ORG: precision:  80.75%; recall:  85.38%; FB1:  83.00  1418
              PER: precision:  93.54%; recall:  94.41%; FB1:  93.97  1859



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 50 = 1.051814317703247
conlleval:
processed 51578 tokens with 5942 phrases; found: 6038 phrases; correct: 5354.
accuracy:  98.33%; precision:  88.67%; recall:  90.10%; FB1:  89.38
              LOC: precision:  94.35%; recall:  92.76%; FB1:  93.55  1806
             MISC: precision:  81.24%; recall:  83.62%; FB1:  82.42  949
              ORG: precision:  79.79%; recall:  85.38%; FB1:  82.49  1435
              PER: precision:  93.83%; recall:  94.14%; FB1:  93.98  1848

----------------------------
-START-/START/START --/O/O Amra/I-PER/I-PER Kevic/I-PER/I-PER ,/O/O Belgrade/I-LOC/I-LOC newsroom/O/O +381/O/O 11/O/O 2224305/O/O -END-/END/END
Predicted:	 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'O', 'O', 'END']
----------------------------
-START-/START/START 9-0-49-1/O/O ,/O/O Muralitharan/I-PER/I-PER 10-0-41-2/O/O ,/O/O Jayasuriya/I-PER/I-PER 10-0-43-2/O/O ,/O/O Chandana/I-PE

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 51 = 0.6993301510810852
conlleval:
processed 51578 tokens with 5942 phrases; found: 5980 phrases; correct: 5366.
accuracy:  98.48%; precision:  89.73%; recall:  90.31%; FB1:  90.02
              LOC: precision:  94.66%; recall:  92.71%; FB1:  93.67  1799
             MISC: precision:  83.46%; recall:  83.73%; FB1:  83.60  925
              ORG: precision:  82.48%; recall:  85.31%; FB1:  83.87  1387
              PER: precision:  93.47%; recall:  94.84%; FB1:  94.15  1869



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 52 = 0.8044747710227966
conlleval:
processed 51578 tokens with 5942 phrases; found: 5989 phrases; correct: 5353.
accuracy:  98.40%; precision:  89.38%; recall:  90.09%; FB1:  89.73
              LOC: precision:  94.39%; recall:  92.43%; FB1:  93.40  1799
             MISC: precision:  84.24%; recall:  84.06%; FB1:  84.15  920
              ORG: precision:  80.93%; recall:  85.46%; FB1:  83.13  1416
              PER: precision:  93.53%; recall:  94.14%; FB1:  93.83  1854



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 53 = 0.5771037340164185
conlleval:
processed 51578 tokens with 5942 phrases; found: 5956 phrases; correct: 5362.
accuracy:  98.47%; precision:  90.03%; recall:  90.24%; FB1:  90.13
              LOC: precision:  94.64%; recall:  93.20%; FB1:  93.91  1809
             MISC: precision:  85.62%; recall:  83.30%; FB1:  84.44  897
              ORG: precision:  82.06%; recall:  85.61%; FB1:  83.80  1399
              PER: precision:  93.68%; recall:  94.14%; FB1:  93.91  1851



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 54 = 1.0752768516540527
conlleval:
processed 51578 tokens with 5942 phrases; found: 5974 phrases; correct: 5356.
accuracy:  98.45%; precision:  89.66%; recall:  90.14%; FB1:  89.90
              LOC: precision:  94.44%; recall:  92.54%; FB1:  93.48  1800
             MISC: precision:  84.48%; recall:  83.84%; FB1:  84.16  915
              ORG: precision:  81.75%; recall:  85.16%; FB1:  83.42  1397
              PER: precision:  93.50%; recall:  94.52%; FB1:  94.01  1862



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 55 = 0.7342816591262817
conlleval:
processed 51578 tokens with 5942 phrases; found: 6044 phrases; correct: 5360.
accuracy:  98.31%; precision:  88.68%; recall:  90.21%; FB1:  89.44
              LOC: precision:  94.22%; recall:  93.14%; FB1:  93.68  1816
             MISC: precision:  83.93%; recall:  83.30%; FB1:  83.61  915
              ORG: precision:  78.73%; recall:  85.31%; FB1:  81.89  1453
              PER: precision:  93.39%; recall:  94.30%; FB1:  93.84  1860



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 56 = 0.8086504936218262
conlleval:
processed 51578 tokens with 5942 phrases; found: 5954 phrases; correct: 5341.
accuracy:  98.43%; precision:  89.70%; recall:  89.89%; FB1:  89.79
              LOC: precision:  94.40%; recall:  92.71%; FB1:  93.55  1804
             MISC: precision:  85.02%; recall:  83.73%; FB1:  84.37  908
              ORG: precision:  81.73%; recall:  85.09%; FB1:  83.38  1396
              PER: precision:  93.45%; recall:  93.65%; FB1:  93.55  1846



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 57 = 0.6348341703414917
conlleval:
processed 51578 tokens with 5942 phrases; found: 5967 phrases; correct: 5341.
accuracy:  98.42%; precision:  89.51%; recall:  89.89%; FB1:  89.70
              LOC: precision:  94.36%; recall:  92.81%; FB1:  93.58  1807
             MISC: precision:  82.67%; recall:  83.84%; FB1:  83.25  935
              ORG: precision:  82.27%; recall:  84.79%; FB1:  83.51  1382
              PER: precision:  93.65%; recall:  93.70%; FB1:  93.68  1843



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 58 = 0.8439703583717346
conlleval:
processed 51578 tokens with 5942 phrases; found: 5997 phrases; correct: 5361.
accuracy:  98.41%; precision:  89.39%; recall:  90.22%; FB1:  89.81
              LOC: precision:  94.39%; recall:  93.47%; FB1:  93.93  1819
             MISC: precision:  83.57%; recall:  83.30%; FB1:  83.43  919
              ORG: precision:  81.62%; recall:  85.09%; FB1:  83.32  1398
              PER: precision:  93.23%; recall:  94.19%; FB1:  93.71  1861



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 59 = 0.8873289227485657
conlleval:
processed 51578 tokens with 5942 phrases; found: 5987 phrases; correct: 5358.
accuracy:  98.43%; precision:  89.49%; recall:  90.17%; FB1:  89.83
              LOC: precision:  94.63%; recall:  92.98%; FB1:  93.79  1805
             MISC: precision:  83.51%; recall:  83.51%; FB1:  83.51  922
              ORG: precision:  81.77%; recall:  84.94%; FB1:  83.32  1393
              PER: precision:  93.25%; recall:  94.52%; FB1:  93.88  1867



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 60 = 0.7880527377128601
conlleval:
processed 51578 tokens with 5942 phrases; found: 6046 phrases; correct: 5367.
accuracy:  98.36%; precision:  88.77%; recall:  90.32%; FB1:  89.54
              LOC: precision:  94.19%; recall:  93.47%; FB1:  93.83  1823
             MISC: precision:  83.01%; recall:  83.73%; FB1:  83.37  930
              ORG: precision:  79.71%; recall:  84.94%; FB1:  82.24  1429
              PER: precision:  93.29%; recall:  94.41%; FB1:  93.85  1864

----------------------------
-START-/START/START SEOUL/I-LOC/I-LOC 1996-08-31/O/O -END-/END/END
Predicted:	 ['START', 'I-LOC', 'O', 'END']
Gold:		 ['START', 'I-LOC', 'O', 'END']
----------------------------
-START-/START/START 289/O/O Joakim/I-PER/I-PER Haeggman/I-PER/I-PER (/O/O Sweden/I-LOC/I-LOC )/O/O 71/O/O 77/O/O 70/O/O 71/O/O ,/O/O Antoine/I-PER/I-PER Lebouc/I-PER/I-PER (/O/O France/I-LOC/I-LOC )/O/O 74/O/O 73/O/O 70/O/O 72/O/O -END-/END/END
Predicted:	 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC'

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 61 = 1.0418293476104736
conlleval:
processed 51578 tokens with 5942 phrases; found: 5993 phrases; correct: 5349.
accuracy:  98.40%; precision:  89.25%; recall:  90.02%; FB1:  89.64
              LOC: precision:  94.34%; recall:  92.54%; FB1:  93.43  1802
             MISC: precision:  83.08%; recall:  83.62%; FB1:  83.35  928
              ORG: precision:  81.03%; recall:  85.38%; FB1:  83.15  1413
              PER: precision:  93.68%; recall:  94.08%; FB1:  93.88  1850



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 62 = 0.6839953064918518
conlleval:
processed 51578 tokens with 5942 phrases; found: 5982 phrases; correct: 5343.
accuracy:  98.40%; precision:  89.32%; recall:  89.92%; FB1:  89.62
              LOC: precision:  94.48%; recall:  92.16%; FB1:  93.30  1792
             MISC: precision:  83.23%; recall:  83.41%; FB1:  83.32  924
              ORG: precision:  80.80%; recall:  85.38%; FB1:  83.03  1417
              PER: precision:  93.89%; recall:  94.25%; FB1:  94.07  1849



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 63 = 0.778083860874176
conlleval:
processed 51578 tokens with 5942 phrases; found: 5986 phrases; correct: 5337.
accuracy:  98.39%; precision:  89.16%; recall:  89.82%; FB1:  89.49
              LOC: precision:  94.27%; recall:  92.27%; FB1:  93.26  1798
             MISC: precision:  82.53%; recall:  83.51%; FB1:  83.02  933
              ORG: precision:  80.99%; recall:  85.16%; FB1:  83.02  1410
              PER: precision:  93.77%; recall:  93.92%; FB1:  93.84  1845



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 64 = 0.5942714810371399
conlleval:
processed 51578 tokens with 5942 phrases; found: 5988 phrases; correct: 5349.
accuracy:  98.42%; precision:  89.33%; recall:  90.02%; FB1:  89.67
              LOC: precision:  94.24%; recall:  92.60%; FB1:  93.41  1805
             MISC: precision:  83.60%; recall:  83.51%; FB1:  83.56  921
              ORG: precision:  80.95%; recall:  85.23%; FB1:  83.04  1412
              PER: precision:  93.78%; recall:  94.19%; FB1:  93.99  1850



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 65 = 0.6791855096817017
conlleval:
processed 51578 tokens with 5942 phrases; found: 5992 phrases; correct: 5351.
accuracy:  98.42%; precision:  89.30%; recall:  90.05%; FB1:  89.68
              LOC: precision:  94.57%; recall:  92.92%; FB1:  93.74  1805
             MISC: precision:  83.79%; recall:  83.51%; FB1:  83.65  919
              ORG: precision:  80.97%; recall:  85.01%; FB1:  82.94  1408
              PER: precision:  93.23%; recall:  94.14%; FB1:  93.68  1860



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 66 = 0.8167482614517212
conlleval:
processed 51578 tokens with 5942 phrases; found: 6054 phrases; correct: 5345.
accuracy:  98.27%; precision:  88.29%; recall:  89.95%; FB1:  89.11
              LOC: precision:  94.13%; recall:  92.49%; FB1:  93.30  1805
             MISC: precision:  82.35%; recall:  83.51%; FB1:  82.93  935
              ORG: precision:  78.27%; recall:  85.16%; FB1:  81.57  1459
              PER: precision:  93.48%; recall:  94.14%; FB1:  93.81  1855



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 67 = 0.6911866664886475
conlleval:
processed 51578 tokens with 5942 phrases; found: 5970 phrases; correct: 5363.
accuracy:  98.44%; precision:  89.83%; recall:  90.26%; FB1:  90.04
              LOC: precision:  94.65%; recall:  93.41%; FB1:  94.03  1813
             MISC: precision:  83.15%; recall:  84.06%; FB1:  83.60  932
              ORG: precision:  83.62%; recall:  83.74%; FB1:  83.68  1343
              PER: precision:  92.93%; recall:  94.95%; FB1:  93.93  1882



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 68 = 0.6193132996559143
conlleval:
processed 51578 tokens with 5942 phrases; found: 5995 phrases; correct: 5340.
accuracy:  98.39%; precision:  89.07%; recall:  89.87%; FB1:  89.47
              LOC: precision:  94.16%; recall:  92.16%; FB1:  93.15  1798
             MISC: precision:  83.15%; recall:  83.51%; FB1:  83.33  926
              ORG: precision:  80.32%; recall:  85.23%; FB1:  82.71  1423
              PER: precision:  93.83%; recall:  94.14%; FB1:  93.98  1848



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 69 = 0.7322702407836914
conlleval:
processed 51578 tokens with 5942 phrases; found: 5981 phrases; correct: 5338.
accuracy:  98.40%; precision:  89.25%; recall:  89.84%; FB1:  89.54
              LOC: precision:  94.57%; recall:  92.00%; FB1:  93.27  1787
             MISC: precision:  84.17%; recall:  83.62%; FB1:  83.90  916
              ORG: precision:  80.10%; recall:  86.13%; FB1:  83.00  1442
              PER: precision:  93.79%; recall:  93.49%; FB1:  93.64  1836



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 70 = 0.7874616980552673
conlleval:
processed 51578 tokens with 5942 phrases; found: 5986 phrases; correct: 5348.
accuracy:  98.42%; precision:  89.34%; recall:  90.00%; FB1:  89.67
              LOC: precision:  94.18%; recall:  92.49%; FB1:  93.33  1804
             MISC: precision:  83.90%; recall:  83.62%; FB1:  83.76  919
              ORG: precision:  80.97%; recall:  85.98%; FB1:  83.40  1424
              PER: precision:  93.80%; recall:  93.65%; FB1:  93.72  1839

----------------------------
-START-/START/START Scorers/O/O :/O/O Nicolas/I-PER/I-PER Ouedec/I-PER/I-PER (/O/O 49th/O/O minute/O/O )/O/O ,/O/O Youri/I-PER/I-PER Djorkaeff/I-PER/I-PER (/O/O 53rd/O/O )/O/O -END-/END/END
Predicted:	 ['START', 'O', 'O', 'I-PER', 'I-PER', 'O', 'O', 'O', 'O', 'O', 'I-PER', 'I-PER', 'O', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'O', 'I-PER', 'I-PER', 'O', 'O', 'O', 'O', 'O', 'I-PER', 'I-PER', 'O', 'O', 'O', 'END']
----------------------------
-START-/START/START 2./O/O Michael/

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 71 = 0.8282478451728821
conlleval:
processed 51578 tokens with 5942 phrases; found: 5995 phrases; correct: 5350.
accuracy:  98.43%; precision:  89.24%; recall:  90.04%; FB1:  89.64
              LOC: precision:  94.29%; recall:  92.54%; FB1:  93.41  1803
             MISC: precision:  83.15%; recall:  83.51%; FB1:  83.33  926
              ORG: precision:  80.93%; recall:  85.46%; FB1:  83.13  1416
              PER: precision:  93.73%; recall:  94.14%; FB1:  93.93  1850



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 72 = 0.5712136030197144
conlleval:
processed 51578 tokens with 5942 phrases; found: 6007 phrases; correct: 5348.
accuracy:  98.38%; precision:  89.03%; recall:  90.00%; FB1:  89.51
              LOC: precision:  94.33%; recall:  92.38%; FB1:  93.34  1799
             MISC: precision:  83.99%; recall:  83.62%; FB1:  83.80  918
              ORG: precision:  79.87%; recall:  85.53%; FB1:  82.61  1436
              PER: precision:  93.47%; recall:  94.08%; FB1:  93.78  1854



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 73 = 0.727339506149292
conlleval:
processed 51578 tokens with 5942 phrases; found: 5979 phrases; correct: 5334.
accuracy:  98.40%; precision:  89.21%; recall:  89.77%; FB1:  89.49
              LOC: precision:  94.17%; recall:  92.38%; FB1:  93.27  1802
             MISC: precision:  81.55%; recall:  83.41%; FB1:  82.47  943
              ORG: precision:  82.15%; recall:  84.79%; FB1:  83.45  1384
              PER: precision:  93.57%; recall:  93.97%; FB1:  93.77  1850



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 74 = 0.5923712849617004
conlleval:
processed 51578 tokens with 5942 phrases; found: 6030 phrases; correct: 5348.
accuracy:  98.32%; precision:  88.69%; recall:  90.00%; FB1:  89.34
              LOC: precision:  94.42%; recall:  92.11%; FB1:  93.25  1792
             MISC: precision:  83.10%; recall:  83.73%; FB1:  83.41  929
              ORG: precision:  78.83%; recall:  85.23%; FB1:  81.91  1450
              PER: precision:  93.65%; recall:  94.52%; FB1:  94.08  1859



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 75 = 0.7761808037757874
conlleval:
processed 51578 tokens with 5942 phrases; found: 5958 phrases; correct: 5359.
accuracy:  98.48%; precision:  89.95%; recall:  90.19%; FB1:  90.07
              LOC: precision:  94.79%; recall:  93.03%; FB1:  93.90  1803
             MISC: precision:  84.25%; recall:  83.51%; FB1:  83.88  914
              ORG: precision:  82.59%; recall:  85.61%; FB1:  84.07  1390
              PER: precision:  93.57%; recall:  94.03%; FB1:  93.80  1851



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 76 = 0.6721312403678894
conlleval:
processed 51578 tokens with 5942 phrases; found: 5975 phrases; correct: 5356.
accuracy:  98.43%; precision:  89.64%; recall:  90.14%; FB1:  89.89
              LOC: precision:  94.59%; recall:  93.25%; FB1:  93.91  1811
             MISC: precision:  83.21%; recall:  83.84%; FB1:  83.52  929
              ORG: precision:  83.07%; recall:  83.82%; FB1:  83.44  1353
              PER: precision:  92.77%; recall:  94.79%; FB1:  93.77  1882



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 77 = 0.5060636401176453
conlleval:
processed 51578 tokens with 5942 phrases; found: 5988 phrases; correct: 5351.
accuracy:  98.43%; precision:  89.36%; recall:  90.05%; FB1:  89.71
              LOC: precision:  94.45%; recall:  92.65%; FB1:  93.54  1802
             MISC: precision:  83.60%; recall:  83.51%; FB1:  83.56  921
              ORG: precision:  81.29%; recall:  85.53%; FB1:  83.36  1411
              PER: precision:  93.42%; recall:  94.03%; FB1:  93.72  1854



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 78 = 0.7727060914039612
conlleval:
processed 51578 tokens with 5942 phrases; found: 5988 phrases; correct: 5351.
accuracy:  98.41%; precision:  89.36%; recall:  90.05%; FB1:  89.71
              LOC: precision:  94.51%; recall:  92.81%; FB1:  93.66  1804
             MISC: precision:  83.42%; recall:  83.51%; FB1:  83.47  923
              ORG: precision:  81.12%; recall:  85.23%; FB1:  83.13  1409
              PER: precision:  93.57%; recall:  94.08%; FB1:  93.83  1852



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 79 = 0.8673596382141113
conlleval:
processed 51578 tokens with 5942 phrases; found: 6005 phrases; correct: 5346.
accuracy:  98.38%; precision:  89.03%; recall:  89.97%; FB1:  89.50
              LOC: precision:  94.32%; recall:  92.16%; FB1:  93.23  1795
             MISC: precision:  83.23%; recall:  83.41%; FB1:  83.32  924
              ORG: precision:  80.24%; recall:  85.68%; FB1:  82.87  1432
              PER: precision:  93.58%; recall:  94.19%; FB1:  93.89  1854



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 80 = 0.7231130003929138
conlleval:
processed 51578 tokens with 5942 phrases; found: 5981 phrases; correct: 5331.
accuracy:  98.36%; precision:  89.13%; recall:  89.72%; FB1:  89.42
              LOC: precision:  94.58%; recall:  92.11%; FB1:  93.33  1789
             MISC: precision:  82.76%; recall:  83.30%; FB1:  83.03  928
              ORG: precision:  80.44%; recall:  84.94%; FB1:  82.63  1416
              PER: precision:  93.72%; recall:  94.03%; FB1:  93.88  1848

----------------------------
-START-/START/START R./I-PER/I-PER Mahanama/I-PER/I-PER b/O/O McGrath/I-PER/I-PER 50/O/O -END-/END/END
Predicted:	 ['START', 'I-PER', 'I-PER', 'O', 'I-PER', 'O', 'END']
Gold:		 ['START', 'I-PER', 'I-PER', 'O', 'I-PER', 'O', 'END']
----------------------------
-START-/START/START In/O/O 14/O/O years/O/O as/O/O Chicago/I-LOC/I-LOC 's/O/O archbishop/O/O ,/O/O he/O/O built/O/O a/O/O prayerful/O/O ,/O/O saintly/O/O image/O/O and/O/O was/O/O deeply/O/O involved/O/O in/O/O world/O/O

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 81 = 0.6674872040748596
conlleval:
processed 51578 tokens with 5942 phrases; found: 5987 phrases; correct: 5359.
accuracy:  98.45%; precision:  89.51%; recall:  90.19%; FB1:  89.85
              LOC: precision:  94.16%; recall:  92.98%; FB1:  93.56  1814
             MISC: precision:  84.03%; recall:  83.30%; FB1:  83.66  914
              ORG: precision:  81.82%; recall:  85.91%; FB1:  83.81  1408
              PER: precision:  93.52%; recall:  93.97%; FB1:  93.74  1851



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 82 = 0.7205471992492676
conlleval:
processed 51578 tokens with 5942 phrases; found: 6006 phrases; correct: 5342.
accuracy:  98.35%; precision:  88.94%; recall:  89.90%; FB1:  89.42
              LOC: precision:  94.48%; recall:  92.32%; FB1:  93.39  1795
             MISC: precision:  83.14%; recall:  83.41%; FB1:  83.27  925
              ORG: precision:  79.90%; recall:  85.09%; FB1:  82.41  1428
              PER: precision:  93.43%; recall:  94.25%; FB1:  93.84  1858



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 83 = 0.6267237067222595
conlleval:
processed 51578 tokens with 5942 phrases; found: 5980 phrases; correct: 5338.
accuracy:  98.41%; precision:  89.26%; recall:  89.84%; FB1:  89.55
              LOC: precision:  94.27%; recall:  92.22%; FB1:  93.23  1797
             MISC: precision:  83.90%; recall:  83.62%; FB1:  83.76  919
              ORG: precision:  80.70%; recall:  85.46%; FB1:  83.01  1420
              PER: precision:  93.66%; recall:  93.76%; FB1:  93.71  1844



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 84 = 0.7825548648834229
conlleval:
processed 51578 tokens with 5942 phrases; found: 5993 phrases; correct: 5362.
accuracy:  98.42%; precision:  89.47%; recall:  90.24%; FB1:  89.85
              LOC: precision:  94.58%; recall:  93.09%; FB1:  93.83  1808
             MISC: precision:  83.41%; recall:  83.41%; FB1:  83.41  922
              ORG: precision:  81.90%; recall:  84.71%; FB1:  83.28  1387
              PER: precision:  93.12%; recall:  94.84%; FB1:  93.98  1876



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 85 = 0.6332400441169739
conlleval:
processed 51578 tokens with 5942 phrases; found: 6013 phrases; correct: 5338.
accuracy:  98.35%; precision:  88.77%; recall:  89.84%; FB1:  89.30
              LOC: precision:  94.16%; recall:  92.11%; FB1:  93.12  1797
             MISC: precision:  83.24%; recall:  83.51%; FB1:  83.38  925
              ORG: precision:  79.47%; recall:  85.46%; FB1:  82.36  1442
              PER: precision:  93.56%; recall:  93.92%; FB1:  93.74  1849



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 86 = 0.680480420589447
conlleval:
processed 51578 tokens with 5942 phrases; found: 5988 phrases; correct: 5353.
accuracy:  98.43%; precision:  89.40%; recall:  90.09%; FB1:  89.74
              LOC: precision:  94.36%; recall:  92.87%; FB1:  93.61  1808
             MISC: precision:  83.64%; recall:  83.19%; FB1:  83.41  917
              ORG: precision:  81.17%; recall:  85.53%; FB1:  83.30  1413
              PER: precision:  93.68%; recall:  94.08%; FB1:  93.88  1850



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 87 = 0.4998241662979126
conlleval:
processed 51578 tokens with 5942 phrases; found: 5977 phrases; correct: 5355.
accuracy:  98.45%; precision:  89.59%; recall:  90.12%; FB1:  89.86
              LOC: precision:  94.62%; recall:  92.87%; FB1:  93.74  1803
             MISC: precision:  83.59%; recall:  83.41%; FB1:  83.50  920
              ORG: precision:  81.70%; recall:  85.23%; FB1:  83.43  1399
              PER: precision:  93.64%; recall:  94.30%; FB1:  93.97  1855



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 88 = 0.7983539700508118
conlleval:
processed 51578 tokens with 5942 phrases; found: 5989 phrases; correct: 5348.
accuracy:  98.39%; precision:  89.30%; recall:  90.00%; FB1:  89.65
              LOC: precision:  94.52%; recall:  92.98%; FB1:  93.74  1807
             MISC: precision:  83.12%; recall:  83.30%; FB1:  83.21  924
              ORG: precision:  81.65%; recall:  83.97%; FB1:  82.79  1379
              PER: precision:  92.92%; recall:  94.79%; FB1:  93.85  1879



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 89 = 0.7210713028907776
conlleval:
processed 51578 tokens with 5942 phrases; found: 6039 phrases; correct: 5336.
accuracy:  98.29%; precision:  88.36%; recall:  89.80%; FB1:  89.07
              LOC: precision:  94.21%; recall:  92.16%; FB1:  93.18  1797
             MISC: precision:  82.58%; recall:  83.30%; FB1:  82.94  930
              ORG: precision:  78.12%; recall:  85.46%; FB1:  81.62  1467
              PER: precision:  93.71%; recall:  93.87%; FB1:  93.79  1845



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 90 = 0.69603431224823
conlleval:
processed 51578 tokens with 5942 phrases; found: 5960 phrases; correct: 5346.
accuracy:  98.44%; precision:  89.70%; recall:  89.97%; FB1:  89.83
              LOC: precision:  94.45%; recall:  92.60%; FB1:  93.51  1801
             MISC: precision:  83.62%; recall:  83.62%; FB1:  83.62  922
              ORG: precision:  82.45%; recall:  85.16%; FB1:  83.79  1385
              PER: precision:  93.52%; recall:  94.03%; FB1:  93.77  1852

----------------------------
-START-/START/START Italy/I-LOC/I-LOC 's/O/O Dini/I-PER/I-PER meets/O/O Burundi/I-LOC/I-LOC negotiator/O/O Nyerere/I-PER/I-PER ./O/O -END-/END/END
Predicted:	 ['START', 'I-LOC', 'O', 'I-PER', 'O', 'I-LOC', 'O', 'I-PER', 'O', 'END']
Gold:		 ['START', 'I-LOC', 'O', 'I-PER', 'O', 'I-LOC', 'O', 'I-PER', 'O', 'END']
----------------------------
-START-/START/START -DOCSTART-/O/O -END-/END/END
Predicted:	 ['START', 'O', 'END']
Gold:		 ['START', 'O', 'END']
---------------------------

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 91 = 0.8783797025680542
conlleval:
processed 51578 tokens with 5942 phrases; found: 6016 phrases; correct: 5351.
accuracy:  98.35%; precision:  88.95%; recall:  90.05%; FB1:  89.50
              LOC: precision:  94.40%; recall:  92.71%; FB1:  93.55  1804
             MISC: precision:  82.76%; recall:  83.30%; FB1:  83.03  928
              ORG: precision:  80.14%; recall:  84.86%; FB1:  82.43  1420
              PER: precision:  93.45%; recall:  94.57%; FB1:  94.01  1864



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 92 = 0.7595962882041931
conlleval:
processed 51578 tokens with 5942 phrases; found: 5989 phrases; correct: 5343.
accuracy:  98.38%; precision:  89.21%; recall:  89.92%; FB1:  89.56
              LOC: precision:  94.28%; recall:  92.38%; FB1:  93.32  1800
             MISC: precision:  84.38%; recall:  83.19%; FB1:  83.78  909
              ORG: precision:  80.35%; recall:  85.68%; FB1:  82.93  1430
              PER: precision:  93.51%; recall:  93.92%; FB1:  93.72  1850



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 93 = 0.512948215007782
conlleval:
processed 51578 tokens with 5942 phrases; found: 6024 phrases; correct: 5344.
accuracy:  98.36%; precision:  88.71%; recall:  89.94%; FB1:  89.32
              LOC: precision:  94.17%; recall:  92.27%; FB1:  93.21  1800
             MISC: precision:  82.45%; recall:  83.08%; FB1:  82.77  929
              ORG: precision:  79.61%; recall:  85.61%; FB1:  82.50  1442
              PER: precision:  93.63%; recall:  94.19%; FB1:  93.91  1853



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 94 = 0.6854432821273804
conlleval:
processed 51578 tokens with 5942 phrases; found: 6014 phrases; correct: 5333.
accuracy:  98.33%; precision:  88.68%; recall:  89.75%; FB1:  89.21
              LOC: precision:  93.95%; recall:  92.16%; FB1:  93.05  1802
             MISC: precision:  83.41%; recall:  83.41%; FB1:  83.41  922
              ORG: precision:  79.00%; recall:  85.83%; FB1:  82.27  1457
              PER: precision:  93.84%; recall:  93.38%; FB1:  93.61  1833



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 95 = 0.7271214723587036
conlleval:
processed 51578 tokens with 5942 phrases; found: 5966 phrases; correct: 5349.
accuracy:  98.43%; precision:  89.66%; recall:  90.02%; FB1:  89.84
              LOC: precision:  94.40%; recall:  93.63%; FB1:  94.01  1822
             MISC: precision:  83.23%; recall:  83.41%; FB1:  83.32  924
              ORG: precision:  83.47%; recall:  83.59%; FB1:  83.53  1343
              PER: precision:  92.65%; recall:  94.41%; FB1:  93.52  1877



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 96 = 0.6995302438735962
conlleval:
processed 51578 tokens with 5942 phrases; found: 5981 phrases; correct: 5357.
accuracy:  98.44%; precision:  89.57%; recall:  90.15%; FB1:  89.86
              LOC: precision:  94.37%; recall:  93.03%; FB1:  93.70  1811
             MISC: precision:  83.68%; recall:  83.41%; FB1:  83.54  919
              ORG: precision:  82.11%; recall:  85.23%; FB1:  83.64  1392
              PER: precision:  93.38%; recall:  94.25%; FB1:  93.81  1859



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 97 = 0.8994585275650024
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5348.
accuracy:  98.30%; precision:  88.47%; recall:  90.00%; FB1:  89.23
              LOC: precision:  94.18%; recall:  92.43%; FB1:  93.30  1803
             MISC: precision:  82.83%; recall:  83.19%; FB1:  83.01  926
              ORG: precision:  78.26%; recall:  85.38%; FB1:  81.67  1463
              PER: precision:  93.79%; recall:  94.35%; FB1:  94.07  1853



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 98 = 0.4565410614013672
conlleval:
processed 51578 tokens with 5942 phrases; found: 5991 phrases; correct: 5345.
accuracy:  98.40%; precision:  89.22%; recall:  89.95%; FB1:  89.58
              LOC: precision:  94.40%; recall:  92.65%; FB1:  93.52  1803
             MISC: precision:  83.59%; recall:  83.41%; FB1:  83.50  920
              ORG: precision:  80.75%; recall:  85.38%; FB1:  83.00  1418
              PER: precision:  93.46%; recall:  93.87%; FB1:  93.66  1850



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 99 = 0.8050068020820618
conlleval:
processed 51578 tokens with 5942 phrases; found: 5978 phrases; correct: 5346.
accuracy:  98.41%; precision:  89.43%; recall:  89.97%; FB1:  89.70
              LOC: precision:  94.77%; recall:  92.65%; FB1:  93.70  1796
             MISC: precision:  83.24%; recall:  83.51%; FB1:  83.38  925
              ORG: precision:  81.55%; recall:  85.38%; FB1:  83.42  1404
              PER: precision:  93.31%; recall:  93.87%; FB1:  93.59  1853



In [107]:
#Evaluation on test data
lstm.write_predictions(sentences_test, 'test_pred_lstm.txt')
!wget https://raw.githubusercontent.com/aritter/twitter_nlp/master/data/annotated/wnut16/conlleval.pl
!paste test test_pred_lstm.txt | perl conlleval.pl -d "\t"

--2023-03-24 03:15:40--  https://raw.githubusercontent.com/aritter/twitter_nlp/master/data/annotated/wnut16/conlleval.pl
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12754 (12K) [text/plain]
Saving to: ‘conlleval.pl.8’


2023-03-24 03:15:40 (88.5 MB/s) - ‘conlleval.pl.8’ saved [12754/12754]

processed 46666 tokens with 5648 phrases; found: 5695 phrases; correct: 4791.
accuracy:  97.08%; precision:  84.13%; recall:  84.83%; FB1:  84.48
              LOC: precision:  89.89%; recall:  86.39%; FB1:  88.11  1603
             MISC: precision:  69.95%; recall:  74.93%; FB1:  72.35  752
              ORG: precision:  78.18%; recall:  82.84%; FB1:  80.44  1760
              PER: precision:  91.65%; recall:  89.55%; FB1:  90.58  1580


## Initialization with GloVe Embeddings (5 points)

If you haven't already, implement the `init_glove()` method in `BasicLSTMtagger` above.

Rather than initializing word embeddings randomly, it is common to use learned word embeddings (GloVe or Word2Vec), as discussed in lecture.  To make this simpler, we have already pre-filtered [GloVe](https://nlp.stanford.edu/projects/glove/) embeddings to only contain words in the vocabulary of the CoNLL NER dataset, and loaded them into a dictionary (`GloVe`) at the beginning of this notebook.



## Character Embeddings (10 points)

Now that you have your basic LSTM tagger working, the next step is to add a convolutional network that computes word embeddings from character representations of words.  See Figure 2 and Figure 3 in the [Ma and Hovy](https://www.aclweb.org/anthology/P16-1101.pdf) paper.  We have provided code in `sentences2input_tensors` to convert sentences into lists of word and character indices.  See also [nn.Conv1d](https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html) and [MaxPool1d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool1d.html).

Hint: The nn.Conv1d accepts input size $(N, C_{in}, L_{in})$, but we have input size $(N, \text{SLEN}, \text{CLEN}, \text{EMB_DIM})$. We can reshape and [permute](https://pytorch.org/docs/stable/generated/torch.permute.html) our input to satisfy the nn.Conv1d, and recover the dimensions later.

Make sure to save your predictions on the test set, for submission to GradeScope.  You should be able to achieve **90 F1 / 85 F1 on the dev/test sets**.

In [20]:
import torch.nn.functional as F


class CharLSTMtagger(BasicLSTMtagger): 
    def __init__(self, DIM_EMB=10, DIM_CHAR_EMB=30, DIM_HID=10, VOCAB_SIZE=29148, debug=True):
        super(CharLSTMtagger, self).__init__(DIM_EMB=DIM_EMB, DIM_HID=DIM_HID, VOCAB_SIZE=VOCAB_SIZE, debug=debug)
        NUM_TAGS = max(tag2i.values())+1

        (self.DIM_EMB, self.DIM_CHAR_EMB, self.NUM_TAGS) = (DIM_EMB, DIM_CHAR_EMB, NUM_TAGS)

        #TODO: Initialize parameters.
        self.char_embeddings = nn.Embedding(num_embeddings=VOCAB_SIZE, embedding_dim=DIM_CHAR_EMB)
        self.cnn = nn.Conv1d(in_channels=DIM_CHAR_EMB, out_channels=DIM_CHAR_EMB, kernel_size=3)
        self.max_pool = nn.MaxPool1d(kernel_size=DIM_CHAR_EMB) # 1 = floor((30+2*padding - dilation*(kernel_size-1)-1)/stride + 1) ---> 30+2*padding = dilation*(kernel_size-1) + 1 ---> kernel_size=30

        self.lstm = nn.LSTM(input_size=DIM_EMB+DIM_CHAR_EMB, hidden_size=DIM_HID, batch_first=True, bidirectional=True)
        self.drop = nn.Dropout(p=0.2)

        self.debug = debug
        if self.debug:
          print('DIM_CHAR_EMB: ' + str(DIM_CHAR_EMB) + '\n')

    def forward(self, X, X_char, train=False):
        #TODO: Implement the forward computation.
        BATCH_SIZE, SLEN, CLEN = X_char.shape

        ## generate word embeddings - MAKE SURE WE'VE INITIALIZED GLOVE EMBEDDINGS BEFORE CALLING FORWARD!
        word_embeds = self.word_embeddings(X)
        if self.debug:
          print('word_embeds: ' + str(word_embeds.shape) + '\n')

        ## generate char embeddings
        char_embeds = self.char_embeddings(X_char)
        if self.debug:
          print('char_embeds: ' + str(char_embeds.shape))

        permuted_char_embeds = torch.permute(char_embeds,(0,1,3,2))
        if self.debug:
          print('permuted_char_embeds: ' + str(permuted_char_embeds.shape))
        
        reshaped_char_embeds = torch.reshape(permuted_char_embeds,(BATCH_SIZE*SLEN,self.DIM_CHAR_EMB,CLEN))
        if self.debug:
          print('reshaped_char_embeds: ' + str(reshaped_char_embeds.shape))

        cnn_char_embeds = self.cnn(self.drop(reshaped_char_embeds))
        if self.debug:
          print('cnn_char_embeds: ' + str(cnn_char_embeds.shape))
        
        max_pool_char_embeds = self.max_pool(cnn_char_embeds)
        if self.debug:
          print('max_pool_char_embeds: ' + str(max_pool_char_embeds.shape))

        reshaped_char_embeds = torch.reshape(max_pool_char_embeds,(BATCH_SIZE,SLEN,self.DIM_CHAR_EMB))
        if self.debug:
          print('reshaped_max_pool_output: ' + str(reshaped_char_embeds.shape) + '\n')

        ## concatenate word and character embeddings
        lstm_input = self.drop(torch.cat((word_embeds, reshaped_char_embeds), dim=2))
        if self.debug:
          print('lstm_input: ' + str(lstm_input.shape))

        ## perform rest of LSTM forward pass as normal
        tag_scores = super(CharLSTMtagger,self).forward(lstm_input)
        return tag_scores
        #pass
        # #return torch.randn((X.shape[0], X.shape[1], self.NUM_TAGS))  #Random baseline.

    def sentences2input_tensors(self, sentences):
        (X, X_mask)   = prepare_input(sentences2indices(sentences, word2i))
        X_char        = prepare_input_char(sentences2indicesChar(sentences, char2i))
        return (X, X_mask, X_char)

    def inference(self, sentences):
        (X, X_mask, X_char) = self.sentences2input_tensors(sentences)
        pred = self.forward(X.cuda(), X_char.cuda()).argmax(dim=2)
        return [[i2tag[pred[i,j].item()] for j in range(len(sentences[i]))] for i in range(len(sentences))]

    def print_predictions(self, words, tags):
        Y_pred = self.inference(words)
        for i in range(len(words)):
            print("----------------------------")
            print(" ".join([f"{words[i][j]}/{Y_pred[i][j]}/{tags[i][j]}" for j in range(len(words[i]))]))
            print("Predicted:\t", Y_pred[i])
            print("Gold:\t\t", tags[i])

char_lstm_test = CharLSTMtagger(DIM_HID=7, DIM_EMB=300)
lstm_output    = char_lstm_test.forward(prepare_input(X[0:5])[0], prepare_input_char(X_char[0:5]))
Y_onehot       = prepare_output_onehot(Y[0:5])

print("lstm output shape:", lstm_output.shape)
print("Y onehot shape:", Y_onehot.shape)

VOCAB_SIZE: 29148
NUM_TAGS: 10
DIM_EMB: 300
DIM_HID: 7
bidirectional?: True

DIM_CHAR_EMB: 30

word_embeds: torch.Size([5, 32, 300])

char_embeds: torch.Size([5, 32, 32, 30])
permuted_char_embeds: torch.Size([5, 32, 30, 32])
reshaped_char_embeds: torch.Size([160, 30, 32])
cnn_char_embeds: torch.Size([160, 30, 30])
max_pool_char_embeds: torch.Size([160, 30, 1])
reshaped_max_pool_output: torch.Size([5, 32, 30])

lstm_input: torch.Size([5, 32, 330])
lstm_out: torch.Size([5, 32, 14])
tag_space: torch.Size([5, 32, 10])
tag_scores: torch.Size([5, 32, 10])

lstm output shape: torch.Size([5, 32, 10])
Y onehot shape: torch.Size([5, 32, 10])


In [21]:
#Training LSTM w/ character embeddings. Feel free to change number of epochs, optimizer, learning rate and batch size.

nEpochs = 125

def train_char_lstm(sentences, tags, lstm, glove):
  #optimizer = optim.Adadelta(lstm.parameters(), lr=0.1)
  #TODO: initialize optimizer
    with torch.no_grad():
      lstm.init_glove(glove)

    optimizer = optim.Adadelta(lstm.parameters(),lr=1.0)

    batchSize = 50

    for epoch in range(nEpochs):
        totalLoss = 0.0

        (sentences_shuffled, tags_shuffled) = shuffle_sentences(sentences, tags)
        for batch in tqdm.notebook.tqdm(range(0, len(sentences), batchSize), leave=False):
            lstm.zero_grad()
            #TODO: Gradient update
            (X_batch_prepared, X_mask_batch_prepared, X_char_batch_prepared) = lstm.sentences2input_tensors(sentences_shuffled[batch:batch+batchSize])
            Y_batch_onehot   = prepare_output_onehot(sentences2indices(tags_shuffled[batch:batch+batchSize], tag2i)).cuda()
            if lstm.debug:
              print('X_batch_prepared: ' + str(X_batch_prepared.shape))
              print('X_char_batch_prepared: ' + str(X_char_batch_prepared.shape))
              print('Y_onehot: ' + str(Y_batch_onehot.shape))

            pred = lstm.forward(X_batch_prepared.cuda() ,X_char_batch_prepared.cuda()) # TODO
            if lstm.debug:
              print('pred: ' + str(pred.shape))
            
            #loss = loss_function(pred, Y_batch_onehot)
            loss = torch.einsum('bij,bij',torch.neg(pred),Y_batch_onehot) / batchSize
            loss.backward()
            optimizer.step()
            totalLoss += loss

        print(f"loss on epoch {epoch} = {totalLoss}")
        lstm.write_predictions(sentences_dev, 'dev_pred')   #Performance on dev set
        print('conlleval:')
        print(subprocess.Popen('paste dev dev_pred | perl conlleval.pl -d "\t"', shell=True, stdout=subprocess.PIPE,stderr=subprocess.STDOUT).communicate()[0].decode('UTF-8'))

        if epoch % 10 == 0:
            s = sample(range(len(sentences_dev)), 5)
            lstm.print_predictions([sentences_dev[i] for i in s], [tags_dev[i] for i in s])

torch.manual_seed(1)

char_lstm = CharLSTMtagger(DIM_HID=500, DIM_EMB=300, debug=False).cuda()
train_char_lstm(sentences_train, tags_train, char_lstm, GloVe)

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 0 = 815.6292724609375
conlleval:
processed 51578 tokens with 5942 phrases; found: 6111 phrases; correct: 5027.
accuracy:  97.38%; precision:  82.26%; recall:  84.60%; FB1:  83.41
              LOC: precision:  85.14%; recall:  91.67%; FB1:  88.28  1978
             MISC: precision:  76.08%; recall:  74.19%; FB1:  75.12  899
              ORG: precision:  72.64%; recall:  74.65%; FB1:  73.63  1378
              PER: precision:  89.33%; recall:  90.01%; FB1:  89.67  1856

----------------------------
-START-/START/START LOS/I-ORG/I-ORG ANGELES/I-ORG/I-ORG AT/O/O PHILADELPHIA/I-LOC/I-LOC -END-/END/END
Predicted:	 ['START', 'I-ORG', 'I-ORG', 'O', 'I-LOC', 'END']
Gold:		 ['START', 'I-ORG', 'I-ORG', 'O', 'I-LOC', 'END']
----------------------------
-START-/START/START It/O/O showed/O/O spending/O/O rose/O/O only/O/O 0.2/O/O percent/O/O last/O/O month/O/O to/O/O a/O/O seasonally/O/O adjusted/O/O annual/O/O rate/O/O of/O/O $/O/O 5.15/O/O trillion/O/O after/O/O dropping/O/O a/O/O 

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 1 = 303.6272277832031
conlleval:
processed 51578 tokens with 5942 phrases; found: 6079 phrases; correct: 5199.
accuracy:  97.94%; precision:  85.52%; recall:  87.50%; FB1:  86.50
              LOC: precision:  93.80%; recall:  88.95%; FB1:  91.31  1742
             MISC: precision:  84.99%; recall:  78.63%; FB1:  81.69  853
              ORG: precision:  72.60%; recall:  82.77%; FB1:  77.35  1529
              PER: precision:  88.49%; recall:  93.92%; FB1:  91.12  1955



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 2 = 195.77528381347656
conlleval:
processed 51578 tokens with 5942 phrases; found: 6025 phrases; correct: 5334.
accuracy:  98.37%; precision:  88.53%; recall:  89.77%; FB1:  89.15
              LOC: precision:  93.16%; recall:  94.83%; FB1:  93.98  1870
             MISC: precision:  84.33%; recall:  81.13%; FB1:  82.70  887
              ORG: precision:  80.16%; recall:  83.45%; FB1:  81.77  1396
              PER: precision:  92.15%; recall:  93.65%; FB1:  92.89  1872



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 3 = 139.30967712402344
conlleval:
processed 51578 tokens with 5942 phrases; found: 6092 phrases; correct: 5399.
accuracy:  98.44%; precision:  88.62%; recall:  90.86%; FB1:  89.73
              LOC: precision:  93.94%; recall:  94.50%; FB1:  94.22  1848
             MISC: precision:  82.99%; recall:  84.16%; FB1:  83.58  935
              ORG: precision:  82.28%; recall:  82.77%; FB1:  82.53  1349
              PER: precision:  90.66%; recall:  96.47%; FB1:  93.48  1960



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 4 = 99.4372329711914
conlleval:
processed 51578 tokens with 5942 phrases; found: 6082 phrases; correct: 5416.
accuracy:  98.50%; precision:  89.05%; recall:  91.15%; FB1:  90.09
              LOC: precision:  93.63%; recall:  95.26%; FB1:  94.44  1869
             MISC: precision:  79.78%; recall:  86.01%; FB1:  82.78  994
              ORG: precision:  83.78%; recall:  83.97%; FB1:  83.87  1344
              PER: precision:  93.17%; recall:  94.84%; FB1:  94.00  1875



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 5 = 72.79410552978516
conlleval:
processed 51578 tokens with 5942 phrases; found: 6103 phrases; correct: 5422.
accuracy:  98.56%; precision:  88.84%; recall:  91.25%; FB1:  90.03
              LOC: precision:  94.39%; recall:  94.28%; FB1:  94.34  1835
             MISC: precision:  79.98%; recall:  87.09%; FB1:  83.39  1004
              ORG: precision:  82.93%; recall:  84.41%; FB1:  83.67  1365
              PER: precision:  92.42%; recall:  95.28%; FB1:  93.83  1899



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 6 = 55.40594482421875
conlleval:
processed 51578 tokens with 5942 phrases; found: 6074 phrases; correct: 5414.
accuracy:  98.55%; precision:  89.13%; recall:  91.11%; FB1:  90.11
              LOC: precision:  94.85%; recall:  94.23%; FB1:  94.54  1825
             MISC: precision:  80.04%; recall:  86.12%; FB1:  82.97  992
              ORG: precision:  81.81%; recall:  87.17%; FB1:  84.40  1429
              PER: precision:  94.09%; recall:  93.38%; FB1:  93.73  1828



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 7 = 41.91330337524414
conlleval:
processed 51578 tokens with 5942 phrases; found: 6065 phrases; correct: 5437.
accuracy:  98.58%; precision:  89.65%; recall:  91.50%; FB1:  90.56
              LOC: precision:  94.81%; recall:  94.50%; FB1:  94.66  1831
             MISC: precision:  81.12%; recall:  86.23%; FB1:  83.60  980
              ORG: precision:  82.86%; recall:  87.25%; FB1:  85.00  1412
              PER: precision:  94.25%; recall:  94.25%; FB1:  94.25  1842



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 8 = 32.36700439453125
conlleval:
processed 51578 tokens with 5942 phrases; found: 6084 phrases; correct: 5439.
accuracy:  98.63%; precision:  89.40%; recall:  91.53%; FB1:  90.45
              LOC: precision:  93.35%; recall:  95.54%; FB1:  94.43  1880
             MISC: precision:  81.34%; recall:  86.98%; FB1:  84.07  986
              ORG: precision:  86.07%; recall:  83.37%; FB1:  84.70  1299
              PER: precision:  91.92%; recall:  95.77%; FB1:  93.80  1919



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 9 = 25.073322296142578
conlleval:
processed 51578 tokens with 5942 phrases; found: 6017 phrases; correct: 5456.
accuracy:  98.67%; precision:  90.68%; recall:  91.82%; FB1:  91.25
              LOC: precision:  95.46%; recall:  93.85%; FB1:  94.65  1806
             MISC: precision:  83.35%; recall:  85.79%; FB1:  84.55  949
              ORG: precision:  84.69%; recall:  89.11%; FB1:  86.85  1411
              PER: precision:  94.33%; recall:  94.79%; FB1:  94.56  1851



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 10 = 18.609596252441406
conlleval:
processed 51578 tokens with 5942 phrases; found: 6061 phrases; correct: 5420.
accuracy:  98.58%; precision:  89.42%; recall:  91.22%; FB1:  90.31
              LOC: precision:  95.36%; recall:  92.87%; FB1:  94.10  1789
             MISC: precision:  81.48%; recall:  87.31%; FB1:  84.29  988
              ORG: precision:  83.74%; recall:  86.43%; FB1:  85.06  1384
              PER: precision:  92.11%; recall:  95.01%; FB1:  93.53  1900

----------------------------
-START-/START/START -DOCSTART-/O/O -END-/END/END
Predicted:	 ['START', 'O', 'END']
Gold:		 ['START', 'O', 'END']
----------------------------
-START-/START/START 7./O/O Frank/I-PER/I-PER Busemann/I-PER/I-PER (/O/O Germany/I-LOC/I-LOC )/O/O 13.58/O/O -END-/END/END
Predicted:	 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'END']
----------------------------
-START-/START/START At/O/O The/I-LOC/I-

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 11 = 17.53987693786621
conlleval:
processed 51578 tokens with 5942 phrases; found: 6066 phrases; correct: 5452.
accuracy:  98.64%; precision:  89.88%; recall:  91.75%; FB1:  90.81
              LOC: precision:  96.07%; recall:  93.25%; FB1:  94.64  1783
             MISC: precision:  82.50%; recall:  85.90%; FB1:  84.17  960
              ORG: precision:  82.61%; recall:  88.89%; FB1:  85.63  1443
              PER: precision:  93.35%; recall:  95.28%; FB1:  94.30  1880



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 12 = 13.304128646850586
conlleval:
processed 51578 tokens with 5942 phrases; found: 6050 phrases; correct: 5430.
accuracy:  98.58%; precision:  89.75%; recall:  91.38%; FB1:  90.56
              LOC: precision:  95.28%; recall:  93.30%; FB1:  94.28  1799
             MISC: precision:  82.76%; recall:  86.44%; FB1:  84.56  963
              ORG: precision:  83.01%; recall:  86.73%; FB1:  84.83  1401
              PER: precision:  93.06%; recall:  95.33%; FB1:  94.18  1887



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 13 = 11.099258422851562
conlleval:
processed 51578 tokens with 5942 phrases; found: 5996 phrases; correct: 5445.
accuracy:  98.67%; precision:  90.81%; recall:  91.64%; FB1:  91.22
              LOC: precision:  94.95%; recall:  94.07%; FB1:  94.50  1820
             MISC: precision:  85.89%; recall:  85.14%; FB1:  85.51  914
              ORG: precision:  85.11%; recall:  87.40%; FB1:  86.24  1377
              PER: precision:  93.37%; recall:  95.55%; FB1:  94.45  1885



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 14 = 10.965926170349121
conlleval:
processed 51578 tokens with 5942 phrases; found: 6054 phrases; correct: 5454.
accuracy:  98.64%; precision:  90.09%; recall:  91.79%; FB1:  90.93
              LOC: precision:  95.14%; recall:  93.85%; FB1:  94.49  1812
             MISC: precision:  85.79%; recall:  86.44%; FB1:  86.12  929
              ORG: precision:  82.07%; recall:  88.74%; FB1:  85.27  1450
              PER: precision:  93.56%; recall:  94.63%; FB1:  94.09  1863



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 15 = 8.258366584777832
conlleval:
processed 51578 tokens with 5942 phrases; found: 6014 phrases; correct: 5437.
accuracy:  98.63%; precision:  90.41%; recall:  91.50%; FB1:  90.95
              LOC: precision:  95.48%; recall:  93.25%; FB1:  94.35  1794
             MISC: precision:  86.91%; recall:  84.27%; FB1:  85.57  894
              ORG: precision:  82.85%; recall:  89.34%; FB1:  85.97  1446
              PER: precision:  93.03%; recall:  94.95%; FB1:  93.98  1880



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 16 = 8.591498374938965
conlleval:
processed 51578 tokens with 5942 phrases; found: 6050 phrases; correct: 5438.
accuracy:  98.60%; precision:  89.88%; recall:  91.52%; FB1:  90.69
              LOC: precision:  94.33%; recall:  94.18%; FB1:  94.25  1834
             MISC: precision:  87.03%; recall:  83.73%; FB1:  85.35  887
              ORG: precision:  81.15%; recall:  89.26%; FB1:  85.01  1475
              PER: precision:  93.80%; recall:  94.41%; FB1:  94.10  1854



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 17 = 7.4459733963012695
conlleval:
processed 51578 tokens with 5942 phrases; found: 6039 phrases; correct: 5449.
accuracy:  98.68%; precision:  90.23%; recall:  91.70%; FB1:  90.96
              LOC: precision:  95.26%; recall:  94.18%; FB1:  94.72  1816
             MISC: precision:  83.56%; recall:  86.55%; FB1:  85.03  955
              ORG: precision:  84.40%; recall:  87.17%; FB1:  85.77  1385
              PER: precision:  93.04%; recall:  95.11%; FB1:  94.07  1883



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 18 = 5.911384105682373
conlleval:
processed 51578 tokens with 5942 phrases; found: 6012 phrases; correct: 5464.
accuracy:  98.66%; precision:  90.88%; recall:  91.96%; FB1:  91.42
              LOC: precision:  95.48%; recall:  94.23%; FB1:  94.85  1813
             MISC: precision:  84.82%; recall:  85.47%; FB1:  85.14  929
              ORG: precision:  86.21%; recall:  87.17%; FB1:  86.69  1356
              PER: precision:  92.79%; recall:  96.42%; FB1:  94.57  1914



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 19 = 4.424786567687988
conlleval:
processed 51578 tokens with 5942 phrases; found: 5997 phrases; correct: 5434.
accuracy:  98.64%; precision:  90.61%; recall:  91.45%; FB1:  91.03
              LOC: precision:  94.41%; recall:  93.85%; FB1:  94.13  1826
             MISC: precision:  86.65%; recall:  85.90%; FB1:  86.27  914
              ORG: precision:  83.88%; recall:  88.07%; FB1:  85.92  1408
              PER: precision:  93.94%; recall:  94.30%; FB1:  94.12  1849



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 20 = 5.0266265869140625
conlleval:
processed 51578 tokens with 5942 phrases; found: 6031 phrases; correct: 5473.
accuracy:  98.72%; precision:  90.75%; recall:  92.11%; FB1:  91.42
              LOC: precision:  96.16%; recall:  94.01%; FB1:  95.07  1796
             MISC: precision:  84.29%; recall:  87.31%; FB1:  85.78  955
              ORG: precision:  84.47%; recall:  87.99%; FB1:  86.19  1397
              PER: precision:  93.52%; recall:  95.60%; FB1:  94.55  1883

----------------------------
-START-/START/START Northampton/I-ORG/I-ORG 46/O/O West/I-ORG/I-ORG Hartlepool/I-ORG/I-ORG 20/O/O -END-/END/END
Predicted:	 ['START', 'I-ORG', 'O', 'I-ORG', 'I-ORG', 'O', 'END']
Gold:		 ['START', 'I-ORG', 'O', 'I-ORG', 'I-ORG', 'O', 'END']
----------------------------
-START-/START/START Radio/I-ORG/I-ORG Red/I-ORG/I-ORG quoted/O/O local/O/O police/O/O in/O/O the/O/O town/O/O of/O/O Tacambaro/I-LOC/I-LOC ,/O/O Michoacan/I-LOC/I-LOC ,/O/O 80/O/O km/O/O (/O/O 50/O/O miles/O/O )

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 21 = 4.500218868255615
conlleval:
processed 51578 tokens with 5942 phrases; found: 6025 phrases; correct: 5455.
accuracy:  98.71%; precision:  90.54%; recall:  91.80%; FB1:  91.17
              LOC: precision:  95.40%; recall:  93.63%; FB1:  94.51  1803
             MISC: precision:  85.28%; recall:  86.12%; FB1:  85.70  931
              ORG: precision:  83.03%; recall:  88.29%; FB1:  85.58  1426
              PER: precision:  94.21%; recall:  95.39%; FB1:  94.79  1865



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 22 = 4.31218147277832
conlleval:
processed 51578 tokens with 5942 phrases; found: 6019 phrases; correct: 5462.
accuracy:  98.67%; precision:  90.75%; recall:  91.92%; FB1:  91.33
              LOC: precision:  95.52%; recall:  94.12%; FB1:  94.82  1810
             MISC: precision:  86.48%; recall:  86.01%; FB1:  86.24  917
              ORG: precision:  84.89%; recall:  88.81%; FB1:  86.81  1403
              PER: precision:  92.59%; recall:  94.95%; FB1:  93.76  1889



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 23 = 3.8160502910614014
conlleval:
processed 51578 tokens with 5942 phrases; found: 6031 phrases; correct: 5486.
accuracy:  98.73%; precision:  90.96%; recall:  92.33%; FB1:  91.64
              LOC: precision:  95.06%; recall:  94.28%; FB1:  94.67  1822
             MISC: precision:  84.29%; recall:  87.85%; FB1:  86.03  961
              ORG: precision:  86.21%; recall:  88.14%; FB1:  87.17  1371
              PER: precision:  93.87%; recall:  95.66%; FB1:  94.76  1877



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 24 = 3.8983473777770996
conlleval:
processed 51578 tokens with 5942 phrases; found: 6049 phrases; correct: 5475.
accuracy:  98.69%; precision:  90.51%; recall:  92.14%; FB1:  91.32
              LOC: precision:  95.57%; recall:  93.90%; FB1:  94.73  1805
             MISC: precision:  84.45%; recall:  87.20%; FB1:  85.81  952
              ORG: precision:  84.18%; recall:  88.52%; FB1:  86.30  1410
              PER: precision:  93.46%; recall:  95.49%; FB1:  94.47  1882



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 25 = 3.984055757522583
conlleval:
processed 51578 tokens with 5942 phrases; found: 6032 phrases; correct: 5482.
accuracy:  98.72%; precision:  90.88%; recall:  92.26%; FB1:  91.57
              LOC: precision:  96.51%; recall:  93.30%; FB1:  94.88  1776
             MISC: precision:  83.47%; recall:  86.55%; FB1:  84.98  956
              ORG: precision:  85.17%; recall:  89.11%; FB1:  87.10  1403
              PER: precision:  93.57%; recall:  96.36%; FB1:  94.95  1897



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 26 = 3.7029716968536377
conlleval:
processed 51578 tokens with 5942 phrases; found: 6054 phrases; correct: 5489.
accuracy:  98.73%; precision:  90.67%; recall:  92.38%; FB1:  91.51
              LOC: precision:  95.12%; recall:  94.50%; FB1:  94.81  1825
             MISC: precision:  83.96%; recall:  87.42%; FB1:  85.65  960
              ORG: precision:  85.92%; recall:  87.40%; FB1:  86.65  1364
              PER: precision:  93.18%; recall:  96.36%; FB1:  94.74  1905



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 27 = 4.676445007324219
conlleval:
processed 51578 tokens with 5942 phrases; found: 6042 phrases; correct: 5464.
accuracy:  98.70%; precision:  90.43%; recall:  91.96%; FB1:  91.19
              LOC: precision:  95.09%; recall:  93.85%; FB1:  94.47  1813
             MISC: precision:  84.77%; recall:  86.33%; FB1:  85.55  939
              ORG: precision:  84.43%; recall:  87.77%; FB1:  86.07  1394
              PER: precision:  93.20%; recall:  95.93%; FB1:  94.54  1896



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 28 = 3.188711643218994
conlleval:
processed 51578 tokens with 5942 phrases; found: 6023 phrases; correct: 5476.
accuracy:  98.69%; precision:  90.92%; recall:  92.16%; FB1:  91.53
              LOC: precision:  95.24%; recall:  93.69%; FB1:  94.46  1807
             MISC: precision:  86.31%; recall:  86.88%; FB1:  86.59  928
              ORG: precision:  85.21%; recall:  88.52%; FB1:  86.83  1393
              PER: precision:  93.25%; recall:  95.93%; FB1:  94.57  1895



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 29 = 2.876906633377075
conlleval:
processed 51578 tokens with 5942 phrases; found: 6018 phrases; correct: 5475.
accuracy:  98.73%; precision:  90.98%; recall:  92.14%; FB1:  91.56
              LOC: precision:  95.30%; recall:  93.79%; FB1:  94.54  1808
             MISC: precision:  84.97%; recall:  86.44%; FB1:  85.70  938
              ORG: precision:  85.44%; recall:  89.26%; FB1:  87.31  1401
              PER: precision:  93.96%; recall:  95.44%; FB1:  94.69  1871



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 30 = 2.788072347640991
conlleval:
processed 51578 tokens with 5942 phrases; found: 6029 phrases; correct: 5473.
accuracy:  98.73%; precision:  90.78%; recall:  92.11%; FB1:  91.44
              LOC: precision:  95.75%; recall:  94.39%; FB1:  95.07  1811
             MISC: precision:  84.98%; recall:  85.90%; FB1:  85.44  932
              ORG: precision:  84.68%; recall:  87.77%; FB1:  86.20  1390
              PER: precision:  93.35%; recall:  96.09%; FB1:  94.70  1896

----------------------------
-START-/START/START SEOUL/I-LOC/I-LOC 1996-08-31/O/O -END-/END/END
Predicted:	 ['START', 'I-LOC', 'O', 'END']
Gold:		 ['START', 'I-LOC', 'O', 'END']
----------------------------
-START-/START/START 5./O/O Shem/I-PER/I-PER Kororia/I-PER/I-PER (/O/O Kenya/I-LOC/I-LOC )/O/O 13:06.65/O/O -END-/END/END
Predicted:	 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'END']
----------------------------
-STAR

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 31 = 2.786778211593628
conlleval:
processed 51578 tokens with 5942 phrases; found: 6040 phrases; correct: 5462.
accuracy:  98.63%; precision:  90.43%; recall:  91.92%; FB1:  91.17
              LOC: precision:  94.91%; recall:  94.45%; FB1:  94.68  1828
             MISC: precision:  83.78%; recall:  87.42%; FB1:  85.56  962
              ORG: precision:  86.27%; recall:  85.76%; FB1:  86.01  1333
              PER: precision:  92.38%; recall:  96.15%; FB1:  94.23  1917



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 32 = 2.550187826156616
conlleval:
processed 51578 tokens with 5942 phrases; found: 6032 phrases; correct: 5495.
accuracy:  98.74%; precision:  91.10%; recall:  92.48%; FB1:  91.78
              LOC: precision:  95.68%; recall:  93.96%; FB1:  94.81  1804
             MISC: precision:  86.49%; recall:  86.77%; FB1:  86.63  925
              ORG: precision:  85.52%; recall:  88.52%; FB1:  86.99  1388
              PER: precision:  93.05%; recall:  96.74%; FB1:  94.86  1915



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 33 = 2.5506081581115723
conlleval:
processed 51578 tokens with 5942 phrases; found: 6023 phrases; correct: 5479.
accuracy:  98.73%; precision:  90.97%; recall:  92.21%; FB1:  91.58
              LOC: precision:  95.95%; recall:  94.07%; FB1:  95.00  1801
             MISC: precision:  84.38%; recall:  86.12%; FB1:  85.24  941
              ORG: precision:  85.33%; recall:  88.89%; FB1:  87.07  1397
              PER: precision:  93.68%; recall:  95.82%; FB1:  94.74  1884



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 34 = 2.478121757507324
conlleval:
processed 51578 tokens with 5942 phrases; found: 6041 phrases; correct: 5494.
accuracy:  98.76%; precision:  90.95%; recall:  92.46%; FB1:  91.70
              LOC: precision:  96.37%; recall:  94.07%; FB1:  95.21  1793
             MISC: precision:  84.16%; recall:  86.98%; FB1:  85.55  953
              ORG: precision:  86.23%; recall:  88.74%; FB1:  87.47  1380
              PER: precision:  92.64%; recall:  96.31%; FB1:  94.44  1915



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 35 = 1.830381155014038
conlleval:
processed 51578 tokens with 5942 phrases; found: 6059 phrases; correct: 5498.
accuracy:  98.73%; precision:  90.74%; recall:  92.53%; FB1:  91.63
              LOC: precision:  95.66%; recall:  94.77%; FB1:  95.21  1820
             MISC: precision:  82.83%; recall:  86.33%; FB1:  84.55  961
              ORG: precision:  86.26%; recall:  88.96%; FB1:  87.59  1383
              PER: precision:  93.30%; recall:  95.98%; FB1:  94.62  1895



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 36 = 2.0933451652526855
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5499.
accuracy:  98.77%; precision:  90.97%; recall:  92.54%; FB1:  91.75
              LOC: precision:  94.87%; recall:  95.70%; FB1:  95.28  1853
             MISC: precision:  85.24%; recall:  85.79%; FB1:  85.51  928
              ORG: precision:  86.65%; recall:  87.62%; FB1:  87.13  1356
              PER: precision:  93.03%; recall:  96.36%; FB1:  94.67  1908



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 37 = 1.926824927330017
conlleval:
processed 51578 tokens with 5942 phrases; found: 6048 phrases; correct: 5471.
accuracy:  98.71%; precision:  90.46%; recall:  92.07%; FB1:  91.26
              LOC: precision:  95.87%; recall:  93.47%; FB1:  94.65  1791
             MISC: precision:  84.02%; recall:  86.12%; FB1:  85.06  945
              ORG: precision:  84.81%; recall:  89.11%; FB1:  86.91  1409
              PER: precision:  92.75%; recall:  95.82%; FB1:  94.26  1903



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 38 = 2.2001214027404785
conlleval:
processed 51578 tokens with 5942 phrases; found: 6037 phrases; correct: 5484.
accuracy:  98.77%; precision:  90.84%; recall:  92.29%; FB1:  91.56
              LOC: precision:  95.78%; recall:  94.01%; FB1:  94.89  1803
             MISC: precision:  83.47%; recall:  86.55%; FB1:  84.98  956
              ORG: precision:  85.59%; recall:  88.59%; FB1:  87.06  1388
              PER: precision:  93.70%; recall:  96.15%; FB1:  94.91  1890



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 39 = 2.3558900356292725
conlleval:
processed 51578 tokens with 5942 phrases; found: 6060 phrases; correct: 5480.
accuracy:  98.73%; precision:  90.43%; recall:  92.22%; FB1:  91.32
              LOC: precision:  95.31%; recall:  94.12%; FB1:  94.71  1814
             MISC: precision:  84.46%; recall:  86.66%; FB1:  85.55  946
              ORG: precision:  86.47%; recall:  87.70%; FB1:  87.08  1360
              PER: precision:  91.55%; recall:  96.42%; FB1:  93.92  1940



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 40 = 1.7689828872680664
conlleval:
processed 51578 tokens with 5942 phrases; found: 6029 phrases; correct: 5488.
accuracy:  98.74%; precision:  91.03%; recall:  92.36%; FB1:  91.69
              LOC: precision:  95.18%; recall:  94.56%; FB1:  94.87  1825
             MISC: precision:  86.20%; recall:  85.36%; FB1:  85.78  913
              ORG: precision:  84.96%; recall:  89.34%; FB1:  87.10  1410
              PER: precision:  93.89%; recall:  95.87%; FB1:  94.87  1881

----------------------------
-START-/START/START He/O/O declined/O/O to/O/O say/O/O whether/O/O the/O/O girls/O/O had/O/O been/O/O kidnapped/O/O or/O/O whether/O/O they/O/O had/O/O gone/O/O away/O/O of/O/O their/O/O own/O/O accord/O/O ./O/O -END-/END/END
Predicted:	 ['START', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 41 = 1.5115333795547485
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5476.
accuracy:  98.72%; precision:  90.59%; recall:  92.16%; FB1:  91.37
              LOC: precision:  96.40%; recall:  93.36%; FB1:  94.86  1779
             MISC: precision:  85.48%; recall:  84.92%; FB1:  85.20  916
              ORG: precision:  83.70%; recall:  89.63%; FB1:  86.57  1436
              PER: precision:  92.79%; recall:  96.42%; FB1:  94.57  1914



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 42 = 1.8462995290756226
conlleval:
processed 51578 tokens with 5942 phrases; found: 5998 phrases; correct: 5462.
accuracy:  98.71%; precision:  91.06%; recall:  91.92%; FB1:  91.49
              LOC: precision:  96.16%; recall:  93.96%; FB1:  95.04  1795
             MISC: precision:  85.31%; recall:  85.68%; FB1:  85.50  926
              ORG: precision:  85.52%; recall:  88.07%; FB1:  86.77  1381
              PER: precision:  93.09%; recall:  95.82%; FB1:  94.44  1896



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 43 = 1.474531888961792
conlleval:
processed 51578 tokens with 5942 phrases; found: 6040 phrases; correct: 5486.
accuracy:  98.77%; precision:  90.83%; recall:  92.33%; FB1:  91.57
              LOC: precision:  96.16%; recall:  93.96%; FB1:  95.04  1795
             MISC: precision:  84.49%; recall:  86.88%; FB1:  85.67  948
              ORG: precision:  85.43%; recall:  88.29%; FB1:  86.84  1386
              PER: precision:  92.88%; recall:  96.36%; FB1:  94.59  1911



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 44 = 1.6645466089248657
conlleval:
processed 51578 tokens with 5942 phrases; found: 6051 phrases; correct: 5512.
accuracy:  98.78%; precision:  91.09%; recall:  92.76%; FB1:  91.92
              LOC: precision:  95.74%; recall:  94.18%; FB1:  94.95  1807
             MISC: precision:  85.30%; recall:  86.88%; FB1:  86.08  939
              ORG: precision:  86.03%; recall:  90.01%; FB1:  87.97  1403
              PER: precision:  93.27%; recall:  96.31%; FB1:  94.76  1902



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 45 = 1.8882019519805908
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5502.
accuracy:  98.75%; precision:  91.02%; recall:  92.60%; FB1:  91.80
              LOC: precision:  95.25%; recall:  94.88%; FB1:  95.06  1830
             MISC: precision:  85.44%; recall:  86.55%; FB1:  85.99  934
              ORG: precision:  86.51%; recall:  88.44%; FB1:  87.46  1371
              PER: precision:  92.93%; recall:  96.36%; FB1:  94.62  1910



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 46 = 1.3768668174743652
conlleval:
processed 51578 tokens with 5942 phrases; found: 6026 phrases; correct: 5502.
accuracy:  98.79%; precision:  91.30%; recall:  92.60%; FB1:  91.95
              LOC: precision:  95.35%; recall:  94.77%; FB1:  95.06  1826
             MISC: precision:  85.88%; recall:  85.79%; FB1:  85.84  921
              ORG: precision:  86.82%; recall:  88.89%; FB1:  87.84  1373
              PER: precision:  93.28%; recall:  96.53%; FB1:  94.88  1906



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 47 = 1.3087987899780273
conlleval:
processed 51578 tokens with 5942 phrases; found: 6031 phrases; correct: 5489.
accuracy:  98.73%; precision:  91.01%; recall:  92.38%; FB1:  91.69
              LOC: precision:  96.06%; recall:  94.23%; FB1:  95.14  1802
             MISC: precision:  85.42%; recall:  85.79%; FB1:  85.61  926
              ORG: precision:  86.71%; recall:  89.04%; FB1:  87.86  1377
              PER: precision:  92.06%; recall:  96.25%; FB1:  94.11  1926



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 48 = 0.9961185455322266
conlleval:
processed 51578 tokens with 5942 phrases; found: 6035 phrases; correct: 5496.
accuracy:  98.77%; precision:  91.07%; recall:  92.49%; FB1:  91.78
              LOC: precision:  95.29%; recall:  94.61%; FB1:  94.95  1824
             MISC: precision:  85.21%; recall:  86.23%; FB1:  85.71  933
              ORG: precision:  86.55%; recall:  89.71%; FB1:  88.10  1390
              PER: precision:  93.22%; recall:  95.55%; FB1:  94.37  1888



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 49 = 1.2694920301437378
conlleval:
processed 51578 tokens with 5942 phrases; found: 6033 phrases; correct: 5485.
accuracy:  98.75%; precision:  90.92%; recall:  92.31%; FB1:  91.61
              LOC: precision:  96.29%; recall:  94.56%; FB1:  95.41  1804
             MISC: precision:  83.26%; recall:  87.42%; FB1:  85.29  968
              ORG: precision:  86.79%; recall:  86.73%; FB1:  86.76  1340
              PER: precision:  92.61%; recall:  96.58%; FB1:  94.55  1921



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 50 = 1.3680061101913452
conlleval:
processed 51578 tokens with 5942 phrases; found: 6026 phrases; correct: 5471.
accuracy:  98.72%; precision:  90.79%; recall:  92.07%; FB1:  91.43
              LOC: precision:  95.61%; recall:  93.63%; FB1:  94.61  1799
             MISC: precision:  85.24%; recall:  86.44%; FB1:  85.84  935
              ORG: precision:  85.31%; recall:  87.92%; FB1:  86.60  1382
              PER: precision:  92.93%; recall:  96.36%; FB1:  94.62  1910

----------------------------
-START-/START/START -DOCSTART-/O/O -END-/END/END
Predicted:	 ['START', 'O', 'END']
Gold:		 ['START', 'O', 'END']
----------------------------
-START-/START/START -DOCSTART-/O/O -END-/END/END
Predicted:	 ['START', 'O', 'END']
Gold:		 ['START', 'O', 'END']
----------------------------
-START-/START/START Fredericks/I-PER/I-PER (/O/O Namibia/I-LOC/I-LOC )/O/O ,/O/O Linford/I-PER/I-PER Christie/I-PER/I-PER (/O/O Britain/I-LOC/I-LOC )/O/O 38.87/O/O seconds/O/O -END-/END/END
Predic

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 51 = 2.0846054553985596
conlleval:
processed 51578 tokens with 5942 phrases; found: 6028 phrases; correct: 5477.
accuracy:  98.74%; precision:  90.86%; recall:  92.17%; FB1:  91.51
              LOC: precision:  95.78%; recall:  93.96%; FB1:  94.86  1802
             MISC: precision:  85.79%; recall:  86.44%; FB1:  86.12  929
              ORG: precision:  85.25%; recall:  88.81%; FB1:  87.00  1397
              PER: precision:  92.79%; recall:  95.71%; FB1:  94.23  1900



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 52 = 0.819855272769928
conlleval:
processed 51578 tokens with 5942 phrases; found: 6020 phrases; correct: 5480.
accuracy:  98.76%; precision:  91.03%; recall:  92.22%; FB1:  91.62
              LOC: precision:  95.38%; recall:  94.34%; FB1:  94.85  1817
             MISC: precision:  84.85%; recall:  85.68%; FB1:  85.27  931
              ORG: precision:  85.73%; recall:  88.74%; FB1:  87.21  1388
              PER: precision:  93.79%; recall:  95.93%; FB1:  94.85  1884



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 53 = 0.9911814332008362
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5470.
accuracy:  98.75%; precision:  90.49%; recall:  92.06%; FB1:  91.27
              LOC: precision:  95.17%; recall:  94.34%; FB1:  94.75  1821
             MISC: precision:  84.22%; recall:  86.23%; FB1:  85.21  944
              ORG: precision:  84.63%; recall:  88.67%; FB1:  86.60  1405
              PER: precision:  93.49%; recall:  95.17%; FB1:  94.32  1875



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 54 = 1.3418047428131104
conlleval:
processed 51578 tokens with 5942 phrases; found: 5976 phrases; correct: 5449.
accuracy:  98.70%; precision:  91.18%; recall:  91.70%; FB1:  91.44
              LOC: precision:  95.30%; recall:  93.85%; FB1:  94.57  1809
             MISC: precision:  85.46%; recall:  84.82%; FB1:  85.14  915
              ORG: precision:  85.39%; recall:  87.62%; FB1:  86.49  1376
              PER: precision:  94.24%; recall:  95.98%; FB1:  95.10  1876



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 55 = 1.6047687530517578
conlleval:
processed 51578 tokens with 5942 phrases; found: 5999 phrases; correct: 5479.
accuracy:  98.72%; precision:  91.33%; recall:  92.21%; FB1:  91.77
              LOC: precision:  95.87%; recall:  94.88%; FB1:  95.38  1818
             MISC: precision:  85.59%; recall:  86.33%; FB1:  85.96  930
              ORG: precision:  86.98%; recall:  87.17%; FB1:  87.08  1344
              PER: precision:  92.87%; recall:  96.15%; FB1:  94.48  1907



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 56 = 0.9377322196960449
conlleval:
processed 51578 tokens with 5942 phrases; found: 6012 phrases; correct: 5492.
accuracy:  98.75%; precision:  91.35%; recall:  92.43%; FB1:  91.89
              LOC: precision:  95.58%; recall:  94.28%; FB1:  94.93  1812
             MISC: precision:  85.88%; recall:  86.44%; FB1:  86.16  928
              ORG: precision:  86.90%; recall:  89.04%; FB1:  87.96  1374
              PER: precision:  93.20%; recall:  96.04%; FB1:  94.60  1898



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 57 = 1.157020926475525
conlleval:
processed 51578 tokens with 5942 phrases; found: 6016 phrases; correct: 5511.
accuracy:  98.81%; precision:  91.61%; recall:  92.75%; FB1:  92.17
              LOC: precision:  95.22%; recall:  95.48%; FB1:  95.35  1842
             MISC: precision:  85.51%; recall:  85.14%; FB1:  85.33  918
              ORG: precision:  86.50%; recall:  89.86%; FB1:  88.15  1393
              PER: precision:  94.85%; recall:  95.93%; FB1:  95.38  1863



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 58 = 1.1176384687423706
conlleval:
processed 51578 tokens with 5942 phrases; found: 6019 phrases; correct: 5499.
accuracy:  98.76%; precision:  91.36%; recall:  92.54%; FB1:  91.95
              LOC: precision:  95.84%; recall:  95.26%; FB1:  95.55  1826
             MISC: precision:  84.79%; recall:  86.44%; FB1:  85.61  940
              ORG: precision:  86.68%; recall:  88.29%; FB1:  87.48  1366
              PER: precision:  93.69%; recall:  95.98%; FB1:  94.82  1887



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 59 = 1.3476529121398926
conlleval:
processed 51578 tokens with 5942 phrases; found: 5998 phrases; correct: 5496.
accuracy:  98.78%; precision:  91.63%; recall:  92.49%; FB1:  92.06
              LOC: precision:  95.69%; recall:  95.37%; FB1:  95.53  1831
             MISC: precision:  85.78%; recall:  85.68%; FB1:  85.73  921
              ORG: precision:  88.02%; recall:  88.22%; FB1:  88.12  1344
              PER: precision:  93.11%; recall:  96.15%; FB1:  94.60  1902



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 60 = 0.9462077617645264
conlleval:
processed 51578 tokens with 5942 phrases; found: 6021 phrases; correct: 5479.
accuracy:  98.75%; precision:  91.00%; recall:  92.21%; FB1:  91.60
              LOC: precision:  95.45%; recall:  94.72%; FB1:  95.08  1823
             MISC: precision:  85.64%; recall:  85.36%; FB1:  85.50  919
              ORG: precision:  85.37%; recall:  88.74%; FB1:  87.02  1394
              PER: precision:  93.47%; recall:  95.66%; FB1:  94.55  1885

----------------------------
-START-/START/START 4./O/O Erik/I-PER/I-PER Breukink/I-PER/I-PER (/O/O Netherlands/I-LOC/I-LOC )/O/O Rabobank/I-ORG/I-ORG 8/O/O seconds/O/O -END-/END/END
Predicted:	 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'I-ORG', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'I-ORG', 'O', 'O', 'END']
----------------------------
-START-/START/START George/I-PER/I-PER Bush/I-PER/I-PER became/O/O president/O/O in/O/O 1988/O/O on/O/O his/O/O no-new-ta

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 61 = 0.9823054671287537
conlleval:
processed 51578 tokens with 5942 phrases; found: 6018 phrases; correct: 5491.
accuracy:  98.77%; precision:  91.24%; recall:  92.41%; FB1:  91.82
              LOC: precision:  95.47%; recall:  94.07%; FB1:  94.76  1810
             MISC: precision:  84.88%; recall:  87.09%; FB1:  85.97  946
              ORG: precision:  86.91%; recall:  88.59%; FB1:  87.74  1367
              PER: precision:  93.51%; recall:  96.20%; FB1:  94.84  1895



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 62 = 0.8002633452415466
conlleval:
processed 51578 tokens with 5942 phrases; found: 6032 phrases; correct: 5493.
accuracy:  98.77%; precision:  91.06%; recall:  92.44%; FB1:  91.75
              LOC: precision:  95.64%; recall:  94.23%; FB1:  94.93  1810
             MISC: precision:  83.18%; recall:  86.33%; FB1:  84.73  957
              ORG: precision:  87.39%; recall:  88.89%; FB1:  88.13  1364
              PER: precision:  93.32%; recall:  96.31%; FB1:  94.79  1901



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 63 = 0.7894929051399231
conlleval:
processed 51578 tokens with 5942 phrases; found: 6055 phrases; correct: 5508.
accuracy:  98.77%; precision:  90.97%; recall:  92.70%; FB1:  91.82
              LOC: precision:  95.02%; recall:  94.50%; FB1:  94.76  1827
             MISC: precision:  84.85%; recall:  86.23%; FB1:  85.53  937
              ORG: precision:  86.39%; recall:  89.49%; FB1:  87.91  1389
              PER: precision:  93.43%; recall:  96.47%; FB1:  94.93  1902



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 64 = 1.5979465246200562
conlleval:
processed 51578 tokens with 5942 phrases; found: 6010 phrases; correct: 5482.
accuracy:  98.78%; precision:  91.21%; recall:  92.26%; FB1:  91.73
              LOC: precision:  95.50%; recall:  94.72%; FB1:  95.11  1822
             MISC: precision:  85.93%; recall:  86.12%; FB1:  86.02  924
              ORG: precision:  85.62%; recall:  88.37%; FB1:  86.97  1384
              PER: precision:  93.78%; recall:  95.71%; FB1:  94.73  1880



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 65 = 1.0211180448532104
conlleval:
processed 51578 tokens with 5942 phrases; found: 6026 phrases; correct: 5491.
accuracy:  98.78%; precision:  91.12%; recall:  92.41%; FB1:  91.76
              LOC: precision:  96.11%; recall:  94.12%; FB1:  95.10  1799
             MISC: precision:  86.64%; recall:  85.14%; FB1:  85.89  906
              ORG: precision:  83.99%; recall:  90.01%; FB1:  86.90  1437
              PER: precision:  93.95%; recall:  96.09%; FB1:  95.01  1884



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 66 = 1.1248273849487305
conlleval:
processed 51578 tokens with 5942 phrases; found: 6057 phrases; correct: 5504.
accuracy:  98.79%; precision:  90.87%; recall:  92.63%; FB1:  91.74
              LOC: precision:  94.77%; recall:  94.67%; FB1:  94.72  1835
             MISC: precision:  84.88%; recall:  86.44%; FB1:  85.65  939
              ORG: precision:  85.58%; recall:  89.41%; FB1:  87.45  1401
              PER: precision:  94.00%; recall:  96.04%; FB1:  95.01  1882



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 67 = 1.2002052068710327
conlleval:
processed 51578 tokens with 5942 phrases; found: 6030 phrases; correct: 5486.
accuracy:  98.75%; precision:  90.98%; recall:  92.33%; FB1:  91.65
              LOC: precision:  95.53%; recall:  94.28%; FB1:  94.90  1813
             MISC: precision:  85.04%; recall:  85.68%; FB1:  85.36  929
              ORG: precision:  85.14%; recall:  89.26%; FB1:  87.15  1406
              PER: precision:  93.89%; recall:  95.93%; FB1:  94.90  1882



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 68 = 1.2146705389022827
conlleval:
processed 51578 tokens with 5942 phrases; found: 6028 phrases; correct: 5496.
accuracy:  98.77%; precision:  91.17%; recall:  92.49%; FB1:  91.83
              LOC: precision:  95.56%; recall:  94.88%; FB1:  95.22  1824
             MISC: precision:  85.47%; recall:  86.12%; FB1:  85.79  929
              ORG: precision:  86.80%; recall:  88.29%; FB1:  87.54  1364
              PER: precision:  92.88%; recall:  96.36%; FB1:  94.59  1911



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 69 = 0.9926496148109436
conlleval:
processed 51578 tokens with 5942 phrases; found: 6043 phrases; correct: 5484.
accuracy:  98.75%; precision:  90.75%; recall:  92.29%; FB1:  91.51
              LOC: precision:  94.66%; recall:  94.56%; FB1:  94.61  1835
             MISC: precision:  84.16%; recall:  86.44%; FB1:  85.29  947
              ORG: precision:  86.06%; recall:  88.37%; FB1:  87.20  1377
              PER: precision:  93.68%; recall:  95.82%; FB1:  94.74  1884



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 70 = 0.7346284985542297
conlleval:
processed 51578 tokens with 5942 phrases; found: 6039 phrases; correct: 5498.
accuracy:  98.79%; precision:  91.04%; recall:  92.53%; FB1:  91.78
              LOC: precision:  94.88%; recall:  94.88%; FB1:  94.88  1837
             MISC: precision:  84.04%; recall:  86.23%; FB1:  85.12  946
              ORG: precision:  86.90%; recall:  88.52%; FB1:  87.70  1366
              PER: precision:  93.81%; recall:  96.25%; FB1:  95.02  1890

----------------------------
-START-/START/START Sutjeska/I-ORG/I-ORG 1/O/O Loznica/I-ORG/I-ORG 0/O/O -END-/END/END
Predicted:	 ['START', 'I-ORG', 'O', 'I-ORG', 'O', 'END']
Gold:		 ['START', 'I-ORG', 'O', 'I-ORG', 'O', 'END']
----------------------------
-START-/START/START FRIDAY/O/O ,/O/O AUGUST/O/O 30/O/O SCHEDULE/O/O -END-/END/END
Predicted:	 ['START', 'O', 'O', 'O', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'O', 'O', 'O', 'O', 'END']
----------------------------
-START-/START/START from/O/O (/O/O Patr

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 71 = 0.9706400632858276
conlleval:
processed 51578 tokens with 5942 phrases; found: 5974 phrases; correct: 5463.
accuracy:  98.75%; precision:  91.45%; recall:  91.94%; FB1:  91.69
              LOC: precision:  95.15%; recall:  94.07%; FB1:  94.61  1816
             MISC: precision:  84.66%; recall:  85.57%; FB1:  85.11  932
              ORG: precision:  87.44%; recall:  88.29%; FB1:  87.87  1354
              PER: precision:  94.12%; recall:  95.66%; FB1:  94.88  1872



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 72 = 0.8818730711936951
conlleval:
processed 51578 tokens with 5942 phrases; found: 6038 phrases; correct: 5463.
accuracy:  98.72%; precision:  90.48%; recall:  91.94%; FB1:  91.20
              LOC: precision:  95.45%; recall:  93.74%; FB1:  94.59  1804
             MISC: precision:  84.19%; recall:  85.47%; FB1:  84.82  936
              ORG: precision:  84.73%; recall:  87.70%; FB1:  86.19  1388
              PER: precision:  93.04%; recall:  96.47%; FB1:  94.72  1910



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 73 = 0.8839454054832458
conlleval:
processed 51578 tokens with 5942 phrases; found: 6037 phrases; correct: 5487.
accuracy:  98.77%; precision:  90.89%; recall:  92.34%; FB1:  91.61
              LOC: precision:  94.39%; recall:  95.32%; FB1:  94.85  1855
             MISC: precision:  85.57%; recall:  84.92%; FB1:  85.25  915
              ORG: precision:  86.97%; recall:  88.07%; FB1:  87.51  1358
              PER: precision:  92.82%; recall:  96.20%; FB1:  94.48  1909



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 74 = 0.8965342044830322
conlleval:
processed 51578 tokens with 5942 phrases; found: 6031 phrases; correct: 5487.
accuracy:  98.79%; precision:  90.98%; recall:  92.34%; FB1:  91.66
              LOC: precision:  94.87%; recall:  94.61%; FB1:  94.74  1832
             MISC: precision:  86.04%; recall:  85.57%; FB1:  85.81  917
              ORG: precision:  85.99%; recall:  88.81%; FB1:  87.38  1385
              PER: precision:  93.25%; recall:  96.04%; FB1:  94.62  1897



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 75 = 0.9730690717697144
conlleval:
processed 51578 tokens with 5942 phrases; found: 6033 phrases; correct: 5485.
accuracy:  98.73%; precision:  90.92%; recall:  92.31%; FB1:  91.61
              LOC: precision:  95.65%; recall:  94.50%; FB1:  95.07  1815
             MISC: precision:  84.22%; recall:  85.68%; FB1:  84.95  938
              ORG: precision:  86.73%; recall:  88.22%; FB1:  87.47  1364
              PER: precision:  92.69%; recall:  96.42%; FB1:  94.52  1916



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 76 = 0.5232837200164795
conlleval:
processed 51578 tokens with 5942 phrases; found: 6034 phrases; correct: 5502.
accuracy:  98.79%; precision:  91.18%; recall:  92.60%; FB1:  91.88
              LOC: precision:  94.50%; recall:  95.48%; FB1:  94.99  1856
             MISC: precision:  85.67%; recall:  86.88%; FB1:  86.27  935
              ORG: precision:  87.70%; recall:  87.70%; FB1:  87.70  1341
              PER: precision:  93.11%; recall:  96.15%; FB1:  94.60  1902



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 77 = 0.6632832288742065
conlleval:
processed 51578 tokens with 5942 phrases; found: 6011 phrases; correct: 5508.
accuracy:  98.82%; precision:  91.63%; recall:  92.70%; FB1:  92.16
              LOC: precision:  95.13%; recall:  94.72%; FB1:  94.93  1829
             MISC: precision:  86.24%; recall:  86.33%; FB1:  86.29  923
              ORG: precision:  87.38%; recall:  89.34%; FB1:  88.35  1371
              PER: precision:  93.96%; recall:  96.31%; FB1:  95.12  1888



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 78 = 0.8560560345649719
conlleval:
processed 51578 tokens with 5942 phrases; found: 6050 phrases; correct: 5495.
accuracy:  98.76%; precision:  90.83%; recall:  92.48%; FB1:  91.64
              LOC: precision:  95.40%; recall:  94.77%; FB1:  95.08  1825
             MISC: precision:  84.73%; recall:  86.66%; FB1:  85.68  943
              ORG: precision:  84.86%; recall:  88.22%; FB1:  86.51  1394
              PER: precision:  93.86%; recall:  96.20%; FB1:  95.01  1888



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 79 = 1.001223087310791
conlleval:
processed 51578 tokens with 5942 phrases; found: 5993 phrases; correct: 5490.
accuracy:  98.77%; precision:  91.61%; recall:  92.39%; FB1:  92.00
              LOC: precision:  95.52%; recall:  95.16%; FB1:  95.34  1830
             MISC: precision:  88.50%; recall:  85.14%; FB1:  86.79  887
              ORG: precision:  85.19%; recall:  88.37%; FB1:  86.75  1391
              PER: precision:  94.01%; recall:  96.20%; FB1:  95.09  1885



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 80 = 0.9607558846473694
conlleval:
processed 51578 tokens with 5942 phrases; found: 6013 phrases; correct: 5490.
accuracy:  98.79%; precision:  91.30%; recall:  92.39%; FB1:  91.84
              LOC: precision:  95.69%; recall:  94.34%; FB1:  95.01  1811
             MISC: precision:  84.58%; recall:  85.68%; FB1:  85.13  934
              ORG: precision:  86.11%; recall:  89.19%; FB1:  87.62  1389
              PER: precision:  94.25%; recall:  96.15%; FB1:  95.19  1879

----------------------------
-START-/START/START HOUSTON/I-ORG/I-ORG AT/O/O PITTSBURGH/I-LOC/I-LOC -END-/END/END
Predicted:	 ['START', 'I-ORG', 'O', 'I-LOC', 'END']
Gold:		 ['START', 'I-ORG', 'O', 'I-LOC', 'END']
----------------------------
-START-/START/START 6./O/O Jon/I-PER/I-PER Drummond/I-PER/I-PER (/O/O U.S./I-LOC/I-LOC )/O/O 20.78/O/O -END-/END/END
Predicted:	 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'END']
--

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 81 = 0.7104125618934631
conlleval:
processed 51578 tokens with 5942 phrases; found: 6012 phrases; correct: 5484.
accuracy:  98.76%; precision:  91.22%; recall:  92.29%; FB1:  91.75
              LOC: precision:  95.31%; recall:  95.05%; FB1:  95.18  1832
             MISC: precision:  85.10%; recall:  86.12%; FB1:  85.61  933
              ORG: precision:  86.99%; recall:  88.29%; FB1:  87.64  1361
              PER: precision:  93.32%; recall:  95.55%; FB1:  94.42  1886



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 82 = 0.6704590916633606
conlleval:
processed 51578 tokens with 5942 phrases; found: 6025 phrases; correct: 5483.
accuracy:  98.72%; precision:  91.00%; recall:  92.28%; FB1:  91.64
              LOC: precision:  95.70%; recall:  94.50%; FB1:  95.10  1814
             MISC: precision:  85.32%; recall:  86.98%; FB1:  86.14  940
              ORG: precision:  85.96%; recall:  86.73%; FB1:  86.34  1353
              PER: precision:  92.91%; recall:  96.74%; FB1:  94.79  1918



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 83 = 0.9359360933303833
conlleval:
processed 51578 tokens with 5942 phrases; found: 6041 phrases; correct: 5506.
accuracy:  98.79%; precision:  91.14%; recall:  92.66%; FB1:  91.90
              LOC: precision:  95.34%; recall:  94.67%; FB1:  95.00  1824
             MISC: precision:  85.95%; recall:  86.23%; FB1:  86.09  925
              ORG: precision:  85.82%; recall:  89.78%; FB1:  87.76  1403
              PER: precision:  93.59%; recall:  95.98%; FB1:  94.77  1889



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 84 = 0.725657045841217
conlleval:
processed 51578 tokens with 5942 phrases; found: 6028 phrases; correct: 5500.
accuracy:  98.78%; precision:  91.24%; recall:  92.56%; FB1:  91.90
              LOC: precision:  95.63%; recall:  94.18%; FB1:  94.90  1809
             MISC: precision:  85.05%; recall:  86.98%; FB1:  86.01  943
              ORG: precision:  86.02%; recall:  89.04%; FB1:  87.50  1388
              PER: precision:  93.96%; recall:  96.31%; FB1:  95.12  1888



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 85 = 0.7280406355857849
conlleval:
processed 51578 tokens with 5942 phrases; found: 6031 phrases; correct: 5494.
accuracy:  98.76%; precision:  91.10%; recall:  92.46%; FB1:  91.77
              LOC: precision:  95.04%; recall:  94.83%; FB1:  94.93  1833
             MISC: precision:  85.21%; recall:  86.88%; FB1:  86.04  940
              ORG: precision:  85.46%; recall:  88.07%; FB1:  86.74  1382
              PER: precision:  94.35%; recall:  96.09%; FB1:  95.21  1876



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 86 = 0.7121262550354004
conlleval:
processed 51578 tokens with 5942 phrases; found: 6062 phrases; correct: 5491.
accuracy:  98.77%; precision:  90.58%; recall:  92.41%; FB1:  91.49
              LOC: precision:  94.80%; recall:  94.28%; FB1:  94.54  1827
             MISC: precision:  84.22%; recall:  86.23%; FB1:  85.21  944
              ORG: precision:  85.70%; recall:  88.52%; FB1:  87.09  1385
              PER: precision:  93.23%; recall:  96.47%; FB1:  94.82  1906



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 87 = 0.5248392820358276
conlleval:
processed 51578 tokens with 5942 phrases; found: 6022 phrases; correct: 5509.
accuracy:  98.82%; precision:  91.48%; recall:  92.71%; FB1:  92.09
              LOC: precision:  95.67%; recall:  94.94%; FB1:  95.30  1823
             MISC: precision:  85.56%; recall:  86.77%; FB1:  86.16  935
              ORG: precision:  86.41%; recall:  88.67%; FB1:  87.52  1376
              PER: precision:  94.07%; recall:  96.42%; FB1:  95.23  1888



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 88 = 0.5402951240539551
conlleval:
processed 51578 tokens with 5942 phrases; found: 6032 phrases; correct: 5498.
accuracy:  98.78%; precision:  91.15%; recall:  92.53%; FB1:  91.83
              LOC: precision:  95.34%; recall:  94.67%; FB1:  95.00  1824
             MISC: precision:  85.31%; recall:  85.68%; FB1:  85.50  926
              ORG: precision:  85.92%; recall:  89.19%; FB1:  87.52  1392
              PER: precision:  93.81%; recall:  96.25%; FB1:  95.02  1890



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 89 = 0.40595003962516785
conlleval:
processed 51578 tokens with 5942 phrases; found: 6048 phrases; correct: 5499.
accuracy:  98.77%; precision:  90.92%; recall:  92.54%; FB1:  91.73
              LOC: precision:  95.22%; recall:  94.34%; FB1:  94.78  1820
             MISC: precision:  84.28%; recall:  86.66%; FB1:  85.45  948
              ORG: precision:  86.62%; recall:  88.81%; FB1:  87.70  1375
              PER: precision:  93.23%; recall:  96.42%; FB1:  94.80  1905



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 90 = 0.6765504479408264
conlleval:
processed 51578 tokens with 5942 phrases; found: 6069 phrases; correct: 5500.
accuracy:  98.76%; precision:  90.62%; recall:  92.56%; FB1:  91.58
              LOC: precision:  95.06%; recall:  94.34%; FB1:  94.70  1823
             MISC: precision:  83.67%; recall:  85.57%; FB1:  84.61  943
              ORG: precision:  85.35%; recall:  89.49%; FB1:  87.37  1406
              PER: precision:  93.73%; recall:  96.53%; FB1:  95.11  1897

----------------------------
-START-/START/START 3./O/O Rose/I-PER/I-PER Cheruiyot/I-PER/I-PER (/O/O Kenya/I-LOC/I-LOC )/O/O 15:05.41/O/O -END-/END/END
Predicted:	 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'I-PER', 'I-PER', 'O', 'I-LOC', 'O', 'O', 'END']
----------------------------
-START-/START/START Baratelli/I-PER/I-PER ,/O/O who/O/O played/O/O for/O/O Nice/I-ORG/I-ORG and/O/O Paris/I-LOC/I-ORG St/I-ORG/I-ORG Germain/I-LOC/I-ORG ,/O/O takes/O/O over/O/O fr

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 91 = 0.8872435688972473
conlleval:
processed 51578 tokens with 5942 phrases; found: 6041 phrases; correct: 5505.
accuracy:  98.77%; precision:  91.13%; recall:  92.65%; FB1:  91.88
              LOC: precision:  95.50%; recall:  94.77%; FB1:  95.14  1823
             MISC: precision:  86.10%; recall:  86.01%; FB1:  86.06  921
              ORG: precision:  85.54%; recall:  89.11%; FB1:  87.29  1397
              PER: precision:  93.47%; recall:  96.42%; FB1:  94.92  1900



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 92 = 0.6610360741615295
conlleval:
processed 51578 tokens with 5942 phrases; found: 6050 phrases; correct: 5496.
accuracy:  98.76%; precision:  90.84%; recall:  92.49%; FB1:  91.66
              LOC: precision:  95.29%; recall:  94.61%; FB1:  94.95  1824
             MISC: precision:  83.13%; recall:  87.09%; FB1:  85.06  966
              ORG: precision:  86.79%; recall:  88.22%; FB1:  87.50  1363
              PER: precision:  93.41%; recall:  96.20%; FB1:  94.78  1897



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 93 = 0.6315358877182007
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5496.
accuracy:  98.77%; precision:  90.92%; recall:  92.49%; FB1:  91.70
              LOC: precision:  95.45%; recall:  94.77%; FB1:  95.11  1824
             MISC: precision:  84.45%; recall:  86.01%; FB1:  85.22  939
              ORG: precision:  85.79%; recall:  89.11%; FB1:  87.42  1393
              PER: precision:  93.54%; recall:  95.93%; FB1:  94.72  1889



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 94 = 0.6527142524719238
conlleval:
processed 51578 tokens with 5942 phrases; found: 6026 phrases; correct: 5483.
accuracy:  98.76%; precision:  90.99%; recall:  92.28%; FB1:  91.63
              LOC: precision:  95.47%; recall:  94.18%; FB1:  94.82  1812
             MISC: precision:  85.13%; recall:  85.68%; FB1:  85.41  928
              ORG: precision:  85.85%; recall:  89.56%; FB1:  87.66  1399
              PER: precision:  93.38%; recall:  95.66%; FB1:  94.50  1887



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 95 = 0.48642757534980774
conlleval:
processed 51578 tokens with 5942 phrases; found: 6038 phrases; correct: 5490.
accuracy:  98.74%; precision:  90.92%; recall:  92.39%; FB1:  91.65
              LOC: precision:  95.42%; recall:  94.18%; FB1:  94.79  1813
             MISC: precision:  86.32%; recall:  86.23%; FB1:  86.27  921
              ORG: precision:  85.00%; recall:  89.19%; FB1:  87.05  1407
              PER: precision:  93.25%; recall:  96.04%; FB1:  94.62  1897



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 96 = 0.4975367784500122
conlleval:
processed 51578 tokens with 5942 phrases; found: 6011 phrases; correct: 5490.
accuracy:  98.75%; precision:  91.33%; recall:  92.39%; FB1:  91.86
              LOC: precision:  95.75%; recall:  94.39%; FB1:  95.07  1811
             MISC: precision:  84.45%; recall:  86.01%; FB1:  85.22  939
              ORG: precision:  86.46%; recall:  88.07%; FB1:  87.26  1366
              PER: precision:  94.04%; recall:  96.74%; FB1:  95.37  1895



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 97 = 0.729536235332489
conlleval:
processed 51578 tokens with 5942 phrases; found: 6056 phrases; correct: 5478.
accuracy:  98.69%; precision:  90.46%; recall:  92.19%; FB1:  91.32
              LOC: precision:  95.07%; recall:  94.39%; FB1:  94.73  1824
             MISC: precision:  84.33%; recall:  85.79%; FB1:  85.05  938
              ORG: precision:  84.85%; recall:  88.14%; FB1:  86.47  1393
              PER: precision:  93.16%; recall:  96.15%; FB1:  94.63  1901



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 98 = 0.5180591940879822
conlleval:
processed 51578 tokens with 5942 phrases; found: 6031 phrases; correct: 5476.
accuracy:  98.75%; precision:  90.80%; recall:  92.16%; FB1:  91.47
              LOC: precision:  95.70%; recall:  93.36%; FB1:  94.52  1792
             MISC: precision:  85.05%; recall:  85.14%; FB1:  85.09  923
              ORG: precision:  84.81%; recall:  89.11%; FB1:  86.91  1409
              PER: precision:  93.39%; recall:  96.69%; FB1:  95.01  1907



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 99 = 0.5101637840270996
conlleval:
processed 51578 tokens with 5942 phrases; found: 6039 phrases; correct: 5476.
accuracy:  98.71%; precision:  90.68%; recall:  92.16%; FB1:  91.41
              LOC: precision:  94.24%; recall:  95.26%; FB1:  94.75  1857
             MISC: precision:  83.67%; recall:  85.57%; FB1:  84.61  943
              ORG: precision:  87.67%; recall:  87.99%; FB1:  87.83  1346
              PER: precision:  92.82%; recall:  95.39%; FB1:  94.08  1893



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 100 = 0.7019873857498169
conlleval:
processed 51578 tokens with 5942 phrases; found: 6051 phrases; correct: 5497.
accuracy:  98.77%; precision:  90.84%; recall:  92.51%; FB1:  91.67
              LOC: precision:  95.01%; recall:  95.26%; FB1:  95.13  1842
             MISC: precision:  82.75%; recall:  87.42%; FB1:  85.02  974
              ORG: precision:  86.48%; recall:  88.74%; FB1:  87.60  1376
              PER: precision:  94.19%; recall:  95.06%; FB1:  94.62  1859

----------------------------
-START-/START/START -DOCSTART-/O/O -END-/END/END
Predicted:	 ['START', 'O', 'END']
Gold:		 ['START', 'O', 'END']
----------------------------
-START-/START/START GOLF/I-LOC/O -/O/O BRITISH/I-MISC/I-MISC MASTERS/I-MISC/I-MISC FINAL/O/O SCORES/O/O ./O/O -END-/END/END
Predicted:	 ['START', 'I-LOC', 'O', 'I-MISC', 'I-MISC', 'O', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'O', 'I-MISC', 'I-MISC', 'O', 'O', 'O', 'END']
----------------------------
-START-/START/START Canada/I-LOC/I-L

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 101 = 0.4296817481517792
conlleval:
processed 51578 tokens with 5942 phrases; found: 6051 phrases; correct: 5507.
accuracy:  98.78%; precision:  91.01%; recall:  92.68%; FB1:  91.84
              LOC: precision:  95.34%; recall:  94.67%; FB1:  95.00  1824
             MISC: precision:  84.35%; recall:  87.09%; FB1:  85.70  952
              ORG: precision:  85.84%; recall:  88.59%; FB1:  87.19  1384
              PER: precision:  93.97%; recall:  96.47%; FB1:  95.20  1891



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 102 = 0.5663375854492188
conlleval:
processed 51578 tokens with 5942 phrases; found: 6056 phrases; correct: 5506.
accuracy:  98.78%; precision:  90.92%; recall:  92.66%; FB1:  91.78
              LOC: precision:  94.85%; recall:  94.34%; FB1:  94.60  1827
             MISC: precision:  84.17%; recall:  87.64%; FB1:  85.87  960
              ORG: precision:  86.22%; recall:  88.67%; FB1:  87.43  1379
              PER: precision:  93.97%; recall:  96.42%; FB1:  95.18  1890



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 103 = 0.5427147150039673
conlleval:
processed 51578 tokens with 5942 phrases; found: 6036 phrases; correct: 5484.
accuracy:  98.73%; precision:  90.85%; recall:  92.29%; FB1:  91.57
              LOC: precision:  95.29%; recall:  94.67%; FB1:  94.98  1825
             MISC: precision:  85.24%; recall:  85.79%; FB1:  85.51  928
              ORG: precision:  84.71%; recall:  87.99%; FB1:  86.32  1393
              PER: precision:  93.86%; recall:  96.31%; FB1:  95.07  1890



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 104 = 0.7607239484786987
conlleval:
processed 51578 tokens with 5942 phrases; found: 6069 phrases; correct: 5490.
accuracy:  98.74%; precision:  90.46%; recall:  92.39%; FB1:  91.42
              LOC: precision:  95.16%; recall:  94.12%; FB1:  94.64  1817
             MISC: precision:  83.52%; recall:  86.88%; FB1:  85.17  959
              ORG: precision:  85.37%; recall:  88.37%; FB1:  86.84  1388
              PER: precision:  93.18%; recall:  96.36%; FB1:  94.74  1905



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 105 = 0.6449174284934998
conlleval:
processed 51578 tokens with 5942 phrases; found: 6063 phrases; correct: 5498.
accuracy:  98.77%; precision:  90.68%; recall:  92.53%; FB1:  91.60
              LOC: precision:  94.89%; recall:  94.94%; FB1:  94.91  1838
             MISC: precision:  83.56%; recall:  86.55%; FB1:  85.03  955
              ORG: precision:  86.21%; recall:  88.14%; FB1:  87.17  1371
              PER: precision:  93.42%; recall:  96.31%; FB1:  94.84  1899



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 106 = 0.590035080909729
conlleval:
processed 51578 tokens with 5942 phrases; found: 6039 phrases; correct: 5499.
accuracy:  98.78%; precision:  91.06%; recall:  92.54%; FB1:  91.80
              LOC: precision:  95.48%; recall:  94.34%; FB1:  94.91  1815
             MISC: precision:  83.96%; recall:  86.88%; FB1:  85.39  954
              ORG: precision:  86.99%; recall:  88.74%; FB1:  87.86  1368
              PER: precision:  93.32%; recall:  96.36%; FB1:  94.82  1902



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 107 = 0.439458966255188
conlleval:
processed 51578 tokens with 5942 phrases; found: 6033 phrases; correct: 5493.
accuracy:  98.77%; precision:  91.05%; recall:  92.44%; FB1:  91.74
              LOC: precision:  95.24%; recall:  94.83%; FB1:  95.04  1829
             MISC: precision:  84.43%; recall:  86.44%; FB1:  85.42  944
              ORG: precision:  85.96%; recall:  88.59%; FB1:  87.26  1382
              PER: precision:  94.04%; recall:  95.87%; FB1:  94.95  1878



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 108 = 0.2819998860359192
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5507.
accuracy:  98.78%; precision:  91.10%; recall:  92.68%; FB1:  91.88
              LOC: precision:  96.10%; recall:  95.16%; FB1:  95.62  1819
             MISC: precision:  83.88%; recall:  86.33%; FB1:  85.09  949
              ORG: precision:  85.69%; recall:  88.44%; FB1:  87.05  1384
              PER: precision:  93.87%; recall:  96.47%; FB1:  95.15  1893



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 109 = 0.548682689666748
conlleval:
processed 51578 tokens with 5942 phrases; found: 6037 phrases; correct: 5497.
accuracy:  98.75%; precision:  91.06%; recall:  92.51%; FB1:  91.78
              LOC: precision:  94.58%; recall:  95.92%; FB1:  95.24  1863
             MISC: precision:  86.42%; recall:  84.92%; FB1:  85.67  906
              ORG: precision:  86.04%; recall:  88.22%; FB1:  87.11  1375
              PER: precision:  93.45%; recall:  96.04%; FB1:  94.73  1893



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 110 = 0.5168820023536682
conlleval:
processed 51578 tokens with 5942 phrases; found: 6057 phrases; correct: 5507.
accuracy:  98.76%; precision:  90.92%; recall:  92.68%; FB1:  91.79
              LOC: precision:  94.77%; recall:  94.61%; FB1:  94.69  1834
             MISC: precision:  84.66%; recall:  86.77%; FB1:  85.70  945
              ORG: precision:  86.63%; recall:  88.89%; FB1:  87.74  1376
              PER: precision:  93.43%; recall:  96.47%; FB1:  94.93  1902

----------------------------
-START-/START/START Net/O/O loss/O/O 1,967/O/O loss/O/O 841/O/O -END-/END/END
Predicted:	 ['START', 'O', 'O', 'O', 'O', 'O', 'END']
Gold:		 ['START', 'O', 'O', 'O', 'O', 'O', 'END']
----------------------------
-START-/START/START URUS-MARTAN/I-ORG/I-LOC ,/O/O Russia/I-LOC/I-LOC 1996-08-31/O/O -END-/END/END
Predicted:	 ['START', 'I-ORG', 'O', 'I-LOC', 'O', 'END']
Gold:		 ['START', 'I-LOC', 'O', 'I-LOC', 'O', 'END']
----------------------------
-START-/START/START 2./O/O Ludm

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 111 = 0.48707836866378784
conlleval:
processed 51578 tokens with 5942 phrases; found: 6058 phrases; correct: 5496.
accuracy:  98.75%; precision:  90.72%; recall:  92.49%; FB1:  91.60
              LOC: precision:  94.85%; recall:  94.23%; FB1:  94.54  1825
             MISC: precision:  84.90%; recall:  86.01%; FB1:  85.45  934
              ORG: precision:  86.18%; recall:  89.78%; FB1:  87.95  1397
              PER: precision:  92.95%; recall:  95.98%; FB1:  94.44  1902



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 112 = 0.40110304951667786
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5507.
accuracy:  98.80%; precision:  91.10%; recall:  92.68%; FB1:  91.88
              LOC: precision:  95.30%; recall:  94.99%; FB1:  95.15  1831
             MISC: precision:  83.92%; recall:  87.20%; FB1:  85.53  958
              ORG: precision:  87.16%; recall:  88.59%; FB1:  87.87  1363
              PER: precision:  93.50%; recall:  96.09%; FB1:  94.78  1893



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 113 = 0.5626247525215149
conlleval:
processed 51578 tokens with 5942 phrases; found: 6022 phrases; correct: 5511.
accuracy:  98.83%; precision:  91.51%; recall:  92.75%; FB1:  92.13
              LOC: precision:  95.86%; recall:  94.45%; FB1:  95.15  1810
             MISC: precision:  85.12%; recall:  86.88%; FB1:  85.99  941
              ORG: precision:  87.33%; recall:  89.93%; FB1:  88.61  1381
              PER: precision:  93.60%; recall:  96.04%; FB1:  94.80  1890



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 114 = 0.6113554835319519
conlleval:
processed 51578 tokens with 5942 phrases; found: 6022 phrases; correct: 5499.
accuracy:  98.79%; precision:  91.32%; recall:  92.54%; FB1:  91.93
              LOC: precision:  95.42%; recall:  94.18%; FB1:  94.79  1813
             MISC: precision:  85.30%; recall:  86.23%; FB1:  85.76  932
              ORG: precision:  85.93%; recall:  90.60%; FB1:  88.20  1414
              PER: precision:  94.42%; recall:  95.49%; FB1:  94.95  1863



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 115 = 0.48182129859924316
conlleval:
processed 51578 tokens with 5942 phrases; found: 6055 phrases; correct: 5498.
accuracy:  98.76%; precision:  90.80%; recall:  92.53%; FB1:  91.66
              LOC: precision:  95.00%; recall:  95.16%; FB1:  95.08  1840
             MISC: precision:  84.53%; recall:  87.09%; FB1:  85.79  950
              ORG: precision:  86.13%; recall:  87.55%; FB1:  86.83  1363
              PER: precision:  93.22%; recall:  96.25%; FB1:  94.71  1902



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 116 = 0.49452388286590576
conlleval:
processed 51578 tokens with 5942 phrases; found: 6035 phrases; correct: 5489.
accuracy:  98.76%; precision:  90.95%; recall:  92.38%; FB1:  91.66
              LOC: precision:  95.20%; recall:  94.99%; FB1:  95.10  1833
             MISC: precision:  85.31%; recall:  85.68%; FB1:  85.50  926
              ORG: precision:  84.74%; recall:  88.59%; FB1:  86.62  1402
              PER: precision:  94.24%; recall:  95.87%; FB1:  95.05  1874



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 117 = 0.5199880599975586
conlleval:
processed 51578 tokens with 5942 phrases; found: 6061 phrases; correct: 5490.
accuracy:  98.73%; precision:  90.58%; recall:  92.39%; FB1:  91.48
              LOC: precision:  95.14%; recall:  94.77%; FB1:  94.96  1830
             MISC: precision:  82.07%; recall:  85.90%; FB1:  83.94  965
              ORG: precision:  85.80%; recall:  88.74%; FB1:  87.24  1387
              PER: precision:  94.04%; recall:  95.93%; FB1:  94.97  1879



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 118 = 0.4952753484249115
conlleval:
processed 51578 tokens with 5942 phrases; found: 6045 phrases; correct: 5495.
accuracy:  98.78%; precision:  90.90%; recall:  92.48%; FB1:  91.68
              LOC: precision:  95.52%; recall:  95.10%; FB1:  95.31  1829
             MISC: precision:  83.40%; recall:  87.20%; FB1:  85.26  964
              ORG: precision:  85.39%; recall:  88.44%; FB1:  86.89  1389
              PER: precision:  94.36%; recall:  95.44%; FB1:  94.90  1863



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 119 = 0.5383999347686768
conlleval:
processed 51578 tokens with 5942 phrases; found: 6020 phrases; correct: 5488.
accuracy:  98.78%; precision:  91.16%; recall:  92.36%; FB1:  91.76
              LOC: precision:  95.18%; recall:  94.50%; FB1:  94.84  1824
             MISC: precision:  85.71%; recall:  86.55%; FB1:  86.13  931
              ORG: precision:  86.21%; recall:  89.04%; FB1:  87.60  1385
              PER: precision:  93.62%; recall:  95.55%; FB1:  94.57  1880



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 120 = 0.5560972690582275
conlleval:
processed 51578 tokens with 5942 phrases; found: 6043 phrases; correct: 5505.
accuracy:  98.76%; precision:  91.10%; recall:  92.65%; FB1:  91.86
              LOC: precision:  94.74%; recall:  95.16%; FB1:  94.95  1845
             MISC: precision:  84.99%; recall:  86.01%; FB1:  85.50  933
              ORG: precision:  86.46%; recall:  88.59%; FB1:  87.51  1374
              PER: precision:  93.92%; recall:  96.42%; FB1:  95.15  1891

----------------------------
-START-/START/START -DOCSTART-/O/O -END-/END/END
Predicted:	 ['START', 'O', 'END']
Gold:		 ['START', 'O', 'END']
----------------------------
-START-/START/START -DOCSTART-/O/O -END-/END/END
Predicted:	 ['START', 'O', 'END']
Gold:		 ['START', 'O', 'END']
----------------------------
-START-/START/START Lotte/I-ORG/I-ORG 6/O/O Hyundai/I-ORG/I-ORG 2/O/O -END-/END/END
Predicted:	 ['START', 'I-ORG', 'O', 'I-ORG', 'O', 'END']
Gold:		 ['START', 'I-ORG', 'O', 'I-ORG', 'O', 'END']
-

  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 121 = 0.4578723907470703
conlleval:
processed 51578 tokens with 5942 phrases; found: 6032 phrases; correct: 5497.
accuracy:  98.74%; precision:  91.13%; recall:  92.51%; FB1:  91.82
              LOC: precision:  94.71%; recall:  94.61%; FB1:  94.66  1835
             MISC: precision:  84.89%; recall:  86.55%; FB1:  85.71  940
              ORG: precision:  86.15%; recall:  88.59%; FB1:  87.35  1379
              PER: precision:  94.41%; recall:  96.25%; FB1:  95.32  1878



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 122 = 0.5089549422264099
conlleval:
processed 51578 tokens with 5942 phrases; found: 6038 phrases; correct: 5478.
accuracy:  98.73%; precision:  90.73%; recall:  92.19%; FB1:  91.45
              LOC: precision:  95.46%; recall:  93.85%; FB1:  94.65  1806
             MISC: precision:  85.22%; recall:  86.33%; FB1:  85.78  934
              ORG: precision:  84.71%; recall:  89.26%; FB1:  86.93  1413
              PER: precision:  93.42%; recall:  95.60%; FB1:  94.50  1885



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 123 = 0.3632732629776001
conlleval:
processed 51578 tokens with 5942 phrases; found: 6053 phrases; correct: 5484.
accuracy:  98.72%; precision:  90.60%; recall:  92.29%; FB1:  91.44
              LOC: precision:  94.85%; recall:  94.23%; FB1:  94.54  1825
             MISC: precision:  84.26%; recall:  85.90%; FB1:  85.07  940
              ORG: precision:  84.83%; recall:  88.81%; FB1:  86.78  1404
              PER: precision:  93.95%; recall:  96.09%; FB1:  95.01  1884



  0%|          | 0/300 [00:00<?, ?it/s]

loss on epoch 124 = 0.41114678978919983
conlleval:
processed 51578 tokens with 5942 phrases; found: 6042 phrases; correct: 5487.
accuracy:  98.76%; precision:  90.81%; recall:  92.34%; FB1:  91.57
              LOC: precision:  95.08%; recall:  94.67%; FB1:  94.87  1829
             MISC: precision:  84.88%; recall:  86.44%; FB1:  85.65  939
              ORG: precision:  85.22%; recall:  88.14%; FB1:  86.66  1387
              PER: precision:  93.75%; recall:  96.04%; FB1:  94.88  1887



In [22]:
#Evaluation on test set
char_lstm.write_predictions(sentences_test, 'test_pred_cnn_lstm.txt')
!wget https://raw.githubusercontent.com/aritter/twitter_nlp/master/data/annotated/wnut16/conlleval.pl
!paste test test_pred_cnn_lstm.txt | perl conlleval.pl -d "\t"

--2023-03-24 14:15:06--  https://raw.githubusercontent.com/aritter/twitter_nlp/master/data/annotated/wnut16/conlleval.pl
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12754 (12K) [text/plain]
Saving to: ‘conlleval.pl.1’


2023-03-24 14:15:06 (132 MB/s) - ‘conlleval.pl.1’ saved [12754/12754]

processed 46666 tokens with 5648 phrases; found: 5812 phrases; correct: 5020.
accuracy:  97.75%; precision:  86.37%; recall:  88.88%; FB1:  87.61
              LOC: precision:  90.96%; recall:  91.67%; FB1:  91.31  1681
             MISC: precision:  73.53%; recall:  78.35%; FB1:  75.86  748
              ORG: precision:  81.76%; recall:  86.63%; FB1:  84.13  1760
              PER: precision:  92.54%; recall:  92.89%; FB1:  92.72  1623


## Conditional Random Fields (15 points)

Now we are ready to add a CRF layer to the `CharacterLSTMTagger`.  To train the model, implement `conditional_log_likelihood`, using the score (unnormalized log probability) of the gold sequence, in addition to the partition function, $Z(X)$, which is computed using the forward algorithm.  Then, you can simply use Pytorch's automatic differentiation to compute gradients by running backpropagation through the computation graph of the dynamic program (this should be very simple, so long as you are able to correctly implement the forward algorithm using a computation graph that is supported by PyTorch).  This approach to computing gradients for CRFs is discussed in Section 7.5.3 of the [Eisenstein Book](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)

You will also need to implement the Viterbi algorithm for inference during decoding.

After including CRF training and Viterbi decoding, you should be getting about **92 F1 / 88 F1 on the dev and test set**, respectively.

**IMPORTANT:** Note that training will be substantially slower this time - depending on the efficiency of your implementation, it could take about 5 minutes per epoch (e.g. 50 minutes for 10 iterations).  It is recommended to start out training on a single batch of data (and testing on this same batch), so that you can quickly debug, making sure your model can memorize the labels on a single batch, and then optimize your code.  Once you are fairly confident your code is working properly, then you can train using the full dataset.  We have provided a (commented out) line of code to switch between training on a single batch and the full dataset below.

**Hint #1:** While debugging your implementation of the Forward algorithm it is helpful to look at the loss during training.  The loss should never be less than zero (the log-likelihood should always be negative).

**Hint #2:** To sum log-probabilities in a numerically stable way at the end of the Forward algorithm, you will want to use [`torch.logsumexp`](https://pytorch.org/docs/stable/generated/torch.logsumexp.html).

In [57]:
#For F.max_pool1d()
import torch.nn.functional as F

class LSTM_CRFtagger(CharLSTMtagger):
    def __init__(self, DIM_EMB=10, DIM_CHAR_EMB=30, DIM_HID=10, N_TAGS=max(tag2i.values())+1, VOCAB_SIZE=29148, debug=True):
        super(LSTM_CRFtagger, self).__init__(DIM_EMB=DIM_EMB, DIM_HID=DIM_HID, DIM_CHAR_EMB=DIM_CHAR_EMB)

        #TODO: Initialize parameters.
        self.N_TAGS = N_TAGS
        self.start_transitions = nn.Parameter(torch.zeros(N_TAGS))
        self.end_transitions = nn.Parameter(torch.zeros(N_TAGS))

        self.transitionWeights = nn.Parameter(torch.zeros((N_TAGS, N_TAGS), requires_grad=True))
        nn.init.normal_(self.transitionWeights)
        nn.init.normal_(self.start_transitions)
        nn.init.normal_(self.end_transitions)

    #   _compute_score(emissions, tags, mask):
    def gold_score(self, lstm_scores, Y, mask):
        # lstm_scores: (batch_size, seq_len, num_tags)
        # Y: 
        # mask: (batch_size, seq_length)
        #TODO: compute score of gold sequence Y (unnormalized conditional log-probability)
        if self.debug:
          print('gold_score, Y: ' + str(len(Y)))
          print('gold_score, Y: ' + str(Y))
          print('gold_score, lstm_scores: ' + str(lstm_scores.shape))
          print('gold_score, mask: ' + str(mask.shape))

        BATCH_SIZE, SLEN_padded, NUM_TAGS = lstm_scores.shape

        score = self.start_transitions[Y[:BATCH_SIZE,0]]
        score += lstm_scores[:BATCH_SIZE, 0, Y[:BATCH_SIZE,0]]

        for i in range(1,SLEN_padded):
          score += self.transitionWeights[Y:BATCH_SIZE,[i-1],Y[:BATCH_SIZE,i]] * mask[:BATCH_SIZE,i]
          score += lstm_scores[torch.arange(BATCH_SIZE),i,Y[:BATCH_SIZE,i]] * mask[:BATCH_SIZE,i]

        seq_ends = mask.float().sum(dim=1) - 1
        last_tags = Y[:BATCH_SIZE,seq_ends]
        score += self.end_transitions[last_tags]
        if self.debug:
          print('gold_score: ' + str(score))
        return score

    #Forward algorithm for a single sentence
    #Efficiency will eventually be important here.  We recommend you start by 
    #training on a single batch and make sure your code can memorize the 
    #training data.  Then you can go back and re-write the inner loop using 
    #tensor operations to speed things up.
    ### computes normalizer, Z(X)
    # _compute_normalizer(emissions,mask)
    def forward_algorithm(self, lstm_scores, mask):
        # lstm_scores: (batch_size, seq_len, num_tags)
        # mask: (batch_size, seq_length)
        #TODO: implement forward algorithm.
        BATCH_SIZE, SLEN_padded, NUM_TAGS = lstm_scores.shape
        score = self.start_transitions + lstm_scores[:,0]

        for i in range(1,SLEN_padded):
          broadcast_score = score.unsqueeze(2)
          broadcast_emissions = lstm_scores[:,i].unsqueeze(1)
          next_score = broadcast_score + self.transitionWeights + broadcast_emissions
          next_score = torch.logsumexp(next_score,dim=1)
          score = torch.where(mask[i].unsqueeze(1), next_score, score)
        
        score += self.end_transitions

        return torch.logsumexp(score, dim=1)

    def conditional_log_likelihood(self, sentences, tags, train=True):
        #Todo: compute conditional log likelihood of Y (use forward_algorithm and gold_score)
        (X,X_mask,X_char) = self.sentences2input_tensors(sentences)
        lstm_scores = self.forward(X.cuda(),X_char.cuda())
        numerator = self.gold_score(lstm_scores, Y, X_mask)
        denominator = self.forward_algorithm(lstm_scores, X_mask)
        cll = (numerator - denominator).sum() / X_mask.float().sum()
        if self.debug:
          print('conditional log likelihood: ' + str(cll))
        return cll

    def viterbi(self, lstm_scores, sLen):
        #TODO: Implement Viterbi algorithm, soring backpointers to recover the argmax sequence.  Returns the argmax sequence in addition to its unnormalized conditional log-likelihood.

        if self.debug:
          print('')
          #print('viterbiSeq: ' + str(viterbiSeq))
          #print('ll: ' + str(ll))
        return None #viterbiSeq, ll
        #return (torch.as_tensor([random.randint(0,lstm_scores.shape[1]-1) for x in range(sLen)]), 0)

    #Computes Viterbi sequences on a batch of data.
    def viterbi_batch(self, sentences):
        viterbiSeqs = []
        (X, X_mask, X_char) = self.sentences2input_tensors(sentences)
        lstm_scores = self.forward(X.cuda(), X_char.cuda())
        for s in range(len(sentences)):
            (viterbiSeq, ll) = self.viterbi(lstm_scores[s], len(sentences[s]))
            viterbiSeqs.append(viterbiSeq)
        return viterbiSeqs

    def forward(self, X, X_char, train=False):
        return super(LSTM_CRFtagger,self).forward(X,X_char)
        #return torch.randn((X.shape[0], X.shape[1], self.NUM_TAGS))  #Random baseline.

    def print_predictions(self, words, tags):
        Y_pred = self.inference(words)
        for i in range(len(words)):
            print("----------------------------")
            print(" ".join([f"{words[i][j]}/{Y_pred[i][j]}/{tags[i][j]}" for j in range(len(words[i]))]))
            print("Predicted:\t", [Y_pred[i][j] for j in range(len(words[i]))])
            print("Gold:\t\t", tags[i])

    #Need to use Viterbi this time.
    def inference(self, sentences, viterbi=True):
        pred = self.viterbi_batch(sentences)
        return [[i2tag[pred[i][j].item()] for j in range(len(sentences[i]))] for i in range(len(sentences))]

lstm_crf = LSTM_CRFtagger(DIM_EMB=300,debug=True).cuda()

VOCAB_SIZE: 29148
NUM_TAGS: 10
DIM_EMB: 300
DIM_HID: 10
bidirectional?: True

DIM_CHAR_EMB: 30



In [58]:
print(lstm_crf.conditional_log_likelihood(sentences_dev[0:3], tags_dev[0:3]))

word_embeds: torch.Size([3, 13, 300])

char_embeds: torch.Size([3, 13, 32, 30])
permuted_char_embeds: torch.Size([3, 13, 30, 32])
reshaped_char_embeds: torch.Size([39, 30, 32])
cnn_char_embeds: torch.Size([39, 30, 30])
max_pool_char_embeds: torch.Size([39, 30, 1])
reshaped_max_pool_output: torch.Size([3, 13, 30])

lstm_input: torch.Size([3, 13, 330])
lstm_out: torch.Size([3, 13, 20])
tag_space: torch.Size([3, 13, 10])
tag_scores: torch.Size([3, 13, 10])

gold_score, Y: 14987
gold_score, Y: [[3, 0, 4], [3, 7, 0, 1, 0, 0, 0, 1, 0, 0, 4], [3, 9, 9, 4], [3, 8, 0, 4], [3, 0, 7, 7, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4], [3, 8, 0, 0, 0, 0, 7, 7, 0, 0, 0, 9, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 4], [3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 9, 9, 9, 9, 0, 0, 0, 0, 0, 4], [3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 7, 0, 4], [3, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 9, 9, 0

TypeError: ignored

In [None]:
#CharLSTM-CRF Training. Feel free to change number of epochs, optimizer, learning rate and batch size.
import tqdm
import os
import subprocess
import random

nEpochs = 10

#Get CoNLL evaluation script
os.system('wget https://raw.githubusercontent.com/aritter/twitter_nlp/master/data/annotated/wnut16/conlleval.pl')

def train_crf_lstm(sentences, tags, lstm):
    #optimizer = optim.Adadelta(lstm.parameters(), lr=1.0)
    #TODO: initialize optimizer

    batchSize = 50

    for epoch in range(nEpochs):
        totalLoss = 0.0
        lstm.train()

        #Shuffle the sentences
        (sentences_shuffled, tags_shuffled) = shuffle_sentences(sentences, tags)
        for batch in tqdm.notebook.tqdm(range(0, len(sentences), batchSize), leave=False):
            lstm.zero_grad()
            #TODO: take gradient step on a batch of data.

        print(f"loss on epoch {epoch} = {totalLoss}")
        lstm.write_predictions(sentences_dev, 'dev_pred')   #Performance on dev set
        print('conlleval:')
        print(subprocess.Popen('paste dev dev_pred | perl conlleval.pl -d "\t"', shell=True, stdout=subprocess.PIPE,stderr=subprocess.STDOUT).communicate()[0].decode('UTF-8'))

        if epoch % 10 == 0:
            lstm.eval()
            s = random.sample(range(50), 5)
            lstm.print_predictions([sentences_train[i] for i in s], [tags_train[i] for i in s])   #Print predictions on train data (useful for debugging)

crf_lstm = LSTM_CRFtagger(DIM_HID=500, DIM_EMB=300, DIM_CHAR_EMB=30).cuda()
train_crf_lstm(sentences_train, tags_train, crf_lstm)             #Train on the full dataset
#train_crf_lstm(sentences_train[0:50], tags_train[0:50])          #Train only the first batch (use this during development/debugging)

In [None]:
crf_lstm.eval()
crf_lstm.write_predictions(sentences_test, 'test_pred_cnn_lstm_crf.txt')
!wget https://raw.githubusercontent.com/aritter/twitter_nlp/master/data/annotated/wnut16/conlleval.pl
!paste test test_pred_cnn_lstm_crf.txt | perl conlleval.pl -d "\t"

## Gradescope

Gradescope allows you to add multiple files to your submission. Please submit this notebook along with the test set prediction:
* test_pred_lstm.txt
* test_pred_cnn_lstm.txt
* test_pred_cnn_lstm_crf.txt
* NER_release.ipynb

To download this notebook, go to `File > Download.ipynb`. You can download the predictions from Colab by clicking the folder icon on the left and finding them under Files. 

Please make sure that you name the files as specified above. You will be able to see the test set accuracy for your predictions. However, the final score will be assigned later based on accuracy and implementation. 

When submitting the .ipynb notebook, please make sure that all the cells run when executed in order starting from a fresh session. If the code doesn't take too long to run, you can re-run everything with `Runtime -> Restart and run all`

You can submit multiple times before the deadline and choose the submission which you want to be graded by going to `Submission History` on gradescope.
