# POS with HMM

In this notebook we look at the task of part-of-speech (POS) tagging. We will see how to prepare the data, implement a tagging algorithm, and evaluate it. 

### Preparing Data

Let us start by downloading a dataset and see what the data looks like. 

We make use of a sample from the *CONLL-2003++* dataset, available at [https://github.com/ZihanWangKi/CrossWeigh](https://github.com/ZihanWangKi/CrossWeigh). The data is packaged in `data.zip`, which we will open and process now. 

**LOCAL NOTEBOOK:**
If you are opening this in a Jupyter notebook, just make sure the zip file is in the same directory as the notebook. Skip the following cells until the next text block.

**GOOGLE COLAB:**
If you are working in Google Colab, you need to grant access to your Google Drive, where you can put the zip file with the data. Run the following cells and follow the instructions.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
# MAKE SURE TO CHANGE THE PATH IN YOUR GOOGLE DRIVE TO THE DATAFILE. 
# !cp drive/MyDrive/[path to data in your personal google drive]/data.zip .
!cp drive/MyDrive/KULEUVEN/teaching/NLP/'NLP 2021-2022'/Exercises/General_Track/exercise1_POS/data.zip .
# !cp 'drive/MyDrive/1 PhD/Teaching/NLP/NLP 2021-2022/Exercises/General_Track/exercise1_POS/data.zip' .

Let us unzip the data.zip file and have a look at what some of the data looks like.

In [3]:
!unzip data.zip

Archive:  data.zip
   creating: data/
  inflating: data/conllpp_dev.txt    
  inflating: data/conllpp_test.txt   
  inflating: data/conllpp_train.txt  


In [4]:
ls data/

conllpp_dev.txt  conllpp_test.txt  conllpp_train.txt


Okay, so we have a training, a development, and a test split. Let's look inside one of the files.

---



In [5]:
with open('data/conllpp_dev.txt','r') as f:
  for i,l in enumerate(f):
    if i > 20:
      break
    print("\t".join(l.strip().split()))

-DOCSTART-	-X-	-X-	O

CRICKET	NNP	B-NP	O
-	:	O	O
LEICESTERSHIRE	NNP	B-NP	B-ORG
TAKE	NNP	I-NP	O
OVER	IN	B-PP	O
AT	NNP	B-NP	O
TOP	NNP	I-NP	O
AFTER	NNP	I-NP	O
INNINGS	NNP	I-NP	O
VICTORY	NN	I-NP	O
.	.	O	O

LONDON	NNP	B-NP	B-LOC
1996-08-30	CD	I-NP	O

West	NNP	B-NP	B-MISC
Indian	NNP	I-NP	I-MISC
all-rounder	NN	I-NP	O
Phil	NNP	I-NP	B-PER


Each file contains sentences, with one word and its annotations per line. Sentences are separated by a blank line. The first item on each line is the word, the second a part-of-speech (POS) tag, the third a syntactic chunk tag, and the fourth a named entity tag. 

### Preprocessing

Let's start by processing the data into a more useful format. We will put everything into lists, so that each item of a list holds either the words of one sentence or the labels of that sentence. 

In [6]:
def process_split(path_to_raw_txt):
  # these lists will contain the full splits data
  words = []
  pos = []
  chunks = []
  ners = []
  with open(path_to_raw_txt,'r') as f:
    # we need buffers for collecting the current sentence
    buffer_words = []
    buffer_pos = [] 
    buffer_chunks = []
    buffer_ners = []
    # we will loop over all the lines in the file
    for line in f:
      line = line.strip()
      # -DOCSTART- lines mark the start of a document; skip them
      if line.startswith('-DOCSTART-'):
        continue
      # if we reach a blank line, we add the complete buffer to the full splits lists
      if len(line) == 0:
        # make sure we don't add empty buffer to the data lists
        if len(buffer_words) != 0:
          words.append(buffer_words)
          pos.append(buffer_pos)
          chunks.append(buffer_chunks)
          ners.append(buffer_ners)
        # now we need to reset the buffers for the next sentence
        buffer_words = []
        buffer_pos = [] 
        buffer_chunks = []
        buffer_ners = []
      else:
        # split the current line into the 4 elements and add them to the current sentence
        elements = line.split()
        buffer_words.append(elements[0].lower())
        buffer_pos.append(elements[1])
        buffer_chunks.append(elements[2])
        buffer_ners.append(elements[3])
    # if we finished the loop and we still have a sentence in our buffer, we add it to the full list
    if len(buffer_words) != 0:
      words.append(buffer_words)
      pos.append(buffer_pos)
      chunks.append(buffer_chunks)
      ners.append(buffer_ners)
  return words, pos, chunks, ners

train_words, train_pos, train_chunks, train_ners = process_split('data/conllpp_train.txt')
dev_words, dev_pos, dev_chunks, dev_ners = process_split('data/conllpp_dev.txt')
test_words, test_pos, test_chunks, test_ners = process_split('data/conllpp_test.txt')

In [7]:
print("WORDS: ", train_words[:3])
print("POS TAGS: ", train_pos[:3])
print("CHUNKS: ", train_chunks[:3])
print("NER TAGS: ", train_ners[:3])

WORDS:  [['eu', 'rejects', 'german', 'call', 'to', 'boycott', 'british', 'lamb', '.'], ['peter', 'blackburn'], ['brussels', '1996-08-22']]
POS TAGS:  [['NNP', 'VBZ', 'JJ', 'NN', 'TO', 'VB', 'JJ', 'NN', '.'], ['NNP', 'NNP'], ['NNP', 'CD']]
CHUNKS:  [['B-NP', 'B-VP', 'B-NP', 'I-NP', 'B-VP', 'I-VP', 'B-NP', 'I-NP', 'O'], ['B-NP', 'I-NP'], ['B-NP', 'I-NP']]
NER TAGS:  [['B-ORG', 'O', 'B-MISC', 'O', 'O', 'O', 'B-MISC', 'O', 'O'], ['B-PER', 'I-PER'], ['B-LOC', 'O']]


The data is now in an easily usable format, so we can start implementing a POS model. 

## POS Tagging with a HMM

For our HMM, we need to estimate several probabilities for our Markov chain. We will collect the set of possible labels and the set of possible tokens from the training data. 

We add a start-of-sentence ('\<SOS\>') token at the beginning of every sentence.

In [38]:
# import the needed modules
from collections import Counter
import numpy as np
import pandas as pd
from sklearn.metrics import f1_score
import time

In [9]:
count_labels = Counter()
count_words = Counter()
count_transitions = Counter()

# count the words
for sent in train_words:
  count_words['<SOS>'] += 1
  for word in sent:
    # the words were already lowercased in process_split, which reduces the number of unique tokens
    count_words[word] += 1

# count the labels
for labels in train_pos:
  count_labels['<SOS>'] += 1
  for label in labels:
    count_labels[label] += 1

# count the possible transitions for the labels
for labels in train_pos:
  prev_label = '<SOS>'
  for label in labels:
    count_transitions[(prev_label, label)] += 1
    prev_label = label


print('='*10)
print('Unique words: {}'.format(len(count_words)))
print(f'Number of labels: {len(count_labels)}, labels: {count_labels}')
print(f'Number of transitions: {len(count_transitions)}, bigrams: {count_transitions}')

Unique words: 21010
Number of labels: 46, labels: Counter({'NNP': 34392, 'NN': 23899, 'CD': 19704, 'IN': 19064, '<SOS>': 14041, 'DT': 13453, 'JJ': 11831, 'NNS': 9903, 'VBD': 8293, '.': 7389, ',': 7291, 'VB': 4252, 'VBN': 4105, 'RB': 3975, 'CC': 3653, 'TO': 3469, 'PRP': 3163, '(': 2866, ')': 2866, 'VBG': 2585, 'VBZ': 2426, ':': 2386, '"': 2178, 'POS': 1553, 'PRP$': 1520, 'VBP': 1436, 'MD': 1199, 'NNPS': 684, 'WP': 528, 'RP': 528, 'WDT': 506, 'SYM': 439, '$': 427, 'WRB': 384, 'JJR': 382, 'JJS': 254, 'FW': 166, 'RBR': 163, 'EX': 136, 'RBS': 35, "''": 35, 'PDT': 33, 'UH': 30, 'WP$': 23, 'LS': 13, 'NN|SYM': 4})
Number of transitions: 1143, bigrams: Counter({('NNP', 'NNP'): 11130, ('CD', 'CD'): 7259, ('NN', 'IN'): 6426, ('IN', 'DT'): 5999, ('DT', 'NN'): 5855, ('JJ', 'NN'): 5428, ('<SOS>', 'NNP'): 5421, ('IN', 'NNP'): 4518, ('NNP', 'CD'): 4002, ('DT', 'JJ'): 3634, ('NN', 'NN'): 2882, ('NNS', 'IN'): 2708, ('NNP', ','): 2343, ('NN', '.'): 2241, ('TO', 'VB'): 1982, ('NNP', 'VBD'): 1949, ('(', 'N

We have counted almost everything; only the conditional pairs of words and labels are still missing. 


To avoid zero probabilities for words that were not seen during training, we create an unknown ('\<UNK\>') token. We only use the most common words as features; the counts of the remaining words are moved to the unknown token. Given that we have about 21,000 unique words, we keep the 20,000 most common ones.

We need to take this into account when counting the word-label pairs.

In [10]:
count_observation = Counter()

# count the conditional pairs between the words and labels
vocab_words = dict(count_words.most_common(20000))
for sent, labels in zip(train_words, train_pos):
  # add a SOS pair at the beginning of every sentence
  count_observation[("<SOS>", "<SOS>")] += 1
  for word, label in zip(sent, labels):
    if word in vocab_words.keys(): 
      count_observation[(word, label)] += 1
    else:
      count_observation[("<UNK>", label)] += 1
print(f'Number of unique word-label pairs: {len(count_observation)}, example pairs: {[f"word={p[0]} tag={p[1]}: {c}" for p, c in sorted(count_observation.items(), key=lambda item: item[1],reverse=True)][:10]}')

Number of unique word-label pairs: 24323, example pairs: ['word=<SOS> tag=<SOS>: 14041', 'word=the tag=DT: 8383', 'word=. tag=.: 7374', 'word=, tag=,: 7290', 'word=of tag=IN: 3815', 'word=in tag=IN: 3601', 'word=to tag=TO: 3424', 'word=a tag=DT: 3184', 'word=and tag=CC: 2872', 'word=( tag=(: 2861']


Now that we have counted everything, we can create the probability matrices. We must make sure that we always keep the same order for all our words and labels, so we create a word vocabulary and a label vocabulary that map each item to a fixed index. 

In [11]:
word2id_vocab = {k:i for i, (k, c) in enumerate(count_words.most_common(20000))}
word2id_vocab['<UNK>'] = 20000
label2id_vocab = {k:i for i, (k, c) in enumerate(count_labels.most_common())}
id2label_vocab = {i:k for k, i in label2id_vocab.items()}
id2word_vocab = {v: k for k, v in word2id_vocab.items()}
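
A dictionary lookup with a fallback keeps the out-of-vocabulary handling in one place. The following is a minimal sketch on a made-up toy vocabulary; `toy_word2id` and the helper `encode_sentence` are hypothetical names used for illustration and are not part of the notebook.

```python
# Toy vocabulary for illustration only; the notebook builds word2id_vocab from the training data.
toy_word2id = {"<SOS>": 0, "the": 1, "lamb": 2, "<UNK>": 3}

def encode_sentence(sentence, word2id, unk_token="<UNK>"):
    """Map each word to its id, falling back to the id of the unknown token."""
    unk_id = word2id[unk_token]
    return [word2id.get(word, unk_id) for word in sentence]

# "british" is not in the toy vocabulary, so it is mapped to the <UNK> id.
print(encode_sentence(["the", "british", "lamb"], toy_word2id))  # [1, 3, 2]
```

The same pattern would work with `word2id_vocab`, since it also contains an `'<UNK>'` entry.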

In [12]:
# # first we count the total
# total_words = np.sum(list(count_words.values()))

# # now we create the probabilities
# prob_words_vector = np.zeros(len(word2id_vocab))
# used_count = 0  # we keep track of the used counts, so we can use the remaining count for the unknown token. 
# for word, id in word2id_vocab.items():
#   if word != '<UNK>':
#     prob_words_vector[id] = count_words[word]/total_words
#     used_count += count_words[word]
# # now we add the probability for the unknown token
# prob_words_vector[word2id_vocab['<UNK>']] = (total_words-used_count)/total_words
# print(np.sum(prob_words_vector))

Next, let us compute the transition probabilities. Since some transitions may never occur in the training data, we apply Laplace (add-one) smoothing to the transition probabilities.  


In [13]:
# # first we count the total
# total = np.sum(list(count_labels.values()))

# # now we create the probabilities
# prob_labels_vector = np.zeros(len(label2id_vocab))
# for label, id in label2id_vocab.items():
#   prob_labels_vector[id] = count_labels[label]/total

## NOW THE TRANSITIONS
conditional_totals = Counter()
for bigram, count in count_transitions.items():
  conditional_totals[bigram[0]] += count
smoothing_constant = len(label2id_vocab)

# now we create the probability matrix
prob_transition_matrix = np.zeros([len(label2id_vocab), len(label2id_vocab)])
for label_i, id_i in label2id_vocab.items():
  denominator = conditional_totals[label_i] + smoothing_constant
  for label_j, id_j in label2id_vocab.items():
    if (label_i, label_j) in count_transitions:
      numerator = count_transitions[(label_i, label_j)] + 1
    else:
      numerator = 1
    prob_transition_matrix[id_i, id_j] = numerator/denominator
# print(np.sum(prob_labels_vector))
print(np.sum(prob_transition_matrix, 1))

[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
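
The effect of the smoothing used above is easier to see on a toy example. The sketch below uses two made-up tag sequences and a hypothetical helper `smoothed_transition`; it applies the same add-one formula, (count + 1) / (row total + number of labels), on a much smaller label set.

```python
from collections import Counter

# Toy tag sequences and label set, made up for illustration.
toy_tag_sequences = [["DT", "NN"], ["DT", "JJ", "NN"]]
toy_labels = ["<SOS>", "DT", "JJ", "NN"]

toy_transitions = Counter()  # counts of (previous label, label) pairs
toy_totals = Counter()       # how often each label occurs as the previous label
for tags in toy_tag_sequences:
    prev = "<SOS>"
    for tag in tags:
        toy_transitions[(prev, tag)] += 1
        toy_totals[prev] += 1
        prev = tag

def smoothed_transition(prev, nxt):
    """Add-one estimate: (count + 1) / (row total + number of labels)."""
    return (toy_transitions[(prev, nxt)] + 1) / (toy_totals[prev] + len(toy_labels))

# Seen transition: DT -> NN occurs once, DT occurs twice as previous label.
print(smoothed_transition("DT", "NN"))  # (1 + 1) / (2 + 4)
# Unseen transition: NN -> DT never occurs, yet it gets a non-zero probability.
print(smoothed_transition("NN", "DT"))  # (0 + 1) / (0 + 4)
# The smoothed probabilities in a row still sum to 1 (up to floating-point rounding).
print(sum(smoothed_transition("DT", nxt) for nxt in toy_labels))
```

A label that never occurs as a previous label (here `NN`) ends up with a uniform row, because every numerator is 1.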


In [14]:
# convert the matrix to a df for better readability
tags_df = pd.DataFrame(prob_transition_matrix, 
                       columns = label2id_vocab.keys(), index=label2id_vocab.keys())
tags_df.iloc[:10, :10]

| | NNP | NN | CD | IN | `<SOS>` | DT | JJ | NNS | VBD | . |
|---|---|---|---|---|---|---|---|---|---|---|
| NNP | 0.330218 | 0.047199 | 0.118755 | 0.052183 | 3e-05 | 0.003323 | 0.01157 | 0.017058 | 0.05785 | 0.050314 |
| NN | 0.040002 | 0.121779 | 0.028217 | 0.271479 | 4.2e-05 | 0.007434 | 0.011743 | 0.052167 | 0.064459 | 0.094703 |
| CD | 0.088875 | 0.065162 | 0.461538 | 0.038462 | 6.4e-05 | 0.002352 | 0.027718 | 0.109282 | 0.003814 | 0.03363 |
| IN | 0.237118 | 0.093871 | 0.072673 | 0.017158 | 5.2e-05 | 0.314828 | 0.077238 | 0.040298 | 0.000682 | 0.00084 |
| `<SOS>` | 0.384894 | 0.078015 | 0.076099 | 0.038049 | 7.1e-05 | 0.092355 | 0.055086 | 0.05821 | 0.003549 | 0.000426 |
| DT | 0.122002 | 0.434841 | 0.030445 | 0.007129 | 7.4e-05 | 0.001485 | 0.269919 | 0.062375 | 0.003861 | 0.001411 |
| JJ | 0.065408 | 0.461767 | 0.028664 | 0.050608 | 8.5e-05 | 0.004933 | 0.091265 | 0.15999 | 0.005699 | 0.016926 |
| NNS | 0.013941 | 0.021479 | 0.018381 | 0.27974 | 0.000103 | 0.006506 | 0.013527 | 0.008777 | 0.103779 | 0.111524 |
| VBD | 0.066803 | 0.031599 | 0.046137 | 0.145741 | 0.00012 | 0.162441 | 0.047699 | 0.02415 | 0.002163 | 0.08038 |
| . | 0.002188 | 0.002188 | 0.002188 | 0.002188 | 0.002188 | 0.002188 | 0.002188 | 0.002188 | 0.002188 | 0.002188 |


Finally, we compute the observation probabilities of a word given a label. Even though we already have an unknown token for rare words, we still apply Laplace smoothing here. 

In [15]:
conditional_totals = Counter()
for (word,label), count in count_observation.items():
  conditional_totals[label] += count
# we define a smoothing constant for the laplace smoothing. This is equal to the number of possible observations
smoothing_constant = len(word2id_vocab)

# now we create the probability matrix
prob_observation_matrix = np.zeros([len(label2id_vocab), len(word2id_vocab)])
for label, label_id in label2id_vocab.items():
  denominator = conditional_totals[label] + smoothing_constant
  for word, word_id in word2id_vocab.items():
    if (word, label) in count_observation:
      numerator = count_observation[(word, label)] + 1
    else:
      numerator = 1
    prob_observation_matrix[label_id, word_id] = numerator/denominator
print(np.sum(prob_observation_matrix, 1))
print('Most likely word per label')
print([f'LABEL: {label}, WORD: {id2word_vocab[max_id]}, LIKELIHOOD:{prob:.2f}' for label, max_id,prob in zip(id2label_vocab.values(),np.argmax(prob_observation_matrix, 1),np.max(prob_observation_matrix, 1))])

[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Most likely word per label
['LABEL: NNP, WORD: u.s., LIKELIHOOD:0.01', 'LABEL: NN, WORD: year, LIKELIHOOD:0.01', 'LABEL: CD, WORD: 1, LIKELIHOOD:0.04', 'LABEL: IN, WORD: of, LIKELIHOOD:0.10', 'LABEL: <SOS>, WORD: <SOS>, LIKELIHOOD:0.41', 'LABEL: DT, WORD: the, LIKELIHOOD:0.25', 'LABEL: JJ, WORD: first, LIKELIHOOD:0.01', 'LABEL: NNS, WORD: results, LIKELIHOOD:0.01', 'LABEL: VBD, WORD: said, LIKELIHOOD:0.06', 'LABEL: ., WORD: ., LIKELIHOOD:0.27', 'LABEL: ,, WORD: ,, LIKELIHOOD:0.27', 'LABEL: VB, WORD: be, LIKELIHOOD:0.02', 'LABEL: VBN, WORD: been, LIKELIHOOD:0.02', 'LABEL: RB, WORD: not, LIKELIHOOD:0.02', 'LABEL: CC, WORD: and, LIKELIHOOD:0.12', 'LABEL: TO, WORD: to, LIKELIHOOD:0.15', 'LABEL: PRP, WORD: he, LIKELIHOOD:0.03', 'LABEL: (, WORD: (, LIKELIHOOD:0.13', 'LABEL: ), WORD: ), LIKELIHOOD:0.13', 'LABEL: VBG, WORD: being, LIKELIHOOD:0.00', 'LABEL

$P(W_t=v_i|L_t=l_j)$ = ``prob_observation_matrix[j][i]``

$P(L_t=l_i|L_{t-1}=l_j)$ = ``prob_transition_matrix[j][i]``

In both matrices the row index is the label being conditioned on, so each row sums to 1.

Here $W_x$ and $L_x$ are random variables corresponding to the word and the label, respectively, at position $x$ in a sentence.
$v_i$ is the $i$th word in the word vocabulary, and $l_j$ the $j$th label in the label vocabulary.
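
Under the usual first-order HMM assumptions (each label depends only on the previous label, and each word only on its own label), the joint probability of a sentence $w_1, \dots, w_T$ and a label sequence $l_1, \dots, l_T$ is a product of these two kinds of terms. As a sketch in the notation above, with $L_0$ fixed to the start-of-sentence label:

$$P(W_{1:T}=w_{1:T},\, L_{1:T}=l_{1:T}) = \prod_{t=1}^{T} P(L_t=l_t \mid L_{t-1}=l_{t-1})\; P(W_t=w_t \mid L_t=l_t)$$

The decoding code in this notebook additionally multiplies by the observation probability of the start token given the start label. That factor is the same for every candidate label sequence, so it does not change which sequence scores highest.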
<!-- $P(lvar_{0:T}=lval_{0:T}|wvar_{0:T}=wval_{0:T}) = \prod_{t=0}^{T-1}{P(lvar_t=lval_t)|lvar_{0:t}=lval_{0:t}|wvar_{0:T}=wval_{0:T}}$
$P(l_t=s|w_t=v,l_{t-1}=s') = \frac{P()}{P()}$ -->

Now that the needed matrices are in place (each row sums to a probability of 1), we can use them to predict a sequence of labels for new sentences in our dev set.

In [50]:
def greedy_sentence_annotation(sentence):
  # we initialize our sequence with the SOS token
  sos_lab_id = label2id_vocab["<SOS>"]
  sos_word_id = word2id_vocab["<SOS>"]
  sentence_probability = prob_observation_matrix[sos_lab_id, sos_word_id]
  prev_id = sos_lab_id
  prediction_labels = []
  for word in sentence:
    if word in word2id_vocab:
      word_id = word2id_vocab[word]
    else:
      word_id = word2id_vocab["<UNK>"]
    probs_possible_label = sentence_probability.copy()
    probs_possible_label *= prob_observation_matrix[:, word_id]
    probs_possible_label *= prob_transition_matrix[prev_id, :]
    # This is where we are 'greedy': we just continue working with the best option at each timestep
    prev_id = np.argmax(probs_possible_label) 
    prediction_labels.append(id2label_vocab[prev_id])
    sentence_probability = np.max(probs_possible_label)
  return sentence_probability, prediction_labels

def test_one_sentence(index):
  prob, pred_labels = greedy_sentence_annotation(dev_words[index])
  col_widths_formatting = '{:<20s} {:<15s} {:<15s}'
  print( col_widths_formatting.format("Word","True POS tag","Pred. POS tag") )
  for w,t,p in zip(dev_words[index],dev_pos[index],pred_labels):
    print( col_widths_formatting.format(w,t,p) )
  # print('\t'.join(dev_words[index]))
  # print('\t\t'.join(pred_labels))
  # print('\t\t'.join(dev_pos[index]))
  print("pred probability",prob)

test_one_sentence(830)

Word                 True POS tag    Pred. POS tag  
"                    "               "              
lebed                NN              NNP            
is                   VBZ             VBZ            
now                  RB              RB             
in                   IN              IN             
chechnya             NNP             NNP            
solving              VBG             NNP            
some                 DT              DT             
problems             NNS             NNS            
,                    ,               ,              
"                    "               "              
interfax             NN              NNP            
quoted               VBN             VBD            
chernomyrdin         NNP             NNP            
as                   IN              IN             
saying               VBG             VBG            
.                    .               .              
"                    "               "        

In [51]:
np.set_printoptions(linewidth=np.inf, precision=2) # To not have matrices wrap around when printing
def viterbi(sentence):
  T = len(sentence)+1
  N = len(label2id_vocab)
  viterbi_matrix = np.zeros((N,T))
  backpointer_matrix = np.zeros((N,T))
  sos_lab_id = label2id_vocab["<SOS>"]
  sos_word_id = word2id_vocab["<SOS>"]
  start_prob = prob_observation_matrix[sos_lab_id, sos_word_id]
  # Probability of every label at the start is 0, except for the <SOS> tag. 
  for s in range(N):
    viterbi_matrix[s,0] = 0
    backpointer_matrix[s,0] = sos_lab_id
  viterbi_matrix[sos_lab_id, 0] = start_prob

  for t_step in range(1,T):
    if sentence[t_step-1] in word2id_vocab:
      word_id = word2id_vocab[sentence[t_step-1]]
    else:
      word_id = word2id_vocab["<UNK>"]
    for s in range(N):
      state_prob = viterbi_matrix[:, t_step-1].copy()           # probability of sequence till now
      state_prob *= prob_transition_matrix[:, s]         # probability for transition from every previous label to label s
      state_prob *= prob_observation_matrix[s, word_id]  # probability observation
      viterbi_matrix[s, t_step] = np.max(state_prob)
      backpointer_matrix[s, t_step] = np.argmax(state_prob)

  # now we simply select the best path
  best_path_prob = np.max(viterbi_matrix[:, -1])
  best_path_pointer = int(np.argmax(viterbi_matrix[:, -1]))
  # given the path, we track back to create the entire label sequence
  best_path = [best_path_pointer]
  for t_step in range(T-1, 1, -1):
    best_path.append(int(backpointer_matrix[best_path[-1], t_step]))
  best_path = [id2label_vocab[i] for i in best_path[::-1]]
  return best_path_prob, best_path


def test_one_sentence_viterbi(index):
  prob, pred_labels = viterbi(dev_words[index])
  col_widths_formatting = '{:<20s} {:<15s} {:<15s}'
  print( col_widths_formatting.format("Word","True POS tag","Pred. POS tag") )
  for w,t,p in zip(dev_words[index],dev_pos[index],pred_labels):
    print( col_widths_formatting.format(w,t,p) )
  print("pred probability",prob)
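
The `viterbi` function above multiplies raw probabilities and loops over every label in Python. For long sentences the product of many small numbers can become too small for floating point, and the inner loop is slow. One common alternative is to work with log probabilities and let NumPy handle all labels at once. The following is a minimal sketch on a made-up toy model, not a drop-in replacement: `viterbi_log` and the toy matrices are hypothetical, the matrices are assumed to use the same layout as the notebook (`transition[previous, next]`, `observation[label, word]`), and the constant start-token factor is left out.

```python
import numpy as np

def viterbi_log(word_ids, log_transition, log_observation, start_label):
    """Most likely label-id sequence for word_ids, computed with log probabilities.

    log_transition[i, j]  = log P(next label j | previous label i)
    log_observation[j, w] = log P(word w | label j)
    """
    n_labels = log_transition.shape[0]
    n_steps = len(word_ids)
    if n_steps == 0:
        return 0.0, []
    scores = np.empty((n_steps, n_labels))
    backpointers = np.zeros((n_steps, n_labels), dtype=int)

    # First word: transition out of the fixed start label.
    scores[0] = log_transition[start_label] + log_observation[:, word_ids[0]]
    for t in range(1, n_steps):
        # candidates[i, j] = score of the best path ending in label i, extended to label j.
        candidates = scores[t - 1][:, None] + log_transition
        backpointers[t] = np.argmax(candidates, axis=0)
        scores[t] = np.max(candidates, axis=0) + log_observation[:, word_ids[t]]

    # Follow the backpointers from the best final label to the first word.
    path = [int(np.argmax(scores[-1]))]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(backpointers[t, path[-1]]))
    return float(np.max(scores[-1])), path[::-1]

# Toy model with made-up numbers: labels 0=<SOS>, 1=A, 2=B; words 0=x, 1=y.
toy_transition = np.array([[0.1, 0.5, 0.4],
                           [0.1, 0.1, 0.8],
                           [0.1, 0.6, 0.3]])
toy_observation = np.array([[0.5, 0.5],
                            [0.9, 0.1],
                            [0.2, 0.8]])
log_prob, labels = viterbi_log([0, 1, 0],
                               np.log(toy_transition),
                               np.log(toy_observation),
                               start_label=0)
print(labels, np.exp(log_prob))
```

For the toy model this returns the label ids `[1, 2, 1]`. Because both notebook matrices are smoothed, all their entries are positive, so taking `np.log` of them would not produce infinities.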

In [52]:
test_one_sentence_viterbi(830)

Word                 True POS tag    Pred. POS tag  
"                    "               "              
lebed                NN              NNP            
is                   VBZ             VBZ            
now                  RB              RB             
in                   IN              IN             
chechnya             NNP             NNP            
solving              VBG             VBD            
some                 DT              DT             
problems             NNS             NNS            
,                    ,               ,              
"                    "               "              
interfax             NN              PRP            
quoted               VBN             VBD            
chernomyrdin         NNP             VBN            
as                   IN              IN             
saying               VBG             VBG            
.                    .               .              
"                    "               "        

## Evaluation
For the evaluation we use the macro-averaged F1 score, computed with the pre-implemented `f1_score` function from sklearn. We also record the total runtime of both decoders. 
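
The evaluation cell calls `f1_score` with `average='macro'`, which takes the unweighted mean of the per-label F1 scores, so rare tags weigh as much as frequent ones. The sketch below recomputes that quantity by hand on made-up toy tags to show how it can differ from plain accuracy; `macro_f1` is a hypothetical helper written for illustration, not sklearn's implementation.

```python
def macro_f1(true_labels, pred_labels):
    """Unweighted mean of the per-label F1 scores."""
    labels = sorted(set(true_labels) | set(pred_labels))
    pairs = list(zip(true_labels, pred_labels))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # The denominator is positive because the label occurs in at least one of the lists.
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(f1_scores) / len(f1_scores)

# Toy tags, made up for illustration: the rare tag UH is never predicted.
toy_true = ["NN", "NN", "NN", "NN", "UH"]
toy_pred = ["NN", "NN", "NN", "NN", "NN"]
accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
print(accuracy, macro_f1(toy_true, toy_pred))
```

Here the accuracy is 0.8, while the macro F1 is much lower (4/9), because the missed `UH` tag contributes an F1 of 0 to the average.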

In [53]:
all_true_labels = []
all_greedy_pred = []
all_viterbi_pred = []
greedy_times = []
viterbi_times = []

for sent, true_labels in zip(dev_words, dev_pos):
  start = time.time()
  _, greedy_pred = greedy_sentence_annotation(sent)
  stop = time.time()
  greedy_times.append(stop-start)
  start=time.time()
  _, viterbi_pred = viterbi(sent)
  stop=time.time()
  viterbi_times.append(stop-start)
  all_true_labels += true_labels
  all_greedy_pred += greedy_pred
  all_viterbi_pred += viterbi_pred

f1_greedy = f1_score(all_true_labels, all_greedy_pred, average='macro')
f1_viterbi = f1_score(all_true_labels, all_viterbi_pred, average='macro')

print("F1 Score Greedy HMM:{}. total runtime: {}".format(np.round(f1_greedy,4),np.round(np.sum(greedy_times),4)))
print("F1 Score viterbi HMM:{}. total runtime: {}".format(np.round(f1_viterbi,4),np.round(np.sum(viterbi_times),4)))

F1 Score Greedy HMM:0.6256. total runtime: 0.8335
F1 Score viterbi HMM:0.6687. total runtime: 31.7705
