# Laphet (Version 0.7) with myPOS Tags (fasttext embedding, no freeze)

In this notebook, we use the fasttext embedding built from the tags of the myPOS data and run training, text generation, and testing/evaluation with the --embedding_method fasttext_no_freeze option.

For each of the MLP, Bi-LSTM, Transformer, BERT, and GPT models, we comment the corresponding task line in the shell script on or off and then run it.
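
The option name fasttext_no_freeze suggests that the embedding table is initialized from the fasttext vectors and then left trainable, rather than frozen. The sketch below is a toy illustration of that difference in plain Python, not Laphet's code: the function `sgd_step`, the 2-dimensional vectors, and the gradient are all made up for the example. In PyTorch, the same choice is what the `freeze` argument of `nn.Embedding.from_pretrained` controls.

```python
# Toy illustration only (not Laphet's implementation): a plain-Python
# "embedding table" and one SGD step, to show what freezing changes.

def sgd_step(table, token_id, grad, lr, freeze):
    """Return a copy of the table after one gradient step on a single row.

    With freeze=True the pretrained vectors are left untouched;
    with freeze=False the looked-up row moves against the gradient.
    """
    updated = [row[:] for row in table]
    if not freeze:
        updated[token_id] = [w - lr * g for w, g in zip(table[token_id], grad)]
    return updated

# Hypothetical 2-dimensional "pretrained" vectors for three tags.
pretrained = [[0.10, 0.20], [0.30, 0.40], [0.50, 0.60]]
grad = [1.0, -1.0]  # made-up gradient for the row of token 1

frozen = sgd_step(pretrained, token_id=1, grad=grad, lr=0.1, freeze=True)
tuned = sgd_step(pretrained, token_id=1, grad=grad, lr=0.1, freeze=False)

print("frozen row:", frozen[1])
print("tuned row: ", tuned[1])
```

With freezing, the row stays at its pretrained value; without it, only the row that was looked up is updated, and the other rows keep their fasttext initialization until they receive gradients of their own.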

## Updated Bash Shell Script

In [1]:
!cat ./train_test_tag_nofz.sh

#!/bin/bash

# Updated for Laphet LM Toolkit Version 0.7
# Last updated: 28 Jan 2025

# Create the output and log directories if they don't exist
mkdir -p model/tag/
mkdir -p output/tag/
mkdir -p log/tag/

# Function to train, generate text, and test a language model
task() {
  local model_type=$1
  local model_file="./model/tag/${model_type}.nofz.model"
  local output_file="./output/tag/${model_type}_nofz_gen_texts.txt"
  local log_file="./log/tag/${model_type}.nofz.log"
  local train_data="./data/myPOS/tag/train_tag.txt"
  local dev_data="./data/myPOS/tag/dev_tag.txt"
  local test_data="./data/myPOS/tag/test_tag.txt"
  local start_name="./data/myPOS/tag/start_tags.txt"

  {
    echo "Training ${model_type^} language model:";
    time python -u laphet.py --model_type $model_type --train --data $train_data \
      --dev_file $dev_data --model $model_file --seq_len 50 --epochs 10 --batch_size 32 \
      --lr 0.0001 --embedding_method fasttext_no_freeze \
      --fasttext_model ./fasttex

## Training, Text Generation and Testing with MLP LM

In [5]:
!./train_test_tag_nofz.sh

Training Mlp language model:
Epoch 1/10 (Training): 100%|███████████████| 1250/1250 [00:09<00:00, 135.33it/s]
Epoch 1, Training Loss: 0.8974
Epoch 1/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 519.16it/s]
Epoch 1, Validation Loss: 0.5729
Best model saved at ./model/tag/mlp.nofz.model with validation loss: 0.5729
Epoch 2/10 (Training): 100%|███████████████| 1250/1250 [00:08<00:00, 140.34it/s]
Epoch 2, Training Loss: 0.5729
Epoch 2/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 515.38it/s]
Epoch 2, Validation Loss: 0.5729
Best model saved at ./model/tag/mlp.nofz.model with validation loss: 0.5729
Epoch 3/10 (Training): 100%|███████████████| 1250/1250 [00:08<00:00, 140.29it/s]
Epoch 3, Training Loss: 0.5729
Epoch 3/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 546.96it/s]
Epoch 3, Validation Loss: 0.5729
Best model saved at ./model/tag/mlp.nofz.model with validation loss: 0.5729
Epoch 4/10 (Training): 100%|███████████████| 1250/1250 [00:08<00:

## GPU Usage During MLP LM Training

In [6]:
!ls -lh ./model/tag/mlp.nofz.model*

-rw-rw-r-- 1 ye ye 246K Jan 28 20:24 ./model/tag/mlp.nofz.model
-rw-rw-r-- 1 ye ye  110 Jan 28 20:23 ./model/tag/mlp.nofz.model.vocab


In [7]:
!cat ./data/myPOS/tag/start_tags.txt

pron
n
adj
v
pron part
pron ppm
n v
n n
v part
n part
pron pron
pron ppm
n tn
adj v part
n n n


Generated output with the above prompt file:

In [8]:
!cat ./output/tag/mlp_nofz_gen_texts.txt

pron int num
pron v adv
pron tn [PAD]
pron sb sb
pron v int
n punc v
n adj conj
n pron [UNK]
n int v
n sb sb
adj num v
adj [UNK] adj
adj [UNK] fw
adj [UNK] conj
adj adv [UNK]
v ppm num
v n ppm
v adv v
v conj abb
v abb tn
pron part part v
pron part n tn
pron part int adv
pron part int abb
pron part adv sb
pron ppm fw v
pron ppm part adj
pron ppm conj abb
pron ppm part pron
pron ppm fw ppm
n v fw adv
n v adv n
n v conj conj
n v abb v
n v num [UNK]
n n ppm abb
n n n pron
n n punc v
n n fw v
n n v num
v part [PAD] fw
v part ppm [PAD]
v part ppm conj
v part pron int
v part sb fw
n part int abb
n part sb abb
n part adv abb
n part adv v
n part ppm sb
pron pron part abb
pron pron conj ppm
pron pron adj conj
pron pron tn int
pron pron n pron
pron ppm num ppm
pron ppm [PAD] conj
pron ppm [UNK] tn
pron ppm part [PAD]
pron ppm fw tn
n tn n v
n tn [UNK] num
n tn conj [PAD]
n tn [UNK] [PAD]
n tn adj conj
adj v part tn adv
adj v part abb abb
adj v part n tn
adj v part int abb
adj v part punc part
n n
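
The generated sequences above include the special tokens [PAD] and [UNK], which are not POS tags. A minimal sketch for counting how often they appear is shown below; the helper name `special_token_stats` is hypothetical, and the sample lines are copied from the MLP output above rather than read from the output file.

```python
from collections import Counter

SPECIAL = {"[PAD]", "[UNK]"}

def special_token_stats(lines):
    """Return (non-empty line count, lines with a special token, per-token counts)."""
    counts = Counter()
    flagged = 0
    total = 0
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        total += 1
        hits = [t for t in tokens if t in SPECIAL]
        counts.update(hits)
        if hits:
            flagged += 1
    return total, flagged, counts

# A few lines taken from the MLP output shown above.
sample = [
    "pron int num",
    "pron tn [PAD]",
    "n pron [UNK]",
    "n tn [UNK] [PAD]",
    "n sb sb",
]

total, flagged, counts = special_token_stats(sample)
print(f"{flagged} of {total} lines contain a special token: {dict(counts)}")
```

To apply it to a full output file, pass the result of `open(path, encoding="utf-8")` as `lines`.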

## Training, Text Generation and Testing with Bi-LSTM LM

Run the updated bash shell script with the Bi-LSTM task line enabled:

In [9]:
!./train_test_tag_nofz.sh

Training Bilstm language model:
Epoch 1/10 (Training): 100%|████████████████| 1250/1250 [00:21<00:00, 58.32it/s]
Epoch 1, Training Loss: 0.1894
Epoch 1/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 176.39it/s]
Epoch 1, Validation Loss: 0.0019
Best model saved at ./model/tag/bilstm.nofz.model with validation loss: 0.0019
Epoch 2/10 (Training): 100%|████████████████| 1250/1250 [00:21<00:00, 58.67it/s]
Epoch 2, Training Loss: 0.0012
Epoch 2/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 172.26it/s]
Epoch 2, Validation Loss: 0.0001
Best model saved at ./model/tag/bilstm.nofz.model with validation loss: 0.0001
Epoch 3/10 (Training): 100%|████████████████| 1250/1250 [00:21<00:00, 59.42it/s]
Epoch 3, Training Loss: 0.0002
Epoch 3/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 177.79it/s]
Epoch 3, Validation Loss: 0.0000
Best model saved at ./model/tag/bilstm.nofz.model with validation loss: 0.0000
Epoch 4/10 (Training): 100%|████████████████| 1250/12

## GPU Usage of Bi-LSTM LM

In [10]:
!ls -lh ./model/tag/bilstm.nofz.model*

-rw-rw-r-- 1 ye ye 82M Jan 28 20:30 ./model/tag/bilstm.nofz.model
-rw-rw-r-- 1 ye ye 110 Jan 28 20:27 ./model/tag/bilstm.nofz.model.vocab


In [11]:
!cat ./output/tag/bilstm_nofz_gen_texts.txt

pron pron abb
pron pron n
pron pron pron
pron ppm int
pron ppm num
n num sb
n num sb
n num sb
n num sb
n num sb
adj num adv
adj [UNK] int
adj pron num
adj pron n
adj [PAD] adv
v conj part
v conj part
v conj part
v conj part
v conj part
pron part part part
pron part part part
pron part part part
pron part part part
pron part part part
pron ppm pron num
pron ppm num num
pron ppm num adj
pron ppm num adj
pron ppm fw num
n v conj part
n v conj part
n v conj part
n v conj part
n v conj part
n n num sb
n n num sb
n n num sb
n n num sb
n n num sb
v part part part
v part part part
v part part part
v part part part
v part part part
n part part part
n part part part
n part part part
n part part part
n part part part
pron pron n num
pron pron n adj
pron pron ppm ppm
pron pron ppm ppm
pron pron n num
pron ppm fw num
pron ppm num num
pron ppm num adj
pron ppm conj adj
pron ppm num num
n tn num conj
n tn [UNK] n
n tn [UNK] abb
n tn [UNK] n
n tn conj part
adj v part part part
adj v part part part
adj

In [12]:
!tail -n 2 ./log/tag/bilstm.nofz.log

Average Perplexity on Test Data: 1.0000
Average Cross-Entropy on Test Data: 0.0000
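
The two numbers in the log are consistent with each other if perplexity is computed as the exponential of the mean cross-entropy. The sketch below assumes that definition and a cross-entropy measured in nats; how Laphet computes these values internally is not shown in this notebook, and the function name `perplexity` is only illustrative.

```python
import math

def perplexity(cross_entropy):
    """Perplexity for a mean per-token cross-entropy given in nats."""
    return math.exp(cross_entropy)

# A cross-entropy of exactly zero gives a perplexity of exactly one.
print(f"{perplexity(0.0):.4f}")

# A small non-zero loss also prints as 0.0000 / 1.0000 at four decimals.
tiny = 0.00004
print(f"{tiny:.4f} {perplexity(tiny):.4f}")
```

Because the log rounds to four decimal places, a reported cross-entropy of 0.0000 only means the value is below 0.00005, not that it is exactly zero.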


## Training, Text Generation and Testing with Transformer LM

In [14]:
!cat ./train_test_tag_nofz.sh

#!/bin/bash

# Updated for Laphet LM Toolkit Version 0.7
# Last updated: 28 Jan 2025

# Create the output and log directories if they don't exist
mkdir -p model/tag/
mkdir -p output/tag/
mkdir -p log/tag/

# Function to train, generate text, and test a language model
task() {
  local model_type=$1
  local model_file="./model/tag/${model_type}.nofz.model"
  local output_file="./output/tag/${model_type}_nofz_gen_texts.txt"
  local log_file="./log/tag/${model_type}.nofz.log"
  local train_data="./data/myPOS/tag/train_tag.txt"
  local dev_data="./data/myPOS/tag/dev_tag.txt"
  local test_data="./data/myPOS/tag/test_tag.txt"
  local start_name="./data/myPOS/tag/start_tags.txt"

  {
    echo "Training ${model_type^} language model:";
    time python -u laphet.py --model_type $model_type --train --data $train_data \
      --dev_file $dev_data --model $model_file --seq_len 50 --epochs 10 --batch_size 32 \
      --lr 0.0001 --embedding_method fasttext_no_freeze \
      --fasttext_model ./fasttex

In [15]:
!./train_test_tag_nofz.sh

Training Transformer language model:
Epoch 1/10 (Training): 100%|███████████████| 1250/1250 [00:05<00:00, 227.35it/s]
Epoch 1, Training Loss: 0.2948
Epoch 1/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 538.45it/s]
Epoch 1, Validation Loss: 0.0063
Best model saved at ./model/tag/transformer.nofz.model with validation loss: 0.0063
Epoch 2/10 (Training): 100%|███████████████| 1250/1250 [00:05<00:00, 231.51it/s]
Epoch 2, Training Loss: 0.0059
Epoch 2/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 638.67it/s]
Epoch 2, Validation Loss: 0.0013
Best model saved at ./model/tag/transformer.nofz.model with validation loss: 0.0013
Epoch 3/10 (Training): 100%|███████████████| 1250/1250 [00:05<00:00, 237.81it/s]
Epoch 3, Training Loss: 0.0016
Epoch 3/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 579.99it/s]
Epoch 3, Validation Loss: 0.0004
Best model saved at ./model/tag/transformer.nofz.model with validation loss: 0.0004
Epoch 4/10 (Training): 100%|█████

## GPU Usage of Transformer based LM

Let's look at the file size of the model file produced by training.

In [16]:
!ls -lh ./model/tag/transformer*

-rw-rw-r-- 1 ye ye 2.3M Jan 28 19:55 ./model/tag/transformer.ftfz.model
-rw-rw-r-- 1 ye ye  110 Jan 28 19:54 ./model/tag/transformer.ftfz.model.vocab
-rw-rw-r-- 1 ye ye 8.2M Jan 28 17:22 ./model/tag/transformer.model
-rw-rw-r-- 1 ye ye  110 Jan 28 17:20 ./model/tag/transformer.model.vocab
-rw-rw-r-- 1 ye ye 2.3M Jan 28 20:40 ./model/tag/transformer.nofz.model
-rw-rw-r-- 1 ye ye  110 Jan 28 20:39 ./model/tag/transformer.nofz.model.vocab


In [17]:
!cat ./output/tag/transformer_nofz_gen_texts.txt

pron [PAD] n
pron n punc
pron ppm v
pron part adj
pron punc ppm
n fw [UNK]
n ppm v
n num fw
n n adj
n [UNK] tn
adj punc n
adj [UNK] tn
adj [UNK] tn
adj v [UNK]
adj abb abb
v num int
v ppm num
v adv fw
v v sb
v punc sb
pron part tn n
pron part [UNK] num
pron part adj part
pron part adv num
pron part part fw
pron ppm [UNK] adj
pron ppm num n
pron ppm v adv
pron ppm ppm fw
pron ppm num [UNK]
n v int [PAD]
n v num pron
n v abb fw
n v punc ppm
n v ppm adv
n n ppm punc
n n [PAD] num
n n adv adj
n n v fw
n n n tn
v part tn adv
v part adv n
v part tn adv
v part adv num
v part adv n
n part tn pron
n part abb [UNK]
n part pron tn
n part pron part
n part pron ppm
pron pron adv abb
pron pron fw sb
pron pron punc conj
pron pron ppm sb
pron pron sb abb
pron ppm [PAD] num
pron ppm adj v
pron ppm num num
pron ppm part [PAD]
pron ppm pron adj
n tn tn n
n tn num tn
n tn sb int
n tn pron [PAD]
n tn tn sb
adj v part punc conj
adj v part adj tn
adj v part part [PAD]
adj v part tn int
adj v part num fw
n n 

In [18]:
!tail -n 2 ./log/tag/transformer.nofz.log

Average Perplexity on Test Data: 1.0000
Average Cross-Entropy on Test Data: 0.0000


## Training, Text Generation and Testing with BERT LM

In [19]:
!cat ./train_test_tag_nofz.sh

#!/bin/bash

# Updated for Laphet LM Toolkit Version 0.7
# Last updated: 28 Jan 2025

# Create the output and log directories if they don't exist
mkdir -p model/tag/
mkdir -p output/tag/
mkdir -p log/tag/

# Function to train, generate text, and test a language model
task() {
  local model_type=$1
  local model_file="./model/tag/${model_type}.nofz.model"
  local output_file="./output/tag/${model_type}_nofz_gen_texts.txt"
  local log_file="./log/tag/${model_type}.nofz.log"
  local train_data="./data/myPOS/tag/train_tag.txt"
  local dev_data="./data/myPOS/tag/dev_tag.txt"
  local test_data="./data/myPOS/tag/test_tag.txt"
  local start_name="./data/myPOS/tag/start_tags.txt"

  {
    echo "Training ${model_type^} language model:";
    time python -u laphet.py --model_type $model_type --train --data $train_data \
      --dev_file $dev_data --model $model_file --seq_len 50 --epochs 10 --batch_size 32 \
      --lr 0.0001 --embedding_method fasttext_no_freeze \
      --fasttext_model ./fasttex

In [21]:
!./train_test_tag_nofz.sh

Training Bert language model:
Epoch 1/10 (Training): 100%|███████████████| 1250/1250 [00:05<00:00, 223.16it/s]
Epoch 1, Training Loss: 0.3094
Epoch 1/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 659.63it/s]
Epoch 1, Validation Loss: 0.0069
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0069
Epoch 2/10 (Training): 100%|███████████████| 1250/1250 [00:05<00:00, 235.57it/s]
Epoch 2, Training Loss: 0.0063
Epoch 2/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 664.03it/s]
Epoch 2, Validation Loss: 0.0013
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0013
Epoch 3/10 (Training): 100%|███████████████| 1250/1250 [00:05<00:00, 234.99it/s]
Epoch 3, Training Loss: 0.0017
Epoch 3/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 584.63it/s]
Epoch 3, Validation Loss: 0.0004
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0004
Epoch 4/10 (Training): 100%|███████████████| 1250/1250 [00:05

## GPU Usage of BERT LM

In [22]:
!ls -lh ./model/tag/bert*

-rw-rw-r-- 1 ye ye 2.3M Jan 28 19:59 ./model/tag/bert.ftfz.model
-rw-rw-r-- 1 ye ye  110 Jan 28 19:58 ./model/tag/bert.ftfz.model.vocab
-rw-rw-r-- 1 ye ye 8.2M Jan 28 17:37 ./model/tag/bert.model
-rw-rw-r-- 1 ye ye  110 Jan 28 17:36 ./model/tag/bert.model.vocab
-rw-rw-r-- 1 ye ye 2.3M Jan 28 20:46 ./model/tag/bert.nofz.model
-rw-rw-r-- 1 ye ye  110 Jan 28 20:45 ./model/tag/bert.nofz.model.vocab


In [23]:
!cat ./output/tag/bert_nofz_gen_texts.txt

pron v ppm
pron ppm int
pron conj pron
pron n int
pron adj adv
n [PAD] num
n v [UNK]
n adj adj
n part tn
n adj [PAD]
adj conj [PAD]
adj num punc
adj pron int
adj int pron
adj num v
v conj n
v int abb
v [PAD] fw
v adj ppm
v adj part
pron part conj n
pron part tn int
pron part [PAD] sb
pron part part int
pron part int ppm
pron ppm conj adv
pron ppm [PAD] ppm
pron ppm [PAD] num
pron ppm tn [PAD]
pron ppm adv tn
n v v n
n v conj tn
n v sb [UNK]
n v [PAD] abb
n v ppm [PAD]
n n n num
n n punc part
n n abb adv
n n pron [PAD]
n n [UNK] adj
v part v adv
v part conj adv
v part fw pron
v part v v
v part punc conj
n part v abb
n part pron abb
n part punc adv
n part adj n
n part [PAD] tn
pron pron pron part
pron pron part part
pron pron abb sb
pron pron [PAD] punc
pron pron [PAD] tn
pron ppm punc [UNK]
pron ppm pron punc
pron ppm part adv
pron ppm n tn
pron ppm num sb
n tn [UNK] [PAD]
n tn conj v
n tn punc fw
n tn abb [PAD]
n tn sb abb
adj v part tn punc
adj v part num ppm
adj v part part fw
adj v 

In [24]:
!cat ./log/tag/bert.nofz.log

Training Bert language model:
Epoch 1, Training Loss: 0.3094
Epoch 1, Validation Loss: 0.0069
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0069
Epoch 2, Training Loss: 0.0063
Epoch 2, Validation Loss: 0.0013
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0013
Epoch 3, Training Loss: 0.0017
Epoch 3, Validation Loss: 0.0004
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0004
Epoch 4, Training Loss: 0.0006
Epoch 4, Validation Loss: 0.0002
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0002
Epoch 5, Training Loss: 0.0003
Epoch 5, Validation Loss: 0.0001
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0001
Epoch 6, Training Loss: 0.0002
Epoch 6, Validation Loss: 0.0000
Best model saved at ./model/tag/bert.nofz.model with validation loss: 0.0000
Epoch 7, Training Loss: 0.0001
Epoch 7, Validation Loss: 0.0000
Best model saved at ./model/tag/bert.nofz.model with validat

## Training, Text Generation and Testing with GPT LM

The updated bash shell script is as follows:

In [25]:
!cat ./train_test_tag_nofz.sh

#!/bin/bash

# Updated for Laphet LM Toolkit Version 0.7
# Last updated: 28 Jan 2025

# Create the output and log directories if they don't exist
mkdir -p model/tag/
mkdir -p output/tag/
mkdir -p log/tag/

# Function to train, generate text, and test a language model
task() {
  local model_type=$1
  local model_file="./model/tag/${model_type}.nofz.model"
  local output_file="./output/tag/${model_type}_nofz_gen_texts.txt"
  local log_file="./log/tag/${model_type}.nofz.log"
  local train_data="./data/myPOS/tag/train_tag.txt"
  local dev_data="./data/myPOS/tag/dev_tag.txt"
  local test_data="./data/myPOS/tag/test_tag.txt"
  local start_name="./data/myPOS/tag/start_tags.txt"

  {
    echo "Training ${model_type^} language model:";
    time python -u laphet.py --model_type $model_type --train --data $train_data \
      --dev_file $dev_data --model $model_file --seq_len 50 --epochs 10 --batch_size 32 \
      --lr 0.0001 --embedding_method fasttext_no_freeze \
      --fasttext_model ./fasttex

In [27]:
!./train_test_tag_nofz.sh

Training Gpt language model:
Epoch 1/10 (Training): 100%|███████████████| 1250/1250 [00:05<00:00, 244.47it/s]
Epoch 1, Training Loss: 0.1026
Epoch 1/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 799.18it/s]
Epoch 1, Validation Loss: 0.0018
Best model saved at ./model/tag/gpt.nofz.model with validation loss: 0.0018
Epoch 2/10 (Training): 100%|███████████████| 1250/1250 [00:04<00:00, 270.74it/s]
Epoch 2, Training Loss: 0.0013
Epoch 2/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 803.94it/s]
Epoch 2, Validation Loss: 0.0005
Best model saved at ./model/tag/gpt.nofz.model with validation loss: 0.0005
Epoch 3/10 (Training): 100%|███████████████| 1250/1250 [00:04<00:00, 272.36it/s]
Epoch 3, Training Loss: 0.0004
Epoch 3/10 (Validation): 100%|█████████████████| 69/69 [00:00<00:00, 774.58it/s]
Epoch 3, Validation Loss: 0.0002
Best model saved at ./model/tag/gpt.nofz.model with validation loss: 0.0002
Epoch 4/10 (Training): 100%|███████████████| 1250/1250 [00:04<00:

## GPU Usage of GPT based LM Building

Let's take a look at the model file produced by training.

In [28]:
!ls -lh ./model/tag/gpt.nofz.model*

-rw-rw-r-- 1 ye ye 2.3M Jan 28 20:50 ./model/tag/gpt.nofz.model
-rw-rw-r-- 1 ye ye  110 Jan 28 20:49 ./model/tag/gpt.nofz.model.vocab


In [29]:
!ls -lh ./model/tag/gpt.*model

-rw-rw-r-- 1 ye ye 2.3M Jan 28 20:02 ./model/tag/gpt.ftfz.model
-rw-rw-r-- 1 ye ye 8.2M Jan 28 17:40 ./model/tag/gpt.model
-rw-rw-r-- 1 ye ye 2.3M Jan 28 20:50 ./model/tag/gpt.nofz.model


As shown above, the two GPT-based LM models built with the fasttext embedding (2.3M each) have a smaller file size than the model built with nn.Embedding (8.2M).
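
To put a number on the difference, the sizes printed by ls -lh in the listing above can be converted to bytes and compared. This is a rough sketch: the helper `parse_ls_size` is hypothetical, and the sizes are the rounded values from the listing, so the ratio is approximate.

```python
# Unit suffixes as printed by `ls -lh` (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def parse_ls_size(text):
    """Convert an `ls -lh` size such as '246K' or '8.2M' to bytes (approximate)."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)  # plain byte count, e.g. '110'

# Rounded sizes copied from the listing above.
sizes = {
    "gpt.model": "8.2M",
    "gpt.ftfz.model": "2.3M",
    "gpt.nofz.model": "2.3M",
}

baseline = parse_ls_size(sizes["gpt.model"])
for name, size in sizes.items():
    ratio = baseline / parse_ls_size(size)
    print(f"{name}: {size} (gpt.model is {ratio:.2f}x this size)")
```

On these rounded figures, the nn.Embedding model file is roughly three and a half times the size of either fasttext-based model file.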

Let's also look at the tag generation of the GPT-based language model.

In [30]:
!cat ./output/tag/gpt_nofz_gen_texts.txt

pron punc fw
pron [PAD] pron
pron adv [UNK]
pron [PAD] v
pron int tn
n pron punc
n n pron
n int adj
n [UNK] v
n pron int
adj sb v
adj part abb
adj fw adv
adj part adv
adj abb punc
v fw fw
v [UNK] tn
v ppm ppm
v abb tn
v pron sb
pron part num [PAD]
pron part punc pron
pron part int part
pron part tn sb
pron part pron ppm
pron ppm v adj
pron ppm ppm sb
pron ppm punc conj
pron ppm [PAD] abb
pron ppm part sb
n v conj fw
n v [PAD] [PAD]
n v [PAD] num
n v n conj
n v tn [PAD]
n n adj abb
n n adj tn
n n punc int
n n num n
n n int part
v part num sb
v part adj ppm
v part int v
v part adj pron
v part part [PAD]
n part ppm [PAD]
n part pron adj
n part adj [UNK]
n part fw fw
n part punc adj
pron pron n abb
pron pron sb adj
pron pron ppm v
pron pron sb int
pron pron abb sb
pron ppm pron abb
pron ppm [PAD] tn
pron ppm [UNK] punc
pron ppm [PAD] tn
pron ppm v [UNK]
n tn punc n
n tn tn pron
n tn sb sb
n tn fw [PAD]
n tn [PAD] adv
adj v part part fw
adj v part adj sb
adj v part [PAD] part
adj v part adv