(ncrfpp) yekyaw.thu@gpu:~/tool$ git clone https://github.com/jiesutd/NCRFpp
Cloning into 'NCRFpp'...
remote: Enumerating objects: 768, done.
remote: Total 768 (delta 0), reused 0 (delta 0), pack-reused 768
Receiving objects: 100% (768/768), 6.89 MiB | 12.92 MiB/s, done.
Resolving deltas: 100% (484/484), done.
(ncrfpp) yekyaw.thu@gpu:~/tool$
checked the cloned files:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ ls
demo.clf.config demo.train.config main_parse.py model README.md utils
demo.decode.config LICENCE main.py readme sample_data
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
I created a new Anaconda environment with Python 3.8 as follows:
(base) yekyaw.thu@gpu:~/tool/NCRFpp$ conda create --name ncrfpp python==3.8
...
...
...
wheel pkgs/main/noarch::wheel-0.37.1-pyhd3eb1b0_0
xz pkgs/main/linux-64::xz-5.2.8-h5eee18b_0
zlib pkgs/main/linux-64::zlib-1.2.13-h5eee18b_0
Proceed ([y]/n)? y
Downloading and Extracting Packages
libedit-3.1.20221030 | 181 KB | ################################################################################### | 100%
xz-5.2.8 | 429 KB | ################################################################################### | 100%
sqlite-3.33.0 | 1.1 MB | ################################################################################### | 100%
libffi-3.2.1 | 48 KB | ################################################################################### | 100%
python-3.8.0 | 34.9 MB | ################################################################################### | 100%
pip-22.3.1 | 2.7 MB | ################################################################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate ncrfpp
#
# To deactivate an active environment, use
#
# $ conda deactivate
When I tried to install torch==1.0, I got the following error:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ pip install torch==1.0
ERROR: Could not find a version that satisfies the requirement torch==1.0 (from versions: 1.4.0, 1.5.0, 1.5.1, 1.6.0, 1.7.0, 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0)
ERROR: No matching distribution found for torch==1.0
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
The error shows that no torch 1.0 wheel exists for Python 3.8; the oldest available version is 1.4.0. How about trying with 1.4.0 ?!
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ pip install --upgrade pip
Requirement already satisfied: pip in /home/yekyaw.thu/.conda/envs/ncrfpp/lib/python3.8/site-packages (22.3.1)
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ pip install torch==1.4.0
Collecting torch==1.4.0
Downloading torch-1.4.0-cp38-cp38-manylinux1_x86_64.whl (753.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 753.4/753.4 MB 1.9 MB/s eta 0:00:00
Installing collected packages: torch
Successfully installed torch-1.4.0
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
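Before moving on, a quick sanity check that the installed torch actually imports and can see the GPUs (a minimal sketch; run it inside the ncrfpp environment):

```python
import importlib.util

def check_torch():
    """Report the installed torch version and CUDA availability, if torch is present."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch
    return "torch %s, CUDA available: %s" % (torch.__version__, torch.cuda.is_available())

print(check_torch())
```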
Check the demo config file ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ cat demo.train.config
### use # to comment out the configure item
### I/O ###
train_dir=sample_data/train.bmes
dev_dir=sample_data/dev.bmes
test_dir=sample_data/test.bmes
model_dir=sample_data/lstmcrf
word_emb_dir=sample_data/sample.word.emb
#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
###NetworkConfiguration###
use_crf=True
use_char=True
word_seq_feature=LSTM
char_seq_feature=CNN
#feature=[POS] emb_size=20
#feature=[Cap] emb_size=20
#nbest=1
###TrainingSetting###
status=train
optimizer=SGD
iteration=1
batch_size=10
ave_batch_loss=False
###Hyperparameters###
cnn_layer=4
char_hidden_dim=50
hidden_dim=200
dropout=0.5
lstm_layer=1
bilstm=True
learning_rate=0.015
lr_decay=0.05
momentum=0
l2=1e-8
#gpu
#clip=
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
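The config format is simple key=value lines, with `#` commenting out an item. A rough reader for this format (a sketch only, not NCRFpp's own parser — the repeatable `feature=[...] emb_size=...` lines would need extra handling):

```python
def read_config(lines):
    """Parse key=value config lines, skipping blanks and #-commented items."""
    config = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            config[key.strip()] = value.strip()
    return config

# usage: read_config(open("demo.train.config", encoding="utf-8"))
```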
check the GPU status:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ nvidia-smi
Wed Dec 14 21:20:23 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03 Driver Version: 470.161.03 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:0A:00.0 Off | N/A |
| 30% 45C P0 58W / 300W | 0MiB / 11019MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:42:00.0 Off | N/A |
| 62% 69C P0 72W / 257W | 0MiB / 11019MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:43:00.0 Off | N/A |
| 22% 64C P0 72W / 250W | 0MiB / 11016MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
test training ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config demo.train.config
Traceback (most recent call last):
File "main.py", line 12, in <module>
import torch
File "/home/yekyaw.thu/.conda/envs/ncrfpp/lib/python3.8/site-packages/torch/__init__.py", line 81, in <module>
from torch._C import *
ImportError: numpy.core.multiarray failed to import
real 0m0.218s
user 0m0.160s
sys 0m0.017s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
the numpy library is required ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ pip install numpy
Collecting numpy
Using cached numpy-1.23.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Installing collected packages: numpy
Successfully installed numpy-1.23.5
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
test training again ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config demo.train.config
Seed num: 42
MODEL: train
Load pretrained word embedding, norm: False, dir: sample_data/sample.word.emb
Embedding:
pretrain word:15093, prefect match:847, case_match:433, oov:1834, oov%:0.5887640449438202
Training model...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
I/O:
Start Sequence Laebling task...
Tag scheme: BMES
Split token: |||
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Word alphabet size: 3115
Char alphabet size: 71
Label alphabet size: 18
Word embedding dir: sample_data/sample.word.emb
Char embedding dir: None
Word embedding size: 50
Char embedding size: 30
Norm word emb: False
Norm char emb: False
Train file directory: sample_data/train.bmes
Dev file directory: sample_data/dev.bmes
Test file directory: sample_data/test.bmes
Raw file directory: None
Dset file directory: None
Model file directory: sample_data/lstmcrf
Loadmodel directory: None
Decode file directory: None
Train instance number: 484
Dev instance number: 112
Test instance number: 186
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: True
Model word extractor: LSTM
Model use_char: True
Model char extractor: CNN
Model char_hidden_dim: 50
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 1
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: LSTM
use crf: True
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
build CRF...
Epoch: 0/1
Learning rate is set as: 0.015
Shuffle: first input word list: [1728, 131, 1661, 133]
Instance: 484; Time: 2.61s; loss: 5349.0192; acc: 5449/6640=0.8206
Epoch: 0 training finished. Time: 2.62s, speed: 185.08st/s, total loss: 5349.019195556641
totalloss: 5349.019195556641
Right token = 1225 All token = 1458 acc = 0.8401920438957476
Dev: time: 0.13s, speed: 874.52st/s; acc: 0.8402, p: 0.6026, r: 0.2238, f: 0.3264
Exceed previous best f score: -10
Save current best model in file: sample_data/lstmcrf.0.model
Right token = 3238 All token = 3610 acc = 0.8969529085872576
Test: time: 0.30s, speed: 625.11st/s; acc: 0.8970, p: 0.6645, r: 0.2919, f: 0.4056
real 0m7.784s
user 0m5.019s
sys 0m2.777s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
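The printed f value is just the harmonic mean of the printed p (precision) and r (recall); quick arithmetic check against the Dev and Test lines above:

```python
def f1(p, r):
    """F-value as the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Dev above:  p=0.6026, r=0.2238 -> f = 0.3264
# Test above: p=0.6645, r=0.2919 -> f = 0.4056
print(round(f1(0.6026, 0.2238), 4))
print(round(f1(0.6645, 0.2919), 4))
```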
test training with the demo configuration looks OK.
check the demo decode configuration file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ cat demo.decode.config
### Decode ###
status=decode
raw_dir=sample_data/raw.bmes
nbest=10
decode_dir=sample_data/raw.out
dset_dir=sample_data/lstmcrf.dset
load_model_dir=sample_data/lstmcrf.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
let's test ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config demo.decode.config
Seed num: 42
MODEL: decode
sample_data/raw.bmes
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
I/O:
Start Sequence Laebling task...
Tag scheme: BMES
Split token: |||
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Word alphabet size: 3115
Char alphabet size: 71
Label alphabet size: 18
Word embedding dir: sample_data/sample.word.emb
Char embedding dir: None
Word embedding size: 50
Char embedding size: 30
Norm word emb: False
Norm char emb: False
Train file directory: sample_data/train.bmes
Dev file directory: sample_data/dev.bmes
Test file directory: sample_data/test.bmes
Raw file directory: sample_data/raw.bmes
Dset file directory: sample_data/lstmcrf.dset
Model file directory: sample_data/lstmcrf
Loadmodel directory: sample_data/lstmcrf.0.model
Decode file directory: sample_data/raw.out
Train instance number: 484
Dev instance number: 112
Test instance number: 186
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: True
Model word extractor: LSTM
Model use_char: True
Model char extractor: CNN
Model char_hidden_dim: 50
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 1
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: 10
Load Model from file: sample_data/lstmcrf
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: LSTM
use crf: True
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
build CRF...
Decode raw data, nbest: 10 ...
Right token = 1225 All token = 1458 acc = 0.8401920438957476
raw: time:0.22s, speed:511.40st/s; acc: 0.8402, p: 0.6026, r: 0.2238, f: 0.3264
Predict raw 10-best result has been written into file. sample_data/raw.out
real 0m5.310s
user 0m1.903s
sys 0m3.298s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
check the training data format:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/sample_data$ head -n 30 ./train.bmes
EU S-ORG
rejects O
German S-MISC
call O
to O
boycott O
British S-MISC
lamb O
. O
Peter B-PER
Blackburn E-PER
BRUSSELS S-LOC
1996-08-22 O
The O
European B-ORG
Commission E-ORG
said O
on O
Thursday O
it O
disagreed O
with O
German S-MISC
advice O
to O
consumers O
to O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/sample_data$
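The data format is one token and one tag per line, space-separated, with (in the CoNLL-style sample data) a blank line between sentences. A rough reader sketch for this column format (assuming single-space separation between word and tag, and blank-line sentence boundaries):

```python
def read_bmes(lines):
    """Group 'word tag' lines into sentences; a blank line ends a sentence."""
    sentences, current = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        word, tag = line.rsplit(" ", 1)
        current.append((word, tag))
    if current:  # final sentence without a trailing blank line
        sentences.append(current)
    return sentences

# usage: read_bmes(open("sample_data/train.bmes", encoding="utf-8"))
```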
Check the test data format:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/sample_data$ head -n 50 ./test.bmes
Aziz S-PER
said O
Iraq S-LOC
's O
military O
intervention O
, O
the O
first O
on O
such O
scale O
since O
the O
U.S. S-LOC
and O
allies O
decided O
to O
protect O
Iraqi B-MISC
Kurds E-MISC
against O
Baghdad S-LOC
, O
was O
in O
response O
to O
a O
plea O
from O
Barzani S-PER
to O
President O
Saddam B-PER
Hussein E-PER
to O
back O
him O
militarily O
and O
save O
his O
people O
from O
attacks O
by O
Iran S-LOC
and O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/sample_data$
Check more lines of the test data:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/sample_data$ head -n 100 ./test.bmes
Aziz S-PER
said O
Iraq S-LOC
's O
military O
intervention O
, O
the O
first O
on O
such O
scale O
since O
the O
U.S. S-LOC
and O
allies O
decided O
to O
protect O
Iraqi B-MISC
Kurds E-MISC
against O
Baghdad S-LOC
, O
was O
in O
response O
to O
a O
plea O
from O
Barzani S-PER
to O
President O
Saddam B-PER
Hussein E-PER
to O
back O
him O
militarily O
and O
save O
his O
people O
from O
attacks O
by O
Iran S-LOC
and O
Talabani S-PER
. O
He O
said O
Barzani S-PER
sent O
a O
message O
to O
Saddam S-PER
on O
August O
22 O
in O
which O
he O
said O
: O
" O
The O
conspiracy O
is O
beyond O
our O
capability O
therefore O
we O
plead O
with O
your O
excellency O
to O
order O
Iraqi S-MISC
armed O
forces O
to O
interfere O
to O
help O
us O
to O
evade O
the O
foreign O
threat O
and O
put O
an O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/sample_data$
Reference: https://github.com/jiesutd/NCRFpp/blob/master/readme/Configuration.md
Relating to I/O:
train_dir=xx #string (necessary in training). Set training file directory.
dev_dir=xx #string (necessary in training). Set dev file directory.
test_dir=xx #string . Set test file directory.
model_dir=xx #string (optional). Set saved model file directory.
word_emb_dir=xx #string (optional). Set pretrained word embedding file directory.
raw_dir=xx #string (optional). Set input raw file directory.
decode_dir=xx #string (necessary in decoding). Set decoded file directory.
dset_dir=xx #string (necessary). Set the saved data setting (.dset) file directory.
load_model_dir=xx #string (necessary in decoding). Set the model file directory to load for decoding.
char_emb_dir=xx #string (optional). Set pretrained character embedding file directory.
norm_word_emb=False #boolean. Whether to normalize the pretrained word embedding.
norm_char_emb=False #boolean. Whether to normalize the pretrained character embedding.
number_normalized=True #boolean. Whether to normalize digits into `0` in the input files.
seg=True #boolean. Whether the task is segmentation-like: tasks evaluated by token accuracy (e.g. POS, CCG) use False; tasks evaluated by F-value (e.g. Word Segmentation, NER, Chunking) use True.
word_emb_dim=50 #int. Word embedding dimension; if the model uses a pretrained word embedding, word_emb_dim will be reset to the same dimension as the pretrained embedding.
char_emb_dim=30 #int. Character embedding dimension; if the model uses a pretrained character embedding, char_emb_dim will be reset to the same dimension as the pretrained embedding.
Relating to Networking:
use_crf=True #boolean (necessary in training). Flag for using a CRF layer. If set to False, Softmax is used in the inference layer.
use_char=True #boolean (necessary in training). Flag for using a character sequence layer.
word_seq_feature=XX #string (necessary in training): CNN/LSTM/GRU. Neural structure selection for the word sequence.
char_seq_feature=CNN #string (necessary in training): CNN/LSTM/GRU. Neural structure selection for the character sequence; only used when use_char=True.
feature=[POS] emb_size=20 emb_dir=xx #feature configuration. It includes the feature prefix [POS], the pretrained feature embedding file and the embedding size.
feature=[Cap] emb_size=20 emb_dir=xx #feature configuration. Another feature, [Cap].
nbest=1 #int (necessary in decoding). Set the nbest size during decoding.
Relating to Training Setting:
status=train #string: train or decode. Set the program running in training or decoding mode.
optimizer=SGD #string: SGD/Adagrad/AdaDelta/RMSprop/Adam. optimizer selection.
iteration=1 #int. Set the iteration number of training.
batch_size=10 #int. Set the batch size of training or decoding.
ave_batch_loss=False #boolean. Whether to average the batch loss during training.
Relating to Hyperparameters:
cnn_layer=4 #int. CNN layer number for word sequence layer.
char_hidden_dim=50 #int. Character hidden vector dimension for character sequence layer.
hidden_dim=200 #int. Word hidden vector dimension for word sequence layer.
dropout=0.5 #float. Dropout probability.
lstm_layer=1 #int. LSTM layer number for word sequence layer.
bilstm=True #boolean. Whether to use a bidirectional LSTM for the word sequence layer.
learning_rate=0.015 #float. Learning rate.
lr_decay=0.05 #float. Learning rate decay rate, only works when optimizer=SGD.
momentum=0 #float. Momentum.
l2=1e-8 #float. L2 regularization.
#gpu=True #boolean. Whether to use the GPU; generally depends on the hardware environment.
#clip= #float. Clip gradients larger than the set value.
preparing the sent dataset:
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/sent$ cp /home/yekyaw.thu/exp/mySent/ncrf/mySentence-data-crf-format/sent_data_crf_format/train-valid-test-style/* .
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/sent$ ls
test.col train.col valid.col
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/sent$ wc *
68334 127244 1051379 test.col
583541 1087082 8995710 train.col
34729 64630 533580 valid.col
686604 1278956 10580669 total
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/sent$
check data format roughly ...
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/sent$ head train.col
ဘာ B
ရယ် O
လလိလို့ O
တိတိကျကျ O
ထောက်မပြ O
နနိုင် O
ပေမမဲ့ O
ပြဿနာ O
တစ် O
ခု O
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/sent$ head valid.col
ထထို B
အချိန် O
မှ O
စ O
၍ O
စင်္ကာပူ O
ကျွန်း O
၏ O
ခေတ်သစ် O
တစ် O
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/sent$ head test.col
အခု B
သန့်စင်ခန်း N
ကကို N
သသုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/sent$
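In this dataset every token carries a sentence-position tag rather than an NE tag. Assuming B marks the first token of a sentence and E the last, with N/O for tokens in between (my reading of the samples above, not a documented spec), the tagged token stream can be folded back into sentences:

```python
def group_sentences(tagged_tokens):
    """Group (word, tag) pairs into sentences, assuming B = sentence start, E = sentence end."""
    sentences, current = [], []
    for word, tag in tagged_tokens:
        if tag == "B" and current:  # a new B before an E: close the previous sentence
            sentences.append(current)
            current = []
        current.append(word)
        if tag == "E":
            sentences.append(current)
            current = []
    if current:  # trailing sentence without an explicit E
        sentences.append(current)
    return sentences
```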
preparing the sent+para dataset:
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$ cp ../../mySentence-data-crf-format/sent+para_data_crf_format/train-valid-test-style/* .
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$ wc *
102144 193264 1578958 test.col
881245 1668486 13659721 train.col
64861 123564 1004217 valid.col
1048250 1985314 16242896 total
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$
check data format ...
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$ head -n 30 train.col
နားလည် B
ပါ N
ပြီ E
ဈေး B
က O
များ N
လှ N
ချေ N
လား E
သူ B
ဒီ O
နေ့ O
နည်းနည်း O
ပင်ပန်း O
နေ N
တယ် N
ထင် N
တယ် E
ဘာ B
ကြောင့် O
လဲ O
ဆဆို N
စမ်း N
ပါ N
ဦး E
စိတ်ကောက် B
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$
check valid.col
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$ head -n 30 valid.col
သူ B
ဘယ်သူ N
နနဲ့ N
အရင်းနှီးဆဆုံး N
လဲ E
ဒီ B
က O
နေ O
ရှေ့ O
ကကို O
တည့်တည့် O
သွား O
မီးပွွိုင့် O
တွေ့ O
ရင် O
ဘယ်ဘက် O
ကွေ့ O
၂ O
မှတ်တတိုင် O
ဆက်လက် O
သွား O
ရင် O
ရောက် N
ပါ N
လိမ့် N
မယ် E
ရေခဲ B
မ N
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$
check test.col file ...
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$ head -n 30 test.col
ရင်ဘတ် B
အောင့် O
လာ N
ရင် N
သတိထား N
ပါ E
ဘယ်လောက် B
နောက်ကျ N
သလဲ E
ကြြိုပပိပို့ B
ဘတ်စ်ကား N
က N
အဆင်အပြေဆဆုံး N
ပဲ E
အဲဒီ B
အဖွွဲ့ O
ရရဲ့ O
ဥက္ကဋ္ဌ O
ဖြစ် O
တတဲ့ O
ယယို O
ကကို O
ယာမာ့ O
အာကိဟီတတို O
YokoyamaAkihito O
က O
တခြား O
(ncrfpp) yekyaw.thu@gpu:~/exp/mySent/ncrf/data/para$
training config file:
### use # to comment out the configure item
### I/O ###
train_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
dev_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
test_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn
#word_emb_dir=sample_data/sample.word.emb
#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
###NetworkConfiguration###
use_crf=False
use_char=True
word_seq_feature=LSTM
char_seq_feature=CNN
#feature=[POS] emb_size=20
#feature=[Cap] emb_size=20
#nbest=1
###TrainingSetting###
status=train
# optimizer can be SGD/Adagrad/AdaDelta/RMSprop/Adam
optimizer=SGD
iteration=100
batch_size=10
ave_batch_loss=False
###Hyperparameters###
cnn_layer=4
char_hidden_dim=50
hidden_dim=200
dropout=0.5
lstm_layer=1
bilstm=True
learning_rate=0.015
lr_decay=0.05
momentum=0
l2=1e-8
gpu=True
#clip=
training log ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.char-cnn.train.config
...
...
...
Instance: 28000; Time: 0.68s; loss: 32.4567; acc: 287120/378246=0.7591
Instance: 28500; Time: 0.77s; loss: 44.5083; acc: 294180/385311=0.7635
Instance: 29000; Time: 0.72s; loss: 40.2403; acc: 300751/391890=0.7674
Instance: 29500; Time: 0.73s; loss: 22.6643; acc: 307435/398578=0.7713
Instance: 30000; Time: 0.77s; loss: 109.5681; acc: 314429/405583=0.7753
Instance: 30500; Time: 0.73s; loss: 15.8874; acc: 321077/412235=0.7789
Instance: 31000; Time: 0.79s; loss: 42.0999; acc: 328002/419166=0.7825
Instance: 31500; Time: 0.73s; loss: 51.8857; acc: 334597/425768=0.7859
Instance: 32000; Time: 0.82s; loss: 50.0649; acc: 341890/433072=0.7895
Instance: 32500; Time: 0.80s; loss: 129.4838; acc: 348910/440108=0.7928
Instance: 33000; Time: 0.79s; loss: 171.2889; acc: 355795/447015=0.7959
Instance: 33500; Time: 0.79s; loss: 155.8194; acc: 362839/454076=0.7991
Instance: 34000; Time: 0.73s; loss: 37.7461; acc: 369442/460685=0.8019
Instance: 34500; Time: 0.76s; loss: 110.5981; acc: 376426/467678=0.8049
Instance: 35000; Time: 0.81s; loss: 41.8031; acc: 383561/474819=0.8078
Instance: 35500; Time: 0.80s; loss: 18.3619; acc: 390800/482061=0.8107
Instance: 36000; Time: 0.76s; loss: 49.1879; acc: 397375/488644=0.8132
Instance: 36500; Time: 0.76s; loss: 41.6815; acc: 404113/495386=0.8158
Instance: 37000; Time: 0.77s; loss: 157.9075; acc: 410982/502268=0.8183
Instance: 37500; Time: 0.77s; loss: 181.8122; acc: 417884/509202=0.8207
Instance: 38000; Time: 0.71s; loss: 117.4134; acc: 424305/515634=0.8229
Instance: 38500; Time: 0.77s; loss: 19.4450; acc: 431182/522512=0.8252
Instance: 39000; Time: 0.77s; loss: 100.8025; acc: 437878/529223=0.8274
Instance: 39500; Time: 0.77s; loss: 13.4660; acc: 444810/536159=0.8296
Instance: 39999; Time: 0.77s; loss: 59.1358; acc: 451789/543142=0.8318
Epoch: 0 training finished. Time: 60.55s, speed: 660.64st/s, total loss: 238730.19521450996
totalloss: 238730.19521450996
Right token = 32287 All token = 32315 acc = 0.9991335293207488
Dev: time: 1.98s, speed: 1230.65st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Exceed previous best f score: -10
Save current best model in file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn.0.model
...
...
...
Instance: 35500; Time: 0.76s; loss: 70.0287; acc: 481987/482390=0.9992
Instance: 36000; Time: 0.78s; loss: 25.3078; acc: 489048/489455=0.9992
Instance: 36500; Time: 0.72s; loss: 35.9397; acc: 495493/495907=0.9992
Instance: 37000; Time: 0.77s; loss: 10.8092; acc: 502462/502879=0.9992
Instance: 37500; Time: 0.78s; loss: 59.3640; acc: 509334/509759=0.9992
Instance: 38000; Time: 0.70s; loss: 14.0096; acc: 515855/516282=0.9992
Instance: 38500; Time: 0.73s; loss: 7.5072; acc: 522564/522993=0.9992
Instance: 39000; Time: 0.76s; loss: 19.6130; acc: 529289/529722=0.9992
Instance: 39500; Time: 0.77s; loss: 40.5207; acc: 536107/536545=0.9992
Instance: 39999; Time: 0.75s; loss: 48.6918; acc: 542695/543142=0.9992
Epoch: 8 training finished. Time: 60.34s, speed: 662.91st/s, total loss: 2759.536283969879
totalloss: 2759.536283969879
Right token = 32287 All token = 32315 acc = 0.9991335293207488
Dev: time: 1.99s, speed: 1225.10st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63590 All token = 63622 acc = 0.9994970293294773
Test: time: 4.08s, speed: 1166.49st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Epoch: 9/100
Learning rate is set as: 0.010344827586206896
Shuffle: first input word list: [4022, 2988, 253, 81, 254]
Instance: 500; Time: 0.77s; loss: 26.6937; acc: 7000/7005=0.9993
Instance: 1000; Time: 0.78s; loss: 27.5530; acc: 13928/13941=0.9991
Instance: 1500; Time: 0.75s; loss: 14.3770; acc: 20777/20792=0.9993
Instance: 2000; Time: 0.75s; loss: 46.9566; acc: 27451/27471=0.9993
Instance: 2500; Time: 0.74s; loss: 11.8214; acc: 34067/34089=0.9994
Instance: 3000; Time: 0.78s; loss: 44.0794; acc: 41062/41090=0.9993
Instance: 3500; Time: 0.71s; loss: 9.0670; acc: 47483/47512=0.9994
Instance: 4000; Time: 0.78s; loss: 52.0679; acc: 54581/54615=0.9994
Instance: 4500; Time: 0.77s; loss: 88.8468; acc: 61115/61161=0.9992
Instance: 5000; Time: 0.75s; loss: 20.6342; acc: 67838/67888=0.9993
...
...
...
Instance: 30000; Time: 0.76s; loss: 20.8148; acc: 406746/406922=0.9996
Instance: 30500; Time: 0.75s; loss: 7.8250; acc: 413750/413929=0.9996
Instance: 31000; Time: 0.72s; loss: 10.9805; acc: 420162/420344=0.9996
Instance: 31500; Time: 0.78s; loss: 25.2875; acc: 426838/427023=0.9996
Instance: 32000; Time: 0.73s; loss: 9.8244; acc: 433892/434078=0.9996
Instance: 32500; Time: 0.79s; loss: 7.2981; acc: 440924/441114=0.9996
Instance: 33000; Time: 0.70s; loss: 19.7985; acc: 447337/447533=0.9996
Instance: 33500; Time: 0.73s; loss: 28.0911; acc: 454349/454551=0.9996
Instance: 34000; Time: 0.75s; loss: 3.3515; acc: 461479/461681=0.9996
Instance: 34500; Time: 0.72s; loss: 2.3600; acc: 468072/468274=0.9996
Instance: 35000; Time: 0.75s; loss: 14.8842; acc: 474665/474870=0.9996
Instance: 35500; Time: 0.71s; loss: 3.6203; acc: 481233/481439=0.9996
Instance: 36000; Time: 0.71s; loss: 9.3795; acc: 487801/488010=0.9996
Instance: 36500; Time: 0.74s; loss: 4.0125; acc: 494715/494925=0.9996
Instance: 37000; Time: 0.72s; loss: 6.7776; acc: 501575/501787=0.9996
Instance: 37500; Time: 0.75s; loss: 11.1597; acc: 508596/508809=0.9996
Instance: 38000; Time: 0.75s; loss: 16.0169; acc: 515486/515702=0.9996
Instance: 38500; Time: 0.73s; loss: 11.6874; acc: 522166/522385=0.9996
Instance: 39000; Time: 0.74s; loss: 25.5515; acc: 529107/529333=0.9996
Instance: 39500; Time: 0.78s; loss: 14.5840; acc: 536122/536351=0.9996
Instance: 39999; Time: 0.75s; loss: 7.3349; acc: 542911/543142=0.9996
Epoch: 99 training finished. Time: 59.62s, speed: 670.94st/s, total loss: 854.5173244476318
totalloss: 854.5173244476318
Right token = 32284 All token = 32315 acc = 0.9990406931765434
Dev: time: 2.00s, speed: 1218.49st/s; acc: 0.9990, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63580 All token = 63622 acc = 0.9993398509949388
Test: time: 4.07s, speed: 1167.93st/s; acc: 0.9993, p: -1.0000, r: -1.0000, f: -1.0000
real 111m21.157s
user 110m30.263s
sys 0m31.132s
check the GPU status during training time ...
It looks like only one GPU (GPU 0) is being used. I should find out how to set the number of GPUs in the config file ...
Every 2.0s: nvidia-smi gpu.cadt.edu.kh: Fri Dec 16 21:13:38 2022
Fri Dec 16 21:13:38 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03 Driver Version: 470.161.03 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:0A:00.0 Off | N/A |
| 46% 51C P2 62W / 300W | 908MiB / 11019MiB | 18% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:42:00.0 Off | N/A |
| 0% 54C P8 21W / 257W | 3MiB / 11019MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:43:00.0 Off | N/A |
| 33% 44C P8 29W / 250W | 3MiB / 11016MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2702014 C python 905MiB |
+-----------------------------------------------------------------------------+
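As far as I can tell from the config options, gpu=True is just an on/off flag and NCRFpp runs on a single GPU. One way to choose which GPU the process uses is the CUDA_VISIBLE_DEVICES environment variable, which must be set before torch is imported. A sketch:

```python
import os

# Restrict the process to GPU 1 only; torch will then see it as device 0.
# This must run before `import torch` to take effect.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "1")
```

Equivalently, from the shell: `CUDA_VISIBLE_DEVICES=1 python main.py --config ...`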
check the model file and output files:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-model$ ls
wordlstm-charcnn.0.model wordlstm-charcnn.dset
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-model$
decode config file:
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-charcnn.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn.0.model
testing ...
Seed num: 42
MODEL: decode
/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
I/O:
Start Sequence Laebling task...
Tag scheme: NoSeg
Split token: |||
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Word alphabet size: 31439
Char alphabet size: 274
Label alphabet size: 5
Word embedding dir: None
Char embedding dir: None
Word embedding size: 50
Char embedding size: 30
Norm word emb: False
Norm char emb: False
Train file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
Dev file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
Test file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
Raw file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
Dset file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn.dset
Model file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn
Loadmodel directory: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn.0.model
Decode file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-charcnn.hyp
Train instance number: 39999
Dev instance number: 2414
Test instance number: 4712
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: False
Model word extractor: LSTM
Model use_char: True
Model char extractor: CNN
Model char_hidden_dim: 50
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 100
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:4.18s, speed:1137.11st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
The hyp file could not be written and stayed empty:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ wc wordlstm-charcnn.hyp
0 0 0 wordlstm-charcnn.hyp
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
Error message is as follows:
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:4.22s, speed:1128.19st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Traceback (most recent call last):
File "main.py", line 568, in <module>
data.write_decoded_results(decode_results, 'raw')
File "/home/yekyaw.thu/tool/NCRFpp/utils/data.py", line 334, in write_decoded_results
fout.write(content_list[idx][0][idy].encode('utf-8') + " " + predict_results[idx][idy] + '\n')
TypeError: can't concat str to bytes
real 0m12.807s
user 0m10.013s
sys 0m3.177s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
I debugged the code as follows: in utils/data.py, line 334, I commented out the original statement and removed the .encode('utf-8') call (a Python 2 leftover):
334 #fout.write(content_list[idx][0][idy].encode('utf-8') + " " + predict_results[idx][idy] + '\n')
335 fout.write(content_list[idx][0][idy] + " " + predict_results[idx][idy] + '\n')
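The TypeError above is a classic Python 2 to Python 3 porting issue: in Python 2, str.encode('utf-8') returned a str, so concatenating it with " " worked, but in Python 3 it returns bytes, and bytes + str raises. A minimal reproduction and the equivalent of the one-line fix:

```python
# Reproduce the "can't concat str to bytes" error from write_decoded_results,
# using a sample token from the hyp output above.
token = "အခု"
label = "B"

try:
    line = token.encode("utf-8") + " " + label + "\n"   # Python 2 idiom
except TypeError as e:
    print(e)   # bytes + str is not allowed in Python 3

# Python 3 fix: keep everything as str; the file object (opened in text
# mode) takes care of encoding on write.
line = token + " " + label + "\n"
print(line, end="")
```

If the file must be UTF-8 regardless of locale, opening it with open(path, "w", encoding="utf-8") is the safer text-mode equivalent of the old manual .encode('utf-8').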
test again ...
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charcnn
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:4.19s, speed:1135.83st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-charcnn.hyp
real 0m12.621s
user 0m9.839s
sys 0m2.980s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
check the hypothesis output file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ wc wordlstm-charcnn.hyp
68334 127244 1051379 wordlstm-charcnn.hyp
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
check the hyp file content:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head -n 30 ./wordlstm-charcnn.hyp
အခု B
သန့်စင်ခန်း N
ကကို N
သသုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပပုံစံတကျ O
ရှိ O
မှု O
ကကို N
မ N
ကြြိုက် N
ဘူး E
ဒီ B
တစ် O
ခေါက် O
ကိစ္စ O
ကြောင့် O
ကျွန်တော့် O
ရရဲ့ O
သိက္ခာ O
အဖတ်ဆယ် O
လလိလို့ O
မ O
ရ O
အောင် N
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
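As a sanity check on the decoded output, the token-level accuracy NCRFpp prints can be reproduced independently. This is a hedged sketch: it assumes both the gold test.col and the hyp file use the "token TAG" one-pair-per-line format shown above, with blank lines as sentence separators, and that the two files are token-aligned.

```python
def read_tags(path):
    """Return the tag column (last whitespace field) of a column file,
    skipping blank sentence-separator lines."""
    tags = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if parts:                       # skip sentence boundaries
                tags.append(parts[-1])
    return tags

def accuracy(gold_path, hyp_path):
    """Token-level accuracy between gold and hypothesis tag columns."""
    gold, hyp = read_tags(gold_path), read_tags(hyp_path)
    assert len(gold) == len(hyp), "token count mismatch"
    right = sum(g == h for g, h in zip(gold, hyp))
    return right / len(gold)

# e.g. accuracy("data/sent/test.col", "wordlstm-charcnn.hyp")
# should reproduce the logged "Right token / All token" figure.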
update the config file:
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charlstm
###NetworkConfiguration###
use_crf=False
use_char=True
word_seq_feature=LSTM
char_seq_feature=LSTM
training log ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.char-lstm.train.config | tee ./mysent-model/wordlstm-charlstm.training.log
...
...
...
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/home/yekyaw.thu/.conda/envs/ncrfpp/lib/python3.8/site-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
build sequence labeling network...
use_char: True
char feature extractor: LSTM
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: LSTM ...
Epoch: 0/100
Learning rate is set as: 0.015
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 0.98s; loss: 6763.9298; acc: 4315/6598=0.6540
Instance: 1000; Time: 0.99s; loss: 1224.9703; acc: 10795/13463=0.8018
Instance: 1500; Time: 0.96s; loss: 573.6215; acc: 17310/20100=0.8612
Instance: 2000; Time: 0.99s; loss: 255.3739; acc: 24278/27109=0.8956
Instance: 2500; Time: 0.96s; loss: 150.8403; acc: 30765/33628=0.9149
Instance: 3000; Time: 0.98s; loss: 82.3629; acc: 37755/40631=0.9292
Instance: 3500; Time: 0.98s; loss: 60.5309; acc: 44862/47745=0.9396
Instance: 4000; Time: 1.02s; loss: 48.2506; acc: 52091/54983=0.9474
Instance: 4500; Time: 0.96s; loss: 70.9716; acc: 58559/61461=0.9528
Instance: 5000; Time: 0.96s; loss: 127.8899; acc: 65131/68047=0.9571
Instance: 5500; Time: 1.03s; loss: 68.3554; acc: 72496/75417=0.9613
Instance: 6000; Time: 0.95s; loss: 37.4366; acc: 79385/82310=0.9645
...
...
...
Instance: 34000; Time: 0.99s; loss: 14.4887; acc: 461272/461480=0.9995
Instance: 34500; Time: 1.01s; loss: 8.4303; acc: 468217/468427=0.9996
Instance: 35000; Time: 0.97s; loss: 4.9089; acc: 475359/475571=0.9996
Instance: 35500; Time: 0.92s; loss: 13.4792; acc: 481895/482110=0.9996
Instance: 36000; Time: 0.99s; loss: 8.3643; acc: 488674/488892=0.9996
Instance: 36500; Time: 0.97s; loss: 2.2129; acc: 495312/495530=0.9996
Instance: 37000; Time: 0.99s; loss: 8.1948; acc: 502248/502468=0.9996
Instance: 37500; Time: 1.00s; loss: 3.1468; acc: 509004/509224=0.9996
Instance: 38000; Time: 0.99s; loss: 8.3233; acc: 515815/516037=0.9996
Instance: 38500; Time: 0.99s; loss: 28.5366; acc: 522582/522811=0.9996
Instance: 39000; Time: 1.01s; loss: 41.0298; acc: 529512/529755=0.9995
Instance: 39500; Time: 0.97s; loss: 17.6814; acc: 536158/536405=0.9995
Instance: 39999; Time: 0.96s; loss: 5.8083; acc: 542892/543142=0.9995
Epoch: 87 training finished. Time: 77.25s, speed: 517.79st/s, total loss: 915.7552361488342
totalloss: 915.7552361488342
Right token = 32283 All token = 32315 acc = 0.9990097477951416
Dev: time: 2.33s, speed: 1046.35st/s; acc: 0.9990, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63579 All token = 63622 acc = 0.999324133161485
Test: time: 4.72s, speed: 1005.99st/s; acc: 0.9993, p: -1.0000, r: -1.0000, f: -1.0000
Epoch: 88/100
Learning rate is set as: 0.0027777777777777775
Shuffle: first input word list: [1628, 2225, 798, 42, 233, 290, 53, 855, 27796, 3702, 98, 53, 798, 44, 49, 47, 855, 10091, 2514, 798, 44, 222, 114, 3381, 119, 134, 745, 42]
Instance: 500; Time: 0.95s; loss: 17.5519; acc: 6557/6562=0.9992
Instance: 1000; Time: 0.91s; loss: 21.6322; acc: 13005/13013=0.9994
Instance: 1500; Time: 0.99s; loss: 4.8008; acc: 19909/19918=0.9995
Instance: 2000; Time: 0.98s; loss: 9.4800; acc: 26477/26491=0.9995
Instance: 2500; Time: 0.96s; loss: 14.8256; acc: 33615/33632=0.9995
Instance: 3000; Time: 0.94s; loss: 7.8938; acc: 40177/40198=0.9995
Instance: 3500; Time: 0.95s; loss: 2.4266; acc: 46866/46887=0.9996
...
...
...
Instance: 28000; Time: 0.98s; loss: 20.8923; acc: 379376/379527=0.9996
Instance: 28500; Time: 1.00s; loss: 2.7063; acc: 386696/386847=0.9996
Instance: 29000; Time: 0.96s; loss: 5.3790; acc: 393165/393317=0.9996
Instance: 29500; Time: 0.93s; loss: 16.8582; acc: 399616/399772=0.9996
Instance: 30000; Time: 0.97s; loss: 18.0675; acc: 406763/406922=0.9996
Instance: 30500; Time: 0.96s; loss: 9.8058; acc: 413767/413929=0.9996
Instance: 31000; Time: 0.94s; loss: 8.0381; acc: 420180/420344=0.9996
Instance: 31500; Time: 0.98s; loss: 19.3071; acc: 426855/427023=0.9996
Instance: 32000; Time: 0.99s; loss: 6.0914; acc: 433908/434078=0.9996
Instance: 32500; Time: 1.04s; loss: 4.9224; acc: 440942/441114=0.9996
Instance: 33000; Time: 0.94s; loss: 19.6826; acc: 447354/447533=0.9996
Instance: 33500; Time: 0.96s; loss: 26.5260; acc: 454367/454551=0.9996
Instance: 34000; Time: 0.99s; loss: 2.0317; acc: 461497/461681=0.9996
Instance: 34500; Time: 0.97s; loss: 3.3676; acc: 468089/468274=0.9996
Instance: 35000; Time: 1.00s; loss: 17.3917; acc: 474681/474870=0.9996
Instance: 35500; Time: 0.96s; loss: 2.8269; acc: 481250/481439=0.9996
Instance: 36000; Time: 0.95s; loss: 7.2310; acc: 487818/488010=0.9996
Instance: 36500; Time: 0.98s; loss: 1.7408; acc: 494733/494925=0.9996
Instance: 37000; Time: 0.96s; loss: 13.4167; acc: 501590/501787=0.9996
Instance: 37500; Time: 1.00s; loss: 8.8884; acc: 508611/508809=0.9996
Instance: 38000; Time: 0.96s; loss: 14.8639; acc: 515501/515702=0.9996
Instance: 38500; Time: 0.94s; loss: 10.9465; acc: 522181/522385=0.9996
Instance: 39000; Time: 1.00s; loss: 21.4621; acc: 529122/529333=0.9996
Instance: 39500; Time: 1.00s; loss: 19.4401; acc: 536137/536351=0.9996
Instance: 39999; Time: 0.94s; loss: 10.5016; acc: 542925/543142=0.9996
Epoch: 99 training finished. Time: 77.20s, speed: 518.13st/s, total loss: 828.0619547367096
totalloss: 828.0619547367096
Right token = 32283 All token = 32315 acc = 0.9990097477951416
Dev: time: 2.32s, speed: 1049.03st/s; acc: 0.9990, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63579 All token = 63622 acc = 0.999324133161485
Test: time: 4.72s, speed: 1005.92st/s; acc: 0.9993, p: -1.0000, r: -1.0000, f: -1.0000
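The "Learning rate is set as" lines in the log are consistent with NCRFpp's per-epoch SGD decay rule, lr = init_lr / (1 + lr_decay * epoch), with the configured init_lr=0.015 and lr_decay=0.05. A quick check against the logged values:

```python
def decayed_lr(init_lr, lr_decay, epoch):
    """NCRFpp-style learning rate decay: lr = init_lr / (1 + lr_decay * epoch)."""
    return init_lr / (1 + lr_decay * epoch)

print(decayed_lr(0.015, 0.05, 0))    # Epoch 0: 0.015
print(decayed_lr(0.015, 0.05, 88))   # Epoch 88: ~0.0027778, matching the log
```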
prepare decoding configuration file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat word-lstm.char-lstm.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-charlstm.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charlstm.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charlstm.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
manual decoding or testing ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.char-lstm.decode.config | tee ./mysent-model/wordlstm-charlstm.decode.log
...
...
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charlstm
build sequence labeling network...
use_char: True
char feature extractor: LSTM
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: LSTM ...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:4.96s, speed:957.04st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-charlstm.hyp
real 0m19.159s
user 0m11.255s
sys 0m3.997s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
The whole decoding log:
Seed num: 42
MODEL: decode
/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
I/O:
Start Sequence Laebling task...
Tag scheme: NoSeg
Split token: |||
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Word alphabet size: 31439
Char alphabet size: 274
Label alphabet size: 5
Word embedding dir: None
Char embedding dir: None
Word embedding size: 50
Char embedding size: 30
Norm word emb: False
Norm char emb: False
Train file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
Dev file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
Test file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
Raw file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
Dset file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charlstm.dset
Model file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charlstm
Loadmodel directory: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charlstm.0.model
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-charlstm
build sequence labeling network...
use_char: True
char feature extractor: LSTM
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: LSTM ...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:4.96s, speed:957.04st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-charlstm.hyp
check the hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head ./wordlstm-charlstm.hyp
အခု B
သန့်စင်ခန်း N
ကို N
သုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
updated the config file as follows:
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-nochar
...
...
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
start training ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.no-char.train.config | tee ./mysent-model/wordlstm-nochar.training.log
...
...
...
Instance: 11500; Time: 0.68s; loss: 45.9094; acc: 153897/156911=0.9808
Instance: 12000; Time: 0.67s; loss: 17.9875; acc: 160434/163450=0.9815
Instance: 12500; Time: 0.69s; loss: 32.1635; acc: 167086/170107=0.9822
Instance: 13000; Time: 0.67s; loss: 62.4728; acc: 174158/177185=0.9829
Instance: 13500; Time: 0.75s; loss: 31.4787; acc: 180967/183998=0.9835
Instance: 14000; Time: 0.66s; loss: 38.0288; acc: 187571/190606=0.9841
Instance: 14500; Time: 0.61s; loss: 62.3357; acc: 193982/197025=0.9846
Instance: 15000; Time: 0.61s; loss: 33.2049; acc: 200684/203732=0.9850
Instance: 15500; Time: 0.60s; loss: 101.5641; acc: 207522/210579=0.9855
Instance: 16000; Time: 0.58s; loss: 23.6539; acc: 214113/217171=0.9859
Instance: 16500; Time: 0.60s; loss: 74.6310; acc: 220675/223739=0.9863
Instance: 17000; Time: 0.59s; loss: 81.2837; acc: 227171/230242=0.9867
Instance: 17500; Time: 0.60s; loss: 68.2723; acc: 233787/236869=0.9870
Instance: 18000; Time: 0.62s; loss: 55.4200; acc: 240566/243654=0.9873
Instance: 18500; Time: 0.61s; loss: 49.0099; acc: 247439/250533=0.9877
Instance: 19000; Time: 0.64s; loss: 59.5998; acc: 254512/257613=0.9880
Instance: 19500; Time: 0.60s; loss: 36.2397; acc: 261064/264169=0.9882
Instance: 20000; Time: 0.58s; loss: 42.1274; acc: 267386/270497=0.9885
Instance: 20500; Time: 0.61s; loss: 87.3023; acc: 274263/277384=0.9887
Instance: 21000; Time: 0.56s; loss: 34.6933; acc: 280519/283647=0.9890
Instance: 21500; Time: 0.62s; loss: 47.9778; acc: 287375/290511=0.9892
Instance: 22000; Time: 0.60s; loss: 21.2712; acc: 294173/297313=0.9894
Instance: 22500; Time: 0.60s; loss: 56.4415; acc: 300649/303798=0.9896
Instance: 23000; Time: 0.64s; loss: 50.0748; acc: 307477/310636=0.9898
Instance: 23500; Time: 0.65s; loss: 25.8181; acc: 314468/317630=0.9900
...
...
...
Instance: 30000; Time: 0.69s; loss: 10.2924; acc: 406760/406922=0.9996
Instance: 30500; Time: 0.69s; loss: 5.7964; acc: 413764/413929=0.9996
Instance: 31000; Time: 0.64s; loss: 8.3327; acc: 420177/420344=0.9996
Instance: 31500; Time: 0.72s; loss: 12.8702; acc: 426853/427023=0.9996
Instance: 32000; Time: 0.68s; loss: 10.7969; acc: 433905/434078=0.9996
Instance: 32500; Time: 0.74s; loss: 4.2581; acc: 440939/441114=0.9996
Instance: 33000; Time: 0.64s; loss: 13.3785; acc: 447353/447533=0.9996
Instance: 33500; Time: 0.68s; loss: 20.2858; acc: 454366/454551=0.9996
Instance: 34000; Time: 0.69s; loss: 4.9878; acc: 461495/461681=0.9996
Instance: 34500; Time: 0.66s; loss: 5.6717; acc: 468086/468274=0.9996
Instance: 35000; Time: 0.68s; loss: 21.4708; acc: 474678/474870=0.9996
Instance: 35500; Time: 0.65s; loss: 3.4203; acc: 481245/481439=0.9996
Instance: 36000; Time: 0.64s; loss: 9.2008; acc: 487814/488010=0.9996
Instance: 36500; Time: 0.69s; loss: 1.8054; acc: 494729/494925=0.9996
Instance: 37000; Time: 0.66s; loss: 14.5209; acc: 501587/501787=0.9996
Instance: 37500; Time: 0.71s; loss: 11.7103; acc: 508607/508809=0.9996
Instance: 38000; Time: 0.70s; loss: 9.9241; acc: 515498/515702=0.9996
Instance: 38500; Time: 0.68s; loss: 10.9779; acc: 522178/522385=0.9996
Instance: 39000; Time: 0.69s; loss: 15.0097; acc: 529120/529333=0.9996
Instance: 39500; Time: 0.69s; loss: 16.2991; acc: 536134/536351=0.9996
Instance: 39999; Time: 0.68s; loss: 5.6317; acc: 542922/543142=0.9996
Epoch: 99 training finished. Time: 54.51s, speed: 733.80st/s, total loss: 746.9416055679321
totalloss: 746.9416055679321
Right token = 32284 All token = 32315 acc = 0.9990406931765434
Dev: time: 1.95s, speed: 1252.95st/s; acc: 0.9990, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63578 All token = 63622 acc = 0.9993084153280312
Test: time: 3.99s, speed: 1193.74st/s; acc: 0.9993, p: -1.0000, r: -1.0000, f: -1.0000
real 101m55.815s
user 101m18.082s
sys 0m29.634s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
Check the models ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-model$ ls
bk wordlstm-charlstm.0.model wordlstm-charlstm.training.log wordlstm-nochar.training.log
wordlstm-charcnn.0.model wordlstm-charlstm.decode.log wordlstm-nochar.0.model
wordlstm-charcnn.dset wordlstm-charlstm.dset wordlstm-nochar.dset
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-model$
Prepare configuration file for decoding:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat ./word-lstm.no-char.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-nochar.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-nochar.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-nochar.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
manual decoding ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.no-char.decode.config | tee ./mysent-model/wordlstm-nochar.decode.log
...
...
...
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-nochar
build sequence labeling network...
use_char: False
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:3.95s, speed:1204.46st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-nochar.hyp
real 0m12.364s
user 0m9.567s
sys 0m3.147s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
check the output hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head ./wordlstm-nochar.hyp
အခု B
သန့်စင်ခန်း N
ကို N
သုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
preparing config file:
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-charcnn
...
###NetworkConfiguration###
use_crf=True
use_char=True
word_seq_feature=LSTM
char_seq_feature=CNN
training ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.crf.char-cnn.train.config | tee ./mysent-model/wordlstm-crf-charcnn.training.log
...
...
...
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: LSTM
use crf: True
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
build CRF...
Epoch: 0/100
Learning rate is set as: 0.015
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 2.63s; loss: 2854.2836; acc: 4935/6598=0.7480
Instance: 1000; Time: 2.62s; loss: 1355.7439; acc: 10843/13463=0.8054
...
...
...
Instance: 30500; Time: 2.58s; loss: 8.3659; acc: 413707/413929=0.9995
Instance: 31000; Time: 2.42s; loss: 18.4387; acc: 420118/420344=0.9995
Instance: 31500; Time: 2.63s; loss: 16.5659; acc: 426794/427023=0.9995
Instance: 32000; Time: 2.55s; loss: 17.0895; acc: 433845/434078=0.9995
Instance: 32500; Time: 2.74s; loss: 14.4164; acc: 440875/441114=0.9995
Instance: 33000; Time: 2.35s; loss: 22.5720; acc: 447285/447533=0.9994
Instance: 33500; Time: 2.51s; loss: 23.6950; acc: 454298/454551=0.9994
Instance: 34000; Time: 2.53s; loss: 5.4600; acc: 461427/461681=0.9994
Instance: 34500; Time: 2.40s; loss: 12.0142; acc: 468017/468274=0.9995
Instance: 35000; Time: 2.51s; loss: 22.8959; acc: 474610/474870=0.9995
Instance: 35500; Time: 2.27s; loss: 7.5930; acc: 481177/481439=0.9995
Instance: 36000; Time: 2.31s; loss: 19.3913; acc: 487743/488010=0.9995
Instance: 36500; Time: 2.47s; loss: 2.4894; acc: 494658/494925=0.9995
Instance: 37000; Time: 2.43s; loss: 8.1326; acc: 501518/501787=0.9995
Instance: 37500; Time: 2.56s; loss: 14.0385; acc: 508537/508809=0.9995
Instance: 38000; Time: 2.50s; loss: 11.3989; acc: 515426/515702=0.9995
Instance: 38500; Time: 2.41s; loss: 10.7767; acc: 522106/522385=0.9995
Instance: 39000; Time: 2.52s; loss: 28.8738; acc: 529046/529333=0.9995
Instance: 39500; Time: 2.54s; loss: 22.3990; acc: 536061/536351=0.9995
Instance: 39999; Time: 2.50s; loss: 8.9142; acc: 542850/543142=0.9995
Epoch: 99 training finished. Time: 198.87s, speed: 201.13st/s, total loss: 1062.7795715332031
totalloss: 1062.7795715332031
Right token = 32285 All token = 32315 acc = 0.9990716385579452
Dev: time: 2.87s, speed: 846.90st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63580 All token = 63622 acc = 0.9993398509949388
Test: time: 5.86s, speed: 809.12st/s; acc: 0.9993, p: -1.0000, r: -1.0000, f: -1.0000
real 347m13.584s
user 346m31.274s
sys 0m31.970s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
During training, I checked the GPU status with watch nvidia-smi:
Every 2.0s: nvidia-smi gpu.cadt.edu.kh: Sat Dec 17 16:04:18 2022
Sat Dec 17 16:04:18 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03 Driver Version: 470.161.03 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:0A:00.0 Off | N/A |
| 45% 52C P2 61W / 300W | 908MiB / 11019MiB | 17% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:42:00.0 Off | N/A |
| 8% 51C P8 21W / 257W | 3MiB / 11019MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:43:00.0 Off | N/A |
| 35% 51C P8 29W / 250W | 3MiB / 11016MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2715558 C python 905MiB |
+-----------------------------------------------------------------------------+
prepare config file for decoding ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat word-lstm.crf.char-cnn.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-crf-charcnn.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-charcnn.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-charcnn.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
manual testing for wordlstm-crf-charcnn model ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.crf.char-cnn.decode.config | tee ./mysent-model/wordlstm-crf-charcnn.decode.log
...
...
...
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-charcnn
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: LSTM
use crf: True
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
build CRF...
Decode raw data, nbest: None ...
Right token = 58547 All token = 63622 acc = 0.9202319952217787
raw: time:5.92s, speed:801.76st/s; acc: 0.9202, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-crf-charcnn.hyp
real 0m15.207s
user 0m11.987s
sys 0m3.684s
check the output hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head -n 30 ./wordlstm-crf-charcnn.hyp
အခု B
သန့်စင်ခန်း N
ကကို N
သသုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပပုံစံတကျ O
ရှိ O
မှု O
ကကို O
မ O
ကြြိုက် N
ဘူး E
ဒီ B
တစ် O
ခေါက် O
ကိစ္စ O
ကြောင့် O
ကျွန်တော့် O
ရရဲ့ O
သိက္ခာ O
အဖတ်ဆယ် O
လလိလို့ O
မ O
ရ O
အောင် O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
check all hyp files so far:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ wc *.hyp
68334 127244 1051379 wordlstm-charcnn.hyp
68334 127244 1051379 wordlstm-charlstm.hyp
68334 127244 1051379 wordlstm-crf-charcnn.hyp
68334 127244 1051379 wordlstm-nochar.hyp
273336 508976 4205516 total
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
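Since all four hyp files have identical line counts (68334 each, per the wc output above), they can be compared line by line to see where the models actually disagree on tags. A hedged sketch, assuming the "token TAG" format shown earlier:

```python
from itertools import combinations

def read_tag_column(path):
    """Last whitespace-separated field per non-blank line (the tag)."""
    with open(path, encoding="utf-8") as f:
        return [ln.split()[-1] for ln in f if ln.split()]

def pairwise_disagreements(paths):
    """Print the number of differing tags for each pair of
    line-aligned hypothesis files."""
    for a, b in combinations(paths, 2):
        ta, tb = read_tag_column(a), read_tag_column(b)
        diff = sum(x != y for x, y in zip(ta, tb))
        print(f"{a} vs {b}: {diff} differing tags")

# e.g. pairwise_disagreements(["wordlstm-charcnn.hyp", "wordlstm-charlstm.hyp",
#                              "wordlstm-crf-charcnn.hyp", "wordlstm-nochar.hyp"])
```

Given the visibly different head outputs of wordlstm-crf-charcnn.hyp, this makes it easy to locate exactly which tokens that model labels differently.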
prepare config file:
### I/O ###
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-charlstm
...
...
###NetworkConfiguration###
use_crf=True
use_char=True
word_seq_feature=LSTM
char_seq_feature=LSTM
training start ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.crf.char-lstm.train.config | tee ./mysent-model/wordlstm-crf-charlstm.training.log
...
...
Iteration: 100
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build sequence labeling network...
use_char: True
char feature extractor: LSTM
word feature extractor: LSTM
use crf: True
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: LSTM ...
build CRF...
Epoch: 0/100
Learning rate is set as: 0.015
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 2.54s; loss: 2852.0717; acc: 4747/6598=0.7195
Instance: 1000; Time: 2.66s; loss: 1393.6742; acc: 10592/13463=0.7867
...
...
...
Instance: 30500; Time: 2.66s; loss: 8.9390; acc: 413727/413929=0.9995
Instance: 31000; Time: 2.51s; loss: 17.0309; acc: 420138/420344=0.9995
Instance: 31500; Time: 2.74s; loss: 21.9109; acc: 426813/427023=0.9995
Instance: 32000; Time: 2.66s; loss: 10.6733; acc: 433866/434078=0.9995
Instance: 32500; Time: 2.85s; loss: 10.1479; acc: 440898/441114=0.9995
Instance: 33000; Time: 2.50s; loss: 15.6271; acc: 447312/447533=0.9995
Instance: 33500; Time: 2.59s; loss: 22.0999; acc: 454325/454551=0.9995
Instance: 34000; Time: 2.65s; loss: 4.1620; acc: 461453/461681=0.9995
Instance: 34500; Time: 2.52s; loss: 6.2731; acc: 468044/468274=0.9995
Instance: 35000; Time: 2.68s; loss: 24.2872; acc: 474637/474870=0.9995
Instance: 35500; Time: 2.50s; loss: 6.6882; acc: 481203/481439=0.9995
Instance: 36000; Time: 2.49s; loss: 25.8972; acc: 487768/488010=0.9995
Instance: 36500; Time: 2.66s; loss: 2.4390; acc: 494683/494925=0.9995
Instance: 37000; Time: 2.49s; loss: 9.3456; acc: 501543/501787=0.9995
Instance: 37500; Time: 2.70s; loss: 15.8565; acc: 508563/508809=0.9995
Instance: 38000; Time: 2.63s; loss: 16.7104; acc: 515453/515702=0.9995
Instance: 38500; Time: 2.53s; loss: 10.2049; acc: 522133/522385=0.9995
Instance: 39000; Time: 2.69s; loss: 31.6379; acc: 529071/529333=0.9995
Instance: 39500; Time: 2.73s; loss: 15.4925; acc: 536084/536351=0.9995
Instance: 39999; Time: 2.58s; loss: 5.8837; acc: 542873/543142=0.9995
Epoch: 99 training finished. Time: 207.21s, speed: 193.03st/s, total loss: 1021.2640380859375
totalloss: 1021.2640380859375
Right token = 32285 All token = 32315 acc = 0.9990716385579452
Dev: time: 3.25s, speed: 746.70st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63576 All token = 63622 acc = 0.9992769796611235
Test: time: 6.64s, speed: 713.91st/s; acc: 0.9993, p: -1.0000, r: -1.0000, f: -1.0000
real 366m16.764s
user 365m24.837s
sys 0m42.253s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
Preparing decode config file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ cat ./mysent-config/word-lstm.crf.char-lstm.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-crf-charlstm.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-charlstm.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-charlstm.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
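Since a mistyped path in a decode config only fails at run time, it can be worth checking the paths up front. The config format is plain `key=value` with `#` comments, so a few lines suffice; this is an illustrative helper of my own (`parse_config` is not part of NCRFpp):

```python
# sanity-check an NCRFpp-style decode config before running main.py
# (illustrative helper; parse_config is my own, not part of NCRFpp)
import os

def parse_config(lines):
    """Parse key=value lines, ignoring blank lines and '#' comments."""
    conf = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf

# small inline sample mirroring the config above; on the real file use
#   parse_config(open("./mysent-config/word-lstm.crf.char-lstm.decode.config"))
sample = [
    "### Decode ###",
    "status=decode",
    "#nbest=10",
    "raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col",
]
conf = parse_config(sample)
for key in ("raw_dir", "dset_dir", "load_model_dir"):
    if key in conf:
        print(key, "OK" if os.path.exists(conf[key]) else "MISSING", conf[key])
```

Note that commented-out keys like `#nbest=10` are ignored, which is why the decode log below reports `nbest: None`.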
Manual testing ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.crf.char-lstm.decode.config | tee ./mysent-model/wordlstm-crf-charlstm.decode.log
...
...
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-charlstm
build sequence labeling network...
use_char: True
char feature extractor: LSTM
word feature extractor: LSTM
use crf: True
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: LSTM ...
build CRF...
Decode raw data, nbest: None ...
Right token = 63563 All token = 63622 acc = 0.9990726478262236
raw: time:6.57s, speed:722.02st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-crf-charlstm.hyp
real 0m15.235s
user 0m12.522s
sys 0m3.079s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
check the hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ head ./mysent-hyp/wordlstm-crf-charlstm.hyp
အခု B
သန့်စင်ခန်း N
ကကို N
သသုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
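Because `raw_dir` here points at the labelled test set, the decode accuracy can be re-checked offline by comparing the tag columns of the hyp and gold files. A minimal sketch, assuming both are two-column (token, tag) files with one token per line; `read_tags` and `token_accuracy` are my own helpers, not NCRFpp code:

```python
# offline re-check of NCRFpp's "Right token / All token" accuracy
# (illustrative helpers, not part of NCRFpp)

def read_tags(path):
    """Collect the last column (the tag) of a two-column CoNLL-style file."""
    with open(path, encoding="utf-8") as f:
        return [line.split()[-1] for line in f if line.strip()]

def token_accuracy(gold, hyp):
    """Count positions where the two tag sequences agree."""
    assert len(gold) == len(hyp), "files must align token-for-token"
    right = sum(g == h for g, h in zip(gold, hyp))
    return right, len(gold), right / len(gold)

# tiny inline demo; on the real files the call would be
#   token_accuracy(read_tags(".../test.col"), read_tags(".../wordlstm-crf-charlstm.hyp"))
right, total, acc = token_accuracy(["B", "O", "O", "E"], ["B", "O", "N", "E"])
print(f"Right token = {right} All token = {total} acc = {acc}")  # 3/4 = 0.75
```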
Preparing config file:
### I/O ###
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-nochar
...
...
###NetworkConfiguration###
use_crf=True
use_char=False
word_seq_feature=LSTM
#char_seq_feature=LSTM
start training ...
...
...
...
Optimizer: SGD
Iteration: 100
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build sequence labeling network...
use_char: False
word feature extractor: LSTM
use crf: True
build word sequence feature extractor: LSTM...
build word representation...
build CRF...
Epoch: 0/100
Learning rate is set as: 0.015
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 2.34s; loss: 3149.8809; acc: 4772/6598=0.7232
Instance: 1000; Time: 2.50s; loss: 1330.4052; acc: 10747/13463=0.7983
Instance: 1500; Time: 2.38s; loss: 1028.0932; acc: 16734/20100=0.8325
...
...
...
Instance: 39000; Time: 2.50s; loss: 41.3326; acc: 529325/529691=0.9993
Instance: 39500; Time: 2.49s; loss: 25.9476; acc: 535969/536341=0.9993
Instance: 39999; Time: 2.48s; loss: 18.9208; acc: 542764/543142=0.9993
Epoch: 26 training finished. Time: 200.88s, speed: 199.12st/s, total loss: 1510.2309875488281
totalloss: 1510.2309875488281
Right token = 32286 All token = 32315 acc = 0.999102583939347
Dev: time: 2.76s, speed: 880.49st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63587 All token = 63622 acc = 0.9994498758291157
Test: time: 5.67s, speed: 837.40st/s; acc: 0.9994, p: -1.0000, r: -1.0000, f: -1.0000
Epoch: 27/100
Learning rate is set as: 0.006382978723404255
Shuffle: first input word list: [464, 1019, 45, 1918, 33, 26, 743, 102, 4, 37, 2402, 70, 12, 13]
Instance: 500; Time: 2.64s; loss: 29.0359; acc: 7113/7120=0.9990
Instance: 1000; Time: 2.50s; loss: 25.7385; acc: 14020/14034=0.9990
Instance: 1500; Time: 2.40s; loss: 20.6611; acc: 20551/20570=0.9991
Instance: 2000; Time: 2.49s; loss: 23.0771; acc: 27082/27105=0.9992
Instance: 2500; Time: 2.37s; loss: 22.4814; acc: 33565/33592=0.9992
Instance: 3000; Time: 2.51s; loss: 16.6531; acc: 40092/40123=0.9992
Instance: 3500; Time: 2.51s; loss: 9.8474; acc: 46945/46978=0.9993
Instance: 4000; Time: 2.48s; loss: 24.1730; acc: 53701/53738=0.9993
Instance: 4500; Time: 2.60s; loss: 17.0583; acc: 60530/60570=0.9993
Instance: 5000; Time: 2.63s; loss: 22.4983; acc: 67633/67675=0.9994
Instance: 5500; Time: 2.50s; loss: 8.9262; acc: 74514/74560=0.9994
Instance: 6000; Time: 2.45s; loss: 22.0184; acc: 81046/81094=0.9994
Instance: 6500; Time: 2.57s; loss: 16.4835; acc: 88030/88081=0.9994
Instance: 7000; Time: 2.37s; loss: 8.8923; acc: 94884/94938=0.9994
Instance: 7500; Time: 2.38s; loss: 10.2485; acc: 101885/101942=0.9994
Instance: 8000; Time: 2.45s; loss: 13.0707; acc: 109028/109089=0.9994
Instance: 8500; Time: 2.33s; loss: 5.8790; acc: 115951/116013=0.9995
Instance: 9000; Time: 2.55s; loss: 6.4045; acc: 122915/122978=0.9995
Instance: 9500; Time: 2.40s; loss: 13.6191; acc: 129379/129444=0.9995
...
...
...
Instance: 30000; Time: 2.54s; loss: 11.4768; acc: 406742/406922=0.9996
Instance: 30500; Time: 2.54s; loss: 5.7083; acc: 413747/413929=0.9996
Instance: 31000; Time: 2.44s; loss: 14.2428; acc: 420159/420344=0.9996
Instance: 31500; Time: 2.58s; loss: 7.7025; acc: 426835/427023=0.9996
Instance: 32000; Time: 2.54s; loss: 5.4664; acc: 433888/434078=0.9996
Instance: 32500; Time: 2.70s; loss: 4.4285; acc: 440922/441114=0.9996
Instance: 33000; Time: 2.38s; loss: 13.7225; acc: 447335/447533=0.9996
Instance: 33500; Time: 2.53s; loss: 16.0905; acc: 454350/454551=0.9996
Instance: 34000; Time: 2.52s; loss: 4.6945; acc: 461477/461681=0.9996
Instance: 34500; Time: 2.38s; loss: 4.5834; acc: 468069/468274=0.9996
Instance: 35000; Time: 2.46s; loss: 25.4389; acc: 474661/474870=0.9996
Instance: 35500; Time: 2.31s; loss: 3.7061; acc: 481229/481439=0.9996
Instance: 36000; Time: 2.28s; loss: 12.3723; acc: 487795/488010=0.9996
Instance: 36500; Time: 2.43s; loss: 3.0925; acc: 494710/494925=0.9996
Instance: 37000; Time: 2.33s; loss: 12.0607; acc: 501569/501787=0.9996
Instance: 37500; Time: 2.46s; loss: 10.4619; acc: 508589/508809=0.9996
Instance: 38000; Time: 2.49s; loss: 7.9403; acc: 515480/515702=0.9996
Instance: 38500; Time: 2.38s; loss: 8.8724; acc: 522160/522385=0.9996
Instance: 39000; Time: 2.43s; loss: 20.8378; acc: 529099/529333=0.9996
Instance: 39500; Time: 2.51s; loss: 13.6555; acc: 536113/536351=0.9996
Instance: 39999; Time: 2.41s; loss: 4.3625; acc: 542902/543142=0.9996
Epoch: 99 training finished. Time: 199.39s, speed: 200.61st/s, total loss: 782.2766418457031
totalloss: 782.2766418457031
Right token = 32285 All token = 32315 acc = 0.9990716385579452
Dev: time: 2.75s, speed: 884.41st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63580 All token = 63622 acc = 0.9993398509949388
Test: time: 5.63s, speed: 842.51st/s; acc: 0.9993, p: -1.0000, r: -1.0000, f: -1.0000
real 339m54.757s
user 339m12.320s
sys 0m31.127s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
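The per-epoch learning rates in the log above (0.015 at epoch 0, 0.006382978723404255 at epoch 27) are consistent with NCRFpp's simple decay schedule, lr = init_lr / (1 + lr_decay × epoch), which can be verified in a couple of lines:

```python
def decayed_lr(init_lr, lr_decay, epoch):
    """NCRFpp-style learning-rate decay: lr = init_lr / (1 + lr_decay * epoch)."""
    return init_lr / (1 + lr_decay * epoch)

print(decayed_lr(0.015, 0.05, 0))   # 0.015 -- the epoch-0 log line
print(decayed_lr(0.015, 0.05, 27))  # matches the epoch-27 value 0.006382978723404255
```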
prepare decode config file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat word-lstm.crf.nochar.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-crf-nochar.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-nochar.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-nochar.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
start manual testing ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-lstm.crf.nochar.decode.config | tee ./mysent-model/wordlstm-crf-nochar.decode.log
...
...
...
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordlstm-crf-nochar
build sequence labeling network...
use_char: False
word feature extractor: LSTM
use crf: True
build word sequence feature extractor: LSTM...
build word representation...
build CRF...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:5.56s, speed:852.88st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordlstm-crf-nochar.hyp
real 0m14.250s
user 0m11.479s
sys 0m3.106s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
check the output hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head ./wordlstm-crf-nochar.hyp
အခု B
သန့်စင်ခန်း N
ကကို N
သသုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
prepare config file:
### use # to comment out the configure item
### I/O ###
train_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
dev_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
test_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charcnn
#word_emb_dir=sample_data/sample.word.emb
#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
###NetworkConfiguration###
use_crf=False
use_char=True
word_seq_feature=CNN
char_seq_feature=CNN
#feature=[POS] emb_size=20
...
...
...
start training, which fails with the following error ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-cnn.char-cnn.train.config | tee ./mysent-model/wordcnn-charcnn.training.log
Seed num: 42
MODEL: train
Training model...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
I/O:
Start Sequence Laebling task...
Tag scheme: NoSeg
Split token: |||
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Word alphabet size: 31439
Char alphabet size: 274
Label alphabet size: 5
Word embedding dir: None
Char embedding dir: None
Word embedding size: 50
Char embedding size: 30
Norm word emb: False
Norm char emb: False
Train file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
Dev file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
Test file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
Raw file directory: None
Dset file directory: None
Model file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charcnn
Loadmodel directory: None
Decode file directory: None
Train instance number: 39999
Dev instance number: 2414
Test instance number: 4712
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: False
Model word extractor: CNN
Model use_char: True
Model char extractor: CNN
Model char_hidden_dim: 50
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 100
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: CNN
use crf: False
build word sequence feature extractor: CNN...
build word representation...
build char sequence feature extractor: CNN ...
CNN layer: 4
Epoch: 0/100
Learning rate is set as: 0.015
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 0.66s; loss: 25076740480520.8359; acc: 1733/6598=0.2627
ERROR: LOSS EXPLOSION (>1e8) ! PLEASE SET PROPER PARAMETERS AND STRUCTURE! EXIT....
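The abrupt exit is by design: NCRFpp aborts training as soon as a batch loss exceeds 1e8 (or turns NaN) rather than burn hours on a diverging model. The condition is roughly equivalent to this sketch (`loss_exploded` is my own paraphrase, not NCRFpp's actual function):

```python
def loss_exploded(sample_loss, threshold=1e8):
    """Mirror of NCRFpp's abort condition: a huge or NaN batch loss stops training.
    (sample_loss != sample_loss is the standard NaN test.)"""
    return sample_loss > threshold or sample_loss != sample_loss

print(loss_exploded(25076740480520.8359))  # True -- the batch loss that tripped the run above
print(loss_exploded(3149.88))              # False -- a normal early-epoch loss
print(loss_exploded(float("nan")))         # True
```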
lower the learning rate in the training config:
#learning_rate=0.015
learning_rate=0.010
train again, and this time it works ...
...
...
...
Instance: 20500; Time: 0.67s; loss: 111.9535; acc: 271255/277384=0.9779
Instance: 21000; Time: 0.58s; loss: 37.9173; acc: 277511/283647=0.9784
Instance: 21500; Time: 0.64s; loss: 57.2896; acc: 284363/290511=0.9788
Instance: 22000; Time: 0.63s; loss: 39.8409; acc: 291155/297313=0.9793
Instance: 22500; Time: 0.62s; loss: 52.6794; acc: 297629/303798=0.9797
Instance: 23000; Time: 0.67s; loss: 40.9671; acc: 304457/310636=0.9801
Instance: 23500; Time: 0.66s; loss: 27.7918; acc: 311446/317630=0.9805
Instance: 24000; Time: 0.61s; loss: 118.8584; acc: 317936/324131=0.9809
Instance: 24500; Time: 0.62s; loss: 159.1228; acc: 324613/330820=0.9812
Instance: 25000; Time: 0.61s; loss: 56.5890; acc: 331645/337858=0.9816
Instance: 25500; Time: 0.59s; loss: 53.2868; acc: 338068/344286=0.9819
Instance: 26000; Time: 0.62s; loss: 60.5964; acc: 344780/351005=0.9823
Instance: 26500; Time: 0.62s; loss: 19.5004; acc: 351573/357800=0.9826
Instance: 27000; Time: 0.65s; loss: 79.3346; acc: 358649/364892=0.9829
Instance: 27500; Time: 0.66s; loss: 69.7029; acc: 365715/371969=0.9832
Instance: 28000; Time: 0.61s; loss: 25.8149; acc: 371987/378246=0.9835
Instance: 28500; Time: 0.68s; loss: 44.8137; acc: 379046/385311=0.9837
Instance: 29000; Time: 0.64s; loss: 72.8295; acc: 385613/391890=0.9840
Instance: 29500; Time: 0.63s; loss: 29.0705; acc: 392296/398578=0.9842
Instance: 30000; Time: 0.66s; loss: 82.0742; acc: 399292/405583=0.9845
Instance: 30500; Time: 0.64s; loss: 9.9670; acc: 405944/412235=0.9847
Instance: 31000; Time: 0.68s; loss: 34.5553; acc: 412869/419166=0.9850
Instance: 31500; Time: 0.65s; loss: 28.2391; acc: 419464/425768=0.9852
Instance: 32000; Time: 0.70s; loss: 41.5926; acc: 426758/433072=0.9854
Instance: 32500; Time: 0.69s; loss: 102.9158; acc: 433784/440108=0.9856
Instance: 33000; Time: 0.68s; loss: 126.6441; acc: 440679/447015=0.9858
Instance: 33500; Time: 0.68s; loss: 124.7159; acc: 447726/454076=0.9860
Instance: 34000; Time: 0.63s; loss: 43.2462; acc: 454325/460685=0.9862
Instance: 34500; Time: 0.68s; loss: 104.8425; acc: 461310/467678=0.9864
Instance: 35000; Time: 0.71s; loss: 44.8154; acc: 468443/474819=0.9866
...
...
...
Instance: 31500; Time: 0.66s; loss: 39.0150; acc: 426695/427023=0.9992
Instance: 32000; Time: 0.64s; loss: 98.0791; acc: 433741/434078=0.9992
Instance: 32500; Time: 0.68s; loss: 40.7073; acc: 440771/441114=0.9992
Instance: 33000; Time: 0.61s; loss: 39.6800; acc: 447183/447533=0.9992
Instance: 33500; Time: 0.65s; loss: 25.8624; acc: 454197/454551=0.9992
Instance: 34000; Time: 0.63s; loss: 37.1705; acc: 461324/461681=0.9992
Instance: 34500; Time: 0.47s; loss: 14.6737; acc: 467914/468274=0.9992
Instance: 35000; Time: 0.50s; loss: 27.9795; acc: 474506/474870=0.9992
Instance: 35500; Time: 0.47s; loss: 28.1497; acc: 481072/481439=0.9992
Instance: 36000; Time: 0.47s; loss: 36.1011; acc: 487637/488010=0.9992
Instance: 36500; Time: 0.55s; loss: 7.0099; acc: 494552/494925=0.9992
Instance: 37000; Time: 0.62s; loss: 22.0090; acc: 501411/501787=0.9993
Instance: 37500; Time: 0.65s; loss: 25.4751; acc: 508430/508809=0.9993
Instance: 38000; Time: 0.67s; loss: 22.8392; acc: 515319/515702=0.9993
Instance: 38500; Time: 0.66s; loss: 16.3963; acc: 521998/522385=0.9993
Instance: 39000; Time: 0.66s; loss: 141.7035; acc: 528928/529333=0.9992
Instance: 39500; Time: 0.68s; loss: 43.6660; acc: 535941/536351=0.9992
Instance: 39999; Time: 0.64s; loss: 18.7707; acc: 542728/543142=0.9992
Epoch: 99 training finished. Time: 50.60s, speed: 790.49st/s, total loss: 3005.427954673767
totalloss: 3005.427954673767
Right token = 32287 All token = 32315 acc = 0.9991335293207488
Dev: time: 1.85s, speed: 1317.93st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63589 All token = 63622 acc = 0.9994813114960234
Test: time: 3.77s, speed: 1261.49st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
real 92m0.469s
user 91m45.120s
sys 0m15.266s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
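With several 100-epoch runs to compare, it helps to pull the per-epoch Dev/Test accuracies out of the `tee`'d logs instead of scrolling them. A small sketch (`extract_acc` and the regex are my own, written against the `Dev:`/`Test:` lines shown above):

```python
import re

# pull per-epoch Dev/Test accuracy out of an NCRFpp training log
# (illustrative helper; matches lines like
#  "Dev: time: 1.85s, speed: 1317.93st/s; acc: 0.9991, p: -1.0000, ...")
ACC = re.compile(r"^(Dev|Test):.*?acc:\s*([0-9.]+)")

def extract_acc(lines):
    return [(m.group(1), float(m.group(2)))
            for m in (ACC.match(line) for line in lines) if m]

sample = [
    "Dev: time: 1.85s, speed: 1317.93st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000",
    "Test: time: 3.77s, speed: 1261.49st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000",
]
print(extract_acc(sample))  # [('Dev', 0.9991), ('Test', 0.9995)]
```

On a real log: `extract_acc(open("./mysent-model/wordcnn-charcnn.training.log"))`.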
check the model folder:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-model$ ls
bk wordlstm-charlstm.dset wordlstm-crf-charlstm.training.log
wordcnn-charcnn.0.model wordlstm-charlstm.training.log wordlstm-crf-nochar.0.model
wordcnn-charcnn.dset wordlstm-crf-charcnn.0.model wordlstm-crf-nochar.decode.log
wordcnn-charcnn.training.errlog1 wordlstm-crf-charcnn.decode.log wordlstm-crf-nochar.dset
wordcnn-charcnn.training.log wordlstm-crf-charcnn.dset word-lstm.crf.nochar.training.log
wordlstm-charcnn.0.model wordlstm-crf-charcnn.training.log wordlstm-nochar.0.model
wordlstm-charcnn.dset wordlstm-crf-charlstm.0.model wordlstm-nochar.decode.log
wordlstm-charlstm.0.model wordlstm-crf-charlstm.decode.log wordlstm-nochar.dset
wordlstm-charlstm.decode.log wordlstm-crf-charlstm.dset wordlstm-nochar.training.log
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-model$
Prepare decode config ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat word-cnn.char-cnn.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-charcnn.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charcnn.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charcnn.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
manual testing ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-cnn.char-cnn.decode.config | tee ./mysent-model/wordcnn-charcnn.decode.log
...
...
...
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charcnn
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: CNN
use crf: False
build word sequence feature extractor: CNN...
build word representation...
build char sequence feature extractor: CNN ...
CNN layer: 4
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:5.43s, speed:874.26st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-charcnn.hyp
real 0m13.451s
user 0m11.597s
sys 0m1.538s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
check the output hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head ./wordcnn-charcnn.hyp
အခု B
သန့်စင်ခန်း N
ကကို N
သသုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
prepare the config file:
### use # to comment out the configure item
### I/O ###
train_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
dev_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
test_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charlstm
#word_emb_dir=sample_data/sample.word.emb
#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
###NetworkConfiguration###
use_crf=False
use_char=True
word_seq_feature=CNN
char_seq_feature=LSTM
#feature=[POS] emb_size=20
...
...
...
training ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python ./main.py --config ./mysent-config/word-cnn.char-lstm.train.config | tee ./mysent-model/wordcnn-charlstm.train.log
...
...
...
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.01
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/home/yekyaw.thu/.conda/envs/ncrfpp/lib/python3.8/site-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
build sequence labeling network...
use_char: True
char feature extractor: LSTM
word feature extractor: CNN
use crf: False
build word sequence feature extractor: CNN...
build word representation...
build char sequence feature extractor: LSTM ...
CNN layer: 4
Epoch: 0/100
Learning rate is set as: 0.01
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 1.04s; loss: 46295.9573; acc: 4432/6598=0.6717
...
...
...
Instance: 30500; Time: 0.93s; loss: 18.5849; acc: 413619/413929=0.9993
Instance: 31000; Time: 0.83s; loss: 33.5528; acc: 420028/420344=0.9992
Instance: 31500; Time: 0.87s; loss: 30.8974; acc: 426700/427023=0.9992
Instance: 32000; Time: 0.86s; loss: 60.9885; acc: 433746/434078=0.9992
Instance: 32500; Time: 0.87s; loss: 17.4721; acc: 440777/441114=0.9992
Instance: 33000; Time: 0.77s; loss: 35.3249; acc: 447189/447533=0.9992
Instance: 33500; Time: 0.83s; loss: 23.4311; acc: 454202/454551=0.9992
Instance: 34000; Time: 0.81s; loss: 11.1323; acc: 461329/461681=0.9992
Instance: 34500; Time: 0.84s; loss: 12.5343; acc: 467919/468274=0.9992
Instance: 35000; Time: 0.84s; loss: 23.8477; acc: 474511/474870=0.9992
Instance: 35500; Time: 0.83s; loss: 23.1941; acc: 481077/481439=0.9992
Instance: 36000; Time: 0.81s; loss: 33.4580; acc: 487642/488010=0.9992
Instance: 36500; Time: 0.86s; loss: 5.1792; acc: 494557/494925=0.9993
Instance: 37000; Time: 0.93s; loss: 18.1748; acc: 501416/501787=0.9993
Instance: 37500; Time: 0.93s; loss: 26.9669; acc: 508435/508809=0.9993
Instance: 38000; Time: 0.90s; loss: 17.9885; acc: 515324/515702=0.9993
Instance: 38500; Time: 0.84s; loss: 15.3559; acc: 522003/522385=0.9993
Instance: 39000; Time: 0.88s; loss: 80.9564; acc: 528934/529333=0.9992
Instance: 39500; Time: 0.87s; loss: 30.3917; acc: 535947/536351=0.9992
Instance: 39999; Time: 0.88s; loss: 15.3817; acc: 542734/543142=0.9992
Epoch: 99 training finished. Time: 69.13s, speed: 578.58st/s, total loss: 1997.4792351722717
totalloss: 1997.4792351722717
Right token = 32281 All token = 32315 acc = 0.9989478570323379
Dev: time: 2.47s, speed: 987.04st/s; acc: 0.9989, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63576 All token = 63622 acc = 0.9992769796611235
Test: time: 4.70s, speed: 1011.79st/s; acc: 0.9993, p: -1.0000, r: -1.0000, f: -1.0000
real 128m37.791s
user 127m32.000s
sys 0m37.373s
prepare decode config file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat word-cnn.char-lstm.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-charlstm.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charlstm.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charlstm.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
manual testing ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-cnn.char-lstm.decode.config | tee ./mysent-model/wordcnn-charlstm.decode.log
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-charlstm
build sequence labeling network...
use_char: True
char feature extractor: LSTM
word feature extractor: CNN
use crf: False
build word sequence feature extractor: CNN...
build word representation...
build char sequence feature extractor: LSTM ...
CNN layer: 4
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:4.89s, speed:971.71st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-charlstm.hyp
real 0m13.496s
user 0m11.743s
sys 0m1.405s
check the output hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head wordcnn-charlstm.hyp
အခု B
သန့်စင်ခန်း N
ကကို N
သသုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
prepare config file:
### use # to comment out the configure item
### I/O ###
train_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
dev_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
test_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-nochar
#word_emb_dir=sample_data/sample.word.emb
#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
###NetworkConfiguration###
use_crf=False
use_char=False
word_seq_feature=CNN
#char_seq_feature=LSTM
#feature=[POS] emb_size=20
...
...
start training ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-cnn.no-char.train.config | tee ./mysent-model/wordcnn-nochar.train.log
...
...
Hyperparameters:
Hyper lr: 0.01
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/home/yekyaw.thu/.conda/envs/ncrfpp/lib/python3.8/site-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
build sequence labeling network...
use_char: False
word feature extractor: CNN
use crf: False
build word sequence feature extractor: CNN...
build word representation...
CNN layer: 4
Epoch: 0/100
Learning rate is set as: 0.01
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 0.72s; loss: 183879.1970; acc: 3662/6598=0.5550
Instance: 1000; Time: 0.62s; loss: 5917.1649; acc: 8716/13463=0.6474
Instance: 1500; Time: 0.57s; loss: 4383.6639; acc: 13737/20100=0.6834
Instance: 2000; Time: 0.53s; loss: 4228.5525; acc: 19241/27109=0.7098
Instance: 2500; Time: 0.53s; loss: 3214.0498; acc: 24515/33628=0.7290
...
...
...
Instance: 31000; Time: 0.43s; loss: 36.0379; acc: 420018/420344=0.9992
Instance: 31500; Time: 0.48s; loss: 46.1417; acc: 426688/427023=0.9992
Instance: 32000; Time: 0.58s; loss: 78.4269; acc: 433734/434078=0.9992
Instance: 32500; Time: 0.63s; loss: 52.6536; acc: 440763/441114=0.9992
Instance: 33000; Time: 0.54s; loss: 39.1081; acc: 447175/447533=0.9992
Instance: 33500; Time: 0.49s; loss: 35.3635; acc: 454189/454551=0.9992
Instance: 34000; Time: 0.45s; loss: 27.4643; acc: 461316/461681=0.9992
Instance: 34500; Time: 0.44s; loss: 13.7139; acc: 467906/468274=0.9992
Instance: 35000; Time: 0.46s; loss: 24.5363; acc: 474497/474870=0.9992
Instance: 35500; Time: 0.53s; loss: 31.6647; acc: 481061/481439=0.9992
Instance: 36000; Time: 0.53s; loss: 36.9760; acc: 487626/488010=0.9992
Instance: 36500; Time: 0.56s; loss: 8.3417; acc: 494539/494925=0.9992
Instance: 37000; Time: 0.55s; loss: 22.6988; acc: 501398/501787=0.9992
Instance: 37500; Time: 0.55s; loss: 25.0444; acc: 508417/508809=0.9992
Instance: 38000; Time: 0.46s; loss: 19.8509; acc: 515306/515702=0.9992
Instance: 38500; Time: 0.45s; loss: 16.7377; acc: 521985/522385=0.9992
Instance: 39000; Time: 0.46s; loss: 95.4117; acc: 528913/529333=0.9992
Instance: 39500; Time: 0.46s; loss: 46.7911; acc: 535924/536351=0.9992
Instance: 39999; Time: 0.45s; loss: 16.8303; acc: 542712/543142=0.9992
Epoch: 99 training finished. Time: 42.67s, speed: 937.38st/s, total loss: 2424.9977610111237
totalloss: 2424.9977610111237
Right token = 32285 All token = 32315 acc = 0.9990716385579452
Dev: time: 1.88s, speed: 1298.77st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63589 All token = 63622 acc = 0.9994813114960234
Test: time: 4.00s, speed: 1189.38st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
real 84m37.016s
user 83m56.073s
sys 0m14.902s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
run decoding (the decode config follows the same pattern as the ones above):
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-cnn.nochar.decode.config
...
...
...
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.01
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-nochar
build sequence labeling network...
use_char: False
word feature extractor: CNN
use crf: False
build word sequence feature extractor: CNN...
build word representation...
CNN layer: 4
Decode raw data, nbest: None ...
Right token = 63572 All token = 63622 acc = 0.9992141083273082
raw: time:3.75s, speed:1270.25st/s; acc: 0.9992, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-nochar.hyp
real 0m12.432s
user 0m9.680s
sys 0m3.054s
check the hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head wordcnn-nochar.hyp
အခု B
သန့်စင်ခန်း N
ကကို N
သသုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
prepare training config file:
### use # to comment out the configure item
### I/O ###
train_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
dev_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
test_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charcnn
#word_emb_dir=sample_data/sample.word.emb
#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
###NetworkConfiguration###
use_crf=True
use_char=True
word_seq_feature=CNN
char_seq_feature=CNN
#feature=[POS] emb_size=20
...
...
...
start training the word-CNN + CRF + char-CNN model ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python ./main.py --config ./mysent-config/word-cnn.crf.char-cnn.train.config | tee ./mysent-model/word-cnn.crf.char-cnn.training.log
...
...
...
build word sequence feature extractor: CNN...
build word representation...
build char sequence feature extractor: CNN ...
CNN layer: 4
build CRF...
Epoch: 0/100
Learning rate is set as: 0.015
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 2.16s; loss: 14709.3444; acc: 3933/6598=0.5961
Instance: 1000; Time: 2.20s; loss: 679335847501542.1250; acc: 6127/13463=0.4551
ERROR: LOSS EXPLOSION (>1e8) ! PLEASE SET PROPER PARAMETERS AND STRUCTURE! EXIT....
real 0m19.259s
user 0m16.236s
sys 0m3.266s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
Got the loss-explosion ERROR shown above, so I lowered the learning rate as follows:
#learning_rate=0.015
learning_rate=0.010
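For context: besides the initial value set here, NCRF++ also decays the SGD learning rate every epoch, which is why the later logs print rates like 0.009523809523809523 at epoch 1. A sketch of that schedule, assuming the lr0 / (1 + lr_decay * epoch) form that reproduces the printed values (lr0 = 0.01, lr_decay = 0.05, as in the hyperparameter summary):

```python
def decayed_lr(lr0, lr_decay, epoch):
    """Per-epoch SGD learning rate, assuming an
    lr0 / (1 + lr_decay * epoch) decay schedule."""
    return lr0 / (1 + lr_decay * epoch)

for epoch in range(3):
    print(epoch, decayed_lr(0.01, 0.05, epoch))
# epoch 0 -> 0.01; epoch 1 -> 0.009523809523809523, matching the log
```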
re-train ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python ./main.py --config ./mysent-config/word-cnn.crf.char-cnn.train.config | tee ./mysent-model/word-cnn.crf.char-cnn.training.log
...
...
...
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.01
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: CNN
use crf: True
build word sequence feature extractor: CNN...
build word representation...
build char sequence feature extractor: CNN ...
CNN layer: 4
build CRF...
Epoch: 0/100
Learning rate is set as: 0.01
Shuffle: first input word list: [8895, 45, 226, 207, 1037, 644, 18, 253, 208, 254]
Instance: 500; Time: 2.22s; loss: 9619.4552; acc: 4072/6598=0.6172
Instance: 1000; Time: 2.29s; loss: 3674.8943; acc: 9391/13463=0.6975
...
...
...
Instance: 30000; Time: 0.47s; loss: 20.1497; acc: 406608/406922=0.9992
Instance: 30500; Time: 0.46s; loss: 15.6747; acc: 413610/413929=0.9992
Instance: 31000; Time: 0.43s; loss: 36.0379; acc: 420018/420344=0.9992
Instance: 31500; Time: 0.48s; loss: 46.1417; acc: 426688/427023=0.9992
Instance: 32000; Time: 0.58s; loss: 78.4269; acc: 433734/434078=0.9992
Instance: 32500; Time: 0.63s; loss: 52.6536; acc: 440763/441114=0.9992
Instance: 33000; Time: 0.54s; loss: 39.1081; acc: 447175/447533=0.9992
Instance: 33500; Time: 0.49s; loss: 35.3635; acc: 454189/454551=0.9992
Instance: 34000; Time: 0.45s; loss: 27.4643; acc: 461316/461681=0.9992
Instance: 34500; Time: 0.44s; loss: 13.7139; acc: 467906/468274=0.9992
Instance: 35000; Time: 0.46s; loss: 24.5363; acc: 474497/474870=0.9992
Instance: 35500; Time: 0.53s; loss: 31.6647; acc: 481061/481439=0.9992
Instance: 36000; Time: 0.53s; loss: 36.9760; acc: 487626/488010=0.9992
Instance: 36500; Time: 0.56s; loss: 8.3417; acc: 494539/494925=0.9992
Instance: 37000; Time: 0.55s; loss: 22.6988; acc: 501398/501787=0.9992
Instance: 37500; Time: 0.55s; loss: 25.0444; acc: 508417/508809=0.9992
Instance: 38000; Time: 0.46s; loss: 19.8509; acc: 515306/515702=0.9992
Instance: 38500; Time: 0.45s; loss: 16.7377; acc: 521985/522385=0.9992
Instance: 39000; Time: 0.46s; loss: 95.4117; acc: 528913/529333=0.9992
Instance: 39500; Time: 0.46s; loss: 46.7911; acc: 535924/536351=0.9992
Instance: 39999; Time: 0.45s; loss: 16.8303; acc: 542712/543142=0.9992
Epoch: 99 training finished. Time: 42.67s, speed: 937.38st/s, total loss: 2424.9977610111237
totalloss: 2424.9977610111237
Right token = 32285 All token = 32315 acc = 0.9990716385579452
Dev: time: 1.88s, speed: 1298.77st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63589 All token = 63622 acc = 0.9994813114960234
Test: time: 4.00s, speed: 1189.38st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
real 84m37.016s
user 83m56.073s
sys 0m14.902s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
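As a sanity check, the acc figures in these summary lines are plain right/all token ratios; for example, the test accuracy above:

```python
# Reproduce the reported test accuracy from its token counts.
right_token, all_token = 63589, 63622
acc = right_token / all_token
print(acc)  # 0.9994813114960234, as printed in the log
```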
prepare for manual checking; first, create a decode configuration file as follows:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat word-cnn.crf.char-cnn.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-crf-charcnn.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charcnn.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charcnn.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
run manual testing ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-cnn.crf.char-cnn.decode.config | tee ./mysent-model/wordcnn.crf.charcnn.decode.log
...
...
...
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charcnn
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: CNN
use crf: True
build word sequence feature extractor: CNN...
build word representation...
build char sequence feature extractor: CNN ...
CNN layer: 4
build CRF...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:5.57s, speed:852.53st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-crf-charcnn.hyp
real 0m14.296s
user 0m11.311s
sys 0m3.144s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
check the output hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ head ./mysent-hyp/wordcnn-crf-charcnn.hyp
အခု B
သန့်စင်ခန်း N
ကို N
သုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
prepare the config file for the Word-CNN, CRF, Char-LSTM model:
### use # to comment out the configure item
### I/O ###
train_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
dev_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
test_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charlstm
#word_emb_dir=sample_data/sample.word.emb
#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
###NetworkConfiguration###
use_crf=True
use_char=True
word_seq_feature=CNN
char_seq_feature=LSTM
#feature=[POS] emb_size=20
...
...
...
start training Word-CNN, CRF, Char-LSTM ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python ./main.py --config ./mysent-config/word-cnn.crf.char-lstm.train.config | tee ./mysent-model/word-cnn.crf.char-lstm.training.log
...
...
...
Dev instance number: 2414
Test instance number: 4712
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: True
Model word extractor: CNN
Model use_char: True
Model char extractor: LSTM
Model char_hidden_dim: 50
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 100
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.01
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
...
...
...
Instance: 33500; Time: 2.71s; loss: 89.1120; acc: 450743/454076=0.9927
Instance: 34000; Time: 2.47s; loss: 44.3172; acc: 457345/460685=0.9927
Instance: 34500; Time: 2.68s; loss: 61.8543; acc: 464330/467678=0.9928
Instance: 35000; Time: 2.80s; loss: 42.6160; acc: 471465/474819=0.9929
Instance: 35500; Time: 2.75s; loss: 18.5766; acc: 478705/482061=0.9930
Instance: 36000; Time: 2.65s; loss: 45.8868; acc: 485281/488644=0.9931
Instance: 36500; Time: 2.68s; loss: 35.7804; acc: 492016/495386=0.9932
Instance: 37000; Time: 2.61s; loss: 77.9952; acc: 498888/502268=0.9933
Instance: 37500; Time: 2.64s; loss: 77.2073; acc: 505813/509202=0.9933
Instance: 38000; Time: 2.51s; loss: 83.5568; acc: 512236/515634=0.9934
Instance: 38500; Time: 2.69s; loss: 25.0363; acc: 519113/522512=0.9935
Instance: 39000; Time: 2.68s; loss: 62.0210; acc: 525813/529223=0.9936
Instance: 39500; Time: 2.64s; loss: 18.2599; acc: 532746/536159=0.9936
Instance: 39999; Time: 2.72s; loss: 39.4828; acc: 539723/543142=0.9937
Epoch: 0 training finished. Time: 204.92s, speed: 195.19st/s, total loss: 43944.579750061035
totalloss: 43944.579750061035
Right token = 32287 All token = 32315 acc = 0.9991335293207488
Dev: time: 3.12s, speed: 777.69st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Exceed previous best f score: -10
Save current best model in file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charlstm.0.model
Right token = 63589 All token = 63622 acc = 0.9994813114960234
Test: time: 6.39s, speed: 743.47st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Epoch: 1/100
Learning rate is set as: 0.009523809523809523
Shuffle: first input word list: [20733, 2180, 20734, 325, 130, 325, 872, 130, 872, 214]
Instance: 500; Time: 2.73s; loss: 23.1693; acc: 7006/7008=0.9997
Instance: 1000; Time: 2.47s; loss: 44.3065; acc: 13303/13314=0.9992
Instance: 1500; Time: 2.75s; loss: 86.1397; acc: 20303/20328=0.9988
Instance: 2000; Time: 2.71s; loss: 38.1606; acc: 26981/27011=0.9989
Instance: 2500; Time: 2.51s; loss: 59.5888; acc: 33237/33274=0.9989
Instance: 3000; Time: 2.75s; loss: 22.1511; acc: 40138/40177=0.9990
...
...
...
Instance: 30500; Time: 2.65s; loss: 15.0507; acc: 413603/413929=0.9992
Instance: 31000; Time: 2.56s; loss: 43.8955; acc: 420011/420344=0.9992
Instance: 31500; Time: 2.74s; loss: 33.9988; acc: 426683/427023=0.9992
Instance: 32000; Time: 2.46s; loss: 41.7207; acc: 433729/434078=0.9992
Instance: 32500; Time: 2.50s; loss: 32.4494; acc: 440758/441114=0.9992
Instance: 33000; Time: 2.17s; loss: 38.8189; acc: 447170/447533=0.9992
Instance: 33500; Time: 2.30s; loss: 23.0599; acc: 454184/454551=0.9992
Instance: 34000; Time: 2.34s; loss: 16.5867; acc: 461311/461681=0.9992
Instance: 34500; Time: 2.20s; loss: 13.3369; acc: 467901/468274=0.9992
Instance: 35000; Time: 2.35s; loss: 24.9703; acc: 474493/474870=0.9992
Instance: 35500; Time: 2.16s; loss: 21.7741; acc: 481059/481439=0.9992
Instance: 36000; Time: 2.18s; loss: 34.5946; acc: 487624/488010=0.9992
Instance: 36500; Time: 2.38s; loss: 5.5450; acc: 494539/494925=0.9992
Instance: 37000; Time: 2.19s; loss: 21.1420; acc: 501398/501787=0.9992
Instance: 37500; Time: 2.39s; loss: 24.0189; acc: 508417/508809=0.9992
Instance: 38000; Time: 2.33s; loss: 19.6784; acc: 515306/515702=0.9992
Instance: 38500; Time: 2.27s; loss: 15.3549; acc: 521985/522385=0.9992
Instance: 39000; Time: 2.37s; loss: 77.0108; acc: 528915/529333=0.9992
Instance: 39500; Time: 2.39s; loss: 35.0670; acc: 535928/536351=0.9992
Instance: 39999; Time: 2.30s; loss: 16.1045; acc: 542715/543142=0.9992
Epoch: 99 training finished. Time: 205.11s, speed: 195.01st/s, total loss: 2097.350311279297
totalloss: 2097.350311279297
Right token = 32287 All token = 32315 acc = 0.9991335293207488
Dev: time: 3.11s, speed: 780.27st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63589 All token = 63622 acc = 0.9994813114960234
Test: time: 6.36s, speed: 745.79st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
real 359m21.591s
user 358m29.862s
sys 0m33.201s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
Let's see the model folder:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ cd mysent-model/
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-model$ ls
bk wordcnn-crf-charlstm.0.model wordlstm-crf-charcnn.dset
wordcnn-charcnn.0.model wordcnn-crf-charlstm.dset wordlstm-crf-charcnn.training.log
wordcnn-charcnn.decode.log word-cnn.crf.char-lstm.training.log wordlstm-crf-charlstm.0.model
wordcnn-charcnn.dset wordcnn-nochar.0.model wordlstm-crf-charlstm.decode.log
wordcnn-charcnn.training.errlog1 wordcnn-nochar.dset wordlstm-crf-charlstm.dset
wordcnn-charcnn.training.log wordcnn-nochar.train.log wordlstm-crf-charlstm.training.log
wordcnn-charlstm.0.model wordlstm-charcnn.0.model wordlstm-crf-nochar.0.model
wordcnn-charlstm.decode.log wordlstm-charcnn.dset wordlstm-crf-nochar.decode.log
wordcnn-charlstm.dset wordlstm-charlstm.0.model wordlstm-crf-nochar.dset
wordcnn-charlstm.train.log wordlstm-charlstm.decode.log word-lstm.crf.nochar.training.log
wordcnn-crf-charcnn.0.model wordlstm-charlstm.dset wordlstm-nochar.0.model
wordcnn.crf.charcnn.decode.log wordlstm-charlstm.training.log wordlstm-nochar.decode.log
wordcnn-crf-charcnn.dset wordlstm-crf-charcnn.0.model wordlstm-nochar.dset
word-cnn.crf.char-cnn.training.log wordlstm-crf-charcnn.decode.log wordlstm-nochar.training.log
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-model$
prepare decoding config file for the Word-CNN, CRF, Char-LSTM model:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat word-cnn.crf.char-lstm.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-crf-charlstm.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charlstm.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charlstm.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
manual decoding of Word-CNN, CRF, Char-LSTM model ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-cnn.crf.char-lstm.decode.config | tee ./wordcnn.crf.charlstm.decode.log
...
...
...
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-charlstm
build sequence labeling network...
use_char: True
char feature extractor: LSTM
word feature extractor: CNN
use crf: True
build word sequence feature extractor: CNN...
build word representation...
build char sequence feature extractor: LSTM ...
CNN layer: 4
build CRF...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:6.37s, speed:744.10st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-crf-charlstm.hyp
real 0m15.192s
user 0m12.328s
sys 0m3.150s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
prepare the config file for the Word-CNN, CRF, no-char model:
### use # to comment out the configure item
### I/O ###
train_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/train.col
dev_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/valid.col
test_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-nochar
#word_emb_dir=sample_data/sample.word.emb
#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30
###NetworkConfiguration###
use_crf=True
use_char=False
word_seq_feature=CNN
#char_seq_feature=LSTM
#feature=[POS] emb_size=20
start training Word-CNN, CRF, No-char Model ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python main.py --config ./mysent-config/word-cnn.crf.nochar.train.config | tee word-cnn.crf.nochar.train.log
...
...
...
Decode file directory: None
Train instance number: 39999
Dev instance number: 2414
Test instance number: 4712
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: True
Model word extractor: CNN
Model use_char: False
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 100
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.01
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
...
...
...
Instance: 32500; Time: 2.28s; loss: 24.5506; acc: 440783/441114=0.9992
Instance: 33000; Time: 2.02s; loss: 34.7561; acc: 447195/447533=0.9992
Instance: 33500; Time: 2.14s; loss: 21.3422; acc: 454209/454551=0.9992
Instance: 34000; Time: 2.19s; loss: 7.5556; acc: 461337/461681=0.9993
Instance: 34500; Time: 2.08s; loss: 12.9064; acc: 467926/468274=0.9993
Instance: 35000; Time: 2.22s; loss: 24.6565; acc: 474518/474870=0.9993
Instance: 35500; Time: 2.05s; loss: 14.1812; acc: 481084/481439=0.9993
Instance: 36000; Time: 2.04s; loss: 29.1913; acc: 487649/488010=0.9993
Instance: 36500; Time: 2.19s; loss: 5.1461; acc: 494564/494925=0.9993
Instance: 37000; Time: 2.10s; loss: 18.1655; acc: 501419/501787=0.9993
Instance: 37500; Time: 2.28s; loss: 19.7429; acc: 508438/508809=0.9993
Instance: 38000; Time: 2.24s; loss: 20.5841; acc: 515327/515702=0.9993
Instance: 38500; Time: 2.16s; loss: 15.2183; acc: 522006/522385=0.9993
Instance: 39000; Time: 2.22s; loss: 47.5378; acc: 528939/529333=0.9993
Instance: 39500; Time: 2.29s; loss: 36.6965; acc: 535948/536351=0.9992
Instance: 39999; Time: 2.19s; loss: 14.7860; acc: 542735/543142=0.9993
Epoch: 99 training finished. Time: 175.37s, speed: 228.08st/s, total loss: 1706.869140625
totalloss: 1706.869140625
Right token = 32283 All token = 32315 acc = 0.9990097477951416
Dev: time: 2.80s, speed: 869.81st/s; acc: 0.9990, p: -1.0000, r: -1.0000, f: -1.0000
Right token = 63591 All token = 63622 acc = 0.999512747162931
Test: time: 5.68s, speed: 835.87st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
real 308m19.890s
user 307m49.698s
sys 0m23.073s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
Prepare decode/test config file ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$ cat word-cnn.crf.nochar.decode.config
### Decode ###
status=decode
#raw_dir=sample_data/raw.bmes
raw_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-config/data/sent/test.col
#nbest=1
#nbest=10
decode_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-crf-nochar.hyp
#dset_dir=sample_data/lstmcrf.dset
dset_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-nochar.dset
#load_model_dir=sample_data/lstmcrf.0.model
load_model_dir=/home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-nochar.0.model
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-config$
run manual testing ...
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$ time python ./main.py --config ./mysent-config/word-cnn.crf.nochar.decode.config | tee ./mysent-model/word-cnn.crf.nochar.decode.log
...
...
...
Decode file directory: /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-crf-nochar.hyp
Train instance number: 39999
Dev instance number: 2414
Test instance number: 4712
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: True
Model word extractor: CNN
Model use_char: False
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 100
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.01
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 1
Hyper bilstm: True
Hyper GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: None
Load Model from file: /home/yekyaw.thu/tool/NCRFpp/mysent-model/wordcnn-crf-nochar
build sequence labeling network...
use_char: False
word feature extractor: CNN
use crf: True
build word sequence feature extractor: CNN...
build word representation...
CNN layer: 4
build CRF...
Decode raw data, nbest: None ...
Right token = 63589 All token = 63622 acc = 0.9994813114960234
raw: time:5.99s, speed:792.42st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000
Predict raw result has been written into file. /home/yekyaw.thu/tool/NCRFpp/mysent-hyp/wordcnn-crf-nochar.hyp
real 0m23.978s
user 0m11.976s
sys 0m3.584s
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp$
Check the output hyp file:
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$ head wordcnn-crf-nochar.hyp
အခု B
သန့်စင်ခန်း N
ကို N
သုံး N
ပါရစေ E
လူငယ် B
တွေ O
က O
ပုံစံတကျ O
(ncrfpp) yekyaw.thu@gpu:~/tool/NCRFpp/mysent-hyp$
All the experiments that I planned are finished!!!
to do: I plan to add summary tables for easier comparison ...
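As a starting point for those tables, here is a sketch that scrapes the final Dev/Test accuracies out of an NCRF++ training log. The regex assumes the exact "Dev: time: ...; acc: ..." line format shown in the logs above; in practice the input would be the *.training.log files under mysent-model/.

```python
import re

# Matches summary lines such as:
#   Test: time: 4.00s, speed: 1189.38st/s; acc: 0.9995, p: -1.0000, ...
LINE = re.compile(r"^(Dev|Test):.*?acc:\s*([0-9.]+)")

def final_scores(log_text):
    """Return the last reported Dev/Test accuracies in a training log."""
    scores = {}
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m:
            scores[m.group(1)] = float(m.group(2))  # later epochs overwrite earlier ones
    return scores

sample = (
    "Dev: time: 1.88s, speed: 1298.77st/s; acc: 0.9991, p: -1.0000, r: -1.0000, f: -1.0000\n"
    "Test: time: 4.00s, speed: 1189.38st/s; acc: 0.9995, p: -1.0000, r: -1.0000, f: -1.0000\n"
)
print(final_scores(sample))  # {'Dev': 0.9991, 'Test': 0.9995}
```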