This repo implements an NER model in Julia with Flux (GloVe + BiLSTM + softmax). It currently reaches 78% chunk accuracy on the CoNLL-2003 dataset.
An introduction to the Julia language (in Chinese): https://www.oschina.net/news/99104/what-is-julia
Please see the REQUIRE file for the package dependencies.
Given a sentence, assign a tag to each word (including punctuation). The classic application is Named Entity Recognition. Here is an example:
John lives in New York
B-PER O O B-LOC I-LOC
The code related to model construction:
```julia
model = Chain(
    Dense_m(Weight),
    MyBiLSTM(EmbedSize, HiddenSize),
    Dropout(0.5),
    lower_dim(HiddenSize * 2),
    Dense(HiddenSize * 2, ClassNum),
    softmax
)
```
- an embedding layer that maps each word to its GloVe vector (the development set is used to tune the hyperparameter EmbedSize);
- a BiLSTM run over each sentence to extract a contextual representation of each word;
- a dropout layer;
- a fully connected layer with softmax to decode the tags.
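For readers curious how such a stack could be assembled, here is a minimal sketch of a bidirectional LSTM built from two plain Flux LSTMs. The `SketchBiLSTM` name and dimensions are illustrative assumptions, not the repo's actual `MyBiLSTM`/`Dense_m` implementations, and the exact `LSTM` constructor signature depends on the Flux version.

```julia
# Minimal sketch (assumptions, not the repo's code): run one LSTM
# left-to-right and one right-to-left over a sequence of word vectors,
# then concatenate their outputs at each time step.
using Flux

struct SketchBiLSTM
    fwd
    bwd
end

# Build the two directional LSTMs from the embedding and hidden sizes.
SketchBiLSTM(in::Integer, out::Integer) =
    SketchBiLSTM(LSTM(in, out), LSTM(in, out))

function (m::SketchBiLSTM)(xs)
    ys_fwd = [m.fwd(x) for x in xs]                  # left-to-right pass
    ys_bwd = reverse([m.bwd(x) for x in reverse(xs)]) # right-to-left pass
    [vcat(f, b) for (f, b) in zip(ys_fwd, ys_bwd)]   # 2*HiddenSize per word
end
```

Each output vector has twice the hidden size, which is why the decoding layer above takes `HiddenSize * 2` inputs.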
- Download the initial data (the CoNLL-2003 dataset), or clone the repo:

```shell
git clone https://github.com/GGchencan/JuJu.git
```

The initial CoNLL-2003 data is included in the demo folder.
- Use the preprocessing program to preprocess the data:

```shell
julia data_preprocess_custom.jl train.txt test.txt dev.txt
```

This program builds six data files from the dataset, used for training, evaluation, and testing. The parameters are, in order, the paths to the train data, test data, and evaluation data.
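As a rough illustration of the input the preprocessing step consumes, the following pure-Julia sketch parses CoNLL-style lines (one `token tag` pair per line, blank line between sentences). The function name and return shape are assumptions for illustration, not the repo's API.

```julia
# Sketch: parse CoNLL-style lines into sentences of (token, tag) pairs.
# Assumes the token is the first column and the tag is the last column.
function read_conll(lines)
    sentences = Vector{Vector{Tuple{String,String}}}()
    current = Tuple{String,String}[]
    for line in lines
        s = strip(line)
        if isempty(s)
            # Blank line ends the current sentence.
            isempty(current) || push!(sentences, current)
            current = Tuple{String,String}[]
        else
            parts = split(s)
            push!(current, (String(parts[1]), String(parts[end])))
        end
    end
    isempty(current) || push!(sentences, current)
    return sentences
end
```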
- cd to the repo and simply run main.jl:

```shell
julia main.jl
```

After several minutes the training process will be finished.
- Make sure the generated model file exists in the JuJu folder, then run demo.jl to show the result of your training:

```shell
julia demo/demo.jl
```

The result is expected to look like:
The training data must follow the CoNLL-2003 data format.
A default data example:

```
John B-PER
lives O
in O
New B-LOC
York I-LOC
. O

This O
is O
another O
sentence O
```
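Chunk accuracy is measured over entity spans rather than individual tags, so BIO tags must be decoded into chunks first. The sketch below (function name and return shape are assumptions, not the repo's API) shows one way to do that:

```julia
# Sketch: decode BIO tags into (entity_type, start, stop) chunks.
# "B-X" opens a chunk of type X; following "I-X" tags extend it;
# anything else closes the open chunk.
function bio_chunks(tags)
    chunks = Tuple{String,Int,Int}[]
    start, typ = 0, ""
    for (i, t) in enumerate(tags)
        if startswith(t, "B-")
            start > 0 && push!(chunks, (typ, start, i - 1))
            start, typ = i, t[3:end]
        elseif startswith(t, "I-") && start > 0 && t[3:end] == typ
            # Same entity continues; keep the chunk open.
        else
            start > 0 && push!(chunks, (typ, start, i - 1))
            start, typ = 0, ""
        end
    end
    start > 0 && push!(chunks, (typ, start, length(tags)))
    return chunks
end
```

For the example sentence above, `["B-PER", "O", "O", "B-LOC", "I-LOC"]` decodes to one PER chunk (word 1) and one LOC chunk (words 4–5).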
After you prepare your own data and split it into train, test, and eval sets, use step 2 in Getting Started to preprocess it and run the training.
The results are shown below:
- example 1: epoch = 10, dim (word embedding dimension) = 50
- example 2: epoch = 10, dim = 300
- example 3: epoch = 20, dim = 300