# Neural Morphological Tagger

Performs neural morphological tagging. The tagger takes Vabamorf's analyses as input and predicts morphological tags with better accuracy than Vabamorf, but it uses a different tag set.
In other words, it can be thought of as a morphological disambiguator that is built on top of Vabamorf's output and uses a different tag set than Vabamorf.

*Note: you need to install [estnltk_neural](https://github.com/estnltk/estnltk/tree/main/estnltk_neural) package for neural morphological tagging. Be aware that this implementation also requires an old `tensorflow` version (version < 2.0, such as 1.15.5), and is not compatible with the newest `tensorflow`.*
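
Since the constraint above is on the major `tensorflow` version, a quick check before loading a tagger can make an incompatible environment easier to spot. The following is only a sketch: `is_supported_tensorflow` is a hypothetical helper, not part of `estnltk_neural`, and it assumes a plain dotted version string such as `1.15.5`.

```python
def is_supported_tensorflow(version: str) -> bool:
    """Return True if `version` denotes a TensorFlow release below 2.0.

    Hypothetical helper, not part of estnltk_neural: it only inspects the
    major component of a dotted version string such as '1.15.5'.
    """
    major = version.strip().split(".")[0]
    return major.isdigit() and int(major) < 2


if __name__ == "__main__":
    # With TensorFlow installed, tensorflow.__version__ could be passed instead.
    for candidate in ("1.15.5", "2.0.0", "2.11.0"):
        print(candidate, is_supported_tensorflow(candidate))
```

A string that does not start with a numeric major version is treated as unsupported, which errs on the side of caution.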

The tagger is available through the following classes defined in `estnltk_neural.taggers`. Each class loads a neural model with a different configuration:

* `SoftmaxEmbTagSumTagger()`
* `SoftmaxEmbCatSumTagger()`
* `Seq2SeqEmbTagSumTagger()`
* `Seq2SeqEmbCatSumTagger()`

Note that models are not distributed with the estnltk\_neural package. You can download them in the following ways:
* If you create a new tagger instance (`SoftmaxEmbTagSumTagger()`, `Seq2SeqEmbTagSumTagger()`, etc.) and the model has not been downloaded yet, you will be prompted for permission to download it;
* Alternatively, you can pre-download models manually via the download function:

```python
from estnltk import download
# download model for SoftmaxEmbTagSumTagger
download('softmaxembtagsumtagger')
# download model for Seq2SeqEmbTagSumTagger
download('seq2seqembtagsumtagger')
...
```

## `SoftmaxEmbTagSumTagger`

```python
from estnltk import Text
from estnltk_neural.taggers import SoftmaxEmbTagSumTagger

text = Text("See on lause.")
text.tag_layer(['morph_analysis'])

tagger = SoftmaxEmbTagSumTagger('morph_softmax_emb_tag_sum')
tagger.tag(text)
```

```
This requires downloading resource 'neural_morph_softmax_emb_tag_sum_2019-08-23' (size: 354M). Proceed with downloading? [Y/n] y
Downloading neural_morph_softmax_emb_tag_sum_2019-08-23: 349031it [00:29, 11877.95it/s]
Unpacked resource into subfolder 'neural_morph_disamb/softmax_emb_tag_sum_2019-08-23/' of the resources dir.
Loaded analyses: 341 from file C:\Programmid\Miniconda3\envs\py37_estnltk_neural\lib\site-packages\estnltk\estnltk_resources\neural_morph_disamb\softmax_emb_tag_sum_2019-08-23\output\data\analysis.txt
```




| text |
|---|
| See on lause. |

| layer name | attributes | parent | enveloping | ambiguous | span count |
|---|---|---|---|---|---|
| sentences | | | words | False | 1 |
| tokens | | | | False | 4 |
| compound_tokens | type, normalized | | tokens | False | 0 |
| words | normalized_form | | | True | 4 |
| morph_analysis | normalized_text, lemma, root, root_tokens, ending, clitic, form, partofspeech | words | | True | 4 |
| morph_softmax_emb_tag_sum | morphtag, pos, form | words | | False | 4 |


Now the text object has a layer named `morph_softmax_emb_tag_sum`, which contains three attributes for every word: `morphtag` (the original tag predicted by the neural model), and `pos` and `form` (the morphtag converted into Vabamorf format).

```python
text['morph_softmax_emb_tag_sum']
```

| layer name | attributes | parent | enveloping | ambiguous | span count |
|---|---|---|---|---|---|
| morph_softmax_emb_tag_sum | morphtag, pos, form | words | | False | 4 |

| text | morphtag | pos | form |
|---|---|---|---|
| See | POS=P\|NUMBER=sg\|CASE=nom | P | sg n |
| on | POS=V\|VERB_TYPE=main\|MOOD=indic\|TENSE=pres\|PERSON=ps3\|NUMBER=sg\|VERB_PS=ps\|VERB_POLARITY=af | V | b |
| lause | POS=S\|NOUN_TYPE=com\|NUMBER=sg\|CASE=nom | S | sg n |
| . | POS=Z\|PUNCT_TYPE=Fst | Z | |
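
The `morphtag` values shown above look like `KEY=value` pairs joined by `|`. Assuming that this holds for all tags (it is inferred only from the sample output here, not from a documented specification), a small helper can split a tag into a dictionary for easier inspection. `parse_morphtag` is a hypothetical name and is not part of estnltk.

```python
def parse_morphtag(morphtag: str) -> dict:
    """Split a 'KEY=value|KEY=value' morphtag string into a dict.

    Assumes the pipe-separated KEY=value format seen in the sample output.
    """
    features = {}
    for pair in morphtag.split("|"):
        if not pair:
            continue  # skip empty segments, e.g. from an empty tag
        key, _, value = pair.partition("=")
        features[key] = value
    return features


features = parse_morphtag("POS=S|NOUN_TYPE=com|NUMBER=sg|CASE=nom")
print(features["POS"], features["CASE"])  # prints: S nom
```

A dictionary makes it straightforward to filter words by a single feature, for example all words whose `CASE` is `nom`.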


## `SoftmaxEmbCatSumTagger`

If you want to load a new tagger with a different neural model, you need to reset the previously loaded one with the `reset` method.

```python
from estnltk_neural.taggers import SoftmaxEmbCatSumTagger

tagger.reset()
tagger = SoftmaxEmbCatSumTagger('morph_softmax_emb_cat_sum')
tagger.tag(text)
text['morph_softmax_emb_cat_sum']
```




```
INFO:model.py:79: Initializing tf session
INFO:model.py:92: Reloading the latest trained model...
INFO:saver.py:1284: Restoring parameters from C:\Programmid\Miniconda3\envs\py37_estnltk_neural\lib\site-packages\estnltk\estnltk_resources\neural_morph_disamb\softmax_emb_cat_sum_2019-08-23\output\results\model.weights
```


| layer name | attributes | parent | enveloping | ambiguous | span count |
|---|---|---|---|---|---|
| morph_softmax_emb_cat_sum | morphtag, pos, form | words | | False | 4 |

| text | morphtag | pos | form |
|---|---|---|---|
| See | POS=P\|NUMBER=sg\|CASE=nom | P | sg n |
| on | POS=V\|VERB_TYPE=main\|MOOD=indic\|TENSE=pres\|PERSON=ps3\|NUMBER=sg\|VERB_PS=ps\|VERB_POLARITY=af | V | b |
| lause | POS=S\|NOUN_TYPE=com\|NUMBER=sg\|CASE=nom | S | sg n |
| . | POS=Z\|PUNCT_TYPE=Fst | Z | |


## `Seq2SeqEmbTagSumTagger`

```python
from estnltk_neural.taggers import Seq2SeqEmbTagSumTagger

tagger.reset()
tagger = Seq2SeqEmbTagSumTagger('morph_seq2seq_emb_tag_sum')
tagger.tag(text)
text['morph_seq2seq_emb_tag_sum']
```

```
Instructions for updating:
dim is deprecated, use axis instead
INFO:model.py:147: Initializing tf session
INFO:model.py:160: Reloading the latest trained model...
INFO:tf_logging.py:115: Restoring parameters from /home/paul/Projects/estnltk/estnltk/taggers/neural_morph/new_neural_morph/seq2seq_emb_tag_sum/output/results/model.weights
```


| layer name | attributes | parent | enveloping | ambiguous | span count |
|---|---|---|---|---|---|
| morph_seq2seq_emb_tag_sum | morphtag, pos, form | words | | False | 4 |

| text | morphtag | pos | form |
|---|---|---|---|
| See | POS=P\|NUMBER=sg\|CASE=nom | P | sg n |
| on | POS=V\|VERB_TYPE=main\|MOOD=indic\|TENSE=pres\|PERSON=ps3\|NUMBER=sg\|VERB_PS=ps\|VERB_POLARITY=af | V | b |
| lause | POS=S\|NOUN_TYPE=com\|NUMBER=sg\|CASE=nom | S | sg n |
| . | POS=Z\|PUNCT_TYPE=Fst | Z | |


## `Seq2SeqEmbCatSumTagger`

```python
from estnltk_neural.taggers import Seq2SeqEmbCatSumTagger

tagger.reset()
tagger = Seq2SeqEmbCatSumTagger('morph_seq2seq_emb_cat_sum')
tagger.tag(text)
text['morph_seq2seq_emb_cat_sum']
```

```
INFO:model.py:147: Initializing tf session
INFO:model.py:160: Reloading the latest trained model...
INFO:tf_logging.py:115: Restoring parameters from /home/paul/Projects/estnltk/estnltk/taggers/neural_morph/new_neural_morph/seq2seq_emb_cat_sum/output/results/model.weights
```


| layer name | attributes | parent | enveloping | ambiguous | span count |
|---|---|---|---|---|---|
| morph_seq2seq_emb_cat_sum | morphtag, pos, form | words | | False | 4 |

| text | morphtag | pos | form |
|---|---|---|---|
| See | POS=P\|NUMBER=sg\|CASE=nom | P | sg n |
| on | POS=V\|VERB_TYPE=main\|MOOD=indic\|TENSE=pres\|PERSON=ps3\|NUMBER=sg\|VERB_PS=ps\|VERB_POLARITY=af | V | b |
| lause | POS=S\|NOUN_TYPE=com\|NUMBER=sg\|CASE=nom | S | sg n |
| . | POS=Z\|PUNCT_TYPE=Fst | Z | |
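
The sections above repeat the same steps for every model: construct a tagger, tag the text, and reset before loading the next one. A minimal sketch of that pattern as a loop follows. `tag_with_each` is a hypothetical helper, not part of `estnltk_neural`. It assumes that each tagger class accepts the output layer name as its first argument (as in the examples above) and that calling `reset()` right after tagging is equivalent to calling it just before the next tagger is created.

```python
def tag_with_each(text, tagger_specs):
    """Apply several neural taggers to `text`, one at a time.

    `tagger_specs` is an iterable of (tagger_class, output_layer_name) pairs.
    Each tagger is reset after use so that the next model can be loaded.
    Returns the names of the layers that were tagged successfully.
    """
    added_layers = []
    for tagger_class, layer_name in tagger_specs:
        tagger = tagger_class(layer_name)
        try:
            tagger.tag(text)
            added_layers.append(layer_name)
        finally:
            # Release the loaded model even if tagging raised an error.
            tagger.reset()
    return added_layers


# Intended usage (requires estnltk_neural and the downloaded models):
#   from estnltk_neural.taggers import SoftmaxEmbTagSumTagger, Seq2SeqEmbTagSumTagger
#   tag_with_each(text, [
#       (SoftmaxEmbTagSumTagger, 'morph_softmax_emb_tag_sum'),
#       (Seq2SeqEmbTagSumTagger, 'morph_seq2seq_emb_tag_sum'),
#   ])
```

Resetting in a `finally` clause is a design choice of this sketch: it keeps a failed tagging run from leaving a model loaded and blocking the next tagger.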
