# Web-taggers

To run a web tagger, you need access to a running [EstNLTK web-tagger](https://github.com/estnltk/webtagger-service) service. Web services for some tools are publicly available on the TartuNLP server (`api.tartunlp.ai`); the first part of this tutorial uses them.

In [1]:
from estnltk import Text

## Currently available web-taggers

## BertEmbeddingsWebTagger

Tags [BERT embeddings](https://huggingface.co/tartuNLP/EstBERT) using the webservice of EstNLTK's [BertTagger](https://github.com/estnltk/estnltk/blob/version_1.6/tutorials/taggers/embeddings_tagger.ipynb).

In [2]:
from estnltk.taggers import BertEmbeddingsWebTagger
bert_embeddings_web_tagger = \
    BertEmbeddingsWebTagger(url='https://api.tartunlp.ai/estnltk/tagger/bert')
bert_embeddings_web_tagger

name,output layer,output attributes,input layers
BertEmbeddingsWebTagger,bert_embeddings,"('token', 'bert_embedding')","('sentences',)"

0,1
url,https://api.tartunlp.ai/estnltk/tagger/bert


In [3]:
bert_embeddings_web_tagger.about

'Tags BERT embeddings using EstNLTK 1.6.7beta webservice.'

In [4]:
bert_embeddings_web_tagger.status

'OK'

In [5]:
bert_embeddings_web_tagger.is_alive

True
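The `is_alive` check shown above can be used as a guard before sending any text to the service. The following is a minimal sketch: the helper name `tag_if_alive` is hypothetical (it is not part of EstNLTK), and it only assumes that the tagger object has the `is_alive` property and the `tag` method used in this tutorial.

```python
def tag_if_alive(web_tagger, text):
    """Tag `text` only if the web service reports that it is reachable.

    `web_tagger` is assumed to expose `is_alive` and `tag(text)`,
    as the web taggers in this tutorial do.
    """
    if not web_tagger.is_alive:
        # Fail early with a clear message instead of a low-level request error
        raise ConnectionError('The web-tagger service is not responding.')
    return web_tagger.tag(text)
```

With the objects defined above, the call would be `tag_if_alive(bert_embeddings_web_tagger, text)`.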

In [6]:
# Create a text and add required layers
text = Text('See on lause.')
text.tag_layer('sentences')

text
See on lause.

layer name,attributes,parent,enveloping,ambiguous,span count
sentences,,,words,False,1
tokens,,,,False,4
compound_tokens,"type, normalized",,tokens,False,0
words,normalized_form,,,True,4


In [7]:
# Add bert embeddings
bert_embeddings_web_tagger.tag(text)
text.bert_embeddings

layer name,attributes,parent,enveloping,ambiguous,span count
bert_embeddings,"token, bert_embedding",,,True,4

text,token,bert_embedding
See,see,"[0.25575265288352966, -0.05085913464426994, -0.1655101180076599, 0.2314714193344 ..., type: <class 'list'>, length: 3072"
on,on,"[0.2487320750951767, 0.10666783154010773, 0.05178193747997284, 0.398448318243026 ..., type: <class 'list'>, length: 3072"
lause,lause,"[-1.9421499967575073, 0.15984250605106354, -0.40699252486228943, -0.438520014286 ..., type: <class 'list'>, length: 3072"
.,.,"[-0.05141175910830498, -0.10687198489904404, -0.009526986628770828, -0.070307776 ..., type: <class 'list'>, length: 3072"


Notes: 

   * The webservice of `BertEmbeddingsWebTagger` enforces a request size limit: a `Text` object can contain at most 125 words. If your text is longer, use EstNLTK's functions `extract_sections` or `split_by` to split it into smaller `Text` objects before processing. For more information about splitting, see [https://github.com/estnltk/estnltk/blob/version_1.6/tutorials/system/layer_operations.ipynb](https://github.com/estnltk/estnltk/blob/version_1.6/tutorials/system/layer_operations.ipynb)
   
   
   * `BertTagger` uses a tokenization that diverges from EstNLTK's default tokenization. For details, see the tutorial: https://github.com/estnltk/estnltk/blob/version_1.6/tutorials/taggers/embeddings_tagger.ipynb
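One way to respect the 125-word limit is to group consecutive sentences greedily and cut the text at the group boundaries. The sketch below is an illustration, not EstNLTK's own method: `group_sentences` and `split_for_webservice` are hypothetical helper names, and the second function assumes that `extract_sections` can be imported from `estnltk.layer_operations`, that it accepts a list of `(start, end)` character spans, and that `len()` of a sentence span gives its word count. Check the layer operations tutorial for the exact signature in your EstNLTK version.

```python
from typing import List, Sequence, Tuple


def group_sentences(word_counts: Sequence[int],
                    max_words: int = 125) -> List[Tuple[int, int]]:
    """Greedily group consecutive sentences into chunks of at most `max_words` words.

    Returns (first_sentence_index, end_index_exclusive) pairs.
    A single sentence longer than `max_words` still gets its own chunk.
    """
    groups = []
    start, total = 0, 0
    for i, count in enumerate(word_counts):
        # Close the current chunk if adding this sentence would exceed the limit
        if i > start and total + count > max_words:
            groups.append((start, i))
            start, total = i, 0
        total += count
    if start < len(word_counts):
        groups.append((start, len(word_counts)))
    return groups


def split_for_webservice(text, max_words: int = 125):
    """Split a Text with a 'sentences' layer into smaller Text objects.

    Assumption: extract_sections(text, sections) takes character spans.
    """
    from estnltk.layer_operations import extract_sections  # imported lazily

    sentences = list(text.sentences)
    groups = group_sentences([len(s) for s in sentences], max_words)
    sections = [(sentences[a].start, sentences[b - 1].end) for a, b in groups]
    return extract_sections(text, sections)
```

Note that a sentence which alone exceeds the limit is not split further here, so the service may still reject it.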

## StanzaSyntaxWebTagger

Tags dependency syntactic analysis using the webservice of EstNLTK's `StanzaSyntaxTagger`.

In [8]:
from estnltk.taggers import StanzaSyntaxWebTagger
stanza_syntax_web_tagger = \
    StanzaSyntaxWebTagger(url='https://api.tartunlp.ai/estnltk/tagger/stanza_syntax')
stanza_syntax_web_tagger

name,output layer,output attributes,input layers
StanzaSyntaxWebTagger,stanza_syntax,"('id', 'lemma', 'upostag', 'xpostag', 'feats', 'head', 'deprel', 'deps', 'misc')","('words', 'sentences', 'morph_extended')"

0,1
url,https://api.tartunlp.ai/estnltk/tagger/stanza_syntax


In [9]:
# Create Text and add required input layers
from estnltk import Text
text = Text('Ilus suur karvane kass nurrus rohelisel diivanil.').tag_layer('morph_extended')
# Tag syntax with web_tagger
stanza_syntax_web_tagger.tag( text )
text.stanza_syntax

layer name,attributes,parent,enveloping,ambiguous,span count
stanza_syntax,"id, lemma, upostag, xpostag, feats, head, deprel, deps, misc",morph_extended,,False,8

text,id,lemma,upostag,xpostag,feats,head,deprel,deps,misc
Ilus,1,ilus,A,A,"{'nom': 'nom', 'pos': 'pos', 'sg': 'sg'}",4,amod,_,_
suur,2,suur,A,A,"{'nom': 'nom', 'pos': 'pos', 'sg': 'sg'}",4,amod,_,_
karvane,3,karvane,A,A,"{'nom': 'nom', 'pos': 'pos', 'sg': 'sg'}",4,amod,_,_
kass,4,kass,S,S,"{'com': 'com', 'nom': 'nom', 'sg': 'sg'}",5,nsubj,_,_
nurrus,5,nurruma,V,V,"{'af': 'af', 'aux': 'aux', 'impf': 'impf', 'indic': 'indic', 'ps': 'ps', 'ps3': 'ps3', 'sg': 'sg'}",0,root,_,_
rohelisel,6,roheline,A,A,"{'ad': 'ad', 'pos': 'pos', 'sg': 'sg'}",7,amod,_,_
diivanil,7,diivan,S,S,"{'ad': 'ad', 'com': 'com', 'sg': 'sg'}",5,obl,_,_
.,8,.,Z,Z,{},5,punct,_,_
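The tagger's summary table lists the input layers that must be present on the `Text` before calling `tag`. A small check can report missing layers up front. This is a sketch under assumptions: the helper name `missing_input_layers` is hypothetical, and it relies on the tagger exposing `input_layers` and on `text.layers` supporting membership tests by layer name.

```python
def missing_input_layers(tagger, text):
    """Return the names of the tagger's input layers that `text` does not have yet."""
    return [name for name in tagger.input_layers if name not in text.layers]
```

For example, `missing_input_layers(stanza_syntax_web_tagger, Text('Tere!'))` would be expected to list all three input layers, since a fresh `Text` has no layers yet.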


## StanzaSyntaxEnsembleWebTagger

Tags dependency syntactic analysis using the webservice of EstNLTK's `StanzaSyntaxEnsembleTagger`.

In [10]:
from estnltk.taggers import StanzaSyntaxEnsembleWebTagger
stanza_syntax_ensemble_web_tagger = \
    StanzaSyntaxEnsembleWebTagger(url='https://api.tartunlp.ai/estnltk/tagger/stanza_syntax_ensemble')
stanza_syntax_ensemble_web_tagger

name,output layer,output attributes,input layers
StanzaSyntaxEnsembleWebTagger,stanza_ensemble_syntax,"('id', 'lemma', 'upostag', 'xpostag', 'feats', 'head', 'deprel', 'deps', 'misc')","('words', 'sentences', 'morph_extended')"

0,1
url,https://api.tartunlp.ai/estnltk/tagger/stanza_syntax_ensemble


In [11]:
# Create Text and add required input layers
from estnltk import Text
text = Text('Ilus suur karvane kass nurrus rohelisel diivanil.').tag_layer('morph_extended')
# Tag syntax with web_tagger
stanza_syntax_ensemble_web_tagger.tag( text )
text.stanza_ensemble_syntax

layer name,attributes,parent,enveloping,ambiguous,span count
stanza_ensemble_syntax,"id, lemma, upostag, xpostag, feats, head, deprel, deps, misc",morph_extended,,False,8

text,id,lemma,upostag,xpostag,feats,head,deprel,deps,misc
Ilus,1,ilus,A,A,"{'nom': 'nom', 'pos': 'pos', 'sg': 'sg'}",4,amod,_,_
suur,2,suur,A,A,"{'nom': 'nom', 'pos': 'pos', 'sg': 'sg'}",4,amod,_,_
karvane,3,karvane,A,A,"{'nom': 'nom', 'pos': 'pos', 'sg': 'sg'}",4,amod,_,_
kass,4,kass,S,S,"{'com': 'com', 'nom': 'nom', 'sg': 'sg'}",5,nsubj,_,_
nurrus,5,nurruma,V,V,"{'af': 'af', 'impf': 'impf', 'indic': 'indic', 'mod': 'mod', 'ps': 'ps', 'ps3': 'ps3', 'sg': 'sg'}",0,root,_,_
rohelisel,6,roheline,A,A,"{'ad': 'ad', 'pos': 'pos', 'sg': 'sg'}",7,amod,_,_
diivanil,7,diivan,S,S,"{'ad': 'ad', 'com': 'com', 'sg': 'sg'}",5,obl,_,_
.,8,.,Z,Z,{},5,punct,_,_


Note: 

   * The webservices of `StanzaSyntaxWebTagger` and `StanzaSyntaxEnsembleWebTagger` enforce a request size limit: a `Text` object can contain at most 125 words. If your text is longer, use EstNLTK's functions `extract_sections` or `split_by` to split it into smaller `Text` objects before processing. For more information about splitting, see [https://github.com/estnltk/estnltk/blob/version_1.6/tutorials/system/layer_operations.ipynb](https://github.com/estnltk/estnltk/blob/version_1.6/tutorials/system/layer_operations.ipynb)

## Other web-taggers

Before using the following web taggers, you need to [set up a webservice that hosts them](https://github.com/estnltk/webtagger-service). The examples below assume that the service runs locally at `http://127.0.0.1:5000`.

## VabamorfWebTagger

See also documentation for `VabamorfTagger`.

In [2]:
from estnltk.taggers import VabamorfWebTagger
vabamorph_web_tagger = VabamorfWebTagger(url='http://127.0.0.1:5000/1.6.7beta/tag/morph_analysis')
vabamorph_web_tagger

name,output layer,output attributes,input layers
VabamorfWebTagger,morph_analysis,"('normalized_text', 'lemma', 'root', 'root_tokens', 'ending', 'clitic', 'form', 'partofspeech')","('words', 'sentences', 'compound_tokens')"

0,1
url,http://127.0.0.1:5000/1.6.7beta/tag/morph_analysis


In [3]:
vabamorph_web_tagger.about

'Tags morphological analysis using EstNLTK 1.6.7beta webservice.'

In [4]:
vabamorph_web_tagger.status

'OK'

In [5]:
vabamorph_web_tagger.is_alive

True

In [6]:
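# Create a text and add the required input layers
text = Text('See on lause.')
text.tag_layer(['sentences'])

# Add morphological analysis via the webservice
vabamorph_web_tagger.tag(text)

# Remove the 'tokens' layer from the text
text.pop_layer('tokens')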

text

text
See on lause.

layer name,attributes,parent,enveloping,ambiguous,span count
sentences,,,words,False,1
words,normalized_form,,,True,4
morph_analysis,"normalized_text, lemma, root, root_tokens, ending, clitic, form, partofspeech",words,,True,4


## SoftmaxEmbTagSumWebTagger

See also documentation for `SoftmaxEmbTagSumTagger`.

In [9]:
from estnltk.taggers import SoftmaxEmbTagSumWebTagger

text = Text("See on lause.")
text.tag_layer(['morph_analysis'])
        
softmax_emb_tag_sum_web_tagger = SoftmaxEmbTagSumWebTagger(
    url='http://127.0.0.1:5000/1.6.7beta/tag/morph_softmax_emb_tag_sum',
    output_layer='softmax_emb_tag_sum')
softmax_emb_tag_sum_web_tagger

name,output layer,output attributes,input layers
SoftmaxEmbTagSumWebTagger,softmax_emb_tag_sum,"('morphtag', 'pos', 'form')","('morph_analysis', 'sentences', 'words')"

0,1
url,http://127.0.0.1:5000/1.6.7beta/tag/morph_softmax_emb_tag_sum


In [10]:
softmax_emb_tag_sum_web_tagger.tag(text)
text.softmax_emb_tag_sum

layer name,attributes,parent,enveloping,ambiguous,span count
softmax_emb_tag_sum,"morphtag, pos, form",words,,False,4

text,morphtag,pos,form
See,POS=P|NUMBER=sg|CASE=nom,P,sg n
on,POS=V|VERB_TYPE=main|MOOD=indic|TENSE=pres|PERSON=ps3|NUMBER=sg|VERB_PS=ps|VERB_POLARITY=af,V,b
lause,POS=S|NOUN_TYPE=com|NUMBER=sg|CASE=nom,S,sg n
.,POS=Z|PUNCT_TYPE=Fst,Z,
