# SpanTagger

**SpanTagger** tags spans on an existing, pre-annotated layer of a **Text** object by matching the values of one attribute against a vocabulary. For example, we can tag by lemma if we specify 'morph_analysis' as input_layer and 'lemma' as input_attribute. We also need a vocabulary that lists the tokens to match together with the attributes we want to attach to each match. In this example, the vocabulary is saved in the file *span_vocabulary.csv*.

In [1]:
from estnltk import Text
from estnltk.taggers import SpanTagger
from estnltk.taggers import Vocabulary

In [2]:
vocabulary='span_vocabulary.csv'

In [3]:
vocabulary_file = 'span_vocabulary.csv'
Vocabulary(vocabulary=vocabulary_file, key='_token_')

_token_,value,_priority_
inimene,K,2
,I,3
päike,P,2
tundma,T,1


In [4]:
tagger = SpanTagger(output_layer='tagged_tokens',
                    input_layer='morph_analysis',
                    input_attribute='lemma',
                    vocabulary=vocabulary_file,
                    output_attributes=['value', '_priority_'], # default: None
                    key='_token_', # default: '_token_'
                    validator_attribute='_validator_', # default: '_validator_'
                    ambiguous=True # default: False
                    )

Let's create a **Text** object with the layer that we want to tag, and then tag the spans:

In [5]:
text = Text('Eestimaal tunnevad inimesed palju puudust päikesest ja energiast.').tag_layer(['morph_analysis'])

In [6]:
tagger.tag(text)
text.tagged_tokens

layer name,attributes,parent,enveloping,ambiguous,span count
tagged_tokens,"value, _priority_",morph_analysis,,True,3

text,value,_priority_
tunnevad,T,1
inimesed,K,2
,I,3
päikesest,P,2
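
The matching behind the output above can be illustrated with a minimal plain-Python sketch. This is not the EstNLTK implementation: the function name `tag_spans`, the dictionary layout, and the hand-written lemma list are assumptions made for illustration, and the sketch assumes exactly one lemma per word, whereas a real 'morph_analysis' layer may hold several analyses per word. It shows why *inimesed* yields two annotations (its lemma has two vocabulary records) inside a single span.

```python
# Conceptual sketch only: plain Python, not the EstNLTK implementation.

# Vocabulary rows from the table above, keyed by lemma. A key may map to
# several records, which is the case that ambiguous=True is meant to allow.
vocabulary = {
    'inimene': [{'value': 'K', '_priority_': 2},
                {'value': 'I', '_priority_': 3}],
    'päike': [{'value': 'P', '_priority_': 2}],
    'tundma': [{'value': 'T', '_priority_': 1}],
}

# Hand-written (word form, lemma) pairs standing in for the 'morph_analysis'
# layer. These lemmas are assumptions for illustration, not analyser output.
analysed_words = [
    ('Eestimaal', 'Eestimaa'),
    ('tunnevad', 'tundma'),
    ('inimesed', 'inimene'),
    ('palju', 'palju'),
    ('puudust', 'puudus'),
    ('päikesest', 'päike'),
    ('ja', 'ja'),
    ('energiast', 'energia'),
]


def tag_spans(words, vocab):
    """Return one (text, value, _priority_) row per matching vocabulary record."""
    rows = []
    for word, lemma in words:
        # Words whose lemma is not in the vocabulary produce no rows.
        for record in vocab.get(lemma, []):
            rows.append((word, record['value'], record['_priority_']))
    return rows


for row in tag_spans(analysed_words, vocabulary):
    print(row)
```

The sketch produces four rows for three distinct words, which corresponds to the span count of 3 and the four annotation rows shown in the layer output.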
