# Creating a tagger with enveloping spans

### Defining tagger properties

When using enveloping spans, the layer does not need the 'parent' parameter, since its spans wrap around the spans of the enveloped layer. Instead, the 'enveloping' parameter needs to be set when the layer is created in the '\_make_layer_template()' method. The 'enveloping' parameter names the layer whose spans will be enveloped.

An example of the '\_make_layer_template()' method:

    def _make_layer_template(self):
        layer = Layer(name=self.output_layer,
                      text_object=None,
                      attributes=self.output_attributes,
                      enveloping=self.input_layers[1],
                      ambiguous=False)
        return layer

### Creating enveloping spans in tagger

It is important to remember that enveloping spans are built from the **base spans** of the enveloped spans. The input can therefore be either a single base span or a list of base spans.
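To make the idea of base spans concrete, here is a minimal sketch in plain Python. It is a simplified model and not the EstNLTK API: a base span is assumed to be a `(start, end)` pair of character offsets, and the helpers `word_base_spans` and `envelop` are hypothetical names introduced only for this illustration.

```python
# Simplified plain-Python model of base spans; this is NOT the EstNLTK API.
from typing import List, Tuple

BaseSpan = Tuple[int, int]  # (start, end) character offsets of one enveloped span


def word_base_spans(text: str) -> List[BaseSpan]:
    """Locate whitespace-separated tokens and return their (start, end) offsets."""
    spans, pos = [], 0
    for token in text.split():
        start = text.index(token, pos)
        pos = start + len(token)
        spans.append((start, pos))
    return spans


def envelop(base_spans: List[BaseSpan]) -> Tuple[BaseSpan, ...]:
    """Group ordered base spans into one enveloping span (hypothetical helper)."""
    if not base_spans:
        raise ValueError("an enveloping span needs at least one base span")
    return tuple(base_spans)


text = "Võtete ajal õppis Eminem lugu pidama"
spans = word_base_spans(text)

phrase = envelop(spans[:3])
# The enveloping span keeps every base span, so the covered words stay recoverable.
covered_words = [text[start:end] for start, end in phrase]
# Its outer boundaries run from the first base span's start to the last one's end.
covered_text = text[phrase[0][0]:phrase[-1][1]]
print(covered_words)
print(covered_text)
```

The point of the model is that an enveloping span does not store its own text offsets; it stores the base spans it wraps, and everything else is derived from them.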

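Below is an example of a small entity tagger that creates enveloping spans on top of the morph_analysis layer. It always creates two spans: the first envelops the first three words of the text and the second envelops the last three words. The base spans are taken from the stanza_syntax layer, whose spans descend from morph_analysis through the 'parent' relation and therefore cover the same words.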

In [1]:
from estnltk import EnvelopingBaseSpan, Layer, Text
from estnltk.taggers import Tagger

class MinimalEntityTagger(Tagger):
    """ Minimal entity tagger example."""
    conf_param = ['input_morph_layer', 'stanza_layer']
    output_layer = 'entity'
    output_attributes = ()
    input_layers = ()
    
    def __init__(self,
                input_morph_layer="morph_analysis",
                stanza_layer = "stanza_syntax",
                sentences_layer='sentences',
                words_layer='words',
                ):
        import stanza
        self.input_morph_layer=input_morph_layer
        self.stanza_layer=stanza_layer
        

    def _make_layer_template(self):
        # envelop the configured morph layer instead of a hard-coded layer name
        return Layer(name=self.output_layer, text_object=None, enveloping=self.input_morph_layer)
    
    def _make_layer(self, text, layers, status=None):
        layer = self._make_layer_template()
        layer.text_object = text
        
        # create the list of base spans
        base_spans = []
        for span in layers[self.stanza_layer]:
            base_spans.append(span.base_span)
            
        # create a new enveloping span with the first three words
        # when using 'base_spans' as input, EnvelopingBaseSpan is used
        new_span1 = EnvelopingBaseSpan(base_spans[:3]) 
        new_span2 = EnvelopingBaseSpan(base_spans[len(base_spans)-3:]) 
        
        # add the span to the layer
        layer.add_annotation(new_span1)
        layer.add_annotation(new_span2)
        
        
        return layer

minimal_tagger = MinimalEntityTagger()
minimal_tagger

| name | output layer | output attributes | input layers |
|---|---|---|---|
| MinimalEntityTagger | entity | () | () |

| parameter | value |
|---|---|
| input_morph_layer | morph_analysis |
| stanza_layer | stanza_syntax |


Create the stanza_syntax layer for the example.

In [2]:
from estnltk_neural.taggers.syntax.stanza_tagger.stanza_tagger import StanzaSyntaxTagger

model_path = r"...\estnltk\taggers\syntax\stanza_tagger\stanza_resources"
input_type="morph_extended"
stanza_tagger = StanzaSyntaxTagger(input_type=input_type, input_morph_layer=input_type, 
                                   add_parent_and_children=True, resources_path=model_path)

In [3]:
txt = Text("Võtete ajal õppis Eminem lugu pidama näitlejatest , kes suhtuvad oma töösse sama tõsiselt kui tema muusikasse .")
txt.tag_layer('morph_extended')
stanza_tagger.tag( txt ) 

| text |
|---|
| Võtete ajal õppis Eminem lugu pidama näitlejatest , kes suhtuvad oma töösse sama tõsiselt kui tema muusikasse . |

| layer name | attributes | parent | enveloping | ambiguous | span count |
|---|---|---|---|---|---|
| sentences | | | words | False | 1 |
| tokens | | | | False | 18 |
| compound_tokens | type, normalized | | tokens | False | 0 |
| words | normalized_form | | | True | 18 |
| morph_analysis | normalized_text, lemma, root, root_tokens, ending, clitic, form, partofspeech | words | | True | 18 |
| morph_extended | normalized_text, lemma, root, root_tokens, ending, clitic, form, partofspeech, punctuation_type, pronoun_type, letter_case, fin, verb_extension_suffix, subcat | morph_analysis | | True | 18 |
| stanza_syntax | id, lemma, upostag, xpostag, feats, head, deprel, deps, misc, parent_span, children | morph_extended | | False | 18 |


Tag the text with the minimal entity tagger.

In [4]:
minimal_tagger.tag( txt )
txt.entity

| layer name | attributes | parent | enveloping | ambiguous | span count |
|---|---|---|---|---|---|
| entity | | | morph_analysis | False | 2 |

| text |
|---|
| ['Võtete', 'ajal', 'õppis'] |
| ['tema', 'muusikasse', '.'] |


Additional attributes can be given to a span by defining the attributes in \_\_init\_\_() and passing their values as keyword arguments when the span is added to the layer: layer.add_annotation(new_span, **output_attributes).
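One detail of MinimalEntityTagger worth checking before reusing it: the two spans are selected with plain list slices, so for a non-empty text with fewer than six words the slices share at least one base span. The sketch below reproduces the two slice expressions on stand-in `(start, end)` tuples rather than EstNLTK objects; the helper name `first_and_last_three` is hypothetical. How a layer reacts to overlapping or repeated spans is not covered here, so a real tagger may want to check the number of base spans first.

```python
# Stand-in base spans as (start, end) tuples; not EstNLTK objects.
def first_and_last_three(base_spans):
    """Reproduce the two slices taken in the example tagger's _make_layer()."""
    return base_spans[:3], base_spans[len(base_spans) - 3:]


six_words = [(0, 2), (3, 5), (6, 8), (9, 11), (12, 14), (15, 17)]
four_words = six_words[:4]
two_words = six_words[:2]

first, last = first_and_last_three(six_words)
shared_six = set(first) & set(last)      # empty: the two spans are disjoint

first, last = first_and_last_three(four_words)
shared_four = set(first) & set(last)     # the two middle words land in both spans

first, last = first_and_last_three(two_words)
# len(base_spans) - 3 is -1 here, so the second slice holds only the last word
last_for_two = last

print(shared_six, sorted(shared_four), last_for_two)
```

With six or more words the spans never overlap, which is the situation in the 18-word example sentence used above.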