## <span style="color:purple">Using SyntaxIgnoreTagger and SyntaxIgnoreCutter in pre-processing for syntactic analysis</span>

Some parts of a text may be difficult to analyse syntactically. For instance, enumerations of sports results and content in parentheses (references and short remarks) often exhibit very little linguistic structure suitable for analysis. You may therefore want to skip syntactic analysis on such parts of the text.

To detect parts of a text that should be ignored by syntactic analysis, EstNLTK provides a dedicated tagger called SyntaxIgnoreTagger.

After ignorable parts have been detected, you can use SyntaxIgnoreCutter to create a new Text object without these parts, and apply syntactic analysis to it.
If needed, the resulting syntactic analysis layer can also be carried over to the original text.

## SyntaxIgnoreTagger 

In [1]:
from estnltk.taggers.standard.syntax.preprocessing.syntax_ignore_tagger import SyntaxIgnoreTagger
syntax_ignore_tagger = SyntaxIgnoreTagger()
syntax_ignore_tagger

name,output layer,output attributes,input layers
SyntaxIgnoreTagger,syntax_ignore,"('type',)","('words', 'sentences')"

0,1
allow_loose_match,True
ignore_parenthesized_num,True
ignore_parenthesized_num_greedy,True
ignore_parenthesized_ref,True
ignore_parenthesized_title_words,True
ignore_parenthesized_short_char_sequences,True
ignore_consecutive_parenthesized_sentences,True
ignore_consecutive_enum_ucase_num_sentences,True
ignore_sentences_consisting_of_numbers,True
ignore_sentences_starting_with_time,True


SyntaxIgnoreTagger requires that the 'words' and 'sentences' layers have been annotated in the text. It uses these layers to detect snippets of text (not necessarily full sentences) that should be ignored during syntactic analysis. The detection patterns are partly based on the patterns used in the pre-processing modules of EstSyntax, available [here](https://github.com/kristiinavaik/ettenten-eeltootlus) and [here](https://github.com/EstSyntax/preprocessing-module).

Similar detection patterns have been grouped together. When the tagger is initialized, flags can be used to switch these groups off (by default, all flags are switched on). Below, we briefly introduce the flags and the types of ignorable text snippets that the corresponding patterns aim to extract.

### Patterns for detecting ignore content inside sentences

#### Flag `ignore_brackets`

Content inside square brackets will be ignored.

In [2]:
# Create text and preprocess
from estnltk.text import Text
text = Text('Nurksulgudes tuuakse materjali viitekirje järjekorranumber kirjanduse loetelus ja leheküljed , nt [9: 5] või [9 lk 5], aga internetimaterjalil lihtsalt viitekirje, nt [7]')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_brackets=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,3

text,type
"['[', '9', ':', '5', ']']",brackets_ref
"['[', '9', 'lk', '5', ']']",brackets_ref
"['[', '7', ']']",brackets_ref


#### Flag `ignore_parenthesized_ref`

Parenthesized content that looks like a reference (e.g. it contains titlecase words and date information) will be ignored.

In [3]:
# Create text and preprocess
from estnltk.text import Text
text = Text('Tutvustamisele tulevad Jan Kausi romaan “Koju” (Tuum 2012) ning Ülo Pikkovi romaan “Vana prints” (Varrak 2012). Temaatikaga seondub veel teinegi äsja Postimehes ilmunud jutt (Priit Pullerits «Džiibi kaitseks», PM 30.07.2010).')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_parenthesized_ref=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,3

text,type
"['(', 'Tuum', '2012', ')']",parentheses_ref_year
"['(', 'Varrak', '2012', ')']",parentheses_ref_year
"['(', 'Priit', 'Pullerits', '«', 'Džiibi', 'kaitseks', '»', ',', 'PM', '30.07.2010', ')']",parentheses_ref_year


#### Flag `ignore_parenthesized_short_char_sequences`

Parenthesized short sequences of tokens (up to 4 tokens), each of which is also short (up to 4 characters), will be ignored.
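
The length constraints above can be illustrated with a small stand-alone sketch. This is not the tagger's actual pattern: the helper `is_short_parenthesized` and its parameters are hypothetical and only mirror the limits stated in the description.

```python
def is_short_parenthesized(tokens, max_tokens=4, max_chars=4):
    """Return True if tokens look like '(' + a few short tokens + ')'.

    Hypothetical helper: the limits mirror the description in the text,
    not the internal patterns of SyntaxIgnoreTagger.
    """
    # The sequence must be wrapped in parentheses and must not be empty inside.
    if len(tokens) < 3 or tokens[0] != '(' or tokens[-1] != ')':
        return False
    inner = tokens[1:-1]
    # At most max_tokens tokens, each at most max_chars characters long.
    return len(inner) <= max_tokens and all(len(t) <= max_chars for t in inner)


print(is_short_parenthesized(['(', '-66', 'kg', ')']))           # True
print(is_short_parenthesized(['(', 'Priit', 'Pullerits', ')']))  # False
```

The real tagger is applied to a Text object, as in the next cell.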

In [4]:
# Create text and preprocess
from estnltk.text import Text
text = Text('Eesti judokate võistlus jäi laupäeval lühikeseks , nii Joel Rothberg ( -66 kg ) kui ka Renee Villemson ( -73 kg ) võidurõõmu maitsta ei saanud .')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_parenthesized_short_char_sequences=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,2

text,type
"['(', '-66', 'kg', ')']",parentheses_1to3
"['(', '-73', 'kg', ')']",parentheses_1to3


#### Flag `ignore_parenthesized_title_words`

One or two parenthesized titlecase words (which may be comma-separated) will be ignored.

In [5]:
# Create text and preprocess
from estnltk.text import Text
text = Text('Neidude 5 km klassikat võitis Lina Andersson ( Rootsi ) Pirjo Mannineni ( Soome ) ja Karin Holmbergi ( Rootsi ) ees .')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_parenthesized_title_words=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,3

text,type
"['(', 'Rootsi', ')']",parentheses_title_words
"['(', 'Soome', ')']",parentheses_title_words
"['(', 'Rootsi', ')']",parentheses_title_words


#### Flag `ignore_parenthesized_num`

Parenthesized numerics (such as dates, date ranges, number sequences) will be ignored.

In [6]:
# Create text and preprocess
from estnltk.text import Text
text = Text('Klubi sai kuus korda Inglismaa meistriks (1976, 1977, 1979, 1980, 1982, 1983). Tallinna ( 21.-22. mai ) , Haapsalu ( 2.-3. juuli ) ja Liivimaa ( 30.-31. juuli ) rallidel on vähemalt see probleem lahendatud .')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_parenthesized_num=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,4

text,type
"['(', '1976', ',', '1977', ',', '1979', ',', '1980', ',', '1982', ',', '1983', ')']",parentheses_num
"['(', '21.', '-', '22.', 'mai', ')']",parentheses_num_range
"['(', '2.', '-', '3.', 'juuli', ')']",parentheses_num_range
"['(', '30.', '-', '31.', 'juuli', ')']",parentheses_num_range


#### Flag `ignore_parenthesized_num_greedy`

Applies greedy patterns for detecting parenthesized numeric content: if the parentheses contain at least one number, but do not contain at least 3 consecutive lowercase words, then the whole content inside the parentheses will be marked as ignorable.

In [7]:
# Create text and preprocess
from estnltk.text import Text
text = Text('Näited: A (300 000 000 m/sek), B ( naised 5 km , mehed 7,5 km ) ning C (meie puhul 165/80) .')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_parenthesized_num_greedy=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,3

text,type
"['(', '300 000 000', 'm/sek', ')']",parentheses_num_start_uncategorized
"['(', 'naised', '5', 'km', ',', 'mehed', '7,5', 'km', ')']",parentheses_num_mid_uncategorized
"['(', 'meie', 'puhul', '165', '/', '80', ')']",parentheses_num_end_uncategorized


Note that this pattern group also covers content detected by some other `ignore_parenthesized_*` patterns. So, if you want to turn other patterns off, you may also want to turn off this pattern.
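
The greedy rule described above can be approximated in plain Python. The sketch below is only an illustration under simplifying assumptions: it works on already tokenized content between the parentheses, and the function names `has_lowercase_run` and `greedy_ignore` are hypothetical, not part of EstNLTK.

```python
def has_lowercase_run(tokens, run_length=3):
    """True if tokens contain run_length consecutive all-lowercase alphabetic words."""
    run = 0
    for tok in tokens:
        if tok.isalpha() and tok.islower():
            run += 1
            if run >= run_length:
                return True
        else:
            # Numbers, punctuation and capitalized words break the run.
            run = 0
    return False


def greedy_ignore(inner_tokens):
    """Ignore the content if it has a number but no run of 3 lowercase words."""
    has_number = any(ch.isdigit() for tok in inner_tokens for ch in tok)
    return has_number and not has_lowercase_run(inner_tokens)


print(greedy_ignore(['naised', '5', 'km', ',', 'mehed', '7,5', 'km']))  # True
print(greedy_ignore(['vt', 'ka', 'lisa', '2']))                         # False
```

Because any parenthesized content with a number and little lowercase text qualifies, this rule overlaps with the more specific numeric patterns, which is why it may need to be switched off together with them.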

#### Flag `allow_loose_match`

If `True` (default setting), then an ignore text snippet may consume words without matching exactly with their boundaries (e.g. ignore snippet's start does not have to match word's start). If `False`, then an ignore text snippet must match exactly with word boundaries: it must start where a word starts, and end where a word ends.
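
The difference between the two settings can be pictured with character offsets. The following is a minimal sketch, not EstNLTK code: `words_covered_by_snippet` is a hypothetical helper, and discarding a non-aligned snippet in strict mode is an assumption made for the illustration.

```python
def words_covered_by_snippet(word_spans, start, end, allow_loose_match=True):
    """Return the word spans consumed by a character snippet [start, end).

    word_spans is a list of (start, end) character offsets, in text order.
    """
    # Every word that overlaps the snippet is a candidate.
    covered = [(s, e) for s, e in word_spans if s < end and e > start]
    if not covered:
        return []
    if not allow_loose_match:
        # Strict mode: the snippet must begin at a word start and end at a word end.
        exact = covered[0][0] == start and covered[-1][1] == end
        if not exact:
            return []  # assumption: a non-aligned snippet is dropped
    return covered


# Words of 'USA-Jana Novotna': 'USA-Jana' at (0, 8), 'Novotna' at (9, 16)
words = [(0, 8), (9, 16)]
# A snippet covering 'Jana Novotna' starts in the middle of the first word.
print(words_covered_by_snippet(words, 4, 16, allow_loose_match=True))   # [(0, 8), (9, 16)]
print(words_covered_by_snippet(words, 4, 16, allow_loose_match=False))  # []
```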

### Patterns for ignoring full sentences

#### Flag `ignore_sentences_starting_with_time`

Sentences starting with a date range (e.g. a time schedule of a seminar, or a TV program) will be ignored.

In [8]:
# Create text and preprocess
from estnltk.text import Text
text = Text('''
12.05 - 12.35 "Õnne 13" (1. osa)
12.35 - 13.05 "Õnne 13" (1. osa kordus)
13.05 - 13.35 "Õnne 13" (2. osa)
''')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_sentences_starting_with_time=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,1

text,type
"['12.05', '-', '12.35', '""', 'Õnne', '13', '""', '(', '1.', 'osa', ')', '12.35', ..., type: <class 'list'>, length: 34",sentence_starts_with_time


#### Flag `ignore_sentences_with_comma_separated_num_name_lists`

Sentences containing a comma-separated list of titlecase words / numbers (such as sports results, player/country listings, game scores, etc.) will be ignored.

In [9]:
# Create text and preprocess
from estnltk.text import Text
text = Text('''
Veerandfinaalid .
Lindsay Davenport ( 6 ) , USA-Jana Novotna ( 3 ) , Tšehhi 6 : 2 , 4 : 6 , 7 : 6 , Martina Hingis ( 1 ) , Šveits-Arantxa Sanchez Vicario ( 10 ) , Hispaania 6 : 3 , 6 : 2 .
Paarismängu veerandfinaalid .
Gigi Fernandez , USA/Nataša Zvereva , Valgevene- Alexandra Fusai/Nathalie Tauziat , Prantsusmaa 4 : 6 , 6 : 2 , 6 : 2 , Nicole Arendt , USA/Manon Bollegraf , Holland-Ruxandra Dragomir , Rumeenia/Iva Majoli , Horvaatia 6 : 3 , 3 : 6 , 6 : 4 .
''')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_sentences_with_comma_separated_num_name_lists=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,2

text,type
"['Lindsay', 'Davenport', '(', '6', ')', ',', 'USA-Jana', 'Novotna', '(', '3', ') ..., type: <class 'list'>, length: 40",sentence_with_comma_separated_list
"['Gigi', 'Fernandez', ',', 'USA', '/', 'Nataša', 'Zvereva', ',', 'Valgevene-', ' ..., type: <class 'list'>, length: 48",sentence_with_comma_separated_list


#### Flag `ignore_sentences_consisting_of_numbers`

Detects sentences that contain one or more numbers, contain no letters, and do not end with '!' or '?'.
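
As a rough illustration of this rule, the check can be written as a plain string test. This is a sketch under the assumption that the sentence is available as a single string; `consists_of_numbers` is a hypothetical name, and the tagger's own pattern may differ in details.

```python
def consists_of_numbers(sentence):
    """True if the sentence has digits, no letters, and no final '!' or '?'."""
    stripped = sentence.strip()
    if not stripped or stripped[-1] in '!?':
        return False
    has_digit = any(ch.isdigit() for ch in stripped)
    has_letter = any(ch.isalpha() for ch in stripped)
    return has_digit and not has_letter


print(consists_of_numbers('6 : 3 , 3 : 6 , 6 : 4 .'))  # True
print(consists_of_numbers('Seis 1 : 0 .'))             # False
print(consists_of_numbers('12 !'))                     # False
```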

### Patterns for ignoring groups of consecutive sentences

#### Flag `ignore_consecutive_parenthesized_sentences`

If consecutive sentences all contain parenthesized content that is already ignored, and each of these sentences contains fewer than 3 consecutive lowercase words, then these sentences likely represent enumerations (e.g. sports results) which can be safely ignored.

In [10]:
# Create text and preprocess
from estnltk.text import Text
text = Text('''Eile õhtul sõidetud avakatsel sai Markko Märtin ( Subaru , pildil ) viienda aja .
Tulemused :
1. Tommi Mäkinen ( FIN ) Mitsubishi - 3.46 , 9
2. Marcus Grönholm ( FIN ) Peugeot +1,0
3. Harri Rovanperä ( FIN ) Peugeot +4,1
4. Carlos Sainz ( ESP ) Ford +6,0''')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_consecutive_parenthesized_sentences=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,5

text,type
"['Tulemused', ':', '1.']",consecutive_enum_ucase_sentences
"['Tommi', 'Mäkinen', '(', 'FIN', ')', 'Mitsubishi', '-', '3.46 , 9', '2.']",consecutive_parenthesized_sentences
"['Marcus', 'Grönholm', '(', 'FIN', ')', 'Peugeot', '+1,0', '3.']",consecutive_parenthesized_sentences
"['Harri', 'Rovanperä', '(', 'FIN', ')', 'Peugeot', '+4,1', '4.']",consecutive_parenthesized_sentences
"['Carlos', 'Sainz', '(', 'ESP', ')', 'Ford', '+6,0']",consecutive_parenthesized_sentences


#### Flag `ignore_consecutive_enum_ucase_num_sentences`

Detects sentences that: 1) start with an uppercase letter, or an ordinal number followed by an uppercase letter, or an ordinal number; 2) contain at least one number; 3) do not contain 3 consecutive lowercase words; 4) are part of at least 4 consecutive sentences that have the same properties (1, 2 and 3).
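
The four conditions can be sketched in plain Python as follows. This is a simplified illustration, not the tagger's implementation: it assumes the sentences are already given as a list of strings (one enumeration item per sentence), and the names `looks_like_enum_item` and `enum_sentence_groups` are hypothetical.

```python
import re


def looks_like_enum_item(sentence):
    """Check conditions 1-3 for a single sentence."""
    # 1) Optional ordinal ("1.") and the first character after it.
    m = re.match(r'\s*(\d+\.)?\s*(\S?)', sentence)
    starts_ok = bool(m.group(1)) or m.group(2).isupper()
    # 2) At least one digit somewhere in the sentence.
    has_number = any(ch.isdigit() for ch in sentence)
    # 3) Longest run of consecutive lowercase words must stay below 3.
    run = longest = 0
    for word in sentence.split():
        run = run + 1 if (word.isalpha() and word.islower()) else 0
        longest = max(longest, run)
    return starts_ok and has_number and longest < 3


def enum_sentence_groups(sentences, min_group=4):
    """Condition 4: return index groups of >= min_group consecutive matches."""
    groups, current = [], []
    for i, sentence in enumerate(sentences):
        if looks_like_enum_item(sentence):
            current.append(i)
        else:
            if len(current) >= min_group:
                groups.append(current)
            current = []
    if len(current) >= min_group:
        groups.append(current)
    return groups


sentences = [
    'Maakonniti summeerides on tulumaksu laekumise viis esimest :',
    '1. Harjumaa 2792 ,',
    '2. Hiiumaa 2119 ,',
    '3. Tartumaa 2081 ,',
    '4. Läänemaa 1933 ,',
    '5. Pärnumaa 1903.',
]
print(enum_sentence_groups(sentences))  # [[1, 2, 3, 4, 5]]
```

Note that the tagger relies on EstNLTK's own sentence segmentation, so the sentence boundaries in its output (see the example below) can differ from the line-based split assumed here.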

In [11]:
# Create text and preprocess
from estnltk.text import Text
text = Text('''Maakonniti summeerides on tulumaksu laekumise viis esimest :
1. Harjumaa 2792 ,
2. Hiiumaa 2119 ,
3. Tartumaa 2081 ,
4. Läänemaa 1933 ,
5. Pärnumaa 1903.''')
text.tag_layer(['words', 'sentences'])

# Apply syntax_ignore_tagger
syntax_ignore_tagger = SyntaxIgnoreTagger(ignore_consecutive_enum_ucase_num_sentences=True)
syntax_ignore_tagger.tag(text)

# Examine results
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,5

text,type
"['Harjumaa', '2792', ',', '2.']",consecutive_enum_ucase_sentences
"['Hiiumaa', '2119', ',', '3.']",consecutive_enum_ucase_sentences
"['Tartumaa', '2081', ',', '4.']",consecutive_enum_ucase_sentences
"['Läänemaa', '1933', ',', '5.']",consecutive_enum_ucase_sentences
"['Pärnumaa', '1903.']",consecutive_enum_ucase_sentences


## SyntaxIgnoreCutter

SyntaxIgnoreCutter cuts the input Text object into a smaller Text by leaving out all spans from the syntax_ignore layer. 

Usage example:

In [12]:
from estnltk.taggers.standard.syntax.preprocessing.syntax_ignore_tagger import SyntaxIgnoreTagger
from estnltk.taggers.standard.syntax.preprocessing.syntax_ignore_cutter import SyntaxIgnoreCutter
syntax_ignore_tagger = SyntaxIgnoreTagger()
syntax_ignore_cutter = SyntaxIgnoreCutter()

In [13]:
# Create input text
from estnltk import Text
text = Text('Need seminarid on toimunud Tartus 6 korda (1997–2000, 2002, 2005). '
            'Tutvustamisele tulevad Jan Kausi romaan “Koju” (Tuum 2012) ning Ülo Pikkovi romaan “Vana prints” (Varrak 2012).')
text.tag_layer('sentences')
# Detect ignorable parts
syntax_ignore_tagger.tag(text)
text.syntax_ignore

layer name,attributes,parent,enveloping,ambiguous,span count
syntax_ignore,type,,words,False,3

text,type
"['(', '1997', '–', '2000', ',', '2002', ',', '2005', ')']",parentheses_num_range
"['(', 'Tuum', '2012', ')']",parentheses_ref_year
"['(', 'Varrak', '2012', ')']",parentheses_ref_year


In [14]:
# Remove ignorable parts
cut_text = syntax_ignore_cutter.cut(text)
cut_text

text
Need seminarid on toimunud Tartus 6 korda . Tutvustamisele tulevad Jan Kausi romaan “Koju” ning Ülo Pikkovi romaan “Vana prints” .

layer name,attributes,parent,enveloping,ambiguous,span count
words,"original_start, original_end, original_index",,,False,25


Note that the cut text has a words layer, which keeps words' locations in the original text (`original_start, original_end, original_index`).
This information can be used to trace which words from the original text were analysed, and is required for carrying syntactic analysis layer over to the original text (see the next section for details).
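
To make the role of these attributes concrete, here is a minimal plain-Python sketch of the idea of cutting out character spans while remembering where each remaining word came from. It is not SyntaxIgnoreCutter's code: `cut_ignored` is a hypothetical function, and words and ignored spans are represented simply as `(start, end)` character offsets.

```python
def cut_ignored(raw_text, word_spans, ignore_spans):
    """Remove ignore_spans from raw_text and re-locate the remaining words."""
    ignore_spans = sorted(ignore_spans)

    # 1) Build the shorter text from the segments between the ignored spans.
    pieces, cursor = [], 0
    for start, end in ignore_spans:
        pieces.append(raw_text[cursor:start])
        cursor = end
    pieces.append(raw_text[cursor:])
    cut = ''.join(pieces)

    # 2) Shift each kept word left by the number of characters removed before it.
    cut_words = []
    for index, (start, end) in enumerate(word_spans):
        if any(s <= start and end <= e for s, e in ignore_spans):
            continue  # the word lies inside an ignored span
        shift = sum(e - s for s, e in ignore_spans if e <= start)
        cut_words.append({'start': start - shift, 'end': end - shift,
                          'original_start': start, 'original_end': end,
                          'original_index': index})
    return cut, cut_words


raw = 'korda (1997).'
words = [(0, 5), (6, 7), (7, 11), (11, 12), (12, 13)]  # korda ( 1997 ) .
cut, cut_words = cut_ignored(raw, words, [(6, 12)])
print(cut)                             # korda .
print(cut_words[1]['original_index'])  # 4
```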

Now, we can **apply syntactic analysis to the cut text**.
We use the default MaltParser-based analysis (which is described [here](03_syntactic_analysis_with_maltparser.ipynb)):

In [15]:
cut_text.tag_layer('maltparser_syntax')
cut_text['maltparser_syntax']

layer name,attributes,parent,enveloping,ambiguous,span count
maltparser_syntax,"id, lemma, upostag, xpostag, feats, head, deprel, deps, misc, parent_span, children",,,False,25

text,id,lemma,upostag,xpostag,feats,head,deprel,deps,misc,parent_span,children
Need,1,see,P,P,"{'pl': '', 'n': ''}",2,det,,,"Span('seminarid', [{'id': 2, 'lemma': 'seminar', 'upostag': 'S', 'xpostag': 'S', ..., type: <class 'estnltk_core.layer.span.Span'>",()
seminarid,2,seminar,S,S,"{'pl': '', 'n': ''}",4,nsubj,,,"Span('toimunud', [{'id': 4, 'lemma': 'toimuma', 'upostag': 'V', 'xpostag': 'V', ..., type: <class 'estnltk_core.layer.span.Span'>","(""Span('Need', [{'id': 1, 'lemma': 'see', 'upostag': 'P', 'xpostag': 'P', 'feats ..., type: <class 'tuple'>, length: 1"
on,3,olema,V,V,{'vad': ''},4,aux,,,"Span('toimunud', [{'id': 4, 'lemma': 'toimuma', 'upostag': 'V', 'xpostag': 'V', ..., type: <class 'estnltk_core.layer.span.Span'>",()
toimunud,4,toimuma,V,V,{'nud': ''},0,root,,,,"(""Span('seminarid', [{'id': 2, 'lemma': 'seminar', 'upostag': 'S', 'xpostag': 'S ..., type: <class 'tuple'>, length: 5"
Tartus,5,Tartu,H,H,"{'sg': '', 'in': ''}",4,obl,,,"Span('toimunud', [{'id': 4, 'lemma': 'toimuma', 'upostag': 'V', 'xpostag': 'V', ..., type: <class 'estnltk_core.layer.span.Span'>",()
6,6,6,N,N,{'?': ''},7,nummod,,,"Span('korda', [{'id': 7, 'lemma': 'kord', 'upostag': 'S', 'xpostag': 'S', 'feats ..., type: <class 'estnltk_core.layer.span.Span'>",()
korda,7,kord,S,S,"{'sg': '', 'p': ''}",4,obl,,,"Span('toimunud', [{'id': 4, 'lemma': 'toimuma', 'upostag': 'V', 'xpostag': 'V', ..., type: <class 'estnltk_core.layer.span.Span'>","(""Span('6', [{'id': 6, 'lemma': '6', 'upostag': 'N', 'xpostag': 'N', 'feats': {' ..., type: <class 'tuple'>, length: 1"
.,8,.,Z,Z,,4,punct,,,"Span('toimunud', [{'id': 4, 'lemma': 'toimuma', 'upostag': 'V', 'xpostag': 'V', ..., type: <class 'estnltk_core.layer.span.Span'>",()
Tutvustamisele,1,tutvustamine,S,S,"{'sg': '', 'all': ''}",2,obl,,,"Span('tulevad', [{'id': 2, 'lemma': 'tulema', 'upostag': 'V', 'xpostag': 'V', 'f ..., type: <class 'estnltk_core.layer.span.Span'>",()
tulevad,2,tulema,V,V,{'vad': ''},0,root,,,,"(""Span('Tutvustamisele', [{'id': 1, 'lemma': 'tutvustamine', 'upostag': 'S', 'xp ..., type: <class 'tuple'>, length: 4"


### Carrying syntactic analysis over to the original text

Once you've analysed the cut text syntactically, you can also carry the syntactic analysis over to the original text.

Example:

In [16]:
from estnltk.taggers.standard.syntax.preprocessing.syntax_ignore_cutter import add_syntax_layer_from_cut_text

# func signature hint: original_text, cut_text, syntax_layer_name, add_empty_spans=True
add_syntax_layer_from_cut_text( text, cut_text, 'maltparser_syntax' )

text
"Need seminarid on toimunud Tartus 6 korda (1997–2000, 2002, 2005). Tutvustamisele tulevad Jan Kausi romaan “Koju” (Tuum 2012) ning Ülo Pikkovi romaan “Vana prints” (Varrak 2012)."

layer name,attributes,parent,enveloping,ambiguous,span count
sentences,,,words,False,2
tokens,,,,False,42
compound_tokens,"type, normalized",,tokens,False,0
words,normalized_form,,,True,42
syntax_ignore,type,,words,False,3
maltparser_syntax,"id, lemma, upostag, xpostag, feats, head, deprel, deps, misc",words,,False,42


Note that by default, the resulting syntactic analysis layer will have annotations even for ignored words, but these annotations are filled with empty (`None`) values:

In [17]:
text['maltparser_syntax']

layer name,attributes,parent,enveloping,ambiguous,span count
maltparser_syntax,"id, lemma, upostag, xpostag, feats, head, deprel, deps, misc",words,,False,42

text,id,lemma,upostag,xpostag,feats,head,deprel,deps,misc
Need,1.0,see,P,P,"{'pl': '', 'n': ''}",2.0,det,,
seminarid,2.0,seminar,S,S,"{'pl': '', 'n': ''}",4.0,nsubj,,
on,3.0,olema,V,V,{'vad': ''},4.0,aux,,
toimunud,4.0,toimuma,V,V,{'nud': ''},0.0,root,,
Tartus,5.0,Tartu,H,H,"{'sg': '', 'in': ''}",4.0,obl,,
6,6.0,6,N,N,{'?': ''},7.0,nummod,,
korda,7.0,kord,S,S,"{'sg': '', 'p': ''}",4.0,obl,,
(,,,,,,,,,
1997,,,,,,,,,
–,,,,,,,,,


To remove ignored words altogether from the syntax layer, use the setting `add_empty_spans=False`:

In [18]:
# Remove old layer
text.pop_layer('maltparser_syntax')

# Repeat carrying over syntax_layer
add_syntax_layer_from_cut_text( text, cut_text, 'maltparser_syntax', add_empty_spans=False )

text
"Need seminarid on toimunud Tartus 6 korda (1997–2000, 2002, 2005). Tutvustamisele tulevad Jan Kausi romaan “Koju” (Tuum 2012) ning Ülo Pikkovi romaan “Vana prints” (Varrak 2012)."

layer name,attributes,parent,enveloping,ambiguous,span count
sentences,,,words,False,2
tokens,,,,False,42
compound_tokens,"type, normalized",,tokens,False,0
words,normalized_form,,,True,42
syntax_ignore,type,,words,False,3
maltparser_syntax,"id, lemma, upostag, xpostag, feats, head, deprel, deps, misc",words,,False,25


In [19]:
# And None values are gone:
text['maltparser_syntax']

layer name,attributes,parent,enveloping,ambiguous,span count
maltparser_syntax,"id, lemma, upostag, xpostag, feats, head, deprel, deps, misc",words,,False,25

text,id,lemma,upostag,xpostag,feats,head,deprel,deps,misc
Need,1,see,P,P,"{'pl': '', 'n': ''}",2,det,,
seminarid,2,seminar,S,S,"{'pl': '', 'n': ''}",4,nsubj,,
on,3,olema,V,V,{'vad': ''},4,aux,,
toimunud,4,toimuma,V,V,{'nud': ''},0,root,,
Tartus,5,Tartu,H,H,"{'sg': '', 'in': ''}",4,obl,,
6,6,6,N,N,{'?': ''},7,nummod,,
korda,7,kord,S,S,"{'sg': '', 'p': ''}",4,obl,,
.,8,.,Z,Z,,4,punct,,
Tutvustamisele,1,tutvustamine,S,S,"{'sg': '', 'all': ''}",2,obl,,
tulevad,2,tulema,V,V,{'vad': ''},0,root,,
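
The effect of `add_empty_spans` on the span count (42 versus 25 above) can be pictured with a plain-Python sketch. It is only an illustration of the idea, not the implementation of `add_syntax_layer_from_cut_text`: the function `carry_over` is hypothetical, and each annotation is reduced to a dictionary holding `original_index` and a single `deprel` value.

```python
def carry_over(n_original_words, cut_annotations, add_empty_spans=True):
    """Map annotations of the cut text back to original word indexes.

    Returns a list of (original_word_index, deprel) pairs.
    """
    by_index = {a['original_index']: a['deprel'] for a in cut_annotations}
    if add_empty_spans:
        # One entry per original word; ignored words get None.
        return [(i, by_index.get(i)) for i in range(n_original_words)]
    # Only the words that were actually analysed.
    return [(i, by_index[i]) for i in sorted(by_index)]


# Toy input: 5 original words, of which words 1-3 were cut out before parsing.
annotations = [{'original_index': 0, 'deprel': 'obl'},
               {'original_index': 4, 'deprel': 'punct'}]
print(carry_over(5, annotations))
# [(0, 'obl'), (1, None), (2, None), (3, None), (4, 'punct')]
print(carry_over(5, annotations, add_empty_spans=False))
# [(0, 'obl'), (4, 'punct')]
```

Keeping the empty spans makes the syntax layer align one-to-one with the words layer, while dropping them yields a layer that contains only analysed words.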
