# Rereading a paper on Old Norse processing


[Morphological Tagging of Old Norse Texts and its Use in Studying Syntactic Variation and Change](http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=A2101BF4CF1D6C1D8C3526E8BA72ECAF?doi=10.1.1.710.7231&rep=rep1&type=pdf)


## Tagging Modern Icelandic 

### The Tagset

* [Using available resources](http://www.malfong.is/index.php?lang=en&pg=ordtidnibok)

* [Icelandic Frequency Corpus tagset](http://www.malfong.is/files/ot_tagset_files_en.pdf)

#### Parsing a tag

In [1]:
from collections import defaultdict


class POSElement:

    @staticmethod
    def parse(tag, value):
        return value

In [2]:
class Gender(POSElement):
    masculine = "k"
    feminine = "v"
    neuter = "h"

    verbose = defaultdict(str)
    verbose[masculine] = "masculine"
    verbose[feminine] = "feminine"
    verbose[neuter] = "neuter"

    @staticmethod
    def parse(tag, value):
        """
        >>> Gender.parse("k", "")
        ' masculine'
        >>> Gender.parse("v", "")
        ' feminine'
        >>> Gender.parse("h", "")
        ' neuter'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Gender.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Gender.verbose

In [3]:
class Number(POSElement):
    singular = "e"
    plural = "f"

    verbose = defaultdict(str)
    verbose[singular] = "singular"
    verbose[plural] = "plural"

    @staticmethod
    def parse(tag, value):
        """
        >>> Number.parse("e", "")
        ' singular'
        >>> Number.parse("f", "")
        ' plural'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Number.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Number.verbose

In [4]:
class Case:
    nominative = "n"
    accusative = "o"
    dative = "þ"
    genitive = "e"

    verbose = defaultdict(str)
    verbose[nominative] = "nominative"
    verbose[accusative] = "accusative"
    verbose[dative] = "dative"
    verbose[genitive] = "genitive"

    @staticmethod
    def parse(tag, value):
        """
        >>> Case.parse("n", "")
        ' nominative'
        >>> Case.parse("o", "")
        ' accusative'
        >>> Case.parse("þ", "")
        ' dative'
        >>> Case.parse("e", "")
        ' genitive'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Case.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Case.verbose

In [5]:
class Declension:
    strong = "s"
    weak = "v"
    indeclinable = "o"

    verbose = defaultdict(str)
    verbose[strong] = "strong"
    verbose[weak] = "weak"
    verbose[indeclinable] = "indeclinable"

    @staticmethod
    def parse(tag, value):
        """
        >>> Declension.parse("s", "")
        ' strong'
        >>> Declension.parse("v", "")
        ' weak'
        >>> Declension.parse("o", "")
        ' indeclinable'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Declension.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Declension.verbose

In [6]:
class Degree:
    positive = "f"
    comparative = "m"
    superlative = "e"

    verbose = defaultdict(str)
    verbose[positive] = "positive"
    verbose[comparative] = "comparative"
    verbose[superlative] = "superlative"

    @staticmethod
    def parse(tag, value):
        """
        >>> Degree.parse("f", "")
        ' positive'
        >>> Degree.parse("m", "")
        ' comparative'
        >>> Degree.parse("e", "")
        ' superlative'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Degree.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Degree.verbose

In [7]:
class ProperNoun:
    person = "m"
    place = "ö"
    other = "s"

    verbose = defaultdict(str)
    verbose[person] = "person"
    verbose[place] = "place"
    verbose[other] = "other"

    @staticmethod
    def parse(tag, value):
        """
        >>> ProperNoun.parse("m", "")
        ' person'
        >>> ProperNoun.parse("ö", "")
        ' place'
        >>> ProperNoun.parse("s", "")
        ' other'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + ProperNoun.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in ProperNoun.verbose

In [8]:
class Pronoun:
    demonstrative = "a"
    indefinite_demonstrative = "b"
    possessive = "e"
    indefinite = "o"
    personal = "p"
    interrogative = "s"
    relative = "t"

    verbose = defaultdict(str)
    verbose[demonstrative] = "demonstrative"
    verbose[indefinite_demonstrative] = "indefinite demonstrative"
    verbose[possessive] = "possessive"
    verbose[indefinite] = "indefinite"
    verbose[personal] = "personal"
    verbose[interrogative] = "interrogative"
    verbose[relative] = "relative"

    @staticmethod
    def parse(tag, value):
        """
        >>> Pronoun.parse("a", "")
        ' demonstrative'
        >>> Pronoun.parse("b", "")
        ' indefinite demonstrative'
        >>> Pronoun.parse("e", "")
        ' possessive'
        >>> Pronoun.parse("o", "")
        ' indefinite'
        >>> Pronoun.parse("p", "")
        ' personal'
        >>> Pronoun.parse("s", "")
        ' interrogative'
        >>> Pronoun.parse("t", "")
        ' relative'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Pronoun.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Pronoun.verbose

In [9]:
class Person:
    first = "1"
    second = "2"
    third = "3"

    verbose = defaultdict(str)
    verbose[first] = "first"
    verbose[second] = "second"
    verbose[third] = "third"

    @staticmethod
    def parse(tag, value):
        """
        >>> Person.parse("1", "")
        ' first'
        >>> Person.parse("2", "")
        ' second'
        >>> Person.parse("3", "")
        ' third'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Person.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Person.verbose

In [10]:
class NumberCategory:
    cardinal = "f"
    ordinal = "o"

    verbose = defaultdict(str)
    verbose[cardinal] = "cardinal"
    verbose[ordinal] = "ordinal"

    @staticmethod
    def parse(tag, value):
        """
        >>> NumberCategory.parse("f", "")
        ' cardinal'
        >>> NumberCategory.parse("o", "")
        ' ordinal'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + NumberCategory.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in NumberCategory.verbose

In [11]:
class Mood:
    infinitive = "n"
    imperative = "b"
    indicative = "f"
    subjunctive = "v"
    supine = "s"
    present_participle = "l"
    past_participle = "þ"

    verbose = defaultdict(str)
    verbose[infinitive] = "infinitive"
    verbose[imperative] = "imperative"
    verbose[indicative] = "indicative"
    verbose[subjunctive] = "subjunctive"
    verbose[supine] = "supine"
    verbose[present_participle] = "present participle"
    verbose[past_participle] = "past participle"

    @staticmethod
    def parse(tag, value):
        """
        >>> Mood.parse("n", "")
        ' infinitive'
        >>> Mood.parse("b", "")
        ' imperative'
        >>> Mood.parse("f", "")
        ' indicative'
        >>> Mood.parse("v", "")
        ' subjunctive'
        >>> Mood.parse("s", "")
        ' supine'
        >>> Mood.parse("l", "")
        ' present participle'
        >>> Mood.parse("þ", "")
        ' past participle'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Mood.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Mood.verbose

In [12]:
class Voice:
    active = "g"
    middle = "m"

    verbose = defaultdict(str)
    verbose[active] = "active"
    verbose[middle] = "middle"

    @staticmethod
    def parse(tag, value):
        """
        >>> Voice.parse("g", "")
        ' active'
        >>> Voice.parse("m", "")
        ' middle'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Voice.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Voice.verbose

In [13]:
class Tense:
    present = "n"
    past = "þ"

    verbose = defaultdict(str)
    verbose[present] = "present"
    verbose[past] = "past"

    @staticmethod
    def parse(tag, value):
        """
        >>> Tense.parse("n", "")
        ' present'
        >>> Tense.parse("þ", "")
        ' past'

        :param tag:
        :param value:
        :return:
        """
        return value + " " + Tense.verbose[tag]

    @staticmethod
    def can_apply(tag):
        return tag in Tense.verbose

In [14]:
class MainPOS:
    noun = "n"
    adjective = "l"
    pronoun = "f"
    article = "g"
    numeral = "t"
    verb = "s"
    adverb = "a"
    conjunction = "c"
    foreign = "e"
    unanalysed = "x"
    punctuation = "p"

    verbose = defaultdict(str)
    verbose[noun] = "noun"
    verbose[adjective] = "adjective"
    verbose[pronoun] = "pronoun"
    verbose[article] = "article"
    verbose[numeral] = "numeral"
    verbose[verb] = "verb"
    verbose[adverb] = "adverb"
    verbose[conjunction] = "conjunction"
    verbose[foreign] = "foreign"
    verbose[unanalysed] = "unanalysed"
    verbose[punctuation] = "punctuation"

    @staticmethod
    def apply(tag: str, l_pos: list, value: str):
        i = 1
        for pos in l_pos:
            if isinstance(pos, list):
                for j in pos:
                    if j.can_apply(tag[i]):
                        value = j.parse(tag[i], value)
            else:
                value = pos.parse(tag[i], value)
            i += 1
        return value

    @staticmethod
    def parse(tag):
        """
        >>> MainPOS.parse('fakeþ')
        'pronoun demonstrative masculine singular dative'
        >>> MainPOS.parse('sfg3eþ')
        'verb indicative active third singular past'
        >>> MainPOS.parse('lvensf')
        'adjective feminine singular nominative strong positive'
        >>> MainPOS.parse('fp1en')
        'pronoun personal first singular nominative'
        >>> MainPOS.parse('nkee')
        'noun masculine singular genitive'
        >>> MainPOS.parse('sþgken')
        'verb past participle active masculine singular nominative'
        >>> MainPOS.parse('nhfn')
        'noun neuter plural nominative'
        >>> MainPOS.parse('nveo')
        'noun feminine singular accusative'

        :param tag:
        :return:
        """

        value = ""
        if tag[0] == MainPOS.noun:
            if len(tag) >= 4:
                value = MainPOS.verbose[tag[0]]
                value = MainPOS.apply(tag, [Gender, Number, Case], value)
                if len(tag) == 5:
                    value = ProperNoun.parse(tag[4], value)
            return value

        elif tag[0] == MainPOS.adjective:
            if len(tag) == 6:
                value = MainPOS.verbose[tag[0]]
                value = MainPOS.apply(tag, [Gender, Number, Case, Declension, Degree], value)
            return value

        elif tag[0] == MainPOS.pronoun:
            if len(tag) == 5:
                value = MainPOS.verbose[tag[0]]
                value = MainPOS.apply(tag, [Pronoun, [Person, Gender], Number, Case], value)
            return value

        elif tag[0] == MainPOS.article:
            if len(tag) == 4:
                value = MainPOS.verbose[tag[0]]
                value = MainPOS.apply(tag, [Gender, Number, Case], value)
            return value

        elif tag[0] == MainPOS.numeral:
            if len(tag) == 5:
                value = MainPOS.verbose[tag[0]]
                value = MainPOS.apply(tag, [NumberCategory, Gender, Number, Case], value)

        elif tag[0] == MainPOS.verb:
            if len(tag) == 3 and tag[1] == "n":
                # Infinitive: only mood and voice are encoded, and apply() already parses both.
                value = MainPOS.verbose[tag[0]]
                value = MainPOS.apply(tag, [Mood, Voice], value)

            elif len(tag) == 6 and tag[1] == "þ":
                value = MainPOS.verbose[tag[0]]
                value = MainPOS.apply(tag, [Mood, Voice, Gender, Number, Case], value)

            elif len(tag) == 6:
                value = MainPOS.verbose[tag[0]]
                value = MainPOS.apply(tag, [Mood, Voice, Person, Number, Tense], value)
            return value

        elif tag[0] == MainPOS.adverb:
            if len(tag) == 2:
                value = MainPOS.verbose[tag[0]]
                if tag[1] == "a":
                    value += " no case "
                elif tag[1] == "u":
                    value += " exclamation"
                elif tag[1] == "o":
                    value += " accusative"
                elif tag[1] == "þ":
                    value += " dative"
                elif tag[1] == "e":
                    value += " genitive"
            return value

        elif tag[0] == MainPOS.conjunction:
            if len(tag) == 2:
                value = MainPOS.verbose[tag[0]]
                if tag[1] == "n":
                    value += ""
                elif tag[1] == "t":
                    value += ""
                return value

        elif tag[0] == MainPOS.foreign:
            value = MainPOS.verbose[tag[0]]
            return value

        elif tag[0] == MainPOS.unanalysed:
            value = MainPOS.verbose[tag[0]]
            return value

        elif tag[0] == MainPOS.punctuation:
            value = MainPOS.verbose[tag[0]]
            return value

        return value


def parse(tag):
    if len(tag) > 0:
        value = MainPOS.parse(tag.lower())
    else:
        value = ""
    return value
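
The classes above decode a tag position by position. As a minimal sketch of the same idea in table-driven form, the snippet below decodes noun tags into a dictionary of features. The letter codes are copied from the `Gender`, `Number`, `Case` and `ProperNoun` classes; `decode_noun_tag` and `NOUN_SLOTS` are hypothetical names introduced only for this illustration.

```python
# Letter codes copied from the Gender, Number, Case and ProperNoun classes above.
GENDER = {"k": "masculine", "v": "feminine", "h": "neuter"}
NUMBER = {"e": "singular", "f": "plural"}
CASE = {"n": "nominative", "o": "accusative", "þ": "dative", "e": "genitive"}
PROPER = {"m": "person", "ö": "place", "s": "other"}

# Hypothetical slot table: one (feature name, code table) pair per tag position after "n".
NOUN_SLOTS = [
    ("gender", GENDER),
    ("number", NUMBER),
    ("case", CASE),
    ("proper_noun", PROPER),
]


def decode_noun_tag(tag):
    """Decode a noun tag such as 'NHEE' into a dict of named features."""
    tag = tag.lower()
    if not tag or tag[0] != "n":
        raise ValueError("not a noun tag: %r" % tag)
    features = {"pos": "noun"}
    # zip stops at the shorter sequence, so the optional proper-noun slot
    # is filled only when the tag has a fifth character.
    for (name, table), code in zip(NOUN_SLOTS, tag[1:]):
        features[name] = table.get(code, "unknown")
    return features


print(decode_noun_tag("NHEE"))
print(decode_noun_tag("NKEEM"))
```

Returning a dictionary rather than a space-joined string keeps each feature addressable by name, which may be convenient when filtering tokens by case or gender.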

In [15]:
import eddas

In [16]:
help(eddas)

Help on package eddas:

NAME
    eddas

PACKAGE CONTENTS
    pos
    reader
    tests
    text_manager
    utils

FILE
    /home/clementbesnier/languages/lib/python3.6/site-packages/eddas-1.3.2-py3.6.egg/eddas/__init__.py




In [17]:
from eddas import reader

In [18]:
voeluspaa_pos = reader.PoeticEddaPOSTaggedReader("Völuspá")

In [19]:
voeluspaa_pos.tagged_paras()[0]

[[('1', 'TA')],
 [('Hljóðs', 'NHEE'), ('bið', 'SFG1EÞ'), ('ek', 'FP1EN'), ('allar', 'LVFOSF')],
 [('helgar', 'LVFOSF'), ('kindir', 'NVFO'), (',', 'P')],
 [('meiri', 'LVFOVM'), ('ok', 'CC'), ('minni', 'LVFOVM')],
 [('mögu', 'NKFO'), ('Heimdallar', 'NKEEM'), (';', 'P')],
 [('viltu', 'SFG2EN'),
  ('at', 'CN'),
  ('ek', 'FP1EM'),
  (',', 'P'),
  ('Valföðr', 'NKENM'),
  (',', 'P')],
 [('vel', 'AA'), ('fyr', 'CN'), ('telja', 'SNG')],
 [('forn', 'LHFOSF'), ('spjöll', 'NHFO'), ('fira', 'NKFE'), (',', 'P')],
 [('þau', 'FA'),
  ('er', 'CT'),
  ('fremst', 'AA'),
  ('of', 'AA'),
  ('man', 'SFG1EN'),
  ('.', 'P')]]

In [20]:
parse("sfg3en")

'verb indicative active third singular present'

In [21]:
for sent in voeluspaa_pos.tagged_paras()[0]:
    for word, tag in sent:
        print(word, ":", tag.lower(), "->", parse(tag))

1 : ta -> 
Hljóðs : nhee -> noun neuter singular genitive
bið : sfg1eþ -> verb indicative active first singular past
ek : fp1en -> pronoun personal first singular nominative
allar : lvfosf -> adjective feminine plural accusative strong positive
helgar : lvfosf -> adjective feminine plural accusative strong positive
kindir : nvfo -> noun feminine plural accusative
, : p -> punctuation
meiri : lvfovm -> adjective feminine plural accusative weak comparative
ok : cc -> conjunction
minni : lvfovm -> adjective feminine plural accusative weak comparative
mögu : nkfo -> noun masculine plural accusative
Heimdallar : nkeem -> noun masculine singular genitive person
; : p -> punctuation
viltu : sfg2en -> verb indicative active second singular present
at : cn -> conjunction
ek : fp1em -> pronoun personal first singular 
, : p -> punctuation
Valföðr : nkenm -> noun masculine singular nominative person
, : p -> punctuation
vel : aa -> adverb no case 
fyr : cn -> conjunction
telja : sng -> verb infin

In [22]:
voeluspaa_lem = reader.PoeticEddaLemmatizationReader("Völuspá")
print(reader.poetic_edda_titles)
PUNCTUATIONS = ",:;.!?-"
tokens = voeluspaa_lem.words()
lemmata = set([lemma for word, lemma in voeluspaa_lem.tagged_words() if lemma is not None and lemma not in PUNCTUATIONS])
normalized_tokens = [token.lower() for token in tokens if token is not None and token not in PUNCTUATIONS]

['Rígsþula', 'Helreið Brynhildar', 'Gróttasöngr', 'Sigrdrífumál', 'Hárbarðsljóð', 'Grímnismál', 'Þrymskviða', 'Völuspá', 'Atlamál in grænlenzku', 'Hyndluljóð', 'Skírnismál', 'Hymiskviða', 'Atlakviða', 'Vafþrúðnismál', 'Oddrúnarkviða', 'Völundarkviða', 'Alvíssmál', 'Fáfnismál', 'Dráp Niflunga', 'Hávamál', 'Guðrúnarhvöt', 'Hamðismál', 'Baldrs draumar', 'Lokasenna', 'Guðrúnarkviða']


In [23]:
from collections import defaultdict
# For a lemma, give all its inflected forms present in the Völuspá
lemmata_tokens = defaultdict(set)
# A set makes the membership test constant-time instead of rescanning the token list for every word.
known_tokens = set(normalized_tokens)
for word, lemma in voeluspaa_lem.tagged_words():
    if word is not None and lemma is not None and word not in PUNCTUATIONS and lemma not in PUNCTUATIONS:
        if lemma != "" and word.lower() in known_tokens:
            lemmata_tokens[lemma.lower()].add(word.lower())

In [24]:
# For a token, give all its POS tags present in the Völuspá
taupi = defaultdict(set)
for word, pos_tag in voeluspaa_pos.tagged_words():
    if word is not None and pos_tag is not None and word not in PUNCTUATIONS:
        taupi[word.lower()].add(pos_tag.lower())

In [25]:
# For a lemma, all its possible tags present in the Völuspá
lambdapi = defaultdict(set)
for lemma in lemmata:
    for token in lemmata_tokens[lemma.lower()]:
        for pos_tag in taupi[token]:
            lambdapi[lemma.lower()].add(pos_tag)
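
To make the three mappings concrete, here is a small self-contained sketch on toy data. The tokens and tags are copied from the tagged Völuspá output earlier in this notebook; the lemma column is hand-written for illustration and is an assumption, not output of the `eddas` readers. The variable names are also hypothetical.

```python
from collections import defaultdict

# Tokens and tags copied from the tagged output above.
tagged = [("Hljóðs", "NHEE"), ("ek", "FP1EN"), ("ek", "FP1EM"), ("mögu", "NKFO")]
# Assumption: hand-written (word, lemma) pairs, for illustration only.
lemmatized = [("Hljóðs", "hljóð"), ("ek", "ek"), ("mögu", "mögr")]

# lemma -> inflected forms (plays the role of lemmata_tokens)
forms_by_lemma = defaultdict(set)
for word, lemma in lemmatized:
    forms_by_lemma[lemma.lower()].add(word.lower())

# token -> POS tags (plays the role of taupi)
tags_by_token = defaultdict(set)
for word, tag in tagged:
    tags_by_token[word.lower()].add(tag.lower())

# lemma -> POS tags, obtained by composing the two mappings (plays the role of lambdapi)
tags_by_lemma = defaultdict(set)
for lemma, forms in forms_by_lemma.items():
    for form in forms:
        # .get avoids creating empty entries for forms that were never tagged
        tags_by_lemma[lemma] |= tags_by_token.get(form, set())

print(sorted(tags_by_lemma["ek"]))
```

A token that carries two different tags in the corpus, as `ek` does here, shows up as two entries under its lemma, which is one quick way to spot candidate tagging inconsistencies.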


In [26]:
# link lemmata_tokens, taupi and lambdapi to Zoëga's dictionary
import zoegas
from zoegas import reader as zoegas_reader

In [27]:
dictionary = zoegas_reader.Dictionary(zoegas_reader.dictionary_name)
dictionary.get_entries()

In [28]:
entry = dictionary.find("bjöð")
print(entry.description)



f. flat land;

áðr Börs synir ~um of yptu, ere the sons of B. raised the ground.




In [29]:
unknown_lemmata = []
known_lemmata = {}
for lemma in lemmata:
    if lemma is not None:
        entry = dictionary.find(lemma.lower())
        if entry is None:
            unknown_lemmata.append(lemma.lower())
        else:
            known_lemmata[lemma.lower()] = entry

In [30]:
unknown_lemmata

[',vættr',
 'valföðr',
 'óri',
 'niðhöggr',
 'geyr',
 'bávurr',
 'hornbori',
 'týr',
 'jötunheimr',
 'draupnir',
 'nóri',
 'gøra',
 'jari',
 'ókólnn',
 'finnr',
 'mímir',
 'glóinn',
 'ræðr',
 'hveralundr',
 'uns',
 'billingr',
 'und+es',
 'mjöðvitnir',
 'mæran',
 'hvern',
 'ér',
 'þorinn',
 'hænir',
 'eikinskjaldi',
 'öld',
 'hroft',
 'yngvi',
 'höðr',
 'skáfiðr',
 'skirfir',
 'morgun',
 'kíli',
 'folkvígr',
 'mjöð',
 'þórr',
 'blóðigr',
 'fensalir',
 'slíkt',
 'þráinn',
 'neka',
 'ósjaldan',
 'ljóra',
 'embla',
 'frosti',
 'mímisbrunnr',
 'váli',
 'frigg',
 'eftir',
 'göndul',
 'mættr',
 'ánn',
 'hon',
 'dúfr',
 'buri',
 'durinn',
 'lofar',
 'vindalfr',
 'haugspori',
 'lóni',
 'baldr',
 'náli',
 'fyr',
 'óinn',
 'vígspá',
 'alþjófr',
 'dóridóri',
 'herjans',
 'kindr',
 'brúni',
 'lóðurr',
 'upphiminn',
 'goðþjóð',
 'niðavallr',
 'gullr',
 'heimdallr',
 'sigyn',
 'vígband',
 'alfr',
 'ausinn',
 'nýráðr',
 'örlög',
 'móðsognir',
 'svíurr',
 'aurvangr',
 'dolgþrasir',
 'brimir',
 'glýja'

In [31]:
for i in dictionary.find_beginning_with("fíl"):
    print(i.word)
    print(i.description)

fíll


(-s, -ar), m. elephant.




In [32]:
import cltk.tokenize.word as cltkt

In [33]:
# help(reader)
from eddas import text_manager
import os
# help(text_manager)
loader_vaf = text_manager.TextLoader(os.path.join("Sæmundar-Edda", "Vafþrúðnismál"), "txt")
text_vaf = loader_vaf.load()
# print(text_vaf[:500])
from nltk.text import Text
text_vaf = Text(cltkt.tokenize_old_norse_words(text_vaf))
# text_vaf.concordance()

text_vaf.concordance("jötunn")
text_vaf.concordance("jötun")
text_vaf.concordance("jötni")
text_vaf.concordance("jötuns")
text_vaf.concordance("jötnar")
text_vaf.concordance("jötna")
text_vaf.concordance("jötna")

Displaying 10 of 10 matches:
 vita , ef þú fróðr sér eða alsviðr jötunn . " Vafþrúðnir kvað : 7 . "Hvat er 
k lengi farit - ok þinna andfanga , jötunn . " Vafþrúðnir kvað : 9 . "Hví þú þ
kom eða upphiminn fyrst , inn fróði jötunn . " Vafþrúðnir kvað : 21 . "Ór Ymis
m með jötna sonum fyrst , inn fróði jötunn . " Vafþrúðnir kvað : 31 . "Ór Éliv
ukku eitrdropar , svá óx , unz varð jötunn ; þar eru órar ættir komnar allar s
itir , hvé sá börn gat , inn baldni jötunn , er hann hafði -t gýgjar gaman . "
 fremst of veizt , þú ert alsviðr , jötunn . " Vafþrúðnir kvað : 35 . Örófi ve
t ek fyrst of man , er sá inn fróði jötunn á var lúðr of lagiðr . " Óðinn kvað
gr heitir , er sitr á himins enda , jötunn í arnar ham ; af hans vængjum kvæða
segir þú it sannasta , inn alsvinni jötunn . " Vafþrúðnir kvað : 43 . "Frá jöt
Displaying 3 of 3 matches:
 fornum stöfum við þann inn alsvinna jötun . " Frigg kvað : 2 . "Heima letja ek
erjaföðr í görðum goða ; því at engi jötun ek hugða jafnramman sem Vafþrúð

In [34]:
loader_gri = text_manager.TextLoader(os.path.join("Sæmundar-Edda", "Grímnismál"), "txt")
# print(loader_gri.load()[:500])

text_gri = loader_gri.load()
from nltk.text import Text
text_gri = Text(cltkt.tokenize_old_norse_words(text_gri))
# text_vaf.concordance()

text_gri.concordance("jötunn")
text_gri.concordance("jötun")
text_gri.concordance("jötni")
text_gri.concordance("jötuns")
text_gri.concordance("jötnar")
text_gri.concordance("jötna")
text_gri.concordance("jötna")
# text_gri.concordance("ok")
# text_gri.count("sonum")

Displaying 1 of 1 matches:
tti , er Þjazi bjó , sá inn ámáttki jötunn ; en nú Skaði byggvir , skír brúðr 
Displaying 1 of 1 matches:
Sökkmímis , ok dulðak þann inn aldna jötun , þá er ek Miðvitnis vark ins mæra b
No matches
No matches
No matches
No matches
No matches
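
The concordance calls above query one inflected form at a time. As a rough sketch of a more compact check, the snippet below counts every form of *jötunn* in a token list in one pass. The token sample is copied from the two Grímnismál concordance lines above; `count_forms` is a hypothetical helper, not part of `nltk` or `eddas`.

```python
from collections import Counter

# Inflected forms queried in the cells above.
FORMS = ["jötunn", "jötun", "jötni", "jötuns", "jötnar", "jötna"]

# Token sample copied from the Grímnismál concordance output above.
tokens = (
    "sá inn ámáttki jötunn ; en nú Skaði byggvir".split()
    + "ok dulðak þann inn aldna jötun ,".split()
)


def count_forms(tokens, forms):
    """Return how often each form occurs in tokens (case-insensitive)."""
    counts = Counter(token.lower() for token in tokens)
    # Counter returns 0 for missing keys, so absent forms are reported explicitly.
    return {form: counts[form] for form in forms}


for form, n in count_forms(tokens, FORMS).items():
    print(form, n)
```

On a full text, the same function could be given the token list that is passed to `Text`, so that the counts can be compared across poems.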


### Training the tagger

## Tagging the Old Norse texts

In [35]:
from nltk.tag.tnt import TnT

In [36]:
help(TnT)

Help on class TnT in module nltk.tag.tnt:

class TnT(nltk.tag.api.TaggerI)
 |  TnT - Statistical POS tagger
 |  
 |  IMPORTANT NOTES:
 |  
 |  * DOES NOT AUTOMATICALLY DEAL WITH UNSEEN WORDS
 |  
 |    - It is possible to provide an untrained POS tagger to
 |      create tags for unknown words, see __init__ function
 |  
 |  * SHOULD BE USED WITH SENTENCE-DELIMITED INPUT
 |  
 |    - Due to the nature of this tagger, it works best when
 |      trained over sentence delimited input.
 |    - However it still produces good results if the training
 |      data and testing data are separated on all punctuation eg: [,.?!]
 |    - Input for training is expected to be a list of sentences
 |      where each sentence is a list of (word, tag) tuples
 |    - Input for tag function is a single sentence
 |      Input for tagdata function is a list of sentences
 |      Output is of a similar form
 |  
 |  * Function provided to process text that is unsegmented
 |  
 |    - Please see basic_sent_chop(

### Old Norse vs. Modern Icelandic

### The Old Norse Corpus

### Training the tagger on the Old Norse corpus

## Tagged texts in syntactic research

### Object Shift

### Passive

By Clément Besnier, email address: clemsciences@aol.com, web site: https://clementbesnier.fr/, twitter: clemsciences