# Parsing Zoëga's entries

### Configuration

Install the required modules:
```bash
$ pip3 install -r requirements.txt
```

Install the Jupyter **kernel** for **Python 3.6** by following the [IPython kernel installation guide](https://ipython.readthedocs.io/en/stable/install/kernel_install.html).

In [1]:
import zoegas
help(zoegas)

Help on package zoegas:

NAME
    zoegas

PACKAGE CONTENTS
    constants
    reader
    tests
    utils

FILE
    /home/clementbesnier/.virtualenvs/old_norse_notebook/src/zoegas/zoegas/__init__.py




In [2]:
from zoegas import reader

In [3]:
dictionary = reader.Dictionary(reader.dictionary_name)
dictionary.get_entries()

In [4]:
word = dictionary.find("heimr")
print(word.word)
print(word.description)

heimr


(-s, -ar), m.

1) a place of abode, a region or world (níu man ek heima); spyrja e-n í hvern heim, to ask one freely;

2) this world (segðu mér ór heimi, ek man ór helju); koma í heiminn, to be born; fara af heiminum, to depart this life; liggja milli heims ok heljar, to lie between life and death;

3) the earth; kringla heimsins, the globe.




##### Letters present in *ZD*:

In [5]:
dictionary.get_entries()
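# Collect the headwords, skipping missing entries.
entries = [entry.word for entry in dictionary.entries if entry is not None]
# Keep only the lowercase alphabetic characters found in the headwords
# (digits are already excluded because they are not alphabetic).
zd_characters = {
    c
    for word in entries
    if word is not None
    for c in word
    if c.islower() and c.isalpha()
}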

In [6]:
print(" ".join(sorted(zd_characters)))

a b d e f g h i j k l m n o p r s t u v w x y z á æ é í ð ó ö ø ú ý þ œ


In [5]:
import random
random.seed(30)

In [6]:
import re

In [7]:
entries = {entry.word: entry.description for entry in dictionary.entries if entry is not None}

In [8]:
headwords = list(entries.keys())

In [9]:
random.shuffle(headwords)

In [10]:
words_to_test = headwords[:100]

In [13]:
for word in words_to_test:
    print("--------------")
    print(word)
    entry = dictionary.find(word)
    print(entry.pos)
    print(entries[word])

--------------
grœ
[]



--------------
allr
['adjective', 'neuter', 'singular', 'comparative', 'genitive', 'plural', 'genitive', 'plural']


(öll, allt), a.

1) all, entire, whole;

hón á allan arf eptir mik, she has all the heritage after me;

af öllum hug, with all (one’s) heart;

hvítr ~, white all over;

bú allt, the whole estate;

allan daginn, the whole day;

í ~i veröld, in the whole world;

allan hálfan mánuð, for the entire fortnight;

with addition of ‘saman’;

allt saman féit, the whole amount;

um þenna hernað allan saman, all together;

2) used almost adverbially, all, quite, entirely;

klofnaði hann ~ í sundr, he was all cloven asunder, kváðu Örn allan villast, that he was altogether bewildered;

var Hrappr ~ brottu, quite gone;

~ annarr maðr, quite another man;

3) gone, past;

áðr þessi dagr er ~, before this day is past;

var þá óll þeirra vinátta, their friendship was all over;

allt er nú mitt megin, my strength is exhausted, gone;

4) departed, dead (þá er Geirmun

In [24]:
print(words_to_test[0])
[entry.word for entry in dictionary.find_beginning_with(words_to_test[0])]

grœ


['grœfi',
 'grœ',
 'grœða',
 'grœðari',
 'grœðiligr',
 'grœðing',
 'grœfr',
 'grœgr',
 'grœnast',
 'grœnfainn',
 'grœnleikr',
 'grœnlenzkr',
 'grœnn',
 'grœntó',
 'grœntyrfa',
 'grœta',
 'grœti',
 'grœtiligr',
 'grœtir',
 'grœfi',
 'grœ',
 'grœða',
 'grœðari',
 'grœðiligr',
 'grœðing',
 'grœfr',
 'grœgr',
 'grœnast',
 'grœnfainn',
 'grœnleikr',
 'grœnlenzkr',
 'grœnn',
 'grœntó',
 'grœntyrfa',
 'grœta',
 'grœti',
 'grœtiligr',
 'grœtir']
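
Every headword appears twice in the list above. One possible cause, which is an assumption and not something confirmed here, is that `get_entries()` was called twice on the same `Dictionary` object. Whatever the cause, the duplicates can be dropped while keeping the original order. A minimal sketch, using a short hand-copied excerpt of the output as stand-in data (`unique_in_order` is a hypothetical helper, not part of `zoegas`):

```python
def unique_in_order(words):
    """Return the words with later duplicates dropped, keeping first-seen order."""
    # dict keys keep insertion order (guaranteed from Python 3.7, and also
    # true in CPython 3.6), so the keys are the unique words in order.
    return list(dict.fromkeys(words))


# Stand-in data: a few headwords from the output above, repeated twice.
sample = ["grœfi", "grœ", "grœða", "grœfi", "grœ", "grœða"]
unique_words = unique_in_order(sample)
print(unique_words)
```

With the real data, the same helper could be applied to the list returned by `dictionary.find_beginning_with(...)` after mapping each entry to its `word`.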

--------------------------------

In [26]:
words_to_test[1], entries[words_to_test[1]]

('allr',
 '\n\n(öll, allt), a.\n\n1) all, entire, whole;\n\nhón á allan arf eptir mik, she has all the heritage after me;\n\naf öllum hug, with all (one’s) heart;\n\nhvítr ~, white all over;\n\nbú allt, the whole estate;\n\nallan daginn, the whole day;\n\ní ~i veröld, in the whole world;\n\nallan hálfan mánuð, for the entire fortnight;\n\nwith addition of ‘saman’;\n\nallt saman féit, the whole amount;\n\num þenna hernað allan saman, all together;\n\n2) used almost adverbially, all, quite, entirely;\n\nklofnaði hann ~ í sundr, he was all cloven asunder, kváðu Örn allan villast, that he was altogether bewildered;\n\nvar Hrappr ~ brottu, quite gone;\n\n~ annarr maðr, quite another man;\n\n3) gone, past;\n\náðr þessi dagr er ~, before this day is past;\n\nvar þá óll þeirra vinátta, their friendship was all over;\n\nallt er nú mitt megin, my strength is exhausted, gone;\n\n4) departed, dead (þá er Geirmundr var ~);\n\n5) neut. sing. (allt) used. as a subst. in the sense of all, everything;\

In [30]:
lines = [line for line in entries[words_to_test[1]].split("\n") if line]
lines

['(öll, allt), a.',
 '1) all, entire, whole;',
 'hón á allan arf eptir mik, she has all the heritage after me;',
 'af öllum hug, with all (one’s) heart;',
 'hvítr ~, white all over;',
 'bú allt, the whole estate;',
 'allan daginn, the whole day;',
 'í ~i veröld, in the whole world;',
 'allan hálfan mánuð, for the entire fortnight;',
 'with addition of ‘saman’;',
 'allt saman féit, the whole amount;',
 'um þenna hernað allan saman, all together;',
 '2) used almost adverbially, all, quite, entirely;',
 'klofnaði hann ~ í sundr, he was all cloven asunder, kváðu Örn allan villast, that he was altogether bewildered;',
 'var Hrappr ~ brottu, quite gone;',
 '~ annarr maðr, quite another man;',
 '3) gone, past;',
 'áðr þessi dagr er ~, before this day is past;',
 'var þá óll þeirra vinátta, their friendship was all over;',
 'allt er nú mitt megin, my strength is exhausted, gone;',
 '4) departed, dead (þá er Geirmundr var ~);',
 '5) neut. sing. (allt) used. as a subst. in the sense of all, ever
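
The lines above follow a visible pattern: a header line, then numbered senses (`1)`, `2)`, ...) each followed by example lines. As a rough sketch of how they could be grouped, assuming every sense starts with a number and a closing parenthesis as in this entry (`group_by_sense` and `SENSE_START` are hypothetical names, not part of `zoegas`):

```python
import re

# Matches a line that opens a numbered sense, e.g. "1) all, entire, whole;".
SENSE_START = re.compile(r"^(\d+)\)\s*(.*)$")


def group_by_sense(lines):
    """Split entry lines into a header and a dict mapping sense number to its lines."""
    header = []
    senses = {}
    current = None
    for line in lines:
        match = SENSE_START.match(line)
        if match:
            # A new numbered sense begins; store its gloss as the first line.
            current = int(match.group(1))
            senses[current] = [match.group(2)]
        elif current is None:
            # Lines before the first numbered sense belong to the header.
            header.append(line)
        else:
            # Example lines are attached to the sense currently open.
            senses[current].append(line)
    return header, senses


# Stand-in data: a few lines copied from the entry for "allr".
sample_lines = [
    "(öll, allt), a.",
    "1) all, entire, whole;",
    "hvítr ~, white all over;",
    "2) used almost adverbially, all, quite, entirely;",
    "~ annarr maðr, quite another man;",
    "3) gone, past;",
]
header, senses = group_by_sense(sample_lines)
print(header)
print(senses[1])
```

Entries without numbered senses would end up entirely in `header` under this scheme, so it is only a starting point.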

In [None]:
lines[0]  # first line of the entry: inflected forms and grammatical abbreviation

--------------------------------

In [32]:
len(dictionary.entries)

59902

--------------------------------


The goal is to extract the part of speech of each entry.
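
In the entries shown earlier, the first non-empty line of a description ends with a short abbreviation (`m.` for *heimr*, `a.` for *allr*). One possible approach, sketched here under assumptions, is to read that trailing abbreviation and look it up in a table. This is not how `entry.pos` is computed in `zoegas`; `guess_pos` and `POS_ABBREVIATIONS` are hypothetical, and the expansions in the table are assumptions that should be checked against the dictionary's own list of abbreviations.

```python
import re

# Assumed expansions. Only "m." and "a." occur in the entries shown in this
# notebook; the other keys are illustrative guesses.
POS_ABBREVIATIONS = {
    "m.": "masculine",
    "f.": "feminine",
    "n.": "neuter",
    "a.": "adjective",
    "adv.": "adverb",
}

# The abbreviation is taken to be the last token of the header line:
# lowercase ASCII letters followed by a period.
TRAILING_ABBREVIATION = re.compile(r"([a-z]+\.)\s*$")


def guess_pos(description):
    """Return a label for the abbreviation ending the first non-empty line, or None."""
    header = next(
        (line.strip() for line in description.split("\n") if line.strip()), ""
    )
    match = TRAILING_ABBREVIATION.search(header)
    if match is None:
        return None
    # Unknown abbreviations also give None instead of raising an error.
    return POS_ABBREVIATIONS.get(match.group(1))


print(guess_pos("\n\n(öll, allt), a.\n\n1) all, entire, whole;"))
print(guess_pos("\n\n(-s, -ar), m.\n\n1) a place of abode"))
```

Entries with an empty description, such as `grœ` above, yield `None`, which makes them easy to count and inspect separately.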

By Clément Besnier, email: clemsciences@aol.com, website: https://clementbesnier.fr/, Twitter: clemsciences