# NENA to TF

This notebook develops the code for converting texts from the .nena format to Text-Fabric (TF). The parser was written principally by Hannes Vlaardingerbroek, with updates and refinements by Cody Kingham. Many thanks to Hannes for his hard work on the parser.

In [4]:
import os
import sys
import collections
import re
from pathlib import Path
from tf.convert.walker import CV
from tf.fabric import Fabric

# path to parser
parserpath = '../nena_corpus/parse_nena/'
sys.path.append(parserpath)

from nena_parser import NenaLexer, NenaParser

# paths
VERSION = '0.01'
data_dir = Path(f'../nena_corpus/nena/{VERSION}')
dialect_dirs = list(data_dir.glob('*'))

## Parse NENA

The NENA parser delivers each text as structured morphemes, which can then be processed into a TF graph. Below, we open each source text and try to parse it, keeping the parsed form on success and recording the file path on failure.

In [7]:
lexer = NenaLexer()
parser = NenaParser()

In [15]:
parsed = []
fails = []

for dialect in sorted(dialect_dirs):
    for file in sorted(dialect.glob('*.nena')):
        # read explicitly as UTF-8 (assumption: the .nena files are UTF-8 encoded)
        with open(file, 'r', encoding='utf-8') as infile:
            text = infile.read()

        print(f'trying: {file.name}')
        try:
            parse = parser.parse(lexer.tokenize(text))
            print('\t√')
            parsed.append(parse)
        except Exception:
            # catch parse errors without swallowing KeyboardInterrupt or SystemExit
            print('\tFAIL')
            fails.append(file)

print('-'*20)
print('success rate:')
print(len(parsed), 'success...')
print(len(fails), 'fails...')

trying: A Hundred Gold Coins.nena
	√
trying: A Man Called Čuxo.nena
	FAIL
trying: A Tale Of A Prince And A Princess.nena
	√
trying: A Tale Of Two Kings.nena


yacc: Syntax error at line 1, token=DIGITS


	√
trying: Baby Leliθa.nena
	√
trying: Dəmdəma.nena
	√
trying: Gozali And Nozali.nena
	FAIL
trying: I Am Worth The Same As A Blind Wolf.nena
	√
trying: Man Is Treacherous.nena
	√
trying: Measure For Measure.nena
	√
trying: Nanno And Jəndo.nena
	√
trying: Qaṭina Rescues His Nephew From Leliθa.nena
	√
trying: Sour Grapes.nena
	√
trying: Tales From The 1001 Nights.nena
	FAIL
trying: The Battle With Yuwanəs The Armenian.nena
	√
trying: The Bear And The Fox.nena


yacc: Syntax error at line 1, token=LETTER


	√
trying: The Brother Of Giants.nena
	√
trying: The Cat And The Mice.nena
	√
trying: The Cooking Pot.nena
	√
trying: The Crafty Hireling.nena
	√
trying: The Crow And The Cheese.nena
	√
trying: The Daughter Of The King.nena
	√
trying: The Fox And The Lion.nena
	√
trying: The Fox And The Miller.nena
	√
trying: The Fox And The Stork.nena
	√
trying: The Giant’s Cave.nena
	√
trying: The Girl And The Seven Brothers.nena
	√
trying: The King With Forty Sons.nena
	FAIL
trying: The Leliθa From Č̭āl.nena
	√
trying: The Lion King.nena
	√
trying: The Lion With A Swollen Leg.nena
	√
trying: The Man Who Cried Wolf.nena
	√
trying: The Man Who Wanted To Work.nena
	FAIL
trying: The Monk And The Angel.nena


yacc: Syntax error at line 1, token=DIGITS


	√
trying: The Monk Who Wanted To Know When He Would Die.nena
	√
trying: The Priest And The Mullah.nena
	√
trying: The Sale Of An Ox.nena
	FAIL
trying: The Scorpion And The Snake.nena
	√
trying: The Selfish Neighbour.nena
	√
trying: The Sisisambər Plant.nena
	√
trying: The Story With No End.nena
	√
trying: The Tale Of Farxo And Səttiya.nena
	FAIL
trying: The Tale Of Mămo And Zine.nena
	FAIL
trying: The Tale Of Mərza Pămət.nena
	√
trying: The Tale Of Nasimo.nena
	√
trying: The Tale Of Parizada, Warda And Nargis.nena
	√
trying: The Tale Of Rustam (1).nena


yacc: Syntax error at line 1, token=(
yacc: Syntax error at line 1, token=DIGITS


	√
trying: The Tale Of Rustam (2).nena
	FAIL
trying: The Wise Daughter Of The King.nena
	√
trying: The Wise Snake.nena
	√
trying: The Wise Young Man.nena
	FAIL
trying: Šošət Xere.nena
	√
trying: A Close Shave.nena
	√
trying: A Cure For A Husband’s Madness.nena
	FAIL
trying: A Donkey Knows Best.nena
	√
trying: A Dragon In The Well.nena
	√
trying: A Dutiful Son.nena


yacc: Syntax error at line 1, token=(


	√
trying: A Frog Wants A Husband.nena
	√
trying: A Lost Donkey.nena
	√
trying: A Lost Ring.nena
	√
trying: A Painting Of The King Of Iran.nena
	√
trying: A Pound Of Flesh.nena
	√
trying: A Sweater To Pay Off A Debt.nena
	√
trying: A Thousand Dinars.nena
	√
trying: A Visit From Harun Ar-rashid.nena


yacc: Syntax error at line 1, token=DIGITS


	√
trying: Agriculture And Village Life.nena
	√
trying: Am I Dead?.nena
	√
trying: An Orphan Duckling.nena
	√
trying: Axiqar.nena
	FAIL
trying: Events In 1946 On The Urmi Plain.nena
	√
trying: Games.nena
	√
trying: Hunting.nena
	√
trying: I Have Died.nena
	√
trying: Ice For Dinner.nena
	√
trying: Is There A Man With No Worries?.nena
	√
trying: Kindness To A Donkey.nena
	√
trying: Lost Money.nena
	√
trying: Mistaken Identity.nena
	√
trying: Much Ado About Nothing.nena
	√
trying: Nipuxta.nena
	√

yacc: Syntax error at line 1, token=[
yacc: Syntax error at line 1, token=HYPHEN



trying: No Bread Today.nena
	√
trying: Problems Lighting A Fire.nena
	√
trying: St. Zayya’s Cake Dough.nena
	√
trying: Star-crossed Lovers.nena
	√
trying: Stomach Trouble.nena
	√
trying: The Adventures Of A Princess.nena


yacc: Syntax error at line 1, token=LETTER


	√
trying: The Adventures Of Ashur.nena
	√
trying: The Adventures Of Two Brothers.nena
	√
trying: The Angel Of Death.nena
	√
trying: The Assyrians Of Armenia.nena
	√
trying: The Assyrians Of Urmi.nena


yacc: Syntax error at line 1, token=LETTER


	√
trying: The Bald Child And The Monsters.nena
	√
trying: The Bald Man And The King.nena
	√
trying: The Bird And The Fox.nena
	√
trying: The Cat’s Dinner.nena
	√
trying: The Cow And The Poor Girl.nena
	√
trying: The Dead Rise And Return.nena
	√
trying: The Fisherman And The Princess.nena
	√
trying: The Giant One-eyed Demon.nena
	√
trying: The Little Prince And The Snake.nena
	√
trying: The Loan Of A Cooking Pot.nena
	√
trying: The Man Who Wanted To Complain To God.nena
	√
trying: The Old Man And The Fish.nena
	√
trying: The Purchase Of A Donkey.nena
	√
trying: The Snake’s Dilemma.nena
	√
trying: The Stupid Carpenter.nena
	√
trying: The Wife Who Learns How To Work.nena


yacc: Syntax error at line 1, token=TITLE


	√
trying: The Wife’s Condition.nena
	√
trying: The Wise Brother.nena
	√
trying: The Wise Young Daughter.nena
	√
trying: Trickster.nena
	√
trying: Two Birds Fall In Love.nena
	√
trying: Two Wicked Daughters-in-law.nena
	√
trying: Village Life (2).nena
	FAIL
trying: Village Life (3).nena
	√
trying: Village Life (4).nena
	√
trying: Village Life (5).nena


yacc: Syntax error at line 1, token=[
yacc: Syntax error at line 1, token=[
yacc: Syntax error at line 1, token=[
yacc: Syntax error at line 1, token=[


	√
trying: Village Life (6).nena
	FAIL
trying: Village Life.nena


yacc: Syntax error at line 1, token=LETTER
yacc: Syntax error at line 1, token=DIGITS


	√
trying: Vineyards.nena
	√
trying: Weddings And Festivals.nena
	√
trying: Weddings.nena
	√
trying: When Shall I Die?.nena
	√
trying: Women Are Stronger Than Men.nena
	√
trying: Women Do Things Best.nena
	√
--------------------
success rate:
111 success...
14 fails...
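
The log above shows two things the loop does not record: why a file failed, and which files printed a `yacc: Syntax error` message but were still counted as a success. A minimal sketch of a helper that keeps both pieces of information follows. The names `try_parse` and `fake_parse` are hypothetical and not part of the NENA parser; `fake_parse` only imitates the behaviour seen in the log. The sketch also assumes the parser writes its messages through `sys.stderr` at call time, which may not hold for every library.

```python
import contextlib
import io
import sys


def try_parse(parse_func, text):
    """Run parse_func(text) and return (result, error, warnings)."""
    stderr_buffer = io.StringIO()
    result, error = None, None
    # capture anything written to sys.stderr while parsing
    with contextlib.redirect_stderr(stderr_buffer):
        try:
            result = parse_func(text)
        except Exception as exc:
            error = f'{type(exc).__name__}: {exc}'
    warnings = [line for line in stderr_buffer.getvalue().splitlines() if line.strip()]
    return result, error, warnings


def fake_parse(text):
    """Hypothetical stand-in for parser.parse(lexer.tokenize(text))."""
    if not text:
        raise ValueError('empty text')
    if text[0].isdigit():
        # imitate a recoverable syntax error reported on stderr
        print('yacc: Syntax error at line 1, token=DIGITS', file=sys.stderr)
    return text.split()


for sample in ['1 abc', 'abc def', '']:
    result, error, warnings = try_parse(fake_parse, sample)
    print(repr(sample), '->', result, error, warnings)
```

In the notebook, `parse_func` could be something like `lambda text: parser.parse(lexer.tokenize(text))`, and files with a non-empty `warnings` list could be reviewed separately from the outright failures.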


In [24]:
fails

[PosixPath('../nena_corpus/nena/0.01/Barwar/A Man Called Čuxo.nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/Gozali And Nozali.nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/Tales From The 1001 Nights.nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/The King With Forty Sons.nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/The Man Who Wanted To Work.nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/The Sale Of An Ox.nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/The Tale Of Farxo And Səttiya.nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/The Tale Of Mămo And Zine.nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/The Tale Of Rustam (2).nena'),
 PosixPath('../nena_corpus/nena/0.01/Barwar/The Wise Young Man.nena'),
 PosixPath('../nena_corpus/nena/0.01/Urmi_C/A Cure For A Husband’s Madness.nena'),
 PosixPath('../nena_corpus/nena/0.01/Urmi_C/Axiqar.nena'),
 PosixPath('../nena_corpus/nena/0.01/Urmi_C/Village Life (2).nena'),
 PosixPath('../nena_corpus/nena/0.01/U
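
Since each failed file is stored as a path whose parent directory is the dialect, the failures can be tallied per dialect. The sketch below is one way to do that; `fails_per_dialect` is a hypothetical helper, and `sample_fails` holds only three of the paths listed above for illustration, not the full list.

```python
from collections import Counter
from pathlib import PurePosixPath

# a small illustrative subset of the failed paths shown above
sample_fails = [
    PurePosixPath('../nena_corpus/nena/0.01/Barwar/Gozali And Nozali.nena'),
    PurePosixPath('../nena_corpus/nena/0.01/Barwar/The Sale Of An Ox.nena'),
    PurePosixPath('../nena_corpus/nena/0.01/Urmi_C/Axiqar.nena'),
]


def fails_per_dialect(paths):
    """Count failed files by the name of their parent (dialect) directory."""
    return Counter(path.parent.name for path in paths)


counts = fails_per_dialect(sample_fails)
for dialect, count in sorted(counts.items()):
    print(f'{dialect}: {count}')
```

Calling `fails_per_dialect(fails)` on the notebook's own list should work the same way, because `PosixPath` objects expose the same `parent.name` attribute.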

## Converter

Build a `director` function for the TF walker (`CV`) that walks over the parsed NENA data and builds the text graph.

In [22]:
def director(cv):
    # code to unwrap source data and trigger
    # the generation of TF nodes, edges and features;
    # the argument is named cv so it does not shadow the imported CV class

    # TODO: Pre-requisites (these are handed to the walker alongside the director)
    slotType = 'char' # or morpheme?
    otext = {
        # text configs here
    }
    generic = {
        # generic metadata here
    }
    intFeatures = set()  # names of integer-valued features here
    featureMeta = {
        # feature metadata here
    }

    # TODO:
    # walk through parsed data and trigger node
    # and edge creation

    pass
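
To make the shape of a filled-in director more concrete, here is a minimal sketch of the walking pattern: open a node, attach features, create slots, then close the node. It rests on several assumptions. The structure of `sample_texts` is invented for illustration, since the real output of `NenaParser` is not shown in this notebook. `RecordingCV` is a hypothetical stand-in that only records calls; it mirrors the `node`, `slot`, `feature` and `terminate` methods of the TF walker so the sketch runs without Text-Fabric. The node types `text` and `line` and the feature names are likewise placeholders.

```python
# Hypothetical stand-in for the parser output; the real structure is not shown here.
sample_texts = [
    {'title': 'Example Text', 'lines': [['ab', 'cd'], ['ef']]},
]


class RecordingCV:
    """Minimal stand-in offering the four walker methods used by the director."""

    def __init__(self):
        self.counts = {}
        self.events = []

    def _new(self, node_type):
        # number nodes per type, starting at 1
        self.counts[node_type] = self.counts.get(node_type, 0) + 1
        node = (node_type, self.counts[node_type])
        self.events.append(('open', node))
        return node

    def node(self, node_type):
        return self._new(node_type)

    def slot(self):
        return self._new('slot')

    def feature(self, node, **features):
        self.events.append(('feature', node, features))

    def terminate(self, node):
        self.events.append(('close', node))


def sketch_director(cv):
    """Walk the sample data, opening and closing nodes around their slots."""
    for text in sample_texts:
        text_node = cv.node('text')
        cv.feature(text_node, title=text['title'])
        for number, words in enumerate(text['lines'], start=1):
            line_node = cv.node('line')
            cv.feature(line_node, number=number)
            for word in words:
                slot = cv.slot()  # slots belong to every node that is still open
                cv.feature(slot, text=word)
            cv.terminate(line_node)
        cv.terminate(text_node)


recorder = RecordingCV()
sketch_director(recorder)
for event in recorder.events:
    print(event)
```

With Text-Fabric itself, the director would be passed to the walker's `walk` method together with the pre-requisites (`slotType`, `otext`, `generic`, `intFeatures`, `featureMeta`) rather than being called directly as it is here.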