![](maes-cover.png)

This notebook contains instructions for navigating the SQLite version of [Maes 1959](https://www.worldcat.org/title/dictionnaire-ngbaka-francais-neerlandais-prec-dun-apercu-grammatical/oclc/251249793), which is a dictionary of [Ngbaka Minagende](https://glottolog.org/resource/languoid/id/ngba1285). This dictionary was scanned, OCR'd, and converted to a database for Danis 2017 and Danis 2019. Dana Matarlo greatly assisted with the OCR process, and Paul de Lacy assisted with converting the OCR text to a structured database. 

* Danis, Nick. 2017. [*Complex place and place identity*](https://rucore.libraries.rutgers.edu/rutgers-lib/55412/). PhD Dissertation, Rutgers University. Chair: Akin Akinlabi. [[doi](https://doi.org/doi:10.7282/T38055PH), [lingbuzz](https://ling.auf.net/lingbuzz/003693), [ROA-1324](http://roa.rutgers.edu/article/view/1695)]
* Danis, Nick. 2019. [Long-distance major place harmony](https://www.cambridge.org/core/journals/phonology/article/abs/longdistance-major-place-harmony/7620003BF480A83F2C0A3B604989B62F). *Phonology* 36.4. 573-604.  [[doi](https://doi.org/10.1017/S0952675719000307), [lingbuzz](https://ling.auf.net/lingbuzz/004988), [ROA-1365](http://roa.rutgers.edu/content/article/files/1804_danis_1.pdf)]

For the specifics of the transcription system and the parts of speech used in Maes, please see the original dictionary or the discussion in Danis 2017. 

In [1]:
import pandas as pd
import sqlite3

In [2]:
# connect to the database
conn = sqlite3.connect("Maes1959.sqlite")
c = conn.cursor()

# get a list of all tables in the db
c.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
print(c.fetchall())

[('entry',), ('sqlite_sequence',), ('subs',)]


The `entry` table contains all main dictionary entries (headwords), while the `subs` table contains linked expressions and phrases using these headwords.
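
To check which columns each table holds before loading it, SQLite's `PRAGMA table_info` can be queried. The sketch below runs against an in-memory stand-in rather than `Maes1959.sqlite`: the column names are copied from the DataFrame outputs in this notebook, while the column types and the helper name `column_names` are assumptions made for illustration.

```python
import sqlite3

# Stand-in database so the sketch runs without Maes1959.sqlite.
# Column names follow the notebook's output; the types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, head TEXT, pos TEXT, "
    "def TEXT, defFr TEXT, defDu TEXT, defEn TEXT, IPA TEXT)"
)
conn.execute(
    "CREATE TABLE subs (id INTEGER PRIMARY KEY, entryid INTEGER, "
    "head TEXT, def TEXT, defFr TEXT, defDu TEXT)"
)


def column_names(connection, table):
    """Return the column names of `table` (hypothetical helper)."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    cursor = connection.execute(f'PRAGMA table_info("{table}")')
    return [row[1] for row in cursor]


columns = {table: column_names(conn, table) for table in ("entry", "subs")}
conn.close()

for table, names in columns.items():
    print(table, names)
```

Against the real file, the same helper should work after replacing `":memory:"` with `"Maes1959.sqlite"` and dropping the two `CREATE TABLE` statements.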

In [3]:
# save the contents of the entry table to a pandas df
c.execute("SELECT * FROM entry")
df = pd.DataFrame(c.fetchall(), columns=[col[0] for col in c.description])
df.sample(5)

|  | id | head | pos | def | defFr | defDu | defEn | IPA |
|---|---|---|---|---|---|---|---|---|
| 69 | 2108 | ’bɛ̃lɛ̃ | (v.) | déchirer l’enveloppe, défaire des déchets ; se... | déchirer l’enveloppe, défaire des déchets ; se... | Omhulsel afscheuren, ontdoen van schelp, — vel... |  | ’bɛ̃lɛ̃ |
| 1506 | 3558 | wala | (v.) | , = dɛ̀ dɔ̃̍ : devenir court, s’user. Kort wor... | , = dɛ̀ dɔ̃̍ : devenir court, s’user | Kort worden, afslijten |  | wala |
| 887 | 2930 | mbɛ’dɛlɛ | (v. r.) | rabattre, étendre en battant, river ; être rab... | rabattre, étendre en battant, river ; être rab... | Omslaan, uiteen slaan; open geplooid zijn |  | mbɛ’dɛlɛ |
| 1048 | 3094 | nɔ̃̍ɛ̃̀ | (s.) | oiseau. Vogel. | oiseau | Vogel |  | nɔ̃̍ɛ̃̀ |
| 1417 | 3468 | tɛkpɛ | (v.) | filtrer, passer par un tamis. Filteren, door z... | filtrer, passer par un tamis | Filteren, door zeef gieten |  | tɛkpɛ |


The `IPA` column should simply duplicate the `head` column, but two rows do not match (shown below). I would ignore the `IPA` column; it was used only while parsing the data.

In [4]:
df.loc[df['head'] != df['IPA']]

|  | id | head | pos | def | defFr | defDu | defEn | IPA |
|---|---|---|---|---|---|---|---|---|
| 86 | 2125 | bílà | (cfr.) búlà (s.) | coupe, écope, moitié de calebasse coupée en de... | coupe, écope, moitié de calebasse coupée en deux | Drinkkroes of schepkom uit helft van kalebasvr... |  | bílà = |
| 701 | 2744 | kpákólólì |  | : faux muscatier à bois clair léger. Boom ver... |  |  |  | kpákólólì : faux muscatier à bois clair lége... |


In [5]:
# save the contents of the subs table to a pandas df
c.execute("SELECT * FROM subs")
df_subs = pd.DataFrame(c.fetchall(), columns=[col[0] for col in c.description])
df_subs.sample(5)

|  | id | entryid | head | def | defFr | defDu |
|---|---|---|---|---|---|---|
| 1004 | 3087 | 2582 | hùnù kɔ̃̀, hùnù báŋgá | remuer la bouillie de farine. Dunne vrij —, me... | remuer la bouillie de farine | Dunne vrij —, meelpap roeren |
| 1881 | 3964 | 3263 | è òló nɛ̍ | remettre à sa place. Terug op zijn plaats zetten. | remettre à sa place | Terug op zijn plaats zetten |
| 765 | 2848 | 2483 | mɔ̀ gɛ́ tɛ̀ mɔ̍ | fais place. Garni de weg. | fais place | Garni de weg |
| 1211 | 3294 | 2734 | tò má kpà mi̍ gɔ̍ | je n’ai pas reçu la nouvelle. Ik heb het nieuw... | je n’ai pas reçu la nouvelle | Ik heb het nieuws niet vernomen |
| 1078 | 3161 | 2640 | tè kélé lí wa̍la̍, lí wa̍la̍ ki̍la̍ dó’dò | un arbre barre la route, le passage est obstru... | un arbre barre la route, le passage est obstrué | Een boom verspert de weg, de doorgang is afges... |


In [6]:
# close the connection to the database; df and df_subs stay available in memory
conn.close()
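
The cells above open the connection, fetch each table through a cursor, and close the connection by hand. A more compact variant, offered only as a sketch, lets `pd.read_sql_query` build the DataFrames and uses `contextlib.closing` so the connection is closed even if a query fails. The function name `load_tables` and the throwaway file `demo.sqlite` are hypothetical; the demo rows are taken from the outputs shown in this notebook, and the demo tables carry only a subset of the real columns.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

import pandas as pd


def load_tables(path, tables=("entry", "subs")):
    """Read each named table into a DataFrame (hypothetical helper)."""
    # closing() guarantees conn.close() runs when the block exits.
    with closing(sqlite3.connect(path)) as conn:
        return {
            name: pd.read_sql_query(f'SELECT * FROM "{name}"', conn)
            for name in tables
        }


# Demo on a throwaway database so the sketch runs without Maes1959.sqlite.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.sqlite")
    with closing(sqlite3.connect(demo_path)) as conn:
        conn.execute("CREATE TABLE entry (id INTEGER, head TEXT)")
        conn.execute("CREATE TABLE subs (id INTEGER, entryid INTEGER, head TEXT)")
        conn.execute("INSERT INTO entry VALUES (3714, 'zùmà')")
        conn.execute("INSERT INTO subs VALUES (4837, 3714, 'gà zùmà')")
        conn.commit()
    frames = load_tables(demo_path)

print(frames["entry"])
print(frames["subs"])
```

With the real file, `load_tables("Maes1959.sqlite")` would be expected to return the same data as `df` and `df_subs` above.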

To get the subentries for a particular entry, match the `id` field in `entry` with the `entryid` field in `subs`.

In [7]:
df.loc[df['id'] == 3714]

|  | id | head | pos | def | defFr | defDu | defEn | IPA |
|---|---|---|---|---|---|---|---|---|
| 1662 | 3714 | zùmà | (s.) | chant. Zang. | chant | Zang |  | zùmà |


In [8]:
df_subs.loc[df_subs['entryid'] == 3714]

|  | id | entryid | head | def | defFr | defDu |
|---|---|---|---|---|---|---|
| 2754 | 4837 | 3714 | gà zùmà | chanter. Zingen. | chanter | Zingen |
| 2755 | 4838 | 3714 | zu̍ma̍ yɔ̀là | chant de danse. Zang bij dans. | chant de danse | Zang bij dans |
| 2756 | 4839 | 3714 | kɔ̀ zùmà | répondre à un chant, chanter le refrain. Zang ... | répondre à un chant, chanter le refrain | Zang beantwoorden, rejrein meezingen |
| 2757 | 4840 | 3714 | dò lí zùmà | raisonné, sensé. Beredeneerd. | raisonné, sensé | Beredeneerd |
| 2758 | 4841 | 3714 | lí zùmá nɛ̍ bína̍ | insensé, absurde. Ongerijmd, zonder zin. | insensé, absurde | Ongerijmd, zonder zin |
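
The two lookups above can also be combined into a single SQL `JOIN` on `subs.entryid = entry.id`. The following is a minimal sketch on an in-memory stand-in database, not the real file: the tables keep only three or four of the real columns, the column types are assumed, and the rows are copied from the outputs above.

```python
import sqlite3

# In-memory stand-in with a reduced set of columns (types are assumed).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entry (id INTEGER PRIMARY KEY, head TEXT, defFr TEXT);
    CREATE TABLE subs (id INTEGER PRIMARY KEY, entryid INTEGER,
                       head TEXT, defFr TEXT);
    """
)
conn.execute("INSERT INTO entry VALUES (?, ?, ?)", (3714, "zùmà", "chant"))
conn.executemany(
    "INSERT INTO subs VALUES (?, ?, ?, ?)",
    [
        (4837, 3714, "gà zùmà", "chanter"),
        (4838, 3714, "zu̍ma̍ yɔ̀là", "chant de danse"),
    ],
)

# Pair every subentry with its headword by matching entryid to id.
query = """
    SELECT e.head AS headword, s.head AS expression, s.defFr
    FROM entry AS e
    JOIN subs AS s ON s.entryid = e.id
    WHERE e.id = ?
    ORDER BY s.id
"""
rows = conn.execute(query, (3714,)).fetchall()
conn.close()

for row in rows:
    print(row)
```

Staying in pandas, `df.merge(df_subs, left_on="id", right_on="entryid", suffixes=("_entry", "_sub"))` should give a comparable result; the suffixes are needed because both tables have columns named `id`, `head`, and `def`.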
