<a href="http://www.godgeleerdheid.vu.nl/etcbc" target="_blank"><img align="left" src="images/VU-ETCBC-xsmall.png"/></a>
<a href="http://laf-fabric.readthedocs.org/en/latest/" target="_blank"><img align="left" src="images/laf-fabric-xsmall.png"/></a>
<a href="http://www.persistent-identifier.nl/?identifier=urn%3Anbn%3Anl%3Aui%3A13-048i-71" target="_blank"><img align="right" src="images/etcbc4easy-small.png"/></a>

# Mother

This notebook explores some characteristics of the *mother* relationship in the ETCBC4 database.

The *mother* relationship holds between linguistic elements, where one (the *daughter*) is somehow dependent on the other (the *mother*).
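
To make the direction of the relationship concrete before we query the real data, here is a minimal, self-contained sketch that does not use LAF-Fabric at all. The node ids, the object types and the three edges are invented for illustration. The only thing taken from this notebook is the idea that each edge points from a daughter to its mother.

```python
from collections import Counter, defaultdict

# Hypothetical node ids and object types, invented for illustration only.
node_type = {1: "word", 2: "subphrase", 3: "subphrase", 4: "clause", 5: "clause"}

# Each edge points from a daughter to its mother: (daughter, mother).
mother_edges = [(2, 1), (3, 2), (5, 4)]

# Invert the relation: for every mother, collect its daughters.
daughters_of = defaultdict(list)
for daughter, mother in mother_edges:
    daughters_of[mother].append(daughter)

# Tally the edges per (daughter type, mother type) pair.
pair_counts = Counter((node_type[d], node_type[m]) for d, m in mother_edges)

for (dtype, mtype), n in sorted(pair_counts.items()):
    print("daughter = {:<10} mother = {:<10}: {}x".format(dtype, mtype, n))
```

The scripts below follow the same pattern, except that the edges come from the database instead of a hand-written list.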

We have a few questions, and we use some simple scripts to answer them.

In [1]:
import sys
import collections

import laf
from laf.fabric import LafFabric
from etcbc.preprocess import prepare

fabric = LafFabric()

  0.00s This is LAF-Fabric 4.4.3
API reference: http://laf-fabric.readthedocs.org/en/latest/texts/API-reference.html
Feature doc: http://shebanq-doc.readthedocs.org/en/latest/texts/welcome.html



In [2]:
API = fabric.load('etcbc4', '--', 'mother', {
    "xmlids": {"node": False, "edge": False},
    "features": ('''
        oid otype
        sp
    ''','''
        mother
    '''),
    "prepare": prepare,
}, verbose='NORMAL')
exec(fabric.localnames.format(var='fabric'))

  0.00s LOADING API: please wait ... 
  0.00s INFO: USING DATA COMPILED AT: 2014-07-23T09-31-37
  2.81s LOGFILE=/Users/dirk/laf-fabric-output/etcbc4/mother/__log__mother.txt
  3.50s INFO: DATA LOADED FROM SOURCE etcbc4 AND ANNOX -- FOR TASK mother AT 2014-09-29T08-41-59


# 1. Domain and codomain of mother

We want to know which types of objects occur as mothers, and which types of objects occur as daughters.

In [14]:
# Collect every node that occurs as a mother or as a daughter in some mother edge.
mothers = set()
daughters = set()
for daughter in NN():
    for mother in C.mother.v(daughter):
        mothers.add(mother)
        daughters.add(daughter)

sys.stderr.write("Mothers: {}\nDaughters: {}\n\n".format(len(mothers), len(daughters)))

nmothers = collections.Counter([F.otype.v(x) for x in mothers])
ndaughters = collections.Counter([F.otype.v(x) for x in daughters])

for x in sorted(nmothers):
    print("Mothers   of type {:<20}: {:>5}x".format(x, nmothers[x]))
for x in sorted(ndaughters):
    print("Daughters of type {:<20}: {:>5}x".format(x, ndaughters[x]))

Mothers   of type clause              : 11684x
Mothers   of type clause_atom         : 58002x
Mothers   of type phrase              :  5164x
Mothers   of type phrase_atom         :  9594x
Mothers   of type subphrase           : 20556x
Mothers   of type word                : 37008x
Daughters of type clause              : 18580x
Daughters of type clause_atom         : 89079x
Daughters of type phrase              :   207x
Daughters of type phrase_atom         : 13301x
Daughters of type subphrase           : 55244x


Mothers: 142008
Daughters: 176411
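
The counts above already answer the question, but the comparison can also be made explicit. The sketch below copies the numbers from the output by hand, so it runs without the database. It checks that the per-type counts add up to the two totals, and it lists the types that occur on only one side of the relationship.

```python
# Counts copied by hand from the output above.
mother_counts = {
    "clause": 11684,
    "clause_atom": 58002,
    "phrase": 5164,
    "phrase_atom": 9594,
    "subphrase": 20556,
    "word": 37008,
}
daughter_counts = {
    "clause": 18580,
    "clause_atom": 89079,
    "phrase": 207,
    "phrase_atom": 13301,
    "subphrase": 55244,
}

# Types that appear on one side of the relationship only.
only_mothers = sorted(set(mother_counts) - set(daughter_counts))
only_daughters = sorted(set(daughter_counts) - set(mother_counts))

print("Types that occur only as mother:  ", only_mothers)
print("Types that occur only as daughter:", only_daughters)
print("Total mothers:  ", sum(mother_counts.values()))
print("Total daughters:", sum(daughter_counts.values()))
```

In these numbers, words act as mothers but never as daughters, while every type that occurs as a daughter also occurs as a mother.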



# 2. Frequency of mother relation per domain-codomain

We want to know how many *mother* edges there are for each combination of daughter type (the domain) and mother type (the codomain).

In [17]:
pair_types = []
for daughter in NN():
    for mother in C.mother.v(daughter):
        pair_types.append((F.otype.v(daughter), F.otype.v(mother)))
npair_types = collections.Counter(pair_types)

for x in sorted(npair_types):
    print("daughter = {:<20} and mother = {:<20}: {:>5}x".format(x[0], x[1], npair_types[x]))

daughter = clause               and mother = clause              : 12462x
daughter = clause               and mother = phrase              :  5167x
daughter = clause               and mother = word                :   951x
daughter = clause_atom          and mother = clause_atom         : 89079x
daughter = phrase               and mother = clause              :     5x
daughter = phrase               and mother = phrase              :   195x
daughter = phrase               and mother = word                :     7x
daughter = phrase_atom          and mother = phrase_atom         : 11717x
daughter = phrase_atom          and mother = word                :  1584x
daughter = subphrase            and mother = subphrase           : 20556x
daughter = subphrase            and mother = word                : 34688x
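
As a cross-check, the edge counts can be compared with the number of distinct daughters and mothers found in section 1. The sketch below again works on numbers copied by hand from the two outputs, so it does not need the database. If the edge total for a daughter type equals the number of distinct daughters of that type, that is what one would expect when every daughter of that type has exactly one mother. This assumes that `C.mother.v` yields each edge once.

```python
from collections import Counter

# Edge counts per (daughter type, mother type), copied from the output above.
pair_counts = {
    ("clause", "clause"): 12462,
    ("clause", "phrase"): 5167,
    ("clause", "word"): 951,
    ("clause_atom", "clause_atom"): 89079,
    ("phrase", "clause"): 5,
    ("phrase", "phrase"): 195,
    ("phrase", "word"): 7,
    ("phrase_atom", "phrase_atom"): 11717,
    ("phrase_atom", "word"): 1584,
    ("subphrase", "subphrase"): 20556,
    ("subphrase", "word"): 34688,
}
# Distinct daughters and mothers per type, copied from the output of section 1.
daughter_counts = {"clause": 18580, "clause_atom": 89079, "phrase": 207,
                   "phrase_atom": 13301, "subphrase": 55244}
mother_counts = {"clause": 11684, "clause_atom": 58002, "phrase": 5164,
                 "phrase_atom": 9594, "subphrase": 20556, "word": 37008}

# Sum the edges by daughter type and by mother type.
edges_per_daughter_type = Counter()
edges_per_mother_type = Counter()
for (dtype, mtype), n in pair_counts.items():
    edges_per_daughter_type[dtype] += n
    edges_per_mother_type[mtype] += n

for t in sorted(daughter_counts):
    print("daughter type {:<12}: {:>6} edges, {:>6} distinct daughters".format(
        t, edges_per_daughter_type[t], daughter_counts[t]))
for t in sorted(mother_counts):
    print("mother type   {:<12}: {:>6} edges, {:>6} distinct mothers".format(
        t, edges_per_mother_type[t], mother_counts[t]))
```

On the mother side the two numbers need not agree: wherever there are more edges than distinct mothers, some mothers have more than one daughter.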
