In [1]:
%%capture

%cd ..

## Introduction

This notebook shows how to use ULTK to generate a large number of quantifiers and measure their monotonicity.

First, let's familiarize ourselves with the classes used in this example, which are subclasses of classes in the ULTK package.

### QuantifierModel

In this example, we will create a large number of quantifiers whose referents are represented by the class `QuantifierModel`. As stated in `meaning.py`, a `QuantifierModel` is a triple **<M, A, B>**, where M is analogous to the set of all possible quantifier referents for a given situation. A and B are sets of those referents that correspond to the items being compared in a quantificational construct.

A `QuantifierModel` is initialized by defining a sequence of symbols that encode the set composition of each of `M`, `A`, and `B`. 

The definitions for each symbol are as follows:

`0 => in A`

`1 => in B`

`2 => in (A and B)`

`3 => in M - (A | B)`

`4 => not in (M | A | B)`


In [8]:
from learn_quant.quantifier import QuantifierModel

qm = QuantifierModel("101234")
print(str(qm))

{'name': '101234', 'A': frozenset({1, 3}), 'B': frozenset({0, 2, 3}), 'M': frozenset({0, 1, 2, 3, 4})}


In the example above, we initialized a `QuantifierModel` from a string of the symbols 0 through 4. The sets `M`, `A`, and `B` were created from this string by the `QuantifierModel`'s `post_init` method. Each position in the sequence `101234` is the index of an "object", and the symbol at that position assigns the object to sets as follows:

0. M, A
1. M, B
2. M, A, B
3. M
4. X (neither M, A, nor B)

Notice that all sets are typed as `frozenset`s. This makes `QuantifierModel`s hashable, which subsequent routines require in order to store and look them up in `Meaning` objects.
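The symbol encoding can be reproduced in a few lines of plain Python. The sketch below is a toy re-implementation for illustration only; `decode_model` is a hypothetical helper and not part of `learn_quant` or ULTK.

```python
def decode_model(name):
    """Decode a symbol string into the sets M, A, and B (toy version, not the library code)."""
    in_a = {"0", "2"}            # symbols that place an object in A
    in_b = {"1", "2"}            # symbols that place an object in B
    in_m = {"0", "1", "2", "3"}  # every symbol except 4 places the object in M
    return {
        "M": frozenset(i for i, s in enumerate(name) if s in in_m),
        "A": frozenset(i for i, s in enumerate(name) if s in in_a),
        "B": frozenset(i for i, s in enumerate(name) if s in in_b),
    }


sets = decode_model("101234")
print(sets)

# frozensets are hashable, so a decoded model can be used as a dictionary key.
key = (sets["M"], sets["A"], sets["B"])
truth_values = {key: True}
print(truth_values[key])
```

Decoding `101234` this way gives the same `A`, `B`, and `M` as the printed `QuantifierModel` above. The dictionary lookup at the end would raise a `TypeError` if the sets were ordinary mutable `set`s.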

### Importing a `QuantifierGrammar`

A `QuantifierGrammar` is a regular `Grammar` object that also allows primitives representing integers to be added after a basic grammar has been loaded. Integer primitives can therefore be created on the fly for experiments of a given size (you would usually want primitives up to the size of `M`, that is, all referents that are "in play").

In [9]:
from learn_quant.grammar import quantifiers_grammar

You can iterate through the grammar to see what rules it contains:

In [10]:
for rule in quantifiers_grammar:
    print(rule[0], ":", rule[1])

and : bool -> and(bool, bool)
or : bool -> or(bool, bool)
not : bool -> not(bool)
union : frozenset -> union(frozenset, frozenset)
intersection : frozenset -> intersection(frozenset, frozenset)
difference : frozenset -> difference(frozenset, frozenset)
index : frozenset -> index(int, frozenset)
cardinality : int -> cardinality(frozenset)
subset_eq : bool -> subset_eq(frozenset, frozenset)
equals : bool -> equals(int, int)
greater_than : bool -> greater_than(int, int)
A : frozenset -> A
B : frozenset -> B


To add primitives for integer indices, call the `add_indices_as_primitives` method on the `QuantifierGrammar` object with either a list of specific indices or an integer upper bound up to which primitive rules should be added:

In [11]:
from copy import deepcopy

# Copy the base grammar so the shared module-level object is left unchanged.
new_grammar = deepcopy(quantifiers_grammar)
new_grammar.add_indices_as_primitives([0, 1, 2, 3], 6.0)
for rule in new_grammar:
    print(rule[0], ":", rule[1])

and : bool -> and(bool, bool)
or : bool -> or(bool, bool)
not : bool -> not(bool)
union : frozenset -> union(frozenset, frozenset)
intersection : frozenset -> intersection(frozenset, frozenset)
difference : frozenset -> difference(frozenset, frozenset)
index : frozenset -> index(int, frozenset)
cardinality : int -> cardinality(frozenset)
subset_eq : bool -> subset_eq(frozenset, frozenset)
equals : bool -> equals(int, int)
greater_than : bool -> greater_than(int, int)
A : frozenset -> A
B : frozenset -> B
0 : int -> 0
1 : int -> 1
2 : int -> 2
3 : int -> 3


You can also pass an integer upper bound as the first argument to `add_indices_as_primitives`. As the output below shows, passing `4` generates the primitives 0 through 3. The second argument assigns a `weight` to the primitives, which is used when enumerating the `Expression`s that the `QuantifierGrammar` can generate.

In [12]:
new_grammar = deepcopy(quantifiers_grammar)
new_grammar.add_indices_as_primitives(4, 6.0)
for rule in new_grammar:
    print(rule[0], ":", rule[1])

and : bool -> and(bool, bool)
or : bool -> or(bool, bool)
not : bool -> not(bool)
union : frozenset -> union(frozenset, frozenset)
intersection : frozenset -> intersection(frozenset, frozenset)
difference : frozenset -> difference(frozenset, frozenset)
index : frozenset -> index(int, frozenset)
cardinality : int -> cardinality(frozenset)
subset_eq : bool -> subset_eq(frozenset, frozenset)
equals : bool -> equals(int, int)
greater_than : bool -> greater_than(int, int)
A : frozenset -> A
B : frozenset -> B
0 : int -> 0
1 : int -> 1
2 : int -> 2
3 : int -> 3


In [13]:
for rule in new_grammar:
    print(rule[1].name, ":\t\t", rule[1].weight)

and :		 1.0
or :		 1.0
not :		 1.0
union :		 1.0
intersection :		 1.0
difference :		 1.0
index :		 1.0
cardinality :		 1.0
subset_eq :		 1.0
equals :		 1.0
greater_than :		 1.0
A :		 10.0
B :		 10.0
0 :		 6.0
1 :		 6.0
2 :		 6.0
3 :		 6.0
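The two calling conventions shown above (an explicit list, or an integer bound) can be pictured with a small standalone sketch. Everything here is hypothetical: `normalize_indices` and `index_rules` are illustrative names, and the bound is treated as exclusive only because passing `4` produced the primitives 0 through 3 in the output above. This is not how `add_indices_as_primitives` is necessarily implemented.

```python
def normalize_indices(indices):
    """Return an explicit list of indices from either a list or an integer bound."""
    if isinstance(indices, int):
        # Assumption: the bound is exclusive, so 4 yields 0, 1, 2, 3.
        return list(range(indices))
    return list(indices)


def index_rules(indices, weight):
    """Build (name, rule, weight) triples for integer primitives (toy representation)."""
    return [(str(i), f"int -> {i}", weight) for i in normalize_indices(indices)]


for name, rule, weight in index_rules(4, 6.0):
    print(name, ":", rule, "\t", weight)
```

Under these assumptions, `index_rules(4, 6.0)` and `index_rules([0, 1, 2, 3], 6.0)` describe the same four primitives, each with weight `6.0`.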


### Define a universe of referents

In this example, the function `create_universe` creates an exhaustive set of `QuantifierModel`s from the different values of A, B, and M that satisfy the given sizes of `M` and `X`. The function wraps this set of `QuantifierModel`s in a `QuantifierUniverse` object, which subclasses `Universe`. A `QuantifierUniverse` supports union with other `QuantifierUniverse`s to create larger ones, and it records the maximal `M` and `X` sizes across all of its referents.

In [14]:
from learn_quant.util import create_universe

In [16]:
quantifiers_universe = create_universe(m_size=3, x_size=4)
print("The size of the universe is {}".format(len(quantifiers_universe)))

The size of the universe is 256


Access the referents through the `referents` property of the `QuantifierUniverse` object:

In [20]:
quantifiers_universe.referents[0:10]

(QuantifierModel(name='4212', M=frozenset({1, 2, 3}), A=frozenset({1, 3}), B=frozenset({1, 2, 3})),
 QuantifierModel(name='2240', M=frozenset({0, 1, 3}), A=frozenset({0, 1, 3}), B=frozenset({0, 1})),
 QuantifierModel(name='2430', M=frozenset({0, 2, 3}), A=frozenset({0, 3}), B=frozenset({0})),
 QuantifierModel(name='0041', M=frozenset({0, 1, 3}), A=frozenset({0, 1}), B=frozenset({3})),
 QuantifierModel(name='4013', M=frozenset({1, 2, 3}), A=frozenset({1}), B=frozenset({2})),
 QuantifierModel(name='4101', M=frozenset({1, 2, 3}), A=frozenset({2}), B=frozenset({1, 3})),
 QuantifierModel(name='1430', M=frozenset({0, 2, 3}), A=frozenset({3}), B=frozenset({0})),
 QuantifierModel(name='1431', M=frozenset({0, 2, 3}), A=frozenset(), B=frozenset({0, 3})),
 QuantifierModel(name='4311', M=frozenset({1, 2, 3}), A=frozenset(), B=frozenset({2, 3})),
 QuantifierModel(name='3340', M=frozenset({0, 1, 3}), A=frozenset({3}), B=frozenset()))

You can access the sizes of `X` and `M` on the `QuantifierUniverse` object:

In [21]:
print(quantifiers_universe.x_size)
print(quantifiers_universe.m_size)

4
3


We created a universe in which every generated `QuantifierModel` has 4 indices in total, with up to 3 of those indices being *considered for pertinence in A or B* (that is, in M) during the generative process. Therefore, in this example, `['1', '2', '0', '0']` would not be valid, since it places 4 indices in `M` while `m_size` is only 3. On the other hand, `['4', '2', '0', '0']` is valid, since its first index is in `X` but not in `M`.
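As a sanity check on the universe size, the sketch below counts symbol strings directly. It assumes that every model has exactly `m_size` indices in `M` and the remaining indices marked `4`. That assumption is inferred from the referents printed above, not taken from the source of `create_universe`, and `toy_universe` is a hypothetical helper.

```python
from itertools import product


def toy_universe(m_size, x_size):
    """List symbol strings of length x_size with exactly m_size objects in M."""
    names = []
    for symbols in product("01234", repeat=x_size):
        objects_in_m = sum(s != "4" for s in symbols)  # symbols 0-3 are in M
        if objects_in_m == m_size:
            names.append("".join(symbols))
    return names


names = toy_universe(m_size=3, x_size=4)
print(len(names))
print("4200" in names, "1200" in names)
```

Under that assumption the count is 4 positions for the single `4` times 4^3 choices for the remaining symbols, which is 256 and agrees with the reported universe size. The string `4200` is included, while `1200` is excluded because all four of its indices fall in `M`.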

Let's enumerate expressions that could be created with the Language of Thought (LoT) described in the `QuantifierGrammar` we have previously defined.

We'll enumerate expressions up to a depth of 4. Higher depth values allow for more complex expressions that depend on a greater number of rules.

In [22]:
from learn_quant.scripts.generate_expressions import enumerate_quantifiers
expressions_by_meaning = enumerate_quantifiers(4, quantifiers_universe, new_grammar)

In [25]:
print("The number of expressions generated by the enumeration procedure is", len(expressions_by_meaning))

The number of expressions generated by the enumeration procedure is 3379
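To get a feel for how quickly the space of expressions grows with depth, here is a toy enumeration over a deliberately tiny grammar (`A`, `B`, `union`, `intersection`, `subset_eq`). It is a sketch only: `enumerate_sets` and `enumerate_bools` are hypothetical helpers, and neither the grammar nor the procedure is the one used by `enumerate_quantifiers`.

```python
from itertools import product


def enumerate_sets(depth):
    """Return every frozenset-typed expression string of depth <= `depth`."""
    if depth <= 0:
        return []
    expressions = ["A", "B"]  # primitives have depth 1
    if depth > 1:
        smaller = enumerate_sets(depth - 1)
        for op in ("union", "intersection"):
            for left, right in product(smaller, repeat=2):
                expressions.append(f"{op}({left}, {right})")
    return expressions


def enumerate_bools(depth):
    """Return every `subset_eq` expression string of depth <= `depth`."""
    arguments = enumerate_sets(depth - 1)
    return [f"subset_eq({left}, {right})" for left, right in product(arguments, repeat=2)]


for depth in (2, 3, 4):
    print(depth, len(enumerate_bools(depth)))
```

Even with three operators, the number of boolean expressions grows from 4 at depth 2 to 100 at depth 3 and 40804 at depth 4. These are counts of raw expression strings, so they are not comparable to the count reported above, which is taken over a collection keyed by meaning.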


Let's save the quantifiers generated in this enumeration process. You can pass a boolean to a keyword argument `pickle` to optionally save a `.pkl` in addition to a `.yml` file of the expressions.

In [None]:
from learn_quant.util import save_quantifiers

save_quantifiers(expressions_by_meaning, out_path="learn_quant/outputs/generated_expressions.yml")

We can load the expressions we just saved to the YAML file in the code block above. The load function also needs the context in which to interpret them; in the call below, that is the grammar they were generated from:

In [29]:
from ultk.util.io import read_grammatical_expressions

`read_grammatical_expressions` returns two objects. The first is an iterable of `GrammaticalExpression` objects, each containing an `Expression` and its resolved `Meaning`. The second is a dictionary keyed by the `Meaning` of each `Expression`.

In [31]:
new_grammar = deepcopy(quantifiers_grammar)
new_grammar.add_indices_as_primitives(4, 6.0)

expressions, expressions_by_meaning = read_grammatical_expressions("learn_quant/outputs/generated_expressions.yml", new_grammar)

We get an object that pairs grammatical expressions with their respective `Meaning` objects, which record the licensed `QuantifierModel`s (<M, A, B>) for that particular expression, given the universe in scope.

In [37]:
print(len(expressions))

3379


Each expression can be represented as a string:

In [34]:
str(expressions[0])

'subset_eq(A, A)'
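To make the link between an expression such as `subset_eq(A, A)` and its meaning concrete, the sketch below evaluates a few expressions over a toy set of referents and groups them by the truth values they produce. All names here (`referents`, `toy_expressions`, `meaning_of`, `by_meaning`) are hypothetical, and plain `(A, B)` pairs stand in for `QuantifierModel`s.

```python
from itertools import product

# Toy referents: every (A, B) pair of subsets of a three-object domain.
domain = [0, 1, 2]
subsets = [
    frozenset(obj for obj, keep in zip(domain, mask) if keep)
    for mask in product([False, True], repeat=len(domain))
]
referents = [(a, b) for a in subsets for b in subsets]

# Each expression is modelled as a function from (A, B) to a truth value.
toy_expressions = {
    "subset_eq(A, A)": lambda a, b: a <= a,
    "subset_eq(B, B)": lambda a, b: b <= b,
    "subset_eq(A, B)": lambda a, b: a <= b,
}


def meaning_of(name):
    """Map every referent to whether the named expression verifies it."""
    return {referent: toy_expressions[name](*referent) for referent in referents}


# Group expressions by meaning, keeping the first expression seen for each one.
by_meaning = {}
for name in toy_expressions:
    key = tuple(meaning_of(name)[referent] for referent in referents)
    by_meaning.setdefault(key, name)

print(len(referents), len(by_meaning))
print(sum(meaning_of("subset_eq(A, B)").values()))
```

In this toy setting `subset_eq(A, A)` and `subset_eq(B, B)` are true of every referent and so share one meaning, while `subset_eq(A, B)` holds for only 27 of the 64 pairs.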

Every expression object has a `Meaning` whose mapping records which referents the expression verifies.

In [39]:
expressions[0].meaning

Meaning(mapping=FrozenDict({QuantifierModel(name='0004', M=frozenset({0, 1, 2}), A=frozenset({0, 1, 2}), B=frozenset()): True, QuantifierModel(name='0014', M=frozenset({0, 1, 2}), A=frozenset({0, 1}), B=frozenset({2})): True, QuantifierModel(name='0024', M=frozenset({0, 1, 2}), A=frozenset({0, 1, 2}), B=frozenset({2})): True, QuantifierModel(name='0034', M=frozenset({0, 1, 2}), A=frozenset({0, 1}), B=frozenset()): True, QuantifierModel(name='0040', M=frozenset({0, 1, 3}), A=frozenset({0, 1, 3}), B=frozenset()): True, QuantifierModel(name='0041', M=frozenset({0, 1, 3}), A=frozenset({0, 1}), B=frozenset({3})): True, QuantifierModel(name='0042', M=frozenset({0, 1, 3}), A=frozenset({0, 1, 3}), B=frozenset({3})): True, QuantifierModel(name='0043', M=frozenset({0, 1, 3}), A=frozenset({0, 1}), B=frozenset()): True, QuantifierModel(name='0104', M=frozenset({0, 1, 2}), A=frozenset({0, 2}), B=frozenset({1})): True, QuantifierModel(name='0114', M=frozenset({0, 1, 2}), A=frozenset({0}), B=frozense