# Composition systems (`lamb.lang`)

This file is the primary documentation for composition systems, as well as ways of displaying semantic composition, for the lambda notebook. The functionality described here is mainly provided by the `lamb.lang` module.

## Overview

A `CompositionSystem` is, abstractly, a collection of ways of doing composition on some kind of object language elements via a mapping of those to metalanguage elements. This mapping is, again abstractly, managed by objects that implement the `Composable` interface. So, given some sequence of compatible `Composable`s, a `CompositionSystem` will search for ways of putting them together to produce a new `Composable` that aggregates any possible compositions of elements in the sequence. For most purposes, you will work with the active composition system set at the package-level, accessed via `lamb.lang.get_system` and `lamb.lang.set_system`.

#### Example: the `Item` class and `CompositionResult`s with purely bottom-up composition

There is a range of ways that `Composable`s can work internally, and they may cover one or more object language elements, with zero or more metalanguage elements for each of those. However, it is most useful for introductory purposes to start with a core class used across all composition systems: `lamb.lang.Item`. Objects of this class map a single string to a single metalanguage element. Items are exactly the class instantiated using the `%%lamb` magic with `||...||`:

In [None]:
%%lamb
||cat|| = L x_e : Cat_<e,t>(x)

This magic is, among other things, a shorthand for constructing an `Item` object as you would any python object:

In [None]:
dog = lamb.lang.Item("dog", te("L x_e : Dog_<e,t>(x)"))
dog

Two key elements of the `Item` API are `name` and `content`, where the former is typically a string, and the latter is typically a metalanguage (`TypedExpr`) object:

In [None]:
cat.name, cat.content

Once you have some `Composable`s, you can compose them. In the default system, composition is purely bottom-up: all you need to get a composition result is the pieces and some valid composition operation. The `*` operator will use the currently active module-level composition system to try to do this, giving another `Composable` as a result. In general, what sort of object you get back depends on the composition system, but for the default one, the result will be a `CompositionResult` object. This object represents a stage of composition, and can hold multiple composition paths. Each path is itself represented as a `Composable`.

To access these paths, you can index the result object directly, and like `Item`, the objects used internally implement `name` and `content`. For this composition system, the name of a composition result is determined by the operation and the parts, so it may vary depending on how composition proceeded; for this reason, `CompositionResult` itself does not implement `name`. Its `content` field will contain a list. To access the content of a `Composable` in a general way that makes no assumptions about size, use the `content_iter` function to get an iterator over any content objects.

In [None]:
r = cat * dog
r

In [None]:
r.content[0]

In [None]:
r.content[0].name, r.content[0].content

Composition systems generally do exhaustive search over composition possibilities at each node; composition attempts that fail are stored as well, as a list in the `failures` attribute. By default these aren't shown, as there is usually a lot of chaff here, but they can be shown via the `failures` named argument to `show`. (From this example you can also see that the default display routine for `CompositionResult` objects calls `show`.)

In [None]:
r.show(failures=True)
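
To make the search-and-record behavior concrete, here is a toy model in plain Python. It is only a sketch and not how `lamb.lang` is implemented: the type encoding (strings for basic types, pairs for functional types) and the names `fa`, `pm`, and `compose_all` are assumptions made for illustration.

```python
# Toy model of exhaustive composition search; not lamb.lang's implementation.
# Types: "e" and "t" are basic; a pair (a, b) stands for the functional type <a,b>.

ET = ("e", "t")

def fa(left, right):
    """Function application: `left` must have type <a,b> and `right` type a."""
    if isinstance(left, tuple) and left[0] == right:
        return left[1]
    raise TypeError(f"FA: cannot apply {left} to {right}")

def pm(left, right):
    """Predicate modification: both daughters must have type <e,t>."""
    if left == ET and right == ET:
        return ET
    raise TypeError(f"PM: needs two <e,t> daughters, got {left} and {right}")

def compose_all(left, right):
    """Try every rule in every relevant order; keep successes and failures."""
    attempts = [
        ("FA", fa, left, right),
        ("FA (reversed)", fa, right, left),
        ("PM", pm, left, right),  # symmetric, so one order is enough
    ]
    results, failures = [], []
    for name, rule, a, b in attempts:
        try:
            results.append((name, rule(a, b)))
        except TypeError as err:
            failures.append((name, str(err)))
    return results, failures

# Two <e,t> predicates, as in the cat/dog example above: only PM succeeds,
# and both FA attempts end up in the failure list.
results, failures = compose_all(ET, ET)
print(results)
print(failures)
```

The point of keeping `failures` around rather than discarding them is the same as in the real system: when nothing composes, the recorded mismatches are usually the fastest way to see why.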

All `Composable` objects should implement both `show()` and `tree()`, where the former shows the result without much history, and the latter produces a recursive diagram of the derivation history. If there are multiple composition paths, `tree()` should show them all.

In [None]:
r.tree()

## Bottom-up composition

The above example introduces bottom-up type-driven composition, which as input basically takes constituency groupings of lexical items. Constituency is specified entirely by the order and grouping in which `*` is applied; this can be viewed as a form of *label-free syntax*. Since non-leaf labels generally play little-to-no role in compositional semantics, even though this way of operating is rather different than how semantics is presented in textbook format, it is a very good match for how compositional semantics is used in practice. You can view the induced syntax via the `source_tree()` function:

In [None]:
(cat * (dog * cat)).source_tree()
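
The idea that constituency comes entirely from how `*` is grouped can be illustrated without any of the `lamb` machinery. The following is a toy sketch; the class `Node` and its `bracketing` method are hypothetical names, not part of `lamb.lang`.

```python
# Toy illustration of label-free syntax; `Node` is a hypothetical class.

class Node:
    def __init__(self, label, daughters=()):
        self.label = label
        self.daughters = tuple(daughters)

    def __mul__(self, other):
        # The parent's "label" is simply derived from its parts.
        return Node(f"[{self.label} {other.label}]", (self, other))

    def bracketing(self):
        """Return the induced constituency as nested tuples of leaf names."""
        if not self.daughters:
            return self.label
        return tuple(d.bracketing() for d in self.daughters)

a, b, c = Node("a"), Node("b"), Node("c")

right_branching = a * (b * c)
left_branching = (a * b) * c

print(right_branching.bracketing())  # ('a', ('b', 'c'))
print(left_branching.bracketing())   # (('a', 'b'), 'c')
```

The same three leaves give two different structures, and the only thing that distinguishes them is the placement of parentheses in the Python expression.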

This kind of composition system is implemented with `lamb.lang.CompositionSystem`, with composition operations implemented via `BinaryCompositionOp` or `UnaryCompositionOp`. The default system is of this kind, with three binary operations after Heim and Kratzer 1998: Function Application, Predicate Modification, and Predicate Abstraction. It can be accessed directly as `lamb.lang.td_system` (where "td" stands for type-driven). `CompositionSystem` objects implement `_repr_html_()` so as to make them easy to inspect:

In [None]:
lang.set_system(lang.td_system) # this won't be necessary unless cells are run out-of-order
lang.get_system()

The example in section 1 already illustrates the main pieces of this system, but the next two sections give a more exhaustive list of the moving parts and their API.

### The lexicon in bottom-up composition

When you write a lexicon in the `%%lamb` magic, this generates mappings from lexicon elements to metalanguage elements. Mappings from primitive object-language elements are represented mainly via the `lamb.lang.Item` class, which takes a name and content, via its constructor. The name is a string, and the content parameter can be either a metalanguage object (a `TypedExpr`) or a string, in which case the string will be parsed into a meta-language expression. For example, similar to the above example:

In [None]:
lamb.lang.Item('cat', 'L x_e : Cat_<e,t>(x)')

In the general case, `Item` objects are designed to have arbitrary metalanguage contents. It is also possible to have `Item`s whose content is `None`; the rule **VAC** in the default composition system will cause these to be ignored.

There are several special kinds of `Item` that can be used: `IndexedPronoun`, `Trace`, `PresupPronoun`, and `Binder`. These all are slightly more complicated, in that they involve object-language *indices*.

**`lamb.lang.IndexedPronoun`** takes a name, an index, and an optional type (which defaults to `e`). The denotation of this kind of item is always a numbered variable with the number derived from the index. As a class method, this class provides a function `index_factory` that gives a curried version of the constructor: fix the name and type, and it gives you a function that maps indices to `IndexedPronoun` objects. This is usually how pronouns are best instantiated in interactive notebooks.

In [None]:
he3 = lamb.lang.IndexedPronoun("he", 3)
he3

In [None]:
he = lamb.lang.IndexedPronoun.index_factory("he")
he(4)
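
If the "curried constructor" pattern is unfamiliar, the following plain-Python sketch shows the shape of it. `ToyPronoun` is a hypothetical stand-in, not the `lamb.lang` class, and it carries no denotation at all; it only illustrates fixing the name and type first and supplying the index later.

```python
# Hypothetical stand-in for an indexed pronoun; not the lamb.lang class.

class ToyPronoun:
    def __init__(self, name, index, typ="e"):
        self.name, self.index, self.typ = name, index, typ

    @classmethod
    def index_factory(cls, name, typ="e"):
        # Fix the name and type now; the returned function takes the index.
        return lambda index: cls(name, index, typ)

    def __repr__(self):
        return f"ToyPronoun({self.name!r}, {self.index!r}, {self.typ!r})"

she = ToyPronoun.index_factory("she")
print(she(7))  # ToyPronoun('she', 7, 'e')
print(she(2))  # ToyPronoun('she', 2, 'e')
```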

**`lamb.lang.Trace`** is a subclass of `IndexedPronoun` that always uses the object language name `t`. It takes an index and an optional type (again defaulting to `e`). It also provides the class method `index_factory` that allows a type to be prespecified, in case you are abstracting over arbitrary-typed traces.

To some degree this class is cosmetic, in that it does not involve any metalanguage difference from its superclass at all.

In [None]:
t = lamb.lang.Trace(3)
t

**`lamb.lang.PresupPronoun`** is a subclass of `IndexedPronoun` that also provides a partiality condition; specify this condition as a metalanguage function of the appropriate type.

In [None]:
he = lamb.lang.PresupPronoun.index_factory("he", condition=te("L x_e : Male_<e,t>(x)"))
he(4)

**`lamb.lang.Binder`** is a subclass of `Item` that consists just of an index with no content (content=`None`). This kind of `Item` (following Heim & Kratzer) is used to trigger Predicate Abstraction, and can be seen as an example of how to implement a syncategorematic rule.

In [None]:
he = lamb.lang.PresupPronoun.index_factory("he", condition=te("L x_e : Male_<e,t>(x)"))
(lamb.lang.Binder(4) * he(4)).tree()

**`lamb.lang.Items`**: There is also a way to map a single object language form to multiple meta-language expressions, via the class `lamb.lang.Items`. This takes a list of `Item` objects to its constructor, where multiple elements in this list are allowed to share a single object-language string. It isn't really intended to be instantiated directly but rather via the `%%lamb` magic: see the `Lexical Ambiguity.ipynb` notebook for detailed documentation on this.

In [None]:
bank = lamb.lang.Items([lamb.lang.Item("bank", "L x_e : Moneybank_<e,t>(x)"), lamb.lang.Item("bank", "L x_e : Riverbank_<e,t>(x)")])
bank

In [None]:
%lamb ||the|| = L f_<e,t> : Iota x_e : f(x)
the * bank

### The nuts and bolts of bottom-up composition

Bottom-up composition operations generate a `CompositionResult` object, which is basically a container class for `TreeComposite` objects.

**`TreeComposite`**: A `TreeComposite` object represents a single semantic composition operation situated in a tree structure: a parent node together with zero or more daughters, along with a composition operation that leads to the result. The special case of 0 daughters, i.e. a leaf node, is in fact how `Item` is implemented; composition will always produce non-leaf instances of `TreeComposite`. As usual, a `TreeComposite` object allows access to the result of composition via `name` and `content`, and implements `show()` and `tree()`. The value of `content` will be a metalanguage expression (or `None` in special cases). The value of `name` is programmatically constructed from the names of the parts and cannot be set. The composition operation used (if any; this value may be `None`) is stored in the attribute `mode`, and this will be the full `CompositionOp` object used to produce the result. See the example earlier for more cases for this API.

In [None]:
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||dog|| = L x_e : Dog_<e,t>(x)
(cat * dog)[0].tree()

In [None]:
(cat * dog)[0].mode

While the *result* represented by a `TreeComposite` is deterministic, it is used in cases where multiple composition paths lead to the same place. The full set of composition paths leading to the content can be accessed via the `all_paths()` function. This state of affairs will be included in `show()` output.

A `TreeComposite` object includes the part-structure that led to the result; this can be accessed by indexing the object directly. That is, for some `TreeComposite t`, the first daughter is `t[0]`, the second is `t[1]`, and the length can be accessed with `len(t)`. It is safe to assume that the parts of a `TreeComposite` are of the same type, though they may be subclasses.

**`CompositionResult`**: This kind of `Composable` collects together `TreeComposite` objects, representing a single step of semantic composition situated in a tree. (Even when two `CompositionResult` objects compose, the resulting `CompositionResult` collects together all the relevant `TreeComposite`s, effectively flattening out the representation at the current stage of composition.) These objects implement `show()` and `tree()`, as well as `content` and `content_iter()`. They do not implement `name`. Failed composition results are stored as a list under `failures`: these too are `TreeComposite` objects, but they will have a `lamb.types.TypeMismatch` instance as their content, rather than a metalanguage object.

In [None]:
(cat * dog).failures[0].content

Of course, in the default composition system, you can observe that it isn't so easy to get composition-based ambiguity, and so very often for basic examples a `CompositionResult` object just contains one single `TreeComposite` that entirely represents the composition history. In fact, with the core three operations, strict types, and no lexical ambiguities, it is provably impossible. However, this isn't an assumption one can generally make even in the default system, as the earlier lexical ambiguity example (in section 3.1) illustrates. Beyond this, there are plenty of ways of augmenting the core set of composition operations that can lead to ambiguity -- hence, the generality of these classes.

**Unary composition**. The default system provides only binary composition operations. However, unary composition operations can be defined (examples below). To trigger such operations, you can either call the system's `compose` function directly, or you can use the `*` operator with `None` as the second argument.

## Tree-based composition

To do bottom-up composition, what you need is basically to simplify the full syntax tree into a label-less tree with unary branching flattened out, and for many purposes this is plenty. However, it is more standard to think of composition that operates on syntactic trees, rather than lexical items or the results of prior composition. This kind of composition system, implemented as a `lamb.lang.TreeCompositionSystem`, is more complicated, and somewhat richer than pure bottom-up composition. The `lamb.lang` package provides a composition system, `hk3_system`, modeled roughly after the composition system described in Heim & Kratzer 1998, chapter 3. Similar to the default `td_system`, it implements FA, PM, and PA. Like Heim and Kratzer, it explicitly handles non-branching nodes and lexicon lookup as composition rules. When inspecting the system (next cell), you will notice that FA is implemented as two rules, one for each order of daughter nodes.

In [None]:
reload_lamb()
lang.set_system(lang.hk3_system)
lang.get_system()

The basic idea is that composition operations are implemented on *local* tree structures: they take a node with a sequence of daughters, and produce a denotation for that structure. In principle, this can happen in any order, but the results will depend on trying to compose the daughters in a similar fashion. Therefore, for nodes that have not yet been traversed, this composition system can generate placeholder composition results with underspecified types.

**`CompositionTree`**: This is the main container class for composition results in a `TreeCompositionSystem`. Each node in a tree constructed from these objects can store a sequence of denotations, and can be a placeholder. Composition can proceed in an arbitrary order in this kind of object. In contrast to bottom-up composition, this object can represent any stage of composition, including a completely uncomposed tree structure. You can turn a regular `Tree` object into a `CompositionTree` by doing composition from the composition system, or by a factory call that builds an uncomposed `CompositionTree`. Once you have the latter kind of object, you can start doing composition in arbitrary orders, but the former method starts from the top down.

In [None]:
t = Tree.fromstring("(X Y (Z Z2))")
t

In [None]:
lang.compose(t)

What has happened here? Basically, since the `Y` and `Z` nodes are entirely unexpanded when starting from the top, the system took their type as being completely underspecified. (Specifically, they have type `?`; see the documentation on type variables.) In consequence, all three operations that can work on binary trees can be applied here, leading to tighter assumptions about the types. For `FA/left` to work, `Y` must denote something of type $\langle X, X'\rangle$ and `Z` must denote something of type `X` -- but no further inferences can be drawn. `FA/right` leads to the converse situation. The PM rule in this system involves no type inference, so forces the types to be completely specified. The way to think of these results is that they are possibilities that will be filtered down (maybe eliminated entirely) when `Y` and `Z` are in fact expanded.
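
The type assumptions each rule imposes on the two placeholder daughters can be summarized as follows. The two FA rows restate the paragraph above; the PM row assumes that PM is stated for type $\langle e,t\rangle$, as in Heim & Kratzer's formulation.

$$
\begin{array}{llll}
\text{rule} & \text{type of } Y & \text{type of } Z & \text{type of the parent} \\
\hline
\texttt{FA/left} & \langle X, X'\rangle & X & X' \\
\texttt{FA/right} & X & \langle X, X'\rangle & X' \\
\texttt{PM} & \langle e, t\rangle & \langle e, t\rangle & \langle e, t\rangle
\end{array}
$$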

There are several ways of inspecting this tree further that can help clarify what is going on. The `tree()` function will display the current state of composition with results collapsed into a single tree. The `paths()` function will show the same thing, more or less, but as multiple distinct trees. (`paths()` is shorthand for `show(short=False)`, where the default display routine calls `show(short=True)`.) These trees will not render below currently unexpanded (placeholder) nodes, though the tree structure involved is still present in the object.

In [None]:
from IPython.display import display
display(lang.compose(t).tree(), lang.compose(t).paths())

As with `CompositionResult`, you can see failures for `CompositionTree` by passing `failures=True` to the `show` method:

In [None]:
lang.compose(t).show(failures=True)

Though the current running example can get past the first step, it will fail after that, because the leaf nodes are not interpretable -- in fact there is no lexicon at all at the moment!

In [None]:
r = lang.get_system().expand_all(t)
display(r, r.tree())

To get composition to work out, the leaf nodes need to be defined in the lexicon. Tree composition systems use `Item` for lexical entries, just like bottom-up composition, and you can use `%%lamb` magics to define them as usual. (This must be done after the composition system has been set, in order to register them in the correct lexicon.)

In [None]:
%%lamb
||gray|| = L x_e : Gray_<e,t>(x)
||cat|| = L x_e : Cat_<e,t>(x)

In [None]:
t2 = Tree.fromstring("(NP (AP (A gray)) (N cat))")
t2

In [None]:
lang.get_system().expand_all(t2).show(short=False, style=lamb.display.DerivStyle.PROOF)

**The lexicon**. All `CompositionSystem` objects track a lexicon, but this is most important for tree composition, where composition does not generally operate on `Item` objects directly. These map strings to composables, usually to an `Item`.

In [None]:
lang.get_system().lexicon

Lexicon-related calls on the `CompositionSystem`:
* `add_item`: add an item to the composition system's lexicon.
* `has_item`: check if an item is in the lexicon. If you pass this a string it will check for existence, and if you pass it an `Item` it will check that the lexical entry for the string is that object.
* `lookup_item`: get the lexical item corresponding to a string argument, or `None` if there is none.
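
The behavior of these three calls, in particular the two modes of `has_item`, can be mimicked with a small toy class. This is a sketch of the semantics described in the list above, not `lamb.lang`'s code: `ToyLexicon` and the `(name, content)` tuples standing in for items are assumptions.

```python
# Toy lexicon mirroring the three calls described above; not lamb.lang's code.
# Items are modeled as plain (name, content) tuples.

class ToyLexicon:
    def __init__(self):
        self._items = {}

    def add_item(self, item):
        name, _content = item
        self._items[name] = item

    def has_item(self, key):
        if isinstance(key, str):
            return key in self._items        # existence check by name
        name, _content = key
        return self._items.get(name) is key  # is this exact object the entry?

    def lookup_item(self, name):
        return self._items.get(name)         # None if there is no entry

lex = ToyLexicon()
cat_item = ("cat", "L x_e : Cat_<e,t>(x)")
other_cat = ("cat", "some other denotation")
lex.add_item(cat_item)

print(lex.has_item("cat"))      # True: there is an entry named "cat"
print(lex.has_item(cat_item))   # True: the entry is this very object
print(lex.has_item(other_cat))  # False: same name, different object
print(lex.lookup_item("dog"))   # None
```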

**Pure bottom-up composition with trees**. For the sake of convenience, the `TreeCompositionSystem` class *does* support "pure" bottom-up composition as well, and this can be triggered in the usual way with the `*` operator. The way this works is that it first builds a local tree with the node label determined by the parts (as in a regular `CompositionSystem`) and then runs composition on that tree. A special case is the treatment of an `Item` object: an `Item` will be converted to a `CompositionTree` object by triggering the `Lexicon` rule (with the content of the item being added to the current lexicon if necessary).

In [None]:
(gray * cat).paths()

One caveat is that the semantics of `compose` are different for `CompositionTree` objects: calling `compose` with a `CompositionTree` as the only argument will try to compose that tree in place. Therefore, if you want to force unary composition (e.g. the NN rule), you cannot use the `* None` shorthand, or regular `compose` calls. If you want to do this, you can use the `unary_extend` function (which is a shorthand for calling a class-specific `compose` with `unary_extend=True`). Unary composition in general needs to be thought of a bit differently when using this composition system. (In the following example, `gray` is automatically converted as above before the unary composition step is applied.)

In [None]:
lang.get_system().unary_extend(gray).paths()

If you want to turn an `Item` into a tree directly, the simplest way is to call `compose` on that item (either via `* None` or the `compose` function):

In [None]:
(gray.compose()).paths()

## Composition operations

See also: [tutorial: Composition operations](../tutorials/Composition%20operations.ipynb).

A composition operation, at a high level, takes 1 or more `Composable`s (i.e. mappings from object-language to metalanguage elements) and manipulates them in some form to produce a new `Composable`. They are implemented as callable objects, in order to better track and display information, but these objects are just wrappers over a function that does the work. There are basically two kinds of composition operations:

* those that can be entirely represented in the metalanguage.
* those that require explicit manipulation of the object language.

The former kind have a much simplified API for building. The latter kind roughly correspond to what are usually thought of as *syncategorematic* rules. There are two built-in abstract classes for dealing with composition, corresponding to the two kinds of composition: `CompositionOp` and `TreeCompositionOp`.

### Composition operation objects for bottom-up composition

There are several levels of abstraction for building composition operations. The most complicated is to work with composition operation objects directly, but when the composition operation can be directly expressed as a metalanguage combinator, it is usually simplest to use one of several available factory functions. I begin with the simplest case and progress to more complicated ones.

#### The quickest way: adding combinators directly to a composition system

Many composition operations can be expressed directly as metalanguage *combinators*, functions with no free variables. If you can express your composition rule this way, it is almost certainly simplest to do so. Consider a hypothetical trivial composition rule that does nothing to its input (similar to the classic Heim and Kratzer non-branching node rule). This can be expressed as a straightforward polymorphic identity function:

In [None]:
%te L x_X : x

To short-circuit all the elaborate mechanics for composition rules below, we can just add a combinator-based composition rule directly to a `CompositionSystem` with the methods `add_unary_rule`, `add_typeshift`, `add_binary_rule`, and `add_binary_rule_uncurried`. Here is this trivial operation as a unary example:

In [None]:
lang.set_system(lang.td_system.copy())
lang.get_system().add_unary_rule(te("L x_X : x"), "Trivial")
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
(cat * None).tree()

Similarly, here is a hypothetical binary composition operation that ignores one of its arguments entirely:

In [None]:
ignore_one_meta = %te L x_X : L y_Y : x
ignore_one_meta

We can add this to a composition system directly as a composition rule using `add_binary_rule`. The `commutative` named parameter indicates whether the rule should be assumed to be commutative; if set to `False` (the default), the composition system will try both orders of application if type-appropriate. (From the below example, you can see that PM is by default treated as commutative, because the underlying $\wedge$ operator has commutative semantics.)

In [None]:
lang.set_system(lang.td_system.copy())
lang.get_system().add_binary_rule(ignore_one_meta, "Ignore", commutative=False)
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||dog|| = L x_e : Dog_<e,t>(x)
(cat * dog).tree()

Finally, I will demonstrate a typeshift. Here is an operation that adds a vacuous world binder. (The above examples are not plausible composition operations, but this one is; it is the *Unit* operation of the reader monad at type `s`, used for lifting non-intensional elements to compose with intensional operators.)

In [None]:
lang.set_system(lang.td_system.copy())
lang.get_system().add_basic_type(types.BasicType("s"))
lang.get_system().add_typeshift(te("L x_X : L w_s : x"), "Unit")
lang.get_system().typeshift = True
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||Alfonso|| = A_e
%lamb ||Op|| = L p_<s,t> : p
(Op * (Alfonso * cat)).tree()

You can set the maximum typeshift recursion via the system property `typeshift_depth`. The default value for this is 3; here's an example that triggers `Unit` twice.

In [None]:
%lamb ||Op2|| = L p_<s,<s,t>> : p
(Op2 * (Alfonso * cat)).tree()
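
What a depth limit on typeshifting amounts to can be sketched at the level of types alone. The following toy is not `lamb.lang`'s search procedure: it only ever shifts the argument, it treats the limit as a maximum number of `Unit` applications, and the name `apply_with_shifts` is hypothetical. The exact meaning of `typeshift_depth` in the real system may differ.

```python
# Toy sketch of depth-limited typeshifting; not lamb.lang's search procedure.
# A pair (a, b) stands for the functional type <a,b>.

def unit(typ):
    """Type-level effect of the Unit shift `L x_X : L w_s : x`: X becomes <s,X>."""
    return ("s", typ)

def apply_with_shifts(fun_type, arg_type, max_depth=3):
    """Try FA, shifting the argument with Unit up to `max_depth` times.

    Returns (result_type, shifts_used), or None if nothing fits.
    """
    for shifts in range(max_depth + 1):
        if isinstance(fun_type, tuple) and fun_type[0] == arg_type:
            return fun_type[1], shifts
        arg_type = unit(arg_type)
    return None

T = "t"
op = (("s", T), ("s", T))                  # type of ||Op||:  <<s,t>,<s,t>>
op2 = (("s", ("s", T)), ("s", ("s", T)))   # type of ||Op2||: <<s,<s,t>>,<s,<s,t>>>

print(apply_with_shifts(op, T))                # (('s', 't'), 1)
print(apply_with_shifts(op2, T))               # (('s', ('s', 't')), 2)
print(apply_with_shifts(op2, T, max_depth=1))  # None
```

The last call shows why the limit matters: with too small a depth, a derivation that needs two shifts is simply not found.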

The final function accepts *uncurried* callables; this is primarily in case you want to write a python function in standard form, rather than a more natural curried combinator. First, here is an example using a metalanguage uncurried callable for the `Ignore` operation:

In [None]:
ignore_one_uncurried = %te L x_(X,Y) : x[0]
lang.set_system(lang.td_system.copy())
lang.get_system().add_binary_rule_uncurried(ignore_one_uncurried, "Ignore")
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||dog|| = L x_e : Dog_<e,t>(x)
(cat * dog).tree()

And second, here is the same example using a very simple python function implementation for `Ignore`:

In [None]:
ignore_one_uncurried = lambda x,y: x
lang.set_system(lang.td_system.copy())
lang.get_system().add_binary_rule_uncurried(ignore_one_uncurried, "Ignore")
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||dog|| = L x_e : Dog_<e,t>(x)
(cat * dog).tree()

While using a python callable makes it harder to guarantee type-safety, they can be used to access object properties not normally available in the metalanguage, e.g. to special-case handling of some specific class.
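
For readers less familiar with the curried/uncurried distinction, here it is in plain Python, independent of `lamb`. The helpers `curry2` and `uncurry2` are hypothetical names introduced only for this sketch.

```python
# Plain-Python illustration of currying; `curry2` and `uncurry2` are
# hypothetical helpers, not part of lamb.lang.

def curry2(f):
    """Turn f(x, y) into a function taking x and returning a function of y."""
    return lambda x: lambda y: f(x, y)

def uncurry2(g):
    """Turn g(x)(y) back into a two-argument function."""
    return lambda x, y: g(x)(y)

ignore_uncurried = lambda x, y: x          # standard Python form
ignore_curried = curry2(ignore_uncurried)  # the shape of `L x_X : L y_Y : x`

print(ignore_curried("cat")("dog"))            # cat
print(uncurry2(ignore_curried)("cat", "dog"))  # cat
```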

#### Working with `CompositionOp` objects directly

The above functions are shortcuts for building a `CompositionOp` object from a combinator. In some cases you may want even more general behavior. For example, composition operations that deal with indices may need to inspect indices that are part of the object language, not the metalanguage. In this case, the composition operation behavior is written using python code that has access to a full object language element, i.e. a `Composable`.

The `CompositionOp` class encapsulates composition operation behavior in a more general way. It is best thought of as an abstract base class, with subclasses specializing by arity; in practice, bottom-up operations can be either unary (`UnaryCompositionOp`) or binary (`BinaryCompositionOp`), depending on how many arguments they take. (In principle, n-ary operations are possible but none are implemented.) In addition to a name and a function providing the actual operation, both types take the following optional parameters:

* `allow_none=False`: if set to `False`, tells the operation to cause a `CompositionFailure` on arguments whose content is `None` (i.e. object language elements that explicitly lack a denotation). This can be set to `True` for composition operations that act entirely on object language information, i.e. *purely syncategorematic* rules.
* `reduce=True`: whether to automatically trigger reduction and simplification. (Generally, if this is `False`, the implementation should do it.)

A unary operation can also take:
* `typeshift=False`: if set to `True`, flags the operation as a type-shift, as seen in the prior section. This requires the composition system to have typeshifts enabled.

A binary operation can also take:
* `commutative=False`: if set to `True`, tells the composition system to not bother to try both orders (also seen above).

The following cell implements the above `Trivial` operation as a regular python function, rather than a combinator. The `trivial` function takes a single argument, which will be a `SingletonComposable` of some kind (you can assume that `content` is available and will be a single metalanguage object, or `None` if allowed). This kind of function must also allow an optional `assignment` named parameter and is responsible for passing along the assignment (which can of course be manipulated). The function returns another `Composable`; for normal cases, `UnaryComposite` and `BinaryComposite` are the wrapper classes to use around the result.

(From this example you can also get a sense of why, when it is possible, it is easier to write the combinator version -- it avoids a bunch of boilerplate code.)

In [None]:
def trivial(composable, assignment=None):
    new_content = composable.content.under_assignment(assignment)
    return lang.UnaryComposite(composable, new_content)

trivial_op = lang.UnaryCompositionOp("Trivial", trivial)
trivial_op

In [None]:
lang.set_system(lang.td_system.copy())
lang.get_system().add_rule(trivial_op)
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
(cat * None).tree()

The following example implements the above binary `Ignore` operation as python code instead of a combinator.

In [None]:
def ignore_one(c1, c2, assignment=None):
    new_content = c1.content.under_assignment(assignment)  # pass the assignment along
    return lang.BinaryComposite(c1, c2, new_content)

ignore_op = lang.BinaryCompositionOp("Ignore", ignore_one)
ignore_op

In [None]:
lang.set_system(lang.td_system.copy())
lang.get_system().add_rule(ignore_op)
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||dog|| = L x_e : Dog_<e,t>(x)
(cat * dog).tree()

As we saw above, these two examples do not really need to be implemented in this way, and can be done entirely in the metalanguage. An example composition rule that needs to be implemented using `CompositionOp` directly is the standard Predicate Abstraction rule, which needs to make reference to the index on a binder node (and is implemented purely syncategorematically - binders do not have denotations).

Intermediate between the two options presented so far, there are factory functions for generating `CompositionOp` objects from callables. For bottom-up composition, these are: `unary_factory`, `binary_factory`, and `binary_factory_uncurried`. The factory functions take the same parameters that the two `CompositionOp` subclasses do.

### Composition operations for tree composition

The overall setup for tree composition operations is very similar to bottom-up composition. Composition operations for these systems are subclasses of the abstract base class `TreeCompositionOp`, and in the most general case operate on tree fragments whose elements may or may not have content. The primary difference is that the input to an operation is the tree fragment itself, not the parts. So there is not a fundamental difference between a binary and unary operation; there are rather `TreeCompositionOp`s that work on binary branching or unary branching trees. Further, there is generic handling of arbitrary tree expansion orders that inserts placeholders when working in a top-down fashion. One caveat is that typeshifts are not currently implemented for tree composition.

#### Tree operations via combinator

The main API for adding tree composition operations is very similar to that for bottom-up composition. First, a unary rule, using the `Trivial` example from earlier. (This is actually equivalent to the built-in non-branching node rule for trees, so the example starts by removing it.)

In [None]:
lang.set_system(lang.hk3_system.copy())
lang.get_system().remove_rule("NN")
lang.get_system().add_unary_rule(te("L x_X : x"), "Trivial")
lang.get_system().get_rule("Trivial")

In [None]:
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
lang.get_system().expand_all(Tree.fromstring("(NP cat)")).paths()

As before, the simplest way to define a binary composition rule is via a combinator that takes two arguments:

In [None]:
lang.set_system(lang.hk3_system.copy())
ignore_one_meta = %te L x_X : L y_Y : x
lang.get_system().add_binary_rule(ignore_one_meta, "Ignore")
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||dog|| = L x_e : Dog_<e,t>(x)
lang.get_system().expand_all(Tree.fromstring("(NP cat dog)")).paths()

The above example applies a strict order. Commutativity is handled somewhat differently in tree composition than in bottom-up composition. There, both orders are tried unless suppressed. Here, a combinator-based rule by default matches its argument order to the order of the daughters. To control this, you can supply `swap=True`, which reverses the order, or more importantly, `mirror=True`, which generates rules that try either order. Importantly, with `mirror=True` this function will generate *two* rules with different names, one for each order; it will annotate the provided name accordingly.

In [None]:
lang.set_system(lang.hk3_system.copy())
ignore_one_meta = %te L x_X : L y_Y : x
lang.get_system().add_binary_rule(ignore_one_meta, "Ignore", mirror=True)
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||dog|| = L x_e : Dog_<e,t>(x)
lang.get_system().expand_all(Tree.fromstring("(NP cat dog)")).paths()
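
To make the three ordering options concrete, here is a minimal plain-Python sketch that does not use `lamb` at all. The helper `make_rules` and the `/fwd` and `/rev` name suffixes are hypothetical, chosen only for illustration; the names that `add_binary_rule` actually generates may differ.

```python
def make_rules(combinator, name, swap=False, mirror=False):
    """Return a dict mapping rule names to functions of (left, right).

    Hypothetical stand-in for the ordering behavior described above;
    this is not the lamb.lang implementation.
    """
    forward = lambda left, right: combinator(left)(right)
    reverse = lambda left, right: combinator(right)(left)
    if mirror:
        # two rules, one per order, each with an annotated name
        return {name + "/fwd": forward, name + "/rev": reverse}
    if swap:
        # a single rule that applies the combinator in reversed order
        return {name: reverse}
    # default: a single rule that matches argument order
    return {name: forward}

# curried combinator analogous to `L x_X : L y_Y : x`
ignore_one = lambda x: lambda y: x

strict = make_rules(ignore_one, "Ignore")
swapped = make_rules(ignore_one, "Ignore", swap=True)
mirrored = make_rules(ignore_one, "Ignore", mirror=True)

print(strict["Ignore"]("cat", "dog"))   # keeps the left element
print(swapped["Ignore"]("cat", "dog"))  # keeps the right element
print(sorted(mirrored))                 # two separately named rules
```

The point of the sketch is that mirroring is not a flag on a single rule: it yields two independent rules, so each order shows up as its own composition path.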

Uncurried functions are supported:

In [None]:
lang.set_system(lang.hk3_system.copy())
ignore_one_uncurried = lambda x,y: x
lang.get_system().add_binary_rule_uncurried(ignore_one_uncurried, "Ignore")
%lamb ||cat|| = L x_e : Cat_<e,t>(x)
%lamb ||dog|| = L x_e : Dog_<e,t>(x)
(cat * dog).tree()

#### Working with `TreeCompositionOp` objects directly

A `TreeCompositionOp` takes a name, a callable that implements the operation, and some named parameters. The callable should accept a tree fragment and a named parameter `assignment`. A tree fragment is an object `t` that has the node name as `t.name` and has a list-like interface to its children; e.g. `t[0]` accesses the first child (in left-right order). List-like means that it supports `len`, iteration, slicing, etc. (This is the `nltk.Tree` interface for a depth 1 tree.)
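
As a rough illustration of this interface, the following sketch defines a stand-in fragment class and an operation callable in plain Python. The `Fragment` class and the `first_child_op` function are hypothetical names invented for this example; in practice the fragment objects are supplied by the composition system, and the children carry `Composable` content rather than strings.

```python
class Fragment(list):
    """Hypothetical depth-1 tree fragment: a node name plus list-like children."""

    def __init__(self, name, children):
        super().__init__(children)
        self.name = name

def first_child_op(t, assignment=None):
    """Toy operation callable: accepts a fragment and a named `assignment`."""
    # A real operation would compose the children's content; this one only
    # reports the node name and passes the first child through unchanged.
    return (t.name, t[0])

frag = Fragment("NP", ["cat", "dog"])

print(len(frag))     # number of children
print(frag[0])       # first child, in left-right order
print(frag[1:])      # slicing works as for a list
print(list(frag))    # iteration over children
print(first_child_op(frag, assignment={}))
```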


Key named parameters:

* `precondition`: this lets you provide a callable that will check whether the operation is appropriate for a tree fragment before trying composition. This is primarily for error management: if a tree fragment fails the precondition, it isn't treated as a composition failure. Some pre-provided functions for this include `lamb.lang.tree_unary`, `lamb.lang.tree_binary`, and `lamb.lang.tree_leaf`.
* `allow_none` (default is `False`): if set to `True`, a `None` (empty) content for a child node will not prevent a tree fragment from being passed to the operation. Otherwise, a `CompositionFailure` will be raised.
* `source`: an object that provides information about how this composition op was generated, e.g., a combinator.
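
The interaction between `precondition` and `allow_none` can be sketched in plain Python as follows. Everything here is an assumption made for illustration: `run_op` is a hypothetical driver, `is_binary` only plays the role of a check like `lamb.lang.tree_binary`, the `CompositionFailure` class is a local stand-in rather than the library's own, and a fragment is modeled as a bare list of child content.

```python
class CompositionFailure(Exception):
    """Local stand-in for the failure mentioned above (not lamb's class)."""

def is_binary(t):
    """Toy precondition: the fragment must have exactly two children."""
    return len(t) == 2

def run_op(op, t, precondition=None, allow_none=False, assignment=None):
    """Hypothetical driver showing the order of the two checks."""
    if precondition is not None and not precondition(t):
        # The op does not apply here; this is not a composition failure.
        return None
    if not allow_none and any(child is None for child in t):
        # Empty content blocks the op unless allow_none is set.
        raise CompositionFailure("child node without content")
    return op(t, assignment=assignment)

def concat(t, assignment=None):
    """Toy operation: join the string forms of the children."""
    return "+".join(str(child) for child in t)

print(run_op(concat, ["cat"], precondition=is_binary))
print(run_op(concat, ["cat", "dog"], precondition=is_binary))
print(run_op(concat, ["cat", None], precondition=is_binary, allow_none=True))
```

In this sketch a unary fragment is silently skipped by the binary precondition, whereas a binary fragment with an empty child raises unless `allow_none=True` is passed.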

Generally, the only case where you really need to instantiate a `TreeCompositionOp` directly is if you are writing Python code that interacts with the object language.

`TreeCompositionOp` has one specialized subclass, `LexiconOp`: this looks up a leaf node's object language form in the current composition system's lexicon, and fills in denotations that it finds. (There are many examples of this in action above.)

## Built-in composition systems

**`lang.td_system`**: type-driven bottom-up composition, based on the composition operations from Heim & Kratzer 1998. "Bottom-up" means that (unlike the full Heim & Kratzer system) this does not work with tree structures, but rather with combinations of denoting expressions (themselves either simple or complex).

* **FA**: A standard function application rule. Combinator: `L f_X : L a_Y : f(a)`
* **PM**: Heim & Kratzer Predicate Modification: given two elements of type $\langle e,t \rangle$, produce property conjunction. Combinator: `L f_<e,t>: L g_<e,t>: L x_e: f(x) & g(x)`
* **PA**: a Predicate Abstraction rule. This is modeled after the Coppock and Champollion (Invitation to Formal Semantics / Semantics Boot Camp) version. Given some indexed binder node with index `i` (use `lang.Binder`) and sibling, produce an expression that is roughly: `L vari : ||sibling||`. Not expressible as a combinator.
* **VAC**: A binary rule to handle vacuous (content=`None`) nodes. If both children are vacuous, this produces a vacuous composite expression. If only one is vacuous, this produces a composite expression with the denotation of the non-vacuous child.
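
For intuition, the rules above that are expressible as combinators can be mimicked with ordinary Python closures over toy denotations. This is only an analogy under stated assumptions: the names `fa`, `pm`, and `vac` are invented here, denotations are modeled as Python functions over sets rather than typed metalanguage expressions, and no type checking is performed. What the real **VAC** rule does when neither child is vacuous is not covered by the description above; the sketch simply raises in that case.

```python
# FA analogue of `L f_X : L a_Y : f(a)`
fa = lambda f: lambda a: f(a)

# PM analogue of `L f_<e,t>: L g_<e,t>: L x_e: f(x) & g(x)`
pm = lambda f: lambda g: lambda x: f(x) and g(x)

def vac(left, right):
    """VAC analogue, with None modeling vacuous content."""
    if left is None and right is None:
        return None           # both vacuous: the result is vacuous
    if left is None:
        return right          # pass up the non-vacuous child
    if right is None:
        return left
    # assumption of this sketch only: reject two contentful children
    raise ValueError("vac expects at least one vacuous child")

# toy model: properties as membership tests on sets of individuals
cats = {"Joanie", "Tom"}
gray_things = {"Joanie", "Dumbo"}
is_cat = lambda x: x in cats
is_gray = lambda x: x in gray_things

gray_cat = pm(is_gray)(is_cat)
print(fa(gray_cat)("Joanie"))
print(fa(gray_cat)("Tom"))
print(vac(None, is_cat) is is_cat)
```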

**`lang.td_presup`**: a variant of `lang.td_system` that supports partiality during composition operations. This has only **FA**, **PM**, and **PA**, which are straightforward variants of the operations above. Partiality is projected immediately on composing.

**`lang.hk_system`**: tree-based composition modeled on the core tree-based system of Heim & Kratzer 1998, though with modifications to make it work in practice in the lambda notebook. All operations here take a leaf, unary or binary tree fragment and produce a denotation for that tree fragment.

* **Lexicon**: For leaf nodes, look up a denotation in the current lexicon.
* **FA**: A standard function application rule, similar to the above but operating on binary tree fragments. Note: this is implemented as two distinct rules, one for each order: `FA/left` and `FA/right`!
* **PM**: Same as above, but operating on binary tree fragments.
* **PA**: Same as above, but operating on binary tree fragments. That is, this is a tree-based implementation of the Invitation to Formal Semantics (IFS) approach to Predicate Abstraction rather than the original H&K one. This requires a binary fragment with a vacuous, indexed child and a contentful child.
* **NN**: A non-branching node rule. This simply percolates meaning up a unary branching tree fragment. Combinator: `L x_X : x`.
* **VAC**: Same as above, but operating on binary tree fragments.
* **IDX**: A rule to percolate indices in unary structures. If a tree fragment has a child with an index, it will inherit that index. For example, a fragment like `[XP Binder(i)]` will also be indexed with `i`. This applies recursively. (This is used, e.g., in the Heim and Kratzer account of restrictive relative clauses, where a wh-item is treated as the binder.)
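
The recursive index percolation described for **IDX** can be pictured with a small plain-Python sketch. The tuple encoding of nodes and the function `percolate_index` are hypothetical and exist only for this illustration; the actual rule operates on tree fragments within the composition system.

```python
def percolate_index(node):
    """Return the index a node ends up with, inheriting through unary branches.

    A node is modeled as a (name, index, children) tuple, where index is
    None if the node has no index of its own.
    """
    name, index, children = node
    if index is not None:
        return index
    if len(children) == 1:
        # unary structure: inherit whatever the only child carries
        return percolate_index(children[0])
    # leaves and branching nodes do not inherit in this sketch
    return None

binder = ("Binder", 5, [])
xp = ("XP", None, [binder])     # like [XP Binder(5)]
yp = ("YP", None, [xp])         # inherits recursively through XP
zp = ("ZP", None, [binder, ("cat", None, [])])  # binary: no percolation

print(percolate_index(xp), percolate_index(yp), percolate_index(zp))
```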

