# Paper Notebook (literally, but also not)

Before anything else, check that the kernel you're running is a Python 3 kernel (it should say Python 3 at the top right, or see Help | About). Then uncomment the two lines in the following cell and execute it. It should pull everything from GitHub and install it.

In [1]:
# ! git pull
# ! pip3 install -e ..

In [2]:
from framenet.data.annotation import get_frame_df, Patterns, to_layers, gf_or_target, Groups
from framenet.data.pattern    import *
import pandas as pd, qgrid

## First, we create `Groups`

A `Group` simply groups related information w.r.t. a specific sentence and annotation. 

In [3]:
cfm_groups = Groups('Cause_fluidic_motion')

A `Groups` instance contains a [Pandas](http://pandas.pydata.org/pandas-docs/stable/index.html) `DataFrame` (a type, hence _CamelCase_) in its attribute `data_frame` (a Python identifier, hence *snake_case*), which we need for generating the following table.

In [4]:
# qgrid.show_grid(cfm_groups.data_frame, 
#                 grid_options={'forceFitColumns': True, 'defaultColumnWidth': 50})

Here is how we check which `LU`s (lexical units) the `DataFrame` we just created contains:

In [5]:
cfm_groups.data_frame['annotationSet.LU'].unique()

array(['dribble.v', 'spill.v', 'spatter.v', 'splatter.v', 'splash.v',
       'spray.v', 'squirt.v', 'pump.v'], dtype=object)
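
The same check can be tried without the FrameNet data. The following is a minimal sketch on a toy `DataFrame`; only the column name `annotationSet.LU` comes from the notebook, while the rows and the `sentence.ID` column are made up for illustration.

```python
import pandas as pd

# Toy stand-in for `cfm_groups.data_frame` (invented rows).
toy_df = pd.DataFrame({
    'annotationSet.LU': ['spill.v', 'spray.v', 'spill.v', 'pump.v', 'spray.v'],
    'sentence.ID': [1, 2, 3, 4, 5],   # hypothetical column, not from the notebook
})

# `unique()` returns each distinct value once, in order of first appearance.
lus = toy_df['annotationSet.LU'].unique()

# `value_counts()` additionally reports how many rows each LU has.
lu_counts = toy_df['annotationSet.LU'].value_counts()

print(lus)
print(lu_counts)
```

`value_counts()` is often the more useful of the two here, because it also shows how unevenly the annotations are spread over the `LU`s.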

## Second: we create a `Patterns` instance.

We start from the `Groups` instance `cfm_groups` that we created in the first step. The resulting `Patterns` instance is what we query for patterns in the chosen frame.

In [6]:
cfm_patterns = Patterns(cfm_groups)

## Third: we create the symbols for all our patterns

We need to do this only once per notebook. We need these symbols because we describe the patterns we look for as Python expressions. The following applies the constructor `Lit` to the strings `'Ext'`, `'Obj'`, `'Dep'`, and `'_'` (our wildcard) and assigns the results to the corresponding Python variables. This way `Ext`, `Obj`, `Dep`, and `x` can be combined in the patterns below.

In [7]:
Ext, Obj, Dep, x = map(Lit, 'Ext Obj Dep _'.split())
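
For readers unfamiliar with the idiom, here is a small sketch of what the `map` plus tuple-unpacking line does. The `Lit` class below is a hypothetical stand-in that only stores its label; it is not the class from `framenet.data.pattern`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lit:
    """Hypothetical stand-in for the library's `Lit`: wraps a single label."""
    label: str

labels = 'Ext Obj Dep _'.split()       # ['Ext', 'Obj', 'Dep', '_']

# `map` applies `Lit` to each label; unpacking assigns the results in order.
Ext, Obj, Dep, x = map(Lit, labels)

# The one-liner is equivalent to four explicit assignments such as:
Ext_explicit = Lit('Ext')
```

Note that the wildcard string `'_'` ends up in the variable `x`, since assignment is purely positional.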

This is *almost* the $Pattern_1$ we used in the paper. The **`&`** should be interpreted as "followed by". So `p1` is `Ext` followed by `x` (the wildcard, which matches any *single* element), followed by `Obj`, followed by `Dep`.

In [8]:
p1 = Ext & x & Obj & Dep

### To be really precise

$Pattern_1$ was **`Ext` `_` `Obj` `Dep` `+`**, that is, like the following:

In [9]:
p1_actually = Ext & x & Obj & Many1(Dep)

So, `Obj` was followed by one or more `Dep`s. 
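
The notation **`Ext` `_` `Obj` `Dep` `+`** reads much like a regular expression over labels. As a rough analogy only (this is not how the library evaluates patterns), the sketch below encodes it with Python's `re` module over space-separated labels. The helper name `matches_pattern_1` and the requirement that the whole sequence match are assumptions.

```python
import re

# `Ext _ Obj Dep +` over space-separated labels:
#   `\S+`       plays the role of the wildcard (exactly one label),
#   `(?: Dep)+` means "one or more Dep".
PATTERN_1 = re.compile(r'Ext \S+ Obj(?: Dep)+')

def matches_pattern_1(labels):
    """Return True if the whole label sequence fits `Ext _ Obj Dep+`."""
    return PATTERN_1.fullmatch(' '.join(labels)) is not None

print(matches_pattern_1(['Ext', 'v', 'Obj', 'Dep']))
print(matches_pattern_1(['Ext', 'v', 'Obj', 'Dep', 'Dep']))
print(matches_pattern_1(['Ext', 'v', 'Obj']))
```

The last call fails because `+` demands at least one `Dep` after `Obj`.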

### To summarize:

<table>
<tr><th>The operator</th><th>means</th></tr>
<tr><td><code>&amp;</code></td><td><em>followed by</em></td></tr>
<tr><td><code>|</code></td><td><em>one or the other</em></td></tr>
<tr><td><code>Many1(...)</code></td><td><em>one or more of ...</em></td></tr>
<tr><td><code>Many(...)</code></td><td><em>zero or more of ...</em></td></tr>
<tr><td><code>Opt(...)</code></td><td><em>optionally ...</em></td></tr>
</table>


So, if we wanted to say that `x` should be followed by one or more of `Obj` or `Dep`, we would say: 

In [10]:
weird_p = x & Many1(Obj | Dep)

This would match things like

    v → Obj → Dep → Obj → Obj → Dep → Dep → Dep → Obj ...
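
To make the operators concrete, here is a toy re-implementation of such combinators using Python operator overloading. It is a sketch under stated assumptions, not the code in `framenet.data.pattern`: the class internals, the `matches` method, and the choice to require a match of the *whole* label sequence are all assumptions. It reuses the names `Lit`, `Many`, `Many1`, and `Opt`, so run it in a separate session to avoid shadowing the imported ones.

```python
class Pat:
    """Base class: `&` builds a sequence, `|` builds an alternative."""
    def __and__(self, other):
        return Seq(self, other)

    def __or__(self, other):
        return Alt(self, other)

    def ends(self, toks, i):
        """Yield every index j such that this pattern matches toks[i:j]."""
        raise NotImplementedError

    def matches(self, toks):
        """True if the pattern matches the entire list of labels."""
        return any(j == len(toks) for j in self.ends(toks, 0))

class Lit(Pat):
    def __init__(self, label):
        self.label = label

    def ends(self, toks, i):
        # '_' is the wildcard: it matches any single label.
        if i < len(toks) and self.label in ('_', toks[i]):
            yield i + 1

class Seq(Pat):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def ends(self, toks, i):
        for j in self.a.ends(toks, i):
            yield from self.b.ends(toks, j)

class Alt(Pat):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def ends(self, toks, i):
        yield from self.a.ends(toks, i)
        yield from self.b.ends(toks, i)

class Many(Pat):
    def __init__(self, p):
        self.p = p

    def ends(self, toks, i):
        yield i                              # zero repetitions
        for j in self.p.ends(toks, i):
            if j > i:                        # avoid looping on empty matches
                yield from self.ends(toks, j)

class Opt(Pat):
    def __init__(self, p):
        self.p = p

    def ends(self, toks, i):
        yield i                              # pattern absent
        yield from self.p.ends(toks, i)      # pattern present

def Many1(p):
    """One or more of `p`: one `p` followed by zero or more."""
    return p & Many(p)

Ext, Obj, Dep, x = map(Lit, 'Ext Obj Dep _'.split())
p1_actually = Ext & x & Obj & Many1(Dep)
weird_p = x & Many1(Obj | Dep)

print(p1_actually.matches(['Ext', 'v', 'Obj', 'Dep', 'Dep']))   # True
print(weird_p.matches(['v', 'Obj', 'Dep', 'Obj']))              # True
print(weird_p.matches(['v', 'Ext']))                            # False
```

Because `&` binds more tightly than `|` in Python, an alternative inside a sequence needs parentheses or a wrapping call, as in `Many1(Obj | Dep)`.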

### $Pattern_2$, which was **`Ext` `_` `Dep` `+`**, is like this:

In [11]:
p2 =  Ext & x & Many1(Dep)

## Fourth: now we can actually look at patterns

This is with `p1`. The `display` method understands the following keywords, all optional:

* `pattern_matcher`: defined as the above expressions, or `None` / unspecified 
* `min_count`: the minimum count for sentences matching a pattern; a nonnegative integer number
* `negative`: a boolean specifying whether a pattern should be inverted (sentences picked when they *do not* match)
* `collapse_sentences`: a boolean that eliminates the sentences from the table (when set to `True`)
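
The interplay of `min_count` and `negative` can be illustrated on toy data. The function `summarize` below is a hypothetical helper written for this note, not part of the library, and it reflects only the keyword descriptions in the list above; in particular, treating `min_count` as an inclusive threshold is an assumption.

```python
from collections import Counter

def summarize(sequences, predicate=None, min_count=0, negative=False):
    """Hypothetical helper: count label sequences, most frequent first."""
    if predicate is None:
        kept = list(sequences)
    else:
        # With negative=True, keep the sequences that do NOT satisfy the predicate.
        kept = [s for s in sequences if bool(predicate(s)) != negative]
    counts = Counter(' → '.join(s) for s in kept)
    # most_common() sorts by descending frequency; drop rare patterns.
    return [(n, pat) for pat, n in counts.most_common() if n >= min_count]

toy = [
    ['Ext', 'v', 'Obj', 'Dep'],
    ['Ext', 'v', 'Obj', 'Dep'],
    ['Ext', 'v', 'Dep'],
    ['Ext', 'v', 'Obj', 'Dep'],
    ['Ext', 'v', 'Dep'],
]

def has_obj(seq):
    return 'Obj' in seq

print(summarize(toy, has_obj))                 # [(3, 'Ext → v → Obj → Dep')]
print(summarize(toy, has_obj, negative=True))  # [(2, 'Ext → v → Dep')]
print(summarize(toy, min_count=3))             # [(3, 'Ext → v → Obj → Dep')]
```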

In [12]:
cfm_patterns.display(pattern_matcher = p1, min_count = 0, negative=False)

Unnamed: 0,freq.,Patterns,Unnamed: 3
1,42,Ext: Agent (NP) → v → Obj: Fluid (NP) → Dep: Goal (PP),...
2,3,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Fluid (PP),...
3,2,Ext: Cause (NP) → v → Obj: Fluid (NP) → Dep: Area (PP),...
4,2,Ext: Agent (NP) → v → Obj: Fluid (NP) → Dep: Source (PP),...
5,1,Ext: Cause (NP) → v → Obj: Goal (NP) → Dep: Fluid (PP),...
6,1,Ext: Cause (NP) → v → Obj: Area (NP) → Dep: Fluid (PP),...
7,1,Ext: Agent (NP) → v → Obj: Fluid (NP) → Dep: Area (PP),...
8,1,Ext: Agent (NP) → v → Obj: Fluid (NP) → Dep: Path (PP),...
9,1,Ext: Cause (NP) → v → Obj: Fluid (NP) → Dep: Path (PP),...


## Fifth: flow diagrams!

The method `select` takes a pattern (as described above). The method `diagram` takes an optional keyword argument, `noncore`, which controls whether noncore `FE`s are included. Its default value is `True`.

In [14]:
cfm_patterns.select(p1).diagram(noncore=True)

Written 10 records.


<IPython.core.display.Javascript object>
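
What `diagram` writes internally is not shown in this notebook. As a rough idea of the kind of data a flow (Sankey-style) diagram needs, the sketch below turns label sequences into weighted link records. The function `flow_links`, the record keys `source`, `target`, and `value`, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

def flow_links(sequences):
    """Count transitions between consecutive (position, label) nodes."""
    links = Counter()
    for seq in sequences:
        # Tag each label with its position so the same label at different
        # positions becomes a distinct node and the graph stays acyclic.
        nodes = list(enumerate(seq))
        for src, dst in zip(nodes, nodes[1:]):
            links[(src, dst)] += 1
    return [
        {'source': f'{i}:{a}', 'target': f'{j}:{b}', 'value': n}
        for ((i, a), (j, b)), n in links.items()
    ]

toy_sequences = [
    ['Agent', 'v', 'Fluid', 'Goal'],
    ['Agent', 'v', 'Fluid', 'Area'],
    ['Cause', 'v', 'Fluid', 'Goal'],
]

records = flow_links(toy_sequences)
for record in records:
    print(record)
```

Each record is one band of the diagram, and its `value` is the number of sequences that pass through that transition.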

## More examples we worked on

### The `Filling` frame

Comment here.

In [15]:
filling_groups = Groups('Filling')

filling_pattern = Patterns(filling_groups)
filling_pattern.display(pattern_matcher=p1, min_count=0)

Unnamed: 0,freq.,Patterns,Unnamed: 3
1,142,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Theme (PP),...
2,18,Ext: Agent (NP) → v → Obj: Goal (NP),...
3,10,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Result (AJP),...
4,6,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Goal (PP),...
5,2,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Manner (AVP),...
6,1,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Theme (PPing),...
7,1,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Time (PP),...
8,1,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Time (NP),...
9,1,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Result (PP),...
10,1,Ext: Cause (NP) → v → Obj: Theme (NP) → Dep: Goal (PP),...


In [16]:
filling_pattern.select(p1).diagram(noncore=True)

Written 7 records.


<IPython.core.display.Javascript object>

In [19]:
cfm_patterns.display(pattern_matcher=p1)

Unnamed: 0,freq.,Patterns,Unnamed: 3
1,42,Ext: Agent (NP) → v → Obj: Fluid (NP) → Dep: Goal (PP),...
2,3,Ext: Agent (NP) → v → Obj: Goal (NP) → Dep: Fluid (PP),...
3,2,Ext: Cause (NP) → v → Obj: Fluid (NP) → Dep: Area (PP),...
4,2,Ext: Agent (NP) → v → Obj: Fluid (NP) → Dep: Source (PP),...
5,1,Ext: Cause (NP) → v → Obj: Goal (NP) → Dep: Fluid (PP),...
6,1,Ext: Cause (NP) → v → Obj: Area (NP) → Dep: Fluid (PP),...
7,1,Ext: Agent (NP) → v → Obj: Fluid (NP) → Dep: Area (PP),...
8,1,Ext: Agent (NP) → v → Obj: Fluid (NP) → Dep: Path (PP),...
9,1,Ext: Cause (NP) → v → Obj: Fluid (NP) → Dep: Path (PP),...


In [20]:
cfm_patterns.select(p1_actually).diagram(noncore=False)

Written 13 records.


<IPython.core.display.Javascript object>

In [22]:
cm_groups   = Groups('Cause_motion')
cm_patterns = Patterns(cm_groups)

cm_patterns.display(pattern_matcher=p1_actually)

Unnamed: 0,freq.,Patterns,Unnamed: 3
1,178,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Goal (PP),...
2,103,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Path (PP),...
3,39,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Source (PP),...
4,12,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Goal (AVP),...
5,11,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Path (PP) → Dep: Goal (PP),...
6,10,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Manner (AVP) → Dep: Goal (PP),...
7,7,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Result (AJP),...
8,7,Ext: Cause (NP) → v → Obj: Theme (NP) → Dep: Goal (PP),...
9,5,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Goal (AJP),...
10,5,Ext: Agent (NP) → v → Obj: Theme (NP) → Dep: Result (PP),...


In [23]:
cm_patterns.select(p1).diagram(noncore=False)

Written 11 records.


<IPython.core.display.Javascript object>

The end.