# `ecco` state-space analysis

In this notebook, we see how the state-space of an `rr` model can be computed and analysed symbolically. By _symbolically_ we refer to the use of decision diagrams to represent sets of states in a compact way. More specifically, `ecco` uses [`libDDD` and `libITS`](https://lip6.github.io/ITSTools-web) as its basis for state-space representation and computation. The main benefit of such symbolic methods is that `ecco` can handle very large state-spaces, for models with tens of variables and more, while always maintaining a readable view of the model. However, a consequence is that the individual states are not shown during the analysis. (Another notebook will show how `ecco` can handle explicit state-spaces, that is, state-space representations in which individual states are explicitly enumerated.)

## Building and displaying a view

To start with, as usual, we run `ecco` and load an `rr` model:

In [1]:
%run -m ecco termites.rr

Object `model` created by `ecco` can be called as a function, in which case it returns a component graph object that allows exploring the model's state space. Optional arguments may be provided:
 * `compact=False` (default is `True`) to keep the transient states and constraint occurrences in the state space
 * `init` specifies the initial states of the LTS. If not provided, they are taken from the model. Otherwise, it must be a string containing a comma-separated sequence of state assignments, which is interpreted as successive assignments starting from the initial states specified in the model. Each assignment is either:
   - `*`: take all the potential states (the so-called universe)
   - `+`: set all variables to `+`
   - `-`: set all variables to `-`
   - `VARx`, where `VAR` is a variable name and `x` is a sign in `+`, `-`, or `*`: set the variable as specified by the sign
   
   For instance, `init="*,Ac+,Wk-"` selects all the potential states restricted to those where `Ac` is `+` and `Wk` is `-`. A list of such strings may be used, in which case the union of the corresponding sets of states is considered.
 * `split` (default is `True`): whether the graph should be initially split into its initial states, SCC hull, and deadlocks+basins
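
To make the `init` syntax more concrete, here is a minimal sketch of how such a string could be interpreted. It is not `ecco`'s implementation: the function `parse_init`, the dictionary representation (each variable mapped to `+`, `-`, or `*` for "either value"), and the starting assignment `model_init` are all assumptions made for illustration. In particular, it assumes the model's initial states can be described by a single such assignment.

```python
def parse_init(spec, variables, model_init):
    """Apply a comma-separated init string to a starting assignment.

    Returns a dict mapping each variable to '+', '-' or '*' (either value).
    Hypothetical helper, not part of ecco.
    """
    cube = dict(model_init)  # start from the model's initial states
    for item in spec.split(","):
        item = item.strip()
        if item in ("*", "+", "-"):
            # global assignment: applies to every variable
            cube = {var: item for var in variables}
        elif item[:-1] in variables and item[-1] in "+-*":
            # VARx: set a single variable to the given sign
            cube[item[:-1]] = item[-1]
        else:
            raise ValueError(f"invalid assignment: {item!r}")
    return cube

# Variable names taken from the termites model; the initial signs are made up.
variables = ["Ac", "Ec", "Fg", "Rp", "Sd", "Te", "Wd", "Wk"]
model_init = {var: "-" for var in variables}

cube = parse_init("*,Ac+,Wk-", variables, model_init)
print(cube)
```

Here `*` first widens the set to the whole universe, and the two following assignments then restrict `Ac` and `Wk`, leaving every other variable free.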

In [2]:
g = model(compact=False)

HBox(children=(HTML(value='<b>saving</b>'), HTML(value='termites/termites.gal')))

A component graph may be drawn, yielding an interactive graph with additional information:
 * default node shapes are:
   * circles for SCCs
   * squares for components that contain deadlocks
   * rounded squares for all the other components
 * some nodes may be marked with a small badge:
   * a circle for nodes that are SCC hulls
   * a triangle for nodes that contain initial states
 * default color reflects the component sizes (green for the smaller ones and red for the larger ones)
   
Remember to use `g.draw?` to see the method documentation.

In [3]:
g.draw(fig_height=300)

VBox(children=(HBox(children=(Dropdown(description='Layout', index=2, options=('PCA', 'dot', 'neato', 'fdp', '…

Information about the components (nodes) and the edges between them is available by selecting nodes in the graph, or directly through the tables `g.nodes` and `g.edges`.

## Statistical information about components

Method `g.count` counts, for each component and each variable, how many states have this variable on. If no argument is provided to `g.count`, it computes this for every component; otherwise, it expects the list of components for which the information has to be computed:

In [4]:
g.count()

Unnamed: 0,Ac,Ec,Fg,Rp,Sd,Te,Wd,Wk
1,0,0,0,0,0,1,0,1
2,2,7,4,4,8,10,10,10
3,6,0,0,8,6,0,7,14


This returns a `pandas.DataFrame` whose columns are the variables and whose index (the leftmost column, without a title) holds the component numbers. It is possible to compute a PCA on this table using `g.pca()`. Note also that PCA is one of the layout engines proposed for the graphs of views: it sets the position of nodes according to the result of `g.pca`, taking the first factor as the `x` position and the second factor as the `y` position.

In [5]:
g.pca()

Unnamed: 0,0,1
1,2.993072,-0.754523
2,-2.326425,-1.663681
3,-0.666646,2.418204
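
To illustrate what the table returned by `g.count` contains, here is a small sketch on explicitly enumerated states. `ecco` computes this symbolically, so this is only an illustration of the meaning of each cell. The `components` data and the `count` function below are made up for the example, and a state is represented as the set of variables that are on in it.

```python
from collections import Counter

# Hypothetical components: component number -> set of states,
# each state being the frozenset of variables that are on.
components = {
    1: {frozenset({"Te", "Wk"})},
    2: {frozenset({"Sd", "Wk"}), frozenset({"Sd"}), frozenset()},
}

def count(components, variables):
    """For each component and variable, count the states with the variable on."""
    table = {}
    for num, states in components.items():
        # one occurrence per (state, variable that is on in this state)
        on = Counter(var for state in states for var in state)
        table[num] = {var: on[var] for var in variables}
    return table

table = count(components, ["Sd", "Te", "Wk"])
print(table)
```

Each row of the result plays the role of one row of the `DataFrame` above: one component, with one count per variable.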


## Splitting components

Components may be split in two according to some properties. In its simplest form, a property is a variable. For instance, below, we split the components of `g` by telling apart those states in which `Sd` is on from those in which `Sd` is off. This yields a new component graph that we can draw in turn.

In [6]:
g2 = g.split("Sd")
g2.draw(fig_height=300)

VBox(children=(HBox(children=(Dropdown(description='Layout', index=2, options=('PCA', 'dot', 'neato', 'fdp', '…

From the component numbers we know that only component `2` has been split, into `4` and `5`. Indeed, the component numbers stay consistent among the various graphs one may build: if two components have the same number, we can be sure that they hold exactly the same states.
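
The effect of a split can be sketched on explicit sets of states as follows. This is an illustration under assumptions, not `ecco`'s code: the function `split_all`, the example data, and the way fresh numbers are allocated are all hypothetical. The sketch only mirrors the behaviour described above, where a component keeps its number unless it is actually cut in two.

```python
def split_all(components, predicate):
    """Split every component according to predicate.

    A component whose states all agree on the predicate is kept unchanged,
    with the same number; otherwise it is replaced by two fresh components.
    """
    result = {}
    fresh = max(components) + 1  # assumed numbering scheme for new components
    for num, states in components.items():
        yes = {s for s in states if predicate(s)}
        no = states - yes
        if yes and no:
            result[fresh], result[fresh + 1] = yes, no
            fresh += 2
        else:
            result[num] = states
    return result

# Made-up components; a state is the frozenset of variables that are on.
components = {
    1: {frozenset({"Wk"})},
    2: {frozenset({"Sd", "Wk"}), frozenset({"Te", "Wk"})},
    3: {frozenset({"Sd"})},
}

result = split_all(components, lambda state: "Sd" in state)
print(sorted(result))
```

With this data, only component `2` mixes states with `Sd` on and off, so it is the only one replaced by two new components.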

In general, the split formula may be an expression using one of the following syntaxes:
 * CTL formulas
 * ARCTL formulas
 * states formulas

Note that the syntax is automatically detected and `ecco` shows which syntax it has detected and used.
 
### CTL formulas

CTL (Computation Tree Logic) is a temporal logic that allows to characterise states with respect to what happens or not in the states that are reachable in the future. A formula can be seen as a statement about a state `s`, that is validated by exploring the states reachable from `s`. When a CTL formula is used to split a component, `ecco` separates the states that validate the formula from those that do not and splits the component accordingly. CTL formulas have to respect the following syntax:
 * atoms are variable names, they may be quoted as in `"AG"` or `'EX'` to avoid conflicts with reserved keywords of CTL; such a formula is true on every state where the variable is on
 * sub-formulas may be enclosed in parentheses to force operator priority
 * Boolean operators can be used to combine sub-formulas:
   * `~form` (NOT) is a formula that holds on states where `form` does not
   * `left & right` (AND) holds on states where both `left` and `right` sub-formulas do
   * `left | right` (OR) holds on states where either the `left` or the `right` sub-formula does, possibly both
   * `left => right` (IMPLY) holds on states where, whenever `left` holds, `right` holds as well; this is actually a shorthand for `~left | right`
   * `left <=> right` (IFF) is a shorthand for `(left => right) & (right => left)`
 * modalities allow to express conditions with respect to the future of states: `X` (NEXT), `F` (FUTURE), `G` (GLOBALLY), `U` (UNTIL), and `R` (RELEASE). Each modality must be quantified by either `A` (ALWAYS), or `E` (EXISTS). So a formula may be either:
   * `A path` holds on a state `s` if `path` does on all paths starting from `s`
   * `E path` holds on a state `s` if `path` does on at least one path starting from `s`
   
   `path` must then be a path formula, that is, one formula qualified with a unary modality or two formulas connected by a binary modality:
   * `X form` holds if `form` holds on the next state
   * `F form` holds if `form` holds eventually in the future
   * `G form` holds if `form` holds from now on and forever
   * `left U right` holds if `left` holds until a state where `right` holds is reached
   * `left R right` holds if `right` holds until a state where `left` holds is reached, after which neither `left` nor `right` is required to hold anymore

#### Examples

 * `AX Sd` (_all the next states have soldiers_) selects all the states from which the next state always has `Sd` on
 * **TODO**
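
To give an intuition of how such formulas select states, here is a minimal explicit-state sketch of a few CTL operators using the textbook fixed-point definitions. It is not how `ecco` evaluates formulas (which is done symbolically on decision diagrams), and the transition system, the set `Sd`, and the function names are made up for the example.

```python
# Hypothetical transition system: state -> set of successor states.
succ = {"s0": {"s1", "s2"}, "s1": {"s1"}, "s2": {"s3"}, "s3": {"s3"}}
states = set(succ)
Sd = {"s1", "s2"}  # states in which variable Sd is on (made up)

def EX(target):
    """States with at least one successor in target."""
    return {s for s in states if succ[s] & target}

def AX(target):
    """States whose successors are all in target."""
    return {s for s in states if succ[s] <= target}

def EU(left, right):
    """E(left U right), computed as a least fixed point."""
    result = set(right)
    while True:
        new = result | (left & EX(result))
        if new == result:
            return result
        result = new

def EF(target):
    """EF target is E(true U target)."""
    return EU(states, target)

def AG(target):
    """AG target is the dual of EF: no path ever leaves target."""
    return states - EF(states - target)

print(sorted(AX(Sd)))  # ['s0', 's1']
print(sorted(EF(Sd)))  # ['s0', 's1', 's2']
print(sorted(AG(Sd)))  # ['s1']
```

Splitting a component with `AX Sd` then amounts to separating its states that belong to `AX(Sd)` from those that do not. Note that in this sketch a state without successors would satisfy `AX` vacuously.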

### ARCTL formulas

ARCTL is a variant of CTL where quantifiers apply to a subset of actions. For instance, `A{a|b}X Sd` is like `AX Sd` but only considering actions `a` or `b`. These actions are specified in the `rr` model by adding labels to rules or constraints. For instance, in our termites model, we could label some rules with a letter indicating the main actor involved in each rule:

```
    [r] Rp+ >> Ec+
    [r] Rp+, Ec+ >> Wk+
    [w] Wk+ >> Wd+, Te+, Fg+, Ec+
    [w] Wk+, Wd+ >> Sd+, Rp+
    [w] Wk+, Te+ >> Wd-
        Wd- >> Wk-, Te-
        Wk- >> Fg-, Sd-
        Wk-, Rp- >> Ec-
    [a] Ac+, Sd- >> Wk-, Rp-

```

In general, labels are indicated in square brackets and are given as a comma-separated list of words. For instance, `[foo,bar,42]` would label a rule with actions `foo`, `bar`, and `42`. Actions are optional both in the `rr` syntax and the ARCTL syntax. In ARCTL, actions are specified through Boolean expressions constructed using the actions as atoms connected with the Boolean operators `&`, `|`, `=>`, `<=>`, and `~`, as well as parentheses.
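
The idea of restricting quantifiers to some actions can be sketched on an explicit labelled transition system. The code below is an illustration under assumptions and not `ecco`'s implementation: the transitions, the set `Sd`, and the function names are made up, and the action expression is modelled as a Python predicate on the set of labels carried by a transition. How `ecco` treats states that have no transition matching the action expression is not stated here; in this sketch they satisfy `AX` vacuously.

```python
# Hypothetical labelled transitions: (source, labels, target).
transitions = [
    ("s0", {"r"}, "s1"),
    ("s0", {"w"}, "s2"),
    ("s1", {"a"}, "s0"),
    ("s2", set(), "s2"),  # unlabelled transition
]
states = {"s0", "s1", "s2"}
Sd = {"s1"}  # states in which variable Sd is on (made up)

def restricted_succ(state, actions):
    """Successors of state through transitions whose labels satisfy actions."""
    return {dst for src, labels, dst in transitions
            if src == state and actions(labels)}

def EX(actions, target):
    """E{actions}X target: some matching transition leads into target."""
    return {s for s in states if restricted_succ(s, actions) & target}

def AX(actions, target):
    """A{actions}X target: every matching transition leads into target."""
    return {s for s in states if restricted_succ(s, actions) <= target}

def only_r(labels):
    return "r" in labels

def r_or_w(labels):
    return "r" in labels or "w" in labels

print(sorted(EX(only_r, Sd)))  # ['s0']
print(sorted(AX(r_or_w, Sd)))  # ['s1', 's2']
```

In this example `s0` is excluded from `AX(r_or_w, Sd)` because its `w` transition leads to `s2`, where `Sd` is off, whereas restricting to `r` alone would keep it.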

#### Examples

 * **TODO**

### States formulas

A state formula allows to select a set of states based on their features, but, contrary to (AR)CTL, with no reference to the successor states in the execution. Syntax is as follows:
 * the atoms are
   * a variable name such as `Wk`, which means that we want all the states in which `Wk` is on
   * a rule name such as `R3` (or a constraint name such as `C1`), which means that we want all the states in which the rule is enabled (i.e., may be executed)
   * `DEAD` is the set of deadlocks
   * `INIT` is the set of initial states
   * `HULL` is the SCC hull
   * `TRANSIENT` is the set of transient states (i.e. those that enable a constraint)
   * `ALL` is the set of all reachable states
 * the operations are
   * `~expr` (NOT) which means that we want all the states that are not represented by `expr`
   * `left | right` (OR) which means that we want all the states that are in `left`, `right`, or both
   * `left & right` (AND) which means that we want all the states that are both in `left` and `right`
   * `left ^ right` (XOR) which means that we want all the states that are either in `left` or `right` but not in both
   * `(expr)` to group sub-expressions and enforce operators priorities
 * some builtin functions may be applied to sets of states expressions:
   * `succ_R0(expr)` (resp. `pred_R0(expr)`) returns the successor (resp. predecessor) states of `expr` through rule `R0`; one such function exists for each rule or constraint
   * `hull(expr)` returns the convex hull of `expr`
   * `comp(expr)` returns the complement set of `expr`
   * `succ(expr)` returns the successor states of `expr`
   * `succ_s` is the least fixed point of `succ`
   * `succ_o` is the greatest fixed point of `succ`
   * `pred`, `pred_s`, and `pred_o` are similarly defined
   * `entries(expr)` is the set of states from `expr` that can be reached from its outside in one transition
   * `exit(expr)` is the set of states from `expr` that allow to leave it in one transition
   * `oneway(trans, expr)` is the set of states that can be reached by first firing `trans` (a rule or constraint name) and then arbitrary transitions; if `expr` is omitted, `ALL` is considered; an error is raised if `trans` is not a one-way action
   
#### Examples

 * `(Rp & ~Wk) | (R2 & ~R1)` represents all the states in which `Rp` is on and `Wk` is off, plus all the states from which `R2` but not `R1` can be executed
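
Some of the builtin functions listed above can be sketched on an explicit graph as follows. This is an illustration only, not `ecco`'s symbolic implementation: the `edges` relation and the state names are made up, `succ_s` is read here as "the smallest set that contains `expr` and is closed under `succ`" (an assumption about the intended fixed point), and `exit_` carries a trailing underscore only to avoid shadowing Python's builtin `exit`.

```python
# Hypothetical transition relation as a set of (source, target) pairs.
edges = {("s0", "s1"), ("s1", "s2"), ("s2", "s1"), ("s2", "s3")}
ALL = {"s0", "s1", "s2", "s3"}

def succ(expr):
    """States reachable from expr in exactly one transition."""
    return {dst for src, dst in edges if src in expr}

def pred(expr):
    """States from which expr is reachable in exactly one transition."""
    return {src for src, dst in edges if dst in expr}

def succ_s(expr):
    """Smallest set containing expr and closed under succ (assumed reading)."""
    reached = set(expr)
    while not succ(reached) <= reached:
        reached |= succ(reached)
    return reached

def entries(expr):
    """States of expr reachable from outside expr in one transition."""
    return expr & succ(ALL - expr)

def exit_(expr):
    """States of expr from which expr can be left in one transition."""
    return expr & pred(ALL - expr)

loop = {"s1", "s2"}
DEAD = ALL - pred(ALL)  # states without any successor

print(sorted(succ_s({"s0"})))  # ['s0', 's1', 's2', 's3']
print(sorted(entries(loop)))   # ['s1']
print(sorted(exit_(loop)))     # ['s2']
print(sorted(DEAD))            # ['s3']
```

The Boolean operations of state formulas map directly onto Python set operations in this sketch: `&` for AND, `|` for OR, `^` for XOR, and `ALL - expr` for NOT.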

### Fairness constraints

**TODO**

## Merging components

Components may also be merged using method `merge`, whose arguments are the numbers of the components to be merged.

In [13]:
g3 = g2.merge(1,3)
g3.draw(fig_height=300)

VBox(children=(HBox(children=(Dropdown(description='Layout', index=2, options=('PCA', 'dot', 'neato', 'fdp', '…

## To be continued

Soon here: more information about split, merge, using formulas, tags, and aliases...

# Explicit analysis

A component graph in `ecco` may be made completely or partially explicit, i.e., all or just some of its components may be split into their individual states. For example, below, we make component `5` of `g2` explicit:

In [7]:
x = g2.explicit(5)
x.draw()

VBox(children=(HBox(children=(Dropdown(description='Layout', index=2, options=('PCA', 'dot', 'neato', 'fdp', '…