# Display results for _not semi-deterministic_ benchmarks
The cells `[4]` and `[5]` were used to produce the right part of **Table 2** in the CAV paper. The plot from cell `[6]` corresponds to Figure 4 in the paper.

In [6]:
from ltlcross_wrapper import ResAnalyzer, gather_cumulative, gather_mins
import pandas as pd
pd.set_option("display.precision", 0)  # full option name; the bare "precision" is ambiguous in recent pandas
import spot
spot.setup()
from spot.jupyter import display_inline

For each benchmark, we list the cumulative number of states for each tool. The best value for each benchmark is highlighted with a green background. The benchmarks consist of `random` formulas or formulas from the `literature`. The suffix `_nd` indicates that `ltl2tgba` created automata that are not semi-deterministic.

The considered tools are:
 * `owl#best`: `ltl2ldgba` from the [Owl library](https://owl.model.in.tum.de/); the `#best` indicates the _best of Owl_ approach, where we run `ltl2ldgba` twice and choose the better result.
 * `seminator-1-1` is the last presented version of Seminator.
 * `seminator#def` is the default setting of Seminator 2.
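
The _best of Owl_ idea can be illustrated with plain pandas. This is a minimal sketch with made-up state counts, assuming that "better" means fewer states; it does not reproduce the internals of `compute_best` from `ltlcross_wrapper`.

```python
import pandas as pd

# Hypothetical state counts for three formulas and the two Owl runs.
states = pd.DataFrame(
    {"yes.owl#a": [4, 7, 3], "yes.owl#s": [5, 6, 3]},
    index=pd.Index([0, 1, 2], name="form_id"),
)

# Virtual tool: for every formula keep the smaller of the two results.
states["yes.owl#best"] = states[["yes.owl#a", "yes.owl#s"]].min(axis=1)
print(states)
```

The virtual column holds only numbers, which is why no automaton can be shown for `owl#best` itself.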

The prefix `yes` in a tool name means that Spot's simplifications were applied to the results of the tool (for `seminator`, that they were not disabled); the prefix `no` means the opposite.

The precomputed data contain even more tools. Apart from the tools described above, the following are available (always in both `yes.` and `no.` versions):

 * Owl without the _best of Owl_ approach; you can replace `#best` with `#a` or `#s` where `#a` stands for `ltl2ldgba -a` and analogously for `#s`.
 * `seminator-1-2` which implemented the SCC-aware optimization.
 * Seminator 2 set to use only one pipeline; you can replace `#def` with `#tgba`, `#tba`, or `#sba` to see results of `seminator --via-tgba` etc.
 
 The list of tools that are displayed can be controlled in cell `[3]`. If you want to see numbers where Spot's simplifications were disabled, change the `yes` prefix to `no`. You can display all results by changing cell `[3]` to
 ```python
 tool_set = None
 ```
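
The naming scheme described above (prefix, dot, tool name with an optional `#variant`) can be generated rather than typed by hand. The helper `tool_names` below is hypothetical and not part of `ltlcross_wrapper`; it only sketches how such a list could be built.

```python
from itertools import product

def tool_names(prefixes, tools):
    """Combine prefixes and tool names into labels such as 'yes.owl#best'."""
    return [f"{prefix}.{tool}" for prefix, tool in product(prefixes, tools)]

# Only the variants with Spot's simplifications enabled.
tool_set = tool_names(["yes"], ["owl#best", "seminator-1-1", "seminator#def"])
print(tool_set)
```

Passing `["yes", "no"]` as prefixes would list both versions of every tool.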

In [7]:
nd_benchmarks = {}
for name in ["literature_nd","random_nd"]:
    b = ResAnalyzer(f"data/{name}.csv", cols=["states","time","acc","transitions"])
    nd_benchmarks[name] = b
    b.compute_best(["yes.owl#s","yes.owl#a"],"yes.owl#best")
    b.compute_best(["no.owl#s","no.owl#a"],"no.owl#best")

In [8]:
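# Earlier selection; it is overridden by the assignment below.
tool_set = ["no.owl#best","yes.owl#best","yes.seminator-1-1","no.seminator#def"]
# Selection actually used by the following cells.
tool_set = ["yes.seminator#slim","yes.seminator#weakslim", "yes.empc#slim", "yes.empc#specialslim"]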

## Cumulative number of states
Cumulative automata sizes (numbers of states) for each benchmark. This is the main part of **Table 2** from the paper.
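
As a rough illustration of what "cumulative" means here, the sketch below sums per-formula state counts for each tool and picks the tool with the smallest total. The data and tool names are invented, and how `gather_cumulative` treats timeouts or missing values is not shown in this notebook, so that part is left out.

```python
import pandas as pd

# Hypothetical numbers of states: one row per formula, one column per tool.
sizes = pd.DataFrame(
    {"tool_a": [3, 5, 2], "tool_b": [4, 4, 1]},
    index=pd.Index([0, 1, 2], name="form_id"),
)

cumulative = sizes.sum()          # total number of states per tool
best_tool = cumulative.idxmin()   # the value that would be highlighted
print(cumulative)
print("best:", best_tool)
```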

In [9]:
gather_cumulative(nd_benchmarks, tool_set=tool_set)

| tool | literature_nd | random_nd |
|---|---|---|
| yes.empc#slim | 788 | 12931 |
| yes.empc#specialslim | 787 | 13214 |
| yes.seminator#slim | 408 | 8214 |
| yes.seminator#weakslim | 588 | 9690 |


### Minimal automata

The following table shows for how many formulas each tool produces an automaton with the smallest number of states. The minimum ranges over the tools selected by `tool_set` in cell `[3]`. The number in the column **min hits** shows how many times the tool matched the size of the smallest automaton. The number in **unique min hits** counts only the cases where the given tool is the only one with such a small automaton.
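
The two counts can be sketched in a few lines of pandas. The data and tool names below are invented for illustration, and the code is not the implementation of `gather_mins`; it only follows the definitions given above.

```python
import pandas as pd

# Hypothetical numbers of states: one row per formula, one column per tool.
sizes = pd.DataFrame(
    {"tool_a": [3, 5, 2, 4], "tool_b": [3, 4, 2, 6], "tool_c": [4, 4, 1, 6]}
)

row_min = sizes.min(axis=1)           # smallest automaton per formula
is_min = sizes.eq(row_min, axis=0)    # True where a tool reaches the minimum

min_hits = is_min.sum()               # ties are counted for every tool involved
only_one = is_min.sum(axis=1) == 1    # formulas with a single winner
unique_min_hits = is_min[only_one].sum()

print(pd.DataFrame({"unique min hits": unique_min_hits, "min hits": min_hits}))
```

By construction, **unique min hits** can never exceed **min hits** for the same tool.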

In [10]:
gather_mins(nd_benchmarks, tool_set=tool_set)



| tool | literature_nd: unique min hits | literature_nd: min hits | random_nd: unique min hits | random_nd: min hits |
|---|---|---|---|---|
| yes.empc#slim | 0 | 2 | 0 | 47 |
| yes.empc#specialslim | 0 | 2 | 1 | 48 |
| yes.seminator#slim | 14 | 16 | 309 | 425 |
| yes.seminator#weakslim | 4 | 6 | 71 | 187 |


### Scatter plots
We offer interactive scatter plots that show how Seminator 2 compares to `owl#best` with Spot's simplifications (`In[6]`), to Owl in its default settings `no.owl#def` (`In[9]`), and to Seminator 1.1 (`In[10]`) on the random benchmark. In the paper, we present only the first scatter plot (`In[6]`), in a non-interactive form.

When you click on a dot, you get instructions on how to display the formulas represented by that dot (some variant of cell `[7]`). Using the `formula_id`, you can then display the two automata (we advise doing this only for automata of reasonable size), as demonstrated in cell `[8]`. We cannot display the automata for `owl#best`, as this is only a virtual tool for which we compute values (number of states, time, etc.). The automaton is always produced by one of `yes.owl#a` or `yes.owl#s`.

In [11]:
b = nd_benchmarks["random_nd"]
# b.bokeh_scatter_plot("yes.seminator#slim","yes.seminator#weakslim", "no.empc#slim", include_equal=True)
b.bokeh_scatter_plot("yes.seminator#weakslim", "yes.empc#slim", include_equal=True)
b.bokeh_scatter_plot("yes.seminator#slim", "yes.empc#slim", include_equal=True)

In [12]:
data = b.get_plot_data('yes.seminator#weakslim','yes.seminator#slim',add_count=False)
data[(data['yes.seminator#weakslim'] == 2) & (data['yes.seminator#slim'] == 4)]

| form_id | formula | yes.seminator#weakslim | yes.seminator#slim |
|---|---|---|---|


In [14]:
tool1 = 'yes.seminator#weakslim'
tool2 = 'yes.seminator#slim'
tool3 = 'yes.empc#slim'
tool4 = 'yes.empc#specialslim'  # was a duplicate of tool3
form_id = 292
display_inline(b.aut_for_id(form_id, tool1),b.aut_for_id(form_id, tool2), b.aut_for_id(form_id, tool3),b.aut_for_id(form_id, tool4))

# b = nd_benchmarks["random_nd"]
b.bokeh_scatter_plot("yes.seminator#slim","yes.seminator#weakslim", include_equal=True)

In [9]:
b.bokeh_scatter_plot("yes.seminator-1-2","yes.seminator#def", include_equal=True)

KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Index(['yes.seminator-1-2'], dtype='object', name='tool'). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"
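
The `KeyError` above says that the label `yes.seminator-1-2` is missing from the loaded data. One way to avoid such failures is to check the requested labels against the index first. The sketch below uses a toy pandas `Series` rather than the `ResAnalyzer` object, so the variable names are illustrative only.

```python
import pandas as pd

# Toy stand-in for a per-tool result indexed by tool name.
max_time = pd.Series(
    [30.0, 12.84],
    index=pd.Index(["yes.seminator#slim", "yes.seminator#weakslim"], name="tool"),
)
wanted = ["yes.seminator-1-2", "yes.seminator#slim"]

# Split the request into labels that exist and labels that do not.
available = [tool for tool in wanted if tool in max_time.index]
missing = [tool for tool in wanted if tool not in max_time.index]

if missing:
    print("not in the data:", missing)
selected = max_time.loc[available]   # safe: every label is present
print(selected)
```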

### Cross-comparison
The cross-comparison for a benchmark shows, in the cell (`row`, `column`), in how many cases the tool in `row` produces an automaton that is better than the one produced by the tool in `column`. The last column (`V`) sums the numbers in each row, while the green highlighting fills a space proportional to how well the tool in `row` competed against `column` (proportional across columns).
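
A minimal sketch of such a cross-comparison, assuming that "better" means strictly fewer states. The data and tool names are invented, and the code is not the implementation of `cross_compare`.

```python
import pandas as pd

# Hypothetical numbers of states: one row per formula, one column per tool.
sizes = pd.DataFrame(
    {"tool_a": [3, 5, 2, 4], "tool_b": [3, 4, 2, 6], "tool_c": [4, 4, 1, 6]}
)
tools = list(sizes.columns)

# Cell (row, col): number of formulas where `row` beats `col`; diagonal stays NaN.
cross = pd.DataFrame(index=tools, columns=tools, dtype=float)
for row in tools:
    for col in tools:
        if row != col:
            cross.loc[row, col] = (sizes[row] < sizes[col]).sum()

cross["V"] = cross.sum(axis=1)   # row totals; NaN on the diagonal is skipped
print(cross)
```

Ties count for neither tool, so the cells (`row`, `column`) and (`column`, `row`) need not add up to the number of formulas.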

In [12]:
for n, b in nd_benchmarks.items():
    print(n)
    display(b.cross_compare(tool_set=tool_set))

literature_nd


| | yes.seminator#slim | yes.seminator#weakslim | yes.empc#slim | yes.empc#specialslim | V |
|---|---|---|---|---|---|
| yes.seminator#slim | | 15.0 | 19.0 | 19.0 | 53 |
| yes.seminator#weakslim | 4.0 | | 20.0 | 19.0 | 43 |
| yes.empc#slim | 0.0 | 0.0 | | 9.0 | 9 |
| yes.empc#specialslim | 0.0 | 1.0 | 2.0 | | 3 |


random_nd


| | yes.seminator#slim | yes.seminator#weakslim | yes.empc#slim | yes.empc#specialslim | V |
|---|---|---|---|---|---|
| yes.seminator#slim | | 322.0 | 481.0 | 480.0 | 1283 |
| yes.seminator#weakslim | 144.0 | | 480.0 | 479.0 | 1103 |
| yes.empc#slim | 1.0 | 2.0 | | 166.0 | 169 |
| yes.empc#specialslim | 2.0 | 3.0 | 102.0 | | 107 |


### Running times and timeouts
The older versions of Seminator reached the 30s timeout in one case for formulae from literature. Otherwise, most of the execution times were below 1s for all tools.
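
For orientation, the sketch below derives the maximal running time and a timeout count from a table of per-formula times. The data are invented, and the assumption that a timed-out run is recorded with a time of at least 30 seconds is ours; the notebook itself obtains the counts from `get_error_counts`.

```python
import pandas as pd

TIMEOUT = 30.0  # seconds; the limit mentioned in the text

# Hypothetical running times: one row per formula, one column per tool.
times = pd.DataFrame({"tool_a": [0.12, 30.0, 0.4], "tool_b": [0.2, 0.9, 0.3]})

max_time = times.max()                 # slowest run of each tool
timeouts = (times >= TIMEOUT).sum()    # runs that hit the limit (assumption)
below_1s = (times < 1.0).mean()        # fraction of runs under one second

print(pd.DataFrame({"max": max_time, "timeouts": timeouts, "below 1s": below_1s}))
```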

In [13]:
with pd.option_context('display.precision', 2):
    for name, b in nd_benchmarks.items():
        print(name)
        display(b.get_error_counts())
        display(b.values.time.max().loc[tool_set])

literature_nd


| tool | timeout | parse error | incorrect | crash | no output |
|---|---|---|---|---|---|
| yes.empc#slim | 1 | 0 | 0 | 0 | 0 |
| yes.empc#specialslim | 1 | 0 | 0 | 0 | 0 |
| yes.seminator#slim | 1 | 0 | 0 | 0 | 0 |


tool
yes.seminator#slim        30.00
yes.seminator#weakslim    12.84
yes.empc#slim             30.00
yes.empc#specialslim      30.00
dtype: float64

random_nd


| tool | timeout | parse error | incorrect | crash | no output |
|---|---|---|---|---|---|
| yes.empc#slim | 4 | 0 | 0 | 0 | 0 |
| yes.empc#specialslim | 6 | 0 | 0 | 0 | 0 |
| yes.seminator#slim | 4 | 0 | 0 | 0 | 0 |
| yes.seminator#weakslim | 4 | 0 | 0 | 0 | 0 |


tool
yes.seminator#slim        30.01
yes.seminator#weakslim    30.02
yes.empc#slim             30.02
yes.empc#specialslim      30.02
dtype: float64