# Results of complementation of Büchi automata
This notebook presents the data used for Section 4.2 of the CAV paper. The cells `[4]` and `[5]` contain data for **Table 3** from the paper; the table is shown in cell `[6]`. 

In [1]:
from ltlcross_wrapper import ResAnalyzer, gather_cumulative, gather_mins
import pandas as pd
pd.set_option("display.precision", 0)  # full option name; the bare "precision" is ambiguous in newer pandas

import spot
spot.setup()
from spot.jupyter import display_inline

from tools import benchmark_names as names

### Visualization of the cumulative data over all benchmarks
For each benchmark, we list the cumulative number of states for each tool. The best value for each benchmark is highlighted with a green background. Each benchmark starts with the translation of either `random` formulas or formulas from `literature` by `ltl2tgba`. The suffix `_det` indicates that `ltl2tgba` created automata that are already deterministic, `_sd` stands for semi-deterministic (but not deterministic) automata, and `_nd` represents automata that are not even semi-deterministic.

The tools are:

* `spot` : `autfilt` from the [Spot](https://spot.lrde.epita.fr/) library performs determinization-based complementation; it is configured to produce TGBA as output.
* `roll` : `Buechic` from the [ROLL](https://iscasmc.ios.ac.cn/roll/doku.php) library, which is based on automata learning techniques.
* `goal#fri` : The Fribourg complementation plugin for [GOAL](http://goal.im.ntu.edu.tw), which is based on [this paper](https://dl.acm.org/doi/10.1145/3209108.3209138).
* `goal#pit` : The Piterman complementation of GOAL, which is based on Piterman's determinization (a variant of Safra's construction) and conversion to NBA.
* `seminator#best` : The default complementation in [Seminator 2](https://github.com/mklokocka/seminator). Seminator first performs semi-determinization and then applies two transition-based variants of the [NCSB algorithm](https://www.fi.muni.cz/~xstrejc/publications/tacas2016coSDBA_preprint.pdf) for complementation of semi-deterministic automata. The smaller of the two results is then returned. The two variants are:
  - `spot`: the algorithm as already implemented in Spot
  - `pldi`: the new version of the algorithm implemented in Seminator, based on this [PLDI'18 paper](https://dl.acm.org/doi/10.1145/3192366.3192405).
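
The "smaller of the two results" selection can be illustrated on a table of sizes. The following is only a sketch on made-up numbers, not Seminator's implementation: the state counts are hypothetical, and treating a failed run (`NaN`) as "use the other variant" is an assumption.

```python
import pandas as pd

# Hypothetical state counts for three formulas; NaN marks a failed run.
sizes = pd.DataFrame(
    {
        "seminator#spot": [4, 7, float("nan")],
        "seminator#pldi": [5, 6, 3],
    },
    index=pd.Index([0, 1, 2], name="form_id"),
)

# Row-wise minimum; NaN is skipped, so one successful variant is enough.
best = sizes.min(axis=1)
# Name of the variant that produced the smaller automaton (first one on ties).
winner = sizes.idxmin(axis=1)

print(pd.DataFrame({"seminator#best": best, "from": winner}))
```

The `from` column makes explicit that a "best" value always originates from one concrete variant.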

The `yes` prefix in the tool names means that Spot's simplifications were applied to the results of the tools (for Seminator, it means they were not disabled).

The precomputed results also contain data without Spot's simplifications and results of some other tool configurations that were not presented in the paper. You can control which tools are shown by changing the variable `tool_set` in cell `[2]`. The data without Spot's simplifications can be displayed by changing the `yes` prefix in the tool names to `no` (as done further down in this notebook).

By setting the cell `[2]` to
```python
tool_set = None
```
you can display results for all benchmarked configurations. On top of both the `yes` and `no` versions of the tools described above, you will find results for the following tools.

* `spot_DPA` : `autfilt` configured to return parity automata without the conversion to TGBA
* `seminator#spot` and `seminator#pldi` are the two variants of the complementation in Seminator. Here we call the selected variant exclusively.

In [2]:
tool_set = ["yes.roll","yes.goal#fri","yes.goal#pit","yes.spot","yes.seminator#best"]

In [3]:
benchmarks = {}
for name in names:
    b = ResAnalyzer(f"data/{name}.csv", tool_set=tool_set, cols=["states","time","acc","transitions","edges"])
    b.name = name
    b.orig_count = len(b.values)
    b.clean_count = len(b.values.dropna())
    benchmarks[name] = b

## Cumulative number of states
Cumulative sizes (numbers of states) of the automata produced for each benchmark. This is the main part of **Table 3** from the paper.
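
To make the notion of a cumulative value concrete, here is a minimal sketch on hypothetical data. It assumes that only formulas on which every tool succeeded are summed (mirroring the `dropna()` used for `clean_count` in cell `[3]`); the actual logic of `gather_cumulative` may differ.

```python
import pandas as pd

# Hypothetical per-formula state counts; NaN stands for a timeout or other failure.
states = pd.DataFrame(
    {
        "yes.spot": [3, 5, 2, 8],
        "yes.seminator#best": [3, 4, 2, float("nan")],
        "yes.roll": [6, float("nan"), 4, 9],
    }
)

# Assumption: sum only over formulas where all tools produced a result,
# so that every tool is measured on the same inputs.
clean = states.dropna()
cumulative = clean.sum().astype(int)

# Same "solved+failed" notation as the "# of formulas" row of Table 3.
print(f"{len(clean)}+{len(states) - len(clean)} formulas")
print(cumulative)
```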

In [4]:
gather_cumulative(benchmarks)

| tool | literature_det | literature_sd | literature_nd | random_det | random_sd | random_nd |
|---|---|---|---|---|---|---|
| yes.goal#fri | 627 | 290 | 142 | 2530 | 3294 | 5278 |
| yes.goal#pit | 617 | 277 | 206 | 2490 | 3676 | 7713 |
| yes.roll | 1388 | 833 | 272 | 3687 | 5681 | 6225 |
| yes.seminator#best | 622 | 210 | 169 | 2511 | 2781 | 4919 |
| yes.spot | 611 | 190 | 181 | 2477 | 2829 | 5310 |


### Minimal automata

The following table shows for how many formulas each tool produces an automaton with the smallest number of states. The minimum ranges over the tools selected by `tool_set` in cell `[2]`. The number in the column **min hits** shows how many times the tool matched the size of the smallest automaton. The number in **unique min hits** counts only the cases where the given tool is the only one that achieved this size.
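
The two counts can be sketched as follows. This is an illustration on hypothetical sizes, not the code of `gather_mins`.

```python
import pandas as pd

# Hypothetical state counts: one row per formula, one column per tool.
states = pd.DataFrame(
    {
        "yes.spot": [2, 5, 3, 4],
        "yes.seminator#best": [4, 5, 3, 3],
        "yes.goal#fri": [2, 6, 3, 5],
    }
)

row_min = states.min(axis=1)
# True where a tool matches the smallest size for that formula.
is_min = states.eq(row_min, axis=0)

# min hits: how often each tool reaches the minimum (ties included).
min_hits = is_min.sum()
# unique min hits: only formulas where exactly one tool reaches the minimum.
only_one = is_min.sum(axis=1) == 1
unique_min_hits = is_min[only_one].sum()

print(pd.DataFrame({"unique min hits": unique_min_hits, "min hits": min_hits}))
```

By construction, the unique min hits of a tool can never exceed its min hits.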

In [5]:
gather_mins(benchmarks)

| tool | literature_det: unique min hits | literature_det: min hits | literature_sd: unique min hits | literature_sd: min hits | literature_nd: unique min hits | literature_nd: min hits | random_det: unique min hits | random_det: min hits | random_sd: unique min hits | random_sd: min hits | random_nd: unique min hits | random_nd: min hits |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| yes.goal#fri | 0 | 137 | 2 | 26 | 7 | 14 | 6 | 464 | 28 | 258 | 65 | 238 |
| yes.goal#pit | 0 | 143 | 0 | 28 | 1 | 5 | 0 | 477 | 0 | 125 | 8 | 96 |
| yes.seminator#best | 0 | 142 | 3 | 37 | 1 | 8 | 0 | 465 | 65 | 420 | 92 | 277 |
| yes.spot | 8 | 150 | 10 | 40 | 3 | 9 | 10 | 489 | 43 | 354 | 52 | 202 |
| yes.roll | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 4 | 45 | 58 |


## Table 3

In [6]:
cum = gather_cumulative(benchmarks, highlight=False).loc[tool_set]
mins = gather_mins(benchmarks, highlight=False, unique_only=False).loc[tool_set]
mins.columns=mins.columns.droplevel(1)
df = pd.DataFrame()

for c in cum.columns:
    df[c] = cum[c].astype(int).astype(str)
    df[c] = df[c].str.cat(mins[c].astype(int).astype(str), sep=" (", join="left") + ")"
    
# Add formula counts
counts = {name: f"{b.clean_count}+{b.orig_count - b.clean_count}" for name, b in benchmarks.items()}
counts = pd.Series(counts, name="# of formulas")
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
df = pd.concat([pd.DataFrame(counts).transpose(), df.loc[tool_set]])

df.columns = pd.MultiIndex.from_tuples([c.split("_") for c in df.columns]).swaplevel()
df = df.sort_index(axis=1)
df = df[["det","sd","nd"]]

tool_names={
    "yes.spot" : "Spot",
    "yes.roll" : "ROLL+Spot",
    "yes.goal#fri": "Fribourg+Spot",
    "yes.goal#pit": "GOAL+Spot",
    "yes.seminator#best": "Seminator 2"
}
columns = {
    "literature" : "literature",
    "random" : "random",
    "det" : "deterministic",
    "sd" :  "semi-deterministic",
    "nd" : "non-semi-deterministic",
}
df.rename(index=tool_names, columns=columns)

| | deterministic: literature | deterministic: random | semi-deterministic: literature | semi-deterministic: random | non-semi-deterministic: literature | non-semi-deterministic: random |
|---|---|---|---|---|---|---|
| # of formulas | 147+5 | 500+0 | 47+2 | 499+1 | 15+5 | 486+14 |
| ROLL+Spot | 1388 (0) | 3687 (0) | 833 (0) | 5681 (4) | 272 (0) | 6225 (58) |
| Fribourg+Spot | 627 (137) | 2530 (464) | 290 (26) | 3294 (258) | 142 (14) | 5278 (238) |
| GOAL+Spot | 617 (143) | 2490 (477) | 277 (28) | 3676 (125) | 206 (5) | 7713 (96) |
| Spot | 611 (150) | 2477 (489) | 190 (40) | 2829 (354) | 181 (9) | 5310 (202) |
| Seminator 2 | 622 (142) | 2511 (465) | 210 (37) | 2781 (420) | 169 (8) | 4919 (277) |


### Cross-comparison by automata type
The cross-comparison for a benchmark shows, in cell (`row`, `column`), in how many cases the tool in `row` produces an automaton that is better than the one produced by the tool in `column`. The last column (`V`) sums the numbers in each row, while the green highlighting fills a space proportional to how well the tool in `row` competed against the tool in `column` (proportional across columns).
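
A cross-comparison matrix of this shape can be sketched as below. The data are hypothetical, "better" is taken to mean "strictly fewer states", and counting a failed run as worse than any successful one is an assumption about what `include_fails=True` does, not a description of `cross_compare` itself.

```python
import pandas as pd

# Hypothetical state counts; NaN marks a failure (e.g. a timeout).
states = pd.DataFrame(
    {
        "yes.spot": [2, 5, 3, 4],
        "yes.seminator#best": [4, 5, 3, 3],
        "yes.roll": [6, float("nan"), 4, 9],
    }
)

# Assumption: a failure loses against every successful run.
filled = states.fillna(float("inf"))

tools = list(filled.columns)
cmp = pd.DataFrame(index=tools, columns=tools, dtype=float)
for row in tools:
    for col in tools:
        if row != col:
            # Number of formulas where `row` is strictly smaller than `col`.
            cmp.loc[row, col] = (filled[row] < filled[col]).sum()

# V: total number of "wins" of the row tool over all other tools.
cmp["V"] = cmp.sum(axis=1).astype(int)
print(cmp)
```

The diagonal stays empty (`NaN`), as in the tables produced by the notebook.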

In [7]:
for t in ["nd", "sd", "det"]:
    bench = {n: b for n, b in benchmarks.items() if n[-3:].find(t) >= 0}
    display(gather_cumulative(bench, tool_set))
    for b in bench.values():
        display(b.name, b.cross_compare(tool_set, include_fails=True))
    print("\n\n")

Unnamed: 0_level_0,literature_nd,random_nd
tool,Unnamed: 1_level_1,Unnamed: 2_level_1
yes.goal#fri,142,5278
yes.goal#pit,206,7713
yes.roll,272,6225
yes.seminator#best,169,4919
yes.spot,181,5310


'literature_nd'

Unnamed: 0,yes.roll,yes.goal#fri,yes.goal#pit,yes.spot,yes.seminator#best,V
yes.roll,,1.0,4.0,2.0,2.0,9
yes.goal#fri,19.0,,15.0,12.0,9.0,55
yes.goal#pit,16.0,2.0,,9.0,3.0,30
yes.spot,18.0,5.0,10.0,,7.0,40
yes.seminator#best,18.0,6.0,12.0,12.0,,48


'random_nd'

Unnamed: 0,yes.roll,yes.goal#fri,yes.goal#pit,yes.spot,yes.seminator#best,V
yes.roll,,113.0,231.0,124.0,96.0,564
yes.goal#fri,387.0,,379.0,233.0,228.0,1227
yes.goal#pit,266.0,58.0,,111.0,102.0,537
yes.spot,375.0,221.0,368.0,,213.0,1177
yes.seminator#best,402.0,242.0,381.0,243.0,,1268







Unnamed: 0_level_0,literature_sd,random_sd
tool,Unnamed: 1_level_1,Unnamed: 2_level_1
yes.goal#fri,290,3294
yes.goal#pit,277,3676
yes.roll,833,5681
yes.seminator#best,210,2781
yes.spot,190,2829


'literature_sd'

Unnamed: 0,yes.roll,yes.goal#fri,yes.goal#pit,yes.spot,yes.seminator#best,V
yes.roll,,4.0,6.0,3.0,3.0,16
yes.goal#fri,44.0,,9.0,12.0,2.0,67
yes.goal#pit,42.0,18.0,,13.0,0.0,73
yes.spot,46.0,22.0,18.0,,11.0,97
yes.seminator#best,46.0,24.0,18.0,15.0,,103


'random_sd'

Unnamed: 0,yes.roll,yes.goal#fri,yes.goal#pit,yes.spot,yes.seminator#best,V
yes.roll,,24.0,57.0,13.0,14.0,108
yes.goal#fri,473.0,,300.0,109.0,91.0,973
yes.goal#pit,442.0,114.0,,25.0,28.0,609
yes.spot,483.0,237.0,365.0,,108.0,1193
yes.seminator#best,486.0,246.0,369.0,142.0,,1243







Unnamed: 0_level_0,literature_det,random_det
tool,Unnamed: 1_level_1,Unnamed: 2_level_1
yes.goal#fri,627,2530
yes.goal#pit,617,2490
yes.roll,1388,3687
yes.seminator#best,622,2511
yes.spot,611,2477


'literature_det'

Unnamed: 0,yes.roll,yes.goal#fri,yes.goal#pit,yes.spot,yes.seminator#best,V
yes.roll,,2.0,2.0,0.0,4.0,8
yes.goal#fri,150.0,,5.0,15.0,13.0,183
yes.goal#pit,150.0,6.0,,15.0,11.0,182
yes.spot,152.0,15.0,12.0,,14.0,193
yes.seminator#best,148.0,8.0,4.0,14.0,,174


'random_det'

Unnamed: 0,yes.roll,yes.goal#fri,yes.goal#pit,yes.spot,yes.seminator#best,V
yes.roll,,1.0,1.0,0.0,5.0,7
yes.goal#fri,499.0,,36.0,21.0,59.0,615
yes.goal#pit,499.0,20.0,,13.0,50.0,582
yes.spot,500.0,43.0,44.0,,64.0,651
yes.seminator#best,495.0,27.0,13.0,9.0,,544







### Running times (in seconds) and timeouts
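The complementation implemented in Seminator 2 is also competitive in terms of running time: its cumulative times are within about a second of Spot's (or better) on every benchmark except `random_nd`, and they are far below those of GOAL and ROLL on all benchmarks.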

In [8]:
gather_cumulative(benchmarks, col="time")

| tool | literature_det | literature_sd | literature_nd | random_det | random_sd | random_nd |
|---|---|---|---|---|---|---|
| yes.goal#fri | 820 | 393 | 276 | 1791 | 2097 | 3102 |
| yes.goal#pit | 807 | 536 | 382 | 1821 | 2425 | 4302 |
| yes.roll | 1036 | 213 | 594 | 860 | 1034 | 2882 |
| yes.seminator#best | 7 | 2 | 2 | 19 | 23 | 357 |
| yes.spot | 7 | 2 | 12 | 19 | 22 | 147 |


In [9]:
for name, b in benchmarks.items():
    print(name)
    display(b.get_error_counts())
    print("\n")

literature_det


Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.roll,4,0,0,1,0
yes.roll,4,0,0,0,0




literature_sd


Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.goal#fri,2,0,0,0,0
no.goal#pit,2,0,0,0,0
no.roll,1,0,0,0,0
yes.goal#fri,1,0,0,0,0
yes.goal#pit,2,0,0,0,0
yes.roll,1,0,0,0,0




literature_nd


Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.goal#fri,1,0,0,0,0
no.goal#pit,2,0,0,0,0
no.roll,3,0,0,0,0
yes.goal#fri,1,0,0,0,0
yes.goal#pit,2,0,0,0,0
yes.roll,2,0,0,0,0




random_det


Unnamed: 0_level_0,timeout,parse error,incorrect,crash,no output
tool,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1




random_sd


Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.goal#pit,1,0,0,0,0
yes.goal#pit,1,0,0,0,0




random_nd


Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.goal#fri,3,0,0,0,0
no.goal#pit,6,0,0,0,0
no.roll,5,0,0,0,0
yes.goal#fri,3,0,0,0,0
yes.goal#pit,8,0,0,0,0
yes.roll,5,0,0,0,0
yes.seminator#best,1,0,0,0,0
yes.seminator#pldi,1,0,0,0,0
yes.seminator#spot,1,0,0,0,0
yes.spot,1,0,0,0,0






### Scatter plots
We offer interactive scatter plots that show how Seminator 2 compares to Spot (`In[11]`) and to the Fribourg complementation with Spot's simplifications (`In[14]`) on the _not semi-deterministic random_ benchmark. In the paper, we present them as **Figure 5**. We also show a comparison to the determinization-based complementation from GOAL (with Spot's simplifications), `yes.goal#pit` (`In[15]`), which is not presented in the paper.

When you click on a dot, you get instructions on how to display the formulas represented by the dot (some variant of cell `[12]`). Using the `form_id`, you can further display the two automata as demonstrated in cell `[13]` (we advise doing this only for automata of reasonable size). We cannot display the automata for `owl#best` as this is just a virtual tool for which we compute values (like number of states, time, etc.). The automaton is always produced by one of `yes.owl#a` or `yes.owl#s`.

In [10]:
b = benchmarks["random_nd"]

In [11]:
b.bokeh_scatter_plot("yes.spot","yes.seminator#best")

In [12]:
data = b.get_plot_data('yes.spot','yes.seminator#best',add_count=False)
data[(data['yes.spot'] == 2) & (data['yes.seminator#best'] == 4)]

Unnamed: 0_level_0,tool,yes.spot,yes.seminator#best
form_id,formula,Unnamed: 2_level_1,Unnamed: 3_level_1
189,F(FGa R b),2,4


In [13]:
tool1 = 'yes.spot'
tool2 = 'yes.seminator#best'
form_id = 189
display_inline(b.aut_for_id(form_id, tool1), b.aut_for_id(form_id, tool2))

In [14]:
b.bokeh_scatter_plot("yes.goal#fri","yes.seminator#best")

In [15]:
p = b.bokeh_scatter_plot("yes.goal#pit","yes.seminator#best")

## Without simplifications of Spot
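The Piterman complementation of GOAL (`goal#pit`) removes dead and unreachable states itself, and ROLL probably does not create such states. The remaining tools do not remove them.

We can observe that Fribourg generates a large number of unnecessary states before simplifications: on `random_nd`, its cumulative size is 48796 states without Spot's simplifications, compared to 5278 with them.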
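
To illustrate what removing dead and unreachable states means, here is a minimal sketch for a state-based Büchi automaton given as a plain successor map. It is not the procedure used by Spot or GOAL; the representation and the names `reachable` and `trim` are made up for this example. A state is kept if it is reachable from the initial state and can still reach an accepting state that lies on a cycle.

```python
from collections import deque

def reachable(start, succ):
    """Return all states reachable from the states in `start` (inclusive)."""
    seen = set(start)
    queue = deque(start)
    while queue:
        s = queue.popleft()
        for t in succ.get(s, ()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

def trim(init, succ, accepting):
    """Return the states that are neither unreachable nor dead."""
    # 1. States reachable from the initial state.
    reach = reachable([init], succ)
    # 2. Accepting states on a cycle: reachable from one of their own successors.
    on_cycle = {a for a in accepting & reach
                if a in reachable(list(succ.get(a, ())), succ)}
    # 3. Live states can reach such a state; search backwards over predecessors.
    pred = {}
    for s, targets in succ.items():
        for t in targets:
            pred.setdefault(t, set()).add(s)
    live = reachable(list(on_cycle), pred)
    return reach & live

# Hypothetical automaton: state 4 is unreachable, states 2 and 3 are dead
# (state 3 is accepting but has no outgoing edge, so it lies on no cycle).
succ = {0: [1, 2], 1: [1], 2: [3], 3: [], 4: [1]}
kept = trim(0, succ, accepting={1, 3})
print(sorted(kept))
```

Counting states before and after such a trimming step is one way to see how many of a tool's states are unnecessary.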

In [16]:
no_tools = [t.replace("yes","no") for t in tool_set]

In [17]:
gather_cumulative(benchmarks, tool_set=no_tools)

| tool | literature_det | literature_sd | literature_nd | random_det | random_sd | random_nd |
|---|---|---|---|---|---|---|
| no.goal#fri | 1251 | 2518 | 642 | 5135 | 15437 | 48796 |
| no.goal#pit | 767 | 1108 | 369 | 2992 | 8954 | 22941 |
| no.roll | 1628 | 915 | 271 | 4775 | 6799 | 7486 |
| no.seminator#best | 798 | 439 | 470 | 3100 | 5775 | 18341 |
| no.spot | 608 | 257 | 205 | 2480 | 3497 | 6974 |


### Minimal automata
Without Spot's simplifications, the Fribourg complementation never returns an automaton with the minimal number of states and is therefore omitted from the next table.

In [18]:
gather_mins(benchmarks, tool_set=no_tools)

| tool | literature_det: unique min hits | literature_det: min hits | literature_sd: unique min hits | literature_sd: min hits | literature_nd: unique min hits | literature_nd: min hits | random_det: unique min hits | random_det: min hits | random_sd: unique min hits | random_sd: min hits | random_nd: unique min hits | random_nd: min hits |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| no.goal#pit | 0 | 44 | 0 | 0 | 0 | 0 | 0 | 74 | 1 | 4 | 3 | 10 |
| no.seminator#best | 0 | 11 | 2 | 21 | 1 | 1 | 0 | 69 | 17 | 161 | 25 | 56 |
| no.spot | 104 | 152 | 24 | 43 | 16 | 16 | 403 | 500 | 306 | 466 | 296 | 347 |
| no.roll | 0 | 0 | 4 | 4 | 3 | 3 | 0 | 0 | 16 | 35 | 122 | 148 |


### Time in seconds

In [19]:
gather_cumulative(benchmarks, col="time", tool_set=no_tools)

| tool | literature_det | literature_sd | literature_nd | random_det | random_sd | random_nd |
|---|---|---|---|---|---|---|
| no.goal#fri | 829 | 444 | 271 | 1795 | 2089 | 3056 |
| no.goal#pit | 783 | 549 | 395 | 1784 | 2418 | 4142 |
| no.roll | 997 | 214 | 664 | 858 | 1026 | 2842 |
| no.seminator#best | 6 | 2 | 1 | 19 | 22 | 31 |
| no.spot | 6 | 1 | 1 | 18 | 21 | 24 |


In [20]:
for t in ["nd", "sd", "det"]:
    bench = {n: b for n, b in benchmarks.items() if n[-3:].find(t) >= 0}
    display(t, gather_cumulative(bench, no_tools))
    for b in bench.values():
        display(b.cross_compare(no_tools))
    print("\n\n\n")

'nd'

Unnamed: 0_level_0,literature_nd,random_nd
tool,Unnamed: 1_level_1,Unnamed: 2_level_1
yes.goal#fri,142,5278
yes.goal#pit,206,7713
yes.roll,272,6225
yes.seminator#best,169,4919
yes.spot,181,5310


Unnamed: 0,no.roll,no.goal#fri,no.goal#pit,no.spot,no.seminator#best,V
no.roll,,16.0,14.0,3.0,15.0,48
no.goal#fri,4.0,,1.0,0.0,3.0,8
no.goal#pit,6.0,18.0,,1.0,9.0,34
no.spot,17.0,20.0,19.0,,19.0,75
no.seminator#best,5.0,17.0,10.0,1.0,,33


Unnamed: 0,no.roll,no.goal#fri,no.goal#pit,no.spot,no.seminator#best,V
no.roll,,473.0,412.0,134.0,287.0,1306
no.goal#fri,27.0,,17.0,0.0,25.0,69
no.goal#pit,88.0,480.0,,15.0,101.0,684
no.spot,365.0,500.0,484.0,,442.0,1791
no.seminator#best,213.0,475.0,397.0,57.0,,1142








'sd'

Unnamed: 0_level_0,literature_sd,random_sd
tool,Unnamed: 1_level_1,Unnamed: 2_level_1
yes.goal#fri,290,3294
yes.goal#pit,277,3676
yes.roll,833,5681
yes.seminator#best,210,2781
yes.spot,190,2829


Unnamed: 0,no.roll,no.goal#fri,no.goal#pit,no.spot,no.seminator#best,V
no.roll,,23.0,13.0,4.0,7.0,47
no.goal#fri,25.0,,0.0,0.0,0.0,25
no.goal#pit,35.0,47.0,,0.0,0.0,82
no.spot,45.0,49.0,49.0,,28.0,171
no.seminator#best,42.0,49.0,49.0,8.0,,148


Unnamed: 0,no.roll,no.goal#fri,no.goal#pit,no.spot,no.seminator#best,V
no.roll,,391.0,255.0,28.0,131.0,805
no.goal#fri,107.0,,8.0,0.0,0.0,115
no.goal#pit,237.0,490.0,,1.0,10.0,738
no.spot,471.0,500.0,496.0,,372.0,1839
no.seminator#best,369.0,500.0,480.0,50.0,,1399








'det'

Unnamed: 0_level_0,literature_det,random_det
tool,Unnamed: 1_level_1,Unnamed: 2_level_1
yes.goal#fri,627,2530
yes.goal#pit,617,2490
yes.roll,1388,3687
yes.seminator#best,622,2511
yes.spot,611,2477


Unnamed: 0,no.roll,no.goal#fri,no.goal#pit,no.spot,no.seminator#best,V
no.roll,,60.0,2.0,0.0,7.0,69
no.goal#fri,91.0,,0.0,0.0,0.0,91
no.goal#pit,150.0,152.0,,0.0,76.0,378
no.spot,152.0,152.0,113.0,,141.0,558
no.seminator#best,144.0,150.0,56.0,0.0,,350


Unnamed: 0,no.roll,no.goal#fri,no.goal#pit,no.spot,no.seminator#best,V
no.roll,,292.0,7.0,0.0,16.0,315
no.goal#fri,203.0,,0.0,0.0,2.0,205
no.goal#pit,493.0,500.0,,0.0,135.0,1128
no.spot,500.0,500.0,431.0,,448.0,1879
no.seminator#best,483.0,496.0,75.0,0.0,,1054








In [21]:
for name, b in benchmarks.items():
    display(name, b.get_error_counts())

'literature_det'

Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.roll,4,0,0,1,0
yes.roll,4,0,0,0,0


'literature_sd'

Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.goal#fri,2,0,0,0,0
no.goal#pit,2,0,0,0,0
no.roll,1,0,0,0,0
yes.goal#fri,1,0,0,0,0
yes.goal#pit,2,0,0,0,0
yes.roll,1,0,0,0,0


'literature_nd'

Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.goal#fri,1,0,0,0,0
no.goal#pit,2,0,0,0,0
no.roll,3,0,0,0,0
yes.goal#fri,1,0,0,0,0
yes.goal#pit,2,0,0,0,0
yes.roll,2,0,0,0,0


'random_det'

Unnamed: 0_level_0,timeout,parse error,incorrect,crash,no output
tool,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1


'random_sd'

Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.goal#pit,1,0,0,0,0
yes.goal#pit,1,0,0,0,0


'random_nd'

Unnamed: 0,timeout,parse error,incorrect,crash,no output
no.goal#fri,3,0,0,0,0
no.goal#pit,6,0,0,0,0
no.roll,5,0,0,0,0
yes.goal#fri,3,0,0,0,0
yes.goal#pit,8,0,0,0,0
yes.roll,5,0,0,0,0
yes.seminator#best,1,0,0,0,0
yes.seminator#pldi,1,0,0,0,0
yes.seminator#spot,1,0,0,0,0
yes.spot,1,0,0,0,0
