# NFA Learning

![logo](https://iscasmc.ios.ac.cn/roll/lib/exe/fetch.php?media=wiki:logo.png)

This is a tutorial for the Java Library [```Regular Omega Language Learning (ROLL)```](https://iscasmc.ios.ac.cn/roll/doku.php) in a Groovy kernel Jupyter Notebook.
Groovy is very similar to Java and you can write all code in Java syntax.

**Tip**: If something behaves strangely, use ```Kernel -> Restart``` in the menu bar above to restart this notebook, then run the code cells again.

---

**First of all, load the jar file of the learning library ROLL.**

In [1]:
%classpath add jar ROLL.jar

Added jar: [ROLL.jar]


In the active automata learning setting proposed by Angluin, there is a `teacher`, who knows the target language $L$, and a `learner`, whose task is to learn the target language, represented as an automaton, from the teacher by means of two kinds of queries: `membership queries` and `equivalence queries`.
A membership query $MQ[w]$ asks whether a string $w$ belongs to $L$, while an equivalence query $EQ[A]$ asks whether the hypothesis automaton $A$ recognizes $L$.
If the hypothesis is incorrect, the teacher replies with a witness, that is, a counterexample word on which $A$ and $L$ disagree; otherwise the learner has completed its job.

In the following, we introduce two ways to learn the regular language $L =\Sigma^* b \Sigma \Sigma$ over the alphabet $\Sigma = \{a, b\}$ by means of nondeterministic finite automata (NFAs).
The NFA learning algorithm used here is called NL$^*$; it was introduced by Benedikt Bollig, Peter Habermehl, Carsten Kern, and Martin Leucker in their paper "Angluin-Style Learning of NFA".
This algorithm uses an observation table to store the results of membership queries and relies on the notion of "prime rows", which correspond to the states of the conjectured NFA.
We refer the reader to that paper for more details on the NL$^*$ learning algorithm.
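
To make the two kinds of queries concrete, here is a minimal sketch of a brute-force teacher for $L$ in plain Java. The class `TeacherSketch` and its methods are hypothetical and not part of ROLL. The equivalence query is only approximated by comparing the hypothesis with $L$ on all words up to a length bound, whereas a real teacher decides equivalence exactly.

```java
import java.util.function.Predicate;

// Hypothetical helper, not part of ROLL: a brute-force teacher for L = Sigma* b Sigma Sigma.
class TeacherSketch {
    static final char[] SIGMA = {'a', 'b'};

    // MQ[w]: w is in L iff its third-to-last letter is 'b'.
    static boolean memberQuery(String w) {
        return w.matches("[ab]*b[ab][ab]");
    }

    // Bounded stand-in for EQ[A]: compares the hypothesis with L on all words
    // of length <= maxLen, shortest first, and returns the first word on which
    // they disagree, or null if no such word exists within the bound.
    static String equivalenceQuery(Predicate<String> hypothesis, int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            String cex = search("", len, hypothesis);
            if (cex != null) {
                return cex;
            }
        }
        return null;
    }

    // Enumerates all extensions of prefix by `remaining` letters, 'a' before 'b'.
    private static String search(String prefix, int remaining, Predicate<String> hypothesis) {
        if (remaining == 0) {
            return hypothesis.test(prefix) != memberQuery(prefix) ? prefix : null;
        }
        for (char c : SIGMA) {
            String cex = search(prefix + c, remaining - 1, hypothesis);
            if (cex != null) {
                return cex;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(memberQuery("abaa")); // true
        System.out.println(memberQuery("ab"));   // false
        // a hypothesis that rejects every word is refuted by the word baa
        System.out.println(equivalenceQuery(w -> false, 5)); // baa
    }
}
```

For a hypothesis that rejects every word, the shortest disagreement found by this enumeration is `baa`.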

The first way is to let the embedded [DK package](http://www.brics.dk/automaton/) play the role of the teacher.
The second way is to play the role of the teacher yourself.

**1. Learning the regular language $L$ from DK by giving a target NFA $N$**

We first need to create the target NFA $N$ which accepts the regular language $L$.

In [2]:
import roll.words.Alphabet
import roll.automata.DFA
// you can always import all the classes in roll.jupyter package
import roll.jupyter.*
import java.util.List
import java.util.ArrayList

// in order to create an alphabet, you need a list of Characters
// the variable apList is local since there is a type in front of it
List<Character> apList = new ArrayList<Character>();

// in Groovy, we have to cast the literals to char explicitly
apList.add((char)'a');
apList.add((char)'b');

// create an alphabet with a Character list
// the created alphabet is global in this notebook
JupyterROLL.createAlphabet(apList);

// use JupyterROLL to create an NFA object N
// the variable N is global since there is no type in front of it
// so we can use this variable everywhere in this notebook
N = JupyterROLL.createNFA();

// now we can get the alphabet from the NFA
alphabet = N.getAlphabet();

[0->a, 1->b]

In [3]:
// now we are ready to create the NFA which accepts the regular language L
// we first create 4 states
N.createState();
N.createState();
N.createState();
N.createState();


// 4 indices for the states
int fst = 0, snd = 1, thd = 2, fur = 3;
// the function getState is to get a state object by its state index
N.getState(fst).addTransition(alphabet.indexOf((char)'a'), fst); // 0 -> 0 via a
N.getState(fst).addTransition(alphabet.indexOf((char)'b'), fst); // 0 -> 0 via b
N.getState(fst).addTransition(alphabet.indexOf((char)'b'), snd); // 0 -> 1 via b
N.getState(snd).addTransition(alphabet.indexOf((char)'a'), thd); // 1 -> 2 via a
N.getState(snd).addTransition(alphabet.indexOf((char)'b'), thd); // 1 -> 2 via b
N.getState(thd).addTransition(alphabet.indexOf((char)'a'), fur); // 2 -> 3 via a
N.getState(thd).addTransition(alphabet.indexOf((char)'b'), fur); // 2 -> 3 via b

// set 0 as the initial state
N.setInitial(fst);
// set 3 as a final state
N.setFinal(fur);

// now we can display N as a DOT graph
N
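
As a sanity check on the transitions above, the following sketch simulates the same four-state NFA in plain Java by tracking the set of states reachable after each letter. The class `NfaSimulation` is hypothetical and does not use the ROLL API; it assumes the input contains only the letters `a` and `b`.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of ROLL: subset simulation of the target NFA N.
class NfaSimulation {
    // DELTA[state][letter] lists the successors; letter 0 is 'a' and letter 1 is 'b',
    // matching the alphabet indices [0->a, 1->b] printed above.
    static final int[][][] DELTA = {
        { {0}, {0, 1} }, // state 0: a -> {0}, b -> {0, 1}
        { {2}, {2} },    // state 1: a -> {2}, b -> {2}
        { {3}, {3} },    // state 2: a -> {3}, b -> {3}
        { {},  {} }      // state 3: no outgoing transitions
    };
    static final int INITIAL = 0;
    static final int FINAL = 3;

    // Returns true iff some run of the NFA on word ends in the final state.
    // Assumes word consists only of the letters 'a' and 'b'.
    static boolean accepts(String word) {
        Set<Integer> current = new TreeSet<>();
        current.add(INITIAL);
        for (char c : word.toCharArray()) {
            int letter = c - 'a';
            Set<Integer> next = new TreeSet<>();
            for (int state : current) {
                for (int successor : DELTA[state][letter]) {
                    next.add(successor);
                }
            }
            current = next;
        }
        return current.contains(FINAL);
    }

    public static void main(String[] args) {
        System.out.println(accepts("baa"));  // true
        System.out.println(accepts("abab")); // true
        System.out.println(accepts("ab"));   // false
    }
}
```

The word `baa` is accepted because the run can move from state 0 to state 1 on the `b` and then reach state 3 after two more letters.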

Now we are ready to create an NFA learner to learn a hypothesis NFA $A$ from DK. The NFA $A$ accepts the regular language $L$ and happens to have the minimal number of states.

In the following, we will demonstrate how to use the learning algorithm to learn the target language $L$.

In [7]:
import roll.jupyter.*
    
// we create a global variable sequence, which stores the learning procedure as a
// list of Triple objects; each Triple has three elements:
// the first is the table data structure, the second is the current hypothesis NFA,
// and the third is the counterexample that refined the previous hypothesis into the current one
sequence = JupyterROLL.learningSeq("nlstar", "table", N);

// sequence is a java.util.List instance
sequence.size()

2

From the size of the learning sequence, we see that the learning algorithm has learned the target language $L$ with only 2 equivalence queries.

We can now inspect the Triple object at each step of the learning procedure. **The prime rows in the observation table are marked with a leading "\*"**.

In [8]:
// initial learner data
sequence.get(0)

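Learner (observation table):

| Part | Row | ϵ |
|---|---|---|
| upper | \* ϵ | - |
| lower | \* a | - |
| lower | \* b | - |

Hypothesis (DOT graph): initial state 0; transitions 0 -> 0 via a, 0 -> 0 via b.

Counterexample: (none)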


In [9]:
// we get a new hypothesis after one counterexample refinement
sequence.get(1)

Learner (observation table):

| Part | Row | ϵ | baa | aa | a |
|---|---|---|---|---|---|
| upper | \* ϵ | - | + | - | - |
| upper | \* b | - | + | + | - |
| upper | \* ba | - | + | - | + |
| upper | \* baa | + | + | - | - |
| lower | \* a | - | + | - | - |
| lower | bb | - | + | + | + |
| lower | bab | + | + | + | - |
| lower | \* baaa | - | + | - | - |
| lower | \* baab | - | + | + | - |

Hypothesis (DOT graph): initial state 0; transitions 0 -> 0 via a, b; 0 -> 1 via b; 1 -> 0 via a, b; 1 -> 1 via b; 1 -> 2 via a, b; 2 -> 0 via a, b; 2 -> 1 via b; 2 -> 3 via a, b; 3 -> 0 via a, b; 3 -> 1 via b.

Counterexample: $baa$
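
The `*` marks in this table can be recomputed by hand. In NL$^*$, a row is composed if it equals the component-wise join (a `+` wherever any of the joined rows has a `+`) of other rows of the table, and prime otherwise. The sketch below is a plain-Java illustration under that definition; the class `PrimeRows` is hypothetical and not part of ROLL, and the row contents are copied from the table above with `eps` standing for ϵ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ROLL: recomputes the "*" marks of the table above.
class PrimeRows {
    // Row label -> row content; the columns are the experiments eps, baa, aa, a.
    static final Map<String, String> TABLE = new LinkedHashMap<>();
    static {
        TABLE.put("eps", "-+--");
        TABLE.put("b", "-++-");
        TABLE.put("ba", "-+-+");
        TABLE.put("baa", "++--");
        TABLE.put("a", "-+--");
        TABLE.put("bb", "-+++");
        TABLE.put("bab", "+++-");
        TABLE.put("baaa", "-+--");
        TABLE.put("baab", "-++-");
    }

    // true iff every '+' in small is also a '+' in big
    static boolean covers(String big, String small) {
        for (int i = 0; i < big.length(); i++) {
            if (small.charAt(i) == '+' && big.charAt(i) != '+') {
                return false;
            }
        }
        return true;
    }

    // A row is composed iff it equals the join of the rows that differ from it
    // and are covered by it; it is prime otherwise.
    static boolean isPrime(String row) {
        char[] join = new char[row.length()];
        Arrays.fill(join, '-');
        boolean foundSmaller = false;
        for (String other : TABLE.values()) {
            if (!other.equals(row) && covers(row, other)) {
                foundSmaller = true;
                for (int i = 0; i < join.length; i++) {
                    if (other.charAt(i) == '+') {
                        join[i] = '+';
                    }
                }
            }
        }
        return !foundSmaller || !row.equals(new String(join));
    }

    public static void main(String[] args) {
        // prints each row, prefixed with "*" when it is prime
        for (Map.Entry<String, String> e : TABLE.entrySet()) {
            String mark = isPrime(e.getValue()) ? "* " : "  ";
            System.out.println(mark + e.getKey() + " " + e.getValue());
        }
    }
}
```

Only the rows `bb` and `bab` come out as composed: `bb` is the join of the rows of `b` and `ba`, and `bab` is the join of the rows of `b` and `baa`.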


**We have just seen how to learn an NFA $A$ from DK, given a target NFA $N$. Sometimes we do not have the target NFA at hand, but we know exactly which language we want to learn.**

In this case, we first specify which strings belong to the target language $L$ and then refine the hypothesis ourselves whenever it does not recognize the target language.
Below, we use the same learning algorithm to learn the NFA $A$, this time acting as the teacher ourselves.

**2. Learning the regular language $L$ in an interactive way**

In [10]:
import roll.jupyter.*;
import java.util.function.Function;
import roll.words.*;

// now we define a function :: string -> boolean, which is used to
// determine whether a string is in the target language
// this function answers all membership queries posed by the learner
mqOracle = {
    s -> 
    // b is the third last letter
    int num = s.length();
    if(num < 3) {
        return false;
    }
    // check the third last letter
    if (s.charAt(num - 3) == 'b') {
        return true;
    }
    return false;
};

// now we create an NFA learner to learn the target language 
nfaLearner = JupyterROLL.createNFALearner("nlstar", "table", mqOracle);
// we can also see the data structure of the learner in a DOT graph
nfaLearner

In [11]:
// output current hypothesis to see whether it recognizes the target language
nfaLearner.getHypothesis()

In [12]:
// the hypothesis is not correct, so we provide a counterexample,
// i.e., a word in the symmetric difference of the language of A and the target language
// here we use baa
nfaLearner.refineHypothesis("baa")
// this hypothesis is the correct NFA
nfaLearner.getHypothesis()

In [13]:
// we can output the current observation table 
nfaLearner
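
To see how a hypothesis NFA relates to an observation table, the sketch below derives states and transitions from the second table shown in `In [9]`, assuming the table displayed here contains the same rows. It follows the construction described in the NL$^*$ paper as we understand it: the states are the prime upper rows, a state is accepting if its ϵ column is `+`, it is initial if it is covered by the row of ϵ, and state $u$ moves via letter $c$ to every state whose row is covered by the row of $uc$. The class `HypothesisFromTable` is hypothetical and not part of ROLL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ROLL: builds an NFA from an observation table.
class HypothesisFromTable {
    // Row label (the empty string stands for the empty word) -> row content;
    // the columns are the experiments eps, baa, aa, a.
    static final Map<String, String> ROW = new HashMap<>();
    static {
        ROW.put("", "-+--");
        ROW.put("b", "-++-");
        ROW.put("ba", "-+-+");
        ROW.put("baa", "++--");
        ROW.put("a", "-+--");
        ROW.put("bb", "-+++");
        ROW.put("bab", "+++-");
        ROW.put("baaa", "-+--");
        ROW.put("baab", "-++-");
    }
    // The prime upper rows; state i is the row labelled STATES.get(i).
    static final List<String> STATES = Arrays.asList("", "b", "ba", "baa");

    // true iff every '+' in small is also a '+' in big
    static boolean covers(String big, String small) {
        for (int i = 0; i < big.length(); i++) {
            if (small.charAt(i) == '+' && big.charAt(i) != '+') {
                return false;
            }
        }
        return true;
    }

    // Transitions as strings "i->j c": state i reaches state j via letter c.
    static List<String> transitions() {
        List<String> edges = new ArrayList<>();
        for (int i = 0; i < STATES.size(); i++) {
            for (char c : new char[] {'a', 'b'}) {
                String successorRow = ROW.get(STATES.get(i) + c);
                for (int j = 0; j < STATES.size(); j++) {
                    if (covers(successorRow, ROW.get(STATES.get(j)))) {
                        edges.add(i + "->" + j + " " + c);
                    }
                }
            }
        }
        return edges;
    }

    // States covered by the row of the empty word are initial.
    static List<Integer> initialStates() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < STATES.size(); i++) {
            if (covers(ROW.get(""), ROW.get(STATES.get(i)))) {
                result.add(i);
            }
        }
        return result;
    }

    // States whose eps column (index 0) is '+' are accepting.
    static List<Integer> finalStates() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < STATES.size(); i++) {
            if (ROW.get(STATES.get(i)).charAt(0) == '+') {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("initial: " + initialStates());
        System.out.println("final:   " + finalStates());
        for (String edge : transitions()) {
            System.out.println(edge);
        }
    }
}
```

On this table the construction yields four states, with state 0 initial and state 3 accepting, and 16 transitions.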