SUL Interface, or How to Learn Your Systems

SUL Interface

The system under learning (SUL) is a basic abstract class that you have to implement for all systems you would like to learn. When inferring loaded or randomly generated automata, one only has to pass it to the appropriate SUL found in aalpy/SULs/AutomataSUL.py. However, if you would like to learn any other system, you have to implement a reset and step method.

Reset is divided in

pre() : initialization/startup of the system
post() : graceful shutdown of the system/memory cleanup

The step method received an element of the input alphabet. Based on the received element, an action associated with said input is performed on the system under learning. The membership query is implemented as a sequence of steps.

For further examples of how to implement SUL refer to: aalpy/SULs/

Input Alphabet

Input alphabet is a list of any hashable object, most often strings or integers. In the step method of the SUL interface receives an element from the input alphabet and performs a system action based on it.

From standard automatons (loaded from file/randomly generated), you can obtain input alphabet with:

automaton.get_input_alphabet()

To see how class methods and their arguments can be used as the elements of the input alphabet, check out aalpy/SULs/PyMethodSUL and learnPythonClass in Examples.py.

Examples

Example - RegexSUL

Suppose that you would like to learn a regular expression. More precisely, you would like to learn a grammar that accepts or rejects the same language as the regular expression.

In Python, to check if the regular expression is matched, one can use re module and its match method. So how do we go from re.match(some_regex, some_string) to the SUL that can be used to learn a DFA representing such regex?

Let us examine the following code snippet that achieves such a task. In the constructor of the RegexSUL we pass a regular expression we would like to learn.

In the pre method, we initialize the system. In our case, we will reset the string that we will use to form membership queries to the empty string.

In the step method, an element of the input alphabet, denoted by 'letter' will be received as an argument. When learning DFAs and Moore machines, we have to ensure that we account for the None input. For Mealy machines, non-deterministic or stochastic formalism None will never be passed to the step method unless implicitly declared in the input alphabet.

In our example, we will append an element of the input alphabet, a character, to the self. string and return whether the regular expression accepts the resulting string. The step method is called for all inputs in the membership query.

import re

class RegexSUL(SUL):
    """
    An example implementation of a system under learning that can be used to learn any regex expression.
    Note that the $ is added to the expression as in this SUL only exact matches are learned.
    """
    def __init__(self, regex: str):
        super().__init__()
        self.regex = regex if regex[-1] == '$' else regex + '$'
        self.string = ""

    def pre(self):
        self.string = ""
        pass

    def post(self):

        pass

    def step(self, letter):
        if letter is not None:
            self.string += str(letter)
        return True if re.match(self.regex, self.string) else False

Once such SUL is implemented, let us learn a concrete regular expression. Let's define random regex over ['a', 'b', 'c'] input alphabet. Furthermore, we use StatePrefixEqOracle, as it ensures good test coverage.

regex = 'abc(b|c)+(ab|c)*'
alphabet = ['a', 'b', 'c']

regex_sul = RegexSUL(regex)

eq_oracle = StatePrefixEqOracle(alphabet, regex_sul, walks_per_state=100,
                                walk_len=20)

learned_regex = run_Lstar(alphabet, regex_sul, eq_oracle, automaton_type='dfa', print_level=2)

visualize_automaton(learned_regex)

When we run the presented example, our result is:

Hypothesis 1 has 1 states.
Hypothesis 2 has 7 states.
Hypothesis 3 has 8 states.
-----------------------------------
Learning Finished.
Learning Rounds:  3
Number of states: 8
Time (in seconds)
  Total                : 0.05
  Learning algorithm   : 0.0
  Conformance checking : 0.05
Learning Algorithm
 # Membership Queries  : 85
 # MQ Saved by Caching : 120
 # Steps               : 582
Equivalence Query
 # Membership Queries  : 800
 # Steps               : 18173
-----------------------------------

and

Example - MQTT Protocol

Suppose that you would like to learn an MQTT protocol. What would be the input alphabet, and how would you implement the SUL interface? You downloaded some clients and want to learn the behavior of the broker. The usual client setup would have the following signature:

class MockMqttExample:

    def __init__(self):
        self.state = 'CONCLOSED'
        self.topics = set()

    def subscribe(self, topic: str):
        ...

    def unsubscribe(self, topic):
        ...

    def connect(self):
        ...

    def disconnect(self):
        ...

    def publish(self, topic):
        ...

Mock implementation of the MQTT protocol can be found in BenchmarksSULs.py under the name "MockMqttExample". We can approach setting up the SUL in two ways.

Standard Way

Usually, we would define our input alphabet to be the abstract representation of possible actions. The input alphabet for this experiment would be: {connect, disconnect, publish, subscribe, unsubscribe}. Each element of the abstract input alphabet will be mapped to the concrete method. For example:

'connect' -> self.connect()
'publish' -> self.publish(topic='testTopic')
...

The complete learning procedure would then look like this:

from aalpy.base import SUL
from aalpy.oracles import RandomWalkEqOracle
from aalpy.learning_algs import run_Lstar
from aalpy.utils import visualize_automaton, MockMqttExample

class MQTT_SUL(SUL):
    def __init__(self):
        super().__init__()
        self.mqtt = MockMqttExample()

    def pre(self):
        self.mqtt.state = 'CONCLOSED'

    def post(self):
        self.mqtt.topics.clear()

    def step(self, letter):
        if letter == 'connect':
            return self.mqtt.connect()
        elif letter == 'disconnect':
            return self.mqtt.disconnect()
        elif letter == 'publish':
            return self.mqtt.publish(topic='test')
        elif letter == 'subscribe':
            return self.mqtt.subscribe(topic='test')
        else:
            return self.mqtt.unsubscribe(topic='test')

sul = MQTT_SUL()
input_al = ['connect', 'disconnect', 'publish','subscribe', 'unsubscribe']

eq_oracle = RandomWalkEqOracle(input_al, sul, num_steps=2000, reset_after_cex=True, reset_prob=0.15)

learned_mqtt= run_Lstar(input_al, sul, eq_oracle=eq_oracle, automaton_type='mealy', cache_and_non_det_check=True,
                  print_level=2)

visualize_automaton(learned_mqtt)

When run, the following output is observed:

Hypothesis 1 has 3 states.
-----------------------------------
Learning Finished.
Learning Rounds:  1
Number of states: 3
Time (in seconds)
  Total                : 0.01
  Learning algorithm   : 0.01
  Conformance checking : 0.0
Learning Algorithm
 # Membership Queries  : 75
 # MQ Saved by Caching : 5
 # Steps               : 225
Equivalence Query
 # Membership Queries  : 324
 # Steps               : 2000
-----------------------------------
Visualized model saved to Model_Visualization.pdf.

and

Learning Python Class

Let us now consider this example as a challenge where we would like to learn a Python class's behavior. In this case, we would wrap class methods and their arguments in the FunctionDecorator and use PyMethodSUL. Reset is then implemented by reinitializing the class.

from aalpy.SULs import PyClassSUL, FunctionDecorator
from aalpy.oracles import RandomWalkEqOracle
from aalpy.learning_algs import run_Lstar
from aalpy.utils import visualize_automaton, MockMqttExample

# class under learning (do not instantiate it)
mqtt = MockMqttExample

# methods weapped in the function decorators
input_al = [FunctionDecorator(mqtt.connect), FunctionDecorator(mqtt.disconnect),
            FunctionDecorator(mqtt.subscribe, 'topic'), FunctionDecorator(mqtt.unsubscribe, 'topic'),
            FunctionDecorator(mqtt.publish, 'topic')]

# SUL
sul = PyClassSUL(mqtt)

eq_oracle = RandomWalkEqOracle(input_al, sul, num_steps=2000, reset_after_cex=True, reset_prob=0.15)

learned_mqtt = run_Lstar(input_al, sul, eq_oracle=eq_oracle, automaton_type='mealy', cache_and_non_det_check=True)

visualize_automaton(learned_mqtt)