# Analysis and summary of the reproducibility of the experiments

This Jupyter notebook documents the analysis and reproduction of the experiments from the paper **"Learning Moore Machines from Input-Output Traces"**. The work covers setting up the environment, running the experiments as described in the paper and in the associated GitHub repository, and a critical analysis of the results.

---

## Objectives:
1. Reproduce the experiments carried out by the paper's authors, using the code they provide.
2. Analyze the experimental setup, its implementation, and the results.
3. Document the observations, the difficulties encountered, and the lessons learned, to support a deeper understanding of the paper and future research based on it.

---

**Link to the authors' GitHub repository:** [GitHub repository associated with the publication](https://github.com/ggiorikas/FSM-learning)

---

### Outline:
1. **Environment setup**: step-by-step guide to configuring the environment according to the authors' specifications.
2. **Running the experiments**: launching and relaunching the experiments, handling memory problems, and tracking progress.
3. **Analysis of the results**: study of the output stored in `all-results.txt` and interpretation of the results obtained.
4. **Critical observations**: difficulties encountered and suggestions for improvement.
5. **Conclusion**: summary of the observations and next steps.

## Environment setup

This section describes the environment needed to reproduce the experiments as described in the paper.

### Prerequisites:
1. **Python 2.7**: used to generate the traces and for the intermediate processing steps.
2. **D compiler (`dmd`)**: used to compile the main programs, in particular those that run the Moore machine learning experiments.
3. **Additional requirements**:
   - Enough RAM to avoid frequent stops caused by memory limits.
   - Access to a compatible system (Linux, Windows, or macOS).

### Checking the environment:
The commands below check that the required tools are correctly installed and configured as requested by the authors (see the [GitHub repository associated with the publication](https://github.com/ggiorikas/FSM-learning)):
- **Checking the Python version:**

In [None]:
$ python --version

- **Checking the D compiler version:**

In [None]:
$ dmd --version
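
The same check can be scripted from Python. The sketch below is only an illustration, not part of the authors' repository: the helper names `tool_version` and `check_tools` are hypothetical, and it assumes both tools accept a `--version` flag, as in the commands above.

```python
import shutil
import subprocess


def tool_version(command, flag="--version"):
    """Return the first line printed by `command flag`, or None if the tool is not on PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    try:
        result = subprocess.run([path, flag], capture_output=True, text=True)
    except OSError:
        # The file exists but could not be executed.
        return None
    # Python 2.7 prints its version on stderr; most other tools use stdout.
    text = (result.stdout or result.stderr).strip()
    return text.splitlines()[0] if text else ""


def check_tools(commands=("python", "dmd")):
    """Map each command name to its version line (None when it is missing)."""
    return {name: tool_version(name) for name in commands}


for name, version in check_tools().items():
    print(f"{name}: {version if version is not None else 'not found on PATH'}")
```

A value of `None` means the tool was not found on `PATH`, which is worth fixing before compiling anything.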

Once these versions have been verified, you can move on to running the experiments.

### Methodology
The experiments proceed as follows:

1. **Generation of the Moore machines**:
   - Random machines are generated with a given number of states, inputs, and outputs.

2. **Generation of the training and test traces**:
   - The generated machines are used to produce traces for learning and for evaluation.

3. **Application of the learning algorithms**:
   - Three algorithms are used:
     - **Algorithm 1**: basic learning.
     - **Algorithm 2**: improvement with a state-merging process.
     - **Algorithm 3**: advanced optimization for high accuracy (the algorithm chosen for our study).

4. **Accuracy evaluation**:
   - The learned machines are evaluated under three policies:
     - **Weak**: basic evaluation.
     - **Medium**: intermediate evaluation.
     - **Strong**: strict evaluation (isomorphism with the original machine).
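
To make the idea of three policies of increasing strictness concrete, here is a minimal sketch that scores one test trace by comparing the expected output sequence with the one produced by a learned machine. The definitions used here (exact match, longest matching prefix, per-position matches) are an assumption made for illustration, not taken from the authors' code, and should be checked against the paper. The function names are hypothetical, and the sketch compares traces rather than testing isomorphism.

```python
def strong_score(expected, predicted):
    """1.0 only if the whole output sequence matches, otherwise 0.0 (assumed definition)."""
    return 1.0 if expected == predicted else 0.0


def medium_score(expected, predicted):
    """Fraction of the trace covered by the longest matching prefix (assumed definition)."""
    prefix = 0
    for e, p in zip(expected, predicted):
        if e != p:
            break
        prefix += 1
    return prefix / len(expected)


def weak_score(expected, predicted):
    """Fraction of positions where the two outputs agree (assumed definition)."""
    matches = sum(e == p for e, p in zip(expected, predicted))
    return matches / len(expected)


expected, predicted = "0110", "0100"
print(strong_score(expected, predicted))  # 0.0
print(medium_score(expected, predicted))  # 0.5
print(weak_score(expected, predicted))    # 0.75
```

On this example the third output symbol is wrong, so the strict score is 0, the prefix-based score counts the first two symbols, and the position-based score also credits the final symbol.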

### Main commands
The following commands are used to run the experiments:

- **Learning experiment on randomly generated Moore machines:**

In [None]:
$ dmd -m64 -i -O -release -inline -boundscheck=off generate.d
$ dmd -m64 -i -O -release -inline -boundscheck=off run_rand_fsm_experiments.d
$ ./run_rand_fsm_experiments

The results are stored in the following file:  
`o/all-results.txt`

If the process stops because of excessive memory use (an **"out of memory"** message appears), the command must be relaunched. The system uses a caching mechanism to avoid repeating experiments that have already been completed.
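
Relaunching by hand can be automated with a small wrapper. The sketch below is an illustration under two assumptions: that an out-of-memory stop ends the process with a non-zero exit status, and that rerunning is safe because of the cache described above. The helper `run_until_success` is hypothetical and not part of the authors' repository.

```python
import subprocess
import sys


def run_until_success(command, max_attempts=20):
    """Rerun `command` until it exits with status 0; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        completed = subprocess.run(command)
        if completed.returncode == 0:
            return attempt
        print(f"attempt {attempt} exited with status {completed.returncode}, restarting")
    raise RuntimeError(f"{command!r} still failing after {max_attempts} attempts")


# Demonstration with a command that succeeds immediately.
# For the real experiment, pass ["./run_rand_fsm_experiments"] instead.
attempts = run_until_success([sys.executable, "-c", "pass"])
print(attempts)  # 1
```

The `max_attempts` bound keeps the loop from running forever if the failure is not memory-related.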

- **Learning experiment on benchmark Moore machines:**

In [None]:
$ dmd -m64 -i -O -release -inline -boundscheck=off generate.d
$ dmd -m64 -i -O -release -inline -boundscheck=off run_real_fsm_experiments.d
$ ./run_real_fsm_experiments

The results are stored in the following file:  
`o2/all-results.txt`
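
Before any analysis, it helps to confirm that both result files exist and to look at their first lines. The sketch below makes no assumption about the file format; it only reads raw text. The helper name `preview` is hypothetical.

```python
from pathlib import Path


def preview(path, max_lines=10):
    """Return up to `max_lines` lines of a text file, or an empty list if it is missing."""
    p = Path(path)
    if not p.is_file():
        return []
    with p.open(encoding="utf-8", errors="replace") as handle:
        # zip stops as soon as range(max_lines) is exhausted.
        return [line.rstrip("\n") for _, line in zip(range(max_lines), handle)]


for results_file in ("o/all-results.txt", "o2/all-results.txt"):
    lines = preview(results_file)
    print(f"{results_file}: {len(lines)} line(s) previewed")
    for line in lines:
        print("   ", line)
```

An empty preview means the experiment has not produced the file yet, or that the notebook is not running from the repository root.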

# Generating my own Moore machine

In [7]:
import random
from graphviz import Digraph

In [8]:
class MooreMachine:
    def __init__(self, num_states, input_alphabet, output_alphabet):
        self.num_states = num_states
        self.states = range(num_states)
        self.input_alphabet = input_alphabet
        self.output_alphabet = output_alphabet
        self.transition_function = {}  # (current_state, input_symbol) -> next_state
        self.output_function = {}      # state -> output_symbol
        self.initial_state = 0         # Let's assume the initial state is 0

        self.generate_random_machine()

    def generate_random_machine(self):
        # Assign random output symbols to each state
        for state in self.states:
            output = random.choice(self.output_alphabet)
            self.output_function[state] = output

        # Define transitions for each state and input symbol
        for state in self.states:
            for input_symbol in self.input_alphabet:
                next_state = random.choice(self.states)
                self.transition_function[(state, input_symbol)] = next_state

    def get_output(self, state):
        return self.output_function[state]

    def get_next_state(self, current_state, input_symbol):
        return self.transition_function[(current_state, input_symbol)]


In [9]:
def create_random_moore_machine():
    # Define the number of states and the input/output alphabets
    num_states = int(input("Enter the number of states: "))
    input_alphabet = input("Enter the input alphabet symbols separated by commas: ").split(',')
    output_alphabet = input("Enter the output alphabet symbols separated by commas: ").split(',')

    # Trim whitespace from symbols
    input_alphabet = [symbol.strip() for symbol in input_alphabet]
    output_alphabet = [symbol.strip() for symbol in output_alphabet]

    # Create the Moore machine
    moore_machine = MooreMachine(num_states, input_alphabet, output_alphabet)
    return moore_machine


In [10]:
def visualize_moore_machine(moore_machine):
    dot = Digraph(comment='Moore Machine')
    dot.attr(rankdir='LR', size='8,5')

    # Define node shapes and labels
    for state in moore_machine.states:
        label = f'S{state}/{moore_machine.get_output(state)}'
        if state == moore_machine.initial_state:
            dot.node(str(state), label=label, shape='doublecircle')
        else:
            dot.node(str(state), label=label)

    # Define edges
    for (state, input_symbol), next_state in moore_machine.transition_function.items():
        dot.edge(str(state), str(next_state), label=input_symbol)

    # Render the graph
    dot.render('moore_machine.gv', view=True)


In [11]:
def main():
    moore_machine = create_random_moore_machine()
    visualize_moore_machine(moore_machine)


In [12]:
if __name__ == '__main__':
    main()


ValueError: invalid literal for int() with base 10: ''

In [14]:
import random
from graphviz import Digraph

class MooreMachine:
    def __init__(self, num_states, input_alphabet, output_alphabet):
        self.num_states = num_states
        self.states = list(range(num_states))  # Convert range to list for compatibility with random.choice
        self.input_alphabet = input_alphabet
        self.output_alphabet = output_alphabet
        self.transition_function = {}  # (current_state, input_symbol) -> next_state
        self.output_function = {}      # state -> output_symbol
        self.initial_state = 0         # Let's assume the initial state is 0

        self.generate_random_machine()

    def generate_random_machine(self):
        # Assign random output symbols to each state
        for state in self.states:
            output = random.choice(self.output_alphabet)
            self.output_function[state] = output

        # Define transitions for each state and input symbol
        for state in self.states:
            for input_symbol in self.input_alphabet:
                next_state = random.choice(self.states)
                self.transition_function[(state, input_symbol)] = next_state

    def get_output(self, state):
        return self.output_function[state]

    def get_next_state(self, current_state, input_symbol):
        return self.transition_function[(current_state, input_symbol)]

def create_random_moore_machine():
    # Define the number of states and the input/output alphabets
    while True:
        num_states_input = input("Enter the number of states: ")
        try:
            num_states = int(num_states_input)
            if num_states <= 0:
                print("Please enter a positive integer.")
                continue
            break
        except ValueError:
            print("Invalid input. Please enter a valid integer.")
    
    # Process input alphabet
    while True:
        input_alphabet_input = input("Enter the input alphabet symbols separated by commas: ")
        input_alphabet = [symbol.strip() for symbol in input_alphabet_input.split(',') if symbol.strip()]
        if input_alphabet:
            break
        else:
            print("Input alphabet cannot be empty. Please enter at least one symbol.")
    
    # Process output alphabet
    while True:
        output_alphabet_input = input("Enter the output alphabet symbols separated by commas: ")
        output_alphabet = [symbol.strip() for symbol in output_alphabet_input.split(',') if symbol.strip()]
        if output_alphabet:
            break
        else:
            print("Output alphabet cannot be empty. Please enter at least one symbol.")
    
    # Create the Moore machine
    moore_machine = MooreMachine(num_states, input_alphabet, output_alphabet)
    return moore_machine

def visualize_moore_machine(moore_machine):
    dot = Digraph(comment='Moore Machine')
    dot.attr(rankdir='LR', size='8,5')

    # Define node shapes and labels
    for state in moore_machine.states:
        label = f'S{state}/{moore_machine.get_output(state)}'
        if state == moore_machine.initial_state:
            dot.node(str(state), label=label, shape='doublecircle')
        else:
            dot.node(str(state), label=label)

    # Define edges
    for (state, input_symbol), next_state in moore_machine.transition_function.items():
        dot.edge(str(state), str(next_state), label=input_symbol)

    # Render the graph
    dot.format = 'pdf'  # Set the output format to PDF
    dot.render('moore_machine', view=True)

def main():
    moore_machine = create_random_moore_machine()
    visualize_moore_machine(moore_machine)

if __name__ == '__main__':
    main()


ExecutableNotFound: failed to execute WindowsPath('dot'), make sure the Graphviz executables are on your systems' PATH
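
This error comes from the Graphviz binaries rather than from the Python code: the `graphviz` Python package is only a wrapper and needs the `dot` executable on `PATH`. A minimal sketch of a check that can be run before rendering (the helper name `graphviz_available` is hypothetical):

```python
import shutil


def graphviz_available():
    """True if the Graphviz `dot` executable can be found on PATH."""
    return shutil.which("dot") is not None


if graphviz_available():
    print("Graphviz 'dot' found at", shutil.which("dot"))
else:
    print("Graphviz 'dot' not found: install Graphviz and add its bin directory to PATH")
```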

In [1]:
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple
import random

@dataclass
class MooreMachine:
    states: Set[str]
    input_alphabet: Set[str]
    output_alphabet: Set[str]
    transition_function: Dict[Tuple[str, str], str]
    output_function: Dict[str, str]
    initial_state: str

def generate_moore_machine(
    num_states: int,
    input_alphabet_size: int,
    output_alphabet_size: int,
    completeness_ratio: float = 1.0
) -> MooreMachine:
    """
    Generate a Moore machine with specified parameters.
    
    Args:
        num_states: Number of states in the machine
        input_alphabet_size: Size of input alphabet
        output_alphabet_size: Size of output alphabet
        completeness_ratio: Ratio of transitions to define (1.0 = complete machine)
    """
    # Generate states, alphabets
    states = {f"q{i}" for i in range(num_states)}
    input_alphabet = {chr(97 + i) for i in range(input_alphabet_size)}  # a, b, c...
    output_alphabet = {str(i) for i in range(output_alphabet_size)}     # 0, 1, 2...
    
    # Generate transition function
    all_transitions = [(s, i) for s in states for i in input_alphabet]
    num_transitions = int(len(all_transitions) * completeness_ratio)
    selected_transitions = random.sample(all_transitions, num_transitions)
    
    transition_function = {
        (s, i): random.choice(list(states))
        for s, i in selected_transitions
    }
    
    # Generate output function
    output_function = {
        state: random.choice(list(output_alphabet))
        for state in states
    }
    
    initial_state = random.choice(list(states))
    
    return MooreMachine(
        states=states,
        input_alphabet=input_alphabet,
        output_alphabet=output_alphabet,
        transition_function=transition_function,
        output_function=output_function,
        initial_state=initial_state
    )

def generate_traces(
    machine: MooreMachine,
    num_traces: int,
    max_length: int
) -> List[Tuple[str, str]]:
    """
    Generate input-output traces from a Moore machine.
    
    Args:
        machine: The Moore machine to generate traces from
        num_traces: Number of traces to generate
        max_length: Maximum length of each trace
        
    Returns:
        List of (input_sequence, output_sequence) pairs
    """
    traces = []
    
    for _ in range(num_traces):
        length = random.randint(1, max_length)
        current_state = machine.initial_state
        input_sequence = ""
        output_sequence = machine.output_function[current_state]
        
        for _ in range(length):
            # Get possible inputs from current state
            possible_inputs = [
                i for i in machine.input_alphabet
                if (current_state, i) in machine.transition_function
            ]
            
            if not possible_inputs:
                break
                
            input_symbol = random.choice(possible_inputs)
            input_sequence += input_symbol
            
            # Transition to next state
            current_state = machine.transition_function[(current_state, input_symbol)]
            output_sequence += machine.output_function[current_state]
            
        traces.append((input_sequence, output_sequence))
        
    return traces

# Example usage:
if __name__ == "__main__":
    # Generate a simple Moore machine
    machine = generate_moore_machine(
        num_states=3,
        input_alphabet_size=2,
        output_alphabet_size=2,
        completeness_ratio=0.8
    )
    
    # Generate some traces
    traces = generate_traces(machine, num_traces=5, max_length=4)
    
    # Print the machine details
    print("States:", machine.states)
    print("Input alphabet:", machine.input_alphabet)
    print("Output alphabet:", machine.output_alphabet)
    print("Initial state:", machine.initial_state)
    print("\nTransition function:")
    for (state, input_symbol), next_state in machine.transition_function.items():
        print(f"δ({state}, {input_symbol}) = {next_state}")
    print("\nOutput function:")
    for state, output in machine.output_function.items():
        print(f"λ({state}) = {output}")
    
    print("\nGenerated traces (input -> output):")
    for input_seq, output_seq in traces:
        print(f"{input_seq} -> {output_seq}")

States: {'q1', 'q0', 'q2'}
Input alphabet: {'b', 'a'}
Output alphabet: {'0', '1'}
Initial state: q0

Transition function:
δ(q2, b) = q2
δ(q2, a) = q0
δ(q1, a) = q1
δ(q0, b) = q0

Output function:
λ(q1) = 0
λ(q0) = 1
λ(q2) = 1

Generated traces (input -> output):
bbbb -> 11111
bbb -> 1111
bbb -> 1111
b -> 11
b -> 11
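
All five traces above contain only `b` and only the output `1`. With `completeness_ratio=0.8`, only 4 of the 6 possible transitions are defined, and the single transition leaving the initial state `q0` is a self-loop on `b`. The sketch below uses a hypothetical helper, `reachable_states`, to confirm that no other state can be reached; the transition table is copied from the output above.

```python
from collections import deque


def reachable_states(transitions, initial_state):
    """Breadth-first search over the transition table, starting from the initial state."""
    seen = {initial_state}
    queue = deque([initial_state])
    while queue:
        state = queue.popleft()
        for (source, _symbol), target in transitions.items():
            if source == state and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Transition function printed in the run above.
transitions = {
    ("q2", "b"): "q2",
    ("q2", "a"): "q0",
    ("q1", "a"): "q1",
    ("q0", "b"): "q0",
}
print(reachable_states(transitions, "q0"))  # {'q0'}
```

Running this check after generating a machine shows whether the traces can exercise more than a fragment of it.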


In [9]:
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple
import random
import graphviz

@dataclass
class MooreMachine:
    states: Set[str]
    input_alphabet: Set[str]
    output_alphabet: Set[str]
    transition_function: Dict[Tuple[str, str], str]
    output_function: Dict[str, str]
    initial_state: str

def generate_moore_machine(
    num_states: int,
    input_alphabet_size: int,
    output_alphabet_size: int,
    completeness_ratio: float = 1.0
) -> MooreMachine:
    """
    Generate a Moore machine with specified parameters.
    
    Args:
        num_states: Number of states in the machine
        input_alphabet_size: Size of input alphabet
        output_alphabet_size: Size of output alphabet
        completeness_ratio: Ratio of transitions to define (1.0 = complete machine)
    """
    # Generate states, alphabets
    states = {f"q{i}" for i in range(num_states)}
    input_alphabet = {chr(97 + i) for i in range(input_alphabet_size)}  # a, b, c...
    output_alphabet = {str(i) for i in range(output_alphabet_size)}     # 0, 1, 2...
    
    # Generate transition function
    all_transitions = [(s, i) for s in states for i in input_alphabet]
    num_transitions = int(len(all_transitions) * completeness_ratio)
    selected_transitions = random.sample(all_transitions, num_transitions)
    
    transition_function = {
        (s, i): random.choice(list(states))
        for s, i in selected_transitions
    }
    
    # Generate output function
    output_function = {
        state: random.choice(list(output_alphabet))
        for state in states
    }
    
    initial_state = random.choice(list(states))
    
    return MooreMachine(
        states=states,
        input_alphabet=input_alphabet,
        output_alphabet=output_alphabet,
        transition_function=transition_function,
        output_function=output_function,
        initial_state=initial_state
    )

def generate_traces(
    machine: MooreMachine,
    num_traces: int,
    max_length: int
) -> List[Tuple[str, str]]:
    """
    Generate input-output traces from a Moore machine.
    
    Args:
        machine: The Moore machine to generate traces from
        num_traces: Number of traces to generate
        max_length: Maximum length of each trace
        
    Returns:
        List of (input_sequence, output_sequence) pairs
    """
    traces = []
    
    for _ in range(num_traces):
        length = random.randint(1, max_length)
        current_state = machine.initial_state
        input_sequence = ""
        output_sequence = machine.output_function[current_state]
        
        for _ in range(length):
            # Get possible inputs from current state
            possible_inputs = [
                i for i in machine.input_alphabet
                if (current_state, i) in machine.transition_function
            ]
            
            if not possible_inputs:
                break
                
            input_symbol = random.choice(possible_inputs)
            input_sequence += input_symbol
            
            # Transition to next state
            current_state = machine.transition_function[(current_state, input_symbol)]
            output_sequence += machine.output_function[current_state]
            
        traces.append((input_sequence, output_sequence))
        
    return traces

def visualize_moore_machine(machine: MooreMachine, filename: str = "moore_machine"):
    """
    Visualize a Moore machine using graphviz.
    
    Args:
        machine: The Moore machine to visualize
        filename: Name of the output file (without extension)
    """
    dot = graphviz.Digraph(comment='Moore Machine')
    dot.attr(rankdir='LR')
    
    # Add initial state marker
    dot.node('start', '', shape='point')
    
    # Add states
    for state in machine.states:
        label = f"{state}\n{machine.output_function[state]}"
        shape = 'doublecircle' if state == machine.initial_state else 'circle'
        dot.node(state, label, shape=shape)
    
    # Add initial transition
    dot.edge('start', machine.initial_state)
    
    # Add transitions
    for (state, input_symbol), next_state in machine.transition_function.items():
        dot.edge(state, next_state, label=input_symbol)
    
    # Save the visualization
    dot.render(filename, format='png', cleanup=True)

# Example usage:
if __name__ == "__main__":
    # Generate a simple Moore machine
    machine = generate_moore_machine(
        num_states=3,
        input_alphabet_size=2,
        output_alphabet_size=2,
        completeness_ratio=0.8
    )
    
    # Generate some traces
    traces = generate_traces(machine, num_traces=5, max_length=4)
    
    # Visualize the machine
    visualize_moore_machine(machine)
    
    # Print machine details and traces
    print("States:", machine.states)
    print("Input alphabet:", machine.input_alphabet)
    print("Output alphabet:", machine.output_alphabet)
    print("Initial state:", machine.initial_state)
    print("\nTransition function:")
    for (state, input_symbol), next_state in machine.transition_function.items():
        print(f"δ({state}, {input_symbol}) = {next_state}")
    print("\nOutput function:")
    for state, output in machine.output_function.items():
        print(f"λ({state}) = {output}")
    
    print("\nGenerated traces (input -> output):")
    for input_seq, output_seq in traces:
        print(f"{input_seq} -> {output_seq}")

States: {'q1', 'q2', 'q0'}
Input alphabet: {'a', 'b'}
Output alphabet: {'0', '1'}
Initial state: q1

Transition function:
δ(q1, b) = q1
δ(q2, a) = q1
δ(q0, b) = q0
δ(q2, b) = q0

Output function:
λ(q1) = 0
λ(q2) = 1
λ(q0) = 0

Generated traces (input -> output):
b -> 00
b -> 00
bbb -> 0000
b -> 00
bbb -> 0000
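
A trace can be checked by replaying its input sequence on the machine and comparing the outputs. The sketch below uses a hypothetical helper, `run_machine`, on the transition and output functions printed above. It returns `None` when an undefined transition is reached, which can happen because the machine is incomplete.

```python
def run_machine(transitions, outputs, initial_state, inputs):
    """Replay `inputs` from the initial state and return the output string, or None if stuck."""
    state = initial_state
    produced = outputs[state]  # a Moore machine emits the output of its initial state first
    for symbol in inputs:
        key = (state, symbol)
        if key not in transitions:
            return None  # undefined transition in an incomplete machine
        state = transitions[key]
        produced += outputs[state]
    return produced


# Machine printed in the run above (initial state q1).
transitions = {
    ("q1", "b"): "q1",
    ("q2", "a"): "q1",
    ("q0", "b"): "q0",
    ("q2", "b"): "q0",
}
outputs = {"q1": "0", "q2": "1", "q0": "0"}

print(run_machine(transitions, outputs, "q1", "bbb"))  # 0000
print(run_machine(transitions, outputs, "q1", "a"))    # None
```

The first call reproduces the trace `bbb -> 0000` listed above. The second shows that `a` is not defined in `q1`, which is why no generated trace contains it.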
