[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PetiteIA/schema_mechanism/blob/master/notebooks/agent13.ipynb)

# THE AGENT THAT DISPLAYS ITS SIMULATION (under construction)

# Objectives

In this notebook, we examine the possibility of equipping the agent with an internal simulator of its interaction sequences.

# Let's prepare the CompositeInteraction and Interaction classes

We keep the same `CompositeInteraction` class as in Agent11, except that we add the `series()` method, which returns the sequence of tokens of the primitive interactions as a list.

The tokens are computed as `action * BASE + outcome`.

In [1]:
BASE = 2
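
The token formula can be illustrated with a minimal sketch. The helper names `encode` and `decode` are hypothetical and are not part of the notebook; the decoding is only valid as long as every outcome is smaller than `BASE`.

```python
BASE = 2  # Same value as in the notebook: outcomes are 0 or 1

def encode(action, outcome, base=BASE):
    """Pack an (action, outcome) pair into a single integer token."""
    return action * base + outcome

def decode(token, base=BASE):
    """Recover the (action, outcome) pair from a token (hypothetical helper)."""
    return divmod(token, base)

# With six actions (0..5) and two outcomes, the tokens cover 0..11
tokens = [encode(action, outcome) for action in range(6) for outcome in range(2)]
print(tokens)
print(decode(9))  # Token 9 is action 4 with outcome 1
```

Because each (action, outcome) pair maps to a distinct integer, a sequence of primitive interactions can be handled as a plain list of integers.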

In [2]:
class CompositeInteraction:
    """A composite interaction is a tuple (pre_interaction, post_interaction) and a weight"""
    def __init__(self, pre_interaction, post_interaction):
        self.pre_interaction = pre_interaction
        self.post_interaction = post_interaction
        self.weight = 1
        self._step = 1

    def get_decision(self):
        """Return the flat sequence of intermediary primitive interactions terminated with the final decision"""
        return f"{self.pre_interaction.sequence()}{self.post_interaction.get_decision()}"

    def get_actions(self):
        """Return the flat sequence of the actions of this interaction as a string"""
        return f"{self.pre_interaction.get_actions()}{self.post_interaction.get_actions()}"
    
    def get_valence(self):
        """Return the valence of the pre_interaction plus the valence of the post_interaction"""
        return self.pre_interaction.get_valence() + self.post_interaction.get_valence()

    def reinforce(self):
        """Increment the composite interaction's weight"""
        self.weight += 1

    def key(self):
        """ The key to find this interaction in the dictionary is the string '(<pre_interaction>,<post_interaction>)'. """
        return f"({self.pre_interaction.key()},{self.post_interaction.key()})"

    def pre_key(self):
        """Return the key of the pre_interaction"""
        return self.pre_interaction.key()

    def __str__(self):
        """ Print the interaction in the Newick tree format (pre_interaction, post_interaction: weight) """
        return f"({self.pre_interaction}, {self.post_interaction}: {self.weight})"

    def __eq__(self, other):
        """ Interactions are equal if they have the same pre and post interactions """
        if isinstance(other, self.__class__):
            return (self.pre_interaction == other.pre_interaction) and (self.post_interaction == other.post_interaction)
        else:
            return False

    def get_length(self):
        """Return the number of primitive interactions in this composite interaction"""
        return self.pre_interaction.get_length() + self.post_interaction.get_length()

    def increment(self, interaction, interactions):
        """Increment the step of the appropriate sub-interaction. Return the enacted interaction if it is over, or None if it is ongoing."""
        # First step 
        if self._step == 1:
            interaction = self.pre_interaction.increment(interaction, interactions)
            # Ongoing pre-interaction. Return None
            if interaction is None:
                return None
            # Pre-interaction succeeded. Increment the step and return None
            elif interaction == self.pre_interaction:
                self._step = 2
                return None
            # Pre-interaction failed. Reset the step and return the enacted interaction
            else:
                self._step = 1
                return interaction
        # Second step
        else:
            interaction = self.post_interaction.increment(interaction, interactions)
            # Ongoing post-interaction. Return None
            if interaction is None:
                return None
            # Post-interaction succeeded. Reset the step and return this interaction
            elif interaction == self.post_interaction:
                self._step = 1
                return self
            # Post-interaction failed. Reset the step and return the enacted interaction
            else:
                self._step = 1
                composite_interaction = CompositeInteraction(self.pre_interaction, interaction)
                if composite_interaction.key() not in interactions:
                    # Add the enacted composite interaction to memory
                    interactions[composite_interaction.key()] = composite_interaction
                    print(f"Learning {composite_interaction}")
                    return composite_interaction
                else:
                    # Reinforce the existing composite interaction and return it
                    interactions[composite_interaction.key()].reinforce()
                    print(f"Reinforcing {interactions[composite_interaction.key()]}")
                    return interactions[composite_interaction.key()]

    def current(self):
        """Return the current intended primitive interaction"""
        # Step 1: the current primitive interaction of the pre-interaction
        if self._step == 1:
            return self.pre_interaction.current()
        # Step 2: The current primitive interaction of the post-interaction
        else:
            return self.post_interaction.current()

    def sequence(self):
        """Return the flat sequence of primitive interactions of this composite interaction"""
        return f"{self.pre_interaction.sequence()}{self.post_interaction.sequence()}"

    def get_post_interactions(self):
        """Return the list of the hierarchy of the sub post_interactions"""
        return [self.post_interaction] + self.post_interaction.get_post_interactions()

    def series(self):
        """Return the series of tokens of the primitive interactions"""
        series = self.pre_interaction.series()
        series.extend(self.post_interaction.series())
        return series
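
The two-step logic of `increment()` can be sketched in isolation. This is a simplified illustration under stated assumptions: `Prim` and `Pair` are hypothetical stand-ins, the pre- and post-interactions are primitive only, and the memory dictionary that the real method updates when the post-interaction fails is left out.

```python
class Prim:
    """Hypothetical stand-in for a primitive interaction, identified by its key."""
    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, Prim) and self.key == other.key


class Pair:
    """Hypothetical two-step composite: a pre-interaction followed by a post-interaction."""
    def __init__(self, pre, post):
        self.pre, self.post = pre, post
        self.step = 1

    def increment(self, enacted):
        """Return None while ongoing, self on success, or what was actually enacted on failure."""
        if self.step == 1:
            if enacted == self.pre:
                self.step = 2       # Pre-interaction succeeded: wait for the post-interaction
                return None
            return enacted          # Pre-interaction failed: abort with the enacted primitive
        self.step = 1               # The sequence ends at step 2, whatever the result
        if enacted == self.post:
            return self             # Post-interaction succeeded
        return Pair(self.pre, enacted)  # Post-interaction failed: the sequence actually enacted


plan = Pair(Prim("00"), Prim("10"))      # Intend: feel front empty, then move forward

first = plan.increment(Prim("00"))       # The feel step went as intended
second = plan.increment(Prim("11"))      # The forward step bumped instead
print(first)                             # None: the sequence was still ongoing
print(second.pre.key, second.post.key)   # 00 11: the sequence actually enacted
```

Returning `None` while the sequence is ongoing is what lets the agent tell "keep enacting" apart from "the intended interaction is over".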
        

We keep the same `Interaction` class as for Agent10, apart from the added `get_post_interactions()` method.

In [3]:
class Interaction:
    """An interaction is a tuple (action, outcome) with a valence"""
    def __init__(self, _action, _outcome, _valence):
        self._action = _action
        self._outcome = _outcome
        self._valence = _valence
        self.weight = 10
        
    def get_action(self):
        """Return the action"""
        return self._action

    def get_actions(self):
        """Return the action as a string for compatibility with CompositeInteraction"""
        return str(self._action)

    def get_decision(self):
        """Return the decision key"""
        return f"{self._action}"
        # return f"a{self._action}"

    def get_outcome(self):
        """Return the outcome"""
        return self._outcome

    def get_valence(self):
        """Return the valence"""
        return self._valence

    def key(self):
        """ The key to find this interaction in the dictionary is the string '<action><outcome>'. """
        return f"{self._action}{self._outcome}"

    def pre_key(self):
        """Return the key. Used for compatibility with CompositeInteraction"""
        return ""  # self.key()

    def __str__(self):
        """ Print interaction in the form '<action><outcome>:<valence>' for debug."""
        return f"{self._action}{self._outcome}:{self._valence}"

    def __eq__(self, other):
        """ Interactions are equal if they have the same key """
        if isinstance(other, self.__class__):
            return self.key() == other.key()
        else:
            return False

    def get_length(self):
        """The length of the sequence of this interaction"""
        return 1

    def increment(self, interaction, interactions):
        """Return the enacted interaction for compatibility with composite interactions"""
        return interaction

    def current(self):
        """Return itself for compatibility with composite interactions"""
        return self

    def sequence(self):
        """Return the key. Used for compatibility with composite interactions"""
        return self.key()

    def get_post_interactions(self):
        """Return the empty list for compatibility with composite interactions"""
        return []

    def series(self):
        """Return the token of this primitive interaction in a list"""
        return [self._action * BASE + self._outcome]
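
To see how `key()`, `series()` and `get_length()` work together on nested interactions, here is a minimal sketch. The classes `Prim` and `Comp` are hypothetical stand-ins that reproduce only these three methods.

```python
BASE = 2

class Prim:
    """Hypothetical stand-in for a primitive interaction."""
    def __init__(self, action, outcome):
        self.action, self.outcome = action, outcome

    def key(self):
        return f"{self.action}{self.outcome}"

    def series(self):
        return [self.action * BASE + self.outcome]

    def get_length(self):
        return 1


class Comp:
    """Hypothetical stand-in for a composite interaction."""
    def __init__(self, pre, post):
        self.pre, self.post = pre, post

    def key(self):
        return f"({self.pre.key()},{self.post.key()})"

    def series(self):
        return self.pre.series() + self.post.series()

    def get_length(self):
        return self.pre.get_length() + self.post.get_length()


# Feel front (outcome 0), move forward (outcome 0), then turn left (outcome 1)
left_nested = Comp(Comp(Prim(0, 0), Prim(1, 0)), Prim(4, 1))
right_nested = Comp(Prim(0, 0), Comp(Prim(1, 0), Prim(4, 1)))

print(left_nested.key())         # ((00,10),41)
print(right_nested.key())        # (00,(10,41))
print(left_nested.series())      # [0, 2, 9]
print(left_nested.get_length())  # 3
```

The two composites have different keys but the same series: the key keeps the tree structure, whereas the series keeps only the flat order of the primitive interactions.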


# The SmallLoop environment

Let's implement a version of the environment that allows us to display the agent's internal simulator.

In [4]:
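import os

save_dir = "sav"
os.makedirs(save_dir, exist_ok=True)  # Make sure the directory for the saved frames exists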

FORWARD = 1
FEEL_FRONT = 0
FEEL_LEFT = 2
FEEL_RIGHT = 3
TURN_LEFT = 4
TURN_RIGHT = 5

ENV_HIGHT = 6
ENV_WIDTH = 6
SIM_HIGHT = 9
SIM_WIDTH = 9

In [5]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap, BoundaryNorm
from ipywidgets import Button, HBox,VBox, Output
from IPython.display import display

LEFT = 0
DOWN = 1
RIGHT = 2
UP = 3
EMPTY = 0
WALL = 1
FEELING_EMPTY = 2
FEELING_WALL = 3
BUMPING = 4
UNKNOWN = 5

colors = ["#b0b0b0", '#b0b0b0', '#ffffff', '#535865', "#F93943"]  # Hidden environment
colors = ["#D6D6D6", '#5C946E', '#FAE2DB', '#535865', "#F93943", 
          "#BAC9E1", '#BAC9E1', '#FAE2DB', '#535865', "#F93943", '#BAC9E1']  # Simulator
agent_color = "#1976D2"
prediction_error_color = "#f62dae"
agent_size = 200

class SmallLoop():
    def __init__(self, position, direction, grid):
        self.environment_grid = np.array(grid)
        self.display_grid = np.full((SIM_HIGHT, ENV_WIDTH + SIM_WIDTH), WALL, dtype=int)
        self.display_grid[0:ENV_HIGHT, 0:ENV_WIDTH] = self.environment_grid[0:6, 0:6]
        self.position = np.array(position) 
        self.direction = direction
        self.cmap = ListedColormap(colors)
        self.norm = BoundaryNorm(np.arange(-0.5, len(colors) + 0.5, 1.0), self.cmap.N)
        self.marker_size = agent_size
        self.marker_map = {LEFT: '<', DOWN: 'v', RIGHT: '>', UP: '^'}
        self.marker_color = agent_color
        self.directions = np.array([
            [0, -1],  # Left
            [1, 0],   # Down
            [0, 1],   # Right
            [-1, 0]   # Up
            ])

    def outcome(self, action):
        """Update the grid. Return the outcome of the action."""
        result = 0
        self.display_grid[0:6, 0:6] = self.environment_grid

        if action == FORWARD:  
            target_position = self.position + self.directions[self.direction]
            if self.environment_grid[tuple(target_position)] == EMPTY:
                self.position[:] = target_position
            else:
                result = 1
                self.display_grid[tuple(target_position)] = BUMPING
        
        elif action == TURN_RIGHT:
            self.direction = {LEFT: UP, DOWN: LEFT, RIGHT: DOWN, UP: RIGHT}[self.direction]
        
        elif action == TURN_LEFT:
            self.direction = {LEFT: DOWN, DOWN: RIGHT, RIGHT: UP, UP: LEFT}[self.direction]
        
        elif action == FEEL_FRONT:
            feeling_position = self.position + self.directions[self.direction]
            if self.environment_grid[tuple(feeling_position)] == EMPTY:
                self.display_grid[tuple(feeling_position)] = FEELING_EMPTY
            else:
                result = 1
                self.display_grid[tuple(feeling_position)] = FEELING_WALL
        
        elif action == FEEL_LEFT:
            feeling_position = self.position + self.directions[(self.direction + 1) % 4]
            if self.environment_grid[tuple(feeling_position)] == EMPTY:
                self.display_grid[tuple(feeling_position)] = FEELING_EMPTY
            else:
                result = 1
                self.display_grid[tuple(feeling_position)] = FEELING_WALL
        
        elif action == FEEL_RIGHT:
            feeling_position = self.position + self.directions[self.direction - 1]
            if self.environment_grid[tuple(feeling_position)] == EMPTY:
                self.display_grid[tuple(feeling_position)] = FEELING_EMPTY
            else:
                result = 1
                self.display_grid[tuple(feeling_position)] = FEELING_WALL

        # print(f"Line: {self.position[0]}, Column: {self.position[1]}, direction: {self.direction}")
        return result  
    
    def display(self, simulator=None):
        """Display the grid in the notebook"""
        out.clear_output(wait=True)
        with out:
            fig, ax = plt.subplots()
            if simulator is not None:
                plt.scatter(4 + 6, 4, s=simulator.marker_size, marker='^', c="#ffffff")
                plt.scatter(simulator.position[1] + 6, simulator.position[0], s=simulator.marker_size, marker=self.marker_map[simulator.direction], 
                            c=simulator.marker_color)
                self.display_grid[0:SIM_HIGHT, ENV_WIDTH:(ENV_WIDTH + SIM_WIDTH)] = simulator.display_grid[0:SIM_HIGHT, 0:SIM_WIDTH] + 5
            ax.imshow(self.display_grid, cmap=self.cmap, norm=self.norm)
            plt.scatter(self.position[1], self.position[0], s=self.marker_size, marker=self.marker_map[self.direction], c=self.marker_color)
            ax.text(4.5, 0, f"{step:>3}", fontsize=10, color='White')
            plt.show()
    
    def save(self, step, simulator):
        """Save the display as a PNG file"""
        fig, ax = plt.subplots()
        ax.set_xticks([])
        ax.set_yticks([])
        ax.axis('off')
        ax.imshow(self.display_grid, cmap=self.cmap, norm=self.norm)
        plt.scatter(self.position[1], self.position[0], s=self.marker_size, marker=self.marker_map[self.direction], c=self.marker_color)
        plt.scatter(simulator.position[1] + 6, simulator.position[0], s=simulator.marker_size, marker=self.marker_map[simulator.direction], 
                    c=simulator.marker_color)
        ax.text(4.5, 0, f"{step:>4}", fontsize=10, color='White')
        plt.savefig(f"{save_dir}/{step:04}.png", bbox_inches='tight', pad_inches=0, transparent=True)
        plt.close(fig)
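
The direction handling in `outcome()` relies on the order `LEFT, DOWN, RIGHT, UP`: adding 1 to the direction turns the agent to its left and subtracting 1 turns it to its right. A minimal pure-Python sketch, in which the helper names `turn_left`, `turn_right` and `cell` are hypothetical:

```python
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3
# (row, column) offsets indexed by direction, as in the notebook
DIRECTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]

def turn_left(direction):
    """Counterclockwise on screen: UP -> LEFT -> DOWN -> RIGHT -> UP."""
    return (direction + 1) % 4

def turn_right(direction):
    """Clockwise on screen: UP -> RIGHT -> DOWN -> LEFT -> UP."""
    return (direction - 1) % 4

def cell(position, direction):
    """Return the cell adjacent to position in the given direction."""
    row, col = position
    d_row, d_col = DIRECTIONS[direction]
    return (row + d_row, col + d_col)

position, direction = (1, 1), LEFT           # Initial pose used later in the notebook
print(cell(position, direction))             # Cell in front: (1, 0)
print(cell(position, turn_left(direction)))  # Cell on the agent's left: (2, 1)
print(cell(position, turn_right(direction))) # Cell on the agent's right: (0, 1)
```

The notebook writes `self.directions[self.direction - 1]` without a modulo for `FEEL_RIGHT`; this gives the same result because a negative index wraps around to the end of the array.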
        

# Let's create the simulator

This simulator visualizes the interactions according to their outcome.

In [6]:
class Simulator():
    def __init__(self, position, direction):
        self.display_grid = np.full((SIM_HIGHT, SIM_WIDTH), UNKNOWN, dtype=int)
        self.initial_position = np.array(position) 
        self.position = np.array(position) 
        self.initial_direction = direction
        self.direction = direction
        self.marker_size = agent_size
        self.marker_color = agent_color
        self.directions = np.array([
            [0, -1],  # Left
            [1, 0],   # Down
            [0, 1],   # Right
            [-1, 0]   # Up
            ])
    def outcome(self, i):
        """Update the simulated pose and grid from the action and outcome of the interaction i."""
        if i._action == FORWARD:  
            target_position = self.position + self.directions[self.direction]
            if i._outcome == EMPTY:
                self.position[:] = target_position
            else:
                self.display_grid[tuple(target_position)] = BUMPING
        elif i._action == TURN_RIGHT:
            self.direction = {LEFT: UP, DOWN: LEFT, RIGHT: DOWN, UP: RIGHT}[self.direction]
        elif i._action == TURN_LEFT:
            self.direction = {LEFT: DOWN, DOWN: RIGHT, RIGHT: UP, UP: LEFT}[self.direction]
        elif i._action == FEEL_FRONT:
            feeling_position = self.position + self.directions[self.direction]
            if i._outcome == EMPTY:
                self.display_grid[tuple(feeling_position)] = FEELING_EMPTY
            else:
                self.display_grid[tuple(feeling_position)] = FEELING_WALL
        elif i._action == FEEL_LEFT:
            feeling_position = self.position + self.directions[(self.direction + 1) % 4]
            if i._outcome == EMPTY:
                self.display_grid[tuple(feeling_position)] = FEELING_EMPTY
            else:
                self.display_grid[tuple(feeling_position)] = FEELING_WALL
        elif i._action == FEEL_RIGHT:
            feeling_position = self.position + self.directions[self.direction - 1]
            if i._outcome == EMPTY:
                self.display_grid[tuple(feeling_position)] = FEELING_EMPTY
            else:
                self.display_grid[tuple(feeling_position)] = FEELING_WALL
    
    def clear(self):
        """Reset the simulator"""
        self.position[:] = self.initial_position
        self.direction = self.initial_direction 
        self.display_grid[:] = np.full((SIM_HIGHT, SIM_WIDTH), EMPTY, dtype=int)
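
The principle of the simulator, replaying enacted interactions from a fixed starting pose, can be sketched without NumPy or a display. The function `replay` is a hypothetical simplification: it handles only `FEEL_FRONT`, `FORWARD` and the two turns, and it records marks in a dictionary instead of a grid.

```python
FEEL_FRONT, FORWARD = 0, 1
TURN_LEFT, TURN_RIGHT = 4, 5
DIRECTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # Left, down, right, up

def replay(interactions, position=(4, 4), direction=3):
    """Replay (action, outcome) pairs; return the final pose and the cells marked on the way."""
    row, col = position
    marks = {}
    for action, outcome in interactions:
        d_row, d_col = DIRECTIONS[direction]
        front = (row + d_row, col + d_col)
        if action == FORWARD:
            if outcome == 0:
                row, col = front          # Moved forward
            else:
                marks[front] = "bump"     # Bumped: the agent stays in place
        elif action == FEEL_FRONT:
            marks[front] = "empty" if outcome == 0 else "wall"
        elif action == TURN_LEFT:
            direction = (direction + 1) % 4
        elif action == TURN_RIGHT:
            direction = (direction - 1) % 4
    return (row, col), direction, marks

pose, heading, marks = replay(
    [(FEEL_FRONT, 0), (FORWARD, 0), (FORWARD, 1), (TURN_LEFT, 0), (FEEL_FRONT, 1)])
print(pose, heading)  # (3, 4) 0
print(marks)          # {(3, 4): 'empty', (2, 4): 'bump', (3, 3): 'wall'}
```

The replay uses only the outcomes that the agent experienced and never reads the environment grid, which is why it represents the situation from the agent's point of view.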
        

# Let's implement the agent

In [7]:
import pandas as pd

class Agent:
    def __init__(self, _interactions):
        """ Initialize our agent """
        # The interactions must be sorted in the order of their tokens
        self._interactions = dict(sorted({interaction.key(): interaction for interaction in _interactions}.items()))
        self._primitive_intended_interaction = self._interactions["00"]
        self._intended_interaction = None
        self.simulator = Simulator([4, 4], UP)

        # The context
        self._penultimate_interaction = None
        self._previous_interaction = None
        self._last_interaction = None
        self._penultimate_composite_interaction = None
        self._previous_composite_interaction = None
        self._last_composite_interaction = None
        self._enacted_interaction = self._primitive_intended_interaction
        
        # Prepare the dataframe of proposed interactions
        default_interactions = [interaction for interaction in _interactions if interaction.get_outcome() == 0]
        data = {'activated': [""] * len(default_interactions),
                'weight': [0] * len(default_interactions),
                'actions': [i.get_actions() for i in default_interactions],
                'intention': [i.key() for i in default_interactions],
                'valence': [i.get_valence() for i in default_interactions],
                'decision': [i.get_decision() for i in default_interactions],
                'length': [1] * len(default_interactions),
                'pre': [""] * len(default_interactions)} 
        self._default_df = pd.DataFrame(data)
        self.proposed_df = None
        self.decision_df = None

    def action(self):
        """Implement the agent's policy"""
        # If the intended interaction is over (completely enacted or aborted)
        if self._enacted_interaction is not None:
            self.simulator.clear()
            # Memorize the context
            self._penultimate_composite_interaction = self._previous_composite_interaction
            self._previous_composite_interaction = self._last_composite_interaction
            self._penultimate_interaction = self._previous_interaction
            self._previous_interaction = self._last_interaction
            self._last_interaction = self._enacted_interaction
            # Call the learning mechanism
            self.learn(self._enacted_interaction)
            # Create the proposed dataframe
            self.create_proposed_df()
            self.aggregate_propositions()
            # Decide the next enaction
            self.decide()

        # Return the next primitive action
        self._primitive_intended_interaction = self._intended_interaction.current()
        return self._primitive_intended_interaction.get_action()
        
    def learn(self, enacted_interaction):
        """Learn the composite interactions"""
        # First level of composite interactions
        self._last_composite_interaction = self.learn_composite_interaction(self._previous_interaction, enacted_interaction)
        # Second level of composite interactions
        self.learn_composite_interaction(self._previous_composite_interaction, enacted_interaction)
        self.learn_composite_interaction(self._penultimate_interaction, self._last_composite_interaction)

        # Higher level composite interaction made of two composite interactions
        if self._last_composite_interaction is not None:
            self.learn_composite_interaction(self._penultimate_composite_interaction, self._last_composite_interaction)

    def learn_composite_interaction(self, pre_interaction, post_interaction):
        """Record or reinforce the composite interaction made of (pre_interaction, post_interaction)"""
        if pre_interaction is None:
            return None
        else:
            # If the pre-interaction exists
            composite_interaction = CompositeInteraction(pre_interaction, post_interaction)
            if composite_interaction.key() not in self._interactions:
                # Add the composite interaction to memory
                self._interactions[composite_interaction.key()] = composite_interaction
                print(f"Learning {composite_interaction}")
                return composite_interaction
            else:
                # Reinforce the existing composite interaction and return it
                self._interactions[composite_interaction.key()].reinforce()
                print(f"Reinforcing {self._interactions[composite_interaction.key()]}")
                return self._interactions[composite_interaction.key()]

    def create_proposed_df(self):
        """Create the proposed dataframe from the activated interactions"""
        # The list of activated interactions that match the current context
        activated_interactions = [i for i in self._interactions.values() if i.get_length() > 1 
                                  and i.pre_interaction in self._last_composite_interaction.get_post_interactions()]
        data = {'activated': [i.key() for i in activated_interactions],
                'weight': [i.weight for i in activated_interactions],
                'actions': [i.post_interaction.get_actions() for i in activated_interactions],
                'intention': [i.post_interaction.key() for i in activated_interactions],
                'valence': [i.post_interaction.get_valence() for i in activated_interactions],
                'decision': [i.post_interaction.get_decision() for i in activated_interactions],
                'pre': [i.post_interaction.pre_key() for i in activated_interactions],
                'length': [i.post_interaction.get_length() for i in activated_interactions],
                }
        activated_df = pd.DataFrame(data).astype(self._default_df.dtypes)  # Force the same types in case activated_df is empty

        # Create the proposed dataframe
        self.proposed_df = pd.concat([self._default_df, activated_df], ignore_index=True).sort_values(by='decision', ascending=True).reset_index(drop=True)

        # Calculate the proclivity of each proposition
        self.proposed_df['proclivity'] = self.proposed_df['weight'] * self.proposed_df['valence']

        # Compute the probability of each propositions
        # self.proposed_df['probability'] = self.proposed_df['weight'] / self.proposed_df.groupby('actions')['weight'].transform('sum')
        self.proposed_df['probability'] = self.proposed_df.groupby('intention')['weight'].transform('sum') / self.proposed_df.groupby('actions')['weight'].transform('sum')

    def aggregate_propositions(self):
        """Aggregate the proclivity"""
        # Aggregate the proclivity for each decision
        grouped_df = self.proposed_df.groupby('decision').agg({'proclivity': 'sum', 'actions': 'first', # 'action': 'first', 
                                                               'length': 'first', 'intention': 'first', 'pre': 'first'}).reset_index()
        # For each proposed composite decision 
        for index, proposed in grouped_df[grouped_df['length'] > 1].iterrows():
            # print(f"Index {index}, actions {proposition['actions']}, intention {proposition['intention']}")
            # Find shorter decisions that start with the same sequence 
            for _, shorter in self.proposed_df[self.proposed_df.apply(lambda row: proposed['actions'].startswith(row['actions']) 
                                                                      and row['length'] < proposed['length'], axis=1)].iterrows():
                # Add the proclivity of the shorter decisions
                grouped_df.loc[index, 'proclivity'] += shorter['proclivity']
                # print(f"Decision {proposed['decision']} receives {shorter['proclivity']} from shorter {shorter['intention']}")
        
        # Sort by descending proclivity
        self.decision_df = grouped_df.sort_values(by=['proclivity', 'decision'], ascending=[False, True]).reset_index(drop=True)

    def decide(self):
        """Selects the intended_interaction at the top of the proposed dataframe"""
        # The intended interaction is in the first row because it has been sorted by descending proclivity
        intended_interaction_key = self.decision_df.loc[0, 'intention']
        print("Intention:", intended_interaction_key)
        self._intended_interaction = self._interactions[intended_interaction_key]

    def assimilate(self, outcome):
        """Process the last outcome"""
        # Trace the previous cycle
        primitive_enacted_interaction = self._interactions[f"{self._primitive_intended_interaction.get_action()}{outcome}"]
        prediction_correct = self._primitive_intended_interaction.get_outcome() == outcome
        print(
            f"Action: {self._primitive_intended_interaction.get_action()}, Predicted: {self._primitive_intended_interaction.get_outcome()}, "
            f"Outcome: {outcome}, Prediction_correct: {prediction_correct}, "
            f"Valence: {primitive_enacted_interaction.get_valence()}")
        # Follow up the enaction
        self._enacted_interaction = self._intended_interaction.increment(primitive_enacted_interaction, self._interactions)
        # Update the simulator
        self.simulator.outcome(primitive_enacted_interaction)
        if prediction_correct:
            self.simulator.marker_color = agent_color
        else:
            self.simulator.marker_color = prediction_error_color
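
The core of the decision mechanism is that each proposition contributes `weight * valence` to the proclivity of its decision, and the decision with the highest total is selected. The sketch below reproduces this in pure Python with hypothetical propositions; it leaves out the pandas dataframes and the step in which composite decisions inherit the proclivity of shorter decisions.

```python
from collections import defaultdict

# Hypothetical propositions: (decision, weight, valence)
proposals = [
    ("0", 0, -1),   # Default proposition for action 0 (weight 0)
    ("1", 0, 5),    # Default proposition for action 1 (weight 0)
    ("1", 3, 5),    # Activated: moving forward succeeded 3 times in this context
    ("1", 2, -10),  # Activated: moving forward bumped 2 times in this context
    ("4", 1, -3),   # Activated: turning left was enacted once in this context
]

# Sum weight * valence for each decision
proclivity = defaultdict(int)
for decision, weight, valence in proposals:
    proclivity[decision] += weight * valence

# Sort by descending proclivity, then by decision to break ties
ranking = sorted(proclivity.items(), key=lambda item: (-item[1], item[0]))
print(ranking)        # [('0', 0), ('4', -3), ('1', -5)]
print(ranking[0][0])  # Selected decision: 0
```

In this example the expected cost of bumping outweighs the expected gain of moving forward, so the default proposition with a proclivity of 0 wins.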


# Let's test the agent in the Small Loop

In [8]:
# Instantiate the small loop environment
grid = [[WALL, WALL , WALL , WALL , WALL , WALL],
        [WALL, EMPTY, EMPTY, EMPTY, WALL , WALL],
        [WALL, EMPTY, WALL , EMPTY, EMPTY, WALL],
        [WALL, EMPTY, WALL , WALL , EMPTY, WALL],
        [WALL, EMPTY, EMPTY, EMPTY, EMPTY, WALL],
        [WALL, WALL , WALL , WALL , WALL , WALL]]
e = SmallLoop([1, 1], 0, grid)

# Instantiate the agent
interactions = [
    Interaction(FORWARD,0,5),
    Interaction(FORWARD,1,-10),
    Interaction(TURN_LEFT,0,-3),
    Interaction(TURN_LEFT,1,-3),
    Interaction(TURN_RIGHT,0,-3),
    Interaction(TURN_RIGHT,1,-3),
    Interaction(FEEL_FRONT,0,-1),
    Interaction(FEEL_FRONT,1,-1),
    Interaction(FEEL_LEFT,0,-1),
    Interaction(FEEL_LEFT,1,-1),
    Interaction(FEEL_RIGHT,0,-1),
    Interaction(FEEL_RIGHT,1,-1)
]
a = Agent(interactions)

# Initialize the interaction loop
step = 0
outcome = 0

# Display
out = Output()
e.display(a.simulator)
display(out)

Output()

In [9]:
for step in range(1000):
    print(f"Step: {step}")
    # step += 1
    action = a.action()
    outcome = e.outcome(action)
    a.assimilate(outcome)
    e.display(a.simulator)
    e.save(step, a.simulator)  # Save the image file 

Step: 0
Intention: 00
Action: 0, Predicted: 0, Outcome: 1, Prediction_correct: False, Valence: -1
Step: 1
Learning (00:-1, 01:-1: 1)
Intention: 00
Action: 0, Predicted: 0, Outcome: 1, Prediction_correct: False, Valence: -1
Step: 2
Learning (01:-1, 01:-1: 1)
Learning ((00:-1, 01:-1: 1), 01:-1: 1)
Learning (00:-1, (01:-1, 01:-1: 1): 1)
Intention: 10
Action: 1, Predicted: 0, Outcome: 1, Prediction_correct: False, Valence: -10
Step: 3
Learning (01:-1, 11:-10: 1)
Learning ((01:-1, 01:-1: 1), 11:-10: 1)
Learning (01:-1, (01:-1, 11:-10: 1): 1)
Learning ((00:-1, 01:-1: 1), (01:-1, 11:-10: 1): 1)
Intention: 00
Action: 0, Predicted: 0, Outcome: 1, Prediction_correct: False, Valence: -1
Step: 4
Learning (11:-10, 01:-1: 1)
Learning ((01:-1, 11:-10: 1), 01:-1: 1)
Learning (01:-1, (11:-10, 01:-1: 1): 1)
Learning ((01:-1, 01:-1: 1), (11:-10, 01:-1: 1): 1)
Intention: 20
Action: 2, Predicted: 0, Outcome: 0, Prediction_correct: True, Valence: -1
Step: 5
Learning (01:-1, 20:-1: 1)
Learning ((11:-10, 01:-

Exception ignored in: <function WeakMethod.__new__.<locals>._cb at 0x000002821C1088B0>
Traceback (most recent call last):
  File "c:\users\ogeorgeon\appdata\local\programs\python\python39\lib\weakref.py", line 61, in _cb
    callback(self)
  File "c:\users\ogeorgeon\appdata\local\programs\python\python39\lib\site-packages\matplotlib\cbook.py", line 248, in _remove_proxy
    del self.callbacks[signal][cid]
KeyError: 'changed'


Step: 91
Action: 1, Predicted: 0, Outcome: 0, Prediction_correct: True, Valence: 5
Step: 92
Action: 0, Predicted: 0, Outcome: 0, Prediction_correct: True, Valence: -1
Step: 93
Action: 1, Predicted: 0, Outcome: 0, Prediction_correct: True, Valence: 5
Step: 94
Action: 0, Predicted: 1, Outcome: 0, Prediction_correct: False, Valence: -1
Reinforcing (10:5, 00:-1: 2)
Reinforcing (00:-1, (10:5, 00:-1: 2): 3)
Reinforcing ((00:-1, 10:5: 1), (00:-1, (10:5, 00:-1: 2): 3): 2)
Step: 95
Learning (40:-3, ((00:-1, 10:5: 1), (00:-1, (10:5, 00:-1: 2): 3): 2): 1)
Learning ((((00:-1, 10:5: 1), (00:-1, (10:5, 01:-1: 5): 3): 1), 40:-3: 1), ((00:-1, 10:5: 1), (00:-1, (10:5, 00:-1: 2): 3): 2): 1)
Learning (((00:-1, 10:5: 1), (00:-1, (10:5, 01:-1: 5): 3): 1), (40:-3, ((00:-1, 10:5: 1), (00:-1, (10:5, 00:-1: 2): 3): 2): 1): 1)
Learning ((40:-3, ((00:-1, 10:5: 1), (00:-1, (10:5, 01:-1: 5): 3): 1): 2), (40:-3, ((00:-1, 10:5: 1), (00:-1, (10:5, 00:-1: 2): 3): 2): 1): 1)
Intention: ((10,00),(10,01))
Action: 1, Pred

![video13](video13.gif)

_Video 1: Example_

The left part shows the agent in the environment.
The right part shows the enacted sequences from the agent's point of view.
The agent appears in magenta when a prediction error occurs.

# Let's extract the learned sequences

In [2492]:
# The End Of Sequence token = 11
EOS_TOKEN = 11

def sequences(agent):
    """Print the sequences used to train the models from all the composite interactions"""
    # For all the composite interactions of length 2 up to (but not including) 6
    print("Sequences to train the LSTM:")
    for l in range(2, 6):
        x = [i.series()[:-1] for i in agent._interactions.values() if i.get_length() == l]
        # print("x", x)
        y = [i.series()[-1] for i in agent._interactions.values() if i.get_length() == l]
        # print("y", y)

    print("Sequences to train the next token prediction Transformer:")
    for l in range(2, 11):
        print([i.series() for i in agent._interactions.values() if i.get_length() == l], ",")

    print("Sequences to train the seq2seq Transformer:")
    # print([[i.pre_interaction.series(), i.post_interaction.series()] for i in agent._interactions.values() for l in range(6, 20) if i.get_length() == l ])

    print("Sequences to visualize the decoder:")
    # print([i.pre_interaction.series() for i in agent._interactions.values() for l in range(10, 18) if i.get_length() == l ])
    # print([i.post_interaction.series() for i in agent._interactions.values() for l in range(10, 18) if i.get_length() == l ])

sequences(a)
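
The extraction above groups the token series by length and, for the LSTM case, separates each series into a prefix `x` and a final token `y`. A minimal sketch of these two steps on hypothetical token series, with hypothetical helper names:

```python
from collections import defaultdict

def group_by_length(all_series):
    """Group token series by their number of tokens."""
    groups = defaultdict(list)
    for series in all_series:
        groups[len(series)].append(series)
    return dict(groups)

def split_prefix_target(series_list):
    """Split each series into its prefix (input) and its last token (target)."""
    x = [series[:-1] for series in series_list]
    y = [series[-1] for series in series_list]
    return x, y

# Hypothetical token series, as series() would return them
all_series = [[0, 1], [1, 1], [0, 2, 9], [1, 1, 3]]

groups = group_by_length(all_series)
x, y = split_prefix_target(groups[3])
print(groups)  # {2: [[0, 1], [1, 1]], 3: [[0, 2, 9], [1, 1, 3]]}
print(x, y)    # [[0, 2], [1, 1]] [9, 3]
```

Grouping by length keeps every batch rectangular, so the prefixes of a given group can be stacked without padding.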


Sequences to train the LSTM:
Sequences to train the next token prediction Transformer:
[] ,
[] ,
[] ,
[] ,
[] ,
[] ,
[] ,
[] ,
[] ,
Sequences to train the seq2seq Transformer:
Sequences to visualize the decoder:
