# THE VACUUM WORLD   

In this notebook, we will be discussing **the structure of agents** through an example of the **vacuum agent**. The job of AI is to design an **agent program** that implements the agent function: the mapping from percepts to actions. We assume this program will run on some sort of computing device with physical sensors and actuators: we call this the **architecture**:

<h3 align="center">agent = architecture + program</h3>

Before moving on, please review [<b>agents.ipynb</b>](https://github.com/aimacode/aima-python/blob/master/agents.ipynb)

## CONTENTS

* Agent
* Random Agent Program
* Table-Driven Agent Program
* Simple Reflex Agent Program
* Model-Based Reflex Agent Program
* Goal-Based Agent Program
* Utility-Based Agent Program
* Learning Agent

## AGENT PROGRAMS

An agent program takes the current percept as input from the sensors and returns an action to the actuators. There is a difference between an agent program and an agent function: an agent program takes the current percept as input whereas an agent function takes the entire percept history.

The agent program takes just the current percept as input because nothing more is available from the environment; if the agent's actions depend on the entire percept sequence, the agent will have to remember the percepts itself.
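
To make the distinction concrete, here is a minimal sketch of an agent program that receives only the current percept but stores the history itself. It does not use the agents module; `make_remembering_program` and `decide` are hypothetical names chosen for illustration.

```python
def make_remembering_program(decide):
    """Wrap `decide`, a function of the whole percept history, as an agent program.

    The returned program is called with the current percept only; the
    history lives in the closure. Names here are illustrative assumptions.
    """
    percepts = []  # memory kept by the agent, not by the environment

    def program(percept):
        percepts.append(percept)
        return decide(tuple(percepts))

    return program


def decide(history):
    """Example rule: suck dirt, stop after two clean percepts in a row."""
    location, status = history[-1]
    if status == 'Dirty':
        return 'Suck'
    if len(history) >= 2 and history[-2][1] == 'Clean':
        return 'NoOp'
    return 'Right'


program = make_remembering_program(decide)
percept_stream = [((0, 0), 'Dirty'), ((0, 0), 'Clean'), ((1, 0), 'Clean')]
actions = [program(p) for p in percept_stream]
print(actions)
```

The third action depends on the second percept as well as the third, which a program without memory could not reproduce.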

We'll discuss the following agent programs here with the help of the vacuum world example:

* Random Agent Program
* Table-Driven Agent Program
* Simple Reflex Agent Program
* Model-Based Reflex Agent Program
* Goal-Based Agent Program
* Utility-Based Agent Program

## RANDOM AGENT PROGRAM

A random agent program, as the name suggests, chooses an action at random, without taking into account the percepts.   
Here, we will demonstrate a random vacuum agent for a trivial vacuum environment, extended in this notebook from the usual two locations to a 2x2 grid of four locations.
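
Before using the version from the agents module, the idea can be sketched in a few self-contained lines. This is an illustration only; `random_agent_program` and its `rng` parameter are hypothetical and are not the module's `RandomAgentProgram`.

```python
import random


def random_agent_program(actions, rng=random):
    """Return a program that ignores its percept and picks a random action."""
    def program(percept):
        return rng.choice(actions)
    return program


actions = ['Right', 'Left', 'Up', 'Down', 'Suck', 'NoOp']
# A seeded generator makes the run repeatable
program = random_agent_program(actions, rng=random.Random(0))
chosen = [program(((0, 0), 'Dirty')) for _ in range(5)]
print(chosen)
```

Note that the percept is never inspected, so the agent may well move away from a dirty square.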

Let's begin by importing all the functions from the agents module:

In [1]:
from agents import *
from notebook import psource


loc_A, loc_B, loc_C, loc_D = (0, 0), (1, 0), (0, 1), (1, 1)



Let us first create the TrivialVacuumEnvironment defined in the agents module and check its initial state:

In [None]:
# Edited these locations to add 2 more for a 2x2 env

# Initialize the 4-state environment
trivial_vacuum_env = TrivialVacuumEnvironment()

# Check the initial state of the environment
print("State of the Environment: {}.".format(trivial_vacuum_env.status))

Let's create our agent now. This agent will randomly choose one of the actions 'Right', 'Left', 'Up', 'Down', 'Suck' and 'NoOp' (No Operation).

In [None]:
# Create the random agent for 2x2 environment
random_agent = Agent(program=RandomAgentProgram(['Right', 'Left', 'Up', 'Down', 'Suck', 'NoOp']))

We will now add our agent to the environment.

In [None]:
# Add agent to the environment
trivial_vacuum_env.add_thing(random_agent)

print("RandomVacuumAgent is located at {}.".format(random_agent.location))

Let's run our environment now.

In [None]:
# Running the environment
trivial_vacuum_env.step()

# Check the current state of the environment
print("State of the Environment: {}.".format(trivial_vacuum_env.status))

print("RandomVacuumAgent is located at {}.".format(random_agent.location))

## TABLE-DRIVEN AGENT PROGRAM

A table-driven agent program keeps track of the percept sequence and then uses it to index into a table of actions to decide what to do. The table represents explicitly the agent function that the agent program embodies.  
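In the vacuum world, each key of the table is a percept sequence and each value is the action to take after that sequence. The table below covers sequences of up to four percepts in the 2x2 environment.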
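
The lookup itself is simple. The following sketch, which assumes nothing from the agents module (`table_driven_program` and `small_table` are illustrative names), shows how the growing percept sequence is used as the key:

```python
def table_driven_program(table):
    """Look up the action for the full percept sequence seen so far."""
    percepts = []

    def program(percept):
        percepts.append(percept)
        # Returns None when the sequence is not listed in the table
        return table.get(tuple(percepts))

    return program


A, B = (0, 0), (1, 0)
small_table = {
    ((A, 'Dirty'),): 'Suck',
    ((A, 'Dirty'), (A, 'Clean')): 'Right',
    ((A, 'Dirty'), (A, 'Clean'), (B, 'Clean')): 'NoOp',
}

program = table_driven_program(small_table)
trace = [program(p) for p in [(A, 'Dirty'), (A, 'Clean'), (B, 'Clean'), (B, 'Clean')]]
print(trace)
```

The last lookup fails because no four-percept sequence is listed. With four locations and two statuses there are 8 distinct percepts, so there can be up to 8^t sequences of length t, which is why a complete table quickly becomes impractical.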

In [None]:
# Define locations
loc_A = (0, 0)
loc_B = (1, 0)
loc_C = (0, 1)
loc_D = (1, 1)

# Updated table with boundary checks
table = {
        # Single percepts
        ((loc_A, 'Clean'),): 'Right',
        ((loc_A, 'Dirty'),): 'Suck',
        ((loc_B, 'Clean'),): 'Down',
        ((loc_B, 'Dirty'),): 'Suck',
        ((loc_C, 'Clean'),): 'Up',
        ((loc_C, 'Dirty'),): 'Suck',
        ((loc_D, 'Clean'),): 'Left',
        ((loc_D, 'Dirty'),): 'Suck',

        # Two-percept sequences
        ((loc_A, 'Clean'), (loc_B, 'Clean')): 'Down',
        ((loc_A, 'Clean'), (loc_B, 'Dirty')): 'Suck',
        ((loc_A, 'Dirty'), (loc_A, 'Clean')): 'Right',
        ((loc_B, 'Clean'), (loc_D, 'Clean')): 'Left',
        ((loc_B, 'Clean'), (loc_D, 'Dirty')): 'Suck',
        ((loc_B, 'Dirty'), (loc_B, 'Clean')): 'Down',
        ((loc_C, 'Clean'), (loc_A, 'Clean')): 'Right',
        ((loc_C, 'Clean'), (loc_A, 'Dirty')): 'Suck',
        ((loc_C, 'Dirty'), (loc_C, 'Clean')): 'Up',
        ((loc_D, 'Clean'), (loc_C, 'Clean')): 'Up',
        ((loc_D, 'Clean'), (loc_C, 'Dirty')): 'Suck',
        ((loc_D, 'Dirty'), (loc_D, 'Clean')): 'Left',

        # Three-percept sequences
        ((loc_A, 'Clean'), (loc_B, 'Clean'), (loc_D, 'Clean')): 'Left',
        ((loc_A, 'Clean'), (loc_B, 'Clean'), (loc_D, 'Dirty')): 'Suck',
        ((loc_A, 'Dirty'), (loc_A, 'Clean'), (loc_B, 'Clean')): 'Down',
        ((loc_A, 'Dirty'), (loc_A, 'Clean'), (loc_B, 'Dirty')): 'Suck',
        ((loc_B, 'Clean'), (loc_D, 'Clean'), (loc_C, 'Clean')): 'Up',
        ((loc_B, 'Clean'), (loc_D, 'Clean'), (loc_C, 'Dirty')): 'Suck',
        ((loc_B, 'Dirty'), (loc_B, 'Clean'), (loc_D, 'Clean')): 'Left',
        ((loc_B, 'Dirty'), (loc_B, 'Clean'), (loc_D, 'Dirty')): 'Suck',
        ((loc_C, 'Clean'), (loc_A, 'Clean'), (loc_B, 'Clean')): 'Down',
        ((loc_C, 'Clean'), (loc_A, 'Clean'), (loc_B, 'Dirty')): 'Suck',
        ((loc_C, 'Dirty'), (loc_C, 'Clean'), (loc_A, 'Clean')): 'Right',
        ((loc_C, 'Dirty'), (loc_C, 'Clean'), (loc_A, 'Dirty')): 'Suck',
        ((loc_D, 'Clean'), (loc_C, 'Clean'), (loc_A, 'Clean')): 'Right',
        ((loc_D, 'Clean'), (loc_C, 'Clean'), (loc_A, 'Dirty')): 'Suck',
        ((loc_D, 'Dirty'), (loc_D, 'Clean'), (loc_C, 'Clean')): 'Up',
        ((loc_D, 'Dirty'), (loc_D, 'Clean'), (loc_C, 'Dirty')): 'Suck',

        # Four-percept sequences (for handling edge cases and ensuring all cells are visited)
        ((loc_A, 'Clean'), (loc_B, 'Clean'), (loc_D, 'Clean'), (loc_C, 'Clean')): 'NoOp',
        ((loc_A, 'Clean'), (loc_B, 'Clean'), (loc_D, 'Clean'), (loc_C, 'Dirty')): 'Up',
        ((loc_B, 'Clean'), (loc_D, 'Clean'), (loc_C, 'Clean'), (loc_A, 'Clean')): 'NoOp',
        ((loc_B, 'Clean'), (loc_D, 'Clean'), (loc_C, 'Clean'), (loc_A, 'Dirty')): 'Left',
        ((loc_C, 'Clean'), (loc_A, 'Clean'), (loc_B, 'Clean'), (loc_D, 'Clean')): 'NoOp',
        ((loc_C, 'Clean'), (loc_A, 'Clean'), (loc_B, 'Clean'), (loc_D, 'Dirty')): 'Right',
        ((loc_D, 'Clean'), (loc_C, 'Clean'), (loc_A, 'Clean'), (loc_B, 'Clean')): 'NoOp',
        ((loc_D, 'Clean'), (loc_C, 'Clean'), (loc_A, 'Clean'), (loc_B, 'Dirty')): 'Down',

        # Additional sequences to ensure the agent doesn't get stuck
        ((loc_A, 'Clean'), (loc_B, 'Clean'), (loc_D, 'Dirty'), (loc_D, 'Clean')): 'Left',
        ((loc_B, 'Clean'), (loc_D, 'Clean'), (loc_C, 'Dirty'), (loc_C, 'Clean')): 'Up',
        ((loc_C, 'Clean'), (loc_A, 'Clean'), (loc_B, 'Dirty'), (loc_B, 'Clean')): 'Down',
        ((loc_D, 'Clean'), (loc_C, 'Clean'), (loc_A, 'Dirty'), (loc_A, 'Clean')): 'Right',
    }



# Create the TableDrivenVacuumAgent
table_driven_agent = TableDrivenVacuumAgent()

# Remove any previous agents if needed
trivial_vacuum_env.delete_thing(random_agent)
trivial_vacuum_env = TrivialVacuumEnvironment()
# Add the table-driven agent to the environment
trivial_vacuum_env.add_thing(table_driven_agent)

# Print the initial state
print("Initial state:")
print("Agent location:", table_driven_agent.location)
print("Environment status:", trivial_vacuum_env.status)

# Run a step in the environment
print("\nRunning steps:")

# Run the environment for several steps
step = 1
while not trivial_vacuum_env.is_clean():
    # Get the percept
    percept = trivial_vacuum_env.percept(table_driven_agent)
    
    # Get the action the agent will perform based on the percept
    action = table_driven_agent.program(percept)
    
    # Print the percept and the chosen action
    print(f"Step {step}: Percept: {percept}, Action: {action}")
    step += 1
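    # Execute the chosen action directly. Calling step() here would run the
    # agent program a second time and append the same percept to its history twice.
    trivial_vacuum_env.execute_action(table_driven_agent, action)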
    
    # Print the new state of the environment
    print("Agent location:", table_driven_agent.location)
    print("Environment status:", trivial_vacuum_env.status)
    #print("Agent performance:", table_driven_agent.performance)
    
    # Check if the agent sucked at the current location
    if action == 'Suck':
        print(f"Agent sucked at location {table_driven_agent.location}.")
    
    print("-" * 50)



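The same steps can also be carried out one cell at a time using the generic TableDrivenAgentProgram and the table defined above. The cells below are left commented out because the previous cell already creates and runs the agent.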

In [None]:
# Create a table-driven agent
# table_driven_agent = Agent(program=TableDrivenAgentProgram(table=table))

Since we are using the same environment, let's remove the previously added random agent from the environment to avoid confusion.

In [None]:
#trivial_vacuum_env.delete_thing(random_agent)

In [None]:
# Add the table-driven agent to the environment
#trivial_vacuum_env.add_thing(table_driven_agent)

#print("TableDrivenVacuumAgent is located at {}.".format(table_driven_agent.location))

In [None]:
# Run the environment
#trivial_vacuum_env.step()

# Check the current state of the environment
#print("State of the Environment: {}.".format(trivial_vacuum_env.status))

#print("TableDrivenVacuumAgent is located at {}.".format(table_driven_agent.location))

## SIMPLE REFLEX AGENT PROGRAM

A simple reflex agent program selects actions on the basis of the *current* percept, ignoring the rest of the percept history. These agents work on a **condition-action rule** (also called **situation-action rule**, **production** or **if-then rule**), which tells the agent the action to trigger when a particular situation is encountered.  

The schematic diagram shown in **Figure 2.9** of the book will make this more clear:

![simple reflex agent](images/simple_reflex_agent.jpg)

Let us now create a simple reflex agent for the environment.

In [None]:
# Delete the previously added table-driven agent
trivial_vacuum_env.delete_thing(table_driven_agent)

To create our agent, we need two functions: the INTERPRET-INPUT function, which generates an abstracted description of the current state from the percept, and the RULE-MATCH function, which returns the first rule in the set of rules that matches the given state description.
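
As a rough, self-contained sketch of these two functions (the rule representation as `(condition, action)` pairs is an assumption made for illustration, and `rule_match` here returns the matching rule's action directly rather than the rule object used in the book's pseudocode):

```python
def interpret_input(percept):
    """Abstract the raw percept into a state description (here, a dict)."""
    location, status = percept
    return {'location': location, 'dirty': status == 'Dirty'}


def rule_match(state, rules):
    """Return the action of the first rule whose condition holds for `state`."""
    for condition, action in rules:
        if condition(state):
            return action
    return 'NoOp'  # fallback when no rule applies


# Condition-action rules for the 2x2 world, tried in order
rules = [
    (lambda s: s['dirty'], 'Suck'),
    (lambda s: s['location'] == (0, 0), 'Right'),
    (lambda s: s['location'] == (1, 0), 'Down'),
    (lambda s: s['location'] == (0, 1), 'Up'),
    (lambda s: s['location'] == (1, 1), 'Left'),
]


def simple_reflex_program(percept):
    return rule_match(interpret_input(percept), rules)


print(simple_reflex_program(((1, 0), 'Clean')))
print(simple_reflex_program(((1, 0), 'Dirty')))
```

Because the rules are tried in order, putting the dirt rule first ensures the agent cleans before it moves.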

In [None]:
loc_A, loc_B, loc_C, loc_D = (0, 0), (1, 0), (0, 1), (1, 1)  

def SimpleReflexVacuumAgent():
    """A simple reflex agent for the 2x2 vacuum environment."""
    def program(percept):
        location, status = percept
        if status == 'Dirty':
            return 'Suck'
        elif location == loc_A:
            return 'Right'
        elif location == loc_B:
            return 'Down'
        elif location == loc_C:
            return 'Up'
        elif location == loc_D:
            return 'Left'
    return Agent(program)

# Create a simple reflex agent for the 4-state environment
simple_reflex_agent = SimpleReflexVacuumAgent()

Now add the agent to the environment:

In [None]:
trivial_vacuum_env = TrivialVacuumEnvironment()
trivial_vacuum_env.add_thing(simple_reflex_agent)

print("SimpleReflexVacuumAgent is located at {}.".format(simple_reflex_agent.location))

In [None]:
# Run the environment
while not trivial_vacuum_env.is_clean():
    trivial_vacuum_env.step()

    # Check the current state of the environment
    print("State of the Environment: {}.".format(trivial_vacuum_env.status))

    print("SimpleReflexVacuumAgent is located at {}.".format(simple_reflex_agent.location))

## MODEL-BASED REFLEX AGENT PROGRAM

A model-based reflex agent maintains some sort of **internal state** that depends on the percept history and thereby reflects at least some of the unobserved aspects of the current state. In addition to this, it also requires a **model** of the world, that is, knowledge about "how the world works".

The schematic diagram shown in **Figure 2.11** of the book will make this more clear:
<img src="files/images/model_based_reflex_agent.jpg">

We will now create a model-based reflex agent for the environment:

In [None]:
# Delete the previously added simple reflex agent
trivial_vacuum_env.delete_thing(simple_reflex_agent)

We need another function, UPDATE-STATE, which is responsible for creating the new state description.

In [None]:
# UPDATE-STATE for the four-location (2x2) environment
def update_state(state, action, percept, model):
    """Record the status just perceived at the agent's location.

    `state` is a dict mapping location -> status. `action` and `model` are
    accepted to match the UPDATE-STATE signature but are not used in this
    simple version.
    """
    location, status = percept
    state[location] = status
    return state

# Create a model-based reflex agent
model_based_reflex_agent = ModelBasedVacuumAgent()
trivial_vacuum_env = TrivialVacuumEnvironment()
# Add the agent to the environment
trivial_vacuum_env.add_thing(model_based_reflex_agent)

print("ModelBasedVacuumAgent is located at {}.".format(model_based_reflex_agent.location))

In [None]:
# Run the environment
while not trivial_vacuum_env.is_clean():
    trivial_vacuum_env.step()

    # Check the current state of the environment
    print("State of the Environment: {}.".format(trivial_vacuum_env.status))
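
How the internal state changes the agent's behaviour can be seen in a sketch that does not depend on the agents module. The function name, the `route` argument and the hand-written percept list are assumptions for illustration; this is not the module's `ModelBasedVacuumAgent`.

```python
def model_based_vacuum_program(locations, route):
    """Sketch of a model-based reflex program.

    `route` maps each location to the move leading to the next location.
    The internal state `model` holds the last known status of every square.
    """
    model = {loc: None for loc in locations}  # None means "not yet observed"

    def program(percept):
        location, status = percept
        model[location] = status  # update the internal state from the percept
        if all(s == 'Clean' for s in model.values()):
            return 'NoOp'  # every square is known to be clean
        if status == 'Dirty':
            return 'Suck'
        return route[location]

    return program


locations = [(0, 0), (1, 0), (1, 1), (0, 1)]
route = {(0, 0): 'Right', (1, 0): 'Down', (1, 1): 'Left', (0, 1): 'Up'}
program = model_based_vacuum_program(locations, route)

percepts = [((0, 0), 'Clean'), ((1, 0), 'Dirty'), ((1, 0), 'Clean'),
            ((1, 1), 'Clean'), ((0, 1), 'Clean')]
actions = [program(p) for p in percepts]
print(actions)
```

Unlike the simple reflex agent, this program stops with 'NoOp' once it has seen every square clean, instead of circling forever.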

## GOAL-BASED AGENT PROGRAM

A goal-based agent needs some sort of **goal** information that describes situations that are desirable, apart from the current state description.

**Figure 2.13** of the book shows a model-based, goal-based agent:
<img src="files/images/model_goal_based_agent.jpg">

**Search** (Chapters 3 to 5) and **Planning** (Chapters 10 to 11) are the subfields of AI devoted to finding action sequences that achieve the agent's goals.
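
As a small taste of what search provides, the sketch below plans a shortest action sequence that reaches the goal "no dirt left" in a 2x2 world. The transition model `result`, the coordinate convention ('Down' increases the second coordinate) and the state encoding are assumptions for illustration and are not taken from the agents module.

```python
from collections import deque

MOVES = {'Right': (1, 0), 'Left': (-1, 0), 'Down': (0, 1), 'Up': (0, -1)}
LOCATIONS = {(0, 0), (1, 0), (0, 1), (1, 1)}


def result(state, action):
    """Deterministic transition model: state is (location, frozenset of dirty squares)."""
    location, dirt = state
    if action == 'Suck':
        return (location, dirt - {location})
    dx, dy = MOVES[action]
    target = (location[0] + dx, location[1] + dy)
    # Moving into a wall leaves the agent where it is
    return (target if target in LOCATIONS else location, dirt)


def plan(start, goal_test):
    """Breadth-first search for a shortest action sequence reaching a goal state."""
    frontier = deque([(start, [])])
    explored = {start}
    while frontier:
        state, actions = frontier.popleft()
        if goal_test(state):
            return actions
        for action in ['Suck', 'Right', 'Left', 'Down', 'Up']:
            nxt = result(state, action)
            if nxt not in explored:
                explored.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None  # no plan exists


start = ((0, 0), frozenset({(0, 0), (1, 1)}))
actions = plan(start, goal_test=lambda s: not s[1])
print(actions)
```

The goal is stated separately from the rules for acting, so changing the goal test changes the behaviour without rewriting the agent.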

## UTILITY-BASED AGENT PROGRAM

A utility-based agent maximizes its **utility** using the agent's **utility function**, which is essentially an internalization of the agent's performance measure.
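
A minimal sketch of this decision rule follows. The toy outcome model (an assumed 80% chance that 'Suck' removes dirt) and the utility function are invented for illustration only.

```python
def expected_utility(action, state, outcomes, utility):
    """Sum of probability * utility over the possible results of `action`."""
    return sum(p * utility(s) for p, s in outcomes(state, action))


def utility_based_choice(state, actions, outcomes, utility):
    """Pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, state, outcomes, utility))


def outcomes(dirty, action):
    """Toy model: the state is the number of dirty squares remaining."""
    if action == 'Suck' and dirty > 0:
        return [(0.8, dirty - 1), (0.2, dirty)]  # assumed 80% success rate
    return [(1.0, dirty)]


def utility(dirty):
    """Fewer dirty squares is better."""
    return -dirty


best = utility_based_choice(2, ['NoOp', 'Suck'], outcomes, utility)
print(best)
```

Here 'Suck' has an expected utility of about -1.2 against -2 for 'NoOp', so it is preferred even though it may fail.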

**Figure 2.14** of the book shows a model-based, utility-based agent:
<img src="files/images/model_utility_based_agent.jpg">

## LEARNING AGENT

Learning allows the agent to operate in initially unknown environments and to become more competent than its initial knowledge alone might allow. Here, we will briefly introduce the main ideas of learning agents.  

A learning agent can be divided into four conceptual components. The **learning element** is responsible for making improvements. It uses the feedback from the **critic** on how the agent is doing and determines how the performance element should be modified to do better in the future. The **performance element** is responsible for selecting external actions for the agent: it takes in percepts and decides on actions. The critic tells the learning element how well the agent is doing with respect to a fixed performance standard. It is necessary because the percepts themselves provide no indication of the agent's success. The last component of the learning agent is the **problem generator**. It is responsible for suggesting actions that will lead to new and informative experiences.  
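
One possible way to map the four components onto code is sketched below. Everything in it (the class name, the running-average update, the exploration rate and the scoring used by `critic`) is an assumption made for illustration; it is a toy, not an implementation from the book or the agents module.

```python
import random


class LearningVacuumAgent:
    """Toy learning agent that learns the best action for each square status."""

    def __init__(self, actions, explore=0.2, rng=None):
        self.actions = actions
        self.explore = explore  # probability of trying an exploratory action
        self.rng = rng or random.Random()
        self.value = {}  # (status, action) -> running average of feedback
        self.count = {}  # (status, action) -> number of updates

    def performance_element(self, status):
        """Select the action currently believed to be best."""
        return max(self.actions, key=lambda a: self.value.get((status, a), 0.0))

    def problem_generator(self):
        """Suggest an exploratory action that may be informative."""
        return self.rng.choice(self.actions)

    def act(self, status):
        if self.rng.random() < self.explore:
            return self.problem_generator()
        return self.performance_element(status)

    def learning_element(self, status, action, feedback):
        """Update the running average for (status, action) from the critic's feedback."""
        key = (status, action)
        n = self.count.get(key, 0) + 1
        self.count[key] = n
        old = self.value.get(key, 0.0)
        self.value[key] = old + (feedback - old) / n


def critic(status, action):
    """Fixed performance standard: reward cleaning dirt, penalise wasted effort."""
    if status == 'Dirty':
        return 1.0 if action == 'Suck' else -1.0
    return 0.0 if action == 'NoOp' else -1.0


rng = random.Random(0)
agent = LearningVacuumAgent(['Suck', 'NoOp'], explore=0.2, rng=rng)
for _ in range(200):
    status = rng.choice(['Dirty', 'Clean'])
    action = agent.act(status)
    agent.learning_element(status, action, critic(status, action))

learned = (agent.performance_element('Dirty'), agent.performance_element('Clean'))
print(learned)
```

The agent is never told which action suits which status; it learns this only from the critic's feedback, while the problem generator ensures that untried actions are sampled now and then.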

**Figure 2.15** of the book sums up the components and their working:  
<img src="files/images/general_learning_agent.jpg">