# QUESTION 2: Intelligent Agents: Reflex-Based Agents for the Vacuum-cleaner World



Now that you have a basic understanding of how the agent must work, recall our discussions of agent types and environments. We will now include the PEAS information and redesign the agent.
In this assignment you will implement a simulator environment for an automatic vacuum cleaner robot, a set of different reflex-based agent programs, and perform a comparison study for cleaning a single room. Focus on the __cleaning phase__ which starts when the robot is activated and ends when the last dirty square in the room has been cleaned. Someone else will take care of the agent program needed to navigate back to the charging station after the room is clean.

## PEAS description of the cleaning phase

__Performance Measure:__ Each action costs 1 energy unit. The performance is measured as the sum of the energy units used to clean the whole room.

__Environment:__ A room with $n \times n$ squares where $n = 5$. Dirt is randomly placed on each square with probability $p = 0.2$. For simplicity, you can assume that the agent knows the size and the layout of the room (i.e., it knows $n$). To start, the agent is placed on a random square.

__Actuators:__ The agent can clean the current square (action `suck`) or move to an adjacent square by going `north`, `east`, `south`, or `west`.

__Sensors:__ Four bumper sensors, one for north, east, south, and west; a dirt sensor reporting dirt in the current square.  
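
The environment and sensor descriptions above can be sketched in a few lines. This is only one possible reading of the PEAS description, assuming NumPy is available; the helper names `make_room` and `get_bumpers` are hypothetical and not part of the assignment, and the sketch assumes row 0 is the north wall.

```python
import numpy as np

def make_room(n=5, p=0.2, rng=None):
    """Return an n x n boolean array; True marks a dirty square (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.random((n, n)) < p  # each square is dirty with probability p

def get_bumpers(pos, n=5):
    """Bumper percepts for an agent at pos = (row, col); row 0 is assumed to be north."""
    row, col = pos
    return {"north": row == 0, "east": col == n - 1,
            "south": row == n - 1, "west": col == 0}

rng = np.random.default_rng(0)
room = make_room(rng=rng)
# The agent starts on a random square
start = tuple(int(v) for v in rng.integers(0, 5, size=2))
print(room.astype(int))
print(start, get_bumpers(start))
```

Storing the room as a boolean array keeps the "is everything clean?" check to a single `room.any()` call.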


## The agent program for a simple randomized agent

The agent program is a function that gets sensor information (the current percepts) as the arguments. The arguments are:

* `bumpers`: a dictionary with boolean entries for the four bumper sensors `north`, `east`, `south`, `west`. E.g., if the agent is in the north-west corner, `bumpers` will be `{"north" : True, "east" : False, "south" : False, "west" : True}`.
* `dirty`: a boolean produced by the dirt sensor.

The agent returns the chosen action as a string.

Here is an example implementation for the agent program of a simple randomized agent:  

In [177]:
import numpy as np

actions = ["north", "east", "west", "south", "suck"]

def simple_randomized_agent(bumpers, dirty):
    return np.random.choice(actions)

In [178]:
# define percepts (current location is NW corner and it is dirty)
bumpers = {"north" : True, "east" : False, "south" : False, "west" : True}
dirty = True

# call agent program function with percepts and it returns an action
simple_randomized_agent(bumpers, dirty)

'west'

__Note:__ This is not a rational intelligent agent. It ignores its sensors and may bump into a wall repeatedly or not clean a dirty square. You will be asked to implement rational agents below.

## Simple environment example

We implement a simple simulation environment that supplies the agent with its percepts.
The simple environment is infinite in size (bumpers are always `False`) and every square is always dirty, even if the agent cleans it. The environment function returns a performance measure which is here the number of cleaned squares (since the room is infinite and all squares are constantly dirty, the agent can never clean the whole room as required in the PEAS description above). The energy budget of the agent is specified as `max_steps`.

In [179]:
def simple_environment(agent, max_steps, verbose = True):
    num_cleaned = 0

    for i in range(max_steps):
        dirty = True
        bumpers = {"north" : False, "south" : False, "west" : False, "east" : False}

        action = agent(bumpers, dirty)
        if verbose: print("step", i, "- action:", action)

        if action == "suck":
            num_cleaned = num_cleaned + 1

    return num_cleaned



Do one simulation run with a simple randomized agent that has enough energy for 20 steps.

In [180]:
simple_environment(simple_randomized_agent, max_steps = 20)

step 0 - action: east
step 1 - action: south
step 2 - action: south
step 3 - action: suck
step 4 - action: north
step 5 - action: north
step 6 - action: west
step 7 - action: north
step 8 - action: north
step 9 - action: south
step 10 - action: east
step 11 - action: south
step 12 - action: south
step 13 - action: east
step 14 - action: south
step 15 - action: south
step 16 - action: suck
step 17 - action: suck
step 18 - action: north
step 19 - action: west


3

# Tasks


1. Make sure that you use the latest version of this notebook. Sync your forked repository and pull the latest revision.
2. Your implementation can use libraries like math, numpy, scipy, but not libraries that implement intelligent agents or complete search algorithms. Try to keep the code simple! In this course, we want to learn about the algorithms and we often do not need to use object-oriented design.
3. Your notebook needs to be formatted professionally.
    - Add markdown blocks for your descriptions, comment your code, add tables, and use matplotlib to produce charts where appropriate.
    - Do not show debugging output or include an excessive amount of output.
    - Check that your PDF file is readable. For example, long lines are cut off in the PDF file. You don't have control over page breaks, so do not worry about these.
4. Document your code. Add a short discussion of how your implementation works and your design choices.


## Task 1: Implement a simulation environment

The simple environment above is not very realistic. Your environment simulator needs to follow the PEAS description from above. It needs to:

* Initialize the environment by storing the state of each square (clean/dirty) and making some dirty.
* Keep track of the agent's position.
* Call the agent function repeatedly and provide the agent function with the sensor inputs.  
* React to the agent's actions, e.g., by removing dirt from a square or moving the agent around unless there is a wall in the way.
* Keep track of the performance measure. That is, track the agent's actions until all dirty squares are clean and count the number of actions it takes the agent to complete the task.

The easiest implementation for the environment is to hold a 2-dimensional array that records whether each square is clean or dirty and to call the agent function in a loop until all squares are clean or a predefined number of steps has been reached (i.e., the robot runs out of energy).

The simulation environment should be a function like the `simple_environment()` and needs to work with the simple randomized agent program from above. **Use the same environment for all your agent implementations in the tasks below.**
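
The requirements above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution: the name `environment_sketch`, its parameters, and the `MOVES` table are hypothetical, NumPy is assumed, and row 0 is treated as the north wall.

```python
import numpy as np

ACTIONS = ["north", "east", "west", "south", "suck"]
# (row, column) offsets for each movement action; row 0 is assumed to be north
MOVES = {"north": (-1, 0), "east": (0, 1), "south": (1, 0), "west": (0, -1)}

def environment_sketch(agent, n=5, p=0.2, max_steps=10000, verbose=False, seed=None):
    """Run one cleaning phase and return the number of actions (energy units) used."""
    rng = np.random.default_rng(seed)
    room = rng.random((n, n)) < p                        # True = dirty
    row, col = (int(v) for v in rng.integers(0, n, size=2))  # random start square

    for step in range(max_steps):
        if not room.any():
            return step                                  # room is clean after `step` actions
        bumpers = {"north": row == 0, "east": col == n - 1,
                   "south": row == n - 1, "west": col == 0}
        action = agent(bumpers, bool(room[row, col]))
        if verbose:
            print("step", step, "- position:", (row, col), "- action:", action)
        if action == "suck":
            room[row, col] = False
        elif action in MOVES and not bumpers[action]:
            d_row, d_col = MOVES[action]
            row, col = row + d_row, col + d_col
        # moving into a wall changes nothing but still costs one energy unit
    return max_steps                                     # energy budget exhausted

def random_agent(bumpers, dirty):
    """Same behavior as the simple randomized agent: ignores its percepts."""
    return np.random.choice(ACTIONS)

print(environment_sketch(random_agent, seed=1))
```

The agent only ever sees `bumpers` and `dirty`; the room array and the position stay inside the environment function, which keeps the separation between agent and environment that the tasks ask for.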

*Note on debugging:* Debugging is difficult. Make sure your environment prints enough information when you use `verbose = True`. It is also very useful to implement a function that the environment can call at every step to display the room with its dirt and the current position of the robot.
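
A display helper of the kind suggested in the note might look like the sketch below. The function name `show_room` and the symbols it uses are assumptions chosen for illustration; it accepts any 2-dimensional structure whose entries are truthy for dirty squares.

```python
def show_room(room, agent_pos):
    """Return a text picture of the room: 'A' = agent, '*' = dirt, '.' = clean."""
    lines = []
    for r, row in enumerate(room):
        cells = []
        for c, dirty in enumerate(row):
            if (r, c) == agent_pos:
                cells.append("A")        # the agent marker hides dirt on its own square
            else:
                cells.append("*" if dirty else ".")
        lines.append(" ".join(cells))
    return "\n".join(lines)

room = [[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]]
print(show_room(room, (0, 1)))
```

Returning a string instead of printing directly lets the environment decide whether to show it, for example only when `verbose = True`.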

In [181]:
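import random

def create_5x5_matrix(p=0.2):
    # Randomly mark each square dirty (1) with probability p, otherwise clean (0);
    # p = 0.2 follows the PEAS description above
    matrix = [[1 if random.random() < p else 0 for _ in range(5)] for _ in range(5)]
    return matrix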

# Creating a 5x5 matrix of '0 and 1' using [][] notation
five_by_five_matrix = create_5x5_matrix()

# Displaying the 5x5 matrix using [][] notation
for i in range(5):
    for j in range(5):
        print(five_by_five_matrix[i][j], end=' ')
    print()


1 0 0 0 1 
0 1 0 0 1 
0 0 0 1 1 
0 0 1 0 1 
1 0 0 0 1 


In [182]:
# agent position
agent_x_pos, agent_y_pos = 0,0

In [183]:
dirty_squares = sum(row.count(1) for row in five_by_five_matrix)  # Counting initial dirty squares
actions_taken = 0  # Initializing actions counter
dirty_squares

10

In [184]:
# Function to check if all squares are clean
def all_clean(matrix):
    return all(all(square == 0 for square in row) for row in matrix)


In [185]:
# Combined environment and agent: the robot sweeps the room in an 'S' pattern
def simple_environment1(matrix, max_steps):
    x = y = 0      # agent position (row, column), starting in the top-left corner
    steps = 0      # energy used so far: every action (suck or move) costs 1 unit
    direction = 1
    # Print a header before the action log
    print("\nCurrent Matrix:")

    # Perform actions until all squares are clean
    while not all_clean(matrix):
        if steps >= max_steps:
            print("Max energy consumed. Out of energy.")
            break

        # Clean the current square if it is dirty
        if matrix[x][y] == 1:
            matrix[x][y] = 0
            steps += 1
            print(f"Cleaning square at position ({x}, {y})")
            continue  # re-check the stopping conditions before moving

        if x == 4 and y == 4:  # Break loop when reaching the last square
            break
        # Move to a new position in an 'S' shape pattern
        if x % 2 == 0:  # Moving right on even rows
            if 0 <= y + direction < 5:  # Check boundary condition
                y += direction
                steps += 1
                print(f"Moving right to position ({x}, {y})")
            else:
                x += 1
                steps += 1
                print(f"Moving down to position ({x}, {y})")
        else:  # Moving left on odd rows
            if 0 <= y - direction < 5:  # Check boundary condition
                y -= direction
                steps += 1
                print(f"Moving left to position ({x}, {y})")
            else:
                x += 1
                steps += 1
                print(f"Moving down to position ({x}, {y})")

    if all_clean(matrix):
        print("\nAll squares cleaned!")
    # Performance measure: total energy used (moves plus suck actions)
    print(f"Total actions taken: {steps}")
    return steps



In [186]:
# Read the energy budget; each action costs 1 unit of energy
max_steps=int(input("Enter the max energy of the device : "))

In [187]:
# Call the agent function
simple_environment1(five_by_five_matrix,max_steps)


Current Matrix:
Cleaning square at position (0, 0)
Moving right to position (0, 1)
Moving right to position (0, 2)
Moving right to position (0, 3)
Moving right to position (0, 4)
Cleaning square at position (0, 4)
Moving down to position (1, 4)
Cleaning square at position (1, 4)
Moving left to position (1, 3)
Moving left to position (1, 2)
Moving left to position (1, 1)
Cleaning square at position (1, 1)
Moving left to position (1, 0)
Moving down to position (2, 0)
Moving right to position (2, 1)
Moving right to position (2, 2)
Moving right to position (2, 3)
Cleaning square at position (2, 3)
Moving right to position (2, 4)
Cleaning square at position (2, 4)
Moving down to position (3, 4)
Cleaning square at position (3, 4)
Moving left to position (3, 3)
Moving left to position (3, 2)
Cleaning square at position (3, 2)
Moving left to position (3, 1)
Moving left to position (3, 0)
Moving down to position (4, 0)
Cleaning square at position (4, 0)
Moving right to position (4, 1)
Moving r



The simple reflex agent walks around randomly but reacts to its sensors: it uses the bumper sensors to avoid bumping into walls and it sucks whenever the dirt sensor reports dirt. Implement the agent program as a function.

_Note:_ Agents cannot directly use variables in the environment. They only get the percepts as the arguments to the agent function.
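
One way such an agent program could look is sketched below. It is only an illustration of the behavior described above, assuming the same `bumpers` and `dirty` arguments as the simple randomized agent; the function name `simple_reflex_agent` is a placeholder.

```python
import random

def simple_reflex_agent(bumpers, dirty):
    """Suck if the square is dirty; otherwise move randomly in a direction without a wall."""
    if dirty:
        return "suck"
    free = [d for d in ("north", "east", "south", "west") if not bumpers[d]]
    # In a 1 x 1 room every bumper is active; sucking is then a harmless fallback
    return random.choice(free) if free else "suck"

# Percepts for the north-west corner
bumpers = {"north": True, "east": False, "south": False, "west": True}
print(simple_reflex_agent(bumpers, dirty=True))   # dirt always triggers "suck"
print(simple_reflex_agent(bumpers, dirty=False))  # either "east" or "south"
```

The function reads only its two arguments, so it respects the rule that agents cannot access environment variables directly.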