# Intelligent Agents: Reflex-Based Agents for the Vacuum-cleaner World

Student Name: [Add your name]

I have used the following AI tools: [list tools]

I understand that my submission needs to be my own work: [your initials]

## Learning Outcomes

* Design and build a simulation environment that models sensor inputs, actuator effects, and performance measurement.
* Apply core AI concepts by implementing the agent function for a simple and model-based reflex agents that respond to environmental percepts.
* Practice how the environment and the agent function interact.
* Analyze agent performance through controlled experiments across different environment configurations.
* Graduate Students: Develop strategies for handling uncertainty and imperfect information in autonomous agent systems.

## Instructions

Total Points: Undergrads 98 + 5 bonus / Graduate students 110

Complete this notebook. Use the provided notebook cells and insert additional code and markdown cells as needed. Submit the completely rendered notebook as a HTML file. 

### AI Use

Here are some guidelines that will make it easier for you:

* __Don't:__ Rely on AI auto completion. You will waste a lot of time trying to figure out how the suggested code relates to what we do in class. Turn off AI code completion (e.g., Copilot) in your IDE.
* __Don't:__ Do not submit code/text that you do not understand or have not checked to make sure that it is complete and correct.
* __Do:__ Use AI for debugging and letting it explain code and concepts from class.

### Using Visual Studio Code

If you use VS code then you can use `Export` (click on `...` in the menu bar) to save your notebook as a HTML file. Note that you have to run all blocks before so the HTML file contains your output.

### Using Google Colab

In Colab you need to save the notebook on GoogleDrive to work with it. For this you need to mount your google dive and change to the correct directory by uncommenting the following lines and running the code block.

In [163]:
# from google.colab import drive
# import os
#
# drive.mount('/content/drive')
# os.chdir('/content/drive/My Drive/Colab Notebooks/')

Once you are done with the assignment and have run all code blocks using `Runtime/Run all`, you can convert the file on your GoogleDrive into HTML be uncommenting the following line and running the block.

In [164]:
# %jupyter nbconvert --to html Copy\ of\ robot_vacuum.ipynb

You may have to fix the file location or the file name to match how it looks on your GoogleDrive. You can navigate in Colab to your GoogleDrive using the little folder symbol in the navigation bar to the left.

## Introduction

In this assignment you will implement a simulator environment for an automatic vacuum cleaner robot, a set of different reflex-based agent programs, and perform a comparison study for cleaning a single room. Focus on the __cleaning phase__ which starts when the robot is activated and ends when the last dirty square in the room has been cleaned. Someone else will take care of the agent program needed to navigate back to the charging station after the room is clean.

## PEAS description of the cleaning phase

__Performance Measure:__ Each action costs 1 energy unit. The performance is measured as the sum of the energy units used to clean the whole room.

__Environment:__ A room with $n \times n$ squares where $n = 5$. Dirt is randomly placed on each square with probability $p = 0.2$. For simplicity, you can assume that the agent knows the size and the layout of the room (i.e., it knows $n$). To start, the agent is placed on a random square.

__Actuators:__ The agent can clean the current square (action `suck`) or move to an adjacent square by going `north`, `east`, `south`, or `west`.

__Sensors:__ Four bumper sensors, one for north, east, south, and west; a dirt sensor reporting dirt in the current square.  


## The agent program for a simple randomized agent

The agent program is a function that gets sensor information (the current percepts) as the arguments. The arguments are:

* A dictionary with boolean entries for the for bumper sensors `north`, `east`, `west`, `south`. E.g., if the agent is on the north-west corner, `bumpers` will be `{"north" : True, "east" : False, "south" : False, "west" : True}`.
* The dirt sensor produces a boolean.

The agent returns the chosen action as a string.

Here is an example implementation for the agent program of a simple randomized agent:  

In [165]:
# make sure numpy is installed
%pip install -q numpy

Note: you may need to restart the kernel to use updated packages.


In [166]:
import numpy as np

actions = ["north", "east", "west", "south", "suck"]

def simple_randomized_agent(bumpers, dirty):
    return np.random.choice(actions)

In [167]:
# define percepts (current location is NW corner and it is dirty)
bumpers = {"north" : True, "east" : False, "south" : False, "west" : True}
dirty = True

# call agent program function with percepts and it returns an action
simple_randomized_agent(bumpers, dirty)

'east'

__Note:__ This is not a rational intelligent agent. It ignores its sensors and may bump into a wall repeatedly or not clean a dirty square. You will be asked to implement rational agents below.

## Simple environment example

We implement a simple simulation environment that supplies the agent with its percepts.
The simple environment is infinite in size (bumpers are always `False`) and every square is always dirty, even if the agent cleans it. The environment function returns a different performance measure than the one specified in the PEAS description! Since the room is infinite and all squares are constantly dirty, the agent can never clean the whole room. Your implementation needs to implement the **correct performance measure.** The energy budget of the agent is specified as `max_steps`. 

In [168]:
def simple_environment(agent_function, max_steps, verbose = True):
    num_cleaned = 0
    
    for i in range(max_steps):
        dirty = True
        bumpers = {"north" : False, "south" : False, "west" : False, "east" : False}

        action = agent_function(bumpers, dirty)
        if (verbose): print("step", i , "- action:", action) 
        
        if (action == "suck"): 
            num_cleaned = num_cleaned + 1
        
    return num_cleaned
        


Do one simulation run with a simple randomized agent that has enough energy for 20 steps.

In [169]:
simple_environment(simple_randomized_agent, max_steps = 20)

step 0 - action: west
step 1 - action: south
step 2 - action: east
step 3 - action: north
step 4 - action: east
step 5 - action: north
step 6 - action: suck
step 7 - action: north
step 8 - action: north
step 9 - action: east
step 10 - action: east
step 11 - action: suck
step 12 - action: suck
step 13 - action: south
step 14 - action: suck
step 15 - action: east
step 16 - action: suck
step 17 - action: south
step 18 - action: south
step 19 - action: north


5

# Tasks

## General [10 Points]

1. Make sure that you use the latest version of this notebook. 
2. Your implementation can use libraries like math, numpy, scipy, but not libraries that implement intelligent agents or complete search algorithms. Try to keep the code simple! In this course, we want to learn about the algorithms and we often do not need to use object-oriented design.
3. You notebook needs to be formatted professionally. 
    - Add additional markdown blocks for your description, comments in the code, add tables and use mathplotlib to produce charts where appropriate
    - Do not show debugging output or include an excessive amount of output.
    - Check that your submitted file is readable and contains all figures.
4. Document your code. Use comments in the code and add a discussion of how your implementation works and your design choices.


## Task 1: Implement a simulation environment [20 Points]

The simple environment above is not very realistic. Your environment simulator needs to follow the PEAS description from above. It needs to:

* Initialize the environment by storing the state of each square (clean/dirty) and making some dirty. ([Help with random numbers and arrays in Python](https://github.com/mhahsler/CS7320-AI/blob/master/HOWTOs/random_numbers_and_arrays.ipynb))
* Keep track of the agent's position.
* Call the agent function repeatedly and provide the agent function with the sensor inputs.  
* React to the agent's actions. E.g, by removing dirt from a square or moving the agent around unless there is a wall in the way.
* Keep track of the performance measure. That is, track the agent's actions until all dirty squares are clean and count the number of actions it takes the agent to complete the task.

The easiest implementation for the environment is to hold an 2-dimensional array to represent if squares are clean or dirty and to call the agent function in a loop until all squares are clean or a predefined number of steps have been reached (i.e., the robot runs out of energy).

The simulation environment should be a function like the `simple_environment()` and needs to work with the simple randomized agent program from above. **Use the same environment for all your agent implementations in the tasks below.**

*Note on debugging:* Debugging is difficult. Make sure your environment prints enough information when you use `verbose = True`. Also, implementing a function that the environment can use to displays the room with dirt and the current position of the robot at every step is very useful.  

In [170]:
# Your code and description goes here
import numpy as np


def setup_room(n, p):
    """
    Args:
        p: probability of each square being dirty
        n: side length of square
    
    Return:
        environment: n x n numpy array
    """
    room = np.zeros((n, n))
    k = round(p * n * n)
    idx = np.random.choice(n*n, k, replace=False)
    rows, cols = np.unravel_index(idx, (n, n))
    room[rows, cols] = 1.0
    return room

def real_environment(agent_function,
                     max_steps,
                     n = 5,
                     p= 0.2,
                     verbose = True):
    
    # Khởi đầu vị trí ngẫu nhiên
    x = np.random.randint(0, n)
    y = np.random.randint(0, n)

    room = setup_room(n, p)
    if verbose == True: 
        print(room)
        print(f"- Current position:[{x}][{y}]")

    need_to_clean = 0
    for i in range(n):
        for j in range(n):
            if room[i, j] == 1:
                need_to_clean += 1

    need_to_clean_cp= need_to_clean
    num_cleaned = 0
    steps = 0
    
    while steps < max_steps and need_to_clean > 0:
        if room[x][y] == 1:
            dirty = True
        else:
            dirty = False

        bumpers = {"north" : False, "south" : False, "west" : False, "east" : False}
        if x == 0 :
            bumpers["north"] = True
        if y == 0 :
            bumpers["west"] = True
        if x == n-1 :
            bumpers["south"] = True
        if y == n-1 :
            bumpers["east"] = True
    
        action = agent_function(bumpers, dirty)
        
        if (verbose): print("step", steps , "- action:", action)

        feedback = ""
        if action == "suck":
            num_cleaned = num_cleaned + 1
            if room[x][y] == 1:
                need_to_clean -= 1
                room[x][y] = 0
                if verbose: print("really sucks up dirt")

            steps+=1    
        
        elif action == "north":
            if x>0:
                x -= 1
                steps+=1
            else:
                feedback = "Hit the North wall (no movement) --> change direction"

        elif action == "east":
            if y<n-1:
                y += 1
                steps+=1
            else:
                feedback = "Hit the East wall (no movement) --> change direction"

        elif action == "south":
            if x<n-1:
                x += 1
                steps+=1
            else:
                feedback = "Hit the South wall (no movement) --> change direction"

        elif action == "west":
            if y>0:
                y -= 1
                steps+=1
            else:
                feedback = "Hit the West wall (no movement) --> change direction"
        elif action == "stand still":
            feedback = "All roads are blocked --> Stand still"
            steps+=1
        
        if verbose: 
            print(f"- Current position:[{x}][{y}]")
        if feedback !="" and verbose:
            print("Feedback", feedback)
            print()

    Actual_amount_of_dirt_vacuumed = need_to_clean_cp-need_to_clean
    if verbose:
        print()
        print("Amount of dirt:", need_to_clean_cp)
        print("Actual amount of dirt vacuumed: ", Actual_amount_of_dirt_vacuumed)
    return steps, need_to_clean_cp, Actual_amount_of_dirt_vacuumed
    
    

Show that your environment works with the simple randomized agent from above.

In [171]:
# Your code and description goes here
real_environment(agent_function=simple_randomized_agent, 
                 max_steps = 20,
                 n = 5, 
                 p=0.2,
                 verbose=True)

[[0. 1. 0. 1. 0.]
 [0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]]
- Current position:[2][3]
step 0 - action: west
- Current position:[2][2]
step 1 - action: north
- Current position:[1][2]
step 2 - action: south
- Current position:[2][2]
step 3 - action: south
- Current position:[3][2]
step 4 - action: east
- Current position:[3][3]
step 5 - action: west
- Current position:[3][2]
step 6 - action: east
- Current position:[3][3]
step 7 - action: north
- Current position:[2][3]
step 8 - action: north
- Current position:[1][3]
step 9 - action: south
- Current position:[2][3]
step 10 - action: east
- Current position:[2][4]
step 11 - action: west
- Current position:[2][3]
step 12 - action: east
- Current position:[2][4]
step 13 - action: north
- Current position:[1][4]
step 14 - action: suck
- Current position:[1][4]
step 15 - action: north
- Current position:[0][4]
step 16 - action: north
- Current position:[0][4]
Feedback Hit the North wall (no movement) --> chang

(20, 5, 0)

## Task 2:  Implement a simple reflex agent [10 Points] 

The simple reflex agent randomly walks around but reacts to the bumper sensor by not bumping into the wall and to dirt with sucking. Implement the agent program as a function.

_Note:_ Agents cannot directly use variable in the environment. They only gets the percepts as the arguments to the agent function. Use the function signature for the `simple_randomized_agent` function above.

In [172]:
# Your code and description goes here
def simple_reflex_agent(bumpers, dirty):
    moves = ["north", "east", "south", "west"]
    if dirty == True:
        return "suck"
    if bumpers["north"] == True:
        moves.remove("north")
    if bumpers["east"] == True:
        moves.remove("east")
    if bumpers["south"] == True:
        moves.remove("south")
    if bumpers["west"] == True:
        moves.remove("west")
    
    try:
        action = np.random.choice(moves)
    except ValueError:
        action = "stand still"
    
    return action

Show how the agent works with your environment.

In [173]:
# Your code and description goes here
real_environment(agent_function=simple_reflex_agent,
                 max_steps = 50,
                 n = 5,
                 p = 0.2,
                 verbose=True)

[[0. 1. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0.]]
- Current position:[2][0]
step 0 - action: south
- Current position:[3][0]
step 1 - action: north
- Current position:[2][0]
step 2 - action: north
- Current position:[1][0]
step 3 - action: south
- Current position:[2][0]
step 4 - action: south
- Current position:[3][0]
step 5 - action: south
- Current position:[4][0]
step 6 - action: suck
really sucks up dirt
- Current position:[4][0]
step 7 - action: east
- Current position:[4][1]
step 8 - action: north
- Current position:[3][1]
step 9 - action: north
- Current position:[2][1]
step 10 - action: suck
really sucks up dirt
- Current position:[2][1]
step 11 - action: west
- Current position:[2][0]
step 12 - action: north
- Current position:[1][0]
step 13 - action: south
- Current position:[2][0]
step 14 - action: north
- Current position:[1][0]
step 15 - action: south
- Current position:[2][0]
step 16 - action: south
- Current position:[3][0]
step 

(50, 5, 3)

## Task 3: Implement a model-based reflex agent [20 Points]

Model-based agents use a state to keep track of what they have done and perceived so far. Your agent needs to find out where it is located and then keep track of its current location. You also need a set of rules based on the state and the percepts to make sure that the agent will clean the whole room. For example, the agent can move to a corner to determine its location and then it can navigate through the whole room and clean dirty squares.

Describe how you define the __agent state__ and how your agent works before implementing it. ([Help with implementing state information on Python](https://github.com/mhahsler/CS7320-AI/blob/master/HOWTOs/store_agent_state_information.ipynb))

In [174]:
# Your short description of the state and your implementation goes here
"""
Di chuyển về góc rồi di di chuyển ngang dọc để hút bụi
"""

'\nDi chuyển về góc rồi di di chuyển ngang dọc để hút bụi\n'

In [175]:
# Model-based Reflex Agent
class Agent:
    def __init__(self,
                 max_steps=20, 
                 room = setup_room(5, 0.2),
                 station_location=[0, 0], 
                 verbose=True,
                 percent_error = 0.0):
        self.state = 0
        self.n = len(room)   # cạnh hình vuông
        self.actions_can_choice = ["north", "east", "south", "west", "suck" , "stand still"]
        self.position_x = 0
        self.position_y = 0

        self.steps = 0
        self.max_steps = max_steps
        
        self.station_location = station_location
        self.room = room

        self.need_to_clean = self.fc_need_to_clean()
        self.actual_dirt_vacuumed = self.need_to_clean
        
        self.verbose = verbose
        self.percent_error = percent_error
    
    def fc_need_to_clean(self):
        need_to_clean =0
        for i in range(self.n):
            for j in range(self.n):
                if room[i, j] == 1:
                    need_to_clean += 1
        return need_to_clean
    
    def update_agent_location(self, x, y):
        self.position_x = x
        self.position_y = y
    
    def moving (self, direction):
        feedback = ""
        if direction == "north":
            if self.position_x > 0:
                self.position_x -= 1
                self.steps += 1
            else:
                feedback = "Hit the North wall (no movement) --> change direction"

        elif direction == "east":
            if self.position_y < self.n - 1:
                self.position_y += 1
                self.steps += 1
            else:
                feedback = "Hit the East wall (no movement) --> change direction"
        
        elif direction == "south":
            if self.position_x < self.n-1:
                self.position_x += 1
                self.steps += 1
            else:
                feedback = "Hit the South wall (no movement) --> change direction"
            
        elif direction == "west":
            if self.position_y > 0:
                self.position_y -= 1
                self.steps += 1
            else:
                feedback = "Hit the West wall (no movement) --> change direction"
        elif direction == "stand still":
            feedback = "All roads are blocked --> Stand still"
            self.steps += 1
        if self.verbose: 
            print(f"- Current position:[{self.position_x}][{self.position_y}]")
        if feedback!="" and self.verbose:
            print(feedback, "\n")
    
    
    def go_to_the_station(self):
        # simple: Tìm đường đi đến góc [0][0]
        if self.verbose: 
            print("--> Go to station location")
        while self.position_x>0 :
            self.position_x -= 1
            self.steps +=1
        while self.position_y>0 :
            self.steps +=1
            self.position_y -= 1

        if self.verbose: 
            print(f"- Current position:[{self.position_x}][{self.position_y}]")
            
    
    def action_suck(self):
        self.steps += 1
        if room[self.position_x, self.position_y] == 1.:
            self.actual_dirt_vacuumed -= 1
        self.room[self.position_x, self.position_y] = 0
        if self.verbose: 
            print(f"--> SUCK")
    # def change_direction(self, direction):
    def check_dirt(self, i, j):
        k = np.random.randint(10)
        if k == 1:
            return self.room[i][j]
        else:
            return np.abs(self.room[i][j] - 1)
            
    def whole_house_vacuuming(self):
        # self.steps = 0
        if self.verbose: 
            print(f"- Current position:[{self.position_x}][{self.position_y}]")
        if self.position_x == 0 and self.position_y == 0:
            if self.verbose: 
                print("Already at the charging station")
        else:
            if self.verbose: 
                print("Not at the charging station yet.")
            self.go_to_the_station()
        if self.verbose: 
            print("==> Start vacuuming the whole room.")
        
        for i in range(self.n-1):
            for j in range(self.n-1):
                if self.percent_error != 0.0:
                    if self.check_dirt(self.position_x, self.position_y):
                        self.action_suck()
                elif self.room[self.position_x][self.position_y] == 1.:
                    self.action_suck()
                if i % 2 == 0:
                    if self.verbose: 
                        print("move east")
                    self.moving("east")
                else:
                    if self.verbose: 
                        print("move west")
                    self.moving("west")
            
            if i!=self.n-1:
                if self.verbose: 
                    print("move south")
                self.moving("south")
            
            if self.steps >= self.max_steps:
                if self.verbose:
                    print("Out of energy --> stop vacuuming")
                break

        if self.verbose: 
            print("==> End vacuuming the whole room.")
        if self.verbose: 
            print("The performance measure: ", self.steps)
    

In [176]:
n = 100
p = 0.2
room = setup_room(n, p)
print(room)

[[0. 0. 0. ... 0. 0. 1.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 1. ... 0. 0. 0.]
 [0. 1. 0. ... 0. 0. 0.]
 [1. 0. 0. ... 0. 0. 1.]]


Show how the agent works with your environment.

In [177]:
# Your code goes here

agent_1 = Agent(max_steps=20,
                room = room,
                station_location=[0,0],
                verbose=True)

agent_1.update_agent_location(x=4, y=2)
agent_1.whole_house_vacuuming()

- Current position:[4][2]
Not at the charging station yet.
--> Go to station location
- Current position:[0][0]
==> Start vacuuming the whole room.
move east
- Current position:[0][1]
move east
- Current position:[0][2]
move east
- Current position:[0][3]
move east
- Current position:[0][4]
move east
- Current position:[0][5]
move east
- Current position:[0][6]
move east
- Current position:[0][7]
move east
- Current position:[0][8]
--> SUCK
move east
- Current position:[0][9]
move east
- Current position:[0][10]
move east
- Current position:[0][11]
move east
- Current position:[0][12]
move east
- Current position:[0][13]
move east
- Current position:[0][14]
move east
- Current position:[0][15]
move east
- Current position:[0][16]
--> SUCK
move east
- Current position:[0][17]
move east
- Current position:[0][18]
move east
- Current position:[0][19]
move east
- Current position:[0][20]
--> SUCK
move east
- Current position:[0][21]
move east
- Current position:[0][22]
move east
- Current 

## Task 4: Simulation study [30 Points]

Compare the performance (the performance measure is defined in the PEAS description above) of the agents using  environments of different size. Do at least $5 \times 5$, $10 \times 10$ and
$100 \times 100$. Use 100 random runs for each. Present the results using tables and graphs. Discuss the differences between the agents. 
([Help with charts and tables in Python](https://github.com/mhahsler/CS7320-AI/blob/master/HOWTOs/charts_and_tables.ipynb))

In [186]:
# Your code goes here
# Lấy max_steps cho Randomized Agent | Simple Reflex Agent là: 200
def Compare_the_performance_1(n=5, p=0.2, max_steps=200, iter=100, verbose=False):
    avg_p_random = 0
    avg_p_simple = 0
    avg_p_reflex = 0
    avg_need_to_clean_1, avg_actual_dirt_vacuumed_1 = 1, 1
    avg_need_to_clean_2, avg_actual_dirt_vacuumed_2 = 1, 1
    avg_need_to_clean_3, avg_actual_dirt_vacuumed_3 = 1, 1
    for i in range(iter):
        p_random, need_to_clean_1, actual_dirt_vacuumed_1 = real_environment(agent_function=simple_randomized_agent,
                                                                             max_steps = max_steps,
                                                                             n = n,
                                                                             p=p,
                                                                             verbose=verbose)
        avg_p_random+=p_random
        avg_need_to_clean_1 +=need_to_clean_1
        avg_actual_dirt_vacuumed_1+= actual_dirt_vacuumed_1
        #####################################################################
        p_simple, need_to_clean_2, actual_dirt_vacuumed_2 = real_environment(agent_function=simple_reflex_agent,
                                                                             max_steps = max_steps,
                                                                             n = n,
                                                                             p = p,
                                                                             verbose=verbose)
        avg_p_simple+=p_simple
        avg_need_to_clean_2 += need_to_clean_2
        avg_actual_dirt_vacuumed_2 += actual_dirt_vacuumed_2
        #####################################################################
        room = setup_room(n, p)
        agent_1 = Agent(max_steps=max_steps,
                        room = room,
                        station_location=[0, 0],
                        verbose=verbose)

        agent_1.update_agent_location(x=np.random.randint(0, n), y=np.random.randint(0, n))
        agent_1.whole_house_vacuuming()
        avg_p_reflex+=agent_1.steps
        avg_need_to_clean_3 += agent_1.need_to_clean
        avg_actual_dirt_vacuumed_3 += agent_1.actual_dirt_vacuumed

    avg_p_random = avg_p_random/iter
    avg_p_simple = avg_p_simple/iter
    avg_p_reflex = avg_p_reflex/iter

    avg_need_to_clean_1 = avg_need_to_clean_1/iter
    avg_actual_dirt_vacuumed_1 = avg_actual_dirt_vacuumed_1/iter
    avg_need_to_clean_2 = avg_need_to_clean_2/iter
    avg_actual_dirt_vacuumed_2 = avg_actual_dirt_vacuumed_2/iter
    avg_need_to_clean_3 = avg_need_to_clean_3/iter
    avg_actual_dirt_vacuumed_3 = avg_actual_dirt_vacuumed_3/iter


    print(f"Random: {avg_p_random} || Cleaning progress:{avg_actual_dirt_vacuumed_1/avg_need_to_clean_1 * 100} %")
    print(f"Simple: {avg_p_simple} || Cleaning progress:{avg_actual_dirt_vacuumed_2/avg_need_to_clean_2 *100} %")
    print(f"Reflex: {avg_p_reflex}|| Cleaning progress:{avg_actual_dirt_vacuumed_3/avg_need_to_clean_3 *100} %")
    # print(avg_need_to_clean_3)
    # print(avg_actual_dirt_vacuumed_3)



In [187]:
print("N = 5")
Compare_the_performance_1(n=5, p=0.2, max_steps=200, iter=1, verbose=False)
print("-----------------------------------------------------")

print("N = 10")
Compare_the_performance_1(n=10, p=0.2, max_steps=200, iter=1, verbose=False)
print("-----------------------------------------------------")

print("N = 100")
Compare_the_performance_1(n=100, p=0.2, max_steps=2000, iter=1, verbose=False)

N = 5
Random: 200.0 || Cleaning progress:83.33333333333334 %
Simple: 126.0 || Cleaning progress:100.0 %
Reflex: 27.0|| Cleaning progress:100.0 %
-----------------------------------------------------
N = 10
Random: 200.0 || Cleaning progress:47.61904761904761 %
Simple: 200.0 || Cleaning progress:61.904761904761905 %
Reflex: 118.0|| Cleaning progress:100.0 %
-----------------------------------------------------
N = 100
Random: 2000.0 || Cleaning progress:2.798600699650175 %
Simple: 2000.0 || Cleaning progress:5.297351324337831 %
Reflex: 2032.0|| Cleaning progress:97.07661290322581 %


Fill out the following table with the average performance measure for 100 random runs (you may also create this table with code):

| Size              | Randomized Agent | Simple Reflex Agent | Model-based Reflex Agent |
|-------------------|------------------|---------------------|--------------------------|
| 5x5               | 200              | 126                 | 27                       |
| 10x10             | 200              | 200                 | 118                      |
| 100x100           | 2000             | 2000                | 2032                     |

----------------
Mức độ làm sạch căn phòng (có giới hạn max_step)
----------------
| Size              | Randomized Agent | Simple Reflex Agent | Model-based Reflex Agent |
|-------------------|------------------|---------------------|--------------------------|
| 5x5               | 83.33 %          | 100 %               | 100 %                    |
| 10x10             | 47.61 %          | 61.90 %             | 100 %                    |
| 100x100           | 2.79 %           | 5.29 %              | 97.07 %                  |


Add charts to compare the performance of the different agents.

In [180]:
# Your graphs and discussion of the results goes herez

## Task 5: Robustness of the agent implementations [10 Points] 

Describe how **your agent implementations** will perform 

* if it is put into a rectangular room with unknown size, 
* if the cleaning area can have an irregular shape (e.g., a hallway connecting two rooms), or 
* if the room contains obstacles (i.e., squares that it cannot pass through and trigger the bumper sensors).
* if the dirt sensor is not perfect and gives 10% of the time a wrong reading (clean when it is dirty or dirty when it is clean).
* if the bumper sensor is not perfect and 10% of the time does not report a wall when there is one.

In [181]:
# Answers goes here
"""
1. Nếu agent được đặt vào căn phòng hình chữ nhật không biết kích thước:
--> Tìm đến góc rồi di chuyển ngang dọc để hút bụi

2. Nếu khu vực vệ sinh có hình dáng bất thường:
--> Cách tốt nhất: có thêm cảm biến quét kích thước từng căn phòng, khi đó có thể dọn dẹp từng căn phòng và hành lang.

3. Nếu có vật cản
--> Men theo vật cản n bước, sau đó đi ngược lại n bước

4. Nếu cảm biến bụi bẩn không hoàn hảo và cho kết quả sai trong 10% trường hợp (sạch khi bẩn hoặc bẩn khi sạch).
--> Thêm 1 bước kiểm tra lại căn phòng sau khi dọn dẹp 

4. Nếu cảm biến cản không hoàn hảo và trong 10% trường hợp không báo cáo có tường khi có tường.
--> Khi đó, agent không di chuyển thẳng tiếp được, báo cáo đụng tường và đổi hướng

"""

'\n1. Nếu agent được đặt vào căn phòng hình chữ nhật không biết kích thước:\n--> Tìm đến góc rồi di chuyển ngang dọc để hút bụi\n\n2. Nếu khu vực vệ sinh có hình dáng bất thường:\n--> Cách tốt nhất: có thêm cảm biến quét kích thước từng căn phòng, khi đó có thể dọn dẹp từng căn phòng và hành lang.\n\n3. Nếu có vật cản\n--> Men theo vật cản n bước, sau đó đi ngược lại n bước\n\n4. Nếu cảm biến bụi bẩn không hoàn hảo và cho kết quả sai trong 10% trường hợp (sạch khi bẩn hoặc bẩn khi sạch).\n--> Thêm 1 bước kiểm tra lại căn phòng sau khi dọn dẹp \n\n4. Nếu cảm biến cản không hoàn hảo và trong 10% trường hợp không báo cáo có tường khi có tường.\n--> Khi đó, agent không di chuyển thẳng tiếp được, báo cáo đụng tường và đổi hướng\n\n'

## Advanced task: Imperfect Dirt Sensor

* __Graduate students__ need to complete this task [10 points]
* __Undergraduate students__ can attempt this as a bonus task [max +5 bonus points].

1. Change your simulation environment to run experiments for the following problem: The dirt sensor has a 10% chance of giving the wrong reading. Perform experiments to observe how this changes the performance of the three implementations. Your model-based reflex agent is likely not able to clean the whole room, so you need to measure performance differently as a tradeoff between energy cost and number of uncleaned squares. 

2. Design an implement a solution for your model-based agent that will clean better. Show the improvement with experiments.

In [198]:
# Your code and discussion goes here

# Model-based Reflex Agent

# Model-based Reflex Agent
class Agent_advance:
    def __init__(self,
                 max_steps=20,
                 room = setup_room(5, 0.2),
                 station_location=[0, 0],
                 verbose=True,
                 percent_error = 0.0):
        self.state = 0
        self.n = len(room)   # cạnh hình vuông
        self.actions_can_choice = ["north", "east", "south", "west", "suck" , "stand still"]
        self.position_x = 0
        self.position_y = 0

        self.steps = 0
        self.max_steps = max_steps

        self.station_location = station_location
        self.room = room

        self.need_to_clean = self.fc_need_to_clean()
        self.actual_dirt_vacuumed = self.need_to_clean

        self.verbose = verbose
        self.percent_error = percent_error

    def fc_need_to_clean(self):
        need_to_clean =0
        for i in range(self.n):
            for j in range(self.n):
                if room[i, j] == 1:
                    need_to_clean += 1
        return need_to_clean

    def update_agent_location(self, x, y):
        self.position_x = x
        self.position_y = y

    def moving (self, direction):
        feedback = ""
        if direction == "north":
            if self.position_x > 0:
                self.position_x -= 1
                self.steps += 1
            else:
                feedback = "Hit the North wall (no movement) --> change direction"

        elif direction == "east":
            if self.position_y < self.n - 1:
                self.position_y += 1
                self.steps += 1
            else:
                feedback = "Hit the East wall (no movement) --> change direction"

        elif direction == "south":
            if self.position_x < self.n-1:
                self.position_x += 1
                self.steps += 1
            else:
                feedback = "Hit the South wall (no movement) --> change direction"

        elif direction == "west":
            if self.position_y > 0:
                self.position_y -= 1
                self.steps += 1
            else:
                feedback = "Hit the West wall (no movement) --> change direction"
        elif direction == "stand still":
            feedback = "All roads are blocked --> Stand still"
            self.steps += 1
        if self.verbose:
            print(f"- Current position:[{self.position_x}][{self.position_y}]")
        if feedback!="" and self.verbose:
            print(feedback, "\n")


    def go_to_the_station(self):
        # simple: Tìm đường đi đến góc [0][0]
        if self.verbose:
            print("--> Go to station location")
        while self.position_x>0 :
            self.position_x -= 1
            self.steps +=1
        while self.position_y>0 :
            self.steps +=1
            self.position_y -= 1

        if self.verbose:
            print(f"- Current position:[{self.position_x}][{self.position_y}]")


    def action_suck(self):
        self.steps += 1
        if room[self.position_x, self.position_y] == 1.:
            self.actual_dirt_vacuumed -= 1
        self.room[self.position_x, self.position_y] = 0
        if self.verbose:
            print(f"--> SUCK")
    # def change_direction(self, direction):
    def check_dirt(self, i, j):
        k = np.random.randint(10)
        if k == 1:
            return self.room[i][j]
        else:
            return np.abs(self.room[i][j] - 1)

    def sense_majority(self, i, j, k=3):
        votes = sum(self.check_dirt(i, j) for _ in range(k))
        return votes > k // 2
    
    def whole_house_vacuuming(self):
        # self.steps = 0
        if self.verbose:
            print(f"- Current position:[{self.position_x}][{self.position_y}]")
        if self.position_x == 0 and self.position_y == 0:
            if self.verbose:
                print("Already at the charging station")
        else:
            if self.verbose:
                print("Not at the charging station yet.")
            self.go_to_the_station()
        if self.verbose:
            print("==> Start vacuuming the whole room.")

        for i in range(self.n-1):
            for j in range(self.n-1):
                if self.percent_error != 0.0:
                    if self.sense_majority(self.position_x, self.position_y):
                        self.action_suck()
                elif self.room[self.position_x][self.position_y] == 1.:
                    self.action_suck()
                if i % 2 == 0:
                    if self.verbose:
                        print("move east")
                    self.moving("east")
                else:
                    if self.verbose:
                        print("move west")
                    self.moving("west")

            if i!=self.n-1:
                if self.verbose:
                    print("move south")
                self.moving("south")

            if self.steps >= self.max_steps:
                if self.verbose:
                    print("Out of energy --> stop vacuuming")
                break

        if self.verbose:
            print("==> End vacuuming the whole room.")
        if self.verbose:
            print("The performance measure: ", self.steps)



## Advance task: Compare 4 agent

2 agent đầu tiên là hành động random nên ko cần tỷ lệ sai, áp dụng tỷ lệ cảm biến sai cho Reflex và advace agent

In [1]:
# Your code goes here
# Lấy max_steps cho Randomized Agent | Simple Reflex Agent là: 200
def Compare_the_performance(n=5, p=0.2, max_steps=200, iter=100, verbose=False):
    avg_p_random = 0
    avg_p_simple = 0
    avg_p_reflex = 0
    avg_p_advance = 0
    avg_need_to_clean_1, avg_actual_dirt_vacuumed_1 = 1, 1
    avg_need_to_clean_2, avg_actual_dirt_vacuumed_2 = 1, 1
    avg_need_to_clean_3, avg_actual_dirt_vacuumed_3 = 1, 1
    avg_need_to_clean_4, avg_actual_dirt_vacuumed_4 = 1, 1
    for i in range(iter):
        p_random, need_to_clean_1, actual_dirt_vacuumed_1 = real_environment(agent_function=simple_randomized_agent,
                                                                             max_steps = max_steps,
                                                                             n = n,
                                                                             p=p,
                                                                             verbose=verbose)
        avg_p_random+=p_random
        avg_need_to_clean_1 +=need_to_clean_1
        avg_actual_dirt_vacuumed_1+= actual_dirt_vacuumed_1
        #####################################################################
        p_simple, need_to_clean_2, actual_dirt_vacuumed_2 = real_environment(agent_function=simple_reflex_agent,
                                                                             max_steps = max_steps,
                                                                             n = n,
                                                                             p = p,
                                                                             verbose=verbose)
        avg_p_simple+=p_simple
        avg_need_to_clean_2 += need_to_clean_2
        avg_actual_dirt_vacuumed_2 += actual_dirt_vacuumed_2
        #####################################################################
        room = setup_room(n, p)
        agent_1 = Agent(max_steps=max_steps,
                        room = room,
                        station_location=[0,0],
                        verbose=verbose,
                        percent_error=10.0)

        agent_1.update_agent_location(x=np.random.randint(0, n), y=np.random.randint(0, n))
        agent_1.whole_house_vacuuming()
        avg_p_reflex+=agent_1.steps
        avg_need_to_clean_3 += agent_1.need_to_clean
        avg_actual_dirt_vacuumed_3 += agent_1.actual_dirt_vacuumed

        agent_ad = Agent_advance(max_steps=max_steps,
                        room = room,
                        station_location=[0,0],
                        verbose=verbose)

        agent_ad.update_agent_location(x=np.random.randint(0, n), y=np.random.randint(0, n))
        agent_ad.whole_house_vacuuming()
        avg_p_advance+=agent_ad.steps
        avg_need_to_clean_4 += agent_ad.need_to_clean
        avg_actual_dirt_vacuumed_4 += agent_ad.actual_dirt_vacuumed

    avg_p_random = avg_p_random/iter
    avg_p_simple = avg_p_simple/iter
    avg_p_reflex = avg_p_reflex/iter
    avg_p_advance = avg_p_advance/iter

    avg_need_to_clean_1 = avg_need_to_clean_1/iter
    avg_actual_dirt_vacuumed_1 = avg_actual_dirt_vacuumed_1/iter
    avg_need_to_clean_2 = avg_need_to_clean_2/iter
    avg_actual_dirt_vacuumed_2 = avg_actual_dirt_vacuumed_2/iter
    avg_need_to_clean_3 = avg_need_to_clean_3/iter
    avg_actual_dirt_vacuumed_3 = avg_actual_dirt_vacuumed_3/iter
    avg_need_to_clean_4 = avg_need_to_clean_4/iter
    avg_actual_dirt_vacuumed_4 = avg_actual_dirt_vacuumed_4/iter


    print(f"Random: {avg_p_random} || Cleaning progress:{avg_actual_dirt_vacuumed_1/avg_need_to_clean_1 * 100} %")
    print(f"Simple: {avg_p_simple} || Cleaning progress:{avg_actual_dirt_vacuumed_2/avg_need_to_clean_2 *100} %")
    print(f"Reflex: {avg_p_reflex}|| Cleaning progress:{avg_actual_dirt_vacuumed_3/avg_need_to_clean_3 *100} %")
    print(f"Advance: {avg_p_advance}|| Cleaning progress:{avg_actual_dirt_vacuumed_4/avg_need_to_clean_4 *100} %")



In [200]:
print("N = 5")
Compare_the_performance(n=5, p=0.2, max_steps=200, iter=100, verbose=False)
print("-----------------------------------------------------")

print("N = 10")
Compare_the_performance(n=10, p=0.2, max_steps=200, iter=100, verbose=False)
print("-----------------------------------------------------")

print("N = 100")
Compare_the_performance(n=100, p=0.2, max_steps=2000, iter=100, verbose=False)

N = 5
Random: 186.91 || Cleaning progress:74.05189620758483 %
Simple: 99.1 || Cleaning progress:99.00199600798403 %
Reflex: 35.75|| Cleaning progress:55.489021956087825 %
Advace: 26.88|| Cleaning progress:89.62075848303394 %
-----------------------------------------------------
N = 10
Random: 200.0 || Cleaning progress:24.437781109445275 %
Simple: 200.0 || Cleaning progress:55.722138930534726 %
Reflex: 160.17|| Cleaning progress:45.03435352904434 %
Advace: 113.97|| Cleaning progress:85.38413491567769 %
-----------------------------------------------------
N = 100
Random: 2000.0 || Cleaning progress:2.28198859005705 %
Simple: 2000.0 || Cleaning progress:5.703471482642587 %
Reflex: 2078.35|| Cleaning progress:93.54869617399811 %
Advace: 2060.32|| Cleaning progress:97.41806647470258 %


Fill out the following table with the average performance measure for 100 random runs (you may also create this table with code):

| Size              | Model-based Reflex Agent | Model advance |
|-------------------|--------------------------|---------------|
| 5x5               | 35.75                    | 26.88         |
| 10x10             | 160                      | 113           
| 100x100           | 2078                     | 2060          |

----------------
Mức độ làm sạch căn phòng (có giới hạn max_step)

| Size              | Model-based Reflex Agent | Model advance |
|-------------------|--------------------------|---------------|
| 5x5               | 55.48                    | 89.62         |
| 10x10             | 45.03                    | 85.38         
| 100x100           | 93.54                    | 97.41         |


Add charts to compare the performance of the different agents.

## More Advanced Implementation (not for credit)

If the assignment was to easy for yuo then you can think about the following problems. These problems are challenging and not part of this assignment. We will learn implementation strategies and algorithms useful for these tasks during the rest of the semester.

* __Obstacles:__ Change your simulation environment to run experiments for the following problem: Add random obstacle squares that also trigger the bumper sensor. The agent does not know where the obstacles are. Perform experiments to observe how this changes the performance of the three implementations. Describe what would need to be done to perform better with obstacles. Add code if you can. 

* __Agent for and environment with obstacles:__ Implement an agent for an environment where the agent does not know how large the environment is (we assume it is rectangular), where it starts or where the obstacles are. An option would be to always move to the closest unchecked/uncleaned square (note that this is actually depth-first search).

* __Utility-based agent:__ Change the environment for a $5 \times 5$ room, so each square has a fixed probability of getting dirty again. For the implementation, we give the environment a 2-dimensional array of probabilities. The utility of a state is defined as the number of currently clean squares in the room. Implement a utility-based agent that maximizes the expected utility over one full charge which lasts for 100000 time steps. To do this, the agent needs to learn the probabilities with which different squares get dirty again. This is very tricky!

In [185]:
# Your ideas/code