# SBU CSE 352 - HW 1 - Intelligent Agents: Reflex-Based Agents for the Vacuum-cleaner World

---

Name: Arslan Baig

I understand that my submission needs to be my own work: Yes

## Instructions

Total Points: 100

Complete this notebook. Use the provided notebook cells and insert additional code and markdown cells as needed. Only use standard packages (numpy, scipy, and built-in packages like random). Submit the completely rendered notebook as an HTML file.

## Introduction

In this assignment you will implement a simulator environment for an automatic vacuum cleaner robot, a set of different reflex-based agent programs, and perform a comparison study for cleaning a single room. Focus on the __cleaning phase__ which starts when the robot is activated and ends when the last dirty square in the room has been cleaned. Someone else will take care of the agent program needed to navigate back to the charging station after the room is clean.

## PEAS description of the cleaning phase

__Performance Measure:__ Each action costs 1 energy unit. The performance is measured as the sum of the energy units used to clean the whole room.

__Environment:__ A room with $n \times n$ squares where $n = 5$. Dirt is randomly placed on each square with probability $p = 0.2$. For simplicity, you can assume that the agent knows the size and the layout of the room (i.e., it knows $n$). To start, the agent is placed on a random square.

__Actuators:__ The agent can clean the current square (action `suck`) or move to an adjacent square by going `north`, `east`, `south`, or `west`.

__Sensors:__ Four bumper sensors, one for north, east, south, and west; a dirt sensor reporting dirt in the current square.  
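As a minimal sketch of the environment described above, the room can be represented as a grid of booleans in which `True` marks a dirty square. The function name `make_room` and its `rng` parameter are illustrative assumptions, not part of the assignment.

```python
import random

def make_room(n=5, p=0.2, rng=None):
    """Return an n x n grid of booleans (True = dirty) and a random start square."""
    if rng is None:
        rng = random.Random()
    # Each square is dirty independently with probability p.
    dirt = [[rng.random() < p for _ in range(n)] for _ in range(n)]
    # The agent starts on a uniformly random square, given as (row, column).
    start = (rng.randrange(n), rng.randrange(n))
    return dirt, start

room, start = make_room(n=5, p=0.2)
for row in room:
    print(" ".join("D" if dirty else "." for dirty in row))
print("start:", start)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which can help when comparing agents on the same room.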


## The agent program for a simple randomized agent

The agent program is a function that gets sensor information (the current percepts) as the arguments. The arguments are:

* A dictionary with boolean entries for the four bumper sensors `north`, `east`, `west`, `south`. E.g., if the agent is on the north-west corner, `bumpers` will be `{"north" : True, "east" : False, "south" : False, "west" : True}`.
* The dirt sensor produces a boolean.

The agent returns the chosen action as a string.

Here is an example implementation for the agent program of a simple randomized agent:  

In [None]:
import numpy as np

actions = ["north", "east", "west", "south", "suck"]

def simple_randomized_agent(bumpers, dirty):
    return np.random.choice(actions)

In [None]:
# define percepts (current location is NW corner and it is dirty)
bumpers = {"north" : True, "east" : False, "south" : False, "west" : True}
dirty = True

# call agent program function with percepts and it returns an action
simple_randomized_agent(bumpers, dirty)

'south'

__Note:__ This is not a rational intelligent agent. It ignores its sensors and may bump into a wall repeatedly or not clean a dirty square. You will be asked to implement rational agents below.

## Simple environment example

We implement a simple simulation environment that supplies the agent with its percepts.
The simple environment is infinite in size (bumpers are always `False`) and every square is always dirty, even if the agent cleans it. The environment function returns a performance measure which is here the number of cleaned squares (since the room is infinite and all squares are constantly dirty, the agent can never clean the whole room as required in the PEAS description above). The energy budget of the agent is specified as `max_steps`.

In [None]:
def simple_environment(agent, max_steps, verbose = True):
    num_cleaned = 0

    for i in range(max_steps):
        dirty = True
        bumpers = {"north" : False, "south" : False, "west" : False, "east" : False}

        action = agent(bumpers, dirty)
        if (verbose): print("step", i , "- action:", action)

        if (action == "suck"):
            num_cleaned = num_cleaned + 1

    return num_cleaned



Do one simulation run with a simple randomized agent that has enough energy for 20 steps.

In [None]:
simple_environment(simple_randomized_agent, max_steps = 20)

step 0 - action: west
step 1 - action: north
step 2 - action: north
step 3 - action: north
step 4 - action: east
step 5 - action: suck
step 6 - action: suck
step 7 - action: suck
step 8 - action: east
step 9 - action: north
step 10 - action: west
step 11 - action: suck
step 12 - action: south
step 13 - action: north
step 14 - action: suck
step 15 - action: north
step 16 - action: west
step 17 - action: west
step 18 - action: suck
step 19 - action: north


6

# Tasks

## General [10 Points]

1. Your implementation can use libraries like math, numpy, scipy, but not libraries that implement intelligent agents or complete search algorithms. Try to keep the code simple! In this course, we want to learn about the algorithms and we often do not need to use object-oriented design, for example. Objects are okay if they make your code simpler, but try to keep the code as simple as possible.
2. Your notebook needs to be formatted professionally.
    - Add additional markdown blocks for your description, add comments in the code, add tables, and use matplotlib to produce charts where appropriate.
    - Do not show debugging output or include an excessive amount of output.
    - Check that your PDF file is readable. For example, long lines are cut off in the PDF file. You don't have control over page breaks, so do not worry about these.
3. Document your code. Add a short discussion of how your implementation works and your design choices.


## Task 1: Implement a simulation environment [20 Points]

The simple environment above is not very realistic. Your environment simulator needs to follow the PEAS description from above. It needs to:

* Initialize the environment by storing the state of each square (clean/dirty) and making some dirty. ([Help with random numbers and arrays in Python](https://github.com/mhahsler/CS7320-AI/blob/master/HOWTOs/random_numbers_and_arrays.ipynb))
* Keep track of the agent's position.
* Call the agent function repeatedly and provide the agent function with the sensor inputs.  
* React to the agent's actions, e.g., by removing dirt from a square or moving the agent around unless there is a wall in the way.
* Keep track of the performance measure. That is, track the agent's actions until all dirty squares are clean and count the number of actions it takes the agent to complete the task.

The easiest implementation for the environment is to hold a 2-dimensional array to represent if squares are clean or dirty and to call the agent function in a loop until all squares are clean or a predefined number of steps has been reached (i.e., the robot runs out of energy).

The simulation environment should be a function like the `simple_environment()` and needs to work with the simple randomized agent program from above. **Use the same environment for all your agent implementations in the tasks below.**
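One possible way to let an environment that expects agent objects also accept a plain agent function such as `simple_randomized_agent` is a thin adapter. The sketch below assumes the `Agent_Move`/`Agent_Reset` interface used by the agent classes in this notebook; the class name `Function_Agent` and the stand-in function `random_function_agent` are hypothetical.

```python
import random

class Function_Agent:
    """Wrap a plain agent function so it offers Agent_Move and Agent_Reset."""

    def __init__(self, agent_function):
        self.agent_function = agent_function

    def Agent_Reset(self, n):
        return  # a stateless function has nothing to reset

    def Agent_Move(self, bumpers, dirty):
        # Assumption: the environment may report dirt as "D"/"C" instead of a boolean.
        is_dirty = (dirty == "D") if isinstance(dirty, str) else bool(dirty)
        return self.agent_function(bumpers, is_dirty)

def random_function_agent(bumpers, dirty):
    # Stand-in for the notebook's function-style randomized agent.
    return random.choice(["north", "east", "west", "south", "suck"])

wrapped = Function_Agent(random_function_agent)
corner = {"north": True, "east": False, "south": False, "west": True}
print(wrapped.Agent_Move(corner, "D"))
```

The adapter keeps the agent function unaware of how the environment encodes its percepts, so the same function can be reused with either representation.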

*Note on debugging:* Debugging is difficult. Make sure your environment prints enough information when you use `verbose = True`. Also, implementing a function that the environment can use to display the room with dirt and the current position of the robot at every step is very useful.
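A display helper of the kind suggested in the note could look like the following sketch. The function name `render_room` and the symbols are assumptions chosen for illustration: `D` is a dirty square, `.` a clean one, `A` the agent on a clean square, and `@` the agent on a dirty square.

```python
def render_room(dirt, agent_position):
    """Return a text picture of the room with the agent's position marked."""
    agent_position = tuple(agent_position)
    lines = []
    for r, row in enumerate(dirt):
        symbols = []
        for c, is_dirty in enumerate(row):
            if (r, c) == agent_position:
                symbols.append("@" if is_dirty else "A")
            else:
                symbols.append("D" if is_dirty else ".")
        lines.append(" ".join(symbols))
    return "\n".join(lines)

room = [[False, True, False],
        [False, False, False],
        [True, False, False]]
print(render_room(room, (0, 1)))
```

Returning a string instead of printing directly keeps the helper easy to test and lets the environment decide when to print.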

In [59]:
# Your code and description goes here

# Environment parameters
# 1) agent: an agent object whose Agent_Move(bumpers, dirty) method returns the chosen action
# 2) n: the side length of the square room
# 3) max_steps: the energy budget; the run stops when the room is clean or max_steps is reached
# 4) verbose: set to True to print the room at every step for debugging

# Environment objectives
# 1) Initialize the n x n floor so that each square is dirty with probability 0.2
# 2) Place the agent on a random square (row, column) and keep track of its location
# 3) Move the agent, and clean the current square when the agent sucks
# 4) Keep track of the performance measure: each move and each suck costs one energy unit




import numpy as np
import random

def init_bumpers(agent_position, n):             # Compute the bumper percepts for the agent's current position
  # A bumper is True when the agent stands next to the wall on that side.
  # Each side is checked independently, so a room with n = 1 reports walls on all four sides.
  return {
      "north": bool(agent_position[0] == 0),       # top row
      "south": bool(agent_position[0] == n-1),     # bottom row
      "west":  bool(agent_position[1] == 0),       # leftmost column
      "east":  bool(agent_position[1] == n-1),     # rightmost column
  }

def simple_environment(agent, n, max_steps, verbose = False):
    # Floor area: "C" = clean, "D" = dirty, drawn with probabilities 0.8 and 0.2 respectively
    floor_area = np.random.choice(["C", "D"], size=(n, n), p=[0.8, 0.2])

    dirty_count = int(np.sum(floor_area == "D"))       # dirty squares left; the room is clean when this reaches 0
    performance_measure = 0                            # energy used so far, one unit per action

    agent_position = np.random.randint(0, n, size=2)   # random starting square as [row, column]
    bumpers = init_bumpers(agent_position, n)

    def show_floor():                                  # print the floor with the agent marked as "A"
      store_position = floor_area[agent_position[0],agent_position[1]]
      floor_area[agent_position[0],agent_position[1]] = "A"
      print(floor_area)
      floor_area[agent_position[0],agent_position[1]] = store_position

    if (verbose):
      print("Generating Floor Area...")
      print(floor_area)
      print("Placing Agent")
      show_floor()

    # Run until the room is clean or the energy budget is used up.
    # A room that starts out clean ends the run immediately with a cost of 0.
    while dirty_count > 0 and performance_measure < max_steps:
      agent_action = agent.Agent_Move(bumpers, floor_area[agent_position[0]][agent_position[1]])
      performance_measure += 1                         # every action costs energy, including bumping into a wall
      if (verbose): print("step", performance_measure, "- action:", agent_action)

      if agent_action == "suck":
        if(floor_area[agent_position[0]][agent_position[1]] == "D"):
          floor_area[agent_position[0]][agent_position[1]] = "C"
          dirty_count -= 1
      elif agent_action == "north" and agent_position[0] > 0:
        agent_position[0] -= 1
      elif agent_action == "south" and agent_position[0] < n-1:
        agent_position[0] += 1
      elif agent_action == "east" and agent_position[1] < n-1:
        agent_position[1] += 1
      elif agent_action == "west" and agent_position[1] > 0:
        agent_position[1] -= 1

      bumpers = init_bumpers(agent_position, n)        # refresh the percepts for the new position
      if (verbose): show_floor()

    fully_cleaned = dirty_count == 0
    if verbose:
      print("Performance Rating : ", performance_measure)
      print("Task Status: ", "Fully Cleaned" if fully_cleaned else "Max Steps Reached")
    agent.Agent_Reset(n)                               # clear any agent state before the next run
    return performance_measure

class Simple_Randomized_Agent:                             # Randomized agent rewritten as a class so that all agents share the same interface
  def Agent_Reset(self, n, self_position = (None,None), starting_position = False):
    return                                                 # stateless, so there is nothing to reset

  def Agent_Move(self, bumpers, dirty):
      # "suck" must be one of the choices, otherwise the agent can never clean a square
      actions = ["north", "east", "west", "south", "suck"]
      return np.random.choice(actions)

# simple_randomized_agent = Simple_Randomized_Agent()
# simple_environment(simple_randomized_agent,5,20)



## Task 2: Implement a simple reflex agent [10 Points]

The simple reflex agent randomly walks around but reacts to the bumper sensors by not bumping into walls and to dirt by sucking it up. Implement the agent program as a function.

_Note:_ Agents cannot directly use variables in the environment. They only get the percepts as the arguments to the agent function.
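For comparison with the description above, a function-style simple reflex agent might look like the following sketch. It assumes the boolean `dirty` percept from the introduction; the name `simple_reflex_agent` and the fallback for a room where every bumper is set are assumptions.

```python
import random

def simple_reflex_agent(bumpers, dirty):
    """Suck if the square is dirty; otherwise move in a random unblocked direction."""
    if dirty:
        return "suck"
    free = [d for d in ("north", "east", "south", "west") if not bumpers[d]]
    # Assumption: if every bumper is set (a 1 x 1 room), fall back to "suck".
    return random.choice(free) if free else "suck"

corner = {"north": True, "east": False, "south": False, "west": True}
print(simple_reflex_agent(corner, dirty=False))  # either "east" or "south"
```

The agent uses only its two percepts and keeps no memory, which is what distinguishes it from the model-based agent in the next task.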

In [36]:
# Your code and description goes here
# Capabilities
# 1) Agent can sense whether the current square is dirty
# 2) Agent does not know its position
# 3) Agent responds to the bumpers and never chooses a move that would bump into a wall
# 4) Agent cannot keep track of previous locations

class Simple_Reflex_Agent:
  def Agent_Reset(self, n, self_position = (None,None), starting_position = False):
    return                                  # stateless, so there is nothing to reset

  def Agent_Move(self, bumpers, dirty):
    if(dirty == "D"):                       # the environment reports the dirt percept as "D" or "C"
      return "suck"
    # keep only the directions that are not blocked by a wall (each side is checked independently)
    actions = [a for a in ["north", "east", "west", "south"] if not bumpers[a]]
    return np.random.choice(actions)


simple_agent = Simple_Reflex_Agent()
simple_environment(simple_agent,7,200)

200

## Task 3: Implement a model-based reflex agent [20 Points]

Model-based agents use a state to keep track of what they have done and perceived so far. Your agent needs to find out where it is located and then keep track of its current location. You also need a set of rules based on the state and the percepts to make sure that the agent will clean the whole room. For example, the agent can move to a corner to determine its location and then it can navigate through the whole room and clean dirty squares.

Describe how you define the __agent state__ and how your agent works before implementing it. ([Help with implementing state information in Python](https://github.com/mhahsler/CS7320-AI/blob/master/HOWTOs/store_agent_state_information.ipynb))
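As a rough illustration of the corner-then-sweep idea mentioned above, the sketch below lists the moves of a row-by-row sweep that starts in the north-west corner `(0, 0)`. The function name `snake_actions` is hypothetical. The sweep itself takes $n^2 - 1$ moves; reaching the corner first takes at most $2(n - 1)$ moves, and each dirty square adds one `suck`, so for $n = 5$ with $d$ dirty squares the total is at most $8 + 24 + d$ actions.

```python
def snake_actions(n):
    """Moves of a row-by-row sweep of an n x n room starting at (0, 0)."""
    actions = []
    for row in range(n):
        direction = "east" if row % 2 == 0 else "west"
        actions.extend([direction] * (n - 1))   # cross the current row
        if row < n - 1:
            actions.append("south")             # step down to the next row
    return actions

moves = snake_actions(5)
print(len(moves))   # 24, i.e. 5 * 5 - 1
print(moves[:6])    # ['east', 'east', 'east', 'east', 'south', 'west']
```

Because every square is visited exactly once, an agent that sucks whenever it senses dirt is guaranteed to clean a rectangular room without obstacles.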

In [273]:
# Your short description of the state and your implementation goes here

# State:
# The agent stores its own position (unknown at the start), a flag that records whether it has located itself,
# and a map of the room. Every square in the map is initially assumed to be dirty, so the agent knows that it
# has to traverse the entire room. Squares are marked clean in the map as the agent passes over them.
# Implementation:
# The agent first travels to location (0,0), the north-west corner of the room. Because its initial position is unknown,
# it moves north until the north bumper fires and then west until the west bumper fires, cleaning any dirty square along the way.
# Once the position is known, the agent sweeps the room row by row: west to east on even rows, east to west on odd rows,
# stepping one row south at each wall, until it reaches the south-east or south-west corner OR the floor is completely clean.

In [57]:
# Your code goes here

class Model_Reflex_Agent:

  def __init__(self, n, self_position = (None,None), starting_position = False):
    self.Agent_Reset(n, self_position, starting_position)

  def Agent_Reset(self, n, self_position = (None,None), starting_position = False):
    self.self_position = self_position                  # (row, column); unknown until the corner is reached
    self.starting_position = starting_position          # True once the agent has located itself at (0,0)
    self.floor_mapping = [["D"] * n for _ in range(n)]  # internal map; every square is assumed dirty at first
    self.last_position = (None,None)

  def Agent_Move(self,bumpers,dirty, verbose = False):
    if(dirty == "D"):                                   # always clean the current square first
      return "suck"

    if not self.starting_position:                      # localization phase: head for the north-west corner
      if not bumpers["north"]:
        return "north"
      elif not bumpers["west"]:
        return "west"
      else:
        self.starting_position = True
        self.self_position = (0,0)

    row, col = self.self_position
    self.floor_mapping[row][col] = "C"                  # the current square is known to be clean
    if verbose:
      for map_row in self.floor_mapping:
        print(map_row)
      print()

    # Sweep phase: east on even rows, west on odd rows, one step south at each wall
    sweep_direction = "east" if row % 2 == 0 else "west"
    if not bumpers[sweep_direction]:
      self.self_position = (row, col+1) if sweep_direction == "east" else (row, col-1)
      return sweep_direction
    else:
      self.self_position = (row+1, col)
      return "south"
n = 5
model_agent = Model_Reflex_Agent(n)
model_5x5 = 0
for i in range(100):                 # average the performance over 100 random runs
  model_5x5 += simple_environment(model_agent,5,100000)
print(model_5x5/100)

27.66


## Task 4: Simulation study [30 Points]

---



Compare the performance (the performance measure is defined in the PEAS description above) of the agents using environments of different sizes, e.g., $5 \times 5$, $10 \times 10$, and $100 \times 100$. Use 100 random runs for each with a maximum of 100000 steps. Present the results using tables and graphs. Discuss the differences between the agents.
([Help with charts and tables in Python](https://github.com/mhahsler/CS7320-AI/blob/master/HOWTOs/charts_and_tables.ipynb))
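The repeated runs described above can be wrapped in a small helper. This is only a sketch: the name `average_performance` is hypothetical, and `run_once` stands for any zero-argument callable that performs one simulation and returns its performance measure.

```python
import statistics

def average_performance(run_once, runs=100):
    """Call run_once() `runs` times and return (mean, sample standard deviation)."""
    results = [run_once() for _ in range(runs)]
    spread = statistics.stdev(results) if runs > 1 else 0.0
    return statistics.mean(results), spread

# Example call with this notebook's names (shown as a comment because it needs the cells above):
# mean, spread = average_performance(lambda: simple_environment(simple_agent, 5, 100000))
```

Reporting the standard deviation next to the mean gives a sense of how much the randomized agents vary from run to run.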

In [60]:
# Your code goes here
#Implement tests with all 3 bots, turning verbose off and then running it so that we can see their performance
#in all of the setting below. Get all tests done in one round of the code, so that it is clear for submission

random_agent = Simple_Randomized_Agent()
simple_agent = Simple_Reflex_Agent()
model_agent_5x5 = Model_Reflex_Agent(5)
model_agent_10x10 = Model_Reflex_Agent(10)
model_agent_100x100 = Model_Reflex_Agent(100)


random_5x5 = random_10x10 = random_100x100 = 0
simple_5x5 = simple_10x10 = simple_100x100 = 0
model_5x5 = model_10x10 = model_100x100 = 0

for i in range(100):
  random_5x5 += simple_environment(random_agent,5,100000)
  random_10x10 += simple_environment(random_agent,10,100000)
  random_100x100 += simple_environment(random_agent,100,100000)

  simple_5x5 += simple_environment(simple_agent,5,100000)
  simple_10x10 += simple_environment(simple_agent,10,100000)
  simple_100x100 += simple_environment(simple_agent,100,100000)

  model_5x5 += simple_environment(model_agent_5x5,5,100000)
  model_10x10 += simple_environment(model_agent_10x10,10,100000)
  model_100x100 += simple_environment(model_agent_100x100,100,100000)

random_5x5 /= 100
random_10x10 /= 100
random_100x100 /= 100
simple_5x5 /= 100
simple_10x10 /= 100
simple_100x100 /= 100
model_5x5 /= 100
model_10x10 /= 100
model_100x100 /= 100
print("Random 5x5: ", random_5x5)
print("Random 10x10: ", random_10x10)
print("Random 100x100: ", random_100x100)
print("Simple 5x5: ", simple_5x5)
print("Simple 10x10: ", simple_10x10)
print("Simple 100x100: ", simple_100x100)
print("Model 5x5: ", model_5x5)
print("Model 10x10: ", model_10x10)
print("Model 100x100: ", model_100x100)


KeyboardInterrupt: 

Fill out the following table with the average performance measure for 100 random runs (you may also create this table with code):

| Size     | Randomized Agent | Simple Reflex Agent | Model-based Reflex Agent |
|----------|------------------|---------------------|--------------------------|
| 5x5     | | | |
| 10x10   | | | |
| 100x100 | | | |
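The table can also be produced with code, for example with the sketch below. The function name `results_table` and the layout of the `results` dictionary are assumptions. The only value filled in is the 27.66 average printed for the model-based agent in Task 3; every other cell is left as `n/a` until it has been measured.

```python
def results_table(results, sizes, agents):
    """Build a Markdown table from results[(agent, size)] -> average performance."""
    header = "| Size | " + " | ".join(agents) + " |"
    divider = "|" + "---|" * (len(agents) + 1)
    lines = [header, divider]
    for size in sizes:
        cells = []
        for agent in agents:
            value = results.get((agent, size))
            cells.append("n/a" if value is None else f"{value:.2f}")
        lines.append(f"| {size}x{size} | " + " | ".join(cells) + " |")
    return "\n".join(lines)

agents = ["Randomized Agent", "Simple Reflex Agent", "Model-based Reflex Agent"]
results = {("Model-based Reflex Agent", 5): 27.66}   # average reported in Task 3
print(results_table(results, [5, 10, 100], agents))
```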

Add charts to compare the performance of the different agents.

In [None]:
# Your graphs and discussion of the results goes here

## Task 5: Robustness of the agent implementations [10 Points]

Describe how **your agent implementations** will perform

* if they are put into a rectangular room of unknown size,
* if the cleaning area can have an irregular shape (e.g., a hallway connecting two rooms), or
* if the room contains obstacles (i.e., squares that the agent cannot pass through and that trigger the bumper sensors).
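To reason about these scenarios, it can help to model walls and obstacles explicitly. The sketch below is an assumption about how such an environment could report bumper percepts; `free` is a hypothetical grid in which `True` marks a square the agent can enter, and the grid may be rectangular. With obstacles, a bumper can fire away from the outer wall, so an agent that treats a bumper as the edge of the room could misjudge its position.

```python
def bumpers_with_obstacles(free, position):
    """Bumper percepts in a grid where free[r][c] is True for passable squares."""
    rows, cols = len(free), len(free[0])
    r, c = position

    def blocked(rr, cc):
        # A square outside the grid and an obstacle square both trigger the bumper.
        outside = not (0 <= rr < rows and 0 <= cc < cols)
        return outside or not free[rr][cc]

    return {
        "north": blocked(r - 1, c),
        "east": blocked(r, c + 1),
        "south": blocked(r + 1, c),
        "west": blocked(r, c - 1),
    }

free = [[True, True, True],
        [True, False, True],
        [True, True, True]]
print(bumpers_with_obstacles(free, (0, 1)))
# {'north': True, 'east': False, 'south': True, 'west': False}
```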

In [None]:
# Answer goes here



---
Assignment adapted from [Michael Hahsler](https://github.com/mhahsler/CS7320-AI) under [CC BY-SA](https://creativecommons.org/licenses/by-sa/4.0/deed.en) license.
