### 2.9

__Implement a performance-measuring environment simulator for the vacuum-cleaner world depicted in Figure 2.2 and specified on page 38. Your implementation should be modular so that the sensors, actuators, and environment characteristics (size, shape, dirt placement, etc.) can be changed easily. (_Note:_ for some choices of programming language and operating system there are already implementations in the [online code repository](http://aima.cs.berkeley.edu/code.html).)__

The world in Figure 2.2 has two squares, "A" and "B". I have implemented this as `vacuum_cleaner_world.environments.SimpleVacuumWorld`.

The specifications from page 38 are as follows:

* The performance measure awards one point for each clean square at each time step, over a "lifetime" of 1000 time steps.
* The geography of the environment is known _a priori_ but the dirt distribution and the initial location of the agent are not. Clean squares stay clean and sucking cleans the current square. The _Left_ and _Right_ actions move the agent left and right except when this would take the agent outside the environment, in which case the agent remains where it is.
* The only available actions are _Left_, _Right_, and _Suck_.
* The agent correctly perceives its location and whether that location contains dirt.
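The four points above can be sketched as a small simulator. This is only an illustrative sketch, not the `SimpleVacuumWorld` implementation referred to above: the class name `TwoSquareVacuumWorld`, its constructor arguments, and the `reflex_agent` function are hypothetical, and it assumes that the score is recorded after each action.

```python
import random


class TwoSquareVacuumWorld:
    """Hypothetical sketch of the two-square world (not the author's SimpleVacuumWorld)."""

    SQUARES = ("A", "B")

    def __init__(self, dirt=None, location=None, seed=None):
        rng = random.Random(seed)
        # Dirt placement and start square are random unless given explicitly.
        if dirt is None:
            dirt = {s: rng.random() < 0.5 for s in self.SQUARES}
        self.dirt = dict(dirt)
        self.location = location if location is not None else rng.choice(self.SQUARES)

    def percept(self):
        """Sensor model: the agent sees its location and whether it is dirty."""
        return self.location, self.dirt[self.location]

    def execute(self, action):
        """Actuator model: apply one action; moving off the edge does nothing."""
        if action == "Suck":
            self.dirt[self.location] = False
        elif action == "Left":
            self.location = "A"
        elif action == "Right":
            self.location = "B"

    def performance(self):
        """One point per clean square at the current time step."""
        return sum(not dirty for dirty in self.dirt.values())

    def simulate(self, agent, steps=1000):
        """Run the agent for `steps` time steps and return the total score."""
        score = 0
        for _ in range(steps):
            self.execute(agent(self.percept()))
            score += self.performance()  # scored after each action (an assumption)
        return score


def reflex_agent(percept):
    """Simple reflex agent: suck if dirty, otherwise move to the other square."""
    location, dirty = percept
    if dirty:
        return "Suck"
    return "Right" if location == "A" else "Left"


world = TwoSquareVacuumWorld(dirt={"A": True, "B": True}, location="A")
print(world.simulate(reflex_agent))  # 1998
```

Keeping `percept`, `execute`, and `performance` as separate methods is one way to let sensors, actuators, and the performance measure be swapped independently, for example by subclassing and overriding a single method.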

### 2.10
__Consider a modified version of the vacuum environment in Exercise 2.9, in which the agent is penalized one point for each movement.__

__a. Can a simple reflex agent be perfectly rational for this environment? Explain.__

A simple reflex agent cannot be perfectly rational in this environment. When the dirt sensor reports dirt, returning `Clean` (the _Suck_ action of the specification) is clearly better than moving, so the rules for percepts `[A, Dirt]` and `[B, Dirt]` must be `Clean`. For `[A, No Dirt]`, a simple reflex agent can return `Clean`, `Left`, or `Right`.

`Clean` or `Left` keeps the agent in square A for its entire lifetime, so its expected performance is poor whenever square B may contain dirt. Conversely, a reflex agent that returns `Right` for `[A, No Dirt]` and `Left` for `[B, No Dirt]` keeps oscillating after both squares are clean and pays the movement penalty at every step. An agent that stays still after visiting both squares would perform better, but its action for `[A, No Dirt]` would then have to depend on whether it has already visited B, which the current percept does not reveal. No set of condition-action rules, and therefore no simple reflex agent, can implement such an agent function.
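The trade-off between the two rule sets can be checked numerically. The sketch below is illustrative only: the `run` helper and both agent functions are hypothetical, action names follow the book (`Suck` rather than `Clean`), and it assumes that scoring happens after each action and that bumping into a wall is not penalized.

```python
def run(agent, dirt, location, steps=1000):
    """Score an agent function (location, dirty) -> action with a 1-point move penalty."""
    dirt = dict(dirt)  # copy so the caller's dict is not modified
    score = 0
    for _ in range(steps):
        action = agent(location, dirt[location])
        if action == "Suck":
            dirt[location] = False
        elif action == "Right" and location == "A":
            location, score = "B", score - 1  # an actual move costs one point
        elif action == "Left" and location == "B":
            location, score = "A", score - 1
        score += sum(not d for d in dirt.values())  # one point per clean square
    return score


def oscillating(location, dirty):
    """Reflex rules that move to the other square whenever the current one is clean."""
    if dirty:
        return "Suck"
    return "Right" if location == "A" else "Left"


def stationary(location, dirty):
    """Reflex rules that never move."""
    return "Suck"


both_dirty = {"A": True, "B": True}
all_clean = {"A": False, "B": False}

print(run(oscillating, both_dirty, "A"))  # 1000
print(run(stationary, both_dirty, "A"))   # 1000
print(run(oscillating, all_clean, "A"))   # 1000
print(run(stationary, all_clean, "A"))    # 2000
```

Under these assumptions neither rule set comes close to the roughly 2000 points available when both squares end up clean and the agent stops moving: the oscillating agent loses a point on every step, and the stationary agent never cleans the other square.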

__b. What about a reflex agent with state? Design such an agent.__

Since a single move is enough to visit every square, a reflex agent with state only needs to keep track of how many moves it has made and return `NoOp` once that count reaches 1.

The stateful reflex agent is implemented as `vacuum_cleaner_world.agents.StatefulReflexAgent`.
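A minimal sketch of such an agent is shown below. It is a hypothetical stand-in, not the actual `StatefulReflexAgent` code: the class name `MoveCountingAgent`, the `(location, dirty)` call signature, and the book's action name `Suck` are assumptions made for illustration.

```python
class MoveCountingAgent:
    """Hypothetical reflex agent with state: cleans, moves once, then idles."""

    def __init__(self):
        self.moves = 0  # internal state: number of moves made so far

    def __call__(self, location, dirty):
        if dirty:
            return "Suck"
        if self.moves >= 1:
            return "NoOp"  # both squares have been visited, so stay put
        self.moves += 1
        return "Right" if location == "A" else "Left"


# Percept sequence for an agent that starts in A with both squares dirty.
agent = MoveCountingAgent()
trace = [agent("A", True), agent("A", False), agent("B", True), agent("B", False)]
print(trace)  # ['Suck', 'Right', 'Suck', 'NoOp']
```

The only state is the move counter, which is exactly the information the simple reflex agent was missing: the same percept `("B", False)` now maps to `Left` before the move and to `NoOp` after it.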

In [1]:
from vacuum_cleaner_world.environment import SimpleVacuumWorld
from vacuum_cleaner_world.agents import StatefulReflexAgent

In [2]:
env = SimpleVacuumWorld(move_penalty=True)

In [3]:
for _ in range(10):
    print(env.simulate(StatefulReflexAgent))

1998
1999
1997
1997
1998
1999
1999
1997
1999
1999


The maximum attainable scores, assuming that the first score is recorded after the agent makes its first move, are:

__1999__ if there is initially no dirt, or dirt only in the agent's starting square

__1998__ if there is initially dirt only in the opposite square

__1997__ if there is initially dirt in both squares
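These three values can be reproduced by scoring the plan "suck if needed, move once, suck if needed, then idle" directly. The helper below is a hypothetical sketch written for this check; it assumes two points are available per step, one point is deducted for the single move, and scoring happens after each action.

```python
def plan_score(dirty_start, dirty_other, steps=1000):
    """Score the plan: Suck (if needed), Move once, Suck (if needed), then NoOp."""
    actions = (["Suck"] if dirty_start else []) + ["Move"] + (["Suck"] if dirty_other else [])
    dirt = [dirty_start, dirty_other]  # index 0 = starting square, 1 = opposite square
    here = 0
    score = 0
    for t in range(steps):
        action = actions[t] if t < len(actions) else "NoOp"
        if action == "Suck":
            dirt[here] = False
        elif action == "Move":
            here = 1
            score -= 1  # movement penalty
        score += dirt.count(False)  # one point per clean square, after the action
    return score


print(plan_score(False, False))  # 1999
print(plan_score(True, False))   # 1999
print(plan_score(False, True))   # 1998
print(plan_score(True, True))    # 1997
```

Each lost point has a clear source: one for the move itself, plus one for every time step that ends with the opposite square still dirty.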

Setting the initial dirt and agent location manually shows that `StatefulReflexAgent` attains the maximum score in each of these cases, i.e. that it is rational.

In [4]:
dirty = SimpleVacuumWorld(move_penalty=True, dirt_init='dirty')
print(dirty.simulate(StatefulReflexAgent))

1997


In [6]:
opposite1 = SimpleVacuumWorld(move_penalty=True, dirt_init=[0, 1], init_loc='A')
print(opposite1.simulate(StatefulReflexAgent))

1998


In [7]:
opposite2 = SimpleVacuumWorld(move_penalty=True, dirt_init=[1, 0], init_loc='B')
print(opposite2.simulate(StatefulReflexAgent))

1998


In [8]:
same1 = SimpleVacuumWorld(move_penalty=True, dirt_init=[0, 1], init_loc='B')
print(same1.simulate(StatefulReflexAgent))

1999


In [9]:
same2 = SimpleVacuumWorld(move_penalty=True, dirt_init=[1, 0], init_loc='A')
print(same2.simulate(StatefulReflexAgent))

1999


In [5]:
clean = SimpleVacuumWorld(move_penalty=True, dirt_init='clean')
print(clean.simulate(StatefulReflexAgent))

1999
