# AGENT #

An agent, as defined in 2.1, is anything that can perceive its <b>environment</b> through sensors and act upon that environment through actuators, based on its <b>agent program</b>. This can be a dog, a robot, or even you. As long as you can perceive the environment and act on it, you are an agent. This notebook explains how to implement a simple agent, create an environment, and write a program that helps the agent act on the environment based on its percepts.

Before moving on, review the <b>Agent</b> and <b>Environment</b> classes in <b>[agents.py](https://github.com/aimacode/aima-python/blob/master/agents.py)</b>.

Let's begin by importing all the functions from the agents.py module and creating our first agent - a blind dog.

In [None]:
#from agents import *

#class BlindDog(Agent):
#    def eat(self, thing):
#        print("Dog: Ate food at {}.".format(self.location))
#            
#    def drink(self, thing):
#        print("Dog: Drank water at {}.".format( self.location))
#
#dog = BlindDog()

What we have just done is create a dog who can only feel what's in his location (since he's blind), and can eat or drink. Let's see if he's alive...

In [None]:
#print(dog.alive)

<!--- 
![Cool dog](https://gifgun.files.wordpress.com/2015/07/wpid-wp-1435860392895.gif) This is our dog. How cool is he? Well, he's hungry and needs to go search for food. For him to do this, we need to give him a program. But before that, let's create a park for our dog to play in.
-->

# ENVIRONMENT #

A park is an example of an environment because our dog can perceive and act upon it. The <b>Environment</b> class in agents.py is an abstract class, so we have to create our own subclass of it before we can use it. The subclass must implement the following methods:

<li><b>percept(self, agent)</b> - returns what the agent perceives</li>
<li><b>execute_action(self, agent, action)</b> - changes the state of the environment based on what the agent does.</li>

In [None]:
#class Food(Thing):
#    pass

#class Water(Thing):
#    pass

#class Park(Environment):
#    def percept(self, agent):
#        '''prints & return a list of things that are in our agent's location'''
#        things = self.list_things_at(agent.location)
#        print(things)
#        return things
    
#    def execute_action(self, agent, action):
#        '''changes the state of the environment based on what the agent does.'''
#        if action == "move down":
#            agent.movedown()
#        elif action == "eat":
#            items = self.list_things_at(agent.location, tclass=Food)
#            if len(items) != 0:
#                if agent.eat(items[0]): #Have the dog eat the first item
#                    self.delete_thing(items[0]) #Delete it from the Park after.
#        elif action == "drink":
#            items = self.list_things_at(agent.location, tclass=Water)
#            if len(items) != 0:
#                if agent.drink(items[0]): #Have the dog drink the first item
#                    self.delete_thing(items[0]) #Delete it from the Park after.
                    
#    def is_done(self):
#        '''By default, we're done when we can't find a live agent, 
#        but to prevent killing our cute dog, we will or it with when there is no more food or water'''
#        no_edibles = not any(isinstance(thing, Food) or isinstance(thing, Water) for thing in self.things)
#        dead_agents = not any(agent.is_alive() for agent in self.agents)
#        return dead_agents or no_edibles


## Wumpus Environment

In [None]:
#from ipythonblocks import BlockGrid
#from agents import *

#color = {"Breeze": (225, 225, 225),
#        "Pit": (0,0,0),
#        "Gold": (253, 208, 23),
#        "Glitter": (253, 208, 23),
#        "Wumpus": (43, 27, 23),
#        "Stench": (128, 128, 128),
#        "Explorer": (0, 0, 255),
#        "Wall": (44, 53, 57)
#        }

#def program(percepts):
#    '''Returns an action based on its percepts'''
#    print(percepts)
#    return input()

#w = WumpusEnvironment(program, 7, 7)         
#grid = BlockGrid(w.width, w.height, fill=(123, 234, 123))

#def draw_grid(world):
#    global grid
#    grid[:] = (123, 234, 123)
#    for x in range(0, len(world)):
#        for y in range(0, len(world[x])):
#            if len(world[x][y]):
#                grid[y, x] = color[world[x][y][-1].__class__.__name__]

#def step():
#    global grid, w
#    draw_grid(w.get_world())
#    grid.show()
#    w.step()

# PROGRAM #
Now that we have a <b>Park</b> class, we need to implement a <b>program</b> module for our dog. A program controls how the dog acts upon its environment. Our program will be very simple, and is shown in the table below.
<table>
    <tr>
        <td><b>Percept:</b> </td>
        <td>Feel Food </td>
        <td>Feel Water</td>
        <td>Feel Nothing</td>
   </tr>
   <tr>
       <td><b>Action:</b> </td>
       <td>eat</td>
       <td>drink</td>
       <td>move down</td>
   </tr>
        
</table>


In [None]:
#class BlindDog(Agent):
#    location = 1
    
#    def movedown(self):
#        self.location += 1
        
#    def eat(self, thing):
#        '''returns True upon success or False otherwise'''
#        if isinstance(thing, Food):
#            print("Dog: Ate food at {}.".format(self.location))
#            return True
#        return False
    
#    def drink(self, thing):
#        ''' returns True upon success or False otherwise'''
#        if isinstance(thing, Water):
#            print("Dog: Drank water at {}.".format(self.location))
#            return True
#        return False
        
#def program(percepts):
#    '''Returns an action based on its percepts'''
#    for p in percepts:
#        if isinstance(p, Food):
#            return 'eat'
#        elif isinstance(p, Water):
#            return 'drink'
#    return 'move down'               

In [None]:
#park = Park()
#dog = BlindDog(program)
#dogfood = Food()
#water = Water()
#park.add_thing(dog, 0)
#park.add_thing(dogfood, 5)
#park.add_thing(water, 7)

#park.run(10)

That's how easy it is to implement an agent, its program, and its environment. But that was a very simple case. What if our environment were two-dimensional instead of one-dimensional? And what if we had multiple agents?

To make our Park 2D, we will need to make it a subclass of <b>XYEnvironment</b> instead of Environment. Also, let's add a person to play fetch with the dog.

In [None]:
#class Park(XYEnvironment):
#    def percept(self, agent):
#        '''prints & return a list of things that are in our agent's location'''
#        things = self.list_things_at(agent.location)
#        print(things)
#        return things
    
#    def execute_action(self, agent, action):
#        '''changes the state of the environment based on what the agent does.'''
#        if action == "move down":
#            agent.movedown()
#        elif action == "eat":
#            items = self.list_things_at(agent.location, tclass=Food)
#            if len(items) != 0:
#                if agent.eat(items[0]): #Have the dog eat the first item
#                    self.delete_thing(items[0]) #Delete it from the Park after.
#        elif action == "drink":
#            items = self.list_things_at(agent.location, tclass=Water)
#            if len(items) != 0:
#                if agent.drink(items[0]): #Have the dog drink the first item
#                    self.delete_thing(items[0]) #Delete it from the Park after.
                    
#    def is_done(self):
#        '''By default, we're done when we can't find a live agent, 
#        but to prevent killing our cute dog, we will or it with when there is no more food or water'''
#        no_edibles = not any(isinstance(thing, Food) or isinstance(thing, Water) for thing in self.things)
#        dead_agents = not any(agent.is_alive() for agent in self.agents)
#        return dead_agents or no_edibles

## Notes and exercises from the book.

### Part I Artificial Intelligence  
### 1 Introduction

The basis of the course is the idea of an intelligent agent (to be described later).  The intelligent agent receives input from its environment, is able to do some processing, and then acts on the environment to modify it in some desired fashion.

In some cases, large amounts of training data will make a poor algorithm outperform a good algorithm trained on a smaller amount of data.  (This is very interesting.)  Examples:
- Yarowsky 1995, word sense disambiguation
- Banko and Brill 2001, bootstrap
- Hays and Efros 2007, filling holes in photos

Exercises: 
1.1 Define in your own words intelligence, artificial intelligence, agent, rationality, logical reasoning. 
Intelligence is the ability to take information from external sources, and along with internal stored information, take actions which change the external environment in a way that appears to have a structure or pattern that can be observed and recognized by others.  For example, if a door is closed and locked, and a person tries to open it by reaching into their pocket and seeking a key that fits, this would be considered an act of intelligence.  By contrast, simply kicking the door down would not be considered intelligence, since it is wasteful of resources. 
Artificial Intelligence is simulating the intelligence of living beings by non-living machines, but not necessarily by duplicating the methods (which may be unknown).  The pattern of taking external input, processing it in some way to reach a choice, and then acting on that choice externally are the key elements to reproduce. 
 
Notes on Turing (1950):  In this paper, Alan Turing describes the "Imitation Game", which is what we now think of as the Turing test.  In his original version, there is the interrogator, the test subject, and also a third player who is human and tries to provide evidence of human responses to the interrogator (probably to simulate a control-test experimental setup).  Other parts of the paper describe learning machines, reinforcement learning, and the characterization of "thinking" as being related to storage size.  At a storage size of 1 Gbit or so, Turing believes machines will pass the "Imitation Game" test, in that an interrogator will be unable to distinguish between a human and a machine.  The human brain was estimated at the time of this paper to have approximately 10^10 to 10^15 bits of storage, which is roughly 1 GB to 100 TB.  This is a fairly wide range, but it is easily achievable with current technology, either in the cloud or on local hardware.

### 2 Intelligent Agents  
 
Rational Agent Design: 
A rational agent will maximize the expected performance measure given the percept sequence it has seen so far. 
Five Types of Agents: 
-a- Goal Seeking 
-b- Utility Maximization 
-c- Reflex 
-d- Model based 
-e- Learning agents 
 
Exercises: 
2.1  Suppose that the performance measure is concerned with just the first T steps of the environment and ignores everything thereafter.  Show that a rational agent's action may depend not on just the state of the environment but also the time step it has reached: 
In the general case, the agent needs to include the time step and compare it to T in order to decide on an action.  Once time step T is reached, no further action can change the performance measure, so any action is acceptable.  Before that time step is reached, however, the action taken must be one that maximizes the performance measure over the steps that remain. 
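
As a small illustration of this time dependence, here is a sketch under assumptions that are not part of the exercise: a hypothetical two-square vacuum world where the agent stands on clean square A, square B is dirty, each clean square earns one point per step, and each move costs one point. The helper names `total_score` and `best_action` are made up for this example. The environment state is identical in every call; only the number of steps left before T changes the best action.

```python
def total_score(plan, remaining_steps):
    """Score a fixed plan over the remaining steps.

    Assumed toy model: the agent stands on clean square A, square B is dirty,
    each clean square earns 1 point per step, and each move costs 1 point.
    """
    if plan == "stay":
        # Only A is clean during every remaining step.
        return remaining_steps
    # plan == "go": the first step moves (1 - 1 = 0 points), the second step
    # sucks, and from then on both squares are clean (2 points per step).
    return 2 * max(remaining_steps - 1, 0)


def best_action(remaining_steps):
    """Return the first action of the better plan, given the steps left before T."""
    if total_score("go", remaining_steps) > total_score("stay", remaining_steps):
        return "Right"
    return "NoOp"


for remaining in (1, 2, 3, 10):
    print(remaining, best_action(remaining))
```

With only one or two steps left, staying put is at least as good as moving; with three or more steps left, moving to clean B pays off.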
 
 2.2:   
A) Show that the simple vacuum-cleaner agent function described in Figure 2.3 is indeed rational under the assumptions listed on page 38: 
Note:  A time step consists of observing a percept, taking an action, and having the performance measure updated: 
Definition of a Rational Agent:  For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has. 
Assumptions: 
-a- The performance measure awards one point for each clean square at each time step, over a "lifetime" of 1000 time steps.  (Note:  The performance measure is assessed from the state of the entire environment, not agent, point of view) 
-b- The geography of the environment is known a priori (Figure 2.2), but the dirt distribution and initial location of the agent are not.  Clean squares stay clean and sucking cleans the current square.  The Left and Right actions move the agent left and right except when this would take the agent outside the environment, in which case the agent remains where it is. 
-c- The only available actions are Left, Right, and Suck. 
-d- The agent correctly perceives its location and whether that location contains dirt. 
Solution:   
There are four possible environments that can arise as initial condition (not including the initial location of the agent, which will double the count)  
Case 1:  A clean, B clean 
Case 2: A dirty, B clean 
Case 3: A clean, B dirty 
Case 4: A dirty, B dirty 
----> 
For the following analysis, assume that the agent is always initially placed in square A. 
For Case 1:  The maximum theoretical performance value is 2x1000=2000 points, since at each time step both squares are clean, so we are awarded 1 point for each square and the total lifetime is 1000 steps.  The agent will reproduce this theoretical maximum, because it will merely oscillate between squares A and B. 
For Case 2:  The maximum theoretical performance value is also 2x1000=2000 points.  The agent starts in square A, so the dirt there is cleaned during the first time step, and the measure evaluated at the end of that step is already based on two clean squares.  Our agent performs actions that produce the same result. 
For Case 3:  The maximum theoretical performance value is (1+0) + 2*999=1999.  During the first time step, we only have one clean square, and the agent has to move from A to B in order to clean square B.  From the second time step onward, we have two clean squares and gain 2 points for each time step.  Our agent reproduces these actions and thus matches the maximum possible performance. 
For Case 4:  The maximum theoretical performance value is (1+0) + (1+0) + 2*998=1998.  On the first time step, we can only clean square A, thus obtaining only 1 point.  On the second time step, we move to square B, and only the clean square A gives a point.  On the third and subsequent steps, we have both squares clean, and get a full two points each step.  The agent will match this theoretical performance by its actions.  
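
The four totals above can be checked with a short standalone simulation. This is a sketch that does not use agents.py; the function names `reflex_vacuum` and `run` are made up for this example, and the performance measure is assumed to be assessed after each action, as in the note on time steps above.

```python
def reflex_vacuum(location, status):
    """Simple reflex program of Figure 2.3: suck if dirty, otherwise switch squares."""
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


def run(dirty_squares, start="A", steps=1000):
    """Simulate the two-square world and return the accumulated performance."""
    dirty = set(dirty_squares)
    location = start
    score = 0
    for _ in range(steps):
        status = "Dirty" if location in dirty else "Clean"
        action = reflex_vacuum(location, status)
        if action == "Suck":
            dirty.discard(location)
        elif action == "Right":
            location = "B"
        elif action == "Left":
            location = "A"
        # One point per clean square, assessed after the action.
        score += 2 - len(dirty)
    return score


cases = {"Case 1": [], "Case 2": ["A"], "Case 3": ["B"], "Case 4": ["A", "B"]}
for name, dirt in cases.items():
    print(name, run(dirt))
```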
 
B)  Describe a rational agent function for the case where each movement costs one point.  Does the corresponding agent program require internal state? 
A key observation is that once a given square is clean you don't want to waste points by moving back to it, so we need to remember whether or not a cell is clean. 
Like in part A, there are four cases that the environment can be in (8 if you include the position of the vacuum). 
Set of rules for agent program: 
[A, clean; A-State:unknown, B-state:unknown] -> [Right, A-State=clean, B-State=unknown]  
[B, clean; A-State:unknown, B-state:unknown] -> [Left, A-State=unknown, B-State=Clean] 
[A, dirty; A-State:unknown, B-state:unknown] -> [Suck, A-State=clean, B-State=unknown] 
[B, dirty; A-State:unknown, B-state:unknown] -> [Suck, A-State=unknown, B-State=clean] 
[A, clean; A-State:clean, B-state:unknown] -> [Right, A-State=clean, B-State=unknown] 
[B, clean; A-State:clean, B-state:unknown] -> [Suck, A-State=clean, B-State=clean] 
XX[A, dirty; A-State:clean, B-state:unknown] -> [Left, A-State=unknown, B-State=unknown] 
[B, dirty; A-State:clean, B-state:unknown] -> [Suck, A-State=clean, B-State=clean] 
[A, clean; A-State:unknown, B-state:clean] -> [Suck, A-State=clean, B-State=clean]  
[B, clean; A-State:unknown, B-state:clean] -> [Left, A-State=unknown, B-State=clean] 
[A, dirty; A-State:unknown, B-state:clean] -> [Suck, A-State=clean, B-State=clean] 
XX[B, dirty; A-State:unknown, B-state:clean] -> [Left, A-State=unknown, B-State=unknown] 
[A, clean; A-State:clean, B-state:clean] -> [Suck, A-State=clean, B-State=clean] 
[B, clean; A-State:clean, B-state:clean] -> [Suck, A-State=clean, B-State=clean] 
XX[A, dirty; A-State:clean, B-state:clean] -> [Left, A-State=unknown, B-State=unknown] 
XX[B, dirty; A-State:clean, B-state:clean] -> [Left, A-State=unknown, B-State=unknown] 
 
2.3) 
An agent that senses only partial information about the state cannot be perfectly rational. 
The definition of a rational agent is: For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has. 
By this definition, as long as the agent selects an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence, it is rational.  If the percept sequence is something like [A, maybe dirty], there will be an action for this case that will attempt to maximize the known performance measure. 
FALSE:  Here is an example where the agent is perfectly rational.  The environment is one square and a point is given for each time step where the square is clean.  The only action is Suck.  Assume the worst case, in which the agent cannot detect whether the square is clean or not.  This becomes irrelevant because at each time step the agent will take the single action Suck and either clean the square if it is dirty, and get the point, or simply suck on an already clean square, with no loss.  Under the given performance measure (which awards points for clean squares) this is optimal. 
 
There exist task environments in which no pure reflex agent can behave rationally. 
TRUE:  A pure reflex agent only uses the current percept to make a decision.  It is not allowed to store information about previous percepts.  Imagine a situation where a point is deducted for each move on the two-square vacuum world, and one point is given for each clean square.  Once a square has been cleaned, the agent shouldn't return to it.  However, without this knowledge being stored, the agent is destined to repeatedly return to previously cleaned squares, which is not rational, given that these percepts have already been observed.  This assumes that the reflex agent is restricted to observing only the square it is currently on. 

# Chapter 2 Exercises

2.7)  Write pseudocode for the goal-based and utility based agents:

**Goal-based:**

    while goal == false:
        currentDeltaAction = 0
        currentBestAction = none
        for iAction in listOfActions:
            if deltaValue(iAction) > currentDeltaAction:
                currentDeltaAction = deltaValue(iAction)
                currentBestAction = iAction
        agent_action(currentBestAction)


**Utility-based:**

    while true:
        currentBestUtility = 0
        currentBestAction = none
        for iAction in listOfActions:
            if deltaUtility(iAction) > currentBestUtility:
                currentBestUtility = deltaUtility(iAction)
                currentBestAction = iAction
        agent_action(currentBestAction)
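
A runnable version of the utility-based loop might look like the sketch below. Everything in it is an assumption made for illustration: the helper `choose_action`, the number-line world, and the `result` and `utility` functions are hypothetical and do not come from agents.py. It selects the action whose predicted successor state has the highest utility, which is the core of the utility-based pseudocode.

```python
def choose_action(state, actions, result, utility):
    """Pick the action whose predicted successor state has the highest utility."""
    return max(actions, key=lambda action: utility(result(state, action)))


# Hypothetical toy world: the state is a position on a number line and the
# agent prefers to be as close as possible to position 3.
def result(state, action):
    return state + {"left": -1, "stay": 0, "right": 1}[action]


def utility(state):
    return -abs(state - 3)


state = 0
for _ in range(5):
    action = choose_action(state, ["left", "stay", "right"], result, utility)
    state = result(state, action)
print(state)
```

The agent walks right until it reaches position 3 and then stays there, because every other action would lower the utility.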

2.8) Implement a performance-measuring environment simulator for the vacuum-cleaner world depicted in Figure 2.2 and specified on page 38.  Your implementation should be modular so that the sensors, actuators, and environment characteristics (size, shape, dirt placement, etc.) can be changed easily.

Agent Name:  Vacuum Robot Agent
-------------------------------
*Performance Measure:*  +1 point for each clean square at each time step, for 1000 time steps

*Environment:*  Two squares at positions (0,0) and (1,0).  The squares can either be dirty or clean.  The agent cannot go outside those two positions.

*Actuators:*  The actuators for the agent consist of the ability to move between the squares and the ability to suck up dirt.

*Sensors:*  The sensors allow the agent to know its current location and whether there is dirt at the square it currently occupies.

In [None]:
from agents import *

# Define the dirt clump class
class DirtClump(Thing):
    pass

#Define the environment class
class adxyz_VacuumEnvironment(XYEnvironment):

# Need to override the percept method 
    def percept(self, agent):
        print ()
        print ("In adxyz_VacuumEnvironment - percept override:")
        print ("Self = ", self)
        print ("Self.things = ", self.things)
        print ("Agent ID = ", agent)
        print ("Agent location = ", agent.location)
        print ("Agent performance = ", agent.performance)
        
        for iThing in self.things:
            if iThing.location==agent.location:  #check location
                if iThing != agent:  # Don't return agent information
                    if (isinstance(iThing, DirtClump)):
                        print ("A thing which is not agent, but a dirt clump = ", iThing )
                        print ("Location = ", iThing.location)
                        return agent.location, "DirtClump"
                    
        return agent.location, "CleanSquare"  #Default, if we don't find a dirt clump.
                
# Need to override the action method (and update performance measure.)
    def execute_action(self, agent, action):
        print ()
        print ("In adxyz_VacuumEnvironment - execute_action override:")
        print("self = ", self)
        print("agent = ", agent)
        print("current agent action = ", action)
        print()
        if action=="Suck":
            print("Action-Suck")
            print("Need to remove dirt clump at correct location")
            for iThing in self.things:
                if iThing.location==agent.location:  #check location
                    if (isinstance(iThing, DirtClump)):  # Only suck dirt
                        print ("A thing which is not agent, but a dirt clump = ", iThing)
                        print ("Location of dirt clod = ", iThing.location)
                        self.delete_thing(iThing)
                        break  # can only do one deletion per action.
                                   
        elif action=="MoveRight":
            print("Action-MoveRight")
            print("agent direction before MoveRight = ", agent.direction)
            print("agent location before MoveRight = ", agent.location)
            agent.bump = False
            agent.direction.direction = "right"
            agent.bump = self.move_to(agent, agent.direction.move_forward(agent.location))
            print("agent direction after MoveRight = ", agent.direction)
            print("agent location after MoveRight = ", agent.location)
            print()
            
        elif action=="MoveLeft":
            print("Action-MoveLeft")
            print("agent direction before MoveLeft = ", agent.direction)
            print("agent location before MoveLeft = ", agent.location)
            agent.bump = False
            agent.direction.direction = "left"
            agent.bump = self.move_to(agent, agent.direction.move_forward(agent.location))
            print("agent direction after MoveLeft = ", agent.direction)
            print("agent location after MoveLeft = ", agent.location)
            print()
            
        elif action=="DoNothing":
            print("Action-DoNothing")
            
        else:
            print("Action-Not Understood")  #probably error.  Don't go to score section.
            return
                
###
### Count up number of clean squares (indirectly)
### and add that to the agent performance score
###

        print("Before dirt count update, agent.performance = ", agent.performance)
        dirtCount=0
        for iThing in self.things:
            if isinstance(iThing, DirtClump):
                dirtCount = dirtCount+1

        cleanSquareCount = self.width*self.height-dirtCount 
        agent.performance=agent.performance + cleanSquareCount
        print("After execute_action, agent.performance = ", agent.performance)
        return    

2.9) Implement a simple reflex agent for the vacuum environment in Exercise 2.8.  Run the environment with this agent for all possible initial dirt configurations and agent locations.  Record the performance score for each configuration and the overall average score.

In [None]:
#
# The program for the simple reflex agent is:
# 
# Percept:         Action:
# --------         -------
# [(0,0),Clean] -> Right
# [(0,0),Dirty] -> Suck
# [(1,0),Clean] -> Left
# [(1,0),Dirty] -> Suck
#

def adxyz_SimpleReflexVacuum(percept):
     
    if percept[0] == (0,0) and percept[1]=="DirtClump":
        return "Suck"
    elif percept[0] == (1,0) and percept[1]=="DirtClump":
        return "Suck"
    elif percept[0] == (0,0) and percept[1]=="CleanSquare":
        return "MoveRight"
    elif percept[0] == (1,0) and percept[1]=="CleanSquare":
        return "MoveLeft"
    else:
        return "DoNothing" # Not sure how you would get here, but DoNothing to be safe.

# Define a simple reflex vacuum agent class
class adxyz_SimpleReflexVacuumAgent(Agent):
    pass

In [None]:
# Define the initial dirt configurations
initDirt=[]
initDirt.append([])             # neither location dirty - format(X,Y)-locations:A=(0,0), B=(1,0)
###initDirt.append([(0,0)])        # square A dirty, square B clean
##initDirt.append([(1,0)])        # square A clean, square B dirty
###initDirt.append([(0,0),(1,0)])  # square A dirty, square B dirty

print("initDirt = ", initDirt)

#
# Create agent placements
#
initAgent=[]
initAgent.append((0,0))
initAgent.append((1,0))
print("initAgent = ", initAgent)

In [None]:
# Create a loop over environments to run simulation

# Loop over agent placements
for iSimAgentPlacement in range(len(initAgent)):
###for iSimAgentPlacement in range(1):
    print("Simulation: iSimAgentPlacement = ", iSimAgentPlacement)

# Loop over dirt placements
    for iSimDirtPlacement in range(len(initDirt)):
        print ("Simulation: iSimDirtPlacement = " , iSimDirtPlacement)
        myVacEnv = adxyz_VacuumEnvironment() #Create a new environment for each dirt/agent setup
        myVacEnv.width = 2
        myVacEnv.height = 1

        for iPlace in range(len(initDirt[iSimDirtPlacement])):
            print ("Simulation: iPlace = " , iPlace)
            myVacEnv.add_thing(DirtClump(),location=initDirt[iSimDirtPlacement][iPlace])
            
#
# Now setup the agent.
#
        myAgent=adxyz_SimpleReflexVacuumAgent()
        myAgent.program=adxyz_SimpleReflexVacuum  #Place the agent program here
        myAgent.performance=0

# Instantiate a direction object for 2D generality
        myAgent.direction = Direction("up")  # need to leverage heading mechanism
        
# Add agent to environment
        myVacEnv.add_thing(myAgent,location=initAgent[iSimAgentPlacement])
        print()
        print("Environment:")
        for iThings in myVacEnv.things:
            print(iThings, iThings.location)
        print()
        
#
# Now step the environment clock
#
        numSteps = 5
        for iStep in range(numSteps):
            print()
            print("<-START->")
            print("Simulation: step =", iStep)
            myVacEnv.step()
            print("---END---")
            print("---------")
            print()
            
        print()    
        print("<====>")
        print("<====>")
        #need to keep running tally of initial configuration and final performance
        print("Final performance measure for Agent = ", myAgent.performance)
        print("======")
        print("======")
        print()
#
# End of script
#

Todo:
- Clean up comments/prints (mostly done)
- Make processing more generalized
-- Introduce multiple dirt clods.
-- Introduce multiple agents.
- Move data to cloud

2.10) Consider the modified version of the performance metric where the agent is penalized one point for each movement:

a) Can a simple reflex agent be perfectly rational for this environment?

For this problem, there are 8 cases to consider (8 states of the environment):  4 initial dirt configurations and 2 initial agent configurations.

Case 1a) Clean A, Clean B, agent in square A: The maximum performance score would be 2 points awarded at each step, because there are two clean squares.  If we were to design a reflex agent, we could use the following program:  [(clean, squareA)-->DoNothing]
Case 1b) Clean A, Clean B, agent in square B: The maximum performance score would be 2 points awarded at each step, because there are two clean squares.  If we were to design a reflex agent, we could use the following program: [(clean, SquareB)-->DoNothing]

Case 2a) Dirt A, Clean B, agent in square A:  The maximum performance score would be 2 points, once the dirt is removed from square A.  The agent program that could accomplish this is:  [(dirt, squareA)-->suck], [(clean, squareA)-->DoNothing]
Case 2b) Dirt A, Clean B, agent in square B:  The maximum performance score would be 1-1 (1 point for clean B, -1 for move to A), then 2 points for each step after that.  The agent program that could accomplish this is: [(clean, squareB)-->MoveLeft], [(dirt, squareA)-->suck], [(clean, squareA)-->DoNothing].  However, this is in conflict with the optimum program for Case 1b.

Case 3a) Clean A, Dirt B, agent in squareA:  The maximum performance score would be 1 (for clean initial square) -1 (for move to B) = 0 points for step 1, then 2 points each step from then on.  The agent program that could accomplish this would be:  [(clean, squareA)-->MoveRight], [(Dirt, SquareB)-->Suck], [(clean,SquareB)-->doNothing].  However, we can see from this situation that our program for 3a is in conflict with the program for 1a.
Case 3b) Clean A, Dirt B, agent in squareB:  The maximum performance score for this would be 2 per time step.  The following agent program could accomplish this:  [(Dirt, SquareB)-->Suck], [(clean,SquareB)-->doNothing].

Case 4a) Dirt A, Dirt B, Agent in Square A:  The maximum possible performance points would be 1 for first step, 1-1 for second step, 2 points from that step onwards.  An agent function that could accomplish this is:  [(dirt,squareA)-->suck], [(clean,squareA)-->moveRight], [(dirt,SquareB)-->suck], [(clean,SquareB)-->doNothing].  However, this includes an instruction which is in conflict with the optimum program in case 1a.

Case 4b) Dirt A, Dirt B, Agent in Square B:  The maximum possible performance points would be 1 for the first step, 1-1 for the second step, and 2 points from that step onwards.  An agent function that could accomplish this is:  [(dirt, squareB)-->suck], [(clean,squareB)-->moveLeft], [(dirt,squareA)-->suck], [(clean, squareA)-->doNothing].  This includes an instruction which is in conflict with case 1b.  

Because we have conflicting instructions in order to achieve optimum performance results, we would have to choose one or the other, which would lead to a suboptimal result in at least one case.  Thus a perfectly rational agent cannot be designed.  By perfectly rational, I mean one that is optimum in every case, since we must assume all cases are possible to occur.
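
The conflict can also be checked by brute force. The sketch below is a standalone toy model, not the notebook's `adxyz_VacuumEnvironment`. It assumes a short horizon of 10 steps, charges one point for every Left or Right action (including attempts that stay in place), and compares each of the 256 possible stateless percept-to-action tables against the best score any such table reaches in each of the 8 configurations. The names `score`, `configs`, `tables`, and `winners` are made up for this example.

```python
from itertools import product

PERCEPTS = [("A", "Clean"), ("A", "Dirty"), ("B", "Clean"), ("B", "Dirty")]
ACTIONS = ["Suck", "Left", "Right", "NoOp"]
STEPS = 10  # assumed short horizon for the illustration


def score(table, dirt, start):
    """Run one reflex table; +1 per clean square per step, -1 per move action."""
    dirty, location, total = set(dirt), start, 0
    for _ in range(STEPS):
        status = "Dirty" if location in dirty else "Clean"
        action = table[(location, status)]
        if action == "Suck":
            dirty.discard(location)
        elif action in ("Left", "Right"):
            location = "A" if action == "Left" else "B"
            total -= 1
        total += 2 - len(dirty)
    return total


# 4 dirt layouts x 2 start squares = 8 configurations.
configs = [(dirt, start) for dirt in ([], ["A"], ["B"], ["A", "B"]) for start in "AB"]
# Every stateless mapping from the 4 percepts to the 4 actions.
tables = [dict(zip(PERCEPTS, choice)) for choice in product(ACTIONS, repeat=4)]
results = [[score(t, dirt, start) for dirt, start in configs] for t in tables]
best = [max(column) for column in zip(*results)]
winners = [t for t, row in zip(tables, results) if row == best]
print("best score per configuration:", best)
print("tables optimal in every configuration:", len(winners))
```

No table reaches the best score in all 8 configurations at once, which matches the argument above: the rule for (clean, squareA) that is best in Case 1a is different from the one needed in Case 3a.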

b) What about a reflex agent with state?  Design such an agent.

In [None]:
#
# The program for the simple reflex agent with state is:
# 
# Percept:         Action:
# --------         -------
# [(0,0),Clean] -> Right
# [(0,0),Dirty] -> Suck
# [(1,0),Clean] -> Left
# [(1,0),Dirty] -> Suck
#

def adxyz_SimpleReflexStateVacuum(percept):
     
    if percept[0] == (0,0) and percept[1]=="DirtClump":
        return "Suck"
    elif percept[0] == (1,0) and percept[1]=="DirtClump":
        return "Suck"
    elif percept[0] == (0,0) and percept[1]=="CleanSquare":
        return "MoveRight"
    elif percept[0] == (1,0) and percept[1]=="CleanSquare":
        return "MoveLeft"
    else:
        return "DoNothing" # Not sure how you would get here, but DoNothing to be safe.

# Define a simple reflex vacuum agent class (with state)
class adxyz_SimpleReflexStateVacuumAgent(Agent):
    pass
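
One possible way to give the agent program actual internal state is sketched below. It is an assumption-laden sketch rather than a finished solution: `make_stateful_vacuum` is a hypothetical helper, the state is simply the set of squares the agent believes are clean, and the percept format and action names are taken from the cells above. It is not claimed to be optimal in every configuration.

```python
def make_stateful_vacuum():
    """Build a reflex-with-state program that remembers which squares are clean."""
    known_clean = set()  # internal state: squares the agent believes are clean

    def program(percept):
        location, status = percept
        # After this step the current square is clean either way:
        # it was already clean, or the agent is about to suck the dirt.
        known_clean.add(location)
        if status == "DirtClump":
            return "Suck"
        if known_clean >= {(0, 0), (1, 0)}:
            # Both squares are known to be clean, so moving would only cost points.
            return "DoNothing"
        return "MoveRight" if location == (0, 0) else "MoveLeft"

    return program


program = make_stateful_vacuum()
print(program(((0, 0), "DirtClump")))
print(program(((0, 0), "CleanSquare")))
print(program(((1, 0), "CleanSquare")))
```

Because the program is an ordinary function of the percept, it could be assigned to `myAgent.program` in the same way as `adxyz_SimpleReflexVacuum`; a fresh program should be created for each simulation run so that the remembered state starts empty.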

# Part II Problem Solving 

## Chapter 3 (Solving Problems by Searching)

We are looking to design agents that can solve goal-seeking problems.

Step 1:  Define the goal, which is a state of the environment.  For example, the desired goal might be "Car in Bucharest" or "Robot in square (10,10) with all squares clean".

Step 2:  Define the problem.
- Define the states of the environment (atomic)
- Define the initial state
- Define legal actions
- Define transitions (How the states change based on the actions)
- Define goal test
- Define path/step costs

graph-search:  A key algorithm for expanding the search space that avoids redundant paths.  The search methods in this chapter are based on the graph-search algorithm.

Each step of the algorithm moves a state along this sequence:
Unexplored states -> frontier states -> explored states.

A state can be in only one of the three categories above at any time.

Infrastructure for search algorithms:
Graphs - nodes that include references to
- parent nodes
- state descriptions
- the action that got from parent to child node
- path cost (from initial state)

Types of cost:
- Search cost (time to determine the solution)
- Path cost (cost of the actual solution, for example distance on a roadmap)
- Total cost:  Sum of search + path cost (with appropriate scaling to put them in common units)

## Types of Search Strategies

Algorithm evaluation criteria:
- Completeness (Does the algorithm find a solution, or all solutions, when one exists?)
- Optimality (Does the algorithm find the best solution?)
- Time complexity (How long does the algorithm take to find a solution?)
- Space complexity (How much memory is used?)

### Uninformed search

This includes all search algorithms that have no information about whether one non-goal state is "more promising" than another.  These algorithms can only generate successor states and test whether each one is a goal state.

- Breadth-first search:  Each node is expanded into the successor nodes one level at a time.  Uses a FIFO queue for the frontier.

Pseudo-code for BFS search:

    exploredNodes = set()
    frontierNodes = queue.Queue()      # FIFO frontier
    frontierNodes.put(initialNode)
    initialNode.frontier = True
    initialNode.parent = initialNode   # the initial node is its own parent
    goalNode = None

    while not frontierNodes.empty():
        currentNode = frontierNodes.get()
        if currentNode.goal:
            goalNode = currentNode
            break
        exploredNodes.add(currentNode)   # add current node to explored nodes
        for childNode, stepCost in currentNode.links.items():  # any link is a "child"
            if (childNode in exploredNodes) or childNode.frontier:
                continue
            frontierNodes.put(childNode)
            childNode.frontier = True
            childNode.parent = currentNode
            childNode.stepCost = stepCost   # provide step cost
            childNode.pathCost = currentNode.pathCost + stepCost

    if goalNode is None:   # goal node was not found
        raise LookupError("Goal node not found.")
        
Need to start at goal node and work back to initial state to provide solution pathway:

    pathSequence = queue.LifoQueue()

    currentNode = goalNode
    pathSequence.put(currentNode)

    while currentNode != currentNode.parent:   # the initial node is its own parent
        pathSequence.put(currentNode.parent)   # this also pushes the initial node
        currentNode = currentNode.parent

    while not pathSequence.empty():
        print("Path sequence = ", pathSequence.get())
    

We want to create a generic graph that could be undirected in general and search it using BFS and a FIFO frontier queue.

In [30]:
class GraphNode():
    def __init__(self, initName):
        self.links=dict()        # {linked node: step cost}
        self.parent=None         # Is assigned during BFS
        self.goal=False          # True if goal state
        self.pathCost=0
        self.stepCost=0
        self.frontier=False      # True if node has been added to frontier
        self.name=initName

In [31]:
#
# create map
#
#   Node1 ----10---- Node2 ----7---- Node6
#     |            /   |
#     |          28    6
#     15        /      |
#     |       /      Node5
#     |     /       /
#     |   /       8
#     | /       /
#   Node3 -----
#     |
#     17
#     |
#   Node4
#
Node1=GraphNode("Node1")
Node2=GraphNode("Node2")
Node3=GraphNode("Node3")
Node4=GraphNode("Node4")
Node5=GraphNode("Node5")
Node6=GraphNode("Node6")

Node1.links[Node2]=10
Node1.links[Node3]=15

Node2.links[Node1]=10
Node2.links[Node3]=28
Node2.links[Node5]=6
Node2.links[Node6]=7

Node3.links[Node1]=15
Node3.links[Node2]=28
Node3.links[Node4]=17
Node3.links[Node5]=8

Node4.links[Node3]=17

Node5.links[Node2]=6
Node5.links[Node3]=8

Node6.links[Node2]=7

print("NodeSetup:")
print("Node1 = ", Node1)
print("Node2 = ", Node2)
print("Node3 = ", Node3)
print("Node4 = ", Node4)
print("Node5 = ", Node5)
print("Node6 = ", Node6)

print("Node1 links = ", Node1.links)
print("Node2 links = ", Node2.links)
print("Node3 links = ", Node3.links)
print("Node4 links = ", Node4.links)
print("Node5 links = ", Node5.links)
print("Node6 links = ", Node6.links)

Node1.parent=Node1  # node1 is the initial node - pointing to itself as parent is the flag.

Node6.goal=True
print("Node6.goal = ", Node6.goal)
print()

NodeSetup:
Node1 =  <__main__.GraphNode object at 0x103ba8fd0>
Node2 =  <__main__.GraphNode object at 0x103ba8f98>
Node3 =  <__main__.GraphNode object at 0x103bc90b8>
Node4 =  <__main__.GraphNode object at 0x103bc90f0>
Node5 =  <__main__.GraphNode object at 0x103bc9128>
Node6 =  <__main__.GraphNode object at 0x103bc9160>
Node1 links =  {<__main__.GraphNode object at 0x103ba8f98>: 10, <__main__.GraphNode object at 0x103bc90b8>: 15}
Node2 links =  {<__main__.GraphNode object at 0x103bc9128>: 6, <__main__.GraphNode object at 0x103bc90b8>: 28, <__main__.GraphNode object at 0x103ba8fd0>: 10, <__main__.GraphNode object at 0x103bc9160>: 7}
Node3 links =  {<__main__.GraphNode object at 0x103ba8f98>: 28, <__main__.GraphNode object at 0x103bc9128>: 8, <__main__.GraphNode object at 0x103ba8fd0>: 15, <__main__.GraphNode object at 0x103bc90f0>: 17}
Node4 links =  {<__main__.GraphNode object at 0x103bc90b8>: 17}
Node5 links =  {<__main__.GraphNode object at 0x103ba8f98>: 6, <__main__.GraphNode objec

In [32]:
#
# Run the BFS process
#

import queue

###exploredNodes = dict()
frontierNodes = queue.Queue()
goalNodeFound = False

#
# Initialize the frontier queue
#

frontierNodes.put(Node1)
Node1.frontier=True

# Main loop

while not frontierNodes.empty():
    print("Exploring frontier nodes: ")
    currentNode = frontierNodes.get()
    if currentNode.goal == True:
        goalNodeFound=True
        break
    else:   
        print("Expanding current node: ", currentNode.name)
        for childNode,dummy in currentNode.links.items():  #Any link is a potential "child" 
            if (childNode.frontier==True):
                print("Child Node has been seen before: ", childNode.name)
                continue
            else:
                print("Child Node is being added to frontier: ", childNode.name)
                frontierNodes.put(childNode)
                childNode.frontier=True
                childNode.parent=currentNode
                childNode.stepCost=childNode.links[currentNode]  # provide step cost
                childNode.pathCost=currentNode.pathCost+childNode.stepCost
    
    print("End of frontier loop:")
    print("-------")
    print()
                
if goalNodeFound != True:  # goal node was not set
    print ("Goal node not found.")
else:
    print ("Goal node found.")
    print ("Current Node = ", currentNode.name)
    print ("Current Node Parent = ", currentNode.parent.name)
    print ("Current Node Step Cost = ", currentNode.stepCost)
    print ("Current Node Path Cost = ", currentNode.pathCost)

Exploring frontier nodes:
Expanding current node:  Node1
Child Node is being added to frontier:  Node2
Child Node is being added to frontier:  Node3
End of frontier loop:
-------

Exploring frontier nodes:
Expanding current node:  Node2
Child Node is being added to frontier:  Node5
Child Node has been seen before:  Node3
Child Node has been seen before:  Node1
Child Node is being added to frontier:  Node6
End of frontier loop:
-------

Exploring frontier nodes:
Expanding current node:  Node3
Child Node has been seen before:  Node2
Child Node has been seen before:  Node5
Child Node has been seen before:  Node1
Child Node is being added to frontier:  Node4
End of frontier loop:
-------

Exploring frontier nodes:
Expanding current node:  Node5
Child Node has been seen before:  Node2
Child Node has been seen before:  Node3
End of frontier loop:
-------

Exploring frontier nodes:
Goal node found.
Current Node =  Node6
Current Node Parent =  Node2
Current Node Step Cost =  7
Current Node Pat

In [33]:
#
# Report out the solution path working backwards from the goal node to the
# initial node (which is flagged by having the parent=node)
#

pathSequence = queue.LifoQueue()
pathSequence.put(currentNode)

while currentNode != currentNode.parent:
    pathSequence.put(currentNode.parent)
    currentNode=currentNode.parent

# The initial node is already on the stack: it was pushed as the parent of its child.
# The initial node was specially marked to point to itself as parent, which ends the loop.

while not pathSequence.empty():
    print("Path sequence = ", pathSequence.get().name)

Path sequence =  Node1
Path sequence =  Node2
Path sequence =  Node6
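
The three cells above can also be folded into one reusable function. The version below is a minimal sketch, not the notebook's own implementation: it assumes the map is stored as a plain dictionary `{node name: {neighbor name: step cost}}`, and the names `bfs_path`, `graph`, and `parent` are hypothetical. Note that BFS returns the path with the fewest edges, which is not necessarily the path with the lowest cost.

```python
from collections import deque


def bfs_path(graph, start, goal):
    """Return the path with the fewest edges from start to goal, or None."""
    parent = {start: None}          # doubles as the set of nodes seen so far
    frontier = deque([start])       # FIFO frontier
    while frontier:
        node = frontier.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for child in graph[node]:
            if child not in parent:
                parent[child] = node
                frontier.append(child)
    return None


# Same map as above, written as {node: {neighbor: step cost}}.
graph = {
    "Node1": {"Node2": 10, "Node3": 15},
    "Node2": {"Node1": 10, "Node3": 28, "Node5": 6, "Node6": 7},
    "Node3": {"Node1": 15, "Node2": 28, "Node4": 17, "Node5": 8},
    "Node4": {"Node3": 17},
    "Node5": {"Node2": 6, "Node3": 8},
    "Node6": {"Node2": 7},
}
path = bfs_path(graph, "Node1", "Node6")
cost = sum(graph[a][b] for a, b in zip(path, path[1:]))
print(path, cost)
```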


### Uniform Cost search

This approach uses a priority queue for the frontier, ordered by the path cost from the initial state to each node (the total path cost, not the cost of the last step).  The node with the smallest path cost is expanded first.

    Example:

    Sib-----99-----Fagar
    |               |
    |               |
    80              |
    |               |
    |               |
    RimV           211      
    |               /
    |              /  
    97            /  
    |            /
    |           /
    Pit        /
    |         /
    |        /
    101     /
    |      /
    |     /
    Bucharest

Initialize:
Frontier <- Sib

Processing steps:
1. Pop frontier (priority queue, total path cost order).
2. Goal test.
3. Generate descendant nodes and insert them into the frontier.

Order of examining nodes.
    1. Frontier(Sib)  
    2. Frontier.pop -> Sib
    3. GoalTest(Sib) -> False
    4. Expand(Sib) -> RimV, Fagar
    5. Frontier(RimV, Fagar)
    6. Frontier.pop -> RimV
    7. GoalTest(RimV) -> False
    8. Expand(RimV) -> Pit
    9. Frontier(Fagar, Pit)
    10. Frontier.pop -> Fagar
    11. GoalTest(Fagar) -> False
    12. Expand(Fagar) -> Fagar-Bucharest
    13. Frontier(Pit, Fagar-Bucharest)
    14. Frontier.pop -> Pit
    15. GoalTest(Pit) -> False
    16. Expand(Pit) -> Pit-Bucharest
    17. Frontier(Pit-Bucharest, Fagar-Bucharest)
    18. Frontier.pop -> Pit-Bucharest
    19. GoalTest(Pit-Bucharest) -> True
    20. STOP
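
The trace above can be reproduced with a short program. This is a minimal sketch under assumptions: the map is stored as a dictionary of road distances copied from the diagram, the frontier is a `heapq` priority queue, and the names `uniform_cost_search` and `roads` are hypothetical. Instead of replacing the Fagar-Bucharest entry in the frontier, the sketch leaves it in the heap; the cheaper Pit-Bucharest entry is popped first, so the stale entry is never used.

```python
import heapq


def uniform_cost_search(graph, start, goal):
    """Return (path cost, path) for the cheapest path, or None if unreachable."""
    frontier = [(0, start, [start])]   # priority queue ordered by path cost
    expanded = set()
    while frontier:
        cost, city, path = heapq.heappop(frontier)
        if city == goal:               # goal test when popped, not when generated
            return cost, path
        if city in expanded:
            continue                   # stale entry: already expanded via a cheaper path
        expanded.add(city)
        for neighbor, step in graph[city].items():
            if neighbor not in expanded:
                heapq.heappush(frontier, (cost + step, neighbor, path + [neighbor]))
    return None


roads = {
    "Sib": {"RimV": 80, "Fagar": 99},
    "RimV": {"Sib": 80, "Pit": 97},
    "Fagar": {"Sib": 99, "Bucharest": 211},
    "Pit": {"RimV": 97, "Bucharest": 101},
    "Bucharest": {"Fagar": 211, "Pit": 101},
}
result = uniform_cost_search(roads, "Sib", "Bucharest")
print(result)
```

The route through Fagar has fewer steps but costs 99 + 211 = 310, while the route through RimV and Pit costs 80 + 97 + 101 = 278, which is why the goal test must wait until a node is popped.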

#### Related proofs

- Prove optimality of uniform cost search (TBD)

### Part II Problem Solving  
#### 3 Solving Problems by Searching  
The chapter introduces solutions for environments that are deterministic, observable, static, and completely known.

Proof that uniform cost search is optimal (the algorithm will find the path to the goal state with the smallest path cost).

Algorithm (the frontier is a priority queue with path cost as the priority; a smaller path cost means a higher priority):
1. Initialize:  Frontier <- Initial State
2. Frontier.pop -> Ns = node with the smallest path cost in Frontier
3. GoalTest(Ns).  If True, then stop; else expand Ns (and mark it as expanded) and place its children in Frontier.
4. Repeat steps 2 & 3.

Lemma 1:  Any path from the starting node to an unexpanded node in the graph must cross the Frontier.  This is the graph separation property.

Definitions:  A graph can be partitioned into three mutually exclusive sets.

Expanded nodes:  Nodes that lie on a path from the initial state node to a frontier node.  A node becomes expanded in two steps:  (1) it is added to the frontier, and (2) its descendants are added to the frontier, at which point the node itself is marked as "expanded" and removed from the frontier set.

Frontier nodes:  Nodes that are currently in the frontier and have not yet been expanded.

Unexpanded nodes:  All other nodes in the graph.

Proof:

Base case.  The start node is placed in the frontier during initialization, so any path between the initial node and another node has to pass through the frontier.

Inductive step:  Assume a node is a frontier node.  We add all of its descendants to the frontier and remove the original node from the frontier, marking it "expanded."  Note that some descendant nodes might already be frontier nodes.  There are two possible ways to reach the original node from an unexpanded node.

Path 1:  Through a descendant node.  Since each descendant node is on the frontier, this would entail crossing the frontier.

Path 2:  Through the parent(s) of the original node.  However, since the original node was in the frontier, its parent must have been marked as expanded, meaning that all of the parent's descendants had to be in the frontier, which prevents an unexpanded node from reaching the parent without crossing the frontier.  This argument can be repeated by induction until the initial node is reached, where, by definition, it is already inside the frontier and cannot be reached by an unexpanded node without crossing the frontier.

Lemma 2:  At each step, the unexpanded node with the smallest path cost will be selected from the Frontier for expansion.

Base case:  The start node is placed in the frontier.  Its path cost is zero (we assume that every path that moves to a different node has a positive cost).  Thus, the start node will be selected for expansion, since the smallest possible path cost is zero.

Inductive case:   The unexpanded node with smallest path cost will be selected from the priority queue frontier for expansion.  
Additionally, this path cost is optimal (there is no smaller path cost to this node). 


- Prove that the graph-search version of A-star is optimal if the heuristic function is consistent (TBD)

### Informed search (heuristics)

These approaches for searching the graph tend to find solutions faster, but they depend on problem-specific information that may or may not be available at all times.  At each step, an evaluation function is applied to each node in the frontier, and the node with the best (typically the lowest) evaluation function value is expanded.
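
As an illustration, A* uses the evaluation function f(n) = g(n) + h(n), where g is the path cost so far and h is a heuristic estimate of the remaining cost. The sketch below is a minimal version under assumptions: it reuses the Sib-to-Bucharest map from the uniform cost example, the names `a_star`, `roads`, and `sld` are hypothetical, and the straight-line distances to Bucharest are quoted from memory of the textbook's table, so treat them as assumed values.

```python
import heapq


def a_star(graph, h, start, goal):
    """Return (path cost, path) found by A* with heuristic table h, or None."""
    frontier = [(h[start], 0, start, [start])]   # entries are (f, g, city, path)
    best_g = {start: 0}                          # cheapest known cost to each city
    while frontier:
        f, g, city, path = heapq.heappop(frontier)
        if city == goal:
            return g, path
        if g > best_g.get(city, float("inf")):
            continue                             # stale entry: a cheaper path is known
        for neighbor, step in graph[city].items():
            new_g = g + step
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + h[neighbor], new_g, neighbor, path + [neighbor]))
    return None


roads = {
    "Sib": {"RimV": 80, "Fagar": 99},
    "RimV": {"Sib": 80, "Pit": 97},
    "Fagar": {"Sib": 99, "Bucharest": 211},
    "Pit": {"RimV": 97, "Bucharest": 101},
    "Bucharest": {"Fagar": 211, "Pit": 101},
}
# Assumed straight-line distances to Bucharest.
sld = {"Sib": 253, "RimV": 193, "Fagar": 176, "Pit": 100, "Bucharest": 0}
result = a_star(roads, sld, "Sib", "Bucharest")
print(result)
```

With a heuristic of zero everywhere, the same function behaves like uniform cost search.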

### Exercises in Chapter 3: 
3.1 Explain why problem formulation must follow goal formulation.  The goal (state) specifies the structure of the answer being sought by the agent.  For example, the goal might be "have all the tiles in an 8-tile puzzle in the correct location starting from an arbitrary random arrangement."  This provides the framework that specifies the state space (arrangements of tiles in an 8-tile puzzle), and also implies restrictions on how this can be accomplished (namely following the mechanics of how 8-tile puzzles work: one tile can be moved at a time into the blank or, alternatively, the blank can be moved in one of four directions from the center, one of three on an edge, or one of two in a corner position).

3.2 Your goal is to navigate a robot out of a maze.  The robot starts at the center facing north.  You can turn the robot to face north, east, south, or west.  You can direct the robot to move a certain distance forward, although it will stop before hitting a wall. 

A) How large is the state space:  If we consider the state space to be the location of the robot on a discrete x-y grid consisting of N locations, then the state space will be 4*N, for each of the four directions the robot can be facing at each location. 
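
The 4*N count can be checked by enumerating the states directly. This is a small sketch that assumes a hypothetical 3x3 grid of locations (so N = 9); the names `headings`, `locations`, and `states` are illustrative only.

```python
from itertools import product

headings = ["north", "east", "south", "west"]
# Hypothetical 3x3 grid of discrete locations, so N = 9.
locations = [(x, y) for x in range(3) for y in range(3)]
# A state pairs a location with the heading of the robot at that location.
states = list(product(locations, headings))
print(len(locations), len(states))
```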

B) We change the problem so that the only place you can turn is at the intersection of two or more corridors, how big is the state space now?: Let M be the number of intersections of the type just mentioned.  Then, we would have a state space of 4*M.  Each state would have the form, <intersection_1, heading_north>, which would then yield the next possible state that is reachable in the search tree. 

C) From each point in the maze, we can move in any of the four directions until we reach a turning point, and this is the only action we need to do.  Reformulate the problem now.  Do we still need to keep track of the robot's orientation?  A state could be defined like this:
<Turning point location=(X,Y), Adjacent turning point=(Xa,Ya), Direction, Evaluation function>.  No, you don't need to keep track of the robot's orientation.

D) List the simplifications to the real world.  Simplifications:  
1) Only four directions are possible 
2) You can only turn at intersections 
3) The knowledge of the robot's location and heading is exact.

3.3) Suppose two friends live in different cities on a map, such as the Romania one.  On every turn we can simultaneously move each friend to a neighboring city on the map.  The amount of time to move from city i to a neighboring city j is the road distance d(i,j), but the friend that arrives first must wait for the other to arrive at their city before the next step.  We want the two friends to meet as quickly as possible:

A) Write a detailed formulation of the search problem: 

Initial State:  FriendA in their starting location, FriendB in their starting location 

Actions:  FriendA & FriendB each select the next destination that minimizes the evaluation function.  This action list should also include not moving from the given location (think of the degenerate case of only two cities).  In the event of a tie on the goal contour, FriendA can arbitrarily be selected to do the travelling.

Transition Model: (Describing the state changes as a consequence of the actions):  New locations for FriendA and FriendB 

Goal Test Function:  Are FriendA and FriendB at same map location? 

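Path Cost Function:  Time for the next step + time elapsed so far.  Because the friend who arrives first must wait for the other, the time for a step is the larger of the two road distances travelled in that step.
If we imagine the state to consist of pairs of cities (with the goal state being the same city), then we can precompute the straight-line distance between the city pairs.  Half of that distance is an admissible heuristic (it does not overestimate the travel time), because the two friends close the gap from both ends at once; the full straight-line distance can overestimate.  At each time step, we would expand the nodes and take the state with the smallest heuristic.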
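
The formulation can be sketched as a successor function over pairs of cities. This is a minimal sketch under assumptions: it reads the waiting rule in the problem statement as "each joint step takes as long as the slower of the two moves", it requires both friends to move on every turn, and the names `successors`, `is_goal`, and the three-city `roads` map are hypothetical.

```python
def successors(state, roads):
    """Yield (next_state, step_cost) for every simultaneous move of both friends."""
    city_a, city_b = state
    for next_a, dist_a in roads[city_a].items():
        for next_b, dist_b in roads[city_b].items():
            # The friend who arrives first waits, so the step takes the longer time.
            yield (next_a, next_b), max(dist_a, dist_b)


def is_goal(state):
    """The friends have met when both are in the same city."""
    return state[0] == state[1]


roads = {  # hypothetical three-city map: X --4-- Y --6-- Z
    "X": {"Y": 4},
    "Y": {"X": 4, "Z": 6},
    "Z": {"Y": 6},
}
moves = dict(successors(("X", "Z"), roads))
print(moves)
```

Any of the search algorithms in this chapter can then be run over these joint states.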

B) Admissible heuristics?

C) Are there completely connected maps that have no solution? 

One possible case is a map consisting of two connected nodes.  If the search algorithm doesn't take this into account, the friends could swap cities and then be unable to swap back (since the state has been visited), so they can never meet.

CityA ------------ CityB

D) Are there maps in which all solutions require one friend to visit the same city twice?

Consider the following map:


CityA--5---CityE
|            |
|            |
10           15
|            |
|          CityD
CityB (3)   /
           /
(BC=20)   25
(AC=30)  /
CityC----


FriendA starts in CityA
FriendB starts in CityC
This map would require FriendA to visit CityA twice if we used the straight-line distance heuristic (which we assert is the same as the road distances shown on the graph).

3.4) Show that the 8-tile puzzle states are divided into two disjoint sets.  You can reach any state from any other state within a given set, but cannot go between sets.

B12      
345
678

1B2
345
678

12B
345
678

125
34B
678

125
348
67B
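
One standard way to tell the two sets apart, which is an assumption added here rather than something shown above, is the parity of the number of inversions (pairs of tiles that are out of order when the board is read row by row, ignoring the blank). On a 3x3 board, sliding the blank sideways leaves that order unchanged, and sliding it up or down moves one tile past exactly two others, so the parity never changes. The sketch below checks that the five boards listed above share the same parity; the function name `inversion_parity` and the string encoding of a board are hypothetical.

```python
def inversion_parity(board):
    """Return 0 or 1: the parity of the number of out-of-order tile pairs."""
    tiles = [int(c) for c in board if c != "B"]   # read row by row, skip the blank
    inversions = sum(1
                     for i in range(len(tiles))
                     for j in range(i + 1, len(tiles))
                     if tiles[i] > tiles[j])
    return inversions % 2


# The five boards listed above, each written as one 9-character string.
sequence = ["B12345678", "1B2345678", "12B345678", "12534B678", "12534867B"]
parities = [inversion_parity(b) for b in sequence]
print(parities)

# Swapping two adjacent tiles is not a legal move, and it flips the parity.
print(inversion_parity("B21345678"))
```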

3.5).  Consider the 8-queens problem with the efficient incremental implementation on page 72.  Explain why the state space has at least cube root of n factorial states and estimate the largest n for which exhaustive search is feasible.

A given state is defined as the positions of the n-queens in n separate columns.

Working from the first column to the right:

Each queen that has been placed rules out at most 3 squares (one in its row and one on each diagonal) in every remaining column to the right.  Thus column i still has at least N-3i squares available, so it adds at least that many new nodes to the search tree (although the exact board locations depend on where the queens were placed in earlier columns).

Therefore, the sequence of branching is:

    i=0: N
    i=1: N-3
    i=2: N-6
    .
    .
    .
    i=7: max{N-21, 1}

The total number of states is then

    Prod(i=0 to N-1) max{N-3i, 1}

For the case of 8-queens, this is (after the first three terms, the rest are set to 1):

       N(N-3)(N-6)  *     (1)(1)(1)      *     (1)(1)

    <= N(N-1)(N-2)  *  (N-3)(N-4)(N-5)   *   (N-6)(N-7)  = N!

            X       *         X          *        X      = N!

where X=cuberoot(N!)

Is the cuberoot of N! <= N(N-3)(N-6)?

Equivalently, is N! <= N(N-3)(N-6) * N(N-3)(N-6) * N(N-3)(N-6)?  Regroup the right-hand side as N*N*N * (N-3)(N-3)(N-3) * (N-6)(N-6)(N-6) and compare it group by group with N!:

    N*N*N >= N(N-1)(N-2)
    (N-3)(N-3)(N-3) >= (N-3)(N-4)(N-5)
    (N-6)(N-6)(N-6) >= (N-6)(N-7)(1)

Multiplying the three inequalities shows that the cube of N(N-3)(N-6) is at least N!.  Therefore, cuberoot of N! <= N(N-3)(N-6), which itself was a lower bound on the number of states that would need to be searched, so the proof is complete.  The number of states that must be searched is at least cuberoot of N!.  This proof depends on being able to split up the N! evenly into 3 products, the first of which only includes those terms up to the point where 3i < N, for integer i.
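
The inequality can also be spot-checked numerically for small boards. This is a sketch only, not a proof: it compares the cube of the product from the text with N! in exact integer arithmetic, and the function name `lower_bound` is hypothetical.

```python
from math import factorial, prod


def lower_bound(n):
    """Product of max(n - 3*i, 1) for i = 0 .. n-1, as in the text."""
    return prod(max(n - 3 * i, 1) for i in range(n))


for n in range(1, 13):
    bound = lower_bound(n)
    # Compare cubes so that everything stays in exact integer arithmetic.
    print(n, bound, bound ** 3 >= factorial(n))
```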


## 4 Beyond Classical Search  
 

## 5 Adversarial Search


## 6 Constraint Satisfaction Problems


Part III Knowledge and Reasoning
- 7 Logical Agents
- 8 First-Order Logic
- 9 Inference in First-Order Logic
- 10 Classical Planning
- 11 Planning and Acting in the Real World
- 12 Knowledge Representation

Part IV Uncertain Knowledge and Reasoning
- 13 Quantifying Uncertainty
- 14 Probabilistic Reasoning
- 15 Probabilistic Reasoning over Time
- 16 Making Simple Decisions
- 17 Making Complex Decisions

Part V Learning
- 18 Learning from Examples
- 19 Knowledge in Learning
- 20 Learning Probabilistic Models
- 21 Reinforcement Learning

Part VII Communicating, Perceiving, and Acting
- 22 Natural Language Processing
- 23 Natural Language for Communication
- 24 Perception
- 25 Robotics

Part VIII Conclusions
- 26 Philosophical Foundations
- 27 AI: The Present and Future


### OVERALL NOTES: 
Performance Measures:
- We always consider first the performance measure that is evaluated on any given sequence of environment states (not states of the agent).  This is critical.
- As a general rule, it is better to design performance measures according to what one actually wants in the environment, rather than according to how one thinks the agent should behave.
- Rational agents maximize expected performance measures.
- The definition of a rational agent is: For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.
- Perfect agents maximize actual performance measures.

EXAMPLE:  PEAS framework – Create this first, before designing the agent.

| Agent Type | Performance Measure (EXTERNAL TO AGENT) | Environment (EXTERNAL TO AGENT) | Actuators (AVAILABLE TO AGENT) | Sensors (AVAILABLE TO AGENT) |
| --- | --- | --- | --- | --- |
| Taxi Driver | Safe, fast, legal, comfortable trip, maximize profits | Roads, other traffic, pedestrians, customers | Steering, accelerator, brake, signal, horn, display | Cameras, sonar, speedometer, GPS, odometer, accelerometer, engine sensors, keyboard |


### Types of Rational Agents: 
- Simple Reflex
- Model Based
- Goal Based
  - Problem-solving agent (Chap 3):  Using atomic representations – states are black boxes.
  - Planning agents (Chap 7): Using factored or structured state representations.
- Utility Based
- Learning Agents

Types of task environments:
1) Observability:
   - Fully observable
   - Partially observable
   - Totally unobservable
2) Agents:
   - Single
   - Multiple
3) Determinism:
   - Deterministic
   - Stochastic
4) Episode:
   - Episodic
   - Sequential
5) Dynamic:
   - Static
   - Semi-Dynamic
   - Dynamic
6) Discreteness:
   - Discrete
   - Continuous
 
Types of states of the environment.
1) Atomic -- each state of the environment is a discrete, indivisible state.
   - Search & Game Playing (Chapters 3-5)
   - Hidden Markov Models (Chapter 15)
   - Markov Decision Process (Chapter 17)
2) Factored -- each state of the environment can be described by internal values such as variables and booleans.
   - Constraint satisfaction (Chapter 6)
   - Propositional logic (Chapter 7)
   - Planning (Chapters 10-11)
   - Bayesian networks (Chapters 13-16)
   - Machine Learning (Chapters 18, 20, 21)
3) Structured -- each state can consist of an internal structure with objects that have relationships to each other.
   - Relational databases and first-order logic (Chapters 8, 9, 12)
   - First-order probability models (Chapter 14)
   - Knowledge-based learning (Chapter 19)
   - Natural language processing (Chapters 22, 23)