# CSC 421 - Agents 

### Instructor: Brandon Haworth


#### Notebook Credit: George Tzanetakis
The Jupyter notebooks you encounter during the course were largely developed by Prof. Tzanetakis for a previous iteration of this course. I have since changed and extended them where necessary for my own iterations of CSC 421.

## Agents 


**EMPHASIS**: Agents as a unifying concept for thinking about AI and software 


During this lecture, we will cover the following topics: 

1. Agents 
2. Performance, environments, actuators, sensors 
3. Agent architectures 
4. Learning 


## WORKPLAN 

Section numbers refer to the 4th edition of the AIMA textbook, which is the suggested reading for this week. Each list entry gives only the additional sections for that level. For example, the Expected reading includes the sections listed under Basic as well as those listed under Expected. Some additional readings are suggested for Advanced. 

1. Basic: Sections **2.1**, **2.3**, **2.4** and Summary  
2. Expected: **2.2**
3. Advanced: Bibliographical and historical notes 


## Agents and Environments  

**Agents** perceive their **environment** through **sensors** and act upon that environment through **actuators**. 


Terminology: 

1. Percept 
2. Percept sequence 
3. Agent function (an abstract mathematical description that maps percept sequences to actions; in many cases an infinite tabulation) 
4. Agent program (concrete implementation running on a physical system) 
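The difference between the agent function and the agent program can be made concrete with a small sketch. It assumes a hypothetical two-square vacuum world in which a percept is a `(location, status)` pair; the table contents, the `"NoOp"` default, and the name `make_table_driven_agent` are illustrative assumptions, not part of the course material. The table stands in for (a tiny fragment of) the agent function, while the closure is an agent program that implements it by remembering the percept sequence.

```python
# Assumed two-square vacuum world: a percept is a (location, status) pair.
# The agent function is a table from whole percept sequences to actions.
AGENT_FUNCTION_TABLE = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("B", "Dirty"),): "Suck",
    (("A", "Dirty"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
    # A complete table needs a row for every possible percept sequence,
    # which is why the tabulation is infinite in many cases.
}


def make_table_driven_agent(table):
    """Return an agent program that looks up its full percept sequence in the table."""
    percept_sequence = []  # grows by one percept on every call

    def agent_program(percept):
        percept_sequence.append(percept)
        # Sequences missing from this partial table fall back to an assumed "NoOp".
        return table.get(tuple(percept_sequence), "NoOp")

    return agent_program


agent_program = make_table_driven_agent(AGENT_FUNCTION_TABLE)
first_action = agent_program(("A", "Dirty"))
second_action = agent_program(("A", "Clean"))
print(first_action, second_action)
```

The second action depends on the whole sequence of two percepts, not just on the latest one, which is the distinction between a percept and a percept sequence.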

What makes an agent effective, good, intelligent? 


Any area of engineering can be viewed through the lens of agents. What makes AI unique is the significant computational resources that can be employed by the agent and the non-trivial decision-making that the task environment requires. 

<img src="images/aima_simple_agent.png" width="60%"/>


### Generalization of the agent paradigm

The generalization of this paradigm could be taken to almost all areas, eventually trivializing the utility of the idea.

In fact, one could define AI agents as systems that cannot be developed using traditional engineering approaches.

**Imagine for example a simple function:**

In [3]:
def agent_function(percept):
    action = "Zero"
    if percept > 0:
        action = "Positive"
    elif percept < 0:
        action = "Negative"
    return action

def agent(percept):
    action = agent_function(percept)
    print(action)

agent(-1)
agent(1)
agent(0)

Negative
Positive
Zero


**Now consider that same function written for a program that needs to identify the sign of a number:**

In [4]:
def print_sign(number):
    result = "Zero"
    if number > 0:
        result = "Positive"
    elif number < 0:
        result = "Negative"
    print(result)

print_sign(-1)
print_sign(1)
print_sign(0)

Negative
Positive
Zero


**The utility of this description is in understanding AI systems. It is not meant to remap everything to AI and say all computing things are AI.**

**AI, for our purposes, is something that has significant computational resources and whose task environment requires non-trivial decision-making.**

## PEAS Description of Agent 

1. **Performance**: a way to measure how well the agent is doing.
2. **Environment**: the problem or world in which the agent needs to operate.
3. **Actuators**: the different ways the agent can act on the environment, and possibly on its own operation. They receive **actions** that encode the information they need to operate.
4. **Sensors**: the ways the agent can acquire information about the environment it is operating in, and possibly about its own operation. The information they acquire is represented as **percepts**.

Let's consider some examples - what are the possible percepts, environments, sensors, actuators, and actions for these 
agents: 

1.  Human 
2.  Robot 
3.  Vacuum cleaner world 
4.  A single chess piece making valid chessboard moves 
5.  Self-driving car 
6.  Ant 
7.  NPC in a game 
8.  Chess playing program 
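As a starting point for this exercise, a PEAS description can be written down as a simple record. The sketch below uses a hypothetical `PEAS` dataclass (the class and field names are assumptions) and fills it in with one possible, non-exhaustive answer for the self-driving car; other answers are equally defensible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PEAS:
    """A PEAS description of a task environment (field names are illustrative)."""
    performance: tuple
    environment: tuple
    actuators: tuple
    sensors: tuple


# One possible description for the self-driving car example (an assumption,
# not an authoritative list).
self_driving_car = PEAS(
    performance=("safety", "trip time", "legality", "passenger comfort"),
    environment=("roads", "other traffic", "pedestrians", "weather"),
    actuators=("steering", "accelerator", "brake", "signals", "horn"),
    sensors=("cameras", "lidar", "GPS", "speedometer", "odometer"),
)

for field_name in ("performance", "environment", "actuators", "sensors"):
    values = ", ".join(getattr(self_driving_car, field_name))
    print(f"{field_name:>12}: {values}")
```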



### Performance and Rationality
In some ways, the concept of a performance measure ties the PEAS concept together in the framework of this course.

The book has an excellent definition of a **rational agent** that centres rationality around performance:
> For each possible **percept sequence**, a **rational agent** should select an **action** that is expected to maximize its **performance** measure, given the evidence provided by the **percept sequence** and whatever built-in knowledge the agent has.
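
To make the role of the performance measure concrete, here is a minimal sketch. It assumes a two-square vacuum world and a measure that awards one point for each clean square at each time step; the function name `performance` and both environment histories are made up for illustration. The measure scores what happens in the environment, not what the agent believes it is doing.

```python
def performance(history):
    """Score an environment history: one point per clean square per time step."""
    return sum(
        sum(1 for status in state.values() if status == "Clean")
        for state in history
    )


# Two hypothetical environment histories over four time steps.
# An agent that sucks at A, moves right, then sucks at B:
history_cleaning = [
    {"A": "Dirty", "B": "Dirty"},
    {"A": "Clean", "B": "Dirty"},
    {"A": "Clean", "B": "Dirty"},
    {"A": "Clean", "B": "Clean"},
]
# An agent that never acts:
history_idle = [{"A": "Dirty", "B": "Dirty"} for _ in range(4)]

print(performance(history_cleaning))  # 0 + 1 + 1 + 2 = 4
print(performance(history_idle))      # 0
```

Under this assumed measure, a rational agent is one whose actions are expected to lead to the higher-scoring history, given what it has perceived so far.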

## TASK ENVIRONMENTS 

Specifying the task environment (essentially the problem to which rational agents are the solutions): 


1. Performance 
2. Environment 
3. Actuators 
4. Sensors 


Properties of task environments (for each one think of examples or consider the examples mentioned above): 

1. Fully observable vs partially observable
2. Single-agent vs multiagent 
    1. Competitive multiagent (chess) vs cooperative multiagent (self-driving cars avoiding collisions)
3. Deterministic vs nondeterministic

**Agent = architecture + program** 

An AI agent is its agent program together with the architecture it runs on (a computer, a robot chassis, etc.). Properties of the architecture dictate the operational limits of the agent, for example:
1. computational power
2. storage media and their capacity
3. sensors and their fidelity
4. actuators and their degrees of freedom and physical capabilities

### Fully observable vs partially observable
A **fully observable** environment is one in which the entire state of the environment is available to the agent's sensors. In practice, what matters is that the sensors capture the environmental information that is *relevant* to the agent's choice of action.

**Partial observability** of a task environment introduces unique challenges. The agent may observe only what is available to its sensors and, depending on its internal construction, may struggle to reason about the environment.

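**There are many reasons an environment may be only partially observable**, for example noisy or inaccurate sensors, or parts of the state that are simply missing from the sensor data.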

Often in real life, outside heavily controlled environments, environments are partially observable. This makes day-to-day AI agents difficult to develop. Consider the autonomous vehicle...
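
The distinction can be sketched in a few lines, assuming a hypothetical two-square vacuum world; the state layout and both percept functions are illustrative assumptions. With a local sensor, two different world states can produce exactly the same percept, which is what makes reasoning harder for the agent.

```python
# Assumed world state: the agent's location plus the status of every square.
world_state = {"agent_at": "A", "squares": {"A": "Clean", "B": "Dirty"}}
other_state = {"agent_at": "A", "squares": {"A": "Clean", "B": "Clean"}}


def fully_observable_percept(state):
    """The sensors report the whole relevant state."""
    return (state["agent_at"], dict(state["squares"]))


def partially_observable_percept(state):
    """A local dirt sensor reports only the square the agent occupies."""
    location = state["agent_at"]
    return (location, state["squares"][location])


# Full observability distinguishes the two states...
print(fully_observable_percept(world_state) == fully_observable_percept(other_state))
# ...but the local sensor cannot tell them apart.
print(partially_observable_percept(world_state) == partially_observable_percept(other_state))
```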

### Single-agent vs multiagent
The **single-agent** task environment allows us to focus on the interactions of that agent with the task environment.

The **multiagent** task environment raises interesting questions and issues for our agents' operation. 

Consider:
1. Are other things in the environment agents?
    1. What makes an agent an agent?
    2. When must something be considered an agent by another agent?
2. Do those agents communicate?
3. What is the interaction between partial observability and multiagent environments? (HINT: mind reading)
    1. Are those other agents now part of the environment with respect to the agent?
4. Are these agents competitive or cooperative?
    1. Can we use game theory to understand how they maximize their performance with respect to each other?
    2. In game theory, cooperative agents are viewed as a group when measuring utility.

### Deterministic vs nondeterministic
**Deterministic environments** are those environments where the next state of the environment is entirely dependent on the current state of the environment and the action the agent takes. In this way, given our transition function, we can say exactly what state we will end up in.

**Non-deterministic environments** raise several interesting issues and questions about how our agent operates.

Consider 

1. Do we know what is even possible in this environment? That is, even though we may not know the next state with certainty, can we enumerate all possible states?
2. Are real-life states/environments non-deterministic? Does this complicate agent designs? (HINT: human behaviour)
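
The difference shows up directly in the type of the transition function. In the sketch below, which assumes a simplified state of the form `(location, status)` and models only the actions `"Suck"` and `"NoOp"`, the deterministic version returns exactly one next state while the nondeterministic version returns the set of states that could follow. The function names and the unreliable-vacuum behaviour are assumptions for illustration.

```python
def deterministic_result(state, action):
    """Exactly one next state for each (state, action) pair."""
    location, _ = state
    if action == "Suck":
        return (location, "Clean")
    return state  # "NoOp" leaves the state unchanged


def nondeterministic_results(state, action):
    """The set of possible next states; the agent cannot tell in advance which occurs."""
    location, status = state
    if action == "Suck":
        # Assumed unreliable vacuum: sucking may or may not remove the dirt.
        return {(location, "Clean"), (location, status)}
    return {state}


print(deterministic_result(("A", "Dirty"), "Suck"))
print(sorted(nondeterministic_results(("A", "Dirty"), "Suck")))
```

Being able to enumerate the outcome set, as this sketch does, corresponds to knowing what is possible without knowing which outcome will occur.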

## Structure of Agents 

1. **Reflex agents** 

Reflex agents simply act based on the current percept, ignoring the rest of the percept history. They can operate using simple condition-action rules. Humans have many such rules, which are typically used when fast action in response to a stimulus is required. 

<img src="images/aima_simple_agent.png" width="60%"/>

2. **Model-based reflex agents**

The most effective way for an agent to deal with a partially observable environment is to maintain 
some kind of internal representation (model) keeping track of the parts of the world that it cannot 
perceive. This model needs to be updated based on knowledge about how the world changes independently of the 
agent as well as about how the agent's own actions can affect the world. 

<img src="images/aima_model_agent.png" width="60%"/>



3. **Goal-based agents** 

Knowing something about the current state of the world is not always sufficient to decide what to do. In many situations the agent needs some sort of **goal** information that describes desirable situations. Goal-based agents are fundamentally different from reflex agents because they consider the future. We will be looking at **Search** (in detail) and **Planning**, two research areas of AI that focus on finding action sequences for agents to achieve specific goals. Goal-based agents are more flexible than reflex agents but tend to be more computationally demanding, and therefore slower, because they need to consider how actions create multiple possible **futures** and determine whether these futures meet specific **goals**. 

<img src="images/aima_goal_agent.png" width="60%"/>
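
As a minimal sketch of how a goal-based agent considers possible futures, the code below plans a route through a small room graph using breadth-first search. The map `ROOMS`, the room names, and the function `plan_to_goal` are invented for illustration; breadth-first search is just one simple choice of search method.

```python
from collections import deque

# Hypothetical map for a graph-based text adventure: room -> {action: next room}.
ROOMS = {
    "Cell": {"north": "Hallway"},
    "Hallway": {"south": "Cell", "east": "Armoury", "north": "Gate"},
    "Armoury": {"west": "Hallway"},
    "Gate": {"south": "Hallway"},
}


def plan_to_goal(start, goal, rooms):
    """Breadth-first search for a shortest action sequence from start to goal."""
    frontier = deque([(start, [])])  # (room, actions taken to reach it)
    visited = {start}
    while frontier:
        room, actions = frontier.popleft()
        if room == goal:
            return actions
        for action, next_room in rooms[room].items():
            if next_room not in visited:
                visited.add(next_room)
                frontier.append((next_room, actions + [action]))
    return None  # the goal cannot be reached from the start


print(plan_to_goal("Cell", "Gate", ROOMS))  # prints ['north', 'north']
```

Unlike a reflex agent, this agent does nothing until it has examined future states, which is where the extra computational cost comes from.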



4. **Utility-based agents**

Goals alone are not enough. Utility is an internal representation of the performance measure; you can think of it as a "happiness" measure for the agent. Utility has the same relationship to the external performance measure that the agent's internal world representation has to the actual world/environment it operates in. Utility can assist in two situations in which goal-based agents have a hard time: 1) when there are conflicting goals (for example, speed and safety) and 2) when there are multiple goals that the agent can aim for, none of which can be achieved with certainty. Because uncertainty is always present in typical real-world situations requiring rational/intelligent behaviour, a utility-based agent, technically speaking, chooses the action that maximizes the **expected utility**. 

<img src="images/aima_utility_agent.png" width="60%"/>
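
A minimal sketch of expected-utility maximization, using the conflicting goals of speed and safety: all routes, probabilities, risk figures, and weights below are made-up numbers, and `utility`, `expected_utility`, and `choose_action` are hypothetical names.

```python
# Hypothetical routes: each action maps to (probability, minutes, crash_risk) outcomes.
ROUTES = {
    "highway": [(0.8, 25, 0.02), (0.2, 60, 0.02)],  # usually fast, sometimes jammed
    "side streets": [(1.0, 40, 0.01)],              # slower but predictable and safer
}


def utility(minutes, crash_risk, time_weight=1.0, safety_weight=1000.0):
    """Fold speed and safety into one number (the weights are assumptions)."""
    return -(time_weight * minutes) - (safety_weight * crash_risk)


def expected_utility(outcomes):
    """Probability-weighted average utility over the possible outcomes of an action."""
    return sum(p * utility(minutes, risk) for p, minutes, risk in outcomes)


def choose_action(routes):
    """Pick the action with the highest expected utility."""
    return max(routes, key=lambda route: expected_utility(routes[route]))


print(choose_action(ROUTES))
```

With these assumed numbers the highway is faster in its most likely outcome, yet the side streets have the higher expected utility once the traffic-jam outcome and the safety weight are taken into account. Changing the weights changes the decision, which is how utility resolves conflicting goals.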

Discussion examples: 

* NPC in a graph-based text adventure 
* Driving to the airport 



### LEARNING AGENTS 

All types of agent architectures can benefit from learning. Learning occurs when the agent's performance, as judged by the performance measure, improves through "experience", i.e., repeated operation in an environment. 
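
A deliberately tiny sketch of this idea follows. It assumes an environment where each action always yields the same reward, and an agent that tries every action once and then repeats the one with the best average so far. The function name, the reward values, and the try-then-exploit strategy are all illustrative assumptions rather than a method from the course.

```python
def run_learning_agent(rewards, episodes):
    """Try each action once, track average rewards, then prefer the best action."""
    totals = {action: 0.0 for action in rewards}
    counts = {action: 0 for action in rewards}
    history = []  # reward received in each episode
    for _ in range(episodes):
        untried = [a for a in rewards if counts[a] == 0]
        if untried:
            action = untried[0]  # explore: gather experience
        else:
            action = max(rewards, key=lambda a: totals[a] / counts[a])  # exploit
        reward = rewards[action]  # assumed deterministic feedback
        totals[action] += reward
        counts[action] += 1
        history.append(reward)
    return history


history = run_learning_agent({"Left": 1, "Right": 5, "Wait": 0}, episodes=6)
print(history)  # prints [1, 5, 0, 5, 5, 5]
```

The reward per episode is higher in the second half of the run than in the first, which is the sense in which performance improves through experience.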

## World/Environment Representations 

**Important note:** The environment representation is different from the environment itself. It consists of what 
the agent "knows" about the environment, so in most cases it does not contain all the information in the environment. 

Atomic representation, factored representation, structured representation, distributed representation 




<img src="images/aima_environment_representations.png" width="70%"/>


* Algorithms for search and game-playing, hidden Markov models, and Markov decision processes work with atomic representations. 
* Constraint satisfaction problems, propositional logic, Bayesian networks, and machine learning algorithms frequently work with factored representations. 
* Relational databases, first-order logic, natural language understanding, and knowledge-based learning operate on structured representations. 
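
To illustrate the first three representations, the sketch below encodes one state of a hypothetical two-square vacuum world in each style. The variable names, the label `"state_17"`, and the relation names are assumptions chosen for the example.

```python
# Atomic: the state is an indivisible label with no internal structure.
atomic_state = "state_17"

# Factored: the state is a fixed set of variables, each with a value.
factored_state = {"location": "A", "A_status": "Dirty", "B_status": "Clean"}

# Structured: the state consists of objects and the relations among them.
structured_state = {
    ("At", "vacuum", "A"),
    ("Dirty", "A"),
    ("Clean", "B"),
    ("Adjacent", "A", "B"),
}

# Atomic states can only be compared for equality.
print(atomic_state == "state_17")

# Factored states can also be compared variable by variable.
other_state = {"location": "B", "A_status": "Dirty", "B_status": "Clean"}
shared = sorted(k for k in factored_state if factored_state[k] == other_state[k])
print(shared)

# Structured states support queries over objects and relations.
dirty_squares = [fact[1] for fact in structured_state if fact[0] == "Dirty"]
print(dirty_squares)
```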


## Some Agent Sketches

In [28]:
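# Code sketch of a simple reflex agent. The set of rules (condition-action pairs)
# can remain static throughout execution or can be modified if the agent is capable of learning.
# interpret_input and rule_match are placeholder helpers that are not defined in this sketch.
def simple_rule_agent(percept, rules):
    state = interpret_input(percept)   # abstract the raw percept into a state description
    rule = rule_match(state, rules)    # find a rule whose condition matches the state
    action = rule.action()
    return action


# Code sketch of a model-based reflex agent.
# state: the agent's current conception of the world state
# last_action: the most recent action taken by the agent (None before the first step)
# model: a description of how the next state depends on the current state and action
# update_state is a placeholder helper that is not defined in this sketch.
def model_agent(percept, rules, state, last_action, model):
    state = update_state(state, last_action, percept, model)
    rule = rule_match(state, rules)
    action = rule.action()
    # Return the updated state as well so the caller can pass it to the next step.
    return action, state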
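
The sketches above rely on helper functions that are not defined. One possible way to fill them in, assuming a two-square vacuum world with `(location, status)` percepts, is shown below. The `Rule` class, the rule lists, and the agent names are all hypothetical; the model-based variant keeps its internal state in a closure rather than passing it as an argument.

```python
class Rule:
    """A condition-action pair; condition is a predicate over the state description."""

    def __init__(self, condition, action_name):
        self.condition = condition
        self._action_name = action_name

    def action(self):
        return self._action_name


def rule_match(state, rules):
    """Return the first rule whose condition holds in the given state."""
    for rule in rules:
        if rule.condition(state):
            return rule
    raise LookupError("no rule matches the state")


REFLEX_RULES = [
    Rule(lambda s: s["status"] == "Dirty", "Suck"),
    Rule(lambda s: s["location"] == "A", "Right"),
    Rule(lambda s: s["location"] == "B", "Left"),
]

# The model-based agent adds a rule that uses remembered information.
MODEL_RULES = [
    REFLEX_RULES[0],
    Rule(lambda s: all(v == "Clean" for v in s["known"].values()), "NoOp"),
    REFLEX_RULES[1],
    REFLEX_RULES[2],
]


def reflex_vacuum_agent(percept, rules=REFLEX_RULES):
    """Simple reflex agent: decides from the current percept only."""
    location, status = percept
    state = {"location": location, "status": status}
    return rule_match(state, rules).action()


def make_model_vacuum_agent(rules=MODEL_RULES):
    """Model-based reflex agent: remembers the last known status of each square."""
    known = {"A": None, "B": None}  # internal model of the unobserved square

    def agent(percept):
        location, status = percept
        known[location] = status  # update the model from the percept
        state = {"location": location, "status": status, "known": known}
        action = rule_match(state, rules).action()
        if action == "Suck":
            known[location] = "Clean"  # predicted effect of the agent's own action
        return action

    return agent


model_agent_program = make_model_vacuum_agent()
for percept in [("A", "Dirty"), ("A", "Clean"), ("B", "Clean")]:
    print(percept, reflex_vacuum_agent(percept), model_agent_program(percept))
```

On the final percept the simple reflex agent keeps moving, while the model-based agent stops because its internal state records that both squares are clean.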
    