# Chapter 2 - Intelligent Agents

## 2.1 Agents and Environments

An **agent** is anything that can be viewed as perceiving its **environment** through **sensors** and acting upon that environment through **actuators**. An agent’s behavior is described by the **agent function** that maps any given **percept sequence** to an action. The agent function is an abstract mathematical description; the agent program is a concrete implementation, running within some physical system.

<img src="https://i.ibb.co/Y7pHmWbm/image.png" width="250px">

The vacuum-cleaner world consists of a robotic vacuum-cleaning agent in a world of squares, each of which can be either dirty or clean.

## 2.2 Good Behavior: The Concept of Rationality

**Consequentialism**: we evaluate an agent’s behavior by its consequences. This notion of desirability is captured by a **performance measure** that evaluates any given sequence of environment states.

We might propose to measure performance by the amount of dirt cleaned up. A rational agent can maximize this performance measure by cleaning up the dirt, then dumping it all on the floor, then cleaning it up again, and so on.
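
To see why this measure backfires, the sketch below scores two hypothetical action histories under two measures. The history format, the `Dump` action name, and the second measure (one point per clean square per time step) are assumptions made for illustration, not definitions from the text above.

```python
def dirt_cleaned_score(history):
    """Naive measure: one point for every step that removes dirt."""
    return sum(1 for step in history if step["removed_dirt"])


def clean_floor_score(history):
    """Assumed alternative: one point per clean square at each time step."""
    return sum(step["clean_squares"] for step in history)


# Hypothetical two-square world, four time steps each.
honest = [
    {"action": "Suck", "removed_dirt": True, "clean_squares": 2},
    {"action": "NoOp", "removed_dirt": False, "clean_squares": 2},
    {"action": "NoOp", "removed_dirt": False, "clean_squares": 2},
    {"action": "NoOp", "removed_dirt": False, "clean_squares": 2},
]
cheater = [
    {"action": "Suck", "removed_dirt": True, "clean_squares": 2},
    {"action": "Dump", "removed_dirt": False, "clean_squares": 1},
    {"action": "Suck", "removed_dirt": True, "clean_squares": 2},
    {"action": "Dump", "removed_dirt": False, "clean_squares": 1},
]

for name, history in (("honest", honest), ("cheater", cheater)):
    print(name, dirt_cleaned_score(history), clean_floor_score(history))
```

Under the dirt-cleaned measure the dump-and-reclean history scores higher; under the clean-floor measure it scores lower, which is closer to what the designer actually wants.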

### 2.2.2 Rationality

What is rational at any given time depends on four things:

- The performance measure that defines the criterion of success.
- The agent’s prior knowledge of the environment.
- The actions that the agent can perform.
- The agent’s percept sequence to date.

For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.
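
As a rough sketch of this definition, the function below picks the action with the highest expected performance. The names `expected_performance` and `rational_action`, the outcome labels, and the probabilities are all hypothetical; the `belief` table stands in for whatever the percept sequence and built-in knowledge support.

```python
def expected_performance(action, belief, performance):
    """Probability-weighted performance over an action's possible outcomes."""
    return sum(prob * performance(outcome)
               for outcome, prob in belief[action].items())


def rational_action(actions, belief, performance):
    """Pick the action whose expected performance is highest."""
    return max(actions,
               key=lambda a: expected_performance(a, belief, performance))


# Hypothetical belief: P(outcome | action), given the evidence so far.
belief = {
    "Suck": {"square_cleaned": 0.9, "nothing_changed": 0.1},
    "NoOp": {"nothing_changed": 1.0},
}
# Hypothetical performance measure over outcomes.
score = {"square_cleaned": 1.0, "nothing_changed": 0.0}

best = rational_action(["Suck", "NoOp"], belief, score.get)
print(best)
```

The choice is judged by expected performance, so `Suck` remains the rational action even on the occasions when it turns out to change nothing.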

We need to be careful to distinguish between rationality and **omniscience**. An omniscient agent knows the actual outcome of its actions and can act accordingly; but omniscience is impossible in reality.

To the extent that an agent relies on the prior knowledge of its designer rather than on its own percepts and learning processes, we say that the agent lacks **autonomy**. 

## 2.3 The Nature of Environments

A task environment is specified by its **PEAS** description: Performance, Environment, Actuators, Sensors.
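
One way to write a PEAS description down is as a small record. The sketch below does this for the two-square vacuum world used in this chapter; the `PEAS` class and the wording of each entry are illustrative assumptions, not a fixed format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PEAS:
    """Hypothetical container for a task-environment description."""
    performance: tuple
    environment: tuple
    actuators: tuple
    sensors: tuple


vacuum_world = PEAS(
    performance=("clean squares per time step",),  # assumed measure
    environment=("square A", "square B", "dirt"),
    actuators=("Left", "Right", "Suck"),
    sensors=("location", "dirt status"),
)

print(vacuum_world.actuators)
```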

**Fully observable vs. partially observable**: An environment might be partially observable because of noisy and inaccurate sensors or because parts of the state are simply missing from the sensor data; for example, an automated taxi cannot see what other drivers are thinking. If the agent has no sensors at all, then the environment is **unobservable**.

**Single-agent vs. multiagent**

In chess, the opponent entity B is trying to maximize its performance measure, which, by the rules of chess, minimizes agent A’s performance measure. Thus, chess is a **competitive** multiagent environment. In the taxi-driving environment, avoiding collisions maximizes the performance measure of all agents, so it is a **partially cooperative** multiagent environment. It is also **partially competitive** because, for example, only one car can occupy a parking space.

**Deterministic vs. nondeterministic**: If the next state of the environment is completely determined by the current state and the action executed by the agent(s), then we say the environment is deterministic. In principle, an agent need not worry about uncertainty in a fully observable, deterministic environment.

We say that a model of the environment is **stochastic** if it explicitly deals with probabilities and “nondeterministic” if the possibilities are listed without being quantified.
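
The difference shows up directly in how the two kinds of model can be stored. In the sketch below, the state encoding, the 0.9/0.1 probabilities, and the function names are hypothetical: the nondeterministic model only lists possible outcomes of `Suck`, while the stochastic model attaches a probability to each.

```python
import random

state = ("A", "Dirty")

# Nondeterministic model: possible outcomes are listed, not quantified.
nondeterministic_suck = {state: {("A", "Clean"), ("A", "Dirty")}}

# Stochastic model: the same outcomes, each with a probability.
stochastic_suck = {state: {("A", "Clean"): 0.9, ("A", "Dirty"): 0.1}}


def possible_outcomes(model, s):
    """Works for both kinds of model: just the set of reachable states."""
    return set(model[s])


def sample_outcome(model, s, rng):
    """Only meaningful for a stochastic model, which has weights to sample by."""
    outcomes = list(model[s])
    weights = [model[s][o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


rng = random.Random(0)
print(possible_outcomes(nondeterministic_suck, state))
print(sample_outcome(stochastic_suck, state, rng))
```

Both models agree on what can happen; only the stochastic one supports sampling or computing expectations.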

**Episodic vs. sequential**: In an episodic task environment, the next episode does not depend on the actions taken in previous episodes. In sequential environments, on the other hand, the current decision could affect all future decisions.

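**Static vs. dynamic**: If the environment can change while an agent is deliberating, then the environment is dynamic for that agent; otherwise, it is static. If the environment itself does not change with the passage of time but the agent’s performance score does, then we say the environment is **semidynamic**. Chess, when played with a clock, is semidynamic. Crossword puzzles are static.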

**Discrete vs. continuous**: The discrete/continuous distinction applies to the state of the environment, to the way time is handled, and to the percepts and actions of the agent.

**Known vs. unknown**: This distinction refers not to the environment itself but to the agent’s (or designer’s) state of knowledge about the “laws of physics” of the environment. In a known environment, the outcomes (or outcome probabilities if the environment is nondeterministic) for all actions are given.

It is quite possible for a known environment to be partially observable: in solitaire card games, I know the rules but am still unable to see the cards that have not yet been turned over. Conversely, an unknown environment can be fully observable: in a new video game, the screen may show the entire game state but I still don’t know what the buttons do until I try them.

**The hardest case is partially observable, multiagent, nondeterministic, sequential, dynamic, continuous, and unknown.**

## 2.4 The Structure of Agents

agent = architecture + program

### 2.4.1 Agent programs

The agent programs that we design in this book all have the same skeleton: they take the current percept as input from the sensors and return an action to the actuators.

Notice the difference between the agent program, which takes the current percept as input, and the agent function, which may depend on the entire percept history.

The rest of this section covers four basic kinds of agent programs:

- Simple reflex agents
- Model-based reflex agents
- Goal-based agents
- Utility-based agents

```python
# It is instructive to consider why the table-driven approach to agent
# construction is doomed to failure. For the taxi, an hour's driving would
# need a lookup table with over 10^600,000,000,000 entries. Despite all this,
# TABLE-DRIVEN-AGENT does do what we want: it implements the desired agent
# function.

def table_driven_agent(percept, percepts, table):
    """
    Simulates a table-driven agent.

    Arguments:
    - percept: the current percept
    - percepts: the list of past percepts; the caller keeps one list per
      agent and passes it in on every call so the history persists
    - table: a dictionary that maps percept sequences (tuples) to actions

    Returns:
    - action: the action found in the table for the percept sequence so far,
      or None if the sequence is not in the table
    """
    # Append the new percept to the sequence
    percepts.append(percept)

    # Tuples are hashable, so the sequence can serve as a dictionary key
    return table.get(tuple(percepts))
```
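
To get a feel for the table sizes involved: if there are |P| possible percepts and the agent runs for T steps, a complete table needs one entry for every percept sequence of length 1 through T, that is, the sum of |P|^t for t from 1 to T. The sketch below evaluates that sum for the two-square vacuum world, assuming 4 distinct percepts (2 locations times 2 statuses). The function name `table_size` is hypothetical.

```python
def table_size(num_percepts, lifetime):
    """Number of distinct percept sequences of length 1..lifetime."""
    return sum(num_percepts ** t for t in range(1, lifetime + 1))


for steps in (1, 2, 3, 10):
    print(steps, table_size(4, steps))
```

Even in this tiny world the table passes a million entries by step 10, which is why the approach cannot scale to something like driving.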

### 2.4.2 Simple reflex agents

These agents select actions on the basis of the current percept, ignoring the rest of the percept history. The vacuum agent is a simple reflex agent because its decision is based only on the current location and on whether that location contains dirt.

Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments. Escape from infinite loops is possible if the agent can **randomize** its actions. A randomized simple reflex agent might outperform a deterministic simple reflex agent.
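
A minimal sketch of this effect, under assumptions that are not in the text: the vacuum agent has lost its location sensor and perceives only the dirt status, it starts in square A, and only square B is dirty. Moving `Left` from A is assumed to leave the agent in A. All function names are hypothetical.

```python
import random


def deterministic_agent(status):
    """Without a location sensor, a fixed rule must always pick one direction."""
    return "Suck" if status == "Dirty" else "Left"


def make_randomized_agent(rng):
    """Same rule for dirt, but a coin flip decides the direction of movement."""
    def agent(status):
        if status == "Dirty":
            return "Suck"
        return rng.choice(["Left", "Right"])
    return agent


def run(agent, steps=200):
    """Simulate the two-square world; return how many squares end up clean."""
    location = "A"
    dirty = {"A": False, "B": True}
    for _ in range(steps):
        status = "Dirty" if dirty[location] else "Clean"
        action = agent(status)
        if action == "Suck":
            dirty[location] = False
        elif action == "Right":
            location = "B"
        elif action == "Left":
            location = "A"
    return sum(not is_dirty for is_dirty in dirty.values())


print("deterministic:", run(deterministic_agent))
print("randomized:", run(make_randomized_agent(random.Random(0))))
```

The deterministic agent keeps choosing `Left` in square A forever and never reaches the dirt in B, while the randomized agent moves `Right` sooner or later and cleans it.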

<img src="https://i.ibb.co/fzK64g8n/image.png" width="250px">

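```python
def reflex_vacuum_agent(percept):
    """Reflex agent for the two-square vacuum world."""
    location, status = percept
    if status == 'Dirty':
        return 'Suck'
    elif location == 'A':
        return 'Right'
    elif location == 'B':
        return 'Left'

# A simple reflex agent works only if the correct decision can be made on the
# basis of just the current percept, that is, only if the environment is
# fully observable.
# interpret_input and rule_match are not defined in these notes; they are
# assumed to be supplied elsewhere.
def simple_reflex_agent(percept, rules):
    state = interpret_input(percept)  # abstract description of the percept
    rule = rule_match(state, rules)   # first rule that matches the state
    action = rule['action']
    return action
```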
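
For a version that can actually be run, the sketch below fills in the two helpers with minimal stand-ins. The rule format (a dict with a `condition` callable and an `action`), the state dictionary, and the helper bodies are all assumptions made for illustration; unlike the skeleton above, this variant returns `None` when no rule matches.

```python
def interpret_input(percept):
    """Assumed helper: turn a raw (location, status) percept into a state dict."""
    location, status = percept
    return {"location": location, "status": status}


def rule_match(state, rules):
    """Assumed helper: return the first rule whose condition holds, else None."""
    for rule in rules:
        if rule["condition"](state):
            return rule
    return None


# Condition-action rules for the two-square vacuum world, in priority order.
VACUUM_RULES = [
    {"condition": lambda s: s["status"] == "Dirty", "action": "Suck"},
    {"condition": lambda s: s["location"] == "A", "action": "Right"},
    {"condition": lambda s: s["location"] == "B", "action": "Left"},
]


def simple_reflex_agent(percept, rules):
    state = interpret_input(percept)
    rule = rule_match(state, rules)
    return rule["action"] if rule is not None else None


for percept in [("A", "Dirty"), ("A", "Clean"), ("B", "Clean")]:
    print(percept, "->", simple_reflex_agent(percept, VACUUM_RULES))
```

With these rules the generic agent reproduces the behavior of `reflex_vacuum_agent`, but the rules can be swapped without touching the agent program.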

### 2.4.3 Model-based reflex agents

The most effective way to handle partial observability is for the agent to keep track of the part of the world it can’t see now. That is, the agent should maintain some sort of **internal state** that depends on the percept history and thereby reflects at least some of the unobserved aspects of the current state.

Updating this internal state information as time goes by requires two kinds of knowledge. First, we need some information about how the world changes over time. This knowledge about “how the world works” is called a **transition model** of the world. Second, we need some information about how the state of the world is reflected in the agent’s percepts. This kind of knowledge is called a **sensor model**.

<img src="https://i.ibb.co/v47H1TNr/image.png" width="250px">

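```python
# update_state and rule_match are not defined in these notes; they are assumed
# to be supplied elsewhere.
def model_based_reflex_agent(percept, state, transition_model, sensor_model, rules, action=None):
    # Fold the most recent action and the new percept into the internal state
    state = update_state(state, action, percept, transition_model, sensor_model)
    rule = rule_match(state, rules)
    action = rule['action']
    # Return the state as well, so the caller can pass it (and the action)
    # back in on the next call
    return state, action
```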
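
As a concrete sketch, here is a model-based agent for the two-square vacuum world that remembers what it has seen of the other square and stops once it believes both are clean. The state layout, the `NoOp` action, and the simple transition and sensor models written into `update_state` are assumptions for illustration only.

```python
def update_state(state, action, percept):
    """Predict with the transition model, then correct with the percept."""
    new_state = dict(state)
    # Transition model (assumed): Suck cleans the square the agent was on;
    # squares the agent cannot see keep their last known status.
    if action == "Suck" and new_state["location"] is not None:
        new_state[new_state["location"]] = "Clean"
    # Sensor model (assumed noise-free): the percept reports the current
    # location and that square's status exactly.
    location, status = percept
    new_state["location"] = location
    new_state[location] = status
    return new_state


def model_based_vacuum_agent(percept, state, action=None):
    state = update_state(state, action, percept)
    here = state["location"]
    other = "B" if here == "A" else "A"
    if state[here] == "Dirty":
        action = "Suck"
    elif state[other] == "Clean":
        action = "NoOp"  # both squares are believed clean
    else:
        action = "Right" if here == "A" else "Left"
    return state, action


# None marks "not yet known".
state = {"location": None, "A": None, "B": None}
action = None
trace = []
for percept in [("A", "Dirty"), ("A", "Clean"), ("B", "Clean")]:
    state, action = model_based_vacuum_agent(percept, state, action)
    trace.append(action)
print(trace)
```

A simple reflex agent given the same percepts would keep shuttling between the squares; the internal state is what lets this agent conclude there is nothing left to do.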

### 2.4.4 Goal-based agents

**Search** and **planning** are the subfields of AI devoted to finding **action sequences** that achieve the agent’s goals.

Although the goal-based agent appears less efficient, it is more flexible because the knowledge that supports its decisions is represented explicitly and can be modified.

<img src="https://i.ibb.co/MD9YVjv5/image.png" width="250px">
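
To illustrate finding an action sequence for a goal, the sketch below runs a breadth-first search over the two-square vacuum world. The state encoding `(location, dirty_a, dirty_b)`, the deterministic `result` function, and the goal test are assumptions for illustration; breadth-first search is just one simple choice of search method.

```python
from collections import deque

ACTIONS = ("Suck", "Left", "Right")


def result(state, action):
    """Assumed deterministic transition model for the two-square world."""
    location, dirty_a, dirty_b = state
    if action == "Suck":
        if location == "A":
            return (location, False, dirty_b)
        return (location, dirty_a, False)
    if action == "Left":
        return ("A", dirty_a, dirty_b)
    return ("B", dirty_a, dirty_b)  # "Right"


def is_goal(state):
    """Goal: neither square is dirty."""
    _, dirty_a, dirty_b = state
    return not dirty_a and not dirty_b


def plan(start):
    """Breadth-first search; returns a shortest action list or None."""
    frontier = deque([(start, [])])
    explored = {start}
    while frontier:
        state, actions = frontier.popleft()
        if is_goal(state):
            return actions
        for action in ACTIONS:
            nxt = result(state, action)
            if nxt not in explored:
                explored.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None


print(plan(("A", True, True)))
```

Changing the goal only requires changing `is_goal`; the search code stays the same, which is the flexibility that explicit goals buy.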

### 2.4.5 Utility-based agents

An agent’s **utility function** is essentially an internalization of the **performance measure**. Provided that the internal utility function and the external performance measure are in agreement, an agent that chooses actions to maximize its **utility** will be rational according to the external performance measure.

Perfect rationality is usually unachievable in practice because of computational complexity. Not all utility-based agents are model-based; a **model-free agent** can learn what action is best in a particular situation without ever learning exactly how that action changes the environment.

<img src="https://i.ibb.co/vvmBdcnQ/image.png" width="250px">
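
The model-free idea can be sketched as an agent that keeps a running average of the reward seen for each (situation, action) pair and picks the action with the best average. It never represents how actions change the environment. The class name, the situation labels, and the reward values are hypothetical; this is one simple estimator, not the method of any particular system.

```python
from collections import defaultdict


class ModelFreeAgent:
    """Learns action values from rewards alone, with no transition model."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.value = defaultdict(float)  # average reward per (situation, action)
        self.count = defaultdict(int)    # number of samples per pair

    def learn(self, situation, action, reward):
        key = (situation, action)
        self.count[key] += 1
        # Incremental update of the running mean
        self.value[key] += (reward - self.value[key]) / self.count[key]

    def best_action(self, situation):
        return max(self.actions, key=lambda a: self.value[(situation, a)])


agent = ModelFreeAgent(["Suck", "Right"])
# Hypothetical experience: (situation, action, reward)
for situation, action, reward in [
    ("Dirty", "Suck", 1.0),
    ("Dirty", "Suck", 0.0),
    ("Dirty", "Right", 0.0),
    ("Clean", "Right", 0.5),
    ("Clean", "Suck", 0.0),
]:
    agent.learn(situation, action, reward)

print(agent.best_action("Dirty"), agent.best_action("Clean"))
```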

### 2.4.6 Learning agents

The most important distinction is between the **learning element**, which is responsible for making improvements, and the **performance element**, which is responsible for selecting external actions. The performance element is what we have previously considered to be the entire agent.

Learning in intelligent agents can be summarized as a process of modification of each component of the agent to bring the components into closer agreement with the available feedback information, thereby improving the overall performance of the agent.

<img src="https://i.ibb.co/9HTNWY08/image.png" width="250px">

### 2.4.7 How the components of agent programs work

We can place the representations along an axis of increasing complexity and expressive power—atomic, factored, and structured.

In an **atomic representation**, each state of the world is indivisible: it has no internal structure. It is a single atom of knowledge, a **“black box”** whose only discernible property is that of being identical to or different from another black box. The standard algorithms underlying search and game-playing, hidden Markov models, and Markov decision processes all work with atomic representations.

A **factored representation** splits up each state into a fixed set of **variables or attributes**, each of which can have a **value**. Two different factored states can share some attributes. This makes it much easier to work out how to turn one state into another. Many important areas of AI are based on factored representations, including constraint satisfaction algorithms, propositional logic, planning, and Bayesian networks.

In a **structured representation**, objects and their various and varying relationships can be described explicitly. Structured representations underlie relational databases and first-order logic, first-order probability models, and much of natural language understanding.

<img src="https://i.ibb.co/13G9WJy/image.png" width="500">
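
The three kinds of representation can be contrasted on the vacuum world. The encodings below are illustrative assumptions: opaque labels for atomic states, variable-to-value dictionaries for factored states, and a set of relation tuples for a structured state.

```python
# Atomic: each state is an opaque label; the only question we can ask is
# whether two states are the same.
atomic_s1, atomic_s2 = "s1", "s2"
same_atomic = atomic_s1 == atomic_s2

# Factored: each state is a fixed set of variables, each with a value.
factored_s1 = {"location": "A", "status_A": "Dirty", "status_B": "Clean"}
factored_s2 = {"location": "B", "status_A": "Dirty", "status_B": "Clean"}


def shared_attributes(s1, s2):
    """Variables on which two factored states agree."""
    return {var for var in s1 if s2.get(var) == s1[var]}


# Structured: objects plus explicit relations among them.
structured_s1 = {
    ("At", "robot", "A"),
    ("Dirty", "A"),
    ("LeftOf", "A", "B"),
}

print(same_atomic)
print(shared_attributes(factored_s1, factored_s2))
print(("LeftOf", "A", "B") in structured_s1)
```

The atomic labels reveal only that the two states differ, while the factored form shows that they differ in `location` alone, which is what makes it easier to work out how to get from one to the other.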