# Chapter 16 - Making Simple Decisions

*In which we see how an agent should make decisions so that it gets what it wants in an
uncertain world—at least as much as possible and on average.* - Peter Norvig and Stuart Russell in Artificial Intelligence: A Modern Approach

## Table of Contents

Chapter 16, "Making Simple Decisions," in "Artificial Intelligence: A Modern Approach" by Stuart Russell and Peter Norvig, is a comprehensive examination of how agents can make rational decisions under conditions of uncertainty. This chapter integrates utility theory with probability theory, providing a foundational framework for decision-making processes in artificial intelligence. It starts with the premise that decisions involve choosing among actions with uncertain outcomes, emphasizing the importance of considering both the agent's beliefs (probabilities) and desires (utilities) to make rational choices.

The chapter is organized into several key sections, beginning with an exploration of how beliefs and desires under uncertainty can be combined through expected utility theory, progressing through the intricacies of utility functions, multiattribute utility functions, and decision networks. It also discusses the value of information and how it can influence decision-making, and it concludes with an examination of how to handle unknown preferences.

By the end of the chapter, readers are expected to understand the principles guiding rational decision-making in uncertain environments, how to construct and use decision networks, and the importance of information in shaping decisions. 

The subchapters of Chapter 16, "Making Simple Decisions," are as follows:

- 16.1 Combining Beliefs and Desires under Uncertainty
- 16.2 The Basis of Utility Theory
- 16.3 Utility Functions
- 16.4 Multiattribute Utility Functions
- 16.5 Decision Networks
- 16.6 The Value of Information
- 16.7 Unknown Preferences

## 16.1 "Combining Beliefs and Desires under Uncertainty"

- **Introduction to Decision Theory** : It starts by introducing the concept of decision theory, which aims to combine utility theory with probability theory to guide how an agent can make rational decisions based on its beliefs and desires. 
- **Action and Outcome Uncertainty** : It acknowledges the uncertainties in both the agent's current state and the outcomes of its actions. The transition model, represented by the probability $P(s' \mid s, a)$, indicates the likelihood of reaching a new state $s'$ after taking action $a$ in state $s$. 
- **Expected Utility Theory** : Central to decision theory is the principle of maximizing expected utility. The expected utility of an action, given the agent's evidence, is the sum of the utilities of possible outcomes weighted by their probabilities. 
- **Maximizing Expected Utility (MEU)** : The MEU principle dictates that a rational agent should choose the action that maximizes its expected utility. This principle is presented as a fundamental guide for intelligent behavior, highlighting the importance of calculating and maximizing utility over actions. 
- **Operational Challenges** : While the MEU principle provides a clear directive for rational action, operationalizing this principle in AI involves complex challenges such as estimating probabilities over possible world states, requiring capabilities in perception, learning, knowledge representation, and inference. 
- **Relation to Performance Measures** : The MEU principle is related to the idea of performance measures introduced in earlier chapters. It justifies the use of utility functions to guide step-by-step actions towards achieving the highest possible performance score across possible environments.

These points illustrate the foundational concepts of decision theory as applied to AI, emphasizing the integration of beliefs (probabilities) and desires (utilities) to make rational decisions under uncertainty.
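
As a minimal sketch of the MEU principle described above, the following computes the expected utility of each action as the probability-weighted sum of outcome utilities and picks the maximum. The action names, outcome probabilities, and utility values are invented purely for illustration; they do not come from the chapter.

```python
def expected_utility(outcome_probs, utility):
    """Sum of U(s') weighted by P(s' | a, e) for a single action."""
    return sum(p * utility[s] for s, p in outcome_probs.items())


def meu_action(transition, utility):
    """Return (best_action, its expected utility) under the MEU principle."""
    scores = {a: expected_utility(probs, utility) for a, probs in transition.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical utilities over outcome states (assumed values).
utility = {"dry": 10.0, "wet": -20.0, "dry_but_burdened": 6.0}

# Hypothetical outcome distribution for each action (assumed values).
transition = {
    "take_umbrella": {"dry_but_burdened": 1.0},
    "leave_umbrella": {"dry": 0.75, "wet": 0.25},
}

best_action, best_eu = meu_action(transition, utility)
print(best_action, best_eu)
```

Here the gamble of leaving the umbrella has the better best case but the lower expected utility, so an MEU agent takes the umbrella.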

## 16.2 "The Basis of Utility Theory" 

This section introduces the fundamental concepts underlying utility theory, which are essential for understanding rational decision-making. 

- **Rationality and Utility Maximization** : The chapter begins by questioning why maximizing expected utility (MEU) is considered the only rational way to make decisions, suggesting that while MEU seems intuitive, other methods could also appear rational. 
- **Constraints on Rational Preferences** : It introduces constraints that any rational preference ordering must satisfy: orderability (any two options can be ranked or judged equally preferable), transitivity (if A is preferred to B and B is preferred to C, then A should be preferred to C), continuity (if B lies between A and C in preference, some lottery over A and C is exactly as good as B), substitutability (equally preferred options can be swapped inside a more complex lottery without altering the preference order), monotonicity (a preference for lotteries with a higher probability of a preferred outcome), and decomposability (compound lotteries can be reduced to simpler ones using the laws of probability). 
- **Rational Preferences Lead to a Utility Function** : It argues that if an agent's preferences satisfy the axioms of utility theory, then a utility function exists that can represent these preferences. This utility function allows preferences to be expressed numerically, facilitating rational decision-making. 
- **Expected Utility of a Lottery** : It describes how the utility of a lottery (a probabilistic set of outcomes) can be calculated as the sum of the utilities of its outcomes, weighted by their probabilities. This concept is foundational for understanding how decisions under uncertainty can be evaluated in terms of expected utility. 
- **Utility Functions and Rational Decision-Making** : The chapter emphasizes that having a utility function enables an agent to make decisions that are consistent with its preferences, even in the face of uncertainty. This is because decisions can be evaluated based on the expected utility they generate, aligning with the principle of maximizing expected utility.

These points illustrate the theoretical underpinnings of utility theory as it applies to rational decision-making, setting the stage for its application in various contexts, including AI and economics.

### 16.2.1 "Constraints on Rational Preferences" 

This section outlines essential criteria that rational agents' preferences must meet, providing a foundation for the Maximum Expected Utility (MEU) principle. These constraints ensure that preferences are consistent and rational across various scenarios, especially under uncertainty.

Here's a summary of the six constraints: 
- **Orderability** : A rational agent must be able to order any two lotteries by preference or consider them equally preferable. This means the agent must decide among options, avoiding indecision. 
  - *Example*: For any two lotteries A and B, exactly one of these relationships holds: A ≻ B, B ≻ A, or A ∼ B. 
- **Transitivity** : If an agent prefers A over B and B over C, then the agent must also prefer A over C, ensuring consistency in preference order. 
  - *Example*: (A ≻ B) ∧ (B ≻ C) implies A ≻ C. 
- **Continuity** : If an agent's preference for B is between A and C, there exists a probability p where the agent is indifferent between choosing B for sure and a lottery offering A with probability p and C with probability 1−p. 
  - *Example*: A ≻ B ≻ C implies the existence of a p where [p,A; 1−p,C] ∼ B. 
- **Substitutability** : If an agent is indifferent between two lotteries A and B, then substituting A for B in any complex lottery should not change the agent's preference. 
  - *Example*: A ∼ B implies [p,A; 1−p,C] ∼ [p,B; 1−p,C]. The axiom also holds with ≻ in place of ∼ throughout. 
- **Monotonicity** : When two lotteries have the same outcomes but different probabilities, the agent must prefer the lottery with the higher probability for the more preferred outcome. 
  - *Example*: If A ≻ B, then [p,A; 1−p,B] ≻ [q,A; 1−q,B] exactly when p > q. 
- **Decomposability** : Preferences for compound lotteries can be broken down into simpler ones without changing the overall preference, adhering to probability laws. 
  - *Example*: [p,A; 1−p,[q,B; 1−q,C]] can be simplified to [p,A; (1−p)q,B; (1−p)(1−q),C] without altering the agent's preference.

These axioms ensure that an agent's preference system is rational and can be represented through a utility function, supporting rational decision-making under uncertainty. Violating these constraints can lead to irrational behavior, such as being exploited in a cycle of nontransitive preferences.
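
The decomposability axiom is the easiest of the six to make concrete. The sketch below reduces a compound lottery to an equivalent simple one by multiplying probabilities along each branch. The list-of-pairs representation and the function name `flatten` are assumptions chosen for this illustration, not notation from the book.

```python
def flatten(lottery):
    """Reduce a compound lottery to a simple one.

    A lottery is a list of (probability, outcome) pairs, where each outcome
    is either a label (str) or another lottery (list). Returns a dict that
    maps each outcome label to its overall probability.
    """
    flat = {}
    for p, outcome in lottery:
        if isinstance(outcome, list):
            # Nested lottery: scale its probabilities by the branch probability.
            for label, q in flatten(outcome).items():
                flat[label] = flat.get(label, 0.0) + p * q
        else:
            flat[outcome] = flat.get(outcome, 0.0) + p
    return flat


# [p, A; 1-p, [q, B; 1-q, C]] with p = 0.5 and q = 0.5
compound = [(0.5, "A"), (0.5, [(0.5, "B"), (0.5, "C")])]
simple = flatten(compound)
print(simple)
```

With p = q = 0.5 this gives probabilities p, (1−p)q, and (1−p)(1−q) for A, B, and C, matching the decomposability example.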

### 16.2.2 Rational Preferences Lead to Utility

This subchapter delves into how the axioms of utility theory, which are essentially about preferences, imply the existence of a utility function. Here are the key points summarized:
 
- **Existence of Utility Function** : It states that if an agent's preferences adhere to the utility axioms, there must be a utility function $U$ such that $U(A) > U(B)$ if $A$ is preferred over $B$, and $U(A) = U(B)$ if the agent is indifferent between $A$ and $B$. This foundational principle establishes a direct correspondence between preferences and numerical utility values. 
- **Expected Utility of a Lottery** : The utility of a lottery is defined as the weighted sum of the utilities of its outcomes, with the weights being the probabilities of these outcomes. This formula underpins the concept of expected utility, which is crucial for making rational decisions in uncertain conditions. 
- **Utility Function Uniqueness and Transformation** : While a utility function for any rational agent exists under these axioms, the function itself is not unique. An agent's utility function can undergo a positive affine transformation ($U'(S) = aU(S) + b$, where $a > 0$) without altering the agent's behavior. This property highlights the relative nature of utility values. 
- **Value Function in Deterministic Environments** : In environments where outcomes are deterministic, an agent only needs a preference ranking of states, referred to as a value function or ordinal utility function. The specific numerical values of the utility function are less important than the ranking order they represent. 
- **Utility Maximization and Rational Behavior** : The existence of a utility function aligning with an agent's preferences does not necessarily imply that the agent consciously maximizes that utility in its decision-making process. Rational behavior can manifest through various mechanisms, such as table lookup for small state spaces. 
- **Observation and Inference of Utility Functions** : Observers can infer an agent's utility function by analyzing its behavior, even if the agent itself is not explicitly aware of the utility function it is maximizing. This concept is crucial for understanding and predicting the actions of rational agents.

This subchapter fundamentally links the theoretical underpinnings of utility theory with practical decision-making, illustrating how rational preferences naturally lead to the concept of utility and its application in AI and decision theory.
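
To illustrate two of the points above, the following sketch computes the utility of a lottery as a probability-weighted sum and checks that a positive affine transformation of the utility function leaves the ranking of lotteries unchanged. The outcome labels, the utility values, and the constants a = 3 and b = 7 are arbitrary assumptions.

```python
def lottery_utility(lottery, U):
    """Expected utility of a lottery given as (probability, outcome) pairs."""
    return sum(p * U(outcome) for p, outcome in lottery)


def U(outcome):
    # Hypothetical utilities for three outcomes (assumed values).
    return {"A": 1.0, "B": 0.6, "C": 0.0}[outcome]


def U_prime(outcome):
    # Positive affine transformation U' = a*U + b with a = 3 > 0 and b = 7.
    return 3.0 * U(outcome) + 7.0


lotteries = {
    "sure_B": [(1.0, "B")],
    "gamble": [(0.5, "A"), (0.5, "C")],
}

# Rank lotteries from worst to best under each utility function.
rank = sorted(lotteries, key=lambda name: lottery_utility(lotteries[name], U))
rank_prime = sorted(lotteries, key=lambda name: lottery_utility(lotteries[name], U_prime))
print(rank, rank_prime)
```

The numeric utilities differ under `U` and `U_prime`, but the ordering of the two lotteries is the same, so an agent using either function would choose identically.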


## 16.3 "Utility Functions" 

We delve into the concept of utility functions and their properties within the context of decision theory. 

- **Definition and Flexibility** : Utility functions translate lotteries or outcomes into real numbers, adhering to the axioms of orderability, transitivity, continuity, substitutability, monotonicity, and decomposability. This section underscores the flexibility in an agent's preferences, illustrating that practically any set of preferences, as long as it aligns with these axioms, can be rational. For instance, an agent's preference for having a prime number of dollars in its bank account, while peculiar, is not deemed irrational within the framework of utility theory. 
- **Preference Diversity** : The chapter highlights that while agents can have any preferences, including unusual ones like preferring a dented 1973 Ford Pinto over a brand-new Mercedes under certain conditions, real-world agents tend to have more systematic preferences that are easier to model and predict. 
- **Utility and Real-world Analogies** : It draws an analogy between utilities and temperature measurements, noting that converting temperatures between Fahrenheit and Celsius doesn't change the underlying physical temperature, similar to how different utility scales or transformations don't alter the preference orderings they represent.

### 16.3.1 Utility Assessment and Utility Scales

This subchapter explores how to determine a human's utility function for decision-theoretic systems, a process known as preference elicitation. Here are the key points: 
- **Preference Elicitation** : This is the process of determining a human's utility function by presenting choices and observing preferences. It's essential for systems that help humans make decisions or act on their behalf. 
- **Utility Scales** : Although there's no absolute scale for utilities, establishing a comparative scale is crucial. This involves assigning finite utilities to a "best possible prize" and a "worst possible catastrophe"; normalized utilities set these to 1 and 0, respectively. 
- **Standard Lottery Method** : To assess the utility of a particular outcome, individuals are asked to choose between the outcome and a standard lottery that offers the best and worst outcomes with certain probabilities. The point of indifference, where the individual values both options equally, helps determine the utility of the outcome. 
- **Application Examples** : The method can be applied to various scenarios, from sports outcomes to life-and-death decisions in medical or environmental contexts. The value placed on human life, despite its discomforting nature, is crucial for making informed decisions on policies and safety measures. 
- **Monetary Value on Life** : The concept introduces the use of a "statistical life" value by government agencies to balance costs and benefits of regulations. Despite ethical concerns, failing to assign a value can lead to undervaluing life in decision-making processes. 
- **Micromorts and QALYs** : These units measure the value individuals and societies place on life and health. A micromort is a one-in-a-million chance of death, so the amount people will pay to avoid one micromort puts a price on small risks. A QALY (quality-adjusted life year) measures the trade-offs people make between life expectancy and quality of life.

This section underscores the complexity and ethical considerations in utility assessment, illustrating how utilities are not merely abstract concepts but deeply connected to real-world values and decisions.
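
The standard lottery method can be sketched as a search for the indifference probability. On a normalized scale, where the best prize has utility 1 and the worst catastrophe has utility 0, the utility of an outcome equals the probability p at which the person is indifferent between the outcome and the lottery [p, best; 1−p, worst]. The bisection procedure, the function name `elicit_utility`, and the simulated respondent below are all assumptions made for illustration; a real elicitation would ask a person a handful of questions rather than run to numerical tolerance.

```python
def elicit_utility(prefers_lottery, tol=1e-6):
    """Bisect on p until the respondent is (nearly) indifferent.

    prefers_lottery(p) must return True when the respondent prefers the
    standard lottery [p, best; 1-p, worst] to the sure outcome. Assumes
    the answers are consistent: once the lottery is preferred at some p,
    it stays preferred for every larger p.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        if prefers_lottery(p):
            hi = p  # lottery is too attractive, so indifference lies below p
        else:
            lo = p  # sure outcome still preferred, so indifference lies above p
    return (lo + hi) / 2


# Simulated respondent whose hidden normalized utility for the outcome is 0.8.
hidden_utility = 0.8
estimate = elicit_utility(lambda p: p > hidden_utility)
print(round(estimate, 4))
```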

### 16.3.2 The Utility of Money

This section discusses the relationship between money and utility, illustrating the complexities of how monetary value influences decision-making. Key points include: 
- **Monotonic Preference** : Generally, agents prefer more money to less, assuming other conditions remain constant. However, this preference does not directly translate to how decisions are made in monetary lotteries, indicating money's complex role in utility functions. 
- **Expected Monetary Value (EMV) vs. Utility** : An example highlights that a rational decision does not solely depend on the EMV of a gamble. For instance, choosing a guaranteed $1,000,000 over a 50/50 chance of winning $2,500,000 or nothing reflects the non-linear utility of money rather than its arithmetic average. 
- **Utility and Wealth** : Utility is not directly proportional to money. The utility derived from financial gains decreases as wealth increases, a concept known as diminishing marginal utility. This means the first million dollars has a significantly higher utility than subsequent millions. 
- **Logarithmic Utility of Money** : Studies suggest that the utility of money for most people is proportional to the logarithm of the amount, leading to a concave utility function for positive wealth and possibly an S-shaped curve when including debt. This reflects varying attitudes towards risk at different levels of wealth. 
- **Risk Aversion and Risk Seeking** : People tend to be risk-averse with positive wealth, preferring sure outcomes over gambles with higher expected values. Conversely, those in significant debt might exhibit risk-seeking behaviors. 
- **Certainty Equivalent and Insurance Premium** : The certainty equivalent is the guaranteed amount an agent would accept instead of a gamble. It often is less than the gamble's EMV, illustrating risk aversion. The difference between the EMV and the certainty equivalent is the insurance premium, underpinning the insurance industry. 
- **Risk Neutrality for Small Gambles** : For small changes in wealth, utility functions can be approximately linear, leading to risk-neutral behavior. This justifies the use of small gambles in assessing probabilities and supports the axioms of probability.

This subchapter underscores the nuanced role of money in decision-making processes, revealing that while money is a universal measure of value, its utility is subjective and varies greatly depending on individual circumstances and wealth levels.
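
The gamble discussed above can be worked through numerically. The sketch below assumes a logarithmic utility of total wealth and a starting wealth of $100,000; both are illustrative assumptions, and a different utility curve or wealth level would change the numbers. It computes the expected monetary value, the certainty equivalent, and the insurance premium for the 50/50 gamble on $2,500,000.

```python
import math


def certainty_equivalent(lottery, wealth):
    """Sure payoff with the same utility as the lottery under U(w) = log(w).

    lottery is a list of (probability, payoff) pairs; wealth is the agent's
    current wealth, which must keep every final wealth level positive.
    """
    expected_u = sum(p * math.log(wealth + payoff) for p, payoff in lottery)
    return math.exp(expected_u) - wealth


gamble = [(0.5, 2_500_000), (0.5, 0)]
sure_thing = 1_000_000

emv = sum(p * payoff for p, payoff in gamble)        # expected monetary value
ce = certainty_equivalent(gamble, wealth=100_000)    # assumed starting wealth
premium = emv - ce                                   # insurance premium

print(f"EMV: {emv:,.0f}  certainty equivalent: {ce:,.0f}  premium: {premium:,.0f}")
```

Under these assumptions the certainty equivalent falls well below the guaranteed $1,000,000 even though the gamble's EMV is higher, which is the risk-averse choice described in the text.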


### 16.3.3 Expected Utility and Post-Decision Disappointment

This section addresses the concept of expected utility as the foundation for making rational decisions and the phenomenon known as the "optimizer's curse," which can lead to post-decision disappointment. Here are the key points: 
- **Rational Choice Maximization** : The optimal action ($a^*$) is determined by maximizing the expected utility ($EU$), assuming the probability model accurately reflects the stochastic processes of the outcomes. 
- **Challenges with Estimates** : Often, the expected utility is estimated ($\widehat{EU}(a)$) because of incomplete knowledge or computational difficulties. These estimates are assumed to be unbiased, meaning the expected error between the estimated and true expected utilities is zero. 
- **Discrepancy in Real Outcomes** : Despite unbiased estimates, the actual outcome frequently falls short of expectations. This discrepancy is due to the natural bias towards overly optimistic estimates when selecting among multiple options. 
- **Optimizer's Curse** : This term describes the tendency for the estimated expected utility of the best choice to be overly optimistic. It occurs because the selection process inherently favors estimates with positive errors, leading to systematic overestimation of the utility of chosen actions. 
- **Quantifying Disappointment** : The extent of post-decision disappointment can be quantified by analyzing the distribution of the maximum of the utility estimates. As the number of options increases, so does the likelihood of encountering extremely optimistic estimates, thereby increasing potential disappointment. 
- **Bayesian Approach to Mitigation** : To counteract the optimizer's curse, a Bayesian approach is recommended, incorporating an explicit probability model for the error in utility estimates. This model, combined with prior expectations about utilities, allows for the adjustment of utility estimates towards more realistic values using Bayes' rule.

The discussion emphasizes the importance of critically evaluating utility estimates and the benefits of adopting a Bayesian framework to mitigate inherent biases in decision-making processes, thereby reducing the risk of post-decision disappointment.
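
A small simulation can make the optimizer's curse tangible. In the sketch below, every one of k actions has a true expected utility of 0, and each estimate is the true value plus independent standard normal noise, so every estimate is unbiased. The setup (normal noise, unit standard deviation, the trial count, and the function name) is an assumption chosen for illustration. The average of the maximum estimate is then a direct measure of expected disappointment.

```python
import random


def mean_disappointment(k, trials=20_000, seed=0):
    """Average of the largest of k unbiased estimates when every true EU is 0.

    Each estimate is drawn from N(0, 1), so any positive average comes
    entirely from selecting the maximum, not from biased estimates.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(k))
    return total / trials


for k in (1, 3, 30):
    print(k, round(mean_disappointment(k), 2))
```

With a single option the average is close to zero, with three options it is roughly 0.85 standard deviations, and it keeps growing as more options are compared.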

<img src="https://raw.githubusercontent.com/ValRCS/RBS_PBM773_Introduction_to_AI/main/img/ch16_making_simple_decisions/DALL%C2%B7E%202024-03-04%2014.34.24%20-%20Create%20an%20image%20of%20a%20scene%20at%20a%20remote%20gas%20station%20in%20the%20desert%2C%20focusing%20on%20a%20single%20character%20inspired%20by%20Javier%20Bardem%20with%20the%20distinctive%20haircu.webp" alt="javier" width="400">

### 16.3.4 Human Judgment and Irrationality

This section explores the discrepancies between how humans actually make decisions (descriptive theory) and how a rational agent should make decisions according to decision theory (normative theory). It highlights several phenomena illustrating human irrationality in decision-making: 
- **Allais Paradox** : Demonstrates inconsistencies in human choices among lotteries, challenging the consistency expected in utility theory. People often prefer a sure thing over a probabilistically higher value, contradicting the expected utility principle. 
- **Certainty Effect** : Humans have a strong preference for certain outcomes, influenced by the desire to avoid computational complexity, distrust in probabilities, and emotional factors like the potential for regret. 
- **Ellsberg Paradox** : Shows that people have an aversion to ambiguity. Given a choice, individuals prefer known probabilities over unknown ones, even when it may not be rational. 
- **Framing Effect** : The way a problem is presented can significantly influence decisions. For example, a procedure with a "90% survival rate" is preferred over one with a "10% death rate," despite both statements conveying the same information. 
- **Anchoring Effect** : Initial exposure to a number (an anchor) can skew subsequent judgments and decisions, such as perceiving a $55 bottle of wine as a bargain after seeing a $200 bottle listed.

These examples illustrate that human decision-making often deviates from the rational agent model due to psychological biases and the influence of how information is presented. While these paradoxes and effects suggest irrationality, further exploration reveals that humans can adjust their judgments upon reflection or when information is presented in a more comprehensible manner. Evolutionary psychology offers insights into why the brain's decision-making mechanisms, shaped by evolutionary pressures, may not align with abstract numerical and probability-based decision problems, suggesting that humans may act more rationally when decisions are framed in contextually familiar or "evolutionarily appropriate" formats.

## 16.4 "Multiattribute Utility Functions"

Here we explore the extension of utility theory to scenarios involving multiple attributes, which is crucial in fields like public policy where decisions impact both financial and human aspects.

- **Introduction to Multiattribute Utility Theory** : This section introduces the concept of multiattribute utility theory, which is essential for making decisions where outcomes are characterized by multiple attributes. It emphasizes that this theory enables the comparison of options that differ across several dimensions, likening it to comparing apples to oranges. 
- **Attributes and Their Representation** : It discusses how to represent multiple attributes mathematically, focusing on ensuring that higher attribute values always correspond to higher utilities. This might involve transforming attributes so they align with the utility function's requirements. 
- **Example Attributes for an Airport Site Selection** : Provides concrete examples of attributes in the context of choosing a site for a new airport, such as throughput (number of flights), safety (measured by expected number of deaths), quietness (based on the number of people living under flight paths), and frugality (cost of construction), illustrating how complex real-world decisions can be broken down into quantifiable attributes. 
- **Dominance in Decision Making** : Introduces the concept of dominance, particularly strict dominance, where if one option is better than another across all attributes, the inferior option can be disregarded. It further explains how dominance helps narrow down choices but often does not lead to a single best option. 
- **Stochastic Dominance** : Expands the concept of dominance to include uncertain outcomes, showing how one option can be preferable to another under uncertainty by comparing their distributions across attributes. 
- **Preference Structure and Utility** : Discusses how preferences across multiple attributes can be structured and simplified, aiming to reduce the complexity of the utility function. It introduces preference independence and mutual preferential independence (MPI) for the deterministic case, and utility independence and mutual utility independence (MUI) for the case with uncertainty. 
- **Additive and Multiplicative Utility Functions** : Details how preferences with such structure can be represented by simpler functions: MPI yields an additive value function, and MUI yields a multiplicative utility function, reducing the need to assess complex multi-dimensional utility functions directly.

These key ideas demonstrate the complexity of decision-making in scenarios with multiple attributes and outline how utility theory provides a framework for systematically comparing and evaluating different options based on their respective utilities across all relevant dimensions.
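
The following sketch combines two of the ideas above: it first discards strictly dominated options, then ranks the survivors with an additive value function. The three candidate sites, their attribute scores (scaled so that higher is always better), and the weights are hypothetical, and the additive form is only justified when the attributes are mutually preferentially independent.

```python
def strictly_dominates(a, b):
    """True if option a is better than option b on every attribute."""
    return all(x > y for x, y in zip(a, b))


def non_dominated(options):
    """Keep only the options that no other option strictly dominates."""
    return {
        name: attrs
        for name, attrs in options.items()
        if not any(
            strictly_dominates(other, attrs)
            for other_name, other in options.items()
            if other_name != name
        )
    }


def additive_value(attrs, weights):
    """Additive value function: weighted sum of attribute scores."""
    return sum(w * x for w, x in zip(weights, attrs))


# Hypothetical sites scored on (safety, quietness, frugality), higher is better.
sites = {
    "S1": (0.9, 0.6, 0.5),
    "S2": (0.7, 0.4, 0.3),
    "S3": (0.5, 0.9, 0.8),
}
weights = (0.5, 0.25, 0.25)  # assumed relative importance of the attributes

survivors = non_dominated(sites)
best = max(survivors, key=lambda name: additive_value(survivors[name], weights))
print(sorted(survivors), best)
```

Dominance alone removes S2 but cannot choose between S1 and S3; the weights are what settle the final choice.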

## 16.5 "Decision Networks" 

Our focus is now on the formalism of decision networks (also known as influence diagrams) as a means for making rational decisions in uncertain environments. Here are the key ideas: 
- **Introduction to Decision Networks** : The subchapter starts by introducing decision networks as a general mechanism for making rational decisions, combining Bayesian networks with additional nodes for actions and utilities, using the example of selecting a site for a new airport. 
- **Components of Decision Networks** : It breaks down the structure of decision networks into three types of nodes: 
  - **Chance nodes** represent random variables and uncertainties in the environment, such as construction costs or air traffic levels, which depend on the chosen site. 
  - **Decision nodes** symbolize the points where the decision maker has choices of actions, influencing the outcomes in terms of safety, quietness, and frugality. 
  - **Utility nodes** encapsulate the agent’s utility function, indicating the utility of outcomes as a function of their attributes. This includes both deterministic utilities and the expected utility associated with each action, simplifying decision-making under uncertainty. 
- **Evaluating Decision Networks** : Describes the process for selecting actions by evaluating the decision network for each possible decision, setting decision nodes similarly to evidence variables in Bayesian networks. This involves setting current state evidence, calculating posterior probabilities for outcomes, and then computing the utility for each action to select the one with the highest utility.

These concepts illustrate how decision networks provide a structured and formal approach to decision-making under uncertainty, integrating probabilistic reasoning with utility theory to guide the selection of rational actions based on expected outcomes.

### 16.5.1 Representing a Decision Problem with a Decision Network

Decision networks, expanding upon Bayesian networks, offer a structured framework for representing and solving decision problems by incorporating information about an agent's current state, potential actions, resulting states from those actions, and the utility or value of those states. This approach is instrumental in developing utility-based agents, providing a comprehensive model for decision-making processes.

A decision network is characterized by three types of nodes, each serving a distinct purpose: 
- **Chance Nodes (Ovals)** : These nodes represent random variables, embodying the uncertainty inherent in various aspects of the decision problem, such as construction costs, air traffic levels, or litigation potentials in an airport-siting scenario. Conditional distributions associated with these nodes account for the state of their parent nodes, which can include both chance and decision nodes, thus capturing the probabilistic nature of the environment. 
- **Decision Nodes (Rectangles)** : These nodes denote the points at which the decision-maker has a choice among different actions. For example, selecting a site for airport construction influences factors like safety, quietness, and cost-effectiveness. The model typically focuses on a singular decision node, with extensions to multiple decisions discussed in subsequent chapters. 
- **Utility Nodes (Diamonds)** : Representing the agent's utility function, these nodes are linked to all variables that directly impact the agent's utility. The utility node encapsulates the agent's preferences or values as a function of various outcomes, providing a quantitative measure of the desirability of different states or outcomes.

The decision network can be simplified by directly connecting the utility node to both the current-state and decision nodes, bypassing the explicit representation of outcome states. This streamlined version focuses on the expected utility of each action, making it less flexible in adapting to changes in circumstances compared to the more detailed form. It effectively compiles the decision problem by summing out the outcome state variables, offering a more straightforward but less adaptable model for decision analysis.

In essence, decision networks provide a powerful tool for modeling and solving complex decision problems by integrating uncertainty, decision-making alternatives, and preferences or utilities into a coherent framework.

<img src="https://github.com/ValRCS/RBS_PBM773_Introduction_to_AI/blob/main/img/ch16_making_simple_decisions/airport_decision_network.jpg?raw=true" width="400" alt="decision network">

### 16.5.2 Solving Decision Networks

```python
# Evaluating a decision network: a runnable sketch of the evaluation algorithm.
# The callback signatures and the toy numbers below are illustrative assumptions.


def evaluate_decision_network(evidence, decision_options, utility_function, probabilistic_inference):
    """
    Evaluate a decision network to find the best action.

    :param evidence: Dictionary containing the evidence variables and their values
    :param decision_options: List of possible values for the decision node
    :param utility_function: Function mapping an outcome (an assignment to the
        utility node's parents) to a numeric utility
    :param probabilistic_inference: Function (evidence, action) -> dict mapping
        each outcome to its posterior probability given the evidence and action
    :return: The best action and its expected utility
    """
    best_action = None
    highest_utility = float('-inf')  # Initialize with negative infinity

    # Step 1: The evidence for the current state is fixed for the whole loop.
    # Step 2: Iterate through each possible value of the decision node.
    for action in decision_options:
        # Steps 2(a)-(b): Set the decision node to this action and compute the
        # posterior probabilities for the parent nodes of the utility node.
        posterior = probabilistic_inference(evidence, action)

        # Step 2(c): Expected utility is the probability-weighted sum of utilities.
        expected_utility = sum(p * utility_function(outcome) for outcome, p in posterior.items())

        # Update the best action if this action has the highest utility so far.
        if expected_utility > highest_utility:
            best_action, highest_utility = action, expected_utility

    # Step 3: Return the action with the highest expected utility.
    return best_action, highest_utility


# Example usage with a made-up problem (all numbers are hypothetical).
def toy_inference(evidence, action):
    """Stand-in for real inference: posterior over outcomes for a shop owner."""
    p_high = 0.7 if evidence['Weather'] == 'Sunny' else 0.2
    if action == 'Open':
        return {'high_sales': p_high, 'low_sales': 1 - p_high}
    return {'closed': 1.0}


toy_utility = {'high_sales': 100, 'low_sales': -40, 'closed': 0}.get

best_action, utility = evaluate_decision_network(
    {'Weather': 'Sunny', 'Day': 'Weekday'}, ['Open', 'Close'], toy_utility, toy_inference)
print(f"Best action: {best_action} with expected utility: {utility}")
```

This sketch follows the algorithm for evaluating decision networks: fix the evidence, try each value of the decision node, compute the posterior over the utility node's parents, and keep the action with the highest expected utility. The `probabilistic_inference` and `utility_function` callbacks, stubbed here with toy values, would need to be implemented based on the specific details of the decision network being evaluated.

## 16.6 "The Value of Information" 

Now we discuss the importance and valuation of information in decision-making processes. 

- **Importance of Acquiring Information** : The section opens with the critical role of knowing which questions to ask and what information to gather before making a decision. This matters most when the decision-maker does not start with all the relevant information, as in medical diagnosis, where tests can be expensive or hazardous.
- **Information Value Theory** : Information value theory helps an agent decide what information is worth acquiring. It treats a simplified form of sequential decision-making in which observation actions affect only the agent's belief state, not the physical state. The value of an observation therefore derives entirely from its potential to change the agent's eventual physical action.
- **Examples and General Formula for Perfect Information** : Using examples such as an oil company deciding whether to pay for a seismic survey before buying drilling rights, the section shows how to calculate the expected gain from acquiring a specific piece of information. It then presents a general formula for the value of perfect information (VPI): the expected utility of the best action after the information is obtained, averaged over the possible observation results, minus the expected utility of the best action without it.
- **Non-Negative Expected Value of Information** : The expected value of information is always non-negative, because in the worst case the decision-maker can ignore the new information and act as it would have anyway. This holds in expectation, before the observation is made; a particular observation can still be misleading and lead to a worse outcome.
- **Myopic and Nonmyopic Information Gathering** : Myopic (shortsighted) information gathering computes the value of information as if only a single piece of evidence will be acquired, while nonmyopic approaches consider sequences of observations. Myopic gathering is likened to greedy search: often effective, but sometimes suboptimal when several pieces of evidence are valuable only in combination.
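
The VPI calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the book's code: the function names, the four-block setup, and the dollar figures are assumptions chosen to mirror the oil-rights example, with exactly one block containing oil and every block priced at its prior expected value.

```python
from fractions import Fraction


def expected_utility(action, belief, utility):
    """Expected utility of `action` under `belief` (a dict: state -> probability)."""
    return sum(p * utility(action, state) for state, p in belief.items())


def best_expected_utility(actions, belief, utility):
    """Expected utility of the best available action under `belief`."""
    return max(expected_utility(a, belief, utility) for a in actions)


def value_of_perfect_information(actions, prior, utility, outcomes):
    """VPI = sum_o P(o) * (best EU given o) - (best EU under the prior).

    `outcomes` is a list of (probability, posterior_belief) pairs, one per
    possible result of the observation.
    """
    eu_with_info = sum(
        p_obs * best_expected_utility(actions, posterior, utility)
        for p_obs, posterior in outcomes
    )
    return eu_with_info - best_expected_utility(actions, prior, utility)


# Illustrative numbers (assumptions): 4 blocks, oil worth 100, price 25 per block.
N_BLOCKS, OIL_VALUE = 4, 100
PRICE = Fraction(OIL_VALUE, N_BLOCKS)
blocks = list(range(N_BLOCKS))


def profit(bought_block, oil_block):
    """Net profit from buying one block, given where the oil actually is."""
    return (OIL_VALUE if bought_block == oil_block else 0) - PRICE


prior = {b: Fraction(1, N_BLOCKS) for b in blocks}

# A perfect survey of block 0 either finds the oil there or rules that block out.
surveyed = 0
others = [b for b in blocks if b != surveyed]
survey_outcomes = [
    (Fraction(1, N_BLOCKS), {surveyed: Fraction(1)}),
    (Fraction(N_BLOCKS - 1, N_BLOCKS), {b: Fraction(1, len(others)) for b in others}),
]

vpi = value_of_perfect_information(blocks, prior, profit, survey_outcomes)
print(vpi)  # 25
```

Under these assumed numbers the survey is worth exactly the price of one block. The result can never be negative, because the best action under each posterior is at least as good as sticking with the action chosen under the prior.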

These concepts explain how decision-making can be significantly improved by strategically acquiring and utilizing information, emphasizing the calculation and implications of the value of information in uncertain environments.
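
As a rough sketch of the myopic strategy, the snippet below picks a single observation by comparing each one's VPI with its cost. The function name, the net-gain selection rule, and the test names and numbers are all hypothetical; ranking by the ratio of VPI to cost is another reasonable choice.

```python
def myopic_choice(vpi, cost):
    """Pick the single observation with the largest net gain, if any is worth it.

    `vpi` and `cost` are dicts keyed by observation name. Returns the chosen
    observation, or None when no observation's VPI exceeds its cost (in which
    case the agent should simply act on its current beliefs).
    """
    worthwhile = {e: vpi[e] - cost[e] for e in vpi if vpi[e] > cost[e]}
    if not worthwhile:
        return None
    return max(worthwhile, key=worthwhile.get)


# Hypothetical values for three medical tests (illustrative only).
vpi = {"blood_test": 12.0, "x_ray": 20.0, "biopsy": 30.0}
cost = {"blood_test": 5.0, "x_ray": 18.0, "biopsy": 45.0}

print(myopic_choice(vpi, cost))  # blood_test
```

Because each VPI here is computed as if that observation were the only one available, the rule can miss cases where two observations are individually worthless but jointly decisive.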

## 16.7 "Unknown Preferences"

Now we discuss the complexities and methodologies for dealing with uncertainty in utility functions, whether it's an agent uncertain about its own preferences or a machine uncertain about human preferences. 

- **Uncertainty in Self-Preferences** : An agent (human or machine) may be unsure about its own preferences. For example, a person choosing between vanilla and durian ice cream may know that they moderately like vanilla but have no idea whether they will like durian, a famously polarizing flavor.
- **Modeling Uncertainty with Decision Networks** : The text shows how to model this uncertainty in a decision network by representing the unknown preference as a random variable. This gives a systematic way to decide even when the agent's preferences are not fully known.
- **Decision Making with Uncertain Preferences** : Once the unknown preference is a random variable, the expected utility of each action is computed by averaging over its possible values, so the agent can still choose the action with the highest expected utility.
- **Deference to Humans** : The second part of the subchapter addresses a machine that aims to assist a human but is uncertain about the human's preferences. In such cases the machine may do best by deferring to the human, letting the human make the final decision. This is framed as an "off-switch" game, in which the machine can act, switch itself off, or let the human decide whether to switch it off.
- **Robbie and Harriet Case Study** : The text gives a detailed example involving a robot assistant (Robbie) and its human owner (Harriet), in which Robbie uses deference to manage its uncertainty about Harriet's preferences regarding a hotel booking. The example shows why allowing human input is valuable when a machine is uncertain about what the human wants.
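
The ice cream decision can be worked through numerically by treating the unknown preference as a random variable and averaging over it. The sketch below does this with illustrative numbers (a sure net gain of 1 for vanilla, and a 50/50 chance of a net gain of +100 or -80 for durian); these figures and the function name are assumptions made for the example.

```python
def expected_net_gain(outcomes):
    """Expected net gain over (probability, net_gain) pairs.

    Each pair corresponds to one possible value of the unknown preference.
    """
    return sum(p * gain for p, gain in outcomes)


# Illustrative numbers (assumptions): vanilla is a known, modest gain;
# durian is either wonderful or awful with equal probability.
options = {
    "vanilla": [(1.0, 1.0)],
    "durian": [(0.5, 100.0), (0.5, -80.0)],
}

expected = {flavor: expected_net_gain(o) for flavor, o in options.items()}
best_flavor = max(expected, key=expected.get)

print(expected)     # {'vanilla': 1.0, 'durian': 10.0}
print(best_flavor)  # durian
```

With these numbers the uncertain option wins on expected utility even though it might turn out badly, which is exactly the sense in which preferences that are not clear-cut still support a rational choice.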

These concepts highlight the importance of handling preference uncertainty, both for individual decision-making and in designing machines or systems that interact with humans. The discussions aim to provide frameworks for making rational decisions in the face of incomplete information about preferences, emphasizing the role of deference and information gathering in navigating these uncertainties.
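
A small calculation shows why deference can be the best choice in the off-switch game. The sketch assumes that Robbie's belief about the utility of his proposed action to Harriet is uniform on an interval that includes both negative and positive values (here -40 to +60, an illustrative choice), that switching off is worth 0, and that Harriet knows her own utility and lets Robbie proceed only when it is non-negative. The function name is hypothetical.

```python
def off_switch_game(lo, hi):
    """Expected utilities of Robbie's three options when U ~ Uniform(lo, hi).

    Assumes lo < 0 < hi, that switching off is worth 0 to Harriet, and that
    Harriet (who knows U) lets Robbie proceed exactly when U >= 0.
    """
    if not lo < 0 < hi:
        raise ValueError("expected lo < 0 < hi")
    act_now = (lo + hi) / 2  # E[U]
    switch_off = 0.0
    # E[max(0, U)]: integrate u / (hi - lo) over [0, hi].
    defer = hi ** 2 / (2 * (hi - lo))
    return {"act_now": act_now, "switch_off": switch_off, "defer": defer}


values = off_switch_game(-40, 60)
print(values)                       # {'act_now': 10.0, 'switch_off': 0.0, 'defer': 18.0}
print(max(values, key=values.get))  # defer
```

Deferring is worth E[max(0, U)], which is never less than either E[U] or 0, so under these assumptions Robbie loses nothing by letting Harriet decide. The advantage disappears if Robbie is certain about U or if Harriet's choices are not reliable, neither of which is modeled here.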

## Bibliographical and Historical Notes 

- **Arnauld (1662)** : In "L’art de Penser" or "Port-Royal Logic," Arnauld discusses the necessity of considering both the probability of outcomes and their inherent good or evil to make judgments on actions. 
- **Daniel Bernoulli (1738)** : Investigated the St. Petersburg paradox, introducing the concept of utility to explain preferences for lotteries. 
- **Jeremy Bentham (1823)** : Proposed the hedonic calculus for weighing pleasures and pains, suggesting all decisions could be reduced to utility comparisons. 
- **Ramsey (1931)** : First carried out the derivation of numerical utilities from preferences, a foundational work for utility theory. 
- **Von Neumann and Morgenstern (1944)** : "Theory of Games and Economic Behavior" introduced axioms for preference that influenced the modern understanding of utility theory. 
- **Savage (1954)** and **Jeffrey (1983)** : Contributed to constructing subjective probabilities and utilities from an agent's preferences. 
- **Howard and Matheson (1984)** : Introduced influence diagrams or decision networks, an essential tool for decision theory, based on earlier work at SRI (Miller et al., 1976). 
- **Shachter (1986)** : Developed a method for making decisions based directly on a decision network without creating an intermediate decision tree. 
- **Von Winterfeldt and Edwards (1986)**, **Smith (1988)**, **Fenton and Neil (2018)** : Provided significant insights into decision analysis, utility modeling, and solving real-world problems using decision networks. 
- **Jerry Feldman (1974, 1977)** : Applied decision theory to problems in vision and planning, highlighting early adoption of decision-theoretic tools in AI. 
- **Horvitz et al. (1988)**, **Cowell et al. (2002)** : Contributed to the acceptance and development of decision-theoretic expert systems. 
- **Harsanyi (1967)** : Explored the problem of incomplete information in game theory, showing that games with incomplete information are equivalent to games with imperfect information. 
- **Cyert and de Groot (1979)** : Developed a theory of adaptive utility, where an agent can be uncertain about its own utility function. 
- **Chajewska et al. (2000)**, **Boutilier (2002)**, **Fern et al. (2014)** : Worked on Bayesian preference elicitation and models of assistance, proposing frameworks for understanding and assisting with human goals under uncertainty. 
- **Hadfield-Menell et al. (2017b)**, **Russell (2019)** : Proposed models and frameworks for beneficial AI, including the off-switch game as a key example.