# Course 1 Week 4  
## Lecture 1. Maximum Expected Utility
### 1.1 Simple Decision Making

A simple decision making situation $\mathcal{D}$ consists of:  
* $Val(A) = \{a^1,...,a^k\}$ is a set of possible actions
* $Val(X) = \{x^1,...,x^n\}$ is a set of states
* $P(X \mid A)$ is a distribution over the states given the action
* $U(X, A)$ is a utility function that encodes the **Agent's preferences**

### 1.2 Expected Utility  
$$EU[\mathcal{D}[a]] = \sum_x P(x \mid a) U(x,a)$$

For each possible state of the world, multiply the probability of that state given the action by the utility of the state-action pair, and then sum over all states.  

This sum is the **overall happiness** the Agent can expect from taking action $a$.  

We want to choose the action $a$ that maximizes the expected utility.  This can be defined as:  

$$a^* = argmax_a EU[\mathcal{D}[a]]$$

### 1.3 Influence Diagram
We can make a model that is no longer a pure probabilistic model to represent this decision process.  

<img src="./images/influence_diagram.png">  

The box represents the action variable where the Agent chooses the value of the variable. 
* These boxes are not random variables and therefore are not represented by a CPD.  

The diamond $U$ is the utility  

Computing the expected utility of the two choices in this graph:  

$$EU[\mathcal{D}[f]] = \sum_{m} P(m) U(f,m)$$  

Here the action $a$ is the decision $f$, and the Market $M$ has no parents, so $P(m \mid a) = P(m)$.  

$EU[F^0] = 0$  
$EU[F^1] = (0.5*-7) + (0.3*5) + (0.2*20) = 2$  

The optimal action would be to found a company.  
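
As a quick check of the arithmetic above, the following sketch recomputes both expected utilities with NumPy. The array names are illustrative and not part of the course material; the probabilities and utilities are the ones used in the calculation above.

```python
import numpy as np

# Market distribution P(M) and utilities U(F, M) from the calculation above.
p_market = np.array([0.5, 0.3, 0.2])
utility = np.array([
    [0.0, 0.0, 0.0],    # f0: do not found the company
    [-7.0, 5.0, 20.0],  # f1: found the company
])

# EU[D[f]] = sum_m P(m) * U(f, m), giving one value per action.
expected_utility = utility @ p_market

# The MEU action is the one with the largest expected utility.
best_action = int(np.argmax(expected_utility))

print(expected_utility)
print(best_action)
```

The second entry of `expected_utility` should be close to 2, and `best_action` should select founding the company.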

### 1.4 Utility function can be decomposed

<img src="./images/complex_influence_diagram.png">  

Decomposed Utility Function  
$V_G$ = happiness of getting a good grade  
$V_S$ = happiness of getting a good job  
$V_Q$ = quality of life during the course  

Assuming Grade, Job, Study and Difficulty are binary, how many values would you have to elicit for a **joint** utility function over all variables?

You would need to elicit $2^4 = 16$ values since there are 4 binary variables.

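Decomposing the utility into a sum of smaller terms, each over only a few variables, avoids the exponential growth in the number of values to elicit, much as CPDs avoid the exponential growth of a large JPD.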

### 1.5 Information Edges

<img src="./images/information_edge.png">

The edges that connect Parents to the **action variable** are considered **information edges**.  

With this, we can define a **decision rule**, $\delta$, at the action node $A$.  
* This decision rule is effectively a CPD, $P(A \mid Pa(A))$


### 1.6 Expected Utility with Information

$$EU[\mathcal{D}[\delta_A]] = \sum_{x,a} P_{\delta_{A}}(x, a) U(x,a)$$

$P_{\delta_{A}}(x, a)$ is a Joint Probability Distribution over $\overline{X} \cup \{A\}$

The Agent wants to choose the decision rule $\delta_A$ such that they obtain the **Maximum Expected Utility (MEU)**.  

$$\delta_A^* = argmax_{\delta_A} EU[\mathcal{D}[\delta_A]]$$  

$$MEU(\mathcal{D}) = max_{\delta_A}EU[\mathcal{D}[\delta_A]]$$  

### 1.7 Finding the MEU Decision Rules

<img src="./images/optimize_delta.png">

We can write the expected utility as a sum over a product of factors. The general expression  

$EU[\mathcal{D}[\delta_A]] = \sum_{x,a} P_{\delta_{A}}(x, a) U(x,a)$  

becomes the following expression in the Market-survey graph:

$\sum_{M,S,F} P(M)P(S \mid M) \delta_{F}(F \mid S) U(F,M)$  

Since $\delta_F$ does not depend on $M$, the sum over $M$ can be pushed inward:

$\sum_{S,F} \delta_{F}(F \mid S) \sum_{M} P(M)P(S \mid M)  U(F,M)$  

After marginalizing out $M$ we obtain a new factor $\mu (F,S)$:  

$\sum_{S,F} \delta_{F}(F \mid S) \mu(F,S)$  

The symbol $\mu$ is used to suggest that this factor has a utility component derived from $U$; it is not a probability distribution.  

#### Identify the decision rules with maximum utility

<img src="./images/optimal_decision_rule.png">  

To identify the optimal decision rule $\delta_A$, we look at the factor $\mu(F, S)$ and, for each value of the observed survey $S$, pick the action $F$ with the largest value.  Here we find that the best decisions are $(f^0 \mid s^0)$, $(f^1 \mid s^1)$, and $(f^1 \mid s^2)$.  
This information is converted to a decision rule that meets the following criteria:  

$\delta^{*}_{A}(a \mid z) = \begin{cases} 1 \quad a = argmax_A \mu(A,z) \\ 0 \quad otherwise \end{cases}$  

Specifically this means the following:  

* $\delta_{F}(f=0 \mid S=0) = 1$
* $\delta_{F}(f=0 \mid S=1) = 0$
* $\delta_{F}(f=0 \mid S=2) = 0$
* $\delta_{F}(f=1 \mid S=0) = 0$
* $\delta_{F}(f=1 \mid S=1) = 1$
* $\delta_{F}(f=1 \mid S=2) = 1$

In this case the maximum expected utility is:  

$$MEU(\mathcal{D}) = max_{\delta_A}EU[\mathcal{D}[\delta_A]] = \sum_{S,F} \delta^{*}_{F}(F \mid S) \mu(F,S)$$

Each term of the sum is the product $\delta^{*}_{F}(f \mid s) \mu(f,s)$ for one combination of $f$ and $s$ (these terms are utilities, not probabilities):  

$(f=0,s=0): 1.0 * 0.0 = 0$  
$(f=0,s=1): 0.0 * 0.0 = 0$  
$(f=0,s=2): 0.0 * 0.0 = 0$  
$(f=1,s=0): 0.0 * -1.25 = 0$  
$(f=1,s=1): 1.0 * 1.15 = 1.15$  
$(f=1,s=2): 1.0 * 2.10 = 2.10$  

$$MEU(\mathcal{D}) = 0 + 0 + 0 + 0 + 1.15 + 2.10 = 3.25$$  

When the Agent has access to the survey, the MEU is 3.25, which is higher than the MEU of 2 obtained when the Agent doesn't have the survey.  

In [1]:
import numpy as np
from pgmpy.factors.discrete import TabularCPD
from pgmpy.factors.discrete import DiscreteFactor

cpd_M = TabularCPD('M',
                    3,
                    [[0.5, 0.3, 0.2]])

cpd_S = TabularCPD('S',
                    3,
                    [[ 0.6,   0.3,  0.1],
                     [ 0.3,   0.4,  0.4],
                     [ 0.1,   0.3,  0.5]],
                    evidence = ['M'],
                    evidence_card=[3])

cpd_U = DiscreteFactor(['F','M'],
                       [2,3],
                       [[ 0.0, 0, 0],
                        [ -7,  5, 20]])

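# Product of all factors: P(S | M) * P(M) * U(F, M)
P = cpd_S * cpd_M * cpd_U 
print(P)

# Each u(S,F) sums one row of the printed table over M for a fixed F
print('u(S=0,F=0) = %.2f' % (0.0))
print('u(S=1,F=0) = %.2f' % (0.0))
print('u(S=2,F=0) = %.2f \n' % (0.0))
print('u(S=0,F=1) = %.2f' % (-2.1+0.45+0.4))
print('u(S=1,F=1) = %.2f' % (-1.05+0.6+1.6))
print('u(S=2,F=1) = %.2f' % (-0.35+0.45+2))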
+-----+-----+-------+-----+------+-----+-----+
| M   | M_0 | M_0   | M_1 | M_1  | M_2 | M_2 |
+-----+-----+-------+-----+------+-----+-----+
| F   | F_0 | F_1   | F_0 | F_1  | F_0 | F_1 |
+-----+-----+-------+-----+------+-----+-----+
| S_0 | 0.0 | -2.1  | 0.0 | 0.45 | 0.0 | 0.4 |
+-----+-----+-------+-----+------+-----+-----+
| S_1 | 0.0 | -1.05 | 0.0 | 0.6  | 0.0 | 1.6 |
+-----+-----+-------+-----+------+-----+-----+
| S_2 | 0.0 | -0.35 | 0.0 | 0.45 | 0.0 | 2.0 |
+-----+-----+-------+-----+------+-----+-----+
u(S=0,F=0) = 0.00
u(S=1,F=0) = 0.00
u(S=2,F=0) = 0.00 

u(S=0,F=1) = -1.25
u(S=1,F=1) = 1.15
u(S=2,F=1) = 2.10
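
The hand-summed rows above can also be obtained by eliminating $M$ numerically. The following is a minimal sketch using plain NumPy rather than pgmpy; the array names and the `einsum` index letters are illustrative choices, while the numbers are copied from the cell above.

```python
import numpy as np

p_m = np.array([0.5, 0.3, 0.2])             # P(M)
p_s_given_m = np.array([[0.6, 0.3, 0.1],    # P(S | M): rows are S, columns are M
                        [0.3, 0.4, 0.4],
                        [0.1, 0.3, 0.5]])
u_fm = np.array([[0.0, 0.0, 0.0],           # U(F, M): rows are F, columns are M
                 [-7.0, 5.0, 20.0]])

# mu(F, S) = sum_M P(M) * P(S | M) * U(F, M)
mu = np.einsum('m,sm,fm->fs', p_m, p_s_given_m, u_fm)

print(np.round(mu, 2))
```

The row for $F=1$ should agree with the values $-1.25$, $1.15$, and $2.10$ printed above.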


### 1.8 General Form of MEU and its Maximization
<img src="./images/general_maximize_delta.PNG">  

### 1.9 MEU Algorithm Summary

To compute MEU and optimize decision at $A$
* Treat $A$ as a random variable with arbitrary CPD
* Introduce Utility factor with scope $Pa_U$
* Eliminate all variables except $A$, $Z$ ($A$'s parents) to produce factor $\mu (A, Z)$ 
* For each assignment $z$ to $Z$, select the action with the largest $\mu(a, z)$:

$\delta^{*}_{A}(a \mid z) = \begin{cases} 1 \quad a = argmax_A \mu(A,z) \\ 0 \quad otherwise \end{cases}$  
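
The last step of the algorithm can be sketched as a small helper. The function name `optimal_decision_rule` and the array layout (actions along rows, parent assignments along columns) are assumptions made for illustration; the $\mu$ values are the ones from the Market-survey example.

```python
import numpy as np

def optimal_decision_rule(mu):
    """Return (delta, meu) for a factor mu of shape (num_actions, num_parent_assignments)."""
    best = np.argmax(mu, axis=0)                 # best action for each parent assignment z
    delta = np.zeros_like(mu, dtype=float)
    delta[best, np.arange(mu.shape[1])] = 1.0    # deterministic rule: 1 at the argmax, 0 elsewhere
    meu = float(np.sum(delta * mu))              # sum over a and z of delta(a | z) * mu(a, z)
    return delta, meu

# mu(F, S) from the Market-survey example: rows are F, columns are S.
mu = np.array([[0.0, 0.0, 0.0],
               [-1.25, 1.15, 2.10]])

delta, meu = optimal_decision_rule(mu)
print(delta)
print(meu)
```

Ties are broken in favor of the first action, which is one arbitrary choice among equally good rules.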

### 1.10 Decision Making Under Uncertainty  
* MEU principle provides rigorous foundation
* PGMs provide structured representation for probabilities, actions, and utilities
* PGM inference methods (VE) can be used for 
    - finding the optimal strategy
    - determining the overall value of the decision situation
* Efficient methods also exist for 
    - multiple utility components
    - multiple decisions


## Lecture 2. Utility Functions
### 2.1 Utilities and Preferences

Utility functions are necessary for our ability to compare complex scenarios that involve uncertainty or risk.  

The way to formalize the decision making process of an Agent in these situations is to ascribe a numerical utility to the different outcomes.  

For example:  

Scenario A: \$4 Million with P(0.2)   -or-  \$0 with P(0.8)  
Scenario B: \$3 Million with P(0.25)  -or-  \$0 with P(0.75)  

Treating utility as equal to the dollar amount, $U(\$x) = x$:  

Expected Utility Scenario A = U(\$4M) x 0.20 + U(\$0) x 0.80 = \$800,000  
Expected Utility Scenario B = U(\$3M) x 0.25 + U(\$0) x 0.75 = \$750,000  

We can then use the principle of MEU to determine the preferred scenario between these two scenarios.   

#### Utility is not linear in money

It might seem that utility is a linear function of the dollar amount, so that preferences simply follow the expected payoff.  However, this is not the case.  

Take the following scenarios:  

Scenario A: \$4 Million with P(0.8)  -or-  \$0 with P(0.2)  
Scenario B: \$3 Million with P(1.0)  -or-  \$0 with P(0.0)  

Even though scenario A has a higher expected payoff (\$3.2 M), most people would choose scenario B.  
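
One way to see how such a preference can still be consistent with MEU is to evaluate both scenarios under a concave utility. The sketch below assumes $U(x) = \sqrt{x}$ purely for illustration; the notes do not specify a utility function, and the helper name `expected_value` is hypothetical.

```python
import math

def expected_value(lottery, utility=lambda x: x):
    """Expected utility of a lottery given as (outcome, probability) pairs."""
    return sum(p * utility(x) for x, p in lottery)

scenario_a = [(4_000_000, 0.8), (0, 0.2)]
scenario_b = [(3_000_000, 1.0), (0, 0.0)]

# Expected payoff in dollars (utility linear in money).
payoff_a = expected_value(scenario_a)
payoff_b = expected_value(scenario_b)

# Expected utility under the assumed concave utility sqrt(x).
eu_a = expected_value(scenario_a, math.sqrt)
eu_b = expected_value(scenario_b, math.sqrt)

print(payoff_a > payoff_b)  # A has the larger expected payoff
print(eu_b > eu_a)          # B has the larger expected utility under sqrt
```

Under this assumed curve the ranking flips: scenario A wins on expected payoff, while scenario B wins on expected utility.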

### 2.2 St. Petersburg Paradox  
* A fair coin is tossed repeatedly until it comes up heads, say on the $n^{th}$ toss.  
* The player is paid $\$2^n$  

The expected payout can be computed as:  

$$P(H)*2 + P(TH) * 4 + P(TTH) * 8 + ... = \frac{1}{2}*2 + \frac{1}{4}*4 + \frac{1}{8} * 8 + ... = \infty$$  

In principle, a person maximizing expected payout should be willing to pay any amount to play this game, because no matter what they pay, the expected payout is greater than that sum.  

However, in reality most people are only willing to pay approx. \$2 to play this game.  
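
The divergence can be made concrete by truncating the game after a maximum number of tosses. The sketch below also evaluates the same truncated game under a logarithmic utility, which is an assumption chosen for illustration and not something specified in these notes; both function names are hypothetical.

```python
import math

def truncated_expected_payout(max_tosses):
    """Expected payout when only the first max_tosses outcomes are counted."""
    # Heads first appears on toss n with probability 2**-n and pays 2**n,
    # so every term contributes exactly 1 to the sum.
    return sum((0.5 ** n) * (2 ** n) for n in range(1, max_tosses + 1))

def truncated_expected_log_utility(max_tosses):
    """Same truncated game, valued with the assumed utility U(x) = log2(x)."""
    return sum((0.5 ** n) * math.log2(2 ** n) for n in range(1, max_tosses + 1))

for n in (10, 20, 40):
    print(n, truncated_expected_payout(n), truncated_expected_log_utility(n))
```

The expected payout grows without bound as more tosses are allowed, while the expected log-utility levels off, which illustrates how a concave utility can assign the game a finite value.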

### 2.3 The Utility Curve
 The x-axis represents the \$ dollar amount.   
 The y-axis is the utility that an Agent ascribes to an event where that dollar value is earned.  
 
 <img src="./images/utility_curve.PNG">  
 
 In this plot, the solid line represents the utility of getting the dollar value **with certainty**. 
 
 When we look at a series of lotteries where $D = \begin{cases} \$0 \qquad with \; prob. \; 1-p \\ \$1000 \quad with \; prob. \; p \end{cases}$  
 
By linearity of expectation, the expected utility of these lotteries lies on the dotted line, at a position determined by the value of $p$.  

In this case, \$400 is called the **certainty equivalent** of the lottery in which $p = 0.5$. The certainty equivalent is the amount of money, received with 100 % certainty, that the Agent values exactly as much as the lottery.

The difference $\$500 - \$400 = \$100$ between the expected payoff of the lottery (\$500) and its certainty equivalent is the **insurance premium** (also called the risk premium).
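
The same two quantities can be computed for any invertible utility curve. The sketch below assumes $U(x) = \sqrt{x}$, which is not the curve in the figure, so the resulting numbers differ from the \$400 and \$100 read off the plot; the function name and parameters are illustrative.

```python
import math

def certainty_equivalent(p, prize=1000.0, utility=math.sqrt, inverse=lambda u: u ** 2):
    """Sure amount whose utility equals the expected utility of the lottery."""
    expected_utility = p * utility(prize) + (1 - p) * utility(0.0)
    return inverse(expected_utility)

p = 0.5
expected_payoff = p * 1000.0           # expected dollar value of the lottery
ce = certainty_equivalent(p)           # sure amount the Agent values equally
premium = expected_payoff - ce         # insurance premium

print(expected_payoff, ce, premium)
```

With the assumed square-root utility, the certainty equivalent comes out well below the expected payoff, as expected for a concave curve.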

### 2.4 Risk Profiles
**Risk Averse** - A concave utility curve represents risk aversion.  

**Risk Neutral** - A Linear utility curve.

**Risk Seeking** - A Convex utility curve represents risk seeking behavior.  
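
These three profiles can be told apart numerically by comparing the utility of a lottery's expected payoff with the lottery's expected utility. The following sketch does this for a single lottery; the function name `risk_attitude` and the three example utility curves are illustrative assumptions, and a single lottery only probes the curve locally.

```python
import math

def risk_attitude(utility, outcomes, probs, tol=1e-9):
    """Classify a utility curve on one lottery by comparing U(E[x]) with E[U(x)]."""
    mean_outcome = sum(p * x for x, p in zip(outcomes, probs))
    expected_utility = sum(p * utility(x) for x, p in zip(outcomes, probs))
    gap = utility(mean_outcome) - expected_utility
    if gap > tol:
        return "risk averse"    # concave: the sure mean is preferred to the lottery
    if gap < -tol:
        return "risk seeking"   # convex: the lottery is preferred to the sure mean
    return "risk neutral"       # linear: indifferent

outcomes, probs = [0.0, 1000.0], [0.5, 0.5]
print(risk_attitude(math.sqrt, outcomes, probs))
print(risk_attitude(lambda x: x, outcomes, probs))
print(risk_attitude(lambda x: x ** 2, outcomes, probs))
```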

### 2.5 Typical Utility Curve  
On the Gains/Earnings side of the Utility Curve, the preference is for risk aversion: getting money with certainty is preferred, and the utility function is concave.

On the Loss side of the curve, the preference is risk seeking, or taking a small probability of a large loss as opposed to taking a smaller loss with certainty; this curve is convex. 

Near $ \$ 0 $, the behavior is often risk neutral.

### 2.6 Multi-Attribute Utility

* All attributes affecting the person's preferences must be integrated into a single utility function (difficult to do because we may need to put human life on the same scale as monetary gain).  
    * Money
    * Time
    * Pleasure
* How to bring human life into a utility function:
    * a wrong strategy is to bring the notion of a death directly into the function
    * rather we use the **Micromort** = 1/1,000,000 chance of mortality (in the 1980's a micromort was worth \$20)
    * QALY - Quality Adjusted Life Year  
    
**Example: Prenatal Testing**  
Down's syndrome (D) - likelihood of having Down's  
Testing (T) - Pain of testing for prenatal disorders  
Knowledge (K) - Comfort of knowing test results, (what's going to happen)  
Loss of fetus (L) - Testing may cause the loss of the fetus  
Future pregnancy (F) - Whether there will be a future pregnancy or not (would you care if the baby has Down's if this is your last pregnancy)  

Because there is structure in many people's utility function, these variables can be decomposed into the following Multi-attribute Utility Function:  

$U(T) + U(K) + U(D,L) + U(L,F)$  

If the variables T, K, D, L, and F are all binary valued variables, and the utility function is decomposed into $U(T) + U(K) + U(D,L) + U(L,F)$, how many different utility values have to be elicited to characterize the utility function?   

Answer = 12, because we have $2 + 2 + 2*2 + 2*2 = 12$ (as opposed to 32 if we used a joint utility function over all variables).  
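
The count above generalizes to any additive decomposition: sum the table sizes of the individual terms and compare with the size of one joint table. The sketch below does this for the prenatal testing example; the variable names are illustrative.

```python
from math import prod

# Cardinality of each variable (all binary in this example).
cardinality = {"T": 2, "K": 2, "D": 2, "L": 2, "F": 2}

# Scopes of the additive terms U(T) + U(K) + U(D,L) + U(L,F).
scopes = [("T",), ("K",), ("D", "L"), ("L", "F")]

# Each term needs one value per joint assignment of its scope.
decomposed = sum(prod(cardinality[v] for v in scope) for scope in scopes)

# A single joint utility needs one value per assignment of all variables.
joint = prod(cardinality.values())

print(decomposed, joint)
```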

### 2.7 Summary  
* Our utility function determines our preferences about decisions that involve uncertainty
* Utility generally depends on multiple factors
    * Money, time, chances of death, ...
* Relationship is usually non-linear  
    * Shape of utility curve determines attitude to risk 
* Multi-attribute utilities can help decompose high-dimensional functions into tractable pieces



## Lecture 3. Value of Perfect Information
### 3.1 Value of Information  
* $VPI(A \mid X)$ Value of Perfect Information - is the value of observing $X$ before choosing an action $A$.  
* $\mathcal{D} =$ original influence diagram
* $\mathcal{D}_{X \rightarrow A} =$ influence diagram after introducing edge $X \rightarrow A$

We can now define the Value of Perfect Information as:  

$$VPI(A \mid X) := MEU(\mathcal{D}_{X \rightarrow A}) - MEU(\mathcal{D}) $$

<img src="./images/VPI_1.PNG"> 
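
As a worked instance of this definition, the sketch below computes the value of observing the survey $S$ before deciding $F$ in the Market-survey example from Lecture 1. It uses plain NumPy with illustrative array names; the numbers are the ones from that example.

```python
import numpy as np

p_m = np.array([0.5, 0.3, 0.2])             # P(M)
p_s_given_m = np.array([[0.6, 0.3, 0.1],    # P(S | M): rows are S, columns are M
                        [0.3, 0.4, 0.4],
                        [0.1, 0.3, 0.5]])
u_fm = np.array([[0.0, 0.0, 0.0],           # U(F, M): rows are F, columns are M
                 [-7.0, 5.0, 20.0]])

# MEU(D): one action must be chosen without seeing the survey.
meu_without = np.max(u_fm @ p_m)

# MEU(D_{S -> F}): the best action is chosen separately for each survey value.
mu = np.einsum('m,sm,fm->fs', p_m, p_s_given_m, u_fm)
meu_with = np.sum(np.max(mu, axis=0))

vpi = meu_with - meu_without
print(meu_without, meu_with, vpi)
```

The result should be close to $3.25 - 2 = 1.25$, which is the most the Agent should be willing to pay for the survey in utility units.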

### 3.2 Theorem

Properties of VPI:  

* $VPI(A \mid X) \geq 0$  
* $VPI(A \mid X) = 0$ if and only if the optimal decision rule for $\mathcal{D}$ is still optimal for $\mathcal{D}_{X \rightarrow A}$  

The first property holds because $MEU(\mathcal{D})$ for the old influence diagram optimizes over decision rules of the form $\delta(A \mid \bar{Z})$, while $MEU(\mathcal{D}_{X \rightarrow A})$ for the new influence diagram optimizes over decision rules of the form $\delta(A \mid \bar{Z}, X)$.  

$\delta(A \mid \bar{Z}, X)$ is a strictly larger class of CPDs compared to $\delta(A \mid \bar{Z})$.  Therefore, any decision rule implemented in the previous diagram, $\delta(A \mid \bar{Z})$, can also be implemented in the current diagram by ignoring $X$.  That decision rule would have the same expected utility in the new diagram as it had in the previous one.  

Therefore, one cannot possibly lose when exploring a space with a larger decision CPD.  

For the second property, if the optimal decision in the original diagram, $\delta(A \mid \bar{Z})$, is still the optimal decision in the new diagram, $\delta(A \mid \bar{Z}, X)$, we have gained nothing from the additional observation $X$.  

This gives us a clear definition of when information is useful.  

For example:  

Say we discover a magical lantern and upon rubbing it, a genie tells us that $\delta(A \vert \overline{Z},X) \neq \delta(A \vert \overline{Z})$. What does this allow us to conclude about the value of knowing X?

The value of knowing $X$ is greater than 0, because $\delta(A \vert \overline{Z})$ is not optimal for $\mathcal{D}_{X \rightarrow A}$.  

Simply put... **Information is useful precisely when it changes my decision in at least one case.**  
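
Both properties can be checked numerically in the simplest setting, where the utility depends only on the action $A$ and a single variable $X$ that is either observed or not. The sketch below draws random instances; the function name, the random generator seed, and the problem sizes are arbitrary illustrative choices.

```python
import numpy as np

def vpi_of_observing_x(p_x, u_ax):
    """VPI(A | X) when utility U(A, X) depends only on the action and X."""
    meu_without = np.max(u_ax @ p_x)                # one action for all x
    meu_with = np.sum(p_x * np.max(u_ax, axis=0))   # best action for each x
    return meu_with - meu_without

rng = np.random.default_rng(0)
gaps = []
for _ in range(1000):
    p_x = rng.dirichlet(np.ones(4))       # random distribution over 4 states
    u_ax = rng.normal(size=(3, 4))        # random utilities for 3 actions
    gaps.append(vpi_of_observing_x(p_x, u_ax))
min_gap = min(gaps)
print(min_gap)   # never meaningfully below zero

# When one action is best in every state, observing X changes nothing.
dominant = np.array([[2.0, 3.0],
                     [1.0, 0.0]])
zero_vpi = vpi_of_observing_x(np.array([0.5, 0.5]), dominant)
print(zero_vpi)
```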

### 3.3 Value of Information Example
A recent graduate is trying to decide whether to join company 1 or company 2.  The diagram is as follows:  

<img src="./images/VPI_2.PNG">   

Company 1 is more likely to be in a better state than company 2.

We are assuming that the Agent's utility is 1 if the company gets funded and 0 if it does not.  

$V = \begin{cases} 1 \qquad if \; company \; funded \\ 0 \qquad if \; company \; not \; funded \end{cases}$  

Based on this diagram (without any information):  

$EU(\mathcal{D}[c_1]) = 0.1*0.1 + 0.2*0.4 + 0.7*0.9 = 0.72$  
$EU(\mathcal{D}[c_2]) = 0.4*0.1 + 0.5*0.4 + 0.1*0.9 = 0.33$  





In [2]:
cpd_S1 = TabularCPD('S1',
                    3,
                    [[0.1, 0.2, 0.7]])

cpd_F1 = TabularCPD('F1',
                    2,
                    [[ 0.9, 0.6, 0.1],
                     [ 0.1, 0.4, 0.9]],
                    evidence = ['S1'],
                    evidence_card=[3])


cpd_V = DiscreteFactor(['C','F1'],
                       [2,2],
                       [[0,1],
                        [0,0]])

P = cpd_S1 * cpd_F1 * cpd_V
print(P)
0.01 + 0.08 + 0.63

+------+------+------+------+------+
| F1   | F1_0 | F1_0 | F1_1 | F1_1 |
+------+------+------+------+------+
| C    | C_0  | C_1  | C_0  | C_1  |
+------+------+------+------+------+
| S1_0 | 0.0  | 0.0  | 0.01 | 0.0  |
+------+------+------+------+------+
| S1_1 | 0.0  | 0.0  | 0.08 | 0.0  |
+------+------+------+------+------+
| S1_2 | 0.0  | 0.0  | 0.63 | 0.0  |
+------+------+------+------+------+


0.72

In [3]:
cpd_S2 = TabularCPD('S2',
                    3,
                    [[0.4, 0.5, 0.1]])

cpd_F2 = TabularCPD('F2',
                    2,
                    [[ 0.9, 0.6, 0.1],
                     [ 0.1,  0.4, 0.9]],
                    evidence = ['S2'],
                    evidence_card=[3])

cpd_V = DiscreteFactor(['C','F2'],
                       [2,2],
                       [[0,0],
                        [0,1]])

P = cpd_S2 * cpd_F2 * cpd_V
print(P)
0.04 + 0.2 + 0.09

+------+------+------+------+------+
| F2   | F2_0 | F2_0 | F2_1 | F2_1 |
+------+------+------+------+------+
| C    | C_0  | C_1  | C_0  | C_1  |
+------+------+------+------+------+
| S2_0 | 0.0  | 0.0  | 0.0  | 0.04 |
+------+------+------+------+------+
| S2_1 | 0.0  | 0.0  | 0.0  | 0.2  |
+------+------+------+------+------+
| S2_2 | 0.0  | 0.0  | 0.0  | 0.09 |
+------+------+------+------+------+


0.33

What happens if we allow the Agent to know information about company 2?  

Specifically, we are adding a new edge to the diagram : $\mathcal{D}_{State_2 \rightarrow Company}$  

The new decision rule can be summarized as:   

$\delta^* (C \mid S_2) = \begin{cases} c^2 \qquad if \; S_2=s^3 \\ c^1 \qquad otherwise \end{cases}$  

Specifically this means that if $S_2 = s^1$ or $S_2 = s^2$, the expected utility of choosing $C_2$ is $0.1$ or $0.4$, respectively.  Neither of these is better than the expected utility of choosing $C_1$, which is $0.72$.  Only when $S_2 = s^3$ do we have a choice with better expected utility, $0.9$.  

You just saw that the optimal decision only changes our mind if the second company is doing very well $(s^3)$. Recall that this only happens with prior probability 0.1. Do you think VPI will be high or low in this case?  

Low -- The basic idea is that the gain in utility in this case has to be weighted by the probability of the state of the world that brings about that gain (i.e. the second company is doing very well). Because the probability of $s^3$ is not very large we should not expect a very large gain in expected utility so the value of the information will likely be small.  

In fact, $MEU(\mathcal{D}_{S2 \rightarrow C}) = 0.738$, so $VPI(C \mid S_2) = 0.738 - 0.72 = 0.018$  

Therefore, the Agent shouldn't be willing to pay very much money for this additional information.  
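
The value 0.738 can be reproduced in a few lines. This is a minimal NumPy sketch with illustrative names; it relies on the state of company 1 being independent of the state of company 2, as in the diagram, so that observing $S_2$ leaves the expected utility of choosing company 1 unchanged.

```python
import numpy as np

p_s1 = np.array([0.1, 0.2, 0.7])       # P(S1)
p_s2 = np.array([0.4, 0.5, 0.1])       # P(S2)
p_funded = np.array([0.1, 0.4, 0.9])   # P(funded | state), same table for both companies

# Without information: expected utility of each company, then take the best.
eu_c1 = p_s1 @ p_funded
eu_c2 = p_s2 @ p_funded
meu_without = max(eu_c1, eu_c2)

# With S2 observed: company 1 is still worth eu_c1, while company 2 is
# worth P(funded | s2), so the best of the two is picked for every s2.
meu_with = np.sum(p_s2 * np.maximum(eu_c1, p_funded))

vpi = meu_with - meu_without
print(meu_without, meu_with, vpi)
```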

In another scenario, the probability of company 1 being in a good state is reduced as follows:  

<img src="./images/VPI_3.PNG">   

The expected utility of a decision rule $\delta(C \mid S2)$ in the graph $\mathcal{D}_{S2 \rightarrow C}$ is a sum over the product of all factors:

$EU[\mathcal{D}[\delta_C]] = \sum_{F1,F2,S1,S2,C} \delta(C \mid S2) * P(S1) * P(S2) * P(F1 \mid S1) * P(F2 \mid S2) * V(F1, C, F2)$  

Since $\delta$ depends only on $C$ and $S2$, the remaining variables can be summed out first:

$EU[\mathcal{D}[\delta_C]] = \sum_{S2,C} \delta(C \mid S2) * \sum_{F1,F2,S1} P(S1) * P(S2) * P(F1 \mid S1) * P(F2 \mid S2) * V(F1, C, F2)$  

Eliminating $F1$, $F2$, and $S1$ leaves a factor $\mu(S2,C)$:

$EU[\mathcal{D}[\delta_C]] = \sum_{S2,C} \delta(C \mid S2) * \mu(S2,C)$  

The cell below evaluates this sum with the numbers from the first scenario, $P(S1) = (0.1, 0.2, 0.7)$, and reproduces the value 0.738.  


In [4]:
cpd_S1 = TabularCPD('S1',
                    3,
                    [[0.1, 0.2, 0.7]])

cpd_F1 = TabularCPD('F1',
                    2,
                    [[ 0.9, 0.6, 0.1],
                     [ 0.1, 0.4, 0.9]],
                    evidence = ['S1'],
                    evidence_card=[3])

cpd_S2 = TabularCPD('S2',
                    3,
                    [[0.4, 0.5, 0.1]])

cpd_F2 = TabularCPD('F2',
                    2,
                    [[ 0.9, 0.6, 0.1],
                     [ 0.1, 0.4, 0.9]],
                    evidence = ['S2'],
                    evidence_card=[3])

cpd_C = TabularCPD('C',
                    2,
                    [[1, 1, 0],
                     [0, 0, 1]],
                    evidence = ['S2'],
                    evidence_card = [3])

factor_V = DiscreteFactor(['C','F1', 'F2'],
                       [2,2,2],
                       [[0, 0, 1, 1],
                        [0, 1, 0, 1]])

P = cpd_C * cpd_S2  * factor_V * cpd_F2 * cpd_F1 * cpd_S1  
# P.marginalize(['F1','F2','S1'])
# print(P) # this is way too big to fit, looks better in a raw text editor such as `notepad`

print('We set the value of Delta at the random variable, cpd_C, to be 1 when the decision is part of the\nMaximum Expected Utility (MEU) and 0 when the decision is not part of the MEU.\n\n\
This CPD, cpd_C, represents the Agent\'s decision as to which company they will choose based \n\
on the MEU of choosing a company while knowing the state of company 2.')

print('\nLook carefully at cpd_C and factor_V.  cpd_C is a CPD that defines P(C|Pa(C)), and\n\
factor_V is a factor Phi(C,F1,F2).\n\n\
factor_V does not define a new random variable!')

print('The following matrix is the factor product before marginalizing out F1, F2, S1, S2, and C.')

P.get_values() # this prints well here, but is uninterpretable.

print('The sum of this large factor product is the MEU of our decision rule Delta.')
print('MEU(D[delta]) = %.4f' % (np.sum(P.get_values())))

We set the value of Delta at the random variable, cpd_C, to be 1 when the decision is part of the
Maximum Expected Utility (MEU) and 0 when the decision is not part of the MEU.

This CPD, cpd_C, represents the Agent's decision as to which company they will choose based 
on the MEU of choosing a company while knowing the state of company 2.

Look carefully at cpd_C and factor_V.  cpd_C is a CPD that defines P(C|Pa(C)), and
factor_V is a factor Phi(C,F1,F2).

factor_V does not define a new random variable!
The following matrix is the factor product before marginalizing out F1, F2, S1, S2, and C.
The sum of this large factor product is the MEU of our decision rule Delta.
MEU(D[delta]) = 0.7380


#### Example 2:
In another scenario we see what happens when the state of the first company is not as certain.  

<img src="./images/VPI_4.PNG">  

The value of the information is higher in this scenario.  


#### Example 3:
In the last scenario we see what happens when the probability of funding is much higher across all states of a company.  

<img src="./images/VPI_5.PNG">  

The value of the information is higher in this scenario.   

### 3.4 Summary  

Influence diagrams provide clear and coherent semantics for the value of making an observation.
* The value is the difference between the MEUs of two influence diagrams (with and without the information edge).  

Information is valuable if and only if it induces a change in action in at least one context.  