# $g$-vulnerability

Remember Evil-Eye Henry and his buried treasure from [Secrets And Vulnerability](https://github.com/damik3/qif-notebooks/blob/master/secrets_and_vulnerability.ipynb)? There we discussed ways to measure the vulnerability of a secret. Here we discuss $g$-vulnerability in the same context.

Let's define $\pi$ the same as before.

In [1]:
import numpy as np
try:
    from qif import *
except ImportError: # install qif if not available (for running in colab, etc)
    import IPython; IPython.get_ipython().run_line_magic('pip', 'install qif')
    from qif import *

In [2]:
pi = [1/4, 1/4, 1/8, 1/8, 1/8, 1/8]
print(pi)

[0.25, 0.25, 0.125, 0.125, 0.125, 0.125]


### Defining $g$

In a more practical, real-life scenario, we would have to consider how much we gain from finding the treasure and how much it costs us to search each location. When we guess right, we are rewarded; in our example the reward is the monetary value of the treasure. When we guess wrong, we lose the money we spent traveling back and forth and digging up the place. That loss can be expressed as a negative number.

For our example let's say that the treasure is worth $\$1500$, but searching each location has a different cost. The matrix below represents that idea.

$$
\begin{array}{|c|c|c|c|c|c|c|}
\hline
g & \text{True } X = 1 & \text{True } X = 2 & \text{True } X = 3 & \text{True } X = 4 & \text{True } X = 5 & \text{True } X = 6 \\ \hline
\text{Guess } X = 1 & \mathbf{\$1100} & -\$400 & -\$400 & -\$400 & -\$400 & -\$400  \\ \hline
\text{Guess } X = 2 & -\$800 & \mathbf{\$700} & -\$800 & -\$800 & -\$800 & -\$800 \\ \hline
\text{Guess } X = 3 & -\$100 & -\$100 & \mathbf{\$1400} & -\$100 & -\$100 & -\$100 \\ \hline
\text{Guess } X = 4 & -\$200 & -\$200 & -\$200 & \mathbf{\$1300} & -\$200 & -\$200 \\ \hline
\text{Guess } X = 5 & -\$300 & -\$300 & -\$300 & -\$300 & \mathbf{\$1200} & -\$300 \\ \hline
\text{Guess } X = 6 & -\$400 & -\$400 & -\$400 & -\$400 & -\$400 & \mathbf{\$1100} \\ \hline
\end{array}
$$

The first row of $g$ corresponds to choosing to dig up location 1.

 - $g(1, 1)$ means we choose to search location 1 and the treasure is indeed there. So we get our prize of $\$1500$ minus the digging expenses for location 1, which are $\$400$. In total we gain $\$1100$.
 - $g(1, 2)$ means we choose 1 and the treasure is in 2. We find nothing, but we still spent $\$400$ on digging, so the gain is $-\$400$.
 - $g(1, 3)$ means we choose 1 and the treasure is in 3. Again the gain is $-\$400$, because we still chose to dig in location 1.
 - ...
 
The same logic applies to the rest of $g$.

What we have just done is define a gain function $g$. By definition, $g(w, x)$ specifies the gain that the adversary achieves by taking action $w$ when the value of the secret is $x$. In this example the role of the adversary is basically us trying to guess the secret location $X$.

In [3]:
g = np.array([
    [1100, -400, -400, -400, -400, -400],
    [-800, 700, -800, -800, -800, -800],
    [-100, -100, 1400, -100, -100, -100],
    [-200, -200, -200, 1300, -200, -200],
    [-300, -300, -300, -300, 1200, -300],
    [-400, -400, -400, -400, -400, 1100],
])
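
Every entry of $g$ follows one rule: the treasure value when the guess is right, minus the digging cost of the chosen location. As a minimal sketch, the same matrix can be rebuilt from those two ingredients. The names `treasure_value`, `dig_cost` and `g_built` are illustrative and not part of the original notebook; the costs are read off the table above.

```python
import numpy as np

treasure_value = 1500                                  # value of the treasure
dig_cost = np.array([400, 800, 100, 200, 300, 400])    # cost of digging at locations 1..6

# g_built[w, x] = (treasure_value if w == x else 0) - dig_cost[w]
g_built = treasure_value * np.eye(len(dig_cost), dtype=int) - dig_cost[:, None]
print(g_built)
```

Writing $g$ this way makes it easy to experiment with a different treasure value or different digging costs without retyping the whole matrix.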

### Calculating g-vulnerability

In general we cannot be sure what we gain from each action we take, because that depends on the true value of $X$. But we can make an estimate based on the probability distribution of $X$ by computing the average gain we obtain from each action.

In [4]:
exp_gain = np.matmul(g, np.transpose(pi))
for i in range(len(exp_gain)):
    print("Average gain when choosing location %d: $%.2f" % (i+1, exp_gain[i]))

Average gain when choosing location 1: $-25.00
Average gain when choosing location 2: $-425.00
Average gain when choosing location 3: $87.50
Average gain when choosing location 4: $-12.50
Average gain when choosing location 5: $-112.50
Average gain when choosing location 6: $-212.50


And now what would our best choice be? The one with the highest average winnings of course!

In [5]:
print("Best choice: Location", np.argmax(exp_gain)+1)
print("Expected winnings: $", max(exp_gain), sep='')

Best choice: Location 3
Expected winnings: $87.5
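
In symbols, the two cells above compute the prior $g$-vulnerability: the expected gain of the best action. Writing $\mathcal{W}$ for the set of actions and $\mathcal{X}$ for the set of secret values (this notation is assumed here, the notebook does not introduce it):

$$
V_g(\pi) = \max_{w \in \mathcal{W}} \sum_{x \in \mathcal{X}} \pi_x \, g(w, x)
$$

For location 3, for instance, the inner sum works out as

$$
\tfrac{1}{4}(-100) + \tfrac{1}{4}(-100) + \tfrac{1}{8}(1400) + 3 \cdot \tfrac{1}{8}(-100) = -25 - 25 + 175 - 37.5 = 87.5
$$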


Location 1 is more likely to hold the treasure than location 3 (probability $1/4$ versus $1/8$), but digging there costs $\$400$ compared to $\$100$ at location 3. The lower digging cost more than makes up for the lower probability, so on average location 3 is a better choice than location 1.

Notice also that without considering $g$, our best choice would have been to guess $X=1$ (or $X=2$, which is equally likely). But given the information $g$ provides us, $X=3$ is a better guess.
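
To see what "without considering $g$" amounts to, here is a small sketch that uses the identity gain function: gain 1 for a correct guess and 0 otherwise. With it, the expected gain of each guess is just its probability. The names `prior`, `g_id`, `exp_gain_id` and `best_guess` are illustrative only.

```python
import numpy as np

prior = np.array([1/4, 1/4, 1/8, 1/8, 1/8, 1/8])

# Identity gain function: gain 1 for a correct guess, 0 otherwise
g_id = np.eye(len(prior))

exp_gain_id = g_id @ prior                      # equals the prior itself
best_guess = int(np.argmax(exp_gain_id)) + 1    # argmax returns the first maximum on ties
print("Best guess ignoring costs: Location", best_guess)
print("Probability of guessing right:", exp_gain_id.max())
```

Locations 1 and 2 tie here; `np.argmax` simply reports the first of them.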

### Setting a threshold

Someone could argue that the average winnings, whichever location we choose, are not good enough to justify taking action. For all but one location the average gain is negative, meaning that on average we lose money. And in the one case where the gain is positive, it is small enough that it may not be worth the effort.

So we might want to set a threshold for our gain: if the average gain of an action is lower than the threshold, we never choose that action.

For our case, someone could say that an action is only worth choosing if its average gain is at least $\$300$.

To achieve that, we can add an extra row to $g$ whose gain is $\$300$ no matter where the treasure is:

In [6]:
g = np.array([
    [1100, -400, -400, -400, -400, -400],
    [-800, 700, -800, -800, -800, -800],
    [-100, -100, 1400, -100, -100, -100],
    [-200, -200, -200, 1300, -200, -200],
    [-300, -300, -300, -300, 1200, -300],
    [-400, -400, -400, -400, -400, 1100],
    [300, 300, 300, 300, 300, 300],
])
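
The extra row can also be appended programmatically instead of retyping the matrix. This is a sketch under the assumption that the six digging rows are rebuilt from the treasure value and digging costs; `g_dig`, `threshold` and `g_with_threshold` are hypothetical names, not ones the notebook defines.

```python
import numpy as np

treasure_value = 1500
dig_cost = np.array([400, 800, 100, 200, 300, 400])
g_dig = treasure_value * np.eye(6, dtype=int) - dig_cost[:, None]   # the six digging actions

threshold = 300
# A constant row: this action gains `threshold` whatever the true location is
g_with_threshold = np.vstack([g_dig, np.full(g_dig.shape[1], threshold)])
print(g_with_threshold.shape)
```

Because the probabilities in $\pi$ sum to 1, a constant row always has an expected gain equal to the threshold, so it wins exactly when no digging action does better.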

Now, watch what happens when we compute the average gain for each action and then pick the action with the highest gain.

In [7]:
exp_gain = np.matmul(g, np.transpose(pi))
for i in range(len(exp_gain)):
    print("Average gain when choosing location %d: $%.2f" % (i+1, exp_gain[i]))

Average gain when choosing location 1: $-25.00
Average gain when choosing location 2: $-425.00
Average gain when choosing location 3: $87.50
Average gain when choosing location 4: $-12.50
Average gain when choosing location 5: $-112.50
Average gain when choosing location 6: $-212.50
Average gain when choosing location 7: $300.00


In [8]:
print("Best choice: Location", np.argmax(exp_gain)+1)
print("Expected winnings: $", max(exp_gain), sep='')

Best choice: Location 7
Expected winnings: $300.0


Of course there is no Location 7. It just means that the best action is the last one, which corresponds to:

>_"It's not worth digging any of the other locations up. Just stay home and study QIF."_
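
One way to avoid the misleading "Location 7" label is to give every row of $g$ an explicit action name. The sketch below does that; the `actions` list and the other variable names are illustrative assumptions rather than part of the notebook.

```python
import numpy as np

prior = np.array([1/4, 1/4, 1/8, 1/8, 1/8, 1/8])
g_thr = np.array([
    [1100, -400, -400, -400, -400, -400],
    [-800, 700, -800, -800, -800, -800],
    [-100, -100, 1400, -100, -100, -100],
    [-200, -200, -200, 1300, -200, -200],
    [-300, -300, -300, -300, 1200, -300],
    [-400, -400, -400, -400, -400, 1100],
    [300, 300, 300, 300, 300, 300],
])

# One label per row of g_thr: six digging actions plus the threshold action
actions = ["dig at location %d" % (i + 1) for i in range(6)] + ["stay home"]

exp_gain_thr = g_thr @ prior
best_action = actions[int(np.argmax(exp_gain_thr))]
print("Best action:", best_action)
```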

### Keep experimenting

Can you find a different probability distribution for $X$ so that there is at least one location worth searching? Don't change anything in $g$. Just play around with $\pi$. Notice that this time we compute the expected winnings using `measure.g_vuln.prior(g, pi)` instead of `max(exp_gain)`. They compute the same value.

In [9]:
pi = [2/8, 2/8, 1/8, 1/8, 1/8, 1/8] # Remember they must add up to 1

exp_gain = np.matmul(g, np.transpose(pi))
for i in range(len(exp_gain)):
    print("Average gain when choosing location %d: $%.2f" % (i+1, exp_gain[i]))
print("Best choice: Location", np.argmax(exp_gain)+1)
print("Expected winnings: $", measure.g_vuln.prior(g, pi), sep='') # measure.g_vuln.prior(g, pi) = max(exp_gain)

Average gain when choosing location 1: $-25.00
Average gain when choosing location 2: $-425.00
Average gain when choosing location 3: $87.50
Average gain when choosing location 4: $-12.50
Average gain when choosing location 5: $-112.50
Average gain when choosing location 6: $-212.50
Average gain when choosing location 7: $300.00
Best choice: Location 7
Expected winnings: $300.0
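
As a hint for the exercise: the expected gain of digging at location $w$ is $1500\,\pi_w$ minus its digging cost, so digging beats staying home only when $\pi_w$ exceeds (threshold + cost) / treasure value. The sketch below computes that break-even probability for each location in plain NumPy; `treasure_value`, `threshold`, `dig_cost` and `min_prob` are illustrative names, with the costs read off the table of $g$.

```python
import numpy as np

treasure_value = 1500
threshold = 300
dig_cost = np.array([400, 800, 100, 200, 300, 400])

# Digging at w beats staying home when treasure_value * pi_w - dig_cost[w] > threshold
min_prob = (threshold + dig_cost) / treasure_value
for i, p in enumerate(min_prob):
    print("Location %d is worth searching only if its probability exceeds %.3f" % (i + 1, p))
```

Any $\pi$ that pushes one location above its break-even probability will make the best action a digging action rather than the stay-home row.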
