**Notebook week 4**

# Scissors-Paper-Rock part 2. 

Now that we have covered mutual information, we will continue to analyse relationships between variables in your previous Scissors-Paper-Rock gameplay; see the description of Stage 1 Familiarisation and Stage 2 Entropy Calculations in the previous module.

### Recall our initial questions:

* Why are we interested in using measures of information theory to analyse this data set?
* What in particular might we wish to measure?
* Information theory is all about questions and answers. What questions might we ask of the data? What hypotheses might we have about the answers?

### <p style="color:darkblue">Stage 3 -- Conditional entropy calculations</p>

We will now analyse the conditional uncertainty in the player's moves, given their previous move, and consider whether this relates to their performance in the game. (Do you have a hypothesis on this?)
The coding is very similar to what you already did in Stage 2.

1. Open the Jupyter notebook SPR and focus on the function computeConditionalEntropyForPlayer(Player). This aims to compute the entropy of moves for a given named player, conditioned on their previous move, over all the iterations in all of their games. The code retrieves the data for each game of this player using loadGamesForPlayer(), then loops over each game. Fill out the missing parts of code:
    - In the loop, pull out the moves for that player, their previous moves (and the results on the current, not previous, move), and append them into the arrays used to store these values over all iterations. <b>Take care</b>: You can only pull out moves which have a paired sample of a previous move in the given game. This means the moves from the 2nd iteration onwards. A helpful hint: numpy indices start at 0, so if you had a 2D numpy array and wanted to pull out column 1 of its contents, but only from the 2nd row onwards, you could first pull out column 1 as <span style="color:darkblue">myColumn = data[:,1]</span> and then pull the 2nd row onwards as <span style="color:darkblue">myColumn[1:]</span>. (You could do this in one go as: <span style="color:darkblue">data[1:,1]</span>.) Similarly, you can only pull out previous moves which have a paired sample of a next move in the given game. This means the moves up to the 2nd last iteration. A helpful hint there: to pull out column 1 of a 2D numpy array, but only up to the 2nd last row, you would first pull out column 1 as <span style="color:darkblue">myColumn = data[:,1]</span> and then pull out all rows but the last as <span style="color:darkblue">myColumn[:-1]</span>. (You could do this in one go as: <span style="color:darkblue">data[:-1,1]</span>.) The two slices then have the same length, so each move lines up with the move before it. You should also only pull out samples of results that relate to the current (but not previous) moves.

    - Compute the conditional entropy over the players' moves given their previous moves, using your (or my) conditional entropy function.


2. Call the script for a few different players, e.g. computeConditionalEntropyForPlayer('Joe'), and compare.
<br>

3. Now call it to compute the conditional entropy using samples for all players' data in the one calculation: computeConditionalEntropyForPlayer('*'). What implicit assumption are we making when we analyse the data in this way?
<br>

4. Focus now on the computeConditionalEntropyForAllPlayers() function. This aims to compute the conditional entropy of moves for each player in turn (considering each player separately), then plots these, and looks for relationships between conditional entropy and win/loss rates. Fill out the missing parts of code:
    - In the loop over player names, use our previous function computeConditionalEntropyForPlayer() to compute the conditional entropy for that player.
    - Once we have the conditional entropy for each player and their win / loss ratios, compute the correlation between conditional entropy and win ratio, and between conditional entropy and loss ratio. HINT: Use an inbuilt correlation function such as numpy's corrcoef.
    
    
5. Call the script to see the conditional entropies of each player, the plots and the correlation analyses on how this relates to performance. Who was most (conditionally) uncertain? Did this correlate with wins? What about losses? Does this match your hypothesis?
<br>

6. Challenge: are these correlation values statistically significant? As per Stage 2, look up theory on how to compute whether a correlation value is statistically significant. To add this to the code, look for a correlation function that also returns a p-value; for example, scipy.stats.pearsonr returns one alongside the correlation coefficient.
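
The pairing of each move with the move before it, and the conditional entropy computed from those pairs, can be sketched as follows. This is a minimal illustration rather than the notebook's own code: the array layout (column 1 holding the player's moves), the toy move values and the helper names `entropy_bits` and `conditional_entropy_bits` are all assumptions made for the example.

```python
import numpy as np
from collections import Counter
from math import log2

def entropy_bits(samples):
    """Plug-in (empirical) Shannon entropy, in bits, of a list of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def conditional_entropy_bits(xn, yn):
    """H(X|Y) = H(X,Y) - H(Y), estimated from paired samples xn, yn."""
    return entropy_bits(list(zip(xn, yn))) - entropy_bits(list(yn))

# Two hypothetical games: one row per iteration,
# column 0 = iteration number, column 1 = this player's move (0, 1 or 2).
games = [
    np.array([[0, 0], [1, 1], [2, 2], [3, 0]]),
    np.array([[0, 1], [1, 2], [2, 0], [3, 1]]),
]

moves, prev_moves = [], []
for game in games:
    # Pair up samples within each game, so no pair straddles two games.
    moves.extend(game[1:, 1].tolist())        # 2nd iteration onwards
    prev_moves.extend(game[:-1, 1].tolist())  # all but the last iteration

h_moves = entropy_bits(moves)
h_cond = conditional_entropy_bits(moves, prev_moves)
print(h_moves, h_cond)
```

In this toy data the player simply cycles through the three moves, so the moves look maximally uncertain on their own, yet each one is fully determined by the previous move.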
<br>
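
For the correlation and significance questions, one possible approach is a permutation test, sketched below in plain Python. The per-player numbers are made up purely for illustration and are not taken from the course data set; the function names `pearson_r` and `permutation_p_value` are likewise hypothetical. A permutation test is only one of several valid ways to assess significance, and it is swapped in here because it needs no extra libraries.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: how often a random re-pairing of y with x
    gives a correlation at least as strong as the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Made-up illustrative values, one entry per player (NOT real results).
cond_entropies = [1.58, 1.50, 1.42, 1.55, 1.30, 1.47]
win_ratios = [0.36, 0.33, 0.30, 0.35, 0.28, 0.31]

r = pearson_r(cond_entropies, win_ratios)
p = permutation_p_value(cond_entropies, win_ratios)
print(r, p)
```

With only a handful of players, even a large correlation coefficient can arise by chance, which is why the p-value matters here.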


### <p style="color:darkblue">Stage 4 -- Mutual information calculations</p>

We will now analyse the mutual information from the player's previous move (or that of their opponent) to their next move, and consider whether this relates to their performance in the game. (Do you have a hypothesis on this?)
The coding is very similar to what you already did in Stage 3 above.

1. Now, focus on the computeMutualInformationForPlayer("Player") function. This aims to compute the mutual information of moves for a given named player to their previous move (or those of their opponent), over all the iterations in all of their games. The code retrieves the data for each game of this player using loadGamesForPlayer, then loops over each game. Fill out the missing parts of code:
    - In the loop, pull out the moves for that player, their previous moves or that of their opponent (and the results on the current, not previous, move), and append them into the arrays used to store these values over all iterations. Take note of how you performed the similar operations for the conditional entropy.
    - Compute the mutual information between the moves and previous moves, using our mutual information function. 


2. Call the script for a few different players, e.g. computeMutualInformationForPlayer('Joe', True), and compare.
<br>

3. Now call it to compute the mutual information using samples for all players' data in the one calculation: computeMutualInformationForPlayer('*', True). What implicit assumption are we making when we analyse the data in this way?
<br>

4. Focus now on the computeMutualInformationForAllPlayers() function. This aims to compute mutual information of moves to previous moves for each player in turn (considering each player separately), then plots these, and looks for relationships between mutual information and win/loss rates. Fill out the missing parts of code:
    - In the loop over player names, use our previous script to compute the mutual information for that player.
    - Once we have the mutual information for each player and their win / loss ratios, compute the correlation between mutual information and win ratio, and between mutual information and loss ratio. HINT: Use an inbuilt correlation function such as numpy's corrcoef.


5. Call the script to see the mutual information of each player, the plots and the correlation analyses on how this relates to performance. Do this for MI from the players' own previous move (argument fromSelf=True) and from their opponent's previous move (fromSelf=False). Who reflected the most information in their moves? Did this correlate with wins? What about losses? Does this match your hypothesis?
<br>
6. Challenge: are these correlation values statistically significant? Look up theory on how to compute whether a correlation value is statistically significant. To add this to the code, look for a correlation function that also returns a p-value; for example, scipy.stats.pearsonr returns one alongside the correlation coefficient.
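
The two mutual information measurements (from the player's own previous move, or from their opponent's) differ only in which sequence supplies the "previous" samples. A minimal sketch follows, assuming toy move sequences for a single game and hypothetical helper names (`entropy_bits`, `mutual_information_bits`, `paired_samples`); it is not the notebook's implementation.

```python
from collections import Counter
from math import log2

def entropy_bits(samples):
    """Plug-in (empirical) Shannon entropy, in bits, of a list of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def mutual_information_bits(xn, yn):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    xn, yn = list(xn), list(yn)
    return entropy_bits(xn) + entropy_bits(yn) - entropy_bits(list(zip(xn, yn)))

def paired_samples(player, opponent, from_self):
    """Return (current moves, previous moves of the chosen source) for one game."""
    source = player if from_self else opponent
    return player[1:], source[:-1]

# Hypothetical single game: moves encoded as 0, 1, 2 (NOT real data).
player = [0, 1, 2, 0, 1, 2, 0]
opponent = [2, 0, 1, 1, 2, 0, 0]

mi_self = mutual_information_bits(*paired_samples(player, opponent, True))
mi_opp = mutual_information_bits(*paired_samples(player, opponent, False))
print(mi_self, mi_opp)
```

Note that with so few samples the plug-in estimate is biased upwards, so a non-zero value from the opponent's moves in this toy game should not be read as a real dependence.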

### <p style="color:darkblue">Stage 5 -- Further analysis</p>


Are there additional analyses that you would like to perform here?

E.g. measuring mutual information from (jointly) the previous move of the player and their opponent, to the player's next move. What would you hypothesise about that? Or, is there any mutual information between concurrent moves? What would that mean?
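
One way to frame the joint measurement is via the chain rule for mutual information. The notation here is introduced only for illustration: $X_n$ is the player's move at iteration $n$ and $O_n$ is their opponent's move.

$I\left(X_n;X_{n-1},O_{n-1}\right)=I\left(X_n;X_{n-1}\right)+I\left(X_n;O_{n-1}\mid X_{n-1}\right)$

Since the conditional term cannot be negative, the joint mutual information is at least as large as the mutual information from the player's own previous move alone (and, by symmetry, from the opponent's previous move alone). The concurrent case would correspond to $I\left(X_n;O_n\right)$.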

# 4. Coding conditional mutual information

In this exercise we continue to alter the Python code to measure the conditional mutual information between variables x and y, conditional on variable z, for a distribution p(x,y,z):

$I\left(X;Y\mid Z\right)=H\left(X\mid Z\right)+H\left(Y\mid Z\right)-H\left(X,Y\mid Z\right)$
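
Expanding each conditional entropy as a joint entropy minus $H\left(Z\right)$ gives an equivalent form in terms of joint entropies only, which matches the expression H_XZ - H_Z + H_YZ - H_XYZ evaluated at the end of the code skeleton below:

$I\left(X;Y\mid Z\right)=\left[H\left(X,Z\right)-H\left(Z\right)\right]+\left[H\left(Y,Z\right)-H\left(Z\right)\right]-\left[H\left(X,Y,Z\right)-H\left(Z\right)\right]$

$\qquad=H\left(X,Z\right)+H\left(Y,Z\right)-H\left(X,Y,Z\right)-H\left(Z\right)$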

For the conditional mutual information, we will focus only on its empirical calculation (for the most part). We will code conditional mutual information I(X;Y|Z) for empirical samples xn, yn and zn in the file conditionalmutualinformationempirical.py.

4.1. Find the lines where you need to add code to this file, and do so. Hint: You can call your existing code conditionalentropyempirical.py to compute H(X,Y|Z), H(X|Z) and H(Y|Z), by passing in ([xn,yn], zn), then (xn, zn) and (yn, zn) as arguments respectively.

In [None]:
def conditionalmutualinformation(p):
    # p is a 3D array of joint probabilities p(x,y,z), indexed as p[x, y, z]
    
    # 1. Joint entropy of X,Y,Z:
    H_XYZ # = ??? 
    
    # 2. Entropy of X,Z:
    # But how to get p_xz???
    # Summing p over the y's (the 2nd dimension, i.e. axis=1 in numpy's zero-based numbering)
    # will just return the p(x,z) terms.
    # With numpy's default sum the result is a 2D array, which is fine to compute entropy on
    
    p_xz # = ???
    H_XZ # = ???
    
    # 3. Entropy of Y,Z:
    # But how to get p_yz???
    # Summing p over the x's (the 1st dimension, i.e. axis=0) will just return the p(y,z) terms.
    # Again a 2D array, which is fine to compute entropy on
    
    p_yz # = ???
    H_YZ # = ???
    
    # 4. Marginal entropy of Z:
    # But how to get p_z???
    # Summing p_xz over the x's (the 1st dimension, i.e. axis=0) will just return the p(z) terms.
    # With numpy's default sum the result is a 1D array, which is fine to compute entropy on
    
    p_z # = ???
    H_Z # = ???
    
    # 5. Computing conditional mutual information:
    # I(X;Y|Z) = H(X,Z) - H(Z) + H(Y,Z) - H(X,Y,Z)
    condMutInf = H_XZ - H_Z + H_YZ - H_XYZ
    
    return np.round(condMutInf, 6)

4.2. Test that your code works by running, e.g.:<br>
        a. <b>conditionalmutualinformationempirical([0,0,1,1],[0,1,0,1],[0,1,0,1])</b> and validating that you get the result 0 bits.<br>
        b. <b>conditionalmutualinformationempirical([0,0,1,1],[0,0,1,1],[0,1,1,0])</b> and validating that you get the result 1 bit.<br>
        c. <b>conditionalmutualinformationempirical([0,0,1,1],[0,1,0,1],[0,1,1,0])</b> and validating that you get the result 1 bit.
    <br>d. Can you explain the expected results for these boundary cases?
            <br>
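
The three test cases above can be checked against an independent, from-scratch estimate. The sketch below is an assumption-laden stand-in, not the course's conditionalmutualinformationempirical: the names `entropy_bits` and `cmi_empirical_sketch` are hypothetical, and it uses joint entropies directly rather than calling conditionalentropyempirical.

```python
from collections import Counter
from math import log2

def entropy_bits(samples):
    """Plug-in (empirical) Shannon entropy, in bits, of a list of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def cmi_empirical_sketch(xn, yn, zn):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z), estimated from samples."""
    xn, yn, zn = list(xn), list(yn), list(zn)
    h_xz = entropy_bits(list(zip(xn, zn)))
    h_yz = entropy_bits(list(zip(yn, zn)))
    h_xyz = entropy_bits(list(zip(xn, yn, zn)))
    h_z = entropy_bits(zn)
    return round(h_xz + h_yz - h_xyz - h_z, 6)

case_a = cmi_empirical_sketch([0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 0, 1])
case_b = cmi_empirical_sketch([0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 0])
case_c = cmi_empirical_sketch([0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0])
print(case_a, case_b, case_c)
```

If your own function disagrees with these values on the same inputs, checking the order in which the samples are paired is a reasonable first step.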

4.3. Challenge: Can you alter the code in conditionalmutualinformationempirical.py to compute conditional mutual information I(X;Y|Z) using the expression I(X;Y|Z) = I(X;Y,Z) - I(X;Z)?
<br>
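
The expression in 4.3 follows from writing both mutual information terms as entropy differences, so that the $H\left(X\right)$ terms cancel:

$I\left(X;Y,Z\right)-I\left(X;Z\right)=\left[H\left(X\right)-H\left(X\mid Y,Z\right)\right]-\left[H\left(X\right)-H\left(X\mid Z\right)\right]$

$\qquad=H\left(X\mid Z\right)-H\left(X\mid Y,Z\right)=I\left(X;Y\mid Z\right)$

In an empirical calculation, $I\left(X;Y,Z\right)$ could be obtained by treating the paired samples (yn, zn) as a single joint variable.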

4.4. Challenge: We did not code a function for <b>conditionalmutualinformation</b> in this exercise; however, an implementation is provided for you in the solutions (see below). Can you read the code and understand how it calculates the conditional mutual information for the given probability table p? Note that the argument p would be a 3D matrix, representing the probability p(x,y,z).
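
For comparison while reading the provided solution, here is a minimal numpy sketch of the same calculation from a probability table. It is not necessarily identical to the solution code: the names `entropy_of_table` and `cmi_from_table_sketch` are hypothetical, and the table is assumed to be indexed as p[x, y, z].

```python
import numpy as np

def entropy_of_table(p):
    """Shannon entropy in bits of a probability table of any shape.
    Zero-probability entries are skipped (they contribute nothing)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def cmi_from_table_sketch(p):
    """I(X;Y|Z) = H(X,Z) - H(Z) + H(Y,Z) - H(X,Y,Z) for a 3D table p[x, y, z]."""
    p = np.asarray(p, dtype=float)
    h_xyz = entropy_of_table(p)
    h_xz = entropy_of_table(p.sum(axis=1))      # sum out y -> p(x,z)
    h_yz = entropy_of_table(p.sum(axis=0))      # sum out x -> p(y,z)
    h_z = entropy_of_table(p.sum(axis=(0, 1)))  # sum out x and y -> p(z)
    return round(h_xz - h_z + h_yz - h_xyz, 6)

# Example 1: x and y are independent fair bits and z = x XOR y.
p_xor = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        p_xor[x, y, x ^ y] = 0.25
cmi_xor = cmi_from_table_sketch(p_xor)

# Example 2: three independent fair bits.
p_indep = np.full((2, 2, 2), 1 / 8)
cmi_indep = cmi_from_table_sketch(p_indep)
print(cmi_xor, cmi_indep)
```

The XOR table is the distribution counterpart of test case (c) in 4.2, so the two approaches can be cross-checked on it.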