# MUTUAL INFORMATION

Mutual information is one of many quantities that measure how much one random variable tells us about another.

Consider the formula below:


$\displaystyle\sum_{i,j} P_{ij} \log (P_{ij}) - \sum_{i} P_{i} \log (P_{i}) - \sum_{j} P_{j} \log (P_{j})$


Implement this mutual information formula in Python, where


$\displaystyle P_{i} = \sum_{j} P_{ij}$

and

$\displaystyle P_{j} = \sum_{i} P_{ij}$
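As a check on why this three-sum expression is related to mutual information, it can be rewritten in the more familiar ratio form. The step below uses only the definitions of $P_{i}$ and $P_{j}$ above. The identity holds for any array with positive entries, but the result is a mutual information only under the assumption that $P_{ij}$ is a joint probability table whose entries sum to 1:

$\displaystyle\sum_{i,j} P_{ij} \log\frac{P_{ij}}{P_{i} P_{j}} = \sum_{i,j} P_{ij} \log (P_{ij}) - \sum_{i,j} P_{ij} \log (P_{i}) - \sum_{i,j} P_{ij} \log (P_{j})$

$\displaystyle = \sum_{i,j} P_{ij} \log (P_{ij}) - \sum_{i} P_{i} \log (P_{i}) - \sum_{j} P_{j} \log (P_{j})$

The second line follows because summing $P_{ij}$ over $j$ leaves $P_{i}$, and summing it over $i$ leaves $P_{j}$.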

# SOLUTION

To understand the question, we first have to understand a 2-dimensional array, which is denoted by $P_{ij}$.

A 2-dimensional array is similar to a matrix: it has rows and columns, and each of the two dimensions can be of any length.

The formula also contains the 1-D arrays denoted by $P_{i}$ and $P_{j}$. A 1-D array is usually called a vector!

In the 2-D array, the first index `i` is the row and the second index `j` is the column. The term $P_{i}$ is therefore a vector with one entry per row. The formula shows how to get it from the 2-D array: for each row `i`, sum the entries across all the columns, i.e. over the `j` index.

The same goes for $P_{j}$, which is a vector with one entry per column. It is obtained by summing the entries down all the rows, i.e. over the `i` index.
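The two sums can be sketched in NumPy as follows. The 2 by 3 table `P` is made up for illustration and is not from the original. In NumPy, `axis=1` sums across the columns of each row, which gives $P_{i}$, and `axis=0` sums down the rows of each column, which gives $P_{j}$.

```python
import numpy as np

# Hypothetical 2 by 3 joint probability table (entries sum to 1)
P = np.array([[0.10, 0.20, 0.10],
              [0.30, 0.20, 0.10]])

P_i = P.sum(axis=1)  # one entry per row: sum over the column index j
P_j = P.sum(axis=0)  # one entry per column: sum over the row index i

print(P_i.shape, P_j.shape)
```

Both vectors sum to the same total as the full table, which is a quick way to check that the axes were not mixed up.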



# TEST CASES

To catch as many faults as possible, the function is checked against the following test cases:

1. The function works for a 1 by 1 array
2. The function works for a vector (an i by 1 array)
3. The function works for a square matrix of any size
4. The function works for a rectangular matrix of any size
5. The function works for an input of 0
6. The function works for integers and floats
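Cases 1, 2, 5 and 6 all need the input promoted to a 2-D array before any row or column sums can be taken. As a rough sketch, `np.atleast_2d` is one way to do that promotion. The `cases` dictionary and its values are hypothetical examples, not the inputs used in the cells below.

```python
import numpy as np

# Hypothetical example inputs, one per test case in the list above
cases = {
    "1 by 1 array": np.array([[7]]),
    "vector": np.array([1, 2, 3]),
    "square matrix": np.ones((3, 3)),
    "rectangular matrix": np.ones((2, 4)),
    "zero": 0,
    "float": 97.8,
}

# np.atleast_2d leaves 2-D inputs alone and promotes scalars and 1-D inputs
shapes = {name: np.atleast_2d(value).shape for name, value in cases.items()}

for name, shape in shapes.items():
    print(name, shape)
```

A 1-D input of length n comes out as a single row of shape `(1, n)`, and a scalar as shape `(1, 1)`.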

In [1]:
import numpy as np
import warnings
# np.log(0) raises a RuntimeWarning; zero entries are handled explicitly below, so silence it
warnings.filterwarnings('ignore')

In [2]:
def formula(array):
    Array_Format = np.array(array)  # Convert the input to a NumPy array
    if Array_Format.ndim == 0:
        # A scalar becomes a 1 by 1 array
        Array_Format = np.array([[Array_Format]])

    elif Array_Format.ndim == 1:
        # A 1-D input becomes a single-row 2-D array
        Array_Format = np.array([Array_Format])

    elif Array_Format.ndim > 2:
        raise ValueError('Maximum of 2D-ARRAY')

    # Elementwise log. log(0) is -inf, which nan_to_num replaces with a large
    # finite negative number, so a zero entry contributes 0 * log = 0 to the sum
    Log_Array_Format = np.nan_to_num(np.log(Array_Format))

    col_totals = Array_Format.sum(axis=0)  # Sum down the rows: one total per column (P_j)
    col_log = np.nan_to_num(np.log(col_totals))  # Log of the column totals, zeros handled as above

    row_totals = Array_Format.sum(axis=1)  # Sum across the columns: one total per row (P_i)
    row_log = np.nan_to_num(np.log(row_totals))  # Log of the row totals, zeros handled as above

    # Multiply elementwise, sum each term over all its entries, and combine the three sums
    return (np.sum(Array_Format * Log_Array_Format)
            - np.sum(col_totals * col_log)
            - np.sum(row_totals * row_log))
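The function above applies the formula to the entries exactly as they are given. As a hedged variant, the sketch below first rescales the input so that it sums to 1, treating it as a table of counts, and then evaluates the equivalent ratio form of the formula. The name `mutual_information` and the choice to skip zero entries with a boolean mask are assumptions made for this illustration, not part of the original solution.

```python
import numpy as np

def mutual_information(counts):
    """Sketch: mutual information (natural log) of a table of non-negative counts."""
    counts = np.atleast_2d(np.asarray(counts, dtype=float))
    if counts.ndim > 2:
        raise ValueError("Maximum of 2D-ARRAY")

    total = counts.sum()
    if total == 0:
        return 0.0  # an all-zero table carries no information

    p = counts / total                   # joint probabilities P_ij
    p_i = p.sum(axis=1, keepdims=True)   # row marginals P_i, shape (rows, 1)
    p_j = p.sum(axis=0, keepdims=True)   # column marginals P_j, shape (1, cols)
    expected = p_i * p_j                 # broadcasts to P_i * P_j for every cell

    mask = p > 0                         # zero cells contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / expected[mask])))

print(mutual_information([[1, 1], [1, 1]]))  # independent rows and columns
print(mutual_information([[1, 0], [0, 1]]))  # row fully determines column
```

With this rescaling, a table whose rows and columns are independent gives 0, and the 2 by 2 diagonal table gives log 2.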


In [3]:
# Example inputs covering the test cases above
tests = {'test1': np.random.randint(10, size=(1, 1)),
         'test2': np.random.randint(10, size=(5, 1)),
         'test3': np.random.randint(100, size=(15, 15)),
         'test4': np.random.randint(1000, size=(150, 200)),
         'test5': 0
         }

In [4]:
#Test Case 1
formula(np.random.randint(10,  size = (1, 1)))

-13.621371043387192

In [5]:
#Test Case 2
formula(np.random.randint(10,  size = (5, 1)))

-72.11636696637044

In [6]:
#Test Case 3
formula(np.random.randint(100,  size = (15, 15)))

-108385.13595732226

In [7]:
#Test Case 4
formula(np.random.randint(1000,  size = (150, 200)))

-244144020.02602667

In [8]:
#Test Case 5
formula([0])

0.0

In [9]:
#Test Case 6
formula(97.8)

-448.21002363458746
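The scalar output above can be checked by hand. For a 1 by 1 input with the single value x, the table entry, the row total and the column total are all x, so the three sums collapse to x log x - x log x - x log x = -x log x. A minimal sketch of that check follows. The helper name `one_by_one_value` is hypothetical.

```python
import math

def one_by_one_value(x):
    """Value of the three-sum expression for a single entry x, with no rescaling."""
    if x == 0:
        return 0.0  # matches the 0 * log(0) = 0 convention used by formula
    return -x * math.log(x)

value = one_by_one_value(97.8)
print(value)
```

The printed value agrees with the Test Case 6 output to within floating-point rounding.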

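The `formula` function takes your matrix and evaluates the formula above on its entries as given. It does not rescale them, so the result is a mutual information only when the entries already form a joint probability table that sums to 1.

It also handles users who pass in 1-dimensional arrays and users who pass in single numbers, whether integers or floats, by promoting those inputs to 2-D arrays first.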