# Problem 1.2 - Pearson correlation vs. Mutual Information

### Information on correlation types:
Pearson coefficient: a number in [-1, 1], where 1 indicates a perfect positive linear relationship, -1 a perfect negative linear relationship, and 0 no linear relationship at all.  
The Pearson coefficient only measures linear association, so it can miss nonlinear relationships entirely.  
  
Mutual information (MI): a number ≥ 0, where 0 means the variables are independent and a higher value indicates a stronger dependence. MI captures any kind of statistical dependence, not only linear ones.
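
To make the Pearson definition concrete, here is a minimal sketch that computes the coefficient directly from its formula (covariance divided by the product of the standard deviations). The helper name `pearson_by_hand`, the seed, the sample size, and the symmetric test function `(x - 0.5) ** 2` are illustrative assumptions and are not part of the experiment below.

```python
import numpy as np

def pearson_by_hand(x, y):
    """Pearson coefficient from its definition (hypothetical helper for illustration)."""
    x_centered = x - x.mean()
    y_centered = y - y.mean()
    # Sum of products of deviations, normalized by both spreads
    return (x_centered * y_centered).sum() / np.sqrt(
        (x_centered ** 2).sum() * (y_centered ** 2).sum()
    )

rng = np.random.default_rng(0)      # seed chosen arbitrarily for this sketch
x = rng.uniform(0, 1, 1000)

r_linear = pearson_by_hand(x, 2 * x - 1)          # noise-free linear map
r_symmetric = pearson_by_hand(x, (x - 0.5) ** 2)  # deterministic but nonlinear

print(f"linear: {r_linear:.4f}, symmetric nonlinear: {r_symmetric:.4f}")
```

The second value should come out close to 0 even though y is a deterministic function of x, which is the kind of dependence that Pearson misses and MI is meant to capture.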

In [13]:
# importing libraries
import numpy as np
from scipy.stats import pearsonr
from sklearn.feature_selection import mutual_info_regression

In [14]:
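# Run one experiment: sample x, compute y = function(x) + Gaussian noise with standard deviation sigma,
# and return the Pearson coefficient and the mutual information between x and y
def experiment(function, sigma, n = 150):
    x0 = np.random.uniform(0,1,n)          # n inputs drawn uniformly from [0, 1)
    noise = np.random.normal(0,sigma,n)    # zero-mean Gaussian noise
    y_clean = function(x0)
    y0 = y_clean + noise
    pearson,_ = pearsonr(x0, y0)           # second return value (p-value) is not needed
    # mutual_info_regression expects a 2D feature matrix, hence the reshape
    mi = mutual_info_regression(x0.reshape(-1,1),y0,random_state = 42)[0]
    return pearson, mi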


In [15]:
# define functions that should be tested
def f(x):
    return 2 * x - 1

def g(x):
    return np.sin(5 * np.pi * x)

In [16]:
# call the experiment function for different standard deviations and print the results
np.random.seed(42)
sigmas = [0.5, 0.2, 0.01]
for sigma in sigmas:
    print(f"σ = {sigma}")
    for function, name in [(f, "f"), (g, "g")]:
        correlation, mi = experiment(function, sigma)
        print(f"{name}: Pearson = {correlation:.4f}, Mutual info = {mi:.4f}")

σ = 0.5
f: Pearson = 0.7686, Mutual info = 0.4462
g: Pearson = -0.0589, Mutual info = 0.4021
σ = 0.2
f: Pearson = 0.9481, Mutual info = 1.0793
g: Pearson = 0.0333, Mutual info = 0.8661
σ = 0.01
f: Pearson = 0.9999, Mutual info = 3.3752
g: Pearson = -0.1234, Mutual info = 1.0851


### Observations:
For the linear function f:  
--> The Pearson coefficient approaches 1 as the noise standard deviation σ decreases, which indicates a strong linear correlation (as expected, since f is linear).  
--> Mutual information is always positive and increases as σ decreases, so the dependence is detected and appears stronger with less noise.  
For the nonlinear function g:  
--> The Pearson coefficient stays close to 0 for every σ, so no linear relationship between x and y is detected.  
--> Mutual information is always positive and also increases as σ decreases, so MI detects the dependence between x and y.  
In general, for both functions:  
--> With less noise, MI detects the dependence more clearly for both f and g. Pearson improves only for f; for g it stays near 0 regardless of σ.  
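
The small nonzero Pearson values for g come from single runs with n = 150 samples. As a rough check that they are sampling fluctuation and not a weak linear trend, the sketch below repeats the draw many times and looks at the mean and spread of the coefficient. The helper `pearson_of_g`, the seed, the number of repetitions (200), and the use of `np.corrcoef` in place of `pearsonr` are assumptions made for this illustration only.

```python
import numpy as np

def g(x):
    return np.sin(5 * np.pi * x)

def pearson_of_g(rng, sigma, n=150):
    """One noisy sample of (x, g(x) + noise) and its Pearson coefficient (hypothetical helper)."""
    x = rng.uniform(0, 1, n)
    y = g(x) + rng.normal(0, sigma, n)
    return np.corrcoef(x, y)[0, 1]

rng = np.random.default_rng(0)   # seed chosen arbitrarily for this sketch
values = np.array([pearson_of_g(rng, sigma=0.2) for _ in range(200)])

mean_r = values.mean()     # average coefficient over all repetitions
spread_r = values.std()    # run-to-run variation of a single estimate

print(f"mean Pearson = {mean_r:.4f}, std across runs = {spread_r:.4f}")
```

If the mean is close to 0 while the spread is noticeably larger, then single-run values of the size seen above are compatible with no linear relationship at all.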


### Conclusion:
Pearson works well for linear dependencies, but it misses nonlinear relationships such as g; these are better detected with mutual information.