# **Correlation**

Two variables in a dataset can be related in several ways. For example:

1. One variable could cause or depend on the values of another variable.
2. One variable could be lightly associated with another variable.
3. Two variables could depend on a third unknown variable.
   
. Positive Correlation: Both variables change in the same direction.

. Neutral Correlation: No relationship in the change of the variables.

. Negative Correlation: Variables change in opposite directions.
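
The three cases above can be illustrated with synthetic data. This is a minimal sketch: the seed, sample size, slopes, and the helper name `pearson_r` are arbitrary choices for illustration, not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
x = rng.normal(size=500)
noise = rng.normal(size=500)

y_pos = 2 * x + noise            # tends to rise when x rises
y_neg = -2 * x + noise           # tends to fall when x rises
y_none = rng.normal(size=500)    # generated without any reference to x

def pearson_r(a, b):
    """Pearson correlation coefficient of two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

r_pos = pearson_r(x, y_pos)
r_neg = pearson_r(x, y_neg)
r_none = pearson_r(x, y_none)

print(f"positive: {r_pos:.2f}, negative: {r_neg:.2f}, neutral: {r_none:.2f}")
```

The positive and negative cases should give coefficients close to +1 and -1, while the neutral case should stay near zero.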

### Covariance:
. Variables can be related by a linear relationship. This is a relationship that is consistently additive across the two data samples.

. The linear relationship between two variables can be summarized by a single number, called the covariance.

. The sign of the covariance can be interpreted as whether the two variables change in the same direction (POSITIVE) or change in different directions (NEGATIVE).

. The magnitude of the covariance is not easily interpreted, because it depends on the units of both variables. A covariance of zero indicates that there is no linear relationship between the variables; it does not by itself mean that they are independent.
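
As a rough illustration of how the sign and the zero case behave, the sketch below computes the sample covariance by hand. The helper name `sample_cov` and the small arrays are made up for this example. Note that `y_square` is fully determined by `x`, yet its covariance with `x` is zero, because the dependence is not linear.

```python
import numpy as np

def sample_cov(a, b):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sum((a - a.mean()) * (b - b.mean())) / (len(a) - 1)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_linear = 3 * x + 1   # linear in x
y_square = x ** 2      # depends on x, but not linearly

cov_linear = sample_cov(x, y_linear)
cov_square = sample_cov(x, y_square)

print("cov(x, 3x + 1):", cov_linear)
print("cov(x, x^2):   ", cov_square)
print("np.cov agrees: ", np.isclose(cov_linear, np.cov(x, y_linear)[0, 1]))
```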


. Types of correlation coefficient:

    Pearson's r     (For normal data)

    Spearman's rho  (For non-normal data)

    Kendall's tau   (For ranking)
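
To see how the three coefficients differ, here is a small sketch on made-up data where `y` always increases with `x` but not along a straight line. It assumes SciPy is installed. The rank-based coefficients (Spearman, Kendall) only look at ordering, so they should report a perfect relationship, while Pearson's r should come out lower.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

x = np.arange(1, 11, dtype=float)
y = np.exp(x)   # strictly increasing, but far from linear

r, _ = pearsonr(x, y)      # strength of the linear relationship
rho, _ = spearmanr(x, y)   # Pearson's r computed on the ranks
tau, _ = kendalltau(x, y)  # based on concordant vs. discordant pairs

print(f"Pearson r:    {r:.3f}")
print(f"Spearman rho: {rho:.3f}")
print(f"Kendall tau:  {tau:.3f}")
```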

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

kashti = sns.load_dataset('titanic')
phool = sns.load_dataset('iris')

In [None]:
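# Both columns are numeric
# Covariance ('age' has missing values, so drop incomplete rows first; otherwise the result is NaN)
age_fare = kashti[['age', 'fare']].dropna()
np.cov(age_fare['age'], age_fare['fare'])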

In [None]:
x = [1.23, 2.12, 3.34, 4.5]
y = [2.56, 2.89, 3.76, 3.95]
# Stack x and y as two rows; np.cov treats each row as one variable
cov_mat = np.stack((x,y), axis=0)
# cov_mat
print(np.cov(cov_mat))

## Correlation instead of Covariance
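
One reason to prefer correlation is that covariance changes with the units of the data, while the correlation coefficient is the covariance rescaled by both standard deviations and so always lies between -1 and 1. The sketch below reuses the small `x` and `y` lists from the covariance cell; the factor of 100 is an arbitrary stand-in for a change of units.

```python
import numpy as np

x = np.array([1.23, 2.12, 3.34, 4.5])
y = np.array([2.56, 2.89, 3.76, 3.95])

cov_xy = np.cov(x, y)[0, 1]

# Correlation = covariance divided by the product of the sample standard deviations
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))
r_numpy = np.corrcoef(x, y)[0, 1]

# Rescaling x changes the covariance but leaves the correlation unchanged
cov_scaled = np.cov(100 * x, y)[0, 1]
r_scaled = np.corrcoef(100 * x, y)[0, 1]

print("covariance:        ", cov_xy)
print("covariance (x*100):", cov_scaled)
print("correlation:       ", r_manual, r_numpy, r_scaled)
```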

In [None]:
kashti.info()

Simple Correlation

In [None]:
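# numeric_only=True skips text/category columns (needed in pandas 2.0+, available since 1.5)
kashti.corr(numeric_only=True)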

1_Pearson's Correlation (For Normal Data)

In [None]:
corr1 = kashti.corr(method='pearson', numeric_only=True)

2_Spearman's Correlation (For Not-Normal Data)

In [None]:
corr2 = kashti.corr(method='spearman', numeric_only=True)

3_Kendall Correlation (For Ranking)

In [None]:
corr3 = kashti.corr(method='kendall', numeric_only=True)

### Positive Correlation 

In [None]:
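# See corr1: the coefficient for this pair is positive (below 0.5)
# Boolean columns are cast to int so the regression line can be fitted
sns.regplot(x=kashti['adult_male'].astype(int), y=kashti['alone'].astype(int))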

In [None]:
sns.regplot(x='sepal_length', y='petal_length', data=phool)

### Negative Correlation

In [None]:
sns.regplot(x=kashti['parch'], y=kashti['alone'].astype(int))

Other ways to show correlation in graphs

In [None]:
sns.heatmap(corr1,annot=True)

In [None]:
corr1.style.background_gradient(cmap='coolwarm')

In [None]:
# pairplot expects the raw data, not a correlation matrix
sns.pairplot(phool)

## For Calculating Correlation

In [None]:
# Also try the other types (spearmanr, kendalltau)
from scipy.stats import pearsonr
# Use a new name so the corr1 DataFrame above is not overwritten
r, _ = pearsonr(phool['sepal_length'], phool['petal_length'])
print('Pearsons Correlation: %.3f' % r)

## For printing the most strongly correlated columns

In [None]:
# df_num is not defined in this notebook; it is assumed to be a numeric DataFrame
# whose last column is 'SalePrice'
df_num_corr = df_num.corr()['SalePrice'][:-1]  # drop the last row, which is SalePrice itself
golden_features_list = df_num_corr[df_num_corr.abs() > 0.5].sort_values(ascending=False)
print("There are {} values strongly correlated with SalePrice:\n{}".format(len(golden_features_list), golden_features_list))
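
The cell above depends on a DataFrame that is not loaded in this notebook, so here is a self-contained sketch of the same idea on synthetic data. All column names (`strong_pos`, `strong_neg`, `unrelated`, `target`) are hypothetical, and the ranking by absolute correlation with `head(10)` is one possible way to get a "top 10" list, not necessarily the approach intended above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)   # arbitrary seed
n = 200
target = rng.normal(size=n)

# Hypothetical columns: two closely tied to the target, one unrelated
df = pd.DataFrame({
    "strong_pos": target + 0.1 * rng.normal(size=n),
    "strong_neg": -target + 0.1 * rng.normal(size=n),
    "unrelated": rng.normal(size=n),
    "target": target,
})

# Correlation of every column with the target, excluding the target itself
corr_with_target = df.corr()["target"].drop("target")

# Rank by absolute value so strong negative correlations are kept too
order = corr_with_target.abs().sort_values(ascending=False).index
top = corr_with_target.loc[order].head(10)

print(top)
```

Ranking by absolute value keeps strongly negative columns near the top, which a plain descending sort would push to the bottom.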