### CORRELATION & COVARIANCE

**Correlation and covariance are two measures that are used in statistics to describe the relationship between two variables.**

1. Covariance is a measure of the joint variability of two random variables. It measures how much two variables change together. If the covariance between two variables is positive, it means that they tend to increase or decrease together. If the covariance is negative, it means that as one variable increases, the other tends to decrease. A covariance of zero indicates that there is no linear relationship between the two variables.

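2. Correlation, on the other hand, measures the strength and direction of the linear relationship between two variables. The Pearson correlation coefficient is the covariance divided by the product of the two standard deviations, which means that it always falls between -1 and 1. A correlation of +1 indicates a perfect positive linear relationship, a correlation of -1 indicates a perfect negative linear relationship, and a correlation of 0 indicates no linear relationship. Spearman's rank correlation, used later in this notebook, applies the same formula to the ranks of the values, so it measures monotonic rather than strictly linear association.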
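
The two definitions above can be checked by hand. The sketch below is a minimal illustration on made-up numbers (the arrays `x` and `y` are hypothetical and not taken from any dataset in this notebook). It computes the sample covariance with the n - 1 denominator, divides by the two sample standard deviations to get the Pearson correlation, and compares both against NumPy's built-ins.

```python
import numpy as np

# Hypothetical toy data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Sample covariance: sum of products of deviations from the mean, divided by n - 1.
dx = x - x.mean()
dy = y - y.mean()
cov_xy = (dx * dy).sum() / (len(x) - 1)

# Pearson correlation: covariance divided by the product of the sample standard deviations.
corr_xy = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

print(cov_xy, np.cov(x, y)[0, 1])        # the two values should agree
print(corr_xy, np.corrcoef(x, y)[0, 1])  # the two values should agree
```

Note that `np.cov` uses the n - 1 denominator by default, which is also what pandas' `DataFrame.cov` uses.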

**Both covariance and correlation are important measures in statistics, as they help us to understand the relationship between two variables. However, correlation is often preferred over covariance because it is a standardized measure, which means that it is easier to compare across different datasets and variables with different units of measurement.**
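
To make the point about units concrete, here is a small sketch on synthetic data (the variables `height_m` and `weight_kg`, the coefficients, and the random seed are all assumptions for illustration). Converting one variable from metres to centimetres multiplies the covariance by 100 but leaves the correlation unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Synthetic, hypothetical measurements.
height_m = rng.normal(1.7, 0.1, size=200)
weight_kg = 60 + 40 * (height_m - 1.7) + rng.normal(0, 5, size=200)

height_cm = height_m * 100  # same quantity, different unit

cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]
corr_m = np.corrcoef(height_m, weight_kg)[0, 1]
corr_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

print(cov_cm / cov_m)   # ratio is 100 up to floating-point error
print(corr_m, corr_cm)  # equal up to floating-point error
```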

In [8]:
import seaborn as sns
import numpy as np

In [9]:
d = sns.load_dataset("healthexp")

In [10]:
d.head(5)

,Year,Country,Spending_USD,Life_Expectancy
0,1970,Germany,252.311,70.6
1,1970,France,192.143,72.2
2,1970,Great Britain,123.993,71.9
3,1970,Japan,150.437,72.0
4,1970,USA,326.961,70.9


In [11]:
## Covariance (numeric columns only; pandas 2.0+ raises on the string column Country otherwise)
d.cov(numeric_only=True)

,Year,Spending_USD,Life_Expectancy
Year,201.098848,25718.83,41.915454
Spending_USD,25718.827373,4817761.0,4166.800912
Life_Expectancy,41.915454,4166.801,10.733902


In [14]:
## Spearman Rank Correlation
d.corr(method='spearman', numeric_only=True)

,Year,Spending_USD,Life_Expectancy
Year,1.0,0.931598,0.896117
Spending_USD,0.931598,1.0,0.747407
Life_Expectancy,0.896117,0.747407,1.0


In [15]:
## Pearson Correlation
d.corr(method='pearson', numeric_only=True)

,Year,Spending_USD,Life_Expectancy
Year,1.0,0.826273,0.902175
Spending_USD,0.826273,1.0,0.57943
Life_Expectancy,0.902175,0.57943,1.0
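
The two methods used above differ in what they measure: Pearson captures linear association, while Spearman is the Pearson correlation computed on the ranks of the data, so it captures any monotonic association. A minimal sketch on a hypothetical toy frame (the name `toy` and the columns `x` and `y` are made up for illustration) shows both points.

```python
import pandas as pd

# Hypothetical data: y grows monotonically with x, but not linearly.
toy = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
toy["y"] = toy["x"] ** 3

pearson = toy.corr(method="pearson").loc["x", "y"]
spearman = toy.corr(method="spearman").loc["x", "y"]

# Spearman by hand: rank each column, then take the Pearson correlation of the ranks.
spearman_by_hand = toy.rank().corr(method="pearson").loc["x", "y"]

print(pearson)           # below 1, since the relationship is curved
print(spearman)          # 1 up to floating-point error, since the ordering matches exactly
print(spearman_by_hand)  # matches the built-in Spearman value
```

A gap between the two coefficients on real data, as in the Spending_USD and Life_Expectancy pair above, is one hint that the relationship may be monotonic but not linear.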


In [19]:
e = sns.load_dataset('penguins')

In [20]:
e.head(5)

,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
3,Adelie,Torgersen,,,,,
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female


In [21]:
e.cov(numeric_only=True)

,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g
bill_length_mm,29.807054,-2.534234,50.375765,2605.591912
bill_depth_mm,-2.534234,3.899808,-16.21295,-747.370093
flipper_length_mm,50.375765,-16.21295,197.731792,9824.416062
body_mass_g,2605.591912,-747.370093,9824.416062,643131.077327


In [22]:
e.corr(numeric_only=True)

,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g
bill_length_mm,1.0,-0.235053,0.656181,0.59511
bill_depth_mm,-0.235053,1.0,-0.583851,-0.471916
flipper_length_mm,0.656181,-0.583851,1.0,0.871202
body_mass_g,0.59511,-0.471916,0.871202,1.0
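
Since correlation is covariance rescaled by the standard deviations, the correlation table can be recovered from the covariance table alone. The sketch below copies the penguins covariance values printed above into a NumPy array (the variable names `cols`, `cov`, `std`, and `corr` are illustrative) and divides each entry by the product of the corresponding standard deviations, which are the square roots of the diagonal.

```python
import numpy as np

cols = ["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"]

# Covariance matrix, copied from the e.cov() output above.
cov = np.array([
    [29.807054, -2.534234, 50.375765, 2605.591912],
    [-2.534234, 3.899808, -16.21295, -747.370093],
    [50.375765, -16.21295, 197.731792, 9824.416062],
    [2605.591912, -747.370093, 9824.416062, 643131.077327],
])

# The diagonal holds the variances; their square roots are the standard deviations.
std = np.sqrt(np.diag(cov))

# corr[i, j] = cov[i, j] / (std[i] * std[j])
corr = cov / np.outer(std, std)

print(cols)
print(np.round(corr, 6))
```

The result should agree with the correlation table above up to rounding of the printed covariance values.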
