<h2 align = 'center'><font color = 'blue'>Spearman's Correlation</font></h2>

- ***`The Spearman rank-order correlation is a statistical procedure that is designed to measure the relationship between two variables on an ordinal scale of measurement.`***

### Spearman’s Correlation (Non-Gaussian Distribution)

- `Two variables may be related by a nonlinear relationship, such that the relationship is stronger or weaker across the distribution of the variables.`
<br>



- `Further, the two variables being considered may have a non-Gaussian distribution.`

![title](Spearmans_Correlation.png)

***Interpretation :***

        H0: There is no [monotonic] association between the two variables [in the population].
        H1: There is a [monotonic] association between the two variables [in the population].

- When the relationship is nonlinear or the data are non-Gaussian, Spearman’s correlation coefficient (`named for Charles Spearman`) can be used to summarize the strength of the relationship between the two data samples. It can also be used when the relationship between the variables is linear, but it will have slightly less power than Pearson's correlation (e.g. it may result in lower coefficient scores).

- As with the Pearson correlation coefficient, the scores range from -1 to 1, where -1 indicates perfectly negatively correlated variables and 1 indicates perfectly positively correlated variables.

- Instead of calculating the coefficient using the covariance and standard deviations of the sample values themselves, these statistics are calculated from the ***relative rank of the values in each sample.*** This is a common approach in ***non-parametric statistics, i.e. statistical methods that do not assume a particular distribution of the data, such as Gaussian.***

            Spearman's correlation coefficient = covariance(rank(X), rank(Y)) / (stdv(rank(X)) * stdv(rank(Y)))

A linear relationship between the variables is not assumed, although a **monotonic relationship** is assumed. Monotonic is the *mathematical name for a relationship in which one variable consistently increases or consistently decreases as the other increases.*
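
The rank-based formula above can be sketched directly in code. The snippet below is a minimal illustration, not SciPy's internal implementation: `spearman_from_ranks` is a hypothetical helper name, and the sample data and seed are assumptions chosen only for the demonstration. It ranks each sample with `scipy.stats.rankdata`, applies the covariance / standard deviation formula to the ranks, and compares the result with `spearmanr()`.

```python
import numpy as np
from scipy.stats import rankdata, spearmanr


def spearman_from_ranks(x, y):
    """Hypothetical helper: apply the Pearson formula to the ranks of x and y."""
    rx = rankdata(x)  # ranks start at 1; tied values receive their average rank
    ry = rankdata(y)
    cov = np.cov(rx, ry)[0, 1]  # sample covariance of the ranks (ddof=1)
    return cov / (np.std(rx, ddof=1) * np.std(ry, ddof=1))


# illustrative data (assumed): a noisy increasing relationship
rng = np.random.default_rng(1)
x = rng.random(200) * 20
y = x + rng.random(200) * 10

manual_coef = spearman_from_ranks(x, y)
scipy_coef, _ = spearmanr(x, y)
print('manual: %.6f  scipy: %.6f' % (manual_coef, scipy_coef))
```

The two values should agree to within floating-point error, since Spearman's coefficient is the Pearson coefficient computed on the ranks.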

- ***`If you are unsure of the distribution and possible relationships between two variables, the Spearman correlation coefficient is a good tool to use.`***


- `Spearman’s rank correlation can be calculated in Python using the  `**spearmanr()**`  SciPy function between two data samples with the same length.`
<br>


- `The function takes two real-valued samples as arguments and returns both the correlation coefficient in the range between -1 and 1 and the p-value for interpreting the significance of the coefficient.`

In [1]:
# calculate the spearman's correlation between two variables

from numpy.random import rand
from numpy.random import seed
from scipy.stats import spearmanr

# seed random number generator
seed(1)
# prepare data
data1 = rand(1000) * 20
data2 = data1 + (rand(1000) * 10)

# calculate spearman's correlation
coef, p = spearmanr(data1, data2)
print('Spearmans correlation coefficient: %.3f' % coef)

# interpret the significance
alpha = 0.05
if p > alpha:
    print('No significant correlation detected (fail to reject H0) p=%.3f' % p)
else:
    print('Samples are correlated (reject H0) p=%.3f' % p)

Spearmans correlation coefficient: 0.900
Samples are correlated (reject H0) p=0.000


- ***Returns :***
            
    Spearman’s correlation coefficient
                
                    rho (ρ) : float
                
    Two-tailed p-value
    
                    p-value : float
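
To illustrate the difference between a monotonic and a linear relationship, the sketch below compares Pearson's and Spearman's coefficients on a strictly increasing but nonlinear relationship. The data (an exponential curve over an evenly spaced grid) is an assumption chosen only for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# illustrative data (assumed): y grows exponentially with x,
# so the relationship is monotonic but far from linear
x = np.linspace(0, 10, 50)
y = np.exp(x)

pearson_coef, _ = pearsonr(x, y)
spearman_coef, _ = spearmanr(x, y)

print('Pearson coefficient:  %.3f' % pearson_coef)
print('Spearman coefficient: %.3f' % spearman_coef)
```

Because the ordering of `y` matches the ordering of `x` exactly, the ranks agree perfectly and Spearman's coefficient is 1, while Pearson's coefficient falls noticeably below 1 because the relationship is not a straight line.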