In this article, I will show how to compute Spearman's rank correlation coefficient using Python.

Spearman's rank correlation coefficient measures how well the relationship between two variables can be described as "monotonic", that is, whether one variable consistently increases or consistently decreases as the other variable increases. Spearman's correlation assesses monotonic relationships whether they are linear or not.

For more detail, you can read [A comparison of the Pearson and Spearman correlation methods](https://support.minitab.com/en-us/minitab-express/1/help-and-how-to/modeling-statistics/regression/supporting-topics/basics/a-comparison-of-the-pearson-and-spearman-correlation-methods/).
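To make the difference between the two coefficients concrete, here is a minimal sketch in plain Python. The helper names `average_ranks`, `pearson` and `spearman` are my own and are not part of any library; the sketch assumes that Spearman's coefficient can be computed as Pearson's coefficient applied to the ranks of the data. The sample data (`y` is the cube of `x`) is made up for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's coefficient computed as Pearson's coefficient of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5, 6, 7]
y = [v ** 3 for v in x]  # monotonic, but clearly not linear

print('pearson=%.3f, spearman=%.3f' % (pearson(x, y), spearman(x, y)))
```

Because `y` always increases with `x`, the ranks agree exactly and Spearman's coefficient is 1, while Pearson's coefficient stays below 1 because the relationship is not a straight line.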

A coefficient of 1 means that one variable is strictly increasing as the other increases, and a coefficient of -1 means that it is strictly decreasing. A coefficient of 0 means that the data shows no monotonic relationship. The Spearman correlation between two variables will be high when observations have a similar rank in both variables.
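As a rough illustration of these values, the sketch below uses the classic rank-difference formula, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between the two ranks of each observation. This formula is only valid when there are no tied values. The function names are hypothetical, and the first two data sets are made up; the last pair is the same data as in the two-list example later in this article.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) to n; assumes all values are distinct."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid only without ties."""
    n = len(x)
    rank_pairs = zip(ranks_no_ties(x), ranks_no_ties(y))
    d_squared = sum((rx - ry) ** 2 for rx, ry in rank_pairs)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Strictly increasing, strictly decreasing, and a mixed ordering.
increasing = spearman_no_ties([1, 2, 3, 4, 5], [10, 20, 40, 80, 160])
decreasing = spearman_no_ties([1, 2, 3, 4, 5], [9, 7, 4, 2, 1])
mixed = spearman_no_ties([1, 2, 3, 4, 5, 6, 7], [3, 2, 1, 5, 7, 4, 6])

print(increasing, decreasing, round(mixed, 6))
```

The increasing pair gives 1, the decreasing pair gives -1, and the mixed ordering gives a value in between.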

Examples of where Spearman's correlation can be used are:

- [IQ of a person with the number of hours spent on games](https://www.wikiwand.com/en/Spearman%27s_rank_correlation_coefficient)
- [Physics and Math ranks](https://www.statisticshowto.datasciencecentral.com/spearman-rank-correlation-definition-calculate/)
- [Free university meals and their CGPA scores](https://www.toppr.com/guides/business-mathematics-and-statistics/correlation-and-regression/rank-correlation/)

You can test [the null hypothesis](https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide-2.php) that there is no monotonic association between the two variables.
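One way to see what this null hypothesis means is an exact permutation test: if there is no association, every ordering of the second set of ranks is equally likely, so the p-value is the share of orderings that give a coefficient at least as extreme as the observed one. The sketch below is only an illustration under that assumption; the function names are my own, the inputs must already be ranks without ties, and the result may differ from the p-value reported by `spearmanr`, which to my understanding relies on an approximation rather than on enumerating permutations.

```python
from itertools import permutations


def sum_squared_rank_diffs(rx, ry):
    """Sum of squared differences between paired ranks."""
    return sum((a - b) ** 2 for a, b in zip(rx, ry))


def exact_spearman_p(rx, ry):
    """Two-sided exact p-value for two lists of ranks without ties.

    Counts the orderings of ry whose |rho| is at least as large as the
    observed |rho|. Only practical for small n (n! orderings are checked).
    """
    n = len(rx)
    scale = n * (n ** 2 - 1)
    observed = abs(1 - 6 * sum_squared_rank_diffs(rx, ry) / scale)
    total = 0
    extreme = 0
    for perm in permutations(ry):
        rho = 1 - 6 * sum_squared_rank_diffs(rx, perm) / scale
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(rho) >= observed - 1e-12:
            extreme += 1
    return extreme / total


ranks_a = [1, 2, 3, 4, 5, 6, 7]
ranks_b = [3, 2, 1, 5, 7, 4, 6]

p_perfect = exact_spearman_p(ranks_a, ranks_a)  # perfectly matching ranks
p_mixed = exact_spearman_p(ranks_a, ranks_b)    # partially matching ranks
print('p_perfect=%.6f, p_mixed=%.6f' % (p_perfect, p_mixed))
```

For perfectly matching ranks only two of the 7! = 5040 orderings (the identical and the fully reversed one) are as extreme, so the exact p-value is 2/5040.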

All the code is available in [this notebook](https://github.com/shinokada/python-for-ib-diploma-mathematics/blob/master/Spearman's%20rank%20correlation%20coefficient.ipynb).

We import the necessary libraries.

In [26]:
import pandas as pd
import numpy as np
from scipy.stats import spearmanr

An example of Spearman's rank correlation test using two lists.

In [27]:
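data1 = [1, 2, 3, 4, 5, 6, 7]
data2 = [3, 2, 1, 5, 7, 4, 6]
# spearmanr returns the correlation coefficient and the p-value
stat, p = spearmanr(data1, data2)
print('stat=%.6f, p=%.6f' % (stat, p))
# Compare the p-value with a 5% significance level
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')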

stat=0.678571, p=0.093750
Probably independent


An example using a pandas DataFrame.

In [7]:
race = pd.DataFrame(
    [
        [1,3],
        [2,2],
        [3,1],
        [4,5],
        [5,7],
        [6,4],
        [7,6]
    ],
    columns=["After 5km","End of race"])
race

|   | After 5km | End of race |
|---|-----------|-------------|
| 0 | 1 | 3 |
| 1 | 2 | 2 |
| 2 | 3 | 1 |
| 3 | 4 | 5 |
| 4 | 5 | 7 |
| 5 | 6 | 4 |
| 6 | 7 | 6 |


In [8]:
stat, p = spearmanr(race)
print('stat=%.3f, p=%.3f' % (stat, p))

stat=0.679, p=0.094


Scatter plot

In [11]:
race2 = pd.DataFrame(
    [
        [1,1],
        [2,2],
        [3,3],
        [4,4],
        [5,5],
        [6,6],
        [7,7]
    ],
    columns=["After 5km","End of race"])
stat, p = spearmanr(race2)
print('stat=%.3f, p=%.3f' % (stat, p))

stat=1.000, p=0.000


In [12]:
race3 = pd.DataFrame(
    [
        [1,7],
        [2,6],
        [3,5],
        [4,4],
        [5,3],
        [6,2],
        [7,1]
    ],
    columns=["After 5km","End of race"])
stat, p = spearmanr(race3)
print('stat=%.3f, p=%.3f' % (stat, p))

stat=-1.000, p=0.000


In [28]:
race4 = pd.DataFrame(
    [
        [4,4],
        [4,4],
        [4,4],
        [4,4],
        [4,4],
        [4,4],
        [4,4]
    ],
    columns=["After 5km","End of race"])
stat, p = spearmanr(race4)
print('stat=%.3f, p=%.3f' % (stat, p))

stat=nan, p=nan
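
The `nan` result in this last case comes from the data itself: when a column is constant, all of its ranks are tied, so there is no variation to correlate and the coefficient is undefined. A minimal sketch of a check that could be run before computing the correlation is shown below; `has_variation` is a hypothetical helper of my own, not a pandas or SciPy function.

```python
def has_variation(values):
    """Return True if the sequence contains at least two distinct values."""
    return len(set(values)) > 1


after_5km = [4, 4, 4, 4, 4, 4, 4]
end_of_race = [4, 4, 4, 4, 4, 4, 4]

if has_variation(after_5km) and has_variation(end_of_race):
    print('Safe to compute the Spearman correlation')
else:
    print('Correlation undefined: at least one variable is constant')
```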
