# **Basic Statistics**

<hr>

# Correlation
In plain language, correlation measures how strongly two features vary together: the month of the year is correlated with the average daily temperature, and the hour of the day is correlated with the amount of light outdoors.

Statistics and data science are often concerned with the relationships between two or more variables (or features) of a dataset. Each data point in the dataset is an observation, and the features are the properties or attributes of those observations.

## **Analyze Output of Correlation**

If you analyze any two features of a dataset, then you’ll find some type of correlation between those two features. Consider the following figures:

![corr](https://files.realpython.com/media/py-corr-1.d13ed60a9b91.png)

Each of these plots shows one of three different forms of correlation:

-    **Negative correlation (red dots)**: In the plot on the left, the y values tend to decrease as the x values increase. This shows strong negative correlation, which occurs when large values of one feature correspond to small values of the other, and vice versa.

-    **Weak or no correlation (green dots)**: The plot in the middle shows no obvious trend. This is a form of weak correlation, which occurs when an association between two features is not obvious or is hardly observable.

-    **Positive correlation (blue dots)**: In the plot on the right, the y values tend to increase as the x values increase. This illustrates strong positive correlation, which occurs when large values of one feature correspond to large values of the other, and vice versa.



> **``Note``**: When you’re analyzing correlation, you should always have in mind that ``correlation does not indicate causation``. It quantifies the strength of the relationship between the features of a dataset. Sometimes, the association is caused by a factor common to several features of interest.
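
The point about a common factor can be sketched with simulated data. In the snippet below, `z` is a hypothetical shared cause, and `x` and `y` are built from it independently; the names, the noise scale, and the seed are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable
n = 1_000

# z is a hypothetical common factor; x and y never influence each other.
z = rng.normal(size=n)
x = z + rng.normal(scale=0.5, size=n)
y = z + rng.normal(scale=0.5, size=n)

r_xy = np.corrcoef(x, y)[0, 1]
print(f"corr(x, y) = {r_xy:.2f}")  # clearly positive, driven only by z
```

Here `x` and `y` are strongly correlated even though neither one causes the other.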

<hr>

``Class Activity``
## ``Overview Correlation``

#### 1. Definition
#### 2. Conditions (assumptions)
#### 3. How to code it in Python

### Continuous Feature
1. **Pearson** (linear correlation): the data are continuous, approximately normally distributed, and linearly related.
    
2. **Spearman**: the data are not normally distributed, are ordinal (or continuous values converted to ranks), and the relationship is monotonic.
    X and Y are two different measurements, but each X value must still be paired with the Y value from the same observation.
    
3. **Kendall**: the data are not normally distributed, the sample tends to be small, and the relationship is monotonic; often used in experimental contexts.
   X and Y are paired observations (the same subjects measured at different times), for example a numeric intelligence test for class X at school A in June (X) and in July (Y).

### Categorical Feature
1. Cramer's V
2. Theil's U

**NOIR** (levels of measurement):

- **Nominal/Categorical**: Male (1), Female (2)
- **Ordinal**: elementary school (1), junior high school (2), senior high school (3), university (4)

- **Interval**: measured values with equal distances between them but no absolute zero, so values can be negative; for example, a temperature of 0 degrees Celsius.
- **Ratio**: measured or counted values with an absolute zero and equal distances between them; for example depth, speed, and age.

<hr>

# Linear Correlation

**Linear correlation** measures the proximity of the mathematical relationship between variables or dataset features to a linear function. If the relationship between the two features is closer to some linear function, then their linear correlation is stronger and the absolute value of the correlation coefficient is higher.

## **Pearson Correlation Coefficient**

### ``Pearson's correlation coefficient = covariance(X, Y) / (stdv(X) * stdv(Y))``
### ``covariance(X, Y) = (sum (x - mean(X)) * (y - mean(Y)) ) * 1/(n-1)``
#### ``The use of mean and standard deviation in the calculation suggests the need for the two data samples to have a Gaussian or Gaussian-like distribution.``

![linear](https://www.emathzone.com/wp-content/uploads/2014/10/linear-nonlinear-corrrelation.jpg)

Consider a dataset with two features: x and y. Each feature has n values, so x and y are n-tuples. Say that the first value x₁ from x corresponds to the first value y₁ from y, the second value x₂ from x to the second value y₂ from y, and so on. Then, there are n pairs of corresponding values: (x₁, y₁), (x₂, y₂), and so on. Each of these x-y pairs represents a single observation.

The **Pearson (product-moment) correlation coefficient** is a measure of the linear relationship between two features. It’s the ratio of the covariance of x and y to the product of their standard deviations. It’s often denoted with the letter r and called Pearson’s r. You can express this value mathematically with this equation:

r = Σᵢ((xᵢ − mean(x))(yᵢ − mean(y))) (√Σᵢ(xᵢ − mean(x))² √Σᵢ(yᵢ − mean(y))²)⁻¹

Here, i takes on the values 1, 2, …, n. The mean values of x and y are denoted with mean(x) and mean(y). This formula shows that if larger x values tend to correspond to larger y values and vice versa, then r is positive. On the other hand, if larger x values are mostly associated with smaller y values and vice versa, then r is negative.
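
As a minimal sketch, the formula above can be translated into NumPy term by term and checked against `np.corrcoef`. The helper name `pearson_r` and the sample values are made up for illustration.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r computed directly from the definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return np.sum(dx * dy) / (np.sqrt(np.sum(dx**2)) * np.sqrt(np.sum(dy**2)))

# Illustrative values only.
x = [10, 11, 12, 13, 14, 15]
y = [2, 1, 4, 5, 8, 12]

r_manual = pearson_r(x, y)
r_numpy = np.corrcoef(x, y)[0, 1]
print(r_manual, r_numpy)  # the two values should agree
```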

Here are some important facts about the Pearson correlation coefficient:

-    The Pearson correlation coefficient can take on any real value in the range −1 ≤ r ≤ 1.

-    The maximum value r = 1 corresponds to the case when there’s a perfect positive linear relationship between x and y. In other words, larger x values correspond to larger y values and vice versa.

-    The value r > 0 indicates positive correlation between x and y.

-    The value r = 0 means there's no linear relationship between x and y. It doesn't by itself imply independence, because a nonlinear relationship can still produce r = 0.

-    The value r < 0 indicates negative correlation between x and y.

-    The minimal value r = −1 corresponds to the case when there’s a perfect negative linear relationship between x and y. In other words, larger x values correspond to smaller y values and vice versa.


| Pearson’s r Value | Correlation Between x and y |
| --- | --- |
| equal to 1 | perfect positive linear relationship |
| greater than 0 |  positive correlation |
| equal to 0 | no linear correlation |
| less than 0 | negative correlation |
| equal to -1 | perfect negative linear relationship |
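
One caveat is worth illustrating: r = 0 does not by itself establish that two features are unrelated. In the small sketch below (values chosen for illustration), y is completely determined by x, yet Pearson's r is zero because the relationship is symmetric rather than linear.

```python
import numpy as np

x = np.arange(-3, 4)  # -3, -2, ..., 3
y = x**2              # y depends entirely on x, but not linearly

r = np.corrcoef(x, y)[0, 1]
print(r)  # zero up to floating-point rounding
```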

![linear](https://saylordotorg.github.io/text_introductory-statistics/section_14/07aa5db140b70615a15e8631c2d7a2c4.jpg)

<hr>

# Rank Correlation

**Rank correlation** compares the ranks or the orderings of the data related to two variables or dataset features. If the orderings are similar, then the correlation is strong, positive, and high. However, if the orderings are close to reversed, then the correlation is strong, negative, and low. In other words, rank correlation is concerned only with the order of values, not with the particular values from the dataset.

To illustrate the difference between linear and rank correlation, consider the following figure:

![rank](https://files.realpython.com/media/py-corr-2.ac1acc7812d0.png)

The left plot has a perfect positive linear relationship between x and y, so r = 1. The central plot shows positive correlation and the right one shows negative correlation. However, neither of them is a linear function, so r is neither −1 nor 1.

When you look only at the orderings or ranks, all three relationships are perfect! The left and central plots show the observations where larger x values always correspond to larger y values. This is perfect positive rank correlation. The right plot illustrates the opposite case, which is perfect negative rank correlation.
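
The contrast between the two kinds of correlation can be sketched with two monotonic but nonlinear relationships. The functions used here (a cube and a reciprocal) are arbitrary choices for illustration, not the ones behind the figure.

```python
import numpy as np
import scipy.stats

x = np.arange(1, 11, dtype=float)
y_up = x**3      # monotonically increasing, not linear
y_down = 1 / x   # monotonically decreasing, not linear

r_up, _ = scipy.stats.pearsonr(x, y_up)
rho_up, _ = scipy.stats.spearmanr(x, y_up)
r_down, _ = scipy.stats.pearsonr(x, y_down)
rho_down, _ = scipy.stats.spearmanr(x, y_down)

print("increasing:", r_up, rho_up)
print("decreasing:", r_down, rho_down)
```

The Pearson values stay strictly inside (−1, 1), while the rank-based values reach the extremes.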

## **Spearman Correlation Coefficient**
### ``Spearman's correlation coefficient = covariance(rank(X), rank(Y)) / (stdv(rank(X)) * stdv(rank(Y)))``

The Spearman correlation coefficient between two features is the Pearson correlation coefficient between their rank values. It’s calculated the same way as the Pearson correlation coefficient but takes into account their ranks instead of their values. It’s often denoted with the Greek letter rho (ρ) and called Spearman’s rho.

Say you have two n-tuples, x and y, where (x₁, y₁), (x₂, y₂), … are the observations as pairs of corresponding values. You can calculate the Spearman correlation coefficient ρ the same way as the Pearson coefficient. You’ll use the ranks instead of the actual values from x and y.

Here are some important facts about the Spearman correlation coefficient:

-    It can take a real value in the range −1 ≤ ρ ≤ 1.

-    Its maximum value ρ = 1 corresponds to the case when there’s a monotonically increasing function between x and y. In other words, larger x values correspond to larger y values and vice versa.

-    Its minimum value ρ = −1 corresponds to the case when there’s a monotonically decreasing function between x and y. In other words, larger x values correspond to smaller y values and vice versa.
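
The definition of Spearman's rho as the Pearson coefficient of the ranks can be checked directly. This is a small sketch with made-up values that include ties; tied values receive average ranks.

```python
import scipy.stats

# Illustrative values only; both lists contain ties.
x = [3, 1, 4, 1, 5, 9, 2, 6]
y = [2, 7, 1, 8, 2, 8, 1, 8]

# Pearson's r computed on the ranks instead of the raw values.
r_on_ranks, _ = scipy.stats.pearsonr(scipy.stats.rankdata(x), scipy.stats.rankdata(y))
rho, _ = scipy.stats.spearmanr(x, y)

print(r_on_ranks, rho)  # the two values should agree
```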


## **Kendall Correlation Coefficient**

Let’s start again by considering two n-tuples, x and y. Each of the x-y pairs (x₁, y₁), (x₂, y₂), … is a single observation. A pair of observations (xᵢ, yᵢ) and (xⱼ, yⱼ), where i < j, will be one of three things:

-    concordant if either (xᵢ > xⱼ and yᵢ > yⱼ) or (xᵢ < xⱼ and yᵢ < yⱼ)
-    discordant if either (xᵢ < xⱼ and yᵢ > yⱼ) or (xᵢ > xⱼ and yᵢ < yⱼ)
-    neither if there’s a tie in x (xᵢ = xⱼ) or a tie in y (yᵢ = yⱼ)

The **Kendall correlation coefficient** compares the number of concordant and discordant pairs of data. This coefficient is based on the difference in the counts of concordant and discordant pairs relative to the number of x-y pairs. It’s often denoted with the Greek letter tau (τ) and called Kendall’s tau.

According to the scipy.stats official docs, the Kendall correlation coefficient is calculated as τ = (n⁺ − n⁻) / √((n⁺ + n⁻ + nˣ)(n⁺ + n⁻ + nʸ)), where:

-    n⁺ is the number of concordant pairs
-    n⁻ is the number of discordant pairs
-    nˣ is the number of ties only in x
-    nʸ is the number of ties only in y

If a tie occurs in both x and y, then it’s not included in either nˣ or nʸ.

The Wikipedia page on Kendall rank correlation coefficient gives the following expression: τ = (2 / (n(n − 1))) Σᵢⱼ(sign(xᵢ − xⱼ) sign(yᵢ − yⱼ)) for i < j, where i = 1, 2, …, n − 1 and j = 2, 3, …, n. The sign function sign(z) is −1 if z < 0, 0 if z = 0, and 1 if z > 0. n(n − 1) / 2 is the total number of x-y pairs.
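
For data without ties, both expressions reduce to counting concordant and discordant pairs. The sketch below does that count by hand and compares it with `scipy.stats.kendalltau`; the helper name `kendall_tau_no_ties` and the sample values are assumptions for illustration, and the helper isn't meant for data with ties.

```python
from itertools import combinations

import numpy as np
import scipy.stats

def kendall_tau_no_ties(x, y):
    """Kendall's tau by pair counting; assumes no ties in x or y."""
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = np.sign(xi - xj) * np.sign(yi - yj)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

# Illustrative values: neighbouring y values are swapped pairwise.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

tau_manual = kendall_tau_no_ties(x, y)
tau_scipy, _ = scipy.stats.kendalltau(x, y)
print(tau_manual, tau_scipy)
```

With 12 concordant and 3 discordant pairs out of 15, both approaches give (12 − 3) / 15.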

Some important facts about the Kendall correlation coefficient are as follows:

-    It can take a real value in the range −1 ≤ τ ≤ 1.

-    Its maximum value τ = 1 corresponds to the case when the ranks of the corresponding values in x and y are the same. In other words, all pairs are concordant.

-    Its minimum value τ = −1 corresponds to the case when the rankings in x are the reverse of the rankings in y. In other words, all pairs are discordant.


<hr>

# ``Tutorial``
#### 1. **Pandas**
#### 2. **Numpy**
#### 3. **Scipy**

In [1]:
import numpy as np
import pandas as pd
import scipy.stats

<hr>

## **A. Pandas**

Pandas `DataFrame.corr()` computes the pairwise correlation of the columns in a dataframe. NA/null values are excluded automatically. Non-numeric columns are skipped when `numeric_only=True`; from pandas 2.0 the default is `numeric_only=False`, so a dataframe that contains text columns needs that argument set explicitly.

**Syntax**: DataFrame.corr(method='pearson', min_periods=1, numeric_only=False)

**Parameters**:
- method :
    - pearson : standard correlation coefficient
    - kendall : Kendall Tau correlation coefficient
    - spearman : Spearman rank correlation
- min_periods : Minimum number of observations required per pair of columns to have a valid result. Currently only available for pearson and spearman correlation
- numeric_only : Include only float, int, or boolean columns
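
A small sketch of how missing values and `min_periods` interact. The dataframe `toy` is made up for illustration, and the `numeric_only` argument assumes pandas 1.5 or newer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value in "b" and one text column.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 1.0, np.nan, 5.0, 4.0],
    "label": ["u", "v", "w", "x", "y"],
})

# Rows with a missing value are dropped pair by pair, not for the whole frame.
corr_matrix = toy.corr(method="pearson", numeric_only=True)

# The same coefficient computed only on rows where both columns are present.
complete = toy.dropna(subset=["a", "b"])
r_complete = complete["a"].corr(complete["b"])

# Requiring 5 observations per pair leaves a-b undefined (only 4 are complete).
strict = toy.corr(method="pearson", min_periods=5, numeric_only=True)

print(corr_matrix)
print(r_complete)
print(strict)
```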


In [2]:
df = pd.read_csv('nba.csv')
df.head(3)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
0,Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0
1,Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0
2,John Holland,Boston Celtics,30.0,SG,27.0,6-5,205.0,Boston University,


In [3]:
df.dropna(inplace=True)

### **1. Pearson Correlation**

In [4]:
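# numeric_only=True skips text columns such as Name and Team (needed from pandas 2.0)
df.corr(method='pearson', numeric_only=True)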

Unnamed: 0,Number,Age,Weight,Salary
Number,1.0,0.02509,0.239768,-0.154655
Age,0.02509,1.0,0.058737,0.159385
Weight,0.239768,0.058737,1.0,0.144334
Salary,-0.154655,0.159385,0.144334,1.0


In [5]:
df['Age'].corr(df['Salary'], method='pearson')

0.1593849340088173

**Testing Covariance**

In [26]:
np.cov(df['Age'], df['Salary'])

array([[1.79232888e+01, 3.45463329e+06],
       [3.45463329e+06, 2.62114876e+13]])
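
The covariance matrix ties back to Pearson's r: dividing the off-diagonal covariance by the product of the two standard deviations (the square roots of the diagonal) should recover the coefficient. The sketch below reuses the numbers printed above, so it runs without the dataset.

```python
import numpy as np

# Covariance matrix copied from the np.cov output above (Age, Salary).
cov_matrix = np.array([[1.79232888e+01, 3.45463329e+06],
                       [3.45463329e+06, 2.62114876e+13]])

std_age = np.sqrt(cov_matrix[0, 0])
std_salary = np.sqrt(cov_matrix[1, 1])

# Pearson's r = covariance(X, Y) / (stdv(X) * stdv(Y))
r_from_cov = cov_matrix[0, 1] / (std_age * std_salary)
print(r_from_cov)
```

The result agrees, to the precision of the printed matrix, with the Age-Salary coefficient returned by `df['Age'].corr(df['Salary'])`.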

### **2. Spearman Correlation**

In [6]:
df[['Age', 'Salary']].corr(method='spearman')

Unnamed: 0,Age,Salary
Age,1.0,0.235912
Salary,0.235912,1.0


In [7]:
df['Age'].corr(df['Salary'], method='spearman')

0.23591150710218392

### **3. Kendall Correlation**

In [8]:
df[['Weight', 'Salary']].corr(method='kendall')

Unnamed: 0,Weight,Salary
Weight,1.0,0.089465
Salary,0.089465,1.0


In [9]:
df['Age'].corr(df['Salary'], method='kendall')

0.15020250343460145

## **B. NumPy**

**Pearson Correlation**

In [10]:
np.corrcoef(df['Age'], df['Salary'])

array([[1.        , 0.15938493],
       [0.15938493, 1.        ]])

## **C. SciPy**

### **1. Pearson Correlation**

In [32]:
r1, p1 = scipy.stats.pearsonr(df['Weight'], df['Salary'])

print('coefficient:', r1)
print('pvalue', p1)

# interpret the significance
alpha = 0.05
if p1 < alpha:
    print(f'Samples are correlated (reject H0) p={p1}')
else:
    print(f'Samples are uncorrelated (fail to reject H0) p={p1}')

coefficient: 0.14433404719214293
pvalue 0.005803074208304683
Samples are correlated (reject H0) p=0.005803074208304683


### **2. Spearman Correlation**

In [33]:
r2, p2 = scipy.stats.spearmanr(df['Weight'], df['Salary'])

print('coefficient:', r2)
print('pvalue', p2)

# interpret the significance
alpha = 0.05
if p2 < alpha:
    print(f'Samples are correlated (reject H0) p={p2}')
else:
    print(f'Samples are uncorrelated (fail to reject H0) p={p2}')

coefficient: 0.13084801624307163
pvalue 0.012469000251064597
Samples are correlated (reject H0) p=0.012469000251064597


### **3. Kendall Tau Correlation**

In [34]:
r3, p3 = scipy.stats.kendalltau(df['Weight'], df['Salary'])

print('coefficient:', r3)
print('pvalue', p3)

# interpret the significance
alpha = 0.05
if p3 < alpha:
    print(f'Samples are correlated (reject H0) p={p3}')
else:
    print(f'Samples are uncorrelated (fail to reject H0) p={p3}')

coefficient: 0.08946530838749979
pvalue 0.01241724683714226
Samples are correlated (reject H0) p=0.01241724683714226
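
The three cells above repeat the same pattern, so one way to condense them is a small helper that runs all three tests. This is only a sketch: the function name `correlation_report` and the toy data are assumptions, not part of SciPy.

```python
import numpy as np
import scipy.stats

def correlation_report(x, y, alpha=0.05):
    """Run the Pearson, Spearman, and Kendall tests on the same pair of samples."""
    tests = {
        "pearson": scipy.stats.pearsonr,
        "spearman": scipy.stats.spearmanr,
        "kendall": scipy.stats.kendalltau,
    }
    report = {}
    for name, test in tests.items():
        coefficient, pvalue = test(x, y)
        report[name] = {
            "coefficient": float(coefficient),
            "pvalue": float(pvalue),
            "reject_h0": bool(pvalue < alpha),  # True means "samples are correlated"
        }
    return report

# Toy data: an increasing trend with a small alternating disturbance.
x = np.arange(20)
y = x + np.tile([1.2, -1.2], 10)

report = correlation_report(x, y)
for name, result in report.items():
    print(name, result)
```

With the dataframe used above, the call would be `correlation_report(df['Weight'], df['Salary'])`.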


In [14]:
# get the rank of each value in the 'Weight' feature
scipy.stats.rankdata(df['Weight'])

array([ 18.5, 253.5,  28. , 253.5, 267. ,  50. , 184. , 346. ,  28. ,
       184. , 160. , 332.5,  50. ,  91. , 184. ,  91. , 171. ,  11. ,
       362.5,  91. , 184. , 263.5, 307.5, 114. , 197. , 138. , 184. ,
       279.5, 138. ,  91. ,  68.5, 337. , 319. , 253.5, 279.5, 114. ,
       102.5, 160. , 319. , 138. , 299.5, 307.5,  91. ,  91. , 218. ,
       362.5,  11. , 114. , 124. , 184. , 147. , 184. , 319. ,  50. ,
       114. , 253.5, 160. ,  68.5, 319. ,  50. , 205.5, 346. ,  11. ,
        50. , 354. , 230.5, 160. , 184. , 279.5, 184. , 337. , 160. ,
       319. , 319. ,  68.5, 205.5, 253.5, 329.5, 160. , 354. , 230.5,
        11. , 253.5,  50. ,  91. ,  68.5, 319. , 319. , 138. ,  60.5,
       359.5, 230.5, 230.5, 319. ,  68.5, 359.5, 346. , 138. ,  50. ,
       124. , 129.5,  91. , 160. ,  41.5, 346. , 218. ,  11. ,  50. ,
       299.5, 230.5, 346. , 279.5, 150. , 218. , 279.5,  11. , 359.5,
        28. , 184. , 230.5, 354. ,  68.5, 267. ,  36.5, 319. ,   1. ,
       184. , 230.5,
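
The fractional ranks in this output come from ties: by default `scipy.stats.rankdata` gives tied values the average of the ranks they span. A short sketch with made-up weights:

```python
import scipy.stats

# Made-up weights; 235 appears twice.
weights = [180, 235, 205, 235, 185]

average_ranks = scipy.stats.rankdata(weights)            # default method="average"
min_ranks = scipy.stats.rankdata(weights, method="min")  # ties share the lowest rank

print(average_ranks)  # the two 235s share rank 4.5, the mean of ranks 4 and 5
print(min_ranks)      # the two 235s both get rank 4
```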

<hr>

## **Categorical Features Correlation**

## 1. Cramér’s V
It is based on a nominal variation of Pearson’s Chi-Square Test, and comes built-in with some great benefits:

-    Similarly to correlation, the output is in the range of [0,1], where **0 means no association and 1 is full association**. (Unlike correlation, there are no negative values, as there’s no such thing as a negative association. Either there is, or there isn’t)
-    Like correlation, Cramer’s V is symmetrical — it is insensitive to swapping x and y
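
As a point of reference, the plain (uncorrected) statistic is V = √(χ² / (n · min(r − 1, k − 1))), where r and k are the numbers of rows and columns of the contingency table. The sketch below implements that plain form on made-up categories; the function name `cramers_v_uncorrected` is an assumption, and the notebook's own function further down applies a bias correction on top of this.

```python
import numpy as np
import pandas as pd
import scipy.stats

def cramers_v_uncorrected(x, y):
    """Plain Cramér's V, without any bias correction."""
    table = pd.crosstab(pd.Series(list(x)), pd.Series(list(y)))
    chi2 = scipy.stats.chi2_contingency(table, correction=False)[0]
    n = table.to_numpy().sum()
    r, k = table.shape
    return float(np.sqrt(chi2 / (n * min(r - 1, k - 1))))

# Made-up categories: shape is fully determined by color, size only partly.
color = ["red", "green", "blue"] * 4
shape = ["circle", "square", "star"] * 4
size = ["s", "s", "l", "s", "l", "l"] * 2

v_full = cramers_v_uncorrected(color, shape)
v_partial = cramers_v_uncorrected(color, size)
v_partial_swapped = cramers_v_uncorrected(size, color)

print(v_full)                        # full association
print(v_partial, v_partial_swapped)  # same value in both directions
```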

In [15]:
team = df['Team'].unique().tolist()
dic_team = {team[i]:i for i in range(len(team))}

In [16]:
df['Team_encoded'] = [dic_team[i] for i in df['Team']]
df.tail()

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary,Team_encoded
449,Rodney Hood,Utah Jazz,5.0,SG,23.0,6-8,206.0,Duke,1348440.0,29
451,Chris Johnson,Utah Jazz,23.0,SF,26.0,6-6,206.0,Dayton,981348.0,29
452,Trey Lyles,Utah Jazz,41.0,PF,20.0,6-10,234.0,Kentucky,2239800.0,29
453,Shelvin Mack,Utah Jazz,8.0,PG,26.0,6-3,203.0,Butler,2433333.0,29
456,Jeff Withey,Utah Jazz,24.0,C,26.0,7-0,231.0,Kansas,947276.0,29


In [17]:
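# encode a column by mapping each unique value to an integer (a hand-rolled label encoder);
# adds '<column> encoded' to the dataframe in place and returns the first rows for inspection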
def encode_column(yourdf, yourcolumn):
    column_unique = yourdf[yourcolumn].unique().tolist()
    dict_column = {column_unique[i]:i for i in range(len(column_unique))}
    yourdf[yourcolumn + ' encoded'] = [dict_column[i] for i in yourdf[yourcolumn]]
    return yourdf.head()
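
pandas also ships a built-in that appears to do the same job: `pd.factorize` assigns integer codes in order of first appearance, like the dictionary above. A minimal sketch on a made-up series:

```python
import pandas as pd

# Made-up values for illustration.
teams = pd.Series(["Boston Celtics", "Utah Jazz", "Boston Celtics", "Utah Jazz"])

# codes: one integer per row; uniques: the value behind each code.
codes, uniques = pd.factorize(teams)

# Dictionary-based encoding in the style of encode_column, for comparison.
mapping = {value: i for i, value in enumerate(teams.unique().tolist())}
manual_codes = [mapping[value] for value in teams]

print(list(codes), list(uniques))
print(manual_codes)
```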

In [18]:
encode_column(df, 'College')

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary,Team_encoded,College encoded
0,Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0,0,0
1,Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0,0,1
3,R.J. Hunter,Boston Celtics,28.0,SG,22.0,6-5,185.0,Georgia State,1148640.0,0,2
6,Jordan Mickey,Boston Celtics,55.0,PF,21.0,6-8,235.0,LSU,1170960.0,0,3
7,Kelly Olynyk,Boston Celtics,41.0,C,25.0,7-0,238.0,Gonzaga,2165160.0,0,4


In [19]:
encode_column(df, 'Position')

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary,Team_encoded,College encoded,Position encoded
0,Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0,0,0,0
1,Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0,0,1,1
3,R.J. Hunter,Boston Celtics,28.0,SG,22.0,6-5,185.0,Georgia State,1148640.0,0,2,2
6,Jordan Mickey,Boston Celtics,55.0,PF,21.0,6-8,235.0,LSU,1170960.0,0,3,3
7,Kelly Olynyk,Boston Celtics,41.0,C,25.0,7-0,238.0,Gonzaga,2165160.0,0,4,4


In [20]:
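import scipy.stats as ss
def cramers_v(x, y):
    # bias-corrected Cramér's V for two categorical series
    confusion_matrix = pd.crosstab(x,y)  # contingency table of observed counts
    chi2 = ss.chi2_contingency(confusion_matrix)[0]  # chi-square statistic
    n = confusion_matrix.sum().sum()  # total number of observations
    phi2 = chi2/n
    r,k = confusion_matrix.shape
    # bias correction, which shrinks the estimate for small samples or many categories
    phi2corr = max(0, phi2-((k-1)*(r-1))/(n-1))
    rcorr = r-((r-1)**2)/(n-1)
    kcorr = k-((k-1)**2)/(n-1)
    return np.sqrt(phi2corr/min((kcorr-1),(rcorr-1)))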

In [21]:
cramers_v(df['College encoded'], df['Position encoded'])

0.0

In [22]:
cramers_v(df['Team_encoded'], df['Position encoded'])

0.0

In [23]:
cramers_v(df['Team_encoded'], df['College encoded'])

0.03047629068389529

**Note**: A weakness of Cramér's V is that it is symmetric, so it cannot express a one-directional relationship in which knowing x does not determine y, but knowing y does determine x.

For example, take the relationship between 'college' (x) and 'team' (y). Knowing the 'college' does not guarantee that the 'team' can be guessed, but if the 'team' is known, the 'college' can be inferred.

This weakness can be addressed by using Theil's U.

<hr>

## 2. Theil's U
Theil’s U, also referred to as the Uncertainty Coefficient, is based on the conditional entropy between x and y — or in human language, given the value of x, how many possible states does y have, and how often do they occur. 

Just like Cramer’s V, the output value is on the range of [0,1], with the same interpretations as before — but unlike Cramer’s V, it is asymmetric, meaning U(x,y)≠U(y,x) (while V(x,y)=V(y,x), where V is Cramer’s V). 

Using Theil’s U in the simple case above will let us find out that knowing y means we know x, but not vice-versa.


In [24]:
from collections import Counter
import math
import scipy.stats as ss

def conditional_entropy(x,y):
    # entropy of x given y
    y_counter = Counter(y)
    xy_counter = Counter(list(zip(x,y)))
    total_occurrences = sum(y_counter.values())
    entropy = 0
    for xy in xy_counter.keys():
        p_xy = xy_counter[xy] / total_occurrences
        p_y = y_counter[xy[1]] / total_occurrences
        entropy += p_xy * math.log(p_y/p_xy)
    return entropy

def theil_u(x,y):
    s_xy = conditional_entropy(x,y)
    x_counter = Counter(x)
    total_occurrences = sum(x_counter.values())
    p_x = list(map(lambda n: n/total_occurrences, x_counter.values()))
    s_x = ss.entropy(p_x)
    if s_x == 0:
        return 1
    else:
        return (s_x - s_xy) / s_x

``Theil's U is also known as the Uncertainty Coefficient. Formally written as U(x|y), it takes a value in the range [0,1], where 0 means that feature y provides no information about feature x, and 1 means that feature y provides full information about feature x's value.``

In [25]:
theil_u(df['Team_encoded'], df['College encoded'])

0.5644694051952239
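
The asymmetry is easiest to see on a toy example. The sketch below is a compact, self-contained variant that uses the identity H(x|y) = H(x, y) − H(y); the function name `uncertainty_coefficient` and the data are assumptions for illustration, not the notebook's `theil_u`.

```python
from collections import Counter

import numpy as np

def entropy(values):
    """Shannon entropy (natural log) of a sequence of hashable values."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def uncertainty_coefficient(x, y):
    """U(x|y): the fraction of the uncertainty in x that is removed by knowing y."""
    h_x = entropy(x)
    if h_x == 0:
        return 1.0
    h_x_given_y = entropy(list(zip(x, y))) - entropy(y)
    return (h_x - h_x_given_y) / h_x

# Toy data: each y value maps to exactly one x value, but not the other way round.
y = ["a", "b", "c", "d", "a", "b", "c", "d"]
x = [0, 0, 1, 1, 0, 0, 1, 1]

u_x_given_y = uncertainty_coefficient(x, y)
u_y_given_x = uncertainty_coefficient(y, x)
print(u_x_given_y, u_y_given_x)
```

Knowing y removes all uncertainty about x, so U(x|y) is 1. Knowing x only halves the uncertainty about y (ln 2 out of ln 4), so U(y|x) is 0.5.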

<hr>

## **Categorical & Continuous Features Correlation**

But what about a pair of a continuous feature and a categorical feature? For this, we can use the Correlation Ratio (often marked using the Greek letter eta).

Mathematically, it is defined as the square root of the weighted variance of the mean of each category divided by the variance of all samples. In plain language, the Correlation Ratio answers the following question: **Given a continuous value, how well can you tell which category it belongs to?** Just like the two coefficients we've seen before, the output is in the range [0,1].

In [25]:
def correlation_ratio(categories, measurements):
    # use a plain array so the indexing below is positional,
    # even for a Series whose index has gaps (e.g. after dropna)
    measurements = np.asarray(measurements)
    fcat, _ = pd.factorize(categories)
    cat_num = np.max(fcat)+1
    y_avg_array = np.zeros(cat_num)
    n_array = np.zeros(cat_num)
    for i in range(0,cat_num):
        cat_measures = measurements[np.argwhere(fcat == i).flatten()]
        n_array[i] = len(cat_measures)
        y_avg_array[i] = np.average(cat_measures)
    y_total_avg = np.sum(np.multiply(y_avg_array,n_array))/np.sum(n_array)
    numerator = np.sum(np.multiply(n_array,np.power(np.subtract(y_avg_array,y_total_avg),2)))
    denominator = np.sum(np.power(np.subtract(measurements,y_total_avg),2))
    if numerator == 0:
        eta = 0.0
    else:
        eta = np.sqrt(numerator/denominator)
    return eta
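
To get a feel for the coefficient, here is a compact sketch of the same quantity written with `groupby`, applied to made-up values. The name `correlation_ratio_groupby` is an assumption for illustration; it's intended to mirror the function above rather than replace it.

```python
import numpy as np
import pandas as pd

def correlation_ratio_groupby(categories, measurements):
    """eta = sqrt(between-category sum of squares / total sum of squares)."""
    data = pd.DataFrame({
        "cat": list(categories),
        "val": np.asarray(measurements, dtype=float),
    })
    grand_mean = data["val"].mean()
    groups = data.groupby("cat")["val"]
    ss_between = (groups.count() * (groups.mean() - grand_mean) ** 2).sum()
    ss_total = ((data["val"] - grand_mean) ** 2).sum()
    return 0.0 if ss_total == 0 else float(np.sqrt(ss_between / ss_total))

cats = ["a", "a", "b", "b"]

eta_separated = correlation_ratio_groupby(cats, [1, 1, 5, 5])  # category fixes the value
eta_mixed = correlation_ratio_groupby(cats, [1, 5, 1, 5])      # category means are equal
eta_partial = correlation_ratio_groupby(cats, [1, 3, 5, 7])    # somewhere in between

print(eta_separated, eta_mixed, eta_partial)
```

The three cases should come out as 1, 0, and a value in between, matching the interpretation of eta as how well the category explains the measurement.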

## **Reference**:
- Shubham__Ranjan, "Python | Pandas dataframe.corr()", https://www.geeksforgeeks.org/python-pandas-dataframe-corr/
- Akshay Matam, "Alone in the woods: Using Theil's U for survival", https://www.kaggle.com/akshay22071995/alone-in-the-woods-using-theil-s-u-for-survival
- Shaked Zychlinski, "The Search for Categorical Correlation", https://towardsdatascience.com/the-search-for-categorical-correlation-a1cf7f1888c9
- Mirko Stojiljković, "NumPy, SciPy, and Pandas: Correlation With Python", https://realpython.com/numpy-scipy-pandas-correlation-python/
- Jason Brownlee, "How to Calculate Correlation Between Variables in Python", https://machinelearningmastery.com/how-to-use-correlation-to-understand-the-relationship-between-variables/
- Jeff Macaluso, "Testing Linear Regression Assumptions in Python", https://jeffmacaluso.github.io/post/LinearRegressionAssumptions/
- Minitab, "Linear, nonlinear, and monotonic relationships", https://support.minitab.com/en-us/minitab-express/1/help-and-how-to/modeling-statistics/regression/supporting-topics/basics/linear-nonlinear-and-monotonic-relationships/
- Jason Brownlee, "How to Calculate Nonparametric Rank Correlation in Python", https://machinelearningmastery.com/how-to-calculate-nonparametric-rank-correlation-in-python/