Pearson's correlation coefficient

Maurice HT Ling edited this page Aug 13, 2021 · 4 revisions

Purpose: To measure the strength and direction of the linear relationship between two variables.

Null hypothesis: Population correlation coefficient = 0.

Alternate hypothesis: Population correlation coefficient ≠ 0.

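Note: Pearson's correlation is a parametric measure. It assumes that both variables are normally distributed, that the relationship between them is linear, and that the data are homoscedastic (the spread of one variable is roughly constant across the values of the other).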
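One possible way to screen the normality assumption before running the test is a Shapiro-Wilk test on each variable. The sketch below is only an illustration: the helper name looks_normal and the 0.05 threshold are assumptions, not part of the original page, and with very small samples such a check has little power to detect non-normality.

```python
from scipy import stats

X = [1, 2, 3, 4, 5]
Y = [5, 6, 7, 8, 7]

def looks_normal(sample, alpha=0.05):
    """Hypothetical helper: True if Shapiro-Wilk does not reject normality."""
    statistic, p_value = stats.shapiro(sample)
    # A p-value above alpha means normality is not rejected; it is not proof.
    return bool(p_value > alpha)

for name, sample in (("X", X), ("Y", Y)):
    print("%s: normality not rejected = %s" % (name, looks_normal(sample)))
```

If normality is rejected for either variable, a rank-based measure such as Spearman's correlation may be a more suitable choice.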

Code:

>>> from scipy import stats
>>> X = [1, 2, 3, 4, 5]
>>> Y = [5, 6, 7, 8, 7]
>>> result = stats.pearsonr(X, Y)
>>> print("Pearson's correlation coefficient = %.3f" % result[0])
Pearson's correlation coefficient = 0.832
>>> print("p-value = %.3f" % result[1])
p-value = 0.081
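
To make clear what the result above represents, the coefficient and its two-sided p-value can also be computed from the textbook definition. This is a minimal sketch, not scipy's internal implementation: the function name pearson_by_hand is hypothetical, and the p-value is taken from a t-distribution with n - 2 degrees of freedom, which is the standard test for a zero correlation under the normality assumption.

```python
import math
from scipy import stats

X = [1, 2, 3, 4, 5]
Y = [5, 6, 7, 8, 7]

def pearson_by_hand(x, y):
    """Hypothetical helper: return (r, t, p) computed from the definition."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # Test statistic for H0: correlation = 0, with n - 2 degrees of freedom.
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    # Two-sided p-value from the t-distribution survival function.
    p = 2 * stats.t.sf(abs(t), df=n - 2)
    return r, t, p

r, t, p = pearson_by_hand(X, Y)
print("r = %.3f, t = %.3f, p-value = %.3f" % (r, t, p))
```

The values of r and p should agree with those returned by stats.pearsonr for the same data. The formula for t is undefined when r is exactly 1 or -1, so this sketch does not handle perfectly correlated inputs.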

Reference

  1. Student. 1908. Probable error of a correlation coefficient. Biometrika 6(2-3), 302-310.