Laguna1989/CodeNummy_CorrelationCoefficient

Overview

This is a Code Nummy about the correlation coefficient.

Theory

Correlation

Knowledge about correlation is a powerful tool in maths, statistics and data analysis. Suppose you have two (seemingly unrelated?) measures x and y. When x increases, does y tend to increase as well? This is the question answered by the correlation between x and y.

This leads to all sorts of interesting as well as entertaining observations [1, 2, 3]. But remember,

"Correlation does not mean causation!" wikipedia

But to be able to show off this sentence among your friends, you need to understand how correlation works internally and how to calculate it.

Assume a set of x and y value pairs, e.g. the number of nuclear power plants per year and the number of swimming pool drownings per year. Are they correlated or not?

The correlation coefficient r answers this question. It is a value in the range [-1, 1], where a value of 0 means "no linear relationship" and a value of -1 or 1 means "perfectly linearly related" (negatively or positively, respectively). It is calculated as follows:
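Assuming r here is the Pearson correlation coefficient, which matches the sums asked for in the exercise, a standard way to write it for n value pairs (x_i, y_i) is the following. The symbol n (the number of pairs) is introduced here for illustration:

```latex
r = \frac{n \sum_{i=1}^{n} x_i y_i \;-\; \sum_{i=1}^{n} x_i \, \sum_{i=1}^{n} y_i}
         {\sqrt{\left( n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \right)
                \left( n \sum_{i=1}^{n} y_i^2 - \left( \sum_{i=1}^{n} y_i \right)^2 \right)}}
```

The numerator uses the sum of the products and the two plain sums. The denominator uses the two sums of squares and the same plain sums again. If all x values or all y values are identical, the denominator is 0 and r is undefined.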

Exercise

  • implement the function calculate_sum(values) in src/correlation, which calculates the sum of the values
  • implement the function calculate_sum_of_squares(values) in src/correlation, which calculates the sum of the squared values
  • implement the function calculate_sum_of_multiplies in src/correlation, which calculates the sum of the pairwise products of the x and y values
  • implement the function correlation in src/correlation, which calculates the value r using all the previously defined functions.
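The steps above could be sketched in Python roughly as follows. This is only one possible solution, not necessarily the layout the repository expects: the parameter names, the two-argument signatures of calculate_sum_of_multiplies and correlation, and the error handling are assumptions, and the code assumes that r is the Pearson correlation coefficient.

```python
import math


def calculate_sum(values):
    """Return the sum of all values."""
    return sum(values)


def calculate_sum_of_squares(values):
    """Return the sum of the squared values."""
    return sum(v * v for v in values)


def calculate_sum_of_multiplies(x_values, y_values):
    """Return the sum of the pairwise products x_i * y_i.

    The two-argument signature is an assumption; the exercise only
    gives the function name.
    """
    if len(x_values) != len(y_values):
        raise ValueError("x_values and y_values must have the same length")
    return sum(x * y for x, y in zip(x_values, y_values))


def correlation(x_values, y_values):
    """Return the Pearson correlation coefficient r of the value pairs."""
    n = len(x_values)
    sum_x = calculate_sum(x_values)
    sum_y = calculate_sum(y_values)
    sum_xy = calculate_sum_of_multiplies(x_values, y_values)

    # Numerator: n * sum(x*y) - sum(x) * sum(y)
    numerator = n * sum_xy - sum_x * sum_y

    # Denominator: square root of the product of the two spread terms
    spread_x = n * calculate_sum_of_squares(x_values) - sum_x ** 2
    spread_y = n * calculate_sum_of_squares(y_values) - sum_y ** 2
    denominator = math.sqrt(spread_x * spread_y)

    if denominator == 0:
        # Happens when all x values or all y values are identical.
        raise ValueError("r is undefined when x or y is constant")
    return numerator / denominator
```

For example, correlation([1, 2, 3, 4], [2, 4, 6, 8]) returns 1.0, because y is an exact positive multiple of x, while reversing the y values gives -1.0.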

Hints for C++

std::accumulate and std::inner_product can prove helpful.

Hints for Python

np.multiply can prove helpful.

Application

Think of any measurement of two (possibly?) related values that you can easily perform on your own. Some ideas:

  • grab some books from your bookshelf and measure width and height of a book
  • the width and length of individual spaghetti from a pack
  • pick two measures from csgostats or league of graphs
  • stock market example: S&P500 and bitcoin value in USD
  • any of the examples from [1, 2, 3]

Further Reading and references
