# Correlations and Correlograms in R
## 1. Introduction


A correlation matrix is used to investigate the dependence between several variables at the same time; a correlogram is its graphical display.

The result is a table containing the correlation coefficients (and p-values) between each variable and the others.


### 1.1 How to calculate correlation coefficients using the built-in cor() function

We first load the built-in *iris* data frame.

![Title](img/flower.png)

In [None]:
# Load data 
data("iris")
iris[1:10,]

We can calculate the matrix of Pearson's correlation coefficients with one command:

In [None]:
options(warn=-1) 
cor_matrix<-cor(iris[,1:4]) #default is Pearson's
cor_matrix
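
As a sanity check, a single entry of the matrix can be reproduced from the definition of Pearson's coefficient (the covariance divided by the product of the two standard deviations). The sketch below uses only base R; the names `x`, `y`, `r_manual`, `r_matrix` and `cor_spearman` are illustrative. It also shows the `method` argument of `cor()`, which switches to a rank-based coefficient.

```r
# Illustrative check on two iris columns (object names are ours)
data("iris")
x <- iris$Sepal.Length
y <- iris$Petal.Length

# Pearson's r: covariance scaled by both standard deviations
r_manual <- cov(x, y) / (sd(x) * sd(y))

# Same entry taken from the full correlation matrix
r_matrix <- cor(iris[, 1:4])["Sepal.Length", "Petal.Length"]
all.equal(r_manual, r_matrix)

# Rank-based alternative (Spearman), which measures monotonic association
cor_spearman <- cor(iris[, 1:4], method = "spearman")
round(cor_spearman, 2)
```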

Next we calculate p-values with `rcorr()` from the Hmisc package, after loading a few additional packages.

In [None]:
library(ggplot2)
library(lattice)
library(Formula)
library(survival)
library(Hmisc)

options(repr.plot.width=6, repr.plot.height=4.5)
cor_matrix_2 <- rcorr(as.matrix(iris[-5]))
#cor_matrix_2$r # correlation coefficients
cor_matrix_2$P # p-values
#cor_matrix_2$n # number of observations used for each pair
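
For a Pearson correlation, the usual p-value tests the null hypothesis of zero correlation with a t statistic on n - 2 degrees of freedom. The following is a minimal base-R sketch for one pair of variables, assuming complete data (no missing values). It compares the hand-computed value with `cor.test()` from the stats package; the object names are illustrative.

```r
data("iris")
x <- iris$Sepal.Length
y <- iris$Sepal.Width

n <- length(x)
r <- cor(x, y)

# t statistic for H0: the true correlation is zero
t_stat <- r * sqrt((n - 2) / (1 - r^2))

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_manual <- 2 * pt(-abs(t_stat), df = n - 2)

# Reference value from the built-in test
p_test <- cor.test(x, y, method = "pearson")$p.value
all.equal(p_manual, p_test)
```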

### 1.2 Visualize correlations
R offers several solutions. 
Here we show two examples:

1. corrplot
2. GGally

The R package corrplot provides a visual exploratory tool for correlation matrices. It supports automatic variable reordering, which helps detect hidden patterns among variables.

In [None]:
library(corrplot)  
corrplot(cor_matrix_2$r, type="upper", order="hclust", method ='number',diag = FALSE)
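
The `order = "hclust"` option places variables with similar correlation profiles next to each other. The base-R sketch below illustrates the general idea by clustering on the dissimilarity 1 - r. It is an approximation for illustration and is not guaranteed to reproduce corrplot's internal ordering exactly; the object names are ours.

```r
data("iris")
cm <- cor(iris[, 1:4])

# Treat 1 - r as a distance: strongly positively correlated variables are "close"
hc <- hclust(as.dist(1 - cm))

# Variable order along the dendrogram
ord <- hc$order
colnames(cm)[ord]

# Correlation matrix with rows and columns rearranged in that order
cm_reordered <- cm[ord, ord]
round(cm_reordered, 2)
```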

GGally extends ggplot2 with several functions that reduce the complexity of common plotting tasks. One of them is `ggpairs()`, which draws a pairwise plot matrix (a generalized scatterplot matrix).

In [None]:
library(GGally)

ggpairs(iris, aes(colour = Species, alpha = 0.4),
        lower = list(combo = wrap("facethist", binwidth = 0.3)),
        upper = list(continuous = wrap("cor", size = 3)))
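
Colouring by `Species` matters for interpretation: a correlation computed on the pooled data can differ from the correlations within each group. A short base-R sketch of the per-species coefficients for one pair of variables (object names are illustrative):

```r
data("iris")

# One correlation matrix per species
cor_by_species <- lapply(split(iris[, 1:4], iris$Species), cor)

# Sepal.Length vs Sepal.Width within each species
within_r <- sapply(cor_by_species, function(m) m["Sepal.Length", "Sepal.Width"])
round(within_r, 2)

# Same pair on the pooled data
pooled_r <- cor(iris$Sepal.Length, iris$Sepal.Width)
round(pooled_r, 2)
```

On *iris*, the pooled coefficient for this pair is negative while the three within-species coefficients are positive, which is why the grouped view is informative.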

## 2. Correlation matrices: an example with MRI data 

Let's import some real data.

![Title](img/Figure10_SI_HD253.png)

We import the variables to correlate from a CSV file.

In [None]:
Data <- read.csv('example_HD235.csv', dec = ".")
colnames(Data)[1] <- 'Iron' # rename the first column

head(Data)


In [None]:
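# Alternatively, with the pipe operator
# (%>% comes from magrittr, which must be attached first)
library(magrittr)

Data %>% head()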

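Now we display a pairwise summary of the data with `ggpairs()`: scatterplots with a linear fit in the lower panels, histograms on the diagonal, and correlation coefficients in the upper panels.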

In [None]:
lowerFn <- function(data, mapping, method = "lm", ...) {
  p <- ggplot(data = data, mapping = mapping) +
    geom_point(colour = "blue") +
    geom_smooth(method = method, color = "red",  formula = y ~ x)
  p
}
ggpairs(
  Data, lower = list(continuous = wrap(lowerFn, method = "lm")),
  diag = list(continuous = wrap("barDiag", colour = "blue", binwidth = 20)),
  upper = list(continuous = wrap("cor", size = 5))
)

### 2.1 How to customize correlograms with corrplot

In [None]:
corr_matrix_pears <- rcorr(as.matrix(Data), type = "pearson")
CP <- round(corr_matrix_pears$r, 2)  # correlation coefficients
PVP <- round(corr_matrix_pears$P, 3) # p-values

corrplot.mixed(CP,lower='number',upper='ellipse',
               lower.col = "black", number.cex = .8,
               tl.col = "blue")
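
A p-value matrix can also be used to hide coefficients that are not statistically significant. The sketch below is a self-contained illustration on the *iris* data, so the MRI file is not needed. It builds the p-value matrix with base R and passes it to `corrplot()` through the `p.mat`, `sig.level` and `insig` arguments. The helper `p_matrix()` is a hypothetical name introduced here, and the 0.05 threshold is an assumption.

```r
# Hypothetical helper: pairwise Pearson p-values for the columns of a numeric data frame
p_matrix <- function(df) {
  k <- ncol(df)
  p <- matrix(NA_real_, k, k, dimnames = list(colnames(df), colnames(df)))
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      if (i != j) p[i, j] <- cor.test(df[[i]], df[[j]])$p.value
    }
  }
  diag(p) <- 0  # convention chosen here so the diagonal is never flagged
  p
}

data("iris")
num <- iris[, 1:4]
r_mat <- cor(num)
p_mat <- p_matrix(num)

# Plot only if corrplot is installed; non-significant cells are left blank
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(r_mat, type = "upper",
                     p.mat = p_mat, sig.level = 0.05, insig = "blank")
}
```

With the MRI data, the same call could take the unrounded `corr_matrix_pears$r` and `corr_matrix_pears$P` matrices instead, after deciding how to handle the `NA` values that `rcorr()` puts on the diagonal of `P`.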

## 3. Print Dependencies
The packages and R version used in this tutorial:

In [None]:
sessionInfo()

## 4. R courses at the LUMC

https://www.albinusnet.nl/weten-en-regelen/onderzoek/research-facilities/courses/
- Using R for data analysis