# Chapter 17 – The Statistics Suite R
Data analysis commonly involves visualizing data. This includes plotting both the data themselves and properties of the dataset, such as the frequencies of certain values. R is a well-established platform among scientists in general and computational biologists in particular. Besides being a programming environment for statistical computing, R is also a data visualization tool. Several topic-specific packages are available to analyze and visualize experimental data, e.g. for evolutionary biology, the evaluation of biochemical assays, nucleotide and amino acid sequence analysis, and microarray data interpretation. This chapter provides a basic introduction, exemplifies statistical data analysis, demonstrates the installation of additional packages, and shows how to retrieve data from a MariaDB or MySQL database.

## Installation of R
Use: `

### Check Ubuntu Release and Install RStudio
Check your Ubuntu release with the first command below, then download the matching RStudio build from https://posit.co/download/rstudio-desktop/

In [None]:
lsb_release -a

In [None]:
wget 'https://download1.rstudio.org/electron/jammy/amd64/rstudio-2023.12.1-402-amd64.deb'

Install with `sudo apt install -f ./rstudio-2023.12.1-402-amd64.deb`

Start RStudio with `rstudio &`

In [None]:
R --save -q -e 'a<-1:10; b<-c(23,56,67:72,98,65); length(a); length(b)'

Because R runs non-interactively here, no plot window opens. Instead, the following command writes the plot to the file *Rplots.pdf* in the current directory.

In [None]:
R --save -q -e 'plot(a,b); c<-seq(from=10,to=100,by=10); points(a,c,col="red",pch=2); lines(a,c,col="blue")'

In [None]:
R --save -q -e 'b[3]; b<=70; which(b<=70)'

In [None]:
R --save -q -e '
a <- seq(from=1,to=100,by=5); 
b <- 0; for (i in 1:length(a)) {b[i] <- i*i}; 
r <- 0; for (i in 1:length(a)) {r[i] <- runif(1, -50, 50)};  
min(a); max(a); min(b+r); max(b+r)'

In [None]:
R --save -q -e '
plot(13, 42, xlim=c(0,100), type="n", ylim=c(-30,430), xlab="Variable a", ylab="", main="Root and Random")'

In [None]:
R --save -q -e '
plot(13, 42, xlim=c(0,100), type="n", ylim=c(-30,430), xlab="Variable a", ylab="", main="Root and Random");
points(a, b, col="blue", pch=1);
points(a, b+r, col="red", pch=2, type="b");
points(a, r, col="black", pch=3, type="l");
legend("topleft", legend=c("b", "b + r", "r"), pch=c(1,2,NA), lty=c(NA,1,1), col=c("blue", "red", "black"))'