# COMPOSITE PLOTS

Let's create composite plots from some data.

First, let's read the data: a set of quiz results.

Read the data from the file "grades.csv".

Values are separated by "," (comma) and the file includes a header row.

In [None]:
datax <- read.table(
    "~/file/grades.csv",
    sep=",",
    header=TRUE)

In [None]:
head(datax)
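As a side note, "read.csv" is a wrapper around "read.table" whose defaults already are sep = "," and header = TRUE. The sketch below is self-contained: it writes a tiny made-up file (the column names and scores are invented for illustration, not taken from grades.csv) to a temporary location and reads it back both ways.

```r
# Hypothetical miniature grades file; names and values are made up
tmp <- tempfile(fileext = ".csv")
writeLines(c("ID,Q1,Q2,M1",
             "1,80,75,65",
             "2,90,85,72"), tmp)

# Explicit form, as used above
d1 <- read.table(tmp, sep = ",", header = TRUE)

# Shorter form: read.csv defaults to sep = "," and header = TRUE
d2 <- read.csv(tmp)

str(d2)
```

For a plain comma-separated file like this one, both calls should give the same data frame.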

You are supposed to perform the following tasks:
- Divide the screen into a grid of 14 cells: 2 rows and 7 columns
- Top row: draw the quiz scores of people (up to quiz 7, one plot for each quiz)
  - Color of people with M1 score larger than 70 should be red
  - Color of people with M1 score smaller than 70 should be blue
- Bottom row, columns 1-6: draw histograms of QAll
  - With increasing detail (more bins)
  - Give a title to each histogram
- Bottom row, last column: draw a pie chart that shows the ratio of the number of quizzes, midterms and projects


To format the plot "canvas", we will use the "par" function:

par {graphics}

Set or Query Graphical Parameters

par can be used to set or query graphical parameters.

Parameters can be set by specifying them as arguments to par in tag = value form, or by passing them as a list of tagged values.

Before starting the tasks, we are given the following line, which sets the plot margins through the "mar" argument of the "par" function:

mar

A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot.

The default is c(5, 4, 4, 2) + 0.1.

In [None]:
par(mar=c(1.5,1.5,2.5,2.5))
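A small sketch of a related habit: "par" invisibly returns the previous values of whatever it changes, so the old margins can be stored and put back later. The name old_par is only an illustrative choice.

```r
# par() invisibly returns the previous values of the parameters it changes
old_par <- par(mar = c(1.5, 1.5, 2.5, 2.5))

plot(1:10, main = "Narrow margins")

# Restore the margins that were active before
par(old_par)
par("mar")
```

This avoids having to remember the default values when resetting.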

Now, let's divide the screen into a grid of 14 cells that is filled in row-major order (left to right, then top to bottom) using the "mfrow" argument of "par":

mfcol, mfrow

A vector of the form c(nr, nc).

Subsequent figures will be drawn in an nr-by-nc array on the device by columns (mfcol), or rows (mfrow), respectively.

In [None]:
par(mfrow=c(2,7))
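To see the difference between the two fill orders, here is a minimal sketch on a smaller 2-by-3 grid. The empty numbered panels are only placeholders; the margins are an arbitrary choice to keep the panels compact.

```r
# mfrow: panels are filled left to right, then top to bottom
old_par <- par(mfrow = c(2, 3), mar = c(2, 2, 2, 1))
for (i in 1:6) {
  plot(1, type = "n", axes = FALSE, main = paste("plot", i))
  box()
}

# mfcol: same grid, but filled top to bottom, then left to right
par(mfcol = c(2, 3))
for (i in 1:6) {
  plot(1, type = "n", axes = FALSE, main = paste("plot", i))
  box()
}

# Back to the previous layout and margins
par(old_par)
```

In the first figure "plot 2" appears to the right of "plot 1"; in the second it appears below it.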

Top row: draw quiz scores of people (Up to quiz 7, one plot for each quiz)

Color of people with M1 score larger than 70 should be red

Color of people with M1 score smaller than 70 should be blue

Check the colnames:

In [None]:
names(datax)

Note that the quizzes start from the second column: the first quiz is in the second column, the second quiz in the third, and so on.

For example, to plot the first quiz (without coloring):

In [None]:
plot(datax[["Q1"]])

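But we need a coloring scheme based on the M1 score:

M1 score larger than 70 should be red

M1 score of 70 or smaller should be blue (this is what the test "M1>70" produces, so a score of exactly 70 ends up blue):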

In [None]:
ifelse(datax$M1>70,"red","blue")
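"ifelse" is vectorized, so it returns one color name per element of its test. A minimal sketch with made-up M1 scores (not taken from the data file) shows what happens at the boundary value and with a missing score:

```r
# Hypothetical M1 scores, including the boundary value and a missing score
m1 <- c(55, 70, 71, 90, NA)

cols <- ifelse(m1 > 70, "red", "blue")
cols
# "blue" "blue" "red" "red" NA
```

A missing score gives a missing color, and base graphics does not draw a point whose color is NA.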

Now let's pass the coloring scheme via the "col" argument of the "plot" function:

In [None]:
plot(datax$Q1,col=ifelse(datax$M1>70,"red","blue"))

But we should plot the first 7 quizzes. Let's put all of them into a loop:

In [None]:
for (i in c(1:7)){
  varx <- paste("Q", i, sep = "")
  plot(datax[[varx]],col=ifelse(datax$M1>70,"red","blue"), ylab = varx) 
}
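A possible variation on the loop, sketched on made-up data (the data frame toy and its scores are invented stand-ins for datax). It builds all column names at once with "paste0" and labels each panel with "main": with a left margin of only 1.5 lines, the y-axis label, which by default sits 3 lines out, may not have room to show.

```r
set.seed(1)  # reproducible made-up scores
toy <- data.frame(M1 = round(runif(20, 40, 100)))
for (q in paste0("Q", 1:7)) toy[[q]] <- round(runif(20, 0, 100))

quiz_cols <- paste0("Q", 1:7)   # "Q1" ... "Q7", built in one vectorized call
point_col <- ifelse(toy$M1 > 70, "red", "blue")

old_par <- par(mfrow = c(1, 7), mar = c(1.5, 1.5, 2.5, 0.5))
for (q in quiz_cols) {
  # main puts the quiz name above the panel, where the margin has room for it
  plot(toy[[q]], col = point_col, main = q)
}
par(old_par)
```

Computing the color vector once outside the loop also avoids repeating the same "ifelse" call seven times.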

Let's combine them:

In [None]:
par(mar=c(1.5,1.5,2.5,2.5))

par(mfrow=c(2,7))

for (i in c(1:7)){
  varx <- paste("Q", i, sep = "")
  plot(datax[[varx]],col=ifelse(datax$M1>70,"red","blue"), ylab = varx) 
}

Now, the second row:

Bottom row, columns 1-6: draw histograms of QAll

With increasing detail (more bins)

Give a title to each histogram


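Let's start with 4 bins and go up to 14 in steps of 2.

Note that a single number passed to "breaks" is only a suggestion: "hist" rounds it to "pretty" break points, so the number of bars drawn can differ from the number in the title.

You can set the title of the histogram with the "main" argument.

Let's make the main title read "xxx bins".

First the 4-bin version: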



In [None]:
hist(datax$QAll,breaks = 4, main = paste(4,"bins"))

And let's put it into a loop for number of bins:

In [None]:
for (i in c(2:7)){
  hist(datax$QAll,breaks=i*2,main=paste(i*2,"bins"))
}
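If the exact number of bins matters, one option is to pass a vector of break points instead of a single number. The sketch below uses made-up scores as a stand-in for datax$QAll; the names scores, n_bins and edges are illustrative only.

```r
set.seed(1)
scores <- round(runif(50, 0, 100))   # made-up stand-in for datax$QAll

n_bins <- 6

# A single number is only a suggestion; hist() may pick a different count
h_suggested <- hist(scores, breaks = n_bins, main = "suggested: 6 bins")

# A vector of break points is used as given: n_bins + 1 edges -> n_bins bins
edges <- seq(min(scores), max(scores), length.out = n_bins + 1)
h_exact <- hist(scores, breaks = edges, main = paste(n_bins, "bins (exact)"))

length(h_suggested$counts)
length(h_exact$counts)
```

The trade-off is that the exact break points are usually not round numbers, which is why "hist" prefers "pretty" values by default.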

Let's combine them:

In [None]:
par(mar=c(1.5,1.5,2.5,2.5))

par(mfrow=c(2,7))

for (i in c(1:7)){
  varx <- paste("Q", i, sep = "")
  plot(datax[[varx]],col=ifelse(datax$M1>70,"red","blue"), ylab = varx) 
}

for (i in c(2:7)){
  hist(datax$QAll,breaks=i*2,main=paste(i*2,"bins"))
}

Now the last part:

Bottom row, last column: draw a pie chart that shows the ratio of the number of quizzes, midterms and projects


Check the names again:

In [None]:
names(datax)

Mx denotes midterms, Qx denotes quizzes and Px denotes projects

So there are 10 quizzes, 4 projects and one midterm:

In [None]:
newD <- c(quizzes = 10, projects = 4, midterms = 1)
newD

And draw the pie chart:

In [None]:
pie(newD)
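Instead of counting by eye, the counts could also be derived from the column names. This is only a sketch: the vector col_names below is a hypothetical set of names that follows the Qx / Mx / Px pattern described above, and with the real data it would be replaced by names(datax).

```r
# Hypothetical column names following the Qx / Mx / Px pattern described above
col_names <- c("ID", paste0("Q", 1:10), "QAll", "M1", paste0("P", 1:4))

# "^Q[0-9]+$" matches Q1..Q10 but not QAll
counts <- c(quizzes  = sum(grepl("^Q[0-9]+$", col_names)),
            projects = sum(grepl("^P[0-9]+$", col_names)),
            midterms = sum(grepl("^M[0-9]+$", col_names)))
counts

pie(counts)
```

Anchoring the pattern with "^" and "$" keeps summary columns such as QAll out of the quiz count.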

Let's put it all together (and don't forget to reset the "mfrow" argument of "par" to c(1,1)):

In [None]:
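# Shrink the margins so that 14 panels fit on one device
par(mar=c(1.5,1.5,2.5,2.5))

# 2 rows x 7 columns, filled row by row
par(mfrow=c(2,7))

# Top row: one scatter plot per quiz, colored by M1 score
for (i in c(1:7)){
  varx <- paste("Q", i, sep = "")
  plot(datax[[varx]],col=ifelse(datax$M1>70,"red","blue"), ylab = varx)
}

# Bottom row, columns 1-6: histograms of QAll with 4 to 14 suggested bins
for (i in c(2:7)){
  hist(datax$QAll,breaks=i*2,main=paste(i*2,"bins"))
}

# Bottom row, last column: pie chart of the assessment counts
newD <- c(quizzes = 10, projects = 4, midterms = 1)

pie(newD)

# Back to a single plot per device
par(mfrow=c(1,1))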

# BOX PLOTS

First create a data frame of standard normally distributed random numbers:
- First column including both negative and positive numbers
- Second column including only non-negative numbers (absolute values of normal draws)

In [None]:
rnumbers <- data.frame(rnorm1000 = rnorm(1000),
                      pos_rnorm1000 = abs(rnorm(1000)))
head(rnumbers)

Get structure and summary info

In [None]:
str(rnumbers)

In [None]:
summary(rnumbers)

In [None]:
boxplot(rnumbers$rnorm1000)

Note that the lower and upper edges of the box correspond to the 1st and 3rd quartiles (more precisely, to the "hinges", which are very close to them).

The bold line in the middle is the median.
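The numbers behind a box plot can be inspected with "boxplot.stats". The sketch below uses its own made-up sample (independent of rnumbers) and compares the hinges with the quartiles from "quantile".

```r
set.seed(42)          # made-up sample, independent of rnumbers
x <- rnorm(1000)

b <- boxplot.stats(x)
b$stats               # lower whisker, lower hinge, median, upper hinge, upper whisker

fivenum(x)[2:4]       # lower hinge, median, upper hinge
quantile(x, c(0.25, 0.5, 0.75))

length(b$out)         # points beyond the whiskers, drawn individually
```

By default the whiskers reach the most extreme data points that lie within 1.5 times the box length from the box; anything further out is listed in b$out and drawn as a separate point.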

In [None]:
boxplot(rnumbers$pos_rnorm1000)

Let's combine them into a single plot:

In [None]:
par(mfrow = c(1,2))
boxplot(rnumbers$rnorm1000)
boxplot(rnumbers$pos_rnorm1000)
par(mfrow = c(1,1))

Note that the y axes have different scales. We should force both plots onto the same y scale.

First get the min and max values of both columns:

In [None]:
rangex <- sapply(rnumbers, function(x) c(min(x), max(x)))
rangex

A range from -4 to +4 will usually suffice; since the numbers are random, check it against the values printed above.

We pass the limits of the y axis with the "ylim" argument:

In [None]:
par(mfrow = c(1,2))
boxplot(rnumbers$rnorm1000, ylim = c(-4, 4))
boxplot(rnumbers$pos_rnorm1000, ylim = c(-4, 4))
par(mfrow = c(1,1))
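Instead of hard-coding the limits, they can also be derived from the data, so that no point falls outside the plot whatever the random draw. A minimal sketch, where rn is a freshly generated stand-in for rnumbers and y_lim is an illustrative name:

```r
set.seed(42)
rn <- data.frame(rnorm1000 = rnorm(1000),
                 pos_rnorm1000 = abs(rnorm(1000)))   # stand-in for rnumbers

sapply(rn, range)             # per-column min and max
y_lim <- range(unlist(rn))    # overall min and max across both columns
y_lim

old_par <- par(mfrow = c(1, 2))
boxplot(rn$rnorm1000, ylim = y_lim)
boxplot(rn$pos_rnorm1000, ylim = y_lim)
par(old_par)
```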

A second method is to plot them with a single command so that axes are aligned:

In [None]:
boxplot(rnumbers$rnorm1000, rnumbers$pos_rnorm1000)
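Because a data frame is a list of columns, it can also be passed to "boxplot" directly, which draws one box per column and labels each with its column name. A short sketch, again on a regenerated stand-in for rnumbers (the fill colors are an arbitrary choice):

```r
set.seed(42)
rn <- data.frame(rnorm1000 = rnorm(1000),
                 pos_rnorm1000 = abs(rnorm(1000)))   # stand-in for rnumbers

# One box per column, labeled with the column names
b <- boxplot(rn, col = c("red", "blue"))
b$names
```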

# STRIPCHARTS

"stripchart produces one dimensional scatter plots (or dot plots) of the given data.

These plots are a good alternative to boxplots when sample sizes are small."

In [None]:
stripchart(list(rnumbers$rnorm1000,
               rnumbers$pos_rnorm1000))

This uses the default "overplot" method, where coincident points are drawn on top of each other.

Let's redo it with the "jitter" method:

In [None]:
stripchart(list(rnumbers$rnorm1000,
               rnumbers$pos_rnorm1000),
          method = "jitter")

Now let's give each group its own color:

In [None]:
stripchart(list(rnumbers$rnorm1000,
               rnumbers$pos_rnorm1000),
          method = "jitter", col = c("red", "blue"))
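Since strip charts are meant for small samples, here is a sketch with only 30 made-up points per group. The list names, the plotting symbol and the vertical orientation are illustrative choices, not part of the exercise.

```r
set.seed(42)
small <- list(normal   = rnorm(30),
              positive = abs(rnorm(30)))   # made-up small samples

stripchart(small,
           method = "jitter",      # spread coincident points sideways
           vertical = TRUE,        # same orientation as the boxplots above
           pch = 16,               # filled circles
           col = c("red", "blue"))
```

With a named list, the names are used as the group labels on the axis.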