# Week 07: Data Visualization


## Introduction 

In this tutorial, we will learn to generate visualizations in R using the `ggplot2` package.

When visualizing data with the `ggplot2` package:

* We start by calling the `ggplot` function.
* Next, we define the data that we want to visualize.
* Then, we provide aesthetics (`aes`), which determine the axes (and optionally colors and shapes).
* Next, we add layers using geoms (the `geom_` functions, such as `geom_point`).
* Optionally, we break the plot up into facets.
* Optionally, we manually set colors and define axis labels.
* Finally, and also optionally, we modify the theme.
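
The steps above can be sketched in one chained call. This is only an illustration: it uses `mpg`, an example data set that ships with `ggplot2`, in place of the tutorial data, and the choice of columns, colors, and labels is our own assumption.

```r
library(ggplot2)

# `mpg` ships with ggplot2 and stands in for your own data here
p <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +  # data and aesthetics
  geom_point() +                                          # layer: one point per car
  facet_wrap(~ year) +                                    # facets: one panel per model year
  scale_color_manual(                                     # manually chosen colors
    values = c("4" = "orange", "f" = "gray40", "r" = "lightblue")
  ) +
  labs(x = "Engine displacement (litres)",                # axis and legend labels
       y = "Highway miles per gallon",
       color = "Drive") +
  theme_bw()                                              # theme

p  # printing the object draws the plot
```

Each `+` adds one component, so you can build a plot up step by step and rerun it after every addition to see what changed.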

The most basic form of a `ggplot` looks as follows: 

> ggplot(data, aes(x, y)) + geom_point()

**Preparation and session set up**

Before turning to the code below, please install the required packages by running the next code cell. The install commands are commented out, so remove the leading `#` from each line you want to run. If you have already installed the packages, you can skip this step. Installation may take between 1 and 5 minutes, so do not worry if it takes some time.


In [None]:
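# install packages (remove the leading "#" to run a line)
#install.packages("dplyr")
#install.packages("ggplot2")
#install.packages("readxl")  # used below to read the Excel file
#install.packages("here")    # used below to build the file path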


Now that we have installed the packages, we activate them as shown below.



In [None]:
# activate packages
library(dplyr)
library(ggplot2)


##  Tutorial Activity 

Form groups; each group will visualize one data set.

You need to visualize the following:

**Task 1**: Create a boxplot with secat on the y-axis and class on the x-axis.

**Task 2**: Create a boxplot with secat on the y-axis and class on the x-axis AND use different colors for each class: orange for SLAT7806, gray for SLAT7829, and lightblue for SLAT7855.

**Task 3**: Prettify the graph you have just created.

**Task 4**: Create a boxplot with secat on the y-axis and class on the x-axis AND break up the plot into different facets by grade.

**Task 5**: Calculate the mean secats for each class and grade and visualize them as a bar plot.

**Task 6**: Calculate the mean secats for each class and grade and visualize them as a bar plot AND add labels for each bar AND break up the plot into facets by class.

**Task 7**: Prettify the graph you just created.


## Load data 


Load the data set for your group from the Excel file and inspect the first rows.


In [None]:
# group 1
dat <- readxl::read_excel(here::here("data", "week6g1.xlsx")) 
# inspect
head(dat)
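
The tasks below rely on three columns: `class`, `secat`, and `grade`. A quick check like the following sketch can catch a misnamed or wrongly typed column early. Note that `dat_demo` and its values are made up so that the sketch runs without the Excel file; with the real data, you would use `dat` instead.

```r
# Hypothetical stand-in for `dat` (made-up values, for illustration only)
dat_demo <- data.frame(
  class = c("SLAT7806", "SLAT7829", "SLAT7855"),
  secat = c(5.5, 6.0, 4.5),
  grade = c("low", "mid", "high")
)

# columns that the tasks below rely on
needed <- c("class", "secat", "grade")

# stop with a readable message if a column is missing
missing_cols <- setdiff(needed, names(dat_demo))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# secat should be numeric; class and grade should be categorical
str(dat_demo)

# number of observations per class and grade
table(dat_demo$class, dat_demo$grade)
```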


## Task 1

Create a boxplot with secat on the y-axis and class on the x-axis.


In [None]:
ggplot(dat, aes(x = class, y = secat)) +
  geom_boxplot()


## Task 2

Create a boxplot with secat on the y-axis and class on the x-axis AND use different colors for each class: orange for SLAT7806, gray for SLAT7829, and lightblue for SLAT7855.


In [None]:
ggplot(dat, aes(x = class, y = secat, fill = class)) +
  geom_boxplot() +
  scale_fill_manual(values = c("orange", "gray", "lightblue"))
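
An unnamed color vector is matched to the classes by the order of their levels, which happens to be alphabetical here. If you prefer to tie each color to a class explicitly, `scale_fill_manual` also accepts a named vector. The sketch below assumes the three course codes from the task as class labels; `dat_demo` and its values are made up so that the example runs on its own.

```r
library(ggplot2)

# Hypothetical stand-in for `dat` (made-up values, for illustration only)
dat_demo <- data.frame(
  class = rep(c("SLAT7806", "SLAT7829", "SLAT7855"), each = 4),
  secat = c(5, 6, 6, 7, 4, 5, 5, 6, 3, 4, 5, 6)
)

# naming each color ties it to a class label rather than to the level order
class_colors <- c(SLAT7806 = "orange", SLAT7829 = "gray", SLAT7855 = "lightblue")

p <- ggplot(dat_demo, aes(x = class, y = secat, fill = class)) +
  geom_boxplot() +
  scale_fill_manual(values = class_colors)

p
```

With named values, the colors stay attached to the right course even if the order of the classes changes.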


## Task 3

Prettify the graph you have just created.


In [None]:
ggplot(dat, aes(x = class, y = secat, fill = class)) +
  geom_boxplot() +
  scale_fill_manual(values = c("orange", "gray", "lightblue")) +
  theme_bw() +
  labs(x = "Course", y = "SeCATs (out of 7)", title = "Student evaluations (SeCATs) of 3 selected Applied Linguistics courses") +
  theme(legend.position="none")


## Task 4

Create a boxplot with secat on the y-axis and class on the x-axis AND break up the plot into different facets by grade.


In [None]:
ggplot(dat, aes(x = class, y = secat, fill = class)) +
  geom_boxplot() +
  facet_wrap(~grade) +
  theme(legend.position="none")


## Task 5

Calculate the mean secats for each class and grade and visualize them as a bar plot.


In [None]:
dat %>%
  dplyr::mutate(grade = factor(grade, levels = c("low", "mid", "high"))) %>%
  dplyr::group_by(class, grade) %>%
  dplyr::summarise(Mean = mean(secat)) %>%
  ggplot(aes(x = class, y = Mean, fill = grade))+
  geom_bar(stat = "identity", position = position_dodge())
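
It can help to look at the summarised table before plotting it. The sketch below stores the means in an object first and then plots them with `geom_col()`, which is a shorthand for `geom_bar(stat = "identity")`. The `.groups = "drop"` argument (available in dplyr 1.0.0 and later) returns an ungrouped table and avoids the grouping message. `dat_demo` and its values are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `dat` (made-up values, for illustration only)
dat_demo <- data.frame(
  class = rep(c("SLAT7806", "SLAT7829"), each = 4),
  grade = rep(c("low", "low", "high", "high"), times = 2),
  secat = c(4, 6, 5, 7, 3, 5, 6, 6)
)

# one row per combination of class and grade
means <- dat_demo %>%
  group_by(class, grade) %>%
  summarise(Mean = mean(secat), .groups = "drop")

# inspect the table before plotting
means

# geom_col() draws bars whose heights are the values in the data
p <- ggplot(means, aes(x = class, y = Mean, fill = grade)) +
  geom_col(position = position_dodge())

p
```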


## Task 6

Calculate the mean secats for each class and grade and visualize them as a bar plot AND add labels for each bar AND break up the plot into facets by class.


In [None]:
dat %>%
  dplyr::mutate(grade = factor(grade, levels = c("low", "mid", "high"))) %>%
  dplyr::group_by(class, grade) %>%
  dplyr::summarise(Mean = mean(secat)) %>%
  ggplot(aes(x = grade, y = Mean, fill = grade)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  # place the label one unit below the top of each bar
  geom_text(aes(y = Mean-1, label = round(Mean, 1)), color = "white", size=5) +
  # one panel per class, as the task asks
  facet_wrap(~ class)


## Task 7

Prettify the graph you just created.


In [None]:
dat %>%
  dplyr::mutate(grade = factor(grade, levels = c("low", "mid", "high"))) %>%
  dplyr::group_by(class, grade) %>%
  dplyr::summarise(Mean = mean(secat)) %>%
  ggplot(aes(x = grade, y = Mean, fill = grade))+
  geom_bar(stat = "identity", position = position_dodge()) +
  geom_text(aes(y = Mean-1, label = round(Mean, 1)), color = "gray20", size=4) + 
  facet_wrap(~ class) +
  labs(x = "Course Grade", y = "Mean SeCAT (out of 7)", title = "Mean SeCATs (student evaluations) \nof selected Applied Linguistics courses by course grade (low, mid, high)") +
  scale_fill_manual(values = c("gray80", "gray60", "gray40")) +
  coord_cartesian(ylim = c(1, 7)) +
  scale_y_continuous(labels = seq(1, 7, 1), breaks = seq(1, 7, 1)) +
  theme_bw() +
  theme(legend.position="none") 



