![Introduction to Data Visualization with R - AcqVA Aurora workshop](https://slcladal.github.io/images/acqvalab.png)

# Introduction to Data Visualization with R - AcqVA Aurora workshop

This document only contains the code we will use in this workshop and very minimal descriptions.

[Here](https://github.com/MartinSchweinberger/AcqVA_DataVisR_WS) is the link to a GitHub repository with all materials.

## What to do before the workshop


In [None]:
# update R
#install.packages("installr")
#library(installr)
#updateR()
# take time
t0 <- Sys.time()
# install required packages
install.packages(c("dplyr", "ggplot2", "here", "tidyr",  "readxl", "stringr", "cowplot"), dependencies = F)
install.packages("xlsx")
t1 <- Sys.time()
t1-t0
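
Re-installing packages that are already present takes time. Below is a minimal sketch, in base R only, of how one might first check which of the workshop packages are still missing. The helper name `missing_pkgs` is made up for this illustration and is not part of any package.

```r
# hypothetical helper: return those entries of `pkgs` that are not installed
missing_pkgs <- function(pkgs) {
  # system.file() returns "" for packages that cannot be found
  installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
  pkgs[!installed]
}
# packages used in this workshop
needed <- c("dplyr", "ggplot2", "here", "tidyr", "readxl", "stringr", "cowplot", "xlsx")
# show which ones are still missing (an empty vector means nothing is missing)
to_install <- missing_pkgs(needed)
to_install
# to install only the missing ones, you could then run:
# if (length(to_install) > 0) install.packages(to_install)
```

The install call is left commented out so that running the chunk never triggers a download by accident.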


# Getting started

For everything to work, please do the following:

1. Please copy this Jupyter notebook so that you are able to edit it. For this, simply go to: File > Save a copy in Drive.

2. Install the packages you need for this workshop (this will take a couple of minutes).

Once you have done that, you are good to go.

## Primer


To execute the code, simply click on the *Play* button in the top right corner of the code box.


In [None]:
# generate two variables (x and y)
x <- sample(seq(0, 1, 0.01), 100) # draw a sample of 100 from numbers between 0 and 1
y <- rep(c("Group A", "Group B"), each = 50) # create a vector of 100 representing 2 groups
# create a box plot
boxplot(x ~ y)           


Yay! You created a nice boxplot in R. 



In [None]:
# create a nicer box plot
boxplot(x ~ y, 
        col = c("orange", "lightblue"), # define colors of boxes
        xlab = "Groups",                # define x-axis label
        ylab = "Probability")           # define y-axis label


## Today's Data


We will use 3 data sets:

1. **`data_german`**

2. **`L2EnglishIntervention`**

3. **`AJT_V2`**

## Session preparation

Now, we start by preparing the session.


In [None]:
# load packages
library(dplyr)
library(here)
library(readxl)
library(xlsx)
library(ggplot2)
library(tidyr)
library(stringr)
library(cowplot)


# How to load different data formats into R

## Load files when working in Google Colab

You can also use your own data. The code chunk below shows you how to upload two files from your own computer **BUT** to be able to load your own data, you need to click on the folder symbol to the left of the screen:

![Colab Folder Symbol](https://slcladal.github.io/images/ColabFolder.png)

Then on the upload symbol. 

![Colab Upload Symbol](https://slcladal.github.io/images/ColabUpload.png)


Next, upload the files you want to analyze and then enter the respective file names in the `path` argument of the `read_xlsx` function. When you then execute the code (like the code chunk below), you will load your own data.

To apply the code and functions below to your own data, you will need to modify the code chunks and replace the data we use here with your own data object. 


## Load xlsx-file when working in RStudio on your own computer


In [None]:
# load xlsx data from Google Drive 
myxlsx <- readxl::read_xlsx("data_german.xlsx", sheet = 1)
# load xlsx from the data folder in your project folder
#myxlsx <- readxl::read_xlsx(here::here("data", "data_german.xlsx"), sheet = 1)
# inspect data (the head function shows the first 6 rows of a table or data frame)
head(myxlsx)


* save data



In [None]:
# into the data folder in your project folder
#xlsx::write.xlsx(myxlsx, here::here("data", "myxlsx.xlsx"))
# into Google Colab
xlsx::write.xlsx(myxlsx, "myxlsx.xlsx")


## Load csv-file when working in RStudio on your own computer

**Make sure you have uploaded the file `data_german.csv` into Google Colab for this to work!**


In [None]:
# load csv file from Google Drive 
mycsv <- read.csv("data_german.csv")
# load csv from the data folder in your project folder
#mycsv <- read.csv(here::here("data", "data_german.csv"))
# inspect data
head(mycsv)


* save data



In [None]:
# into the data folder in your project folder
#write.csv(mycsv, here::here("data", "mycsv.csv"))
# into Google Colab
write.csv(mycsv, "mycsv.csv")
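
One detail worth knowing: by default, `write.csv` also writes the row names, which show up as an extra column (named `X`) when the file is loaded again. The sketch below illustrates this with a small made-up data frame (`toy`) and temporary files, so it does not touch your own data.

```r
# toy data frame standing in for mycsv (made-up values)
toy <- data.frame(id = 1:3, family = c("bil-mixed", "bil-rus", "mono-de"))
# temporary files so that nothing in your project folder is changed
f_default <- tempfile(fileext = ".csv")
f_norows  <- tempfile(fileext = ".csv")
# default: row names are written as an unnamed first column
write.csv(toy, f_default)
# row.names = FALSE: only the actual columns are written
write.csv(toy, f_norows, row.names = FALSE)
# compare the column names after re-loading
colnames(read.csv(f_default))
colnames(read.csv(f_norows))
```

If you plan to re-load a saved file, adding `row.names = FALSE` is usually the cleaner choice.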


## Load txt-file when working in RStudio on your own computer

Again: make sure you have uploaded the file `data_german.txt` into Google Colab for this to work!


In [None]:
# load txt file from Google Drive 
mytxt <- read.delim("data_german.txt", sep = "\t")
# load txt from the data folder in your project folder
#mytxt <- read.delim(here::here("data", "data_german.txt"), sep = "\t")
# inspect data
head(mytxt)


* save data



In [None]:
# into the data folder in your project folder
#write.table(mytxt, here::here("data", "mytxt.txt"), sep = "\t")
# into Google Colab
write.table(mytxt, "mytxt.txt", sep = "\t")


## Load rda-file when working in RStudio on your own computer

Again: make sure you have uploaded the file `data_german.rda` into Google Colab for this to work!


In [None]:
# load rda data from Google Drive 
myrda <- readRDS("data_german.rda")
# load rda from the data folder in your project folder
#myrda <- readRDS(here::here("data", "data_german.rda"))
# inspect data
head(myrda)


* save data



In [None]:
# into the data folder in your project folder
#base::saveRDS(myrda, file = here::here("data", "myrda.rda"))
# into Google Colab
base::saveRDS(myrda, file = "myrda.rda")
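
`saveRDS` and `readRDS` store and restore exactly one R object, including details such as factor levels that get lost in csv or txt files. The following sketch shows a round trip with a made-up data frame (`toy`) and a temporary file. Note that the file extension does not matter to R here; `.rds` is the more common convention for files written with `saveRDS`, while `.rda` is typically used with `save` and `load`.

```r
# toy data frame with a factor whose level order we care about (made-up values)
toy <- data.frame(id = 1:3,
                  accent = factor(c("no", "w", "s"), levels = c("no", "w", "s")))
# write to a temporary file and read it back in
f <- tempfile(fileext = ".rda")
saveRDS(toy, file = f)
toy2 <- readRDS(f)
# the restored object is identical to the original, including the level order
identical(toy, toy2)
levels(toy2$accent)
```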


# Basics of data preparation

Basic procedures for processing tabular data:

* `mutate`: creates new or changes existing columns

* `filter`: chooses rows based on given criteria

* `select`: chooses columns based on given criteria

* `group_by`: groups rows based on criteria in other columns

* `summarize`: summarizes column values

* `spread`: split values of a column and spread it across columns

* `gather`: take values of several columns and combine them into a single column 

* `%>%`: pipe-symbol that can be read as *and then*
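
Since `spread` and `gather` do not appear in the workshop code itself, here is a small sketch of what they do, using a made-up table (`wide`). Both functions come from `tidyr`; newer `tidyr` versions also offer `pivot_wider` and `pivot_longer` for the same tasks.

```r
library(tidyr)
# made-up data: one row per child, one column per test
wide <- data.frame(child = c("c1", "c2"),
                   test1 = c(10, 12),
                   test2 = c(14, 15))
# gather: combine the columns test1 and test2 into a key and a value column
long <- gather(wide, key = "test", value = "score", test1, test2)
long
# spread: the inverse step, one column per value of `test`
wide_again <- spread(long, key = test, value = score)
wide_again
```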


**Example**


In [None]:
myxlsx %>% # take the myxlsx data and then
  # create a new column called Age that contains the age of children in years 
  mutate(Age = age_months/12) %>% # and then
  # only keep rows with children older than 8 years
  filter(Age > 8) %>% # and then
  # only retain the columns Age, accent_response, and family
  select(Age, accent_response, family) -> newdata # store the results in an object called newdata
# inspect newdata
head(newdata)


We can also group and summarize the data now.



In [None]:
newdata %>%
  group_by(family, accent_response) %>%
  summarise(N = n()) -> newdata2
# inspect
newdata2


Now, we have the number of observations for each combination of *family* and *accent_response*.
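
Two details about this step are easy to miss. After `summarise`, the result is still grouped by the remaining grouping variable(s) unless you ask for something else via `.groups`, and `dplyr` offers `count` as a shortcut for group-and-count. The sketch below shows both on a made-up data frame (`toy`).

```r
library(dplyr)
# made-up data with the same two columns as above
toy <- data.frame(family = c("bil-rus", "bil-rus", "mono-de", "mono-de", "mono-de"),
                  accent_response = c("no", "w", "no", "no", "s"))
# group and count by hand; .groups = "drop" returns an ungrouped table
by_hand <- toy %>%
  group_by(family, accent_response) %>%
  summarise(N = n(), .groups = "drop")
# shortcut: count() groups, counts, and ungroups in one step
by_count <- toy %>%
  count(family, accent_response, name = "N")
# both approaches give the same frequencies
by_hand
by_count
```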


# Getting started with ggplot2


In [None]:
ggplot(myxlsx, aes(x = accent_response, y = age_months))



## Box Plots

* add the geom-layer


In [None]:
ggplot(myxlsx, aes(x = accent_response, y = age_months)) +
  geom_boxplot()


### Saving plots

We can use the `ggsave` function to save plots to your computer.

**WARNING!** The code below for loading and saving data only works if you are **not working** in the Jupyter notebook but in RStudio on your own computer!


In [None]:
ggsave("myfirstggplot.png")
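
Called with only a file name, `ggsave` saves the last plot that was displayed, at the size of the current graphics device. If you want the output to be reproducible, you can pass the plot object and the size explicitly. The sketch below does this with made-up data (`toy`) and a temporary folder; the file name and the size values are arbitrary choices for illustration.

```r
library(ggplot2)
# made-up data for a small box plot
toy <- data.frame(group = rep(c("A", "B"), each = 20),
                  value = c(seq(1, 20), seq(6, 25)))
# store the plot in an object instead of only displaying it
p <- ggplot(toy, aes(x = group, y = value)) +
  geom_boxplot()
# save exactly this plot with a fixed size and resolution
out <- file.path(tempdir(), "toyboxplot.png")
ggsave(out, plot = p, width = 12, height = 8, units = "cm", dpi = 300)
# check that the file was written
file.exists(out)
```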



Another way of doing this: piping

Prettifying the plot

* Get rid of NAs


In [None]:
myxlsx %>%
  drop_na() %>%
ggplot(aes(x = accent_response, y = age_months)) +
  geom_boxplot()


* Reorder accent_response



In [None]:
myxlsx %>%
  drop_na() %>%
  mutate(accent_response = factor(accent_response, 
                                  levels = c("no", "w", "s"), 
                                  labels = c("No accent", "Weak accent", "Strong accent"))) %>%
ggplot(aes(x = accent_response, y = age_months)) +
  geom_boxplot()
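
The reordering works because `ggplot2` arranges a categorical axis by the order of the factor levels. Here is a short base R sketch of what `levels` and `labels` do, using a made-up vector (`x`):

```r
# made-up responses
x <- c("w", "no", "s", "no")
# without further arguments, the levels are sorted alphabetically
f_default <- factor(x)
levels(f_default)
# levels sets the order, labels renames the levels (in the same order)
f_ordered <- factor(x,
                    levels = c("no", "w", "s"),
                    labels = c("No accent", "Weak accent", "Strong accent"))
levels(f_ordered)
as.character(f_ordered)
```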


* Changing axes-labels

Option 1: change the data


In [None]:
myxlsx %>%
  drop_na() %>%
  mutate(accent_response = factor(accent_response, 
                                  levels = c("no", "w", "s"), 
                                  labels = c("No accent", "Weak accent", "Strong accent"))) %>%
ggplot(aes(x = accent_response, y = age_months)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)")


Option 2: change the labels directly



In [None]:
myxlsx %>%
  drop_na() %>%
  ggplot(aes(x = accent_response, y = age_months)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)") +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent"))


* Add color



In [None]:
myxlsx %>%
  drop_na() %>%
  ggplot(aes(x = accent_response, y = age_months, fill = accent_response)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)") +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent"))


* Change background to white



In [None]:
myxlsx %>%
  drop_na() %>%
  ggplot(aes(x = accent_response, y = age_months, fill = accent_response)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)") +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent")) +
  theme_bw()


* Move legend to the top



In [None]:
myxlsx %>%
  drop_na() %>%
  ggplot(aes(x = accent_response, y = age_months, fill = accent_response)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)")  +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent")) +
  theme_bw() +
  theme(legend.position = "top") +
  guides(fill=guide_legend(title="Levels of Accent"))


* Change axes limits



In [None]:
myxlsx %>%
  drop_na() %>%
  ggplot(aes(x = accent_response, y = age_months, fill = accent_response)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)")  +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent")) +
  theme_bw() +
  theme(legend.position = "top") +
  guides(fill=guide_legend(title="Levels of Accent")) +
  coord_cartesian(xlim = c(0.5, 3.5),
                  ylim = c(0, 150))


* Change colors



In [None]:
myxlsx %>%
  drop_na() %>%
  ggplot(aes(x = accent_response, y = age_months, fill = accent_response)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)")  +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent")) +
  theme_bw() +
  theme(legend.position = "top") +
  guides(fill=guide_legend(title="Levels of Accent")) +
  coord_cartesian(xlim = c(0.5, 3.5),
                  ylim = c(0, 150)) +
  scale_fill_manual(values = c("red", "blue", "gray"))


What if we want to include another factor?



In [None]:
myxlsx %>%
  drop_na() %>%
  ggplot(aes(x = accent_response, y = age_months, fill = accent_response)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)")  +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent")) +
  theme_bw() +
  theme(legend.position = "top") +
  guides(fill=guide_legend(title="Levels of Accent")) +
  coord_cartesian(xlim = c(0.5, 3.5),
                  ylim = c(0, 150)) +
  scale_fill_manual(values = c("red", "blue", "gray"))  + 
  facet_grid(~family)


* Change direction of axes tick-marks



In [None]:
myxlsx %>%
  drop_na() %>%
  ggplot(aes(x = accent_response, y = age_months, fill = accent_response)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)")  +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent")) +
  theme_bw() +
  theme(legend.position = "top",
        axis.text.x = element_text(size=8, angle=90)) +
  guides(fill=guide_legend(title="Levels of Accent")) +
  coord_cartesian(xlim = c(0.5, 3.5),
                  ylim = c(0, 150)) +
  scale_fill_manual(values = c("red", "blue", "gray"))  + 
  facet_grid(~family)


* Change facet labels



In [None]:
myxlsx %>%
  drop_na() %>%
  mutate(family = case_when(family == "bil-mixed" ~ "Mixed bilingual",
                            family == "bil-rus" ~ "Bilingual Russian",
                            family == "mono-de" ~ "Monolingual German",
                            TRUE ~ family)) %>%
ggplot(aes(x = accent_response, y = age_months, fill = accent_response)) +
  geom_boxplot() +
  labs(x = "Accent Rating",
       y = "Age (in months)")  +
  scale_x_discrete(limits = c("no", "w", "s"), 
                   labels = c("No accent", "Weak accent", "Strong accent")) +
  theme_bw() +
  theme(legend.position = "top",
        axis.text.x = element_text(size=8, angle=90)) +
  guides(fill=guide_legend(title="Levels of Accent")) +
  coord_cartesian(xlim = c(0.5, 3.5),
                  ylim = c(0, 150)) +
  scale_fill_manual(values = c("red", "blue", "gray"))  + 
  facet_grid(~family)


* Save plot



In [None]:
ggsave("myniceggplot.png")



### Exercise

The code below loads another data set and stores it as `exdata`. Use the data to create another boxplot (also, add colors and try to make it nice and publishable). I have made it a bit easier for you by calculating the mean of *Response-code*.


In [None]:
exdata <- read_excel("L2EnglishIntervention.xlsx") %>%
  dplyr::mutate_if(is.character, factor) %>%
  dplyr::group_by(Group, Test, `Test-item`, Condition, Grammaticality) %>%
  dplyr::summarise(Response_mean = mean(`Respose-code`))
# inspect
head(exdata)


You will most probably encounter some difficulties - don't worry and don't lose hope! We will try and do this together!

If you want to have an aim, try and re-create the following boxplot:

![Example boxplot](https://slcladal.github.io/images/exboxplot.png)

You can write code in the code box below.


Below is the skeleton script for the plot (in case you need help ;)



In [None]:
exdata %>%
  dplyr::filter( != "") %>%
  ggplot(aes(x = , y = , fill = )) +
  geom_boxplot() +
  facet_grid( ~ ) +
  theme_bw() +
  labs(y = "") +
  theme(legend.position = "top")


## Excursion: in-built statistics

One nice thing about `ggplot2` is that it allows you to visualize statistical properties such as means, standard errors, or standard deviations very easily using the `stat_summary` function (see below).


In [None]:
# install required packages
install.packages("Hmisc")
# load package
library(Hmisc)
# load data and plot
read_excel("L2EnglishIntervention.xlsx") %>%
  ggplot(aes(x = Test, y = `Respose-code`, group = Group, color = Group)) +
  stat_summary(fun = mean, geom = "point", aes(group= Group)) +          
  stat_summary(fun.data = mean_cl_boot,       
               # add error bars
               geom = "errorbar", width = 0.2) +
  facet_grid(Grammaticality~Condition) +
  theme_bw()
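
`mean_cl_boot` relies on the `Hmisc` package and, as far as I am aware, returns the mean together with a bootstrapped confidence interval. The base R sketch below shows the general idea of such an interval. It is not the `Hmisc` implementation: the function `boot_ci_mean`, its defaults, and the simulated responses are assumptions made for this illustration.

```r
# hypothetical helper: mean plus percentile bootstrap confidence interval
boot_ci_mean <- function(x, B = 1000, conf = 0.95) {
  x <- x[!is.na(x)]
  # resample with replacement B times and store the mean of each resample
  boot_means <- replicate(B, mean(sample(x, length(x), replace = TRUE)))
  # take the lower and upper percentiles of the resampled means
  alpha <- (1 - conf) / 2
  ci <- quantile(boot_means, probs = c(alpha, 1 - alpha), names = FALSE)
  c(y = mean(x), ymin = ci[1], ymax = ci[2])
}
# simulated 0/1 responses (made-up data)
set.seed(42)
responses <- rbinom(200, size = 1, prob = 0.7)
boot_ci_mean(responses)
```

The names `y`, `ymin`, and `ymax` mirror what `geom = "errorbar"` expects for the centre and the ends of each error bar.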


## Line Plots

Prepare data

* remove NAs


In [None]:
linedat <- myxlsx %>%
  drop_na()
# inspect data
head(linedat)


* create column with age groups



In [None]:
linedat <- linedat %>%
  mutate(age_cat = case_when(age_months <= 60 ~ "41-60",
                             age_months <= 70 ~ "61-70",
                             age_months <= 80 ~ "71-80",
                             age_months <= 90 ~ "81-90",
                             age_months <= 100 ~ "91-100",
                             age_months <= 110 ~ "101-110",
                             age_months <= 120 ~ "111-120"),
         # convert into factor with set order of levels
         age_cat = factor(age_cat, levels = c("41-60", "61-70", "71-80", "81-90", 
                                              "91-100", "101-110", "111-120"))) 
# inspect
head(linedat)
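
Binning a numeric variable can also be done with the base R function `cut`, which returns a factor with ordered levels in one step. The sketch below uses made-up ages; note that by default `cut` treats each interval as closed on the right, so a child aged exactly 60 months ends up in the group 41-60.

```r
# made-up ages in months, including boundary values
age_months <- c(45, 60, 61, 70, 95, 120)
# breaks define the interval borders, labels name the resulting groups
age_cat <- cut(age_months,
               breaks = c(40, 60, 70, 80, 90, 100, 110, 120),
               labels = c("41-60", "61-70", "71-80", "81-90",
                          "91-100", "101-110", "111-120"))
# inspect which age ends up in which group
data.frame(age_months, age_cat)
```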


* create column with mean accent rating per family type and age group



In [None]:
linedat <- linedat %>%
  # grouping by age group and family type
  group_by(family, age_cat) %>%
  # calculate mean of accent rating
  summarise(accent_numeric = mean(accent_numeric))


Generate plot



In [None]:
linedat  %>%
  ggplot(aes(x = age_cat, y = accent_numeric,
             # generate different lines for each family type
             group = family, 
             # give different colors to each line
             color = family)) +
  geom_line()


Prettify plot

* add different line types

* increase thickness of lines


In [None]:
linedat %>%
  ggplot(aes(x = age_cat, y = accent_numeric, 
                   group = family, color = family, linetype = family)) +
  # change line thickness
  geom_line(size = 1.5)


* white background

* change axes labels


In [None]:
linedat %>%
  ggplot(aes(x = age_cat, y = accent_numeric, 
                   group = family, color = family, linetype = family)) +
  geom_line(size = 1.5) +
  theme_bw() +
  labs(x = "Age", y = "Accent strength rating")


* legend at top
* change y-axis tick labels


In [None]:
linedat %>%
  ggplot(aes(x = age_cat, y = accent_numeric, 
                   group = family, color = family, linetype = family)) +
  geom_line(size = 1.5) +
  theme_bw() +
  labs(x = "Age", y = "Accent strength rating") +
  theme(legend.position = "top") +
  scale_y_continuous(name = "Accent strength rating", 
                     breaks = seq(0, 2, 1), 
                     limits = c(0, 2),
                     labels = c("No accent", "Weak accent", "Strong accent"))


* change legend text



In [None]:
linedat %>%
  # remove grouping first (mutate cannot modify grouping variables)
  ungroup() %>%
  mutate(family = case_when(family == "bil-mixed" ~ "Mixed bilingual",
                            family == "bil-rus" ~ "Bilingual Russian",
                            family == "mono-de" ~ "Monolingual German",
                            TRUE ~ family)) %>%
  ggplot(aes(x = age_cat, y = accent_numeric, 
             group = family, color = family, linetype = family)) +
  geom_line(size = 1.5) +
  theme_bw() +
  labs(x = "Age", y = "Accent strength rating") +
  theme(legend.position = "top") +
  scale_y_continuous(name = "Accent strength rating", 
                     breaks = seq(0, 2, 1), 
                     limits = c(0, 2),
                     labels = c("No accent", "Weak accent", "Strong accent")) +
  guides(linetype = guide_legend(title = "Family type"),
         color = guide_legend(title = "Family type"))


Save plot



In [None]:
ggsave("niceline.png")



## Smoothed Line Plots

Two numeric variables, or one numeric variable (y) and one categorical variable.

Generate basic plot


In [None]:
ggplot(myxlsx, aes(x = age_months, y = accent_numeric, 
                   group = family, color = family, fill = family)) +
  geom_smooth()


### Exercise

Based on what you learned before, can you make the plot "nicer"?
Please change the following:

* Axes labels
* Background color
* Line and fill color
* Legend position
* legend title
* y-axis tick labels

The final plot should look something like this:

![Example smoothed line plot](https://slcladal.github.io/images/nicesmooth.png)


You can write the code in the following code box.


## Bar Chart

One categorical variable.


Generate basic plot


In [None]:
ggplot(myxlsx, aes(x = family)) +
  geom_bar(stat = "count")


Alternative: show pre-calculated frequencies

Prepare data


In [None]:
bardata <- myxlsx %>%
  # change the levels of family to be more meaningful
  mutate(family = case_when(family == "bil-mixed" ~ "Mixed bilingual",
                            family == "bil-rus" ~ "Bilingual Russian",
                            family == "mono-de" ~ "Monolingual German",
                            TRUE ~ family)) %>%
  # group by family
  group_by(family) %>%
  # get frequency of familytypes
  summarise(Frequency = n())
# inspect
head(bardata)


* add percentage



In [None]:
bardata <- bardata %>%
  # ungroup
  ungroup() %>%
  # calculate total and store value in extra column called Total
  mutate(Total = sum(Frequency)) %>%
  # perform calculations row-wise
  rowwise() %>%
  # calculate percent
  mutate(Percent = round(Frequency/Total *100, 1),
         # add a Label column with the Frequency and the Percentage value
         Label = paste0(Frequency, " (", Percent, "%)")) %>%
  # remove Total column (we don't need it any longer)
  select(-Total)
# check data
bardata


1 numeric and 1 categorical variable



In [None]:
ggplot(bardata, aes(x = family, y = Frequency)) +
  geom_bar(stat = "identity")


Prettify plot

* change background
* modify axes labels
* change color/filling
* remove legend


In [None]:
ggplot(bardata, aes(x = family, y = Frequency, fill = family)) +
  geom_bar(stat = "identity") +
  theme_bw() +
  labs(x = "Family Type", y = "Raw Frequency") +
  theme(legend.position = "none")


* adapt axis range
* Add text/annotation


In [None]:
ggplot(bardata, aes(x = family, y = Frequency, fill = family, label = Label)) +
  geom_bar(stat = "identity") +
  theme_bw() +
  labs(x = "Family Type", y = "Raw Frequency") +
  theme(legend.position = "none") +
  coord_cartesian(ylim = c(0, 500)) +
  geom_text(vjust=-1.5, position = position_dodge(0.9))


Save plot



In [None]:
ggsave("nicebar.png")



### Exercise

Use the `regdat` data set (loaded below) to create a bar graph that shows the responses by sentence type, word order, and group. Adapt color, axes title, axes tick labels, and try adding text.


In [None]:
regdat <- read.delim("AJT_V2.csv", sep = ";") %>%
  # factorize character variables
  dplyr::mutate_if(is.character, factor)
# inspect data
str(regdat)


You can see an example of what the plot looks like below.

![Example barplot](https://slcladal.github.io/images/barplotex.png)


You can write the code in the code box below.


## Mosaic plots with vcd

* prepare data: tidy format


In [None]:
# install required packages
install.packages("vcd")
# load packages
library(vcd)
# process data
mosaicdat <- myxlsx %>%
  # group by family, age_group, and accent_response
  dplyr::group_by(family, age_group, accent_response) %>%
  # get frequencies of the configurations
  dplyr::summarise(Frequency = n())
# inspect
mosaicdat


* ungroup and convert character variables into factors



In [None]:
mosaicdat <- mosaicdat %>%
  # ungroup
  dplyr::ungroup() %>%
  # convert character variables to factors
  dplyr::mutate_if(is.character, factor)
# inspect
mosaicdat


* split data (the `pull` function *pulls out* the values for a variable)



In [None]:
mos1 <- mosaicdat %>%
  # keep only rows where family is bil-mixed
  dplyr::filter(family == "bil-mixed") %>%
  # pull out the numeric values
  dplyr::pull()
# inspect
mos1


* we also do this for the other family types



In [None]:
mos2 <- mosaicdat %>%
  dplyr::filter(family == "bil-rus") %>%
  dplyr::pull()
mos3 <- mosaicdat %>%
  dplyr::filter(family == "mono-de") %>%
  dplyr::pull()
# inspect
mos2; mos3


* generate matrix



In [None]:
# add dimnames (dimension names)
row.names <- c("no", "s", "w")
column.names <- c("preschool", "school")
matrix.names <- c("bil-mixed", "bil-rus", "mono-de")
# generate matrix
mos_mx <- array(c(mos1, mos2, mos3), 
                dim = c(3, 2, 3),
                dimnames = list(row.names, 
                                column.names,
                                matrix.names))
# inspect
mos_mx
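
Building the array by hand only works if every combination of the three variables is present and the rows are in the right order. As an alternative sketch, the base R function `xtabs` can turn a tidy frequency table into a three-way contingency table directly. The data frame `freq` and its counts below are made up for illustration. A table like this should, I believe, also be accepted by `mosaic`, but that part is not shown here.

```r
# made-up frequency table in tidy format (one row per combination)
freq <- expand.grid(accent_response = c("no", "s", "w"),
                    age_group = c("preschool", "school"),
                    family = c("bil-mixed", "bil-rus", "mono-de"),
                    stringsAsFactors = TRUE)
freq$Frequency <- 1:18   # made-up counts
# cross-tabulate: the order of the variables defines the dimensions
tab <- xtabs(Frequency ~ accent_response + age_group + family, data = freq)
# 3 accent levels x 2 age groups x 3 family types
dim(tab)
# look up a single cell by name
tab["s", "school", "bil-rus"]
```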


* basic mosaic plot



In [None]:
mosaic(mos_mx,
       shade = TRUE,
       direction = c("h",  "v", "v"),
       just_labels = c("center", "center", "center", "center"))


* save mosaic plot



In [None]:
# open connection
png("mosaic.png", width = 750, height = 300)
# generate plot
mosaic(mos_mx,
       axis.cex = 15,
       shade = TRUE,
       direction = c("h",  "v", "v"),
       just_labels = c("center", "center", "center", "center"))
# close the graphics device (this writes the file)
dev.off() 


## Visualizing Likert data

Load data


In [None]:
sdat  <- base::readRDS(url("https://slcladal.github.io/data/sdd.rda", "rb"))
# inspect 
head(sdat)
# install required packages
install.packages("likert")
# load packages
library(likert)


* clean column names



In [None]:
# clean column names
colnames(sdat)[3:ncol(sdat)] <- paste0("Q ", str_pad(1:10, 2, "left", "0"), ": ", colnames(sdat)[3:ncol(sdat)]) %>%
  stringr::str_replace_all("\\.", " ") %>%
  stringr::str_squish() %>%
  stringr::str_replace_all("$", "?")
# inspect column names
colnames(sdat)


* replace numeric values with labels



In [None]:
lbs <- c("disagree", "somewhat disagree", "neither agree nor disagree",  
         "somewhat agree", "agree")
survey <- sdat %>%
  dplyr::mutate_if(is.character, factor) %>%
  dplyr::mutate_if(is.numeric, factor, levels = 1:5, labels = lbs) %>%
  drop_na() %>%
  as.data.frame()
# inspect
head(survey)


In [None]:
plot(likert(survey[,3:12]), ordered = F, wrap= 30)



* save plot



In [None]:
survey_p1 <- plot(likert(survey[,3:12]), ordered = F, wrap= 60)
# save plot
cowplot::save_plot("stu_p1.png", # where to save the plot
                   survey_p1,        # object to plot
                   base_asp = 1.5,  # ratio of space for questions vs space for plot
                   base_height = 8) # size! higher for smaller font size


* include groups



In [None]:
# create plot
survey_p2 <- plot(likert(survey[,3:8], grouping = survey[,1]))
# save plot
cowplot::save_plot("stu_p2.png", # where to save the plot
                   survey_p2,        # object to plot
                   base_asp = 1.5,  # ratio of space for questions vs space for plot
                   base_height = 8) # size! higher for smaller font size
# show plot
survey_p2


# Visualizing Model Effects

* load data


In [None]:
regdat <- read.delim("AJT_V2.csv", sep = ";")
# inspect data
str(regdat)
# install required packages
install.packages("lme4")
install.packages("sjPlot")
# load packages
library(lme4)
library(sjPlot)


* perform regression analysis



In [None]:
# run model
m1 = glmer(Response_num ~ (1|ID) + Word_order + Sentence_type + Sentence_type * Group+Sentence_type * Word_order, 
                 data = regdat, 
                 family = binomial)
# inspect results
sjPlot::tab_model(m1)
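
A note on the model formula: in R, `a * b` is shorthand for `a + b + a:b`, so main effects that are listed in addition to a `*` term are redundant (but harmless). The base R sketch below checks this for the fixed-effects part of the formula used above. The random intercept `(1|ID)` is left out because it is not relevant to this point, and the helper `norm_terms` is made up for this illustration.

```r
# fixed-effects part as written in the model above
f_full  <- ~ Word_order + Sentence_type + Sentence_type * Group + Sentence_type * Word_order
# shorter version without the separately listed main effects
f_short <- ~ Sentence_type * Group + Sentence_type * Word_order
# hypothetical helper: expanded term labels in a comparable form
norm_terms <- function(f) {
  labs <- attr(terms(f), "term.labels")
  # sort the variables inside each interaction, then sort the terms
  sort(vapply(strsplit(labs, ":", fixed = TRUE),
              function(v) paste(sort(v), collapse = ":"), character(1)))
}
norm_terms(f_full)
norm_terms(f_short)
# both formulas expand to the same set of terms
identical(norm_terms(f_full), norm_terms(f_short))
```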


* visualize results



In [None]:
sjPlot::plot_model(m1, type = "pred", terms = c("Sentence_type", "Group"))



Show other significant interaction and modify plot.



In [None]:
sjPlot::plot_model(m1, type = "pred", terms = c("Sentence_type", "Word_order"))  + 
  ggplot2::theme_bw() +
  ggplot2::labs(x = "Sentence Type", y = "Probability\nfor Response = 1")


# Resources and Wrap-Up

That's all folks!
