# **Visualizing Relationships**

---

<br>

## Packages

In [None]:
# install packages for today's lecture (skipped if already installed)
#install.packages("dslabs")
if (!requireNamespace("pheatmap", quietly = TRUE)) {
  install.packages("pheatmap")  # package for heatmap
}

In [None]:
# load the necessary packages
# library(dslabs)
library(pheatmap)

<br>

<br>

---

<br>

## Preprocessing Variables

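* First, let's preprocess the variables we need for the analysis: we convert the `am` (transmission type) and `cyl` (number of cylinders) columns of `mtcars` to factors with readable labels, so that the plots below show category names instead of numeric codes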

In [None]:
# convert transmission type to a factor (in mtcars, 0 = automatic and 1 = manual)
mtcars$am <- as.factor(mtcars$am)
# relabel levels; the order must match the sorted codes 0, 1
levels(mtcars$am) <- c("automatic", "manual")

<br>

In [None]:
# convert number of cylinders to a factor
mtcars$cyl <- as.factor(mtcars$cyl)
# relabel levels
levels(mtcars$cyl) <- c("4 cylinders", "6 cylinders", "8 cylinders")

<br>
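
* As a cross-check, the same recoding can be written with an explicit code-to-label mapping via `factor()`. This is an alternative sketch, not the approach used in the cells above. It works on a fresh copy of the data (the name `cars` is only illustrative), so it does not depend on whether the cells above have already been run.

```r
# Work on a fresh copy of the built-in data set (illustrative name `cars`)
cars <- datasets::mtcars

# Map each numeric code to its label explicitly
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("automatic", "manual"))
cars$cyl <- factor(cars$cyl, levels = c(4, 6, 8),
                   labels = c("4 cylinders", "6 cylinders", "8 cylinders"))

# Inspect the result: level names and number of cars per level
levels(cars$am)
table(cars$am)
table(cars$cyl)
```

* Spelling out `levels` and `labels` together keeps each label tied to its code, rather than relying on the order in which `as.factor()` sorts the values.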

<br>



---

<br>

## Boxplots

* In `R`, boxplots are plotted using the `boxplot()` function

* The syntax for `boxplot()` is the following:

  ``` boxplot(y ~ x) ```

  where `y` is our quantitative variable and `x` is our categorical variable (or `factor`)

<br>

In [None]:
# controls width and height of plot (in inches) in the Jupyter R kernel
options(repr.plot.width=8, repr.plot.height=8)

In [None]:
# Create a boxplot of mpg vs number of cylinders
boxplot(mtcars$mpg ~ mtcars$cyl,
        xlab = "Number of Cylinders",
        ylab = "mpg",
        main = "mpg vs Number of Cylinders")

<br>
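
* The formula `y ~ x` can also be combined with the `data` argument of `boxplot()`, which avoids repeating `mtcars$`. The sketch below uses a fresh copy of the data (illustrative name `cars`) and also keeps the value that `boxplot()` returns invisibly, which holds the statistics behind each box.

```r
# Fresh copy of the built-in data set (illustrative name `cars`)
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)  # levels "4", "6", "8"

# Same plot as above, written with the `data` argument
bp <- boxplot(mpg ~ cyl, data = cars,
              xlab = "Number of Cylinders",
              ylab = "mpg",
              main = "mpg vs Number of Cylinders")

# boxplot() invisibly returns what it drew
bp$names  # group labels, one per box
bp$n      # number of observations in each group
bp$stats  # one column per group: lower whisker, lower hinge, median,
          # upper hinge, upper whisker
```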

<br>

---

<br>

## Scatterplots

* In `R`, scatterplots are plotted using the `plot()` function

* The syntax for `plot()` is the following:

  ``` plot(x, y) ```

  where `x` is our independent quantitative variable and `y` is our dependent quantitative variable

<br>

In [None]:
# create a scatterplot of mpg (y-axis) vs hp (x-axis)
plot(mtcars$hp, mtcars$mpg,
     xlab = "Horsepower",
     ylab = "mpg",
     main = "mpg vs Horsepower")

<br>
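
* `plot()` also accepts the same formula notation as `boxplot()`, with the dependent variable on the left: `plot(y ~ x)`. The sketch below redraws the scatterplot this way and, as an optional addition that is not part of the original example, overlays a least-squares line and computes the correlation to summarize the direction of the relationship. The names `cars`, `fit`, and `r` are illustrative.

```r
# Fresh copy of the built-in data set (illustrative name `cars`)
cars <- datasets::mtcars

# Scatterplot of mpg (y-axis) vs hp (x-axis) using the formula interface
plot(mpg ~ hp, data = cars,
     xlab = "Horsepower",
     ylab = "mpg",
     main = "mpg vs Horsepower")

# Optional: add a straight-line fit to the existing plot
fit <- lm(mpg ~ hp, data = cars)
abline(fit, col = "red")

# Optional: correlation between the two variables
r <- cor(cars$hp, cars$mpg)
r
```

* A negative value of `r` matches the downward trend in the plot: cars with more horsepower tend to have lower mpg.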

<br>

---

<br>

## Heatmap

* In `R`, there are several options for heatmaps

* Here, we will use the `pheatmap()` function from the `pheatmap` package

* The `pheatmap()` function takes a numeric matrix as input; here, we give it a table of frequency counts

<br>

In [None]:
# controls width and height of plot (in inches) in the Jupyter R kernel
options(repr.plot.width=6, repr.plot.height=4)

# Create the frequency table
table_data <- table(mtcars$am, mtcars$cyl)
table_data

<br>

In [None]:
# Generate heatmap using pheatmap
# (rows and columns are clustered by default, so their order may differ from the table)
pheatmap(table_data,
         color = colorRampPalette(c("white", "red"))(256)  # specifies color range for heatmap
         )

<br>
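
* As one of the other options mentioned above, base `R` provides `heatmap()` in the `stats` package, which needs no extra installation. The sketch below is an illustrative alternative, not a replacement for the `pheatmap()` example. It rebuilds the frequency table from a fresh copy of the data (illustrative names `cars`, `tab`, `mat`) and turns off clustering and scaling so the cells appear in table order with raw counts.

```r
# Fresh copy of the built-in data set (illustrative name `cars`)
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("automatic", "manual"))
cars$cyl <- factor(cars$cyl)  # levels "4", "6", "8"

# Frequency table of transmission type by number of cylinders
tab <- table(cars$am, cars$cyl)

# heatmap() expects a numeric matrix, so convert the table explicitly
mat <- matrix(as.numeric(tab), nrow = nrow(tab),
              dimnames = list(rownames(tab), colnames(tab)))

# Rowv = NA and Colv = NA suppress clustering; scale = "none" keeps raw counts
heatmap(mat, Rowv = NA, Colv = NA, scale = "none",
        col = colorRampPalette(c("white", "red"))(256),
        margins = c(6, 8))
```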

<br>