# Data Visualization in R

In this example, we will be plotting temperature data for the first day of summer in Tucson, Arizona.

We start by importing additional packages necessary for this visualization.

In [None]:
library("ggplot2")

print("Libraries loaded!")

Next, we load the data and look at the first few rows.

In [None]:
tucson <- read.csv(file = "data/tucson-summer.csv")
head(tucson)
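
Beyond `head`, it can help to check column types and missing values before plotting. The following is a minimal sketch on a small, made-up data frame; `tucson_demo` and its values are hypothetical stand-ins, and only the column names `year`, `tmin`, and `tmax` are taken from this lesson.

```r
# Hypothetical stand-in for data/tucson-summer.csv; the values are invented
tucson_demo <- data.frame(
  year = 2001:2005,
  tmin = c(70, 72, NA, 71, 74),
  tmax = c(101, 104, 103, NA, 106)
)

# Column names and types
str(tucson_demo)

# Per-column summaries, which also report NA counts
summary(tucson_demo)

# Number of missing values in each column
missing_counts <- colSums(is.na(tucson_demo))
print(missing_counts)
```

Running the same three calls on `tucson` should show whether any years lack a temperature reading, which `ggplot2` would otherwise drop with a warning.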

Our first plot shows the daily minimum temperature on the summer solstice over time. Passing `method = "lm"` to the ggplot2 function `geom_smooth` automatically draws the line from a linear regression.

In [None]:
ggplot(data = tucson, mapping = aes(x = year, y = tmin)) +
  geom_point() +
  geom_smooth(method = "lm")
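
To see the numbers behind the smoothed line, the same kind of model can be fit directly with `lm`. This is a sketch on synthetic data, not the Tucson measurements; the data frame `demo`, the seed, and the simulated trend are all assumptions made for illustration.

```r
# Synthetic data, invented for illustration only
set.seed(42)
demo <- data.frame(year = 1950:2019)
demo$tmin <- 60 + 0.05 * (demo$year - 1950) + rnorm(nrow(demo), sd = 2)

# The model geom_smooth(method = "lm") fits for aes(x = year, y = tmin)
fit <- lm(tmin ~ year, data = demo)

# Intercept and slope (change in temperature per year)
print(coef(fit))

# Fitted values with a 95% confidence interval at the first and last year
new_years <- data.frame(year = c(1950, 2019))
band <- predict(fit, newdata = new_years, interval = "confidence", level = 0.95)
print(band)
```

The `lwr` and `upr` columns correspond to the shaded ribbon that `geom_smooth` draws around the line at its default 95% level.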

We should update our y-axis label using the `ylab` function.

In [None]:
ggplot(data = tucson, mapping = aes(x = year, y = tmin)) +
  geom_point() +
  geom_smooth(method = "lm") +
  ylab("Minimum temperature (F)")

Next, we will modify our code to change what we plot on the y-axis. In this case we want to plot the maximum temperature (`tmax`) on the y-axis. Update the code below to change the values we are plotting on the y-axis. _Hint_: you'll need to change the name of the variable passed to y on the first line, as well as the axis label on the last line.

In [None]:
ggplot(data = tucson, mapping = aes(x = year, y = tmin)) +
  geom_point() +
  geom_smooth(method = "lm") +
  ylab("Minimum temperature (F)")

With `method = "lm"`, `geom_smooth` adds a linear regression line and a confidence interval band. The relationship may not be linear, so try a polynomial relationship by adding `formula = y ~ poly(x, degree = 2)` to the `geom_smooth` call (immediately following the method specification).
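
One way to judge whether the quadratic term is worth keeping is to compare the two models outside of the plot. The sketch below uses synthetic, deliberately curved data; `demo`, `linear_fit`, and `quadratic_fit` are hypothetical names, and the result says nothing about the Tucson data.

```r
# Synthetic data with built-in curvature, invented for illustration only
set.seed(1)
demo <- data.frame(x = 1:50)
demo$y <- 0.02 * (demo$x - 25)^2 + rnorm(50, sd = 1)

# Straight-line model and second-degree polynomial model
linear_fit <- lm(y ~ x, data = demo)
quadratic_fit <- lm(y ~ poly(x, degree = 2), data = demo)

# F-test for whether the added quadratic term improves the fit
print(anova(linear_fit, quadratic_fit))

# Adjusted R-squared for each model
adj_r2 <- c(
  linear = summary(linear_fit)$adj.r.squared,
  quadratic = summary(quadratic_fit)$adj.r.squared
)
print(adj_r2)
```

A small p-value in the `anova` table suggests the curved fit explains the data better than the straight line.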

In [None]:
# Paste your code from above here, and update

To finish off this plot, we want to write it to a PNG file. Paste your plotting code from above into the cell below and run the cell.

In [None]:
# Paste your plotting code here:

# Leave the next line as-is
ggsave(filename = "output/tucson-plot.png")
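
By default `ggsave` writes the last plot displayed, at the size of the current graphics device. As a hedged sketch of a more explicit approach, the plot can be stored in a variable and passed to `ggsave` along with dimensions. The data, the object name `p`, the temporary output folder, and the chosen size are all assumptions for illustration.

```r
library("ggplot2")

# Synthetic data, invented for illustration only
demo <- data.frame(
  year = 2000:2009,
  tmin = c(70, 71, 69, 72, 73, 71, 74, 73, 75, 74)
)

# Store the plot in a variable instead of relying on the last plot shown
p <- ggplot(data = demo, mapping = aes(x = year, y = tmin)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  ylab("Minimum temperature (F)")

# Create the output folder if it does not exist (a temporary folder here)
out_dir <- file.path(tempdir(), "output")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Save with an explicit plot object, size, and resolution
out_file <- file.path(out_dir, "demo-plot.png")
ggsave(filename = out_file, plot = p, width = 6, height = 4, units = "in", dpi = 300)

# Confirm the file was written
print(file.exists(out_file))
```

Setting `width`, `height`, and `dpi` makes the saved image reproducible regardless of how the notebook window happens to be sized.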