You need to communicate your understanding to others. Your audience will likely not share your background knowledge and will not be deeply invested in the data. To help others quickly build up a good mental model of the data, you will need to invest considerable effort in making your plots as self-explanatory as possible.

# Label
The easiest place to start when turning an exploratory graphic into an expository graphic is with good labels. You can add labels with the `labs()` function.
```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE) +
  labs(title = "Fuel efficiency generally decreases with engine size")
```

If you need to add more text, there are two other useful labels that you can use: 

* `subtitle` adds additional detail in a smaller font beneath the title.

* `caption` adds text at the bottom right of the plot, often used to describe the source of the data.
```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE) +
  labs(
    title = "Fuel efficiency generally decreases with engine size",
    subtitle = "Two seaters (sports cars) are an exception because of their light weight",
    caption = "Data from fueleconomy.gov"
  )
```

***
You can also use `labs()` to replace the axis and legend titles. It’s usually a good idea to replace short variable names with more detailed descriptions, and to include the units.

```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE) +
  labs(
    x = "Engine displacement (L)",
    y = "Highway fuel economy (mpg)",
    color = "Car type"
  )
```

*** 
It’s possible to use mathematical expressions instead of text strings. Replace the quoted string with a call to `quote()`, and read about the available options in `?plotmath`.

```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE) +
  labs(
    x = quote(sum(x[i] ^ 2, i == 1, n)),
    y = quote(alpha + beta + frac(delta, theta))
  )
```

## Your turn

* Create one plot using `diamonds` or `mpg` data with customized `title`, `subtitle`, `x`, `y`, and `color` labels.

# Annotations
In addition to labelling major components of your plot, it’s often useful to label individual observations or groups of observations.

You can use `geom_text()` or `geom_label()`. They are similar to `geom_point()`, but have an additional aesthetic: `label`.  `geom_text()` adds text directly to the plot; `geom_label()` draws a rectangle behind the text, making it easier to read.

```r
# tibble() comes from the tibble package (loaded as part of the tidyverse)
mpg_label <- tibble(
  displ = 4,
  hwy = 30,
  label = "a label at (4,30)"
)

ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_label(data = mpg_label, aes(label = label), vjust = "top", hjust = "right")
```


If you want to place the text exactly on the borders of the plot, you can use `+Inf` and `-Inf` as positions.
Use `hjust` and `vjust` to control the alignment of the label relative to that position.
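
As a minimal sketch of this idea, the block below pins a label to the top-right corner of the plot. The data frame name `corner_label` and the label text are made up for illustration; `Inf` for both `displ` and `hwy` places the anchor at the upper-right edge, and the `hjust`/`vjust` values keep the text inside the plotting area.

```r
library(ggplot2)

# One-row data frame holding the label and its position (hypothetical name).
# Inf snaps the anchor point to the upper and right edges of the panel.
corner_label <- data.frame(
  displ = Inf,
  hwy = Inf,
  label = "Top-right corner"
)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_text(
    data = corner_label,
    aes(label = label),
    vjust = "top",   # text hangs below the anchor point
    hjust = "right"  # text extends to the left of the anchor point
  )
p
```

Swapping `Inf` for `-Inf`, and `"top"`/`"right"` for `"bottom"`/`"left"`, moves the label to the other corners.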

You have many other geoms in ggplot2 available to help annotate your plot. A few ideas:

* Use `geom_hline()` and `geom_vline()` to add reference lines. 

* Use `geom_rect()` to draw a rectangle around points of interest. The boundaries of the rectangle are defined by aesthetics `xmin`, `xmax`, `ymin`, `ymax`.

* Use `geom_segment()` with the `arrow` argument to draw attention to a point with an arrow. Use the aesthetics `x` and `y` to define the starting location, and `xend` and `yend` to define the end location.
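
The three ideas above can be combined in one plot. This is only a sketch: the data frames `box` and `pointer` are hypothetical helpers, and their coordinates were picked by hand to highlight the large-engine region of `mpg`. Each annotation layer gets its own small data frame and `inherit.aes = FALSE`, so that it is drawn once and does not look for `displ` and `hwy`.

```r
library(ggplot2)

# Hand-picked coordinates for illustration only
box <- data.frame(xmin = 5.5, xmax = 7.2, ymin = 22, ymax = 27)
pointer <- data.frame(x = 4.5, y = 35, xend = 5.6, yend = 27.3)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  # Reference line at the mean highway mileage
  geom_hline(yintercept = mean(mpg$hwy), linetype = "dashed") +
  # Rectangle around the region of interest
  geom_rect(
    data = box,
    aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
    inherit.aes = FALSE,
    fill = NA,
    color = "red"
  ) +
  # Arrow from (x, y) to (xend, yend), pointing at the rectangle
  geom_segment(
    data = pointer,
    aes(x = x, y = y, xend = xend, yend = yend),
    inherit.aes = FALSE,
    arrow = arrow(length = unit(0.25, "cm"))
  )
p
```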



## Your turn
* Use `geom_text()` with infinite positions to place text at the four corners of the plot.

* How do labels with `geom_text()` interact with faceting? How can you add a label to a single facet? How can you put a different label in each facet? (Hint: think about the underlying data.)

* Try `geom_segment()` to add an arrow to your plot.

# Scales
Scales control the mapping from data values to things that you can perceive. Normally, ggplot2 adds scales for you automatically; the code below spells out the default scales it would add behind the scenes.
```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  scale_x_continuous() +
  scale_y_continuous() +
  scale_color_discrete()
```
 
 Note the naming scheme for scales: `scale_` followed by the name of the aesthetic, then `_`, then the type of the scale: continuous, discrete, datetime, or date.
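
As an illustration of the naming scheme, a date-valued `x` aesthetic pairs with `scale_x_date()`. The sketch below assumes the `economics` dataset that ships with ggplot2, whose `date` column has class `Date`; the break spacing and label format are arbitrary choices.

```r
library(ggplot2)

p <- ggplot(economics, aes(date, unemploy)) +
  geom_line() +
  # scale_ + aesthetic (x) + _ + type (date)
  scale_x_date(date_breaks = "10 years", date_labels = "%Y") +
  # scale_ + aesthetic (y) + _ + type (continuous)
  scale_y_continuous()
p
```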

## Axis ticks and legend keys
There are two primary arguments that affect the appearance of the ticks on the axes and the keys on the legend: `breaks` and `labels`.

`breaks` controls the position of the ticks, or the values associated with the keys. `labels` controls the text label associated with each tick/key.

```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  scale_y_continuous(breaks = seq(15, 40, by = 5))
```

You can use `labels` in the same way (a character vector the same length as `breaks`), but you can also set it to `NULL` to suppress the labels altogether.

```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  scale_x_continuous(labels = NULL) +
  scale_y_continuous(labels = NULL)
```
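
For the character-vector form of `labels`, a minimal sketch might look like the following. The break positions and label text are chosen only for illustration; the one requirement is that `labels` has one entry per break.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  scale_y_continuous(
    breaks = c(20, 30, 40),                   # where the ticks go
    labels = c("20 mpg", "30 mpg", "40 mpg")  # what each tick says
  )
p
```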


# Legend layout

To control the overall position of the legend, you need to use a `theme()` setting. In brief, themes control the non-data parts of the plot.

```r
base <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class))

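# Move the legend to the left of the plot
base + theme(legend.position = "left")
# Other theme settings work the same way, e.g. rotating the x-axis tick labels
base + theme(axis.text.x = element_text(angle = 90))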
```

You can also use `legend.position = "none"` to suppress the display of the legend.

To control the display of individual legends, use `guides()` along with `guide_legend()` or `guide_colorbar()`.

```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE) +
  theme(legend.position = "bottom") +
  guides(color = guide_legend(nrow = 1))

```

## Your turn


* Starting from the plot below, move the legend to the bottom (in one row), and make the `y` scale run from 10 to 50 and the `x` scale from 1 to 10. (Check the `limits` argument if `breaks` alone does not work.)

```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class))
```

# Zooming

To zoom in on a region of the plot, it’s generally best to use `coord_cartesian()`. Compare the following two plots.

```r
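# Zoom with the coordinate system: all data are kept, so the smoother
# is fitted to every point and only the view is cropped
ggplot(mpg, mapping = aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth() +
  coord_cartesian(xlim = c(5, 7), ylim = c(10, 30))

# Set scale limits instead: points outside the limits are dropped
# (with a warning) before the smoother is fitted, so the curve changes
ggplot(mpg, mapping = aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth() +
  scale_x_continuous(limits = c(5, 7)) +
  scale_y_continuous(limits = c(10, 30))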
```

# Themes

You can customise the non-data elements of your plot with a theme.
```r
ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE) +
  theme_bw()
```

<img src="./figures/visualization/visualization-themes.png" alt="ds" style="width: 750px;"/>

It’s also possible to control individual components of each theme, like the size and color of the font used for the y axis.
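
A minimal sketch of such a tweak, assuming you want larger, colored tick labels on the y axis (the size and color values are arbitrary):

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  theme_bw() +
  # Override one component of the complete theme added above
  theme(axis.text.y = element_text(size = 14, color = "steelblue"))
p
```

The `theme()` call comes after `theme_bw()` on purpose: adding a complete theme later would replace the individual setting.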

# Saving your plots
There are two main ways to get your plots out of R and into your final write-up: `ggsave()` and knitr. `ggsave()` will save the most recent plot to disk:
```r
ggplot(mpg, aes(displ, hwy)) + geom_point()
ggsave("my-plot.pdf")

```

If you don’t specify the `width` and `height`, they will be taken from the dimensions of the current plotting device. For reproducible code, you’ll want to specify them.
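
A sketch of a more reproducible call is shown below. The dimensions are arbitrary, and passing the plot object through `plot =` is an added precaution so the result does not depend on which plot happened to be drawn last.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# Fixed size in inches, independent of the current plotting device
ggsave("my-plot.pdf", plot = p, width = 6, height = 4, units = "in")
```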



# Summary 

## How ggplot2 builds a graph

<img src="./figures/visualization/flowchart.png" alt="ds" style="width: 1000px;"/>


## The layered grammar of graphics
```r
ggplot(data = <DATA>) + 
  <GEOM_FUNCTION>(
     mapping = aes(<MAPPINGS>),
     stat = <STAT>, 
     position = <POSITION>
  ) +
  <COORDINATE_FUNCTION> +
  <FACET_FUNCTION>
```



# Learning more

The book *ggplot2: Elegant Graphics for Data Analysis* goes into much more depth about the underlying theory, and has many more examples of how to combine the individual pieces to solve practical problems. You can find the source code at <https://github.com/hadley/ggplot2-book>.

Another great resource is the ggplot2 extensions gallery <https://exts.ggplot2.tidyverse.org/gallery/>. This site lists many of the packages that extend ggplot2 with new geoms and scales. 