# Visualisations in R

```r
## installations
# install.packages("tidyverse")

## libraries
library(tidyverse)
```

# ggplot2
* Explore the structure of a data frame with `str()`

## 1.1 Data:
* Dataset being plotted

## 1.2 Aesthetics: Scales onto which data is mapped
* `x-axis`
* `y-axis`
* `color`
* `fill`
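* `size` $\Rightarrow$ size of points and text (continuous value); since ggplot2 3.4.0 the width of lines is set with `linewidth`
* `label` $\Rightarrow$ text labels beside points (used by `geom_text()` and `geom_label()`)
* `alpha` $\Rightarrow$ transparency (double from 0 to 1)
* `shape` $\Rightarrow$ shape of points (integer from 0 to 25)
* `linewidth` $\Rightarrow$ width of lines (continuous value)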
* `linetype` $\Rightarrow$ `c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")`
* Axis Labels:
    * `+ labs(title="_", x="_", y="_")`
    
* __Modifying Aesthetics:__
    * `position`: Adjustment for overlapping elements $\Rightarrow$ `c("identity", "dodge", "stack", "fill", "jitter", "jitterdodge", "nudge")`
        * `position_*()`: functions giving finer control over a position adjustment (e.g. `position_jitter()`)
    * `scale_*()`: Scale functions for `c(x, y, color, fill, shape, linetype, size)`
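
A minimal sketch of how these pieces combine, assuming the `mpg` dataset bundled with ggplot2. The variable choices, palette and label text are illustrative only.

```r
library(ggplot2)

# Mapped aesthetics go inside aes(); constants (alpha, shape) go outside it
p_aes <- ggplot(mpg, aes(x = displ, y = hwy, color = class, size = cyl)) +
  geom_point(
    alpha = 0.6,   # transparency, 0 to 1
    shape = 16,    # filled circle
    position = position_jitter(width = 0.1, height = 0.1, seed = 1)
  ) +
  scale_color_brewer(palette = "Dark2") +  # a scale_* function for color
  labs(title = "Engine size vs highway mileage",
       x = "Displacement (l)", y = "Highway mpg")

p_aes
```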

## 1.3 Geometries: Visual elements for Data
### 1.3.1 Scatterplots
* `geom_point()`
    * Regular scatterplot
    * For displaying relationship between two continuous variables
    
    
* `geom_jitter()`
    * Adds a **small amount of random variation** to the location of each point, and is a useful way of handling overplotting caused by discreteness in smaller datasets
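
A short comparison of the two geoms, assuming the bundled `mpg` dataset. The jitter widths are arbitrary values chosen for illustration.

```r
library(ggplot2)

# cty and hwy are integers, so many points share the same coordinates
p_point <- ggplot(mpg, aes(cty, hwy)) +
  geom_point()

# geom_jitter() shifts each point by a small random amount
set.seed(42)  # make the random offsets reproducible
p_jitter <- ggplot(mpg, aes(cty, hwy)) +
  geom_jitter(width = 0.3, height = 0.3)

p_point
p_jitter
```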
    

### 1.3.2 Count Overlapping Points
* `geom_count()`
    * A variant of `geom_point()` that counts the number of observations at each point, then maps the count to point area
    * Useful for **discrete data** and overplotting
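
A minimal sketch on the same discrete-valued columns of `mpg` (an assumed example dataset):

```r
library(ggplot2)

# Each distinct (cty, hwy) pair becomes one point whose area reflects
# how many observations fall on it
p_count <- ggplot(mpg, aes(cty, hwy)) +
  geom_count()

p_count
```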
 
### 1.3.3 Heatmap of 2d Bin Counts
* `geom_bin2d()`
    * Divides the plane into rectangles, counts the number of cases in each rectangle, and then maps the number of cases to the rectangle's fill
    * Useful alternative to `geom_point()` in the presence of overplotting
    * For **continuous** x and y values
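
As an illustration, assuming the bundled `diamonds` dataset (about 54,000 rows, enough to overplot badly as a scatterplot). The bin count is an arbitrary choice.

```r
library(ggplot2)

# Count observations in a 30 x 30 grid of rectangles and map count to fill
p_bin2d <- ggplot(diamonds, aes(carat, price)) +
  geom_bin2d(bins = 30)

p_bin2d
```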

### 1.3.4 Hexagonal Heatmap of 2d Bin Counts
* `geom_hex()`
    * Similar to `geom_bin2d()` but with hexagonal mapping
    * Hexagon bins avoid the visual artefacts sometimes generated by the very regular alignment of `geom_bin2d()`
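
The hexagonal version can be sketched the same way. This assumes the separate `hexbin` package is installed, which `geom_hex()` needs when the plot is drawn.

```r
library(ggplot2)

# Same data as a rectangular heatmap would use, binned into hexagons instead
# (drawing this plot requires the hexbin package)
p_hex <- ggplot(diamonds, aes(carat, price)) +
  geom_hex(bins = 30)

if (requireNamespace("hexbin", quietly = TRUE)) {
  print(p_hex)
}
```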

### 1.3.5 Rug Plots in the Margins
* `geom_rug()`
    * A compact visualisation to supplement a 2d display with two 1d marginal distributions
    * Rug plots display individual cases so are best used with smaller datasets
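
A minimal sketch, assuming the small built-in `faithful` dataset (272 rows):

```r
library(ggplot2)

# Scatterplot with one tick per observation along the bottom and left margins
p_rug <- ggplot(faithful, aes(eruptions, waiting)) +
  geom_point() +
  geom_rug(alpha = 0.4)

p_rug
```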
    
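### 1.3.6 Histograms and Bar Charts
* `ggplot(dataset, aes(variable)) + geom_histogram(aes(y = after_stat(density)))`
    * used for a __continuous__ x-axis variable
    * `after_stat(density)` replaces the older `..density..` notation, which is deprecated since ggplot2 3.4.0

* `geom_bar()`
    * used for a __categorical__ x-axis variable; bar height is the number of cases in each category
    
* `geom_col()`
    * similar to `geom_bar()`, but the bar heights are taken directly from a y variable in the data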
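
The three geoms side by side, as a sketch on the bundled `mpg` dataset. `class_means` is a hypothetical helper table built only for this example.

```r
library(ggplot2)

# Continuous x: histogram scaled to density (bar areas sum to 1)
p_hist <- ggplot(mpg, aes(hwy)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 2)

# Categorical x: geom_bar() counts the rows in each class
p_bar <- ggplot(mpg, aes(class)) +
  geom_bar()

# Pre-computed values: geom_col() uses the y values as bar heights
class_means <- aggregate(hwy ~ class, data = mpg, FUN = mean)
p_col <- ggplot(class_means, aes(class, hwy)) +
  geom_col()

p_hist
p_bar
p_col
```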

### 1.3.7 Line Plots
* `geom_smooth()`

* `geom_area()`
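
A brief sketch of both geoms, assuming the bundled `mpg` and `economics` datasets. The linear model and the fill color are illustrative choices, not defaults.

```r
library(ggplot2)

# geom_smooth(): fitted trend line with a confidence band over the raw points
p_smooth <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)

# geom_area(): a line plot with the region down to the x-axis filled in
p_area <- ggplot(economics, aes(date, unemploy)) +
  geom_area(fill = "steelblue", alpha = 0.5)

p_smooth
p_area
```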

### 1.3.8 Explanatory Plots
* `geom_segment()`
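    * `xend`: x coordinate where the segment ends (it starts at `x`)
    * `yend`: y coordinate where the segment ends (it starts at `y`)
* `annotate()`: adds one-off text, arrows, and other marks that are not mapped from the data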
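
A minimal sketch of an annotated plot. The coordinates, the label text and the `arrow_df` data frame are hypothetical values picked for this example.

```r
library(ggplot2)

# One row per arrow: start point (x, y) and end point (xend, yend)
arrow_df <- data.frame(x = 3.2, y = 40, xend = 2.05, yend = 43.5)

p_note <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(alpha = 0.5) +
  geom_segment(
    data = arrow_df,
    aes(x = x, y = y, xend = xend, yend = yend),
    inherit.aes = FALSE,                     # ignore the plot-level mapping
    arrow = arrow(length = unit(0.2, "cm"))
  ) +
  annotate("text", x = 3.3, y = 40, label = "Highest mileage", hjust = 0)

p_note
```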

## 1.4 Themes: All non-data ink
* `theme()`:
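    * `legend.position`: controls the position of the legend $\Rightarrow$ one of `'top'`, `'bottom'`, `'left'`, `'right'`, `'none'`, or a numeric pair such as `c(0,0)` for bottom-left and `c(1,1)` for top-right (from ggplot2 3.5.0, use `legend.position = 'inside'` together with `legend.position.inside = c(0,0)` instead)
    * `element_rect()`: styles rectangular elements such as backgrounds and borders
    * `element_text()`: styles text elements such as titles and axis labels
    * `element_line()`: styles line elements such as axis lines and gridlines
    * `element_blank()`: removes a plot element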
    
* Using Custom Themes:
    * Built-in Themes:
        * `theme_bw()` : useful when using transparency
        * `theme_classic()` : more traditional
        * `theme_void()` : removes everything but data
    * `ggthemes` package (installed separately):
        * `theme_fivethirtyeight()`
        * `theme_tufte()`
        * `theme_wsj()` : Wall Street Journal theme
    * `theme_set(*)`: set * as default theme
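
A sketch combining a built-in theme with individual `theme()` tweaks, assuming the bundled `mpg` dataset. The specific colors and sizes are arbitrary.

```r
library(ggplot2)

p_theme <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point() +
  labs(title = "Themed scatterplot") +
  theme_bw() +                                           # complete built-in theme
  theme(
    legend.position  = "bottom",
    plot.title       = element_text(face = "bold", size = 14),
    panel.grid.minor = element_blank(),                  # remove minor gridlines
    panel.border     = element_rect(color = "grey40", fill = NA),
    axis.ticks       = element_line(color = "grey40")
  )

p_theme

# theme_set() changes the default for later plots and returns the old theme
old_theme <- theme_set(theme_classic())
theme_set(old_theme)  # restore the previous default
```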

## 1.5 Statistics: Representation of data to aid Understanding

## 1.6 Coordinates: Space on which data is plotted

## 1.7 Facets: Plotting small multiples