# <center>WORKING WITH TABLE DATA WITH R - REFRESHER</center>
<img src="../elem/caldiss_symbol_square.png" width="200">


<i><center>Kristian Gade Kjelmann</center></i>
<i><center>August 20th 2020</center></i>

# How to work with the R language

Working with R means working with objects and functions.

*Objects* are containers of information and can be pretty much anything.

An object is always of a certain *class*, denoting the type of information the object contains (a numerical value, a string/text value, a dataset, a list of values/vector etc.).

*Functions* are used to manipulate objects and produce output. Functions take *arguments*: the inputs and options that control what the function does.

**Assign objects**

Assign objects using `<-`.

In [1]:
a_number <- 42     # assign the number '42' to object 'a_number'
a_word <- "hello"  # assign the word 'hello' to object 'a_word'

**Using a function**

Functions are used by writing the function name followed by the arguments in `()`. Multiple arguments are separated by `,`.

In [4]:
toupper(a_word)
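
As a small illustration of passing more than one argument, the sketch below uses the base R functions `paste()` and `round()`. The object names `greeting` and `rounded` are made up for this example.

```r
a_word <- "hello"

# One argument: the object to transform
toupper(a_word)

# Two unnamed arguments plus the named argument 'sep'
greeting <- paste(a_word, "world", sep = " ")
print(greeting)

# The named argument 'digits' sets the number of decimals to keep
rounded <- round(3.14159, digits = 2)
print(rounded)
```

Naming an argument (as in `sep = " "` or `digits = 2`) makes the call easier to read and means the order of the arguments no longer matters.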

Functions do not change the object they are applied to. To keep the output of a function, assign it to a new object or back to the same object:

In [5]:
print(a_word)

[1] "hello"


In [6]:
another_word <- toupper(a_word)
print(another_word)

[1] "HELLO"
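
The output can also be stored back in the same object, which overwrites the old value. A minimal sketch, using a hypothetical object `shout_word` so that `a_word` stays unchanged:

```r
shout_word <- "hello"

# Assign the output back to the same object; the old value is replaced
shout_word <- toupper(shout_word)
print(shout_word)
```

Overwriting is convenient, but the original value is gone afterwards, so a new object is the safer choice when the original is still needed.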


**Check class**

The class of an object can be checked using the function `class()`.

In [2]:
class(a_number)    # what class is a_number?
class(a_word)      # what class is a_word?

If an object is stored as the wrong class (such as a number stored as text), it can be converted using a function like `as.numeric()`:

In [7]:
another_number <- "42" 
class(another_number)   # 'another_number' is a character class (text)

another_number <- as.numeric(another_number)
class(another_number)   # 'another_number' is now a numeric class
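
Conversion only works when the text can actually be read as a number. The sketch below shows what happens otherwise. The object names are invented for illustration, and `suppressWarnings()` is used only to keep the output quiet.

```r
not_a_number <- "forty-two"

# as.numeric() cannot parse this text; it returns NA and would normally
# warn that NAs were introduced by coercion
converted <- suppressWarnings(as.numeric(not_a_number))
is.na(converted)

# Converting the other way, from number to text, uses as.character()
back_to_text <- as.character(42)
class(back_to_text)
```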

**Logical/boolean values**

A frequently occurring class in R is the logical class. A logical value is either `TRUE` or `FALSE`.
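
Logical values are typically the result of a comparison. A short sketch with illustrative object names:

```r
a_number <- 42

is_large <- a_number > 40        # comparison returns a logical
is_hello <- "hello" == "HELLO"   # text comparison is case-sensitive

class(is_large)

# Logicals can be combined with & (and) and | (or)
both <- is_large & is_hello
either <- is_large | is_hello

# TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE values
n_true <- sum(c(is_large, is_hello, either))
```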

**NA**

`NA` ("not available") is how R represents a missing value.
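
Missing values affect most calculations, so it helps to know how to detect and skip them. A minimal sketch with a made-up vector `ages`:

```r
ages <- c(23, NA, 41)

# is.na() returns a logical for each element
missing_flags <- is.na(ages)

# Most summary functions return NA if any value is missing
mean(ages)

# na.rm = TRUE drops the missing values before calculating
mean_age <- mean(ages, na.rm = TRUE)
print(mean_age)
```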

# Data structures in R

- Vectors

- Data frames
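
A minimal sketch of the two structures, using invented example data. A vector holds values of a single class, and a data frame is a table whose columns are vectors of equal length.

```r
# Vectors are created with c()
ages <- c(23, 35, 41)
first_names <- c("Ann", "Bo", "Cy")

length(ages)   # number of elements
ages[2]        # second element
ages * 2       # operations apply to every element

# A data frame combines vectors of equal length as columns
people <- data.frame(name = first_names, age = ages)

people$age     # a single column is itself a vector
nrow(people)   # number of rows
ncol(people)   # number of columns
```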

# Packages

- Installing 
- Loading
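
The two steps can be sketched as follows, with `dplyr` chosen as an arbitrary example package. Installing is done once per machine and needs an internet connection, so that line is commented out here. Loading is done in every session that uses the package.

```r
# Install once (uncomment to run):
# install.packages("dplyr")

# Check whether the package is installed before loading it
pkg_available <- requireNamespace("dplyr", quietly = TRUE)

# Load the package for the current session
if (pkg_available) {
  library(dplyr)
}
```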

# (Re-)Introducing tidyverse

- `readr`
- `dplyr`
- `lubridate`
- `stringr`
- etc.

# Reading and inspecting data in R

- `read_csv()`
- `head()`
- `summary()`
- `colnames()`
- Subsetting
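
The functions listed above can be sketched on a toy dataset. The file contents, the object name `survey` and the column names are all made up for illustration. The data is written to a temporary file so that the example does not depend on any existing file.

```r
library(readr)

# Hypothetical toy data written to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,birth_year,smoker",
    "1,1975,yes",
    "2,1990,no",
    "3,1962,yes"),
  csv_path
)

# Read the file into a data frame (a tibble)
survey <- read_csv(csv_path)

head(survey)       # first rows
summary(survey)    # summary statistics per column
colnames(survey)   # column names

# Subsetting
survey$birth_year                 # one column as a vector
survey[1:2, c("id", "smoker")]    # rows 1 to 2, two named columns
```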

# Data wrangling with dplyr

- `select()`
- `filter()`
- `arrange()`
- `drop_na()` (from `tidyr`)
- Using the pipe `%>%`
- `mutate()` (recoding and new variables) 
    - `recode()` (single values)
    - `if_else()` (logicals)
    - `case_when()` (several logicals)
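
The verbs above can be chained with the pipe, as sketched below on invented data. The object `people` and its columns are hypothetical. Note that `drop_na()` comes from `tidyr`, so that package is loaded as well.

```r
library(dplyr)
library(tidyr)   # provides drop_na()

# Hypothetical example data
people <- tibble(
  name   = c("Ann", "Bo", "Cy", "Di"),
  age    = c(34, 51, NA, 45),
  smokes = c("y", "n", "y", "y")
)

result <- people %>%
  drop_na(age) %>%                 # remove rows with missing age
  filter(age > 30) %>%             # keep rows matching a condition
  mutate(
    # recode(): replace single values
    smokes_label = recode(smokes, "y" = "yes", "n" = "no"),
    # if_else(): one logical test
    over_40 = if_else(age > 40, TRUE, FALSE),
    # case_when(): several logical tests, checked in order
    age_group = case_when(
      age < 40 ~ "under 40",
      age < 50 ~ "40-49",
      TRUE     ~ "50+"
    )
  ) %>%
  arrange(desc(age)) %>%           # sort, oldest first
  select(name, age, smokes_label, over_40, age_group)   # keep chosen columns

print(result)
```

The pipe passes the result of each step on as the first argument of the next, so the code reads from top to bottom in the order the steps are applied.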

# Categorical variables in R (factors)
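
A minimal sketch of creating a factor with base R, using a made-up `education` variable. A factor stores a categorical variable as a fixed set of levels. The order of the levels is set explicitly here, because the default order is alphabetical.

```r
education <- c("high", "low", "medium", "low")

# Convert text to a factor with an explicit level order
education_fct <- factor(education, levels = c("low", "medium", "high"))

levels(education_fct)       # the categories, in the order given
table(education_fct)        # counts per category
as.integer(education_fct)   # the underlying integer codes
```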

# Refresher exercise

- Read data
- Inspect data
- Create an age variable
- Create a smoker dummy variable
- Create a subset: Smokers over 40