# introduction to R using JupyterLab

## overview

Programming is a powerful tool that allows us to do complicated calculations and analysis, visualize data, and automate workflows to ensure consistency, accuracy, and reproducibility in our research. 

In this exercise, you will get a basic introduction to the __R__ programming language, including variables and objects, data types and data structures, and how to load data from external files. Over the next few sessions, we will build on these ideas, using some real example data, to get more of an idea of how to use __R__ in our research.

This is an _interactive_ notebook - in between bits of text, there are interactive cells like the ones below, which you can use to execute snippets of __R__ code. To run a cell, select it by clicking on it, then press __CTRL__ + __ENTER__ (or click the "play" button at the top of this tab).

## objectives

- learn and gain experience with some of the basic elements of R and programming
- learn how to use the interactive command prompt
- practice planning out a script

## objects and assignment

We'll start by creating a new __object__, `foo`, and _assigning_ it a __value__ of `"hello, world!"`. Then, we use the built-in `print()` function (more on functions later) to see the value of that object. Go ahead and run the cell to get started.

In [None]:
foo <- "hello, world!" # assign an object using <-
print(foo) # print the object to the terminal

You should notice that the cell has changed. First, the square brackets (`[ ]`) have a number inside of them (`[1]`), and you can also see the output below the cell:

`[1] "hello, world!"`

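The `print()` function allows us to print messages and information to the screen. Note that the `[1]` at the start of the output line is not the cell number: it is the index of the first value shown on that line. We'll see a number of other uses of `print()` as we go, but you can also read more about it in the [documentation](https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/print).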

One important thing to remember is that the _name_ of an __object__ is _case-sensitive_ (meaning that _foo_ is different from _Foo_):

In [None]:
print(Foo) # this won't work, because we haven't created an object called Foo yet

We'll see more examples of error messages later (and how to interpret them), but hopefully the message:

`Error in print(Foo): object 'Foo' not found`

is clear enough. Because we were expecting this error message, we can ignore it and move on.
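
If you want to check whether a name refers to an object without triggering an error, one option is the base function `exists()`. The short sketch below assumes that nothing called `Foo` has been created in your session:

```R
foo <- "hello, world!"  # the same object as in the cells above

exists("foo")  # TRUE: an object called foo exists
exists("Foo")  # FALSE, assuming you have not created an object called Foo
```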

## data types

So, we've created our first __object__, `foo`. But what kind of object is `foo`? To find out, we can use the `typeof()` function ([documentation](https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/typeof)):

In [None]:
typeof(foo) # use the typeof() function to find the type of an object

So `foo` is an object of type __character__, meaning that it is _text_. 

In general, __R__ has the following basic data types:

- __character__, for text objects
- __numeric__, for numbers. These can be divided into the following:
    - __double__, for floating-point (real) values
    - __integer__, for integer values
- __logical__, for boolean (`TRUE`/`FALSE`) objects
- __complex__, for complex numbers
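
As a quick illustration of the list above, the following sketch creates one object of each type and checks it with `typeof()`. The object names and values are made up for this example:

```R
campus <- "Belfast"   # text in quotes is character
height <- 3.14        # numbers are double by default
count <- 42L          # the L suffix makes an integer
is_open <- TRUE       # logical values are written TRUE and FALSE
wave <- 2+3i          # complex numbers use the i suffix

typeof(campus)   # "character"
typeof(height)   # "double"
typeof(count)    # "integer"
typeof(is_open)  # "logical"
typeof(wave)     # "complex"

# both double and integer objects count as numeric
is.numeric(height)  # TRUE
is.numeric(count)   # TRUE
```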

Let's look at an example of a __numeric__ object:

In [None]:
x <- 1 # assign a value of 1 to the object x
typeof(x) # should be integer, right?

That's interesting - even though we assigned an integer value to `x`, __R__ has created an object of type __double__. This is because, by default, numeric values in __R__ are of type double. To create an object with an __integer__ value, we append an `L` to the number (alternatively, we can _coerce_ to the integer type using the `as.integer()` function):

In [None]:
x <- 1L
typeof(x)
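
The `as.integer()` alternative mentioned above might look like the following. The object names `y` and `w` are only used for this example:

```R
y <- as.integer(1)  # coerce the double value 1 to an integer
typeof(y)           # "integer"

# coercion drops the fractional part (it does not round)
w <- as.integer(2.9)
w                   # 2
```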

We'll come back to data types more as we work through the example exercises.

## data structures

Most of the time, we'll want to use groups of data, or _data structures_, rather than individual values. Just like with data types, __R__ has a number of different data structures, ranging from one-dimensional to multi-dimensional structures.  

### vectors

A __vector__ is the most basic data structure in __R__ - it's a one-dimensional sequence of values that all share a single data type.

To assign a vector explicitly, we can use the function `c()` (short for "combine"):

In [None]:
campuses <- c('Belfast', 'Coleraine', 'Jordanstown', 'Magee')
print(campuses)
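
Because a vector holds a single data type, `c()` converts its arguments to a common type when they differ. A small sketch, using made-up values:

```R
temperatures <- c(12.5, 14, 9.8)  # hypothetical example values
typeof(temperatures)              # "double"
length(temperatures)              # 3 elements

# mixing types: everything is converted to the most general type present
mixed <- c(1, "two", TRUE)
typeof(mixed)  # "character"
mixed          # every element is now text
```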

### indexing

To access the individual elements of a vector, we need to use the __index__ of that element, along with square brackets (`[` and `]`). 

Take the vector from the example above:

```R
campuses <- c('Belfast', 'Coleraine', 'Jordanstown', 'Magee')
```

In __R__, indexing starts at 1, so the first element of a vector has an index of 1. "Coleraine" is the second element in the `campuses` vector, which means that it has an index of 2. We can check this with the following cell:

In [None]:
campuses[2] # return the second element from the 'campuses' vector
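
Square brackets also accept more than a single number. The sketch below shows a few common patterns using the same `campuses` vector:

```R
campuses <- c('Belfast', 'Coleraine', 'Jordanstown', 'Magee')

campuses[1]                 # the first element: "Belfast"
campuses[2:3]               # a range: "Coleraine" "Jordanstown"
campuses[c(1, 4)]           # several chosen elements: "Belfast" "Magee"
campuses[-1]                # a negative index drops that element
campuses[length(campuses)]  # the last element: "Magee"
```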

### factors

A __factor__ stores categorical data: values drawn from a fixed set of categories, called _levels_.
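
A minimal sketch of creating a factor with the `factor()` function. The survey responses here are invented for illustration:

```R
responses <- c('agree', 'disagree', 'agree', 'neutral', 'agree')  # hypothetical data
responses_factor <- factor(responses)

levels(responses_factor)  # the categories, sorted alphabetically by default
table(responses_factor)   # how many times each category occurs
typeof(responses_factor)  # "integer": each category is stored as an integer code
```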

### lists

A __list__ can mix data types: unlike a vector, its elements do not all have to be of the same type.
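
A minimal sketch of a list that mixes types, created with the `list()` function. The element names and values are made up for this example:

```R
record <- list(label = "sample A", count = 3L, checked = TRUE)  # hypothetical values

record$label           # access an element by name with $
record[["count"]]      # or with double square brackets
typeof(record$label)   # "character"
typeof(record$count)   # "integer": each element keeps its own type
length(record)         # 3 elements
```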

### data frames
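
A data frame is a table in which each column is a vector of the same length, and different columns can have different types. As a rough sketch, with invented column names and values:

```R
samples <- data.frame(
  site = c('A', 'B', 'C'),           # hypothetical site labels
  depth_m = c(1.5, 2.0, 3.2),        # hypothetical measurements
  surveyed = c(TRUE, FALSE, TRUE)
)

str(samples)     # show the structure: column names, types, first values
nrow(samples)    # number of rows: 3
ncol(samples)    # number of columns: 3
samples$depth_m  # access a single column as a vector
samples[2, ]     # access the second row
```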

## basic arithmetic

- addition/subtraction/multiplication/division
- exponentiation
- order of operations
- logical operators?


In [None]:
exp(1)
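
Note that `exp(1)` computes _e_ raised to the power 1 (about 2.718); to raise an arbitrary number to a power, use the `^` operator. The sketch below runs through the operators listed above, using arbitrary example values:

```R
a <- 7
b <- 2

a + b  # addition: 9
a - b  # subtraction: 5
a * b  # multiplication: 14
a / b  # division: 3.5
a ^ b  # exponentiation: 49

# order of operations: multiplication before addition, unless brackets say otherwise
2 + 3 * 4    # 14
(2 + 3) * 4  # 20
-2 ^ 2       # -4, because ^ is applied before the minus sign

# comparisons return logical values, which can be combined with & (and), | (or), ! (not)
a > b              # TRUE
a == b             # FALSE
(a > b) & (a < 5)  # FALSE
(a > b) | (a < 5)  # TRUE
!(a > b)           # FALSE
```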

### a note about "recycling"
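
When an arithmetic operation combines two vectors of different lengths, __R__ reuses ("recycles") the elements of the shorter one until it matches the length of the longer one. If the longer length is not a multiple of the shorter, __R__ still recycles but issues a warning. A small sketch with arbitrary values:

```R
values <- c(1, 2, 3, 4)

values * 10            # the single value 10 is reused for every element: 10 20 30 40
values + c(100, 200)   # 100 and 200 are reused in turn: 101 202 103 204
```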

## functions

- example: calculating the area of a circle (also introduces constant pi)
    - mention other constants
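
One way the circle example might look is shown below. The function name `circle_area` and its argument `radius` are chosen for this sketch; `pi` is a constant built into __R__, alongside others such as `letters`, `LETTERS`, `month.name`, and `month.abb`:

```R
# define a function that takes a radius and returns the area of a circle
circle_area <- function(radius) {
  pi * radius^2  # the value of the last expression is returned
}

circle_area(1)           # 3.141593
circle_area(c(1, 2, 3))  # also works element-by-element on a vector

month.name[1]            # another built-in constant: "January"
```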

### built-in functions
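
__R__ comes with many functions already defined, such as `print()` and `typeof()`, which we have used above. A few more that are often useful, sketched with made-up values:

```R
depths <- c(1.5, 2.0, 3.2, 4.1)  # hypothetical measurements

length(depths)  # number of elements: 4
sum(depths)     # total: 10.8
mean(depths)    # average: 2.7
min(depths)     # smallest value: 1.5
max(depths)     # largest value: 4.1

round(sqrt(2), digits = 2)  # functions can be nested: 1.41
```

To read the documentation for a built-in function, type a question mark followed by its name (for example, `?mean`).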

## controlling flow

- if, else
- while
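
A minimal sketch of both constructs, using made-up object names and values. An `if`/`else` statement runs one block of code or another depending on a condition, while a `while` loop repeats a block for as long as its condition is `TRUE`:

```R
temperature <- 18  # hypothetical value

# if / else: choose between two branches
if (temperature > 20) {
  description <- "warm"
} else {
  description <- "cool"
}
print(description)  # "cool", because 18 is not greater than 20

# while: add up the numbers 1 to 5
counter <- 1
total <- 0
while (counter <= 5) {
  total <- total + counter
  counter <- counter + 1  # without this line, the loop would never end
}
print(total)  # 15
```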


## loading and viewing datasets
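
The general pattern for loading a file is to read it into a data frame, then inspect it. The sketch below uses a CSV file and base __R__ only, so that it runs without extra packages; it first writes a tiny, made-up table to a temporary file to stand in for a real dataset. Reading SPSS (`.sav`) files needs an add-on package (for example, `read_sav()` from __haven__ or `read.spss()` from __foreign__), which is not shown here.

```R
# create a small, made-up CSV file so that this example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("site,depth_m", "A,1.5", "B,2.0", "C,3.2"), csv_path)

survey <- read.csv(csv_path)  # load the file into a data frame

head(survey)     # view the first few rows
str(survey)      # column names and data types
summary(survey)  # a quick summary of each column
```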

topics outline:

- objects and assignment
- data types
- data structures
- basic arithmetic operations
- functions
- controlling flow
- loading data from files (SPSS example)

## recap

In this lesson, we have introduced the following concepts:

- variables, objects, values, and assignment
- data __types__
- data structures:
    - vectors
    - factors
    - lists
    - data frames
- indexing
- arithmetic operations
- functions
- flow control using logic
- loading and viewing data from a file