# Basics of R programming

<div class="alert alert-info">

### This is a Jupyter notebook

You can learn all about the Jupyter interface [here](https://realpython.com/jupyter-notebook-introduction/).

Some basics: 
Jupyter has two modes: command and edit. When you click a cell, you enter ```edit mode``` and you can edit its contents. To exit ```edit mode```, you press ```ESC```.

* Cells can be either code cells or text (or markdown) cells
* To run cells, press Shift+Enter
* To turn a cell into a text cell, press `m` (markdown) in command mode
* To create a new cell use `b` (new cell below) or `a` (above) in command mode

</div>

## Assignment and arrays

### Assignment
Assigning in **R** can be done in several ways:

* using `<-` or `=`
* using the `assign(variable, value)` function

When you use the `assign()` function, the variable name must be enclosed in quotes, as if it were a string.

### Arrays
You can create sequences of numbers (called vectors in **R**) in several ways:

* using the `:` notation, e.g., `1:6`
* using the `seq()` function
* using the `c()` (concatenate) function

In [89]:
x <- 2:7
x = seq(from=2,to=7,by=1)
assign("x",2:7)
x = c(2,3,4,5,6,7)

In [90]:
x
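
The four assignments above produce the same values, though not necessarily the same storage type. A small sketch of how one might check this, using the made-up names `a` and `b` for illustration:

```r
# Two of the ways shown above to build the sequence 2, 3, ..., 7
a <- 2:7                  # the `:` operator yields integers
b <- c(2, 3, 4, 5, 6, 7)  # numeric literals are stored as doubles

typeof(a)        # "integer"
typeof(b)        # "double"
all.equal(a, b)  # TRUE: the values match
identical(a, b)  # FALSE: the storage types differ
```

For most everyday work the difference does not matter, but it explains why `identical()` can report `FALSE` for vectors that print the same.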

## Selecting data from an array

You can use logic rules on the values of the array to select them.
For example, `x>5` will return an array of `TRUE` and `FALSE` values, depending on whether the value of `x` at each position fulfils the condition. Using this array as an index, you can get the values of `x` that fulfil the condition.

In [18]:
x
x>5 # vector of TRUE or FALSE
x[x>5] # the values that fulfil the condition

Alternatively, you can use the `which()` function, which will return the **positions** that fulfil the condition.

In [25]:
pos = which(x>5)
pos

x[pos]
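
Conditions can also be combined. The sketch below assumes the same values of `x` as above and uses the made-up names `middle` and `edges`:

```r
x <- c(2, 3, 4, 5, 6, 7)

# & is element-wise AND, | is element-wise OR
middle <- x[x > 3 & x < 7]   # values strictly between 3 and 7
edges  <- x[x < 3 | x > 6]   # values at either end

middle   # 4 5 6
edges    # 2 7

# which() gives the matching positions instead of the values
which(x > 3 & x < 7)   # 3 4 5
```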

You can address data in arrays by their index, which starts at 1 in **R**:

In [30]:
x[1]
x[2:3]

**However**, negative indices actually remove that position from the returned array.

In [103]:
x[-4]
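
A short sketch of a few more negative-index patterns, assuming the same `x` as above:

```r
x <- c(2, 3, 4, 5, 6, 7)

x[-1]           # 3 4 5 6 7   (first element dropped)
x[-c(1, 2)]     # 4 5 6 7     (first two dropped)
x[-length(x)]   # 2 3 4 5 6   (last element dropped)

# x itself is unchanged unless the result is assigned back
length(x)       # 6
```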

You can do all sorts of calculations on arrays:

In [33]:
mean(x) # the mean
sd(x) # standard deviation

You can also apply arithmetic to a whole array at once, such as multiplying by a number or adding arrays:

In [35]:
2*x
x+x+x
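
These operations work element by element. A minimal sketch, assuming the same `x` as above:

```r
x <- c(2, 3, 4, 5, 6, 7)

x * x         # 4 9 16 25 36 49  (element-wise product, not a dot product)
x - mean(x)   # -2.5 -1.5 -0.5 0.5 1.5 2.5
sum(x * x)    # 139
```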

If you want to append values to your array, you use `c()`.

In [37]:
c(x,1)
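
Note that `c()` returns a new array rather than changing `x`. A small sketch of keeping the appended value, assuming the same `x` as above:

```r
x <- c(2, 3, 4, 5, 6, 7)

c(x, 1)        # returns a new vector; x is not modified
length(x)      # 6

x <- c(x, 1)   # assign the result back to keep the appended value
x <- c(0, x)   # the same idea prepends a value
x              # 0 2 3 4 5 6 7 1
```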

## Matrices

If you want two-dimensional arrays, you can use matrices.

You create a matrix with `matrix()`. This function takes the following parameters:

* `data`, the values used to fill the matrix (column by column by default)
* `nrow`, `ncol`, the number of rows and columns
* `dimnames`, an optional list of row and column names
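
As an illustration of these parameters, here is a sketch that also uses `byrow`, another argument of `matrix()` that is not listed above. The matrix name `A` and the row and column names are made up for this example:

```r
# 2 x 3 matrix filled row by row, with named rows and columns
A <- matrix(
  data = 1:6,
  nrow = 2,
  ncol = 3,
  byrow = TRUE,   # the default is FALSE, i.e. fill column by column
  dimnames = list(c("r1", "r2"), c("a", "b", "c"))
)

A
dim(A)         # 2 3
A["r2", "b"]   # 5
```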

<div class="alert alert-warning">

All elements of a matrix must be of the same type! You cannot mix numbers with strings: if you try, **R** converts every element to one common type. Even if all elements are numbers, they are stored as a single numeric type.

</div>
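
A quick way to see this conversion, sketched with two made-up matrices `num` and `mix`:

```r
num <- matrix(c(1, 2, 3, 4), 2, 2)
mix <- matrix(c(1, "a", 3, 4), 2, 2)   # one string among the numbers

typeof(num)   # "double"
typeof(mix)   # "character": every element was converted to a string
mix[1, 1]     # "1"
```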

In [85]:
M  = matrix(1:20,4,5)
M

1,5,9,13,17
2,6,10,14,18
3,7,11,15,19
4,8,12,16,20


### Addressing rows, columns and values

In the same way you address individual values of arrays with `[i]`, you can address values of matrices with `[i,j]`, where `i` is the row and `j` is the column.

You can get columns or rows by omitting one of the indices. For example, `M[i,]` will return the `i`th row. `M[,j]` will return the `j`th column.

In [86]:
M[1,]
M[,1]
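
A few more indexing patterns, sketched on the same matrix. The `drop = FALSE` argument is a standard option of `[` that keeps the result as a matrix:

```r
M <- matrix(1:20, 4, 5)

M[2, 3]                # 10: row 2, column 3
M[1:2, c(1, 5)]        # 2 x 2 sub-matrix from rows 1-2 and columns 1 and 5
M[1, ]                 # a plain vector: 1 5 9 13 17
M[1, , drop = FALSE]   # still a 1 x 5 matrix
```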

You can name the columns (and rows) and use those names to refer to them.

In [87]:
colnames(M) = c("a","b","c","d","e")

In [88]:
M[,'b']


In [84]:
summary(M)

       a              b              c               d               e        
 Min.   :1.00   Min.   :5.00   Min.   : 9.00   Min.   :13.00   Min.   :17.00  
 1st Qu.:1.75   1st Qu.:5.75   1st Qu.: 9.75   1st Qu.:13.75   1st Qu.:17.75  
 Median :2.50   Median :6.50   Median :10.50   Median :14.50   Median :18.50  
 Mean   :2.50   Mean   :6.50   Mean   :10.50   Mean   :14.50   Mean   :18.50  
 3rd Qu.:3.25   3rd Qu.:7.25   3rd Qu.:11.25   3rd Qu.:15.25   3rd Qu.:19.25  
 Max.   :4.00   Max.   :8.00   Max.   :12.00   Max.   :16.00   Max.   :20.00  

You can replace a column by simply assigning new values to it:

In [97]:
M[,"b"] = c(4,3,6,5)
M

a,b,c,d,e
1,4,9,13,17
2,3,10,14,18
3,6,11,15,19
4,5,12,16,20


To add new rows you can use `rbind()`. To add columns you use `cbind()`.

In [98]:
rbind(M,c(0,0,0,0,0))

a,b,c,d,e
1,4,9,13,17
2,3,10,14,18
3,6,11,15,19
4,5,12,16,20
0,0,0,0,0


In [100]:
cbind(M,c(1,2,3,4))

a,b,c,d,e,
1,4,9,13,17,1
2,3,10,14,18,2
3,6,11,15,19,3
4,5,12,16,20,4
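
The column added above has no name. One way to give it one is to pass the new column as a named argument. In this sketch, `f` and `M2` are made-up names:

```r
M <- matrix(1:20, 4, 5)
colnames(M) <- c("a", "b", "c", "d", "e")

# Name the new column by passing it as a named argument
M2 <- cbind(M, f = c(1, 2, 3, 4))
colnames(M2)   # "a" "b" "c" "d" "e" "f"

# rbind() and cbind() return new matrices; M keeps its original shape
dim(M)    # 4 5
dim(M2)   # 4 6
```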


In [101]:
help(cbind)

## Flow control in R

Looping is one of the basic aspects of any programming language. In **R**, the `for` loop is the basic way to iterate over arrays.
The basic syntax is:

```
for(val in sequence){
   # code to loop over
}
```

`while` also exists in **R**.
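
A minimal sketch of both loop types, with made-up variable names `total` and `n`:

```r
# for: visit each value of a sequence in turn
total <- 0
for (val in 1:5) {
  total <- total + val
}
total   # 15

# while: repeat as long as the condition holds
n <- 1
while (n < 100) {
  n <- n * 2
}
n       # 128
```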

Very often you'll want to make your code depend on particular conditions. `if` is the basic construct for that:

```
if(condition){
 # code to run if condition is true
}else{
 # code to run if condition is false
}
```
When you want to test one value against many alternatives, it is usually clearer to use `switch()`.
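
As a rough illustration, the sketch below chains `if ... else if ... else` and then uses `switch()` on a string. The variable names and the unit table are invented for this example:

```r
v <- -2
if (v > 0) {
  sign_label <- "positive"
} else if (v < 0) {
  sign_label <- "negative"
} else {
  sign_label <- "zero"
}
sign_label   # "negative"

# switch() picks the branch whose name matches the string;
# the unnamed last value is returned when nothing matches
unit <- "km"
metres_per_unit <- switch(unit,
  m  = 1,
  km = 1000,
  cm = 0.01,
  NA
)
metres_per_unit   # 1000
```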

### An example

Let's go over a simple example that uses these constructs.

We will create a vector of random numbers and separate it into two vectors, one for positive and one for negative values.

In [106]:
# this creates a vector of 15 random numbers drawn from a standard Normal distribution.

x <- rnorm(15)
x

In [108]:
# split values into positive and negative

## create two empty vectors to store the values
posv<-vector()
negv<-vector()

# loop over the `x` array and decide which vector each value goes into
# (seq_along(x) is safer than 1:length(x), which misbehaves when x is empty)
for(i in seq_along(x)){
    if (x[i]>0) posv<-c(posv,x[i])
    if (x[i]<=0) negv<-c(negv,x[i])
}
posv
negv

Alternatively, you could use `if ... else ...`.

In [None]:
# change the code to use if else
posv<-vector()
negv<-vector()

for(i in seq_along(x)){
    if (x[i]>0) {
        posv<-c(posv,x[i])
    }else{
        negv<-c(negv,x[i])
    }
}
posv
negv

Of course, we could have done it even more simply, using selectors:

In [109]:
posv = x[which(x>0)]
negv = x[which(x<=0)]
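
The same split also works with logical indexing alone. The sketch below fixes the random seed (an arbitrary choice, not part of the original example) so the result is repeatable, then checks that every value ended up in exactly one group:

```r
set.seed(1)        # arbitrary seed so the random draw is repeatable
x <- rnorm(15)

posv <- x[x > 0]   # logical indexing, no which() needed
negv <- x[x <= 0]

length(posv) + length(negv) == length(x)   # TRUE: every value landed in one group
all(posv > 0)                               # TRUE
all(negv <= 0)                              # TRUE
```

The two forms differ when `x` contains missing values: `which()` skips `NA` positions, whereas plain logical indexing returns `NA` for them.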

## Functions

Functions are a way to encapsulate functionality. A function is declared as:

```
function(x){
 # code that operates on x
}
```

In [112]:
my_func = function(x){
    x+1
}

In [113]:
my_func(3)
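
Functions can take several arguments, and arguments can have default values. A small sketch, where `add_n` and its parameter `n` are made-up names:

```r
# n has a default value of 1, so it can be left out of the call
add_n <- function(x, n = 1) {
  x + n   # the value of the last expression is returned
}

add_n(3)            # 4
add_n(3, n = 10)    # 13
add_n(c(1, 2, 3))   # 2 3 4  (works element-wise on vectors too)
```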