<img src="images/banner_introRProg.png" align="left" />

<table style="float:right;">
    <tr>
        <td>                      
            <div style="text-align: right"><a href="https://www.research.manchester.ac.uk/portal/syed.murtuzabaker.html" target="_blank">Syed Murtuza Baker</a></div>
            <div style="text-align: right">Research Fellow</div>
            <div style="text-align: right">University of Manchester</div>
         </td>
         <td>
             <img src="images/Syed_Baker.jpg" width="50%" />
         </td>
     </tr>
</table>

# Introduction to Programming with R
****

#### About this Notebook
This notebook introduces the R programming language and the Jupyter notebook environment. 

Level: <code>beginner</code> 

Duration: Approximately 2 hours to complete

<div class="alert alert-block alert-warning"><b>Learning Objectives:</b> 
<br/> At the end of this notebook you will be able to:
    
- Describe the basic principles of the R language

- Log your work in a Jupyter notebook
    
- Explain the features of R that support object-oriented programming


</div> 

<a id="top"></a>

<b>Table of contents</b><br>

1.0 [Introduction](#intro)

2.0 [Variables](#variables)

3.0 [Vectors](#vectors)

4.0 [Data Frames](#dataframes)

5.0 [Lists](#lists)

6.0 [Indexing](#indexing)

7.0 [If Statement](#ifstatement)

8.0 [If...else Statement](#ifelsestatement)

9.0 [For Loop](#forloop)

10.0 [Functions](#functions)

11.0 [Your Turn](#yourturn)

*****

<a id="intro"></a>

## Introduction

R is one of the leading programming languages in biological data analysis. It is used for data analysis, statistics, machine learning and visualisation. This course is designed as an introduction to R for participants with no previous programming experience. 

We first introduce how to start programming in R and then build up from there. We will learn how to manipulate and visualise data, and how to read from and write to files. During the course we will also be working with tidyverse, which will allow us to manipulate our data effectively.

For help in using this Jupyter notebook please refer to the [Jupyter Notebook User Guide](https://online.manchester.ac.uk/bbcswebdav/orgs/I3116-ADHOC-I3HS-HUB-1/Jupyter%20Notebooks/content/index.html#/)





<div class="alert alert-block alert-info">
<b>Task 1:</b>
<br> 
Let's run our very first R program to display the classic <code>Hello, world</code> message on the screen. To do this, select (click on) the cell below, then hold the <code>shift</code> key and press the <code>enter</code> key. Alternatively, click on the <code>run cell</code> button on the menu above. You should see <code>Hello, world</code> displayed under the cell.
</div>

In [3]:
hello.string <- 'Hello, world'
print(hello.string)

[1] "Hello, world"



*****
[back to the top](#top)

<a id="variables"></a>

## Variables



Another thing you’ll want to do in R is assign things to a name so that you can use them later. Think of it like being a chipmunk that buries a nut in the ground to dig up later. You can assign anything in R to a name, then use it later (in the current R session of course :)).

Assign the number 5 to the name <code>mynumber</code>:

In [4]:
mynumber <- 5

Later you can use <code>mynumber</code>, for example by adding it to another number:

In [5]:
mynumber + 1

### While working with variables
- Data types 
    - integers: 1L, 5L, 7L (the <code>L</code> suffix marks an integer; a plain <code>5</code> is stored as a double)
    - double: 1.5, 3.2
    - strings: "Hello"
    - boolean: TRUE/FALSE
    - factor
- Variable names
- Working with strings (functions for string length, changing text to lower case, checking for spaces, etc.)
- Changing a variable type

We will introduce the data types while going through vectors.
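
As a minimal sketch of the items listed above, the cell below checks the type of a value with `class()`, applies a few base R string functions, and converts between types. The variable names (`whole_number`, `decimal_number`, `greeting`, `is_ready`) and their values are made up for illustration and are not used elsewhere in this notebook.

```r
# Hypothetical example values, one per data type
whole_number <- 5L          # the L suffix makes an integer
decimal_number <- 1.5       # numbers are doubles by default
greeting <- "Hello World"   # a string (character)
is_ready <- TRUE            # a boolean (logical)

class(whole_number)     # "integer"
class(decimal_number)   # "numeric"
class(greeting)         # "character"
class(is_ready)         # "logical"

# Working with strings
nchar(greeting)         # number of characters: 11
tolower(greeting)       # "hello world"
trimws("  padded  ")    # removes leading and trailing spaces: "padded"

# Changing a variable type
as.numeric("42") + 1    # the text "42" becomes the number 42, so the result is 43
as.character(5)         # the number 5 becomes the text "5"
```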


<a id="vectors"></a>

## Vectors

Vectors are one of the simplest and most common objects in R. Think of a vector like a cat’s tail. Some are short. Some are long. But they are pretty much the same width - that is, they can only contain a single data type. So a vector can be all <code>numeric</code>, all <code>character</code>, all <code>factor</code>, etc.

But how do we make a vector? The easiest way is to use a function called `c`. So `c(5,6,7)` will create a vector of numbers 5, 6, and 7.

#### Integer

In [6]:
c(5,6,7)

#### Double
If any value in a vector is a double, R stores all the other values as doubles too.

In [7]:
c(5, 8, 200, 1, 1.5, 0.9)
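
To see how R stores a vector, `typeof()` can be used. The short sketch below is not part of the original notebook and the variable names are made up; it shows that plain numbers such as `5` are stored as doubles unless they are written with an `L` suffix, and that mixing types converts the whole vector to one common type.

```r
plain_numbers <- c(5, 6, 7)
typeof(plain_numbers)     # "double": plain numbers are doubles

whole_numbers <- c(5L, 6L, 7L)
typeof(whole_numbers)     # "integer": the L suffix requests integers

mixed_numbers <- c(5L, 6L, 1.5)
typeof(mixed_numbers)     # "double": one double converts the whole vector

mixed_text <- c(5, "six", TRUE)
typeof(mixed_text)        # "character": mixing with text turns everything into strings
```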

#### String
Let’s say you have a vector of three types of animals:



In [8]:
animals <- c("birds","squirrels","fish")
animals

You can add something to each of them like:


In [9]:
paste(animals, "are beautiful")

#### Boolean

In [1]:
are_you_good <- TRUE
are_you_good

#### Factor
Factors store categorical variables: values that fall into a fixed set of categories (called levels). The categories can be built from either numeric or string values.

In [3]:
condition <- factor(c('Control','Control','Control','Treatment','Treatment'))
condition
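
A few base R functions help to look inside a factor. This small sketch reuses the `condition` factor from the cell above:

```r
condition <- factor(c("Control", "Control", "Control", "Treatment", "Treatment"))

levels(condition)       # the distinct categories: "Control" "Treatment"
table(condition)        # counts per category: Control 3, Treatment 2
as.integer(condition)   # the underlying integer codes: 1 1 1 2 2
```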



*****
[back to the top](#top)

<a id="dataframes"></a>
## Data Frames

A `data.frame` is one of the most commonly used objects in R. Just think of a `data.frame` like a table, or a spreadsheet, with rows and columns and numbers, text, etc. in the cells. A very special thing about the `data.frame` in R is that it can handle multiple types of data - that is, each column can have a different type. For example, in the table below the first column is of __numeric__ type, the second a __factor__, and the third __character__.



In [10]:
df <- data.frame(hey=c(5,6,7), there=as.factor(c("a","b","c")),
             fella=c("blue","brown","green"))
df

| | hey | there | fella |
|---|---|---|---|
| 1 | 5 | a | blue |
| 2 | 6 | b | brown |
| 3 | 7 | c | green |


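When R displays a data.frame, the leftmost column of numbers (1, 2, 3) holds the row names. They are not part of the data itself, though they are part of the metadata for the data.frame.

We can quickly get a sense of the type of data in the `df` object by using the function `str`, which gives information on the types of data in each column. The output below comes from an R version older than 4.0, where `data.frame` converted strings to factors by default; from R 4.0 onwards `fella` is reported as `chr` (character).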

In [11]:
str(df)

'data.frame':	3 obs. of  3 variables:
 $ hey  : num  5 6 7
 $ there: Factor w/ 3 levels "a","b","c": 1 2 3
 $ fella: Factor w/ 3 levels "blue","brown",..: 1 2 3
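
Beyond `str`, a few other base R functions help when exploring a data.frame. The sketch below recreates `df` from above and adds one extra column; the column name `hey_doubled` is made up for this example.

```r
df <- data.frame(hey = c(5, 6, 7),
                 there = as.factor(c("a", "b", "c")),
                 fella = c("blue", "brown", "green"))

nrow(df)    # number of rows: 3
ncol(df)    # number of columns: 3
names(df)   # column names: "hey" "there" "fella"

# Add a new column computed from an existing one
df$hey_doubled <- df$hey * 2
df
```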


### Matrices

Think of a matrix in R like a data.frame in which all the data has the same type: only numeric, only character, etc. A matrix is technically a special case of a two-dimensional array.

In [12]:
mat <- matrix(c(1,2,3, 11,12,13), nrow = 2, ncol = 3)

In [13]:
mat

| | [,1] | [,2] | [,3] |
|---|---|---|---|
| [1,] | 1 | 3 | 12 |
| [2,] | 2 | 11 | 13 |
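
By default `matrix()` fills the values column by column, which is why 1 and 2 end up in the first column. As a short sketch (the name `mat_by_row` is introduced here only for illustration), `dim()` reports the shape and `byrow = TRUE` switches to filling row by row:

```r
# Same values as above, filled column by column (the default)
mat <- matrix(c(1, 2, 3, 11, 12, 13), nrow = 2, ncol = 3)
dim(mat)   # 2 3 (rows, columns)

# byrow = TRUE fills row by row instead
mat_by_row <- matrix(c(1, 2, 3, 11, 12, 13), nrow = 2, ncol = 3, byrow = TRUE)
mat_by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]   11   12   13

t(mat)     # transpose: 3 rows, 2 columns
```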


*****
[back to the top](#top)

<a id="lists"></a>
## Lists

Lists are quite special. They are kinda like vectors, but kinda not. Using our cat tail analogy again, lists are like cat tails in that they can be short or long, but they can also vary in width. That is, they can hold any type of object. Whereas vectors can only hold one type of object (only character, for example), lists can hold, for example, a `data.frame` and a numeric, or a `data.frame` and another list! The way we make a list is via the function `list`:

In [14]:
list(1, "a")

A nested list

In [15]:
mylist <- list(1, list("a","b","c")) 
mylist

Just like vectors, you can do operations on each element of the list. However, since lists can be nested you have to worry about what level of nesting you want to manipulate.

In [16]:
mylist[[1]]

In [17]:
mylist[[2]]

In [18]:
mylist[[2]][[1]]
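
One way to apply an operation to every element of a list is `lapply()`, which returns a new list of results. The sketch below uses the nested `mylist` from above; the choice of `toupper()` as the operation and the name `upper_letters` are just for illustration.

```r
mylist <- list(1, list("a", "b", "c"))

# Apply toupper() to each element of the inner list
upper_letters <- lapply(mylist[[2]], toupper)
upper_letters[[1]]    # "A"

# Count how many elements sit at each level of nesting
length(mylist)        # 2
length(mylist[[2]])   # 3
```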

*****
[back to the top](#top)

<a id="indexing"></a>
## Indexing

Okay, so let’s say you have made a vector, list, or data.frame. How do you get to the things in them? It’s slightly different for each one, but there is a general way to index objects in R that works across vectors, lists, and data.frames: the square bracket, `[]`. For some objects you can index by the sequence number (e.g., `5`) of the thing you want, while with others you can do that but also index by the character name of the thing (e.g., `"kitty"`).


### Indexing Vectors
Vectors only have one dimension, as we said above. So with [] there is only one number to give here. For example, let’s say we have the vector

In [19]:
bb <- c(5,6,7)

We can index each of those three numbers by its position in the vector. Get the 6 by doing

In [20]:
bb[2]
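
The square bracket also accepts more than a single position. A brief sketch with the same vector `bb`:

```r
bb <- c(5, 6, 7)

bb[c(1, 3)]   # first and third elements: 5 7
bb[2:3]       # a range of positions: 6 7
bb[-1]        # everything except the first element: 6 7
bb[bb > 5]    # elements that satisfy a condition: 6 7
```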

### Indexing Named Vectors

You can also have a named vector. What’s that? A named vector is like bb above, but each of the three elements has a name.

In [21]:
bb <- c(5,6,7)
names(bb) <- c("hey","hello","wadup") 
bb

In [22]:
names(bb)

With a named vector we can get to each element using its name. A single set of brackets returns the value together with its name, while a double set of brackets returns just the value.

In [23]:
bb["hello"]

In [24]:
bb[["hello"]]

### Indexing Lists

Indexing on lists is similar to vectors. A huge difference though is that lists can be nested, so each slot of a list can itself hold any number of things. For example, let’s say we have a named version of the nested list <code>mylist</code> from above:

In [25]:
mylist <- list(foo=1, bar=list("a","b","c"))

We can index to the first item in the list, including its name, by

In [26]:
mylist[1]

Or equivalently

In [27]:
mylist["foo"]

And get just the value by using double brackets, `[[`:

In [28]:
mylist[[1]]

Or equivalently

In [29]:
mylist[["foo"]]
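
For named lists, the `$` symbol is a shorthand for double brackets with a name, and indexing can be chained to reach into the nested list. A short sketch using the same `mylist`:

```r
mylist <- list(foo = 1, bar = list("a", "b", "c"))

mylist$foo             # same as mylist[["foo"]]: 1
mylist$bar[[2]]        # second element of the nested list: "b"
mylist[["bar"]][[3]]   # third element of the nested list: "c"
```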

### Indexing data.frame and matrix

Indexing on a <code>data.frame</code> and <code>matrix</code> is similar. Both have two things to index on: <code>rows</code> and <code>columns</code>. Within `[ , ]`, the part before the comma is for rows, and the part after the comma is for columns. For example, take the built-in data frame <code>iris</code>:

In [38]:
head(iris)

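| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 5.4 | 3.9 | 1.7 | 0.4 | setosa |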


You can index to the third row and second column by doing

In [39]:
iris[3,2]

You can also use names to index if you have named rows or columns. For example,

In [40]:
iris[2,"Species"]

You can also use the `$` symbol to index to a column. For example, with another built-in data frame, <code>mtcars</code>:

In [41]:
mtcars$mpg
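
Leaving one side of the comma empty selects every row or every column, and a logical test can be used to keep only matching rows. The sketch below uses the built-in `iris` data; the name `long_sepals` and the cut-off of 7 are arbitrary choices for illustration.

```r
iris[1:3, ]                               # first three rows, all columns
iris[, "Sepal.Length"]                    # one column as a vector (same as iris$Sepal.Length)
iris[1:3, c("Sepal.Length", "Species")]   # selected rows and selected columns

# Keep only the rows where a condition holds
long_sepals <- iris[iris$Sepal.Length > 7, ]
head(long_sepals)
```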

*****
[back to the top](#top)

<a id="ifstatement"></a>
## If Statement

A real-life example of an if statement could be:

*If it rains I bring my umbrella.*

The syntax of an if statement is:

```
if (test_expression) {
  statement
}
```

If the test_expression is <code>TRUE</code>, the statement gets executed. But if it’s <code>FALSE</code>, nothing happens. Here, test_expression must evaluate to a single logical or numeric value. Since R 4.2.0, a condition with more than one element is an error; earlier versions used only the first element and gave a warning. For a numeric value, zero is taken as <code>FALSE</code> and any other number as <code>TRUE</code>.

In [47]:
x <- 5
if(x > 0){
  print("Positive number")
}

[1] "Positive number"
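
When the values to test are held in a vector, `any()` and `all()` reduce them to the single `TRUE` or `FALSE` that `if` works with. A short sketch with made-up values (`readings` is a hypothetical vector):

```r
readings <- c(3, -1, 8)

# any() is TRUE when at least one element passes the test
if (any(readings < 0)) {
  print("At least one negative number")
}

# all() is TRUE only when every element passes the test
if (all(readings > 0)) {
  print("All numbers are positive")
}
```

With these values only the first message is printed, because one of the readings is negative.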


*****
[back to the top](#top)

<a id="ifelsestatement"></a>
## If...else Statement

A real-life example of an if...else statement could be: 

*If it rains then I will bring my umbrella; if not (else), let's just go.*

An if...else statement adds an <code>else</code> branch that runs when the test expression is <code>FALSE</code>. For example:


In [51]:
x <- -5
if(x >= 0){
  print("Non-negative number")
} else {
  print("Negative number")
}

[1] "Negative number"
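
More than two outcomes can be handled by chaining `else if`, and the vectorised function `ifelse()` applies a two-way test to every element of a vector at once. The sketch below is an illustration only; the name `sign_label` and the labels are made up.

```r
x <- 0
if (x > 0) {
  sign_label <- "Positive number"
} else if (x < 0) {
  sign_label <- "Negative number"
} else {
  sign_label <- "Zero"
}
print(sign_label)   # "Zero"

# ifelse() tests each element of a vector and picks one of two values
ifelse(c(-5, 2, 7) > 0, "positive", "not positive")
# "not positive" "positive" "positive"
```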


*****
[back to the top](#top)

<a id="forloop"></a>
## For Loop

Loops are used in programming to repeat a specific block of code, which lets you automate repetitive work.

```
for (val in sequence)
{
  statement
}
```

Here, sequence is a vector and val takes on each of its values in turn during the loop. In each iteration, statement is evaluated.

### Example: for loop

Below is an example to count the number of even numbers in a vector.

In [54]:
x <- c(2,5,3,9,8,11,6)
count <- 0
for (val in x) {
  if (val %% 2 == 0) {
    count <- count + 1
  }
}
print(count)

[1] 3
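
Many loops over a vector can also be written without a loop, because R operators work on whole vectors. As a sketch of the same even-number count (the name `even_count` is made up), followed by a loop over positions with `seq_along()` for cases where the index itself is needed:

```r
x <- c(2, 5, 3, 9, 8, 11, 6)

# x %% 2 == 0 gives TRUE/FALSE for each element; sum() counts the TRUEs
even_count <- sum(x %% 2 == 0)
print(even_count)   # 3

# seq_along(x) gives the positions 1, 2, ..., length(x)
for (i in seq_along(x)) {
  if (x[i] %% 2 == 0) {
    print(paste("Position", i, "holds the even number", x[i]))
  }
}
```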


*****
[back to the top](#top)

<a id="functions"></a>
## Functions

Cats are the type of feline to love functions. Functions make your life easier by allowing you to generalize many lines of code and avoid repeating yourself. Functions make your work tidier - just like cats like it. Functions are written like this:

In [4]:
I_want_to_add <- function(){
  2 + 3
}

In [5]:
I_want_to_add()

In [6]:
I_want_to_add <- function(a, b){
  a + b
}

In [7]:
I_want_to_add(5,7)

__Do it yourself: change the values to 27 and 12 and see the result__

#### Function within function

In [55]:
foo <- function(){
  writeLines("Being a cat, I strongly dislike dogs")
}

Now call the <code>function</code>

In [56]:
foo()

Being a cat, I strongly dislike dogs


The foo function was pretty simple. We can also pass in parameters to the function.

In [57]:
foo <- function(printVal){ 
  writeLines(printVal)
}

In [58]:
foo("Being a cat, I strongly dislike dogs")

Being a cat, I strongly dislike dogs


And set parameters to default values.

In [59]:
foo <- function(printVal = "Being a cat, I strongly dislike dogs"){ 
  writeLines(printVal)
}

In [60]:
foo()

Being a cat, I strongly dislike dogs


In [61]:
foo('I print whatever I like')

I print whatever I like
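
Functions can also hand a value back to the caller, and arguments can be passed by name. The function below, `raise_to_power`, is a hypothetical helper written for this sketch; it combines a default value with an explicit `return()`.

```r
# raise_to_power is a made-up example, not part of the notebook
raise_to_power <- function(base, exponent = 2) {
  result <- base ^ exponent
  return(result)   # explicit return; the last evaluated value would be returned anyway
}

raise_to_power(3)                        # 9, uses the default exponent
raise_to_power(3, 3)                     # 27
raise_to_power(exponent = 3, base = 2)   # 8, named arguments can come in any order
```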


*****
[back to the top](#top)

## R console
Writing code is fun. So open up R, and you’ll see something like this:
![R console](images/R_Console.png)

## RStudio
When you load RStudio it will look like this:
![R-Studio](images/R-Studio.png)

____________

<a id="yourturn"></a>
## Your Turn


<div class="alert alert-block alert-info">
    <b>Task 1</b><br/>
   <p> Write a chunk of code that converts 24 hour time into am/pm. If the hour is 12 or more then it says it is now pm, otherwise it says it is now am. Think about how to split the sequence of hours up into before midday and afterwards in order to write a simple if else statement.</p> 
</div>

In [68]:
inputTime <- 13
if(inputTime >= 12){
  print('It is now pm')
}else{
  print("It is now am")
}



[1] "It is now pm"


<div class="alert alert-block alert-info">
    <b>Task 2</b><br/>
<p>Write a for loop to calculate the first 10 terms of the Fibonacci sequence.<br/>
The Fibonacci sequence is the series of numbers: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34,… Each number from the third onwards is calculated by adding up the two numbers before it.<br/>
[Hint: Assign the first two numbers to two variables and iterate from the third term]</p>
</div>

In [69]:
a <- 0
b <- 1
print(a)

[1] 0


In [70]:
print(b)

[1] 1


In [71]:
for(i in 1:8){
  next_term <- a + b   # named next_term to avoid masking the c() function
  print(next_term)
  a <- b
  b <- next_term
}

[1] 1
[1] 2
[1] 3
[1] 5
[1] 8
[1] 13
[1] 21
[1] 34
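
An alternative sketch stores the terms in a vector instead of printing them one at a time, so the whole sequence is available afterwards. The names `n_terms` and `fib` are made up for this example.

```r
n_terms <- 10
fib <- numeric(n_terms)   # a vector of ten zeros to fill in
fib[1] <- 0
fib[2] <- 1
for (i in 3:n_terms) {
  # each term is the sum of the two terms before it
  fib[i] <- fib[i - 1] + fib[i - 2]
}
fib   # 0 1 1 2 3 5 8 13 21 34
```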


<div class="alert alert-block alert-info">
    <b>Task 3</b> <br/>
<p>Write a function that will say whether a number is prime or not.
<br/>
[Hint: <code>%%</code> gives you the remainder of a division. For example, <code>5%%2</code> will give you <code>1</code>]</p>
</div>

In [1]:
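find_prime <- function(val){
  # Numbers below 2 are not prime
  if (val < 2) {
    return(print('It is not a prime number'))
  }
  count <- 0
  # 2 and 3 have no candidate divisors to test; 2:(val/2) would count
  # downwards for them, so the loop is skipped
  if (val > 3) {
    for (i in 2:floor(val / 2)) {
      if (val %% i == 0)
        count <- count + 1
    }
  }
  if (count > 0)
    print('It is not a prime number')
  else
    print('It is a prime number')
}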


In [2]:
find_prime(23)

[1] "It is a prime number"


In [3]:
find_prime(21)

[1] "It is not a prime number"
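
As a variation, the check can return `TRUE` or `FALSE` instead of printing a message, which makes the result easier to reuse in other code. The function `is_prime` below is a hypothetical alternative, assuming a whole-number input; it uses a `while` loop (not covered in this notebook) and stops at the first divisor it finds, testing only up to the square root of the number.

```r
# is_prime is a made-up alternative to find_prime; it assumes val is a whole number
is_prime <- function(val) {
  if (val < 2) return(FALSE)         # 0 and 1 are not prime
  if (val < 4) return(TRUE)          # 2 and 3 are prime
  if (val %% 2 == 0) return(FALSE)   # other even numbers are not prime
  # Test odd divisors up to the square root of val
  divisor <- 3
  while (divisor * divisor <= val) {
    if (val %% divisor == 0) return(FALSE)
    divisor <- divisor + 2
  }
  TRUE
}

is_prime(23)   # TRUE
is_prime(21)   # FALSE
is_prime(2)    # TRUE
```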


*****
[back to the top](#top)

### Notebook details
<br>
<i>Notebook created by <strong>Syed Murtuza Baker</strong>. Other contributors include Fran Hooley... 

Publish date: May 2021<br>
Review date: May 2022</i>


****

## Notes:
