# NB 1: Introduction to R Notebooks

R Notebooks are Jupyter notebooks run on Google Colab servers, which allow you to program in R, Python, or Julia for any computing task. Colab allows you to create a notebook that communicates with Google's servers.

## What is a Jupyter Notebook?

Jupyter notebooks are a way for statisticians and data scientists to run code while writing a narrative of their analysis. They help ensure that their code is well documented and reproducible.

So... what is a Jupyter Notebook (JNB)?
- It's kind of like a Word document made up of different blocks called "cells".
- There are two kinds of cells: CODE cells and MARKDOWN (Text) cells.
- The nice thing about a JNB is that when you run some code, the output will appear right below it.
- They are a great way to showcase and share your data analyses with others!


## Cells

Cells can be added by hovering your mouse between two cells and clicking on the desired cell type.

## What is a Markdown (Text) Cell?

A markdown cell allows you to document your code and data science workflow with simple sentences. It uses markdown syntax to customize the output of the cell. If you double click on this cell, you will be put into editor mode and can change the text you see. Clicking on the Up or Down arrows on the side will allow you to exit editor mode. You may also press `Ctrl+Enter` to get out of the cell.

## What is Markdown?

When you enter editor mode in a markdown cell, notice that there may be symbols in the editor such as the hashtag (`#`), backtick, and others. These symbols change how the text will be presented once you get out of editor mode. These symbols also control how you can add links, images and math symbols. For your convenience, Google Colab adds a toolbar above the cell that allows you to embed different formatting techniques, such as **bold text**, *italicized text*, and so much more.


## What is a Code Cell?

A code cell allows you to embed R (or any other supported computer language) in your document. Below is a bit of R code that will print "Hello World!". All you need to do is click on the right arrow button when you hover your mouse over the `[1]`. You may also use `Ctrl+Enter` to execute the code.

In [None]:
print("Hello World!")

## Run Initial NB Code

At the beginning of each notebook, there will be code and packages that you need to reload every time you open a Colab notebook. The sample below will be at the beginning of each notebook:

Run the code below by clicking on the cell and pressing the arrow (run) button, or press `Shift+Enter` on the keyboard. This will install an R package called `rcistats` that contains helper functions to make your programming life easier. It will take Colab approximately a minute. Afterwards, you will ask R to load the `rcistats` and `tidyverse` packages. This is done using the `library()` function, as in `library(rcistats)`.

In [None]:
## Run Initial Code
install.packages("rcistats",
                 repos = c("https://inqs909.r-universe.dev", "https://cloud.r-project.org"))
library(rcistats)
library(tidyverse)

## rcistats

The R package "rcistats" is a custom R package related to the course. It contains functions and documents that are used for this class and other classes.

## R as a calculator

R can be thought of as a really powerful calculator. You can do basic arithmetic operations as well as complicated machine learning algorithms. All you need to do is type the commands and go. Below we will do some basic arithmetic problems.

2+2

2*5

2/45 + 34

4+3-1

3*(4-1)/11

## Functions

Functions are a set of commands that will complete a task on a number or object (more on that later). A function is always recognizable by a word followed by a set of parentheses. For example, `exp()` is a function that will compute $e^x$. All the user needs to do is put a number in the parentheses. The value of $e^4$ can be found with the code below:

In [None]:
exp(4)

### Multi-argument Functions

In the example above, you gave one argument to the function, the value 4. Other functions, like `log()`, allow you to supply more than one argument to change how the function will execute the command. For example, if you want to compute $\log_{10}(100)$, you will need to type `log(100, base = 10)`. The `base` argument is needed to make `log()` compute in base 10. Otherwise, it will compute $\ln(100)$.

Try `log(100)`, `log(100, base = 10)`, `log(100, 10)`.

Explain what you observed in executing the tasks.
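The same pattern of default, named, and positional arguments shows up in many other functions. Here is a minimal sketch using `round()`, a base R function chosen only for illustration (it is not otherwise part of this notebook). Its second argument, `digits`, defaults to 0 when you leave it out.

```r
# With one argument, round() uses its default of digits = 0
round(3.14159)               # 3

# Naming the argument makes the intent explicit
round(3.14159, digits = 2)   # 3.14

# Supplying the argument by position gives the same result as naming it
round(3.14159, 2)            # 3.14
```

Naming arguments is usually easier to read later, especially for functions that accept many of them.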

## Data Types

In R, the type of data you are using, also known as its `class`, dictates how the programming works. For the most part, users will use numeric, logical, POSIX (date-time), and character data types. Other types of data you may encounter are complex and raw.
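One way to see the class of a value is the base R function `class()`. The sketch below is only an illustration; the values passed in are made up, and `Sys.time()` is used simply because it returns a POSIX date-time.

```r
class(5.63)         # "numeric"
class("yes")        # "character"
class(2i)           # "complex"

# The current date and time is stored as a POSIX date-time object
class(Sys.time())   # "POSIXct" "POSIXt"
```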

### Numeric

Numeric classes are essentially numbers, also known as double and integer types of data. A double is a number that can have a decimal value. An integer is a whole number. Try `is.numeric(5.63)`, `is.double(5.63)` and `is.integer(5.63)`.
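One detail that can be surprising: R stores a number typed without a decimal, such as `5`, as a double unless you add the `L` suffix. The short sketch below shows the difference; the object names `whole` and `decimal` are made up for this example.

```r
whole <- 5L     # the L suffix asks R to store an integer
decimal <- 5    # without the suffix, R stores a double, even for a whole number

is.integer(whole)     # TRUE
is.integer(decimal)   # FALSE
is.double(decimal)    # TRUE

# Both integers and doubles count as numeric
is.numeric(whole)     # TRUE
is.numeric(decimal)   # TRUE
```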

### Character

A character value is data stored in a string format. Examples of character values are letters, words and even numbers. A character value is any value surrounded by quotation marks. For example, the phrase “Hello World!” is considered one character value. Another example is data coded with the actual words “yes” or “no”. To check if you have character data, use the `is.character()` function. Try `is.character("Hello World!")`.

In both the numeric and character data type examples, the output was either `TRUE` or `FALSE`. What do you think that means? What type of data is it?

### Logical

The logical class holds data where the only values are `TRUE` or `FALSE`. Sometimes the data is coded as `1` for `TRUE` and `0` for `FALSE`. The data may also be coded as `T` or `F`. To check if data belongs in the logical class, you will need the `is.logical()` function. Try `is.logical(3 < 4)`.
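The relationship between logical values and the `1`/`0` coding can be sketched as follows. The object name `flags` is made up for this example, and `as.logical()` and `sum()` are base R functions brought in only to illustrate the conversion.

```r
flags <- c(TRUE, FALSE, TRUE)

is.logical(flags)         # TRUE
is.logical(c(1, 0, 1))    # FALSE: these are numbers, not logical values

# Numbers coded as 1 and 0 can be converted to logical values
as.logical(c(1, 0, 1))    # TRUE FALSE TRUE

# In arithmetic, TRUE counts as 1 and FALSE counts as 0
sum(flags)                # 2
```

Note that `T` and `F` are ordinary objects preset to `TRUE` and `FALSE`, so they can be overwritten by accident. Spelling out `TRUE` and `FALSE` avoids that risk.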


## R Objects

R objects are where most of your data will be stored. An R object can be thought of as a container of data. Each type of object has characteristics that make it suited to different types of analysis. To create an R object, you assign data to a name made of letters and numbers using the `<-` operator. Try `x <- 8`.

**Notice that nothing prints out!** This is okay; not all of your code will print something. The line of code `x <- 8` states: create an object called `x` and store (assign, `<-`) the value 8 into it. Now type `x` and run the line. See what happens.

Now try arithmetic operations on `x`. For example, try `x^2`.

What happens to `x`?
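To compare with what you observed, here is a small sketch using a separate, made-up object called `z`. It shows the difference between computing with an object and assigning the result back to it.

```r
z <- 3      # create a hypothetical object for this illustration

z^2         # prints 9, but the result is not stored anywhere
z           # still 3

z <- z^2    # assign the result back to z
z           # now 9
```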

### Vectors

A vector is a set of data values of a certain length. The R object `x` is considered a numeric vector (because it contains a number) of length 1. To check, try `is.numeric(x)` and `is.vector(x)`.

Now let’s create a logical vector that contains 4 elements (have it follow this sequence: T, F, T, F) and assign it to `y`. To create a vector, use the `c()` function, typing all the values and separating them with commas. Type `y <- c(T, F, T, F)`. Afterwards, print it out.

See if the vector `y` is logical.
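The same `c()` approach works for other data types. Below is a minimal sketch with a numeric vector; the name `scores` and its values are made up, and `length()` is a base R function that reports how many elements a vector holds.

```r
# Hypothetical numeric vector with three elements
scores <- c(90, 85, 77)

length(scores)       # 3
is.numeric(scores)   # TRUE
is.vector(scores)    # TRUE

# Arithmetic is applied to every element of the vector
scores / 10          # 9.0 8.5 7.7
```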

### Data Frames

Data frames are similar to a data set that you may encounter in an Excel file. However, there are a couple of differences. First, each row represents an observation, and each column (variable) represents a characteristic of the observation. Additionally, all values within a column (variable) of a data frame share the same data type. To get an idea of what a data frame looks like, try `head(iris)`.

If you are interested in viewing a specific variable (column) from a data frame, you can use the `$` operator to specify which variable from a specific data frame. We view the variable by typing the name of the data set, then `$`, and the name of the variable: `data_set_name$name_of_variable`.
For example, if we are interested in observing the `Sepal.Length` variable from the iris data frame, we will type `iris$Sepal.Length`.
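The `$` operator returns the column as a vector, so the vector tools from earlier apply to it. Here is a short sketch using the built-in `iris` data frame; the object name `sepal_lengths` is made up for this example, and `nrow()`, `length()`, and `mean()` are base R functions used only for illustration.

```r
# Look at the first three rows of the data frame
head(iris, 3)

# Pull one column out of the data frame and store it as a vector
sepal_lengths <- iris$Sepal.Length

nrow(iris)                  # 150 observations (rows)
length(sepal_lengths)       # 150, one value per row
is.numeric(sepal_lengths)   # TRUE

# The extracted column can be passed to other functions
mean(sepal_lengths)
```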

## Order of Code Cells Matters!!

When running the code cells, the order in which cells are executed matters. That is because Google Colab uses an R environment to access objects that a user has created. If you do not run the line that creates the necessary objects beforehand, you will get an error message. Therefore, add the code that creates the objects, if necessary, at the beginning of the code cell. You may also create a separate code cell before the cell that needs the object and run it. This will create the object in the environment, where it can be reused. The 3 cells below illustrate why the order of code cells matters.



In [None]:
my_vector

In [None]:
my_vector <- 1:20

In [None]:
my_vector

## Debugging Code

Run the lines of code below:

In [None]:
log(a)

In [None]:
log(4))

In [None]:
log(8


When running a line of code, sometimes an error will pop up. This is part of the learning process. Learning how to debug your code is a useful skill. It helps build your understanding of R.

Debugging code in R involves identifying and resolving errors by reading both the code you ran and the error that is output. In your code, check if you misspelled functions or objects. Check if you used the correct upper- or lower-case letters. Check if you are missing a parenthesis, bracket, or comma. Check if you have an extra parenthesis, bracket, or comma.

Additionally, read the error output for a hint about what the problem may be. It may indicate whether something is missing or extra.


Look at the lines of code below and see if they have been corrected from the lines above.

In [None]:
a <- 5
log(a)

In [None]:
log(4)

In [None]:
log(8)

Explain in your own words what fixed each line of code.
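Letter case, one of the checks listed above, can be sketched in the same spirit. R treats names that differ only in capitalization as different objects. The object name `total_count` is made up for this example, and `exists()` is a base R function that reports whether an object with a given name has been created.

```r
# Hypothetical object, created only for this illustration
total_count <- 12

exists("total_count")   # TRUE: the name exactly as it was created
exists("Total_Count")   # FALSE: different capitalization is a different name
```

Typing `Total_Count` in a cell would therefore produce an "object not found" error, even though `total_count` exists.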

## Further Resources

The R community has developed a variety of resources to practice R. A quick Google search can lead to great resources. Here are some recommended resources:

- Section on R Programming: [Statistical Computing](https://www.inqs.info/stat_comp/r.html)

- [Google Colab Cheatsheet](https://colab.research.google.com/github/Tanu-N-Prabhu/Python/blob/master/Cheat_sheet_for_Google_Colab.ipynb)

- [R for Data Science](https://r4ds.hadley.nz/)

## Submitting Notebooks

At the end of the lab, you will need to submit an assignment. You can submit this lab via Canvas and Google Assignments.

Additionally, it is good practice to download the file in case anything goes wrong: *File* > *Download* > *Download as ipynb*
