# Lesson 2: Introduction to Computing Using R

Today:
1. Working with Jupyter Notebooks
2. Computing and arithmetic in R
3. Working with "names"
4. Types of Data
5. Functions
6. Lists

## 1. Working with Jupyter Notebooks

A Jupyter Notebook is a type of document that can be used both for writing text (like a Word document) and for writing code.

A Jupyter Notebook consists of **cells** (places to enter our text or code).

There are two main types of cells:
+ **Markdown cells**: for writing text
+ **Code cells**: for writing code

To run the code in the current cell, we press Shift + Enter.

*Side note*<br>
In our course, we will write **R** code, but in general we can write code in other languages in a Jupyter Notebook (e.g. Python) by specifying the "kernel" to use.

## Demonstration

If you happen to have a laptop or tablet, please follow along!  

These lecture slides are in fact a Jupyter Notebook.  All notebooks we use in class (slides and demonstrations) will be available in the class_share folder on our Jupyter Hub (https://coreua-111.rcnyu.org) for your reference.

(Sometimes, the lecture slides might be a PDF file, accompanied by Jupyter notebooks for demonstrations.)

## 2. Computing and arithmetic in R

We can do basic computations in R, such as adding, subtracting, multiplying, and dividing numbers.
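As a minimal sketch of these four operations (the numbers are arbitrary and chosen only for illustration), each line below can be typed into a code cell and run:

```r
# Basic arithmetic in R
7 + 3    # addition: 10
7 - 3    # subtraction: 4
7 * 3    # multiplication: 21
7 / 3    # division: the result keeps its decimal part
```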

#### Order of Operations
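R follows the usual mathematical order of operations, and parentheses can be used to change it. A small illustration with arbitrary numbers:

```r
2 + 3 * 4      # multiplication is done before addition: 14
(2 + 3) * 4    # parentheses force the addition to happen first: 20
```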

#### Exponents
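Exponents are written with the `^` symbol. The sketch below uses arbitrary numbers; the last two lines show that the exponent is applied before a leading minus sign unless we add parentheses:

```r
2^3       # 2 to the power of 3: 8
3^2       # 3 squared: 9
-2^2      # the exponent is applied first, then the minus sign: -4
(-2)^2    # parentheses make the base negative: 4
```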

#### Computing Averages
Suppose that there are five students and their homework scores are: 100, 90, 100, 75, 80.  What is their average homework score?

What should we type in the code cell below to compute their average homework score?
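One possible answer, as a sketch: add up the five scores and divide by the number of students. The parentheses matter, because without them only the last score would be divided by 5.

```r
# Average homework score of the five students
(100 + 90 + 100 + 75 + 80) / 5    # 89
```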

#### Proportions and Percentages

Suppose that out of 60 students, 25 are living in Manhattan, 11 in Brooklyn, 9 in Queens, 9 in the Bronx, 3 in Staten Island, 2 in New Jersey, and 1 in Westchester.  Find the fraction of students who live in Brooklyn.  What percentage is that?
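A sketch of one way to compute this: the fraction is the number of Brooklyn students divided by the total number of students, and multiplying the fraction by 100 gives the percentage.

```r
11 / 60          # fraction of students living in Brooklyn: about 0.183
11 / 60 * 100    # the same quantity as a percentage: about 18.3
```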

### A quick comment on Comments

As your R code becomes more and more involved, it is important to make sure that you and others understand exactly what the code does.  To do this, we add explanations (in English) that R ignores when it runs the code.  These explanations can be added as "comments" in R.  For example:
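A minimal illustrative cell, reusing the homework scores from earlier (the wording of the comments is our own):

```r
# Compute the average homework score of the five students
(100 + 90 + 100 + 75 + 80) / 5    # divide the total by the number of students
```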

In the above cell, any text to the right of the `#` sign is ignored by R.  Any text that is preceded by `#` is a comment.

It is good practice to accompany your code with comments, both for your benefit (when you revisit it in the future) and for others' benefit (other people might be interested in understanding what you did).

## 3. Working with "names"

Sometimes, we would like to give names to describe the quantities that we are working with so that we can easily refer to them.

To display the value stored in a name, we simply type the name.
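As a sketch, a value is stored under a name with the assignment arrow `<-`. The names `num_students` and `brooklyn` are hypothetical choices made for this illustration; the values come from the earlier proportions example.

```r
# Store values under names using the assignment arrow <-
num_students <- 60
brooklyn <- 11

# Typing a name by itself displays the value stored in it
num_students

# Names can then be used in computations in place of the numbers
brooklyn / num_students
```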

That is, names are "labels" or "placeholders" or "storage units".  We can store not just numbers, but also text.  Make sure to surround the text to be stored with quotation marks (either single or double quotes work in R):

In addition to giving names to individual numbers or pieces of text, we can also give names to entire sets of data.
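A short sketch of storing text under a name. The names `favorite_flavor` and `school` are hypothetical, and the values are borrowed from the survey example later in this lesson.

```r
# Text must be wrapped in quotation marks; single and double quotes both work
favorite_flavor <- 'Strawberry'
school <- "CAS"

# Display the stored text
favorite_flavor
school
```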

For example, in the cell below, we load into R (part of) the UC Berkeley 1973 graduate admissions data that we briefly saw at the end of last class.

There is a file called `berkeley73.csv` in the folder where this Jupyter notebook file is located, and we load this file into R.

For now, the main thing to keep in mind is that we are giving a name to a data table so that we can easily refer to it and display it.  When you type the name of the table and run the code, R displays the contents of the table.
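A hedged sketch of what such a cell might look like. It assumes the file is an ordinary comma-separated file readable with `read.csv()`; the name `berkeley` is a hypothetical choice and may differ from the one used in class. The `file.exists()` check is there only so that the cell does nothing, rather than stopping with an error, when the file is not in the current folder.

```r
# Assumption: berkeley73.csv is in the same folder as this notebook.
# `berkeley` is a hypothetical name for the data table.
if (file.exists("berkeley73.csv")) {
  # Load the file and give the resulting data table a name
  berkeley <- read.csv("berkeley73.csv")

  # Typing the name displays the contents of the table
  berkeley
}
```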





## Recap so far

Today:
1. Working with Jupyter Notebooks $\checkmark$
2. Computing and arithmetic in R $\checkmark$
    + A comment about comments $\checkmark$
3. Working with "names" $\checkmark$
4. Types of Data
5. Functions
6. Lists

## 4. Types of Data

Here is an outline of the main types of data we will work with:

##### 1. Numerical Data
    
These are data that are numbers!  They could be positive, negative, integer, or non-integer numbers.

There are two main numerical data types that are stored differently in R:
1. Whole numbers (a.k.a. **integers**)
2. Numbers that could have fractional/decimal parts (a.k.a. **doubles**, short for "double precision", which very roughly speaking indicates how many decimal places will be stored)
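The function `typeof()` reports how R stores a value. In the sketch below (values chosen arbitrarily), note that a number typed without a decimal point is still stored as a double unless we add the suffix `L`:

```r
typeof(5L)      # "integer": the L suffix asks R to store a whole number
typeof(5)       # "double": a plain number is stored as a double by default
typeof(62.5)    # "double"
```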

##### 2. Text Data (a.k.a. string or **character**)
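Any value wrapped in quotation marks is stored as text. A minimal check with `typeof()` and `class()`, using the hypothetical name `flavor`:

```r
flavor <- "Cookies and creme"
typeof(flavor)    # "character"
class(flavor)     # "character"
```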

##### 3. Categorical/Group Data (a.k.a. **factor**)
   
These are data that indicate groups/categories/types.
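As a sketch, the `factor()` function turns a collection of text values into categorical data. The name `school` is hypothetical, and the values are taken from the survey example in this section.

```r
# Turn three text values into categorical (factor) data
school <- factor(c("CAS", "Tisch", "Steinhardt"))

class(school)     # "factor"
levels(school)    # the distinct categories that appear in the data
```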
   


##### 4. Boolean Data (a.k.a. **logical** data)
   
These are "binary" data, which have only two possible values: TRUE or FALSE.  We will see their concrete use later.
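A brief illustration, with the hypothetical name `mac_user`. Logical values can be typed directly, and they also arise as the result of a comparison:

```r
mac_user <- TRUE
typeof(mac_user)    # "logical"

3 > 5               # a comparison produces a logical value: FALSE
typeof(3 > 5)       # "logical"
```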

### Example

Student survey data:

<table>
    <tr>
        <th>School</th>
        <th>Year</th>
        <th>Height</th>
        <th>Favorite Ice Cream</th>
        <th>Mac user?</th>
    </tr>
    <tr>
        <td>CAS</td>
        <td>First year</td>
        <td>62</td>
        <td>Strawberry</td>
        <td>Yes</td>
    </tr>
    <tr>
        <td>Tisch</td>
        <td>Junior</td>
        <td>71</td>
        <td>Vanilla</td>
        <td>No</td>
    </tr>
    <tr>
        <td>Steinhardt</td>
        <td>Junior</td>
        <td>58</td>
        <td>Cookies and creme</td>
        <td>Yes</td>
    </tr>
</table>

+ **School**: categorical
+ **Year**: categorical
+ **Height**: numerical (could be non-integer)
+ **Favorite Ice Cream**: text 
+ **Mac user?**: boolean (since it's a yes/no question)

Note: The data type sometimes depends on the intention of the survey.  For example, Favorite Ice Cream could be categorical data if the respondent is prompted to select one of several options instead of free-typing their favorite ice cream flavor.

## 5. Functions

R allows us to do a lot of things using "functions".  

We can think of functions in R as "verbs" which we can use to tell R to do a particular task.  


Just as some verbs in English must be followed by a noun ("transitive verbs") and some don't, some functions in R must take a particular object or input (often called an "argument") and some don't.

You can also think of functions in R like mathematical functions.

For example, when we say $$f(x) = 5x,$$

what we mean is:
+ The function $f$ takes a number $x$ as its input.
+ Then, the function $f$'s action is to multiply whatever $x$ is by 5.

That is, the function $f$ is like a "verb" or an "action" that does something to the input $x$.  The action in this example is to multiply $x$ by 5.

### Examples

Let's start with a simple function: the `print()` function.  Its use is to print the content of a name.  For example:
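A minimal sketch, using a hypothetical name `x` that stores the number 5:

```r
# Store a value under the name x, then ask R to print it
x <- 5
print(x)
```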

Contrast the output above with the output of the cell below, where `print()` was not used:
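For comparison, here is a sketch of the same idea without `print()`. Typing the name on its own also displays the value, although in a notebook the formatting of the output may differ slightly from what `print()` produces.

```r
# Display the value stored in x by typing the name alone
x <- 5
x
```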

Notice that the "noun"/object that the function is acting upon is placed inside the pair of parentheses that come directly after the function name (without a space between the function name and the open parenthesis).

Here are a couple of other simple R functions that help us do arithmetic:
+ `sqrt()`: takes the square root of a number
+ `abs()`: takes the absolute value of a number

Try them by running and modifying the code cells below.
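A few illustrative calls (the inputs are arbitrary). The last line shows that the output of one function can be used as the input of another:

```r
sqrt(16)         # square root of 16: 4
abs(-5)          # absolute value of -5: 5
sqrt(abs(-9))    # abs() runs first and gives 9, then sqrt() gives 3
```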

### Function (summary)

A function
+ takes **input(s)** (or **argument(s)**) and
+ does something according to the inputs.
+ A function might also return **output(s)**.

Example:

The function called `abs`
+ takes the number -5 as an input, and
+ computes the absolute value of the number.
+ It then outputs the value 5 as the computed quantity.

The function called `print`
+ takes the name `x` as an input, and
+ displays the value stored in `x`

Summary of useful general functions we've seen so far:
+ `print()`: to display the value stored in a name
+ `abs()`: to take the absolute value of a number
+ `sqrt()`: to take the square root of a number

## 6. Lists

Sometimes, we need to work not with just one number but with a collection of numbers.  In this course, we call these collections lists (R's own documentation calls the objects created by `c()` vectors).

### 6.1. Making a new list using the `c()` function

We use the function `c()` to put together several different values into one object, a list.  See the example below, where we store the heights of the three students into one list, which we name `height`:
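A sketch of this step, using the three heights from the student survey table (62, 71, and 58):

```r
# Combine the three heights into one list named height
height <- c(62, 71, 58)

# Display the contents of the list
height
```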

### 6.2. Making a new list of integers (second method)
Here is a second way to create a list containing consecutive integers: `firstInteger:lastInteger`.

For example, instead of using `c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)` to create a list of all integers from 1 to 10, we could have created the same list using the following command: `1:10`, which is more concise.  Try it below and name this list `my_other_list`:
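One way to write this, as a sketch:

```r
# Create the integers 1 through 10 with the colon shorthand
my_other_list <- 1:10
my_other_list
```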

This is particularly useful if you want to create a very long list.  For example, if we want to create a list of all integers from -100 to 100:
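A sketch of this case, with the hypothetical name `long_list`. The list contains 201 values, because it includes both endpoints and 0.

```r
# All integers from -100 to 100, inclusive
long_list <- -100:100
long_list
```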

### 6.3. Lists of text data

We can also create a list of text data.
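For instance, here is a sketch that stores the three ice cream flavors from the student survey table under the hypothetical name `flavors`:

```r
# Each piece of text is wrapped in quotation marks and separated by commas
flavors <- c("Strawberry", "Vanilla", "Cookies and creme")
flavors
```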

### 6.4. Arithmetic with numeric lists

If we have a list containing numerical data (i.e., a list of numbers), we can in fact do arithmetic with it.  For example:
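A minimal sketch using the three heights from the survey table. The operation is applied to every value in the list, and two lists of the same length are combined value by value.

```r
height <- c(62, 71, 58)

height + 1                      # adds 1 to every value: 63 72 59
height * 2                      # doubles every value: 124 142 116
c(1, 2, 3) + c(10, 20, 30)      # adds matching positions: 11 22 33
```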

### 6.5. Useful functions for working with lists

In addition to the function `c()`, which we use to make new lists, the following are three other useful functions that we can use to examine lists:
+ `length()`: to find the "length" of a list (i.e. how many values are stored in a list)
+ `max()`: to find the largest value in a list
+ `min()`: to find the smallest value in a list

Try them by running and modifying the code cells below.
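A sketch applying the three functions to the heights from the survey table:

```r
height <- c(62, 71, 58)

length(height)    # number of values stored in the list: 3
max(height)       # largest value: 71
min(height)       # smallest value: 58
```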

**Exercise** Create a list of all integers from 1 to 10 and name this list `my_new_list`.  Then, 
1. print the contents of this list.
2. find the sum of the integers 1, 2, ... , 10

##### 2-Minute Group Work
How might you do the following in R as succinctly as possible?  Think about this question on your own for about 30 seconds, then discuss with at least one neighbor.
1. Create a list containing the first 50 positive even numbers
2. Create a list containing the first 50 positive odd numbers

## Summary

Today:
1. Working with Jupyter Notebooks $\checkmark$
2. Computing and arithmetic in R $\checkmark$
    + A comment about "comments" $\checkmark$
3. Working with "names" $\checkmark$
4. Types of Data $\checkmark$
5. Functions $\checkmark$
6. Lists $\checkmark$

## Summary of new R functions
Here's a brief summary of the new R functions we learned today.  You should be familiar with these functions and able to use them in labs and projects.

You might want to add their descriptions down here as you review/study.  As you use these functions frequently in labs, projects, and homework, you will learn to remember and understand them and be more "fluent" in using them.

+ General functions
    + `print()`
    + `sqrt()`
    + `abs()`
+ Functions for checking types of data
    + `typeof()`
    + `class()`
+ Functions for working with lists
    + `c()`
    + `length()`
    + `max()`
    + `min()`