<img src="materials/images/introduction-to-r-programming-cover.png"/>


# 👋 Welcome, before you start
<br>

### 📚 Module overview

R is a programming language for statistical computing and graphics. It is widely used in data analytics and data science.

We are going through five lessons in this module:

- <font color=#E98300>**Lesson 1: R Basic Data Types**</font>    `📍You are here.`
    
- [**Lesson 2: R Data Structures**](Lesson_2_R_Data_Structures.ipynb)
    
- [**Lesson 3: Importing Data**](Lesson_3_Importing_Data.ipynb)

- [**Lesson 4: Conditionals and Loops**](Lesson_4_Conditionals_and_Loops.ipynb)

- [**Lesson 5: Functions**](Lesson_5_Functions.ipynb)


<br>

### ✅ Exercises
We encourage you to try the exercise questions in this module, and use the [**solutions to the exercises**](Exercise_solutions.ipynb) to help you study.

<br>

<div class="alert alert-block alert-info">
<h3>⌨️ Keyboard shortcut</h3>

These common shortcuts can save you time as you work through this notebook:
- Run the current cell: **`Shift + Enter`**.
- Add a cell above the current cell: Press **`A`**.
- Add a cell below the current cell: Press **`B`**.
- Change a code cell to a markdown cell: Select the cell, and then press **`M`**.
- Delete a cell: Press **`D`** twice.

The single-key shortcuts work in command mode (press **`Esc`** first). Need more help with keyboard shortcuts? Press **`H`** to look them up.
</div>

---

# Lesson 1: R Basic Data Types

You’ll learn about some of the different types of data you can work with in R:

- [Numbers](#Numbers-(Integer,-Numeric))
- [Character](#Character)
- [Variables](#Variables)
- [Logicals](#Logicals)


`🕒 This module should take about 30 minutes to complete.`

`✍️ This notebook is written using R.`

## Numbers (Integer, Numeric)

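- **Integers** are whole numbers. R stores a whole number as an integer only when it is written with an `L` suffix (for example, `3L`).
- **Numerics** are decimal numbers (for example, `1.5`). Any number typed without the `L` suffix, including `3`, is stored as a numeric.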
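
To see which type R has chosen for a value, you can ask with the **class()** function. The short sketch below uses two illustrative variable names (`decimal` and `whole`) that are not part of the lesson.

```r
decimal <- 3       # typed without a suffix, so R stores it as a numeric
whole <- 3L        # the L suffix asks R to store an integer

class(decimal)     # "numeric"
class(whole)       # "integer"
whole == decimal   # TRUE: the two values are equal even though the types differ
class(3L / 2L)     # "numeric": dividing two integers gives 1.5
```

For everyday arithmetic the distinction rarely matters, because R converts between the two types as needed.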



Expression |<p style='text-align:center'>Description</p> | Result
:------------: |------------ | :-------------
3 + 2 |<p style='text-align:center'>Addition</p> | <p style='text-align:center'>5</p>
3 - 2 |<p style='text-align:center'>Subtraction</p> | <p style='text-align:center'>1</p>
3 * 2|<p style='text-align:center'>Multiplication</p> | <p style='text-align:center'>6</p>
3 / 2|<p style='text-align:center'>Division</p> | <p style='text-align:center'>1.5</p>
3 %% 2|<p style='text-align:center'>Modulus/Remainder</p> | <p style='text-align:center'>1</p>
3 ^ 2|<p style='text-align:center'>Exponent</p>| <p style='text-align:center'>9</p>

### ✅ `Run` each of the cells below:

In [1]:
3+2

In [2]:
3-2

In [3]:
3*2

In [4]:
3/2

In [5]:
# Modulus: returns the remainder after division

3%%2

In [6]:
# Exponent

3^2

#### ✅ Add a new cell below this one, and `Run` the following mathematical operation: 
```4 raised to the power of 2, times 3```

```The output should be: 48```

In [7]:
4^2*3
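
The result above is 48 because R applies the exponent before the multiplication. As a small illustration of how parentheses change the order of evaluation:

```r
4^2 * 3       # 48: the exponent is applied first, then the multiplication
(4^2) * 3     # 48: the same calculation with the order made explicit
4^(2 * 3)     # 4096: the parentheses force the multiplication first
2 + 3 * 4     # 14: multiplication happens before addition
(2 + 3) * 4   # 20
```

When in doubt, adding parentheses makes the intended order clear to both R and the reader.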

<div class="alert alert-block alert-success">
<b>Note:</b> You may have noticed the use of comments in the cells above. A comment is text in a code cell that is to be ignored. A comment, in R, is preceded by a hash symbol. Comments are useful for annotating your code in plain English. It's a good habit to frequently add notes to your code that describe your thinking, your approach to solving a problem, or even as a to-do list. 
</div>


<div class="alert alert-block alert-info">
<b>Tip:</b> As you're learning R, use comments to take notes in order to remember key concepts 
    and to enhance your understanding.
</div>

---

## Character
A **character** type is a piece of text represented as a sequence of characters (letters, numbers, and symbols) within either single or double quotes.

In [8]:
# Single or double quotes are fine...

'This is fun!'
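
Either quote style produces the same character value. The sketch below checks this with **identical()** and also shows a backslash keeping an apostrophe inside single quotes; the variable names are illustrative only.

```r
single <- 'This is fun!'
double <- "This is fun!"
identical(single, double)   # TRUE: the quote style is not part of the value

escaped <- 'Isn\'t this fun?'   # a backslash escapes the apostrophe
plain <- "Isn't this fun?"      # double quotes need no escape here
identical(escaped, plain)       # TRUE
class(plain)                    # "character"
```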

In [9]:
# If the text contains an apostrophe, wrap it in double quotes (or escape the apostrophe with a backslash).

"Isn't this fun?"

#### Concatenation of Character Types

You can concatenate (or combine) character types by using the **paste()** function.

In [10]:
paste("Here we go", "again!")

The default separator is a space, but you can set a different separator with the `sep` parameter:

In [11]:
paste("look", "here", sep="_")
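
A few more variations, as a sketch: **paste()** accepts any number of items, converts numbers to text, and can join items with no separator at all. The **paste0()** function is a base R shortcut for `sep = ""`.

```r
paste("R", "is", "fun")                # "R is fun"
paste("2022", "06", "30", sep = "-")   # "2022-06-30"
paste("data", "science", sep = "")     # "datascience"
paste0("data", "science")              # "datascience"
paste("Lesson", 1)                     # "Lesson 1": the number is converted to text
```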

---

## Print
So far, the result of the code that we've written was automatically returned as output for us to see. This is because Jupyter Notebook is an interactive environment and, by default, sends its results to the output to be viewed. However, as you write more code, there will be times when this won't be the case. That's when you'll formally want to request that results be output to the screen. The **print()** function is one way to accomplish this.

In [12]:
print("Let's print some R!")

[1] "Let's print some R!"


In [13]:
# Setting the quote parameter to FALSE removes the quotes from the output.

print("Let's print some R!", quote = FALSE)

[1] Let's print some R!


In [14]:
# Using the noquote() function prevents quotes from being output.

noquote("Let's print some R!")

[1] Let's print some R!
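
One situation where results are not shown automatically is inside a loop (loops are covered in Lesson 4). The following is a minimal sketch, not part of the original lesson: the first loop computes values but displays nothing, while the second uses **print()** to display each one.

```r
# Values computed inside a loop are not displayed automatically.
for (i in 1:3) {
  i * 10
}

# Calling print() explicitly sends each value to the output.
for (i in 1:3) {
  print(i * 10)
}
```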

---

## Variables
A variable is a named value. In other words, if we create a variable named **_introduction_** and assign the value "R here!" to it, we can then type **_introduction_** to have the value "R here!" returned:

<div class="alert alert-block alert-warning">
<b>Alert:</b> Assignment, in R, generally uses the leftward assignment operator (i.e., <- ). The equals operator (i.e., = ) can be used in some circumstances but not all. The <- operator can be used anywhere for assignment.

</div>
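
One place where the two operators behave differently is inside a function call. The sketch below uses the built-in **mean()** function and illustrative variable names (`total`, `avg`, `avg2`, `values`).

```r
# On a line of its own, = assigns just like <- does.
total = 5
total <- 5

# Inside a function call, = names an argument; it does not create a variable.
avg <- mean(x = c(2, 4, 6))

# Inside a function call, <- creates the variable and then passes its value on.
avg2 <- mean(values <- c(2, 4, 6))
values    # 2 4 6
```

Using `<-` for assignment and `=` only for naming arguments keeps the two roles easy to tell apart.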





In [15]:
# Assigning a value to a variable:

introduction <- "R here!"
print(introduction)

[1] "R here!"


You can change the value of a variable at any time, and R will always keep track of its current value.

In [16]:
introduction <- "Hiya, I'm R!"
print(introduction)

[1] "Hiya, I'm R!"


The following shortcuts will insert the leftward assignment operator (<-):
 
- Win: `alt` + `-`

- Mac: `option` + `-`


In [17]:
# Try the shortcut here.
 <- 

ERROR: Error in parse(text = x, srcfile = src): <text>:2:2: unexpected assignment
1: # Try the shortcut here.
2:  <-
    ^


### Naming variables

When naming variables in R you must keep a few rules and guidelines in mind:

**Variables:**
   - should begin with a letter. 
   - can contain letters, numbers, periods (.), and the underscore (_) character.
   - cannot be a reserved word such as TRUE, FALSE, or NULL.
   - are case-sensitive (data, Data and DATA would be different variable names).
   
   For example:
   
```
    first_name <- "Elon"
    lastName <- "Musk"
```
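
Case sensitivity can be checked directly. In this small sketch, `score` and `Score` are illustrative names that R treats as two separate variables:

```r
score <- 10
Score <- 20

score == Score   # FALSE: each name holds its own value
score + Score    # 30
```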

### Formatted printing

The **cat()** function can be used to concatenate and format character output.

In [18]:
# Format output (spaces are automatically added between items)

men <- 32
women <- 27

# Assigning the sum of the variables men and women to the variable num_participants.
num_participants <- men+women


# Concatenating the variable num_participants with two character strings and returning the output.
cat("There are", num_participants, "individuals currently participating in the study.")

There are 59 individuals currently participating in the study.

<div class="alert alert-block alert-success">
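<b>Note:</b> When a value is output using print(), a bracketed number such as [1] appears to its left. This number is the position of the first value shown on that line. cat() writes only the text itself, with no index and no quotes. paste() does not print anything; it builds and returns a character value.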
</div>
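
Two details of **cat()** are worth trying out: it does not start a new line unless asked to with `"\n"`, and it accepts a `sep` parameter just as paste() does. A short sketch:

```r
# Without "\n", the next output continues on the same line.
cat("Line one.")
cat("Line two.\n")

# sep controls what is placed between the items.
cat("A", "B", "C", sep = "-")
cat("\n")

# Calculations can be placed directly among the items.
cat("Participants:", 32 + 27, "\n")
```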

---

## Logicals
The logical data type can either be **TRUE** or **FALSE** (must be in all caps). Logicals are typically used with conditional expressions. If the condition is true then TRUE is returned, otherwise FALSE is returned. Later, you will learn how logicals enable you to access specific parts of your data.

`TRUE` and `FALSE` are reserved words in R. `T` and `F` can be used as shorthand, but they are ordinary variables that can be reassigned, so the full words are the safer choice.

In [19]:
TRUE

In [20]:
T

In [21]:
# Must be in all caps. True or False will return an error.

FALSE

In [22]:
F
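
As a cautionary sketch, the code below shows that `T` is an ordinary variable: it equals TRUE by default, but it can be overwritten, whereas TRUE itself cannot be.

```r
class(TRUE)          # "logical"
identical(T, TRUE)   # TRUE, as long as T has not been reassigned

T <- 0               # allowed, because T is an ordinary variable
identical(T, TRUE)   # FALSE: T now holds the number 0

rm(T)                # remove our variable so that T means TRUE again
identical(T, TRUE)   # TRUE
```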

### Comparison Operators

Expression |Description | Result
:------------: |------------ | :-------------
1 < 2 |Less Than | TRUE
1 > 2 |Greater Than | FALSE
1 <= 2|Less Than Or Equal To | TRUE
1 >= 2|Greater Than Or Equal To | FALSE
1 == 2|Equal To | FALSE
1 != 2|Not Equal To | TRUE

Numerical comparison:

In [23]:
5 < 2

In [24]:
5 > 2

In [25]:
5 <= 2

In [26]:
5 >= 2

In [27]:
5 == 2

In [28]:
5 != 2

In [29]:
# Not FALSE

!FALSE
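
The result of a comparison is itself a logical value, so it can be stored in a variable and reused. In this sketch, `age` and `is_adult` are illustrative names only:

```r
age <- 21
is_adult <- age >= 18   # the comparison is evaluated and its result is stored

is_adult                # TRUE
class(is_adult)         # "logical"
!is_adult               # FALSE: the ! operator flips the stored value
```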

Character comparison:

In [30]:
"meme" == "meme"

In [31]:
"meme" == "memo"

### Logical Operators (&, |)
When working with compound conditions, if "and" (&) is used, then both tests must be TRUE for the statement to return TRUE. If "or" (|) is used, then at least one of the conditions must be TRUE.

#### Compound logicals

In [32]:
# True and True

TRUE & TRUE

In [33]:
# True and False

TRUE & FALSE

In [34]:
# False and False

F & F

In [35]:
# True or False

T | F

In [36]:
# TRUE  &  TRUE

(5 > 2) & (2 < 5)

In [37]:
# FALSE   |    FALSE

(3 > 10)  |  (3 > 20)

In [38]:
# FALSE    |  TRUE

(12 == 8)  |  13

<div class="alert alert-block alert-success">
<b>Note:</b> When a number is used in a logical operation, any non-zero value is treated as TRUE and zero is treated as FALSE.
</div>
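
The conversion from numbers to logicals can be seen directly with the **as.logical()** function. A brief sketch:

```r
as.logical(13)     # TRUE: a non-zero number
as.logical(-2.5)   # TRUE: negative and decimal numbers count as non-zero too
as.logical(0)      # FALSE

(12 == 8) | 13     # TRUE: FALSE or TRUE
(12 == 8) | 0      # FALSE: FALSE or FALSE
TRUE & 0           # FALSE: TRUE and FALSE
```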

### ✅ Exercise 1

- Create a variable named **cats** and assign the value 2 to it. 
- Create a variable named **dogs** and assign the value 3 to it. 
- Create a variable named **pets**. 
- Assign **cats** plus **dogs** to the variable **pets**. 
- Print out the variable **pets**.


In [39]:
# Assign values to variables
cats <- 2
dogs <- 3

# Sum the values and assign to pets
pets <- cats + dogs

# Print the value of pets
print(pets)

[1] 5


<div class="alert alert-block alert-warning">
<b>Alert:</b> Remember to use the leftward assignment operator (i.e., <- ).
                                                                         
Assignment operator shortcut:
- Win: `alt` + `-`
- Mac: `option` + `-` 

</div>

### ✅ Exercise 2

Using the three variables that you created above and the **_cat()_** function, output the following formatted sentence:
  
"We have 5 pets: 2 cats and 3 dogs."


In [40]:
# Assign values to variables
cats <- 2
dogs <- 3
pets <- cats + dogs

# Print formatted sentence
cat("We have", pets, "pets:", cats, "cats and", dogs, "dogs.")

We have 5 pets: 2 cats and 3 dogs.

### ✅ Exercise 3

- Create a variable named **length_of_sides** and assign the value 8 to it.
- Then create a variable named **area** and assign **length_of_sides** squared to it.
- Then create a variable named **perimeter** and assign **length_of_sides** times 4 to it.
- Print out the variables **area** and **perimeter**.

In [41]:
# Assign value to length_of_sides
length_of_sides <- 8

# Calculate area (side^2) and perimeter (side * 4)
area <- length_of_sides^2
perimeter <- length_of_sides * 4

# Print the values of area and perimeter
print(area)
print(perimeter)

[1] 64
[1] 32


---

# 🌟 Ready for the next one?
<br>

- [**Lesson 2: R Data Structures**](Lesson_2_R_Data_Structures.ipynb)
    
- [**Lesson 3: Importing Data**](Lesson_3_Importing_Data.ipynb)

- [**Lesson 4: Conditionals and Loops**](Lesson_4_Conditionals_and_Loops.ipynb)

- [**Lesson 5: Functions**](Lesson_5_Functions.ipynb)

---

# Contributions & acknowledgment

Thanks to Antony Ross for contributing the content for this notebook.

---

Copyright (c) 2022 Stanford Data Ocean (SDO)

All rights reserved.