# Goal
This is RStudio. The goal of this lecture is to cover some R basics so that we can dive into more exciting things tomorrow.

# Overview of RStudio
Layout of RStudio: 

- Script
  - Where you will be writing your own programs
- Environment/History/Git
  - Mostly just environment and Git
  - Shows which data objects you have loaded in memory
  - Eventually where you'll do version control with GitHub (last day)
- Files/Plots/Packages/Help/Viewer  
  - Helps you load packages and other files
  - Default window when you're trying to get help with a function from a package
- Console/Terminal
  - Where the code actually runs
  - Scripts execute in the console
  - Code typed into the console is not saved, whereas a script keeps your code
  - To run something in the console, type it in and hit "enter"
  - RStudio also has a "Terminal" tab, which is a system shell rather than R. For now, just know it's there.

# Console

- We can type code straight into the console and run it. 
- The console in RStudio knows you're running R code.



In [3]:
# Type the following in the console, then press enter 
2 + 3

# Variable assignment happens with an arrow, <- (RStudio shortcut: Option + - on a Mac, Alt + - on Windows)
a <- 2
b <- 3

a + b


# How to write and execute a script

**Writing Scripts and Running Code**

- New Script: File > New File > R Script (or R Markdown, which is what I'm using now), or click the New File icon in the upper left-hand corner of the screen > R Script (or R Markdown)
- Keyboard shortcut to run a section of code: highlight it, or put your cursor on a single line, and hit Ctrl + Enter (Windows) or Command + Enter (Mac)
- Executing code with the mouse: click the Run button at the top right corner of the script pane
- For R Markdowns, we also have the Knit button at the top of the script.
  - That will run all code and "knit" it together into a PDF, HTML, or doc
  - File type is specified at the top of the R Markdown file

**R Markdown**

- I wrote this PDF using R Markdown (**show them quickly**).
- I love these because I'm able to write lots of notes to myself while I'm coding
  - It produces a nice shareable file 
  - Can share my notes, code, and results with others. 

**Problem Sets**

- If you have no experience with R, start with a script. 
- If you want to do an R Markdown and are used to them, that's fine. 

You can "clean up" the Environment after you've executed code by clicking the broom icon. **This will delete everything in your environment**.
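The broom icon has a rough code equivalent. A minimal sketch, assuming you want to clear the objects that `ls()` lists; the names `x` and `y` are made up for illustration:

```r
# Create two throwaway objects (hypothetical names, just for illustration)
x <- 1
y <- "hello"

# List the objects currently in the environment
ls()

# Remove a single object
rm(x)

# Remove everything that ls() lists; like the broom, this cannot be undone
rm(list = ls())
ls()
```

Removing one object with `rm(x)` is usually the safer habit; clearing everything is best saved for when you want a fresh start.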

# Basic Data Types and COMMENTS



In [6]:
# This is a comment, you can use '#' to write notes to yourself in your code 
# - Comments are what make or break good coders, and coders who can collaborate with others. 
# - If you ever think you're writing "too" many comments, you almost always are not.
# - The things you think are obvious in your code won't be to others 
# - (nor yourself in a year when you get back to a project)

# Numeric -- integer: whole numbers, written with an L suffix (a bare 1 is stored as a double)
myInt <- 1L
myInt

# Numeric -- double: decimal points
myNum <- 2.4
myNum

# Logical (Boolean/Indicator variable): TRUE or FALSE. A comparison such as 3 < 4 evaluates to one of the two (the parentheses are optional, for readability)
myBool_1 <- (3 < 4)
myBool_2 <- (3 > 4)

# Character (string) 
myChar_a <- "a"
myChar_b <- 'b'
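To check which type a value actually has, base R provides `typeof()` and the `is.*()` helpers. A short sketch using literal values rather than the variables above:

```r
# typeof() reports how R stores a value
typeof(1)      # a bare number is stored as a double
typeof(1L)     # the L suffix asks for an integer
typeof(2.4)
typeof(3 < 4)
typeof("a")

# The is.*() helpers return TRUE or FALSE
is.integer(1)
is.integer(1L)
is.numeric(1L)     # integers and doubles both count as numeric
is.character('b')  # single and double quotes create the same type
```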


# Ways to store data types 


In [7]:
# Vector: can only be a vector of one data type (numeric, logical, string)
myVec_n <- c(1, 2, 3, 4, 5)
myVec_s <- c(myChar_a, "b", "c")
myVec_string <- c(1, "b", "c")
myVec_string # notice the 1 has become the character "1": a vector holds one type, so R converts everything to character

# Matrix: holds only one data type; matrix() fills the values in column by column
myMat_n <- matrix(c(myVec_n, 
                6, 7, 8, 9, 10), 
              nrow = 2, 
              ncol = 5)

# Lists: Very powerful, but somewhat confusing. For now, just know they exist
myList <- list(2, "c", myMat_n)
myList[[1]] # returns numeric 
myList[[2]] # returns string
myList[[3]] # returns matrix


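Output of `myList[[3]]`, the 2-by-5 matrix:

| [,1] | [,2] | [,3] | [,4] | [,5] |
|---|---|---|---|---|
| 1 | 3 | 5 | 7 | 9 |
| 2 | 4 | 6 | 8 | 10 |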
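Two behaviours in the cell above are easy to miss: `c()` silently converts mixed types, and `matrix()` fills column by column. A small sketch of both, with variable names invented for this example:

```r
# Mixing types in c() converts everything to the most flexible type
mixed_num <- c(1, TRUE, FALSE)   # logicals become numbers: 1 1 0
mixed_chr <- c(1, TRUE, "c")     # everything becomes a string: "1" "TRUE" "c"
class(mixed_num)
class(mixed_chr)

# matrix() fills column by column unless byrow = TRUE
by_col <- matrix(1:10, nrow = 2, ncol = 5)
by_row <- matrix(1:10, nrow = 2, ncol = 5, byrow = TRUE)
by_col[1, ]   # first row: 1 3 5 7 9
by_row[1, ]   # first row: 1 2 3 4 5
dim(by_col)   # 2 rows, 5 columns
```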


## Data Frames

- Like matrices, they are arranged in rows and columns
- Can have different data types in each column
- Reference specific columns using the "$" operator, followed by the name of the column
- For the most part, you’ll be loading new data by reading a CSV 


- You might have to create one at some point.
- By looking at how they’re created we can get a better sense of what goes into them
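Reading a CSV can be sketched as follows. To keep the example self-contained it first writes a small made-up data frame to a temporary file; with real data you would skip that step and pass your own file path to `read.csv()`. The names `toy`, `csv_path`, and `toy_in` are invented for this example.

```r
# Build a tiny data frame by hand (made-up values, for illustration only)
toy <- data.frame(
  name   = c("Ana", "Ben", "Cy"),
  age_yr = c(42, 38, 26)
)

# Write it to a temporary CSV file, then read it back the way you would
# read a real data set
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)
toy_in <- read.csv(csv_path)

str(toy_in)       # column names and types
toy_in$age_yr     # one column, pulled out with $
```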


In [8]:
# Data frame: can have multiple data types 
myDF <- as.data.frame(myMat_n)
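colnames(myDF) # the default names (V1, V2, ...) don't mean anything to me
# Stick to letters, numbers, "_" and "." in names: a name like income_$ would need backticks to use with $
colnames(myDF) <- c("age_yr", "weight_lb", "income_usd", "height_ft", "height_in")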


In [9]:
# Investigate one column 
myDF$age_yr

# Create a new column 
myDF$nonsense <- myDF$age_yr + myDF$weight_lb


In [10]:
# Create the data frame
myPpl <- data.frame(
   gender = c("Male", "non-binary", "Female"), 
   male = c(TRUE, FALSE, FALSE),
   height = c(152, 171.5, 165), 
   weight = c(81, 93, 78),
   age = c(42, 38, 26)
)

# Try referencing one column
myPpl$male # version 1
myPpl[,2] # version 2

# Try referencing one row 
myPpl[1, ]

# Try referencing one cell
myPpl$height[1] # version 1
myPpl[1, 3] # version 2


Output of `myPpl[1, ]`:

| | gender | male | height | weight | age |
|---|---|---|---|---|---|
| | `<chr>` | `<lgl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | Male | TRUE | 152 | 81 | 42 |
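The same bracket notation also accepts a TRUE/FALSE condition in the row position, which is how you keep only the rows you care about. A minimal sketch on a made-up data frame (`ppl` and its values are hypothetical):

```r
ppl <- data.frame(
  name   = c("Ana", "Ben", "Cy"),
  height = c(152, 171.5, 165),
  age    = c(42, 38, 26)
)

# A comparison on a column gives one TRUE/FALSE per row
over_30 <- ppl$age > 30
over_30

# Use that logical vector in the row position to keep matching rows
ppl[over_30, ]

# Name the columns you want in the column position
ppl[ppl$age > 30, c("name", "height")]

# Quick overviews of the whole data frame
nrow(ppl)
ncol(ppl)
str(ppl)
```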


# A word of caution 

- Make sure you don't overwrite your variables by accident. 


In [None]:
# Assigning new value to same variable (something to do carefully)
a <- 5
a <- a + 1 # Each rerun of this line adds 1 again, so a becomes 7, then 8, ... instead of staying at six
a

# Assigning new value to new variable
a <- 5
a_new <- a + 1 # a itself is untouched, so a_new is six no matter how many times you run this line
a_new
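The difference is easiest to see by simulating repeated runs. In this sketch a `for` loop stands in for pressing Run three times; `counter`, `base`, and `base_plus_one` are names made up for the example.

```r
# Overwriting in place: the result depends on how many times the line has run
counter <- 5
for (i in 1:3) {
  counter <- counter + 1   # stands in for pressing Run three times
}
counter   # 8, not 6

# Writing to a new name: the result is the same on every run
base <- 5
for (i in 1:3) {
  base_plus_one <- base + 1
}
base_plus_one   # 6
```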
