<a href="https://colab.research.google.com/github/JonPaulBIlbao/Intro-to-R/blob/master/intro_to_r_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Basic R Concepts / Machine Learning in Finance 2024-25
This notebook introduces fundamental R concepts: arithmetic operations, variables, data types, sequences, vectors, summary statistics, factors, logical conditions, and creating and subsetting data frames.

## Simple Arithmetic Operations

In [1]:
# Addition
3 + 5

# Subtraction
10 - 4

# Multiplication
2 * 6

# Division
15 / 3
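
Beyond the four operations above, base R also provides exponentiation, the remainder operator `%%`, and integer division `%/%`. The following is a minimal sketch with illustrative numbers (the variable names are hypothetical and not part of the original notebook); it also shows that multiplication binds more tightly than addition unless parentheses are used.

```r
# Exponentiation
power <- 2^3              # 8

# Remainder and integer division
remainder <- 17 %% 5      # 2
quotient  <- 17 %/% 5     # 3

# Operator precedence: * is evaluated before +
no_parens   <- 2 + 3 * 4    # 14
with_parens <- (2 + 3) * 4  # 20

power
remainder
quotient
no_parens
with_parens
```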

## Variables and Assignment

In [2]:
# Assign values to variables
x <- 10  # Assigning value 10 to x
y <- 5   # Assigning value 5 to y

# Perform operations
x + y

## Data Types in R

In [3]:
# Numeric data type
a <- 7.5

# Character data type
b <- "hello"

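# Logical data type
# Note: a variable named `c` works, but it is easy to confuse with the c() function used to build vectors
c <- TRUE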
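
To confirm which type R has assigned to a value, `class()` can be called on it. The sketch below repeats the three assignments from the cell above, except that the logical value is stored in a hypothetical variable `flag` rather than `c`, purely for readability.

```r
a <- 7.5
b <- "hello"
flag <- TRUE   # hypothetical name, used instead of `c`

# class() reports the data type of each object
class_a    <- class(a)     # "numeric"
class_b    <- class(b)     # "character"
class_flag <- class(flag)  # "logical"

class_a
class_b
class_flag

# The is.*() helpers return TRUE or FALSE for a specific type
is.numeric(a)
is.character(b)
is.logical(flag)
```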

## Sequences in R

In [4]:
# Generate a sequence from 1 to 10
1:10
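
The colon operator always steps by 1. When a different step or a fixed number of points is needed, base R's `seq()` is one option. The values below are illustrative only.

```r
# Step of 2 between 1 and 10
odd_numbers <- seq(1, 10, by = 2)         # 1 3 5 7 9

# Five evenly spaced points between 0 and 1
grid <- seq(0, 1, length.out = 5)         # 0.00 0.25 0.50 0.75 1.00

# The colon operator can also count down
countdown <- 10:1

odd_numbers
grid
countdown
```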

## Creating and Manipulating Vectors

In [5]:
# Create vectors
ages <- c(25, 30, 35, 40)
names <- c("John", "Mary", "David", "Emily")
gender <- c("Male", "Female", "Male", "Female")
salary <- c(50000, 60000, 70000, 80000)

# Print vectors
ages
names
gender
salary
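
Arithmetic in R is vectorized: an operation applied to a vector is applied to every element. The sketch below reuses the `ages` and `salary` vectors; the 5% raise and the derived variable names are hypothetical and only serve as an illustration.

```r
ages <- c(25, 30, 35, 40)
salary <- c(50000, 60000, 70000, 80000)

# Number of elements in a vector
n <- length(ages)            # 4

# Apply a hypothetical 5% raise to every salary at once
raised <- salary * 1.05

# Element-wise division: monthly salary
monthly <- salary / 12

n
raised
monthly
```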

## Summary Statistics

In [6]:
# Summary statistics
summary(ages)
summary(names)  # For character data, summary() reports only length, class, and mode
summary(gender) # Same here: no statistics are computed for character data
summary(salary)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  25.00   28.75   32.50   32.50   36.25   40.00 

   Length     Class      Mode 
        4 character character 

   Length     Class      Mode 
        4 character character 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  50000   57500   65000   65000   72500   80000 
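
Since `summary()` says little about character vectors, one common alternative is `table()`, which counts how often each value occurs. The individual statistics shown by `summary()` are also available as separate functions. This is a small sketch using the same vectors; the result variable names are hypothetical.

```r
gender <- c("Male", "Female", "Male", "Female")
salary <- c(50000, 60000, 70000, 80000)

# Frequency of each distinct value in a character vector
counts <- table(gender)

# Individual summary statistics for a numeric vector
mean_salary   <- mean(salary)    # 65000
median_salary <- median(salary)  # 65000
sd_salary     <- sd(salary)      # sample standard deviation, about 12909.94

counts
mean_salary
median_salary
sd_salary
```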

## Converting Gender to a Factor

In [7]:
# Convert gender to a factor
gender_factor <- factor(gender)

# Print and summarize factor
gender_factor
summary(gender_factor)
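
A factor stores its distinct values as levels, and each element is internally an integer code pointing to a level. The sketch below inspects those pieces; `gender_ordered` is a hypothetical second factor showing how the level order could be set explicitly instead of alphabetically.

```r
gender <- c("Male", "Female", "Male", "Female")
gender_factor <- factor(gender)

# Levels are sorted alphabetically by default
lv <- levels(gender_factor)           # "Female" "Male"

# Integer codes behind each element
codes <- as.integer(gender_factor)    # 2 1 2 1

# Setting the level order explicitly (hypothetical ordering)
gender_ordered <- factor(gender, levels = c("Male", "Female"))

lv
codes
levels(gender_ordered)
```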

## Subsetting Vectors

In [8]:
# Subsetting individual elements
gender[1]  # First element
gender[3]  # Third element

# Subsetting multiple elements
gender[1:2]  # First two elements
gender[c(1, 3)]  # First and third elements
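
Positive indices select elements; negative indices drop them. The following sketch, using the same `gender` vector, shows negative indexing and one way to pick the last element. The result variable names are hypothetical.

```r
gender <- c("Male", "Female", "Male", "Female")

# Drop the first element
without_first <- gender[-1]            # "Female" "Male" "Female"

# Drop the first two elements
without_first_two <- gender[-c(1, 2)]  # "Male" "Female"

# Last element, using the vector's length as the index
last_element <- gender[length(gender)] # "Female"

without_first
without_first_two
last_element
```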

## Logical Conditions in R

In [9]:
# Logical conditions
gender == "Male"
gender != "Male"

# Subsetting using conditions
gender[gender == "Male"]
gender[gender != "Female"]
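
Conditions can be combined with `&` (and) and `|` (or), and a logical vector from one variable can be used to subset another. A brief sketch with the `ages` and `gender` vectors follows; the thresholds are arbitrary examples.

```r
ages <- c(25, 30, 35, 40)
gender <- c("Male", "Female", "Male", "Female")

# Both conditions must hold
male_over_30 <- gender == "Male" & ages > 30          # FALSE FALSE TRUE FALSE

# Positions where the condition is TRUE
male_positions <- which(gender == "Male")             # 1 3

# Use a condition on one or both vectors to subset ages
selected_ages <- ages[gender == "Female" | ages >= 35] # 30 35 40

# TRUE counts as 1, so sum() counts matches
n_male <- sum(gender == "Male")                       # 2

male_over_30
male_positions
selected_ages
n_male
```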

## Creating and Exploring Data Frames

In [10]:
# Create a data frame; each vector becomes a column
data <- data.frame(ages, names, gender, salary)

# Print the data frame
data

# Summarize each column of the data frame
summary(data)

ages,names,gender,salary
<dbl>,<chr>,<chr>,<dbl>
25,John,Male,50000
30,Mary,Female,60000
35,David,Male,70000
40,Emily,Female,80000


      ages          names              gender              salary     
 Min.   :25.00   Length:4           Length:4           Min.   :50000  
 1st Qu.:28.75   Class :character   Class :character   1st Qu.:57500  
 Median :32.50   Mode  :character   Mode  :character   Median :65000  
 Mean   :32.50                                         Mean   :65000  
 3rd Qu.:36.25                                         3rd Qu.:72500  
 Max.   :40.00                                         Max.   :80000  
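
Besides printing and `summary()`, a few base R functions give a quick structural overview of a data frame. The sketch below rebuilds the same data frame so that it runs on its own.

```r
ages <- c(25, 30, 35, 40)
names <- c("John", "Mary", "David", "Emily")
gender <- c("Male", "Female", "Male", "Female")
salary <- c(50000, 60000, 70000, 80000)
data <- data.frame(ages, names, gender, salary)

# Compact overview: column names, types, and first values
str(data)

# Number of rows and columns
dims <- dim(data)        # 4 4
n_rows <- nrow(data)     # 4
n_cols <- ncol(data)     # 4

# Column names
cols <- colnames(data)

dims
cols
```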

## Accessing Columns in a Data Frame

In [11]:
# Accessing individual columns
data$ages
data$names
data$gender
data$salary
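
A column extracted with `$` is an ordinary vector, so it can be passed to functions or used to build a new column. The sketch below also shows the equivalent `[[ ]]` form; the column `salary_k` is a hypothetical addition for illustration.

```r
ages <- c(25, 30, 35, 40)
names <- c("John", "Mary", "David", "Emily")
gender <- c("Male", "Female", "Male", "Female")
salary <- c(50000, 60000, 70000, 80000)
data <- data.frame(ages, names, gender, salary)

# Double brackets with a column name return the same vector as $
same_column <- identical(data[["salary"]], data$salary)  # TRUE

# Compute on a column directly
avg_salary <- mean(data$salary)   # 65000

# Add a derived column (hypothetical): salary in thousands
data$salary_k <- data$salary / 1000

same_column
avg_salary
data
```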

## Subsetting Data Frames

In [12]:
# Subsetting row 3, all columns
data[3,]

# Subsetting rows 1 and 3, all columns
data[c(1, 3), ]

# Subsetting rows where salary is greater than 60000, all columns
data[data$salary > 60000, ]

# Subsetting rows where gender is "Male", keeping only the names and salary columns
subset3 <- data[data$gender == "Male", c("names", "salary")]
print(subset3)

  ages names gender salary
3   35 David   Male  70000

  ages names gender salary
1   25  John   Male  50000
3   35 David   Male  70000

  ages names gender salary
3   35 David   Male  70000
4   40 Emily Female  80000


  names salary
1  John  50000
3 David  70000
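
The bracket-based filtering above can also be written with base R's `subset()`, which lets conditions refer to columns by name. The sketch below is one possible equivalent of the last example, followed by sorting rows with `order()`; the result names `males` and `by_salary` are hypothetical.

```r
ages <- c(25, 30, 35, 40)
names <- c("John", "Mary", "David", "Emily")
gender <- c("Male", "Female", "Male", "Female")
salary <- c(50000, 60000, 70000, 80000)
data <- data.frame(ages, names, gender, salary)

# Rows where gender is "Male", keeping the names and salary columns
males <- subset(data, gender == "Male", select = c(names, salary))

# Sort all rows by salary, highest first
by_salary <- data[order(data$salary, decreasing = TRUE), ]

males
by_salary
```

`subset()` is convenient for interactive work; the bracket form is more explicit about which object each column comes from.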
