**This is a practice notebook for the R101 course on the website:** ***[cognitiveclass.ai](https://cognitiveclass.ai/courses/r-101/)***

# Math, variables and strings

R supports basic mathematical operations like **addition**, **subtraction**, **multiplication**, **division** and **exponentiation**.
It also supports **comparison** and **logical** operators.
Variables store values. Values can be assigned to variables using the **"<-"** operator or the **"="** operator (less commonly used).
Always use meaningful variable names, with an **underscore between words**.


In [None]:
distance_1_in_mts <- 1500
distance_2_in_mts <- 1000
time_in_min <- 15

speed <- (distance_1_in_mts + distance_2_in_mts) / (time_in_min * 60)
speed
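
The comparison and logical operators mentioned above can be sketched in the same way. This is a minimal illustration; `speed_limit` is a hypothetical threshold that is not part of the course material.

```r
# speed in metres per second, computed as in the cell above
speed <- (1500 + 1000) / (15 * 60)

# hypothetical threshold, assumed only for this illustration
speed_limit <- 3

# comparison operators return TRUE or FALSE
is_fast <- speed > speed_limit
is_positive <- speed > 0

# logical operators combine logical values
both <- is_fast & is_positive    # AND
either <- is_fast | is_positive  # OR
not_fast <- !is_fast             # NOT

both
either
not_fast
```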

Strings can be assigned to variables, but they need to be enclosed in **double quotes " "** or **single quotes ' '**.


In [None]:
name_1 <- 'john doe'
name_2 <- "jane doe"

name_1
name_2

# Vectors and factors

## Vectors

A vector is a one-dimensional array of objects. It is a simple tool to store data.
In R, there is no restriction on the number of objects a vector can contain.
A vector is created by enclosing comma-separated values within **c()**.
The **items in a vector must be of the same class**; if values of different classes are combined, R converts them to a common class.

In [None]:
distances <- c(1500, 1000)
distances
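
The same-class rule can be checked with **class()**. The sketch below uses made-up values (`mixed_values` is a name introduced only for this example) to show what happens when numbers, strings and logicals are combined in one call to c().

```r
# all elements are numbers, so the vector is numeric
distances <- c(1500, 1000)
distances_class <- class(distances)

# mixing a number, a string and a logical: everything becomes a string
mixed_values <- c(1500, "long", TRUE)
mixed_class <- class(mixed_values)

distances_class
mixed_class
mixed_values
```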

Mathematical operations on a vector are applied to each element of the vector.

In [None]:
speeds <- distances / time_in_min
speeds

Vectors containing a range of numbers can also be created.

In [None]:
one_to_ten <- 1:10
one_to_ten

reverse_one_to_ten <- 10:1
reverse_one_to_ten

Vectors can also be used to store labels or strings.

In [None]:
names <- c("jane doe", "john doe")
names

A logical vector contains only the values **TRUE** and **FALSE**.
Logical vectors are usually created by comparison or logical operators.

In [None]:
movie_ratings <- c(7.5, 6.0, 5.0, 5.3, 8.6, 9.0, 4.0, 7.9, 9.5, 3.6)
movie_ratings >= 7.5
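
A logical vector is often used to pick out the matching elements. The following sketch reuses the ratings above; `is_high` is a name introduced only for this example.

```r
movie_ratings <- c(7.5, 6.0, 5.0, 5.3, 8.6, 9.0, 4.0, 7.9, 9.5, 3.6)

# one TRUE/FALSE per rating
is_high <- movie_ratings >= 7.5

# keep only the ratings where is_high is TRUE
high_ratings <- movie_ratings[is_high]

# TRUE counts as 1, so sum() counts the matches
n_high <- sum(is_high)

# positions of the matching elements
high_positions <- which(is_high)

high_ratings
n_high
high_positions
```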

## Factors

In R, factors are variables that can take on a **limited number of values**.
They are usually used for **categorical variables** or **labels**.

The **summary()** function can be applied to vectors and factors to summarize them.

In [None]:
genre_vector <- c("Comedy", "Thriller", "Adventure", "History", "Comedy", "Adventure", "Drama", "Drama", "Fantasy", "Comedy")
genre_vector

summary(genre_vector)

The summary of a factor gives the number of occurrences of each label.

In [None]:
genre_factor <- factor(genre_vector)
genre_factor

summary(genre_factor)
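
The distinct labels of a factor are called its levels. As a small sketch, they can be inspected with **levels()** and **nlevels()**, and **table()** gives counts similar to the factor summary.

```r
genre_vector <- c("Comedy", "Thriller", "Adventure", "History", "Comedy",
                  "Adventure", "Drama", "Drama", "Fantasy", "Comedy")
genre_factor <- factor(genre_vector)

# the distinct labels, sorted alphabetically by default
genre_levels <- levels(genre_factor)

# how many distinct labels there are
n_genres <- nlevels(genre_factor)

# number of occurrences of each label
genre_counts <- table(genre_factor)

genre_levels
n_genres
genre_counts
```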

### Ordered factors

Ordered factors are factors whose levels have an inherent order. An ordinal categorical vector can be given an order by converting it to a factor and specifying the order of its levels.

In [None]:
# unordered vector
movie_length_vector <- c("short", "long", "medium", "very short", "very long", "short", "long", "short", "long", "long")

# ordered factor
movie_length_factor <- factor(movie_length_vector, 
                             ordered = TRUE,
                             levels = c("very short", "short", "medium", "long", "very long"))

summary(movie_length_factor)
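
Because the levels are ordered, the elements can be compared with each other or with a level name. A minimal sketch using the same factor (the variable names on the left are introduced only for this example):

```r
movie_length_vector <- c("short", "long", "medium", "very short", "very long",
                         "short", "long", "short", "long", "long")
movie_length_factor <- factor(movie_length_vector,
                              ordered = TRUE,
                              levels = c("very short", "short", "medium", "long", "very long"))

# is the first movie ("short") shorter than the second ("long")?
first_is_shorter <- movie_length_factor[1] < movie_length_factor[2]

# how many movies are "medium" or longer?
n_medium_or_longer <- sum(movie_length_factor >= "medium")

# the longest category that occurs
longest <- max(movie_length_factor)

first_is_shorter
n_medium_or_longer
longest
```

These comparisons are only defined for ordered factors; on an unordered factor R returns NA with a warning.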

The output of summary() can be sorted by count using **sort()**.

In [None]:
sort(summary(movie_length_factor))
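
By default sort() orders the counts from smallest to largest. As a small sketch, passing `decreasing = TRUE` puts the most frequent label first.

```r
movie_length_vector <- c("short", "long", "medium", "very short", "very long",
                         "short", "long", "short", "long", "long")
movie_length_factor <- factor(movie_length_vector,
                              ordered = TRUE,
                              levels = c("very short", "short", "medium", "long", "very long"))

# counts per level, largest first
length_counts <- sort(summary(movie_length_factor), decreasing = TRUE)
length_counts

# the most frequent label
most_common <- names(length_counts)[1]
most_common
```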

## Vector operations

Retrieve particular values from the vectors using **square brackets []**.

In [None]:
titles <- c("Toy Story", "Akira", "The Breakfast Club", "Lord of The Rings")

titles[3]

To retrieve more than one element, pass a vector of indices.

In [None]:
# find elements 2 and 4.
titles[c(2,4)]

Retrieve a range of elements using the **colon operator :**.

In [None]:
# find elements 2 to 4, i.e. 2, 3 and 4
titles[2:4]

If we request an index beyond the end of the vector, R returns the **missing value NA**.

In [None]:
titles[5]

Missing values can also be included in a vector by using **NA**.

In [None]:
age_restric <- c(14, 12, 10, NA, 18, NA)
age_restric

The length of a vector can be found using **length()**.

In [None]:
length(age_restric)
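
Note that length() counts the NA entries as well. A short sketch of how missing values can be detected with **is.na()** and skipped with the `na.rm` argument of mean():

```r
age_restric <- c(14, 12, 10, NA, 18, NA)

# TRUE where the value is missing
missing_flags <- is.na(age_restric)

# number of missing values
n_missing <- sum(missing_flags)

# any NA makes the plain mean NA as well
mean_with_na <- mean(age_restric)

# na.rm = TRUE drops the missing values before averaging
mean_without_na <- mean(age_restric, na.rm = TRUE)

n_missing
mean_with_na
mean_without_na
```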

We can perform arithmetic operations between two vectors. The operation is applied element by element, and any result involving NA is NA.

In [None]:
nums <- c(1:6)

multiplication <- age_restric * nums
multiplication

addition <- age_restric + nums
addition

division <- age_restric / nums
division

exponentiation <- age_restric ^ nums
exponentiation
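
The two vectors above have the same length. When the lengths differ, R recycles the shorter vector. The sketch below uses made-up vectors (`long_vec` and `short_vec` are names introduced only for this example).

```r
long_vec <- c(1, 2, 3, 4, 5, 6)
short_vec <- c(10, 100)

# short_vec is reused as 10, 100, 10, 100, 10, 100
recycled_product <- long_vec * short_vec
recycled_product
```

R issues a warning if the longer length is not a multiple of the shorter one.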

To leave out an element of a vector, use a **negative index** within square brackets. This returns a copy without that element and does not alter the original vector.

In [None]:
cost_2014 <- c(8.6, 8.5, 8.1)

# drop the element at index 1, i.e. 8.6
cost_2014[-1]

# the original vector is intact
cost_2014

A new variable can be created from the result, or the existing vector can be overwritten with it.

In [None]:
new_cost_2014 <- cost_2014[-1]
new_cost_2014
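
As a small sketch of the other two cases, several elements can be dropped at once by negating a vector of indices, and assigning the result back to the same name overwrites the original vector. `middle_cost` is a name introduced only for this example.

```r
cost_2014 <- c(8.6, 8.5, 8.1)

# drop elements 1 and 3 in one step
middle_cost <- cost_2014[-c(1, 3)]
middle_cost

# overwrite the existing vector; 8.6 is now gone for good
cost_2014 <- cost_2014[-1]
cost_2014
```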

Find **min()** and **max()** elements of a vector.

In [None]:
min(cost_2014)

In [None]:
max(cost_2014)

Find **sum()** and **mean()** of elements in a vector.

In [None]:
sum(cost_2014)

In [None]:
mean(cost_2014)

We can attach labels to the elements of a vector using the **names()** function. Here each value in release_year is labelled with the title of the movie released in that year.

In [None]:
#map names to release_year
release_year <- c(1985, 1999, 2015, 1964)
names(release_year) <- c("The Breakfast Club", "American Beauty", "Black Swan", "Chicago")

release_year

Query movies using their names or their indices within **square brackets []**.

In [None]:
release_year[1]
release_year[c(3, 4)]
release_year["Chicago"]
release_year[c("American Beauty", "Black Swan")]
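
A named vector can also be queried with a logical condition. The following sketch finds the movies released after 1990 and the title of the earliest movie, using the same data as above; `recent_titles` and `earliest_title` are names introduced only for this example.

```r
release_year <- c(1985, 1999, 2015, 1964)
names(release_year) <- c("The Breakfast Club", "American Beauty", "Black Swan", "Chicago")

# values (with their names) that satisfy the condition
release_year[release_year > 1990]

# only the titles of those movies
recent_titles <- names(release_year)[release_year > 1990]
recent_titles

# which.min() returns the position of the smallest value, keeping its name
earliest_title <- names(which.min(release_year))
earliest_title
```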

Operations can be performed on the named elements as follows.

In [None]:
release_year[1] + 100 - 20

In [None]:
release_year[c("American Beauty", "Black Swan")] * 20

In [None]:
names(release_year)[1:3]

In [None]:
summary(names(release_year))

In [None]:
summary(release_year)

In [None]:
sort(names(release_year))