<h1>R Basics with Jupyter Notebook</h1>

Estimated time needed: **15** minutes

## Objectives

After completing this lab you will be able to:

-   Understand the sample dataset
-   Create variables and perform basic math operations
-   Perform basic string operations


## Table of Contents


<ul>
<li><a href="#About-the-Dataset">About the Dataset</a></li>
<li><a href="#Simple-Math-in-R">Simple Math in R</a></li>
<li><a href="#Variables-in-R">Variables in R</a></li>
<li><a href="#Strings-in-R">Strings in R</a></li>
</ul>

<a id="ref0"></a>
<h2 align=center>About the Dataset</h2>

Which movie should you watch next?

Let's say each of your friends tells you their favorite movies. You do some research on the movies and put the results into a table. Now you can begin exploring the dataset and asking questions about the movies. For example, you can check whether movies from certain genres tend to get better ratings, or how the production cost of movies changes across years, and much more.

**Movies dataset**

The table includes one row for each movie and one column for each movie characteristic:

- **name** - Name of the movie
- **year** - Year the movie was released
- **length_min** - Length of the movie (minutes)
- **genre** - Genre of the movie
- **average_rating** - Average rating on [IMDB](http://www.imdb.com/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0101ENCoursera889-2022-01-01)
- **cost_millions** - Movie's production cost (millions in USD)
- **foreign** - Is the movie foreign (1) or domestic (0)?
- **age_restriction** - Age restriction for the movie
<br>

<img src="https://ibm.box.com/shared/static/6kr8sg0n6pc40zd1xn6hjhtvy3k7cmeq.png" width="90%" align="left">

### We can use R to help us explore the dataset
But to begin, we'll need to start from the basics, so let's get started!

<a id="ref1"></a>
<h2 align=center> Simple Math in R </h2>

Let's say you want to watch *Fight Club* and *Star Wars: Episode IV (1977)*, back-to-back. Do you have enough time to **watch both movies in 4 hours?** Let's try using simple math in R.  

What is the **total movie length** for Fight Club and Star Wars (1977)?
- **Fight Club**: 139 min
- **Star Wars: Episode IV**: 121 min

<div class="alert alert-success alertsuccess" style="margin-top: 20px">
Tip: To run the grey code cell below, click on it, and press Shift + Enter.
</div>

In [1]:
139 + 121

260

Great! You've determined that the total play time of the two movies is **260 min**.

**What is 260 min in hours?**

In [2]:
260 / 60

4.333333333333333

Well, it looks like it's **over 4 hours**, which means you can't watch *Fight Club* and *Star Wars (1977)* back-to-back if you only have 4 hours available!
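The same check can also be written as a comparison, so that R answers the question directly. This is only a small sketch using the two movie lengths given above; the comparison operator `>` is not otherwise covered in this lab.

```r
# Is the combined running time longer than 4 hours (4 * 60 minutes)?
139 + 121 > 4 * 60    # TRUE

# How many minutes over the 4-hour limit is it?
139 + 121 - 4 * 60    # 20
```

Arithmetic is evaluated before the comparison, so no extra brackets are needed here.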

<hr></hr>
<div class="alert alert-success alertsuccess" style="margin-top: 20px">
<h4> [Tip] Simple math in R </h4>
<p></p>
You can do a variety of mathematical operations in R, including:
<ul>
<li> addition: <code>2 + 2</code> </li>
<li> subtraction: <code>5 - 2</code> </li>
<li> multiplication: <code>3 * 2</code> </li>
<li> division: <code>4 / 2</code> </li>
<li> exponentiation: <code>4 ** 2</code> or <code>4 ^ 2</code> </li>
</ul>
</div>
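The operators from the tip can be tried out in a single cell. The last two lines go slightly beyond the tip: `%/%` (integer division) and `%%` (remainder) are standard R operators, shown here only as an optional extra using the 260-minute total from above.

```r
2 + 2     # addition: 4
5 - 2     # subtraction: 3
3 * 2     # multiplication: 6
4 / 2     # division: 2
4 ^ 2     # exponentiation: 16
4 ** 2    # same as 4 ^ 2: 16

260 %/% 60   # whole hours in 260 minutes: 4
260 %% 60    # minutes left over: 20
```

In other words, 260 minutes is 4 hours and 20 minutes.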

<a id="ref2"></a>
<h2 align=center> Variables in R </h2>

We can also **store** our output in **variables**, so we can use them later on. For example:


In [1]:
x <- 139 + 121

To return the value of **`x`**, we can simply run the variable as a command:

In [2]:
x

You can check its type using the `class()` function:

In [4]:
class(x)

And cast the type of `x` to character:

In [6]:
x_char <- as.character(x)
class(x_char)

And cast it back to numeric:

In [7]:
x_num <- as.numeric(x_char)
class(x_num)
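To see the whole round trip in one place, the sketch below repeats the conversion and checks the class at each step. It uses the same variable names as the cells above; the final comparison is an added illustration.

```r
x <- 139 + 121
class(x)                      # "numeric"

# Convert the number to text
x_char <- as.character(x)
x_char                        # "260"
class(x_char)                 # "character"

# Convert the text back to a number so it can be used in arithmetic again
x_num <- as.numeric(x_char)
class(x_num)                  # "numeric"

# The round trip gives back the original value
x_num == x                    # TRUE
```

Note that `x` itself is unchanged by `as.character(x)`; the converted value lives in the new variable `x_char`.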

We can also perform operations on **`x`** and save the result to a **new variable**:

In [9]:
y <- x / 60
y

If we save something to an **existing variable**, it will **overwrite** the previous value:


In [10]:
x <- x / 60
x

It's good practice to use **meaningful variable names**, so you don't have to keep track of which variable is which:


In [11]:
total <- 139 + 121
total

In [13]:
total_hr <- total / 60
total_hr

You can put this all into a single expression, but remember to use **round brackets** to add together the movie lengths first, before dividing by 60.

In [14]:
total_hr <- (139 + 121) / 60
total_hr
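To see why the brackets matter, compare the two versions side by side. The names `no_brackets` and `with_brackets` are made up for this illustration.

```r
# Without brackets, division happens before addition:
# 121 / 60 is computed first, then added to 139
no_brackets <- 139 + 121 / 60
no_brackets       # about 141.02, which is not the total in hours

# With brackets, the lengths are added first, then divided by 60
with_brackets <- (139 + 121) / 60
with_brackets     # about 4.33
```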

<hr></hr>
<div class="alert alert-success alertsuccess" style="margin-top: 0px">
<h4> [Tip] Variables in R </h4>
<p></p>
As you just learned, you can use variables to store values for repeated use. Here are some more characteristics of variables in R:
<ul>
<li>a variable stores the value that an expression evaluates to </li>
<li>variables are typically assigned using <code>&lt;-</code>, but can also be assigned using <code>=</code>, as in <code>x &lt;- 1</code> or <code>x = 1</code> </li>
<li>once created, variables can be removed from memory using <code>rm(my_variable)</code> </li>
</ul>
<p></p>
</div>
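The points in the tip can be tried out as follows. This is a minimal sketch: `my_variable` and `other_variable` are placeholder names, and `exists()` is a base R function used here only to show whether a variable is currently defined.

```r
# Both assignment operators store a value in a variable
my_variable <- 1
other_variable = 1
identical(my_variable, other_variable)   # TRUE

# exists() takes the variable name as a string
exists("my_variable")                    # TRUE

# rm() removes the variable from memory
rm(my_variable)
exists("my_variable")                    # FALSE
```

After `rm(my_variable)`, running `my_variable` on its own would give an "object not found" error.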


In [15]:
# Write your code below. Don't forget to press Shift+Enter to execute the cell
# Example: difference in length between the two movies, in hours
(139 - 121) / 60

<a id="ref4"></a>
<h2 align=center>Strings in R</h2>

In [16]:
movie = "Toy Story"
movie

In [17]:
class(movie)

In [19]:
as.numeric(movie)

“NAs introduced by coercion”
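The warning appears because `"Toy Story"` is text, not a number, so `as.numeric()` returns `NA`. The sketch below contrasts that with text that does look like a number, and then tries a few basic string functions from base R (`nchar()`, `toupper()`, `substr()`, `paste()`). These functions are not used elsewhere in this lab and are shown only as examples.

```r
movie <- "Toy Story"

# Text that looks like a number converts cleanly
as.numeric("139")                          # 139

# Text that does not look like a number becomes NA
# (suppressWarnings() hides the coercion warning shown above)
is.na(suppressWarnings(as.numeric(movie))) # TRUE

# A few basic string operations
nchar(movie)              # number of characters: 9
toupper(movie)            # "TOY STORY"
substr(movie, 1, 3)       # characters 1 to 3: "Toy"
paste("I like", movie)    # join with a space: "I like Toy Story"
```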