# R Tutorial 1 - Intro To R
> This is the first R tutorial I am making for anyone interested in learning R or wanting to work on their fundamentals. <br>
> > My goals for this tutorial series are to further my own learning, to help others learn, and to show just how powerful and useful R is. <br>
> 
> For those of you reading now, feedback is greatly appreciated, and if you would like to see any more on a particular topic or would like to collaborate on a project then let me know! <br>
> For this tutorial series I will be using base R but also packages such as the `tidyverse` package. <br>
<br>
> I will let you know how to install them and use their functions as we go. I will also add comments to my code to help explain what is happening on each line. Comments in R start with a `#`, which tells R that everything after the hash on that line is a comment and should not be run as code.
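
> As a quick illustration of how comments behave, here is a minimal sketch. The object name `x` is just a placeholder made up for this example.

```r
# This whole line is a comment, so R skips it
x <- 5 # R runs the code before the hash and ignores everything after it
x      # print the value stored in x
```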

# Installing and Using Packages
> R itself is a great tool, but what makes it even better is the many packages you can use to customize your experience. There are packages that make complex functions easier, packages that give you more control over plots and their aesthetics, and even packages that contain datasets. A good place to start when getting oriented with R is my favorite package, the `tidyverse`. It is a collection of packages that covers a range of needs from data cleaning to data visualization. Below I will show you how to install and begin to use the `tidyverse`. <br>
<br>
Using the function `install.packages(" ")` you can install your package of choice by inserting the name between the quotes. <br>
> > Once you have the package installed you do not need to install it again. <br>
>
> After installing a package you will have to load it whenever you want to use it in a notebook, script, or project, and you do this with the `library()` function. <br>
> > Insert the package name between the parentheses (you do not need quotes when loading an installed package). <br>
>
> For installing and loading the `tidyverse` package, it should look like this:

In [17]:
install.packages("tidyverse") # install the tidyverse package (only needs to be done once, and needs quotes)
library(tidyverse) # load the tidyverse package (needs to be done for each script, project, or notebook that will use this package, and does not need quotes)
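
> Since a package only needs to be installed once, one option is to check whether it is already installed before calling `install.packages()`. This is a minimal sketch of that idea, not the only way to do it. `needs_install()` is a hypothetical helper name I made up for this example, and I test it on `stats` because that package ships with R.

```r
# needs_install() is a hypothetical helper, not part of R or the tidyverse
needs_install <- function(pkg) {
  # requireNamespace() returns TRUE when the package is already installed
  !requireNamespace(pkg, quietly = TRUE)
}

stats_missing <- needs_install("stats") # "stats" ships with R
stats_missing

# To use the same check with the tidyverse, uncomment the two lines below
# if (needs_install("tidyverse")) install.packages("tidyverse")
# library(tidyverse)
```

> With this pattern you can rerun the same script without reinstalling the package every time.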

> Installing and loading packages is not too hard. If you get to the point where you have many packages to load for your analysis, you can store the package names as a character vector using `c()` and load them all at once, as in the code below.  <br>
<br>
> Here is an example:

In [18]:
packages <- c("ggplot2","tibble","tidyr","readr","purrr","dplyr","stringr","forcats") # object "packages" contains each of these character strings, which are separated by commas and have quotes on either side
# all of the packages above are actually included in the tidyverse package, so just loading tidyverse will load all of them

invisible(lapply(packages, library, character.only = TRUE)) # lapply() calls library() once for each name in "packages" -> character.only = TRUE tells library() that each name is given as a character string -> invisible() hides the list that lapply() returns
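
> If you want to confirm that everything in the vector was loaded, one way is to compare the names against `search()`, which lists the attached packages. This is a small sketch under the assumption that you only need a yes/no answer per package. I use the base packages `stats` and `utils` so it runs without installing anything, and `base_packages` and `is_loaded` are names made up for this example.

```r
base_packages <- c("stats", "utils") # stand-ins for your own package names

# load each package by name, hiding the list that lapply() returns
invisible(lapply(base_packages, library, character.only = TRUE))

# attached packages show up in search() as "package:<name>"
is_loaded <- paste0("package:", base_packages) %in% search()
is_loaded
```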

# Importing CSV Data
> CSV files are very common data sources and easy to import into R. I will show how to import other data file types in the future.  <br>
<br>
> `read_csv()` comes from the `readr` package, which is loaded as part of the `tidyverse`, so you need to have the `tidyverse` loaded to import a CSV this way. <br>
<br>
> First choose a name for the data frame object. For the mtcars data frame I will choose `cars_df`. Type this name on a new line, then use "`<-`" to assign it the result of the function that follows. <br>
<br>
> After the "`<-`", insert the `read_csv(" ")` function to read a CSV. Make sure the CSV you wish to import is in your working directory and put the `filename.csv` between the quotes. If the file is somewhere else, you need to give the `file path` along with the `name.csv`. <br>
<br>
> Look at my example below; I promise it is not as complicated as the words make it seem.

In [19]:
cars_df <- read_csv("../input/mtcars/mtcars.csv") 

# "cars_df" is the name I am giving to the data frame I am importing (you can be creative with the names, but I would keep them clear and concise if possible)

# "<-" is an assignment operator, and it assigns what follows to my object "cars_df"

# read_csv("../input/mtcars/mtcars.csv") reads the CSV at this file path and file name (do not forget the ".csv")
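
> If you do not have the same file path as I do, here is a self-contained sketch you can run anywhere. It is an assumption-heavy stand-in for my example: it writes R's built-in `mtcars` data to a temporary CSV and reads it back with base R's `read.csv()` instead of `read_csv()`, so no packages are needed. The names `csv_path` and `cars_base` are made up for this example.

```r
# write the built-in mtcars data to a temporary CSV so there is a file to import
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

file.exists(csv_path) # check that the file path points to a real file

# read.csv() is the base R counterpart of read_csv()
cars_base <- read.csv(csv_path)

dim(cars_base) # number of rows, then number of columns
```

> Checking the path with `file.exists()` first is a quick way to tell a wrong file path apart from a problem with the file itself.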

> Now that `mtcars.csv` has been imported and saved as `cars_df`, we can take a look at this data frame, and there are a few different ways to do this. <br>
<br>
> The simplest way is to go to a new line, type the object's name, and run it. In my case that is `cars_df`.

In [20]:
cars_df # typing an object's name lets us view the object in the output

> What if you are working with an even bigger dataset? <br>
<br>
> If you are working with a very large dataset and just want to see a "preview" of the data frame, you can use the `head()` function with the object's name as its argument. <br>
<br>
> This gives you only the first 6 rows in your output, which I find useful for seeing the columns and their data types while saving some screen real estate (however, I still check that all of the data is accurate and clean before proceeding, rather than relying on the first 6 rows alone).

In [21]:
head(cars_df) # head of "cars_df" (top 6 rows)
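
> A few related ways to preview a data frame are sketched below. I am using R's built-in `mtcars` data as a stand-in for `cars_df` so the code runs on its own, and `preview` is a name made up for this example.

```r
preview <- head(mtcars, n = 3) # n changes how many rows head() returns
preview

tail(mtcars, n = 3) # the last 3 rows instead of the first
dim(mtcars)         # number of rows, then number of columns
str(mtcars)         # each column's name and type, with a few example values
```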

# Simple Math Functions
> Now that you know how to install and load packages like the `tidyverse` and how to import and view CSV files as objects, we will go over some simple math functions. <br>
<br>
> So for `cars_df`, what if we want to know the average mpg (miles per gallon) of all the cars? <br>
<br>
> It is simple: you can use the `mean()` function. <br>
<br>
> We want the mean of the `mpg` column in `cars_df`, so we need to tell R which column we mean. You do this by referencing the column in the data frame with `cars_df$mpg`.

In [22]:
cars_df$mpg # this prints the mpg column of "cars_df" as a vector

In [23]:
mean(cars_df$mpg) # average of mpg in "cars_df"
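
> To make the `$` step a little more concrete, here is a small sketch using R's built-in `mtcars` data as a stand-in for `cars_df`. It shows that `$` pulls the column out as a plain numeric vector and that `[[ ]]` is another way to reference the same column. The name `mpg` is just a placeholder object for this example.

```r
mpg <- mtcars$mpg # pull the mpg column out of the data frame

class(mpg)                      # the type of values in the column
identical(mpg, mtcars[["mpg"]]) # [[ ]] with the column name in quotes gives the same vector
mean(mpg)                       # mean() works on the vector directly
```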

> `sum()` and `length()` are some other simple functions that can be used just like `mean()`. <br>
<br>
> You can also type out operators to do math, such as `3+3`, and the output will include the answer (6).

In [24]:
sum(cars_df$mpg) # sum of mpg from "cars_df"

length(cars_df$mpg) # count of mpg from "cars_df"

3+3 # addition
3-3 # subtraction
6/2 # division
3*2 # multiplication 
3 %% 2 # modulo (remainder after division)
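
> A few more operators work the same way. The sketch below stores each result in an object so you can reuse it. The object names are made up for this example.

```r
remainder <- 7 %% 2        # remainder after dividing 7 by 2
whole_part <- 7 %/% 2      # integer division (how many times 2 fits into 7)
power <- 2^3               # exponent (2 to the power of 3)
grouped <- (3 + 3) * 2     # parentheses are evaluated first
ungrouped <- 3 + 3 * 2     # without them, multiplication comes before addition

c(remainder, whole_part, power, grouped, ungrouped) # prints 1 3 8 12 9
```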


> You can even combine the two and use functions and operators at the same time.

In [25]:
sum(cars_df$mpg) / length(cars_df$mpg) # sum of mpg divided by length of mpg = average mpg

mean(cars_df$mpg) == sum(cars_df$mpg) / length(cars_df$mpg) # are these two equal? yes, they both give you the average! (for decimal numbers in general, all.equal() is a safer check than ==)
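
> One caveat when comparing decimal numbers with `==`: tiny rounding differences can make two values that should match compare as not equal. The sketch below shows the idea and uses `all.equal()` as a more forgiving check. I use R's built-in `mtcars` data as a stand-in for `cars_df`, and the object names are made up for this example.

```r
x <- 0.1 + 0.2

exact_match <- x == 0.3                  # FALSE, because of floating point rounding
near_match <- isTRUE(all.equal(x, 0.3))  # TRUE, all.equal() allows a tiny tolerance

# the same tolerant check applied to the two ways of getting the average
avg_mean <- mean(mtcars$mpg)
avg_manual <- sum(mtcars$mpg) / length(mtcars$mpg)
same_average <- isTRUE(all.equal(avg_mean, avg_manual))

c(exact_match, near_match, same_average)
```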