## R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

When you click the **Knit** button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document.

You can embed an R code chunk as shown below.

Try executing this chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing *Cmd+Shift+Enter*. 

In [1]:
library(conflicted)
conflict_prefer("filter", "dplyr")
conflict_prefer("lag", "dplyr")

[1m[22m[90m[conflicted][39m Will prefer [1m[34mdplyr[39m[22m::filter over any other package.
[1m[22m[90m[conflicted][39m Will prefer [1m[34mdplyr[39m[22m::lag over any other package.


In [2]:
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0


## Saving data to local file
To start with, we will load the built-in iris data set.

We will then manipulate the data, adding a column, and save the data.

Data will be exported in multiple formats to show R capabilities.

A `data` folder has been created for you to save the files in to help keep the 
repo organized.

In [3]:
iris

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
<dbl>,<dbl>,<dbl>,<dbl>,<fct>
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
5.0,3.6,1.4,0.2,setosa
5.4,3.9,1.7,0.4,setosa
4.6,3.4,1.4,0.3,setosa
5.0,3.4,1.5,0.2,setosa
4.4,2.9,1.4,0.2,setosa
4.9,3.1,1.5,0.1,setosa


Now let's add a column to the data.

In [4]:
iris_mutate <- mutate(iris, Sepal.Ratio=Sepal.Width/Sepal.Length) 
iris_mutate

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species,Sepal.Ratio
<dbl>,<dbl>,<dbl>,<dbl>,<fct>,<dbl>
5.1,3.5,1.4,0.2,setosa,0.6862745
4.9,3.0,1.4,0.2,setosa,0.6122449
4.7,3.2,1.3,0.2,setosa,0.6808511
4.6,3.1,1.5,0.2,setosa,0.6739130
5.0,3.6,1.4,0.2,setosa,0.7200000
5.4,3.9,1.7,0.4,setosa,0.7222222
4.6,3.4,1.4,0.3,setosa,0.7391304
5.0,3.4,1.5,0.2,setosa,0.6800000
4.4,2.9,1.4,0.2,setosa,0.6590909
4.9,3.1,1.5,0.1,setosa,0.6326531


Before writing out the table, let's figure out where we are in the repo.

In [5]:
getwd()

The output shows the current working directory, which every relative path is resolved against. If you are working from the `notebooks` folder, the repo-level `data` folder is one directory up, at `../data`; the `..` means to go up a directory. The chunks below use the relative path `data` instead, so they save into a `data` folder inside the working directory and create it first if it does not exist.

Once you have calculated new information in your table, you may want to write the table to a file. You can use the `write.table()` function for this.

The file will show up in the Files tab to the right with the name `data/iris_mutate.txt`.

In [6]:
# Check if a directory exists before creating it
if (!dir.exists("data")) {
  dir.create("data")
}
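As a sketch of a slightly more defensive variant of the check above, the path can be built with `file.path()` and the folder created with `dir.create()` in one call. The `base_dir` and `out_dir` names are illustrative assumptions, and `tempdir()` is used only so the example does not touch your project folders.

```r
# Hypothetical base directory for illustration; swap in your project folder.
base_dir <- tempdir()
out_dir <- file.path(base_dir, "data")

# recursive = TRUE also creates any missing parent folders;
# showWarnings = FALSE keeps the call quiet if the folder already exists.
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Build the output path from its parts instead of pasting strings together.
out_file <- file.path(out_dir, "iris_mutate.txt")

# Confirm the folder is there before writing anything into it.
dir.exists(out_dir)
```

Building paths with `file.path()` avoids hard-coding a separator, and `normalizePath(out_file, mustWork = FALSE)` can be used to print the absolute location if you are unsure where a relative path points.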

In [7]:
write.table(iris_mutate, "data/iris_mutate.txt", row.names = FALSE, sep = "\t")

We can also save the data in CSV format. The syntax is very similar.

In [8]:
write.csv(iris_mutate, "data/iris_mutate.csv", row.names = FALSE)
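To see what the two export calls actually put on disk, the following sketch writes a tiny made-up data frame both ways and reads the raw lines back. The data frame `df` and the temporary file names are assumptions for illustration and are not part of the notebook's data.

```r
# Small illustrative data frame (not from the notebook's data).
df <- data.frame(id = 1:2, label = c("a", "b"))

# Temporary files so the sketch leaves the data folder untouched.
txt_file <- tempfile(fileext = ".txt")
csv_file <- tempfile(fileext = ".csv")

# Same data, two delimiters: tab for write.table(), comma for write.csv().
write.table(df, txt_file, row.names = FALSE, sep = "\t")
write.csv(df, csv_file, row.names = FALSE)

# Inspect the raw text to compare the two layouts.
txt_lines <- readLines(txt_file)
csv_lines <- readLines(csv_file)
print(txt_lines)
print(csv_lines)
```

Both files contain a header line followed by one line per row; only the separator differs. By default both functions also wrap column names and character values in double quotes.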

## Read data in from local file

We will now read the data we saved back into R. 

Let's start with the text table saved first.

In [9]:
iris_mutate_txt <- read.table("data/iris_mutate.txt", header = TRUE, sep = "\t")
iris_mutate_txt

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species,Sepal.Ratio
<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<dbl>
5.1,3.5,1.4,0.2,setosa,0.6862745
4.9,3.0,1.4,0.2,setosa,0.6122449
4.7,3.2,1.3,0.2,setosa,0.6808511
4.6,3.1,1.5,0.2,setosa,0.6739130
5.0,3.6,1.4,0.2,setosa,0.7200000
5.4,3.9,1.7,0.4,setosa,0.7222222
4.6,3.4,1.4,0.3,setosa,0.7391304
5.0,3.4,1.5,0.2,setosa,0.6800000
4.4,2.9,1.4,0.2,setosa,0.6590909
4.9,3.1,1.5,0.1,setosa,0.6326531


Next, let's load the CSV data back into R with a new variable name.

In [10]:
iris_mutate_csv <- read.csv("data/iris_mutate.csv", header = TRUE)
iris_mutate_csv

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species,Sepal.Ratio
<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<dbl>
5.1,3.5,1.4,0.2,setosa,0.6862745
4.9,3.0,1.4,0.2,setosa,0.6122449
4.7,3.2,1.3,0.2,setosa,0.6808511
4.6,3.1,1.5,0.2,setosa,0.6739130
5.0,3.6,1.4,0.2,setosa,0.7200000
5.4,3.9,1.7,0.4,setosa,0.7222222
4.6,3.4,1.4,0.3,setosa,0.7391304
5.0,3.4,1.5,0.2,setosa,0.6800000
4.4,2.9,1.4,0.2,setosa,0.6590909
4.9,3.1,1.5,0.1,setosa,0.6326531


Now that we have two data frames loaded, let's check if they are the same.

In [11]:
identical(iris_mutate_txt, iris_mutate_csv)

This shows that, although we saved the data in two formats, the data frames we read back into R are identical.

So the format you save your data in is a matter of preference, as long as you specify the correct parameters when saving and importing the data.
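One detail worth checking is how the re-imported data compares with the table that was saved. In the outputs above, `Species` is a factor (`<fct>`) before export and a character column (`<chr>`) after import. The sketch below reproduces this with base R only, assuming R 4.0 or later, where `read.csv()` defaults to `stringsAsFactors = FALSE`. The names `original`, `as_chr`, and `as_fct` are illustrative.

```r
# Add the ratio column with base R so the sketch needs no extra packages.
original <- transform(iris, Sepal.Ratio = Sepal.Width / Sepal.Length)

# Write to a temporary file so the data folder is left untouched.
csv_file <- tempfile(fileext = ".csv")
write.csv(original, csv_file, row.names = FALSE)

# Default import: Species comes back as character, not factor.
as_chr <- read.csv(csv_file)
class(original$Species)
class(as_chr$Species)

# stringsAsFactors = TRUE converts text columns back to factors on import.
as_fct <- read.csv(csv_file, stringsAsFactors = TRUE)

# all.equal() compares numbers with a small tolerance, which absorbs the
# rounding introduced when values are written out as text.
same_after_roundtrip <- isTRUE(all.equal(original, as_fct))
same_after_roundtrip
```

`all.equal()` is used here rather than `identical()` because writing numbers as text can round them slightly, and `identical()` requires an exact match.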

This concludes the data import/export module.