# CSV Input and Output

CSV stands for comma-separated values, and it's one of the most common formats we'll be working with throughout this course. A CSV file typically has a first line listing the column names, followed by one data row per line with the values separated by commas. One of the most basic ways to read CSV files in R is **read.csv()**, which is built into R. Later on we'll learn about **fread**, which is a bit faster and more convenient, but it's important to understand all your options!

When using **read.csv()** you'll need to either pass in the full path of the file or give a path relative to R's current working directory (often the folder containing your script or notebook; check it with **getwd()**). Spaces in the path are fine as long as the whole path is inside quotes. On Windows, remember that a backslash is an escape character in R strings, so write paths with forward slashes or doubled backslashes. This is often a point of confusion for people new to programming, so make sure you understand the above before continuing!
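
The path handling described above can be sketched as follows. The folder and file names are made up for illustration (a temporary folder with a space in its name), so nothing here depends on the course files.

```r
# Hypothetical setup: a temporary folder whose name contains a space
demo_dir <- file.path(tempdir(), "my data")
dir.create(demo_dir, showWarnings = FALSE)

# file.path() joins path pieces with forward slashes, which also work on Windows
demo_file <- file.path(demo_dir, "example.csv")
writeLines(c("Name,Orders", "John,12", "Charlie,11"), demo_file)

# Relative paths are resolved against the working directory
getwd()

# The space needs no escaping because the path is an ordinary quoted string
file.exists(demo_file)
demo <- read.csv(demo_file)
demo
```

Checking **file.exists()** before reading is a quick way to tell a wrong path apart from a badly formatted file.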

In [5]:
# Pass in the entire file path if not in same directory
ex <- read.csv('./CSV/example.csv')
ex

Name,Orders,Date
John,12,12/05/2016
Charlie,11,12/06/2016
Matilda,10,12/07/2016


In [6]:
# Check structure
str(ex)

'data.frame':	3 obs. of  3 variables:
 $ Name  : Factor w/ 3 levels "Charlie","John",..: 2 1 3
 $ Orders: int  12 11 10
 $ Date  : Factor w/ 3 levels "12/05/2016","12/06/2016",..: 1 2 3
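
The **str()** output above shows Name and Date stored as factors, which was the default for **read.csv()** before R 4.0.0. A minimal sketch of reading them as plain strings and converting the dates is below. The data is recreated inline so the block runs on its own, and the month/day/year date format is an assumption, since the file itself does not say which order is meant.

```r
# Recreate the example data inline so the sketch does not depend on ./CSV/example.csv
csv_text <- "Name,Orders,Date
John,12,12/05/2016
Charlie,11,12/06/2016
Matilda,10,12/07/2016"

# stringsAsFactors = FALSE keeps text columns as character vectors
ex2 <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Assumption: the dates are written as month/day/year
ex2$Date <- as.Date(ex2$Date, format = "%m/%d/%Y")

str(ex2)
```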


In [7]:
# Check column names
colnames(ex)

In [8]:
# Create a sample CSV file (row names are written as an unnamed first column)
write.csv(mtcars, file = './CSV/mtcars.csv')

In [11]:
mtcarsdf <- read.csv('./CSV/mtcars.csv')
head(mtcarsdf)

X,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
Mazda RX4,21.0,6,160,110,3.9,2.62,16.46,0,1,4,4
Mazda RX4 Wag,21.0,6,160,110,3.9,2.875,17.02,0,1,4,4
Datsun 710,22.8,4,108,93,3.85,2.32,18.61,1,1,4,1
Hornet 4 Drive,21.4,6,258,110,3.08,3.215,19.44,1,0,3,1
Hornet Sportabout,18.7,8,360,175,3.15,3.44,17.02,0,0,3,2
Valiant,18.1,6,225,105,2.76,3.46,20.22,1,0,3,1
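
The X column in the output above holds the car names: **write.csv()** writes row names as an unnamed first column, and **read.csv()** then labels that column X. Two possible ways to deal with it are sketched below, using a temporary file rather than the notebook's ./CSV folder.

```r
# Illustrative temporary file; the notebook itself writes to ./CSV/mtcars.csv
tmp <- tempfile(fileext = ".csv")
write.csv(mtcars, file = tmp)

# Option 1: restore the first column as row names when reading
cars_a <- read.csv(tmp, row.names = 1)
head(cars_a, 3)

# Option 2: drop the row names when writing (the car names are lost)
write.csv(mtcars, file = tmp, row.names = FALSE)
cars_b <- read.csv(tmp)
head(cars_b, 3)
```

Option 1 round-trips the data frame as it was; option 2 suits data whose row names are just 1, 2, 3, and so on.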


## read.table

In [13]:
# The read.table function is the general form of read.csv. In fact,
# read.csv is just a thin wrapper around read.table with defaults
# suited to CSV files. Without those defaults, read.table splits on
# whitespace, so each line ends up in a single column:
read.table('./CSV/example.csv')

V1
"Name,Orders,Date"
"John,12,12/05/2016"
"Charlie,11,12/06/2016"
"Matilda,10,12/07/2016"


In [14]:
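# Notice that the previous call did not fail, but it read each line as one value in column V1!
# That means we need to pass additional arguments to read.table,
# like what the delimiter (sep) is.
# (The header row still appears as data because header defaults to FALSE.)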
read.table('./CSV/example.csv', sep = ',')

V1,V2,V3
Name,Orders,Date
John,12,12/05/2016
Charlie,11,12/06/2016
Matilda,10,12/07/2016
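
In the output above the header line is still treated as a data row, because **read.table()** defaults to header = FALSE. A small sketch of passing both arguments explicitly follows, with the example data recreated inline so it runs without the file.

```r
# Recreate the example data inline so the sketch does not depend on ./CSV/example.csv
csv_text <- "Name,Orders,Date
John,12,12/05/2016
Charlie,11,12/06/2016
Matilda,10,12/07/2016"

# read.table() needs both the delimiter and the header flag spelled out
via_table <- read.table(text = csv_text, sep = ",", header = TRUE)
via_table

# read.csv() sets sep = "," and header = TRUE by default
via_csv <- read.csv(text = csv_text)
colnames(via_csv)
```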


## fread

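**fread()** is similar to **read.table** but faster and more convenient. It comes from the data.table package, which has to be installed and loaded before use:

In [None]:
library(data.table)
fread('./CSV/example.csv')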