# Loading data from a file
Credits: http://www.cookbook-r.com/ (Creative Commons Attribution-Share Alike 3.0 Unported License)

## Problem
You want to load data from a file.

## Solution
### Delimited text files

The simplest way to import data is to save it as a text file with delimiters such as tabs or commas (CSV).

The data files used in this example are:

<a href="http://www.cookbook-r.com/Data_input_and_output/Loading_data_from_a_file/datafile.csv">datafile.csv</a>
with the content:

"First","Last","Sex","Number"<br/>
"Currer","Bell","F",2<br/>
"Dr.","Seuss","M",49<br/>
"","Student",NA,21<br/>

<a href="http://www.cookbook-r.com/Data_input_and_output/Loading_data_from_a_file/datafile-noheader.csv">datafile-noheader.csv</a>
with the content:

"Currer","Bell","F",2<br/>
"Dr.","Seuss","M",49<br/>
"","Student",NA,21<br/>
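
To experiment without a network connection, the content of the first file can also be parsed from an inline string. This is only a minimal sketch: the name csv_text is chosen here for illustration, and it relies on the text argument that read.csv passes through to read.table.

```r
# Inline copy of datafile.csv as shown above (csv_text is an illustrative name)
csv_text <- '"First","Last","Sex","Number"
"Currer","Bell","F",2
"Dr.","Seuss","M",49
"","Student",NA,21'

# Parse the string as if it were a file; keep text columns as character vectors
data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Inspect the column types and the parsed values
str(data)
```

The unquoted NA in the Sex column is read as a missing value, while the quoted empty string in First stays an empty string.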


### Loading a file from the Internet
Data can be loaded from a URL. These (very long) URLs will load the files linked to above.

In [1]:
data <- read.csv("http://www.cookbook-r.com/Data_input_and_output/Loading_data_from_a_file/datafile.csv")

#show the data
data

|   | First | Last | Sex | Number |
|---|-------|------|-----|--------|
| 1 | Currer | Bell | F | 2 |
| 2 | Dr. | Seuss | M | 49 |
| 3 |  | Student | NA | 21 |


In [2]:
# Read in a CSV file without headers
data <- read.csv("http://www.cookbook-r.com/Data_input_and_output/Loading_data_from_a_file/datafile-noheader.csv", header=FALSE)

#show the data
data

|   | V1 | V2 | V3 | V4 |
|---|----|----|----|----|
| 1 | Currer | Bell | F | 2 |
| 2 | Dr. | Seuss | M | 49 |
| 3 |  | Student | NA | 21 |


In [3]:
# Manually assign the header names
names(data) <- c("First","Last","Sex","Number")

#show the data
data

|   | First | Last | Sex | Number |
|---|-------|------|-----|--------|
| 1 | Currer | Bell | F | 2 |
| 2 | Dr. | Seuss | M | 49 |
| 3 |  | Student | NA | 21 |
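
The column names can also be supplied while reading, instead of being assigned afterwards. The sketch below assumes the col.names argument of read.table (which read.csv passes through) and parses an inline copy of the headerless file so that it runs offline; csv_text is an illustrative name, not part of the original recipe.

```r
# Inline copy of datafile-noheader.csv (csv_text is an illustrative name)
csv_text <- '"Currer","Bell","F",2
"Dr.","Seuss","M",49
"","Student",NA,21'

# header = FALSE: the first line is data, not column names
# col.names: the names to use instead of the default V1, V2, ...
data <- read.csv(text = csv_text, header = FALSE,
                 col.names = c("First", "Last", "Sex", "Number"))

# Show the resulting column names
names(data)
```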


### Treating strings as factors or characters
In versions of R before 4.0.0, read.csv converts strings in the data to factors by default; since R 4.0.0 the default is stringsAsFactors=FALSE. With the old default, all the text columns in the data below are treated as factors, even though it might make more sense to treat some of them as strings. To load them as strings regardless of the R version, use stringsAsFactors=FALSE:

In [4]:
data <- read.csv("http://www.cookbook-r.com/Data_input_and_output/Loading_data_from_a_file/datafile.csv", stringsAsFactors=FALSE)

# You might have to convert some columns to factors
data$Sex <- factor(data$Sex)
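
To see what the setting changes, the two variants can be compared side by side. This is a rough sketch that parses an inline copy of datafile.csv so it runs offline; the names csv_text, as_chr and as_fct are made up for the illustration.

```r
# Inline copy of datafile.csv (csv_text is an illustrative name)
csv_text <- '"First","Last","Sex","Number"
"Currer","Bell","F",2
"Dr.","Seuss","M",49
"","Student",NA,21'

# Text columns kept as character vectors
as_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Text columns converted to factors
as_fct <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Compare the class of every column under each setting
sapply(as_chr, class)
sapply(as_fct, class)
```

In both cases the Number column is read as an integer; only the text columns differ.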

Another alternative is to load them as factors and convert some columns to characters:

In [5]:
data <- read.csv("http://www.cookbook-r.com/Data_input_and_output/Loading_data_from_a_file/datafile.csv", stringsAsFactors=TRUE)

data$First <- as.character(data$First)
data$Last  <- as.character(data$Last)

# Another method: convert columns named "First" and "Last"
stringcols <- c("First","Last")
data[stringcols] <- lapply(data[stringcols], as.character)
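
A further option, not part of the original recipe, is to state the type of each column at load time with the colClasses argument, so that no conversion is needed afterwards. The sketch below is one possible way to do it, again using an inline copy of datafile.csv (csv_text is an illustrative name) so that it runs without network access.

```r
# Inline copy of datafile.csv (csv_text is an illustrative name)
csv_text <- '"First","Last","Sex","Number"
"Currer","Bell","F",2
"Dr.","Seuss","M",49
"","Student",NA,21'

# One class per column, in file order: names as strings, Sex as a factor
data <- read.csv(text = csv_text,
                 colClasses = c("character", "character", "factor", "integer"))

# Check the resulting class of each column
sapply(data, class)
```

This keeps the type decisions in one place, at the cost of having to know the column order in advance.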