## Reading Text Files
There are many, many formats and standards of text documents for storing data.
Common formats for storing data are delimiter-separated values (CSV or tab-delimited),
eXtensible Markup Language (XML), JavaScript Object Notation (JSON), and YAML
(which recursively stands for YAML Ain’t Markup Language).

### CSV and Tab-Delimited Files
RedDeerEndocranialVolume.dlm is a whitespace-delimited file containing measurements of the endocranial volume of some red deer, measured using different techniques.

In [None]:
library(learningr)
deer_file <- system.file(
 "extdata",
 "RedDeerEndocranialVolume.dlm",
 package = "learningr"
)
deer_data <- read.table(deer_file, header = TRUE, fill=TRUE)
str(deer_data, vec.len = 1) #vec.len alters the amount of output

In [None]:
head(deer_data)

There are several convenience wrapper functions to read.table. read.csv sets the default separator to a comma, and assumes that the data has a header row. read.csv2 is its European cousin, using a comma for decimal places and a semicolon as a separator. Likewise, read.delim and read.delim2 import tab-delimited files with full stops or commas for decimal places, respectively.
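The difference between the wrappers can be seen without any files on disk by passing inline text. The following is a minimal sketch using made-up values; the variable names are illustrative only and are not part of the learningr package.

```r
# The same two rows written in both conventions (made-up illustrative values)
us_text <- "id,weight\n1,2.5\n2,3.75"
eu_text <- "id;weight\n1;2,5\n2;3,75"

us_data <- read.csv(text = us_text)   # comma separator, full stop for decimals
eu_data <- read.csv2(text = eu_text)  # semicolon separator, comma for decimals

# The wrappers only change defaults; read.table can be given them explicitly
eu_manual <- read.table(text = eu_text, header = TRUE, sep = ";", dec = ",")

str(us_data)
all.equal(us_data, eu_data)
all.equal(eu_data, eu_manual)
```

All three calls should describe the same data frame, which is why choosing the wrapper that matches the file's conventions is usually enough.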

In [None]:
crab_file <- system.file(
 "extdata",
 "crabtag.csv",
 package = "learningr"
)
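# The file holds several blocks; each is read by skipping to its first
# line (skip) and reading a fixed number of rows (nrows)
(crab_id_block <- read.csv(
 crab_file,
 header = FALSE,
 skip = 3,
 nrows = 2
))
(crab_tag_notebook <- read.csv(
 crab_file,
 header = FALSE,
 skip = 8,
 nrows = 5
))

(crab_lifetime_notebook <- read.csv(
 crab_file,
 header = FALSE,
 skip = 15,
 nrows = 3
))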




In [None]:
write.csv(
 crab_lifetime_notebook,
 "sampledatasets/csvAdded.csv",
 row.names = FALSE,
 fileEncoding = "utf8"
)
"success"

In [None]:
z <- read.csv('sales_modified.txt', sep = '') # sep = '' splits fields on any whitespace
z


### Unstructured Text Files

In [None]:
text_file <- readLines('sampledatasets/gutenberg.txt')
text_file[1926:1927]
    

In [None]:
writeLines(
rev(text_file), #rev reverses vectors
"Shakespeare's The Tempest, backwards.txt"
)
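
The readLines and writeLines pair can also be tried without the Gutenberg file. This is a small sketch that writes a made-up character vector to a temporary file and reads it back; the vector contents and variable names are illustrative assumptions.

```r
# Made-up lines standing in for a text document
lines_in <- c("first line", "second line", "third line")

# Write the lines in reverse order to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(rev(lines_in), tmp)

# readLines returns one character string per line of the file
lines_out <- readLines(tmp)
unlink(tmp) # remove the temporary file

lines_out
```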

### Reading JSON Files


#### Importing JavaScript Object Notation (JSON) Files Into R
To get JSON files into R, you first need to install and load the rjson package:

library(rjson)
JsonData <- fromJSON(file = "<filename.json>" )
If your JSON file is available through a URL, pass the URL as the file argument instead:

library(rjson)
JsonData <- fromJSON(file = "<URL to your JSON file>" ) 

# Install the packages once if they are not already available
#install.packages("jsonlite", repos="http://cran.r-project.org")
#install.packages("rjson")

In [None]:
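library(jsonlite) # jsonlite's fromJSON simplifies JSON arrays of records into data frames
data2 <- jsonlite::fromJSON("https://api.github.com/users/hadley/repos")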

In [None]:
#it's a data frame...
names(data2)
data2$name

In [None]:
#...which has a nested data frame
names(data2$owner)
data2$owner$login

#these are equivalent :)
data2[1,]$owner$login
data2[1,"owner"]$login
data2$owner[1,"login"]
data2$owner[1,]$login
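
The nested structure can be explored offline as well. The sketch below assumes the jsonlite package is installed and parses a small made-up JSON string that only mimics the shape of the repository listing (a name field plus a nested owner object); the field values are invented for illustration.

```r
library(jsonlite)

# Made-up JSON: each record has a scalar "name" and a nested "owner" object
json_text <- '[
  {"name": "repo-a", "owner": {"login": "alice", "id": 1}},
  {"name": "repo-b", "owner": {"login": "bob",   "id": 2}}
]'

repos <- jsonlite::fromJSON(json_text)

# The "owner" column is itself a data frame with one row per record
is.data.frame(repos$owner)
repos$owner$login

# flatten() turns the nested columns into ordinary ones (owner.login, owner.id)
flat_repos <- jsonlite::flatten(repos)
names(flat_repos)
```

Flattening is optional; it mainly helps when the data is to be written out with write.csv, which expects plain columns.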

## Data Import

It is often necessary to import sample textbook data into R before you start working on your homework.



In [82]:
library("readxl")
FirstTable  <- read_excel("sampledatasets/SampleData/SampleData.xlsx", 1)

head(FirstTable)


Unnamed: 0,Unnamed: 1
,
,Online Instruction Page
,Sample Data for Excel
,
,Related tutorials
,VLOOKUP Function


# Web Data

Fortunately, R has a variety of ways to import data from web sources; retrieving the data programmatically makes it possible to collect much more of it with much less effort.

## Sites with an API
Several packages exist that download data directly into R using a website’s application programming interface (API).

In [88]:
library(WDI)
#list all available datasets
wdi_datasets <- WDIsearch()

head(wdi_datasets)

indicator,name
BG.GSR.NFSV.GD.ZS,Trade in services (% of GDP)
BM.KLT.DINV.GD.ZS,"Foreign direct investment, net outflows (% of GDP)"
BN.CAB.XOKA.GD.ZS,Current account balance (% of GDP)
BN.CUR.GDPM.ZS,Current account balance excluding net official capital grants (% of GDP)
BN.GSR.FCTY.CD.ZS,Net income (% of GDP)
BN.KLT.DINV.CD.ZS,Foreign direct investment (% of GDP)


In [103]:
webdta<-read.table("http://www.ats.ucla.edu/stat/examples/ara/angell.txt")

In [105]:
head(webdta)

V1,V2,V3,V4,V5
Rochester,19.0,20.6,15.0,E
Syracuse,17.0,15.6,20.2,E
Worcester,16.4,22.1,13.6,E
Erie,16.2,14.0,14.8,E
Milwaukee,15.8,17.4,17.6,MW
Bridgeport,15.3,27.9,17.5,E


## Working with Twitter Data 

In [None]:
#install.packages("twitteR")
#install.packages("ROAuth")
library("twitteR")
library("ROAuth")

In [144]:
# Download the "cacert.pem" certificate bundle
download.file(url="http://curl.haxx.se/ca/cacert.pem",destfile="cacert.pem")

# Create an object "cred" that stores the authentication details for reuse in later sessions
cred <- OAuthFactory$new(consumerKey='xMQMrGYaxT3MpLa23FIG4fOGe',
      consumerSecret='lSMsRLRj7K2QZnY4FVXBWaKE9uXJ8OWKm3gCt3SzEb4zR8QGBY',
      requestURL='https://api.twitter.com/oauth/request_token',
      accessURL='https://api.twitter.com/oauth/access_token',
      authURL='https://api.twitter.com/oauth/authorize')
# The handshake prints: "To enable the connection, please direct your web browser to: <hyperlink>".
# Note: you only need to do this part once.
cred$handshake(cainfo="cacert.pem")
registerTwitterOAuth(cred)

To enable the connection, please direct your web browser to: 
https://api.twitter.com/oauth/authorize?oauth_token=qcqfXgAAAAAAfi1sAAABV-ZYFbM
When complete, record the PIN given to you and provide it here: 


ERROR: Error: Authorization Required



In [145]:
#save for later use for Windows
save(cred, file="twitter authentication.Rdata")


In [148]:
load("twitter authentication.Rdata")
registerTwitterOAuth(cred)


ERROR: Error in registerTwitterOAuth(cred): ROAuth is no longer used in favor of httr, please see ?setup_twitter_oauth
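
The error message points to setup_twitter_oauth as the replacement for the ROAuth workflow. The following is a hedged sketch of that route, not a tested recipe: it assumes the twitteR package is installed and that the four credentials are stored in environment variables whose names (TWITTER_CONSUMER_KEY and so on) are hypothetical and chosen here only for illustration.

```r
# Hypothetical environment variable names; setting them outside the notebook
# keeps keys and secrets out of the file itself
twitter_keys <- c(
  consumer_key    = Sys.getenv("TWITTER_CONSUMER_KEY"),
  consumer_secret = Sys.getenv("TWITTER_CONSUMER_SECRET"),
  access_token    = Sys.getenv("TWITTER_ACCESS_TOKEN"),
  access_secret   = Sys.getenv("TWITTER_ACCESS_SECRET")
)

if (all(nzchar(twitter_keys)) && requireNamespace("twitteR", quietly = TRUE)) {
  # httr-based authentication that replaces registerTwitterOAuth
  twitteR::setup_twitter_oauth(
    consumer_key    = twitter_keys[["consumer_key"]],
    consumer_secret = twitter_keys[["consumer_secret"]],
    access_token    = twitter_keys[["access_token"]],
    access_secret   = twitter_keys[["access_secret"]]
  )
} else {
  message("Twitter credentials or the twitteR package are missing; skipping authentication.")
}
```

Whether the call succeeds depends on the credentials and on the current state of the Twitter API, so treat this as a starting point and check ?setup_twitter_oauth for details.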
