From 46271f2420fedaf2a2ed100b779399b211cc7045 Mon Sep 17 00:00:00 2001
From: Bob Rudis
Date: Sat, 1 Oct 2016 09:27:40 -0400
Subject: [PATCH] better example

---
 README.Rmd | 50 ++++++++++++++++++++++++++++
 README.md  | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/README.Rmd b/README.Rmd
index af9b83a..9b72dde 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -29,6 +29,56 @@ devtools::install_git("https://gitlab.com/hrbrmstr/bom.git")
 options(width=120)
 ```
 
+There are some basic examples in the [Usage](#Usage) section, but this may be a better illustration. Say you have a CSV file:
+
+```{r}
+fil <- system.file("examples", "stop_times.txt", package="bom")
+```
+
+And, say you want to read it in with a more modern CSV reader:
+
+```{r}
+library(readr)
+
+df <- read_csv(fil)
+```
+
+Let's look at that file:
+
+
+```{r}
+print(df, n=1)
+```
+
+Hrm…why are those backticks around `trip_id`? Isn't it just a regular string?
+
+```{r}
+print(colnames(df)[1])
+```
+
+It sure _looks_ that way, but looks can be deceiving:
+
+```{r}
+print(charToRaw(colnames(df)[1]))
+```
+
+Those strange characters at the beginning are a byte order mark (BOM). We can test for it being there and work around it:
+
+```{r}
+library(bom)
+
+if (file_has_bom(fil)) {
+  n <- switch(file_bom_type(fil), `UTF-8`=3, 2)
+  df <- read_csv(readBin(fil, "raw", file.size(fil))[-(1:n)])
+}
+
+print(df, n=1)
+
+charToRaw(colnames(df)[1])
+```
+
+Note that the built-in `read.csv()` can be used with `encoding="UTF-8-BOM"` and you can even use that encoding on non-binary connections, but you end up having to type convert and tibble convert that object so you're basically rewriting (badly) `readr::read_csv()`.
+
 ### Usage
 
 ```{r message=FALSE, warning=FALSE, error=FALSE}
diff --git a/README.md b/README.md
index fea3914..1e31745 100644
--- a/README.md
+++ b/README.md
@@ -26,6 +26,99 @@ devtools::install_git("https://gitlab.com/hrbrmstr/bom.git")
 options(width=120)
 ```
 
+There are some basic examples in the [Usage](#Usage) section, but this may be a better illustration. Say you have a CSV file:
+
+``` r
+fil <- system.file("examples", "stop_times.txt", package="bom")
+```
+
+And, say you want to read it in with a more modern CSV reader:
+
+``` r
+library(readr)
+
+df <- read_csv(fil)
+```
+
+    ## Parsed with column specification:
+    ## cols(
+    ##   `trip_id` = col_integer(),
+    ##   arrival_time = col_time(format = ""),
+    ##   departure_time = col_time(format = ""),
+    ##   stop_id = col_integer(),
+    ##   stop_sequence = col_integer(),
+    ##   pickup_type = col_integer(),
+    ##   drop_off_type = col_integer()
+    ## )
+
+Let's look at that file:
+
+``` r
+print(df, n=1)
+```
+
+    ## # A tibble: 64,827 × 7
+    ## `trip_id` arrival_time departure_time stop_id stop_sequence pickup_type drop_off_type
+    ##