/
first-glance.Rmd
71 lines (53 loc) · 1.94 KB
/
first-glance.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
---
title: "A first glance at the Divvy data"
author: "Peter Carbonetto"
output: workflowr::wflow_html
---
Here, we will take a brief look at the data provided by Divvy.
```{r knitr-opts, include=FALSE}
knitr::opts_chunk$set(comment = "#",collapse = TRUE)
```
I begin by loading a few packages, as well as some additional
functions I wrote for this project.
```{r load-packages, message = FALSE}
library(data.table)
source("../code/functions.R")
```
## Reading the data
I wrote a function, `read.divvy.data`, that reads in the trip and
station data from the [Divvy CSV
files](https://www.divvybikes.com/system-data). This function uses
`fread` from the `data.table` package to quickly read in the data (it
is much faster than `read.table`). This function also prepares the
data, including the departure dates and times, so that they are easier
to work with.
```{r load-data}
divvy <- read.divvy.data()
```
## A first glance at the Divvy data
We have data on 581 Divvy stations across the city.
```{r get-stations-overview}
nrow(divvy$stations)
print(head(divvy$stations),row.names = FALSE)
```
We also have information about the >3 million trips taken on Divvy
bikes in 2016.
```{r get-trips-overview}
nrow(divvy$trips)
print(head(divvy$trips),row.names = FALSE)
```
Out of all the Divvy stations in Chicago, the one on Navy Pier (near
the corner of Streeter and Grand) had the most activity by far.
```{r get-most-trips-by-station}
departures <- table(divvy$trips$from_station_name)
as.matrix(head(sort(departures,decreasing = TRUE)))
```
## Divvy bikes at the University of Chicago
I would also like to take a close look at the trip data for the main
Divvy station on the University of Chicago campus. The Divvy bikes
were rented almost 8,000 times in 2016 at that location.
```{r get-uchicago-station}
sum(divvy$trips$from_station_name == "University Ave & 57th St",na.rm = TRUE)
```
This is the version of R and the packages that were used to generate
these results.