/
time-of-day-trends.Rmd
110 lines (86 loc) · 2.96 KB
/
time-of-day-trends.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
---
title: "Divvy usage by time-of-day"
author: "Peter Carbonetto"
output: html_document
---
<!-- Define defaults shared by all workflowr files. -->
```{r read-chunk, include=FALSE, cache=FALSE}
knitr::read_chunk("chunks.R")
```
<!-- Update knitr chunk options -->
```{r knitr-opts-chunk, include=FALSE}
```
<!-- Insert the date the file was last updated -->
```{r last-updated, echo=FALSE, results='asis'}
```
<!-- Insert the code version (Git commit SHA1) if Git repository
exists and R package git2r is installed -->
```{r code-version, echo=FALSE, results='asis'}
```
Here we use the Divvy trip data to examine biking trends over the
course of a typical day in Chicago.
I begin by loading a few packages, as well as some additional
functions I wrote for this project.
```{r load-packages, message = FALSE}
library(data.table)
library(ggplot2)
source("../code/functions.R")
```
<br>
## Read the data
Following [my earlier steps](first-glance.html), I use function
`read.divvy.data` to read the trip and station data from the CSV
files.
```{r load-data, cache = TRUE}
divvy <- read.divvy.data()
```
To make it easier to compile statistics by time of day, I convert the
"start hour" column to a factor (*i.e.*, categorical variable).
```{r convert-start-hour, fig.height = 4, fig.width = 7}
divvy$trips <- transform(divvy$trips,start.hour = factor(start.hour,0:23))
```
<br>
## Count departures
Now that `start.hour` is a factor, it is easy to create a bar chart
showing the total number of departures at each hour.
Unsurprisingly, there is very little biking activity at night. The two
peaks ("modes") in the bar chart nicely recapitulate the morning and
afternoon rush hours.
```{r timeofday-barchart}
ggplot(divvy$trips,aes(start.hour)) +
geom_bar(fill = "black",width = 0.6) +
theme_minimal() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
```
This summary is a bit muddled because it is counting trips on the both
weekdays and weekends; once we separate the counts by the day of the
week, the rush-hour trends become more striking (and disappear
completely on Saturday and Sunday).
```{r timeofday-barchart-2, fig.height = 8.5, fig.width = 7}
ggplot(divvy$trips,aes(start.hour)) +
geom_bar(fill = "black",width = 0.6) +
facet_wrap(~start.day,ncol = 2) +
scale_x_discrete(breaks = seq(0,24,2)) +
theme_minimal() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
```
<br>
## University of Chicago trends
```{r timeofday-uchicago-barchart}
ggplot(subset(divvy$trips,from_station_name == "University Ave & 57th St"),
aes(start.hour)) +
geom_bar(fill = "black",width = 0.6) +
facet_wrap(~start.day,ncol = 2) +
scale_x_discrete(breaks = seq(0,24,2)) +
theme_minimal() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
```
<br>
## Session information
This is the version of R and the packages that were used to generate
these results.
```{r session-info}
```