/
get_area.Rmd
410 lines (322 loc) 路 13.8 KB
/
get_area.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
---
title: "Get areas and data to layer on a map"
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
```{r setup}
library(mapbaltimore)
library(getdata)
library(ggplot2)
set_map_theme()
```
Using the mapbaltimore package starts with understanding the `get_baltimore_area()` function and getdata function that powers most data access functions in the package: `get_location_data()`.
To show how these different functions work, I'll make a simple `map_area` function we will use in this article.
```{r map_area}
map_area <- function(x, col) {
ggplot(data = x) +
geom_sf(aes(fill = .data[[col]])) +
geom_sf_label(aes(label = .data[[col]])) +
guides(fill = "none")
}
```
## Get areas
The `get_area` function uses the `dplyr::filter()` to select one or more areas of a specified type of political or administrative geography. You can select any one of the seven different types:
- Neighborhoods
- Baltimore City Council districts
- Maryland state legislative districts
- U.S. Congressional districts that include Baltimore City
- Baltimore City Planning Districts
- Baltimore City Police Districts
- Baltimore City Community Statistical Areas
### Get areas by name or id
You can review the names (`name`) or identifiers (`id`) for each type of area by looking at the corresponding column in the data. Typically, the name column should also work as a label for an area and the id column is used as a unique identifier. The names require an exact match. For example, `get_baltimore_area(type = "neighborhood", name = "Washington Village/Pigtown")` works but `get_baltimore_area(type = "neighborhood", name = "Pigtown")` will return an error.
```{r name}
# Show the first 3 council district names
council_districts$name[1:3]
# Get district 8 by name
get_baltimore_area(
type = "council district",
name = "District 8"
) %>%
map_area("name")
# Show the first 3 council district ids
council_districts$id[1:3]
# Get district 7 by id
get_baltimore_area(
type = "council district",
id = 7
) %>%
map_area("id")
```
### Get multiple areas
To return multiple areas, you can provide a vector of names or identifiers.
```{r multi}
area_multiple <- get_baltimore_area(
type = "neighborhood",
name = c("Mount Vernon", "Mid-Town Belvedere", "Seton Hill")
)
area_multiple %>%
map_area("name")
```
You can also combine multiple areas into a single simple feature using the `union` parameter. This is helpful when you want to get data for multiple neighborhoods at the same time or map them as a single combined area.
By default the area names are concatenated using a ampersand separator, however, the length of these combined names are difficult to fit on a map and it is often better to replace the name with a shorter alternative.
```{r multi_union}
area_multiple_union <- get_baltimore_area(
type = "neighborhood",
name = c("Mount Vernon", "Mid-Town Belvedere", "Seton Hill"),
union = TRUE
)
area_multiple_union$name
area_multiple_union$name <- "Mount Vernon area"
area_multiple_union %>%
map_area("name")
```
## Get data for an area
The `get_area_data()` function offers a great deal of flexibility. You can provide an area from `get_area()` or any other simple feature polygon or multipolygon located within Baltimore City (or any region if using cached `baltimore_msa_streets` data set). You can also provide a bounding box created with the `sf::st_bbox()` function.
To illustrate the options for this function, I'm getting the downtown neighborhood as a simple feature object (`area`) and making a list with ggplot2 layers, guide, and scale (`area_layer`) that I reuse below for the example maps in this section.
```{r example_area}
area <- get_baltimore_area(
type = "neighborhood",
name = "Downtown"
)
area_layer <- list(
geom_sf(data = area, fill = "grey90", alpha = 0.8, color = "grey20", linetype = "dotted"),
geom_sf_label(data = area, aes(label = name)),
guides(fill = "none"),
scale_fill_viridis_d()
)
```
### Adjust the area bounding box
In order to place an area in context, you may want a portion of data for the surrounding area so the function returns data within the bounding box of the area by default. The dimensions of this bounding box can be adjusted using the `dist`, `diag_ratio`, and `asp` parameters. You can access these adjustments directly using the `buffer_area()`, `adjust_bbox_asp()`, and `adjust_bbox()` functions. These functions are used below to illustrate how they work when you use the corresponding parameters with `get_area_data()`.
The `dist` parameter is passed to the `sf::st_buffer()` function and is used to set the buffer in meters for the area. The `diag_ratio` is also used to set a buffer distances but the number represents the proportion of the diagonal distance of the area bounding box. This is helpful because a set ratio will scale in proportion to the size of the area.
```{r diag_ratio_example}
example_dist <- 50
example_diag_ratio <- 0.25
# 50 meter buffer
area_dist <- sfext::st_buffer_ext(area, dist = example_dist)
area_dist_bbox <- sfext::sf_bbox_to_sf(sf::st_bbox(area_dist))
# buffer 1/4 (0.25) of the diagonal distance of the bounding box
area_diag_ratio <- sfext::st_buffer_ext(area, diag_ratio = example_diag_ratio)
area_diag_ratio_bbox <- sfext::sf_bbox_to_sf(sf::st_bbox(area_diag_ratio))
ggplot() +
geom_sf(data = area_dist, fill = "purple", alpha = 0.1) +
geom_sf(data = area_dist_bbox, color = "purple", fill = NA) +
geom_sf(data = area_diag_ratio, fill = "darkorange", alpha = 0.1) +
geom_sf(data = area_diag_ratio_bbox, color = "darkorange", fill = NA) +
area_layer
```
The `asp` parameter is applied after any buffers are applied. The `adjust_bbox_asp()` function accepts either a number, e.g. 1.5, or a string in the format most commonly used for aspect ratios, e.g. "6:4". This example shows the extent of a square bounding box for both the buffered downtown areas created above.
```{r asp_example}
example_asp <- "1:1"
area_dist_asp <- sfext::st_bbox_asp(area_dist, asp = example_asp) %>%
sfext::sf_bbox_to_sf()
area_diag_ratio_asp <- sfext::st_bbox_asp(area_diag_ratio, asp = example_asp) %>%
sfext::sf_bbox_to_sf()
ggplot() +
geom_sf(data = area_dist_asp, fill = "purple", color = "purple", alpha = 0.1) +
geom_sf(data = area_diag_ratio_asp, fill = "darkorange", color = "darkorange", alpha = 0.1) +
area_layer
```
### Cropping and trimming data
Finally, here is how these area adjustments work in combination with the `get_location_data()` function. By default, the data is cropped to the bounding box of the provided area:
```{r get_data_example}
get_location_data(
location = area,
data = council_districts
) %>%
map_area("name") +
area_layer
```
Here is the data with a `diag_ratio` buffer:
```{r get_data_diag_ratio}
get_location_data(
location = area,
data = council_districts,
diag_ratio = example_diag_ratio
) %>%
map_area("name") +
area_layer
```
Here is the data using an `asp` adjustment to return a square :
```{r get_data_asp}
get_location_data(
location = area,
data = council_districts,
asp = example_asp
) %>%
map_area("name") +
area_layer
```
You can also avoid cropping if you want to return the full extent of any data that even partially overlaps with the area or bounding box. For example, this is the same example as above with `crop = FALSE`.
```{r crop_false}
get_location_data(
location = area,
data = council_districts,
crop = FALSE
) %>%
map_area("name") +
area_layer
```
If you want to use `crop = FALSE` in combination with the area adjustment parameters you must either supply a bounding box instead of an area *or* adjust the area using `buffer_area()` before passing it to the `get_area_data()` function. The maps are similar enough to the prior example that I've hid the results but provided the code here as a sample.
```{r crop_false_alt, results='hide'}
get_location_data(
location = sfext::st_buffer_ext(area, diag_ratio = example_diag_ratio),
data = council_districts,
crop = FALSE
)
get_location_data(
location = sf::st_bbox(area),
data = council_districts,
diag_ratio = example_diag_ratio,
crop = FALSE
)
```
Depending on the type of data you are working with, you may also want to trim the data to the area using the `sf::st_intersection()` function. You can't trim to an area if you only provide a bounding box (`bbox`); you must provide an area.
```{r trim_example, eval=FALSE}
area_trees <- get_location_data(
location = sf::st_bbox(area),
data = "trees",
dist = example_dist,
from_crs = 2804,
package = "mapbaltimore"
)
area_trees_trimmed <- get_location_data(
location = area,
data = "trees",
dist = example_dist,
trim = TRUE,
package = "mapbaltimore"
)
ggplot() +
area_layer +
geom_sf(data = area_trees, color = "wheat3") +
geom_sf(data = area_trees_trimmed, color = "forestgreen", alpha = 0.8)
```
Similar to crop, using the `trim = TRUE` parameter ignores any distance adjustments but the same work around can be used to apply a buffer to the area before passing it to `get_area_data()`.
```{r trim_example_alt, eval=FALSE}
area_trees_trimmed_diag_ratio <- get_location_data(
location = sfext::st_buffer_ext(area, diag_ratio = example_diag_ratio),
data = "trees",
pkg = "mapbaltimore",
trim = TRUE
)
ggplot() +
area_layer +
geom_sf(data = area_trees_trimmed_diag_ratio, color = "forestgreen") +
geom_sf(data = area_trees_trimmed, color = "wheat3")
```
The `trim` parameter is also supported by the `get_location_data()` and `get_osm_data()` functions that are discussed in more detail in the [article on external, cached, and remote data sources](/articles/articles/extdata_cachedata.html).
## Layering data in area maps
You may be wondering why these all of these parameters may be useful. The `maplayer::layer_location_data()` function combines `get_location_data()` with `ggplot2::geom_sf()` to quickly turn the data from `mapbaltimore` into ggplot maps. Here is a simple example that turns streets and parks data into a map of the downtown area.
```{r layer_area_data}
example_diag_ratio <- 0.05
layer_streets <- maplayer::layer_location_data(
location = area,
data = streets,
color = "gray60",
diag_ratio = example_diag_ratio
)
layer_parks <- maplayer::layer_location_data(
location = area,
data = parks,
fill = "forestgreen",
diag_ratio = example_diag_ratio
)
background_layers <- list(layer_streets, layer_parks)
ggplot() +
background_layers
```
The following example shows how to create a new map layer using data imported from Open Street Map. When no location is provided, no filtering takes place.
```{r layer_area_osm}
layer_area_buildings <- maplayer::layer_location_data(
data = getdata::get_osm_data(
location = area,
diag_ratio = example_diag_ratio,
key = "building",
value = "yes",
geometry = "polygons"
),
fill = "antiquewhite2",
color = NA,
alpha = 1
)
ggplot() +
background_layers +
layer_area_buildings +
labs(caption = "漏 OpenStreetMap contributors")
```
You can also pass a url to add data from any ArcGIS MapServer or FeatureServer.
```{r layer_area_url, eval=FALSE}
parking_facility_url <- "https://opendata.baltimorecity.gov/egis/rest/services/Hosted/Parking_Facilities/FeatureServer/0"
layer_area_parking <- maplayer::layer_location_data(
location = area,
data = parking_facility_url,
diag_ratio = example_diag_ratio,
color = "gray10",
fill = "yellow",
shape = 24,
size = 4
)
ggplot() +
background_layers +
layer_area_buildings +
layer_area_parking +
ggtitle("Parking facilities in Downtown Baltimore")
```
Finally, you can apply some additional function to the data using the same lambda syntax used for `purrr`. For example, the tree data includes dead trees which could be removed before displaying them on a map.
```{r layer_area_trees_f, eval=FALSE}
layer_area_trees <- list(
maplayer::layer_location_data(
location = area,
data = "trees.gpkg",
package = "mapbaltimore",
fn = ~ dplyr::filter(.x, condition != "Dead"),
trim = TRUE,
mapping = aes(
size = dbh * 0.4,
color = factor(condition, c("Good", "Fair", "Poor"))
),
alpha = 0.6
),
guides(size = "none"),
labs(color = "Tree condition"),
scale_color_manual(values = shades::gradient(c("forestgreen", "burlywood4"), 3))
)
ggplot() +
background_layers +
layer_area_trees
```
## Working with multiple areas
There are a few different ways to use these functions with a dataframe of multiple areas. The `get_area_data()` function always combines multiple areas into a single geometry and returns data for a bounding box that encompasses all areas.
If you want to get data for each area separately, the `dplyr::nest_by()` and `purrr::map_dfr()` functions can be used. The following example also shows how `get_nearby_areas()` can be used to return a data frame of overlapping or immediately surrounding areas.
```{r nearby_areas_nested, warning=FALSE}
nearby_areas <- get_nearby_areas(area = area, type = "neighborhood")
nearby_areas_nested <- dplyr::nest_by(nearby_areas, name, .keep = TRUE)
nearby_parks <- purrr::map_dfr(
nearby_areas_nested$data,
~ getdata::get_location_data(
location = .x,
data = parks,
trim = TRUE
) %>%
dplyr::bind_cols(neighborhood = .x$name)
)
# FIXME: This isn't working!
# ggplot() +
# maplayer::layer_location_data(location = nearby_areas, data = streets, trim = TRUE, color = "gray70", crs = 2804) +
# # layer_parks +
# ggplot2::geom_sf(data = sf::st_make_valid(nearby_parks), aes(fill = neighborhood)) +
# # scale_fill_viridis_d() +
# labs(fill = "Neighborhood\nof park")
```
Another approach relies on using the data inherited from `ggplot()` with the option to apply different aesthetics or process the data differently in each layer of the map.
```{r}
parks %>%
ggplot() +
maplayer::layer_location_data(location = area, trim = TRUE, fill = "forestgreen") +
maplayer::layer_location_data(location = nearby_areas[6, ], trim = TRUE, fill = "yellowgreen")
```