geocompx · Nowosad · Jun 7, 2022 · May 7, 2022
diff --git a/01-introduction.Rmd b/01-introduction.Rmd
@@ -139,8 +139,8 @@ rowMeans(b)
 ```{r, eval=FALSE}
 library(leaflet)
 popup = c("Robin", "Jakub", "Jannes")
-leaflet() %>%
-  addProviderTiles("NASAGIBS.ViirsEarthAtNight2012") %>%
+leaflet() |>
+  addProviderTiles("NASAGIBS.ViirsEarthAtNight2012") |>
   addMarkers(lng = c(-3, 23, 11),
              lat = c(52, 53, 49), 
              popup = popup)
@@ -152,8 +152,8 @@ if(knitr::is_latex_output()){
 } else if(knitr::is_html_output()){
     # library(leaflet)
     # popup = c("Robin", "Jakub", "Jannes")
-    # interactive = leaflet() %>%
-    #   addProviderTiles("NASAGIBS.ViirsEarthAtNight2012") %>%
+    # interactive = leaflet() |>
+    #   addProviderTiles("NASAGIBS.ViirsEarthAtNight2012") |>
     #   addMarkers(lng = c(-3, 23, 11),
     #              lat = c(52, 53, 49), 
     #              popup = popup)  

diff --git a/02-spatial-data.Rmd b/02-spatial-data.Rmd
@@ -267,7 +267,7 @@ There are many reasons (linked to the advantages of the simple features model):
 - Enhanced plotting performance
 - **sf** objects can be treated as data frames in most operations
 - **sf** function names are relatively consistent and intuitive (all begin with `st_`)
-- **sf** functions can be combined using `%>%` operator and works well with the [tidyverse](http://tidyverse.org/) collection of R packages\index{tidyverse}.
+- **sf** functions can be combined with the `|>` operator and work well with the [tidyverse](http://tidyverse.org/) collection of R packages\index{tidyverse}.
 
 **sf**'s support for **tidyverse** packages is exemplified by the provision of the `read_sf()` function for reading geographic vector datasets.
 Unlike the function `st_read()`, which returns attributes stored in a base R `data.frame` (and which provides more verbose messages, not shown in the code chunk below), `read_sf()` returns data as a **tidyverse** `tibble`.
@@ -298,7 +298,7 @@ world_sf = st_as_sf(world_sp)           # from sp to sf
 ### Basic map making {#basic-map}
 
 Basic maps are created in **sf** with `plot()`.
-By default this creates a multi-panel plot (like **sp**'s `spplot()`), one sub-plot for each variable of the object, as illustrated in the left-hand panel in Figure \@ref(fig:sfplot).
+By default this creates a multi-panel plot, one sub-plot for each variable of the object, as illustrated in the left-hand panel in Figure \@ref(fig:sfplot).
 A legend or 'key' with a continuous color is produced if the object to be plotted has a single variable (see the right-hand panel).
 Colors can also be set with `col = `, although this will not create a continuous palette or a legend. 
 \index{map making!basic}

diff --git a/03-attribute-operations.Rmd b/03-attribute-operations.Rmd
@@ -29,8 +29,8 @@ library(osmdata)
 london_coords = c(-0.1, 51.5)
 london_bb = c(-0.11, 51.49, -0.09, 51.51)
 bb = tmaptools::bb(london_bb)
-osm_data = opq(bbox = london_bb) %>% 
-  add_osm_feature(key = "highway", value = "bus_stop") %>% 
+osm_data = opq(bbox = london_bb) |> 
+  add_osm_feature(key = "highway", value = "bus_stop") |> 
   osmdata_sf()
 osm_data_points = osm_data$osm_points
 osm_data_points[4, ]
@@ -78,7 +78,7 @@ methods(class = "sf") # methods for sf objects, first 12 shown
 
 ```{r 03-attribute-operations-5, eval=FALSE, echo=FALSE}
 # Another way to show sf methods:
-attributes(methods(class = "sf"))$info %>% 
+attributes(methods(class = "sf"))$info |>
   dplyr::filter(!visible)
 ```
 
@@ -193,15 +193,15 @@ Key functions for subsetting data frames (including `sf` data frames) with **dpl
 i = sample(nrow(world), size = 10)
 benchmark_subset = bench::mark(
   world[i, ],
-  world %>% slice(i)
+  world |> slice(i)
 )
 benchmark_subset[c("expression", "itr/sec", "mem_alloc")]
 # # October 2021 on laptop with CRAN version of dplyr:
 # # A tibble: 2 × 3
 #   expression         `itr/sec` mem_alloc
 #   <bch:expr>             <dbl> <bch:byt>
 # 1 world[i, ]             1744.    5.55KB
-# 2 world %>% slice(i)      671.    4.45KB
+# 2 world |> slice(i)       671.    4.45KB
 ```
 
 `select()` selects columns by name or position.
@@ -317,15 +317,15 @@ Pipes enable expressive code: the output of a previous function becomes the firs
 This is illustrated below, in which only countries from Asia are filtered from the `world` dataset, next the object is subset by columns (`name_long` and `continent`) and the first five rows (result not shown).
 
 ```{r 03-attribute-operations-24}
-world7 = world %>%
-  filter(continent == "Asia") %>%
-  dplyr::select(name_long, continent) %>%
+world7 = world |>
+  filter(continent == "Asia") |>
+  dplyr::select(name_long, continent) |>
   slice(1:5)
 ```
 
 The above chunk shows how the pipe operator allows commands to be written in a clear order:
 the above run from top to bottom (line-by-line) and left to right.
-The alternative to `%>%` is nested function calls, which is harder to read:
+The alternative to `|>` is nested function calls, which is harder to read:
 
 ```{r 03-attribute-operations-25}
 world8 = slice(
@@ -364,20 +364,20 @@ nrow(world_agg2)
 ```
 
 The resulting `world_agg2` object is a spatial object containing 8 features representing the continents of the world (and the open ocean).
-`group_by() %>% summarize()` is the **dplyr** equivalent of `aggregate()`, with the variable name provided in the `group_by()` function specifying the grouping variable and information on what is to be summarized passed to the `summarize()` function, as shown below:
+`group_by() |> summarize()` is the **dplyr** equivalent of `aggregate()`, with the variable name provided in the `group_by()` function specifying the grouping variable and information on what is to be summarized passed to the `summarize()` function, as shown below:
 
 ```{r 03-attribute-operations-28}
-world_agg3 = world %>%
-  group_by(continent) %>%
+world_agg3 = world |>
+  group_by(continent) |>
   summarize(pop = sum(pop, na.rm = TRUE))
 ```
 
 The approach may seem more complex but it has benefits: flexibility, readability, and control over the new column names.
 This flexibility is illustrated in the command below, which calculates not only the population but also the area and number of countries in each continent:
 
 ```{r 03-attribute-operations-29}
-world_agg4  = world %>% 
-  group_by(continent) %>%
+world_agg4  = world |> 
+  group_by(continent) |>
   summarize(pop = sum(pop, na.rm = TRUE), `area (sqkm)` = sum(area_km2), n = n())
 ```
 
@@ -388,14 +388,14 @@ Let's combine what we have learned so far about **dplyr** functions, by chaining
 The following command calculates population density (with `mutate()`), arranges continents by the number countries they contain (with `dplyr::arrange()`), and keeps only the 3 most populous continents (with `top_n()`), the result of which is presented in Table \@ref(tab:continents)):
 
 ```{r 03-attribute-operations-30}
-world_agg5 = world %>% 
-  st_drop_geometry() %>%                      # drop the geometry for speed
-  dplyr::select(pop, continent, area_km2) %>% # subset the columns of interest  
-  group_by(continent) %>%                     # group by continent and summarize:
-  summarize(Pop = sum(pop, na.rm = TRUE), Area = sum(area_km2), N = n()) %>%
-  mutate(Density = round(Pop / Area)) %>%     # calculate population density
-  top_n(n = 3, wt = Pop) %>%                  # keep only the top 3
-  arrange(desc(N))                            # arrange in order of n. countries
+world_agg5 = world |> 
+  st_drop_geometry() |>                      # drop the geometry for speed
+  dplyr::select(pop, continent, area_km2) |> # subset the columns of interest  
+  group_by(continent) |>                     # group by continent and summarize:
+  summarize(Pop = sum(pop, na.rm = TRUE), Area = sum(area_km2), N = n()) |>
+  mutate(Density = round(Pop / Area)) |>     # calculate population density
+  top_n(n = 3, wt = Pop) |>                  # keep only the top 3
+  arrange(desc(N))                           # arrange in order of n. countries
 ```
 
 ```{r continents, echo=FALSE}
@@ -551,14 +551,14 @@ Alternatively, we can use one of **dplyr** functions - `mutate()` or `transmute(
 `mutate()` adds new columns at the penultimate position in the `sf` object (the last one is reserved for the geometry):
 
 ```{r 03-attribute-operations-43, eval=FALSE}
-world %>% 
+world |> 
   mutate(pop_dens = pop / area_km2)
 ```
 
 The difference between `mutate()` and `transmute()` is that the latter drops all other existing columns (except for the sticky geometry column):
 
 ```{r 03-attribute-operations-44, eval=FALSE}
-world %>% 
+world |> 
   transmute(pop_dens = pop / area_km2)
 ```
 
@@ -567,15 +567,15 @@ For example, we want to combine the `continent` and `region_un` columns into a n
 Additionally, we can define a separator (here: a colon `:`) which defines how the values of the input columns should be joined, and if the original columns should be removed (here: `TRUE`):
 
 ```{r 03-attribute-operations-45, eval=FALSE}
-world_unite = world %>%
+world_unite = world |>
   unite("con_reg", continent:region_un, sep = ":", remove = TRUE)
 ```
 
 The `separate()` function does the opposite of `unite()`: it splits one column into multiple columns using either a regular expression or character positions.
 This function also comes from the **tidyr** package.
 
 ```{r 03-attribute-operations-46, eval=FALSE}
-world_separate = world_unite %>% 
+world_separate = world_unite |> 
   separate(con_reg, c("continent", "region_un"), sep = ":")
 ```
 
@@ -588,20 +588,20 @@ The first replaces an old name with a new one.
 The following command, for example, renames the lengthy `name_long` column to simply `name`:
 
 ```{r 03-attribute-operations-48, eval=FALSE}
-world %>% 
+world |> 
   rename(name = name_long)
 ```
 
 `setNames()` changes all column names at once, and requires a character vector with a name matching each column.
 This is illustrated below, which outputs the same `world` object, but with very short names: 
 
 ```{r 03-attribute-operations-49, eval=FALSE, echo=FALSE}
-abbreviate(names(world), minlength = 1) %>% dput()
+abbreviate(names(world), minlength = 1) |> dput()
 ```
 
 ```{r 03-attribute-operations-50, eval=FALSE}
 new_names = c("i", "n", "c", "r", "s", "t", "a", "p", "l", "gP", "geom")
-world %>% 
+world |> 
   setNames(new_names)
 ```
 
@@ -613,7 +613,7 @@ Hence, an approach such as `select(world, -geom)` will be unsuccessful and you s
 ]
 
 ```{r 03-attribute-operations-51}
-world_data = world %>% st_drop_geometry()
+world_data = world |> st_drop_geometry()
 class(world_data)
 ```
 

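Note on the `03-attribute-operations.Rmd` hunks above: the prose there contrasts piped code with nested function calls. The sketch below illustrates that equivalence with base R only. It assumes R >= 4.1.0 (where `|>` was introduced); the `toy` data frame and its values are invented for illustration and are not taken from the `world` dataset.

```r
# Made-up attribute table standing in for `world` (hypothetical values)
toy = data.frame(
  name_long = c("A", "B", "C", "D", "E"),
  continent = c("Asia", "Europe", "Asia", "Asia", "Africa"),
  pop = c(10, 20, 30, 40, 50)
)

# Piped version: reads top to bottom, one step per line
piped = toy |>
  subset(continent == "Asia") |>
  subset(select = c(name_long, continent)) |>
  head(2)

# Nested version: the same calls, read from the inside out
nested = head(
  subset(
    subset(toy, continent == "Asia"),
    select = c(name_long, continent)
  ),
  2
)

# Both forms produce the same object
identical(piped, nested)
```

Because `x |> f(y)` is parsed as `f(x, y)`, the two forms should differ only in readability, not in the result.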
diff --git a/04-spatial-operations.Rmd b/04-spatial-operations.Rmd
@@ -60,7 +60,7 @@ To demonstrate spatial subsetting, we will use the `nz` and `nz_height` datasets
 The following code chunk creates an object representing Canterbury, then uses spatial subsetting to return all high points in the region:
 
 ```{r 04-spatial-operations-3}
-canterbury = nz %>% filter(Name == "Canterbury")
+canterbury = nz |> filter(Name == "Canterbury")
 canterbury_height = nz_height[canterbury, ]
 ```
 
@@ -125,16 +125,18 @@ Note: the solution involving `sgbp` objects is more generalisable though, as it
 The same result can be achieved with the **sf** function `st_filter()` which was [created](https://github.com/r-spatial/sf/issues/1148) to increase compatibility between `sf` objects and **dplyr** data manipulation code:
 
 ```{r}
-canterbury_height3 = nz_height %>%
+canterbury_height3 = nz_height |>
   st_filter(y = canterbury, .predicate = st_intersects)
 ```
 
+<!--toDo:jn-->
+<!-- fix pipes -->
 
 ```{r 04-spatial-operations-7b-old, eval=FALSE, echo=FALSE}
 # Additional tests of subsetting
-canterbury_height4 = nz_height %>%
-  filter(st_intersects(x = ., y = canterbury, sparse = FALSE))
-canterbury_height5 = nz_height %>%
+canterbury_height4 = nz_height |>
+  (\(x) filter(x, st_intersects(x = x, y = canterbury, sparse = FALSE)))()
+canterbury_height5 = nz_height |>
   filter(sel_logical)
 identical(canterbury_height3, canterbury_height4)
 identical(canterbury_height3, canterbury_height5)
@@ -437,7 +439,7 @@ b9sf$domain_b = rep(rep(domains, each = 3), each = 2)
 b9sf = rbind(b9sf, ii, bi, ei, ib, bb, eb, ie, be, ee)
 b9sf$domain_a = ordered(b9sf$domain_a, levels = c("Interior", "Boundary", "Exterior"))
 b9sf$domain_b = ordered(b9sf$domain_b, levels = c("Interior", "Boundary", "Exterior"))
-b9sf = b9sf %>% 
+b9sf = b9sf |> 
   mutate(alpha = case_when(
    Object == "x" ~ 0.1, 
    Object == "y" ~ 0.1, 
@@ -597,8 +599,8 @@ random_df = data.frame(
   x = runif(n = 10, min = bb[1], max = bb[3]),
   y = runif(n = 10, min = bb[2], max = bb[4])
 )
-random_points = random_df %>% 
-  st_as_sf(coords = c("x", "y")) %>% # set coordinates
+random_points = random_df |> 
+  st_as_sf(coords = c("x", "y")) |> # set coordinates
   st_set_crs("EPSG:4326") # set geographic CRS
 ```
 
@@ -663,9 +665,9 @@ if (knitr::is_latex_output()){
     # tm_bubbles(col = "red", alpha = 0.5, size = 0.2) +
     # tm_scale_bar()
   library(leaflet)
-  leaflet() %>%
-    # addProviderTiles(providers$OpenStreetMap.BlackAndWhite) %>%
-    addCircles(data = cycle_hire) %>%
+  leaflet() |>
+    # addProviderTiles(providers$OpenStreetMap.BlackAndWhite) |>
+    addCircles(data = cycle_hire) |>
     addCircles(data = cycle_hire_osm, col = "red")  
 }
 ```
@@ -712,8 +714,8 @@ This is because some cycle hire stations in `cycle_hire` have multiple matches i
 To aggregate the values for the overlapping points and return the mean, we can use the aggregation methods learned in Chapter \@ref(attr), resulting in an object with the same number of rows as the target:
 
 ```{r 04-spatial-operations-26}
-z = z %>% 
-  group_by(id) %>% 
+z = z |> 
+  group_by(id) |> 
   summarize(capacity = mean(capacity))
 nrow(z) == nrow(cycle_hire)
 ```
@@ -731,7 +733,7 @@ The result of this join has used a spatial operation to change the attribute dat
 
 As with attribute data aggregation, spatial data aggregation *condenses* data: aggregated outputs have fewer rows than non-aggregated inputs.
 Statistical *aggregating functions*, such as mean average or sum, summarise multiple values \index{statistics} of a variable, and return a single value per *grouping variable*.
-Section \@ref(vector-attribute-aggregation) demonstrated how `aggregate()` and `group_by() %>% summarize()` condense data based on attribute variables, this section shows how the same functions work with spatial objects.
+Section \@ref(vector-attribute-aggregation) demonstrated how `aggregate()` and `group_by() |> summarize()` condense data based on attribute variables; this section shows how the same functions work with spatial objects.
 \index{aggregation!spatial}
 
 Returning to the example of New Zealand, imagine you want to find out the average height of high points in each region: it is the geometry of the source (`y` or `nz` in this case) that defines how values in the target object (`x` or `nz_height`) are grouped.
@@ -753,8 +755,8 @@ tm_shape(nz_agg) +
 ```
 
 ```{r 04-spatial-operations-29}
-nz_agg2 = st_join(x = nz, y = nz_height) %>%
-  group_by(Name) %>%
+nz_agg2 = st_join(x = nz, y = nz_height) |>
+  group_by(Name) |>
   summarize(elevation = mean(elevation, na.rm = TRUE))
 ```
 
@@ -766,10 +768,11 @@ plot(nz_agg2)
 
 The resulting `nz_agg` objects have the same geometry as the aggregating object `nz` but with a new column summarising the values of `x` in each region using the function `mean()`.
 Other functions could be used instead of `mean()` here, including `median()`, `sd()` and other functions that return a single value per group.
-Note: one difference between the `aggregate()` and `group_by() %>% summarize()` approaches is that the former results in `NA` values for unmatching region names while the latter preserves region names.
+Note: one difference between the `aggregate()` and `group_by() |> summarize()` approaches is that the former results in `NA` values for unmatching region names while the latter preserves region names.
 The 'tidy' approach is thus more flexible in terms of aggregating functions and the column names of the results.
 Aggregating operations that also create new geometries are covered in Section \@ref(geometry-unions).
 
+
 ### Joining incongruent layers {#incongruent}
 
 Spatial congruence\index{spatial congruence} is an important concept related to spatial aggregation.
@@ -809,7 +812,7 @@ This is illustrated in the code chunk below, which finds the distance between th
 \index{sf!distance relations}
 
 ```{r 04-spatial-operations-31, warning=FALSE}
-nz_heighest = nz_height %>% top_n(n = 1, wt = elevation)
+nz_heighest = nz_height |> top_n(n = 1, wt = elevation)
 canterbury_centroid = st_centroid(canterbury)
 st_distance(nz_heighest, canterbury_centroid)
 ```
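
Note on the `<!-- fix pipes -->` to-do in the `04-spatial-operations.Rmd` hunk above: replacing `%>%` with `|>` is not always a one-for-one swap, because the native pipe has no `.` pronoun. The sketch below shows the cases that matter here, using only base R and the built-in `mtcars` dataset as a stand-in (it is not part of the book's code). It assumes R >= 4.2.0, where the `_` placeholder became available.

```r
# 1. Plain chaining: the left-hand side becomes the first argument
sq = c(1, 4, 9) |> sqrt()

# 2. The `_` placeholder must be passed as a *named* argument of the
#    call on the right-hand side (here: `data = _`)
fit = mtcars |> lm(mpg ~ wt, data = _)

# 3. `_` cannot appear inside a nested call, so a pattern such as
#    x |> filter(g(x = _)) is a syntax error. One workaround is to pipe
#    into an anonymous function, which can use its argument anywhere:
n_heavy = mtcars |> (\(d) sum(d$wt > mean(d$wt)))()
```

The anonymous-function form in step 3 is one option among several; calling the function directly without a pipe, e.g. `filter(x, g(x))`, is often simpler.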