diff --git a/02-spatial-data.md b/02-spatial-data.md index a6c0c00d7..cfb942bdb 100644 --- a/02-spatial-data.md +++ b/02-spatial-data.md @@ -768,11 +768,13 @@ Consequently, the discrete borders of these features become blurred, and dependi

(\#fig:raster-intro-plot2)Examples of continuous and categorical rasters.

-### R packages for raster data handling
+### R packages for working with raster data
-R has several packages able to read and process spatial raster data; see \Wref(the-history-of-r-spatial) for more context.
-However, currently, two main packages with this purpose exist -- **terra** and **stars**.^[We are not mentioning the **raster** package here as it is now being replaced with **terra**.]
-We are focusing on the **terra** package in this book; however, it may be worth knowing the basic similarities and differences between the packages before deciding which one to use.
+Over the last two decades, several packages for reading and processing raster datasets have been developed.
+As outlined in Section \@ref(the-history-of-r-spatial), chief among them was **raster**, which led to a step change in R's raster capabilities when it was launched in 2010 and remained the premier package in the space until the development of **terra** and **stars**.
+Both of these more recently developed packages provide powerful and performant functions for working with raster datasets, and there is substantial overlap between their possible use cases.
+In this book we focus on **terra**, which replaces the older and (in most cases) slower **raster**.
+Before learning how **terra**'s class system works, this section describes similarities and differences between **terra** and **stars**; this knowledge will help you decide which is most appropriate in different situations.
 First, **terra** focuses on the most common raster data model (regular grids), while **stars** also allows storing less popular models (including regular, rotated, sheared, rectilinear, and curvilinear grids).
While **terra** usually handles one- or multi-layered rasters^[It also has an additional class `SpatRasterDataset` for storing many collections of datasets.], the **stars** package provides ways to store raster data cubes -- a raster object with many layers (e.g., bands), for many moments in time (e.g., months), and many attributes (e.g., sensor type A and sensor type B).
diff --git a/adv-map.html b/adv-map.html
index d8c4d58d9..aa0413fa0 100644
--- a/adv-map.html
+++ b/adv-map.html
@@ -261,7 +261,7 @@
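The single- versus multi-layer model that **terra** uses can be illustrated with a tiny example -- a sketch assuming the **terra** package is installed (it is not part of base R), with object names of our choosing:

```r
library(terra)

# a regular 3 x 3 grid, the raster data model terra focuses on
r1 = rast(nrows = 3, ncols = 3, xmin = 0, xmax = 3, ymin = 0, ymax = 3,
          vals = 1:9)
r2 = r1 * 10          # a second layer, e.g., another band
r12 = c(r1, r2)       # combine layers into one multi-layered SpatRaster
nlyr(r12)             # number of layers: 2
```

An equivalent **stars** object could represent the same data as a data cube with further dimensions such as time; see the **stars** documentation for details.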

There are two main types of map aesthetics: those that change with the data and those that are constant. Unlike ggplot2, which uses the helper function aes() to represent variable aesthetics, tmap accepts aesthetic arguments directly. To map a variable to an aesthetic, pass its column name to the corresponding argument, and to set a fixed aesthetic, pass the desired value instead.40 +If there is a clash between a fixed value and a column name, the column name takes precedence. This can be verified by running the next code chunk after running <code>nz$red = 1:nrow(nz)</code>.</p>">39 The most commonly used aesthetics for fill and border layers include color, transparency, line width and line type, set with col, alpha, lwd, and lty arguments, respectively. The impact of setting these with fixed values is illustrated in Figure 8.3.

@@ -368,7 +368,7 @@ 

Categorical palettes consist of easily distinguishable colors and are most appropriate for categorical data without any particular order such as state names or land cover classes. Colors should be intuitive: rivers should be blue, for example, and pastures green. Avoid too many categories: maps with large legends and many colors can be uninterpretable.41

+<code>col = "MAP_COLORS"</code> can be used in maps with a large number of individual polygons (for example, a map of individual countries) to create unique colors for adjacent polygons.</p>'>40

The second group is sequential palettes. These follow a gradient, for example from light to dark colors (light colors tend to represent lower values), and are appropriate for continuous (numeric) variables. Sequential palettes can be single (Blues go from light to dark blue, for example) or multi-color/hue (YlOrBr is a gradient from light yellow to brown via orange, for example), as demonstrated in the code chunk below — output not shown, run the code yourself to see the results!

@@ -393,7 +393,7 @@

This property is not preserved in the rainbow color palette; therefore, we suggest avoiding it in geographic data visualization (Borland and Taylor II 2007). Instead, the viridis color palettes, also available in tmap, can be used. Secondly, changes in colors should be accessible to the largest number of people. -Therefore, it is important to use colorblind friendly palettes as often as possible.42

+Therefore, it is important to use colorblind friendly palettes as often as possible.41

@@ -801,7 +801,7 @@

Learn more at: https://github.com/rstudio/shiny-server.

Before considering large apps, it is worth seeing a minimal example, named ‘lifeApp,’ in action.43 +The word ‘app’ in this context refers to ‘web application’ and should not be confused with smartphone apps, the more common meaning of the word.</p>">42 The code below defines and launches — with the command shinyApp() — a lifeApp, which provides an interactive slider allowing users to make countries appear with progressively lower levels of life expectancy (see Figure 8.24):

 library(shiny)    # for shiny apps
diff --git a/algorithms.html b/algorithms.html
index 7f51a92ed..8712c2cc5 100644
--- a/algorithms.html
+++ b/algorithms.html
@@ -114,7 +114,7 @@ 

There are strong reasons for moving in that direction, however.56 +For more on programming, we recommend <span class="citation"><a href="references.html#ref-wickham_advanced_2019" role="doc-biblioref">Wickham</a> (<a href="references.html#ref-wickham_advanced_2019" role="doc-biblioref">2019</a>)</span>, <span class="citation"><a href="references.html#ref-gillespie_efficient_2016" role="doc-biblioref">Gillespie and Lovelace</a> (<a href="references.html#ref-gillespie_efficient_2016" role="doc-biblioref">2016</a>)</span>, and <span class="citation"><a href="references.html#ref-xiao_gis_2016" role="doc-biblioref">Xiao</a> (<a href="references.html#ref-xiao_gis_2016" role="doc-biblioref">2016</a>)</span>. </p>'>55 The advantages of reproducibility go beyond allowing others to replicate your work: reproducible code is often better in every way than code written to be run only once, including in terms of computational efficiency, scalability and ease of adapting and maintaining it.

Scripts are the basis of reproducible R code, a topic covered in Section 10.2. @@ -146,13 +146,13 @@

#> [1] "Hello geocompr"

There are no strict rules on what can and cannot go into script files and nothing to prevent you from saving broken, non-reproducible code.57 +See line 1 of the <code>10-hello.R</code> script.</p>">56 There are, however, some conventions worth following:

It is hard to enforce reproducibility in R scripts, but there are tools that can help. By default, RStudio ‘code-checks’ R scripts and underlines faulty code with a red wavy line, as illustrated below:

@@ -174,7 +174,7 @@

Such dependencies should be mentioned as comments in the script or elsewhere in the project of which it is a part, as illustrated in the script 10-centroid-alg.R. The work undertaken by this script is demonstrated in the reproducible example below, which works on a pre-requisite object named poly_mat, a square with sides 9 units in length (the meaning of this will become apparent in the next section):59

+If you do not, the same script can be called with <code>source("code/10-centroid-alg.R")</code>, assuming you are running R from the root directory of the <code>geocompr</code> folder, which can be downloaded from <a href="https://github.com/Robinlovelace/geocompr" class="uri">https://github.com/Robinlovelace/geocompr</a>.</p>'>58

 poly_mat = cbind(
   x = c(0, 0, 9, 9, 0),
@@ -252,13 +252,13 @@ 

This code chunk outputs the correct result.60 +In this case <span class="math inline">\(10 * 10 / 2 = 50\)</span>.</p>'>59 The problem is that the code is clunky and must be re-typed if we want to run it on another triangle matrix. To make the code more generalizable, we will see how it can be converted into a function in Section 10.4.
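The triangle-area arithmetic mentioned above can be written out in a few lines of base R. This is an illustrative sketch (the helper name `tri_area` is ours, not from the book's `10-centroid-alg.R` script), using the standard 'shoelace' formula for a triangle stored as a 3-row coordinate matrix:

```r
# Vertices of a triangle with base 10 and height 10, one per row
T1 = rbind(c(0, 0), c(10, 0), c(10, 10))

# 'shoelace' formula: half the absolute cross-product of two edge vectors
tri_area = function(Tm) {
  abs((Tm[2, 1] - Tm[1, 1]) * (Tm[3, 2] - Tm[1, 2]) -
      (Tm[3, 1] - Tm[1, 1]) * (Tm[2, 2] - Tm[1, 2])) / 2
}
tri_area(T1)  # 50, matching 10 * 10 / 2
```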

Step 4 requires steps 2 and 3 to be undertaken not just on one triangle (as demonstrated above) but on all triangles. This requires iteration to create all triangles representing the polygon, illustrated in Figure 10.3. lapply() and vapply() are used to iterate over each triangle here because they provide a concise solution in base R:61

+See <code>?lapply</code> for documentation and Chapter <a href="location.html#location">13</a> for more on iteration.</p>'>60

 i = 2:(nrow(poly_mat) - 2)
 T_all = lapply(i, function(x) {
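The chunk above is truncated by the diff hunk; a self-contained base R sketch of the same iteration idea (reconstructing the approach, not necessarily the book's exact script) looks as follows:

```r
# Polygon as a closed coordinate matrix: a 9 x 9 square
poly_mat = cbind(
  x = c(0, 0, 9, 9, 0),
  y = c(0, 9, 9, 0, 0)
)
Origin = poly_mat[1, ]
i = 2:(nrow(poly_mat) - 2)

# one triangle per index: the origin plus two consecutive vertices
T_all = lapply(i, function(x) {
  rbind(Origin, poly_mat[x:(x + 1), ], Origin)
})

# 'shoelace' area of each triangle, summed to give the polygon area
tri_area = function(Tm) {
  abs((Tm[2, 1] - Tm[1, 1]) * (Tm[3, 2] - Tm[1, 2]) -
      (Tm[3, 1] - Tm[1, 1]) * (Tm[2, 2] - Tm[1, 2])) / 2
}
A = vapply(T_all, tri_area, numeric(1))
sum(A)  # 81, the area of the 9 x 9 square
```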
@@ -299,7 +299,7 @@ 

The experience should lead to an appreciation of low-level geographic libraries such as GEOS (which underlies sf::st_centroid()) and CGAL (the Computational Geometry Algorithms Library) which not only run fast but work on a wide range of input geometry types. A great advantage of the open source nature of such libraries is that their source code is readily available for study, comprehension and (for those with the skills and confidence) modification.62

+The source code underlying GEOS function <code>Centroid::getCentroid()</code> can be found at <a href="https://github.com/libgeos/geos/search?q=getCentroid" class="uri">https://github.com/libgeos/geos/search?q=getCentroid</a>.</p>'>61

@@ -316,7 +316,7 @@

The above example demonstrates two key components of functions: 1) the function body, the code inside the curly brackets that define what the function does with the inputs; and 2) the formals, the list of arguments the function works with — x in this case (the third key component, the environment, is beyond the scope of this section). By default, functions return the last object that has been calculated (the coordinates of the centroid in the case of t_centroid()).63

+You can also explicitly set the output of a function by adding <code>return(output)</code> into the body of the function, where <code>output</code> is the result to be returned.</p>">62
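The point about implicit versus explicit returns can be seen in a tiny base R example (illustrative only; these function names are ours):

```r
# The first function relies on R's default of returning the last
# evaluated expression; the second uses an explicit return()
f_implicit = function(x) {
  x^2
}
f_explicit = function(x) {
  return(x^2)
}
f_implicit(3)  # 9
f_explicit(3)  # 9

# formals() lists a function's arguments, the second key component
names(formals(f_implicit))  # "x"
```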

The function now works on any inputs you pass it, as illustrated in the below command which calculates the area of the 1st triangle from the example polygon in the previous section (see Figure 10.3):

 t_centroid(T1)
@@ -348,7 +348,7 @@ 

Provided that you know what the output will be, one function can be used as the building block of another. Thus, the functions t_centroid() and t_area() can be used as sub-components of a larger function to do the work of the script 10-centroid-alg.R: calculate the area of any convex polygon. The code chunk below creates the function poly_centroid() to mimic the behavior of sf::st_centroid() for convex polygons:64

+Note that the functions we created are called iteratively in <code>lapply()</code> and <code>vapply()</code> function calls.</p>">63

 poly_centroid = function(poly_mat) {
   Origin = poly_mat[1, ] # create a point representing the origin
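The chunk above is cut off by the diff hunk. A complete sketch of the composition idea (the centroid of a convex polygon as the area-weighted mean of its triangles' centroids) can be written in base R as follows; this reconstructs the approach under our own function name, not necessarily the book's exact implementation:

```r
poly_centroid_demo = function(poly_mat) {
  Origin = poly_mat[1, ]  # a point representing the origin
  i = 2:(nrow(poly_mat) - 2)
  # triangles made of the origin plus two consecutive vertices
  T_all = lapply(i, function(x) rbind(Origin, poly_mat[x:(x + 1), ], Origin))
  # each triangle's centroid is the mean of its three vertices
  C_list = lapply(T_all, function(Tm) colMeans(Tm[1:3, ]))
  # each triangle's area, via the 'shoelace' formula
  A = vapply(T_all, function(Tm) {
    abs((Tm[2, 1] - Tm[1, 1]) * (Tm[3, 2] - Tm[1, 2]) -
        (Tm[3, 1] - Tm[1, 1]) * (Tm[2, 2] - Tm[1, 2])) / 2
  }, numeric(1))
  # area-weighted average of the triangle centroids
  colSums(do.call(rbind, C_list) * A) / sum(A)
}

poly_mat = cbind(x = c(0, 0, 9, 9, 0), y = c(0, 9, 9, 0, 0))
poly_centroid_demo(poly_mat)  # c(4.5, 4.5) for the 9 x 9 square
```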
diff --git a/attr.html b/attr.html
index 22d36eb8d..d809af931 100644
--- a/attr.html
+++ b/attr.html
@@ -523,7 +523,7 @@ 

This section covers the majority of joining use cases. For more information, we recommend Grolemund and Wickham (2016), the join vignette in the geocompkg package that accompanies this book, and documentation of the data.table package.21 +Its application to geographic data is covered in a blog post hosted at r-spatial.org/r/2017/11/13/perp-performance.html.</p>">20 Another type of join is a spatial join, covered in the next chapter (Section 4.2.3).

@@ -573,7 +573,7 @@

As mentioned at the outset of the chapter, it can be useful to remove the geometry. To do this, you have to explicitly remove it. Hence, an approach such as select(world, -geom) will be unsuccessful and you should instead use st_drop_geometry().22

+<code>st_geometry(world_st) = NULL</code> also works to remove the geometry from <code>world</code>, but overwrites the original object.</p>">21

 world_data = world %>% st_drop_geometry()
 class(world_data)
diff --git a/conclusion.html b/conclusion.html
index 450c4a9a7..f878cd626 100644
--- a/conclusion.html
+++ b/conclusion.html
@@ -137,7 +137,7 @@ 

This is verified using the base R function identical().85 +Therefore, it is the geometries contained in simple feature columns, not the objects themselves, that are identical.</p>">84 Which to use? It depends: the former only processes the geometry data contained in nz so is faster, while the other options perform attribute operations, which may be useful for subsequent steps.

The wider point is that there are often multiple options to choose from when working with geographic data in R, even within a single package. @@ -174,7 +174,7 @@

In this context the decision to use a particular approach, such as the sf/tidyverse/raster ecosystem advocated in this book should be made with knowledge of alternatives. The sp/rgdal/rgeos ecosystem that sf is designed to supersede, for example, can do many of the things covered in this book and, due to its age, is built on by many other packages.86 +The equivalent number for <strong>sf</strong> was 69 in October 2018; with the growing popularity of <strong>sf</strong>, this is set to grow.</p>">85 Although best known for point pattern analysis, the spatstat package also supports raster and other vector geometries (Baddeley and Turner 2005). At the time of writing (October 2018) 69 packages depend on it, making it more than a package: spatstat is an alternative R-spatial ecosystem.

It is also worth being aware of promising alternatives that are under development. @@ -224,7 +224,7 @@

There are good reasons for learning R as a language for geocomputation, as described in Chapter 1, but it is not the only option.87 +It is preferable to have expertise in one language than basic knowledge of many.</p>">86 It would be possible to study Geocomputation with: Python, C++, JavaScript, Scala or Rust in equal depth. Each has evolving geospatial capabilities. rasterio, for example, is a Python package diff --git a/eco.html b/eco.html index ac0b2b85b..567f37873 100644 --- a/eco.html +++ b/eco.html @@ -114,7 +114,7 @@

In this chapter we will model the floristic gradient of fog oases to reveal distinctive vegetation belts that are clearly controlled by water availability. To do so, we will bring together concepts presented in previous chapters and even extend them (Chapters 2 to 5 and Chapters 9 and 11).

Fog oases are one of the most fascinating vegetation formations we have ever encountered. -These formations, locally termed lomas, develop on mountains along the coastal deserts of Peru and Chile.81 +These formations, locally termed lomas, develop on mountains along the coastal deserts of Peru and Chile.80 The deserts’ extreme conditions and remoteness provide the habitat for a unique ecosystem, including species endemic to the fog oases. Despite the arid conditions and low levels of precipitation of around 30-50 mm per year on average, fog deposition increases the amount of water available to plants during austral winter. This results in green southern-facing mountain slopes along the coastal strip of Peru (Figure 14.1). @@ -156,7 +156,7 @@

data("study_area", "random_points", "comm", "dem", "ndvi", package = "spDataLarge")

study_area is an sf polygon representing the outlines of the study area. random_points is an sf object, and contains the 100 randomly chosen sites. -comm is a community matrix of the wide data format (Wickham 2014) where the rows represent the visited sites in the field and the columns the observed species.82

+comm is a community matrix of the wide data format (Wickham 2014) where the rows represent the visited sites in the field and the columns the observed species.81

 # sites 35 to 40 and corresponding occurrences of the first five species in the
 # community matrix
@@ -182,7 +182,7 @@ 

The next step is to compute variables which we will not only need for the modeling and predictive mapping (see Section 14.4.2) but also for aligning the Non-metric multidimensional scaling (NMDS) axes with the main gradient in the study area, altitude and humidity, respectively (see Section 14.3).

Specifically, we will compute catchment slope and catchment area from a digital elevation model using R-GIS bridges (see Chapter 9). Curvatures might also represent valuable predictors; in the Exercise section you can find out how they would change the modeling result.

-

To compute catchment area and catchment slope, we will make use of the saga:sagawetnessindex function.83 +

To compute catchment area and catchment slope, we will make use of the saga:sagawetnessindex function.82 get_usage() returns all function parameters and default values of a specific geoalgorithm. Here, we present only a selection of the complete output.

@@ -264,7 +264,7 @@ 

# keep only sites in which at least one species was found pa = pa[rowSums(pa) != 0, ] # 84 rows, 69 columns

The resulting output matrix serves as input for the NMDS. -k specifies the number of output axes, here, set to 4.84 +k specifies the number of output axes, here, set to 4.83 NMDS is an iterative procedure trying to make the ordinated space more similar to the input matrix in each step. To make sure that the algorithm converges, we set the number of steps to 500 (try parameter).

diff --git a/geometric-operations.html b/geometric-operations.html
index 515949ae6..75a583fd7 100644
--- a/geometric-operations.html
+++ b/geometric-operations.html
@@ -178,7 +178,7 @@ 

The following code chunk uses this function to simplify us_states2163. The result has only 1% of the vertices of the input (set using the argument keep) but its number of objects remains intact because we set keep_shapes = TRUE:24

+Simplification of multipolygon objects can remove small internal polygons, even if the <code>keep_shapes</code> argument is set to TRUE. To prevent this, you need to set <code>explode = TRUE</code>. This option converts all multipolygons into separate polygons before simplification.</p>">23

 # proportion of points to retain (0-1; default 0.05)
 us_states2163$AREA = as.numeric(us_states2163$AREA)
@@ -211,7 +211,7 @@ 

In such cases point on surface operations can be used to guarantee the point will be in the parent object (e.g., for labeling irregular multipolygon objects such as island states), as illustrated by the red points in Figure 5.3. Notice that these red points always lie on their parent objects. They were created with st_point_on_surface() as follows:25

+A description of how <code>st_point_on_surface()</code> works is provided at <a href="https://gis.stackexchange.com/q/76498" class="uri">https://gis.stackexchange.com/q/76498</a>.</p>'>24

 nz_pos = st_point_on_surface(nz)
 seine_pos = st_point_on_surface(seine)
@@ -817,7 +817,7 @@

#> [1] 0 0

If two rasters have different origins, their cells do not overlap completely, which would make map algebra impossible. To change the origin, use origin().26 +If the origins of two raster datasets are just marginally apart, it sometimes is sufficient to simply increase the <code>tolerance</code> argument of <code>terra::terraOptions()</code>.</p>">25 Looking at Figure 5.14 reveals the effect of changing the origin.

 # change the origin
@@ -840,7 +840,7 @@ 

To match resolutions, one can either decrease (aggregate()) or increase (disagg()) the resolution of one raster.27 +Check out the <code>tapp()</code> example in the documentation to get an idea of how to do temporal raster aggregation.</p>">26 As an example, here we change the spatial resolution of dem (found in the spDataLarge package) by a factor of 5 (Figure 5.15). Additionally, the output cell value should correspond to the mean of the input cells (note that one could use other functions as well, such as median(), sum(), etc.):
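The aggregation step can be sketched with a small constructed raster standing in for dem — an illustrative example assuming the **terra** package is installed (the object names are ours):

```r
library(terra)

# toy 10 x 10 raster standing in for the dem object from spDataLarge
r = rast(nrows = 10, ncols = 10, xmin = 0, xmax = 10, ymin = 0, ymax = 10,
         vals = 1:100)

# decrease the resolution by a factor of 5; each output cell becomes
# the mean of 25 input cells
r_agg = aggregate(r, fact = 5, fun = mean)
res(r_agg)    # cell size grows from 1 x 1 to 5 x 5

# disagg() goes the other way, e.g., with bilinear interpolation
r_disagg = disagg(r_agg, fact = 5, method = "bilinear")
```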

@@ -888,7 +888,7 @@ 

  • Bilinear interpolation - assigns a weighted average of the four nearest cells from the original raster to the cell of the target one (Figure 5.16). The fastest method for continuous rasters.
  • Cubic interpolation - uses values of 16 nearest cells of the original raster to determine the output cell value, applying third-order polynomial functions. Used for continuous rasters. It results in a smoother surface than bilinear interpolation, but is also more computationally demanding.
  • Cubic spline interpolation - also uses values of 16 nearest cells of the original raster to determine the output cell value, but applies cubic splines (piecewise third-order polynomial functions) to derive the results. Used for continuous rasters.
  • -
  • Lanczos windowed sinc resampling - uses values of 36 nearest cells of the original raster to determine the output cell value. Used for continuous rasters.28 +
  • Lanczos windowed sinc resampling - uses values of 36 nearest cells of the original raster to determine the output cell value. Used for continuous rasters.27
  • As noted in the explanations above, only nearest neighbor is suitable for categorical rasters, while all of the methods can be used (with different outcomes) for continuous rasters.
diff --git a/gis.html b/gis.html
index fac5f5301..bde6298db 100644
--- a/gis.html
+++ b/gis.html
@@ -6,16 +6,16 @@ Chapter 9 Bridges to GIS software | Geocomputation with R @@ -98,7 +98,7 @@

    Prerequisites<

    @@ -121,7 +121,7 @@ 

    Other ‘command-lines’ include terminals for interacting with the operating system and other interpreted languages such as Python. Many GISs originated as a CLI: it was only after the widespread uptake of computer mice and high-resolution screens in the 1990s that GUIs became common. -GRASS, one of the longest-standing GIS programs, for example, relied primarily on command-line interaction before it gained a sophisticated GUI <span class="citation">(<a href="references.html#ref-landa_new_2008" role="doc-biblioref">Landa 2008</a>)</span>.</p>'>45 +GRASS, one of the longest-standing GIS programs, for example, relied primarily on command-line interaction before it gained a sophisticated GUI <span class="citation">(<a href="references.html#ref-landa_new_2008" role="doc-biblioref">Landa 2008</a>)</span>.</p>'>44 In dedicated GIS packages, by contrast, the emphasis tends to be on the graphical user interface (GUI). You can interact with GRASS, QGIS, SAGA and gvSIG from system terminals and embedded CLIs such as the Python Console in QGIS, but ‘pointing and clicking’ is the norm. This means many GIS users miss out on the advantages of the command-line according to Gary Sherman, creator of QGIS (Sherman 2008):

    @@ -132,7 +132,7 @@

    you can do something on the command line in a fraction of the time you can do it with a GUI.

    -

    The ‘CLI vs GUI’ debate can be adversarial but it does not have to be; both options can be used interchangeably, depending on the task at hand and the user’s skillset.46 +

    The ‘CLI vs GUI’ debate can be adversarial but it does not have to be; both options can be used interchangeably, depending on the task at hand and the user’s skillset.45 The advantages of a good CLI such as that provided by R (and enhanced by IDEs such as RStudio) are numerous. A good CLI:

      @@ -147,7 +147,7 @@

      • Has a ‘shallow’ learning curve meaning geographic data can be explored and visualized without hours of learning a new language;
      • Provides excellent support for ‘digitizing’ (creating new vector datasets), including trace, snap and topological tools;47 +The <strong>mapedit</strong> package allows the quick editing of a few spatial features but not professional, large-scale cartographic digitizing;</p>">46
      • Enables georeferencing (matching raster images to existing maps) with ground control points and orthorectification;
      • Supports stereoscopic mapping (e.g., LiDAR and structure from motion); and
      • @@ -156,7 +156,7 @@

        Another advantage of dedicated GISs is that they provide access to hundreds of ‘geoalgorithms’ (computational recipes to solve geographic problems — see Chapter 10). Many of these are unavailable from the R command line, except via ‘GIS bridges,’ the topic of (and motivation for) this chapter.48

        +Roger Bivand elaborated on this in his talk, “Bridges between GIS and R,” delivered at the 2016 GEOSTAT summer school (see slides at: <a href="http://spatial.nhh.no/misc/" class="uri">http://spatial.nhh.no/misc/</a>).</p>'>47

        A command-line interface is a means of interacting with computer programs in which the user issues commands via successive lines of text (command lines). @@ -174,7 +174,7 @@

        Though not covered here, it is worth being aware of the interface to ArcGIS, a proprietary and very popular GIS software, via RPyGeo.49 +Finally, it is also possible to use R from the GRASS GIS command line (see <a href="https://grasswiki.osgeo.org/wiki/R_statistics/rgrass7" class="uri">https://grasswiki.osgeo.org/wiki/R_statistics/rgrass7</a>).</p>'>48 To complement the R-GIS bridges, the chapter ends with a very brief introduction to interfaces to spatial libraries (Section 9.6.1) and spatial databases (Section 9.6.2).

        @@ -633,7 +633,7 @@

        #> ...

        This example — which returns the same result as rgdal::ogrInfo() — may be simple, but it shows how to use GDAL via the system command-line, independently of other packages. The ‘link’ to GDAL provided by link2gi could be used as a foundation for doing more advanced GDAL work from the R or system CLI.50 +Note also that the <strong>RSAGA</strong> package uses the command line interface to use SAGA geoalgorithms from within R (see Section <a href="gis.html#rsaga">9.3</a>). </p>'>49 TauDEM (http://hydrology.usu.edu/taudem/taudem5/index.html) and the Orfeo Toolbox (https://www.orfeo-toolbox.org/) are other spatial data processing libraries/programs offering a command line interface. At the time of writing, it appears that there is only a developer version of an R/TauDEM interface on R-Forge (https://r-forge.r-project.org/R/?group_id=956). In any case, the above example shows how to access these libraries from the system command line via R. @@ -651,15 +651,15 @@

        This is useful because geographic datasets tend to become big and messy quite quickly. Databases enable storing and querying large datasets efficiently based on spatial and non-spatial fields, and provide multi-user access and topology support.

        The most important open source spatial database is PostGIS (Obe and Hsu 2015).51 +SQLite/SpatiaLite are certainly also important but implicitly we have already introduced this approach since GRASS is using SQLite in the background (see Section <a href="gis.html#rgrass">9.4</a>).</p>'>50 R bridges to spatial DBMSs such as PostGIS are important, allowing access to huge data stores without loading several gigabytes of geographic data into RAM, and likely crashing the R session. The remainder of this section shows how PostGIS can be called from R, based on “Hello real world” from PostGIS in Action, Second Edition (Obe and Hsu 2015).52

        +Thanks to Manning Publications, Regina Obe and Leo Hsu for permission to use this example.</p>">51

        The subsequent code requires a working internet connection, since we are accessing a PostgreSQL/PostGIS database which lives in the QGIS Cloud (https://qgiscloud.com/).53

        +Thanks to the QGIS Cloud team for hosting this example.</p>">52

         library(RPostgreSQL)
         conn = dbConnect(drv = PostgreSQL(), dbname = "rtafdf_zljbqm",
        @@ -698,7 +698,7 @@ 

        buf = st_read(conn, query = query)

        Note that this was a spatial query using functions (ST_Union(), ST_Buffer()) that you should already be familiar with, since you also find them in the sf package, though here they are written in lowercase (st_union(), st_buffer()). In fact, function names of the sf package largely follow the PostGIS naming conventions.54 +The prefix <code>st</code> stands for space/time.</p>">53 The last query will find all Hardee restaurants (HDE) within the buffer zone (Figure 9.4).

         query = paste(
        @@ -716,7 +716,7 @@ 

        hardees = st_read(conn, query = query)

        Please refer to Obe and Hsu (2015) for a detailed explanation of the spatial SQL query. Finally, it is good practice to close the database connection as follows:55

        +It is important to close the connection here because QGIS Cloud (free version) allows only ten concurrent connections.</p>">54

         RPostgreSQL::postgresqlCloseConnection(conn)
        #> old-style crs object detected; please recreate object with a recent sf::st_crs()
        diff --git a/location.html b/location.html
        index 6522c4f83..8e1e5b69b 100644
        --- a/location.html
        +++ b/location.html
        @@ -276,7 +276,7 @@ 

        Raster cells are assumed to have a population of 127 if they have a value of 1 (cells in ‘class 1’ contain between 3 and 250 inhabitants) and 375 if they have a value of 2 (containing 250 to 500 inhabitants), and so on (see Table 13.1). A cell value of 8000 inhabitants was chosen for ‘class 6’ because these cells contain more than 8000 people. Of course, these are approximations of the true population, not precise values.79 +The potential error introduced during this reclassification stage will be explored in the exercises.</p>">78 However, the level of detail is sufficient to delineate metropolitan areas (see next section).

        In contrast to the pop variable, representing absolute estimates of the total population, the remaining variables were re-classified as weights corresponding with weights used in the survey. Class 1 in the variable women, for instance, represents areas in which 0 to 40% of the population is female; @@ -435,7 +435,7 @@

      • osmdata_sf(), which converts the OSM data into spatial objects (of class sf).
      • -while(), which tries repeatedly (three times in this case) to download the data if it fails the first time.80 +while(), which tries repeatedly (three times in this case) to download the data if it fails the first time.79 Before running this code, please note that it will download almost 2 GB of data. To save time and resources, we have put the output named shops into spDataLarge. To make it available in your environment, ensure that the spDataLarge package is loaded, or run data("shops", package = "spDataLarge").
      • diff --git a/read-write.html b/read-write.html index 035e0d893..86a6f5c6a 100644 --- a/read-write.html +++ b/read-write.html @@ -265,7 +265,7 @@

        Sometimes, packages come with built-in datasets. These can be accessed in four ways: by attaching the package (if the package uses ‘lazy loading’ as spData does), with data(dataset, package = mypackage), by referring to the dataset with mypackage::dataset, or with system.file(filepath, package = mypackage) to access raw data files. The following code chunk illustrates the latter two options using the world dataset (already loaded by attaching its parent package with library(spData)):34

        +For more information on data import with R packages, see Sections 5.5 and 5.6 of <span class="citation"><a href="references.html#ref-gillespie_efficient_2016" role="doc-biblioref">Gillespie and Lovelace</a> (<a href="references.html#ref-gillespie_efficient_2016" role="doc-biblioref">2016</a>)</span>.</p>'>33

         world2 = spData::world
         world3 = read_sf(system.file("shapes/world.gpkg", package = "spData"))
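The same two access patterns work for any installed package. The sketch below uses base packages so it runs without **spData** installed (the dataset and file names therefore differ from the book's example):

```r
# mypackage::dataset pattern, here with a base R dataset
head(datasets::mtcars, 2)

# system.file() pattern: locate a raw file shipped with a package;
# every installed package has a DESCRIPTION file at its top level
f = system.file("DESCRIPTION", package = "stats")
file.exists(f)  # TRUE
```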
        @@ -322,7 +322,7 @@

        Packages ows4R, rwfs and sos4R have been developed for working with OWS services in general, WFS and the sensor observation service (SOS) respectively. As of October 2018, only ows4R is on CRAN. The package’s basic functionality is demonstrated below, in commands that get all FAO_AREAS as we did in the previous code chunk:35

+To filter features on the server before downloading them, the argument <code>cql_filter</code> can be used. Adding <code>cql_filter = URLencode("F_CODE= '27'")</code> to the command, for example, would instruct the server to return only the features whose values in the <code>F_CODE</code> column equal 27.</p>">34

         library(ows4R)
         wfs = WFSClient$new("http://www.fao.org/figis/geoserver/wfs",
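The `getFeatures()` step is cut off by the hunk boundary above; the following is a minimal sketch of the server-side filtering described in the `cql_filter` footnote. The layer name `area:FAO_AREAS` and the service version are assumptions, and the FAO endpoint must be reachable for this to run:

```r
# Sketch only: layer name and serviceVersion are assumptions
library(ows4R)
wfs = WFSClient$new("http://www.fao.org/figis/geoserver/wfs",
                    serviceVersion = "2.0.0")
# Ask the server to return only features whose F_CODE equals 27,
# instead of downloading all FAO_AREAS and filtering locally
fao_27 = wfs$getFeatures("area:FAO_AREAS",
                         cql_filter = URLencode("F_CODE= '27'"))
```

Filtering on the server is worthwhile whenever the full layer is large relative to the subset of interest.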
        @@ -341,7 +341,7 @@ 

        Today the variety of file formats may seem bewildering but there has been much consolidation and standardization since the beginnings of GIS software in the 1960s when the first widely distributed program (SYMAP) for spatial analysis was created at Harvard University (Coppock and Rhind 1991).

        GDAL (which should be pronounced “goo-dal,” with the double “o” making a reference to object-orientation), the Geospatial Data Abstraction Library, has resolved many issues associated with incompatibility between geographic file formats since its release in 2000. -GDAL provides a unified and high-performance interface for reading and writing of many raster and vector data formats.36 +GDAL provides a unified and high-performance interface for reading and writing of many raster and vector data formats.35 Many open and proprietary GIS programs, including GRASS, ArcGIS and QGIS, use GDAL behind their GUIs for doing the legwork of ingesting and spitting out geographic data in appropriate formats.

        GDAL provides access to more than 200 vector and raster data formats. Table 7.2 presents some basic information about selected and often used spatial file formats.
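One way to inspect which of those formats a given installation supports from R is `sf::st_drivers()`. The exact set of drivers, and whether each is writable, depends on how GDAL was built, so the output varies between machines:

```r
library(sf)
drivers = st_drivers()                     # vector drivers known to GDAL
head(drivers[, c("name", "long_name", "write")])
# Raster drivers can be listed analogously:
st_drivers(what = "raster")
```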

        @@ -548,7 +548,7 @@

It was developed in the early 1990s and has a number of limitations. First of all, it is a multi-file format, which consists of at least three files. It only supports 255 columns, column names are restricted to ten characters and the file size limit is 2 GB. -Furthermore, ESRI Shapefile does not support all possible geometry types, for example, it is unable to distinguish between a polygon and a multipolygon.37 +Furthermore, ESRI Shapefile does not support all possible geometry types, for example, it is unable to distinguish between a polygon and a multipolygon.36 Despite these limitations, a viable alternative had been missing for a long time. In the meantime, GeoPackage emerged, and seems to be a more than suitable replacement candidate for ESRI Shapefile. GeoPackage is a format for exchanging geospatial information and an OGC standard. @@ -768,7 +768,7 @@

        Naturally, some options are specific to certain drivers.38 +A list of supported vector formats and options can be found at <a href="http://gdal.org/ogr_formats.html" class="uri">http://gdal.org/ogr_formats.html</a>.</p>'>37 For example, think of coordinates stored in a spreadsheet format (.csv). To read in such files as spatial objects, we naturally have to specify the names of the columns (X and Y in our example below) representing the coordinates. We can do this with the help of the options parameter. @@ -894,7 +894,7 @@
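A sketch of this approach: the GDAL CSV driver's open options `X_POSSIBLE_NAMES` and `Y_POSSIBLE_NAMES` name the coordinate columns. The file path below is hypothetical:

```r
library(sf)
# points.csv is a hypothetical file with columns X, Y and some attributes
pts = read_sf("points.csv",
              options = c("X_POSSIBLE_NAMES=X", "Y_POSSIBLE_NAMES=Y"))
```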

        The function expects input regarding output data type and file format, but also accepts GDAL options specific to a selected file format (see ?writeRaster for more details).

        The terra package offers nine data types when saving a raster: LOG1S, INT1S, INT1U, INT2S, INT2U, INT4S, INT4U, FLT4S, and FLT8S.39 +Using INT4U is not recommended as R does not support 32-bit unsigned integers.</p>">38 The data type determines the bit representation of the raster object written to disk (Table 7.4). Which data type to use depends on the range of the values of your raster object. The more values a data type can represent, the larger the file will get on disk. diff --git a/reference-keys.txt b/reference-keys.txt index 9a286f98b..aa0ed6015 100644 --- a/reference-keys.txt +++ b/reference-keys.txt @@ -150,7 +150,7 @@ sfg sfc sf raster-data -r-packages-for-raster-data-handling +r-packages-for-working-with-raster-data an-introduction-to-terra basic-map-raster raster-classes diff --git a/references.html b/references.html index 140f5e476..305811ce4 100644 --- a/references.html +++ b/references.html @@ -747,7 +747,6 @@
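As a sketch of the data-type choice described above: values in the 0–255 range fit into a single unsigned byte (`INT1U`), keeping the file on disk small. The raster and file name below are illustrative:

```r
library(terra)
r = rast(nrows = 10, ncols = 10,
         vals = sample(0:255, 100, replace = TRUE))
# One byte per cell is enough for values between 0 and 255
writeRaster(r, "r_byte.tif", datatype = "INT1U", overwrite = TRUE)
```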

        References diff --git a/reproj-geo-data.html b/reproj-geo-data.html index 744eec771..4fff5353a 100644 --- a/reproj-geo-data.html +++ b/reproj-geo-data.html @@ -132,7 +132,7 @@

        This shows that unless a CRS is manually specified or is loaded from a source that has CRS metadata, the CRS is NA. A CRS can be added to sf objects with st_set_crs() as follows:29

        +The same argument can also be used to set the CRS when creating raster datasets (e.g., <code>rast(crs = "EPSG:4326")</code>).</p>'>28

         london_geo = st_set_crs(london, "EPSG:4326")
         st_is_longlat(london_geo)
        @@ -211,7 +211,7 @@ 

The difference in location between the two points is not due to imperfections in the transforming operation (which is in fact very accurate) but to the low precision of the manually-created coordinates that created <code>london</code> and <code>london_proj</code>. It may also be surprising that the result is provided in a matrix with units of meters. This is because <code>st_distance()</code> can provide distances between many features and because the CRS has units of meters. -Use <code>as.numeric()</code> to coerce the result into a regular number.</p>">30

        +Use <code>as.numeric()</code> to coerce the result into a regular number.</p>">29

         st_distance(london2, london_proj)
         #> Units: [m]
        @@ -250,7 +250,7 @@ 

        When deciding on a custom CRS, we recommend the following:31

        +Many thanks to an anonymous reviewer whose comments formed the basis of this advice.</p>'>30

        @@ -454,7 +454,7 @@

        crs(con_raster) #> [1] "GEOGCRS[\"WGS 84\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n CS[ellipsoidal,2],\n AXIS[\"geodetic latitude (Lat)\",north,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n AXIS[\"geodetic longitude (Lon)\",east,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n ID[\"EPSG\",4326]]"

        We will reproject this dataset into a projected CRS, but not with the nearest neighbor method which is appropriate for categorical data. -Instead, we will use the bilinear method which computes the output cell value based on the four nearest cells in the original raster.32 +Instead, we will use the bilinear method which computes the output cell value based on the four nearest cells in the original raster.31 The values in the projected dataset are the distance-weighted average of the values from these four cells: the closer the input cell is to the center of the output cell, the greater its weight. The following commands create a text string representing WGS 84 / UTM zone 12N, and reproject the raster into this CRS, using the bilinear method:
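A sketch of such a reprojection with terra ("EPSG:32612" denotes WGS 84 / UTM zone 12N; `con_raster` is the object from the surrounding text, so this assumes it has been loaded as shown earlier):

```r
library(terra)
# Reproject the continuous raster with bilinear resampling;
# "EPSG:32612" is WGS 84 / UTM zone 12N
con_raster_utm = project(con_raster, "EPSG:32612", method = "bilinear")
crs(con_raster_utm)
```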

        @@ -466,7 +466,7 @@

These changes are demonstrated in Table 6.2³³:

        +This can have implications for file sizes when raster datasets are saved.</p>'>32:

        diff --git a/search.json b/search.json index fa718c718..1f581cd1f 100644 --- a/search.json +++ b/search.json @@ -1 +1 @@ -[{"path":"index.html","id":"welcome","chapter":"Welcome","heading":"Welcome","text":"online home Geocomputation R, book geographic data analysis, visualization modeling.Note: first edition book published CRC Press R Series.\ncan buy book CRC Press, Amazon, see archived First Edition hosted bookdown.org.Inspired bookdown Free Open Source Software Geospatial (FOSS4G) movement, code prose underlying book open source, ensuring ’s reproducible, accessible modifiable (e.g. case find inevitable typo) benefit people worldwide.\nonline version book hosted geocompr.robinlovelace.net kept --date GitHub Actions.\ncurrent ‘build status’ follows:version book built GH Actions 2021-12-14.work licensed Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.","code":""},{"path":"index.html","id":"how-to-contribute","chapter":"Welcome","heading":"How to contribute?","text":"bookdown makes editing book easy editing wiki, provided GitHub account (sign-github.com).\nlogged-GitHub, click ‘Edit page’ icon right panel book website.\ntake editable version source R Markdown file generated page ’re .raise issue book’s content (e.g. 
code running) make feature request, check-issue tracker.Maintainers contributors must follow repository’s CODE CONDUCT.","code":""},{"path":"index.html","id":"reproducibility","chapter":"Welcome","heading":"Reproducibility","text":"quickest way reproduce contents book ’re new geographic data R may web browser, thanks Binder.\nClicking link open new window containing RStudio Server web browser, enabling open chapter files running code chunks test code reproducible.see something like image , congratulations, ’s worked can start exploring Geocomputation R cloud-based environment (aware mybinder.org user guidelines):\nFIGURE 0.1: Screenshot reproducible code contained Geocomputation R running RStudio Server browser served Binder\nreproduce code book computer, need recent version R --date packages.\ncan installed using remotes package.installing book’s dependencies, able reproduce code chunks book’s chapters.\nclone book’s repo navigate geocompr folder, able reproduce contents following command:See project’s GitHub repo details reproducing book.","code":"\ninstall.packages(\"remotes\")\nremotes::install_github(\"geocompr/geocompkg\")\nremotes::install_github(\"nowosad/spData\")\nremotes::install_github(\"nowosad/spDataLarge\")\n\n# During development work on the 2nd edition you may also need dev versions of\n# other packages to build the book, e.g.,:\nremotes::install_github(\"rspatial/terra\")\nremotes::install_github(\"mtennekes/tmap\")\nbookdown::serve_book()"},{"path":"index.html","id":"supporting-the-project","chapter":"Welcome","heading":"Supporting the project","text":"find book useful, please support :Telling people personCommunicating book digital media, e.g., via #geocompr hashtag Twitter (see Guestbook geocompr.github.io) letting us know courses using bookCiting linking-‘Starring’ geocompr GitHub repositoryReviewing , e.g., Amazon GoodreadsAsking questions making suggestion content via GitHub Twitter.Buying copyFurther details can found 
github.com/Robinlovelace/geocompr.globe icon used book created Jean-Marc Viglino licensed CC-4.0 International.","code":""},{"path":"foreword-1st-edition.html","id":"foreword-1st-edition","chapter":"Foreword (1st Edition)","heading":"Foreword (1st Edition)","text":"‘spatial’ R always broad, seeking provide integrate tools geography, geoinformatics, geocomputation spatial statistics anyone interested joining : joining asking interesting questions, contributing fruitful research questions, writing improving code.\n, ‘spatial’ R always included open source code, open data reproducibility.‘spatial’ R also sought open interaction many branches applied spatial data analysis, also implement new advances data representation methods analysis expose cross-disciplinary scrutiny.\nbook demonstrates, often alternative workflows similar data similar results, may learn comparisons others create understand workflows.\nincludes learning similar communities around Open Source GIS complementary languages Python, Java .R’s wide range spatial capabilities never evolved without people willing share creating adapting.\nmight include teaching materials, software, research practices (reproducible research, open data), combinations .\nR users also benefitted greatly ‘upstream’ open source geo libraries GDAL, GEOS PROJ.book clear example , curious willing join , can find things need match aptitudes.\nadvances data representation workflow alternatives, ever increasing numbers new users often without applied quantitative command-line exposure, book kind really needed.\nDespite effort involved, authors supported pressing forward publication., fresh book ready go; authors tried many tutorials workshops, readers instructors able benefit knowing contents continue tried people like .\nEngage authors wider R-spatial community, see value choice building workflows important, enjoy applying learn things care .Roger BivandBergen, September 
2018","code":""},{"path":"preface.html","id":"preface","chapter":"Preface","heading":"Preface","text":"","code":""},{"path":"preface.html","id":"who-this-book-is-for","chapter":"Preface","heading":"Who this book is for","text":"book people want analyze, visualize model geographic data open source software.\nbased R, statistical programming language powerful data processing, visualization geospatial capabilities.\nbook covers wide range topics interest wide range people many different backgrounds, especially:People learned spatial analysis skills using desktop Geographic Information System (GIS), QGIS, ArcGIS, GRASS SAGA, want access powerful (geo)statistical visualization programming language benefits command-line approach (Sherman 2008):\n\nadvent ‘modern’ GIS software, people want point click way life. ’s good, tremendous amount flexibility power waiting command line.\nPeople learned spatial analysis skills using desktop Geographic Information System (GIS), QGIS, ArcGIS, GRASS SAGA, want access powerful (geo)statistical visualization programming language benefits command-line approach (Sherman 2008):advent ‘modern’ GIS software, people want point click way life. 
’s good, tremendous amount flexibility power waiting command line.Graduate students researchers fields specializing geographic data including Geography, Remote Sensing, Planning, GIS Geographic Data ScienceGraduate students researchers fields specializing geographic data including Geography, Remote Sensing, Planning, GIS Geographic Data ScienceAcademics post-graduate students working geographic data — fields Geology, Regional Science, Biology Ecology, Agricultural Sciences, Archaeology, Epidemiology, Transport Modeling, broadly defined Data Science — require power flexibility R researchAcademics post-graduate students working geographic data — fields Geology, Regional Science, Biology Ecology, Agricultural Sciences, Archaeology, Epidemiology, Transport Modeling, broadly defined Data Science — require power flexibility R researchApplied researchers analysts public, private third-sector organizations need reproducibility, speed flexibility command-line language R applications dealing spatial data diverse Urban Transport Planning, Logistics, Geo-marketing (store location analysis) Emergency PlanningApplied researchers analysts public, private third-sector organizations need reproducibility, speed flexibility command-line language R applications dealing spatial data diverse Urban Transport Planning, Logistics, Geo-marketing (store location analysis) Emergency PlanningThe book designed intermediate--advanced R users interested geocomputation R beginners prior experience geographic data.\nnew R geographic data, discouraged: provide links materials describe nature spatial data beginner’s perspective Chapter 2 links provided .","code":""},{"path":"preface.html","id":"how-to-read-this-book","chapter":"Preface","heading":"How to read this book","text":"book divided three parts:Part : Foundations, aimed getting --speed geographic data R.Part II: Extensions, covers advanced techniques.Part III: Applications, real-world problems.chapters get progressively harder recommend 
reading book order.\nmajor barrier geographical analysis R steep learning curve.\nchapters Part aim address providing reproducible code simple datasets ease process getting started.important aspect book teaching/learning perspective exercises end chapter.\nCompleting develop skills equip confidence needed tackle range geospatial problems.\nSolutions exercises, number extended examples, provided book’s supporting website, geocompr.github.io.Impatient readers welcome dive straight practical examples, starting Chapter 2.\nHowever, recommend reading wider context Geocomputation R Chapter 1 first.\nnew R, also recommend learning language attempting run code chunks provided chapter (unless ’re reading book understanding concepts).\nFortunately R beginners R supportive community developed wealth resources can help.\nparticularly recommend three tutorials: R Data Science (Grolemund Wickham 2016) Efficient R Programming (Gillespie Lovelace 2016), especially Chapter 2 (installing setting-R/RStudio) Chapter 10 (learning learn), introduction R (R Core Team, Smith, Team 2021).","code":""},{"path":"preface.html","id":"why-r","chapter":"Preface","heading":"Why R?","text":"Although R steep learning curve, command-line approach advocated book can quickly pay .\n’ll learn subsequent chapters, R effective tool tackling wide range geographic data challenges.\nexpect , practice, R become program choice geospatial toolbox many applications.\nTyping executing commands command-line , many cases, faster pointing--clicking around graphical user interface (GUI) desktop GIS.\napplications Spatial Statistics modeling R may realistic way get work done.outlined Section 1.2, many reasons using R geocomputation:\nR well-suited interactive use required many geographic data analysis workflows compared languages.\nR excels rapidly growing fields Data Science (includes data carpentry, statistical learning techniques data visualization) Big Data (via efficient interfaces databases distributed computing 
systems).\nFurthermore R enables reproducible workflow: sharing scripts underlying analysis allow others build-work.\nensure reproducibility book made source code available github.com/Robinlovelace/geocompr.\nfind script files code/ folder generate figures:\ncode generating figure provided main text book, name script file generated provided caption (see example caption Figure 12.2).languages Python, Java C++ can used geocomputation excellent resources learning geocomputation without R, discussed Section 1.3.\nNone provide unique combination package ecosystem, statistical capabilities, visualization options, powerful IDEs offered R community.\nFurthermore, teaching use one language (R) depth, book equip concepts confidence needed geocomputation languages.","code":""},{"path":"preface.html","id":"real-world-impact","chapter":"Preface","heading":"Real-world impact","text":"Geocomputation R equip knowledge skills tackle wide range issues, including scientific, societal environmental implications, manifested geographic data.\ndescribed Section 1.1, geocomputation using computers process geographic data:\nalso real-world impact.\ninterested wider context motivations behind book, read ; covered Chapter 1.","code":""},{"path":"preface.html","id":"acknowledgements","chapter":"Preface","heading":"Acknowledgements","text":"Many thanks everyone contributed directly indirectly via code hosting collaboration site GitHub, including following people contributed direct via pull requests: prosoitos, florisvdh, katygregg, rsbivand, KiranmayiV, zmbc, erstearns, MikeJohnPage, eyesofbambi, nickbearman, tyluRp, marcosci, giocomai, KHwong12, LaurieLBaker, MarHer90, mdsumner, pat-s, gisma, ateucher, annakrystalli, DarrellCarvalho, kant, gavinsimpson, Himanshuteli, yutannihilation, jbixon13, olyerickson, yvkschaefer, katiejolly, layik, mpaulacaldas, mtennekes, mvl22, ganes1410, richfitz, wdearden, yihui, chihinl, cshancock, gregor-d, jasongrahn, p-kono, pokyah, schuetzingit, sdesabbata, 
tim-salabim, tszberkowitz.\nSpecial thanks Marco Sciaini, created front cover image, also published code generated (see code/frontcover.R book’s GitHub repo).\nDozens people contributed online, raising commenting issues, providing feedback via social media.\n#geocompr hashtag live !like thank John Kimmel CRC Press, worked us two years take ideas early book plan production via four rounds peer review.\nreviewers deserve special mention : detailed feedback expertise substantially improved book’s structure content.thank Patrick Schratz Alexander Brenning University Jena fruitful discussions input Chapters 11 14.\nthank Emmanuel Blondel Food Agriculture Organization United Nations expert input section web services;\nMichael Sumner critical input many areas book, especially discussion algorithms Chapter 10;\nTim Appelhans David Cooley key contributions visualization chapter (Chapter 8);\nKaty Gregg, proofread every chapter greatly improved readability book.Countless others mentioned contributed myriad ways.\nfinal thank software developers make geocomputation R possible.\nEdzer Pebesma (created sf package), Robert Hijmans (created raster) Roger Bivand (laid foundations much R-spatial software) made high performance geographic computing possible R.","code":""},{"path":"intro.html","id":"intro","chapter":"1 Introduction","heading":"1 Introduction","text":"book using power computers things geographic data.\nteaches range spatial skills, including: reading, writing manipulating geographic data; making static interactive maps; applying geocomputation solve real-world problems; modeling geographic phenomena.\ndemonstrating various geographic operations can linked, reproducible ‘code chunks’ intersperse prose, book also teaches transparent thus scientific workflow.\nLearning use wealth geospatial tools available R command line can exciting, creating new ones can truly liberating.\nUsing command-line driven approach taught throughout, programming techniques covered Chapter 10, 
can help remove constraints creativity imposed software.\nreading book completing exercises, therefore feel empowered strong understanding possibilities opened R’s impressive geographic capabilities, new skills solve real-world problems geographic data, ability communicate work maps reproducible code.last decades free open source software geospatial (FOSS4G) progressed astonishing rate.\nThanks organizations OSGeo, geographic data analysis longer preserve expensive hardware software: anyone can now download run high-performance spatial libraries.\nOpen source Geographic Information Systems (GIS), QGIS, made geographic analysis accessible worldwide.\nGIS programs tend emphasize graphical user interfaces (GUIs), unintended consequence discouraging reproducibility (although many can used command line ’ll see Chapter 9).\nR, contrast, emphasizes command line interface (CLI).\nsimplistic comparison different approaches illustrated Table 1.1.TABLE 1.1: Differences emphasis software packages (Graphical User Interface (GUI) Geographic Information Systems (GIS) R).book motivated importance reproducibility scientific research (see note ).\naims make reproducible geographic data analysis workflows accessible, demonstrate power open geospatial software available command-line.\n“Interfaces software part R” (Eddelbuettel Balamuta 2018).\nmeans addition outstanding ‘house’ capabilities, R allows access many spatial software libraries, explained Section 1.2 demonstrated Chapter 9.\ngoing details software, however, worth taking step back thinking mean geocomputation.Reproducibility major advantage command-line interfaces, mean practice?\ndefine follows: “process results can generated others using publicly accessible code.”","code":""},{"path":"intro.html","id":"what-is-geocomputation","chapter":"1 Introduction","heading":"1.1 What is geocomputation?","text":"Geocomputation young term, dating back first conference subject 1996.1\ndistinguished geocomputation (time) commonly used 
term ‘quantitative geography,’ early advocates proposed, emphasis “creative experimental” applications (P. . Longley et al. 1998) development new tools methods (Openshaw Abrahart 2000):\n“GeoComputation using various different types geodata developing relevant geo-tools within overall context ‘scientific’ approach.”\nbook aims go beyond teaching methods code; end able use geocomputational skills, “practical work beneficial useful” (Openshaw Abrahart 2000).approach differs early adopters Stan Openshaw, however, emphasis reproducibility collaboration.\nturn 21st Century, unrealistic expect readers able reproduce code examples, due barriers preventing access necessary hardware, software data.\nFast-forward two decades things progressed rapidly.\nAnyone access laptop ~4GB RAM can realistically expect able install run software geocomputation publicly accessible datasets, widely available ever (see Chapter 7).2\nUnlike early works field, work presented book reproducible using code example data supplied alongside book, R packages spData, installation covered Chapter 2.Geocomputation closely related terms including: Geographic Information Science (GIScience); Geomatics; Geoinformatics; Spatial Information Science; Geoinformation Engineering (P. 
Longley 2015); Geographic Data Science (GDS).\nterm shares emphasis ‘scientific’ (implying reproducible falsifiable) approach influenced GIS, although origins main fields application differ.\nGDS, example, emphasizes ‘data science’ skills large datasets, Geoinformatics tends focus data structures.\noverlaps terms larger differences use geocomputation rough synonym encapsulating :\nseek use geographic data applied scientific work.\nUnlike early users term, however, seek imply cohesive academic field called ‘Geocomputation’ (‘GeoComputation’ Stan Openshaw called ).\nInstead, define term follows: working geographic data computational way, focusing code, reproducibility modularity.Geocomputation recent term influenced old ideas.\ncan seen part Geography, 2000+ year history (Talbert 2014);\nextension Geographic Information Systems (GIS) (Neteler Mitasova 2008), emerged 1960s (Coppock Rhind 1991).Geography played important role explaining influencing humanity’s relationship natural world long invention computer, however.\nAlexander von Humboldt’s travels South America early 1800s illustrates role:\nresulting observations lay foundations traditions physical plant geography, also paved way towards policies protect natural world (Wulf 2015).\nbook aims contribute ‘Geographic Tradition’ (Livingstone 1992) harnessing power modern computers open source software.book’s links older disciplines reflected suggested titles book: Geography R R GIS.\nadvantages.\nformer conveys message comprises much just spatial data:\nnon-spatial attribute data inevitably interwoven geometry data, Geography something map.\nlatter communicates book using R GIS, perform spatial operations geographic data (R. 
Bivand, Pebesma, Gómez-Rubio 2013).\nHowever, term GIS conveys connotations (see Table 1.1) simply fail communicate one R’s greatest strengths:\nconsole-based ability seamlessly switch geographic non-geographic data processing, modeling visualization tasks.\ncontrast, term geocomputation implies reproducible creative programming.\ncourse, (geocomputational) algorithms powerful tools can become highly complex.\nHowever, algorithms composed smaller parts.\nteaching foundations underlying structure, aim empower create innovative solutions geographic data problems.","code":""},{"path":"intro.html","id":"why-use-r-for-geocomputation","chapter":"1 Introduction","heading":"1.2 Why use R for geocomputation?","text":"Early geographers used variety tools including barometers, compasses sextants advance knowledge world (Wulf 2015).\ninvention marine chronometer 1761 became possible calculate longitude sea, enabling ships take direct routes.Nowadays lack geographic data hard imagine.\nEvery smartphone global positioning (GPS) receiver multitude sensors devices ranging satellites semi-autonomous vehicles citizen scientists incessantly measure every part world.\nrate data produced overwhelming.\nautonomous vehicle, example, can generate 100 GB data per day (Economist 2016).\nRemote sensing data satellites become large analyze corresponding data single computer, leading initiatives OpenEO.‘geodata revolution’ drives demand high performance computer hardware efficient, scalable software handle extract signal noise, understand perhaps change world.\nSpatial databases enable storage generation manageable subsets vast geographic data stores, making interfaces gaining knowledge vital tools future.\nR one tool, advanced analysis, modeling visualization capabilities.\ncontext focus book language (see Wickham 2019).\nInstead use R ‘tool trade’ understanding world, similar Humboldt’s use tools gain deep understanding nature complexity interconnections (see Wulf 2015).\nAlthough 
programming can seem like reductionist activity, aim teach geocomputation R fun, understanding world.R multi-platform, open source language environment statistical computing graphics (r-project.org/).\nwide range packages, R also supports advanced geospatial statistics, modeling visualization.\n\nNew integrated development environments (IDEs) RStudio made R user-friendly many, easing map making panel dedicated interactive visualization.core, R object-oriented, functional programming language (Wickham 2019), specifically designed interactive interface software (Chambers 2016).\nlatter also includes many ‘bridges’ treasure trove GIS software, ‘geolibraries’ functions (see Chapter 9).\nthus ideal quickly creating ‘geo-tools,’ without needing master lower level languages (compared R) C, FORTRAN Java (see Section 1.3).\n\ncan feel like breaking free metaphorical ‘glass ceiling’ imposed GUI-based proprietary geographic information systems (see Table 1.1 definition GUI).\nFurthermore, R facilitates access languages:\npackages Rcpp reticulate enable access C++ Python code, example.\nmeans R can used ‘bridge’ wide range geospatial programs (see Section 1.3).Another example showing R’s flexibility evolving geographic capabilities interactive map making.\n’ll see Chapter 8, statement R “limited interactive [plotting] facilities” (R. Bivand, Pebesma, Gómez-Rubio 2013) longer true.\ndemonstrated following code chunk, creates Figure 1.1 (functions generate plot covered Section 8.4).\nFIGURE 1.1: blue markers indicate authors . basemap tiled image Earth night provided NASA. 
Interact with the online version at geocompr.robinlovelace.net, for example by zooming in and clicking on the popups.\nIt would have been difficult to produce Figure 1.1 using R a few years ago, let alone as an interactive map.\nThis illustrates R’s flexibility and how, thanks to developments such as knitr and leaflet, it can be used as an interface to other software, a theme that will recur throughout this book.\nThe use of R code, therefore, enables teaching geocomputation with reference to reproducible examples such as that provided in Figure 1.1 rather than abstract concepts.","code":"\nlibrary(leaflet)\npopup = c(\"Robin\", \"Jakub\", \"Jannes\")\nleaflet() %>%\n addProviderTiles(\"NASAGIBS.ViirsEarthAtNight2012\") %>%\n addMarkers(lng = c(-3, 23, 11),\n lat = c(52, 53, 49), \n popup = popup)"},{"path":"intro.html","id":"software-for-geocomputation","chapter":"1 Introduction","heading":"1.3 Software for geocomputation","text":"R is a powerful language for geocomputation, with many options for geographic data analysis and thousands of geographic functions.\nAwareness of other languages for geocomputation will help you decide when a different tool may be more appropriate for a specific task, and place R in the wider geospatial ecosystem.\nThis section briefly introduces the languages C++, Java and Python for geocomputation, in preparation for Chapter 9. An important feature of R (and Python) is that it is an interpreted language.\nThis is advantageous because it enables interactive programming via a Read–Eval–Print Loop (REPL):\ncode entered into the console is immediately executed and the result is printed, rather than waiting for the intermediate stage of compilation.\nOn the other hand, compiled languages such as C++ and Java tend to run faster (once they have been compiled). C++ provides the basis for many GIS packages such as QGIS, GRASS and SAGA, so it is a sensible starting point.\nWell-written C++ is very fast, making it a good choice for performance-critical applications such as processing large geographic datasets, but it is harder to learn than Python or R.\nC++ has become more accessible with the Rcpp package, which provides a good ‘way in’ to C programming for R users.\nProficiency with such low-level languages opens the possibility of creating new, high-performance ‘geoalgorithms’ and a better understanding of how GIS software works (see Chapter 10). Java is another important and versatile language for geocomputation.\nThe GIS packages gvSig, OpenJump and uDig are all written in Java.\nThere are many GIS libraries written in Java, including GeoTools and JTS, the Java Topology Suite (GEOS is a C++ port of JTS).\nFurthermore, many map server applications use Java, including Geoserver/Geonode, deegree and 52°North WPS. Java’s object-oriented syntax is similar to that of C++.\nA major advantage of Java is that it is platform-independent (which is unusual for a compiled language) and highly scalable, making it a suitable language for IDEs such as RStudio, with which this book was written.\nJava has fewer tools for statistical modeling and visualization than Python or R, although it can be used for data science (Brzustowicz 2017). Python is an important language for geocomputation, especially because many Desktop GIS such as GRASS, SAGA and QGIS provide a Python API (see Chapter 9).\nLike R, it is a popular tool for data science.\nBoth languages are object-oriented and have many areas of overlap, leading to initiatives such as the reticulate package, which facilitates access to Python from R, and the Ursa Labs initiative to support portable libraries to the benefit of the entire open source data science ecosystem. In practice both R and Python have their strengths, and to some extent which you use is less important than the domain of application and the communication of results.\nLearning either will provide a head-start in learning the other.\nHowever, there are major advantages of R over Python for geocomputation.\nThis includes its much better support for the geographic data models vector and raster in the language itself (see Chapter 2) and the corresponding visualization possibilities (see Chapters 2 and 8).\nEqually important, R has unparalleled support for statistics, including spatial statistics, with hundreds of packages (unmatched by Python) supporting thousands of statistical methods. The major advantage of Python is that it is a general-purpose programming language.\nIt is used in many domains, including desktop software, computer games, websites and data science.\nPython is often the only shared language between different (geocomputation) communities and can be seen as the ‘glue’ that holds many GIS programs together.\nMany geoalgorithms, including those in QGIS and ArcMap, can be accessed from the Python command line, making it well-suited as a starter language for command-line GIS.3 For spatial statistics and predictive modeling, however, R is second-to-none.\nThis does not mean you must choose either R or Python: Python supports most common statistical techniques (though R tends to support new developments in spatial statistics earlier) and many concepts learned from Python can be applied to the R world.\n\n\nLike R, Python also supports geographic data analysis and manipulation with packages such as osgeo, Shapely, NumPy and PyGeoProcessing (Garrard 2016).","code":""},{"path":"intro.html","id":"r-ecosystem","chapter":"1 Introduction","heading":"1.4 R’s spatial ecosystem","text":"There are many ways to handle geographic data in R, with dozens of packages in this area.4\nIn this book we endeavor to teach the state of the art in the field whilst ensuring that the methods are future-proof.\nLike many areas of software development, R’s spatial ecosystem is rapidly evolving (Figure 1.2).\nBecause R is open source, these developments can easily build on previous work, by ‘standing on the shoulders of giants,’ as Isaac Newton put it in 1675.\nThis approach is advantageous because it encourages collaboration and avoids ‘reinventing the wheel.’\nThe package sf (covered in Chapter 2), for example, builds on its predecessor sp. A surge in development time (and interest) in ‘R-spatial’ followed the award of a grant by the R Consortium for the development of support for Simple Features, an open-source standard and model to store and access vector geometries.\nThis resulted in the sf package (covered in Section 2.2.1).\nMultiple places reflect the immense interest in sf.\nThis is especially true for the R-sig-Geo Archives, a long-standing open access email list containing much R-spatial wisdom accumulated over the years.\nFIGURE 1.2: Downloads of selected R packages for working with geographic data. 
The y-axis shows the average number of downloads per day, within a 91-day rolling window.\nIt is noteworthy that shifts in the wider R community, as exemplified by the data processing package dplyr (released in 2014), have influenced shifts in R’s spatial ecosystem.\nAlongside other packages that share a style and emphasis on ‘tidy data’ (including, e.g., ggplot2), dplyr was placed in the tidyverse ‘metapackage’ in late 2016.\n\n\nThe tidyverse approach, with its focus on long-form data and fast, intuitively named functions, has become immensely popular.\nThis has led to a demand for ‘tidy geographic data,’ which has been partly met by sf.\nAn obvious feature of the tidyverse is the tendency for packages to work in harmony.\n\n\nThere is no equivalent geoverse, but there are attempts at harmonization between packages hosted in the r-spatial organization, and a growing number of packages use sf (Table 1.2). TABLE 1.2: The top 5 downloaded packages that depend on sf, in terms of the average number of downloads per day over the previous month. As of 2021-11-19 there are 289 packages which import sf. A parallel group of developments relates to the rspatial set of packages.5\nIts main member is the terra package for spatial raster handling (see Section 2.3.2).","code":""},{"path":"intro.html","id":"the-history-of-r-spatial","chapter":"1 Introduction","heading":"1.5 The history of R-spatial","text":"There are many benefits of using recent spatial packages such as sf, but it is also important to be aware of the history of R’s spatial capabilities: many functions, use-cases and teaching materials are contained in older packages.\nThese can still be useful today, provided you know where to look.\n\nR’s spatial capabilities originated in early spatial packages in the S language (R. Bivand and Gebhardt 2000).\n\nThe 1990s saw the development of numerous S scripts and a handful of packages for spatial statistics.\nR packages arose from these, and by 2000 there were R packages for various spatial methods “point pattern analysis, geostatistics, exploratory spatial data analysis and spatial econometrics,” according to an article presented at GeoComputation 2000 (R. Bivand and Neteler 2000).\nSome of these, notably spatial, sgeostat and splancs, are still available on CRAN (B. S. Rowlingson and Diggle 1993; B. Rowlingson and Diggle 2017; Venables and Ripley 2002; Majure and Gebhardt 2016). A subsequent article in R News (the predecessor of The R Journal) contained an overview of spatial statistical software in R at the time, much of it based on previous code written for S/S-PLUS (Ripley 2001).\nThis overview described packages for spatial smoothing and interpolation, including akima and geoR (Akima and Gebhardt 2016; Jr and Diggle 2016), and for point pattern analysis, including splancs (B. Rowlingson and Diggle 2017) and spatstat (Baddeley, Rubak, and Turner 2015). The following R News issue (Volume 1/3) put spatial packages in the spotlight again, with a more detailed introduction to splancs and a commentary on future prospects regarding spatial statistics (R. Bivand 2001).\nAdditionally, the issue introduced two packages for testing spatial autocorrelation that eventually became part of spdep (R. Bivand 2017).\nNotably, the commentary mentions the need for standardization of spatial interfaces, efficient mechanisms for exchanging data with GIS, and the handling of spatial metadata such as coordinate reference systems (CRS). maptools (written by Nicholas Lewin-Koh; R. Bivand and Lewin-Koh (2017)) is another important package from this time.\nInitially maptools just contained a wrapper around shapelib and permitted reading ESRI Shapefiles into geometry nested lists.\nThe corresponding, and nowadays obsolete, S3 class called “Map” stored this list alongside an attribute data frame.\nThe work on the “Map” class representation was nevertheless important since it directly fed into sp prior to its publication on CRAN. In 2003 Roger Bivand published an extended review of spatial packages.\nIt proposed a class system to support the “data objects offered by GDAL”, including ‘fundamental’ point, line, polygon, and raster types.\nFurthermore, it suggested that interfaces to external libraries should form the basis of modular R packages (R. Bivand 2003).\nTo a large extent these ideas were realized in the packages rgdal and sp.\nThese provided a foundation for spatial data analysis with R, as described in Applied Spatial Data Analysis with R (ASDAR) (R. Bivand, Pebesma, and Gómez-Rubio 2013), first published in 2008.\nTen years later, R’s spatial capabilities have evolved substantially but they still build on ideas set out by R. Bivand (2003):\ninterfaces to GDAL and PROJ, for example, still power R’s high-performance geographic data I/O and CRS transformation capabilities (see Chapters 6 and 7, respectively). rgdal, released in 2003, provided GDAL bindings for R which greatly enhanced its ability to import data from previously unavailable geographic data formats.\nThe initial release supported only raster drivers, but subsequent enhancements provided support for coordinate reference systems (via the PROJ library), reprojections and the import of vector file formats (see Chapter 7 for more on file formats).\nMany of these additional capabilities were developed by Barry Rowlingson and released in the rgdal codebase in 2006 (see B. Rowlingson et al. 2003 and the R-help email list for context). sp, released in 2005, overcame R’s inability to distinguish spatial and non-spatial objects (E. J. Pebesma and Bivand 2005).\nsp grew from a workshop in Vienna in 2003 and was hosted at sourceforge before migrating to R-Forge.\nPrior to 2005, geographic coordinates were generally treated like any other number.\nsp changed this with its classes and generic methods supporting points, lines, polygons and grids, and attribute data. sp stores information such as the bounding box, coordinate reference system and attributes in slots of Spatial objects using the S4 class system,\nenabling data operations to work on geographic data (see Section 2.2.2).\nFurther, sp provides generic methods such as summary() and plot() for geographic data.\nIn the following decade, sp classes rapidly became popular for geographic data in R and the number of packages that depended on it increased from around 20 in 2008 to over 100 in 2013 (R. Bivand, Pebesma, and Gómez-Rubio 2013).\nBy 2018 almost 500 packages relied on sp, making it an important part of the R ecosystem.\nProminent R packages using sp include: gstat, for spatial and spatio-temporal geostatistics; geosphere, for spherical trigonometry; and adehabitat, used for the analysis of habitat selection by animals (E. Pebesma and Graeler 2018; Calenge 2006; Hijmans 2016). While rgdal and sp solved many spatial issues, R still lacked the ability to do geometric operations (see Chapter 5).\nColin Rundel addressed this issue by developing rgeos, an R interface to the open-source geometry library (GEOS), during a Google Summer of Code project in 2010 (R. Bivand and Rundel 2018).\nrgeos enabled GEOS to manipulate sp objects, with functions such as gIntersection(). Another limitation of sp — its restricted support for raster data — was overcome by raster, first released in 2010 (Hijmans 2017).\nIts class system and functions support a range of raster operations, as outlined in Section 2.3.\nA key feature of raster is its ability to work with datasets that are too large to fit into RAM (R’s interface to PostGIS supports off-disc operations on vector geographic data).\nraster also supports map algebra (see Section 4.3.2). In parallel with these developments of class systems and methods came support for R as an interface to dedicated GIS software.\nGRASS (R. S. Bivand 2000) and the follow-up packages spgrass6 and rgrass7 (for GRASS GIS 6 and 7, respectively) were prominent examples in this direction (R. Bivand 2016a, 2016b).\nOther examples of bridges between R and GIS include RSAGA (Brenning, Bangs, and Becker 2018, first published 2008), RPyGeo (Brenning 2012a, first published 2008), RQGIS (Muenchow, Schratz, and Brenning 2017, first published 2016), and rqgisprocess (see Chapter 9).\n\nVisualization was not a focus initially, with the bulk of R-spatial development focused on analysis and geographic operations.\nsp provided methods for map making using both the base and lattice plotting systems, but demand was growing for advanced map making capabilities.\nRgoogleMaps, first released in 2009, allowed the overlay of R spatial data on top of ‘basemap’ tiles from online services such as Google Maps or OpenStreetMap (Loecher and Ropkins 2015).\n\nIt was followed by the ggmap package, which added similar ‘basemap’ tiles capabilities to ggplot2 (Kahle and Wickham 2013).\nThough ggmap facilitated map-making with ggplot2, its utility was limited by the need to fortify spatial objects, which means converting them into long data frames.\nWhile this works well for points, it is computationally inefficient for lines and polygons, since each coordinate (vertex) is converted into a row, leading to huge data frames to represent complex geometries.\nAlthough geographic visualization tended to focus on vector data, raster visualization is supported in raster and received a boost with the release of rasterVis, which is described in a book on the subject of spatial and temporal data visualization (Lamigueiro 2018).\nAs of 2018 map making in R is a hot topic, with dedicated packages such as tmap, leaflet and mapview all supporting the class system provided by sf, the focus of the next chapter (see Chapter 8 for more on visualization). Since 2018, the movement to modernize basic R packages related to handling spatial data has continued.\n\nterra — a successor of the raster package aimed at better performance and a more straightforward user interface — was first released (see Chapter 2.3) in 2020 (hijmans_terra_2021?).\nIn mid-2021, a significant change was made to the sf package by incorporating spherical geometry calculations.\nSince this change, by default, many spatial operations on data with geographic CRSs apply the C++ s2geometry library’s spherical geometry algorithms, while other types of operations on data with projected CRSs still use GEOS.\n\n\nNew ideas about spatial data representations were also developed in this period.\n\n\nThis includes the stars package, closely connected to sf, for handling raster and vector data cubes (pebesma_stars_2021?) and lidR for processing airborne LiDAR (Light Detection and Ranging) point clouds (Roussel2020?). This modernization has several reasons, including the emergence of new technologies and standards, and the impacts of spatial software development outside of the R environment (R. S. Bivand 2020).\nThe most important external factor affecting most spatial software, including R spatial packages, was the major update, including many breaking changes, to the PROJ library begun in 2018.\nMost importantly, these changes forced the replacement of proj4string with the WKT2 representation for the storage of coordinate reference systems and coordinate operations (learn more in Section 2.4 and Chapter 6).\nSince 2018, the progress of spatial visualization tools in R has been related to a few factors.\nFirstly, new types of spatial plots were developed, including the rayshader package offering a combination of raytracing and multiple hill-shading methods to produce 2D and 3D data visualizations (morganwall_rayshader_2021?).\n\nSecondly, ggplot2 gained new spatial capabilities, mostly thanks to the ggspatial package, which adds spatial visualization elements, including scale bars and north arrows (dunnington_ggspatial_2021?), and gganimate, which enables smooth and customizable spatial animations (pedersen_gganimate_2020?).\nThirdly, the performance of visualizing large spatial datasets was improved.\nThis especially relates to the automatic plotting of downscaled rasters in tmap and the possibility of using high-performance interactive rendering platforms in the mapview package, such as \"leafgl\" and \"mapdeck\".\nLastly, some of the existing mapping tools were rewritten to minimize dependencies, improve the user interface, or allow easier creation of extensions.\nThis includes the mapsf package (the successor of cartography) (giraud_mapsf_2021?) and version 4 of the tmap package, in which most of the internal code was revised. In late 2021, the planned retirement of rgdal, rgeos and maptools at the end of 2023 was announced on the R-sig-Geo mailing list by Roger Bivand.\nThis will have a large impact on existing workflows applying these packages, and will also influence the packages that depend on rgdal, rgeos or maptools.\nTherefore, Bivand’s suggestion is to plan a transition to modern tools, including sf and terra, as explained in this book’s next chapters.","code":""},{"path":"intro.html","id":"exercises","chapter":"1 Introduction","heading":"1.6 Exercises","text":"E1. Think about the terms ‘GIS’, ‘GDS’ and ‘geocomputation’ described above. Which (if any) best describes the work you would like to do using geo* methods and software and why? E2. Provide three reasons for using a scriptable language such as R for geocomputation instead of using a graphical user interface (GUI) based GIS such as QGIS. E3. Name two advantages and two disadvantages of using mature vs recent packages for geographic data analysis (for example sp vs sf, or raster vs terra).","code":""},{"path":"spatial-class.html","id":"spatial-class","chapter":"2 Geographic data in R","heading":"2 Geographic data in R","text":"","code":""},{"path":"spatial-class.html","id":"prerequisites","chapter":"2 Geographic data in R","heading":"Prerequisites","text":"This is the first practical chapter of the book, and it therefore comes with some software requirements.\nWe assume that you have an up-to-date version of R installed and that you are comfortable using software with a command-line interface such as the integrated development environment (IDE) RStudio.\nIf you are new to R, we recommend reading Chapter 2 of the online book Efficient R Programming by Gillespie and Lovelace (2016) and learning the basics of the language with reference to resources such as Grolemund and Wickham (2016).\nOrganize your work (e.g., with RStudio projects) and give scripts sensible names such as 02-chapter.R to document the code as you write and learn.\nThe packages used in this chapter can be installed with the following commands:6 All the packages needed to reproduce the contents of the book can be installed with the following command: remotes::install_github(\"geocompr/geocompkg\").\nThe necessary packages can be ‘loaded’ (technically they are attached) with the library() function as follows: The output from library(sf) reports which versions of key geographic libraries such as GEOS the package is using, as outlined in Section 2.2.1. The other packages that were installed contain data that will be used in the book:","code":"\ninstall.packages(\"sf\")\ninstall.packages(\"terra\")\ninstall.packages(\"spData\")\ninstall.packages(\"spDataLarge\", repos = \"https://nowosad.r-universe.dev\")\nlibrary(sf) # classes and functions for vector data\n#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1\nlibrary(terra) # classes and functions for raster data\nlibrary(spData) # load geographic data\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(spDataLarge) # load larger geographic data"},{"path":"spatial-class.html","id":"intro-spatial-class","chapter":"2 Geographic data in R","heading":"2.1 Introduction","text":"This chapter will provide brief explanations of the fundamental geographic data models: vector and raster.\nWe will introduce the theory behind each data model and the disciplines in which they predominate, before demonstrating their implementation in R. The vector data model represents the world using points, lines and polygons.\nThese have discrete, well-defined borders, meaning that vector datasets usually have a high level of precision (but not necessarily accuracy, as we will see in Section 2.5).\nThe raster data model divides the surface up into cells of constant size.\nRaster datasets are the basis of the background images used in web-mapping and have been a vital source of geographic data since the origins of aerial photography and satellite-based remote sensing devices.\nRasters aggregate spatially specific features to a given resolution, meaning that they are consistent over space and scalable (many worldwide raster datasets are available). Which to use?\nThe answer likely depends on your domain of application: Vector data tends to dominate the social sciences because human settlements tend to have discrete borders. Raster dominates many environmental sciences because of the reliance on remote sensing data. There is much overlap in some fields, and raster and vector datasets can be used together:\necologists and demographers, for example, commonly use both vector and raster data.\nFurthermore, it is possible to convert between the two forms (see Section 5.4).\nWhether your work involves more use of vector or raster datasets, it is worth understanding the underlying data model before using them, as discussed in subsequent chapters.\nThis book uses the sf and terra packages to work with vector data and raster datasets, respectively.","code":""},{"path":"spatial-class.html","id":"vector-data","chapter":"2 Geographic data in R","heading":"2.2 Vector data","text":"The geographic vector data model is based on points located within a coordinate reference system (CRS).\nPoints can represent self-standing features (e.g., the location of a bus stop) or they can be linked together to form more complex geometries such as lines and polygons.\nMost point geometries contain only two dimensions (3-dimensional CRSs contain an additional \\(z\\) value, typically representing height above sea level). In this system London, for example, can be represented by the coordinates c(-0.1, 51.5).\nThis means that its location is -0.1 degrees east and 51.5 degrees north of the origin.\nThe origin in this case is 0 degrees longitude (the Prime Meridian) and 0 degrees latitude (the Equator) in a geographic (‘lon/lat’) CRS (Figure 2.1, left panel).\nThe same point could also be approximated in a projected CRS with ‘Easting/Northing’ values of c(530000, 180000) in the British National Grid, meaning that London is located 530 km East and 180 km North of the \\(origin\\) of the CRS.\nThis can be verified visually: slightly more than 5 ‘boxes’ — square areas bounded by the gray grid lines 100 km in width — separate the point representing London from the origin (Figure 2.1, right panel). The location of the National Grid’s origin, in the sea beyond the South West Peninsular, ensures that most locations in the UK have positive Easting and Northing values.7\nThere is more to CRSs, as described in Sections 2.4 and 6 but, for the purposes of this section, it is sufficient to know that coordinates consist of two numbers representing distance from an origin, usually in \\(x\\) and \\(y\\) dimensions.\nFIGURE 2.1: Illustration of vector (point) data in which the location of London (the red X) is represented with reference to an origin (the blue circle). The left plot represents a geographic CRS with an origin at 0° longitude and latitude. The right plot represents a projected CRS with an origin located in the sea west of the South West Peninsula.\nThe sf package provides a class system for geographic vector data.\nNot only does sf supersede sp, it also provides a consistent command-line interface to GEOS and GDAL, superseding rgeos and rgdal (described in Section 1.5).\nThis section introduces sf classes in preparation for subsequent chapters (Chapters 5 and 7 cover the GEOS and GDAL interfaces, respectively).","code":""},{"path":"spatial-class.html","id":"intro-sf","chapter":"2 Geographic data in R","heading":"2.2.1 An introduction to simple features","text":"Simple features is an open standard developed and endorsed by the Open Geospatial Consortium (OGC), a not-for-profit organization whose activities we will revisit in a later chapter (Section 7.5).\n\nSimple Features is a hierarchical data model that represents a wide range of geometry types.\nOf the 17 geometry types supported by the specification, only 7 are used in the vast majority of geographic research (see Figure 2.2);\nthese core geometry types are fully supported by the R package sf (E. Pebesma 2018).8\nFIGURE 2.2: Simple feature types fully supported by sf.\nsf can represent all common vector geometry types (raster data classes are not supported by sf): points, lines, polygons and their respective ‘multi’ versions (which group together features of the same type into a single feature).\n\n\nsf also supports geometry collections, which can contain multiple geometry types in a single object.\nsf provides the same functionality (and more) as previously provided in three packages — sp for the data classes (E. Pebesma and Bivand 2018), rgdal for data read/write via an interface to GDAL and PROJ (R. Bivand, Keitt, and Rowlingson 2018) and rgeos for spatial operations via an interface to GEOS (R. Bivand and Rundel 2018). To re-iterate the message from Chapter 1, geographic R packages have a long history of interfacing with lower level libraries, and sf continues this tradition with a unified interface to recent versions of GEOS for geometry operations, the GDAL library for reading and writing geographic data files, and the PROJ library for representing and transforming projected coordinate reference systems.\nThrough s2, an R interface to Google’s spherical geometry library s2, sf also has access to fast and accurate “measurements and operations on non-planar geometries” (bivand_progress_2021?).\nSince sf version 1.0.0, launched in June 2021, s2 functionality is now used by default on geometries with geographic (longitude/latitude) coordinate systems, a unique feature of sf that differs from spatial libraries that only support GEOS for geometry operations, such as the Python package GeoPandas.\nWe will discuss s2 in subsequent chapters.\n\nsf’s ability to integrate multiple powerful libraries for geocomputation into a single framework is a notable achievement that reduces the ‘barriers to entry’ into the world of reproducible geographic data analysis with high-performance libraries.\nsf’s functionality is well documented on its website at r-spatial.github.io/sf/ which contains 7 vignettes.\nThese can be viewed offline as follows: As the first vignette explains, simple feature objects in R are stored in a data frame, with geographic data occupying a special column, usually named ‘geom’ or ‘geometry.’\nWe will use the world dataset provided by spData, loaded at the beginning of this chapter, to show what sf objects are and how they work.\nworld is an ‘sf data frame’ containing spatial and attribute columns, the names of which are returned by the function names() (the last column in this example contains the geographic information): The contents of this geom column give sf objects their spatial powers: world$geom is a ‘list column’ that contains all the coordinates of the country polygons.\n\nsf objects can be plotted quickly with the base R function plot();\nthe following command creates Figure 2.3.\nFIGURE 2.3: A spatial plot of the world using the sf package, with a facet for each attribute.\nNote that instead of creating a single map by default for geographic objects, as most GIS programs do, plot()ing sf objects results in a map for each variable in the datasets.\nThis behavior can be useful for exploring the spatial distribution of different variables and is discussed further in Section 2.2.3. More broadly, treating geographic objects as regular data frames with spatial powers has many advantages, especially if you are already used to working with data frames.\nThe commonly used summary() function, for example, provides a useful overview of the variables within the world object. Although we only selected one variable for the summary() command, it also outputs a report on the geometry.\nThis demonstrates the ‘sticky’ behavior of the geometry columns of sf objects, meaning the geometry is kept unless the user deliberately removes it, as we’ll see in Section 3.2.\nThe result provides a quick summary of both the non-spatial and spatial data contained in world: the mean average life expectancy is 71 years (ranging from less than 51 to more than 83 years, with a median of 73 years) across all countries. It is worth taking a deeper look at the basic behavior and contents of this simple feature object, which can usefully be thought of as a ‘spatial data frame.’ sf objects are easy to subset.\nThe code below shows the first two rows and three columns.\nThe output shows two major differences compared with a regular data.frame: the inclusion of additional geographic data (geometry type, dimension, bbox and CRS information - epsg (SRID), proj4string), and the presence of a geometry column, here named geom: This may seem rather complex, especially for a class system that is supposed to be simple.\nHowever, there are good reasons for organizing things this way and using sf. Before describing each geometry type that the sf package supports, it is worth taking a step back to understand the building blocks of sf objects.\nSection 2.2.8 shows how simple features objects are data frames, with special geometry columns.\nThese spatial columns are often called geom or geometry: world$geom refers to the spatial element of the world object described above.\nThese geometry columns are ‘list columns’ of class sfc (see Section 2.2.7).\nIn turn, sfc objects are composed of one or more objects of class sfg: simple feature geometries that we describe in Section 2.2.6.\n\nTo understand how the spatial components of simple features work, it is vital to understand simple feature geometries.\nFor this reason we cover each currently supported simple features geometry type in Section 2.2.5 before moving on to describe how these can be represented in R, using sfg objects, which form the basis of sfc and eventually full sf objects.","code":"\nvignette(package = \"sf\") # see which vignettes are available\nvignette(\"sf1\") # an introduction to the package\nclass(world)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\nnames(world)\n#> [1] \"iso_a2\" \"name_long\" \"continent\" \"region_un\" \"subregion\" \"type\" \n#> [7] \"area_km2\" \"pop\" \"lifeExp\" \"gdpPercap\" \"geom\"\nplot(world)\nsummary(world[\"lifeExp\"])\n#> lifeExp geom \n#> Min. :50.6 MULTIPOLYGON :177 \n#> 1st Qu.:65.0 epsg:4326 : 0 \n#> Median :72.9 +proj=long...: 0 \n#> Mean :70.9 \n#> 3rd Qu.:76.8 \n#> Max. :83.6 \n#> NA's :10\nworld_mini = world[1:2, 1:3]\nworld_mini\n#> Simple feature collection with 2 features and 3 fields\n#> Geometry type: MULTIPOLYGON\n#> Dimension: XY\n#> Bounding box: xmin: -180 ymin: -18.3 xmax: 180 ymax: -0.95\n#> Geodetic CRS: WGS 84\n#> # A tibble: 2 × 4\n#> iso_a2 name_long continent geom\n#> \n#> 1 FJ Fiji Oceania (((-180 -16.6, -180 -16.5, -180 -16, -180 -16.1, -…\n#> 2 TZ Tanzania Africa (((33.9 -0.95, 31.9 -1.03, 30.8 -1.01, 30.4 -1.13,…"},{"path":"spatial-class.html","id":"why-simple-features","chapter":"2 Geographic data in R","heading":"2.2.2 Why simple features?","text":"Simple features is a widely supported data model that underlies data structures in many GIS applications, including QGIS and PostGIS.\nA major advantage of this is that using the data model ensures your work is cross-transferable to other set-ups, for example importing from and exporting to spatial databases.\nA more specific question from an R perspective is “why use the sf package when sp is already tried and tested?”\nThere are many reasons (linked to the advantages of the simple features model): Fast reading and writing of data. Enhanced plotting performance. sf objects can be treated as data frames in most operations. sf function names are relatively consistent and intuitive (they all begin with st_). sf functions can be combined using the %>% operator, which works well with the tidyverse collection of R packages. sf’s support for tidyverse packages is exemplified by the provision of two functions for reading in data, st_read() and read_sf(), which store attributes in base R data.frame and tidyverse tibble classes respectively, as demonstrated below (see Chapter 3 on manipulating geographic data with tidyverse functions and Section 7.6.1 for further details on reading and writing geographic vector data with R): sf is now the go-to package for analysis of spatial vector data in R (notwithstanding the spatstat package ecosystem, which provides numerous functions for spatial statistics).\nMany popular packages build on sf, as shown by the rise in its popularity in terms of the number of downloads per day, shown in Section 1.4 of the previous chapter.\nHowever, it will take many years for most packages to fully transition away from older packages such as sp, many packages depend on both sf and sp, and some will never switch (bivand_progress_2021?). In this context it is important to note that people still using the sp (and related rgeos and rgdal) packages are advised to switch to sf.\nThe description of rgeos on CRAN, for example, states that the package will be “retired by the end of 2023” and advises people to plan a transition to sf.\nIn other words, sf is future proof but sp is not.\nFor workflows that depend on the legacy class system, sf objects can be converted to the Spatial class of the sp package as follows:","code":"\nnc_dfr = st_read(system.file(\"shape/nc.shp\", package=\"sf\"))\n#> Reading layer `nc' from data source \n#> `/usr/local/lib/R/site-library/sf/shape/nc.shp' using driver `ESRI Shapefile'\n#> Simple feature collection with 100 features and 14 fields\n#> Geometry type: MULTIPOLYGON\n#> Dimension: XY\n#> Bounding box: xmin: -84.3 ymin: 33.9 xmax: -75.5 ymax: 36.6\n#> Geodetic CRS: NAD27\nnc_tbl = read_sf(system.file(\"shape/nc.shp\", package=\"sf\"))\nclass(nc_dfr)\n#> [1] \"sf\" \"data.frame\"\nclass(nc_tbl)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\nlibrary(sp)\nworld_sp = as(world, Class = \"Spatial\") # from an sf object to sp\n# sp functions ...\nworld_sf = st_as_sf(world_sp) # from sp to sf"},{"path":"spatial-class.html","id":"basic-map","chapter":"2 Geographic data in R","heading":"2.2.3 Basic map making","text":"Basic maps are created in sf with plot().\nBy default this creates a multi-panel plot (like sp’s spplot()), one sub-plot for each variable of the object, as illustrated in the left-hand panel in Figure 2.4.\nA legend or ‘key’ with a continuous color is produced if the object to be plotted has a single variable (see the right-hand panel).\nColors can also be set with col =, although this will not create a continuous palette or a legend.\n\nFIGURE 2.4: Plotting with sf, with multiple variables (left) and a single variable (right).\nPlots are added as layers to existing images by setting add = TRUE.9\nTo demonstrate this, and to provide a taster of the content covered in Chapters 3 and 4 on attribute and spatial data operations, the subsequent code chunk combines countries in Asia: We can now plot the Asian continent over a map of the world.\nNote that the first plot must only have one facet for add = TRUE to work.\nIf the first plot has a key, reset = FALSE must be used (result not shown): Adding layers in this way can be used to verify the geographic correspondence between layers:\nthe plot() function is fast to execute and requires few lines of code, but it does not create interactive maps with a wide range of options.\nFor advanced map making we recommend using dedicated visualization packages such as tmap (see Chapter 8).","code":"\nplot(world[3:6])\nplot(world[\"pop\"])\nworld_asia = world[world$continent == \"Asia\", ]\nasia = st_union(world_asia)\nplot(world[\"pop\"], reset = FALSE)\nplot(asia, add = TRUE, col = \"red\")"},{"path":"spatial-class.html","id":"base-args","chapter":"2 Geographic data in R","heading":"2.2.4 Base plot arguments","text":"There are various ways to modify maps with sf’s plot() method.\nBecause sf extends base R plotting methods, plot()’s arguments such as main = (which specifies the title of the map) work with sf objects (see ?graphics::plot and ?par).10\n\nFigure 2.5 illustrates this flexibility by overlaying circles, whose diameters (set with cex =) represent country populations, on a map of the world.\nAn unprojected version of this figure can be created with the following commands (see the exercises at the end of this chapter and the script 02-contplot.R to reproduce Figure 2.5):\nFIGURE 2.5: Country continents (represented by fill color) and 2015 populations (represented by circles, with area proportional to population).\nThe code above uses the function st_centroid() to convert one geometry type (polygons) to another (points) (see Chapter 5), the aesthetics of which are varied with the cex argument.\nsf’s plot method also has arguments specific to geographic data. 
expandBB, example, can used plot sf object context:\ntakes numeric vector length four expands bounding box plot relative zero following order: bottom, left, top, right.\nused plot India context giant Asian neighbors, emphasis China east, following code chunk, generates Figure 2.6 (see exercises adding text plots):\nFIGURE 2.6: India context, demonstrating expandBB argument.\nNote use [0] keep geometry column lwd emphasize India.\nSee Section 8.6 visualization techniques representing range geometry types, subject next section.","code":"\nplot(world[\"continent\"], reset = FALSE)\ncex = sqrt(world$pop) / 10000\nworld_cents = st_centroid(world, of_largest = TRUE)\nplot(st_geometry(world_cents), add = TRUE, cex = cex)\nindia = world[world$name_long == \"India\", ]\nplot(st_geometry(india), expandBB = c(0, 0.2, 0.1, 1), col = \"gray\", lwd = 3)\nplot(world_asia[0], add = TRUE)"},{"path":"spatial-class.html","id":"geometry","chapter":"2 Geographic data in R","heading":"2.2.5 Geometry types","text":"Geometries basic building blocks simple features.\nSimple features R can take one 17 geometry types supported sf package.\n\n\nchapter focus seven commonly used types: POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON GEOMETRYCOLLECTION.\nFind whole list possible feature types PostGIS manual.Generally, well-known binary (WKB) well-known text (WKT) standard encoding simple feature geometries.\n\n\n\nWKB representations usually hexadecimal strings easily readable computers.\nGIS spatial databases use WKB transfer store geometry objects.\nWKT, hand, human-readable text markup description simple features.\nformats exchangeable, present one, naturally choose WKT representation.basis geometry type point.\npoint simply coordinate 2D, 3D 4D space (see vignette(\"sf1\") information) (see left panel Figure 2.7):\nPOINT (5 2)\nlinestring sequence points straight line connecting points, example (see middle panel Figure 2.7):LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)polygon 
sequence points form closed, non-intersecting ring.\nClosed means first last point polygon coordinates (see right panel Figure 2.7).11\nPolygon without hole: POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))\nFIGURE 2.7: Illustration point, linestring polygon geometries.\nfar created geometries one geometric entity per feature.\nHowever, sf also allows multiple geometries exist within single feature (hence term ‘geometry collection’) using “multi” version geometry type:\nMultipoint: MULTIPOINT (5 2, 1 3, 3 4, 3 2)Multilinestring: MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))Multipolygon: MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5), (0 2, 1 2, 1 3, 0 3, 0 2)))\nFIGURE 2.8: Illustration multi* geometries.\nFinally, geometry collection can contain combination geometries including (multi)points linestrings (see Figure 2.9):\nGeometry collection: GEOMETRYCOLLECTION (MULTIPOINT (5 2, 1 3, 3 4, 3 2), LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))\nFIGURE 2.9: Illustration geometry collection.\n","code":""},{"path":"spatial-class.html","id":"sfg","chapter":"2 Geographic data in R","heading":"2.2.6 Simple feature geometries (sfg)","text":"sfg class represents different simple feature geometry types R: point, linestring, polygon (‘multi’ equivalents, multipoints) geometry collection.\nUsually spared tedious task creating geometries since can simply import already existing spatial file.\nHowever, set functions create simple feature geometry objects (sfg) scratch needed.\nnames functions simple consistent, start st_ prefix end name geometry type lowercase letters:point: st_point()linestring: st_linestring()polygon: st_polygon()multipoint: st_multipoint()multilinestring: st_multilinestring()multipolygon: st_multipolygon()geometry collection: st_geometrycollection()sfg objects can created three base R data types:numeric vector: single pointA matrix: set points, row represents point, multipoint linestringA list: collection objects matrices, multilinestrings geometry collectionsThe function 
`st_point()` creates single points from numeric vectors:

```r
st_point(c(5, 2))                 # XY point
#> POINT (5 2)
st_point(c(5, 2, 3))              # XYZ point
#> POINT Z (5 2 3)
st_point(c(5, 2, 1), dim = "XYM") # XYM point
#> POINT M (5 2 1)
st_point(c(5, 2, 3, 1))           # XYZM point
#> POINT ZM (5 2 3 1)
```

The results show that the XY (2D coordinates), XYZ (3D coordinates) and XYZM (3D with an additional variable, typically measurement accuracy) point types are created from vectors of length 2, 3 and 4, respectively.
The XYM type must be specified using the `dim` argument (which is short for dimension).

By contrast, use matrices in the case of multipoint (`st_multipoint()`) and linestring (`st_linestring()`) objects:

```r
# the rbind function simplifies the creation of matrices
## MULTIPOINT
multipoint_matrix = rbind(c(5, 2), c(1, 3), c(3, 4), c(3, 2))
st_multipoint(multipoint_matrix)
#> MULTIPOINT ((5 2), (1 3), (3 4), (3 2))
## LINESTRING
linestring_matrix = rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2))
st_linestring(linestring_matrix)
#> LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)
```

Finally, use lists for the creation of multilinestrings, (multi-)polygons and geometry collections:

```r
## POLYGON
polygon_list = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))
st_polygon(polygon_list)
#> POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))
## POLYGON with a hole
polygon_border = rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))
polygon_hole = rbind(c(2, 4), c(3, 4), c(3, 3), c(2, 3), c(2, 4))
polygon_with_hole_list = list(polygon_border, polygon_hole)
st_polygon(polygon_with_hole_list)
#> POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5), (2 4, 3 4, 3 3, 2 3, 2 4))
## MULTILINESTRING
multilinestring_list = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)),
                            rbind(c(1, 2), c(2, 4)))
st_multilinestring(multilinestring_list)
#> MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))
## MULTIPOLYGON
multipolygon_list = list(list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))),
                         list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2))))
st_multipolygon(multipolygon_list)
#> MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5)), ((0 2, 1 2, 1 3, 0 3, 0 2)))
## GEOMETRYCOLLECTION
geometrycollection_list = list(st_multipoint(multipoint_matrix),
                               st_linestring(linestring_matrix))
st_geometrycollection(geometrycollection_list)
#> GEOMETRYCOLLECTION (MULTIPOINT (5 2, 1 3, 3 4, 3 2),
#>   LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))
```

### 2.2.7 Simple feature columns (sfc)

One sfg object contains only a single simple feature geometry.
A simple feature geometry column (sfc) is a list of sfg objects, which is additionally able to contain information about the coordinate reference system in use.
For instance, to combine two simple features into one object with two features, we can use the `st_sfc()` function.
This is important since sfc represents the geometry column in sf data frames:

```r
# sfc POINT
point1 = st_point(c(5, 2))
point2 = st_point(c(1, 3))
points_sfc = st_sfc(point1, point2)
points_sfc
#> Geometry set for 2 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 2 xmax: 5 ymax: 3
#> CRS:           NA
#> POINT (5 2)
#> POINT (1 3)
```

In most cases, an sfc object contains objects of the same geometry type.
Therefore, when we convert sfg objects of type polygon into a simple feature geometry column, we also end up with an sfc object of type polygon, which can be verified with `st_geometry_type()`.
Equally, a geometry column of multilinestrings would result in an sfc object of type multilinestring:

```r
# sfc POLYGON
polygon_list1 = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))
polygon1 = st_polygon(polygon_list1)
polygon_list2 = list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2)))
polygon2 = st_polygon(polygon_list2)
polygon_sfc = st_sfc(polygon1, polygon2)
st_geometry_type(polygon_sfc)
#> [1] POLYGON POLYGON
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
# sfc MULTILINESTRING
multilinestring_list1 = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)),
                             rbind(c(1, 2), c(2, 4)))
multilinestring1 = st_multilinestring(multilinestring_list1)
multilinestring_list2 = list(rbind(c(2, 9), c(7, 9), c(5, 6), c(4, 7), c(2, 7)),
                             rbind(c(1, 7), c(3, 8)))
multilinestring2 = st_multilinestring(multilinestring_list2)
multilinestring_sfc = st_sfc(multilinestring1, multilinestring2)
st_geometry_type(multilinestring_sfc)
#> [1] MULTILINESTRING MULTILINESTRING
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
```

It is also possible to create an sfc object from sfg objects with different geometry types:

```r
# sfc GEOMETRY
point_multilinestring_sfc = st_sfc(point1, multilinestring1)
st_geometry_type(point_multilinestring_sfc)
#> [1] POINT           MULTILINESTRING
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
```

As mentioned before, sfc objects can additionally store information on coordinate reference systems (CRS).
The default value is `NA` (Not Available), as can be verified with `st_crs()`:

```r
st_crs(points_sfc)
#> Coordinate Reference System: NA
```

All geometries in an sfc object must have the same CRS.
A coordinate reference system can be added with the `crs` argument of `st_sfc()`.
To specify a certain CRS, we can provide a Spatial Reference System Identifier (SRID, e.g., `"EPSG:4326"`), a well-known text (WKT2), or a proj4string representation (see Section 2.4).
If we provide an SRID or a proj4string, the well-known text (WKT2) is added automatically:

```r
# EPSG definition
points_sfc_wgs = st_sfc(point1, point2, crs = "EPSG:4326")
st_crs(points_sfc_wgs)
#> Coordinate Reference System:
#>   User input: EPSG:4326 
#>   wkt:
#> GEOGCRS["WGS 84",
#>     DATUM["World Geodetic System 1984",
#>         ELLIPSOID["WGS 84",6378137,298.257223563,
#>             LENGTHUNIT["metre",1]]],
#>     PRIMEM["Greenwich",0,
#>         ANGLEUNIT["degree",0.0174532925199433]],
#>     CS[ellipsoidal,2],
#>         AXIS["geodetic latitude (Lat)",north,
#>             ORDER[1],
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         AXIS["geodetic longitude (Lon)",east,
#>             ORDER[2],
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>     USAGE[
#>         SCOPE["unknown"],
#>         AREA["World"],
#>         BBOX[-90,-180,90,180]],
#>     ID["EPSG",4326]]
```

### 2.2.8 The sf class

Sections 2.2.5 to 2.2.7 deal with purely geometric objects, 'sf geometry' and 'sf column' objects, respectively.
These are the geographic building blocks of geographic vector data represented as simple features.
The final building block is non-geographic attributes, representing the name of the feature or other attributes such as measured values, groups, and other things.
To illustrate attributes, we will represent a temperature of 25°C in London on June 21st, 2017.
This example contains a geometry (the coordinates) and three attributes with three different classes (place name, temperature and date).
Objects of class sf represent such data by combining the attributes (`data.frame`) with the simple feature geometry column (`sfc`).
They are created with `st_sf()`, as illustrated in the code below, which creates the London example just described.
What just happened?
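Before unpacking those steps, a hedged base-R sketch (no **sf** required; the object names are illustrative, not sf's internals) of the structure involved may help: a data frame carrying a list column with one geometry per feature.

```r
# Conceptual sketch in base R only: an sf-like object is just a data frame
# whose geometry column is a list column (one geometry entry per row).
lnd = data.frame(name = "London", temperature = 25)
lnd$geometry = list(c(x = 0.1, y = 51.5))  # stand-in for the sfc column
class(lnd)         # still a plain data.frame
lnd$geometry[[1]]  # the 'geometry' of the first (and only) feature
```

This is only an analogy: real sfc columns carry a CRS and geometry-type metadata that a bare list column lacks.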
First, the coordinates were used to create the simple feature geometry (sfg).
Second, the geometry was converted into a simple feature geometry column (sfc), with a CRS.
Third, the attributes were stored in a `data.frame`, which was combined with the sfc object with `st_sf()`.
This results in an sf object, as demonstrated below (some output is omitted):

```r
lnd_point = st_point(c(0.1, 51.5))              # sfg object
lnd_geom = st_sfc(lnd_point, crs = 4326)        # sfc object
lnd_attrib = data.frame(                        # data.frame object
  name = "London",
  temperature = 25,
  date = as.Date("2017-06-21")
)
lnd_sf = st_sf(lnd_attrib, geometry = lnd_geom) # sf object
lnd_sf
#> Simple feature collection with 1 features and 3 fields
#> ...
#>     name temperature       date         geometry
#> 1 London          25 2017-06-21 POINT (0.1 51.5)
class(lnd_sf)
#> [1] "sf"         "data.frame"
```

The result shows that sf objects actually have two classes, sf and data.frame.
Simple features are simply data frames (square tables), but with spatial attributes stored in a list column, usually called geometry, as described in Section 2.2.1.
This duality is central to the concept of simple features: most of the time an sf object can be treated as, and behaves like, a `data.frame`.
Simple features are, in essence, data frames with a spatial extension.

## 2.3 Raster data

The spatial raster data model represents the world with a continuous grid of cells (often also called pixels; Figure 2.10:A).
This data model often refers to so-called regular grids, in which each cell has the same, constant size -- and we will focus on regular grids in this book.
However, several other types of grids exist, including rotated, sheared, rectilinear, and curvilinear grids (see Chapter 1 of E. Pebesma and Bivand (2022) or Chapter 2 of Tennekes and Nowosad (2022)).

The raster data model usually consists of a raster header and a matrix (with rows and columns) representing equally spaced cells (often also called pixels; Figure 2.10:A).
The raster header defines the coordinate reference system, the extent and the origin.
The origin (or starting point) is frequently the coordinate of the lower-left corner of the matrix (the **terra** package, however, uses the upper-left corner by default (Figure 2.10:B)).
The header defines the extent via the number of columns, the number of rows and the cell size resolution.
Hence, starting from the origin, we can easily access and modify each single cell by either using the ID of a cell (Figure 2.10:B) or by explicitly specifying its row and column.
This matrix representation avoids storing explicitly the coordinates of the four corner points of each cell corner, as would be the case for rectangular vector polygons; in fact it only stores one coordinate, namely the origin.
This and map algebra (Section 4.3.2) make raster processing much more efficient and faster than vector data processing.
However, in contrast to vector data, the cell of one raster layer can only hold a single value.
That value might be numeric or categorical (Figure 2.10:C).

FIGURE 2.10: Raster data types: (A) cell IDs, (B) cell values, (C) a colored raster map.

Raster maps usually represent continuous phenomena such as elevation, temperature, population density or spectral data (Figure 2.11).
Of course, we can represent discrete features such as soil or land-cover classes with the help of the raster data model as well (Figure 2.11).
Consequently, the discrete borders of these features become blurred, and depending on the spatial task a vector representation might be more suitable.

FIGURE 2.11: Examples of continuous and categorical rasters.

### 2.3.1 R packages for raster data handling

R has several packages able to read and process spatial raster data; see Section \@ref(the-history-of-r-spatial) for more context.
However, currently, two main packages exist for this purpose -- **terra** and **stars**.
We focus on the **terra** package in this book; however, it may be worth knowing the basic similarities and differences between the packages before deciding which one to use.

First, **terra**
focuses on the most common raster data model (regular grids), while **stars** also allows storing less popular models (including regular, rotated, sheared, rectilinear, and curvilinear grids).
While **terra** usually handles one or multi-layered rasters (it also has an additional class, `SpatRasterDataset`, for storing many collections of datasets), the **stars** package provides ways to store raster data cubes -- a raster object with many layers (e.g., bands), for many moments in time (e.g., months), and many attributes (e.g., sensor type A and sensor type B).
Importantly, in both packages, all layers or elements of a data cube must have the same spatial dimensions and extent.
Second, both packages allow us to either read all of the raster data into memory or just read the metadata -- this is usually decided automatically based on the input file size.
However, they store raster values differently: **terra** is based on C++ code and mostly uses C++ pointers, while **stars** stores values as lists of arrays for smaller rasters, or just a file path for larger ones.
Third, **stars** functions are closely related to the vector objects and functions in **sf**, while **terra** uses its own class of objects for vector data, namely `SpatVector`.
Fourth, the two packages take different approaches to how their functions work on their objects.
The **terra** package mostly relies on a large number of built-in functions, where each function has a specific purpose (e.g., resampling or cropping).
On the other hand, **stars** uses some built-in functions (usually with names starting with `st_`), some methods for existing R functions (e.g., `split()` and `aggregate()`), and also existing **dplyr** functions (e.g., `filter()` and `slice()`).

Importantly, it is straightforward to convert objects from **terra** to **stars** (using `st_as_stars()`) and the other way round (using `rast()`).
We also encourage you to read E. Pebesma and Bivand (2022) for a comprehensive introduction to the **stars** package.

### 2.3.2 An introduction to terra

The **terra** package supports raster objects in R.
Like its predecessor **raster** (created by the same developer, Robert Hijmans), it provides an extensive set of functions to create, read, export, manipulate and process raster datasets.
**terra**'s functionality is largely the same as that of the more mature **raster** package, but there are some differences: **terra** functions are usually more computationally efficient than their **raster** equivalents.
On the other hand, the **raster** class system is popular and used by many other packages; thankfully, as with **sf** and **sp**, you can seamlessly translate between the two types of object to ensure backwards compatibility with older scripts and packages, for example with the functions `raster()`, `stack()`, and `brick()` (see the previous chapter on the evolution of R packages for working with geographic data).

In addition to functions for raster data manipulation, **terra** provides many low-level functions that can form a foundation for developing new tools for working with raster datasets.
**terra** also lets you work on raster datasets that are too large to fit into the main memory.
In this case, **terra** provides the possibility to divide the raster into smaller chunks, and processes these iteratively instead of loading the whole raster file into RAM.

For the illustration of **terra** concepts, we will use datasets from **spDataLarge**.
It consists of a few raster objects and one vector object covering an area of Zion National Park (Utah, USA).
For example, `srtm.tif` is a digital elevation model of this area (for more details, see its documentation with `?srtm`).
First, let's create a `SpatRaster` object named `my_rast`.
Typing the name of the raster into the console prints the raster header (dimensions, resolution, extent, CRS) and some additional information (class, data source, summary of the raster values).
Dedicated functions report each component: `dim(my_rast)` returns the number of rows, columns and layers; `ncell()` the number of cells (pixels); `res()` the spatial resolution; `ext()` its spatial extent; and `crs()` its coordinate reference system (raster reprojection is covered in Section 6.6).
`inMemory()` reports whether the raster data is stored in memory or on disk, and `help("terra-package")` returns a full list of all available **terra**
functions.

```r
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
my_rast = rast(raster_filepath)
class(my_rast)
#> [1] "SpatRaster"
#> attr(,"package")
#> [1] "terra"
my_rast
#> class       : SpatRaster 
#> dimensions  : 457, 465, 1  (nrow, ncol, nlyr)
#> resolution  : 0.000833, 0.000833  (x, y)
#> extent      : -113, -113, 37.1, 37.5  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source      : srtm.tif 
#> name        : srtm 
#> min value   : 1024 
#> max value   : 2892
```

### 2.3.3 Basic map making

Similar to the **sf** package, **terra** also provides `plot()` methods for its own classes:

```r
plot(my_rast)
```

FIGURE 2.12: Basic raster plot.

There are several other approaches for plotting raster data in R that are outside the scope of this section, including:

- the `plotRGB()` function from the **terra** package to create a Red-Green-Blue plot based on three layers in a `SpatRaster` object
- packages such as **tmap** to create static and interactive maps of raster and vector objects (see Chapter 8)
- functions, for example `levelplot()` from the **rasterVis** package, to create facets, a common technique for visualizing change over time

### 2.3.4 Raster classes

The `SpatRaster` class represents rasters as objects of **terra**.
The easiest way to create a raster object in R is to read in a raster file from disk or from a server (Section 7.6.2).
The **terra** package supports numerous drivers with the help of the GDAL library.
Rasters from files are usually not read entirely into RAM, with the exception of their header and a pointer to the file itself:

```r
single_raster_file = system.file("raster/srtm.tif", package = "spDataLarge")
single_rast = rast(single_raster_file)
```

Rasters can also be created from scratch using the `rast()` function.
This is illustrated in the subsequent code chunk, which results in a new `SpatRaster` object.
The resulting raster consists of 36 cells (6 columns and 6 rows specified by `nrows` and `ncols`) centered around the Prime Meridian and the Equator (see the `xmin`, `xmax`, `ymin` and `ymax` parameters).
The default CRS of raster objects is WGS84, but it can be changed with the `crs` argument.
That means the unit of the resolution is in degrees, which we set to 0.5 (`resolution`).
Values (`vals`) are assigned to each cell: 1 to cell 1, 2 to cell 2, and so on.
Remember: `rast()` fills cells row-wise (unlike `matrix()`) starting at the upper-left corner, meaning the top row contains the values 1 to 6, the second 7 to 12, and so on:

```r
new_raster = rast(nrows = 6, ncols = 6, resolution = 0.5,
                  xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,
                  vals = 1:36)
```

For other ways of creating raster objects, see `?rast`.

The `SpatRaster` class also handles multiple layers, which typically correspond to a single multispectral satellite file or a time-series of rasters:

```r
multi_raster_file = system.file("raster/landsat.tif", package = "spDataLarge")
multi_rast = rast(multi_raster_file)
multi_rast
#> class       : SpatRaster 
#> dimensions  : 1428, 1128, 4  (nrow, ncol, nlyr)
#> resolution  : 30, 30  (x, y)
#> extent      : 301905, 335745, 4111245, 4154085  (xmin, xmax, ymin, ymax)
#> coord. ref. : WGS 84 / UTM zone 12N (EPSG:32612) 
#> source      : landsat.tif 
#> names       : lan_1, lan_2, lan_3, lan_4 
#> min values  :  7550,  6404,  5678,  5252 
#> max values  : 19071, 22051, 25780, 31961
```

`nlyr()` retrieves the number of layers stored in a `SpatRaster` object:

```r
nlyr(multi_rast)
#> [1] 4
```

For multi-layer raster objects, layers can be selected with `terra::subset()`.
It accepts a layer number or its name as the second argument:

```r
multi_rast3 = subset(multi_rast, 3)
multi_rast4 = subset(multi_rast, 4)
```

The opposite operation, combining several `SpatRaster` objects into one, can be done using the `c` function:

```r
multi_rast34 = c(multi_rast3, multi_rast4)
```

## 2.4 Coordinate Reference Systems

Vector and raster spatial data types share concepts intrinsic to spatial data.
Perhaps the most fundamental of these is the Coordinate Reference System (CRS), which defines how the spatial elements of the data relate to the surface of the Earth (or other bodies).
CRSs are either geographic or projected, as introduced at the beginning of this chapter (see Figure 2.1).
This section explains each type, laying the foundations for Chapter 6 on CRS transformations.

### 2.4.1 Geographic coordinate systems

Geographic coordinate systems identify any location on the Earth's surface using two values — longitude and latitude (see left panel of Figure 2.14).
Longitude is the location in the East-West direction in angular distance from the Prime Meridian plane.
Latitude is the angular distance North or South of the equatorial plane.
Distances in geographic CRSs are therefore not measured in meters.
This has important consequences, as demonstrated in Chapter 6.

The surface of the Earth in geographic coordinate systems is represented by a spherical or ellipsoidal surface.
Spherical models assume that the Earth is a perfect sphere of a given radius -- they have the advantage of simplicity but, at the same time, they are inaccurate: the Earth is not a sphere!
Ellipsoidal models are defined by two parameters: the equatorial radius and the polar radius.
These are suitable because the Earth is compressed: the equatorial radius is around 11.5 km longer than the polar radius (Maling 1992).

Ellipsoids are part of a wider component of CRSs: the datum.
This contains information on what ellipsoid to use and the precise relationship between the Cartesian coordinates and location on the Earth's surface.
There are two types of datum — geocentric (such as WGS84) and local (such as NAD83).
You can see examples of both types of datums in Figure 2.13.
The black lines
represent a geocentric datum, whose center is located in the Earth's center of gravity and is not optimized for a specific location.
In a local datum, shown as a purple dashed line, the ellipsoidal surface is shifted to align with the surface at a particular location.
These allow local variations in the Earth's surface, for example due to large mountain ranges, to be accounted for in a local CRS.
This can be seen in Figure 2.13, where the local datum is fitted to the area of the Philippines, but is misaligned with most of the rest of the planet's surface.
Both datums in Figure 2.13 are put on top of a geoid -- a model of global mean sea level.

FIGURE 2.13: Geocentric and local geodetic datums shown on top of a geoid (in false color and with vertical exaggeration by a 10,000 scale factor). Image of the geoid is adapted from the work of Ince et al. (2019).

### 2.4.2 Projected coordinate reference systems

All projected CRSs are based on a geographic CRS, described in the previous section, and rely on map projections to convert the three-dimensional surface of the Earth into Easting and Northing (x and y) values in a projected CRS.
Projected CRSs are based on Cartesian coordinates on an implicitly flat surface (right panel of Figure 2.14).
They have an origin, x and y axes, and a linear unit of measurement such as meters.

This transition cannot be done without adding some deformations.
Therefore, some properties of the Earth's surface are distorted in this process, such as area, direction, distance, or shape.
A projected coordinate system can preserve only one or two of those properties.
Projections are often named based on the property they preserve: equal-area preserves area, azimuthal preserves direction, equidistant preserves distance, and conformal preserves local shape.

There are three main groups of projection types -- conic, cylindrical, and planar (azimuthal).
In a conic projection, the Earth's surface is projected onto a cone along a single line of tangency or two lines of tangency.
Distortions are minimized along the tangency lines and rise with the distance from those lines in this projection.
Therefore, it is best suited for maps of mid-latitude areas.
A cylindrical projection maps the surface onto a cylinder.
This projection could also be created by touching the Earth's surface along a single line of tangency or two lines of tangency.
Cylindrical projections are used most often when mapping the entire world.
A planar projection projects data onto a flat surface touching the globe at a point or along a line of tangency.
It is typically used in mapping polar regions.
`sf_proj_info(type = "proj")` gives a list of the available projections supported by the PROJ library.

A quick summary of different projections, their types, properties, and suitability can be found in "Map Projections" (1993).

FIGURE 2.14: Examples of geographic (WGS 84; left) and projected (NAD83 / UTM zone 12N; right) coordinate systems for a vector data type.

### 2.4.3 CRSs in R

Spatial R packages support a wide range of CRSs and they use the long-established PROJ library.
Two recommended ways to describe CRSs in R are (a) Spatial Reference System Identifiers (SRIDs) or (b) well-known text (known as WKT2) definitions.
Both of these approaches have advantages and disadvantages.

An SRID is a unique value used to identify coordinate reference system definitions in the form AUTHORITY:CODE.
The most popular registry of SRIDs is EPSG; however, other registries, such as ESRI or OGR, exist.
For example, EPSG:4326 represents the latitude/longitude WGS84 CRS, and ESRI:54030 the Robinson projection.
SRIDs are usually short and therefore easier to remember.
Each SRID is associated with a well-known text (WKT2) definition of the coordinate reference system.

WKT2 describes coordinate reference systems (CRSs) and coordinate operations in the form of well-known text strings.
It is exhaustive, detailed, and precise (as you can see later in this section), allowing for unambiguous CRS storage and transformations.
It consists of all the information about any given CRS, including its datum and ellipsoid, prime meridian, projection, and units.
This feature also makes the WKT2 approach more complicated and usually too complex to be manually defined.

In the past, proj4string definitions were the standard way to specify coordinate operations and store CRSs.
These string representations, built on a key=value form (e.g., +proj=longlat +datum=WGS84 +no_defs), are, however, currently discouraged in most cases.
PROJ version 6 still allows the use of proj4strings to define coordinate operations, but some proj4string keys are no longer supported or are not advisable to use (e.g., +nadgrids, +towgs84, +k, +init=epsg:), and only
three datums (i.e., WGS84, NAD83, and NAD27) can be directly set in a proj4string.
Importantly, proj4strings are not used to store CRSs anymore.
Longer explanations of the recent changes in the PROJ library and why proj4string was replaced by WKT2 can be found in R. S. Bivand (2020), Chapter 2 of E. Pebesma and Bivand (2022), and a blog post by Floris Vanderhaeghe.

Let's look at how CRSs are stored in R spatial objects and how they can be set.
For this, we need to read in a vector dataset:

```r
vector_filepath = system.file("shapes/world.gpkg", package = "spData")
new_vector = read_sf(vector_filepath)
```

Our new object, `new_vector`, is a polygon representing a world map of data (`?spData::world`).
In **sf** the CRS of an object can be retrieved using `st_crs()`:

```r
st_crs(new_vector) # get CRS
#> Coordinate Reference System:
#>   User input: WGS 84 
#>   wkt:
#> GEOGCRS["WGS 84",
#>     DATUM["World Geodetic System 1984",
#>         ELLIPSOID["WGS 84",6378137,298.257223563,
#>             LENGTHUNIT["metre",1]]],
#>     PRIMEM["Greenwich",0,
#>         ANGLEUNIT["degree",0.0174532925199433]],
#>     CS[ellipsoidal,2],
#>         AXIS["geodetic latitude (Lat)",north,
#>             ORDER[1],
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         AXIS["geodetic longitude (Lon)",east,
#>             ORDER[2],
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>     USAGE[
#>         SCOPE["unknown"],
#>         AREA["World"],
#>         BBOX[-90,-180,90,180]],
#>     ID["EPSG",4326]]
```

The CRS in sf objects is a list of two elements -- `input` and `wkt`.
The `input` element is quite flexible: depending on the input file or user input, it can contain an SRID representation (e.g., "EPSG:4326"), the CRS's name (e.g., "WGS84"), or even a proj4string definition.
The `wkt` element stores the WKT2 representation, which is used when saving the object to a file or performing coordinate operations.
Above, we can see that the `new_vector` object has the WGS84 ellipsoid, uses the Greenwich prime meridian, and has latitude and longitude axis order.
In this case, we also have some additional elements, such as USAGE, explaining the area suitable for use of this CRS, and ID, pointing to the CRS's SRID -- "EPSG:4326".

The `st_crs` function also has one helpful feature -- we can retrieve some additional information about the used CRS.
For example, try to run:

- `st_crs(new_vector)$IsGeographic` to check if the CRS is geographic or not
- `st_crs(new_vector)$units_gdal` to find out the CRS units
- `st_crs(new_vector)$srid` to extract its SRID (when available)
- `st_crs(new_vector)$proj4string` to extract the proj4string representation

In cases when a coordinate reference system (CRS) is missing or the wrong CRS is set, the `st_set_crs()` function can be used:

```r
new_vector = st_set_crs(new_vector, "EPSG:4326") # set CRS
```

The second argument of this function is either an SRID ("EPSG:4326" in the example above), a complete WKT2 representation, a proj4string, or a CRS extracted from an existing object with `st_crs()`.

The `crs()` function can be used to access CRS information from a `SpatRaster` object:

```r
crs(my_rast) # get CRS
#> [1] "GEOGCRS[\"WGS 84\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n CS[ellipsoidal,2],\n AXIS[\"geodetic latitude (Lat)\",north,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n AXIS[\"geodetic longitude (Lon)\",east,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n ID[\"EPSG\",4326]]"
```

The output is the WKT2 representation of the CRS.
The same function, `crs()`, can also be used to set a CRS for raster objects:

```r
crs(my_rast) = "EPSG:26912" # set CRS
```

Here, we can use either an SRID, a complete WKT2 representation, a proj4string, or a CRS extracted from another existing object with `crs()`.

Importantly, the `st_crs()` and `crs()` functions do not alter coordinate values or geometries.
Their role is only to set the metadata information about the object's CRS.
We will expand on CRSs and explain how to project from one CRS to another in Chapter 6.

## 2.5 Units

An important feature of CRSs is that they contain information about spatial units.
Clearly, it is vital to know whether a house's measurements are in feet or meters, and the same applies to maps.
It is good cartographic practice to add a scale bar or some other distance indicator onto maps to demonstrate the relationship between distances on the page or screen and distances on the ground.
Likewise, it is important to formally specify the units in which the geometry data or cells are measured to provide context, and to ensure that subsequent calculations are done in context.

A novel feature of geometry data in sf objects is that they have native support for units.
This means that distance, area and other geometric calculations in **sf** return values that come with a units attribute, defined by the **units** package (E. Pebesma, Mailund, and Hiebert 2016).
This is advantageous, preventing confusion caused by different units (most CRSs use meters, some use feet) and providing information on dimensionality.
This is demonstrated in the code chunk below, which calculates the area of Luxembourg:

```r
luxembourg = world[world$name_long == "Luxembourg", ]
st_area(luxembourg) # requires the s2 package in recent versions of sf
#> 2.41e+09 [m^2]
```

The output is in units of square meters (m²), showing that the result represents two-dimensional space.
This information, stored as an attribute (which interested readers can discover with `attributes(st_area(luxembourg))`), can feed into subsequent calculations that use units, such as population density (measured in people per unit area, typically per km²).
Reporting units prevents confusion: to take the Luxembourg example, if the units remained unspecified, one could incorrectly assume that the units were hectares.
To translate the huge number into a more digestible size, it is tempting to divide the result by a million (the number of square meters in a square kilometer):

```r
st_area(luxembourg) / 1000000
#> 2409 [m^2]
```

However, the result is then incorrectly given again as square meters.
The solution is to set the correct units with the **units** package:

```r
units::set_units(st_area(luxembourg), km^2)
#> 2409 [km^2]
```

Units are of equal importance in the case of raster data.
However, so far **sf** is the only spatial package that supports units, meaning that people working with raster data should approach changes in the units of analysis (for example, converting pixel widths from imperial to decimal units) with care.
The `my_rast` object (see above) uses a WGS84 projection with decimal degrees as units.
Consequently, its resolution is also given in decimal degrees, but you have to know it, since the `res()` function simply returns a numeric vector:

```r
res(my_rast)
#> [1] 0.000833 0.000833
```

If we used a UTM projection, the units would change:

```r
repr = project(my_rast, "EPSG:26912")
res(repr)
#> [1] 83.5 83.5
```

Again, the `res()` command gives back a numeric vector without any unit, forcing us to know that the unit of the UTM projection is meters.

## 2.6 Exercises
E1. Use `summary()` on the geometry column of the `world` data object. What does the output tell us about:

- its geometry type?
- the number of countries?
- its coordinate reference system (CRS)?

E2. Run the code that 'generated' the map of the world in Section 2.2.4 (Base plot arguments).
Find two similarities and two differences between the image on your computer and that in the book.

- What does the `cex` argument do (see `?plot`)?
- Why was `cex` set to `sqrt(world$pop) / 10000`?
- Bonus: experiment with different ways to visualize the global population.

E3. Use `plot()` to create maps of Nigeria in context (see Section 2.2.4 on Base plot arguments).

- Adjust the `lwd`, `col` and `expandBB` arguments of `plot()`.
- Challenge: read the documentation of `text()` and annotate the map.

E4. Create an empty `SpatRaster` object called `my_raster` with 10 columns and 10 rows.
Assign random values between 0 and 10 to the new raster and plot it.

E5. Read in the `raster/nlcd.tif` file from the **spDataLarge** package.
What kind of information can you get about the properties of this file?

E6. Check the CRS of the `raster/nlcd.tif` file from the **spDataLarge** package.
What kind of information can you learn from it?

# 3 Attribute data operations

## Prerequisites

This chapter requires the following packages to be installed and attached, and relies on **spData**, which loads the datasets used in the code examples:

```r
library(sf)     # vector data package introduced in Chapter 2
library(terra)  # raster data package introduced in Chapter 2
library(dplyr)  # tidyverse package for data frame manipulation
library(spData) # spatial data package introduced in Chapter 2
#> Warning: multiple methods tables found for 'approxNA'
```

## 3.1 Introduction

Attribute data is non-spatial information associated with geographic (geometry) data.
A bus stop provides a simple example: its position would typically be represented by latitude and longitude coordinates (geometry data), in addition to its name.
The Elephant & Castle / New Kent Road stop in London, for example, has coordinates of -0.098 degrees longitude and 51.495
class, extends base R’s data.frame.\nLike data frames, sf objects one column per attribute variable (‘name’) one row per observation feature (e.g., per bus station).\nsf objects differ basic data frames geometry column class sfc can contain range geographic entities (single ‘multi’ point, line, polygon features) per row.\ndescribed Chapter 2, demonstrated generic methods plot() summary() work sf objects.\nsf also provides generics allow sf objects behave like regular data frames, shown printing class’s methods:Many (aggregate(), cbind(), merge(), rbind() [) manipulating data frames.\nrbind(), example, binds rows two data frames together, one ‘top’ .\n$<- creates new columns.\nkey feature sf objects store spatial non-spatial data way, columns data.frame.geometry column sf objects typically called geometry geom name can used.\nfollowing command, example, creates geometry column named g:st_sf(data.frame(n = world$name_long), g = world$geom)sf objects can also extend tidyverse classes data frames, tibble tbl.\n.\nThus sf enables full power R’s data analysis capabilities unleashed geographic data, whether use base R tidyverse functions data analysis.\n\n(See Rdatatable/data.table#2273 discussion compatibility sf objects fast data.table package.)\nusing capabilities worth re-capping discover basic properties vector data objects.\nLet’s start using base R functions learn world dataset spData package:world contains ten non-geographic columns (one geometry list column) almost 200 rows representing world’s countries.\nfunction st_drop_geometry() keeps attributes data sf object, words removing geometry:Dropping geometry column working attribute data can useful; data manipulation processes can run faster work attribute data geometry columns always needed.\ncases, however, makes sense keep geometry column, explaining column ‘sticky’ (remains attribute operations unless specifically dropped).\nNon-spatial data operations sf objects change object’s geometry appropriate (e.g., 
dissolving borders adjacent polygons following aggregation).\nBecoming skilled geographic attribute data manipulation means becoming skilled manipulating data frames.many applications, tidyverse package dplyr offers effective approach working data frames.\nTidyverse compatibility advantage sf predecessor sp, pitfalls avoid (see supplementary tidyverse-pitfalls vignette geocompr.github.io details).","code":"\nmethods(class = \"sf\") # methods for sf objects, first 12 shown\n#> [1] aggregate cbind coerce \n#> [4] initialize merge plot \n#> [7] print rbind [ \n#> [10] [[<- $<- show \nclass(world) # it's an sf object and a (tidy) data frame\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\ndim(world) # it is a 2 dimensional object, with 177 rows and 11 columns\n#> [1] 177 11\nworld_df = st_drop_geometry(world)\nclass(world_df)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\"\nncol(world_df)\n#> [1] 10"},{"path":"attr.html","id":"vector-attribute-subsetting","chapter":"3 Attribute data operations","heading":"3.2.1 Vector attribute subsetting","text":"Base R subsetting methods include operator [ function subset().\nkey dplyr subsetting functions filter() slice() subsetting rows, select() subsetting columns.\napproaches preserve spatial components attribute data sf objects, using operator $ dplyr function pull() return single attribute column vector lose attribute data, see.\n\nsection focuses subsetting sf data frames; details subsetting vectors non-geographic data frames recommend reading section section 2.7 Introduction R (R Core Team, Smith, Team 2021) Chapter 4 Advanced R Programming (Wickham 2019), respectively.[ operator can subset rows columns.\nIndices placed inside square brackets placed directly data frame object name specify elements keep.\ncommand object[, j] means ’return rows represented columns represented j, j typically contain integers TRUEs FALSEs (indices can also character strings, indicating row column names).\nobject[5, 1:3], example, means ’return data 
containing 5th row columns 1 3: result data frame 1 row 3 columns, fourth geometry column ’s sf object.\nLeaving j empty returns rows columns, world[1:5, ] returns first five rows 11 columns.\nexamples demonstrate subsetting base R.\nGuess number rows columns sf data frames returned command check results computer (see end chapter exercises):demonstration utility using logical vectors subsetting shown code chunk .\ncreates new object, small_countries, containing nations whose surface area smaller 10,000 km2:intermediary i_small (short index representing small countries) logical vector can used subset seven smallest countries world surface area.\nconcise command, omits intermediary object, generates result:base R function subset() provides another way achieve result:Base R functions mature, stable widely used, making rock solid choice, especially contexts reproducibility reliability key.\ndplyr functions enable ‘tidy’ workflows people (authors book included) find intuitive productive interactive data analysis, especially combined code editors RStudio enable auto-completion column names.\nKey functions subsetting data frames (including sf data frames) dplyr functions demonstrated .\n\n\n\nselect() selects columns name position.\nexample, select two columns, name_long pop, following command:Note: equivalent command base R (world[, c(\"name_long\", \"pop\")]), sticky geom column remains.\nselect() also allows selecting range columns help : operator:can remove specific columns - operator:Subset rename columns time new_name = old_name syntax:worth noting command concise base R equivalent, requires two lines code:select() also works ‘helper functions’ advanced subsetting operations, including contains(), starts_with() num_range() (see help page ?select details).dplyr verbs return data frame, can extract single column vector pull().\n\n\n\ncan get result base R list subsetting operators $ [[, three following commands return numeric vector:slice() row-equivalent 
select().\nfollowing code chunk, example, selects rows 1 6:filter() dplyr’s equivalent base R’s subset() function.\nkeeps rows matching given criteria, e.g., countries area certain threshold, high average life expectancy, shown following examples:standard set comparison operators can used filter() function, illustrated Table 3.1:TABLE 3.1: Comparison operators return Booleans (TRUE/FALSE).","code":"\nworld[1:6, ] # subset rows by position\nworld[, 1:3] # subset columns by position\nworld[1:6, 1:3] # subset rows and columns by position\nworld[, c(\"name_long\", \"pop\")] # columns by name\nworld[, c(T, T, F, F, F, F, F, T, T, F, F)] # by logical indices\nworld[, 888] # an index representing a non-existent column\ni_small = world$area_km2 < 10000\nsummary(i_small) # a logical vector\n#> Mode FALSE TRUE \n#> logical 170 7\nsmall_countries = world[i_small, ]\nsmall_countries = world[world$area_km2 < 10000, ]\nsmall_countries = subset(world, area_km2 < 10000)\nworld1 = dplyr::select(world, name_long, pop)\nnames(world1)\n#> [1] \"name_long\" \"pop\" \"geom\"\n# all columns between name_long and pop (inclusive)\nworld2 = dplyr::select(world, name_long:pop)\n# all columns except subregion and area_km2 (inclusive)\nworld3 = dplyr::select(world, -subregion, -area_km2)\nworld4 = dplyr::select(world, name_long, population = pop)\nworld5 = world[, c(\"name_long\", \"pop\")] # subset columns by name\nnames(world5)[names(world5) == \"pop\"] = \"population\" # rename column manually\npull(world, pop)\nworld$pop\nworld[[\"pop\"]]\nslice(world, 1:6)\nworld7 = filter(world ,area_km2 < 10000) # countries with a small area\nworld7 = filter(world, lifeExp > 82) # with high life expectancy"},{"path":"attr.html","id":"chaining-commands-with-pipes","chapter":"3 Attribute data operations","heading":"3.2.2 Chaining commands with pipes","text":"Key workflows using dplyr functions ‘pipe’ operator %>% (since R 4.1.0 native pipe |>), takes name Unix pipe | (Grolemund Wickham 2016).\nPipes 
enable expressive code: output previous function becomes first argument next function, enabling chaining.\nillustrated , countries Asia filtered world dataset, next object subset columns (name_long continent) first five rows (result shown).chunk shows pipe operator allows commands written clear order:\nrun top bottom (line--line) left right.\nalternative %>% nested function calls, harder read:","code":"\nworld7 = world %>%\n filter(continent == \"Asia\") %>%\n dplyr::select(name_long, continent) %>%\n slice(1:5)\nworld8 = slice(\n dplyr::select(\n filter(world, continent == \"Asia\"),\n name_long, continent),\n 1:5)"},{"path":"attr.html","id":"vector-attribute-aggregation","chapter":"3 Attribute data operations","heading":"3.2.3 Vector attribute aggregation","text":"\n\nAggregation involves summarizing data one ‘grouping variables,’ typically columns data frame aggregated (geographic aggregation covered next chapter).\nexample attribute aggregation calculating number people per continent based country-level data (one row per country).\nworld dataset contains necessary ingredients: columns pop continent, population grouping variable, respectively.\naim find sum() country populations continent, resulting smaller data frame (aggregation form data reduction can useful early step working large datasets).\ncan done base R function aggregate() follows:result non-spatial data frame six rows, one per continent, two columns reporting name population continent (see Table 3.2 results top 3 populous continents).aggregate() generic function means behaves differently depending inputs.\nsf provides method aggregate.sf() activated automatically x sf object argument provided:resulting world_agg2 object spatial object containing 8 features representing continents world (open ocean).\ngroup_by() %>% summarize() dplyr equivalent aggregate(), variable name provided group_by() function specifying grouping variable information summarized passed summarize() function, shown :approach may 
seem complex benefits: flexibility, readability, control new column names.\nflexibility illustrated command , calculates population also area number countries continent:previous code chunk pop, area (sqkm) n column names result, sum() n() aggregating functions.\naggregating functions return sf objects rows representing continents geometries containing multiple polygons representing land mass associated islands (works thanks geometric operation ‘union,’ explained Section 5.2.6).Let’s combine learned far dplyr functions, chaining multiple commands summarize attribute data countries worldwide continent.\nfollowing command calculates population density (mutate()), arranges continents number countries contain (dplyr::arrange()), keeps 3 populous continents (top_n()), result presented Table 3.2):TABLE 3.2: top 3 populous continents ordered population density (people per square km).","code":"\nworld_agg1 = aggregate(pop ~ continent, FUN = sum, data = world, na.rm = TRUE)\nclass(world_agg1)\n#> [1] \"data.frame\"\nworld_agg2 = aggregate(world[\"pop\"], list(world$continent), FUN = sum, na.rm = TRUE)\nclass(world_agg2)\n#> [1] \"sf\" \"data.frame\"\nnrow(world_agg2)\n#> [1] 8\nworld_agg3 = world %>%\n group_by(continent) %>%\n summarize(pop = sum(pop, na.rm = TRUE))\nworld_agg4 = world %>% \n group_by(continent) %>%\n summarize(pop = sum(pop, na.rm = TRUE), `area (sqkm)` = sum(area_km2), n = n())\nworld_agg5 = world %>% \n st_drop_geometry() %>% # drop the geometry for speed\n dplyr::select(pop, continent, area_km2) %>% # subset the columns of interest \n group_by(continent) %>% # group by continent and summarize:\n summarize(Pop = sum(pop, na.rm = TRUE), Area = sum(area_km2), N = n()) %>%\n mutate(Density = round(Pop / Area)) %>% # calculate population density\n top_n(n = 3, wt = Pop) %>% # keep only the top 3\n arrange(desc(N)) # arrange in order of n. 
countries"},{"path":"attr.html","id":"vector-attribute-joining","chapter":"3 Attribute data operations","heading":"3.2.4 Vector attribute joining","text":"Combining data different sources common task data preparation.\nJoins combining tables based shared ‘key’ variable.\ndplyr multiple join functions including left_join() inner_join() — see vignette(\"two-table\") full list.\nfunction names follow conventions used database language SQL (Grolemund Wickham 2016, chap. 13); using join non-spatial datasets sf objects focus section.\ndplyr join functions work data frames sf objects, important difference geometry list column.\nresult data joins can either sf data.frame object.\ncommon type attribute join spatial data takes sf object first argument adds columns data.frame specified second argument.\n\ndemonstrate joins, combine data coffee production world dataset.\ncoffee data data frame called coffee_data spData package (see ?coffee_data details).\n3 columns:\nname_long names major coffee-producing nations coffee_production_2016 coffee_production_2017 contain estimated values coffee production units 60-kg bags year.\n‘left join,’ preserves first dataset, merges world coffee_data:input datasets share ‘key variable’ (name_long) join worked without using argument (see ?left_join details).\nresult sf object identical original world object two new variables (column indices 11 12) coffee production.\ncan plotted map, illustrated Figure 3.1, generated plot() function :\nFIGURE 3.1: World coffee production (thousand 60-kg bags) country, 2017. 
Source: International Coffee Organization.\njoining work, ‘key variable’ must supplied datasets.\ndefault dplyr uses variables matching names.\ncase, world_coffee world objects contained variable called name_long, explaining message Joining, = \"name_long\".\nmajority cases variable names , two options:Rename key variable one objects match.Use argument specify joining variables.latter approach demonstrated renamed version coffee_data:Note name original object kept, meaning world_coffee new object world_coffee2 identical.\nAnother feature result number rows original dataset.\nAlthough 47 rows data coffee_data, 177 country records kept intact world_coffee world_coffee2:\nrows original dataset match assigned NA values new coffee production variables.\nwant keep countries match key variable?\ncase inner join can used:Note result inner_join() 45 rows compared 47 coffee_data.\nhappened remaining rows?\ncan identify rows match using setdiff() function follows:result shows Others accounts one row present world dataset name Democratic Republic Congo accounts :\nabbreviated, causing join miss .\nfollowing command uses string matching (regex) function stringr package confirm Congo, Dem. Rep. 
:fix issue, create new version coffee_data update name.\ninner_join()ing updated data frame returns result 46 coffee-producing nations:also possible join direction: starting non-spatial dataset adding variables simple features object.\ndemonstrated , starts coffee_data object adds variables original world dataset.\ncontrast previous joins, result another simple feature object, data frame form tidyverse tibble:\noutput join tends match first argument:section covers majority joining use cases.\ninformation, recommend Grolemund Wickham (2016), join vignette geocompkg package accompanies book, documentation data.table package.21\nAnother type join spatial join, covered next chapter (Section 4.2.3).","code":"\nworld_coffee = left_join(world, coffee_data)\n#> Joining, by = \"name_long\"\nclass(world_coffee)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\nnames(world_coffee)\n#> [1] \"iso_a2\" \"name_long\" \"continent\" \n#> [4] \"region_un\" \"subregion\" \"type\" \n#> [7] \"area_km2\" \"pop\" \"lifeExp\" \n#> [10] \"gdpPercap\" \"geom\" \"coffee_production_2016\"\n#> [13] \"coffee_production_2017\"\nplot(world_coffee[\"coffee_production_2017\"])\ncoffee_renamed = rename(coffee_data, nm = name_long)\nworld_coffee2 = left_join(world, coffee_renamed, by = c(name_long = \"nm\"))\nworld_coffee_inner = inner_join(world, coffee_data)\n#> Joining, by = \"name_long\"\nnrow(world_coffee_inner)\n#> [1] 45\nsetdiff(coffee_data$name_long, world$name_long)\n#> [1] \"Congo, Dem. Rep. 
of\" \"Others\"\n(drc = stringr::str_subset(world$name_long, \"Dem*.+Congo\"))\n#> [1] \"Democratic Republic of the Congo\"\ncoffee_data$name_long[grepl(\"Congo,\", coffee_data$name_long)] = drc\nworld_coffee_match = inner_join(world, coffee_data)\n#> Joining, by = \"name_long\"\nnrow(world_coffee_match)\n#> [1] 46\ncoffee_world = left_join(coffee_data, world)\n#> Joining, by = \"name_long\"\nclass(coffee_world)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"attr.html","id":"vec-attr-creation","chapter":"3 Attribute data operations","heading":"3.2.5 Creating attributes and removing spatial information","text":"Often, like create new column based already existing columns.\nexample, want calculate population density country.\nneed divide population column, pop, area column, area_km2 unit area square kilometers.\nUsing base R, can type:Alternatively, can use one dplyr functions - mutate() transmute().\nmutate() adds new columns penultimate position sf object (last one reserved geometry):difference mutate() transmute() latter drops existing columns (except sticky geometry column):unite() tidyr package (provides many useful functions reshaping datasets, including pivot_longer()) pastes together existing columns.\nexample, want combine continent region_un columns new column named con_reg.\nAdditionally, can define separator (: colon :) defines values input columns joined, original columns removed (: TRUE):separate() function opposite unite(): splits one column multiple columns using either regular expression character positions.\nfunction also comes tidyr package.dplyr function rename() base R function setNames() useful renaming columns.\nfirst replaces old name new one.\nfollowing command, example, renames lengthy name_long column simply name:setNames() changes column names , requires character vector name matching column.\nillustrated , outputs world object, short names:important note attribute data operations preserve geometry simple features.\nmentioned outset 
chapter, can useful remove geometry.\n, explicitly remove .\nHence, approach select(world, -geom) unsuccessful instead use st_drop_geometry().22","code":"\nworld_new = world # do not overwrite our original data\nworld_new$pop_dens = world_new$pop / world_new$area_km2\nworld %>% \n mutate(pop_dens = pop / area_km2)\nworld %>% \n transmute(pop_dens = pop / area_km2)\nworld_unite = world %>%\n unite(\"con_reg\", continent:region_un, sep = \":\", remove = TRUE)\nworld_separate = world_unite %>% \n separate(con_reg, c(\"continent\", \"region_un\"), sep = \":\")\nworld %>% \n rename(name = name_long)\nnew_names = c(\"i\", \"n\", \"c\", \"r\", \"s\", \"t\", \"a\", \"p\", \"l\", \"gP\", \"geom\")\nworld %>% \n setNames(new_names)\nworld_data = world %>% st_drop_geometry()\nclass(world_data)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"attr.html","id":"manipulating-raster-objects","chapter":"3 Attribute data operations","heading":"3.3 Manipulating raster objects","text":"contrast vector data model underlying simple features (represents points, lines polygons discrete entities space), raster data represent continuous surfaces.\nsection shows raster objects work creating scratch, building Section 2.3.2.\nunique structure, subsetting operations raster datasets work different way, demonstrated Section 3.3.1.\nfollowing code recreates raster dataset used Section 2.3.4, result illustrated Figure 3.2.\ndemonstrates rast() function works create example raster named elev (representing elevations).result raster object 6 rows 6 columns (specified nrow ncol arguments), minimum maximum spatial extent x y direction (xmin, xmax, ymin, ymax).\nvals argument sets values cell contains: numeric data ranging 1 36 case.\nRaster objects can also contain categorical values class logical factor variables R.\nfollowing code creates raster representing grain sizes (Figure 3.2):raster object stores corresponding look-table “Raster Attribute Table” (RAT) list data frames, can viewed 
cats(grain) (see ?cats() information).\nelement list layer raster.\nalso possible use function levels() retrieving adding new replacing existing factor levels:\nFIGURE 3.2: Raster datasets numeric (left) categorical values (right).\n","code":"\nelev = rast(nrows = 6, ncols = 6, resolution = 0.5, \n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = 1:36)\ngrain_order = c(\"clay\", \"silt\", \"sand\")\ngrain_char = sample(grain_order, 36, replace = TRUE)\ngrain_fact = factor(grain_char, levels = grain_order)\ngrain = rast(nrows = 6, ncols = 6, resolution = 0.5, \n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = grain_fact)\nlevels(grain)[[1]] = c(levels(grain)[[1]], wetness = c(\"wet\", \"moist\", \"dry\"))\nlevels(grain)\n#> [[1]]\n#> [1] \"clay\" \"silt\" \"sand\" \"wet\" \"moist\" \"dry\""},{"path":"attr.html","id":"raster-subsetting","chapter":"3 Attribute data operations","heading":"3.3.1 Raster subsetting","text":"Raster subsetting done base R operator [, accepts variety inputs:\nRow-column indexingCell IDsCoordinates (see Section 4.3.1)Another spatial object (see Section 4.3.1), show first two options since can considered non-spatial operations.\nneed spatial object subset another output spatial object, refer spatial subsetting.\nTherefore, latter two options shown next chapter (see Section 4.3.1).first two subsetting options demonstrated commands —\nreturn value top left pixel raster object elev (results shown):Subsetting multi-layered raster objects return cell value(s) layer.\nexample, c(elev, grain)[1] returns data frame one row two columns — one layer.\nextract values complete rows, can also use values().Cell values can modified overwriting existing values conjunction subsetting operation.\nfollowing code chunk, example, sets upper left cell elev 0 (results shown):Leaving square brackets empty shortcut version values() retrieving values raster.\nMultiple cells can also modified way:Replacing values multilayered rasters can done matrix 
with as many columns as layers and rows as replaceable cells (results not shown):","code":"\n# row 1, column 1\nelev[1, 1]\n# cell ID 1\nelev[1]\nelev[1, 1] = 0\nelev[]\nelev[1, c(1, 2)] = 0\ntwo_layers = c(grain, elev) \ntwo_layers[1] = cbind(c(0), c(4))\ntwo_layers[]"},{"path":"attr.html","id":"summarizing-raster-objects","chapter":"3 Attribute data operations","heading":"3.3.2 Summarizing raster objects","text":"terra contains functions for extracting descriptive statistics for entire rasters.\nPrinting a raster object to the console, or typing its name, returns the minimum and maximum values of a raster.\nsummary() provides common descriptive statistics – minimum, maximum, quartiles and the number of NAs for continuous rasters, and the number of cells of each class for categorical rasters.\nFurther summary operations such as the standard deviation (see below) or custom summary statistics can be calculated with global().\nAdditionally, the freq() function returns the frequency table of categorical values.Raster value statistics can be visualized in a variety of ways.\nSpecific functions such as boxplot(), density(), hist() and pairs() also work with raster objects, as demonstrated in the histogram created with the command below (not shown):In case the desired visualization function does not work with raster objects, one can extract the raster data to be plotted with the help of values() (Section 3.3.1).\nDescriptive raster statistics belong to the so-called global raster operations.\nThese and other typical raster processing operations are part of the map algebra scheme, which is covered in the next chapter (Section 4.3.2).\nSome function names clash between packages (e.g., a function with the name extract() exists in both the terra and tidyr packages). In addition to referring to functions verbosely (e.g., tidyr::extract()), another way to prevent function name clashes is to unload the offending package with detach(). The following command, for example, unloads the terra package (this can also be done in the package tab which resides by default in the right-bottom pane in RStudio): detach(\"package:terra\", unload = TRUE, force = TRUE). The force argument makes sure that the package will be detached even if other packages depend on it.
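The correspondence between cell IDs, row/column indices and geographic coordinates (described in Section 3.1 for the pixel in the 3rd row and 4th column) can be sketched with terra's helper functions. A minimal sketch, assuming the 6 x 6 `elev` raster defined earlier in the chapter:

```r
library(terra)

# The 6x6 example raster from earlier in the chapter
elev = rast(nrows = 6, ncols = 6,
            xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,
            vals = 1:36)

# Cell IDs run left-to-right, top-to-bottom, so row 3 / column 4 is
# cell (3 - 1) * 6 + 4 = 16
cellFromRowCol(elev, row = 3, col = 4)

# The header (origin plus resolution) converts that index to the
# coordinates of the cell centre, and back again
xyFromCell(elev, 16)
rowColFromCell(elev, 16)
```

This is why the header is a vital component of raster datasets: without the origin and resolution stored there, a cell index such as 16 has no geographic meaning.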
, however, may lead restricted usability packages depending detached package, therefore recommended.\n","code":"\nglobal(elev, sd)\nhist(elev)"},{"path":"attr.html","id":"exercises-1","chapter":"3 Attribute data operations","heading":"3.4 Exercises","text":"exercises use us_states us_states_df datasets spData package.\nmust attached package, packages used attribute operations chapter (sf, dplyr, terra) commands library(spData) attempting exercises:us_states spatial object (class sf), containing geometry attributes (including name, region, area, population) states within contiguous United States.\nus_states_df data frame (class data.frame) containing name additional variables (including median income poverty level, years 2010 2015) US states, including Alaska, Hawaii Puerto Rico.\ndata comes United States Census Bureau, documented ?us_states ?us_states_df.E1. Create new object called us_states_name contains NAME column us_states object using either base R ([) tidyverse (select()) syntax.\nclass new object makes geographic?E2. Select columns us_states object contain population data.\nObtain result using different command (bonus: try find three ways obtaining result).\nHint: try use helper functions, contains starts_with dplyr (see ?contains).E3. Find states following characteristics (bonus find plot ):Belong Midwest region.Belong West region, area 250,000 km2and 2015 population greater 5,000,000 residents (hint: may need use function units::set_units() .numeric()).Belong South region, area larger 150,000 km2 total population 2015 larger 7,000,000 residents.E4. total population 2015 us_states dataset?\nminimum maximum total population 2015?E5. many states region?E6. minimum maximum total population 2015 region?\ntotal population 2015 region?E7. Add variables us_states_df us_states, create new object called us_states_stats.\nfunction use ?\nvariable key datasets?\nclass new object?E8. us_states_df two rows us_states.\ncan find ? 
(hint: try use dplyr::anti_join() function)E9. population density 2015 state?\npopulation density 2010 state?E10. much population density changed 2010 2015 state?\nCalculate change percentages map .E11. Change columns’ names us_states lowercase. (Hint: helper functions - tolower() colnames() may help.)E12. Using us_states us_states_df create new object called us_states_sel.\nnew object two variables - median_income_15 geometry.\nChange name median_income_15 column Income.E13. Calculate change number residents living poverty level 2010 2015 state. (Hint: See ?us_states_df documentation poverty level columns.)\nBonus: Calculate change percentage residents living poverty level state.E14. minimum, average maximum state’s number people living poverty line 2015 region?\nBonus: region largest increase people living poverty line?E15. Create raster scratch nine rows columns resolution 0.5 decimal degrees (WGS84).\nFill random numbers.\nExtract values four corner cells.E16. common class example raster grain (hint: modal)?E17. 
Plot histogram boxplot dem.tif file spDataLarge package (system.file(\"raster/dem.tif\", package = \"spDataLarge\")).","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(terra)\nlibrary(spData)\ndata(us_states)\ndata(us_states_df)"},{"path":"spatial-operations.html","id":"spatial-operations","chapter":"4 Spatial data operations","heading":"4 Spatial data operations","text":"","code":""},{"path":"spatial-operations.html","id":"prerequisites-2","chapter":"4 Spatial data operations","heading":"Prerequisites","text":"chapter requires packages used Chapter 3:also need read couple datasets follows Section 4.3","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\ngrain = rast(system.file(\"raster/grain.tif\", package = \"spData\"))"},{"path":"spatial-operations.html","id":"introduction-1","chapter":"4 Spatial data operations","heading":"4.1 Introduction","text":"Spatial operations, including spatial joins vector datasets local focal operations raster datasets, vital part geocomputation.\nchapter shows spatial objects can modified multitude ways based location shape.\nMany spatial operations non-spatial (attribute) equivalent, concepts subsetting joining datasets demonstrated previous chapter applicable .\nespecially true vector operations: Section 3.2 vector attribute manipulation provides basis understanding spatial counterpart, namely spatial subsetting (covered Section 4.2.1).\nSpatial joining (Section 4.2.3) aggregation (Section 4.2.5) also non-spatial counterparts, covered previous chapter.Spatial operations differ non-spatial operations number ways, however:\nSpatial joins, example, can done number ways — including matching entities intersect within certain distance target dataset — attribution joins discussed Section 3.2.4 previous chapter can done one way (except using fuzzy joins, described documentation fuzzyjoin 
package).\ntype spatial relationship objects must considered undertaking spatial operations, described Section 4.2.2, topological relations vector features.\n.\nAnother unique aspect spatial objects distance: spatial objects related space, distance calculations can used explore strength relationship, described context vector data Section 4.2.7.Spatial operations raster objects include subsetting — covered Section 4.3.1 — merging several raster ‘tiles’ single object, demonstrated Section 4.3.8.\nMap algebra covers range operations modify raster cell values, without reference surrounding cell values, vital many applications.\nconcept map algebra introduced Section 4.3.2; local, focal zonal map algebra operations covered sections 4.3.3, 4.3.4, 4.3.5, respectively. Global map algebra operations, generate summary statistics representing entire raster dataset, distance calculations rasters, discussed Section 4.3.6.\nfinal section exercises (??) process merging two raster datasets discussed demonstrated reference reproducible example.","code":""},{"path":"spatial-operations.html","id":"spatial-vec","chapter":"4 Spatial data operations","heading":"4.2 Spatial operations on vector data","text":"section provides overview spatial operations vector geographic data represented simple features sf package.\nSection 4.3 presents spatial operations raster datasets using classes functions terra package.","code":""},{"path":"spatial-operations.html","id":"spatial-subsetting","chapter":"4 Spatial data operations","heading":"4.2.1 Spatial subsetting","text":"Spatial subsetting process taking spatial object returning new object containing features relate space another object.\nAnalogous attribute subsetting (covered Section 3.2.1), subsets sf data frames can created square bracket ([) operator using syntax x[y, , op = st_intersects], x sf object subset rows returned, y ‘subsetting object’ , op = st_intersects optional argument specifies topological relation (also known binary predicate) 
is used for the subsetting.\nThe default topological relation used when an op argument is not provided is st_intersects(): the command x[y, ] is identical to x[y, , op = st_intersects] shown above, but not to x[y, , op = st_disjoint] (the meaning of these and other topological relations is described in the next section).\nThe filter() function from the tidyverse can also be used, but this approach is more verbose, as we will see in the examples below.\n\nTo demonstrate spatial subsetting, we will use the nz and nz_height datasets in the spData package, which contain geographic data on the 16 main regions and the 101 highest points of New Zealand, respectively (Figure 4.1), in a projected coordinate system.\nThe following code chunk creates an object representing Canterbury, and then uses spatial subsetting to return all high points in the region:\nFIGURE 4.1: Illustration of spatial subsetting with red triangles representing the 101 high points of New Zealand, clustered near the central Canterbury region (left). The points in Canterbury were created with the [ subsetting operator (highlighted in gray, right).\nLike attribute subsetting, the command x[y, ] (equivalent to nz_height[canterbury, ]) subsets features of a target x using the contents of a source object y.\nInstead of y being a vector of class logical or integer, however, for spatial subsetting both x and y must be geographic objects.\nSpecifically, objects used for spatial subsetting in this way must have the class sf or sfc: both nz and nz_height are geographic vector data frames of class sf, and the result of the operation is another sf object representing the features in the target nz_height object that intersect with (in this case, high points that are located within) the canterbury region.Various topological relations can be used for spatial subsetting, which determine the type of spatial relationship that features in the target object must have with the subsetting object to be selected.\nThese include touches, crosses or within, as we will see shortly in Section 4.2.2.\nThe default setting, st_intersects, is a ‘catch all’ topological relation that will return features in the target that touch, cross or are within the source ‘subsetting’ object.\nAs indicated above, alternative spatial operators can be specified with the op = argument, as demonstrated in the following command, which returns the opposite of st_intersects(): points that do not intersect with Canterbury (see Section 4.2.2):For many applications, this is all you’ll need to know about spatial subsetting for vector data: it just works.\nIf you are impatient to learn more about topological
relations, beyond st_intersects() st_disjoint(), skip next section (4.2.2).\n’re interested details, including ways subsetting, read .Another way spatial subsetting uses objects returned topological operators.\nobjects can useful right, example exploring graph network relationships contiguous regions, can also used subsetting, demonstrated code chunk :code chunk creates object class sgbp (sparse geometry binary predicate, list length x spatial operation) converts logical vector sel_logical (containing TRUE FALSE values, something can also used dplyr’s filter function).\n\nfunction lengths() identifies features nz_height intersect objects y.\ncase 1 greatest possible value complex operations one use method subset features intersect , example, 2 features source object.result can achieved sf function st_filter() created increase compatibility sf objects dplyr data manipulation code:point, three identical (row names) versions canterbury_height, one created using [ operator, one created via intermediary selection object, another using sf’s convenience function st_filter().\n\n\nnext section explores different types spatial relation, also known binary predicates, can used identify whether two features spatially related .","code":"\ncanterbury = nz %>% filter(Name == \"Canterbury\")\ncanterbury_height = nz_height[canterbury, ]\nnz_height[canterbury, , op = st_disjoint]\nsel_sgbp = st_intersects(x = nz_height, y = canterbury)\nclass(sel_sgbp)\n#> [1] \"sgbp\" \"list\"\nsel_sgbp\n#> Sparse geometry binary predicate list of length 101, where the\n#> predicate was `intersects'\n#> first 10 elements:\n#> 1: (empty)\n#> 2: (empty)\n#> 3: (empty)\n#> 4: (empty)\n#> 5: 1\n#> 6: 1\n#> 7: 1\n#> 8: 1\n#> 9: 1\n#> 10: 1\nsel_logical = lengths(sel_sgbp) > 0\ncanterbury_height2 = nz_height[sel_logical, ]\ncanterbury_height3 = nz_height %>%\n st_filter(y = canterbury, .predicate = st_intersects)"},{"path":"spatial-operations.html","id":"topological-relations","chapter":"4 Spatial data 
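The sgbp-to-logical conversion used above depends only on `lengths()`, so its behavior can be checked without **sf**. A minimal base R sketch with a mock sparse-predicate list (the list values below are made up, not the `nz_height` result):

```r
# Mock of an sgbp-like result: one integer vector per feature of x,
# empty when that feature matches nothing in y
sel_sgbp_mock = list(integer(0), integer(0), 1L, 1L)
sel_logical = lengths(sel_sgbp_mock) > 0
sel_logical
#> [1] FALSE FALSE  TRUE  TRUE
```

Because each list element can hold several indices, a condition such as `lengths(sel_sgbp_mock) > 1` would select features that relate to two or more source features, as noted above.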
### 4.2.2 Topological relations

Topological relations describe the spatial relationships between objects.
"Binary topological relationships", to give them their full name, are logical statements (in that the answer can only be TRUE or FALSE) about the spatial relationships between two objects defined by ordered sets of points (typically forming points, lines and polygons) in two dimensions (Egenhofer and Herring 1990).
That may sound rather abstract and, indeed, the definition and classification of topological relations is based on mathematical foundations first published in book form in 1966 (Spanier 1995).

Despite their mathematical origins, topological relations can be understood intuitively with reference to visualizations of commonly used functions that test for common types of spatial relationships.
Figure 4.2 shows a variety of geometry pairs and their associated relations.
The third and fourth pairs in Figure 4.2 demonstrate that, for some relations, order is important: while the relations *equals*, *intersects*, *crosses*, *touches* and *overlaps* are symmetrical — meaning that if `function(x, y)` is true, `function(y, x)` will also be true — relations in which the order of the geometries matters, such as *contains* and *within*, are not.

FIGURE 4.2: Topological relations between vector geometries, inspired by Figures 1 and 2 in Egenhofer and Herring (1990). The relations for which `function(x, y)` is true are printed for each geometry pair, with `x` represented in pink and `y` represented in blue.

In **sf**, functions testing for different types of topological relations are called 'binary predicates', as described in the vignette *Manipulating Simple Feature Geometries*, which can be viewed with the command `vignette("sf3")`, and in the help page `?geos_binary_pred` (E. Pebesma 2018).
To see how topological relations work in practice, let's create a simple reproducible example, building on the relations illustrated in Figure 4.2 and consolidating knowledge of how vector geometries are represented, covered in the previous chapter (Section 2.2.5).
Note that to create tabular data representing the coordinates (x and y) of the polygon vertices, we use the base R function `read.csv()`, then convert the result into a matrix, a `POLYGON` and finally an `sfc` object.

We create additional geometries to demonstrate spatial relations with the commands below which, when plotted on top of the polygon created above, relate in space to one another, as shown in Figure 4.3.
Note the use of `st_as_sf()` with the argument `coords`, which efficiently converts a data frame containing columns representing coordinates into an `sf` object containing points.

FIGURE 4.3: Points (`point_df` 1 to 3), a line and a polygon object arranged to illustrate topological relations.

A simple query is: which of the points in `point_sf` intersect in some way with the polygon `polygon_sfc`?
The question can be answered by inspection (only the third point lies inside the triangle).
It can also be answered with the spatial predicate *do the objects intersect?*, implemented in **sf** as shown below.
The contents of the result are as expected: the function returns a positive result (`1`) for the third point and negative results (represented by empty vectors) for the first two, which fall outside the polygon's border.
What may be unexpected is that the result comes in the form of a list of vectors.
This *sparse matrix* output only registers a relation if one exists, reducing the memory requirements of topological operations on multi-feature objects.
As we saw in the previous section, a *dense matrix* consisting of TRUE and FALSE values for each combination of features can also be returned with `sparse = FALSE`.
In the output matrix, each row represents a feature in the target object and each column represents a feature in the selecting object.
In this case, only the third feature of `point_sf` intersects with `polygon_sfc`, and there is only one feature in `polygon_sfc`, so the result has one column.
The result can be used for subsetting, as we saw in Section 4.2.1.

Note: `st_intersects()` returns TRUE even in cases where the features just touch: *intersects* is a 'catch-all' topological operation that identifies many types of spatial relation, as illustrated in Figure ??.
The opposite of `st_intersects()` is `st_disjoint()`, which returns only objects that do not spatially relate in any way to the selecting object (note that `[, 1]` converts the result into a vector).
`st_within()` returns TRUE only for objects that are completely within the selecting object; this again applies only to the third point, which is inside the polygon, as illustrated below.
Note that although the third point is *within* the triangle, it does not *touch* any part of its border; for this reason `st_touches()` returns FALSE for all three points.

What about features that do not touch, but almost touch, the selection object?
These can be selected using `st_is_within_distance()`, which has an additional `dist` argument that sets how close target objects need to be before they are selected.
This is illustrated in the code chunk below, whose second-to-last line uses the function `lengths()` to convert the lengthy list output into a logical object.

```r
#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.
polygon_df = read.csv(text = "x, y
0, 0.0
0, 1.0
1, 1.0
1, 0.5
0, 0.0")
polygon_matrix = as.matrix(polygon_df)
polygon = st_polygon(list(polygon_matrix))
polygon_sfc = st_sfc(polygon)
line_df = read.csv(text = "x,y
0.1, 0
1, 0.1")
line = st_linestring(x = as.matrix(line_df))
line_sfc = st_sfc(line)
# create points
point_df = read.csv(text = "x,y
0.1,0
0.7,0.2
0.4,0.8")
point_sf = st_as_sf(point_df, coords = c("x", "y"))
st_intersects(point_sf, polygon_sfc)
#> Sparse geometry binary ..., where the predicate was `intersects'
#>  1: (empty)
#>  2: (empty)
#>  3: 1
st_intersects(point_sf, polygon_sfc, sparse = FALSE)
#>       [,1]
#> [1,] FALSE
#> [2,] FALSE
#> [3,]  TRUE
st_disjoint(point_sf, polygon_sfc, sparse = FALSE)[, 1]
#> [1]  TRUE  TRUE FALSE
st_within(point_sf, polygon_sfc, sparse = FALSE)[, 1]
#> [1] FALSE FALSE  TRUE
st_touches(point_sf, polygon_sfc, sparse = FALSE)[, 1]
#> [1] FALSE FALSE FALSE
# st_is_within_distance() can only return a sparse matrix
sel = st_is_within_distance(point_sf, polygon_sfc, dist = 0.1)
lengths(sel) > 0
#> [1]  TRUE FALSE  TRUE
```
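The intersects test for points and a polygon can be mimicked without **sf** using the classic even-odd ray-casting rule. The sketch below is a simplification (it ignores boundary cases, which `st_intersects()` handles via GEOS) but reproduces the FALSE/FALSE/TRUE pattern for the three points and the polygon created above:

```r
# Even-odd ray casting: a point is inside a polygon if a horizontal ray
# from the point crosses an odd number of polygon edges
point_in_polygon = function(px, py, vx, vy) {
  n = length(vx); inside = FALSE; j = n
  for (i in seq_len(n)) {
    crosses = (vy[i] > py) != (vy[j] > py)
    if (crosses &&
        px < (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i]) {
      inside = !inside
    }
    j = i
  }
  inside
}
vx = c(0, 0, 1, 1); vy = c(0, 1, 1, 0.5)  # vertices of polygon_df above
mapply(point_in_polygon, px = c(0.1, 0.7, 0.4), py = c(0, 0.2, 0.8),
       MoreArgs = list(vx = vx, vy = vy))
#> [1] FALSE FALSE  TRUE
```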
### 4.2.3 Spatial joining

Joining two non-spatial datasets relies on a shared 'key' variable, as described in Section 3.2.4.
Spatial data joining applies the same concept, but instead relies on spatial relations, described in the previous section.
As with attribute data, joining adds new columns to the target object (the argument `x` in joining functions) from a source object (`y`).

The process is illustrated by the following example: imagine you have ten points randomly distributed across the Earth's surface and you ask, for the points that are on land, which countries are they in?
Implementing this idea in a reproducible example will build your geographic data handling skills and show how spatial joins work.
The starting point is to create points that are randomly scattered over the Earth's surface.
The scenario is illustrated in Figure 4.4, which shows that the `random_points` object (top left) lacks attribute data, while the `world` object (top right) has attributes, including the country names shown for a sample of countries in the legend.
Spatial joins are implemented with `st_join()`, as illustrated in the code chunk below; the output is the `random_joined` object illustrated in Figure 4.4 (bottom left).
Before creating the joined dataset, we use spatial subsetting to create `world_random`, which contains only countries that contain random points, to verify that the number of country names returned in the joined dataset should be four (see the top right panel of Figure 4.4).

FIGURE 4.4: Illustration of a spatial join. A new attribute variable is added to the random points (top left) from the source world object (top right), resulting in the data represented in the final panel.

By default, `st_join()` performs a left join, meaning that the result is an object containing all rows from `x`, including rows with no match in `y` (see Section 3.2.4), but it can also do inner joins by setting the argument `left = FALSE`.
Like spatial subsetting, the default topological operator used by `st_join()` is `st_intersects()`, which can be changed by setting the `join` argument (see `?st_join` for details).
The example above demonstrates the addition of a column from a polygon layer to a point layer, but the approach works regardless of geometry types.
In some cases, for example when `x` contains polygons, each of which matches multiple objects in `y`, spatial joins will result in duplicate features, creating a new row for each match in `y`.

```r
set.seed(2018) # set seed for reproducibility
(bb = st_bbox(world)) # the world's bounds
#>   xmin   ymin   xmax   ymax 
#> -180.0  -89.9  180.0   83.6
random_df = tibble(
  x = runif(n = 10, min = bb[1], max = bb[3]),
  y = runif(n = 10, min = bb[2], max = bb[4])
)
random_points = random_df %>% 
  st_as_sf(coords = c("x", "y")) %>% # set coordinates
  st_set_crs("EPSG:4326") # set geographic CRS
world_random = world[random_points, ]
nrow(world_random)
#> [1] 4
random_joined = st_join(random_points, world["name_long"])
```

### 4.2.4 Non-overlapping joins

Sometimes two geographic datasets do not touch but still have a strong geographic relationship.
The datasets `cycle_hire` and `cycle_hire_osm`, already attached in the **spData** package, provide a good example.
Plotting them shows that they are often closely related but they do not touch, as shown in Figure 4.5, a base version of which is created with the first two lines of the code chunk below.
We can check that no points touch with `st_touches()`, as shown in the chunk's third line.

FIGURE 4.5: The spatial distribution of cycle hire points in London based on official data (blue) and OpenStreetMap data (red).

Imagine that we need to join the `capacity` variable in `cycle_hire_osm` onto the official 'target' data contained in `cycle_hire`.
This is when a non-overlapping join is needed.
The simplest method is to use the topological operator `st_is_within_distance()`, demonstrated below using a threshold distance of 20 m (note that this works on both projected and unprojected data).
This shows that there are 438 points in the target object `cycle_hire` within the threshold distance of `cycle_hire_osm`.
How to retrieve the *values* associated with the respective `cycle_hire_osm` points?
The solution is again with `st_join()`, but with an additional `dist` argument (set to 20 m below).
Note that the number of rows in the joined result is greater than in the target: some cycle hire stations in `cycle_hire` have multiple matches in `cycle_hire_osm`.
To aggregate the values of the overlapping points and return the mean, we can use the aggregation methods learned in Chapter 3, resulting in an object with the same number of rows as the target.
The capacity of nearby stations can be verified by comparing a plot of the capacity of the source `cycle_hire_osm` data with the results in the new object (plots not shown).
The result of this join has used a spatial operation to change the attribute data associated with simple features; the geometry associated with each feature has remained unchanged.

```r
plot(st_geometry(cycle_hire), col = "blue")
plot(st_geometry(cycle_hire_osm), add = TRUE, pch = 3, col = "red")
any(st_touches(cycle_hire, cycle_hire_osm, sparse = FALSE))
#> [1] FALSE
sel = st_is_within_distance(cycle_hire, cycle_hire_osm, dist = 20)
summary(lengths(sel) > 0)
#>    Mode   FALSE    TRUE 
#> logical     304     438
z = st_join(cycle_hire, cycle_hire_osm, st_is_within_distance, dist = 20)
nrow(cycle_hire)
#> [1] 742
nrow(z)
#> [1] 762
z = z %>% 
  group_by(id) %>% 
  summarize(capacity = mean(capacity))
nrow(z) == nrow(cycle_hire)
#> [1] TRUE
plot(cycle_hire_osm["capacity"])
plot(z["capacity"])
```
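The within-distance matching that drives this join can be sketched in base R with a Euclidean distance matrix. The coordinates below are invented stand-ins for projected station locations in metres, not the real cycle hire data:

```r
# Target and source stations on a projected plane (made-up coordinates)
a = cbind(x = c(0, 100, 250), y = c(0, 0, 0))   # target stations
b = cbind(x = c(15, 400),     y = c(0, 0))      # source stations
# Pairwise Euclidean distances: rows = targets, columns = sources
d = sqrt(outer(a[, "x"], b[, "x"], "-")^2 +
         outer(a[, "y"], b[, "y"], "-")^2)
rowSums(d <= 20) > 0   # which target stations have a match within 20 m?
#> [1]  TRUE FALSE FALSE
```

As with `st_join()`, a target feature with several sources inside the threshold would yield several matches, which is why the duplicates above are aggregated with `group_by()` and `summarize()`.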
### 4.2.5 Spatial aggregation

Like attribute data aggregation, spatial data aggregation condenses data: aggregated outputs have fewer rows than non-aggregated inputs.
Statistical aggregating functions, such as the mean average or sum, summarise multiple values of a variable and return a single value per grouping variable.
Section 3.2.3 demonstrated how `aggregate()` and `group_by() %>% summarize()` condense data based on attribute variables; this section shows how the same functions work with spatial objects.

Returning to the example of New Zealand, imagine you want to find out the average height of high points in each region: it is the geometry of the source (`y`, or `nz` in this case) that defines how values in the target object (`x`, or `nz_height`) are grouped.
This can be done in a single line of code with base R's `aggregate()` method.
The result is an `sf` object with the same geometry as the (spatial) aggregating object (`nz`), which you can verify with the command `identical(st_geometry(nz), st_geometry(nz_agg))`.
The result of the operation is illustrated in Figure 4.6, which shows the average value of features in `nz_height` within each of New Zealand's 16 regions.
The same result can also be generated by piping the output of `st_join()` into the 'tidy' functions `group_by()` and `summarize()`, as follows.

FIGURE 4.6: Average height of the top 101 high points across the regions of New Zealand.

The resulting `nz_agg` objects have the geometry of the aggregating object `nz` and a new column summarising the values of `x` in each region using the function `mean()` (which could, of course, be replaced by `median()`, `sd()` or other functions that return a single value).
Note: one difference between the `aggregate()` and `group_by() %>% summarize()` approaches is that the former results in NA values for unmatching region names, while the latter preserves region names and is more flexible in terms of aggregating functions and column names in the results.
Aggregating operations can also create new geometries; see Section 5.2.6.

```r
nz_agg = aggregate(x = nz_height, by = nz, FUN = mean)
nz_agg2 = st_join(x = nz, y = nz_height) %>%
  group_by(Name) %>%
  summarize(elevation = mean(elevation, na.rm = TRUE))
```

### 4.2.6 Joining incongruent layers

Spatial congruence is an important concept related to spatial aggregation.
An aggregating object (which we will refer to as `y`) is *congruent* with the target object (`x`) if the two objects have shared borders.
This is often the case for administrative boundary data, whereby larger units — such as Middle Layer Super Output Areas (MSOAs) in the UK, or districts in many European countries — are composed of many smaller units.

*Incongruent* aggregating objects, by contrast, do not share common borders with the target (Qiu, Zhang, and Zhou 2012).
This is problematic for spatial aggregation (and other spatial operations), as illustrated in Figure 4.7: aggregating the centroid of each sub-zone will not return accurate results.
Areal interpolation overcomes this issue by transferring values from one set of areal units to another, using a range of algorithms, from simple area weighted approaches to more sophisticated approaches such as 'pycnophylactic' methods (Tobler 1979).

FIGURE 4.7: Illustration of congruent (left) and incongruent (right) areal units with respect to larger aggregating zones (translucent blue borders).

The **spData** package contains a dataset named `incongruent` (colored polygons with black borders in the right panel of Figure 4.7) and a dataset named `aggregating_zones` (the two polygons with a translucent blue border in the right panel of Figure 4.7).
Let us assume that the `value` column of `incongruent` refers to the total regional income in million Euros.
How can we transfer the values of the underlying nine spatial polygons into the two polygons of `aggregating_zones`?

The simplest useful method is *area weighted* spatial interpolation, which transfers values from the `incongruent` object to a new column in `aggregating_zones` in proportion to the area of overlap: the larger the spatial intersection between input and output features, the larger the corresponding value.
This is implemented in `st_interpolate_aw()`, as demonstrated in the code chunk below.
In this case it is meaningful to sum up the values of the intersections falling into the aggregating zones, since total income is a so-called spatially extensive variable (it increases with area), assuming income is evenly distributed across the smaller zones (hence the warning message below).
This would be different for spatially intensive variables such as average income or percentages, which do not increase as the area increases.
`st_interpolate_aw()` works equally with spatially intensive variables: set the `extensive` parameter to `FALSE` and it will use an average rather than the sum function when doing the aggregation.

```r
iv = incongruent["value"] # keep only the values to be transferred
agg_aw = st_interpolate_aw(iv, aggregating_zones, ext = TRUE)
#> Warning in st_interpolate_aw.sf(iv, aggregating_zones, ext = TRUE):
#> st_interpolate_aw assumes attributes are constant or uniform over areas of x
agg_aw$value
#> [1] 19.6 25.7
```

### 4.2.7 Distance relations

While topological relations are binary — a feature either intersects with another or does not — distance relations are continuous.
The distance between two objects is calculated with the `st_distance()` function.
This is illustrated in the code chunk below, which finds the distance between the highest point in New Zealand and the geographic centroid of the Canterbury region, created in Section 4.2.1.
There are two potentially surprising things about the result: it has units, telling us the distance is 100,000 meters, not 100,000 inches, or any other measure of distance; and it is returned as a matrix, even though the result contains only a single value.
This second feature hints at another useful capability of `st_distance()`: returning distance matrices between all combinations of features in objects `x` and `y`.
This is illustrated in the command below, which finds the distances between the first three features in `nz_height` and the Otago and Canterbury regions of New Zealand, represented by the object `co`.
Note that the distance between the second and third features in `nz_height` and the second feature in `co` is zero.
This demonstrates the fact that distances between points and polygons refer to the distance to *any part* of the polygon: the second and third points in `nz_height` are *in* Otago, which can be verified by plotting them (result not shown).

```r
nz_heighest = nz_height %>% top_n(n = 1, wt = elevation)
canterbury_centroid = st_centroid(canterbury)
st_distance(nz_heighest, canterbury_centroid)
#> Units: [m]
#>        [,1]
#> [1,] 115540
co = filter(nz, grepl("Canter|Otag", Name))
st_distance(nz_height[1:3, ], co)
#> Units: [m]
#>        [,1]  [,2]
#> [1,] 123537 15498
#> [2,]  94283     0
#> [3,]  93019     0
plot(st_geometry(co)[2])
plot(st_geometry(nz_height)[2:3], add = TRUE)
```
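Returning briefly to Section 4.2.6: for a spatially extensive variable, area weighted interpolation reduces to a weighted sum, with weights given by the share of each source zone's area that falls into each target zone. A base R sketch with invented values and overlap shares (not the `incongruent` dataset):

```r
# Hypothetical income per source zone, in million Euros (made up)
src_value = c(4, 6)
# share[i, j]: fraction of source zone i's area inside target zone j;
# rows sum to 1 when the targets fully cover the sources
share = rbind(c(1.00, 0.00),
              c(0.25, 0.75))
as.vector(src_value %*% share)   # extensive variable: weighted sum per target
#> [1] 5.5 4.5
```

For a spatially intensive variable, `st_interpolate_aw()` with `extensive = FALSE` takes an area weighted average instead of a sum, as noted above.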
## 4.3 Spatial operations on raster data

This section builds on Section 3.3, which highlights various basic methods for manipulating raster datasets, to demonstrate more advanced and explicitly spatial raster operations.
It uses the objects `elev` and `grain` manually created in Section 3.3; for the reader's convenience, these datasets can also be found in the **spData** package.

### 4.3.1 Spatial subsetting

The previous chapter (Section 3.3) demonstrated how to retrieve values associated with specific cell IDs or row and column combinations.
Raster objects can also be extracted by location (coordinates) and by other spatial objects.
To use coordinates for subsetting, one can 'translate' the coordinates into a cell ID with the **terra** function `cellFromXY()`.
An alternative is to use `terra::extract()` (be careful, there is also a function called `extract()` in the **tidyverse**) to extract values.
Both methods are demonstrated below to find the value of the cell that covers a point located at coordinates 0.1, 0.1.

Raster objects can also be subset with another raster object, as demonstrated in the code chunk below.
This amounts to retrieving the values of the first raster object (in this case `elev`) that fall within the extent of a second raster (here: `clip`), as illustrated in Figure 4.8.

FIGURE 4.8: Original raster (left). Raster mask (middle). Output of masking a raster (right).

The example above returned the values of specific cells, but in many cases spatial outputs of subsetting operations on raster datasets are needed.
This can be done using the `[` operator with `drop = FALSE`, as outlined in Section 3.3, which also shows how raster objects can be subsetted by various objects.
This is demonstrated in the code below, which returns the first two cells of the `elev` raster object — the first two cells of the top row — as a raster object (only the first 2 lines of the output are shown).

Another common use case of spatial subsetting is when a raster with logical (or NA) values is used to mask another raster with the same extent and resolution, as illustrated in Figure 4.8.
In this case, the `[` and `mask()` functions can be used (results not shown).
In the code chunk below, we create a mask object called `rmask` with values randomly assigned to `NA` and `TRUE`, then keep those values of `elev` which are `TRUE` in `rmask` — in other words, we mask `elev` with `rmask`.
The same approach can also be used to replace some values (e.g., those expected to be wrong) with NA, as in the last line of the chunk.
These operations are in fact Boolean local operations, since we compare two rasters cell-wise; the next subsection explores these and related operations in more detail.

```r
id = cellFromXY(elev, xy = matrix(c(0.1, 0.1), ncol = 2))
elev[id]
# the same as
terra::extract(elev, matrix(c(0.1, 0.1), ncol = 2))
clip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,
            resolution = 0.3, vals = rep(1, 9))
elev[clip]
# we can also use extract
# terra::extract(elev, ext(clip))
elev[1:2, drop = FALSE] # spatial subsetting with cell IDs
#> class       : SpatRaster 
#> dimensions  : 1, 2, 1  (nrow, ncol, nlyr)
#> ...
# create raster mask
rmask = elev
values(rmask) = sample(c(NA, TRUE), 36, replace = TRUE)
# spatial subsetting
elev[rmask, drop = FALSE]   # with [ operator
mask(elev, rmask)           # with mask()
elev[elev < 20] = NA        # replace values below 20 with NA
```
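Because `elev` and `rmask` share extent and resolution, the masking step is equivalent to element-wise replacement on plain matrices. A base R sketch (the 6 × 6 values mimic the row-wise 1–36 `elev` raster; the mask is random, as above):

```r
elev_m = matrix(1:36, nrow = 6, byrow = TRUE)   # stand-in for the elev raster
set.seed(1)
rmask_m = matrix(sample(c(NA, TRUE), 36, replace = TRUE), nrow = 6)
masked = elev_m
masked[is.na(rmask_m)] = NA   # keep values only where the mask is TRUE
```

This is the cell-wise ("Boolean local") comparison that `mask()` and `elev[rmask, drop = FALSE]` perform on `SpatRaster` objects.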
### 4.3.2 Map algebra

The term 'map algebra' was coined in the late 1970s to describe a "set of conventions, capabilities, and techniques" for the analysis of geographic raster and (although less prominently) vector data (Tomlin 1994).
In this context, we define map algebra more narrowly, as operations that modify or summarise raster cell values, with reference to surrounding cells, zones, or statistical functions that apply to every cell.

Map algebra operations tend to be fast, because raster datasets only implicitly store coordinates (hence the oversimplifying phrase "raster is faster but vector is corrector").
The location of cells in raster datasets can be calculated using their matrix position and the resolution and origin of the dataset (stored in the header).
For the processing, however, the geographic position of a cell is barely relevant, as long as we make sure that the cell position is still the same after the processing.
Additionally, if two raster datasets share the same extent, projection and resolution, one can treat them as matrices for the processing.

This is the way map algebra works in the **terra** package.
First, the headers of the raster datasets are queried and (in cases where map algebra operations work on more than one dataset) checked to ensure the datasets are compatible.
Second, map algebra retains the so-called one-to-one locational correspondence, meaning that cells cannot move.
This differs from matrix algebra, in which values change position, for example when multiplying or dividing matrices.

Map algebra (or cartographic modeling with raster data) divides raster operations into four subclasses (Tomlin 1990), with each working on one or several grids simultaneously:

1. *Local* or per-cell operations
2. *Focal* or neighborhood operations; most often the output cell value is the result of a 3 x 3 input cell block
3. *Zonal* operations are similar to focal operations, but the surrounding pixel grid on which new values are computed can have irregular sizes and shapes
4. *Global* or per-raster operations; that means the output cell derives its value potentially from one or several entire rasters

This typology classifies map algebra operations by the number of cells used for each pixel processing step and the type of the output.
For the sake of completeness, we should mention that raster operations can also be classified by discipline, such as terrain, hydrological analysis, or image classification.
The following sections explain how each type of map algebra operation can be used, with reference to worked examples.

### 4.3.3 Local operations

Local operations comprise all cell-by-cell operations in one or several layers.
Raster algebra is a classical use case of local operations — this includes adding or subtracting values from a raster, squaring and multiplying rasters.
Raster algebra also allows logical operations, such as finding all raster cells that are greater than a specific value (5 in our example below).
The **terra** package supports all these operations and more, as demonstrated below (Figure 4.9).

FIGURE 4.9: Examples of different local operations on the elev raster object: adding two rasters, squaring, applying a logarithmic transformation, and performing a logical operation.

Another good example of local operations is the classification of intervals of numeric values into groups, such as grouping a digital elevation model into low (class 1), middle (class 2) and high elevations (class 3).
Using the `classify()` command, we first need to construct a reclassification matrix, in which the first column corresponds to the lower end and the second column to the upper end of each class; the third column represents the new value for the specified ranges in columns one and two.
Here, we assign the raster values in the ranges 0–12, 12–24 and 24–36 to take the values 1, 2 and 3, respectively.
The `classify()` function can also be used when we want to reduce the number of classes in categorical rasters; we will perform several additional reclassifications in Chapter 13.

Apart from applying arithmetic operators directly, one can also use the `app()`, `tapp()` and `lapp()` functions.
They are more efficient, and hence preferable in the presence of large raster datasets; additionally, they allow you to save an output file directly.
The `app()` function applies a function to each cell of a raster and is used to summarize (e.g., calculating the sum of) the values of multiple layers into one layer.
`tapp()` is an extension of `app()`, allowing us to select a subset of layers (see the `index` argument) for which we want to perform a certain operation.
Finally, the `lapp()` function allows us to apply a function to each cell using layers as arguments — an application of `lapp()` is presented below.

The calculation of the normalized difference vegetation index (NDVI) is a well-known local (pixel-by-pixel) raster operation.
It returns a raster with values between -1 and 1; positive values indicate the presence of living plants (mostly > 0.2).
NDVI is calculated from the red and near-infrared (NIR) bands of remotely sensed imagery, typically from satellite systems such as Landsat or Sentinel.
Vegetation absorbs light heavily in the visible light spectrum, especially in the red channel, while reflecting NIR light, which explains the NDVI formula:

\[
NDVI = \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red}}
\]

Let's calculate NDVI for the multispectral satellite file of Zion National Park.
The raster object has four satellite bands — blue, green, red, and near-infrared (NIR).
Our next step is to implement the NDVI formula in an R function, which accepts two numerical arguments, `nir` and `red`, and returns a numerical vector with NDVI values.
It can be used as the `fun` argument of `lapp()`.
We just need to remember that our function expects two bands only (not the four of the original raster), and that they need to be in the NIR, red order; that is why we subset the input raster with `multi_rast[[c(4, 3)]]` before doing any calculations.

The result, shown in the right panel of Figure 4.10, can be compared to the RGB image of the same area (left panel of the same figure).
It allows us to see that the largest NDVI values are connected to areas of dense forest in the northern parts of the area, while the lowest values are related to the lake in the north and the snowy mountain ridges.

FIGURE 4.10: RGB image (left) and NDVI values (right) calculated for the example satellite file of Zion National Park.

Predictive mapping is another interesting application of local raster operations.
The response variable corresponds to measured or observed points in space, for example, species richness, the presence of landslides, tree disease or crop yield.
Consequently, we can easily retrieve space- or airborne predictor variables from various rasters (elevation, pH, precipitation, temperature, landcover, soil class, etc.).
Subsequently, we model our response as a function of our predictors using `lm()`, `glm()`, `gam()` or a machine-learning technique.
Spatial predictions on raster objects can therefore be made by applying estimated coefficients to the predictor raster values, and summing the output raster values (see Chapter 14).

```r
elev + elev
elev^2
log(elev)
elev > 5
rcl = matrix(c(0, 12, 1, 12, 24, 2, 24, 36, 3), ncol = 3, byrow = TRUE)
rcl
#>      [,1] [,2] [,3]
#> [1,]    0   12    1
#> [2,]   12   24    2
#> [3,]   24   36    3
recl = classify(elev, rcl = rcl)
multi_raster_file = system.file("raster/landsat.tif", package = "spDataLarge")
multi_rast = rast(multi_raster_file)
ndvi_fun = function(nir, red){
  (nir - red) / (nir + red)
}
ndvi_rast = lapp(multi_rast[[c(4, 3)]], fun = ndvi_fun)
```

### 4.3.4 Focal operations

While local functions operate on one cell (though possibly from multiple layers), focal operations take into account a central (focal) cell and its neighbors.
The neighborhood (also named kernel, filter or moving window) under consideration is typically of size 3-by-3 cells (the central cell and its eight surrounding neighbors), but can take on any other (not necessarily rectangular) shape as defined by the user.
A focal operation applies an aggregation function to all cells within the specified neighborhood, uses the corresponding output as the new value for the central cell, and moves on to the next central cell (Figure 4.11).
Other names for this operation are spatial filtering and convolution (Burrough, McDonnell, and Lloyd 2015).

In R, we can use the `focal()` function to perform spatial filtering.
We define the shape of the moving window with a matrix whose values correspond to weights (see the `w` parameter in the code chunk below).
Secondly, the `fun` parameter lets us specify the function we wish to apply to this neighborhood; here we choose the minimum, but any other summary function, including `sum()`, `mean()` or `var()`, can be used.
The function also accepts additional arguments, for example, whether to remove NAs in the process (`na.rm = TRUE`) or not (`na.rm = FALSE`).

FIGURE 4.11: Input raster (left) and resulting output raster (right) due to a focal operation — finding the minimum value in 3-by-3 moving windows.

We can quickly check whether the output meets our expectations: in our example, the minimum value always has to be in the upper left corner of the moving window (remember we created the input raster by row-wise incrementing the cell values by one, starting at the upper left corner).
In this example, the weighting matrix consists only of 1s, meaning each cell has the same weight on the output, but this can be changed.

Focal functions or filters play a dominant role in image processing.
Low-pass or smoothing filters use the mean function to remove extremes; in the case of categorical data, we can replace the mean with the mode, the most common value.
By contrast, high-pass filters accentuate features — the line-detection Laplace and Sobel filters might serve as an example here.
Check the `focal()` help page for how to use them in R (this will also be used in the exercises at the end of this chapter).

Terrain processing, the calculation of topographic characteristics such as slope, aspect and flow directions, relies on focal functions.
`terrain()` can be used to calculate these metrics, although some terrain algorithms, including the Zevenbergen and Thorne method to compute slope, are not implemented in this **terra** function.
Many other algorithms — including curvatures, contributing areas and wetness indices — are implemented in open source desktop geographic information system (GIS) software; Chapter 9 shows how to access such GIS functionality from within R.

```r
r_focal = focal(elev, w = matrix(1, nrow = 3, ncol = 3), fun = min)
```

### 4.3.5 Zonal operations

Just like focal operations, zonal operations apply an aggregation function to multiple raster cells.
However, a second raster, usually with categorical values, defines the zonal filters (or 'zones') in the case of zonal operations, as opposed to the predefined neighborhood window in the case of the focal operations presented in the previous section.
Consequently, the raster cells defining the zonal filter do not necessarily have to be neighbors.
Our grain size raster is a good example, as illustrated in the right panel of Figure 3.2: different grain sizes are spread irregularly throughout the raster.
The result of a zonal operation is a summary table grouped by zone, which is why this operation is also known as *zonal statistics* in the GIS world — this is in contrast to focal operations, which return a raster object.

The following code chunk uses the `zonal()` function to calculate the mean elevation associated with each grain size class (the input rasters are shown in Figure 3.2).
This returns the statistics for each category — here the mean altitude for each grain size class.
Note: it is also possible to get a raster with the calculated statistics for each zone by setting the `as.raster` argument to `TRUE`.

```r
z = zonal(elev, grain, fun = "mean")
z
#>   grain elev
#> 1  clay 14.8
#> 2  silt 21.2
#> 3  sand 18.7
```
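The grouping logic behind `zonal()` mirrors base R's `tapply()`: one summary value per category. A sketch with invented elevation and grain values (not the book's `elev`/`grain` rasters):

```r
# Flattened cell values (made up): elevations and the zone of each cell
elev_vals  = c(10, 14, 20, 22, 18, 24)
grain_vals = c("clay", "clay", "silt", "silt", "sand", "sand")
tapply(elev_vals, grain_vals, mean)   # mean elevation per grain class
#> clay sand silt 
#>   12   21   21
```

Note that the zones (here, grain classes) need not be contiguous — any cell anywhere in the raster contributes to the summary of its zone.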
dataset minimum maximum – already discussed Section 3.3.2.Aside , global operations also useful computation distance weight rasters.\nfirst case, one can calculate distance cell specific target cell.\nexample, one might want compute distance nearest coast (see also terra::distance()).\nmight also want consider topography, means, interested pure distance like also avoid crossing mountain ranges going coast.\n, can weight distance elevation additional altitudinal meter ‘prolongs’ Euclidean distance.\nVisibility viewshed computations also belong family global operations (exercises Chapter 9, compute viewshed raster).","code":""},{"path":"spatial-operations.html","id":"map-algebra-counterparts-in-vector-processing","chapter":"4 Spatial data operations","heading":"4.3.7 Map algebra counterparts in vector processing","text":"Many map algebra operations counterpart vector processing (Liu Mason 2009).\nComputing distance raster (global operation) considering maximum distance (logical focal operation) equivalent vector buffer operation (Section 5.2.5).\nReclassifying raster data (either local zonal function depending input) equivalent dissolving vector data (Section 4.2.3).\nOverlaying two rasters (local operation), one contains NULL NA values representing mask, similar vector clipping (Section 5.2.5).\nQuite similar spatial clipping intersecting two layers (Section 4.2.1).\ndifference two layers (vector raster) simply share overlapping area (see Figure 5.8 example).\nHowever, careful wording.\nSometimes words slightly different meanings raster vector data models.\nAggregating case vector data refers dissolving polygons, means increasing resolution case raster data.\nfact, one see dissolving aggregating polygons decreasing resolution.\nHowever, zonal operations might better raster equivalent compared changing cell resolution.\nZonal operations can dissolve cells one raster accordance zones (categories) another raster using aggregation function (see 
).","code":""},{"path":"spatial-operations.html","id":"merging-rasters","chapter":"4 Spatial data operations","heading":"4.3.8 Merging rasters","text":"\nSuppose like compute NDVI (see Section 4.3.3), additionally want compute terrain attributes elevation data observations within study area.\ncomputations rely remotely sensed information.\ncorresponding imagery often divided scenes covering specific spatial extent, frequently, study area covers one scene.\n, need merge scenes covered study area.\neasiest case, can just merge scenes, put side side.\npossible, example, digital elevation data (SRTM, ASTER).\nfollowing code chunk first download SRTM elevation data Austria Switzerland (country codes, see geodata function country_codes()).\nsecond step, merge two rasters one.terra’s merge() command combines two images, case overlap, uses value first raster.\n\n\n\n\n\nmerging approach little use overlapping values correspond .\nfrequently case want combine spectral imagery scenes taken different dates.\nmerge() command still work see clear border resulting image.\nhand, mosaic() command lets define function overlapping area.\ninstance, compute mean value – might smooth clear border merged result likely make disappear.\n, need advanced approach.\nRemote sensing scientists frequently apply histogram matching use regression techniques align values first image second image.\npackages landsat (histmatch(), relnorm(), PIF()), satellite (calcHistMatch()) RStoolbox (histMatch(), pifMatch()) provide corresponding functions raster’s package objects.\ndetailed introduction use R remote sensing, refer reader Wegmann, Leutner, Dech (2016).\n\n","code":"\naut = geodata::elevation_30s(country = \"AUT\", path = tempdir())\nch = geodata::elevation_30s(country = \"CHE\", path = tempdir())\naut_ch = merge(aut, ch)"},{"path":"spatial-operations.html","id":"exercises-2","chapter":"4 Spatial data operations","heading":"4.4 Exercises","text":"E1. 
It was established in Section 4.2 that Canterbury was the region of New Zealand containing most of the 100 highest points in the country.\nHow many of these high points does the Canterbury region contain?\nE2. Which region has the second highest number of nz_height points, and how many does it have?\nE3. Generalizing the question to all regions: how many of New Zealand's 16 regions contain points which belong to the top 100 highest points in the country? Which regions?\nBonus: create a table listing these regions in order of the number of points and their name.\nE4. Use dem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\")), and reclassify the elevation into three classes: low (<300), medium and high (>500).\nSecondly, read the NDVI raster (ndvi = rast(system.file(\"raster/ndvi.tif\", package = \"spDataLarge\"))) and compute the mean NDVI and the mean elevation for each altitudinal class.\nE5. Apply a line detection filter to rast(system.file(\"ex/logo.tif\", package = \"terra\")).\nPlot the result.\nHint: Read ?terra::focal().\nE6. Calculate the Normalized Difference Water Index (NDWI; (green - nir)/(green + nir)) of a Landsat image.\nUse the Landsat image provided by the spDataLarge package (system.file(\"raster/landsat.tif\", package = \"spDataLarge\")).\nAlso, calculate a correlation between NDVI and NDWI for this area.\nE7. A StackOverflow post shows how to compute distances to the nearest coastline using raster::distance().\nTry to do something similar but with terra::distance(): retrieve a digital elevation model of Spain, and compute a raster which represents distances to the coast across the country (hint: use geodata::elevation_30s()).\nConvert the resulting distances from meters to kilometers.\nNote: it may be wise to increase the cell size of the input raster to reduce compute time during this operation.\nE8.
Try modify approach used exercise weighting distance raster elevation raster; every 100 altitudinal meters increase distance coast 10 km.\nNext, compute visualize difference raster created using Euclidean distance (E7) raster weighted elevation.","code":""},{"path":"geometric-operations.html","id":"geometric-operations","chapter":"5 Geometry operations","heading":"5 Geometry operations","text":"","code":""},{"path":"geometric-operations.html","id":"prerequisites-3","chapter":"5 Geometry operations","heading":"Prerequisites","text":"chapter uses packages Chapter 4 addition spDataLarge, installed Chapter 2:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(spDataLarge)"},{"path":"geometric-operations.html","id":"introduction-2","chapter":"5 Geometry operations","heading":"5.1 Introduction","text":"previous three chapters demonstrated geographic datasets structured R (Chapter 2) manipulate based non-geographic attributes (Chapter 3) spatial properties (Chapter 4).\nchapter extends skills.\nreading — attempting exercises end — understand control geometry column sf objects geographic location pixels represented rasters.Section 5.2 covers transforming vector geometries ‘unary’ ‘binary’ operations.\nUnary operations work single geometry isolation.\nincludes simplification (lines polygons), creation buffers centroids, shifting/scaling/rotating single geometries using ‘affine transformations’ (Sections 5.2.1 5.2.4).\nBinary transformations modify one geometry based shape another.\nincludes clipping geometry unions, covered Sections 5.2.5 5.2.6, respectively.\nType transformations (polygon line, example) demonstrated Section 5.2.7.Section 5.3 covers geometric transformations raster objects.\ninvolves changing size number underlying pixels, assigning new values.\nteaches change resolution (also called raster aggregation disaggregation), extent origin raster.\noperations especially useful 
one like align raster datasets diverse sources.\nAligned raster objects share one--one correspondence pixels, allowing processed using map algebra operations, described Section 4.3.2. final Section 5.4 connects vector raster objects.\nshows raster values can ‘masked’ ‘extracted’ vector geometries.\nImportantly shows ‘polygonize’ rasters ‘rasterize’ vector datasets, making two data models interchangeable.","code":""},{"path":"geometric-operations.html","id":"geo-vec","chapter":"5 Geometry operations","heading":"5.2 Geometric operations on vector data","text":"section operations way change geometry vector (sf) objects.\nadvanced spatial data operations presented previous chapter (Section 4.2), drill geometry:\nfunctions discussed section work objects class sfc addition objects class sf.","code":""},{"path":"geometric-operations.html","id":"simplification","chapter":"5 Geometry operations","heading":"5.2.1 Simplification","text":"\nSimplification process generalization vector objects (lines polygons) usually use smaller scale maps.\nAnother reason simplifying objects reduce amount memory, disk space network bandwidth consume:\nmay wise simplify complex geometries publishing interactive maps.\nsf package provides st_simplify(), uses GEOS implementation Douglas-Peucker algorithm reduce vertex count.\nst_simplify() uses dTolerance control level generalization map units (see Douglas Peucker 1973 details).\nFigure 5.1 illustrates simplification LINESTRING geometry representing river Seine tributaries.\nsimplified geometry created following command:\nFIGURE 5.1: Comparison original simplified geometry seine object.\nresulting seine_simp object copy original seine fewer vertices.\napparent, result visually simpler (Figure 5.1, right) consuming less memory original object, verified :Simplification also applicable polygons.\nillustrated using us_states, representing contiguous United States.\nshow Chapter 6, GEOS assumes data projected CRS lead unexpected results using 
geographic CRS.\nTherefore, the first step is to project the data into an adequate projected CRS, such as US National Atlas Equal Area (epsg = 2163) (left in Figure 5.2):\nst_simplify() works equally well with projected polygons:\nA limitation of st_simplify() is that it simplifies objects on a per-geometry basis.\nThis means the ‘topology’ is lost, resulting in overlapping and ‘holey’ areal units as illustrated in Figure 5.2 (middle panel).\nms_simplify() from rmapshaper provides an alternative that overcomes this issue.\nBy default it uses the Visvalingam algorithm, which overcomes some limitations of the Douglas-Peucker algorithm (Visvalingam and Whyatt 1993).\nThe following code chunk uses this function to simplify us_states2163.\nThe result has only 1% of the vertices of the input (set using the argument keep) but its number of objects remains intact because we set keep_shapes = TRUE:24\nFinally, a visual comparison of the original dataset with the two simplified versions shows differences between the Douglas-Peucker (st_simplify) and Visvalingam (ms_simplify) algorithm outputs (Figure 5.2):\nFIGURE 5.2: Polygon simplification in action, comparing the original geometry of the contiguous United States with simplified versions, generated with functions from the sf (center) and rmapshaper (right) packages.\n","code":"\nseine_simp = st_simplify(seine, dTolerance = 2000) # 2000 m\nobject.size(seine)\n#> 18096 bytes\nobject.size(seine_simp)\n#> 9112 bytes\nus_states2163 = st_transform(us_states, 2163)\nus_states_simp1 = st_simplify(us_states2163, dTolerance = 100000) # 100 km\n# proportion of points to retain (0-1; default 0.05)\nus_states2163$AREA = as.numeric(us_states2163$AREA)\nus_states_simp2 = rmapshaper::ms_simplify(us_states2163, keep = 0.01,\n keep_shapes = TRUE)"},{"path":"geometric-operations.html","id":"centroids","chapter":"5 Geometry operations","heading":"5.2.2 Centroids","text":"\nCentroid operations identify the center of geographic objects.\nLike statistical measures of central tendency (including mean and median definitions of ‘average’), there are many ways to define the geographic center of an object.\nAll of them create single point representations of more complex vector objects.\nThe most commonly used centroid operation is the geographic centroid.\nThis type of centroid operation (often referred
‘centroid’) represents center mass spatial object (think balancing plate finger).\nGeographic centroids many uses, example create simple point representation complex geometries, estimate distances polygons.\ncan calculated sf function st_centroid() demonstrated code , generates geographic centroids regions New Zealand tributaries River Seine, illustrated black points Figure 5.3.Sometimes geographic centroid falls outside boundaries parent objects (think doughnut).\ncases point surface operations can used guarantee point parent object (e.g., labeling irregular multipolygon objects island states), illustrated red points Figure 5.3.\nNotice red points always lie parent objects.\ncreated st_point_on_surface() follows:25\nFIGURE 5.3: Centroids (black points) ‘points surface’ (red points) New Zealand’s regions (left) Seine (right) datasets.\ntypes centroids exist, including Chebyshev center visual center.\nexplore possible calculate using R, ’ll see Chapter 10.","code":"\nnz_centroid = st_centroid(nz)\nseine_centroid = st_centroid(seine)\nnz_pos = st_point_on_surface(nz)\nseine_pos = st_point_on_surface(seine)"},{"path":"geometric-operations.html","id":"buffers","chapter":"5 Geometry operations","heading":"5.2.3 Buffers","text":"\nBuffers polygons representing area within given distance geometric feature:\nregardless whether input point, line polygon, output polygon.\nUnlike simplification (often used visualization reducing file size) buffering tends used geographic data analysis.\nmany points within given distance line?\ndemographic groups within travel distance new shop?\nkinds questions can answered visualized creating buffers around geographic entities interest.Figure 5.4 illustrates buffers different sizes (5 50 km) surrounding river Seine tributaries.\nbuffers created commands , show command st_buffer() requires least two arguments: input geometry distance, provided units CRS (case meters):\nFIGURE 5.4: Buffers around Seine dataset 5 km (left) 50 km (right). 
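The text above notes that buffering always yields a polygon, whatever the input geometry. As a language-agnostic sketch of how buffer output is discretized (st_buffer() itself delegates to GEOS), the following Python snippet approximates a circular buffer around a single point as a closed polygon ring; the function name `buffer_point` and the sampled-segments approach are illustrative assumptions, not an sf or terra API.

```python
import math

def buffer_point(x, y, dist, n_segments=32):
    """Approximate the circular buffer around a point (x, y) as a closed
    polygon ring with n_segments vertices, each at distance `dist` from
    the center. A sketch of how vector libraries discretize buffers."""
    ring = [
        (x + dist * math.cos(2 * math.pi * i / n_segments),
         y + dist * math.sin(2 * math.pi * i / n_segments))
        for i in range(n_segments)
    ]
    ring.append(ring[0])  # close the ring: first vertex repeated as last
    return ring

# A 5000 m buffer around an arbitrary, hypothetical point at the origin
ring = buffer_point(0.0, 0.0, dist=5000, n_segments=64)
```

Increasing `n_segments` trades vertex count for a smoother approximation of the true circle, analogous to the segment-resolution argument that buffering functions typically expose.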
Note the colors, which reflect the fact that one buffer is created per geometry feature.\n","code":"\nseine_buff_5km = st_buffer(seine, dist = 5000)\nseine_buff_50km = st_buffer(seine, dist = 50000)"},{"path":"geometric-operations.html","id":"affine-transformations","chapter":"5 Geometry operations","heading":"5.2.4 Affine transformations","text":"\nAn affine transformation is any transformation that preserves lines and parallelism.\nHowever, angles or lengths are not necessarily preserved.\nAffine transformations include, among others, shifting (translation), scaling and rotation.\nAdditionally, it is possible to use any combination of these.\nAffine transformations are an essential part of geocomputation.\nFor example, shifting is needed for label placement, scaling is used in non-contiguous area cartograms (see Section 8.6), and many affine transformations are applied when reprojecting or improving geometry that was created based on a distorted or wrongly projected map.\nThe sf package implements affine transformations for objects of classes sfg and sfc.\nShifting moves every point by the same distance in map units.\nIt can be done by adding a numerical vector to a vector object.\nFor example, the code below shifts all y-coordinates 100,000 meters to the north, but leaves the x-coordinates untouched (left panel of Figure 5.5).\nScaling enlarges or shrinks objects by a factor.\nIt can be applied either globally or locally.\nGlobal scaling increases or decreases all coordinate values in relation to the origin coordinates, while keeping all geometries' topological relations intact.\nIt can be done by subtraction or multiplication of an sfg or sfc object.\nLocal scaling treats geometries independently and requires points around which the geometries are going to be scaled, e.g., centroids.\nIn the example below, each geometry is shrunk by a factor of two around the centroids (middle panel of Figure 5.5).\nTo achieve that, each object is firstly shifted in a way that its center has coordinates of 0, 0 ((nz_sfc - nz_centroid_sfc)).\nNext, the sizes of the geometries are reduced by half (* 0.5).\nFinally, each object's centroid is moved back to the input data coordinates (+ nz_centroid_sfc).\nRotation of two-dimensional coordinates requires a rotation matrix:\\[\nR =\n\\begin{bmatrix}\n\\cos \\theta & -\\sin \\theta \\\\ \n\\sin \\theta & \\cos \\theta \\\\\n\\end{bmatrix}\n\\]It rotates
points clockwise direction.\nrotation matrix can implemented R :rotation function accepts one argument - rotation angle degrees.\nRotation done around selected points, centroids (right panel Figure 5.5).\nSee vignette(\"sf3\") examples.\nFIGURE 5.5: Illustrations affine transformations: shift, scale rotate.\nFinally, newly created geometries can replace old ones st_set_geometry() function:","code":"\nnz_sfc = st_geometry(nz)\nnz_shift = nz_sfc + c(0, 100000)\nnz_centroid_sfc = st_centroid(nz_sfc)\nnz_scale = (nz_sfc - nz_centroid_sfc) * 0.5 + nz_centroid_sfc\nrotation = function(a){\n r = a * pi / 180 #degrees to radians\n matrix(c(cos(r), sin(r), -sin(r), cos(r)), nrow = 2, ncol = 2)\n} \nnz_rotate = (nz_sfc - nz_centroid_sfc) * rotation(30) + nz_centroid_sfc\nnz_scale_sf = st_set_geometry(nz, nz_scale)"},{"path":"geometric-operations.html","id":"clipping","chapter":"5 Geometry operations","heading":"5.2.5 Clipping","text":"\n\nSpatial clipping form spatial subsetting involves changes geometry columns least affected features.Clipping can apply features complex points:\nlines, polygons ‘multi’ equivalents.\nillustrate concept start simple example:\ntwo overlapping circles center point one unit away radius one (Figure 5.6).\nFIGURE 5.6: Overlapping circles.\nImagine want select one circle , space covered x y.\ncan done using function st_intersection(), illustrated using objects named x y represent left- right-hand circles (Figure 5.7).\nFIGURE 5.7: Overlapping circles gray color indicating intersection .\nsubsequent code chunk demonstrates works combinations ‘Venn’ diagram representing x y, inspired Figure 5.1 book R Data Science (Grolemund Wickham 2016).\nFIGURE 5.8: Spatial equivalents logical operators.\nillustrate relationship subsetting clipping spatial data, subset points cover bounding box circles x y Figure 5.8.\npoints inside just one circle, inside inside neither.\nst_sample() used generate simple random distribution points within extent circles x y, 
resulting output illustrated Figure 5.9.\nFIGURE 5.9: Randomly distributed points within bounding box enclosing circles x y.\nlogical operator way find points inside x y using spatial predicate st_intersects(), whereas intersection method simply finds points inside intersecting region created x_and_y.\ndemonstrated results identical, method uses clipped polygon concise:","code":"\nb = st_sfc(st_point(c(0, 1)), st_point(c(1, 1))) # create 2 points\nb = st_buffer(b, dist = 1) # convert points to circles\nplot(b)\ntext(x = c(-0.5, 1.5), y = 1, labels = c(\"x\", \"y\")) # add text\nx = b[1]\ny = b[2]\nx_and_y = st_intersection(x, y)\nplot(b)\nplot(x_and_y, col = \"lightgrey\", add = TRUE) # color intersecting area\nbb = st_bbox(st_union(x, y))\nbox = st_as_sfc(bb)\nset.seed(2017)\np = st_sample(x = box, size = 10)\nplot(box)\nplot(x, add = TRUE)\nplot(y, add = TRUE)\nplot(p, add = TRUE)\ntext(x = c(-0.5, 1.5), y = 1, labels = c(\"x\", \"y\"))\nsel_p_xy = st_intersects(p, x, sparse = FALSE)[, 1] &\n st_intersects(p, y, sparse = FALSE)[, 1]\np_xy1 = p[sel_p_xy]\np_xy2 = p[x_and_y]\nidentical(p_xy1, p_xy2)\n#> [1] TRUE"},{"path":"geometric-operations.html","id":"geometry-unions","chapter":"5 Geometry operations","heading":"5.2.6 Geometry unions","text":"\n\nsaw Section 3.2.3, spatial aggregation can silently dissolve geometries touching polygons group.\ndemonstrated code chunk 49 us_states aggregated 4 regions using base tidyverse functions (see results Figure 5.10):\nFIGURE 5.10: Spatial aggregation contiguous polygons, illustrated aggregating population US states regions, population represented color. 
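The aggregation described above has two halves: a geometric one (dissolving boundaries with st_union()) and an attribute one (summing a variable per group). The attribute half can be sketched in plain Python as a group-by-and-sum; the rows below are hypothetical stand-ins for the `REGION` and `total_pop_15` columns of `us_states`, and the geometric union is deliberately omitted.

```python
from collections import defaultdict

# Hypothetical (region, population) rows standing in for us_states columns
states = [
    ("West", 39.1), ("West", 7.3),
    ("South", 27.5), ("South", 10.2),
    ("Midwest", 12.8), ("Midwest", 5.7),
]

def aggregate_pop(rows):
    """Group-by-and-sum: the attribute half of aggregate()/summarize()."""
    totals = defaultdict(float)
    for region, pop in rows:
        totals[region] += pop
    return dict(totals)

regions = aggregate_pop(states)
```

In sf, both halves happen in one call: `aggregate()` or `summarize()` computes these per-group sums while st_union() silently merges the corresponding geometries.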
Note operation automatically dissolves boundaries states.\ngoing terms geometries?\nBehind scenes, aggregate() summarize() combine geometries dissolve boundaries using st_union().\ndemonstrated code chunk creates united western US:function can take two geometries unite , demonstrated code chunk creates united western block incorporating Texas (challenge: reproduce plot result):","code":"\nregions = aggregate(x = us_states[, \"total_pop_15\"], by = list(us_states$REGION),\n FUN = sum, na.rm = TRUE)\nregions2 = us_states %>% group_by(REGION) %>%\n summarize(pop = sum(total_pop_15, na.rm = TRUE))\nus_west = us_states[us_states$REGION == \"West\", ]\nus_west_union = st_union(us_west)\ntexas = us_states[us_states$NAME == \"Texas\", ]\ntexas_union = st_union(us_west_union, texas)"},{"path":"geometric-operations.html","id":"type-trans","chapter":"5 Geometry operations","heading":"5.2.7 Type transformations","text":"\nGeometry casting powerful operation enables transformation geometry type.\nimplemented st_cast function sf package.\nImportantly, st_cast behaves differently single simple feature geometry (sfg) objects, simple feature geometry column (sfc) simple features objects.Let’s create multipoint illustrate geometry casting works simple feature geometry (sfg) objects:case, st_cast can useful transform new object linestring polygon (Figure 5.11):\nFIGURE 5.11: Examples linestring polygon casted multipoint geometry.\nConversion multipoint linestring common operation creates line object ordered point observations, GPS measurements geotagged media.\nallows spatial operations length path traveled.\nConversion multipoint linestring polygon often used calculate area, example set GPS measurements taken around lake corners building lot.transformation process can also reversed using st_cast:Geometry casting simple features geometry column (sfc) simple features objects works single geometries cases.\nOne important difference conversion multi-types non-multi-types.\nresult 
process, multi-objects split many non-multi-objects.Table 5.1 shows possible geometry type transformations simple feature objects.\ninput simple feature object one element (first column) transformed directly another geometry type.\nSeveral transformations possible, example, convert single point multilinestring polygon (cells [1, 4:5] table NA).\nhand, transformations splitting single element input object multi-element object.\ncan see , example, cast multipoint consisting five pairs coordinates point.\nTABLE 5.1: Geometry casting simple feature geometries (see Section 2.1) input type row output type column\nLet’s try apply geometry type transformations new object, multilinestring_sf, example (left Figure 5.12):can imagine road river network.\nnew object one row defines lines.\nrestricts number operations can done, example prevents adding names line segment calculating lengths single lines.\nst_cast function can used situation, separates one mutlilinestring three linestrings:\nFIGURE 5.12: Examples type casting MULTILINESTRING (left) LINESTRING (right).\nnewly created object allows attributes creation (see Section 3.2.5) length measurements:","code":"\nmultipoint = st_multipoint(matrix(c(1, 3, 5, 1, 3, 1), ncol = 2))\nlinestring = st_cast(multipoint, \"LINESTRING\")\npolyg = st_cast(multipoint, \"POLYGON\")\nmultipoint_2 = st_cast(linestring, \"MULTIPOINT\")\nmultipoint_3 = st_cast(polyg, \"MULTIPOINT\")\nall.equal(multipoint, multipoint_2, multipoint_3)\n#> [1] TRUE\nmultilinestring_list = list(matrix(c(1, 4, 5, 3), ncol = 2), \n matrix(c(4, 4, 4, 1), ncol = 2),\n matrix(c(2, 4, 2, 2), ncol = 2))\nmultilinestring = st_multilinestring((multilinestring_list))\nmultilinestring_sf = st_sf(geom = st_sfc(multilinestring))\nmultilinestring_sf\n#> Simple feature collection with 1 feature and 0 fields\n#> Geometry type: MULTILINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom\n#> 1 MULTILINESTRING ((1 5, 4 
3)...\nlinestring_sf2 = st_cast(multilinestring_sf, \"LINESTRING\")\nlinestring_sf2\n#> Simple feature collection with 3 features and 0 fields\n#> Geometry type: LINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom\n#> 1 LINESTRING (1 5, 4 3)\n#> 2 LINESTRING (4 4, 4 1)\n#> 3 LINESTRING (2 2, 4 2)\nlinestring_sf2$name = c(\"Riddle Rd\", \"Marshall Ave\", \"Foulke St\")\nlinestring_sf2$length = st_length(linestring_sf2)\nlinestring_sf2\n#> Simple feature collection with 3 features and 2 fields\n#> Geometry type: LINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom name length\n#> 1 LINESTRING (1 5, 4 3) Riddle Rd 3.61\n#> 2 LINESTRING (4 4, 4 1) Marshall Ave 3.00\n#> 3 LINESTRING (2 2, 4 2) Foulke St 2.00"},{"path":"geometric-operations.html","id":"geo-ras","chapter":"5 Geometry operations","heading":"5.3 Geometric operations on raster data","text":"\nGeometric raster operations include shift, flipping, mirroring, scaling, rotation warping images.\noperations necessary variety applications including georeferencing, used allow images overlaid accurate map known CRS (Liu Mason 2009).\nvariety georeferencing techniques exist, including:Georectification based known ground control pointsOrthorectification, also accounts local topographyImage registration used combine images thing shot different sensors aligning one image another (terms coordinate system resolution)R rather unsuitable first two points since often require manual intervention usually done help dedicated GIS software (see also Chapter 9).\nhand, aligning several images possible R section shows among others .\noften includes changing extent, resolution origin image.\nmatching projection course also required already covered Section 6.6.case, reasons perform geometric operation single raster image.\ninstance, Chapter 13 define metropolitan areas Germany 20 km2 pixels 500,000 inhabitants.\noriginal inhabitant raster, 
however, resolution 1 km2 decrease (aggregate) resolution factor 20 (see Section 13.5).\nAnother reason aggregating raster simply decrease run-time save disk space.\ncourse, possible task hand allows coarser resolution.\nSometimes coarser resolution sufficient task hand.","code":""},{"path":"geometric-operations.html","id":"geometric-intersections","chapter":"5 Geometry operations","heading":"5.3.1 Geometric intersections","text":"\nSection 4.3.1 shown extract values raster overlaid spatial objects.\nretrieve spatial output, can use almost subsetting syntax.\ndifference make clear like keep matrix structure setting drop argument FALSE.\nreturn raster object containing cells whose midpoints overlap clip.operation can also use intersect() crop() command.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\nclip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,\n resolution = 0.3, vals = rep(1, 9))\nelev[clip, drop = FALSE]\n#> class : SpatRaster \n#> dimensions : 2, 1, 1 (nrow, ncol, nlyr)\n#> resolution : 0.5, 0.5 (x, y)\n#> extent : 1, 1.5, -0.5, 0.5 (xmin, xmax, ymin, ymax)\n#> coord. ref. : lon/lat WGS 84 (EPSG:4326) \n#> source : memory \n#> name : elev \n#> min value : 18 \n#> max value : 24"},{"path":"geometric-operations.html","id":"extent-and-origin","chapter":"5 Geometry operations","heading":"5.3.2 Extent and origin","text":"\nmerging performing map algebra rasters, resolution, projection, origin /extent match. 
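The alignment requirement stated above (matching resolution, extent and origin before map algebra) can be made concrete with a small Python sketch. The header-dict format and function names here are assumptions of this illustration, not a terra API; the error message merely echoes terra's behavior when extents differ.

```python
def grids_align(a, b, tol=1e-9):
    """True when two raster 'headers' share extent and resolution (and
    therefore origin). Header dicts are an assumed format for this sketch."""
    keys = ("xmin", "xmax", "ymin", "ymax", "xres", "yres")
    return all(abs(a[k] - b[k]) < tol for k in keys)

def add_rasters(a, b):
    """Cell-wise addition, defined only for aligned grids (compare terra's
    '[+] extents do not match' error for mismatched SpatRasters)."""
    if not grids_align(a["header"], b["header"]):
        raise ValueError("extents do not match")
    return [[va + vb for va, vb in zip(ra, rb)]
            for ra, rb in zip(a["values"], b["values"])]

hdr = {"xmin": 0, "xmax": 2, "ymin": 0, "ymax": 2, "xres": 1, "yres": 1}
r1 = {"header": hdr, "values": [[1, 2], [3, 4]]}
r2 = {"header": dict(hdr), "values": [[10, 20], [30, 40]]}
summed = add_rasters(r1, r2)  # cell-wise sums of the two aligned grids
```

Passing a raster whose header differs in any of the six fields raises the error instead of silently adding misregistered cells, which is exactly the safety behavior one wants before map algebra.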
Otherwise, add values one raster resolution 0.2 decimal degrees second raster resolution 1 decimal degree?\nproblem arises like merge satellite imagery different sensors different projections resolutions.\ncan deal mismatches aligning rasters.simplest case, two images differ regard extent.\nFollowing code adds one row two columns side raster setting new values elevation 1000 meters (Figure 5.13).\nFIGURE 5.13: Original raster (left) raster (right) extended one row top bottom two columns left right.\nPerforming algebraic operation two objects differing extents R, terra package returns error.However, can align extent two rasters extend().\nInstead telling function many rows columns added (done ), allow figure using another raster object.\n, extend elev object extent elev_2.\nnewly added rows column receive default value value parameter, .e., NA.origin raster cell corner closest coordinates (0, 0).\norigin() function returns coordinates origin.\nexample cell corner exists coordinates (0, 0), necessarily case.two rasters different origins, cells overlap completely make map algebra impossible.\nchange origin – use origin().26\nLooking Figure 5.14 reveals effect changing origin.\nFIGURE 5.14: Rasters identical values different origins.\nNote changing resolution (next section) frequently also changes origin.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\nelev_2 = extend(elev, c(1, 2))\nelev_3 = elev + elev_2\n#> Error: [+] extents do not match\nelev_4 = extend(elev, elev_2)\norigin(elev_4)\n#> [1] 0 0\n# change the origin\norigin(elev_4) = c(0.25, 0.25)"},{"path":"geometric-operations.html","id":"aggregation-and-disaggregation","chapter":"5 Geometry operations","heading":"5.3.3 Aggregation and disaggregation","text":"\n\nRaster datasets can also differ regard resolution.\nmatch resolutions, one can either decrease (aggregate()) increase (disagg()) resolution one raster.27\nexample, change spatial resolution dem (found spDataLarge package) 
factor of 5 (Figure 5.15).\nAdditionally, the output cell value should correspond to the mean of the input cells (note that one could use other functions as well, such as median(), sum(), etc.):\nFIGURE 5.15: Original raster (left). Aggregated raster (right).\nBy contrast, the disagg() function increases the resolution.\nHowever, we have to specify a method to fill the new cells.\nThe disagg() function provides two methods.\nThe default one (method = \"near\") simply gives all output cells the value of the input cell, and hence duplicates values, which leads to a blocky output image.\nThe bilinear method, in turn, is an interpolation technique that uses the four nearest pixel centers of the input image (salmon colored points in Figure 5.16) to compute an average weighted by distance (arrows in Figure 5.16) as the value of the output cell - the square in the upper left corner of Figure 5.16.\nFIGURE 5.16: The distance-weighted average of the four closest input cells determines the output using the bilinear method for disaggregation.\nComparing the values of dem and dem_disagg tells us that they are not identical (you can also use compareGeom() or all.equal()).\nHowever, this was hardly to be expected, since disaggregating is a simple interpolation technique.\nIt is important to keep in mind that disaggregating results in a finer resolution; the corresponding values, however, are only as accurate as their lower resolution source.","code":"\ndem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\"))\ndem_agg = aggregate(dem, fact = 5, fun = mean)\ndem_disagg = disagg(dem_agg, fact = 5, method = \"bilinear\")\nidentical(dem, dem_disagg)\n#> [1] FALSE"},{"path":"geometric-operations.html","id":"resampling","chapter":"5 Geometry operations","heading":"5.3.4 Resampling","text":"\nThe methods of aggregation and disaggregation are only suitable when we want to change the resolution of our raster by an aggregation/disaggregation factor.\nHowever, what to do when we have two rasters with different resolutions and origins?\nThis is the role of resampling – a process of computing values for new pixel locations.\nIn short, this process takes the values of our original raster and recalculates new values for a target raster with custom resolution and origin.\nThere are several methods for recalculating (estimating) values for a raster with different resolutions/origins (Figure 5.17).\nIt includes:\nNearest neighbor - assigns the value of the nearest cell of the original
raster cell target one.\nfast usually suitable categorical rasters.Bilinear interpolation - assigns weighted average four nearest cells original raster cell target one (Figure 5.16). fastest method continuous rasters.Cubic interpolation - uses values 16 nearest cells original raster determine output cell value, applying third-order polynomial functions. Used continuous rasters. results smoothed surface bilinear interpolation, also computationally demanding.Cubic spline interpolation - also uses values 16 nearest cells original raster determine output cell value, applies cubic splines (piecewise third-order polynomial functions) derive results. Used continuous rasters.Lanczos windowed sinc resampling - uses values 36 nearest cells original raster determine output cell value. Used continuous rasters.28As can find explanation, nearest neighbor suitable categorical rasters, methods can used (different outcomes) continuous rasters.\nAdditionally, successive method requires processing time.apply resampling, terra package provides resample() function.\naccepts input raster (x), raster target spatial properties (y), resampling method (method).need raster target spatial properties see resample() function works.\nexample, create target_rast, often use already existing raster object.Next, need provide two raster objects first two arguments one resampling methods described .Figure 5.17 shows comparison different resampling methods dem object.\nFIGURE 5.17: Visual comparison original raster five different resampling methods.\nsee section 6.6, raster reprojection special case resampling target raster different CRS original raster.geometry operations terra user-friendly, rather fast, work large raster objects.\nHowever, cases, terra performant either extensive rasters many raster files, alternatives considered.established alternatives come GDAL library.\ncontains several utility functions, including:gdalinfo - lists various information raster file, including resolution, CRS, 
bounding box, and moregdal_translate - converts raster data into different file formatsgdal_rasterize - converts vector data into raster filesgdalwarp - allows for raster mosaicing, resampling, cropping, and reprojecting","code":"\ntarget_rast = rast(xmin = 794600, xmax = 798200, \n ymin = 8931800, ymax = 8935400,\n resolution = 150, crs = \"EPSG:32717\")\ndem_resampl = resample(dem, y = target_rast, method = \"bilinear\")"},{"path":"geometric-operations.html","id":"raster-vector","chapter":"5 Geometry operations","heading":"5.4 Raster-vector interactions","text":"\nThis section focuses on interactions between raster and vector geographic data models, introduced in Chapter 2.\nIt includes four main techniques:\nraster cropping and masking using vector objects (Section 5.4.1);\nextracting raster values using different types of vector data (Section 5.4.2);\nand raster-vector conversion (Sections 5.4.3 and 5.4.4).\nThese concepts are demonstrated using data from previous chapters to understand their potential real-world applications.","code":""},{"path":"geometric-operations.html","id":"raster-cropping","chapter":"5 Geometry operations","heading":"5.4.1 Raster cropping","text":"\nMany geographic data projects involve integrating data from many different sources, such as remote sensing images (rasters) and administrative boundaries (vectors).\nOften the extent of input raster datasets is larger than the area of interest.\nIn this case raster cropping and masking are useful for unifying the spatial extent of input data.\nBoth operations reduce object memory use and associated computational resources for subsequent analysis steps, and may be a necessary preprocessing step before creating attractive maps involving raster data.We will use two objects to illustrate raster cropping:a SpatRaster object srtm representing elevation (meters above sea level) in south-western Utah, anda vector (sf) object zion representing Zion National Park.Both target and cropping objects must have the same projection.\nThe following code chunk therefore not only reads the datasets from the spDataLarge package (installed in Chapter 2), it also reprojects zion (see Section 6 for more on reprojection):We will use crop() from the terra package to crop the srtm raster.\nIt reduces the rectangular
extent of the object passed as its first argument based on the extent of the object passed as its second argument, as demonstrated in the command below (which generates Figure 5.18(B) — note the smaller extent of the raster background):\nRelated to crop() is the terra function mask(), which sets values outside of the bounds of the object passed as its second argument to NA.\nThe following command therefore masks every cell outside of the Zion National Park boundaries (Figure 5.18(C)):Importantly, we want to use both crop() and mask() together in most cases.\nThis combination of functions would (a) limit the raster’s extent to our area of interest and (b) replace all of the values outside of the area with NA.Changing the settings of mask() yields different results.\nSetting updatevalue = 0, for example, will set all pixels outside the national park to 0.\nSetting inverse = TRUE will mask everything inside the bounds of the park (see ?mask for details) (Figure 5.18(D)).\nFIGURE 5.18: Illustration of raster cropping and raster masking.\n","code":"\nsrtm = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\nzion = st_read(system.file(\"vector/zion.gpkg\", package = \"spDataLarge\"))\nzion = st_transform(zion, crs(srtm))\nsrtm_cropped = crop(srtm, vect(zion))\nsrtm_masked = mask(srtm, vect(zion))\nsrtm_cropped = crop(srtm, vect(zion))\nsrtm_final = mask(srtm_cropped, vect(zion))\nsrtm_inv_masked = mask(srtm, vect(zion), inverse = TRUE)"},{"path":"geometric-operations.html","id":"raster-extraction","chapter":"5 Geometry operations","heading":"5.4.2 Raster extraction","text":"\nRaster extraction is the process of identifying and returning the values associated with a ‘target’ raster at specific locations, based on a (typically vector) geographic ‘selector’ object.\nThe results depend on the type of selector used (points, lines or polygons) and on arguments passed to the terra::extract() function, which we use to demonstrate raster extraction.\nThe reverse of raster extraction — assigning raster cell values based on vector objects — is rasterization, described in Section 5.4.3.The most basic example is extracting the value of a raster cell at specific points.\nFor this purpose, we will use zion_points, which contain a sample of 30 locations within Zion National Park (Figure 5.19).\nThe following command extracts elevation values from srtm and creates a data
frame with the points’ IDs (one value per vector’s row) and the related srtm values for each point.\nNow, we can add the resulting object to our zion_points dataset with the cbind() function:\nFIGURE 5.19: Locations of points used for raster extraction.\nRaster extraction also works with line selectors.\nThen, it extracts one value for each raster cell touched by a line.\nHowever, the line extraction approach is not recommended to obtain values along transects, as it is hard to get the correct distance between each pair of extracted raster values.In this case, a better approach is to split the line into many points and then extract the values for these points.\nTo demonstrate this, the code below creates zion_transect, a straight line going from northwest to southeast of Zion National Park, illustrated in Figure 5.20(A) (see Section 2.2 for a recap on the vector data model):The utility of extracting heights from a linear selector is illustrated by imagining that you are planning a hike.\nThe method demonstrated below provides an ‘elevation profile’ of the route (the line does not need to be straight), useful for estimating how long it will take due to long climbs.The first step is to add a unique id for each transect.\nNext, with the st_segmentize() function we can add points along our line(s) with a provided density (dfMaxLength) and convert them into points with st_cast().Now, we have a large set of points, and we want to derive the distance between the first point in our transects and each of the subsequent points.\nIn this case, we only have one transect, but the code, in principle, should work on any number of transects:Finally, we can extract elevation values for each point in our transects and combine this information with our main object.The resulting zion_transect can be used to create elevation profiles, as illustrated in Figure 5.20(B).\nFIGURE 5.20: Location of a line used for raster extraction (left) and the elevation along this line (right).\nThe final type of geographic vector object for raster extraction is polygons.\nLike lines, polygons tend to return many raster values per polygon.\nThis is demonstrated in the command below, which results in a data frame with column names ID (the row number of the polygon) and srtm (associated elevation values):These results can be used to generate summary statistics for raster values per polygon, for example to characterize a single region or to compare many regions.\nThe generation of summary statistics is demonstrated in the code below, which creates the object zion_srtm_df containing summary statistics for elevation values in Zion National Park (see Figure 5.21(A)):The preceding code chunk used the tidyverse to
provide summary statistics for cell values per polygon ID, as described in Chapter 3.\nThe results provide useful summaries, for example that the maximum height in the park is around 2,661 meters above sea level (other summary statistics, such as standard deviation, can also be calculated in this way).\nBecause there is only one polygon in the example, a data frame with a single row is returned; however, the method also works when multiple selector polygons are used.A similar approach works for counting occurrences of categorical raster values within polygons.\nThis is illustrated with a land cover dataset (nlcd) from the spDataLarge package in Figure 5.21(B), and demonstrated in the code below:\nFIGURE 5.21: Area used for continuous (left) and categorical (right) raster extraction.\n","code":"\ndata(\"zion_points\", package = \"spDataLarge\")\nelevation = terra::extract(srtm, vect(zion_points))\nzion_points = cbind(zion_points, elevation)\nzion_transect = cbind(c(-113.2, -112.9), c(37.45, 37.2)) %>%\n st_linestring() %>% \n st_sfc(crs = crs(srtm)) %>% \n st_sf()\nzion_transect$id = 1:nrow(zion_transect)\nzion_transect = st_segmentize(zion_transect, dfMaxLength = 250)\nzion_transect = st_cast(zion_transect, \"POINT\")\nzion_transect = zion_transect %>% \n group_by(id) %>% \n mutate(dist = st_distance(geometry)[, 1]) \nzion_elev = terra::extract(srtm, vect(zion_transect))\nzion_transect = cbind(zion_transect, zion_elev)\nzion_srtm_values = terra::extract(x = srtm, y = vect(zion))\ngroup_by(zion_srtm_values, ID) %>% \n summarize(across(srtm, list(min = min, mean = mean, max = max)))\n#> # A tibble: 1 × 4\n#> ID srtm_min srtm_mean srtm_max\n#> \n#> 1 1 1122 1818. 
2661\nnlcd = rast(system.file(\"raster/nlcd.tif\", package = \"spDataLarge\"))\nzion2 = st_transform(zion, st_crs(nlcd))\nzion_nlcd = terra::extract(nlcd, vect(zion2))\nzion_nlcd %>% \n group_by(ID, levels) %>%\n count()\n#> # A tibble: 7 × 3\n#> # Groups: ID, levels [7]\n#> ID levels n\n#> \n#> 1 1 2 4205\n#> 2 1 3 98285\n#> 3 1 4 298299\n#> 4 1 5 203701\n#> # … with 3 more rows"},{"path":"geometric-operations.html","id":"rasterization","chapter":"5 Geometry operations","heading":"5.4.3 Rasterization","text":"\nRasterization is the conversion of vector objects into their representation in raster objects.\nUsually, the output raster is used for quantitative analysis (e.g., analysis of terrain) or modeling.\nAs we saw in Chapter 2, the raster data model has some characteristics that make it conducive to certain methods.\nFurthermore, the process of rasterization can help simplify datasets because the resulting values all have the same spatial resolution: rasterization can be seen as a special type of geographic data aggregation.The terra package contains the function rasterize() for doing this work.\nIts first two arguments are, x, a vector object to be rasterized and, y, a ‘template raster’ object defining the extent, resolution and CRS of the output.\nThe geographic resolution of the input raster has a major impact on the results: if it is too low (cell size is too large), the result may miss the full geographic variability of the vector data; if it is too high, computational times may be excessive.\nThere are no simple rules to follow when deciding an appropriate geographic resolution, which is heavily dependent on the intended use of the results.\nOften the target resolution is imposed on the user, for example when the output of rasterization needs to be aligned to an existing raster.To demonstrate rasterization in action, we will use a template raster that has the same extent and CRS as the input vector data cycle_hire_osm_projected (a dataset on cycle hire points in London illustrated in Figure 5.22(A)) and a spatial resolution of 1000 meters:Rasterization is a very flexible operation: the results depend not only on the nature of the template raster, but also on the type of input vector (e.g., points, polygons) and a variety of arguments taken by the rasterize() function.To illustrate this flexibility we will try three different approaches to rasterization.\nFirst, we create a raster representing the presence or absence of cycle hire points (known as
presence/absence rasters).\nIn this case rasterize() requires only one argument in addition to x and y (the aforementioned vector and raster objects): a value to be transferred to all non-empty cells specified in field (results illustrated in Figure 5.22(B)).The fun argument specifies summary statistics used to convert multiple observations in close proximity into associated cells in the raster object.\nBy default fun = \"last\" is used, but other options such as fun = \"length\" can be used, in this case to count the number of cycle hire points in each grid cell (the results of this operation are illustrated in Figure 5.22(C)).The new output, ch_raster2, shows the number of cycle hire points in each grid cell.\nThe cycle hire locations have different numbers of bicycles described by the capacity variable, raising the question, what’s the capacity in each grid cell?\nTo calculate that we must sum the field (\"capacity\"), resulting in the output illustrated in Figure 5.22(D), calculated with the following command (other summary functions such as mean could be used):\nFIGURE 5.22: Examples of point rasterization.\nAnother dataset based on California’s polygons and borders (created below) illustrates rasterization of lines.\nAfter casting the polygon objects into a multilinestring, a template raster is created with a resolution of 0.5 degrees:When considering line or polygon rasterization, one useful additional argument is touches.\nBy default it is FALSE, but when changed to TRUE – all cells that are touched by a line or polygon border get a value.\nLine rasterization with touches = TRUE is demonstrated in the code below (Figure 5.23(A)).Compare it with polygon rasterization, with touches = FALSE by default, which selects only cells whose centroids are inside the selector polygon, as illustrated in Figure 5.23(B).\nFIGURE 5.23: Examples of line and polygon rasterizations.\n","code":"\ncycle_hire_osm_projected = st_transform(cycle_hire_osm, \"EPSG:27700\")\nraster_template = rast(ext(cycle_hire_osm_projected), resolution = 1000,\n crs = st_crs(cycle_hire_osm_projected)$wkt)\nch_raster1 = rasterize(vect(cycle_hire_osm_projected), raster_template,\n field = 1)\nch_raster2 = rasterize(vect(cycle_hire_osm_projected), raster_template, \n fun = \"length\")\nch_raster3 = rasterize(vect(cycle_hire_osm_projected), raster_template, \n field = \"capacity\", fun = sum)\ncalifornia = dplyr::filter(us_states, 
NAME == \"California\")\ncalifornia_borders = st_cast(california, \"MULTILINESTRING\")\nraster_template2 = rast(ext(california), resolution = 0.5,\n crs = st_crs(california)$wkt)\ncalifornia_raster1 = rasterize(vect(california_borders), raster_template2,\n touches = TRUE)\ncalifornia_raster2 = rasterize(vect(california), raster_template2) "},{"path":"geometric-operations.html","id":"spatial-vectorization","chapter":"5 Geometry operations","heading":"5.4.4 Spatial vectorization","text":"\nSpatial vectorization is the counterpart of rasterization (Section 5.4.3), but in the opposite direction.\nIt involves converting spatially continuous raster data into spatially discrete vector data such as points, lines or polygons.The simplest form of vectorization is to convert the centroids of raster cells into points.\nas.points() does exactly this for all non-NA raster grid cells (Figure 5.24).\nNote that here we also used st_as_sf() to convert the resulting object to the sf class.\nFIGURE 5.24: Raster and point representation of the elev object.\nAnother common type of spatial vectorization is the creation of contour lines, representing lines of continuous height or temperatures (isotherms), for example.\nWe will use a real-world digital elevation model (DEM) because the artificial raster elev produces parallel lines (task for the reader: verify this and explain why it happens).\nContour lines can be created with the terra function as.contour(), which is itself a wrapper around filled.contour(), as demonstrated below (not shown):Contours can also be added to existing plots with functions such as contour(), rasterVis::contourplot() or tmap::tm_iso().\nAs illustrated in Figure 5.25, isolines can be labelled.\nFIGURE 5.25: DEM with hillshading, showing the southern flank of Mt.
Mongón overlaid with contour lines.\nThe final type of vectorization involves conversion of rasters to polygons.\nThis can be done with terra::as.polygons(), which converts each raster cell to a polygon consisting of five coordinates, all of which are stored in memory (explaining why rasters are often fast compared with vectors!).This is illustrated below by converting the grain object into polygons and subsequently dissolving borders between polygons with the same attribute values (also see the dissolve argument in as.polygons()).\nFIGURE 5.26: Illustration of vectorization of a raster (left) into polygons (center) and polygon aggregation (right).\n","code":"\nelev_point = as.points(elev) %>% \n st_as_sf()\ndem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\"))\ncl = as.contour(dem)\nplot(dem, axes = FALSE)\nplot(cl, add = TRUE)\n# create hillshade\nhs = shade(slope = terrain(dem, \"slope\", unit = \"radians\"),\n aspect = terrain(dem, \"aspect\", unit = \"radians\"))\nplot(hs, col = gray(0:100 / 100), legend = FALSE)\n# overlay with DEM\nplot(dem, col = terrain.colors(25), alpha = 0.5, legend = FALSE, add = TRUE)\n# add contour lines\ncontour(dem, col = \"white\", add = TRUE)\ngrain = rast(system.file(\"raster/grain.tif\", package = \"spData\"))\ngrain_poly = as.polygons(grain) %>% \n st_as_sf()"},{"path":"geometric-operations.html","id":"exercises-3","chapter":"5 Geometry operations","heading":"5.5 Exercises","text":"Some of the exercises use a vector (zion_points) and raster dataset (srtm) from the spDataLarge package.\nThey also use a polygonal ‘convex hull’ derived from the vector dataset (ch) to represent the area of interest:E1. Generate and plot simplified versions of the nz dataset.\nExperiment with different values of keep (ranging from 0.5 to 0.00005) for ms_simplify() and dTolerance (from 100 to 100,000) for st_simplify().At what value does the form of the result start to break down for each method, making New Zealand unrecognizable?Advanced: What is different about the geometry type of the results from st_simplify() compared with the geometry type of ms_simplify()? What problems does this create and how can this be resolved?E2.
In the first exercise in Chapter Spatial data operations it was established that the Canterbury region had 70 of the 101 highest points in New Zealand.\nUsing st_buffer(), how many points in nz_height are within 100 km of Canterbury?E3. Find the geographic centroid of New Zealand.\nHow far is it from the geographic centroid of Canterbury?E4. Most world maps have a north-up orientation.\nA world map with a south-up orientation could be created by a reflection (one of the affine transformations not mentioned in this chapter) of the world object’s geometry.\nWrite code to do so.\nHint: you need to use a two-element vector for this transformation.\nBonus: create an upside-down map of your country.E5. Subset the point in p that is contained within x and y.Using base subsetting operators.Using an intermediary object created with st_intersection().E6. Calculate the length of the boundary lines of US states in meters.\nWhich state has the longest border and which has the shortest?\nHint: The st_length function computes the length of a LINESTRING or MULTILINESTRING geometry.E7. Crop the srtm raster using (1) the zion_points dataset and (2) the ch dataset.\nAre there any differences in the output maps?\nNext, mask srtm using these two datasets.\nCan you see any difference now?\nHow can you explain that?E8. Firstly, extract values from srtm at the points represented in zion_points.\nNext, extract average values of srtm using a 90 buffer around each point from zion_points and compare these two sets of values.\nWhen would extracting values by buffers be more suitable than by points alone?E9. Subset points higher than 3100 meters in New Zealand (the nz_height object) and create a template raster with a resolution of 3 km for the extent of the new point dataset.\nUsing these two new objects:Count numbers of the highest points in each grid cell.Find the maximum elevation in each grid cell.E10. Aggregate the raster counting high points in New Zealand (created in the previous exercise), reduce its geographic resolution by half (so cells are 6 by 6 km) and plot the result.Resample the lower resolution raster back to the original resolution of 3 km. How have the results changed?Name two advantages and disadvantages of reducing raster resolution.E11.
Polygonize the grain dataset and filter all squares representing clay.Name two advantages and disadvantages of vector data over raster data.When would it be useful to convert rasters to vectors in your work?","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(spData)\nzion_points = read_sf(system.file(\"vector/zion_points.gpkg\", package = \"spDataLarge\"))\nsrtm = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\nch = st_combine(zion_points) %>%\n st_convex_hull() %>% \n st_as_sf()"},{"path":"reproj-geo-data.html","id":"reproj-geo-data","chapter":"6 Reprojecting geographic data","heading":"6 Reprojecting geographic data","text":"","code":""},{"path":"reproj-geo-data.html","id":"prerequisites-4","chapter":"6 Reprojecting geographic data","heading":"Prerequisites","text":"This chapter requires the following packages (lwgeom is also used, but does not need to be attached):","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(spDataLarge)"},{"path":"reproj-geo-data.html","id":"introduction-3","chapter":"6 Reprojecting geographic data","heading":"6.1 Introduction","text":"Section 2.4 introduced coordinate reference systems (CRSs) and demonstrated their importance.\nThis chapter goes further.\nIt highlights issues that can arise when using inappropriate CRSs and how to transform data from one CRS to another.\n\nAs illustrated in Figure 2.1, there are two types of CRSs: geographic (‘lon/lat,’ with units in degrees of longitude and latitude) and projected (typically with units of meters from a datum).\nThis has consequences.\n\n\n\nTo check if data has a geographic CRS, we can use sf::st_is_longlat() for vector data and terra::is.lonlat() for raster data.\nThere are cases when the CRS is unknown, as shown below using the example of London introduced in Section 2.2:This shows that unless a CRS is manually specified or is loaded from a source that has CRS metadata, the CRS is NA.\nA CRS can be added to sf objects with st_set_crs() as follows:29Datasets without a specified CRS can cause problems.The subsequent sections go into more depth, exploring which CRS to use and the details of reprojecting vector and raster objects.","code":"\nlondon = data.frame(lon = -0.1, lat = 51.5) %>% \n st_as_sf(coords = c(\"lon\", 
\"lat\"))\nst_is_longlat(london)\n#> [1] NA\nlondon_geo = st_set_crs(london, \"EPSG:4326\")\nst_is_longlat(london_geo)\n#> [1] TRUE\nlondon_proj = data.frame(x = 530000, y = 180000) %>% \n st_as_sf(coords = 1:2, crs = \"EPSG:27700\")"},{"path":"reproj-geo-data.html","id":"when-to-reproject","chapter":"6 Reprojecting geographic data","heading":"6.2 When to reproject?","text":"\nThe previous section showed how to set the CRS manually, with st_set_crs(london, \"EPSG:4326\").\nIn real world applications, however, CRSs are usually set automatically when data is read in.\nThe main task involving CRSs is often to transform objects from one CRS into another.\nBut when should data be transformed?\nAnd into which CRS?\nThere are no clear-cut answers to these questions and CRS selection always involves trade-offs (Maling 1992).\nHowever, there are some general principles provided in this section that can help you decide.First it’s worth considering when to transform.\n\n\nIn some cases transformation to a projected CRS is essential, such as when using geometric functions such as st_buffer(), as Figure ?? showed.\nConversely, publishing data online with the leaflet package may require a geographic CRS.\nAnother case is when two objects with different CRSs must be compared or combined, as shown when we try to find the distance between two objects with different CRSs:To make the london and london_proj objects geographically comparable, one of them must be transformed into the CRS of the other.\nBut which CRS to use?\nThe answer is usually ‘the projected CRS,’ which in this case is the British National Grid (EPSG:27700):Now that a transformed version of london has been created, using the sf function st_transform(), the distance between the two representations of London can be found.\nIt may come as a surprise that london and london2 are just over 2 km apart!30","code":"\nst_distance(london_geo, london_proj)\n# > Error: st_crs(x) == st_crs(y) is not TRUE\nlondon2 = st_transform(london_geo, \"EPSG:27700\")\nst_distance(london2, london_proj)\n#> Units: [m]\n#> [,1]\n#> [1,] 2018"},{"path":"reproj-geo-data.html","id":"which-crs-to-use","chapter":"6 Reprojecting geographic data","heading":"6.3 Which CRS to use?","text":"\n\nThe question of which CRS to use is tricky, and there is rarely a ‘right’ answer:\n“There exist no all-purpose projections, all involve distortion when far from the center of the specified frame” (R.
Bivand, Pebesma, and Gómez-Rubio 2013).For geographic CRSs, the answer is often WGS84, not only for web mapping, but also because GPS datasets and thousands of raster and vector datasets are provided in this CRS by default.\nWGS84 is the most common CRS in the world, so it is worth knowing its EPSG code: 4326.\nThis ‘magic number’ can be used to convert objects with unusual projected CRSs into something that is widely understood.What about when a projected CRS is required?\nIn some cases, it is not something that we are free to decide:\n“often the choice of projection is made by a public mapping agency” (R. Bivand, Pebesma, and Gómez-Rubio 2013).\nThis means that when working with local data sources, it is likely preferable to work with the CRS in which the data was provided, to ensure compatibility, even if the official CRS is not the most accurate.\nThe example of London was easy to answer because (a) the British National Grid (with its associated EPSG code 27700) is well known and (b) the original dataset (london) already had that CRS.In cases where an appropriate CRS is not immediately clear, the choice of CRS should depend on the properties that are most important to preserve in the subsequent maps and analysis.\nAll CRSs are either equal-area, equidistant, conformal (with shapes remaining unchanged), or some combination of compromises of those (section 2.4.2).\nCustom CRSs with local parameters can be created for a region of interest, and multiple CRSs can be used in projects when no single CRS suits all tasks.\n‘Geodesic calculations’ can provide a fall-back if no CRSs are appropriate (see proj.org/geodesic.html).\nRegardless of the projected CRS used, the results may not be accurate for geometries covering hundreds of kilometers.When deciding on a custom CRS, we recommend the following:31\n\n\n\nA Lambert azimuthal equal-area (LAEA) projection for a custom local projection (set lon_0 and lat_0 to the center of the study area), which is an equal-area projection at all locations but distorts shapes beyond thousands of kilometersAzimuthal equidistant (AEQD) projections for a specifically accurate straight-line distance between a point and the center point of the local projectionLambert conformal conic (LCC) projections for regions covering thousands of kilometers, with the cone set to keep distance and area properties reasonable between the secant linesStereographic (STERE) projections for polar regions, but taking care not to rely on area and distance calculations thousands of kilometers from the centerOne possible approach to automatically select a projected CRS specific to a local dataset is to create an azimuthal equidistant (AEQD) projection for
the center-point of the study area.\nThis involves creating a custom CRS (with no EPSG code) with units of meters based on the center point of a dataset.\nThis approach should be used with caution: no other datasets will be compatible with the custom CRS created, and results may not be accurate when used on extensive datasets covering hundreds of kilometers.A commonly used default is Universal Transverse Mercator (UTM), a set of CRSs that divides the Earth into 60 longitudinal wedges and 20 latitudinal segments.\nThe transverse Mercator projection used by UTM CRSs is conformal but distorts areas and distances with increasing severity with distance from the center of the UTM zone.\nDocumentation from the GIS software Manifold therefore suggests restricting the longitudinal extent of projects using UTM zones to 6 degrees from the central meridian (source: manifold.net).Almost every place on Earth has a UTM code, such as “60H” which refers to northern New Zealand where R was invented.\nUTM EPSG codes run sequentially from 32601 to 32660 for northern hemisphere locations and from 32701 to 32760 for southern hemisphere locations.To show how the system works, let’s create a function, lonlat2UTM(), to calculate the EPSG code associated with any point on the planet as follows:The following command uses this function to identify the UTM zone and associated EPSG code for Auckland and London:Maps of UTM zones such as that provided by dmap.co.uk confirm that London is in UTM zone 30U.The principles outlined in this section apply equally to vector and raster datasets.\nSome features of CRS transformation are however unique to each geographic data model.\nWe will cover the particularities of vector data transformation in Section 6.4 and raster transformation in Section 6.6.","code":"\nlonlat2UTM = function(lonlat) {\n utm = (floor((lonlat[1] + 180) / 6) %% 60) + 1\n if(lonlat[2] > 0) {\n utm + 32600\n } else{\n utm + 32700\n }\n}\nepsg_utm_auk = lonlat2UTM(c(174.7, -36.9))\nepsg_utm_lnd = lonlat2UTM(st_coordinates(london))\nst_crs(epsg_utm_auk)$proj4string\n#> [1] \"+proj=utm +zone=60 +south +datum=WGS84 +units=m +no_defs\"\nst_crs(epsg_utm_lnd)$proj4string\n#> [1] \"+proj=utm +zone=30 +datum=WGS84 +units=m +no_defs\""},{"path":"reproj-geo-data.html","id":"reproj-vec-geom","chapter":"6 Reprojecting geographic data","heading":"6.4 Reprojecting vector geometries","text":"\n\nChapter 2 demonstrated how vector
geometries are made up of points, and points form the basis of more complex objects such as lines and polygons.\nReprojecting vectors thus consists of transforming the coordinates of these points.\nThis is illustrated by cycle_hire_osm, an sf object from spData that represents cycle hire locations across London.\nThe previous section showed how the CRS of vector data can be queried with st_crs().\n\n\n\n","code":""},{"path":"reproj-geo-data.html","id":"modifying-map-projections","chapter":"6 Reprojecting geographic data","heading":"6.5 Modifying map projections","text":"More information on CRS modifications can be found in the Using PROJ documentation.","code":""},{"path":"reproj-geo-data.html","id":"reprojecting-raster-geometries","chapter":"6 Reprojecting geographic data","heading":"6.6 Reprojecting raster geometries","text":"\n\n\n\nThe projection concepts described in the previous section apply equally to rasters.\nHowever, there are important differences in the reprojection of vectors and rasters:\ntransforming a vector object involves changing the coordinates of every vertex, but this does not apply to raster data.\nRasters are composed of rectangular cells of the same size (expressed by map units, such as degrees or meters), so it is usually impracticable to transform the coordinates of pixels separately.\nRaster reprojection involves creating a new raster object, often with a different number of columns and rows than the original.\nThe attributes must subsequently be re-estimated, allowing the new pixels to be ‘filled’ with appropriate values.\nIn other words, raster reprojection can be thought of as two separate spatial operations: a vector reprojection of the raster extent to another CRS (Section 6.4), and computation of new pixel values through resampling (Section 5.3.4).\nThus in most cases when both raster and vector data are used, it is better to avoid reprojecting rasters and reproject vectors instead.The raster reprojection process is done with project() from the terra package.\nLike the st_transform() function demonstrated in the previous section, project() takes a geographic object (a raster dataset in this case) and a CRS representation as its second argument.\nOn a side note – the second argument can also be an existing raster object with a different CRS.Let’s take a look at two examples of raster transformation: using categorical and continuous data.\nLand cover data are usually represented by
categorical maps.\nThe nlcd.tif file provides information for a small area in Utah, USA obtained from the National Land Cover Database 2011 in the NAD83 / UTM zone 12N CRS.In this region, 8 land cover classes were distinguished (a full list of NLCD2011 land cover classes can be found at mrlc.gov):When reprojecting categorical rasters, the estimated values must be the same as those of the original.\nThis could be done using the nearest neighbor method (near), which sets each new cell value to the value of the nearest cell (center) of the input raster.\nAn example is reprojecting cat_raster to WGS84, a geographic CRS well suited for web mapping.\nThe first step is to obtain the PROJ definition of this CRS, which can be done, for example, using the http://spatialreference.org webpage.\nThe final step is to reproject the raster with the project() function which, in the case of categorical data, uses the nearest neighbor method (near):Many properties of the new object differ from the previous one, including the number of columns and rows (and therefore number of cells), resolution (transformed from meters into degrees), and extent, as illustrated in Table 6.1 (note that the number of categories increases from 8 to 9 because of the addition of NA values, not because a new category has been created — the land cover classes are preserved).TABLE 6.1: Key attributes in the original (‘cat_raster’) and projected (‘cat_raster_wgs84’) categorical raster datasets.Reprojecting numeric rasters (with numeric or in this case integer values) follows an almost identical procedure.\nThis is demonstrated below with srtm.tif in spDataLarge from the Shuttle Radar Topography Mission (SRTM), which represents height in meters above sea level (elevation) with the WGS84 CRS:We will reproject this dataset into a projected CRS, but not with the nearest neighbor method which is appropriate for categorical data.\nInstead, we will use the bilinear method which computes the output cell value based on the four nearest cells in the original raster.32\nThe values in the projected dataset are the distance-weighted average of the values from these four cells:\nthe closer the input cell is to the center of the output cell, the greater its weight.\nThe following commands create a text string representing WGS 84 / UTM zone 12N, and reproject the raster into this CRS, using the bilinear method:Raster reprojection on numeric variables also leads to small changes in values and spatial properties, such as the number of cells, resolution, and extent.\nThese changes are demonstrated in Table 6.233:TABLE 6.2: Key attributes in the original (‘con_raster’) and projected (‘con_raster_ea’)
continuous raster datasets.There is more to learn about CRSs.\nAn excellent resource in this area, also implemented in R, is the website R Spatial.\nChapter 6 of this free online book is recommended reading — see: rspatial.org/terra/spatial/6-crs.html","code":"\ncat_raster = rast(system.file(\"raster/nlcd.tif\", package = \"spDataLarge\"))\ncrs(cat_raster)\n#> [1] \"PROJCRS[\\\"NAD83 / UTM zone 12N\\\",\\n BASEGEOGCRS[\\\"NAD83\\\",\\n DATUM[\\\"North American Datum 1983\\\",\\n ELLIPSOID[\\\"GRS 1980\\\",6378137,298.257222101,\\n LENGTHUNIT[\\\"metre\\\",1]]],\\n PRIMEM[\\\"Greenwich\\\",0,\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433]],\\n ID[\\\"EPSG\\\",4269]],\\n CONVERSION[\\\"UTM zone 12N\\\",\\n METHOD[\\\"Transverse Mercator\\\",\\n ID[\\\"EPSG\\\",9807]],\\n PARAMETER[\\\"Latitude of natural origin\\\",0,\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433],\\n ID[\\\"EPSG\\\",8801]],\\n PARAMETER[\\\"Longitude of natural origin\\\",-111,\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433],\\n ID[\\\"EPSG\\\",8802]],\\n PARAMETER[\\\"Scale factor at natural origin\\\",0.9996,\\n SCALEUNIT[\\\"unity\\\",1],\\n ID[\\\"EPSG\\\",8805]],\\n PARAMETER[\\\"False easting\\\",500000,\\n LENGTHUNIT[\\\"metre\\\",1],\\n ID[\\\"EPSG\\\",8806]],\\n PARAMETER[\\\"False northing\\\",0,\\n LENGTHUNIT[\\\"metre\\\",1],\\n ID[\\\"EPSG\\\",8807]]],\\n CS[Cartesian,2],\\n AXIS[\\\"(E)\\\",east,\\n ORDER[1],\\n LENGTHUNIT[\\\"metre\\\",1]],\\n AXIS[\\\"(N)\\\",north,\\n ORDER[2],\\n LENGTHUNIT[\\\"metre\\\",1]],\\n USAGE[\\n SCOPE[\\\"unknown\\\"],\\n AREA[\\\"North America - 114°W to 108°W and NAD83 by country\\\"],\\n BBOX[31.33,-114,84,-108]],\\n ID[\\\"EPSG\\\",26912]]\"\nunique(cat_raster)\n#> levels\n#> 1 1\n#> 2 2\n#> 3 3\n#> 4 4\n#> 5 5\n#> 6 6\n#> 7 7\n#> 8 8\ncat_raster_wgs84 = project(cat_raster, \"EPSG:4326\", method = \"near\")\ncon_raster = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\ncrs(con_raster)\n#> [1] \"GEOGCRS[\\\"WGS 84\\\",\\n DATUM[\\\"World Geodetic System 1984\\\",\\n ELLIPSOID[\\\"WGS 
84\\\",6378137,298.257223563,\\n LENGTHUNIT[\\\"metre\\\",1]]],\\n PRIMEM[\\\"Greenwich\\\",0,\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433]],\\n CS[ellipsoidal,2],\\n AXIS[\\\"geodetic latitude (Lat)\\\",north,\\n ORDER[1],\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433]],\\n AXIS[\\\"geodetic longitude (Lon)\\\",east,\\n ORDER[2],\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433]],\\n ID[\\\"EPSG\\\",4326]]\"\ncon_raster_ea = project(con_raster, \"EPSG:32612\", method = \"bilinear\")\ncrs(con_raster_ea)\n#> [1] \"PROJCRS[\\\"WGS 84 / UTM zone 12N\\\",\\n BASEGEOGCRS[\\\"WGS 84\\\",\\n DATUM[\\\"World Geodetic System 1984\\\",\\n ELLIPSOID[\\\"WGS 84\\\",6378137,298.257223563,\\n LENGTHUNIT[\\\"metre\\\",1]]],\\n PRIMEM[\\\"Greenwich\\\",0,\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433]],\\n ID[\\\"EPSG\\\",4326]],\\n CONVERSION[\\\"UTM zone 12N\\\",\\n METHOD[\\\"Transverse Mercator\\\",\\n ID[\\\"EPSG\\\",9807]],\\n PARAMETER[\\\"Latitude of natural origin\\\",0,\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433],\\n ID[\\\"EPSG\\\",8801]],\\n PARAMETER[\\\"Longitude of natural origin\\\",-111,\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433],\\n ID[\\\"EPSG\\\",8802]],\\n PARAMETER[\\\"Scale factor at natural origin\\\",0.9996,\\n SCALEUNIT[\\\"unity\\\",1],\\n ID[\\\"EPSG\\\",8805]],\\n PARAMETER[\\\"False easting\\\",500000,\\n LENGTHUNIT[\\\"metre\\\",1],\\n ID[\\\"EPSG\\\",8806]],\\n PARAMETER[\\\"False northing\\\",0,\\n LENGTHUNIT[\\\"metre\\\",1],\\n ID[\\\"EPSG\\\",8807]]],\\n CS[Cartesian,2],\\n AXIS[\\\"(E)\\\",east,\\n ORDER[1],\\n LENGTHUNIT[\\\"metre\\\",1]],\\n AXIS[\\\"(N)\\\",north,\\n ORDER[2],\\n LENGTHUNIT[\\\"metre\\\",1]],\\n USAGE[\\n SCOPE[\\\"unknown\\\"],\\n AREA[\\\"World - N hemisphere - 114°W to 108°W - by country\\\"],\\n BBOX[0,-114,84,-108]],\\n ID[\\\"EPSG\\\",32612]]\""},{"path":"reproj-geo-data.html","id":"exercises-4","chapter":"6 Reprojecting geographic data","heading":"6.7 Exercises","text":"E1. 
Create a new object called nz_wgs by transforming the nz object into the WGS84 CRS.\nCreate an object of class crs for use in CRS queries.\nWith reference to the bounding box of each object, what units does each CRS use?\nRemove the CRS from nz_wgs and plot the result: what is wrong with this map of New Zealand and why?E2. Transform the world dataset to the transverse Mercator projection (\"+proj=tmerc\") and plot the result.\nWhat has changed and why?\nTry to transform it back into WGS 84 and plot the new object.\nWhy does the new object differ from the original one?E3. Transform the continuous raster (con_raster) into NAD83 / UTM zone 12N using the nearest neighbor interpolation method.\nWhat has changed?\nHow does it influence the results?E4. Transform the categorical raster (cat_raster) into WGS 84 using the bilinear interpolation method.\nWhat has changed?\nHow does it influence the results?","code":""},{"path":"read-write.html","id":"read-write","chapter":"7 Geographic data I/O","heading":"7 Geographic data I/O","text":"","code":""},{"path":"read-write.html","id":"prerequisites-5","chapter":"7 Geographic data I/O","heading":"Prerequisites","text":"This chapter requires the following packages:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'"},{"path":"read-write.html","id":"introduction-4","chapter":"7 Geographic data I/O","heading":"7.1 Introduction","text":"This chapter is about reading and writing geographic data.\nGeographic data import is essential for geocomputation: real-world applications are impossible without data.\nFor others to benefit from the results of your work, data output is also vital.\nTaken together, we refer to these processes as I/O, short for input/output.Geographic data I/O is almost always part of a wider process.\nIt depends on knowing which datasets are available, where they can be found and how to retrieve them.\nThese topics are covered in Section 7.2, which describes various geoportals, which collectively contain many terabytes of data, and how to use them.\nTo ease data access, a number of packages for downloading geographic data have been developed, as described in Section 7.3.There are many geographic file formats, each of which has pros and cons, as described in Section 7.5.\nThe processes of efficiently reading and writing these file formats are covered in Sections 7.6 and 7.7, respectively.\nThe final Section 7.8 demonstrates methods for saving visual outputs (maps), in preparation for Chapter 8 on visualization.","code":""},{"path":"read-write.html","id":"retrieving-data","chapter":"7 Geographic data I/O","heading":"7.2 Retrieving open data","text":"\nA vast and ever-increasing amount of geographic data is available on the internet, much of which is free to access and use (with appropriate credit given to its providers).\nIn some ways there is now too much data, in the sense that there are often multiple places to access the same dataset.\nSome datasets are of poor quality.\nIn this context, it is vital to know where to look, so this first section covers some of the most important sources.\nVarious ‘geoportals’ (web services providing geospatial datasets, such as Data.gov) are a good place to start, providing a wide range of data but often only for specific locations (as illustrated in the continually updated Wikipedia page on the topic).\nSome global geoportals overcome this issue.\nThe GEOSS portal and the Copernicus Open Access Hub, for example, contain many raster datasets with global coverage.\nA wealth of vector datasets can be accessed from the SEDAC portal, run by the National Aeronautics and Space Administration (NASA), and the European Union’s INSPIRE geoportal, with global and regional coverage.Most geoportals provide a graphical interface allowing datasets to be queried based on characteristics such as spatial and temporal extent, the United States Geological Survey’s EarthExplorer being a prime example.\nExploring datasets interactively in a browser is an effective way of understanding the available layers.\nDownloading data is best done with code, however, from reproducibility and efficiency perspectives.\nDownloads can be initiated from the command line using a variety of techniques, primarily via URLs and APIs (see the Sentinel API for example).\nFiles hosted on static URLs can be downloaded with download.file(), as illustrated in the code chunk below, which accesses the US National Parks data from catalog.data.gov/dataset/national-parks:","code":"\ndownload.file(url = \"https://irma.nps.gov/DataStore/DownloadFile/666527\",\n destfile = \"nps_boundary.zip\")\nunzip(zipfile = \"nps_boundary.zip\")\nusa_parks = read_sf(dsn = \"nps_boundary.shp\")"},{"path":"read-write.html","id":"geographic-data-packages","chapter":"7 Geographic data I/O","heading":"7.3 Geographic data packages","text":"\nMany R packages have been developed for accessing geographic data, some of which are presented in Table 7.1.\nThese provide 
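The download-then-read workflow of Section 7.2 can be sketched as follows. This is a minimal, hedged sketch: it reuses the National Parks URL and file names from the example above, and adds two defensive steps (binary download mode and inspecting layers before reading) that are not in the original chunk.

```r
# Minimal sketch of the download-then-read workflow from Section 7.2.
# The URL and file names follow the National Parks example above.
library(sf)

u = "https://irma.nps.gov/DataStore/DownloadFile/666527"
f = "nps_boundary.zip"
if (!file.exists(f)) {
  download.file(url = u, destfile = f, mode = "wb")  # binary mode for zips
}
unzip(zipfile = f)
st_layers("nps_boundary.shp")            # inspect available layers first
usa_parks = read_sf(dsn = "nps_boundary.shp")
```

Checking `st_layers()` before `read_sf()` is optional for a single-layer shapefile, but becomes useful for multilayer sources such as GeoPackage files.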
interfaces to one or more spatial libraries or geoportals and aim to make data access even quicker from the command line.\nTABLE 7.1: Selected R packages for geographic data retrieval.\nIt should be emphasized that Table 7.1 represents only a small number of the available geographic data packages.\n\n\nOther notable packages include GSODR, which provides Global Summary Daily Weather Data in R (see the package’s README for an overview of weather data sources);\ntidycensus and tigris, which provide socio-demographic vector data for the USA; and hddtools, which provides access to a range of hydrological datasets.Each data package has its own syntax for accessing data.\nThis diversity is demonstrated in the subsequent code chunks, which show how to get data using three packages from Table 7.1.\nCountry borders are often useful and these can be accessed with the ne_countries() function from the rnaturalearth package as follows:By default, rnaturalearth returns objects of class Spatial*.\nThe result can be converted into sf objects with st_as_sf() as follows:The second example downloads a series of rasters containing global monthly precipitation sums with a spatial resolution of ten minutes (~18.5 km at the equator) using the geodata package.\nThe result is a multilayer object of class SpatRaster.The third example uses the osmdata package (Padgham et al. 2018) to find parks from the OpenStreetMap (OSM) database.\nAs illustrated in the code chunk below, queries begin with the function opq() (short for OpenStreetMap query), the first argument of which is a bounding box, or a text string representing a bounding box (the city of Leeds in this case).\nThe result is passed to a function for selecting which OSM elements we’re interested in (parks in this case), represented by key-value pairs.\nNext, these are passed to the function osmdata_sf() which does the work of downloading the data and converting it into a list of sf objects (see vignette('osmdata') for further details):A limitation of the osmdata package is that it is rate limited, meaning that it cannot download large OSM datasets (e.g. all the OSM data for a large city).\nTo overcome this limitation, the osmextract package was developed, which can be used to download and import binary .pbf files containing compressed versions of the OSM database for pre-defined regions.\nOpenStreetMap is a vast global database of crowd-sourced data, which is growing daily, and there is a wider ecosystem of tools enabling easy access to the data, from the Overpass turbo web service for rapid development and testing of OSM queries to osm2pgsql for importing the data into a PostGIS database.\nAlthough the quality of datasets derived from OSM varies, the data source and the wider OSM ecosystem have many advantages: they provide datasets that are available globally, free of charge, and constantly improving thanks to an army of volunteers.\nUsing OSM encourages ‘citizen science’ and contributions back to the digital commons (you can start editing data representing a part of the world you know well at www.openstreetmap.org).\nFurther examples of OSM data in action are provided in Chapters 9, 12 and 13.Sometimes, packages come with built-in datasets.\nThese can be accessed in four ways: by attaching the package (if the package uses ‘lazy loading’ as spData does), with data(dataset, package = mypackage), by referring to the dataset with mypackage::dataset, or with system.file(filepath, package = mypackage) to access raw data files.\nThe following code chunk illustrates the latter two options using the world dataset (already loaded by attaching its parent package with library(spData)):34The last example, system.file(\"shapes/world.gpkg\", package = \"spData\"), returns the path to the world.gpkg file, which is stored inside the \"shapes/\" folder of the spData package.","code":"\nlibrary(rnaturalearth)\nusa = ne_countries(country = \"United States of America\") # United States borders\nclass(usa)\n#> [1] \"SpatialPolygonsDataFrame\"\n#> attr(,\"package\")\n#> [1] \"sp\"\n# alternative way of accessing the data, with geodata\n# geodata::gadm(\"USA\", level = 0, path = tempdir())\nusa_sf = st_as_sf(usa)\nlibrary(geodata)\nworldclim_prec = worldclim_global(\"prec\", res = 10, path = tempdir())\nclass(worldclim_prec)\n#> [1] \"SpatRaster\"\n#> attr(,\"package\")\n#> [1] \"terra\"\nlibrary(osmdata)\nparks = opq(bbox = \"leeds uk\") %>% \n add_osm_feature(key = \"leisure\", value = \"park\") %>% \n osmdata_sf()\nworld2 = spData::world\nworld3 = read_sf(system.file(\"shapes/world.gpkg\", package = \"spData\"))"},{"path":"read-write.html","id":"geographic-web-services","chapter":"7 Geographic data I/O","heading":"7.4 Geographic web services","text":"\nIn an effort to standardize web APIs for accessing spatial data, the Open Geospatial Consortium (OGC) created a number of specifications for web services (collectively known as OWS, short for OGC Web Services).\nThese specifications include the Web Feature Service (WFS), Web Map Service (WMS), Web Map Tile Service (WMTS), Web Coverage Service (WCS) and even a Web Processing Service (WPS).\nMap servers such as PostGIS have adopted these protocols, leading to standardization of queries.\nLike other web APIs, OWS APIs use a ‘base URL,’ an ‘endpoint’ and ‘URL query arguments’ following a ? to request data (see the best-practices-api-packages vignette in the httr package).There are many requests that can be made to an OWS service.\nOne of the most fundamental is getCapabilities, demonstrated with httr below.\nThe code chunk demonstrates how API queries can be constructed and dispatched, in this case to discover the capabilities of a service run by the Food and Agriculture Organization of the United Nations (FAO):The code chunk above demonstrates how API requests can be constructed programmatically with the GET() function, which takes a base URL and a list of query parameters that can easily be extended.\nThe result of the request is saved in res, an object of class response defined in the httr package, which is a list containing information about the request, including the URL.\nAs can be seen by executing browseURL(res$url), the results can also be read directly in a browser.\nOne way of extracting the contents of the request is as follows:Data can be downloaded from WFS services with a GetFeature request and a specific typeName (as illustrated in the code chunk below).Available names differ depending on the web feature service being accessed.\nOne can extract them programmatically using web technologies (Nolan and Lang 2014) or by scrolling manually through the contents of the GetCapabilities output in a browser.Note the use of write_disk() to ensure that the results are written to disk rather than loaded into memory, allowing them to be imported with sf.\nThis example shows how to gain low-level access to web services using httr, which can be useful for understanding how web services work.\nFor many everyday tasks, however, 
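When a driver or web-service request misbehaves, a useful first diagnostic is to check which versions of the underlying C/C++ libraries your sf installation is linked against. This is a hedged aside using only documented sf helpers:

```r
# Check the GEOS/GDAL/PROJ versions that sf was built against -- a common
# first step when a format driver or web-service request fails unexpectedly.
library(sf)
sf_extSoftVersion()     # named vector including GEOS, GDAL and PROJ versions
st_drivers()[1:3, 1:2]  # first few GDAL drivers (name, long_name)
```

Knowing the linked GDAL version matters because driver availability and behavior (for example, which open options a format supports) varies between GDAL releases.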
a higher-level interface may be more appropriate, and a number of R packages and tutorials have been developed precisely for this purpose.Packages ows4R, rwfs and sos4R have been developed for working with OWS services in general, WFS, and the sensor observation service (SOS), respectively.\nAs of October 2018, only ows4R was on CRAN.\nThe package’s basic functionality is demonstrated below, in commands that get all FAO_AREAS as in the previous code chunk:35There is much to learn about web services and much potential for the development of R-OWS interfaces, an active area of development.\nFor further information on the topic, we recommend the examples of the European Centre for Medium-Range Weather Forecasts (ECMWF) services at github.com/OpenDataHack and reading about OCG Web Services at opengeospatial.org.","code":"\nbase_url = \"http://www.fao.org\"\nendpoint = \"/figis/geoserver/wfs\"\nq = list(request = \"GetCapabilities\")\nres = httr::GET(url = httr::modify_url(base_url, path = endpoint), query = q)\nres$url\n#> [1] \"https://www.fao.org/figis/geoserver/wfs?request=GetCapabilities\"\ntxt = httr::content(res, \"text\")\nxml = xml2::read_xml(txt)\nxml\n#> {xml_document} ...\n#> [1] \\n GeoServer WFS...\n#> [2] \\n UN-FAO Fishe...\n#> ...\nqf = list(request = \"GetFeature\", typeName = \"area:FAO_AREAS\")\nfile = tempfile(fileext = \".gml\")\nhttr::GET(url = base_url, path = endpoint, query = qf, httr::write_disk(file))\nfao_areas = read_sf(file)\nlibrary(ows4R)\nwfs = WFSClient$new(\"http://www.fao.org/figis/geoserver/wfs\",\n serviceVersion = \"1.0.0\", logger = \"INFO\")\nfao_areas = wfs$getFeatures(\"area:FAO_AREAS\")"},{"path":"read-write.html","id":"file-formats","chapter":"7 Geographic data I/O","heading":"7.5 File formats","text":"\nGeographic datasets are usually stored as files or in spatial databases.\nFile formats can either store vector or raster data, while spatial databases such as PostGIS can store both (see also Section 9.6.2).\nToday the variety of file formats may seem bewildering, but there has been much consolidation and standardization since the beginnings of GIS software in the 1960s, when the first widely distributed program (SYMAP) for spatial analysis was created at Harvard University (Coppock and Rhind 1991).\nGDAL (pronounced “goo-dal,” with the double 
“o” being a reference to object-orientation), the Geospatial Data Abstraction Library, has resolved many issues associated with incompatibility between geographic file formats since its release in 2000.\nGDAL provides a unified and high-performance interface for reading and writing many raster and vector data formats.36\nMany open and proprietary GIS programs, including GRASS, ArcGIS and QGIS, use GDAL behind their GUIs for doing the legwork of ingesting and spitting out geographic data in appropriate formats.GDAL provides access to more than 200 vector and raster data formats.\nTable 7.2 presents some basic information about selected and often used spatial file formats.\nTABLE 7.2: Selected spatial file formats.\n\nAn important development in ensuring the standardization and open-sourcing of file formats was the founding of the Open Geospatial Consortium (OGC) in 1994.\nBeyond defining the simple features data model (see Section 2.2.1), the OGC also coordinates the development of open standards, for example as used in file formats such as KML and GeoPackage.\nOpen file formats of the kind endorsed by the OGC have several advantages over proprietary formats: the standards are published, they ensure transparency, and they open up the possibility for users to further develop and adjust the file formats to their specific needs.The ESRI Shapefile is the most popular vector data exchange format; however, it is not an open format (though its specification is open).\nIt was developed in the early 1990s and has a number of limitations.\nFirst of all, it is a multi-file format, which consists of at least three files.\nIt only supports 255 columns, column names are restricted to ten characters and the file size limit is 2 GB.\nFurthermore, the ESRI Shapefile does not support all possible geometry types; for example, it is unable to distinguish between a polygon and a multipolygon.37\nDespite these limitations, a viable alternative had been missing for a long time.\nIn the meantime, GeoPackage emerged, and seems to be a more than suitable replacement candidate for the ESRI Shapefile.\nGeoPackage is a format for exchanging geospatial information and an OGC standard.\nThe GeoPackage standard describes the rules on how to store geospatial information in a tiny SQLite container.\nHence, GeoPackage is a lightweight spatial database container, which allows the storage of vector and raster data but also of non-spatial data and extensions.\nAside from GeoPackage, there are other geospatial data exchange formats worth checking out (Table 7.2).\nThe GeoTIFF format seems to be the most prominent raster data format.\nIt allows spatial information, such as the CRS, to be embedded within a TIFF file.\nSimilar to the ESRI Shapefile, this format was first developed in the 1990s, but as an open format.\nAdditionally, GeoTIFF is still being expanded and improved.\nOne of the most significant recent additions to the GeoTIFF format is its variant called COG (Cloud Optimized GeoTIFF).\nRaster objects saved as COGs can be hosted on HTTP servers, so other people can read only parts of the file without downloading the whole file.","code":""},{"path":"read-write.html","id":"data-input","chapter":"7 Geographic data I/O","heading":"7.6 Data input (I)","text":"Executing commands such as sf::read_sf() (the main function we use for loading vector data) or terra::rast() (the main function used for loading raster data) silently sets off a chain of events that reads data from files.\nMoreover, there are many R packages containing a wide range of geographic data or providing simple access to different data sources.\nAll of them load the data into R or, more precisely, assign objects to your workspace, stored in RAM and accessible from the .GlobalEnv of the R session.","code":""},{"path":"read-write.html","id":"iovec","chapter":"7 Geographic data I/O","heading":"7.6.1 Vector data","text":"\nSpatial vector data comes in a wide variety of file formats.\nMost popular representations such as .geojson and .gpkg files can be imported directly into R with the sf function read_sf() (or the equivalent st_read()), which uses GDAL’s vector drivers behind the scenes.\nst_drivers() returns a data frame containing name and long_name in the first two columns, and features of each driver available to GDAL (and therefore sf), including the ability to write data and store raster data, in the subsequent columns, as illustrated for key file formats in Table 7.3.\nThe following commands show the first three drivers reported on the computer’s GDAL installation (results can vary depending on the GDAL version installed) and a summary of their features.\nNote that the majority of drivers can write data (51 out of 87) while only 16 formats can efficiently represent raster data in addition to vector data (see ?st_drivers() for details):\nTABLE 7.3: Popular drivers/formats for reading/writing vector data.\nThe first argument of read_sf() is dsn, which should be a text string or an object containing a single text string.\nThe content of the text string can vary between different drivers.\nIn most cases, as with the ESRI 
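The Shapefile-versus-GeoPackage trade-off described above can be made concrete by writing the same layer to both formats and listing the files produced. This is a hedged sketch (file names are illustrative; it assumes the spData package is installed):

```r
# Write the same layer as GeoPackage (single file) and Shapefile (multi-file)
# to illustrate the format differences discussed in Section 7.5.
library(sf)
world = read_sf(system.file("shapes/world.gpkg", package = "spData"))

td = tempdir()
write_sf(world, file.path(td, "world.gpkg"))  # one self-contained SQLite file
write_sf(world, file.path(td, "world.shp"))   # emits .shp, .shx, .dbf (+ .prj)

list.files(td, pattern = "^world")  # GeoPackage: one file; Shapefile: several
```

Note that the Shapefile driver may also warn about truncating column names longer than ten characters, one of the limitations listed above.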
Shapefile (.shp) and the GeoPackage format (.gpkg), the dsn would be the file name.\nread_sf() guesses the driver based on the file extension, as illustrated for a .gpkg file below:For some drivers, dsn could be provided as a folder name, access credentials for a database, or a GeoJSON string representation (see the examples of the read_sf() help page for more details).Some vector driver formats can store multiple data layers.\nBy default, read_sf() automatically reads the first layer of the file specified in dsn; however, using the layer argument you can specify any other layer.The read_sf() function also allows reading just parts of the file into RAM, with two possible mechanisms.\nThe first one is related to the query argument, which allows specifying what part of the data to read with an OGR SQL query text.\nAn example below extracts data for Tanzania only (Figure ??:).\nIt is done by specifying that we want to get all columns (SELECT *) from the \"world\" layer for which the name_long equals \"Tanzania\":The second mechanism uses the wkt_filter argument.\nThis argument expects a well-known text representing a study area for which we want to extract the data.\nLet’s try it using a small example – we want to read polygons from our file that intersect with the buffer of 50,000 meters around Tanzania’s borders.\nTo do it, we need to prepare our “filter” by (a) creating the buffer (Section 5.2.3), (b) converting the sf buffer object into an sfc geometry object with st_geometry(), and (c) translating geometries into their well-known text representation with st_as_text():Now, we can apply this “filter” using the wkt_filter argument.The result, shown in Figure ??:B, contains Tanzania and every country within its 50 km buffer.\nFIGURE 7.1: Reading a subset of the vector data using a query (A) and a wkt filter (B).\nNaturally, some options are specific to certain drivers.38\nFor example, think of coordinates stored in a spreadsheet format (.csv).\nTo read in such files as spatial objects, we naturally have to specify the names of the columns (X and Y in our example below) representing the coordinates.\nWe can do this with the help of the options parameter.\nTo find out about the possible options, please refer to the ‘Open Options’ section of the corresponding GDAL driver description.\nFor the comma-separated value (csv) format, visit http://www.gdal.org/drv_csv.html.Instead of columns describing xy-coordinates, a single column can also contain the geometry information.\nWell-known text (WKT), well-known binary (WKB), and the GeoJSON formats are examples of this.\nFor instance, the world_wkt.csv file has a column named WKT representing polygons of the world’s countries.\nWe will again use the options parameter to indicate this.\n\n\nAs a final example, we will show how read_sf() also reads KML files.\nA KML file stores geographic information in XML format - a data format for the creation of web pages and the transfer of data in an application-independent way (Nolan and Lang 2014).\nHere, we access a KML file from the web.\nThis file contains more than one layer.\nst_layers() lists all available layers.\nWe choose the first layer Placemarks and say so with the help of the layer parameter in read_sf().The examples presented in this section so far have used the sf package for geographic data import.\nIt is fast and flexible, but it may be worth looking at other packages for specific file formats.\nAn example is the geojsonsf package.\nA benchmark suggests it is around 10 times faster than the sf package for reading .geojson.","code":"\nsf_drivers = st_drivers()\nhead(sf_drivers, n = 3)\nsummary(sf_drivers[-c(1:2)])\nvector_filepath = system.file(\"shapes/world.gpkg\", package = \"spData\")\nworld = read_sf(vector_filepath, quiet = TRUE)\ntanzania = read_sf(vector_filepath,\n query = 'SELECT * FROM \"world\" WHERE name_long = \"Tanzania\"')\ntanzania_buf = st_buffer(tanzania, 50000)\ntanzania_buf_geom = st_geometry(tanzania_buf)\ntanzania_buf_wkt = st_as_text(tanzania_buf_geom)\ntanzania_neigh = read_sf(vector_filepath,\n wkt_filter = tanzania_buf_wkt)\ncycle_hire_txt = system.file(\"misc/cycle_hire_xy.csv\", package = \"spData\")\ncycle_hire_xy = read_sf(cycle_hire_txt, options = c(\"X_POSSIBLE_NAMES=X\",\n \"Y_POSSIBLE_NAMES=Y\"))\nworld_txt = system.file(\"misc/world_wkt.csv\", package = \"spData\")\nworld_wkt = read_sf(world_txt, options = \"GEOM_POSSIBLE_NAMES=WKT\")\n# the same as\nworld_wkt2 = st_read(world_txt, options = \"GEOM_POSSIBLE_NAMES=WKT\", \n quiet = TRUE, stringsAsFactors = FALSE, as_tibble = TRUE)\nu = \"https://developers.google.com/kml/documentation/KML_Samples.kml\"\ndownload.file(u, \"KML_Samples.kml\")\nst_layers(\"KML_Samples.kml\")\n#> Driver: LIBKML \n#> Available layers:\n#> layer_name geometry_type features fields\n#> 1 Placemarks 3 11\n#> 2 Styles and Markup 1 11\n#> 3 Highlighted Icon 1 11\n#> 4 Ground Overlays 1 11\n#> 5 Screen Overlays 0 11\n#> 6 Paths 6 11\n#> 7 Polygons 0 11\n#> 8 Google Campus 4 11\n#> 9 Extruded Polygon 1 11\n#> 10 Absolute and Relative 4 11\nkml = read_sf(\"KML_Samples.kml\", layer = \"Placemarks\")"},{"path":"read-write.html","id":"raster-data-1","chapter":"7 Geographic data I/O","heading":"7.6.2 Raster data","text":"\nSimilar to vector data, raster data comes in many file formats, with some supporting multilayer files.\nterra’s rast() command reads in a single layer when a file with just one layer is provided.It also works in the case where we want to read in a multilayer file.","code":"\nraster_filepath = system.file(\"raster/srtm.tif\", package = \"spDataLarge\")\nsingle_layer = rast(raster_filepath)\nmultilayer_filepath = system.file(\"raster/landsat.tif\", package = \"spDataLarge\")\nmultilayer_rast = rast(multilayer_filepath)"},{"path":"read-write.html","id":"data-output","chapter":"7 Geographic data I/O","heading":"7.7 Data output (O)","text":"Writing geographic data allows you to convert from one format to another and to save newly created objects.\nDepending on the data type (vector or raster), object class (e.g., sf or SpatRaster), and the type and amount of stored information (e.g., object size, range of values), it is important to know how to store spatial files in the most efficient way.\nThe next two sections will demonstrate how to do this.","code":""},{"path":"read-write.html","id":"vector-data-1","chapter":"7 Geographic data I/O","heading":"7.7.1 Vector data","text":"The counterpart of read_sf() is write_sf().\nIt allows you to write sf objects to a wide range of geographic vector file formats, including the most common, such as .geojson, .shp and .gpkg.\nBased on the file name, write_sf() decides automatically which driver to use.\nThe speed of the writing process also depends on the driver.Note: if you try to write to the same data source again, the function will overwrite the file:Instead of overwriting the file, we could add a new layer to the file with append = TRUE, which is supported by several spatial formats, including GeoPackage.Alternatively, you can use st_write() since it is equivalent to write_sf().\nHowever, it has different defaults – it does not overwrite files (it returns an error when you try to do it) and it shows a short summary of the written file format and the object.The layer_options 
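The append = TRUE behavior mentioned above can be verified with st_layers(). A hedged sketch (the layer names and temporary file path are illustrative; it assumes spData is installed):

```r
# Add two named layers to one GeoPackage, then confirm both are present.
library(sf)
world = read_sf(system.file("shapes/world.gpkg", package = "spData"))

f = file.path(tempdir(), "world_many_layers.gpkg")
write_sf(world, f, layer = "world")                       # first layer
write_sf(world, f, layer = "world_copy", append = TRUE)   # second layer
st_layers(f)  # lists both layers with their geometry types and feature counts
```

Without append = TRUE, the second write_sf() call would instead overwrite the data source, as described above.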
argument can also be used for many different purposes.\nOne of them is to write spatial data to a text file.\nThis can be done by specifying GEOMETRY inside of layer_options.\nIt could be either AS_XY for simple point datasets (it creates two new columns for the coordinates) or AS_WKT for more complex spatial data (one new column is created which contains the well-known text representation of the spatial objects).","code":"\nwrite_sf(obj = world, dsn = \"world.gpkg\")\nwrite_sf(obj = world, dsn = \"world.gpkg\")\nwrite_sf(obj = world, dsn = \"world_many_layers.gpkg\", append = TRUE)\nst_write(obj = world, dsn = \"world2.gpkg\")\n#> Writing layer `world2' to data source `world2.gpkg' using driver `GPKG'\n#> Writing 177 features with 10 fields and geometry type Multi Polygon.\nwrite_sf(cycle_hire_xy, \"cycle_hire_xy.csv\", layer_options = \"GEOMETRY=AS_XY\")\nwrite_sf(world_wkt, \"world_wkt.csv\", layer_options = \"GEOMETRY=AS_WKT\")"},{"path":"read-write.html","id":"raster-data-2","chapter":"7 Geographic data I/O","heading":"7.7.2 Raster data","text":"\nThe writeRaster() function saves SpatRaster objects to files on disk.\nThe function expects input regarding output data type and file format, but also accepts GDAL options specific to a selected file format (see ?writeRaster for more details).\nThe terra package offers nine data types when saving a raster: LOG1S, INT1S, INT1U, INT2S, INT2U, INT4S, INT4U, FLT4S, and FLT8S.39\nThe data type determines the bit representation of the raster object written to disk (Table 7.4).\nWhich data type to use depends on the range of the values of your raster object.\nThe more values a data type can represent, the larger the file will get on disk.\nUnsigned integers (INT1U, INT2U, INT4U) are suitable for categorical data, while float numbers (FLT4S and FLT8S) usually represent continuous data.\nwriteRaster() uses FLT4S as the default.\nThis works in most cases, but the size of the output file will be unnecessarily large if you save binary or categorical data.\nTherefore, we would recommend using the data type that needs the least storage space, but is still able to represent all values (check the range of values with the summary() function).\nTABLE 7.4: Data types supported by the terra package.\nBy default, the output file format is derived from the filename.\nNaming a file *.tif will create a GeoTIFF file, as demonstrated below:Some raster file formats have additional options that can be set by providing GDAL parameters to the options argument of writeRaster().\nGeoTIFF files are written in terra, by default, with the LZW compression gdal = c(\"COMPRESS=LZW\").\nTo change or disable the compression, we need to modify this argument.\nAdditionally, we can save the raster object as a COG (Cloud Optimized GeoTIFF, Section 7.5) with the \"of=COG\" option.","code":"\nwriteRaster(single_layer, filename = \"my_raster.tif\", datatype = \"INT2U\")\nwriteRaster(x = single_layer,\n filename = \"my_raster.tif\",\n datatype = \"INT2U\",\n gdal = c(\"COMPRESS=NONE\", \"of=COG\"),\n overwrite = TRUE)"},{"path":"read-write.html","id":"visual-outputs","chapter":"7 Geographic data I/O","heading":"7.8 Visual outputs","text":"\nR supports many different static and interactive graphics formats.\nThe most general method to save a static plot is to open a graphic device, create a plot, and close it again, for example:Other available graphic devices include pdf(), bmp(), jpeg(), and tiff().\nYou can specify several properties of the output plot, including width, height and resolution.Additionally, several graphic packages provide their own functions to save a graphical output.\nFor example, the tmap package has the tmap_save() function.\nYou can save a tmap object to different graphic formats or an HTML file by specifying the object name and a file path to a new file.On the other hand, you can save interactive maps created in the mapview package as an HTML file or image using the mapshot() function:","code":"\npng(filename = \"lifeExp.png\", width = 500, height = 350)\nplot(world[\"lifeExp\"])\ndev.off()\nlibrary(tmap)\ntmap_obj = tm_shape(world) + tm_polygons(col = \"lifeExp\")\ntmap_save(tmap_obj, filename = \"lifeExp_tmap.png\")\nlibrary(mapview)\nmapview_obj = mapview(world, zcol = \"lifeExp\", legend = TRUE)\nmapshot(mapview_obj, file = \"my_interactive_map.html\")"},{"path":"read-write.html","id":"exercises-5","chapter":"7 Geographic data I/O","heading":"7.9 Exercises","text":"E1. List and describe three types of vector, raster, and geodatabase formats.E2. Name at least two differences between read_sf() and the more well-known function st_read().E3. 
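Returning to the raster-writing advice in Section 7.7.2, choosing the smallest sufficient datatype can be sketched as follows. This is a hedged sketch: the thresholds follow the unsigned-integer ranges of Table 7.4, and it assumes the spDataLarge package is installed.

```r
# Pick the smallest terra datatype that can represent a raster's values
# before writing, rather than accepting the FLT4S default.
library(terra)
r = rast(system.file("raster/nlcd.tif", package = "spDataLarge"))
rng = minmax(r)  # 2 x nlyr matrix of per-layer minimum and maximum values
dt = if (all(rng >= 0) && all(rng == floor(rng))) {
  # non-negative whole numbers: an unsigned integer type suffices
  if (max(rng) <= 255) "INT1U" else if (max(rng) <= 65534) "INT2U" else "INT4U"
} else {
  "FLT4S"  # negative or fractional values: fall back to float
}
writeRaster(r, file.path(tempdir(), "nlcd_small.tif"), datatype = dt,
            overwrite = TRUE)
```

For a categorical raster such as nlcd.tif this selects a 1-byte unsigned integer, which is substantially smaller on disk than the 4-byte float default.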
Read the cycle_hire_xy.csv file from the spData package as a spatial object (Hint: it is located in the misc folder).\nWhat is the geometry type of the loaded object?E4. Download the borders of Germany using rnaturalearth, and create a new object called germany_borders.\nWrite this new object to a file of the GeoPackage format.E5. Download the global monthly minimum temperature with a spatial resolution of five minutes using the geodata package.\nExtract the June values, and save them to a file named tmin_june.tif (hint: use terra::subset()).E6. Create a static map of Germany’s borders, and save it to a PNG file.E7. Create an interactive map using data from the cycle_hire_xy.csv file.\nExport this map to a file called cycle_hire.html.","code":""},{"path":"adv-map.html","id":"adv-map","chapter":"8 Making maps with R","heading":"8 Making maps with R","text":"","code":""},{"path":"adv-map.html","id":"prerequisites-6","chapter":"8 Making maps with R","heading":"Prerequisites","text":"This chapter requires the following packages that we have already been using:In addition, it uses the following visualization packages (also install shiny if you want to develop interactive mapping applications):","code":"\nlibrary(sf)\nlibrary(raster)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(dplyr)\nlibrary(spData)\nlibrary(spDataLarge)\nlibrary(tmap) # for static and interactive maps\nlibrary(leaflet) # for interactive maps\nlibrary(ggplot2) # tidyverse data visualization package"},{"path":"adv-map.html","id":"introduction-5","chapter":"8 Making maps with R","heading":"8.1 Introduction","text":"A satisfying and important aspect of geographic research is communicating the results.\nMap making — the art of cartography — is an ancient skill that involves communication, intuition, and an element of creativity.\nStatic mapping in R is straightforward with the plot() function, as we saw in Section 2.2.3.\nIt is possible to create advanced maps using base R methods (Murrell 2016).\nThe focus of this chapter, however, is cartography with dedicated map-making packages.\nWhen learning a new skill, it makes sense to gain depth-of-knowledge in one area before branching out.\nMap making is no exception, hence this chapter’s coverage of one package (tmap) in depth rather than many superficially.In addition to being fun and 
creative, cartography also has important practical applications.\nA carefully crafted map can be the best way of communicating the results of your work, but poorly designed maps can leave a bad impression.\nCommon design issues include poor placement, size and readability of text and careless selection of colors, as outlined in the style guide of the Journal of Maps.\nFurthermore, poor map making can hinder the communication of results (Brewer 2015):Amateur-looking maps can undermine your audience’s ability to understand important information and weaken the presentation of a professional data investigation.Maps have been used for several thousand years for a wide variety of purposes.\nHistoric examples include maps of buildings and land ownership from the Old Babylonian dynasty more than 3000 years ago and Ptolemy’s world map in his masterpiece Geography nearly 2000 years ago (Talbert 2014).Map making has historically been an activity undertaken only by, or on behalf of, the elite.\nThis has changed with the emergence of open source mapping software such as the R package tmap and the ‘print composer’ in QGIS, which enable anyone to make high-quality maps, enabling ‘citizen science.’\nMaps are also often the best way to present the findings of geocomputational research in a way that is accessible.\nMap making is therefore a critical part of geocomputation and its emphasis not only on describing, but also changing the world.This chapter shows how to make a wide range of maps.\nThe next section covers a range of static maps, including aesthetic considerations, facets and inset maps.\nSections 8.3 to 8.5 cover animated and interactive maps (including web maps and mapping applications).\nFinally, Section 8.6 covers a range of alternative map-making packages including ggplot2 and cartogram.","code":""},{"path":"adv-map.html","id":"static-maps","chapter":"8 Making maps with R","heading":"8.2 Static maps","text":"\nStatic maps are the most common type of visual output from geocomputation.\nStandard formats include .png and .pdf for raster and vector outputs respectively.\nInitially, static maps were the only type of maps that R could produce.\nThings advanced with the release of sp (see E. J. 
border outlines (middle panel) Figure 8.1, respectively.intuitive approach map making:\ncommon task adding new layers undertaken addition operator +, followed tm_*().\nasterisk (*) refers wide range layer types self-explanatory names including fill, borders (demonstrated ), bubbles, text raster (see help(\"tmap-element\") full list).\nlayering illustrated right panel Figure 8.1, result adding border top fill layer.","code":"\n# Add fill layer to nz shape\ntm_shape(nz) +\n tm_fill() \n# Add border layer to nz shape\ntm_shape(nz) +\n tm_borders() \n# Add fill and border layers to nz shape\ntm_shape(nz) +\n tm_fill() +\n tm_borders() "},{"path":"adv-map.html","id":"map-obj","chapter":"8 Making maps with R","heading":"8.2.2 Map objects","text":"useful feature tmap ability store objects representing maps.\ncode chunk demonstrates saving last plot Figure 8.1 object class tmap (note use tm_polygons() condenses tm_fill() + tm_borders() single function):map_nz can plotted later, example adding additional layers (shown ) simply running map_nz console, equivalent print(map_nz).New shapes can added + tm_shape(new_obj).\ncase new_obj represents new spatial object plotted top preceding layers.\nnew shape added way, subsequent aesthetic functions refer , another new shape added.\nsyntax allows creation maps multiple shapes layers, illustrated next code chunk uses function tm_raster() plot raster layer (alpha set make layer semi-transparent):Building previously created map_nz object, preceding code creates new map object map_nz1 contains another shape (nz_elev) representing average elevation across New Zealand (see Figure 8.2, left).\nshapes layers can added, illustrated code chunk creates nz_water, representing New Zealand’s territorial waters, adds resulting lines existing map object.limit number layers shapes can added tmap objects.\nshape can even used multiple times.\nfinal map illustrated Figure 8.2 created adding layer representing high points (stored object nz_height) onto 
previously created map_nz2 object tm_dots() (see ?tm_dots ?tm_bubbles details tmap’s point plotting functions).\nresulting map, four layers, illustrated right-hand panel Figure 8.2:useful little known feature tmap multiple map objects can arranged single ‘metaplot’ tmap_arrange().\ndemonstrated code chunk plots map_nz1 map_nz3, resulting Figure 8.2.\nFIGURE 8.2: Maps additional layers added final map Figure 8.1.\nelements can also added + operator.\nAesthetic settings, however, controlled arguments layer functions.","code":"\nmap_nz = tm_shape(nz) + tm_polygons()\nclass(map_nz)\n#> [1] \"tmap\"\nmap_nz1 = map_nz +\n tm_shape(nz_elev) + tm_raster(alpha = 0.7)\nnz_water = st_union(nz) %>% st_buffer(22200) %>% \n st_cast(to = \"LINESTRING\")\nmap_nz2 = map_nz1 +\n tm_shape(nz_water) + tm_lines()\nmap_nz3 = map_nz2 +\n tm_shape(nz_height) + tm_dots()\ntmap_arrange(map_nz1, map_nz2, map_nz3)"},{"path":"adv-map.html","id":"aesthetics","chapter":"8 Making maps with R","heading":"8.2.3 Aesthetics","text":"\nplots previous section demonstrate tmap’s default aesthetic settings.\nGray shades used tm_fill() tm_bubbles() layers continuous black line used represent lines created tm_lines().\ncourse, default values aesthetics can overridden.\npurpose section show .two main types map aesthetics: change data constant.\nUnlike ggplot2, uses helper function aes() represent variable aesthetics, tmap accepts aesthetic arguments directly.\nmap variable aesthetic, pass column name corresponding argument, set fixed aesthetic, pass desired value instead.40\ncommonly used aesthetics fill border layers include color, transparency, line width line type, set col, alpha, lwd, lty arguments, respectively.\nimpact setting fixed values illustrated Figure 8.3.\nFIGURE 8.3: impact changing commonly used fill border aesthetics fixed values.\nLike base R plots, arguments defining aesthetics can also receive values vary.\nUnlike base R code (generates left panel Figure 8.4), tmap aesthetic arguments 
accept numeric vector:Instead col (aesthetics can vary lwd line layers size point layers) requires character string naming attribute associated geometry plotted.\nThus, one achieve desired result follows (plotted right-hand panel Figure 8.4):\nFIGURE 8.4: Comparison base (left) tmap (right) handling numeric color field.\nimportant argument functions defining aesthetic layers tm_fill() title, sets title associated legend.\nfollowing code chunk demonstrates functionality providing attractive name variable name Land_area (note use expression() create superscript text):","code":"\nma1 = tm_shape(nz) + tm_fill(col = \"red\")\nma2 = tm_shape(nz) + tm_fill(col = \"red\", alpha = 0.3)\nma3 = tm_shape(nz) + tm_borders(col = \"blue\")\nma4 = tm_shape(nz) + tm_borders(lwd = 3)\nma5 = tm_shape(nz) + tm_borders(lty = 2)\nma6 = tm_shape(nz) + tm_fill(col = \"red\", alpha = 0.3) +\n tm_borders(col = \"blue\", lwd = 3, lty = 2)\ntmap_arrange(ma1, ma2, ma3, ma4, ma5, ma6)\nplot(st_geometry(nz), col = nz$Land_area) # works\ntm_shape(nz) + tm_fill(col = nz$Land_area) # fails\n#> Error: Fill argument neither colors nor valid variable name(s)\ntm_shape(nz) + tm_fill(col = \"Land_area\")\nlegend_title = expression(\"Area (km\"^2*\")\")\nmap_nza = tm_shape(nz) +\n tm_fill(col = \"Land_area\", title = legend_title) + tm_borders()"},{"path":"adv-map.html","id":"color-settings","chapter":"8 Making maps with R","heading":"8.2.4 Color settings","text":"\nColor settings important part map design.\ncan major impact spatial variability portrayed illustrated Figure 8.5.\nshows four ways coloring regions New Zealand depending median income, left right (demonstrated code chunk ):default setting uses ‘pretty’ breaks, described next paragraphbreaks allows manually set breaksn sets number bins numeric variables categorizedpalette defines color scheme, example BuGn\nFIGURE 8.5: Illustration settings affect color settings. 
results show (left right): default settings, manual breaks, n breaks, impact changing palette.\nAnother way change color settings altering color break (bin) settings.\naddition manually setting breaks tmap allows users specify algorithms automatically create breaks style argument.\n\nsix useful break styles:style = \"pretty\", default setting, rounds breaks whole numbers possible spaces evenly;style = \"equal\" divides input values bins equal range appropriate variables uniform distribution (not recommended variables skewed distribution resulting map may end-little color diversity);style = \"quantile\" ensures number observations fall category (potential downside bin ranges can vary widely);style = \"jenks\" identifies groups similar values data maximizes differences categories;style = \"cont\" (\"order\") present large number colors continuous color fields particularly suited continuous rasters (\"order\" can help visualize skewed distributions);style = \"cat\" designed represent categorical values assures category receives unique color.\nFIGURE 8.6: Illustration different binning methods set using style argument tmap.\nPalettes define color ranges associated bins determined breaks, n, style arguments described .\ndefault color palette specified tm_layout() (see Section 8.2.5 learn ); however, quickly changed using palette argument.\nexpects vector colors new color palette name, can selected interactively tmaptools::palette_explorer().\ncan add - prefix reverse palette order.three main groups color palettes: categorical, sequential diverging (Figure 8.7), serves different purpose.\nCategorical palettes consist easily distinguishable colors appropriate categorical data without particular order state names land cover classes.\nColors intuitive: rivers blue, example, pastures green.\nAvoid many categories: maps large legends many colors can uninterpretable.41The second group sequential palettes.\nfollow gradient, example light dark colors (light colors tend represent 
lower values), appropriate continuous (numeric) variables.\nSequential palettes can single (Blues go light dark blue, example) multi-color/hue (YlOrBr gradient light yellow brown via orange, example), demonstrated code chunk — output shown, run code see results!last group, diverging palettes, typically range three distinct colors (purple-white-green Figure 8.7) usually created joining two single-color sequential palettes darker colors end.\nmain purpose visualize difference important reference point, e.g., certain temperature, median household income mean probability drought event.\nreference point’s value can adjusted tmap using midpoint argument.\nFIGURE 8.7: Examples categorical, sequential diverging palettes.\ntwo important principles consideration working colors: perceptibility accessibility.\nFirstly, colors maps match perception.\nmeans certain colors viewed experience also cultural lenses.\nexample, green colors usually represent vegetation lowlands blue connected water cool.\nColor palettes also easy understand effectively convey information.\nclear values lower higher, colors change gradually.\nproperty preserved rainbow color palette; therefore, suggest avoiding geographic data visualization (Borland Taylor II 2007).\nInstead, viridis color palettes, also available tmap, can used.\nSecondly, changes colors accessible largest number people.\nTherefore, important use colorblind friendly palettes often possible.42","code":"\ntm_shape(nz) + tm_polygons(col = \"Median_income\")\nbreaks = c(0, 3, 4, 5) * 10000\ntm_shape(nz) + tm_polygons(col = \"Median_income\", breaks = breaks)\ntm_shape(nz) + tm_polygons(col = \"Median_income\", n = 10)\ntm_shape(nz) + tm_polygons(col = \"Median_income\", palette = \"BuGn\")\ntm_shape(nz) + tm_polygons(\"Population\", palette = \"Blues\")\ntm_shape(nz) + tm_polygons(\"Population\", palette = \"YlOrBr\")"},{"path":"adv-map.html","id":"layouts","chapter":"8 Making maps with R","heading":"8.2.5 Layouts","text":"\nmap layout 
refers combination map elements cohesive map.\nMap elements include among others objects mapped, title, scale bar, margins aspect ratios, color settings covered previous section relate palette break-points used affect map looks.\nmay result subtle changes can equally large impact impression left maps.Additional elements north arrows scale bars functions: tm_compass() tm_scale_bar() (Figure 8.8).\nFIGURE 8.8: Map additional elements - north arrow scale bar.\ntmap also allows wide variety layout settings changed, , produced using following code (see args(tm_layout) ?tm_layout full list), illustrated Figure 8.9:\nFIGURE 8.9: Layout options specified (left right) title, scale, bg.color frame arguments.\narguments tm_layout() provide control many aspects map relation canvas placed.\nuseful layout settings (illustrated Figure 8.10):Frame width (frame.lwd) option allow double lines (frame.double.line)Margin settings including outer.margin inner.marginFont settings controlled fontface fontfamilyLegend settings including binary options legend.show (whether show legend) legend.only (omit map) legend.outside (legend go outside map?), well multiple choice settings legend.positionDefault colors aesthetic layers (aes.color), map attributes frame (attr.color)Color settings controlling sepia.intensity (yellowy map looks) saturation (color-grayscale)\nFIGURE 8.10: Illustration selected layout options.\nimpact changing color settings listed illustrated Figure 8.11 (see ?tm_layout full list).\nFIGURE 8.11: Illustration selected color-related layout options.\n\nBeyond low-level control layouts colors, tmap also offers high-level styles, using tm_style() function (representing second meaning ‘style’ package).\nstyles tm_style(\"cobalt\") result stylized maps, others tm_style(\"gray\") make subtle changes, illustrated Figure 8.12, created using code (see 08-tmstyles.R):\nFIGURE 8.12: Selected tmap styles.\n","code":"\nmap_nz + \n tm_compass(type = \"8star\", position = c(\"left\", \"top\")) 
+\n tm_scale_bar(breaks = c(0, 100, 200), text.size = 1)\nmap_nz + tm_layout(title = \"New Zealand\")\nmap_nz + tm_layout(scale = 5)\nmap_nz + tm_layout(bg.color = \"lightblue\")\nmap_nz + tm_layout(frame = FALSE)\nmap_nza + tm_style(\"bw\")\nmap_nza + tm_style(\"classic\")\nmap_nza + tm_style(\"cobalt\")\nmap_nza + tm_style(\"col_blind\")"},{"path":"adv-map.html","id":"faceted-maps","chapter":"8 Making maps with R","heading":"8.2.6 Faceted maps","text":"\n\nFaceted maps, also referred ‘small multiples,’ composed many maps arranged side--side, sometimes stacked vertically (Meulemans et al. 2017).\nFacets enable visualization spatial relationships change respect another variable, time.\nchanging populations settlements, example, can represented faceted map panel representing population particular moment time.\ntime dimension represented via another aesthetic color.\nHowever, risks cluttering map involve multiple overlapping points (cities tend move time!).Typically individual facets faceted map contain geometry data repeated multiple times, column attribute data (default plotting method sf objects, see Chapter 2).\nHowever, facets can also represent shifting geometries evolution point pattern time.\nuse case faceted plot illustrated Figure 8.13.\nFIGURE 8.13: Faceted map showing top 30 largest urban agglomerations 1970 2030 based population projections United Nations.\npreceding code chunk demonstrates key features faceted maps created tmap:Shapes facet variable repeated (countries world case)argument varies depending variable (year case).nrow/ncol setting specifying number rows columns facets arranged intoThe free.coords parameter specifying map bounding boxIn addition utility showing changing spatial relationships, faceted maps also useful foundation animated maps (see Section 8.3).","code":"\nurb_1970_2030 = urban_agglomerations %>% \n filter(year %in% c(1970, 1990, 2010, 2030))\n\ntm_shape(world) +\n tm_polygons() +\n tm_shape(urb_1970_2030) +\n tm_symbols(col = 
\"black\", border.col = \"white\", size = \"population_millions\") +\n tm_facets(by = \"year\", nrow = 2, free.coords = FALSE)"},{"path":"adv-map.html","id":"inset-maps","chapter":"8 Making maps with R","heading":"8.2.7 Inset maps","text":"\n\ninset map smaller map rendered within next main map.\nserve many different purposes, including providing context (Figure 8.14) bringing non-contiguous regions closer ease comparison (Figure 8.15).\nalso used focus smaller area detail cover area map, representing different topic.example , create map central part New Zealand’s Southern Alps.\ninset map show main map relation whole New Zealand.\nfirst step define area interest, can done creating new spatial object, nz_region.second step, create base map showing New Zealand’s Southern Alps area.\nplace important message stated.third step consists inset map creation.\ngives context helps locate area interest.\nImportantly, map needs clearly indicate location main map, example stating borders.Finally, combine two maps using function viewport() grid package, first arguments specify center location (x y) size (width height) inset map.\nFIGURE 8.14: Inset map providing context - location central part Southern Alps New Zealand.\nInset map can saved file either using graphic device (see Section 7.8) tmap_save() function arguments - insets_tm insets_vp.Inset maps also used create one map non-contiguous areas.\nProbably, often used example map United States, consists contiguous United States, Hawaii Alaska.\nimportant find best projection individual inset types cases (see Chapter 6 learn ).\ncan use US National Atlas Equal Area map contiguous United States putting EPSG code projection argument tm_shape().rest objects, hawaii alaska, already proper projections; therefore, just need create two separate maps:final map created combining arranging three maps:\nFIGURE 8.15: Map United States.\ncode presented compact can used basis inset maps results, Figure 8.15, provide poor representation 
locations Hawaii Alaska.\n-depth approach, see us-map vignette geocompkg.","code":"\nnz_region = st_bbox(c(xmin = 1340000, xmax = 1450000,\n ymin = 5130000, ymax = 5210000),\n crs = st_crs(nz_height)) %>% \n st_as_sfc()\nnz_height_map = tm_shape(nz_elev, bbox = nz_region) +\n tm_raster(style = \"cont\", palette = \"YlGn\", legend.show = TRUE) +\n tm_shape(nz_height) + tm_symbols(shape = 2, col = \"red\", size = 1) +\n tm_scale_bar(position = c(\"left\", \"bottom\"))\nnz_map = tm_shape(nz) + tm_polygons() +\n tm_shape(nz_height) + tm_symbols(shape = 2, col = \"red\", size = 0.1) + \n tm_shape(nz_region) + tm_borders(lwd = 3) \nlibrary(grid)\nnz_height_map\nprint(nz_map, vp = viewport(0.8, 0.27, width = 0.5, height = 0.5))\nus_states_map = tm_shape(us_states, projection = 2163) + tm_polygons() + \n tm_layout(frame = FALSE)\nhawaii_map = tm_shape(hawaii) + tm_polygons() + \n tm_layout(title = \"Hawaii\", frame = FALSE, bg.color = NA, \n title.position = c(\"LEFT\", \"BOTTOM\"))\nalaska_map = tm_shape(alaska) + tm_polygons() + \n tm_layout(title = \"Alaska\", frame = FALSE, bg.color = NA)\nus_states_map\nprint(hawaii_map, vp = grid::viewport(0.35, 0.1, width = 0.2, height = 0.1))\nprint(alaska_map, vp = grid::viewport(0.15, 0.15, width = 0.3, height = 0.3))"},{"path":"adv-map.html","id":"animated-maps","chapter":"8 Making maps with R","heading":"8.3 Animated maps","text":"\n\nFaceted maps, described Section 8.2.6, can show spatial distributions variables change (e.g., time), approach disadvantages.\nFacets become tiny many .\nFurthermore, fact facet physically separated screen page means subtle differences facets can hard detect.Animated maps solve issues.\nAlthough depend digital publication, becoming less issue content moves online.\nAnimated maps can still enhance paper reports: can always link readers web-page containing animated (interactive) version printed map help make come alive.\nseveral ways generate animations R, including animation packages gganimate, 
builds ggplot2 (see Section 8.6).\nsection focusses creating animated maps tmap syntax familiar previous sections flexibility approach.Figure 8.16 simple example animated map.\nUnlike faceted plot, squeeze multiple maps single screen allows reader see spatial distribution world’s populous agglomerations evolve time (see book’s website animated version).\nFIGURE 8.16: Animated map showing top 30 largest urban agglomerations 1950 2030 based population projections United Nations. Animated version available online : geocompr.robinlovelace.net.\nanimated map illustrated Figure 8.16 can created using tmap techniques generate faceted maps, demonstrated Section 8.2.6.\ntwo differences, however, related arguments tm_facets():along = \"year\" used instead = \"year\".free.coords = FALSE, maintains map extent map iteration.additional arguments demonstrated subsequent code chunk:resulting urb_anim represents set separate maps year.\nfinal stage combine save result .gif file tmap_animation().\nfollowing command creates animation illustrated Figure 8.16, elements missing, add exercises:Another illustration power animated maps provided Figure 8.17.\nshows development states United States, first formed east incrementally west finally interior.\nCode reproduce map can found script 08-usboundaries.R.\nFIGURE 8.17: Animated map showing population growth, state formation boundary changes United States, 1790-2010. 
Animated version available online geocompr.robinlovelace.net.\n","code":"\nurb_anim = tm_shape(world) + tm_polygons() + \n tm_shape(urban_agglomerations) + tm_dots(size = \"population_millions\") +\n tm_facets(along = \"year\", free.coords = FALSE)\ntmap_animation(urb_anim, filename = \"urb_anim.gif\", delay = 25)"},{"path":"adv-map.html","id":"interactive-maps","chapter":"8 Making maps with R","heading":"8.4 Interactive maps","text":"\n\nstatic animated maps can enliven geographic datasets, interactive maps can take new level.\nInteractivity can take many forms, common useful ability pan around zoom part geographic dataset overlaid ‘web map’ show context.\nLess advanced interactivity levels include popups appear click different features, kind interactive label.\nadvanced levels interactivity include ability tilt rotate maps, demonstrated mapdeck example , provision “dynamically linked” sub-plots automatically update user pans zooms (Pezanowski et al. 2018).important type interactivity, however, display geographic data interactive ‘slippy’ web maps.\nrelease leaflet package 2015 revolutionized interactive web map creation within R number packages built foundations adding new features (e.g., leaflet.extras) making creation web maps simple creating static maps (e.g., mapview tmap).\nsection illustrates approach opposite order.\nexplore make slippy maps tmap (syntax already learned), mapview finally leaflet (provides low-level control interactive maps).unique feature tmap mentioned Section 8.2 ability create static interactive maps using code.\nMaps can viewed interactively point switching view mode, using command tmap_mode(\"view\").\ndemonstrated code , creates interactive map New Zealand based tmap object map_nz, created Section 8.2.2, illustrated Figure 8.18:\nFIGURE 8.18: Interactive map New Zealand created tmap view mode. 
Interactive version available online : geocompr.robinlovelace.net.\nNow interactive mode ‘turned ,’ maps produced tmap launch (another way create interactive maps tmap_leaflet function).\nNotable features interactive mode include ability specify basemap tm_basemap() (tmap_options()) demonstrated (result shown):impressive little-known feature tmap’s view mode also works faceted plots.\nargument sync tm_facets() can used case produce multiple maps synchronized zoom pan settings, illustrated Figure 8.19, produced following code:\nFIGURE 8.19: Faceted interactive maps global coffee production 2016 2017 sync, demonstrating tmap’s view mode action.\nSwitch tmap back plotting mode function:proficient tmap, quickest way create interactive maps may mapview.\nfollowing ‘one liner’ reliable way interactively explore wide range geographic data formats:\nFIGURE 8.20: Illustration mapview action.\nmapview concise syntax yet powerful. default, provides standard GIS functionality mouse position information, attribute queries (via pop-ups), scale bar, zoom--layer buttons.\noffers advanced controls including ability ‘burst’ datasets multiple layers addition multiple layers + followed name geographic object.\nAdditionally, provides automatic coloring attributes (via argument zcol).\nessence, can considered data-driven leaflet API (see information leaflet).\nGiven mapview always expects spatial object (sf, Spatial*, Raster*) first argument, works well end piped expressions.\nConsider following example sf used intersect lines polygons visualized mapview (Figure 8.21).\nFIGURE 8.21: Using mapview end sf-based pipe expression.\nOne important thing keep mind mapview layers added via + operator (similar ggplot2 tmap). 
frequent gotcha piped workflows main binding operator %>%.\ninformation mapview, see package’s website : r-spatial.github.io/mapview/.ways create interactive maps R.\ngoogleway package, example, provides interactive mapping interface flexible extensible\n(see googleway-vignette details).\nAnother approach author mapdeck, provides access Uber’s Deck.gl framework.\nuse WebGL enables interactively visualize large datasets (millions points).\npackage uses Mapbox access tokens, must register using package.unique feature mapdeck provision interactive ‘2.5d’ perspectives, illustrated Figure 8.22.\nmeans can pan, zoom rotate around maps, view data ‘extruded’ map.\nFigure 8.22, generated following code chunk, visualizes road traffic crashes UK, bar height representing casualties per area.\nFIGURE 8.22: Map generated mapdeck, representing road traffic casualties across UK. Height 1 km cells represents number crashes.\nbrowser can zoom drag, addition rotating tilting map pressing Cmd/Ctrl.\nMultiple layers can added %>% operator, demonstrated mapdeck vignette.Mapdeck also supports sf objects, can seen replacing add_grid() function call preceding code chunk add_polygon(data = lnd, layer_id = \"polygon_layer\"), add polygons representing London interactive tilted map.Last least leaflet mature widely used interactive mapping package R.\nleaflet provides relatively low-level interface Leaflet JavaScript library many arguments can understood reading documentation original JavaScript library (see leafletjs.com).Leaflet maps created leaflet(), result leaflet map object can piped leaflet functions.\nallows multiple map layers control settings added interactively, demonstrated code generates Figure 8.23 (see rstudio.github.io/leaflet/ details).\nFIGURE 8.23: leaflet package action, showing cycle hire points London. 
See interactive version online.\n","code":"\ntmap_mode(\"view\")\nmap_nz\nmap_nz + tm_basemap(server = \"OpenTopoMap\")\nworld_coffee = left_join(world, coffee_data, by = \"name_long\")\nfacets = c(\"coffee_production_2016\", \"coffee_production_2017\")\ntm_shape(world_coffee) + tm_polygons(facets) + \n tm_facets(nrow = 1, sync = TRUE)\ntmap_mode(\"plot\")\n#> tmap mode set to plotting\nmapview::mapview(nz)\ntrails %>%\n st_transform(st_crs(franconia)) %>%\n st_intersection(franconia[franconia$district == \"Oberfranken\", ]) %>%\n st_collection_extract(\"LINE\") %>%\n mapview(color = \"red\", lwd = 3, layer.name = \"trails\") +\n mapview(franconia, zcol = \"district\", burst = TRUE) +\n breweries\nlibrary(mapdeck)\nset_token(Sys.getenv(\"MAPBOX\"))\ncrash_data = read.csv(\"https://git.io/geocompr-mapdeck\")\ncrash_data = na.omit(crash_data)\nms = mapdeck_style(\"dark\")\nmapdeck(style = ms, pitch = 45, location = c(0, 52), zoom = 4) %>%\nadd_grid(data = crash_data, lat = \"lat\", lon = \"lng\", cell_size = 1000,\n elevation_scale = 50, layer_id = \"grid_layer\",\n colour_range = viridisLite::plasma(6))\npal = colorNumeric(\"RdYlBu\", domain = cycle_hire$nbikes)\nleaflet(data = cycle_hire) %>% \n addProviderTiles(providers$CartoDB.Positron) %>%\n addCircles(col = ~pal(nbikes), opacity = 0.9) %>% \n addPolygons(data = lnd, fill = FALSE) %>% \n addLegend(pal = pal, values = ~nbikes) %>% \n setView(lng = -0.1, 51.5, zoom = 12) %>% \n addMiniMap()"},{"path":"adv-map.html","id":"mapping-applications","chapter":"8 Making maps with R","heading":"8.5 Mapping applications","text":"\ninteractive web maps demonstrated Section 8.4 can go far.\nCareful selection layers display, base-maps pop-ups can used communicate main results many projects involving geocomputation.\nweb mapping approach interactivity limitations:Although map interactive terms panning, zooming clicking, code static, meaning user interface fixedAll map content generally static web map, meaning web maps cannot scale 
handle large datasets easilyAdditional layers interactivity, graphs showing relationships variables ‘dashboards’ difficult create using web-mapping approachOvercoming limitations involves going beyond static web mapping towards geospatial frameworks map servers.\nProducts field include GeoDjango (extends Django web framework written Python), MapGuide (framework developing web applications, largely written C++) GeoServer (mature powerful map server written Java).\n(particularly GeoServer) scalable, enabling maps served thousands people daily — assuming sufficient public interest maps!\nbad news server-side solutions require much skilled developer time set-maintain, often involving teams people roles dedicated geospatial database administrator (DBA).good news web mapping applications can now rapidly created using shiny, package converting R code interactive web applications.\nthanks support interactive maps via functions renderLeaflet(), documented Shiny integration section RStudio’s leaflet website.\nsection gives context, teaches basics shiny web mapping perspective culminates full-screen mapping application less 100 lines code.way shiny works well documented shiny.rstudio.com.\ntwo key elements shiny app reflect duality common web application development: ‘front end’ (bit user sees) ‘back end’ code.\nshiny apps, elements typically created objects named ui server within R script named app.R, lives ‘app folder.’\nallows web mapping applications represented single file, coffeeApp/app.R file book’s GitHub repo.considering large apps, worth seeing minimal example, named ‘lifeApp,’ action.43\ncode defines launches — command shinyApp() — lifeApp, provides interactive slider allowing users make countries appear progressively lower levels life expectancy (see Figure 8.24):\nFIGURE 8.24: Screenshot showing minimal example web mapping application created shiny.\nuser interface (ui) lifeApp created fluidPage().\ncontains input output ‘widgets’ — case, sliderInput() (many 
*Input() functions available) leafletOutput().\narranged row-wise default, explaining slider interface placed directly map Figure 8.24 (see ?column adding content column-wise).server side (server) function input output arguments.\noutput list objects containing elements generated render*() function — renderLeaflet() example generates output$map.\nInput elements input$life referred server must relate elements exist ui — defined inputId = \"life\" code .\nfunction shinyApp() combines ui server elements serves results interactively via new R process.\nmove slider map shown Figure 8.24, actually causing R code re-run, although hidden view user interface.Building basic example knowing find help (see ?shiny), best way forward now may stop reading start programming!\nrecommended next step open previously mentioned CycleHireApp/app.R script IDE choice, modify re-run repeatedly.\nexample contains components web mapping application implemented shiny ‘shine’ light behave.CycleHireApp/app.R script contains shiny functions go beyond demonstrated simple ‘lifeApp’ example.\ninclude reactive() observe() (creating outputs respond user interface — see ?reactive) leafletProxy() (modifying leaflet object already created).\nelements critical creation web mapping applications implemented shiny.\nrange ‘events’ can programmed including advanced functionality drawing new layers subsetting data, described shiny section RStudio’s leaflet website.Experimenting apps CycleHireApp build knowledge web mapping applications R, also practical skills.\nChanging contents setView(), example, change starting bounding box user sees app initiated.\nexperimentation done random, reference relevant documentation, starting ?shiny, motivated desire solve problems posed exercises.shiny used way can make prototyping mapping applications faster accessible ever (deploying shiny apps separate topic beyond scope chapter).\nEven applications eventually deployed using different technologies, shiny undoubtedly allows 
web mapping applications developed relatively lines code (76 case CycleHireApp).\nstop shiny apps getting rather large.\nPropensity Cycle Tool (PCT) hosted pct.bike, example, national mapping tool funded UK’s Department Transport.\nPCT used dozens people day multiple interactive elements based 1000 lines code (Lovelace et al. 2017).apps undoubtedly take time effort develop, shiny provides framework reproducible prototyping aid development process.\nOne potential problem ease developing prototypes shiny temptation start programming early, purpose mapping application envisioned detail.\nreason, despite advocating shiny, recommend starting longer established technology pen paper first stage interactive mapping projects.\nway prototype web applications limited technical considerations, motivations imagination.\nFIGURE 8.25: Hire cycle App, simple web mapping application finding closest cycle hiring station based location requirement cycles. Interactive version available online geocompr.robinlovelace.net.\n","code":"\nlibrary(shiny) # for shiny apps\nlibrary(leaflet) # renderLeaflet function\nlibrary(spData) # loads the world dataset \nui = fluidPage(\n sliderInput(inputId = \"life\", \"Life expectancy\", 49, 84, value = 80),\n leafletOutput(outputId = \"map\")\n )\nserver = function(input, output) {\n output$map = renderLeaflet({\n leaflet() %>% \n # addProviderTiles(\"OpenStreetMap.BlackAndWhite\") %>%\n addPolygons(data = world[world$lifeExp < input$life, ])})\n}\nshinyApp(ui, server)"},{"path":"adv-map.html","id":"other-mapping-packages","chapter":"8 Making maps with R","heading":"8.6 Other mapping packages","text":"tmap provides powerful interface creating wide range static maps (Section 8.2) also supports interactive maps (Section 8.4).\nmany options creating maps R.\naim section provide taster pointers additional resources: map making surprisingly active area R package development, learn can covered .mature option use plot() methods provided core spatial packages 
sf raster, covered Sections 2.2.3 2.3.3, respectively.\nmentioned sections plot methods raster vector objects can combined results draw onto plot area (elements keys sf plots multi-band rasters interfere ).\nbehavior illustrated subsequent code chunk generates Figure 8.26.\nplot() many options can explored following links ?plot help page sf vignette sf5.\nFIGURE 8.26: Map New Zealand created plot(). legend right refers elevation (1000 m sea level).\nSince version 2.3.0, tidyverse plotting package ggplot2 supported sf objects geom_sf().\nsyntax similar used tmap:\ninitial ggplot() call followed one layers, added + geom_*(), * represents layer type geom_sf() (sf objects) geom_point() (points).ggplot2 plots graticules default.\ndefault settings graticules can overridden using scale_x_continuous(), scale_y_continuous() coord_sf(datum = NA).\nnotable features include use unquoted variable names encapsulated aes() indicate aesthetics vary switching data sources using data argument, demonstrated code chunk creates Figure 8.27:\nFIGURE 8.27: Map New Zealand created ggplot2.\nadvantage ggplot2 strong user-community many add-packages.\nGood additional resources can found open source ggplot2 book (Wickham 2016) descriptions multitude ‘ggpackages’ ggrepel tidygraph.Another benefit maps based ggplot2 can easily given level interactivity printed using function ggplotly() plotly package.\nTry plotly::ggplotly(g1), example, compare result plotly mapping functions described : blog.cpsievert..time, ggplot2 drawbacks.\ngeom_sf() function not always able create desired legend use spatial data.\nRaster objects also not natively supported ggplot2 need converted data frame plotting.covered mapping sf, raster ggplot2 packages first packages highly flexible, allowing creation wide range static maps.\ncover mapping packages plotting specific type map (next paragraph), worth considering alternatives packages already covered general-purpose mapping (Table 8.1).\nTABLE 8.1: Selected general-purpose 
mapping packages.\nTable 8.1 shows range mapping packages available, many others listed table.\nnote cartography, generates range unusual maps including choropleth, ‘proportional symbol’ ‘flow’ maps, documented vignette cartography.Several packages focus specific map types, illustrated Table 8.2.\npackages create cartograms distort geographical space, create line maps, transform polygons regular hexagonal grids, visualize complex data grids representing geographic topologies.\nTABLE 8.2: Selected specific-purpose mapping packages, associated metrics.\naforementioned packages, however, different approaches data preparation map creation.\nnext paragraph, focus solely cartogram package.\nTherefore, suggest read linemap, geogrid geofacet documentations learn .cartogram map geometry proportionately distorted represent mapping variable.\nCreation type map possible R cartogram, allows creating continuous non-contiguous area cartograms.\nmapping package per se, allows construction distorted spatial objects plotted using generic mapping package.cartogram_cont() function creates continuous area cartograms.\naccepts sf object name variable (column) inputs.\nAdditionally, possible modify intermax argument - maximum number iterations cartogram transformation.\nexample, represent median income New Zeleand’s regions continuous cartogram (right-hand panel Figure 8.28) follows:\nFIGURE 8.28: Comparison standard map (left) continuous area cartogram (right).\ncartogram also offers creation non-contiguous area cartograms using cartogram_ncont() Dorling cartograms using cartogram_dorling().\nNon-contiguous area cartograms created scaling region based provided weighting variable.\nDorling cartograms consist circles area proportional weighting variable.\ncode chunk demonstrates creation non-contiguous area Dorling cartograms US states’ population (Figure 8.29):\nFIGURE 8.29: Comparison non-continuous area cartogram (left) Dorling cartogram (right).\nNew mapping packages emerging 
time.\n2018 alone, number mapping packages released CRAN, including mapdeck, mapsapi, rayshader.\nterms interactive mapping, leaflet.extras contains many functions extending functionality leaflet (see end point-pattern vignette geocompkg website examples heatmaps created leaflet.extras).","code":"\ng = st_graticule(nz, lon = c(170, 175), lat = c(-45, -40, -35))\nplot(nz_water, graticule = g, axes = TRUE, col = \"blue\")\nraster::plot(nz_elev / 1000, add = TRUE)\nplot(st_geometry(nz), add = TRUE)\nlibrary(ggplot2)\ng1 = ggplot() + geom_sf(data = nz, aes(fill = Median_income)) +\n geom_sf(data = nz_height) +\n scale_x_continuous(breaks = c(170, 175))\ng1\nlibrary(cartogram)\nnz_carto = cartogram_cont(nz, \"Median_income\", itermax = 5)\ntm_shape(nz_carto) + tm_polygons(\"Median_income\")\nus_states2163 = st_transform(us_states, 2163)\nus_states2163_ncont = cartogram_ncont(us_states2163, \"total_pop_15\")\nus_states2163_dorling = cartogram_dorling(us_states2163, \"total_pop_15\")"},{"path":"adv-map.html","id":"exercises-6","chapter":"8 Making maps with R","heading":"8.7 Exercises","text":"exercises rely new object, africa.\nCreate using world worldbank_df datasets spData package follows (see Chapter 3):also use zion nlcd datasets spDataLarge:Create map showing geographic distribution Human Development Index (HDI) across Africa base graphics (hint: use plot()) tmap packages (hint: use tm_shape(africa) + ...).\nName two advantages based experience.\nName three mapping packages advantage .\nBonus: create three maps Africa using three packages.\nName two advantages based experience.Name three mapping packages advantage .Bonus: create three maps Africa using three packages.Extend tmap created previous exercise legend three bins: “High” (HDI 0.7), “Medium” (HDI 0.55 0.7) “Low” (HDI 0.55).\nBonus: improve map aesthetics, example changing legend title, class labels color palette.\nBonus: improve map aesthetics, example changing legend title, class labels color 
palette.Represent africa’s subregions map.\nChange default color palette legend title.\nNext, combine map map created previous exercise single plot.Create land cover map Zion National Park.\nChange default colors match perception land cover categories\nAdd scale bar north arrow change position improve map’s aesthetic appeal\nBonus: Add inset map Zion National Park’s location context Utah state. (Hint: object representing Utah can subset us_states dataset.)\nChange default colors match perception land cover categoriesAdd scale bar north arrow change position improve map’s aesthetic appealBonus: Add inset map Zion National Park’s location context Utah state. (Hint: object representing Utah can subset us_states dataset.)Create facet maps countries Eastern Africa:\none facet showing HDI representing population growth (hint: using variables HDI pop_growth, respectively)\n‘small multiple’ per country\none facet showing HDI representing population growth (hint: using variables HDI pop_growth, respectively)‘small multiple’ per countryBuilding previous facet map examples, create animated maps East Africa:\nShowing first spatial distribution HDI scores population growth\nShowing country order\nShowing first spatial distribution HDI scores population growthShowing country orderCreate interactive map Africa:\ntmap\nmapview\nleaflet\nBonus: approach, add legend (automatically provided) scale bar\ntmapWith mapviewWith leafletBonus: approach, add legend (automatically provided) scale barSketch paper ideas web mapping app used make transport land-use policies evidence based:\ncity live, couple users per day\ncountry live, dozens users per day\nWorldwide hundreds users per day large data serving requirements\ncity live, couple users per dayIn country live, dozens users per dayWorldwide hundreds users per day large data serving requirementsUpdate code coffeeApp/app.R instead centering Brazil user can select country focus :\nUsing textInput()\nUsing selectInput()\nUsing 
textInput()Using selectInput()Reproduce Figure 8.1 1st 6th panel Figure 8.6 closely possible using ggplot2 package.Join us_states us_states_df together calculate poverty rate state using new dataset.\nNext, construct continuous area cartogram based total population.\nFinally, create compare two maps poverty rate: (1) standard choropleth map (2) map using created cartogram boundaries.\ninformation provided first second map?\ndiffer ?Visualize population growth Africa.\nNext, compare maps hexagonal regular grid created using geogrid package.","code":"\nafrica = world %>% \n filter(continent == \"Africa\", !is.na(iso_a2)) %>% \n left_join(worldbank_df, by = \"iso_a2\") %>% \n dplyr::select(name, subregion, gdpPercap, HDI, pop_growth) %>% \n st_transform(\"+proj=aea +lat_1=20 +lat_2=-23 +lat_0=0 +lon_0=25\")\nzion = st_read((system.file(\"vector/zion.gpkg\", package = \"spDataLarge\")))\ndata(nlcd, package = \"spDataLarge\")"},{"path":"gis.html","id":"gis","chapter":"9 Bridges to GIS software","heading":"9 Bridges to GIS software","text":"","code":""},{"path":"gis.html","id":"prerequisites-7","chapter":"9 Bridges to GIS software","heading":"Prerequisites","text":"chapter requires QGIS, SAGA GRASS installed following packages attached:44","code":"\nlibrary(sf)\nlibrary(raster)\n#> Warning: multiple methods tables found for 'approxNA'\n#library(RQGIS)\nlibrary(RSAGA)\nlibrary(rgrass7)"},{"path":"gis.html","id":"introduction-6","chapter":"9 Bridges to GIS software","heading":"9.1 Introduction","text":"defining feature R way interact :\ntype commands hit Enter (Ctrl+Enter writing code source editor RStudio) execute interactively.\nway interacting computer called command-line interface (CLI) (see definition note ).\nCLIs unique R.45\ndedicated GIS packages, contrast, emphasis tends graphical user interface (GUI).\ncan interact GRASS, QGIS, SAGA gvSIG system terminals embedded CLIs Python Console QGIS, ‘pointing clicking’ norm.\nmeans many GIS users miss advantages 
of the command line, according to Gary Sherman, creator of QGIS (Sherman 2008):

With the advent of 'modern' GIS software, most people want to point and click their way through life. That's good, but there is a tremendous amount of flexibility and power waiting for you at the command line. Many times you can do something on the command line in a fraction of the time you can do it with a GUI.

The 'CLI vs GUI' debate can be adversarial; both options can be used interchangeably, depending on the task at hand and the user's skillset.46 The advantages of a good CLI such as that provided by R (and enhanced by IDEs such as RStudio) are numerous. A good CLI:

Facilitates the automation of repetitive tasks;
Enables transparency and reproducibility, the backbone of good scientific practice and data science;
Encourages software development by providing tools to modify existing functions and implement new ones;
Helps develop future-proof programming skills which are in high demand in many disciplines and industries; and
Is user-friendly and fast, allowing an efficient workflow.

On the other hand, GUI-based GIS systems (particularly QGIS) are also advantageous. A good GIS GUI:

Has a 'shallow' learning curve, meaning geographic data can be explored and visualized without hours of learning a new language;
Provides excellent support for 'digitizing' (creating new vector datasets), including trace, snap and topological tools;47
Enables georeferencing (matching raster images to existing maps) with ground control points and orthorectification;
Supports stereoscopic mapping (e.g., LiDAR and structure from motion); and
Provides access to spatial database management systems with object-oriented and relational data models, topology and fast (spatial) querying.

Another advantage of dedicated GISs is that they provide access to hundreds of 'geoalgorithms' (computational recipes to solve geographic problems, see Chapter 10). Many of these are unavailable from the R command line, except via 'GIS bridges', the topic of (and motivation for) this chapter.48

R originated as an interface language. Its predecessor S provided access to statistical algorithms in other languages (particularly FORTRAN), from an intuitive read-evaluate-print loop (REPL) (Chambers 2016). R continues this tradition with interfaces to numerous languages, notably C++, as described in Chapter 1. R was not designed as a GIS. However, its ability to interface with dedicated GISs gives it astonishing geospatial
capabilities.\nR well known statistical programming language, many people unaware ability replicate GIS workflows, additional benefits (relatively) consistent CLI.\nFurthermore, R outperforms GISs areas geocomputation, including interactive/animated map making (see Chapter 8) spatial statistical modeling (see Chapter 11).\nchapter focuses ‘bridges’ three mature open source GIS products (see Table 9.1): QGIS (via package RQGIS; Section 9.2), SAGA (via RSAGA; Section 9.3) GRASS (via rgrass7; Section 9.4).\nThough covered , worth aware interface ArcGIS, proprietary popular GIS software, via RPyGeo.49\ncomplement R-GIS bridges, chapter ends brief introduction interfaces spatial libraries (Section 9.6.1) spatial databases (Section 9.6.2).TABLE 9.1: Comparison three open-source GIS. Hybrid refers support vector raster operations.","code":""},{"path":"gis.html","id":"rqgis","chapter":"9 Bridges to GIS software","heading":"9.2 (R)QGIS","text":"QGIS one popular open-source GIS [Table 9.1; Graser Olaya (2015)].\nmain advantage lies fact provides unified interface several open-source GIS.\nmeans access GDAL, GRASS SAGA QGIS (Graser Olaya 2015).\nrun geoalgorithms (frequently 1000 depending set-) outside QGIS GUI, QGIS provides Python API.\nRQGIS establishes tunnel Python API reticulate package.\nBasically, functions set_env() open_app() .\nNote optional run set_env() open_app() since functions depending output run automatically needed.\nrunning RQGIS, make sure installed QGIS (third-party) dependencies SAGA GRASS.\ninstall RQGIS number dependencies required, described install_guide vignette, covers installation Windows, Linux Mac.\ntime writing (autumn 2018) RQGIS supports Long Term Release (2.18), support QGIS 3 pipeline (see RQGIS3).Leaving path-argument set_env() unspecified search computer QGIS installation.\nHence, faster specify explicitly path QGIS installation.\nSubsequently, open_app() sets paths necessary run QGIS within R, finally creates -called QGIS custom 
application (see http://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/intro.html#using-pyqgis--custom-applications).now ready QGIS geoprocessing within R!\nexample shows unite polygons, process unfortunately produces -called slivers , tiny polygons resulting overlaps inputs frequently occur real-world data.\nsee remove .union, use incongruent polygons already encountered Section 4.2.5.\npolygon datasets available spData package, like use geographic CRS (see also Chapter 6).find algorithm work, find_algorithms() searches QGIS geoalgorithms using regular expressions.\nAssuming short description function contains word “union”, can run:Short descriptions geoalgorithm can also provided, setting name_only = FALSE.\none clue name geoalgorithm might , one can leave search_term-argument empty return list available QGIS geoalgorithms.\ncan also find algorithms QGIS online documentation.next step find qgis:union can used.\nopen_help() opens online help geoalgorithm question.\nget_usage() returns function parameters default values.Finally, can let QGIS work.\nNote workhorse function run_qgis() accepts R named arguments, .e., can specify parameter names returned get_usage() run_qgis() regular R function.\nNote also run_qgis() accepts spatial objects residing R’s global environment input (: aggzone_wgs incongr_wgs).\ncourse, also specify paths spatial vector files stored disk.\nSetting load_output TRUE automatically loads QGIS output sf-object R.Note QGIS union operation merges two input layers one layer using intersection symmetrical difference two input layers (way also default union operation GRASS SAGA).\nst_union(incongr_wgs, aggzone_wgs) (see Exercises)!\nQGIS output contains empty geometries multipart polygons.\nEmpty geometries might lead problems subsequent geoprocessing tasks deleted.\nst_dimension() returns NA geometry empty, can therefore used filter.Next convert multipart polygons single-part polygons (also known explode geometries casting).\nnecessary 
deletion sliver polygons later .One way identify slivers find polygons comparatively small areas, , e.g., 25000 m2 (see blue colored polygons left panel Figure 9.1).next step find function makes slivers disappear.\nAssuming function short description contains word “sliver,” can run:returns one geoalgorithm whose parameters can accessed help get_usage() .Conveniently, user need specify single parameter.\ncase parameter left unspecified, run_qgis() automatically use corresponding default value argument available.\nfind default values, run get_args_man().remove slivers, specify polygons area less equal 25,000 m2 joined neighboring polygon largest area (see right panel Figure 9.1).\nFIGURE 9.1: Sliver polygons colored blue (left panel). Cleaned polygons (right panel).\ncode chunk note thatleaving output parameter(s) unspecified saves resulting QGIS output temporary folder created QGIS;\nrun_qgis() prints paths console successfully running QGIS engine; andif output consists multiple files set load_output TRUE, run_qgis() return list element corresponding one output file.learn RQGIS, see Muenchow, Schratz, Brenning (2017).","code":"\n#library(RQGIS)\nset_env(dev = FALSE)\n#> $`root`\n#> [1] \"C:/OSGeo4W64\"\n#> ...\nopen_app()\ndata(\"incongruent\", \"aggregating_zones\", package = \"spData\")\nincongr_wgs = st_transform(incongruent, 4326)\naggzone_wgs = st_transform(aggregating_zones, 4326)\nfind_algorithms(\"union\", name_only = TRUE)\n#> [1] \"qgis:union\" \"saga:fuzzyunionor\" \"saga:union\"\nalg = \"qgis:union\"\nopen_help(alg)\nget_usage(alg)\n#>ALGORITHM: Union\n#> INPUT \n#> INPUT2 \n#> OUTPUT \nunion = run_qgis(alg, INPUT = incongr_wgs, INPUT2 = aggzone_wgs, \n OUTPUT = file.path(tempdir(), \"union.shp\"),\n load_output = TRUE)\n#> $`OUTPUT`\n#> [1] \"C:/Users/geocompr/AppData/Local/Temp/RtmpcJlnUx/union.shp\"\n# remove empty geometries\nunion = union[!is.na(st_dimension(union)), ]\n# multipart polygons to single polygons\nsingle = st_cast(union, \"POLYGON\")\n# 
find polygons which are smaller than 25000 m^2\nx = 25000\nunits(x) = \"m^2\"\nsingle$area = st_area(single)\nsub = dplyr::filter(single, area < x)\nfind_algorithms(\"sliver\", name_only = TRUE)\n#> [1] \"qgis:eliminatesliverpolygons\"\nalg = \"qgis:eliminatesliverpolygons\"\nget_usage(alg)\n#>ALGORITHM: Eliminate sliver polygons\n#> INPUT \n#> KEEPSELECTION \n#> ATTRIBUTE \n#> COMPARISON \n#> COMPARISONVALUE \n#> MODE \n#> OUTPUT \n#> ...\nclean = run_qgis(\"qgis:eliminatesliverpolygons\",\n INPUT = single,\n ATTRIBUTE = \"area\",\n COMPARISON = \"<=\",\n COMPARISONVALUE = 25000,\n OUTPUT = file.path(tempdir(), \"clean.shp\"),\n load_output = TRUE)\n#> $`OUTPUT`\n#> [1] \"C:/Users/geocompr/AppData/Local/Temp/RtmpcJlnUx/clean.shp\""},{"path":"gis.html","id":"rsaga","chapter":"9 Bridges to GIS software","heading":"9.3 (R)SAGA","text":"System Automated Geoscientific Analyses (SAGA; Table 9.1) provides possibility execute SAGA modules via command line interface (saga_cmd.exe Windows just saga_cmd Linux) (see SAGA wiki modules).\naddition, Python interface (SAGA Python API).\nRSAGA uses former run SAGA within R.Though SAGA hybrid GIS, main focus raster processing, particularly digital elevation models (soil properties, terrain attributes, climate parameters).\nHence, SAGA especially good fast processing large (high-resolution) raster datasets (Conrad et al. 
2015).

We therefore introduce RSAGA with a raster use case from Muenchow, Brenning, and Richter (2012). Specifically, we would like to compute the SAGA wetness index from a digital elevation model. First of all, we need to make sure that RSAGA will find SAGA on the computer when called. For this, all RSAGA functions that run SAGA in the background make use of rsaga.env(). Usually, rsaga.env() will detect SAGA automatically by searching several likely directories (see its help for more information).

However, it is possible to have 'hidden' SAGA in a location that rsaga.env() does not search automatically. linkSAGA searches your computer for a valid SAGA installation. If it finds one, it adds the newest version to the PATH environment variable, thereby making sure that rsaga.env() runs successfully. It is only necessary to run the next code chunk if rsaga.env() was unsuccessful (see the previous code chunk).

Secondly, we need to write the digital elevation model to SAGA format. Note that calling data(landslides) attaches two objects to the global environment: dem, a digital elevation model in the form of a list, and landslides, a data.frame containing observations representing the presence or absence of a landslide.

The organization of SAGA is modular. Libraries contain so-called modules, i.e., geoalgorithms. To find out which libraries are available, run (output not shown):

We choose the library ta_hydrology (ta being the abbreviation for terrain analysis). Subsequently, we can access the available modules of a specific library (here: ta_hydrology) as follows:

rsaga.get.usage() prints the function parameters of a specific geoalgorithm, e.g., the SAGA Wetness Index, to the console.

Finally, you can run SAGA from within R using RSAGA's geoprocessing workhorse function rsaga.geoprocessor(). This function expects a parameter-argument list in which all necessary parameters are specified.

To facilitate access to the SAGA interface, RSAGA frequently provides user-friendly wrapper functions with meaningful default values (see the RSAGA documentation for examples, e.g., ?rsaga.wetness.index). So the function call for calculating the 'SAGA Wetness Index' becomes:

Of course, we would like to inspect the result visually (Figure 9.2). To load and plot the SAGA output file, we use the raster package.

FIGURE 9.2: SAGA wetness index of Mount Mongón, Peru.

You can find an extended version of this example in vignette("RSAGA-landslides"), which includes the use of statistical
geocomputing derive terrain attributes predictors non-linear Generalized Additive Model (GAM) predict spatially landslide susceptibility (Muenchow, Brenning, Richter 2012).\nterm statistical geocomputation emphasizes strength combining R’s data science power geoprocessing power GIS heart building bridge R GIS.","code":"\nlibrary(RSAGA)\nrsaga.env()\n#> Search for SAGA command line program and modules... \n#> Done\n#> $workspace\n#> [1] \".\"\n#> ...\nlibrary(link2GI)\nsaga = linkSAGA()\nrsaga.env()\ndata(landslides)\nwrite.sgrd(data = dem, file = file.path(tempdir(), \"dem\"), header = dem$header)\nrsaga.get.libraries()\nrsaga.get.modules(libs = \"ta_hydrology\")\nrsaga.get.usage(lib = \"ta_hydrology\", module = \"SAGA Wetness Index\")\nparams = list(DEM = file.path(tempdir(), \"dem.sgrd\"),\n TWI = file.path(tempdir(), \"twi.sdat\"))\nrsaga.geoprocessor(lib = \"ta_hydrology\", module = \"SAGA Wetness Index\", \n param = params)\nrsaga.wetness.index(in.dem = file.path(tempdir(), \"dem\"), \n out.wetness.index = file.path(tempdir(), \"twi\"))\nlibrary(raster)\ntwi = raster::raster(file.path(tempdir(), \"twi.sdat\"))\n# shown is a version using tmap\nplot(twi, col = RColorBrewer::brewer.pal(n = 9, name = \"Blues\"))"},{"path":"gis.html","id":"rgrass","chapter":"9 Bridges to GIS software","heading":"9.4 GRASS through rgrass7","text":"U.S. Army - Construction Engineering Research Laboratory (USA-CERL) created core Geographical Resources Analysis Support System (GRASS) [Table 9.1; Neteler Mitasova (2008)] 1982 1995.\nAcademia continued work since 1997.\nSimilar SAGA, GRASS focused raster processing beginning later, since GRASS 6.0, adding advanced vector functionality (R. 
Bivand, Pebesma, and Gómez-Rubio 2013).

Here we introduce rgrass7 with one of the most interesting problems in GIScience: the traveling salesman problem. Suppose a traveling salesman would like to visit 24 customers. Additionally, he would like to start and finish his journey at home, which makes a total of 25 locations, while covering the shortest distance possible. There is a single best solution to this problem; however, checking all possible solutions is (mostly) impossible even for modern computers (P. Longley 2015). In our case, the number of possible solutions corresponds to (25 - 1)! / 2, i.e., the factorial of 24 divided by 2 (since we do not differentiate between forward and backward direction). Even if one iteration could be done in a nanosecond, this still corresponds to 9,837,145 years. Luckily, there are clever, almost optimal solutions which run in a tiny fraction of this inconceivable amount of time. GRASS GIS provides one of these solutions (for more details, see v.net.salesman). In our use case, we would like to find the shortest path between the first 25 bicycle hire stations (instead of customers) on London's streets (and we simply assume that the first bike station corresponds to the home of our traveling salesman).

Aside from the cycle hire points data, we need OpenStreetMap data of London. We download it with the help of the osmdata package (see also Section 7.2). We constrain the download of the street network (called "highway" in OSM language) to the bounding box of the cycle hire data, and attach the corresponding data as an sf-object. osmdata_sf() returns a list with several spatial objects (points, lines, polygons, etc.). Here, we will only keep the line objects. OpenStreetMap objects come with a lot of columns; the streets feature almost 500. In fact, we are only interested in the geometry column. Nevertheless, we are keeping one attribute column; otherwise, we will run into trouble when trying to provide writeVECT() with a geometry-only object (see ?writeVECT for more details). Remember that the geometry column is sticky; hence, even though we are just selecting one attribute, the geometry column will also be returned (see Section 2.2.1).

For the convenience of the reader, one can attach london_streets to the global environment using data("london_streets", package = "spDataLarge").

Now that we have the data, we can go on and initiate a GRASS session, i.e., we have to create a GRASS spatial database. The GRASS geodatabase system is based on SQLite. Consequently, different users can easily work on the same project, possibly with different read/write
permissions.\nHowever, one set spatial database (also within R), users used GIS GUI popping one click might find process bit intimidating beginning.\nFirst , GRASS database requires directory, contains location (see GRASS GIS Database help pages grass.osgeo.org information).\nlocation turn simply contains geodata one project.\nWithin one location, several mapsets can exist typically refer different users.\nPERMANENT mandatory mapset created automatically.\nstores projection, spatial extent default resolution raster data.\norder share geographic data users project, database owner can add spatial data PERMANENT mapset.\nPlease refer Neteler Mitasova (2008) GRASS GIS quick start information GRASS spatial database system.set location mapset want use GRASS within R.\nFirst , need find GRASS 7 installed computer.link data.frame contains rows GRASS 7 installations computer.\n, use GRASS 7 installation.\ninstalled GRASS 7 computer, recommend now.\nAssuming found working installation computer, use corresponding path initGRASS.\nAdditionally, specify store spatial database (gisDbase), name location london, use PERMANENT mapset.Subsequently, define projection, extent resolution.familiar set GRASS environment, becomes tedious .\nLuckily, linkGRASS7() link2GI packages lets one line code.\nthing need provide spatial object determines projection extent spatial database..\nFirst, linkGRASS7() finds GRASS installations computer.\nSince set ver_select TRUE, can interactively choose one found GRASS-installations.\njust one installation, linkGRASS7() automatically chooses one.\nSecond, linkGRASS7() establishes connection GRASS 7.can use GRASS geoalgorithms, need add data GRASS’s spatial database.\nLuckily, convenience function writeVECT() us.\n(Use writeRAST() case raster data.)\ncase add street cycle hire point data using first attribute column, name also london_streets points.use sf-objects rgrass7, run use_sf() first (note: code assumes running rgrass7 0.2.1 ).perform network 
analysis, need topological clean street network.\nGRASS’s v.clean takes care removal duplicates, small angles dangles, among others.\n, break lines intersection ensure subsequent routing algorithm can actually turn right left intersection, save output GRASS object named streets_clean.\nlikely cycling station points lie exactly street segment.\nHowever, find shortest route , need connect nearest streets segment.\nv.net’s connect-operator exactly .\nsave output streets_points_con.resulting clean dataset serves input v.net.salesman-algorithm, finally finds shortest route cycle hire stations.\ncenter_cats requires numeric range input.\nrange represents points shortest route calculated.\nSince like calculate route cycle stations, set 1-25.\naccess GRASS help page traveling salesman algorithm, run execGRASS(\"g.manual\", entry = \"v.net.salesman\").visualize result, import output layer R, convert sf-object keeping geometry, visualize help mapview package (Figure 9.3 Section 8.4).\nFIGURE 9.3: Shortest route (blue line) 24 cycle hire stations (blue dots) OSM street network London.\nimportant considerations note process:used GRASS’s spatial database (based SQLite) allows faster processing.\nmeans exported geographic data beginning.\ncreated new objects imported final result back R.\nfind datasets currently available, run execGRASS(\"g.list\", type = \"vector,raster\", flags = \"p\").also accessed already existing GRASS spatial database within R.\nPrior importing data R, might want perform (spatial) subsetting.\nUse v.select v.extract vector data.\ndb.select lets select subsets attribute table vector layer without returning corresponding geometry.can also start R within running GRASS session (information please refer R. 
Bivand, Pebesma, Gómez-Rubio 2013 wiki).Refer excellent GRASS online help execGRASS(\"g.manual\", flags = \"\") information available GRASS geoalgorithm.like use GRASS 6 within R, use R package spgrass6.","code":"\ndata(\"cycle_hire\", package = \"spData\")\npoints = cycle_hire[1:25, ]\nlibrary(osmdata)\nb_box = st_bbox(points)\nlondon_streets = opq(b_box) %>%\n add_osm_feature(key = \"highway\") %>%\n osmdata_sf() %>%\n `[[`(\"osm_lines\")\nlondon_streets = dplyr::select(london_streets, osm_id)\nlibrary(link2GI)\nlink = findGRASS() \nlibrary(rgrass7)\n# find a GRASS 7 installation, and use the first one\nind = grep(\"7\", link$version)[1]\n# next line of code only necessary if we want to use GRASS as installed by \n# OSGeo4W. Among others, this adds some paths to PATH, which are also needed\n# for running GRASS.\nlink2GI::paramGRASSw(link[ind, ])\ngrass_path = \n ifelse(test = !is.null(link$installation_type) && \n link$installation_type[ind] == \"osgeo4W\",\n yes = file.path(link$instDir[ind], \"apps/grass\", link$version[ind]),\n no = link$instDir)\ninitGRASS(gisBase = grass_path,\n # home parameter necessary under UNIX-based systems\n home = tempdir(),\n gisDbase = tempdir(), location = \"london\", \n mapset = \"PERMANENT\", override = TRUE)\nexecGRASS(\"g.proj\", flags = c(\"c\", \"quiet\"), \n proj4 = st_crs(london_streets)$proj4string)\nb_box = st_bbox(london_streets) \nexecGRASS(\"g.region\", flags = c(\"quiet\"), \n n = as.character(b_box[\"ymax\"]), s = as.character(b_box[\"ymin\"]), \n e = as.character(b_box[\"xmax\"]), w = as.character(b_box[\"xmin\"]), \n res = \"1\")\nlink2GI::linkGRASS7(london_streets, ver_select = TRUE)\nuse_sf()\nwriteVECT(SDF = london_streets, vname = \"london_streets\")\nwriteVECT(SDF = points[, 1], vname = \"points\")\n# clean street network\nexecGRASS(cmd = \"v.clean\", input = \"london_streets\", output = \"streets_clean\",\n tool = \"break\", flags = \"overwrite\")\n# connect points with street network\nexecGRASS(cmd = 
\"v.net\", input = \"streets_clean\", output = \"streets_points_con\", \n points = \"points\", operation = \"connect\", threshold = 0.001,\n flags = c(\"overwrite\", \"c\"))\nexecGRASS(cmd = \"v.net.salesman\", input = \"streets_points_con\",\n output = \"shortest_route\", center_cats = paste0(\"1-\", nrow(points)),\n flags = c(\"overwrite\"))\nroute = readVECT(\"shortest_route\") %>%\n st_as_sf() %>%\n st_geometry()\nmapview::mapview(route, map.types = \"OpenStreetMap.BlackAndWhite\", lwd = 7) +\n points"},{"path":"gis.html","id":"when-to-use-what","chapter":"9 Bridges to GIS software","heading":"9.5 When to use what?","text":"recommend single R-GIS interface hard since usage depends personal preferences, tasks hand familiarity different GIS software packages turn probably depends field study.\nmentioned previously, SAGA especially good fast processing large (high-resolution) raster datasets, frequently used hydrologists, climatologists soil scientists (Conrad et al. 2015).\nGRASS GIS, hand, GIS presented supporting topologically based spatial database especially useful network analyses also simulation studies (see ).\nQGIS much user-friendly compared GRASS- SAGA-GIS, especially first-time GIS users, probably popular open-source GIS.\nTherefore, RQGIS appropriate choice use cases.\nmain advantages area unified access several GIS, therefore provision >1000 geoalgorithms (Table 9.1) including duplicated functionality, e.g., can perform overlay-operations using QGIS-, SAGA- GRASS-geoalgorithms;automatic data format conversions (SAGA uses .sdat grid files GRASS uses database format QGIS handle corresponding conversions);automatic passing geographic R objects QGIS geoalgorithms back R; andconvenience functions support access online help, named arguments automatic default value retrieval (rgrass7 inspired latter two features).means, use cases certainly use one R-GIS bridges.\nThough QGIS GIS providing unified interface several GIS software packages, provides access 
subset corresponding third-party geoalgorithms (information please refer Muenchow, Schratz, Brenning (2017)).\nTherefore, use complete set SAGA GRASS functions, stick RSAGA rgrass7.\n, take advantage RSAGA’s numerous user-friendly functions.\nNote also, RSAGA offers native R functions geocomputation multi.local.function(), pick..points() many .\nRSAGA supports much SAGA versions (R)QGIS.\nFinally, need topological correct data /spatial database management functionality multi-user access, recommend usage GRASS.\naddition, like run simulations help geodatabase (Krug, Roura-Pascual, Richardson 2010), use rgrass7 directly since RQGIS always starts new GRASS session call.Please note number GIS software packages scripting interface dedicated R package accesses : gvSig, OpenJump, Orfeo Toolbox TauDEM.","code":""},{"path":"gis.html","id":"other-bridges","chapter":"9 Bridges to GIS software","heading":"9.6 Other bridges","text":"focus chapter R interfaces Desktop GIS software.\nemphasize bridges dedicated GIS software well-known common ‘way ’ understanding geographic data.\nalso provide access many geoalgorithms.‘bridges’ include interfaces spatial libraries (Section 9.6.1 shows access GDAL CLI R), spatial databases (see Section 9.6.2) web mapping services (see Chapter 8).\nsection provides snippet possible.\nThanks R’s flexibility, ability call programs system integration languages (notably via Rcpp reticulate), many bridges possible.\naim comprehensive, demonstrate ways accessing ‘flexibility power’ quote Sherman (2008) beginning chapter.","code":""},{"path":"gis.html","id":"gdal","chapter":"9 Bridges to GIS software","heading":"9.6.1 Bridges to GDAL","text":"discussed Chapter 7, GDAL low-level library supports many geographic data formats.\nGDAL effective GIS programs use GDAL background importing exporting geographic data, rather re-inventing wheel using bespoke read-write code.\nGDAL offers data /O.\ngeoprocessing tools vector raster data, functionality create tiles 
for serving raster data online, and rapid rasterization of vector data, all of which can be accessed via the system of R command line.\nThe code chunk below demonstrates this functionality:\nlinkGDAL() searches the computer for a working GDAL installation and adds the location of the executable files to the PATH variable, allowing GDAL to be called.\nIn the example below ogrinfo provides metadata of a vector dataset:This example — which returns the same result as rgdal::ogrInfo() — may be simple, but it shows how to use GDAL via the system command-line, independently of other packages.\nThe ‘link’ to GDAL provided by link2gi could be used as a foundation for doing more advanced GDAL work from the R or system CLI.50\nTauDEM (http://hydrology.usu.edu/taudem/taudem5/index.html) and the Orfeo Toolbox (https://www.orfeo-toolbox.org/) are other spatial data processing libraries/programs offering a command line interface.\nAt the time of writing, it appears that there is only a developer version of an R/TauDEM interface on R-Forge (https://r-forge.r-project.org/R/?group_id=956).\nIn any case, the above example shows how to access these libraries from the system command line via R.\nThis in turn could be the starting point for creating a proper interface to these libraries in the form of new R packages.Before diving into a project to create a new bridge, however, it is important to be aware of the power of existing R packages and that system() calls may not be platform-independent (they may fail on some computers).\nFurthermore, sf brings most of the power provided by GDAL, GEOS and PROJ to R via the R/C++ interface provided by Rcpp, which avoids system() calls.","code":"\nlink2GI::linkGDAL()\ncmd = paste(\"ogrinfo -ro -so -al\", system.file(\"shape/nc.shp\", package = \"sf\"))\nsystem(cmd)\n#> INFO: Open of `C:/Users/geocompr/Documents/R/win-library/3.5/sf/shape/nc.shp'\n#> using driver `ESRI Shapefile' successful.\n#> \n#> Layer name: nc\n#> Metadata:\n#> DBF_DATE_LAST_UPDATE=2016-10-26\n#> Geometry: Polygon\n#> Feature Count: 100\n#> Extent: (-84.323853, 33.881992) - (-75.456978, 36.589649)\n#> Layer SRS WKT:\n#> ..."},{"path":"gis.html","id":"postgis","chapter":"9 Bridges to GIS software","heading":"9.6.2 Bridges to spatial databases","text":"\nSpatial database management systems (spatial DBMS) store spatial and non-spatial data in a structured way.\nThey can organize large collections of data into related tables (entities) via unique
identifiers (primary and foreign keys) and implicitly via space (think for instance of a spatial join).\nThis is useful because geographic datasets tend to become big and messy quite quickly.\nDatabases enable storing and querying large datasets efficiently based on spatial and non-spatial fields, and provide multi-user access and topology support.The most important open source spatial database is PostGIS (Obe and Hsu 2015).51\nR bridges to spatial DBMSs such as PostGIS are important, allowing access to huge data stores without loading several gigabytes of geographic data into RAM, which would likely crash the R session.\nThe remainder of this section shows how PostGIS can be called from R, based on the “Hello real world” example of PostGIS in Action, Second Edition (Obe and Hsu 2015).52The subsequent code requires a working internet connection, since we are accessing a PostgreSQL/PostGIS database living in the QGIS Cloud (https://qgiscloud.com/).53Often the first question is, ‘which tables can be found in the database?’\nThis can be asked as follows (the answer is 5 tables):We are only interested in the restaurants and the highways tables.\nThe former represents the locations of fast-food restaurants in the US, and the latter are principal US highways.\nTo find out about attributes available in a table, we can run:The first query will select US Route 1 in Maryland (MD).\nNote that st_read() allows us to read geographic data from a database if it is provided with an open connection to a database and a query.\nAdditionally, st_read() needs to know which column represents the geometry (here: wkb_geometry).The result is an sf-object named us_route of type sfc_MULTILINESTRING.\nThe next step is to add a 20-mile buffer (which corresponds to 1609 meters times 20) around the selected highway (Figure 9.4).Note that the spatial query is using functions (ST_Union(), ST_Buffer()) that should already be familiar since you find them also in the sf-package, though there they are written in lowercase characters (st_union(), st_buffer()).\nIn fact, function names of the sf package largely follow the PostGIS naming conventions.54\nThe last query will find all Hardee restaurants (HDE) within the buffer zone (Figure 9.4).Please refer to Obe and Hsu (2015) for a detailed explanation of this spatial SQL query.\nFinally, it is good practice to close the database connection as follows:55\nFIGURE 9.4: Visualization of the output of previous PostGIS commands showing the highway (black line), a buffer (light yellow) and three restaurants
(light blue points) within the buffer.\nUnlike PostGIS, sf only supports spatial vector data.\nTo query and manipulate raster data stored in a PostGIS database, use the rpostgis package (Bucklin and Basille 2018) and/or use command-line tools such as rastertopgsql which comes as part of the PostGIS installation.This subsection is only a brief introduction to PostgreSQL/PostGIS.\nNevertheless, we would like to encourage the practice of storing geographic and non-geographic data in a spatial DBMS while only attaching those subsets to R’s global environment which are needed for further (geo-)statistical analysis.\nPlease refer to Obe and Hsu (2015) for a more detailed description of the SQL queries presented and a more comprehensive introduction to PostgreSQL/PostGIS in general.\nPostgreSQL/PostGIS is a formidable choice as an open-source spatial database.\nBut the same is true for the lightweight SQLite/SpatiaLite database engine and GRASS which uses SQLite in the background (see Section 9.4).As a final note, if your data is getting too big for PostgreSQL/PostGIS and you require massive spatial data management and query performance, then the next logical step is to use large-scale geographic querying on distributed computing systems, as for example, provided by GeoMesa (http://www.geomesa.org/) or GeoSpark [http://geospark.datasyslab.org/; Huang et al.
(2017)].","code":"\nlibrary(RPostgreSQL)\nconn = dbConnect(drv = PostgreSQL(), dbname = \"rtafdf_zljbqm\",\n host = \"db.qgiscloud.com\",\n port = \"5432\", user = \"rtafdf_zljbqm\", \n password = \"d3290ead\")\ndbListTables(conn)\n#> [1] \"spatial_ref_sys\" \"topology\" \"layer\" \"restaurants\" \n#> [5] \"highways\" \ndbListFields(conn, \"highways\")\n#> [1] \"qc_id\" \"wkb_geometry\" \"gid\" \"feature\" \n#> [5] \"name\" \"state\" \nquery = paste(\n \"SELECT *\",\n \"FROM highways\",\n \"WHERE name = 'US Route 1' AND state = 'MD';\")\nus_route = st_read(conn, query = query, geom = \"wkb_geometry\")\nquery = paste(\n \"SELECT ST_Union(ST_Buffer(wkb_geometry, 1609 * 20))::geometry\",\n \"FROM highways\",\n \"WHERE name = 'US Route 1' AND state = 'MD';\")\nbuf = st_read(conn, query = query)\nquery = paste(\n \"SELECT r.wkb_geometry\",\n \"FROM restaurants r\",\n \"WHERE EXISTS (\",\n \"SELECT gid\",\n \"FROM highways\",\n \"WHERE\",\n \"ST_DWithin(r.wkb_geometry, wkb_geometry, 1609 * 20) AND\",\n \"name = 'US Route 1' AND\",\n \"state = 'MD' AND\",\n \"r.franchise = 'HDE');\"\n)\nhardees = st_read(conn, query = query)\nRPostgreSQL::postgresqlCloseConnection(conn)#> old-style crs object detected; please recreate object with a recent sf::st_crs()\n#> old-style crs object detected; please recreate object with a recent sf::st_crs()"},{"path":"gis.html","id":"exercises-7","chapter":"9 Bridges to GIS software","heading":"9.7 Exercises","text":"Create two overlapping polygons (poly_1 and poly_2) with the help of the sf-package (see Chapter 2).Union poly_1 and poly_2 using st_union() and qgis:union.\nWhat is the difference between the two union operations?\nHow can we use the sf package to obtain the same result as QGIS?Calculate the intersection of poly_1 and poly_2 using:\nRQGIS, RSAGA
and rgrass7\nsf\nAttach data(dem, package = \"spDataLarge\") and data(random_points, package = \"spDataLarge\").\nSelect a point randomly from random_points and find all dem pixels that can be seen from this point (hint: viewshed).\nVisualize your result.\nFor example, plot a hillshade, and on top of it the digital elevation model, your viewshed output and the point.\nAdditionally, give mapview a try.Compute catchment area and catchment slope of data(\"dem\", package = \"spDataLarge\") using RSAGA (see Section 9.3).Use gdalinfo via a system call for a raster file stored on disk of your choice (see Section 9.6.1).Query all Californian highways from the PostgreSQL/PostGIS database living in the QGIS Cloud introduced in this chapter (see Section 9.6.2).","code":""},{"path":"algorithms.html","id":"algorithms","chapter":"10 Scripts, algorithms and functions","heading":"10 Scripts, algorithms and functions","text":"","code":""},{"path":"algorithms.html","id":"prerequisites-8","chapter":"10 Scripts, algorithms and functions","heading":"Prerequisites","text":"This chapter primarily uses base R; the sf package is used to check the result of an algorithm we will develop.\nIt assumes an understanding of the geographic classes introduced in Chapter 2 and how they can be used to represent a wide range of input file formats (see Chapter 7).","code":""},{"path":"algorithms.html","id":"intro-algorithms","chapter":"10 Scripts, algorithms and functions","heading":"10.1 Introduction","text":"Chapter 1 established that geocomputation is not only about using existing tools, but developing new ones, “in the form of shareable R scripts and
functions.”\nThis chapter teaches these building blocks of reproducible code.\nIt also introduces low-level geometric algorithms, of the type used in Chapter 9.\nReading it should help you to understand how algorithms work and to write code that can be used many times, by many people, on multiple datasets.\nThe chapter cannot, by itself, make you a skilled programmer.\nProgramming is hard and requires plenty of practice (Abelson, Sussman, and Sussman 1996):To appreciate programming as an intellectual activity in its own right you must turn to computer programming; you must read and write computer programs — many of them.There are strong reasons for moving in that direction, however.56\nThe advantages of reproducibility go beyond allowing others to replicate your work:\nreproducible code is often better in every way than code written to be run only once, including in terms of computational efficiency, scalability and ease of adapting and maintaining it.Scripts are the basis of reproducible R code, a topic covered in Section 10.2.\nAlgorithms are recipes for modifying inputs using a series of steps, resulting in an output, as described in Section 10.3.\nTo ease sharing and reproducibility, algorithms can be placed into functions.\nThat is the topic of Section 10.4.\nThe example of finding the centroid of a polygon will be used to tie these concepts together.\nChapter 5 already introduced a centroid function st_centroid(), but this example highlights how seemingly simple operations are the result of comparatively complex code, affirming the following observation (Wise 2001):One of the most intriguing things about spatial data problems is that things which appear to be trivially easy to a human can be surprisingly difficult on a computer.The example also reflects a secondary aim of the chapter which, following Xiao (2016), is “not to duplicate what is available out there, but to show how things out there work.”","code":""},{"path":"algorithms.html","id":"scripts","chapter":"10 Scripts, algorithms and functions","heading":"10.2 Scripts","text":"If functions distributed in packages are the building blocks of R code, scripts are the glue that holds them together, in a logical order, to create reproducible workflows.\nTo programming novices scripts may sound intimidating but they are simply plain text files, typically saved with an extension representing the language they contain.\nR scripts are generally saved with a .R extension and named to reflect what they do.\nAn example is 10-hello.R, a script file stored in the code folder of the book’s repository, which contains
the following two lines of code:These lines of code may not be particularly exciting but they demonstrate the point: scripts do not need to be complicated.\nSaved scripts can be called and executed in their entirety with source(), as demonstrated below which shows how the comment is ignored but the instruction is executed:There are no strict rules on what can and cannot go into script files and nothing to prevent you from saving broken, non-reproducible code.57\nThere are, however, some conventions worth following:Write the script in order: just like the script of a film, scripts should have a clear order such as ‘setup,’ ‘data processing’ and ‘save results’ (roughly equivalent to ‘beginning,’ ‘middle’ and ‘end’ in a film).Add comments to the script so other people (and your future self) can understand it. At a minimum, a comment should state the purpose of the script (see Figure 10.1) and (for long scripts) divide it into sections. This can be done in RStudio, for example, with the shortcut Ctrl+Shift+R, which creates ‘foldable’ code section headings.Above all, scripts should be reproducible: self-contained scripts that will work on any computer are more useful than scripts that only run on your computer, on a good day. This involves attaching required packages at the beginning, reading-in data from persistent sources (such as a reliable website) and ensuring that previous steps have been taken.58It is hard to enforce reproducibility in R scripts, but there are tools that can help.\nBy default, RStudio ‘code-checks’ R scripts and underlines faulty code with a red wavy line, as illustrated below:\nFIGURE 10.1: Code checking in RStudio.
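The source() workflow described above can be demonstrated in a self-contained way. The snippet below is a minimal sketch, not the book's 10-hello.R: it writes a two-line script to a temporary file (tempfile() stands in for a real path such as code/10-hello.R) and then executes it in its entirety:

```r
# A minimal sketch of the script workflow: write a short script to a
# temporary file, then execute the whole file with source().
script = tempfile(fileext = ".R")
writeLines(c(
  "# Aim: provide a minimal R script",
  "msg = paste('Hello', 'geocompr')"
), script)
source(script)  # the comment line is ignored, the instruction is executed
msg
#> [1] "Hello geocompr"
```

Because source() (with the default local = FALSE) evaluates the script in the calling environment, the object msg created inside the script is available afterwards.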
In this example, the script 10-centroid-alg.R, highlights an unclosed curly bracket on line 19.\nThe contents of this section apply to any type of R script.\nA particular consideration with scripts for geocomputation is that they tend to have external dependencies, such as the QGIS dependency needed to run the code in Chapter 9, and require input data in a specific format.\nSuch dependencies should be mentioned as comments in the script or elsewhere in the project of which it is a part, as illustrated in the script 10-centroid-alg.R.\nThe work undertaken by this script is demonstrated in the reproducible example below, which works on a pre-requisite object named poly_mat, a square with sides 9 units in length (the meaning of this will become apparent in the next section):59","code":"\n# Aim: provide a minimal R script\nprint(\"Hello geocompr\")\nsource(\"code/10-hello.R\")\n#> [1] \"Hello geocompr\"\npoly_mat = cbind(\n x = c(0, 0, 9, 9, 0),\n y = c(0, 9, 9, 0, 0)\n)\nsource(\"https://git.io/10-centroid-alg.R\") # short url#> [1] \"The area is: 81\"\n#> [1] \"The coordinates of the centroid are: 4.5, 4.5\""},{"path":"algorithms.html","id":"geometric-algorithms","chapter":"10 Scripts, algorithms and functions","heading":"10.3 Geometric algorithms","text":"Algorithms can be understood as the computing equivalent of a cooking recipe.\nThey are a complete set of instructions which, when undertaken on the input (ingredients), result in useful (tasty) outputs.\nBefore diving into a concrete case study, a brief history will show how algorithms relate to scripts (covered in Section 10.2) and functions (which can be used to generalize algorithms, as we’ll see in Section 10.4).The word “algorithm” originated in 9th century Baghdad with the publication of Hisab al-jabr w’al-muqabala, an early math textbook.\nThe book was translated into Latin and became so popular that the author’s last name, al-Khwārizmī, “was immortalized as a scientific term: Al-Khwarizmi\nbecame Alchoarismi, Algorismi and, eventually, algorithm” (Bellos 2011).\nIn the computing age, algorithm refers to a series of steps that solves a problem, resulting in a pre-defined output.\nInputs must be formally defined in a suitable data structure (Wise 2001).\nAlgorithms often start as flow charts or pseudocode showing the aim of the process before being implemented in code.\nTo ease usability, common algorithms are often packaged inside functions, which may hide some or all of the steps taken (unless you look
at the function’s source code, see Section 10.4).Geoalgorithms, such as those we encountered in Chapter 9, are algorithms that take geographic data in and, generally, return geographic results (alternative terms for the same thing include GIS algorithms and geometric algorithms).\nThat may sound simple but it is a deep subject with an entire academic field, Computational Geometry, dedicated to their study (Berg et al. 2008) and numerous books on the subject.\nO’Rourke (1998), for example, introduces the subject with a range of progressively harder geometric algorithms using reproducible and freely available C code.An example of a geometric algorithm is one that finds the centroid of a polygon.\nThere are many approaches to centroid calculation, some of which work only on specific types of spatial data.\nFor the purposes of this section, we choose an approach that is easy to visualize: breaking the polygon into many triangles and finding the centroid of each, an approach discussed by Kaiser and Morin (1993) alongside other centroid algorithms (and mentioned briefly in O’Rourke 1998).\nIt helps to further break down this approach into discrete tasks before writing any code (subsequently referred to as step 1 to step 4, these could also be presented as a schematic diagram or pseudocode):Divide the polygon into contiguous triangles.Find the centroid of each triangle.Find the area of each triangle.Find the area-weighted mean of triangle centroids.These steps may sound straightforward, but converting words into working code requires some work and plenty of trial-and-error, even when the inputs are constrained:\nThe algorithm will only work for convex polygons, which contain no internal angles greater than 180°, so no star shapes allowed (the packages decido and sfdct can triangulate non-convex polygons using external libraries, as shown in the algorithm vignette at geocompr.github.io).The simplest data structure of a polygon is a matrix of x and y coordinates in which each row represents a vertex tracing the polygon’s border in order, where the first and last rows are identical (Wise 2001).\nIn this case, we’ll create a polygon with five vertices in base R, building on an example from GIS Algorithms (Xiao 2016 see github.com/gisalgs for Python code), as illustrated in Figure 10.2:Now that we have an example dataset, we are ready to undertake step 1 outlined above.\nThe code below shows how this can be done by creating a single triangle (T1), which demonstrates the method; it also demonstrates step 2, by calculating its centroid based on the formula \(1/3(a + b + c)\) where \(a\) to \(c\) are the coordinates representing the triangle’s
vertices:\nFIGURE 10.2: Illustration of the polygon centroid calculation problem.\nStep 3 is to find the area of each triangle, so a weighted mean accounting for the disproportionate impact of large triangles can be calculated.\nThe formula to calculate the area of a triangle is as follows (Kaiser and Morin 1993):\\[\n\\frac{A_x ( B_y − C_y ) + B_x ( C_y − A_y ) + C_x ( A_y − B_y )}\n{ 2 }\n\\]Where \(A\) to \(C\) are the triangle’s three points and \(x\) and \(y\) refer to the x and y dimensions.\nA translation of this formula into R code that works with the data in the matrix representation of the triangle T1 is as follows (the function abs() ensures a positive result):This code chunk outputs the correct result.60\nThe problem with the code is that it is clunky and must be re-typed if we want to run it on another triangle matrix.\nTo make the code more generalizable, we will see how it can be converted into a function in Section 10.4.Step 4 requires steps 2 and 3 to be undertaken not just on one triangle (as demonstrated above) but on all triangles.\nThis requires iteration to create all the triangles representing the polygon, illustrated in Figure 10.3.\nlapply() and vapply() are used to iterate over each triangle here because they provide a concise solution in base R:61\nFIGURE 10.3: Illustration of the iterative centroid algorithm with triangles. The X represents the area-weighted centroid in iterations 2 and 3.\nWe are now in a position to complete step 4 to calculate the total area with sum(A) and the centroid coordinates of the polygon with weighted.mean(C[, 1], A) and weighted.mean(C[, 2], A) (exercise for alert readers: verify these commands work).\nTo demonstrate the link between algorithms and scripts, the contents of this section have been condensed into 10-centroid-alg.R.\nWe saw at the end of Section 10.2 how this script can calculate the centroid of a square.\nThe great thing about scripting the algorithm is that it works on the new poly_mat object (see exercises below to verify these results with reference to st_centroid()):The example above shows that low-level geographic operations can be developed from first principles with base R.\nIt also shows that if a tried-and-tested solution already exists, it may not be worth re-inventing the wheel:\nif we aimed only to find the centroid of a polygon, it would have been quicker to represent poly_mat as an sf object and use the pre-existing sf::st_centroid() function instead.\nHowever, the great benefit of writing algorithms from 1st principles is that you will understand every step of the process, something that cannot be guaranteed when using other peoples’ code.\nA further consideration is performance: R is slow compared with low-level languages such as C++ for number crunching (see
Section 1.3) and optimization is difficult.\nIf the aim is to develop new methods, computational efficiency should not be prioritized.\nThis is captured in the saying “premature optimization is the root of all evil (or at least most of it) in programming” (Knuth 1974).Algorithm development is hard.\nThis should be apparent from the amount of work that has gone into developing a centroid algorithm in base R that is just one, rather inefficient, approach to the problem with limited real-world applications (convex polygons are uncommon in practice).\nThe experience should lead to an appreciation of low-level geographic libraries such as GEOS (which underlies sf::st_centroid()) and CGAL (the Computational Geometry Algorithms Library) which not only run fast but work on a wide range of input geometry types.\nA great advantage of the open source nature of these libraries is that their source code is readily available for study, comprehension and (for those with the skills and confidence) modification.62","code":"\n# generate a simple matrix representation of a polygon:\nx_coords = c(10, 0, 0, 12, 20, 10)\ny_coords = c(0, 0, 10, 20, 15, 0)\npoly_mat = cbind(x_coords, y_coords)\n# create a point representing the origin:\nOrigin = poly_mat[1, ]\n# create 'triangle matrix':\nT1 = rbind(Origin, poly_mat[2:3, ], Origin) \n# find centroid (drop = FALSE preserves classes, resulting in a matrix):\nC1 = (T1[1, , drop = FALSE] + T1[2, , drop = FALSE] + T1[3, , drop = FALSE]) / 3\n# calculate the area of the triangle represented by matrix T1:\nabs(T1[1, 1] * (T1[2, 2] - T1[3, 2]) +\n T1[2, 1] * (T1[3, 2] - T1[1, 2]) +\n T1[3, 1] * (T1[1, 2] - T1[2, 2]) ) / 2\n#> [1] 50\ni = 2:(nrow(poly_mat) - 2)\nT_all = lapply(i, function(x) {\n rbind(Origin, poly_mat[x:(x + 1), ], Origin)\n})\n\nC_list = lapply(T_all, function(x) (x[1, ] + x[2, ] + x[3, ]) / 3)\nC = do.call(rbind, C_list)\n\nA = vapply(T_all, function(x) {\n abs(x[1, 1] * (x[2, 2] - x[3, 2]) +\n x[2, 1] * (x[3, 2] - x[1, 2]) +\n x[3, 1] * (x[1, 2] - x[2, 2]) ) / 2\n }, FUN.VALUE = double(1))\nsource(\"code/10-centroid-alg.R\")\n#> [1] \"The area is: 245\"\n#> [1] \"The coordinates of the centroid are: 8.83, 9.22\""},{"path":"algorithms.html","id":"functions","chapter":"10 Scripts, algorithms and
functions","heading":"10.4 Functions","text":"Like algorithms, functions take an input and return an output.\nFunctions, however, refer to the implementation in a particular programming language, rather than the ‘recipe’ itself.\nIn R, functions are objects in their own right, which can be created and joined together in a modular fashion.\nWe can, for example, create a function that undertakes step 2 of our centroid generation algorithm as follows:The example demonstrates two key components of functions:\n1) the function body, the code inside the curly brackets that defines what the function does with its inputs; and 2) the formals, the list of arguments the function works with — x in this case (the third key component, the environment, is beyond the scope of this section).\nBy default, functions return the last object calculated (the coordinates of the centroid in the case of t_centroid()).63The function now works on any inputs you pass it, as illustrated in the command below which calculates the centroid of the 1st triangle from the example polygon in the previous section (see Figure 10.3):We can also create a function to calculate a triangle’s area, which we name t_area():Note that after the function’s creation, a triangle’s area can be calculated in a single line of code, avoiding the duplication of verbose code:\nfunctions are a mechanism for generalizing code.\nThe newly created function t_area() takes any object x, assumed to have the same dimensions as the ‘triangle matrix’ data structure we’ve been using, and returns its area, as illustrated on T1 as follows:We can test the generalizability of the function by using it to find the area of a new triangle matrix, which has a height of 1 and a base of 3:A useful feature of functions is that they are modular.\nProvided that you know what the output will be, one function can be used as the building block of another.\nThus, the functions t_centroid() and t_area() can be used as sub-components of a larger function to do the work of the script 10-centroid-alg.R: calculate the area of any convex polygon.\nThe code chunk below creates the function poly_centroid() to mimic the behavior of sf::st_centroid() for convex polygons:64Functions such as poly_centroid() can further be extended to provide different types of output.\nTo return the result as an object of class sfg, for example, a ‘wrapper’ function can be used to modify the output of poly_centroid() before returning the result:We can verify that the output is the same as the output from sf::st_centroid() as follows:","code":"\nt_centroid = function(x) {\n (x[1, ] + x[2, ] + x[3, ]) / 3\n}\nt_centroid(T1)\n#> x_coords y_coords \n#> 3.33 3.33\nt_area = function(x) {\n
abs(\n x[1, 1] * (x[2, 2] - x[3, 2]) +\n x[2, 1] * (x[3, 2] - x[1, 2]) +\n x[3, 1] * (x[1, 2] - x[2, 2])\n ) / 2\n}\nt_area(T1)\n#> [1] 50\nt_new = cbind(x = c(0, 3, 3, 0),\n y = c(0, 0, 1, 0))\nt_area(t_new)\n#> x \n#> 1.5\npoly_centroid = function(poly_mat) {\n Origin = poly_mat[1, ] # create a point representing the origin\n i = 2:(nrow(poly_mat) - 2)\n T_all = lapply(i, function(x) {\n rbind(Origin, poly_mat[x:(x + 1), ], Origin)\n })\n C_list = lapply(T_all, t_centroid)\n C = do.call(rbind, C_list)\n A = vapply(T_all, t_area, FUN.VALUE = double(1))\n c(weighted.mean(C[, 1], A), weighted.mean(C[, 2], A))\n}\npoly_centroid(poly_mat)\n#> [1] 8.83 9.22\npoly_centroid_sfg = function(x) {\n centroid_coords = poly_centroid(x)\n sf::st_point(centroid_coords)\n}\npoly_sfc = sf::st_polygon(list(poly_mat))\nidentical(poly_centroid_sfg(poly_mat), sf::st_centroid(poly_sfc))\n#> [1] TRUE"},{"path":"algorithms.html","id":"programming","chapter":"10 Scripts, algorithms and functions","heading":"10.5 Programming","text":"In this chapter we have moved quickly, from scripts to functions via the tricky topic of algorithms.\nNot only have we discussed them in the abstract, but we have also created working examples of each to solve a specific problem:The script 10-centroid-alg.R was introduced and demonstrated on a ‘polygon matrix’The individual steps that allowed this script to work were described as an algorithm, a computational recipeTo generalize the algorithm it was converted into modular functions which were eventually combined to create the function poly_centroid() in the previous sectionTaken on its own, each of these steps is straightforward.\nBut the skill of programming is combining scripts, algorithms and functions in a way that produces performant, robust and user-friendly tools that other people can use.\nIf you are new to programming, as we expect most people reading this book will be, being able to follow and reproduce the results in the preceding sections should be seen as a major achievement.\nProgramming takes many hours of dedicated study and practice before you become proficient.The challenge facing developers aiming to implement new algorithms in an efficient way is put in perspective by considering that we have only created a toy function.\nIn its current state, poly_centroid() fails on most (non-convex) polygons!\nA question arising from this is: how would one
generalize the function?\nTwo options are (1) finding ways to triangulate non-convex polygons (a topic covered in the online algorithm article that supports this chapter) and (2) exploring other centroid algorithms that do not rely on triangular meshes.A wider question is: is it worth programming a solution at all when high performance algorithms have already been implemented and packaged in functions such as st_centroid()?\nThe reductionist answer in this specific case is ‘no.’\nIn the wider context, and considering the benefits of learning to program, the answer is ‘it depends.’\nWith programming, it’s easy to waste hours trying to implement a method, only to find that someone has already done the hard work.\nSo instead of seeing this chapter as your first stepping stone towards geometric algorithm programming wizardry, it may be more productive to use it as a lesson in when to try to program a generalized solution, and when to use existing higher-level solutions.\nThere will surely be occasions when writing new functions is the best way forward, but there will also be times when using functions that already exist is the best way forward.We cannot guarantee that, having read this chapter, you will be able to rapidly create new functions for your work.\nBut we are confident that its contents will help you decide when is an appropriate time to try (when no existing functions solve the problem, when the programming task is within your capabilities and when the benefits of the solution are likely to outweigh the time costs of developing it).\nFirst steps towards programming can be slow (the exercises below should not be rushed) but the long-term rewards can be large.","code":""},{"path":"algorithms.html","id":"ex-algorithms","chapter":"10 Scripts, algorithms and functions","heading":"10.6 Exercises","text":"Read the script 10-centroid-alg.R in the code folder of the book’s GitHub repo.\nWhich of the best practices covered in Section 10.2 does it follow?\nCreate a version of the script on your computer in an IDE such as RStudio (preferably by typing-out the script line-by-line, in your own coding style and with your own comments, rather than copy-pasting — this will help you learn how to type scripts).
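The four algorithm steps from Section 10.3 can also be sanity-checked on the square polygon used in Section 10.2, for which the script 10-centroid-alg.R reports an area of 81 and a centroid of 4.5, 4.5. The following is a condensed base-R sketch of those steps (not the book's script itself):

```r
# Steps 1-4 of the centroid algorithm, applied to a 9 x 9 square
poly_mat = cbind(x = c(0, 0, 9, 9, 0), y = c(0, 9, 9, 0, 0))
Origin = poly_mat[1, ]
# step 1: divide the polygon into contiguous triangles
T_all = lapply(2:(nrow(poly_mat) - 2), function(i) {
  rbind(Origin, poly_mat[i:(i + 1), ], Origin)
})
# step 2: find the centroid of each triangle
C = do.call(rbind, lapply(T_all, function(t) (t[1, ] + t[2, ] + t[3, ]) / 3))
# step 3: find the area of each triangle
A = vapply(T_all, function(t) {
  abs(t[1, 1] * (t[2, 2] - t[3, 2]) + t[2, 1] * (t[3, 2] - t[1, 2]) +
      t[3, 1] * (t[1, 2] - t[2, 2])) / 2
}, double(1))
# step 4: total area and area-weighted mean of the triangle centroids
# (should report an area of 81 and a centroid of 4.5, 4.5)
c(area = sum(A), x = weighted.mean(C[, 1], A), y = weighted.mean(C[, 2], A))
```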
Using the example of the square polygon (e.g., created with poly_mat = cbind(x = c(0, 0, 9, 9, 0), y = c(0, 9, 9, 0, 0))) execute the script line-by-line.\nWhat changes could be made to the script to make it more reproducible?\nHow could the documentation be improved?In Section 10.3 we calculated that the area and geographic centroid of the polygon represented by poly_mat were 245 and 8.8, 9.2, respectively.\nReproduce the results on your own computer with reference to the script 10-centroid-alg.R, an implementation of this algorithm (bonus: type out the commands - try to avoid copy-pasting).\nAre the results correct? Verify them by converting poly_mat into an sfc object (named poly_sfc) with st_polygon() (hint: this function takes objects of class list()) and then using st_area() and st_centroid().\nIt was stated that the algorithm created only works for convex hulls. Define convex hulls (see Chapter 5) and test the algorithm on a polygon that is not a convex hull.\nBonus 1: Think about why the method only works for convex hulls and note changes that would need to be made to the algorithm to make it work for other types of polygon.\nBonus 2: Building on the contents of 10-centroid-alg.R, write an algorithm only using base R functions that can find the total length of linestrings represented in matrix form.\nIn Section 10.4 we created different versions of the poly_centroid() function that generated outputs of class sfg (poly_centroid_sfg()) and type-stable matrix outputs (poly_centroid_type_stable()).
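Type stability, as referenced in the exercise above, can be sketched generically without giving away the solution. The helper below is hypothetical (not from the book): it validates the input class on entry and always returns the same output type:

```r
# A generic sketch of a type-stable function: check the input class on entry
# and always return the same output type (here: a length-2 numeric vector).
mean_xy = function(x) {
  stopifnot(is.matrix(x), ncol(x) == 2)  # fail fast on unexpected input
  c(mean(x[, 1]), mean(x[, 2]))          # always a plain numeric of length 2
}
mean_xy(cbind(c(0, 9), c(0, 9)))
#> [1] 4.5 4.5
```

The same pattern (input validation plus a fixed return type) is what the exercise asks you to apply to poly_centroid().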
Further extend the function by creating a version (e.g., called poly_centroid_sf()) that is type stable (only accepts inputs of class sf) and returns sf objects (hint: you may need to convert the object x into a matrix with the command sf::st_coordinates(x)).\nVerify it works by running poly_centroid_sf(sf::st_sf(sf::st_sfc(poly_sfc)))\nWhat error message do you get when you try to run poly_centroid_sf(poly_mat)?","code":""},{"path":"spatial-cv.html","id":"spatial-cv","chapter":"11 Statistical learning","heading":"11 Statistical learning","text":"","code":""},{"path":"spatial-cv.html","id":"prerequisites-9","chapter":"11 Statistical learning","heading":"Prerequisites","text":"This chapter assumes proficiency with geographic data analysis, for example gained by studying the contents and working-through the exercises in Chapters 2 to 6.\nA familiarity with generalized linear models (GLM) and machine learning is highly recommended (for example from A. Zuur et al. 2009; James et al. 2013).The chapter uses the following packages:65Required data will be attached in due course.","code":"\nlibrary(sf)\nlibrary(raster)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(dplyr)\nlibrary(mlr)\nlibrary(parallelMap)"},{"path":"spatial-cv.html","id":"intro-cv1","chapter":"11 Statistical learning","heading":"11.1 Introduction","text":"Statistical learning is concerned with the use of statistical and computational models for identifying patterns in data and predicting from these patterns.\nDue to its origins, statistical learning is one of R’s great strengths (see Section 1.3).66\nStatistical learning combines methods from statistics and machine learning and its methods can be categorized into supervised and unsupervised techniques.\nBoth are increasingly used in disciplines ranging from physics, biology and ecology to geography and economics (James et al.
2013).This chapter focuses on supervised techniques in which there is a training dataset, as opposed to unsupervised techniques such as clustering.\nResponse variables can be binary (such as landslide occurrence), categorical (land use), integer (species richness count) or numeric (soil acidity measured in pH).\nSupervised techniques model the relationship between such responses — which are known for a sample of observations — and one or more predictors.The primary aim of much machine learning research is to make good predictions, as opposed to statistical/Bayesian inference, which is good at helping to understand underlying mechanisms and uncertainties in the data (see Krainski et al. 2018).\nMachine learning thrives in the age of ‘big data’ because its methods make few assumptions about input variables and can handle huge datasets.\nMachine learning is conducive to tasks such as the prediction of future customer behavior, recommendation services (music, movies, what to buy next), face recognition, autonomous driving, text classification and predictive maintenance (infrastructure, industry).This chapter is based on a case study: the (spatial) prediction of landslides.\nThis application links to the applied nature of geocomputation, defined in Chapter 1, and illustrates how machine learning borrows from the field of statistics when the sole aim is prediction.\nTherefore, this chapter first introduces modeling and cross-validation concepts with the help of a Generalized Linear Model (A. Zuur et al.
2009).\nBuilding on this, the chapter implements a typical machine learning algorithm, namely a Support Vector Machine (SVM).\nThe models’ predictive performance will be assessed using spatial cross-validation (CV), which accounts for the fact that geographic data is special.CV determines a model’s ability to generalize to new data, by splitting a dataset (repeatedly) into training and test sets.\nIt uses the training data to fit the model, and checks its performance when predicting against the test data.\nCV helps to detect overfitting since models that predict the training data too closely (noise) will tend to perform poorly on the test data.Randomly splitting spatial data can lead to training points that are neighbors in space with test points.\nDue to spatial autocorrelation, test and training datasets would not be independent in this scenario, with the consequence that CV fails to detect a possible overfitting.\nSpatial CV alleviates this problem and is the central theme in this chapter.","code":""},{"path":"spatial-cv.html","id":"case-landslide","chapter":"11 Statistical learning","heading":"11.2 Case study: Landslide susceptibility","text":"This case study is based on a dataset of landslide locations in Southern Ecuador, illustrated in Figure 11.1 and described in detail in Muenchow, Brenning, and Richter (2012).\nA subset of the dataset used in that paper is provided in the RSAGA package, which can be loaded as follows:This should load three objects: a data.frame named landslides, a list named dem, and an sf object named study_area.\nlandslides contains a factor column lslpts where TRUE corresponds to an observed landslide ‘initiation point,’ with the coordinates stored in columns x and y.67There are 175 landslide points and 1360 non-landslide points, as shown by summary(landslides).\nThe 1360 non-landslide points were sampled randomly from the study area, with the restriction that they must fall outside a small buffer around the landslide polygons.To make the number of landslide and non-landslide points balanced, let us sample 175 from the 1360 non-landslide points.68dem is a digital elevation model consisting of two elements:\ndem$header, a list which represents a raster ‘header’ (see Section 2.3), and dem$data, a matrix with the altitude of each pixel.\ndem can be converted into a raster object with:\nFIGURE 11.1: Landslide initiation points (red) and points unaffected by landsliding (blue) in Southern Ecuador.\nTo model landslide susceptibility, we need
predictors.\nTerrain attributes are frequently associated with landsliding (Muenchow, Brenning, and Richter 2012), and these can be computed from the digital elevation model (dem) using R-GIS bridges (see Chapter 9).\nWe leave it as an exercise to the reader to compute the following terrain attribute rasters and extract the corresponding values to our landslide/non-landslide data frame (see exercises; we also provide the resulting data frame via the spDataLarge package, see below):\nslope: slope angle (°).\ncplan: plan curvature (rad m−1) expressing the convergence or divergence of a slope and thus water flow.\ncprof: profile curvature (rad m−1) as a measure of flow acceleration, also known as downslope change in slope angle.\nelev: elevation (m a.s.l.) as the representation of different altitudinal zones of vegetation and precipitation in the study area.\nlog10_carea: the decadic logarithm of the catchment area (log10 m2) representing the amount of water flowing towards a location.\nData containing the landslide points, with the corresponding terrain attributes, is provided in the spDataLarge package, along with the terrain attribute raster stack from which the values were extracted.\nHence, if you have not computed the predictors yourself, attach the corresponding data before running the code of the remaining chapter:\nThe first three rows of lsl, rounded to two significant digits, can be found in Table 11.1.\nTABLE 11.1: Structure of the lsl dataset.\n","code":"\ndata(\"landslides\", package = \"RSAGA\")\n# select non-landslide points\nnon_pts = filter(landslides, lslpts == FALSE)\n# select landslide points\nlsl_pts = filter(landslides, lslpts == TRUE)\n# randomly select 175 non-landslide points\nset.seed(11042018)\nnon_pts_sub = sample_n(non_pts, size = nrow(lsl_pts))\n# create smaller landslide dataset (lsl)\nlsl = bind_rows(non_pts_sub, lsl_pts)\ndem = raster(\n dem$data, \n crs = dem$header$proj4string,\n xmn = dem$header$xllcorner, \n xmx = dem$header$xllcorner + dem$header$ncols * dem$header$cellsize,\n ymn = dem$header$yllcorner,\n ymx = dem$header$yllcorner + dem$header$nrows * dem$header$cellsize\n )\n# attach landslide points with terrain attributes\ndata(\"lsl\", package = \"spDataLarge\")\n# attach terrain attribute raster stack\ndata(\"ta\", package 
= \"spDataLarge\")"},{"path":"spatial-cv.html","id":"conventional-model","chapter":"11 Statistical learning","heading":"11.3 Conventional modeling approach in R","text":"Before introducing the mlr package, an umbrella-package providing a unified interface to dozens of learning algorithms (Section 11.5), it is worth taking a look at the conventional modeling interface in R.\nThis introduction to supervised statistical learning provides the basis for doing spatial CV, and contributes to a better grasp of the mlr approach presented subsequently.\nSupervised learning involves predicting a response variable as a function of predictors (Section 11.4).\nIn R, modeling functions are usually specified using formulas (see ?formula and the detailed Formulas in R Tutorial for details about R formulas).\nThe following command specifies and runs a generalized linear model:\nIt is worth understanding each of the three input arguments:\nA formula, which specifies landslide occurrence (lslpts) as a function of its predictors\nA family, which specifies the type of model, in this case binomial because the response is binary (see ?family)\nThe data frame which contains the response and the predictors\nThe results of the model can be printed as follows (summary(fit) provides a more detailed account of the results):\nThe model object fit, of class glm, contains the coefficients defining the fitted relationship between response and predictors.\nIt can also be used for prediction.\nThis is done with the generic predict() method, which in this case calls the function predict.glm().\nSetting type to response returns the predicted probabilities (of landslide occurrence) for each observation in lsl, as illustrated below (see ?predict.glm):\nSpatial predictions can be made by applying the coefficients to the predictor rasters.\nThis can be done manually or with raster::predict().\nIn addition to a model object (fit), this function also expects a raster stack with the predictors named as in the model’s input data frame (Figure 11.2).\nFIGURE 11.2: Spatial prediction of landslide susceptibility using a GLM.\nHere, when making predictions we neglect spatial autocorrelation since we assume that on average the predictive accuracy remains the same with or without spatial autocorrelation structures.\nHowever, it is possible to include spatial autocorrelation structures into models (A. Zuur et al. 2009; Blangiardo and Cameletti 2015; A. F. Zuur et al. 
2017) as well as into predictions (kriging approaches, see, e.g., Goovaerts 1997; Hengl 2007; R. Bivand, Pebesma, and Gómez-Rubio 2013).\nThis is, however, beyond the scope of this book.\nSpatial prediction maps are one very important outcome of a model.\nEven more important is how good the underlying model is at making them since a prediction map is useless if the model’s predictive performance is bad.\nOne of the most popular measures to assess the predictive performance of a binomial model is the Area Under the Receiver Operator Characteristic Curve (AUROC).\nThis is a value between 0.5 and 1.0, with 0.5 indicating a model that is no better than random and 1.0 indicating perfect prediction of the two classes.\nThus, the higher the AUROC, the better the model’s predictive power.\nThe following code chunk computes the AUROC value of the model with roc(), which takes the response and the predicted values as inputs.\nauc() returns the area under the curve.\nAn AUROC value of 0.83 represents a good fit.\nHowever, this is an overoptimistic estimation since it was computed on the complete dataset.\nTo derive a bias-reduced assessment, we have to use cross-validation, and in the case of spatial data we should make use of spatial CV.","code":"\nfit = glm(lslpts ~ slope + cplan + cprof + elev + log10_carea,\n family = binomial(),\n data = lsl)\nclass(fit)\n#> [1] \"glm\" \"lm\"\nfit\n#> \n#> Call: glm(formula = lslpts ~ slope + cplan + cprof + elev + log10_carea, \n#> family = binomial(), data = lsl)\n#> \n#> Coefficients:\n#> (Intercept) slope cplan cprof elev log10_carea \n#> 1.97e+00 9.30e-02 -2.57e+01 -1.43e+01 2.41e-05 -2.12e+00 \n#> \n#> Degrees of Freedom: 349 Total (i.e. Null); 344 Residual\n#> Null Deviance: 485 \n#> Residual Deviance: 361 AIC: 373\npred_glm = predict(object = fit, type = \"response\")\nhead(pred_glm)\n#> 1 2 3 4 5 6 \n#> 0.3327 0.4755 0.0995 0.1480 0.3486 0.6766\n# making the prediction\npred = raster::predict(ta, model = fit, type = \"response\")\npROC::auc(pROC::roc(lsl$lslpts, fitted(fit)))\n#> Area under the curve: 0.826"},{"path":"spatial-cv.html","id":"intro-cv","chapter":"11 Statistical learning","heading":"11.4 Introduction to (spatial) cross-validation","text":"Cross-validation belongs to the family of resampling methods (James et al. 
2013).\nThe basic idea is to split (repeatedly) a dataset into training and test sets, whereby the training data is used to fit a model which is then applied to the test set.\nComparing the predicted values with the known response values from the test set (using a performance measure such as the AUROC in the binomial case) gives a bias-reduced assessment of the model’s capability to generalize the learned relationship to independent data.\nFor example, a 100-repeated 5-fold cross-validation means to randomly split the data into five partitions (folds) with each fold being used once as a test set (see upper row of Figure 11.3).\nThis guarantees that each observation is used exactly once in one of the test sets, and requires the fitting of five models.\nSubsequently, this procedure is repeated 100 times.\nOf course, the data splitting will differ in each repetition.\nOverall, this sums up to 500 models, whereas the mean performance measure (AUROC) of all models is the model’s overall predictive power.\nHowever, geographic data is special.\nAs we will see in Chapter 12, the ‘first law’ of geography states that points close to each other are, generally, more similar than points further away (Miller 2004).\nThis means these points are not statistically independent because training and test points in conventional CV are often too close to each other (see first row of Figure 11.3).\n‘Training’ observations near the ‘test’ observations can provide a kind of ‘sneak preview’:\ninformation that should be unavailable to the training dataset.\nTo alleviate this problem, ‘spatial partitioning’ is used to split the observations into spatially disjointed subsets (using the observations’ coordinates in a k-means clustering; Brenning (2012b); second row of Figure 11.3).\nThis partitioning strategy is the only difference between spatial and conventional CV.\nAs a result, spatial CV leads to a bias-reduced assessment of a model’s predictive performance, and hence helps to avoid overfitting.\nFIGURE 11.3: Spatial visualization of selected test and training observations for cross-validation of one repetition. 
Random (upper row) and spatial partitioning (lower row).\n","code":""},{"path":"spatial-cv.html","id":"spatial-cv-with-mlr","chapter":"11 Statistical learning","heading":"11.5 Spatial CV with mlr","text":"There are dozens of packages for statistical learning, as described for example in the CRAN machine learning task view.\nGetting acquainted with each of these packages, including how to undertake cross-validation and hyperparameter tuning, can be a time-consuming process.\nComparing model results from different packages can be even more laborious.\nThe mlr package was developed to address these issues.\nIt acts as a ‘meta-package,’ providing a unified interface to popular supervised and unsupervised statistical learning techniques including classification, regression, survival analysis and clustering (Bischl et al. 2016).\nThe standardized mlr interface is based on eight ‘building blocks.’\nAs illustrated in Figure 11.4, these have a clear order.\nFIGURE 11.4: Basic building blocks of the mlr package. Source: http://bit.ly/2tcb2b7. (Permission to reuse this figure was kindly granted.)\nThe mlr modeling process consists of three main stages.\nFirst, a task specifies the data (including response and predictor variables) and the model type (such as regression or classification).\nSecond, a learner defines the specific learning algorithm that is applied to the created task.\nThird, the resampling approach assesses the predictive performance of the model, i.e., its ability to generalize to new data (see also Section 11.4).","code":""},{"path":"spatial-cv.html","id":"glm","chapter":"11 Statistical learning","heading":"11.5.1 Generalized linear model","text":"To implement a GLM in mlr, we must create a task containing the landslide data.\nSince the response is binary (a two-category variable), we create a classification task with makeClassifTask() (for regression tasks, use makeRegrTask(), see ?makeRegrTask for other task types).\nThe first essential argument of these make*() functions is data.\nThe target argument expects the name of the response variable, and positive determines which of the two factor levels of the response variable indicates the landslide initiation point (in our case TRUE).\nAll other variables of the lsl dataset serve as predictors except for the coordinates (see the result of getTaskFormula(task) for the model formula).\nFor spatial CV, the coordinates 
parameter is used (see Section 11.4 and Figure 11.3), which expects the coordinates as an xy data frame.\nmakeLearner() determines the statistical learning method to use.\nAll classification learners start with classif. and regression learners with regr. (see ?makeLearners for details).\nlistLearners() helps to find out which learners are available, and from which package mlr imports them (Table 11.2).\nFor a specific task, we can run:\nTABLE 11.2: Sample of available learners for binomial tasks in the mlr package.\nThis yields all learners able to model two-class problems (landslide yes or no).\nWe opt for the binomial classification method used in Section 11.3 and implemented as classif.binomial in mlr.\nAdditionally, we must specify the link-function, logit in our case, which is also the default of the binomial() function.\npredict.type determines the type of the prediction, with prob resulting in the predicted probability for landslide occurrence between 0 and 1 (this corresponds to type = response in predict.glm).\nTo find out from which package the specified learner is taken, and how to access the corresponding help pages, we can run:\nThe set-up steps for modeling with mlr may seem tedious.\nBut remember, this single interface provides access to the 150+ learners shown by listLearners(); it would be far more tedious to learn the interface of each learner!\nFurther advantages are the simple parallelization of resampling techniques and the ability to tune machine learning hyperparameters (see Section 11.5.2).\nMost importantly, (spatial) resampling in mlr is straightforward, requiring only two more steps: specifying a resampling method and running it.\nWe will use a 100-repeated 5-fold spatial CV: five partitions will be chosen based on the coordinates provided in the task, and the partitioning will be repeated 100 times:69\nTo execute the spatial resampling, we run resample() using the specified learner, task, resampling strategy and, of course, the performance measure, here the AUROC.\nThis takes some time (around 10 seconds on a modern laptop) because it computes the AUROC for 500 models.\nSetting a seed ensures the reproducibility of the obtained result and will ensure the same spatial partitioning when re-running the code.\nThe output of the preceding code chunk is a bias-reduced assessment of the model’s predictive performance, as illustrated in the following code chunk (required input data is saved in the file spatialcv.Rdata in the book’s GitHub repo):\nTo put these results in perspective, let us compare them with AUROC values from a 100-repeated 5-fold non-spatial cross-validation (Figure 11.5; the 
code for the non-spatial cross-validation is not shown here but will be explored in the exercise section).\nAs expected, the spatially cross-validated result yields lower AUROC values on average than the conventional cross-validation approach, underlining the over-optimistic predictive performance of the latter due to spatial autocorrelation.\nFIGURE 11.5: Boxplot showing the difference in AUROC values between spatial and conventional 100-repeated 5-fold cross-validation.\n","code":"\nlibrary(mlr)\n# coordinates needed for the spatial partitioning\ncoords = lsl[, c(\"x\", \"y\")]\n# select response and predictors to use in the modeling\ndata = dplyr::select(lsl, -x, -y)\n# create task\ntask = makeClassifTask(data = data, target = \"lslpts\",\n positive = \"TRUE\", coordinates = coords)\nlistLearners(task, warn.missing.packages = FALSE) %>%\n dplyr::select(class, name, short.name, package) %>%\n head()\nlrn = makeLearner(cl = \"classif.binomial\",\n link = \"logit\",\n predict.type = \"prob\",\n fix.factors.prediction = TRUE)\ngetLearnerPackages(lrn)\nhelpLearner(lrn)\nperf_level = makeResampleDesc(method = \"SpRepCV\", folds = 5, reps = 100)\nset.seed(012348)\nsp_cv = mlr::resample(learner = lrn, task = task,\n resampling = perf_level, \n measures = mlr::auc)\n# summary statistics of the 500 models\nsummary(sp_cv$measures.test$auc)\n#> Min. 1st Qu. Median Mean 3rd Qu. Max. 
\n#> 0.686 0.757 0.789 0.780 0.795 0.861\n# mean AUROC of the 500 models\nmean(sp_cv$measures.test$auc)\n#> [1] 0.78"},{"path":"spatial-cv.html","id":"svm","chapter":"11 Statistical learning","heading":"11.5.2 Spatial tuning of machine-learning hyperparameters","text":"Section 11.4 introduced machine learning as part of statistical learning.\nTo recap, we adhere to the following definition of machine learning by Jason Brownlee:\nMachine learning, more specifically the field of predictive modeling, is primarily concerned with minimizing the error of a model or making the most accurate predictions possible, at the expense of explainability.\nIn applied machine learning we will borrow, reuse and steal algorithms from many different fields, including statistics, and use them towards these ends.\nIn Section 11.5.1 a GLM was used to predict landslide susceptibility.\nThis section introduces support vector machines (SVM) for the same purpose.\nRandom forest models might be more popular than SVMs; however, the positive effect of tuning hyperparameters on model performance is much more pronounced in the case of SVMs (Probst, Wright, and Boulesteix 2018).\nSince (spatial) hyperparameter tuning is the major aim of this section, we will use an SVM.\nFor those wishing to apply a random forest model, we recommend reading this chapter, and then proceeding to Chapter 14 in which the concepts and techniques covered here are applied to make spatial predictions based on a random forest model.\nSVMs search for the best possible ‘hyperplanes’ to separate classes (in a classification case) and estimate ‘kernels’ with specific hyperparameters that allow for non-linear boundaries between classes (James et al. 
2013).\nHyperparameters should not be confused with the coefficients of parametric models, which are sometimes also referred to as parameters.70\nCoefficients can be estimated from the data, while hyperparameters are set before the learning begins.\nOptimal hyperparameters are usually determined within a defined range with the help of cross-validation methods.\nThis is called hyperparameter tuning.\nSome SVM implementations, such as that provided by kernlab, allow hyperparameters to be tuned automatically, usually based on random sampling (see upper row of Figure 11.3).\nThis works for non-spatial data but is of less use for spatial data, where ‘spatial tuning’ should be undertaken.\nBefore defining spatial tuning, we will set up the mlr building blocks, introduced in Section 11.5.1, for the SVM.\nThe classification task remains the same, hence we can simply reuse the task object created in Section 11.5.1.\nLearners implementing SVM can be found using listLearners() as follows:\nOf the options illustrated above, we will use ksvm() from the kernlab package (Karatzoglou et al. 2004).\nTo allow for non-linear relationships, we use the popular radial basis function (or Gaussian) kernel, which is also the default of ksvm().\nThe next stage is to specify a resampling strategy.\nAgain we will use a 100-repeated 5-fold spatial CV.\nNote that this is the exact same code as used for the GLM in Section 11.5.1; it is simply repeated here as a reminder.\nSo far, the process has been identical to that described in Section 11.5.1.\nThe next step is new, however: tuning the hyperparameters.\nUsing the same data for the performance assessment and the tuning would potentially lead to overoptimistic results (Cawley and Talbot 2010).\nThis can be avoided using nested spatial CV.\nFIGURE 11.6: Schematic of hyperparameter tuning and performance estimation levels in CV. (Figure was taken from Schratz et al. (2018). Permission to reuse it was kindly granted.)\nThis means that we split each fold again into five spatially disjoint subfolds, which are used to determine the optimal hyperparameters (tune_level object in the code chunk below; see Figure 11.6 for a visual representation).\nTo find the optimal hyperparameter combination, we fit 50 models (ctrl object in the code chunk below) in each of these subfolds with randomly selected values for the hyperparameters C and Sigma.\nThe random selection of values for C and Sigma is additionally restricted to a predefined tuning space (ps object).\nThe range of the tuning space was chosen with values recommended in the literature (Schratz et al. 
2018).\nThe next stage is to modify the learner lrn_ksvm in accordance with all the characteristics defining the hyperparameter tuning with makeTuneWrapper().\nmlr is now set up to fit 250 models to determine the optimal hyperparameters for one fold.\nRepeating this for each fold, we end up with 1250 (250 * 5) models for each repetition.\nRepeated 100 times, this means fitting a total of 125,000 models to identify the optimal hyperparameters (Figure 11.3).\nThese are then used in the performance estimation, which requires the fitting of another 500 models (5 folds * 100 repetitions; see Figure 11.3).\nTo make the performance estimation processing chain even clearer, let us write down the commands we have given to the computer:\nPerformance level (upper left part of Figure 11.6): split the dataset into five spatially disjoint (outer) subfolds.\nTuning level (lower left part of Figure 11.6): use the first fold of the performance level and split it again spatially into five (inner) subfolds for the hyperparameter tuning.\nUse the 50 randomly selected hyperparameters in each of these inner subfolds, i.e., fit 250 models.\nPerformance estimation: Use the best hyperparameter combination from the previous step (tuning level), and apply it to the first outer fold in the performance level to estimate the performance (AUROC).\nRepeat steps 2 and 3 for the remaining four outer folds.\nRepeat steps 2 to 4, 100 times.\nThe process of hyperparameter tuning and performance estimation is computationally intensive.\nModel runtime can be reduced with parallelization, which can be done in a number of ways, depending on the operating system.\nBefore starting the parallelization, we ensure that the processing continues even if one of the models throws an error by setting on.learner.error to warn.\nThis avoids the process stopping just because of one failed model, which is desirable on large model runs.\nTo inspect the failed models once the processing is completed, we dump them:\nTo start the parallelization, we set the mode to multicore, which will use mclapply() in the background on a single machine in the case of a Unix-based operating system.71\nEquivalently, parallelStartSocket() enables parallelization under Windows.\nlevel defines the level at which to enable parallelization, with mlr.tuneParams determining that the hyperparameter tuning level should be parallelized (see lower left part of Figure 11.6, ?parallelGetRegisteredLevels, and the mlr parallelization tutorial for details).\nWe will use half of the available cores (set with the cpus parameter), a setting that allows possible other 
users to work on the same high performance computing cluster in case one is used (which was the case when we ran the code).\nSetting mc.set.seed to TRUE ensures that the randomly chosen hyperparameters during the tuning can be reproduced when running the code again.\nUnfortunately, mc.set.seed is only available under Unix-based systems.\nNow we are set up for computing the nested spatial CV.\nUsing a seed allows us to recreate the exact same spatial partitions when re-running the code.\nSpecifying the resample() parameters follows the exact same procedure as presented when using a GLM, the only difference being the extract argument.\nThis allows the extraction of the hyperparameter tuning results, which is important if we plan follow-up analyses on the tuning.\nAfter the processing, it is good practice to explicitly stop the parallelization with parallelStop().\nFinally, we save the output object (result) to disk in case we would like to use it in another R session.\nBefore running the subsequent code, be aware that it is time-consuming:\nthe 125,500 models took ~1/2hr on a server using 24 cores (see below).\nIn case you do not want to run the code locally, we have saved a subset of the results in the book’s GitHub repo.\nThey can be loaded as follows:\nNote that runtime depends on many aspects: CPU speed, the selected algorithm, the selected number of cores and the dataset.\nEven more important than the runtime is the final aggregated AUROC: the model’s ability to discriminate the two classes.\nIt appears that the GLM (aggregated AUROC 0.78) is slightly better than the SVM in this specific case.\nHowever, using more than 50 iterations in the random search would probably yield hyperparameters that result in models with a better AUROC (Schratz et al. 
2018).\nhand, increasing number random search iterations also increase total number models thus runtime.estimated optimal hyperparameters fold performance estimation level can also viewed.\nfollowing command shows best hyperparameter combination first fold first iteration (recall results first 5 * 50 model runs):estimated hyperparameters used first fold first iteration performance estimation level resulted following AUROC value:far spatial CV used assess ability learning algorithms generalize unseen data.\nspatial predictions, one tune hyperparameters complete dataset.\ncovered Chapter 14.","code":"\nlrns = listLearners(task, warn.missing.packages = FALSE)\nfilter(lrns, grepl(\"svm\", class)) %>% \n dplyr::select(class, name, short.name, package)\n#> class name short.name package\n#> 6 classif.ksvm Support Vector Machines ksvm kernlab\n#> 9 classif.lssvm Least Squares Support Vector Machine lssvm kernlab\n#> 17 classif.svm Support Vector Machines (libsvm) svm e1071\nlrn_ksvm = makeLearner(\"classif.ksvm\",\n predict.type = \"prob\",\n kernel = \"rbfdot\")\n# performance estimation level\nperf_level = makeResampleDesc(method = \"SpRepCV\", folds = 5, reps = 100)\n# five spatially disjoint partitions\ntune_level = makeResampleDesc(\"SpCV\", iters = 5)\n# use 50 randomly selected hyperparameters\nctrl = makeTuneControlRandom(maxit = 50)\n# define the outer limits of the randomly selected hyperparameters\nps = makeParamSet(\n makeNumericParam(\"C\", lower = -12, upper = 15, trafo = function(x) 2^x),\n makeNumericParam(\"sigma\", lower = -15, upper = 6, trafo = function(x) 2^x)\n )\nwrapped_lrn_ksvm = makeTuneWrapper(learner = lrn_ksvm, \n resampling = tune_level,\n par.set = ps,\n control = ctrl, \n show.info = TRUE,\n measures = mlr::auc)\nconfigureMlr(on.learner.error = \"warn\", on.error.dump = TRUE)\nlibrary(parallelMap)\nif (Sys.info()[\"sysname\"] %in% c(\"Linux\", \"Darwin\")) {\nparallelStart(mode = \"multicore\", \n # parallelize the hyperparameter tuning 
level\n level = \"mlr.tuneParams\", \n # just use half of the available cores\n cpus = round(parallel::detectCores() / 2),\n mc.set.seed = TRUE)\n}\n\nif (Sys.info()[\"sysname\"] == \"Windows\") {\n parallelStartSocket(level = \"mlr.tuneParams\",\n cpus = round(parallel::detectCores() / 2))\n}\nset.seed(12345)\nresult = mlr::resample(learner = wrapped_lrn_ksvm, \n task = task,\n resampling = perf_level,\n extract = getTuneResult,\n measures = mlr::auc)\n# stop parallelization\nparallelStop()\n# save your result, e.g.:\n# saveRDS(result, \"svm_sp_sp_rbf_50it.rds\")\nresult = readRDS(\"extdata/spatial_cv_result.rds\")\n# Exploring the results\n# runtime in minutes\nround(result$runtime / 60, 2)\n#> [1] 37.4\n# final aggregated AUROC \nresult$aggr\n#> auc.test.mean \n#> 0.758\n# same as\nmean(result$measures.test$auc)\n#> [1] 0.758\n# winning hyperparameters of tuning step, \n# i.e. the best combination out of 50 * 5 models\nresult$extract[[1]]$x\n#> $C\n#> [1] 0.458\n#> \n#> $sigma\n#> [1] 0.023\nresult$measures.test[1, ]\n#> iter auc\n#> 1 1 0.799"},{"path":"spatial-cv.html","id":"conclusions","chapter":"11 Statistical learning","heading":"11.6 Conclusions","text":"Resampling methods are an important part of a data scientist’s toolbox (James et al. 2013).\nThis chapter used cross-validation to assess the predictive performance of various models.\nAs described in Section 11.4, observations with spatial coordinates may not be statistically independent due to spatial autocorrelation, violating a fundamental assumption of cross-validation.\nSpatial CV addresses this issue by reducing the bias introduced by spatial autocorrelation.\nThe mlr package facilitates (spatial) resampling techniques in combination with the most popular statistical learning techniques, including linear regression, semi-parametric models such as generalized additive models, and machine learning techniques such as random forests, SVMs, and boosted regression trees (Bischl et al. 2016; Schratz et al. 
2018).\nMachine learning algorithms often require hyperparameter inputs, the optimal ‘tuning’ of which can require thousands of model runs and large computational resources, consuming much time, RAM and/or cores.\nmlr tackles this issue by enabling parallelization.\nMachine learning overall, and its use to understand spatial data, is a large field and this chapter has provided the basics, but there is more to learn.\nWe recommend the following resources in this direction:\nThe mlr tutorials on Machine Learning in R and Handling of spatial Data\nAn academic paper on hyperparameter tuning (Schratz et al. 2018)\nIn the case of spatio-temporal data, one should account for spatial and temporal autocorrelation when doing CV (Meyer et al. 2018)","code":""},{"path":"spatial-cv.html","id":"exercises-8","chapter":"11 Statistical learning","heading":"11.7 Exercises","text":"Compute the following terrain attributes from the dem datasets loaded with data(\"landslides\", package = \"RSAGA\") with the help of R-GIS bridges (see Chapter 9):\nSlope\nPlan curvature\nProfile curvature\nCatchment area\nExtract the values from the corresponding output rasters to the landslides data frame (data(landslides, package = \"RSAGA\")) by adding new variables called slope, cplan, cprof, elev and log_carea. Keep all landslide initiation points and the 175 randomly selected non-landslide points (see Section 11.2 for details).\nUse the derived terrain attribute rasters in combination with a GLM to make a spatial prediction map similar to that shown in Figure 11.2.\nRunning data(\"study_mask\", package = \"spDataLarge\") attaches a mask of the study area.\nCompute a 100-repeated 5-fold non-spatial cross-validation and spatial CV based on the GLM learner and compare the AUROC values from both resampling strategies with the help of boxplots (see Figure 11.5).\nHint: You need to specify a non-spatial task and a non-spatial resampling strategy.\nModel landslide susceptibility using quadratic discriminant analysis (QDA, James et al. 
2013).\nAssess the predictive performance (AUROC) of the QDA.\nWhat is the difference between the spatially cross-validated mean AUROC values of the QDA and the GLM?\nHint: Before running the spatial cross-validation for both learners, set a seed to make sure that both use the same spatial partitions, which in turn guarantees comparability.\nRun the SVM without tuning the hyperparameters.\nUse the rbfdot kernel with \\(\\sigma\\) = 1 and C = 1.\nLeaving the hyperparameters unspecified in kernlab’s ksvm() would otherwise initialize an automatic non-spatial hyperparameter tuning.\nFor a discussion on the need for (spatial) tuning of hyperparameters, please refer to Schratz et al. (2018).","code":""},{"path":"transport.html","id":"transport","chapter":"12 Transportation","heading":"12 Transportation","text":"","code":""},{"path":"transport.html","id":"prerequisites-10","chapter":"12 Transportation","heading":"Prerequisites","text":"This chapter uses the following packages:72","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(spDataLarge)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(stplanr) # geographic transport data package\nlibrary(tmap) # visualization package (see Chapter 8)"},{"path":"transport.html","id":"introduction-7","chapter":"12 Transportation","heading":"12.1 Introduction","text":"Few sectors of geographic space are more tangible than transport.\nThe effort of moving (overcoming distance) is central to the ‘first law’ of geography, defined by Waldo Tobler in 1970 as follows (Miller 2004):\nEverything is related to everything else, but near things are more related than distant things.\nThis ‘law’ is the basis for spatial autocorrelation and other key geographic concepts.\nIt applies to phenomena as diverse as friendship networks and ecological diversity and can be explained by the costs of transport — in terms of time, energy and money — which constitute the ‘friction of distance.’\nFrom this perspective, transport technologies are disruptive, changing geographic relationships between geographic entities including mobile humans and goods: “the purpose of transportation is to overcome space” (Rodrigue, Comtois, and Slack 2013).\nTransport is an inherently geospatial activity.\nIt involves traversing continuous geographic space between A and B, and the infinite localities in between.\nIt is therefore unsurprising that transport researchers have long turned to 
geocomputational methods to understand movement patterns, and that transport problems are a motivator of geocomputational methods.\nThis chapter introduces the geographic analysis of transport systems at different geographic levels, including:\nAreal units: transport patterns can be understood with reference to zonal aggregates such as the main mode of travel (by car, bike or foot, for example) and the average distance of trips made by people living in a particular zone, covered in Section 12.3.\nDesire lines: straight lines that represent ‘origin-destination’ data that records how many people travel (or could travel) between places (points or zones) in geographic space, the topic of Section 12.4.\nRoutes: these are lines representing a path along the route network along the desire lines defined in the previous bullet point.\nWe will see how to create them in Section 12.5.\nNodes: these are points in the transport system that can represent common origins and destinations and public transport stations such as bus stops and rail stations, the topic of Section 12.6.\nRoute networks: these represent the system of roads, paths and other linear features in an area and are covered in Section 12.7. They can be represented as geographic features (representing route segments) or structured as an interconnected graph, with the level of traffic on different segments referred to as ‘flow’ by transport modelers (Hollander 2016).\nAnother key level is agents, mobile entities like you and me.\nThese can be represented computationally thanks to software such as MATSim, which captures the dynamics of transport systems using an agent-based modeling (ABM) approach at high spatial and temporal resolution (Horni, Nagel, and Axhausen 2016).\nABM is a powerful approach to transport research with great potential for integration with R’s spatial classes (Thiele 2014; Lovelace and Dumont 2016), but is outside the scope of this chapter.\nBeyond geographic levels and agents, the basic unit of analysis in most transport models is the trip, a single purpose journey from an origin ‘A’ to a destination ‘B’ (Hollander 2016).\nTrips join-up the different levels of transport systems: they are usually represented as desire lines connecting zone centroids (nodes), they can be allocated onto the route network as routes, and are made by people who can be represented as agents.\nTransport systems are dynamic systems, adding additional complexity.\nThe purpose of geographic transport modeling can be interpreted as simplifying this complexity in a way that captures the 
essence of transport problems.\nSelecting an appropriate level of geographic analysis can help simplify this complexity, to capture the essence of a transport system without losing its most important features and variables (Hollander 2016).\nTypically, models are designed to solve a particular problem.\nFor this reason, this chapter is based around a policy scenario, introduced in the next section, that asks:\nhow to increase cycling in the city of Bristol?\nChapter 13 demonstrates another application of geocomputation:\nprioritising the location of new bike shops.\nThere is a link between the chapters because bike shops may benefit from new cycling infrastructure, demonstrating an important feature of transport systems: they are closely linked to broader social, economic and land-use patterns.","code":""},{"path":"transport.html","id":"bris-case","chapter":"12 Transportation","heading":"12.2 A case study of Bristol","text":"The case study used in this chapter is located in Bristol, a city in the west of England, around 30 km east of the Welsh capital Cardiff.\nAn overview of the region’s transport network is illustrated in Figure 12.1, which shows a diversity of transport infrastructure, for cycling, public transport, and private motor vehicles.\nFIGURE 12.1: Bristol’s transport network represented by colored lines for active (green), public (railways, black) and private motor (red) modes of travel. 
Blue border lines represent the inner city boundary and the larger Travel to Work Area (TTWA).\nBristol is the 10th largest city council in England, with a population of half a million people, although its travel catchment area is larger (see Section 12.3).\nIt has a vibrant economy with aerospace, media, financial service and tourism companies, alongside two major universities.\nBristol shows a high average income per capita but also contains areas of severe deprivation (Bristol City Council 2015).\nIn terms of transport, Bristol is well served by rail and road links, and has a relatively high level of active travel.\n19% of its citizens cycle and 88% walk at least once per month according to the Active People Survey (the national average is 15% and 81%, respectively).\n8% of the population said they cycled to work in the 2011 census, compared with only 3% nationwide.\nDespite impressive walking and cycling statistics, the city has a major congestion problem.\nPart of the solution is to continue to increase the proportion of trips made by cycling.\nCycling has a greater potential to replace car trips than walking because of the speed of this mode, around 3-4 times faster than walking (with typical speeds of 15-20 km/h vs 4-6 km/h for walking).\nThere is an ambitious plan to double the share of cycling by 2020.\nIn this policy context, the aim of this chapter, beyond demonstrating how geocomputation with R can be used to support sustainable transport planning, is to provide evidence for decision-makers in Bristol to decide how best to increase the share of walking and cycling in particular in the city.\nThis high-level aim will be met via the following objectives:\nDescribe the geographical pattern of transport behavior in the city\nIdentify key public transport nodes and routes along which cycling to rail stations could be encouraged, as the first stage in multi-model trips\nAnalyze travel ‘desire lines’, to find where many people drive short distances\nIdentify cycle route locations that will encourage less car driving and more cycling\nTo get the wheels rolling on the practical aspects of this chapter, we begin by loading zonal data on travel patterns.\nThese zone-level data are small but often vital for gaining a basic understanding of a settlement’s overall transport system.","code":""},{"path":"transport.html","id":"transport-zones","chapter":"12 Transportation","heading":"12.3 Transport zones","text":"Although transport systems are primarily based on linear features and nodes — 
including pathways stations — often makes sense start areal data, break continuous space tangible units (Hollander 2016).\naddition boundary defining study area (Bristol case), two zone types particular interest transport researchers: origin destination zones.\nOften, geographic units used origins destinations.\nHowever, different zoning systems, ‘Workplace Zones,’ may appropriate represent increased density trip destinations areas many ‘trip attractors’ schools shops (Office National Statistics 2014).simplest way define study area often first matching boundary returned OpenStreetMap, can obtained using osmdata command bristol_region = osmdata::getbb(\"Bristol\", format_out = \"sf_polygon\"). results sf object representing bounds largest matching city region, either rectangular polygon bounding box detailed polygonal boundary.73\nBristol, UK, detailed polygon returned, representing official boundary Bristol (see inner blue boundary Figure 12.1) couple issues approach:first OSM boundary returned OSM may official boundary used local authoritiesEven OSM returns official boundary, may inappropriate transport research bear little relation people travelTravel Work Areas (TTWAs) address issues creating zoning system analogous hydrological watersheds.\nTTWAs first defined contiguous zones within 75% population travels work (Coombes, Green, Openshaw 1986), definition used chapter.\nBristol major employer attracting travel surrounding towns, TTWA substantially larger city bounds (see Figure 12.1).\npolygon representing transport-orientated boundary stored object bristol_ttwa, provided spDataLarge package loaded beginning chapter.origin destination zones used chapter : officially defined zones intermediate geographic resolution (official name Middle layer Super Output Areas MSOAs).\nhouses around 8,000 people.\nadministrative zones can provide vital context transport analysis, type people might benefit particular interventions (e.g., Moreno-Monroy, Lovelace, Ramos 
2017).geographic resolution zones important: small zones high geographic resolution usually preferable high number large regions can consequences processing (especially origin-destination analysis number possibilities increases non-linear function number zones) (Hollander 2016).\nAnother issue small zones related anonymity rules. make impossible infer identity individuals zones, detailed socio-demographic variables often available low geographic resolution. Breakdowns travel mode age sex, example, available Local Authority level UK, much higher Output Area level, contains around 100 households. details, see www.ons.gov.uk/methodology/geography.\n102 zones used chapter stored bristol_zones, illustrated Figure 12.2.\nNote zones get smaller densely populated areas: houses similar number people.\nbristol_zones contains attribute data transport, however, name code zone:add travel data, undertake attribute join, common task described Section 3.2.4.\nuse travel data UK’s 2011 census question travel work, data stored bristol_od, provided ons.gov.uk data portal.\nbristol_od origin-destination (OD) dataset travel work zones UK’s 2011 Census (see Section 12.4).\nfirst column ID zone origin second column zone destination.\nbristol_od rows bristol_zones, representing travel zones rather zones :results previous code chunk shows 10 OD pairs every zone, meaning need aggregate origin-destination data joined bristol_zones, illustrated (origin-destination data described Section 12.4):preceding chunk:grouped data zone origin (contained column o);aggregated variables bristol_od dataset numeric, find total number people living zone mode transport; and74renamed grouping variable o matches ID column geo_code bristol_zones object.resulting object zones_attr data frame rows representing zones ID variable.\ncan verify IDs match zones dataset using %% operator follows:results show 102 zones present new object zone_attr form can joined onto zones.75\ndone using joining function left_join() 
(note inner_join() produce result):\n\nresult zones_joined, contains new columns representing total number trips originating zone study area (almost 1/4 million) mode travel (bicycle, foot, car train).\ngeographic distribution trip origins illustrated left-hand map Figure 12.2.\nshows zones 0 4,000 trips originating study area.\ntrips made people living near center Bristol fewer outskirts.\n? Remember dealing trips within study region:\nlow trip numbers outskirts region can explained fact many people peripheral zones travel regions outside study area.\nTrips outside study region can included regional model special destination ID covering trips go zone represented model (Hollander 2016).\ndata bristol_od, however, simply ignores trips: ‘intra-zonal’ model.way OD datasets can aggregated zone origin, can also aggregated provide information destination zones.\nPeople tend gravitate towards central places.\nexplains spatial distribution represented right panel Figure 12.2 relatively uneven, common destination zones concentrated Bristol city center.\nresult zones_od, contains new column reporting number trip destinations mode, created follows:simplified version Figure 12.2 created code (see 12-zones.R code folder book’s GitHub repo reproduce figure Section 8.2.6 details faceted maps tmap):\nFIGURE 12.2: Number trips (commuters) living working region. 
The left map shows the zone of origin of commute trips; the right map shows the zone of destination (generated by the script 12-zones.R).\n","code":"\nnames(bristol_zones)\n#> [1] \"geo_code\" \"name\" \"geometry\"\nnrow(bristol_od)\n#> [1] 2910\nnrow(bristol_zones)\n#> [1] 102\nzones_attr = bristol_od %>% \n group_by(o) %>% \n summarize_if(is.numeric, sum) %>% \n dplyr::rename(geo_code = o)\nsummary(zones_attr$geo_code %in% bristol_zones$geo_code)\n#> Mode TRUE \n#> logical 102\nzones_joined = left_join(bristol_zones, zones_attr, by = \"geo_code\")\nsum(zones_joined$all)\n#> [1] 238805\nnames(zones_joined)\n#> [1] \"geo_code\" \"name\" \"all\" \"bicycle\" \"foot\" \n#> [6] \"car_driver\" \"train\" \"geometry\"\nzones_od = bristol_od %>% \n group_by(d) %>% \n summarize_if(is.numeric, sum) %>% \n dplyr::select(geo_code = d, all_dest = all) %>% \n inner_join(zones_joined, ., by = \"geo_code\")\nqtm(zones_od, c(\"all\", \"all_dest\")) +\n tm_layout(panel.labels = c(\"Origin\", \"Destination\"))"},{"path":"transport.html","id":"desire-lines","chapter":"12 Transportation","heading":"12.4 Desire lines","text":"Unlike zones, which represent trip origins and destinations, desire lines connect the centroids of the origin and the destination zone, and thereby represent where people desire to go between zones.\nThey represent the quickest ‘bee line’ or ‘crow flies’ route between A and B that would be taken if it were not for obstacles such as buildings and windy roads getting in the way (we will see how to convert desire lines into routes in the next section).We have already loaded data representing desire lines in the dataset bristol_od.\nThis origin-destination (OD) data frame object represents the number of people traveling between the zones represented by o and d, as illustrated in Table 12.1.\nTo arrange the OD data by all trips and then filter out only the top 5, type (please refer to Chapter 3 for a detailed description of non-spatial attribute operations):\nTABLE 12.1: Sample of the top 5 origin-destination pairs in the Bristol OD data frame, representing travel desire lines between zones in the study area.\nThe resulting table provides a snapshot of Bristolian travel patterns in terms of commuting (travel to work).\nIt demonstrates that walking is the most popular mode of transport among the top 5 origin-destination pairs, that 
zone E02003043 is a popular destination (Bristol city center, the destination of all the top 5 OD pairs), and that the intrazonal trips, from one part of zone E02003043 to another (the first row of Table 12.1), constitute the most traveled OD pair in the dataset.\nFrom a policy perspective, however, the raw data presented in Table 12.1 is of limited use:\naside from the fact that it contains only a tiny portion of the 2,910 OD pairs, it tells us little about where policy measures are needed, or what proportion of trips are made by walking and cycling.\nThe following command calculates the percentage of each desire line that is made up by these active modes:\nThere are two main types of OD pair:\ninterzonal and intrazonal.\nInterzonal OD pairs represent travel between zones in which the destination is different from the origin.\nIntrazonal OD pairs represent travel within the same zone (see the top row of Table 12.1).\nThe following code chunk splits bristol_od into these two types:\nThe next step is to convert the interzonal OD pairs into an sf object representing desire lines that can be plotted on a map, with the stplanr function od2line().76An illustration of the results is presented in Figure 12.3, a simplified version of which is created with the following command (see the code in 12-desire.R to reproduce the figure exactly, and Chapter 8 for details on visualization with tmap):\nFIGURE 12.3: Desire lines representing trip patterns in Bristol, with width representing the number of trips and color representing the percentage of trips made by active modes (walking and cycling). The four black lines represent the interzonal OD pairs in Table 12.1.\nThe map shows that the city center dominates transport patterns in the region, suggesting policies should be prioritized there, although a number of peripheral sub-centers can also be seen.\nNext, it would be interesting to look at the distribution of interzonal modes, e.g., between 
which zones cycling is the least common means of transport.","code":"\nod_top5 = bristol_od %>% \n arrange(desc(all)) %>% \n top_n(5, wt = all)\nbristol_od$Active = (bristol_od$bicycle + bristol_od$foot) /\n bristol_od$all * 100\nod_intra = filter(bristol_od, o == d)\nod_inter = filter(bristol_od, o != d)\ndesire_lines = od2line(od_inter, zones_od)\n#> Creating centroids representing desire line start and end points.\nqtm(desire_lines, lines.lwd = \"all\")"},{"path":"transport.html","id":"routes","chapter":"12 Transportation","heading":"12.5 Routes","text":"From a geographer’s perspective, routes are desire lines that are no longer straight:\nthe origin and destination points are the same, but the pathway to get from A to B is more complex.\nDesire lines contain only two vertices (their beginning and end points), but routes can contain hundreds of vertices if they cover a large distance or represent travel patterns on an intricate road network (routes on simple grid-based road networks require relatively few vertices).\nRoutes are generated from desire lines — or, more commonly, origin-destination pairs — using routing services that either run locally or remotely.Local routing can be advantageous in terms of speed of execution and control over the weighting profile for different modes of transport.\nDisadvantages include the difficulty of representing complex networks locally; temporal dynamics (primarily due to traffic); and the need for specialized software such as ‘pgRouting,’ an issue that the developers of packages such as stplanr and dodgr seek to address.Remote routing services, by contrast, use a web API to send queries about origins and destinations and return results generated on a powerful server running specialised software.\nThis gives remote routing services various advantages, including that they usually\nhave global coverage;\nupdate regularly; and\nrun on specialist hardware and software set up for the job.Disadvantages of remote routing services include speed (they rely on data transfer over the internet) and price (the Google routing API, for example, limits the number of free queries).\nThe googleway package provides an interface to Google’s routing API.\nFree (but rate-limited) routing services include OSRM and openrouteservice.org.Instead of routing all the desire lines generated in the previous section, which would be time and memory-consuming, we will focus on 
the desire lines of most policy interest.\nThe benefits of cycling trips are greatest when they replace car trips.\nClearly, not all car trips can realistically be replaced by cycling.\nHowever, a 5 km Euclidean distance (or around 6-8 km of route distance) can realistically be cycled by many people, especially when riding an electric bicycle (‘ebike’).\nWe will therefore only route the desire lines along which a high (300+) number of car trips take place that are up to 5 km in distance.\nThis routing is done in the code chunk below by the stplanr function route(), which creates sf objects representing routes on the transport network, one for each desire line.st_length() determines the length of a linestring, and falls into the distance relations category (see also Section 4.2.7).\nSubsequently, we apply a simple attribute filter operation (see Section 3.2.1) before letting the OSRM service do the routing on a remote server.\nNote that the routing only works with a working internet connection.We could keep the new route_carshort object separate from the straight-line representation of the same trip in desire_carshort but, from a data management perspective, it makes more sense to combine them: they represent the same trip.\nThe new route dataset contains distance (referring to route distance this time) and duration fields (in seconds), which could be useful.\nHowever, for the purposes of this chapter, we are only interested in the geometry, from which the route distance can be calculated.\nThe following command makes use of the ability of simple features objects to contain multiple geographic columns:\nThis allows plotting the desire lines along which many short car journeys take place alongside the likely routes traveled by cars, by referring to each geometry column separately (desire_carshort$geometry and desire_carshort$geom_car in this case).\nMaking the width of the routes proportional to the number of car journeys that could potentially be replaced provides an effective way to prioritize interventions on the road network (Lovelace et al. 
2017).The code chunk below plots the desire lines and routes, resulting in Figure 12.4, which shows the routes along which many people drive short distances:77\nFIGURE 12.4: Routes along which many (300+) short (<5km Euclidean distance) car journeys are made (red) overlaying desire lines representing the same trips (black) and zone centroids (dots).\nPlotting the results on an interactive map, with mapview::mapview(desire_carshort$geom_car) for example, shows that many short car trips take place in and around Bradley Stoke.\nIt is easy to find explanations for the area’s high level of car dependency: according to Wikipedia, Bradley Stoke is “Europe’s largest new town built with private investment,” suggesting limited public transport provision.\nFurthermore, the town is surrounded by large (cycling-unfriendly) road structures, including “junctions on both the M4 and M5 motorways” (Tallon 2007).There are many benefits of converting travel desire lines into likely routes of travel from a policy perspective, primary among them the ability to understand what it is about the surrounding environment that makes people travel by a particular mode.\nWe discuss future directions of research building on the routes in Section 12.9.\nFor the purposes of this case study, suffice to say that the roads along which these short car journeys travel should be prioritized for investigation to understand how they can be made more conducive to sustainable transport modes.\nOne option would be to add new public transport nodes to the network.\nSuch nodes are described in the next section.","code":"\ndesire_lines$distance = as.numeric(st_length(desire_lines))\ndesire_carshort = dplyr::filter(desire_lines, car_driver > 300 & distance < 5000)\nroute_carshort = route(l = desire_carshort, route_fun = route_osrm)\ndesire_carshort$geom_car = st_geometry(route_carshort)\nplot(st_geometry(desire_carshort))\nplot(desire_carshort$geom_car, col = \"red\", add = TRUE)\nplot(st_geometry(st_centroid(zones_od)), add = TRUE)"},{"path":"transport.html","id":"nodes","chapter":"12 Transportation","heading":"12.6 Nodes","text":"Nodes in geographic transport data are zero-dimensional features (points) among the predominantly one-dimensional features (lines) that comprise the network.\nThere are two types of transport nodes:Nodes not directly on the network, such as zone centroids — covered in the next section — or individual 
origins and destinations, such as houses and workplaces.Nodes that are a part of transport networks, representing individual pathways, intersections between pathways (junctions) and points for entering or exiting a transport network, such as bus stops and train stations.Transport networks can be represented as graphs, in which each segment is connected (via edges representing geographic lines) to one or more other edges in the network.\nNodes outside the network can be added with “centroid connectors”, new route segments joining them to nearby nodes on the network (Hollander 2016).78\nEvery node in the network is then connected by one or more ‘edges’ that represent individual segments of the network.\nWe will see how transport networks can be represented as graphs in Section 12.7.Public transport stops are particularly important nodes that can be represented as either type of node: a bus stop that is part of a road, or a large rail station that is represented by its pedestrian entry point hundreds of meters from railway tracks.\nWe will use railway stations to illustrate public transport nodes, in relation to the research question of increasing cycling in Bristol.\nThese stations are provided by spDataLarge in bristol_stations.A common barrier preventing people from switching away from cars for commuting to work is that the distance from home to work is too far to walk or cycle.\nPublic transport can reduce this barrier by providing a fast and high-volume option for common routes into cities.\nFrom an active travel perspective, public transport ‘legs’ of longer journeys divide trips into three:The origin leg, typically from residential areas to public transport stationsThe public transport leg, which typically goes from the station nearest the trip’s origin to the station nearest its destinationThe destination leg, from the station of alighting to the destinationBuilding on the analysis conducted in Section 12.4, public transport nodes can be used to construct three-part desire lines for trips that can be taken by bus or (the mode used in this example) rail.\nThe first stage is to identify the desire lines with the most public transport travel, which in our case is easy because the previously created dataset desire_lines already contains a variable describing the number of trips by train (the public transport potential could also be estimated using public transport routing services such as OpenTripPlanner).\nTo make the approach easier to follow, we select only the top three desire lines in terms of rail use:\nThe challenge now is to ‘break up’ each of these lines into three pieces, representing travel via 
public transport nodes.\nThis can be done by converting a desire line into a multiline object consisting of three line geometries representing the origin, public transport and destination legs of the trip.\nThis operation can be divided into three stages: matrix creation (of origins, destinations and the ‘via’ points representing rail stations), identification of nearest neighbors, and conversion to multilines.\nThese stages are undertaken by line_via().\nThis stplanr function takes input lines and points and returns a copy of the desire lines — see the Desire Lines Extended vignette on the geocompr.github.io website and ?line_via for details on how it works.\nThe output is the same as the input line, except it has new geometry columns representing the journey via public transport nodes, as demonstrated below:As illustrated in Figure 12.5, the initial desire_rail lines now have three additional geometry list columns representing travel from home to the origin station, from there to the destination, and finally from the destination station to the destination.\nIn this case, the destination leg is very short (walking distance), but the origin legs may be sufficiently far to justify investment in cycling infrastructure to encourage people to cycle to the stations on the outward leg of peoples’ journey to work in the residential areas surrounding the three origin stations in Figure 12.5.\nFIGURE 12.5: Station nodes (red dots) used as intermediary points to convert straight desire lines with high rail usage (black) into three legs: to the origin station (red), via public transport (gray) and to the destination (short blue line).\n","code":"\ndesire_rail = top_n(desire_lines, n = 3, wt = train)\nncol(desire_rail)\n#> [1] 10\ndesire_rail = line_via(desire_rail, bristol_stations)\nncol(desire_rail)\n#> [1] 13"},{"path":"transport.html","id":"route-networks","chapter":"12 Transportation","heading":"12.7 Route networks","text":"\nThe data used in this section was downloaded using osmdata.\nTo avoid having to request the data from OSM repeatedly, we use the bristol_ways object, which contains point and line data for the case study area (see ?bristol_ways):\nThe above code chunk loaded a simple feature object representing almost 5,000 segments of the transport network.\nThis is an easily manageable dataset size (transport datasets can be large, but it’s best to start small).As mentioned, route networks can usefully be represented as mathematical 
graphs, with the nodes on the network connected by edges.\nA number of R packages have been developed for dealing with graphs, notably igraph.\nOne can manually convert a route network into an igraph object, but the geographic attributes will be lost.\nTo overcome this issue, SpatialLinesNetwork() was developed in the stplanr package to represent route networks simultaneously as graphs and as a set of geographic lines.\nThis function is demonstrated below using a subset of the bristol_ways object used in previous sections.The output of the previous code chunk shows that ways_sln is a composite object with various ‘slots.’\nThese include: the spatial component of the network (named sl), the graph component (g) and the ‘weightfield,’ the edge variable used for shortest path calculation (by default, segment distance).\nways_sln is of class sfNetwork, defined by the S4 class system.\nThis means that each component can be accessed using the @ operator, which is used below to extract its graph component and process it using the igraph package, before plotting the results in geographic space.\nIn the example below, the ‘edge betweenness’, meaning the number of shortest paths passing through each edge, is calculated (see ?igraph::betweenness for further details and Figure 12.6).\nThe results demonstrate that each graph edge represents a segment: the segments near the center of the road network have the greatest betweenness scores.\nFIGURE 12.6: Illustration of a small route network, with segment thickness proportional to its betweenness, generated using the igraph package as described in the text.\nOne can also find the shortest route between origins and destinations using this graph representation of the route network.\nThis can be done with functions such as sum_network_routes() from stplanr, which undertakes ‘local routing’ (see Section 12.5).","code":"\nsummary(bristol_ways)\n#> highway maxspeed ref geometry \n#> cycleway:1317 30 mph : 925 A38 : 214 LINESTRING :4915 \n#> rail : 832 20 mph : 556 A432 : 146 epsg:4326 : 0 \n#> road :2766 40 mph : 397 M5 : 144 +proj=long...: 0 \n#> 70 mph : 328 A4018 : 124 \n#> 50 mph : 158 A420 : 115 \n#> (Other): 490 (Other):1877 \n#> NA's :2061 NA's :2295\nways_freeway = bristol_ways %>% filter(maxspeed == \"70 mph\") \nways_sln = SpatialLinesNetwork(ways_freeway)\n#> Warning in SpatialLinesNetwork.sf(ways_freeway): Graph composed of multiple\n#> subgraphs, consider cleaning it with 
sln_clean_graph().\nslotNames(ways_sln)\n#> [1] \"sl\" \"g\" \"nb\" \"weightfield\"\nweightfield(ways_sln)\n#> [1] \"length\"\nclass(ways_sln@g)\n#> [1] \"igraph\"\ne = igraph::edge_betweenness(ways_sln@g)\nplot(ways_sln@sl$geometry, lwd = e / 500)"},{"path":"transport.html","id":"prioritizing-new-infrastructure","chapter":"12 Transportation","heading":"12.8 Prioritizing new infrastructure","text":"This chapter’s final practical section demonstrates the policy relevance of geocomputation for transport applications by identifying locations where new transport infrastructure may be needed.\nClearly, the types of analysis presented here would need to be extended and complemented by other methods to be used in real-world applications, as discussed in Section 12.9.\nHowever, each stage could be useful on its own, and feed into wider analyses.\nTo summarize, these were: identifying short but car-dependent commuting routes (generated from desire lines) in Section 12.5; creating desire lines representing trips to rail stations in Section 12.6; and analysis of transport systems at the route network level using graph theory in Section 12.7.The final code chunk of this chapter combines these strands of analysis.\nIt adds the car-dependent routes in route_carshort to a newly created object, route_rail, and creates a new column representing the amount of travel along the centroid-to-centroid desire lines they represent:The results of the preceding code are visualized in Figure 12.7, which shows routes with high levels of car dependency and highlights opportunities for cycling to rail stations (the subsequent code chunk creates a simple version of the figure — see code/12-cycleways.R to reproduce the figure exactly).\nThe method has some limitations: in reality, people do not travel to zone centroids or always use the shortest route algorithm for a particular mode.\nHowever, the results demonstrate routes along which cycle paths could be prioritized from both car dependency and public transport perspectives.\nFIGURE 12.7: Potential routes along which to prioritise cycle infrastructure in Bristol, based on access to key rail stations (red dots) and routes with many short car journeys (north of Bristol, surrounding Bradley Stoke). 
Line thickness is proportional to the number of trips.\nThe results may look more attractive in an interactive map, but what do they mean?\nThe routes highlighted in Figure 12.7 suggest that transport systems are intimately linked to the wider economic and social context.\nThe example of Bradley Stoke is a case in point:\nits location, and its lack of public transport services and active travel infrastructure, help explain why it is so highly car-dependent.\nThe wider point is that car dependency has a spatial distribution, which has implications for sustainable transport policies (Hickman, Ashiru, and Banister 2011).","code":"\nroute_rail = desire_rail %>%\n st_set_geometry(\"leg_orig\") %>% \n route(l = ., route_fun = route_osrm) %>% \n select(names(route_carshort))\nroute_cycleway = rbind(route_rail, route_carshort)\nroute_cycleway$all = c(desire_rail$all, desire_carshort$all)\nqtm(route_cycleway, lines.lwd = \"all\")"},{"path":"transport.html","id":"future-directions-of-travel","chapter":"12 Transportation","heading":"12.9 Future directions of travel","text":"This chapter provides a taste of the possibilities of using geocomputation for transport research.\nIt has explored some of the key geographic elements that make up a city’s transport system using open data and reproducible code.\nThe results could help plan where investment is needed.Transport systems operate at multiple interacting levels, meaning that geocomputational methods have great potential to generate insights into how they work.\nThere is much more that could be done in this area: it would be possible to build on the foundations presented in this chapter in many directions.\nTransport is the fastest growing source of greenhouse gas emissions in many countries, and is set to become “the largest GHG emitting sector, especially in developed countries” (see EURACTIV.com).\nBecause of the highly unequal distribution of transport-related emissions across society, and the fact that transport (unlike food and heating) is not essential for well-being, there is great potential for the sector to rapidly decarbonize through demand reduction, electrification of the vehicle fleet and the uptake of active travel modes such as walking and cycling.\nThe exploration of such ‘transport futures’ at the local level represents a promising direction of travel for transport-related geocomputational research.Methodologically, the foundations presented in this chapter could be extended by including more variables in the 
analysis.\nCharacteristics of a route such as speed limits, busyness and the provision of protected cycling and walking paths could be linked to ‘mode-split’ (the proportion of trips made by different modes of transport).\nBy aggregating OpenStreetMap data using buffers and the geographic data methods presented in Chapters 3 and 4, for example, it would be possible to detect the presence of green space in close proximity to transport routes.\nUsing R’s statistical modeling capabilities, this could then be used to predict current and future levels of cycling, for example.This type of analysis underlies the Propensity to Cycle Tool (PCT), a publicly accessible (see www.pct.bike) mapping tool developed in R that is being used to prioritize investment in cycling across England (Lovelace et al. 2017).\nSimilar tools could be used to encourage evidence-based transport policies related to other topics such as air pollution and public transport access around the world.","code":""},{"path":"transport.html","id":"ex-transport","chapter":"12 Transportation","heading":"12.10 Exercises","text":"What is the total distance of cycleways that would be constructed if all the routes presented in Figure 12.7 were built?\nBonus: find two ways of arriving at the same answer.\nWhat proportion of trips represented in desire_lines are accounted for in the route_cycleway object?\nBonus: what proportion of trips cross the proposed routes?\nAdvanced: write code that would increase this proportion.\nThe analysis presented in this chapter is designed for teaching how geocomputation methods can be applied to transport research. If you were doing this ‘for real’ for local government or a transport consultancy, what top 3 things would you do differently?\nClearly, the routes identified in Figure 12.7 only provide part of the picture. How would you extend the analysis to incorporate more trips that could potentially be cycled?\nImagine that you want to extend the scenario by creating key areas (not routes) for investment in place-based cycling policies such as car-free zones, cycle parking points and a reduced car parking strategy. 
How could raster data assist with this work?\nBonus: develop a raster layer that divides the Bristol region into 100 cells (10 by 10) and provides a metric related to transport policy, such as the number of people trips that pass through each cell by walking, or the average speed limit of roads, based on the bristol_ways dataset (a similar approach is taken in Chapter 13).","code":""},{"path":"location.html","id":"location","chapter":"13 Geomarketing","heading":"13 Geomarketing","text":"","code":""},{"path":"location.html","id":"prerequisites-11","chapter":"13 Geomarketing","heading":"Prerequisites","text":"This chapter requires the following packages (revgeo must also be installed):\nRequired data will be downloaded in due course.\nAs a convenience to the reader and to ensure easy reproducibility, we have made the downloaded data available in the spDataLarge package.","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(purrr)\nlibrary(raster)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(osmdata)\nlibrary(spDataLarge)"},{"path":"location.html","id":"introduction-8","chapter":"13 Geomarketing","heading":"13.1 Introduction","text":"This chapter demonstrates how the skills learned in the previous parts of the book can be applied to a particular domain: geomarketing (sometimes also referred to as location analysis or location intelligence).\nThis is a broad field of research and commercial application.\nA typical example is where to locate a new shop.\nThe aim here is to attract most visitors and, ultimately, make the most profit.\nThere are also many non-commercial applications that can use the technique for public benefit, for example where to locate new health services (Tomintz, Clarke, and Rigby 2008).People are fundamental to location analysis, in particular where they are likely to spend their time and other resources.\nInterestingly, ecological concepts and models are quite similar to those used for store location analysis.\nAnimals and plants can best meet their needs in certain ‘optimal’ locations, based on variables that change over space (Muenchow et al. 
2018; see also Chapter 14).\nThis is one of the great strengths of geocomputation and GIScience in general:\nconcepts and methods are transferable to other fields.\nPolar bears, for example, prefer northern latitudes where temperatures are lower and food (seals and sea lions) is plentiful.\nSimilarly, humans tend to congregate in certain places, creating economic niches (and high land prices) analogous to the ecological niche of the Arctic.\nThe main task of location analysis is to find out where such ‘optimal locations’ are for specific services, based on available data.\nTypical research questions include:Where do target groups live and which areas do they frequent?Where are competing stores or services located?How many people can easily reach specific stores?Do existing services over- or under-exploit the market potential?What is the market share of a company in a specific area?This chapter demonstrates how geocomputation can answer such questions, based on a hypothetical case study drawing on real data.","code":""},{"path":"location.html","id":"case-study","chapter":"13 Geomarketing","heading":"13.2 Case study: bike shops in Germany","text":"Imagine you are starting a chain of bike shops in Germany.\nThe stores should be placed in urban areas with as many potential customers as possible.\nAdditionally, a hypothetical survey (invented for this chapter, not for commercial use!) 
suggests that single young males (aged 20 to 40) are most likely to buy your products: this is your target audience.\nYou are in the lucky position to have sufficient capital to open a number of shops.\nBut where should they be placed?\nConsulting companies (employing geomarketing analysts) would happily charge high rates to answer such questions.\nLuckily, we can do so ourselves with the help of open data and open source software.\nThe following sections demonstrate how the techniques learned in the first chapters of the book can be applied to undertake common steps in service location analysis:Tidy the input data from the German census (Section 13.3)Convert the tabulated census data into raster objects (Section 13.4)Identify metropolitan areas with high population densities (Section 13.5)Download detailed geographic data (from OpenStreetMap, with osmdata) for these areas (Section 13.6)Create rasters for scoring the relative desirability of different locations using map algebra (Section 13.7)Although we apply these steps to a specific case study, they could be generalized to many scenarios of store location or public service provision.","code":""},{"path":"location.html","id":"tidy-the-input-data","chapter":"13 Geomarketing","heading":"13.3 Tidy the input data","text":"The German government provides gridded census data at either 1 km or 100 m resolution.\nThe following code chunk downloads, unzips and reads in the 1 km data.\nPlease note that census_de is also available from the spDataLarge package (data(\"census_de\", package = \"spDataLarge\")).The census_de object is a data frame containing 13 variables for more than 300,000 grid cells across Germany.\nFor our work, we only need a subset of these: Easting (x) and Northing (y), the number of inhabitants (population; pop), mean average age (mean_age), the proportion of women (women) and average household size (hh_size).\nThese variables are selected and renamed from German into English in the code chunk below and summarized in Table 13.1.\nFurther, mutate_all() is used to convert the values -1 and -9 (meaning unknown) to NA.TABLE 13.1: Categories for each variable in the census data from Datensatzbeschreibung…xlsx, located in the downloaded file census.zip (see Figure 13.1 for their spatial distribution).","code":"\ndownload.file(\"https://tinyurl.com/ybtpkwxz\", \n destfile = \"census.zip\", mode = \"wb\")\nunzip(\"census.zip\") # unzip the files\ncensus_de = readr::read_csv2(list.files(pattern 
= \"Gitter.csv\"))\n# pop = population, hh_size = household size\ninput = dplyr::select(census_de, x = x_mp_1km, y = y_mp_1km, pop = Einwohner,\n women = Frauen_A, mean_age = Alter_D,\n hh_size = HHGroesse_D)\n# set -1 and -9 to NA\ninput_tidy = mutate_all(input, list(~ifelse(. %in% c(-1, -9), NA, .)))"},{"path":"location.html","id":"create-census-rasters","chapter":"13 Geomarketing","heading":"13.4 Create census rasters","text":"After the preprocessing, the data can be converted into a raster stack or brick (see Sections 2.3.4 and 3.3.1).\nrasterFromXYZ() makes this really easy.\nIt requires an input data frame where the first two columns represent coordinates on a regular grid.\nAll the remaining columns (here: pop, women, mean_age, hh_size) serve as input for the raster brick layers (Figure 13.1; see also code/13-location-jm.R in the github repository).\nFIGURE 13.1: Gridded German census data of 2011 (see Table 13.1 for a description of the classes).\nThe next stage is to reclassify the values of the rasters stored in input_ras in accordance with the survey mentioned in Section 13.2, using the raster function reclassify(), which was introduced in Section 4.3.3.\nIn the case of the population data, we convert the classes into a numeric data type using class means.\nRaster cells are assumed to have a population of 127 if they have a value of 1 (cells in ‘class 1’ contain between 3 and 250 inhabitants) and 375 if they have a value of 2 (containing 250 to 500 inhabitants), and so on (see Table 13.1).\nA cell value of 8000 inhabitants was chosen for ‘class 6’ because these cells contain more than 8000 people.\nOf course, these are approximations of the true population, not precise values.79\nHowever, the level of detail is sufficient to delineate metropolitan areas (see the next section).In contrast to the pop variable, representing absolute estimates of the total population, the remaining variables were re-classified as weights corresponding with the weights used in the survey.\nClass 1 in the variable women, for instance, represents areas in which 0 to 40% of the population is female;\nthese are reclassified with a comparatively high weight of 3 because the target demographic is predominantly male.\nSimilarly, the classes containing the youngest people and the highest proportion of single households are reclassified to have high weights.Note that we made sure that the order of the reclassification matrices in the list is the same as for the elements of input_ras.\nFor instance, the first element corresponds 
cases population.\nSubsequently, -loop applies reclassification matrix corresponding raster layer.\nFinally, code chunk ensures reclass layers name layers input_ras.","code":"\ninput_ras = rasterFromXYZ(input_tidy, crs = st_crs(3035)$proj4string)\ninput_ras\n#> class : RasterBrick\n#> dimensions : 868, 642, 557256, 4 (nrow, ncol, ncell, nlayers)\n#> resolution : 1000, 1000 (x, y)\n#> extent : 4031000, 4673000, 2684000, 3552000 (xmin, xmax, ymin, ymax)\n#> coord. ref. : +proj=laea +lat_0=52 +lon_0=10\n#> names : pop, women, mean_age, hh_size \n#> min values : 1, 1, 1, 1 \n#> max values : 6, 5, 5, 5\nrcl_pop = matrix(c(1, 1, 127, 2, 2, 375, 3, 3, 1250, \n 4, 4, 3000, 5, 5, 6000, 6, 6, 8000), \n ncol = 3, byrow = TRUE)\nrcl_women = matrix(c(1, 1, 3, 2, 2, 2, 3, 3, 1, 4, 5, 0), \n ncol = 3, byrow = TRUE)\nrcl_age = matrix(c(1, 1, 3, 2, 2, 0, 3, 5, 0),\n ncol = 3, byrow = TRUE)\nrcl_hh = rcl_women\nrcl = list(rcl_pop, rcl_women, rcl_age, rcl_hh)\nreclass = input_ras\nfor (i in seq_len(nlayers(reclass))) {\n reclass[[i]] = reclassify(x = reclass[[i]], rcl = rcl[[i]], right = NA)\n}\nnames(reclass) = names(input_ras)\nreclass\n#> ... 
(full output not shown)\n#> names : pop, women, mean_age, hh_size \n#> min values : 127, 0, 0, 0 \n#> max values : 8000, 3, 3, 3"},{"path":"location.html","id":"define-metropolitan-areas","chapter":"13 Geomarketing","heading":"13.5 Define metropolitan areas","text":"define metropolitan areas pixels 20 km2 inhabited 500,000 people.\nPixels coarse resolution can rapidly created using aggregate(), introduced Section 5.3.3.\ncommand uses argument fact = 20 reduce resolution result twenty-fold (recall original raster resolution 1 km2):next stage keep cells half million people.Plotting reveals eight metropolitan regions (Figure 13.2).\nregion consists one raster cells.\nnice join cells belonging one region.\nraster’s clump() command exactly .\nSubsequently, rasterToPolygons() converts raster object spatial polygons, st_as_sf() converts sf-object.polys now features column named clumps indicates metropolitan region polygon belongs use dissolve polygons coherent single regions (see also Section 5.2.6):Given column input, summarize() dissolves geometry.\nFIGURE 13.2: aggregated population raster (resolution: 20 km) identified metropolitan areas (golden polygons) corresponding names.\nresulting eight metropolitan areas suitable bike shops (Figure 13.2; see also code/13-location-jm.R creating figure) still missing name.\nreverse geocoding approach can settle problem.\nGiven coordinate, reverse geocoding finds corresponding address.\nConsequently, extracting centroid coordinate metropolitan area can serve input reverse geocoding API.\nrevgeo package provides access open source Photon geocoder OpenStreetMap, Google Maps Bing.\ndefault, uses Photon API.\nrevgeo::revgeo() accepts geographical coordinates (latitude/longitude); therefore, first requirement bring metropolitan polygons appropriate coordinate reference system (Chapter 6).Choosing frame revgeocode()’s output option give back data.frame several columns referring location including street name, house number city.make sure 
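The behavior of `aggregate()` used above can be verified on a toy raster before applying it with `fact = 20` to the census data. A minimal sketch with synthetic values:

```r
library(raster)
# fine 4x4 grid; fact = 2 merges 2x2 blocks, coarsening the resolution two-fold
r = raster(matrix(1:16, nrow = 4))
r_agg = aggregate(r, fact = 2, fun = sum)
ncell(r_agg)  # 4 cells remain, each holding the sum of a 2x2 block
```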
reader uses exact results, put spDataLarge object metro_names.TABLE 13.2: Result reverse geocoding.Overall, satisfied city column serving metropolitan names (Table 13.2) apart one exception, namely Wülfrath belongs greater region Düsseldorf.\nHence, replace Wülfrath Düsseldorf (Figure 13.2).\nUmlauts like ü might lead trouble , example determining bounding box metropolitan area opq() (see ), avoid .","code":"\npop_agg = aggregate(reclass$pop, fact = 20, fun = sum)\nsummary(pop_agg)\n#> pop\n#> Min. 127\n#> 1st Qu. 39886\n#> Median 66008\n#> 3rd Qu. 105696\n#> Max. 1204870\n#> NA's 447\npop_agg = pop_agg[pop_agg > 500000, drop = FALSE] \npolys = pop_agg %>% \n clump() %>%\n rasterToPolygons() %>%\n st_as_sf()\nmetros = polys %>%\n group_by(clumps) %>%\n summarize()\nmetros_wgs = st_transform(metros, 4326)\ncoords = st_centroid(metros_wgs) %>%\n st_coordinates() %>%\n round(4)\nlibrary(revgeo)\nmetro_names = revgeo(longitude = coords[, 1], latitude = coords[, 2], \n output = \"frame\")\nmetro_names = dplyr::pull(metro_names, city) %>% \n as.character() %>% \n ifelse(. 
== \"Wülfrath\", \"Duesseldorf\", .)"},{"path":"location.html","id":"points-of-interest","chapter":"13 Geomarketing","heading":"13.6 Points of interest","text":"\nosmdata package provides easy--use access OSM data (see also Section 7.2).\nInstead downloading shops whole Germany, restrict query defined metropolitan areas, reducing computational load providing shop locations areas interest.\nsubsequent code chunk using number functions including:map() (tidyverse equivalent lapply()), iterates eight metropolitan names subsequently define bounding box OSM query function opq() (see Section 7.2).add_osm_feature() specify OSM elements key value shop (see wiki.openstreetmap.org list common key:value pairs).osmdata_sf(), converts OSM data spatial objects (class sf).(), tries repeatedly (three times case) download data fails first time.80\nrunning code: please consider download almost 2GB data.\nsave time resources, put output named shops spDataLarge.\nmake available environment ensure spDataLarge package loaded, run data(\"shops\", package = \"spDataLarge\").highly unlikely shops defined metropolitan areas.\nfollowing condition simply checks least one shop region.\n, recommend try download shops /specific region/s.make sure list element (sf data frame) comes columns, keep osm_id shop columns help another map loop.\ngiven since OSM contributors equally meticulous collecting data.\nFinally, rbind shops one large sf object.easier simply use map_dfr().\nUnfortunately, far work harmony sf objects.\nNote: shops provided spDataLarge package.thing left convert spatial point object raster (see Section 5.4.3).\nsf object, shops, converted raster parameters (dimensions, resolution, CRS) reclass object.\nImportantly, count() function used calculate number shops cell.result subsequent code chunk therefore estimate shop density (shops/km2).\nst_transform() used rasterize() ensure CRS inputs match.raster layers (population, women, mean age, household size) poi raster reclassified four 
classes (see Section 13.4).\nDefining class intervals arbitrary undertaking certain degree.\nOne can use equal breaks, quantile breaks, fixed values others.\n, choose Fisher-Jenks natural breaks approach minimizes within-class variance, result provides input reclassification matrix.","code":"\nshops = map(metro_names, function(x) {\n message(\"Downloading shops of: \", x, \"\\n\")\n # give the server a bit time\n Sys.sleep(sample(seq(5, 10, 0.1), 1))\n query = opq(x) %>%\n add_osm_feature(key = \"shop\")\n points = osmdata_sf(query)\n # request the same data again if nothing has been downloaded\n iter = 2\n while (nrow(points$osm_points) == 0 & iter > 0) {\n points = osmdata_sf(query)\n iter = iter - 1\n }\n points = st_set_crs(points$osm_points, 4326)\n})\n# checking if we have downloaded shops for each metropolitan area\nind = map(shops, nrow) == 0\nif (any(ind)) {\n message(\"There are/is still (a) metropolitan area/s without any features:\\n\",\n paste(metro_names[ind], collapse = \", \"), \"\\nPlease fix it!\")\n}\n# select only specific columns\nshops = map(shops, dplyr::select, osm_id, shop)\n# putting all list elements into a single data frame\nshops = do.call(rbind, shops)\nshops = st_transform(shops, proj4string(reclass))\n# create poi raster\npoi = rasterize(x = shops, y = reclass, field = \"osm_id\", fun = \"count\")\n# construct reclassification matrix\nint = classInt::classIntervals(values(poi), n = 4, style = \"fisher\")\nint = round(int$brks)\nrcl_poi = matrix(c(int[1], rep(int[-c(1, length(int))], each = 2), \n int[length(int)] + 1), ncol = 2, byrow = TRUE)\nrcl_poi = cbind(rcl_poi, 0:3) \n# reclassify\npoi = reclassify(poi, rcl = rcl_poi, right = NA) \nnames(poi) = \"poi\""},{"path":"location.html","id":"identifying-suitable-locations","chapter":"13 Geomarketing","heading":"13.7 Identifying suitable locations","text":"steps remain combining layers add poi reclass raster stack remove population layer .\nreasoning latter twofold.\nFirst , already 
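The map-algebra operation this workflow builds toward — summing reclassified layers cell-wise into a single score and thresholding it — can be sketched with two toy layers (synthetic values, not the census scores):

```r
library(raster)
a = raster(matrix(c(0, 1, 2, 3), nrow = 2))
b = raster(matrix(c(3, 2, 1, 0), nrow = 2))
score = sum(stack(a, b))  # cell-wise sum across all layers of the stack
score > 2                 # logical raster flagging high-scoring cells
```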
delineated metropolitan areas, areas population density average compared rest Germany.\nSecond, though advantageous many potential customers within specific catchment area, sheer number alone might actually represent desired target group.\ninstance, residential tower blocks areas high population density necessarily high purchasing power expensive cycle components.\nachieved complementary functions addLayer() dropLayer():common data science projects, data retrieval ‘tidying’ consumed much overall workload far.\nclean data, final step — calculating final score summing raster layers — can accomplished single line code.instance, score greater 9 might suitable threshold indicating raster cells bike shop placed (Figure 13.3; see also code/13-location-jm.R).\nFIGURE 13.3: Suitable areas (.e., raster cells score > 9) accordance hypothetical survey bike stores Berlin.\n","code":"\n# add poi raster\nreclass = addLayer(reclass, poi)\n# delete population raster\nreclass = dropLayer(reclass, \"pop\")\n# calculate the total score\nresult = sum(reclass)"},{"path":"location.html","id":"discussion-and-next-steps","chapter":"13 Geomarketing","heading":"13.8 Discussion and next steps","text":"presented approach typical example normative usage GIS (P. 
Longley 2015).\ncombined survey data expert-based knowledge assumptions (definition metropolitan areas, defining class intervals, definition final score threshold).\napproach less suitable scientific research applied analysis provides evidence based indication areas suitable bike shops compared sources information.\nnumber changes approach improve analysis:used equal weights calculating final scores factors, household size, important portion women mean ageWe used points interest related bike shops, --, hardware, bicycle, fishing, hunting, motorcycles, outdoor sports shops (see range shop values available OSM Wiki) may yielded refined resultsData higher resolution may improve output (see exercises)used limited set variables data sources, INSPIRE geoportal data cycle paths OpenStreetMap, may enrich analysis (see also Section 7.2)Interactions remained unconsidered, possible relationships portion men single householdsIn short, analysis extended multiple directions.\nNevertheless, given first impression understanding obtain deal spatial data R within geomarketing context.Finally, point presented analysis merely first step finding suitable locations.\nfar identified areas, 1 1 km size, representing potentially suitable locations bike shop accordance survey.\nSubsequent steps analysis taken:Find optimal location based number inhabitants within specific catchment area.\nexample, shop reachable many people possible within 15 minutes traveling bike distance (catchment area routing).\nThereby, account fact away people shop, unlikely becomes actually visit (distance decay function).Also good idea take account competitors.\n, already bike shop vicinity chosen location, possible customers (sales potential) distributed competitors (Huff 1963; Wieland 2017).need find suitable affordable real estate, e.g., terms accessibility, availability parking spots, desired frequency passers-, big windows, etc.","code":""},{"path":"location.html","id":"exercises-9","chapter":"13 
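The Huff (1963) model mentioned above is only referenced, not implemented, in this chapter; a hedged base-R sketch with hypothetical attractiveness and distance values shows the idea:

```r
# Huff's probabilistic market share model: the probability that a customer
# patronizes store j grows with its attractiveness a (e.g., floor space)
# and decays with distance d. Parameter values here are illustrative.
huff = function(a, d, alpha = 1, beta = 2) {
  u = a^alpha / d^beta  # utility of each competing store
  u / sum(u)            # normalize utilities into probabilities
}
huff(a = c(1000, 500, 800), d = c(2, 1, 4))
```

Such a model could weight the final suitability scores by expected sales potential once competitor locations are known.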
Geomarketing","heading":"13.9 Exercises","text":"We have used raster::rasterFromXYZ() to convert input_tidy into a raster brick.\nTry to achieve the same with the help of the sp::gridded() function.\nDownload the csv file containing inhabitant information for a 100-m cell resolution (https://www.zensus2011.de/SharedDocs/Downloads/DE/Pressemitteilung/DemografischeGrunddaten/csv_Bevoelkerung_100m_Gitter.zip?__blob=publicationFile&v=3).\nPlease note that the unzipped file has a size of 1.23 GB.\nTo read it into R, you can use readr::read_csv.\nThis takes 30 seconds on my machine (16 GB RAM).\ndata.table::fread() might be even faster, and returns an object of class data.table().\nUse as.tibble() to convert it into a tibble.\nBuild an inhabitant raster, aggregate it to a cell resolution of 1 km, and compare the difference with the inhabitant raster (inh) we have created using class mean values.\nSuppose our bike shop predominantly sold electric bikes to older people.\nChange the age raster accordingly, repeat the remaining analyses and compare the changes with our original result.","code":""},{"path":"eco.html","id":"eco","chapter":"14 Ecology","heading":"14 Ecology","text":"","code":""},{"path":"eco.html","id":"prerequisites-12","chapter":"14 Ecology","heading":"Prerequisites","text":"This chapter assumes a strong grasp of geographic data analysis and 
processing, covered Chapters 2 5.\nalso make use R’s interfaces dedicated GIS software, spatial cross-validation, topics covered Chapters 9 11, respectively.chapter uses following packages:","code":"\nlibrary(sf)\nlibrary(raster)\n# library(RQGIS)\nlibrary(mlr)\nlibrary(dplyr)\nlibrary(vegan)"},{"path":"eco.html","id":"introduction-9","chapter":"14 Ecology","heading":"14.1 Introduction","text":"chapter model floristic gradient fog oases reveal distinctive vegetation belts clearly controlled water availability.\n, bring together concepts presented previous chapters even extend (Chapters 2 5 Chapters 9 11).Fog oases one fascinating vegetation formations ever encountered.\nformations, locally termed lomas, develop mountains along coastal deserts Peru Chile.81\ndeserts’ extreme conditions remoteness provide habitat unique ecosystem, including species endemic fog oases.\nDespite arid conditions low levels precipitation around 30-50 mm per year average, fog deposition increases amount water available plants austal winter.\nresults green southern-facing mountain slopes along coastal strip Peru (Figure 14.1).\nfog, develops temperature inversion caused cold Humboldt current austral winter, provides name habitat.\nEvery years, El Niño phenomenon brings torrential rainfall sun-baked environment (Dillon, Nakazawa, Leiva 2003).\ncauses desert bloom, provides tree seedlings chance develop roots long enough survive following arid conditions.Unfortunately, fog oases heavily endangered.\nmostly due human activity (agriculture climate change).\neffectively protect last remnants unique vegetation ecosystem, evidence needed composition spatial distribution native flora (Muenchow, Bräuning, et al. 2013; Muenchow, Hauenstein, et al. 
2013).\nLomas mountains also economic value tourist destination, can contribute well-local people via recreation.\nexample, Peruvians live coastal desert, lomas mountains frequently closest “green” destination.chapter demonstrate ecological applications techniques learned previous chapters.\ncase study involve analyzing composition spatial distribution vascular plants southern slope Mt. Mongón, lomas mountain near Casma central northern coast Peru (Figure 14.1).\nFIGURE 14.1: Mt. Mongón study area, Muenchow, Schratz, Brenning (2017).\nfield study Mt. Mongón, recorded vascular plants living 100 randomly sampled 4x4 m2 plots austral winter 2011 (Muenchow, Bräuning, et al. 2013).\nsampling coincided strong La Niña event year (see ENSO monitoring NOASS Climate Prediction Center).\nled even higher levels aridity usual coastal desert.\nhand, also increased fog activity southern slopes Peruvian lomas mountains.Ordinations dimension-reducing techniques allow extraction main gradients (noisy) dataset, case floristic gradient developing along southern mountain slope (see next section).\nchapter model first ordination axis, .e., floristic gradient, function environmental predictors altitude, slope, catchment area NDVI.\n, make use random forest model - popular machine learning algorithm (Breiman 2001).\nmodel allow us make spatial predictions floristic composition anywhere study area.\nguarantee optimal prediction, advisable tune beforehand hyperparameters help spatial cross-validation (see Section 11.5.2).","code":""},{"path":"eco.html","id":"data-and-data-preparation","chapter":"14 Ecology","heading":"14.2 Data and data preparation","text":"data needed subsequent analyses available via spDataLarge package.study_area sf polygon representing outlines study area.\nrandom_points sf object, contains 100 randomly chosen sites.\ncomm community matrix wide data format (Wickham 2014) rows represent visited sites field columns observed species.82The values represent species cover per 
site, recorded area covered species proportion site area percentage points (%; please note one site can >100% due overlapping cover individual plants).\nrownames comm correspond id column random_points.\ndem digital elevation model (DEM) study area, ndvi Normalized Difference Vegetation Index (NDVI) computed red near-infrared channels Landsat scene (see Section 4.3.3 ?ndvi).\nVisualizing data helps get familiar , shown Figure 14.2 dem overplotted random_points study_area.\nFIGURE 14.2: Study mask (polygon), location sampling sites (black points) DEM background.\nnext step compute variables need modeling predictive mapping (see Section 14.4.2) also aligning Non-metric multidimensional scaling (NMDS) axes main gradient study area, altitude humidity, respectively (see Section 14.3).Specifically, compute catchment slope catchment area digital elevation model using R-GIS bridges (see Chapter 9).\nCurvatures might also represent valuable predictors, Exercise section can find change modeling result.compute catchment area catchment slope, make use saga:sagawetnessindex function.83\nget_usage() returns function parameters default values specific geoalgorithm.\n, present selection complete output.Subsequently, can specify needed parameters using R named arguments (see Section 9.2).\nRemember can use RasterLayer living R’s global environment specify input raster DEM (see Section 9.2).\nSpecifying 1 SLOPE_TYPE makes sure algorithm return catchment slope.\nresulting output rasters saved temporary files .sdat extension SAGA raster format.\nSetting load_output TRUE ensures resulting rasters imported R.returns list named ep consisting two elements: AREA SLOPE.\nLet us add two raster objects list, namely dem ndvi, convert raster stack (see Section 2.3.4).Additionally, catchment area values highly skewed right (hist(ep$carea)).\nlog10-transformation makes distribution normal.convenience reader, added ep spDataLarge:Finally, can extract terrain attributes field observations (see also 
Section 5.4.2).","code":"\ndata(\"study_area\", \"random_points\", \"comm\", \"dem\", \"ndvi\", package = \"spDataLarge\")\n# sites 35 to 40 and corresponding occurrences of the first five species in the\n# community matrix\ncomm[35:40, 1:5]\n#> Alon_meri Alst_line Alte_hali Alte_porr Anth_eccr\n#> 35 0 0 0 0.0 1.000\n#> 36 0 0 1 0.0 0.500\n#> 37 0 0 0 0.0 0.125\n#> 38 0 0 0 0.0 3.000\n#> 39 0 0 0 0.0 2.000\n#> 40 0 0 0 0.2 0.125\nget_usage(\"saga:sagawetnessindex\")\n#>ALGORITHM: Saga wetness index\n#> DEM \n#> ...\n#> SLOPE_TYPE \n#> ...\n#> AREA \n#> SLOPE \n#> AREA_MOD \n#> TWI \n#> ...\n#>SLOPE_TYPE(Type of Slope)\n#> 0 - [0] local slope\n#> 1 - [1] catchment slope\n#> ...\n# environmental predictors: catchment slope and catchment area\nep = run_qgis(alg = \"saga:sagawetnessindex\",\n DEM = dem,\n SLOPE_TYPE = 1, \n SLOPE = tempfile(fileext = \".sdat\"),\n AREA = tempfile(fileext = \".sdat\"),\n load_output = TRUE,\n show_output_paths = FALSE)\nep = stack(c(dem, ndvi, ep))\nnames(ep) = c(\"dem\", \"ndvi\", \"carea\", \"cslope\")\nep$carea = log10(ep$carea)\ndata(\"ep\", package = \"spDataLarge\")\nrandom_points[, names(ep)] = raster::extract(ep, random_points)"},{"path":"eco.html","id":"nmds","chapter":"14 Ecology","heading":"14.3 Reducing dimensionality","text":"Ordinations popular tool vegetation science extract main information, frequently corresponding ecological gradients, large species-plot matrices mostly filled 0s.\nHowever, also used remote sensing, soil sciences, geomarketing many fields.\nunfamiliar ordination techniques need refresher, look Michael W. 
Palmer’s web page short introduction popular ordination techniques ecology Borcard, Gillet, Legendre (2011) deeper look apply techniques R.\nvegan’s package documentation also helpful resource (vignette(package = \"vegan\")).Principal component analysis (PCA) probably famous ordination technique.\ngreat tool reduce dimensionality one can expect linear relationships variables, joint absence variable (example calcium) two plots (observations) can considered similarity.\nbarely case vegetation data.one, relationships usually non-linear along environmental gradients.\nmeans presence plant usually follows unimodal relationship along gradient (e.g., humidity, temperature salinity) peak favorable conditions declining ends towards unfavorable conditions.Secondly, joint absence species two plots hardly indication similarity.\nSuppose plant species absent driest (e.g., extreme desert) moistest locations (e.g., tree savanna) sampling.\nreally refrain counting similarity likely thing two completely different environmental settings common terms floristic composition shared absence species (except rare ubiquitous species).Non-metric multidimensional scaling (NMDS) one popular dimension-reducing technique ecology (von Wehrden et al. 
2009).\nNMDS reduces rank-based differences distances objects original matrix distances ordinated objects.\ndifference expressed stress.\nlower stress value, better ordination, .e., low-dimensional representation original matrix.\nStress values lower 10 represent excellent fit, stress values around 15 still good, values greater 20 represent poor fit (McCune, Grace, Urban 2002).\nR, metaMDS() vegan package can execute NMDS.\ninput, expects community matrix sites rows species columns.\nOften ordinations using presence-absence data yield better results (terms explained variance) though prize , course, less informative input matrix (see also Exercises).\ndecostand() converts numerical observations presences absences 1 indicating occurrence species 0 absence species.\nOrdination techniques NMDS require least one observation per site.\nHence, need dismiss sites species found.resulting output matrix serves input NMDS.\nk specifies number output axes, , set 4.84\nNMDS iterative procedure trying make ordinated space similar input matrix step.\nmake sure algorithm converges, set number steps 500 (try parameter).stress value 9 represents good result, means reduced ordination space represents large majority variance input matrix.\nOverall, NMDS puts objects similar (terms species composition) closer together ordination space.\nHowever, opposed ordination techniques, axes arbitrary necessarily ordered importance (Borcard, Gillet, Legendre 2011).\nHowever, already know humidity represents main gradient study area (Muenchow, Bräuning, et al. 2013; Muenchow, Schratz, Brenning 2017).\nSince humidity highly correlated elevation, rotate NMDS axes accordance elevation (see also ?MDSrotate details rotating NMDS axes).\nPlotting result reveals first axis , intended, clearly associated altitude (Figure 14.3).\nFIGURE 14.3: Plotting first NMDS axis altitude.\nscores first NMDS axis represent different vegetation formations, .e., floristic gradient, appearing along slope Mt. 
Mongón.\nspatially visualize , can model NMDS scores previously created predictors (Section 14.2), use resulting model predictive mapping (see next section).","code":"\n# presence-absence matrix\npa = decostand(comm, \"pa\") # 100 rows (sites), 69 columns (species)\n# keep only sites in which at least one species was found\npa = pa[rowSums(pa) != 0, ] # 84 rows, 69 columns\nset.seed(25072018)\nnmds = metaMDS(comm = pa, k = 4, try = 500)\nnmds$stress\n#> ...\n#> Run 498 stress 0.08834745 \n#> ... Procrustes: rmse 0.004100446 max resid 0.03041186 \n#> Run 499 stress 0.08874805 \n#> ... Procrustes: rmse 0.01822361 max resid 0.08054538 \n#> Run 500 stress 0.08863627 \n#> ... Procrustes: rmse 0.01421176 max resid 0.04985418 \n#> *** Solution reached\n#> 0.08831395\nelev = dplyr::filter(random_points, id %in% rownames(pa)) %>% \n dplyr::pull(dem)\n# rotating NMDS in accordance with altitude (proxy for humidity)\nrotnmds = MDSrotate(nmds, elev)\n# extracting the first two axes\nsc = scores(rotnmds, choices = 1:2)\n# plotting the first axis against altitude\nplot(y = sc[, 1], x = elev, xlab = \"elevation in m\", \n ylab = \"First NMDS axis\", cex.lab = 0.8, cex.axis = 0.8)\nknitr::include_graphics(\"figures/xy-nmds-1.png\")"},{"path":"eco.html","id":"modeling-the-floristic-gradient","chapter":"14 Ecology","heading":"14.4 Modeling the floristic gradient","text":"predict floristic gradient spatially, make use random forest model (Hengl et al. 2018).\nRandom forest models frequently used environmental ecological modeling, often provide best results terms predictive performance (Schratz et al. 2018).\n, shortly introduce decision trees bagging, since form basis random forests.\nrefer reader James et al. 
(2013) detailed description random forests related techniques.introduce decision trees example, first construct response-predictor matrix joining rotated NMDS scores field observations (random_points).\nalso use resulting data frame mlr modeling later .Decision trees split predictor space number regions.\nillustrate , apply decision tree data using scores first NMDS axis response (sc) altitude (dem) predictor.\nFIGURE 14.4: Simple example decision tree three internal nodes four terminal nodes.\nresulting tree consists three internal nodes four terminal nodes (Figure 14.4).\nfirst internal node top tree assigns observations \n\n328.5\nleft observations right branch.\nobservations falling left branch mean NMDS score \n\n-1.198.\nOverall, can interpret tree follows: higher elevation, higher NMDS score becomes.\nDecision trees tendency overfit, mirror closely input data including noise turn leads bad predictive performances [Section 11.4; James et al. (2013)].\nBootstrap aggregation (bagging) ensemble technique helps overcome problem.\nEnsemble techniques simply combine predictions multiple models.\nThus, bagging takes repeated samples input data averages predictions.\nreduces variance overfitting result much better predictive accuracy compared decision trees.\nFinally, random forests extend improve bagging decorrelating trees desirable since averaging predictions highly correlated trees shows higher variance thus lower reliability averaging predictions decorrelated trees (James et al. 
2013).\nachieve , random forests use bagging, contrast traditional bagging tree allowed use available predictors, random forests use random sample available predictors.","code":"\n# construct response-predictor matrix\n# id- and response variable\nrp = data.frame(id = as.numeric(rownames(sc)), sc = sc[, 1])\n# join the predictors (dem, ndvi and terrain attributes)\nrp = inner_join(random_points, rp, by = \"id\")\nlibrary(\"tree\")\ntree_mo = tree(sc ~ dem, data = rp)\nplot(tree_mo)\ntext(tree_mo, pretty = 0)"},{"path":"eco.html","id":"mlr-building-blocks","chapter":"14 Ecology","heading":"14.4.1 mlr building blocks","text":"code section largely follows steps introduced Section 11.5.2.\ndifferences following:response variable numeric, hence regression task replace classification task Section 11.5.2.Instead AUROC can used categorical response variables, use root mean squared error (RMSE) performance measure.use random forest model instead support vector machine naturally goes along different hyperparameters.leaving assessment bias-reduced performance measure exercise reader (see Exercises).\nInstead show tune hyperparameters (spatial) predictions.Remember 125,500 models necessary retrieve bias-reduced performance estimates using 100-repeated 5-fold spatial cross-validation random search 50 iterations (see Section 11.5.2).\nhyperparameter tuning level, found best hyperparameter combination turn used outer performance level predicting test data specific spatial partition (see also Figure 11.6).\ndone five spatial partitions, repeated 100 times yielding total 500 optimal hyperparameter combinations.\none use making spatial predictions?\nanswer simple, none .\nRemember, tuning done retrieve bias-reduced performance estimate, best possible spatial prediction.\nlatter, one estimates best hyperparameter combination complete dataset.\nmeans, inner hyperparameter tuning level longer needed makes perfect sense since applying model new data (unvisited field observations) true 
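The hyperparameters discussed here (mtry, min.node.size, sample.fraction) can be seen in a minimal ranger call on built-in data — a sketch for orientation, not the chapter's tuned spatial model (requires the ranger package):

```r
library(ranger)
# minimal random forest regression; mtry limits how many predictors each
# tree may consider at a split, min.node.size bounds terminal node size
m = ranger(Sepal.Length ~ ., data = iris,
           mtry = 2, min.node.size = 10, sample.fraction = 0.8)
m$prediction.error  # out-of-bag mean squared error
```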
outcomes unavailable, hence testing impossible case.\nTherefore, tune hyperparameters good spatial prediction complete dataset via 5-fold spatial CV one repetition.\npreparation modeling using mlr package includes construction response-predictor matrix containing variables used modeling construction separate coordinate data frame.constructed input variables, set specifying mlr building blocks (task, learner, resampling).\nuse regression task since response variable numeric.\nlearner random forest model implementation ranger package.opposed example support vector machines (see Section 11.5.2), random forests often already show good performances used default values hyperparameters (may one reason popularity).\nStill, tuning often moderately improves model results, thus worth effort (Probst, Wright, Boulesteix 2018).\nSince deal geographic data, make use spatial cross-validation tune hyperparameters (see Sections 11.4 11.5).\nSpecifically, use five-fold spatial partitioning one repetition (makeResampleDesc()).\nspatial partitions, run 50 models (makeTuneControlRandom()) find optimal hyperparameter combination.random forests, hyperparameters mtry, min.node.size sample.fraction determine degree randomness, tuned (Probst, Wright, Boulesteix 2018).\nmtry indicates many predictor variables used tree.\npredictors used, corresponds fact bagging (see beginning Section 14.4).\nsample.fraction parameter specifies fraction observations used tree.\nSmaller fractions lead greater diversity, thus less correlated trees often desirable (see ).\nmin.node.size parameter indicates number observations terminal node least (see also Figure 14.4).\nNaturally, trees computing time become larger, lower min.node.size.Hyperparameter combinations selected randomly fall inside specific tuning limits (makeParamSet()).\nmtry range 1 number predictors\n\n(4)\nsample.fraction\nrange 0.2 0.9 min.node.size range 1 10.Finally, tuneParams() runs hyperparameter tuning, find optimal hyperparameter 
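RMSE, the performance measure used for tuning here, is simple to compute directly; a minimal sketch:

```r
# root mean squared error between observed and predicted values
rmse = function(obs, pred) sqrt(mean((obs - pred)^2))
rmse(obs = c(1, 2, 3), pred = c(1.5, 2, 2.5))  # ≈ 0.408
```

mlr's `mlr::rmse` measure, used in the tuning call below, reports the same quantity on the held-out test data of each spatial partition.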
combination specified parameters.\nperformance measure root mean squared error (RMSE).mtry 4, sample.fraction 0.887, min.node.size 10 represent best hyperparameter combination.\nRMSE \n\n0.51\nrelatively good considering range response variable \n\n3.04\n(diff(range(rp$sc))).","code":"\n# extract the coordinates into a separate data frame\ncoords = sf::st_coordinates(rp) %>% \n as.data.frame() %>%\n rename(x = X, y = Y)\n# only keep response and predictors which should be used for the modeling\nrp = dplyr::select(rp, -id, -spri) %>%\n st_drop_geometry()\n# create task\ntask = makeRegrTask(data = rp, target = \"sc\", coordinates = coords)\n# learner\nlrn_rf = makeLearner(cl = \"regr.ranger\", predict.type = \"response\")\n# spatial partitioning\nperf_level = makeResampleDesc(\"SpCV\", iters = 5)\n# specifying random search\nctrl = makeTuneControlRandom(maxit = 50L)\n# specifying the search space\nps = makeParamSet(\n makeIntegerParam(\"mtry\", lower = 1, upper = ncol(rp) - 1),\n makeNumericParam(\"sample.fraction\", lower = 0.2, upper = 0.9),\n makeIntegerParam(\"min.node.size\", lower = 1, upper = 10)\n)\n# hyperparamter tuning\nset.seed(02082018)\ntune = tuneParams(learner = lrn_rf, \n task = task,\n resampling = perf_level,\n par.set = ps,\n control = ctrl, \n measures = mlr::rmse)\n#>...\n#> [Tune-x] 49: mtry=3; sample.fraction=0.533; min.node.size=5\n#> [Tune-y] 49: rmse.test.rmse=0.5636692; time: 0.0 min\n#> [Tune-x] 50: mtry=1; sample.fraction=0.68; min.node.size=5\n#> [Tune-y] 50: rmse.test.rmse=0.6314249; time: 0.0 min\n#> [Tune] Result: mtry=4; sample.fraction=0.887; min.node.size=10 :\n#> rmse.test.rmse=0.5104918"},{"path":"eco.html","id":"predictive-mapping","chapter":"14 Ecology","heading":"14.4.2 Predictive mapping","text":"tuned hyperparameters can now used prediction.\nsimply modify learner using result hyperparameter tuning, run corresponding model.last step apply model spatially available predictors, .e., raster stack.\nfar, raster::predict() 
support the output of ranger models, hence, we will program the prediction ourselves.\nFirst, we convert ep into a prediction data frame which secondly serves as input for the predict.ranger() function.\nThirdly, we put the predicted values back into a RasterLayer (see Section 3.3.1 and Figure 14.5).\nFIGURE 14.5: Predictive mapping of the floristic gradient clearly revealing distinct vegetation belts.\nThe predictive mapping clearly reveals distinct vegetation belts (Figure 14.5).\nPlease refer to Muenchow, Hauenstein, et al. (2013) for a detailed description of vegetation belts on lomas mountains.\nThe blue color tones represent the so-called Tillandsia-belt.\nTillandsia is a highly adapted genus especially found in high quantities at the sandy and quite desertic foot of lomas mountains.\nThe yellow color tones refer to a herbaceous vegetation belt with a much higher plant cover compared to the Tillandsia-belt.\nThe orange colors represent the bromeliad belt, which features the highest species richness and plant cover.\nIt can be found directly beneath the temperature inversion (ca. 750-850 m asl) where humidity due to fog is highest.\nWater availability naturally decreases above the temperature inversion, and the landscape becomes desertic again with a few succulent species (succulent belt; red colors).\nInterestingly, the spatial prediction clearly reveals that the bromeliad belt is interrupted, an interesting finding we would not have detected without the predictive mapping.","code":"\n# learning using the best hyperparameter combination\nlrn_rf = makeLearner(cl = \"regr.ranger\",\n predict.type = \"response\",\n mtry = tune$x$mtry, \n sample.fraction = tune$x$sample.fraction,\n min.node.size = tune$x$min.node.size)\n# doing the same more elegantly using setHyperPars()\n# lrn_rf = setHyperPars(makeLearner(\"regr.ranger\", predict.type = \"response\"),\n# par.vals = tune$x)\n# train model\nmodel_rf = train(lrn_rf, task)\n# to retrieve the ranger output, run:\n# mlr::getLearnerModel(model_rf)\n# which corresponds to:\n# ranger(sc ~ ., data = rp, \n# mtry = tune$x$mtry, \n# sample.fraction = tune$x$sample.fraction,\n# min.node.size = tune$x$min.node.size)\n# convert raster stack into a data frame\nnew_data = 
as.data.frame(as.matrix(ep))\n# apply the model to the data frame\npred_rf = predict(model_rf, newdata = new_data)\n# put the predicted values into a raster\npred = dem\n# replace altitudinal values by rf-prediction values\npred[] = pred_rf$data$response"},{"path":"eco.html","id":"conclusions-1","chapter":"14 Ecology","heading":"14.5 Conclusions","text":"In this chapter we have ordinated the community matrix of the lomas Mt. Mongón with the help of a NMDS (Section 14.3).\nThe first axis, representing the main floristic gradient in the study area, was modeled as a function of environmental predictors which have partly been derived with the help of R-GIS bridges (Section 14.2).\nThe mlr package provided the building blocks to spatially tune the hyperparameters mtry, sample.fraction and min.node.size (Section 14.4.1).\nThe tuned hyperparameters served as input for the final model which in turn was applied to the environmental predictors for a spatial representation of the floristic gradient (Section 14.4.2).\nThe result demonstrates spatially the astounding biodiversity in the middle of the desert.\nSince lomas mountains are heavily endangered, the prediction map can serve as a basis for informed decision-making on delineating protection zones, and making the local population aware of the uniqueness found in their immediate neighborhood.\nIn terms of methodology, a few additional points could be addressed:\nIt would be interesting to also model the second ordination axis, and to subsequently find an innovative way of visualizing jointly the modeled scores of the two axes in one prediction map.\nIf we were interested in interpreting the model in an ecologically meaningful way, we should probably use (semi-)parametric models (Muenchow, Bräuning, et al. 2013; A. Zuur et al. 2009; A. F. Zuur et al. 2017).\nHowever, there are at least approaches that help to interpret machine learning models such as random forests (see, e.g., https://mlr-org.github.io/interpretable-machine-learning-iml--mlr/).\nA sequential model-based optimization (SMBO) might be preferable to the random search for hyperparameter optimization used in this chapter (Probst, Wright, and Boulesteix 2018).\nFinally, please note that random forest and other machine learning models are frequently used in a setting with lots of observations and many predictors, much more than used in this chapter, and where it is unclear which variables and variable interactions contribute to explaining the response.\nAdditionally, the relationships might be highly non-linear.\nIn our use case, the relationship between response and predictors is pretty clear, there is only a slight amount of non-linearity and the number of observations and predictors is low.\nHence, it might be worth trying a linear model.\nA linear model is much easier to explain and understand than a random forest model, and therefore to be preferred (law of parsimony); additionally it is computationally less demanding (see Exercises).\nIf the linear model cannot cope with the degree of non-linearity present in the data, one could also try a generalized additive model (GAM).\nThe point here is that the toolbox of a data scientist consists of more than one tool, and it is your responsibility to select the tool best suited for the task and purpose at hand.\nHere, we wanted to introduce the reader to random forest modeling and how to use the corresponding results for spatial predictions.\nFor this purpose, a well-studied dataset with known relationships between response and predictors is appropriate.\nHowever, this does not imply that the random forest model has returned the best result in terms of predictive performance (see Exercises).","code":""},{"path":"eco.html","id":"exercises-10","chapter":"14 Ecology","heading":"14.6 Exercises","text":"Run a NMDS using the percentage data of the community matrix.\nReport the stress value and compare it to the stress value as retrieved from the NMDS using presence-absence data.\nWhat might explain the observed difference?\nCompute all the predictor rasters we have used in the chapter (catchment slope, catchment area), and put them into a raster stack.\nAdd dem and ndvi to the raster stack.\nNext, compute the profile
and tangential curvature as additional predictor rasters and add them to the raster stack (hint: grass7:r.slope.aspect).\nFinally, construct a response-predictor matrix.\nThe scores of the first NMDS axis (the result when using the presence-absence community matrix) rotated in accordance with elevation represent the response variable, and should be joined to random_points (use an inner join).\nTo complete the response-predictor matrix, extract the values of the environmental predictor raster stack to random_points.\nUse the response-predictor matrix of the previous exercise to fit a random forest model.\nFind the optimal hyperparameters and use them for making a prediction map.\nRetrieve the bias-reduced RMSE of a random forest model using spatial cross-validation including the estimation of optimal hyperparameter combinations (random search with 50 iterations) in an inner tuning loop (see Section 11.5.2).\nParallelize the tuning level (see Section 11.5.2).\nReport the mean RMSE and use a boxplot to visualize all retrieved RMSEs.\nRetrieve the bias-reduced RMSE of a simple linear model using spatial cross-validation.\nCompare the result to the result of the random forest model by making RMSE boxplots for each modeling approach.","code":""},{"path":"conclusion.html","id":"conclusion","chapter":"15 Conclusion","heading":"15 Conclusion","text":"","code":""},{"path":"conclusion.html","id":"prerequisites-13","chapter":"15 Conclusion","heading":"Prerequisites","text":"Like the introduction, this concluding chapter contains few code chunks.\nBut its prerequisites are demanding.\nIt assumes that you have:\nread and attempted the exercises in all the chapters of Part I (Foundations);\nconsidered how you can use geocomputation to solve real-world problems, at work and beyond, after engaging with Part III (Applications).","code":""},{"path":"conclusion.html","id":"introduction-10","chapter":"15 Conclusion","heading":"15.1 Introduction","text":"The aim of this chapter is to synthesize the contents, with reference to recurring themes/concepts, and to inspire future directions of application and development.\nSection 15.2 discusses the wide range of options for handling geographic data in R.\nChoice is a key feature of open source software; the section provides guidance on choosing among the various options.\nSection 15.3 describes gaps in the book’s contents and explains why some areas of research were deliberately omitted, while others were emphasized.\nThis discussion leads to the question (which is answered in Section 15.4): having read this book, where to go next?\nSection 15.5 returns to the wider issues raised in Chapter 1.\nIn it we consider geocomputation as part of a wider ‘open source approach,’ which ensures that methods are publicly accessible, reproducible and supported by collaborative communities.\nThis final section of the book also provides some pointers on how to get involved.","code":""},{"path":"conclusion.html","id":"package-choice","chapter":"15 Conclusion","heading":"15.2 Package choice","text":"A characteristic of R is that there are often multiple ways to achieve the same result.\nThe code chunk below illustrates this by using three functions, covered in Chapters 3 and 5, to combine the 16 regions of New Zealand into a single geometry:\nAlthough the classes, attributes and column names of the resulting objects nz_u1 to nz_u3 differ, their geometries are identical.\nThis can be verified using the base R function identical().85\nWhich to use?\nIt depends: the former
only processes the geometry data contained in nz so is faster, while the other options performed attribute operations, which may be useful for subsequent steps.\nThe wider point is that there are often multiple options to choose from when working with geographic data in R, even within a single package.\nThe range of options grows further when more R packages are considered: you could achieve the same result using the older sp package, for example.\nWe recommend using sf and the other packages showcased in this book, for the reasons outlined in Chapter 2, but it’s worth being aware of alternatives and being able to justify your choice of software.\nA common (and sometimes controversial) choice is between tidyverse and base R approaches.\nWe cover both and encourage you to try both before deciding which is more appropriate for different tasks.\nThe following code chunk, described in Chapter 3, shows how attribute data subsetting works in each approach, using the base R operator [ and the select() function from the tidyverse package dplyr.\nThe syntax differs but the results are (in essence) the same:\nThe question arises: which to use?\nThe answer is: it depends.\nEach approach has advantages: the pipe syntax is popular and appealing to some, while base R is more stable, and is well known to others.\nChoosing between them is therefore largely a matter of preference.\nHowever, if you choose to use tidyverse functions to handle geographic data, beware of a number of pitfalls (see the supplementary article tidyverse-pitfalls on the website that supports this book).\nWhile commonly needed operators/functions were covered in depth — such as the base R [ subsetting operator and the dplyr function filter() — many other functions for working with geographic data, from other packages, have been mentioned.\nChapter 1 mentions 20+ influential packages for working with geographic data, and only a handful of these have been demonstrated in subsequent chapters.\nThere are hundreds more.\nAs of early 2019, nearly 200 packages were mentioned in the Spatial Task View;\nmore packages and countless functions for geographic data are developed each year, making it impractical to cover them all in a single book.\nThe rate of evolution of R’s spatial ecosystem may seem overwhelming, but there are strategies to deal with the wide range of options.\nOur advice is to start by learning one approach in depth but to have a general understanding of the breadth of options available.\nThis advice applies equally to solving geographic problems with R (Section 15.4 covers developments in other languages) as it does to other fields of knowledge and application.\nOf course, some packages perform much better than others, making package selection an important decision.\nFrom this diversity, we have focused on packages that are future-proof (they will work long into
the future), high performance (relative to other R packages) and complementary.\nBut there is still substantial overlap in the packages we have used, as illustrated by the diversity of packages for making maps, for example (see Chapter 8).\nPackage overlap is not necessarily a bad thing.\nIt can increase resilience, performance (partly driven by friendly competition and mutual learning between developers) and choice, a key feature of open source software.\nIn this context the decision to use a particular approach, such as the sf/tidyverse/raster ecosystem advocated in this book, should be made with knowledge of the alternatives.\nThe sp/rgdal/rgeos ecosystem which sf was designed to supersede, for example, can do many of the things covered in this book and, due to its age, is built on by many other packages.86\nAlthough best known for point pattern analysis, the spatstat package also supports raster and other vector geometries (Baddeley and Turner 2005).\nAt the time of writing (October 2018) 69 packages depend on it, making it more than a package: spatstat is an alternative R-spatial ecosystem.\nIt is also worth being aware of promising alternatives that are under development.\nThe package stars, for example, provides a new class system for working with spatiotemporal data.\nIf you are interested in this topic, you can check for updates on the package’s source code and the broader SpatioTemporal Task View.\nThe same principle applies to other domains: it is important to justify software choices and review software decisions based on up-to-date information.","code":"\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'\nnz_u1 = sf::st_union(nz)\nnz_u2 = aggregate(nz[\"Population\"], list(rep(1, nrow(nz))), sum)\nnz_u3 = dplyr::summarise(nz, t = sum(Population))\nidentical(nz_u1, nz_u2$geometry)\n#> [1] TRUE\nidentical(nz_u1, nz_u3$geom)\n#> [1] TRUE\nlibrary(dplyr) # attach tidyverse package\nnz_name1 = nz[\"Name\"] # base R approach\nnz_name2 = nz %>% select(Name) # tidyverse approach\nidentical(nz_name1$Name, nz_name2$Name) # check results\n#> [1] TRUE"},{"path":"conclusion.html","id":"gaps","chapter":"15 Conclusion","heading":"15.3 Gaps and overlaps","text":"There are a number of gaps in, and some overlaps between, the topics covered in this book.\nWe have been selective, emphasizing some topics while omitting others.\nWe have tried to emphasize the topics that are most commonly needed in real-world applications such as geographic data operations, projections, data
read/write and visualization.\nThese topics appear repeatedly in the chapters, a substantial area of overlap designed to consolidate these essential skills for geocomputation.\nOn the other hand, we have omitted topics that are less commonly used, or which are covered in-depth elsewhere.\nStatistical topics including point pattern analysis, spatial interpolation (kriging) and spatial epidemiology, for example, are only mentioned with reference to other topics such as the machine learning techniques covered in Chapter 11 (if at all).\nThere is already excellent material on these methods, including statistically orientated chapters in R. Bivand, Pebesma, and Gómez-Rubio (2013) and a book on point pattern analysis by Baddeley, Rubak, and Turner (2015).\nOther topics which received limited attention were remote sensing and using R alongside (rather than as a bridge to) dedicated GIS software.\nThere are many resources on these topics, including Wegmann, Leutner, and Dech (2016) and the GIS-related teaching materials available from Marburg University.\nInstead of covering spatial statistical modeling and inference techniques, we focussed on machine learning (see Chapters 11 and 14).\nAgain, the reason was that there are already excellent resources on these topics, especially with ecological use cases, including A. Zuur et al. (2009), A. F. Zuur et al. (2017) and freely available teaching material and code on Geostatistics & Open-source Statistical Computing by David Rossiter, hosted at css.cornell.edu/faculty/dgr2, and the granolarr project by Stefano De Sabbata at the University of Leicester for an introduction to R for geographic data science.\nThere are also excellent resources on spatial statistics using Bayesian modeling, a powerful framework for modeling and uncertainty estimation (Blangiardo and Cameletti 2015; Krainski et al.
2018).\nFinally, we have largely omitted big data analytics.\nThis might seem surprising since especially geographic data can become big really fast.\nBut the prerequisite for doing big data analytics is to know how to solve a problem on a small dataset.\nOnce you have learned that, you can apply the exact same techniques to big data questions, though of course you need to expand your toolbox.\nThe first thing to learn is to handle geographic data queries.\nThis is because big data analytics often boil down to extracting a small amount of data from a database for a specific statistical analysis.\nFor this, we have provided an introduction to spatial databases and how to use a GIS from within R in Chapter 9.\nIf you really have to do the analysis on big or even the complete dataset, hopefully, the problem you are trying to solve is embarrassingly parallel.\nFor this, you need to learn a system that is able to do this parallelization efficiently such as Hadoop, GeoMesa (http://www.geomesa.org/) or GeoSpark (Huang et al. 2017).\nBut still, you are applying the same techniques and concepts you have used on small datasets to answer a big data question, the only difference is that you then do it in a big data setting.","code":""},{"path":"conclusion.html","id":"next","chapter":"15 Conclusion","heading":"15.4 Where to go next?","text":"As indicated in the previous sections, the book has covered only a fraction of R’s geographic ecosystem, and there is much more to discover.\nWe have progressed quickly, from geographic data models in Chapter 2, to advanced applications in Chapter 14.\nConsolidation of skills learned, discovery of new packages and approaches for handling geographic data, and application of the methods to new datasets and domains are suggested future directions.\nThis section expands on this general advice by suggesting specific ‘next steps,’ highlighted in bold below.\nIn addition to learning about further geographic methods and applications with R, for example with reference to the work cited in the previous section, deepening your understanding of R itself is a logical next step.\nR’s fundamental classes such as data.frame and matrix are the foundation of sf and raster classes, so studying them will improve your understanding of geographic data.\nThis can be done with reference to documents that are part of R, which can be found with the command help.start(), and additional resources on the subject such as Wickham (2019) and Chambers (2016).\nAnother software-related direction for future learning is discovering geocomputation with other languages.\nThere are good reasons for learning R as a language for geocomputation, as described in Chapter 1, but it is not the only option.87\nIt would be possible to study Geocomputation with:
Python, C++, JavaScript, Scala or Rust in equal depth.\nEach has evolving geospatial capabilities.\nrasterio, for example, is a Python package that could supplement/replace the raster package used in this book — see Garrard (2016) and online tutorials such as automating-gis-processes for more on the Python ecosystem.\nDozens of geospatial libraries have been developed in C++, including well known libraries such as GDAL and GEOS, and less well known libraries such as the Orfeo Toolbox for processing remote sensing (raster) data.\nTurf.js is an example of the potential for geocomputation with JavaScript.\nGeoTrellis provides functions for working with raster and vector data in the Java-based language Scala.\nAnd WhiteBoxTools provides an example of a rapidly evolving command-line GIS implemented in Rust.\nEach of these packages/libraries/languages has advantages for geocomputation and there are many more to discover, as documented in the curated list of open source geospatial resources Awesome-Geospatial.\nThere is more to geocomputation than software, however.\nWe can recommend exploring and learning new research topics and methods from academic and theoretical perspectives.\nMany methods that have been written about have yet to be implemented.\nLearning about geographic methods and their potential applications can therefore be rewarding, even before writing any code.\nAn example of geographic methods that are increasingly implemented in R is sampling strategies for scientific applications.\nA next step in this case is to read up on relevant articles in the area such as Brus (2018), which is accompanied by reproducible code and tutorial content hosted at github.com/DickBrus/TutorialSampling4DSM.","code":""},{"path":"conclusion.html","id":"benefit","chapter":"15 Conclusion","heading":"15.5 The open source approach","text":"This is a technical book so it makes sense for the next steps, outlined in the previous section, to also be technical.\nHowever, there are wider issues worth considering in this final section, which returns to our definition of geocomputation.\nOne of the elements of the term introduced in Chapter 1 was that geographic methods should have a positive impact.\nOf course, how to define and measure ‘positive’ is a subjective, philosophical question, beyond the scope of this book.\nRegardless of your worldview, consideration of the impacts of geocomputational work is a useful exercise:\nthe potential for positive impacts can provide a powerful motivation for future learning and, conversely, new methods can open-up many possible fields of
application.\nThese considerations lead to the conclusion that geocomputation is part of a wider ‘open source approach.’\nSection 1.1 presented other terms that mean roughly the same thing as geocomputation, including geographic data science (GDS) and ‘GIScience.’\nBoth capture the essence of working with geographic data, but geocomputation has advantages: it concisely captures the ‘computational’ way of working with geographic data advocated in this book — implemented in code and therefore encouraging reproducibility — and builds on desirable ingredients of its early definition (Openshaw and Abrahart 2000):\nThe creative use of geographic data\nApplication to real-world problems\nBuilding ‘scientific’ tools\nReproducibility\nWe added the final ingredient: reproducibility was barely mentioned in early work on geocomputation, yet a strong case can be made that it is a vital component of the first two ingredients.\nReproducibility\nencourages creativity by shifting the focus away from the basics (which are readily available through shared code) and towards applications;\ndiscourages people from ‘reinventing the wheel’: there is no need to re-do what others have done if their methods can be used by others; and\nmakes research more conducive to real world applications, by enabling anyone in any sector to apply your methods in new areas.\nIf reproducibility is the defining asset of geocomputation (or command-line GIS) it is worth considering what makes it reproducible.\nThis brings us to the ‘open source approach,’ which has three main components:\nA command-line interface (CLI), encouraging scripts recording geographic work to be shared and reproduced\nOpen source software, which can be inspected and potentially improved by anyone in the world\nAn active developer community, which collaborates and self-organizes to build complementary and modular tools\nLike the term geocomputation, the open source approach is more than a technical entity.\nIt is a community composed of people interacting daily with shared aims: to produce high performance tools, free from commercial or legal restrictions, that are accessible for anyone to use.\nThe open source approach to working with geographic data has advantages that transcend the technicalities of how the software works, encouraging learning, collaboration and an efficient division of labor.\nThere are many ways to engage in this community, especially with the emergence of code hosting sites, such as GitHub, which encourage communication and collaboration.\nA good place to start is simply browsing the source code,
‘issues’ and ‘commits’ in a geographic package of interest.\nA quick glance at the r-spatial/sf GitHub repository, which hosts the code underlying the sf package, shows that 40+ people have contributed to the codebase and documentation.\nDozens more people have contributed by asking questions and by contributing to ‘upstream’ packages that sf uses.\nMore than 600 issues have been closed on its issue tracker, representing a huge amount of work to make sf faster, more stable and user-friendly.\nThis example, from just one package out of dozens, shows the scale of the intellectual operation underway to make R a highly effective and continuously evolving language for geocomputation.\nIt is instructive to watch the incessant development activity happen in public fora such as GitHub, but it is even more rewarding to become an active participant.\nThis is one of the greatest features of the open source approach: it encourages people to get involved.\nThis book itself is a result of the open source approach:\nit was motivated by the amazing developments in R’s geographic capabilities over the last two decades, but made practically possible by dialogue and code sharing on platforms for collaboration.\nWe hope that in addition to disseminating useful methods for working with geographic data, this book inspires you to take a more open source approach.\nWhether it’s raising a constructive issue alerting developers to problems in their package; making the work done by you and the organizations you work for open; or simply helping other people by passing on the knowledge you’ve learned, getting involved can be a rewarding experience.","code":""},{"path":"references.html","id":"references","chapter":"References","heading":"References","text":"","code":""}]
+[{"path":"index.html","id":"welcome","chapter":"Welcome","heading":"Welcome","text":"This is the online home of Geocomputation with R, a book on geographic data analysis, visualization and modeling.\nNote: the first edition of the book has been published by CRC Press in the R Series.\nYou can buy the book from CRC Press, or Amazon, and see the archived First Edition hosted on bookdown.org.\nInspired by bookdown and the Free and Open Source Software for Geospatial (FOSS4G) movement, the code and prose underlying this book are open source, ensuring that they are reproducible, accessible and modifiable (e.g. in
case you find an inevitable typo) for the benefit of people worldwide.\nThe online version of the book is hosted at geocompr.robinlovelace.net and kept up-to-date by GitHub Actions.\nIts current ‘build status’ is as follows:\nThis version of the book was built on GH Actions on 2021-12-14.\nThis work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.","code":""},{"path":"index.html","id":"how-to-contribute","chapter":"Welcome","heading":"How to contribute?","text":"bookdown makes editing a book as easy as editing a wiki, provided you have a GitHub account (sign-up at github.com).\nOnce logged-in to GitHub, click on the ‘Edit this page’ icon in the right panel of the book website.\nThis will take you to an editable version of the source R Markdown file that generated the page you’re on.\nTo raise an issue about the book’s content (e.g. code not running) or to make a feature request, check-out the issue tracker.\nMaintainers and contributors must follow this repository’s CODE OF CONDUCT.","code":""},{"path":"index.html","id":"reproducibility","chapter":"Welcome","heading":"Reproducibility","text":"The quickest way to reproduce the contents of the book if you’re new to geographic data in R may be in a web browser, thanks to Binder.\nClicking on the link below should open a new window containing RStudio Server in your web browser, enabling you to open chapter files and run the code chunks to test that the code is reproducible.\nIf you see something like the image below, congratulations, it’s worked and you can start exploring Geocomputation with R in a cloud-based environment (while being aware of the mybinder.org user guidelines):\nFIGURE 0.1: Screenshot of reproducible code contained in Geocomputation with R running in RStudio Server on a browser served by Binder\nTo reproduce the code in the book on your own computer, you need a recent version of R and up-to-date packages.\nThese can be installed using the remotes package.\nAfter installing the book’s dependencies, you should be able to reproduce the code chunks in the book’s chapters.\nIf you clone the book’s repo and navigate into the geocompr folder, you should be able to reproduce the contents with the following command:\nSee the project’s GitHub repo for details on reproducing the book.","code":"\ninstall.packages(\"remotes\")\nremotes::install_github(\"geocompr/geocompkg\")\nremotes::install_github(\"nowosad/spData\")\nremotes::install_github(\"nowosad/spDataLarge\")\n\n# During development work on the 2nd edition you may also need dev versions
of\n# other packages to build the book, e.g.,:\nremotes::install_github(\"rspatial/terra\")\nremotes::install_github(\"mtennekes/tmap\")\nbookdown::serve_book()"},{"path":"index.html","id":"supporting-the-project","chapter":"Welcome","heading":"Supporting the project","text":"If you find the book useful, please support it by:\nTelling people about it in person\nCommunicating about the book in digital media, e.g., via the #geocompr hashtag on Twitter (see our Guestbook at geocompr.github.io) or by letting us know of courses using the book\nCiting or linking-to it\n‘Starring’ the geocompr GitHub repository\nReviewing it, e.g., on Amazon or Goodreads\nAsking questions about or making suggestions on the content via GitHub or Twitter.\nBuying a copy\nFurther details can be found at github.com/Robinlovelace/geocompr.\nThe globe icon used in this book was created by Jean-Marc Viglino and is licensed under CC-4.0 International.","code":""},{"path":"foreword-1st-edition.html","id":"foreword-1st-edition","chapter":"Foreword (1st Edition)","heading":"Foreword (1st Edition)","text":"Doing ‘spatial’ in R has always been about being broad, seeking to provide and integrate tools from geography, geoinformatics, geocomputation and spatial statistics for anyone interested in joining in: joining in asking interesting questions, contributing fruitful research questions, and writing and improving code.\nAs such, ‘spatial’ in R has always included open source code, open data and reproducibility.\nDoing ‘spatial’ in R has also sought to be open to interaction with many branches of applied spatial data analysis, and also to implement new advances in data representation and methods of analysis to expose them to cross-disciplinary scrutiny.\nAs this book demonstrates, there are often alternative workflows from similar data to similar results, and we may learn from comparisons with how others create and understand their workflows.\nThis includes learning from similar communities around Open Source GIS and complementary languages such as Python and Java.\nR’s wide range of spatial capabilities would never have evolved without people willing to share what they were creating or adapting.\nThis might include teaching materials, software, research practices (reproducible research, open data), and combinations of these.\nR users have also benefitted greatly from ‘upstream’ open source geo libraries such as GDAL, GEOS and PROJ.\nThis book is a clear example that, if you are curious
and willing to join in, you can find things that need doing and that match your aptitudes.\nWith advances in data representation and workflow alternatives, and ever increasing numbers of new users often without applied quantitative command-line exposure, a book of this kind is really needed.\nDespite the effort involved, the authors have supported each other in pressing forward to publication.\nSo, this fresh book is ready to go; its authors have tried it out on many tutorials and workshops, so readers and instructors will be able to benefit from knowing that its contents have been, and continue to be, tried out on people like them.\nEngage with the authors and the wider R-spatial community, see value in having choice in building your workflows and, most important, enjoy applying what you learn here to things you care about.\nRoger Bivand\nBergen, September 2018","code":""},{"path":"preface.html","id":"preface","chapter":"Preface","heading":"Preface","text":"","code":""},{"path":"preface.html","id":"who-this-book-is-for","chapter":"Preface","heading":"Who this book is for","text":"This book is for people who want to analyze, visualize and model geographic data with open source software.\nIt is based on R, a statistical programming language that has powerful data processing, visualization and geospatial capabilities.\nThe book covers a wide range of topics and will be of interest to a wide range of people from many different backgrounds, especially:\nPeople who have learned spatial analysis skills using a desktop Geographic Information System (GIS), such as QGIS, ArcGIS, GRASS or SAGA, who want access to a powerful (geo)statistical and visualization programming language and the benefits of a command-line approach (Sherman 2008):\nWith the advent of ‘modern’ GIS software, most people want to point and click their way through life. That’s good, but there is a tremendous amount of flexibility and power waiting for you with the command line.
Graduate students and researchers from fields specializing in geographic data including Geography, Remote Sensing, Planning, GIS and Geographic Data Science\nAcademics and post-graduate students working with geographic data — in fields such as Geology, Regional Science, Biology and Ecology, Agricultural Sciences, Archaeology, Epidemiology, Transport Modeling, and broadly defined Data Science — who require the power and flexibility of R for their research\nApplied researchers and analysts in public, private or third-sector organizations who need the reproducibility, speed and flexibility of a command-line language such as R in applications dealing with spatial data as diverse as Urban and Transport Planning, Logistics, Geo-marketing (store location analysis) and Emergency Planning\nThe book is designed for intermediate-to-advanced R users interested in geocomputation and R beginners who have prior experience with geographic data.\nIf you are new to both R and geographic data, do not be discouraged: we provide links to further materials and describe the nature of spatial data from a beginner’s perspective in Chapter 2 and in links provided below.","code":""},{"path":"preface.html","id":"how-to-read-this-book","chapter":"Preface","heading":"How to read this book","text":"The book is divided into three parts:\nPart I: Foundations, aimed at getting you up-to-speed with geographic data in R.\nPart II: Extensions, which covers advanced techniques.\nPart III: Applications, to real-world problems.\nThe chapters get progressively harder so we recommend
reading the book in order.\nA major barrier to geographical analysis in R is its steep learning curve.\nThe chapters in Part I aim to address this by providing reproducible code on simple datasets that should ease the process of getting started.\nAn important aspect of the book from a teaching/learning perspective is the exercises at the end of each chapter.\nCompleting these will develop your skills and equip you with the confidence needed to tackle a range of geospatial problems.\nSolutions to the exercises, and a number of extended examples, are provided on the book’s supporting website, at geocompr.github.io.\nImpatient readers are welcome to dive straight into the practical examples, starting in Chapter 2.\nHowever, we recommend reading about the wider context of Geocomputation with R in Chapter 1 first.\nIf you are new to R, we also recommend learning about the language before attempting to run the code chunks provided in each chapter (unless you’re reading the book for an understanding of the concepts).\nFortunately for R beginners, R has a supportive community that has developed a wealth of resources that can help.\nWe particularly recommend three tutorials: R for Data Science (Grolemund and Wickham 2016) and Efficient R Programming (Gillespie and Lovelace 2016), especially Chapter 2 (on installing and setting-up R/RStudio) and Chapter 10 (on learning to learn), and An introduction to R (R Core Team, Smith, and Team 2021).","code":""},{"path":"preface.html","id":"why-r","chapter":"Preface","heading":"Why R?","text":"Although R has a steep learning curve, the command-line approach advocated in this book can quickly pay off.\nAs you’ll learn in subsequent chapters, R is an effective tool for tackling a wide range of geographic data challenges.\nWe expect that, with practice, R will become the program of choice in your geospatial toolbox for many applications.\nTyping and executing commands at the command-line is, in many cases, faster than pointing-and-clicking around the graphical user interface (GUI) of a desktop GIS.\nFor some applications such as Spatial Statistics and modeling, R may be the only realistic way to get your work done.\nAs outlined in Section 1.2, there are many reasons for using R for geocomputation:\nR is well-suited to the interactive use required in many geographic data analysis workflows compared with other languages.\nR excels in the rapidly growing fields of Data Science (which includes data carpentry, statistical learning techniques and data visualization) and Big Data (via efficient interfaces to databases and distributed computing
systems).\nFurthermore R enables reproducible workflow: sharing scripts underlying analysis allow others build-work.\nensure reproducibility book made source code available github.com/Robinlovelace/geocompr.\nfind script files code/ folder generate figures:\ncode generating figure provided main text book, name script file generated provided caption (see example caption Figure 12.2).languages Python, Java C++ can used geocomputation excellent resources learning geocomputation without R, discussed Section 1.3.\nNone provide unique combination package ecosystem, statistical capabilities, visualization options, powerful IDEs offered R community.\nFurthermore, teaching use one language (R) depth, book equip concepts confidence needed geocomputation languages.","code":""},{"path":"preface.html","id":"real-world-impact","chapter":"Preface","heading":"Real-world impact","text":"Geocomputation R equip knowledge skills tackle wide range issues, including scientific, societal environmental implications, manifested geographic data.\ndescribed Section 1.1, geocomputation using computers process geographic data:\nalso real-world impact.\ninterested wider context motivations behind book, read ; covered Chapter 1.","code":""},{"path":"preface.html","id":"acknowledgements","chapter":"Preface","heading":"Acknowledgements","text":"Many thanks everyone contributed directly indirectly via code hosting collaboration site GitHub, including following people contributed direct via pull requests: prosoitos, florisvdh, katygregg, rsbivand, KiranmayiV, zmbc, erstearns, MikeJohnPage, eyesofbambi, nickbearman, tyluRp, marcosci, giocomai, KHwong12, LaurieLBaker, MarHer90, mdsumner, pat-s, gisma, ateucher, annakrystalli, DarrellCarvalho, kant, gavinsimpson, Himanshuteli, yutannihilation, jbixon13, olyerickson, yvkschaefer, katiejolly, layik, mpaulacaldas, mtennekes, mvl22, ganes1410, richfitz, wdearden, yihui, chihinl, cshancock, gregor-d, jasongrahn, p-kono, pokyah, schuetzingit, sdesabbata, 
tim-salabim, tszberkowitz.\nSpecial thanks Marco Sciaini, created front cover image, also published code generated (see code/frontcover.R book’s GitHub repo).\nDozens people contributed online, raising commenting issues, providing feedback via social media.\n#geocompr hashtag live !like thank John Kimmel CRC Press, worked us two years take ideas early book plan production via four rounds peer review.\nreviewers deserve special mention : detailed feedback expertise substantially improved book’s structure content.thank Patrick Schratz Alexander Brenning University Jena fruitful discussions input Chapters 11 14.\nthank Emmanuel Blondel Food Agriculture Organization United Nations expert input section web services;\nMichael Sumner critical input many areas book, especially discussion algorithms Chapter 10;\nTim Appelhans David Cooley key contributions visualization chapter (Chapter 8);\nKaty Gregg, proofread every chapter greatly improved readability book.Countless others mentioned contributed myriad ways.\nfinal thank software developers make geocomputation R possible.\nEdzer Pebesma (created sf package), Robert Hijmans (created raster) Roger Bivand (laid foundations much R-spatial software) made high performance geographic computing possible R.","code":""},{"path":"intro.html","id":"intro","chapter":"1 Introduction","heading":"1 Introduction","text":"book using power computers things geographic data.\nteaches range spatial skills, including: reading, writing manipulating geographic data; making static interactive maps; applying geocomputation solve real-world problems; modeling geographic phenomena.\ndemonstrating various geographic operations can linked, reproducible ‘code chunks’ intersperse prose, book also teaches transparent thus scientific workflow.\nLearning use wealth geospatial tools available R command line can exciting, creating new ones can truly liberating.\nUsing command-line driven approach taught throughout, programming techniques covered Chapter 10, 
can help remove constraints creativity imposed software.\nreading book completing exercises, therefore feel empowered strong understanding possibilities opened R’s impressive geographic capabilities, new skills solve real-world problems geographic data, ability communicate work maps reproducible code.last decades free open source software geospatial (FOSS4G) progressed astonishing rate.\nThanks organizations OSGeo, geographic data analysis longer preserve expensive hardware software: anyone can now download run high-performance spatial libraries.\nOpen source Geographic Information Systems (GIS), QGIS, made geographic analysis accessible worldwide.\nGIS programs tend emphasize graphical user interfaces (GUIs), unintended consequence discouraging reproducibility (although many can used command line ’ll see Chapter 9).\nR, contrast, emphasizes command line interface (CLI).\nsimplistic comparison different approaches illustrated Table 1.1.TABLE 1.1: Differences emphasis software packages (Graphical User Interface (GUI) Geographic Information Systems (GIS) R).book motivated importance reproducibility scientific research (see note ).\naims make reproducible geographic data analysis workflows accessible, demonstrate power open geospatial software available command-line.\n“Interfaces software part R” (Eddelbuettel Balamuta 2018).\nmeans addition outstanding ‘house’ capabilities, R allows access many spatial software libraries, explained Section 1.2 demonstrated Chapter 9.\ngoing details software, however, worth taking step back thinking mean geocomputation.Reproducibility major advantage command-line interfaces, mean practice?\ndefine follows: “process results can generated others using publicly accessible code.”","code":""},{"path":"intro.html","id":"what-is-geocomputation","chapter":"1 Introduction","heading":"1.1 What is geocomputation?","text":"Geocomputation young term, dating back first conference subject 1996.1\ndistinguished geocomputation (time) commonly used 
term ‘quantitative geography,’ early advocates proposed, emphasis “creative experimental” applications (P. . Longley et al. 1998) development new tools methods (Openshaw Abrahart 2000):\n“GeoComputation using various different types geodata developing relevant geo-tools within overall context ‘scientific’ approach.”\nbook aims go beyond teaching methods code; end able use geocomputational skills, “practical work beneficial useful” (Openshaw Abrahart 2000).approach differs early adopters Stan Openshaw, however, emphasis reproducibility collaboration.\nturn 21st Century, unrealistic expect readers able reproduce code examples, due barriers preventing access necessary hardware, software data.\nFast-forward two decades things progressed rapidly.\nAnyone access laptop ~4GB RAM can realistically expect able install run software geocomputation publicly accessible datasets, widely available ever (see Chapter 7).2\nUnlike early works field, work presented book reproducible using code example data supplied alongside book, R packages spData, installation covered Chapter 2.Geocomputation closely related terms including: Geographic Information Science (GIScience); Geomatics; Geoinformatics; Spatial Information Science; Geoinformation Engineering (P. 
Longley 2015); Geographic Data Science (GDS).\nterm shares emphasis ‘scientific’ (implying reproducible falsifiable) approach influenced GIS, although origins main fields application differ.\nGDS, example, emphasizes ‘data science’ skills large datasets, Geoinformatics tends focus data structures.\noverlaps terms larger differences use geocomputation rough synonym encapsulating :\nseek use geographic data applied scientific work.\nUnlike early users term, however, seek imply cohesive academic field called ‘Geocomputation’ (‘GeoComputation’ Stan Openshaw called ).\nInstead, define term follows: working geographic data computational way, focusing code, reproducibility modularity.Geocomputation recent term influenced old ideas.\ncan seen part Geography, 2000+ year history (Talbert 2014);\nextension Geographic Information Systems (GIS) (Neteler Mitasova 2008), emerged 1960s (Coppock Rhind 1991).Geography played important role explaining influencing humanity’s relationship natural world long invention computer, however.\nAlexander von Humboldt’s travels South America early 1800s illustrates role:\nresulting observations lay foundations traditions physical plant geography, also paved way towards policies protect natural world (Wulf 2015).\nbook aims contribute ‘Geographic Tradition’ (Livingstone 1992) harnessing power modern computers open source software.book’s links older disciplines reflected suggested titles book: Geography R R GIS.\nadvantages.\nformer conveys message comprises much just spatial data:\nnon-spatial attribute data inevitably interwoven geometry data, Geography something map.\nlatter communicates book using R GIS, perform spatial operations geographic data (R. 
Bivand, Pebesma, Gómez-Rubio 2013).\nHowever, term GIS conveys connotations (see Table 1.1) simply fail communicate one R’s greatest strengths:\nconsole-based ability seamlessly switch geographic non-geographic data processing, modeling visualization tasks.\ncontrast, term geocomputation implies reproducible creative programming.\ncourse, (geocomputational) algorithms powerful tools can become highly complex.\nHowever, algorithms composed smaller parts.\nteaching foundations underlying structure, aim empower create innovative solutions geographic data problems.","code":""},{"path":"intro.html","id":"why-use-r-for-geocomputation","chapter":"1 Introduction","heading":"1.2 Why use R for geocomputation?","text":"Early geographers used variety tools including barometers, compasses sextants advance knowledge world (Wulf 2015).\ninvention marine chronometer 1761 became possible calculate longitude sea, enabling ships take direct routes.Nowadays lack geographic data hard imagine.\nEvery smartphone global positioning (GPS) receiver multitude sensors devices ranging satellites semi-autonomous vehicles citizen scientists incessantly measure every part world.\nrate data produced overwhelming.\nautonomous vehicle, example, can generate 100 GB data per day (Economist 2016).\nRemote sensing data satellites become large analyze corresponding data single computer, leading initiatives OpenEO.‘geodata revolution’ drives demand high performance computer hardware efficient, scalable software handle extract signal noise, understand perhaps change world.\nSpatial databases enable storage generation manageable subsets vast geographic data stores, making interfaces gaining knowledge vital tools future.\nR one tool, advanced analysis, modeling visualization capabilities.\ncontext focus book language (see Wickham 2019).\nInstead use R ‘tool trade’ understanding world, similar Humboldt’s use tools gain deep understanding nature complexity interconnections (see Wulf 2015).\nAlthough 
programming can seem like reductionist activity, aim teach geocomputation R fun, understanding world.R multi-platform, open source language environment statistical computing graphics (r-project.org/).\nwide range packages, R also supports advanced geospatial statistics, modeling visualization.\n\nNew integrated development environments (IDEs) RStudio made R user-friendly many, easing map making panel dedicated interactive visualization.core, R object-oriented, functional programming language (Wickham 2019), specifically designed interactive interface software (Chambers 2016).\nlatter also includes many ‘bridges’ treasure trove GIS software, ‘geolibraries’ functions (see Chapter 9).\nthus ideal quickly creating ‘geo-tools,’ without needing master lower level languages (compared R) C, FORTRAN Java (see Section 1.3).\n\ncan feel like breaking free metaphorical ‘glass ceiling’ imposed GUI-based proprietary geographic information systems (see Table 1.1 definition GUI).\nFurthermore, R facilitates access languages:\npackages Rcpp reticulate enable access C++ Python code, example.\nmeans R can used ‘bridge’ wide range geospatial programs (see Section 1.3).Another example showing R’s flexibility evolving geographic capabilities interactive map making.\n’ll see Chapter 8, statement R “limited interactive [plotting] facilities” (R. Bivand, Pebesma, Gómez-Rubio 2013) longer true.\ndemonstrated following code chunk, creates Figure 1.1 (functions generate plot covered Section 8.4).\nFIGURE 1.1: blue markers indicate authors . basemap tiled image Earth night provided NASA. 
Interact online version geocompr.robinlovelace.net, example zooming clicking popups.\ndifficult produce Figure 1.1 using R years ago, let alone interactive map.\nillustrates R’s flexibility , thanks developments knitr leaflet, can used interface software, theme recur throughout book.\nuse R code, therefore, enables teaching geocomputation reference reproducible examples provided Figure 1.1 rather abstract concepts.","code":"\nlibrary(leaflet)\npopup = c(\"Robin\", \"Jakub\", \"Jannes\")\nleaflet() %>%\n addProviderTiles(\"NASAGIBS.ViirsEarthAtNight2012\") %>%\n addMarkers(lng = c(-3, 23, 11),\n lat = c(52, 53, 49), \n popup = popup)"},{"path":"intro.html","id":"software-for-geocomputation","chapter":"1 Introduction","heading":"1.3 Software for geocomputation","text":"R powerful language geocomputation many options geographic data analysis providing thousands geographic functions.\nAwareness languages geocomputation help decide different tool may appropriate specific task, place R wider geospatial ecosystem.\nsection briefly introduces languages C++, Java Python geocomputation, preparation Chapter 9.important feature R (Python) interpreted language.\nadvantageous enables interactive programming Read–Eval–Print Loop (REPL):\ncode entered console immediately executed result printed, rather waiting intermediate stage compilation.\nhand, compiled languages C++ Java tend run faster (compiled).C++ provides basis many GIS packages QGIS, GRASS SAGA sensible starting point.\nWell-written C++ fast, making good choice performance-critical applications processing large geographic datasets, harder learn Python R.\nC++ become accessible Rcpp package, provides good ‘way ’ C programming R users.\nProficiency low-level languages opens possibility creating new, high-performance ‘geoalgorithms’ better understanding GIS software works (see Chapter 10).Java another important versatile language geocomputation.\nGIS packages gvSig, OpenJump uDig written Java.\nmany GIS libraries written 
Java, including GeoTools JTS, Java Topology Suite (GEOS C++ port JTS).\nFurthermore, many map server applications use Java including Geoserver/Geonode, deegree 52°North WPS.Java’s object-oriented syntax similar C++.\nmajor advantage Java platform-independent (unusual compiled language) highly scalable, making suitable language IDEs RStudio, book written.\nJava fewer tools statistical modeling visualization Python R, although can used data science (Brzustowicz 2017).Python important language geocomputation especially many Desktop GIS GRASS, SAGA QGIS provide Python API (see Chapter 9).\nLike R, popular tool data science.\nlanguages object-oriented, many areas overlap, leading initiatives reticulate package facilitates access Python R Ursa Labs initiative support portable libraries benefit entire open source data science ecosystem.practice R Python strengths extent use less important domain application communication results.\nLearning either provide head-start learning .\nHowever, major advantages R Python geocomputation.\nincludes much better support geographic data models vector raster language (see Chapter 2) corresponding visualization possibilities (see Chapters 2 8).\nEqually important, R unparalleled support statistics, including spatial statistics, hundreds packages (unmatched Python) supporting thousands statistical methods.major advantage Python general-purpose programming language.\nused many domains, including desktop software, computer games, websites data science.\nPython often shared language different (geocomputation) communities can seen ‘glue’ holds many GIS programs together.\nMany geoalgorithms, including QGIS ArcMap, can accessed Python command line, making well-suited starter language command-line GIS.3For spatial statistics predictive modeling, however, R second--none.\nmean must choose either R Python: Python supports common statistical techniques (though R tends support new developments spatial statistics earlier) many concepts learned Python 
can applied R world.\n\n\nLike R, Python also supports geographic data analysis manipulation packages osgeo, Shapely, NumPy PyGeoProcessing (Garrard 2016).","code":""},{"path":"intro.html","id":"r-ecosystem","chapter":"1 Introduction","heading":"1.4 R’s spatial ecosystem","text":"many ways handle geographic data R, dozens packages area.4\nbook endeavor teach state---art field whilst ensuring methods future-proof.\nLike many areas software development, R’s spatial ecosystem rapidly evolving (Figure 1.2).\nR open source, developments can easily build previous work, ‘standing shoulders giants,’ Isaac Newton put 1675.\napproach advantageous encourages collaboration avoids ‘reinventing wheel.’\npackage sf (covered Chapter 2), example, builds predecessor sp.surge development time (interest) ‘R-spatial’ followed award grant R Consortium development support Simple Features, open-source standard model store access vector geometries.\nresulted sf package (covered Section 2.2.1).\nMultiple places reflect immense interest sf.\nespecially true R-sig-Geo Archives, long-standing open access email list containing much R-spatial wisdom accumulated years.\nFIGURE 1.2: Downloads selected R packages working geographic data. 
y-axis shows average number downloads per day, within 91-day rolling window.\nnoteworthy shifts wider R community, exemplified data processing package dplyr (released 2014) influenced shifts R’s spatial ecosystem.\nAlongside packages shared style emphasis ‘tidy data’ (including, e.g., ggplot2), dplyr placed tidyverse ‘metapackage’ late 2016.\n\n\ntidyverse approach, focus long-form data fast intuitively named functions, become immensely popular.\nled demand ‘tidy geographic data’ partly met sf.\nobvious feature tidyverse tendency packages work harmony.\n\n\nequivalent geoverse, attempts harmonization packages hosted r-spatial organization growing number packages use sf (Table 1.2).TABLE 1.2: top 5 downloaded packages depend sf, terms average number downloads per day previous month. 2021-11-19 289 packages import sf.Parallel group developments relates rspatial set packages.5\nmain member terra package spatial raster handling (see Section 2.3.2).","code":""},{"path":"intro.html","id":"the-history-of-r-spatial","chapter":"1 Introduction","heading":"1.5 The history of R-spatial","text":"many benefits using recent spatial packages sf, also important aware history R’s spatial capabilities: many functions, use-cases teaching material contained older packages.\ncan still useful today, provided know look.\n\nR’s spatial capabilities originated early spatial packages S language (R. Bivand Gebhardt 2000).\n\n1990s saw development numerous S scripts handful packages spatial statistics.\nR packages arose 2000 R packages various spatial methods “point pattern analysis, geostatistics, exploratory spatial data analysis spatial econometrics,” according article presented GeoComputation 2000 (R. Bivand Neteler 2000).\n, notably spatial, sgeostat splancs still available CRAN (B. S. Rowlingson Diggle 1993; B. 
Rowlingson Diggle 2017; Venables Ripley 2002; Majure Gebhardt 2016).subsequent article R News (predecessor R Journal) contained overview spatial statistical software R time, much based previous code written S/S-PLUS (Ripley 2001).\noverview described packages spatial smoothing interpolation, including akima geoR (Akima Gebhardt 2016; Jr Diggle 2016), point pattern analysis, including splancs (B. Rowlingson Diggle 2017) spatstat (Baddeley, Rubak, Turner 2015).following R News issue (Volume 1/3) put spatial packages spotlight , detailed introduction splancs commentary future prospects regarding spatial statistics (R. Bivand 2001).\nAdditionally, issue introduced two packages testing spatial autocorrelation eventually became part spdep (R. Bivand 2017).\nNotably, commentary mentions need standardization spatial interfaces, efficient mechanisms exchanging data GIS, handling spatial metadata coordinate reference systems (CRS).maptools (written Nicholas Lewin-Koh; R. Bivand Lewin-Koh (2017)) another important package time.\nInitially maptools just contained wrapper around shapelib permitted reading ESRI Shapefiles geometry nested lists.\ncorresponding nowadays obsolete S3 class called “Map” stored list alongside attribute data frame.\nwork “Map” class representation nevertheless important since directly fed sp prior publication CRAN.2003 Roger Bivand published extended review spatial packages.\nproposed class system support “data objects offered GDAL”, including ‘fundamental’ point, line, polygon, raster types.\nFurthermore, suggested interfaces external libraries form basis modular R packages (R. Bivand 2003).\nlarge extent ideas realized packages rgdal sp.\nprovided foundation spatial data analysis R, described Applied Spatial Data Analysis R (ASDAR) (R. Bivand, Pebesma, Gómez-Rubio 2013), first published 2008.\nTen years later, R’s spatial capabilities evolved substantially still build ideas set-R. 
Bivand (2003):\ninterfaces GDAL PROJ, example, still power R’s high-performance geographic data /O CRS transformation capabilities (see Chapters 6 7, respectively).rgdal, released 2003, provided GDAL bindings R greatly enhanced ability import data previously unavailable geographic data formats.\ninitial release supported raster drivers subsequent enhancements provided support coordinate reference systems (via PROJ library), reprojections import vector file formats (see Chapter 7 file formats).\nMany additional capabilities developed Barry Rowlingson released rgdal codebase 2006 (see B. Rowlingson et al. 2003 R-help email list context).sp, released 2005, overcame R’s inability distinguish spatial non-spatial objects (E. J. Pebesma Bivand 2005).\nsp grew workshop Vienna 2003 hosted sourceforge migrating R-Forge.\nPrior 2005, geographic coordinates generally treated like number.\nsp changed classes generic methods supporting points, lines, polygons grids, attribute data.sp stores information bounding box, coordinate reference system attributes slots Spatial objects using S4 class system,\nenabling data operations work geographic data (see Section 2.2.2).\n, sp provides generic methods summary() plot() geographic data.\nfollowing decade, sp classes rapidly became popular geographic data R number packages depended increased around 20 2008 100 2013 (R. Bivand, Pebesma, Gómez-Rubio 2013).\n2018 almost 500 packages rely sp, making important part R ecosystem.\nProminent R packages using sp include: gstat, spatial spatio-temporal geostatistics; geosphere, spherical trigonometry; adehabitat used analysis habitat selection animals (E. Pebesma Graeler 2018; Calenge 2006; Hijmans 2016).rgdal sp solved many spatial issues, R still lacked ability geometric operations (see Chapter 5).\nColin Rundel addressed issue developing rgeos, R interface open-source geometry library (GEOS) Google Summer Code project 2010 (R. 
Bivand Rundel 2018).\nrgeos enabled GEOS manipulate sp objects, functions gIntersection().Another limitation sp — restricted support raster data — overcome raster, first released 2010 (Hijmans 2017).\nclass system functions support range raster operations outlined Section 2.3.\nkey feature raster ability work datasets large fit RAM (R’s interface PostGIS supports -disc operations vector geographic data).\nraster also supports map algebra (see Section 4.3.2).parallel developments class systems methods came support R interface dedicated GIS software.\nGRASS (R. S. Bivand 2000) follow-packages spgrass6 rgrass7 (GRASS GIS 6 7, respectively) prominent examples direction (R. Bivand 2016a, 2016b).\nexamples bridges R GIS include RSAGA (Brenning, Bangs, Becker 2018, first published 2008), RPyGeo (Brenning 2012a, first published 2008), RQGIS (Muenchow, Schratz, Brenning 2017, first published 2016), rqgisprocess (see Chapter 9).\n\nVisualization focus initially, bulk R-spatial development focused analysis geographic operations.\nsp provided methods map making using base lattice plotting system demand growing advanced map making capabilities.\nRgoogleMaps first released 2009, allowed overlay R spatial data top ‘basemap’ tiles online services Google Maps OpenStreetMap (Loecher Ropkins 2015).\n\nfollowed ggmap package added similar ‘basemap’ tiles capabilities ggplot2 (Kahle Wickham 2013).\nThough ggmap facilitated map-making ggplot2, utility limited need fortify spatial objects, means converting long data frames.\nworks well points computationally inefficient lines polygons, since coordinate (vertex) converted row, leading huge data frames represent complex geometries.\nAlthough geographic visualization tended focus vector data, raster visualization supported raster received boost release rasterVis, described book subject spatial temporal data visualization (Lamigueiro 2018).\n2018 map making R hot topic dedicated packages tmap, leaflet mapview supporting class system provided 
sf, focus next chapter (see Chapter 8 visualization).Since 2018, movement modernizing basic R packages related handling spatial data continued.\n\nterra – successor raster package aimed better performance straightforward user interface firstly released (see Section 2.3) 2020 (Hijmans 2021).\nmid-2021, significant change made sf package incorporating spherical geometry calculations.\nSince change, default, many spatial operations data geographic CRSs apply C++ s2geometry library’s spherical geometry algorithms, types operations data projected CRSs still using GEOS.\n\n\nNew ideas spatial data representations also developed period.\n\n\nincludes stars package, closely connected sf, handling raster vector data cubes (Pebesma 2021) lidR processing airborne LiDAR (Light Detection Ranging) point clouds (Roussel et al. 2020).modernization several reasons, including emergence new technologies standard, impacts spatial software development outside R environment (R. S. Bivand 2020).\nimportant external factor affecting spatial software, including R spatial packages, major updates, including many breaking changes PROJ library begun 2018.\nimportantly, changes forced replacement proj4string WKT2 representation storage coordinate reference systems coordinates operations (learn Section 2.4 Chapter 6).\nSince 2018, progress spatial visualization tools R related factors.\nFirstly, new types spatial plots developed, including rayshader package offering combination raytracing multiple hill-shading methods produce 2D 3D data visualizations (Morgan-Wall 2021).\n\nSecondly, ggplot2 gained new spatial capabilities, mostly thanks ggspatial package adds spatial visualization elements, including scale bars north arrows (Dunnington 2021)
gganimate enables smooth customizable spatial animations (Pedersen and Robinson 2020).\nThirdly, performance visualizing large spatial dataset improved.\nespecially relates automatic plotting downscaled rasters tmap possibility using high-performance interactive rendering platforms mapview package, \"leafgl\" \"mapdeck\".\nLastly, existing mapping tools rewritten minimize dependencies, improve user interface, allow easier creation extensions.\nincludes mapsf package (successor cartography) (Giraud 2021) version 4 tmap package, internal code revised.late 2021, planned retirement rgdal, rgeos maptools end 2023 announced R-sig-Geo mailing list Roger Bivand.\nlarge impact existing workflows applying packages, also influence packages depend rgdal, rgeos maptools.\nTherefore, Bivand’s suggestion plan transition modern tools, including sf terra, explained book’s next chapters.","code":""},{"path":"intro.html","id":"exercises","chapter":"1 Introduction","heading":"1.6 Exercises","text":"E1. Think terms ‘GIS’, ‘GDS’ ‘geocomputation’ described . () best describes work like using geo* methods software ?E2. Provide three reasons using scriptable language R geocomputation instead using graphical user interface (GUI) based GIS QGIS.E3. 
Name two advantages two disadvantages using mature vs recent packages geographic data analysis (example sp vs sf, raster vs terra).","code":""},{"path":"spatial-class.html","id":"spatial-class","chapter":"2 Geographic data in R","heading":"2 Geographic data in R","text":"","code":""},{"path":"spatial-class.html","id":"prerequisites","chapter":"2 Geographic data in R","heading":"Prerequisites","text":"first practical chapter book, therefore comes software requirements.\nassume --date version R installed comfortable using software command-line interface integrated development environment (IDE) RStudio.\nnew R, recommend reading Chapter 2 online book Efficient R Programming Gillespie Lovelace (2016) learning basics language reference resources Grolemund Wickham (2016).\nOrganize work (e.g., RStudio projects) give scripts sensible names 02-chapter.R document code write learn.\npackages used chapter can installed following commands:6All packages needed reproduce contents book can installed following command: remotes::install_github(\"geocompr/geocompkg\").\nnecessary packages can ‘loaded’ (technically attached) library() function follows:output library(sf) reports versions key geographic libraries GEOS package using, outlined Section 2.2.1.packages installed contain data used book:","code":"\ninstall.packages(\"sf\")\ninstall.packages(\"terra\")\ninstall.packages(\"spData\")\ninstall.packages(\"spDataLarge\", repos = \"https://nowosad.r-universe.dev\")\nlibrary(sf) # classes and functions for vector data\n#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1\nlibrary(terra) # classes and functions for raster data\nlibrary(spData) # load geographic data\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(spDataLarge) # load larger geographic data"},{"path":"spatial-class.html","id":"intro-spatial-class","chapter":"2 Geographic data in R","heading":"2.1 Introduction","text":"chapter provide brief explanations fundamental geographic data models: vector 
raster.\nintroduce theory behind data model disciplines predominate, demonstrating implementation R.vector data model represents world using points, lines polygons.\ndiscrete, well-defined borders, meaning vector datasets usually high level precision (necessarily accuracy see Section 2.5).\nraster data model divides surface cells constant size.\nRaster datasets basis background images used web-mapping vital source geographic data since origins aerial photography satellite-based remote sensing devices.\nRasters aggregate spatially specific features given resolution, meaning consistent space scalable (many worldwide raster datasets available).use?\nanswer likely depends domain application:Vector data tends dominate social sciences human settlements tend discrete bordersRaster dominates many environmental sciences reliance remote sensing dataThere much overlap fields raster vector datasets can used together:\necologists demographers, example, commonly use vector raster data.\nFurthermore, possible convert two forms (see Section 5.4).\nWhether work involves use vector raster datasets, worth understanding underlying data model using , discussed subsequent chapters.\nbook uses sf terra packages work vector data raster datasets, respectively.","code":""},{"path":"spatial-class.html","id":"vector-data","chapter":"2 Geographic data in R","heading":"2.2 Vector data","text":"geographic vector data model based points located within coordinate reference system (CRS).\nPoints can represent self-standing features (e.g., location bus stop) can linked together form complex geometries lines polygons.\npoint geometries contain two dimensions (3-dimensional CRSs contain additional \\(z\\) value, typically representing height sea level).system London, example, can represented coordinates c(-0.1, 51.5).\nmeans location -0.1 degrees east 51.5 degrees north origin.\norigin case 0 degrees longitude (Prime Meridian) 0 degree latitude (Equator) geographic (‘lon/lat’) CRS (Figure 2.1, left 
panel).\npoint also approximated projected CRS ‘Easting/Northing’ values c(530000, 180000) British National Grid, meaning London located 530 km East 180 km North \\(origin\\) CRS.\ncan verified visually: slightly 5 ‘boxes’ — square areas bounded gray grid lines 100 km width — separate point representing London origin (Figure 2.1, right panel).location National Grid’s origin, sea beyond South West Peninsular, ensures locations UK positive Easting Northing values.7\nCRSs, described Sections 2.4 6 , purposes section, sufficient know coordinates consist two numbers representing distance origin, usually \\(x\\) \\(y\\) dimensions.\nFIGURE 2.1: Illustration vector (point) data location London (red X) represented reference origin (blue circle). left plot represents geographic CRS origin 0° longitude latitude. right plot represents projected CRS origin located sea west South West Peninsula.\nsf package providing class system geographic vector data.\nsf supersede sp, also provides consistent command-line interface GEOS GDAL, superseding rgeos rgdal (described Section 1.5).\nsection introduces sf classes preparation subsequent chapters (Chapters 5 7 cover GEOS GDAL interface, respectively).","code":""},{"path":"spatial-class.html","id":"intro-sf","chapter":"2 Geographic data in R","heading":"2.2.1 An introduction to simple features","text":"Simple features open standard developed endorsed Open Geospatial Consortium (OGC), --profit organization whose activities revisit later chapter (Section 7.5).\n\nSimple Features hierarchical data model represents wide range geometry types.\n17 geometry types supported specification, 7 used vast majority geographic research (see Figure 2.2);\ncore geometry types fully supported R package sf (E. 
Pebesma 2018).8\nFIGURE 2.2: Simple feature types fully supported sf.\nsf can represent common vector geometry types (raster data classes supported sf): points, lines, polygons respective ‘multi’ versions (group together features type single feature).\n\n\nsf also supports geometry collections, can contain multiple geometry types single object.\nsf provides functionality () previously provided three packages — sp data classes (E. Pebesma Bivand 2018), rgdal data read/write via interface GDAL PROJ (R. Bivand, Keitt, Rowlingson 2018) rgeos spatial operations via interface GEOS (R. Bivand Rundel 2018).re-iterate message Chapter 1, geographic R packages long history interfacing lower level libraries, sf continues tradition unified interface recent versions GEOS geometry operations, GDAL library reading writing geographic data files, PROJ library representing transforming projected coordinate reference systems.\ns2,\n\n”\n\nR interface Google’s spherical geometry library s2, sf also access fast accurate “measurements operations non-planar geometries” (bivand_progress_2021?).\nSince sf version 1.0.0, launched June 2021, s2 functionality now used default geometries geographic (longitude/latitude) coordinate systems, unique feature sf differs spatial libraries support GEOS geometry operations Python package GeoPandas.\ndiscuss s2 subsequent chapters.\n\nsf’s ability integrate multiple powerful libraries geocomputation single framework notable achievement reduces ‘barriers entry’ world reproducible geographic data analysis high-performance libraries.\nsf’s functionality well documented website r-spatial.github.io/sf/ contains 7 vignettes.\ncan viewed offline follows:first vignette explains, simple feature objects R stored data frame, geographic data occupying special column, usually named ‘geom’ ‘geometry.’\nuse world dataset provided spData, loaded beginning chapter, show sf objects work.\nworld ‘sf data frame’ containing spatial attribute columns, names returned function 
names() (last column example contains geographic information):contents geom column give sf objects spatial powers: world$geom ‘list column’ contains coordinates country polygons.\n\nsf objects can plotted quickly base R function plot();\nfollowing command creates Figure 2.3.\nFIGURE 2.3: spatial plot world using sf package, facet attribute.\nNote instead creating single map default geographic objects, GIS programs , plot()ing sf objects results map variable datasets.\nbehavior can useful exploring spatial distribution different variables discussed Section 2.2.3.broadly, treating geographic objects regular data frames spatial powers many advantages, especially already used working data frames.\ncommonly used summary() function, example, provides useful overview variables within world object.Although selected one variable summary() command, also outputs report geometry.\ndemonstrates ‘sticky’ behavior geometry columns sf objects, meaning geometry kept unless user deliberately removes , ’ll see Section 3.2.\nresult provides quick summary non-spatial spatial data contained world: mean average life expectancy 71 years (ranging less 51 83 years median 73 years) across countries.worth taking deeper look basic behavior contents simple feature object, can usefully thought ‘spatial data frame.’sf objects easy subset.\ncode shows first two rows three columns.\noutput shows two major differences compared regular data.frame: inclusion additional geographic data (geometry type, dimension, bbox CRS information - epsg (SRID), proj4string), presence geometry column, named geom:may seem rather complex, especially class system supposed simple.\nHowever, good reasons organizing things way using sf.describing geometry type sf package supports, worth taking step back understand building blocks sf objects.\nSection 2.2.8 shows simple features objects data frames, special geometry columns.\nspatial columns often called geom geometry: world$geom refers spatial element world object 
described .\ngeometry columns ‘list columns’ class sfc (see Section 2.2.7).\nturn, sfc objects composed one objects class sfg: simple feature geometries describe Section 2.2.6.\n\nunderstand spatial components simple features work, vital understand simple feature geometries.\nreason cover currently supported simple features geometry type Section 2.2.5 moving describe can represented R using sfg objects, form basis sfc eventually full sf objects.","code":"\nvignette(package = \"sf\") # see which vignettes are available\nvignette(\"sf1\") # an introduction to the package\nclass(world)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\nnames(world)\n#> [1] \"iso_a2\" \"name_long\" \"continent\" \"region_un\" \"subregion\" \"type\" \n#> [7] \"area_km2\" \"pop\" \"lifeExp\" \"gdpPercap\" \"geom\"\nplot(world)\nsummary(world[\"lifeExp\"])\n#> lifeExp geom \n#> Min. :50.6 MULTIPOLYGON :177 \n#> 1st Qu.:65.0 epsg:4326 : 0 \n#> Median :72.9 +proj=long...: 0 \n#> Mean :70.9 \n#> 3rd Qu.:76.8 \n#> Max. 
:83.6 \n#> NA's :10\nworld_mini = world[1:2, 1:3]\nworld_mini\n#> Simple feature collection with 2 features and 3 fields\n#> Geometry type: MULTIPOLYGON\n#> Dimension: XY\n#> Bounding box: xmin: -180 ymin: -18.3 xmax: 180 ymax: -0.95\n#> Geodetic CRS: WGS 84\n#> # A tibble: 2 × 4\n#> iso_a2 name_long continent geom\n#> \n#> 1 FJ Fiji Oceania (((-180 -16.6, -180 -16.5, -180 -16, -180 -16.1, -…\n#> 2 TZ Tanzania Africa (((33.9 -0.95, 31.9 -1.03, 30.8 -1.01, 30.4 -1.13,…"},{"path":"spatial-class.html","id":"why-simple-features","chapter":"2 Geographic data in R","heading":"2.2.2 Why simple features?","text":"Simple features widely supported data model underlies data structures many GIS applications including QGIS PostGIS.\nmajor advantage using data model ensures work cross-transferable set-ups, example importing exporting spatial databases.\nspecific question R perspective “use sf package sp already tried tested?”\nmany reasons (linked advantages simple features model):Fast reading writing dataEnhanced plotting performancesf objects can treated data frames operationssf function names relatively consistent intuitive (begin st_)sf functions can combined using %>% operator works well tidyverse collection R packages.sf’s support tidyverse packages exemplified provision two functions reading data, st_read() read_sf() store attributes base R data.frame tidyverse tibble classes respectively, demonstrated (see Chapter 3 manipulating geographic data tidyverse functions Section 7.6.1 details reading writing geographic vector data R):sf now go-package analysis spatial vector data R (withstanding spatstat package ecosystem provides numerous functions spatial statistics).\nMany popular packages build sf, shown rise popularity terms number downloads per day, shown Section 1.4 previous chapter.\nHowever, take many years packages fully transition away older packages sp, many packages depend sf sp never switch (bivand_progress_2021?).context important note people still using sp 
(related rgeos rgdal) packages advised switch sf.\ndescription rgeos CRAN, example, states package “retired end 2023” advises people plan transition sf.\nwords, sf future proof sp .\nworkflows depend legacy class system, sf objects can converted Spatial class sp package follows:","code":"\nnc_dfr = st_read(system.file(\"shape/nc.shp\", package=\"sf\"))\n#> Reading layer `nc' from data source \n#> `/usr/local/lib/R/site-library/sf/shape/nc.shp' using driver `ESRI Shapefile'\n#> Simple feature collection with 100 features and 14 fields\n#> Geometry type: MULTIPOLYGON\n#> Dimension: XY\n#> Bounding box: xmin: -84.3 ymin: 33.9 xmax: -75.5 ymax: 36.6\n#> Geodetic CRS: NAD27\nnc_tbl = read_sf(system.file(\"shape/nc.shp\", package=\"sf\"))\nclass(nc_dfr)\n#> [1] \"sf\" \"data.frame\"\nclass(nc_tbl)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\nlibrary(sp)\nworld_sp = as(world, Class = \"Spatial\") # from an sf object to sp\n# sp functions ...\nworld_sf = st_as_sf(world_sp) # from sp to sf"},{"path":"spatial-class.html","id":"basic-map","chapter":"2 Geographic data in R","heading":"2.2.3 Basic map making","text":"Basic maps created sf plot().\ndefault creates multi-panel plot (like sp’s spplot()), one sub-plot variable object, illustrated left-hand panel Figure 2.4.\nlegend ‘key’ continuous color produced object plotted single variable (see right-hand panel).\nColors can also set col =, although create continuous palette legend.\n\nFIGURE 2.4: Plotting sf, multiple variables (left) single variable (right).\nPlots added layers existing images setting add = TRUE.9\ndemonstrate , provide taster content covered Chapters 3 4 attribute spatial data operations, subsequent code chunk combines countries Asia:can now plot Asian continent map world.\nNote first plot must one facet add = TRUE work.\nfirst plot key, reset = FALSE must used (result shown):Adding layers way can used verify geographic correspondence layers:\nplot() function fast execute requires lines code, create 
interactive maps wide range options.\nadvanced map making recommend using dedicated visualization packages tmap (see Chapter 8).","code":"\nplot(world[3:6])\nplot(world[\"pop\"])\nworld_asia = world[world$continent == \"Asia\", ]\nasia = st_union(world_asia)\nplot(world[\"pop\"], reset = FALSE)\nplot(asia, add = TRUE, col = \"red\")"},{"path":"spatial-class.html","id":"base-args","chapter":"2 Geographic data in R","heading":"2.2.4 Base plot arguments","text":"various ways modify maps sf’s plot() method.\nsf extends base R plotting methods plot()’s arguments main = (specifies title map) work sf objects (see ?graphics::plot ?par).10\n\nFigure 2.5 illustrates flexibility overlaying circles, whose diameters (set cex =) represent country populations, map world.\nunprojected version figure can created following commands (see exercises end chapter script 02-contplot.R reproduce Figure 2.5):\nFIGURE 2.5: Country continents (represented fill color) 2015 populations (represented circles, area proportional population).\ncode uses function st_centroid() convert one geometry type (polygons) another (points) (see Chapter 5), aesthetics varied cex argument.\nsf’s plot method also arguments specific geographic data. 
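The reset = FALSE / add = TRUE layering pattern works with any attribute and any subset. A minimal sketch of the same workflow, assuming the world dataset from spData is loaded as at the start of the chapter (here dissolving Africa rather than Asia):

```r
library(sf)
library(spData) # provides the 'world' dataset used in this chapter

# subset one continent and dissolve its countries into a single geometry
world_africa = world[world$continent == "Africa", ]
africa = st_union(world_africa)

# single-variable plot first (reset = FALSE keeps the plot open for layers),
# then overlay the dissolved continent with add = TRUE
plot(world["lifeExp"], reset = FALSE, main = "Life expectancy")
plot(africa, add = TRUE, col = "red")
```

As with the Asia example, the first plot must have only one facet for add = TRUE to work.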
expandBB, example, can used plot sf object context:\ntakes numeric vector length four expands bounding box plot relative zero following order: bottom, left, top, right.\nused plot India context giant Asian neighbors, emphasis China east, following code chunk, generates Figure 2.6 (see exercises adding text plots):\nFIGURE 2.6: India context, demonstrating expandBB argument.\nNote use [0] keep geometry column lwd emphasize India.\nSee Section 8.6 visualization techniques representing range geometry types, subject next section.","code":"\nplot(world[\"continent\"], reset = FALSE)\ncex = sqrt(world$pop) / 10000\nworld_cents = st_centroid(world, of_largest = TRUE)\nplot(st_geometry(world_cents), add = TRUE, cex = cex)\nindia = world[world$name_long == \"India\", ]\nplot(st_geometry(india), expandBB = c(0, 0.2, 0.1, 1), col = \"gray\", lwd = 3)\nplot(world_asia[0], add = TRUE)"},{"path":"spatial-class.html","id":"geometry","chapter":"2 Geographic data in R","heading":"2.2.5 Geometry types","text":"Geometries basic building blocks simple features.\nSimple features R can take one 17 geometry types supported sf package.\n\n\nchapter focus seven commonly used types: POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON GEOMETRYCOLLECTION.\nFind whole list possible feature types PostGIS manual.Generally, well-known binary (WKB) well-known text (WKT) standard encoding simple feature geometries.\n\n\n\nWKB representations usually hexadecimal strings easily readable computers.\nGIS spatial databases use WKB transfer store geometry objects.\nWKT, hand, human-readable text markup description simple features.\nformats exchangeable, present one, naturally choose WKT representation.basis geometry type point.\npoint simply coordinate 2D, 3D 4D space (see vignette(\"sf1\") information) (see left panel Figure 2.7):\nPOINT (5 2)\nlinestring sequence points straight line connecting points, example (see middle panel Figure 2.7):LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)polygon 
sequence points form closed, non-intersecting ring.\nClosed means first last point polygon coordinates (see right panel Figure 2.7).11\nPolygon without hole: POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))\nFIGURE 2.7: Illustration point, linestring polygon geometries.\nfar created geometries one geometric entity per feature.\nHowever, sf also allows multiple geometries exist within single feature (hence term ‘geometry collection’) using “multi” version geometry type:\nMultipoint: MULTIPOINT (5 2, 1 3, 3 4, 3 2)Multilinestring: MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))Multipolygon: MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5), (0 2, 1 2, 1 3, 0 3, 0 2)))\nFIGURE 2.8: Illustration multi* geometries.\nFinally, geometry collection can contain combination geometries including (multi)points linestrings (see Figure 2.9):\nGeometry collection: GEOMETRYCOLLECTION (MULTIPOINT (5 2, 1 3, 3 4, 3 2), LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))\nFIGURE 2.9: Illustration geometry collection.\n","code":""},{"path":"spatial-class.html","id":"sfg","chapter":"2 Geographic data in R","heading":"2.2.6 Simple feature geometries (sfg)","text":"sfg class represents different simple feature geometry types R: point, linestring, polygon (‘multi’ equivalents, multipoints) geometry collection.\nUsually spared tedious task creating geometries since can simply import already existing spatial file.\nHowever, set functions create simple feature geometry objects (sfg) scratch needed.\nnames functions simple consistent, start st_ prefix end name geometry type lowercase letters:point: st_point()linestring: st_linestring()polygon: st_polygon()multipoint: st_multipoint()multilinestring: st_multilinestring()multipolygon: st_multipolygon()geometry collection: st_geometrycollection()sfg objects can created three base R data types:numeric vector: single pointA matrix: set points, row represents point, multipoint linestringA list: collection objects matrices, multilinestrings geometry collectionsThe function 
st_point() creates single points numeric vectors:results show XY (2D coordinates), XYZ (3D coordinates) XYZM (3D additional variable, typically measurement accuracy) point types created vectors length 2, 3, 4, respectively.\nXYM type must specified using dim argument (short dimension).contrast, use matrices case multipoint (st_multipoint()) linestring (st_linestring()) objects:Finally, use lists creation multilinestrings, (multi-)polygons geometry collections:","code":"\nst_point(c(5, 2)) # XY point\n#> POINT (5 2)\nst_point(c(5, 2, 3)) # XYZ point\n#> POINT Z (5 2 3)\nst_point(c(5, 2, 1), dim = \"XYM\") # XYM point\n#> POINT M (5 2 1)\nst_point(c(5, 2, 3, 1)) # XYZM point\n#> POINT ZM (5 2 3 1)\n# the rbind function simplifies the creation of matrices\n## MULTIPOINT\nmultipoint_matrix = rbind(c(5, 2), c(1, 3), c(3, 4), c(3, 2))\nst_multipoint(multipoint_matrix)\n#> MULTIPOINT ((5 2), (1 3), (3 4), (3 2))\n## LINESTRING\nlinestring_matrix = rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2))\nst_linestring(linestring_matrix)\n#> LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)\n## POLYGON\npolygon_list = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))\nst_polygon(polygon_list)\n#> POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))\n## POLYGON with a hole\npolygon_border = rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))\npolygon_hole = rbind(c(2, 4), c(3, 4), c(3, 3), c(2, 3), c(2, 4))\npolygon_with_hole_list = list(polygon_border, polygon_hole)\nst_polygon(polygon_with_hole_list)\n#> POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5), (2 4, 3 4, 3 3, 2 3, 2 4))\n## MULTILINESTRING\nmultilinestring_list = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)), \n rbind(c(1, 2), c(2, 4)))\nst_multilinestring((multilinestring_list))\n#> MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))\n## MULTIPOLYGON\nmultipolygon_list = list(list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))),\n list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2))))\nst_multipolygon(multipolygon_list)\n#> 
MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5)), ((0 2, 1 2, 1 3, 0 3, 0 2)))\n## GEOMETRYCOLLECTION\ngemetrycollection_list = list(st_multipoint(multipoint_matrix),\n st_linestring(linestring_matrix))\nst_geometrycollection(gemetrycollection_list)\n#> GEOMETRYCOLLECTION (MULTIPOINT (5 2, 1 3, 3 4, 3 2),\n#> LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))"},{"path":"spatial-class.html","id":"sfc","chapter":"2 Geographic data in R","heading":"2.2.7 Simple feature columns (sfc)","text":"One sfg object contains single simple feature geometry.\nsimple feature geometry column (sfc) list sfg objects, additionally able contain information coordinate reference system use.\ninstance, combine two simple features one object two features, can use st_sfc() function.\n\nimportant since sfc represents geometry column sf data frames:cases, sfc object contains objects geometry type.\nTherefore, convert sfg objects type polygon simple feature geometry column, also end sfc object type polygon, can verified st_geometry_type().\nEqually, geometry column multilinestrings result sfc object type multilinestring:also possible create sfc object sfg objects different geometry types:mentioned , sfc objects can additionally store information coordinate reference systems (CRS).\ndefault value NA (Available), can verified st_crs():geometries sfc object must CRS.\ncan add coordinate reference system crs argument st_sfc().\nspecify certain CRS, can provide Spatial Reference System Identifier (SRID, e.g., \"EPSG:4326\"), well-known text (WKT2), proj4string representation (see Section 2.4).\nprovide SRID proj4string, well-known text (WKT2) added automatically.","code":"\n# sfc POINT\npoint1 = st_point(c(5, 2))\npoint2 = st_point(c(1, 3))\npoints_sfc = st_sfc(point1, point2)\npoints_sfc\n#> Geometry set for 2 features \n#> Geometry type: POINT\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 2 xmax: 5 ymax: 3\n#> CRS: NA\n#> POINT (5 2)\n#> POINT (1 3)\n# sfc POLYGON\npolygon_list1 = list(rbind(c(1, 5), c(2, 2), 
c(4, 1), c(4, 4), c(1, 5)))\npolygon1 = st_polygon(polygon_list1)\npolygon_list2 = list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2)))\npolygon2 = st_polygon(polygon_list2)\npolygon_sfc = st_sfc(polygon1, polygon2)\nst_geometry_type(polygon_sfc)\n#> [1] POLYGON POLYGON\n#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE\n# sfc MULTILINESTRING\nmultilinestring_list1 = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)), \n rbind(c(1, 2), c(2, 4)))\nmultilinestring1 = st_multilinestring((multilinestring_list1))\nmultilinestring_list2 = list(rbind(c(2, 9), c(7, 9), c(5, 6), c(4, 7), c(2, 7)), \n rbind(c(1, 7), c(3, 8)))\nmultilinestring2 = st_multilinestring((multilinestring_list2))\nmultilinestring_sfc = st_sfc(multilinestring1, multilinestring2)\nst_geometry_type(multilinestring_sfc)\n#> [1] MULTILINESTRING MULTILINESTRING\n#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE\n# sfc GEOMETRY\npoint_multilinestring_sfc = st_sfc(point1, multilinestring1)\nst_geometry_type(point_multilinestring_sfc)\n#> [1] POINT MULTILINESTRING\n#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... 
TRIANGLE\nst_crs(points_sfc)\n#> Coordinate Reference System: NA\n# EPSG definition\npoints_sfc_wgs = st_sfc(point1, point2, crs = \"EPSG:4326\")\nst_crs(points_sfc_wgs)\n#> Coordinate Reference System:\n#> User input: EPSG:4326 \n#> wkt:\n#> GEOGCRS[\"WGS 84\",\n#> DATUM[\"World Geodetic System 1984\",\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]]],\n#> PRIMEM[\"Greenwich\",0,\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> CS[ellipsoidal,2],\n#> AXIS[\"geodetic latitude (Lat)\",north,\n#> ORDER[1],\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> AXIS[\"geodetic longitude (Lon)\",east,\n#> ORDER[2],\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> USAGE[\n#> SCOPE[\"unknown\"],\n#> AREA[\"World\"],\n#> BBOX[-90,-180,90,180]],\n#> ID[\"EPSG\",4326]]"},{"path":"spatial-class.html","id":"sf","chapter":"2 Geographic data in R","heading":"2.2.8 The sf class","text":"Sections 2.2.5 2.2.7 deal purely geometric objects, ‘sf geometry’ ‘sf column’ objects, respectively.\ngeographic building blocks geographic vector data represented simple features.\nfinal building block non-geographic attributes, representing name feature attributes measured values, groups, things.\nillustrate attributes, represent temperature 25°C London June 21st, 2017.\nexample contains geometry (coordinates), three attributes three different classes (place name, temperature date).12\nObjects class sf represent data combining attributes (data.frame) simple feature geometry column (sfc).\ncreated st_sf() illustrated , creates London example described :just happened? 
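The full sfg → sfc → sf chain described in this section can be sketched with made-up points (coordinates reused from earlier examples); st_drop_geometry() is shown as the explicit way back to a plain data frame:

```r
library(sf)

# sfg objects: raw geometries, with no CRS or attributes
p1 = st_point(c(5, 2))
p2 = st_point(c(1, 3))

# sfc object: a geometry column, which can carry a CRS
geom = st_sfc(p1, p2, crs = "EPSG:4326")

# sf object: a data.frame of attributes plus the geometry column
pts_sf = st_sf(data.frame(id = 1:2, name = c("a", "b")), geometry = geom)

nrow(pts_sf)               # behaves like a data frame
pts_sf[pts_sf$id == 1, ]   # subsetting keeps the 'sticky' geometry column
st_drop_geometry(pts_sf)   # returns a plain data.frame without geometry
```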
First, coordinates used create simple feature geometry (sfg).\nSecond, geometry converted simple feature geometry column (sfc), CRS.\nThird, attributes stored data.frame, combined sfc object st_sf().\nresults sf object, demonstrated (output omitted):result shows sf objects actually two classes, sf data.frame.\nSimple features simply data frames (square tables), spatial attributes stored list column, usually called geometry, described Section 2.2.1.\nduality central concept simple features:\ntime sf can treated behaves like data.frame.\nSimple features , essence, data frames spatial extension.","code":"\nlnd_point = st_point(c(0.1, 51.5)) # sfg object\nlnd_geom = st_sfc(lnd_point, crs = 4326) # sfc object\nlnd_attrib = data.frame( # data.frame object\n name = \"London\",\n temperature = 25,\n date = as.Date(\"2017-06-21\")\n )\nlnd_sf = st_sf(lnd_attrib, geometry = lnd_geom) # sf object\nlnd_sf\n#> Simple feature collection with 1 features and 3 fields\n#> ...\n#> name temperature date geometry\n#> 1 London 25 2017-06-21 POINT (0.1 51.5)\nclass(lnd_sf)\n#> [1] \"sf\" \"data.frame\""},{"path":"spatial-class.html","id":"raster-data","chapter":"2 Geographic data in R","heading":"2.3 Raster data","text":"spatial raster data model represents world continuous grid cells (often also called pixels; Figure 2.10:).\ndata model often refers -called regular grids, cell , constant size – focus regular grids book .\nHowever, several types grids exist, including rotated, sheared, rectilinear, curvilinear grids (see Chapter 1 E. 
Pebesma and Bivand (2022) and Chapter 2 of Tennekes and Nowosad (2022)).The raster data model usually consists of a raster header\nand a matrix (with rows and columns) representing equally spaced cells (often also called pixels; Figure 2.10:A).13\nThe raster header defines the coordinate reference system, the extent and the origin.\nThe origin (or starting point) is frequently the coordinate of the lower-left corner of the matrix (the terra package, however, uses the upper left corner, by default (Figure 2.10:B)).\nThe header defines the extent via the number of columns, the number of rows and the cell size resolution.\nHence, starting from the origin, we can easily access and modify each single cell either by using the ID of a cell (Figure 2.10:B) or by explicitly specifying the rows and columns.\nThis matrix representation avoids storing explicitly the coordinates of the four corner points (in fact it only stores one coordinate, namely the origin) of each cell, as would be the case for rectangular vector polygons.\nThis and map algebra (Section 4.3.2) makes raster processing much more efficient and faster than vector data processing.\nHowever, in contrast to vector data, the cell of one raster layer can only hold a single value.\nThe value might be numeric or categorical (Figure 2.10:C).\nFIGURE 2.10: Raster data types: (A) cell IDs, (B) cell values, (C) a colored raster map.\nRaster maps usually represent continuous phenomena such as elevation, temperature, population density or spectral data (Figure 2.11).\nOf course, we can represent discrete features such as soil or land-cover classes with the help of the raster data model as well (Figure 2.11).\nConsequently, the discrete borders of these features become blurred, and depending on the spatial task a vector representation might be more suitable.\nFIGURE 2.11: Examples of continuous and categorical rasters.\n","code":""},{"path":"spatial-class.html","id":"r-packages-for-working-with-raster-data","chapter":"2 Geographic data in R","heading":"2.3.1 R packages for working with raster data","text":"Over the last two decades, several packages for reading and processing raster datasets have been developed.\nAs outlined in Section 1.5, chief among them was raster, which led to a step change in R’s raster capabilities when it was launched in 2010 and was the premier package in the space until the development of terra and stars.\nBoth recently developed packages provide powerful and performant functions for working with raster datasets, and there is substantial overlap between their possible use cases.\nIn this book we focus on terra, which replaces the older and (in most cases) slower raster.\nBefore learning how terra’s class system works, this section describes the similarities and differences between terra and stars; this knowledge will help decide which is most appropriate in different situations.\nFirst, terra focuses on the most common raster data model (regular grids), while stars also allows storing less popular models (including regular, rotated, sheared, rectilinear, and curvilinear grids).\nWhile terra usually handles one or multi-layered rasters14, the stars package provides ways to store raster data cubes – a raster object with many layers (e.g., bands), for many moments in time (e.g., months), and many attributes (e.g., sensor type A and sensor type B).\nImportantly, in both packages, all layers or elements of a data cube must have the same spatial dimensions and extent.\nSecond, both packages allow us either to read all of the raster data into memory or just to read its metadata – this is usually done automatically based on the input file size.\nHowever, they store raster values differently.\nterra is based on C++ code and mostly uses C++ pointers.\nstars stores values as lists of arrays for smaller rasters, or just a file path for larger ones.\nThird, stars functions are closely related to the vector objects and functions in sf, while terra uses its own class of objects for vector data, namely SpatVector.\nFourth, both packages have a different approach to how various functions work on their objects.\nThe terra package mostly relies on a large number of built-in functions, where each function has a specific purpose (e.g., resampling or cropping).\nOn the other hand, stars uses some built-in functions (usually with names starting with st_), some methods for existing R functions (e.g., split() and aggregate()), and also some existing dplyr functions (e.g., filter() and slice()).Importantly, it is straightforward to convert objects from terra to stars (using st_as_stars()) and the other way round (using rast()).\nWe also encourage you to read E.
Pebesma and Bivand (2022) for a comprehensive introduction to the stars package.","code":""},{"path":"spatial-class.html","id":"an-introduction-to-terra","chapter":"2 Geographic data in R","heading":"2.3.2 An introduction to terra","text":"The terra package supports raster objects in R.\nLike its predecessor raster (created by the same developer, Robert Hijmans), it provides an extensive set of functions to create, read, export, manipulate and process raster datasets.\nterra’s functionality is largely the same as that of the more mature raster package, with some differences: terra functions are usually more computationally efficient than their raster equivalents.\n\nOn the other hand, the raster class system is popular and used by many other packages; as with sf and sp, the good news is that you can seamlessly translate between the two types of object to ensure backwards compatibility with older scripts and packages, for example with the functions raster(), stack(), and brick() (see the previous chapter for more on the evolution of R packages for working with geographic data).In addition to functions for raster data manipulation, terra provides many low-level functions that can form a foundation for developing new tools for working with raster datasets.\n\nterra also lets you work with raster datasets that are too large to fit into the main memory.\nIn this case, terra provides the possibility to divide the raster into smaller chunks, and processes these iteratively instead of loading the whole raster file into RAM.For the illustration of terra concepts, we will use datasets from spDataLarge.\nIt consists of a few raster objects and one vector object covering an area of Zion National Park (Utah, USA).\nFor example, srtm.tif is a digital elevation model of this area (for more details, see its documentation ?srtm).\nFirst, let’s create a SpatRaster object named my_rast:Typing the name of the raster into the console will print out the raster header (dimensions, resolution, extent, CRS) and some additional information (class, data source, summary of the raster values):Dedicated functions report each component: dim(my_rast) returns the number of rows, columns and layers; ncell() the number of cells (pixels); res() the spatial resolution; ext() its spatial extent; and crs() its coordinate reference system (raster reprojection is covered in Section 6.6).\ninMemory() reports whether the raster data is stored in memory or on disk.help(\"terra-package\") returns a full list of all available terra
functions.","code":"\nraster_filepath = system.file(\"raster/srtm.tif\", package = \"spDataLarge\")\nmy_rast = rast(raster_filepath)\nclass(my_rast)\n#> [1] \"SpatRaster\"\n#> attr(,\"package\")\n#> [1] \"terra\"\nmy_rast\n#> class : SpatRaster \n#> dimensions : 457, 465, 1 (nrow, ncol, nlyr)\n#> resolution : 0.000833, 0.000833 (x, y)\n#> extent : -113, -113, 37.1, 37.5 (xmin, xmax, ymin, ymax)\n#> coord. ref. : lon/lat WGS 84 (EPSG:4326) \n#> source : srtm.tif \n#> name : srtm \n#> min value : 1024 \n#> max value : 2892"},{"path":"spatial-class.html","id":"basic-map-raster","chapter":"2 Geographic data in R","heading":"2.3.3 Basic map making","text":"Similar sf package, terra also provides plot() methods classes.\n\nFIGURE 2.12: Basic raster plot.\nseveral approaches plotting raster data R outside scope section, including:plotRGB() function terra package create Red-Green-Blue plot based three layers SpatRaster objectpackages tmap create static interactive maps raster vector objects (see Chapter 8)functions, example levelplot() rasterVis package, create facets, common technique visualizing change time","code":"\nplot(my_rast)"},{"path":"spatial-class.html","id":"raster-classes","chapter":"2 Geographic data in R","heading":"2.3.4 Raster classes","text":"SpatRaster class represents rasters object terra.\neasiest way create raster object R read-raster file disk server (Section 7.6.2.\nterra package supports numerous drivers help GDAL library.\nRasters files usually read entirely RAM, exception header pointer file .Rasters can also created scratch using rast() function.\nillustrated subsequent code chunk, results new SpatRaster object.\nresulting raster consists 36 cells (6 columns 6 rows specified nrows ncols) centered around Prime Meridian Equator (see xmin, xmax, ymin ymax parameters).\ndefault CRS raster objects WGS84, can changed crs argument.\nmeans unit resolution degrees set 0.5 (resolution).\nValues (vals) assigned cell: 1 cell 1, 2 cell 2, .\nRemember: rast() 
fills cells row-wise (unlike matrix()) starting at the upper left corner, meaning the top row contains the values 1 to 6, the second 7 to 12, etc.For other ways of creating raster objects, see ?rast.The SpatRaster class also handles multiple layers, which typically correspond to a single multispectral satellite file or a time-series of rasters.nlyr() retrieves the number of layers stored in a SpatRaster object:For multi-layer raster objects, layers can be selected with terra::subset().15\nIt accepts a layer number or its name as the second argument:The opposite operation, combining several SpatRaster objects into one, can be done using the c() function:","code":"\nsingle_raster_file = system.file(\"raster/srtm.tif\", package = \"spDataLarge\")\nsingle_rast = rast(single_raster_file)\nnew_raster = rast(nrows = 6, ncols = 6, resolution = 0.5, \n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = 1:36)\nmulti_raster_file = system.file(\"raster/landsat.tif\", package = \"spDataLarge\")\nmulti_rast = rast(multi_raster_file)\nmulti_rast\n#> class : SpatRaster \n#> dimensions : 1428, 1128, 4 (nrow, ncol, nlyr)\n#> resolution : 30, 30 (x, y)\n#> extent : 301905, 335745, 4111245, 4154085 (xmin, xmax, ymin, ymax)\n#> coord. ref. 
: WGS 84 / UTM zone 12N (EPSG:32612) \n#> source : landsat.tif \n#> names : lan_1, lan_2, lan_3, lan_4 \n#> min values : 7550, 6404, 5678, 5252 \n#> max values : 19071, 22051, 25780, 31961\nnlyr(multi_rast)\n#> [1] 4\nmulti_rast3 = subset(multi_rast, 3)\nmulti_rast4 = subset(multi_rast, 4)\nmulti_rast34 = c(multi_rast3, multi_rast4)"},{"path":"spatial-class.html","id":"crs-intro","chapter":"2 Geographic data in R","heading":"2.4 Coordinate Reference Systems","text":"\nVector raster spatial data types share concepts intrinsic spatial data.\nPerhaps fundamental Coordinate Reference System (CRS), defines spatial elements data relate surface Earth (bodies).\nCRSs either geographic projected, introduced beginning chapter (see Figure 2.1).\nsection explain type, laying foundations Section 6 CRS transformations.","code":""},{"path":"spatial-class.html","id":"geographic-coordinate-systems","chapter":"2 Geographic data in R","heading":"2.4.1 Geographic coordinate systems","text":"\nGeographic coordinate systems identify location Earth’s surface using two values — longitude latitude (see left panel Figure 2.14).\nLongitude location East-West direction angular distance Prime Meridian plane.\nLatitude angular distance North South equatorial plane.\nDistances geographic CRSs therefore not measured meters.\nimportant consequences, demonstrated Section 6.surface Earth geographic coordinate systems represented spherical ellipsoidal surface.\nSpherical models assume Earth perfect sphere given radius – advantage simplicity , time, inaccurate: Earth not sphere!\nEllipsoidal models defined two parameters: equatorial radius polar radius.\nsuitable Earth compressed: equatorial radius around 11.5 km longer polar radius (Maling 1992).16Ellipsoids part wider component CRSs: datum.\ncontains information ellipsoid use precise relationship Cartesian coordinates location Earth’s surface.\ntwo types datum — geocentric (WGS84) local (NAD83).\ncan see examples two types datums Figure 2.13.\nBlack lines 
represent geocentric datum, center located Earth’s center gravity not optimized specific location.\nlocal datum, shown purple dashed line, ellipsoidal surface shifted align surface particular location.\nallow local variations Earth’s surface, example due large mountain ranges, accounted local CRS.\ncan seen Figure 2.13, local datum fitted area Philippines, misaligned rest planet’s surface.\ndatums Figure 2.13 put top geoid - model global mean sea level.17\nFIGURE 2.13: Geocentric local geodetic datums shown top geoid (false color vertical exaggeration 10,000 scale factor). Image geoid adapted work Ince et al. (2019).\n","code":""},{"path":"spatial-class.html","id":"projected-coordinate-reference-systems","chapter":"2 Geographic data in R","heading":"2.4.2 Projected coordinate reference systems","text":"\nprojected CRSs based geographic CRS, described previous section, rely map projections convert three-dimensional surface Earth Easting Northing (x y) values projected CRS.\nProjected CRSs based Cartesian coordinates implicitly flat surface (right panel Figure 2.14).\norigin, x y axes, linear unit measurement meters.transition cannot done without adding deformations.\nTherefore, properties Earth’s surface distorted process, area, direction, distance, shape.\nprojected coordinate system can preserve only one two properties.\nProjections often named based property preserve: equal-area preserves area, azimuthal preserve direction, equidistant preserve distance, conformal preserve local shape.three main groups projection types - conic, cylindrical, planar (azimuthal).\nconic projection, Earth’s surface projected onto cone along single line tangency two lines tangency.\nDistortions minimized along tangency lines rise distance lines projection.\nTherefore, best suited maps mid-latitude areas.\ncylindrical projection maps surface onto cylinder.\nprojection also created touching Earth’s surface along single line tangency two lines tangency.\nCylindrical projections used often mapping entire 
world.\nplanar projection projects data onto flat surface touching globe point along line tangency.\ntypically used mapping polar regions.\nsf_proj_info(type = \"proj\") gives list available projections supported PROJ library.quick summary different projections, types, properties, suitability can found “Map Projections” (1993).\nFIGURE 2.14: Examples geographic (WGS 84; left) projected (NAD83 / UTM zone 12N; right) coordinate systems vector data type.\n","code":""},{"path":"spatial-class.html","id":"crs-in-r","chapter":"2 Geographic data in R","heading":"2.4.3 CRSs in R","text":"\n\n\nSpatial R packages support wide range CRSs use long-established PROJ library.\nTwo recommended ways describe CRSs R (a) Spatial Reference System Identifier (SRID) (b) well-known text (known WKT218) definitions.\napproaches advantages disadvantages.SRID unique value used identify coordinate reference system definitions form AUTHORITY:CODE.\npopular registry SRIDs EPSG, however, registries, ESRI OGR, exist.\nexample, EPSG:4326 represents latitude/longitude WGS84 CRS, ESRI:54030 - Robinson projection.\nSRIDs usually short therefore easier remember.\nSRID associated well-known text (WKT2) definition coordinate reference system.WKT2 describes coordinate reference systems (CRSs) coordinates operations form well-known text strings.\nexhaustive, detailed, precise (can see later section), allowing unambiguous CRSs storage transformations.\nconsists information given CRS, including datum ellipsoid, prime meridian, projection, units, etc.\nfeature also makes WKT2 approach complicated\nusually complex manually defined.past, proj4string definitions, standard way specify coordinate operations store CRSs.\nstring representations, built key=value form (e.g., +proj=longlat +datum=WGS84 +no_defs), , however, currently discouraged most cases.\nPROJ version 6 still allows use proj4strings define coordinate operations, proj4string keys no longer supported or not advisable use (e.g., +nadgrids, +towgs84, +k, +init=epsg:) 
three datums (i.e., WGS84, NAD83, NAD27) can directly set proj4string.\nImportantly, proj4strings not used store CRSs anymore.\nLonger explanations recent changes PROJ library proj4string replaced WKT2 can found R. S. Bivand (2020), Chapter 2 E. Pebesma Bivand (2022), blog post Floris Vanderhaeghe.Let’s look CRSs stored R spatial objects can set.\n, need read-vector dataset:new object, new_vector, polygon representing world map data (?spData::world).\nsf CRS object can retrieved using st_crs().CRS sf objects list two elements - input wkt.\ninput element quite flexible, depending input file user input, can contain SRID representation (e.g., \"EPSG:4326\"), CRS’s name (e.g., \"WGS84\"), even proj4string definition.\nwkt element stores WKT2 representation, used saving object file coordinate operations.\n, can see new_vector object WGS84 ellipsoid, uses Greenwich prime meridian, latitude longitude axis order.\ncase, also additional elements, USAGE explaining area suitable use CRS, ID pointing CRS’s SRID - \"EPSG:4326\".st_crs function also one helpful feature – can retrieve additional information used CRS.\nexample, try run:st_crs(new_vector)$IsGeographic check CRS geographic not\nst_crs(new_vector)$units_gdal find CRS units\nst_crs(new_vector)$srid extracts SRID (available)\nst_crs(new_vector)$proj4string extracts proj4string representation\nIn cases coordinate reference system (CRS) missing wrong CRS set, st_set_crs() function can used:second argument function either SRID (\"EPSG:4326\" example), complete WKT2 representation, proj4string, CRS extracted existing object st_crs().crs() function can used access CRS information SpatRaster object19:output WKT2 representation CRS.function, crs(), can also used set CRS raster objects., can use either SRID, complete WKT2 representation, proj4string, CRS extracted existing object crs().Importantly, st_crs() crs() functions not alter coordinates’ values geometries.\nrole set metadata information object CRS.\nexpand CRSs explain project one CRS 
another Chapter 6.","code":"\nvector_filepath = system.file(\"shapes/world.gpkg\", package = \"spData\")\nnew_vector = read_sf(vector_filepath)\nst_crs(new_vector) # get CRS\n#> Coordinate Reference System:\n#> User input: WGS 84 \n#> wkt:\n#> GEOGCRS[\"WGS 84\",\n#> DATUM[\"World Geodetic System 1984\",\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]]],\n#> PRIMEM[\"Greenwich\",0,\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> CS[ellipsoidal,2],\n#> AXIS[\"geodetic latitude (Lat)\",north,\n#> ORDER[1],\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> AXIS[\"geodetic longitude (Lon)\",east,\n#> ORDER[2],\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> USAGE[\n#> SCOPE[\"unknown\"],\n#> AREA[\"World\"],\n#> BBOX[-90,-180,90,180]],\n#> ID[\"EPSG\",4326]]\nnew_vector = st_set_crs(new_vector, \"EPSG:4326\") # set CRS\ncrs(my_rast) # get CRS\n#> [1] \"GEOGCRS[\\\"WGS 84\\\",\\n DATUM[\\\"World Geodetic System 1984\\\",\\n ELLIPSOID[\\\"WGS 84\\\",6378137,298.257223563,\\n LENGTHUNIT[\\\"metre\\\",1]]],\\n PRIMEM[\\\"Greenwich\\\",0,\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433]],\\n CS[ellipsoidal,2],\\n AXIS[\\\"geodetic latitude (Lat)\\\",north,\\n ORDER[1],\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433]],\\n AXIS[\\\"geodetic longitude (Lon)\\\",east,\\n ORDER[2],\\n ANGLEUNIT[\\\"degree\\\",0.0174532925199433]],\\n ID[\\\"EPSG\\\",4326]]\"\ncrs(my_rast) = \"EPSG:26912\" # set CRS"},{"path":"spatial-class.html","id":"units","chapter":"2 Geographic data in R","heading":"2.5 Units","text":"important feature CRSs contain information spatial units.\nClearly, vital know whether house’s measurements feet meters, applies maps.\ngood cartographic practice add scale bar distance indicator onto maps demonstrate relationship distances page screen distances ground.\nLikewise, important formally specify units geometry data cells measured provide context, ensure subsequent calculations done context.novel feature geometry data sf objects 
native support units.\nmeans distance, area geometric calculations sf return values come units attribute, defined units package (E. Pebesma, Mailund, Hiebert 2016).\nadvantageous, preventing confusion caused different units (CRSs use meters, use feet) providing information dimensionality.\ndemonstrated code chunk , calculates area Luxembourg:\n\noutput units square meters (m2), showing result represents two-dimensional space.\ninformation, stored attribute (interested readers can discover attributes(st_area(luxembourg))), can feed subsequent calculations use units, population density (measured people per unit area, typically per km2).\nReporting units prevents confusion.\ntake Luxembourg example, units remained unspecified, one incorrectly assume units hectares.\ntranslate huge number digestible size, tempting divide results million (number square meters square kilometer):However, result incorrectly given square meters.\nsolution set correct units units package:Units equal importance case raster data.\nHowever, far sf only spatial package supports units, meaning people working raster data approach changes units analysis (example, converting pixel widths imperial decimal units) care.\nmy_rast object (see ) uses WGS84 projection decimal degrees units.\nConsequently, resolution also given decimal degrees know , since res() function simply returns numeric vector.used UTM projection, units change., res() command gives back numeric vector without unit, forcing us know unit UTM projection meters.","code":"\nluxembourg = world[world$name_long == \"Luxembourg\", ]\nst_area(luxembourg) # requires the s2 package in recent versions of sf\n#> 2.41e+09 [m^2]\nst_area(luxembourg) / 1000000\n#> 2409 [m^2]\nunits::set_units(st_area(luxembourg), km^2)\n#> 2409 [km^2]\nres(my_rast)\n#> [1] 0.000833 0.000833\nrepr = project(my_rast, \"EPSG:26912\")\nres(repr)\n#> [1] 73.8 92.5"},{"path":"spatial-class.html","id":"ex2","chapter":"2 Geographic data in R","heading":"2.6 
Exercises","text":"E1. Use summary() geometry column world data object. output tell us :geometry type?number countries?coordinate reference system (CRS)?E2. Run code ‘generated’ map world Section 2.2.4 Base plot arguments.\nFind two similarities two differences image computer book.cex argument (see ?plot)?cex set sqrt(world$pop) / 10000?Bonus: experiment different ways visualize global population.E3. Use plot() create maps Nigeria context (see Section 2.2.4 Base plot arguments).Adjust lwd, col expandBB arguments plot().Challenge: read documentation text() annotate map.E4. Create empty SpatRaster object called my_raster 10 columns 10 rows.\nAssign random values 0 10 new raster plot .E5. Read-raster/nlcd.tif file spDataLarge package.\nkind information can get properties file?E6. Check CRS raster/nlcd.tif file spDataLarge package.\nkind information can learn ?","code":""},{"path":"attr.html","id":"attr","chapter":"3 Attribute data operations","heading":"3 Attribute data operations","text":"","code":""},{"path":"attr.html","id":"prerequisites-1","chapter":"3 Attribute data operations","heading":"Prerequisites","text":"chapter requires following packages installed attached:also relies spData, loads datasets used code examples chapter:","code":"\nlibrary(sf) # vector data package introduced in Chapter 2\nlibrary(terra) # raster data package introduced in Chapter 2\nlibrary(dplyr) # tidyverse package for data frame manipulation\nlibrary(spData) # spatial data package introduced in Chapter 2\n#> Warning: multiple methods tables found for 'approxNA'"},{"path":"attr.html","id":"introduction","chapter":"3 Attribute data operations","heading":"3.1 Introduction","text":"Attribute data non-spatial information associated geographic (geometry) data.\nbus stop provides simple example: position typically represented latitude longitude coordinates (geometry data), addition name.\nElephant & Castle / New Kent Road stop London, example coordinates -0.098 degrees longitude 51.495 
degrees latitude can represented POINT (-0.098 51.495) sfc representation described Chapter 2.\nAttributes name attribute POINT feature (use Simple Features terminology) topic chapter.Another example elevation value (attribute) specific grid cell raster data.\nUnlike vector data model, raster data model stores coordinate grid cell indirectly, meaning distinction attribute spatial information less clear.\nillustrate point, think pixel 3rd row 4th column raster matrix.\nspatial location defined index matrix: move origin four cells x direction (typically east right maps) three cells y direction (typically south ).\nraster’s resolution defines distance x- y-step specified header.\nheader vital component raster datasets specifies pixels relate geographic coordinates (see also Chapter 4).teaches manipulate geographic objects based attributes names bus stops vector dataset elevations pixels raster dataset.\nvector data, means techniques subsetting aggregation (see Sections 3.2.1 3.2.3).\nSections 3.2.4 3.2.5 demonstrate join data onto simple feature objects using shared ID create new variables, respectively.\noperations spatial equivalent:\n[ operator base R, example, works equally subsetting objects based attribute spatial objects; can also join attributes two geographic datasets using spatial joins.\ngood news: skills developed chapter cross-transferable.\nChapter 4 extends methods presented spatial world.deep dive various types vector attribute operations next section, raster attribute data operations covered Section 3.3, demonstrates create raster layers containing continuous categorical attributes extracting cell values one layer (raster subsetting).\nSection 3.3.2 provides overview ‘global’ raster operations can used summarize entire raster datasets.","code":""},{"path":"attr.html","id":"vector-attribute-manipulation","chapter":"3 Attribute data operations","heading":"3.2 Vector attribute manipulation","text":"Geographic vector datasets well supported R thanks sf 
class, extends base R’s data.frame.\nLike data frames, sf objects one column per attribute variable (‘name’) one row per observation feature (e.g., per bus station).\nsf objects differ basic data frames geometry column class sfc can contain range geographic entities (single ‘multi’ point, line, polygon features) per row.\ndescribed Chapter 2, demonstrated generic methods plot() summary() work sf objects.\nsf also provides generics allow sf objects behave like regular data frames, shown printing class’s methods:Many (aggregate(), cbind(), merge(), rbind() [) manipulating data frames.\nrbind(), example, binds rows two data frames together, one ‘top’ .\n$<- creates new columns.\nkey feature sf objects store spatial non-spatial data way, columns data.frame.geometry column sf objects typically called geometry geom name can used.\nfollowing command, example, creates geometry column named g:st_sf(data.frame(n = world$name_long), g = world$geom)sf objects can also extend tidyverse classes data frames, tibble tbl.\n.\nThus sf enables full power R’s data analysis capabilities unleashed geographic data, whether use base R tidyverse functions data analysis.\n\n(See Rdatatable/data.table#2273 discussion compatibility sf objects fast data.table package.)\nusing capabilities worth re-capping discover basic properties vector data objects.\nLet’s start using base R functions learn world dataset spData package:world contains ten non-geographic columns (one geometry list column) almost 200 rows representing world’s countries.\nfunction st_drop_geometry() keeps attributes data sf object, words removing geometry:Dropping geometry column working attribute data can useful; data manipulation processes can run faster work attribute data geometry columns always needed.\ncases, however, makes sense keep geometry column, explaining column ‘sticky’ (remains attribute operations unless specifically dropped).\nNon-spatial data operations sf objects change object’s geometry appropriate (e.g., 
dissolving borders adjacent polygons following aggregation).\nBecoming skilled geographic attribute data manipulation means becoming skilled manipulating data frames.many applications, tidyverse package dplyr offers effective approach working data frames.\nTidyverse compatibility advantage sf predecessor sp, pitfalls avoid (see supplementary tidyverse-pitfalls vignette geocompr.github.io details).","code":"\nmethods(class = \"sf\") # methods for sf objects, first 12 shown\n#> [1] aggregate cbind coerce \n#> [4] initialize merge plot \n#> [7] print rbind [ \n#> [10] [[<- $<- show \nclass(world) # it's an sf object and a (tidy) data frame\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\ndim(world) # it is a 2 dimensional object, with 177 rows and 11 columns\n#> [1] 177 11\nworld_df = st_drop_geometry(world)\nclass(world_df)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\"\nncol(world_df)\n#> [1] 10"},{"path":"attr.html","id":"vector-attribute-subsetting","chapter":"3 Attribute data operations","heading":"3.2.1 Vector attribute subsetting","text":"Base R subsetting methods include operator [ function subset().\nkey dplyr subsetting functions filter() slice() subsetting rows, select() subsetting columns.\napproaches preserve spatial components attribute data sf objects, using operator $ dplyr function pull() return single attribute column vector lose attribute data, see.\n\nsection focuses subsetting sf data frames; details subsetting vectors non-geographic data frames recommend reading section section 2.7 Introduction R (R Core Team, Smith, Team 2021) Chapter 4 Advanced R Programming (Wickham 2019), respectively.[ operator can subset rows columns.\nIndices placed inside square brackets placed directly data frame object name specify elements keep.\ncommand object[, j] means ’return rows represented columns represented j, j typically contain integers TRUEs FALSEs (indices can also character strings, indicating row column names).\nobject[5, 1:3], example, means ’return data 
containing 5th row columns 1 3: result data frame 1 row 3 columns, fourth geometry column ’s sf object.\nLeaving j empty returns rows columns, world[1:5, ] returns first five rows 11 columns.\nexamples demonstrate subsetting base R.\nGuess number rows columns sf data frames returned command check results computer (see end chapter exercises):demonstration utility using logical vectors subsetting shown code chunk .\ncreates new object, small_countries, containing nations whose surface area smaller 10,000 km2:intermediary i_small (short index representing small countries) logical vector can used subset seven smallest countries world surface area.\nconcise command, omits intermediary object, generates result:base R function subset() provides another way achieve result:Base R functions mature, stable widely used, making rock solid choice, especially contexts reproducibility reliability key.\ndplyr functions enable ‘tidy’ workflows people (authors book included) find intuitive productive interactive data analysis, especially combined code editors RStudio enable auto-completion column names.\nKey functions subsetting data frames (including sf data frames) dplyr functions demonstrated .\n\n\n\nselect() selects columns name position.\nexample, select two columns, name_long pop, following command:Note: equivalent command base R (world[, c(\"name_long\", \"pop\")]), sticky geom column remains.\nselect() also allows selecting range columns help : operator:can remove specific columns - operator:Subset rename columns time new_name = old_name syntax:worth noting command concise base R equivalent, requires two lines code:select() also works ‘helper functions’ advanced subsetting operations, including contains(), starts_with() num_range() (see help page ?select details).dplyr verbs return data frame, can extract single column vector pull().\n\n\n\ncan get result base R list subsetting operators $ [[, three following commands return numeric vector:slice() row-equivalent 
select().\nfollowing code chunk, example, selects rows 1 6:filter() dplyr’s equivalent base R’s subset() function.\nkeeps rows matching given criteria, e.g., countries area certain threshold, high average life expectancy, shown following examples:standard set comparison operators can used filter() function, illustrated Table 3.1:TABLE 3.1: Comparison operators return Booleans (TRUE/FALSE).","code":"\nworld[1:6, ] # subset rows by position\nworld[, 1:3] # subset columns by position\nworld[1:6, 1:3] # subset rows and columns by position\nworld[, c(\"name_long\", \"pop\")] # columns by name\nworld[, c(T, T, F, F, F, F, F, T, T, F, F)] # by logical indices\nworld[, 888] # an index representing a non-existent column\ni_small = world$area_km2 < 10000\nsummary(i_small) # a logical vector\n#> Mode FALSE TRUE \n#> logical 170 7\nsmall_countries = world[i_small, ]\nsmall_countries = world[world$area_km2 < 10000, ]\nsmall_countries = subset(world, area_km2 < 10000)\nworld1 = dplyr::select(world, name_long, pop)\nnames(world1)\n#> [1] \"name_long\" \"pop\" \"geom\"\n# all columns between name_long and pop (inclusive)\nworld2 = dplyr::select(world, name_long:pop)\n# all columns except subregion and area_km2 (inclusive)\nworld3 = dplyr::select(world, -subregion, -area_km2)\nworld4 = dplyr::select(world, name_long, population = pop)\nworld5 = world[, c(\"name_long\", \"pop\")] # subset columns by name\nnames(world5)[names(world5) == \"pop\"] = \"population\" # rename column manually\npull(world, pop)\nworld$pop\nworld[[\"pop\"]]\nslice(world, 1:6)\nworld7 = filter(world ,area_km2 < 10000) # countries with a small area\nworld7 = filter(world, lifeExp > 82) # with high life expectancy"},{"path":"attr.html","id":"chaining-commands-with-pipes","chapter":"3 Attribute data operations","heading":"3.2.2 Chaining commands with pipes","text":"Key workflows using dplyr functions ‘pipe’ operator %>% (since R 4.1.0 native pipe |>), takes name Unix pipe | (Grolemund Wickham 2016).\nPipes 
enable expressive code: output previous function becomes first argument next function, enabling chaining.\nillustrated , countries Asia filtered world dataset, next object subset columns (name_long continent) first five rows (result shown).chunk shows pipe operator allows commands written clear order:\nrun top bottom (line--line) left right.\nalternative %>% nested function calls, harder read:","code":"\nworld7 = world %>%\n filter(continent == \"Asia\") %>%\n dplyr::select(name_long, continent) %>%\n slice(1:5)\nworld8 = slice(\n dplyr::select(\n filter(world, continent == \"Asia\"),\n name_long, continent),\n 1:5)"},{"path":"attr.html","id":"vector-attribute-aggregation","chapter":"3 Attribute data operations","heading":"3.2.3 Vector attribute aggregation","text":"\n\nAggregation involves summarizing data one ‘grouping variables,’ typically columns data frame aggregated (geographic aggregation covered next chapter).\nexample attribute aggregation calculating number people per continent based country-level data (one row per country).\nworld dataset contains necessary ingredients: columns pop continent, population grouping variable, respectively.\naim find sum() country populations continent, resulting smaller data frame (aggregation form data reduction can useful early step working large datasets).\ncan done base R function aggregate() follows:result non-spatial data frame six rows, one per continent, two columns reporting name population continent (see Table 3.2 results top 3 populous continents).aggregate() generic function means behaves differently depending inputs.\nsf provides method aggregate.sf() activated automatically x sf object argument provided:resulting world_agg2 object spatial object containing 8 features representing continents world (open ocean).\ngroup_by() %>% summarize() dplyr equivalent aggregate(), variable name provided group_by() function specifying grouping variable information summarized passed summarize() function, shown :approach may 
seem complex benefits: flexibility, readability, control new column names.\nflexibility illustrated command , calculates population also area number countries continent:previous code chunk pop, area (sqkm) n column names result, sum() n() aggregating functions.\naggregating functions return sf objects rows representing continents geometries containing multiple polygons representing land mass associated islands (works thanks geometric operation ‘union,’ explained Section 5.2.6).Let’s combine learned far dplyr functions, chaining multiple commands summarize attribute data countries worldwide continent.\nfollowing command calculates population density (mutate()), arranges continents number countries contain (dplyr::arrange()), keeps 3 populous continents (top_n()), result presented Table 3.2):TABLE 3.2: top 3 populous continents ordered population density (people per square km).","code":"\nworld_agg1 = aggregate(pop ~ continent, FUN = sum, data = world, na.rm = TRUE)\nclass(world_agg1)\n#> [1] \"data.frame\"\nworld_agg2 = aggregate(world[\"pop\"], list(world$continent), FUN = sum, na.rm = TRUE)\nclass(world_agg2)\n#> [1] \"sf\" \"data.frame\"\nnrow(world_agg2)\n#> [1] 8\nworld_agg3 = world %>%\n group_by(continent) %>%\n summarize(pop = sum(pop, na.rm = TRUE))\nworld_agg4 = world %>% \n group_by(continent) %>%\n summarize(pop = sum(pop, na.rm = TRUE), `area (sqkm)` = sum(area_km2), n = n())\nworld_agg5 = world %>% \n st_drop_geometry() %>% # drop the geometry for speed\n dplyr::select(pop, continent, area_km2) %>% # subset the columns of interest \n group_by(continent) %>% # group by continent and summarize:\n summarize(Pop = sum(pop, na.rm = TRUE), Area = sum(area_km2), N = n()) %>%\n mutate(Density = round(Pop / Area)) %>% # calculate population density\n top_n(n = 3, wt = Pop) %>% # keep only the top 3\n arrange(desc(N)) # arrange in order of n. 
countries"},{"path":"attr.html","id":"vector-attribute-joining","chapter":"3 Attribute data operations","heading":"3.2.4 Vector attribute joining","text":"Combining data different sources common task data preparation.\nJoins combining tables based shared ‘key’ variable.\ndplyr multiple join functions including left_join() inner_join() — see vignette(\"two-table\") full list.\nfunction names follow conventions used database language SQL (Grolemund Wickham 2016, chap. 13); using join non-spatial datasets sf objects focus section.\ndplyr join functions work data frames sf objects, important difference geometry list column.\nresult data joins can either sf data.frame object.\ncommon type attribute join spatial data takes sf object first argument adds columns data.frame specified second argument.\n\ndemonstrate joins, combine data coffee production world dataset.\ncoffee data data frame called coffee_data spData package (see ?coffee_data details).\n3 columns:\nname_long names major coffee-producing nations coffee_production_2016 coffee_production_2017 contain estimated values coffee production units 60-kg bags year.\n‘left join,’ preserves first dataset, merges world coffee_data:input datasets share ‘key variable’ (name_long) join worked without using argument (see ?left_join details).\nresult sf object identical original world object two new variables (column indices 11 12) coffee production.\ncan plotted map, illustrated Figure 3.1, generated plot() function :\nFIGURE 3.1: World coffee production (thousand 60-kg bags) country, 2017. 
Source: International Coffee Organization.\njoining work, ‘key variable’ must supplied datasets.\ndefault dplyr uses variables matching names.\ncase, world_coffee world objects contained variable called name_long, explaining message Joining, = \"name_long\".\nmajority cases variable names not same, two options:Rename key variable one objects match.Use argument specify joining variables.latter approach demonstrated renamed version coffee_data:Note name original object kept, meaning world_coffee new object world_coffee2 identical.\nAnother feature result number rows original dataset.\nAlthough 47 rows data coffee_data, 177 country records kept intact world_coffee world_coffee2:\nrows original dataset no match assigned NA values new coffee production variables.\nwant only keep countries match key variable?\ncase inner join can used:Note result inner_join() 45 rows compared 47 coffee_data.\nhappened remaining rows?\ncan identify rows not match using setdiff() function follows:result shows Others accounts one row present world dataset name Democratic Republic Congo accounts :\nabbreviated, causing join miss .\nfollowing command uses string matching (regex) function stringr package confirm Congo, Dem. Rep. 
:fix issue, create new version coffee_data update name.\ninner_join()ing updated data frame returns result 46 coffee-producing nations:also possible join direction: starting non-spatial dataset adding variables simple features object.\ndemonstrated , starts coffee_data object adds variables original world dataset.\ncontrast previous joins, result another simple feature object, data frame form tidyverse tibble:\noutput join tends match first argument:section covers majority joining use cases.\ninformation, recommend Grolemund Wickham (2016), join vignette geocompkg package accompanies book, documentation data.table package.20\nAnother type join spatial join, covered next chapter (Section 4.2.3).","code":"\nworld_coffee = left_join(world, coffee_data)\n#> Joining, by = \"name_long\"\nclass(world_coffee)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\nnames(world_coffee)\n#> [1] \"iso_a2\" \"name_long\" \"continent\" \n#> [4] \"region_un\" \"subregion\" \"type\" \n#> [7] \"area_km2\" \"pop\" \"lifeExp\" \n#> [10] \"gdpPercap\" \"geom\" \"coffee_production_2016\"\n#> [13] \"coffee_production_2017\"\nplot(world_coffee[\"coffee_production_2017\"])\ncoffee_renamed = rename(coffee_data, nm = name_long)\nworld_coffee2 = left_join(world, coffee_renamed, by = c(name_long = \"nm\"))\nworld_coffee_inner = inner_join(world, coffee_data)\n#> Joining, by = \"name_long\"\nnrow(world_coffee_inner)\n#> [1] 45\nsetdiff(coffee_data$name_long, world$name_long)\n#> [1] \"Congo, Dem. Rep. 
of\" \"Others\"\n(drc = stringr::str_subset(world$name_long, \"Dem*.+Congo\"))\n#> [1] \"Democratic Republic of the Congo\"\ncoffee_data$name_long[grepl(\"Congo,\", coffee_data$name_long)] = drc\nworld_coffee_match = inner_join(world, coffee_data)\n#> Joining, by = \"name_long\"\nnrow(world_coffee_match)\n#> [1] 46\ncoffee_world = left_join(coffee_data, world)\n#> Joining, by = \"name_long\"\nclass(coffee_world)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"attr.html","id":"vec-attr-creation","chapter":"3 Attribute data operations","heading":"3.2.5 Creating attributes and removing spatial information","text":"Often, we would like to create a new column based on already existing columns.\nFor example, we want to calculate population density for each country.\nFor this we need to divide a population column, here pop, by an area column, here area_km2 with the unit area in square kilometers.\nUsing base R, we can type:\nAlternatively, we can use one of the dplyr functions - mutate() or transmute().\nmutate() adds new columns at the penultimate position in the sf object (the last one is reserved for the geometry):\nThe difference between mutate() and transmute() is that the latter drops all other existing columns (except for the sticky geometry column):\nunite() from the tidyr package (which provides many useful functions for reshaping datasets, including pivot_longer()) pastes together existing columns.\nFor example, we want to combine the continent and region_un columns into a new column named con_reg.\nAdditionally, we can define a separator (here: a colon :) which defines how the values of the input columns should be joined, and if the original columns should be removed (here: TRUE):\nThe separate() function does the opposite of unite(): it splits one column into multiple columns using either a regular expression or character positions.\nThis function also comes from the tidyr package.\nThe dplyr function rename() and the base R function setNames() are useful for renaming columns.\nThe first replaces an old name with a new one.\nThe following command, for example, renames the lengthy name_long column to simply name:\nsetNames() changes all column names at once, and requires a character vector with a name matching each column.\nThis is illustrated below, which outputs the same world object, but with very short names:\nIt is important to note that attribute data operations preserve the geometry of the simple features.\nAs mentioned at the outset of the
chapter, can useful remove geometry.\n, explicitly remove .\nHence, approach select(world, -geom) unsuccessful instead use st_drop_geometry().21","code":"\nworld_new = world # do not overwrite our original data\nworld_new$pop_dens = world_new$pop / world_new$area_km2\nworld %>% \n mutate(pop_dens = pop / area_km2)\nworld %>% \n transmute(pop_dens = pop / area_km2)\nworld_unite = world %>%\n unite(\"con_reg\", continent:region_un, sep = \":\", remove = TRUE)\nworld_separate = world_unite %>% \n separate(con_reg, c(\"continent\", \"region_un\"), sep = \":\")\nworld %>% \n rename(name = name_long)\nnew_names = c(\"i\", \"n\", \"c\", \"r\", \"s\", \"t\", \"a\", \"p\", \"l\", \"gP\", \"geom\")\nworld %>% \n setNames(new_names)\nworld_data = world %>% st_drop_geometry()\nclass(world_data)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"attr.html","id":"manipulating-raster-objects","chapter":"3 Attribute data operations","heading":"3.3 Manipulating raster objects","text":"contrast vector data model underlying simple features (represents points, lines polygons discrete entities space), raster data represent continuous surfaces.\nsection shows raster objects work creating scratch, building Section 2.3.2.\nunique structure, subsetting operations raster datasets work different way, demonstrated Section 3.3.1.\nfollowing code recreates raster dataset used Section 2.3.4, result illustrated Figure 3.2.\ndemonstrates rast() function works create example raster named elev (representing elevations).result raster object 6 rows 6 columns (specified nrow ncol arguments), minimum maximum spatial extent x y direction (xmin, xmax, ymin, ymax).\nvals argument sets values cell contains: numeric data ranging 1 36 case.\nRaster objects can also contain categorical values class logical factor variables R.\nfollowing code creates raster representing grain sizes (Figure 3.2):raster object stores corresponding look-table “Raster Attribute Table” (RAT) list data frames, can viewed 
cats(grain) (see ?cats() information).\nelement list layer raster.\nalso possible use function levels() retrieving adding new replacing existing factor levels:\nFIGURE 3.2: Raster datasets numeric (left) categorical values (right).\n","code":"\nelev = rast(nrows = 6, ncols = 6, resolution = 0.5, \n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = 1:36)\ngrain_order = c(\"clay\", \"silt\", \"sand\")\ngrain_char = sample(grain_order, 36, replace = TRUE)\ngrain_fact = factor(grain_char, levels = grain_order)\ngrain = rast(nrows = 6, ncols = 6, resolution = 0.5, \n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = grain_fact)\nlevels(grain)[[1]] = c(levels(grain)[[1]], wetness = c(\"wet\", \"moist\", \"dry\"))\nlevels(grain)\n#> [[1]]\n#> [1] \"clay\" \"silt\" \"sand\" \"wet\" \"moist\" \"dry\""},{"path":"attr.html","id":"raster-subsetting","chapter":"3 Attribute data operations","heading":"3.3.1 Raster subsetting","text":"Raster subsetting done base R operator [, accepts variety inputs:\nRow-column indexingCell IDsCoordinates (see Section 4.3.1)Another spatial object (see Section 4.3.1), show first two options since can considered non-spatial operations.\nneed spatial object subset another output spatial object, refer spatial subsetting.\nTherefore, latter two options shown next chapter (see Section 4.3.1).first two subsetting options demonstrated commands —\nreturn value top left pixel raster object elev (results shown):Subsetting multi-layered raster objects return cell value(s) layer.\nexample, c(elev, grain)[1] returns data frame one row two columns — one layer.\nextract values complete rows, can also use values().Cell values can modified overwriting existing values conjunction subsetting operation.\nfollowing code chunk, example, sets upper left cell elev 0 (results shown):Leaving square brackets empty shortcut version values() retrieving values raster.\nMultiple cells can also modified way:Replacing values multilayered rasters can done matrix 
many columns layers rows replaceable cells (results shown):","code":"\n# row 1, column 1\nelev[1, 1]\n# cell ID 1\nelev[1]\nelev[1, 1] = 0\nelev[]\nelev[1, c(1, 2)] = 0\ntwo_layers = c(grain, elev) \ntwo_layers[1] = cbind(c(0), c(4))\ntwo_layers[]"},{"path":"attr.html","id":"summarizing-raster-objects","chapter":"3 Attribute data operations","heading":"3.3.2 Summarizing raster objects","text":"terra contains functions extracting descriptive statistics entire rasters.\nPrinting raster object console typing name returns minimum maximum values raster.\nsummary() provides common descriptive statistics – minimum, maximum, quartiles number NAs continuous rasters number cells class categorical rasters.\nsummary operations standard deviation (see ) custom summary statistics can calculated global().\nAdditionally, freq() function allows get frequency table categorical values.Raster value statistics can visualized variety ways.\nSpecific functions boxplot(), density(), hist() pairs() work also raster objects, demonstrated histogram created command (shown):case desired visualization function work raster objects, one can extract raster data plotted help values() (Section 3.3.1).\nDescriptive raster statistics belong -called global raster operations.\ntypical raster processing operations part map algebra scheme, covered next chapter (Section 4.3.2).\nfunction names clash packages (e.g., function name extract() exist terra tidyr packages). addition loading packages referring functions verbosely (e.g., tidyr::extract()), another way prevent function names clashes unloading offending package detach(). following command, example, unloads terra package (can also done package tab resides default right-bottom pane RStudio): detach(“package:terra,” unload = TRUE, force = TRUE). force argument makes sure package detached even packages depend . 
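The descriptive raster statistics mentioned above can be sketched on small example objects; a minimal illustration (assuming terra is available, with toy rasters built as earlier in the chapter, not the book's exact code):

```r
library(terra)
set.seed(1) # for a reproducible categorical raster

# small numeric and categorical example rasters
elev = rast(nrows = 6, ncols = 6, vals = 1:36)
grain_fact = factor(sample(c("clay", "silt", "sand"), 36, replace = TRUE))
grain = rast(nrows = 6, ncols = 6, vals = grain_fact)

global(elev, fun = "mean") # built-in summary statistic over all cells
global(elev, fun = sd)     # custom summary function, as in the text
freq(grain)                # frequency table of the categorical values
```

Here freq() returns one row per class with its cell count, complementing the minimum/maximum and quartile summaries that summary() provides.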
This, however, may lead to a restricted usability of packages depending on the detached package, and is therefore not recommended.\n","code":"\nglobal(elev, sd)\nhist(elev)"},{"path":"attr.html","id":"exercises-1","chapter":"3 Attribute data operations","heading":"3.4 Exercises","text":"These exercises use the us_states and us_states_df datasets from the spData package.\nYou must have attached the package, and other packages used in the attribute operations chapter (sf, dplyr, terra), with commands such as library(spData) before attempting these exercises:\nus_states is a spatial object (of class sf), containing geometry and a few attributes (including name, region, area, and population) of states within the contiguous United States.\nus_states_df is a data frame (of class data.frame) containing the name and additional variables (including median income and poverty level, for the years 2010 and 2015) of US states, including Alaska, Hawaii and Puerto Rico.\nThe data comes from the United States Census Bureau, and is documented in ?us_states and ?us_states_df.\nE1. Create a new object called us_states_name that contains only the NAME column from the us_states object using either base R ([) or tidyverse (select()) syntax.\nWhat is the class of the new object and what makes it geographic?\nE2. Select columns from the us_states object which contain population data.\nObtain the same result using a different command (bonus: try to find three ways of obtaining the same result).\nHint: try to use helper functions, such as contains or starts_with from dplyr (see ?contains).\nE3. Find all states with the following characteristics (bonus: find and plot them):\nBelong to the Midwest region.\nBelong to the West region, have an area below 250,000 km2 and in 2015 a population greater than 5,000,000 residents (hint: you may need to use the function units::set_units() or as.numeric()).\nBelong to the South region, had an area larger than 150,000 km2 or a total population in 2015 larger than 7,000,000 residents.\nE4. What was the total population in 2015 in the us_states dataset?\nWhat was the minimum and maximum total population in 2015?\nE5. How many states are there in each region?\nE6. What was the minimum and maximum total population in 2015 in each region?\nWhat was the total population in 2015 in each region?\nE7. Add variables from us_states_df to us_states, and create a new object called us_states_stats.\nWhat function did you use and why?\nWhich variable is the key in both datasets?\nWhat is the class of the new object?\nE8. us_states_df has two more rows than us_states.\nHow can you find them?
(hint: try use dplyr::anti_join() function)E9. population density 2015 state?\npopulation density 2010 state?E10. much population density changed 2010 2015 state?\nCalculate change percentages map .E11. Change columns’ names us_states lowercase. (Hint: helper functions - tolower() colnames() may help.)E12. Using us_states us_states_df create new object called us_states_sel.\nnew object two variables - median_income_15 geometry.\nChange name median_income_15 column Income.E13. Calculate change number residents living poverty level 2010 2015 state. (Hint: See ?us_states_df documentation poverty level columns.)\nBonus: Calculate change percentage residents living poverty level state.E14. minimum, average maximum state’s number people living poverty line 2015 region?\nBonus: region largest increase people living poverty line?E15. Create raster scratch nine rows columns resolution 0.5 decimal degrees (WGS84).\nFill random numbers.\nExtract values four corner cells.E16. common class example raster grain (hint: modal)?E17. 
Plot histogram boxplot dem.tif file spDataLarge package (system.file(\"raster/dem.tif\", package = \"spDataLarge\")).","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(terra)\nlibrary(spData)\ndata(us_states)\ndata(us_states_df)"},{"path":"spatial-operations.html","id":"spatial-operations","chapter":"4 Spatial data operations","heading":"4 Spatial data operations","text":"","code":""},{"path":"spatial-operations.html","id":"prerequisites-2","chapter":"4 Spatial data operations","heading":"Prerequisites","text":"chapter requires packages used Chapter 3:also need read couple datasets follows Section 4.3","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\ngrain = rast(system.file(\"raster/grain.tif\", package = \"spData\"))"},{"path":"spatial-operations.html","id":"introduction-1","chapter":"4 Spatial data operations","heading":"4.1 Introduction","text":"Spatial operations, including spatial joins vector datasets local focal operations raster datasets, vital part geocomputation.\nchapter shows spatial objects can modified multitude ways based location shape.\nMany spatial operations non-spatial (attribute) equivalent, concepts subsetting joining datasets demonstrated previous chapter applicable .\nespecially true vector operations: Section 3.2 vector attribute manipulation provides basis understanding spatial counterpart, namely spatial subsetting (covered Section 4.2.1).\nSpatial joining (Section 4.2.3) aggregation (Section 4.2.5) also non-spatial counterparts, covered previous chapter.Spatial operations differ non-spatial operations number ways, however:\nSpatial joins, example, can done number ways — including matching entities intersect within certain distance target dataset — attribution joins discussed Section 3.2.4 previous chapter can done one way (except using fuzzy joins, described documentation fuzzyjoin 
package).\ntype spatial relationship objects must considered undertaking spatial operations, described Section 4.2.2, topological relations vector features.\n.\nAnother unique aspect spatial objects distance: spatial objects related space, distance calculations can used explore strength relationship, described context vector data Section 4.2.7.Spatial operations raster objects include subsetting — covered Section 4.3.1 — merging several raster ‘tiles’ single object, demonstrated Section 4.3.8.\nMap algebra covers range operations modify raster cell values, without reference surrounding cell values, vital many applications.\nconcept map algebra introduced Section 4.3.2; local, focal zonal map algebra operations covered sections 4.3.3, 4.3.4, 4.3.5, respectively. Global map algebra operations, generate summary statistics representing entire raster dataset, distance calculations rasters, discussed Section 4.3.6.\nfinal section exercises (??) process merging two raster datasets discussed demonstrated reference reproducible example.","code":""},{"path":"spatial-operations.html","id":"spatial-vec","chapter":"4 Spatial data operations","heading":"4.2 Spatial operations on vector data","text":"section provides overview spatial operations vector geographic data represented simple features sf package.\nSection 4.3 presents spatial operations raster datasets using classes functions terra package.","code":""},{"path":"spatial-operations.html","id":"spatial-subsetting","chapter":"4 Spatial data operations","heading":"4.2.1 Spatial subsetting","text":"Spatial subsetting process taking spatial object returning new object containing features relate space another object.\nAnalogous attribute subsetting (covered Section 3.2.1), subsets sf data frames can created square bracket ([) operator using syntax x[y, , op = st_intersects], x sf object subset rows returned, y ‘subsetting object’ , op = st_intersects optional argument specifies topological relation (also known binary predicate) 
used subsetting.\ndefault topological relation used op argument provided st_intersects(): command x[y, ] identical x[y, , op = st_intersects] shown x[y, , op = st_disjoint] (meaning topological relations described next section).\nfilter() function tidyverse can also used approach verbose, see examples .\n\ndemonstrate spatial subsetting, use nz nz_height datasets spData package, contain geographic data 16 main regions 101 highest points New Zealand, respectively (Figure 4.1), projected coordinate system.\nfollowing code chunk creates object representing Canterbury, uses spatial subsetting return high points region:\nFIGURE 4.1: Illustration spatial subsetting red triangles representing 101 high points New Zealand, clustered near central Canterbuy region (left). points Canterbury created [ subsetting operator (highlighted gray, right).\nLike attribute subsetting, command x[y, ] (equivalent nz_height[canterbury, ]) subsets features target x using contents source object y.\nInstead y vector class logical integer, however, spatial subsetting x y must geographic objects.\nSpecifically, objects used spatial subsetting way must class sf sfc: nz nz_height geographic vector data frames class sf, result operation returns another sf object representing features target nz_height object intersect (case high points located within) canterbury region.Various topological relations can used spatial subsetting determine type spatial relationship features target object must subsetting object selected.\ninclude touches, crosses within, see shortly Section 4.2.2.\ndefault setting st_intersects ‘catch ’ topological relation return features target touch, cross within source ‘subsetting’ object.\nindicated , alternative spatial operators can specified op = argument, demonstrated following command returns opposite st_intersects(), points intersect Canterbury (see Section 4.2.2):many applications, ’ll need know spatial subsetting vector data: just works.\nimpatient learn topological 
relations, beyond st_intersects() st_disjoint(), skip next section (4.2.2).\n’re interested details, including ways subsetting, read .Another way spatial subsetting uses objects returned topological operators.\nobjects can useful right, example exploring graph network relationships contiguous regions, can also used subsetting, demonstrated code chunk :code chunk creates object class sgbp (sparse geometry binary predicate, list length x spatial operation) converts logical vector sel_logical (containing TRUE FALSE values, something can also used dplyr’s filter function).\n\nfunction lengths() identifies features nz_height intersect objects y.\ncase 1 greatest possible value complex operations one use method subset features intersect , example, 2 features source object.result can achieved sf function st_filter() created increase compatibility sf objects dplyr data manipulation code:point, three identical (row names) versions canterbury_height, one created using [ operator, one created via intermediary selection object, another using sf’s convenience function st_filter().\n\n\nnext section explores different types spatial relation, also known binary predicates, can used identify whether two features spatially related .","code":"\ncanterbury = nz %>% filter(Name == \"Canterbury\")\ncanterbury_height = nz_height[canterbury, ]\nnz_height[canterbury, , op = st_disjoint]\nsel_sgbp = st_intersects(x = nz_height, y = canterbury)\nclass(sel_sgbp)\n#> [1] \"sgbp\" \"list\"\nsel_sgbp\n#> Sparse geometry binary predicate list of length 101, where the\n#> predicate was `intersects'\n#> first 10 elements:\n#> 1: (empty)\n#> 2: (empty)\n#> 3: (empty)\n#> 4: (empty)\n#> 5: 1\n#> 6: 1\n#> 7: 1\n#> 8: 1\n#> 9: 1\n#> 10: 1\nsel_logical = lengths(sel_sgbp) > 0\ncanterbury_height2 = nz_height[sel_logical, ]\ncanterbury_height3 = nz_height %>%\n st_filter(y = canterbury, .predicate = st_intersects)"},{"path":"spatial-operations.html","id":"topological-relations","chapter":"4 Spatial data 
operations","heading":"4.2.2 Topological relations","text":"Topological relations describe spatial relationships objects.\n“Binary topological relationships,” give full name, logical statements (answer can TRUE FALSE) spatial relationships two objects defined ordered sets points (typically forming points, lines polygons) two dimensions (Egenhofer Herring 1990).\nmay sound rather abstract , indeed, definition classification topological relations based earlier mathematical foundations first published book 1966 (Spanier 1995).22Despite mathematical origins, topological relations can understood intuitively reference visualizations commonly used functions test common types spatial relationships.\nFigure 4.2 shows variety geometry pairs associated relations.\nthird fourth pairs Figure 4.2 (left right ) demonstrate , relations, order important: relations equals, intersects, crosses, touches overlaps symmetrical, meaning function(x, y) true, function(y, x) also true, relations order geomtries important contains within .\n\nFIGURE 4.2: Topological relations vector geometries, inspired Figures 1 2 Egenhofer Herring (1990). relations function(x, y) true printed geometry pair, x represented pink y represented blue.\nsf, functions testing different types topological relations called ‘binary predicates,’ described vignette Manipulating Simple Feature Geometries, can viewed command vignette(\"sf3\"), help page ?geos_binary_pred (E. 
Pebesma 2018).\nsee topological relations work practice, let’s create simple reproducible example, building relations illustrated Figure 4.2 consolidating knowledge vector geometries represented previous chapter (Section 2.2.5).\nNote create tabular data representing coordinates (x y) polygon vertices, use base R function read.csv() convert result matrix, POLYGON finally sfc object:create additional geometries demonstrate spatial relations following commands , plotted top polygon created , relate space one another, shown Figure 4.3.\nNote use function st_as_sf() argument coords efficiently convert data frame containing columns representing coordinates sf object containing points:\nFIGURE 4.3: Points (point_df 1 3), line polygon objects arranged illustrate topological relations.\nsimple query : points point_sf intersect way polygon polygon_sfc?\nquestion can answered inspection (points 1 2 touch triangle).\ncan also answered using spatial predicate objects intersect?\nimplemented sf follows:contents result expected:\nfunction returns positive (1) third point, negative result (represented empty vector) first two outside polygon’s border.\nmay unexpected result comes form list vectors.\nsparse matrix output registers relation one exists, reducing memory requirements topological operations multi-feature objects.\nsaw previous section, dense matrix consisting TRUE FALSE values combination features can also returned sparse = FALSE:output matrix row represents feature target object column represents feature selecting object.\ncase, third feature point_sf intersects polygon_sfc.\none feature polygon_sfc result one column.\nresult can used subsetting saw Section 4.2.1.Note: st_intersects() returns TRUE even cases features just touch: intersects ‘catch-’ topological operation identifies many types spatial relation, illustrated Figure ??.\nopposite st_intersects() st_disjoint(), returns objects spatially relate way selecting object (note [, 1] converts result 
vector):st_within() returns TRUE objects completely within selecting object.\nalso applies third point, inside polygon, illustrated :Note although first point within triangle, touch part border.\nreason st_touches() returns FALSE points:features touch, almost touch selection object?\ncan selected using st_is_within_distance(), additional dist argument.\ncan used set close target objects need selected.\nNote although point 4 one unit distance nearest node polygon_sfc (point 2 Figure 4.3), still selected distance set 0.9.\nillustrated code chunk , second line uses function lengths() convert lengthy list output logical object:","code":"#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. Long-lat (WGS84) is assumed.\n\n#> Warning: Currect projection of shape xy unknown. 
Long-lat (WGS84) is assumed.\npolygon_df = read.csv(text = \"x, y\n0, 0.0\n0, 1.0\n1, 1.0\n1, 0.5\n0, 0.0\")\npolygon_matrix = as.matrix(polygon_df)\npolygon = st_polygon(list(polygon_matrix))\npolygon_sfc = st_sfc(polygon)\nline_df = read.csv(text = \"x,y\n0.1, 0\n1, 0.1\")\nline = st_linestring(x = as.matrix(line_df))\nline_sfc = st_sfc(line)\n# create points\npoint_df = read.csv(text = \"x,y\n0.1,0\n0.7,0.2\n0.4,0.8\")\npoint_sf = st_as_sf(point_df, coords = c(\"x\", \"y\"))\nst_intersects(point_sf, polygon_sfc)\n#> Sparse geometry binary ..., where the predicate was `intersects'\n#> 1: (empty)\n#> 2: (empty)\n#> 3: 1\nst_intersects(point_sf, polygon_sfc, sparse = FALSE)\n#> [,1]\n#> [1,] FALSE\n#> [2,] FALSE\n#> [3,] TRUE\nst_disjoint(point_sf, polygon_sfc, sparse = FALSE)[, 1]\n#> [1] TRUE TRUE FALSE\nst_within(point_sf, polygon_sfc, sparse = FALSE)[, 1]\n#> [1] FALSE FALSE TRUE\nst_touches(point_sf, polygon_sfc, sparse = FALSE)[, 1]\n#> [1] FALSE FALSE FALSE\n# can only return a sparse matrix\nsel = st_is_within_distance(point_sf, polygon_sfc, dist = 0.1) \nlengths(sel) > 0\n#> [1] TRUE FALSE TRUE"},{"path":"spatial-operations.html","id":"spatial-joining","chapter":"4 Spatial data operations","heading":"4.2.3 Spatial joining","text":"Joining two non-spatial datasets relies shared ‘key’ variable, described Section 3.2.4.\nSpatial data joining applies concept, instead relies spatial relations, described previous section.\nattribute data, joining adds new columns target object (argument x joining functions), source object (y).\n\nprocess illustrated following example: imagine ten points randomly distributed across Earth’s surface ask, points land, countries ?\nImplementing idea reproducible example build geographic data handling skills show spatial joins work.\nstarting point create points randomly scattered Earth’s surface:scenario illustrated Figure 4.4 shows random_points object (top left) lacks attribute data, world (top right) attributes, including country 
names shown sample countries legend.\nSpatial joins implemented st_join(), illustrated code chunk .\noutput random_joined object illustrated Figure 4.4 (bottom left).\ncreating joined dataset, use spatial subsetting create world_random, contains countries contain random points, verify number country names returned joined dataset four (see top right panel Figure 4.4).\nFIGURE 4.4: Illustration spatial join. new attribute variable added random points (top left) source world object (top right) resulting data represented final panel.\ndefault, st_join() performs left join, meaning result object containing rows x including rows match y (see Section 3.2.4), can also inner joins setting argument left = FALSE.\nLike spatial subsetting, default topological operator used st_join() st_intersects(), can changed setting join argument (see ?st_join details).\nexample demonstrates addition column polygon layer point layer, approach works regardless geometry types.\ncases, example x contains polygons, match multiple objects y, spatial joins result duplicate features, creates new row match y.","code":"\nset.seed(2018) # set seed for reproducibility\n(bb = st_bbox(world)) # the world's bounds\n#> xmin ymin xmax ymax \n#> -180.0 -89.9 180.0 83.6\nrandom_df = tibble(\n x = runif(n = 10, min = bb[1], max = bb[3]),\n y = runif(n = 10, min = bb[2], max = bb[4])\n)\nrandom_points = random_df %>% \n st_as_sf(coords = c(\"x\", \"y\")) %>% # set coordinates\n st_set_crs(\"EPSG:4326\") # set geographic CRS\nworld_random = world[random_points, ]\nnrow(world_random)\n#> [1] 4\nrandom_joined = st_join(random_points, world[\"name_long\"])"},{"path":"spatial-operations.html","id":"non-overlapping-joins","chapter":"4 Spatial data operations","heading":"4.2.4 Non-overlapping joins","text":"Sometimes two geographic datasets touch still strong geographic relationship.\ndatasets cycle_hire cycle_hire_osm, already attached spData package, provide good example.\nPlotting shows often closely related 
touch, shown Figure 4.5, base version created following code :\ncan check points st_intersects() shown :\nFIGURE 4.5: spatial distribution cycle hire points London based official data (blue) OpenStreetMap data (red).\nImagine need join capacity variable cycle_hire_osm onto official ‘target’ data contained cycle_hire.\nnon-overlapping join needed.\nsimplest method use topological operator st_is_within_distance(), demonstrated using threshold distance 20 m (note works projected unprojected data).shows 438 points target object cycle_hire within threshold distance cycle_hire_osm.\nretrieve values associated respective cycle_hire_osm points?\nsolution st_join(), addition dist argument (set 20 m ):Note number rows joined result greater target.\ncycle hire stations cycle_hire multiple matches cycle_hire_osm.\naggregate values overlapping points return mean, can use aggregation methods learned Chapter 3, resulting object number rows target:capacity nearby stations can verified comparing plot capacity source cycle_hire_osm data results new object (plots shown):result join used spatial operation change attribute data associated simple features; geometry associated feature remained unchanged.","code":"\nplot(st_geometry(cycle_hire), col = \"blue\")\nplot(st_geometry(cycle_hire_osm), add = TRUE, pch = 3, col = \"red\")\nany(st_touches(cycle_hire, cycle_hire_osm, sparse = FALSE))\n#> [1] FALSE\nsel = st_is_within_distance(cycle_hire, cycle_hire_osm, dist = 20)\nsummary(lengths(sel) > 0)\n#> Mode FALSE TRUE \n#> logical 304 438\nz = st_join(cycle_hire, cycle_hire_osm, st_is_within_distance, dist = 20)\nnrow(cycle_hire)\n#> [1] 742\nnrow(z)\n#> [1] 762\nz = z %>% \n group_by(id) %>% \n summarize(capacity = mean(capacity))\nnrow(z) == nrow(cycle_hire)\n#> [1] TRUE\nplot(cycle_hire_osm[\"capacity\"])\nplot(z[\"capacity\"])"},{"path":"spatial-operations.html","id":"spatial-aggr","chapter":"4 Spatial data operations","heading":"4.2.5 Spatial aggregation","text":"attribute data 
aggregation, spatial data aggregation condenses data: aggregated outputs have fewer rows than non-aggregated inputs.\nStatistical aggregating functions, such as mean average or sum, summarise multiple values of a variable, and return a single value per grouping variable.\nSection 3.2.3 demonstrated how aggregate() and group_by() %>% summarize() condense data based on attribute variables; this section shows how the same functions work with spatial objects.\nReturning to the example of New Zealand, imagine you want to find out the average height of high points in each region: it is the geometry of the source (y or nz in this case) that defines how values in the target object (x or nz_height) are grouped.\nThis can be done in a single line of code with base R's aggregate() method:\nThe result of the previous command is an sf object with the same geometry as the (spatial) aggregating object (nz), which you can verify with the command identical(st_geometry(nz), st_geometry(nz_agg)).\nThe result of the previous operation is illustrated in Figure 4.6, which shows the average value of features in nz_height within each of New Zealand's 16 regions.\nThe same result can also be generated by piping the output from st_join() into the 'tidy' functions group_by() and summarize() as follows:\nFIGURE 4.6: Average height of the top 101 high points across the regions of New Zealand.\nThe resulting nz_agg objects have the same geometry as the aggregating object nz but with a new column summarising the values of x in each region using the function mean() (which could, of course, be replaced by median(), sd() or other functions that return a single value).\nNote: one difference between the aggregate() and group_by() %>% summarize() approaches is that the former results in NA values for unmatching region names, while the latter preserves region names and is more flexible in terms of aggregating functions and column names of the results.\nFor aggregating operations that also create new geometries, see Section 5.2.6.","code":"\nnz_agg = aggregate(x = nz_height, by = nz, FUN = mean)\nnz_agg2 = st_join(x = nz, y = nz_height) %>%\n group_by(Name) %>%\n summarize(elevation = mean(elevation, na.rm = TRUE))"},{"path":"spatial-operations.html","id":"incongruent","chapter":"4 Spatial data operations","heading":"4.2.6 Joining incongruent layers","text":"Spatial congruence is an important concept related to spatial aggregation.\nAn aggregating object (which we will refer to as y) is congruent with the target object (x) if the two objects have
shared borders.\nOften case administrative boundary data, whereby larger units — Middle Layer Super Output Areas (MSOAs) UK districts many European countries — composed many smaller units.Incongruent aggregating objects, contrast, share common borders target (Qiu, Zhang, Zhou 2012).\nproblematic spatial aggregation (spatial operations) illustrated Figure 4.7: aggregating centroid sub-zone return accurate results.\nAreal interpolation overcomes issue transferring values one set areal units another, using range algorithms including simple area weighted approaches sophisticated approaches ‘pycnophylactic’ methods (Tobler 1979).\nFIGURE 4.7: Illustration congruent (left) incongruent (right) areal units respect larger aggregating zones (translucent blue borders).\nspData package contains dataset named incongruent (colored polygons black borders right panel Figure 4.7) dataset named aggregating_zones (two polygons translucent blue border right panel Figure 4.7).\nLet us assume value column incongruent refers total regional income million Euros.\ncan transfer values underlying nine spatial polygons two polygons aggregating_zones?simplest useful method area weighted spatial interpolation, transfers values incongruent object new column aggregating_zones proportion area overlap: larger spatial intersection input output features, larger corresponding value.\nimplemented st_interpolate_aw(), demonstrated code chunk .case meaningful sum values intersections falling aggregating zones since total income -called spatially extensive variable (increases area), assuming income evenly distributed across smaller zones (hence warning message ).\ndifferent spatially intensive variables average income percentages, increase area increases.\nst_interpolate_aw() works equally spatially intensive variables: set extensive parameter FALSE use average rather sum function aggregation.","code":"\niv = incongruent[\"value\"] # keep only the values to be transferred\nagg_aw = st_interpolate_aw(iv, 
aggregating_zones, ext = TRUE)\n#> Warning in st_interpolate_aw.sf(iv, aggregating_zones, ext = TRUE):\n#> st_interpolate_aw assumes attributes are constant or uniform over areas of x\nagg_aw$value\n#> [1] 19.6 25.7"},{"path":"spatial-operations.html","id":"distance-relations","chapter":"4 Spatial data operations","heading":"4.2.7 Distance relations","text":"topological relations binary — feature either intersects another — distance relations continuous.\ndistance two objects calculated st_distance() function.\nillustrated code chunk , finds distance highest point New Zealand geographic centroid Canterbury region, created Section 4.2.1:\ntwo potentially surprising things result:units, telling us distance 100,000 meters, 100,000 inches, measure distanceIt returned matrix, even though result contains single valueThis second feature hints another useful feature st_distance(), ability return distance matrices combinations features objects x y.\nillustrated command , finds distances first three features nz_height Otago Canterbury regions New Zealand represented object co.Note distance second third features nz_height second feature co zero.\ndemonstrates fact distances points polygons refer distance part polygon:\nsecond third points nz_height Otago, can verified plotting (result shown):","code":"\nnz_highest = nz_height %>% top_n(n = 1, wt = elevation)\ncanterbury_centroid = st_centroid(canterbury)\nst_distance(nz_highest, canterbury_centroid)\n#> Units: [m]\n#> [,1]\n#> [1,] 115540\nco = filter(nz, grepl(\"Canter|Otag\", Name))\nst_distance(nz_height[1:3, ], co)\n#> Units: [m]\n#> [,1] [,2]\n#> [1,] 123537 15498\n#> [2,] 94283 0\n#> [3,] 93019 0\nplot(st_geometry(co)[2])\nplot(st_geometry(nz_height)[2:3], add = TRUE)"},{"path":"spatial-operations.html","id":"spatial-ras","chapter":"4 Spatial data operations","heading":"4.3 Spatial operations on raster data","text":"section builds Section 3.3, highlights various basic methods manipulating raster datasets, demonstrate 
advanced explicitly spatial raster operations, uses objects elev grain manually created Section 3.3.\nreader’s convenience, datasets can also found spData package.","code":""},{"path":"spatial-operations.html","id":"spatial-raster-subsetting","chapter":"4 Spatial data operations","heading":"4.3.1 Spatial subsetting","text":"previous chapter (Section 3.3) demonstrated retrieve values associated specific cell IDs row column combinations.\nRaster objects can also extracted location (coordinates) spatial objects.\nuse coordinates subsetting, one can ‘translate’ coordinates cell ID terra function cellFromXY().\nalternative use terra::extract() (careful, also function called extract() tidyverse) extract values.\nmethods demonstrated find value cell covers point located coordinates 0.1, 0.1.\n\nRaster objects can also subset another raster object, demonstrated code chunk :amounts retrieving values first raster object (case elev) fall within extent second raster (: clip), illustrated Figure 4.8.\nFIGURE 4.8: Original raster (left). Raster mask (middle). 
Output masking raster (right).\nexample returned values specific cells, many cases spatial outputs subsetting operations raster datasets needed.\ncan done using [ operator, drop = FALSE, outlined Section 3.3, also shows raster objects can subsetted various objects.\ndemonstrated code , returns first two cells elev raster object first two cells top row (first 2 lines output shown):Another common use case spatial subsetting raster logical (NA) values used mask another raster extent resolution, illustrated Figure 4.8.\ncase, [ mask() functions can used (results shown):code chunk , created mask object called rmask values randomly assigned NA TRUE.\nNext, want keep values elev TRUE rmask.\nwords, want mask elev rmask.approach can also used replace values (e.g., expected wrong) NA.operations fact Boolean local operations since compare cell-wise two rasters.\nnext subsection explores related operations detail.","code":"\nid = cellFromXY(elev, xy = matrix(c(0.1, 0.1), ncol = 2))\nelev[id]\n# the same as\nterra::extract(elev, matrix(c(0.1, 0.1), ncol = 2))\nclip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,\n resolution = 0.3, vals = rep(1, 9))\nelev[clip]\n# we can also use extract\n# terra::extract(elev, ext(clip))\nelev[1:2, drop = FALSE] # spatial subsetting with cell IDs\n#> class : SpatRaster \n#> dimensions : 1, 2, 1 (nrow, ncol, nlyr)\n#> ...\n# create raster mask\nrmask = elev\nvalues(rmask) = sample(c(NA, TRUE), 36, replace = TRUE)\n# spatial subsetting\nelev[rmask, drop = FALSE] # with [ operator\nmask(elev, rmask) # with mask()\nelev[elev < 20] = NA"},{"path":"spatial-operations.html","id":"map-algebra","chapter":"4 Spatial data operations","heading":"4.3.2 Map algebra","text":"\nterm ‘map algebra’ coined late 1970s describe “set conventions, capabilities, techniques” analysis geographic raster (although less prominently) vector data (Tomlin 1994).\n\ncontext, define map algebra narrowly, operations modify summarise raster cell values, reference 
surrounding cells, zones, statistical functions apply every cell.Map algebra operations tend fast, raster datasets implicitly store coordinates (hence oversimplifying phrase “raster faster vector corrector”).\nlocation cells raster datasets can calculated using matrix position resolution origin dataset (stored header).\nprocessing, however, geographic position cell barely relevant long make sure cell position still processing.\nAdditionally, two raster datasets share extent, projection resolution, one treat matrices processing.way map algebra works terra package.\nFirst, headers raster datasets queried (cases map algebra operations work one dataset) checked ensure datasets compatible.\nSecond, map algebra retains -called one--one locational correspondence, meaning cells move.\ndiffers matrix algebra, values change position, example multiplying dividing matrices.Map algebra (cartographic modeling raster data) divides raster operations four subclasses (Tomlin 1990), working one several grids simultaneously:Local per-cell operationsFocal neighborhood operations.\noften output cell value result 3 x 3 input cell blockZonal operations similar focal operations, surrounding pixel grid new values computed can irregular sizes shapesGlobal per-raster operations; means output cell derives value potentially one several entire rastersThis typology classifies map algebra operations number cells used pixel processing step type output.\nsake completeness, mention raster operations can also classified discipline terrain, hydrological analysis, image classification.\nfollowing sections explain type map algebra operations can used, reference worked examples.","code":""},{"path":"spatial-operations.html","id":"local-operations","chapter":"4 Spatial data operations","heading":"4.3.3 Local operations","text":"\nLocal operations comprise cell--cell operations one several layers.\nRaster algebra classical use case local operations – includes adding subtracting values raster, squaring 
multiplying rasters.\nRaster algebra also allows logical operations finding raster cells greater specific value (5 example ).\nterra package supports operations , demonstrated (Figure 4.9):\nFIGURE 4.9: Examples different local operations elev raster object: adding two rasters, squaring, applying logarithmic transformation, performing logical operation.\nAnother good example local operations classification intervals numeric values groups grouping digital elevation model low (class 1), middle (class 2) high elevations (class 3).\nUsing classify() command, need first construct reclassification matrix, first column corresponds lower second column upper end class.\nthird column represents new value specified ranges column one two., assign raster values ranges 0–12, 12–24 24–36 reclassified take values 1, 2 3, respectively.classify() function can also used want reduce number classes categorical rasters.\nperform several additional reclassifications Chapter 13.Apart arithmetic operators, one can also use app(), tapp() lapp() functions.\nefficient, hence, preferable presence large raster datasets.\nAdditionally, allow save output file directly.\napp() function applies function cell raster used summarize (e.g., calculating sum) values multiple layers one layer.\ntapp() extension app(), allowing us select subset layers (see index argument) want perform certain operation.\nFinally, lapp() function allows apply function cell using layers arguments – application lapp() presented .calculation normalized difference vegetation index (NDVI) well-known local (pixel--pixel) raster operation.\nreturns raster values -1 1; positive values indicate presence living plants (mostly > 0.2).\nNDVI calculated red near-infrared (NIR) bands remotely sensed imagery, typically satellite systems Landsat Sentinel.\nVegetation absorbs light heavily visible light spectrum, especially red channel, reflecting NIR light, explaining NDVI formula:\\[\n\\begin{split}\nNDVI&= \\frac{\\text{NIR} - 
\\text{Red}}{\\text{NIR} + \\text{Red}}\\\\\n\\end{split}\n\\]Let’s calculate NDVI multispectral satellite file Zion National Park.raster object four satellite bands - blue, green, red, near-infrared (NIR).\nnext step implement NDVI formula R function:function accepts two numerical arguments, nir red, returns numerical vector NDVI values.\ncan used fun argument lapp().\njust need remember function just needs two bands (four original raster), need NIR, red order.\nsubset input raster multi_rast[[c(4, 3)]] calculations.result, shown right panel Figure 4.10, can compared RGB image area (left panel Figure).\nallows us see largest NDVI values connected areas dense forest northern parts area, lowest values related lake north snowy mountain ridges.\nFIGURE 4.10: RGB image (left) NDVI values (right) calculated example satellite file Zion National Park\nPredictive mapping another interesting application local raster operations.\nresponse variable corresponds measured observed points space, example, species richness, presence landslides, tree disease crop yield.\nConsequently, can easily retrieve space- airborne predictor variables various rasters (elevation, pH, precipitation, temperature, landcover, soil class, etc.).\nSubsequently, model response function predictors using lm(), glm(), gam() machine-learning technique.\nSpatial predictions raster objects can therefore made applying estimated coefficients predictor raster values, summing output raster values (see Chapter 14).","code":"\nelev + elev\nelev^2\nlog(elev)\nelev > 5\nrcl = matrix(c(0, 12, 1, 12, 24, 2, 24, 36, 3), ncol = 3, byrow = TRUE)\nrcl\n#> [,1] [,2] [,3]\n#> [1,] 0 12 1\n#> [2,] 12 24 2\n#> [3,] 24 36 3\nrecl = classify(elev, rcl = rcl)\nmulti_raster_file = system.file(\"raster/landsat.tif\", package = \"spDataLarge\")\nmulti_rast = rast(multi_raster_file)\nndvi_fun = function(nir, red){\n (nir - red) / (nir + red)\n}\nndvi_rast = lapp(multi_rast[[c(4, 3)]], fun = 
ndvi_fun)"},{"path":"spatial-operations.html","id":"focal-operations","chapter":"4 Spatial data operations","heading":"4.3.4 Focal operations","text":"\nlocal functions operate one cell, though possibly multiple layers, focal operations take account central (focal) cell neighbors.\nneighborhood (also named kernel, filter moving window) consideration typically size 3--3 cells (central cell eight surrounding neighbors), can take (necessarily rectangular) shape defined user.\nfocal operation applies aggregation function cells within specified neighborhood, uses corresponding output new value central cell, moves next central cell (Figure 4.11).\nnames operation spatial filtering convolution (Burrough, McDonnell, Lloyd 2015).R, can use focal() function perform spatial filtering.\ndefine shape moving window matrix whose values correspond weights (see w parameter code chunk ).\nSecondly, fun parameter lets us specify function wish apply neighborhood.\n, choose minimum, summary function, including sum(), mean(), var() can used.function also accepts additional arguments, example, remove NAs process (na.rm = TRUE) (na.rm = FALSE).\nFIGURE 4.11: Input raster (left) resulting output raster (right) due focal operation - finding minimum value 3--3 moving windows.\ncan quickly check output meets expectations.\nexample, minimum value always upper left corner moving window (remember created input raster row-wise incrementing cell values one starting upper left corner).\nexample, weighting matrix consists 1s, meaning cell weight output, can changed.Focal functions filters play dominant role image processing.\nLow-pass smoothing filters use mean function remove extremes.\ncase categorical data, can replace mean mode, common value.\ncontrast, high-pass filters accentuate features.\nline detection Laplace Sobel filters might serve example .\nCheck focal() help page use R (also used exercises end chapter).Terrain processing, calculation topographic characteristics slope, aspect flow 
directions, relies focal functions.\nterrain() can used calculate metrics, although terrain algorithms, including Zevenbergen Thorne method compute slope, implemented terra function.\nMany algorithms — including curvatures, contributing areas wetness indices — implemented open source desktop geographic information system (GIS) software.\nChapter 9 shows access GIS functionality within R.","code":"\nr_focal = focal(elev, w = matrix(1, nrow = 3, ncol = 3), fun = min)"},{"path":"spatial-operations.html","id":"zonal-operations","chapter":"4 Spatial data operations","heading":"4.3.5 Zonal operations","text":"\nJust like focal operations, zonal operations apply aggregation function multiple raster cells.\nHowever, second raster, usually categorical values, defines zonal filters (‘zones’) case zonal operations, opposed predefined neighborhood window case focal operation presented previous section.\nConsequently, raster cells defining zonal filter necessarily neighbors.\ngrain size raster good example, illustrated right panel Figure 3.2): different grain sizes spread irregularly throughout raster.\nFinally, result zonal operation summary table grouped zone operation also known zonal statistics GIS world.\ncontrast focal operations return raster object.following code chunk uses zonal() function calculate mean elevation associated grain size class, example.\noutput shown Figure 3.2).returns statistics category, mean altitude grain size class.\nNote: also possible get raster calculated statistics zone setting .raster argument TRUE.","code":"\nz = zonal(elev, grain, fun = \"mean\")\nz\n#> grain elev\n#> 1 clay 14.8\n#> 2 silt 21.2\n#> 3 sand 18.7"},{"path":"spatial-operations.html","id":"global-operations-and-distances","chapter":"4 Spatial data operations","heading":"4.3.6 Global operations and distances","text":"Global operations special case zonal operations entire raster dataset representing single zone.\ncommon global operations descriptive statistics entire raster 
dataset minimum maximum – already discussed Section 3.3.2.Aside , global operations also useful computation distance weight rasters.\nfirst case, one can calculate distance cell specific target cell.\nexample, one might want compute distance nearest coast (see also terra::distance()).\nmight also want consider topography, means, interested pure distance like also avoid crossing mountain ranges going coast.\n, can weight distance elevation additional altitudinal meter ‘prolongs’ Euclidean distance.\nVisibility viewshed computations also belong family global operations (exercises Chapter 9, compute viewshed raster).","code":""},{"path":"spatial-operations.html","id":"map-algebra-counterparts-in-vector-processing","chapter":"4 Spatial data operations","heading":"4.3.7 Map algebra counterparts in vector processing","text":"Many map algebra operations counterpart vector processing (Liu Mason 2009).\nComputing distance raster (global operation) considering maximum distance (logical focal operation) equivalent vector buffer operation (Section 5.2.5).\nReclassifying raster data (either local zonal function depending input) equivalent dissolving vector data (Section 4.2.3).\nOverlaying two rasters (local operation), one contains NULL NA values representing mask, similar vector clipping (Section 5.2.5).\nQuite similar spatial clipping intersecting two layers (Section 4.2.1).\ndifference two layers (vector raster) simply share overlapping area (see Figure 5.8 example).\nHowever, careful wording.\nSometimes words slightly different meanings raster vector data models.\nAggregating case vector data refers dissolving polygons, means increasing resolution case raster data.\nfact, one see dissolving aggregating polygons decreasing resolution.\nHowever, zonal operations might better raster equivalent compared changing cell resolution.\nZonal operations can dissolve cells one raster accordance zones (categories) another raster using aggregation function (see 
).","code":""},{"path":"spatial-operations.html","id":"merging-rasters","chapter":"4 Spatial data operations","heading":"4.3.8 Merging rasters","text":"\nSuppose like compute NDVI (see Section 4.3.3), additionally want compute terrain attributes elevation data observations within study area.\ncomputations rely remotely sensed information.\ncorresponding imagery often divided scenes covering specific spatial extent, frequently, study area covers one scene.\n, need merge scenes covered study area.\neasiest case, can just merge scenes, put side side.\npossible, example, digital elevation data (SRTM, ASTER).\nfollowing code chunk first download SRTM elevation data Austria Switzerland (country codes, see geodata function country_codes()).\nsecond step, merge two rasters one.terra’s merge() command combines two images, case overlap, uses value first raster.\n\n\n\n\n\nmerging approach little use overlapping values correspond .\nfrequently case want combine spectral imagery scenes taken different dates.\nmerge() command still work see clear border resulting image.\nhand, mosaic() command lets define function overlapping area.\ninstance, compute mean value – might smooth clear border merged result likely make disappear.\n, need advanced approach.\nRemote sensing scientists frequently apply histogram matching use regression techniques align values first image second image.\npackages landsat (histmatch(), relnorm(), PIF()), satellite (calcHistMatch()) RStoolbox (histMatch(), pifMatch()) provide corresponding functions raster’s package objects.\ndetailed introduction use R remote sensing, refer reader Wegmann, Leutner, Dech (2016).\n\n","code":"\naut = geodata::elevation_30s(country = \"AUT\", path = tempdir())\nch = geodata::elevation_30s(country = \"CHE\", path = tempdir())\naut_ch = merge(aut, ch)"},{"path":"spatial-operations.html","id":"exercises-2","chapter":"4 Spatial data operations","heading":"4.4 Exercises","text":"E1. 
established Section 4.2 Canterbury region New Zealand containing 100 highest points country.\nmany high points Canterbury region contain?E2. region second highest number nz_height points , many ?E3. Generalizing question regions: many New Zealand’s 16 regions contain points belong top 100 highest points country? regions?Bonus: create table listing regions order number points name.E4. Use dem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\")), reclassify elevation three classes: low (<300), medium high (>500).\nSecondly, read NDVI raster (ndvi = rast(system.file(\"raster/ndvi.tif\", package = \"spDataLarge\"))) compute mean NDVI mean elevation altitudinal class.E5. Apply line detection filter rast(system.file(\"ex/logo.tif\", package = \"terra\")).\nPlot result.\nHint: Read ?terra::focal().E6. Calculate Normalized Difference Water Index (NDWI; (green - nir)/(green + nir)) Landsat image.\nUse Landsat image provided spDataLarge package (system.file(\"raster/landsat.tif\", package = \"spDataLarge\")).\nAlso, calculate correlation NDVI NDWI area.E7. StackOverflow post shows compute distances nearest coastline using raster::distance().\nTry something similar terra::distance(): retrieve digital elevation model Spain, compute raster represents distances coast across country (hint: use geodata::elevation_30s()).\nConvert resulting distances meters kilometers.\nNote: may wise increase cell size input raster reduce compute time operation.E8. 
Try modify approach used exercise weighting distance raster elevation raster; every 100 altitudinal meters increase distance coast 10 km.\nNext, compute visualize difference raster created using Euclidean distance (E7) raster weighted elevation.","code":""},{"path":"geometric-operations.html","id":"geometric-operations","chapter":"5 Geometry operations","heading":"5 Geometry operations","text":"","code":""},{"path":"geometric-operations.html","id":"prerequisites-3","chapter":"5 Geometry operations","heading":"Prerequisites","text":"chapter uses packages Chapter 4 addition spDataLarge, installed Chapter 2:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(spDataLarge)"},{"path":"geometric-operations.html","id":"introduction-2","chapter":"5 Geometry operations","heading":"5.1 Introduction","text":"previous three chapters demonstrated geographic datasets structured R (Chapter 2) manipulate based non-geographic attributes (Chapter 3) spatial properties (Chapter 4).\nchapter extends skills.\nreading — attempting exercises end — understand control geometry column sf objects geographic location pixels represented rasters.Section 5.2 covers transforming vector geometries ‘unary’ ‘binary’ operations.\nUnary operations work single geometry isolation.\nincludes simplification (lines polygons), creation buffers centroids, shifting/scaling/rotating single geometries using ‘affine transformations’ (Sections 5.2.1 5.2.4).\nBinary transformations modify one geometry based shape another.\nincludes clipping geometry unions, covered Sections 5.2.5 5.2.6, respectively.\nType transformations (polygon line, example) demonstrated Section 5.2.7.Section 5.3 covers geometric transformations raster objects.\ninvolves changing size number underlying pixels, assigning new values.\nteaches change resolution (also called raster aggregation disaggregation), extent origin raster.\noperations especially useful 
one like align raster datasets diverse sources.\nAligned raster objects share one--one correspondence pixels, allowing processed using map algebra operations, described Section 4.3.2. final Section 5.4 connects vector raster objects.\nshows raster values can ‘masked’ ‘extracted’ vector geometries.\nImportantly shows ‘polygonize’ rasters ‘rasterize’ vector datasets, making two data models interchangeable.","code":""},{"path":"geometric-operations.html","id":"geo-vec","chapter":"5 Geometry operations","heading":"5.2 Geometric operations on vector data","text":"section operations way change geometry vector (sf) objects.\nadvanced spatial data operations presented previous chapter (Section 4.2), drill geometry:\nfunctions discussed section work objects class sfc addition objects class sf.","code":""},{"path":"geometric-operations.html","id":"simplification","chapter":"5 Geometry operations","heading":"5.2.1 Simplification","text":"\nSimplification process generalization vector objects (lines polygons) usually use smaller scale maps.\nAnother reason simplifying objects reduce amount memory, disk space network bandwidth consume:\nmay wise simplify complex geometries publishing interactive maps.\nsf package provides st_simplify(), uses GEOS implementation Douglas-Peucker algorithm reduce vertex count.\nst_simplify() uses dTolerance control level generalization map units (see Douglas Peucker 1973 details).\nFigure 5.1 illustrates simplification LINESTRING geometry representing river Seine tributaries.\nsimplified geometry created following command:\nFIGURE 5.1: Comparison original simplified geometry seine object.\nresulting seine_simp object copy original seine fewer vertices.\napparent, result visually simpler (Figure 5.1, right) consuming less memory original object, verified :Simplification also applicable polygons.\nillustrated using us_states, representing contiguous United States.\nshow Chapter 6, GEOS assumes data projected CRS lead unexpected results using 
geographic CRS.\nTherefore, first step project data adequate projected CRS, US National Atlas Equal Area (epsg = 2163) (left Figure 5.2):st_simplify() works equally well projected polygons:limitation st_simplify() simplifies objects per-geometry basis.\nmeans ‘topology’ lost, resulting overlapping ‘holey’ areal units illustrated Figure 5.2 (middle panel).\nms_simplify() rmapshaper provides alternative overcomes issue.\ndefault uses Visvalingam algorithm, overcomes limitations Douglas-Peucker algorithm (Visvalingam Whyatt 1993).\n\nfollowing code chunk uses function simplify us_states2163.\nresult 1% vertices input (set using argument keep) number objects remains intact set keep_shapes = TRUE:23Finally, visual comparison original dataset two simplified versions shows differences Douglas-Peucker (st_simplify) Visvalingam (ms_simplify) algorithm outputs (Figure 5.2):\nFIGURE 5.2: Polygon simplification action, comparing original geometry contiguous United States simplified versions, generated functions sf (center) rmapshaper (right) packages.\n","code":"\nseine_simp = st_simplify(seine, dTolerance = 2000) # 2000 m\nobject.size(seine)\n#> 18096 bytes\nobject.size(seine_simp)\n#> 9112 bytes\nus_states2163 = st_transform(us_states, 2163)\nus_states_simp1 = st_simplify(us_states2163, dTolerance = 100000) # 100 km\n# proportion of points to retain (0-1; default 0.05)\nus_states2163$AREA = as.numeric(us_states2163$AREA)\nus_states_simp2 = rmapshaper::ms_simplify(us_states2163, keep = 0.01,\n keep_shapes = TRUE)"},{"path":"geometric-operations.html","id":"centroids","chapter":"5 Geometry operations","heading":"5.2.2 Centroids","text":"\nCentroid operations identify center geographic objects.\nLike statistical measures central tendency (including mean median definitions ‘average’), many ways define geographic center object.\ncreate single point representations complex vector objects.commonly used centroid operation geographic centroid.\ntype centroid operation (often referred 
‘centroid’) represents center mass spatial object (think balancing plate finger).\nGeographic centroids many uses, example create simple point representation complex geometries, estimate distances polygons.\ncan calculated sf function st_centroid() demonstrated code , generates geographic centroids regions New Zealand tributaries River Seine, illustrated black points Figure 5.3.Sometimes geographic centroid falls outside boundaries parent objects (think doughnut).\ncases point surface operations can used guarantee point parent object (e.g., labeling irregular multipolygon objects island states), illustrated red points Figure 5.3.\nNotice red points always lie parent objects.\ncreated st_point_on_surface() follows:24\nFIGURE 5.3: Centroids (black points) ‘points surface’ (red points) New Zealand’s regions (left) Seine (right) datasets.\ntypes centroids exist, including Chebyshev center visual center.\nexplore possible calculate using R, ’ll see Chapter 10.","code":"\nnz_centroid = st_centroid(nz)\nseine_centroid = st_centroid(seine)\nnz_pos = st_point_on_surface(nz)\nseine_pos = st_point_on_surface(seine)"},{"path":"geometric-operations.html","id":"buffers","chapter":"5 Geometry operations","heading":"5.2.3 Buffers","text":"\nBuffers polygons representing area within given distance geometric feature:\nregardless whether input point, line polygon, output polygon.\nUnlike simplification (often used visualization reducing file size) buffering tends used geographic data analysis.\nmany points within given distance line?\ndemographic groups within travel distance new shop?\nkinds questions can answered visualized creating buffers around geographic entities interest.Figure 5.4 illustrates buffers different sizes (5 50 km) surrounding river Seine tributaries.\nbuffers created commands , show command st_buffer() requires least two arguments: input geometry distance, provided units CRS (case meters):\nFIGURE 5.4: Buffers around Seine dataset 5 km (left) 50 km (right). 
Note colors, reflect fact one buffer created per geometry feature.\n","code":"\nseine_buff_5km = st_buffer(seine, dist = 5000)\nseine_buff_50km = st_buffer(seine, dist = 50000)"},{"path":"geometric-operations.html","id":"affine-transformations","chapter":"5 Geometry operations","heading":"5.2.4 Affine transformations","text":"\nAffine transformation transformation preserves lines parallelism.\nHowever, angles length necessarily preserved.\nAffine transformations include, among others, shifting (translation), scaling rotation.\nAdditionally, possible use combination .\nAffine transformations essential part geocomputation.\nexample, shifting needed labels placement, scaling used non-contiguous area cartograms (see Section 8.6), many affine transformations applied reprojecting improving geometry created based distorted wrongly projected map.\nsf package implements affine transformation objects classes sfg sfc.Shifting moves every point distance map units.\ndone adding numerical vector vector object.\nexample, code shifts y-coordinates 100,000 meters north, leaves x-coordinates untouched (left panel Figure 5.5).Scaling enlarges shrinks objects factor.\ncan applied either globally locally.\nGlobal scaling increases decreases coordinates values relation origin coordinates, keeping geometries topological relations intact.\ncan done subtraction multiplication asfg sfc object.Local scaling treats geometries independently requires points around geometries going scaled, e.g., centroids.\nexample , geometry shrunk factor two around centroids (middle panel Figure 5.5).\nachieve , object firstly shifted way center coordinates 0, 0 ((nz_sfc - nz_centroid_sfc)).\nNext, sizes geometries reduced half (* 0.5).\nFinally, object’s centroid moved back input data coordinates (+ nz_centroid_sfc).Rotation two-dimensional coordinates requires rotation matrix:\\[\nR =\n\\begin{bmatrix}\n\\cos \\theta & -\\sin \\theta \\\\ \n\\sin \\theta & \\cos \\theta \\\\\n\\end{bmatrix}\n\\]rotates 
points clockwise direction.\nrotation matrix can implemented R :rotation function accepts one argument - rotation angle degrees.\nRotation done around selected points, centroids (right panel Figure 5.5).\nSee vignette(\"sf3\") examples.\nFIGURE 5.5: Illustrations affine transformations: shift, scale rotate.\nFinally, newly created geometries can replace old ones st_set_geometry() function:","code":"\nnz_sfc = st_geometry(nz)\nnz_shift = nz_sfc + c(0, 100000)\nnz_centroid_sfc = st_centroid(nz_sfc)\nnz_scale = (nz_sfc - nz_centroid_sfc) * 0.5 + nz_centroid_sfc\nrotation = function(a){\n r = a * pi / 180 #degrees to radians\n matrix(c(cos(r), sin(r), -sin(r), cos(r)), nrow = 2, ncol = 2)\n} \nnz_rotate = (nz_sfc - nz_centroid_sfc) * rotation(30) + nz_centroid_sfc\nnz_scale_sf = st_set_geometry(nz, nz_scale)"},{"path":"geometric-operations.html","id":"clipping","chapter":"5 Geometry operations","heading":"5.2.5 Clipping","text":"\n\nSpatial clipping form spatial subsetting involves changes geometry columns least affected features.Clipping can apply features complex points:\nlines, polygons ‘multi’ equivalents.\nillustrate concept start simple example:\ntwo overlapping circles center point one unit away radius one (Figure 5.6).\nFIGURE 5.6: Overlapping circles.\nImagine want select one circle , space covered x y.\ncan done using function st_intersection(), illustrated using objects named x y represent left- right-hand circles (Figure 5.7).\nFIGURE 5.7: Overlapping circles gray color indicating intersection .\nsubsequent code chunk demonstrates works combinations ‘Venn’ diagram representing x y, inspired Figure 5.1 book R Data Science (Grolemund Wickham 2016).\nFIGURE 5.8: Spatial equivalents logical operators.\nillustrate relationship subsetting clipping spatial data, subset points cover bounding box circles x y Figure 5.8.\npoints inside just one circle, inside inside neither.\nst_sample() used generate simple random distribution points within extent circles x y, 
The resulting output is illustrated in Figure 5.9.\nFIGURE 5.9: Randomly distributed points within the bounding box enclosing circles x and y.\nThe logical operator way would find the points inside both x and y using a spatial predicate such as st_intersects(), whereas the intersection method simply finds the points inside the intersecting region created above as x_and_y.\nAs demonstrated below the results are identical, but the method that uses the clipped polygon is more concise:","code":"\nb = st_sfc(st_point(c(0, 1)), st_point(c(1, 1))) # create 2 points\nb = st_buffer(b, dist = 1) # convert points to circles\nplot(b)\ntext(x = c(-0.5, 1.5), y = 1, labels = c(\"x\", \"y\")) # add text\nx = b[1]\ny = b[2]\nx_and_y = st_intersection(x, y)\nplot(b)\nplot(x_and_y, col = \"lightgrey\", add = TRUE) # color intersecting area\nbb = st_bbox(st_union(x, y))\nbox = st_as_sfc(bb)\nset.seed(2017)\np = st_sample(x = box, size = 10)\nplot(box)\nplot(x, add = TRUE)\nplot(y, add = TRUE)\nplot(p, add = TRUE)\ntext(x = c(-0.5, 1.5), y = 1, labels = c(\"x\", \"y\"))\nsel_p_xy = st_intersects(p, x, sparse = FALSE)[, 1] &\n st_intersects(p, y, sparse = FALSE)[, 1]\np_xy1 = p[sel_p_xy]\np_xy2 = p[x_and_y]\nidentical(p_xy1, p_xy2)\n#> [1] TRUE"},{"path":"geometric-operations.html","id":"geometry-unions","chapter":"5 Geometry operations","heading":"5.2.6 Geometry unions","text":"\n\nAs we saw in Section 3.2.3, spatial aggregation can silently dissolve the geometries of touching polygons in the same group.\nThis is demonstrated in the code chunk below, in which the 49 us_states are aggregated into 4 regions using base and tidyverse functions (see results in Figure 5.10):\nFIGURE 5.10: Spatial aggregation on contiguous polygons, illustrated by aggregating the population of US states into regions, with population represented by color. Note the operation automatically dissolves boundaries between states.\nWhat is going on in terms of the geometries?\nBehind the scenes, both aggregate() and summarize() combine the geometries and dissolve the boundaries between them using st_union().\nThis is demonstrated in the code chunk below, which creates a united western US:The function can also take two geometries and unite them, as demonstrated in the code chunk below, which creates a united western block incorporating Texas (challenge: reproduce and plot the result):","code":"\nregions = aggregate(x = us_states[, \"total_pop_15\"], by = list(us_states$REGION),\n FUN = sum, na.rm = TRUE)\nregions2 = us_states %>% group_by(REGION) %>%\n summarize(pop = sum(total_pop_15, na.rm = TRUE))\nus_west = us_states[us_states$REGION == \"West\", ]\nus_west_union = st_union(us_west)\ntexas = us_states[us_states$NAME == \"Texas\", ]\ntexas_union = st_union(us_west_union, texas)"},{"path":"geometric-operations.html","id":"type-trans","chapter":"5 Geometry operations","heading":"5.2.7 Type transformations","text":"\nGeometry casting is a powerful operation that enables transformation of the geometry type.\nIt is implemented in the st_cast function from the sf package.\nImportantly, st_cast behaves differently on single simple feature geometry (sfg) objects, simple feature geometry columns (sfc) and simple features objects.Let’s create a multipoint to illustrate how geometry casting works on simple feature geometry (sfg) objects:In this case, st_cast can be useful to transform the new object into a linestring or a polygon (Figure 5.11):\nFIGURE 5.11: Examples of a linestring and a polygon casted from a multipoint geometry.\nConversion from multipoint to linestring is a common operation that creates a line object from ordered point observations, such as GPS measurements or geotagged media.\nThis, in turn, allows spatial operations such as calculating the length of the path traveled.\nConversion from multipoint to polygon is often used to calculate an area, for example from a set of GPS measurements taken around a lake or the corners of a building lot.The transformation process can be also reversed using st_cast:Geometry casting of simple features geometry columns (sfc) and simple features objects works the same as for single geometries in most of the cases.\nOne important difference is the conversion between multi-types and non-multi-types: as a result of this process, multi-objects are split into many non-multi-objects.
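This splitting can be sketched with a small self-contained example (a minimal sketch, assuming only that sf is installed): casting one MULTIPOINT feature built from five coordinate pairs to "POINT" yields five separate features.

```r
library(sf)
# one MULTIPOINT feature built from five coordinate pairs
mp_sf = st_sf(geom = st_sfc(st_multipoint(matrix(1:10, ncol = 2))))
pt_sf = st_cast(mp_sf, "POINT")  # split into one POINT per coordinate pair
nrow(pt_sf)  # 5
```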
Table 5.1 shows possible geometry type transformations on simple feature objects.\nEach input simple feature object with only one element (first column) is transformed directly into another geometry type.\nSeveral transformations are not possible, for example, you cannot convert a single point into a multilinestring or a polygon (so the cells [1, 4:5] in the table are NA).\nOn the other hand, some transformations split the single-element input object into a multi-element output object.\nYou can see it, for example, when you cast a multipoint consisting of five pairs of coordinates into a point.\nTABLE 5.1: Geometry casting on simple feature geometries (see Section 2.1) with input type by row and output type by column\nLet’s try to apply geometry type transformations on a new object, multilinestring_sf, as an example (on the left in Figure 5.12):You can imagine it as a road or river network.\nThe new object has only one row that defines all of the lines.\nThis restricts the number of operations that can be done, for example it prevents adding names to each line segment or calculating lengths of single lines.\nThe st_cast function can be used in this situation, as it separates one multilinestring into three linestrings:\nFIGURE 5.12: Examples of type casting between MULTILINESTRING (left) and LINESTRING (right).\nThe newly created object allows for attributes creation (see more in Section 3.2.5) and length measurements:","code":"\nmultipoint = st_multipoint(matrix(c(1, 3, 5, 1, 3, 1), ncol = 2))\nlinestring = st_cast(multipoint, \"LINESTRING\")\npolyg = st_cast(multipoint, \"POLYGON\")\nmultipoint_2 = st_cast(linestring, \"MULTIPOINT\")\nmultipoint_3 = st_cast(polyg, \"MULTIPOINT\")\nall.equal(multipoint, multipoint_2, multipoint_3)\n#> [1] TRUE\nmultilinestring_list = list(matrix(c(1, 4, 5, 3), ncol = 2), \n matrix(c(4, 4, 4, 1), ncol = 2),\n matrix(c(2, 4, 2, 2), ncol = 2))\nmultilinestring = st_multilinestring((multilinestring_list))\nmultilinestring_sf = st_sf(geom = st_sfc(multilinestring))\nmultilinestring_sf\n#> Simple feature collection with 1 feature and 0 fields\n#> Geometry type: MULTILINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom\n#> 1 MULTILINESTRING ((1 5, 4 3)...\nlinestring_sf2 = st_cast(multilinestring_sf, \"LINESTRING\")\nlinestring_sf2\n#> Simple feature collection with 3 features and 0 fields\n#> Geometry type: LINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom\n#> 1 LINESTRING (1 5, 4 3)\n#> 2 LINESTRING (4 4, 4 1)\n#> 3 LINESTRING (2 2, 4 2)\nlinestring_sf2$name = c(\"Riddle Rd\", \"Marshall Ave\", \"Foulke St\")\nlinestring_sf2$length = st_length(linestring_sf2)\nlinestring_sf2\n#> Simple feature collection with 3 features and 2 fields\n#> Geometry type: LINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom name length\n#> 1 LINESTRING (1 5, 4 3) Riddle Rd 3.61\n#> 2 LINESTRING (4 4, 4 1) Marshall Ave 3.00\n#> 3 LINESTRING (2 2, 4 2) Foulke St 2.00"},{"path":"geometric-operations.html","id":"geo-ras","chapter":"5 Geometry operations","heading":"5.3 Geometric operations on raster data","text":"\nGeometric raster operations include the shift, flipping, mirroring, scaling, rotation or warping of images.\nThese operations are necessary for a variety of applications including georeferencing, which is used to allow images to be overlaid on an accurate map with a known CRS (Liu and Mason 2009).\nA variety of georeferencing techniques exist, including:Georectification based on known ground control pointsOrthorectification, which also accounts for local topographyImage registration, used to combine images of the same thing but shot from different sensors by aligning one image with another (in terms of coordinate system and resolution)R is rather unsuitable for the first two techniques since these often require manual intervention, which is why they are usually done with the help of dedicated GIS software (see also Chapter 9).\nOn the other hand, aligning several images is possible in R and this section shows, among others, how to do so.\nThis often includes changing the extent, resolution and origin of an image.\nA matching projection is of course also required, but that is already covered in Section 6.6.In any case, there are other reasons to perform a geometric operation on a single raster image.\nFor instance, in Chapter 13 we define metropolitan areas in Germany as 20 km2 pixels with more than 500,000 inhabitants.\nThe original inhabitant raster,
however, has a resolution of 1 km2 which is why we will decrease (aggregate) its resolution by a factor of 20 (see Section 13.5).\nAnother reason for aggregating a raster is simply to decrease run-time or save disk space.\nOf course, this is only recommended if the coarser resolution is still sufficient for the task at hand.","code":""},{"path":"geometric-operations.html","id":"geometric-intersections","chapter":"5 Geometry operations","heading":"5.3.1 Geometric intersections","text":"\nIn Section 4.3.1 we have shown how to extract values from a raster overlaid by other spatial objects.\nTo retrieve a spatial output, we can use almost the same subsetting syntax.\nThe only difference is that we have to make clear that we would like to keep the matrix structure by setting the drop argument to FALSE.\nThis will return a raster object containing the cells whose midpoints overlap with clip.For the same operation we can also use the intersect() and crop() command.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\nclip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,\n resolution = 0.3, vals = rep(1, 9))\nelev[clip, drop = FALSE]\n#> class : SpatRaster \n#> dimensions : 2, 1, 1 (nrow, ncol, nlyr)\n#> resolution : 0.5, 0.5 (x, y)\n#> extent : 1, 1.5, -0.5, 0.5 (xmin, xmax, ymin, ymax)\n#> coord. ref. : lon/lat WGS 84 (EPSG:4326) \n#> source : memory \n#> name : elev \n#> min value : 18 \n#> max value : 24"},{"path":"geometric-operations.html","id":"extent-and-origin","chapter":"5 Geometry operations","heading":"5.3.2 Extent and origin","text":"\nWhen merging or performing map algebra on rasters, their resolution, projection, origin and/or extent have to match. Otherwise, how should we add the values of one raster with a resolution of 0.2 decimal degrees to a second raster with a resolution of 1 decimal degree?\nThe same problem arises when we would like to merge satellite imagery from different sensors with different projections and resolutions.\nWe can deal with such mismatches by aligning the rasters.In the simplest case, two images only differ with regard to their extent.\nThe following code adds one row and two columns to each side of the raster while setting all new values to an elevation of 1000 meters (Figure 5.13).\nFIGURE 5.13: Original raster (left) and the same raster (right) extended by one row on the top and bottom and two columns on the left and right.\nWhen performing an algebraic operation on two objects with differing extents in R, the terra package returns an error.However, we can align the extent of two rasters with extend().\nInstead of telling the function how many rows or columns should be added (as done before), we allow it to figure this out by using another raster object.\nHere, we extend the elev object to the extent of elev_2.\nThe newly added rows and columns receive the default value of the value parameter, i.e., NA.The origin of a raster is the cell corner closest to the coordinates (0, 0).\nThe origin() function returns the coordinates of the origin.\nIn the example below a cell corner exists with coordinates (0, 0), but that is not necessarily the case.If two rasters have different origins, their cells do not overlap completely which would make map algebra impossible.\nTo change the origin – use origin().25\nLooking at Figure 5.14 reveals the effect of changing the origin.\nFIGURE 5.14: Rasters with identical values but different origins.\nNote that changing the resolution (next section) frequently also changes the origin.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\nelev_2 = extend(elev, c(1, 2))\nelev_3 = elev + elev_2\n#> Error: [+] extents do not match\nelev_4 = extend(elev, elev_2)\norigin(elev_4)\n#> [1] 0 0\n# change the origin\norigin(elev_4) = c(0.25, 0.25)"},{"path":"geometric-operations.html","id":"aggregation-and-disaggregation","chapter":"5 Geometry operations","heading":"5.3.3 Aggregation and disaggregation","text":"\n\nRaster datasets can also differ with regard to their resolution.\nTo match resolutions, one can either decrease (aggregate()) or increase (disagg()) the resolution of one raster.26\nAs an example, we here change the spatial resolution of dem (found in the spDataLarge package) by a factor of 5 (Figure 5.15).
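Numerically, aggregation is just a block summary. The following base-R sketch (illustrative only, not terra's implementation, which additionally handles georeferencing and NA values) reproduces what a factor-2 mean aggregation does to a 4x4 grid:

```r
# aggregate a 4x4 grid by a factor of 2 using the block mean
m = matrix(1:16, nrow = 4)
agg = matrix(NA_real_, nrow = 2, ncol = 2)
for (i in 1:2) {
  for (j in 1:2) {
    block = m[(2 * i - 1):(2 * i), (2 * j - 1):(2 * j)]
    agg[i, j] = mean(block)  # swap in median(), sum(), etc. for other summaries
  }
}
agg  # each output cell is the mean of a 2x2 block; agg[1, 1] is mean(c(1, 2, 5, 6)) = 3.5
```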
Additionally, the output cell value should correspond to the mean of the input cells (note that one could use other functions as well, such as median(), sum(), etc.):\nFIGURE 5.15: Original raster (left). Aggregated raster (right).\nBy contrast, the disagg() function increases the resolution.\nHowever, we have to specify a method on how to fill the new cells.\nThe disagg() function provides two methods.\nThe default one (method = \"near\") simply gives all output cells the value of the input cell, and hence duplicates values, which leads to a blocky output image.The bilinear method, in turn, is an interpolation technique that uses the four nearest pixel centers of the input image (salmon colored points in Figure 5.16) to compute an average weighted by distance (arrows in Figure 5.16) as the value of the output cell - square in the upper left corner in Figure 5.16.\nFIGURE 5.16: The distance-weighted average of the four closest input cells determines the output when using the bilinear method for disaggregation.\nComparing the values of dem and dem_disagg tells us that they are not identical (you can also use compareGeom() or all.equal()).\nHowever, this was hardly to be expected, since disaggregating is a simple interpolation technique.\nIt is important to keep in mind that disaggregating results in a finer resolution; the corresponding values, however, are only as accurate as their lower resolution source.","code":"\ndem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\"))\ndem_agg = aggregate(dem, fact = 5, fun = mean)\ndem_disagg = disagg(dem_agg, fact = 5, method = \"bilinear\")\nidentical(dem, dem_disagg)\n#> [1] FALSE"},{"path":"geometric-operations.html","id":"resampling","chapter":"5 Geometry operations","heading":"5.3.4 Resampling","text":"\nThe above methods of aggregation and disaggregation are only suitable when we want to change the resolution of our raster by the aggregation/disaggregation factor.\nHowever, what to do when we have two rasters with different resolutions and origins?\nThis is the role of resampling – a process of computing values for new pixel locations.\nIn short, this process takes the values of our original raster and recalculates new values for a target raster with custom resolution and origin.Several methods for recalculating (estimating) values for a raster with different resolutions/origins exist (Figure 5.17).\nThey include:Nearest neighbor - assigns the value of the nearest cell of the original raster to the cell of the target one.\nIt is fast and usually suitable for categorical rasters.Bilinear interpolation - assigns a weighted average of the four nearest cells from the original raster to the cell of the target one (Figure 5.16). It is the fastest method that is appropriate for continuous rasters.Cubic interpolation - uses values of the 16 nearest cells of the original raster to determine the output cell value, applying third-order polynomial functions. Used for continuous rasters. It results in a smoother surface than bilinear interpolation, but is also more computationally demanding.Cubic spline interpolation - also uses values of the 16 nearest cells of the original raster to determine the output cell value, but applies cubic splines (piecewise third-order polynomial functions) to derive the results. Used for continuous rasters.Lanczos windowed sinc resampling - uses values of the 36 nearest cells of the original raster to determine the output cell value. Used for continuous rasters.27As you can find in the explanation above, only nearest neighbor is suitable for categorical rasters, while all the methods can be used (with different outcomes) for continuous rasters.\nAdditionally, each successive method requires more processing time.To apply resampling, the terra package provides a resample() function.\nIt accepts an input raster (x), a raster with the target spatial properties (y), and a resampling method (method).We need a raster with the target spatial properties to see how the resample() function works.\nFor this example, we create target_rast, but you would often use an already existing raster object.Next, we need to provide our two raster objects as the first two arguments, together with one of the resampling methods described above.Figure 5.17 shows a comparison of different resampling methods on the dem object.\nFIGURE 5.17: Visual comparison of the original raster and five different resampling methods.\nAs we will see in section 6.6, raster reprojection is a special case of resampling with a target raster of a different CRS than the original raster.The geometry operations of terra are user-friendly, rather fast, and they work on large raster objects.\nHowever, in some cases, terra is not the most performant, either for extensive rasters or for many raster files, and some alternatives should be considered.The most established alternatives come with the GDAL library.\nIt contains several utility functions, including:gdalinfo - lists various pieces of information about a raster file, including its resolution, CRS,
bounding box, and moregdal_translate - converts raster data between different file formatsgdal_rasterize - converts vector data into raster filesgdalwarp - allows for raster mosaicing, resampling, cropping, and reprojecting","code":"\ntarget_rast = rast(xmin = 794600, xmax = 798200, \n ymin = 8931800, ymax = 8935400,\n resolution = 150, crs = \"EPSG:32717\")\ndem_resampl = resample(dem, y = target_rast, method = \"bilinear\")"},{"path":"geometric-operations.html","id":"raster-vector","chapter":"5 Geometry operations","heading":"5.4 Raster-vector interactions","text":"\nThis section focuses on interactions between raster and vector geographic data models, introduced in Chapter 2.\nIt includes four main techniques:\nraster cropping and masking using vector objects (Section 5.4.1);\nextracting raster values using different types of vector data (Section 5.4.2);\nand raster-vector conversion (Sections 5.4.3 and 5.4.4).\nThe above concepts are demonstrated using data used in previous chapters, to understand their potential real-world applications.","code":""},{"path":"geometric-operations.html","id":"raster-cropping","chapter":"5 Geometry operations","heading":"5.4.1 Raster cropping","text":"\nMany geographic data projects involve integrating data from many different sources, such as remote sensing images (rasters) and administrative boundaries (vectors).\nOften the extent of input raster datasets is larger than the area of interest.\nIn this case raster cropping and masking are useful for unifying the spatial extent of input data.\nBoth operations reduce object memory use and the associated computational resources for subsequent analysis steps, and may be a necessary preprocessing step before creating attractive maps involving raster data.We will use two objects to illustrate raster cropping:A SpatRaster object srtm representing elevation (meters above sea level) in south-western Utah.A vector (sf) object zion representing Zion National Park.Both target and cropping objects must have the same projection.\nThe following code chunk therefore not only reads the datasets from the spDataLarge package (installed in Chapter 2), it also reprojects zion (see Section 6 for more on reprojection):We use crop() from the terra package to crop the srtm raster.\nThe function reduces the rectangular
extent of the object passed to its first argument based on the extent of the object passed to its second argument, as demonstrated in the command below (which generates Figure 5.18(B) — note the smaller extent of the raster background):\nRelated to crop() is the terra function mask(), which sets values outside of the bounds of the object passed to its second argument to NA.\nThe following command therefore masks every cell outside of the Zion National Park boundaries (Figure 5.18(C)):Importantly, we want to use both crop() and mask() together in most cases.\nThis combination of functions would (a) limit the raster’s extent to our area of interest and (b) replace all of the values outside of the area to NA.Changing the settings of mask() yields different results.\nSetting updatevalue = 0, for example, will set all pixels outside the national park to 0.\nSetting inverse = TRUE will mask everything inside the bounds of the park (see ?mask for details) (Figure 5.18(D)).\nFIGURE 5.18: Illustration of raster cropping and raster masking.\n","code":"\nsrtm = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\nzion = st_read(system.file(\"vector/zion.gpkg\", package = \"spDataLarge\"))\nzion = st_transform(zion, crs(srtm))\nsrtm_cropped = crop(srtm, vect(zion))\nsrtm_masked = mask(srtm, vect(zion))\nsrtm_cropped = crop(srtm, vect(zion))\nsrtm_final = mask(srtm_cropped, vect(zion))\nsrtm_inv_masked = mask(srtm, vect(zion), inverse = TRUE)"},{"path":"geometric-operations.html","id":"raster-extraction","chapter":"5 Geometry operations","heading":"5.4.2 Raster extraction","text":"\nRaster extraction is the process of identifying and returning the values associated with a ‘target’ raster at specific locations, based on a (typically vector) geographic ‘selector’ object.\nThe results depend on the type of selector used (points, lines or polygons) and the arguments passed to the terra::extract() function, which we use to demonstrate raster extraction.\nThe reverse of raster extraction — assigning raster cell values based on vector objects — is rasterization, described in Section 5.4.3.The basic example is of extracting the value of a raster cell at specific points.\nFor this purpose, we will use zion_points, which contain a sample of 30 locations within Zion National Park (Figure 5.19).\nThe following command extracts elevation values from srtm and creates a data
frame with the points’ IDs (one value per vector’s row) and the related srtm values for each point.\nNow, we can add the resulting object to our zion_points dataset with the cbind() function:\nFIGURE 5.19: Locations of points used for raster extraction.\nRaster extraction also works with line selectors.\nThen, it extracts one value for each raster cell touched by a line.\nHowever, the line extraction approach is not recommended to obtain values along transects, as it is hard to get the correct distance between each pair of extracted raster values.In this case, a better approach is to split the line into many points and then extract the values for these points.\nTo demonstrate this, the code below creates zion_transect, a straight line going from northwest to southeast of Zion National Park, illustrated in Figure 5.20(A) (see Section 2.2 for a recap on the vector data model):The utility of extracting heights from a linear selector is illustrated by imagining that you are planning a hike.\nThe method demonstrated below provides an ‘elevation profile’ of the route (the line does not need to be straight), useful for estimating how long it will take due to long climbs.The first step is to add a unique id for each transect.\nNext, with the st_segmentize() function we can add points along our line(s) with a provided density (dfMaxLength) and convert them into points with st_cast().Now, we have a large set of points, and we want to derive the distance between the first point in our transects and each of the subsequent points.\nIn this case, we only have one transect, but the code, in principle, should work on any number of transects:Finally, we can extract elevation values for each point in our transects and combine this information with our main object.The resulting zion_transect can be used to create elevation profiles, as illustrated in Figure 5.20(B).\nFIGURE 5.20: Location of a line used for raster extraction (left) and the elevation along this line (right).\nThe final type of geographic vector object for raster extraction is polygons.\nLike lines, polygons tend to return many raster values per polygon.\nThis is demonstrated in the command below, which results in a data frame with column names ID (the row number of the polygon) and srtm (associated elevation values):Such results can be used to generate summary statistics for raster values per polygon, for example to characterize a single region or to compare many regions.\nThe generation of summary statistics is demonstrated in the code below, which creates the object zion_srtm_df containing summary statistics for elevation values in Zion National Park (see Figure 5.21(A)):The preceding code chunk used the tidyverse to provide summary statistics for cell values per polygon ID, as described in Chapter 3.\nThe results provide useful summaries, for example that the maximum height in the park is around 2,661 meters above sea level (other summary statistics, such as standard deviation, can also be calculated in this way).\nBecause there is only one polygon in the example, a data frame with a single row is returned; however, the method also works when multiple selector polygons are used.A similar approach works for counting occurrences of categorical raster values within polygons.\nThis is illustrated with a land cover dataset (nlcd) from the spDataLarge package in Figure 5.21(B), and demonstrated in the code below:\nFIGURE 5.21: Area used for continuous (left) and categorical (right) raster extraction.\n","code":"\ndata(\"zion_points\", package = \"spDataLarge\")\nelevation = terra::extract(srtm, vect(zion_points))\nzion_points = cbind(zion_points, elevation)\nzion_transect = cbind(c(-113.2, -112.9), c(37.45, 37.2)) %>%\n st_linestring() %>% \n st_sfc(crs = crs(srtm)) %>% \n st_sf()\nzion_transect$id = 1:nrow(zion_transect)\nzion_transect = st_segmentize(zion_transect, dfMaxLength = 250)\nzion_transect = st_cast(zion_transect, \"POINT\")\nzion_transect = zion_transect %>% \n group_by(id) %>% \n mutate(dist = st_distance(geometry)[, 1]) \nzion_elev = terra::extract(srtm, vect(zion_transect))\nzion_transect = cbind(zion_transect, zion_elev)\nzion_srtm_values = terra::extract(x = srtm, y = vect(zion))\ngroup_by(zion_srtm_values, ID) %>% \n summarize(across(srtm, list(min = min, mean = mean, max = max)))\n#> # A tibble: 1 × 4\n#> ID srtm_min srtm_mean srtm_max\n#> \n#> 1 1 1122 1818.
2661\nnlcd = rast(system.file(\"raster/nlcd.tif\", package = \"spDataLarge\"))\nzion2 = st_transform(zion, st_crs(nlcd))\nzion_nlcd = terra::extract(nlcd, vect(zion2))\nzion_nlcd %>% \n group_by(ID, levels) %>%\n count()\n#> # A tibble: 7 × 3\n#> # Groups: ID, levels [7]\n#> ID levels n\n#> \n#> 1 1 2 4205\n#> 2 1 3 98285\n#> 3 1 4 298299\n#> 4 1 5 203701\n#> # … with 3 more rows"},{"path":"geometric-operations.html","id":"rasterization","chapter":"5 Geometry operations","heading":"5.4.3 Rasterization","text":"\nRasterization conversion vector objects representation raster objects.\nUsually, output raster used quantitative analysis (e.g., analysis terrain) modeling.\nsaw Chapter 2 raster data model characteristics make conducive certain methods.\nFurthermore, process rasterization can help simplify datasets resulting values spatial resolution: rasterization can seen special type geographic data aggregation.terra package contains function rasterize() work.\nfirst two arguments , x, vector object rasterized , y, ‘template raster’ object defining extent, resolution CRS output.\ngeographic resolution input raster major impact results: low (cell size large), result may miss full geographic variability vector data; high, computational times may excessive.\nsimple rules follow deciding appropriate geographic resolution, heavily dependent intended use results.\nOften target resolution imposed user, example output rasterization needs aligned existing raster.demonstrate rasterization action, use template raster extent CRS input vector data cycle_hire_osm_projected (dataset cycle hire points London illustrated Figure 5.22()) spatial resolution 1000 meters:Rasterization flexible operation: results depend nature template raster, also type input vector (e.g., points, polygons) variety arguments taken rasterize() function.illustrate flexibility try three different approaches rasterization.\nFirst, create raster representing presence absence cycle hire points (known 
presence/absence rasters).\ncase rasterize() requires one argument addition x y (aforementioned vector raster objects): value transferred non-empty cells specified field (results illustrated Figure 5.22(B)).fun argument specifies summary statistics used convert multiple observations close proximity associate cells raster object.\ndefault fun = \"last\" used options fun = \"length\" can used, case count number cycle hire points grid cell (results operation illustrated Figure 5.22(C)).new output, ch_raster2, shows number cycle hire points grid cell.\ncycle hire locations different numbers bicycles described capacity variable, raising question, ’s capacity grid cell?\ncalculate must sum field (\"capacity\"), resulting output illustrated Figure 5.22(D), calculated following command (summary functions mean used):\nFIGURE 5.22: Examples point rasterization.\nAnother dataset based California’s polygons borders (created ) illustrates rasterization lines.\ncasting polygon objects multilinestring, template raster created resolution 0.5 degree:considering line polygon rasterization, one useful additional argument touches.\ndefault FALSE, changed TRUE – cells touched line polygon border get value.\nLine rasterization touches = TRUE demonstrated code (Figure 5.23()).Compare polygon rasterization, touches = FALSE default, selects cells whose centroids inside selector polygon, illustrated Figure 5.23(B).\nFIGURE 5.23: Examples line polygon rasterizations.\n","code":"\ncycle_hire_osm_projected = st_transform(cycle_hire_osm, \"EPSG:27700\")\nraster_template = rast(ext(cycle_hire_osm_projected), resolution = 1000,\n crs = st_crs(cycle_hire_osm_projected)$wkt)\nch_raster1 = rasterize(vect(cycle_hire_osm_projected), raster_template,\n field = 1)\nch_raster2 = rasterize(vect(cycle_hire_osm_projected), raster_template, \n fun = \"length\")\nch_raster3 = rasterize(vect(cycle_hire_osm_projected), raster_template, \n field = \"capacity\", fun = sum)\ncalifornia = dplyr::filter(us_states, 
NAME == \"California\")\ncalifornia_borders = st_cast(california, \"MULTILINESTRING\")\nraster_template2 = rast(ext(california), resolution = 0.5,\n crs = st_crs(california)$wkt)\ncalifornia_raster1 = rasterize(vect(california_borders), raster_template2,\n touches = TRUE)\ncalifornia_raster2 = rasterize(vect(california), raster_template2) "},{"path":"geometric-operations.html","id":"spatial-vectorization","chapter":"5 Geometry operations","heading":"5.4.4 Spatial vectorization","text":"\nSpatial vectorization counterpart rasterization (Section 5.4.3), opposite direction.\ninvolves converting spatially continuous raster data spatially discrete vector data points, lines polygons.simplest form vectorization convert centroids raster cells points.\n.points() exactly non-NA raster grid cells (Figure 5.24).\nNote, also used st_as_sf() convert resulting object sf class.\nFIGURE 5.24: Raster point representation elev object.\nAnother common type spatial vectorization creation contour lines representing lines continuous height temperatures (isotherms) example.\nuse real-world digital elevation model (DEM) artificial raster elev produces parallel lines (task reader: verify explain happens).\nContour lines can created terra function .contour(), wrapper around filled.contour(), demonstrated (shown):Contours can also added existing plots functions contour(), rasterVis::contourplot() tmap::tm_iso().\nillustrated Figure 5.25, isolines can labelled.\nFIGURE 5.25: DEM hillshade southern flank Mt. 
Mongón overlaid contour lines.\nfinal type vectorization involves conversion rasters polygons.\ncan done terra::.polygons(), converts raster cell polygon consisting five coordinates, stored memory (explaining rasters often fast compared vectors!).illustrated converting grain object polygons subsequently dissolving borders polygons attribute values (also see dissolve argument .polygons()).\nFIGURE 5.26: Illustration vectorization raster (left) polygon (center) polygon aggregation (right).\n","code":"\nelev_point = as.points(elev) %>% \n st_as_sf()\ndem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\"))\ncl = as.contour(dem)\nplot(dem, axes = FALSE)\nplot(cl, add = TRUE)\n# create hillshade\nhs = shade(slope = terrain(dem, \"slope\", unit = \"radians\"),\n aspect = terrain(dem, \"aspect\", unit = \"radians\"))\nplot(hs, col = gray(0:100 / 100), legend = FALSE)\n# overlay with DEM\nplot(dem, col = terrain.colors(25), alpha = 0.5, legend = FALSE, add = TRUE)\n# add contour lines\ncontour(dem, col = \"white\", add = TRUE)\ngrain = rast(system.file(\"raster/grain.tif\", package = \"spData\"))\ngrain_poly = as.polygons(grain) %>% \n st_as_sf()"},{"path":"geometric-operations.html","id":"exercises-3","chapter":"5 Geometry operations","heading":"5.5 Exercises","text":"exercises use vector (zion_points) raster dataset (srtm) spDataLarge package.\nalso use polygonal ‘convex hull’ derived vector dataset (ch) represent area interest:E1. Generate plot simplified versions nz dataset.\nExperiment different values keep (ranging 0.5 0.00005) ms_simplify() dTolerance (100 100,000) st_simplify().value form result start break method, making New Zealand unrecognizable?Advanced: different geometry type results st_simplify() compared geometry type ms_simplify()? problems create can resolved?E2. 
first exercise Chapter Spatial data operations established Canterbury region 70 101 highest points New Zealand.\nUsing st_buffer(), many points nz_height within 100 km Canterbury?E3. Find geographic centroid New Zealand.\nfar geographic centroid Canterbury?E4. world maps north-orientation.\nworld map south-orientation created reflection (one affine transformations mentioned chapter) world object’s geometry.\nWrite code .\nHint: need use two-element vector transformation.\nBonus: create upside-map country.E5. Subset point p contained within x y.Using base subsetting operators.Using intermediary object created st_intersection().E6. Calculate length boundary lines US states meters.\nstate longest border shortest?\nHint: st_length function computes length LINESTRING MULTILINESTRING geometry.E7. Crop srtm raster using (1) zion_points dataset (2) ch dataset.\ndifferences output maps?\nNext, mask srtm using two datasets.\nCan see difference now?\ncan explain ?E8. Firstly, extract values srtm points represented zion_points.\nNext, extract average values srtm using 90 buffer around point zion_points compare two sets values.\nextracting values buffers suitable points alone?E9. Subset points higher 3100 meters New Zealand (nz_height object) create template raster resolution 3 km extent new point dataset.\nUsing two new objects:Count numbers highest points grid cell.Find maximum elevation grid cell.E10. Aggregate raster counting high points New Zealand (created previous exercise), reduce geographic resolution half (cells 6 6 km) plot result.Resample lower resolution raster back original resolution 3 km. results changed?Name two advantages disadvantages reducing raster resolution.E11. 
Polygonize grain dataset filter squares representing clay.Name two advantages disadvantages vector data raster data.useful convert rasters vectors work?","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(spData)\nzion_points = read_sf(system.file(\"vector/zion_points.gpkg\", package = \"spDataLarge\"))\nsrtm = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\nch = st_combine(zion_points) %>%\n st_convex_hull() %>% \n st_as_sf()"},{"path":"reproj-geo-data.html","id":"reproj-geo-data","chapter":"6 Reprojecting geographic data","heading":"6 Reprojecting geographic data","text":"","code":""},{"path":"reproj-geo-data.html","id":"prerequisites-4","chapter":"6 Reprojecting geographic data","heading":"Prerequisites","text":"chapter requires following packages (lwgeom also used, need attached):","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(spDataLarge)"},{"path":"reproj-geo-data.html","id":"introduction-3","chapter":"6 Reprojecting geographic data","heading":"6.1 Introduction","text":"Section 2.4 introduced coordinate reference systems (CRSs) demonstrated importance.\nchapter goes .\nhighlights issues can arise using inappropriate CRSs transform data one CRS another.\n\nillustrated Figure 2.1, two types CRSs: geographic (‘lon/lat,’ units degrees longitude latitude) projected (typically units meters datum).\nconsequences.\n\n\n\ncheck data geographic CRS, can use sf::st_is_longlat() vector data terra::.lonlat() raster data.\ncases CRS unknown, shown using example London introduced Section 2.2:shows unless CRS manually specified loaded source CRS metadata, CRS NA.\nCRS can added sf objects st_set_crs() follows:28Datasets without specified CRS can cause problems.subsequent sections go depth, exploring CRS use details reprojecting vector raster objects.","code":"\nlondon = data.frame(lon = -0.1, lat = 51.5) %>% \n st_as_sf(coords = c(\"lon\", 
```r
london_geo = st_set_crs(london, "EPSG:4326")
st_is_longlat(london_geo)
#> [1] TRUE
london_proj = data.frame(x = 530000, y = 180000) %>% 
  st_as_sf(coords = 1:2, crs = "EPSG:27700")
```

## 6.2 When to reproject?

The previous section showed how to set the CRS manually, with `st_set_crs(london, "EPSG:4326")`. In real world applications, however, CRSs are usually set automatically when data is read in. The main task involving CRSs is often to transform objects from one CRS into another. But when should data be transformed? And into which CRS? There are no clear-cut answers to these questions and CRS selection always involves trade-offs (Maling 1992). However, there are some general principles provided in this section that can help you decide.

First it's worth considering when to transform. In some cases transformation to a projected CRS is essential, such as when using geometric functions like `st_buffer()`, as Figure ?? showed. Conversely, publishing data online with the leaflet package may require a geographic CRS. Another case is when two objects with different CRSs must be compared or combined, as shown when we try to find the distance between two objects with different CRSs:

```r
st_distance(london_geo, london_proj)
# > Error: st_crs(x) == st_crs(y) is not TRUE
```

To make the `london` and `london_proj` objects geographically comparable, one of them must be transformed into the CRS of the other. But which CRS to use? The answer is usually 'the projected CRS', which in this case is the British National Grid (EPSG:27700). Once a transformed version of `london` has been created, using the sf function `st_transform()`, the distance between the two representations of London can be found. It may come as a surprise that `london` and `london2` are just over 2 km apart!

```r
london2 = st_transform(london_geo, "EPSG:27700")
st_distance(london2, london_proj)
#> Units: [m]
#>      [,1]
#> [1,] 2018
```

## 6.3 Which CRS to use?

The question of which CRS to use is tricky, and there is rarely a 'right' answer: "There exist no all-purpose projections, all involve distortion when far from the center of the specified frame" (R. Bivand, Pebesma, and Gómez-Rubio 2013).
For geographic CRSs, the answer is often WGS84, not only for web mapping, but also because GPS datasets and thousands of raster and vector datasets are provided in this CRS by default. WGS84 is the most common CRS in the world, so it is worth knowing its EPSG code: 4326. This 'magic number' can be used to convert objects with unusual projected CRSs into something that is widely understood.

What about when a projected CRS is required? In some cases, it is not something we are free to decide: "often the choice of projection is made by a public mapping agency" (R. Bivand, Pebesma, and Gómez-Rubio 2013). This means that when working with local data sources, it is likely preferable to work with the CRS in which the data was provided, to ensure compatibility, even if the official CRS is not the most accurate. The example of London was easy to answer because (a) the British National Grid (with its associated EPSG code 27700) is well known and (b) the original dataset (`london`) already had that CRS.

In cases where an appropriate CRS is not immediately clear, the choice of CRS should depend on the properties that are most important to preserve in the subsequent maps and analysis. All CRSs are either equal-area, equidistant, conformal (with shapes remaining unchanged), or some combination of compromises of those (Section 2.4.2). Custom CRSs with local parameters can be created for a region of interest, and multiple CRSs can be used in projects when no single CRS suits all tasks. 'Geodesic calculations' can provide a fall-back if no CRSs are appropriate (see proj.org/geodesic.html). Regardless of the projected CRS used, the results may not be accurate for geometries covering hundreds of kilometers.

When deciding on a custom CRS, we recommend the following:

- A Lambert azimuthal equal-area (LAEA) projection for a custom local projection (set `lon_0` and `lat_0` to the center of the study area), which is an equal-area projection at all locations but distorts shapes beyond thousands of kilometers
- Azimuthal equidistant (AEQD) projections for a specifically accurate straight-line distance between a point and the center point of the local projection
- Lambert conformal conic (LCC) projections for regions covering thousands of kilometers, with the cone set to keep distance and area properties reasonable between the secant lines
- Stereographic (STERE) projections for polar regions, but taking care not to rely on area and distance calculations thousands of kilometers from the center

One possible approach to automatically select a projected CRS specific to a local dataset is to create an azimuthal equidistant (AEQD) projection
for the center-point of the study area. This involves creating a custom CRS (with no EPSG code) with units of meters based on the center point of a dataset. This approach should be used with caution: no other datasets will be compatible with the custom CRS created, and results may not be accurate when used on extensive datasets covering hundreds of kilometers.

A commonly used default is Universal Transverse Mercator (UTM), a set of CRSs that divides the Earth into 60 longitudinal wedges and 20 latitudinal segments. The transverse Mercator projection used by UTM CRSs is conformal but distorts areas and distances with increasing severity with distance from the center of the UTM zone. Documentation from the GIS software Manifold therefore suggests restricting the longitudinal extent of projects using UTM zones to 6 degrees of the central meridian (source: manifold.net). Almost every place on Earth has a UTM code, such as "60H" which refers to northern New Zealand where R was invented. UTM EPSG codes run sequentially from 32601 to 32660 for northern hemisphere locations and from 32701 to 32760 for southern hemisphere locations.

To show how the system works, let's create a function, `lonlat2UTM()`, to calculate the EPSG code associated with any point on the planet as follows. The commands below use this function to identify the UTM zone and associated EPSG code for Auckland and London. Maps of UTM zones, such as that provided by dmap.co.uk, confirm that London is in UTM zone 30U.

The principles outlined in this section apply equally to vector and raster datasets. Some features of CRS transformation, however, are unique to each geographic data model. We will cover the particularities of vector data transformation in Section 6.4 and those of raster transformation in Section 6.6.

```r
lonlat2UTM = function(lonlat) {
  utm = (floor((lonlat[1] + 180) / 6) %% 60) + 1
  if (lonlat[2] > 0) {
    utm + 32600
  } else {
    utm + 32700
  }
}
epsg_utm_auk = lonlat2UTM(c(174.7, -36.9))
epsg_utm_lnd = lonlat2UTM(st_coordinates(london))
st_crs(epsg_utm_auk)$proj4string
#> [1] "+proj=utm +zone=60 +south +datum=WGS84 +units=m +no_defs"
st_crs(epsg_utm_lnd)$proj4string
#> [1] "+proj=utm +zone=30 +datum=WGS84 +units=m +no_defs"
```
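The zone arithmetic above can be checked without any spatial packages: the zone is `floor((lon + 180) / 6) + 1`, and 32600 (north) or 32700 (south) is added to get the EPSG code. A base R restatement (`utm_epsg()` is our own helper for checking the Auckland and London examples, not part of sf or terra):

```r
# EPSG code for the UTM zone containing a lon/lat point: the same
# arithmetic as lonlat2UTM() above, written with scalar arguments.
utm_epsg = function(lon, lat) {
  zone = (floor((lon + 180) / 6) %% 60) + 1
  if (lat > 0) 32600 + zone else 32700 + zone
}

utm_epsg(174.7, -36.9)  # 32760: zone 60, southern hemisphere (Auckland)
utm_epsg(-0.1, 51.5)    # 32630: zone 30, northern hemisphere (London)
```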
## 6.4 Reprojecting vector geometries

Chapter 2 demonstrated how vector geometries are made up of points, and how points form the basis of more complex objects such as lines and polygons. Reprojecting vectors thus consists of transforming the coordinates of these points. This is illustrated by `cycle_hire_osm`, an sf object from spData that represents cycle hire locations across London. The previous section showed how the CRS of vector data can be queried with `st_crs()`.

## 6.5 Modifying map projections

More information on CRS modifications can be found in the Using PROJ documentation.

## 6.6 Reprojecting raster geometries

The projection concepts described in the previous section apply equally to rasters. However, there are important differences in the reprojection of vectors and rasters: transforming a vector object involves changing the coordinates of every vertex, but this does not apply to raster data. Rasters are composed of rectangular cells of the same size (expressed by map units, such as degrees or meters), so it is usually impracticable to transform the coordinates of pixels separately. Raster reprojection involves creating a new raster object, often with a different number of columns and rows than the original. The attributes must subsequently be re-estimated, allowing the new pixels to be 'filled' with appropriate values. In other words, raster reprojection can be thought of as two separate spatial operations: a vector reprojection of the raster extent to another CRS (Section 6.4), and the computation of new pixel values through resampling (Section 5.3.4). Thus in most cases when both raster and vector data are used, it is better to avoid reprojecting rasters and to reproject vectors instead.

The raster reprojection process is done with `project()` from the terra package. Like the `st_transform()` function demonstrated in the previous section, `project()` takes a geographic object (a raster dataset in this case) and a CRS representation as its second argument. As a side note, the second argument can also be an existing raster object with a different CRS.

Let's take a look at two examples of raster transformation: using categorical and continuous data. Land cover data are usually represented by
categorical maps. The nlcd.tif file provides information for a small area in Utah, USA obtained from the National Land Cover Database 2011 in the NAD83 / UTM zone 12N CRS. In this region, 8 land cover classes were distinguished (a full list of NLCD2011 land cover classes can be found at mrlc.gov).

When reprojecting categorical rasters, the estimated values must be the same as those of the original. This could be done using the nearest neighbor method (`near`), which sets each new cell value to the value of the nearest cell (center) of the input raster. An example is reprojecting `cat_raster` to WGS84, a geographic CRS well suited for web mapping. The first step is to obtain the PROJ definition of this CRS, which can be done, for example, using the http://spatialreference.org webpage. The final step is to reproject the raster with the `project()` function which, in the case of categorical data, uses the nearest neighbor method (`near`). Many properties of the new object differ from the previous one, including the number of columns and rows (and therefore the number of cells), resolution (transformed from meters into degrees), and extent, as illustrated in Table 6.1 (note that the number of categories increases from 8 to 9 because of the addition of NA values, not because a new category has been created — the land cover classes are preserved).

TABLE 6.1: Key attributes of the original ('cat_raster') and projected ('cat_raster_wgs84') categorical raster datasets.

Reprojecting numeric rasters (with numeric, or in this case integer, values) follows an almost identical procedure. This is demonstrated below with srtm.tif from spDataLarge, from the Shuttle Radar Topography Mission (SRTM), which represents height in meters above sea level (elevation) in the WGS84 CRS. To reproject this dataset into a projected CRS, the nearest neighbor method is not appropriate; it suits categorical data. Instead, we use the bilinear method, which computes the output cell value based on the four nearest cells in the original raster. The values in the projected dataset are the distance-weighted average of the values from these four cells: the closer the input cell is to the center of the output cell, the greater its weight. The following commands create a text string representing WGS 84 / UTM zone 12N, and reproject the raster into this CRS, using the bilinear method.
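The distance weighting behind the bilinear method can be sketched in plain arithmetic. A minimal base R illustration of interpolation within a single unit cell (a simplification for exposition, not terra's actual implementation):

```r
# Bilinear interpolation within a unit cell: corner values v00 at (0,0),
# v10 at (1,0), v01 at (0,1), v11 at (1,1); query point (x, y) in [0,1]^2.
bilinear = function(v00, v10, v01, v11, x, y) {
  # Interpolate along x on the bottom and top edges, then along y between them
  bottom = v00 * (1 - x) + v10 * x
  top    = v01 * (1 - x) + v11 * x
  bottom * (1 - y) + top * y
}

# A point at the cell center weights all four corners equally
bilinear(10, 20, 30, 40, 0.5, 0.5)  # 25
# A point close to the (0,0) corner is dominated by v00
bilinear(10, 20, 30, 40, 0.1, 0.1)  # 13
```

The second call shows the distance weighting at work: the result sits much closer to 10 (the nearest corner) than to the plain mean of 25.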
Raster reprojection of numeric variables also leads to small changes in values and spatial properties, such as the number of cells, resolution, and extent. These changes are demonstrated in Table 6.2.

TABLE 6.2: Key attributes of the original ('con_raster') and projected ('con_raster_ea') continuous raster datasets.

There is more to learn about CRSs. An excellent resource in this area, also implemented in R, is the website R Spatial. Chapter 6 of that free online book is recommended reading — see rspatial.org/terra/spatial/6-crs.html.

```r
cat_raster = rast(system.file("raster/nlcd.tif", package = "spDataLarge"))
crs(cat_raster)
#> [1] "PROJCRS[\"NAD83 / UTM zone 12N\", ... ID[\"EPSG\",26912]]"
#> (full WKT output truncated)
unique(cat_raster)
#>   levels
#> 1      1
#> 2      2
#> 3      3
#> 4      4
#> 5      5
#> 6      6
#> 7      7
#> 8      8
cat_raster_wgs84 = project(cat_raster, "EPSG:4326", method = "near")
con_raster = rast(system.file("raster/srtm.tif", package = "spDataLarge"))
crs(con_raster)
#> [1] "GEOGCRS[\"WGS 84\", ... ID[\"EPSG\",4326]]"
#> (full WKT output truncated)
con_raster_ea = project(con_raster, "EPSG:32612", method = "bilinear")
crs(con_raster_ea)
#> [1] "PROJCRS[\"WGS 84 / UTM zone 12N\", ... ID[\"EPSG\",32612]]"
#> (full WKT output truncated)
```

## 6.7 Exercises

E1.
Create a new object called `nz_wgs` by transforming the `nz` object into the WGS84 CRS. Create an object of class `crs` to use for querying CRSs. With reference to the bounding box of each object, what units does each CRS use? Remove the CRS from `nz_wgs` and plot the result: what is wrong with this map of New Zealand and why?

E2. Transform the `world` dataset to the transverse Mercator projection (`"+proj=tmerc"`) and plot the result. What has changed and why? Try to transform it back into WGS 84 and plot the new object. Why does the new object differ from the original one?

E3. Transform the continuous raster (`con_raster`) into NAD83 / UTM zone 12N using the nearest neighbor interpolation method. What has changed? How does it influence the results?

E4. Transform the categorical raster (`cat_raster`) into WGS 84 using the bilinear interpolation method. What has changed? How does it influence the results?

# 7 Geographic data I/O

## Prerequisites

This chapter requires the following packages:

```r
library(sf)
library(terra)
library(dplyr)
library(spData)
```

## 7.1 Introduction

This chapter is about reading and writing geographic data. Geographic data import is essential for geocomputation: real-world applications are impossible without data. For others to benefit from the results of your work, data output is also vital. Taken together, we refer to these processes as I/O, short for input/output.

Geographic data I/O is almost always part of a wider process. It depends on knowing which datasets are available, where they can be found, and how to retrieve them. These topics are covered in Section 7.2, which describes various geoportals, which collectively contain many terabytes of data, and how to use them. To ease data access, a number of packages for downloading geographic data have been developed. These are described in Section 7.3. There are many geographic file formats, each of which has pros and cons. These are described in Section 7.5. The process of actually reading and writing such file formats efficiently is covered in Sections 7.6 and 7.7, respectively. The final Section 7.8 demonstrates methods for saving visual outputs (maps),
in preparation for Chapter 8 on visualization.

## 7.2 Retrieving open data

A vast and ever-increasing amount of geographic data is available on the internet, much of which is free to access and use (with appropriate credit given to its providers). In some ways there is now too much data, in the sense that there are often multiple places to access the same dataset, and some datasets are of poor quality. In this context, it is vital to know where to look, so the first section covers some of the most important sources. Various 'geoportals' (web services providing geospatial datasets, such as Data.gov) are a good place to start, providing a wide range of data, but often only for specific locations (as illustrated in the updated Wikipedia page on the topic).

Some global geoportals overcome this issue. The GEOSS portal and the Copernicus Open Access Hub, for example, contain many raster datasets with global coverage. A wealth of vector datasets can be accessed from the SEDAC portal run by the National Aeronautics and Space Administration (NASA) and the European Union's INSPIRE geoportal, with global and regional coverage.

Most geoportals provide a graphical interface allowing datasets to be queried based on characteristics such as spatial and temporal extent, the United States Geological Services' EarthExplorer being a prime example. Exploring datasets interactively in a browser is an effective way of understanding the available layers. Downloading data is best done with code, however, from reproducibility and efficiency perspectives. Downloads can be initiated from the command line using a variety of techniques, primarily via URLs and APIs (see the Sentinel API for example). Files hosted on static URLs can be downloaded with `download.file()`, as illustrated in the code chunk below, which accesses US National Parks data from catalog.data.gov/dataset/national-parks:

```r
download.file(url = "https://irma.nps.gov/DataStore/DownloadFile/666527",
              destfile = "nps_boundary.zip")
unzip(zipfile = "nps_boundary.zip")
usa_parks = read_sf(dsn = "nps_boundary.shp")
```

## 7.3 Geographic data packages

Many R packages have been developed for accessing geographic data, some of which are presented in Table 7.1. They provide
interfaces to one or more spatial libraries or geoportals and aim to make data access even quicker from the command line.

TABLE 7.1: Selected R packages for geographic data retrieval.

It should be emphasized that Table 7.1 represents only a small number of the available geographic data packages. Other notable packages include GSODR, which provides Global Summary Daily Weather Data in R (see the package's README for an overview of weather data sources); tidycensus and tigris, which provide socio-demographic vector data for the USA; and hddtools, which provides access to a range of hydrological datasets.

Each data package has its own syntax for accessing data. This diversity is demonstrated in the subsequent code chunks, which show how to get data using three packages from Table 7.1. Country borders are often useful, and these can be accessed with the `ne_countries()` function from the rnaturalearth package as follows. By default, rnaturalearth returns objects of class `Spatial*`. The result can be converted into sf objects with `st_as_sf()`.

A second example downloads a series of rasters containing global monthly precipitation sums with a spatial resolution of ten minutes (~18.5 km at the equator) using the geodata package. The result is a multilayer object of class `SpatRaster`.

A third example uses the osmdata package (Padgham et al. 2018) to find parks from the OpenStreetMap (OSM) database. As illustrated in the code chunk below, queries begin with the function `opq()` (short for OpenStreetMap query), the first argument of which is a bounding box, or a text string representing a bounding box (the city of Leeds in this case). The result is passed to a function for selecting which OSM elements we're interested in (parks in this case), represented by key-value pairs. Next, it is passed to the function `osmdata_sf()`, which does the work of downloading the data and converting it into a list of sf objects (see `vignette('osmdata')` for further details).

One limitation of the osmdata package is that it is rate limited, meaning that it cannot download large OSM datasets (e.g.
the OSM data for a large city). To overcome this limitation, the osmextract package was developed, which can be used to download and import binary .pbf files containing compressed versions of the OSM database for pre-defined regions. OpenStreetMap is a vast global database of crowd-sourced data, which is growing daily, and there is a wider ecosystem of tools enabling easy access to the data, such as the Overpass turbo web service for rapid development and testing of OSM queries and osm2pgsql for importing the data into a PostGIS database. Although the quality of datasets derived from OSM varies, the data source and wider OSM ecosystems have many advantages: they provide datasets that are available globally, free of charge, and constantly improving thanks to an army of volunteers. Using OSM encourages 'citizen science' and contributions back to the digital commons (you can start editing data representing a part of the world you know well at www.openstreetmap.org). Further examples of OSM data in action are provided in Chapters 9, 12 and 13.

Sometimes, packages come with built-in datasets. These can be accessed in four ways: by attaching the package (if the package uses 'lazy loading', as spData does), with `data(dataset, package = mypackage)`, by referring to the dataset with `mypackage::dataset`, or with `system.file(filepath, package = mypackage)` to access raw data files. The following code chunk illustrates the latter two options using the `world` dataset (already loaded by attaching its parent package with `library(spData)`). The last example, `system.file("shapes/world.gpkg", package = "spData")`, returns the path to the world.gpkg file, which is stored inside the "shapes/" folder of the spData package.

```r
library(rnaturalearth)
usa = ne_countries(country = "United States of America") # United States borders
class(usa)
#> [1] "SpatialPolygonsDataFrame"
#> attr(,"package")
#> [1] "sp"
# alternative way of accessing the data, with geodata
# geodata::gadm("USA", level = 0, path = tempdir())
usa_sf = st_as_sf(usa)
library(geodata)
worldclim_prec = worldclim_global("prec", res = 10, path = tempdir())
class(worldclim_prec)
#> [1] "SpatRaster"
#> attr(,"package")
#> [1] "terra"
library(osmdata)
parks = opq(bbox = "leeds uk") %>% 
  add_osm_feature(key = "leisure", value = "park") %>% 
  osmdata_sf()
world2 = spData::world
world3 = read_sf(system.file("shapes/world.gpkg", package = "spData"))
```

## 7.4 Geographic web services

In an effort to standardize web APIs for accessing spatial data, the Open Geospatial Consortium (OGC) created a number of specifications for web services (collectively known as OWS, short for OGC Web Services). These specifications include the Web Feature Service (WFS), Web Map Service (WMS), Web Map Tile Service (WMTS), Web Coverage Service (WCS) and even a Web Processing Service (WPS). Map servers such as PostGIS have adopted these protocols, leading to standardization of queries. Like other web APIs, OWS APIs use a 'base URL', an 'endpoint' and 'URL query arguments' following a `?` to request data (see the best-practices-api-packages vignette in the httr package).

There are many requests that can be made to an OWS service. One of the most fundamental is getCapabilities, demonstrated with httr below. The code demonstrates how API queries can be constructed and dispatched, in this case to discover the capabilities of a service run by the Food and Agriculture Organization of the United Nations (FAO). API requests are constructed programmatically with the `GET()` function, which takes a base URL and a list of query parameters that can easily be extended. The result of the request is saved in `res`, an object of class `response` defined in the httr package, which is a list containing information about the request, including the URL. As can be seen by executing `browseURL(res$url)`, the results can also be read directly in a browser. One way of extracting the contents of the request is shown in the code chunk below.

Data can be downloaded from WFS services with the GetFeature request and a specific typeName (as illustrated in the code chunk below). Available names differ depending on the accessed web feature service. One can extract them programmatically using web technologies (Nolan and Lang 2014) or by scrolling manually through the contents of the GetCapabilities output in a browser.

Note the use of `write_disk()` to ensure that the results are written to disk rather than loaded into memory, allowing them to be imported with sf. This example shows how to gain low-level access to web services using httr, which can be useful for understanding how web services work.
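The 'base URL + endpoint + query arguments' pattern can be sketched without httr. A hand-rolled base R illustration matching the FAO example (`build_ows_url()` is our own helper, not part of any package; real query values would also need URL-encoding, e.g. with `utils::URLencode()`):

```r
# Assemble an OWS-style request URL from a base URL, an endpoint path and
# a named list of query parameters -- a simplified sketch of what
# httr::modify_url() and GET()'s `query` argument do for you.
build_ows_url = function(base_url, endpoint, params) {
  query = paste(names(params), unlist(params), sep = "=", collapse = "&")
  paste0(base_url, endpoint, "?", query)
}

u = build_ows_url("http://www.fao.org", "/figis/geoserver/wfs",
                  list(request = "GetCapabilities"))
u
#> [1] "http://www.fao.org/figis/geoserver/wfs?request=GetCapabilities"
```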
For many everyday tasks, however, a higher-level interface may be more appropriate, and a number of R packages, and tutorials, have been developed precisely for this purpose. The packages ows4R, rwfs and sos4R have been developed for working with OWS services in general, WFS and the sensor observation service (SOS), respectively. As of October 2018, only ows4R was on CRAN. The package's basic functionality is demonstrated below, in commands that get all FAO_AREAS as in the previous code chunk. There is much more to learn about web services and much potential for the development of R-OWS interfaces, an active area of development. For further information on the topic, we recommend the examples from the European Centre for Medium-Range Weather Forecasts (ECMWF) services at github.com/OpenDataHack, and reading up on OCG Web Services at opengeospatial.org.

```r
base_url = "http://www.fao.org"
endpoint = "/figis/geoserver/wfs"
q = list(request = "GetCapabilities")
res = httr::GET(url = httr::modify_url(base_url, path = endpoint), query = q)
res$url
#> [1] "https://www.fao.org/figis/geoserver/wfs?request=GetCapabilities"
txt = httr::content(res, "text")
xml = xml2::read_xml(txt)
xml
#> {xml_document} ...
#> [1] \n GeoServer WFS...
#> [2] \n UN-FAO Fishe...
#> ...
qf = list(request = "GetFeature", typeName = "area:FAO_AREAS")
file = tempfile(fileext = ".gml")
httr::GET(url = base_url, path = endpoint, query = qf, httr::write_disk(file))
fao_areas = read_sf(file)
library(ows4R)
wfs = WFSClient$new("http://www.fao.org/figis/geoserver/wfs",
                    serviceVersion = "1.0.0", logger = "INFO")
fao_areas = wfs$getFeatures("area:FAO_AREAS")
```

## 7.5 File formats

Geographic datasets are usually stored as files or in spatial databases. File formats can either store vector or raster data, while spatial databases such as PostGIS can store both (see also Section 9.6.2). Today the variety of file formats may seem bewildering, but there has been much consolidation and standardization since the beginnings of GIS software in the 1960s, when the first widely distributed program for spatial analysis (SYMAP) was created at Harvard University (Coppock and Rhind 1991).
GDAL (pronounced "goo-dal", with the double "o" making reference to object-orientation), the Geospatial Data Abstraction Library, has resolved many issues associated with incompatibility between geographic file formats since its release in 2000. GDAL provides a unified and high-performance interface for reading and writing many raster and vector data formats. Many open and proprietary GIS programs, including GRASS, ArcGIS and QGIS, use GDAL behind their GUIs for doing the legwork of ingesting and spitting out geographic data in appropriate formats.

GDAL provides access to more than 200 vector and raster data formats. Table 7.2 presents basic information about selected and often-used spatial file formats.

TABLE 7.2: Selected spatial file formats.

An important development ensuring the standardization and open-sourcing of file formats was the founding of the Open Geospatial Consortium (OGC) in 1994. Beyond defining the simple features data model (see Section 2.2.1), the OGC also coordinates the development of open standards, for example as used in file formats such as KML and GeoPackage. Open file formats of the kind endorsed by the OGC have several advantages over proprietary formats: the standards are published, they ensure transparency, and they open up the possibility for users to further develop and adjust the file formats to their specific needs.

The ESRI Shapefile is the most popular vector data exchange format; however, it is not an open format (though its specification is open). It was developed in the early 1990s and has a number of limitations. First of all, it is a multi-file format, which consists of at least three files. It only supports 255 columns, its column names are restricted to ten characters, and its file size is limited to 2 GB. Furthermore, the ESRI Shapefile does not support all possible geometry types; for example, it is unable to distinguish between a polygon and a multipolygon. Despite these limitations, a viable alternative had been missing for a long time. In the meantime, GeoPackage emerged, and it seems to be a suitable replacement candidate for the ESRI Shapefile. GeoPackage is a format for exchanging geospatial information and an OGC standard. The GeoPackage standard describes the rules on how to store geospatial information in a tiny SQLite container. Hence, GeoPackage is a lightweight spatial database container, which allows the storage of vector and raster data, but also of non-spatial data and extensions. Aside from GeoPackage, there are other geospatial data exchange formats worth checking out (Table 7.2).

The GeoTIFF format seems to be the most
prominent raster data format. It allows spatial information, such as the CRS, to be embedded within a TIFF file. Similar to the ESRI Shapefile, this format was first developed in the 1990s, but as an open format. Additionally, GeoTIFF is still being expanded and improved. One of the most significant recent additions to the GeoTIFF format is its variant called COG (Cloud Optimized GeoTIFF). Raster objects saved as COGs can be hosted on HTTP servers, so other people can read only parts of the file without downloading the whole file.

## 7.6 Data input (I)

Executing commands such as `sf::read_sf()` (the main function we use for loading vector data) or `terra::rast()` (the main function used for loading raster data) silently sets off a chain of events that reads data from files. Moreover, there are many R packages containing a wide range of geographic data or providing simple access to different data sources. All of them load the data into R or, more precisely, assign objects to your workspace, stored in RAM and accessible from the `.GlobalEnv` of the R session.

### 7.6.1 Vector data

Spatial vector data comes in a wide variety of file formats. Most popular representations such as .geojson and .gpkg files can be imported directly into R with the sf function `read_sf()` (or the equivalent `st_read()`), which uses GDAL's vector drivers behind the scenes. `st_drivers()` returns a data frame containing `name` and `long_name` in the first two columns, and the features of each driver available to GDAL (and therefore to sf), including the ability to write data and to store raster data, in the subsequent columns, as illustrated for key file formats in Table 7.3. The following commands show the first three drivers reported by the computer's GDAL installation (results can vary depending on the GDAL version installed) and a summary of their features. Note that the majority of drivers can write data (51 out of 87), while only 16 formats can efficiently represent raster data in addition to vector data (see `?st_drivers()` for details).

TABLE 7.3: Popular drivers/formats for reading/writing vector data.

The first argument of `read_sf()` is `dsn`, which should be a text string or an object containing a single text string. The content of the text string can vary between different drivers. In most cases, as with the ESRI
Shapefile (.shp) or the GeoPackage format (.gpkg), the `dsn` would be a file name. `read_sf()` guesses the driver based on the file extension, as illustrated for a .gpkg file below. For some drivers, `dsn` could be provided as a folder name, access credentials for a database, or a GeoJSON string representation (see the examples on the `read_sf()` help page for more details).

Some vector driver formats can store multiple data layers. By default, `read_sf()` automatically reads the first layer of the file specified in `dsn`; however, using the `layer` argument you can specify any other layer.

The `read_sf()` function also allows reading just parts of the file into RAM with two possible mechanisms. The first one is related to the `query` argument, which allows specifying what part of the data to read with an OGR SQL query text. The example below extracts data for Tanzania only (Figure ??:A). It is done by specifying that we want to get all columns (`SELECT *`) from the `"world"` layer for which `name_long` equals `"Tanzania"`.

The second mechanism uses the `wkt_filter` argument. This argument expects well-known text representing a study area for which we want to extract the data. Let's try it using a small example -- we want to read polygons from our file that intersect a buffer of 50,000 meters around Tanzania's borders. To do this, we need to prepare our "filter" by (a) creating the buffer (Section 5.2.3), (b) converting the sf buffer object into an sfc geometry object with `st_geometry()`, and (c) translating the geometries into their well-known text representation with `st_as_text()`. Now, we can apply this "filter" using the `wkt_filter` argument. Our result, shown in Figure ??:B, contains Tanzania and every country within its 50 km buffer.

FIGURE 7.1: Reading a subset of the vector data using a query (A) and a wkt filter (B).

Naturally, some options are specific to certain drivers. For example, consider coordinates stored in a spreadsheet format (.csv). To read in such files as spatial objects, we naturally need to specify the names of the columns (`X` and `Y` in our example below) representing the coordinates. We can do this with the help of the `options` parameter. To find out about possible options, please refer to the 'Open Options' section of the corresponding GDAL driver description. For the comma-separated value (csv) format, visit http://www.gdal.org/drv_csv.html.
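The `query` string above is ordinary OGR SQL, so it can also be assembled programmatically. A minimal base R sketch (`make_ogr_query()` is our own illustrative helper, not an sf function); note that naive string pasting like this is only safe for trusted input:

```r
# Build a simple OGR SQL query of the form used with read_sf(query = ...).
# layer/column/value are interpolated directly, so only use trusted input.
make_ogr_query = function(layer, column, value) {
  sprintf('SELECT * FROM "%s" WHERE %s = "%s"', layer, column, value)
}

q = make_ogr_query("world", "name_long", "Tanzania")
q
#> [1] "SELECT * FROM \"world\" WHERE name_long = \"Tanzania\""
# The string could then be passed on, e.g.:
# tanzania = read_sf(vector_filepath, query = q)
```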
Instead of columns describing xy-coordinates, a single column can also contain the geometry information. Well-known text (WKT), well-known binary (WKB), and GeoJSON formats are examples of this. For instance, the world_wkt.csv file has a column named WKT representing polygons of the world's countries. We again use the `options` parameter to indicate this.

As a final example, we will show how `read_sf()` also reads KML files. A KML file stores geographic information in XML format -- a data format for the creation of web pages and the transfer of data in an application-independent way (Nolan and Lang 2014). Here, we access a KML file from the web. This file contains more than one layer. `st_layers()` lists all available layers. We choose the first layer, Placemarks, and say so with the help of the `layer` parameter in `read_sf()`.

All the examples presented in this section so far have used the sf package for geographic data import. It is fast and flexible, but it may be worth looking at other packages for specific file formats. An example is the geojsonsf package: a benchmark suggests it is around 10 times faster than the sf package for reading .geojson files.

```r
sf_drivers = st_drivers()
head(sf_drivers, n = 3)
summary(sf_drivers[-c(1:2)])
vector_filepath = system.file("shapes/world.gpkg", package = "spData")
world = read_sf(vector_filepath, quiet = TRUE)
tanzania = read_sf(vector_filepath,
                   query = 'SELECT * FROM "world" WHERE name_long = "Tanzania"')
tanzania_buf = st_buffer(tanzania, 50000)
tanzania_buf_geom = st_geometry(tanzania_buf)
tanzania_buf_wkt = st_as_text(tanzania_buf_geom)
tanzania_neigh = read_sf(vector_filepath,
                         wkt_filter = tanzania_buf_wkt)
cycle_hire_txt = system.file("misc/cycle_hire_xy.csv", package = "spData")
cycle_hire_xy = read_sf(cycle_hire_txt, options = c("X_POSSIBLE_NAMES=X",
                                                    "Y_POSSIBLE_NAMES=Y"))
world_txt = system.file("misc/world_wkt.csv", package = "spData")
world_wkt = read_sf(world_txt, options = "GEOM_POSSIBLE_NAMES=WKT")
# the same as
world_wkt2 = st_read(world_txt, options = "GEOM_POSSIBLE_NAMES=WKT", 
                     quiet = TRUE, stringsAsFactors = FALSE, as_tibble = TRUE)
u = "https://developers.google.com/kml/documentation/KML_Samples.kml"
download.file(u, "KML_Samples.kml")
st_layers("KML_Samples.kml")
#> Driver: LIBKML 
#> Available layers:
#>              layer_name geometry_type features fields
#> 1            Placemarks                      3     11
#> 2     Styles and Markup                      1     11
#> 3      Highlighted Icon                      1     11
#> 4       Ground Overlays                      1     11
#> 5       Screen Overlays                      0     11
#> 6                 Paths                      6     11
#> 7              Polygons                      0     11
#> 8         Google Campus                      4     11
#> 9      Extruded Polygon                      1     11
#> 10 Absolute and Relative                      4     11
kml = read_sf("KML_Samples.kml", layer = "Placemarks")
```

### 7.6.2 Raster data

Similar to vector data, raster data also comes in many file formats, some of which support multilayer files. terra's `rast()` command reads in a single layer when a file with just one layer is provided. It also works in the case where we want to read a multilayer file.

```r
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
single_layer = rast(raster_filepath)
multilayer_filepath = system.file("raster/landsat.tif", package = "spDataLarge")
multilayer_rast = rast(multilayer_filepath)
```

## 7.7 Data output (O)

Writing geographic data allows you to convert from one format to another and to save newly created objects. Depending on the data type (vector or raster), object class (e.g., sf or SpatRaster), and the type and amount of stored information (e.g., object size, range of values), it is important to know how to store spatial files in the most efficient way. The next two sections demonstrate how to do this.

### 7.7.1 Vector data

The counterpart of `read_sf()` is `write_sf()`. It allows you to write sf objects to a wide range of geographic vector file formats, including the most common ones, such as .geojson, .shp and .gpkg. Based on the file name, `write_sf()` decides automatically which driver to use. The speed of the writing process also depends on the driver.

Note: if you try to write to the same data source again, the function will overwrite the file. Instead of overwriting the file, we could add a new layer to the file with `append = TRUE`, which is supported by several spatial formats, including GeoPackage. Alternatively, you can use `st_write()`, since it is equivalent to `write_sf()`. However, it has different defaults -- it does not overwrite files (it returns an error when you try to do so) and shows a short summary of the written file format and object.

The `layer_options`
argument also used many different purposes.\nOne write spatial data text file.\ncan done specifying GEOMETRY inside layer_options.\neither AS_XY simple point datasets (creates two new columns coordinates) AS_WKT complex spatial data (one new column created contains well-known text representation spatial objects).","code":"\nwrite_sf(obj = world, dsn = \"world.gpkg\")\nwrite_sf(obj = world, dsn = \"world.gpkg\")\nwrite_sf(obj = world, dsn = \"world_many_layers.gpkg\", append = TRUE)\nst_write(obj = world, dsn = \"world2.gpkg\")\n#> Writing layer `world2' to data source `world2.gpkg' using driver `GPKG'\n#> Writing 177 features with 10 fields and geometry type Multi Polygon.\nwrite_sf(cycle_hire_xy, \"cycle_hire_xy.csv\", layer_options = \"GEOMETRY=AS_XY\")\nwrite_sf(world_wkt, \"world_wkt.csv\", layer_options = \"GEOMETRY=AS_WKT\")"},{"path":"read-write.html","id":"raster-data-2","chapter":"7 Geographic data I/O","heading":"7.7.2 Raster data","text":"\nwriteRaster() function saves SpatRaster objects files disk.\nfunction expects input regarding output data type file format, also accepts GDAL options specific selected file format (see ?writeRaster details).\nterra package offers nine data types saving raster: LOG1S, INT1S, INT1U, INT2S, INT2U, INT4S, INT4U, FLT4S, FLT8S.38\ndata type determines bit representation raster object written disk (Table 7.4).\ndata type use depends range values raster object.\nvalues data type can represent, larger file get disk.\nUnsigned integers (INT1U, INT2U, INT4U) suitable categorical data, float numbers (FLT4S FLT8S) usually represent continuous data.\nwriteRaster() uses FLT4S default.\nworks cases, size output file unnecessarily large save binary categorical data.\nTherefore, recommend use data type needs least storage space, still able represent values (check range values summary() function).\nTABLE 7.4: Data types supported terra package.\ndefault, output file format derived filename.\nNaming file *.tif create GeoTIFF file, 
demonstrated :raster file formats additional options, can set providing GDAL parameters options argument writeRaster().\nGeoTIFF files written terra, default, LZW compression gdal = c(\"COMPRESS=LZW\").\nchange disable compression, need modify argument.\nAdditionally, can save raster object COG (Cloud Optimized GeoTIFF, Section 7.5) \"=COG\" option.","code":"\nwriteRaster(single_layer, filename = \"my_raster.tif\", datatype = \"INT2U\")\nwriteRaster(x = single_layer,\n filename = \"my_raster.tif\",\n datatype = \"INT2U\",\n gdal = c(\"COMPRESS=NONE\", \"of=COG\"),\n overwrite = TRUE)"},{"path":"read-write.html","id":"visual-outputs","chapter":"7 Geographic data I/O","heading":"7.8 Visual outputs","text":"\nR supports many different static interactive graphics formats.\ngeneral method save static plot open graphic device, create plot, close , example:available graphic devices include pdf(), bmp(), jpeg(), tiff().\ncan specify several properties output plot, including width, height resolution.Additionally, several graphic packages provide functions save graphical output.\nexample, tmap package tmap_save() function.\ncan save tmap object different graphic formats HTML file specifying object name file path new file.hand, can save interactive maps created mapview package HTML file image using mapshot() function:","code":"\npng(filename = \"lifeExp.png\", width = 500, height = 350)\nplot(world[\"lifeExp\"])\ndev.off()\nlibrary(tmap)\ntmap_obj = tm_shape(world) + tm_polygons(col = \"lifeExp\")\ntmap_save(tmap_obj, filename = \"lifeExp_tmap.png\")\nlibrary(mapview)\nmapview_obj = mapview(world, zcol = \"lifeExp\", legend = TRUE)\nmapshot(mapview_obj, file = \"my_interactive_map.html\")"},{"path":"read-write.html","id":"exercises-5","chapter":"7 Geographic data I/O","heading":"7.9 Exercises","text":"E1. List describe three types vector, raster, geodatabase formats.E2. Name least two differences read_sf() well-known function st_read().E3. 
Read cycle_hire_xy.csv file spData package spatial object (Hint: located misc folder).\ngeometry type loaded object?E4. Download borders Germany using rnaturalearth, create new object called germany_borders.\nWrite new object file GeoPackage format.E5. Download global monthly minimum temperature spatial resolution five minutes using geodata package.\nExtract June values, save file named tmin_june.tif file (hint: use terra::subset()).E6. Create static map Germany’s borders, save PNG file.E7. Create interactive map using data cycle_hire_xy.csv file.\nExport map file called cycle_hire.html.","code":""},{"path":"adv-map.html","id":"adv-map","chapter":"8 Making maps with R","heading":"8 Making maps with R","text":"","code":""},{"path":"adv-map.html","id":"prerequisites-6","chapter":"8 Making maps with R","heading":"Prerequisites","text":"chapter requires following packages already using:addition, uses following visualization packages (also install shiny want develop interactive mapping applications):","code":"\nlibrary(sf)\nlibrary(raster)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(dplyr)\nlibrary(spData)\nlibrary(spDataLarge)\nlibrary(tmap) # for static and interactive maps\nlibrary(leaflet) # for interactive maps\nlibrary(ggplot2) # tidyverse data visualization package"},{"path":"adv-map.html","id":"introduction-5","chapter":"8 Making maps with R","heading":"8.1 Introduction","text":"satisfying important aspect geographic research communicating results.\nMap making — art cartography — ancient skill involves communication, intuition, element creativity.\nStatic mapping R straightforward plot() function, saw Section 2.2.3.\npossible create advanced maps using base R methods (Murrell 2016).\nfocus chapter, however, cartography dedicated map-making packages.\nlearning new skill, makes sense gain depth--knowledge one area branching .\nMap making exception, hence chapter’s coverage one package (tmap) depth rather many superficially.addition fun 
and creative, cartography also has important practical applications.\nA carefully crafted map can be the best way of communicating the results of your work, but poorly designed maps can leave a bad impression.\nCommon design issues include poor placement, size and readability of text and careless selection of colors, as outlined in the style guide of the Journal of Maps.\nFurthermore, poor map making can hinder the communication of results (Brewer 2015):Amateur-looking maps can undermine your audience’s ability to understand important information and weaken the presentation of a professional data investigation.Maps have been used for several thousand years for a wide variety of purposes.\nHistoric examples include maps of buildings and land ownership from the Old Babylonian dynasty more than 3000 years ago and Ptolemy’s world map in his masterpiece Geography nearly 2000 years ago (Talbert 2014).Map making has historically been an activity undertaken only by, or on behalf of, the elite.\nThis has changed with the emergence of open source mapping software such as the R package tmap and the ‘print composer’ in QGIS, which enable anyone to make high-quality maps, enabling ‘citizen science.’\nMaps are also often the best way to present the findings of geocomputational research in a way that is accessible.\nMap making is therefore a critical part of geocomputation and its emphasis not only on describing, but also changing the world.This chapter shows how to make a wide range of maps.\nThe next section covers a range of static maps, including aesthetic considerations, facets and inset maps.\nSections 8.3 to 8.5 cover animated and interactive maps (including web maps and mapping applications).\nFinally, Section 8.6 covers a range of alternative map-making packages including ggplot2 and cartogram.","code":""},{"path":"adv-map.html","id":"static-maps","chapter":"8 Making maps with R","heading":"8.2 Static maps","text":"\nStatic maps are the most common type of visual output from geocomputation.\nStandard formats include .png and .pdf for raster and vector outputs respectively.\nInitially, static maps were the only type of maps that R could produce.\nThings advanced with the release of sp (see E. J. 
Pebesma and Bivand 2005) and many techniques for map making have been developed since then.\nHowever, despite the innovation of interactive mapping, static plotting was still the emphasis of geographic data visualisation in R a decade later (Cheshire and Lovelace 2015).The generic plot() function is often the fastest way to create static maps from vector and raster spatial objects (see sections 2.2.3 and 2.3.3).\nSometimes, simplicity and speed are priorities, especially during the development phase of a project, and this is where plot() excels.\nThe base R approach is also extensible, with plot() offering dozens of arguments.\nAnother approach is the grid package, which allows low level control of static maps, as illustrated in Chapter 14 of Murrell (2016).\nThis section focuses on tmap and emphasizes the important aesthetic and layout options.\ntmap is a powerful and flexible map-making package with sensible defaults.\nIts concise syntax allows the creation of attractive maps with minimal code which will be familiar to ggplot2 users.\nIt also has the unique capability to generate static and interactive maps using the same code via tmap_mode().\nFinally, it accepts a wider range of spatial classes (including raster objects) than alternatives such as ggplot2 (see the vignettes tmap-getstarted and tmap-changes-v2, as well as Tennekes (2018), for documentation).","code":""},{"path":"adv-map.html","id":"tmap-basics","chapter":"8 Making maps with R","heading":"8.2.1 tmap basics","text":"\nLike ggplot2, tmap is based on the idea of a ‘grammar of graphics’ (Wilkinson and Wills 2005).\nThis involves a separation between the input data and the aesthetics (how data are visualised): each input dataset can be ‘mapped’ in a range of different ways including location on the map (defined by the data’s geometry), color and other visual variables.\nThe basic building block is tm_shape() (which defines input data, raster and vector objects), followed by one or more layer elements such as tm_fill() and tm_dots().\nThis layering is demonstrated in the chunk below, which generates the maps presented in Figure 8.1:\nFIGURE 8.1: New Zealand’s shape plotted with fill (left), border (middle) and fill and border (right) layers added using tmap functions.\nThe object passed to tm_shape() in this case is nz, an sf object representing the regions of New Zealand (see Section 2.2.1 for more on sf objects).\nLayers are added to represent nz visually, with tm_fill() and tm_borders() creating shaded areas (left panel) 
border outlines (middle panel) Figure 8.1, respectively.intuitive approach map making:\ncommon task adding new layers undertaken addition operator +, followed tm_*().\nasterisk (*) refers wide range layer types self-explanatory names including fill, borders (demonstrated ), bubbles, text raster (see help(\"tmap-element\") full list).\nlayering illustrated right panel Figure 8.1, result adding border top fill layer.","code":"\n# Add fill layer to nz shape\ntm_shape(nz) +\n tm_fill() \n# Add border layer to nz shape\ntm_shape(nz) +\n tm_borders() \n# Add fill and border layers to nz shape\ntm_shape(nz) +\n tm_fill() +\n tm_borders() "},{"path":"adv-map.html","id":"map-obj","chapter":"8 Making maps with R","heading":"8.2.2 Map objects","text":"useful feature tmap ability store objects representing maps.\ncode chunk demonstrates saving last plot Figure 8.1 object class tmap (note use tm_polygons() condenses tm_fill() + tm_borders() single function):map_nz can plotted later, example adding additional layers (shown ) simply running map_nz console, equivalent print(map_nz).New shapes can added + tm_shape(new_obj).\ncase new_obj represents new spatial object plotted top preceding layers.\nnew shape added way, subsequent aesthetic functions refer , another new shape added.\nsyntax allows creation maps multiple shapes layers, illustrated next code chunk uses function tm_raster() plot raster layer (alpha set make layer semi-transparent):Building previously created map_nz object, preceding code creates new map object map_nz1 contains another shape (nz_elev) representing average elevation across New Zealand (see Figure 8.2, left).\nshapes layers can added, illustrated code chunk creates nz_water, representing New Zealand’s territorial waters, adds resulting lines existing map object.limit number layers shapes can added tmap objects.\nshape can even used multiple times.\nfinal map illustrated Figure 8.2 created adding layer representing high points (stored object nz_height) onto 
previously created map_nz2 object tm_dots() (see ?tm_dots ?tm_bubbles details tmap’s point plotting functions).\nresulting map, four layers, illustrated right-hand panel Figure 8.2:useful little known feature tmap multiple map objects can arranged single ‘metaplot’ tmap_arrange().\ndemonstrated code chunk plots map_nz1 map_nz3, resulting Figure 8.2.\nFIGURE 8.2: Maps additional layers added final map Figure 8.1.\nelements can also added + operator.\nAesthetic settings, however, controlled arguments layer functions.","code":"\nmap_nz = tm_shape(nz) + tm_polygons()\nclass(map_nz)\n#> [1] \"tmap\"\nmap_nz1 = map_nz +\n tm_shape(nz_elev) + tm_raster(alpha = 0.7)\nnz_water = st_union(nz) %>% st_buffer(22200) %>% \n st_cast(to = \"LINESTRING\")\nmap_nz2 = map_nz1 +\n tm_shape(nz_water) + tm_lines()\nmap_nz3 = map_nz2 +\n tm_shape(nz_height) + tm_dots()\ntmap_arrange(map_nz1, map_nz2, map_nz3)"},{"path":"adv-map.html","id":"aesthetics","chapter":"8 Making maps with R","heading":"8.2.3 Aesthetics","text":"\nplots previous section demonstrate tmap’s default aesthetic settings.\nGray shades used tm_fill() tm_bubbles() layers continuous black line used represent lines created tm_lines().\ncourse, default values aesthetics can overridden.\npurpose section show .two main types map aesthetics: change data constant.\nUnlike ggplot2, uses helper function aes() represent variable aesthetics, tmap accepts aesthetic arguments directly.\nmap variable aesthetic, pass column name corresponding argument, set fixed aesthetic, pass desired value instead.39\ncommonly used aesthetics fill border layers include color, transparency, line width line type, set col, alpha, lwd, lty arguments, respectively.\nimpact setting fixed values illustrated Figure 8.3.\nFIGURE 8.3: impact changing commonly used fill border aesthetics fixed values.\nLike base R plots, arguments defining aesthetics can also receive values vary.\nUnlike base R code (generates left panel Figure 8.4), tmap aesthetic arguments 
accept numeric vector:Instead col (aesthetics can vary lwd line layers size point layers) requires character string naming attribute associated geometry plotted.\nThus, one achieve desired result follows (plotted right-hand panel Figure 8.4):\nFIGURE 8.4: Comparison base (left) tmap (right) handling numeric color field.\nimportant argument functions defining aesthetic layers tm_fill() title, sets title associated legend.\nfollowing code chunk demonstrates functionality providing attractive name variable name Land_area (note use expression() create superscript text):","code":"\nma1 = tm_shape(nz) + tm_fill(col = \"red\")\nma2 = tm_shape(nz) + tm_fill(col = \"red\", alpha = 0.3)\nma3 = tm_shape(nz) + tm_borders(col = \"blue\")\nma4 = tm_shape(nz) + tm_borders(lwd = 3)\nma5 = tm_shape(nz) + tm_borders(lty = 2)\nma6 = tm_shape(nz) + tm_fill(col = \"red\", alpha = 0.3) +\n tm_borders(col = \"blue\", lwd = 3, lty = 2)\ntmap_arrange(ma1, ma2, ma3, ma4, ma5, ma6)\nplot(st_geometry(nz), col = nz$Land_area) # works\ntm_shape(nz) + tm_fill(col = nz$Land_area) # fails\n#> Error: Fill argument neither colors nor valid variable name(s)\ntm_shape(nz) + tm_fill(col = \"Land_area\")\nlegend_title = expression(\"Area (km\"^2*\")\")\nmap_nza = tm_shape(nz) +\n tm_fill(col = \"Land_area\", title = legend_title) + tm_borders()"},{"path":"adv-map.html","id":"color-settings","chapter":"8 Making maps with R","heading":"8.2.4 Color settings","text":"\nColor settings important part map design.\ncan major impact spatial variability portrayed illustrated Figure 8.5.\nshows four ways coloring regions New Zealand depending median income, left right (demonstrated code chunk ):default setting uses ‘pretty’ breaks, described next paragraphbreaks allows manually set breaksn sets number bins numeric variables categorizedpalette defines color scheme, example BuGn\nFIGURE 8.5: Illustration settings affect color settings. 
The results show (from left to right): default settings, manual breaks, n breaks, and the impact of changing the palette.\nAnother way to change color settings is by altering color break (bin) settings.\nIn addition to manually setting breaks, tmap allows users to specify algorithms that automatically create breaks with the style argument.\n\nThere are six useful break styles:style = \"pretty\", the default setting, rounds breaks into whole numbers where possible and spaces them evenly;style = \"equal\" divides input values into bins of equal range and is appropriate for variables with a uniform distribution (it is not recommended for variables with a skewed distribution as the resulting map may end up having little color diversity);style = \"quantile\" ensures the same number of observations fall into each category (with the potential downside that bin ranges can vary widely);style = \"jenks\" identifies groups of similar values in the data and maximizes the differences between categories;style = \"cont\" (and \"order\") present a large number of colors over continuous color fields and are particularly suited for continuous rasters (\"order\" can help visualize skewed distributions);style = \"cat\" was designed to represent categorical values and assures that each category receives a unique color.\nFIGURE 8.6: Illustration of different binning methods set using the style argument in tmap.\nPalettes define the color ranges associated with the bins and are determined by the breaks, n, and style arguments described above.\nThe default color palette is specified in tm_layout() (see Section 8.2.5 to learn more); however, it can be quickly changed using the palette argument.\nIt expects a vector of colors or a new color palette name, which can be selected interactively with tmaptools::palette_explorer().\nYou can also add a - as a prefix to reverse the palette order.There are three main groups of color palettes: categorical, sequential and diverging (Figure 8.7), and each of them serves a different purpose.\nCategorical palettes consist of easily distinguishable colors and are most appropriate for categorical data without any particular order such as state names or land cover classes.\nColors should be intuitive: rivers should be blue, for example, and pastures green.\nAvoid too many categories: maps with large legends and many colors can be uninterpretable.40The second group is sequential palettes.\nThese follow a gradient, for example from light to dark colors (light colors tend to represent 
lower values), and are appropriate for continuous (numeric) variables.\nSequential palettes can be single- (Blues go from light to dark blue, for example) or multi-color/hue (YlOrBr is a gradient from light yellow to brown via orange, for example), as demonstrated in the code chunk below — the output is not shown, run the code yourself to see the results!The last group, diverging palettes, typically range between three distinct colors (purple-white-green in Figure 8.7) and are usually created by joining two single-color sequential palettes with the darker colors at each end.\nTheir main purpose is to visualize the difference from an important reference point, e.g., a certain temperature, the median household income or the mean probability of a drought event.\nThe reference point’s value can be adjusted in tmap using the midpoint argument.\nFIGURE 8.7: Examples of categorical, sequential and diverging palettes.\nThere are two important principles for consideration when working with colors: perceptibility and accessibility.\nFirstly, colors on maps should match our perception.\nThis means that certain colors are viewed through our experience and also cultural lenses.\nFor example, green colors usually represent vegetation or lowlands and blue is connected with water or cool.\nColor palettes should also be easy to understand to effectively convey information.\nIt should be clear which values are lower and which are higher, and colors should change gradually.\nThis property is not preserved in the rainbow color palette; therefore, we suggest avoiding it in geographic data visualization (Borland and Taylor II 2007).\nInstead, the viridis color palettes, also available in tmap, can be used.\nSecondly, changes in colors should be accessible to the largest number of people.\nTherefore, it is important to use colorblind friendly palettes as often as possible.41","code":"\ntm_shape(nz) + tm_polygons(col = \"Median_income\")\nbreaks = c(0, 3, 4, 5) * 10000\ntm_shape(nz) + tm_polygons(col = \"Median_income\", breaks = breaks)\ntm_shape(nz) + tm_polygons(col = \"Median_income\", n = 10)\ntm_shape(nz) + tm_polygons(col = \"Median_income\", palette = \"BuGn\")\ntm_shape(nz) + tm_polygons(\"Population\", palette = \"Blues\")\ntm_shape(nz) + tm_polygons(\"Population\", palette = \"YlOrBr\")"},{"path":"adv-map.html","id":"layouts","chapter":"8 Making maps with R","heading":"8.2.5 Layouts","text":"\nThe map layout 
refers to the combination of all map elements into a cohesive map.\nMap elements include among others the objects to be mapped, the title, the scale bar, margins and aspect ratios, while the color settings covered in the previous section relate to the palette and break-points used to affect how the map looks.\nBoth may result in subtle changes that can have an equally large impact on the impression left by your maps.Additional elements such as north arrows and scale bars have their own functions: tm_compass() and tm_scale_bar() (Figure 8.8).\nFIGURE 8.8: Map with additional elements - a north arrow and scale bar.\ntmap also allows a wide variety of layout settings to be changed, some of which, produced using the following code (see args(tm_layout) or ?tm_layout for a full list), are illustrated in Figure 8.9:\nFIGURE 8.9: Layout options specified by (from left to right) the title, scale, bg.color and frame arguments.\nThe other arguments in tm_layout() provide control over many more aspects of the map in relation to the canvas on which it is placed.\nHere are some useful layout settings (illustrated in Figure 8.10):Frame width (frame.lwd) and an option to allow double lines (frame.double.line)Margin settings including outer.margin and inner.marginFont settings controlled by fontface and fontfamilyLegend settings including binary options such as legend.show (whether or not to show the legend), legend.only (omit the map) and legend.outside (should the legend go outside the map?), as well as multiple choice settings such as legend.positionDefault colors of aesthetic layers (aes.color) and of map attributes such as the frame (attr.color)Color settings controlling sepia.intensity (how yellowy the map looks) and saturation (a color-grayscale)\nFIGURE 8.10: Illustration of selected layout options.\nThe impact of changing the color settings listed above is illustrated in Figure 8.11 (see ?tm_layout for the full list).\nFIGURE 8.11: Illustration of selected color-related layout options.\n\nBeyond the low-level control over layouts and colors, tmap also offers high-level styles, using the tm_style() function (representing the second meaning of ‘style’ in the package).\nSome styles such as tm_style(\"cobalt\") result in stylized maps, while others such as tm_style(\"gray\") make more subtle changes, as illustrated in Figure 8.12, created using the code below (see 08-tmstyles.R):\nFIGURE 8.12: Selected tmap styles.\n","code":"\nmap_nz + \n tm_compass(type = \"8star\", position = c(\"left\", \"top\")) 
+\n tm_scale_bar(breaks = c(0, 100, 200), text.size = 1)\nmap_nz + tm_layout(title = \"New Zealand\")\nmap_nz + tm_layout(scale = 5)\nmap_nz + tm_layout(bg.color = \"lightblue\")\nmap_nz + tm_layout(frame = FALSE)\nmap_nza + tm_style(\"bw\")\nmap_nza + tm_style(\"classic\")\nmap_nza + tm_style(\"cobalt\")\nmap_nza + tm_style(\"col_blind\")"},{"path":"adv-map.html","id":"faceted-maps","chapter":"8 Making maps with R","heading":"8.2.6 Faceted maps","text":"\n\nFaceted maps, also referred ‘small multiples,’ composed many maps arranged side--side, sometimes stacked vertically (Meulemans et al. 2017).\nFacets enable visualization spatial relationships change respect another variable, time.\nchanging populations settlements, example, can represented faceted map panel representing population particular moment time.\ntime dimension represented via another aesthetic color.\nHowever, risks cluttering map involve multiple overlapping points (cities tend move time!).Typically individual facets faceted map contain geometry data repeated multiple times, column attribute data (default plotting method sf objects, see Chapter 2).\nHowever, facets can also represent shifting geometries evolution point pattern time.\nuse case faceted plot illustrated Figure 8.13.\nFIGURE 8.13: Faceted map showing top 30 largest urban agglomerations 1970 2030 based population projections United Nations.\npreceding code chunk demonstrates key features faceted maps created tmap:Shapes facet variable repeated (countries world case)argument varies depending variable (year case).nrow/ncol setting specifying number rows columns facets arranged intoThe free.coords parameter specifying map bounding boxIn addition utility showing changing spatial relationships, faceted maps also useful foundation animated maps (see Section 8.3).","code":"\nurb_1970_2030 = urban_agglomerations %>% \n filter(year %in% c(1970, 1990, 2010, 2030))\n\ntm_shape(world) +\n tm_polygons() +\n tm_shape(urb_1970_2030) +\n tm_symbols(col = 
\"black\", border.col = \"white\", size = \"population_millions\") +\n tm_facets(by = \"year\", nrow = 2, free.coords = FALSE)"},{"path":"adv-map.html","id":"inset-maps","chapter":"8 Making maps with R","heading":"8.2.7 Inset maps","text":"\n\ninset map smaller map rendered within next main map.\nserve many different purposes, including providing context (Figure 8.14) bringing non-contiguous regions closer ease comparison (Figure 8.15).\nalso used focus smaller area detail cover area map, representing different topic.example , create map central part New Zealand’s Southern Alps.\ninset map show main map relation whole New Zealand.\nfirst step define area interest, can done creating new spatial object, nz_region.second step, create base map showing New Zealand’s Southern Alps area.\nplace important message stated.third step consists inset map creation.\ngives context helps locate area interest.\nImportantly, map needs clearly indicate location main map, example stating borders.Finally, combine two maps using function viewport() grid package, first arguments specify center location (x y) size (width height) inset map.\nFIGURE 8.14: Inset map providing context - location central part Southern Alps New Zealand.\nInset map can saved file either using graphic device (see Section 7.8) tmap_save() function arguments - insets_tm insets_vp.Inset maps also used create one map non-contiguous areas.\nProbably, often used example map United States, consists contiguous United States, Hawaii Alaska.\nimportant find best projection individual inset types cases (see Chapter 6 learn ).\ncan use US National Atlas Equal Area map contiguous United States putting EPSG code projection argument tm_shape().rest objects, hawaii alaska, already proper projections; therefore, just need create two separate maps:final map created combining arranging three maps:\nFIGURE 8.15: Map United States.\ncode presented compact can used basis inset maps results, Figure 8.15, provide poor representation 
locations Hawaii Alaska.\n-depth approach, see us-map vignette geocompkg.","code":"\nnz_region = st_bbox(c(xmin = 1340000, xmax = 1450000,\n ymin = 5130000, ymax = 5210000),\n crs = st_crs(nz_height)) %>% \n st_as_sfc()\nnz_height_map = tm_shape(nz_elev, bbox = nz_region) +\n tm_raster(style = \"cont\", palette = \"YlGn\", legend.show = TRUE) +\n tm_shape(nz_height) + tm_symbols(shape = 2, col = \"red\", size = 1) +\n tm_scale_bar(position = c(\"left\", \"bottom\"))\nnz_map = tm_shape(nz) + tm_polygons() +\n tm_shape(nz_height) + tm_symbols(shape = 2, col = \"red\", size = 0.1) + \n tm_shape(nz_region) + tm_borders(lwd = 3) \nlibrary(grid)\nnz_height_map\nprint(nz_map, vp = viewport(0.8, 0.27, width = 0.5, height = 0.5))\nus_states_map = tm_shape(us_states, projection = 2163) + tm_polygons() + \n tm_layout(frame = FALSE)\nhawaii_map = tm_shape(hawaii) + tm_polygons() + \n tm_layout(title = \"Hawaii\", frame = FALSE, bg.color = NA, \n title.position = c(\"LEFT\", \"BOTTOM\"))\nalaska_map = tm_shape(alaska) + tm_polygons() + \n tm_layout(title = \"Alaska\", frame = FALSE, bg.color = NA)\nus_states_map\nprint(hawaii_map, vp = grid::viewport(0.35, 0.1, width = 0.2, height = 0.1))\nprint(alaska_map, vp = grid::viewport(0.15, 0.15, width = 0.3, height = 0.3))"},{"path":"adv-map.html","id":"animated-maps","chapter":"8 Making maps with R","heading":"8.3 Animated maps","text":"\n\nFaceted maps, described Section 8.2.6, can show spatial distributions variables change (e.g., time), approach disadvantages.\nFacets become tiny many .\nFurthermore, fact facet physically separated screen page means subtle differences facets can hard detect.Animated maps solve issues.\nAlthough depend digital publication, becoming less issue content moves online.\nAnimated maps can still enhance paper reports: can always link readers web-page containing animated (interactive) version printed map help make come alive.\nseveral ways generate animations R, including animation packages gganimate, 
builds ggplot2 (see Section 8.6).\nsection focusses creating animated maps tmap syntax familiar previous sections flexibility approach.Figure 8.16 simple example animated map.\nUnlike faceted plot, squeeze multiple maps single screen allows reader see spatial distribution world’s populous agglomerations evolve time (see book’s website animated version).\nFIGURE 8.16: Animated map showing top 30 largest urban agglomerations 1950 2030 based population projects United Nations. Animated version available online : geocompr.robinlovelace.net.\nanimated map illustrated Figure 8.16 can created using tmap techniques generate faceted maps, demonstrated Section 8.2.6.\ntwo differences, however, related arguments tm_facets():along = \"year\" used instead = \"year\".free.coords = FALSE, maintains map extent map iteration.additional arguments demonstrated subsequent code chunk:resulting urb_anim represents set separate maps year.\nfinal stage combine save result .gif file tmap_animation().\nfollowing command creates animation illustrated Figure 8.16, elements missing, add exercises:Another illustration power animated maps provided Figure 8.17.\nshows development states United States, first formed east incrementally west finally interior.\nCode reproduce map can found script 08-usboundaries.R.\nFIGURE 8.17: Animated map showing population growth, state formation boundary changes United States, 1790-2010. 
Animated version available online geocompr.robinlovelace.net.\n","code":"\nurb_anim = tm_shape(world) + tm_polygons() + \n tm_shape(urban_agglomerations) + tm_dots(size = \"population_millions\") +\n tm_facets(along = \"year\", free.coords = FALSE)\ntmap_animation(urb_anim, filename = \"urb_anim.gif\", delay = 25)"},{"path":"adv-map.html","id":"interactive-maps","chapter":"8 Making maps with R","heading":"8.4 Interactive maps","text":"\n\nstatic animated maps can enliven geographic datasets, interactive maps can take new level.\nInteractivity can take many forms, common useful ability pan around zoom part geographic dataset overlaid ‘web map’ show context.\nLess advanced interactivity levels include popups appear click different features, kind interactive label.\nadvanced levels interactivity include ability tilt rotate maps, demonstrated mapdeck example , provision “dynamically linked” sub-plots automatically update user pans zooms (Pezanowski et al. 2018).important type interactivity, however, display geographic data interactive ‘slippy’ web maps.\nrelease leaflet package 2015 revolutionized interactive web map creation within R number packages built foundations adding new features (e.g., leaflet.extras) making creation web maps simple creating static maps (e.g., mapview tmap).\nsection illustrates approach opposite order.\nexplore make slippy maps tmap (syntax already learned), mapview finally leaflet (provides low-level control interactive maps).unique feature tmap mentioned Section 8.2 ability create static interactive maps using code.\nMaps can viewed interactively point switching view mode, using command tmap_mode(\"view\").\ndemonstrated code , creates interactive map New Zealand based tmap object map_nz, created Section 8.2.2, illustrated Figure 8.18:\nFIGURE 8.18: Interactive map New Zealand created tmap view mode. 
Interactive version available online at: geocompr.robinlovelace.net.\nNow that the interactive mode has been ‘turned on’, all maps produced with tmap will launch as interactive maps (another way to create them is with the tmap_leaflet function).\nNotable features of the interactive mode include the ability to specify the basemap with tm_basemap() (or tmap_options()), as demonstrated below (result not shown):\nAn impressive and little-known feature of tmap’s view mode is that it also works with faceted plots.\nThe argument sync in tm_facets() can be used in this case to produce multiple maps with synchronized zoom and pan settings, as illustrated in Figure 8.19, which was produced by the following code:\nFIGURE 8.19: Faceted interactive maps of global coffee production in 2016 and 2017 in sync, demonstrating tmap’s view mode in action.\nSwitch tmap back to plotting mode with the same function:\nIf you are not yet proficient with tmap, the quickest way to create interactive maps may be with mapview.\nThe following ‘one liner’ is a reliable way to interactively explore a wide range of geographic data formats:\nFIGURE 8.20: Illustration of mapview in action.\nmapview has a concise syntax, yet is powerful. By default, it provides standard GIS functionality such as mouse position information, attribute queries (via pop-ups), a scale bar, and zoom-to-layer buttons.\nIt offers advanced controls, including the ability to ‘burst’ datasets into multiple layers and to add multiple layers with + followed by the name of a geographic object.\nAdditionally, it provides automatic coloring of attributes (via the argument zcol).\nIn essence, it can be considered a data-driven leaflet API (see below for more information about leaflet).\nGiven that mapview always expects a spatial object (sf, Spatial*, Raster*) as its first argument, it works well at the end of piped expressions.\nConsider the following example, where sf is used to intersect lines and polygons, which are then visualized with mapview (Figure 8.21).\nFIGURE 8.21: Using mapview at the end of a sf-based pipe expression.\nOne important thing to keep in mind is that mapview layers are added via the + operator (similar to ggplot2 and tmap). 
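The layering behavior described above can be sketched in a minimal example. This is a sketch only, assuming the nz and nz_height datasets from the spData package (used elsewhere in this chapter) are available:

```r
library(sf)       # spatial vector classes
library(spData)   # provides the nz and nz_height datasets
library(mapview)

# Layers are combined with `+` (as in ggplot2 and tmap); zcol colors
# the polygons by an attribute column and layer.name labels the layer.
mapview(nz, zcol = "Median_income") +
  mapview(nz_height, color = "red", layer.name = "High points")
```

Note that + combines mapview layers even inside piped workflows, where %>% chains the data-processing steps that precede the mapview() call.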
This is a frequent gotcha in piped workflows, where the main binding operator is %>%.\nFor more information about mapview, see the package’s website at: r-spatial.github.io/mapview/.\nThere are other ways to create interactive maps with R.\nThe googleway package, for example, provides an interactive mapping interface that is flexible and extensible\n(see the googleway-vignette for details).\nAnother approach by the same author is mapdeck, which provides access to Uber’s Deck.gl framework.\nIts use of WebGL enables it to interactively visualize large datasets (up to millions of points).\nThe package uses Mapbox access tokens, which you must register for before using the package.\nA unique feature of mapdeck is its provision of interactive ‘2.5d’ perspectives, illustrated in Figure 8.22.\nThis means you can pan, zoom and rotate around the maps, and view the data ‘extruded’ from the map.\nFigure 8.22, generated by the following code chunk, visualizes road traffic crashes in the UK, with bar height representing casualties per area.\nFIGURE 8.22: Map generated by mapdeck, representing road traffic casualties across the UK. Height of 1 km cells represents number of crashes.\nIn the browser you can zoom and drag, in addition to rotating and tilting the map while pressing Cmd/Ctrl.\nMultiple layers can be added with the %>% operator, as demonstrated in the mapdeck vignette. Mapdeck also supports sf objects, as can be seen by replacing the add_grid() function call in the preceding code chunk with add_polygon(data = lnd, layer_id = \"polygon_layer\"), to add polygons representing London to an interactive tilted map.\nLast but not least is leaflet, the most mature and widely used interactive mapping package in R.\nleaflet provides a relatively low-level interface to the Leaflet JavaScript library; many of its arguments can be understood by reading the documentation of the original JavaScript library (see leafletjs.com). Leaflet maps are created with leaflet(), the result of which is a leaflet map object that can be piped to other leaflet functions.\nThis allows multiple map layers and control settings to be added interactively, as demonstrated in the code below, which generates Figure 8.23 (see rstudio.github.io/leaflet/ for details).\nFIGURE 8.23: The leaflet package in action, showing cycle hire points in London. 
See interactive version online.\n","code":"\ntmap_mode(\"view\")\nmap_nz\nmap_nz + tm_basemap(server = \"OpenTopoMap\")\nworld_coffee = left_join(world, coffee_data, by = \"name_long\")\nfacets = c(\"coffee_production_2016\", \"coffee_production_2017\")\ntm_shape(world_coffee) + tm_polygons(facets) + \n tm_facets(nrow = 1, sync = TRUE)\ntmap_mode(\"plot\")\n#> tmap mode set to plotting\nmapview::mapview(nz)\ntrails %>%\n st_transform(st_crs(franconia)) %>%\n st_intersection(franconia[franconia$district == \"Oberfranken\", ]) %>%\n st_collection_extract(\"LINE\") %>%\n mapview(color = \"red\", lwd = 3, layer.name = \"trails\") +\n mapview(franconia, zcol = \"district\", burst = TRUE) +\n breweries\nlibrary(mapdeck)\nset_token(Sys.getenv(\"MAPBOX\"))\ncrash_data = read.csv(\"https://git.io/geocompr-mapdeck\")\ncrash_data = na.omit(crash_data)\nms = mapdeck_style(\"dark\")\nmapdeck(style = ms, pitch = 45, location = c(0, 52), zoom = 4) %>%\nadd_grid(data = crash_data, lat = \"lat\", lon = \"lng\", cell_size = 1000,\n elevation_scale = 50, layer_id = \"grid_layer\",\n colour_range = viridisLite::plasma(6))\npal = colorNumeric(\"RdYlBu\", domain = cycle_hire$nbikes)\nleaflet(data = cycle_hire) %>% \n addProviderTiles(providers$CartoDB.Positron) %>%\n addCircles(col = ~pal(nbikes), opacity = 0.9) %>% \n addPolygons(data = lnd, fill = FALSE) %>% \n addLegend(pal = pal, values = ~nbikes) %>% \n setView(lng = -0.1, 51.5, zoom = 12) %>% \n addMiniMap()"},{"path":"adv-map.html","id":"mapping-applications","chapter":"8 Making maps with R","heading":"8.5 Mapping applications","text":"\ninteractive web maps demonstrated Section 8.4 can go far.\nCareful selection layers display, base-maps pop-ups can used communicate main results many projects involving geocomputation.\nweb mapping approach interactivity limitations:Although map interactive terms panning, zooming clicking, code static, meaning user interface fixedAll map content generally static web map, meaning web maps scale 
handle large datasets easilyAdditional layers interactivity, graphs showing relationships variables ‘dashboards’ difficult create using web-mapping approachOvercoming limitations involves going beyond static web mapping towards geospatial frameworks map servers.\nProducts field include GeoDjango (extends Django web framework written Python), MapGuide (framework developing web applications, largely written C++) GeoServer (mature powerful map server written Java).\n(particularly GeoServer) scalable, enabling maps served thousands people daily — assuming sufficient public interest maps!\nbad news server-side solutions require much skilled developer time set-maintain, often involving teams people roles dedicated geospatial database administrator (DBA).good news web mapping applications can now rapidly created using shiny, package converting R code interactive web applications.\nthanks support interactive maps via functions renderLeaflet(), documented Shiny integration section RStudio’s leaflet website.\nsection gives context, teaches basics shiny web mapping perspective culminates full-screen mapping application less 100 lines code.way shiny works well documented shiny.rstudio.com.\ntwo key elements shiny app reflect duality common web application development: ‘front end’ (bit user sees) ‘back end’ code.\nshiny apps, elements typically created objects named ui server within R script named app.R, lives ‘app folder.’\nallows web mapping applications represented single file, coffeeApp/app.R file book’s GitHub repo.considering large apps, worth seeing minimal example, named ‘lifeApp,’ action.42\ncode defines launches — command shinyApp() — lifeApp, provides interactive slider allowing users make countries appear progressively lower levels life expectancy (see Figure 8.24):\nFIGURE 8.24: Screenshot showing minimal example web mapping application created shiny.\nuser interface (ui) lifeApp created fluidPage().\ncontains input output ‘widgets’ — case, sliderInput() (many 
other *Input() functions are available) and leafletOutput().\nThese are arranged row-wise by default, explaining why the slider interface is placed directly above the map in Figure 8.24 (see ?column for adding content column-wise). The server side (server) is a function with input and output arguments.\noutput is a list of objects containing elements generated by render*() functions — renderLeaflet(), which in this example generates output$map.\nInput elements such as input$life referred to in the server must relate to elements that exist in the ui — defined by inputId = \"life\" in the code above.\nThe function shinyApp() combines both the ui and server elements and serves the results interactively via a new R process.\nWhen you move the slider in the map shown in Figure 8.24, you are actually causing R code to re-run, although this is hidden from view in the user interface. Building on this basic example, and knowing where to find help (see ?shiny), the best way forward now may be to stop reading and start programming!\nThe recommended next step is to open the previously mentioned CycleHireApp/app.R script in an IDE of your choice, then modify and re-run it repeatedly.\nThe example contains some of the components of a web mapping application implemented in shiny and should ‘shine’ a light on how they behave. The CycleHireApp/app.R script contains shiny functions that go beyond those demonstrated in the simple ‘lifeApp’ example.\nThese include reactive() and observe() (for creating outputs that respond to the user interface — see ?reactive) and leafletProxy() (for modifying a leaflet object that has already been created).\nSuch elements are critical to the creation of web mapping applications implemented in shiny.\nA range of ‘events’ can be programmed, including advanced functionality such as drawing new layers or subsetting data, as described in the shiny section of RStudio’s leaflet website. Experimenting with apps such as CycleHireApp will build not only your knowledge of web mapping applications in R, but also your practical skills.\nChanging the contents of setView(), for example, will change the starting bounding box that the user sees when the app is initiated.\nSuch experimentation should not be done at random, but with reference to relevant documentation, starting with ?shiny, and motivated by a desire to solve problems such as those posed in the exercises. shiny used in this way can make prototyping mapping applications faster and more accessible than ever before (deploying shiny apps is a separate topic beyond the scope of this chapter).\nEven if applications are eventually deployed using different technologies, shiny undoubtedly allows 
web mapping applications developed relatively lines code (76 case CycleHireApp).\nstop shiny apps getting rather large.\nPropensity Cycle Tool (PCT) hosted pct.bike, example, national mapping tool funded UK’s Department Transport.\nPCT used dozens people day multiple interactive elements based 1000 lines code (Lovelace et al. 2017).apps undoubtedly take time effort develop, shiny provides framework reproducible prototyping aid development process.\nOne potential problem ease developing prototypes shiny temptation start programming early, purpose mapping application envisioned detail.\nreason, despite advocating shiny, recommend starting longer established technology pen paper first stage interactive mapping projects.\nway prototype web applications limited technical considerations, motivations imagination.\nFIGURE 8.25: Hire cycle App, simple web mapping application finding closest cycle hiring station based location requirement cycles. Interactive version available online geocompr.robinlovelace.net.\n","code":"\nlibrary(shiny) # for shiny apps\nlibrary(leaflet) # renderLeaflet function\nlibrary(spData) # loads the world dataset \nui = fluidPage(\n sliderInput(inputId = \"life\", \"Life expectancy\", 49, 84, value = 80),\n leafletOutput(outputId = \"map\")\n )\nserver = function(input, output) {\n output$map = renderLeaflet({\n leaflet() %>% \n # addProviderTiles(\"OpenStreetMap.BlackAndWhite\") %>%\n addPolygons(data = world[world$lifeExp < input$life, ])})\n}\nshinyApp(ui, server)"},{"path":"adv-map.html","id":"other-mapping-packages","chapter":"8 Making maps with R","heading":"8.6 Other mapping packages","text":"tmap provides powerful interface creating wide range static maps (Section 8.2) also supports interactive maps (Section 8.4).\nmany options creating maps R.\naim section provide taster pointers additional resources: map making surprisingly active area R package development, learn can covered .mature option use plot() methods provided core spatial packages 
sf raster, covered Sections 2.2.3 2.3.3, respectively.\nmentioned sections plot methods raster vector objects can combined results draw onto plot area (elements keys sf plots multi-band rasters interfere ).\nbehavior illustrated subsequent code chunk generates Figure 8.26.\nplot() many options can explored following links ?plot help page sf vignette sf5.\nFIGURE 8.26: Map New Zealand created plot(). legend right refers elevation (1000 m sea level).\nSince version 2.3.0, tidyverse plotting package ggplot2 supported sf objects geom_sf().\nsyntax similar used tmap:\ninitial ggplot() call followed one layers, added + geom_*(), * represents layer type geom_sf() (sf objects) geom_points() (points).ggplot2 plots graticules default.\ndefault settings graticules can overridden using scale_x_continuous(), scale_y_continuous() coord_sf(datum = NA).\nnotable features include use unquoted variable names encapsulated aes() indicate aesthetics vary switching data sources using data argument, demonstrated code chunk creates Figure 8.27:\nFIGURE 8.27: Map New Zealand created ggplot2.\nadvantage ggplot2 strong user-community many add-packages.\nGood additional resources can found open source ggplot2 book (Wickham 2016) descriptions multitude ‘ggpackages’ ggrepel tidygraph.Another benefit maps based ggplot2 can easily given level interactivity printed using function ggplotly() plotly package.\nTry plotly::ggplotly(g1), example, compare result plotly mapping functions described : blog.cpsievert..time, ggplot2 drawbacks.\ngeom_sf() function always able create desired legend use spatial data.\nRaster objects also natively supported ggplot2 need converted data frame plotting.covered mapping sf, raster ggplot2 packages first packages highly flexible, allowing creation wide range static maps.\ncover mapping packages plotting specific type map (next paragraph), worth considering alternatives packages already covered general-purpose mapping (Table 8.1).\nTABLE 8.1: Selected general-purpose 
mapping packages.\nTable 8.1 shows range mapping packages available, many others listed table.\nnote cartography, generates range unusual maps including choropleth, ‘proportional symbol’ ‘flow’ maps, documented vignette cartography.Several packages focus specific map types, illustrated Table 8.2.\npackages create cartograms distort geographical space, create line maps, transform polygons regular hexagonal grids, visualize complex data grids representing geographic topologies.\nTABLE 8.2: Selected specific-purpose mapping packages, associated metrics.\naforementioned packages, however, different approaches data preparation map creation.\nnext paragraph, focus solely cartogram package.\nTherefore, suggest read linemap, geogrid geofacet documentations learn .cartogram map geometry proportionately distorted represent mapping variable.\nCreation type map possible R cartogram, allows creating continuous non-contiguous area cartograms.\nmapping package per se, allows construction distorted spatial objects plotted using generic mapping package.cartogram_cont() function creates continuous area cartograms.\naccepts sf object name variable (column) inputs.\nAdditionally, possible modify intermax argument - maximum number iterations cartogram transformation.\nexample, represent median income New Zeleand’s regions continuous cartogram (right-hand panel Figure 8.28) follows:\nFIGURE 8.28: Comparison standard map (left) continuous area cartogram (right).\ncartogram also offers creation non-contiguous area cartograms using cartogram_ncont() Dorling cartograms using cartogram_dorling().\nNon-contiguous area cartograms created scaling region based provided weighting variable.\nDorling cartograms consist circles area proportional weighting variable.\ncode chunk demonstrates creation non-contiguous area Dorling cartograms US states’ population (Figure 8.29):\nFIGURE 8.29: Comparison non-continuous area cartogram (left) Dorling cartogram (right).\nNew mapping packages emerging 
time.\n2018 alone, number mapping packages released CRAN, including mapdeck, mapsapi, rayshader.\nterms interactive mapping, leaflet.extras contains many functions extending functionality leaflet (see end point-pattern vignette geocompkg website examples heatmaps created leaflet.extras).","code":"\ng = st_graticule(nz, lon = c(170, 175), lat = c(-45, -40, -35))\nplot(nz_water, graticule = g, axes = TRUE, col = \"blue\")\nraster::plot(nz_elev / 1000, add = TRUE)\nplot(st_geometry(nz), add = TRUE)\nlibrary(ggplot2)\ng1 = ggplot() + geom_sf(data = nz, aes(fill = Median_income)) +\n geom_sf(data = nz_height) +\n scale_x_continuous(breaks = c(170, 175))\ng1\nlibrary(cartogram)\nnz_carto = cartogram_cont(nz, \"Median_income\", itermax = 5)\ntm_shape(nz_carto) + tm_polygons(\"Median_income\")\nus_states2163 = st_transform(us_states, 2163)\nus_states2163_ncont = cartogram_ncont(us_states2163, \"total_pop_15\")\nus_states2163_dorling = cartogram_dorling(us_states2163, \"total_pop_15\")"},{"path":"adv-map.html","id":"exercises-6","chapter":"8 Making maps with R","heading":"8.7 Exercises","text":"exercises rely new object, africa.\nCreate using world worldbank_df datasets spData package follows (see Chapter 3):also use zion nlcd datasets spDataLarge:Create map showing geographic distribution Human Development Index (HDI) across Africa base graphics (hint: use plot()) tmap packages (hint: use tm_shape(africa) + ...).\nName two advantages based experience.\nName three mapping packages advantage .\nBonus: create three maps Africa using three packages.\nName two advantages based experience.Name three mapping packages advantage .Bonus: create three maps Africa using three packages.Extend tmap created previous exercise legend three bins: “High” (HDI 0.7), “Medium” (HDI 0.55 0.7) “Low” (HDI 0.55).\nBonus: improve map aesthetics, example changing legend title, class labels color palette.\nBonus: improve map aesthetics, example changing legend title, class labels color 
palette.Represent africa’s subregions map.\nChange default color palette legend title.\nNext, combine map map created previous exercise single plot.Create land cover map Zion National Park.\nChange default colors match perception land cover categories\nAdd scale bar north arrow change position improve map’s aesthetic appeal\nBonus: Add inset map Zion National Park’s location context Utah state. (Hint: object representing Utah can subset us_states dataset.)\nChange default colors match perception land cover categoriesAdd scale bar north arrow change position improve map’s aesthetic appealBonus: Add inset map Zion National Park’s location context Utah state. (Hint: object representing Utah can subset us_states dataset.)Create facet maps countries Eastern Africa:\none facet showing HDI representing population growth (hint: using variables HDI pop_growth, respectively)\n‘small multiple’ per country\none facet showing HDI representing population growth (hint: using variables HDI pop_growth, respectively)‘small multiple’ per countryBuilding previous facet map examples, create animated maps East Africa:\nShowing first spatial distribution HDI scores population growth\nShowing country order\nShowing first spatial distribution HDI scores population growthShowing country orderCreate interactive map Africa:\ntmap\nmapview\nleaflet\nBonus: approach, add legend (automatically provided) scale bar\ntmapWith mapviewWith leafletBonus: approach, add legend (automatically provided) scale barSketch paper ideas web mapping app used make transport land-use policies evidence based:\ncity live, couple users per day\ncountry live, dozens users per day\nWorldwide hundreds users per day large data serving requirements\ncity live, couple users per dayIn country live, dozens users per dayWorldwide hundreds users per day large data serving requirementsUpdate code coffeeApp/app.R instead centering Brazil user can select country focus :\nUsing textInput()\nUsing selectInput()\nUsing 
textInput()Using selectInput()Reproduce Figure 8.1 1st 6th panel Figure 8.6 closely possible using ggplot2 package.Join us_states us_states_df together calculate poverty rate state using new dataset.\nNext, construct continuous area cartogram based total population.\nFinally, create compare two maps poverty rate: (1) standard choropleth map (2) map using created cartogram boundaries.\ninformation provided first second map?\ndiffer ?Visualize population growth Africa.\nNext, compare maps hexagonal regular grid created using geogrid package.","code":"\nafrica = world %>% \n filter(continent == \"Africa\", !is.na(iso_a2)) %>% \n left_join(worldbank_df, by = \"iso_a2\") %>% \n dplyr::select(name, subregion, gdpPercap, HDI, pop_growth) %>% \n st_transform(\"+proj=aea +lat_1=20 +lat_2=-23 +lat_0=0 +lon_0=25\")\nzion = st_read((system.file(\"vector/zion.gpkg\", package = \"spDataLarge\")))\ndata(nlcd, package = \"spDataLarge\")"},{"path":"gis.html","id":"gis","chapter":"9 Bridges to GIS software","heading":"9 Bridges to GIS software","text":"","code":""},{"path":"gis.html","id":"prerequisites-7","chapter":"9 Bridges to GIS software","heading":"Prerequisites","text":"chapter requires QGIS, SAGA GRASS installed following packages attached:43","code":"\nlibrary(sf)\nlibrary(raster)\n#> Warning: multiple methods tables found for 'approxNA'\n#library(RQGIS)\nlibrary(RSAGA)\nlibrary(rgrass7)"},{"path":"gis.html","id":"introduction-6","chapter":"9 Bridges to GIS software","heading":"9.1 Introduction","text":"defining feature R way interact :\ntype commands hit Enter (Ctrl+Enter writing code source editor RStudio) execute interactively.\nway interacting computer called command-line interface (CLI) (see definition note ).\nCLIs unique R.44\ndedicated GIS packages, contrast, emphasis tends graphical user interface (GUI).\ncan interact GRASS, QGIS, SAGA gvSIG system terminals embedded CLIs Python Console QGIS, ‘pointing clicking’ norm.\nmeans many GIS users miss advantages 
of the command line, according to Gary Sherman, creator of QGIS (Sherman 2008):With the advent of ‘modern’ GIS software, most people want to point and\nclick their way through life. That’s good, but there is a tremendous amount of\nflexibility and power waiting for you at the command line. Many times you\ncan do something on the command line in a fraction of the time you\ncan do it with a GUI.The ‘CLI vs GUI’ debate can be adversarial; in practice the options can be used interchangeably, depending on the task at hand and the user’s skillset.45\nThe advantages of a good CLI such as that provided by R (and enhanced by IDEs such as RStudio) are numerous.\nA good CLI:\nFacilitates automation of repetitive tasks;\nEnables transparency and reproducibility, the backbone of good scientific practice and data science;\nEncourages software development by providing tools to modify existing functions and implement new ones;\nHelps develop future-proof programming skills which are in high demand in many disciplines and industries; and\nIs user-friendly and fast, allowing an efficient workflow.\nOn the other hand, GUI-based GIS systems (particularly QGIS) are also advantageous.\nA good GIS GUI:\nHas a ‘shallow’ learning curve, meaning geographic data can be explored and visualized without hours of learning a new language;\nProvides excellent support for ‘digitizing’ (creating new vector datasets), including trace, snap and topological tools;46\nEnables georeferencing (matching raster images to existing maps) with ground control points and orthorectification;\nSupports stereoscopic mapping (e.g., LiDAR and structure from motion); and\nProvides access to spatial database management systems with object-oriented and relational data models, topology and fast (spatial) querying.\nAnother advantage of dedicated GISs is that they provide access to hundreds of ‘geoalgorithms’ (computational recipes to solve geographic problems — see Chapter 10).\nMany of these are unavailable from the R command line, except via ‘GIS bridges’, the topic of (and motivation for) this chapter.47\nR originated as an interface language.\nIts predecessor S provided access to statistical algorithms in other languages (particularly FORTRAN), but from an intuitive read-evaluate-print loop (REPL) (Chambers 2016).\nR continues this tradition with interfaces to numerous languages, notably C++, as described in Chapter 1.\nR was not designed as a GIS.\nHowever, its ability to interface with dedicated GISs gives it astonishing geospatial 
capabilities.\nR well known statistical programming language, many people unaware ability replicate GIS workflows, additional benefits (relatively) consistent CLI.\nFurthermore, R outperforms GISs areas geocomputation, including interactive/animated map making (see Chapter 8) spatial statistical modeling (see Chapter 11).\nchapter focuses ‘bridges’ three mature open source GIS products (see Table 9.1): QGIS (via package RQGIS; Section 9.2), SAGA (via RSAGA; Section 9.3) GRASS (via rgrass7; Section 9.4).\nThough covered , worth aware interface ArcGIS, proprietary popular GIS software, via RPyGeo.48\ncomplement R-GIS bridges, chapter ends brief introduction interfaces spatial libraries (Section 9.6.1) spatial databases (Section 9.6.2).TABLE 9.1: Comparison three open-source GIS. Hybrid refers support vector raster operations.","code":""},{"path":"gis.html","id":"rqgis","chapter":"9 Bridges to GIS software","heading":"9.2 (R)QGIS","text":"QGIS one popular open-source GIS [Table 9.1; Graser Olaya (2015)].\nmain advantage lies fact provides unified interface several open-source GIS.\nmeans access GDAL, GRASS SAGA QGIS (Graser Olaya 2015).\nrun geoalgorithms (frequently 1000 depending set-) outside QGIS GUI, QGIS provides Python API.\nRQGIS establishes tunnel Python API reticulate package.\nBasically, functions set_env() open_app() .\nNote optional run set_env() open_app() since functions depending output run automatically needed.\nrunning RQGIS, make sure installed QGIS (third-party) dependencies SAGA GRASS.\ninstall RQGIS number dependencies required, described install_guide vignette, covers installation Windows, Linux Mac.\ntime writing (autumn 2018) RQGIS supports Long Term Release (2.18), support QGIS 3 pipeline (see RQGIS3).Leaving path-argument set_env() unspecified search computer QGIS installation.\nHence, faster specify explicitly path QGIS installation.\nSubsequently, open_app() sets paths necessary run QGIS within R, finally creates -called QGIS custom 
application (see http://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/intro.html#using-pyqgis--custom-applications).now ready QGIS geoprocessing within R!\nexample shows unite polygons, process unfortunately produces -called slivers , tiny polygons resulting overlaps inputs frequently occur real-world data.\nsee remove .union, use incongruent polygons already encountered Section 4.2.5.\npolygon datasets available spData package, like use geographic CRS (see also Chapter 6).find algorithm work, find_algorithms() searches QGIS geoalgorithms using regular expressions.\nAssuming short description function contains word “union”, can run:Short descriptions geoalgorithm can also provided, setting name_only = FALSE.\none clue name geoalgorithm might , one can leave search_term-argument empty return list available QGIS geoalgorithms.\ncan also find algorithms QGIS online documentation.next step find qgis:union can used.\nopen_help() opens online help geoalgorithm question.\nget_usage() returns function parameters default values.Finally, can let QGIS work.\nNote workhorse function run_qgis() accepts R named arguments, .e., can specify parameter names returned get_usage() run_qgis() regular R function.\nNote also run_qgis() accepts spatial objects residing R’s global environment input (: aggzone_wgs incongr_wgs).\ncourse, also specify paths spatial vector files stored disk.\nSetting load_output TRUE automatically loads QGIS output sf-object R.Note QGIS union operation merges two input layers one layer using intersection symmetrical difference two input layers (way also default union operation GRASS SAGA).\nst_union(incongr_wgs, aggzone_wgs) (see Exercises)!\nQGIS output contains empty geometries multipart polygons.\nEmpty geometries might lead problems subsequent geoprocessing tasks deleted.\nst_dimension() returns NA geometry empty, can therefore used filter.Next convert multipart polygons single-part polygons (also known explode geometries casting).\nnecessary 
deletion sliver polygons later .One way identify slivers find polygons comparatively small areas, , e.g., 25000 m2 (see blue colored polygons left panel Figure 9.1).next step find function makes slivers disappear.\nAssuming function short description contains word “sliver,” can run:returns one geoalgorithm whose parameters can accessed help get_usage() .Conveniently, user need specify single parameter.\ncase parameter left unspecified, run_qgis() automatically use corresponding default value argument available.\nfind default values, run get_args_man().remove slivers, specify polygons area less equal 25,000 m2 joined neighboring polygon largest area (see right panel Figure 9.1).\nFIGURE 9.1: Sliver polygons colored blue (left panel). Cleaned polygons (right panel).\ncode chunk note thatleaving output parameter(s) unspecified saves resulting QGIS output temporary folder created QGIS;\nrun_qgis() prints paths console successfully running QGIS engine; andif output consists multiple files set load_output TRUE, run_qgis() return list element corresponding one output file.learn RQGIS, see Muenchow, Schratz, Brenning (2017).","code":"\n#library(RQGIS)\nset_env(dev = FALSE)\n#> $`root`\n#> [1] \"C:/OSGeo4W64\"\n#> ...\nopen_app()\ndata(\"incongruent\", \"aggregating_zones\", package = \"spData\")\nincongr_wgs = st_transform(incongruent, 4326)\naggzone_wgs = st_transform(aggregating_zones, 4326)\nfind_algorithms(\"union\", name_only = TRUE)\n#> [1] \"qgis:union\" \"saga:fuzzyunionor\" \"saga:union\"\nalg = \"qgis:union\"\nopen_help(alg)\nget_usage(alg)\n#>ALGORITHM: Union\n#> INPUT \n#> INPUT2 \n#> OUTPUT \nunion = run_qgis(alg, INPUT = incongr_wgs, INPUT2 = aggzone_wgs, \n OUTPUT = file.path(tempdir(), \"union.shp\"),\n load_output = TRUE)\n#> $`OUTPUT`\n#> [1] \"C:/Users/geocompr/AppData/Local/Temp/RtmpcJlnUx/union.shp\"\n# remove empty geometries\nunion = union[!is.na(st_dimension(union)), ]\n# multipart polygons to single polygons\nsingle = st_cast(union, \"POLYGON\")\n# 
find polygons which are smaller than 25000 m^2\nx = 25000\nunits(x) = \"m^2\"\nsingle$area = st_area(single)\nsub = dplyr::filter(single, area < x)\nfind_algorithms(\"sliver\", name_only = TRUE)\n#> [1] \"qgis:eliminatesliverpolygons\"\nalg = \"qgis:eliminatesliverpolygons\"\nget_usage(alg)\n#>ALGORITHM: Eliminate sliver polygons\n#> INPUT \n#> KEEPSELECTION \n#> ATTRIBUTE \n#> COMPARISON \n#> COMPARISONVALUE \n#> MODE \n#> OUTPUT \n#> ...\nclean = run_qgis(\"qgis:eliminatesliverpolygons\",\n INPUT = single,\n ATTRIBUTE = \"area\",\n COMPARISON = \"<=\",\n COMPARISONVALUE = 25000,\n OUTPUT = file.path(tempdir(), \"clean.shp\"),\n load_output = TRUE)\n#> $`OUTPUT`\n#> [1] \"C:/Users/geocompr/AppData/Local/Temp/RtmpcJlnUx/clean.shp\""},{"path":"gis.html","id":"rsaga","chapter":"9 Bridges to GIS software","heading":"9.3 (R)SAGA","text":"System Automated Geoscientific Analyses (SAGA; Table 9.1) provides possibility execute SAGA modules via command line interface (saga_cmd.exe Windows just saga_cmd Linux) (see SAGA wiki modules).\naddition, Python interface (SAGA Python API).\nRSAGA uses former run SAGA within R.Though SAGA hybrid GIS, main focus raster processing, particularly digital elevation models (soil properties, terrain attributes, climate parameters).\nHence, SAGA especially good fast processing large (high-resolution) raster datasets (Conrad et al. 
2015).\nTherefore, introduce RSAGA raster use case Muenchow, Brenning, Richter (2012).\nSpecifically, like compute SAGA wetness index digital elevation model.\nFirst , need make sure RSAGA find SAGA computer called.\n, RSAGA functions using SAGA background make use rsaga.env().\nUsually, rsaga.env() detect SAGA automatically searching several likely directories (see help information).However, possible ‘hidden’ SAGA location rsaga.env() search automatically.\nlinkSAGA searches computer valid SAGA installation.\nfinds one, adds newest version PATH environment variable thereby making sure rsaga.env() runs successfully.\nnecessary run next code chunk rsaga.env() unsuccessful (see previous code chunk).Secondly, need write digital elevation model SAGA-format.\nNote calling data(landslides) attaches two objects global environment - dem, digital elevation model form list, landslides, data.frame containing observations representing presence absence landslide:organization SAGA modular.\nLibraries contain -called modules, .e., geoalgorithms.\nfind libraries available, run (output shown):choose library ta_hydrology (ta abbreviation terrain analysis).\nSubsequently, can access available modules specific library (: ta_hydrology) follows:rsaga.get.usage() prints function parameters specific geoalgorithm, e.g., SAGA Wetness Index, console.Finally, can run SAGA within R using RSAGA’s geoprocessing workhorse function rsaga.geoprocessor().\nfunction expects parameter-argument list specified necessary parameters.facilitate access SAGA interface, RSAGA frequently provides user-friendly wrapper-functions meaningful default values (see RSAGA documentation examples, e.g., ?rsaga.wetness.index).\nfunction call calculating ‘SAGA Wetness Index’ becomes:course, like inspect result visually (Figure 9.2).\nload plot SAGA output file, use raster package.\nFIGURE 9.2: SAGA wetness index Mount Mongón, Peru.\ncan find extended version example vignette(\"RSAGA-landslides\") includes use statistical 
geocomputation to derive terrain attributes as predictors for a non-linear Generalized Additive Model (GAM) to predict landslide susceptibility spatially (Muenchow, Brenning, and Richter 2012).\nThe term statistical geocomputation emphasizes the strength of combining R's data science power with the geoprocessing power of a GIS, which is at the heart of building a bridge from R to GIS.","code":"\nlibrary(RSAGA)\nrsaga.env()\n#> Search for SAGA command line program and modules... \n#> Done\n#> $workspace\n#> [1] \".\"\n#> ...\nlibrary(link2GI)\nsaga = linkSAGA()\nrsaga.env()\ndata(landslides)\nwrite.sgrd(data = dem, file = file.path(tempdir(), \"dem\"), header = dem$header)\nrsaga.get.libraries()\nrsaga.get.modules(libs = \"ta_hydrology\")\nrsaga.get.usage(lib = \"ta_hydrology\", module = \"SAGA Wetness Index\")\nparams = list(DEM = file.path(tempdir(), \"dem.sgrd\"),\n TWI = file.path(tempdir(), \"twi.sdat\"))\nrsaga.geoprocessor(lib = \"ta_hydrology\", module = \"SAGA Wetness Index\", \n param = params)\nrsaga.wetness.index(in.dem = file.path(tempdir(), \"dem\"), \n out.wetness.index = file.path(tempdir(), \"twi\"))\nlibrary(raster)\ntwi = raster::raster(file.path(tempdir(), \"twi.sdat\"))\n# shown is a version using tmap\nplot(twi, col = RColorBrewer::brewer.pal(n = 9, name = \"Blues\"))"},{"path":"gis.html","id":"rgrass","chapter":"9 Bridges to GIS software","heading":"9.4 GRASS through rgrass7","text":"The U.S. Army - Construction Engineering Research Laboratory (USA-CERL) created the core of the Geographical Resources Analysis Support System (GRASS) [Table 9.1; Neteler and Mitasova (2008)] from 1982 to 1995.\nAcademia has continued this work since 1997.\nSimilar to SAGA, GRASS focused on raster processing in the beginning while only later, since GRASS 6.0, adding advanced vector functionality (R. 
Bivand, Pebesma, and Gómez-Rubio 2013).\nHere, we introduce rgrass7 with one of the most interesting problems in GIScience - the traveling salesman problem.\nSuppose a traveling salesman would like to visit 24 customers.\nAdditionally, he would like to start and finish his journey at home, which makes a total of 25 locations while covering the shortest distance possible.\nThere is a single best solution to this problem; however, to check all of the possible solutions is (mostly) impossible for modern computers (P. Longley 2015).\nIn our case, the number of possible solutions correspond to (25 - 1)! / 2, i.e., the factorial of 24 divided by 2 (since we do not differentiate between forward or backward direction).\nEven if one iteration can be done in a nanosecond, this still corresponds to 9837145 years.\nLuckily, there are clever, almost optimal solutions which run in a tiny fraction of this inconceivable amount of time.\nGRASS GIS provides one of these solutions (for more details, see v.net.salesman).\nIn our use case, we would like to find the shortest path between the first 25 bicycle stations (instead of customers) on London's streets (and we simply assume that the first bike station corresponds to the home of our traveling salesman).\nAside from the cycle hire points data, we need OpenStreetMap data of London.\nWe download it with the help of the osmdata package (see also Section 7.2).\nWe constrain the download of the street network (in OSM language called \"highway\") to the bounding box of the cycle hire data, and attach the corresponding data as an sf-object.\nosmdata_sf() returns a list with several spatial objects (points, lines, polygons, etc.).\nHere, we will only keep the line objects.\nOpenStreetMap objects come with a lot of columns, streets features almost 500.\nIn fact, we are only interested in the geometry column.\nNevertheless, we are keeping one attribute column; otherwise, we will run into trouble when trying to provide writeVECT() only with a geometry object (see ?writeVECT for more details).\nRemember that the geometry column is sticky, hence, even though we are just selecting one attribute, the geometry column will also be returned (see Section 2.2.1).\nFor the convenience of the reader, one can attach london_streets to the global environment using data(\"london_streets\", package = \"spDataLarge\").\nNow that we have the data, we can go on and initiate a GRASS session, i.e., we have to create a GRASS spatial database.\nThe GRASS geodatabase system is based on SQLite.\nConsequently, different users can easily work on the same project, possibly with different read/write 
permissions.\nHowever, one has to set up this spatial database (also from within R), and users might find this process a bit intimidating in the beginning, especially if they are used to a GIS GUI popping up after one click.\nFirst of all, the GRASS database requires its own directory, which contains a location (see the GRASS GIS Database help pages at grass.osgeo.org for further information).\nThe location in turn simply contains the geodata for one project.\nWithin one location, several mapsets can exist that typically refer to different users.\nPERMANENT is a mandatory mapset and is created automatically.\nIt stores the projection, the spatial extent and the default resolution for raster data.\nIn order to share geographic data with all users of a project, the database owner can add spatial data to the PERMANENT mapset.\nPlease refer to Neteler and Mitasova (2008) and the GRASS GIS quick start for more information on the GRASS spatial database system.\nYou have to set up a location and a mapset if you want to use GRASS from within R.\nFirst of all, we need to find out if and where GRASS 7 is installed on the computer.\nlink is a data.frame which contains in its rows the GRASS 7 installations on your computer.\nHere, we will use one GRASS 7 installation.\nIf you have not installed GRASS 7 on your computer, we recommend that you do so now.\nAssuming that we have found a working installation on your computer, we use the corresponding path in initGRASS.\nAdditionally, we specify where to store the spatial database (gisDbase), name the location london, and use the PERMANENT mapset.\nSubsequently, we define the projection, the extent and the resolution.\nOnce you are familiar with how to set up the GRASS environment, it becomes tedious to do so over and over again.\nLuckily, linkGRASS7() of the link2GI package lets you do it with one line of code.\nThe only thing you need to provide is a spatial object which determines the projection and the extent of the spatial database.\nFirst, linkGRASS7() finds all GRASS installations on your computer.\nSince we have set ver_select to TRUE, we can interactively choose one of the found GRASS-installations.\nIf there is just one installation, linkGRASS7() automatically chooses this one.\nSecond, linkGRASS7() establishes a connection to GRASS 7.\nBefore we can use GRASS geoalgorithms, we need to add data to GRASS's spatial database.\nLuckily, the convenience function writeVECT() does this for us.\n(Use writeRAST() in the case of raster data.)\nIn our case we add the street and cycle hire point data while only using the first attribute column, and name them also london_streets and points.\nTo use sf-objects with rgrass7, we have to run use_sf() first (note: the code below assumes you are running rgrass7 0.2.1 or above).\nTo perform our network 
analysis, we need a topologically clean street network.\nGRASS's v.clean takes care of the removal of duplicates, small angles and dangles, among others.\nHere, we break lines at each intersection to ensure that the subsequent routing algorithm can actually turn right or left at an intersection, and save the output in a GRASS object named streets_clean.\nIt is likely that a few of our cycling station points will not lie exactly on a street segment.\nHowever, to find the shortest route between them, we need to connect them to the nearest streets segment.\nv.net's connect-operator does exactly this.\nWe save its output in streets_points_con.\nThe resulting clean dataset serves as input for the v.net.salesman-algorithm, which finally finds the shortest route between all cycle hire stations.\ncenter_cats requires a numeric range as input.\nThis range represents the points for which a shortest route should be calculated.\nSince we would like to calculate the route for all cycle stations, we set it to 1-25.\nTo access the GRASS help page of the traveling salesman algorithm, run execGRASS(\"g.manual\", entry = \"v.net.salesman\").\nTo visualize our result, we import the output layer into R, convert it into an sf-object keeping only the geometry, and visualize it with the help of the mapview package (Figure 9.3 and Section 8.4).\nFIGURE 9.3: Shortest route (blue line) between 24 cycle hire stations (blue dots) on the OSM street network of London.\nThere are a few important considerations to note in the process:\nWe used GRASS's spatial database (based on SQLite) which allows faster processing.\nThat means we have only exported geographic data at the beginning.\nThen we created new objects but only imported the final result back into R.\nTo find out which datasets are currently available, run execGRASS(\"g.list\", type = \"vector,raster\", flags = \"p\").\nWe could have also accessed an already existing GRASS spatial database from within R.\nPrior to importing data into R, you might want to perform some (spatial) subsetting.\nUse v.select and v.extract for vector data.\ndb.select lets you select subsets of the attribute table of a vector layer without returning the corresponding geometry.\nYou can also start R from within a running GRASS session (for more information please refer to R. 
Bivand, Pebesma, Gómez-Rubio 2013 wiki).Refer excellent GRASS online help execGRASS(\"g.manual\", flags = \"\") information available GRASS geoalgorithm.like use GRASS 6 within R, use R package spgrass6.","code":"\ndata(\"cycle_hire\", package = \"spData\")\npoints = cycle_hire[1:25, ]\nlibrary(osmdata)\nb_box = st_bbox(points)\nlondon_streets = opq(b_box) %>%\n add_osm_feature(key = \"highway\") %>%\n osmdata_sf() %>%\n `[[`(\"osm_lines\")\nlondon_streets = dplyr::select(london_streets, osm_id)\nlibrary(link2GI)\nlink = findGRASS() \nlibrary(rgrass7)\n# find a GRASS 7 installation, and use the first one\nind = grep(\"7\", link$version)[1]\n# next line of code only necessary if we want to use GRASS as installed by \n# OSGeo4W. Among others, this adds some paths to PATH, which are also needed\n# for running GRASS.\nlink2GI::paramGRASSw(link[ind, ])\ngrass_path = \n ifelse(test = !is.null(link$installation_type) && \n link$installation_type[ind] == \"osgeo4W\",\n yes = file.path(link$instDir[ind], \"apps/grass\", link$version[ind]),\n no = link$instDir)\ninitGRASS(gisBase = grass_path,\n # home parameter necessary under UNIX-based systems\n home = tempdir(),\n gisDbase = tempdir(), location = \"london\", \n mapset = \"PERMANENT\", override = TRUE)\nexecGRASS(\"g.proj\", flags = c(\"c\", \"quiet\"), \n proj4 = st_crs(london_streets)$proj4string)\nb_box = st_bbox(london_streets) \nexecGRASS(\"g.region\", flags = c(\"quiet\"), \n n = as.character(b_box[\"ymax\"]), s = as.character(b_box[\"ymin\"]), \n e = as.character(b_box[\"xmax\"]), w = as.character(b_box[\"xmin\"]), \n res = \"1\")\nlink2GI::linkGRASS7(london_streets, ver_select = TRUE)\nuse_sf()\nwriteVECT(SDF = london_streets, vname = \"london_streets\")\nwriteVECT(SDF = points[, 1], vname = \"points\")\n# clean street network\nexecGRASS(cmd = \"v.clean\", input = \"london_streets\", output = \"streets_clean\",\n tool = \"break\", flags = \"overwrite\")\n# connect points with street network\nexecGRASS(cmd = 
\"v.net\", input = \"streets_clean\", output = \"streets_points_con\", \n points = \"points\", operation = \"connect\", threshold = 0.001,\n flags = c(\"overwrite\", \"c\"))\nexecGRASS(cmd = \"v.net.salesman\", input = \"streets_points_con\",\n output = \"shortest_route\", center_cats = paste0(\"1-\", nrow(points)),\n flags = c(\"overwrite\"))\nroute = readVECT(\"shortest_route\") %>%\n st_as_sf() %>%\n st_geometry()\nmapview::mapview(route, map.types = \"OpenStreetMap.BlackAndWhite\", lwd = 7) +\n points"},{"path":"gis.html","id":"when-to-use-what","chapter":"9 Bridges to GIS software","heading":"9.5 When to use what?","text":"To recommend a single R-GIS interface is hard since the usage depends on personal preferences, the tasks at hand and your familiarity with different GIS software packages, which in turn probably depends on your field of study.\nAs mentioned previously, SAGA is especially good at the fast processing of large (high-resolution) raster datasets, and is frequently used by hydrologists, climatologists and soil scientists (Conrad et al. 2015).\nGRASS GIS, on the other hand, is the only GIS presented here supporting a topologically based spatial database, which is especially useful for network analyses but also simulation studies (see below).\nQGIS is much more user-friendly compared to GRASS- and SAGA-GIS, especially for first-time GIS users, and is probably the most popular open-source GIS.\nTherefore, RQGIS is an appropriate choice for most use cases.\nIts main advantages are:\na unified access to several GIS, and therefore the provision of >1000 geoalgorithms (Table 9.1) including duplicated functionality, e.g., you can perform overlay-operations using QGIS-, SAGA- or GRASS-geoalgorithms;\nautomatic data format conversions (SAGA uses .sdat grid files and GRASS uses its own database format but QGIS will handle the corresponding conversions);\nits automatic passing of geographic R objects to QGIS geoalgorithms and back into R; and\nconvenience functions to support access to the online help, named arguments and automatic default value retrieval (rgrass7 inspired the latter two features).\nBy all means, there are use cases when you certainly should use one of the other R-GIS bridges.\nThough QGIS is the only GIS providing a unified interface to several GIS software packages, it only provides access to 
a subset of the corresponding third-party geoalgorithms (for more information please refer to Muenchow, Schratz, and Brenning (2017)).\nTherefore, to use the complete set of SAGA and GRASS functions, stick with RSAGA and rgrass7.\nIn addition, you can take advantage of RSAGA's numerous user-friendly functions.\nNote also that RSAGA offers native R functions for geocomputation such as multi.local.function(), pick.from.points() and many more.\nRSAGA supports many more SAGA versions than (R)QGIS.\nFinally, if you need topologically correct data and/or spatial database management functionality such as multi-user access, we recommend the usage of GRASS.\nIn addition, if you would like to run simulations with the help of a geodatabase (Krug, Roura-Pascual, and Richardson 2010), use rgrass7 directly since RQGIS always starts a new GRASS session for each call.\nPlease note that there are a number of GIS software packages that have a scripting interface but for which there is no dedicated R package that accesses these: gvSig, OpenJump, Orfeo Toolbox and TauDEM.","code":""},{"path":"gis.html","id":"other-bridges","chapter":"9 Bridges to GIS software","heading":"9.6 Other bridges","text":"The focus of this chapter has been on R interfaces to Desktop GIS software.\nWe emphasize these bridges because dedicated GIS software is well-known and a common 'way in' to understanding geographic data.\nThey also provide access to many geoalgorithms.\nOther 'bridges' include interfaces to spatial libraries (Section 9.6.1 shows how to access the GDAL CLI from R), spatial databases (see Section 9.6.2) and web mapping services (see Chapter 8).\nThis section provides only a snippet of what is possible.\nThanks to R's flexibility, with its ability to call other programs from the system and its integration with other languages (notably via Rcpp and reticulate), many other bridges are possible.\nThe aim is not to be comprehensive, but to demonstrate other ways of accessing the 'flexibility and power' in the quote by Sherman (2008) at the beginning of the chapter.","code":""},{"path":"gis.html","id":"gdal","chapter":"9 Bridges to GIS software","heading":"9.6.1 Bridges to GDAL","text":"As discussed in Chapter 7, GDAL is a low-level library that supports many geographic data formats.\nGDAL is so effective that most GIS programs use GDAL in the background for importing and exporting geographic data, rather than re-inventing the wheel and using bespoke read-write code.\nBut GDAL offers more than data I/O.\nIt has geoprocessing tools for vector and raster data, functionality to create tiles 
for serving raster data online, and rapid rasterization of vector data, all of which can be accessed via the system command line of R.\nThe code chunk below demonstrates this functionality:\nlinkGDAL() searches the computer for a working GDAL installation and adds the location of the executable files to the PATH variable, allowing GDAL to be called.\nIn the example below, ogrinfo provides metadata of a vector dataset:\nThis example — which returns the same result as rgdal::ogrInfo() — may be simple, but it shows how to use GDAL via the system command-line, independently of other packages.\nThe 'link' to GDAL provided by link2gi could be used as a foundation for doing more advanced GDAL work from the R or system CLI.49\nTauDEM (http://hydrology.usu.edu/taudem/taudem5/index.html) and the Orfeo Toolbox (https://www.orfeo-toolbox.org/) are other spatial data processing libraries/programs offering a command line interface.\nAt the time of writing, it appears that there is only a developer version of an R/TauDEM interface on R-Forge (https://r-forge.r-project.org/R/?group_id=956).\nIn any case, the above example shows how to access these libraries from the system command line via R.\nThis in turn could be the starting point for creating a proper interface to these libraries in the form of new R packages.\nBefore diving into a project to create a new bridge, however, it is important to be aware of the power of existing R packages and that system() calls may not be platform-independent (they may fail on some computers).\nFurthermore, sf brings most of the power provided by GDAL, GEOS and PROJ to R via the R/C++ interface provided by Rcpp, which avoids system() calls.","code":"\nlink2GI::linkGDAL()\ncmd = paste(\"ogrinfo -ro -so -al\", system.file(\"shape/nc.shp\", package = \"sf\"))\nsystem(cmd)\n#> INFO: Open of `C:/Users/geocompr/Documents/R/win-library/3.5/sf/shape/nc.shp'\n#> using driver `ESRI Shapefile' successful.\n#> \n#> Layer name: nc\n#> Metadata:\n#> DBF_DATE_LAST_UPDATE=2016-10-26\n#> Geometry: Polygon\n#> Feature Count: 100\n#> Extent: (-84.323853, 33.881992) - (-75.456978, 36.589649)\n#> Layer SRS WKT:\n#> ..."},{"path":"gis.html","id":"postgis","chapter":"9 Bridges to GIS software","heading":"9.6.2 Bridges to spatial databases","text":"\nSpatial database management systems (spatial DBMS) store spatial and non-spatial data in a structured way.\nThey can organize large collections of data into related tables (entities) via unique 
identifiers (primary and foreign keys) and implicitly via space (think for instance of a spatial join).\nThis is useful because geographic datasets tend to become big and messy quite quickly.\nDatabases enable storing and querying large datasets efficiently based on spatial and non-spatial fields, and provide multi-user access and topology support.\nThe most important open source spatial database is PostGIS (Obe and Hsu 2015).50\nR bridges to spatial DBMSs such as PostGIS are important, allowing access to huge data stores without loading several gigabytes of geographic data into RAM, and likely crashing the R session.\nThe remainder of this section shows how PostGIS can be called from R, based on \"Hello real world\" from PostGIS in Action, Second Edition (Obe and Hsu 2015).51\nThe subsequent code requires a working internet connection, since we are accessing a PostgreSQL/PostGIS database which is living in the QGIS Cloud (https://qgiscloud.com/).52\nOften the first question is, 'which tables can be found in the database?'\nThis can be asked as follows (the answer is 5 tables):\nWe are only interested in the restaurants and highways tables.\nThe former represents the locations of fast-food restaurants in the US, and the latter are principal US highways.\nTo find out about attributes available in a table, we can run:\nThe first query will select US Route 1 in Maryland (MD).\nNote that st_read() allows us to read geographic data from a database if it is provided with an open connection to a database and a query.\nAdditionally, st_read() needs to know which column represents the geometry (here: wkb_geometry).\nThe result of this query is an sf-object named us_route of type sfc_MULTILINESTRING.\nThe next step is to add a 20-mile buffer (corresponds to 1609 meters times 20) around the selected highway (Figure 9.4).\nNote that this is a spatial query using functions (ST_Union(), ST_Buffer()) you should be already familiar with, since you find them also in the sf-package, though here they are written in lowercase characters (st_union(), st_buffer()).\nIn fact, function names of the sf package largely follow the PostGIS naming conventions.53\nThe last query will find all Hardee restaurants (HDE) within the buffer zone (Figure 9.4).\nPlease refer to Obe and Hsu (2015) for a detailed explanation of the spatial SQL query.\nFinally, it is good practice to close the database connection as follows:54\nFIGURE 9.4: Visualization of the output of previous PostGIS commands showing the highway (black line), a buffer (light yellow) and three restaurants 
(light blue points) within the buffer.\nUnlike PostGIS, sf only supports spatial vector data.\nTo query and manipulate raster data stored in a PostGIS database, use the rpostgis package (Bucklin and Basille 2018) and/or use command-line tools such as rastertopgsql which comes as part of the PostGIS installation.\nThis subsection is only a brief introduction to PostgreSQL/PostGIS.\nNevertheless, we would like to encourage the practice of storing geographic and non-geographic data in a spatial DBMS while only attaching those subsets to R's global environment which are needed for further (geo-)statistical analysis.\nPlease refer to Obe and Hsu (2015) for a more detailed description of the SQL queries presented and a more comprehensive introduction to PostgreSQL/PostGIS in general.\nPostgreSQL/PostGIS is a formidable choice as an open-source spatial database.\nBut the same is true for the lightweight SQLite/SpatiaLite database engine, and GRASS uses SQLite in the background (see Section 9.4).\nAs a final note, if your data is getting too big for PostgreSQL/PostGIS and you require massive spatial data management and query performance, then the next logical step is to use large-scale geographic querying on distributed computing systems, as for example provided by GeoMesa (http://www.geomesa.org/) or GeoSpark [http://geospark.datasyslab.org/; Huang et al. 
(2017)].","code":"\nlibrary(RPostgreSQL)\nconn = dbConnect(drv = PostgreSQL(), dbname = \"rtafdf_zljbqm\",\n host = \"db.qgiscloud.com\",\n port = \"5432\", user = \"rtafdf_zljbqm\", \n password = \"d3290ead\")\ndbListTables(conn)\n#> [1] \"spatial_ref_sys\" \"topology\" \"layer\" \"restaurants\" \n#> [5] \"highways\" \ndbListFields(conn, \"highways\")\n#> [1] \"qc_id\" \"wkb_geometry\" \"gid\" \"feature\" \n#> [5] \"name\" \"state\" \nquery = paste(\n \"SELECT *\",\n \"FROM highways\",\n \"WHERE name = 'US Route 1' AND state = 'MD';\")\nus_route = st_read(conn, query = query, geom = \"wkb_geometry\")\nquery = paste(\n \"SELECT ST_Union(ST_Buffer(wkb_geometry, 1609 * 20))::geometry\",\n \"FROM highways\",\n \"WHERE name = 'US Route 1' AND state = 'MD';\")\nbuf = st_read(conn, query = query)\nquery = paste(\n \"SELECT r.wkb_geometry\",\n \"FROM restaurants r\",\n \"WHERE EXISTS (\",\n \"SELECT gid\",\n \"FROM highways\",\n \"WHERE\",\n \"ST_DWithin(r.wkb_geometry, wkb_geometry, 1609 * 20) AND\",\n \"name = 'US Route 1' AND\",\n \"state = 'MD' AND\",\n \"r.franchise = 'HDE');\"\n)\nhardees = st_read(conn, query = query)\nRPostgreSQL::postgresqlCloseConnection(conn)#> old-style crs object detected; please recreate object with a recent sf::st_crs()\n#> old-style crs object detected; please recreate object with a recent sf::st_crs()"},{"path":"gis.html","id":"exercises-7","chapter":"9 Bridges to GIS software","heading":"9.7 Exercises","text":"Create two overlapping polygons (poly_1 poly_2) help sf-package (see Chapter 2).Create two overlapping polygons (poly_1 poly_2) help sf-package (see Chapter 2).Union poly_1 poly_2 using st_union() qgis:union.\ndifference two union operations?\ncan use sf package obtain result QGIS?Union poly_1 poly_2 using st_union() qgis:union.\ndifference two union operations?\ncan use sf package obtain result QGIS?Calculate intersection poly_1 poly_2 using:\nRQGIS, RSAGA rgrass7\nsf\nCalculate intersection poly_1 poly_2 using:RQGIS, RSAGA 
and rgrass7\nsf\nAttach data(dem, package = \"spDataLarge\") and data(random_points, package = \"spDataLarge\").\nSelect a point randomly from random_points and find all dem pixels that can be seen from this point (hint: viewshed).\nVisualize your result.\nFor example, plot a hillshade, and on top of it the digital elevation model, your viewshed output and the point.\nAdditionally, give mapview a try.\nCompute catchment area and catchment slope of data(\"dem\", package = \"spDataLarge\") using RSAGA (see Section 9.3).\nUse gdalinfo via a system call for a raster file stored on disk of your choice (see Section 9.6.1).\nQuery all Californian highways from the PostgreSQL/PostGIS database living in the QGIS Cloud introduced in this chapter (see Section 9.6.2).","code":""},{"path":"algorithms.html","id":"algorithms","chapter":"10 Scripts, algorithms and functions","heading":"10 Scripts, algorithms and functions","text":"","code":""},{"path":"algorithms.html","id":"prerequisites-8","chapter":"10 Scripts, algorithms and functions","heading":"Prerequisites","text":"This chapter primarily uses base R; the sf package is used to check the result of an algorithm we will develop.\nThe chapter assumes you have an understanding of the geographic classes introduced in Chapter 2 and how they can be used to represent a wide range of input file formats (see Chapter 7).","code":""},{"path":"algorithms.html","id":"intro-algorithms","chapter":"10 Scripts, algorithms and functions","heading":"10.1 Introduction","text":"Chapter 1 established that geocomputation is not only about using existing tools, but developing new ones, \"in the form of shareable R scripts 
functions.”\nchapter teaches building blocks reproducible code.\nalso introduces low-level geometric algorithms, type used Chapter 9.\nReading help understand algorithms work write code can used many times, many people, multiple datasets.\nchapter , , make skilled programmer.\nProgramming hard requires plenty practice (Abelson, Sussman, Sussman 1996):appreciate programming intellectual activity right must turn computer programming; must read write computer programs — many .strong reasons moving direction, however.55\nadvantages reproducibility go beyond allowing others replicate work:\nreproducible code often better every way code written run , including terms computational efficiency, scalability ease adapting maintaining .Scripts basis reproducible R code, topic covered Section 10.2.\nAlgorithms recipes modifying inputs using series steps, resulting output, described Section 10.3.\nease sharing reproducibility, algorithms can placed functions.\ntopic Section 10.4.\nexample finding centroid polygon used tie concepts together.\nChapter 5 already introduced centroid function st_centroid(), example highlights seemingly simple operations result comparatively complex code, affirming following observation (Wise 2001):One intriguing things spatial data problems things appear trivially easy human can surprisingly difficult computer.example also reflects secondary aim chapter , following Xiao (2016), “duplicate available , show things work.”","code":""},{"path":"algorithms.html","id":"scripts","chapter":"10 Scripts, algorithms and functions","heading":"10.2 Scripts","text":"functions distributed packages building blocks R code, scripts glue holds together, logical order, create reproducible workflows.\nprogramming novices scripts may sound intimidating simply plain text files, typically saved extension representing language contain.\nR scripts generally saved .R extension named reflect .\nexample 10-hello.R, script file stored code folder book’s repository, contains 
the following two lines of code:\nThe lines of code may not be particularly exciting, but they demonstrate the point: scripts do not need to be complicated.\nSaved scripts can be called and executed in their entirety with source(), as demonstrated below, which shows how the comment is ignored but the instruction is executed:\nThere are no strict rules on what can and cannot go into script files and nothing to prevent you from saving broken, non-reproducible code.56\nThere are, however, some conventions worth following:\nWrite the script in order: just like the script of a film, scripts should have a clear order such as 'setup', 'data processing' and 'save results' (roughly equivalent to 'beginning', 'middle' and 'end' in a film).\nAdd comments to the script so other people (and your future self) can understand it. At a minimum, a comment should state the purpose of the script (see Figure 10.1) and (for long scripts) divide it into sections. This can be done in RStudio, for example, with the shortcut Ctrl+Shift+R, which creates 'foldable' code section headings.\nAbove all, scripts should be reproducible: self-contained scripts that will work on any computer are more useful than scripts that only run on your computer, on a good day. This involves attaching required packages at the beginning, reading-in data from persistent sources (such as a reliable website) and ensuring that previous steps have been taken.57\nIt is hard to enforce reproducibility in R scripts, but there are tools that can help.\nBy default, RStudio 'code-checks' R scripts and underlines faulty code with a red wavy line, as illustrated below:\nFIGURE 10.1: Code checking in RStudio. 
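The three conventions above can be condensed into a minimal, self-contained example. The sketch below uses only base R and follows the 'setup', 'data processing', 'save results' order; the output file name is a hypothetical illustration, and the area calculation uses the shoelace formula rather than the triangle-based method of 10-centroid-alg.R:

```r
# Aim: illustrate the 'setup, data processing, save results' script structure
# (a hypothetical sketch, not the book's 10-centroid-alg.R)

# Setup ------------------------------------------------------------------
out_file = file.path(tempdir(), "area.rds")  # use a persistent path in practice

# Data processing --------------------------------------------------------
# a closed square polygon with sides 9 units long (first row equals last row)
poly_mat = cbind(
  x = c(0, 0, 9, 9, 0),
  y = c(0, 9, 9, 0, 0)
)
# shoelace formula: half the absolute sum of cross-products of vertex pairs
i = seq_len(nrow(poly_mat) - 1)
area = abs(sum(poly_mat[i, "x"] * poly_mat[i + 1, "y"] -
               poly_mat[i + 1, "x"] * poly_mat[i, "y"])) / 2

# Save results -----------------------------------------------------------
saveRDS(area, out_file)
print(paste("The area is:", area))
#> [1] "The area is: 81"
```

Running the file with source() reproduces the printed result, so the script is reproducible on any machine with base R.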
In this example, the script 10-centroid-alg.R highlights an unclosed curly bracket on line 19.\nThe contents of this section apply to any type of R script.\nA particular consideration with scripts for geocomputation is that they tend to have external dependencies, such as the QGIS dependency needed to run the code in Chapter 9, and require input data in a specific format.\nSuch dependencies should be mentioned as comments in the script or elsewhere in the project of which it is a part, as illustrated in the script 10-centroid-alg.R.\nThe work undertaken by this script is demonstrated in the reproducible example below, which works on a pre-requisite object named poly_mat, a square with sides 9 units in length (the meaning of this will become apparent in the next section):58","code":"\n# Aim: provide a minimal R script\nprint(\"Hello geocompr\")\nsource(\"code/10-hello.R\")\n#> [1] \"Hello geocompr\"\npoly_mat = cbind(\n x = c(0, 0, 9, 9, 0),\n y = c(0, 9, 9, 0, 0)\n)\nsource(\"https://git.io/10-centroid-alg.R\") # short url#> [1] \"The area is: 81\"\n#> [1] \"The coordinates of the centroid are: 4.5, 4.5\""},{"path":"algorithms.html","id":"geometric-algorithms","chapter":"10 Scripts, algorithms and functions","heading":"10.3 Geometric algorithms","text":"Algorithms can be understood as the computing equivalent of a cooking recipe.\nThey are a complete set of instructions which, when undertaken on the input (ingredients), result in useful (tasty) outputs.\nBefore diving into a concrete case study, a brief history will show how algorithms relate to scripts (covered in Section 10.2) and functions (which can be used to generalize algorithms, as we'll see in Section 10.4).\nThe word \"algorithm\" originated in 9th century Baghdad with the publication of Hisab al-jabr w'al-muqabala, an early math textbook.\nThe book was translated into Latin and became so popular that the author's last name, al-Khwārizmī, \"was immortalized as a scientific term: Al-Khwarizmi became Alchoarismi, Algorismi and, eventually, algorithm\" (Bellos 2011).\nIn the computing age, algorithm refers to a series of steps that solves a problem, resulting in a pre-defined output.\nInputs must be formally defined in a suitable data structure (Wise 2001).\nAlgorithms often start as flow charts or pseudocode showing the aim of the process before being implemented in code.\nTo ease usability, common algorithms are often packaged inside functions, which may hide some or all of the steps taken (unless you look at 
the function’s source code, see Section 10.4).\nGeoalgorithms, such as those we encountered in Chapter 9, are algorithms that take geographic data in and, generally, return geographic results (alternative terms for the same thing include GIS algorithms and geometric algorithms).\nThat may sound simple, but it is a deep subject with an entire academic field, Computational Geometry, dedicated to their study (Berg et al. 2008) and numerous books on the subject.\nO'Rourke (1998), for example, introduces the subject with a range of progressively harder geometric algorithms using reproducible and freely available C code.\nAn example of a geometric algorithm is one that finds the centroid of a polygon.\nThere are many approaches to centroid calculation, some of which work only on specific types of spatial data.\nFor the purposes of this section, we choose an approach that is easy to visualize: breaking the polygon into many triangles and finding the centroid of each, an approach discussed by Kaiser and Morin (1993) alongside other centroid algorithms (and mentioned briefly in O'Rourke 1998).\nIt helps to further break down this approach into discrete tasks before writing any code (subsequently referred to as step 1 to step 4, these could also be presented as a schematic diagram or pseudocode):\nDivide the polygon into contiguous triangles.\nFind the centroid of each triangle.\nFind the area of each triangle.\nFind the area-weighted mean of triangle centroids.\nThese steps may sound straightforward, but converting words into working code requires some work and plenty of trial-and-error, even when the inputs are constrained:\nThe algorithm will only work for convex polygons, which contain no internal angles greater than 180°, so no star shapes are allowed (packages decido and sfdct can triangulate non-convex polygons using external libraries, as shown in the algorithm vignette at geocompr.github.io).\nThe simplest data structure of a polygon is a matrix of x and y coordinates in which each row represents a vertex tracing the polygon's border in order, where the first and last rows are identical (Wise 2001).\nIn this case, we'll create a polygon with five vertices in base R, building on an example from GIS Algorithms (Xiao 2016; see github.com/gisalgs for Python code), as illustrated in Figure 10.2:\nNow that we have an example dataset, we are ready to undertake step 1 outlined above.\nThe code below shows how this can be done by creating a single triangle (T1), which demonstrates the method; it also demonstrates step 2 by calculating its centroid based on the formula \\(1/3(a + b + c)\\) where \\(a\\) to \\(c\\) are the coordinates representing the triangle's 
vertices:\nFIGURE 10.2: Illustration of the polygon centroid calculation problem.\nStep 3 is to find the area of each triangle, so a weighted mean accounting for the disproportionate impact of large triangles can be calculated.\nThe formula to calculate the area of a triangle is as follows (Kaiser and Morin 1993):\n\\[\n\\frac{A_x (B_y - C_y) + B_x (C_y - A_y) + C_x (A_y - B_y)}{2}\n\\]\nWhere \\(A\\) to \\(C\\) are the triangle's three points and \\(x\\) and \\(y\\) refer to the x and y dimensions.\nA translation of this formula into R code that works with the data in the matrix representation of a triangle T1 is as follows (the function abs() ensures a positive result):\nThe code chunk above outputs the correct result.59\nThe problem with the code is that it is clunky and must be re-typed if we want to run it on another triangle matrix.\nTo make the code more generalizable, we will see how it can be converted into a function in Section 10.4.\nStep 4 requires steps 2 and 3 to be undertaken not just on one triangle (as demonstrated above) but on all triangles.\nThis requires iteration to create all triangles representing the polygon, illustrated in Figure 10.3.\nlapply() and vapply() are used to iterate over each triangle here because they provide a concise solution in base R:60\nFIGURE 10.3: Illustration of the iterative centroid algorithm with triangles. The X represents the area-weighted centroid in iterations 2 and 3.\nWe are now in a position to complete step 4 to calculate the total area with sum(A) and the centroid coordinates of the polygon with weighted.mean(C[, 1], A) and weighted.mean(C[, 2], A) (exercise for alert readers: verify these commands work).\nTo demonstrate the link between algorithms and scripts, the contents of this section have been condensed into 10-centroid-alg.R.\nWe saw at the end of Section 10.2 how this script can calculate the centroid of a square.\nThe great thing about scripting the algorithm is that it works on the new poly_mat object (see exercises below to verify these results with reference to st_centroid()):\nThe example above shows that low-level geographic operations can be developed from first principles with base R.\nIt also shows that if a tried-and-tested solution already exists, it may not be worth re-inventing the wheel:\nif we aimed only to find the centroid of a polygon, it would have been quicker to represent poly_mat as an sf object and use the pre-existing sf::st_centroid() function instead.\nHowever, the great benefit of writing algorithms from 1st principles is that you will understand every step of the process, something that cannot be guaranteed when using other peoples' code.\nA further consideration is performance: R is slow compared with low-level languages such as C++ for number crunching (see 
Section 1.3) and that optimization of such code is difficult.\nIf the aim is to develop new methods, computational efficiency should not be prioritized.\nThis is captured in the saying “premature optimization is the root of all evil (or at least most of it) in programming” (Knuth 1974).Algorithm development is hard.\nThis should be apparent from the amount of work that has gone into developing a centroid algorithm in base R that is just one, rather inefficient, approach to the problem with limited real-world applications (convex polygons are uncommon in practice).\nThe experience should lead to an appreciation of low-level geographic libraries such as GEOS (which underlies sf::st_centroid()) and CGAL (the Computational Geometry Algorithms Library) which not only run fast but work on a wide range of input geometry types.\nA great advantage of the open source nature of such libraries is that their source code is readily available for study, comprehension and (for those with the skills and confidence) modification.61","code":"\n# generate a simple matrix representation of a polygon:\nx_coords = c(10, 0, 0, 12, 20, 10)\ny_coords = c(0, 0, 10, 20, 15, 0)\npoly_mat = cbind(x_coords, y_coords)\n# create a point representing the origin:\nOrigin = poly_mat[1, ]\n# create 'triangle matrix':\nT1 = rbind(Origin, poly_mat[2:3, ], Origin) \n# find centroid (drop = FALSE preserves classes, resulting in a matrix):\nC1 = (T1[1, , drop = FALSE] + T1[2, , drop = FALSE] + T1[3, , drop = FALSE]) / 3\n# calculate the area of the triangle represented by matrix T1:\nabs(T1[1, 1] * (T1[2, 2] - T1[3, 2]) +\n T1[2, 1] * (T1[3, 2] - T1[1, 2]) +\n T1[3, 1] * (T1[1, 2] - T1[2, 2]) ) / 2\n#> [1] 50\ni = 2:(nrow(poly_mat) - 2)\nT_all = lapply(i, function(x) {\n rbind(Origin, poly_mat[x:(x + 1), ], Origin)\n})\n\nC_list = lapply(T_all, function(x) (x[1, ] + x[2, ] + x[3, ]) / 3)\nC = do.call(rbind, C_list)\n\nA = vapply(T_all, function(x) {\n abs(x[1, 1] * (x[2, 2] - x[3, 2]) +\n x[2, 1] * (x[3, 2] - x[1, 2]) +\n x[3, 1] * (x[1, 2] - x[2, 2]) ) / 2\n }, FUN.VALUE = double(1))\nsource(\"code/10-centroid-alg.R\")\n#> [1] \"The area is: 245\"\n#> [1] \"The coordinates of the centroid are: 8.83, 9.22\""},{"path":"algorithms.html","id":"functions","chapter":"10 Scripts, algorithms and 
functions","heading":"10.4 Functions","text":"Like algorithms, functions take an input and return an output.\nFunctions, however, refer to the implementation in a particular programming language, rather than the ‘recipe’ itself.\nIn R, functions are objects in their own right, which can be created and joined together in a modular fashion.\nWe can, for example, create a function that undertakes step 2 of our centroid generation algorithm as follows:This example demonstrates two key components of functions:\n1) the function body, the code inside the curly brackets that defines what the function does with its inputs; and 2) the formals, the list of arguments the function works with — x in this case (the third key component, the environment, is beyond the scope of this section).\nBy default, functions return the last object that was calculated (the coordinates of the centroid in the case of t_centroid()).62The function now works on any inputs you pass it, as illustrated in the following command, which calculates the area of the 1st triangle from the example polygon in the previous section (see Figure 10.3):We can also create a function to calculate a triangle’s area, which we name t_area():Note that after the function’s creation, a triangle’s area can be calculated in a single line of code, avoiding the duplication of verbose code:\nfunctions are a mechanism for generalizing code.\nThe newly created function t_area() takes any object x, assumed to have the same dimensions as the ‘triangle matrix’ data structure we’ve been using, and returns its area, as illustrated on T1 as follows:We can test the generalizability of the function by using it to find the area of a new triangle matrix, which has a height of 1 and a base of 3:A useful feature of functions is that they are modular.\nProvided that you know what the output will be, one function can be used as the building block of another.\nThus, the functions t_centroid() and t_area() can be used as sub-components of a larger function to do the work of the script 10-centroid-alg.R: calculate the area of any convex polygon.\nThe code chunk below creates the function poly_centroid() to mimic the behavior of sf::st_centroid() for convex polygons:63Functions such as poly_centroid() can be extended to provide different types of output.\nTo return the result as an object of class sfg, for example, a ‘wrapper’ function can be used to modify the output of poly_centroid() before returning the result:We can verify that the output is the same as the output from sf::st_centroid() as follows:","code":"\nt_centroid = function(x) {\n (x[1, ] + x[2, ] + x[3, ]) / 3\n}\nt_centroid(T1)\n#> x_coords y_coords \n#> 3.33 3.33\nt_area = function(x) {\n 
abs(\n x[1, 1] * (x[2, 2] - x[3, 2]) +\n x[2, 1] * (x[3, 2] - x[1, 2]) +\n x[3, 1] * (x[1, 2] - x[2, 2])\n ) / 2\n}\nt_area(T1)\n#> [1] 50\nt_new = cbind(x = c(0, 3, 3, 0),\n y = c(0, 0, 1, 0))\nt_area(t_new)\n#> x \n#> 1.5\npoly_centroid = function(poly_mat) {\n Origin = poly_mat[1, ] # create a point representing the origin\n i = 2:(nrow(poly_mat) - 2)\n T_all = lapply(i, function(x) {\n rbind(Origin, poly_mat[x:(x + 1), ], Origin)\n })\n C_list = lapply(T_all, t_centroid)\n C = do.call(rbind, C_list)\n A = vapply(T_all, t_area, FUN.VALUE = double(1))\n c(weighted.mean(C[, 1], A), weighted.mean(C[, 2], A))\n}\npoly_centroid(poly_mat)\n#> [1] 8.83 9.22\npoly_centroid_sfg = function(x) {\n centroid_coords = poly_centroid(x)\n sf::st_point(centroid_coords)\n}\npoly_sfc = sf::st_polygon(list(poly_mat))\nidentical(poly_centroid_sfg(poly_mat), sf::st_centroid(poly_sfc))\n#> [1] TRUE"},{"path":"algorithms.html","id":"programming","chapter":"10 Scripts, algorithms and functions","heading":"10.5 Programming","text":"This chapter has moved quickly, from scripts and functions via the tricky topic of algorithms.\nNot only have we discussed them in the abstract, we have also created working examples of each to solve a specific problem:The script 10-centroid-alg.R was introduced and demonstrated on a ‘polygon matrix’The individual steps that allowed the script to work were described as an algorithm, a computational recipeTo generalize the algorithm, we converted it into modular functions which were eventually combined to create the function poly_centroid() in the previous sectionTaken on their own, each of these steps is straightforward.\nBut the skill of programming is combining scripts, algorithms and functions in a way that produces performant, robust and user-friendly tools that other people can use.\nIf you are new to programming, as we expect most people reading this book will be, being able to follow and reproduce the results in the preceding sections should be seen as a major achievement.\nProgramming takes many hours of dedicated study and practice before you become proficient.The challenge facing developers aiming to implement new algorithms in an efficient way is put in perspective by considering that we have only created a toy function.\nIn its current state, poly_centroid() fails on most (non-convex) polygons!\nA question arising from this is: how would one 
generalize the function?\nTwo options are (1) to find ways to triangulate non-convex polygons (a topic covered in the online algorithm article that supports this chapter) and (2) to explore other centroid algorithms that do not rely on triangular meshes.A wider question is: is it worth programming a solution at all when high performance algorithms have already been implemented and packaged in functions such as st_centroid()?\nThe reductionist answer in this specific case is ‘no.’\nIn the wider context, and considering the benefits of learning to program, the answer is ‘it depends.’\nWith programming, it’s easy to waste hours trying to implement a method, only to find that someone has already done the hard work.\nSo instead of seeing this chapter as your first stepping stone towards geometric algorithm programming wizardry, it may be more productive to use it as a lesson in when to try to program a generalized solution, and when to use existing higher-level solutions.\nThere will surely be occasions when writing new functions is the best way forward, but there will also be times when using functions that already exist is the best way forward.We cannot guarantee that, having read this chapter, you will be able to rapidly create new functions for your work.\nBut we are confident that its contents will help you decide when is an appropriate time to try (when no other existing functions solve the problem, when the programming task is within your capabilities and when the benefits of the solution are likely to outweigh the time costs of developing it).\nFirst steps towards programming can be slow (the exercises below should not be rushed) but the long-term rewards can be large.","code":""},{"path":"algorithms.html","id":"ex-algorithms","chapter":"10 Scripts, algorithms and functions","heading":"10.6 Exercises","text":"Read the script 10-centroid-alg.R in the code folder of the book’s GitHub repo.\nWhich of the best practices covered in Section 10.2 does it follow?\nCreate a version of the script on your computer in an IDE such as RStudio (preferably by typing out the script line-by-line, in your own coding style and with your own comments, rather than copy-pasting — this will help you learn how to type scripts). Using the example of a square polygon (e.g., created with poly_mat = cbind(x = c(0, 0, 9, 9, 0), y = c(0, 9, 9, 0, 0))) execute the script line-by-line.\nWhat changes could be made to the script to make it more reproducible?\nHow could the documentation be improved?\n
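As a hedged sketch of what the four algorithm steps look like when applied to the square polygon from the exercise above (an illustration in base R reusing the triangle-area formula from Section 10.3, not the book's 10-centroid-alg.R script itself), one might write:

```r
# Sketch of the centroid algorithm applied to a 9 x 9 square
poly_mat = cbind(x = c(0, 0, 9, 9, 0), y = c(0, 9, 9, 0, 0))
Origin = poly_mat[1, ]                  # first vertex anchors all triangles
i = 2:(nrow(poly_mat) - 2)
# step 1: divide the polygon into contiguous triangles
T_all = lapply(i, function(x) rbind(Origin, poly_mat[x:(x + 1), ], Origin))
# step 2: find the centroid of each triangle
C = do.call(rbind, lapply(T_all, function(x) (x[1, ] + x[2, ] + x[3, ]) / 3))
# step 3: find the area of each triangle (abs() ensures a positive result)
A = vapply(T_all, function(x) {
  abs(x[1, 1] * (x[2, 2] - x[3, 2]) +
      x[2, 1] * (x[3, 2] - x[1, 2]) +
      x[3, 1] * (x[1, 2] - x[2, 2])) / 2
}, FUN.VALUE = double(1))
# step 4: total area and area-weighted mean of the triangle centroids
message("The area is: ", sum(A))
message("The coordinates of the centroid are: ",
        weighted.mean(C[, 1], A), ", ", weighted.mean(C[, 2], A))
```

For a square, both results (an area of 81 and a centroid at 4.5, 4.5) are easy to verify by hand, which makes it a useful test case.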
In Section 10.3 we calculated that the area and geographic centroid of the polygon represented by poly_mat were 245 and 8.8, 9.2, respectively.\nReproduce these results on your computer with reference to the script 10-centroid-alg.R, an implementation of this algorithm (bonus: type out the commands - try to avoid copy-pasting).\nAre the results correct? Verify them by converting poly_mat into an sfc object (named poly_sfc) with st_polygon() (hint: this function takes objects of class list()) and then using st_area() and st_centroid().\nIt was stated that the algorithm we created only works for convex hulls. Define convex hulls (see Chapter 5) and test the algorithm on a polygon that is not a convex hull.\nBonus 1: Think about why the method only works for convex hulls and note changes that would need to be made to the algorithm to make it work for other types of polygon.\nBonus 2: Building on the contents of 10-centroid-alg.R, write an algorithm using only base R functions that can find the total length of linestrings represented in matrix form.\nIn Section 10.4 we created different versions of the poly_centroid() function that generated outputs of class sfg (poly_centroid_sfg()) and type-stable matrix outputs (poly_centroid_type_stable()).\nFurther extend the function by creating a version (e.g., called poly_centroid_sf()) that is type stable (only accepts inputs of class sf) and returns sf objects (hint: you may need to convert the object x into a matrix with the command sf::st_coordinates(x)).\nVerify that it works by running poly_centroid_sf(sf::st_sf(sf::st_sfc(poly_sfc)))\nWhat error message do you get when you try to run poly_centroid_sf(poly_mat)?","code":""},{"path":"spatial-cv.html","id":"spatial-cv","chapter":"11 Statistical learning","heading":"11 Statistical learning","text":"","code":""},{"path":"spatial-cv.html","id":"prerequisites-9","chapter":"11 Statistical learning","heading":"Prerequisites","text":"This chapter assumes proficiency with geographic data analysis, for example gained by studying the contents and working through the exercises in Chapters 2 to 6.\nFamiliarity with generalized linear models (GLM) and machine learning is highly recommended (for example from A. Zuur et al. 2009; James et al. 2013).The chapter uses the following packages:64Required data will be attached in due course.","code":"\nlibrary(sf)\nlibrary(raster)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(dplyr)\nlibrary(mlr)\nlibrary(parallelMap)"},{"path":"spatial-cv.html","id":"intro-cv1","chapter":"11 Statistical learning","heading":"11.1 Introduction","text":"Statistical learning is concerned with the use of statistical and computational models for identifying patterns in data and predicting from these patterns.\nDue to its origins, statistical learning is one of R’s great strengths (see Section 1.3).65\nStatistical learning combines methods from statistics and machine learning and its methods can be categorized into supervised and unsupervised techniques.\nBoth are increasingly used in disciplines ranging from physics, biology and ecology to geography and economics (James et al. 
2013).This chapter focuses on supervised techniques, in which there is a training dataset, as opposed to unsupervised techniques such as clustering.\nResponse variables can be binary (such as landslide occurrence), categorical (land use), integer (species richness count) or numeric (soil acidity measured in pH).\nSupervised techniques model the relationship between such responses — which are known for a sample of observations — and one or more predictors.The primary aim of much machine learning research is to make good predictions, as opposed to statistical/Bayesian inference, which is good at helping to understand underlying mechanisms and uncertainties in the data (see Krainski et al. 2018).\nMachine learning thrives in the age of ‘big data’ because its methods make few assumptions about input variables and can handle huge datasets.\nMachine learning is conducive to tasks such as the prediction of future customer behavior, recommendation services (music, movies, what to buy next), face recognition, autonomous driving, text classification and predictive maintenance (infrastructure, industry).This chapter is based on a case study: the (spatial) prediction of landslides.\nThis application links to the applied nature of geocomputation, defined in Chapter 1, and illustrates how machine learning borrows from the field of statistics when the sole aim is prediction.\nTherefore, this chapter first introduces modeling and cross-validation concepts with the help of a Generalized Linear Model (A. Zuur et al. 2009).\nBuilding on this, the chapter implements a more typical machine learning algorithm, namely a Support Vector Machine (SVM).\nThe models’ predictive performance will be assessed using spatial cross-validation (CV), which accounts for the fact that geographic data is special.CV determines a model’s ability to generalize to new data, by splitting a dataset (repeatedly) into training and test sets.\nIt uses the training data to fit the model, and checks its performance when predicting against the test data.\nCV helps to detect overfitting since models that predict the training data too closely (noise) will tend to perform poorly on the test data.Randomly splitting spatial data can lead to training points that are neighbors in space with test points.\nDue to spatial autocorrelation, test and training datasets would not be independent in this scenario, with the consequence that CV fails to detect a possible overfitting.\nSpatial CV alleviates this problem and is the central theme in this chapter.","code":""},{"path":"spatial-cv.html","id":"case-landslide","chapter":"11 Statistical learning","heading":"11.2 Case study: Landslide susceptibility","text":"This case study is based on a dataset of landslide locations in Southern Ecuador, illustrated in Figure 11.1 and described in detail in Muenchow, Brenning, and Richter (2012).\nA subset of the dataset used in that paper is provided in the RSAGA package, which can be loaded as follows:This should load three objects: a data.frame named landslides, a list named dem, and an sf object named study_area.\nlandslides contains a factor column lslpts where TRUE corresponds to an observed landslide ‘initiation point,’ with the coordinates stored in columns x and y.66There are 175 landslide points and 1360 non-landslide points, as shown by summary(landslides).\nThe 1360 non-landslide points were sampled randomly from the study area, with the restriction that they must fall outside a small buffer around the landslide polygons.To make the number of landslide and non-landslide points balanced, let us sample 175 points from the 1360 non-landslide points.67dem is a digital elevation model consisting of two elements:\ndem$header, a list which represents a raster ‘header’ (see Section 2.3), and dem$data, a matrix with the altitude of each pixel.\ndem can be converted into a raster object with:\nFIGURE 11.1: Landslide initiation points (red) and points unaffected by landsliding (blue) in Southern Ecuador.\nTo model landslide susceptibility, we need predictors.\nTerrain attributes are frequently associated with landsliding (Muenchow, Brenning, and Richter 2012), and these can be computed from the digital elevation model (dem) using R-GIS bridges (see Chapter 9).\nWe leave it as an exercise to the reader to compute the following terrain attribute rasters and extract the corresponding values to our landslide/non-landslide data frame (see exercises; we also provide the resulting data frame via the spDataLarge package, see below):slope: slope angle (°).cplan: plan curvature (rad m−1) expressing the convergence or divergence of a slope and thus water flow.cprof: profile curvature (rad m-1) as a measure of flow acceleration, also known as downslope change in slope angle.elev: elevation (m a.s.l.) as the representation of different altitudinal zones of vegetation and precipitation in the study area.log10_carea: the decadic logarithm of the catchment area (log10 m2) representing the amount of water flowing towards a location.Data containing the landslide points, with the corresponding terrain attributes, is provided in the spDataLarge package, along with the terrain attribute raster stack from which the values were extracted.\nHence, if you have not computed the predictors yourself, attach the corresponding data before running the code of the remaining chapter:The first three rows of lsl, rounded to two significant digits, can be found in Table 11.1.\nTABLE 11.1: Structure of the lsl dataset.\n","code":"\ndata(\"landslides\", package = \"RSAGA\")\n# select non-landslide points\nnon_pts = filter(landslides, lslpts == FALSE)\n# select landslide points\nlsl_pts = filter(landslides, lslpts == TRUE)\n# randomly select 175 non-landslide points\nset.seed(11042018)\nnon_pts_sub = sample_n(non_pts, size = nrow(lsl_pts))\n# create smaller landslide dataset (lsl)\nlsl = bind_rows(non_pts_sub, lsl_pts)\ndem = raster(\n dem$data, \n crs = dem$header$proj4string,\n xmn = dem$header$xllcorner, \n xmx = dem$header$xllcorner + dem$header$ncols * dem$header$cellsize,\n ymn = dem$header$yllcorner,\n ymx = dem$header$yllcorner + dem$header$nrows * dem$header$cellsize\n )\n# attach landslide points with terrain attributes\ndata(\"lsl\", package = \"spDataLarge\")\n# attach terrain attribute raster stack\ndata(\"ta\", package 
= \"spDataLarge\")"},{"path":"spatial-cv.html","id":"conventional-model","chapter":"11 Statistical learning","heading":"11.3 Conventional modeling approach in R","text":"Before introducing the mlr package, an umbrella-package providing a unified interface to dozens of learning algorithms (Section 11.5), it is worth taking a look at the conventional modeling interface in R.\nThis introduction to supervised statistical learning provides the basis for doing spatial CV, and contributes to a better grasp on the mlr approach presented subsequently.Supervised learning involves predicting a response variable as a function of predictors (Section 11.4).\nIn R, modeling functions are usually specified using formulas (see ?formula and the detailed Formulas in R Tutorial for details about R formulas).\nThe following command specifies and runs a generalized linear model:It is worth understanding each of the three input arguments:A formula, which specifies landslide occurrence (lslpts) as a function of the predictorsA family, which specifies the type of model, in this case binomial because the response is binary (see ?family)The data frame which contains the response and the predictorsThe results of the model can be printed as follows (summary(fit) provides a more detailed account of the results):The model object fit, of class glm, contains the coefficients defining the fitted relationship between response and predictors.\nIt can also be used for prediction.\nThis is done with the generic predict() method, which in this case calls the function predict.glm().\nSetting type to response returns the predicted probabilities (of landslide occurrence) for each observation in lsl, as illustrated below (see ?predict.glm):Spatial predictions can be made by applying the coefficients to the predictor rasters.\nThis can be done manually or with raster::predict().\nIn addition to a model object (fit), this function also expects a raster stack with the predictors named as in the model’s input data frame (Figure 11.2).\nFIGURE 11.2: Spatial prediction of landslide susceptibility using a GLM.\nHere, when making predictions we neglect spatial autocorrelation since we assume that on average the predictive accuracy remains the same with or without spatial autocorrelation structures.\nHowever, it is possible to include spatial autocorrelation structures into models (A. Zuur et al. 2009; Blangiardo and Cameletti 2015; A. F. Zuur et al. 2017) as well as into predictions (kriging approaches, see, e.g., Goovaerts 1997; Hengl 2007; R. Bivand, Pebesma, and Gómez-Rubio 2013).\nThis is, however, beyond the scope of this book.\nSpatial prediction maps are one very important outcome of a model.\nEven more important is how good the underlying model is at making them since a prediction map is useless if the model’s predictive performance is bad.\nThe most popular measure to assess the predictive performance of a binomial model is the Area Under the Receiver Operator Characteristic Curve (AUROC).\nThis is a value between 0.5 and 1.0, with 0.5 indicating a model that is no better than random and 1.0 indicating perfect prediction of the two classes.\nThus, the higher the AUROC, the better the model’s predictive power.\nThe following code chunk computes the AUROC value of the model with roc(), which takes the response and the predicted values as inputs.\nauc() returns the area under the curve.An AUROC value of\n\n0.83\nrepresents a good fit.\nHowever, this is an overoptimistic estimation since we have computed it on the complete dataset.\nTo derive a biased-reduced assessment, we have to use cross-validation and in the case of spatial data should make use of spatial CV.","code":"\nfit = glm(lslpts ~ slope + cplan + cprof + elev + log10_carea,\n family = binomial(),\n data = lsl)\nclass(fit)\n#> [1] \"glm\" \"lm\"\nfit\n#> \n#> Call: glm(formula = lslpts ~ slope + cplan + cprof + elev + log10_carea, \n#> family = binomial(), data = lsl)\n#> \n#> Coefficients:\n#> (Intercept) slope cplan cprof elev log10_carea \n#> 1.97e+00 9.30e-02 -2.57e+01 -1.43e+01 2.41e-05 -2.12e+00 \n#> \n#> Degrees of Freedom: 349 Total (i.e. Null); 344 Residual\n#> Null Deviance: 485 \n#> Residual Deviance: 361 AIC: 373\npred_glm = predict(object = fit, type = \"response\")\nhead(pred_glm)\n#> 1 2 3 4 5 6 \n#> 0.3327 0.4755 0.0995 0.1480 0.3486 0.6766\n# making the prediction\npred = raster::predict(ta, model = fit, type = \"response\")\npROC::auc(pROC::roc(lsl$lslpts, fitted(fit)))\n#> Area under the curve: 0.826"},{"path":"spatial-cv.html","id":"intro-cv","chapter":"11 Statistical learning","heading":"11.4 Introduction to (spatial) cross-validation","text":"Cross-validation belongs to the family of resampling methods (James et al. 2013).\nThe basic idea is to split (repeatedly) a dataset into training and test sets whereby the training data is used to fit a model which then is applied to the test set.\nComparing the predicted values with the known response values from the test set (using a performance measure such as the AUROC in the binomial case) gives a bias-reduced assessment of the model’s capability to generalize the learned relationship to independent data.\nFor example, a 100-repeated 5-fold cross-validation means to randomly split the data into five partitions (folds) with each fold being used once as a test set (see upper row of Figure 11.3).\nThis guarantees that each observation is used once in one of the test sets, and requires the fitting of five models.\nSubsequently, this procedure is repeated 100 times.\nOf course, the data splitting will differ in each repetition.\n\nOverall, this sums up to 500 models, whereas the mean performance measure (AUROC) of all models is the model’s overall predictive power.However, geographic data is special.\nAs we will see in Chapter 12, the ‘first law’ of geography states that points close to each other are, generally, more similar than points further away (Miller 2004).\nThis means these points are not statistically independent because training and test points in conventional CV are often too close to each other (see first row of Figure 11.3).\n‘Training’ observations near the ‘test’ observations can provide a kind of ‘sneak preview’:\ninformation that should be unavailable to the training dataset.\n\nTo alleviate this problem ‘spatial partitioning’ is used to split the observations into spatially disjointed subsets (using the observations’ coordinates in a k-means clustering; Brenning (2012b); second row of Figure 11.3).\nThis partitioning strategy is the only difference between spatial and conventional CV.\nAs a result, spatial CV leads to a bias-reduced assessment of a model’s predictive performance, and hence helps to avoid overfitting.\n\nFIGURE 11.3: Spatial visualization of selected test and training observations for cross-validation of one repetition. 
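The difference between the two partitioning strategies can be sketched in a few lines of base R (a toy illustration with simulated coordinates, not the partitioning code used by mlr, which follows Brenning (2012b)):

```r
# Toy contrast of conventional (random) vs spatial (k-means) fold assignment.
# The coordinates here are simulated for illustration only.
set.seed(1)
coords = data.frame(x = runif(100), y = runif(100))
# conventional CV: folds assigned at random, ignoring location
random_fold = sample(rep(1:5, length.out = nrow(coords)))
# spatial CV: folds from k-means clustering on the coordinates,
# so each fold is a spatially compact group of points
spatial_fold = kmeans(coords, centers = 5)$cluster
# nearby points tend to share a spatial fold but not a random fold
head(data.frame(coords, random_fold, spatial_fold))
```

Plotting the points colored by `random_fold` and by `spatial_fold` reproduces the contrast shown in the two rows of Figure 11.3: random folds are interleaved in space, spatial folds are disjoint patches.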
Random (upper row) and spatial partitioning (lower row).\n","code":""},{"path":"spatial-cv.html","id":"spatial-cv-with-mlr","chapter":"11 Statistical learning","heading":"11.5 Spatial CV with mlr","text":"\nThere are dozens of packages for statistical learning, as described for example in the CRAN machine learning task view.\nGetting acquainted with each of these packages, including how to undertake cross-validation and hyperparameter tuning, can be a time-consuming process.\nComparing model results from different packages can be even more laborious.\nThe mlr package was developed to address these issues.\nIt acts as a ‘meta-package,’ providing a unified interface to popular supervised and unsupervised statistical learning techniques including classification, regression, survival analysis and clustering (Bischl et al. 2016).\n\n\n\nThe standardized mlr interface is based on eight ‘building blocks.’\nAs illustrated in Figure 11.4, these have a clear order.\nFIGURE 11.4: Basic building blocks of the mlr package. Source: http://bit.ly/2tcb2b7. (Permission to reuse this figure was kindly granted.)\nThe mlr modeling process consists of three main stages.\nFirst, a task specifies the data (including response and predictor variables) and the model type (such as regression or classification).\nSecond, a learner defines the specific learning algorithm that is applied to the created task.\nThird, the resampling approach assesses the predictive performance of the model, i.e., its ability to generalize to new data (see also Section 11.4).","code":""},{"path":"spatial-cv.html","id":"glm","chapter":"11 Statistical learning","heading":"11.5.1 Generalized linear model","text":"To implement a GLM in mlr, we must create a task containing the landslide data.\nSince the response is binary (a two-category variable), we create a classification task with makeClassifTask() (for regression tasks, use makeRegrTask(), see ?makeRegrTask for other task types).\nThe first essential argument of these make*() functions is data.\nThe target argument expects the name of a response variable and positive determines which of the two factor levels of the response variable indicate the landslide initiation point (in our case TRUE).\nAll other variables of the lsl dataset will serve as predictors except for the coordinates (see the result of getTaskFormula(task) for the model formula).\nFor spatial CV, the coordinates parameter is used (see Section 11.4 and Figure 11.3) which expects the coordinates as a xy data frame.makeLearner() determines the statistical learning method to use.\nAll classification learners start with classif. and regression learners with regr. (see ?makeLearners for details).\nlistLearners() helps to find out about all available learners and from which package mlr imports them (Table 11.2).\nFor a specific task, we can run:\nTABLE 11.2: Sample of available learners for binomial tasks in the mlr package.\nThis yields all learners able to model two-class problems (landslide yes or no).\nWe opt for the binomial classification method used in Section 11.3 and implemented as classif.binomial in mlr.\nAdditionally, we must specify the link-function, logit in our case, which is also the default of the binomial() function.\npredict.type determines the type of the prediction with prob resulting in the predicted probability for landslide occurrence between 0 and 1 (this corresponds to type = response in predict.glm).To find out from which package the specified learner is taken and how to access the corresponding help pages, we can run:The set-up steps for modeling with mlr may seem tedious.\nBut remember, this single interface provides access to the 150+ learners shown by listLearners(); it would be far more tedious to learn the interface for each learner!\nFurther advantages are simple parallelization of resampling techniques and the ability to tune machine learning hyperparameters (see Section 11.5.2).\nMost importantly, (spatial) resampling in mlr is straightforward, requiring only two more steps: specifying a resampling method and running it.\nWe will use a 100-repeated 5-fold spatial CV: five partitions will be chosen based on the provided coordinates in our task and the partitioning will be repeated 100 times:68To execute the spatial resampling, we run resample() using the specified learner, task, resampling strategy and of course the performance measure, here the AUROC.\nThis takes some time (around 10 seconds on a modern laptop) because it computes the AUROC for 500 models.\nSetting a seed ensures the reproducibility of the obtained result and will ensure the same spatial partitioning when re-running the code.The output of the preceding code chunk is a bias-reduced assessment of the model’s predictive performance, as illustrated in the following code chunk (required input data is saved in the file spatialcv.Rdata in the book’s GitHub repo):To put these results in perspective, let us compare them with AUROC values from a 100-repeated 5-fold non-spatial cross-validation (Figure 11.5; the code for the non-spatial cross-validation is not shown here but will be explored in the exercise section).\nAs expected, the spatially cross-validated result yields lower AUROC values on average than the conventional cross-validation approach, underlining the over-optimistic predictive performance of the latter due to spatial autocorrelation.\nFIGURE 11.5: Boxplot showing the difference in AUROC values between spatial and conventional 100-repeated 5-fold cross-validation.\n","code":"\nlibrary(mlr)\n# coordinates needed for the spatial partitioning\ncoords = lsl[, c(\"x\", \"y\")]\n# select response and predictors to use in the modeling\ndata = dplyr::select(lsl, -x, -y)\n# create task\ntask = makeClassifTask(data = data, target = \"lslpts\",\n positive = \"TRUE\", coordinates = coords)\nlistLearners(task, warn.missing.packages = FALSE) %>%\n dplyr::select(class, name, short.name, package) %>%\n head()\nlrn = makeLearner(cl = \"classif.binomial\",\n link = \"logit\",\n predict.type = \"prob\",\n fix.factors.prediction = TRUE)\ngetLearnerPackages(lrn)\nhelpLearner(lrn)\nperf_level = makeResampleDesc(method = \"SpRepCV\", folds = 5, reps = 100)\nset.seed(012348)\nsp_cv = mlr::resample(learner = lrn, task = task,\n resampling = perf_level, \n measures = mlr::auc)\n# summary statistics of the 500 models\nsummary(sp_cv$measures.test$auc)\n#> Min. 1st Qu. Median Mean 3rd Qu. Max. 
\n#> 0.686 0.757 0.789 0.780 0.795 0.861\n# mean AUROC of the 500 models\nmean(sp_cv$measures.test$auc)\n#> [1] 0.78"},{"path":"spatial-cv.html","id":"svm","chapter":"11 Statistical learning","heading":"11.5.2 Spatial tuning of machine-learning hyperparameters","text":"Section 11.4 introduced machine learning part statistical learning.\nrecap, adhere following definition machine learning Jason Brownlee:Machine learning, specifically field predictive modeling, primarily concerned minimizing error model making accurate predictions possible, expense explainability.\napplied machine learning borrow, reuse steal algorithms many different fields, including statistics use towards ends.Section 11.5.1 GLM used predict landslide susceptibility.\nsection introduces support vector machines (SVM) purpose.\nRandom forest models might popular SVMs; however, positive effect tuning hyperparameters model performance much pronounced case SVMs (Probst, Wright, Boulesteix 2018).\nSince (spatial) hyperparameter tuning major aim section, use SVM.\nwishing apply random forest model, recommend read chapter, proceed Chapter 14 apply currently covered concepts techniques make spatial predictions based random forest model.SVMs search best possible ‘hyperplanes’ separate classes (classification case) estimate ‘kernels’ specific hyperparameters allow non-linear boundaries classes (James et al. 
2013).\nHyperparameters confused coefficients parametric models, sometimes also referred parameters.69\nCoefficients can estimated data, hyperparameters set learning begins.\nOptimal hyperparameters usually determined within defined range help cross-validation methods.\ncalled hyperparameter tuning.SVM implementations provided kernlab allow hyperparameters tuned automatically, usually based random sampling (see upper row Figure 11.3).\nworks non-spatial data less use spatial data ‘spatial tuning’ undertaken.defining spatial tuning, set mlr building blocks, introduced Section 11.5.1, SVM.\nclassification task remains , hence can simply reuse task object created Section 11.5.1.\nLearners implementing SVM can found using listLearners() follows:options illustrated , use ksvm() kernlab package (Karatzoglou et al. 2004).\nallow non-linear relationships, use popular radial basis function (Gaussian) kernel also default ksvm().next stage specify resampling strategy.\nuse 100-repeated 5-fold spatial CV.Note exact code used GLM Section 11.5.1; simply repeated reminder.far, process identical described Section 11.5.1.\nnext step new, however: tune hyperparameters.\nUsing data performance assessment tuning potentially lead overoptimistic results (Cawley Talbot 2010).\ncan avoided using nested spatial CV.\nFIGURE 11.6: Schematic hyperparameter tuning performance estimation levels CV. (Figure taken Schratz et al. (2018). Permission reuse kindly granted.)\nmeans split fold five spatially disjoint subfolds used determine optimal hyperparameters (tune_level object code chunk ; see Figure 11.6 visual representation).\nfind optimal hyperparameter combination, fit 50 models (ctrl object code chunk ) subfolds randomly selected values hyperparameters C Sigma.\nrandom selection values C Sigma additionally restricted predefined tuning space (ps object).\nrange tuning space chosen values recommended literature (Schratz et al. 
2018).next stage modify learner lrn_ksvm accordance characteristics defining hyperparameter tuning makeTuneWrapper().mlr now set-fit 250 models determine optimal hyperparameters one fold.\nRepeating fold, end 1250 (250 * 5) models repetition.\nRepeated 100 times means fitting total 125,000 models identify optimal hyperparameters (Figure 11.3).\nused performance estimation, requires fitting another 500 models (5 folds * 100 repetitions; see Figure 11.3).\nmake performance estimation processing chain even clearer, let us write commands given computer:Performance level (upper left part Figure 11.6): split dataset five spatially disjoint (outer) subfolds.Tuning level (lower left part Figure 11.6): use first fold performance level split spatially five (inner) subfolds hyperparameter tuning.\nUse 50 randomly selected hyperparameters inner subfolds, .e., fit 250 models.Performance estimation: Use best hyperparameter combination previous step (tuning level) apply first outer fold performance level estimate performance (AUROC).Repeat steps 2 3 remaining four outer folds.Repeat steps 2 4, 100 times.process hyperparameter tuning performance estimation computationally intensive.\nModel runtime can reduced parallelization, can done number ways, depending operating system.\nstarting parallelization, ensure processing continues even one models throws error setting .learner.error warn.\navoids process stopping just one failed model, desirable large model runs.\ninspect failed models processing completed, dump :start parallelization, set mode multicore use mclapply() background single machine case Unix-based operating system.70\nEquivalenty, parallelStartSocket() enables parallelization Windows.\nlevel defines level enable parallelization, mlr.tuneParams determining hyperparameter tuning level parallelized (see lower left part Figure 11.6, ?parallelGetRegisteredLevels, mlr parallelization tutorial details).\nuse half available cores (set cpus parameter), setting allows possible 
other users to work on the same high performance computing cluster in case one is used (which was the case when we ran the code).\n\nSetting mc.set.seed to TRUE ensures that the randomly chosen hyperparameters during the tuning can be reproduced when running the code again.\nUnfortunately, mc.set.seed is only available under Unix-based systems.\nNow we are set up for computing the nested spatial CV.\nUsing a seed allows us to recreate the exact same spatial partitions when re-running the code.\nSpecifying the resample() parameters follows the exact same procedure as presented when using a GLM, the only difference being the extract argument.\nThis allows the extraction of the hyperparameter tuning results, which is important if we plan follow-up analyses on the tuning.\nAfter the processing, it is good practice to explicitly stop the parallelization with parallelStop().\nFinally, we save the output object (result) to disk in case we would like to use it in another R session.\nBefore running the subsequent code, be aware that it is time-consuming:\nthe 125,500 models took ~1/2hr on a server using 24 cores (see below).\nIn case you do not want to run the code locally, we have saved a subset of the results in the book's GitHub repo.\nThey can be loaded as follows:\nNote that the runtime depends on many aspects: CPU speed, the selected algorithm, the selected number of cores and the dataset.\nEven more important than the runtime is the final aggregated AUROC: the model's ability to discriminate the two classes.\nIt appears that the GLM (aggregated AUROC of 0.78) is slightly better than the SVM in this specific case.\nHowever, using more than 50 iterations in the random search would probably yield hyperparameters that result in models with a better AUROC (Schratz et al.
2018).\nOn the other hand, increasing the number of random search iterations would also increase the total number of models and thus the runtime.\nThe estimated optimal hyperparameters for each fold at the performance estimation level can also be viewed.\nThe following command shows the best hyperparameter combination of the first fold of the first iteration (recall that this results from the first 5 * 50 model runs):\nThe estimated hyperparameters have been used for the first fold in the first iteration of the performance estimation level, which resulted in the following AUROC value:\nSo far spatial CV has been used to assess the ability of learning algorithms to generalize to unseen data.\nFor spatial predictions, one would tune the hyperparameters on the complete dataset.\nThis will be covered in Chapter 14.","code":"\nlrns = listLearners(task, warn.missing.packages = FALSE)\nfilter(lrns, grepl(\"svm\", class)) %>% \n dplyr::select(class, name, short.name, package)\n#> class name short.name package\n#> 6 classif.ksvm Support Vector Machines ksvm kernlab\n#> 9 classif.lssvm Least Squares Support Vector Machine lssvm kernlab\n#> 17 classif.svm Support Vector Machines (libsvm) svm e1071\nlrn_ksvm = makeLearner(\"classif.ksvm\",\n predict.type = \"prob\",\n kernel = \"rbfdot\")\n# performance estimation level\nperf_level = makeResampleDesc(method = \"SpRepCV\", folds = 5, reps = 100)\n# five spatially disjoint partitions\ntune_level = makeResampleDesc(\"SpCV\", iters = 5)\n# use 50 randomly selected hyperparameters\nctrl = makeTuneControlRandom(maxit = 50)\n# define the outer limits of the randomly selected hyperparameters\nps = makeParamSet(\n makeNumericParam(\"C\", lower = -12, upper = 15, trafo = function(x) 2^x),\n makeNumericParam(\"sigma\", lower = -15, upper = 6, trafo = function(x) 2^x)\n )\nwrapped_lrn_ksvm = makeTuneWrapper(learner = lrn_ksvm, \n resampling = tune_level,\n par.set = ps,\n control = ctrl, \n show.info = TRUE,\n measures = mlr::auc)\nconfigureMlr(on.learner.error = \"warn\", on.error.dump = TRUE)\nlibrary(parallelMap)\nif (Sys.info()[\"sysname\"] %in% c(\"Linux\", \"Darwin\")) {\nparallelStart(mode = \"multicore\", \n # parallelize the hyperparameter tuning 
level\n level = \"mlr.tuneParams\", \n # just use half of the available cores\n cpus = round(parallel::detectCores() / 2),\n mc.set.seed = TRUE)\n}\n\nif (Sys.info()[\"sysname\"] == \"Windows\") {\n parallelStartSocket(level = \"mlr.tuneParams\",\n cpus = round(parallel::detectCores() / 2))\n}\nset.seed(12345)\nresult = mlr::resample(learner = wrapped_lrn_ksvm, \n task = task,\n resampling = perf_level,\n extract = getTuneResult,\n measures = mlr::auc)\n# stop parallelization\nparallelStop()\n# save your result, e.g.:\n# saveRDS(result, \"svm_sp_sp_rbf_50it.rds\")\nresult = readRDS(\"extdata/spatial_cv_result.rds\")\n# Exploring the results\n# runtime in minutes\nround(result$runtime / 60, 2)\n#> [1] 37.4\n# final aggregated AUROC \nresult$aggr\n#> auc.test.mean \n#> 0.758\n# same as\nmean(result$measures.test$auc)\n#> [1] 0.758\n# winning hyperparameters of tuning step, \n# i.e. the best combination out of 50 * 5 models\nresult$extract[[1]]$x\n#> $C\n#> [1] 0.458\n#> \n#> $sigma\n#> [1] 0.023\nresult$measures.test[1, ]\n#> iter auc\n#> 1 1 0.799"},{"path":"spatial-cv.html","id":"conclusions","chapter":"11 Statistical learning","heading":"11.6 Conclusions","text":"Resampling methods are an important part of the data scientist's toolbox (James et al. 2013).\nThis chapter used cross-validation to assess the predictive performance of various models.\nAs described in Section 11.4, observations with spatial coordinates may not be statistically independent due to spatial autocorrelation, violating a fundamental assumption of cross-validation.\nSpatial CV addresses this issue by reducing the bias introduced by spatial autocorrelation.\nThe mlr package facilitates (spatial) resampling techniques in combination with the most popular statistical learning techniques, including linear regression, semi-parametric models such as generalized additive models, and machine learning techniques such as random forests, SVMs, and boosted regression trees (Bischl et al. 2016; Schratz et al.
2018).\nMachine learning algorithms often require hyperparameter inputs, the optimal 'tuning' of which can require thousands of model runs and large computational resources, consuming much time, RAM and/or cores.\nmlr tackles this issue by enabling parallelization.\nMachine learning overall, and its use to understand spatial data, is a large field and this chapter has provided the basics, but there is more to learn.\nWe recommend the following resources in this direction:\nThe mlr tutorials on Machine Learning in R and Handling of spatial Data\nAn academic paper on hyperparameter tuning (Schratz et al. 2018)\nIn the case of spatio-temporal data, one should account for spatial and temporal autocorrelation when doing CV (Meyer et al. 2018)","code":""},{"path":"spatial-cv.html","id":"exercises-8","chapter":"11 Statistical learning","heading":"11.7 Exercises","text":"Compute the following terrain attributes from the dem datasets loaded with data(\"landslides\", package = \"RSAGA\") with the help of R-GIS bridges (see Chapter 9):\nSlope\nPlan curvature\nProfile curvature\nCatchment area\nExtract the values from the corresponding output rasters to the landslides data frame (data(landslides, package = \"RSAGA\")) by adding new variables called slope, cplan, cprof, elev and log_carea. Keep all landslide initiation points and 175 randomly selected non-landslide points (see Section 11.2 for details).\nUse the derived terrain attribute rasters in combination with a GLM to make a spatial prediction map similar to that shown in Figure 11.2.\nRunning data(\"study_mask\", package = \"spDataLarge\") attaches a mask of the study area.\nCompute a 100-repeated 5-fold non-spatial cross-validation and a spatial CV based on the GLM learner and compare the AUROC values from both resampling strategies with the help of boxplots (see Figure 11.5).\nHint: you need to specify a non-spatial task and a non-spatial resampling strategy.\n\nModel landslide susceptibility using a quadratic discriminant analysis (QDA, James et al.
2013).\nAssess the predictive performance (AUROC) of the QDA.\nWhat is the difference between the spatially cross-validated mean AUROC values of the QDA and the GLM?\n\nHint: before running the spatial cross-validation for both learners, set a seed to make sure that both use the same spatial partitions, which in turn guarantees comparability.\nRun the SVM without tuning the hyperparameters.\nUse the rbfdot kernel with \\(\\sigma\\) = 1 and C = 1.\nLeaving the hyperparameters unspecified in kernlab's ksvm() would otherwise initialize an automatic non-spatial hyperparameter tuning.\nFor a discussion on the need for (spatial) tuning of hyperparameters, please refer to Schratz et al. (2018).","code":""},{"path":"transport.html","id":"transport","chapter":"12 Transportation","heading":"12 Transportation","text":"","code":""},{"path":"transport.html","id":"prerequisites-10","chapter":"12 Transportation","heading":"Prerequisites","text":"This chapter uses the following packages:71","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(spDataLarge)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(stplanr) # geographic transport data package\nlibrary(tmap) # visualization package (see Chapter 8)"},{"path":"transport.html","id":"introduction-7","chapter":"12 Transportation","heading":"12.1 Introduction","text":"In few other sectors is geographic space more tangible than in transport.\nThe effort of moving (overcoming distance) is central to the 'first law' of geography, defined by Waldo Tobler in 1970 as follows (Miller 2004):\nEverything is related to everything else, but near things are more related than distant things.\nThis 'law' is the basis for spatial autocorrelation and other key geographic concepts.\nIt applies to phenomena as diverse as friendship networks and ecological diversity, and it can be explained by the costs of transport — in terms of time, energy and money — which constitute the 'friction of distance.'\nFrom this perspective, transport technologies are disruptive, changing geographic relationships between geographic entities including mobile humans and goods: \"the purpose of transportation is to overcome space\" (Rodrigue, Comtois, and Slack 2013).\nTransport is an inherently geospatial activity.\nIt involves traversing continuous geographic space between A and B, and infinite localities in between.\nIt is therefore unsurprising that transport researchers have long turned to
geocomputational methods to understand movement patterns, and that transport problems are a motivator of geocomputational methods.\nThis chapter introduces the geographic analysis of transport systems at different geographic levels, including:\nAreal units: transport patterns can be understood with reference to zonal aggregates, such as the main mode of travel (by car, bike or foot, for example) and the average distance of trips made by people living in a particular zone, covered in Section 12.3.\nDesire lines: straight lines that represent 'origin-destination' data that records how many people travel (or could travel) between places (points or zones) in geographic space, the topic of Section 12.4.\nRoutes: these are lines representing a path along the route network along the desire lines defined in the previous bullet point.\nWe will see how to create them in Section 12.5.\nNodes: these are points in the transport system that can represent common origins and destinations and public transport stations such as bus stops and rail stations, the topic of Section 12.6.\nRoute networks: these represent the system of roads, paths and other linear features in an area and are covered in Section 12.7. They can be represented as geographic features (representing route segments) or structured as an interconnected graph, with the level of traffic on different segments referred to as 'flow' by transport modelers (Hollander 2016).\nAnother key level is agents, mobile entities like you and me.\nThese can be represented computationally thanks to software such as MATSim, which captures the dynamics of transport systems using an agent-based modeling (ABM) approach at high spatial and temporal resolution (Horni, Nagel, and Axhausen 2016).\nABM is a powerful approach to transport research with great potential for integration with R's spatial classes (Thiele 2014; Lovelace and Dumont 2016), but is outside the scope of this chapter.\nBeyond geographic levels and agents, the basic unit of analysis in most transport models is the trip, a single purpose journey from an origin 'A' to a destination 'B' (Hollander 2016).\nTrips join-up the different levels of transport systems: they are usually represented as desire lines connecting zone centroids (nodes), they can be allocated onto the route network as routes, and they are made by people who can be represented as agents.\nTransport systems are dynamic systems, adding additional complexity.\nThe purpose of geographic transport modeling can be interpreted as simplifying this complexity in a way that captures the
essence of transport problems.\nSelecting an appropriate level of geographic analysis can help simplify this complexity, to capture the essence of a transport system without losing its most important features and variables (Hollander 2016).\nTypically, models are designed to solve a particular problem.\nFor this reason, this chapter is based around a policy scenario, introduced in the next section, that asks:\nhow to increase cycling in the city of Bristol?\nChapter 13 demonstrates another application of geocomputation:\nprioritising the location of new bike shops.\nThere is a link between the chapters because bike shops may benefit from new cycling infrastructure, demonstrating an important feature of transport systems: they are closely linked to broader social, economic and land-use patterns.","code":""},{"path":"transport.html","id":"bris-case","chapter":"12 Transportation","heading":"12.2 A case study of Bristol","text":"The case study used for this chapter is located in Bristol, a city in the west of England, around 30 km east of the Welsh capital Cardiff.\nAn overview of the region's transport network is illustrated in Figure 12.1, which shows a diversity of transport infrastructure for cycling, public transport, and private motor vehicles.\nFIGURE 12.1: Bristol's transport network represented by colored lines for active (green), public (railways, black) and private motor (red) modes of travel.
Blue border lines represent the inner city boundary and the larger Travel To Work Area (TTWA).\nBristol is the 10th largest city council in England, with a population of half a million people, although its travel catchment area is larger (see Section 12.3).\nIt has a vibrant economy with aerospace, media, financial service and tourism companies, alongside two major universities.\nBristol shows a high average income per capita but also contains areas of severe deprivation (Bristol City Council 2015).\nIn terms of transport, Bristol is well served by rail and road links, and it has a relatively high level of active travel.\n19% of its citizens cycle and 88% walk at least once per month according to the Active People Survey (the national average is 15% and 81%, respectively).\n8% of the population said they cycled to work in the 2011 census, compared with only 3% nationwide.\nDespite impressive walking and cycling statistics, the city has a major congestion problem.\nPart of the solution is to continue to increase the proportion of trips made by cycling.\nCycling has greater potential than walking to replace car trips because of the speed of the mode, around 3-4 times faster than walking (with typical speeds of 15-20 km/h vs 4-6 km/h for walking).\nThere is an ambitious plan to double the share of cycling by 2020.\nIn this policy context, the aim of this chapter, beyond demonstrating how geocomputation with R can be used to support sustainable transport planning, is to provide evidence for decision-makers in Bristol about how best to increase the share of walking and cycling in the city.\nThis high-level aim will be met via the following objectives:\nDescribe the geographical pattern of transport behavior in the city\nIdentify key public transport nodes and routes along which cycling to rail stations could be encouraged, as the first stage in multi-modal trips\nAnalyze travel 'desire lines', to find where many people drive short distances\nIdentify cycle route locations that will encourage less car driving and more cycling\nTo get the wheels rolling on the practical aspects of this chapter, we begin by loading zonal data on travel patterns.\nThese zone-level data are small but often vital for gaining a basic understanding of a settlement's overall transport system.","code":""},{"path":"transport.html","id":"transport-zones","chapter":"12 Transportation","heading":"12.3 Transport zones","text":"Although transport systems are primarily based on linear features and nodes —
including pathways and stations — it often makes sense to start with areal data, to break continuous space into tangible units (Hollander 2016).\nIn addition to the boundary defining the study area (Bristol in this case), two zone types are of particular interest to transport researchers: origin and destination zones.\nOften, the same geographic units are used for origins and destinations.\nHowever, different zoning systems, such as 'Workplace Zones,' may be appropriate to represent the increased density of trip destinations in areas with many 'trip attractors' such as schools and shops (Office for National Statistics 2014).\nThe simplest way to define a study area is often the first matching boundary returned by OpenStreetMap, which can be obtained using an osmdata command such as bristol_region = osmdata::getbb(\"Bristol\", format_out = \"sf_polygon\"). This results in an sf object representing the bounds of the largest matching city region, either a rectangular polygon of the bounding box or a detailed polygonal boundary.72\nFor Bristol, UK, a detailed polygon is returned, representing the official boundary of Bristol (see the inner blue boundary in Figure 12.1), but there are a couple of issues with this approach:\nThe first OSM boundary returned by OSM may not be the official boundary used by local authorities\nEven if OSM returns the official boundary, this may be inappropriate for transport research because official boundaries bear little relation to where people travel\nTravel to Work Areas (TTWAs) address these issues by creating a zoning system analogous to hydrological watersheds.\nTTWAs were first defined as contiguous zones within which 75% of the population travels to work (Coombes, Green, and Openshaw 1986), and this is the definition used in this chapter.\nBecause Bristol is a major employer attracting travel from surrounding towns, its TTWA is substantially larger than the city bounds (see Figure 12.1).\nThe polygon representing this transport-orientated boundary is stored in the object bristol_ttwa, provided by the spDataLarge package loaded at the beginning of this chapter.\nThe origin and destination zones used in this chapter are the same: officially defined zones of intermediate geographic resolution (their official name is Middle layer Super Output Areas, or MSOAs).\nEach houses around 8,000 people.\nSuch administrative zones can provide vital context to transport analysis, such as the type of people who might benefit most from particular interventions (e.g., Moreno-Monroy, Lovelace, and Ramos
2017).\nThe geographic resolution of these zones is important: small zones with high geographic resolution are usually preferable, but their high number in large regions can have consequences for processing (especially for origin-destination analysis, in which the number of possibilities increases as a non-linear function of the number of zones) (Hollander 2016).\nAnother issue with small zones is related to anonymity rules. To make it impossible to infer the identity of individuals in zones, detailed socio-demographic variables are often only available at a low geographic resolution. Breakdowns of travel mode by age and sex, for example, are available at the Local Authority level in the UK, but not at the much higher Output Area level, each of which contains around 100 households. For further details, see www.ons.gov.uk/methodology/geography.\nThe 102 zones used in this chapter are stored in bristol_zones, as illustrated in Figure 12.2.\nNote that the zones get smaller in densely populated areas: each houses a similar number of people.\nbristol_zones contains no attribute data on transport, however, only the name and code of each zone:\nTo add travel data, we will undertake an attribute join, a common task described in Section 3.2.4.\nWe will use travel data from the UK's 2011 census question on travel to work, data stored in bristol_od, which was provided by the ons.gov.uk data portal.\nbristol_od is an origin-destination (OD) dataset on travel to work between zones from the UK's 2011 Census (see Section 12.4).\nThe first column is the ID of the zone of origin and the second column is the zone of destination.\nbristol_od has more rows than bristol_zones, representing travel between zones rather than the zones themselves:\nThe results of the previous code chunk show that there are more than 10 OD pairs for every zone, meaning we will need to aggregate the origin-destination data before it is joined with bristol_zones, as illustrated below (origin-destination data is described in Section 12.4):\nThe preceding chunk:\ngrouped the data by zone of origin (contained in the column o);\naggregated the variables in the bristol_od dataset if they were numeric, to find the total number of people living in each zone by mode of transport; and73\nrenamed the grouping variable o so it matches the ID column geo_code in the bristol_zones object.\nThe resulting object zones_attr is a data frame with rows representing zones and an ID variable.\nWe can verify that the IDs match those in the zones dataset using the %in% operator as follows:\nThe results show that all 102 zones are present in the new object and that zone_attr is in a form that can be joined onto the zones.74\nThis is done using the joining function left_join()
(note that inner_join() would produce the same result):\n\nThe result is zones_joined, which contains new columns representing the total number of trips originating in each zone in the study area (almost 1/4 of a million) and their mode of travel (by bicycle, foot, car and train).\nThe geographic distribution of trip origins is illustrated in the left-hand map in Figure 12.2.\nThis shows that most zones have between 0 and 4,000 trips originating from them in the study area.\nMore trips are made by people living near the center of Bristol and fewer on the outskirts.\nWhy is this? Remember that we are only dealing with trips within the study region:\nlow trip numbers in the outskirts of the region can be explained by the fact that many people in these peripheral zones will travel to other regions outside of the study area.\nTrips outside the study region can be included in regional models by a special destination ID covering any trips that go to a zone not represented in the model (Hollander 2016).\nThe data in bristol_od, however, simply ignores such trips: it is an 'intra-zonal' model.\nIn the same way that OD datasets can be aggregated to the zone of origin, they can also be aggregated to provide information about destination zones.\nPeople tend to gravitate towards central places.\nThis explains why the spatial distribution represented in the right panel in Figure 12.2 is relatively uneven, with the most common destination zones concentrated in Bristol city center.\nThe result is zones_od, which contains a new column reporting the number of trip destinations by any mode, and was created as follows:\nA simplified version of Figure 12.2 is created with the code below (see 12-zones.R in the code folder of the book's GitHub repo to reproduce the figure and Section 8.2.6 for details on faceted maps with tmap):\nFIGURE 12.2: Number of trips (commuters) living and working in the region.
The left map shows the zone of origin of commute trips; the right map shows the zone of destination (generated by the script 12-zones.R).\n","code":"\nnames(bristol_zones)\n#> [1] \"geo_code\" \"name\" \"geometry\"\nnrow(bristol_od)\n#> [1] 2910\nnrow(bristol_zones)\n#> [1] 102\nzones_attr = bristol_od %>% \n group_by(o) %>% \n summarize_if(is.numeric, sum) %>% \n dplyr::rename(geo_code = o)\nsummary(zones_attr$geo_code %in% bristol_zones$geo_code)\n#> Mode TRUE \n#> logical 102\nzones_joined = left_join(bristol_zones, zones_attr, by = \"geo_code\")\nsum(zones_joined$all)\n#> [1] 238805\nnames(zones_joined)\n#> [1] \"geo_code\" \"name\" \"all\" \"bicycle\" \"foot\" \n#> [6] \"car_driver\" \"train\" \"geometry\"\nzones_od = bristol_od %>% \n group_by(d) %>% \n summarize_if(is.numeric, sum) %>% \n dplyr::select(geo_code = d, all_dest = all) %>% \n inner_join(zones_joined, ., by = \"geo_code\")\nqtm(zones_od, c(\"all\", \"all_dest\")) +\n tm_layout(panel.labels = c(\"Origin\", \"Destination\"))"},{"path":"transport.html","id":"desire-lines","chapter":"12 Transportation","heading":"12.4 Desire lines","text":"Unlike zones, which represent trip origins and destinations, desire lines connect the centroids of the origin and the destination zones, and thereby represent where people desire to go between zones.\nThey represent the quickest 'bee line' or 'crow flies' route between A and B that would be taken, if it were not for obstacles such as buildings and windy roads getting in the way (we will see how to convert desire lines into routes in the next section).\nWe have already loaded data representing desire lines in the dataset bristol_od.\nThis origin-destination (OD) data frame object represents the number of people traveling between the zones represented by o and d, as illustrated in Table 12.1.\nTo arrange the OD data by all trips and then filter out only the top 5, type (please refer to Chapter 3 for a detailed description of non-spatial attribute operations):\nTABLE 12.1: Sample of the top 5 origin-destination pairs in the Bristol OD data frame, representing travel desire lines between zones in the study area.\nThe resulting table provides a snapshot of Bristolian travel patterns in terms of commuting (travel to work).\nIt demonstrates that walking is the most popular mode of transport among the top 5 origin-destination pairs,
that zone E02003043 is a popular destination (Bristol city center, the destination of all the top 5 OD pairs), and that the intrazonal trips, from one part of zone E02003043 to another (first row of Table 12.1), constitute the most traveled OD pair in the dataset.\nFrom a policy perspective, however, the raw data presented in Table 12.1 is of limited use:\naside from the fact that it contains only a tiny portion of the 2,910 OD pairs, it tells us little about where policy measures are needed, or what proportion of trips are made by walking and cycling.\nThe following command calculates the percentage of each desire line that is made by these active modes:\nThere are two main types of OD pair:\ninterzonal and intrazonal.\nInterzonal OD pairs represent travel between zones in which the destination is different from the origin.\nIntrazonal OD pairs represent travel within the same zone (see the top row of Table 12.1).\nThe following code chunk splits bristol_od into these two types:\nThe next step is to convert the interzonal OD pairs into an sf object representing desire lines that can be plotted on a map with the stplanr function od2line().75\nAn illustration of the results is presented in Figure 12.3, a simplified version of which is created with the following command (see the code in 12-desire.R to reproduce the figure exactly and Chapter 8 for details on visualization with tmap):\nFIGURE 12.3: Desire lines representing trip patterns in Bristol, with width representing the number of trips and color representing the percentage of trips made by active modes (walking and cycling). The four black lines represent the interzonal OD pairs in Table 12.1.\nThe map shows that the city center dominates transport patterns in the region, suggesting policies should be prioritized there, although a number of peripheral sub-centers can also be seen.\nNext it would be interesting to have a look at the distribution of interzonal modes, e.g.
in which zones cycling is the least common means of transport.","code":"\nod_top5 = bristol_od %>% \n arrange(desc(all)) %>% \n top_n(5, wt = all)\nbristol_od$Active = (bristol_od$bicycle + bristol_od$foot) /\n bristol_od$all * 100\nod_intra = filter(bristol_od, o == d)\nod_inter = filter(bristol_od, o != d)\ndesire_lines = od2line(od_inter, zones_od)\n#> Creating centroids representing desire line start and end points.\nqtm(desire_lines, lines.lwd = \"all\")"},{"path":"transport.html","id":"routes","chapter":"12 Transportation","heading":"12.5 Routes","text":"From a geographer's perspective, routes are desire lines that are no longer straight:\nthe origin and destination points are the same, but the pathway to get from A to B is more complex.\nDesire lines contain only two vertices (their beginning and end points), but routes can contain hundreds of vertices if they cover a large distance or represent travel patterns on an intricate road network (routes on simple grid-based road networks require relatively few vertices).\nRoutes are generated from desire lines — or more commonly from origin-destination pairs — using routing services which either run locally or remotely.\nLocal routing can be advantageous in terms of speed of execution and control over the weighting profile for different modes of transport.\nDisadvantages include the difficulty of representing complex networks locally; temporal dynamics (primarily due to traffic); and the need for specialized software such as 'pgRouting,' an issue that the developers of the packages stplanr and dodgr seek to address.\nRemote routing services, by contrast, use a web API to send queries about origins and destinations and return results generated on a powerful server running specialised software.\nThis gives remote routing services various advantages, including that they usually\nhave global coverage;\nare updated regularly; and\nrun on specialist hardware and software set up for the job.\nDisadvantages of remote routing services include speed (they rely on data transfer over the internet) and price (the Google routing API, for example, limits the number of free queries).\nThe googleway package provides an interface to Google's routing API.\nFree (but rate limited) routing services include OSRM and openrouteservice.org.\nInstead of routing all desire lines generated in the previous section, which would be time and memory-consuming, we will focus on the
desire lines of policy interest.\nThe benefits of cycling trips are greatest when they replace car trips.\nClearly, not all car trips can realistically be replaced by cycling.\nHowever, a 5 km Euclidean distance (or around 6-8 km of route distance) can realistically be cycled by many people, especially if they are riding an electric bicycle ('ebike').\nWe will therefore only route desire lines along which a high (300+) number of car trips take place that are up to 5 km in distance.\nThis routing is done in the code chunk below by the stplanr function route(), which creates sf objects representing routes on the transport network, one for each desire line.\nst_length() determines the length of a linestring, and falls into the distance relations category (see also Section 4.2.7).\nSubsequently, we apply a simple attribute filter operation (see Section 3.2.1) before letting the OSRM service do the routing on the remote server.\nNote that the routing only works with a working internet connection.\nWe could keep the new route_carshort object separate from the straight line representation of the same trip in desire_carshort but, from a data management perspective, it makes more sense to combine them: they represent the same trip.\nThe new route dataset contains distance (referring to route distance this time) and duration fields (in seconds), which could be useful.\nHowever, for the purposes of this chapter, we are only interested in the geometry, from which the route distance can be calculated.\nThe following command makes use of the ability of simple features objects to contain multiple geographic columns:\nThis allows plotting the desire lines along which many short car journeys take place alongside the likely routes traveled by cars, by referring to each geometry column separately (desire_carshort$geometry and desire_carshort$geom_car in this case).\nMaking the width of the routes proportional to the number of car journeys that could potentially be replaced provides an effective way to prioritize interventions on the road network (Lovelace et al.
2017).\nThe following code chunk plots the desire lines and routes, resulting in Figure 12.4, which shows the routes along which people drive short distances:76\nFIGURE 12.4: Routes along which many (300+) short (<5km Euclidean distance) car journeys are made (red) overlaying desire lines representing the same trips (black) and zone centroids (dots).\nPlotting the results on an interactive map, with mapview::mapview(desire_carshort$geom_car) for example, shows that many short car trips take place in and around Bradley Stoke.\nIt is easy to find explanations for the area's high level of car dependency: according to Wikipedia, Bradley Stoke is \"Europe's largest new town built with private investment,\" suggesting limited public transport provision.\nFurthermore, the town is surrounded by large (cycling unfriendly) road structures, including \"junctions on both the M4 and M5 motorways\" (Tallon 2007).\nThere are many benefits of converting travel desire lines into likely routes of travel from a policy perspective, primary among them the ability to understand what it is about the surrounding environment that makes people travel by a particular mode.\nWe discuss future directions of research building on the routes in Section 12.9.\nFor the purposes of this case study, suffice to say that the roads along which these short car journeys travel should be prioritized for investigation, to understand how they can be made more conducive to sustainable transport modes.\nOne option would be to add new public transport nodes to the network.\nSuch nodes are described in the next section.","code":"\ndesire_lines$distance = as.numeric(st_length(desire_lines))\ndesire_carshort = dplyr::filter(desire_lines, car_driver > 300 & distance < 5000)\nroute_carshort = route(l = desire_carshort, route_fun = route_osrm)\ndesire_carshort$geom_car = st_geometry(route_carshort)\nplot(st_geometry(desire_carshort))\nplot(desire_carshort$geom_car, col = \"red\", add = TRUE)\nplot(st_geometry(st_centroid(zones_od)), add = TRUE)"},{"path":"transport.html","id":"nodes","chapter":"12 Transportation","heading":"12.6 Nodes","text":"Nodes in geographic transport data are zero-dimensional features (points) among the predominantly one-dimensional features (lines) that comprise the network.\nThere are two types of transport nodes:\nNodes not directly on the network, such as zone centroids — covered in the next section — or individual
origins and destinations such as houses and workplaces.\nNodes that are a part of transport networks, representing individual pathways, intersections between pathways (junctions) and points for entering or exiting a transport network, such as bus stops and train stations.\nTransport networks can be represented as graphs, in which each segment is connected (via edges representing geographic lines) to one or more other edges in the network.\nNodes outside the network can be added with \"centroid connectors\", new route segments to nearby nodes on the network (Hollander 2016).77\nEvery node in the network is then connected by one or more 'edges' that represent individual segments of the network.\nWe will see how transport networks can be represented as graphs in Section 12.7.\nPublic transport stops are particularly important nodes that can be represented as either type of node: a bus stop that is part of a road, or a large rail station that is represented by its pedestrian entry point hundreds of meters from railway tracks.\nWe will use railway stations to illustrate public transport nodes, in relation to the research question of increasing cycling in Bristol.\nThese stations are provided by spDataLarge in bristol_stations.\nA common barrier preventing people from switching away from cars for commuting to work is that the distance from home to work is too far to walk or cycle.\nPublic transport can reduce this barrier by providing a fast and high-volume option for common routes into cities.\nFrom an active travel perspective, public transport 'legs' of longer journeys divide trips into three:\nThe origin leg, typically from residential areas to public transport stations\nThe public transport leg, which typically goes from the station nearest a trip's origin to the station nearest its destination\nThe destination leg, from the station of alighting to the destination\nBuilding on the analysis conducted in Section 12.4, public transport nodes can be used to construct three-part desire lines for trips that can be taken by bus and (the mode used in this example) rail.\nThe first stage is to identify the desire lines with most public transport travel, which in our case is easy because our previously created dataset desire_lines already contains a variable describing the number of trips by train (the public transport potential could also be estimated using public transport routing services such as OpenTripPlanner).\nTo make the approach easier to follow, we will select only the top three desire lines in terms of rail use:\nThe challenge now is to 'break up' each of these lines into three pieces, representing travel via
public transport nodes.\nThis can be done by converting a desire line into a multiline object consisting of three line geometries representing the origin, public transport and destination legs of the trip.\nThis operation can be divided into three stages: matrix creation (of origins, destinations and the 'via' points representing rail stations), identification of nearest neighbors, and conversion to multilines.\nThese are undertaken by line_via().\nThis stplanr function takes input lines and points and returns a copy of the desire lines — see the Desire Lines Extended vignette on the geocompr.github.io website and ?line_via for details on how this works.\nThe output is the same as the input line, except it has new geometry columns representing the journey via public transport nodes, as demonstrated below:\nAs illustrated in Figure 12.5, the initial desire_rail lines now have three additional geometry list columns representing travel from home to the origin station, from there to the destination station, and finally from the destination station to the destination.\nIn this case, the destination leg is very short (walking distance), but the origin legs may be sufficiently far to justify investment in cycling infrastructure to encourage people to cycle to the stations on the outward leg of peoples' journey to work in the residential areas surrounding the three origin stations in Figure 12.5.\nFIGURE 12.5: Station nodes (red dots) used as intermediary points to convert straight desire lines with high rail usage (black) into three legs: to the origin station (red), via public transport (gray), and to the destination (a short blue line).\n","code":"\ndesire_rail = top_n(desire_lines, n = 3, wt = train)\nncol(desire_rail)\n#> [1] 10\ndesire_rail = line_via(desire_rail, bristol_stations)\nncol(desire_rail)\n#> [1] 13"},{"path":"transport.html","id":"route-networks","chapter":"12 Transportation","heading":"12.7 Route networks","text":"\nThe data used in this section was downloaded using osmdata.\nTo avoid having to request the data from OSM repeatedly, we will use the bristol_ways object, which contains point and line data for the case study area (see ?bristol_ways):\nThe above code chunk loaded a simple feature object representing around 3,000 segments of the transport network.\nThis is an easily manageable dataset size (transport datasets can be large, but it's best to start small).\nAs mentioned, route networks can usefully be represented as mathematical
graphs, in which nodes on the network are connected by edges.\nA number of R packages have been developed for dealing with graphs, notably igraph.\nOne can manually convert a route network into an igraph object, but the geographic attributes will be lost.\nTo overcome this issue, the SpatialLinesNetwork() function was developed in the stplanr package to represent route networks simultaneously as graphs and as a set of geographic lines.\nThis function is demonstrated below using a subset of the bristol_ways object used in previous sections. The output of the previous code chunk shows that ways_sln is a composite object with various ‘slots.’\nThese include: the spatial component of the network (named sl), its graph component (g) and the ‘weightfield,’ the edge variable used for shortest path calculation (by default segment distance).\nways_sln is of class sfNetwork, defined by the S4 class system.\nThis means that each component can be accessed using the @ operator, which is used below to extract its graph component and process it using the igraph package, before plotting the results in geographic space.\nIn the example below, the ‘edge betweenness’, meaning the number of shortest paths passing through each edge, is calculated (see ?igraph::betweenness for details and Figure 12.6).\nThe results demonstrate that each graph edge represents a segment: the segments near the center of the road network have the greatest betweenness scores.\nFIGURE 12.6: Illustration of a small route network, with segment thickness proportional to betweenness, generated using the igraph package and described in the text.\nOne can also find the shortest route between origins and destinations using this graph representation of the route network.\nThis can be done with functions such as sum_network_routes() from stplanr, which undertakes ‘local routing’ (see Section 12.5).","code":"\nsummary(bristol_ways)\n#> highway maxspeed ref geometry \n#> cycleway:1317 30 mph : 925 A38 : 214 LINESTRING :4915 \n#> rail : 832 20 mph : 556 A432 : 146 epsg:4326 : 0 \n#> road :2766 40 mph : 397 M5 : 144 +proj=long...: 0 \n#> 70 mph : 328 A4018 : 124 \n#> 50 mph : 158 A420 : 115 \n#> (Other): 490 (Other):1877 \n#> NA's :2061 NA's :2295\nways_freeway = bristol_ways %>% filter(maxspeed == \"70 mph\") \nways_sln = SpatialLinesNetwork(ways_freeway)\n#> Warning in SpatialLinesNetwork.sf(ways_freeway): Graph composed of multiple\n#> subgraphs, consider cleaning it with 
sln_clean_graph().\nslotNames(ways_sln)\n#> [1] \"sl\" \"g\" \"nb\" \"weightfield\"\nweightfield(ways_sln)\n#> [1] \"length\"\nclass(ways_sln@g)\n#> [1] \"igraph\"\ne = igraph::edge_betweenness(ways_sln@g)\nplot(ways_sln@sl$geometry, lwd = e / 500)"},{"path":"transport.html","id":"prioritizing-new-infrastructure","chapter":"12 Transportation","heading":"12.8 Prioritizing new infrastructure","text":"chapter’s final practical section demonstrates policy-relevance geocomputation transport applications identifying locations new transport infrastructure may needed.\nClearly, types analysis presented need extended complemented methods used real-world applications, discussed Section 12.9.\nHowever, stage useful , feed wider analyses.\nsummarize, : identifying short car-dependent commuting routes (generated desire lines) Section 12.5; creating desire lines representing trips rail stations Section 12.6; analysis transport systems route network using graph theory Section 12.7.final code chunk chapter combines strands analysis.\nadds car-dependent routes route_carshort newly created object, route_rail creates new column representing amount travel along centroid--centroid desire lines represent:results preceding code visualized Figure 12.7, shows routes high levels car dependency highlights opportunities cycling rail stations (subsequent code chunk creates simple version figure — see code/12-cycleways.R reproduce figure exactly).\nmethod limitations: reality, people travel zone centroids always use shortest route algorithm particular mode.\nHowever, results demonstrate routes along cycle paths prioritized car dependency public transport perspectives.\nFIGURE 12.7: Potential routes along prioritise cycle infrastructure Bristol, based access key rail stations (red dots) routes many short car journeys (north Bristol surrounding Stoke Bradley). 
Line thickness is proportional to the number of trips.\nThe results may look attractive in an interactive map, but what do they mean?\nThe routes highlighted in Figure 12.7 suggest that transport systems are intimately linked to the wider economic and social context.\nThe example of Stoke Bradley is a case in point:\nits location, lack of public transport services and active travel infrastructure help explain why it is so highly car-dependent.\nThe wider point is that car dependency has a spatial distribution which has implications for sustainable transport policies (Hickman, Ashiru, and Banister 2011).","code":"\nroute_rail = desire_rail %>%\n st_set_geometry(\"leg_orig\") %>% \n route(l = ., route_fun = route_osrm) %>% \n select(names(route_carshort))\nroute_cycleway = rbind(route_rail, route_carshort)\nroute_cycleway$all = c(desire_rail$all, desire_carshort$all)\nqtm(route_cycleway, lines.lwd = \"all\")"},{"path":"transport.html","id":"future-directions-of-travel","chapter":"12 Transportation","heading":"12.9 Future directions of travel","text":"This chapter provides a taste of the possibilities of using geocomputation for transport research.\nIt has explored some key geographic elements that make up a city’s transport system using open data and reproducible code.\nThe results could help plan where investment is needed. Transport systems operate at multiple interacting levels, meaning that geocomputational methods have great potential to generate insights into how they work.\nThere is much more that could be done in this area: it is possible to build on the foundations presented in this chapter in many directions.\nTransport is the fastest growing source of greenhouse gas emissions in many countries, and is set to become “the largest GHG emitting sector, especially in developed countries” (see EURACTIV.com).\nBecause of the highly unequal distribution of transport-related emissions across society, and the fact that transport (unlike food and heating) is not essential for well-being, there is great potential for the sector to rapidly decarbonize through demand reduction, electrification of the vehicle fleet and the uptake of active travel modes such as walking and cycling.\nThe exploration of ‘transport futures’ at the local level represents a promising direction of travel for transport-related geocomputational research. Methodologically, the foundations presented in this chapter could be extended by including more variables in the 
analysis.\nCharacteristics of the route such as speed limits, busyness and the provision of protected cycling and walking paths could be linked to ‘mode-split’ (the proportion of trips made by different modes of transport).\nBy aggregating OpenStreetMap data using buffers and the geographic data methods presented in Chapters 3 and 4, for example, it would be possible to detect the presence of green space in close proximity to transport routes.\nUsing R’s statistical modeling capabilities, this could then be used to predict current and future levels of cycling, for example. This type of analysis underlies the Propensity to Cycle Tool (PCT), a publicly accessible (see www.pct.bike) mapping tool developed in R that is being used to prioritize investment in cycling across England (Lovelace et al. 2017).\nSimilar tools could be used to encourage evidence-based transport policies related to topics such as air pollution and public transport access around the world.","code":""},{"path":"transport.html","id":"ex-transport","chapter":"12 Transportation","heading":"12.10 Exercises","text":"What is the total distance of cycleways that would be constructed if all the routes presented in Figure 12.7 were to be constructed?\nBonus: find two ways of arriving at the same answer.\nWhat proportion of trips represented in desire_lines are accounted for in the route_cycleway object?\nBonus: what proportion of trips cross the proposed routes?\nAdvanced: write code that would increase this proportion.\nThe analysis presented in this chapter is designed for teaching how geocomputation methods can be applied to transport research. If you were to do this ‘for real’ for local government or a transport consultancy, what top 3 things would you do differently?\n\n\n\nClearly, the routes identified in Figure 12.7 only provide part of the picture. How would you extend the analysis to incorporate more trips that could potentially be cycled? Imagine that you want to extend the scenario by creating key areas (not routes) for investment in place-based cycling policies such as car-free zones, cycle parking points and reduced car parking strategy. 
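Such place-based analysis typically aggregates trips onto a regular grid; the book does this in R with functions such as `raster::rasterize(..., fun = "count")`. As a language-neutral sketch of the underlying binning logic — using invented coordinates and a hypothetical `count_per_cell()` helper, not the real Bristol data — the following Python snippet counts points per cell of a 10 × 10 grid:

```python
# Sketch: count points per cell of a regular grid, mimicking the effect of
# raster::rasterize(..., fun = "count") in R. Bounding box and trip
# coordinates below are invented for illustration.

def count_per_cell(points, bbox, ncells=10):
    """Count points falling in each cell of an ncells x ncells grid."""
    xmin, ymin, xmax, ymax = bbox
    dx = (xmax - xmin) / ncells
    dy = (ymax - ymin) / ncells
    grid = [[0] * ncells for _ in range(ncells)]
    for x, y in points:
        # clamp to the last cell so points on the upper edge are kept
        col = min(int((x - xmin) / dx), ncells - 1)
        row = min(int((y - ymin) / dy), ncells - 1)
        grid[row][col] += 1
    return grid

trips = [(0.5, 0.5), (0.51, 0.52), (9.9, 9.9)]  # invented trip origins
grid = count_per_cell(trips, bbox=(0, 0, 10, 10))
```

A real workflow would use projected coordinates and weight each cell by the number of people or trips rather than raw point counts.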
raster data assist work?\nBonus: develop raster layer divides Bristol region 100 cells (10 10) provide metric related transport policy, number people trips pass cell walking average speed limit roads, bristol_ways dataset (approach taken Chapter 13).\nBonus: develop raster layer divides Bristol region 100 cells (10 10) provide metric related transport policy, number people trips pass cell walking average speed limit roads, bristol_ways dataset (approach taken Chapter 13).","code":""},{"path":"location.html","id":"location","chapter":"13 Geomarketing","heading":"13 Geomarketing","text":"","code":""},{"path":"location.html","id":"prerequisites-11","chapter":"13 Geomarketing","heading":"Prerequisites","text":"chapter requires following packages (revgeo must also installed):Required data, downloaded due courseAs convenience reader ensure easy reproducibility, made available downloaded data spDataLarge package.","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(purrr)\nlibrary(raster)\n#> Warning: multiple methods tables found for 'approxNA'\nlibrary(osmdata)\nlibrary(spDataLarge)"},{"path":"location.html","id":"introduction-8","chapter":"13 Geomarketing","heading":"13.1 Introduction","text":"chapter demonstrates skills learned Parts II can applied particular domain: geomarketing (sometimes also referred location analysis location intelligence).\nbroad field research commercial application.\ntypical example locate new shop.\naim attract visitors , ultimately, make profit.\nalso many non-commercial applications can use technique public benefit, example locate new health services (Tomintz, Clarke, Rigby 2008).People fundamental location analysis, particular likely spend time resources.\nInterestingly, ecological concepts models quite similar used store location analysis.\nAnimals plants can best meet needs certain ‘optimal’ locations, based variables change space (Muenchow et al. 
(2018); see also chapter 14).\none great strengths geocomputation GIScience general.\nConcepts methods transferable fields.\nPolar bears, example, prefer northern latitudes temperatures lower food (seals sea lions) plentiful.\nSimilarly, humans tend congregate certain places, creating economic niches (high land prices) analogous ecological niche Arctic.\nmain task location analysis find ‘optimal locations’ specific services, based available data.\nTypical research questions include:target groups live areas frequent?competing stores services located?many people can easily reach specific stores?existing services - -exploit market potential?market share company specific area?chapter demonstrates geocomputation can answer questions based hypothetical case study based real data.","code":""},{"path":"location.html","id":"case-study","chapter":"13 Geomarketing","heading":"13.2 Case study: bike shops in Germany","text":"Imagine starting chain bike shops Germany.\nstores placed urban areas many potential customers possible.\nAdditionally, hypothetical survey (invented chapter, commercial use!) 
suggests single young males (aged 20 40) likely buy products: target audience.\nlucky position sufficient capital open number shops.\nplaced?\nConsulting companies (employing geomarketing analysts) happily charge high rates answer questions.\nLuckily, can help open data open source software.\nfollowing sections demonstrate techniques learned first chapters book can applied undertake common steps service location analysis:Tidy input data German census (Section 13.3)Convert tabulated census data raster objects (Section 13.4)Identify metropolitan areas high population densities (Section 13.5)Download detailed geographic data (OpenStreetMap, osmdata) areas (Section 13.6)Create rasters scoring relative desirability different locations using map algebra (Section 13.7)Although applied steps specific case study, generalized many scenarios store location public service provision.","code":""},{"path":"location.html","id":"tidy-the-input-data","chapter":"13 Geomarketing","heading":"13.3 Tidy the input data","text":"German government provides gridded census data either 1 km 100 m resolution.\nfollowing code chunk downloads, unzips reads 1 km data.\nPlease note census_de also available spDataLarge package (data(\"census_de\", package = \"spDataLarge\").census_de object data frame containing 13 variables 300,000 grid cells across Germany.\nwork, need subset : Easting (x) Northing (y), number inhabitants (population; pop), mean average age (mean_age), proportion women (women) average household size (hh_size).\nvariables selected renamed German English code chunk summarized Table 13.1.\n, mutate_all() used convert values -1 -9 (meaning unknown) NA.TABLE 13.1: Categories variable census data Datensatzbeschreibung…xlsx located downloaded file census.zip (see Figure 13.1 spatial distribution).","code":"\ndownload.file(\"https://tinyurl.com/ybtpkwxz\", \n destfile = \"census.zip\", mode = \"wb\")\nunzip(\"census.zip\") # unzip the files\ncensus_de = readr::read_csv2(list.files(pattern 
= \"Gitter.csv\"))\n# pop = population, hh_size = household size\ninput = dplyr::select(census_de, x = x_mp_1km, y = y_mp_1km, pop = Einwohner,\n women = Frauen_A, mean_age = Alter_D,\n hh_size = HHGroesse_D)\n# set -1 and -9 to NA\ninput_tidy = mutate_all(input, list(~ifelse(. %in% c(-1, -9), NA, .)))"},{"path":"location.html","id":"create-census-rasters","chapter":"13 Geomarketing","heading":"13.4 Create census rasters","text":"preprocessing, data can converted raster stack brick (see Sections 2.3.4 3.3.1).\nrasterFromXYZ() makes really easy.\nrequires input data frame first two columns represent coordinates regular grid.\nremaining columns (: pop, women, mean_age, hh_size) serve input raster brick layers (Figure 13.1; see also code/13-location-jm.R github repository).\nFIGURE 13.1: Gridded German census data 2011 (see Table 13.1 description classes).\nnext stage reclassify values rasters stored input_ras accordance survey mentioned Section 13.2, using raster function reclassify(), introduced Section 4.3.3.\ncase population data, convert classes numeric data type using class means.\nRaster cells assumed population 127 value 1 (cells ‘class 1’ contain 3 250 inhabitants) 375 value 2 (containing 250 500 inhabitants), (see Table 13.1).\ncell value 8000 inhabitants chosen ‘class 6’ cells contain 8000 people.\ncourse, approximations true population, precise values.78\nHowever, level detail sufficient delineate metropolitan areas (see next section).contrast pop variable, representing absolute estimates total population, remaining variables re-classified weights corresponding weights used survey.\nClass 1 variable women, instance, represents areas 0 40% population female;\nreclassified comparatively high weight 3 target demographic predominantly male.\nSimilarly, classes containing youngest people highest proportion single households reclassified high weights.Note made sure order reclassification matrices list elements input_ras.\ninstance, first element corresponds 
cases population.\nSubsequently, -loop applies reclassification matrix corresponding raster layer.\nFinally, code chunk ensures reclass layers name layers input_ras.","code":"\ninput_ras = rasterFromXYZ(input_tidy, crs = st_crs(3035)$proj4string)\ninput_ras\n#> class : RasterBrick\n#> dimensions : 868, 642, 557256, 4 (nrow, ncol, ncell, nlayers)\n#> resolution : 1000, 1000 (x, y)\n#> extent : 4031000, 4673000, 2684000, 3552000 (xmin, xmax, ymin, ymax)\n#> coord. ref. : +proj=laea +lat_0=52 +lon_0=10\n#> names : pop, women, mean_age, hh_size \n#> min values : 1, 1, 1, 1 \n#> max values : 6, 5, 5, 5\nrcl_pop = matrix(c(1, 1, 127, 2, 2, 375, 3, 3, 1250, \n 4, 4, 3000, 5, 5, 6000, 6, 6, 8000), \n ncol = 3, byrow = TRUE)\nrcl_women = matrix(c(1, 1, 3, 2, 2, 2, 3, 3, 1, 4, 5, 0), \n ncol = 3, byrow = TRUE)\nrcl_age = matrix(c(1, 1, 3, 2, 2, 0, 3, 5, 0),\n ncol = 3, byrow = TRUE)\nrcl_hh = rcl_women\nrcl = list(rcl_pop, rcl_women, rcl_age, rcl_hh)\nreclass = input_ras\nfor (i in seq_len(nlayers(reclass))) {\n reclass[[i]] = reclassify(x = reclass[[i]], rcl = rcl[[i]], right = NA)\n}\nnames(reclass) = names(input_ras)\nreclass\n#> ... 
(full output not shown)\n#> names : pop, women, mean_age, hh_size \n#> min values : 127, 0, 0, 0 \n#> max values : 8000, 3, 3, 3"},{"path":"location.html","id":"define-metropolitan-areas","chapter":"13 Geomarketing","heading":"13.5 Define metropolitan areas","text":"define metropolitan areas pixels 20 km2 inhabited 500,000 people.\nPixels coarse resolution can rapidly created using aggregate(), introduced Section 5.3.3.\ncommand uses argument fact = 20 reduce resolution result twenty-fold (recall original raster resolution 1 km2):next stage keep cells half million people.Plotting reveals eight metropolitan regions (Figure 13.2).\nregion consists one raster cells.\nnice join cells belonging one region.\nraster’s clump() command exactly .\nSubsequently, rasterToPolygons() converts raster object spatial polygons, st_as_sf() converts sf-object.polys now features column named clumps indicates metropolitan region polygon belongs use dissolve polygons coherent single regions (see also Section 5.2.6):Given column input, summarize() dissolves geometry.\nFIGURE 13.2: aggregated population raster (resolution: 20 km) identified metropolitan areas (golden polygons) corresponding names.\nresulting eight metropolitan areas suitable bike shops (Figure 13.2; see also code/13-location-jm.R creating figure) still missing name.\nreverse geocoding approach can settle problem.\nGiven coordinate, reverse geocoding finds corresponding address.\nConsequently, extracting centroid coordinate metropolitan area can serve input reverse geocoding API.\nrevgeo package provides access open source Photon geocoder OpenStreetMap, Google Maps Bing.\ndefault, uses Photon API.\nrevgeo::revgeo() accepts geographical coordinates (latitude/longitude); therefore, first requirement bring metropolitan polygons appropriate coordinate reference system (Chapter 6).Choosing frame revgeocode()’s output option give back data.frame several columns referring location including street name, house number city.make sure 
reader uses exact results, put spDataLarge object metro_names.TABLE 13.2: Result reverse geocoding.Overall, satisfied city column serving metropolitan names (Table 13.2) apart one exception, namely Wülfrath belongs greater region Düsseldorf.\nHence, replace Wülfrath Düsseldorf (Figure 13.2).\nUmlauts like ü might lead trouble , example determining bounding box metropolitan area opq() (see ), avoid .","code":"\npop_agg = aggregate(reclass$pop, fact = 20, fun = sum)\nsummary(pop_agg)\n#> pop\n#> Min. 127\n#> 1st Qu. 39886\n#> Median 66008\n#> 3rd Qu. 105696\n#> Max. 1204870\n#> NA's 447\npop_agg = pop_agg[pop_agg > 500000, drop = FALSE] \npolys = pop_agg %>% \n clump() %>%\n rasterToPolygons() %>%\n st_as_sf()\nmetros = polys %>%\n group_by(clumps) %>%\n summarize()\nmetros_wgs = st_transform(metros, 4326)\ncoords = st_centroid(metros_wgs) %>%\n st_coordinates() %>%\n round(4)\nlibrary(revgeo)\nmetro_names = revgeo(longitude = coords[, 1], latitude = coords[, 2], \n output = \"frame\")\nmetro_names = dplyr::pull(metro_names, city) %>% \n as.character() %>% \n ifelse(. 
== \"Wülfrath\", \"Duesseldorf\", .)"},{"path":"location.html","id":"points-of-interest","chapter":"13 Geomarketing","heading":"13.6 Points of interest","text":"\nosmdata package provides easy--use access OSM data (see also Section 7.2).\nInstead downloading shops whole Germany, restrict query defined metropolitan areas, reducing computational load providing shop locations areas interest.\nsubsequent code chunk using number functions including:map() (tidyverse equivalent lapply()), iterates eight metropolitan names subsequently define bounding box OSM query function opq() (see Section 7.2).add_osm_feature() specify OSM elements key value shop (see wiki.openstreetmap.org list common key:value pairs).osmdata_sf(), converts OSM data spatial objects (class sf).(), tries repeatedly (three times case) download data fails first time.79\nrunning code: please consider download almost 2GB data.\nsave time resources, put output named shops spDataLarge.\nmake available environment ensure spDataLarge package loaded, run data(\"shops\", package = \"spDataLarge\").highly unlikely shops defined metropolitan areas.\nfollowing condition simply checks least one shop region.\n, recommend try download shops /specific region/s.make sure list element (sf data frame) comes columns, keep osm_id shop columns help another map loop.\ngiven since OSM contributors equally meticulous collecting data.\nFinally, rbind shops one large sf object.easier simply use map_dfr().\nUnfortunately, far work harmony sf objects.\nNote: shops provided spDataLarge package.thing left convert spatial point object raster (see Section 5.4.3).\nsf object, shops, converted raster parameters (dimensions, resolution, CRS) reclass object.\nImportantly, count() function used calculate number shops cell.result subsequent code chunk therefore estimate shop density (shops/km2).\nst_transform() used rasterize() ensure CRS inputs match.raster layers (population, women, mean age, household size) poi raster reclassified four 
classes (see Section 13.4).\nDefining class intervals arbitrary undertaking certain degree.\nOne can use equal breaks, quantile breaks, fixed values others.\n, choose Fisher-Jenks natural breaks approach minimizes within-class variance, result provides input reclassification matrix.","code":"\nshops = map(metro_names, function(x) {\n message(\"Downloading shops of: \", x, \"\\n\")\n # give the server a bit time\n Sys.sleep(sample(seq(5, 10, 0.1), 1))\n query = opq(x) %>%\n add_osm_feature(key = \"shop\")\n points = osmdata_sf(query)\n # request the same data again if nothing has been downloaded\n iter = 2\n while (nrow(points$osm_points) == 0 & iter > 0) {\n points = osmdata_sf(query)\n iter = iter - 1\n }\n points = st_set_crs(points$osm_points, 4326)\n})\n# checking if we have downloaded shops for each metropolitan area\nind = map(shops, nrow) == 0\nif (any(ind)) {\n message(\"There are/is still (a) metropolitan area/s without any features:\\n\",\n paste(metro_names[ind], collapse = \", \"), \"\\nPlease fix it!\")\n}\n# select only specific columns\nshops = map(shops, dplyr::select, osm_id, shop)\n# putting all list elements into a single data frame\nshops = do.call(rbind, shops)\nshops = st_transform(shops, proj4string(reclass))\n# create poi raster\npoi = rasterize(x = shops, y = reclass, field = \"osm_id\", fun = \"count\")\n# construct reclassification matrix\nint = classInt::classIntervals(values(poi), n = 4, style = \"fisher\")\nint = round(int$brks)\nrcl_poi = matrix(c(int[1], rep(int[-c(1, length(int))], each = 2), \n int[length(int)] + 1), ncol = 2, byrow = TRUE)\nrcl_poi = cbind(rcl_poi, 0:3) \n# reclassify\npoi = reclassify(poi, rcl = rcl_poi, right = NA) \nnames(poi) = \"poi\""},{"path":"location.html","id":"identifying-suitable-locations","chapter":"13 Geomarketing","heading":"13.7 Identifying suitable locations","text":"steps remain combining layers add poi reclass raster stack remove population layer .\nreasoning latter twofold.\nFirst , already 
delineated metropolitan areas, areas population density average compared rest Germany.\nSecond, though advantageous many potential customers within specific catchment area, sheer number alone might actually represent desired target group.\ninstance, residential tower blocks areas high population density necessarily high purchasing power expensive cycle components.\nachieved complementary functions addLayer() dropLayer():common data science projects, data retrieval ‘tidying’ consumed much overall workload far.\nclean data, final step — calculating final score summing raster layers — can accomplished single line code.instance, score greater 9 might suitable threshold indicating raster cells bike shop placed (Figure 13.3; see also code/13-location-jm.R).\nFIGURE 13.3: Suitable areas (.e., raster cells score > 9) accordance hypothetical survey bike stores Berlin.\n","code":"\n# add poi raster\nreclass = addLayer(reclass, poi)\n# delete population raster\nreclass = dropLayer(reclass, \"pop\")\n# calculate the total score\nresult = sum(reclass)"},{"path":"location.html","id":"discussion-and-next-steps","chapter":"13 Geomarketing","heading":"13.8 Discussion and next steps","text":"presented approach typical example normative usage GIS (P. 
Longley 2015).\ncombined survey data expert-based knowledge assumptions (definition metropolitan areas, defining class intervals, definition final score threshold).\napproach less suitable scientific research applied analysis provides evidence based indication areas suitable bike shops compared sources information.\nnumber changes approach improve analysis:used equal weights calculating final scores factors, household size, important portion women mean ageWe used points interest related bike shops, --, hardware, bicycle, fishing, hunting, motorcycles, outdoor sports shops (see range shop values available OSM Wiki) may yielded refined resultsData higher resolution may improve output (see exercises)used limited set variables data sources, INSPIRE geoportal data cycle paths OpenStreetMap, may enrich analysis (see also Section 7.2)Interactions remained unconsidered, possible relationships portion men single householdsIn short, analysis extended multiple directions.\nNevertheless, given first impression understanding obtain deal spatial data R within geomarketing context.Finally, point presented analysis merely first step finding suitable locations.\nfar identified areas, 1 1 km size, representing potentially suitable locations bike shop accordance survey.\nSubsequent steps analysis taken:Find optimal location based number inhabitants within specific catchment area.\nexample, shop reachable many people possible within 15 minutes traveling bike distance (catchment area routing).\nThereby, account fact away people shop, unlikely becomes actually visit (distance decay function).Also good idea take account competitors.\n, already bike shop vicinity chosen location, possible customers (sales potential) distributed competitors (Huff 1963; Wieland 2017).need find suitable affordable real estate, e.g., terms accessibility, availability parking spots, desired frequency passers-, big windows, etc.","code":""},{"path":"location.html","id":"exercises-9","chapter":"13 
Geomarketing","heading":"13.9 Exercises","text":"used raster::rasterFromXYZ() convert input_tidy raster brick.\nTry achieve help sp::gridded() function.\nused raster::rasterFromXYZ() convert input_tidy raster brick.\nTry achieve help sp::gridded() function.\nDownload csv file containing inhabitant information 100-m cell resolution (https://www.zensus2011.de/SharedDocs/Downloads/DE/Pressemitteilung/DemografischeGrunddaten/csv_Bevoelkerung_100m_Gitter.zip?__blob=publicationFile&v=3).\nPlease note unzipped file size 1.23 GB.\nread R, can use readr::read_csv.\ntakes 30 seconds machine (16 GB RAM)\ndata.table::fread() might even faster, returns object class data.table().\nUse .tibble() convert tibble.\nBuild inhabitant raster, aggregate cell resolution 1 km, compare difference inhabitant raster (inh) created using class mean values.Download csv file containing inhabitant information 100-m cell resolution (https://www.zensus2011.de/SharedDocs/Downloads/DE/Pressemitteilung/DemografischeGrunddaten/csv_Bevoelkerung_100m_Gitter.zip?__blob=publicationFile&v=3).\nPlease note unzipped file size 1.23 GB.\nread R, can use readr::read_csv.\ntakes 30 seconds machine (16 GB RAM)\ndata.table::fread() might even faster, returns object class data.table().\nUse .tibble() convert tibble.\nBuild inhabitant raster, aggregate cell resolution 1 km, compare difference inhabitant raster (inh) created using class mean values.Suppose bike shop predominantly sold electric bikes older people.\nChange age raster accordingly, repeat remaining analyses compare changes original result.Suppose bike shop predominantly sold electric bikes older people.\nChange age raster accordingly, repeat remaining analyses compare changes original result.","code":""},{"path":"eco.html","id":"eco","chapter":"14 Ecology","heading":"14 Ecology","text":"","code":""},{"path":"eco.html","id":"prerequisites-12","chapter":"14 Ecology","heading":"Prerequisites","text":"chapter assumes strong grasp geographic data analysis 
processing, covered in Chapters 2 to 5.\nIt also makes use of R’s interfaces to dedicated GIS software, and spatial cross-validation, topics covered in Chapters 9 and 11, respectively. The chapter uses the following packages:","code":"\nlibrary(sf)\nlibrary(raster)\n# library(RQGIS)\nlibrary(mlr)\nlibrary(dplyr)\nlibrary(vegan)"},{"path":"eco.html","id":"introduction-9","chapter":"14 Ecology","heading":"14.1 Introduction","text":"In this chapter we will model the floristic gradient of fog oases to reveal distinctive vegetation belts that are clearly controlled by water availability.\nTo do so, we will bring together concepts presented in previous chapters and even extend them (Chapters 2 to 5 and Chapters 9 and 11). Fog oases are one of the most fascinating vegetation formations we have ever encountered.\nThese formations, locally termed lomas, develop on mountains along the coastal deserts of Peru and Chile.80\nThe deserts’ extreme conditions and remoteness provide the habitat for a unique ecosystem, including species endemic to the fog oases.\nDespite the arid conditions and low levels of precipitation of around 30-50 mm per year on average, fog deposition increases the amount of water available to plants during the austral winter.\nThis results in green southern-facing mountain slopes along the coastal strip of Peru (Figure 14.1).\nThe fog, which develops below the temperature inversion caused by the cold Humboldt current in the austral winter, provides the name for this habitat.\nEvery few years, the El Niño phenomenon brings torrential rainfall to this sun-baked environment (Dillon, Nakazawa, and Leiva 2003).\nThis causes the desert to bloom, and provides tree seedlings the chance to develop roots long enough to survive the following arid conditions. Unfortunately, fog oases are heavily endangered.\nThis is mostly due to human activity (agriculture and climate change).\nTo effectively protect the last remnants of this unique vegetation ecosystem, evidence is needed on the composition and spatial distribution of the native flora (Muenchow, Bräuning, et al. 2013; Muenchow, Hauenstein, et al. 
2013).\nLomas mountains also economic value tourist destination, can contribute well-local people via recreation.\nexample, Peruvians live coastal desert, lomas mountains frequently closest “green” destination.chapter demonstrate ecological applications techniques learned previous chapters.\ncase study involve analyzing composition spatial distribution vascular plants southern slope Mt. Mongón, lomas mountain near Casma central northern coast Peru (Figure 14.1).\nFIGURE 14.1: Mt. Mongón study area, Muenchow, Schratz, Brenning (2017).\nfield study Mt. Mongón, recorded vascular plants living 100 randomly sampled 4x4 m2 plots austral winter 2011 (Muenchow, Bräuning, et al. 2013).\nsampling coincided strong La Niña event year (see ENSO monitoring NOASS Climate Prediction Center).\nled even higher levels aridity usual coastal desert.\nhand, also increased fog activity southern slopes Peruvian lomas mountains.Ordinations dimension-reducing techniques allow extraction main gradients (noisy) dataset, case floristic gradient developing along southern mountain slope (see next section).\nchapter model first ordination axis, .e., floristic gradient, function environmental predictors altitude, slope, catchment area NDVI.\n, make use random forest model - popular machine learning algorithm (Breiman 2001).\nmodel allow us make spatial predictions floristic composition anywhere study area.\nguarantee optimal prediction, advisable tune beforehand hyperparameters help spatial cross-validation (see Section 11.5.2).","code":""},{"path":"eco.html","id":"data-and-data-preparation","chapter":"14 Ecology","heading":"14.2 Data and data preparation","text":"data needed subsequent analyses available via spDataLarge package.study_area sf polygon representing outlines study area.\nrandom_points sf object, contains 100 randomly chosen sites.\ncomm community matrix wide data format (Wickham 2014) rows represent visited sites field columns observed species.81The values represent species cover per 
site, the recorded area covered by each species as a proportion of the site area in percentage points (%; please note that one site can have >100% due to overlapping cover between individual plants).\nThe rownames of comm correspond to the id column of random_points.\ndem is the digital elevation model (DEM) for the study area, and ndvi is the Normalized Difference Vegetation Index (NDVI) computed from the red and near-infrared channels of a Landsat scene (see Section 4.3.3 and ?ndvi).\nVisualizing the data helps to get familiar with it, as shown in Figure 14.2 where the dem is overplotted by the random_points and the study_area.\nFIGURE 14.2: Study mask (polygon), location of the sampling sites (black points) and DEM in the background.\nThe next step is to compute the variables which we need for the modeling and predictive mapping (see Section 14.4.2) but also for aligning the Non-metric multidimensional scaling (NMDS) axes with the main gradient of the study area, altitude and humidity, respectively (see Section 14.3).\nSpecifically, we compute catchment slope and catchment area from a digital elevation model using R-GIS bridges (see Chapter 9).\nCurvatures might also represent valuable predictors, and in the Exercise section you can find out how they would change the modeling result.\nTo compute catchment area and catchment slope, we make use of the saga:sagawetnessindex function.82\nget_usage() returns all function parameters and default values of a specific geoalgorithm.\nHere, we present only a selection of the complete output.\nSubsequently, we can specify the needed parameters using R named arguments (see Section 9.2).\nRemember that we can use a RasterLayer living in R’s global environment to specify the input raster DEM (see Section 9.2).\nSpecifying 1 as the SLOPE_TYPE makes sure that the algorithm will return the catchment slope.\nThe resulting output rasters are saved to temporary files with an .sdat extension, which is a SAGA raster format.\nSetting load_output to TRUE ensures that the resulting rasters will be imported into R.\nThis returns a list named ep consisting of two elements: AREA and SLOPE.\nLet us add two more raster objects to the list, namely dem and ndvi, and convert it into a raster stack (see Section 2.3.4).\nAdditionally, the catchment area values are highly skewed to the right (hist(ep$carea)).\nA log10-transformation makes the distribution more normal.\nAs a convenience to the reader, we have added ep to spDataLarge:\nFinally, we can extract the terrain attributes to our field observations (see also Section 5.4.2).","code":"\ndata(\"study_area\", \"random_points\", \"comm\", \"dem\", \"ndvi\", package = \"spDataLarge\")\n# sites 35 to 40 and corresponding occurrences of the first five species in the\n# community matrix\ncomm[35:40, 1:5]\n#> Alon_meri Alst_line Alte_hali Alte_porr Anth_eccr\n#> 35 0 0 0 0.0 1.000\n#> 36 0 0 1 0.0 0.500\n#> 37 0 0 0 0.0 0.125\n#> 38 0 0 0 0.0 3.000\n#> 39 0 0 0 0.0 2.000\n#> 40 0 0 0 0.2 0.125\nget_usage(\"saga:sagawetnessindex\")\n#>ALGORITHM: Saga wetness index\n#> DEM \n#> ...\n#> SLOPE_TYPE \n#> ...\n#> AREA \n#> SLOPE \n#> AREA_MOD \n#> TWI \n#> ...\n#>SLOPE_TYPE(Type of Slope)\n#> 0 - [0] local slope\n#> 1 - [1] catchment slope\n#> ...\n# environmental predictors: catchment slope and catchment area\nep = run_qgis(alg = \"saga:sagawetnessindex\",\n DEM = dem,\n SLOPE_TYPE = 1, \n SLOPE = tempfile(fileext = \".sdat\"),\n AREA = tempfile(fileext = \".sdat\"),\n load_output = TRUE,\n show_output_paths = FALSE)\nep = stack(c(dem, ndvi, ep))\nnames(ep) = c(\"dem\", \"ndvi\", \"carea\", \"cslope\")\nep$carea = log10(ep$carea)\ndata(\"ep\", package = \"spDataLarge\")\nrandom_points[, names(ep)] = raster::extract(ep, random_points)"},{"path":"eco.html","id":"nmds","chapter":"14 Ecology","heading":"14.3 Reducing dimensionality","text":"Ordinations are a popular tool in vegetation science to extract the main information, frequently corresponding to ecological gradients, from large species-plot matrices mostly filled with 0s.\nHowever, they are also used in remote sensing, the soil sciences, geomarketing and many other fields.\nIf you are unfamiliar with ordination techniques or in need of a refresher, have a look at Michael W. Palmer’s web page for a short introduction to popular ordination techniques in ecology, and at Borcard, Gillet, and Legendre (2011) for a deeper look at how to apply these techniques in R.\nvegan’s package documentation is also a helpful resource (vignette(package = \"vegan\")).\nPrincipal component analysis (PCA) is probably the most famous ordination technique.\nIt is a great tool to reduce dimensionality if one can expect linear relationships between variables, and if the joint absence of a variable (for example calcium) in two plots (observations) can be considered a similarity.\nThis is barely the case with vegetation data.\nFor one, the relationships are usually non-linear along environmental gradients.\nThat means the presence of a plant usually follows a unimodal relationship along a gradient (e.g., humidity, temperature or salinity) with a peak at the most favorable conditions and declining ends towards the unfavorable conditions.\nSecondly, the joint absence of a species in two plots is hardly an indication of similarity.\nSuppose a plant species is absent from the driest (e.g., an extreme desert) and the moistest locations (e.g., a tree savanna) of our sampling.\nThen we really should refrain from counting this as a similarity because it is very likely that the only thing these two completely different environmental settings have in common in terms of floristic composition is the shared absence of species (except for rare ubiquitous species).\nNon-metric multidimensional scaling (NMDS) is one popular dimension-reducing technique in ecology (von Wehrden et al. 2009).\nNMDS reduces the rank-based differences between the distances between objects in the original matrix and the distances between the ordinated objects.\nThe difference is expressed as stress.\nThe lower the stress value, the better the ordination, i.e., the low-dimensional representation of the original matrix.\nStress values lower than 10 represent an excellent fit, stress values of around 15 are still good, and values greater than 20 represent a poor fit (McCune, Grace, and Urban 2002).\nIn R, metaMDS() of the vegan package can execute a NMDS.\nAs input, it expects a community matrix with the sites as rows and the species as columns.\nOften ordinations using presence-absence data yield better results (in terms of explained variance), though the prize for this is, of course, a less informative input matrix (see also Exercises).\ndecostand() converts numerical observations into presences and absences, with 1 indicating the occurrence of a species and 0 the absence of a species.\nOrdination techniques such as NMDS require at least one observation per site.\nHence, we need to dismiss all sites in which no species were found.\nThe resulting output matrix serves as input for the NMDS.\nk specifies the number of output axes, here set to 4.83\nNMDS is an iterative procedure trying to make the ordinated space more similar to the input matrix in each step.\nTo make sure that the algorithm converges, we set the number of steps to 500 (try parameter).\nA stress value of 9 represents a very good result, which means that the reduced ordination space represents the large majority of the variance of the input matrix.\nOverall, NMDS puts objects that are more similar (in terms of species composition) closer together in ordination space.\nHowever, as opposed to most other ordination techniques, the axes are arbitrary and not necessarily ordered by importance (Borcard, Gillet, and Legendre 2011).\nHowever, we already know that humidity represents the main gradient in the study area (Muenchow, Bräuning, et al. 2013; Muenchow, Schratz, and Brenning 2017).\nSince humidity is highly correlated with elevation, we rotate the NMDS axes in accordance with elevation (see also ?MDSrotate for more details on rotating NMDS axes).\nPlotting the result reveals that the first axis is, as intended, clearly associated with altitude (Figure 14.3).\nFIGURE 14.3: Plotting the first NMDS axis against altitude.\nThe scores of the first NMDS axis represent the different vegetation formations, i.e., the floristic gradient, appearing along the slope of Mt. Mongón.
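The presence-absence preparation described above (convert abundances to 0/1, then dismiss sites without any species) is done with vegan's decostand() and a rowSums() filter in the book's R code; as a language-neutral illustration, here is a minimal sketch in Python/NumPy, where the community matrix values are invented for illustration:

```python
import numpy as np

# Invented community matrix: 6 sites (rows) x 4 species (columns),
# values are percentage cover (for illustration only)
comm = np.array([
    [0.0, 1.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0],   # site in which no species was found
    [2.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 0.0, 0.0],   # another empty site
    [0.2, 0.2, 0.0, 0.0],
])

# 1 indicates the occurrence of a species, 0 its absence
# (the analogue of decostand(comm, "pa"))
pa = (comm > 0).astype(int)

# NMDS requires at least one observation per site, so drop empty sites
# (the analogue of pa[rowSums(pa) != 0, ])
pa = pa[pa.sum(axis=1) > 0]

print(pa.shape)  # (4, 4): the two empty sites were dropped
```

The same two steps scale unchanged to the chapter's 100-site by 69-species matrix.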
To visualize it spatially, we can model the NMDS scores with the previously created predictors (Section 14.2), and use the resulting model for predictive mapping (see the next section).","code":"\n# presence-absence matrix\npa = decostand(comm, \"pa\") # 100 rows (sites), 69 columns (species)\n# keep only sites in which at least one species was found\npa = pa[rowSums(pa) != 0, ] # 84 rows, 69 columns\nset.seed(25072018)\nnmds = metaMDS(comm = pa, k = 4, try = 500)\nnmds$stress\n#> ...\n#> Run 498 stress 0.08834745 \n#> ... Procrustes: rmse 0.004100446 max resid 0.03041186 \n#> Run 499 stress 0.08874805 \n#> ... Procrustes: rmse 0.01822361 max resid 0.08054538 \n#> Run 500 stress 0.08863627 \n#> ... Procrustes: rmse 0.01421176 max resid 0.04985418 \n#> *** Solution reached\n#> 0.08831395\nelev = dplyr::filter(random_points, id %in% rownames(pa)) %>% \n dplyr::pull(dem)\n# rotating NMDS in accordance with altitude (proxy for humidity)\nrotnmds = MDSrotate(nmds, elev)\n# extracting the first two axes\nsc = scores(rotnmds, choices = 1:2)\n# plotting the first axis against altitude\nplot(y = sc[, 1], x = elev, xlab = \"elevation in m\", \n ylab = \"First NMDS axis\", cex.lab = 0.8, cex.axis = 0.8)\nknitr::include_graphics(\"figures/xy-nmds-1.png\")"},{"path":"eco.html","id":"modeling-the-floristic-gradient","chapter":"14 Ecology","heading":"14.4 Modeling the floristic gradient","text":"To predict the floristic gradient spatially, we make use of a random forest model (Hengl et al. 2018).\nRandom forest models are frequently used in environmental and ecological modeling, and often provide the best results in terms of predictive performance (Schratz et al. 2018).\nHere, we shortly introduce decision trees and bagging, since they form the basis of random forests.\nWe refer the reader to James et al. (2013) for a more detailed description of random forests and related techniques.\nTo introduce decision trees by example, we first construct a response-predictor matrix by joining the rotated NMDS scores to the field observations (random_points).\nWe will also use the resulting data frame for the mlr modeling later on.\nDecision trees split the predictor space into a number of regions.\nTo illustrate this, we apply a decision tree to our data using the scores of the first NMDS axis as the response (sc) and altitude (dem) as the only predictor.\nFIGURE 14.4: Simple example of a decision tree with three internal nodes and four terminal nodes.\nThe resulting tree consists of three internal nodes and four terminal nodes (Figure 14.4).\nThe first internal node at the top of the tree assigns all observations which are below 328.5 m to the left and all other observations to the right branch.\nThe observations falling into the left branch have a mean NMDS score of -1.198.\nOverall, we can interpret the tree as follows: the higher the elevation, the higher the NMDS score becomes.\nDecision trees have a tendency to overfit, that is, they mirror too closely the input data including its noise, which in turn leads to bad predictive performances (Section 11.4; James et al. 2013).\nBootstrap aggregation (bagging) is an ensemble technique that helps to overcome this problem.\nEnsemble techniques simply combine the predictions of multiple models.\nThus, bagging takes repeated samples from the same input data and averages the predictions.\nThis reduces the variance and overfitting, with the result of a much better predictive accuracy compared to decision trees.\nFinally, random forests extend and improve bagging by decorrelating trees, which is desirable since averaging the predictions of highly correlated trees shows a higher variance and thus lower reliability than averaging the predictions of decorrelated trees (James et al. 2013).
To achieve this, random forests use bagging, but in contrast to traditional bagging, where each tree is allowed to use all available predictors, random forests only use a random sample of all available predictors.","code":"\n# construct response-predictor matrix\n# id- and response variable\nrp = data.frame(id = as.numeric(rownames(sc)), sc = sc[, 1])\n# join the predictors (dem, ndvi and terrain attributes)\nrp = inner_join(random_points, rp, by = \"id\")\nlibrary(\"tree\")\ntree_mo = tree(sc ~ dem, data = rp)\nplot(tree_mo)\ntext(tree_mo, pretty = 0)"},{"path":"eco.html","id":"mlr-building-blocks","chapter":"14 Ecology","heading":"14.4.1 mlr building blocks","text":"The code in this section largely follows the steps introduced in Section 11.5.2.\nThe differences are the following:\nThe response variable is numeric, hence a regression task will replace the classification task of Section 11.5.2.\nInstead of the AUROC, which can only be used for categorical response variables, we will use the root mean squared error (RMSE) as the performance measure.\nWe use a random forest model instead of a support vector machine, which naturally goes along with different hyperparameters.\nWe are leaving the assessment of a bias-reduced performance measure as an exercise to the reader (see Exercises).\nInstead, we show how to tune hyperparameters for (spatial) predictions.\nRemember that 125,500 models were necessary to retrieve bias-reduced performance estimates when using 100-repeated 5-fold spatial cross-validation and a random search of 50 iterations (see Section 11.5.2).\nIn the hyperparameter tuning level, we found the best hyperparameter combination, which in turn was used in the outer performance level for predicting the test data of a specific spatial partition (see also Figure 11.6).\nThis was done for five spatial partitions, and repeated 100 times, yielding in total 500 optimal hyperparameter combinations.\nWhich one should we use for making spatial predictions?\nThe answer is simple: none of them.\nRemember, the tuning was done to retrieve a bias-reduced performance estimate, not to do the best possible spatial prediction.\nFor the latter, one estimates the best hyperparameter combination from the complete dataset.\nThis means the inner hyperparameter tuning level is no longer needed, which makes perfect sense since we are applying our model to new data (unvisited field observations) for which the true outcomes are unavailable, hence testing is impossible in any case.
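The 125,500-model count quoted above can be reproduced arithmetically; the decomposition below (50 random-search iterations each assessed by 5-fold inner CV, plus one refit per outer partition with the winning hyperparameters) is our reading of Section 11.5.2, so treat the exact breakdown as an assumption rather than the book's own derivation:

```python
# Outer (performance) level: 100-repeated 5-fold spatial CV
outer_partitions = 100 * 5
# Inner (tuning) level per outer partition: 50 random-search iterations,
# each evaluated with 5-fold CV (assumed inner fold count)
tuning_models = 50 * 5
# Plus one model per outer partition refitted with the best hyperparameters
total = outer_partitions * (tuning_models + 1)
print(total)  # 125500
```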
Therefore, we tune the hyperparameters for a good spatial prediction on the complete dataset via a 5-fold spatial CV with one repetition.\nThe preparation for the modeling using the mlr package includes the construction of a response-predictor matrix containing only the variables which should be used in the modeling, and the construction of a separate coordinate data frame.\nHaving constructed the input variables, we are all set for specifying the mlr building blocks (task, learner, and resampling).\nWe use a regression task since the response variable is numeric.\nThe learner is the random forest model implementation of the ranger package.\nAs opposed to, for example, support vector machines (see Section 11.5.2), random forests often already show good performances when used with the default values of their hyperparameters (which may be one reason for their popularity).\nStill, tuning often moderately improves model results, and is thus worth the effort (Probst, Wright, and Boulesteix 2018).\nSince we deal with geographic data, we again make use of spatial cross-validation to tune the hyperparameters (see Sections 11.4 and 11.5).\nSpecifically, we use a five-fold spatial partitioning with only one repetition (makeResampleDesc()).\nIn each of these spatial partitions, we run 50 models (makeTuneControlRandom()) to find the optimal hyperparameter combination.\nIn random forests, the hyperparameters mtry, min.node.size and sample.fraction determine the degree of randomness, and should be tuned (Probst, Wright, and Boulesteix 2018).\nmtry indicates how many predictor variables should be used in each tree.\nIf all predictors are used, this corresponds in fact to bagging (see the beginning of Section 14.4).\nThe sample.fraction parameter specifies the fraction of observations to be used in each tree.\nSmaller fractions lead to greater diversity, and thus less correlated trees, which is often desirable (see above).\nThe min.node.size parameter indicates the number of observations a terminal node should at least have (see also Figure 14.4).\nNaturally, the lower the min.node.size, the larger the trees grow and the longer the computing time.\nHyperparameter combinations are selected randomly but should fall inside specific tuning limits (makeParamSet()).\nmtry should range between 1 and the number of predictors (4), sample.fraction should range between 0.2 and 0.9, and min.node.size should range between 1 and 10.\nFinally, tuneParams() runs the hyperparameter tuning, and will find the optimal hyperparameter combination for the specified parameters.\nThe performance measure is the root mean squared error (RMSE).\nA mtry of 4, a sample.fraction of 0.887, and a min.node.size of 10 represent the best hyperparameter combination.\nAn RMSE of 0.51 is relatively good when considering the range of the response variable, which is 3.04 (diff(range(rp$sc))).","code":"\n# extract the coordinates into a separate data frame\ncoords = sf::st_coordinates(rp) %>% \n as.data.frame() %>%\n rename(x = X, y = Y)\n# only keep response and predictors which should be used for the modeling\nrp = dplyr::select(rp, -id, -spri) %>%\n st_drop_geometry()\n# create task\ntask = makeRegrTask(data = rp, target = \"sc\", coordinates = coords)\n# learner\nlrn_rf = makeLearner(cl = \"regr.ranger\", predict.type = \"response\")\n# spatial partitioning\nperf_level = makeResampleDesc(\"SpCV\", iters = 5)\n# specifying random search\nctrl = makeTuneControlRandom(maxit = 50L)\n# specifying the search space\nps = makeParamSet(\n makeIntegerParam(\"mtry\", lower = 1, upper = ncol(rp) - 1),\n makeNumericParam(\"sample.fraction\", lower = 0.2, upper = 0.9),\n makeIntegerParam(\"min.node.size\", lower = 1, upper = 10)\n)\n# hyperparameter tuning\nset.seed(02082018)\ntune = tuneParams(learner = lrn_rf, \n task = task,\n resampling = perf_level,\n par.set = ps,\n control = ctrl, \n measures = mlr::rmse)\n#>...\n#> [Tune-x] 49: mtry=3; sample.fraction=0.533; min.node.size=5\n#> [Tune-y] 49: rmse.test.rmse=0.5636692; time: 0.0 min\n#> [Tune-x] 50: mtry=1; sample.fraction=0.68; min.node.size=5\n#> [Tune-y] 50: rmse.test.rmse=0.6314249; time: 0.0 min\n#> [Tune] Result: mtry=4; sample.fraction=0.887; min.node.size=10 :\n#> rmse.test.rmse=0.5104918"},{"path":"eco.html","id":"predictive-mapping","chapter":"14 Ecology","heading":"14.4.2 Predictive mapping","text":"The tuned hyperparameters can now be used for the prediction.\nWe simply modify our learner using the result of the hyperparameter tuning, and run the corresponding model.\nThe last step is to apply the model to the spatially available predictors, i.e., to the raster stack.
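That raster-to-table-and-back workflow (flatten the predictor stack to one row per cell, predict on the table, then reshape the predictions onto the grid) can be sketched language-neutrally as follows; the tiny grids and the stand-in "model" are invented for illustration and stand in for the fitted random forest:

```python
import numpy as np

# Invented 2x3 predictor grids (e.g., dem and ndvi) forming a two-layer stack
dem  = np.array([[100., 200., 300.], [400., 500., 600.]])
ndvi = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])

# 1. Convert the raster stack into a table: one row per cell, one column per layer
new_data = np.column_stack([dem.ravel(), ndvi.ravel()])

# 2. Apply a model to every row; a fitted random forest would go here
#    (placeholder linear rule, invented for illustration)
pred_vals = 0.01 * new_data[:, 0] - 2.0

# 3. Put the predicted values back into a grid with the original raster shape
pred = pred_vals.reshape(dem.shape)
print(pred.shape)  # (2, 3)
```

Because ravel() and reshape() both use row-major order, each prediction lands back on the cell its predictor values came from.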
So far, raster::predict() does not support the output of ranger models, hence, we have to program the prediction ourselves.\nFirst, we convert ep into a prediction data frame which, secondly, serves as input for the predict.ranger() function.\nThirdly, we put the predicted values back into a RasterLayer (see Section 3.3.1 and Figure 14.5).\nFIGURE 14.5: Predictive mapping of the floristic gradient clearly revealing distinct vegetation belts.\nThe predictive mapping clearly reveals distinct vegetation belts (Figure 14.5).\nPlease refer to Muenchow, Hauenstein, et al. (2013) for a detailed description of vegetation belts on lomas mountains.\nThe blue color tones represent the so-called Tillandsia-belt.\nTillandsia is a highly adapted genus especially found in high quantities at the sandy and quite desertic foot of lomas mountains.\nThe yellow color tones refer to a herbaceous vegetation belt with a much higher plant cover compared to the Tillandsia-belt.\nThe orange colors represent the bromeliad belt, which features the highest species richness and plant cover.\nIt can be found directly beneath the temperature inversion (ca. 750-850 m asl) where humidity due to fog is highest.\nWater availability naturally decreases above the temperature inversion, and the landscape becomes desertic again with only a few succulent species (succulent belt; red colors).\nInterestingly, the spatial prediction clearly reveals that the bromeliad belt is interrupted, which is a very interesting finding we would not have detected without the predictive mapping.","code":"\n# learning using the best hyperparameter combination\nlrn_rf = makeLearner(cl = \"regr.ranger\",\n predict.type = \"response\",\n mtry = tune$x$mtry, \n sample.fraction = tune$x$sample.fraction,\n min.node.size = tune$x$min.node.size)\n# doing the same more elegantly using setHyperPars()\n# lrn_rf = setHyperPars(makeLearner(\"regr.ranger\", predict.type = \"response\"),\n# par.vals = tune$x)\n# train model\nmodel_rf = train(lrn_rf, task)\n# to retrieve the ranger output, run:\n# mlr::getLearnerModel(model_rf)\n# which corresponds to:\n# ranger(sc ~ ., data = rp, \n# mtry = tune$x$mtry, \n# sample.fraction = tune$x$sample.fraction,\n# min.node.size = tune$x$min.node.size)\n# convert raster stack into a data frame\nnew_data = as.data.frame(as.matrix(ep))\n# apply the model to the data frame\npred_rf = predict(model_rf, newdata = new_data)\n# put the predicted values into a raster\npred = dem\n# replace altitudinal values by rf-prediction values\npred[] = pred_rf$data$response"},{"path":"eco.html","id":"conclusions-1","chapter":"14 Ecology","heading":"14.5 Conclusions","text":"In this chapter we have ordinated the community matrix of the lomas of Mt. Mongón with the help of a NMDS (Section 14.3).\nThe first axis, representing the main floristic gradient in the study area, was modeled as a function of environmental predictors which were partly derived with the help of R-GIS bridges (Section 14.2).\nThe mlr package provided the building blocks to spatially tune the hyperparameters mtry, sample.fraction and min.node.size (Section 14.4.1).\nThe tuned hyperparameters served as input for the final model, which in turn was applied to the environmental predictors for a spatial representation of the floristic gradient (Section 14.4.2).\nThe result demonstrates spatially the astounding biodiversity in the middle of the desert.\nSince lomas mountains are heavily endangered, the prediction map can serve as a basis for informed decision-making on delineating protection zones, and for making the local population aware of the uniqueness found in their immediate neighborhood.\nIn terms of methodology, a few additional points could be addressed:\nIt would be interesting to also model the second ordination axis, and to subsequently find an innovative way of visualizing jointly the modeled scores of the two axes in one prediction map.\nIf we were interested in interpreting the model in an ecologically meaningful way, we should probably use (semi-)parametric models (Muenchow, Bräuning, et al. 2013; Zuur et al. 2009; Zuur et al.
2017).\nHowever, there are at least some approaches that help to interpret machine learning models such as random forests (see, e.g., https://mlr-org.github.io/interpretable-machine-learning-iml--mlr/).\nSequential model-based optimization (SMBO) might be preferable to the random search for hyperparameter optimization used in this chapter (Probst, Wright, and Boulesteix 2018).\nFinally, please note that random forest and other machine learning models are frequently used in a setting with lots of observations and many predictors, much more than used in this chapter, and where it is unclear which variables and variable interactions contribute to explaining the response.\nAdditionally, the relationships might be highly non-linear.\nIn our use case, the relationship between response and predictors is pretty clear, there is only a slight amount of non-linearity, and the number of observations and predictors is low.\nHence, it might be worth trying a linear model.\nA linear model is much easier to explain and understand than a random forest model, and therefore to be preferred (law of parsimony); additionally, it is computationally less demanding (see Exercises).\nIf the linear model could not cope with the degree of non-linearity present in the data, one could also try a generalized additive model (GAM).\nThe point here is that the toolbox of a data scientist consists of more than one tool, and it is your responsibility to select the tool best suited for the task and purpose at hand.\nHere, we wanted to introduce the reader to random forest modeling and how to use the corresponding results for spatial predictions.\nFor this purpose, a well-studied dataset with known relationships between response and predictors is appropriate.\nHowever, this does not imply that the random forest model has returned the best result in terms of predictive performance (see Exercises).","code":""},{"path":"eco.html","id":"exercises-10","chapter":"14 Ecology","heading":"14.6 Exercises","text":"Run a NMDS using the percentage data of the community matrix.\nReport the stress value and compare it to the stress value retrieved from the NMDS using presence-absence data.\nWhat might explain the observed difference?\nCompute all the predictor rasters used in the chapter (catchment slope, catchment area), and put them into a raster stack.\nAdd dem and ndvi to the raster stack.\nNext, compute profile and tangential curvature as additional predictor rasters and add them to the raster stack (hint: grass7:r.slope.aspect).\nFinally, construct a response-predictor matrix.\nThe scores of the first NMDS axis (the result when using the presence-absence community matrix) rotated in accordance with elevation represent the response variable, and should be joined to random_points (use an inner join).\nTo complete the response-predictor matrix, extract the values of the environmental predictor raster stack to random_points.\nUse the response-predictor matrix of the previous exercise to fit a random forest model.\nFind the optimal hyperparameters and use them for making a prediction map.\nRetrieve the bias-reduced RMSE of a random forest model using spatial cross-validation, including the estimation of the optimal hyperparameter combinations (random search with 50 iterations) in an inner tuning loop (see Section 11.5.2).\nParallelize the tuning level (see Section 11.5.2).\nReport the mean RMSE and use a boxplot to visualize all retrieved RMSEs.\nRetrieve the bias-reduced RMSE of a simple linear model using spatial cross-validation.\nCompare the result to the result of the random forest model by making RMSE boxplots for each modeling approach.","code":""},{"path":"conclusion.html","id":"conclusion","chapter":"15 Conclusion","heading":"15 Conclusion","text":"","code":""},{"path":"conclusion.html","id":"prerequisites-13","chapter":"15 Conclusion","heading":"Prerequisites","text":"Like the introduction, this concluding chapter contains few code chunks.\nBut its prerequisites are demanding.\nIt assumes that you have:\nread and attempted the exercises in all the chapters of Part I (Foundations);\nconsidered how you can use geocomputation to solve real-world problems, at work and beyond, after engaging with Part III (Applications).","code":""},{"path":"conclusion.html","id":"introduction-10","chapter":"15 Conclusion","heading":"15.1 Introduction","text":"The aim of this chapter is to synthesize the contents, with reference to recurring themes/concepts, and to inspire future directions of application and development.\nSection 15.2 discusses the wide range of options for handling geographic data in R.\nChoice is a key feature of open source software; the section provides guidance on choosing among the various options.\nSection 15.3 describes gaps in the book’s contents and explains why some areas of research were deliberately omitted, while others were emphasized.\nThis discussion leads to the question (which is answered in Section 15.4): having read this book, where to go next?\nSection 15.5 returns to the wider issues raised in Chapter 1.\nIn it we consider geocomputation as part of a wider ‘open source approach,’ which ensures that methods are publicly accessible, reproducible and supported by collaborative communities.\nThis final section of the book also provides some pointers on how to get involved.","code":""},{"path":"conclusion.html","id":"package-choice","chapter":"15 Conclusion","heading":"15.2 Package choice","text":"A characteristic of R is that there are often multiple ways to achieve the same result.\nThe code chunk below illustrates this by using three functions, covered in Chapters 3 and 5, to combine the 16 regions of New Zealand into a single geometry:\nAlthough the classes, attributes and column names of the resulting objects nz_u1 to nz_u3 differ, their geometries are identical.\nThis can be verified using the base R function identical().84\nWhich to use?\nIt depends: the former
only processes the geometry data contained in nz and is faster, while the other options also performed attribute operations, which may be useful for subsequent steps.\nThe wider point is that there are often multiple options to choose from when working with geographic data in R, even within a single package.\nThe range of options grows further when more R packages are considered: you could achieve the same result using the older sp package, for example.\nWe recommend using sf and the other packages showcased in this book, for reasons outlined in Chapter 2, but it’s worth being aware of the alternatives and being able to justify your choice of software.\nA common (and sometimes controversial) choice is between tidyverse and base R approaches.\nWe cover both and encourage you to try both before deciding which is more appropriate for different tasks.\nThe following code chunk, described in Chapter 3, shows how attribute data subsetting works in each approach, using the base R operator [ and the select() function from the tidyverse package dplyr.\nThe syntax differs but the results are (in essence) the same:\nThe question arises: which to use?\nThe answer is: it depends.\nEach approach has advantages: the pipe syntax is popular and appealing to some, while base R is more stable, and well known to others.\nChoosing between them is therefore largely a matter of preference.\nHowever, if you do choose to use tidyverse functions to handle geographic data, beware of a number of pitfalls (see the supplementary article tidyverse-pitfalls on the website that supports this book).\nWhile commonly needed operators/functions were covered in depth — such as the base R [ subsetting operator and the dplyr function filter() — there are many other functions for working with geographic data, from other packages, that have not been mentioned.\nChapter 1 mentions 20+ influential packages for working with geographic data, and only a handful of these were demonstrated in subsequent chapters.\nThere are hundreds more.\nIn early 2019, nearly 200 packages were mentioned in the Spatial Task View;\nmore packages and countless functions for geographic data are developed each year, making it impractical to cover them all in a single book.\nThe rate of evolution in R’s spatial ecosystem may seem overwhelming, but there are strategies to deal with the wide range of options.\nOur advice is to start by learning one approach in depth but to have a general understanding of the breadth of options available.\nThis advice applies equally to solving geographic problems with R (Section 15.4 covers developments in other languages) as it does to other fields of knowledge and application.\nOf course, some packages perform much better than others, making package selection an important decision.\nFrom this diversity, we have focused on packages that are future-proof (they will work long into the future), high performance (relative to other R packages) and complementary.\nBut there is still overlap in the packages we have used, as illustrated by the diversity of packages for making maps, for example (see Chapter 8).\nPackage overlap is not necessarily a bad thing.\nIt can increase resilience, performance (partly driven by friendly competition and mutual learning between developers) and choice, a key feature of open source software.\nIn this context, the decision to use a particular approach, such as the sf/tidyverse/raster ecosystem advocated in this book, should be made with knowledge of the alternatives.\nThe sp/rgdal/rgeos ecosystem that sf was designed to supersede, for example, can do many of the things covered in this book and, due to its age, is built on by many other packages.85\nAlthough best known for point pattern analysis, the spatstat package also supports raster and other vector geometries (Baddeley and Turner 2005).\nAt the time of writing (October 2018), 69 packages depend on it, making it more than a package: spatstat is an alternative R-spatial ecosystem.\nIt is also worth being aware of promising alternatives that are under development.\nThe package stars, for example, provides a new class system for working with spatiotemporal data.\nIf you are interested in this topic, you can check for updates in the package’s source code and the broader SpatioTemporal Task View.\nThe same principle applies to other domains: it is important to justify software choices and to review software decisions based on up-to-date information.","code":"\nlibrary(spData)\n#> Warning: multiple methods tables found for 'approxNA'\nnz_u1 = sf::st_union(nz)\nnz_u2 = aggregate(nz[\"Population\"], list(rep(1, nrow(nz))), sum)\nnz_u3 = dplyr::summarise(nz, t = sum(Population))\nidentical(nz_u1, nz_u2$geometry)\n#> [1] TRUE\nidentical(nz_u1, nz_u3$geom)\n#> [1] TRUE\nlibrary(dplyr) # attach tidyverse package\nnz_name1 = nz[\"Name\"] # base R approach\nnz_name2 = nz %>% select(Name) # tidyverse approach\nidentical(nz_name1$Name, nz_name2$Name) # check results\n#> [1] TRUE"},{"path":"conclusion.html","id":"gaps","chapter":"15 Conclusion","heading":"15.3 Gaps and overlaps","text":"There are a number of gaps in, and some overlaps between, the topics covered in this book.\nWe have been selective, emphasizing some topics while omitting others.\nWe have tried to emphasize the topics most commonly needed in real-world applications, such as geographic data operations, projections, data read/write and visualization.\nThese topics appear repeatedly in the chapters, a substantial area of overlap designed to consolidate these essential skills for geocomputation.\nOn the other hand, we have omitted topics that are less commonly used, or which are covered in depth elsewhere.\nStatistical topics including point pattern analysis, spatial interpolation (kriging) and spatial epidemiology, for example, are only mentioned with reference to other topics such as the machine learning techniques covered in Chapter 11.\nThere is already excellent material on these methods, including statistically orientated chapters in R. Bivand, Pebesma, and Gómez-Rubio (2013) and a book on point pattern analysis by Baddeley, Rubak, and Turner (2015).\nOther topics which received limited attention were remote sensing and using R alongside (rather than as a bridge to) dedicated GIS software.\nThere are many resources on these topics, including Wegmann, Leutner, and Dech (2016) and the GIS-related teaching materials available from Marburg University.\nInstead of covering spatial statistical modeling and inference techniques, we focussed on machine learning (see Chapters 11 and 14).\nAgain, the reason was that there are already excellent resources on these topics, especially with ecological use cases, including Zuur et al. (2009), Zuur et al. (2017), the freely available teaching material and code on Geostatistics & Open-source Statistical Computing by David Rossiter, hosted at css.cornell.edu/faculty/dgr2, and the granolarr project by Stefano De Sabbata at the University of Leicester for an introduction to R for geographic data science.\nThere are also excellent resources on spatial statistics using Bayesian modeling, a powerful framework for modeling and uncertainty estimation (Blangiardo and Cameletti 2015; Krainski et al. 2018).\nFinally, we have largely omitted big data analytics.\nThis might seem surprising since especially geographic data can become big really fast.\nBut the prerequisite for doing big data analytics is to know how to solve a problem on a small dataset.\nOnce you have learned that, you can apply the exact same techniques to big data questions, though of course you need to expand your toolbox.\nThe first thing to learn is how to handle geographic data queries.\nThis is because big data analytics often boil down to extracting a small amount of data from a database for a specific statistical analysis.\nFor this, we have provided an introduction to spatial databases and how to use a GIS from within R in Chapter 9.\nIf you really have to do the analysis on big or even the complete dataset, hopefully the problem you are trying to solve is embarrassingly parallel.\nFor this, you need to learn a system that is able to do this parallelization efficiently, such as Hadoop, GeoMesa (http://www.geomesa.org/) or GeoSpark (Huang et al. 2017).\nBut still, you are applying the same techniques and concepts you have used on small datasets to answer a big data question; the only difference is that you then do it in a big data setting.","code":""},{"path":"conclusion.html","id":"next","chapter":"15 Conclusion","heading":"15.4 Where to go next?","text":"As indicated in the previous sections, the book has covered only a fraction of R’s geographic ecosystem, and there is much more to discover.\nWe have progressed quickly, from geographic data models in Chapter 2, to advanced applications in Chapter 14.\nConsolidation of the skills learned, discovery of new packages and approaches for handling geographic data, and application of the methods to new datasets and domains are suggested future directions.\nThis section expands on this general advice by suggesting specific ‘next steps,’ highlighted in bold below.\nIn addition to learning about further geographic methods and applications with R, for example with reference to the work cited in the previous section, deepening your understanding of R itself is a logical next step.\nR’s fundamental classes such as data.frame and matrix are the foundation of sf and raster classes, so studying them will improve your understanding of geographic data.\nThis can be done with reference to documents that are part of R, which can be found with the command help.start(), and additional resources on the subject such as Wickham (2019) and Chambers (2016).\nAnother software-related direction for future learning is discovering geocomputation with other languages.\nThere are good reasons for learning R as a language for geocomputation, as described in Chapter 1, but it is not the only option.86\nIt would be possible to study Geocomputation with: Python, C++, JavaScript, Scala or Rust in equal depth.\nEach has evolving geospatial capabilities.\nrasterio, for example, is a Python package that could supplement/replace the raster package used in this book — see Garrard (2016) and online tutorials such as automating-gis-processes for more on the Python ecosystem.\nDozens of geospatial libraries have been developed in C++, including well known libraries such as GDAL and GEOS, and less well known libraries such as the Orfeo Toolbox for processing remote sensing (raster) data.\nTurf.js is an example of the potential for geocomputation with JavaScript.\nGeoTrellis provides functions for working with raster and vector data in the Java-based language Scala.\nAnd WhiteBoxTools provides an example of a rapidly evolving command-line GIS implemented in Rust.\nEach of these packages/libraries/languages has advantages for geocomputation, and there are many more to discover, as documented in the curated list of open source geospatial resources Awesome-Geospatial.\nThere is more to geocomputation than software, however.\nWe can recommend exploring and learning new research topics and methods from academic and theoretical perspectives.\nMany methods that have been written about have yet to be implemented.\nLearning about geographic methods and potential applications can therefore be rewarding, before writing any code.\nAn example of geographic methods that are increasingly implemented in R is sampling strategies for scientific applications.\nA next step in this case is to read relevant articles in the area, such as Brus (2018), which is accompanied by reproducible code and tutorial content hosted at github.com/DickBrus/TutorialSampling4DSM.","code":""},{"path":"conclusion.html","id":"benefit","chapter":"15 Conclusion","heading":"15.5 The open source approach","text":"This is a technical book, so it makes sense for the next steps, outlined in the previous section, to also be technical.\nHowever, there are wider issues worth considering in this final section, which returns to our definition of geocomputation.\nOne of the elements of the term introduced in Chapter 1 was that geographic methods should have a positive impact.\nOf course, how to define and measure ‘positive’ is a subjective, philosophical question, beyond the scope of this book.\nRegardless of your worldview, consideration of the impacts of geocomputational work is a useful exercise:\nthe potential for positive impacts can provide a powerful motivation for future learning and, conversely, new methods can open up many possible fields of application.\nThese considerations lead to the conclusion that geocomputation is part of a wider ‘open source approach.’\nSection 1.1 presented other terms that mean roughly the same thing as geocomputation, including geographic data science (GDS) and ‘GIScience.’\nBoth capture the essence of working with geographic data, but geocomputation has advantages: it concisely captures the ‘computational’ way of working with geographic data advocated in this book — implemented in code and therefore encouraging reproducibility — and builds on the desirable ingredients of its early definition (Openshaw and Abrahart 2000):\nThe creative use of geographic data\nApplication to real-world problems\nBuilding ‘scientific’ tools\nReproducibility\nWe added the final ingredient: reproducibility was barely mentioned in early work on geocomputation, yet a strong case can be made for it as a vital component of the first two ingredients.\nReproducibility\nencourages creativity by shifting the focus away from the basics (which are readily available through shared code) and towards applications;\ndiscourages people from ‘reinventing the wheel’: there is no need to re-do what others have done if their methods can be used by others; and\nmakes research more conducive to real world applications, by enabling anyone in any sector to apply your methods in new areas.\nIf reproducibility is the defining asset of geocomputation (or command-line GIS), it is worth considering what makes it reproducible.\nThis brings us to the ‘open source approach,’ which has three main components:\nA command-line interface (CLI), encouraging scripts recording geographic work to be shared and reproduced\nOpen source software, which can be inspected and potentially improved by anyone in the world\nAn active developer community, which collaborates and self-organizes to build complementary and modular tools\nLike the term geocomputation, the open source approach is more than a technical entity.\nIt is a community composed of people interacting daily with shared aims: to produce high performance tools, free of commercial or legal restrictions, that are accessible for anyone to use.\nThe open source approach to working with geographic data has advantages that transcend the technicalities of how the software works, encouraging learning, collaboration and an efficient division of labor.\nThere are many ways to engage in this community, especially with the emergence of code hosting sites, such as GitHub, which encourage communication and collaboration.\nA good place to start is simply browsing the source code, ‘issues’ and ‘commits’ in a geographic package of interest.\nA quick glance at the r-spatial/sf GitHub repository, which hosts the code underlying the sf package, shows that 40+ people have contributed to the codebase and documentation.\nDozens more people have contributed by asking questions and by contributing to ‘upstream’ packages that sf uses.\nMore than 600 issues have been closed on its issue tracker, representing a huge amount of work to make sf faster, more stable and user-friendly.\nThis example, from just one package out of dozens, shows the scale of the intellectual operation underway to make R a highly effective and continuously evolving language for geocomputation.\nIt is instructive to watch the incessant development activity happen in public fora such as GitHub, but it is even more rewarding to become an active participant.\nThis is one of the greatest features of the open source approach: it encourages people to get involved.\nThis book itself is a result of the open source approach:\nit was motivated by the amazing developments in R’s geographic capabilities over the last two decades, but made practically possible by dialogue and code sharing on platforms for collaboration.\nWe hope that in addition to disseminating useful methods for working with geographic data, this book inspires you to take a more open source approach.\nWhether it’s raising a constructive issue alerting developers to problems in their package; making the work done by you and the organizations you work for open; or simply helping other people by passing on the knowledge you’ve learned, getting involved can be a rewarding experience.","code":""},{"path":"references.html","id":"references","chapter":"References","heading":"References","text":"","code":""}]

        FIGURE 2.11: Examples of continuous and categorical rasters.

        -
        +

-2.3.1 R packages for raster data handling
+2.3.1 R packages for working with raster data

-R has several packages able to read and process spatial raster data; see (the-history-of-r-spatial) for more context.
-However, currently, two main packages with this purpose exist – terra and stars.14
-We are focusing on the terra package in this book; however, it may be worth knowing the basic similarities and differences between the packages before deciding which one to use.

+Over the last two decades, several packages for reading and processing raster datasets have been developed.
+As outlined in Section 1.5, chief among them was raster, which led to a step change in R’s raster capabilities when it was launched in 2010 and was the premier package in the space until the development of terra and stars.
+Both of these more recently developed packages provide powerful and performant functions for working with raster datasets, and there is substantial overlap between their possible use cases.
+In this book we focus on terra, which replaces the older and (in most cases) slower raster.
+Before learning how terra’s class system works, this section describes similarities and differences between terra and stars; this knowledge will help decide which is most appropriate in different situations.
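To see the overlap in practice, the same file can be read with either package; the sketch below assumes the srtm.tif sample file shipped with the spDataLarge package (an assumed example input) and recent package versions in which stars provides a st_as_stars() method for terra objects:

```r
# Read the same GeoTIFF with each package (srtm.tif from spDataLarge
# is an assumed example input):
library(terra)
library(stars)
srtm_path = system.file("raster/srtm.tif", package = "spDataLarge")
srtm_terra = rast(srtm_path)       # terra's SpatRaster class
srtm_stars = read_stars(srtm_path) # stars' data cube class
# Recent stars versions can also convert a SpatRaster directly:
srtm_converted = st_as_stars(srtm_terra)
```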

First, terra focuses on the most common raster data model (regular grids), while stars also allows storing less popular models (including regular, rotated, sheared, rectilinear, and curvilinear grids).
-While terra usually handle one or multi-layered rasters15, the stars package provides ways to store raster data cubes – a raster object with many layers (e.g., bands), for many moments in time (e.g., months), and many attributes (e.g., sensor type A and sensor type B).
+While terra usually handles one or multi-layered rasters14, the stars package provides ways to store raster data cubes – a raster object with many layers (e.g., bands), for many moments in time (e.g., months), and many attributes (e.g., sensor type A and sensor type B).
Importantly, in both packages, all layers or elements of a data cube must have the same spatial dimensions and extent.
Second, both packages allow you either to read all of the raster data into memory or just to read its metadata – this is usually done automatically based on the input file size.
However, they store raster values very differently. @@ -880,7 +882,7 @@

         nlyr(multi_rast)
         #> [1] 4
-For multi-layer raster objects, layers can be selected with terra::subset().16
+For multi-layer raster objects, layers can be selected with terra::subset().15
It accepts a layer number or its name as the second argument:

         multi_rast3 = subset(multi_rast, 3)
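Selection by name and recombination of layers can be sketched as follows; the layer name "landsat_4" is an assumption about this dataset, and c() is terra's way of combining SpatRaster layers:

```r
# Select the fourth layer by name (the name "landsat_4" is assumed):
multi_rast4 = subset(multi_rast, "landsat_4")
# Combine single-layer objects back into one multi-layer SpatRaster:
multi_rast34 = c(multi_rast3, multi_rast4)
nlyr(multi_rast34) # the combined object has two layers
```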
        @@ -924,7 +926,7 @@ 

        These are suitable because the Earth is compressed: the equatorial radius is around 11.5 km longer than the polar radius (Maling 1992).17

        +Values of <span class="math inline">\(a\)</span> and <span class="math inline">\(rf\)</span> in various ellipsoidal models can be seen by executing <code>sf_proj_info(type = "ellps")</code>.</p>'>16

        Ellipsoids are part of a wider component of CRSs: the datum. This contains information on what ellipsoid to use and the precise relationship between the Cartesian coordinates and location on the Earth’s surface. There are two types of datum — geocentric (such as WGS84) and local (such as NAD83). @@ -933,7 +935,7 @@

        In a local datum, shown as a purple dashed line, the ellipsoidal surface is shifted to align with the surface at a particular location. These allow local variations in Earth’s surface, for example due to large mountain ranges, to be accounted for in a local CRS. This can be seen in Figure 2.13, where the local datum is fitted to the area of Philippines, but is misaligned with most of the rest of the planet’s surface. -Both datums in Figure 2.13 are put on top of a geoid - a model of global mean sea level.18

        +Both datums in Figure 2.13 are put on top of a geoid - a model of global mean sea level.17

        @@ -981,7 +983,7 @@

Spatial R packages support a wide range of CRSs and they use the long-established PROJ library. Two recommended ways to describe CRSs in R are (a) Spatial Reference System Identifier (SRID) or (b) well-known text (known as WKT219) definitions. +Several WKT dialects were created to describe CRSs, including ESRI WKT, GDAL WKT1, and the current WKT2:2018 <span class="citation">(<a href="references.html#ref-lott_geographic_2015" role="doc-biblioref">Lott 2015</a>)</span></p>'>18) definitions. Both of these approaches have advantages and disadvantages.

An SRID is a unique value used to identify coordinate reference system definitions in the form of AUTHORITY:CODE. The most popular registry of SRIDs is EPSG; however, other registries, such as ESRI or OGR, exist. @@ -1049,7 +1051,7 @@

         new_vector = st_set_crs(new_vector, "EPSG:4326") # set CRS

The second argument in the above function could be either an SRID ("EPSG:4326" in the example), a complete WKT2 representation, a proj4string, or a CRS extracted from an existing object with st_crs().

        -

        The crs() function can be used to access CRS information from a SpatRaster object20:

        +

        The crs() function can be used to access CRS information from a SpatRaster object19:

         crs(my_rast) # get CRS
         #> [1] "GEOGCRS[\"WGS 84\",\n    DATUM[\"World Geodetic System 1984\",\n        ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n            LENGTHUNIT[\"metre\",1]]],\n    PRIMEM[\"Greenwich\",0,\n        ANGLEUNIT[\"degree\",0.0174532925199433]],\n    CS[ellipsoidal,2],\n        AXIS[\"geodetic latitude (Lat)\",north,\n            ORDER[1],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        AXIS[\"geodetic longitude (Lon)\",east,\n            ORDER[2],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n    ID[\"EPSG\",4326]]"
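For rasters whose CRS is missing or incorrect, the replacement form of crs() assigns one; a minimal sketch, assuming "EPSG:26912" as an illustrative target CRS (note this changes only the metadata, it does not reproject the cell values):

```r
# Assign a CRS to an existing SpatRaster (metadata only, no reprojection):
crs(my_rast) = "EPSG:26912" # set CRS
```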
        @@ -1177,7 +1179,7 @@

        Note: Second Edition is under construction 🏗

      • 2.3 Raster data
      • TABLE 6.2: Key attributes in the original (‘con_raster’) and projected (‘con_raster_ea’) continuous raster datasets.