![An interactive LADAL notebook](https://slcladal.github.io/images/uq1.jpg)

# Introduction to Geospatial Data Visualization with R


This tutorial is the interactive Jupyter notebook accompanying the [*Language Technology and Data Analysis Laboratory* (LADAL) tutorial *Introduction to Geospatial Data Visualization with R*](https://ladal.edu.au/gviz.html). The tutorial provides more details and background information while this interactive notebook focuses strictly on practical aspects.


***


**Preparation and session set up**

We set up our session by activating the packages we need for this tutorial.


In [None]:
# activate packages
library(sf)
library(raster)
library(dplyr)
library(spData)
library(spDataLarge)
library(tmap)  
library(ggplot2)
library(ggspatial)
library(rnaturalearth)
library(ggmap)
library(leaflet)
library(maptools)
library(rgdal)
library(scales)


Once you have initiated the session by executing the code shown above, you are good to go.

If you are using this notebook on your own computer and you have not already installed the R packages listed above, you need to install them. You can install them by replacing the `library` command with `install.packages` and putting the name of the package into quotation marks like this: `install.packages("sf")`. Then, you simply run this command and R will install the package you specified.
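
Rather than swapping `library` for `install.packages` one package at a time, the check can be scripted. The following is a minimal sketch, assuming a short illustrative package list (the vector `pkgs` and the name `missing_pkgs` are not part of the original tutorial); the install call is left commented out so that nothing is installed unintentionally.

```r
# illustrative subset of the packages used in this tutorial (assumption)
pkgs <- c("sf", "dplyr", "ggplot2", "leaflet")
# requireNamespace() returns FALSE for packages that are not installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
# names of the packages that still need to be installed
missing_pkgs <- pkgs[!installed]
missing_pkgs
# uncomment the next line to install whatever is missing
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```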




# Creating Basic Maps 

We will start by generating maps of the world using the built-in `world` data set that is part of the `spData` package and the `plot` function for generating the map.


In [None]:
plot(world)



We see that the `world` data set contains information on various factors, such as regions (e.g., `continent` or `subregion`), country names (`name_long`), population size (`pop`), or life expectancy (`lifeExp`). We can use this information to show a specific map as shown below.



In [None]:
plot(world["lifeExp"])



We can use the `world` data set and filter for specific features, e.g., we can visualize only a single continent by filtering for the continent we are interested in. In addition, we define the x-axis and y-axis limits so that we zoom in on the region of interest.



In [None]:
# extract europe (exclude russia and iceland)
world_eur <- world %>%
  dplyr::filter(continent == "Europe", 
                name_long != "Russian Federation", 
                name_long != "Iceland") %>%
  dplyr::select(name_long, geom)
# plot
plot(world_eur,
     xlim = c(5, 10),
     ylim = c(30, 70),
     main = "")


We can also overlay information such as population size over continents and countries as shown below.



In [None]:
# plot world map
plot(world["continent"], reset = FALSE)
# define size based on population size
cex <- sqrt(world$pop) / 10000
# compute country centroids (using the largest polygon of each country)
world_cents <- sf::st_centroid(world, of_largest_polygon = TRUE)
# plot
plot(sf::st_geometry(world_cents), 
     add = TRUE, 
     cex = cex)
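
The square root in `cex <- sqrt(world$pop) / 10000` compresses the enormous range of population sizes so that the largest countries do not swamp the map. A small sketch with made-up population figures (the values and names in `pop_toy` are purely illustrative) shows the effect:

```r
# made-up population sizes spanning three orders of magnitude
pop_toy <- c(small = 1e6, medium = 2.5e7, large = 1.4e9)
# same scaling as in the map above
cex_toy <- sqrt(pop_toy) / 10000
cex_toy
# ratio between largest and smallest: raw values vs. scaled symbol sizes
max(pop_toy) / min(pop_toy)
max(cex_toy) / min(cex_toy)
```

Because `cex` scales the linear size of a symbol, taking the square root makes the area of each circle roughly proportional to the population.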


Overlaying is interesting because it allows us to highlight certain regions or countries as shown below.



In [None]:
# extract map of europe
world_eur <- world %>%
  dplyr::filter(continent == "Europe")
# extract germany
ger <- world %>%
  dplyr::filter(name_long == "Germany")
# plot germany
plot(sf::st_geometry(ger), expandBB = c(.2, .2, .2, .2), col = "gray", lwd = 3)
# plot europe
plot(world_eur[0], add = TRUE)


We can also add information to the `world` data set and use the added information to generate customized plots.



In [None]:
# countries I have been to
countries <- c("United States", "Norway", "France", "United Arab Emirates", 
             "Qatar", "Sweden", "Poland", "Austria", "Hungary", "Romania", 
             "Germany", "Bulgaria", "Greece", "Turkey", "Croatia", 
             "Switzerland", "Belgium", "Netherlands", "Spain", "Ireland", 
             "Australia", "China", "Italy", "Denmark", "United Kingdom", 
             "Slovenia", "Finland", "Slovakia", "Czech Republic", "Japan", 
             "Saudi Arabia", "Serbia")
# data frame with countries I have visited
visited <- world %>%
  dplyr::filter(name_long %in% countries)
# plot world
plot(world[0], col = "lightgray")
# overlay countries I have visited in orange 
plot(sf::st_geometry(visited), add = TRUE, col = "orange")


# Creating Maps with ggplot2 

So far, we have used the base `plot` function to generate maps. However, it is also possible to use `ggplot2` to generate maps and the easiest way is to use `borders` to draw a map. 


In [None]:
# plot map
ggplot() +
  borders()


Another option is to add `geom_sf` to a `ggplot2` object as shown below. A nice feature is that we can add perspective and projection.



In [None]:
# plot map
ggplot(data = world) +
  geom_sf(fill = "white") +
  coord_sf(crs = "+proj=laea +lat_1=-28 +lat_2=-36 +lat_0=-32 +lon_0=135 +x_0=1000000 +y_0=2000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs")


Or we can produce a map that shows the world from a somewhat unusual perspective by using a Lambert conformal conic projection.



In [None]:
# plot map
ggplot(data = world) +
  geom_sf(fill = "beige") +
  coord_sf(crs = "+proj=lcc +lat_1=-28 +lat_2=-36 +lat_0=-32 +lon_0=135 +x_0=1000000 +y_0=2000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs") +
  theme(panel.grid.major = element_line(color = "gray50", 
                                         size = 0.25),
        panel.background = element_rect(fill = "aliceblue"))


The nice thing about `ggplot2` is, of course, that it is very easy to add layers and create very pretty visualizations. 



In [None]:
# plot map
ggplot(data = world) +
  geom_sf() + 
  theme_bw() +
  # adding axes title
  labs(x = "Longitude", y = "Latitude") +
  # adding title and subtitle
  ggtitle("A map of Australia") +
  # defining coordinates
  coord_sf(xlim = c(100.00, 160.00), 
           ylim = c(-45.00, -10.00), 
           expand = T) +
  # add distance measure
  annotation_scale(location = "bl", width_hint = 0.5) +
  # add compass 
  annotation_north_arrow(location = "br")


Again, we can customize the map according to what we want. In addition, we load a map with a higher resolution using the `ne_countries` function from the `rnaturalearth` package.



In [None]:
# load data
world <- rnaturalearth::ne_countries(returnclass = "sf") 
# disable s2 spherical geometry to prevent errors when computing centroids
sf::sf_use_s2(FALSE)
# extract label positions (centroid coordinates) and add them to the data
world_points <- cbind(world, sf::st_coordinates(sf::st_centroid(world$geometry)))
# generate annotated world map
ggplot(data = world) +
  # land is gray
  geom_sf(fill= "gray90") +
  # axes labels
  labs(x = "Longitude", y = "Latitude") +
  # define zoom
  coord_sf(xlim = c(100.00, 180.00), 
           ylim = c(-45.00, -10.00), expand = T) +
  # add scale bar
  annotation_scale(location = "bl", width_hint = 0.5) +
  # add compass
  annotation_north_arrow(location = "br", which_north = "true", 
                         style = north_arrow_fancy_orienteering) +
  # define theme (add grid lines)
  theme(panel.grid.major = element_line(color = "gray60", 
                                         linetype = "dashed", 
                                         size = 0.25),
        # define background color
         panel.background = element_rect(fill = "aliceblue")) +
  # add text
  geom_text(data= world_points,aes(x=X, y=Y, label=name),
            color = "gray20", fontface = "italic", check_overlap = T, size = 3)


We can explore other designs and maps and show different regions of the world.



In [None]:
# load data
europe <- ne_countries(scale = "medium", continent='europe', returnclass = "sf") 
# plot map
ggplot(data = europe) +
  # add map and define filling
  geom_sf(mapping = aes(fill = ifelse(name_long == "Germany", "0", "1"))) +
  # simple black and white background
  theme_bw() +
  # adding axes title
  labs(x = "Longitude", y = "Latitude") +
  # adding title and subtitle
  ggtitle("A map of central Europe") +
  # defining coordinates
  coord_sf(xlim = c(-10, 30), 
           ylim = c(40, 60)) +
  # add distance measure
  annotation_scale(location = "bl", width_hint = 0.5) +
  # add compass 
  annotation_north_arrow(location = "br",
                         # make compass fancy
                         style = north_arrow_fancy_orienteering) +
  theme(legend.position = "none",
        # add background color
        panel.background = element_rect(fill = "lightblue"),
        # remove grid lines
        panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank()) +
  # define fill colors
  scale_fill_manual(name = "Country", values = c("darkgray", "beige")) +
  # add text
  geom_sf_text(aes(label = name_long), size=2.5, color = "gray20")


# Adding External Information 

Basic maps are useful, but we typically want to add further layers of information to them. To add layers to a map, we need to combine the map with the data that we would like to overlay. 

We will now load external data containing the locations of Australian airports.


In [None]:
# load data
airports <- base::readRDS(url("https://slcladal.github.io/data/apd.rda", "rb")) %>%
  dplyr::mutate(ID = as.character(ID)) %>%
  dplyr::filter(Country == "Australia")
# inspect
head(airports)


Next, we load an additional data set about the route volume of Australian airports (how many routes go in and out of these airports).



In [None]:
# read in routes data
routes <- base::readRDS(url("https://slcladal.github.io/data/ard.rda", "rb")) %>%
  dplyr::rename(ID = destinationAirportID) %>%
  dplyr::group_by(ID) %>%
  dplyr::summarise(flights = n())
# inspect
head(routes)


We can now merge the airports and the route volume data sets to combine the information about the location with the information about the number of routes that end at each airport.



In [None]:
# combine tables
arrivals <- dplyr::left_join(airports, routes, by = "ID") %>%
  na.omit()
# inspect
head(arrivals)
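
To make the effect of the join and of `na.omit` concrete, here is a minimal sketch on toy data. The data frames `airports_toy` and `routes_toy` are invented for illustration, and base R's `table` and `merge` stand in for the `dplyr` verbs used above so that the example runs without additional packages.

```r
# toy airports: three airports with IDs stored as character
airports_toy <- data.frame(ID = c("1", "2", "3"),
                           City = c("A", "B", "C"),
                           stringsAsFactors = FALSE)
# toy routes: one row per route, identified by its destination airport
routes_toy <- data.frame(ID = c("1", "3", "3", "9"),
                         stringsAsFactors = FALSE)
# count routes per airport (the counterpart of group_by + summarise)
flights_toy <- as.data.frame(table(ID = routes_toy$ID),
                             responseName = "flights",
                             stringsAsFactors = FALSE)
# left join: keep every airport, fill in NA where no routes exist
joined_toy <- merge(airports_toy, flights_toy, by = "ID", all.x = TRUE)
joined_toy
# na.omit drops airport "2", which has no incoming routes
arrivals_toy <- na.omit(joined_toy)
arrivals_toy
```

Route destination "9" has no match among the toy airports and therefore never enters the result, just as routes to non-Australian airports are not carried over in the join above.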


Now that we have data containing geolocations (longitudes and latitudes), we can visualize the locations of the airports on a map and represent the route volume of each airport as a point plotted over the airport: the bigger the point, the higher the route volume. In addition, we add the city names as text labels and scale these labels by route volume as well.



In [None]:
# create a layer of borders
ggplot(arrivals, aes(x=Longitude, y= Latitude)) +   
  borders("world", colour="gray20", fill="wheat1")  +
  geom_point(color="blue", alpha = .3, size = log(arrivals$flights)) +
  scale_x_continuous(name="Longitude", limits=c(110, 160)) +
  scale_y_continuous(name="Latitude", limits=c(-45, -10)) +
  theme(panel.background = element_rect(fill = "azure1", colour = "azure1")) +
  geom_text(aes(x=Longitude, y= Latitude, label=City),
            color = "gray20", check_overlap = T, size = log(arrivals$flights))
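
Point and label sizes are set to the natural logarithm of the route counts, which keeps busy hubs from dominating the plot. One consequence worth knowing is that `log(1)` is 0, so an airport with a single route is drawn with size 0. The sketch below uses invented counts (`counts_toy`) and shows `log1p` as one possible alternative; that alternative is a suggestion, not part of the original tutorial.

```r
# invented route counts for four airports
counts_toy <- c(1, 10, 100, 500)
# sizes as computed in the plot above
size_log <- log(counts_toy)
round(size_log, 2)
# log1p(x) computes log(1 + x), so a single route still gets a visible size
size_log1p <- log1p(counts_toy)
round(size_log1p, 2)
```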


# Creating Maps with ggmap 

If you simply want to show specific locations on existing maps, then the `ggmap` package is an easy way to take on this task.


In [None]:
# define box
sbbox <- ggmap::make_bbox(lon = c(152.8, 153.4), lat = c(-27.1, -27.7), f = .1)
# get map
brisbane = ggmap::get_map(location=sbbox, zoom=10, maptype="terrain")
# create map
brisbanemap = ggmap::ggmap(brisbane)
# display map
brisbanemap +
  geom_point(data = arrivals, mapping = aes(x = Longitude, y = Latitude), 
               color = "red", size = 2) +
  geom_text(data = arrivals, 
            mapping = aes(x = Longitude+0.1,
                          y = Latitude,
                          label = "Brisbane Airport"),
            size = 3, color = "gray20", 
            fontface = "bold", 
            check_overlap = T) +
  labs(x = "Longitude", y = "Latitude")
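
The `f` argument of `make_bbox` extends the bounding box beyond the coordinates supplied. The following sketch reproduces that padding with base R's `extendrange`, assuming that `make_bbox` extends each range by the fraction `f` on both sides (check the `ggmap` documentation if the exact extent matters); the object names are illustrative.

```r
# longitude and latitude ranges used for the Brisbane map
lon_rng <- c(152.8, 153.4)
lat_rng <- c(-27.1, -27.7)
# extend each range by 10 percent of its width on both sides
lon_pad <- grDevices::extendrange(lon_rng, f = 0.1)
lat_pad <- grDevices::extendrange(lat_rng, f = 0.1)
# assemble in left, bottom, right, top order
bbox_toy <- c(left = lon_pad[1], bottom = lat_pad[1],
              right = lon_pad[2], top = lat_pad[2])
bbox_toy
```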


# Interactive Maps with leaflet and maptools 

The `leaflet` package offers easy-to-use options for generating interactive maps ([here](https://raw.githubusercontent.com/rstudio/cheatsheets/main/leaflet.pdf) is the link to the leaflet cheat sheet provided by RStudio). The interactivity is achieved by the `leaflet` function from the `leaflet` package, which creates a leaflet map as an HTML widget that can be used, e.g., in HTML-rendered R Notebooks or Shiny applications. The advantage of using this function is that it offers very detailed maps which enable zooming in on specific locations.


In [None]:
# generate basic leaflet map
m <- leaflet() %>% 
  leaflet::setView(lng = 153.05, lat = -27.45, zoom = 12)%>% 
  leaflet::addTiles()
# show map
m


Another option for interactive geospatial visualizations is provided by the `maptools` package, which comes with a `SpatialPolygonsDataFrame` of the world and the population by country (in 2005). To make the visualization a bit more appealing, we will calculate the population density, add this variable to the data which underlies the visualization, and then display the information interactively. In this case, this means that you can use *mouse-over* or *hover* effects so that you see the population density of a country if you put the cursor on that country (provided the information is available for that country).

We start by loading the data, adding population density to the data, and removing data points without meaningful information (e.g., we set values like Inf to NA).


In [None]:
# load data
data(wrld_simpl)
# calculate population density and add it to the data 
wrld_simpl@data$PopulationDensity <- round(wrld_simpl@data$POP2005/wrld_simpl@data$AREA, 2)
# set non-finite values (Inf, NaN) to NA
wrld_simpl@data$PopulationDensity <- ifelse(is.finite(wrld_simpl@data$PopulationDensity),
                                            wrld_simpl@data$PopulationDensity, NA)
# inspect
head(wrld_simpl@data, 10)
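
Dividing by an area of zero is what produces the non-finite values mentioned above: a positive population divided by 0 gives `Inf`, and 0 divided by 0 gives `NaN`. A minimal sketch on invented numbers (the vectors `pop_ex` and `area_ex` are illustrative) shows how `is.finite` can be used to replace both with `NA` in one step:

```r
# invented populations and areas; two regions have an area of 0
pop_ex <- c(1000, 500, 0)
area_ex <- c(10, 0, 0)
# raw density: the second value is Inf, the third is NaN
dens_raw <- round(pop_ex / area_ex, 2)
dens_raw
# keep finite values, replace Inf and NaN with NA
dens_clean <- ifelse(is.finite(dens_raw), dens_raw, NA)
dens_clean
```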


We can now display the data and use color coding to indicate the different population densities.



In [None]:
# define colors
qpal <- colorQuantile(rev(viridis::viridis(10)),
                      wrld_simpl$PopulationDensity, n=10)
# generate visualization
l <- leaflet(wrld_simpl, options =
               leafletOptions(attributionControl = FALSE, minZoom = 1.5)) %>%
  addPolygons(
    label=~stringr::str_c(
      NAME, ' ',
      formatC(PopulationDensity, big.mark = ',', format='d')),
    labelOptions= labelOptions(direction = 'auto'),
    weight=1, color='#333333', opacity=1,
    fillColor = ~qpal(PopulationDensity), fillOpacity = 1,
    highlightOptions = highlightOptions(
      color='#000000', weight = 2,
      bringToFront = TRUE, sendToBack = TRUE)
    ) %>%
  addLegend(
    "topright", pal = qpal, values = ~PopulationDensity,
    title = htmltools::HTML("Population density <br> (2005)"),
    opacity = 1 )
# display visualization
l
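
`colorQuantile` assigns colors by quantile rather than by equal-width intervals, which suits heavily skewed variables such as population density. The idea can be sketched in base R with `quantile` and `cut`; this illustrates the binning principle on invented values (`dens_toy`, with four bins instead of ten) and is not a description of how `leaflet` implements it internally.

```r
# invented, strongly right-skewed density values
dens_toy <- c(2, 15, 40, 90, 130, 300, 1200, 6500)
# quartile boundaries
breaks_q <- quantile(dens_toy, probs = seq(0, 1, by = 0.25))
# quantile bins: each bin receives the same number of observations here
bins_q <- cut(dens_toy, breaks = breaks_q, include.lowest = TRUE)
table(bins_q)
# equal-width bins for comparison: most observations fall into the first bin
bins_w <- cut(dens_toy, breaks = 4)
table(bins_w)
```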


# Adding Shapes to Maps 


<div class="warning" style='padding:0.1em; background-color:#f2f2f2; color:#51247a'>
<span>
<p style='margin-top:1em; text-align:center'>
<b>NOTE</b><br>You need to download a shape file (AshmoreAndCartierIslands.shp) from the following URL:<br><br> https://slcladal.github.io/data/shapes/AshmoreAndCartierIslands.shp<br><br>
Once you have downloaded the shape files, store them on your computer. When you read in the files, make sure that you have adapted the paths so that R can find the shape files on your computer. The paths below work for my computer but they will not work for yours. You need to download the shape files because, unfortunately, the readOGR function does not allow you to download shp-files from the web: it requires you to have them stored on your machine.</p>
<p style='margin-left:1em;'>
</p></span>
</div>

<br>


In [None]:
# load shape files
australia <- rgdal::readOGR(dsn = here::here("data/shapes",
                                             "AshmoreAndCartierIslands.shp"),
                            stringsAsFactors = F)


We can now generate a first map based on the shape file we downloaded.



In [None]:
# plot australia based on shp file
ggplot() + 
  geom_polygon(data = australia, 
               aes(x = long, y = lat, group = group),
               colour = "black", fill = "gray90") +
  theme_void()


Next, we convert the data set into a tidy format.



In [None]:
# convert the data into tidy format
australia_tidy <- broom::tidy(australia, region = "name")
# inspect data
head(australia_tidy)


Now, we extract the names of the states and territories as well as the longitudes and latitudes where we want to display the labels.



In [None]:
# extract names of states and their long and lat
australia_states <- australia_tidy %>%
  dplyr::group_by(id) %>%
  dplyr::summarise(long = mean(long),
                   lat = mean(lat))
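
The label positions computed here are simply the mean longitude and latitude of all boundary vertices of each region. A toy example with two invented squares (the data frame `shapes_toy` is illustrative, and base R's `aggregate` stands in for the `dplyr` pipeline) makes the computation explicit:

```r
# boundary vertices of two invented square regions
shapes_toy <- data.frame(
  id   = rep(c("East", "West"), each = 4),
  long = c(140, 150, 150, 140, 120, 130, 130, 120),
  lat  = c(-30, -30, -20, -20, -30, -30, -20, -20)
)
# mean of the vertices per region
labels_toy <- aggregate(cbind(long, lat) ~ id, data = shapes_toy, FUN = mean)
labels_toy
```

For regular shapes the vertex mean coincides with the center, but for irregular outlines with unevenly spaced vertices it is only a rough label position, because densely digitized stretches of a boundary pull the mean towards them.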


We can now generate a map based on the shp-file. In addition, we define colors for the states and territories and customize the map.



In [None]:
# define colors
clrs <- viridis_pal()(15)
# plot map
p <- ggplot() +
  # plot map
  geom_polygon(data = australia_tidy, 
               aes(x = long, y = lat, group = group, fill = id), 
               alpha = .75, colour = "gray50") +
  # add text
  geom_text(data = australia_states, aes(x = long, y = lat, label = id), 
            size = 3, color = "gray20", fontface = "bold", 
            check_overlap = T) +
  geom_text(data= world_points,aes(x=X, y=Y, label=name),
            color = "gray20", fontface = "bold", check_overlap = T, size = 5) +
  # color states
  scale_fill_manual(values = clrs) +
  # define theme and axes
  theme_void() +
  scale_x_continuous(name = "Longitude", limits = c(110, 160)) +
  scale_y_continuous(name = "Latitude", limits = c(-45, -10)) +
  theme(panel.grid.major = element_line(color = "gray60", 
                                         linetype = "dashed", 
                                         size = 0.25),
        # define background color
        panel.background = element_rect(fill = "aliceblue"),
        legend.position = "none")+ 
  # add compass
  annotation_north_arrow(location = "tl", which_north = "true", 
                         style = north_arrow_fancy_orienteering)
# show plot
p


You can create customized polygons by defining longitudes and latitudes. In fact, you can generate very complex polygons like this. However, in this example, we only create a very basic one as a proof-of-concept.



In [None]:
# create data frame with longitude and latitude values
lat <- c(-25, -27.5, -25, -30, -30, -35, -25)
long <- c(150, 140, 130, 135, 140, 147.5, 150)
mypolygon <- as.data.frame(cbind(long, lat))
# inspect data
mypolygon


We can now plot our polygon over the map produced above.



In [None]:
p + 
  geom_polygon(data=mypolygon,
               aes(x = long, y = lat),
               alpha = 0.2, 
               colour = "gray20", 
               fill = "red") +
  ggplot2::annotate("text", 
                    x = mean(long),
                    y = mean(lat),
                    label = "My Polygon Area", 
                    colour="white", 
                    size=3)


In [None]:
# install library
#install.packages("oz")
# activate library
library(oz)
# show map
oz::oz( states=TRUE, col="orange")


In [None]:
# DOES NOT WORK: dependency (rnaturalearthhires) not available for R>=4.2
# install package
#install.packages("devtools")
devtools::install_github("RobertMyles/flag_fillr")
# activate package
library(flagfillr)
# map of Asia
flag_fillr_continent("Asia")


In [None]:
# Development version of ggspatial
# devtools::install_github("paleolimbot/ggspatial")
library(ggspatial)
library(ggplot2)
library(giscoR)
library(dplyr)
library(rasterpic)

# For country names
library(countrycode)

world <- gisco_get_countries(epsg = 3857)
europe <- gisco_get_countries(region = "Europe", epsg = 3857)

# Base map of the world
plot <- ggplot(world) +
  geom_sf(fill = "grey90") +
  theme_minimal() +
  theme(panel.background = element_rect(fill = "lightblue"))

plot +
  # Zoom on Europe
  coord_sf(
    xlim = c(-2000000, 5000000),
    ylim = c(3500000, 8000000)
  )


In [None]:
# We add the ISO2 code to each European country
europe$iso2 <- countrycode(europe$ISO3_CODE, "iso3c", "iso2c")

# Get flags from repo - low quality to speed up the code
flagrepo <- "https://raw.githubusercontent.com/hjnilsson/country-flags/master/png250px/"

# Loop and add
for (iso in europe$iso2) {
  # Download pic and plot
  imgurl <- paste0(flagrepo, tolower(iso), ".png")
  tmpfile <- tempfile(fileext = ".png")
  download.file(imgurl, tmpfile, quiet = TRUE, mode = "wb")

  # Raster
  x <- europe %>% filter(iso2 == iso)
  x_rast <- rasterpic_img(x, tmpfile, crop = TRUE, mask = TRUE)
  plot <- plot + layer_spatial(x_rast)
}

plot +
  geom_sf(data = europe, fill = NA) +
  # Zoom on Europe
  coord_sf(
    xlim = c(-2000000, 5000000),
    ylim = c(3500000, 8000000)
  )


We will end this introduction here, but if you want to learn more, detailed resources for geospatial data visualization using R can be found [here](https://keen-swartz-3146c4.netlify.app/) or [here](https://geocompr.robinlovelace.net/index.html).


***

[Back to LADAL](https://ladal.edu.au/gviz.html)

***
