Turn KML files into tidy data frames.

tidykml

The tidykml package reads selected elements and values from KML files, such as those produced by Google My Maps, and puts them into tidy data frames, intended for use with packages like dplyr and ggplot2.

Motivation

The goal of tidykml is to make KML files usable for data wrangling and visualization in as few steps as possible. Several R packages can import KML files, but these packages do not offer a straightforward way to use their results with either dplyr or ggplot2.

tidykml will no longer be needed once packages like ggmap, rgdal and sf implement easy ways to produce tidy data frames from KML data, or to fortify KML data into objects that can be passed to ggplot2.

Limitations

  • The tidykml package was tested only against a limited number of KML files, all of which came either from GADM or from Google My Maps. The fields that it extracts from the KML file might not fit other KML sources.
  • The tidykml package does not fully support MultiGeometry elements, such as multi-polygons, and will only handle their first element, in order of appearance in the KML source.
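To check whether the MultiGeometry limitation affects a given file, one option is to count the Polygons nested inside MultiGeometry elements before importing it. The sketch below uses xml2 directly on a made-up KML fragment; the fragment and the variable names are illustrative assumptions, not part of tidykml.

```r
library(xml2)

# Hypothetical KML fragment: one Placemark whose MultiGeometry holds two Polygons
kml <- read_xml('<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example</name>
      <MultiGeometry>
        <Polygon><outerBoundaryIs><LinearRing>
          <coordinates>0,0,0 1,0,0 1,1,0 0,0,0</coordinates>
        </LinearRing></outerBoundaryIs></Polygon>
        <Polygon><outerBoundaryIs><LinearRing>
          <coordinates>5,5,0 6,5,0 6,6,0 5,5,0</coordinates>
        </LinearRing></outerBoundaryIs></Polygon>
      </MultiGeometry>
    </Placemark>
  </Document>
</kml>')

# Drop the default KML namespace so that plain XPath expressions match
xml_ns_strip(kml)

# Number of MultiGeometry elements, and of Polygons nested inside them
n_multi <- length(xml_find_all(kml, "//MultiGeometry"))
n_parts <- length(xml_find_all(kml, "//MultiGeometry/Polygon"))

# Polygons beyond the first one of each MultiGeometry, which would be skipped
n_skipped <- n_parts - n_multi
```

If `n_skipped` is greater than zero, some polygons in the file will not appear in the imported data frame. The subtraction assumes that every MultiGeometry contains at least one Polygon.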

Due to these limitations, tidykml lives on GitHub but will probably never show up on CRAN.

Installation

Install tidykml with devtools:

devtools::install_github("briatte/tidykml")
library(tidykml)

Example

The data used in this example is a map of the U.S. Civil War featured on Google My Maps. It is bundled in the tidykml package (see ?states for details and usage).

The tidykml package contains functions to return the Points, Polygons or LineStrings of a KML file:

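library(dplyr)
# Path to the example KML file bundled with the package
f <- system.file("extdata", "states.kml.zip", package = "tidykml")
# Read the Polygons and inspect the resulting data frame
kml_polygons(f) %>%
    glimpse()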

The results are always returned in the following form:

Observations: 9,930
Variables: 7
$ folder      <chr> "States (status in 1863)", "States (status in 1863)", "S...
$ name        <chr> "Ohio", "Ohio", "Ohio", "Ohio", "Ohio", "Ohio", "Ohio", ...
$ description <chr> "description: type: Union state<br>type: Union state", "...
$ styleUrl    <chr> "#poly-3F5BA9-1-196", "#poly-3F5BA9-1-196", "#poly-3F5BA...
$ longitude   <dbl> -82.21486, -82.34138, -82.54884, -82.71695, -82.90893, -...
$ latitude    <dbl> 41.46419, 41.43150, 41.39134, 41.45053, 41.42947, 41.456...
$ altitude    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
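Because each row is a single vertex, with the Placemark fields repeated on every row, per-Placemark summaries reduce to ordinary grouped operations. The sketch below illustrates this on a small hand-made data frame that mimics the columns shown above; the Kentucky coordinates are made up, and the Ohio ones are rounded from the output above.

```r
library(dplyr)

# Hypothetical stand-in for the output of kml_polygons(): one row per vertex,
# with the Placemark fields repeated on every row
polygons <- tibble(
  folder    = "States (status in 1863)",
  name      = c("Ohio", "Ohio", "Ohio", "Kentucky", "Kentucky"),
  longitude = c(-82.21, -82.34, -82.55, -89.42, -89.13),
  latitude  = c(41.46, 41.43, 41.39, 36.50, 36.98),
  altitude  = 0
)

# One summary row per Placemark: vertex count and coordinate ranges
polygon_summary <- polygons %>%
  group_by(folder, name) %>%
  summarise(
    vertices = n(),
    west  = min(longitude), east  = max(longitude),
    south = min(latitude),  north = max(latitude),
    .groups = "drop"
  )
```

The same pattern should apply to the real output of kml_polygons(f), since it carries the same name, longitude and latitude columns.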

These results are easy to pass to ggplot2:

library(ggplot2)
kml_polygons(f) %>%
    ggplot(aes(longitude, latitude, group = name)) +
      geom_polygon(color = "white") +
      coord_map("albers", lat0 = 45.5, lat1 = 29.5)

These results are also easy to pass to ggmap:

library(ggmap)
m <- get_map(kml_bounds(f), source = "osm")
ggmap(m) +
  geom_polygon(data = kml_polygons(f) %>%
                 mutate(type = gsub("(.*)<br>type: (.*)", "\\2", description)),
               aes(longitude, latitude, group = name, fill = type),
               color = "white", alpha = 0.5) +
  scale_fill_brewer("", palette = "Set1") +
  theme(legend.position = "bottom",
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        axis.title = element_blank())
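The gsub call in the code above extracts the state type from the description field. Here is a minimal illustration on its own, using the description value shown in the glimpse output plus one made-up string that lacks the field:

```r
# First value: description as it appears in the example data
# Second value: hypothetical string without a "<br>type: " part
description <- c(
  "description: type: Union state<br>type: Union state",
  "no type field here"
)

# Keep only what follows "<br>type: " (the second capture group)
type <- gsub("(.*)<br>type: (.*)", "\\2", description)
```

Strings that do not match the pattern are returned unchanged, so descriptions from other KML sources may need a different expression.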

The final map also shows the locations of major U.S. Civil War battles:

ggmap(m) +
  geom_polygon(data = kml_polygons(f) %>%
                 mutate(type = gsub("(.*)<br>type: (.*)", "\\2", description)),
               aes(longitude, latitude, group = name, fill = type),
               color = "white", alpha = 0.5) +
  geom_point(data = kml_points(f),
             aes(longitude, latitude),
             color = "darkred", size = 6, alpha = 0.5) +
  scale_fill_brewer("", palette = "Set1") +
  theme(legend.position = "bottom",
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        axis.title = element_blank())

Data

In addition to the example map used above, the package also contains a map of non-Hispanic gangs in South Los Angeles, created by Instagram user @la_hood_maps (see ?gangs for details).

f <- system.file("extdata", "gangs.kml.zip", package = "tidykml")
m <- get_map(kml_bounds(f), source = "osm")
ggmap(m) +
  geom_polygon(data = kml_polygons(f),
               aes(longitude, latitude, group = name, fill = folder),
               color = "grey25", alpha = 0.75) +
  scale_fill_brewer("", palette = "Set3",
                    guide = guide_legend(override.aes = list(color = NA))) +
  labs(title = "Non-Hispanic Gangs in South Los Angeles (2016)",
       caption = paste("Source: instagram.com/la_hood_maps",
                       "(accessed 30 December 2016)."),
       x = NULL, y = NULL) +
  theme(legend.position = "right",
        legend.justification = c(0, 1),
        plot.title = element_text(face = "bold"),
        plot.caption = element_text(hjust = 0),
        axis.text = element_blank(),
        axis.ticks = element_blank())

Utilities

The tidykml package contains a few helper functions to handle KML files:

  • kml_bounds returns the bounding box (longitude and latitude ranges) of the file.
  • kml_coords parses strings of KML coordinates (longitude,latitude[,altitude]).
  • kml_info returns the number of Folders, Placemarks, LineStrings, Points and Polygons in the file.
  • kml_read is a wrapper for xml2::read_xml that returns KML sources as an XML nodeset.
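To illustrate the coordinate format that kml_coords deals with, here is a base R sketch that parses a KML coordinate string (whitespace-separated longitude,latitude[,altitude] tuples) into a data frame and then derives longitude and latitude ranges from it. The function parse_kml_coords is a hypothetical name used for illustration only; it is not the implementation of kml_coords or kml_bounds, whose exact return values may differ.

```r
# Hypothetical helper, not part of tidykml
parse_kml_coords <- function(x) {
  # Split the string on whitespace into "lon,lat[,alt]" tuples
  tuples <- strsplit(trimws(x), "[[:space:]]+")[[1]]
  # Split each tuple on commas and convert the pieces to numbers
  parts <- lapply(strsplit(tuples, ",", fixed = TRUE), as.numeric)
  data.frame(
    longitude = vapply(parts, function(p) p[1], numeric(1)),
    latitude  = vapply(parts, function(p) p[2], numeric(1)),
    # Altitude is optional in KML; this sketch uses NA when it is absent
    altitude  = vapply(parts, function(p) if (length(p) >= 3) p[3] else NA_real_,
                       numeric(1))
  )
}

# The last tuple deliberately omits the altitude
coords <- parse_kml_coords(
  "-82.21486,41.46419,0 -82.34138,41.43150,0 -82.54884,41.39134"
)

# Longitude and latitude ranges, i.e. a simple bounding box
lon_range <- range(coords$longitude)
lat_range <- range(coords$latitude)
```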