d70b57b Oct 27, 2016
@antagomir @statguy @ouzor @jlehtoma

gisfin - tutorial

This R package provides tools to access open spatial data in Finland as part of the rOpenGov project.

For contact information and source code, see the GitHub page.

Available data sources

Helsinki region district maps (Helsingin seudun aluejakokartat)

Helsinki spatial data (Helsingin seudun avoimia paikkatietoaineistoja)

  • Seutukartta (Helsinki Region Maps)
  • Helsingin piirijako (District Division of the City of Helsinki)
  • Seudullinen osoiteluettelo (Regional Address List)
  • Helsingin osoiteluettelo (Register of Addresses of the City of Helsinki)
  • Rakennusrekisterin ote (Helsinki building registry)
  • Source: Helsingin kaupungin Kiinteistövirasto (HKK)

National Land Survey data (Maanmittauslaitoksen avointa dataa)

Geocoding

IP address geographic coordinates

Statistics Finland geospatial data (Tilastokeskuksen paikkatietoaineistoja)

  • Väestöruutuaineisto (Population grid)
  • Tuotanto- ja teollisuuslaitokset (Production and industrial facilities)
  • Oppilaitokset (Educational institutions)
  • Tieliikenneonnettomuudet (Road accidents)
  • Source: Statistics Finland

Finnish postal code areas (Suomalaiset postinumero KML-muodossa)

Examples (Further usage examples)

A list of potential data sources to be added to the package can be found here.

Installation

Requirements

The gisfin package uses the rgdal package, which depends on GDAL (Geospatial Data Abstraction Library). Installation tips for rgdal on various platforms are listed below. gisfin has been tested with recent versions of its dependency packages and libraries, and using recent releases is recommended, since older versions are known to cause problems in some cases. If you encounter problems, please contact us by email: louhos@googlegroups.com.

Windows

Install the binaries from CRAN.

OSX

Follow these instructions to install rgeos and rgdal on OSX. If these don't work, install rgdal from the KyngChaos Wiki. This is preferred over using the CRAN binaries.

Linux

Install the following packages through your distribution's package manager:

Ubuntu/Debian

sudo apt-get -y install libgdal1-dev libproj-dev

Fedora

sudo yum -y install gdal-devel proj-devel

openSUSE

sudo zypper --non-interactive in libgdal-devel libproj-devel

Additional dependencies

These may be needed: GDAL, freeglut, XML, GEOS and PROJ.4.
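Before installing, it can help to check which R-side dependencies are already present. The following is a minimal sketch in base R. The package list is an assumption drawn from the packages named in this tutorial, not an authoritative list of gisfin's dependencies.

```r
# Packages mentioned in this tutorial (assumption; adjust to your needs).
pkgs <- c("sp", "rgdal", "rgeos", "XML")

# requireNamespace() returns TRUE/FALSE without attaching the package.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing.pkgs <- names(installed)[!installed]
if (length(missing.pkgs) > 0) {
  message("Missing packages: ", paste(missing.pkgs, collapse = ", "))
} else {
  message("All listed packages are available.")
}
```

System libraries such as GDAL, GEOS and PROJ.4 are not R packages and cannot be checked this way; they must be installed through the operating system.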

Installing the package

Development version for developers:

install.packages("devtools")
library("devtools")
install_github("ropengov/gisfin")

Load package.

library("gisfin")

Helsinki region district maps

Helsinki region district maps (Helsingin seudun aluejakokartat) from Helsingin kaupungin Kiinteistövirasto (HKK).

List available maps with get_helsinki_aluejakokartat().

get_helsinki_aluejakokartat()
## [1] "kunta"             "pienalue"          "pienalue_piste"   
## [4] "suuralue"          "suuralue_piste"    "tilastoalue"      
## [7] "tilastoalue_piste" "aanestysalue"

Below, the 'suuralue' districts are used in the plotting examples with spplot() and ggplot2. The other district types can be plotted similarly.

Plot with spplot

Retrieve 'suuralue' spatial object with get_helsinki_aluejakokartat() and plot with spplot().

sp.suuralue <- get_helsinki_aluejakokartat(map.specifier="suuralue")
spplot(sp.suuralue, zcol="Name")

plot of chunk hkk-suuralue1

The function generate_map_colours() assigns colours to regions so that adjacent regions can be told apart. It is used here with the rainbow() colour scale to plot the regions with spplot().

sp.suuralue@data$COL <- factor(generate_map_colours(sp=sp.suuralue))
spplot(sp.suuralue, zcol="COL", 
       col.regions=rainbow(length(levels(sp.suuralue@data$COL))), 
       colorkey=FALSE)

plot of chunk hkk-suuralue2
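To illustrate the idea of colouring adjacent regions differently, here is a minimal base R sketch of greedy colouring on a toy adjacency matrix. This is not the implementation used by generate_map_colours(); the region names, the adjacency data and the helper greedy_colours() are hypothetical.

```r
# Toy adjacency matrix for four regions (hypothetical data):
# A-B, A-C, B-C and C-D share a border.
regions <- c("A", "B", "C", "D")
adj <- matrix(FALSE, 4, 4, dimnames = list(regions, regions))
adj["A", "B"] <- adj["A", "C"] <- adj["B", "C"] <- adj["C", "D"] <- TRUE
adj <- adj | t(adj)  # make the relation symmetric

# Greedy colouring: give each region the smallest colour index
# not already used by one of its neighbours.
greedy_colours <- function(adj) {
  col <- setNames(rep(NA_integer_, nrow(adj)), rownames(adj))
  for (i in seq_len(nrow(adj))) {
    used <- col[adj[i, ]]
    used <- used[!is.na(used)]
    candidates <- setdiff(seq_len(nrow(adj)), used)
    col[i] <- candidates[1]
  }
  col
}

cols <- greedy_colours(adj)

# Map the colour indices to a rainbow() palette, as in the spplot() call
pal <- rainbow(max(cols))
data.frame(region = names(cols), colour.index = cols, colour = pal[cols])
```

Greedy colouring does not guarantee the smallest possible number of colours, but no two neighbouring regions receive the same one.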

Plot with ggplot2

Use the 'sp.suuralue' object retrieved above, and also retrieve the center points of the districts. Use the sp2df() function to transform the spatial objects into data frames. Plot with ggplot2, using a blank map theme from get_theme_map().

# Retrieve center points
sp.suuralue.piste <- get_helsinki_aluejakokartat(map.specifier="suuralue_piste")
# Get data frames
df.suuralue <- sp2df(sp.suuralue)
df.suuralue.piste <- sp2df(sp.suuralue.piste)
# Set map theme
library(ggplot2)
theme_set(get_theme_map())
# Plot regions, add labels using the points data
ggplot(df.suuralue, aes(x=long, y=lat)) + 
  geom_polygon(aes(fill=COL, group=Name)) + 
  geom_text(data=df.suuralue.piste, aes(label=Name)) + 
  theme(legend.position="none")

plot of chunk hkk-suuralue3

Plot election districts

Retrieve and plot the äänestysaluejako (election districts) with get_helsinki_aluejakokartat() and spplot(), using colours to separate municipalities.

sp.aanestys <- get_helsinki_aluejakokartat(map.specifier="aanestysalue")
spplot(sp.aanestys, zcol="KUNTA", 
       col.regions=rainbow(length(levels(sp.aanestys@data$KUNTA))), 
       colorkey=FALSE)

plot of chunk hkk-aanestysalue


Helsinki spatial data

Other Helsinki region spatial data from Helsingin Kaupungin Kiinteistövirasto (HKK).

List available spatial data with get_helsinki_spatial().

get_helsinki_spatial()

Retrieve a district division (piirijako) map with get_helsinki_spatial() and transform the coordinates with sp::spTransform().

sp.piiri <- get_helsinki_spatial(map.type="piirijako", 
                                 map.specifier="ALUEJAKO_PERUSPIIRI")
# Check current coordinates
sp.piiri@proj4string
# Transform coordinates to WGS84
sp.piiri <- sp::spTransform(sp.piiri, sp::CRS("+proj=longlat +datum=WGS84"))

National Land Survey Finland

Spatial data from the National Land Survey of Finland (Maanmittauslaitos, MML). These data are preprocessed into RData format; see details here.

Retrieve regional borders for Finland with get_mml().

# Get a specific map
sp.mml <- get_mml(map.id="Yleiskartta-4500", data.id="HallintoAlue")
# Investigate available variables in this map
library(knitr)
kable(head(as.data.frame(sp.mml)))

| | Kohderyhma | Kohdeluokk | Enklaavi | AVI | Maakunta | Kunta | AVI_ni1 | AVI_ni2 | Maaku_ni1 | Maaku_ni2 | Kunta_ni1 | Kunta_ni2 | Kieli_ni1 | Kieli_ni2 | AVI.FI | Kieli.FI | Maakunta.FI | Kunta.FI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 005 | 71 | 84200 | 1 | 4 | 14 | 005 | Länsi- ja Sisä-Suomen aluehallintovirasto | Regionförvaltningsverket i Västra och Inre Finland | Etelä-Pohjanmaa | Södra Österbotten | Alajärvi | N_A | Suomi | N_A | Länsi- ja Sisä-Suomen aluehallintovirasto | Suomi | Etelä-Pohjanmaa | NA |
| 009 | 71 | 84200 | 1 | 5 | 17 | 009 | Pohjois-Suomen aluehallintovirasto | Regionförvaltningsverket i Norra Finland | Pohjois-Pohjanmaa | Norra Österbotten | Alavieska | N_A | Suomi | N_A | Pohjois-Suomen aluehallintovirasto | Suomi | Pohjois-Pohjanmaa | NA |
| 010 | 71 | 84200 | 1 | 4 | 14 | 010 | Länsi- ja Sisä-Suomen aluehallintovirasto | Regionförvaltningsverket i Västra och Inre Finland | Etelä-Pohjanmaa | Södra Österbotten | Alavus | N_A | Suomi | N_A | Länsi- ja Sisä-Suomen aluehallintovirasto | Suomi | Etelä-Pohjanmaa | NA |
| 016 | 71 | 84200 | 1 | 1 | 07 | 016 | Etelä-Suomen aluehallintovirasto | Regionförvaltningsverket i Södra Finland | Päijät-Häme | Päijänne-Tavastland | Asikkala | N_A | Suomi | N_A | Etelä-Suomen aluehallintovirasto | Suomi | Päijät-Häme | NA |
| 018 | 71 | 84200 | 1 | 1 | 01 | 018 | Etelä-Suomen aluehallintovirasto | Regionförvaltningsverket i Södra Finland | Uusimaa | Nyland | Askola | N_A | Suomi | N_A | Etelä-Suomen aluehallintovirasto | Suomi | Uusimaa | NA |
| 019 | 71 | 84200 | 1 | 2 | 02 | 019 | Lounais-Suomen aluehallintovirasto | Regionförvaltningsverket i Sydvästra Finland | Varsinais-Suomi | Egentliga Finland | Aura | N_A | Suomi | N_A | Lounais-Suomen aluehallintovirasto | Suomi | Varsinais-Suomi | NA |

You can list other available data sets:

list_mml_datasets()
## $`2012`
## character(0)
## 
## $`2016`
## character(0)
## 
## $`Maastotietokanta-tiesto1`
## [1] "N61_v"
## 
## $`Maastotietokanta-tiesto2`
## [1] "N62_p" "N62_s" "N62_t" "N62_v"
## 
## $`Yleiskartta-1000`
##  [1] "AmpumaRaja"             "HallintoAlue"          
##  [3] "HallintoAlue_DataFrame" "HallintoalueRaja"      
##  [5] "KaasuJohto"             "KarttanimiPiste500"    
##  [7] "KarttanimiPiste1000"    "KorkeusAlue"           
##  [9] "KorkeusViiva500"        "KorkeusViiva1000"      
## [11] "LentokenttaPiste"       "LiikenneAlue"          
## [13] "MaaAlue"                "Maasto1Reuna"          
## [15] "Maasto2Alue"            "MetsaRaja"             
## [17] "PeltoAlue"              "RautatieViiva"         
## [19] "SahkoLinja"             "SuojaAlue"             
## [21] "SuojametsaRaja"         "SuojeluAlue"           
## [23] "TaajamaAlue"            "TaajamaPiste"          
## [25] "TieViiva"               "VesiAlue"              
## [27] "VesiViiva"             
## 
## $`Yleiskartta-4500`
##  [1] "HallintoAlue"        "HallintoalueRaja"    "KarttanimiPiste2000"
##  [4] "KarttanimiPiste4500" "KarttanimiPiste8000" "KorkeusAlue"        
##  [7] "KorkeusViiva"        "Maasto1Reuna"        "RautatieViiva"      
## [10] "TaajamaPiste2000"    "TaajamaPiste4500"    "TaajamaPiste8000"   
## [13] "TieViiva2000"        "TieViiva4500"        "TieViiva8000"       
## [16] "VesiAlue"            "VesiViiva2000"       "VesiViiva4500"      
## [19] "VesiViiva8000"
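The listing is a named list with one character vector of data IDs per map ID, so a map.id/data.id pair can be checked before calling get_mml(). The sketch below re-types a small excerpt of the output by hand so that it runs without the package. The helper has_dataset() is hypothetical and not part of gisfin.

```r
# A small excerpt of the list_mml_datasets() output shown above,
# re-typed by hand so that the example runs without the package.
datasets <- list(
  "Maastotietokanta-tiesto1" = c("N61_v"),
  "Maastotietokanta-tiesto2" = c("N62_p", "N62_s", "N62_t", "N62_v"),
  "Yleiskartta-4500" = c("HallintoAlue", "HallintoalueRaja", "VesiAlue")
)

# Hypothetical helper: is data.id listed under map.id?
has_dataset <- function(datasets, map.id, data.id) {
  map.id %in% names(datasets) && data.id %in% datasets[[map.id]]
}

has_dataset(datasets, "Yleiskartta-4500", "HallintoAlue")  # listed
has_dataset(datasets, "Yleiskartta-4500", "PeltoAlue")     # not listed here
```

With the package loaded, the same check could presumably be run on the real listing by passing list_mml_datasets() as the first argument.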

Further examples

Visualizing Finnish municipalities with your own data

Here we show examples with the standard shape tools. For interactive maps, see leaflet and rMaps. Examples to be added later.

First, retrieve population data (2013) for Finnish municipalities:

# Get municipality population data from Statistics Finland
# using the pxweb package
library(pxweb)
mydata <- get_pxweb_data(url = "http://pxwebapi2.stat.fi/PXWeb/api/v1/fi/Kuntien_talous_ja_toiminta/Kunnat/ktt14/080_ktt14_2013_fi.px",
             dims = list(Alue = c('*'),
                         Tunnusluku = c('30'),
                         Vuosi = c('Arvo')),
             clean = TRUE)

# Pick municipality ID from the text field
mydata$Kuntakoodi <- sapply(strsplit(as.character(mydata$Alue), " "), function (x) x[[1]])
mydata$Kunta <- sapply(strsplit(as.character(mydata$Alue), " "), function (x) x[[2]])

# Rename fields for clarity
mydata$Asukasluku <- mydata$values

# Pick only the necessary fields for clarity
mydata <- mydata[, c("Kunta", "Kuntakoodi", "Asukasluku")]
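The split above takes the first and second space-separated tokens, so a municipality name that itself contains a space would be cut short. A minimal base R sketch of splitting on the first space only is given below. The sample 'Alue' values are made up for illustration and assume the "<code> <name>" format.

```r
# Hypothetical 'Alue' values in the "<code> <name>" format.
alue <- c("005 Alajärvi", "091 Helsinki", "599 Pedersören kunta")

# Split on the first space only, so that names containing spaces stay whole.
kuntakoodi <- sub(" .*$", "", alue)     # everything before the first space
kunta <- sub("^[^ ]+ ", "", alue)       # everything after the first space

data.frame(Kuntakoodi = kuntakoodi, Kunta = kunta)
```

Keeping the code as a character string also preserves leading zeros such as in "005".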

Visualize population with Land Survey Finland (MML) maps. See also the blog post on this topic. Use a fast wrapper that generates a ggplot2 object, which can be further modified if necessary:

# Get the municipality map for visualization
sp <- get_municipality_map(data.source = "MML")

# Merge the Finnish map shape file and the population data based on
# the municipality code ('kuntakoodi' in the map, 'Kuntakoodi' in the
# data). The population data also contains some regions other than
# municipalities. These are ignored when merged with the municipality map:
sp2 <- sp::merge(sp, mydata, all.x = TRUE, by.x = "kuntakoodi", by.y="Kuntakoodi")

p <- region_plot(sp2, color = "Asukasluku", region = "kuntakoodi", by = 100000)
print(p)

plot of chunk gisfin-owndata1
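The effect of all.x = TRUE can be seen on plain data frames. The sketch below uses base merge() on toy tables with made-up values; sp::merge() operates on spatial objects and may differ in details, so treat this only as an illustration of the join semantics.

```r
# Toy stand-ins for the map attribute table and the population data
# (hypothetical values, not real statistics).
map.df <- data.frame(kuntakoodi = c("005", "091", "999"),
                     stringsAsFactors = FALSE)
pop.df <- data.frame(Kuntakoodi = c("005", "091", "SSS"),
                     Asukasluku = c(10000, 600000, 5000000),
                     stringsAsFactors = FALSE)

# all.x = TRUE keeps every map row; rows found only in pop.df are dropped.
merged <- merge(map.df, pop.df, by.x = "kuntakoodi", by.y = "Kuntakoodi",
                all.x = TRUE)
merged
```

The map row without matching data ("999") is kept with NA in Asukasluku, while the extra region in the data ("SSS") does not appear in the result.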

Using GADM maps

The same can be done with GADM maps. You can select the desired maps at the GADM service. Choose Finland and file format R. This will give the link to the Finnish municipality data file. GADM contains very useful maps, but the Finnish municipality map data seems a bit outdated:

# Load municipality borders from GADM:
# sp <- get_municipality_map(data.source = "GADM") # also possible
gadm.url <- "http://biogeo.ucdavis.edu/data/gadm2/R/FIN_adm4.RData"
con <- url(gadm.url)
load(con); close(con)

# Convert NAME field into factor (needed for plots)
gadm$NAME_4 <- factor(gadm$NAME_4)

# Merge the Finnish map shape file and the population data based on
# the 'Kunta' field (see above)
gadm2 <- sp::merge(gadm, mydata, by.x = "NAME_4", by.y = "Kunta", all.x = TRUE)

# Plot the shape file, colour municipalities by population
# It turns out that not all municipality names can be matched.
# We are happy to add solutions here if you have any.
spplot(gadm2, zcol="Asukasluku", colorkey=TRUE, main = "Population in Finnish municipalities")

plot of chunk gisfin-owndata2
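When merging by name, it is useful to list which names fail to match on each side before plotting. A minimal base R sketch follows. The two name vectors are hypothetical stand-ins for gadm$NAME_4 and mydata$Kunta, with made-up spelling differences.

```r
# Hypothetical name vectors standing in for gadm$NAME_4 and mydata$Kunta.
map.names  <- c("Helsinki", "Espoo", "Pedersore", "Maarianhamina")
data.names <- c("Helsinki", "Espoo", "Pedersören kunta",
                "Maarianhamina - Mariehamn")

# Names present on one side only
unmatched.map  <- setdiff(map.names, data.names)
unmatched.data <- setdiff(data.names, map.names)

unmatched.map
unmatched.data
```

Inspecting the two lists side by side shows whether the mismatches come from spelling variants, bilingual names or municipalities that have merged since the map was made.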


Geocoding

Get geocodes for a given location (address etc.) using one of the available services (OKF, OpenStreetMap and Google in the examples below). Please read the usage policies of the different services carefully.

The function get_geocode() returns both latitude and longitude for the first hit, and the raw output (varies depending on the service used).

Warning! The geocode results may vary between sources, use with care!

gc1 <- get_geocode("Mannerheimintie 100, Helsinki", service="okf")
unlist(gc1[1:2])
##      lat      lon 
## 60.18856 24.91736
gc2 <- get_geocode("Mannerheimintie 100, Helsinki", service="openstreetmap")
unlist(gc2[1:2])
##      lat      lon 
## 60.18864 24.91750
gc3 <- get_geocode("Mannerheimintie 100, Helsinki", service="google")
unlist(gc3[1:2])
##      lat      lon 
## 60.18864 24.91753
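To get a feel for how much the results differ, the distance between the returned coordinates can be computed. The sketch below uses the haversine formula in base R on the three coordinate pairs printed above. The helper haversine_m() is hypothetical and assumes a spherical Earth with a mean radius of 6371 km, so the distances are approximate.

```r
# Coordinates copied from the three get_geocode() results above.
gc <- data.frame(service = c("okf", "openstreetmap", "google"),
                 lat = c(60.18856, 60.18864, 60.18864),
                 lon = c(24.91736, 24.91750, 24.91753))

# Great-circle distance in metres (haversine formula, spherical Earth
# with mean radius 6371 km; an approximation).
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(a))
}

# Distance of each result from the first one (okf)
gc$dist.from.okf.m <- haversine_m(gc$lat[1], gc$lon[1], gc$lat, gc$lon)
gc
```

For this address the three services agree to within a few tens of metres at most; other addresses may differ more.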

Get geocode for a city (instead of street address; only implemented for OSM at the moment):

gc4 <- get_geocode("&city=Helsinki", service="openstreetmap", raw_query=TRUE)
unlist(gc4[1:2])
##      lat      lon 
## 60.16741 24.94257

IP Location

Geographic coordinates for a given IP address from Data Science Toolkit:

ip_location("137.224.252.10")
## [1] "51.9667015075684" "5.66669988632202"

Statistics Finland geospatial data

Geospatial data provided by Statistics Finland.

Retrieve a list of the available data sets for population density. In case the service is unreachable, character(0) is returned.

request <- gisfin::GeoStatFiWFSRequest$new()$getPopulationLayers()
client <- gisfin::GeoStatFiWFSClient$new(request)
layers <- client$listLayers()
if (length(layers) > 0) layers
##  [1] "vaestoruutu:vaki2005_1km"    "vaestoruutu:vaki2005_1km_kp"
##  [3] "vaestoruutu:vaki2010_1km"    "vaestoruutu:vaki2010_1km_kp"
##  [5] "vaestoruutu:vaki2011_1km"    "vaestoruutu:vaki2011_1km_kp"
##  [7] "vaestoruutu:vaki2012_1km"    "vaestoruutu:vaki2012_1km_kp"
##  [9] "vaestoruutu:vaki2013_1km"    "vaestoruutu:vaki2013_1km_kp"
## [11] "vaestoruutu:vaki2014_1km"    "vaestoruutu:vaki2014_1km_kp"
## [13] "vaestoruutu:vaki2015_1km"    "vaestoruutu:vaki2015_1km_kp"
## [15] "vaestoruutu:vaki2005_5km"    "vaestoruutu:vaki2010_5km"   
## [17] "vaestoruutu:vaki2011_5km"    "vaestoruutu:vaki2012_5km"   
## [19] "vaestoruutu:vaki2013_5km"    "vaestoruutu:vaki2014_5km"   
## [21] "vaestoruutu:vaki2015_5km"   
## attr(,"driver")
## [1] "WFS"
## attr(,"nlayers")
## [1] 21
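The layer names encode a year and a grid size, so a layer can be picked by name rather than by its position in the list, which may change when new layers are published. A minimal base R sketch, using a few names copied from the listing above, is shown here. The parsing rules are an assumption based on the visible naming pattern only.

```r
# A few layer names copied from the listing above.
layers <- c("vaestoruutu:vaki2005_1km", "vaestoruutu:vaki2005_1km_kp",
            "vaestoruutu:vaki2014_1km", "vaestoruutu:vaki2005_5km",
            "vaestoruutu:vaki2015_5km")

# Extract year and grid size with regular expressions
# (assumes the pattern "vaki<year>_<size>km").
layer.info <- data.frame(
  layer = layers,
  year = as.integer(sub("^.*vaki([0-9]{4})_.*$", "\\1", layers)),
  grid = sub("^.*_([0-9]+km).*$", "\\1", layers),
  stringsAsFactors = FALSE
)

# Select by year and grid size rather than by position
selected <- subset(layer.info, year == 2005 & grid == "5km")$layer
selected
```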

Get population density in year 2005 on a 5 km x 5 km grid, convert to RasterStack object and plot on log scale.

library(raster)
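# Select the layer by name; positions in 'layers' change as new layers are published
layer <- "vaestoruutu:vaki2005_5km"
request$getPopulation(layer)
client <- gisfin::GeoStatFiWFSClient$new(request)
population <- client$getLayer(layer)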
if (length(population) > 0) {
  x <- sp::SpatialPixelsDataFrame(coordinates(population), population@data, proj4string=population@proj4string)
  population <- raster::stack(x)
  plot(log(population[["vaesto"]]))
}

plot of chunk population-density-plot

Finnish postal code areas

Spatial data provided by Duukkis.

Get the postal code areas and plot them for the Helsinki region.

pnro.sp <- get_postalcode_areas()
pnro.sp@data$COL <- factor(generate_map_colours(sp=pnro.sp))
pnro.pks.sp <- pnro.sp[substr(pnro.sp$pnro, 1, 2) %in% c("00", "01", "02"), ]
spplot(pnro.pks.sp, zcol="COL", 
       col.regions=rainbow(length(levels(pnro.pks.sp@data$COL))), 
       colorkey=FALSE)

plot of chunk postal_code
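The subsetting step relies on the first two characters of the postal code. The base R sketch below shows the same prefix test on a plain character vector of sample codes (chosen for illustration), and why the codes should stay character rather than numeric.

```r
# Sample postal codes for illustration; kept as character so that
# leading zeros survive.
pnro <- c("00100", "01300", "02150", "33100", "90100")

# Same prefix test as in the subsetting step above
in.pks <- substr(pnro, 1, 2) %in% c("00", "01", "02")
pnro[in.pks]

# Storing the codes as numbers drops the leading zeros, so the
# two-character prefix no longer identifies the area.
numeric.prefix <- substr(as.character(as.numeric(pnro)), 1, 2)
numeric.prefix
```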


Citation

Citing the data: See help() to get citation information for each data source individually.

Citing the R package:

citation("gisfin")

Kindly cite the gisfin R package as follows:

  (C) Joona Lehtomaki, Juuso Parkkinen, Leo Lahti, Jussi Jousimo
  and Janne Aukia 2015-2016. gisfin R package

A BibTeX entry for LaTeX users is

  @Misc{,
    title = {gisfin R package},
    author = {Joona Lehtomaki and Juuso Parkkinen and Leo Lahti and Jussi Jousimo and Janne Aukia},
    year = {2015-2016},
  }

Many thanks for all contributors! For more info, see:
https://github.com/rOpenGov/gisfin

Session info

This vignette was created with

sessionInfo()
## R version 3.3.1 (2016-06-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.1 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=de_BE.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=de_BE.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=de_BE.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=de_BE.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] raster_2.5-8    pxweb_0.6.3     ggplot2_2.1.0   maptools_0.8-39
##  [5] rgeos_0.3-21    gisfin_0.9.27   R6_2.2.0        rgdal_1.1-10   
##  [9] sp_1.2-3        knitr_1.14     
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.7      spdep_0.6-8      formatR_1.4      plyr_1.8.4      
##  [5] highr_0.6        bitops_1.0-6     LearnBayes_2.15  tools_3.3.1     
##  [9] boot_1.3-18      digest_0.6.10    jsonlite_1.1     evaluate_0.10   
## [13] nlme_3.1-128     gtable_0.2.0     lattice_0.20-34  Matrix_1.2-7.1  
## [17] curl_2.1         coda_0.18-1      httr_1.2.1       stringr_1.1.0   
## [21] gtools_3.5.0     grid_3.3.1       data.table_1.9.6 XML_3.98-1.4    
## [25] foreign_0.8-67   RJSONIO_1.3-0    gdata_2.17.0     deldir_0.1-12   
## [29] magrittr_1.5     scales_0.4.0     MASS_7.3-45      splines_3.3.1   
## [33] gmodels_2.16.2   colorspace_1.2-7 labeling_0.3     stringi_1.1.2   
## [37] RCurl_1.95-4.8   munsell_0.4.3    chron_2.3-47     rjson_0.2.15