Accessing CKAN in R

Oleg Lavrovsky edited this page Dec 16, 2018 · 1 revision

This is a short tutorial on using R to search for and directly use live open data through the API of the opendata.swiss portal. Details of the API are documented at handbook.opendata.swiss.

As a first step, we install the ckanr package and load it.

install.packages("ckanr")
library('ckanr')

We can initialise ckanr with any web-accessible CKAN open data portal simply by supplying the URL of its start page.

ckanr_setup(url = "https://opendata.swiss")

As a further exercise, try running the same code on data.stadt-zurich.ch or old.datahub.io.

If there are no connection errors, we are now ready to run a search to get some data packages:

x <- package_search(q = 'name:arbeitslosenquote', rows = 1)
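To make the search call less of a black box, here is a minimal sketch of the request URL it corresponds to. It assumes the portal exposes the standard CKAN Action API path (/api/3/action/package_search); the variable names are illustrative only, and ckanr may build its request differently internally.

```r
# Assumption: the portal serves the standard CKAN Action API under /api/3/action/
base_url <- "https://opendata.swiss"

# The same parameters that were passed to package_search() above
query <- list(q = "name:arbeitslosenquote", rows = 1)

# Percent-encode each value and join the pairs into a query string
encoded <- vapply(
  query,
  function(v) URLencode(as.character(v), reserved = TRUE),
  character(1)
)
query_string <- paste0(names(query), "=", encoded, collapse = "&")

request_url <- paste0(base_url, "/api/3/action/package_search?", query_string)
print(request_url)
```

Pasting a URL of this shape into a browser is a quick way to inspect the raw JSON that the search returns.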

Note that on the Swiss server the titles are multilingual, so we extract just the German (de) title.

x$results[[1]]$title$de

You can print the contents of x$results, consult the API reference, or use the inspection tools of RStudio to see the full data contents.
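The nested structure can also be explored offline. The sketch below uses a hand-written stand-in for a search result (mock_result and its placeholder titles are made up, not real portal data) and a hypothetical helper, pick_title, that falls back to another language when the preferred title is empty.

```r
# Hypothetical stand-in for the list returned by package_search();
# the field layout mirrors the tutorial, the values are placeholders
mock_result <- list(
  count = 1,
  results = list(
    list(
      name = "example-dataset",
      title = list(de = "", en = "Example title", fr = "Titre d'exemple")
    )
  )
)

# Show the top levels of the first package without printing everything
str(mock_result$results[[1]], max.level = 2)

# Return the first non-empty title, trying the languages in the given order
pick_title <- function(title, langs = c("de", "en", "fr", "it")) {
  for (lang in langs) {
    value <- title[[lang]]
    if (!is.null(value) && nzchar(value)) {
      return(value)
    }
  }
  NA_character_
}

chosen_title <- pick_title(mock_result$results[[1]]$title)
print(chosen_title)
```

With a real search result, the same helper could be called as pick_title(x$results[[1]]$title).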

To download live open data, we can now select the URL of the first resource in the first package:

tsv_url <- x$results[[1]]$resources[[1]]$download_url
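Taking the first resource works only if the TSV file happens to be listed first. A slightly more defensive sketch picks the resource by its format field instead. The mock_resources list below is made up for illustration, and the assumption that the portal labels such files with the format "TSV" should be checked against the actual metadata.

```r
# Hypothetical resource list shaped like x$results[[1]]$resources
mock_resources <- list(
  list(format = "PDF", download_url = "https://example.org/report.pdf"),
  list(format = "TSV", download_url = "https://example.org/data.tsv")
)

# Collect the format of every resource, tolerating missing fields
formats <- vapply(
  mock_resources,
  function(r) if (is.null(r$format)) NA_character_ else toupper(r$format),
  character(1)
)

# Index of the first resource labelled TSV (NA if there is none)
tsv_index <- match("TSV", formats)
if (is.na(tsv_index)) {
  stop("No TSV resource found in this package")
}

selected_url <- mock_resources[[tsv_index]]$download_url
print(selected_url)
```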

We download the remote tab-separated values (TSV) file and parse it in one step:

raw_data <- read.csv(tsv_url, header = TRUE, sep = "\t")
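The parsing step can be tried without a network connection by feeding the reader inline text. In this sketch the column names and values are invented placeholders, not figures from the portal. It also shows that base R's read.delim gives the same result, since it is read.csv with a tab separator as its default.

```r
# Made-up tab-separated sample; "\t" separates columns, "\n" separates rows
tsv_text <- "value\tperiod\n3.3\t2016\n3.2\t2017\n2.6\t2018\n"

# Same call as in the tutorial, reading from text instead of a URL
sample_data <- read.csv(text = tsv_text, header = TRUE, sep = "\t")

# read.delim defaults to header = TRUE and sep = "\t"
same_data <- read.delim(text = tsv_text)

# Check the column types before plotting
str(sample_data)
print(identical(sample_data, same_data))
```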

At this point we are ready to draw a simple plot, with the second column on the x-axis and the first column on the y-axis, using visualization code such as:

plot(raw_data[,2], raw_data[,1], type="b")
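As a small extension, the plot can be given axis labels taken from the column names and written to a file, which is useful when the script runs non-interactively. This sketch uses an invented data frame (example_data) in place of the downloaded data; the file name is arbitrary.

```r
# Placeholder data standing in for raw_data; the values are made up
example_data <- data.frame(
  value = c(3.3, 3.2, 2.6),
  period = c(2016, 2017, 2018)
)

# Write the plot to a PDF in the session's temporary directory
plot_file <- file.path(tempdir(), "example_plot.pdf")
pdf(plot_file)
plot(
  example_data[, 2], example_data[, 1],
  type = "b",                       # points joined by lines, as in the tutorial
  xlab = names(example_data)[2],    # label axes with the column names
  ylab = names(example_data)[1]
)
invisible(dev.off())                # close the device so the file is written

print(plot_file)
```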

Get the full example script on GitHub.

This document was adapted from lecture materials on Advanced Studies in Data Analysis at the Bern University of Applied Sciences, as first posted on the School of Data CH forum.
