Ruud Steltenpool edited this page May 20, 2022 · 19 revisions

This tutorial is a quick walkthrough for deploying your own Linked Data API using grlc. Since we'll be using grlc's public instance, you don't need to install anything; a GitHub account will do (although sharing queries on GitHub is not mandatory, it's the easiest way to get SPARQL-based APIs quickly up and running). Having your own SPARQL endpoint is optional.

This is the magic recipe (click on the links for further detail):

  1. Log into GitHub / create GitHub account
  2. Create a repository for queries
  3. Put your queries up in your repository
  4. Endpoints
  5. Create your Linked Data API
  6. Using your Linked Data API
  7. Linked Data Fragments compatibility
  8. Further steps

If after following these steps you don't manage to get a working API, please let us know on Gitter.

Log into GitHub / create GitHub account

The first thing we'll need is to log into GitHub. If you don't have a GitHub account, you'll be prompted to create one. Just follow the on-screen instructions.

Create a repository for queries

Once you're logged in, you'll see a dashboard screen similar to this one:

To create a new repository, click on the + sign in the top right corner of the screen, and then click on New repository.

Fill in the form to give your repository a name, and (optionally) a description. You can keep the rest of the defaults. Make sure the visibility is set to Public (grlc will need this to access the queries we'll store in the repo). When done, click on Create repository.

The URI of your repository will look something like https://github.com/username/repo. In my case that's https://github.com/albertmeronyo/my-linked-data-api. We'll use this URI later to let grlc know where to find the queries of our API.

Put your queries up in your repository

Next, we need to create one file inside our new repository for each query we want to turn into an API method. To do this, let's first write our query in SPARQL (grlc supports more than just SPARQL, but let's use it as a prototypical Linked Data example). We'll type a SPARQL query that retrieves the names of all the Hard Rock bands DBpedia knows about, like this:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?band_label { 
    ?band rdf:type dbo:Band ;
          dbo:genre dbr:Hard_Rock ;
          rdfs:label ?band_label .
} ORDER BY ?band_label

You can try out this query at the DBpedia SPARQL endpoint, or with the assistance of the handy YASGUI editor (I've typed it there for you). It's really important not to forget any prefixes: grlc needs them, so double-check that you've declared every prefix used in your queries.
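If you want to automate that double-check, a small script can compare the prefixes a query uses against the ones it declares. This is just a local sanity check we sketch here; the `undeclared_prefixes` helper is ours, not part of grlc, and the regexes cover common cases rather than the full SPARQL grammar:

```python
import re

def undeclared_prefixes(query):
    """Return the set of prefixes a SPARQL query uses but never declares."""
    declared = set(re.findall(r'(?im)^\s*PREFIX\s+([A-Za-z][\w-]*):', query))
    # Drop the PREFIX lines, then look for prefixed names like "dbo:Band".
    # Full IRIs such as <http://...> don't match because "//" follows the colon.
    body = re.sub(r'(?im)^\s*PREFIX\s+[^\n]*$', '', query)
    used = set(re.findall(r'\b([A-Za-z][\w-]*):[A-Za-z_]', body))
    return used - declared

query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?band_label {
    ?band rdf:type dbo:Band ;
          dbo:genre dbr:Hard_Rock ;
          rdfs:label ?band_label .
}
"""
print(sorted(undeclared_prefixes(query)))  # ['dbr', 'rdf', 'rdfs']
```

Here the check reports `dbr`, `rdf` and `rdfs` as missing, so we'd add those PREFIX lines before committing the query.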

Once you're sure the query returns results, we'll put it up on your GitHub repo. To do this, click on Create new file in the repository page.

After this, you'll see the file editor. Give your file a name first; it's important that the name ends in .rq or .sparql, so grlc knows its content is actually a SPARQL query.

Now, go back to where you tested our Hard Rock bands query (or just scroll up this page), select the whole query, from PREFIX dbo: to ?band_label, and copy-paste it into the file editor as shown below.

Endpoints

Before we finish, we need to tell grlc the endpoint URI against which the query should be executed. There are a number of ways of doing this: you can specify the endpoint URI in the query file, in the repo (in case all queries execute against the same endpoint), or alternatively at execution time as a parameter. To keep things simple, and since we're still editing our query, we'll add the endpoint URI in the query file. We do this by adding the following line at the top:

#+ endpoint: http://dbpedia.org/sparql

(If your endpoint is different from DBpedia's, just type your endpoint's URI instead.) Our query editor should now look like this:

Notice that the endpoint metadata (as well as all other metadata fields used by grlc) is indicated through standard SPARQL comments plus a bit of YAML syntax. This makes sure that the content of SPARQL files is always SPARQL compliant and doesn't break any standard.
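To make the convention concrete, here is a minimal sketch of how such `#+ key: value` comment lines could be collected into a dictionary. This is our own illustrative parser, not grlc's (grlc runs a real YAML parser over these lines and also supports nested values such as lists under `#+ tags:`):

```python
def parse_decorators(rq_text):
    """Collect grlc-style '#+ key: value' metadata lines from a query file."""
    meta = {}
    for line in rq_text.splitlines():
        if line.startswith('#+'):
            # Split on the first ':' only, so URI values keep their colons.
            key, _, value = line[2:].partition(':')
            meta[key.strip()] = value.strip()
    return meta

rq = """#+ endpoint: http://dbpedia.org/sparql
#+ summary: Hard Rock bands in DBpedia

SELECT ?band_label WHERE { ?band rdfs:label ?band_label }
"""
print(parse_decorators(rq))
# {'endpoint': 'http://dbpedia.org/sparql', 'summary': 'Hard Rock bands in DBpedia'}
```

Because everything lives in `#` comments, a SPARQL engine simply ignores the metadata, which is exactly why the files stay standards-compliant.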

After this, scroll all the way down until you see a Commit new file button, and click on it (optionally, you can add a description to the commit):

That's it! This puts our new query up on GitHub, making it available to any client. To double-check, you can see that your repository now contains your recently created SPARQL file, plus a README.md file (if you chose to have a default one when you created the repository):

Create your Linked Data API

Creating your Linked Data API with the query we just stored is really easy using a grlc instance. Assuming we'll use the public one at http://grlc.io, all we need to do is point our browser at the following address by typing it in the browser address bar:

http://grlc.io/api/username/repo/

This changes for every user and repository name; for me, given that my username is albertmeronyo and my repo my-linked-data-api, I need to go to:

http://grlc.io/api/albertmeronyo/my-linked-data-api/

If all goes well, grlc will craft a nice API documentation page similar to this one:

For developers: if you're just interested in the JSON OpenAPI spec, you can find it at http://grlc.io/api/username/repo/spec, for example http://grlc.io/api/albertmeronyo/my-linked-data-api/spec
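The URL scheme above is entirely mechanical, so it's easy to generate the addresses for any user/repo pair. A small helper (our own convenience function, not part of grlc) that builds them:

```python
def grlc_urls(user, repo, query_name=None, base='http://grlc.io'):
    """Build the grlc.io URLs for an API's docs page, OpenAPI spec,
    and (optionally) one query method."""
    api = f"{base}/api/{user}/{repo}"
    urls = {'docs': api + '/', 'spec': api + '/spec'}
    if query_name:
        urls['method'] = f"{api}/{query_name}"
    return urls

urls = grlc_urls('albertmeronyo', 'my-linked-data-api', 'myquery')
print(urls['spec'])    # http://grlc.io/api/albertmeronyo/my-linked-data-api/spec
print(urls['method'])  # http://grlc.io/api/albertmeronyo/my-linked-data-api/myquery
```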

Using your Linked Data API

Besides creating API documentation and specifications from SPARQL queries, grlc can also execute those queries for you as simple HTTP requests. There are many ways of achieving this; the easiest is to click anywhere in the method myquery (or whatever name we gave it), which will show the following details:

If we now click on Try it out, the following will show (letting us change the endpoint URI, which is the only parameter this query takes). There, we can click on Execute:

And voila! After a few moments, if we scroll down we'll see the results from the endpoint listed next to the response code from the server:

There are many other ways of using these results. For instance, if you click on Download you'll get a file with the whole list of results from the endpoint for further processing. Another way of triggering the execution of the query is to use its full URI in any HTTP client, like your browser. Go ahead and type in your browser's address bar: http://grlc.io/api/albertmeronyo/my-linked-data-api/myquery

You can control the format in which grlc returns these results by simply adding the corresponding extension to the URI; all the following work:

You can try all these from any other HTTP client, like curl on your command line:

curl -X GET -H'Accept: text/csv' http://grlc.io/api/albertmeronyo/my-linked-data-api/myquery
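The same request can be prepared in Python with only the standard library. We build the request here without sending it (executing it needs network access); the Accept header plays the same role as in the curl call above:

```python
import urllib.request

url = 'http://grlc.io/api/albertmeronyo/my-linked-data-api/myquery'
req = urllib.request.Request(url, headers={'Accept': 'text/csv'})

# urllib.request.urlopen(req) would execute the query and return the
# results as CSV; changing the Accept header changes the result format.
print(req.get_header('Accept'))  # text/csv
```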

Using grlc from Python

Since grlc is itself written in Python, it is easy to use it from Python code. Here we show how to use grlc to run a SPARQL query stored on GitHub and load the results into a pandas data frame:

import pandas as pd
from io import StringIO

import grlc
import grlc.utils as utils

# Query named 'description' in the CLARIAH/grlc-queries repository
user = 'CLARIAH'
repo = 'grlc-queries'
query_name = 'description'
acceptHeader = 'text/csv'

# Execute the query; returns the raw response body, HTTP status code and headers
data, code, headers = utils.dispatch_query(user, repo, query_name, acceptHeader=acceptHeader)

# Parse the CSV response into a pandas data frame
data_grlc = pd.read_csv(StringIO(data))
data_grlc.head(10)

A full example can be seen here.

Linked Data Fragments compatibility

Besides standard SPARQL endpoints, grlc is compatible with Linked Data Fragments (LDF) servers. TPF (Triple Pattern Fragment) queries that are understood by LDF servers work a bit differently from SPARQL queries in grlc:

  1. The endpoint decorator works exactly the same; the only difference is that the provided URL needs to be an LDF-compatible API instead of a SPARQL endpoint. For example:
#+ endpoint: http://fragments.dbpedia.org/2016-04/en
  2. Files with TPF queries inside of them (instead of SPARQL queries) must have the .tpf file extension.
  3. Most basic query decorators from the SPARQL version will work, but the query follows a different syntax, as it needs to provide values for the subject, predicate, and object parameters of the TPF. The following example is a valid TPF query (empty parameters work like SPARQL variables, i.e. any value is a valid binding):
subject=http://dbpedia.org/resource/Led_Zeppelin
predicate=
object=

A full-blown LDF example, then, could be a file in your repository named fragments.tpf with the following contents:

#+ endpoint: http://fragments.dbpedia.org/2016-04/en
#+ summary: Led Zeppelin dbpedia triples
#+ tags:
#+   - Music bands

subject=http://dbpedia.org/resource/Led_Zeppelin
predicate=
object=
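Under the hood, a triple pattern fragment is requested with an ordinary URL whose query string carries the subject/predicate/object parameters, empty values acting as wildcards. A sketch of how such a fragment URL could be assembled (the variable names are ours; this only builds the URL, it doesn't fetch it):

```python
from urllib.parse import urlencode

endpoint = 'http://fragments.dbpedia.org/2016-04/en'
pattern = {
    'subject': 'http://dbpedia.org/resource/Led_Zeppelin',
    'predicate': '',   # empty = match any predicate
    'object': '',      # empty = match any object
}
url = endpoint + '?' + urlencode(pattern)
print(url)
```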

Further steps

There's much more grlc can do for you in building APIs for Linked Data resources. Try repeating steps 3 to 5 to add as many queries to your API as you want. Please refer to the pages on parameter mapping and query decorators to fully leverage all grlc API capabilities!