
Connecting to SPARQL

Alexey Morozov edited this page May 30, 2017 · 6 revisions

Introduction

Although Ontodia can use almost any graph source to navigate through data and build diagrams, RDF graphs are what Ontodia was initially designed for.

To work with RDF, Ontodia needs to connect to a SPARQL endpoint that provides access to the data. Ontodia issues SPARQL queries to extract the data schema (classes and property types) and the data itself (elements and links).

Ontodia uses the concept of a Data Provider to connect to data sources; SparqlDataProvider is the one built specifically to connect to RDF/SPARQL data sources.

Connecting to SPARQL endpoint in demos

Ontodia demos require the URL of the SPARQL endpoint to be set in the SPARQL_ENDPOINT environment variable. This must be done before starting the development server (see Quick start or Embedding examples).

The default SparqlDataProvider settings work only with small datasets (up to ~50 000 triples): the data provider gathers statistics about class usage, and collecting this information on large datasets puts excessive load on the SPARQL server and results in timeouts. In addition, the default full-text search traverses all of the data, which is impractical on large datasets.

If you wish to browse large datasets, such as DBPedia, Wikidata or any other dataset with a large number of triples, please read the "SparqlDataProvider options and settings" section carefully.

Ontodia, SPARQL and web browser security

Ontodia is a JavaScript library and all of its code is executed in a web browser. Since Ontodia accesses remote SPARQL endpoints from a web browser, it is subject to the security restrictions of modern browsers. Calls to web servers other than the one Ontodia was loaded from are forbidden by default by the browser's same-origin policy.

There are two ways to work around this limitation:

  1. Proxy all SPARQL requests through the server Ontodia is served from (this is what the Ontodia development server and all embedding examples do, and the approach is also used on ontodia.org).
  2. Use cross-origin resource sharing (CORS) to allow such cross-domain requests.

This problem is not unique to Ontodia: all JavaScript developers calling third-party services encounter it, and there is plenty of documentation on how to enable CORS on major web servers or how to proxy requests.

How to deal with this problem depends on your environment, and more than one option may be applicable.

Some examples:

  • A SPARQL endpoint may have CORS enabled, in which case you can pass the endpoint URL directly to SparqlDataProvider. This is the case for DBPedia, and Ontodia's DBPedia example makes use of it (see src/examples/dbpedia.ts).
  • The Ontodia development server proxies requests, forwarding them from /sparql-endpoint to the URL specified in the SPARQL_ENDPOINT environment variable (see the devServer property in webpack.demo.config.js).
  • The JSFiddle example of Ontodia uses a CORS-enabled SPARQL endpoint at https://library-ontodia-org.herokuapp.com/sparql
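
As a rough illustration of the proxying approach, the sketch below builds a devServer section that forwards /sparql-endpoint to the real endpoint. It is not a copy of webpack.demo.config.js: the helper name buildDevServerConfig and the two interfaces are hypothetical, and the sketch assumes the object-style proxy option of webpack-dev-server with its target, pathRewrite and changeOrigin fields.

```typescript
// Hypothetical shape of one proxy rule (subset of the proxy options
// that webpack-dev-server passes to its proxy middleware).
interface ProxyRule {
    target: string;
    pathRewrite: { [pattern: string]: string };
    changeOrigin: boolean;
}

interface DevServerConfig {
    proxy: { [path: string]: ProxyRule };
}

// Hypothetical helper: builds a devServer section that forwards
// /sparql-endpoint to the URL taken from SPARQL_ENDPOINT.
function buildDevServerConfig(sparqlEndpoint: string | undefined): DevServerConfig {
    if (!sparqlEndpoint) {
        throw new Error('SPARQL_ENDPOINT must be set before starting the dev server');
    }
    return {
        proxy: {
            '/sparql-endpoint': {
                target: sparqlEndpoint,
                // Strip the local prefix so the endpoint receives its own path.
                pathRewrite: { '^/sparql-endpoint': '' },
                // Send the target's host name in the Host header.
                changeOrigin: true,
            },
        },
    };
}
```

In a webpack configuration the result would be assigned to the devServer property, with process.env.SPARQL_ENDPOINT passed as the argument. The browser then only ever talks to its own origin, so the same-origin policy is not triggered.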

SparqlDataProvider options and settings

SparqlDataProvider has two things to configure: options and settings. Options were designed to be runtime-focused parameters, while settings are those that won't be changed very often. We understand that this separation may be confusing, and it will most likely evolve over time.

SparqlDataProvider has the following options:

option            | description
endpointUrl       | SPARQL endpoint URL to use
imagePropertyUris | properties to use as image URLs
prepareImages     | function to extract image URLs from an element model
queryMethod       | whether to use GET (more compatible, e.g. with Virtuoso, but error-prone due to large request URLs) or POST (less compatible, but better on large datasets)
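
To make the queryMethod trade-off concrete, here is a minimal sketch of how a SPARQL query can be sent with GET or POST according to the SPARQL 1.1 Protocol. It is not Ontodia's implementation: the function buildSparqlRequest, the QueryMethod type and the SparqlRequest interface are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for the two values of the queryMethod option.
type QueryMethod = 'GET' | 'POST';

interface SparqlRequest {
    url: string;
    method: QueryMethod;
    headers: { [name: string]: string };
    body?: string;
}

// Builds the pieces of an HTTP request for a SPARQL query.
function buildSparqlRequest(
    endpointUrl: string,
    query: string,
    method: QueryMethod
): SparqlRequest {
    const encoded = 'query=' + encodeURIComponent(query);
    const accept = 'application/sparql-results+json';
    if (method === 'GET') {
        // The whole query travels in the URL, so long queries give long URLs.
        const separator = endpointUrl.indexOf('?') >= 0 ? '&' : '?';
        return {
            url: endpointUrl + separator + encoded,
            method,
            headers: { 'Accept': accept },
        };
    }
    // With POST the query moves to the request body and the URL stays short.
    return {
        url: endpointUrl,
        method,
        headers: {
            'Accept': accept,
            'Content-Type': 'application/x-www-form-urlencoded',
        },
        body: encoded,
    };
}
```

With GET the URL grows with the query text, which is where the risk of oversized request URLs comes from; with POST only the body grows.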

SparqlDataProvider settings let you tune Ontodia to a particular type of schema or ontology definition (RDFS/OWL and the Wikidata ontology, for example, use different properties for subclass and instance-of relations). Settings can also define how to use the full-text search functions of a particular triplestore.

There are a number of pre-configured sets of settings:

  • OWLStatsSettings - configures Ontodia to work with datasets expressed in RDFS/OWL, with class usage statistics enabled. This is the default set that Ontodia uses if you don't specify one.
  • OWLRDFSSettings - same as above, but with class statistics turned off. Suitable for large datasets whose triplestore has no full-text search function.
  • WikidataSettings - settings for Wikidata, using Blazegraph's full-text search.
  • DBPediaSettings - settings for DBPedia, filtering out YAGO classes and using Virtuoso's full-text search.
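
The guidance in the list above can be summarised as a small decision function. This is only a sketch: DatasetProfile and choosePreset are hypothetical names, the function returns the name of a preset rather than the preset object itself, and the "large" flag stands for the rough ~50 000 triples threshold mentioned earlier.

```typescript
// Names of the pre-configured settings sets listed above.
type SettingsPreset =
    | 'OWLStatsSettings'
    | 'OWLRDFSSettings'
    | 'WikidataSettings'
    | 'DBPediaSettings';

// Hypothetical description of the dataset you want to browse.
interface DatasetProfile {
    kind: 'wikidata' | 'dbpedia' | 'rdfs-owl';
    large: boolean; // roughly: more than ~50 000 triples
}

// Picks the preset name that the list above suggests for a dataset.
function choosePreset(profile: DatasetProfile): SettingsPreset {
    switch (profile.kind) {
        case 'wikidata':
            return 'WikidataSettings';
        case 'dbpedia':
            return 'DBPediaSettings';
        default:
            // Class usage statistics are too expensive on large datasets.
            return profile.large ? 'OWLRDFSSettings' : 'OWLStatsSettings';
    }
}
```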

For more information on the available settings, please consult the extensive JSDoc in the sources (sparqlDataProviderSettings.ts) and the way the settings are used in the examples (src/examples).