
fastersparql

Faster SPARQL client.

This is a Java library for writing clients to SPARQL endpoints that support HTTP 1.1 Transfer-Encoding: chunked. When a SPARQL endpoint supports this, it can send the first result of a query ASAP, and a fastersparql client can likewise process that result ASAP, without waiting for the server's engine to enumerate all rows or for the whole results serialization to finish downloading before parsing starts.

Although this is a client library, the "faster" in fastersparql mostly depends on the SPARQL server's internal implementation. Most SPARQL endpoints will first enumerate all results and only then serialize them into the HTTP connection.

Quickstart

You can use the BOM and include other modules as discussed below, or directly add the netty module to your Maven pom.xml:

<dependency>
  <groupId>com.github.alexishuf.fastersparql</groupId>
  <artifactId>fastersparql-netty</artifactId>
  <version>1.0.0-SNAPSHOT</version>
</dependency>

Then create a SparqlClient for an endpoint. In this example, the endpoint only supports the JSON and TSV result serializations and the GET query method:

SparqlClient<String[], byte[]> client = FasterSparql.clientFor("json,tsv,get@http://example.org/sparql");

Side notes:

The json,tsv,get@ prefix can be omitted for endpoints that implement all standard serializations and query methods.

Each SparqlClient has a type for rows (String[] above) and fragments (byte[] above). Rows are produced by SELECT and ASK queries, while DESCRIBE and CONSTRUCT queries produce graph serialization fragments.

To change the row and fragment types, provide a RowParser or FragmentParser to FasterSparql.clientFor.

SparqlClient is AutoCloseable, so consider wrapping its use in a try-with-resources block.
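A minimal sketch of that pattern (the endpoint URL is only illustrative):

try (SparqlClient<String[], byte[]> client =
         FasterSparql.clientFor("json,tsv,get@http://example.org/sparql")) {
    // issue queries with client.query(...); the client and its connections
    // are released automatically when this block exits
}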

With a client, send some queries:

Results<String[]> results = client.query(sparqlQuery);
printVars(results.vars()); //List<String> with result variable names
consumeResults(results.publisher());

Results are delivered ASAP through a Reactive Streams Publisher. Real applications will likely use a higher-level library such as Reactor or Mutiny, which provide convenient wrappers around a Publisher.
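If you want to consume the Publisher directly, a minimal subscriber looks roughly like the sketch below; it assumes the org.reactivestreams API (adapt the imports if the project exposes java.util.concurrent.Flow instead):

import org.reactivestreams.Subscriber;
import org.reactivestreams.Subscription;

results.publisher().subscribe(new Subscriber<String[]>() {
    @Override public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // request everything: no backpressure in this sketch
    }
    @Override public void onNext(String[] row) {
        System.out.println(String.join("\t", row)); // one row of N-Triples terms
    }
    @Override public void onError(Throwable cause) {
        cause.printStackTrace(); // the query failed or the connection dropped
    }
    @Override public void onComplete() {
        System.out.println("all rows received");
    }
});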

If reactive is not your thing, do this:

IterableAdapter<String[]> iterable = new IterableAdapter<>(results.publisher());
for (String[] row : iterable) {
    // Consume each row of N-Triples RDF terms. The i-th element of a row 
    // corresponds to the i-th variable in results.vars()
}
if (iterable.hasError()) { // check for errors, if you care
    throw iterable.error(); 
}

Modules and BOM

To avoid dependency conflicts, fastersparql is split into the following modules:

  • fastersparql-client: the core client API (SparqlClient and the FasterSparql facade)
  • fastersparql-netty: provides an implementation of SparqlClient on top of netty
  • fastersparql-operators: implementations of SPARQL algebra operators (Join, Filter, Union, etc.). Use this to implement a SPARQL mediator or simply to combine the results of two SPARQL queries against one or more endpoints
  • fastersparql-operators-jena: if you need filters with boolean expressions (i.e., not FILTER EXISTS/FILTER NOT EXISTS), this module wraps the Jena implementation. If you prefer RDF4J, use it as inspiration when sending a PR for fastersparql-operators-rdf4j!
  • fastersparql-bom: a Bill Of Materials for keeping versions of the modules in sync
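If you use more than one module, the BOM can be imported in dependencyManagement so that all fastersparql versions stay in sync. The snippet below is the standard Maven BOM import pattern (the version is illustrative):

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.github.alexishuf.fastersparql</groupId>
      <artifactId>fastersparql-bom</artifactId>
      <version>1.0.0-SNAPSHOT</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

With the BOM imported, individual fastersparql dependencies can omit their <version> element.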

Extensibility

fastersparql uses the Java SPI (Service Provider Interface), a.k.a. ServiceLoader, to locate implementations. To add a new SparqlClient implementation, implement a SparqlClientFactory and expose it on the classpath via META-INF/services. To provide implementations for SPARQL algebra operators, read this section.
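As a rough sketch of the registration side (the class names and the resource-path placeholder below are illustrative, not the library's actual package layout):

// src/main/java/com/example/MySparqlClientFactory.java
public class MySparqlClientFactory implements SparqlClientFactory {
    // implement the methods declared by SparqlClientFactory here
}

// src/main/resources/META-INF/services/<fully-qualified name of SparqlClientFactory>
// the file contains a single line naming your implementation class:
com.example.MySparqlClientFactory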

RowParser and FragmentParser are an exception to the SPI approach, since you provide them directly to the FasterSparql facade. Simply create new implementations if the bundled ones are not enough. PRs contributing your implementations will be appreciated.
