NOTE: THIS IS A DEVELOPMENT VERSION (WORK IN PROGRESS)
mineral, loosely for min (function) enhanced recursion through accumulation is a custom SPARQL function for optimized recursion within SPARQL with aggregates, inspired by the work by Prof. Carlo Zaniolo on Fixpoint semantics and optimization of recursive Datalog programs with aggregates and work by Prof. Maurizio Atzori on Computing recursive SPARQL queries (see on github).
We developed the following function (the wfn
prefix stands for <http://webofcode.org/wfn/>
):
wfn:mr([?acc,] ?query [, ?i0, ?i1, ...])
that can be used with different optimization strategies (configurable server-side):
none
- no optimization used to cut off search space (may not end up in case of cycles in the graph)fast
- accumulator-based recursion inspired by Zaniolo's technique is used to cut off unnecessary branches in the search space including loops (the first parameter,?acc
, must be provided and used as an accumulator to be minimized, see below)auto
- automatically chose between none and fast by detecting if an accumulator has been provided to themr
function (default)
In both versions, caching (memoization) of partial results can be enabled to cut off loops (circles in the graph) and increase speed on already-seen states in the search space.
- ensure you have OpenJDK 8 or later correctly installed
- download and extract Apache Fuseki 2 on directory
./fuseki
- compile Java source code of functions:
./compile
- run fuseki using testing settings:
OPT=auto CACHE=false ./run-fuseki
- fuseki server settings in
config/config.ttl
- initial data in
config/dataset.ttl
- log4j settings in
config/log4j.properties
- optimization is
auto
by default, useOPT=fast
for fast optimization orOPT=none
to disable optimization - caching is disabled by default, use
CACHE=true
to enable memoization of results
- fuseki server settings in
- go to: http://127.0.0.1:3030 and run a recursive query (see examples below)
From Functional Programming: Accumulators on medium.com:
An accumulator is an additional argument added to a function. As the function that has an accumulator is continually called upon, the result of the computation so far is stored in the accumulator. After the recursion is done, the value of the accumulator itself is often returned unless further manipulation of the data is needed.
(see accumulators and folds for insights on the accumulator pattern and how it relates to recursion)
With mineral
, in order to use the fast
optimization, you must use the first parameter ?acc
of the recursive function :mr
as an accumulator, that is, a monotone non-decreasing value over nested calls. Monotonicity must hold according to ORDER BY ?acc
, that is, according to natural order in SPARQL.
In the following a list of examples computing with recursive function both without an accumulator (fast
optimization strategy cannot be used) and with an accumulator (all optimizations including fast
can be used).
In the following it is shown how to compute the factorial of 3.
Simple, inline (without accumulator):
PREFIX : <http://webofcode.org/wfn/>
SELECT ?result {
BIND(:mr("BIND ( IF(?i0 <= 0, 1, ?i0 * :mr(?query, ?i0-1)) AS ?result)", 3) AS ?result)
}
or alternatively (more verbose but clearer):
PREFIX : <http://webofcode.org/wfn/>
SELECT ?result {
# bind variables to parameter values
VALUES ?query { "BIND ( IF(?i0 <= 0, 1, ?i0 * :mr(?query, ?i0-1)) AS ?result)" }
# actual call of the recursive query
BIND(:mr(?query, 3) AS ?result)
}
With support for accumulator (parameter ?acc
is an accumulator):
PREFIX : <http://webofcode.org/wfn/>
SELECT ?result {
BIND(:mr(1, "BIND ( IF(?i0 <= 0, ?acc, :mr(?acc * ?i0, ?query, ?i0-1)) AS ?result)", 3) AS ?result)
}
or verbosely:
PREFIX : <http://webofcode.org/wfn/>
SELECT ?result {
# bind variables to parameter values
VALUES ?query { "BIND ( IF(?i0 <= 0, ?acc, :mr(?acc * ?i0, ?query, ?i0-1)) AS ?result)" }
# actual call of the recursive query
BIND(:mr(1, ?query, 3) AS ?result)
}
Compute the fibonacci number of 6.
Without accumulator:
PREFIX : <http://webofcode.org/wfn/>
SELECT ?result {
# bind variables to parameter values
VALUES ?query {
"BIND ( IF(?i0 <= 2, 1, :mr(?query, ?i0 -1) + :mr(?query, ?i0 -2) ) AS ?result)" }
# actual call of the recursive query
BIND(:mr(?query, 6) AS ?result)
}
With accumulator:
# for explanation, see: https://marcaube.ca/2016/03/optimizing-a-fibonacci-function
# (?i0=acc, ?i1=prev, ?i2=n)
PREFIX : <http://webofcode.org/wfn/>
SELECT ?result {
# bind variables to parameter values
VALUES ?query {
"BIND ( IF(?i1 <= 0, ?acc, :mr(?acc + ?i0, ?query, ?acc, ?i1 -1 ) ) AS ?result)" }
# actual call of the recursive query
BIND(:mr(0, ?query, 1, 6) AS ?result)
}
In the following it is shown how to compute the shortest distance between two nodes (dbo:Village and dbo:Location and ).
Without accumulator:
PREFIX : <http://webofcode.org/wfn/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?result {
# bind variables to parameter values
VALUES ?query {
"OPTIONAL { ?i0 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?next } BIND( IF(?i0 = ?i1, 0 , IF(bound(?next), 1 + wfn:mr(?query, ?next, ?i1), ?a)) AS ?result)" }
# actual call of the recursive query
BIND( :mr(?query, dbo:Village, dbo:Location) AS ?result)
}
Note that looking for the maxumum distance (max
instead of min
) is also possible by using negative numbers:
PREFIX : <http://webofcode.org/wfn/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?result {
# bind variables to parameter values
VALUES ?query {
"OPTIONAL { ?i0 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?next } BIND( IF(?i0 = ?i1, 0 , IF(bound(?next), -1 + wfn:mr(?query, ?next, ?i1), ?a)) AS ?result)" }
# actual call of the recursive query
BIND( -1 * :mr(?query, dbo:Village, dbo:Location) AS ?result)
}
Shortest distance with accumulator:
PREFIX : <http://webofcode.org/wfn/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?result {
# bind variables to parameter values
VALUES ?query { """
OPTIONAL { ?i0 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?next }
BIND( IF(?i0 = ?i1, ?acc, :mr(?acc+1, ?query, ?next, ?i1)) AS ?result)
""" }
# actual call of the recursive query
BIND( :mr(0, ?query, dbo:Village, dbo:Location) AS ?result)
}