Reasonning on RDF with ASP and Python. (not dinopython)
Grasp uses ASP for querying RDF data, allowing final user to code complex queries using a dedicated language. For optimisation, it can also perform SPARQL query. (see below)
There is three main entry point in grasp:
- convert: convert local RDF data in ASP database
- retrieve: retrieve RDF data from an RDF database through a SPARQL query and put it in local ASP database
- query: perform an ASP query on a local ASP database
Grasp is a tool dedicated to reasonning on linked data in ASP, a language dedicated to logical programming (as prolog, well-known to be used by Watson and Robonaut for databases querying).
Interest of ASP over SPARQL are:
- usage of complex and configurable heuristics
- native support of choice of subsets
- optimisations of solutions scoring
The used implementation of ASP is the Potsdam Answer Set Solving Collection.
ASP, oppositely to SPARQL, don't have a complex type systems: all values are string or integer. There is no builtin operation on string. Builtin operations on integers are the commons mathematic operators.
The consequence is that many information, through ASP, is lost about types. With some treatments, some information can be saved, used in ASP and restored after query, but not much.
On bioportal, the ENVO ontology provide a file in xrdf format. (2014-09-16 release)
The data in file named owlapi.xrdf
is used as primary source of data in examples.
The goa data (data/goa_ecoli-20150528-goUniprotUri.ttl
) was generated from GOA EBI, with the help of the script liGoa2rdf.py
.
pip install -r requirements
Used implementation of ASP is gringo/clasp from Potassco labs.
Some problems could appear when installing the pyasp
package ; best solution is to follow the PowerGrASP installing guide.
make convert
make query
Give a look to the Makefile
, and to the grasp/__main__.py
.
All the crunsh about package and python starts here.
grasp/solving.py
file is just the one that call the ASP solver on your data and your query.
You can combine multiple queries and play with clingo/gringo/clasp arguments through the sources.
From the ASP side, the query is defined in queries/test.lp
. This location is sent as package parameter. (see Makefile
)
Because the ASP part needs complex treatment on the data (including grounding), it is expectable that, on big data, the ASP query will just explode in RAM (and, once swapped, in time).
An interesting solution is to use a sparql query to define exactly which triplets really needs ASP.
This is perform with the retrieve
API:
python -m grasp retrieve http://dbpedia.org/sparql mysparqlquery.rq aspdatabase.lp
Note that the SELECT clause of the SPARQL query is used for data retrieving:
SELECT ?a ?b ?c ?d
WHERE {
...
}
Any tuple of a, b, c and d elements yielded by the sparql query will be pushed under the name entry
in the ASP database.
Its also possible to define the predicate name used for data with a constant value:
python -m grasp retrieve http://dbpedia.org/sparql mysparqlquery.rq aspdatabase.lp --atom-name='data'
or a python file:
python -m grasp retrieve http://dbpedia.org/sparql mysparqlquery.rq aspdatabase.lp --atom-name=atoms.py
where the python file atoms.py contains the following lines:
def atom_name(payload):
"""Return the atom name, function to the data found in payload"""
if payload[0] == 'gene':
return 'gene_entry'
else:
return 'entry'
In the API, this is performed as follow:
import grasp
from atoms import atom_name
grasp.retrieve('http://dbpedia.org/sparql', './mysparqlquery.rq', './aspdatabase.lp',
atom_name=atom_name)
CLI and API can perform almost the same treatments ; the latter is more complete.
For help and options on the package:
python -m grasp --help
or
make help
or
import grasp
help(grasp)
help(grasp.convert)
help(grasp.retrieve)
help(grasp.query)
GPL