johnstonskj/rdftools

RDF Tools

CLI tools to validate, convert, and query RDF files, including with SPARQL.

The functionality is provided by RDFLib. RDFLib ships its own set of commands, but those provided here are somewhat more extensive and are built on a common command framework that can easily be extended to cover more cases.

Installation

Python developers will know other ways, but for everyone else - as long as you have git and pipx installed - you can install like so (on Linux and macOS):

mkdir -p ~/src
cd ~/src
git clone "https://github.com/johnstonskj/rdftools.git"
cd rdftools
pipx install .

Usage

The tooling uses a common starting command - rdf - that then executes sub-commands. As expected, the command has a help function and lists the supported sub-commands as positional arguments. These sub-commands also have their own help.

$ rdf -h
usage: rdf [-h] [-v] {validate,convert,select,shell,query} ...

RDF tool

positional arguments:
  {validate,convert,select,shell,query}
  subargs

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose

The currently supported sub-commands are as follows.

  • convert - convert files between different RDF representations (N-Triples, Notation3, RDF/XML, ...).
  • query - execute SPARQL queries over RDF files.
  • select - simple projections from RDF files.
  • shell - run an interactive shell session.
  • validate - validate an RDF file.

An example, running a SPARQL query over a downloaded file:

$ rdf query \
    -i ~/social.n3 \
    -r n3 \
    -q "SELECT DISTINCT ?person ?topic
      WHERE {
        ?person <http://example.org/social/relationship/1.0/likes>
        ?topic.
      }"
person                                         topic
============================================== =============================================
http://amazon.com/cprm/customers/1.0/Alice     http://amazon.com/cprm/entities/1.0/Diving
http://amazon.com/cprm/customers/1.0/Bob       http://amazon.com/cprm/entities/1.0/Diving
http://amazon.com/cprm/customers/1.0/Alice     http://amazon.com/cprm/entities/1.0/Shoes
3 rows returned in 1.629622 seconds.

Debugging

The -v parameter, given either to rdf or to one of the sub-commands, controls the standard Python logging level. It can be repeated to increase the logging verbosity: -v for warnings, -vv for informational messages, -vvv for debug output.
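
The mapping from repeated -v flags to logging levels can be sketched with the standard library alone. This is a minimal illustration, not the rdftools implementation: the helper name level_for and the choice of ERROR as the level when no -v is given are assumptions.

```python
import argparse
import logging

# Assumed mapping, following the description above; the level used when
# no -v is given (ERROR here) is a guess, not taken from rdftools.
LEVELS = {0: logging.ERROR, 1: logging.WARNING, 2: logging.INFO}


def level_for(verbosity):
    """Translate a count of -v flags into a logging level."""
    # Three or more flags all map to DEBUG.
    return LEVELS.get(verbosity, logging.DEBUG)


parser = argparse.ArgumentParser(prog='rdf')
# action='count' makes argparse tally each occurrence of -v.
parser.add_argument('-v', '--verbose', action='count', default=0)

args = parser.parse_args(['-vv'])
logging.basicConfig(level=level_for(args.verbose))
print(logging.getLevelName(level_for(args.verbose)))  # prints INFO
```

Counting the flag, rather than taking a level name as a value, keeps the command line short and matches the -v, -vv, -vvv convention described above.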

Interactive Shell

For a more interactive exploration of RDF data, you can run rdf shell, which gives you access to a lot of the same functions as in the separate tools. The shell has a single common graph into which you can load data from external files (and stores in the future), and run SPARQL queries. The shell also has a default initialization file, so commonly used prefixes, common data, etc. can be loaded before you start your session.

$ rdf shell
RDF Shell, v0.1.0.
reading commands from file /Users/simonjo/.rdfshrc
Graph updated with 40 statements.
>

As you might expect, the shell supports a help function and command completion, as well as a persistent history.

Initialization File

The default location for this is ~/.rdfshrc; all commands are read as if you typed them into the shell.

History File

The default location for this is ~/.rdfsh_hist; it will be read at startup and updated on closing the shell.

Extending

New commands are added as modules in the rdftools/scripts folder and have the following structure.

import rdftools

def main():
    (LOG, cmd) = rdftools.startup('Tool description.', add_args=None)

    ...

The add_args parameter is used to add additional command-line arguments to the common argparse structure. The function, if required, takes in a parser object and returns it. The common command line arguments include verbosity, help, and reading files.

def add_args(parser):
    return parser
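
As an illustration of an add_args function that does more than pass the parser through, the sketch below adds one option and returns the parser. The --limit option is hypothetical, and a plain argparse.ArgumentParser stands in for the parser that the common framework would supply.

```python
import argparse


def add_args(parser):
    """Add tool-specific options to the shared parser and return it."""
    # --limit is a hypothetical option, used only for illustration.
    parser.add_argument('--limit', type=int, default=10,
                        help='maximum number of rows to display')
    return parser


# Stand-in for the parser that the common framework would pass in.
parser = add_args(argparse.ArgumentParser(prog='rdf example'))
cmd = parser.parse_args(['--limit', '5'])
print(cmd.limit)  # prints 5
```

The parsed value then appears as an attribute on the Namespace object, alongside the common arguments.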

The results from startup are a standard logger and an argparse Namespace object holding the parsed arguments. The tool can then use the functions read, read_into, read_all, write, and query to perform common operations on RDF files.

Extending the shell is also pretty simple. You add a function of the following form - it always takes a context object first - and its docstring will be used by default as the displayed help for your command. Arguments may be parsed for more structure, and print() is used extensively for user feedback. Note that you must always return the context, whether you updated it or not. The add_command function installs the function into the shell, enabling help and command completion.

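def echo(context, args):
    """ echo text
        Echo back the following text."""
    print(args)
    # Always return the context, whether it was updated or not.
    return context


# Install the command into the shell, enabling help and completion.
add_command(echo)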
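
Parsing arguments for more structure might look like the following sketch. It assumes that args arrives as a single string, as the echo example suggests. The repeat command itself is hypothetical, and a plain dict stands in for the real context object.

```python
import shlex


def repeat(context, args):
    """ repeat count text
        Print the text the given number of times."""
    # Assumption: args is the raw text typed after the command name.
    try:
        parts = shlex.split(args)
        count, words = int(parts[0]), parts[1:]
    except (IndexError, ValueError):
        # Missing count, non-numeric count, or unbalanced quotes.
        print('usage: repeat count text')
        return context
    for _ in range(count):
        print(' '.join(words))
    # Always hand the context back, changed or not.
    return context


# A plain dict stands in for the shell's context object here.
context = {}
result = repeat(context, '2 "hello world"')
```

In the shell itself, the function would be installed with add_command(repeat), exactly as for echo above.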

References