Implement query operation #24

Closed
jamesaoverton opened this issue May 7, 2015 · 9 comments

@jamesaoverton
Member

The query operation should allow arbitrary SPARQL select and update queries to be run against the ontology, saving the results to files. I'm most familiar with Apache Jena, so I plan to use that. Things will be simpler if we load the ontology into the default RDF graph, and don't use named graphs.

We may want to run multiple queries, but we only want to load the ontology into an RDF graph once. The current chaining implementation passes state as an OWLOntology, and I'd like to stick with that simple solution as long as possible. So I propose this command-line interface:

  • --select INPUT OUTPUT (-s) take an input SPARQL file, run the select query, and save to a file; the output format will be determined by the file extension
  • --update INPUT (-u) take an input SPARQL file, run the update query

You can specify these options multiple times. Apache CLI should keep them in the right order. When all queries have been run, we'll load the default RDF graph into an ontology for further processing. Suggestions for the best way to do this are appreciated!
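The ordering requirement above can be sketched outside of Java. This is only an illustration in Python's argparse, not the Apache CLI code ROBOT would use; the `OrderedQuery` action, the `steps` attribute, and the `query-sketch` program name are made up for the example. It shows repeated --select and --update options being collected into one list in the order they were given.

```python
import argparse


class OrderedQuery(argparse.Action):
    """Hypothetical action: record each query option in command-line order."""

    def __call__(self, parser, namespace, values, option_string=None):
        steps = getattr(namespace, "steps", None)
        if steps is None:
            # Create a fresh list per parse so runs do not share state.
            steps = []
            namespace.steps = steps
        # self.dest is "select" or "update"; values holds the file arguments.
        steps.append((self.dest, list(values)))


def build_parser():
    parser = argparse.ArgumentParser(prog="query-sketch")
    parser.add_argument("--input")
    parser.add_argument("--output")
    parser.add_argument("--select", "-s", nargs=2,
                        metavar=("INPUT", "OUTPUT"), action=OrderedQuery)
    parser.add_argument("--update", "-u", nargs=1,
                        metavar="INPUT", action=OrderedQuery)
    # steps stays None when no query options are given.
    parser.set_defaults(steps=None)
    return parser
```

Parsing `--select q1.rq r1.csv --update u1.rq --select q2.rq r2.csv` with this parser leaves the three steps in `args.steps` in that same order, so interleaved selects and updates could be run one after another against a single loaded graph.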

Jena supports these output file formats, and we'll use these file extensions:

  • text .txt
  • CSV .csv
  • TSV .tsv
  • XML .xml
  • JSON .js or .json

Note that the text format is close to some of the table formats accepted by various Markdown parsers: http://pandoc.org/README.html#tables
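A minimal sketch of the extension-to-format lookup described above, written in Python for illustration only (ROBOT would do this in Java against Jena's result formats). The `FORMATS` table and `result_format` function are hypothetical names, and rejecting unknown extensions with an error is an assumption, not something decided in this thread.

```python
from pathlib import Path

# Extension-to-format table taken from the list above.
FORMATS = {
    ".txt": "text",
    ".csv": "CSV",
    ".tsv": "TSV",
    ".xml": "XML",
    ".js": "JSON",
    ".json": "JSON",
}


def result_format(output_path):
    """Pick the result format from the output file's extension."""
    extension = Path(output_path).suffix.lower()
    if extension not in FORMATS:
        # Assumption: unknown extensions are an error rather than a default.
        raise ValueError(f"unsupported result extension: {extension!r}")
    return FORMATS[extension]
```

For example, `result_format("result1.csv")` returns `"CSV"`, and both `.js` and `.json` map to `"JSON"`.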

Example:

robot query --input example.owl \
  --select query1.rq result1.csv \
  --select query2.rq result2.csv \
  --update update1.rq \
  --update update2.rq \
  --output updated.owl
@cmungall
Contributor

cmungall commented May 7, 2015

Should this cover cases where a SPARQL query is used to enforce a build constraint, exiting with a non-zero code if some condition is not met? Or is that best folded into the rest of the ontology unit test framework?

@jamesaoverton
Member Author

Good idea. If it's a binary condition, we could use "ask" instead of "select", with an option like --ask. The disadvantage is that you probably want to report the results that caused the failure, not just the fact that it failed.

I suggest another option: --verify INPUT OUTPUT (or maybe --assert.) If the query returns no results, we continue without writing OUTPUT. If the query returns one or more results, then we write OUTPUT and exit ROBOT with non-zero status.
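The proposed --verify behaviour could look roughly like this. It is a Python sketch of the control flow only: the `verify` function, its arguments, and the CSV output are assumptions for illustration, and the query results are passed in as plain rows rather than coming from Jena.

```python
import csv


def verify(header, rows, output_path):
    """Return an exit status: 0 if the query matched nothing, 1 otherwise.

    OUTPUT is only written when there are results to report.
    """
    rows = list(rows)
    if not rows:
        # No violations: continue without creating the output file.
        return 0
    with open(output_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    # One or more violations: report them and signal failure.
    return 1
```

A caller would pass the return value to `sys.exit`, so a build script stops on the first failed check and still has the offending rows on disk.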

@jamesaoverton jamesaoverton self-assigned this May 11, 2015
@jamesaoverton jamesaoverton modified the milestone: ROBOT for OBI May 11, 2015
@cmungall
Contributor

cmungall commented Jun 3, 2016

For constraints we may prefer something like SHACL, see @balhoff's experiments: https://github.com/balhoff/shacl-tests

As for the main SELECT use case, the simplest way to do this would be to write the OWL to a ttl file, and then run the query via Jena. Kind of hacky... could also try the bridge layer in the OWLAPI?

@balhoff
Contributor

balhoff commented Jun 14, 2016

I would be happy to attempt an implementation of --verify using SHACL. For input the user would provide an RDF file containing SHACL shapes. One issue is that there haven't been official releases of the @TopQuadrant/shacl library, so it is a bit of a moving target. We could host a build in another maven repo I guess.

As @jamesaoverton suggests above, an --ask option could be used for running a SPARQL ASK. It would also be nice to provide --construct.

@cmungall
Contributor

Hmm, I'd like to avoid the need for another repo.

We use code.berkeleybop.org for OWLTools, but we really messed people up when that was down for a few days.

I'm torn between having ROBOT be a simple one-stop-shop for their release pipelines vs keeping things more modular. Maybe we should start with a standalone tool?

@balhoff
Contributor

balhoff commented Jun 14, 2016

That makes sense. I'll clean up the SHACL runner I have now so that we can get some experience with it in release pipelines.

jamesaoverton added a commit that referenced this issue Sep 10, 2016
- load ontology into default graph of Jena Arq DatasetGraph
- run select queries, write to CSV
- add command, include it in the CLI
- add test
@cmungall
Contributor

Meanwhile, for the basic query option, it looks like robot has query, but we're lacking something in examples/ for it.

@jamesaoverton
Member Author

Yes, I haven't properly documented query. I've been using it for a while and I like it.

@cmungall
Contributor

cmungall commented Mar 7, 2017

Closing this as the feature has been implemented; some discussion is continuing here: #150
