-
Notifications
You must be signed in to change notification settings - Fork 558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dataset cleanup - Store API for graph method #309
Conversation
And ignore the fact that some of the old unit tests fail, if we agree that this is ok I'll tidy it up |
Some comments:
The second thing is that I tried to parse an empty graph:
As you see, it's not there, but since parse returns a graph, I expect that it has been added to the Dataset. |
You can use Perhaps adding a The parse problem is interesting, I'll have to look at the internals of the parser... and that means each and every parser.... |
Really? I mean, it DOES return a graph ... |
Hmm - yes that solves to issue when only a single graph is created. But if you parse a trix document with empty graphs? (or trig, if we had parser) - they will create several graphs, but each parser is free to decide whether to use I can do the simple fix now, and maybe clean up the other one when we solve #283 (i started this in a not yet pushed branch) |
At moment, what I have (in my internal stuff) is ds.default_graph_id I also kept the Dataset.DEFAULT as a shorthand that a user can use although, Ivan Urs Holzer wrote:
Ivan Herman |
I forgot about |
Yep. Ivan Urs Holzer wrote:
Ivan Herman |
discussed in #307 Summary of changes: * added methods ```add_graph``` and ```remove_graph``` to the Store API, implemented these for Sleepycat and IOMemory. A flag, ```graph_awareness``` is set on the store if they methods are supported, default implementations will raise an exception. * made the dataset require a store with the ```graph_awareness``` flag set. * removed the graph-state kept in the ```Dataset``` class directly. * removed ```dataset.add_quads```, ```remove_quads``` methods. The ```add/remove``` methods of ```ConjunctiveGraph``` are smart enough to work with triples or quads. * removed the ```dataset.graphs``` method - it now does exactly the same as ```contexts``` * cleaned up a bit more confusion of whether Graph instance or the Graph identifiers are passed to store methods. (#225)
Add graphs when parsing, so also empty graphs are added.
I've made all the tests pass in this branch, and adapted the dawg test and sparql engine a tiny bit. The only remaining question may be if contexts() should return the default graph |
Note: I've also rebased this onto the latest master, if you had a local copy you may have to force pull |
Ihave one wish: Could you add a |
It seems we have more or less agreement I will merge this now - we can sort out the remaining bits in issues of their own! |
Dataset cleanup - Store API for graph method
2013/12/31 RELEASE 4.1 ====================== This is a new minor version RDFLib, which includes a handful of new features: * A TriG parser was added (we already had a serializer) - it is up-to-date wrt. to the newest spec from: http://www.w3.org/TR/trig/ * The Turtle parser was made up to date wrt. to the latest Turtle spec. * Many more tests have been added - RDFLib now has over 2000 (passing!) tests. This is mainly thanks to the NT, Turtle, TriG, NQuads and SPARQL test-suites from W3C. This also included many fixes to the nt and nquad parsers. * ```ConjunctiveGraph``` and ```Dataset``` now support directly adding/removing quads with ```add/addN/remove``` methods. * ```rdfpipe``` command now supports datasets, and reading/writing context sensitive formats. * Optional graph-tracking was added to the Store interface, allowing empty graphs to be tracked for Datasets. The DataSet class also saw a general clean-up, see: RDFLib/rdflib#309 * After long deprecation, ```BackwardCompatibleGraph``` was removed. Minor enhancements/bugs fixed: ------------------------------ * Many code samples in the documentation were fixed thanks to @PuckCh * The new ```IOMemory``` store was optimised a bit * ```SPARQL(Update)Store``` has been made more generic. * MD5 sums were never reinitialized in ```rdflib.compare``` * Correct default value for empty prefix in N3 [#312]RDFLib/rdflib#312 * Fixed tests when running in a non UTF-8 locale [#344]RDFLib/rdflib#344 * Prefix in the original turtle have an impact on SPARQL query resolution [#313]RDFLib/rdflib#313 * Duplicate BNode IDs from N3 Parser [#305]RDFLib/rdflib#305 * Use QNames for TriG graph names [#330]RDFLib/rdflib#330 * \uXXXX escapes in Turtle/N3 were fixed [#335]RDFLib/rdflib#335 * A way to limit the number of triples retrieved from the ```SPARQLStore``` was added [#346]RDFLib/rdflib#346 * Dots in localnames in Turtle [#345]RDFLib/rdflib#345 [#336]RDFLib/rdflib#336 * ```BNode``` as Graph's public ID [#300]RDFLib/rdflib#300 * Introduced ordering of ```QuotedGraphs``` [#291]RDFLib/rdflib#291 2013/05/22 RELEASE 4.0.1 ======================== Following RDFLib tradition, some bugs snuck into the 4.0 release. This is a bug-fixing release: * the new URI validation caused lots of problems, but is nescessary to avoid ''RDF injection'' vulnerabilities. In the spirit of ''be liberal in what you accept, but conservative in what you produce", we moved validation to serialisation time. * the ```rdflib.tools``` package was missing from the ```setup.py``` script, and was therefore not included in the PYPI tarballs. * RDF parser choked on empty namespace URI [#288](RDFLib/rdflib#288) * Parsing from ```sys.stdin``` was broken [#285](RDFLib/rdflib#285) * The new IO store had problems with concurrent modifications if several graphs used the same store [#286](RDFLib/rdflib#286) * Moved HTML5Lib dependency to the recently released 1.0b1 which support python3 2013/05/16 RELEASE 4.0 ====================== This release includes several major changes: * The new SPARQL 1.1 engine (rdflib-sparql) has been included in the core distribution. SPARQL 1.1 queries and updates should work out of the box. * SPARQL paths are exposed as operators on ```URIRefs```, these can then be be used with graph.triples and friends: ```py # List names of friends of Bob: g.triples(( bob, FOAF.knows/FOAF.name , None )) # All super-classes: g.triples(( cls, RDFS.subClassOf * '+', None )) ``` * a new ```graph.update``` method will apply SPARQL update statements * Several RDF 1.1 features are available: * A new ```DataSet``` class * ```XMLLiteral``` and ```HTMLLiterals``` * ```BNode``` (de)skolemization is supported through ```BNode.skolemize```, ```URIRef.de_skolemize```, ```Graph.skolemize``` and ```Graph.de_skolemize``` * Handled of Literal equality was split into lexical comparison (for normal ```==``` operator) and value space (using new ```Node.eq``` methods). This introduces some slight backwards incomaptible changes, but was necessary, as the old version had inconsisten hash and equality methods that could lead the literals not working correctly in dicts/sets. The new way is more in line with how SPARQL 1.1 works. For the full details, see: https://github.com/RDFLib/rdflib/wiki/Literal-reworking * Iterating over ```QueryResults``` will generate ```ResultRow``` objects, these allow access to variable bindings as attributes or as a dict. I.e. ```py for row in graph.query('select ... ') : print row.age, row["name"] ``` * "Slicing" of Graphs and Resources as syntactic sugar: ([#271](RDFLib/rdflib#271)) ```py graph[bob : FOAF.knows/FOAF.name] -> generator over the names of Bobs friends ``` * The ```SPARQLStore``` and ```SPARQLUpdateStore``` are now included in the RDFLib core * The documentation has been given a major overhaul, and examples for most features have been added. Minor Changes: -------------- * String operations on URIRefs return new URIRefs: ([#258](RDFLib/rdflib#258)) ```py >>> URIRef('http://example.org/')+'test rdflib.term.URIRef('http://example.org/test') ``` * Parser/Serializer plugins are also found by mime-type, not just by plugin name: ([#277](RDFLib/rdflib#277)) * ```Namespace``` is no longer a subclass of ```URIRef``` * URIRefs and Literal language tags are validated on construction, avoiding some "RDF-injection" issues ([#266](RDFLib/rdflib#266)) * A new memory store needs much less memory when loading large graphs ([#268](RDFLib/rdflib#268)) * Turtle/N3 serializer now supports the base keyword correctly ([#248](RDFLib/rdflib#248)) * py2exe support was fixed ([#257](RDFLib/rdflib#257)) * Several bugs in the TriG serializer were fixed * Several bugs in the NQuads parser were fixed
Cleaning up Dataset class, adding graph tracking to store API, as
discussed in #307
Summary of changes:
add_graph
andremove_graph
to the StoreAPI, implemented these for Sleepycat and IOMemory. A flag,
graph_awareness
is set on the store if they methods aresupported, default implementations will raise an exception.
graph_awareness
flag set.
Dataset
class directly.dataset.add_quads
,remove_quads
methods. Theadd/remove
methods ofConjunctiveGraph
are smart enoughto work with triples or quads.
dataset.graphs
method - it now does exactly thesame as
contexts
default_union
flag to Graphs,ConjunctiveGraph
has this set to True, andDataset
to FalseGraph identifiers are passed to store methods. (Think about __iadd__, __isub__ etc. for ConjunctiveGraph #225)
Comments:
exist in the store is not supported.
a graph fit into a transaction. It is sort irrelevant, as we do not
have any transaction-aware stores.
Questions:
dataset.quads
returnNone
as context for the triples inthe default graph, this would suggest "no". (currently it does)
get_context
return a Graph, but does not create it.graph
creates the graphs AND will also make a new graphidentified by a skolemized bnode if called without arguments.
graph_awareness
for a Store?The only change would be whether a graph is removed once it is
empty. I guess noone relies on this, as it is not implemented
correctly in
IOMemory
orSleepycat
at the moment :)