Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify confusion around type of context element in ConjunctiveGraphs and context aware stores #167

Open
ghost opened this issue Feb 20, 2012 · 3 comments
Assignees
Labels
concept: RDF dataset Relates to the RDF datasets concept. enhancement New feature or request fix-in-progress id-as-cntxt tracking related issues

Comments

@ghost
Copy link

ghost commented Feb 20, 2012

gromgull, 2011-08-20T07:58:07.000Z

In the ConjunctiveGraph, the context field is assumed to be a graph with the identifier set to the URI of the context, i.e. this is what happens if you create context like this:

g=ConjunctiveGraph()
uri1=URIRef("http://example.org/mygraph1")
uri2=URIRef("http://example.org/mygraph2")

bob = URIRef(u'urn:bob')
likes = URIRef(u'urn:likes')
pizza = URIRef(u'urn:pizza')

g.get_context(uri1).add((bob, likes, pizza))
g.get_context(uri2).add((bob, likes, pizza))

Now g.contexts() returns a generator over some graphs.

Now, for code working on the store level, i.e. serializers and parsers, there should perhaps not be any graph objects?

I came across this when looking at the nquad parser, here:

https://github.com/RDFLib/rdflib/blob/master/rdflib/plugins/parsers/nquads.py#L106

This adds context as simply an URI ref.

I added a "work-around" to make the conjunctivegraph.contexts generator work "correctly" here:

https://github.com/RDFLib/rdflib/blob/master/rdflib/graph.py#L1075

Is this ok?

@gromgull
Copy link
Member

I've semi-fixed this in the last two commits.

If you work with the ConjunctiveGraph class directly:

g = ConjunctiveGraph()
g.get_context('urn:1').add( ( s, p, o ) )

You end up with a Graph object in the store, with identifier set to the URIRef "urn:1".
I dont know saving the whole graph+store in the store is the best way to do it, saving just the identifier seems much cleaner.
AND - I can now mix and match stores in one ConjunctiveGraph - I am not even sure what this could mean:

>>> g1=ConjunctiveGraph()
>>> g2=ConjunctiveGraph()
>>> g1.get_context('urn:a').add( (s, p, o) )
>>> g2.addN( (x,y,z, g1.get_context('urn:a') ) )
>>> g2.store != g2.get_context('urn:a').store 
True

Hmm ...

In [43]: list(g1.contexts())[0]
Out[43]: <Graph identifier=urn:a (<class 'rdflib.graph.Graph'>)>

In [44]: list(list(g1.contexts())[0])
Out[44]: 
[(rdflib.term.BNode('Nae2ff5ac0056427e852347e0a58ff925'),
  rdflib.term.BNode('N7c8cbd571a51409e8a92c2870a30eddd'),
  rdflib.term.BNode('Nbc486ecd9a5b46d5913dafdc4458fc3f'))]

In [45]: list(g1.get_context('urn:a'))
Out[45]: 
[(rdflib.term.URIRef(u'a'),
  rdflib.term.URIRef(u'b'),
  rdflib.term.URIRef(u'c'))]

This is no good :)

@gromgull
Copy link
Member

Something else, whatever we decide - the store.add method should check the type of the context parameter to make sure it is either Graph/Identifier to make sure we avoid very subtle and odd bugs.

@gromgull gromgull added this to the rdflib 5.0.0 milestone Jan 24, 2017
gromgull added a commit that referenced this issue Jan 24, 2017
raise exception when trying to rebind a prefix to another ns.
fix broken rebinding when generating prefixes

This fixes #679 - but actually it's more like a work-around. The
underlying problem is confusion about context and graph objects (#167)
gromgull added a commit that referenced this issue Jan 24, 2017
raise exception when trying to rebind a prefix to another ns.
fix broken rebinding when generating prefixes

This fixes #679 - but actually it's more like a work-around. The
underlying problem is confusion about context and graph objects (#167)
gromgull added a commit that referenced this issue Jan 24, 2017
raise exception when trying to rebind a prefix to another ns.
fix broken rebinding when generating prefixes

This fixes #679 - but actually it's more like a work-around. The
underlying problem is confusion about context and graph objects (#167)
gromgull added a commit that referenced this issue Jan 24, 2017
raise exception when trying to rebind a prefix to another ns.
fix broken rebinding when generating prefixes

This fixes #679 - but actually it's more like a work-around. The
underlying problem is confusion about context and graph objects (#167)
gromgull added a commit that referenced this issue Jan 24, 2017
raise exception when trying to rebind a prefix to another ns.
fix broken rebinding when generating prefixes

This fixes #679 - but actually it's more like a work-around. The
underlying problem is confusion about context and graph objects (#167)
@white-gecko white-gecko modified the milestones: rdflib 5.0.0, rdflib 5.1.0 Apr 6, 2020
@white-gecko white-gecko modified the milestones: rdflib 5.1.0, rdflib 6.0.0 May 1, 2020
@kvjrhall
Copy link
Contributor

kvjrhall commented Jul 4, 2021

Note that this is also observed when using Dataset instances. In particular,

ds = Dataset()
quads = ds.quads((None, None, None, None))   # Fourth term is identifier

store = SPARQLUpdateStore()
store.addN(quads)  # Expects four term to be graph

This seems to have been intended with #409 with the goal of reducing confusion for end users. My personal experience was a large amount of confusion that required me to review the code for both modules in order to identifying what was going wrong.

As a semantic web developer I've been trained to always interpret triple or quad as a tuple of graph nodes. Ideally, I'd hope that the interface match the expectations of a RDF Dataset as closely as possible and behave as a "set of quads" much the way that a graph can be treated as simply a "set of triples".

The documentation for the referenced methods, their argument names, and their method names are all ambiguous and add to the confusion. If a developer has started working with one of the APIs first (i.e., they are exposed to Dataset) then the first API has "deceived" them by setting their expectations incorrectly.

I'd recommend that usage of Graph instances as contexts be phased out for API consistency. In the interim, I'd suggest that the arguments and documentation for those methods be updated to clarify how they differ from other APIs in the system. For example, if migrating away from using graphs as contexts, arguments named quad or quads that expect graph contexts could be renamed deprecated_quad / deprecated_quads / etc. This would serve as a slap-in-the-face notification to new developers that there's something up. I'm sure their are more elegant ways to document this than what I'm suggesting.

mwatts15 added a commit to openworm/owmeta-core that referenced this issue Oct 26, 2021
- Once again adjusting for RDFLib/rdflib#167 not being resolved
@ghost ghost mentioned this issue Dec 14, 2021
@ghost ghost added the id-as-cntxt tracking related issues label Dec 24, 2021
@aucampia aucampia added the concept: RDF dataset Relates to the RDF datasets concept. label May 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
concept: RDF dataset Relates to the RDF datasets concept. enhancement New feature or request fix-in-progress id-as-cntxt tracking related issues
Projects
None yet
4 participants