fix: allow registered prefix use and namespaces which are URIs closes #632 #660
Two things:
A possible fix would be for the compute qname method to loop through all registered prefixes, find the longest prefix match, and then check whether the remainder is a valid localname. That is much more work though. I find those URLs quite ugly - RDF is WEB technology, |
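For illustration, the longest-prefix-match idea could be sketched roughly like this. It is a standalone sketch and not rdflib code: the function name `longest_prefix_qname`, the `obo` prefix, and the simplified ASCII-only localname check are all assumptions made for the example.

```python
import re

# Simplified local-name check (an assumption): the real Turtle PN_LOCAL
# production also allows escapes, ':' and non-ASCII characters.
# An empty remainder is rejected here.
LOCALNAME_RE = re.compile(r"^[A-Za-z0-9_][A-Za-z0-9_\-]*$")


def longest_prefix_qname(uri, namespaces):
    """Return (prefix, namespace, localname) for the longest bound namespace.

    `namespaces` maps prefix -> namespace URI. Returns None when no bound
    namespace leaves a valid local name.
    """
    # Try longer namespaces first so the most specific binding wins.
    by_length = sorted(namespaces.items(), key=lambda kv: len(kv[1]), reverse=True)
    for prefix, ns in by_length:
        if uri.startswith(ns):
            local = uri[len(ns):]
            if LOCALNAME_RE.match(local):
                return prefix, ns, local
    return None


# Hypothetical bindings: 'obo' is made up for this example.
bound = {
    "obo": "http://purl.obolibrary.org/obo/",
    "GENO": "http://purl.obolibrary.org/obo/GENO_",
}
print(longest_prefix_qname("http://purl.obolibrary.org/obo/GENO_0000385", bound))
# -> ('GENO', 'http://purl.obolibrary.org/obo/GENO_', '0000385')
```

The cost is a scan over every bound namespace per URI, which is the "much more work" part; a real implementation would probably want a cache or a trie in front of it.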
#649 is the same discussion |
@gromgull - let me know if we should move the discussion to #649, but i'll respond here for now!
this hack does fix #632, as the cache would contain the full URI. i completely agree that this is a hack and should be replaced with something better. however, if you do move the triples and bind the namespace there, you would regenerate the cache. if you simply copied the namespaces internally without going through a the key here is that the cache generation happens on bind, and that's the point.
the namespace prefix is a serialization concept, not an RDF concept. RDF simply says URIs. turtle/trig, on the other hand, allows prefixes. therefore, if i explicitly bind a prefix to a valid string, rdflib shouldn't parse it down to a prefix, URI, and part. so both of the following prefixes would be completely valid:
@prefix ex: <http://example.org/> .
@prefix ex_foo: <http://example.org/#foo> .
the problem right now is that the prefix generator in rdflib takes a decision that the prefix cannot be:
import rdflib as rl
g = rl.Graph()
g.bind('ex', 'http://example.org/#foo')
g.compute_qname('http://example.org/#foo')
returns
whereas it should have returned:
i'm happy to propose a non-hacked solution if there is agreement that rdflib should keep the prefix as determined by the |
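To make the "cache generation happens on bind" point concrete, here is a minimal sketch. `TinyNamespaceManager` is a hypothetical stand-in written for this example, not rdflib's actual NamespaceManager, and the fallback split and the generated `ns1`-style prefixes are simplifying assumptions.

```python
class TinyNamespaceManager:
    """Hypothetical stand-in for illustration; not rdflib's NamespaceManager."""

    def __init__(self):
        self._cache = {}      # full URI -> (prefix, namespace, localname)
        self._prefixes = {}   # namespace -> prefix
        self._generated = 0   # counter for made-up prefixes

    def bind(self, prefix, namespace):
        self._prefixes[namespace] = prefix
        # Cache the namespace itself at bind time, with an empty local name,
        # so a URI equal to the bound namespace is never split further.
        self._cache[namespace] = (prefix, namespace, "")

    def compute_qname(self, uri):
        if uri in self._cache:
            return self._cache[uri]
        # Fallback (assumption): naive split after the last '#' or '/'.
        cut = max(uri.rfind("#"), uri.rfind("/")) + 1
        namespace, local = uri[:cut], uri[cut:]
        prefix = self._prefixes.get(namespace)
        if prefix is None:
            self._generated += 1
            prefix = "ns%d" % self._generated
            self._prefixes[namespace] = prefix
        result = (prefix, namespace, local)
        self._cache[uri] = result
        return result


nm = TinyNamespaceManager()
nm.bind("ex", "http://example.org/#foo")
print(nm.compute_qname("http://example.org/#foo"))
# -> ('ex', 'http://example.org/#foo', '')
print(nm.compute_qname("http://example.org/#bar"))
# -> ('ns1', 'http://example.org/#', 'bar')
```

In this sketch the explicitly bound URI is returned untouched, while an unbound URI still goes through the usual split.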
From #632:
so this calls this gets put in the cache table. Then when we serialize the graph, rdflib will call Calling compute qname on just the namespace URL like you do, and expecting a tuple with the 3rd part as the empty string, isn't something you would ever do? |
I didn't test it btw - I am just speculating how I remember it works - I could be wrong :) |
@gromgull - i think we are saying the same thing :) the point of disagreement is that i believe the way it works is incorrect. simply, if i so using your example, and to make the turtle file readable to humans, i'm doing
from rdflib import Graph, URIRef
graph = Graph()
graph.bind('GENO', 'http://purl.obolibrary.org/obo/GENO_')
graph.bind('RO_has_phenotype', 'http://purl.obolibrary.org/obo/RO_0002200')
graph.add((URIRef('http://example.org'),
           URIRef('http://purl.obolibrary.org/obo/RO_0002200'),
           URIRef('http://purl.obolibrary.org/obo/GENO_0000385')))
print(graph.serialize(format='turtle'))
i will now see,
@prefix GENO: <http://purl.obolibrary.org/obo/GENO_> .
@prefix RO_has_phenotype: <http://purl.obolibrary.org/obo/RO_0002200> .
@prefix ns1: <http://purl.obolibrary.org/obo/> .
<http://example.org> ns1:RO_0002200 ns1:GENO_0000385 .
i would like to see this (which is valid turtle):
@prefix RO_has_phenotype: <http://purl.obolibrary.org/obo/RO_0002200> .
@prefix GENO: <http://purl.obolibrary.org/obo/GENO_> .
<http://example.org> RO_has_phenotype: GENO:0000385 . |
there are two things:
And I am not totally convinced it should - and I am not sure what will break in other serialisations that rely on the same code, but are not turtle. In any case, it certainly should not make them up by itself, but MAYBE if explicitly instructed with
And my point is that "no", this PR does not solve it, not even in a hacky way? |
is NOT valid turtle. The localname needs to be at least one character long. |
Wait - maybe it IS valid. That is so ugly :) https://www.w3.org/TR/turtle/#grammar-production-PrefixedName |
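For reference, the linked grammar defines `PrefixedName ::= PNAME_LN | PNAME_NS` and `PNAME_NS ::= PN_PREFIX? ':'`, so a prefix followed by a bare colon is a complete name. A rough way to check this by hand is sketched below; the regexes are simplified ASCII-only approximations of the productions (an assumption, with no escapes and no non-ASCII letters), and `is_prefixed_name` is a made-up helper name.

```python
import re

# ASCII-only approximations (assumption) of the Turtle productions:
#   PrefixedName ::= PNAME_LN | PNAME_NS
#   PNAME_NS     ::= PN_PREFIX? ':'
#   PNAME_LN     ::= PNAME_NS PN_LOCAL
PN_PREFIX = r"[A-Za-z](?:[A-Za-z0-9_\-.]*[A-Za-z0-9_\-])?"
PN_LOCAL = r"[A-Za-z0-9_:](?:[A-Za-z0-9_\-.:]*[A-Za-z0-9_\-:])?"
PREFIXED_NAME = re.compile(r"^(?:%s)?:(?:%s)?$" % (PN_PREFIX, PN_LOCAL))


def is_prefixed_name(token):
    """True if `token` matches the simplified PrefixedName pattern."""
    return PREFIXED_NAME.match(token) is not None


for token in ["GENO:0000385", "RO_has_phenotype:", ":", "no_colon"]:
    print(token, is_prefixed_name(token))
# GENO:0000385 True
# RO_has_phenotype: True
# : True
# no_colon False
```

The second case is the one under discussion: an empty local part is accepted because `PNAME_NS` on its own is a valid `PrefixedName`.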
@gromgull - ok now that we agree that it is valid :) i've updated the fix. are there other serializers that use the namespaces like turtle/trig?
graph = Graph()
graph.bind('GENO', 'http://purl.obolibrary.org/obo/GENO_')
graph.bind('RO_has_phenotype', 'http://purl.obolibrary.org/obo/RO_0002200')
graph.add((URIRef('http://example.org'),
           URIRef('http://purl.obolibrary.org/obo/RO_0002200'),
           URIRef('http://purl.obolibrary.org/obo/GENO_0000385')))
print(graph.serialize(format='turtle').decode())
returns:
@prefix GENO: <http://purl.obolibrary.org/obo/GENO_> .
@prefix RO_has_phenotype: <http://purl.obolibrary.org/obo/RO_0002200> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://example.org> RO_has_phenotype: GENO:0000385 . |
will be looking at the N3 failures. |
i found a bunch of other things i need to address. will fix and update. |
cache namespaces during the bind process.
closes #632