Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix handling of URIs in the default namespace #2365

Merged
merged 7 commits into from
May 21, 2023

Conversation

jclerman
Copy link
Contributor

Fixes: #2348
Adds failing test-case for: #2077 (marked xfail for now)

Summary of changes

Addressing two issues with NamespaceManager:

  1. Filed as NamespaceManager.expand_curie() does not handle the default prefix #2348: NamespaceManager.expand_curie() did not support expansion of CURIEs with a leading colon, even though those are valid and should resolve using the default namespace for a graph.
  2. Also addressing the converse case: generation of a CURIE (with leading colon, so that it can be handled by expand_curie() from a URI in the default namespace. The new support is added with a new method, curie() on NamespaceManager, so as not to change (break) existing behavior of NamespaceManager.qname().

Tests are included for both methods, including expected failure-cases for expand_curie().

Checklist

  • Checked that there aren't other open pull requests for
    the same change.
  • Added tests for any changes that have a runtime impact.
  • Checked that all tests and type checking passes.
  • For changes that have a potential impact on users of this project:
    • Updated relevant documentation to avoid inaccuracies.
    • Considered adding additional documentation.
    • Considered adding an example in ./examples for new features.
    • Considered updating our changelog (CHANGELOG.md).
  • Considered granting push permissions to the PR branch,
    so maintainers can fix minor issues and keep your PR up to date.

…ceManager.curie()

 - including tests for both methods, including expected failure-cases for expand_curie().
 - this should fix RDFLib#2348
 - also addresses the converse case from RDFLib#2348, mapping a URI in the default namespace to a CURIE,
   using a new method to avoid a breaking change to qname().
@coveralls
Copy link

coveralls commented Apr 18, 2023

Coverage Status

Coverage: 90.851% (-0.003%) from 90.854% when pulling f994de2 on jclerman:support-default-prefix into e103078 on RDFLib:main.

* Changed the tests to use an in-module fixture instead of the
  `test_ontology.owl` graph and removed the `test_ontology.owl` file. The
  problem with arbitrary test data is that it is hard to evolve. If possible
  test data should be kept as terse as possible. In this case I just created a
  namespace manager with three namespaces added, of which only two will be
  effective because of <RDFLib#2077>.
* Added a warning to `NamespaceManager.curie` to warn users that it is not a
  pure function and explain the specific side effects.
@aucampia
Copy link
Member

aucampia commented May 18, 2023

@jclerman please see if you are okay with 598f967

  • Changed the tests to use an in-module fixture instead of the
    test_ontology.owl graph and removed the test_ontology.owl file. The
    problem with arbitrary test data is that it is hard to evolve. If possible
    test data should be kept as terse as possible. In this case I just created a
    namespace manager with three namespaces added, of which only two will be
    effective because of rdflib 6.2.0 SPARQL parsing regression: multiple prefixes for same IRI not supported? #2077.
  • Added a warning to NamespaceManager.curie to warn users that it is not a
    pure function and explain the specific side effects.

If so then I'm happy to merge it, also happy to make further adjustments, just let me know.

@jclerman
Copy link
Contributor Author

The point about curie() having potential side-effects on the NamespaceManager object is well taken. I think I'd like to update the PR to allow an optional generate parameter on curie(), and pass its value to compute_qname()'s parameter of the same name. My own preference would be to make the default value False - I think that would still make this technically an impure function, but at least the default behavior would be to leave the NamespaceManager unchanged.

Does that default sound OK, or is it too confusing to have the default on curie() be False, but on compute_qname() still be True?

@aucampia
Copy link
Member

aucampia commented May 19, 2023

I think I'd like to update the PR to allow an optional generate parameter on curie(), and pass its value to compute_qname()'s parameter of the same name. My own preference would be to make the default value False - I think that would still make this technically an impure function, but at least the default behavior would be to leave the NamespaceManager unchanged.

A parameter like that would be nice, but I would not say essential. If the namespace manager was a blank slate, I would prefer to default to generate=False, but the existing behaviour of other functions makes it a bit controversial to do that, as they generate by default (i.e. have generate=True as a default). I think having NamespaceManager.curie behave the same as most other methods on NamespaceManager is probably best (as it does at this point in time), even though it does not feel entirely right to default to generate=True.

So, it is up to you if you want to add a generate parameter to this PR, happy if you change this PR to add it. I would though say it should default to generate=True for the reasons above, and if it does, it is also fine to add it later. I would not necessarily say generate=False would be wrong because it would not be, but in context I think generate=True is more correct.

@aucampia aucampia requested a review from a team May 19, 2023 08:13
@aucampia aucampia added the review wanted This indicates that the PR is ready for review label May 19, 2023
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 👍

@aucampia aucampia added ready to merge The PR will be merged soon if no further feedback is provided. awaiting feedback More feedback is needed from the author of the PR or Issue. labels May 19, 2023
@jclerman
Copy link
Contributor Author

All your comments make sense. I think the generate option would be very useful, so I'm adding it in a new commit here, with the default=True for the reason you mentioned (and I agree with everything you said).

I updated the test function to test out the 3 possible settings of the parameter - not passed, True, and False.

I forgot to remove this when I removed the test data.
I added some qualification to the warning and placed booleans in single
backticks.
Copy link
Member

@aucampia aucampia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for fixing this bug and the new functionality.

@aucampia aucampia merged commit f200722 into RDFLib:main May 21, 2023
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting feedback More feedback is needed from the author of the PR or Issue. ready to merge The PR will be merged soon if no further feedback is provided. review wanted This indicates that the PR is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NamespaceManager.expand_curie() does not handle the default prefix
3 participants