Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redundant deregistering of custom functions? #1492

Closed
gtfierro opened this issue Dec 7, 2021 · 5 comments · Fixed by #1659
Closed

Redundant deregistering of custom functions? #1492

gtfierro opened this issue Dec 7, 2021 · 5 comments · Fixed by #1659

Comments

@gtfierro
Copy link
Contributor

gtfierro commented Dec 7, 2021

I'm running into an issue with a complex ontology project where a Javascript-based SPARQL function is being "unregistered" twice and causing the following stack trace when I run my unit tests:

../venv/lib/python3.9/site-packages/pyshacl/validate.py:419: in validate
    conforms, report_graph, report_text = validator.run()
../venv/lib/python3.9/site-packages/pyshacl/validate.py:265: in run
    unapply_functions(advanced['functions'], g)
../venv/lib/python3.9/site-packages/pyshacl/functions/__init__.py:108: in unapply_functions
    f.unapply(data_graph)
../venv/lib/python3.9/site-packages/pyshacl/functions/shacl_function.py:197: in unapply
    unregister_custom_function(self.node, self.execute_from_sparql)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

uri = rdflib.term.URIRef('http://datashapes.org/dash#toString'), func = <bound method SPARQLFunction.execute_from_sparql of <pyshacl.functions.shacl_function.SPARQLFunction object at 0x7f5b80852c40>>

    def unregister_custom_function(uri, func):
        if _CUSTOM_FUNCTIONS.get(uri, (None, None))[0] != func:
>           raise ValueError("This function is not registered as %s" % uri.n3())
E           ValueError: This function is not registered as <http://datashapes.org/dash#toString>

../venv/lib/python3.9/site-packages/rdflib/plugins/sparql/operators.py:616: ValueError

I believe what is happening is the dash graph is getting loaded multiple times because of redundant imports. This loads the dash:toString function multiple times, which is (correctly) idempotent. However, during cleanup (I am using pySHACL's implementation of SHACL-AF), pySHACL attempts to unregister the dash:toString function multiple times. After a function is unregistered the first time, the ValueError in https://github.com/RDFLib/rdflib/blob/master/rdflib/plugins/sparql/operators.py#L614-L617 gets thrown. I believe this should be idempotent as well --- if the function does not exist in the function lookup table, then a warning should be thrown rather than an exception. I could also see pySHACL catching the exception to avoid halting the cleanup.

I am happy to introduce a change PR --- either to RDFlib or pySHACL --- to address this, but I wanted to ask if my intended fix is appropriate. I'll see what I can do about creating a minimal reproducible example

@nicholascar
Copy link
Member

Hi @gtfierro sorry I haven't replied to this earlier: I've been dealing with a large volume of PRs.

If this is an RDFlib thing and fixing it there will solve it in pySHACL too, please do submit an RDFlib PR!

@ashleysommer
Copy link
Contributor

Hi @nicholascar
This issue has had a basic work-around on the PySHACL side, but it does still need to be addressed on the RDFLib side too.

@gtfierro
Copy link
Contributor Author

@nicholascar no worries! In the current design, register_custom_function stores the function in a dictionary, and unregister_custom_function throws an error if a function is unregistered more than once (i.e., if it does not exist in the dictionary at time of unregistration); essentially, register_custom_function is idempotent but unregister_custom_function is not. What happened w/ pySHACL is it used a list to maintain knowledge of all of the registered functions, and then attempted to unregister all of those functions when it was finished: I was working with a graph where the same functions were imported multiple times due to how owl:imports was resolved (using this project).

I can see a couple possible fixes:

  • do we need unregister_custom_function? I don't know RDFlib well enough to know why it would be necessary
  • alter unregister_custom_function to log an error but not raise an exception --- this would make it idempotent:
    • we could use another dictionary to keep track of previously unregistered functions and only avoid throwing the exception when a previously existing function is unregistered more than once; unregistration of non-existent functions would throw an exception as before
  • document this behavior of {un}register_custom_function to avoid misuse of the API
  • redesign the {un}register_custom_function interface so that misuse is very difficult -- I'm not sure what this would look like.

Let me know your thoughts on the above when you have time! I think this is a pretty niche issue and it is solved for my immediate use case in pySHACL so I'm not blocked on it.

@nicholascar
Copy link
Member

@gtfierro I don't actually know when unregister_custom_function could or should be run! As I use custom functions, e.g. in my OWL TIME library, custom functions are registered at module load but there's not ever a reason to un-register them, as they go away when the module's not used.

I think the right thing is to do your second suggestion: alter unregister_custom_function to throw a warning, not an error. That way, it will seamlessly just work as you say, idempotently, and this will require the least effort to change (1 line?)

@gtfierro
Copy link
Contributor Author

gtfierro commented Jan 10, 2022

From what I can tell, pySHACL uses unregister_custom_function to clean up state between processing independent graphs (see this block).

I'll file a PR with the warning fix soon!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants