Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicate functions during gathering #109

Merged
merged 3 commits into from
Dec 7, 2021
Merged

Conversation

gtfierro
Copy link
Contributor

@gtfierro gtfierro commented Dec 7, 2021

Documented in #108, returning a list of functions where certain functions (the same node) are listed twice can cause the issue I've documented in RDFLib/rdflib#1492. This PR uses a dictionary to deduplicate those functions during the gather phase, which avoids the issue.

However, this only keeps one copy of the function. In the provided test case, only the JSFunction object of the function is kept, not the SPARQLFunction object. This may cause issues, but I am not familiar enough with the internals to know what the impact may be.

@ashleysommer
Copy link
Collaborator

@gtfierro
Thanks for your very well written issue report and clean PR.
I like your solution of using a dict to deduplicate based on the subject node.

There is just a small issue flagged by the CI pipeline, regarding the import of "typing.Set". I'll change that to "Dict", and re-run the CI, and it should pass.

@gtfierro
Copy link
Contributor Author

gtfierro commented Dec 7, 2021

Thanks! I made the change for Dict. Hopefully that's correct --- I'm still learning the type annotations

Do you think having only one copy of the function object will be an issue? In my example, both the SHACL and JS versions of the function would be returned and presumably evaluated as part of the pySHACL advanced+JS feature execution. After this change, only the JS version would be returned and evaluated. Do they defer to each other so only one of them is running anyway?

If that is an issue, then the fix is probably better done in RDFlib to make removing the function idempotent.

@ashleysommer
Copy link
Collaborator

Yes, I think only having one copy of the fn might be an issue in some cases. I never tested any SHACL Functions that are both SPARQLFunction and JSFunction at the same time, so I don't know what the implications of that are.
But this change is a good first step, and eliminates the duplicate deregistering issue you've identified, so happy to merge this, then look into potential further changes after that.

@ashleysommer ashleysommer merged commit 815fb12 into RDFLib:master Dec 7, 2021
@ashleysommer
Copy link
Collaborator

@gtfierro This fix is released in latest PySHACL v0.17.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants