Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add PythonInputSource to create py-based graphs #1463

Merged
merged 9 commits into from Dec 1, 2021

Conversation

joshmoore
Copy link
Contributor

In order to manipulate a JSON-LD structure before creating a graph (specifically to simplify rdf:Seq handling), it is
currently necessary to use json.loads followed by dumps and then let Graph().parse() re-load. By detecting dict
instances and creating a PythonInputSource, a single call to loads suffices.

Proposed Changes

  • add rdflib.parser.PythonInputSource
  • return instances from create_input_source
  • detect instances in rdflib.plugins.shares.jsonld.util.source_to_json

In order to manipulate a JSON-LD structure before creating
a graph (specifically to simplify rdf:Seq handling), it is
currently necessary to use `json.loads` followed by `dumps`
and then let `Graph().parse()` re-load. By detecting `dict`
instances and creating a `PythonInputSource`, a single call
to `loads` suffices.
Copy link
Member

@nicholascar nicholascar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @joshmoore thanks for looking out for more efficiencies here: we are really keen to improve rdflib's JSON-LD performance!

Two points:

  1. can you please add PythonInputSource to parser.py's _all_variable?
  2. can you please add an equivalency test - an assert - to your test? You produce both "Current workaround" & "Desired behavior" output which, I'm sure, are the same, but your test doesn't actually test for that, so automated test running won't see the equivalent behaviour

@joshmoore
Copy link
Contributor Author

Thanks for the feedback, @nicholascar. Glad to hear there's interest. I'll tackle the above as well as put some docs into place and then ping you.

@joshmoore
Copy link
Contributor Author

@nicholascar: changes pushed. Not sure why this failure appeared:

	nosetests: error: no such option: --with-timer

@nicholascar
Copy link
Member

Not sure why this failure appeared:

Probably because between you submitting the original PR and the testing now we've changed the tests in RDFlib from nose to pytest, so something's not right there.

I'll have a look tomorrow to see if I can work it out.

BUT, I've also come up with another request for you, sorry! You say in your PythonInputSource:

"Constructs an RDFLib Parser InputSource from a Python data structure loaded from JSON with json.load or json.loads."

But the "from JSON" bit's not strictly necessary. I could load your PythonInputSource with a native Python dict, data from YAML or any other source I cared to parse into a Python object, you just happen to have used JSON. So I think the class documentation needs to be made more generic.

I could make this change but I just want to test the assumption with you first.

@joshmoore
Copy link
Contributor Author

I could make this change but I just want to test the assumption with you first.

Happy to loosen the docs or to have you do them 👍 I just didn't want to oversell this since as you say, nothing else has been tested. 😉

@joshmoore
Copy link
Contributor Author

Merged origin/master to hopefully fix the test and tried to make it clear that json is just one example of a way to construct the Python data.

Copy link
Member

@nicholascar nicholascar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, all my queries addressed!

@nicholascar nicholascar merged commit 300fc38 into RDFLib:master Dec 1, 2021
@joshmoore joshmoore deleted the python-graph branch December 1, 2021 08:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants