Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make graph and other methods chainable #1394

Merged
merged 6 commits into from Sep 5, 2021
Merged

Make graph and other methods chainable #1394

merged 6 commits into from Sep 5, 2021

Conversation

wssbck
Copy link
Contributor

@wssbck wssbck commented Aug 18, 2021

Proposed Changes

Makes certain graph methods, like remove or add chainable. This will make it possible to write more concise code involving multiple graph modifications, along the lines of:

g = Graph().add(...).addN(...).add(...).remove(...)

@wssbck wssbck marked this pull request as draft August 19, 2021 19:20
@nicholascar
Copy link
Member

@wssbck this change might only need a few test changes for it to pass. I guess that, since most of the methods like add() have always returned None, this is what a bunch of tests look for but I don't suppose there's any reason why they must look for such an outcome. So returning self will just require those tests to be changed. Looks like there's only about 8 of those test anyway, so it won't be a big deal to alter.

Are you still keen to see this PR through?

@wssbck
Copy link
Contributor Author

wssbck commented Aug 28, 2021

@nicholascar I would like to continue, yes. I got distracted by, well, life.

@wssbck wssbck marked this pull request as ready for review September 4, 2021 19:00
@wssbck
Copy link
Contributor Author

wssbck commented Sep 4, 2021

@nicholascar I added returns to more methods and adjusted all the tests. I hope I have not missed anything. The build is passing now.

@wssbck wssbck changed the title Make graph methods chainable Make graph and other methods chainable Sep 4, 2021
@nicholascar
Copy link
Member

@wssbck thanks for updating this PR. I'm happy that it's now doing what you want it to do and that it's passing tests however I want to check one thing with my co-maintainers still:

@ashleysommer @white-gecko: this PR makes add() and .remove() return self and I have two questions for you:

  1. can you think of any down-stream projects that would be negatively affected by this?
  2. is there any performance penalty incurred by this passing back? I don't mean data in self - it's just a reference - but does passing it back, even if it's not used, cost any extra memory or processing? I assume no but need to check

@ashleysommer
Copy link
Contributor

Good question regarding performance. I don't think it will change anything, but I'll do a couple of benchmarks and get a number to compare.

@ashleysommer
Copy link
Contributor

Ok, I created this simple benchmark (its a simple python method benchmark, not an rdflib benchmark):
https://gist.github.com/ashleysommer/86a30e4b3b356198db058a7c6b0393e1

Conclusion: zero performance difference when returning self rather that not returning anything.

@wssbck
Copy link
Contributor Author

wssbck commented Sep 5, 2021

@nicholascar this PR changes return values of several methods from implicit None to explicit object references. From a practical standpoint it shouldn't cause anyone issues unless their code, for some reason, relies on the implicit return of None. I really cannot come up with a reasonable scenario where it could happen, though. It would be like relying on the return value of print (which also returns None).

The practical effect is that these methods can now be chained, potentially making some downstream code more compact.

Since the return values are now explicit, they are easier to reason about. It is, to my experience, always better to try to explicitly return something related to (or convenient because of) the purpose of the function/method rather than rely on the default behaviour of the language.

@nicholascar
Copy link
Member

Thanks @ashleysommer, I'll add you to formally review then and we are good to merge

Copy link
Contributor

@ashleysommer ashleysommer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

rdflib/graph.py Outdated Show resolved Hide resolved
@nicholascar
Copy link
Member

Since @ashleysommer's comment has been addressed, I'm merging

@nicholascar nicholascar merged commit 506244c into RDFLib:master Sep 5, 2021
@nicholascar
Copy link
Member

@wssbck you might like to update the rdflib documentation files with a new PR to show chained use in action

@wssbck wssbck deleted the return_self branch September 5, 2021 22:48
@wssbck
Copy link
Contributor Author

wssbck commented Sep 5, 2021

I will try to find time shortly, thanks for reviewing and merging!

@aucampia
Copy link
Member

aucampia commented Sep 9, 2021

I was looking at fixing type checking, since I had some issues using 6.0.0 in typed python, and noticed that serialize will now sometimes return self, but it will be a bit inconsistent, as sometimes it will still return None. I think for serialize there is actually a good case to make that it should not return self, as the return value somewhat depends on the arguments.

Just a note, I will fix the typing to correspond with what things are now, and then hopefully with type checking enabled in CI issues like this will be easier to spot.

@nicholascar
Copy link
Member

serialize will now sometimes return self

Doesn't serialize return a string/bytes, e.g. https://github.com/RDFLib/rdflib/blob/master/rdflib/graph.py#L1013 & https://github.com/RDFLib/rdflib/blob/master/rdflib/graph.py#L1033?

...as sometimes it will still return None

As above, when would serlialize() return None? Do you perhaps mean something like add()? That might return self/None and it was the focus of this PR more than serialize(). Just trying to make sure I understand the point!

...I will fix the typing to correspond with what things are now...

Great, please do and I'll happily assist

aucampia added a commit to aucampia/rdflib that referenced this pull request Sep 10, 2021
Other changes:

- fix return value for `Graph.serialize` (refs RDFLib#1394)
- remove default value for
  `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322)
- add .dockerignore to reduce context size and make docker quicker
  to run.
- add .flake8 config to ignore line length as black is managing
  formatting.
- add mypy to docker-compose, makefile and tox.ini
- fix the way csv2rdf is invoked to ensure that the right code gets
  executed.
aucampia added a commit to aucampia/rdflib that referenced this pull request Sep 10, 2021
Other changes:

- fix return value for `Graph.serialize` (refs RDFLib#1394)
- remove default value for
  `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322)
- add .dockerignore to reduce context size and make docker quicker
  to run.
- add .flake8 config to ignore line length as black is managing
  formatting.
- add mypy to docker-compose, makefile and tox.ini
- fix the way csv2rdf is invoked to ensure that the right code gets
  executed.
aucampia added a commit to aucampia/rdflib that referenced this pull request Sep 10, 2021
Other changes:

- fix return value for `Graph.serialize` (refs RDFLib#1394)
- remove default value for
  `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322)
- add .dockerignore to reduce context size and make docker quicker
  to run.
- add .flake8 config to ignore line length as black is managing
  formatting.
- add mypy to docker-compose, makefile and tox.ini
- fix the way csv2rdf is invoked to ensure that the right code gets
  executed.
aucampia added a commit to aucampia/rdflib that referenced this pull request Sep 10, 2021
Fixes RDFLib#1311

Add `mypy` to .drone.yml and fix type errors that come up.

Not type checking examples or tests.

Other changes:

- fix return value for `Graph.serialize` (refs RDFLib#1394)
- remove default value for
  `rdflib.plugins.sparql.algebra.translateAlgebra` (refs RDFLib#1322)
- add .dockerignore to reduce context size and make docker quicker
  to run.
- add .flake8 config to ignore line length as black is managing
  formatting.
- add mypy to docker-compose, makefile and tox.ini
- fix the way csv2rdf is invoked to ensure that the right code gets
  executed.
@aucampia
Copy link
Member

@nicholascar Graph.serialize() used to return None when a destination is supplied (see here and here)

This change makes Graph.serialize() return self if a destination is supplied.

However if a destination is supplied and it is a network location serialize() warns and returns None (see here).

I am actually unsure if this make sense, maybe this should rather raise a ValueError or RuntimeException than return None or self. But I guess if it does not throw it makes most sense to return self, which is what I changed it to in #1407.

@nicholascar
Copy link
Member

nicholascar commented Sep 10, 2021

@aucampia ok, thanks for that explanation.

I think you're right: always return the same thing (self) and return an actual error, rather than print a warning, when actually encountering an error condition.

I think you've already made a PR for this! But you have the network file location error as a warning not an error??

@aucampia
Copy link
Member

@nicholascar I only thought that throwing maybe better than logging a warning after I wrote the code for #1407. I updated the PR now to make Graph.serialize throw a ValueError instead of raise an exception if the destination paramter is not a local file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants