Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix errors reported by mypy #1330

Merged
merged 3 commits into from Jun 26, 2021
Merged

Fix errors reported by mypy #1330

merged 3 commits into from Jun 26, 2021

Conversation

aucampia
Copy link
Member

@aucampia aucampia commented May 29, 2021

Fixes mypy errors and add mypy to CI pipelines

Proposed Changes

  • Add mypy to CI pipeline
  • Fix any errors that mypy reports

@aucampia aucampia force-pushed the add_mypy branch 3 times, most recently from ba2bbeb to 13aa36a Compare May 29, 2021 22:47
@aucampia aucampia marked this pull request as ready for review May 29, 2021 22:56
@coveralls
Copy link

coveralls commented May 29, 2021

Coverage Status

Coverage increased (+0.07%) to 75.771% when pulling c88a10a on iafork:add_mypy into 06df8b6 on RDFLib:master.

rdflib/compat.py Outdated Show resolved Hide resolved
@nicholascar
Copy link
Member

@ashleysommer have you used mypy anywhere? Do you think adding it to our requirements here is sensible? It is powerful but adds more dependencies

@FlorianLudwig
Copy link
Contributor

@aucampia thank you very much for your effort here! Lots of cleanup and more types!

Since rdflib is getting more and more types, having mypy seems a really good thing to me.

Removed import of xml.etree.cElementTree as it was was
deprecated in python 3.3.

Also, since xml.etree.ElementTree has been a thing for a long time there is
no good reason to try fallbacks if it fails to import.
@ashleysommer
Copy link
Contributor

Yes, I use MyPy in PySHACL and in the Sanic project, for static type analysis during CI build. It is indeed a very powerful and useful tool. I like using it, and I believe all new modern Python software projects should incorporate MyPy type checking from the start.

However I'm not sure it makes sense to use it in the RDFLib project. A very large portion of contributors to RDFLib are not software engineers. One of the primary reasons that academics, researchers, and scientists use Python is because of the low barrier to entry, and one of those points of ease is the dynamic and untyped nature.

We've seen in the past some contributors are frustrated by the existing PEP8 requirements and our preference (not requirement) for Black formatted code. Its a balancing act between having high quality code contributions, and having any contributions at all. Adding yet another barrier to contribution, another hurdle to jump, another dev-dependency to install, another required tool contributors have to learn and to use, sounds like a bad idea to me.

Having said all that, I am in favour of the software engineers in our community to add the type annotations, run MyPy locally, and fix the errors it raises, then push those annotations and fixes as a PR. I don't think there is any need to add it to the CI pipelines.

@aucampia
Copy link
Member Author

aucampia commented Jun 2, 2021

We've seen in the past some contributors are frustrated by the existing PEP8 requirements and our preference (not requirement) for Black formatted code. Its a balancing act between having high quality code contributions, and having any contributions at all. Adding yet another barrier to contribution, another hurdle to jump, another dev-dependency to install, another required tool contributors have to learn and to use, sounds like a bad idea to me.

@ashleysommer I get the reluctance to add barriers to contribution. However there is another side to this which is that mypy does detect real problems, for example in my commit the changes to the following files relate to real problems which went undetected:

  • examples/conjunctive_graphs.py, examples/simple_example.py, examples/smushing.py, examples/swap_primer.py: would have done decode on a string
  • rdflib/extras/infixowl.py: contains some member functions which are incorrectly specified
  • rdflib/plugins/stores/sparqlconnector.py using incorrect variable in assignment
  • rdflib/plugins/stores/sparqlstore.py, rdflib/resource.py: function/method declared twice

And then there are some cases where type annotations that was present were wrong:

  • rdflib/plugins/stores/sparqlstore.py
  • rdflib/query.py
  • rdflib/graph.py

And with the current config of mypy there is very few cases where it will demand that things are typed and should work fine with no typing in most cases. And not running mypy as part of CI will inevitably result in merging PRs that do have problems. However I understand this is a tradeoff.

I will update the PR to just merge the type annotations and fixes for now. I would however like to keep the mypy config if that is fine. But I will move it to setup.cfg as it would maybe be more appropriate there given that is where you keep all other settings.

Would you be okay with just running mypy as part of CI pipelines without failing if it fails, same as you do for flake8? So it is at least easy to see for PRs if they introduce type errors?

@aucampia aucampia changed the title Add mypy to CI pipelines and fix errors it raises ~~Add mypy to CI pipelines and~~ fix errors mypy reports Jun 2, 2021
@aucampia aucampia changed the title ~~Add mypy to CI pipelines and~~ fix errors mypy reports Fix errors reported by mypy Jun 2, 2021
Also move mypy config into `setup.cfg` to be in line with other tool
configuration.
@aucampia
Copy link
Member Author

aucampia commented Jun 2, 2021

Pull request no longer adds mypy to pipelines and requirements.dev.txt. I will make a PR to add a tox environment for mypy validation once #1313 is merged.

@nicholascar
Copy link
Member

not running mypy as part of CI will inevitably result in merging PRs that do have problems

We could just run mypy occasionally, i.e. not part of the automated build, but see my response and suggestion below.

Would you be okay with just running mypy as part of CI pipelines without failing if it fails, same as you do for flake8? So it is at least easy to see for PRs if they introduce type errors?

OK, how about the reverse wetup to what @ashleysommer proposed:

  • add mypy to CI, let fails break things
  • remove mypy from development requirements
    • PR creators don't have to run mypy locally (they can if they want) but we will run it in CI and ensure all errors are fixed before merge

This takes work of contributors and puts it on us maintainers, however the overhead will be very low after this initial PR that fixes all current mypy issues. We will only see any newly introduced errors.

Doing things this way around means ad hoc contributors don't have to learn mypy but we will still get the benefit from applying it. Also, some of our contributors, such as @aucampia !! will indeed know all about mypy.

@white-gecko @ashleysommer, what do you think of doing this this way around?

@nicholascar
Copy link
Member

I'll merge this as we benefit from the MyPy fixes and leave it to elsewhere to decide when/how to repeat MyPy enhancements.

@nicholascar nicholascar merged commit 5eb588a into RDFLib:master Jun 26, 2021
@aucampia aucampia deleted the add_mypy branch November 17, 2021 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants