Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve linting #352

Closed
marcusvaltonen opened this issue Mar 24, 2022 · 15 comments
Closed

Improve linting #352

marcusvaltonen opened this issue Mar 24, 2022 · 15 comments

Comments

@marcusvaltonen
Copy link

marcusvaltonen commented Mar 24, 2022

I installed flake8 to check for code quality and ran it on pylops. It quite quickly finds a lot of flaws in the code quality. Some of the more common once are:

F401 '<module-name>' imported but unused
E203 whitespace before ':'

What is the line length? What I can find in the pre-commit hook it is 88 characters, but at multiple places this is exceeded.
Moreover flake8 is not an aggressive linter compared to e.g. pylint, so I believer this is a minimal requirement.

Suggestion

Enforce PEP8 using flake8 and e.g. flake8-import-order. Also consider enabling some of pylint's checkers for code quality (at least initially). This can be enforced as a pre-commit hook or in the CI-pipeline.

@cako
Copy link
Collaborator

cako commented Mar 25, 2022

Hi @marcusvaltonen,

Thank you for the issue. Indeed, the code is not perfectly PEP8 compliant, not least due to some of the choices regarding line lengths.

  • Many strings are longer than the 88 chosen for black. The reason for this is that black does not format strings by default. This is an experimental feature I though prudent to not enable until it is stable. That's not to say that we would be very happy to receive some commits fixing those issues! It is just that solving them automatically may mess things up now and in the future.
  • The whitespace issue is also a well known point of contention between black and flake8. Black's argument is that E203 is not PEP8 compliant, so there isn't even an option to format it differently.
  • Regarding unused imports, this should definitely not be happening. However, there are some slightly annoying situations that flake8 will flag as "unused import" in __init__.py files. For example here is a simplified version of __init__.py under avo/ which complains of unused imports:
    from .poststack import *
    __all__ = ["AVOLinearModelling"]
    Here is one that does not complain:
    from .poststack import AVOLinearModelling
    __all__ = ["AVOLinearModelling"]
    While PEP8 discourages (but does not ban) wildcard imports (even allowing an exception), this is a bit of a gray area. However, flake8 complains, mostly because of the wildcard import. But not using the wildcard is a bit annoying since it requires a lot of repetition. But maybe that's for the best? I might be a good idea to remove the wildcard imports, as it will allow flake8 and other linters more introspection...
  • We have isort as a pre-commit hook to sort imports, but it appears to not be flake8-compliant. I can look into incorporating flake8-import-order instead of isort depending on what it brings to the table. Let me know if you have thoughts on this!

All in all, I think it is definitely a good idea to incorporate a linter, either flake8, pylint, or another one. My only concern is that linters should not be taken as gospel (e.g., the whitespace issue), and therefore pre-commit-hooks may not be the most appropriate place for them. Indeed, it can create a very negative experience for people unfamiliar with pre-commit hooks, that will not know how to "fix" their problems and therefore not be able to commit. They definitely could be part of the CI-pipeline. @mrava87 could maybe give better input on that end.

@mrava87
Copy link
Collaborator

mrava87 commented Mar 26, 2022

Hi both,
I am never been a great fun of linters for the exact reason @cako mentioned: there are rules in python (or any programming language), and then the are suggestions that in most cases end up being reason of contention between tools trying to do similar things (eg black/flake8).

For this reason I think we should not go after making every linter happy as we probably would spend more time on this than on real useful development.

As we migrate to v2, I am happy to reconsider some of the choices that @cako initially made
here: https://github.com/PyLops/pylops/blob/dev/.pre-commit-config.yaml

To me this is a totally acceptable set of pre-commit tools and black is a wel-known and widely used tool. Of course we could use black in tandem with flake8 (a quick search gave https://ljvmiranda921.github.io/notebook/2018/06/21/precommits-using-black-and-flake8/), I would be happy to consider this.

At the moment my reccomandation is to hold on for a bit as we are going through a lot of internal changes whilst we create v2, we could include this at the very end :)

@mrava87
Copy link
Collaborator

mrava87 commented Mar 26, 2022

Regarding unused import, can you point us to one or two places… I code in pycharm and unused imports are highlighted in grey, so I would be surprised if this happens more than once or twice in the entire library (excluding the init files for the reason @cako mentioned)

@marcusvaltonen
Copy link
Author

Hello,

Nice to hear your opinions. Linters are always there to help - if they don't, you are using it wrong. Every linter cannot for logical reasons (contradicting style opinions or simply too demanding) be happy, but you should stick to one code style and enforce it. I simply think that there are too many inconsistencies in pylops and its derivatives, which could be streamlined. Having e.g. pylint tell you how many variables you should accept as an input argument, how many public methods a class should have or when to write docstrings might seem like too picky, and in many cases it is -- but you can always disable these warnings using locally or globally. For me it is about making a conscious design choice. I've been in big projects without properly enforcing such tools and after a couple of years you end up having to do major refactoring because things are going out of control, thus jeopardizing stability etc. Having a linter saying "hey, this could be a potential issue" is a good hint at where parts of the project could benefit from an architectural update.

I haven't spent much time on pylops, and I know you guys have, so I will try to be humble. It is merely my opinion while scrolling through the code base, where I find design choices that I wouldn't have made. That doesn't mean that they are bad design choices, there are many ways to accomplish the same thing.

On all repositories I maintain I like to put it as a step in the CI-pipeline, because developers should be allowed to commit their work in progress while developing, and not as pre-commit hooks.

At least I would suggest to fix the line length and the imports (and make sure they are checked before merge), I am happy to do that.

Here are some import errors (and I found more in pyproximal as well):

pylops/signalprocessing/Convolve2D.py:1:1: F401 'warnings' imported but unused
pylops/signalprocessing/Convolve2D.py:3:1: F401 'numpy.core.multiarray.normalize_axis_index' imported but unused
pylops/signalprocessing/Shift.py:1:1: F401 'warnings' imported but unused
pylops/signalprocessing/Convolve1D.py:1:1: F401 'warnings' imported but unused
pylops/signalprocessing/ConvolveND.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/FirstDerivative.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Diagonal.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Roll.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/CausalIntegration.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/SecondDerivative.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Laplacian.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Laplacian.py:3:1: F401 'numpy as np' imported but unused
pylops/basicoperators/Symmetrize.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Smoothing1D.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Sum.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Smoothing2D.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Smoothing2D.py:4:1: F401 'numpy.core.multiarray.normalize_axis_index' imported but unused
pylops/basicoperators/Flip.py:1:1: F401 'warnings' imported but unused
pylops/basicoperators/Restriction.py:1:1: F401 'warnings' imported but unused

Then of course there is the "problem" with the init.py files. I would just go ahead and say that it is a design choice, and simply not check the F401 error on those files.

@mrava87
Copy link
Collaborator

mrava87 commented Mar 26, 2022

Thanks!

Don't get me wrong, I like the idea of linters (or anything else that eases development) but I am just trying to avoid that to go too far... I used to be in a place where sometimes I felt some projects were evolving on making linters happier more than making developers and users happier. This reminds me when you write a paper and try to make reviewers happier but none is gaining anything... so again, let's use these tools but just be pragmatic about ;)

Honestly, I think I know why we get a lot of E203 whitespace before ':'. Prior to @cako introducing pre-commits, my own coding style was x[..., 1:1 + (self.nfft - 1) // 2] /= np.sqrt(2), however black changed everything to x[..., 1 : 1 + (self.nfft - 1) // 2] /= np.sqrt(2). @cako do you agree? I am sure I can trace back it in the commit history...

For the F401, these unused imports (I would not call them errors) should be fixed :) For the init files, any recommendation is welcome. Up until now we needed to stick to my original version of init based on what I knew at that time of python imports (not as much as I do today), but now as we move to v2 we can redesign them in a better way, so any suggestion in a PR is welcome

@cako
Copy link
Collaborator

cako commented Mar 26, 2022

I tried to configure flake8 to be somewhat compatible with black. I think it is a good compromise is to provide a flake8 configuration compatible with black in setup.cfg and strongly suggest new developers to check their code against the linter. Optionally we can have it in the CI, but I am not super conviced about this yet. We don't want to be playing catch-up with ignoring errors.

Anyways, here is an example of configs:

ignore = E203, E501, W503, E402
per-file-ignores =
    __init__.py: F401, F403, F405
max-line-length = 88

Some rationale behind these choices:

  • E203 (Whitespace before ':'): black does not consider E203 to be PEP8 compliant. See: here
  • E501 (Line too long): black reformats long lines (except long strings), so I don't see a problem with the remaining long lines. Ideally they would be reworked, but as mentioned before I wouldn't trust an experimental feature to do this.
  • W503 (Line break occurred before a binary operator): Also not PEP8 compliant in the view of the black community. See: Line breaks before logical operators psf/black#36
  • E402 (Module level import not at top of file): This error pops up in codes like this:
    from scipy.sparse.linalg.interface import LinearOperator as spLinearOperator
    # need to check scipy version since the interface submodule changed into
    # _interface from scipy>=1.8.0
    sp_version = sp.__version__.split(".")
    if int(sp_version[0]) <= 1 and int(sp_version[1]) < 8:
    from scipy.sparse.linalg.interface import _get_dtype
    else:
    from scipy.sparse.linalg._interface import _get_dtype
    from pylops import LinearOperator

    It is an unfortunate but necessary code pattern to deal with breaking changes from SciPy
  • F401, F403, F405 only acts in __init__.py files

With these configurations, I was able to remove most of the "noise" in the linting, and was left with a few warnings/errors that can improve the code. I will submit them in a PR soon!

@marcusvaltonen
Copy link
Author

This is a good start @cako.

I would personally like to have a step in the CI-pipeline to check this, but you can of course (if you don't want to strictly enforce it) always allow merging without that step passing. Again, then you are doing a well-informed decision of knowingly stray from the guidelines.

I strongly believe that the style-guide you chose should be enforced everywhere, and if there are rules you don't like, then you can modify/disregard them. The limitations in doing so is is probably the reason I am not a huge fan of Black and the workflow it promotes. I don't think a unified inter-repository style guide is possible in a language like python which is used by vastly different communities (scientific, enterprise, web-applications, etc.). I guess that works better for gofmt (for Go language), which has inspired Black, as the community is more homogeneous.

Again, just some thoughts on my end, nothing I expect you to take seriously into account. I completely agree that hunting a 10.0 / 10.0 in pylint rating can be a waste of time, but I have always been pragmatic about it and allowed locally disabling it or similar, and in the end I don't think I have regretted this.

@mrava87
Copy link
Collaborator

mrava87 commented Mar 28, 2022

Thanks! I agree with your config decisions, again personally I find it weird when I see x[1 : 10] but as you say there is no real consensus so let's go for the choice you made and ignore E203

Just a question, if I put in my setup.cfg this as suggested in the PR:

[flake8]
ignore = E203, E501, W503, E402
per-file-ignores =
    __init__.py: F401, F403, F405
max-line-length = 88

I still get quite some warnings by running flake8. But they are all for the examples and pytegsts directories. Shall we also care about them, I would say yes but of course they are less important :)

@mrava87
Copy link
Collaborator

mrava87 commented Mar 28, 2022

@cako for now you suggest, devs add the '[flake8] configuration locally and run it before pushing some code but you still prefer it not to be promoted to the pre-commit checks?

@cako
Copy link
Collaborator

cako commented Mar 28, 2022

@marcusvaltonen I didn't know it was possible to provide checks that were not enforced before merge! @mrava87 If we can set this up, I would say that is the ideal way. We can have a core dev look over the code quality + linter warnings, and make an executive decision on whether to accept the code. I'm not sure how to do this, so it would be great to get some pointers.

@mrava87 Yes, I only checked the pylops folder. I will make another PR fixing the other folders as well!

@marcusvaltonen
Copy link
Author

@cako It depends on the CI-tool. I have been working a lot in enterprise with Gitlab CI and there they have an "optional" setting.

@mrava87
Copy link
Collaborator

mrava87 commented Apr 8, 2022

What about this? https://github.com/marketplace/actions/python-flake8-lint

We would need to try but if I understand correctly this would be an extra row in your PR CI pipeline, we can just look at it and ignore if we think there is no problem with the warnings?

@cako
Copy link
Collaborator

cako commented Apr 9, 2022

Exactly, it would be good to inform the reviews, but not to make it mandatory

mrava87 added a commit to mrava87/pylops that referenced this issue Apr 9, 2022
@mrava87
Copy link
Collaborator

mrava87 commented Apr 9, 2022

Have a look at the PR. I think I have been able to set it up successfully, this is what it is run:

flake8 --ignore E203,E501,W503,E402 --max-line-length 88 --per-file-ignores=__init__.py:F401,F403,F405 pylops

I did run it locally and get zero warnings too :)

cako pushed a commit that referenced this issue Apr 13, 2022
* CI: added flake8 github action
Handles #352

* minor: change path for flake8 action

* minor: fix spaces in ignore of flake8 action

* minor: add more ignore in flake8 action

* minor: try to add per-file-ignore to flake8 action

* minor: added flake8 to setup.cfg and makefile
@cako
Copy link
Collaborator

cako commented Apr 14, 2022

Closing based on #364

@cako cako closed this as completed Apr 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants