Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code formatting, style guide enforcement and static checks #2450

Open
RMeli opened this issue Jan 16, 2020 · 22 comments
Open

Code formatting, style guide enforcement and static checks #2450

RMeli opened this issue Jan 16, 2020 · 22 comments

Comments

@RMeli
Copy link
Member

RMeli commented Jan 16, 2020

I recently started using code formatting (with black), style guide enforcement (with flake8) and static code checks (with mypy). I tried to apply them to MDAnalysis for fun and came across to many warnings.

Describe the solution you'd like

I think it would be great to format MDA with black (or other tools) and then use flake8 and mypy (or other tools) to spot and fix possible problems.

Describe alternatives you've considered

There are other tools that can be considered: pylint, isort, ypaf, ...

Additional context

Many errors come from MDAnalysis/migration/.

You can have a look at this commit to see the result of automatic code formatting with black.

@richardjgowers
Copy link
Member

We had (or have?) pylint on, but it sometimes throws annoying errors which stop a PR merging.

i think with something like black, we'd have to have every contributor follow the style, which would mean either a) instructions for everyone on how to apply it easily or b) some sort of CI system which fixes up the formatting on merging. Both seem like they'd add ongoing effort for not much real gain?

@RMeli
Copy link
Member Author

RMeli commented Jan 17, 2020

@richardjgowers I just wanted to spark the discussion but I agree the ratio benefits/effort might be small. However, #2409 and #2411 could be easily spotted by static checks. I also have found some indentation errors here and there that will make the code fail when executed.

Automatic code formatting (with black or other) is useful only if used with pylint or flake8, so that it removes most of the useless formatting errors. (It might reduce commit diffs if everyone uses it, but this is a very marginal benefit). However, it's really easy to install (pip install black) and apply (black .).

@jbarnoud
Copy link
Contributor

I am all in favour of linters and static analyzers. I am considering adding type hints when we would have dropped python 2. I am less keen on formating tools, though: the way I format my code is often deliberate and, sometimes, going against the convention helps with readability or other things.

@orbeckst orbeckst changed the title Code formatting, style guide enforcement and static cheks Code formatting, style guide enforcement and static checks Jan 23, 2020
@RMeli
Copy link
Member Author

RMeli commented Feb 17, 2020

I think adding type hints will be a great!

I understand that automatic code formatting is not something every developer is willing to accept, but it would be the easiest way to remove most of the linting problems in one go at the beginning. I haven't tried pylint, but flake8 gives very reasonable complaints (when used after automatic code formatting).

Most of the tools can be configured in setup.cfg in order to reach a general consensus.

@tylerjereddy
Copy link
Member

I'd suggest only applying the CI linting to the diff if we do this, rather than driving the full code base through a large amount of churn. SciPy is working on adopting something like this in: scipy/scipy#11423

I'd be cautious about adopting type hints---maybe if/when they become widely adopted at the base of our dependency tree in NumPy/SciPy, etc.

@RMeli
Copy link
Member Author

RMeli commented Feb 22, 2020

I like the idea of linting only the diff! What is the drawback of type hints if they are not well adopted by the dependencies?

@tylerjereddy
Copy link
Member

There may not be any major drawback, I just mean it could be sensible to wait to see what the rest of the ecosystem decides to do (if anything) about hinting in scientific code.

For example, NumPy has an experimental repo for type hinting with arrays: https://github.com/numpy/numpy-stubs

See also long-standing issue here: numpy/numpy#7370

Perhaps it may have some influence on what we might want to do.

@tylerjereddy
Copy link
Member

Note that SciPy has started accepting PRs that add type hints to source code, so maybe the time for doing this is not really so far away. May want to wait until we officially drop Python 2 support, as noted above.

@lilyminium
Copy link
Member

+1 on being apprehensive about autoformatters. My main objection is what they do to commit history. It's useful to look at a line of code and see who committed it in which PR, so you can check out any discussion and know who to direct questions to.

CI linting the diffs seem like a great idea and would remove some of the PEP8 comments on reviews.

@IAlibay
Copy link
Member

IAlibay commented Jun 17, 2020

Just tagging this additional discussion thread here.

@lilyminium
Copy link
Member

lilyminium commented Jul 3, 2020

https://github.com/marketplace/pep-8-speaks may be of interest.

Edit: instead of adding a check that can fail, it simply comments so contributors and reviewers can decide if they want to follow suggestions or not.

@orbeckst
Copy link
Member

orbeckst commented Jul 4, 2020

I installed pep8-speaks for all of MDA — if it’s too annoying we can deactivate it again or only enable it for specific repos.

One can also configure it with a config file as described in https://github.com/OrkoHunter/pep8speaks#configuration

@RMeli
Copy link
Member Author

RMeli commented May 7, 2022

+1 on being apprehensive about autoformatters. My main objection is what they do to commit history. It's useful to look at a line of code and see who committed it in which PR, so you can check out any discussion and know who to direct questions to.

Thanks to @mikemhenry, I realised that it is possible to ignore specific commits when using git blame and it seems that this is now supported by GitHub as well.

I am less keen on formatting tools, though: the way I format my code is often deliberate and, sometimes, going against the convention helps with readability or other things.

For this situation, it is possible to ignore blocks of code or single lines with # fmt: on/off and # fmt: skip.

@jandom
Copy link
Contributor

jandom commented Jul 21, 2023

hi there, just adding my two cents – prompted by @RMeli in MDAnalysis/UserGuide#270 (comment) – in my experience code formatters and static analysis tools (black, isort, flake8, mypy) are tremendously valuable. They make the codebase more readable, more maintainable, and lower the bar for entry for new contributors.

These tools (in my experience) are the de-facto standard in the industry, and familiarity with them is valuable. Of course, such tools only solve a narrow problem, they don't solve some of the most important feature requests, tech debt or bugs. But they are typically a low hanging fruit with a positive net impact on the health of a codebase and dev velocity.

@marinegor
Copy link
Contributor

marinegor commented Aug 2, 2023

Hi everyone, just adding another two cents here -- git has a feature that allows you to ignore certain commits in git blame: https://stackoverflow.com/questions/53502654/how-do-i-run-a-code-formatter-over-my-source-without-modifying-git-history

Sadly it requires a configuration beforehand (you must run git config blame.ignoreRevsFile .git-blame-ignore-revs to make it work), and also doesn't (yet) work on the github's web interface -- although it's on a public beta since 8.03.2023 (check here), so I assume it'll be available for everyone relatively soon.

I imagine that messing up git blame was one of the arguments against using automatic formatting -- so just saying there's a workaround here.

@jbarnoud
Copy link
Contributor

jbarnoud commented Aug 2, 2023

I have been on record as one of those (if not the one) that dislike auto-formatting. I had issues with fixes that were suboptimal readability-wise, portions of the code where the formatting was intentional, and reformatting the existing code base breaking git blame.

With time, I come to accept that the few places where the formatting is suboptimal are outweighed by the benefit of auto-formatting. We can have guards where the formatting is intentional (like this superb periodic table). If we can ignore some commits for the blame, then all my objections are addressed.

There is just the question of how to handle pending pull requests that risk being riddled with merge conflicts.

@IAlibay
Copy link
Member

IAlibay commented Aug 2, 2023

We can have guards where the formatting is intentional

I'm still not an amazing fan of black (I am less wary now that that formatting style documentation has improved & there is a "future" section) but I also don't care that much as long as we are careful not to worsen readability / destroy intentional formatting.

There is just the question of how to handle pending pull requests that risk being riddled with merge conflicts.

In my opinion, the best approach would be to cut up the process in chunks. That way we can also review the changes in a sane way rather than have a million+ line changes. We work on files that aren't currently in progress, and if we get blocked by a PR that isn't going anywhere then that gives us a good reason to close stale PRs (of which we have many).

@IAlibay
Copy link
Member

IAlibay commented Aug 2, 2023

I would however add here (for the @MDAnalysis/coredevs) that I would urge a discussion on priorities here. We have a lot of maintenance work to do, and many PRs that need a push (including all the type hinting ones and the guesser stuff).

I'm not saying don't do it, I am however saying let's consider the order in which we do things.

@marinegor
Copy link
Contributor

Is it possible that you add a project-wide autoformatter settings, and then for all new PRs you ask people to a) run the formatter on the files they're changing b) add the respective commit to the .git-blame-ignore-revs file? Perhaps this might help reducing the number of merge conflicts, as well as delay the necessity to run it for all files at once.

@IAlibay
Copy link
Member

IAlibay commented Aug 2, 2023

Is it possible that you add a project-wide autoformatter settings, and then for all new PRs you ask people to a) run the formatter on the files they're changing b) add the respective commit to the .git-blame-ignore-revs file? Perhaps this might help reducing the number of merge conflicts, as well as delay the necessity to run it for all files at once.

I don't think that'll work - you'll get conflicts with files where folks are touching the same file.

@marinegor
Copy link
Contributor

I don't think that'll work - you'll get conflicts with files where folks are touching the same file.

But if the settings are project-wide, their autoformatting (should) match unless they touched the same parts of the code?
I'm not sure though, never tried it.

@orbeckst
Copy link
Member

orbeckst commented Aug 2, 2023

EDIT: Sorry folks — I thought this question happened on Egor's PR, not on the dedicated static code formatters issue ...so this is exactly where discussion should happen... oops. So most of what I wrote doesn't apply really. I am just crossing out the embarrassing lines... except the bit that says if anyone wants to push for such a change they better be ready to be responsible for handling it.


I think we can cut the discussion pretty short: We can consider autoformatters but it ain't gonna happen in this PR as it's too big of a change. I am all for a productive discussion for how to move forward but can we please have it in a sensible place — either an issue or a thread on the developer list?

@marinegor apologies for stomping in here. From your perspective it's a sensible discussion to ask here as you're focused on your work and want to get it done efficiently. From a project-wide perspective it's a discussion that has come up a few times. It looks as opinions are shifting so it's ok to have this discussion from time to time — nothing is set in stone — but in the context of the project it's a big, big undertaking and we have to weigh cost (=time/opportunity) vs benefit. Let's do this in a dedicated place that won't distract from what your PR is about.

So can I ask that if anyone cares enough about the topic of autoformatters, they start a discussion on the dev list (or open/re-open(?) an issue) and point all interested parties there... and then feel responsible for following up on the discussions? Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants