Code formatting, style guide enforcement and static checks #2450

RMeli · 2020-01-16T14:33:33Z

I recently started using code formatting (with black), style guide enforcement (with flake8) and static code checks (with mypy). I tried to apply them to MDAnalysis for fun and came across to many warnings.

Describe the solution you'd like

I think it would be great to format MDA with black (or other tools) and then use flake8 and mypy (or other tools) to spot and fix possible problems.

Describe alternatives you've considered

There are other tools that can be considered: pylint, isort, ypaf, ...

Additional context

Many errors come from MDAnalysis/migration/.

You can have a look at this commit to see the result of automatic code formatting with black.

The text was updated successfully, but these errors were encountered:

richardjgowers · 2020-01-17T10:45:27Z

We had (or have?) pylint on, but it sometimes throws annoying errors which stop a PR merging.

i think with something like black, we'd have to have every contributor follow the style, which would mean either a) instructions for everyone on how to apply it easily or b) some sort of CI system which fixes up the formatting on merging. Both seem like they'd add ongoing effort for not much real gain?

RMeli · 2020-01-17T10:54:01Z

@richardjgowers I just wanted to spark the discussion but I agree the ratio benefits/effort might be small. However, #2409 and #2411 could be easily spotted by static checks. I also have found some indentation errors here and there that will make the code fail when executed.

Automatic code formatting (with black or other) is useful only if used with pylint or flake8, so that it removes most of the useless formatting errors. (It might reduce commit diffs if everyone uses it, but this is a very marginal benefit). However, it's really easy to install (pip install black) and apply (black .).

jbarnoud · 2020-01-17T17:23:55Z

I am all in favour of linters and static analyzers. I am considering adding type hints when we would have dropped python 2. I am less keen on formating tools, though: the way I format my code is often deliberate and, sometimes, going against the convention helps with readability or other things.

RMeli · 2020-02-17T23:50:42Z

I think adding type hints will be a great!

I understand that automatic code formatting is not something every developer is willing to accept, but it would be the easiest way to remove most of the linting problems in one go at the beginning. I haven't tried pylint, but flake8 gives very reasonable complaints (when used after automatic code formatting).

Most of the tools can be configured in setup.cfg in order to reach a general consensus.

tylerjereddy · 2020-02-21T20:37:36Z

I'd suggest only applying the CI linting to the diff if we do this, rather than driving the full code base through a large amount of churn. SciPy is working on adopting something like this in: scipy/scipy#11423

I'd be cautious about adopting type hints---maybe if/when they become widely adopted at the base of our dependency tree in NumPy/SciPy, etc.

RMeli · 2020-02-22T15:18:50Z

I like the idea of linting only the diff! What is the drawback of type hints if they are not well adopted by the dependencies?

tylerjereddy · 2020-02-22T15:55:30Z

There may not be any major drawback, I just mean it could be sensible to wait to see what the rest of the ecosystem decides to do (if anything) about hinting in scientific code.

For example, NumPy has an experimental repo for type hinting with arrays: https://github.com/numpy/numpy-stubs

See also long-standing issue here: numpy/numpy#7370

Perhaps it may have some influence on what we might want to do.

tylerjereddy · 2020-04-26T21:04:25Z

Note that SciPy has started accepting PRs that add type hints to source code, so maybe the time for doing this is not really so far away. May want to wait until we officially drop Python 2 support, as noted above.

lilyminium · 2020-06-05T20:11:32Z

+1 on being apprehensive about autoformatters. My main objection is what they do to commit history. It's useful to look at a line of code and see who committed it in which PR, so you can check out any discussion and know who to direct questions to.

CI linting the diffs seem like a great idea and would remove some of the PEP8 comments on reviews.

IAlibay · 2020-06-17T12:16:05Z

Just tagging this additional discussion thread here.

lilyminium · 2020-07-03T06:44:06Z

https://github.com/marketplace/pep-8-speaks may be of interest.

Edit: instead of adding a check that can fail, it simply comments so contributors and reviewers can decide if they want to follow suggestions or not.

orbeckst · 2020-07-04T06:43:08Z

I installed pep8-speaks for all of MDA — if it’s too annoying we can deactivate it again or only enable it for specific repos.

One can also configure it with a config file as described in https://github.com/OrkoHunter/pep8speaks#configuration

RMeli · 2022-05-07T11:26:07Z

+1 on being apprehensive about autoformatters. My main objection is what they do to commit history. It's useful to look at a line of code and see who committed it in which PR, so you can check out any discussion and know who to direct questions to.

Thanks to @mikemhenry, I realised that it is possible to ignore specific commits when using git blame and it seems that this is now supported by GitHub as well.

I am less keen on formatting tools, though: the way I format my code is often deliberate and, sometimes, going against the convention helps with readability or other things.

For this situation, it is possible to ignore blocks of code or single lines with # fmt: on/off and # fmt: skip.

jandom · 2023-07-21T13:12:48Z

hi there, just adding my two cents – prompted by @RMeli in MDAnalysis/UserGuide#270 (comment) – in my experience code formatters and static analysis tools (black, isort, flake8, mypy) are tremendously valuable. They make the codebase more readable, more maintainable, and lower the bar for entry for new contributors.

These tools (in my experience) are the de-facto standard in the industry, and familiarity with them is valuable. Of course, such tools only solve a narrow problem, they don't solve some of the most important feature requests, tech debt or bugs. But they are typically a low hanging fruit with a positive net impact on the health of a codebase and dev velocity.

marinegor · 2023-08-02T10:45:52Z

Hi everyone, just adding another two cents here -- git has a feature that allows you to ignore certain commits in git blame: https://stackoverflow.com/questions/53502654/how-do-i-run-a-code-formatter-over-my-source-without-modifying-git-history

Sadly it requires a configuration beforehand (you must run git config blame.ignoreRevsFile .git-blame-ignore-revs to make it work), and also doesn't (yet) work on the github's web interface -- although it's on a public beta since 8.03.2023 (check here), so I assume it'll be available for everyone relatively soon.

I imagine that messing up git blame was one of the arguments against using automatic formatting -- so just saying there's a workaround here.

jbarnoud · 2023-08-02T12:09:57Z

I have been on record as one of those (if not the one) that dislike auto-formatting. I had issues with fixes that were suboptimal readability-wise, portions of the code where the formatting was intentional, and reformatting the existing code base breaking git blame.

With time, I come to accept that the few places where the formatting is suboptimal are outweighed by the benefit of auto-formatting. We can have guards where the formatting is intentional (like this superb periodic table). If we can ignore some commits for the blame, then all my objections are addressed.

There is just the question of how to handle pending pull requests that risk being riddled with merge conflicts.

IAlibay · 2023-08-02T12:15:06Z

We can have guards where the formatting is intentional

I'm still not an amazing fan of black (I am less wary now that that formatting style documentation has improved & there is a "future" section) but I also don't care that much as long as we are careful not to worsen readability / destroy intentional formatting.

There is just the question of how to handle pending pull requests that risk being riddled with merge conflicts.

In my opinion, the best approach would be to cut up the process in chunks. That way we can also review the changes in a sane way rather than have a million+ line changes. We work on files that aren't currently in progress, and if we get blocked by a PR that isn't going anywhere then that gives us a good reason to close stale PRs (of which we have many).

IAlibay · 2023-08-02T12:19:26Z

I would however add here (for the @MDAnalysis/coredevs) that I would urge a discussion on priorities here. We have a lot of maintenance work to do, and many PRs that need a push (including all the type hinting ones and the guesser stuff).

I'm not saying don't do it, I am however saying let's consider the order in which we do things.

marinegor · 2023-08-02T12:32:49Z

Is it possible that you add a project-wide autoformatter settings, and then for all new PRs you ask people to a) run the formatter on the files they're changing b) add the respective commit to the .git-blame-ignore-revs file? Perhaps this might help reducing the number of merge conflicts, as well as delay the necessity to run it for all files at once.

IAlibay · 2023-08-02T12:37:07Z

Is it possible that you add a project-wide autoformatter settings, and then for all new PRs you ask people to a) run the formatter on the files they're changing b) add the respective commit to the .git-blame-ignore-revs file? Perhaps this might help reducing the number of merge conflicts, as well as delay the necessity to run it for all files at once.

I don't think that'll work - you'll get conflicts with files where folks are touching the same file.

marinegor · 2023-08-02T12:40:42Z

I don't think that'll work - you'll get conflicts with files where folks are touching the same file.

But if the settings are project-wide, their autoformatting (should) match unless they touched the same parts of the code?
I'm not sure though, never tried it.

orbeckst · 2023-08-02T13:49:34Z

EDIT: Sorry folks — I thought this question happened on Egor's PR, not on the dedicated static code formatters issue ...so this is exactly where discussion should happen... oops. So most of what I wrote doesn't apply really. I am just crossing out the embarrassing lines... except the bit that says if anyone wants to push for such a change they better be ready to be responsible for handling it.

I think we can cut the discussion pretty short: We can consider autoformatters but it ain't gonna happen in this PR as it's too big of a change. I am all for a productive discussion for how to move forward but can we please have it in a sensible place — either an issue or a thread on the developer list?

~~@marinegor apologies for stomping in here. From your perspective it's a sensible discussion to ask here as you're focused on your work and want to get it done efficiently.~~ From a project-wide perspective it's a discussion that has come up a few times. It looks as opinions are shifting so it's ok to have this discussion from time to time — nothing is set in stone — but in the context of the project it's a big, big undertaking and we have to weigh cost (=time/opportunity) vs benefit. ~~Let's do this in a dedicated place that won't distract from what your PR is about.~~

So can I ask that if anyone cares enough about the topic of autoformatters, ~~they start a discussion on the dev list (or open/re-open(?) an issue) and point all interested parties there... and~~ then feel responsible for following up on the discussions? Thanks.

orbeckst changed the title ~~Code formatting, style guide enforcement and static cheks~~ Code formatting, style guide enforcement and static checks Jan 23, 2020

RMeli mentioned this issue May 20, 2020

Add in analysis function to calculate MSDs #2619

Merged

4 tasks

IAlibay mentioned this issue Jun 5, 2020

Updating Serialization Functionality from PR #2140 #2704

Closed

4 tasks

orbeckst mentioned this issue Jul 4, 2020

Add Lipid Enrichment Index analysis #2816

Closed

4 tasks

PicoCentauri mentioned this issue Nov 18, 2021

Type Hints for Analysis Classes #3463

Open

RMeli mentioned this issue May 14, 2022

Long-term solution for import sorting #3656

Open

umak1106 mentioned this issue Jun 2, 2022

Adding Type Hints for MDAnalysis #3686

Open

This was referenced Jul 17, 2023

fix: warnings when running 'make html' MDAnalysis/UserGuide#269

Merged

dev: run pre-commit-hooks MDAnalysis/UserGuide#270

Closed

RMeli mentioned this issue Aug 6, 2023

dev: run black on the files MDAnalysis/UserGuide#286

Merged

RMeli mentioned this issue Oct 24, 2023

feat: add pre commit config #4321

Closed

5 tasks

RMeli mentioned this issue Dec 21, 2023

Fix: groups.py doctest fixes #4374

Merged

5 tasks

RMeli mentioned this issue Mar 24, 2024

Atomnames methods should handle empty group #2879 #4528

Closed

5 tasks

sef43 mentioned this issue Apr 2, 2024

Add a writer_kwargs argument to MDAReporter and speed up testing sef43/openmm-mdanalysis-reporter#7

Closed

4 tasks

yuxuanzhuang mentioned this issue May 20, 2024

AlignTraj Writer kwargs fix #4565

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code formatting, style guide enforcement and static checks #2450

Code formatting, style guide enforcement and static checks #2450

RMeli commented Jan 16, 2020

richardjgowers commented Jan 17, 2020

RMeli commented Jan 17, 2020

jbarnoud commented Jan 17, 2020

RMeli commented Feb 17, 2020

tylerjereddy commented Feb 21, 2020

RMeli commented Feb 22, 2020

tylerjereddy commented Feb 22, 2020

tylerjereddy commented Apr 26, 2020

lilyminium commented Jun 5, 2020

IAlibay commented Jun 17, 2020

lilyminium commented Jul 3, 2020 •

edited

Loading

orbeckst commented Jul 4, 2020

RMeli commented May 7, 2022

jandom commented Jul 21, 2023 •

edited

Loading

marinegor commented Aug 2, 2023 •

edited

Loading

jbarnoud commented Aug 2, 2023

IAlibay commented Aug 2, 2023

IAlibay commented Aug 2, 2023 •

edited

Loading

marinegor commented Aug 2, 2023

IAlibay commented Aug 2, 2023

marinegor commented Aug 2, 2023

orbeckst commented Aug 2, 2023 •

edited

Loading

Code formatting, style guide enforcement and static checks #2450

Code formatting, style guide enforcement and static checks #2450

Comments

RMeli commented Jan 16, 2020

richardjgowers commented Jan 17, 2020

RMeli commented Jan 17, 2020

jbarnoud commented Jan 17, 2020

RMeli commented Feb 17, 2020

tylerjereddy commented Feb 21, 2020

RMeli commented Feb 22, 2020

tylerjereddy commented Feb 22, 2020

tylerjereddy commented Apr 26, 2020

lilyminium commented Jun 5, 2020

IAlibay commented Jun 17, 2020

lilyminium commented Jul 3, 2020 • edited Loading

orbeckst commented Jul 4, 2020

RMeli commented May 7, 2022

jandom commented Jul 21, 2023 • edited Loading

marinegor commented Aug 2, 2023 • edited Loading

jbarnoud commented Aug 2, 2023

IAlibay commented Aug 2, 2023

IAlibay commented Aug 2, 2023 • edited Loading

marinegor commented Aug 2, 2023

IAlibay commented Aug 2, 2023

marinegor commented Aug 2, 2023

orbeckst commented Aug 2, 2023 • edited Loading

lilyminium commented Jul 3, 2020 •

edited

Loading

jandom commented Jul 21, 2023 •

edited

Loading

marinegor commented Aug 2, 2023 •

edited

Loading

IAlibay commented Aug 2, 2023 •

edited

Loading

orbeckst commented Aug 2, 2023 •

edited

Loading