Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] [python-package] use black for formatting Python code? #6304

Closed
jameslamb opened this issue Feb 8, 2024 · 10 comments · Fixed by #6345
Closed

[RFC] [python-package] use black for formatting Python code? #6304

jameslamb opened this issue Feb 8, 2024 · 10 comments · Fixed by #6345

Comments

@jameslamb
Copy link
Collaborator

Summary

I think that black (psf/black) should be used in the project to standardize the formatting of the project's Python code.

Motivation

Over the next few months, I'd like to invest in making it easier for other people to contribute to LightGBM. I'm hoping that more use of linters will help with that.

I think that standardizing the formatting of the code and enforcing that standard automatically with black helps with that in the following ways:

  • makes code look the same everywhere in the project
  • makes code look like other Python projects people have worked on (since black is so widely used)
  • provides automated feedback on style, limiting the review effort required by maintainers

The project contains a lot of Python code:

  • python-package/ = the lightgbm Python library
  • tests/ = unit tests
  • helpers/ = utility scripts for maintaining the project
  • examples/ = Python scripts and Jupyter notebooks showing how to use the project

Proposed Approach

I'd like to sequence this work in the following way (one PR per point below):

  1. add settings in pyproject.toml for black, setting max line length to 120 (to minimize), and set up checks in CI, but only apply + enforce black formatting on code in helpers/* and docs/*
  2. apply formatting changes to Python scripts and notebooks in examples/* and tests/*
  3. apply formatting changes to python-package/*
  4. add a .git-blame-ignore-revs file so those re-formatting commits are ignored in git blame (and in the GitHub blame view)

References

@jameslamb
Copy link
Collaborator Author

@jmoralez @guolinke @shiyu1994 @borchero what do you think?

If you support this, I'll start on it in the next few days.

@borchero
Copy link
Collaborator

borchero commented Feb 8, 2024

Big fan, I've always stumbled over not having this when contributing here 😄

I would also propose to leverage https://pre-commit.com/ to run black to simplify local execution as much as possible.

@jmoralez
Copy link
Collaborator

jmoralez commented Feb 8, 2024

Since we're already using ruff it may be easier to just adopt its formatter, it's supposed to be the same as black but faster, so we can avoid an extra dependency.

@jameslamb
Copy link
Collaborator Author

Since we're already using ruff it may be easier to just adopt its formatter

Hey cool! I didn't know ruff had a formatter too. That project is eating is the world 😂

I'm good with trying that instead of black, these deviations seem ok to me: https://docs.astral.sh/ruff/formatter/black/

I care less about the particular tool than I do just having some standard that's automatically enforced and that's easy to get compliant with via autoformatting.

I think the argument "hey we already have ruff let's use that" is a good reason to try the ruff formatter first!

*I would also propose to leverage https://pre-commit.com/ *

I know I'd been opposed to this in the past so it's probably mostly my fault that LightGBM doesn't already use pre-commit 😬

I'm open to trying it! I wouldn't want it to become the thing that runs all of the project's linting... it's nice that the lint CI job only runs on Linux, so we don't have to worry about all of the linting tools also supporting all of the different environments LightGBM casual contributors work in.

But I'm open to at least starting with the cheap, highly-portable, fast tools in a pre-commit config (e.g. ruff and isort). If we do that, I'd want to also run them with pre-commit in CI, like this:

pre-commit run --all-files

To be sure the pre-commit results and CI are the same.

@jmoralez
Copy link
Collaborator

jmoralez commented Feb 8, 2024

I support the pre-commit as well, it's very easy to get linter errors (especially from isort) and that would help a lot.

@shiyu1994
Copy link
Collaborator

I'm also using black and pre-commit in other projects. I think we can try to use them in LightGBM as well.

@borchero
Copy link
Collaborator

borchero commented Feb 8, 2024

If we do that, I'd want to also run them with pre-commit in CI, like this:

There's also https://github.com/pre-commit/action which runs pre-commit run --all-files and caches your dependencies to speed up the CI :)

One more comment on ruff: it also implements the functionality of isort so we might just consolidate all our needs into a single tool.

@jameslamb
Copy link
Collaborator Author

Thanks for that!

For now I'd rather run pre-commit run --all-files and take the slight CI-time hit for installing dependencies, so it's only a small handful (just ruff and isort?), and since lint runs as a standalone job not on every CI job.

Each additional third-party action we pull in is another source of possible security issues, another thing that requires updating, etc.

One more comment on ruff: it also implements the functionality of isort so we might just consolidate all our needs into a single tool.

Open to that too!

I'll put up a first PR in the next day or two that we can look at, which adds the config and CI changes and just targets a subset of files (like docs/ and helpers/).

@jmoralez
Copy link
Collaborator

jmoralez commented Feb 8, 2024

Just adding what you said last time about replacing isort with ruff (full comment):

I chose NOT to have it run isort, even though it can, because it currently has some inconsistencies with isort (astral-sh/ruff#2104 and astral-sh/ruff#1718, for example) that result in errors on this branch. Maybe we could explore that in the future.

so I think you can expect some errors when changing it but I'm in favor of the change if they're not controversial.

@jameslamb
Copy link
Collaborator Author

Ah thanks for remembering that @jmoralez ! I'd forgotten about that from when we added ruff originally.

I'll start with ruff format and keep using isort for now, and then we can separately look into replacing isort and see what the diff looks like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants