-
Notifications
You must be signed in to change notification settings - Fork 3.9k
[ci] Use pixi and pre-commit for all linting jobs #6901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewers: this exists to make it easy to activate the environment in the active shell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for proposing this, working through it, and so thoroughly documenting it in the PR description! I've never used pixi
so this would be quite a disruptive change to my development workflow, but I am absolutely willing to try it as part of reviewing this. I agree, it would be great to consolidate all of these linting tasks and make it easier to get the development environment set up (to help with #6350), and the combination of pre-commit
+ pixi
looks like a powerful way to do that while avoiding the need to manually set up virtualenvs / conda environments.
I'm not ready to approve this yet. However, there are other parts of this PR I really like and support immediately:
- removing linting from
test.sh
- (sticking it in a new
.ci/lint-all.sh
, that'd get deleted in thispixi
PR, would be fine for now)
- (sticking it in a new
- upgrading all the linters to their latest versions and addressing the new warnings
- splitting up
lint
andcheck-docs
into separate jobs in thestatic_analysis.yml
file, and renamingcheck-docs
topython-check-docs
for consistency with the other job calledr-check-docs
- adding
.pixi
stuff to.gitignore
and the ignores inbiome.json
- people could be using
pixi
locally even without the changes from this MR, so those added exclusions are helpful to do immediately
- people could be using
Would you consider splitting those parts off together into a single, separate PR? That could be merged more quickly, and would also cut down the size of the diff here (and yes I understand that a large part of that is the pixi.lock
file, which I'm ignoring because I know it was automatically generated).
If we can agree that the use of
pixi
makes sense, I'll propose to use it to manage the Python dependencies in a separate PR.
I want to make an early comment about this before you spend too much time on it... if "use pixi
" necessarily means "lock to specific versions via a lock file", then I'd be very opposed to expanding this to installing runtime / test-time dependencies in CI.
I really want CI to continue using >=
pins wherever possible for dependencies, so that we find out about incompatibilities automatically when a project LightGBM depends on puts up a new release. In my experience, that spreads the work of reacting to changes in dependencies more evenly over time and reduces the total amount of "make LightGBM compatible with the new version of <dependency>
" work.
@@ -14,6 +14,7 @@ FILES_TO_LINT <- list.files( | |||
, recursive = TRUE | |||
, include.dirs = FALSE | |||
) | |||
FILES_TO_LINT <- FILES_TO_LINT[!grepl(paste0("^", file.path(".", ".pixi", "")), FILES_TO_LINT)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this one. The line above already is saying "only look at files ending in .r
or .rmd
" (ignoring casing).... what files like that are in ./.pixi/
?
|
||
# pixi environments | ||
.pixi | ||
*.egg-info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already this line in .gitignore
:
Line 299 in ec5492f
*.egg-info/ |
Is it not working? Why was this *.egg-info
necessary?
I personally found pixi about a month ago from this comment. Looks really cool! Have to try it... |
Motivation
Running linters in this repository is always a little involved (one needs to execute a number of commands to actually run everything), leading to many (unexpected) CI failures. This has already improved by using
pre-commit
which bundles a bunch of linters.This PR proposes to bundle all linters in
pre-commit
and usepixi
to track linting dependencies (more on this below).Changes
pixi
to manage all linting dependenciespixi
lint
task from the.ci/test.sh
fileruff
andtypos
)What is pixi and why?
Pixi is a modern and fast alternative to
conda
,mamba
, etc. Hence, at a high level, it is a cross-platform package manager which installs conda packages (fromconda-forge
). Some differences that make it so nice to use:.pixi
directory in the repository root. One can have arbitrarily many environments. Thedefault
environment typically captures the local development environment.pixi.lock
) which has two benefits: on the one hand, CI doesn't suddenly fail if a new version of a tool is available. On the other hand, the computationally intensive "solve" step does not need to run in the CI. The solve is only performed once locally (and is very fast compared toconda
-- it runs in seconds even in projects that are considerably more complex than LightGBM).pixi run
. These tasks run in the deno task shell and, thus, can be executed in a platform-agnostic manner. When running a task, all required dependencies are automatically installed, making it trivial for the user to get started.pixi
, but:setup-pixi
automatically caches environments in the CI, further improving download times.What's up with the pre-commit setup?
pre-commit
is the most ergonomic way to bundle all linting jobs together. One can now simply runto execute all linters in this repository. Due to the custom
pixi
task, you can also runThis automatically installs all required dependencies, including linters and
pre-commit
itself (except forpwsh
which is not currently available viaconda-forge
for Linux & OSX).The pre-commit setup itself does NOT use other repositories anymore but, instead, uses
pixi
to execute the linting jobs using the linters installed via thepixi
. This has the benefit that (1) the behavior ofpre-commit
matches the behavior of running linters manually (e.g. when auto-formatting in the editor) and (2) there is only one update process for all dependencies.One additional benefit of using
pre-commit
is also that linters are executed "lazily", i.e. if one linter fails, the other linters are being run as well. Before, we eagerly exited on the first linter failure.Why remove the "lint" job from
.ci/test.sh
?The linting job has a trivial entrypoint now and keeping the task in this lengthy script is rather confusing than helpful IMO.
Where to go from here?
If we can agree that the use of
pixi
makes sense, I'll propose to use it to manage the Python dependencies in a separate PR. This will allow us to cut down CI runtimes significantly and, much more importantly, allow contributors to get started with all necessary dependencies much quicker. No worries: I will not propose to manage compilers in CI jobs viaconda
;)