[RFC] 4.0.0 Release #5153

jameslamb · 2022-04-14T04:28:59Z

Summary

@StrikerRUS @guolinke @shiyu1994 @jmoralez @btrotta @Laurae2 could you please try to list out what you all feel is required before a v4.0.0 release of LightGBM is prepared?

Please edit this description and add issues that you feel should be fixed prior to a v4.0.0 release. I've proposed an initial list below.

Python package

R package

CUDA

GPU (non-CUDA)

Quantized Training

Add quantized training #5606

Other

Motivation

The most recent release of LightGBM, v3.3.1, was about 6 months ago (October 27, 2021).

There was a v3.3.2 release on January 7, 2022, but it just contained a single small patch to satisfy CRAN.

We have been talking for even longer than that about putting out a 4.0.0 release of LightGBM. Many many fixes, features, and breaking changes have been merged since v3.3.1, and the state of this project on master is now significantly different from what users will get running pip install lightgbm.

I'd like to try to list out everything that we feel needs to be done before the 4.0.0 release, to focus our efforts and hopefully get that release out to users soon. I think an issue like this was successful with the v3.3.0 release (#4310).

Other Items for Release Checklist

enable building stable tag on readthedocs

jameslamb · 2022-04-14T04:31:36Z

For now, I've locked the conversation on this issue so that only collaborators in the repo are able to comment, to keep the conversation focused on what maintainers are comfortable committing to for v4.0.0.

Users and other outside interested parties can open other issues referencing this one with questions or concerns.

StrikerRUS · 2022-04-15T02:09:13Z

Great plan!
But sorry, I don't believe we'll be able to release v4.0.0 in the near future with the current project activity level.

I strongly believe we should focus on bug fixes and on reviewing PRs from outside contributors to keep users' loyalty.

jameslamb · 2022-04-15T02:39:16Z

I don't believe we'll be able to release v4.0.0 in the near future with the current project activity level

I also feel that it's probably far away given the current project activity level, but I wanted to at least try to push v4.0.0 forward. I hope maybe the conversation on this issue will encourage Microsoft and other maintainers here to devote more time and resources to making it happen.

I expect that many (maybe even most) LightGBM users won't rely on specific commits, either via git clone or nightly builds, that aren't tagged as releases or published to package repositories like CRAN, PyPI, and conda-forge. For them, improvements to this project aren't "real" until they make it into a release.

And I can say that for me personally, it's difficult to stay motivated to spend time on challenging bug fixes, pull request reviews, and user questions in issues if it seems like that work might not ever be released 😞

guolinke · 2022-04-15T03:22:28Z

For the GPU (non-CUDA) bugs, I think they are hard to fix, as we cannot reach the original developer.
I think we can focus on the cuda_exp version, and deprecate the old GPU code.

guolinke · 2022-04-15T03:29:26Z

How about we make two lists? One for necessary changes, and another for optional ones.
We can focus on the bug fixes and on the new features/changes that are breaking but necessary.
We can also mark the assignees in the list (you can also include me), and I will try my best to finish them.

jameslamb · 2022-04-15T15:30:46Z

How about we make two lists?

My goal with this issue was to define the list of "what is required for v4.0.0", and I think that's enough. The implication of that would be that anything not listed here is optional.

guolinke · 2022-04-16T03:09:09Z

@jameslamb haha, there are 52 items, so I think maybe the list is too large, and some "hard" items may block us from releasing. So I propose to have a "core" list that must be done for v4.0.0.

StrikerRUS · 2022-04-17T00:59:28Z

@guolinke

I think we can focus on the cuda_exp version, and deprecate the old GPU code.

I'm very disappointed in this decision 🙁

As I said earlier, the OpenCL-based version that is able to run on AMD and Intel GPUs is a competitive advantage of LightGBM.

For the GPU (non-CUDA) bugs, I think they are hard to fix, as we cannot reach the original developer.

Have you tried to reach him via email?

guolinke · 2022-04-17T08:35:19Z

@StrikerRUS for AMD GPUs, we can use ROCm, and there are several tools that can convert CUDA code to ROCm code.
As for Intel GPUs, I am not sure. Are there any powerful ones that are widely used?

What I mean is to deprecate the old GPU algorithm and move to the new one. Then we can adapt the new GPU algorithm to more platforms.

shiyu1994 · 2022-04-17T15:59:53Z

@jameslamb Thanks for opening this! I'm glad and excited to work towards 4.0.0. I'll commit more time from now on to guarantee the progress.

I think we can focus on the cuda_exp version, and deprecate the old GPU code.

We will make the cuda_exp version support as many platforms as LightGBM currently does. However, until cuda_exp is stable enough, I think we can still maintain the current CUDA and GPU versions, including bug fixing.

StrikerRUS · 2022-04-17T21:35:24Z

for AMD GPUs, we can use ROCm, and there are several tools that can convert CUDA code to ROCm code.

Oh, interesting idea, I've never heard about such tools! I was sure that CUDA can be run exclusively on NVIDIA cards. However, do you think code transpiling will be convenient for LightGBM users? Honestly, I don't believe that ordinary users will do such things. In contrast, OpenCL version can be used just out of the box on Windows and possibly on Linux in the future.

As for Intel GPUs, I am not sure. Are there any powerful ones that are widely used?

Probably, they will:

Currently we know Intel plans to launch the first Arc GPUs for desktops as soon as the summer of 2022, and at the end of its promo video for the first Arc 3 laptop GPUs the company included a teaser image (above) of a full-size Arc desktop GPU. It sure looks to be as big as an Nvidia GeForce RTX 3060. Will it perform as well too? We'll have to wait and see.
https://www.tomsguide.com/news/intel-arc-gpu-specs-release-date-features-and-more

guolinke · 2022-04-17T23:48:59Z

@StrikerRUS the code conversion would be done manually by us and committed to the repo.
Refer to https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-porting-guide.html

StrikerRUS · 2022-04-20T02:18:23Z

@guolinke
Ah, I see now. That changes a lot!

jameslamb · 2022-07-15T14:12:15Z

Since the previous release, {lightgbm} is now depended on by several more packages.

https://cran.r-project.org/web/packages/lightgbm/index.html

Including a fairly high-profile one, {bonsai}, with support from RStudio: https://www.tidyverse.org/blog/2022/06/bonsai-0-1-0/.

I'm planning to try to do some work on those projects to make them compatible with v3.3.2 AND v4.0.0.

jameslamb · 2022-07-19T04:13:46Z

I've opened issues with offers of support for all of {lightgbm}'s reverse dependencies.

{cbl}: {lightgbm} v4.0.0 is coming dswatson/cbl#1
{bonsai}: {lightgbm} v4.0.0 is coming tidymodels/bonsai#34
{cheem}: {lightgbm} v4.0.0 is coming nspyrison/cheem#2
{EIX}: {lightgbm} v4.0.0 is coming ModelOriented/EIX#7
{fastshap}: {lightgbm} v4.0.0 is coming bgreenwell/fastshap#47
{SHAPforxgboost}: {lightgbm} v4.0.0 is coming liuyanguu/SHAPforxgboost#33
{shapviz}: {lightgbm} v4.0.0 is coming ModelOriented/shapviz#22

I also want to document this small function in R (proposed in tidymodels/bonsai#42) that can be used to check for the installed version of {lightgbm}:

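# TRUE when the installed {lightgbm} is newer than v3.3.2 (i.e. v4.0.0 or later)
using_newer_lightgbm_version <- function(){
    utils::packageVersion("lightgbm") > package_version("3.3.2")
}

# example
if (using_newer_lightgbm_version()) {
    # v4.0.0 and later: the kind of prediction is selected with `type`
    predict(bst, X, type = "raw")
} else {
    # v3.3.2 and older: raw scores are requested with `rawscore`
    predict(bst, X, rawscore = TRUE)
}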
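
One detail of the check above that is easy to miss: it compares version objects, not strings. Below is a minimal sketch of why that matters. `is_newer_than()` is a hypothetical helper introduced only for illustration, and the version strings are made-up inputs, not a list of real {lightgbm} releases.

```r
# Hypothetical helper (not from the thread or from {lightgbm}): TRUE when
# `installed` is strictly newer than `threshold`. Both arguments may be
# strings or package_version objects.
is_newer_than <- function(installed, threshold = "3.3.2") {
    as.package_version(installed) > as.package_version(threshold)
}

is_newer_than("4.0.0")    # TRUE
is_newer_than("3.3.2")    # FALSE: the comparison is strict
is_newer_than("3.10.0")   # TRUE: version components are compared as numbers
"3.10.0" > "3.3.2"        # FALSE: plain strings are compared character by character
```

Because the comparison is strict, a threshold of "3.3.2" sends v3.3.2 itself down the "older" branch, which is the behavior the check above relies on.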

I hope we can get v4.0.0 out soon, and that we'll start releasing more frequently in the future so that we can just use a cycle of deprecation warnings to communicate such changes, instead of needing to go submit patches.

jameslamb · 2022-11-09T16:31:52Z

@shiyu1994 @guolinke can you please devote some time to LightGBM over the next few weeks, and can we try to get a 4.0 release out?

At this point, it's been more than a year since the last release with substantive changes (v3.3.1, in October 2021). We really need your help.

shiyu1994 · 2022-11-09T16:43:51Z

@jameslamb Sure. Personally, I feel very anxious whenever I think about our goal not being finished. Sorry for being blocked by another project over the past few weeks. I'm coming back and will devote more time. Hopefully we can get 4.0.0 done before the new year.

shiyu1994 · 2022-11-09T16:45:07Z

@jameslamb And thanks for your support all the time.

jameslamb · 2022-11-09T16:47:12Z

Of course! At this point, since it's been more than a year since the last release, I think we need to accept that all the things we'd hoped to get into v4.0.0 aren't going to make it there. It would be better to get a v4.0.0 out and then plan other breaking changes for a v5.0.0 in the future, in my opinion, than to delay v4.0.0 another 6+ months.

shiyu1994 · 2022-11-09T16:49:03Z

It would be better to get a v4.0.0 out and then plan other breaking changes for a v5.0.0 in the future, in my opinion, than to delay v4.0.0 another 6+ months.

Strongly agree with that.

jameslamb · 2022-11-18T04:35:32Z

I'm focusing most of my attention right now on fixing the issues related to Python packaging and CI as soon as possible, in preparation for this release.

It has gotten really complicated, so I'm typing out here the sequence in which things need to happen.

1. get CI working again
   - ~~WIP: [ci] use XCode 14.1 on macOS-latest gcc jobs (fixes #5589) #5588~~ [ci] switch to manylinux_2_28 for Linux artifacts (fixes #5514, fixes #5589) #5580
2. merge integrated OpenCL wheels PR
   - Build integrated OpenCL Linux wheels #5252
3. switch to manylinux_2_28 images for building Linux wheels
   - [ci] switch to manylinux_2_28 for Linux artifacts (fixes #5514, fixes #5589) #5580
   - add a manylinux image for building X86_64 wheels guolinke/lightgbm-ci-docker#27
4. merge all the new guolinke/lightgbm-ci-docker stuff to master in that repo
   - update master with latest changes from dev guolinke/lightgbm-ci-docker#28
   - another PR to replace *-dev image tags in .vsts-ci.yml here in LightGBM and fix linux wheel tags: [ci] [python-package] correct tag on x86_64 wheels #5598
5. update to macOS-11 images on Azure DevOps and build macOS wheels with newer XCode
   - ⚠️ THIS HAS TO BE DONE BY DECEMBER 1, 2022 ⚠️
   - [ci] migrate CI from macOS 10.15 to 11 (fixes #5391) #5396
6. switch from setup.py to pyproject.toml for wheel building or put a version ceiling on pip in CI
   - [RFC] [python] 'setup.py install' is deprecated #5061 (comment)

I'm happy to try to do most of the work, but really hope @StrikerRUS will be available to provide his expertise.

If he isn't, then @jmoralez @shiyu1994 @guolinke I will really need your help with reviews on these. If we are really struggling I can also try to recruit a volunteer from outside of LightGBM who's very familiar with Python packaging.

guolinke · 2022-11-18T05:46:07Z

@jameslamb Thank you! ping me when you need reviews.

shiyu1994 · 2022-11-25T16:11:22Z

Just added quantized training (#5606) in the TODO list.

jameslamb · 2022-12-29T06:50:16Z

I'd like to revisit this comment from April (#5153 (comment))

We will make the cuda_exp version support as many platforms as LightGBM currently does. However, until cuda_exp is stable enough, I think we can still maintain the current CUDA and GPU versions, including bug fixing

I'm happy to see the progress on implementing more objective functions in the cuda_exp version.

@shiyu1994 @guolinke Do you think we should just remove the version that's currently called cuda and just focus on cuda_exp? Then in LightGBM v4.0.0, users who are installing with device_type = "cuda" will get what we're now calling "cuda_exp".

I think we should do that at this point, to:

- reduce maintenance burden in the repo
- ensure that we get bug reports only for the new CUDA version
- reduce the time it takes CI to run

I don't think this project is healthy enough right now (e.g. releasing frequently enough) to do something like release v4.0.0 with both cuda and cuda_exp, then support both for a long time, then eventually remove the old cuda code.

shiyu1994 · 2023-01-03T04:19:30Z

Do you think we should just remove the version that's currently called cuda and just focus on cuda_exp? Then in LightGBM v4.0.0, users who are installing with device_type = "cuda" will get what we're now calling "cuda_exp".

Yes. We can do that in v4.0.0.

jameslamb · 2023-01-18T05:44:31Z

Excellent, thanks! I just opened #5677 for your consideration.

jameslamb · 2023-03-03T03:40:24Z

The last thing I absolutely want to get done for v4.0.0 is fix #5061, updating the way lightgbm's wheels and sdists are built.

I'm working on that in #5759, and have started opening up some other small PRs to help with it (e.g. #5761).

@shiyu1994 @guolinke what else do you think absolutely needs to be done before we can put up a v4.0 release?

shiyu1994 · 2023-03-03T15:08:16Z

@jameslamb For objective functions and metrics on the new CUDA version, we will make most of them available on GPU. We may leave some of them (the ones that are not used very frequently) falling back to CPU. This part is almost done.

In addition, I think we should include quantized training. The code is ready in https://github.com/Quantized-GBDT/Quantized-GBDT. We just need to merge it into the official repo. (1~2 weeks ETA)

Finally, if possible, we may have initial support for multi-GPU training on CUDA. That initial support may again take 1~2 weeks. Depending on the progress of development and the date we want to release 4.0.0, we may adjust this goal.

jameslamb · 2023-03-04T00:10:57Z

Ok works for me! I'm happy to help with reviews, docs, tests, whatever you need.

jameslamb · 2023-06-26T02:29:40Z

There are now 1.5+ years' worth of fixes and improvements sitting on master which haven't yet made it into a release.

@shiyu1994 @guolinke @jmoralez I think now that #5061 has been addressed and CPU-based quantized training has been added (#5800) we should finally do a v4.0.0 release and not wait any longer.

Do you agree? If yes, I'll create the release PR and we can talk more there.

guolinke · 2023-06-26T07:34:47Z

@shiyu1994 Do we have any other breaking changes?

shiyu1994 · 2023-06-27T04:56:34Z

@jameslamb @guolinke How about we at least get #5933 merged into v4.0.0?

jameslamb · 2023-06-27T13:03:21Z

That PR doesn't contain breaking changes, so I don't think we should wait for it. It could be part of a 4.1 or 4.0.1 release.

If you can't think of any other breaking changes, I think we should move forward with v4.0.0.

shiyu1994 · 2023-06-28T03:21:37Z

That's OK. We can move forward.

jameslamb · 2023-06-28T16:43:01Z

I have one more small breaking change that is ready to be reviewed:

#5947

After that, I'll make a PR for the 4.0 release. Exciting!!!

jameslamb · 2023-07-21T19:54:54Z

For anyone subscribed to notifications and wondering why v4.0.0 of the R package isn't on CRAN yet... it won't be until at least mid-August.

See #5952 (comment)

jameslamb added question maintenance labels Apr 14, 2022

microsoft locked and limited conversation to collaborators Apr 14, 2022

guolinke closed this as completed Apr 17, 2022

guolinke reopened this Apr 17, 2022

jameslamb pinned this issue May 8, 2022

jameslamb added a commit to jameslamb/LightGBM that referenced this issue Oct 30, 2022

[R-package] fix handling of custom paths in Windows GPU build (fixes m…

a626b57

…icrosoft#5153)

jameslamb closed this as completed in #5952 Jul 13, 2023

jameslamb unpinned this issue Oct 8, 2023
