Add sample weight to LightGBM #147
Merged
…dockerfile based on manylinux_2_28
Codecov Report
@@ Coverage Diff @@
## master #147 +/- ##
============================================
+ Coverage 80.52% 80.99% +0.46%
- Complexity 479 516 +37
============================================
Files 50 52 +2
Lines 1648 1757 +109
Branches 158 177 +19
============================================
+ Hits 1327 1423 +96
- Misses 233 238 +5
- Partials 88 96 +8
artpdr requested changes May 15, 2026
added 6 commits May 15, 2026 18:31
artpdr reviewed May 18, 2026
artpdr reviewed May 18, 2026
artpdr reviewed May 18, 2026
added 3 commits May 18, 2026 18:06
…and replaced inefficient comparison during copy to SWIG arrays
1aabfce to dde15e0
artpdr reviewed May 18, 2026
added 5 commits May 19, 2026 09:55
…Creator.schemaMatchAllFeatures
… with underscore) in single method
…ring only the sample weight field can differ between model and dataset schema
AlbertoEAF (Contributor) approved these changes May 19, 2026
I didn't have time for a thorough review, but from a quick look the logic seems sound in terms of the LGBM lib calls and the column selection.
I'd suggest a couple of manual UATs to be sure everything works as expected: passing the field, omitting it, passing an invalid one, and training on a small synthetic dataset that shows the model took the weights into consideration.
artpdr pushed a commit that referenced this pull request May 20, 2026
…ter training (#150)
# Summary
This PR addresses a memory leak introduced in #147, which did not ensure the release of all allocated SWIG resources, in particular when training crashes (e.g., when an `IllegalArgumentException` is thrown because negative sample weights were provided). To ensure allocated resources are always released, the training logic now runs inside a try-with-resources statement, which guarantees that `.close()` is called on the two resources created, `swigTrainData` and `swigTrainBooster`, regardless of whether training succeeds.
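The release pattern described in that commit message can be sketched as follows. This is a minimal illustration, not the code from #150: `NativeResource`, `train`, and the `released` list are hypothetical stand-ins, and only the names `swigTrainData` and `swigTrainBooster` come from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

public class TrainingResourcesSketch {

    /** Records which resources were released, for demonstration only. */
    static final List<String> released = new ArrayList<>();

    /** Hypothetical stand-in for a wrapper around a natively allocated SWIG object. */
    static final class NativeResource implements AutoCloseable {
        private final String name;

        NativeResource(final String name) {
            this.name = name;
        }

        @Override
        public void close() {
            // A real wrapper would free the native memory here.
            released.add(name);
        }
    }

    /** Hypothetical training routine that rejects negative sample weights. */
    static void train(final float[] sampleWeights) {
        // Both resources are closed when the block exits, normally or via an exception.
        try (NativeResource swigTrainData = new NativeResource("swigTrainData");
             NativeResource swigTrainBooster = new NativeResource("swigTrainBooster")) {
            for (final float weight : sampleWeights) {
                if (weight < 0) {
                    throw new IllegalArgumentException("Negative sample weight: " + weight);
                }
            }
            // ... the actual training calls would go here ...
        }
    }

    public static void main(final String[] args) {
        try {
            train(new float[] {1.0f, -2.0f});
        } catch (final IllegalArgumentException e) {
            System.out.println("Training failed: " + e.getMessage());
        }
        // try-with-resources closes resources in reverse order of declaration.
        System.out.println("Released: " + released);
    }
}
```

Even though `train` throws, both stand-in resources end up in `released`, with `swigTrainBooster` closed before `swigTrainData`.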
Summary
This MR adds support for sample weights in LightGBM model training. This is achieved by:
- Adding a `sample_weight` parameter to the utility class that organizes all of LightGBM's training hyper-parameters.
- Introducing utility classes for managing schema fields, in particular a `SampleWeightParamParserUtil` for the sample weight hyper-parameter.
- Modifying LightGBM's training logic to make use of the sample weight hyper-parameter.
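As a rough illustration of what using a sample weight field during training involves, the sketch below separates a weight column from the feature columns and rejects invalid weights. It is not the code from this PR: the class, both helper methods, and the row layout (weight stored as one column of each row) are assumptions made for the example.

```java
import java.util.Arrays;

public class SampleWeightSketch {

    /** Hypothetical helper: copies the weight column out of each row, rejecting invalid values. */
    static float[] extractSampleWeights(final double[][] rows, final int weightIndex) {
        final float[] weights = new float[rows.length];
        for (int i = 0; i < rows.length; i++) {
            final double weight = rows[i][weightIndex];
            if (Double.isNaN(weight) || weight < 0) {
                throw new IllegalArgumentException("Invalid sample weight at row " + i + ": " + weight);
            }
            weights[i] = (float) weight;
        }
        return weights;
    }

    /** Hypothetical helper: returns the row without the weight column, so it is not used as a feature. */
    static double[] dropWeightColumn(final double[] row, final int weightIndex) {
        final double[] features = new double[row.length - 1];
        // Copy the values before the weight column, then the values after it.
        System.arraycopy(row, 0, features, 0, weightIndex);
        System.arraycopy(row, weightIndex + 1, features, weightIndex, row.length - weightIndex - 1);
        return features;
    }

    public static void main(final String[] args) {
        // Two rows with two features each; the last column holds the sample weight.
        final double[][] rows = {{0.5, 1.0, 2.0}, {0.1, 3.0, 0.5}};
        final int weightIndex = 2;

        System.out.println(Arrays.toString(extractSampleWeights(rows, weightIndex)));
        System.out.println(Arrays.toString(dropWeightColumn(rows[0], weightIndex)));
    }
}
```

Validating the weights before any native structure is allocated is one way to fail early on bad input; whether the PR validates at that point is not stated here.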
Tests