Monotone_constraints in LightGbm #1651

Closed · Tracked by #6337
petterton opened this issue Nov 16, 2018 · 12 comments
@petterton

In my ML problem I get significantly better results from LightGbm by using the monotone_constraints parameter. I can not see that this is available through the ML.NET interface. Could this be added?

@najeeb-kazmi (Member)

I agree this would be a useful addition to ML.NET. We provide a wrapper for LightGBM but this parameter is not exposed. I will file this for our triage team to review and prioritize.

@najeeb-kazmi added the enhancement (New feature or request) and API (Issues pertaining the friendly API) labels Nov 16, 2018
@singlis (Member) commented Dec 21, 2018

Here are links from LightGBM for adding monotone constraints:
Issue filed here: microsoft/LightGBM#14
Committed here: microsoft/LightGBM#1314

The version of LightGBM we are using in ML.NET is 2.2.1.1 -- we need to confirm whether this version contains support for monotone_constraints.

@justinormont (Contributor)

@daholste: Assuming I'm understanding this parameter correctly, this can also help with model stacking. Currently, LightGBM is allowed to map the output of the sub-models in the stack without the constraint that "as a sub-model's score increases/decreases, so should the final score". Hence it could map as f(x) = ( x < 3 ? 1.0 : (x < 4 ? 0.0 : 2.0 )). When the sub-models correlate well with the label, we would likely benefit from a monotonically increasing meta-model for stacking.
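
For illustration, a minimal sketch of such a monotone meta-model using LightGBM's Python package (synthetic data; the assumption that each column of the meta-features is one sub-model's score is mine):

```python
import lightgbm as lgb
import numpy as np

# Synthetic stand-ins: each column of `scores` is one sub-model's output.
rng = np.random.default_rng(0)
scores = rng.random((1000, 3))
y = scores @ np.array([0.5, 0.3, 0.2]) + 0.05 * rng.standard_normal(1000)

# Constrain the meta-model to be non-decreasing in every sub-model score,
# so a higher sub-model score can never lower the final prediction.
meta = lgb.LGBMRegressor(monotone_constraints=[1, 1, 1], n_estimators=100)
meta.fit(scores, y)
```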

@glebuk added this to To Do in v0.10 Jan 14, 2019
@glebuk (Contributor) commented Jan 14, 2019

Update LightGBM and expose the arg. Note whether this fixes #1625 as well.

@justinormont (Contributor) commented Jan 17, 2019

@glebuk: No need to update LightGBM for monotone_constraints.

Our current LightGBM version is from 3 months ago; monotone_constraints was added 9 months ago.

So our current version of LightGBM should work without updating.

Work item should be: Expose the monotone_constraints parameter of LightGBM

@singlis self-assigned this Jan 18, 2019
@singlis moved this from To Do to In Progress in v0.10 Jan 18, 2019
@singlis (Member) commented Jan 23, 2019

Hi @justinormont, @glebuk and @petterton,

LightGBM allows for the setting of monotone constraints for each feature. For example, if you have a column called Features that is made up of 10 features, you can specify which constraint to use for each feature, in feature order, by setting the constraint value to 1 for increasing, 0 for no constraint, or -1 for decreasing. So 1,-1,0,0,0..0 would apply an increasing constraint to the first feature, a decreasing constraint to the second feature, and no constraint to the remaining features.
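
For example, with LightGBM's Python package directly, that looks like this (a minimal sketch on synthetic data; ML.NET would need to forward an equivalent vector):

```python
import lightgbm as lgb
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 4))
y = 2 * X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(1000)

# 1 = increasing, -1 = decreasing, 0 = unconstrained, in feature order.
params = {
    "objective": "regression",
    "monotone_constraints": [1, -1, 0, 0],
}
booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=50)
```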

While per-feature control is probably very powerful, I started thinking about how the user would specify this when they have a large number of features -- even with 20 or 30 features, no one wants to type in an array that long, as that is not only tedious but error-prone.

Do we need to control the constraint at a per feature level? Or would applying the same constraint to all features suffice?

@justinormont (Contributor) commented Jan 23, 2019

I think all or nothing is ok.

I'd like to have control at the per column level, but I'm not sure how to make it user friendly.

My specific use case: I have a stacked model. The sub-models are rational, therefore I'd like them to be positive-monotone when combined to give the final score; I also want to feed some raw features to the final learner, as this stacking method shows promise. To get the same results, I currently duplicate the raw features as NegRawFeat = RawFeat * -1, then xf=Concat{Features:SubModelScores,NegRawFeat,RawFeat} tr=LogisticRegression{nn=+}. This accomplishes the goal (for LR, though not for LightGBM), though it loses the slot names, making feature importance difficult.
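
Roughly the same trick sketched in scikit-learn terms (synthetic data; LinearRegression(positive=True) stands in for the nn=+ non-negativity option, since sklearn's LogisticRegression has no such flag):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
sub_scores = rng.random((500, 3))   # stand-ins for sub-model outputs
raw = rng.random((500, 2))          # raw features also fed to the meta-model
y = sub_scores @ np.array([0.5, 0.3, 0.2]) + raw[:, 0] - raw[:, 1]

# Duplicate the raw features negated, then force non-negative coefficients:
# a positive weight on `raw` gives an increasing effect and a positive
# weight on `-raw` a decreasing one, so the fit stays monotone per input.
X = np.hstack([sub_scores, raw, -raw])
meta = LinearRegression(positive=True).fit(X, y)
```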

Two ways that are slightly user-friendly:

  • I suppose we could have three input columns: { Features, PosFeatures, NegFeatures }, where Features is the normal one as currently. Then users can concatenate the columns they want to be positive-monotone into PosFeatures. I think inventing new input columns is confusing to users, so this is bad.
  • I don't think it is possible, but another route would be to take in a single Features column, as currently, then the user specifies the names of the columns (within the Features column) that they want as positive/negative. I don't think it's possible to locate the slots within the Features column corresponding to the user's specified columns. Perhaps match on slot names?

singlis added a commit to singlis/machinelearning that referenced this issue Jan 30, 2019
Adds support for monotone constraints in LightGBM. This is done through the LightGBM Options class via the MonotoneConstraints member. To handle the monotone constraints, this adds the ability to specify whether to use a positive constraint or a negative constraint along with a range. Multiple ranges can be specified.

This checkin also includes tests for the parsing of the ranges to validate the expected value that will be passed to LightGBM.

This fixes dotnet#1651.
@shauheen removed this from In Progress in v0.10 Jan 31, 2019
@shauheen added this to In Progress in v0.11 Jan 31, 2019
@singlis (Member) commented Feb 7, 2019

After talking with Tom, this needs more thought on how it can be made more user-friendly. The PR that I currently have here (#2330) has the user specifying the constraints based upon indices. This was primarily to handle the way LightGBM works, but ML.NET does not work this way, and using indices after concatenating columns into a Features column does not give a clear way to know which indices map to which features.
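
Schematically, an index/range-based spec expands into LightGBM's positional vector something like this (an illustrative sketch only, not the PR's actual syntax):

```python
# Illustrative only -- not the PR's actual syntax. Expand range entries
# like "+:0-2" (increasing over slots 0..2) into LightGBM's positional
# vector of 1 / -1 / 0 per feature.
def expand_ranges(spec, num_features):
    vec = [0] * num_features
    for entry in spec.split(","):
        sign, span = entry.split(":")
        lo, hi = (int(i) for i in span.split("-"))
        for i in range(lo, hi + 1):
            vec[i] = 1 if sign == "+" else -1
    return vec

print(expand_ranges("+:0-2,-:5-7", 10))  # [1, 1, 1, 0, 0, -1, -1, -1, 0, 0]
```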

Tom recommended doing something similar to how Categorical Features works: we would manage the mapping of feature names to indices somewhere (such as in the metadata). This would allow the user to specify the constraints by name (which would map to indices) rather than by the specific indices.
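
The name-based version would look roughly like this (again a hypothetical sketch; the feature names and resolution step are invented for illustration):

```python
# Hypothetical user-facing spec: constraints keyed by feature name, with
# the framework resolving slot names to indices behind the scenes.
feature_names = ["age", "price", "clicks", "region_0", "region_1"]  # slot names
constraints_by_name = {"age": 1, "price": -1}

monotone_vector = [constraints_by_name.get(n, 0) for n in feature_names]
print(monotone_vector)  # [1, -1, 0, 0, 0]
```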

Also from talking with Tom, this work can be done post v1.0, as it would not require any API changes.

My vote is to pause on this for now and reinvestigate post 1.0.

@shauheen and @TomFinley feel free to comment if you have anything additional.

@justinormont (Contributor)

TL;DR: For the moment, I'd be quite happy to have purely positive / purely negative for all slots.


I agree w/ @TomFinley. I have similar concerns about the usability.

I would also like something in the style of what I think @TomFinley is proposing (the Categorical Features style). This is similar to the second option on my list; I thought it would be too hard, but if Tom thinks it's doable, it seems like a great longer-term solution.
#1651 (comment)

  • I don't think it is possible, but another route would be to take in a single Features column, as currently, then the user specifies the names of the columns (within the Features column) that they want as positive/negative. I don't think it's possible to locate the slots within the Features column corresponding to the user's specified columns. Perhaps match on slot names?

For the moment, I'd be quite happy to have purely positive / purely negative for all slots. This covers the AutoML team's model-stacking use case:

#1651 (comment)

@daholste: Assuming I'm understanding this parameter correctly, this can also help with model stacking. Currently, LightGBM is allowed to map the output of the sub-models in the stack without the constraint that "as a sub-model's score increases/decreases, so should the final score". Hence it could map as f(x) = ( x < 3 ? 1.0 : (x < 4 ? 0.0 : 2.0 )). When the sub-models correlate well with the label, we would likely benefit from a monotonically increasing meta-model for stacking.

#1651 (comment)

My specific use case: I have a stacked model. The sub-models are rational, therefore I'd like them to be positive-monotone when combined to give the final score; I also want to feed some raw features to the final learner, as this stacking method shows promise. To get the same results, I currently duplicate the raw features as NegRawFeat = RawFeat * -1, then xf=Concat{Features:SubModelScores,NegRawFeat,RawFeat} tr=LogisticRegression{nn=+}. This accomplishes the goal (for LR, though not for LightGBM), though it loses the slot names, making feature importance difficult.

/cc @shauheen

@shauheen added this to To Do in Backlog via automation Mar 1, 2019
@shauheen removed this from In Progress in v0.11 Mar 1, 2019
@harishsk added the P2 (Priority of the issue for triage purpose: needs to be fixed at some point) label Jan 10, 2020
@KyBroecker

What's the status on this? Reading the thread it seems to me like we can use monotonic constraints with LightGBM, but only if we constrain all features in the same way? Is there any documentation?

@michaelgsharp (Member)

@luisquintanilla are you aware of any documentation on this? Closing this issue as it's been resolved by the PR, but please post any documentation on it here anyway.

@luisquintanilla (Contributor)

@michaelgsharp we don't have docs that I'm aware of. If you can point me to the PR that solved this issue or any related tests, we can add this to our docs backlog.

@ghost locked as resolved and limited conversation to collaborators Jan 5, 2023