Monotone models #184
Hi @Garve, Thanks for bringing this up, and sorry for our delay in getting back to you. We completely agree that the ability to enforce monotonicity would be a nice addition for EBMs, but we haven't had time to do it. There are a few different ways to implement this -- we can enforce it during training, like LightGBM, or we can provide options to enable it as a post-processing step on a trained EBM model. When enforcing monotonicity during boosting, we've noticed that models tend to take advantage of correlated features to bypass the constraint. Enforcing monotonicity as a post-processing step might be more ideal, but it still requires further investigation for our model class. One way we've done this in the past is by applying isotonic regression (the Pool-Adjacent-Violators Algorithm or PAV) on the graphs that need to be monotonic. We'll leave this issue open to track the demand for this feature, and will update this thread once we've made some progress on the research or implementation sides. If anyone would like to discuss this further or help out on this feature, we'd be happy to talk with you! Thanks! |
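A minimal sketch of the isotonic-regression (PAV) post-processing idea described above, assuming the per-bin scores of one trained EBM term are available as a NumPy array; the helper name and inputs here are placeholders, not interpret's API:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def pav_monotone(bin_scores, increasing=True, bin_weights=None):
    """Project a 1-D shape function onto the nearest monotone curve via PAV."""
    x = np.arange(len(bin_scores))
    ir = IsotonicRegression(increasing=increasing, out_of_bounds="clip")
    # sample_weight lets bins that were seen more often during training move less
    return ir.fit_transform(x, bin_scores, sample_weight=bin_weights)

# Hypothetical usage: new_scores = pav_monotone(term_scores_for_age, increasing=True)
```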
Hello @interpret-ml team! Thanks for the answer :) I would say that enforcement during training makes more sense. Doing it after training alters what the model is actually saying, right? As a very naive approach, I could use max(0, [model output]): the model would output some negative value, but we clamp it to 0. That feels kind of hacky to me. The direct approach might have some issues with correlations, but those problems are always there, no? We can create a dataset X, y and insert a copy of some column of X into X again:

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingRegressor
from interpret import show

X = np.random.randn(10000, 2)
X = np.hstack([X, X[:, [0]]])  # insert copy of the first column
y = X[:, 0] + X[:, 1]

ebm = ExplainableBoostingRegressor(interactions=False)
ebm.fit(X, y)
ebm_global = ebm.explain_global(name='EBM')
show(ebm_global)
```

The ExplainableBoostingRegressor also can't tell whether feature 1 or feature 3 is more important; both even come out half as important as feature 2. This is also a problem due to correlation. Therefore, I think users should take care of correlation problems themselves. What are your thoughts about this? Thank you very much! :) Best regards |
I agree with Garve here. I feel the model will be more accurate if we train it with a monotonic constraint. I am currently doing a lot of research on the use of EBMs/GBMs to find heat coefficients and change-points in gas data when compared with outside air temperatures and other weather and non-weather variables. See here for some examples using piece-wise linear regression on the univariate case of temperature. I have also managed to recover change-points and some crude heating coefficients from the EBM models as well, but only when the data is very well behaved, or a good deal of care is taken cleaning it beforehand. I was planning to do a detailed write-up on this, and propose a Python notebook example on how to treat the model after training, but it seems this is a good time to raise one of my findings/thoughts on monotonicity: in the post-processing, one of the issues is that if there is a sizeable negative step, then in the monotonically increasing case a smoother doesn't know which way to smooth it. This is also true at the far left side of the graph. I am now experimenting with weighted smoothing as a post-processing step; however, it seems rather more tricky and requires the original training data. Thus, it seems better to treat this at training time! FYI - to save potential confusion, this case is not another variable vs gas, but one that is KNOWN to be monotonically increasing (vs temperature, which is decreasing). |
Hi again! I implemented a very naive proof-of-concept version of an ExplainableBoostingMetaRegressor that takes any base regressor as input, see here on my GitHub. I can even give each feature its own base regressor. Is it an option to implement it like this, just more efficiently? :D To come back to the original problem: if I want monotonically increasing behavior in some features, I can give it an IsotonicRegression() from scikit-learn. If I want it decreasing, I give it an IsotonicRegression(increasing=False). If I need positive values, I can give it an IsotonicRegression(y_min=0), etc. If I don't specify anything, it uses a DecisionTree with some small depth. Seems to work well! Again, a word of caution: the regressor seems to work, but I haven't tested it much so far. It's also not really efficient and doesn't work together with the show() function of interpret. It also doesn't support interactions so far. You can, however, get the nice graphs using:

```python
import matplotlib.pyplot as plt

e = ExplainableBoostingMetaRegressor()
e.fit(X, y)
for i in range(X.shape[1]):  # one plot per feature
    plt.plot(e.domains_[i], e.outputs_[i])
    plt.title(i)
    plt.show()
```

I also didn't check how you guys implemented it; I just checked out this YouTube video of how the algorithm works at a high level and tried to replicate it in code. Best regards |
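For readers skimming the thread, here is a highly simplified sketch of the idea behind such a meta-regressor (cyclic boosting with one univariate base learner per feature). It is not the code from the linked repository; the function name and returned attributes are made up for illustration:

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeRegressor

def fit_meta_ebm(X, y, base_regressors=None, n_rounds=50, learning_rate=0.1):
    """Cyclically boost per-feature shape functions on the current residuals.

    base_regressors: one sklearn regressor per feature, e.g. a shallow
    DecisionTreeRegressor, or an isotonic regressor for features that must be
    monotone (as suggested in the comment above).
    """
    n_samples, n_features = X.shape
    if base_regressors is None:
        base_regressors = [DecisionTreeRegressor(max_depth=2)] * n_features
    contributions = np.zeros((n_samples, n_features))
    ensembles = [[] for _ in range(n_features)]
    for _ in range(n_rounds):
        for j in range(n_features):
            residual = y - contributions.sum(axis=1)          # current residuals
            model = clone(base_regressors[j]).fit(X[:, [j]], residual)
            contributions[:, j] += learning_rate * model.predict(X[:, [j]])
            ensembles[j].append(model)
    return ensembles, contributions
```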
I also have a need for monotonicity, and would prefer to have it enforced during training. I don't really have a solution other than those that have been mentioned, but posting to +1 the column of demand. |
Hi David, Do you need monotonicity for just one or a few variables, or for all variables? |
Hi Rich, I think we've talked before on this topic some months ago. I would need monotonicity on all features, ultimately, as I'm in a regulated space. |
@richcaruana all of the above. Something like -1 = decreasing, 0 = no constraint, +1 = increasing, specified per feature. |
@paulsendavidjay: thanks for reminding me of our previous discussion. Completely agree with you that if you need monotonicity on all features then the best way to achieve that is via constraints imposed during training. Not sure how quickly we'll have that implemented, but it is on our radar. @JoshuaC3: the interface you suggest (-1 = decreasing, 0 = no constraint, +1 = increasing) makes sense. Adding constraints to only a subset of features doesn't always achieve the effect you want. If there is no correlation among features, then imposing constraints per feature works exactly as you would expect, but in the usual case where there is correlation among features, learning will do everything it can to get around the monotonicity constraints while still appearing to be monotone on the features you constrained. For example, imagine you have two copies of a feature (but aren't aware of it) and put a monotonicity constraint on one of the copies, but not both. The model will satisfy the constraint on the feature you constrained, but will use the other, unconstrained copy to undo what it has learned on the constrained feature. So in the end it is not correct to think of the model as being monotone on the constrained feature, since the model has used correlation among the features to undo that monotonicity. There are almost always many correlations among features in complex datasets, so this is a real problem and makes applying monotonicity constraints to subsets of features problematic. And this is a problem with monotonicity constraints for all learning methods, not just EBMs. At least the effects are more visible with glassbox methods like ours. |
@richcaruana I should have said, it is the interface used by LightGBM, CatBoost and XGBoost. I hadn't considered the collinearity effects for monotonic constraints in general here - what an excellent insight!! That said, I don't think it should cause many issues. Checking collinearity is something an ML practitioner should do as standard as part of EDA/train-test-split/feature selection. Additionally, a domain expert of the type who is likely to set monotonic constraints should understand which of their independent variables are monotonically correlated with the dependent variable and with one another. In my main use case, the latter is certainly true. I know from the physics of the system I am predicting that the independent variables are all either positively or negatively monotonic. I intend to share my use case at some point as I feel it will be interesting and stimulate the discussion further! Your last point is very pertinent - the fact that this is a glassbox model and has the rest of the interpret toolkit (counterfactuals etc.) allows you to understand if/when this behaviour occurs. This is EXACTLY why I wish to use EBM over some of the more established GBMs with monotonic constraints! :D |
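For concreteness, a minimal sketch of that per-feature -1/0/+1 interface as it appears in LightGBM (one of the libraries mentioned above); `X_train` and `y_train` are assumed to already exist:

```python
import lightgbm as lgb

# One entry per feature: +1 = monotonically increasing,
# -1 = monotonically decreasing, 0 = unconstrained.
model = lgb.LGBMRegressor(monotone_constraints=[1, -1, 0])
model.fit(X_train, y_train)
```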
I was considering @richcaruana's above concern: collinear, correlated, or highly descriptive independent variables. Depending on the application and the reason for wanting to constrain some variable to be monotonic, including 2nd order terms could cause problems. Some ideas for how to control for this would be as follows:
|
Hi @Garve, @JoshuaC3, and @paulsendavidjay, Thanks for the spirited discussion around this! Wanted to add to this thread some utility code that post-processes any main effect graph to enforce monotonicity (after training):

```python
from sklearn.isotonic import IsotonicRegression
from copy import deepcopy
import plotly.graph_objects as go
import numpy as np

def make_monotone(ebm, feature, direction='auto', inplace=False, visualize_changes=True):
    ''' Adjusts an individual feature to be monotone using isotonic regression.

    Args:
        ebm: Fitted ExplainableBoostingClassifier or ExplainableBoostingRegressor.
        feature: Index or name of continuous univariate feature to apply monotone constraints to.
        direction: 'auto', 'increasing' or 'decreasing'. Auto decides sign based on Spearman correlation estimate.
        inplace: If True, modifies existing EBM in place. If False, returns new EBM.
        visualize_changes: Produces Plotly visualization highlighting edits.

    Returns:
        If not inplace, returns new EBM with monotonicity constraints.
    '''
    if isinstance(feature, str):  # Find feature index if passed as string
        feature_index = ebm.feature_names.index(feature)
    else:
        feature_index = feature

    x = np.array(range(len(ebm.additive_terms_[feature_index])))
    y = ebm.additive_terms_[feature_index]
    w = ebm.preprocessor_.col_bin_counts_[feature_index]

    # Fit isotonic regression weighted by training data bin counts
    direction = 'auto' if direction not in ['increasing', 'decreasing'] else direction == 'increasing'
    ir = IsotonicRegression(out_of_bounds="clip", increasing=direction)
    y_ = ir.fit_transform(x, y, sample_weight=w)

    ebm_mono = deepcopy(ebm)
    ebm_mono.additive_terms_[feature_index][1:] = y_[1:]

    # Plot changes to model
    if visualize_changes:
        ebm_global = ebm.explain_global()
        trace = ebm_mono.explain_global().visualize(feature_index)
        trace['data'][1]['line']['color'] = 'red'
        trace['data'][1]['name'] = "Monotone"

        source_layout = ebm_global.visualize(feature_index)['layout']
        source_data = list(ebm_global.visualize(feature_index)['data'])
        source_data = [source_data[index] for index, t in enumerate(source_data)
                       if t.name in ["Main", "Distribution"]]
        source_data[0]['fill'] = None
        source_data.append(trace['data'][1])
        source_layout['showlegend'] = True

        fig_mono = go.Figure(
            data=source_data,
            layout=source_layout
        )
        fig_mono.show()

    # Modify in place or return copy
    if inplace:
        ebm.additive_terms_[feature_index][1:] = y_[1:]
    else:
        return ebm_mono
```

Here's a quick usage example:

```python
modified_ebm = make_monotone(ebm, feature='Age', direction='auto', inplace=False, visualize_changes=True)
```

which produces a new EBM and the corresponding visualization (if `visualize_changes=True`). This function isn't fully featured or tested yet, but we wanted to share it here first to provide a temporary solution and get feedback. As @JoshuaC3 points out, this also may not enforce true monotone constraints when pairwise interactions containing the feature are present -- maybe we should throw a warning in those cases, or explore ways to post-process constraints on pairwise interaction terms? We don't intend for this to be a replacement for monotone constraints at training time, but it could be a nice supplemental utility function for the cases where monotonicity via post-processing makes sense. It'd be useful for us to hear if this function works on your problems as we work on training-time constraints! -InterpretML Team |
@interpret-ml Very nice! I had spent some time a while ago looking at just such a post-processing method, but I was having difficulties accessing the right data given my unfamiliarity with the objects, and had to drop it to work on other business items. This is a great solution that could be applied to many business cases, with a clear visualization of the trade-off. Thank you for such a quick turnaround! |
From the above code I get an error about 'col_bin_counts_'; modifying the code by replacing 'col_bin_counts_' with 'col_bin_edges_' works, and the result looks good! |
Hi @paulsendavidjay, Same to you -- thanks for the quick feedback! It's a bit surprising that your EBMPreprocessor doesn't have the col_bin_counts_ attribute exposed. Any chance you can check what version of interpret you're on? 0.2.4 (our latest release) should have support for this. From the command line, or in a Python environment:

```python
import interpret
interpret.__version__
```

should show the version number. If you can upgrade, the newer release should resolve this. |
Having given some further thought to the discussion here, I have raised the above issue. I think this would address some of the fears we had around constrained variables when used in 2nd order features, as well as 2nd order features in regulated spaces. |
Hi @interpret-ml: Are you still working on adding monotonic constraints during training to the algorithm? It would be great if this feature could be implemented, since domain knowledge is crucial when a model is being used in practice. Thank you. |
Hi @interpret-ml Thank you. |
Hey @huanvo88, I plan to work on EBM monotonicity through post-processing. Just curious, are there laws or regulations that require insurance companies to use monotone ML models? If so, could you please point me to some related documents? |
Hi @xiaohk , I think insurance in Canada is more regulated, and I am not dealing with filing so I don't have any legal documents to give you. But sometimes when we present the models to the business, they would require certain features to be increasing or decreasing. From the discussion on this thread it seems it is better to incorporate the constraint in the fitting (like XGBoost or Lightgbm) rather than a post processing, but we can use post processing if there is no better alternative. |
Got it @huanvo88, thanks! If you only want certain features (not all) to be increasing or decreasing, post-processing might be a better solution than a training-time monotonic constraint. You can see #184 (comment)
|
Ah ok, I see, thanks @xiaohk. Also, just out of curiosity: XGBoost, LightGBM, and CatBoost also have monotone constraints. I assume that is also post-processing? Or did they implement it during the fitting process? |
They implement it as a monotonicity constraint during training. I believe monotonicity constraint during EBM training is on the development roadmap too. |
@xiaohk it is good to know that it is on the development roadmap. So I assume for now you will work on the monotone post processing and push it to the next release? |
@huanvo88 During the fitting process, if the direction of the identified split L > R is different from the constraint L < R, then a split is simply not made. In regulatory space, we are often required to give plain-language explanations for adverse decisions based on model scores. Business leaders need to make sure that these explanations are sensible. For example, it would make sense to say 'you were declined a loan offer because your total debt is too high', if debt is the most impactful feature in that model for that individual. But if debt had a U-shaped pattern, the explanation could end up being 'you were declined a loan because your debt is both too high and too low', which would not make sense. Monotonic constraints eliminate that possibility, with rare exceptions. |
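A rough sketch of that split-rejection rule (illustrative pseudocode, not the implementation of any particular library):

```python
def split_allowed(left_mean, right_mean, constraint):
    """constraint: +1 increasing, -1 decreasing, 0 unconstrained.

    A candidate split whose left/right predictions violate the requested
    direction is rejected, and the booster keeps searching for another split.
    """
    if constraint == 1 and left_mean > right_mean:
        return False
    if constraint == -1 and left_mean < right_mean:
        return False
    return True
```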
@huanvo88 My stuff is still work-in-progress, but I will keep you updated. If you are interested, I can also show you the pre-release version in the next few weeks. I'd really love to get some feedback from you :) For now, I suggest just using isotonic regression to find the best monotonic shape for your learned shape function. The code is included in #184 (comment). |
Hey @paulsendavidjay, thanks for the reply! Your example makes a lot of sense. Just out of curiosity, what is the rare exception where monotonic constraint doesn't help? |
Awesome, thank you.
|
you can check this out https://cs.stackexchange.com/questions/69220/random-forests-on-monotone-training-set-yields-a-monotone-classifier |
An interesting paper on better monotonic splits in Trees: https://arxiv.org/pdf/2011.00986.pdf Having quickly read the paper, my initial understanding is that it improves on the monotonicity constraints as follows:
My intuition tells me that this may only be used at the 2nd order interaction terms stage of training. Additionally, if my intuition is correct, the very small decrease in training time would be even less important, as the "opposite-branch-check", as italicised in the quote above, would only need checking on a small subset of cases. Finally, I accept that because it might be used on only a small subset of cases, it might not be worth implementing for a potentially small accuracy improvement. Nonetheless, it would be interesting to test and find out! |
Hello @Garve, @paulsendavidjay, @JoshuaC3, @flippercy and @huanvo88, thank you so much for using Interpret! I am Jay Wang, a research intern on the InterpretML team. We are developing a new visualization tool for EBM and recruiting participants for a user study to test out the new tool (see #283 for more details). We think you are a good fit for this paid user study! If you are interested, you can sign up with the link in #283. Let me know if you have any questions. Thank you! |
This would be a great feature! Any update on its implementation timeline? |
@discdiver Maybe GAM Changer can help! You can try out this interactive tool to edit your EBM models and make them monotonic. |
Great thread! Thanks, everyone! I just want to follow up: is there any update about adding the monotonicity constraints to the training process? As @paulsendavidjay suggested, many industries are heavily regulated and we need to apply monotonicity constraints to meet those requirements. Thank you! |
I get the error: Can't get the 'EBMPreprocessor' on <module 'intepret.glassbox.ebm.ebm' from '/file/python3.10/site-packages/interpret/glassbox/ebm/ebm.py'> while loading an EBM model from a pickle file. |
@Jebin1999 -- You should probably open a new thread for questions like this, but I'll just say that this error is what I'd expect if you were to open a model built in interpret 0.2.7 in a 0.3.x version. |
Thanks, I built the model using version 0.2.7, but the same error continues. Could you please check once again and confirm which version the EBMPreprocessor is in? Thanks, -Jebin
|
Our latest v0.4.0 release includes a post-processing monotonize function. Details on how to call it are available in our docs at: https://interpret.ml/docs/ebm.html#interpret.glassbox.ExplainableBoostingRegressor.monotonize |
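A minimal usage sketch of that function, assuming an already-fitted EBM; the argument values used here ("Age", increasing=True) are illustrative, and the exact signature is documented at the link above:

```python
from interpret.glassbox import ExplainableBoostingRegressor

ebm = ExplainableBoostingRegressor()
ebm.fit(X_train, y_train)  # X_train, y_train assumed to be defined elsewhere

# Post-process the "Age" term so its shape function is monotonically increasing
ebm.monotonize("Age", increasing=True)
```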
It is great to have a monotonicity constraint on EBM already. Just wondering if it would be possible to add concavity and convexity constraints to the roadmap. By having those constraints configurable by feature, I guess that all typical prior information would be available to adjust the models. Interpretability is often needed together with those kinds of prior constraints between features and the target variable. Kind regards. |
Hi @Guillermogsjc -- That's an interesting option. Using concave as an example, you could probably do a fairly good job by selecting the bin with the highest score and monotonizing the graph to the left as increasing and the graph to the right as decreasing. You could do an even better job by repeating this for a few of the highest bins instead of just the highest, and selecting the one that results in the smallest change. I'm really curious to know what applications you had in mind for convex and concave constraints? |
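A rough sketch of that idea, assuming the shape function is available as a plain array of per-bin scores and (optionally) per-bin weights. Note this produces a unimodal (rise-then-fall) curve as described above, an approximation rather than a strict concavity constraint:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def rise_then_fall(scores, top_k=3, weights=None):
    """Monotonize left of a peak bin as increasing and right of it as decreasing.

    Tries each of the top_k highest-scoring bins as the peak and keeps the
    candidate whose weighted squared change from the original curve is smallest.
    """
    scores = np.asarray(scores, dtype=float)
    x = np.arange(len(scores))
    w = np.ones_like(scores) if weights is None else np.asarray(weights, dtype=float)
    best, best_err = None, np.inf
    for peak in np.argsort(scores)[::-1][:top_k]:
        left = IsotonicRegression(increasing=True).fit_transform(
            x[:peak + 1], scores[:peak + 1], sample_weight=w[:peak + 1])
        right = IsotonicRegression(increasing=False).fit_transform(
            x[peak:], scores[peak:], sample_weight=w[peak:])
        candidate = np.concatenate([left[:-1], right])  # peak bin counted once
        err = np.sum(w * (candidate - scores) ** 2)
        if err < best_err:
            best, best_err = candidate, err
    return best
```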
Thanks for the proxy :) Well, convexity restrictions arise as a prior in any relationship between a covariate and the target variable where "it is known" that the feature must have a unique minimum or maximum with respect to its effect on the target variable. Often the goal is even to search for that minimum or maximum when modeling the case. |
We now support monotone constraints during fitting, so closing this issue. As detailed in the thread above, we almost always recommend using post-processed model editing instead of applying constraints during fitting, except for investigative purposes or when you're 100% sure the underlying generation function has a fundamental monotonic relationship. An example of this would be when you're modeling a physical system (e.g., the dark matter modeling papers on our readme). I've also updated our documentation to reflect this recommendation. |
Hi!
Are there plans to implement monotonic regressors, like it is possible for LightGBM, for example?
Thank you!
Best
Robert