Add user friendly string options for interaction constraints in HistGradientBoosting* #24845

Closed
2 tasks done
lorentzenchr opened this issue Nov 6, 2022 · 7 comments · Fixed by #24849

Comments

@lorentzenchr
Member

lorentzenchr commented Nov 6, 2022

Describe the workflow you want to enable

model_no_interactions = HistGradientBoostingRegressor(
    interaction_cst="no_interactions"
)

model_pairwise_interactions = HistGradientBoostingRegressor(
    interaction_cst="pairwise"
)

instead of

import itertools

n_features = X_train.shape[1]

model_no_interactions = HistGradientBoostingRegressor(
    interaction_cst=[[i] for i in range(n_features)]
)

model_pairwise_interactions = HistGradientBoostingRegressor(
    interaction_cst=list(itertools.combinations(range(n_features), 2))
)

Describe your proposed solution

  • "no_interactions" is straightforward.
  • "pairwise" expands to a list whose length is quadratic in the number of features. It might be more memory efficient to use generators internally.
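
As a rough illustration of the two bullets above, the string options could be expanded lazily along these lines. This is only a sketch: `expand_interaction_cst` is a hypothetical helper name, not scikit-learn's actual implementation, and it assumes the number of features is already known at expansion time.

```python
import itertools


def expand_interaction_cst(option, n_features):
    """Hypothetical helper: expand a string option into explicit index groups."""
    if option == "no_interactions":
        # One singleton group per feature, so no feature interacts with another.
        return ([i] for i in range(n_features))
    if option == "pairwise":
        # Lazily yields all n_features * (n_features - 1) / 2 index pairs
        # instead of materializing the quadratic-size list up front.
        return itertools.combinations(range(n_features), 2)
    raise ValueError(f"Unknown interaction_cst option: {option!r}")


pairs = expand_interaction_cst("pairwise", 4)
print(list(pairs))  # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Both branches return an iterator, so a caller that only needs to loop over the groups once never holds the full pairwise list in memory.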

Describe alternatives you've considered, if relevant

No response

Additional context

This was proposed as follow-up in #21020 (comment).

@mzugravu

mzugravu commented Nov 6, 2022

Hi,

If no one else is taking it, I can take this issue. I have to make a contribution as part of one of my lectures at university, so I'd be grateful if you let me do it.

However, I may need more explanation. I've just checked the docs of HistGradientBoostingRegressor and there is no argument called interaction_cst. So you would like to add this parameter and map the keywords "no_interactions" and "pairwise" to what you showed in the "instead of" part. Am I correct? Perhaps you could describe a bit what an interaction means in this case.

Let me know if I got it right and if I can take this issue. (don't hesitate to tell me if this will be too hard for a beginner) Thanks.

@lorentzenchr
Member Author

@mzugravu Have a look at the linked PR. Interaction constraints are a new feature that will be released in 1.2. I split this issue into 2 parts. The first part, "no_interactions", is by far the easier one.
But if this is your first contribution, I suggest searching for issues with the label "good first issue" (or "easy").

@mzugravu

mzugravu commented Nov 7, 2022

OK, thank you. I think I will keep searching, as this might be too hard for me.

@betatim
Member

betatim commented Nov 7, 2022

I'll take a look at this

@ogrisel
Member

ogrisel commented Nov 7, 2022

And maybe also (possibly in a follow-up PR) using the input feature names of the model when available:

model_no_interactions = HistGradientBoostingRegressor(
    interaction_cst=[
        ("name_of_feature_0", "name_of_feature_42"),
        ("name_of_feature_0", "name_of_feature_1", "name_of_feature_2"),
    ]
)
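
One way such name-based constraints might be resolved to the integer groups shown earlier in this issue is sketched below. `names_to_indices` is a hypothetical helper, not part of scikit-learn, and it assumes the feature names are available as an ordered sequence (for example from an estimator's `feature_names_in_` after fitting on a DataFrame).

```python
def names_to_indices(interaction_cst, feature_names):
    """Hypothetical helper: translate groups of feature names into index groups."""
    # Map each feature name to its column position.
    position = {name: idx for idx, name in enumerate(feature_names)}
    resolved = []
    for group in interaction_cst:
        # Fail early with a clear message if a name is not a known feature.
        missing = [name for name in group if name not in position]
        if missing:
            raise ValueError(f"Unknown feature names: {missing}")
        resolved.append([position[name] for name in group])
    return resolved


feature_names = ["age", "income", "height"]  # made-up names for illustration
print(names_to_indices([("age", "height"), ("age", "income", "height")], feature_names))
# [[0, 2], [0, 1, 2]]
```

Because the names are only known once data has been seen, a translation like this would presumably have to run at fit time rather than in the constructor.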

@lorentzenchr
Member Author

@betatim Go ahead.
@ogrisel We should open a new issue for feature names as argument option, for monotonic as well as interaction constraints.

@ogrisel
Member

ogrisel commented Nov 7, 2022

@ogrisel We should open a new issue for feature names as argument option, for monotonic as well as interaction constraints.

I agree. Let me do that and I will cross-link back to this issue.
