Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for passing additional parameters to the estimator's fit method in SequentialFeatureSelector #25236

Open
ionicsolutions opened this issue Dec 26, 2022 · 7 comments

Comments

@ionicsolutions
Copy link
Contributor

ionicsolutions commented Dec 26, 2022

Describe the workflow you want to enable

I would like to be able to pass sample weights to the fit method of the estimator in SequentialFeatureSelector. (SelectFromModel has this feature as well.)

Looking at the code, it seems to me that sklearn.model_selection._validation._fit_and_score, which is where the actual scoring takes place, already supports sample weights in cross-validation (i.e., takes care of passing only the weights for samples that are actually part of the fold).

Describe your proposed solution

Based on my current understanding, SequentialFeatureSelector.fit() would need to get an additional argument fit_params (cf. SelectFromModel.fit()) that is passed on to the cross_val_score function called in SequentialFeatureSelector._get_best_new_feature_score. Everything else appears to be in place already.

if that's all it takes, I'm happy to prepare a PR. But I might very well be overlooking issues or unintended consequences.

Describe alternatives you've considered, if relevant

No response

Additional context

No response

@glemaitre
Copy link
Member

We are most certainly interested to be able to pass additional parameters. The SLEP006 is intending to do so.

@adrinjalali do you think that we should be implemented this support now or we should await the implementation of the SLEP006 to land in main with proper support?

@ionicsolutions
Copy link
Contributor Author

I had a look at SLEP006 and I believe its implementation is required, since passing sample_weight to the estimator's fit() method is possible but it is currently not possible to pass sample_weight to the score() method (cf. sklearn.model_selection._validation._fit_and_score). That's something I hadn't considered yet.

@glemaitre glemaitre removed the Needs Triage Issue requires triage label Dec 30, 2022
@adrinjalali
Copy link
Member

Since there's only one sub-estimator here, and no routing like in the pipeline is required, we can go ahead and have a PR for this independent of SLEP006, It'd be required anyway.

@ionicsolutions
Copy link
Contributor Author

@adrinjalali Thanks for looking into this! Is there already a way to pass sample_weight to score()? According to SLEP006, this is not yet possible, but I might be reading this wrong.

If there's already a way to accomplish this, I'm happy to give adding a sample_weight parameter to SequentialFeatureSelector's fit() a try.

@adrinjalali
Copy link
Member

No we can't support score() here for sample_weight. We can only add it as a fit param kinda thing, and pass it to cross_val_score as fit_params.

@ionicsolutions
Copy link
Contributor Author

Just to be sure that I understand correctly: sample_weight would be considered during fitting but not during scoring, right?

If that's the case, I don't think there's much use for a sample_weight parameter right now. At the very least, it would be quite misleading for casual users that just look at the method's signature. I also cannot think of a case where fitting to weighted samples but scoring on unweighted samples is useful.

@adrinjalali
Copy link
Member

Just to be sure that I understand correctly: sample_weight would be considered during fitting but not during scoring, right?

Yes.

If that's the case, I don't think there's much use for a sample_weight parameter right now. At the very least, it would be quite misleading for casual users that just look at the method's signature. I also cannot think of a case where fitting to weighted samples but scoring on unweighted samples is useful.

In that case, we can fix this after SLEP006 is implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants