Mapie can not use Pipelines to its full extent, throws exception #149
Comments
Hi @nilslacroix, which version of MAPIE are you using?
I am using MAPIE 0.3.1, conda 4.12, Python 3.9.6 and scikit-learn 0.24.2 on a Windows 10 machine.
I deleted the whole part at line 458 where it does the check, which lets the LGBM regressor start, but after that I get this index error:
It seems to be related to line 368 in regression.py.
OK, could you try the latest version of MAPIE, 0.3.2? We recently fixed a similar issue: #128
The latest version works fine. Thank you very much for your work on this project :)
Describe the bug
I want to use MAPIE on a model that I obtained from gscv.best_estimator_. The model uses a pipeline, which looks like this:
When I use the lines
Mapie throws the exception:
ValueError: could not convert string to float: 'EFH'
in
~\miniconda3\envs\Master_ML\lib\site-packages\mapie\regression.py in fit(self, X, y, sample_weight)
    457 cv = self._check_cv(self.cv)
    458 estimator = self._check_estimator(self.estimator)
--> 459 X, y = check_X_y(
    460     X, y, force_all_finite=False, dtype=["float64", "int", "object"]
    461 )
X_train and y_train are still in raw format (strings, unscaled values, ...); the pipeline was designed to address this.
My guess is that when mapie.fit is called on X_train, the categorical value "EFH" produces the error because it
is not of type float64, int or object. However, the pipeline would address this by using an encoder.
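To illustrate the ordering problem described above, here is a minimal sketch in plain Python. It does not use MAPIE or scikit-learn: `validate_numeric` and `encode_categories` are hypothetical stand-ins for an eager input check and for the pipeline's encoder, and the sample rows (including the category "MFH") are made up. It only shows that the same raw data fails when checked before encoding and passes when checked after.

```python
def validate_numeric(rows):
    """Hypothetical eager check: every cell must be convertible to float."""
    return [[float(cell) for cell in row] for row in rows]


def encode_categories(rows, column):
    """Hypothetical toy encoder: map each string in `column` to an integer code."""
    codes = {}
    encoded = []
    for row in rows:
        new_row = list(row)
        # Assign the next free integer the first time a category is seen.
        new_row[column] = codes.setdefault(row[column], len(codes))
        encoded.append(new_row)
    return encoded


# Made-up raw data: one categorical column, one numeric column.
X_raw = [["EFH", 120.0], ["MFH", 85.5], ["EFH", 140.0]]

# Checking the raw data first fails on the categorical value.
try:
    validate_numeric(X_raw)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Encoding first, then checking, succeeds.
X_checked = validate_numeric(encode_categories(X_raw, column=0))

print(error_message)
print(X_checked)
```

In this toy setup the check only succeeds once the encoding step has run, which is the behavior I would expect when the estimator passed to MAPIE is a pipeline.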
Expected behavior
I would expect the pipeline to preprocess my data before any exception is thrown because of a wrong data type.