Compilation of different preprocessing methods #304

NTNguyen13 · 2022-02-25T11:25:09Z

Hi, I've just started looking at Shapash. This line appears many times in the documentation:
preprocessing=encoder, # Optional: compile step can use inverse_transform method

However, I'm not sure how to proceed with this. I checked the code here, but I'm not clear about how to pass a dict or list_of_dict to preprocessing.

I have this example; could you please advise me on how to proceed with it?

Original df:

   A   B1   B2   C1   C2   E
1  0   B11  B03  C02  C04  1
2  1   B03  B04  C03  C04  1
3  0   B02  B03  C02  C02  1
4  1   B04  B03  C02  C03  0

I want to one-hot encode A and E, and apply a multi-label binarizer to (B1, B2) and (C1, C2). Both encoders are from sklearn.

Target df:

    A0   A1   B02  B03  B04  B11  C02  C03  C04  E0  E1
1   1    0    0    1    0    1    1    0    1    0   1
2   0    1    0    1    1    0    0    1    1    0   1
3   1    0    1    1    0    0    2    0    0    0   1
4   0    1    0    1    1    0    1    1    0    1   0

Because I have multiple encoders for multiple columns, how should I pass them to preprocessing?

Thank you very much


SebastienBidault · 2022-02-28T23:35:53Z

Hi,

I recommend taking a look at the encoding tutorials for a better understanding.

However, at the moment we don't support the multi-label binarizer from sklearn.

We support:
from sklearn: OneHotEncoder / OrdinalEncoder / StandardScaler / QuantileTransformer / PowerTransformer
from category_encoders: OneHotEncoder / OrdinalEncoder / BaseNEncoder / BinaryEncoder / TargetEncoder
or a dict with the mapping needed

I don't know how complex your problem is, but you may be able to use the features_groups parameter of the compile step to get the importance of A, B, C, or E.
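
As a rough sketch of that suggestion, the grouping for the encoded columns in the question could be built as below. This assumes features_groups takes a dict mapping each group name to the list of columns it contains, which is worth checking against the Shapash documentation for your version. The helper `group_by_first_letter` is hypothetical and only works here because every encoded column starts with the letter of its source feature.

```python
from collections import defaultdict

# Encoded column names, copied from the target table in the question.
encoded_columns = [
    "A0", "A1",
    "B02", "B03", "B04", "B11",
    "C02", "C03", "C04",
    "E0", "E1",
]


def group_by_first_letter(columns):
    """Hypothetical helper: group column names by their first character."""
    groups = defaultdict(list)
    for name in columns:
        groups[name[0]].append(name)
    return dict(groups)


features_groups = group_by_first_letter(encoded_columns)

# Assumption, not executed here: this dict is what would be passed as
# `features_groups=features_groups` in the compile step mentioned above.
```

With this grouping, each of A, B, C, and E is associated with all of its encoded columns, so an importance reported per group corresponds to the original feature rather than to a single encoded column.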
