# Mixed usage with other packages

Several excellent packages offer functionality for bucketing (binning, discretizing) numerical variables and for encoding categorical variables. Chances are you'd like to combine them in your `skorecard` pipelines.

Here are some packages that are compatible with pandas DataFrames:

- [`category_encoders` from scikit-learn-contrib](https://github.com/scikit-learn-contrib/category_encoders)
- [`feature-engine` categorical variable encoders](https://feature-engine.readthedocs.io/en/latest/encoding/index.html)
- [`feature-engine` variable discretisation](https://feature-engine.readthedocs.io/en/latest/discretisation/index.html)


In [1]:
%%capture
!pip install category_encoders

In [18]:
%%capture
from sklearn.pipeline import make_pipeline
from skorecard.datasets import load_uci_credit_card
from skorecard.bucketers import OrdinalCategoricalBucketer
X, y = load_uci_credit_card(return_X_y=True)

from category_encoders import TargetEncoder

pipe = make_pipeline(
    TargetEncoder(cols=["EDUCATION"]), #  category_encoders.TargetEncoder passes through other columns
    OrdinalCategoricalBucketer(variables=["MARRIAGE"])
)
pipe.fit(X, y)

In [27]:
pipe.transform(X).head(5)

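|   | EDUCATION | MARRIAGE | LIMIT_BAL | BILL_AMT1 |
|---|-----------|----------|-----------|-----------|
| 0 | 0.0 | 2.0 | 400000.0 | 201800.0 |
| 1 | 1.0 | 2.0 | 80000.0 | 80610.0 |
| 2 | 0.0 | 2.0 | 500000.0 | 499452.0 |
| 3 | 0.0 | 1.0 | 140000.0 | 450.0 |
| 4 | 1.0 | 1.0 | 420000.0 | 56107.0 |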


Some transformers do not return pandas DataFrames, for example:

- [`sklearn.preprocessing.KBinsDiscretizer`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.KBinsDiscretizer.html#sklearn.preprocessing.KBinsDiscretizer)

You can wrap such a transformer in `skorecard.pipeline.KeepPandas` to use it in a pipeline:

In [26]:
from sklearn.preprocessing import KBinsDiscretizer
from skorecard.pipeline import KeepPandas
from sklearn.compose import ColumnTransformer

ct = ColumnTransformer(
    [
        ("binner", KBinsDiscretizer(n_bins=3, encode='ordinal', strategy='uniform'), ['EDUCATION'])
    ],
    remainder="passthrough"
)
pipe = make_pipeline(
    KeepPandas(ct),
    OrdinalCategoricalBucketer(variables=["MARRIAGE"])
)
pipe.fit_transform(X, y).head(5)



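|   | EDUCATION | MARRIAGE | LIMIT_BAL | BILL_AMT1 |
|---|-----------|----------|-----------|-----------|
| 0 | 0.0 | 2.0 | 400000.0 | 201800.0 |
| 1 | 1.0 | 2.0 | 80000.0 | 80610.0 |
| 2 | 0.0 | 2.0 | 500000.0 | 499452.0 |
| 3 | 0.0 | 1.0 | 140000.0 | 450.0 |
| 4 | 1.0 | 1.0 | 420000.0 | 56107.0 |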
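
For intuition, the general idea behind a DataFrame-preserving wrapper can be sketched as below. The `FrameWrapper` class and the `X_demo` data are hypothetical and are not skorecard's actual `KeepPandas` implementation. The sketch also assumes the wrapped transformer returns a dense array with one output column per input column, in the same order, which holds for `encode="ordinal"` but not for one-hot encoding.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.preprocessing import KBinsDiscretizer


class FrameWrapper(BaseEstimator, TransformerMixin):
    """Hypothetical wrapper (NOT skorecard's KeepPandas): re-attaches
    column names and the index to an array-returning transformer."""

    def __init__(self, transformer):
        self.transformer = transformer

    def fit(self, X, y=None):
        # Fit a clone so the transformer passed in stays untouched.
        self.transformer_ = clone(self.transformer).fit(X, y)
        # Assumption: output columns map one-to-one onto input columns.
        self.columns_ = list(X.columns)
        return self

    def transform(self, X):
        values = self.transformer_.transform(X)
        return pd.DataFrame(values, columns=self.columns_, index=X.index)


# Made-up demo data, only to show that names and index survive.
X_demo = pd.DataFrame(
    {"EDUCATION": [1, 2, 3, 1], "AGE": [20, 40, 60, 30]},
    index=[10, 11, 12, 13],
)
wrapped = FrameWrapper(
    KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform")
)
out = wrapped.fit_transform(X_demo)
print(out)
```

Because the result is a DataFrame with the original column names, a downstream step that selects columns by name (such as a bucketer configured with `variables=[...]`) can still find them.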
