## Revised target encoder function

##### The original `target_encode_categ` function was not robust: it returned NaN for any category that appeared only in the test set (missing from train). The improved function is below.

##### Project results were not affected by this change.

In [1]:
from category_encoders.target_encoder import TargetEncoder


def target_encode_categ(encode_cols, train, test, target, encoder):
    '''Returns encoded categorical features. Each encoded feature is a blend of
    (1) ExpectedVal(trainTarget | FeatureClass) and
    (2) the "prior": ExpectedVal(trainTarget) over all training data.

    The encoder is fit on the training target only; the test target is never
    used, which prevents data leakage.

    Encoder smoothing balances the class average against the prior. Higher
    smoothing means stronger regularization toward the prior.
    
    Arguments
    ---------
    encode_cols: list of categorical column names to encode
    train: training data including target Y
    test: test data including target Y
    target: name of the target column Y
    encoder: TargetEncoder(cols=encode_cols, smoothing=<float>).
    
    See https://contrib.scikit-learn.org/categorical-encoding/targetencoder.html
    for more parameters.
    
    Returns
    -------
    trn: train with encoding applied to encode_cols
    tst: test with encoding applied to encode_cols
    
    '''
    
    # Work on copies so the caller's DataFrames are not modified.
    trn = train.copy()
    tst = test.copy()
    
    # Fit on train only, then transform both train and test.
    encoder.fit(trn[encode_cols], trn[target])
    trn_enc = encoder.transform(trn[encode_cols])
    tst_enc = encoder.transform(tst[encode_cols])
    
    # Overwrite features with encoded features.
    trn[encode_cols] = trn_enc
    tst[encode_cols] = tst_enc
    
    return trn, tst
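
To make the blend and the unseen-category fallback concrete, here is a minimal pure-Python sketch. It is not the `category_encoders` implementation: the helper names `fit_target_means` and `encode` are hypothetical, and the sigmoid weighting is an assumed, simplified form of the smoothing described in the docstring. The point it illustrates is that a category missing from train is encoded as the prior rather than NaN.

```python
import math
from collections import defaultdict


def fit_target_means(categories, targets, smoothing=1.0, min_samples_leaf=1):
    """Learn a smoothed target encoding from training data only.

    Assumed weighting (simplified sketch, hypothetical helper):
        weight = sigmoid((n - min_samples_leaf) / smoothing)
        encoded = weight * class_mean + (1 - weight) * prior
    """
    prior = sum(targets) / len(targets)  # mean target over all training rows
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1

    encoding = {}
    for cat, n in counts.items():
        class_mean = sums[cat] / n
        weight = 1.0 / (1.0 + math.exp(-(n - min_samples_leaf) / smoothing))
        encoding[cat] = weight * class_mean + (1.0 - weight) * prior
    return encoding, prior


def encode(categories, encoding, prior):
    """Map categories to encoded values; unseen categories fall back to the prior."""
    return [encoding.get(cat, prior) for cat in categories]


train_cats = ["a", "a", "a", "b"]
train_y = [1.0, 1.0, 0.0, 0.0]
test_cats = ["a", "b", "c"]  # "c" never appears in train

encoding, prior = fit_target_means(train_cats, train_y, smoothing=1.0)
test_encoded = encode(test_cats, encoding, prior)
print(test_encoded)  # the last value equals the prior (0.5), not NaN
```

In this sketch, raising `smoothing` pulls the encoding of a category with more than `min_samples_leaf` rows closer to the prior, which matches the docstring's description of smoothing as regularization.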