# Advanced Bayes Search CV Example

This is a more advanced example of how the `BayesSearchCV` class can be applied - it's recommended that you first read through the simpler `bayes_search_cv_example`.

The `BayesSearchCV` class is used to search for the set of hyperparameters that produce the best decision engine performance for a given Iguanas Pipeline, whilst also reducing the likelihood of overfitting.

The process is as follows:

* Generate k-fold stratified cross validation datasets. 
* For each of the training and validation datasets:
    * Fit the pipeline on the training set using a set of parameters chosen by the Bayesian Optimiser from a given set of ranges.
    * Apply the pipeline to the validation set to return a prediction.
    * Use the provided `scorer` to calculate the score of the prediction.
* Return the parameter set which generated the highest mean overall score across the validation datasets.

In this example, we'll consider the following more advanced workflow (compared to the standard `bayes_search_cv_example` notebook), which considers the generation of a Rules-Based System for a credit card fraud transaction use case:

<center><img src="images/complex_example.png"/></center>

Here, we have a fraud detection use case, and we're aiming to create two distinct rule sets - one for flagging fraudulent behaviour; one for flagging good behaviour. Each of these rule sets will be comprised of a generated rule set and an existing rule set. We'll optimise and filter these two rule sets separately, then combine and feed them into the decision engine optimiser. **Note:** we optimise the generated rules as they'll be created using the `RuleGeneratorDT` class, which generates rules from the branches of decision trees - these split based on gini or entropy - so we can further optimise them for a specific metric. 

**The decision engine will have the following constraint:** for a given transaction, if any approve rules fire it will be approved; else, if any reject rules fire it will be rejected; else, it will be approved.

We'll use the `BayesSearchCV` class to optimise the hyperparameters of the steps in this workflow, **ensuring that we maximise the revenue for our decision engine.**

---

## Import packages

In [1]:
from iguanas.rule_generation import RuleGeneratorDT
from iguanas.rule_selection import SimpleFilter, CorrelatedFilter, GreedyFilter, BayesSearchCV
from iguanas.metrics import FScore, Precision, Revenue, JaccardSimilarity
from iguanas.rbs import RBSOptimiser, RBSPipeline
from iguanas.correlation_reduction import AgglomerativeClusteringReducer
from iguanas.pipeline import LinearPipeline, ParallelPipeline
from iguanas.pipeline.class_accessor import ClassAccessor
from iguanas.space import UniformFloat, UniformInteger, Choice
from iguanas.rules import Rules
from iguanas.rule_optimisation import BayesianOptimiser

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from category_encoders.one_hot import OneHotEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer

## Read in data

Let's read in the [credit card fraud dataset](https://www.kaggle.com/mlg-ulb/creditcardfraud) from Kaggle.

**Note:** this data has been altered to include some null values in the `V1` column. This is to simulate unprocessed data (the dataset on Kaggle has been processed using PCA, so there are no null values). It has also been randomly sampled to 10% of its original number of records, to reduce the file size.

In [2]:
target_col = 'Class'
time_col = 'Time'
amt_col = 'Amount'
# Ready in data
df = pd.read_csv('dummy_data/creditcard.csv')
# Sort data by time ascending
df.sort_values(time_col, ascending=True)
# Create X and y dataframes
X = df.drop([target_col, time_col], axis=1)
y = df[target_col]

In [3]:
X_train_raw, X_test_raw, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.33,
    random_state=42
)

To calculate the **Revenue**, we need the monetary amount of each transaction - we'll use these later:

In [4]:
amts_train = X_train_raw[amt_col]
amts_test = X_test_raw[amt_col]

### Process data

Let's impute the null values with the mean:

In [5]:
imputer = SimpleImputer(strategy='mean')
X_train = pd.DataFrame(
    imputer.fit_transform(X_train_raw),
    columns=X_train_raw.columns,
    index=X_train_raw.index
)
X_test = pd.DataFrame(
    imputer.transform(X_test_raw),
    columns=X_test_raw.columns,
    index=X_test_raw.index
)

In [6]:
# Check nulls have been imputed
X_train.isna().sum().sum(), X_test.isna().sum().sum()

(0, 0)

### Existing rules

Let's also assume we have the following existing rules, stored in the standard Iguanas string format:

In [7]:
fraud_rule_strings = {
    "ExistingReject1": "((X['V1']<0)|(X['V1'].isna()))&(X['V3']<1)",
    "ExistingReject2": "(X['V2']>3)",
}
good_rule_strings = {
    "ExistingApprove1": "(X['V1']>0)&(X['V3']>1)",
    "ExistingApprove2": "(X['V2']<3)",
    "ExistingApprove3": "(X['V4']<3)"
}

We can create a `Rules` class for each of these:

In [8]:
fraud_rules = Rules(rule_strings=fraud_rule_strings)
good_rules = Rules(rule_strings=good_rule_strings)

Then convert them to the standard Iguanas lambda expression format (we'll need this for the optimisation step):

In [9]:
fraud_rule_lambdas = fraud_rules.as_rule_lambdas(
    as_numpy=False, 
    with_kwargs=True
)
good_rule_lambdas = good_rules.as_rule_lambdas(
    as_numpy=False, 
    with_kwargs=True
)

----

## Set up pipeline

Before we can apply the `BayesSearchCV` class, we need to set up our pipeline. To create the workflow shown at the beginning of the notebook, we must use a combination of `LinearPipeline` and `ParallelPipeline` classes as shown below:

![title](images/complex_example_setup.png)

Let's begin building the **Fraud *LinearPipeline***.

### Fraud *LinearPipeline*

Let's first instantiate the classes that we'll use in the pipeline:

In [10]:
# F1 Score
f1 = FScore(beta=1)
# Precision
p = Precision()
    
# Rule generation
fraud_gen = RuleGeneratorDT(
    metric=f1.fit,
    n_total_conditions=2,
    tree_ensemble=RandomForestClassifier(
        n_estimators=10,
        random_state=0
    ),
    target_feat_corr_types='Infer',
    rule_name_prefix='Reject' # Set this so generated reject rules distinguishable from approve rules
)

# Rule optimisation (for generated rules)
fraud_gen_opt = BayesianOptimiser(
    rule_lambdas=ClassAccessor(
        class_tag='fraud_gen',
        class_attribute='rule_lambdas'
    ),
    lambda_kwargs=ClassAccessor(
        class_tag='fraud_gen',
        class_attribute='lambda_kwargs'
    ),
    metric=f1.fit,
    n_iter=10
)

# Rule optimisation (for existing rules)
fraud_opt = BayesianOptimiser(
    rule_lambdas=fraud_rule_lambdas,
    lambda_kwargs=fraud_rules.lambda_kwargs,
    metric=f1.fit,
    n_iter=10
)

# Rule filter (performance-based)
fraud_sf = SimpleFilter(
    threshold=0.1, 
    operator='>=', 
    metric=f1.fit
)

# Rule filter (correlation-based)
js = JaccardSimilarity()
fraud_cf = CorrelatedFilter(
    correlation_reduction_class=AgglomerativeClusteringReducer(
        threshold=0.9, 
        strategy='top_down', 
        similarity_function=js.fit, 
        metric=f1.fit
    ), 
    rules=ClassAccessor(
        class_tag='fraud_gen',
        class_attribute='rules'
    )
)

Now we can create our **Fraud Rule Generation *LinearPipeline***. Note that we pass the tag for the optimisation of the generated rules to the `use_init_data` parameter, so that the feature set is passed to the `BayesianOptimiser` class, rather than the output from the `RuleGeneratorDT`:

In [11]:
fraud_gen_lp = LinearPipeline(
    steps = [
        ('fraud_gen', fraud_gen),
        ('fraud_gen_opt', fraud_gen_opt),
    ],
    use_init_data=['fraud_gen_opt']
)

And then our **Fraud *ParallelPipeline*** (noting that one of the steps in this pipeline is the **Fraud Rule Generation *LinearPipeline*** created above):

In [12]:
fraud_gen_lp = ParallelPipeline(
    steps = [
        ('fraud_gen_lp', fraud_gen_lp),
        ('fraud_opt', fraud_opt),
    ]
)

And then finally, our **Fraud *LinearPipeline***:

In [13]:
fraud_lp = LinearPipeline(
    steps = [
        ('fraud_gen_lp', fraud_gen_lp),
        ('fraud_sf', fraud_sf),
        ('fraud_cf', fraud_cf)
    ]
)

Now we can do the same for the **Good *LinearPipeline***:

### Good *LinearPipeline*

Let's first instantiate the classes that we'll use in the pipeline:

In [14]:
# Rule generation
good_gen = RuleGeneratorDT(
    metric=f1.fit,
    n_total_conditions=2,
    tree_ensemble=RandomForestClassifier(
        n_estimators=10,
        random_state=0
    ),
    target_feat_corr_types='Infer',
    rule_name_prefix='Approve' # Set this so generated reject rules distinguishable from approve rules
)

# Rule optimisation (for generated rules)
good_gen_opt = BayesianOptimiser(
    rule_lambdas=ClassAccessor(
        class_tag='good_gen',
        class_attribute='rule_lambdas'
    ),
    lambda_kwargs=ClassAccessor(
        class_tag='good_gen',
        class_attribute='lambda_kwargs'
    ),
    metric=f1.fit,
    n_iter=10
)

# Rule optimisation (for existing rules)
good_opt = BayesianOptimiser(
    rule_lambdas=good_rule_lambdas,
    lambda_kwargs=good_rules.lambda_kwargs,
    metric=f1.fit,
    n_iter=10
)

# Rule filter (performance-based)
good_sf = SimpleFilter(
    threshold=0.1, 
    operator='>=', 
    metric=f1.fit
)

# Rule filter (correlation-based)
js = JaccardSimilarity()
good_cf = CorrelatedFilter(
    correlation_reduction_class=AgglomerativeClusteringReducer(
        threshold=0.9, 
        strategy='top_down', 
        similarity_function=js.fit, 
        metric=f1.fit
    ),
    rules=ClassAccessor(
        class_tag='good_gen',
        class_attribute='rules'
    )
)

Now we can create our **Good Rule Generation *LinearPipeline***. Note that we pass the tag for the optimisation of the generated rules to the `use_init_data` parameter, so that the feature set is passed to the `BayesianOptimiser` class, rather than the output from the `RuleGeneratorDT`:

In [15]:
good_gen_lp = LinearPipeline(
    steps = [
        ('good_gen', good_gen),
        ('good_gen_opt', good_gen_opt),
    ],
    use_init_data=['good_gen_opt']
)

And then our **Good *ParallelPipeline*** (noting that one of the steps in this pipeline is the **Good Rule Generation *LinearPipeline*** created above):

In [16]:
good_gen_lp = ParallelPipeline(
    steps = [
        ('good_gen_lp', good_gen_lp),
        ('good_opt', good_opt),
    ]
)

And then finally, our **Good *LinearPipeline***:

In [17]:
good_lp = LinearPipeline(
    steps = [
        ('good_gen_lp', good_gen_lp),
        ('good_sf', good_sf),
        ('good_cf', good_cf)
    ]
)

Now we can move on to constructing the **Overall Pipelines:**

### Overall Pipelines

First, we'll construct our **Overall *ParallelPipeline*** using the **Fraud *LinearPipeline*** and **Good *LinearPipeline***:

In [18]:
overall_pp = ParallelPipeline(
    steps = [
        ('fraud_lp', fraud_lp),
        ('good_lp', good_lp)
    ]
)

Now we can instantiate the decision engine optimiser. Since we have a constraint on the decision engine (if any approve rules fire, approve the transaction; else if any reject rules fire, reject the transaction; else approve the transaction), we pass the rules remaining after the filtering stages to the relevant elements in the `config` parameter of the `RBSPipeline` class, using the `ClassAccessor` class:

In [19]:
# Decision engine optimisation metric
opt_metric = Revenue(
    y_type='Fraud',
    chargeback_multiplier=3
)

# Decision engine (to be optimised)
rbs_pipeline = RBSPipeline(
    config=[
        [
            0, ClassAccessor( # If any approve rules fire, approve
                class_tag='good_cf', 
                class_attribute='rules_to_keep'
            ),
        ],
        [
            1, ClassAccessor( # Else if any reject rules fire, reject
                class_tag='fraud_cf', 
                class_attribute='rules_to_keep'
            )
        ],        
    ],
    final_decision=0 # Else approve
)

# Decision engine optimiser
rbs_optimiser = RBSOptimiser(
    pipeline=rbs_pipeline,
    metric=opt_metric.fit,     
    rules=ClassAccessor(
        class_tag='overall_pp',
        class_attribute='rules'
    ),
    n_iter=10
)

Finally, we can instantiate our **Overall *LinearPipeline***:

In [20]:
overall_lp = LinearPipeline(
    steps=[
        ('overall_pp', overall_pp),
        ('rbs_optimiser', rbs_optimiser)
    ]
)

## Define the search space

Now we need to define the search space for each of the relevant parameters of our pipeline. **Note:** this example does not search across all hyperparameters - you should define your own search spaces based on your use case.

To do this, we create a dictionary, where each key corresponds to the tag used for the relevant pipeline step. Each value should be a dictionary of the parameters (keys) and their search spaces (values). Search spaces should be defined using the classes in the `iguanas.space` module:

In [21]:
# Define additional FScores
f0dot5 = FScore(beta=0.5)
f0dot25 = FScore(beta=0.25)

In [22]:
search_spaces = {
    'fraud_gen': {
        'n_total_conditions': UniformInteger(2, 7),
    },
    'fraud_gen_opt': {
        'metric': Choice([f0dot25.fit, f0dot5.fit, f1.fit]),
    },
    'fraud_sf': {
        'threshold': UniformFloat(0, 1),
    },
    'fraud_cf': {
        'correlation_reduction_class': Choice(
            [
                AgglomerativeClusteringReducer(
                    threshold=0.9, 
                    strategy='top_down', 
                    similarity_function=js.fit, 
                    metric=f1.fit                    
                ),
                AgglomerativeClusteringReducer(
                    threshold=0.95, 
                    strategy='top_down', 
                    similarity_function=js.fit, 
                    metric=f1.fit                    
                )
            ]
        )
    },    
    'good_gen': {
        'n_total_conditions': UniformInteger(2, 7),
    },
    'good_gen_opt': {
        'metric': Choice([f0dot25.fit, f0dot5.fit, f1.fit]),
    },
    'good_sf': {
        'threshold': UniformFloat(0, 1),
    },
    'good_cf': {
        'correlation_reduction_class': Choice(
            [
                AgglomerativeClusteringReducer(
                    threshold=0.9, 
                    strategy='top_down', 
                    similarity_function=js.fit, 
                    metric=f1.fit                    
                ),
                AgglomerativeClusteringReducer(
                    threshold=0.95, 
                    strategy='top_down', 
                    similarity_function=js.fit, 
                    metric=f1.fit                    
                )
            ]
        )
    }
}

## Optimise the pipeline hyperparameters

Now that we have our pipeline and search spaces defined, we can instantiate the `BayesSearchCV` class. We'll split our data into 3 cross-validation datasets and try 10 different parameter sets.

**Note:** since we're using the `Revenue` as the scoring metric for the `BayesSearchCV` class, we need to set the `sample_weight_in_val` parameter to `True`. This ensures that the `sample_weight` passed to the final step in the pipeline is used when applying the `metric` function to the prediction of each validation set (for `Revenue`, the `sample_weight` corresponds to the monetary amount of each transaction, which is required).

In [23]:
bs = BayesSearchCV(
    pipeline=overall_lp, 
    search_spaces=search_spaces, 
    metric=opt_metric.fit, # Use the same metric as the RBSOptimiser
    cv=3, 
    n_iter=10,
    num_cores=3,
    error_score=0,
    verbose=1,
    sample_weight_in_val=True # Set to True
)

Finally, we can run the `fit` method to optimise the hyperparameters of the pipeline. 

**Note the following:** 

* The existing rules contain conditions that rely on unprocessed data (in this case, there are conditions that check for nulls). So for the rule optimisation steps, we must use the unprocessed training data `X_train_raw`; for the rule generation steps, we must use the processed training data `X_train`.
* Since we're generating and optimising rules that flag both positive and negative cases (i.e. reject and approve rules in this example), we need to specify what the target is in each case. For the reject rules, we can just use `y_train`, however for the approve rules, we need to flip `y_train` (so that the rule generator and rule optimisers target the negative cases).
* We need the `amts_train` to be passed to the `sample_weight` parameter of the `RBSOptimiser`, as we're optimising the decision engine for the `Revenue`.

In [24]:
bs.fit(
    X={
        'fraud_gen_lp': X_train, # Use processed features for rule generation
        'fraud_opt': X_train_raw, # Use raw features for optimising existing rules
        'good_gen_lp': X_train, # Use processed features for rule generation
        'good_opt': X_train_raw # Use raw features for optimising existing rules
    }, 
    y={
        'fraud_lp': y_train, # Use target for Fraud LinearPipeline
        'good_lp': 1-y_train, # Flip target for Good LinearPipeline
        'rbs_optimiser': y_train # Use target for RBSOptimiser
    },
    sample_weight={
        'fraud_lp': None, # No sample_weight for Fraud LinearPipeline
        'good_lp': None, # No sample_weight for Good LinearPipeline
        'rbs_optimiser': amts_train # sample_weight for RBSOptimiser
    }
)

--- Optimising pipeline parameters ---
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [01:04<00:00,  6.43s/trial, best loss: -560248.5233333333]
--- Refitting on entire dataset with best pipeline ---


### Outputs

The `fit` method doesn't return anything. See the `Attributes` section in the class docstring for a description of each attribute generated:

In [25]:
bs.best_score

560248.5233333333

In [26]:
bs.best_params

{'fraud_cf': {'correlation_reduction_class': AgglomerativeClusteringReducer(threshold=0.9, strategy=top_down, similarity_function=<bound method JaccardSimilarity.fit of JaccardSimilarity>, metric=<bound method FScore.fit of FScore with beta=1>, print_clustermap=False)},
 'fraud_gen': {'n_total_conditions': 7.0},
 'fraud_gen_opt': {'metric': <bound method FScore.fit of FScore with beta=0.5>},
 'fraud_sf': {'threshold': 0.43381170823234194},
 'good_cf': {'correlation_reduction_class': AgglomerativeClusteringReducer(threshold=0.95, strategy=top_down, similarity_function=<bound method JaccardSimilarity.fit of JaccardSimilarity>, metric=<bound method FScore.fit of FScore with beta=1>, print_clustermap=False)},
 'good_gen': {'n_total_conditions': 5.0},
 'good_gen_opt': {'metric': <bound method FScore.fit of FScore with beta=0.25>},
 'good_sf': {'threshold': 0.5309641649521473}}

In [27]:
bs.best_index

1

In [28]:
bs.cv_results.head()

Unnamed: 0,Params,fraud_cf__correlation_reduction_class,fraud_gen__n_total_conditions,fraud_gen_opt__metric,fraud_sf__threshold,good_cf__correlation_reduction_class,good_gen__n_total_conditions,good_gen_opt__metric,good_sf__threshold,FoldIdx,Scores,MeanScore,StdDevScore
1,{'fraud_cf': {'correlation_reduction_class': A...,"AgglomerativeClusteringReducer(threshold=0.9, ...",7.0,<bound method FScore.fit of FScore with beta=0.5>,0.433812,"AgglomerativeClusteringReducer(threshold=0.95,...",5.0,<bound method FScore.fit of FScore with beta=0...,0.530964,"[0, 1, 2]","[572899.38, 572468.7999999999, 535377.39]",560248.523333,17587.425522
4,{'fraud_cf': {'correlation_reduction_class': A...,"AgglomerativeClusteringReducer(threshold=0.95,...",3.0,<bound method FScore.fit of FScore with beta=1>,0.455055,"AgglomerativeClusteringReducer(threshold=0.95,...",3.0,<bound method FScore.fit of FScore with beta=0.5>,0.14742,"[0, 1, 2]","[571558.6399999999, 573567.72, 535370.05]",560165.47,17552.183916
8,{'fraud_cf': {'correlation_reduction_class': A...,"AgglomerativeClusteringReducer(threshold=0.95,...",7.0,<bound method FScore.fit of FScore with beta=0...,0.027295,"AgglomerativeClusteringReducer(threshold=0.95,...",3.0,<bound method FScore.fit of FScore with beta=1>,0.510489,"[0, 1, 2]","[573019.7, 570521.4199999999, 535370.05]",559637.056667,17189.649214
7,{'fraud_cf': {'correlation_reduction_class': A...,"AgglomerativeClusteringReducer(threshold=0.9, ...",6.0,<bound method FScore.fit of FScore with beta=0...,0.184279,"AgglomerativeClusteringReducer(threshold=0.9, ...",3.0,<bound method FScore.fit of FScore with beta=0.5>,0.967131,"[0, 1, 2]","[572826.1399999999, 573582.58, 519611.77]",555340.163333,25265.676559
2,{'fraud_cf': {'correlation_reduction_class': A...,"AgglomerativeClusteringReducer(threshold=0.95,...",3.0,<bound method FScore.fit of FScore with beta=0.5>,0.599342,"AgglomerativeClusteringReducer(threshold=0.9, ...",7.0,<bound method FScore.fit of FScore with beta=1>,0.258372,"[0, 1, 2]","[573231.7, 570532.0399999999, 521590.14999999997]",555117.963333,23733.348425


To see the final optimised decision engine configuration and rule set, we first return the parameters of the trained pipeline (stored in the attribute `pipeline_`):

In [29]:
pipeline_params = bs.pipeline_.get_params()

Then, to see the final optimised decision engine configuration, we filter to the `config` parameter of the `rbs_optimiser` step:

In [30]:
final_config = pipeline_params['rbs_optimiser']['config']
final_config

[[0, ['Approve_30', 'Approve_41']],
 [1,
  ['Reject_31',
   'Reject_54',
   'Reject_39',
   'Reject_25',
   'Reject_40',
   'Reject_47',
   'Reject_16',
   'Reject_27',
   'Reject_37',
   'Reject_34',
   'Reject_13']]]

This shows us which rules should be used for the approval step (decision `0`) and which rules should be used for the rejection step (decision `1`).

To see the logic of our final set of rules, we filter to the `rules` parameter of the `rbs_optimiser` step:

In [31]:
final_rules = bs.pipeline_.get_params()['rbs_optimiser']['rules']

Then extract the `rule_strings` attribute:

In [32]:
final_rules.rule_strings

{'Reject_13': "(X['V10']<=-5.57875)&(X['V19']>-3.52538)",
 'Reject_16': "(X['V11']>2.35088)&(X['V17']<=-3.60923)&(X['V19']>-3.50737)",
 'Reject_25': "(X['V12']<=-3.67614)&(X['V16']<=-2.1745)&(X['V2']>-7.3523)",
 'Reject_27': "(X['V12']<=-3.74057)&(X['V2']>1.27989)",
 'Reject_31': "(X['V12']<=-4.43715)",
 'Reject_34': "(X['V12']<=-5.63079)",
 'Reject_37': "(X['V14']<=-11.33548)&(X['V17']<=-2.69333)&(X['V27']>1.1957)",
 'Reject_39': "(X['V14']<=-4.64098)",
 'Reject_40': "(X['V14']<=-4.72102)",
 'Reject_47': "(X['V17']<=-2.18248)&(X['V27']>-4.29354)",
 'Reject_54': "(X['V17']<=6.28023)&(X['V19']>1.78579)",
 'Approve_30': "(X['V12']>-4.43659)&(X['V16']>-4.28659)&(X['V18']>-4.172)&(X['V9']>-3.86694)",
 'Approve_41': "(X['V17']>-2.18248)&(X['V19']<=-3.52538)"}

## Apply the optimised pipeline

We can apply our optimised pipeline to a new data set and make a prediction using the `predict` method:

In [33]:
y_pred_test = bs.predict(X_test)

### Outputs

The `predict` method returns the prediction generated by class in the final step of the pipeline - in this case, the `RBSOptimiser`:

In [34]:
y_pred_test

17855    0
23775    0
18629    0
12843    0
18084    0
        ..
2223     0
10168    0
23823    0
16583    0
22478    0
Length: 9399, dtype: int64

We can now calculate the **Revenue** of our optimised pipeline using the test data:

In [35]:
rev_opt = opt_metric.fit(
    y_preds=y_pred_test,
    y_true=y_test,
    sample_weight=amts_test
)

Comparing this to our original, unoptimised pipeline:

In [36]:
overall_lp.fit(
    X={
        'fraud_gen_lp': X_train,
        'fraud_opt': X_train_raw,
        'good_gen_lp': X_train,
        'good_opt': X_train_raw    
    }, 
    y={
        'fraud_lp': y_train,
        'good_lp': 1-y_train,
        'rbs_optimiser': y_train
    },
    sample_weight={
        'fraud_lp': None,
        'good_lp': None,
        'rbs_optimiser': y_train        
    }
)

In [37]:
y_pred_test_init = overall_lp.predict(X_test)

In [38]:
rev_init = opt_metric.fit(
    y_preds=y_pred_test_init,
    y_true=y_test,
    sample_weight=amts_test
)

In [39]:
print(f'Revenue of original, unoptimised pipeline: ${round(rev_init)}')
print(f'Revenue of optimised pipeline: ${round(rev_opt)}')
print(f'Absolute improvement in Revenue: ${round(rev_opt-rev_init)}')
print(f'Percentage improvement in Revenue: {round(100*(rev_opt-rev_init)/rev_init, 2)}%')

Revenue of original, unoptimised pipeline: $775698
Revenue of optimised pipeline: $857669
Absolute improvement in Revenue: $81972
Percentage improvement in Revenue: 10.57%




---