🏦 Banks are battling fraud with machine learning models, but shifting data patterns can weaken these defenses. London's Poundbank needs your help to figure out why its fraud detection models are no longer as accurate as they used to be.

Poundbank uses the `nannyml` library to monitor its machine learning models and recommends it for this task.

## The data

They have provided you with a reference set (test data) and an analysis set (production data). A summary and preview are provided below.

## reference.csv and analysis.csv

| Column     | Description              |
|------------|--------------------------|
| `'timestamp'` | Date and time of the transaction. |
| `'time_since_login_min'` | Time since the user logged in to the app. |
| `'transaction_amount'` | The amount in pounds (£) that the user sent to another account. |
| `'transaction_type'` | Transaction type: <ul><li>`CASH-OUT` - Withdrawing money from an account.</li><li>`PAYMENT` - Transaction where a payment is made to a third party.</li><li>`CASH-IN` - The opposite of a cash-out: depositing money into an account.</li><li>`TRANSFER` - Transaction that moves funds from one account to another.</li></ul> |
| `'is_first_transaction'` | A binary indicator denoting if the transaction is the user's first (1 for the first transaction, 0 otherwise). |
| `'user_tenure_months'` | The duration in months since the user's account was created or since they became a member. |
| `'is_fraud'` | A binary label indicating whether the transaction is fraudulent (1 for fraud, 0 otherwise). |
| `'predicted_fraud_proba'` | The probability, assigned by the detection model, that the transaction is fraudulent. |
| `'predicted_fraud'` | The predicted class label, derived by the detection model from the predicted fraud probability (1 for predicted fraud, 0 otherwise). |

In [63]:
# Re-run this cell
# Import required libraries
import pandas as pd
import nannyml as nml
nml.disable_usage_logging()
reference = pd.read_csv("reference.csv")
analysis = pd.read_csv("analysis.csv")
reference.head()

| | timestamp | time_since_login_min | transaction_amount | transaction_type | is_first_transaction | user_tenure_months | is_fraud | predicted_fraud_proba | predicted_fraud |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2018-01-01 00:00:00.000 | 1.56175 | 3981.1 | PAYMENT | False | 0.31898 | 1.0 | 0.99 | 1 |
| 1 | 2018-01-01 00:08:43.152 | 1.658074 | 1267.9 | PAYMENT | False | 7.391323 | 0.0 | 0.07 | 0 |
| 2 | 2018-01-01 00:17:26.304 | 2.454287 | 1984.7 | CASH-IN | False | 0.781225 | 1.0 | 1.0 | 1 |
| 3 | 2018-01-01 00:26:09.456 | 2.392085 | 2265.2 | CASH-OUT | False | 0.680473 | 1.0 | 0.98 | 1 |
| 4 | 2018-01-01 00:34:52.608 | 2.189806 | 2126.8 | CASH-IN | False | 8.542895 | 1.0 | 0.99 | 1 |


In [64]:
# Identify the months in which the estimated (expected) and realized (actual) performance of the model, measured here by ROC AUC, triggers alerts.

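# Estimate performance without labels using Confidence-Based Performance Estimation (CBPE), chunked by month.
estimator = nml.CBPE(
    y_pred_proba='predicted_fraud_proba',
    y_pred='predicted_fraud',
    y_true='is_fraud',
    timestamp_column_name='timestamp',
    problem_type='classification_binary',
    metrics=['roc_auc'],
    chunk_period='M',
)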

estimator.fit(reference)
estimated_results = estimator.estimate(analysis)


# Calculate realized performance from the true labels, using the same metric and monthly chunks.
calculator = nml.PerformanceCalculator(
    y_pred_proba='predicted_fraud_proba',
    y_pred='predicted_fraud',
    y_true='is_fraud',
    timestamp_column_name='timestamp',
    problem_type='classification_binary',
    metrics=['roc_auc'],
    chunk_period='M',
)

calculator.fit(reference)
calculated_results = calculator.calculate(analysis)

# Plot realized against estimated performance on the same chart.
comparison = calculated_results.compare(estimated_results)
comparison.plot().show()

![Realized performance vs Estimated performance (CBPE)](Compared.png)
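
The alerts in the plot above come from comparing each monthly chunk against limits learned from the reference set. The following is a minimal sketch of that idea, assuming the limits are the reference mean plus or minus three standard deviations (a common default; check the NannyML documentation for the exact rule in your version). The function names and all numbers are hypothetical and only illustrate the logic.

```python
from statistics import mean, pstdev


def std_thresholds(reference_values, multiplier=3.0):
    """Return (lower, upper) limits: mean -/+ multiplier * std of the reference chunks.

    Using the population standard deviation is an assumption of this sketch.
    """
    centre = mean(reference_values)
    spread = pstdev(reference_values)
    return centre - multiplier * spread, centre + multiplier * spread


def flag_alerts(analysis_values, lower, upper):
    """Flag every chunk whose metric falls outside the [lower, upper] band."""
    return [not (lower <= value <= upper) for value in analysis_values]


# Illustrative per-month ROC AUC values, not taken from the Poundbank data.
reference_auc = [0.95, 0.96, 0.94, 0.95, 0.96, 0.94]
analysis_auc = [0.95, 0.94, 0.85, 0.96]

lower, upper = std_thresholds(reference_auc)
alerts = flag_alerts(analysis_auc, lower, upper)
print(alerts)  # only the third month falls outside the band
```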


In [65]:
months_with_performance_alerts = ['april_2019','may_2019','june_2019']
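
The month labels above were typed in by hand after reading the plot. As a small sketch, they could also be built from monthly chunk keys, assuming those keys look like `'2019-04'` (year, dash, month). The helper `month_label` is hypothetical and not part of `nannyml`.

```python
MONTH_NAMES = [
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
]


def month_label(chunk_key):
    """Turn a monthly chunk key such as '2019-04' into 'april_2019'."""
    year, month = chunk_key.split("-")
    return f"{MONTH_NAMES[int(month) - 1]}_{year}"


# Keys assumed to correspond to the alerting chunks identified above.
alert_keys = ["2019-04", "2019-05", "2019-06"]
labels = [month_label(key) for key in alert_keys]
print(labels)  # ['april_2019', 'may_2019', 'june_2019']
```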

In [66]:
# Determine which feature's drift between the reference and analysis sets correlates most strongly with the drop in realized performance.

features = ['time_since_login_min', 'transaction_amount',
            'transaction_type', 'is_first_transaction', 'user_tenure_months']

# Kolmogorov-Smirnov for continuous features, chi-squared for categorical ones, chunked by month.
uni_drift = nml.UnivariateDriftCalculator(
    continuous_methods=['kolmogorov_smirnov'],
    categorical_methods=['chi2'],
    column_names=features,
    timestamp_column_name='timestamp',
    chunk_period='M',
)

uni_drift.fit(reference)
uni_drift_results = uni_drift.calculate(analysis)
uni_drift_results.plot().show()

# Rank features by how strongly their drift correlates with the change in realized performance.
corr_ranker = nml.CorrelationRanker()
corr_ranker.fit(calculated_results.filter(period='reference'))
corr_ranker_results = corr_ranker.rank(uni_drift_results, calculated_results)
display(corr_ranker_results)

| | column_name | pearsonr_correlation | pearsonr_pvalue | has_drifted | rank |
|---|---|---|---|---|---|
| 0 | time_since_login_min | 0.917039 | 8.675482e-08 | True | 1 |
| 1 | transaction_amount | 0.638245 | 0.004366984 | True | 2 |
| 2 | is_first_transaction | 0.074722 | 0.7682462 | True | 3 |
| 3 | user_tenure_months | -0.015082 | 0.9526362 | True | 4 |
| 4 | transaction_type | -0.064372 | 0.7996784 | True | 5 |


![Kolmogorov-Smirnov and Chi Square metrics on features](univariatedrift.jpg)
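
For the continuous features, the drift plot reports the Kolmogorov-Smirnov statistic: the largest gap between the empirical cumulative distribution functions of two samples. The sketch below computes that statistic in plain Python to show what the number means. It is not NannyML's implementation, and the sample values are made up.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum ECDF gap."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The ECDFs are step functions, so the maximum gap occurs at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Illustrative samples: the second one is shifted to the right.
reference_sample = [1, 2, 3, 4]
analysis_sample = [3, 4, 5, 6]
print(ks_statistic(reference_sample, analysis_sample))  # 0.5
```

A value of 0 means the two samples have identical empirical distributions, and a value of 1 means they do not overlap at all.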

In [67]:
highest_correlation_feature = 'time_since_login_min'
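
The `pearsonr_correlation` column in the ranking is a Pearson correlation coefficient. The sketch below shows how such a coefficient is computed, assuming the ranker pairs each chunk's drift value with that chunk's change in performance (see the NannyML documentation for the exact definition). The two series are hypothetical and not taken from the Poundbank results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical per-month drift values and performance changes.
drift_per_chunk = [0.02, 0.03, 0.02, 0.20, 0.25, 0.30]
performance_change = [0.00, 0.01, 0.00, 0.08, 0.10, 0.12]

r = pearson(drift_per_chunk, performance_change)
print(round(r, 3))
```

A coefficient close to 1, as for `time_since_login_min` in the table above, means that months with stronger drift in that feature are also the months with the larger performance change.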

In [68]:
# Look for months where the average transaction amount deviates enough from the reference baseline to raise an alert.
interested_column = 'transaction_amount'
avg_calculator = nml.SummaryStatsAvgCalculator(
    column_names=[interested_column],
    chunk_period='M',
    timestamp_column_name='timestamp',
)
# Fit on the reference set first so the alert thresholds are derived from it.
avg_calculator.fit(reference)
avg_calc_result = avg_calculator.calculate(analysis)
avg_calc_result.plot().show()

![Average values for transaction_amount](transaction_amount.png)

In [69]:
alert_avg_transaction_amount = 3069.8184
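
The value above is a monthly mean of `transaction_amount`. As a rough sketch of the per-chunk statistic being plotted, the snippet below groups rows by calendar month and averages the amounts in plain Python. The helper `monthly_means` is hypothetical. The first two rows reuse values from the preview of `reference.csv`, and the February rows are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime


def monthly_means(rows):
    """Average the amounts per calendar month; rows are (timestamp, amount) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # month key -> [running sum, row count]
    for timestamp, amount in rows:
        parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")
        key = parsed.strftime("%Y-%m")
        totals[key][0] += amount
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in sorted(totals.items())}


rows = [
    ("2018-01-01 00:00:00.000", 3981.1),
    ("2018-01-01 00:08:43.152", 1267.9),
    ("2018-02-01 00:00:00.000", 2000.0),  # invented row
    ("2018-02-01 00:10:00.000", 3000.0),  # invented row
]
means = monthly_means(rows)
print(means)
```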