# **Fraud Detection in Business Transactions**

## **Objective**

Your goal is to tamper with the transactions in `fraud.csv` and submit the poisoned file, so that the company's fraud detection system does not classify any of the transactions as fraudulent.

In `fraud.csv`, you will see the following fields:

0. `Id` - Transaction ID

1. `PurchaseAmount` - The amount spent in the transaction

2. `NumItems` - The number of items bought

3. `ShippingDistance` - The distance between where the item was bought and where it was shipped to, in kilometers

4. `PaymentMethod` - The method used to pay for the items

    * `0` = Credit Card

    * `1` = Debit Card

    * `2` = PayPal

    * `3` = Gift Card

5. `TransactionHour` (from 0 to 23) - The time of day when the transaction was made
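While editing the file it can help to confirm that every row still respects the field definitions above. The following is a minimal sketch, not part of the original notebook: the helper name `find_schema_problems` is hypothetical, the column order is assumed to match the field list, and the sample rows are made up for illustration.

```python
import pandas as pd

# Allowed values taken from the field list above.
VALID_PAYMENT_METHODS = [0, 1, 2, 3]
# Assumption: the CSV columns appear in the same order as the field list.
EXPECTED_COLUMNS = ["Id", "PurchaseAmount", "NumItems",
                    "ShippingDistance", "PaymentMethod", "TransactionHour"]


def find_schema_problems(df: pd.DataFrame) -> list:
    """Return a list of problems; an empty list means the frame looks valid."""
    problems = []
    if list(df.columns) != EXPECTED_COLUMNS:
        problems.append(f"unexpected columns: {list(df.columns)}")
        return problems
    if not df["PaymentMethod"].isin(VALID_PAYMENT_METHODS).all():
        problems.append("PaymentMethod must be 0, 1, 2 or 3")
    if not df["TransactionHour"].between(0, 23).all():
        problems.append("TransactionHour must be between 0 and 23")
    return problems


# Made-up rows for illustration only; they are not taken from fraud.csv.
sample = pd.DataFrame({
    "Id": [1, 2],
    "PurchaseAmount": [49.99, 310.00],
    "NumItems": [2, 7],
    "ShippingDistance": [12.5, 840.0],
    "PaymentMethod": [0, 3],
    "TransactionHour": [14, 24],  # 24 is outside the 0 to 23 range
})

print(find_schema_problems(sample))
```

An empty list means the checked columns are within the documented ranges; it says nothing about how the model will classify the rows.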

## **1. Loading a pretrained `K-Nearest Neighbours` model**

### Import necessary Python libraries

If you are running this locally, ensure the following Python libraries have been installed:
* `pandas`: https://pandas.pydata.org/
* `scikit-learn`: https://scikit-learn.org/stable/index.html
* `numpy`: https://numpy.org/

In [None]:
import pandas as pd
import sklearn
import pickle
import numpy as np

**If you are using Google Colab, upload the `fraud_classifier.pkl` file under the "Files" section now.** Remember to uncomment the correct lines depending on whether you are running this notebook on Google Colab or locally.

In [None]:
model_filepath = "./content/fraud_classifier.pkl"

# If using Google Colab, uncomment the line below
# model_filepath = "/content/fraud_classifier.pkl"

with open(model_filepath, 'rb') as f:
    KNN = pickle.load(f)

print(KNN)
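To build intuition for why changing a transaction's features can change its label, here is a textbook sketch of K-Nearest Neighbours classification in plain NumPy. It is an illustration under assumptions, not the pickled model itself: the value of `k`, the Euclidean distance metric, and the two-feature training data are all made up, and the loaded classifier may use different settings or preprocessing.

```python
import numpy as np


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training row.
    distances = np.linalg.norm(train_X - query, axis=1)
    # Indices of the k closest training rows.
    nearest = np.argsort(distances)[:k]
    # Count the labels among those neighbours and return the most common one.
    votes = np.bincount(train_y[nearest])
    return int(np.argmax(votes))


# Made-up 2-feature training data: label 0 = not fraudulent, 1 = fraudulent.
train_X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
                    [8.0, 8.0], [9.0, 8.5], [8.5, 9.0]])
train_y = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(train_X, train_y, np.array([8.2, 8.1])))  # prints 1
print(knn_predict(train_X, train_y, np.array([1.2, 1.4])))  # prints 0
```

A point is labelled by the company it keeps: moving a transaction closer to non-fraudulent training points changes which neighbours get a vote.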

## **2. Preview the fraudulent transactions**

**If you are using Google Colab, upload the `fraud.csv` file under the "Files" section now.**

The following code block displays the `fraud.csv` file, where most of the transactions are flagged as fraudulent. Remember to uncomment the correct line depending on whether you are running this notebook on Google Colab or locally.

In [None]:
# If using Google Colab, uncomment the line below
# fraud_data = pd.read_csv("/content/fraud.csv")

# Otherwise, if you are running this notebook locally, uncomment the line below
# fraud_data = pd.read_csv("./content/fraud.csv")

fraud_data

We can see how many transactions are currently flagged as fraudulent.

In [None]:
preds = KNN.predict(fraud_data.drop("Id", axis=1))
fraudulent_transactions = (preds == 1).sum()
total_transactions = len(preds)
percentage_fraudulent = (fraudulent_transactions / total_transactions) * 100

print(f"{percentage_fraudulent:.2f}% of transactions were classified as fraudulent!")
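Beyond the overall percentage, it is often useful to know which transactions are flagged. The sketch below shows one way to pull out their IDs with a boolean mask. To stay self-contained it uses made-up stand-ins (`example_data` and `example_preds`) rather than the real `fraud_data` and `preds`; the same indexing pattern should apply to those.

```python
import numpy as np
import pandas as pd

# Stand-in data: made-up rows and predictions, not taken from fraud.csv.
example_data = pd.DataFrame({
    "Id": [101, 102, 103, 104],
    "PurchaseAmount": [20.0, 950.0, 35.5, 1200.0],
})
example_preds = np.array([0, 1, 0, 1])  # 1 = fraudulent, as in the cell above

# The boolean mask selects only the rows the model flagged.
flagged_ids = example_data.loc[example_preds == 1, "Id"].tolist()
print(flagged_ids)  # prints [102, 104]
```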

Additionally, we can also input custom data into the model to see whether that transaction is classified as fraud or not.

In [None]:
# Modify the variables below
PurchaseAmount = 100
NumItems = 100
ShippingDistance = 100
PaymentMethod = 0
TransactionHour = 0

# predict() expects a 2D array (one row per transaction) and returns an array of labels
features = np.array([[PurchaseAmount, NumItems, ShippingDistance, PaymentMethod, TransactionHour]])
is_fraud = KNN.predict(features)[0]

if is_fraud == 1:
    print("This transaction is fraudulent! ❌❌")
else:
    print("This transaction is not fraudulent. ✅✅")

## **Your task now is to modify `fraud.csv` so that most of the transactions are no longer classified as fraudulent. You can re-run the code cells above to help you in your modifications.**
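One possible workflow is to edit a copy of the data in pandas, re-run the prediction cell on it, and write the result back to CSV. The sketch below covers only the edit-and-save step, under assumptions: the rows are made up, the edit is hypothetical and not a known solution, and the output goes to a temporary directory so nothing is overwritten.

```python
import os
import tempfile

import pandas as pd

# Made-up rows standing in for fraud.csv.
data = pd.DataFrame({
    "Id": [1, 2],
    "PurchaseAmount": [950.0, 1200.0],
    "NumItems": [40, 55],
    "ShippingDistance": [3000.0, 4200.0],
    "PaymentMethod": [3, 3],
    "TransactionHour": [3, 2],
})

# Work on a copy so the original frame stays available for comparison.
modified = data.copy()
# Hypothetical edit: change one feature of the row with Id == 1.
modified.loc[modified["Id"] == 1, "PurchaseAmount"] = 45.0

# index=False keeps pandas from writing an extra unnamed index column.
out_path = os.path.join(tempfile.mkdtemp(), "fraud.csv")
modified.to_csv(out_path, index=False)

# Reload to confirm the file still has the same columns in the same order.
reloaded = pd.read_csv(out_path)
print(list(reloaded.columns) == list(data.columns))  # prints True
```

Keeping the `Id` column and the original column order means the saved file can be fed straight back into the cells above.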