<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Financial Fraud Detection with ClearScape Analytics using SQL
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial'><b>Introduction</b></p>
<p style = 'font-size:16px;font-family:Arial'>
    Fraud attempts have risen sharply in recent years, making fraud detection imperative for banking and financial institutions. Despite extensive effort and human supervision, hundreds of millions of dollars are still lost to fraud. Fraud takes many forms, e.g., stolen credit cards, misleading accounting, and phishing emails. Because fraud cases are rare within very large transaction populations, detecting them has become more and more challenging.
    <br>
    <br>
    With ClearScape Analytics, data scientists can use their preferred language, tools and platform to develop models to identify this fraud. Even in large scale operations, users have the guarantee that Vantage can scale to their needs and reduce fraud.</p>
    
<p style = 'font-size:18px;font-family:Arial'><b>Business Values</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Identification of financial fraud in multiple accounts</li>
    <li>Pattern recognition of fraudulent versus normal transactions</li>
    <li>Reduction of losses through recovery of fraudulent charges</li>
    <li>Improved customer satisfaction and reduction of customer churn</li>
</ul>

<p style = 'font-size:18px;font-family:Arial'><b>Why Vantage?</b></p>
<p style = 'font-size:16px;font-family:Arial'>To maximize the business value of advanced analytic techniques including Machine Learning and Artificial Intelligence, it is estimated that organizations must scale their model development and deployment pipelines to 100s or 1000s of times greater amounts of data, models, or both.
    <br>
    <br>
    ClearScape Analytics provides powerful, flexible end-to-end data connectivity, feature engineering, model training, evaluation, and operational functions that can be deployed at scale as enterprise data assets; treating the products of ML and AI as first-class analytic processes in the enterprise.</p>

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial'><b>1. Connect to Vantage</b></p>
<p style = 'font-size:16px;font-family:Arial'>We will be prompted to provide the password. We will enter the password, press the Enter key, and then use the down arrow to go to the next cell. Begin running steps with Shift + Enter keys.</p>

In [None]:
%connect local, hidewarnings=true

<p style = 'font-size:16px;font-family:Arial'>Setup for execution of notebook. Begin running steps with Shift + Enter keys.</p>


In [None]:
SET query_band='DEMO=Financial_Fraud_Detection_InDB_SQL.ipynb;' UPDATE FOR SESSION;

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial'><b>2. Getting Data for This Demo</b></p>
<p style = 'font-size:16px;font-family:Arial'>We have provided data for this demo on cloud storage. We have the option of either running the demo using foreign tables to access the data without using any storage on our environment or downloading the data to local storage, which may yield somewhat faster execution. However, we need to consider available storage. There are two statements in the following cell, and one is commented out. We may switch which mode we choose by changing the comment string.</p> 


In [None]:
-- call get_data('DEMO_GLM_Fraud_cloud');    -- takes about 1 minute
call get_data('DEMO_GLM_Fraud_local');    -- takes about 2 minutes

<p style = 'font-size:16px;font-family:Arial'>Optional step – We should execute the below step only if we want to see the status of databases/tables created and space used.</p>


In [None]:
call space_report();  -- optional, takes about 10 seconds

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>3. Data Exploration</b>
<p style = 'font-size:16px;font-family:Arial'>We loaded the data from <a href = 'https://www.kaggle.com/code/georgepothur/4-financial-fraud-detection-xgboost/data'>https://www.kaggle.com/code/georgepothur/4-financial-fraud-detection-xgboost/data</a> into Vantage in a table named "transaction_data". Below, we check the data size and print sample rows; the table has about 63k rows and 12 columns.</p>
<p style = 'font-size:16px;font-family:Arial'><b><i>*Please scroll down to the end of the notebook for detailed column descriptions of the dataset.</i></b></p>

In [None]:
Select count(*) from DEMO_GLM_Fraud.transaction_data;

In [None]:
Select Top 5 * from DEMO_GLM_Fraud.transaction_data;

<p style = 'font-size:16px;font-family:Arial'>In this simulated scenario, deceptive agents engage in transactions with the objective of taking control of customers' accounts, transferring funds to another account, and ultimately cashing out for profit.</p>

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>3.1 How many fraudulent transactions do we have in our dataset?</b></p>

In [None]:
SELECT SUM(
            CASE
                WHEN isFraud=1 THEN 1 ELSE 0
            END) AS "Fraud Transactions",
        COUNT(*) AS "Total Transactions",
        CAST(SUM(CAST(
            CASE
                WHEN isFraud=1 THEN 1 ELSE 0
            END AS DECIMAL(7,4)))/
            COUNT(*)  * 100 AS DECIMAL(4,2) format '9.99') || '%' AS "Percent Fraud Transactions"
FROM DEMO_GLM_Fraud.transaction_data;

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>3.2 How many transactions do we have, grouped by transaction type?</b></p>

In [None]:
select "type", count(*) as typecnt from DEMO_GLM_Fraud.transaction_data group by 1;

In [None]:
%chart x=type, y=typecnt, mark=bar, title="No of Transactions per Transaction Type", 
            width=700, height=300, gridx=false, gridy=false

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>3.3 How many fraudulent transactions do we have, grouped by transaction type?</b></p>

In [None]:
SELECT "type" as "Transaction Type",
    SUM(
    CASE
        WHEN isFraud=1 THEN 1 ELSE 0
    END) AS "Fraud Transactions by Type"
FROM DEMO_GLM_Fraud.transaction_data
    GROUP BY 1 HAVING "Fraud Transactions by Type" > 0 ;

In [None]:
%chart x="Transaction Type", y="Fraud Transactions by Type", mark=bar, title="No of Fraud Transactions per Transaction Type", 
            width=500, height=400, gridx=false, gridy=false, labelx="Transaction Type", 
            labely= "Count of Fraud Transactions by Type"

<p style = 'font-size:16px;font-family:Arial'>From the above result, we can see that out of the 92 fraud transactions, 47 are from transaction type "TRANSFER" and 45 are from "CASH_OUT".</p>

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>3.4 What percentage of fraudulent transactions do we have where transaction amount is equal to old balance in the origin account?</b></p>

<p style = 'font-size:16px;font-family:Arial'>This might be the case where the fraudster emptied the account of the victim.</p>

In [None]:
SELECT SUM(
    CASE
        WHEN isFraud=1 AND amount=oldbalanceOrig THEN 1 ELSE 0
    END) AS "Cleanout Fraud Transactions",
        CAST(SUM(CAST(
    CASE
        WHEN isFraud=1 AND amount=oldbalanceOrig THEN 1 ELSE 0
    END AS DECIMAL(7,4)))/
    SUM(
    CASE
        WHEN isFraud=1 THEN 1 ELSE 0
    END) * 100 AS DECIMAL(5,2)) || '%' AS "Percent Cleanout Fraud Transactions"
FROM DEMO_GLM_Fraud.transaction_data;

<p style = 'font-size:16px;font-family:Arial'>From the above result, we can see that in 90 of the 92 fraud transactions, the amount was equal to the full balance of the origin account.</p>

<hr style="height:1px;border:none;">
<p style = 'font-size:16px;font-family:Arial'><b>Below are some insights about the dataset:</b></p>
<ol style = 'font-size:16px;font-family:Arial'>
    <li>We have 92 fraud transactions, which account for 0.14% of the dataset.</li>
    <li>Out of these 92 fraud transactions, 47 are of type TRANSFER, and 45 are of type CASH_OUT.</li>
    <li>Approximately 97.83% of our fraud transactions have a transaction amount equal to oldbalanceOrig, indicating account cleanout.</li>
    <li>About 71.74% of our fraud transactions have the recipient's old balance as zero.</li>
    <li>The isFlaggedFraud indicator is correct only two times among our 92 fraud transactions.</li>
</ol>
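
<p style = 'font-size:16px;font-family:Arial'>As a quick sanity check on the percentages above, the arithmetic can be reproduced outside the database. The following is a minimal Python sketch that uses only the counts quoted in the text; the helper name <code>percent</code> is introduced here for illustration, and the row count of 66 is derived from the stated 71.74%, not taken from a query result.</p>

```python
# Counts quoted in the notebook text above.
fraud_total = 92
fraud_transfer = 47
fraud_cash_out = 45
cleanout_fraud = 90


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# TRANSFER and CASH_OUT together account for every fraud row.
assert fraud_transfer + fraud_cash_out == fraud_total

# Share of fraud rows where amount equals oldbalanceOrig (account cleanout).
cleanout_pct = percent(cleanout_fraud, fraud_total)
print(f"Cleanout share of fraud transactions: {cleanout_pct}%")  # 97.83%

# 71.74% of 92 rows corresponds to 66 rows (derived, not queried).
zero_dest_rows = round(0.7174 * fraud_total)
print(zero_dest_rows, percent(zero_dest_rows, fraud_total))
```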

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>3.5 Univariate statistics</b></p>

<p style = 'font-size:16px;font-family:Arial'>The <code>TD_UnivariateStatistics</code> function computes the count, mean, standard deviation, minimum, percentiles, and maximum for numeric columns.</p>

In [None]:
SELECT TOP 5 *
    FROM TD_UnivariateStatistics(
    ON "DEMO_GLM_Fraud"."transaction_data" AS InputTable
    USING
    TargetColumns('step','amount','oldbalanceOrig','newbalanceOrig','oldbalanceDest',
                                'newbalanceDest','isFraud','isFlaggedFraud','txn_id')
    Stats('COUNT','MAXIMUM','MEAN','MINIMUM','PERCENTILES','STANDARD DEVIATION')
    Centiles(25,50,75)
    ) AS t;

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>3.6 Checking for Null Values</b></p>
<p style = 'font-size:16px;font-family:Arial'>The TD_ColumnSummary() function can be used to take a quick look at the columns, their datatypes, and summary of NULLs/non-NULLs for a given table.</p>

In [None]:
SELECT * FROM TD_ColumnSummary(
    ON "DEMO_GLM_Fraud"."transaction_data" AS InputTable
    USING
    TargetColumns('[:]')
) as t;

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>3.7 Checking for Outliers</b></p>
<p style = 'font-size:16px;font-family:Arial'>The TD_OutlierFilterFit() function calculates the lower percentile, upper percentile, count of rows, and median for all the columns listed in TargetColumns. These per-column metrics are then used by the TD_OutlierFilterTransform() function to detect and filter outliers in the data.</p>
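
<p style = 'font-size:16px;font-family:Arial'>To make the fit/transform idea concrete, here is a minimal Python sketch of percentile-based outlier filtering on a toy list. It is an illustration only, not the in-database implementation: the function names <code>fit_bounds</code> and <code>filter_outliers</code>, the 5th/95th percentile cut-offs, and the interpolation rule are assumptions made for this example.</p>

```python
def percentile(values, q):
    """Linear-interpolation percentile of a list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


def fit_bounds(values, lower_q=0.05, upper_q=0.95):
    """'Fit' step: compute the cut-offs once so they can be reused."""
    return percentile(values, lower_q), percentile(values, upper_q)


def filter_outliers(values, bounds):
    """'Transform' step: keep only values inside the fitted bounds."""
    low, high = bounds
    return [v for v in values if low <= v <= high]


# Toy transaction amounts with one extreme value (assumed data).
amounts = [10, 12, 11, 13, 12, 14, 11, 10, 13, 12, 5000]
bounds = fit_bounds(amounts)
kept = filter_outliers(amounts, bounds)
print(bounds, len(amounts) - len(kept), "row(s) filtered")
```

<p style = 'font-size:16px;font-family:Arial'>Separating the two steps means the cut-offs learned on one table can be applied unchanged to another, which is the same reason the notebook stores the fit output in its own table.</p>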



In [None]:
SELECT TOP 5 *
    FROM TD_OutlierFilterFit(
    ON "DEMO_GLM_Fraud"."transaction_data" AS InputTable
    OUT TABLE OutputTable("OutlierFilterFit_out")
    USING
    TargetColumns('amount','newbalanceOrig','oldbalanceDest','newbalanceDest','oldbalanceOrig')
    ) AS sqlmr;

In [None]:
CREATE MULTISET TABLE OutlierFiltertransform_out AS (
    SELECT *
        FROM TD_OutlierFilterTransform(
        ON "DEMO_GLM_Fraud"."transaction_data" AS InputTable 
        ON "OutlierFilterFit_out" AS FitTable DIMENSION
    ) AS sqlmr )WITH DATA;

In [None]:
select Top 5 * from OutlierFiltertransform_out;

In [None]:
SELECT totalrows as "Total Rows",
    rowsafteroutliers as "Rows after Outlier Removal",
    totalrows-rowsafteroutliers AS "Number of Outliers"
    FROM
    (
    SELECT COUNT(*) AS totalrows
        FROM "DEMO_GLM_Fraud"."transaction_data") a,
        (
    SELECT COUNT(*) AS rowsafteroutliers
        FROM "OutlierFiltertransform_out")b;

In [None]:
SELECT TOP 5 *
    FROM 
    (
    SELECT *
        FROM "DEMO_GLM_Fraud"."transaction_data" MINUS
    SELECT *
        FROM "OutlierFiltertransform_out") a;

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>4. Data Preparation</b>

<p style='font-size:16px;font-family:Arial'><b>We'll perform the following steps:</b></p>
<ul style='font-size:16px;font-family:Arial'>
    <li>We will one-hot encode the categorical "type" column.</li>
    <li>We will perform feature scaling using ScaleFit and ScaleTransform on numerical columns.</li>
    <li>We will split the data into training and testing datasets (80:20 split).</li>
</ul>

<p style='font-size:16px;font-family:Arial'>We perform feature scaling during data pre-processing to handle features with widely varying magnitudes or units. Without scaling, many machine learning algorithms give more weight to features with larger numeric values and less weight to features with smaller values, regardless of the units involved.</p>
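
<p style='font-size:16px;font-family:Arial'>As an illustration of what scaling does, the sketch below standardizes a column to zero mean and unit standard deviation in plain Python. Standardization is only one common scaling method and is chosen here as an assumption; the helper names <code>fit_scale</code> and <code>transform_scale</code> and the toy values are hypothetical and do not come from the notebook.</p>

```python
from statistics import mean, pstdev


def fit_scale(values):
    """'Fit' step: learn the mean and population standard deviation."""
    return mean(values), pstdev(values)


def transform_scale(values, params):
    """'Transform' step: map each value to (value - mean) / std."""
    mu, sigma = params
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Two toy features on very different scales (assumed data).
amount = [100.0, 250000.0, 1200.0, 56000.0]
step = [1.0, 2.0, 3.0, 4.0]

scaled_amount = transform_scale(amount, fit_scale(amount))
scaled_step = transform_scale(step, fit_scale(step))

# After scaling, both features have mean 0 and standard deviation 1.
print(round(pstdev(scaled_amount), 6), round(pstdev(scaled_step), 6))
```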

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>4.1 Drop redundant columns</b></p>
<p style = 'font-size:16px;font-family:Arial'>We don't need nameDest, nameOrig, and isFlaggedFraud for model training as they do not impact the outcome. We have txn_id to uniquely identify each transaction.</p>

In [None]:
CREATE MULTISET TABLE txn_data AS (
    SELECT step,
        "type",
        amount,
        oldbalanceOrig,
        newbalanceOrig,
        oldbalanceDest,
        newbalanceDest,
        isFraud,
        txn_id
        FROM "DEMO_GLM_Fraud"."transaction_data")WITH DATA PRIMARY INDEX(txn_id);

In [None]:
select Top 5 * from txn_data;

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>4.2 One-hot encoding</b></p>
<p style='font-size:16px;font-family:Arial'>
Here, we are one-hot encoding the "type" column. We find one-hot encoding necessary in many cases to represent categorical variables as binary values, enable numerical processing, ensure feature independence, handle non-numeric data, and improve the performance and interpretability of our machine learning models.
</p>
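
<p style='font-size:16px;font-family:Arial'>The idea behind one-hot encoding can be sketched in a few lines of Python. The output column names follow the <code>type_0</code> ... <code>type_other</code> pattern used later in this notebook, but the assignment of categories to indexes (alphabetical here) and the helper names <code>fit_categories</code> and <code>one_hot</code> are assumptions made for illustration.</p>

```python
def fit_categories(values):
    """'Fit' step: record the distinct categories in a fixed order."""
    return sorted(set(values))


def one_hot(value, categories, prefix="type"):
    """'Transform' step: one binary column per known category, plus
    an 'other' column for values not seen during the fit step."""
    row = {f"{prefix}_{i}": int(value == cat)
           for i, cat in enumerate(categories)}
    row[f"{prefix}_other"] = int(value not in categories)
    return row


# Toy "type" values (assumed data).
types = ["PAYMENT", "TRANSFER", "CASH_OUT", "DEBIT", "CASH_IN", "TRANSFER"]
categories = fit_categories(types)

encoded = one_hot("TRANSFER", categories)
unseen = one_hot("REFUND", categories)
print(categories)
print(encoded)
```

<p style='font-size:16px;font-family:Arial'>Exactly one column is set to 1 in every encoded row, so the model receives the category as numeric input without any implied ordering between categories.</p>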

In [None]:
CREATE MULTISET TABLE onehotencodingfittable AS (
    SELECT *
        FROM TD_OneHotEncodingFit (
        ON txn_data AS InputTable
        USING
        TargetColumn ('"type"')
        IsInputDense ('true')
        CategoryCounts(5)
        Approach('Auto')    
) AS dt) WITH DATA;

In [None]:
CREATE MULTISET TABLE clean_data AS (
    SELECT *
        FROM TD_OneHotEncodingTransform (
        ON txn_data AS InputTable
        ON onehotencodingfittable AS FitTable DIMENSION
        USING
        IsInputDense('True')
) AS dt ) WITH DATA PRIMARY INDEX(txn_id);

In [None]:
select Top 5 * from clean_data;

<p style = 'font-size:16px;font-family:Arial'>The above output shows the one-hot encoded dataset: the categorical "type" column is now represented by the binary indicator columns type_0 through type_4 and type_other.</p>

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>5. Create training and testing datasets in Vantage</b>
<p style = 'font-size:16px;font-family:Arial'>We'll create two datasets for training and testing in the ratio of 80:20.</p>
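
<p style = 'font-size:16px;font-family:Arial'>A seeded random split can be sketched in plain Python as follows. This is an illustration of the concept only: the function name <code>train_test_flags</code> is hypothetical, and the shuffling procedure is an assumption rather than a description of how the in-database function assigns rows.</p>

```python
import random


def train_test_flags(ids, train_size=0.8, seed=25):
    """Return {id: 1 for train rows, 0 for test rows}.

    A fixed seed makes the split reproducible across runs.
    """
    ids = list(ids)
    rng = random.Random(seed)
    shuffled = ids[:]
    rng.shuffle(shuffled)
    cut = int(round(train_size * len(shuffled)))
    train_ids = set(shuffled[:cut])
    return {i: 1 if i in train_ids else 0 for i in ids}


# Toy transaction ids 1..100 (assumed data).
flags = train_test_flags(range(1, 101))
n_train = sum(flags.values())
n_test = len(flags) - n_train
print(n_train, n_test)
```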

In [None]:
CREATE MULTISET TABLE traintest_data AS (
    SELECT *
        FROM TD_TrainTestSplit(
        ON clean_data AS InputTable
        USING
        IDColumn('txn_id')
        seed(25)
        trainSize(0.8)
        testSize(0.2)
) AS sqlmr) WITH DATA PRIMARY INDEX(txn_id);

In [None]:
CREATE MULTISET TABLE clean_data_train AS (
    SELECT step,
        type_0,
        type_1,
        type_2,
        type_3,
        type_4,
        type_other,
        amount,
        oldbalanceOrig,
        newbalanceOrig,
        oldbalanceDest,
        newbalanceDest,
        isFraud,
        txn_id
        FROM traintest_data
        WHERE TD_IsTrainRow = 1)WITH DATA PRIMARY INDEX(txn_id);

In [None]:
CREATE MULTISET TABLE clean_data_test AS (
    SELECT step,
        type_0,
        type_1,
        type_2,
        type_3,
        type_4,
        type_other,
        amount,
        oldbalanceOrig,
        newbalanceOrig,
        oldbalanceDest,
        newbalanceDest,
        isFraud,
        txn_id
        FROM traintest_data
        WHERE TD_IsTrainRow = 0)WITH DATA PRIMARY INDEX(txn_id);

<p style = 'font-size:16px;font-family:Arial'>The statements above split the data into training and testing datasets.</p>

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>6. In-Database XGBoost model training</b>

<p style = 'font-size:16px;font-family:Arial'>XGBoost (eXtreme Gradient Boosting) is an implementation of the gradient boosted decision tree algorithm designed for speed and performance, and the TD_XGBoost() function runs it inside the database. It is widely used for machine learning on structured (tabular) data.</p>
<p style = 'font-size:16px;font-family:Arial'>In gradient boosting, each iteration fits a model to the residuals (errors) of the previous iteration to correct the errors made by existing models. The predicted residual is multiplied by a learning rate (the ShrinkageFactor argument below) and then added to the previous prediction. Models are added sequentially until no further improvements can be made. It is called gradient boosting because it uses a gradient descent algorithm to minimize the loss when adding new models.</p>
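
<p style = 'font-size:16px;font-family:Arial'>The residual-fitting loop described above can be sketched in plain Python. This is a deliberately simplified illustration, assuming squared-error loss (where the negative gradient equals the residual), a single feature, and one-split trees ("stumps"); it is not how TD_XGBoost() is implemented, and all names and data here are hypothetical.</p>

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree to the residuals by least squares."""
    best = None
    # The largest x is skipped so the right-hand side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean


def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Add stumps sequentially, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Shrink each correction by the learning rate before adding it.
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data: the target switches from 0 to 1 when x exceeds 4.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

mean_y = sum(ys) / len(ys)
mse_before = mse(lambda x: mean_y, xs, ys)
model = boost(xs, ys)
mse_after = mse(model, xs, ys)
print(mse_before, mse_after)
```

<p style = 'font-size:16px;font-family:Arial'>A smaller learning rate makes each tree's correction more cautious, so more rounds are needed, but the model is less likely to overfit to any single tree.</p>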


In [None]:
CREATE MULTISET TABLE xgb_model AS (
    SELECT *
        FROM TD_XGBoost(
        ON "clean_data_train" AS "input"
        PARTITION BY ANY
        USING
        InputColumns('amount','newbalanceOrig','oldbalanceDest','newbalanceDest','oldbalanceOrig',
                                            'type_0','type_1','type_2','type_3','type_4')
        ResponseColumn('isFraud')
        MaxDepth(7)
        Seed(42)
        ModelType('Classification')
        RegularizationLambda(120.0)
        ShrinkageFactor(0.1)
) AS sqlmr )WITH DATA;

<p style = 'font-size:16px;font-family:Arial'>The function output is a trained XGBoost model, and we can input it to the TD_XGBoostPredict() function for prediction.</p>

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>7. In-Database XGBoost model scoring</b>

<p style = 'font-size:16px;font-family:Arial'>The TD_XGBoostPredict() function scores new data using the model generated by TD_XGBoost(), which supports both classification and regression.</p>
<p style = 'font-size:16px;font-family:Arial'>
When using the function, we should provide only numeric features. We need to convert the categorical features to numeric values before prediction.</p>

In [None]:
CREATE MULTISET TABLE XGBPredict AS (
    SELECT *
        FROM TD_XGBoostPredict(
        ON "clean_data_test" AS inputtable
        PARTITION BY ANY 
        ON "xgb_model" AS ModelTable
        DIMENSION
        ORDER BY "task_index", "tree_num", "iter", "tree_order"
        USING
        IdColumn('txn_id')
        Accumulate('isFraud')
        OutputProb('True')
        ModelType('Classification')
        Responses('0','1')
) AS sqlmr )WITH DATA;

In [None]:
Select Top 5 * from XGBPredict;

<p style = 'font-size:16px;font-family:Arial'>The output above shows prob_1, the probability that the transaction is fraudulent, and prob_0, the probability that it is not. The Prediction column holds the class label assigned from these probabilities.</p>
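
<p style = 'font-size:16px;font-family:Arial'>The step from probabilities to labels, and from labels to evaluation metrics, can be sketched in plain Python. The 0.5 cut-off is an assumption made for this example (the notebook does not state how the in-database function derives its label), and the helper names and toy rows are hypothetical.</p>

```python
def assign_label(prob_1, threshold=0.5):
    """Label a row as fraud (1) when its fraud probability is high enough."""
    return 1 if prob_1 >= threshold else 0


def confusion_counts(observed, predicted):
    """Return (tp, fp, fn, tn) for binary labels."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    return tp, fp, fn, tn


# Toy rows of (isFraud, prob_1); assumed data, not model output.
rows = [(0, 0.02), (0, 0.10), (1, 0.91), (1, 0.40), (0, 0.65), (1, 0.77)]
observed = [o for o, _ in rows]
predicted = [assign_label(p) for _, p in rows]

tp, fp, fn, tn = confusion_counts(observed, predicted)
precision = tp / (tp + fp)  # share of flagged rows that are truly fraud
recall = tp / (tp + fn)     # share of fraud rows that were flagged
print(predicted, (tp, fp, fn, tn), precision, recall)
```

<p style = 'font-size:16px;font-family:Arial'>With a class as rare as fraud, precision and recall are usually more informative than accuracy, since a model that never predicts fraud would still be correct on almost every row.</p>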

In [None]:
SELECT *
    FROM TD_ClassificationEvaluator(
    ON XGBPredict AS InputTable
    PARTITION BY ANY
    OUT TABLE OutputTable(classeval_out)
    USING
    ObservationColumn('isFraud')
    PredictionColumn('Prediction')
    Labels('0','1')
    ) AS sqlmr;

In [None]:
select * from classeval_out;

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>8. Visualize the results (ROC curve and AUC)</b>

<p style = 'font-size:16px;font-family:Arial'>We create the ROC curve, which plots the TPR (True Positive Rate) against the FPR (False Positive Rate) across classification thresholds. The area under the ROC curve (AUC) measures how well our model distinguishes between the positive and negative classes; a higher AUC indicates better separation. We generally consider an AUC above 0.75 as decent.</p>
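
<p style = 'font-size:16px;font-family:Arial'>For intuition, the ROC points and the AUC can be computed by hand on a toy example. The Python sketch below sweeps evenly spaced thresholds and integrates with the trapezoidal rule; the threshold grid, the helper names <code>roc_points</code> and <code>auc</code>, and the toy scores are assumptions for illustration and are not taken from the in-database function.</p>

```python
def roc_points(observed, scores, num_thresholds=300):
    """Return sorted, de-duplicated (fpr, tpr) points over a threshold grid."""
    pos = sum(observed)
    neg = len(observed) - pos
    points = {(0.0, 0.0)}  # a threshold above every score flags nothing
    for k in range(num_thresholds + 1):
        threshold = k / num_thresholds
        tp = sum(1 for o, s in zip(observed, scores)
                 if o == 1 and s >= threshold)
        fp = sum(1 for o, s in zip(observed, scores)
                 if o == 0 and s >= threshold)
        points.add((fp / neg, tp / pos))
    return sorted(points)


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy labels and fraud probabilities (assumed data, not model output).
observed = [0, 0, 1, 1, 0, 1]
scores = [0.02, 0.10, 0.91, 0.40, 0.65, 0.77]

points = roc_points(observed, scores)
area = auc(points)
print(points)
print(round(area, 4))  # 0.8889
```

<p style = 'font-size:16px;font-family:Arial'>The result equals the fraction of (fraud, non-fraud) pairs in which the fraud row receives the higher score: 8 of the 9 pairs in this toy data.</p>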

In [None]:
SELECT *
    FROM TD_ROC(
    ON XGBPredict AS InputTable PARTITION BY ANY
    OUT TABLE OutputTable(roc_out)
    USING
    ProbabilityColumn('"Prob_1"')
    ObservationColumn('isFraud')
    NumThresholds(300)
    ) AS sqlmr;

In [None]:
select * from roc_out;

In [None]:
%chart x=fpr, y=tpr, mark=line, title="ROC", 
            width=500, height=400, gridx=false, gridy=false, labelx="False Positive Rate", 
            labely= "True Positive Rate"

<p style = 'font-size:16px;font-family:Arial'>Looking at the above ROC curve, we can see that our model performs well on the testing data. The AUC value is above 0.75, which is consistent with that reading.</p>

<p style = 'font-size:18px;font-family:Arial'><b>Conclusion</b></p>

<p style = 'font-size:16px;font-family:Arial'>In this demonstration, we have illustrated a simplified - but complete - overview of how we can implement a typical machine learning workflow completely inside the database using Vantage. This allows us to leverage Vantage's operational scale, power, and stability.</p>

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>9. Cleanup</b>

<p style = 'font-size:18px;font-family:Arial'><b>Work Tables</b></p>
<p style = 'font-size:16px;font-family:Arial'>We need to clean up our work tables to prevent errors next time.</p>

In [None]:
DROP TABLE OutlierFilterFit_out;

In [None]:
DROP TABLE OutlierFiltertransform_out;

In [None]:
DROP TABLE txn_data;

In [None]:
DROP TABLE onehotencodingfittable;

In [None]:
DROP TABLE xgb_model;

In [None]:
DROP TABLE XGBPredict;

In [None]:
DROP TABLE classeval_out;

In [None]:
DROP TABLE roc_out;

In [None]:
DROP TABLE traintest_data;

In [None]:
DROP TABLE clean_data;

In [None]:
DROP TABLE clean_data_train;

In [None]:
DROP TABLE clean_data_test;

<p style = 'font-size:18px;font-family:Arial'> <b>Databases and Tables </b></p>
<p style = 'font-size:16px;font-family:Arial'>We will use the following code to clean up tables and databases created for this demonstration.</p>

In [None]:
call remove_data('demo_glm_fraud');

<hr style="height:2px;border:none;">

<b style = 'font-size:20px;font-family:Arial'>Required Materials</b>
<p style = 'font-size:16px;font-family:Arial'>Let’s look at the elements we have available for reference for this notebook:</p>

<p style = 'font-size:18px;font-family:Arial'><b>Filters:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li><b>Industry:</b> Finance</li>
    <li><b>Functionality:</b> Machine Learning</li>
    <li><b>Use Case:</b> Fraud Detection</li>
</ul>

<p style = 'font-size:18px;font-family:Arial'><b>Related Resources:</b></p>

<ul style = 'font-size:16px;font-family:Arial'>
    <li><a href='https://www.teradata.com/Blogs/Fraud-Busting-AI'>Fraud-Busting-AI</a></li>
    <li><a href='https://www.teradata.com/Industries/Financial-Services'>Financial Services</a></li>
    <li><a href='https://www.teradata.com/Resources/Datasheets/Move-from-Detection-to-Prevention-and-Outsmart-Fraudsters'>Move from Detection to Prevention and Outsmart Tech-Savvy Fraudsters</a></li>
</ul>

<b style = 'font-size:20px;font-family:Arial'>Dataset:</b>

- `txn_id`: transaction id
- `step`: maps a unit of time in the real world. In this case, 1 step is 1 hour of time. There are 744 steps in total (a 31-day simulation).
- `type`: CASH_IN, CASH_OUT, DEBIT, PAYMENT and TRANSFER
- `amount`: amount of the transaction in local currency
- `nameOrig`: customer who started the transaction
- `oldbalanceOrig`: customer's balance before the transaction
- `newbalanceOrig`: customer's balance after the transaction
- `nameDest`: customer who is the recipient of the transaction
- `oldbalanceDest`: recipient's balance before the transaction
- `newbalanceDest`: recipient's balance after the transaction
- `isFraud`: identifies a fraudulent transaction (1) or a non-fraudulent one (0)
- `isFlaggedFraud`: flags illegal attempts to transfer more than 200,000 in a single transaction

<p style = 'font-size:18px;font-family:Arial'><b>Links:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Uses a dataset and feature discovery methods outlined here: <a href = 'https://www.kaggle.com/georgepothur/4-financial-fraud-detection-xgboost/notebook'>https://www.kaggle.com/georgepothur/4-financial-fraud-detection-xgboost/notebook</a></li>
    <li>Teradata ClearScape Analytics reference: <a href = 'https://docs.teradata.com/search/all?query=Analyze+Your+Data+with+ClearScape+Analytics%25E2%2584%25A2&content-lang=en-US'>here</a></li>
</ul>

<footer style="padding-bottom:35px; background:#f9f9f9; border-bottom:3px solid #00233C">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2024,2025. All Rights Reserved
        </div>
    </div>
</footer>