In [1]:
import numpy as np
import pandas as pd
import math

## Simulated Transactional Data

Next, we import the simulated transactional data, which records 400 transactions over a 12-month period. These data have the following fields:

1. DATE: Date of the transaction
2. COUNTRY BANK ORIGIN: Country of the originating bank
3. RISK COUNTRY BANK ORIGIN: Risk level of the originating bank's country
4. COUNTRY BANK DESTINATION: Country of the receiving bank
5. RISK COUNTRY BANK DESTINATION: Risk level of the receiving bank's country
6. RISK COUNTRY: The highest risk among the countries involved in the transaction
7. OCCUPATION: Occupation of the customer involved in the transaction (Businessman, Employee, Housewife, Student)
8. OCCUPATION RISK: Risk level of the occupation
9. CUSTOMER ID: Customer identifier
10. AMOUNT: Transaction amount (loaded from the sheet as AMOUNT2)
11. CUSTOMER RISK: Risk level of the customer (1 = low, 2 = medium, 3 = high)
12. TRANSACTION TYPE: Type of transaction (debit or credit), recorded in the data as Entrante (incoming) or Saliente (outgoing)
13. CAT: Label indicating whether a transaction was unusual (1 = unusual, 0 = usual)

In [2]:
df1 = pd.read_excel('TXN_DATA.xlsx', sheet_name="Data")

In [3]:
df1.head()

Unnamed: 0,DATE,COUNTRY BANK ORIGIN,RISK COUNTRY BANK ORIGIN,COUNTRY BANK DESTINATION,RISK COUNTRY BANK DESTINATION,RISK COUNTRY,OCCUPATION,OCCUPATION RISK,CUSTOMER ID,AMOUNT2,CUSTOMER RISK,TRANSACTION TYPE,CAT
0,2022-01-02,Thailand,Medium,The Democratic Republic Of The Congo,High,3,Housewife,1,27,11.543591,1,Entrante,0
1,2022-01-03,Thailand,Medium,Australia,Low,2,Businessman,3,77,2692.271638,1,Entrante,0
2,2022-01-03,Andorra,Low,Bhután,Low,1,Student,1,37,225.207654,1,Entrante,0
3,2022-01-05,Cuba,High,United States,Low,3,Employee,1,12,46.812418,1,Saliente,0
4,2022-01-06,Norway,Low,Andorra,Low,1,Businessman,3,70,787.839259,1,Entrante,0


## AHP Pairwise Matrix
The pairwise comparison matrix below was built using Saaty's scale:

### Saaty's Comparison Scale
| VALUE | MEANING                 |
|-------|-------------------------|
| 1     | Equal importance        |
| 3     | Moderately important    |
| 5     | Strongly important      |
| 7     | Very strongly important |
| 9     | Extremely important     |

We apply this scale to the factors:

1. BANK COUNTRY
2. OCCUPATION
3. CUSTOMER RISK

|   FACTORS     | BANK COUNTRY  | OCCUPATION | CUSTOMER RISK | 
|---------------|---------------|------------|---------------|
| BANK COUNTRY  | 1.000         | 3.000      | 7.000         | 
| OCCUPATION    | 0.333         | 1.000      | 3.000         |
| CUSTOMER RISK | 0.143         | 0.333      | 1.000         | 

This matrix has a consistency ratio of CR = 0.05. Because this is below the conventional threshold of 0.10, we treat the judgments as consistent and proceed to calculate the weights, which let us convert the qualitative aspects of each transaction into a numerical variable.
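
As an illustration of how priority weights and a consistency ratio can be derived from a pairwise matrix, here is a minimal sketch using the principal-eigenvector method. It assumes the standard random index RI = 0.58 for a 3×3 matrix; the notebook does not say which approximation it used, so the values this sketch produces may differ slightly from the reported figures because of method or rounding differences.

```python
import numpy as np

# Pairwise comparison matrix from the table above
# (rows/columns: BANK COUNTRY, OCCUPATION, CUSTOMER RISK).
A = np.array([
    [1.0, 3.0, 7.0],
    [1 / 3, 1.0, 3.0],
    [1 / 7, 1 / 3, 1.0],
])

# Priority vector: principal eigenvector of A, normalized to sum to 1.
eigenvalues, eigenvectors = np.linalg.eig(A)
k = np.argmax(eigenvalues.real)
lambda_max = eigenvalues[k].real
weights = np.abs(eigenvectors[:, k].real)
weights = weights / weights.sum()

# Consistency index and ratio; 0.58 is Saaty's random index for n = 3.
n = A.shape[0]
consistency_index = (lambda_max - n) / (n - 1)
random_index = 0.58
consistency_ratio = consistency_index / random_index

print(np.round(weights, 2), round(consistency_ratio, 3))
```

A ratio below 0.10 is the usual criterion for accepting the pairwise judgments.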

## Weightings of the factors

|   FACTORS     |    WEIGHT     | 
|---------------|---------------|
| BANK COUNTRY  |     0.67      | 
| OCCUPATION    |     0.25      | 
| CUSTOMER RISK |     0.09      |

Using these weights, the risk factors are quantified to produce the **QPuntuation** field in the data matrix used for unusual-transaction detection. The data matrix also includes the following variables:

1. **TxJurisdiction**: An indicator that compares the transaction amount with the transactionality of the bank's country according to its risk.
2. **TxCustomer**: An indicator that compares the transaction amount with the transactionality of the customer.
3. **TxRisk**: An indicator that compares the transaction amount with the transactionality of customers according to their risk.
4. **TxOccupation**: An indicator that compares the transaction amount with the transactionality of customers according to their occupation.
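
The notebook loads these variables precomputed and does not state their formulas. The sketch below shows one plausible construction on toy data, under two explicit assumptions: the score is a weighted sum of the three risk levels, and each indicator is the transaction amount divided by the mean amount of its group. The toy values and the `toy` DataFrame are hypothetical.

```python
import pandas as pd

# Toy data with hypothetical values; column names follow the field list above.
toy = pd.DataFrame({
    "CUSTOMER ID": [27, 27, 12, 12],
    "RISK COUNTRY": [3, 1, 3, 2],
    "OCCUPATION RISK": [1, 1, 1, 1],
    "CUSTOMER RISK": [1, 1, 1, 1],
    "AMOUNT": [10.0, 30.0, 50.0, 150.0],
})

# AHP weights from the weightings table.
weights = {"RISK COUNTRY": 0.67, "OCCUPATION RISK": 0.25, "CUSTOMER RISK": 0.09}

# Assumption: the score is a weighted sum of the three risk levels.
toy["QPuntuation"] = sum(toy[col] * w for col, w in weights.items())

# Assumption: the indicator is the amount relative to the customer's mean amount.
customer_mean = toy.groupby("CUSTOMER ID")["AMOUNT"].transform("mean")
toy["TxCustomer"] = toy["AMOUNT"] / customer_mean

print(toy[["CUSTOMER ID", "QPuntuation", "TxCustomer"]])
```

The same `groupby(...).transform("mean")` pattern would apply to the other indicators by grouping on country risk, customer risk, or occupation instead of the customer.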

In [4]:
df = pd.read_excel('TXN_DATA.xlsx', sheet_name='PCAMatrix')

In [5]:
df.head()

Unnamed: 0,CUSTOMER ID,QPuntuation,TxJurisdiction,TxCustomer,TxRisk,TxOccupation,CAT
0,27,2.35,0.002674,0.007524,0.00236,0.002763,0
1,77,1.86,0.353034,1.569971,0.550359,0.262313,0
2,37,1.1,0.100259,2.988329,0.046037,0.051718,0
3,12,2.35,0.010844,0.033598,0.009569,0.016726,0
4,70,1.01,0.350735,2.79475,0.161051,0.076761,0


For the purposes of the project, we keep the label variable "CAT" in the matrix so that, after applying PCA, we can validate the predicted unusual transactions and thus assess how effectively the method detects them.

Next, we separate the quantitative variables that we will use for classification.

In [6]:
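features = ["QPuntuation", "TxJurisdiction", "TxCustomer", "TxRisk", "TxOccupation"]  # selected by name, not position
X, y = df[features].values, df["CAT"].values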

In [7]:
mu = np.mean(X, axis=0)
sigma = np.std(X, axis=0)
X_std = (X - mu)/sigma
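
The standardization above can be sanity-checked on a small example. The sketch below uses a hypothetical matrix `X_demo` and adds a guard for constant columns, which the notebook's version does not include and which would otherwise cause a division by zero.

```python
import numpy as np

# Hypothetical feature matrix: 4 observations, 2 features.
X_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])

mu_demo = X_demo.mean(axis=0)
sigma_demo = X_demo.std(axis=0)  # ddof=0 by default (population standard deviation)

# Guard against division by zero for constant columns.
sigma_safe = np.where(sigma_demo == 0, 1.0, sigma_demo)
X_demo_std = (X_demo - mu_demo) / sigma_safe

# After standardization each column has mean 0 and standard deviation 1.
print(X_demo_std.mean(axis=0), X_demo_std.std(axis=0))
```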

### Generating Cairo files

In [8]:
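def decimal_to_fp16x16(num):
    """Convert a number to a signed FP16x16 integer (16 integer bits, 16 fractional bits)."""
    whole_num = int(num)  # integer part, truncated toward zero
    fractional_part = int((num - whole_num) * 65536)  # fraction scaled by 2**16
    fp_number = (whole_num << 16) + fractional_part
    return fp_number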
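
To make the fixed-point encoding concrete, the following sketch repeats the same arithmetic under the name `to_fp16x16` so that it runs on its own, and adds a hypothetical inverse, `from_fp16x16`, that is not part of the notebook. Because the fractional part is truncated, the round trip loses less than 2^-16 per value.

```python
def to_fp16x16(num):
    """Same arithmetic as decimal_to_fp16x16, repeated so this sketch is self-contained."""
    whole = int(num)
    frac = int((num - whole) * 65536)
    return (whole << 16) + frac


def from_fp16x16(fp):
    """Hypothetical inverse: map the scaled integer back to a float."""
    return fp / 65536


examples = [1.5, -0.25, 3.14159]
encoded = [to_fp16x16(v) for v in examples]    # 1.5 -> 98304, -0.25 -> -16384
decoded = [from_fp16x16(e) for e in encoded]

for value, fp, back in zip(examples, encoded, decoded):
    print(value, fp, back)
```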

In [9]:
import os

In [10]:
current_directory = os.getcwd()
parent_directory = os.path.dirname(current_directory)
new_directory_path = os.path.join(parent_directory, "src/generated")

In [11]:
os.makedirs('src/generated', exist_ok=True)

In [12]:
tensor_name = ["X","X_std","y"]  

def generate_cairo_files(data, name):

    with open(os.path.join('src', 'generated', f"{name}.cairo"), "w") as f:
        f.write(
            "use core::array::{ArrayTrait, SpanTrait};\n" +
            "use orion::operators::tensor::{core::{Tensor, TensorTrait}};\n" +
            "use orion::operators::tensor::FP16x16Tensor;\n" +
            "use orion::numbers::fixed_point::implementations::fp16x16::core::{FP16x16, FixedTrait};\n" +
            "\n" + f"fn {name}() -> Tensor<FP16x16>" + "{\n\n" + 
            "let mut shape = ArrayTrait::new();\n"
        )
        for dim in data.shape:
            f.write(f"shape.append({dim});\n")
    
        f.write("let mut data = ArrayTrait::new();\n")
        for val in np.nditer(data.flatten()):
            f.write(f"data.append(FixedTrait::new({abs(int(decimal_to_fp16x16(val)))}, {str(val < 0).lower()}));\n")
        f.write(
            "let tensor = TensorTrait::<FP16x16>::new(shape.span(), data.span());\n" +
            "return tensor;\n}"
        )

# Declare each generated tensor file as a submodule of `generated`.
with open(os.path.join("src", "generated.cairo"), "w") as f:
    for n in tensor_name:
        f.write(f"mod {n};\n")

generate_cairo_files(X, "X")
generate_cairo_files(X_std, "X_std")
generate_cairo_files(y, "y")
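
One way to check the generated files is to parse the `FixedTrait::new(magnitude, is_negative)` entries back into floats and compare them with the source arrays. The sketch below demonstrates the idea on single lines; `fp_line` and `parse_line` are hypothetical helpers, and `fp_line` mirrors the format string used in `generate_cairo_files`.

```python
import re


def fp_line(val):
    """Format one value the way generate_cairo_files does (magnitude plus sign flag)."""
    whole = int(val)
    fp = (whole << 16) + int((val - whole) * 65536)
    return f"data.append(FixedTrait::new({abs(fp)}, {str(val < 0).lower()}));"


pattern = re.compile(r"FixedTrait::new\((\d+), (true|false)\)")


def parse_line(line):
    """Recover the float encoded in one generated line."""
    magnitude, negative = pattern.search(line).groups()
    value = int(magnitude) / 65536
    return -value if negative == "true" else value


line = fp_line(-1.5)
print(line)              # data.append(FixedTrait::new(98304, true));
print(parse_line(line))  # -1.5
```

Applied to every `data.append` line of a generated file, the parsed values should match the original array to within 2^-16.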