# Credit Card Analytics

Welcome to our credit card analytics notebook! Here we use Atoti to analyze `5` million credit card sales transactions, joined with a data model of user, loan, and retailer attributes. Let's see what insights we can generate!

> In this Notebook:
>
> * [Install Dependencies](#Install-Dependencies)
> * [Import Libraries](#Import-Libraries)
> * [Load Data into Pandas DataFrame from CSV](#Load-Data-Into-Pandas-DataFrame-from-CSV)
> * [Format Data for Each DataFrame](#Format-Data-for-Each-DataFrame)
> * [Instantiate Atoti Server](#Instantiate-Atoti-Server)
> * [Load Pandas DataFrames as Atoti Table Objects](#Load-Pandas-DataFrames-as-Atoti-Table-Objects)
> * [Join Tables](#Join-Tables)
> * [Analyze and Create New Hierarchies, Levels, and Measures](#Analyze-and-Create-New-Hierarchies,-Levels,-and-Measures)

**💡 Note:** Our credit card datasets come from [Kaggle](https://www.kaggle.com/datasets/ealtman2019/credit-card-transactions/data); feel free to check them out!

## Install Dependencies

In [1]:
# fsspec and s3fs are needed to read CSVs directly from AWS S3
# The --quiet flag reduces installation output
!pip install --quiet fsspec s3fs

## Import Libraries

In [2]:
import atoti as tt
import pandas as pd
import time
from pprint import pprint

## Load Data Into Pandas DataFrame from CSV

* Credit Card Transaction Data
* Credit Card Info Data
* User Data
* Retailer Data
* Loans Data

### Credit Card Transaction Data

In [3]:
# Load credit card transaction data
cc_sales_gzip_df = pd.read_csv(
    "s3://data.atoti.io/notebooks/retail-banking/processed/credit_card_transactions_processed_5MM.csv.gz",
    compression="gzip",
    low_memory=False,
)
cc_sales_gzip_df

Unnamed: 0,User,Card,Year,Month,Day,Time,Amount,Use Chip,Merchant Name,Merchant City,Merchant State,Zip,MCC,Errors?,Is Fraud?
0,50,0,2019,7,4,23:37,4.432,Chip Transaction,Merchant 47076,Beaverton,OR,97007.0,5921,,No
1,50,1,2019,6,28,23:33,3.436,Chip Transaction,Merchant 47076,Beaverton,OR,97007.0,5921,,No
2,792,1,2015,11,28,16:08,1.672,Chip Transaction,Merchant 47076,Yonkers,NY,10703.0,5921,,No
3,1210,1,2007,3,7,22:52,4.076,Swipe Transaction,Merchant 47076,Beaverton,OR,97007.0,5921,,No
4,1575,0,2016,5,23,07:07,2.462,Swipe Transaction,Merchant 47076,Shreveport,LA,71107.0,5921,,No
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4999995,1999,1,2019,12,21,07:59,8.560,Chip Transaction,Merchant 31143,Russellville,AL,35653.0,4121,,No
4999996,1999,1,2019,12,22,08:15,9.344,Chip Transaction,Merchant 31143,Russellville,AL,35653.0,4121,,No
4999997,1999,1,2019,12,22,20:25,9.260,Chip Transaction,Merchant 31143,Russellville,AL,35653.0,4121,,No
4999998,1999,1,2019,12,23,19:48,9.800,Chip Transaction,Merchant 31143,Russellville,AL,35653.0,4121,,No


### Credit Card Info Data

In [4]:
# Load user credit card data
user_cc_df = pd.read_csv(
    "s3://data.atoti.io/notebooks/retail-banking/processed/cards_processed.csv",
    index_col=0,
)
user_cc_df

Unnamed: 0,User,CARD INDEX,Retailer ID,Card Number,Expires,CVV,Has Chip,Cards Issued,Credit Limit,Acct Open Date,Year PIN last Changed,Card on Dark Web
0,0,0,24,4344676511950444,12/2022,623,YES,2,74295,09/2002,2008,No
1,0,1,25,4956965974959986,12/2020,393,YES,2,71968,04/2014,2014,No
2,0,2,26,4582313478255491,02/2024,719,YES,2,96414,07/2003,2004,No
3,0,3,20,4879494103069057,08/2024,693,NO,1,62400,01/2003,2012,No
4,0,4,17,5722874738736011,03/2009,75,YES,1,50028,09/2008,2009,No
...,...,...,...,...,...,...,...,...,...,...,...,...
6141,1997,1,1,300609782832003,01/2024,663,YES,1,56900,11/2000,2013,No
6142,1997,2,20,4718517475996018,01/2021,492,YES,2,55700,04/2012,2012,No
6143,1998,0,7,5929512204765914,08/2020,237,NO,2,59200,02/2012,2012,No
6144,1999,0,11,5589768928167462,01/2020,630,YES,1,78074,01/2020,2020,No


### User Data

In [5]:
# Load user data
users_df = pd.read_csv(
    "data/users_processed.csv"
    # "s3://data.atoti.io/notebooks/retail-banking/processed/users_processed.csv"
)
# users_df = users_df.rename_axis("User").reset_index()
users_df

Unnamed: 0,User,Person,Current Age,Age Range,Retirement Age,Birth Year,Birth Month,Gender,Address,Apartment,...,Longitude,Per Capita Income - Zipcode,Yearly Income - Person,Income Range,Total Debt,FICO Score,EAD,PD12,PDLT,LGD
0,0,Hazel Robinson,53,60,66,1966,11,Female,462 Rose Lane,,...,-117.76,29278,59696,50K - 80K,127613,787,7543.452009,0.102502,0.148166,0.608904
1,1,Nickolas Lopez,21,30,67,1999,2,Male,92196 Tenth Drive,,...,-77.55,44196,90104,80K - 100K,85204,787,7543.452009,0.102502,0.148166,0.608904
2,2,Kallie Rodriguez,39,40,71,1980,7,Female,135 Littlewood Avenue,6.0,...,-117.29,22050,44958,20K - 50K,91549,787,7543.452009,0.102502,0.148166,0.608904
3,3,Rylan Rodriguez,33,40,69,1986,10,Female,928 Bayview Street,,...,-89.46,19635,40029,20K - 50K,0,787,7543.452009,0.102502,0.148166,0.608904
4,4,Sasha Sadr,53,60,68,1966,12,Female,3606 Federal Boulevard,,...,-73.74,37891,77254,50K - 80K,191349,701,8943.997200,0.105292,0.150311,0.602701
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1995,1995,Alessandro Davis,37,40,66,1982,12,Male,550 Forest Street,,...,-112.12,20305,41401,20K - 50K,71180,580,8195.670961,0.103159,0.150773,0.607600
1996,1996,Darren Turner,31,40,63,1988,5,Male,6692 Lake Street,,...,-85.34,21444,43724,20K - 50K,53853,514,8681.589080,0.123131,0.171764,0.611464
1997,1997,August Braun,42,50,72,1977,8,Male,331 Oak Lane,,...,-121.76,28733,58584,50K - 80K,99235,563,9434.481972,0.097110,0.143415,0.605630
1998,1998,Kyng El-Mafouk,51,60,68,1968,10,Male,207 Ocean View Street,,...,-74.42,53790,109673,100K - 150K,242379,505,8936.491473,0.101380,0.149277,0.602884


### Retailer Data

In [6]:
# Load retailer data
cc_info_df = pd.read_csv(
    "data/retailers.csv"
    # "s3://data.atoti.io/notebooks/retail-banking/input/retailers.csv"
)
cc_info_df

Unnamed: 0,Retailer ID,Retailer Name,Card Brand,Card Type,Level 1,Level 2,Level 3,Level 4,Level 5,Industry
0,1,Upromise,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Nonprofit,Education,Education
1,2,AARP,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Nonprofit,Health Care,Health Care
2,3,Barnes & Nobele,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Retail,Consumer Discretionary,Books
3,4,Athleta,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Retail,Consumer Discretionary,Fashion
4,5,Banana Republic,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Retail,Consumer Discretionary,Fashion
5,6,GAP,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Retail,Consumer Discretionary,Fashion
6,7,Old Navy,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Retail,Consumer Discretionary,Fashion
7,8,American Airlines,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Travel,Airline,Airline
8,9,Emirates Airlines,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Travel,Airline,Airline
9,10,Frontier Airlines,Mastercard,Credit,Barclays,Consumer Banking,Cards Business,Travel,Airline,Airline


### Loans Data

In [7]:
# Load loans data
user_loans_df = pd.read_csv(
    "data/loans.csv"
    # "s3://data.atoti.io/notebooks/retail-banking/input/loans.csv"
)
user_loans_df

Unnamed: 0,User,Credit Policty,Loan Purpose,Interest Rate,Installment,DTI,Days with Credit Line,Revolving Bal,Revol_Util,Inq Last 6mos,Delinq 2yrs,Public Record,Not Fully Paid
0,0,1,Home Improvement,0.1166,330.53,10.74,5519.000000,46789,66.7,3,0,0,1
1,0,1,Debt Consolidation,0.1292,213.72,20.15,8520.041667,22247,93.1,1,0,0,0
2,0,1,Small Business,0.1292,504.84,6.08,3240.000000,5071,78.0,1,1,0,1
3,3,1,Debt Consolidation,0.1355,645.24,1.67,5370.041667,9416,95.1,2,0,0,0
4,4,0,Debt Consolidation,0.1671,710.18,20.83,3269.000000,11881,50.8,4,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
1995,1995,0,Home Improvement,0.1418,68.54,10.08,4649.041667,2840,38.4,4,0,0,0
1996,1996,0,All Other,0.1407,34.21,16.27,567.041667,0,85.0,0,0,0,0
1997,1997,0,Debt Consolidation,0.1375,102.17,16.71,1799.000000,9135,76.8,28,0,0,0
1998,1998,0,Debt Consolidation,0.1470,110.46,9.42,449.000000,1670,55.6,1,0,0,0


## Format Data for Each DataFrame

* Credit Card Transactions
* User Credit Card Info
* Users Data
* Retailers Data

In [8]:
cc_sales_gzip_df.dtypes

User                int64
Card                int64
Year                int64
Month               int64
Day                 int64
Time               object
Amount            float64
Use Chip           object
Merchant Name      object
Merchant City      object
Merchant State     object
Zip               float64
MCC                 int64
Errors?            object
Is Fraud?          object
dtype: object
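
The dtypes above show that `User`, `Card`, and `MCC` are inferred as `int64`. As an alternative to casting after the fact, the same columns could be requested as strings at read time through the `dtype` argument of `pd.read_csv`. A minimal sketch on a tiny in-memory CSV (the sample rows are copied from the output above; the column subset and the name `sample_df` are chosen only for illustration):

```python
import io

import pandas as pd

# Three sample rows taken from the transaction output above (subset of columns)
sample_csv = io.StringIO(
    "User,Card,Amount,MCC\n"
    "50,0,4.432,5921\n"
    "50,1,3.436,5921\n"
    "792,1,1.672,5921\n"
)

# Ask pandas to keep the intended hierarchy columns as strings at read time
sample_df = pd.read_csv(sample_csv, dtype={"User": str, "Card": str, "MCC": str})

print(sample_df.dtypes)
```

Whether this is preferable to explicit `astype(str)` calls is a matter of taste; the explicit casts keep the formatting step visible in its own cell.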

### Formatting Credit Card Transactions

In [9]:
# Cast intended hierarchies as strings
cc_sales_gzip_df["User"] = cc_sales_gzip_df["User"].astype(str)
cc_sales_gzip_df["Card"] = cc_sales_gzip_df["Card"].astype(str)
cc_sales_gzip_df["Year"] = cc_sales_gzip_df["Year"].astype(str)
cc_sales_gzip_df["Month"] = cc_sales_gzip_df["Month"].astype(str)
cc_sales_gzip_df["Day"] = cc_sales_gzip_df["Day"].astype(str)
cc_sales_gzip_df["Merchant Name"] = cc_sales_gzip_df["Merchant Name"].astype(str)
cc_sales_gzip_df["Zip"] = cc_sales_gzip_df["Zip"].astype(str)
cc_sales_gzip_df["MCC"] = cc_sales_gzip_df["MCC"].astype(str)

# Combine the Year, Month, Day, and Time columns
# into a single `Datetime` column with a proper datetime dtype
cc_sales_gzip_df.insert(2, "Datetime", "")
cc_sales_gzip_df["Datetime"] = pd.to_datetime(
    cc_sales_gzip_df["Year"]
    + " "
    + cc_sales_gzip_df["Month"]
    + " "
    + cc_sales_gzip_df["Day"]
    + " "
    + cc_sales_gzip_df["Time"]
)
cc_sales_gzip_df.drop(columns=["Year", "Month", "Day", "Time"], inplace=True)
cc_sales_gzip_df

Unnamed: 0,User,Card,Datetime,Amount,Use Chip,Merchant Name,Merchant City,Merchant State,Zip,MCC,Errors?,Is Fraud?
0,50,0,2019-07-04 23:37:00,4.432,Chip Transaction,Merchant 47076,Beaverton,OR,97007.0,5921,,No
1,50,1,2019-06-28 23:33:00,3.436,Chip Transaction,Merchant 47076,Beaverton,OR,97007.0,5921,,No
2,792,1,2015-11-28 16:08:00,1.672,Chip Transaction,Merchant 47076,Yonkers,NY,10703.0,5921,,No
3,1210,1,2007-03-07 22:52:00,4.076,Swipe Transaction,Merchant 47076,Beaverton,OR,97007.0,5921,,No
4,1575,0,2016-05-23 07:07:00,2.462,Swipe Transaction,Merchant 47076,Shreveport,LA,71107.0,5921,,No
...,...,...,...,...,...,...,...,...,...,...,...,...
4999995,1999,1,2019-12-21 07:59:00,8.560,Chip Transaction,Merchant 31143,Russellville,AL,35653.0,4121,,No
4999996,1999,1,2019-12-22 08:15:00,9.344,Chip Transaction,Merchant 31143,Russellville,AL,35653.0,4121,,No
4999997,1999,1,2019-12-22 20:25:00,9.260,Chip Transaction,Merchant 31143,Russellville,AL,35653.0,4121,,No
4999998,1999,1,2019-12-23 19:48:00,9.800,Chip Transaction,Merchant 31143,Russellville,AL,35653.0,4121,,No
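
The `Datetime` column above relies on pandas inferring the format of strings such as `2019 7 4 23:37`. One alternative that avoids format inference is to assemble the date from the numeric `Year`, `Month`, and `Day` columns and add the `Time` column as a timedelta. The sketch below assumes those columns have not yet been cast to strings; the frame name `sample_df` is hypothetical.

```python
import pandas as pd

sample_df = pd.DataFrame(
    {
        "Year": [2019, 2015],
        "Month": [7, 11],
        "Day": [4, 28],
        "Time": ["23:37", "16:08"],
    }
)

# to_datetime can assemble dates from year/month/day columns
dates = pd.to_datetime(sample_df[["Year", "Month", "Day"]])

# "HH:MM" becomes "HH:MM:00" so that to_timedelta reads it as hours and minutes
times = pd.to_timedelta(sample_df["Time"] + ":00")

sample_df["Datetime"] = dates + times
print(sample_df["Datetime"])
```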


### Formatting Credit Card Info Data

In [10]:
# Rename the `CARD INDEX` column to `Card` so it matches the `Card` join column in cc_sales_gzip_df
user_cc_df.rename(columns={"CARD INDEX": "Card"}, inplace=True)

# Cast intended hierarchies as strings
user_cc_df["User"] = user_cc_df["User"].astype(str)
user_cc_df["Card"] = user_cc_df["Card"].astype(str)
user_cc_df["Retailer ID"] = user_cc_df["Retailer ID"].astype(str)
user_cc_df["Card Number"] = user_cc_df["Card Number"].astype(str)
user_cc_df["CVV"] = user_cc_df["CVV"].astype(str)
user_cc_df

Unnamed: 0,User,Card,Retailer ID,Card Number,Expires,CVV,Has Chip,Cards Issued,Credit Limit,Acct Open Date,Year PIN last Changed,Card on Dark Web
0,0,0,24,4344676511950444,12/2022,623,YES,2,74295,09/2002,2008,No
1,0,1,25,4956965974959986,12/2020,393,YES,2,71968,04/2014,2014,No
2,0,2,26,4582313478255491,02/2024,719,YES,2,96414,07/2003,2004,No
3,0,3,20,4879494103069057,08/2024,693,NO,1,62400,01/2003,2012,No
4,0,4,17,5722874738736011,03/2009,75,YES,1,50028,09/2008,2009,No
...,...,...,...,...,...,...,...,...,...,...,...,...
6141,1997,1,1,300609782832003,01/2024,663,YES,1,56900,11/2000,2013,No
6142,1997,2,20,4718517475996018,01/2021,492,YES,2,55700,04/2012,2012,No
6143,1998,0,7,5929512204765914,08/2020,237,NO,2,59200,02/2012,2012,No
6144,1999,0,11,5589768928167462,01/2020,630,YES,1,78074,01/2020,2020,No


### Formatting Users Data

In [11]:
# Remove $ symbols for currency related numerical values
# users_df['Per Capita Income - Zipcode'] = users_df['Per Capita Income - Zipcode'].str.replace('$', '')
# users_df['Yearly Income - Person'] = users_df['Yearly Income - Person'].str.replace('$', '')
# users_df['Total Debt'] = users_df['Total Debt'].str.replace('$', '')

# Cast intended hierarchies as strings or numerical data types
users_df["User"] = users_df["User"].astype(str)
users_df["Birth Month"] = users_df["Birth Month"].astype(str)
users_df["Zipcode"] = users_df["Zipcode"].astype(str)
users_df["Per Capita Income - Zipcode"] = users_df[
    "Per Capita Income - Zipcode"
].astype(int)
users_df["Yearly Income - Person"] = users_df["Yearly Income - Person"].astype(int)
users_df["Total Debt"] = users_df["Total Debt"].astype(int)
users_df["FICO Score"] = users_df["FICO Score"].astype(str)
users_df["Current Age"] = users_df["Current Age"].astype(str)
users_df["Age Range"] = users_df["Age Range"].astype(str)
users_df["Retirement Age"] = users_df["Retirement Age"].astype(str)
# users_df
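
The commented-out lines above would only be needed if the income and debt columns still carried `$` prefixes, which is not the case in the processed output shown earlier. If they did, note that `$` is also a regular-expression anchor, so passing `regex=False` makes the literal intent explicit across pandas versions. A minimal sketch with made-up raw values (the frame `raw_df` is hypothetical):

```python
import pandas as pd

# Hypothetical raw values that still carry a currency symbol
raw_df = pd.DataFrame({"Total Debt": ["$127613", "$85204", "$0"]})

# regex=False treats "$" as a literal character rather than an end-of-string anchor
raw_df["Total Debt"] = (
    raw_df["Total Debt"].str.replace("$", "", regex=False).astype(int)
)
print(raw_df["Total Debt"].tolist())
```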

### Formatting Retailers Data

In [12]:
# Cast intended hierarchies as strings
cc_info_df["Retailer ID"] = cc_info_df["Retailer ID"].astype(str)

## Instantiate Atoti Server

In [13]:
# Start an Atoti Server instance
session = tt.Session(
    user_content_storage="./content",
    port=9092,
    java_options=["-Xms1G", "-Xmx20G"],
)
session.link

http://localhost:9092

_Note_: This is the session's local URL: it may not be reachable if Atoti is running on another machine.

## Load Pandas DataFrames as Atoti Table Objects

In [14]:
# Create Atoti tables implicitly with `Session.read_pandas()`
cc_sales_table = session.read_pandas(
    cc_sales_gzip_df,
    table_name="Sales Transactions",
)

users_table = session.read_pandas(
    users_df,
    table_name="Users",
)

user_cc_table = session.read_pandas(
    user_cc_df,
    table_name="User Credit Cards",
)

cc_info_table = session.read_pandas(
    cc_info_df,
    table_name="Credit Card Info",
)

fico_table = session.read_csv(
    "data/FICO.csv",
    #    keys=['FICO Score'],
    types={"FICO Score": tt.type.STRING},
    table_name="FICO",
)

user_loans_table = session.read_csv(
    "data/loans.csv",
    types={
        "User": tt.type.STRING,
        "Inq Last 6mos": tt.type.STRING,
        "Delinq 2yrs": tt.type.STRING,
        "Public Record": tt.type.STRING,
        "Not Fully Paid": tt.type.STRING,
    },
    table_name="loans",
)

## Join Tables

In [15]:
# Join tables
cc_sales_table.join(
    user_cc_table,
    (cc_sales_table["User"] == user_cc_table["User"])
    & (cc_sales_table["Card"] == user_cc_table["Card"]),
)
cc_sales_table.join(users_table, cc_sales_table["User"] == users_table["User"])
users_table.join(fico_table, users_table["FICO Score"] == fico_table["FICO Score"])
user_cc_table.join(
    cc_info_table, user_cc_table["Retailer ID"] == cc_info_table["Retailer ID"]
)
users_table.join(user_loans_table, users_table["User"] == user_loans_table["User"])

In [16]:
# Create Cube from Atoti Table object
cube = session.create_cube(cc_sales_table)

In [17]:
# View the schema
session.tables.schema

```mermaid
erDiagram
  "Sales Transactions" {
    _ String "User"
    _ String "Card"
    _ LocalDateTime "Datetime"
    nullable double "Amount"
    _ String "Use Chip"
    _ String "Merchant Name"
    _ String "Merchant City"
    _ String "Merchant State"
    _ String "Zip"
    _ String "MCC"
    _ String "Errors?"
    _ String "Is Fraud?"
  }
  "User Credit Cards" {
    _ String "User"
    _ String "Card"
    _ String "Retailer ID"
    _ String "Card Number"
    _ String "Expires"
    _ String "CVV"
    _ String "Has Chip"
    nullable long "Cards Issued"
    nullable long "Credit Limit"
    _ String "Acct Open Date"
    nullable long "Year PIN last Changed"
    _ String "Card on Dark Web"
  }
  "loans" {
    _ String "User"
    nullable int "Credit Policty"
    _ String "Loan Purpose"
    nullable double "Interest Rate"
    nullable double "Installment"
    nullable double "DTI"
    nullable double "Days with Credit Line"
    nullable int "Revolving Bal"
    nullable double "Revol_Util"
    _ String "Inq Last 6mos"
    _ String "Delinq 2yrs"
    _ String "Public Record"
    _ String "Not Fully Paid"
  }
  "Users" {
    _ String "User"
    _ String "Person"
    _ String "Current Age"
    _ String "Age Range"
    _ String "Retirement Age"
    nullable long "Birth Year"
    _ String "Birth Month"
    _ String "Gender"
    _ String "Address"
    nullable double "Apartment"
    _ String "City"
    _ String "State"
    _ String "Zipcode"
    nullable double "Latitude"
    nullable double "Longitude"
    nullable long "Per Capita Income - Zipcode"
    nullable long "Yearly Income - Person"
    _ String "Income Range"
    nullable long "Total Debt"
    _ String "FICO Score"
    nullable double "EAD"
    nullable double "PD12"
    nullable double "PDLT"
    nullable double "LGD"
  }
  "Credit Card Info" {
    _ String "Retailer ID"
    _ String "Retailer Name"
    _ String "Card Brand"
    _ String "Card Type"
    _ String "Level 1"
    _ String "Level 2"
    _ String "Level 3"
    _ String "Level 4"
    _ String "Level 5"
    _ String "Industry"
  }
  "FICO" {
    _ String "FICO Score"
    _ String "FICO Level"
    _ String "FICO Range"
  }
  "Sales Transactions" }o--o| "User Credit Cards" : "(`Card` == `Card`) & (`User` == `User`)"
  "Sales Transactions" }o--o| "Users" : "`User` == `User`"
  "User Credit Cards" }o--o| "Credit Card Info" : "`Retailer ID` == `Retailer ID`"
  "Users" }o--o| "loans" : "`User` == `User`"
  "Users" }o--o| "FICO" : "`FICO Score` == `FICO Score`"
```
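
The schema draws each join as many rows on the left pointing to at most one row on the right, for example many transactions per (`User`, `Card`) pair. A rough pandas analogue for sanity-checking the keys before loading is `DataFrame.merge` with `validate="many_to_one"`, which raises if the right-hand keys are not unique. This is only an illustration that assumes lookup-style join semantics; it is not a description of how Atoti resolves joins internally, and the tiny frames are stand-ins.

```python
import pandas as pd

# Tiny stand-ins for the transaction and card tables (values are illustrative)
sales = pd.DataFrame(
    {"User": ["0", "0", "1"], "Card": ["0", "0", "1"], "Amount": [4.0, 3.0, 2.0]}
)
cards = pd.DataFrame(
    {"User": ["0", "1"], "Card": ["0", "1"], "Credit Limit": [74295, 71968]}
)

# validate="many_to_one" raises MergeError if (User, Card) repeats in `cards`
joined = sales.merge(cards, on=["User", "Card"], how="left", validate="many_to_one")
print(joined)
```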


## Analyze and Create New Hierarchies, Levels, and Measures

In [18]:
# Set variables for hierarchies, levels, and measures
h, l, m = cube.hierarchies, cube.levels, cube.measures

In [19]:
# Create a multi-level date hierarchy
cube.create_date_hierarchy(
    "Transaction Date",
    column=cc_sales_table["Datetime"],
    levels={
        "Year": "yyyy",
        "Quarter": "QQQ",
        "Month": "MMM",
        "Day": "dd",
        "Hour": "HH",
    },
)
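
The level patterns above (`yyyy`, `QQQ`, `MMM`, `dd`, `HH`) look like Java-style date patterns. As a rough illustration of the bucket each level produces for one timestamp, here is a standard-library Python sketch. The helper `date_levels` is hypothetical, and the exact labels Atoti renders (for instance `Q3` for the quarter) are an assumption.

```python
from datetime import datetime


def date_levels(ts: datetime) -> dict:
    """Return illustrative Year/Quarter/Month/Day/Hour labels for a timestamp."""
    quarter = (ts.month - 1) // 3 + 1
    return {
        "Year": ts.strftime("%Y"),
        "Quarter": f"Q{quarter}",
        "Month": ts.strftime("%b"),  # abbreviated month name, locale-dependent
        "Day": ts.strftime("%d"),
        "Hour": ts.strftime("%H"),
    }


levels = date_levels(datetime(2019, 7, 4, 23, 37))
print(levels)
```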

In [20]:
h["Retailer Levels"] = [
    l["Level 1"],
    l["Level 2"],
    l["Level 3"],
    l["Level 4"],
    l["Level 5"],
]
del h["Level 1"]
del h["Level 2"]
del h["Level 3"]
del h["Level 4"]
del h["Level 5"]

In [21]:
l

In [22]:
# Create measures from numerical columns of the joined tables
m["Total Debt"] = tt.agg.sum(
    tt.agg.single_value(users_table["Total Debt"]),
    scope=tt.OriginScope(l[("Sales Transactions", "User", "User")]),
)
m["Income Annual"] = tt.agg.single_value(users_table["Yearly Income - Person"])
m["Num Credit Cards"] = tt.agg.count_distinct(user_cc_table["Card Number"])

In [23]:
m["Amount"] = tt.agg.sum(cc_sales_table["Amount"])
m["Credit_Limit"] = tt.agg.single_value(user_cc_table["Credit Limit"])
m["Credit Limit"] = tt.agg.sum(
    m["Credit_Limit"], scope=tt.OriginScope(l["Card Number"])
)
m["Utilization"] = m["Amount"] / m["Credit Limit"]
m["Utilization"].formatter = "DOUBLE[0.00%]"
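
The `Credit Limit` measure takes a single value per card and then sums across cards, rather than summing the limit over every transaction row. A small pandas sketch of why that matters, using made-up numbers: summing the joined column naively counts a card's limit once per transaction, whereas de-duplicating per `Card Number` first gives the intended total.

```python
import pandas as pd

# Joined view: one row per transaction, with the card's limit repeated on each row
joined = pd.DataFrame(
    {
        "Card Number": ["A", "A", "A", "B"],
        "Amount": [100.0, 50.0, 50.0, 300.0],
        "Credit Limit": [1000, 1000, 1000, 4000],
    }
)

naive_limit = joined["Credit Limit"].sum()  # counts card A three times

# One limit per card, then summed across cards
limit_per_card = joined.groupby("Card Number")["Credit Limit"].first()
total_limit = limit_per_card.sum()

utilization = joined["Amount"].sum() / total_limit
print(naive_limit, total_limit, f"{utilization:.2%}")
```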

In [24]:
m["PD12"] = tt.agg.mean(
    tt.agg.single_value(users_table["PD12"]),
    scope=tt.OriginScope(l[("Sales Transactions", "User", "User")]),
)
m["PD12"].formatter = "DOUBLE[0.00%]"
m["PDLT"] = tt.agg.mean(
    tt.agg.single_value(users_table["PDLT"]),
    scope=tt.OriginScope(l[("Sales Transactions", "User", "User")]),
)
m["PDLT"].formatter = "DOUBLE[0.00%]"
m["LGD"] = tt.agg.mean(
    tt.agg.single_value(users_table["LGD"]),
    scope=tt.OriginScope(l[("Sales Transactions", "User", "User")]),
)
m["LGD"].formatter = "DOUBLE[0.00%]"
m["EAD"] = m["Amount"]
m["ECL"] = tt.agg.sum_product(
    m["Amount"],
    m["LGD"],
    m["PD12"],
    scope=tt.OriginScope(l[("Sales Transactions", "User", "User")]),
)
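
The `ECL` measure above multiplies `Amount` (used as `EAD`), `LGD`, and `PD12` per user and sums the products. A worked sketch in plain Python: the `PD12` and `LGD` values come from the users output shown earlier, while the per-user amounts are invented for illustration.

```python
# (amount, LGD, PD12) per user; the amounts are hypothetical
users = {
    "0": (1000.0, 0.608904, 0.102502),
    "4": (2000.0, 0.602701, 0.105292),
}

# Expected credit loss per user: EAD * LGD * PD
ecl_per_user = {
    user: amount * lgd * pd12 for user, (amount, lgd, pd12) in users.items()
}

# Portfolio-level ECL is the sum of the per-user products
total_ecl = sum(ecl_per_user.values())
print(ecl_per_user, total_ecl)
```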

In [25]:
m["PD12"].folder = "Credit Risk"
m["PDLT"].folder = "Credit Risk"
m["LGD"].folder = "Credit Risk"
m["EAD"].folder = "Credit Risk"
m["ECL"].folder = "Credit Risk"

In [26]:
m["Revolving Balance"] = tt.agg.sum(
    tt.agg.single_value(user_loans_table["Revolving Bal"]),
    scope=tt.OriginScope(l["User"]),
)
m["Interest Rate"] = tt.agg.single_value(user_loans_table["Interest Rate"])
m["Days with Credit Line"] = tt.agg.single_value(
    user_loans_table["Days with Credit Line"]
)
m["Revolving Utilization"] = tt.agg.sum(
    tt.agg.single_value(user_loans_table["Revol_Util"]),
    scope=tt.OriginScope(l[("Sales Transactions", "User", "User")]),
)

In [27]:
l["Income Range"].order = tt.CustomOrder(
    first_elements=[
        "0K - 20K",
        "20K - 50K",
        "50K - 80K",
        "80K - 100K",
        "100K - 150K",
        "150K - 200K",
        "200K+",
    ]
)
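
A custom order is needed here because the range labels are strings, and plain alphabetical sorting would place `100K - 150K` before `20K - 50K`. A minimal Python sketch of the difference, sorting by position in an explicit list (the `labels` sample is arbitrary):

```python
income_order = [
    "0K - 20K", "20K - 50K", "50K - 80K", "80K - 100K",
    "100K - 150K", "150K - 200K", "200K+",
]
labels = ["80K - 100K", "100K - 150K", "20K - 50K", "200K+"]

# Default string sorting compares character by character
alphabetical = sorted(labels)

# Sorting by index in the explicit list restores the numeric progression
custom = sorted(labels, key=income_order.index)

print(alphabetical)
print(custom)
```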

In [28]:
l["FICO Level"].order = tt.CustomOrder(
    first_elements=["Poor", "Fair", "Good", "Very Good", "Exceptional"]
)