# Data Model Overview for Banking/Finance Industry

- **Customer Dimension**: 
  - **Description**: Contains detailed customer profile information, including region, segment, language, and classification level.
  - **Key Attributes**: `Customer_Id`, `Master_Bank_Customer_Id`, `City_Name`, `Region`, `Customer_Segment`, `Customer_Class`, `Preferred_Language_Code`.

- **Account Dimension**: 
  - **Description**: Stores account-specific attributes such as type, product description, category, interest rate, and account tier level.
  - **Key Attributes**: `Account_Id`, `Account_Type`, `Product_Description`, `Product_Category`, `Interest_Rate`, `Account_Tier`.

- **Campaign Dimension**: 
  - **Description**: Defines marketing campaigns targeted at customers, including their start and end dates and campaign type.
  - **Key Attributes**: `Campaign_Id`, `Campaign_Name`, `Campaign_Type`, `Campaign_Start_Date`, `Campaign_End_Date`.

- **Account Facts**:
  - **Description**: Fact table capturing each account’s transactional and usage details, along with flags indicating specific account features.
  - **Key Attributes**: `Account_Id`, `Customer_Id`, `Account_Status`, `Opening_Date`, `Overdraft_Flag`, `Debit_Card_Flag`, `Average_Balance`, `Total_Transactions`.

- **Campaign Facts**:
  - **Description**: Fact table recording customer interactions with campaigns, including flags for promotional interactions, interaction type, and eligibility.
  - **Key Attributes**: `Campaign_Id`, `Customer_Id`, `Account_Id`, `Promo_Flag`, `Interaction_Date`, `Interaction_Type`, `Eligible_Flag`.

- **Denormalized Customer Table**:
  - **Description**: Consolidated view containing complete information on customer profiles, account attributes, and campaign interactions. Designed for simplified analysis and reporting.
  - **Key Attributes**: Combination of key attributes from the above dimensions and fact tables, such as `Customer_Id`, `Account_Id`, `Campaign_Id`, and various interaction and account feature flags.


# The DDLs

```sql
-- Banking/Finance Industry Model for Teradata

-- Customer Dimension Table: Stores primary customer information
CREATE MULTISET TABLE Customer_Dim (
    Customer_Id BIGINT NOT NULL,
    Master_Bank_Customer_Id BIGINT,
    Postal_Code VARCHAR(10),
    City_Name VARCHAR(50),
    Region VARCHAR(20),
    Customer_Segment VARCHAR(20), -- e.g., Retail, Corporate
    Customer_Class CHAR(1),       -- single-character class code (e.g., for Premium, Standard)
    Preferred_Language_Code CHAR(2),
    PRIMARY KEY (Customer_Id)
) PRIMARY INDEX (Customer_Id);

-- Account Dimension Table: Holds account types and associated details
CREATE MULTISET TABLE Account_Dim (
    Account_Id BIGINT NOT NULL,
    Account_Type VARCHAR(50),    -- e.g., Savings, Checking
    Product_Description VARCHAR(100),
    Product_Category VARCHAR(100), 
    Interest_Rate DECIMAL(5,2),  -- Interest rate applicable to account type
    Account_Tier CHAR(1),        -- single-character tier code (e.g., for Basic, Premium)
    PRIMARY KEY (Account_Id)
) PRIMARY INDEX (Account_Id);

-- Campaign Dimension Table: Details on marketing campaigns directed at customers
CREATE MULTISET TABLE Campaign_Dim (
    Campaign_Id BIGINT NOT NULL,
    Campaign_Name VARCHAR(100),
    Campaign_Type VARCHAR(50),    -- e.g., Promotion, Retention
    Campaign_Start_Date DATE,
    Campaign_End_Date DATE,
    PRIMARY KEY (Campaign_Id)
) PRIMARY INDEX (Campaign_Id);

-- Account Facts Table: Contains transactional information and usage patterns for accounts
CREATE MULTISET TABLE Account_Facts (
    Account_Id BIGINT NOT NULL,
    Customer_Id BIGINT NOT NULL,
    Account_Status VARCHAR(20),    -- e.g., Active, Closed
    Opening_Date DATE,
    Closing_Date DATE,
    Overdraft_Flag BYTEINT,       -- 1 for true, 0 for false
    Debit_Card_Flag BYTEINT,      -- 1 for true, 0 for false
    Fixed_Rate_Flag BYTEINT,      -- 1 for true, 0 for false
    Average_Balance DECIMAL(15,2), 
    Total_Transactions INT,
    PRIMARY KEY (Account_Id)
) PRIMARY INDEX (Account_Id);

-- Campaign Facts Table: Records interactions with marketing campaigns for each account
CREATE MULTISET TABLE Campaign_Facts (
    Campaign_Id BIGINT NOT NULL,
    Customer_Id BIGINT NOT NULL,
    Account_Id BIGINT NOT NULL,
    Promo_Flag BYTEINT,          -- 1 for true, 0 for false
    Interaction_Date DATE,
    Interaction_Type VARCHAR(50), -- e.g., Email, SMS
    Eligible_Flag BYTEINT,        -- 1 if eligible, 0 otherwise
    PRIMARY KEY (Campaign_Id, Customer_Id, Account_Id)
) PRIMARY INDEX (Campaign_Id, Customer_Id);

-- Consolidated Denormalized Table for simplified reporting on customer and account interactions
CREATE MULTISET TABLE Denormalized_Customer (
    Customer_Id BIGINT NOT NULL,
    Master_Bank_Customer_Id BIGINT,
    Postal_Code VARCHAR(10),
    City_Name VARCHAR(50),
    Region VARCHAR(20),
    Customer_Segment VARCHAR(20),
    Customer_Class CHAR(1),
    Preferred_Language_Code CHAR(2),
    Account_Id BIGINT,
    Account_Type VARCHAR(50),
    Product_Description VARCHAR(100),
    Product_Category VARCHAR(100),
    Interest_Rate DECIMAL(5,2),
    Account_Tier CHAR(1),
    Account_Status VARCHAR(20),
    Opening_Date DATE,
    Closing_Date DATE,
    Overdraft_Flag BYTEINT,
    Debit_Card_Flag BYTEINT,
    Fixed_Rate_Flag BYTEINT,
    Average_Balance DECIMAL(15,2),
    Total_Transactions INT,
    Campaign_Id BIGINT,
    Campaign_Name VARCHAR(100),
    Campaign_Type VARCHAR(50),
    Campaign_Start_Date DATE,
    Campaign_End_Date DATE,
    Promo_Flag BYTEINT,
    Interaction_Date DATE,
    Interaction_Type VARCHAR(50),
    Eligible_Flag BYTEINT
) PRIMARY INDEX (Customer_Id);

```
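
The DDL defines `Denormalized_Customer` but does not show how it is populated. One plausible approach is to join `Account_Facts` to its dimensions and left-join the campaign tables. The sketch below illustrates this with Python's built-in `sqlite3` module standing in for Teradata. The reduced column set, the sample rows, and the join logic are all illustrative assumptions, not the author's load process.

```python
import sqlite3

# In-memory SQLite stands in for Teradata here; only a few columns are kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Dim (Customer_Id INTEGER PRIMARY KEY, Region TEXT);
CREATE TABLE Account_Dim (Account_Id INTEGER PRIMARY KEY, Account_Type TEXT);
CREATE TABLE Campaign_Dim (Campaign_Id INTEGER PRIMARY KEY, Campaign_Name TEXT);
CREATE TABLE Account_Facts (
    Account_Id INTEGER PRIMARY KEY, Customer_Id INTEGER, Average_Balance REAL);
CREATE TABLE Campaign_Facts (
    Campaign_Id INTEGER, Customer_Id INTEGER, Account_Id INTEGER, Promo_Flag INTEGER,
    PRIMARY KEY (Campaign_Id, Customer_Id, Account_Id));

-- Hypothetical sample rows
INSERT INTO Customer_Dim VALUES (1, 'ZH'), (2, 'GE');
INSERT INTO Account_Dim VALUES (10, 'Savings'), (20, 'Checking');
INSERT INTO Campaign_Dim VALUES (100, 'Spring Promo');
INSERT INTO Account_Facts VALUES (10, 1, 2500.00), (20, 2, 900.00);
INSERT INTO Campaign_Facts VALUES (100, 1, 10, 1);
""")

# Inner joins attach the dimensions; left joins keep accounts with no campaign.
rows = conn.execute("""
SELECT c.Customer_Id, c.Region, a.Account_Id, a.Account_Type, af.Average_Balance,
       cd.Campaign_Id, cd.Campaign_Name, cf.Promo_Flag
FROM Account_Facts af
JOIN Customer_Dim c ON c.Customer_Id = af.Customer_Id
JOIN Account_Dim a ON a.Account_Id = af.Account_Id
LEFT JOIN Campaign_Facts cf
       ON cf.Customer_Id = af.Customer_Id AND cf.Account_Id = af.Account_Id
LEFT JOIN Campaign_Dim cd ON cd.Campaign_Id = cf.Campaign_Id
ORDER BY c.Customer_Id
""").fetchall()

for row in rows:
    print(row)
```

Customer 2 has no campaign interaction in this sample, so its campaign columns come back as `NULL` (`None` in Python), which matches the nullable campaign columns in the `Denormalized_Customer` DDL.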

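## Data Generation

The notebook cells below generate synthetic data with pandas and NumPy. The generated tables (`Transaction_Fact`, `Balance_Fact`, `Interaction_Fact`, `Customer_Details`) and some columns (for example `Currency` in `Account_Dim`) do not match the DDLs above, and the IDs are UUID strings rather than `BIGINT` values.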

In [114]:
import pandas as pd
import numpy as np
import uuid
import random

# Generate data for Customer_Dim
def generate_synthetic_customer_data(num_customers):
    np.random.seed(0)
    data = {
        "Customer_Id": [str(uuid.uuid4()) for _ in range(num_customers)],
        "Master_Bank_Customer_Id": np.random.randint(100000, 999999, size=num_customers),
        # NOTE: City_Name and Region are sampled independently, so mismatched
        # pairs (e.g., Geneva with ZH) can occur in the generated data
        "City_Name": np.random.choice(["Zurich", "Geneva", "Bern", "Basel", "Lausanne"], num_customers),
        "Region": np.random.choice(["ZH", "GE", "BE", "BS", "VD"], num_customers),
        "Customer_Segment": np.random.choice(["Retail", "Private Banking", "Wealth Management", "Corporate"], num_customers),
        "Customer_Class": np.random.choice(["A", "B", "C", "D"], num_customers),
        "Preferred_Language_Code": np.random.choice(["DE", "FR", "IT", "EN"], num_customers)
    }
    return pd.DataFrame(data)

# Generate data for Account_Dim
def generate_synthetic_account_data(num_accounts):
    np.random.seed(1)
    data = {
        "Account_Id": [str(uuid.uuid4()) for _ in range(num_accounts)],
        "Account_Type": np.random.choice(["Savings", "Checking", "Investment", "Loan"], num_accounts),
        "Currency": np.random.choice(["CHF", "USD", "EUR"], num_accounts),
        "Opening_Date": pd.to_datetime(np.random.choice(pd.date_range("2000-01-01", "2020-01-01"), num_accounts)),
        "Status": np.random.choice(["Active", "Dormant", "Closed"], num_accounts)
    }
    return pd.DataFrame(data)



def generate_synthetic_customer_details(customer_ids):
    np.random.seed(5)
    
    # Swiss road names and towns for address generation
    swiss_roads = [
        "Bahnhofstrasse", "Hauptstrasse", "Seestrasse", "Dorfstrasse", "Poststrasse", "Kirchgasse", 
        "Marktgasse", "Rosenweg", "Sonnenstrasse", "Schulstrasse", "Friedhofstrasse", "Mühlegasse", 
        "Industriestrasse", "Obere Dorfstrasse", "Neugasse", "Birkenweg", "Weiherstrasse", 
        "Landstrasse", "Schlossstrasse", "Im Graben", "Allee", "Hintergasse", "Bergstrasse", 
        "Tannenweg", "Quellenstrasse", "Eichenweg", "Hofstrasse", "Bachweg", "Feldstrasse", 
        "Mattenstrasse", "Haldenstrasse", "Brunnenstrasse", "Weidstrasse", "Burgstrasse", 
        "Gartenweg", "Winkelstrasse", "Kreuzstrasse", "Morgenstrasse", "Hauptgasse", "Haldenweg", 
        "Aarweg", "Churerstrasse", "Florastrasse", "Uferstrasse", "Zelgstrasse", "Neuhofstrasse", 
        "Rigiweg", "Hofmattstrasse", "Grenzstrasse", "Heinrichstrasse", "Forchstrasse", 
        "Hertensteinstrasse", "Hungerbergstrasse", "Rebbergstrasse", "Bäderstrasse", "Haselweg", 
        "Schindelistrasse", "Hinterdorfstrasse", "Albisstrasse", "Alte Landstrasse", "Frohsinnstrasse", 
        "Plattenstrasse", "Eggerstrasse", "Kirchweg", "Wildbachstrasse", "Langgasse", "Kreuzplatz", 
        "Bodenstrasse", "Pilatusstrasse", "Im Stutz", "Spreitenbachstrasse", "Schifflaende", 
        "Rohrstrasse", "Neugutstrasse", "Bannhaldenstrasse", "Wiesenstrasse", "Dörfliweg", 
        "Wehntalerstrasse", "Zürichstrasse", "Kreuzlingerstrasse", "Schützenweg", "Mönchstrasse", 
        "Lindenplatz", "Brühlstrasse", "Hurdackerstrasse", "Alpenstrasse", "Tobelstrasse", 
        "Bodenacker", "Rankweg", "Gerstenstrasse", "Moosstrasse", "Steinackerstrasse", 
        "Im Winkel", "Tellenfeldstrasse", "Seehaldenstrasse", "Wiesentalstrasse", "Untere Wiese", 
        "Weihermatte", "Schwanenplatz"
    ]
    
    swiss_towns = [
        "Zurich", "Geneva", "Bern", "Basel", "Lausanne", "Winterthur", "Lucerne", "St. Gallen", 
        "Lugano", "Biel", "Thun", "Köniz", "La Chaux-de-Fonds", "Schaffhausen", "Fribourg", 
        "Chur", "Neuchâtel", "Vernier", "Sion", "Uster", "Yverdon-les-Bains", "Zug", 
        "Rapperswil-Jona", "Montreux", "Sierre", "Wädenswil", "Kreuzlingen", "Binningen", 
        "Wil", "Bellinzona", "Aarau", "Baden", "Kloten", "Martigny", "Olten", "Rüti", 
        "Grenchen", "Vevey", "Morges", "Freienbach", "Schlieren", "Arlesheim", "Le Grand-Saconnex", 
        "Volketswil", "Dietikon", "Einsiedeln", "Prilly", "Riehen", "Meilen", "Muri", 
        "Horgen", "Gland", "Thalwil", "Lenzburg", "Burgdorf", "Affoltern", "Adliswil", 
        "Spiez", "Langenthal", "Zofingen", "Bulle", "Wohlen", "Onex", "Frauenfeld", "Cham", 
        "Romanshorn", "Nyon", "Emmen", "Crans-Montana", "Pfäffikon", "Hinwil", "Oberwil", 
        "Bussigny", "Allschwil", "Ecublens", "Les Acacias", "Schwyz", "Renens", "Muttenz", 
        "Spreitenbach", "Liestal", "Münsingen", "Pratteln", "Wettingen", "Worb", 
        "Herisau", "La Tour-de-Peilz", "Rümlang", "Ticino", "Solothurn", "Brugg", 
        "Glarus", "Stäfa", "Payerne", "Buchs", "Flawil", "Dübendorf"
    ]
    
    num_customers = len(customer_ids)
    # Generate random data
    data = {
        "Customer_Id": customer_ids,
        "Street_Name": np.random.choice(swiss_roads, num_customers),
        "House_Number": np.random.randint(1, 100, num_customers),
        "Town": np.random.choice(swiss_towns, num_customers),
        "Postal_Code": np.random.randint(1000, 9999, num_customers),
        "Date_Of_Birth": pd.to_datetime(np.random.choice(pd.date_range("1930-01-01", "2000-01-01"), num_customers)),
        "Customer_Since": pd.to_datetime(np.random.choice(pd.date_range("2010-01-01", "2022-12-31"), num_customers))
    }
    
    customer_details = pd.DataFrame(data)
    
    # Introduce 0.5% missing values for both Street_Name and House_Number
    mask_both_missing = customer_details.sample(frac=0.005).index
    customer_details.loc[mask_both_missing, ['Street_Name', 'House_Number']] = np.nan
    
    # Introduce 1% missing values for only House_Number
    mask_house_missing = customer_details.sample(frac=0.01).index
    customer_details.loc[mask_house_missing, 'House_Number'] = np.nan
    
    return customer_details


def map_accounts_to_customers(account_ids, customer_ids):
    # Repeat customer IDs to match the length of account_ids and shuffle for randomness
    customer_ids_repeated = np.resize(customer_ids, len(account_ids))
    np.random.shuffle(customer_ids_repeated)
    
    # Create a mapping DataFrame
    account_customer_map = pd.DataFrame({
        'Account_Id': account_ids,
        'Customer_Id': customer_ids_repeated
    })
    
    return account_customer_map


# Generate data for Transaction_Fact
def generate_synthetic_transaction_data(num_transactions, account_ids):
    np.random.seed(2)
    data = {
        "Transaction_Id": [str(uuid.uuid4()) for _ in range(num_transactions)],
        "Account_Id": np.random.choice(account_ids, num_transactions),
        "Transaction_Date": pd.to_datetime(np.random.choice(pd.date_range("2020-01-01", "2023-01-01"), num_transactions)),
        "Transaction_Type": np.random.choice(["Credit", "Debit"], num_transactions),
        "Transaction_Amount": np.round(np.random.uniform(100, 10000, num_transactions), 2),
        "Channel": np.random.choice(["Online", "Branch", "ATM", "Mobile"], num_transactions)
    }
    return pd.DataFrame(data)

# Generate data for Account_Balance_Fact
def generate_synthetic_balance_data(num_entries, account_ids):
    np.random.seed(3)
    data = {
        "Account_Id": np.random.choice(account_ids, num_entries),
        "Balance_Date": pd.to_datetime(np.random.choice(pd.date_range("2020-01-01", "2023-01-01"), num_entries)),
        "Balance_Amount": np.round(np.random.uniform(1000, 500000, num_entries), 2)
    }
    return pd.DataFrame(data)

# Generate data for Customer_Interaction_Fact
def generate_synthetic_interaction_data(num_interactions, customer_ids):
    np.random.seed(4)
    data = {
        "Interaction_Id": [str(uuid.uuid4()) for _ in range(num_interactions)],
        "Customer_Id": np.random.choice(customer_ids, num_interactions),
        "Interaction_Date": pd.to_datetime(np.random.choice(pd.date_range("2020-01-01", "2023-01-01"), num_interactions)),
        "Interaction_Channel": np.random.choice(["Email", "Phone", "Branch", "Mobile App"], num_interactions),
        "Interaction_Type": np.random.choice(["Inquiry", "Complaint", "Service Request", "Feedback"], num_interactions),
        "Resolution_Status": np.random.choice(["Resolved", "Pending", "Escalated"], num_interactions)
    }
    return pd.DataFrame(data)



def generate_denormalized_master(customer_dim, account_dim, account_customer_map, transaction_fact, balance_fact, interaction_fact):
    # Set indexes for easier joining based on Customer_Id and Account_Id
    customer_dim = customer_dim.set_index('Customer_Id')
    account_customer_map = account_customer_map.set_index('Account_Id')
    account_dim = account_dim.set_index('Account_Id')
    transaction_fact = transaction_fact.set_index('Account_Id')
    balance_fact = balance_fact.set_index('Account_Id')
    interaction_fact = interaction_fact.set_index('Customer_Id')
    
    # Map each transaction and balance record to a Customer_Id using the account-to-customer mapping
    transaction_fact['Customer_Id'] = transaction_fact.index.map(account_customer_map['Customer_Id'])
    balance_fact['Customer_Id'] = balance_fact.index.map(account_customer_map['Customer_Id'])
    
    # Aggregate account data for each customer
    account_agg = account_dim.join(account_customer_map).reset_index().groupby('Customer_Id').agg({
        'Account_Id': 'count',
        'Account_Type': lambda x: x.mode()[0],
        'Currency': lambda x: x.mode()[0]
    }).rename(columns={
        'Account_Id': 'Num_Accounts',
        'Account_Type': 'Primary_Account_Type',
        'Currency': 'Primary_Currency'
    })
        
    # Aggregate transaction data for each customer
    transaction_agg = transaction_fact.reset_index().groupby('Customer_Id').agg({
        'Transaction_Amount': ['sum', 'mean'],
        'Transaction_Type': lambda x: x.value_counts().idxmax()
    })
    transaction_agg.columns = ['Total_Transaction_Amount', 'Average_Transaction_Amount', 'Common_Transaction_Type']

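    # Aggregate balance data for each customer
    # NOTE: 'last' returns the last row per customer in the current row order,
    # not the most recent Balance_Date; sort by Balance_Date first if a true
    # current balance is required
    balance_agg = balance_fact.reset_index().groupby('Customer_Id').agg({
        'Balance_Amount': ['last', 'min'],
    })
    balance_agg.columns = ['Current_Balance', 'Lowest_Balance']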
    
    # Aggregate interaction data for each customer
    interaction_agg = interaction_fact.groupby('Customer_Id').agg({
        'Interaction_Type': lambda x: x.value_counts().idxmax(),
        'Interaction_Channel': lambda x: x.value_counts().idxmax(),
        'Interaction_Date': 'count'
    }).rename(columns={
        'Interaction_Type': 'Frequent_Interaction_Type',
        'Interaction_Channel': 'Preferred_Channel',
        'Interaction_Date': 'Total_Interactions'
    })
    
    # Join all aggregated data with the customer dimension
    master_table = customer_dim.join(account_agg) \
                               .join(transaction_agg) \
                               .join(balance_agg) \
                               .join(interaction_agg)
    
    return master_table.reset_index()


In [115]:
def generate_all_tables(num_customers, num_accounts, num_transactions, num_balances, num_interactions):
    # Generate customer table
    customer_dim = generate_synthetic_customer_data(num_customers)
    all_customer_ids = customer_dim['Customer_Id'].unique()
    
    # Generate account table
    account_dim = generate_synthetic_account_data(num_accounts)
    
    # Map accounts to customers
    account_customer_map = map_accounts_to_customers(account_dim['Account_Id'].unique(), all_customer_ids)
    
    # Generate transaction table
    transaction_fact = generate_synthetic_transaction_data(num_transactions, account_dim['Account_Id'])
    
    # Generate balance table
    balance_fact = generate_synthetic_balance_data(num_balances, account_dim['Account_Id'])
    
    # Generate interaction table
    interaction_fact = generate_synthetic_interaction_data(num_interactions, all_customer_ids)
    
    # Generate the denormalized master table
    master_table = generate_denormalized_master(
        customer_dim, 
        account_dim, 
        account_customer_map, 
        transaction_fact, 
        balance_fact, 
        interaction_fact
    )

    customer_details = generate_synthetic_customer_details(all_customer_ids)

    return {
        "Customer_Dim": customer_dim,
        "Account_Dim": account_dim,
        "Account_Customer_Map": account_customer_map,
        "Transaction_Fact": transaction_fact,
        "Balance_Fact": balance_fact,
        "Interaction_Fact": interaction_fact,
        "Master_Table": master_table,
        "Customer_Details":customer_details
    }
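
`map_accounts_to_customers` relies on `np.resize`, which repeats the customer ID array cyclically until it reaches the number of accounts. With 50,000 customers and 60,000 accounts, the first 10,000 customers in the array therefore receive two accounts and the rest receive one. This appears consistent with the `Num_Accounts` values in the `master_table` output further down. The standard-library sketch below mimics that behavior on a small scale; the IDs and the use of `itertools` in place of NumPy are assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import cycle, islice

customer_ids = ["c1", "c2", "c3", "c4", "c5"]   # hypothetical IDs
account_ids = [f"a{i}" for i in range(1, 8)]    # 7 hypothetical accounts

# Cycle through the customers until every account has an owner,
# mirroring how np.resize repeats its input to reach the target length
owners = list(islice(cycle(customer_ids), len(account_ids)))

# Count before shuffling: the shuffle changes which account goes to whom,
# not how many accounts each customer holds
accounts_per_customer = Counter(owners)

random.seed(0)  # arbitrary seed for this sketch only
random.shuffle(owners)
account_customer_map = dict(zip(account_ids, owners))

print(accounts_per_customer)
```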


In [116]:
all_tables = generate_all_tables(num_customers=50000, num_accounts=60000,
                                 num_transactions=500000,
                                 num_balances=60000, num_interactions=300000)

In [117]:
customer_dim, account_dim, account_customer_map, transaction_fact, balance_fact, interaction_fact, master_table, customer_details = all_tables.values()
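
Unpacking `all_tables.values()` works because Python dictionaries preserve insertion order, but it breaks silently if the keys in `generate_all_tables` are ever reordered. A slightly more defensive alternative is to select tables by name. The sketch below uses a small stand-in dictionary with placeholder string values, which is an assumption made for illustration.

```python
from operator import itemgetter

# Hypothetical stand-in for the dict returned by generate_all_tables
all_tables = {
    "Customer_Dim": "customers",
    "Account_Dim": "accounts",
    "Master_Table": "master",
}

# Select tables by name, so the result does not depend on the order
# in which the keys were inserted
master_table, customer_dim = itemgetter("Master_Table", "Customer_Dim")(all_tables)

print(master_table, customer_dim)
```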

In [118]:
customer_dim

| | Customer_Id | Master_Bank_Customer_Id | City_Name | Region | Customer_Segment | Customer_Class | Preferred_Language_Code |
|---|---|---|---|---|---|---|---|
| 0 | 216ed956-6879-4089-a191-c6bf2a3f391c | 405711 | Zurich | ZH | Retail | A | EN |
| 1 | 136e0b2a-7460-4d11-b214-b24ec2395a11 | 535829 | Zurich | ZH | Wealth Management | D | DE |
| 2 | dc8b5908-6502-4e09-82c6-4eddce51a68c | 217952 | Geneva | ZH | Corporate | C | EN |
| 3 | 3d2c42e9-64d9-4b8c-931f-b9ea0c01bd47 | 252315 | Lausanne | VD | Wealth Management | A | DE |
| 4 | d4416a0b-0de5-4fd4-9d65-0b9a2ccb793e | 982371 | Geneva | VD | Retail | A | FR |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 49995 | edb88cf5-e095-4bbc-9f3a-68eb95d6401b | 371583 | Geneva | GE | Private Banking | A | FR |
| 49996 | c711e6a0-e918-44a4-a7cf-8ec97efff6e4 | 367320 | Zurich | BS | Wealth Management | D | EN |
| 49997 | 95aab6ba-6d5a-4427-a0b5-137a1c037712 | 608183 | Zurich | BS | Private Banking | C | IT |
| 49998 | d34ae2f1-4df1-426f-bb0b-dfe5f15d0f34 | 280665 | Basel | BS | Private Banking | B | DE |


In [119]:
master_table

| | Customer_Id | Master_Bank_Customer_Id | City_Name | Region | Customer_Segment | Customer_Class | Preferred_Language_Code | Num_Accounts | Primary_Account_Type | Primary_Currency | Total_Transaction_Amount | Average_Transaction_Amount | Common_Transaction_Type | Current_Balance | Lowest_Balance | Frequent_Interaction_Type | Preferred_Channel | Total_Interactions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 216ed956-6879-4089-a191-c6bf2a3f391c | 405711 | Zurich | ZH | Retail | A | EN | 2 | Investment | CHF | 85222.93 | 4261.146500 | Credit | 173777.61 | 173777.61 | Complaint | Email | 7.0 |
| 1 | 136e0b2a-7460-4d11-b214-b24ec2395a11 | 535829 | Zurich | ZH | Wealth Management | D | DE | 2 | Loan | CHF | 85220.07 | 4734.448333 | Debit | 269004.07 | 269004.07 | Service Request | Branch | 5.0 |
| 2 | dc8b5908-6502-4e09-82c6-4eddce51a68c | 217952 | Geneva | ZH | Corporate | C | EN | 2 | Checking | CHF | 65687.93 | 4691.995000 | Debit | 339839.55 | 174542.12 | Inquiry | Branch | 6.0 |
| 3 | 3d2c42e9-64d9-4b8c-931f-b9ea0c01bd47 | 252315 | Lausanne | VD | Wealth Management | A | DE | 2 | Savings | CHF | 122934.45 | 6829.691667 | Debit | 255449.78 | 102589.27 | Feedback | Email | 4.0 |
| 4 | d4416a0b-0de5-4fd4-9d65-0b9a2ccb793e | 982371 | Geneva | VD | Retail | A | FR | 2 | Investment | CHF | 150316.19 | 5781.391923 | Credit | 331461.78 | 331461.78 | Service Request | Branch | 6.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 49995 | edb88cf5-e095-4bbc-9f3a-68eb95d6401b | 371583 | Geneva | GE | Private Banking | A | FR | 1 | Investment | CHF | 20286.65 | 5071.662500 | Debit | 373944.23 | 373944.23 | Feedback | Email | 9.0 |
| 49996 | c711e6a0-e918-44a4-a7cf-8ec97efff6e4 | 367320 | Zurich | BS | Wealth Management | D | EN | 1 | Investment | EUR | 27144.69 | 3393.086250 | Debit | 327750.30 | 327750.30 | Service Request | Phone | 7.0 |
| 49997 | 95aab6ba-6d5a-4427-a0b5-137a1c037712 | 608183 | Zurich | BS | Private Banking | C | IT | 1 | Loan | USD | 29059.14 | 4843.190000 | Credit | NaN | NaN | Feedback | Email | 3.0 |
| 49998 | d34ae2f1-4df1-426f-bb0b-dfe5f15d0f34 | 280665 | Basel | BS | Private Banking | B | DE | 1 | Investment | CHF | 63694.48 | 5307.873333 | Debit | 102141.69 | 102141.69 | Complaint | Mobile App | 6.0 |
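
Row 49997 has no balance values, and `Total_Interactions` is shown as a float (`7.0`) even though it is a count. Both are likely side effects of the left joins in `generate_denormalized_master`: a customer with no matching records in an aggregate gets missing values, and pandas upcasts an integer column to `float64` once it contains `NaN`. The sketch below reproduces this with hypothetical data and shows one optional way to restore integer counts.

```python
import pandas as pd

# Hypothetical customers; c2 has no interaction records
customer_dim = pd.DataFrame(
    {"Region": ["ZH", "GE"]},
    index=pd.Index(["c1", "c2"], name="Customer_Id"),
)
interaction_agg = pd.DataFrame(
    {"Total_Interactions": [7]},
    index=pd.Index(["c1"], name="Customer_Id"),
)

# DataFrame.join defaults to a left join on the index,
# so c2 is kept and receives NaN
joined = customer_dim.join(interaction_agg)
print(joined["Total_Interactions"].dtype)

# Optional: treat "no records" as zero and restore an integer dtype
filled = joined["Total_Interactions"].fillna(0).astype("int64")
print(filled.tolist())
```

Filling with zero is reasonable for counts, but not for balances, where a missing value means "unknown" rather than "zero".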


In [120]:
customer_details

| | Customer_Id | Street_Name | House_Number | Town | Postal_Code | Date_Of_Birth | Customer_Since |
|---|---|---|---|---|---|---|---|
| 0 | 216ed956-6879-4089-a191-c6bf2a3f391c | Zürichstrasse | 76.0 | Vernier | 3444 | 1981-07-01 | 2018-04-27 |
| 1 | 136e0b2a-7460-4d11-b214-b24ec2395a11 | Plattenstrasse | NaN | Brugg | 8528 | 1971-09-07 | 2022-03-09 |
| 2 | dc8b5908-6502-4e09-82c6-4eddce51a68c | Weiherstrasse | 63.0 | Biel | 1978 | 1974-10-10 | 2015-09-06 |
| 3 | 3d2c42e9-64d9-4b8c-931f-b9ea0c01bd47 | Neugutstrasse | 60.0 | Bellinzona | 4496 | 1955-08-11 | 2010-07-12 |
| 4 | d4416a0b-0de5-4fd4-9d65-0b9a2ccb793e | Sonnenstrasse | 74.0 | Payerne | 1379 | 1986-04-04 | 2016-03-28 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 49995 | edb88cf5-e095-4bbc-9f3a-68eb95d6401b | Hauptgasse | 26.0 | Frauenfeld | 7616 | 1987-09-05 | 2020-04-15 |
| 49996 | c711e6a0-e918-44a4-a7cf-8ec97efff6e4 | Im Stutz | 62.0 | Hinwil | 4875 | 1950-05-30 | 2020-12-02 |
| 49997 | 95aab6ba-6d5a-4427-a0b5-137a1c037712 | Frohsinnstrasse | 46.0 | Onex | 3702 | 1974-08-09 | 2019-12-06 |
| 49998 | d34ae2f1-4df1-426f-bb0b-dfe5f15d0f34 | Allee | 99.0 | Binningen | 5856 | 1996-08-30 | 2013-06-12 |


In [112]:
customer_details.isna().sum()

Customer_Id         0
Street_Name       250
House_Number      748
Town                0
Postal_Code         0
Date_Of_Birth       0
Customer_Since      0
dtype: int64
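
These counts follow from the two sampling steps in `generate_synthetic_customer_details`. With 50,000 customers, 0.5% gives 250 rows that lose both `Street_Name` and `House_Number`, and a separate 1% gives 500 rows that lose only `House_Number`. The two samples are drawn independently, so a few rows can fall into both. The observed 748 (rather than 750) implies an overlap of 2 rows. The standard-library sketch below reproduces the arithmetic; the seed is arbitrary, so its overlap will generally differ from the notebook's.

```python
import random

num_customers = 50_000
random.seed(0)  # arbitrary seed for this sketch only
rows = range(num_customers)

# 0.5% of rows lose both Street_Name and House_Number
both_missing = set(random.sample(rows, num_customers // 200))
# A separate 1% of rows lose House_Number only
house_missing = set(random.sample(rows, num_customers // 100))

street_nulls = len(both_missing)
house_nulls = len(both_missing | house_missing)   # union: overlaps counted once
overlap = len(both_missing & house_missing)

print(street_nulls, house_nulls, overlap)
```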

In [121]:
transaction_fact

| | Transaction_Id | Account_Id | Transaction_Date | Transaction_Type | Transaction_Amount | Channel |
|---|---|---|---|---|---|---|
| 0 | 576f357e-78eb-41cb-925e-a64d99768944 | e1116b7e-ae18-43b8-b0f0-152d3d2f757a | 2022-09-17 | Debit | 3644.77 | Online |
| 1 | 0875e515-4649-4cab-a432-e23bfa785820 | d9a97950-10e4-42b4-afd9-325b15f99d0a | 2021-09-23 | Debit | 9346.85 | Online |
| 2 | 95d1791c-6a69-424b-8abc-27e4fbcdb970 | 9f83af9f-7f8a-4617-8372-f663256949c2 | 2022-04-30 | Debit | 4729.85 | ATM |
| 3 | 02fc96a8-cb41-4175-8aa9-090054d5dc25 | f47bb788-4721-4ade-af06-0804c3510848 | 2022-10-04 | Debit | 2216.86 | ATM |
| 4 | 117545bb-26e5-44c4-ba25-189e17821437 | 19bb644e-d237-4a19-8412-33f0d2abb6b3 | 2022-10-16 | Debit | 3457.56 | Branch |
| ... | ... | ... | ... | ... | ... | ... |
| 499995 | e1d1a879-cffc-47e0-a487-2e9fe10f58ff | c026cfa8-7aa1-4f8a-ace8-68deb8ca9456 | 2020-03-07 | Debit | 8603.35 | Online |
| 499996 | 2d13dc2a-cf2c-4972-a694-929a32f268b4 | 892eff57-d162-46b6-a3e7-0f3401f6e878 | 2021-03-25 | Debit | 9287.40 | Branch |
| 499997 | dbd3b9a2-1d28-4ba3-8760-83c603e82af5 | 7fa3e123-35c5-455e-9498-596c8ae9a12b | 2021-06-30 | Credit | 9081.04 | Mobile |
| 499998 | f3c413ce-5e57-4a12-81f0-a05615d44ce3 | e623b20a-d685-4856-98cb-4ffec99e3748 | 2022-10-02 | Credit | 9642.34 | ATM |
