# Foreign Exchange (FX) Trade Verification and Reconciliation with Data Analysis

This notebook verifies FX trade records against bank statements and identifies matched and unmatched transactions on both the "Buy" and "Sell" sides.

**New Features:**
- **Data Consistency Emphasis:** Robust parsing and best practices for data integrity.
- **Comprehensive Data Analysis:** Summaries and insights into the reconciliation results.
- **Seaborn Visualizations:** Plots that illustrate key findings.
- **CSV Export:** Matched and unmatched records are written to CSV files.

**Instructions:**
1. Ensure you have the necessary CSV or Excel (`.xlsx`) files (FX Trade Tracker, Bank Statements) ready.
2. Upload the files using the provided widgets.
3. Run all cells in sequence.

---

In [None]:
pip install openpyxl

## 1. Imports and Setup
This section imports all necessary libraries and sets up global configurations. We're adding `matplotlib.pyplot` and `seaborn` for plotting.

In [2]:
import pandas as pd
from datetime import datetime
import csv
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
import ipywidgets as widgets
from IPython.display import display, clear_output
from ipywidgets import Output
import base64
import io
import matplotlib.pyplot as plt
import seaborn as sns

# Set Seaborn style for beautiful plots
sns.set_theme(style="whitegrid", palette="viridis")
plt.rcParams['figure.figsize'] = (10, 6) # Default figure size



In [None]:
# --- Configuration ---
# Output paths (these will be relative to where the notebook is run or absolute paths)
out_csv_path_buy_unmatched = 'UnmatchedCounterpartyPayment.csv'
out_csv_path_sell_unmatched = 'UnmatchedChoicePayment.csv'
out_csv_path_bank_unmatched = 'UnmatchedBankRecords.csv'
out_csv_path_buy_matched = 'MatchedCounterpartyPayment.csv'
out_csv_path_sell_matched = 'MatchedChoicePayment.csv'

# Date for FX rate (can be dynamically set or user input)
# This will be used for any FX conversions
fx_rate_date = '2025-07-11' # YYYY-MM-DD for consistency

# Various Date Formats to handle different date representations in CSVs
date_formats = [
    '%Y-%m-%d',
    '%Y/%m/%d',
    '%d.%m.%Y',
    '%Y.%m.%d',
    '%d/%m/%Y',
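    # '%-d'-style directives are strftime-only and platform-specific; strptime
    # rejects them, and '%d' / '%m' already accept non-zero-padded values.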
]

# Fuzzy matching threshold for bank names (0-100)
FUZZY_MATCH_THRESHOLD = 80

# Hardcoded FX Rates (for demonstration purposes)
# In a real-world scenario, you'd fetch these from an API or a daily data source.
# Format: {'CURRENCY_PAIR': RATE} e.g., 'USDKES': 145.0
# Assuming all rates are against a base currency (e.g., KES or USD)
FX_RATES = {
    'USDKES': 145.0,
    'EURKES': 155.0,
    'GBPUSD': 1.25,
    'USDGBP': 1 / 1.25,  # Inverse of GBPUSD (listed once; duplicate keys silently overwrite)
    'EURUSD': 1.08,
    'USDEUR': 0.92,  # Approximate inverse of EURUSD
    'KESUSD': 1 / 145.0,  # Inverse of USDKES
    'KESEUR': 1 / 155.0,  # Inverse of EURKES
    # Add more as needed
}

def get_fx_rate(from_currency, to_currency, date=None):
    """
    Retrieves the FX rate for conversion.
    In a real application, this would query a database or an external API.
    For this example, it uses the hardcoded FX_RATES.
    """
    from_currency = from_currency.upper()
    to_currency = to_currency.upper()

    if from_currency == to_currency:
        return 1.0

    pair = f"{from_currency}{to_currency}"
    if pair in FX_RATES:
        return FX_RATES[pair]

    # Try inverse rate
    inverse_pair = f"{to_currency}{from_currency}"
    if inverse_pair in FX_RATES:
        return 1 / FX_RATES[inverse_pair]

    print(f"Warning: FX rate not found for {from_currency} to {to_currency}. Assuming 1:1 for demonstration.")
    return 1.0 # Fallback

def convert_currency(amount, from_currency, to_currency, date=None):
    """Converts an amount from one currency to another using the FX_RATES."""
    rate = get_fx_rate(from_currency, to_currency, date)
    print(f"Converting {amount} {from_currency} -> {to_currency} at rate {rate} (date: {date})")
    # Multiply by the looked-up rate (1.0 when both currencies are the same).
    return amount * rate
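
The direct-then-inverse lookup described above can be sketched in isolation as follows. This is a minimal illustration, not the notebook's implementation: `DEMO_RATES`, `lookup_rate`, and `convert` are hypothetical names, the rates are made-up demo values, and unlike `get_fx_rate` this sketch raises on an unknown pair instead of falling back to 1:1.

```python
# Hypothetical demo rates for illustration only; not real market data.
DEMO_RATES = {"USDKES": 145.0, "GBPUSD": 1.25}


def lookup_rate(from_ccy, to_ccy, rates):
    """Return the rate for from_ccy -> to_ccy, or None if it is unknown."""
    from_ccy, to_ccy = from_ccy.upper(), to_ccy.upper()
    if from_ccy == to_ccy:
        return 1.0
    direct = rates.get(from_ccy + to_ccy)
    if direct is not None:
        return direct
    inverse = rates.get(to_ccy + from_ccy)
    if inverse:  # a missing or zero rate cannot be inverted
        return 1.0 / inverse
    return None


def convert(amount, from_ccy, to_ccy, rates):
    """Convert an amount, raising KeyError when no rate is available."""
    rate = lookup_rate(from_ccy, to_ccy, rates)
    if rate is None:
        raise KeyError(f"No FX rate for {from_ccy}->{to_ccy}")
    return amount * rate


print(convert(100, "USD", "KES", DEMO_RATES))    # uses the direct USDKES rate
print(convert(14500, "KES", "USD", DEMO_RATES))  # uses the inverse of USDKES
```

Raising on a missing rate is a design choice of this sketch: a silent 1:1 fallback can make amounts in different currencies look equal during reconciliation.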

## 2. Helper Functions for Data Consistency and Processing
This section defines various utility functions used throughout the reconciliation process, focusing on robust data handling.
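
As a rough illustration of the bank-name normalization idea used in this section, the sketch below tries an alias prefix first and then falls back to a similarity score. It is an assumption-laden stand-in: it uses the standard library's `difflib.SequenceMatcher` rather than `fuzzywuzzy`, and `ALIASES`, `similarity`, and `normalize_bank` are hypothetical names with a shortened alias table.

```python
from difflib import SequenceMatcher

# Hypothetical, shortened alias table (long name -> short code).
ALIASES = {"ncba bank": "ncba", "kcb bank": "kcb", "equity bank": "equity"}


def similarity(a, b):
    """Similarity score between two strings on a 0-100 scale."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def normalize_bank(raw, threshold=80):
    """Map a raw bank name to a short code, or return it lowercased if unknown."""
    name = str(raw).lower().strip()
    # Step 1: exact prefix match against the known long names.
    for long_name, code in ALIASES.items():
        if name.startswith(long_name):
            return code
    # Step 2: closest long name by similarity, accepted only above the threshold.
    best_name, best_score = max(
        ((long_name, similarity(name, long_name)) for long_name in ALIASES),
        key=lambda pair: pair[1],
    )
    return ALIASES[best_name] if best_score >= threshold else name


print(normalize_bank("NCBA Bank Kenya"))
print(normalize_bank("equty bank"))  # typo resolved by the similarity fallback
print(normalize_bank("stanbic"))     # unknown names pass through unchanged
```

The threshold trades false positives against misses: a lower value tolerates more typos but risks mapping two different banks to the same code.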

In [None]:
def safe_float(x):
    """Safely converts a value to a float, handling commas, non-numeric inputs, and ensuring consistency."""
    if pd.isna(x) or x is None:
        return None
    try:
        # Convert to string, remove commas, and strip whitespace
        cleaned_x = str(x).replace(',', '').strip()
        return float(cleaned_x)
    except (ValueError, TypeError):
        return None

def normalize_bank_key(raw_key):
    """Normalizes bank names to a consistent short code, using fuzzy matching."""
    raw_key_lower = str(raw_key).lower().strip()
    replacements = {
        'ncba bank kenya plc': 'ncba',
        'ncba bank': 'ncba',
        'equity bank': 'equity',
        'i&m bank': 'i&m',
        'central bank of kenya': 'cbk',
        'kenya commercial bank': 'kcb',
        'kcb bank': 'kcb',
        'sbm bank (kenya) limited': 'sbm',
        'sbm bank': 'sbm',
        'absa bank': 'absa',
        'kingdom bank': 'kingdom'
    }

    # First, try direct replacement
    for long, short in replacements.items():
        if raw_key_lower.startswith(long):
            return raw_key_lower.replace(long, short).strip()

    # If no direct match, try fuzzy matching against known short codes/replacements
    all_bank_names = list(replacements.values()) + list(replacements.keys())
    all_bank_names = list(set(all_bank_names)) # Ensure uniqueness

    match = process.extractOne(raw_key_lower, all_bank_names, scorer=fuzz.ratio)
    if match and match[1] >= FUZZY_MATCH_THRESHOLD:
        for long, short in replacements.items():
            if match[0].startswith(long):
                return short
        return match[0]
    return raw_key_lower # Return original if no good fuzzy match

def resolve_amount_column(columns, action_type, is_sell_side=False):
    """Identifies the correct amount column (e.g., 'credit', 'debit') based on action type."""
    columns_lower = [col.lower() for col in columns]
    if not is_sell_side: # Refers to the local currency amount in FX trade (Counterparty Payment)
        # If it's a 'Bank Buy' from our perspective, we are paying local currency (debit/withdrawal from local acc)
        # If it's a 'Bank Sell', we are receiving local currency (credit/deposit to local acc)
        if action_type == 'Bank Buy':
            return next((columns[i] for i, col in enumerate(columns_lower) if col in ['withdrawal', 'debit']), None)
        elif action_type == 'Bank Sell':
            return next((columns[i] for i, col in enumerate(columns_lower) if col in ['deposit', 'credit']), None)
    else: # is_sell_side refers to the foreign currency amount in FX trade (Choice Payment)
        # If it's a 'Bank Buy', we are receiving foreign currency (credit/deposit to foreign acc)
        # If it's a 'Bank Sell', we are paying foreign currency (debit/withdrawal from foreign acc)
        if action_type == 'Bank Buy':
            return next((columns[i] for i, col in enumerate(columns_lower) if col in ['deposit', 'credit']), None)
        elif action_type == 'Bank Sell':
            return next((columns[i] for i, col in enumerate(columns_lower) if col in ['withdrawal', 'debit']), None)
    return None


def resolve_date_column(columns):
    """Identifies the date column from a list of column names, prioritizing common formats."""
    for candidate in ['Value Date', 'Transaction Date', 'MyUnknownColumn', 'Transaction date', 'Date', 'Activity Date']:
        if candidate in columns:
            return candidate
    return None

def get_amount_columns(columns):
    """Returns a list of potential amount columns."""
    return [col for col in columns if col.lower() in ['deposit', 'credit', 'withdrawal', 'debit', 'amount', 'value']]

def get_description_columns(columns):
    """Identifies the description column from a list of column names."""
    for desc in ['Transaction details','Transaction', 'Customer reference','Narration',
                 'Transaction Details', 'Detail',  'Transaction Remarks:',
                 'TransactionDetails', 'Description', 'Narrative', 'Remarks']:
        if desc in columns:
            return desc
    return None

def parse_date(date_str_raw):
    """Parses a date string into a datetime object using predefined formats."""
    if not isinstance(date_str_raw, str):
        return None
    # Attempt to parse as date only, stripping time if present
    date_str = date_str_raw.split()[0].strip()
    for fmt in date_formats:
        try:
            return datetime.strptime(date_str, fmt)
        except ValueError:
            continue
    return None
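
The multi-format date parsing above can be exercised on its own with a small sketch like the one below. `FORMATS` and `parse_statement_date` are hypothetical names, and the format list is a subset chosen for illustration. Note that `strptime` accepts single-digit days and months for `%d` and `%m`, so no separate non-padded formats are needed.

```python
from datetime import datetime

# Candidate formats, tried in order; the first one that parses wins.
FORMATS = ["%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y", "%Y.%m.%d", "%d/%m/%Y"]


def parse_statement_date(value):
    """Parse the date part of a string, returning None when nothing matches."""
    if not isinstance(value, str) or not value.strip():
        return None
    token = value.split()[0]  # drop a trailing time such as "14:30"
    for fmt in FORMATS:
        try:
            return datetime.strptime(token, fmt)
        except ValueError:
            continue
    return None


print(parse_statement_date("2025-07-11"))
print(parse_statement_date("11/07/2025 14:30"))
print(parse_statement_date("1.7.2025"))
print(parse_statement_date("not a date"))
```

Because slash-separated dates are read day-first here, a month-first source such as `07/11/2025` would be interpreted as 7 November; the format order should reflect the conventions of the actual statements.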

## 3. Data Loading
Use the file upload widgets below to load your FX Trade Tracker and Bank Statement files (CSV or Excel).
The loaders report an error message instead of failing when no file has been uploaded or a file cannot be read.
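
Independently of the widgets, turning uploaded bytes into a DataFrame can be sketched as below. This is a simplified illustration under stated assumptions: `read_uploaded_table` is a hypothetical helper, the file type is inferred from the extension only, and the encoding fallback is limited to UTF-8 followed by Latin-1.

```python
import io

import pandas as pd


def read_uploaded_table(name, content, sheet_name=0):
    """Load uploaded bytes into a DataFrame, choosing a reader by file extension."""
    data = bytes(content)  # upload content may arrive as a memoryview
    if name.lower().endswith(".xlsx"):
        df = pd.read_excel(io.BytesIO(data), sheet_name=sheet_name)
    else:
        # latin1 decodes any byte sequence, so it acts as the last resort.
        for encoding in ("utf-8", "latin1"):
            try:
                df = pd.read_csv(io.BytesIO(data), encoding=encoding)
                break
            except UnicodeDecodeError:
                continue
    # Strip stray whitespace so later column lookups are predictable.
    df.columns = df.columns.astype(str).str.strip()
    return df


sample = ' Date ,Credit\n2025-07-11,"1,000.50"\n'.encode("utf-8")
print(read_uploaded_table("ncba_kes.csv", sample))
```

Stripping column names once at load time keeps the column selector, the rename map, and the processed DataFrame in agreement.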

In [3]:

fx_upload_widget = widgets.FileUpload(
    accept='.csv,.xlsx',
    multiple=False,
    description='Upload FX Tracker'
)
fx_output = Output()
fx_sheet_dropdown = widgets.Dropdown(description='Sheet:', layout=widgets.Layout(width='300px'), visible=False)
fx_column_selector = widgets.SelectMultiple(description='Columns:', layout=widgets.Layout(width='300px'))
fx_column_renames = {}
fx_column_rename_box = widgets.VBox()
fx_file_label = widgets.Label(value="No FX file uploaded.")
process_fx_btn = widgets.Button(description='Process FX Data', button_style='success')
clear_fx_btn = widgets.Button(description='Clear FX Upload', button_style='danger')
fx_controls = widgets.HBox([fx_upload_widget, clear_fx_btn, process_fx_btn])

fx_raw_file = None
fx_trade_df = pd.DataFrame()
fx_sheet_names = []

# ========== HELPERS ==========

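def parse_date(date_str):
    """Parses a date with pandas, returning pd.NaT on failure.

    Note: this redefinition replaces the strptime-based parse_date from Section 2.
    """
    try:
        return pd.to_datetime(date_str, errors='coerce')
    except Exception:
        return pd.NaT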

def generate_download_link(df, filename):
    csv_data = df.to_csv(index=False)
    b64 = base64.b64encode(csv_data.encode()).decode()
    return widgets.HTML(value=f'<a download="{filename}" href="data:text/csv;base64,{b64}" target="_blank">Download {filename}</a>')

def extract_excel_sheets(file_dict):
    excel_file = pd.ExcelFile(io.BytesIO(file_dict['content']))
    return excel_file.sheet_names

def build_column_rename_fields(columns):
    fields = []
    fx_column_renames.clear()
    for col in columns:
        input_widget = widgets.Text(value=col, description=col, layout=widgets.Layout(width='400px'))
        fx_column_renames[col] = input_widget
        fields.append(input_widget)
    fx_column_rename_box.children = fields

# ========== CALLBACKS ==========

@fx_output.capture()
def load_fx_file(change):
    global fx_raw_file, fx_sheet_names
    fx_output.clear_output()
    fx_column_selector.options = []
    fx_column_rename_box.children = []
    fx_sheet_dropdown.options = []
    fx_sheet_dropdown.visible = False
    fx_raw_file = None
    files = fx_upload_widget.value

    if not files:
        fx_file_label.value = "No FX file uploaded."
        return

    file = files[0]
    fx_file_label.value = f"Uploaded: {file['name']}"
    fx_raw_file = file

    if file['name'].endswith('.xlsx'):
        fx_sheet_names = extract_excel_sheets(file)
        fx_sheet_dropdown.options = fx_sheet_names
        fx_sheet_dropdown.value = fx_sheet_names[0]
        fx_sheet_dropdown.visible = True
        display(widgets.HTML("<b>Select sheet before processing</b>"))
    else:
        # CSV case – preview immediately
        df = pd.read_csv(io.BytesIO(file['content']))
        df.columns = df.columns.str.strip()  # Match the stripping done in process_fx_data
        fx_column_selector.options = list(df.columns)
        build_column_rename_fields(df.columns)
        display(df.head())

@fx_output.capture()
def process_fx_data(change):
    global fx_trade_df
    fx_output.clear_output()
    if not fx_raw_file:
        print("⚠️ No FX file loaded.")
        return

    try:
        if fx_raw_file['name'].endswith('.xlsx'):
            df = pd.read_excel(io.BytesIO(fx_raw_file['content']), sheet_name=fx_sheet_dropdown.value)
        else:
            df = pd.read_csv(io.BytesIO(fx_raw_file['content']))
        
        df.columns = df.columns.str.strip()
        selected_cols = list(fx_column_selector.value)
        if selected_cols:
            df = df[selected_cols]

        renamed_cols = {col: w.value for col, w in fx_column_renames.items() if col in df.columns and w.value}
        df.rename(columns=renamed_cols, inplace=True)

        fx_trade_df = df

        print("✅ FX Data Processed:")
        display(fx_trade_df.head())
        display(generate_download_link(fx_trade_df, "processed_fx_data.csv"))

    except Exception as e:
        print(f"❌ Error processing FX file: {e}")

def clear_fx_data(change):
    global fx_raw_file, fx_trade_df
    fx_upload_widget.value = ()
    fx_file_label.value = "No FX file uploaded."
    fx_column_selector.options = []
    fx_column_rename_box.children = []
    fx_output.clear_output()
    fx_raw_file = None
    fx_trade_df = pd.DataFrame()

# ========== EVENTS ==========

fx_upload_widget.observe(load_fx_file, names='value')
clear_fx_btn.on_click(clear_fx_data)
process_fx_btn.on_click(process_fx_data)
fx_column_selector.observe(lambda change: build_column_rename_fields(change['new']), names='value')
# Handle sheet selection (only needed for Excel)
def handle_sheet_selection(change):
    if not fx_raw_file or not fx_raw_file['name'].endswith('.xlsx'):
        return
    try:
        df = pd.read_excel(io.BytesIO(fx_raw_file['content']), sheet_name=change['new'])
        df.columns = df.columns.str.strip()
        fx_column_selector.options = list(df.columns)
        build_column_rename_fields(df.columns)
        display(df.head())
    except Exception as e:
        fx_output.clear_output()
        print(f"❌ Failed to load sheet: {e}")

fx_sheet_dropdown.observe(handle_sheet_selection, names='value')

# ========== DISPLAY UI ==========

print("📥 FX Upload with Sheet Selector + Column Mapping")
display(fx_controls, fx_file_label, fx_sheet_dropdown, fx_column_selector, fx_column_rename_box, fx_output)


📥 FX Upload with Sheet Selector + Column Mapping



In [None]:
# %%
# === Imports ===
import io
import base64
import pandas as pd
import ipywidgets as widgets
from IPython.display import display

# %%
# === Globals ===
bank_raw_files = []
bank_dfs = {}

# File table widget
uploaded_file_table = widgets.VBox()

# %%
# === Helpers ===
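def normalize_bank_key(name):
    """Builds a bank key from a file name, e.g. 'NCBA KES.csv' -> 'ncba_kes'.

    Note: this redefinition replaces the alias-based normalize_bank_key from Section 2.
    """
    return name.strip().lower().replace(' ', '_').replace('.csv', '').replace('.xlsx', '')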

def generate_download_link(df, filename):
    csv_data = df.to_csv(index=False)
    b64 = base64.b64encode(csv_data.encode()).decode()
    return widgets.HTML(value=f'<a download="{filename}" href="data:text/csv;base64,{b64}" target="_blank">Download {filename}</a>')

def extract_excel_sheets(file_dict):
    excel_file = pd.ExcelFile(io.BytesIO(file_dict['content']))
    return excel_file.sheet_names

def build_bank_file_ui(file_dict):
    name = file_dict['name']
    content = file_dict['content']
    file_size_kb = len(content) / 1024
    file_box = widgets.VBox()
    file_label = widgets.HTML(f"<b>🧾 {name}</b> <span style='color:gray'>({file_size_kb:.1f} KB)</span>")
    
    dropdown = widgets.Dropdown(description="Sheet:", visible=False, layout=widgets.Layout(width="300px"))
    column_selector = widgets.SelectMultiple(description="Columns:", layout=widgets.Layout(width="300px"))
    rename_fields_box = widgets.VBox()
    rename_fields = {}

    preview_output = widgets.Output(layout={'border': '1px solid lightgray', 'padding': '5px'})
    df = None

    if name.endswith('.xlsx'):
        try:
            sheet_names = extract_excel_sheets(file_dict)
            dropdown.options = sheet_names
            dropdown.value = sheet_names[0]
            dropdown.visible = True
        except Exception as e:
            file_label.value += f" ❌ <span style='color:red'>Error reading Excel sheets: {e}</span>"

    def try_read_csv(bytes_obj):
        # 'ISO-8859-1' is an alias of latin1, which decodes any byte sequence.
        encodings = ['utf-8', 'latin1']
        for enc in encodings:
            try:
                return pd.read_csv(io.BytesIO(bytes_obj), encoding=enc)
            except Exception:
                continue
        raise ValueError("Failed to decode CSV using common encodings.")

    try:
        if name.endswith('.xlsx'):
            df = pd.read_excel(io.BytesIO(content), sheet_name=dropdown.value)
        else:
            df = try_read_csv(content)

        df.columns = df.columns.str.strip()
        column_selector.options = list(df.columns)

        for col in df.columns:
            input_widget = widgets.Text(value=col, description=col, layout=widgets.Layout(width='400px'))
            rename_fields[col] = input_widget
        rename_fields_box.children = list(rename_fields.values())

        with preview_output:
            print("📊 Preview:")
            display(df.head())

    except Exception as e:
        file_label.value += f" ❌ <span style='color:red'>Error reading file: {e}</span>"

    bank_raw_files.append({
        'file_dict': file_dict,
        'dropdown': dropdown,
        'column_selector': column_selector,
        'rename_fields': rename_fields_box,
        'rename_map': rename_fields,
        'df_preview': df,
        'key': normalize_bank_key(name)
    })

    file_box.children = [file_label, dropdown, column_selector, rename_fields_box, preview_output]
    return file_box

# %%
# Widgets
bank_upload_widget = widgets.FileUpload(accept='.csv,.xlsx', multiple=True, description='Upload Bank Statement(s)')
clear_bank_btn = widgets.Button(description='Clear Bank Uploads', button_style='danger')
process_bank_btn = widgets.Button(description='Process Bank Data', button_style='success')
bank_file_label = widgets.Label(value="No bank files uploaded.")
bank_output = widgets.Output()
bank_file_boxes = widgets.VBox()

# Controls group
bank_controls = widgets.HBox([bank_upload_widget, clear_bank_btn, process_bank_btn])

# %%
def load_bank_files(change):
    bank_output.clear_output()
    bank_file_boxes.children = []
    bank_raw_files.clear()

    files = bank_upload_widget.value

    # Normalize file input
    normalized_files = []
    if isinstance(files, dict):  # ipywidgets 7.x: dict keyed by file name
        for name, meta in files.items():
            normalized_files.append({
                'name': name,
                'content': meta['content']
            })
    elif isinstance(files, (list, tuple)):  # ipywidgets 8.x: tuple of file dicts
        for file_obj in files:
            normalized_files.append({
                'name': file_obj['name'],
                'content': file_obj['content']
            })

    if normalized_files:
        uploaded_file_table.children = []  # Reset table
        header_row = widgets.HBox([
            widgets.HTML("<b>File</b>", layout=widgets.Layout(width="40%")),
            widgets.HTML("<b>Size (KB)</b>", layout=widgets.Layout(width="15%")),
            widgets.HTML("<b>Type</b>", layout=widgets.Layout(width="15%")),
            widgets.HTML("<b>Action</b>", layout=widgets.Layout(width="20%"))
        ])
        table_rows = [header_row]
        names = []

        for file_dict in normalized_files:
            box = build_bank_file_ui(file_dict)
            bank_file_boxes.children += (box,)
            name = file_dict['name']
            content = file_dict['content']
            size_kb = len(content) / 1024
            file_ext = name.split('.')[-1].lower()
            file_icon = "📊" if file_ext == "xlsx" else "📄"
            file_type = "Excel" if file_ext == "xlsx" else "CSV"
            names.append(name)

            remove_btn = widgets.Button(
                description="Remove",
                button_style="danger",
                layout=widgets.Layout(width="90px", height="30px")
            )

            def make_remove_callback(name_to_remove):
                def _remove(_):
                    bank_file_boxes.children = tuple(
                        box for box in bank_file_boxes.children
                        if not any(name_to_remove in str(child) for child in box.children)
                    )
                    bank_raw_files[:] = [
                        entry for entry in bank_raw_files if entry['file_dict']['name'] != name_to_remove
                    ]
                    uploaded_file_table.children = tuple(
                        row for row in uploaded_file_table.children
                        if name_to_remove not in str(row.children[0].value)
                    )
                    if not bank_raw_files:
                        bank_file_label.value = "No bank files uploaded."
                        uploaded_file_table.children = [header_row]
                return _remove

            remove_btn.on_click(make_remove_callback(name))

            row = widgets.HBox([
                widgets.HTML(f"{file_icon} {name}", layout=widgets.Layout(width="40%")),
                widgets.HTML(f"{size_kb:.1f}", layout=widgets.Layout(width="15%")),
                widgets.HTML(file_type, layout=widgets.Layout(width="15%")),
                remove_btn
            ])
            table_rows.append(row)

        uploaded_file_table.children = table_rows
        bank_file_label.value = f"Uploaded: {', '.join(names)}"
    else:
        bank_file_label.value = "No bank files uploaded."
        uploaded_file_table.children = []

# %%
def clear_bank_files(change):
    global bank_raw_files, bank_dfs
    bank_upload_widget.value = ()
    bank_upload_widget._counter = 0
    bank_raw_files.clear()
    bank_dfs.clear()
    bank_file_label.value = "No bank files uploaded."
    uploaded_file_table.children = []
    bank_file_boxes.children = []
    bank_output.clear_output()

# %%
def process_bank_files(change):
    global bank_dfs
    bank_output.clear_output()
    if not bank_raw_files:
        with bank_output:
            print("⚠️ No bank files to process.")
            return

    for entry in bank_raw_files:
        file = entry['file_dict']
        sheet = entry['dropdown'].value if file['name'].endswith('.xlsx') else None
        selected_cols = list(entry['column_selector'].value)
        rename_map = {col: widget.value for col, widget in entry['rename_map'].items() if widget.value}

        try:
            if file['name'].endswith('.xlsx'):
                df = pd.read_excel(io.BytesIO(file['content']), sheet_name=sheet)
            else:
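                # Mirror the preview's encoding fallback so a file that previewed also processes.
                try:
                    df = pd.read_csv(io.BytesIO(file['content']))
                except UnicodeDecodeError:
                    df = pd.read_csv(io.BytesIO(file['content']), encoding='latin1')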

            df.columns = df.columns.str.strip()
            if selected_cols:
                df = df[selected_cols]
            df.rename(columns=rename_map, inplace=True)
            key = entry['key']
            bank_dfs[key] = df

            with bank_output:
                print(f"✅ Processed {file['name']}:")
                display(df.head())
                display(generate_download_link(df, f"{key}_processed.csv"))
        except Exception as e:
            with bank_output:
                print(f"❌ Error processing {file['name']}: {e}")

# %%
# Register event handlers
bank_upload_widget.observe(load_bank_files, names='value')
clear_bank_btn.on_click(clear_bank_files)
process_bank_btn.on_click(process_bank_files)

# %%
# Display UI
print("🏦 Enhanced Bank Upload Interface:")
display(bank_controls, bank_file_label, uploaded_file_table, bank_file_boxes, bank_output)


🏦 Enhanced Bank Upload Interface:



## 4. Core Matching Logic
This section contains the main function to process FX trade records and attempt to match them against bank statements.
This logic ensures consistent comparison between trade records and bank entries.
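
The core comparison, a date window combined with an amount tolerance, can be sketched on a toy statement as follows. This is a simplified illustration rather than the notebook's function: `find_match`, its parameters, and the sample data are hypothetical, and currency conversion is left out.

```python
import pandas as pd


def find_match(statement, trade_date, trade_amount, amount_col,
               date_col="Value Date", date_tolerance_days=3, amount_tolerance=1.0):
    """Return the index label of the first statement row matching the trade, or None."""
    # Unparseable dates and amounts become NaT / NaN and never match.
    dates = pd.to_datetime(statement[date_col], errors="coerce").dt.normalize()
    amounts = pd.to_numeric(
        statement[amount_col].astype(str).str.replace(",", "", regex=False),
        errors="coerce",
    )
    target = pd.Timestamp(trade_date).normalize()
    window = pd.Timedelta(days=date_tolerance_days)
    in_window = dates.between(target - window, target + window)
    close_amount = (amounts - trade_amount).abs() <= amount_tolerance
    hits = statement.index[in_window & close_amount]
    return hits[0] if len(hits) else None


statement = pd.DataFrame({
    "Value Date": ["2025-07-09", "2025-07-12", "2025-07-30"],
    "Credit": ["5,000.00", "14,500.40", "14,500.00"],
})
print(find_match(statement, "2025-07-11", 14500.0, "Credit"))
```

Both conditions must hold for the same row: the third row has the right amount but falls outside the date window, so only the second row qualifies.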

In [None]:
date_tolerance_slider = widgets.IntSlider(
    value=3,
    min=0,
    max=7,
    step=1,
    description='Date Tolerance (± days):',
    continuous_update=False
)

display(date_tolerance_slider)


In [None]:
def process_fx_match(
    fx_row: pd.Series,
    all_bank_dfs: dict,
    unmatched_list: list,
    matched_list: list,
    action_type: str,
    fx_amount_field: str,
    bank_currency_info_field: str,
    is_sell_side: bool,
    date_tolerance_days: int = 3  # <-- default if not passed

) -> bool:
    print(f"\n🔍 Processing FX row: {fx_row.to_dict()}")

    amount = safe_float(fx_row.get(fx_amount_field))
    if amount is None or action_type not in ['Bank Buy', 'Bank Sell']:
        print("❌ Skipping row due to invalid amount or action type")
        return False



    # Coerce missing or unparseable dates to NaT. pd.NaT is itself an instance of
    # datetime, so an isinstance check alone would not reject it.
    parsed_date = pd.to_datetime(fx_row.get('Created At'), errors='coerce')

    if pd.isna(parsed_date):
        print("❌ Skipping row due to invalid or missing date")
        return False

    formatted_date_slash = parsed_date.strftime('%d/%m/%Y')
    formatted_date_dot = parsed_date.strftime('%d.%m.%Y')

    print(f"Bank currency info field : {bank_currency_info_field}")

    counterparty_raw = str(fx_row.get(bank_currency_info_field, '')).strip()
    parts = counterparty_raw.split('-')
    if len(parts) < 2:
        print(f"❌ Invalid bank-currency format: {counterparty_raw}")
        return False

    trade_bank_name = parts[0].strip()
    trade_currency = parts[1].strip().upper()

    normalized_trade_bank_key_prefix = normalize_bank_key(trade_bank_name)
    trade_bank_key_full = f"{normalized_trade_bank_key_prefix} {trade_currency}".lower()

    print(f"🔍 Matching against bank key: {trade_bank_key_full}")

    found_match = False
    target_bank_df_key = None


    for bank_df_key_in_dict in all_bank_dfs.keys():
        # Statement keys come from file names and look like "<bank>_<currency>".
        key_parts = bank_df_key_in_dict.split('_')
        key_bank = key_parts[0]
        key_currency = key_parts[1].upper() if len(key_parts) > 1 else ''
        print(f"👉 Checking bank key: {bank_df_key_in_dict} against {trade_bank_key_full}")

        if trade_bank_key_full.startswith(key_bank) and trade_currency == key_currency:
            target_bank_df_key = bank_df_key_in_dict
            print(f"✅ Exact prefix match found with {target_bank_df_key}")
            break
        elif fuzz.ratio(trade_bank_key_full, bank_df_key_in_dict) >= FUZZY_MATCH_THRESHOLD:
            target_bank_df_key = bank_df_key_in_dict
            print(f"✅ Fuzzy match found with {target_bank_df_key}")
            break

    if not target_bank_df_key:
        print("❌ No matching bank statement found")
        unmatched_list.append({
            'Date': parsed_date.strftime('%Y-%m-%d'),
            'Bank Table (Expected)': trade_bank_key_full,
            'Action Type': action_type,
            'Amount': amount,
            'Status': 'No Bank Statement Found',
            'Source Column': bank_currency_info_field
        })
        return False

    bank_df = all_bank_dfs[target_bank_df_key]
    bank_df_columns = bank_df.columns.tolist()
 
    date_column = resolve_date_column(bank_df_columns)
    amount_column = resolve_amount_column(bank_df_columns, action_type, is_sell_side)

    print(f"📅 Using date column: {date_column} | 💰 Using amount column: {amount_column}")

    if not date_column or not amount_column:
        print("❌ Missing date or amount column in bank data")
        unmatched_list.append({
            'Date': parsed_date.strftime('%Y-%m-%d'),
            'Bank Table (Expected)': target_bank_df_key,
            'Action Type': action_type,
            'Amount': amount,
            'Status': 'Missing Date/Amount Column in Bank Statement',
            'Source Column': bank_currency_info_field
        })
        return False

       
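    # Keep the calendar date as parsed. A '%Y-%d-%m' round trip swaps day and month;
    # if 'Created At' is day-first, parse it with dayfirst=True instead.
    p_date = pd.Timestamp(parsed_date).normalize()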
    bank_df['_ParsedDate'] = bank_df[date_column].apply(parse_date)
  
    valid_dates_df = bank_df[bank_df['_ParsedDate'].notna()]
    date_matches = valid_dates_df[
        valid_dates_df['_ParsedDate'].dt.date.between(
            p_date.date() - pd.Timedelta(days=date_tolerance_days),
            p_date.date() + pd.Timedelta(days=date_tolerance_days)
        )
    ]


    print(f"🔎 Found {len(date_matches)} date matches in bank statement")

    # Statement keys use underscores ("<bank>_<currency>"), so split on '_'.
    bank_statement_currency = target_bank_df_key.split('_')[-1].upper()

    for idx, bank_row in date_matches.iterrows():
        bank_amt_raw = bank_row.get(amount_column)
        bank_amt = safe_float(bank_amt_raw)

        print("\nbank amount raw : ", bank_amt_raw, " : converted amount : ", bank_amt ,"\n " )

        if bank_amt is not None:
            converted_amount = convert_currency(amount, trade_currency, bank_statement_currency, parsed_date)
            print(f"🔄 Comparing bank {bank_amt} to FX {converted_amount} (converted)")

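            # A match requires the bank amount to agree with the converted trade amount
            # within a tolerance of 1.0 currency unit.
            if converted_amount is not None and abs(bank_amt - converted_amount) <= 1.0: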
                print("✅ Match found!")
                matched_list.append({
                    'Date': parsed_date.strftime('%Y-%m-%d'),
                    'Bank Table': target_bank_df_key,
                    'Action Type': action_type,
                    'Trade Amount': amount,
                    'Trade Currency': trade_currency,
                    'Bank Statement Amount': bank_amt,
                    'Bank Statement Currency': bank_statement_currency,
                    'Converted Trade Amount': converted_amount,
                    'Matched In Column': amount_column,
                    'Date Column Used': date_column,
                    'Source Column': bank_currency_info_field
                })
                found_match = True
                break

    if not found_match:
        print("❌ No match found in statement rows")
        unmatched_list.append({
            'Date': parsed_date.strftime('%Y-%m-%d'),
            'Bank Table (Expected)': target_bank_df_key,
            'Action Type': action_type,
            'Amount': amount,
            'Status': 'Not Found in Bank Statement (Amount or No Match)',
            'Source Column': bank_currency_info_field
        })

    return found_match
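
The way a trade's "Bank-Currency" field is resolved to a statement key can be illustrated with the stand-alone sketch below. It rests on assumptions: statement keys are taken to follow a `<bank>_<currency>` file-name convention, and `split_bank_currency` and `find_statement_key` are hypothetical helpers. Unlike the function above, the sketch splits on the last hyphen so that bank names containing a hyphen stay intact.

```python
def split_bank_currency(value):
    """Split 'NCBA Bank-KES' into ('ncba bank', 'KES'); return None if malformed."""
    bank, sep, currency = str(value).strip().rpartition("-")
    if not sep or not bank.strip() or not currency.strip():
        return None
    return bank.strip().lower(), currency.strip().upper()


def find_statement_key(bank, currency, statement_keys):
    """Pick the first '<bank>_<currency>' key whose parts agree with the trade."""
    for key in statement_keys:
        parts = key.lower().split("_")
        if len(parts) < 2:
            continue  # keys without a currency suffix cannot be matched
        if bank.startswith(parts[0]) and parts[-1] == currency.lower():
            return key
    return None


parsed = split_bank_currency("NCBA Bank-KES")
print(parsed)
print(find_statement_key(parsed[0], parsed[1], ["kcb_usd", "ncba_usd", "ncba_kes"]))
```

Requiring both the bank prefix and the currency to agree avoids pairing a KES trade with the same bank's USD statement.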


## 5. Execution and Processing
This section orchestrates the matching process using the loaded data. It will generate lists of matched and unmatched transactions.
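
Once the matched and unmatched lists exist, turning them into CSV text and a short summary might look like the sketch below. `results_to_csv` and `summarize` are hypothetical helpers and the record is made-up sample data; the column names mirror a subset of the keys used in the result dictionaries.

```python
import pandas as pd


def results_to_csv(records, columns):
    """Serialize a list of result dicts to CSV text with a fixed column order."""
    # Passing columns explicitly keeps the header even when records is empty.
    frame = pd.DataFrame(records, columns=columns)
    return frame.to_csv(index=False)


def summarize(matched, unmatched):
    """Count matched and unmatched records and compute the match rate."""
    total = len(matched) + len(unmatched)
    rate = len(matched) / total if total else 0.0
    return {"matched": len(matched), "unmatched": len(unmatched), "match_rate": rate}


matched = [{"Date": "2025-07-11", "Amount": 14500.0}]
unmatched = []
print(results_to_csv(matched, ["Date", "Amount"]))
print(summarize(matched, unmatched))
```

Fixing the column order up front means an empty result list still produces a CSV with headers, which is easier for downstream tools to consume than an empty file.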

In [None]:
buy_match_count = 0
sell_match_count = 0
unmatched_buy = []
unmatched_sell = []
unmatched_bank_records = []
matched_buy = []
matched_sell = []
matched_set = set() # To track matched bank records to avoid double counting for bank-only unmatched

if not fx_trade_df.empty and bank_dfs:
    print("\n--- Starting Reconciliation Process ---")


    # Ensure column names are stripped of whitespace for consistent access
    fx_trade_df.columns = fx_trade_df.columns.str.strip()

    # Process each row in the FX trade tracker
    for index, row in fx_trade_df.iterrows():
        action_type = str(row.get('Action Type', '')).strip()
        status = str(row.get('Status', '')).strip().lower()

        if status == 'cancelled':
            continue

        # Process Buy Side (Counterparty Payment)
        # Note: 'Buy Trade Info' is assumed to contain 'Bank-Currency' for the buy-side transaction.
        # This means we are buying foreign currency, which implies a local currency payment OUT (debit)
        if process_fx_match(
            row,
            bank_dfs,
            unmatched_buy,
            matched_buy,
            action_type,
            'Buy Currency Amount', # This is the local currency amount in the FX trade
            'Buy Trade Info', # This column holds "Bank-Currency" for the local currency account
            False, # Not a sell-side currency for bank account impact perspective
            date_tolerance_days=date_tolerance_slider.value  # 👈 Get live slider value

        ):
            buy_match_count += 1
            bank_info = row.get('Buy Trade Info')
            buy_amount_fx = safe_float(row.get('Buy Currency Amount'))
            p_createdAt = parse_date(row.get('Created At'))
            if bank_info and buy_amount_fx is not None and isinstance(p_createdAt, datetime):
                parts = bank_info.lower().split('-')
                if len(parts) >= 2:
                    key_bank = normalize_bank_key(parts[0].strip())
                    key_currency = parts[1].strip().upper()
                    # Record this trade in matched_set so the bank-only scan can skip the bank record.
                    # The key uses the trade amount, rounded to 2 decimals like the scan's lookup key.
                    matched_entry = next((item for item in matched_buy if item['Date'] == p_createdAt.strftime('%Y-%m-%d') and item['Trade Amount'] == buy_amount_fx), None)
                    if matched_entry:
                        # Single quotes inside the f-string keep this valid on Python versions before 3.12
                        account_key = f"{key_bank.replace('_bank', '')} {key_currency}"
                        matched_set.add((account_key, round(buy_amount_fx, 2), matched_entry['Date']))


        # Process Sell Side (Choice Payment)
        # Note: 'Sell Trade Info' is assumed to contain 'Bank-Currency' for the sell-side transaction.
        # This means we are selling foreign currency, which implies a foreign currency payment OUT (debit)
        sell_amount_fx = safe_float(row.get('Sell Currency Amount'))
        if process_fx_match(
            row,
            bank_dfs,
            unmatched_sell,
            matched_sell,
            action_type,
            'Sell Currency Amount', # This is the foreign currency amount in the FX trade
            'Sell Trade Info', # This column holds "Bank-Currency" for the foreign currency account
            True, # This is a sell-side currency for bank account impact perspective
            date_tolerance_days=date_tolerance_slider.value  # Same tolerance as the buy side
        ):
            sell_match_count += 1
            bank_info = row.get('Sell Trade Info')
            p_createdAt = parse_date(row.get('Created At'))

            if bank_info and sell_amount_fx is not None and isinstance(p_createdAt, datetime):
                parts = bank_info.lower().split('-')
                if len(parts) >= 2:
                    key_bank = normalize_bank_key(parts[0].strip())
                    key_currency = parts[1].strip().upper()
                    # Unlike the buy side, key on the amount recorded from the bank statement
                    matched_entry = next((item for item in matched_sell if item['Date'] == p_createdAt.strftime('%Y-%m-%d') and item['Trade Amount'] == sell_amount_fx), None)
                    if matched_entry:
                        # Single quotes inside the f-string keep this valid on Python versions before 3.12
                        account_key = f"{key_bank.replace('_bank', '')} {key_currency}"
                        matched_set.add((account_key, matched_entry['Bank Statement Amount'], matched_entry['Date']))


    # Scan for unmatched bank records (records present in bank statements but not in FX trades).
    # The scan is indented under the `if` above so that the closing `else` pairs with that `if`
    # rather than with the `for` loop (a for/else branch would run after every completed scan).
    for bank_key, bank_df in bank_dfs.items():
        print(f"Scanning bank table: {bank_key} for unmatched records...")
        bank_df.columns = bank_df.columns.str.strip()  # Clean column names
        date_col = resolve_date_column(bank_df.columns.tolist())
        amount_cols = get_amount_columns(bank_df.columns.tolist())
        description_col = get_description_columns(bank_df.columns.tolist())

        if not date_col or not amount_cols or not description_col:
            print(f"Skipping {bank_key}: Missing essential columns (Date, Amount, or Description).")
            continue

        # The code below assumes bank table keys look like '<bank>_<currency>'
        parts = bank_key.lower().split('_')
        if len(parts) < 2:
            print(f"Skipping {bank_key}: cannot derive bank and currency from the table name.")
            continue
        # Drop any '_bank' suffix, mirroring how keys are added to matched_set
        key_bank = normalize_bank_key(parts[0].strip()).replace('_bank', '')
        key_currency = parts[1].strip().upper()

        # Ensure date column is parsed to datetime
        bank_df['_ParsedDate'] = bank_df[date_col].apply(parse_date)

        for idx, row in bank_df.iterrows():
            row_date_parsed = row.get('_ParsedDate')
            if not isinstance(row_date_parsed, datetime):
                continue  # Skip invalid date

            description = str(row.get(description_col, '')).strip()

            for amt_col in amount_cols:
                amt_val = safe_float(row.get(amt_col))
                if amt_val is None or abs(amt_val) < 0.01:
                    continue  # Skip zero or missing amounts

                # Round amount to 2 decimal places for matching
                rounded_amt = round(amt_val, 2)

                # Build the lookup key in the same format as in matched_set,
                # including the '%Y-%m-%d' date format used for matched entries
                match_key = (f"{key_bank} {key_currency}", rounded_amt, row_date_parsed.strftime('%Y-%m-%d'))

                if match_key in matched_set:
                    print("✅ Found match:", match_key)
                else:
                    print("❌ No match found. Adding to unmatched_bank_records:", match_key)
                    unmatched_bank_records.append({
                        'Bank Table': bank_key,
                        'Date': row_date_parsed.strftime('%Y-%m-%d'),
                        'Description': description,
                        'Transaction Type (Column)': amt_col,
                        'Amount': rounded_amt
                    })

                # Optional: break if only one amount per row should be matched
                # break

else:
    print("Please upload both FX Trade Tracker and Bank Statement CSV files before running this section.")

# Convert lists to DataFrames for easier analysis
unmatched_buy_df = pd.DataFrame(unmatched_buy)
unmatched_sell_df = pd.DataFrame(unmatched_sell)
matched_buy_df = pd.DataFrame(matched_buy)
matched_sell_df = pd.DataFrame(matched_sell)
unmatched_bank_df = pd.DataFrame(unmatched_bank_records)
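
The bank-only scan is, in effect, an anti-join: keep every bank line that has no counterpart among the matched trades. As an alternative to the set lookup used in this notebook, the same idea can be sketched with `DataFrame.merge` and `indicator=True`. The frames, column names, and account labels below are hypothetical and assume the amounts were already rounded and the dates already formatted identically on both sides.

```python
import pandas as pd

# Hypothetical, already-normalised frames: one row per bank line / matched trade
bank_lines = pd.DataFrame({
    'Account': ['alpha USD', 'alpha USD', 'beta EUR'],
    'Amount': [1500.00, 250.00, 980.50],
    'Date': ['2025-07-11', '2025-07-11', '2025-07-12'],
})
matched_trades = pd.DataFrame({
    'Account': ['alpha USD'],
    'Amount': [1500.00],
    'Date': ['2025-07-11'],
})

# A left merge with indicator=True tags each bank line with where it was found
merged = bank_lines.merge(
    matched_trades.drop_duplicates(),
    on=['Account', 'Amount', 'Date'],
    how='left',
    indicator=True,
)

# 'left_only' rows exist in the bank statement but in no matched trade
bank_only = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')
print(bank_only)
```

Merging on a float column requires exact equality, so this approach only behaves like the set lookup if both sides round amounts the same way first.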

## 6. Export Results
This section exports the matched and unmatched records to CSV files for further review.
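
Each export in this section follows the same pattern: write the CSV if the DataFrame has rows, otherwise report that there is nothing to export. A minimal sketch of that pattern as a reusable helper is shown here. `export_if_any`, the demo file names, and the temporary directory are hypothetical and are not used elsewhere in this notebook.

```python
import os
import tempfile
import pandas as pd

def export_if_any(df: pd.DataFrame, path: str, label: str) -> bool:
    """Write df to CSV only when it has rows; return True if a file was written."""
    if df.empty:
        print(f"No {label} to export.")
        return False
    df.to_csv(path, index=False)
    print(f"Exported {label} to: {path}")
    return True

# Hypothetical results standing in for the reconciliation output
demo_dir = tempfile.mkdtemp()
results = {
    'unmatched Buy side FX records': (pd.DataFrame({'Amount': [100.0]}), 'UnmatchedDemo.csv'),
    'matched Buy side FX records': (pd.DataFrame(), 'MatchedDemo.csv'),
}
written = {
    label: export_if_any(df, os.path.join(demo_dir, name), label)
    for label, (df, name) in results.items()
}
print(written)
```

Returning a boolean makes it easy to report afterwards which files were actually produced.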

In [None]:
# Export unmatched FX (Buy Side)
if not unmatched_buy_df.empty:
    unmatched_buy_df.to_csv(out_csv_path_buy_unmatched, index=False)
    print(f"Exported unmatched Buy side FX records to: {out_csv_path_buy_unmatched}")
else:
    print("No unmatched Buy side FX records to export.")

# Export unmatched FX (Sell Side)
if not unmatched_sell_df.empty:
    unmatched_sell_df.to_csv(out_csv_path_sell_unmatched, index=False)
    print(f"Exported unmatched Sell side FX records to: {out_csv_path_sell_unmatched}")
else:
    print("No unmatched Sell side FX records to export.")

# Export matched FX (Buy Side)
if not matched_buy_df.empty:
    matched_buy_df.to_csv(out_csv_path_buy_matched, index=False)
    print(f"Exported matched Buy side FX records to: {out_csv_path_buy_matched}")
else:
    print("No matched Buy side FX records to export.")

# Export matched FX (Sell Side)
if not matched_sell_df.empty:
    matched_sell_df.to_csv(out_csv_path_sell_matched, index=False)
    print(f"Exported matched Sell side FX records to: {out_csv_path_sell_matched}")
else:
    print("No matched Sell side FX records to export.")

# Export bank-only unmatched records
if not unmatched_bank_df.empty:
    unmatched_bank_df.to_csv(out_csv_path_bank_unmatched, index=False)
    print(f"Exported unmatched bank records to: {out_csv_path_bank_unmatched}")
else:
    print("No unmatched bank records to export.")

## 7. Data Science Analysis and Visualizations
This section provides a deeper dive into the reconciliation results using `pandas` for analysis and `seaborn` for visually appealing graphs.
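
Match percentages need a guard against an empty denominator. The sketch below shows one way to compute them, using matched plus unmatched as the number of attempted matches; this is a design choice made for the example, not the denominator this notebook uses. `match_rate` and the counts are hypothetical.

```python
from typing import Optional
import pandas as pd

def match_rate(matched: int, unmatched: int) -> Optional[float]:
    """Percentage of attempted matches that succeeded, or None if nothing was attempted."""
    attempted = matched + unmatched
    return None if attempted == 0 else round(matched / attempted * 100, 2)

# Hypothetical counts; in practice these would come from len() of the result frames
counts = {'Buy': (8, 2), 'Sell': (0, 0)}
summary = pd.DataFrame(
    [
        {'Side': side, 'Matched': m, 'Unmatched': u, 'Match Rate (%)': match_rate(m, u)}
        for side, (m, u) in counts.items()
    ]
)
print(summary)
```

With this denominator the matched and unmatched percentages for a side always sum to 100, which is not guaranteed when the total comes from a separate filter on the trade table.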

In [None]:
print("\n===== DATA ANALYSIS AND VISUALIZATIONS =====")

# Ensure dataframes exist before attempting analysis
if fx_trade_df.empty and (matched_buy_df.empty and unmatched_buy_df.empty) and \
   (matched_sell_df.empty and unmatched_sell_df.empty) and unmatched_bank_df.empty:
    print("No data available for analysis. Please ensure files are loaded and reconciliation is complete.")
else:
    # --- 7.1. Reconciliation Summary Statistics ---
    print("\n--- 7.1. Reconciliation Summary Statistics ---")
    total_fx_trades = len(fx_trade_df) if not fx_trade_df.empty else 0
    total_buy_side_trades = len(fx_trade_df[fx_trade_df['Action Type'] == 'Bank Buy']) if not fx_trade_df.empty else 0
    total_sell_side_trades = len(fx_trade_df[fx_trade_df['Action Type'] == 'Bank Sell']) if not fx_trade_df.empty else 0

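    # Normalise Status before filtering, matching the case-insensitive check used during processing
    if not fx_trade_df.empty:
        status_normalized = fx_trade_df['Status'].astype(str).str.strip().str.lower()
        excluded_count = int(status_normalized.isin(['cancelled', 'pending']).sum())
    else:
        excluded_count = 0
    print(f"Total FX Trade Records (excluding cancelled/pending): {total_fx_trades - excluded_count}")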
    print(f"Total Buy Side FX Trades processed: {total_buy_side_trades}")
    print(f"Total Sell Side FX Trades processed: {total_sell_side_trades}")
    # When the denominator is zero, still report the actual count instead of a hard-coded 0
    print(f"Buy Side Matched: {len(matched_buy_df)} ({(len(matched_buy_df)/total_buy_side_trades*100):.2f}%)" if total_buy_side_trades > 0 else f"Buy Side Matched: {len(matched_buy_df)} (N/A%)")
    print(f"Buy Side Unmatched: {len(unmatched_buy_df)} ({(len(unmatched_buy_df)/total_buy_side_trades*100):.2f}%)" if total_buy_side_trades > 0 else f"Buy Side Unmatched: {len(unmatched_buy_df)} (N/A%)")
    print(f"Sell Side Matched: {len(matched_sell_df)} ({(len(matched_sell_df)/total_sell_side_trades*100):.2f}%)" if total_sell_side_trades > 0 else f"Sell Side Matched: {len(matched_sell_df)} (N/A%)")
    print(f"Sell Side Unmatched: {len(unmatched_sell_df)} ({(len(unmatched_sell_df)/total_sell_side_trades*100):.2f}%)" if total_sell_side_trades > 0 else f"Sell Side Unmatched: {len(unmatched_sell_df)} (N/A%)")
    print(f"Unmatched Bank Records (not found in FX trades): {len(unmatched_bank_df)}")


    # --- 7.2. Visualizing Reconciliation Status (Buy Side) ---
    if not matched_buy_df.empty or not unmatched_buy_df.empty:
        buy_status_counts = pd.DataFrame({
            'Status': ['Matched Buy', 'Unmatched Buy'],
            'Count': [len(matched_buy_df), len(unmatched_buy_df)]
        })
        plt.figure(figsize=(8, 6))
        sns.barplot(x='Status', y='Count', data=buy_status_counts)
        plt.title('FX Buy Side Reconciliation Status')
        plt.ylabel('Number of Trades')
        plt.show()
    else:
        print("\nNo Buy Side data for reconciliation status visualization.")

    # --- 7.3. Visualizing Reconciliation Status (Sell Side) ---
    if not matched_sell_df.empty or not unmatched_sell_df.empty:
        sell_status_counts = pd.DataFrame({
            'Status': ['Matched Sell', 'Unmatched Sell'],
            'Count': [len(matched_sell_df), len(unmatched_sell_df)]
        })
        plt.figure(figsize=(8, 6))
        sns.barplot(x='Status', y='Count', data=sell_status_counts)
        plt.title('FX Sell Side Reconciliation Status')
        plt.ylabel('Number of Trades')
        plt.show()
    else:
        print("\nNo Sell Side data for reconciliation status visualization.")

    # --- 7.4. Distribution of FX Trade Amounts ---
    if not fx_trade_df.empty:
        plt.figure(figsize=(12, 6))
        plt.subplot(1, 2, 1) # 1 row, 2 columns, first plot
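        # Coerce amounts to numbers first so text values are not plotted as categories
        buy_amounts = pd.to_numeric(fx_trade_df['Buy Currency Amount'].apply(safe_float), errors='coerce').dropna()
        sns.histplot(buy_amounts, kde=True, bins=10)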
        plt.title('Distribution of Buy Currency Amounts (FX Trades)')
        plt.xlabel('Amount')
        plt.ylabel('Frequency')

        plt.subplot(1, 2, 2) # 1 row, 2 columns, second plot
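        # Coerce amounts to numbers first so text values are not plotted as categories
        sell_amounts = pd.to_numeric(fx_trade_df['Sell Currency Amount'].apply(safe_float), errors='coerce').dropna()
        sns.histplot(sell_amounts, kde=True, bins=10, color='orange')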
        plt.title('Distribution of Sell Currency Amounts (FX Trades)')
        plt.xlabel('Amount')
        plt.ylabel('Frequency')
        plt.tight_layout()
        plt.show()
    else:
        print("\nNo FX Trade data for amount distribution visualization.")

    # --- 7.5. Top Unmatched Bank Records by Amount ---
    if not unmatched_bank_df.empty:
        top_unmatched_bank = unmatched_bank_df.sort_values(by='Amount', ascending=False).head(10)
        plt.figure(figsize=(10, 7))
        sns.barplot(x='Amount', y='Bank Table', hue='Transaction Type (Column)', data=top_unmatched_bank, dodge=True)
        plt.title('Top 10 Unmatched Bank Records by Amount')
        plt.xlabel('Amount')
        plt.ylabel('Bank Account')
        plt.tight_layout()
        plt.show()
    else:
        print("\nNo unmatched bank records for top amount visualization.")

    # --- 7.6. Transaction Volume by Bank (Unmatched Bank Records) ---
    if not unmatched_bank_df.empty:
        bank_volume = unmatched_bank_df['Bank Table'].value_counts().reset_index()
        bank_volume.columns = ['Bank Table', 'Count']
        plt.figure(figsize=(10, 6))
        sns.barplot(x='Count', y='Bank Table', data=bank_volume, palette='cubehelix')
        plt.title('Number of Unmatched Transactions per Bank Account')
        plt.xlabel('Number of Unmatched Transactions')
        plt.ylabel('Bank Account')
        plt.show()
    else:
        print("\nNo unmatched bank records for transaction volume visualization.")

    # --- 7.7. Daily Transaction Trend (FX Trades) ---
    if not fx_trade_df.empty and 'Created At' in fx_trade_df.columns:
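        # Parse 'Created At' and filter out rows where it could not be parsed
        parsed_created_at = pd.to_datetime(fx_trade_df['Created At'].apply(parse_date), errors='coerce')
        fx_trades_valid_dates = fx_trade_df[parsed_created_at.notna()].copy()
        # Truncate to the calendar day so trades on the same date are counted together
        fx_trades_valid_dates['DateOnly'] = parsed_created_at[parsed_created_at.notna()].dt.normalize()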
        daily_counts = fx_trades_valid_dates['DateOnly'].value_counts().sort_index().reset_index()
        daily_counts.columns = ['Date', 'Count']

        if len(daily_counts) > 1: # Only plot trend if there's more than one date
            plt.figure(figsize=(12, 6))
            sns.lineplot(x='Date', y='Count', data=daily_counts, marker='o')
            plt.title('Daily FX Trade Transaction Count')
            plt.xlabel('Date')
            plt.ylabel('Number of Trades')
            plt.xticks(rotation=45)
            plt.tight_layout()
            plt.show()
        else:
            print("\nNot enough date diversity in FX trades for daily trend visualization.")
    else:
        print("\nNo FX Trade data with valid dates for daily trend visualization.")

## 8. Overall Summary
A final comprehensive summary of the reconciliation process.

In [None]:
print("\n===== OVERALL RECONCILIATION SUMMARY =====")
print(f"✅ BUY Side Matches (Counterparty Payment): {len(matched_buy_df)}")
print(f"❌ BUY Side Unmatched: {len(unmatched_buy_df)}")
print(f"✅ SELL Side Matches (Choice Payment): {len(matched_sell_df)}")
print(f"❌ SELL Side Unmatched: {len(unmatched_sell_df)}")
print(f"📤 Bank-only unmatched entries: {len(unmatched_bank_df)}")
print("\nReconciliation process and data analysis complete. Review the generated CSVs and visualizations for insights.")