# OTC Treasury Reconciliation System
### Portfolio Project: Senior Treasury & Finance Operations
**Author:** Gilang Fajar Wijayanto  
**Date:** February 2026

---

## 1. Introduction
In a high-volume OTC (Over-The-Counter) trading environment, reconciling transactions across bank accounts, crypto wallets, and market maker statements is a critical control function. Errors in settlement matching lead to incorrect PnL recognition, tax leakage, and hidden liquidity risks.

This notebook demonstrates a **robust reconciliation engine** that:
1. Normalizes transaction data from disparate sources.
2. Pairs crypto and fiat settlement legs.
3. Recognizes PnL only upon dual-leg confirmation.
4. Flags discrepancies and settlement delays.
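
Steps 1 and 2 can be sketched on toy data as follows. This is a minimal illustration, not the engine itself: the raw column names (`Ref`, `Amount (IDR)`, `Value Date`, `memo`, `qty`, `confirmed_at`), the helper functions, and the `leg_status` labels are all hypothetical.

```python
import pandas as pd

# Hypothetical raw extracts; real bank and wallet exports will differ.
bank_raw = pd.DataFrame({
    "Ref": ["TX-001", "TX-002"],
    "Amount (IDR)": ["1,600,000,000", "820,000,000"],
    "Value Date": ["2024-03-01 10:15", "2024-03-01 14:40"],
})
wallet_raw = pd.DataFrame({
    "memo": ["TX-001", "TX-003"],
    "qty": [100_000.0, 0.5],
    "confirmed_at": ["2024-03-01T09:58:00", "2024-03-02T08:00:00"],
})


def normalize_bank(df: pd.DataFrame) -> pd.DataFrame:
    """Map a bank export onto a common fiat-leg schema (illustrative)."""
    return pd.DataFrame({
        "transaction_id": df["Ref"].str.strip(),
        "fiat_amount_idr": df["Amount (IDR)"].str.replace(",", "", regex=False).astype(float),
        "fiat_settlement_timestamp": pd.to_datetime(df["Value Date"]),
    })


def normalize_wallet(df: pd.DataFrame) -> pd.DataFrame:
    """Map a wallet export onto a common crypto-leg schema (illustrative)."""
    return pd.DataFrame({
        "transaction_id": df["memo"].str.strip(),
        "crypto_amount": df["qty"].astype(float),
        "crypto_settlement_timestamp": pd.to_datetime(df["confirmed_at"]),
    })


# Pair the legs on transaction_id; the outer join keeps unmatched legs visible.
legs = normalize_bank(bank_raw).merge(
    normalize_wallet(wallet_raw), on="transaction_id", how="outer", indicator=True
)
legs["leg_status"] = legs["_merge"].astype(str).map(
    {"both": "BOTH_LEGS", "left_only": "FIAT_ONLY", "right_only": "CRYPTO_ONLY"}
)
print(legs[["transaction_id", "leg_status"]])
```

The outer join is a deliberate choice here: an inner join would silently drop one-sided legs, which are exactly the rows a reconciliation needs to surface.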

### System Architecture
![Architecture Flow](../diagrams/architecture_flow.png)


## 2. Setup and Data Loading
We use `pandas` for data manipulation and `matplotlib` with `seaborn` for visualization, then load the synthetic dataset generated for FY 2024.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

# Set aesthetic style
sns.set_theme(style="whitegrid")
plt.rcParams['figure.figsize'] = [12, 6]

# Define paths (local paths relative to repo)
TX_PATH = "../data/01_transactions.csv"
PNL_PATH = "../data/02_monthly_pnl.csv"
LEDGER_PATH = "../data/03_account_ledger.csv"

# Load datasets
df_tx = pd.read_csv(TX_PATH)
df_pnl = pd.read_csv(PNL_PATH)
df_ledger = pd.read_csv(LEDGER_PATH)

print(f"Loaded {len(df_tx)} transactions.")

## 3. Schema Walkthrough
Every row in the transaction dataset represents a single OTC trade. Key fields include:
- `idr_client_amount`: The total IDR the client pays or receives.
- `idr_mm_amount`: Our cost from the Market Maker.
- `tax_idr`: The 0.21% regulatory tax applied on the client amount.
- `net_pnl_idr`: Our final profit after tax (recognized only if settled).
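
The arithmetic behind these fields might look like the sketch below. It rests on assumptions that the dataset does not confirm: that `direction` takes the values `"BUY"` and `"SELL"` from the client's point of view, that gross PnL is the difference between the client amount and the Market Maker amount, and that the tax is deducted from our side. The function name is hypothetical.

```python
from decimal import Decimal

TAX_RATE = Decimal("0.0021")  # 0.21% of the client amount, per the field list above


def net_pnl_idr(idr_client_amount: int, idr_mm_amount: int, direction: str) -> Decimal:
    """Illustrative net PnL: gross spread minus tax on the client amount."""
    client = Decimal(idr_client_amount)
    mm = Decimal(idr_mm_amount)
    if direction == "BUY":       # assumed: client buys crypto, we collect IDR
        gross = client - mm
    elif direction == "SELL":    # assumed: client sells crypto, we pay IDR
        gross = mm - client
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    tax = client * TAX_RATE
    return gross - tax


# Client pays IDR 1,000,000,000; the Market Maker leg costs IDR 995,000,000.
buy_example = net_pnl_idr(1_000_000_000, 995_000_000, "BUY")
# Client receives IDR 1,000,000,000; the Market Maker leg yields IDR 1,004,000,000.
sell_example = net_pnl_idr(1_000_000_000, 1_004_000_000, "SELL")
print(buy_example, sell_example)
```

`Decimal` is used instead of floats so that the 0.21% rate does not introduce binary rounding noise into IDR amounts.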

In [None]:
df_tx.info()
df_tx[['transaction_id', 'pair', 'direction', 'volume_crypto', 'client_price_idr', 'net_pnl_idr', 'status']].head()

## 4. Exploratory Analysis
Understanding the distribution of trading pairs and transaction statuses helps identify operational bottlenecks.

In [None]:
# Share of transaction count by pair (a count, not a volume weighting)
pair_counts = df_tx['pair'].value_counts(normalize=True) * 100
pair_counts.plot(kind='pie', autopct='%1.1f%%', title="Transaction Count Share by Pair")
plt.ylabel('')
plt.show()

# Status distribution
print("Status Breakdown:")
print(df_tx['status'].value_counts())

## 5. Settlement Matching Logic
An OTC trade is only complete when both the **Crypto Leg** (wallet transfer) and the **Fiat Leg** (bank transfer) are confirmed. We track these via `crypto_settlement_timestamp` and `fiat_settlement_timestamp`.
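
As a rough sketch of how the two timestamps translate into a leg state, the example below treats a missing timestamp as an unconfirmed leg. The toy rows and the state labels (`BOTH_CONFIRMED`, `AWAITING_FIAT`, and so on) are assumptions made for illustration; the dataset's own `status` values may differ.

```python
import numpy as np
import pandas as pd

# Toy rows; None stands for a leg that has not been confirmed yet.
legs_demo = pd.DataFrame({
    "transaction_id": ["TX-001", "TX-002", "TX-003", "TX-004"],
    "crypto_settlement_timestamp": pd.to_datetime(
        ["2024-03-01 09:58", "2024-03-01 11:00", None, None]
    ),
    "fiat_settlement_timestamp": pd.to_datetime(
        ["2024-03-01 10:15", None, "2024-03-02 08:00", None]
    ),
})

crypto_done = legs_demo["crypto_settlement_timestamp"].notna()
fiat_done = legs_demo["fiat_settlement_timestamp"].notna()

# Conditions are evaluated in order, so the dual-leg case must come first.
legs_demo["leg_state"] = np.select(
    [crypto_done & fiat_done, crypto_done, fiat_done],
    ["BOTH_CONFIRMED", "AWAITING_FIAT", "AWAITING_CRYPTO"],
    default="UNSETTLED",
)
print(legs_demo[["transaction_id", "leg_state"]])
```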

In [None]:
# Show dual-entry ledger structure for a single transaction
sample_tx_id = df_tx[df_tx['status'] == 'SETTLED']['transaction_id'].iloc[0]
df_ledger[df_ledger['transaction_id'] == sample_tx_id]

## 6. PnL Recognition Rule
**Rule:** PnL is recognized only when BOTH legs are settled. 
This prevents recognizing "paper profit" before the cash is actually in the bank or the crypto is in the wallet.
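
One way to express the rule in code is to take the later of the two leg timestamps as the recognition time, and to leave it empty while either leg is outstanding. The sketch below assumes this reading; the toy rows and the `recognized_at` and `recognized_pnl_idr` column names are hypothetical and are not taken from the dataset.

```python
import pandas as pd

recog_demo = pd.DataFrame({
    "transaction_id": ["TX-001", "TX-002", "TX-003"],
    "crypto_settlement_timestamp": pd.to_datetime(
        ["2024-03-01 09:58", "2024-03-01 11:00", "2024-03-05 16:00"]
    ),
    "fiat_settlement_timestamp": pd.to_datetime(
        ["2024-03-01 10:15", None, "2024-03-05 09:00"]
    ),
    "net_pnl_idr": [2_900_000, 1_500_000, 800_000],
})

crypto = recog_demo["crypto_settlement_timestamp"]
fiat = recog_demo["fiat_settlement_timestamp"]
both_settled = crypto.notna() & fiat.notna()

# Later of the two legs; comparisons against NaT are False, so the final
# .where(both_settled) is what guarantees NaT for one-sided trades.
later_leg = crypto.where(crypto >= fiat, fiat)
recog_demo["recognized_at"] = later_leg.where(both_settled)

# PnL counts only once both legs are confirmed; otherwise it stays at zero.
recog_demo["recognized_pnl_idr"] = recog_demo["net_pnl_idr"].where(both_settled, 0)
print(recog_demo[["transaction_id", "recognized_at", "recognized_pnl_idr"]])
```

In this toy data, TX-002 carries a computed PnL but contributes nothing to the recognized total because its fiat leg is still open.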

In [None]:
# Convert to datetime
df_tx['trade_date'] = pd.to_datetime(df_tx['trade_date'])
df_tx['pnl_recognition_timestamp'] = pd.to_datetime(df_tx['pnl_recognition_timestamp'])

# Monthly net PnL (bar chart), counting settled trades only
monthly_summary = df_tx[df_tx['status'] == 'SETTLED'].groupby('pnl_recognition_month')['net_pnl_idr'].sum()
monthly_summary.plot(kind='bar', title="Monthly Net PnL (IDR)", color='teal')
plt.show()

## 7. Discrepancy Detection
We flag transactions where the time gap between legs (e.g., bank transfer delay) exceeds the operational SLA (8 hours).

In [None]:
df_tx['crypto_settlement_timestamp'] = pd.to_datetime(df_tx['crypto_settlement_timestamp'])
df_tx['fiat_settlement_timestamp'] = pd.to_datetime(df_tx['fiat_settlement_timestamp'])

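# Absolute gap, so a late crypto leg (negative raw difference) is flagged as well as a late fiat leg
df_tx['settlement_gap_hrs'] = (df_tx['fiat_settlement_timestamp'] - df_tx['crypto_settlement_timestamp']).abs().dt.total_seconds() / 3600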

# Flagging delays > 8 hours
delays = df_tx[df_tx['settlement_gap_hrs'] > 8][['transaction_id', 'pair', 'settlement_gap_hrs']]
print(f"Transactions with settlement delays > 8 hrs: {len(delays)}")
delays.head()
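
The same check could be wrapped in a reusable function that also records which leg lagged, which helps decide whether to chase the bank or the wallet side. This is a sketch on toy rows: the function name and the `lagging_leg` and `sla_breach` columns are hypothetical, and it measures the gap as an absolute value so that either leg can breach the SLA.

```python
import numpy as np
import pandas as pd

SLA_HOURS = 8  # operational SLA stated above


def flag_sla_breaches(df: pd.DataFrame, sla_hours: float = SLA_HOURS) -> pd.DataFrame:
    """Return a copy of df with gap size, lagging leg, and an SLA breach flag."""
    out = df.copy()
    signed_hrs = (
        out["fiat_settlement_timestamp"] - out["crypto_settlement_timestamp"]
    ).dt.total_seconds() / 3600
    out["settlement_gap_hrs"] = signed_hrs.abs()
    # Positive difference: fiat arrived after crypto; negative: the reverse.
    out["lagging_leg"] = np.select(
        [signed_hrs > 0, signed_hrs < 0], ["FIAT", "CRYPTO"], default="NONE"
    )
    # Rows with a missing leg have a NaN gap and are not flagged by this check.
    out["sla_breach"] = out["settlement_gap_hrs"] > sla_hours
    return out


sla_demo = pd.DataFrame({
    "transaction_id": ["TX-001", "TX-002", "TX-003"],
    "crypto_settlement_timestamp": pd.to_datetime(
        ["2024-03-01 09:00", "2024-03-02 20:00", "2024-03-03 10:00"]
    ),
    "fiat_settlement_timestamp": pd.to_datetime(
        ["2024-03-01 19:30", "2024-03-02 08:00", "2024-03-03 12:00"]
    ),
})
flagged = flag_sla_breaches(sla_demo)
print(flagged[["transaction_id", "settlement_gap_hrs", "lagging_leg", "sla_breach"]])
```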

## 9. Volume & Market Maker Analysis
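This section analyzes how trading volume concentrates across pairs and compares the average spread each Market Maker offers per pair (its spread 'generosity'), so that flow can be routed to the Market Maker with the lowest average spread.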

In [None]:
# 1. Volume Contribution per Pair
vol_stats = df_tx.groupby('pair')['idr_client_amount'].sum()
vol_pct = (vol_stats / vol_stats.sum() * 100).sort_values(ascending=False)

plt.figure(figsize=(10, 5))
vol_pct.plot(kind='bar', color=['#26a17b', '#2775ca', '#f7931a', '#d4a843'])
plt.title("Volume Contribution (%) by Pair")
plt.ylabel("Percentage %")
plt.show()

# 2. Market Maker Spread Analysis
mm_stats = df_tx.groupby(['market_maker_name', 'pair'])['spread_bps'].mean().unstack()
print("Average Spread (bps) by MM and Pair:")
display(mm_stats.round(2))

# 3. Optimal Routing Detection
print("\nOptimal Routing Recommendations:")
for pair in mm_stats.columns:
    best_mm = mm_stats[pair].idxmin()
    best_val = mm_stats[pair].min()
    print(f"- {pair}: Route to {best_mm} ({best_val:.2f} bps)")
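
The simple mean above gives a small trade the same weight as a large one. A volume-weighted average is one possible refinement, sketched below on toy data. The Market Maker names, the pair, and the choice of `idr_client_amount` as the weight are assumptions made for illustration.

```python
import pandas as pd

# Toy trades: MM_A quotes a tight spread on its one large trade.
trades = pd.DataFrame({
    "market_maker_name": ["MM_A", "MM_A", "MM_B", "MM_B"],
    "pair": ["USDT/IDR"] * 4,
    "spread_bps": [10.0, 30.0, 18.0, 22.0],
    "idr_client_amount": [9_000_000_000, 1_000_000_000, 5_000_000_000, 5_000_000_000],
})

keys = ["market_maker_name", "pair"]

# Simple mean: both Market Makers come out at 20 bps.
simple_spread = trades.groupby(keys)["spread_bps"].mean().unstack()

# Volume-weighted mean: sum(spread * amount) / sum(amount) per group.
weighted = trades.assign(w_spread=trades["spread_bps"] * trades["idr_client_amount"])
sums = weighted.groupby(keys)[["w_spread", "idr_client_amount"]].sum()
vw_spread = (sums["w_spread"] / sums["idr_client_amount"]).unstack()

print(simple_spread.round(2))
print(vw_spread.round(2))
print("Preferred route:", vw_spread["USDT/IDR"].idxmin())
```

Here the simple means tie, while the weighted view separates the two Market Makers, so the weighting choice can change a routing recommendation.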

## 10. Summary & Next Steps
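By matching both settlement legs, recognizing PnL only after dual-leg confirmation, and flagging settlement gaps beyond the SLA, this system keeps treasury operations transparent and risk-controlled.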

**Production enhancements would include:**
- Automated PDF bank statement scraping.
- Real-time on-chain confirmation alerts.
- Automated hedging execution via Market Maker APIs.