### Finance – Ensuring Accurate Transactions

**Task 1**: Transaction Data Validation Insights

**Objective**: Maintain transaction integrity.

**Steps**:
1. Choose a sample financial transaction dataset.
2. Identify common transaction issues like duplicate entries or incorrect amounts.
3. Develop a list of validation checks specific to financial transactions.

In [5]:
import pandas as pd
from datetime import datetime
from flask import Flask, request, jsonify
import threading

# -----------------------------
# VALIDATION LOGIC (Task 1 & 2)
# -----------------------------

def validate_transactions(df):
    today = pd.Timestamp(datetime.today().date())
    required_cols = ['Transaction_ID', 'Amount', 'Date', 'Status']

    # Robust date conversion with error handling
    try:
        df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
    except Exception as e:
        print(f"Date parsing failed: {e}")

    # Ensure required columns exist
    for col in required_cols:
        if col not in df.columns:
            raise KeyError(f"Missing required column: {col}")

    results = {
        'duplicates': df[df.duplicated(subset=['Account_ID', 'Amount', 'Date'], keep=False)],
        'invalid_debit': df[(df['Transaction_Type'] == 'DEBIT') & (df['Amount'] < 0)],
        'out_of_range': df[(df['Amount'] < 0) | (df['Amount'] > 10000)],
        'missing_fields': df[df[required_cols].isnull().any(axis=1)],
        'future_dates': df[df['Date'] > today],
        'invalid_status': df[~df['Status'].isin(['Completed', 'Pending', 'Failed'])],
        'type_mismatch': df[
            ((df['Transaction_Type'] == 'DEBIT') & (df['Amount'] < 0)) |
            ((df['Transaction_Type'] == 'CREDIT') & (df['Amount'] > 0))
        ],
        'invalid_currency': df[~df['Currency'].isin(['USD', 'EUR', 'GBP'])],
        'duplicate_ids': df[df.duplicated(subset=['Transaction_ID'], keep=False)],
    }

    return results

# -----------------------------
# SAMPLE DATA FOR VALIDATION
# -----------------------------

def run_batch_validation():
    print("\n🔍 Running Batch Validation...\n")

    sample_data = {
        'Transaction_ID': ['TXN001', 'TXN002', 'TXN002', 'TXN004', 'TXN005', None],
        'Date': ['2025-05-01', '2025-05-01', '2025-05-01', '2025-06-20', '2025-05-02', '2026-01-01'],
        'Account_ID': ['AC1234', 'AC1234', 'AC1234', 'AC5678', 'AC5678', 'AC9999'],
        'Amount': [150.00, 150.00, 150.00, -25.00, 20000.00, 0.00],
        'Currency': ['USD', 'USD', 'USD', 'USD', 'USD', 'XYZ'],
        'Transaction_Type': ['DEBIT', 'DEBIT', 'DEBIT', 'CREDIT', 'DEBIT', 'DEBIT'],
        'Merchant': ['Amazon', 'Amazon', 'Amazon', 'Refund', 'Tesla', ''],
        'Status': ['Completed', 'Completed', 'Completed', 'Completed', 'Completed', 'Failed']
    }

    df = pd.DataFrame(sample_data)

    results = validate_transactions(df)

    for issue, result in results.items():
        if not result.empty:
            print(f"❌ {issue.upper()} found:")
            print(result.to_string(index=False), "\n")
        else:
            print(f"✅ {issue.upper()} check passed.\n")

# -----------------------------
# REAL-TIME API VALIDATION
# -----------------------------

app = Flask(__name__)

@app.route('/validate', methods=['POST'])
def validate_transaction_api():
    txn = request.json
    errors = []
    today = datetime.today().date()

    try:
        for field in ['Transaction_ID', 'Amount', 'Date', 'Status']:
            if field not in txn or txn[field] in [None, ""]:
                errors.append(f"{field} is missing.")

        if txn['Transaction_Type'] == 'DEBIT' and txn['Amount'] < 0:
            errors.append("DEBIT amount cannot be negative.")
        if txn['Transaction_Type'] == 'CREDIT' and txn['Amount'] > 0:
            errors.append("CREDIT amount must be negative.")

        if txn['Currency'] not in ['USD', 'EUR', 'GBP']:
            errors.append("Unsupported currency.")

        if txn['Status'] not in ['Completed', 'Pending', 'Failed']:
            errors.append("Invalid status.")

        try:
            txn_date = datetime.strptime(txn['Date'], "%Y-%m-%d").date()
            if txn_date > today:
                errors.append("Transaction date cannot be in the future.")
        except Exception:
            errors.append("Invalid date format. Use YYYY-MM-DD.")
    except Exception as e:
        errors.append(f"Validation failed: {str(e)}")

    return jsonify({'valid': not errors, 'errors': errors})

# -----------------------------
# RUN VALIDATION + API
# -----------------------------

if __name__ == '__main__':
    # Run batch validation in background
    threading.Thread(target=run_batch_validation).start()

    # Start API for real-time validation
    print("\n🚀 Real-time API running at http://localhost:5000/validate")
    app.run(debug=False)



🔍 Running Batch Validation...


🚀 Real-time API running at http://localhost:5000/validate
❌ DUPLICATES found:
Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant    Status
        TXN001 2025-05-01     AC1234   150.0      USD            DEBIT   Amazon Completed
        TXN002 2025-05-01     AC1234   150.0      USD            DEBIT   Amazon Completed
        TXN002 2025-05-01     AC1234   150.0      USD            DEBIT   Amazon Completed 

✅ INVALID_DEBIT check passed.

❌ OUT_OF_RANGE found:
Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant    Status
        TXN004 2025-06-20     AC5678   -25.0      USD           CREDIT   Refund Completed
        TXN005 2025-05-02     AC5678 20000.0      USD            DEBIT    Tesla Completed 

❌ MISSING_FIELDS found:
Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant Status
          None 2026-01-01     AC9999     0.0      XYZ            DEBIT          Failed 

❌ 

 * Running on http://127.0.0.1:5000
[33mPress CTRL+C to quit[0m


**Task 2**: Implement Financial Data Validation

**Objective**: Use automated tools to ensure transaction accuracy.

**Steps**:
1. Integrate data validation rules into your existing financial systems.
2. Ensure real-time checks to validate data upon entry.


Starting Flask API on http://localhost:5000/validate
 * Serving Flask app '__main__'

--- Running Batch Validation ---

❌ Validation Failed: duplicates
Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant    Status
        TXN001 2025-05-01     AC1234   150.0      USD            DEBIT   Amazon Completed
        TXN002 2025-05-01     AC1234   150.0      USD            DEBIT   Amazon Completed
        TXN002 2025-05-01     AC1234   150.0      USD            DEBIT   Amazon Completed 

✅ Passed: invalid_debit
❌ Validation Failed: out_of_range
Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant    Status
        TXN004 2025-06-20     AC5678   -25.0      USD           CREDIT   Refund Completed
        TXN005 2025-05-02     AC5678 20000.0      USD            DEBIT    Tesla Completed 

❌ Validation Failed: missing_fields
 * Debug mode: off


 * Running on http://127.0.0.1:5000
[33mPress CTRL+C to quit[0m


Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant Status
          None 2026-01-01     AC9999     0.0      XYZ            DEBIT          Failed 

❌ Validation Failed: future_dates
Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant    Status
        TXN004 2025-06-20     AC5678   -25.0      USD           CREDIT   Refund Completed
          None 2026-01-01     AC9999     0.0      XYZ            DEBIT             Failed 

✅ Passed: invalid_status
✅ Passed: type_mismatch
❌ Validation Failed: invalid_currency
Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant Status
          None 2026-01-01     AC9999     0.0      XYZ            DEBIT          Failed 

❌ Validation Failed: duplicate_ids
Transaction_ID       Date Account_ID  Amount Currency Transaction_Type Merchant    Status
        TXN002 2025-05-01     AC1234   150.0      USD            DEBIT   Amazon Completed
        TXN002 2025-05-01     AC1234   1