### Finance – Ensuring Accurate Transactions

**Task 1**: Transaction Data Validation Insights

**Objective**: Maintain transaction integrity by identifying common data-quality issues and the checks that catch them.

**Steps**:
1. Choose a sample financial transaction dataset.
2. Identify common transaction issues such as duplicate entries or incorrect amounts.
3. Develop a list of validation checks specific to financial transactions.

In [1]:
import pandas as pd

data = {
    "transaction_id": [101, 102, 103, 103, 104],
    "account_id": [1, 2, 3, 3, 4],
    "transaction_amount": [1000.00, -50.00, 250.00, 250.00, 5000000.00],  # includes a negative and an abnormally high amount
    "transaction_date": ["2025-05-19", "2025-05-19", "2025-05-19", "2025-05-19", "2025-05-19"]
}

df = pd.DataFrame(data)

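# Identify issues
# keep=False flags every row that shares a transaction_id, not just the repeats
duplicate_transactions = df[df.duplicated(subset=["transaction_id"], keep=False)]
# Amounts must be positive and no greater than the 1,000,000 threshold
invalid_amounts = df[(df["transaction_amount"] <= 0) | (df["transaction_amount"] > 1_000_000)]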

# Validation checks list
validation_checks = [
    "Check for duplicate transaction IDs",
    "Ensure transaction amounts are positive and below set thresholds",
    "Validate transaction dates are within expected ranges",
    "Confirm account IDs exist in authorized accounts",
    "Verify no missing essential fields"
]

duplicate_transactions, invalid_amounts, validation_checks


(   transaction_id  account_id  transaction_amount transaction_date
 2             103           3               250.0       2025-05-19
 3             103           3               250.0       2025-05-19,
    transaction_id  account_id  transaction_amount transaction_date
 1             102           2               -50.0       2025-05-19
 4             104           4           5000000.0       2025-05-19,
 ['Check for duplicate transaction IDs',
  'Ensure transaction amounts are positive and below set thresholds',
  'Validate transaction dates are within expected ranges',
  'Confirm account IDs exist in authorized accounts',
  'Verify no missing essential fields'])
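The cell above runs only the first two checks in the list. The remaining three (date range, authorized accounts, missing fields) could be sketched in pandas as follows. The set of authorized accounts and the date window are assumptions made up for illustration; in practice they would come from your own reference data.

```python
import pandas as pd

# Same sample data as the cell above
df = pd.DataFrame({
    "transaction_id": [101, 102, 103, 103, 104],
    "account_id": [1, 2, 3, 3, 4],
    "transaction_amount": [1000.00, -50.00, 250.00, 250.00, 5000000.00],
    "transaction_date": ["2025-05-19"] * 5,
})

# Hypothetical reference values, not part of the original notebook
AUTHORIZED_ACCOUNTS = {1, 2, 3}
DATE_MIN = pd.Timestamp("2025-01-01")
DATE_MAX = pd.Timestamp("2025-12-31")
REQUIRED_FIELDS = ["transaction_id", "account_id", "transaction_amount", "transaction_date"]

# Dates: unparseable values become NaT, which never satisfies between()
parsed_dates = pd.to_datetime(df["transaction_date"], format="%Y-%m-%d", errors="coerce")
out_of_range_dates = df[~parsed_dates.between(DATE_MIN, DATE_MAX)]

# Accounts: rows whose account_id is not in the authorized set
unauthorized_accounts = df[~df["account_id"].isin(AUTHORIZED_ACCOUNTS)]

# Missing fields: rows with a null in any required column
missing_fields = df[df[REQUIRED_FIELDS].isna().any(axis=1)]
```

With these assumed reference values, only the row for account 4 is flagged as unauthorized; the date and missing-field checks return empty frames for this sample.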

**Task 2**: Implement Financial Data Validation

**Objective**: Use automated tools to ensure transaction accuracy.

**Steps**:
1. Integrate data validation rules into your existing financial systems.
2. Run the checks in real time so that data is validated as it is entered.

In [2]:
import pandas as pd
from great_expectations.dataset import PandasDataset

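# Note: PandasDataset belongs to the legacy Great Expectations API and may not
# be available in recent releases of the library.
class FinancialDataset(PandasDataset):
    def validate_transactions(self):
        # Run each expectation and return the results so the caller can inspect them
        return [
            self.expect_column_values_to_not_be_null("transaction_id"),
            self.expect_column_values_to_be_unique("transaction_id"),
            self.expect_column_values_to_be_between("transaction_amount", min_value=0.01, max_value=1_000_000),
            self.expect_column_values_to_not_be_null("transaction_date"),
            # Anchor the pattern so the whole value must be in YYYY-MM-DD form
            self.expect_column_values_to_match_regex("transaction_date", r"^\d{4}-\d{2}-\d{2}$"),
        ]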

# Sample data
data = {
    "transaction_id": [101, 102, 103, 104],
    "account_id": [1, 2, 3, 4],
    "transaction_amount": [1000.00, 500.00, -250.00, 750.00],
    "transaction_date": ["2025-05-19", "2025-05-19", "2025-05-19", "2025/05/19"]
}

df = FinancialDataset(pd.DataFrame(data))

validation_results = df.validate_transactions()
validation_results
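Step 2 asks for checks that run as data is entered, rather than on a whole dataset at once. The following is a minimal sketch of that idea in plain Python, with no Great Expectations dependency. The function name `validate_on_entry`, the `seen_ids` set, and the error messages are hypothetical; the amount bounds and date format mirror the rules used above.

```python
from datetime import datetime

MAX_AMOUNT = 1_000_000  # same upper threshold as the rules above
REQUIRED_FIELDS = ("transaction_id", "account_id", "transaction_amount", "transaction_date")


def validate_on_entry(record, seen_ids):
    """Return a list of error messages; an empty list means the record passes."""
    missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
    if missing:
        # Stop early: the remaining checks need these fields
        return [f"missing fields: {', '.join(missing)}"]

    errors = []
    if record["transaction_id"] in seen_ids:
        errors.append("duplicate transaction_id")
    if not (0.01 <= record["transaction_amount"] <= MAX_AMOUNT):
        errors.append("transaction_amount out of range")
    try:
        # strptime also accepts non-padded values such as 2025-5-19
        datetime.strptime(record["transaction_date"], "%Y-%m-%d")
    except ValueError:
        errors.append("transaction_date is not in YYYY-MM-DD form")
    return errors


# Same sample rows as the cell above, processed one at a time
incoming = [
    {"transaction_id": 101, "account_id": 1, "transaction_amount": 1000.00, "transaction_date": "2025-05-19"},
    {"transaction_id": 102, "account_id": 2, "transaction_amount": 500.00, "transaction_date": "2025-05-19"},
    {"transaction_id": 103, "account_id": 3, "transaction_amount": -250.00, "transaction_date": "2025-05-19"},
    {"transaction_id": 104, "account_id": 4, "transaction_amount": 750.00, "transaction_date": "2025/05/19"},
]

seen_ids = set()
accepted, rejected = [], []
for record in incoming:
    errors = validate_on_entry(record, seen_ids)
    if errors:
        rejected.append((record.get("transaction_id"), errors))
    else:
        seen_ids.add(record["transaction_id"])
        accepted.append(record)
```

Here transactions 101 and 102 are accepted, while 103 (negative amount) and 104 (slash-separated date) are rejected with a reason attached. A real system would hook a function like this into its entry point, such as a form handler or message consumer.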






