# L3_M10.3: Managing Financial Knowledge Base Drift

**Learning Arc (5-7 minutes)**

## What You'll Build

A production-ready drift detection and versioning system for financial knowledge bases in RAG systems. You'll learn to:

- **Detect semantic drift** when regulatory definitions change (ASC 606, ASC 842, CECL)
- **Implement version control** with regulatory effective dates for temporal query routing
- **Build selective retraining** pipelines that target only affected documents
- **Create audit trails** with cryptographic hashing for SOX 404 compliance
- **Validate updates** with regression testing to ensure zero historical query breakage

## Prerequisites

- Generic CCC M1-M4 (RAG MVP fundamentals)
- Finance AI M10.1 (Secure Deployment)
- Finance AI M10.2 (Monitoring Performance)
- Understanding of GAAP standards and regulatory compliance

## Concepts Covered

1. **Semantic Drift Detection** - Using embedding similarity to identify concept changes
2. **Knowledge Base Versioning** - Managing multiple standard versions with effective dates
3. **Regulatory Monitoring** - Tracking FASB/SEC announcements automatically
4. **Selective Retraining** - Re-embedding only affected documents (500 vs. 50K)
5. **Regression Testing** - Validating updates don't break existing queries
6. **Immutable Audit Trails** - SOX compliance with 7+ year retention
7. **Human-in-Loop Compliance** - Approval workflows for version changes
8. **Temporal Query Routing** - Serving historical queries with period-appropriate standards

## Expected Outcomes

By the end of this notebook, you will:

- Detect drift between ASC 840 and ASC 842 lease accounting standards
- Create versioned knowledge bases with regulatory effective dates
- Identify and retrain affected documents selectively
- Generate immutable audit trails for compliance
- Validate changes with regression testing

**Services Used:** OpenAI (Embeddings API), Pinecone (Vector DB) - auto-detected from script

**Time:** ~45 minutes (script video length)

## Section 1: Setup and Configuration

### OFFLINE Mode Configuration

This notebook can run in two modes:
- **OFFLINE:** Uses mock embeddings (hash-based, deterministic) - no API keys required
- **ONLINE:** Uses real OpenAI embeddings and Pinecone storage - requires API keys

The code will automatically skip external API calls if credentials are not available.
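
The notebook does not show how the hash-based mock embeddings are produced. The following is a minimal sketch of one way to do it, assuming a SHA-256 digest is stretched into a fixed-length unit vector; `mock_embedding` and its `dim` parameter are hypothetical names, not part of the course package.

```python
import hashlib
import math


def mock_embedding(text: str, dim: int = 8) -> list[float]:
    """Deterministic pseudo-embedding built from SHA-256 digests (no API call).

    Hypothetical helper for illustration only.
    """
    values: list[float] = []
    counter = 0
    while len(values) < dim:
        # Re-hash with a counter so any dimension can be filled.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte (0..255) into the range [-1, 1].
        values.extend((byte / 127.5) - 1.0 for byte in digest)
        counter += 1
    values = values[:dim]
    # Normalize to unit length so cosine similarity is well defined.
    norm = math.sqrt(sum(v * v for v in values)) or 1.0
    return [v / norm for v in values]
```

Vectors built this way are repeatable but carry no semantic meaning, so similarity scores in OFFLINE mode exercise the pipeline rather than measure real drift.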

In [None]:
# OFFLINE Mode Guard - Check for API credentials
import os
import sys
from pathlib import Path

# Add parent directory to path for imports
sys.path.insert(0, str(Path.cwd().parent))

from dotenv import load_dotenv
load_dotenv(Path.cwd().parent / '.env')

OPENAI_ENABLED = os.getenv("OPENAI_ENABLED", "false").lower() == "true"
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")

PINECONE_ENABLED = os.getenv("PINECONE_ENABLED", "false").lower() == "true"
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY", "")

if not OPENAI_ENABLED or not OPENAI_API_KEY:
    print("⚠️  Running in OFFLINE mode - API calls will be skipped")
    print("   Using mock embeddings (hash-based, deterministic)")
    print("")
    print("   To enable OpenAI: Set OPENAI_ENABLED=true and OPENAI_API_KEY in .env")
    print("   To enable Pinecone: Set PINECONE_ENABLED=true and PINECONE_API_KEY in .env")
else:
    print("✅ OpenAI enabled - full functionality available")
    if PINECONE_ENABLED:
        print("✅ Pinecone enabled - vector storage available")
    else:
        print("⚠️  Pinecone disabled - using in-memory storage")

### Import Dependencies

In [None]:
# Core imports from our package
from src.l3_m10_financial_rag_in_production import (
    FinancialKBDriftDetector,
    KnowledgeBaseVersionManager,
    RegulatoryMonitor,
    SelectiveRetrainingPipeline,
    AuditTrailManager,
    detect_drift,
    create_version,
    validate_regression
)

import config
import json
from datetime import datetime

print("✅ Imports successful")

# Load example data
with open('../example_data.json', 'r') as f:
    example_data = json.load(f)

print(f"✅ Loaded example data with {len(example_data['baseline_concepts'])} baseline concepts")

## Section 2: Conceptual Foundation

### The Challenge: Knowledge Base Drift in Financial RAG

Unlike generic RAG systems, financial knowledge bases face unique challenges:

**1. Semantic Drift**
- Same terminology, different meaning ("lease" in ASC 840 vs. ASC 842)
- Definition changes without terminology changes
- Example: Operating lease treatment fundamentally changed

**2. Regulatory Effective Dates**
- Standards have explicit effective dates (ASC 842: Jan 1, 2019)
- Historical queries require period-appropriate standards
- Transition periods with dual compliance requirements

**3. Zero Historical Breakage**
- SOX compliance requires audit trail accuracy
- Queries about 2018 transactions must use 2018 standards
- Cannot overwrite baseline knowledge

### Our Approach

**Six-Step Workflow:**
1. **Regulatory Monitoring** - Automated FASB/SEC scraping (daily)
2. **Drift Detection** - Embedding similarity comparison (weekly)
3. **Version Control** - Create new versions with effective dates
4. **Selective Retraining** - Re-embed only affected documents
5. **Regression Testing** - Validate historical queries still work
6. **Audit Trail** - Immutable records with cryptographic hashing

## Section 3: Establishing Baseline

First step: Create baseline embeddings for current financial concepts.
These serve as the reference point for future drift detection.

In [None]:
# Initialize drift detector
drift_detector = FinancialKBDriftDetector(
    threshold=0.85,
    openai_client=config.get_openai_client(),
    pinecone_index=None  # Using in-memory for demo
)

print("Drift detector initialized")
print(f"Threshold: {drift_detector.threshold}")
print(f"Severity levels:")
print(f"  HIGH: similarity < 0.70")
print(f"  MEDIUM: 0.70-0.80")
print(f"  LOW: 0.80-0.85")
print(f"  No drift: >= 0.85")

In [None]:
# Establish baseline from example data
baseline_concepts = example_data['baseline_concepts']

print("Establishing baseline for concepts:")
for concept in baseline_concepts.keys():
    print(f"  - {concept}")

result = drift_detector.establish_baseline(baseline_concepts)

print(f"\n✅ Baseline established")
print(f"   Mode: {result['mode']}")
print(f"   Concepts: {result['concept_count']}")

# Expected: 6 concepts (Lease Accounting, Revenue Recognition, etc.)

## Section 4: Detecting Drift

### Scenario 1: No Drift (High Similarity)

When concepts are unchanged or only editorially revised, similarity to the baseline should stay at or above the 0.85 threshold.
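
The text refers to "embedding similarity" without naming a metric. The sketch below assumes cosine similarity, which is a common choice for comparing embeddings; `cosine_similarity` and `has_drifted` are hypothetical helpers and may differ from what `FinancialKBDriftDetector` does internally.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as sharing nothing with anything
    return dot / (norm_a * norm_b)


def has_drifted(baseline_vec: Sequence[float],
                current_vec: Sequence[float],
                threshold: float = 0.85) -> bool:
    """Flag drift when similarity falls below the threshold (0.85 in this notebook)."""
    return cosine_similarity(baseline_vec, current_vec) < threshold
```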

In [None]:
# Test with near-identical concepts (editorial changes only, so little or no drift)
current_concepts_no_drift = example_data['current_concepts_low_drift']

drift_report = drift_detector.detect_drift(current_concepts_no_drift)

print("Drift Detection Report (No Drift Scenario):")
print(f"  Concepts checked: {drift_report['concepts_checked']}")
print(f"  Drift detected: {drift_report['summary']['total_drift_count']}")
print(f"  No drift: {drift_report['summary']['no_drift_count']}")
print(f"\n  Severity breakdown:")
print(f"    HIGH: {drift_report['summary']['high_severity']}")
print(f"    MEDIUM: {drift_report['summary']['medium_severity']}")
print(f"    LOW: {drift_report['summary']['low_severity']}")

# Expected: Minimal or no drift for editorial changes

### Scenario 2: High Severity Drift (ASC 840 → ASC 842)

Major regulatory changes should trigger high severity alerts
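
The severity bands printed when the detector was initialized can be sketched as a small classifier. The cut-offs come from this notebook; how the exact boundary values 0.70 and 0.80 are assigned is an assumption here, and `classify_severity` is a hypothetical name.

```python
def classify_severity(similarity: float, threshold: float = 0.85) -> str:
    """Map a similarity score to the notebook's severity bands.

    Assumption: a score exactly on a boundary falls into the milder band.
    """
    if similarity >= threshold:
        return "NONE"    # no drift
    if similarity >= 0.80:
        return "LOW"     # 0.80-0.85
    if similarity >= 0.70:
        return "MEDIUM"  # 0.70-0.80
    return "HIGH"        # below 0.70


for score in (0.92, 0.82, 0.75, 0.40):
    print(f"{score:.2f} -> {classify_severity(score)}")
```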

In [None]:
# Test with major changes (ASC 840 → ASC 842)
current_concepts_high_drift = example_data['current_concepts_high_drift']

drift_report_high = drift_detector.detect_drift(current_concepts_high_drift)

print("Drift Detection Report (High Severity Scenario):")
print(f"  Concepts checked: {drift_report_high['concepts_checked']}")
print(f"  Total drift detected: {drift_report_high['summary']['total_drift_count']}")
print(f"\n  Severity breakdown:")
print(f"    HIGH: {drift_report_high['summary']['high_severity']}")
print(f"    MEDIUM: {drift_report_high['summary']['medium_severity']}")
print(f"    LOW: {drift_report_high['summary']['low_severity']}")

# Show details for high severity drifts
if drift_report_high['drift_detected']:
    print(f"\n  Drifted concepts (first 2):")
    for drift in drift_report_high['drift_detected'][:2]:
        print(f"    - {drift['concept']}")
        print(f"      Similarity: {drift['similarity']:.4f}")
        print(f"      Severity: {drift['severity']}")

# Expected: Multiple high severity alerts for ASC 840 → 842 transition

## Section 5: Knowledge Base Versioning

### Creating Versions with Effective Dates

When drift is detected, create new versions instead of overwriting baseline.

In [None]:
# Initialize version manager
version_manager = KnowledgeBaseVersionManager()

# Create version for old standard (ASC 840)
version_asc840 = version_manager.create_version(
    standard_name="ASC 840",
    effective_from="2000-01-01",
    effective_until="2018-12-31",
    concept_definitions=baseline_concepts
)

print("Created version: ASC 840 (Old Lease Standard)")
print(f"  Version ID: {version_asc840['version_id']}")
print(f"  Effective: {version_asc840['effective_from']} to {version_asc840['effective_until']}")

# Create version for new standard (ASC 842)
version_asc842 = version_manager.create_version(
    standard_name="ASC 842",
    effective_from="2019-01-01",
    effective_until=None,  # Current standard
    concept_definitions=current_concepts_high_drift
)

print(f"\nCreated version: ASC 842 (New Lease Standard)")
print(f"  Version ID: {version_asc842['version_id']}")
print(f"  Effective: {version_asc842['effective_from']} onwards")

print(f"\n✅ Total versions: {len(version_manager.list_versions())}")

### Temporal Query Routing

Route queries to the correct version based on transaction date
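
As a rough illustration of date-based routing, the sketch below picks the version whose effective window contains the transaction date. It reuses the effective dates from the cell above, but `StandardVersion` and `route_by_date` are hypothetical and may not match how `KnowledgeBaseVersionManager.get_version_for_date` is implemented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class StandardVersion:
    standard_name: str
    effective_from: date
    effective_until: Optional[date]  # None means the version is still in force


def route_by_date(versions: Sequence[StandardVersion],
                  query_date: date) -> Optional[StandardVersion]:
    """Return the first version whose effective window contains query_date."""
    for version in versions:
        started = version.effective_from <= query_date
        not_ended = (version.effective_until is None
                     or query_date <= version.effective_until)
        if started and not_ended:
            return version
    return None  # no version covers this date


# Effective dates taken from the versions created in this notebook.
LEASE_VERSIONS = [
    StandardVersion("ASC 840", date(2000, 1, 1), date(2018, 12, 31)),
    StandardVersion("ASC 842", date(2019, 1, 1), None),
]

print(route_by_date(LEASE_VERSIONS, date(2018, 6, 15)).standard_name)  # ASC 840
print(route_by_date(LEASE_VERSIONS, date(2023, 6, 15)).standard_name)  # ASC 842
```

Routing on the date alone, rather than on a caller-supplied standard name, means the caller does not need to know in advance which standard applied to the period.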

In [None]:
# Query for 2018 (should use ASC 840)
version_2018 = version_manager.get_version_for_date(
    query_date="2018-06-15",
    standard_name="ASC 840"
)

print("Query for June 15, 2018:")
if version_2018:
    print(f"  ✅ Using version: {version_2018['standard_name']}")
    print(f"     Version ID: {version_2018['version_id']}")
    print(f"     Effective: {version_2018['effective_from']} to {version_2018['effective_until']}")
else:
    print("  ❌ No version found")

# Query for 2023 (should use ASC 842)
version_2023 = version_manager.get_version_for_date(
    query_date="2023-06-15",
    standard_name="ASC 842"
)

print("\nQuery for June 15, 2023:")
if version_2023:
    print(f"  ✅ Using version: {version_2023['standard_name']}")
    print(f"     Version ID: {version_2023['version_id']}")
    print(f"     Effective: {version_2023['effective_from']} onwards")
else:
    print("  ❌ No version found")

# Expected: Different versions for different time periods

## Section 6: Regulatory Monitoring

Automated monitoring of FASB, SEC, and AICPA sources for updates
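
One simple way to notice that a monitored page has changed is to compare a fingerprint of its text against the fingerprint stored on the previous run. The sketch below assumes the page text has already been fetched; `fingerprint` and `changed_sources` are hypothetical helpers, and the real `RegulatoryMonitor` may work differently (for example, by parsing feeds).

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """SHA-256 of the text with whitespace collapsed, so layout noise is ignored."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_sources(previous: dict[str, str], fetched: dict[str, str]) -> list[str]:
    """Compare stored fingerprints with freshly fetched page text.

    previous: source name -> fingerprint saved on the last run
    fetched:  source name -> page text retrieved on this run
    Returns the names of sources that are new or whose content changed.
    """
    changed = [
        source
        for source, text in fetched.items()
        if previous.get(source) != fingerprint(text)
    ]
    return sorted(changed)
```

A changed fingerprint only says that something on the page moved; deciding whether it is a substantive regulatory update still needs the drift check and human review described in this notebook.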

In [None]:
# Initialize regulatory monitor
regulatory_monitor = RegulatoryMonitor()

print("Regulatory sources monitored:")
for source, url in regulatory_monitor.sources.items():
    print(f"  - {source}: {url}")

# Check for updates
updates = regulatory_monitor.check_for_updates()

print(f"\n✅ Checked {len(updates['sources_checked'])} sources at {updates['timestamp']}")
print(f"   Updates found: {len(updates['updates_found'])}")

if updates['updates_found']:
    print(f"\n   Sample update (mock data):")
    update = updates['updates_found'][0]
    print(f"     Source: {update['source']}")
    print(f"     Title: {update['title']}")
    print(f"     Effective: {update['effective_date']}")

# Expected: Mock updates in offline mode (real scraping would occur in production)

## Section 7: Selective Retraining Pipeline

### Cost Comparison: Full vs. Selective Retraining

**Full corpus re-embedding:** 50,000 docs × $0.0002 = **$10**
**Selective retraining:** 500 affected docs × $0.0002 = **$0.10** (99% savings)
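
The arithmetic behind these figures can be checked in a few lines. The per-document cost of $0.0002 is the figure used in this notebook, not a quoted API price, and `embedding_cost` is a hypothetical helper.

```python
def embedding_cost(doc_count: int, cost_per_doc: float = 0.0002) -> float:
    """Estimated re-embedding cost in dollars at a flat per-document rate."""
    return doc_count * cost_per_doc


full = embedding_cost(50_000)    # whole corpus
selective = embedding_cost(500)  # only the affected documents
savings = 1 - selective / full

print(f"Full: ${full:.2f}, selective: ${selective:.2f}, savings: {savings:.0%}")
# Full: $10.00, selective: $0.10, savings: 99%
```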

### Identifying Affected Documents
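
The notebook does not show how affected documents are matched. A minimal sketch, assuming a case-insensitive keyword match on each document's `content` field, is given below together with a batching helper; `find_affected` and `batched` are hypothetical, and the real `SelectiveRetrainingPipeline` may rely on metadata tags or vector search instead.

```python
from typing import Iterator


def find_affected(drift_concepts: list[str], corpus: list[dict]) -> list[dict]:
    """Return documents whose content mentions any drifted concept."""
    terms = [concept.lower() for concept in drift_concepts]
    return [
        doc for doc in corpus
        if any(term in doc["content"].lower() for term in terms)
    ]


def batched(items: list, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Made-up documents for illustration only.
corpus = [
    {"id": "doc-1", "content": "Lease accounting policy for office space"},
    {"id": "doc-2", "content": "Revenue recognition memo"},
    {"id": "doc-3", "content": "Right-of-use asset schedule"},
]
affected = find_affected(["Lease Accounting", "Right-of-Use Asset"], corpus)
print([doc["id"] for doc in affected])  # ['doc-1', 'doc-3']
```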

In [None]:
# Initialize retraining pipeline
retraining_pipeline = SelectiveRetrainingPipeline(
    openai_client=config.get_openai_client(),
    pinecone_index=None
)

# Load sample documents
document_corpus = example_data['sample_documents']

print(f"Document corpus: {len(document_corpus)} documents")

# Identify documents affected by lease accounting drift
drift_concepts = ["Lease Accounting", "Right-of-Use Asset", "Lease Liability"]

affected_docs = retraining_pipeline.identify_affected_documents(
    drift_concepts=drift_concepts,
    document_corpus=document_corpus
)

print(f"\nDrifted concepts: {len(drift_concepts)}")
for concept in drift_concepts:
    print(f"  - {concept}")

print(f"\n✅ Affected documents: {len(affected_docs)} / {len(document_corpus)}")
print(f"   Selective retraining ratio: {len(affected_docs)/len(document_corpus)*100:.1f}%")

if affected_docs:
    print(f"\n   Sample affected documents:")
    for doc in affected_docs[:2]:
        print(f"     - {doc['id']}: {doc['content'][:60]}...")

# Expected: ~3-5 documents (those mentioning lease/ROU asset)

### Executing Retraining in Batches

In [None]:
# Retrain affected documents in batches
batch_size = 2  # Small batch for demo

print(f"Retraining {len(affected_docs)} documents in batches of {batch_size}...")

retrain_result = retraining_pipeline.retrain_documents(
    documents=affected_docs,
    batch_size=batch_size
)

print(f"\n✅ Retraining complete")
print(f"   Status: {retrain_result['status']}")
print(f"   Mode: {retrain_result['mode']}")
print(f"   Documents processed: {retrain_result['documents_processed']}")

if 'batches' in retrain_result:
    print(f"   Batches: {len(retrain_result['batches'])}")

# Expected: Successful retraining (skipped in offline mode)

## Section 8: Audit Trail Management

### SOX 404 Compliance Requirements

- **Immutability:** All records must be immutable (no edits/deletes)
- **Retention:** 7+ years (2,555 days minimum)
- **Cryptographic Hashing:** SHA-256 for data integrity
- **Approver Tracking:** Human-in-loop for compliance decisions
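
The requirements above can be illustrated with a hash-chained, append-only log: each entry stores a SHA-256 hash of its data plus the hash of the previous entry, so any later edit is detectable. This is a sketch under assumptions, not the `AuditTrailManager` implementation; the `prev_hash`, `entry_hash`, and `retain_until` fields are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 2555  # 7 years, as stated in the requirements above


def _digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(log: list, event_type: str, data: dict, approver: str) -> dict:
    """Append a new entry that is chained to the previous one."""
    now = datetime.now(timezone.utc)
    body = {
        "event_type": event_type,
        "timestamp": now.isoformat(),
        "approver": approver,
        "data_hash": _digest(data),
        "prev_hash": log[-1]["entry_hash"] if log else None,
        "retain_until": (now + timedelta(days=RETENTION_DAYS)).isoformat(),
    }
    entry = {**body, "entry_hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; return False if any entry was altered or reordered."""
    prev_hash = None
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True
```

Chaining makes tampering detectable rather than impossible; storage-level immutability (for example, write-once storage) would still be a separate control.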

In [None]:
# Initialize audit trail manager
audit_manager = AuditTrailManager()

# Log drift detection event
drift_audit = audit_manager.log_drift_detection(
    drift_results=drift_report_high,
    approver="compliance.officer@company.com"
)

print("Logged drift detection event:")
print(f"  Event ID: {drift_audit['event_id']}")
print(f"  Type: {drift_audit['event_type']}")
print(f"  Timestamp: {drift_audit['timestamp']}")
print(f"  Approver: {drift_audit['approver']}")
print(f"  Data hash: {drift_audit['data_hash'][:16]}...")
print(f"  Immutable: {drift_audit['immutable']}")

# Log version creation
version_audit = audit_manager.log_version_creation(
    version_metadata=version_asc842,
    approver="compliance.officer@company.com"
)

print(f"\nLogged version creation event:")
print(f"  Event ID: {version_audit['event_id']}")
print(f"  Type: {version_audit['event_type']}")
print(f"  Version ID: {version_audit['version_id']}")
print(f"  Immutable: {version_audit['immutable']}")

print(f"\n✅ Total audit entries: {len(audit_manager.audit_log)}")

### Retrieving Audit Trail

In [None]:
# Retrieve all audit entries
all_entries = audit_manager.get_audit_trail()

print(f"All audit entries: {len(all_entries)}")

# Filter by event type
drift_entries = audit_manager.get_audit_trail(event_type="drift_detection")
version_entries = audit_manager.get_audit_trail(event_type="version_creation")

print(f"  Drift detection events: {len(drift_entries)}")
print(f"  Version creation events: {len(version_entries)}")

# Expected: Immutable records with cryptographic hashes

## Section 9: Regression Testing

### Validating Updates Don't Break Historical Queries

Critical for SOX compliance: zero historical query breakage
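
A regression check of this kind can be sketched as a loop that compares the standard each query resolves to against the expected one. The report keys mirror those printed in this notebook, but `run_regression`, `lease_standard_for`, and the decision to compare only the `standard` field are assumptions; the packaged `validate_regression` may check more.

```python
from typing import Callable


def run_regression(test_queries: list[dict],
                   expected_results: list[dict],
                   resolve_standard: Callable[[dict], str]) -> dict:
    """Compare the standard resolved for each query with the expected standard."""
    if len(test_queries) != len(expected_results):
        raise ValueError("each query needs exactly one expected result")
    details = []
    for query, expected in zip(test_queries, expected_results):
        actual = resolve_standard(query)
        status = "passed" if actual == expected["standard"] else "failed"
        details.append({
            "query": query["text"],
            "expected": expected["standard"],
            "actual": actual,
            "status": status,
        })
    passed = sum(d["status"] == "passed" for d in details)
    return {
        "total_tests": len(details),
        "passed": passed,
        "failed": len(details) - passed,
        "details": details,
    }


def lease_standard_for(query: dict) -> str:
    """Hypothetical resolver: ISO date strings compare correctly as text."""
    return "ASC 840" if query["date"] <= "2018-12-31" else "ASC 842"


queries = [
    {"text": "How are operating leases accounted for in 2018?", "date": "2018-06-15"},
    {"text": "What is ASC 842 lease accounting?", "date": "2023-06-15"},
]
expected = [{"standard": "ASC 840"}, {"standard": "ASC 842"}]
report = run_regression(queries, expected, lease_standard_for)
print(report["passed"], "of", report["total_tests"], "passed")  # 2 of 2 passed
```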

In [None]:
# Define regression test suite
test_queries = [
    {"text": "How are operating leases accounted for in 2018?", "date": "2018-06-15"},
    {"text": "What is ASC 842 lease accounting?", "date": "2023-06-15"},
    {"text": "Explain right-of-use asset recognition", "date": "2023-06-15"},
    {"text": "Revenue recognition under ASC 606", "date": "2023-06-15"}
]

expected_results = [
    {"standard": "ASC 840", "treatment": "off-balance sheet"},
    {"standard": "ASC 842", "treatment": "on-balance sheet"},
    {"standard": "ASC 842", "concept": "right-of-use asset"},
    {"standard": "ASC 606", "model": "five-step"}
]

print(f"Running regression tests: {len(test_queries)} queries")
print()

# Run validation
validation_result = validate_regression(
    test_queries=test_queries,
    expected_results=expected_results
)

print(f"✅ Regression validation complete")
print(f"   Total tests: {validation_result['total_tests']}")
print(f"   Passed: {validation_result['passed']}")
print(f"   Failed: {validation_result['failed']}")
print(f"   Pass rate: {validation_result['passed']/validation_result['total_tests']*100:.1f}%")

# Show sample test results
print(f"\n   Sample test results:")
for detail in validation_result['details'][:2]:
    print(f"     - Query: {detail['query'][:50]}...")
    print(f"       Status: {detail['status']}")

# Expected: 100% pass rate (all tests pass)

## Section 10: Complete Workflow Integration

### End-to-End Drift Management Process

In [None]:
# Complete workflow demonstration
print("=" * 60)
print("COMPLETE DRIFT MANAGEMENT WORKFLOW")
print("=" * 60)

# Step 1: Monitor regulatory sources
print("\n[1] REGULATORY MONITORING")
updates = regulatory_monitor.check_for_updates()
print(f"    ✓ Checked {len(updates['sources_checked'])} sources")
print(f"    ✓ Found {len(updates['updates_found'])} potential updates")

# Step 2: Detect drift
print("\n[2] DRIFT DETECTION")
drift_report = drift_detector.detect_drift(current_concepts_high_drift)
print(f"    ✓ Checked {drift_report['concepts_checked']} concepts")
print(f"    ✓ Detected {drift_report['summary']['total_drift_count']} drifts")
print(f"    ✓ High severity: {drift_report['summary']['high_severity']}")

# Step 3: Create version (human approval simulated)
print("\n[3] VERSION CONTROL CREATION")
version = version_manager.create_version(
    standard_name="ASC 842",
    effective_from="2019-01-01",
    concept_definitions=current_concepts_high_drift
)
print(f"    ✓ Created version: {version['standard_name']}")
print(f"    ✓ Version ID: {version['version_id']}")
print(f"    ✓ Effective from: {version['effective_from']}")

# Step 4: Selective retraining
print("\n[4] SELECTIVE RETRAINING")
affected = retraining_pipeline.identify_affected_documents(
    drift_concepts=["Lease Accounting"],
    document_corpus=document_corpus
)
retrain = retraining_pipeline.retrain_documents(affected, batch_size=50)
print(f"    ✓ Identified {len(affected)} affected documents")
print(f"    ✓ Retraining status: {retrain['status']}")

# Step 5: Regression testing
print("\n[5] REGRESSION TESTING")
validation = validate_regression(test_queries, expected_results)
print(f"    ✓ Tests run: {validation['total_tests']}")
print(f"    ✓ Pass rate: {validation['passed']}/{validation['total_tests']}")

# Step 6: Audit trail
print("\n[6] AUDIT TRAIL")
drift_audit = audit_manager.log_drift_detection(drift_report, "compliance.officer")
version_audit = audit_manager.log_version_creation(version, "compliance.officer")
print(f"    ✓ Logged drift detection: {drift_audit['event_id'][:12]}...")
print(f"    ✓ Logged version creation: {version_audit['event_id'][:12]}...")
print(f"    ✓ Total audit entries: {len(audit_manager.audit_log)}")

print("\n" + "=" * 60)
print("✅ WORKFLOW COMPLETE - Ready for production deployment")
print("=" * 60)

## Section 11: Common Failures & Solutions

### Production Challenges and How to Handle Them

**1. False Positive Drift Alerts**
- **Problem:** Editorial changes triggering drift alerts
- **Solution:** Lower the similarity threshold to 0.85, or add a severity assessment so that minor wording changes rank as LOW rather than raising a full alert

**2. Missed Regulatory Updates**
- **Problem:** Website scraping failures
- **Solution:** Multiple data sources + manual fallback

**3. Version Conflicts**
- **Problem:** Queries returning mixed old/new standards
- **Solution:** Temporal query routing based on transaction date

**4. Retraining Cost Overruns**
- **Problem:** Full corpus re-embedding instead of selective
- **Solution:** Target only affected documents (500 vs. 50K)

**5. Historical Query Breakage**
- **Problem:** Overwriting baseline instead of versioning
- **Solution:** Maintain concurrent versions with effective dates

**6. Audit Trail Storage Overflow**
- **Problem:** Storing full content instead of hashes
- **Solution:** Use SHA-256 hashes, archive old entries

**7. Slow Drift Detection**
- **Problem:** Re-generating all embeddings on each check
- **Solution:** Cache baseline embeddings, only check changed concepts

**8. Human Approval Bottlenecks**
- **Problem:** Version updates delayed for weeks
- **Solution:** Automated notifications + SLA tracking
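
For failure 7 (slow drift detection), the suggested fix can be sketched as a content-addressed cache plus a diff of concept definitions. `EmbeddingCache` and `changed_concepts` are hypothetical names, and `embed_fn` stands in for whichever embedding call is in use.

```python
import hashlib
from typing import Callable


class EmbeddingCache:
    """Cache embeddings by the SHA-256 of the text so unchanged text is never re-embedded."""

    def __init__(self, embed_fn: Callable[[str], list[float]]):
        self._embed_fn = embed_fn
        self._store: dict[str, list[float]] = {}
        self.calls = 0  # number of real embedding calls made

    def get(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._embed_fn(text)
            self.calls += 1
        return self._store[key]


def changed_concepts(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """Names of concepts that are new or whose definition text differs from the baseline."""
    return [
        name for name, definition in current.items()
        if baseline.get(name) != definition
    ]
```

Only the concepts returned by `changed_concepts` would then need a similarity check, and the cache keeps repeated checks of the same definition from triggering new embedding calls.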

## Section 12: Decision Card

### When to Use This Approach

**✅ USE when:**
- Financial RAG systems with evolving standards (GAAP, IFRS, tax code)
- Multi-year data requiring historical accuracy
- SOX compliance requirements (audit trails, 95%+ citation accuracy)
- Moderate update frequency (quarterly/annually)
- Defined effective dates for regulatory changes

**❌ DO NOT USE when:**
- Static knowledge bases (no regulatory changes)
- Short-lived projects (<1 year)
- Non-regulated domains (marketing content, blogs)
- High-velocity updates (daily/hourly changes)
- Relaxed accuracy requirements (80-90% is acceptable)

### Cost Tiers

**Tier 1 (Small):** 5K docs, monthly checks → ~$1-2/month
**Tier 2 (Medium):** 50K docs, weekly checks → ~$6-10/month ← Script baseline
**Tier 3 (Large):** 500K docs, daily checks → ~$60-100/month

### Alternative Approaches

- **Budget constrained:** Skip Pinecone, use local SQLite
- **No compliance:** Skip audit trails, simplify detection
- **High update frequency:** Full corpus re-embedding
- **No historical queries:** Overwrite baseline instead of versioning

## Summary & Next Steps

### What You've Learned

In this notebook, you've implemented:

1. ✅ **Semantic drift detection** using embedding similarity (threshold: 0.85)
2. ✅ **Knowledge base versioning** with regulatory effective dates
3. ✅ **Temporal query routing** for period-appropriate standards
4. ✅ **Selective retraining** targeting only affected documents
5. ✅ **Regression testing** to ensure zero historical breakage
6. ✅ **Immutable audit trails** with cryptographic hashing (SOX 404)
7. ✅ **Human-in-loop approval** workflows for compliance
8. ✅ **Regulatory monitoring** of FASB/SEC sources

### Production Deployment

To deploy this system to production:

1. **Configure services:**
   - Set `OPENAI_ENABLED=true` and add API key
   - Set `PINECONE_ENABLED=true` and configure index
   - Configure PostgreSQL for audit trails

2. **Establish baseline:**
   - Generate embeddings for all current concepts
   - Store in Pinecone with "baseline" namespace

3. **Schedule monitoring:**
   - Daily regulatory source checks
   - Weekly drift detection runs
   - Slack/email notifications for alerts

4. **Build test suite:**
   - 200+ historical queries with expected results
   - Automated CI/CD integration

5. **Deploy API:**
   - Use `uvicorn app:app` or containerize with Docker
   - Enable HTTPS and rate limiting
   - Configure monitoring dashboards

### Resources

- **Full documentation:** See ../README.md
- **Augmented script:** [GitHub](https://github.com/yesvisare/financial_ai_ccc_l2/blob/main/Augmented_FinanceAI_M10_3_Managing_Financial_Knowledge_Base_Drift.md)
- **Tests:** See ../tests/test_m10_financial_rag_in_production.py
- **API endpoints:** Start server with `uvicorn app:app --reload`

### Next Modules

Continue your learning journey:
- **M10.4:** Advanced monitoring and alerting strategies
- **M11:** Cost optimization for production RAG systems
- **M12:** Multi-tenant financial RAG architectures

---

**Congratulations!** You've completed L3_M10.3: Managing Financial Knowledge Base Drift 🎉