# L3 M9.1: Explainability & Citation Tracking

**Learning Arc:**

Financial RAG systems without explainability create regulatory liability. This notebook teaches you to build citation-tracked financial RAG with:

1. **Source Attribution:** Inline citations [1], [2], [3] linking to specific SEC filings
2. **Verifiable Citations:** Filing date, document section, direct quotes for audit verification
3. **Retrieval Transparency:** Logs of retrieved documents, relevance scores, selection rationale
4. **Audit Trail:** Immutable records meeting SOX Section 404 requirements
5. **Conflict Detection:** Explicit disclosure when sources contradict
6. **Citation Verification:** Post-generation validation catching LLM hallucinations

**Prerequisites:**
- Finance AI M7-M8 completed (RAG fundamentals)
- Understanding of SEC filings (10-K, 10-Q, 8-K)
- Basic compliance knowledge (SOX, SEC regulations)

**By the end:**
- Build explainable financial RAG systems
- Implement citation tracking and verification
- Create SOX-compliant audit trails
- Validate citations and detect hallucinations

**Estimated time:** 2-3 hours

## OFFLINE Mode Configuration

This notebook can run in two modes:
- **OFFLINE:** Uses mock data, no API calls (for learning/testing)
- **ONLINE:** Uses real APIs (Anthropic, OpenAI, Pinecone)

In [None]:
import os
import sys

# Add parent directory to path for imports
sys.path.insert(0, os.path.abspath('..'))

# Check service configuration
ANTHROPIC_ENABLED = os.getenv("ANTHROPIC_ENABLED", "false").lower() == "true"
OPENAI_ENABLED = os.getenv("OPENAI_ENABLED", "false").lower() == "true"
PINECONE_ENABLED = os.getenv("PINECONE_ENABLED", "false").lower() == "true"

print("🔧 Service Configuration:")
print(f"  ANTHROPIC (Claude LLM):     {'✅ Enabled' if ANTHROPIC_ENABLED else '❌ Disabled'}")
print(f"  OPENAI (Embeddings):        {'✅ Enabled' if OPENAI_ENABLED else '❌ Disabled'}")
print(f"  PINECONE (Vector Database): {'✅ Enabled' if PINECONE_ENABLED else '❌ Disabled'}")
print()

if not any([ANTHROPIC_ENABLED, OPENAI_ENABLED, PINECONE_ENABLED]):
    print("⚠️ Running in OFFLINE mode")
    print("   - Using mock data for demonstrations")
    print("   - No external API calls will be made")
    print("   - Expected outputs shown in comments")
    print()
    print("To enable online mode:")
    print("   1. Copy .env.example to .env")
    print("   2. Add your API keys")
    print("   3. Set ANTHROPIC_ENABLED=true, OPENAI_ENABLED=true, PINECONE_ENABLED=true")
    print("   4. Restart Jupyter kernel")
else:
    print("✅ Running in ONLINE mode - live API calls enabled")

**SAVED_SECTION:1**

---

## Section 1: Import Dependencies and Load Example Data

In [None]:
import json
from pprint import pprint

# Import our citation-tracking components
from src.l3_m9_financial_compliance_risk import (
    CitationAwareRetriever,
    CitationMapBuilder,
    CitationAwareLLMPrompt,
    CitationVerificationEngine,
    AuditTrailManager
)

print("✅ Successfully imported all components")

# Load example data
with open('../example_data.json', 'r') as f:
    example_data = json.load(f)

print(f"\n📊 Loaded {len(example_data['queries'])} example queries")
print("\nExample queries:")
for i, query in enumerate(example_data['queries'][:3], 1):
    print(f"  {i}. {query}")
print("  ...")

**SAVED_SECTION:2**

---

## Section 2: Component 1 - Citation-Aware Retrieval

The first component retrieves financial documents and assigns citation markers [1], [2], [3].
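
As a rough illustration of what this step involves, the sketch below filters scored documents by a relevance threshold and numbers the survivors. It is not the `CitationAwareRetriever` implementation: the function name `assign_citation_markers`, the input shape, and the sample texts are assumptions made for this example, and only the output keys mirror the ones used in this notebook.

```python
from typing import Any


def assign_citation_markers(
    scored_docs: list[tuple[str, float]],
    relevance_threshold: float = 0.70,
) -> dict[str, Any]:
    """Keep documents at or above the threshold and label them [1], [2], ..."""
    # Highest-scoring documents receive the lowest citation numbers.
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    kept = [(text, score) for text, score in ranked if score >= relevance_threshold]

    citation_map: dict[str, dict[str, Any]] = {}
    documents: list[str] = []
    for number, (text, score) in enumerate(kept, start=1):
        marker = f"[{number}]"
        citation_map[marker] = {"excerpt": text, "relevance_score": score}
        # Prefix the text with its marker so the LLM sees which source is which.
        documents.append(f"{marker} {text}")

    return {
        "documents": documents,
        "citation_map": citation_map,
        "retrieval_log": {
            "documents_retrieved": len(scored_docs),
            "documents_used": len(kept),
            "documents_excluded": len(scored_docs) - len(kept),
        },
    }


# Illustrative inputs only; these are not real filing excerpts.
sketch_result = assign_citation_markers([
    ("Free cash flow was negative in the quarter.", 0.92),
    ("Operating cash flow improved sequentially.", 0.81),
    ("Unrelated press release about a new showroom.", 0.41),
])
print(list(sketch_result["citation_map"].keys()))
```

Logging the excluded count alongside the used count is what makes the retrieval step auditable: a reviewer can see that a document was found but dropped for low relevance.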

In [None]:
# Initialize retriever
retriever = CitationAwareRetriever(
    vectorstore=None,  # Will use mock data in offline mode
    embeddings=None,
    relevance_threshold=0.70
)

# Retrieve documents with citations
query = "What was Tesla's Q2 2024 free cash flow?"
print(f"📝 Query: {query}\n")

retrieval_result = retriever.retrieve_with_citations(
    query=query,
    k=3,
    filters={"ticker": "TSLA", "fiscal_period": "Q2 2024"}
)

print("✅ Retrieval Complete\n")
print(f"Documents retrieved: {retrieval_result['retrieval_log']['documents_retrieved']}")
print(f"Documents used: {retrieval_result['retrieval_log']['documents_used']}")
print(f"Documents excluded: {retrieval_result['retrieval_log']['documents_excluded']}")
print(f"\nCitation markers assigned: {list(retrieval_result['citation_map'].keys())}")

**SAVED_SECTION:3**

---

## Section 3: Examine Citation Map

Each citation has structured metadata for SEC audit verification.

In [None]:
# Examine first citation
citation_map = retrieval_result['citation_map']

print("📋 Citation [1] Metadata:\n")
pprint(citation_map['[1]'])

# Expected output shows:
# - source_type: "10-Q"
# - ticker: "TSLA"
# - filing_date: "2024-08-03"
# - fiscal_period: "Q2 2024"
# - section: "Financial Statements"
# - relevance_score: 0.92
# - excerpt: Direct quote from filing
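
One way to model a citation entry with the fields listed above is a frozen dataclass plus a SHA-256 digest of the excerpt for tamper detection. This is a sketch under assumptions: the class name `CitationRecord`, the `content_hash` method, and the placeholder excerpt are hypothetical and are not taken from the course library.

```python
import hashlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class CitationRecord:
    source_type: str
    ticker: str
    filing_date: str
    fiscal_period: str
    section: str
    relevance_score: float
    excerpt: str

    def content_hash(self) -> str:
        """SHA-256 of the excerpt, so later edits to the quote are detectable."""
        return hashlib.sha256(self.excerpt.encode("utf-8")).hexdigest()


record = CitationRecord(
    source_type="10-Q",
    ticker="TSLA",
    filing_date="2024-08-03",
    fiscal_period="Q2 2024",
    section="Financial Statements",
    relevance_score=0.92,
    excerpt="Placeholder excerpt text.",  # stand-in, not a real quote
)

# Store the digest next to the metadata; recompute it at audit time to compare.
citation_entry = {**asdict(record), "sha256": record.content_hash()}
print(citation_entry["sha256"][:16])
```

At audit time, recomputing the hash over the stored excerpt and comparing it with the saved digest shows whether the quoted text changed after the response was generated.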

**SAVED_SECTION:4**

---

## Section 4: Component 2 - LLM Prompting with Citation Instructions

We construct a prompt that instructs the LLM to use citation markers.
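
To make the idea concrete, here is a minimal sketch of a citation-instructing prompt builder. The function `build_cited_prompt` and its wording are assumptions for illustration; the actual `CitationAwareLLMPrompt.build_rag_prompt` may format things differently.

```python
def build_cited_prompt(query: str, citation_map: dict[str, dict]) -> str:
    """Assemble a prompt that lists numbered sources and demands citations."""
    source_lines = []
    for marker, meta in citation_map.items():
        # Each source line carries its marker plus enough metadata to audit it.
        source_lines.append(
            f"{marker} ({meta['source_type']}, filed {meta['filing_date']}): "
            f"{meta['excerpt']}"
        )
    sources = "\n".join(source_lines)
    return (
        "Answer using only the sources below. "
        "Cite every factual claim with its marker, e.g. [1]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {query}"
    )


# Illustrative citation map with a placeholder excerpt.
sketch_prompt = build_cited_prompt(
    "What was the quarterly free cash flow?",
    {
        "[1]": {
            "source_type": "10-Q",
            "filing_date": "2024-08-03",
            "excerpt": "Placeholder excerpt text.",
        }
    },
)
print(sketch_prompt)
```

Placing the marker directly in front of each source gives the model an unambiguous token to copy, which in turn makes post-generation verification a simple lookup.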

In [None]:
prompter = CitationAwareLLMPrompt()

# Build context with citation markers
context = "\n\n".join(retrieval_result['documents'])

# Build RAG prompt
llm_prompt = prompter.build_rag_prompt(
    query=query,
    retrieved_context=context,
    citation_map=citation_map
)

print("📝 System Prompt (excerpt):\n")
print(prompter.SYSTEM_PROMPT[:300] + "...\n")

print("📝 User Prompt (excerpt):\n")
print(llm_prompt[:400] + "...")

# Expected: Prompt includes citation markers [1], [2], [3]
# and explicit instructions to cite every fact

**SAVED_SECTION:5**

---

## Section 5: Generate LLM Response (or Mock)

In offline mode, we use a mock response. In online mode, we call the Claude API.

In [None]:
if ANTHROPIC_ENABLED:
    # Online mode - call Claude API
    from config import get_anthropic_client
    
    client = get_anthropic_client()
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system=prompter.SYSTEM_PROMPT,
        messages=[{"role": "user", "content": llm_prompt}]
    )
    llm_response = response.content[0].text
    print("✅ LLM response generated via Claude API\n")
else:
    # Offline mode - mock response
    llm_response = """Tesla reported Q2 2024 free cash flow of -$1.0B [1], primarily driven by $2.3B in capital expenditures for Gigafactory expansion [1]. 

However, operating cash flow improved to $1.3B compared to $0.5B in Q1 2024 [2], indicating operational efficiency gains. The negative free cash flow reflects strategic investments in manufacturing capacity rather than operational challenges.

Management expects to achieve positive free cash flow in Q3 2024 as capital expenditures normalize [3]."""
    print("⚠️ Using mock response (offline mode)\n")

print("💬 LLM Response:\n")
print(llm_response)

**SAVED_SECTION:6**

---

## Section 6: Component 3 - Citation Verification (Hallucination Detection)

Verify that each citation actually supports the claim made.
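
The general shape of such a check can be sketched as follows: split the response into sentences, pull out the markers each sentence cites, and score the claim against the cited excerpt. This sketch substitutes `difflib.SequenceMatcher` (a character-level lexical ratio) for true semantic similarity so that it runs without an embedding model; the function names and the returned dictionary are assumptions, not the `CitationVerificationEngine` API.

```python
import re
from difflib import SequenceMatcher

MARKER = re.compile(r"\[(\d+)\]")


def lexical_similarity(a: str, b: str) -> float:
    """Stand-in for embedding similarity: character-level match ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def check_citations(response: str, citation_map: dict, threshold: float = 0.85) -> dict:
    verified, unsupported = [], []
    # Split on sentence-ending punctuation that is followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        markers = [f"[{n}]" for n in MARKER.findall(sentence)]
        if not markers:
            continue  # uncited sentences are out of scope for this sketch
        claim = MARKER.sub("", sentence).strip()
        for marker in markers:
            source = citation_map.get(marker)
            # A marker with no entry in the map scores zero: nothing supports it.
            score = lexical_similarity(claim, source["excerpt"]) if source else 0.0
            entry = {"claim": claim, "marker": marker, "similarity": score}
            (verified if score >= threshold else unsupported).append(entry)
    total = len(verified) + len(unsupported)
    return {
        "verified_claims": verified,
        "unsupported_claims": unsupported,
        "overall_fidelity": len(verified) / total if total else 0.0,
    }


# Illustrative data: the second sentence cites a marker that does not exist.
sketch_report = check_citations(
    "Free cash flow was negative one billion dollars [1]. Revenue doubled [2].",
    {"[1]": {"excerpt": "Free cash flow was negative one billion dollars."}},
)
print(f"{sketch_report['overall_fidelity']:.0%}")
```

A lexical ratio penalizes paraphrase heavily, so a production verifier would more plausibly compare embeddings; the control flow (extract, look up, score, threshold) stays the same.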

In [None]:
verifier = CitationVerificationEngine()

verification = verifier.verify_citations(
    response=llm_response,
    citation_map=citation_map
)

print("🔍 Verification Results:\n")
print(f"✅ Verification passed: {verification['verification_passed']}")
print(f"✅ Verified claims: {len(verification['verified_claims'])}")
print(f"❌ Unsupported claims: {len(verification['unsupported_claims'])}")
print(f"📊 Overall fidelity: {verification['overall_fidelity']:.2%}")
print(f"⚖️ Recommendation: {verification['recommendation']}")

if verification['unsupported_claims']:
    print("\n⚠️ Unsupported claims detected:")
    for claim in verification['unsupported_claims'][:2]:  # Show first 2
        print(f"  - {claim['claim'][:100]}...")
        print(f"    Similarity: {claim['similarity']:.2f}")

**SAVED_SECTION:7**

---

## Section 7: Component 4 - Audit Trail Creation

Log the complete pipeline for SOX Section 404 compliance.
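
One common way to make a log tamper-evident is to chain entries by hash, so that altering any past record invalidates every later one. The class below is a minimal in-memory sketch of that idea; `HashChainedAuditLog` is a hypothetical name, and nothing here is taken from `AuditTrailManager`. It detects tampering but does not prevent it; true immutability would need write-once storage.

```python
import hashlib
import json
from datetime import datetime, timezone


class HashChainedAuditLog:
    """Append-only log where each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64  # placeholder hash for the first entry

    def __init__(self) -> None:
        self._entries: list[dict] = []

    @staticmethod
    def _digest(entry: dict) -> str:
        # Hash everything except the stored hash itself, with stable key order.
        payload = {k: v for k, v in entry.items() if k != "entry_hash"}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def append(self, record: dict) -> str:
        previous = self._entries[-1]["entry_hash"] if self._entries else self.GENESIS
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "record": record,
            "previous_hash": previous,
        }
        entry["entry_hash"] = self._digest(entry)
        self._entries.append(entry)
        return entry["entry_hash"]

    def verify(self) -> bool:
        """Recompute every hash and check that the chain links are unbroken."""
        previous = self.GENESIS
        for entry in self._entries:
            if entry["previous_hash"] != previous:
                return False
            if self._digest(entry) != entry["entry_hash"]:
                return False
            previous = entry["entry_hash"]
        return True


sketch_log = HashChainedAuditLog()
sketch_log.append({"user_id": "analyst_demo", "query_text": "example query"})
sketch_log.append({"user_id": "analyst_demo", "query_text": "follow-up query"})
print(sketch_log.verify())
```

Because each hash covers the previous hash, an auditor only needs the latest hash from a trusted location to confirm that the whole history is intact.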

In [None]:
import uuid

audit_manager = AuditTrailManager()

query_id = str(uuid.uuid4())
response_id = audit_manager.log_complete_pipeline(
    query_id=query_id,
    user_id="analyst_demo",
    query_text=query,
    retrieved_docs=[citation_map],
    llm_response=llm_response,
    citations=citation_map,
    verification=verification
)

print("📝 Audit Trail Created:\n")
print(f"Query ID: {query_id}")
print(f"Response ID: {response_id}")
print(f"Audit entries logged: {len(audit_manager.audit_entries)}")

# Retrieve audit log
logs = audit_manager.get_audit_log(query_id=query_id)
print("\n✅ Audit log retrievable for SEC examination")

**SAVED_SECTION:8**

---

## Section 8: Examine Complete Audit Entry

In [None]:
audit_entry = logs[0]

print("📋 Complete Audit Entry:\n")
print(f"Query ID: {audit_entry['query_id']}")
print(f"User ID: {audit_entry['user_id']}")
print(f"Timestamp: {audit_entry['timestamp']}")
print(f"\nQuery: {audit_entry['query_text']}")
print(f"\nDocuments retrieved: {audit_entry['retrieved_documents']['count']}")
print(f"\nVerification:")
print(f"  - Passed: {audit_entry['verification']['passed']}")
print(f"  - Fidelity: {audit_entry['verification']['overall_fidelity']:.2%}")
print(f"  - Recommendation: {audit_entry['verification']['recommendation']}")

# Expected: Complete pipeline logged
# - Query, retrieval, response, citations, verification
# - Timestamp for 7-year retention
# - User attribution for accountability

**SAVED_SECTION:9**

---

## Section 9: Testing Hallucination Detection

Let's test the verification engine with a hallucinated response.

In [None]:
# Create a response with hallucinated facts
hallucinated_response = """Tesla reported Q2 2024 free cash flow of $5.0B [1], 
the highest in company history [1]. Revenue grew 50% year-over-year [2]."""

print("🧪 Testing hallucination detection...\n")
print("Hallucinated response:")
print(hallucinated_response)
print()

hallucination_check = verifier.verify_citations(
    response=hallucinated_response,
    citation_map=citation_map
)

print("🔍 Verification Results:\n")
print(f"✅ Verification passed: {hallucination_check['verification_passed']}")
print(f"📊 Fidelity: {hallucination_check['overall_fidelity']:.2%}")
print(f"❌ Unsupported claims: {len(hallucination_check['unsupported_claims'])}")

if hallucination_check['unsupported_claims']:
    print("\n⚠️ Hallucinations detected:")
    for claim in hallucination_check['unsupported_claims']:
        print(f"  - Claim: {claim['claim'][:80]}...")
        print(f"    Status: {claim['status']}")
        print(f"    Similarity: {claim['similarity']:.2f}\n")

# Expected: Verification catches hallucinations
# Similarity scores will be low (<0.85 threshold)
# Claims flagged for human review

**SAVED_SECTION:10**

---

## Section 10: Testing Conflict Detection

Demonstrate how to handle conflicting sources.

In [None]:
# Mock conflicting sources
conflict_response = """Revenue shows mixed signals across sources: 10-Q reports 5% decline [1], 
while earnings call describes results as flat on constant currency basis [2], and analysts 
calculate 2% growth when adjusted for divestitures [3]. 

These discrepancies stem from different accounting adjustments and should be investigated 
before drawing conclusions."""

print("📊 Example of Proper Conflict Disclosure:\n")
print(conflict_response)
print()

print("✅ This response properly discloses conflicting sources")
print("✅ Cites all sources, not just favorable ones")
print("✅ Explains why discrepancies exist")
print("✅ Recommends investigation before conclusions")
print()
print("⚠️ Improper: Citing only the '2% growth' figure is cherry-picking and creates fraud risk")
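
The cell above shows what a good disclosure looks like but not how a conflict might be spotted. A very simple heuristic, sketched below under assumptions, is to collect the value each source reports for the same metric and flag a conflict when the spread exceeds a tolerance. The function `detect_conflicts`, the tolerance of 0.5 percentage points, and the idea that values have already been extracted per source are all hypothetical; the inputs reuse the three revenue figures from the example response.

```python
def detect_conflicts(reported: dict[str, float], tolerance: float = 0.5) -> dict:
    """Flag a conflict when sources disagree by more than `tolerance`.

    `reported` maps a citation marker to the value that source reports,
    here revenue change in percentage points.
    """
    if len(reported) < 2:
        # A single source cannot contradict itself in this sketch.
        return {"conflict": False, "spread": 0.0, "sources": list(reported)}
    spread = max(reported.values()) - min(reported.values())
    return {
        "conflict": spread > tolerance,
        "spread": spread,
        "sources": list(reported),  # cite all sources, not just favorable ones
    }


# 5% decline [1], flat [2], 2% growth [3], as in the example response.
revenue_change = {"[1]": -5.0, "[2]": 0.0, "[3]": 2.0}
conflict_check = detect_conflicts(revenue_change)
print(conflict_check)
```

When `conflict` is true, the response generator can be instructed to present every figure with its marker and the reason for the discrepancy, rather than selecting one.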

**SAVED_SECTION:11**

---

## Section 11: Production Deployment Considerations

In [None]:
print("🚀 Production Deployment Checklist:\n")

checklist = [
    ("Citation Accuracy", "Test on 100+ real SEC filings, achieve >95% accuracy"),
    ("Audit Logging", "Verify all 5 components log correctly"),
    ("Verification Threshold", "Calibrate semantic similarity (recommended: 0.85-0.90)"),
    ("Monitoring", "Set up alerts for citation drift, verification failures"),
    ("Compliance Documentation", "Prepare SOX 404 control documentation"),
    ("Shadow Period", "2-week parallel operation with human validation"),
    ("Quarterly Audits", "Recalibrate thresholds against manual review"),
]

for i, (item, description) in enumerate(checklist, 1):
    print(f"{i}. {item}")
    print(f"   {description}\n")

print("\n📋 Regulatory Framework:")
print("  - SEC Regulation S-P: Privacy and safeguarding of customer information")
print("  - SOX Sections 404/802: Internal controls over financial reporting; 7-year retention of audit records")
print("  - Investment Advisers Act: Fiduciary duty to clients")
print("  - GDPR Article 22: Safeguards for automated decision-making (EU clients)")
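
The "Verification Threshold" and "Quarterly Audits" items both depend on comparing similarity scores with human judgments. A minimal sketch of one such calibration follows: it picks the lowest candidate threshold at which the claims the system would accept meet a target precision. The function `calibrate_threshold`, the 0.95 precision target, and the labeled samples are assumptions made up for this example, not measured results.

```python
from typing import Optional


def calibrate_threshold(
    samples: list[tuple[float, bool]],
    candidates: list[float],
    min_precision: float = 0.95,
) -> Optional[float]:
    """Return the lowest threshold whose accepted claims meet `min_precision`.

    Each sample is (similarity_score, reviewer_says_claim_is_supported).
    """
    for threshold in sorted(candidates):
        accepted = [label for score, label in samples if score >= threshold]
        if not accepted:
            continue  # nothing passes at this threshold, so precision is undefined
        precision = sum(accepted) / len(accepted)
        if precision >= min_precision:
            return threshold
    return None  # no candidate is strict enough


# Hypothetical manual-review labels, for illustration only.
review_samples = [
    (0.95, True), (0.91, True), (0.88, True),
    (0.86, False), (0.80, False), (0.72, False),
]
chosen_threshold = calibrate_threshold(review_samples, [0.80, 0.85, 0.90])
print(chosen_threshold)
```

Choosing the lowest threshold that still meets the precision target keeps as many genuinely supported claims as possible while holding false acceptances to the agreed limit.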

**SAVED_SECTION:12**

---

## Section 12: Summary and Next Steps

In [None]:
print("🎓 What You've Learned:\n")
print("✅ 1. Citation-Aware Retrieval")
print("     - Retrieve documents with relevance scoring")
print("     - Assign citation markers [1], [2], [3]")
print("     - Filter by relevance threshold (0.70)\n")

print("✅ 2. Citation Map Generation")
print("     - Structured metadata for SEC audit")
print("     - Filing date, section, page number")
print("     - SHA256 hash for tamper detection\n")

print("✅ 3. LLM Prompting with Citations")
print("     - Instruct LLM to cite EVERY fact")
print("     - Embed citation markers in context")
print("     - Require disclosure if info unavailable\n")

print("✅ 4. Citation Verification")
print("     - Post-generation hallucination detection")
print("     - Semantic similarity checking (>0.85)")
print("     - Flag unsupported claims for review\n")

print("✅ 5. SOX-Compliant Audit Trail")
print("     - Immutable logging of complete pipeline")
print("     - 7-year retention support")
print("     - Query → Retrieval → Response → Verification\n")

print("\n🚀 Next Steps:\n")
print("1. Test with real SEC filings from EDGAR API")
print("2. Configure ANTHROPIC, OPENAI, PINECONE for production")
print("3. Calibrate verification threshold with manual audits")
print("4. Set up PostgreSQL for immutable audit storage")
print("5. Conduct 2-week shadow period with compliance team")
print("6. Deploy with monitoring and quarterly audits")

print("\n📚 Additional Resources:")
print("  - README.md: Complete documentation")
print("  - example_data.json: More query examples")
print("  - example_data.txt: SEC filing excerpts")
print("  - tests/: Comprehensive test suite")