# L3 M9.3: Regulatory Constraints in LLM Outputs (MNPI, Disclaimers, Safe Harbor)

## Learning Arc (5-Stage Framework)

### 1. HOOK (Why This Matters)

**The Crisis Scenario:**

A financial chatbot responds to an investor query: *"Q4 earnings will exceed $3 billion based on strong holiday sales."* The chatbot cited an internal forecast document, not a public SEC filing. Within hours:

- Stock price jumps 8% on leaked earnings information
- SEC launches Regulation Fair Disclosure (Reg FD) investigation
- Legal team receives shareholder lawsuit notice for selective disclosure
- Company faces criminal liability for insider trading violations
- Potential penalties: $5M+ fines, executive imprisonment (up to 20 years)

**The Problem:** A single LLM response leaked Material Non-Public Information (MNPI), triggering catastrophic regulatory violations.

**The Solution:** This module implements a three-layer compliance framework that detects MNPI, injects required disclaimers, and enforces information barriers (Chinese Walls) to prevent securities law violations.

---

### 2. CONCEPT (Core Ideas)

**Three-Layer Compliance Framework:**

1. **Layer 1 - MNPI Detection:** Prevents Material Non-Public Information disclosure using source validation, materiality indicators, and temporal checks (98%+ recall required)

2. **Layer 2 - Disclaimer Requirements:** Ensures FINRA Rule 2210 compliance ("Not Investment Advice") and Safe Harbor protection for forward-looking statements

3. **Layer 3 - Information Barriers (Chinese Walls):** Prevents selective disclosure by enforcing role-based access control and data namespace separation

**Decision Logic:** If ≥2 layers flag MNPI OR a single high-confidence violation (0.9+) → BLOCK response and log violation
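The decision rule above can be sketched as a small function. This is a minimal illustration rather than the package's implementation: `should_block`, its arguments, and the `HIGH_CONFIDENCE` constant are hypothetical names chosen for this example.

```python
# Hypothetical sketch of the block decision; names are illustrative only.
HIGH_CONFIDENCE = 0.9  # single-violation cutoff stated in the decision logic


def should_block(layers_flagged: int, confidence: float) -> bool:
    """Block when two or more layers flag MNPI, or when confidence is at least 0.9."""
    return layers_flagged >= 2 or confidence >= HIGH_CONFIDENCE


print(should_block(layers_flagged=2, confidence=0.40))  # two layers agree
print(should_block(layers_flagged=1, confidence=0.95))  # one high-confidence hit
print(should_block(layers_flagged=1, confidence=0.60))  # neither condition met
```

Either condition alone is enough to block, which is why a single very confident layer can stop a response even when the other two stay silent.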

---

### 3. CODE (Implementation)

You'll build:
- `MNPIDetector` class with three-layer detection
- `DisclaimerManager` for FINRA and Safe Harbor compliance
- `InformationBarrier` for Chinese Wall enforcement
- `ComplianceFilter` orchestrating the complete pipeline

---

### 4. CHALLENGE (Real Problems)

**Common Failures:**
- False negatives (missing MNPI) = catastrophic regulatory liability
- False positives (over-blocking) = acceptable cost vs. legal risk
- Stale disclosure database = public info flagged as MNPI
- Missing citation metadata = MNPI detection bypassed

---

### 5. CONFIDENCE (Mastery Check)

By the end of this notebook, you will be able to:
- ✅ Detect MNPI violations using three-layer pattern matching
- ✅ Inject compliant disclaimers meeting FINRA and SEC requirements
- ✅ Enforce information barriers preventing selective disclosure
- ✅ Create audit trails surviving SEC investigations

---

## What You'll Build Today

A production-ready compliance filter that:
1. Analyzes LLM outputs for MNPI violations
2. Blocks responses containing internal forecasts, merger plans, or executive changes
3. Injects "Not Investment Advice" and Safe Harbor disclaimers automatically
4. Logs all violations for regulatory audits

**Prerequisites:**
- Generic CCC M1-M4 (RAG architecture, optimization, deployment)
- Finance AI M9.1 (Explainability & Citation Tracking)
- Finance AI M9.2 (Risk Assessment in Retrieval)
- Understanding of financial regulatory frameworks

**Duration:** 45-50 minutes

**Level:** L2+ SkillElevate

Let's begin!

In [None]:
# SAVED_SECTION:1

## Section 1: OFFLINE Mode Guard

Check if services are available. This module can run in OFFLINE mode since it primarily filters LLM outputs rather than generating them.

In [None]:
import os
import warnings
warnings.filterwarnings('ignore')

# Check service availability (ANTHROPIC detected from M9.1 integration)
ANTHROPIC_ENABLED = os.getenv("ANTHROPIC_ENABLED", "false").lower() == "true"
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY", "")

print("🔍 Service Availability Check")
print("=" * 50)

if not ANTHROPIC_ENABLED or not ANTHROPIC_API_KEY:
    print("⚠️  Running in OFFLINE mode")
    print("")
    print("This module is a compliance FILTER that can run without external LLM calls.")
    print("")
    print("To enable ANTHROPIC (optional M9.1 integration):")
    print("  1. Set ANTHROPIC_ENABLED=true in .env")
    print("  2. Add ANTHROPIC_API_KEY=your_key_here")
    print("")
    print("✅ Local examples will still work!")
else:
    print("✅ ANTHROPIC service available (M9.1 integration enabled)")

print("=" * 50)
# SAVED_SECTION:2

## Section 2: Setup and Imports

Import the compliance filtering components from our package.

In [None]:
# Core imports
import sys
sys.path.insert(0, '..')

from src.l3_m9_financial_compliance_risk import (
    MNPIDetector,
    DisclaimerManager,
    InformationBarrier,
    ComplianceFilter,
    ViolationType,
    filter_llm_output
)

import logging
logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger(__name__)

print("✅ Imports complete")
print("")
print("Available components:")
print("  - MNPIDetector (Layer 1: MNPI Detection)")
print("  - DisclaimerManager (Layer 2: FINRA & Safe Harbor)")
print("  - InformationBarrier (Layer 3: Chinese Walls)")
print("  - ComplianceFilter (Complete Pipeline)")
# SAVED_SECTION:3

## Section 3: Concept Explanation

### Concept 1: Material Non-Public Information (MNPI)

**Definition:** Information that could reasonably affect a company's stock price AND hasn't been disclosed simultaneously to all investors.

**Examples:**
- ✅ **Public:** "Q3 earnings were $2.5 billion" (filed 10-Q, publicly available)
- ❌ **MNPI:** "Q4 earnings will be $3 billion" (internal forecast, not yet disclosed)

**Why It Matters:**
- Securities Exchange Act Section 10(b) and Rule 10b-5 (fraud, insider trading)
- Regulation Fair Disclosure (Reg FD) - selective disclosure prohibition
- Criminal liability: up to 20 years imprisonment

In [None]:
# Example: Contrasting public vs. MNPI

public_example = {
    "text": "Q3 earnings were $2.5B according to 10-Q filing",
    "citation": {"source_type": "10-Q", "filing_date": "2023-11-15"},
    "classification": "PUBLIC"
}

mnpi_example = {
    "text": "Q4 earnings will be $3B based on internal forecasts",
    "citation": {"source_type": "internal forecast", "filing_date": None},
    "classification": "MNPI - VIOLATION"
}

print("📊 MNPI Detection Examples")
print("=" * 60)
print(f"\n✅ PUBLIC: {public_example['text']}")
print(f"   Source: {public_example['citation']['source_type']}")
print(f"\n❌ MNPI: {mnpi_example['text']}")
print(f"   Source: {mnpi_example['citation']['source_type']}")
print("   Status: BLOCKED - SEC violation risk")
# SAVED_SECTION:4

### Concept 2: Regulation FD (Fair Disclosure)

**Definition:** SEC rule requiring public companies to disclose material information to all investors simultaneously, not selectively.

**Information Barriers (Chinese Walls):** Prevents selective disclosure by maintaining separate data namespaces and enforcing role-based access control.
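As a rough sketch of how role-based access and namespace separation could combine, the snippet below checks a role against a namespace and drops citations the role may not read. It is not the `InformationBarrier` class from the package: `ROLE_NAMESPACES`, `can_access`, and `filter_citations` are hypothetical names, and the deny-by-default behaviour is an assumption made for this example.

```python
from typing import Dict, List

# Hypothetical role-to-namespace map; a real deployment would load this
# from an access-control service rather than hard-coding it.
ROLE_NAMESPACES: Dict[str, List[str]] = {
    "analyst_external": ["public"],
    "analyst_internal": ["public", "internal"],
}


def can_access(role: str, namespace: str,
               permissions: Dict[str, List[str]] = ROLE_NAMESPACES) -> bool:
    """Return True only if the role is explicitly granted the namespace."""
    # Unknown roles get an empty grant list, so they are denied by default.
    return namespace in permissions.get(role, [])


def filter_citations(role: str, citations: List[dict],
                     permissions: Dict[str, List[str]] = ROLE_NAMESPACES) -> List[dict]:
    """Keep only citations whose data_namespace the role may read."""
    # A citation without a data_namespace is treated as unreadable.
    return [
        c for c in citations
        if can_access(role, c.get("data_namespace", ""), permissions)
    ]


docs = [
    {"source_id": "sec_20231115", "data_namespace": "public"},
    {"source_id": "budget_2024", "data_namespace": "internal"},
]
print(filter_citations("analyst_external", docs))
```

Filtering at the citation level means restricted material never reaches the response, instead of being caught only after generation.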

In [None]:
# Example: Information Barrier enforcement

user_permissions = {
    "analyst_external": ["public"],
    "analyst_internal": ["public", "internal"],
    "executive": ["public", "internal", "restricted"]
}

barrier = InformationBarrier(user_permissions=user_permissions)

print("🚧 Information Barrier (Chinese Walls)")
print("=" * 60)

# Test access for different users
test_cases = [
    ("analyst_external", "public", "✅ ALLOWED"),
    ("analyst_external", "internal", "❌ DENIED"),
    ("analyst_internal", "internal", "✅ ALLOWED"),
    ("executive", "restricted", "✅ ALLOWED")
]

for user, namespace, expected in test_cases:
    has_access = barrier.check_access(user, namespace)
    result = "✅ ALLOWED" if has_access else "❌ DENIED"
    # Compare against the expected outcome so a misconfigured barrier is visible
    check = "as expected" if result == expected else f"UNEXPECTED, wanted {expected}"
    print(f"{result}: {user:20} accessing {namespace:15} namespace ({check})")

# Expected: External analysts cannot access internal/restricted data
# SAVED_SECTION:5

### Concept 3: Safe Harbor Provisions

**Definition:** Legal protection for forward-looking statements accompanied by meaningful cautionary language (Private Securities Litigation Reform Act of 1995).

**Required Elements:**
1. Identify statement as forward-looking
2. Include meaningful cautionary language about risks
3. Provide substantive disclosure of risk factors
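The elements above can be illustrated with a small detector that spots forward-looking cue words and appends cautionary language. This is a sketch under stated assumptions, not the `DisclaimerManager` implementation: the cue-word list, the `SAFE_HARBOR_NOTICE` wording, and the `add_safe_harbor` helper are all hypothetical, and real disclaimer text should come from counsel.

```python
import re

# Illustrative cue words only; a production list would be reviewed by legal.
FORWARD_LOOKING = re.compile(
    r"\b(expects?|anticipates?|projects?|forecasts?|will|plans? to|guidance|outlook)\b",
    re.IGNORECASE,
)

# Placeholder wording, assumed for this example.
SAFE_HARBOR_NOTICE = (
    "This response contains forward-looking statements. Actual results may "
    "differ materially because of risks and uncertainties, including those "
    "described in the company's SEC filings."
)


def add_safe_harbor(text: str) -> str:
    """Append the notice once when the text contains a forward-looking cue."""
    if FORWARD_LOOKING.search(text) and SAFE_HARBOR_NOTICE not in text:
        return f"{text}\n\n{SAFE_HARBOR_NOTICE}"
    return text


print(add_safe_harbor("The company expects revenue growth of 20-25% in fiscal year 2025."))
```

Checking for the notice before appending keeps the function idempotent, so running the filter twice does not stack disclaimers.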

In [None]:
# Example: Forward-looking statement detection and Safe Harbor disclaimer

disclaimer_manager = DisclaimerManager()

forward_looking_text = "The company expects revenue growth of 20-25% in fiscal year 2025."

filtered_text, added_disclaimers = disclaimer_manager.add_disclaimers(forward_looking_text)

print("⚖️  Safe Harbor Disclaimer Injection")
print("=" * 60)
print(f"\nOriginal: {forward_looking_text}")
print(f"\nFiltered: {filtered_text}")
print(f"\nDisclaimer added: {added_disclaimers}")

# Expected: Safe Harbor statement added for forward-looking prediction
# SAVED_SECTION:6

### Concept 4: FINRA Rule 2210 (Communications with the Public)

**Definition:** FINRA regulation governing financial communications, requiring balanced presentation and risk disclosure.

**Investment Advice Patterns:**
- "Recommend buying XYZ stock"
- "This stock is undervalued/overvalued"
- "Target price is $50"
- "Buy/Sell/Hold rating"
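One way to approximate the patterns above is a short list of pre-compiled regular expressions. The expressions and the helper names `matched_advice_patterns` and `looks_like_investment_advice` are assumptions made for illustration; they are not taken from the package and will miss many phrasings.

```python
import re
from typing import List

# Hypothetical patterns mirroring the four examples listed above.
ADVICE_PATTERNS = [
    re.compile(r"\brecommend(s|ed|ing)?\s+(buying|selling|holding)\b", re.IGNORECASE),
    re.compile(r"\b(undervalued|overvalued)\b", re.IGNORECASE),
    re.compile(r"\btarget price\b", re.IGNORECASE),
    re.compile(r"\b(buy|sell|hold)\s+rating\b", re.IGNORECASE),
]


def matched_advice_patterns(text: str) -> List[str]:
    """Return the source of every pattern that matches the text."""
    return [p.pattern for p in ADVICE_PATTERNS if p.search(text)]


def looks_like_investment_advice(text: str) -> bool:
    """True when at least one advice pattern matches."""
    return bool(matched_advice_patterns(text))


print(looks_like_investment_advice("We recommend buying XYZ stock at current prices below $50."))
print(looks_like_investment_advice("Q3 revenue was $2.5 billion."))
```

Returning the matched patterns, not just a boolean, makes it easier to explain in an audit log why a disclaimer was attached.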

In [None]:
# Example: Investment advice detection and FINRA disclaimer

investment_advice_text = "We recommend buying XYZ stock at current prices below $50."

filtered_text, added_disclaimers = disclaimer_manager.add_disclaimers(investment_advice_text)

print("📋 FINRA Rule 2210 Disclaimer Injection")
print("=" * 60)
print(f"\nOriginal: {investment_advice_text}")
print(f"\nFiltered: {filtered_text}")
print(f"\nDisclaimer added: {added_disclaimers}")

# Expected: "Not Investment Advice" disclaimer added
# SAVED_SECTION:7

## Section 4: Technical Implementation - MNPI Detection

### Three-Layer MNPI Detection Pattern

**Layer 1:** Source Validation (internal vs. public documents)

**Layer 2:** Materiality Indicator Matching (earnings, M&A, executive changes)

**Layer 3:** Temporal Check (disclosed or still internal?)

**Decision Logic:** If ≥2 layers flag MNPI OR confidence ≥0.9 → BLOCK
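To make the three checks concrete, here is a simplified sketch that evaluates each layer independently and returns one flag per layer. It is not the `MNPIDetector` source: the set of public source types, both regular expressions, and the `layer_flags` helper are assumptions for this example, and the result keys simply mirror the detail fields printed later in this section.

```python
import re
from typing import Dict, List, Set

# Assumed list of public source types; the module's real list is not shown here.
PUBLIC_SOURCE_TYPES = {"10-q", "10-k", "8-k", "press release"}
MATERIAL_TERMS = re.compile(r"\b(earnings|revenue|merger|acquisition|ceo|cfo)\b", re.IGNORECASE)
FORWARD_TERMS = re.compile(r"\b(will|expects?|projections?|forecasts?)\b", re.IGNORECASE)


def layer_flags(text: str, citations: List[dict], disclosed_ids: Set[str]) -> Dict[str, bool]:
    """Evaluate the three checks independently and return one boolean per layer."""
    # Layer 1: any citation outside the public source types counts as internal.
    # No citations at all is treated as internal (fail-safe assumption).
    internal_source = not citations or any(
        c.get("source_type", "").lower() not in PUBLIC_SOURCE_TYPES for c in citations
    )
    # Layer 2: the text mentions a materiality indicator.
    material = bool(MATERIAL_TERMS.search(text))
    # Layer 3: forward-looking language backed by a source that has not been disclosed.
    undisclosed = bool(FORWARD_TERMS.search(text)) and (
        not citations or any(c.get("source_id") not in disclosed_ids for c in citations)
    )
    return {
        "internal_source": internal_source,
        "material_indicators": material,
        "undisclosed_forward_looking": undisclosed,
    }


flags = layer_flags(
    "Based on internal forecasts, Q4 earnings will exceed $3 billion.",
    [{"source_type": "internal forecast", "source_id": "budget_2024"}],
    disclosed_ids={"sec_20231115"},
)
print(flags, "layers flagged:", sum(flags.values()))
```

Keeping the layers independent means a public filing that mentions earnings trips only the materiality layer, which on its own stays below the two-layer blocking rule.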

In [None]:
# Initialize MNPI detector
mnpi_detector = MNPIDetector(detection_threshold=0.85)

# Test Case 1: Public information (should pass)
public_text = "Q3 earnings were $2.5B according to the 10-Q filing."
public_citations = [
    {"source_type": "10-Q", "source_id": "sec_20231115"}
]

result_public = mnpi_detector.detect(public_text, public_citations)

print("🔍 MNPI Detection Test 1: Public Information")
print("=" * 60)
print(f"Text: {public_text}")
print(f"Violation: {result_public['is_violation']}")
print(f"Confidence: {result_public['confidence']:.2f}")
print(f"Layers flagged: {result_public['layers_flagged']}/3")
print(f"Details: {result_public['details']}")

# Expected: No violation, low confidence
# SAVED_SECTION:8

In [None]:
# Test Case 2: MNPI violation (should block)
mnpi_text = "Based on internal forecasts, Q4 earnings will exceed $3 billion."
mnpi_citations = [
    {"source_type": "internal forecast", "source_id": "budget_2024"}
]

result_mnpi = mnpi_detector.detect(mnpi_text, mnpi_citations)

print("\n🚨 MNPI Detection Test 2: Internal Forecast (VIOLATION)")
print("=" * 60)
print(f"Text: {mnpi_text}")
print(f"Violation: {result_mnpi['is_violation']}")
print(f"Confidence: {result_mnpi['confidence']:.2f}")
print(f"Layers flagged: {result_mnpi['layers_flagged']}/3")
print(f"\nBreakdown:")
print(f"  Internal source: {result_mnpi['details']['internal_source']}")
print(f"  Material indicators: {result_mnpi['details']['material_indicators']}")
print(f"  Undisclosed forward-looking: {result_mnpi['details']['undisclosed_forward_looking']}")

# Expected: VIOLATION detected, high confidence, ≥2 layers flagged
# SAVED_SECTION:9

## Section 5: Complete Compliance Filter Pipeline

Integrate all three layers: MNPI detection + Disclaimers + Information Barriers

In [None]:
# Initialize complete compliance filter
compliance_filter = ComplianceFilter()

# Test Case 1: Public info with investment advice (should add disclaimer)
test_output_1 = "Based on strong fundamentals, this stock is undervalued."
test_citations_1 = [
    {"source_type": "public analysis", "source_id": "report_2024", "data_namespace": "public"}
]

result_1 = compliance_filter.filter_output(
    llm_output=test_output_1,
    citations=test_citations_1,
    user_id="analyst_001"
)

print("🛡️  Compliance Filter Test 1: Public + Investment Advice")
print("=" * 60)
print(f"Original: {test_output_1}")
print(f"\nResult: {result_1['allowed']}")
print(f"Disclaimers added: {result_1['disclaimers_added']}")
print(f"\nFiltered output (first 200 chars):")
print(result_1['filtered_text'][:200] if result_1['filtered_text'] else "None")

# Expected: Allowed with investment_advice disclaimer
# SAVED_SECTION:10

In [None]:
# Test Case 2: MNPI violation (should block)
test_output_2 = "Confidential board minutes reveal merger plans valued at $500M."
test_citations_2 = [
    {"source_type": "board minutes", "source_id": "board_2024_03", "data_namespace": "restricted"}
]

result_2 = compliance_filter.filter_output(
    llm_output=test_output_2,
    citations=test_citations_2,
    user_id="analyst_002"
)

print("\n🚫 Compliance Filter Test 2: MNPI Violation (BLOCKED)")
print("=" * 60)
print(f"Original: {test_output_2}")
print(f"\nResult: {result_2['allowed']}")
print(f"Blocked reason: {result_2['blocked_reason']}")
print(f"Audit logged: {result_2['audit_logged']}")
print(f"\nViolation details:")
print(f"  Confidence: {result_2['violation_details']['confidence']:.2f}")
print(f"  Layers flagged: {result_2['violation_details']['layers_flagged']}/3")

# Expected: BLOCKED due to MNPI violation
# SAVED_SECTION:11

## Section 6: Audit Log Review

Compliance audit logs are critical for SEC investigations and shareholder litigation.
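A minimal sketch of what one audit entry might look like when written to an append-only JSON Lines file is shown below. The field names follow the keys printed in the log review cell in this section; the `AuditRecord` dataclass, `make_record`, `append_record`, and the file-based storage are hypothetical stand-ins, not the package's persistence layer.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One immutable audit entry (hypothetical shape)."""
    violation_type: str
    user_id: str
    action_taken: str
    text_snippet: str
    timestamp: str


def make_record(violation_type: str, user_id: str, action_taken: str,
                text: str, max_snippet: int = 100) -> AuditRecord:
    """Build a record with a truncated snippet and a UTC ISO-8601 timestamp."""
    return AuditRecord(
        violation_type=violation_type,
        user_id=user_id,
        action_taken=action_taken,
        text_snippet=text[:max_snippet],
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


def append_record(path: str, record: AuditRecord) -> None:
    """Append the record as one JSON line; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record)) + "\n")


record = make_record("MNPI", "analyst_002", "BLOCKED",
                     "Confidential board minutes reveal merger plans valued at $500M.")
print(asdict(record))
```

Truncating the snippet keeps the log from becoming a second copy of the restricted text, and a frozen dataclass guards against accidental mutation before the record is written.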

In [None]:
# Retrieve audit logs
audit_logs = compliance_filter.get_audit_log()

print("📊 Compliance Audit Log")
print("=" * 60)
print(f"Total violations logged: {len(audit_logs)}")
print("")

if audit_logs:
    print("Recent violations:")
    for idx, log in enumerate(audit_logs[-3:], 1):  # Last 3 violations
        print(f"\n{idx}. Violation Type: {log['violation_type']}")
        print(f"   User: {log['user_id']}")
        print(f"   Timestamp: {log['timestamp']}")
        print(f"   Action: {log['action_taken']}")
        print(f"   Text snippet: {log['text_snippet'][:100]}...")
else:
    print("No violations logged yet.")

# Expected: Shows logged MNPI violations from previous tests
# SAVED_SECTION:12

## Section 7: Reality Check - Common Failures

### What Actually Goes Wrong

| Failure Mode | Severity | Solution |
|--------------|----------|----------|
| False negatives (missed MNPI) | ⚠️ CATASTROPHIC | Lower threshold (e.g., to 0.80), expand patterns |
| False positives (over-blocking) | ⚠️ ACCEPTABLE | Raise threshold (e.g., to 0.90+), refine patterns |
| Stale disclosure database | ⚠️ HIGH | Implement daily EDGAR sync, staleness alerts |
| Missing citation metadata | ⚠️ CRITICAL | Validate M9.1 schema, fail-safe to block |
| Audit log not persisted | ⚠️ CRITICAL | Monitor DB health, retry logic |
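The "fail-safe to block" response to missing citation metadata can be sketched as a validation gate that runs before detection. The required field names are taken from the citation dictionaries used in this notebook; `citations_are_valid` and `gate` are hypothetical helpers, and the actual M9.1 schema may require more fields.

```python
from typing import List, Optional

# Fields every citation in this notebook carries; the full M9.1 schema is assumed, not shown.
REQUIRED_CITATION_FIELDS = ("source_type", "source_id")


def citations_are_valid(citations: Optional[List[dict]]) -> bool:
    """True only when at least one citation exists and all have non-empty required fields."""
    if not citations:
        return False
    return all(
        isinstance(c, dict) and all(c.get(field) for field in REQUIRED_CITATION_FIELDS)
        for c in citations
    )


def gate(citations: Optional[List[dict]]) -> str:
    """Fail closed: anything that cannot be validated is blocked before detection runs."""
    return "CONTINUE" if citations_are_valid(citations) else "BLOCK"


print(gate([{"source_type": "10-Q", "source_id": "sec_20231115"}]))
print(gate([{"source_type": "10-Q"}]))
print(gate([]))
```

Without this gate, a response with no citations would skip source validation entirely, which is the bypass the table warns about.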

In [None]:
# Demonstrate threshold tuning impact

threshold_tests = [0.70, 0.85, 0.95]
test_text = "Q4 projections suggest strong earnings growth."
test_citations = [{"source_type": "internal memo", "source_id": "memo_01"}]

print("🎯 Threshold Tuning Impact")
print("=" * 60)

for threshold in threshold_tests:
    detector = MNPIDetector(detection_threshold=threshold)
    result = detector.detect(test_text, test_citations)
    
    print(f"\nThreshold {threshold:.2f}: {'BLOCKED' if result['is_violation'] else 'ALLOWED'}")
    print(f"  Confidence: {result['confidence']:.2f}")
    print(f"  Layers flagged: {result['layers_flagged']}/3")

# Expected: Lower thresholds = more blocking (high recall, low precision)
# SAVED_SECTION:13

## Section 8: Alternative Solutions

### Comparison of Approaches

| Approach | Pros | Cons | When to Use |
|----------|------|------|-------------|
| **Manual Review Queue** | Highest accuracy | Slow, expensive | High-stakes decisions only |
| **Keyword Blacklist** | Simple, fast | Brittle, high false negatives | Not recommended for MNPI |
| **Risk Scoring (no blocking)** | User-friendly | Insufficient for compliance | Non-financial applications |
| **Three-Layer Detection** | Balanced precision/recall | Requires tuning | ✅ Recommended for finance |

**Recommendation:** Use three-layer detection with escalation to manual review for medium-confidence violations (0.7-0.89).

In [None]:
# SAVED_SECTION:14

## Section 9: When Not to Use

### Anti-Patterns

❌ **Don't use for:**
- Non-financial applications (healthcare, e-commerce)
- Internal-only systems with no disclosure risk
- Pure educational AI with no trading decisions
- Historical data analysis with no MNPI access

✅ **Do use for:**
- Banks, broker-dealers, investment advisers (FINRA/SEC regulated)
- Public companies handling earnings, M&A, executive changes
- RAG systems with access to both public and internal documents
- Applications subject to SEC examinations

In [None]:
# SAVED_SECTION:15

## Section 10: Domain-Specific Considerations (Finance AI)

### Financial Regulations Summary

**Securities Exchange Act Section 10(b) & Rule 10b-5:**
- Prohibits fraud and insider trading
- Applies to MNPI disclosure
- Penalties: up to 20 years imprisonment, $5M+ fines

**Regulation FD (Fair Disclosure):**
- Simultaneous disclosure to all investors required
- Prevents selective disclosure to analysts/large investors
- Enforced via information barriers (Chinese Walls)

**FINRA Rule 2210:**
- Governs communications with the public
- Requires balanced presentation, risk disclosure
- "Not Investment Advice" disclaimer mandatory for non-registered advisers

**Private Securities Litigation Reform Act (1995):**
- Safe Harbor for forward-looking statements
- Requires meaningful cautionary language
- Protects against shareholder lawsuits

In [None]:
# SAVED_SECTION:16

## Section 11: Decision Card

### When to Use This Approach

✅ **Financial applications with regulatory requirements:**
- Banks, broker-dealers, investment advisers (FINRA/SEC regulated)
- Public companies handling earnings, M&A, executive changes
- Applications processing SEC filings, internal forecasts, board minutes

✅ **Systems handling material information:**
- RAG systems with access to both public and internal documents
- LLMs answering investor questions, analyst queries
- Automated financial report generation

✅ **Applications requiring audit trails:**
- Systems subject to SEC examinations, shareholder litigation
- Compliance monitoring for Reg FD, insider trading rules
- Applications needing legal defensibility

### When NOT to Use

❌ **Non-financial applications:**
- Healthcare, e-commerce, general knowledge systems

❌ **Internal-only systems with no disclosure risk:**
- Employee-facing tools with no external access
- Read-only archive/research applications

❌ **Pure educational AI:**
- No trading decisions, no material information

### Performance Considerations

**Latency:** ~65-115ms overhead per response
- MNPI detection: ~50-100ms
- Disclaimer injection: ~10ms
- Information barrier: ~5ms

**Scale Optimization:**
- Cache `public_disclosures` in Redis (TTL: 24h)
- Pre-compile regex patterns
- Batch database queries
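The caching idea can be illustrated without a Redis server by using a small in-memory cache with a time-to-live. This is only a stand-in to show the expiry behaviour: `TTLCache` is a hypothetical class, and a production setup would use Redis key expiry as the list above suggests.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        """Store the value together with its expiry time."""
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # evict stale entries on read
            return None
        return value


# 24-hour TTL, matching the suggestion above.
public_disclosures = TTLCache(ttl_seconds=24 * 60 * 60)
public_disclosures.set("sec_20231115", {"source_type": "10-Q"})
print(public_disclosures.get("sec_20231115"))
```

An expired entry forces a fresh lookup, which bounds how stale the disclosure data can get between syncs.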

### Trade-offs

**Precision vs. Recall:**
- Prioritize recall (98%+) over precision for MNPI
- False negatives = catastrophic regulatory liability
- False positives = acceptable cost vs. legal risk

**Automation vs. Human Review:**
- Auto-block: confidence ≥0.9
- Escalate to human: 0.7-0.89
- Allow with disclaimers: <0.7
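The three confidence bands above map naturally onto a routing function. This is a sketch with hypothetical names (`route` and the returned labels); it also assumes that scores between 0.89 and 0.9 fall into the human-review band, since the ranges as written leave that gap unspecified.

```python
def route(confidence: float) -> str:
    """Map a detection confidence to one of the three handling paths."""
    if confidence >= 0.9:
        return "auto_block"
    if confidence >= 0.7:
        # Assumption: everything in [0.7, 0.9) is escalated, including 0.89-0.9.
        return "human_review"
    return "allow_with_disclaimers"


for score in (0.95, 0.80, 0.50):
    print(f"{score:.2f} -> {route(score)}")
```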

In [None]:
# SAVED_SECTION:17

## Section 12: Hands-On Exercise

### Exercise 1: Build Your Own Compliance Filter

Create a compliance filter that:
1. Blocks responses containing "confidential" or "internal only"
2. Adds disclaimers for phrases like "should buy" or "should sell"
3. Logs all violations with user tracking

In [None]:
# Exercise workspace

# TODO: Create your custom compliance filter
# Hint: Use ComplianceFilter() with custom MNPI patterns

# Test cases:
test_cases = [
    "This confidential analysis shows strong growth.",
    "Investors should buy this stock immediately.",
    "Historical prices ranged from $40-$60."
]

# Your code here:
my_filter = ComplianceFilter()

for test_text in test_cases:
    result = my_filter.filter_output(
        llm_output=test_text,
        citations=[{"source_type": "unknown", "source_id": "test"}],
        user_id="exercise_user"
    )
    print(f"\nText: {test_text}")
    print(f"Result: {'BLOCKED' if not result['allowed'] else 'ALLOWED'}")
    if result['allowed']:
        print(f"Disclaimers: {result['disclaimers_added']}")

# Expected: First two should block or add disclaimers, third should pass
# SAVED_SECTION:18

## Section 13: Summary & Next Steps

### What You've Learned

✅ **MNPI Detection:** Three-layer pattern matching (source validation, materiality indicators, temporal check)

✅ **Disclaimer Injection:** FINRA Rule 2210 and Safe Harbor compliance

✅ **Information Barriers:** Chinese Walls preventing selective disclosure

✅ **Compliance Audit Trails:** Logging for SEC investigations

### Key Takeaways

1. **False negatives are catastrophic** - prioritize recall (98%+) over precision
2. **Threshold tuning is critical** - test with labeled data, monitor production logs
3. **Audit logs are non-negotiable** - required for regulatory defense
4. **Fail-safe to block** - when in doubt, block the response

### Next Steps

1. **M9.4: Human-in-the-Loop for High-Stakes Decisions** - Implement manual review queue for medium-confidence violations

2. **Production Deployment:**
   - Set up PostgreSQL database with compliance schema
   - Configure Redis caching for public disclosures
   - Implement automated EDGAR filing sync
   - Set up monitoring and alerting for violations

3. **Testing:**
   - Create labeled dataset of MNPI vs. public information
   - Run precision/recall analysis across threshold range
   - Conduct red team exercises (attempt to bypass filters)
   - Validate audit log retention (7 years for SEC compliance)
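For the precision/recall analysis mentioned in the testing steps, a minimal sketch might look like the following. The labeled scores are synthetic values invented for illustration, `precision_recall` is a hypothetical helper, and the code assumes a response is blocked when its score is at or above the threshold.

```python
from typing import List, Tuple


def precision_recall(scored: List[Tuple[float, bool]], threshold: float) -> Tuple[float, float]:
    """Compute precision and recall when items scoring >= threshold are blocked."""
    tp = sum(1 for score, is_mnpi in scored if score >= threshold and is_mnpi)
    fp = sum(1 for score, is_mnpi in scored if score >= threshold and not is_mnpi)
    fn = sum(1 for score, is_mnpi in scored if score < threshold and is_mnpi)
    # Convention: with nothing to measure, report 1.0 rather than divide by zero.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Synthetic (confidence, is_actually_MNPI) pairs, made up for this example.
LABELED = [(0.95, True), (0.88, True), (0.82, False), (0.75, True), (0.40, False)]

for threshold in (0.70, 0.85, 0.95):
    p, r = precision_recall(LABELED, threshold)
    print(f"threshold {threshold:.2f}: precision {p:.2f}, recall {r:.2f}")
```

On this toy data, recall falls as the threshold rises, which is the trade-off to watch when the target is 98%+ recall.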

### Resources

- **Augmented Script:** [GitHub Link](https://github.com/yesvisare/financial_ai_ccc_l2/blob/main/Augmented_FinanceAI_M9_3_Regulatory_Constraints_LLM_Outputs.md)
- **API Documentation:** FastAPI docs at `/docs` endpoint
- **Test Suite:** Run `pytest tests/` to validate implementation

**Congratulations!** You've completed L3 M9.3: Regulatory Constraints in LLM Outputs 🎉

In [None]:
print("🎓 Module Complete!")
print("=" * 60)
print("You can now:")
print("  ✅ Detect MNPI violations in LLM outputs")
print("  ✅ Inject FINRA and Safe Harbor disclaimers")
print("  ✅ Enforce information barriers (Chinese Walls)")
print("  ✅ Create compliance audit trails for SEC investigations")
print("")
print("Next: Explore the API at http://localhost:8000/docs")
print("Or continue to M9.4: Human-in-the-Loop for High-Stakes Decisions")
# SAVED_SECTION:19