<a href="https://colab.research.google.com/github/lorettarehm/AIML/blob/main/LR_Sentinel.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Customer First: Automated FCA Compliance Audit

This notebook implements the core functionality of **Customer First** (Text & Policy Engine). The system uses the **LLM-as-a-Judge pattern** to audit content against regulations, proactively flagging where the company can improve its processes, products, services, or communications to better adhere to them. The example here uses the FCA Consumer Duty.

## Key Features:
1. **RAG Implementation:** FCA rulebooks (The Regulatory Constitution) are loaded as a Knowledge Base for **Semantic Retrieval**.
2. **Constitutional AI Pattern:** The judge uses explicit criteria, Chain-of-Thought (CoT), and structured output to achieve regulatory-grade reliability.
3. **Persona Simulation:** The Judge evaluates comprehension from the perspective of diverse customer personas.

#### Required packages: `openai`, `langchain-openai`, `python-dotenv`, `numpy`, `scikit-learn`, `PyPDF2`, `PyCryptodome`

In [2]:
# Install required libraries
!pip install openai langchain-openai python-dotenv numpy scikit-learn PyPDF2 PyCryptodome

Collecting langchain-openai
  Downloading langchain_openai-1.1.0-py3-none-any.whl.metadata (2.6 kB)
Collecting PyPDF2
  Downloading pypdf2-3.0.1-py3-none-any.whl.metadata (6.8 kB)
Collecting PyCryptodome
  Downloading pycryptodome-3.23.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Downloading langchain_openai-1.1.0-py3-none-any.whl (84 kB)
Downloading pypdf2-3.0.1-py3-none-any.whl (232 kB)
Downloading pycryptodome-3.23.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.3 MB)
Installing collected packages: PyPDF2, PyCryptodome, langchain-openai
Successfully installed PyCryptodome

In [3]:
# Imports
import os
import json
import numpy as np
from dotenv import load_dotenv
from openai import OpenAI

# Required LangChain components
from langchain_openai import ChatOpenAI # Chat model wrapper used by the judge
from sklearn.metrics.pairwise import cosine_similarity # For Semantic Retrieval

# Load environment variables from .env file
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
# OPENAI_API_URL = os.getenv("OPENAI_API_URL") # Only Azure setup
DEFAULT_MODEL = "gpt-4o-mini" # Default model name; note the judge below hard-codes "gpt-4o"

client = OpenAI(api_key=OPENAI_API_KEY)
print("✅ Libraries imported and API Key loaded.")

✅ Libraries imported and API Key loaded.


In [4]:
# Knowledge base for RAG: rule snippets (Consumer Duty, CONC, BCOBS)
FCA_KNOWLEDGE_BASE = [
    "FCA CONC 3.2.1: Credit agreements must clearly state the Representative APR (Annual Percentage Rate) with equal prominence to any simple interest rate advertised.",
    "Consumer Duty Outcome 2 (Products and Services): All products must be designed to meet the needs of identified consumers.",
    "FCA BCOBS 2.2.3: Risk warnings must be easily located and displayed using a font size no smaller than the main body text (prominence rules).",
    "Consumer Duty Outcome 4 (Consumer Understanding): Communication must be tailored to the target audience, ensuring clarity and avoiding jargon.",
    "FCA CONC 4.5.1: Information on missed payment fees and charges must be presented clearly, ideally using a simple table or bullet points."
]

In [5]:
# LLM-as-a-Judge Core Function
def run_regulatory_judge(
    artifact_text: str,
    evaluation_query: str,
    persona_description: str
) -> dict:

    # Step 1 & 2: RAG Retrieval - Find relevant FCA rules based on the query
    retrieved_rules = find_semantically_similar_documents(
        evaluation_query, FCA_KNOWLEDGE_BASE, top_k=2
    )
    context_rules = "\n".join(retrieved_rules)

    # Step 3: Construct the Judge Prompt with RAG and Persona Simulation
    system_prompt = f"""
    You are the **Customer First** regulatory judge, evaluating as the **{persona_description} Persona**.
    Your task is to evaluate a piece of bank communication against the provided FCA Regulatory Rules.

    **INSTRUCTIONS:**
    1. **Context Check (RAG):** Analyze the relevance of the [REGULATORY CONTEXT].
    2. **Chain-of-Thought (CoT):** First, provide step-by-step reasoning on whether the [ARTIFACT TEXT] complies with the relevant rule(s) from the context, specifically concerning the user query.
    3. **Persona Evaluation:** From the perspective of the **{persona_description} Persona**, attempt to answer the comprehension question based ONLY on the [ARTIFACT TEXT].
    4. **Output:** Return your final assessment and score in the required JSON format.

    **CRITERIA:**
    - Compliance Score (1-5, 5=Fully Compliant/Clear)
    - Clarity Score for Persona (1-5, 5=Perfectly Understood)
    - Pass/Fail (Binary: PASS if Compliance >= 4 AND Clarity >= 4)
    - Reasoning Trace (The audit log for CoT)
    """

    user_content = f"""
    [REGULATORY CONTEXT]:
{context_rules}

    [ARTIFACT TEXT]:
{artifact_text}

    [COMPREHENSION QUESTION]: {evaluation_query}

    Respond ONLY with a valid JSON object following this structure:
    {{"compliance_score": "X/5", "clarity_score_persona": "X/5", "pass_fail": "PASS/FAIL", "reasoning_trace": "[Your detailed CoT reasoning]"}}
    """

    # Call the LLM Judge (forcing JSON output) [25, 26]
    response_text = call_llm(
        prompt=[{"role": "system", "content": system_prompt}, {"role": "user", "content": user_content}],
        response_format={ "type": "json_object" }
    )

    # Step 4: Output Parsing
    try:
        return json.loads(response_text)
    except json.JSONDecodeError:
        print("Warning: LLM did not return valid JSON. Returning raw text.")
        return {"error": "Invalid JSON from model", "raw_output": response_text}

print("✅ Customer First function defined.")

✅ Customer First function defined.
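The parsing step above gives up as soon as `json.loads` fails and does not check that the expected fields are present. A minimal sketch of a more tolerant parser follows; `parse_judge_response` and `REQUIRED_KEYS` are hypothetical names introduced here, and the key list is taken from the JSON structure requested in the prompt.

```python
import json
import re

# Keys requested in the judge prompt's JSON structure.
REQUIRED_KEYS = {"compliance_score", "clarity_score_persona", "pass_fail", "reasoning_trace"}


def parse_judge_response(response_text: str) -> dict:
    """Parse the judge's reply, tolerating stray text or a Markdown fence around the JSON."""
    invalid = {"error": "Invalid JSON from model", "raw_output": response_text}
    try:
        parsed = json.loads(response_text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span in the reply.
        match = re.search(r"\{.*\}", response_text, flags=re.DOTALL)
        if match is None:
            return invalid
        try:
            parsed = json.loads(match.group(0))
        except json.JSONDecodeError:
            return invalid
    if not isinstance(parsed, dict):
        return invalid
    # Report missing fields instead of passing a partial verdict downstream.
    missing = REQUIRED_KEYS - set(parsed)
    if missing:
        return {"error": f"Missing keys: {sorted(missing)}", "raw_output": response_text}
    return parsed
```

In the notebook this could stand in for the `try`/`except` block at the end of `run_regulatory_judge`, so that an incomplete verdict surfaces as an explicit error.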


In [6]:
# Standard OpenAI implementation (via LangChain's ChatOpenAI)

def personal_function(prompt, **kwargs):
    # LLM-as-a-Judge needs GPT-4o; extra kwargs (e.g. response_format) are passed to ChatOpenAI
    chat_model = ChatOpenAI(
        model="gpt-4o",
        api_key=OPENAI_API_KEY,
        timeout=30,
        **kwargs,
    )
    # Accept either a ready-made message list or a single user prompt string
    if isinstance(prompt, list):
        return chat_model.invoke(prompt).content
    else:
        return chat_model.invoke([{"role":"user", "content": prompt}]).content

def call_llm(prompt, **kwargs):
    # Use the personal function for simplicity
    return personal_function(prompt, **kwargs)

def call_embeddings(text):
    # Wrapper for embedding (RAG)
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

print("✅ LLM connection and embedding functions defined.")

✅ LLM connection and embedding functions defined.


In [8]:
def find_semantically_similar_documents(query_text: str, documents: list[str], top_k: int = 1) -> list[str]:
    # Get embedding for the query
    query_embedding = call_embeddings(query_text)

    # Get embeddings for all documents
    document_embeddings = [call_embeddings(doc) for doc in documents]

    # Calculate cosine similarity between the query and each document
    similarities = cosine_similarity(np.array([query_embedding]), np.array(document_embeddings))[0]

    # Get the indices of the top_k most similar documents
    top_k_indices = similarities.argsort()[-top_k:][::-1]

    # Return the top_k documents
    return [documents[i] for i in top_k_indices]

print("✅ Semantic similarity function defined.")

✅ Semantic similarity function defined.
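As written, `find_semantically_similar_documents` re-embeds every rule on each call, so each audit costs one embedding request per rule plus one for the query. A minimal sketch of embedding the rules once and reusing the vectors is shown below. `RuleIndex`, `cosine`, and the `embed_fn` parameter are hypothetical names, and the similarity is computed in plain Python so the sketch runs without extra dependencies.

```python
import math
from typing import Callable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class RuleIndex:
    """Embeds each rule once at construction; each search then needs one embedding call."""

    def __init__(self, documents: list[str], embed_fn: Callable[[str], Sequence[float]]):
        self.documents = documents
        self.embed_fn = embed_fn
        self.vectors = [embed_fn(doc) for doc in documents]

    def search(self, query_text: str, top_k: int = 1) -> list[str]:
        query_vec = self.embed_fn(query_text)
        # Rank documents by similarity to the query, most similar first.
        scored = sorted(
            zip(self.documents, self.vectors),
            key=lambda pair: cosine(query_vec, pair[1]),
            reverse=True,
        )
        return [doc for doc, _ in scored[:top_k]]
```

In this notebook the index could be built once with `RuleIndex(FCA_KNOWLEDGE_BASE, call_embeddings)` and queried with `search(evaluation_query, top_k=2)` inside the judge.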


In [9]:
# 5. Simulation: Auditing a Credit Card Brochure Snippet

ARTIFACT_SNIPPET = (
    "**Standard APR 25.9% (variable).** The APR applied to this account is 27.5% Representative (variable). "
    "Fee for missing a payment: Please consult the complex fee matrix table in Appendix C, row 4, column B."
)

JUDGE_QUERY = "How much will you pay if you miss a payment, and is the Representative APR clear?" # Combines compliance check and comprehension question [24]

VULNERABLE_PERSONA = "Low Literacy and Financial Anxiety" # [8, 27]


print("Running Audit for Persona:", VULNERABLE_PERSONA)
print("--------------------------------------------------")
print("Artifact Text:", ARTIFACT_SNIPPET)
print("\n")

audit_results = run_regulatory_judge(
    artifact_text=ARTIFACT_SNIPPET,
    evaluation_query=JUDGE_QUERY,
    persona_description=VULNERABLE_PERSONA
)

print(json.dumps(audit_results, indent=4))

Running Audit for Persona: Low Literacy and Financial Anxiety
--------------------------------------------------
Artifact Text: **Standard APR 25.9% (variable).** The APR applied to this account is 27.5% Representative (variable).Fee for missing a payment: Please consult the complex fee matrix table in Appendix C, row 4, column B.




RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
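The 429 above carries the code `insufficient_quota`, which points at the account's plan and billing, so retrying is unlikely to help in that case. For transient rate limits, a retry with exponential backoff is a common mitigation. The sketch below is a generic helper under that assumption; `call_with_backoff` and its parameters are hypothetical names, not part of the OpenAI or LangChain APIs.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: tuple[type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exception types with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Wait base_delay, 2x, 4x, ... plus up to 50% random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")
```

With the `openai` package used here, the audit call could be wrapped as `call_with_backoff(lambda: run_regulatory_judge(...), retry_on=(openai.RateLimitError,))`, which would require an additional `import openai`.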

## Interpreting the Output (Compliance Heatmap)

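On a successful run, the output is a structured verdict from the LLM Judge: a JSON object with `compliance_score`, `clarity_score_persona`, `pass_fail`, and `reasoning_trace`. The run recorded above stopped with a 429 `insufficient_quota` error before the judge returned a verdict, so no scores are shown.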

If the 'Clarity Score for Persona' is low (e.g., because the reader must navigate a 'complex fee matrix' to find the missed-payment fee), `pass_fail` should be 'FAIL', since a PASS requires both scores to be at least 4. The `reasoning_trace` acts as the mandatory audit log (Chain-of-Thought) explaining the verdict.
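Because the judge computes `pass_fail` itself, its verdict can disagree with its own scores. A minimal sketch of re-deriving the verdict in code from the rule stated in the prompt (PASS only if both scores are at least 4) follows. `parse_score` and `recompute_verdict` are hypothetical helpers, and the sample result in the usage note is made up for illustration.

```python
def parse_score(value) -> int:
    """Turn a score such as "3/5" into the integer 3, rejecting out-of-range values."""
    score = int(str(value).split("/")[0].strip())
    if not 1 <= score <= 5:
        raise ValueError(f"Score out of range: {value!r}")
    return score


def recompute_verdict(result: dict, threshold: int = 4) -> dict:
    """Re-derive PASS/FAIL from the two scores and compare it with the model's verdict."""
    compliance = parse_score(result["compliance_score"])
    clarity = parse_score(result["clarity_score_persona"])
    expected = "PASS" if compliance >= threshold and clarity >= threshold else "FAIL"
    return {
        "expected_pass_fail": expected,
        "model_pass_fail": result.get("pass_fail"),
        "consistent": result.get("pass_fail") == expected,
    }
```

For a hypothetical result such as `{"compliance_score": "4/5", "clarity_score_persona": "2/5", "pass_fail": "PASS"}`, `recompute_verdict` reports an expected verdict of FAIL and `consistent: False`, which flags the result for manual review.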

This approach automates the audit of static content, helping the compliance team identify areas of confusion or non-compliance.
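The demonstration above audits one artifact for one persona. To get closer to the heatmap named in the heading, the same judge could be run over every artifact and persona pair and the verdicts collected in a grid. The sketch below assumes a `judge_fn` with the same keyword arguments as `run_regulatory_judge`; `audit_grid` and `failing_cells` are hypothetical names.

```python
from typing import Callable


def audit_grid(
    artifacts: dict[str, str],
    personas: list[str],
    query: str,
    judge_fn: Callable[..., dict],
) -> dict[str, dict[str, str]]:
    """Run judge_fn for every (artifact, persona) pair and collect the verdicts."""
    grid: dict[str, dict[str, str]] = {}
    for name, text in artifacts.items():
        grid[name] = {}
        for persona in personas:
            result = judge_fn(
                artifact_text=text,
                evaluation_query=query,
                persona_description=persona,
            )
            # Judge errors (e.g. invalid JSON) carry no pass_fail key.
            grid[name][persona] = result.get("pass_fail", "ERROR")
    return grid


def failing_cells(grid: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """List the (artifact, persona) pairs whose verdict is anything other than PASS."""
    return [
        (artifact, persona)
        for artifact, row in grid.items()
        for persona, verdict in row.items()
        if verdict != "PASS"
    ]
```

Passing `judge_fn=run_regulatory_judge` would make one LLM call per cell, so the grid size directly drives API cost.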