# LLM-Powered Credit Quality Assurance (QA) System

## Introduction

This notebook implements an **LLM-powered Credit Quality Assurance (QA) system** that combines machine-learning credit risk models with bank credit policy documents to support explainable and policy-aligned lending decisions.

In modern banks, credit models such as LightGBM or XGBoost are used to estimate the **Probability of Default (PD)** for each loan. However, PD alone is not sufficient for operational decision-making. Credit Quality Assurance teams must also ensure that decisions are **consistent with internal credit policies**, **risk appetite**, and **regulatory standards**.

This notebook represents the **AI decision-support layer** of the credit process. It takes as input:

- A table of **QA loan cases** generated by a PD model and rule-based policy engine  
- **Credit policy documents** (PDFs) such as underwriting guidelines and risk appetite statements  

Using **Retrieval-Augmented Generation (RAG)**, the system retrieves the most relevant policy text for each loan and uses a large language model (LLM) to produce:

- An **AI credit decision**  
- A **policy compliance status**  
- A **human-readable explanation**  
- A **recommended next action**  

## System Architecture

Loan Data

   ↓

PD Model + SHAP

   ↓

Policy Rules → QA Cases

   ↓

RAG (Policy PDFs)

   ↓

LLM → AI QA Decision

In [25]:
import pandas as pd
import re

from langchain_community.document_loaders import PyPDFLoader,DirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAI,OpenAIEmbeddings,ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_huggingface import HuggingFaceEmbeddings
import json


In [29]:
pd.set_option("display.max_colwidth", None)
pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)

In [30]:
# Load the QA cases dataset
qa_cases = pd.read_csv(r'C:\Users\USER\Desktop\data set\lending club data set\data\qa_cases.csv')
qa_cases.head()

Unnamed: 0,case_id,PD,Action,dti,loan_to_income,revol_util,emp_length,verification_status,LGD,expected_loss,top_risk_factors,segment_default_rate,pd_vs_segment
0,1452671,0.5319,Review/Reject,0.2749,0.200994,44.4,10.0,Source Verified,0.75,7180.650779,"int_rate ,dti ,acc_open_past_24mths ,term ,home_ownership_RENT",0.3418,0.1901
1,464451,0.268195,QA,0.2075,0.128205,45.9,10.0,Source Verified,0.45,1206.878739,"int_rate ,acc_open_past_24mths ,loan_to_income ,avg_cur_bal ,application_type_Joint App",0.1617,0.106495
2,809706,0.288979,QA,0.0603,0.25,45.9,3.0,Source Verified,0.45,3901.221581,"int_rate ,avg_cur_bal ,acc_open_past_24mths ,term ,dti",0.1617,0.127279
3,1538286,0.099637,QA,0.2784,0.105448,9.4,10.0,Source Verified,0.45,269.020804,"int_rate ,revol_util ,dti ,acc_open_past_24mths ,loan_to_income",0.1617,-0.062063
4,822739,0.366241,Review/Reject,0.388,0.416667,86.3,10.0,Not Verified,0.65,4761.137546,"dti ,avg_cur_bal ,acc_open_past_24mths ,loan_to_income ,bc_util",0.3418,0.024441


In [31]:
qa_cases.columns

Index(['case_id', 'PD', 'Action', 'dti', 'loan_to_income', 'revol_util',
       'emp_length', 'verification_status', 'LGD', 'expected_loss',
       'top_risk_factors', 'segment_default_rate', 'pd_vs_segment'],
      dtype='object')

In [32]:
# Loda PDF documtents
loader = DirectoryLoader(
    r"C:\Users\USER\Desktop\books\policy",
    glob='*.pdf',
    loader_cls=PyPDFLoader
)

documensts= loader.load()
print(f"loaded {len(documensts)} pages")

loaded 11 pages


In [33]:
# Split the documents into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=50)

chunks = text_splitter.split_documents(documensts)

In [34]:
# Embed the documents and create a vector store
embedding = HuggingFaceEmbeddings(model_name = "sentence-transformers/all-MiniLM-L6-v2")
vector_store = FAISS.from_documents(chunks, embedding)

# Create a retriever from the vector store
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k":4})

In [35]:
# set the llm model
llm = ChatOpenAI(model="gpt-4o-mini",temperature=0)

In [36]:
# Function to extract JSON from text
def extract_json(text):
    text = text.strip()
    text = re.sub(r"^```json", "", text)
    text = re.sub(r"^```", "", text)
    text = re.sub(r"```$", "", text)
    return text.strip()

qa

In [52]:
# Creat a QA loop with the first 5 cases
results = []

for _, case in qa_cases.head(10).iterrows():
    query = f"""
    case_id: {case.case_id}
    PD: {case.PD:.2%}
    LGD : {case.LGD:.2%}
    Expected Loss : {case.expected_loss:.2}$
    Decision Segment: {case.Action}
    Segment default rate: {case.segment_default_rate:.2%}
    Top risk drivers: {case.top_risk_factors}

    Explain the credit decision using the bank's policy.
    """

    # Retrieve relevant policy chunks
    docs = retriever.invoke(query)
    context = "\n\n".join([d.page_content for d in docs])

    # Build structured prompt
    prompt = f"""
    you are a senior Credit Quality Assurance (QA) analyst in a retail bank.

    Your job is to evaluate whether this loan is safe, borderline, or unacceptable based on:
    1. The borrower's Probability of Default (PD)
    2. The portfolio risk of its policy segment (Segment Default Rate)
    3. The bank's credit policy rules
    4. The key risk drivers

    Definitions:
    - PD = the model's estimate of this borrower's default risk
    - LGD = Loss Given Default, the percentage of the loan that the bank is expected to lose if the borrower defaults.
    - Expected Loss = PD × LGD × Loan Amount, representing the estimated financial loss to the bank.
    - Segment Default Rate = the historical default rate of loans in the same policy segment (QA / Reject / etc).
      It represents the typical risk level of that group.

    Interpretation rules:
    - A loan with high Expected Loss represents higher financial risk even if PD is moderate.
    - If Expected Loss is high relative to typical loans in the same segment, the loan should be treated as riskier.
    - QA loans are not automatically approved; they require manual review when PD, Expected Loss, or risk drivers are elevated.


    Bank Policy:
    {context}

    Loan Case:
    {query}

    Return your answer strictly in this JSON format:
{{
  "ai_decision": "...",
  "policy_status": "...",
  "ai_reason": "...",
  "next_step": "..."
}}
"""

    response = llm.invoke(prompt).content

    try:
        clean = extract_json(response)
        decision = json.loads(clean)
    except:
        decision = {
            "ai_decision": "Parsing Error",
            "policy_status": "Unknown",
            "ai_reason": response,
            "next_step": "Manual review"
        }

    results.append({
        "case_id": case.case_id,
        "pd": case.PD,
        "segment": case.Action,
        "segment_default_rate": case.segment_default_rate,
        "Top Risk Drivers": case.top_risk_factors,
        "ai_decision": decision["ai_decision"],
        "policy_status": decision["policy_status"],
        "next_step": decision["next_step"],
        "ai_reason": decision["ai_reason"]
    })

In [53]:
ai_decision = pd.DataFrame(results)
ai_decision.index = ai_decision.index +1
ai_decision

Unnamed: 0,case_id,pd,segment,segment_default_rate,Top Risk Drivers,ai_decision,policy_status,next_step,ai_reason
1,1452671,0.5319,Review/Reject,0.3418,"int_rate ,dti ,acc_open_past_24mths ,term ,home_ownership_RENT",Unacceptable,Reject,Reject the loan application.,"The borrower's Probability of Default (PD) is 53.19%, which is categorized as 'Very High Risk' according to bank policy. Additionally, the Expected Loss of $7,200 is significant, and the Segment Default Rate of 34.18% is higher than the PD, indicating that this loan poses a higher financial risk. The loan also falls under the mandatory QA review due to the elevated PD."
2,464451,0.268195,QA,0.1617,"int_rate ,acc_open_past_24mths ,loan_to_income ,avg_cur_bal ,application_type_Joint App",borderline,QA Review Required,Conduct a manual review of the loan application to assess the risk drivers and determine if the loan can be approved or should be rejected.,"The borrower's PD of 26.82% is below the unacceptable threshold of 30%, but the Expected Loss of $1,200 is significant relative to the segment default rate of 16.17%. Additionally, the loan falls into the QA segment, indicating that it requires further manual review due to elevated risk drivers."
3,809706,0.288979,QA,0.1617,"int_rate ,avg_cur_bal ,acc_open_past_24mths ,term ,dti",borderline,QA Review Required,Conduct a manual review to assess the borrower's overall creditworthiness and risk factors.,"The borrower's PD is 28.90%, which is below the unacceptable threshold of 30%. However, the Expected Loss of $3,900 is significant relative to the segment default rate of 16.17%, indicating a higher financial risk. Additionally, the loan falls into the QA segment, necessitating a manual review due to elevated risk factors."
4,1538286,0.099637,QA,0.1617,"int_rate ,revol_util ,dti ,acc_open_past_24mths ,loan_to_income",unacceptable,loan does not meet acceptable credit risk criteria,Manual review required due to QA flag and risk assessment.,"The borrower's Probability of Default (PD) is 9.96%, which is below the unacceptable threshold of 30%. Additionally, the Expected Loss of $270 is relatively low, but the loan falls under the QA segment, indicating it requires manual review. However, the PD is significantly lower than the Segment Default Rate of 16.17%, suggesting that this loan is riskier compared to typical loans in the same segment. The presence of top risk drivers such as high DTI and other factors further indicates elevated risk."
5,822739,0.366241,Review/Reject,0.3418,"dti ,avg_cur_bal ,acc_open_past_24mths ,loan_to_income ,bc_util",unacceptable,"PD exceeds 30%, which is unacceptable credit risk.",Reject the loan application based on the unacceptable credit risk.,"The borrower's Probability of Default (PD) is 36.62%, which falls into the 'Very High Risk' category according to the risk bands. Additionally, the Expected Loss of $4,800 is significant, and the Segment Default Rate of 34.18% indicates that this loan is riskier than typical loans in the same segment. The loan also requires a mandatory QA review due to the elevated PD."
6,549665,0.229531,QA,0.1617,"int_rate ,loan_to_income ,total_rev_hi_lim ,acc_open_past_24mths ,purpose_home_improvement",borderline,QA Review Required,Conduct a manual review to assess the borrower's overall creditworthiness and the impact of the top risk drivers.,"The borrower's PD of 22.95% is below the unacceptable threshold of 30%, but the Expected Loss of $1,200 is significant relative to the segment default rate of 16.17%. Additionally, the loan falls into the QA segment, indicating that it requires manual review due to elevated risk factors."
7,521926,0.083505,QA,0.1617,"int_rate ,avg_cur_bal ,loan_to_income ,mths_since_last_delinq ,acc_open_past_24mths",safe,QA Review Required,Conduct a manual review of the loan application considering the top risk drivers.,"The borrower's PD is significantly below the unacceptable threshold of 30%, and while the Expected Loss is moderate, it is not high relative to the segment default rate of 16.17%. However, the loan falls into the QA segment, which necessitates a manual review due to the presence of elevated risk drivers."
8,383370,0.230352,QA,0.1617,"term ,loan_to_income ,avg_cur_bal ,loan_amnt ,int_rate",borderline,QA Review Required,Conduct a manual review to assess the borrower's overall creditworthiness and the impact of the identified risk drivers.,"The borrower's PD of 23.04% is below the unacceptable threshold of 30%, but the Expected Loss of $2,500 is significant relative to the segment default rate of 16.17%. Additionally, the loan falls into the QA segment, indicating that it requires further manual review due to elevated risk factors."
9,707053,0.1442,QA,0.1617,"avg_cur_bal ,int_rate ,application_type_Joint App ,loan_to_income ,loan_amnt",safe,QA,Proceed with manual review to confirm the loan's suitability.,"The borrower's PD of 14.42% is below the unacceptable threshold of 30%. The Expected Loss of $2,000 is moderate compared to the Segment Default Rate of 16.17%, indicating that the loan is within acceptable risk levels. Although the loan is flagged for QA review due to its decision segment, it does not meet any of the unacceptable risk conditions outlined in the bank's policy."
10,1629238,0.397524,Review/Reject,0.3418,"int_rate ,dti ,revol_util ,total_rev_hi_lim ,earliest_cr_line_Oct-1999",unacceptable,manual review required,Conduct a manual review of the loan application to assess additional risk factors and make a final decision.,"The borrower's PD of 39.75% falls into the 'Very High Risk' category, which is above the 30% threshold for unacceptable credit risk. Additionally, the Expected Loss of $2,300 is significant, and the Segment Default Rate of 34.18% indicates a higher risk environment. The loan requires a mandatory QA review due to these elevated risk factors."


# AI Credit QA Decision Output

The table below shows the **LLM-generated Credit Quality Assurance (QA) decisions** for a sample of high-risk and borderline loan applications.

Each row represents a loan that was flagged by the machine-learning risk model and credit policy rules.  
Using **Retrieval-Augmented Generation (RAG)**, the LLM reviewed the relevant credit policy documents and produced:

- An **AI decision** (e.g. Reject, Approve, or Route to QA)  
- A **policy compliance status**  
- A **recommended next action**  
- A **human-readable explanation** linking the borrower’s risk profile to the bank’s credit policy  

This output demonstrates how large language models can be used as a **governed decision-support layer** on top of traditional credit-risk models, enabling faster, more consistent, and more explainable QA reviews.

In [54]:
# Save the AI decisions to a CSV file
ai_decision.to_csv("ai_decisions.csv", index=False)