# AI Assistant for Bank Loan Review
(FDIC Rulebook Checker)

## Project Overview
- This project builds an AI assistant for banking. Instead of training a new model from scratch, which is expensive and hard, we use prompt engineering: we give specific instructions to an existing model (such as GPT) so that it behaves as a specialist in one thing, banking rules.

- We give the AI the official "rulebook" for loans (the FDIC RMS Manual, Section 3.2). The AI's job is to read a loan application, check it against this rulebook, and tell us whether there are any risks or missing details.

- **Crucial Rule:** The AI is just an advisor. It points out risks, but it never says "Approve" or "Reject." That decision is left to the human banker.

## The Problem We Are Solving

We need an assistant that follows four strict rules:
1.  **Source of Truth:** It must strictly use the **FDIC Manual Section 3.2** as its only knowledge base.
2.  **Advisory Only:** It provides **risk warnings and observations**, but **never** makes the final "Approve/Reject" decision.
3.  **Hallucination-Free:** It refuses to answer questions whose answers are not found in the text (no guessing).
4.  **Interactive:** The final product is a **web-based chat tool** (Gradio) for real-time analysis.
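
Rule 2 can also be spot-checked after the model responds. The sketch below is not part of the original notebook: `contains_credit_decision` and its patterns are hypothetical, and the regular expressions are a crude heuristic for flagging verdict-style wording for human review, not a reliable classifier.

```python
import re

# Hypothetical patterns for verdict-style wording (an assumption, not from the notebook).
DECISION_PATTERNS = [
    r"\b(?:loan|application)\s+(?:is|should be|must be)\s+(?:approved|rejected|denied)\b",
    r"\bI\s+(?:recommend|suggest)\s+(?:approving|rejecting|denying)\b",
]


def contains_credit_decision(text: str) -> bool:
    """Return True if the text contains wording that reads like an approve/reject verdict."""
    return any(re.search(p, text, flags=re.IGNORECASE) for p in DECISION_PATTERNS)


print(contains_credit_decision("The loan should be approved."))
print(contains_credit_decision("Collateral documentation appears incomplete."))
```

A pattern match only means the response deserves a second look; sentences that merely mention approval in passing can also trigger it.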

## Methodology
- **Input:** PDF/Text Loan Documents + FDIC Rulebook.
- **Process:** The AI extracts text, checks it against the rulebook using a strict System Prompt, and generates a compliance summary.
- **Output:** A risk analysis report.

# Install Required Libraries
Install the necessary tools:
- Gradio: For the web interface
- PyPDF: To read PDF loan documents
- OpenAI: To access the LLM



In [2]:
!pip install gradio pypdf openai --quiet


# Import Required Libraries
We import the libraries needed for the project:

- `PdfReader` from **PyPDF** to read and extract text from the FDIC PDF.  
- `OpenAI` to interact with the language model.  
- `gradio` to build a simple chat-based interface.


In [3]:
from pypdf import PdfReader
from openai import OpenAI
import gradio as gr
from google.colab import userdata

# Load FDIC Section 3.2 Text
We read the FDIC RMS Manual – Section 3.2 (Loans) from a text file and store it in a variable.  
This text will serve as the **single source of truth** for our Assistant.




In [4]:
with open("/content/FDIC_SECTION.txt", "r", encoding="utf-8") as f:
    FDIC_SECTION_3_2 = f.read()

In [5]:
# OpenAI was imported above; the key and endpoint come from Colab secrets.
client = OpenAI(
    api_key=userdata.get("API_KEY"),
    base_url=userdata.get("BASE_URL"),
)

# Prompting Techniques for Regulatory Loan Review Using FDIC Guidance

## 1. Purpose of the System Prompt
- Sets up an AI assistant specifically for analyzing loans under **FDIC RMS Manual – Section 3.2**.
- Goal: Identify **compliance gaps, risks, and documentation issues**.
- Ensures the AI focuses only on **regulatory guidance**, avoiding unrelated advice.

## 2. Key Prompting Techniques Used

### a) Role Definition
- AI is explicitly told its role: *Regulatory Loan Review Assistant*.
- Limits AI to **examiner-style insights** rather than general advice.

### b) Source Authority Restriction
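- AI must base its regulatory reasoning on **FDIC Section 3.2 only**; general banking knowledge is allowed just for interpreting basic loan terms (e.g., LTV, DTI).
- Keeps answers traceable to the regulatory text.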

### c) Operational Guidelines
- Instructions focus on **risk classification, documentation review, and analysis**.
- Queries outside Section 3.2 trigger a refusal.

### d) Prohibited Actions
- AI cannot provide **credit recommendations, legal advice, or speculation**.
- Acts as a **safety guardrail**.

### e) Tone and Style Specification
- Neutral, professional, and clear communication.
- Bullet points are preferred for clarity.
- Makes outputs **examiner-ready**.

### f) Focus Areas for Analysis
- Well-defined loan weaknesses (e.g., missing amortization).
- Adequacy of collateral and financial documentation.
- Deviations from prudent underwriting practices.
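
One way to keep these techniques visible in code is to assemble the system prompt from named parts. This is a sketch only: `build_system_prompt` and its parameters are hypothetical, and the notebook itself defines the prompt as a single string.

```python
def build_system_prompt(role, objective, guidelines, prohibited, tone, focus):
    """Assemble a system prompt from the parts described above (hypothetical helper)."""
    sections = [
        f"You are a {role}.",
        "Primary Objective:\n" + objective,
        "Operational Guidelines:\n"
        + "\n".join(f"{i}. {g}" for i, g in enumerate(guidelines, start=1)),
        "Prohibited Actions:\n" + "\n".join(f"- Do NOT {p}" for p in prohibited),
        "Tone and Style:\n" + "\n".join(f"- {t}" for t in tone),
        "Analysis Focus Areas:\n" + "\n".join(f"- {a}" for a in focus),
    ]
    # Blank lines between sections keep the prompt easy to read and to diff.
    return "\n\n".join(sections)


prompt = build_system_prompt(
    role="Regulatory Loan Review Assistant",
    objective="Identify compliance gaps using only the provided FDIC Section 3.2 text.",
    guidelines=[
        "Base all regulatory reasoning on the provided text.",
        "Refuse queries that fall outside Section 3.2.",
    ],
    prohibited=["offer loan approvals or denials.", "provide legal advice."],
    tone=["Neutral and professional.", "Use bullet points."],
    focus=["Well-defined weaknesses such as missing amortization."],
)
print(prompt)
```

Keeping each part in its own argument makes it easier to change one guardrail without touching the others.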

## 3. Why This Prompting Technique Works
- **Precision:** Narrow scope reduces irrelevant answers.
- **Safety:** Prohibited actions prevent risky guidance.
- **Clarity:** Tone and formatting ensure readability.
- **Efficiency:** AI knows exactly what analysis is expected.

## 4. Real-World Application
- Regulatory training, loan audits, and documentation review.
- Helps examiners quickly identify **risk and compliance issues**.

## Summary
This is an example of **role-based, scope-restricted, authority-constrained AI prompting**.  
Clear instructions on source, behavior, tone, and focus turn the AI into a specialized assistant for **regulatory loan analysis**.
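
Because the scope restriction relies on a fixed refusal sentence (the system prompt in the next cell uses "This is out of scope based on the provided guidance."), a caller can detect refusals with a simple string check. The helper below is a hypothetical sketch, not part of the notebook, and it assumes the model reproduces the sentence verbatim.

```python
# The refusal sentence the system prompt asks the model to use.
REFUSAL_SENTENCE = "This is out of scope based on the provided guidance."


def is_refusal(response_text: str) -> bool:
    """Return True if the response contains the refusal sentence."""
    # Collapse whitespace and ignore case so line wraps do not hide a match.
    normalized = " ".join(response_text.split()).lower()
    return REFUSAL_SENTENCE.lower() in normalized


print(is_refusal("This is out of scope based on the provided guidance."))
print(is_refusal("The summary lists only the property address."))
```

A paraphrased refusal would not be caught, so this check is best used for logging or testing rather than as a guarantee.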

In [6]:
SYSTEM_PROMPT = """
You are a Regulatory Loan Review Assistant operating strictly under FDIC RMS Manual – Section 3.2.

Primary Objective:
Analyze loan summaries and user inquiries using *only* the provided FDIC Section 3.2 text. Your goal is to identify regulatory compliance gaps, risk classifications, and documentation deficiencies without making lending decisions.

Operational Guidelines:
1. Source Authority: Base all regulatory reasoning strictly on the provided FDIC text. However, you may use standard banking knowledge to interpret basic loan terms (e.g., LTV, DTI, collateral types) or document structures unless the text contradicts them.
2. Scope of Advice: Focus on explaining examiner expectations, defining risk categories (e.g., Substandard, Special Mention), and clarifying appraisal requirements.
3. Refusals: If a query falls outside Section 3.2 or requests a credit decision, strictly refuse by stating: "This is out of scope based on the provided guidance."


Prohibited Actions:
- Do NOT offer loan approvals, denials, or specific credit scores.
- Do NOT provide legal, investment, or personal financial advice.
- Do NOT speculate on rules not found in the provided text.

Tone and Style:
- Maintain a neutral, objective, and professional examiner-style tone.
- Use clear, simple English to explain complex regulatory concepts.
- Organize findings with bullet points for clarity.

Analysis Focus Areas:
- Identify "well-defined weaknesses" in loan structures (e.g., lack of amortization).
- Verify if collateral and financial documentation meet safety standards.
- Highlight any deviations from prudent underwriting practices described in the manual.
"""

In [7]:
test_loan_text = """
Uniform Residential Loan Application (URLA) - Summary
===================================================

I. Borrower Information
----------------------
Name: Jane Doe
Age: 53
Marital Status: Unmarried
Employment: Officer, Miami Police Department
Years on Job: 7 Years


II. Property and Loan Information
--------------------------------
Subject Property Address: 0000 NW 45th Street, Miami, FL
Loan Amount: $180,000
Interest Rate: 10.490%
Loan Term: 360 Months (Fixed Rate)
Estimated Property Value: $200,000
Lien Position: First Mortgage


III. Assets and Income
---------------------
Real Estate Owned (Market Value): $365,000
Gross Rental Income: $2,100 per month
Net Rental Income (after expenses): -$277.69 (Negative)


IV. Transaction Details
----------------------
Total Closing Costs: $8,229
Borrower Cash Contribution: $19,259


V. Declarations
---------------
Bankruptcy (Past 7 Years): No
Foreclosure or Delinquency: No
Occupancy Status: Primary Residence


---------------------------------------------------
Generated for FDIC Regulatory Analysis Testing"""

In [8]:
test_question = "Check the property section. Is the property description complete according to standard appraisal requirements?"


test_response = client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": f"""
Loan Document Content:
{test_loan_text}

FDIC Section 3.2 Reference:
{FDIC_SECTION_3_2}

User Question:
{test_question}
"""
        },
    ],
    temperature=0.1,
)
print(test_response.choices[0].message.content)

Based on the provided FDIC Section 3.2 guidance, the property description in the loan summary appears to be minimal, listing only the address ("0000 NW 45th Street, Miami, FL"). 

Regulatory expectations for appraisal requirements specify that independent appraisals are required for transactions over certain thresholds (e.g., residential over $400,000). The appraisal process must ensure a thorough and accurate property description, which typically includes details such as property type, size, improvements, and legal description, to support the valuation.

Since the summary does not include detailed property information beyond the address, it does not demonstrate compliance with the comprehensive property description standards necessary for a valid appraisal. Proper documentation should include sufficient property details to support the valuation and ensure safety and soundness.

**In summary:**
- The property description in the summary is not complete according to standard appraisal re

# Helper Function: Read the File

This function is the "eyes" of the system. The AI cannot read a PDF file directly; it needs plain text.

* If you upload a **PDF**, this tool scans every page and pulls out the words.
* If you upload a **Text file**, it just reads it normally.
* If you upload something else, it returns "Unsupported file format".


In [9]:
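def extract_text(file):
    """Return the plain text of an uploaded .txt or .pdf file, or a status message."""
    if file is None:
        return "No document uploaded."

    # Compare the extension case-insensitively so "LOAN.PDF" is handled too.
    filename = file.name.lower()

    if filename.endswith(".txt"):
        with open(file.name, "r", encoding="utf-8") as f:
            return f.read()

    if filename.endswith(".pdf"):
        reader = PdfReader(file.name)
        text = ""
        for page in reader.pages:
            page_text = page.extract_text()
            # Pages without a text layer (e.g. scans) yield no text; skip them.
            if page_text:
                text += page_text + "\n"
        return text

    return "Unsupported file format"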
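
Note that `extract_text` returns its status messages through the same channel as document text, so a caller cannot tell them apart without checking. A minimal sketch of such a check follows, assuming the two message strings above stay unchanged. `is_extraction_failure` is a hypothetical helper, not part of the notebook.

```python
# Status strings returned by extract_text when no text could be read.
STATUS_MESSAGES = {"No document uploaded.", "Unsupported file format"}


def is_extraction_failure(text: str) -> bool:
    """Return True for a status message or for empty text (e.g. a scanned PDF)."""
    stripped = text.strip()
    return not stripped or stripped in STATUS_MESSAGES


print(is_extraction_failure("Unsupported file format"))
print(is_extraction_failure("Loan Amount: $180,000"))
```

A caller could use this to return the message to the user directly instead of sending it to the model as if it were a loan document.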


## Loan Document Summarization Prompt

### Context Used in this Prompt
- The model acts as a **loan document summarizer**.
- It only summarizes **what is written in the loan document**.
- The summary must have **exactly 10 lines**.
- Each line focuses on a **specific loan detail** like amount, borrower, or collateral.
- If information is missing, it must be **clearly mentioned**.


### Prompt Engineering Techniques Used
- **Role-based prompting** to define the task clearly.
- **Strict formatting rules** to get consistent output.
- **Negative instructions** to avoid opinions, risk judgments, or advice.


In [10]:
Summarization_Prompt = """
You are a Loan Document Summarization Assistant.

Your task is to summarize the provided loan document strictly as a factual
overview. Do not include opinions, risk judgments, approvals, or advice.

Summarization rules:
- Produce EXACTLY 10 concise lines
- Each line must be a single, clear statement
- Focus only on:
  • Loan structure and terms
  • Borrower details
  • Property / collateral information
  • Key documentation present or missing
- Do not infer or speculate beyond the document
- If information is missing, state it explicitly

OUTPUT FORMAT (follow strictly):

1. Loan Type and Purpose:
2. Loan Amount and Term:
3. Interest Rate and Repayment Structure:
4. Borrower Information:
5. Guarantors (if any):
6. Property / Collateral Description:
7. Appraisal Status:
8. Financial Information Provided:
9. Key Loan Covenants or Conditions:
10. Documentation Gaps or Missing Items:

Tone:
- Neutral, factual, and examiner-style
- Document-based statements only
"""


In [11]:
def summarize_document(file):
    try:
        document_text = extract_text(file)

        response = client.chat.completions.create(
            model="gpt-4.1-nano",
            messages=[
                {
                    "role": "system",
                    "content": Summarization_Prompt
                },
                {
                    "role": "user",
                    "content": document_text
                }
            ]
        )

        return response.choices[0].message.content

    except Exception as e:
        return f"System Error: {str(e)}"
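
The prompt asks for exactly 10 labelled lines, but the model is not guaranteed to comply, so the format can be checked after the fact. The sketch below is an assumption layered on top of the notebook: `check_summary_format` is a hypothetical helper, and the labels are copied from the output format in the summarization prompt.

```python
# Labels copied from the OUTPUT FORMAT section of the summarization prompt.
EXPECTED_LABELS = [
    "Loan Type and Purpose",
    "Loan Amount and Term",
    "Interest Rate and Repayment Structure",
    "Borrower Information",
    "Guarantors (if any)",
    "Property / Collateral Description",
    "Appraisal Status",
    "Financial Information Provided",
    "Key Loan Covenants or Conditions",
    "Documentation Gaps or Missing Items",
]


def check_summary_format(summary: str) -> list:
    """Return a list of format problems; an empty list means the summary matches."""
    # Ignore blank lines so extra spacing between items is not counted.
    lines = [line.strip() for line in summary.splitlines() if line.strip()]
    problems = []
    if len(lines) != len(EXPECTED_LABELS):
        problems.append(f"expected {len(EXPECTED_LABELS)} lines, found {len(lines)}")
    for number, (label, line) in enumerate(zip(EXPECTED_LABELS, lines), start=1):
        if not line.startswith(f"{number}. {label}:"):
            problems.append(f"line {number} does not start with '{number}. {label}:'")
    return problems


example = "\n".join(
    f"{i}. {label}: Not stated" for i, label in enumerate(EXPECTED_LABELS, start=1)
)
print(check_summary_format(example))
```

The check is deliberately strict: a summary that wraps labels in bold markers or splits an item across two lines would be reported as a problem.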


# The Main Brain: Answering Questions

This function is where the magic happens. It connects the **Loan**, the **Rulebook**, and the **User's Question** together.

* **Step 1:** It grabs the text from the uploaded loan file.
* **Step 2:** It retrieves the "FDIC Rulebook" text we loaded earlier.
* **Step 3:** It sends a combined message to the AI that looks like this:
    * *"Here are the loan details..."*
    * *"Here is the official rulebook..."*
    * *"The user wants to know: [Question]"*

The instruction to answer using ONLY the rulebook comes from the system prompt, which is sent with every request.

Placing the rulebook text in the request lets the AI look up the relevant rule instead of relying on memory, which reduces guessing but does not rule it out.
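
The combined message can be sketched as a small helper. The names `build_user_message` and `build_messages` are hypothetical (the notebook builds the same text inline with an f-string); the section labels match the ones the notebook uses.

```python
def build_user_message(loan_text: str, rulebook_text: str, question: str) -> str:
    """Combine the three inputs under the same labels the notebook uses."""
    return (
        "Loan Document Content:\n"
        f"{loan_text}\n\n"
        "FDIC Section 3.2 Reference:\n"
        f"{rulebook_text}\n\n"
        "User Question:\n"
        f"{question}\n"
    )


def build_messages(system_prompt: str, loan_text: str, rulebook_text: str, question: str) -> list:
    """Return the messages list for a chat completion request."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": build_user_message(loan_text, rulebook_text, question)},
    ]


messages = build_messages(
    system_prompt="(system prompt text)",
    loan_text="Loan Amount: $180,000",
    rulebook_text="(FDIC Section 3.2 text)",
    question="What documentation risks are present?",
)
print(messages[1]["content"])
```

Building the message in one place keeps the test call and the question-answering function from drifting apart.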

In [12]:
def analyze_document(file, user_question):
    try:
        document_text = extract_text(file)

        response = client.chat.completions.create(
            model="gpt-4.1-nano",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {
                    "role": "user",
                    "content": f"""
Loan Document Content:
{document_text}

FDIC Section 3.2 Reference:
{FDIC_SECTION_3_2}

User Question:
{user_question}
"""
                }
            ],
            temperature=0.1,  # match the low temperature used in the test call
        )
        return response.choices[0].message.content

    except Exception as e:
        return f"System Error: {str(e)}"


# Gradio Chat Interface: Regulatory Loan Analysis

This section sets up a **Gradio web interface** for the Regulatory Reasoning Assistant.  
It allows users to upload loan documents, get a concise summary, and ask regulatory questions based on **FDIC Section 3.2**.

---




In [13]:
with gr.Blocks() as interface:

    # 🔹 CENTERED HEADER
    gr.Markdown("""
    <div style="text-align:center">
        <h1>🏦 Regulatory-Aligned Loan Document Analysis System</h1>
        <h3>FDIC RMS Manual – Section 3.2 (Loans)</h3>
        <p>
            This system summarizes uploaded loan documents and provides
            <b>regulatory-aligned observations</b>.
        </p>
    </div>
    """)

    gr.Markdown("---")

    # 🔹 STEP 1
    gr.Markdown("## 📄 Step 1: Upload Loan Document")

    file_input = gr.File(
        label="Loan Document (.pdf or .txt)",
        file_types=[".pdf", ".txt"]
    )

    summarize_btn = gr.Button("📑 Summarize Document", variant="primary")

    summary_output = gr.Textbox(
        label="📑 Document Summary (10 Lines)",
        lines=10,
        show_copy_button=True
    )

    gr.Markdown("---")

    # 🔹 STEP 2
    gr.Markdown("## ❓ Step 2: Regulatory Question Analysis")

    with gr.Row():

        # LEFT PANEL
        with gr.Column():
            question_input = gr.Textbox(
                label="Ask a Question",
                placeholder="Example: What documentation risks are present?",
                lines=8
            )

            analyze_btn = gr.Button(
                "🔍 Analyze Question",
                variant="primary",
                size="lg"
            )

        # RIGHT PANEL
        with gr.Column():
            analysis_output = gr.Textbox(
                label="📘 FDIC Section 3.2 Regulatory Output",
                lines=18,
                show_copy_button=True
            )

    # 🔹 BUTTON ACTIONS
    summarize_btn.click(
        fn=summarize_document,
        inputs=file_input,
        outputs=summary_output
    )

    analyze_btn.click(
        fn=analyze_document,
        inputs=[file_input, question_input],
        outputs=analysis_output
    )

    gr.Markdown("""
    ---
    <div style="text-align:center; font-size: 0.9em;">
        ⚠️ <b>Regulatory Limitation Notice</b><br>
        This system does <b>not</b> approve, reject, or score loans.<br>
        It strictly follows <b>FDIC Section 3.2 (Loans)</b>.
    </div>
    """)

# Launch the app (in Colab, Gradio creates a public share link automatically)
interface.launch()


It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://ea7a505b5144bb518e.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


