# Interactive Audit Testing

This notebook tests the IFRS 9 Automated Auditor interactively, one step at a time, by calling the engine directly instead of starting a web server or opening any ports.

In [5]:
# Cell 1: Load environment and initialize RcmAuditor
import os
from dotenv import load_dotenv
from rcm_engine import RcmAuditor
from config import CONFIG

# Load environment variables from a .env file, if one is found
load_dotenv()

# Initialize Auditor
auditor = RcmAuditor()
print("RcmAuditor initialized.")

# Ensure documents are indexed
auditor.initialize_rag()
print("RAG Index ready.")

RcmAuditor initialized.
Loading documents from documents/...
Loading Política de Previsionamiento_modificada sep 2025.pdf...
Loading Respuesta Memorando final Inspección 2025_Perdida Esperada NIIF.pdf...
Creating vector store with 76 chunks...
Index built successfully.
RAG Index ready.


In [6]:
# Cell 2: Define a specific test question manually
test_question = "How is the PD 12-month calculated?"
print(f"Test Question: {test_question}")

Test Question: How is the PD 12-month calculated?


In [7]:
# Cell 3: Run the retrieval step only and print retrieved PDF text chunks
print("Retrieving top 5 chunks...")
retrieved_docs = auditor.rag_engine.retrieve(test_question, k=5)

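for i, doc in enumerate(retrieved_docs, start=1):
    text = doc.page_content
    print(f"\n--- Chunk {i} (Page {doc.metadata.get('page', 'N/A')}) ---")
    # Print the first 500 characters; add an ellipsis only when the text was cut
    print(text[:500] + ("..." if len(text) > 500 else ""))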

Retrieving top 5 chunks...

--- Chunk 1 (Page 8) ---
y Loss Given Default (LGD). 
 
En este sentido, para el cálculo de la PD debe considerarse la cantidad de 
casos (exposiciones individuales) y no los montos, dado que este enfoque 
se fundamenta en la premisa estadística de que la frecuencia observada 
de ocurrencia de ciertos eventos en u na muestra puede aproximar la 
probabilidad real de ocurrencia de dichos eventos. 
 
Se recomienda, por tanto, ajustar la metodología empleada para asegurar 
la correcta estimación de la PD conforme a los prin...

--- Chunk 2 (Page 11) ---
PERSONALES 202108 14840 24 40 61 77 420
PERSONALES 202109 14441 17 38 54 77 427
PERSONALES 202110 14093 21 37 60 86 424
PERSONALES 202111 13766 16 39 65 67 426
PERSONALES 202112 12945 23 49 51 71 426
PERSONALES 202201 12563 28 30 50 99 120 429
PERSONALES 202202 18085 25 35 72 93 113 435
PERSONALES 202203 17549 20 39 59 87 115 423
PERSONALES 202204 16574 21 43 61 91 117 430
PERSONALES 202205 10278 0 0 0 0 0 0
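
Long chunk dumps like the ones above can be hard to compare at a glance. The following is a minimal sketch of a one-line-per-chunk summary, assuming the retrieved objects expose `page_content` and `metadata` as they do in Cell 3. The `Chunk` dataclass and the `summarize_chunks` helper are hypothetical stand-ins for illustration and are not part of `rcm_engine`.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved document chunk."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize_chunks(chunks, width=60):
    """Return one summary line per chunk: rank, page, length, short preview."""
    lines = []
    for rank, chunk in enumerate(chunks, start=1):
        page = chunk.metadata.get("page", "N/A")
        # Collapse newlines and repeated spaces so the preview fits on one line
        flat = " ".join(chunk.page_content.split())
        preview = flat[:width] + ("..." if len(flat) > width else "")
        lines.append(
            f"{rank}. page={page} chars={len(chunk.page_content)} | {preview}"
        )
    return lines


# Sample data copied from the first lines of the two chunks printed above
sample = [
    Chunk("y Loss Given Default (LGD).\n \nEn este sentido, para el cálculo de la PD", {"page": 8}),
    Chunk("PERSONALES 202108 14840 24 40 61 77 420", {"page": 11}),
]
for line in summarize_chunks(sample, width=40):
    print(line)
```

In the notebook, the same helper could be called as `summarize_chunks(retrieved_docs)`, provided the retrieved objects really do carry those two attributes.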
 
 


In [8]:
# Cell 4: Run the full process_row logic and print AI Answer AND Critique Score

# Mock a row data structure
mock_row = {
    'Control Reference': 'Test-Ref-001',
    'Test Procedure': test_question
}

print("Running full process_row...")
result = auditor.process_row(mock_row)

print("\n=== AI Answer ===")
print(result['AI_Answer'])

print("\n=== Validation ===")
print(f"Score: {result['Validation_Score']}")
print(f"Hallucination: {result['Hallucination_Flag']}")
print(f"Reasoning: {result['Validation_Reasoning']}")

Running full process_row...

=== AI Answer ===
Not Documented.

=== Validation ===
Score: 1
Hallucination: True
Reasoning: The generated answer 'Not Documented' does not relate to the source context provided. The source context discusses the methodology for calculating Probability of Default (PD) and Loss Given Default (LGD), including specific details about the transition to a new methodology and the responsible parties. The generated answer does not address any of these points and appears to be a placeholder or an error, indicating a hallucination.
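
In the run above the answer is the abstention string `Not Documented.`, yet the validator reports it as a hallucination. When reviewing results, it may help to separate abstentions from flagged answers before reading the flag. The following is a minimal sketch, assuming the result keys printed in Cell 4. The `classify_result` helper and the `ABSTENTIONS` set are hypothetical and are not part of `rcm_engine`.

```python
# Assumed abstention phrasing, taken from the answer printed above
ABSTENTIONS = {"not documented"}


def classify_result(result):
    """Label a process_row-style result as 'abstained', 'flagged', or 'ok'."""
    # Normalize: trim whitespace, drop a trailing period, ignore case
    answer = str(result.get("AI_Answer", "")).strip().rstrip(".").lower()
    if answer in ABSTENTIONS:
        # The model declined to answer, so report it separately from
        # answers that the validator flagged
        return "abstained"
    if result.get("Hallucination_Flag"):
        return "flagged"
    return "ok"


example = {
    "AI_Answer": "Not Documented.",
    "Validation_Score": 1,
    "Hallucination_Flag": True,
}
print(classify_result(example))  # prints "abstained"
```

Checking for the abstention first is a deliberate choice in this sketch: an abstained row then points the reviewer toward retrieval or prompting rather than toward the validator's hallucination flag.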


In [9]:
mock_row

{'Control Reference': 'Test-Ref-001',
 'Test Procedure': 'How is the PD 12-month calculated?'}