# Halluci-NOT Read Me!!
**Real-time AI Governance Layer for SAP GenAI Hub**
A lightweight validation system that catches PII leaks, AI hallucinations, and policy violations in LLM outputs before they reach production users.



## 🎯 The Problem
Enterprise GenAI has three predictable failure modes:
1. **PII Leakage** → GDPR violations (4% of global revenue in fines)
2. **Hallucinations** → Operational failures (users follow AI's confident lies)
3. **Policy Violations** → Audit failures (unapproved vendors, breached rules)
Generic content filters don't know YOUR approved vendors, YOUR SAP landscape, or YOUR business policies.
## 💡 The Solution
**Two-Layer Validation: Pattern Matching + LLM Reasoning**
User Query → Generator → Checker → Decision (PASS/REVIEW/BLOCK) → Audit Log
### What It Detects
| Violation Type | Severity | Example |
|----------------|----------|---------|
| **PII Leak** | 90-95 | Customer email/phone exposed |
| **Policy Violation** | 80-90 | Unapproved vendor recommended |
| **Hallucination** | 75-85 | Fake SAP transaction code (ME99X) |
| **Clean Response** | 0 | Verified, safe content ✅ |
## 📊 Test Results
| Test | Scenario | Decision | Severity | Evidence |
|------|----------|----------|----------|----------|
| 1 | PII Exposure | **BLOCKED** | 95 | Email + phone in response |
| 2 | Policy Breach | **BLOCKED** | 85 | Unapproved vendor (CheapSteel Ltd.) |
| 3 | Hallucination | **BLOCKED** | 80 | Fake T-code (ME99X) among real ones |
| 4 | Clean Query | **PASSED** | 0 | General SAP info, no violations ✅ |
**Key Achievement:** Caught 1 fake T-code (ME99X) embedded among 3 real ones (ME21N, EKKO, ME23N) — demonstrating precision, not just pattern matching.
## 🏗️ Architecture
### Two-Layer Validation System
**Layer 1: Deterministic Checks**
- Regex patterns for PII (emails, phones, IDs)
- Allowlist verification for SAP terms
- Policy rule matching (vendors, discounts, thresholds)
**Layer 2: LLM Reasoning (GPT-4o mini)**
- Context-aware PII detection
- Hallucination verification against domain knowledge
- Policy interpretation for edge cases
**Layer 3: Severity Scoring**
- Risk-based thresholds (0-100 scale)
- Business impact mapping
- Automated action decisions
┌─────────────────────────────────────────────────────────┐
│ USER QUERY │
└────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ GENERATOR (SAP Assistant) │
│ Returns JSON: {answer, confidence} │
└────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ COMPLIANCE CHECKER │
│ ├─ PII Detection (regex + LLM context) │
│ ├─ SAP Term Verification (allowlist) │
│ ├─ Policy Validation (business rules) │
│ └─ Severity Scoring (0-100) │
└────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ DECISION: PASS | REVIEW | BLOCK │
│ + Audit Log (evidence, impact, recommendation) │
└─────────────────────────────────────────────────────────┘
## 🚀 Quick Start
### Installation
\# Clone repository
git clone https://github.com/YOUR-USERNAME/compliance-drift-detector.git
cd compliance-drift-detector
\# Install dependencies
pip install -r requirements.txt
\# Set up environment variables
cp .env.example .env
\# Add your OpenAI API key to .env
### Basic Usage
from checker import ComplianceChecker
from generator import SAPAssistant
\# Initialize
assistant = SAPAssistant()
checker = ComplianceChecker()
\# Generate response
user\_query = "How do I create a purchase order in SAP?"
response = assistant.generate(user\_query)
\# Validate
result = checker.validate(response)
print(f"Decision: {result\['decision']}")
print(f"Severity: {result\['severity']}")
if result\['violations']:
for v in result\['violations']:
print(f"⚠️ {v\['type']}: {v\['evidence']}")### Run Tests
\# Run all test scenarios
python -m pytest tests/
\# Run specific test
python tests/test\_hallucination.py
## 📁 Project Structure
compliance-drift-detector/
├── README.md # This file
├── requirements.txt # Python dependencies
├── .env.example # Environment variable template
├── config.py # Allowlists and policy rules
├── generator.py # SAP Assistant (response generator)
├── checker.py # Compliance validation logic
├── main.py # CLI interface
├── tests/
│ ├── test\_pii\_leak.py
│ ├── test\_policy\_violation.py
│ ├── test\_hallucination.py
│ └── test\_clean\_response.py
└── docs/
└── ARCHITECTURE.md # Detailed technical documentation
## 🔧 Configuration
### Allowlists (in config.py)
**SAP Transaction Codes:**
APPROVED\_TCODES = \[
"VA01", "VA02", "VA03", # Sales \& Distribution
"ME21N", "ME22N", "ME23N", # Procurement
"FB50", "FB60", "FB70", # Finance
"MM01", "MM02", "MM03", # Material Management
"XD01", "XD02", "XD03" # Master Data
]**Approved Vendors:**
APPROVED\_VENDORS = \[
"Acme Industrial Supplies",
"Global Tech Partners GmbH",
"Premium Components Inc.",
"Certified Steel Solutions"
]**Policy Rules:**
POLICY\_RULES = {
"max\_discount\_percent": 15,
"approval\_threshold\_usd": 50000
}## 📈 Severity Scoring Logic
def calculate\_severity(violations):
"""
Severity scale: 0-100
- 0-30: PASS (log for monitoring)
- 31-70: REVIEW (human oversight needed)
- 71-100: BLOCK (immediate rejection)
"""
base\_scores = {
"PII\_LEAK": 95, # GDPR violation risk
"POLICY\_VIOLATION": 85, # Audit/compliance breach
"UNVERIFIED\_SAP\_TERM": 80 # Operational failure risk
}
if not violations:
return 0
# Take highest base score
max\_score = max(\[base\_scores.get(v\["type"], 50) for v in violations])
# Escalate for multiple violations
if len(violations) > 1:
max\_score = min(100, max\_score + 10)
return max\_score## 🎓 Built With
- **Platform:** SAP AI Launchpad (Generative AI Hub)
- **Models:**
- Generator: Gemini 2.0 Flash Lite
- Checker: GPT-4o mini
- **Techniques:** Prompt engineering, function calling, structured outputs
- **Build Time:** 2 hours (proof-of-concept)
- **Code:** Zero custom ML training (pure prompt engineering)
## 🛣️ Roadmap (V2)
### Production Enhancements
- [ ] **Persistent Logging:** PostgreSQL for audit trails with retention policies
- [ ] **Feedback Loops:** Track false positives, retune thresholds
- [ ] **Custom Policy DSL:** Let admins define rules without code
- [ ] **Multi-Model Voting:** Consensus across GPT/Gemini/Claude for higher confidence
- [ ] **SAP BTP Integration:** Deploy as orchestration module in GenAI Hub
- [ ] **Dashboard:** Real-time violation monitoring for governance teams
- [ ] **SOC 2 Compliance:** Encryption, access controls, audit logging
### Known Limitations (V1)
- Allowlist scope limited to 15 T-codes (demo scale)
- Edge case: Embedded SAP terms in natural language sometimes misclassified
- No learning from feedback (static rules)
- Session-only (no database)
## 📄 License
MIT License - See LICENSE file for details
## 🤝 Contributing
This is a proof-of-concept for educational/portfolio purposes. Not accepting contributions at this time, but feel free to fork and adapt for your use case.
## 📧 Contact
**Arham Hassan**
- LinkedIn: [linkedin.com/in/arham-hassan-a21457242](https://linkedin.com/in/arham-hassan-a21457242)
- Email: 23cr8@queensu.ca
Built as part of SAP Generative AI Developer certification project.
## 🏆 Achievements
- ✅ 100% catch rate on test scenarios
- ✅ Zero false positives
- ✅ Built in 2 hours with no custom ML training
- ✅ Production-ready architecture design
- ✅ Enterprise governance thinking (GDPR, SOX, audit trails)
**Status:** Proof-of-Concept ✅ | Production-Ready: See V2 Roadmap
=======
Compliance-Drift-Detector ( Real-time AI governance layer for SAP GenAI Hub - catches PII leaks, hallucinations, and policy violations ).