Compliance-first audit trail for AI/LLM systems. Record decision chains, auto-redact PII, generate EU AI Act / GDPR / SOC2 reports — in under 10 lines of code.
The compliance gap in the AI tooling ecosystem:
The EU AI Act's obligations for high-risk AI systems apply from August 2026. These systems must automatically log decisions, inputs, and outputs. GDPR Article 22 restricts solely automated decisions, and data subjects are entitled to meaningful information about the logic involved. SOC2 demands tamper-evident audit trails. Yet:
- Langfuse / LangSmith are observability tools — built for debugging, not compliance. No PII redaction, no regulatory report templates.
- LLM Guard is a security gateway — it filters inputs/outputs but keeps no audit logs.
- Agent Compliance Layer is the only dedicated compliance tool — but it's closed-source SaaS with no self-hosting option.
AuditLens is the first open-source project that combines LLM decision-chain recording, automatic PII redaction, and compliance report generation into a single Python SDK.
| Feature | AuditLens | Langfuse | LangSmith | LLM Guard |
|---|---|---|---|---|
| Decision chain recording | ✅ | ✅ | ✅ | ❌ |
| PII auto-redaction | ✅ | ❌ | ❌ | ✅ |
| EU AI Act report (Art. 12/19) | ✅ | ❌ | ❌ | ❌ |
| GDPR Art. 22 report | ✅ | ❌ | ❌ | ❌ |
| GDPR Art. 30 report (RoPA) | ✅ | ❌ | ❌ | ❌ |
| SOC2 audit trail | ✅ | ❌ | ❌ | ❌ |
| Data lineage (DSAR support) | ✅ | ❌ | ❌ | ❌ |
| Self-hosted / zero-knowledge | ✅ | ✅ | ❌ | ✅ |
| Framework agnostic | ✅ | ✅ | ✅ | ✅ |
| Python native | ✅ | ✅ | ✅ | ✅ |
| Open source | ✅ | ✅ | ❌ | ✅ |
```bash
pip install auditlens
```

```python
from auditlens import AuditEngine, audit_context

# One-time setup: defaults to SQLite at ./audit.db
engine = AuditEngine()

# Option 1: Decorator. Wrap any LLM-calling function
@engine.trace(provider="openai", model="gpt-4o")
def ask_llm(prompt: str) -> str:
    return my_llm_client.complete(prompt)

# Option 2: Context manager. Group calls into a session
with audit_context(engine, session_id="user-123", purpose="customer_support") as ctx:
    answer = ask_llm("How do I reset my password?")
    ctx.annotate(decision_type="assisted", confidence_score=0.95)

# Generate a compliance report
from auditlens.reports import EUAIActReportGenerator
from auditlens.storage import create_storage

storage = create_storage("audit.db")
print(EUAIActReportGenerator(storage).to_json(system_name="My AI System"))
```

Every LLM call is recorded with SHA-256 hashes of inputs and outputs, creating a tamper-evident audit trail. Multi-step pipelines are linked under a shared `chain_id`.
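To make the tamper-evidence idea concrete, here is a minimal sketch of a SHA-256 hash chain over audit records. It is not AuditLens's actual implementation: the record fields (`input_hash`, `output_hash`, `prev_hash`, `record_hash`) and the helpers `append_record` and `verify_chain` are hypothetical names chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # stand-in "previous hash" for the first record


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def _record_hash(record: dict) -> str:
    # Hash a canonical JSON form of every field except the record's own hash.
    body = {k: v for k, v in record.items() if k != "record_hash"}
    return sha256_hex(json.dumps(body, sort_keys=True))


def append_record(chain: list, input_text: str, output_text: str, chain_id: str) -> dict:
    record = {
        "chain_id": chain_id,
        "input_hash": sha256_hex(input_text),
        "output_hash": sha256_hex(output_text),
        # Each record commits to the hash of the record before it.
        "prev_hash": chain[-1]["record_hash"] if chain else GENESIS,
    }
    record["record_hash"] = _record_hash(record)
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    prev = GENESIS
    for record in chain:
        if record["prev_hash"] != prev or record["record_hash"] != _record_hash(record):
            return False
        prev = record["record_hash"]
    return True


chain = []
append_record(chain, "How do I reset my password?", "Open Settings...", chain_id="chain-1")
append_record(chain, "Thanks", "You're welcome", chain_id="chain-1")
print(verify_chain(chain))  # True

chain[0]["output_hash"] = sha256_hex("tampered output")
print(verify_chain(chain))  # False
```

Because every record commits to its predecessor, editing or deleting an earlier record invalidates the hashes that follow it, which is what lets an auditor detect after-the-fact changes.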
Built-in regex engine detects emails, phone numbers, SSNs, credit cards, IPs, Chinese ID cards, IBANs, AWS keys, and more. Three redaction strategies:
- `replace` → `[EMAIL]`
- `hash` → `[SHA:ab12...]`
- `mask` → `j***@example.com`
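The three strategies can be illustrated with a short sketch for email addresses only. This is an assumption-laden example rather than the library's code: the simplified `EMAIL_RE` pattern, the `redact_emails` helper, and the choice of an 8-character hash prefix are all hypothetical.

```python
import hashlib
import re

# Simplified email pattern for illustration; production patterns are stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, method: str = "replace") -> str:
    def _sub(match: re.Match) -> str:
        value = match.group(0)
        if method == "replace":
            # Swap the value for a fixed type label.
            return "[EMAIL]"
        if method == "hash":
            # Keep a short digest so equal values stay correlatable.
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            return f"[SHA:{digest[:8]}]"
        if method == "mask":
            # Keep the first character and the domain, hide the rest.
            local, domain = value.split("@", 1)
            return f"{local[0]}***@{domain}"
        raise ValueError(f"unknown method: {method}")

    return EMAIL_RE.sub(_sub, text)


text = "Contact john@example.com for access."
print(redact_emails(text, "replace"))  # Contact [EMAIL] for access.
print(redact_emails(text, "mask"))     # Contact j***@example.com for access.
print(redact_emails(text, "hash"))     # Contact [SHA:<8 hex chars>] for access.
```

`replace` discards the value entirely, `hash` keeps records linkable without exposing the value, and `mask` stays readable for humans at the cost of leaking part of the original.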
Four report templates mapped directly to regulatory articles:
- EU AI Act Art. 12/19 — usage period, input references, risk events, retention compliance
- GDPR Art. 22 — automated decision records with algorithm logic, confidence scores, right-to-contest
- GDPR Art. 30 — Records of Processing Activities (RoPA): purposes, data categories, retention periods
- SOC2 — tamper-evident hash chain, model change detection, access logs
Answer GDPR Art. 15 Data Subject Access Requests (DSARs): which LLM calls processed a given user's data? Full lineage exported per data_subject_id.
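Conceptually, a DSAR lookup filters the audit log by data subject and then summarises what was processed. The sketch below works on plain dictionaries and is only an illustration: the event field names and the `summarize_subject` helper are assumptions, not the AuditLens API.

```python
from typing import Iterable


def summarize_subject(events: Iterable[dict], subject_id: str) -> dict:
    # Keep only the events recorded for this data subject.
    matching = [e for e in events if e.get("data_subject_id") == subject_id]
    return {
        "subject_id": subject_id,
        "total_llm_calls": len(matching),
        "providers_used": sorted({e["provider"] for e in matching}),
        "processing_purposes": sorted({e["purpose"] for e in matching}),
        "event_ids": [e["event_id"] for e in matching],
    }


events = [
    {"event_id": "e1", "data_subject_id": "user-42", "provider": "openai", "purpose": "support"},
    {"event_id": "e2", "data_subject_id": "user-7", "provider": "openai", "purpose": "support"},
    {"event_id": "e3", "data_subject_id": "user-42", "provider": "anthropic", "purpose": "analytics"},
]

summary = summarize_subject(events, "user-42")
print(summary["total_llm_calls"])  # 2
print(summary["providers_used"])   # ['anthropic', 'openai']
```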
- SQLite (default, zero config) — indexed queries, suitable for production at moderate scale
- JSONL — log-pipeline friendly (Fluentd, Logstash, S3)
- PostgreSQL — planned for v0.2
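JSONL suits log pipelines because each event is one self-contained line that can be appended and shipped without parsing the whole file. A minimal sketch of that layout using only the standard library follows; `append_event` and `read_events` are hypothetical helpers, not the AuditLens storage backend.

```python
import json
import tempfile
from pathlib import Path


def append_event(path: Path, event: dict) -> None:
    # Append-only: each event becomes exactly one line of JSON.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")


def read_events(path: Path) -> list:
    # Skip blank lines so a trailing newline does not break parsing.
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "audit.jsonl"
    append_event(log_path, {"event_id": "e1", "provider": "openai", "model": "gpt-4o"})
    append_event(log_path, {"event_id": "e2", "provider": "anthropic", "model": "claude-sonnet-4-6"})
    print([e["event_id"] for e in read_events(log_path)])  # ['e1', 'e2']
```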
```bash
auditlens stats
auditlens query
auditlens report
auditlens export
auditlens lineage
```
```bash
pip install auditlens
pip install "auditlens[dev]"   # with pytest, ruff, mypy
```

Requirements: Python 3.9+. The only runtime dependency is `click`.
Configure the engine explicitly when the defaults are not enough:

```python
from auditlens import AuditEngine

engine = AuditEngine(
    storage="sqlite:///audit.db",   # or "audit.jsonl"
    pii_enabled=True,
    pii_method="replace",           # replace | hash | mask
    store_raw_text=True,            # False = hash-only privacy mode
    environment="production",
    application_name="MyApp",
    application_version="1.0.0",
)
```

Wrap a synchronous call with the decorator:

```python
@engine.trace(provider="openai", model="gpt-4o")
def ask_llm(prompt: str) -> str:
    return openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
```

The decorator also wraps async functions:

```python
@engine.trace(provider="anthropic", model="claude-sonnet-4-6")
async def ask_llm_async(prompt: str) -> str:
    response = await anthropic_client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,  # required by the Anthropic Messages API
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text  # return the text, matching the -> str annotation
```

Group related calls into one decision chain with the context manager:

```python
with audit_context(
    engine,
    session_id="sess-001",
    data_subject_id="user-42",
    purpose="loan_assessment",
    legal_basis="contract",
) as ctx:
    # All @engine.trace calls inside are linked under the same chain_id
    features = extract_features(application)
    decision = assess_risk(features)
    ctx.annotate(human_review_required=True)
```

Record an event manually when the decorator does not fit:

```python
event = engine.record(
    input_text="Summarise this contract.",
    output_text="The contract covers...",
    provider="openai",
    model="gpt-4o",
    data_subject_id="user-42",
    processing_purpose="legal_review",
    legal_basis="legitimate_interest",
    decision_type="automated",
    confidence_score=0.97,
    retention_days=180,
)
```

Generate compliance reports:

```python
from auditlens.reports import (
    EUAIActReportGenerator,
    GDPRArticle22ReportGenerator,
    GDPRArticle30ReportGenerator,
    SOC2ReportGenerator,
)
from auditlens.storage import create_storage

storage = create_storage("audit.db")

# EU AI Act Art. 12/19
print(EUAIActReportGenerator(storage).to_json(system_name="My AI"))

# GDPR Art. 22: automated decision records
print(GDPRArticle22ReportGenerator(storage).to_json(
    controller_name="Acme Corp",
    dpo_contact="dpo@acme.com",
))

# GDPR Art. 30: Records of Processing Activities
print(GDPRArticle30ReportGenerator(storage).to_csv())  # JSON and CSV supported

# SOC2
print(SOC2ReportGenerator(storage).to_json(organization="Acme Corp"))
```

The same operations are available from the CLI:

```bash
export AUDITLENS_DB=audit.db

# Summary statistics
auditlens stats
auditlens stats --format json

# Query audit events
auditlens query --provider openai --limit 50
auditlens query --session-id sess-123 --format json
auditlens query --start 2025-01-01 --end 2025-12-31

# Generate compliance reports
auditlens report --type eu-ai-act
auditlens report --type gdpr-art22 --controller "Acme Corp"
auditlens report --type gdpr-art30 --format csv --output ropa.csv
auditlens report --type soc2 --org "Acme Corp" --output soc2.json

# Export raw data
auditlens export --format jsonl --output events.jsonl
auditlens export --format csv --output events.csv

# Data lineage: answer DSARs
auditlens lineage --subject-id user-42
auditlens lineage --subject-id user-42 --format json
auditlens lineage --request-id <event-id>
auditlens lineage --chain-id <chain-id>
```

Query data lineage from Python:

```python
from auditlens.lineage import LineageTracker
from auditlens.storage import create_storage

storage = create_storage("audit.db")
tracker = LineageTracker(storage)
summary = tracker.get_subject_summary("user-42")
# {
#   "subject_id": "user-42",
#   "total_llm_calls": 47,
#   "providers_used": ["openai", "anthropic"],
#   "processing_purposes": ["support", "analytics"],
#   "data_categories": ["name", "email"],
#   ...
# }
```

| Regulation | Articles Covered | Report Type |
|---|---|---|
| EU AI Act | Art. 12 (record-keeping), Art. 19 (automatically generated logs) | eu-ai-act |
| GDPR | Art. 22 (automated decisions), Art. 30 (processing records) | gdpr-art22, gdpr-art30 |
| SOC 2 | CC7 (tamper-evident logs, access audit) | soc2 |
| NIST AI RMF | GOVERN 1.7, MAP 1.5 (traceability & accountability) | lineage + chain logs |
| ISO 42001 | Clause 8.4 (AI system operation records) | lineage + chain logs |
EU AI Act timeline: Enforcement begins August 2026. Report format follows pre-enforcement technical guidance; minor updates may be needed when implementing acts are published. Early adoption gives you a head start.
AuditLens uses a regex-based pattern-matching engine for PII detection. This is intentional: it keeps the library fast and free of heavy NLP dependencies, but it comes with well-defined trade-offs.
| Pattern | Example |
|---|---|
| Email addresses | user@example.com |
| US/international phone numbers | +1-800-555-0100, +44 7911 123456 |
| US Social Security Numbers | 123-45-6789 |
| Credit card numbers (Visa/MC/Amex/Discover) | 4111 1111 1111 1111 |
| IPv4 / IPv6 addresses | 192.168.1.1 |
| Chinese ID cards (18-digit) | 110101199003077777 |
| UK NIN | AB123456C |
| IBAN | GB33BUKB20201555555555 |
| AWS access keys | AKIAIOSFODNN7EXAMPLE |
| API key / secret heuristic | api_key=abc123... |
| Pattern | False-positive scenario |
|---|---|
| PASSPORT (`[A-Z]{1,2}\d{6,9}`) | Software build IDs, license keys |
| IBAN | EU regulation codes with similar structure |
| IP_ADDRESS | Version strings in dotted-quad notation |
| PHONE | Long numeric sequences (order IDs, reference numbers) |
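Two of these false positives are easy to reproduce with the standard `re` module. In the sketch below, `PASSPORT_RE` is the pattern quoted in the table with word boundaries added, while `IPV4_RE` and `is_valid_ipv4` are assumed, simplified stand-ins rather than the patterns AuditLens ships.

```python
import re

# Pattern quoted in the table, wrapped in word boundaries.
PASSPORT_RE = re.compile(r"\b[A-Z]{1,2}\d{6,9}\b")
# Assumed, simplified dotted-quad pattern (not necessarily AuditLens's own).
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

build_log = "Deployed build AB1234567 to staging"
release_note = "Upgraded the driver to 10.2.0.15"

# Neither string contains PII, yet both patterns fire.
print(PASSPORT_RE.findall(build_log))   # ['AB1234567']
print(IPV4_RE.findall(release_note))    # ['10.2.0.15']


def is_valid_ipv4(candidate: str) -> bool:
    # Octet range check: filters out impossible addresses only.
    return all(0 <= int(part) <= 255 for part in candidate.split("."))


print(is_valid_ipv4("300.1.2.999"))  # False: rejected by the range check
print(is_valid_ipv4("10.2.0.15"))    # True: a version string still passes
```

Post-match validation removes only the impossible values. A version string such as `10.2.0.15` is also a syntactically valid address, so telling the two apart requires surrounding context that a regex alone does not have.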
The regex engine cannot detect:
- Person names ("John Smith", "张伟")
- Physical addresses in free text
- Dates of birth in natural language
- Implicit identifiers (account nicknames, usernames)
| Scenario | Recommendation |
|---|---|
| Dev / staging audit log review | ✅ Built-in engine is sufficient |
| Catching structured PII in LLM I/O | ✅ Works well with pii_method="replace" |
| Production compliance gateway (standalone) | |
| GDPR Article 17 erasure completeness proof | data_subject_id lineage tracking |
Roadmap: v0.2 will add optional Presidio integration (pip install auditlens[presidio]) for NLP-backed entity recognition.
```text
auditlens/
├── core/
│   ├── engine.py        # AuditEngine: central coordinator
│   ├── interceptor.py   # @engine.trace decorator + audit_context() manager
│   ├── models.py        # AuditEvent, DecisionChain, DataLineage
│   └── config.py        # AuditConfig, PIIConfig, StorageConfig
├── pii/
│   ├── detector.py      # PIIDetector: regex scanning
│   ├── redactor.py      # PIIRedactor: replace / hash / mask
│   └── patterns.py      # Built-in PII patterns
├── lineage/
│   └── tracker.py       # LineageTracker: DSAR support
├── storage/
│   ├── base.py          # StorageBackend ABC
│   ├── sqlite.py        # SQLite (default)
│   └── jsonl.py         # JSONL file
├── reports/
│   ├── eu_ai_act.py     # EU AI Act Art. 12/19 report
│   ├── gdpr.py          # GDPR Art. 22 + Art. 30 reports
│   └── soc2.py          # SOC2 audit report
└── cli/
    └── main.py          # Click-based CLI
```
```bash
git clone https://github.com/hidearmoon/auditlens.git
cd auditlens
pip install -e ".[dev]"
pytest
ruff check .
ruff format --check .
mypy auditlens/
```

See CONTRIBUTING.md. All public APIs must have docstrings; new features must include tests (maintain ≥80% coverage). Keep the dependency footprint minimal.
Apache 2.0 — see LICENSE.
Built by OpenForge AI — open-source tools for AI safety, observability, and compliance.