AuditLens

Compliance-first audit trail for AI/LLM systems. Record decision chains, auto-redact PII, generate EU AI Act / GDPR / SOC2 reports — in under 10 lines of code.

Why AuditLens?

The compliance gap in the AI tooling ecosystem:

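Most EU AI Act obligations for high-risk systems are scheduled to apply from August 2026. High-risk AI systems must maintain automatic logs of decisions, inputs, and outputs. GDPR Article 22 restricts solely automated decisions that have legal or similarly significant effects, and controllers must be able to explain the logic involved. SOC2 audits expect tamper-evident audit trails. Yet: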

Langfuse / LangSmith are observability tools — built for debugging, not compliance. No PII redaction, no regulatory report templates.
LLM Guard is a security gateway — it filters inputs/outputs but keeps no audit logs.
Agent Compliance Layer is the only dedicated compliance tool — but it's closed-source SaaS with no self-hosting option.

AuditLens is the first open-source project that combines LLM decision-chain recording, automatic PII redaction, and compliance report generation into a single Python SDK.

Comparison

Feature	AuditLens	Langfuse	LangSmith	LLM Guard
Decision chain recording	✅	✅	✅	❌
PII auto-redaction	✅	❌	❌	✅
EU AI Act report (Art. 12/19)	✅	❌	❌	❌
GDPR Art. 22 report	✅	❌	❌	❌
GDPR Art. 30 report (RoPA)	✅	❌	❌	❌
SOC2 audit trail	✅	❌	❌	❌
Data lineage (DSAR support)	✅	❌	❌	❌
Self-hosted / zero-knowledge	✅	✅	❌	✅
Framework agnostic	✅	✅	✅	✅
Python native	✅	✅	✅	✅
Open source	✅	✅	❌	✅

Quick Start

pip install auditlens

from auditlens import AuditEngine, audit_context

# One-time setup — defaults to SQLite at ./audit.db
engine = AuditEngine()

# Option 1: Decorator — wrap any LLM-calling function
@engine.trace(provider="openai", model="gpt-4o")
def ask_llm(prompt: str) -> str:
    # my_llm_client is a placeholder for your own LLM client
    return my_llm_client.complete(prompt)

# Option 2: Context manager — group calls into a session
with audit_context(engine, session_id="user-123", purpose="customer_support") as ctx:
    answer = ask_llm("How do I reset my password?")
    ctx.annotate(decision_type="assisted", confidence_score=0.95)

# Generate a compliance report
from auditlens.reports import EUAIActReportGenerator
from auditlens.storage import create_storage

storage = create_storage("audit.db")
print(EUAIActReportGenerator(storage).to_json(system_name="My AI System"))

Features

🔍 Decision Chain Recording

Every LLM call is recorded with SHA-256 hashes of inputs and outputs, creating a tamper-evident audit trail. Multi-step pipelines are linked under a shared chain_id.
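
To illustrate why hashing makes a trail tamper-evident, here is a minimal sketch using only the standard library. It is not AuditLens's internal implementation: the record layout, the `make_record` and `verify_chain` helpers, and the all-zero genesis value are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the first record


def sha256_hex(text: str) -> str:
    """Return the hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_record(prev_hash: str, input_text: str, output_text: str) -> dict:
    """Build a record whose hash covers its content and the previous hash."""
    body = {
        "prev_hash": prev_hash,
        "input_hash": sha256_hex(input_text),
        "output_hash": sha256_hex(output_text),
    }
    # Sorted keys give a canonical form, so equal bodies hash identically.
    record_hash = sha256_hex(json.dumps(body, sort_keys=True))
    return {**body, "record_hash": record_hash}


def verify_chain(records: list) -> bool:
    """Recompute every hash and check each link to the previous record."""
    prev = GENESIS
    for rec in records:
        body = {k: rec[k] for k in ("prev_hash", "input_hash", "output_hash")}
        if rec["prev_hash"] != prev:
            return False
        if sha256_hex(json.dumps(body, sort_keys=True)) != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True


first = make_record(GENESIS, "How do I reset my password?", "Open Settings.")
second = make_record(first["record_hash"], "Thanks", "You're welcome.")
chain = [first, second]
print(verify_chain(chain))
```

Because each record hash includes the previous one, editing any stored field changes that record's hash and breaks every later link, which is what a verifier detects.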

🛡️ PII Detection & Redaction

Built-in regex engine detects emails, phone numbers, SSNs, credit cards, IPs, Chinese ID cards, IBANs, AWS keys, and more. Three redaction strategies:

replace → [EMAIL]
hash → [SHA:ab12...]
mask → j***@example.com
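
The three strategies can be sketched for a single pattern as follows. This is an illustrative example, not the library's redactor: the email regex, the `redact_email` helper, and the 8-character digest prefix are assumptions.

```python
import hashlib
import re

# Simplified email pattern, assumed for this example only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_email(text: str, method: str = "replace") -> str:
    """Redact every email address in text using one of three strategies."""

    def _sub(match: re.Match) -> str:
        value = match.group(0)
        if method == "replace":
            return "[EMAIL]"
        if method == "hash":
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            return f"[SHA:{digest[:8]}]"  # truncated digest, length is an assumption
        if method == "mask":
            local, _, domain = value.partition("@")
            return f"{local[0]}***@{domain}"  # keep first character and domain
        raise ValueError(f"unknown method: {method}")

    return EMAIL_RE.sub(_sub, text)


for method in ("replace", "hash", "mask"):
    print(method, "->", redact_email("Contact john@example.com now", method))
```

Roughly speaking, replace discards the value entirely, hash keeps a stable token so repeated values can still be correlated, and mask preserves some readability at the cost of leaking part of the value.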

📊 Compliance Reports

Four report templates mapped directly to regulatory articles:

EU AI Act Art. 12/19 — usage period, input references, risk events, retention compliance
GDPR Art. 22 — automated decision records with algorithm logic, confidence scores, right-to-contest
GDPR Art. 30 — Records of Processing Activities (RoPA): purposes, data categories, retention periods
SOC2 — tamper-evident hash chain, model change detection, access logs

🔗 Data Lineage Tracking

Answer GDPR Art. 15 Data Subject Access Requests (DSARs): which LLM calls processed a given user's data? Full lineage exported per data_subject_id.

💾 Pluggable Storage

SQLite (default, zero config) — indexed queries, suitable for production at moderate scale
JSONL — log-pipeline friendly (Fluentd, Logstash, S3)
PostgreSQL — planned for v0.2
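
For readers unfamiliar with JSONL, the sketch below shows why the format suits log pipelines: each event is one self-contained JSON object per line, so files can be appended to and streamed without parsing the whole file. The `append_event` and `read_events` helpers and the event fields are hypothetical and do not reflect the AuditLens backend.

```python
import json
import os
import tempfile


def append_event(path: str, event: dict) -> None:
    """Append one event as a single JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, ensure_ascii=False) + "\n")


def read_events(path: str) -> list:
    """Read all events back, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
append_event(path, {"provider": "openai", "model": "gpt-4o", "session_id": "sess-001"})
append_event(path, {"provider": "anthropic", "model": "claude-sonnet-4-6", "session_id": "sess-001"})
events = read_events(path)
print(len(events), "events read")
```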

⌨️ CLI Tools

auditlens stats
auditlens query
auditlens report
auditlens export
auditlens lineage

Installation

pip install auditlens
pip install "auditlens[dev]"   # with pytest, ruff, mypy (quotes keep zsh from expanding the brackets)

Requirements: Python 3.9+. The only runtime dependency is click.

Usage

AuditEngine Configuration

from auditlens import AuditEngine

engine = AuditEngine(
    storage="sqlite:///audit.db",    # or "audit.jsonl"
    pii_enabled=True,
    pii_method="replace",            # replace | hash | mask
    store_raw_text=True,             # False = hash-only privacy mode
    environment="production",
    application_name="MyApp",
    application_version="1.0.0",
)

Decorator Usage

from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

@engine.trace(provider="openai", model="gpt-4o")
def ask_llm(prompt: str) -> str:
    return openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content

Async Support

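from anthropic import AsyncAnthropic

anthropic_client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

@engine.trace(provider="anthropic", model="claude-sonnet-4-6")
async def ask_llm_async(prompt: str) -> str:
    response = await anthropic_client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,  # required by the Messages API
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text  # text of the first content block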

Context Manager — Multi-Step Chain

with audit_context(
    engine,
    session_id="sess-001",
    data_subject_id="user-42",
    purpose="loan_assessment",
    legal_basis="contract",
) as ctx:
    # All @engine.trace calls inside are linked under the same chain_id
    features = extract_features(application)
    decision = assess_risk(features)
    ctx.annotate(human_review_required=True)

Manual Recording

event = engine.record(
    input_text="Summarise this contract.",
    output_text="The contract covers...",
    provider="openai",
    model="gpt-4o",
    data_subject_id="user-42",
    processing_purpose="legal_review",
    legal_basis="legitimate_interest",
    decision_type="automated",
    confidence_score=0.97,
    retention_days=180,
)

Compliance Reports

from auditlens.reports import (
    EUAIActReportGenerator,
    GDPRArticle22ReportGenerator,
    GDPRArticle30ReportGenerator,
    SOC2ReportGenerator,
)
from auditlens.storage import create_storage

storage = create_storage("audit.db")

# EU AI Act Art. 12/19
print(EUAIActReportGenerator(storage).to_json(system_name="My AI"))

# GDPR Art. 22 — automated decision records
print(GDPRArticle22ReportGenerator(storage).to_json(
    controller_name="Acme Corp",
    dpo_contact="dpo@acme.com",
))

# GDPR Art. 30 — Records of Processing Activities
print(GDPRArticle30ReportGenerator(storage).to_csv())   # JSON and CSV supported

# SOC2
print(SOC2ReportGenerator(storage).to_json(organization="Acme Corp"))

CLI Reference

export AUDITLENS_DB=audit.db

# Summary statistics
auditlens stats
auditlens stats --format json

# Query audit events
auditlens query --provider openai --limit 50
auditlens query --session-id sess-123 --format json
auditlens query --start 2025-01-01 --end 2025-12-31

# Generate compliance reports
auditlens report --type eu-ai-act
auditlens report --type gdpr-art22 --controller "Acme Corp"
auditlens report --type gdpr-art30 --format csv --output ropa.csv
auditlens report --type soc2 --org "Acme Corp" --output soc2.json

# Export raw data
auditlens export --format jsonl --output events.jsonl
auditlens export --format csv --output events.csv

# Data lineage — answer DSARs
auditlens lineage --subject-id user-42
auditlens lineage --subject-id user-42 --format json
auditlens lineage --request-id <event-id>
auditlens lineage --chain-id <chain-id>

Data Lineage

from auditlens.lineage import LineageTracker
from auditlens.storage import create_storage

storage = create_storage("audit.db")
tracker = LineageTracker(storage)
summary = tracker.get_subject_summary("user-42")
# {
#   "subject_id": "user-42",
#   "total_llm_calls": 47,
#   "providers_used": ["openai", "anthropic"],
#   "processing_purposes": ["support", "analytics"],
#   "data_categories": ["name", "email"],
#   ...
# }
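
Conceptually, a subject summary like the one above is a filter and aggregate over stored events. The following sketch works on plain dictionaries; `summarize_subject` and the event field names are assumptions for illustration, not the `LineageTracker` implementation.

```python
def summarize_subject(events: list, subject_id: str) -> dict:
    """Aggregate the events that reference one data subject."""
    matching = [e for e in events if e.get("data_subject_id") == subject_id]
    return {
        "subject_id": subject_id,
        "total_llm_calls": len(matching),
        # Sets deduplicate; sorting makes the output deterministic.
        "providers_used": sorted({e["provider"] for e in matching}),
        "processing_purposes": sorted({e["purpose"] for e in matching}),
    }


events = [
    {"data_subject_id": "user-42", "provider": "openai", "purpose": "support"},
    {"data_subject_id": "user-42", "provider": "anthropic", "purpose": "analytics"},
    {"data_subject_id": "user-7", "provider": "openai", "purpose": "support"},
]
summary = summarize_subject(events, "user-42")
print(summary)
```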

Supported Regulations

Regulation	Articles Covered	Report Type
EU AI Act	Art. 12 (record-keeping), Art. 19 (automatically generated logs)	`eu-ai-act`
GDPR	Art. 22 (automated decisions), Art. 30 (processing records)	`gdpr-art22`, `gdpr-art30`
SOC 2	CC7 (tamper-evident logs, access audit)	`soc2`
NIST AI RMF	GOVERN 1.7, MAP 1.5 (traceability & accountability)	lineage + chain logs
ISO 42001	Clause 8.4 (AI system operation records)	lineage + chain logs

EU AI Act timeline: The Act entered into force in August 2024, and most obligations for high-risk systems are scheduled to apply from August 2026. The report format follows pre-enforcement technical guidance; minor updates may be needed when implementing acts are published. Early adoption gives you a head start.

PII Detection — Limitations & Scope

AuditLens uses a regex-based pattern-matching engine for PII detection. This is intentional: it avoids heavy NLP dependencies and keeps scanning fast, but it comes with well-defined trade-offs.

What the engine covers well

Pattern	Example
Email addresses	`user@example.com`
US/international phone numbers	`+1-800-555-0100`, `+44 7911 123456`
US Social Security Numbers	`123-45-6789`
Credit card numbers (Visa/MC/Amex/Discover)	`4111 1111 1111 1111`
IPv4 / IPv6 addresses	`192.168.1.1`
Chinese ID cards (18-digit)	`110101199003077777`
UK NIN	`AB123456C`
IBAN	`GB33BUKB20201555555555`
AWS access keys	`AKIAIOSFODNN7EXAMPLE`
API key / secret heuristic	`api_key=abc123...`

Known false positives

Pattern	False-positive scenario
`PASSPORT` (`[A-Z]{1,2}\d{6,9}`)	Software build IDs, license keys
`IBAN`	EU regulation codes with similar structure
`IP_ADDRESS`	Version strings in dotted-quad notation
`PHONE`	Long numeric sequences (order IDs, reference numbers)
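
The `PASSPORT` row is easy to reproduce. The snippet below applies the pattern quoted in the table to a few made-up strings; the sample texts are invented for illustration, and the library may wrap the pattern with extra context checks that are not shown here.

```python
import re

# Pattern copied from the table above.
PASSPORT_RE = re.compile(r"[A-Z]{1,2}\d{6,9}")

samples = {
    "passport-like": "Passport no. AB1234567 on file",
    "build id": "Deployed build QA20240115 to staging",
    "plain text": "No identifiers here",
}

hits = {label: PASSPORT_RE.findall(text) for label, text in samples.items()}
for label, found in hits.items():
    print(f"{label}: {found}")
```

The build ID matches just as the passport-like string does, because the pattern sees only the shape of the token and not its meaning.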

Known false negatives

The regex engine cannot detect:

Person names ("John Smith", "张伟")
Physical addresses in free text
Dates of birth in natural language
Implicit identifiers (account nicknames, usernames)

Recommended use

Scenario	Recommendation
Dev / staging audit log review	✅ Built-in engine is sufficient
Catching structured PII in LLM I/O	✅ Works well with `pii_method="replace"`
Production compliance gateway (standalone)	⚠️ Supplement with Microsoft Presidio
GDPR Article 17 erasure completeness proof	⚠️ Use `data_subject_id` lineage tracking

Roadmap: v0.2 will add optional Presidio integration (pip install "auditlens[presidio]") for NLP-backed entity recognition.

Architecture

auditlens/
├── core/
│   ├── engine.py        # AuditEngine — central coordinator
│   ├── interceptor.py   # @engine.trace decorator + audit_context() manager
│   ├── models.py        # AuditEvent, DecisionChain, DataLineage
│   └── config.py        # AuditConfig, PIIConfig, StorageConfig
├── pii/
│   ├── detector.py      # PIIDetector — regex scanning
│   ├── redactor.py      # PIIRedactor — replace / hash / mask
│   └── patterns.py      # Built-in PII patterns
├── lineage/
│   └── tracker.py       # LineageTracker — DSAR support
├── storage/
│   ├── base.py          # StorageBackend ABC
│   ├── sqlite.py        # SQLite (default)
│   └── jsonl.py         # JSONL file
├── reports/
│   ├── eu_ai_act.py     # EU AI Act Art. 12/19 report
│   ├── gdpr.py          # GDPR Art. 22 + Art. 30 reports
│   └── soc2.py          # SOC2 audit report
└── cli/
    └── main.py          # Click-based CLI

Development

git clone https://github.com/hidearmoon/auditlens.git
cd auditlens
pip install -e ".[dev]"

pytest
ruff check .
ruff format --check .
mypy auditlens/

Contributing

See CONTRIBUTING.md. All public APIs must have docstrings; new features must include tests (maintain ≥80% coverage). Keep the dependency footprint minimal.

License

Apache 2.0 — see LICENSE.

Built by OpenForge AI — open-source tools for AI safety, observability, and compliance.
