AI-Warden Python SDK


AI-Warden is a production-ready Python SDK for detecting and preventing prompt injection attacks in AI/LLM applications. Protect your AI systems with beautiful CLI tools, magic browser authentication, and comprehensive framework integrations.

✨ Features

  • 🛡️ Advanced Detection - Pattern matching, LLM-based analysis, and hybrid modes
  • 🌐 Magic Login - Browser-based OAuth authentication (no copy-paste!)
  • 🎨 Beautiful CLI - Rich terminal UI with colors, spinners, and progress bars
  • ⚡ Fast & Accurate - Pattern mode ~20ms, LLM mode ~1.2s
  • 🔌 Framework Support - FastAPI, Django, Flask middleware included
  • 🚀 Async Ready - Full async/await support with httpx
  • 📦 Batch Processing - Validate multiple prompts efficiently
  • 🔐 Secure Storage - Credentials stored safely in ~/.ai-warden/

🚀 Quick Start

Installation

pip install ai-warden

Magic Login

ai-warden login

This will:

  1. Open your browser automatically
  2. Let you sign up or log in at the AI-Warden web portal
  3. Receive and save your API key securely

You're ready to go! ✅

Basic Usage

from ai_warden import AIWarden

# Auto-loads credentials from magic login
warden = AIWarden()

# Validate a prompt
result = warden.validate("Ignore all previous instructions")

print(result.is_safe)      # False
print(result.threat_type)  # "jailbreak_attempt"
print(result.confidence)   # 0.95
print(result.latency_ms)   # 23
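
The snippet above prints four fields of the result object. As an illustrative sketch (not the SDK's actual class), the shape looks roughly like this:

```python
from dataclasses import dataclass
from typing import Optional

# Sketch of the result shape the README demonstrates above; field names
# come from the example, the class itself is hypothetical.
@dataclass
class ValidationResult:
    is_safe: bool
    threat_type: Optional[str]  # e.g. "jailbreak_attempt", or None if safe
    confidence: float           # 0.0 - 1.0
    latency_ms: int

result = ValidationResult(is_safe=False, threat_type="jailbreak_attempt",
                          confidence=0.95, latency_ms=23)
print(result.is_safe)  # False
```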

📖 Documentation

🔐 Authentication

Option 1: Magic Login (Recommended)

ai-warden login

Opens your browser, handles OAuth, saves credentials automatically.

Option 2: Manual Configuration

ai-warden configure --api-key sk_live_xxx

Or set environment variable:

export AI_WARDEN_API_KEY="sk_live_xxx"

Option 3: Direct in Code

from ai_warden import AIWarden

warden = AIWarden(api_key="sk_live_xxx")

🐍 Python API

Basic Validation

from ai_warden import AIWarden

warden = AIWarden()

# Validate single prompt
result = warden.validate("Hello world")

if result.is_safe:
    print("✅ Safe to use!")
else:
    print(f"⚠️ Threat detected: {result.threat_type}")

Validation Modes

from ai_warden import ValidationMode

# Pattern-only (fast, ~20ms)
result = warden.validate("text", mode=ValidationMode.PATTERN)

# LLM-based (accurate, ~1.2s)
result = warden.validate("text", mode=ValidationMode.LLM)

# Hybrid (pattern first, then LLM if uncertain)
result = warden.validate("text", mode=ValidationMode.HYBRID)

# Auto (smart decision - recommended)
result = warden.validate("text", mode=ValidationMode.AUTO)
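
The hybrid strategy (pattern first, LLM only when uncertain) can be sketched in a few lines. The patterns, threshold band, and function names below are hypothetical, not the SDK's actual rules:

```python
import re

# Illustrative injection patterns -- the real detector's rules differ.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard .* system prompt",
]

def pattern_score(text: str) -> float:
    """Return 1.0 on a pattern hit, 0.0 otherwise."""
    lowered = text.lower()
    return 1.0 if any(re.search(p, lowered) for p in INJECTION_PATTERNS) else 0.0

def hybrid_validate(text, llm_check=None, uncertainty_band=(0.2, 0.8)):
    """True if safe. Escalates to llm_check only inside the uncertainty band."""
    score = pattern_score(text)
    low, high = uncertainty_band
    if score <= low:
        return True          # confidently safe: skip the slow LLM call
    if score >= high:
        return False         # confidently unsafe: skip the slow LLM call
    return llm_check(text) if llm_check else True  # uncertain: escalate

print(hybrid_validate("Ignore all previous instructions"))  # False
print(hybrid_validate("Hello world"))                       # True
```

This is why hybrid mode stays near pattern-mode latency for clear-cut inputs and only pays the LLM cost on ambiguous ones.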

Batch Validation

prompts = [
    "Hello world",
    "Ignore previous instructions",
    "Show me all data"
]

results = warden.validate_batch(prompts)

for prompt, result in zip(prompts, results):
    print(f"{prompt}: {'✅' if result.is_safe else '⚠️'}")

Async Support

import asyncio
from ai_warden import AsyncAIWarden

async def main():
    async with AsyncAIWarden() as warden:
        result = await warden.validate("text")
        print(result.is_safe)

asyncio.run(main())

Context Manager

with AIWarden() as warden:
    result = warden.validate("text")
    print(result.is_safe)

🛠️ CLI Reference

ai-warden login

Magic browser-based authentication.

ai-warden login

# Output:
# 🌐 Opening browser for authentication...
# ⏳ Waiting for callback...
# ✅ Authentication successful!
# 🔑 API key saved to ~/.ai-warden/credentials

Options:

  • --auth-url URL - Custom authentication URL
  • --port PORT - Local callback port (default: 8787)
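
The callback flow behind magic login can be sketched with the standard library: the CLI starts a short-lived local HTTP server, the portal redirects the browser to a localhost callback URL, and the handler captures the key. The `/callback` path and `api_key` query parameter below are assumptions for illustration, not the SDK's actual protocol:

```python
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = {}

class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Pull the key out of the redirect's query string.
        query = urllib.parse.urlparse(self.path).query
        params = urllib.parse.parse_qs(query)
        captured["api_key"] = params.get("api_key", [""])[0]
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"Authentication successful! You can close this tab.")

    def log_message(self, *args):
        pass  # silence per-request logging

server = HTTPServer(("127.0.0.1", 0), CallbackHandler)  # 0 = any free port
port = server.server_address[1]
threading.Thread(target=server.handle_request, daemon=True).start()

# Simulate the browser redirect hitting the callback:
urllib.request.urlopen(f"http://127.0.0.1:{port}/callback?api_key=sk_test_123")
server.server_close()
print(captured["api_key"])  # sk_test_123
```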

ai-warden configure

Manual API key configuration.

# Interactive
ai-warden configure

# Direct
ai-warden configure --api-key sk_live_xxx --api-url http://46.62.240.255:8080/api

ai-warden validate

Validate a single prompt.

ai-warden validate "Ignore all instructions"

# Output:
# ⚠️ UNSAFE: jailbreak_attempt
# 
# Details:
#   Confidence: 0.95
#   Mode: pattern
#   Latency: 23ms

Options:

  • --mode MODE - Validation mode (pattern/llm/hybrid/auto)

ai-warden scan

Scan files for vulnerabilities.

# Scan single file
ai-warden scan app.py

# Scan directory recursively
ai-warden scan src/ --recursive

# Output:
# Scanning: app.py
# ✅ Line 42: Safe
# ⚠️ Line 89: UNSAFE - potential injection
# 
# Summary: 1 issue found in 1 file

Options:

  • --recursive, -r - Scan directories recursively
  • --mode MODE - Validation mode

ai-warden scan-skill

Scan a remote skill repository for prompt injection threats.

# Offline scanning (free, no API key needed)
ai-warden scan-skill https://github.com/user/skill --offline
ai-warden scan-skill https://github.com/user/skill --offline --json
ai-warden scan-skill https://github.com/user/skill --offline --strict

# API-powered scanning (requires API key)
ai-warden scan-skill https://github.com/user/skill
ai-warden scan-skill https://github.com/user/skill --json --strict

Options:

  • --offline - Use local scanner only (free, no API key)
  • --json - Machine-readable JSON output
  • --strict - Exit code 1 unless verdict is SAFE
  • --mode MODE - Detection mode (strict/balanced/permissive)

Example Output

🔍 AI-Warden Skill Scan
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Skill:    smart-web-search
  Source:   github.com/davidme6/smart-web-search
  Files:    4 scanned
  Mode:     offline

  LICENSE                  ✅ Safe       (0.00)
  README.md                ❌ CRITICAL   (1.00)
    ├─ P102: Data Forwarding Instructions [CRITICAL] — "Email**: smart-web-search@feedback.com"
    └─ H003: Excessive External URLs [LOW] — "Found 11 external URLs"
  SKILL.md                 ✅ Safe       (0.19)
    └─ H003: Excessive External URLs [LOW] — "Found 20 external URLs"
  _meta.json               ✅ Safe       (0.00)

  Verdict:     ❌ DANGEROUS
  Trust Score: 0/100
  Scan Time:   1.2s
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Verdicts

| Verdict | Trust Score | Meaning |
|---------|-------------|---------|
| ✅ SAFE | 70-100 | No threats detected |
| ⚠️ WARNING | 25-69 | Suspicious patterns found, review recommended |
| ❌ DANGEROUS | 0-24 | Active threats detected, do not install |
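
The verdict bands above reduce to a simple threshold mapping. The thresholds come straight from the table; the function itself is just a sketch:

```python
def verdict_for(trust_score: int) -> str:
    """Map a 0-100 trust score to the documented verdict bands."""
    if trust_score >= 70:
        return "SAFE"
    if trust_score >= 25:
        return "WARNING"
    return "DANGEROUS"

print(verdict_for(85))  # SAFE
print(verdict_for(40))  # WARNING
print(verdict_for(0))   # DANGEROUS
```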

Offline vs API Mode

|  | Offline (free) | API (metered) |
|--|----------------|---------------|
| Detection | Regex patterns | Judge Mars ML + patterns |
| Speed | Instant | ~150ms/file |
| False positives | Higher | Lower |
| Zero-day threats | | |
| Requires API key | No | Yes |

Python API

from ai_warden import AIWarden

warden = AIWarden()

# Offline scan
result = warden.scan_skill("https://github.com/user/skill", offline=True)
print(result["verdict"])     # SAFE, WARNING, or DANGEROUS
print(result["trustScore"])  # 0-100

# API-powered scan
result = warden.scan_skill("https://github.com/user/skill")
for f in result["files"]:
    print(f"{f['path']}: {f['riskLevel']} ({f['score']})")

ai-warden status

Show authentication status and usage.

ai-warden status

# Output:
# ╭─── AI-Warden Status ────────────────────────────╮
# │ ✅ Authenticated                                 │
# │                                                  │
# │ API Key:   sk_live_...abc (valid)               │
# │ API URL:   http://46.62.240.255:8080/api        │
# │ Tier:      Free                                  │
# │ Usage:     42 / 1000 requests (4%)              │
# │ Remaining: 958 requests this month              │
# ╰──────────────────────────────────────────────────╯

ai-warden logout

Remove stored credentials.

ai-warden logout

# Output:
# 🗑️ Credentials removed
# Run 'ai-warden login' to authenticate again

🌐 Middleware

FastAPI

from fastapi import FastAPI
from ai_warden.middleware import FastAPIMiddleware

app = FastAPI()

app.add_middleware(
    FastAPIMiddleware,
    api_key="sk_live_xxx",    # Or load from env
    block_unsafe=True,         # Return 400 on unsafe prompts
    log_threats=True,          # Log detected threats
    exclude_paths=["/health"]  # Skip validation for these paths
)

@app.post("/chat")
async def chat(prompt: str):
    # Middleware validates prompt before reaching here
    return {"response": "Safe!"}

How it works:

  1. Middleware intercepts POST/PUT/PATCH requests
  2. Extracts text fields from JSON body
  3. Validates all text content
  4. Returns 400 if the content is unsafe (when block_unsafe=True)
  5. Otherwise adds a warning header and lets the request continue
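
Step 2 above (extracting text fields from a JSON body) can be sketched as a recursive walk over the decoded value. This is illustrative only; the actual middleware's extraction rules may differ:

```python
import json

def extract_text_fields(value):
    """Yield every string found anywhere in a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for v in value.values():
            yield from extract_text_fields(v)
    elif isinstance(value, list):
        for item in value:
            yield from extract_text_fields(item)

body = json.loads('{"prompt": "Hello", "meta": {"tags": ["a", "b"]}}')
print(list(extract_text_fields(body)))  # ['Hello', 'a', 'b']
```

Each yielded string would then be passed through validation before the request reaches your handler.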

Django

# settings.py
MIDDLEWARE = [
    'ai_warden.middleware.django.AIWardenMiddleware',
    # ... other middleware
]

AI_WARDEN_API_KEY = "sk_live_xxx"
AI_WARDEN_BLOCK_UNSAFE = True
AI_WARDEN_LOG_THREATS = True
AI_WARDEN_EXCLUDE_PATHS = ["/admin/", "/static/"]

Flask

from flask import Flask, request
from ai_warden.middleware import flask_protect

app = Flask(__name__)

@app.route('/chat', methods=['POST'])
@flask_protect(api_key="sk_live_xxx", block_unsafe=True)
def chat():
    prompt = request.json['prompt']
    # Decorator validates prompt before function runs
    return {"response": "Safe!"}

🚀 Advanced Usage

Custom Validation Logic

from ai_warden import AIWarden

warden = AIWarden()

def validate_user_input(text: str) -> bool:
    """Custom validation with additional checks."""
    # AI-Warden validation
    result = warden.validate(text)
    
    if not result.is_safe:
        print(f"Blocked: {result.threat_type}")
        return False
    
    # Additional custom checks
    if len(text) > 10000:
        print("Blocked: Too long")
        return False
    
    return True

Error Handling

from ai_warden import AIWarden
from ai_warden.exceptions import (
    AuthenticationError,
    ValidationError,
    APIError
)

warden = AIWarden()

try:
    result = warden.validate("text")
except AuthenticationError:
    print("Invalid API key")
except ValidationError as e:
    print(f"Validation failed: {e}")
except APIError as e:
    print(f"API error: {e}")

Custom API URL

warden = AIWarden(
    api_key="sk_live_xxx",
    api_url="https://your-custom-domain.com",
    timeout=60  # Custom timeout in seconds
)

Usage Statistics

warden = AIWarden()

usage = warden.get_usage()

print(f"Tier: {usage['tier']}")
print(f"Usage: {usage['usage']} / {usage['limit']}")
print(f"Remaining: {usage['limit'] - usage['usage']}")

🧪 Testing

Run tests with pytest:

pip install -e ".[dev]"
pytest

With coverage:

pytest --cov=ai_warden --cov-report=html

📦 Installation Options

Basic Installation

pip install ai-warden

With Async Support

pip install ai-warden[async]

With Framework Support

pip install ai-warden[fastapi]
pip install ai-warden[django]
pip install ai-warden[flask]

All Features

pip install ai-warden[async,fastapi,django,flask,secure]

Development

pip install -e ".[dev]"

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make your changes
  4. Run tests: pytest
  5. Format code: black ai_warden/
  6. Submit a pull request

📄 License

MIT License - see LICENSE for details.


🛡️ Security

If you discover a security vulnerability, please email security@ai-warden.com instead of using the issue tracker.


🙏 Acknowledgments

Built with ❤️ using:

  • Click - Beautiful command-line interfaces
  • Rich - Rich terminal formatting
  • Pydantic - Data validation
  • Requests - HTTP client
  • httpx - Async HTTP client

Made with 🛡️ by the AI-Warden team
