Release v1.5.0 · itdove/ai-guardian

Added

Scanner Installer: Skip Installation if Already Up-to-Date (Issue #271)
- Smart Version Checking: ai-guardian scanner install now checks if scanner is already installed before downloading
- Skip When Up-to-Date: Automatically skips installation if the latest version is already installed
- Upgrade Detection: Automatically upgrades when a newer version is available
- Downgrade Protection: Does not auto-downgrade without explicit --version flag
- Explicit Control: --version flag allows downgrade or reinstall of specific versions
- Clear Messaging: Shows current version, target version, and action taken
- Version Comparison: Proper semantic version comparison (e.g., 8.30.1 < 8.31.0)
- Performance: Saves bandwidth and time by skipping unnecessary downloads
- Implementation:
  - Added _get_installed_version() method to detect currently installed version
  - Added _compare_versions() method for semantic version comparison
  - Updated install() method with version checking logic
- Test Coverage: Added 10 comprehensive test cases for version checking scenarios
- Benefits: Faster installation, bandwidth-friendly, safe (no auto-downgrades), clear user feedback
SSRF Protection: Wildcard Domain Pattern Support (Issue #253)
- New Feature: Added wildcard pattern matching for additional_blocked_domains configuration
- Syntax: Supports * (match zero or more characters) and ? (match exactly one character)
- Pattern Examples:
  - *.internal.com - Block all .internal.com domains (api.internal.com, db.internal.com)
  - admin.* - Block admin.* with any suffix (admin.example.com, admin.local)
  - *.corp.* - Block all .corp. domains (api.corp.internal, db.corp.example.com)
  - metadata.* - Block all metadata.* endpoints (metadata.aws.com, metadata.google.internal)
  - test?.example.com - Block test1.example.com, test2.example.com, testa.example.com
- Use Cases:
  - Block entire TLDs with single pattern (*.internal, *.local)
  - Block subdomain patterns (*.admin.example.com)
  - Block naming patterns (metadata.*, admin.*)
  - Enterprise-wide policies with simplified configuration
- Backward Compatibility: Exact domain matching and subdomain matching still work as before
- Pattern Validation: Invalid patterns are rejected at config load time with warnings
- Performance: Patterns stored separately from exact domains for optimal matching
- Files Modified:
  - src/ai_guardian/ssrf_protector.py: Added fnmatch import, _blocked_domain_patterns list, _is_valid_domain_pattern() method, pattern matching in _is_domain_blocked()
  - src/ai_guardian/schemas/ai-guardian-config.schema.json: Updated additional_blocked_domains description with wildcard pattern syntax
  - docs/SSRF_PROTECTION.md: Comprehensive documentation with wildcard pattern examples and use cases
- Tests: Added 11 comprehensive test cases in TestWildcardDomainPatterns class
- Impact: Users can now use flexible wildcard patterns to block domains more efficiently
SSRF Protection: URL Allow-List Support (Issue #252)
- New Configuration: Added allowed_domains array to ssrf_protection configuration
- Purpose: Allow specific trusted domains/URLs while maintaining core protections
- Evaluation Order (Deny-First):
  1. Check immutable core protections (metadata endpoints, dangerous schemes, private IPs)
  2. Check deny-list (additional_blocked_domains)
  3. Check allow-list (allowed_domains) - can override step 2, NOT step 1
- Use Cases:
  - Allow specific internal APIs while blocking other internal domains
  - Allow development/staging servers without allowing all localhost
  - Allow specific partner domains on restricted networks
  - Provide granular control to override broad domain blocks
- Domain Matching: Supports exact match and subdomain matching
  - "api.corp.internal" allows api.corp.internal and v1.api.corp.internal
- Security: Cannot override immutable core protections
  - Metadata endpoints (169.254.169.254, metadata.google.internal) remain blocked
  - Private IP ranges (RFC 1918) remain blocked
  - Dangerous schemes (file://, gopher://) remain blocked
- Files Modified:
  - src/ai_guardian/schemas/ai-guardian-config.schema.json: Added allowed_domains property
  - src/ai_guardian/ssrf_protector.py: Implemented allow-list logic in _check_url()
  - src/ai_guardian/setup.py: Added allowed_domains: [] to default config
  - ai-guardian-example.json: Added examples and security warnings
  - docs/SSRF_PROTECTION.md: Comprehensive documentation with examples
  - AGENTS.md: Enhanced schema change checklist
- Tests: Added 9 comprehensive test cases in tests/test_ssrf_protection.py
- Impact: Users can now create exceptions for specific domains while maintaining strong security boundaries

Security

Cascading Priority for Remote Config URLs to Prevent Immutability Bypass (Issue #255)
- Fix: Implemented first-match-wins cascading for remote config URL sources
- Vulnerability: Users could bypass immutable: true enterprise policies by adding their own remote config URLs in local/user configs
- Attack Scenario: Enterprise deploys system config with immutable: true SSRF protection, user adds attacker-controlled remote URL that disables it
- Solution: Remote config URLs now follow strict priority hierarchy (system config → env var → user config → local config)
  - System config (/etc/ai-guardian/remote-configs.json): Highest priority, requires root/admin, blocks all lower sources
  - Environment variable (AI_GUARDIAN_REMOTE_CONFIG_URLS): Second priority, blocks user/local sources
  - User config (~/.config/ai-guardian/ai-guardian.json): Third priority, blocks local config
  - Local config (~/.ai-guardian.json): Lowest priority fallback
- Implementation:
  - Added _get_system_config_path(): Returns platform-specific system config path (Linux/macOS: /etc/ai-guardian/remote-configs.json, Windows: C:\ProgramData\ai-guardian\remote-configs.json)
  - Refactored _load_remote_configs(): Implements cascading with early return on first match
  - Added _fetch_remote_configs(): Helper to reduce code duplication
- Testing: Added 5 new test cases in test_immutable_configs.py:
  - System config blocks user remote URLs
  - Environment variable takes priority over user config
  - User remote URLs work without system config
  - Local config has lowest priority
  - Legacy format (direct list) still works
- Backward Compatibility: ✅ Existing users with remote_configs in user/local files continue working unchanged
- Enterprise Deployment: Enterprises can now deploy one system config file to enforce policies across all users
- Impact: Critical security fix - prevents users from bypassing all enterprise security policies

Changed

Documentation: Clarify SSRF Protection Limitations and Scope (Issue #256)
- Updated docs/SSRF_PROTECTION.md: Added "Important Limitations" section at the top
  - Clearly explains what SSRF protection CAN and CANNOT protect against
  - Documents pattern-based filtering vs comprehensive network security
  - Added OpenShell integration guide for comprehensive SSRF protection
  - Explains hook-based architecture and why limitations exist
- Updated README.md: SSRF section now includes limitation disclaimers
  - Examples of what CAN be blocked (explicit URLs in Bash/tool parameters)
  - Examples of what CANNOT be blocked (MCP server internal calls)
  - Recommendations for network-level controls and MCP sandboxing
- Updated ai-guardian-example.json: Added comprehensive limitation comments
  - Explains pattern-based filtering cannot replace network security
  - Documents that it cannot detect MCP internal network calls
  - Notes about HTTP redirects and dynamic URL construction
- Updated src/ai_guardian/ssrf_protector.py: Enhanced module docstring
  - Clear architecture explanation (hook-based, not proxy)
  - Defense in depth strategy documentation
  - Usage guidance and limitations
- Updated error messages: SSRF block/warn messages now mention limitations
  - Changed "SSRF ATTACK DETECTED" to "SSRF PATTERN DETECTED"
  - Added note about pattern-based detection
  - Recommends firewall rules and network controls
  - References docs/SSRF_PROTECTION.md
- Impact: Users now have realistic expectations about SSRF protection scope
- Key Message: ai-guardian catches obvious SSRF attempts in command strings but cannot replace network-level security

Fixed

Setup.py Missing permissions_directories in Default Config Template (Issue #240)
- Fix: Added permissions_directories field to _get_default_config_template() function in setup.py
- Problem: Users running ai-guardian setup --create-config got incomplete configuration files missing the permissions_directories option
- Root Cause: When permissions_directories was added to the schema, setup.py wasn't updated (violating AGENTS.md configuration consistency guidelines)
- Impact: Generated configs now include permissions_directories with comprehensive comments and examples:
  - Local directory scanning example (~/.claude/skills)
  - GitHub repository scanning example with token_env
  - Documentation explaining it's OPTIONAL/ADVANCED and most users should prefer remote_configs
- Location: Added to setup.py after permissions section (line 893), before directory_rules
- Verification: Tested via _get_default_config_template() and confirmed field appears in generated config
JSON Schema Missing Definitions (Issue #239)
- Fix: Added missing pattern_server_auth and pattern_server_cache definitions to schema
- Problem: Schema referenced definitions that didn't exist, causing validation failures for pattern_server configurations
- Root Cause: When pattern_server was refactored from root-level to nested under each feature in v1.7.0, the auth/cache structures were not extracted into reusable definitions
- Impact: Schema validation now succeeds for configs using pattern_server with auth/cache in:
  - secret_redaction.pattern_server
  - prompt_injection.unicode_detection.pattern_server
  - ssrf_protection.pattern_server
  - config_file_scanning.pattern_server
- Tests: Added comprehensive test suite (test_pattern_server_definitions.py) validating all pattern_server references

Added

LeakTK Pattern Server Documentation (Issue #156)
- Added comprehensive documentation for using LeakTK patterns as a pattern server
- README.md: Added "Using LeakTK Patterns (Recommended)" section with quick start guide
  - Benefits: Free, community-maintained, 104+ rules, no authentication required
  - Configuration example using GitHub raw content URL
  - Verification steps and expected log output
- docs/SECRET_SCANNING.md: Added complete LeakTK integration guide
  - Pattern sources comparison table (LeakTK vs Gitleaks defaults)
  - Configuration options and cache settings
  - Pattern version compatibility table (8.25.0, 8.26.0, 8.27.0)
  - Troubleshooting guide and common issues
  - FAQ section covering offline usage, updates, firewall workarounds
  - Example workflows for combining LeakTK with project-specific patterns
- ai-guardian-example.json: Added LeakTK example configuration
  - Documented free, community-maintained pattern source
  - Reference to LeakTK GitHub repository
- Feature already implemented and tested - documentation completes the feature
- LeakTK repository: https://github.com/leaktk/patterns
Permissions Comparison Documentation (Issue #235)
- Added comprehensive docs/PERMISSIONS_COMPARISON.md comparing ai-guardian.json vs settings.json permission systems
- Covers: architecture diagrams, capabilities comparison, enforcement differences, when to use each
- Explains Skills are only controllable via ai-guardian.json (not in settings.json)
- Documents defense-in-depth best practices using both permission systems
- Includes example configurations for different scenarios (user preferences, enterprise enforcement, defense-in-depth)
- Cross-referenced from README.md "When to Use" section

Changed

Removed Unused Maintainer Detection Code (Issue #231)
- Change: Removed ~450 lines of unused GitHub maintainer detection code from tool_policy.py
- Removed Methods:
  - _get_git_repo_info() - Extract GitHub repo info from git remote
  - _get_authenticated_github_user() - Get GitHub username from gh CLI
  - _check_github_collaborator() - Check if user has write access via GitHub API
  - _get_maintainer_cache() - Read maintainer status from cache
  - _cache_maintainer_status() - Write maintainer status to cache
  - _is_github_maintainer_cached() - Main maintainer check with caching
  - _diagnose_maintainer_bypass() - Diagnostic helper for bypass issues
- Rationale:
  - These methods were no longer called in production code since commit 0f6e456 (April 19, 2026)
  - The _should_skip_immutable_protection() bypass logic was simplified to allow ALL contributors to edit development source (fork + PR workflow)
  - Maintainer check was removed to enable standard open-source contribution workflow
  - Security relies on PR review process, not role-based permissions
- Impact:
  - Reduced codebase complexity (~450 lines removed)
  - Removed dependency on gh CLI for permission checking
  - Eliminated cache file management (~/.cache/ai-guardian/maintainer-status.json)
  - Faster hook execution (no GitHub API calls)
- Tests Updated:
  - Renamed test_maintainer_bypass.py → test_development_source_bypass.py
  - Removed tests for unused GitHub API methods (~400 lines)
  - Kept tests for core bypass logic (_should_skip_immutable_protection)
  - Removed @patch('_is_github_maintainer_cached') mocks from other test files
- No Breaking Changes: The permission model remains unchanged - all contributors can edit development source, config/hooks/cache remain always protected
Secret Redaction Always Redacts (Removed Block Mode) (Issue #234)
- Change: secret_redaction.action="block" mode removed - secrets are now always redacted (never blocked)
- New Default: Changed default action from "log-only" to "warn" for better UX
- Valid Actions: Only "warn" (redact with notification) and "log-only" (redact silently) are now supported
- Breaking Change: Configurations with action="block" will fail validation with a helpful error message
- Rationale:
  - Simpler UX - one behavior, no confusing modes
  - Better DX - AI can still help (sees masked secrets) instead of being completely blocked
  - Same security - real secrets never reach AI
  - Less friction - reading files with secrets doesn't stop work
  - Name matches behavior - "secret_redaction" actually redacts
- Migration: For users who want old "block" behavior, add sensitive files to .gitleaksignore to prevent reading them entirely
- Impact:
  - Schema updated to only allow "warn" and "log-only"
  - TUI dropdown no longer shows "block" option
  - Config validation rejects "block" with migration guidance
  - Default config templates updated to use "warn"
- Files Modified:
  - src/ai_guardian/schemas/ai-guardian-config.schema.json: Updated enum and default
  - src/ai_guardian/secret_redactor.py: Updated docstring and default
  - src/ai_guardian/__init__.py: Simplified to always redact when enabled
  - src/ai_guardian/config_inspector.py: Added validation to reject "block"
  - src/ai_guardian/setup.py: Changed default from "log-only" to "warn"
  - src/ai_guardian/tui/secret_redaction.py: Removed "block" option, default "warn"
- Tests Updated: All tests expecting blocking behavior updated to expect redaction

Fixed

Ignore Files Patterns with Leading **/ Don't Work (Issue #232)
- Root Cause: Three different implementations of ignore_files pattern matching existed with inconsistent behavior:
  - Secret Scanning (__init__.py) - ✅ WORKED - Used custom _match_leading_doublestar_pattern() helper
  - Prompt Injection (prompt_injection.py) - ❌ BROKEN - Only used Path.match() which doesn't properly handle leading **/
  - Config Scanner (config_scanner.py) - ❌ BROKEN - Used fnmatch.fnmatch() which doesn't support **/
- Fix: Extracted _match_leading_doublestar_pattern() to src/ai_guardian/utils/path_matching.py module and updated all three implementations to use it consistently
- Impact: All detectors now properly support leading **/ patterns for ignoring files in subdirectories
- Files Modified:
  - src/ai_guardian/utils/path_matching.py: Created new utility module with match_leading_doublestar_pattern() and match_ignore_pattern() functions
  - src/ai_guardian/__init__.py: Updated to import and use utility function
  - src/ai_guardian/prompt_injection.py: Updated _is_file_ignored() to use utility function
  - src/ai_guardian/config_scanner.py: Updated _should_ignore_file() to use utility function
- Tests Added:
  - tests/test_unicode_attacks.py::UnicodeDetectorIgnoreFilesTest - 2 tests for unicode detection ignore patterns
  - tests/test_config_scanner.py::TestConfigFileScanner::test_ignore_files_leading_double_star_patterns - 1 test for config scanner
Config File Scanner File Path Extraction Bug (Issue #228)
- Problem: Config File Scanner failed to extract file path from PreToolUse hook data when using the Read tool, allowing malicious config files (CLAUDE.md, AGENTS.md, etc.) to pass through unscanned and potentially exfiltrate credentials
- Root Cause: extract_file_content_from_tool() function only checked tool_use.parameters.file_path format, but Claude Code actually sends tool_use.input.file_path, causing file path extraction to fail with "Could not extract file path from hook data" error
- Fix: Added support for tool_use.input.file_path format in file path extraction logic to match Claude Code's actual hook data structure
- Impact: Config File Scanner now properly scans config files for exfiltration patterns (env | curl, AWS S3 uploads, etc.) when read via PreToolUse hooks, protecting against persistent credential theft attacks
- Affected Versions: v1.3.0, v1.4.0, v1.4.1, v1.5.0-dev (bug present since Config File Scanner was added in v1.3.0)
- Test Added: 1 new regression test verifying file path extraction from tool_use.input format (tests/test_ai_guardian.py::test_pretooluse_hook_with_tool_use_input_format)
- Files Modified:
  - src/ai_guardian/__init__.py: Added tool_use.input.file_path check in extract_file_content_from_tool()
  - tests/test_ai_guardian.py: Added regression test with actual Claude Code hook format
PreToolUse Hook Auto-Approve Bug (Issue #224)
- Problem: PreToolUse hook was auto-approving all Edit and Write operations when no secrets were detected, bypassing Claude Code's normal permission prompts and removing user control over file modifications
- Root Cause: format_response() function returned permissionDecision: allow for clean files, which instructed Claude Code to auto-approve the operation
- Fix: PreToolUse now only returns permissionDecision when denying operations (secrets/threats detected). For clean operations, returns empty response to allow Claude Code's normal permission system to prompt the user
- Impact: Users now properly see permission prompts for Edit/Write operations, maintaining informed consent for file modifications
- Affected Versions: v1.3.0, v1.4.0, v1.4.1 (bug introduced with GitHub Copilot integration in v1.3.0)
- Tests Added:
  - Unit Tests: 4 new PreToolUse permission tests covering Edit/Write operations for both Claude Code and GitHub Copilot IDE types (tests/test_hook_processing.py)
  - Integration Tests: 6 new end-to-end tests verifying no auto-approve behavior (tests/test_pretooluse_no_auto_approve.py):
    - Edit operations (Claude Code and GitHub Copilot)
    - Write operations (Claude Code and GitHub Copilot)
    - Verification that secrets still trigger deny (no regression)
    - End-to-end workflow showing user sees permission prompts
  - User Experience Contract Tests: 5 new tests documenting expected UX (tests/test_user_experience_contract.py):
    - Read with secret → Immediate denial (no prompt shown)
    - Edit without secret → Permission prompt shown
    - Comparison test showing different UX for secret vs clean operations
    - Documentation test describing expected behavior for users
    - Manual verification guide for testing in actual Claude Code IDE
  - Updated 3 existing tests to expect correct behavior (no auto-approve)
- Files Modified:
  - src/ai_guardian/__init__.py: Updated format_response() for both GITHUB_COPILOT and CLAUDE_CODE paths
  - tests/test_hook_processing.py: Added PreToolUsePermissionTests class with 4 unit tests
  - tests/test_pretooluse_no_auto_approve.py: Added 6 integration tests (NEW FILE)
  - tests/test_ai_guardian.py: Updated 3 tests to expect correct behavior

Added

Local File Path Support in Remote Configurations (Issue #223)
- Feature: remote_configs now supports local file paths in addition to HTTPS URLs
- Supported Formats:
  - file:// URLs: file:///etc/ai-guardian/config.toml
  - Absolute paths: /etc/ai-guardian/config.toml
  - Tilde expansion: ~/team-configs/allowed-tools.toml
- Caching Behavior:
  - HTTPS URLs: Cached with TTL (default: 12h refresh, 168h expiration)
  - Local files: Always read fresh (bypass cache for immediate updates)
- Use Cases:
  - Development/Testing: Test configs locally without HTTPS server
  - Air-Gapped Environments: Offline systems without internet access
  - Corporate Networks: Shared network drives (NFS, SMB)
  - CI/CD Pipelines: Build environments with local config files
  - Team Configuration: Shared configs in home directories
- Security:
  - Path traversal prevention with Path.resolve(strict=True)
  - File type validation (regular files only)
  - Permission checks before reading
  - Symlinks followed safely with warnings
- Implementation:
  - New RemoteFetcher._fetch_from_local_file() method
  - Updated fetch_config() to bypass caching for local paths
  - Both JSON and TOML formats supported
- Tests Added:
  - Unit Tests (tests/test_remote_fetcher_local.py): 27 passing tests
    - file:// URLs, absolute paths, tilde expansion
    - JSON/TOML format support
    - Error handling (missing files, permission denied, invalid format)
    - Symlink following and broken symlinks
    - No-caching behavior verification
    - Edge cases (spaces in paths, special characters, UTF-8)
  - Integration Tests (tests/test_integration_local_remote_configs.py): 10 passing tests
    - Multiple local sources, cache isolation
    - Mixed local and HTTPS URLs
    - File updates reflected immediately
    - Concurrent updates, error recovery
- Documentation:
  - README.md updated with local file path examples and use cases
  - Security features documented
- Files Modified:
  - src/ai_guardian/remote_fetcher.py: Added local file path support
  - README.md: Added "Local File Paths" section under "Remote Configs vs Directory Discovery"
Integration and Use-Case Tests with Mock MCP Server (Issue #220)
- Comprehensive test infrastructure for MCP tool security testing
- Test Fixtures:
  - tests/fixtures/mock_mcp_server.py: Simulates NotebookLM and other MCP tools with controllable responses
  - tests/fixtures/attack_constants.py: Comprehensive attack patterns (SSRF, secrets, prompt injection, exfiltration)
  - tests/conftest.py: Pytest fixtures for test isolation using AI_GUARDIAN_CONFIG_DIR
- Integration Tests (tests/test_integration_mcp.py): 24 passing tests
  - MCP Tool Permission Tests (6 tests): Allowlists, blocklists, wildcards, custom servers
  - Secret Scanning Tests (4 tests): Secrets in notebook titles/sources, multiple secret types, false positives
  - Prompt Injection Tests (4 tests): Injection in parameters, role-switching, delimiter escapes
  - SSRF Protection Tests (5 tests): AWS/GCP metadata, private IPs, public URLs, Bash-specific behavior
  - Config Exfiltration Tests (3 tests): Curl exfiltration, credential theft in CLAUDE.md/AGENTS.md
  - Combined Protection Tests (2 tests): Multiple protections working together, defense in depth
- PostToolUse Tests (tests/test_posttooluse_mcp.py): 13 passing tests
  - Secret Scanning (5 tests): Bash/Read output with secrets, Write/Edit skipped, clean outputs
  - Content Scanning (3 tests): Documents that PostToolUse only scans secrets, not prompt injection
  - MCP Tool Tests (3 tests): MCP responses, notebook lists, current scanning behavior
  - Redaction Tests (1 test): Secret redaction mode behavior
  - Combined Tests (1 test): Multiple threats in output
- Use-Case Tests (tests/test_use_cases.py): 13 passing tests covering realistic scenarios
  - Data Exfiltration Attack (3 tests): Multi-stage attack attempts via Bash, NotebookLM, SSRF
  - Prompt Injection Chain (2 tests): Attempts to disable protections, privilege escalation prevention
  - Legitimate Workflow (2 tests): Normal NotebookLM usage, security code discussion
  - Enterprise Policy (2 tests): Approved MCP servers only, paranoid mode (all MCP blocked)
  - Multi-Stage Attack (2 tests): Combined injection + exfiltration, privilege escalation
  - Real-World Scenarios (2 tests): Developer workflows, documentation discussions
- Test Isolation: All tests run in isolated temporary directories via isolated_config_dir fixture
- Benefits:
  - ✅ Validates protections work with real MCP tool calls
  - ✅ Catches integration issues between protection layers
  - ✅ Serves as usage examples for MCP security
  - ✅ Prevents regression in multi-protection scenarios
  - ✅ Documents actual implementation behavior (SSRF only on Bash, PostToolUse only scans secrets)
  - ✅ Tests realistic attack chains and defense-in-depth
  - ✅ Validates enterprise policy enforcement
  - ✅ Ensures legitimate workflows work without false positives
- Hook Processing Tests (tests/test_hook_processing.py): 8 passing tests
  - Hook Input Parsing (4 tests): Valid JSON, UserPromptSubmit, PreToolUse, PostToolUse
  - Tool Response Extraction (4 tests): Bash output, Read content, MCP tools, Write/Edit skipped
- Advanced Tool Policy Tests (tests/test_tool_policy_advanced.py): 11 passing tests
  - Rule Matching (2 tests): Wildcard patterns, case sensitivity
  - Rule Ordering (2 tests): First-match wins, default behavior
  - Config Variations (4 tests): Disabled permissions, empty rules, no config, invalid rules
  - Edge Cases (3 tests): Empty tool name, null tool name, missing field
- End-to-End Workflow Tests (tests/test_e2e_workflow.py): 5 passing tests
  - Legitimate Workflows (3 tests): NotebookLM, Bash, Read→Write workflows
  - Secret Detection (1 test): Secret caught at PostToolUse stage
  - Multi-Tool Sequence (1 test): Multiple tools in realistic workflow
- 74 new integration and use-case tests covering all 9 protection layers with MCP tools
- Test Coverage: Core protection modules at 70% (excluding TUI/setup: 4,500 statements, 1,359 missing)
- Part of ongoing MCP security validation effort
Pattern Server Support for Security Features (Issue #206, Epic #186)
- OPTIONAL/ADVANCED: Enterprise pattern server integration for centralized pattern management
- Three-tier pattern system: Immutable core + Pattern server/defaults + Local config additions
- Multiple pattern types: SSRF, Unicode, Config Scanner, Secret Redaction
- Fallback chain: Pattern server → cache → hardcoded defaults (always available)
- Features:
  - PatternServerClient extended for multiple pattern types (ssrf, unicode, config-exfil, secrets)
  - New PatternLoader base class with feature-specific implementations
  - TOML pattern file format with native comment support
  - Source attribution tracking (IMMUTABLE, SERVER, DEFAULT, LOCAL_CONFIG)
  - Pattern server configuration in JSON schema for all four features
  - Maintains 100% backward compatibility (works without pattern server)
- Secret Redaction (highest value): New secret formats deployed in <24h
  - Override modes: replace (server replaces defaults) or extend (adds to defaults)
  - 35+ secret types enterprise-manageable
- SSRF Protection (second priority): RFC 1918 ranges overridable via pattern server
  - Immutable: Cloud metadata endpoints, dangerous URL schemes
  - Overridable: Private IP ranges (enables Docker access for dev teams)
- Unicode Detection: Homoglyph patterns updateable as new scripts emerge
  - Immutable: Zero-width chars, bidi overrides (Unicode spec-based)
  - Overridable: 80+ homoglyph pairs managed via pattern server
- Config Scanner: Enterprise-specific exfiltration patterns
  - Immutable: Core patterns (env|curl, AWS S3, GCP storage)
  - Overridable: Additional pattern server patterns
- Implementation: 6 new files, 4 feature integrations, schema updates
- Documentation: Implementation plan, example patterns (future)
- Testing: Backward compatibility verified, pattern server optional
Phase 5: Integration & Polish - CI/CD and Static Analysis (Issue #198)
- New scan Command for static repository scanning
  - Scans files statically without running as a hook
  - Integrates all Phase 1-4 security checks (SSRF, Unicode, Config Scanner, Secret Detection)
  - File discovery with glob patterns: --include "*.md", --exclude "node_modules/*"
  - Config-only mode: --config-only to scan only AI configuration files
  - Multiple output formats: text (default), JSON (--json-output), SARIF (--sarif-output)
  - CI/CD ready: --exit-code flag exits with code 1 if issues found
  - Usage: ai-guardian scan . --sarif-output results.sarif --exit-code
- SARIF 2.1.0 Output Format for CI/CD integration
  - Industry-standard Static Analysis Results Interchange Format
  - GitHub Code Scanning integration: findings appear in Security tab and PR reviews
  - GitLab Security Dashboard support
  - 5 rule definitions: SSRF-001, UNICODE-001, CONFIG-001, SECRET-001, PROMPT-INJECTION-001
  - Complete metadata: file locations, line numbers, code snippets, severity levels
  - Upload to GitHub: github/codeql-action/upload-sarif@v3
- Pre-commit Hook Templates for git workflow integration
  - Git hook template: templates/pre-commit.sh for direct git integration
  - pre-commit framework template: templates/.pre-commit-config.yaml
  - Safe, non-invasive approach: ai-guardian setup --pre-commit provides templates and instructions WITHOUT auto-installing
  - Detects existing hooks and warns to prevent conflicts with company/team hooks
  - Shows manual integration steps with copy-paste commands
  - Provides snippet for adding to existing pre-commit configurations
  - Scans staged files before commit, blocks commit if issues found
  - Skip with: git commit --no-verify (not recommended)
- Performance Benchmark Suite (tests/benchmark_phases.py)
  - Validates all Phase 1-4 features meet performance targets
  - SSRF check: <1ms per URL (measured: ~0.016ms ✅)
  - Unicode detection: <5ms per check
  - Config file scanning: <10ms per file
  - Secret redaction: <5ms per 10KB output
  - Total overhead: <20ms for all features combined
  - Run with: pytest tests/benchmark_phases.py -v -m benchmark
- Hermes Payload Validation Suite (tests/test_hermes_payloads.py)
  - Validates 10/10 Hermes Security Framework payloads
  - Phase 1 (SSRF): 2/2 payloads - metadata endpoint, private IP
  - Phase 2 (Unicode): 3/3 payloads - zero-width, bidi override, homoglyphs
  - Phase 3 (Config): 3/3 payloads - env|curl, base64 exfil, AWS S3 upload
  - Phase 4 (Secrets): 2/2 payloads - GitHub tokens, AWS keys
  - Meta-tests: Coverage comparison showing AI Guardian exceeds Hermes framework
  - Run with: pytest tests/test_hermes_payloads.py -v -m hermes
- GitHub Actions Workflow Example
  - Ready-to-use workflow for security scanning in CI
  - Automated SARIF upload to GitHub Code Scanning
  - Findings visible in Security tab and PR reviews
- Complete documentation updates
  - Scan command examples in README
  - SARIF output integration guide
  - Pre-commit hook setup instructions
  - Performance benchmarks and targets
- Production-ready features:
  - ✅ Runtime protection (hooks)
  - ✅ Static analysis (scan command)
  - ✅ CI/CD integration (SARIF output)
  - ✅ Developer workflow (pre-commit hooks)
  - ✅ Performance validated (<20ms overhead)
  - ✅ Hermes framework validated (10/10 payloads)
- Part of Hermes Security Patterns integration epic (Issue #186)
Secret Redaction for Tool Outputs (Issue #197, Phase 4: Hermes Security Patterns)
- Redacts secrets from tool outputs instead of blocking them entirely, enabling work to continue while protecting credentials
- Defense-in-depth: Redaction provides a safety net when secrets are unavoidable, complementing existing blocking mechanisms
- 35+ secret types detected and redacted:
  - API keys: OpenAI (sk-proj-), GitHub (ghp_, gho_, ghr_, ghs_), Anthropic (sk-ant-), GitLab (glpat-), Google (AIza), npm, PyPI
  - Cloud provider keys: AWS (AKIA*, aws_secret_access_key), Azure client secrets, Google OAuth tokens
  - Payment/SaaS: Stripe (sk_live_, pk_live_), Twilio (SK*), SendGrid (SG.), Mailgun (key-), Slack (xox*)
  - Private keys: RSA, SSH, PGP (full redaction for maximum security)
  - Structured formats: Environment variables, JSON fields, HTTP headers, database connection strings
  - Generic patterns: Long hex strings, Base64 encoded secrets
- Multiple masking strategies:
  - preserve_prefix_suffix: Keep first 6 + last 4 characters for debugging (e.g., "sk-pro...1vwx")
  - full_redact: Complete replacement with "[HIDDEN TYPE]" for high-sensitivity secrets
  - env_assignment: Preserve variable name (e.g., "AWS_SECRET_KEY=[HIDDEN]")
  - json_field: Preserve JSON structure (e.g., '{"api_key": "[HIDDEN]"}')
  - connection_string: Preserve endpoint info (e.g., "mongodb://user:[HIDDEN]@host:port/db")
- Configuration (secret_redaction section):
  - enabled: Toggle redaction feature (default: true)
  - action: "log-only" (redact silently), "warn" (redact with user warning), "block" (original blocking behavior, default: log-only)
  - preserve_format: Enable prefix/suffix preservation (default: true)
  - log_redactions: Log all redaction events (default: true)
  - additional_patterns: Add custom secret patterns with regex
- Real-world scenarios enabled:
  - ✅ Environment variable debugging: See AWS_REGION=us-east-1 while AWS_SECRET_KEY=[HIDDEN]
  - ✅ Log file analysis: Review 10,000 log lines with buried secrets redacted inline
  - ✅ Config file review: See structure (host: prod-db.example.com) with passwords hidden
  - ✅ Git history analysis: View commits with accidentally-committed secrets redacted
- Integration: Works automatically with PostToolUse hook, requires no changes to existing workflows
- Performance: <5ms overhead per tool output (sub-50ms for 10KB text with 35+ patterns)
- Logging: All redactions logged to violation logger with type, position, and count metadata
- Testing: 28 comprehensive test cases covering all secret types, masking strategies, and edge cases
- Part of Hermes Security Patterns integration (defense-in-depth approach)
SSRF (Server-Side Request Forgery) Protection (Issue #194, Phase 1 of #186)
- Prevents AI agents from accessing private networks, cloud metadata endpoints, and dangerous URL schemes
- Immutable core protections (cannot be disabled):
  - Private IP ranges (RFC 1918): 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8, 169.254.0.0/16
  - IPv6 private ranges: ::1/128, fc00::/7, fe80::/10
  - Cloud metadata endpoints: 169.254.169.254 (AWS/Azure), metadata.google.internal (GCP), fd00:ec2::254 (AWS IPv6)
  - Dangerous URL schemes: file://, gopher://, ftp://, data://, dict://, ldap://
- Fast performance: <1ms overhead per Bash command
- No false positives: Public AWS services (s3.amazonaws.com) are NOT blocked
- Full IPv6 support for all blocking rules
- Configurable features:
  - action modes: block (default), warn, log-only
  - additional_blocked_ips: Add custom IP ranges to block
  - additional_blocked_domains: Add custom domains to block
  - allow_localhost: Enable for local development (default: false)
- Comprehensive test suite: 73 tests including 2 validated Hermes Security Framework SSRF payloads
- Inspired by Hermes Security Framework patterns
- Documentation: docs/SSRF_PROTECTION.md
Unicode Attack Detection for Prompt Injection (Issue #195, Phase 2: Hermes Security Patterns)
- Detects Unicode-based attacks that bypass pattern matching via invisible or look-alike characters
- Zero-width character detection (9 types): U+200B (zero-width space), U+200C (non-joiner), U+200D (joiner), U+FEFF (BOM), U+2060 (word joiner), and 4 more invisible characters
- Bidirectional override detection (2 types): U+202E (RTL override), U+202D (LTR override) for visual deception attacks
- Unicode tag character detection: Deprecated tags (U+E0000 - U+E007F) used for hidden data encoding
- Homoglyph detection (80+ pairs): Cyrillic/Greek/Mathematical look-alikes (e.g., Cyrillic 'е' U+0435 vs Latin 'e' U+0065)
- Smart false positive prevention:
  - Allows emoji with zero-width joiners (e.g., 👨‍👩‍👧‍👦 family emoji) when allow_emoji: true
  - Allows RTL languages (Arabic, Hebrew) with legitimate bidi marks when allow_rtl_languages: true
  - Context-aware detection using surrounding character analysis
- Configuration options under prompt_injection.unicode_detection:
  - enabled: Enable/disable all Unicode detection (default: true)
  - detect_zero_width: Toggle zero-width character detection (default: true)
  - detect_bidi_override: Toggle bidi override detection (default: true)
  - detect_tag_chars: Toggle tag character detection (default: true)
  - detect_homoglyphs: Toggle homoglyph detection (default: true)
  - allow_rtl_languages: Allow legitimate RTL text (default: true)
  - allow_emoji: Allow emoji with zero-width joiners (default: true)
- Performance: <5ms overhead per prompt with early exit on first detection
- Integration: Works with existing action modes (block/warn/log-only)
- Testing: 40 comprehensive test cases covering all attack types and false positive scenarios
- Validates 3/3 Hermes unicode attack payloads (zero-width, bidi override, tag characters)
- Based on Tirith CLI patterns and Hermes Security Framework
- New UnicodeAttackDetector class in src/ai_guardian/prompt_injection.py
- Updated JSON schema with unicode_detection configuration section
- Updated setup.py to include unicode_detection in default config template (ensures ai-guardian setup --create-config includes new options)
Config File Scanner (Issue #196, Phase 3: Hermes Security Patterns)
- Detects credential exfiltration commands in AI configuration files that could cause persistent credential theft across ALL AI sessions
- The Threat: Malicious instructions in CLAUDE.md, AGENTS.md, or .cursorrules execute in every AI session, exfiltrating credentials from all developers on the project
- Persistence Multiplier: 1 malicious config file × N developers × M sessions = N×M credential thefts
- 8 Core Exfiltration Patterns (immutable, cannot be disabled):
  1. curl.*\$\{?[A-Z_][A-Z0-9_]*\}? - curl with environment variables
  2. wget.*\$\{?[A-Z_][A-Z0-9_]*\}? - wget with environment variables
  3. \benv\s*\|.*\bcurl\b - env piped to curl (credential exfiltration)
  4. \bprintenv\b.*\|.*\bcurl\b - printenv exfiltration
  5. \bcat\s+(?:/etc/|~/\.ssh/|~/\.aws/).*\|.*\bcurl\b - file exfiltration
  6. \bbase64\b.*\|.*\bcurl\b - base64 encoded exfiltration
  7. \baws\s+s3\s+(?:cp|sync)\b - AWS S3 upload command
  8. \bgcloud\s+storage\s+cp\b - GCP Cloud Storage upload command
- Standard Config Files Scanned: CLAUDE.md, AGENTS.md, .cursorrules, .aider.conf.yml, .github/CLAUDE.md
- Context-Aware Detection: Ignores documentation examples with keywords (example, warning, don't, avoid, dangerous, attack, threat, security)
- Configurable Options under config_file_scanning:
  - enabled: Enable/disable config file scanning (default: true)
  - action: "block" (default), "warn", or "log-only"
  - additional_files: Add more config file patterns to scan
  - ignore_files: Glob patterns for files to skip (e.g., "/examples/", "/docs/")
  - additional_patterns: Add custom regex patterns to detect
- Performance: <10ms overhead per config file scan with early exit on first match
- Testing: 37 comprehensive test cases including all 3 Hermes config file payloads
- Integration: Runs after prompt injection detection, before secret scanning in PreToolUse hook
- New ConfigFileScanner class in src/ai_guardian/config_scanner.py
- Updated JSON schema with config_file_scanning configuration section
- Updated setup.py to include config_file_scanning in default config template
- Inspired by Hermes Security Framework patterns
Documented --create-config and --permissive flags in README (Issue #199)
- Quick Start section now shows ai-guardian setup --create-config as the recommended way to create config files
- Explains difference between secure mode (default) and permissive mode (--permissive flag)
- Setup Command section includes --create-config examples in Basic Usage
- Includes dry-run preview example (--create-config --dry-run)
- Makes onboarding easier by highlighting the automated config creation introduced in v1.4.0
Version information in all log entries (Issue #190)
- Every log line now includes AI Guardian version (e.g., v1.5.0)
- New log format: YYYY-MM-DD HH:MM:SS - v{VERSION} - logger - LEVEL - message
- Version logged explicitly at startup with Python version and platform information
- Helps correlate bugs with specific releases and verify fixes
- No manual version strings needed in log statements - automatically injected via custom LogRecord factory
- Example log output:
```
2026-04-21 18:49:20 - v1.5.0 - root - INFO - AI Guardian v1.5.0 initialized
2026-04-21 18:49:20 - v1.5.0 - root - INFO - Python 3.12.11
2026-04-21 18:49:20 - v1.5.0 - root - INFO - Platform: Darwin-25.4.0-arm64
```

Changed

Clarified zero-configuration installation in README (Issue #216)
- Quick Start section now emphasizes that ai-guardian works immediately after installing gitleaks with zero configuration required
- Added "Default Behavior (No Configuration File)" section showing which features are enabled by default
- Added minimal configuration example showing that only specific restrictions need to be configured
- Reorganized Quick Start to clearly separate zero-config installation from optional advanced configuration
- Makes it clearer that configuration is only needed for tool/skill restrictions, directory rules, custom patterns, or log-only mode
- All core protections (secret scanning, prompt injection, SSRF, config file scanning, immutable file protection) work out-of-the-box

Fixed

Setup command now generates complete configuration with violation_logging section (Issue #214)
- Fixed missing violation_logging section in ai-guardian setup --create-config output
- Added violation_logging property to JSON schema with proper validation
- Users can now discover and configure violation logging from generated config files
- Includes all log types: tool_permission, directory_blocking, secret_detected, secret_redaction, prompt_injection
- Improves discoverability of violation logging feature (available since v1.1.0)
Overly aggressive self-protection heuristic no longer blocks legitimate content (Issue #188)
- Fixed false positives where commands mentioning "ai-guardian" in content were blocked
- Self-protection patterns are now path-specific, only blocking when targeting actual protected files:
  - Config files: *ai-guardian.json, */.config/ai-guardian/*
  - IDE hooks: */.claude/settings.json, */.cursor/hooks.json
  - Package code: */site-packages/ai_guardian/*, */ai-guardian/src/ai_guardian/*
  - Cache files: */.cache/ai-guardian/*
  - Directory markers: */.ai-read-deny
- Now allows legitimate use cases:
  - Writing code reviews mentioning "ai-guardian" (e.g., echo "Review mentions ai-guardian" > /tmp/review.md)
  - Creating documentation about ai-guardian (e.g., echo "Install ai-guardian using pip" > docs/README.md)
  - Writing bug reports containing "ai-guardian" text
- Protection remains strong for actual config/hook files - only the heuristic is more precise
- Added 9 new test cases to prevent regression

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.5.0

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Added

Security

Changed

Fixed

Added

Changed

Fixed

Added

Added

Changed

Fixed

Contributors

Uh oh!