v1.5.0
Added
- Scanner Installer: Skip Installation if Already Up-to-Date (Issue #271)
  - Smart Version Checking: `ai-guardian scanner install` now checks if scanner is already installed before downloading
  - Skip When Up-to-Date: Automatically skips installation if the latest version is already installed
  - Upgrade Detection: Automatically upgrades when a newer version is available
  - Downgrade Protection: Does not auto-downgrade without explicit `--version` flag
  - Explicit Control: `--version` flag allows downgrade or reinstall of specific versions
  - Clear Messaging: Shows current version, target version, and action taken
  - Version Comparison: Proper semantic version comparison (e.g., 8.30.1 < 8.31.0)
  - Performance: Saves bandwidth and time by skipping unnecessary downloads
  - Implementation:
    - Added `_get_installed_version()` method to detect currently installed version
    - Added `_compare_versions()` method for semantic version comparison
    - Updated `install()` method with version checking logic
  - Test Coverage: Added 10 comprehensive test cases for version checking scenarios
  - Benefits: Faster installation, bandwidth-friendly, safe (no auto-downgrades), clear user feedback
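The skip, upgrade, and downgrade rules above can be sketched as follows. This is a minimal illustration, not the project's code: `parse_version` and `decide_action` are hypothetical names, and the sketch assumes purely numeric dotted versions such as 8.30.1.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '8.30.1' (or 'v8.30.1') into (8, 30, 1) so parts compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def decide_action(installed: Optional[str], target: str, explicit: bool = False) -> str:
    """Pick an action for the installer (hypothetical helper).

    `explicit` stands for the user passing --version; only then may the
    installer reinstall the same version or move to an older one.
    """
    if installed is None:
        return "install"
    current, wanted = parse_version(installed), parse_version(target)
    if current == wanted:
        return "reinstall" if explicit else "skip"
    if current < wanted:
        return "upgrade"
    return "downgrade" if explicit else "skip"
```

Comparing integer tuples rather than strings matters here: as strings, "8.9.0" sorts after "8.10.0", which would make the upgrade check wrong.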
- SSRF Protection: Wildcard Domain Pattern Support (Issue #253)
  - New Feature: Added wildcard pattern matching for `additional_blocked_domains` configuration
  - Syntax: Supports `*` (match zero or more characters) and `?` (match exactly one character)
  - Pattern Examples:
    - `*.internal.com` - Block all .internal.com domains (api.internal.com, db.internal.com)
    - `admin.*` - Block admin.* with any suffix (admin.example.com, admin.local)
    - `*.corp.*` - Block all .corp. domains (api.corp.internal, db.corp.example.com)
    - `metadata.*` - Block all metadata.* endpoints (metadata.aws.com, metadata.google.internal)
    - `test?.example.com` - Block test1.example.com, test2.example.com, testa.example.com
  - Use Cases:
    - Block entire TLDs with single pattern (`*.internal`, `*.local`)
    - Block subdomain patterns (`*.admin.example.com`)
    - Block naming patterns (`metadata.*`, `admin.*`)
    - Enterprise-wide policies with simplified configuration
  - Backward Compatibility: Exact domain matching and subdomain matching still work as before
  - Pattern Validation: Invalid patterns are rejected at config load time with warnings
  - Performance: Patterns stored separately from exact domains for optimal matching
  - Files Modified:
    - `src/ai_guardian/ssrf_protector.py`: Added `fnmatch` import, `_blocked_domain_patterns` list, `_is_valid_domain_pattern()` method, pattern matching in `_is_domain_blocked()`
    - `src/ai_guardian/schemas/ai-guardian-config.schema.json`: Updated `additional_blocked_domains` description with wildcard pattern syntax
    - `docs/SSRF_PROTECTION.md`: Comprehensive documentation with wildcard pattern examples and use cases
  - Tests: Added 11 comprehensive test cases in `TestWildcardDomainPatterns` class
  - Impact: Users can now use flexible wildcard patterns to block domains more efficiently
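As a rough illustration of how such patterns behave, the sketch below checks a hostname against exact domains first and wildcard patterns second, using the standard-library `fnmatch` module that the entry mentions. `is_blocked` is a hypothetical helper, not the project's `_is_domain_blocked()`, and it assumes patterns are written in lowercase.

```python
from fnmatch import fnmatchcase


def is_blocked(hostname, exact_domains, patterns):
    """Return True if hostname hits an exact/subdomain entry or a wildcard pattern."""
    host = hostname.lower().rstrip(".")
    # Exact and subdomain matching, kept separate from wildcard patterns.
    for domain in exact_domains:
        if host == domain or host.endswith("." + domain):
            return True
    # `*` matches zero or more characters, `?` matches exactly one.
    return any(fnmatchcase(host, pattern) for pattern in patterns)


patterns = ["*.internal.com", "admin.*", "test?.example.com"]
print(is_blocked("api.internal.com", [], patterns))
print(is_blocked("test12.example.com", [], patterns))
```

Note that under these rules `*.internal.com` does not match the bare `internal.com`, because the pattern requires a literal dot before the suffix.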
- SSRF Protection: URL Allow-List Support (Issue #252)
  - New Configuration: Added `allowed_domains` array to `ssrf_protection` configuration
  - Purpose: Allow specific trusted domains/URLs while maintaining core protections
  - Evaluation Order (Deny-First):
    1. Check immutable core protections (metadata endpoints, dangerous schemes, private IPs)
    2. Check deny-list (`additional_blocked_domains`)
    3. Check allow-list (`allowed_domains`) - can override step 2, NOT step 1
  - Use Cases:
    - Allow specific internal APIs while blocking other internal domains
    - Allow development/staging servers without allowing all localhost
    - Allow specific partner domains on restricted networks
    - Provide granular control to override broad domain blocks
  - Domain Matching: Supports exact match and subdomain matching
    - `"api.corp.internal"` allows `api.corp.internal` and `v1.api.corp.internal`
  - Security: Cannot override immutable core protections
    - Metadata endpoints (169.254.169.254, metadata.google.internal) remain blocked
    - Private IP ranges (RFC 1918) remain blocked
    - Dangerous schemes (file://, gopher://) remain blocked
  - Files Modified:
    - `src/ai_guardian/schemas/ai-guardian-config.schema.json`: Added `allowed_domains` property
    - `src/ai_guardian/ssrf_protector.py`: Implemented allow-list logic in `_check_url()`
    - `src/ai_guardian/setup.py`: Added `allowed_domains: []` to default config
    - `ai-guardian-example.json`: Added examples and security warnings
    - `docs/SSRF_PROTECTION.md`: Comprehensive documentation with examples
    - `AGENTS.md`: Enhanced schema change checklist
  - Tests: Added 9 comprehensive test cases in `tests/test_ssrf_protection.py`
  - Impact: Users can now create exceptions for specific domains while maintaining strong security boundaries
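The deny-first evaluation order can be sketched as below. This is an illustrative simplification under stated assumptions, not the logic in `_check_url()`: `check_url`, `CORE_BLOCKED_HOSTS`, and `DANGEROUS_SCHEMES` are hypothetical names, and the core checks are reduced to the examples the entry lists.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical stand-ins for the immutable core protections.
CORE_BLOCKED_HOSTS = {"169.254.169.254", "metadata.google.internal"}
DANGEROUS_SCHEMES = {"file", "gopher"}


def _matches(host, domains):
    """Exact match or subdomain match against a list of domains."""
    return any(host == d or host.endswith("." + d) for d in domains)


def _is_private_ip(host):
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:  # not an IP literal
        return False


def check_url(url, blocked_domains, allowed_domains):
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Step 1: core protections; the allow-list is never consulted here.
    if parts.scheme in DANGEROUS_SCHEMES or host in CORE_BLOCKED_HOSTS or _is_private_ip(host):
        return "block"
    # Steps 2 and 3: a deny-list hit stands unless the allow-list overrides it.
    if _matches(host, blocked_domains) and not _matches(host, allowed_domains):
        return "block"
    return "allow"
```

With `blocked_domains=["corp.internal"]` and `allowed_domains=["api.corp.internal"]`, `https://v1.api.corp.internal/` is allowed while `https://db.corp.internal/` stays blocked, and no allow-list entry can unblock `169.254.169.254`.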
Security
- Cascading Priority for Remote Config URLs to Prevent Immutability Bypass (Issue #255)
  - Fix: Implemented first-match-wins cascading for remote config URL sources
  - Vulnerability: Users could bypass `immutable: true` enterprise policies by adding their own remote config URLs in local/user configs
  - Attack Scenario: Enterprise deploys system config with `immutable: true` SSRF protection, user adds attacker-controlled remote URL that disables it
  - Solution: Remote config URLs now follow strict priority hierarchy (system config → env var → user config → local config)
    - System config (`/etc/ai-guardian/remote-configs.json`): Highest priority, requires root/admin, blocks all lower sources
    - Environment variable (`AI_GUARDIAN_REMOTE_CONFIG_URLS`): Second priority, blocks user/local sources
    - User config (`~/.config/ai-guardian/ai-guardian.json`): Third priority, blocks local config
    - Local config (`~/.ai-guardian.json`): Lowest priority fallback
  - Implementation:
    - Added `_get_system_config_path()`: Returns platform-specific system config path (Linux/macOS: `/etc/ai-guardian/remote-configs.json`, Windows: `C:\ProgramData\ai-guardian\remote-configs.json`)
    - Refactored `_load_remote_configs()`: Implements cascading with early return on first match
    - Added `_fetch_remote_configs()`: Helper to reduce code duplication
  - Testing: Added 5 new test cases in `test_immutable_configs.py`:
    - System config blocks user remote URLs
    - Environment variable takes priority over user config
    - User remote URLs work without system config
    - Local config has lowest priority
    - Legacy format (direct list) still works
  - Backward Compatibility: ✅ Existing users with remote_configs in user/local files continue working unchanged
  - Enterprise Deployment: Enterprises can now deploy one system config file to enforce policies across all users
  - Impact: Critical security fix - prevents users from bypassing all enterprise security policies
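The first-match-wins cascade can be pictured with the short sketch below. It is a hedged illustration only: `select_remote_config_urls` is a hypothetical function, and the caller is assumed to have already read each source into a list of URLs, ordered from highest to lowest priority.

```python
def select_remote_config_urls(sources):
    """Return (source_name, urls) for the first source that defines any URLs.

    `sources` is a list of (name, urls) pairs ordered by priority.
    Lower-priority sources are ignored as soon as one source matches.
    """
    for name, urls in sources:
        if urls:
            return name, list(urls)
    return None, []


sources = [
    ("system", ["https://policy.example.com/base.json"]),  # example URL only
    ("env", []),
    ("user", ["https://attacker.example.net/disable.json"]),
    ("local", []),
]
print(select_remote_config_urls(sources))
```

Because the loop returns on the first non-empty source, the user-level URL in this example is never fetched once a system-level list exists, which is the property that closes the bypass.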
Changed
- Documentation: Clarify SSRF Protection Limitations and Scope (Issue #256)
  - Updated docs/SSRF_PROTECTION.md: Added "Important Limitations" section at the top
    - Clearly explains what SSRF protection CAN and CANNOT protect against
    - Documents pattern-based filtering vs comprehensive network security
    - Added OpenShell integration guide for comprehensive SSRF protection
    - Explains hook-based architecture and why limitations exist
  - Updated README.md: SSRF section now includes limitation disclaimers
    - Examples of what CAN be blocked (explicit URLs in Bash/tool parameters)
    - Examples of what CANNOT be blocked (MCP server internal calls)
    - Recommendations for network-level controls and MCP sandboxing
  - Updated ai-guardian-example.json: Added comprehensive limitation comments
    - Explains pattern-based filtering cannot replace network security
    - Documents that it cannot detect MCP internal network calls
    - Notes about HTTP redirects and dynamic URL construction
  - Updated src/ai_guardian/ssrf_protector.py: Enhanced module docstring
    - Clear architecture explanation (hook-based, not proxy)
    - Defense in depth strategy documentation
    - Usage guidance and limitations
  - Updated error messages: SSRF block/warn messages now mention limitations
    - Changed "SSRF ATTACK DETECTED" to "SSRF PATTERN DETECTED"
    - Added note about pattern-based detection
    - Recommends firewall rules and network controls
    - References docs/SSRF_PROTECTION.md
  - Impact: Users now have realistic expectations about SSRF protection scope
  - Key Message: ai-guardian catches obvious SSRF attempts in command strings but cannot replace network-level security
Fixed
- Setup.py Missing permissions_directories in Default Config Template (Issue #240)
  - Fix: Added `permissions_directories` field to `_get_default_config_template()` function in setup.py
  - Problem: Users running `ai-guardian setup --create-config` got incomplete configuration files missing the `permissions_directories` option
  - Root Cause: When `permissions_directories` was added to the schema, setup.py wasn't updated (violating AGENTS.md configuration consistency guidelines)
  - Impact: Generated configs now include `permissions_directories` with comprehensive comments and examples:
    - Local directory scanning example (`~/.claude/skills`)
    - GitHub repository scanning example with token_env
    - Documentation explaining it's OPTIONAL/ADVANCED and most users should prefer remote_configs
  - Location: Added to setup.py after permissions section (line 893), before directory_rules
  - Verification: Tested via `_get_default_config_template()` and confirmed field appears in generated config
- JSON Schema Missing Definitions (Issue #239)
  - Fix: Added missing `pattern_server_auth` and `pattern_server_cache` definitions to schema
  - Problem: Schema referenced definitions that didn't exist, causing validation failures for pattern_server configurations
  - Root Cause: When pattern_server was refactored from root-level to nested under each feature in v1.7.0, the auth/cache structures were not extracted into reusable definitions
  - Impact: Schema validation now succeeds for configs using pattern_server with auth/cache in:
    - `secret_redaction.pattern_server`
    - `prompt_injection.unicode_detection.pattern_server`
    - `ssrf_protection.pattern_server`
    - `config_file_scanning.pattern_server`
  - Tests: Added comprehensive test suite (`test_pattern_server_definitions.py`) validating all pattern_server references
Added
- LeakTK Pattern Server Documentation (Issue #156)
  - Added comprehensive documentation for using LeakTK patterns as a pattern server
  - README.md: Added "Using LeakTK Patterns (Recommended)" section with quick start guide
    - Benefits: Free, community-maintained, 104+ rules, no authentication required
    - Configuration example using GitHub raw content URL
    - Verification steps and expected log output
  - docs/SECRET_SCANNING.md: Added complete LeakTK integration guide
    - Pattern sources comparison table (LeakTK vs Gitleaks defaults)
    - Configuration options and cache settings
    - Pattern version compatibility table (8.25.0, 8.26.0, 8.27.0)
    - Troubleshooting guide and common issues
    - FAQ section covering offline usage, updates, firewall workarounds
    - Example workflows for combining LeakTK with project-specific patterns
  - ai-guardian-example.json: Added LeakTK example configuration
    - Documented free, community-maintained pattern source
    - Reference to LeakTK GitHub repository
  - Feature already implemented and tested - documentation completes the feature
  - LeakTK repository: https://github.com/leaktk/patterns
- Permissions Comparison Documentation (Issue #235)
  - Added comprehensive `docs/PERMISSIONS_COMPARISON.md` comparing ai-guardian.json vs settings.json permission systems
  - Covers: architecture diagrams, capabilities comparison, enforcement differences, when to use each
  - Explains Skills are only controllable via ai-guardian.json (not in settings.json)
  - Documents defense-in-depth best practices using both permission systems
  - Includes example configurations for different scenarios (user preferences, enterprise enforcement, defense-in-depth)
  - Cross-referenced from README.md "When to Use" section
Changed
- Removed Unused Maintainer Detection Code (Issue #231)
  - Change: Removed ~450 lines of unused GitHub maintainer detection code from `tool_policy.py`
  - Removed Methods:
    - `_get_git_repo_info()` - Extract GitHub repo info from git remote
    - `_get_authenticated_github_user()` - Get GitHub username from gh CLI
    - `_check_github_collaborator()` - Check if user has write access via GitHub API
    - `_get_maintainer_cache()` - Read maintainer status from cache
    - `_cache_maintainer_status()` - Write maintainer status to cache
    - `_is_github_maintainer_cached()` - Main maintainer check with caching
    - `_diagnose_maintainer_bypass()` - Diagnostic helper for bypass issues
  - Rationale:
    - These methods were no longer called in production code since commit `0f6e456` (April 19, 2026)
    - The `_should_skip_immutable_protection()` bypass logic was simplified to allow ALL contributors to edit development source (fork + PR workflow)
    - Maintainer check was removed to enable standard open-source contribution workflow
    - Security relies on PR review process, not role-based permissions
  - Impact:
    - Reduced codebase complexity (~450 lines removed)
    - Removed dependency on `gh` CLI for permission checking
    - Eliminated cache file management (`~/.cache/ai-guardian/maintainer-status.json`)
    - Faster hook execution (no GitHub API calls)
  - Tests Updated:
    - Renamed `test_maintainer_bypass.py` → `test_development_source_bypass.py`
    - Removed tests for unused GitHub API methods (~400 lines)
    - Kept tests for core bypass logic (`_should_skip_immutable_protection`)
    - Removed `@patch('_is_github_maintainer_cached')` mocks from other test files
  - No Breaking Changes: The permission model remains unchanged - all contributors can edit development source, config/hooks/cache remain always protected
- Secret Redaction Always Redacts (Removed Block Mode) (Issue #234)
  - Change: `secret_redaction.action="block"` mode removed - secrets are now always redacted (never blocked)
  - New Default: Changed default action from "log-only" to "warn" for better UX
  - Valid Actions: Only "warn" (redact with notification) and "log-only" (redact silently) are now supported
  - Breaking Change: Configurations with `action="block"` will fail validation with a helpful error message
  - Rationale:
    - Simpler UX - one behavior, no confusing modes
    - Better DX - AI can still help (sees masked secrets) instead of being completely blocked
    - Same security - real secrets never reach AI
    - Less friction - reading files with secrets doesn't stop work
    - Name matches behavior - "secret_redaction" actually redacts
  - Migration: For users who want old "block" behavior, add sensitive files to `.gitleaksignore` to prevent reading them entirely
  - Impact:
    - Schema updated to only allow "warn" and "log-only"
    - TUI dropdown no longer shows "block" option
    - Config validation rejects "block" with migration guidance
    - Default config templates updated to use "warn"
  - Files Modified:
    - `src/ai_guardian/schemas/ai-guardian-config.schema.json`: Updated enum and default
    - `src/ai_guardian/secret_redactor.py`: Updated docstring and default
    - `src/ai_guardian/__init__.py`: Simplified to always redact when enabled
    - `src/ai_guardian/config_inspector.py`: Added validation to reject "block"
    - `src/ai_guardian/setup.py`: Changed default from "log-only" to "warn"
    - `src/ai_guardian/tui/secret_redaction.py`: Removed "block" option, default "warn"
  - Tests Updated: All tests expecting blocking behavior updated to expect redaction
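A validation step of the kind described above might look like the following sketch. The function name, the error wording, and the plain-dict config shape are assumptions made for illustration; the project's `config_inspector.py` may differ.

```python
VALID_ACTIONS = ("warn", "log-only")


def validate_secret_redaction_action(config):
    """Return the effective action, rejecting the removed "block" mode."""
    # "warn" is the new default when no action is configured.
    action = config.get("secret_redaction", {}).get("action", "warn")
    if action == "block":
        raise ValueError(
            'secret_redaction.action="block" is no longer supported; '
            'use "warn" or "log-only", or list sensitive files in .gitleaksignore'
        )
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown secret_redaction.action: {action!r}")
    return action
```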
Fixed
- Ignore Files Patterns with Leading `**/` Don't Work (Issue #232)
  - Root Cause: Three different implementations of `ignore_files` pattern matching existed with inconsistent behavior:
    - Secret Scanning (`__init__.py`) - ✅ WORKED - Used custom `_match_leading_doublestar_pattern()` helper
    - Prompt Injection (`prompt_injection.py`) - ❌ BROKEN - Only used `Path.match()` which doesn't properly handle leading `**/`
    - Config Scanner (`config_scanner.py`) - ❌ BROKEN - Used `fnmatch.fnmatch()` which doesn't support `**/`
  - Fix: Extracted `_match_leading_doublestar_pattern()` to `src/ai_guardian/utils/path_matching.py` module and updated all three implementations to use it consistently
  - Impact: All detectors now properly support leading `**/` patterns for ignoring files in subdirectories
  - Files Modified:
    - `src/ai_guardian/utils/path_matching.py`: Created new utility module with `match_leading_doublestar_pattern()` and `match_ignore_pattern()` functions
    - `src/ai_guardian/__init__.py`: Updated to import and use utility function
    - `src/ai_guardian/prompt_injection.py`: Updated `_is_file_ignored()` to use utility function
    - `src/ai_guardian/config_scanner.py`: Updated `_should_ignore_file()` to use utility function
  - Tests Added:
    - `tests/test_unicode_attacks.py::UnicodeDetectorIgnoreFilesTest` - 2 tests for unicode detection ignore patterns
    - `tests/test_config_scanner.py::TestConfigFileScanner::test_ignore_files_leading_double_star_patterns` - 1 test for config scanner
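To show why plain `fnmatch` falls short and what a leading `**/` helper has to do, here is one possible approach. It is a sketch, not the code in `path_matching.py`: `match_leading_doublestar` is a hypothetical name, and relative POSIX-style paths are assumed.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def match_leading_doublestar(path, pattern):
    """Match `path` against `pattern`, letting a leading `**/` stand for
    zero or more directories."""
    if not pattern.startswith("**/"):
        return fnmatchcase(path, pattern)
    rest = pattern[3:]
    parts = PurePosixPath(path).parts
    # Try the remainder of the pattern against every suffix of the path.
    return any(fnmatchcase("/".join(parts[i:]), rest) for i in range(len(parts)))


# Plain fnmatch needs a literal "/" before "fixtures", so the
# zero-directory case fails without the helper.
print(fnmatchcase("fixtures/a.txt", "**/fixtures/*.txt"))
print(match_leading_doublestar("fixtures/a.txt", "**/fixtures/*.txt"))
```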
- Config File Scanner File Path Extraction Bug (Issue #228)
  - Problem: Config File Scanner failed to extract file path from PreToolUse hook data when using the Read tool, allowing malicious config files (CLAUDE.md, AGENTS.md, etc.) to pass through unscanned and potentially exfiltrate credentials
  - Root Cause: `extract_file_content_from_tool()` function only checked `tool_use.parameters.file_path` format, but Claude Code actually sends `tool_use.input.file_path`, causing file path extraction to fail with "Could not extract file path from hook data" error
  - Fix: Added support for `tool_use.input.file_path` format in file path extraction logic to match Claude Code's actual hook data structure
  - Impact: Config File Scanner now properly scans config files for exfiltration patterns (`env | curl`, AWS S3 uploads, etc.) when read via PreToolUse hooks, protecting against persistent credential theft attacks
  - Affected Versions: v1.3.0, v1.4.0, v1.4.1, v1.5.0-dev (bug present since Config File Scanner was added in v1.3.0)
  - Test Added: 1 new regression test verifying file path extraction from `tool_use.input` format (`tests/test_ai_guardian.py::test_pretooluse_hook_with_tool_use_input_format`)
  - Files Modified:
    - `src/ai_guardian/__init__.py`: Added `tool_use.input.file_path` check in `extract_file_content_from_tool()`
    - `tests/test_ai_guardian.py`: Added regression test with actual Claude Code hook format
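The fix amounts to accepting both payload shapes named in the entry. A minimal sketch, assuming the hook data is a plain dict and using the hypothetical helper name `extract_file_path`:

```python
def extract_file_path(hook_data):
    """Return the file path from either payload shape, or None if absent."""
    tool_use = hook_data.get("tool_use") or {}
    # Check the `input` shape first, then fall back to `parameters`.
    for key in ("input", "parameters"):
        section = tool_use.get(key)
        if isinstance(section, dict) and section.get("file_path"):
            return section["file_path"]
    return None


print(extract_file_path({"tool_use": {"input": {"file_path": "CLAUDE.md"}}}))
print(extract_file_path({"tool_use": {"parameters": {"file_path": "AGENTS.md"}}}))
```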
- PreToolUse Hook Auto-Approve Bug (Issue #224)
  - Problem: PreToolUse hook was auto-approving all Edit and Write operations when no secrets were detected, bypassing Claude Code's normal permission prompts and removing user control over file modifications
  - Root Cause: `format_response()` function returned `permissionDecision: allow` for clean files, which instructed Claude Code to auto-approve the operation
  - Fix: PreToolUse now only returns `permissionDecision` when denying operations (secrets/threats detected). For clean operations, returns empty response to allow Claude Code's normal permission system to prompt the user
  - Impact: Users now properly see permission prompts for Edit/Write operations, maintaining informed consent for file modifications
  - Affected Versions: v1.3.0, v1.4.0, v1.4.1 (bug introduced with GitHub Copilot integration in v1.3.0)
  - Tests Added:
    - Unit Tests: 4 new PreToolUse permission tests covering Edit/Write operations for both Claude Code and GitHub Copilot IDE types (`tests/test_hook_processing.py`)
    - Integration Tests: 6 new end-to-end tests verifying no auto-approve behavior (`tests/test_pretooluse_no_auto_approve.py`):
      - Edit operations (Claude Code and GitHub Copilot)
      - Write operations (Claude Code and GitHub Copilot)
      - Verification that secrets still trigger deny (no regression)
      - End-to-end workflow showing user sees permission prompts
    - User Experience Contract Tests: 5 new tests documenting expected UX (`tests/test_user_experience_contract.py`):
      - Read with secret → Immediate denial (no prompt shown)
      - Edit without secret → Permission prompt shown
      - Comparison test showing different UX for secret vs clean operations
      - Documentation test describing expected behavior for users
      - Manual verification guide for testing in actual Claude Code IDE
    - Updated 3 existing tests to expect correct behavior (no auto-approve)
  - Files Modified:
    - `src/ai_guardian/__init__.py`: Updated `format_response()` for both GITHUB_COPILOT and CLAUDE_CODE paths
    - `tests/test_hook_processing.py`: Added `PreToolUsePermissionTests` class with 4 unit tests
    - `tests/test_pretooluse_no_auto_approve.py`: Added 6 integration tests (NEW FILE)
    - `tests/test_ai_guardian.py`: Updated 3 tests to expect correct behavior
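The behavioral change can be summarized in a few lines. This sketch deliberately flattens the response into a simple dict: only the `permissionDecision` key comes from the entry above, while the function name and the `reason` field are assumptions, and the real hook response format may be nested differently.

```python
def format_pretooluse_response(threat_detected, reason=""):
    """Return a decision only when denying; stay silent for clean operations."""
    if threat_detected:
        # Deny explicitly so the operation is stopped.
        return {"permissionDecision": "deny", "reason": reason}
    # Empty response: the host tool's own permission prompt takes over.
    return {}
```

Returning an empty response instead of an explicit "allow" is the key design choice: the hook vetoes dangerous operations but never grants approval on the user's behalf.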
Added
- Local File Path Support in Remote Configurations (Issue #223)
  - Feature: `remote_configs` now supports local file paths in addition to HTTPS URLs
  - Supported Formats:
    - `file://` URLs: `file:///etc/ai-guardian/config.toml`
    - Absolute paths: `/etc/ai-guardian/config.toml`
    - Tilde expansion: `~/team-configs/allowed-tools.toml`
  - Caching Behavior:
    - HTTPS URLs: Cached with TTL (default: 12h refresh, 168h expiration)
    - Local files: Always read fresh (bypass cache for immediate updates)
  - Use Cases:
    - Development/Testing: Test configs locally without HTTPS server
    - Air-Gapped Environments: Offline systems without internet access
    - Corporate Networks: Shared network drives (NFS, SMB)
    - CI/CD Pipelines: Build environments with local config files
    - Team Configuration: Shared configs in home directories
  - Security:
    - Path traversal prevention with `Path.resolve(strict=True)`
    - File type validation (regular files only)
    - Permission checks before reading
    - Symlinks followed safely with warnings
  - Implementation:
    - New `RemoteFetcher._fetch_from_local_file()` method
    - Updated `fetch_config()` to bypass caching for local paths
    - Both JSON and TOML formats supported
  - Tests Added:
    - Unit Tests (`tests/test_remote_fetcher_local.py`): 27 passing tests
      - file:// URLs, absolute paths, tilde expansion
      - JSON/TOML format support
      - Error handling (missing files, permission denied, invalid format)
      - Symlink following and broken symlinks
      - No-caching behavior verification
      - Edge cases (spaces in paths, special characters, UTF-8)
    - Integration Tests (`tests/test_integration_local_remote_configs.py`): 10 passing tests
      - Multiple local sources, cache isolation
      - Mixed local and HTTPS URLs
      - File updates reflected immediately
      - Concurrent updates, error recovery
  - Documentation:
    - README.md updated with local file path examples and use cases
    - Security features documented
  - Files Modified:
    - `src/ai_guardian/remote_fetcher.py`: Added local file path support
    - `README.md`: Added "Local File Paths" section under "Remote Configs vs Directory Discovery"
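A source string in any of the three supported formats could be normalized along these lines. This is a hedged sketch on POSIX-style paths, not `RemoteFetcher._fetch_from_local_file()`: `resolve_local_source` is a hypothetical helper, and it covers only path resolution and the regular-file check.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def resolve_local_source(source):
    """Return a resolved Path for local sources, or None for HTTPS URLs."""
    if source.startswith("https://"):
        return None  # remote source: handled by the cached HTTPS fetcher
    if source.startswith("file://"):
        source = unquote(urlsplit(source).path)
    # expanduser() handles "~/..."; resolve(strict=True) follows symlinks,
    # collapses ".." segments, and raises FileNotFoundError if missing.
    path = Path(source).expanduser().resolve(strict=True)
    if not path.is_file():
        raise ValueError(f"not a regular file: {path}")
    return path
```

Since local files are read fresh on every call, a caller would invoke this each time rather than storing the result in the TTL cache used for HTTPS sources.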
- Integration and Use-Case Tests with Mock MCP Server (Issue #220)
  - Comprehensive test infrastructure for MCP tool security testing
  - Test Fixtures:
    - `tests/fixtures/mock_mcp_server.py`: Simulates NotebookLM and other MCP tools with controllable responses
    - `tests/fixtures/attack_constants.py`: Comprehensive attack patterns (SSRF, secrets, prompt injection, exfiltration)
    - `tests/conftest.py`: Pytest fixtures for test isolation using `AI_GUARDIAN_CONFIG_DIR`
  - Integration Tests (`tests/test_integration_mcp.py`): 24 passing tests
    - MCP Tool Permission Tests (6 tests): Allowlists, blocklists, wildcards, custom servers
    - Secret Scanning Tests (4 tests): Secrets in notebook titles/sources, multiple secret types, false positives
    - Prompt Injection Tests (4 tests): Injection in parameters, role-switching, delimiter escapes
    - SSRF Protection Tests (5 tests): AWS/GCP metadata, private IPs, public URLs, Bash-specific behavior
    - Config Exfiltration Tests (3 tests): Curl exfiltration, credential theft in CLAUDE.md/AGENTS.md
    - Combined Protection Tests (2 tests): Multiple protections working together, defense in depth
  - PostToolUse Tests (`tests/test_posttooluse_mcp.py`): 13 passing tests
    - Secret Scanning (5 tests): Bash/Read output with secrets, Write/Edit skipped, clean outputs
    - Content Scanning (3 tests): Documents that PostToolUse only scans secrets, not prompt injection
    - MCP Tool Tests (3 tests): MCP responses, notebook lists, current scanning behavior
    - Redaction Tests (1 test): Secret redaction mode behavior
    - Combined Tests (1 test): Multiple threats in output
  - Use-Case Tests (`tests/test_use_cases.py`): 13 passing tests covering realistic scenarios
    - Data Exfiltration Attack (3 tests): Multi-stage attack attempts via Bash, NotebookLM, SSRF
    - Prompt Injection Chain (2 tests): Attempts to disable protections, privilege escalation prevention
    - Legitimate Workflow (2 tests): Normal NotebookLM usage, security code discussion
    - Enterprise Policy (2 tests): Approved MCP servers only, paranoid mode (all MCP blocked)
    - Multi-Stage Attack (2 tests): Combined injection + exfiltration, privilege escalation
    - Real-World Scenarios (2 tests): Developer workflows, documentation discussions
  - Test Isolation: All tests run in isolated temporary directories via `isolated_config_dir` fixture
  - Benefits:
    - ✅ Validates protections work with real MCP tool calls
    - ✅ Catches integration issues between protection layers
    - ✅ Serves as usage examples for MCP security
    - ✅ Prevents regression in multi-protection scenarios
    - ✅ Documents actual implementation behavior (SSRF only on Bash, PostToolUse only scans secrets)
    - ✅ Tests realistic attack chains and defense-in-depth
    - ✅ Validates enterprise policy enforcement
    - ✅ Ensures legitimate workflows work without false positives
  - Hook Processing Tests (`tests/test_hook_processing.py`): 8 passing tests
    - Hook Input Parsing (4 tests): Valid JSON, UserPromptSubmit, PreToolUse, PostToolUse
    - Tool Response Extraction (4 tests): Bash output, Read content, MCP tools, Write/Edit skipped
  - Advanced Tool Policy Tests (`tests/test_tool_policy_advanced.py`): 11 passing tests
    - Rule Matching (2 tests): Wildcard patterns, case sensitivity
    - Rule Ordering (2 tests): First-match wins, default behavior
    - Config Variations (4 tests): Disabled permissions, empty rules, no config, invalid rules
    - Edge Cases (3 tests): Empty tool name, null tool name, missing field
  - End-to-End Workflow Tests (`tests/test_e2e_workflow.py`): 5 passing tests
    - Legitimate Workflows (3 tests): NotebookLM, Bash, Read→Write workflows
    - Secret Detection (1 test): Secret caught at PostToolUse stage
    - Multi-Tool Sequence (1 test): Multiple tools in realistic workflow
  - 74 new integration and use-case tests covering all 9 protection layers with MCP tools
  - Test Coverage: Core protection modules at 70% (excluding TUI/setup: 4,500 statements, 1,359 missing)
  - Part of ongoing MCP security validation effort
- Pattern Server Support for Security Features (Issue #206, Epic #186)
  - OPTIONAL/ADVANCED: Enterprise pattern server integration for centralized pattern management
  - Three-tier pattern system: Immutable core + Pattern server/defaults + Local config additions
  - Multiple pattern types: SSRF, Unicode, Config Scanner, Secret Redaction
  - Fallback chain: Pattern server → cache → hardcoded defaults (always available)
  - Features:
    - `PatternServerClient` extended for multiple pattern types (ssrf, unicode, config-exfil, secrets)
    - New `PatternLoader` base class with feature-specific implementations
    - TOML pattern file format with native comment support
    - Source attribution tracking (IMMUTABLE, SERVER, DEFAULT, LOCAL_CONFIG)
    - Pattern server configuration in JSON schema for all four features
    - Maintains 100% backward compatibility (works without pattern server)
  - Secret Redaction (highest value): New secret formats deployed in <24h
    - Override modes: `replace` (server replaces defaults) or `extend` (adds to defaults)
    - 35+ secret types enterprise-manageable
  - SSRF Protection (second priority): RFC 1918 ranges overridable via pattern server
    - Immutable: Cloud metadata endpoints, dangerous URL schemes
    - Overridable: Private IP ranges (enables Docker access for dev teams)
  - Unicode Detection: Homoglyph patterns updateable as new scripts emerge
    - Immutable: Zero-width chars, bidi overrides (Unicode spec-based)
    - Overridable: 80+ homoglyph pairs managed via pattern server
  - Config Scanner: Enterprise-specific exfiltration patterns
    - Immutable: Core patterns (env|curl, AWS S3, GCP storage)
    - Overridable: Additional pattern server patterns
  - Implementation: 6 new files, 4 feature integrations, schema updates
  - Documentation: Implementation plan, example patterns (future)
  - Testing: Backward compatibility verified, pattern server optional
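One way to picture the three-tier system and its fallback chain is the sketch below. It is illustrative only: `build_pattern_set` and its callable parameters are hypothetical, and patterns served from the cache are tagged SERVER here as a simplifying assumption.

```python
def build_pattern_set(immutable, defaults, local, fetch_server, read_cache, mode="extend"):
    """Combine the tiers and tag each pattern with its source.

    fetch_server / read_cache are callables returning a list of patterns
    (or nothing); any failure falls through to the next link in the chain.
    """
    server = None
    for loader in (fetch_server, read_cache):  # pattern server, then cache
        try:
            server = loader()
        except Exception:
            server = None
        if server:
            break

    tagged = [(p, "IMMUTABLE") for p in immutable]
    # Defaults apply when nothing was fetched, or alongside server
    # patterns in "extend" mode; "replace" drops them.
    if not server or mode == "extend":
        tagged += [(p, "DEFAULT") for p in defaults]
    if server:
        tagged += [(p, "SERVER") for p in server]
    tagged += [(p, "LOCAL_CONFIG") for p in local]
    return tagged
```

The immutable tier is added unconditionally, so neither an unreachable server nor a `replace` override can remove the core patterns.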
- Phase 5: Integration & Polish - CI/CD and Static Analysis (Issue #198)
  - New `scan` Command for static repository scanning
    - Scans files statically without running as a hook
    - Integrates all Phase 1-4 security checks (SSRF, Unicode, Config Scanner, Secret Detection)
    - File discovery with glob patterns: `--include "*.md"`, `--exclude "node_modules/*"`
    - Config-only mode: `--config-only` to scan only AI configuration files
    - Multiple output formats: text (default), JSON (`--json-output`), SARIF (`--sarif-output`)
    - CI/CD ready: `--exit-code` flag exits with code 1 if issues found
    - Usage: `ai-guardian scan . --sarif-output results.sarif --exit-code`
  - SARIF 2.1.0 Output Format for CI/CD integration
    - Industry-standard Static Analysis Results Interchange Format
    - GitHub Code Scanning integration: findings appear in Security tab and PR reviews
    - GitLab Security Dashboard support
    - 5 rule definitions: SSRF-001, UNICODE-001, CONFIG-001, SECRET-001, PROMPT-INJECTION-001
    - Complete metadata: file locations, line numbers, code snippets, severity levels
    - Upload to GitHub: `github/codeql-action/upload-sarif@v3`
  - Pre-commit Hook Templates for git workflow integration
    - Git hook template: `templates/pre-commit.sh` for direct git integration
    - pre-commit framework template: `templates/.pre-commit-config.yaml`
    - Safe, non-invasive approach: `ai-guardian setup --pre-commit` provides templates and instructions WITHOUT auto-installing
    - Detects existing hooks and warns to prevent conflicts with company/team hooks
    - Shows manual integration steps with copy-paste commands
    - Provides snippet for adding to existing pre-commit configurations
    - Scans staged files before commit, blocks commit if issues found
    - Skip with: `git commit --no-verify` (not recommended)
  - Performance Benchmark Suite (tests/benchmark_phases.py)
    - Validates all Phase 1-4 features meet performance targets
    - SSRF check: <1ms per URL (measured: ~0.016ms ✅)
    - Unicode detection: <5ms per check
    - Config file scanning: <10ms per file
    - Secret redaction: <5ms per 10KB output
    - Total overhead: <20ms for all features combined
    - Run with: `pytest tests/benchmark_phases.py -v -m benchmark`
  - Hermes Payload Validation Suite (tests/test_hermes_payloads.py)
    - Validates 10/10 Hermes Security Framework payloads
    - Phase 1 (SSRF): 2/2 payloads - metadata endpoint, private IP
    - Phase 2 (Unicode): 3/3 payloads - zero-width, bidi override, homoglyphs
    - Phase 3 (Config): 3/3 payloads - env|curl, base64 exfil, AWS S3 upload
    - Phase 4 (Secrets): 2/2 payloads - GitHub tokens, AWS keys
    - Meta-tests: Coverage comparison showing AI Guardian exceeds Hermes framework
    - Run with: `pytest tests/test_hermes_payloads.py -v -m hermes`
  - GitHub Actions Workflow Example
    - Ready-to-use workflow for security scanning in CI
    - Automated SARIF upload to GitHub Code Scanning
    - Findings visible in Security tab and PR reviews
  - Complete documentation updates
    - Scan command examples in README
    - SARIF output integration guide
    - Pre-commit hook setup instructions
    - Performance benchmarks and targets
  - Production-ready features:
    - ✅ Runtime protection (hooks)
    - ✅ Static analysis (scan command)
    - ✅ CI/CD integration (SARIF output)
    - ✅ Developer workflow (pre-commit hooks)
    - ✅ Performance validated (<20ms overhead)
    - ✅ Hermes framework validated (10/10 payloads)
  - Part of Hermes Security Patterns integration epic (Issue #186)
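For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 document from a list of findings. It shows the general shape of the format rather than ai-guardian's actual output: `to_sarif` and the keys of the finding dicts (`rule_id`, `message`, `path`, `line`, `level`) are assumptions made for the example.

```python
import json


def to_sarif(findings):
    """Wrap findings in a minimal SARIF 2.1.0 log with one run."""
    rule_ids = sorted({f["rule_id"] for f in findings})
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "ai-guardian",
                                "rules": [{"id": rid} for rid in rule_ids]}},
            "results": [{
                "ruleId": f["rule_id"],
                "level": f.get("level", "error"),
                "message": {"text": f["message"]},
                "locations": [{"physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {"startLine": f["line"]},
                }}],
            } for f in findings],
        }],
    }


findings = [{"rule_id": "SSRF-001", "message": "Metadata endpoint URL",
             "path": "CLAUDE.md", "line": 12}]
print(json.dumps(to_sarif(findings), indent=2))
```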
-
Secret Redaction for Tool Outputs (Issue #197, Phase 4: Hermes Security Patterns)
- Redacts secrets from tool outputs instead of blocking them entirely, enabling work to continue while protecting credentials
- Defense-in-depth: Redaction provides a safety net when secrets are unavoidable, complementing existing blocking mechanisms
- 35+ secret types detected and redacted:
- API keys: OpenAI (sk-proj-), GitHub (ghp_, gho_, ghr_, ghs_), Anthropic (sk-ant-), GitLab (glpat-), Google (AIza), npm, PyPI
- Cloud provider keys: AWS (AKIA*, aws_secret_access_key), Azure client secrets, Google OAuth tokens
- Payment/SaaS: Stripe (sk_live_, pk_live_), Twilio (SK*), SendGrid (SG.), Mailgun (key-), Slack (xox*)
- Private keys: RSA, SSH, PGP (full redaction for maximum security)
- Structured formats: Environment variables, JSON fields, HTTP headers, database connection strings
- Generic patterns: Long hex strings, Base64 encoded secrets
- Multiple masking strategies:
- preserve_prefix_suffix: Keep first 6 + last 4 characters for debugging (e.g., "sk-pro...1vwx")
- full_redact: Complete replacement with "[HIDDEN TYPE]" for high-sensitivity secrets
- env_assignment: Preserve variable name (e.g., "AWS_SECRET_KEY=[HIDDEN]")
- json_field: Preserve JSON structure (e.g., '{"api_key": "[HIDDEN]"}')
- connection_string: Preserve endpoint info (e.g., "mongodb://user:[HIDDEN]@host:port/db")
- Configuration (secret_redaction section):
- enabled: Toggle redaction feature (default: true)
- action: "log-only" (redact silently), "warn" (redact with user warning), "block" (original blocking behavior, default: log-only)
- preserve_format: Enable prefix/suffix preservation (default: true)
- log_redactions: Log all redaction events (default: true)
- additional_patterns: Add custom secret patterns with regex
- Real-world scenarios enabled:
- ✅ Environment variable debugging: See AWS_REGION=us-east-1 while AWS_SECRET_KEY=[HIDDEN]
- ✅ Log file analysis: Review 10,000 log lines with buried secrets redacted inline
- ✅ Config file review: See structure (host: prod-db.example.com) with passwords hidden
- ✅ Git history analysis: View commits with accidentally-committed secrets redacted
- Integration: Works automatically with PostToolUse hook, requires no changes to existing workflows
- Performance: <5ms overhead per tool output (sub-50ms for 10KB text with 35+ patterns)
- Logging: All redactions logged to violation logger with type, position, and count metadata
- Testing: 28 comprehensive test cases covering all secret types, masking strategies, and edge cases
- Part of Hermes Security Patterns integration (defense-in-depth approach)
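The preserve_prefix_suffix and env_assignment strategies described above can be sketched as follows. This is an illustration only: the regexes, the `PATTERNS` table, and the function names are assumptions, not the patterns shipped with AI Guardian.

```python
import re

# Hypothetical pattern table: the prefixes come from the changelog, the
# exact regexes are assumptions for illustration only.
PATTERNS = {
    "openai_key": re.compile(r"sk-proj-[A-Za-z0-9_-]{20,}"),
    "github_token": re.compile(r"gh[pors]_[A-Za-z0-9]{36}"),
}
# Assumed heuristic: an upper-case variable whose name mentions a secret.
ENV_SECRET = re.compile(
    r"\b([A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD)[A-Z0-9_]*)=(\S+)"
)


def preserve_prefix_suffix(secret: str, head: int = 6, tail: int = 4) -> str:
    """Keep the first `head` and last `tail` characters, hide the middle."""
    if len(secret) <= head + tail:
        return "[HIDDEN]"  # too short to reveal any part safely
    return f"{secret[:head]}...{secret[-tail:]}"


def redact(text: str) -> str:
    """Redact known token formats, then secret-looking env assignments."""
    for pattern in PATTERNS.values():
        text = pattern.sub(lambda m: preserve_prefix_suffix(m.group(0)), text)
    # env_assignment strategy: keep the variable name, hide the value
    return ENV_SECRET.sub(lambda m: f"{m.group(1)}=[HIDDEN]", text)
```

With this sketch, `redact("AWS_REGION=us-east-1 AWS_SECRET_KEY=abc123")` leaves the region readable and replaces only the secret value, which is the environment-debugging scenario listed above.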
-
SSRF (Server-Side Request Forgery) Protection (Issue #194, Phase 1 of #186)
- Prevents AI agents from accessing private networks, cloud metadata endpoints, and dangerous URL schemes
- Immutable core protections (cannot be disabled):
- Private and reserved IPv4 ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 (RFC 1918), 127.0.0.0/8 (loopback), 169.254.0.0/16 (link-local)
- IPv6 private ranges: ::1/128, fc00::/7, fe80::/10
- Cloud metadata endpoints: 169.254.169.254 (AWS/Azure), metadata.google.internal (GCP), fd00:ec2::254 (AWS IPv6)
- Dangerous URL schemes: file://, gopher://, ftp://, data://, dict://, ldap://
- Fast performance: <1ms overhead per Bash command
- No false positives: Public AWS services (s3.amazonaws.com) are NOT blocked
- Full IPv6 support for all blocking rules
- Configurable features:
- action modes: block (default), warn, log-only
- additional_blocked_ips: Add custom IP ranges to block
- additional_blocked_domains: Add custom domains to block
- allow_localhost: Enable for local development (default: false)
- Comprehensive test suite: 73 tests including 2 validated Hermes Security Framework SSRF payloads
- Inspired by Hermes Security Framework patterns
- Documentation: docs/SSRF_PROTECTION.md
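The core checks listed above can be approximated with the standard library. The sketch below is not the project's implementation: `is_blocked_url` is a hypothetical name, only IP literals and the one metadata hostname from the list are handled, and DNS resolution of other hostnames is left out.

```python
import ipaddress
from urllib.parse import urlsplit

# Ranges, schemes, and the metadata hostname are copied from the list above.
BLOCKED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "127.0.0.0/8", "169.254.0.0/16",
    "::1/128", "fc00::/7", "fe80::/10",
)]
BLOCKED_SCHEMES = {"file", "gopher", "ftp", "data", "dict", "ldap"}
BLOCKED_HOSTS = {"metadata.google.internal"}


def is_blocked_url(url: str) -> bool:
    """Return True if the URL targets a blocked scheme, host, or IP range."""
    parts = urlsplit(url)
    if parts.scheme.lower() in BLOCKED_SCHEMES:
        return True
    host = (parts.hostname or "").lower()
    if host in BLOCKED_HOSTS:
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname, not an IP literal; DNS is out of scope here
    # Membership is always False when the IP versions differ.
    return any(address in network for network in BLOCKED_NETWORKS)
```

Because 169.254.169.254 falls inside 169.254.0.0/16 and fd00:ec2::254 inside fc00::/7, the cloud metadata addresses are covered by the range checks, while a public hostname such as s3.amazonaws.com passes through.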
-
Unicode Attack Detection for Prompt Injection (Issue #195, Phase 2: Hermes Security Patterns)
- Detects Unicode-based attacks that bypass pattern matching via invisible or look-alike characters
- Zero-width character detection (9 types): U+200B (zero-width space), U+200C (non-joiner), U+200D (joiner), U+FEFF (BOM), U+2060 (word joiner), and 4 more invisible characters
- Bidirectional override detection (2 types): U+202E (RTL override), U+202D (LTR override) for visual deception attacks
- Unicode tag character detection: Deprecated tags (U+E0000 - U+E007F) used for hidden data encoding
- Homoglyph detection (80+ pairs): Cyrillic/Greek/Mathematical look-alikes (e.g., Cyrillic 'е' U+0435 vs Latin 'e' U+0065)
- Smart false positive prevention:
- Allows emoji with zero-width joiners (e.g., 👨👩👧👦 family emoji) when allow_emoji: true
- Allows RTL languages (Arabic, Hebrew) with legitimate bidi marks when allow_rtl_languages: true
- Context-aware detection using surrounding character analysis
- Configuration options under prompt_injection.unicode_detection:
- enabled: Enable/disable all Unicode detection (default: true)
- detect_zero_width: Toggle zero-width character detection (default: true)
- detect_bidi_override: Toggle bidi override detection (default: true)
- detect_tag_chars: Toggle tag character detection (default: true)
- detect_homoglyphs: Toggle homoglyph detection (default: true)
- allow_rtl_languages: Allow legitimate RTL text (default: true)
- allow_emoji: Allow emoji with zero-width joiners (default: true)
- Performance: <5ms overhead per prompt with early exit on first detection
- Integration: Works with existing action modes (block/warn/log-only)
- Testing: 40 comprehensive test cases covering all attack types and false positive scenarios
- Validates 3/3 Hermes Unicode attack payloads (zero-width, bidi override, tag characters)
- Based on Tirith CLI patterns and Hermes Security Framework
- New UnicodeAttackDetector class in src/ai_guardian/prompt_injection.py
- Updated JSON schema with unicode_detection configuration section
- Updated setup.py to include unicode_detection in default config template (ensures ai-guardian setup --create-config includes new options)
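The character-class checks described in this entry, with early exit on the first detection, might look roughly like the sketch below. It is not the UnicodeAttackDetector class: `find_unicode_attack` is a hypothetical function, only the code points named in the entry are included, and the homoglyph table and emoji/RTL allowances are omitted.

```python
# Code points taken from the changelog entry; the remaining zero-width
# characters and the homoglyph table are omitted from this sketch.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff", "\u2060"}
BIDI_OVERRIDES = {"\u202e", "\u202d"}
TAG_RANGE = range(0xE0000, 0xE0080)  # U+E0000 - U+E007F


def find_unicode_attack(text: str):
    """Return (category, index) for the first suspicious character, else None."""
    for index, char in enumerate(text):
        if char in ZERO_WIDTH:
            return ("zero_width", index)
        if char in BIDI_OVERRIDES:
            return ("bidi_override", index)
        if ord(char) in TAG_RANGE:
            return ("tag_char", index)
    return None
```

A real detector would also need the context checks mentioned above, since U+200D appears legitimately inside emoji sequences and this sketch would flag it.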
-
Config File Scanner (Issue #196, Phase 3: Hermes Security Patterns)
- Detects credential exfiltration commands in AI configuration files that could cause persistent credential theft across ALL AI sessions
- The Threat: Malicious instructions in CLAUDE.md, AGENTS.md, or .cursorrules execute in every AI session, exfiltrating credentials from all developers on the project
- Persistence Multiplier: 1 malicious config file × N developers × M sessions = N×M credential thefts
- 8 Core Exfiltration Patterns (immutable, cannot be disabled):
- curl.*\$\{?[A-Z_][A-Z0-9_]*\}? - curl with environment variables
- wget.*\$\{?[A-Z_][A-Z0-9_]*\}? - wget with environment variables
- \benv\s*\|.*\bcurl\b - env piped to curl (credential exfiltration)
- \bprintenv\b.*\|.*\bcurl\b - printenv exfiltration
- \bcat\s+(?:/etc/|~/\.ssh/|~/\.aws/).*\|.*\bcurl\b - file exfiltration
- \bbase64\b.*\|.*\bcurl\b - base64 encoded exfiltration
- \baws\s+s3\s+(?:cp|sync)\b - AWS S3 upload command
- \bgcloud\s+storage\s+cp\b - GCP Cloud Storage upload command
- Standard Config Files Scanned: CLAUDE.md, AGENTS.md, .cursorrules, .aider.conf.yml, .github/CLAUDE.md
- Context-Aware Detection: Ignores documentation examples with keywords (example, warning, don't, avoid, dangerous, attack, threat, security)
- Configurable Options under config_file_scanning:
- enabled: Enable/disable config file scanning (default: true)
- action: "block" (default), "warn", or "log-only"
- additional_files: Add more config file patterns to scan
- ignore_files: Glob patterns for files to skip (e.g., "/examples/", "/docs/")
- additional_patterns: Add custom regex patterns to detect
- Performance: <10ms overhead per config file scan with early exit on first match
- Testing: 37 comprehensive test cases including all 3 Hermes config file payloads
- Integration: Runs after prompt injection detection, before secret scanning in PreToolUse hook
- New ConfigFileScanner class in src/ai_guardian/config_scanner.py
- Updated JSON schema with config_file_scanning configuration section
- Updated setup.py to include config_file_scanning in default config template
- Inspired by Hermes Security Framework patterns
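To show how the eight core patterns and the keyword-based context check could fit together, here is a minimal sketch. It is not the ConfigFileScanner class: `scan_config_text` is a hypothetical function, and applying the documentation keywords per line is an assumption about how the context-aware step works.

```python
import re

# The eight core patterns, copied from this changelog entry.
CORE_PATTERNS = [re.compile(p) for p in (
    r"curl.*\$\{?[A-Z_][A-Z0-9_]*\}?",
    r"wget.*\$\{?[A-Z_][A-Z0-9_]*\}?",
    r"\benv\s*\|.*\bcurl\b",
    r"\bprintenv\b.*\|.*\bcurl\b",
    r"\bcat\s+(?:/etc/|~/\.ssh/|~/\.aws/).*\|.*\bcurl\b",
    r"\bbase64\b.*\|.*\bcurl\b",
    r"\baws\s+s3\s+(?:cp|sync)\b",
    r"\bgcloud\s+storage\s+cp\b",
)]
DOC_KEYWORDS = ("example", "warning", "don't", "avoid", "dangerous",
                "attack", "threat", "security")


def scan_config_text(text: str):
    """Return (line_number, line) for the first suspicious line, else None."""
    for number, line in enumerate(text.splitlines(), start=1):
        if any(word in line.lower() for word in DOC_KEYWORDS):
            continue  # assumed per-line heuristic for documentation examples
        if any(p.search(line) for p in CORE_PATTERNS):
            return (number, line)  # early exit on first match
    return None
```

Returning on the first match mirrors the early-exit behavior noted in the performance bullet; a reporting tool would more likely collect every match.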
-
Documented --create-config and --permissive flags in README (Issue #199)
- Quick Start section now shows ai-guardian setup --create-config as the recommended way to create config files
- Explains difference between secure mode (default) and permissive mode (--permissive flag)
- Setup Command section includes --create-config examples in Basic Usage
- Includes dry-run preview example (--create-config --dry-run)
- Makes onboarding easier by highlighting the automated config creation introduced in v1.4.0
-
Version information in all log entries (Issue #190)
- Every log line now includes AI Guardian version (e.g., v1.5.0)
- New log format: YYYY-MM-DD HH:MM:SS - v{VERSION} - logger - LEVEL - message
- Version logged explicitly at startup with Python version and platform information
- Helps correlate bugs with specific releases and verify fixes
- No manual version strings needed in log statements - automatically injected via custom LogRecord factory
- Example log output:
2026-04-21 18:49:20 - v1.5.0 - root - INFO - AI Guardian v1.5.0 initialized
2026-04-21 18:49:20 - v1.5.0 - root - INFO - Python 3.12.11
2026-04-21 18:49:20 - v1.5.0 - root - INFO - Platform: Darwin-25.4.0-arm64
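A custom LogRecord factory of the kind mentioned in this entry can be built with the standard logging module. The sketch below shows the general technique; the `VERSION` constant, the factory name, and the handler setup are assumptions and may differ from what AI Guardian does.

```python
import logging

VERSION = "1.5.0"  # placeholder; a real package would read its own version

_default_factory = logging.getLogRecordFactory()


def _versioned_factory(*args, **kwargs):
    """Create a normal LogRecord, then attach the version as an attribute."""
    record = _default_factory(*args, **kwargs)
    record.version = VERSION
    return record


logging.setLogRecordFactory(_versioned_factory)

# The format string references the injected attribute as %(version)s.
formatter = logging.Formatter(
    fmt="%(asctime)s - v%(version)s - %(name)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
handler = logging.StreamHandler()
handler.setFormatter(formatter)
logging.getLogger().addHandler(handler)
logging.getLogger().setLevel(logging.INFO)
logging.info("AI Guardian v%s initialized", VERSION)
```

Because the factory runs for every record, individual log calls never have to mention the version themselves.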
Changed
- Clarified zero-configuration installation in README (Issue #216)
- Quick Start section now emphasizes that ai-guardian works immediately after installing gitleaks with zero configuration required
- Added "Default Behavior (No Configuration File)" section showing which features are enabled by default
- Added minimal configuration example showing that only specific restrictions need to be configured
- Reorganized Quick Start to clearly separate zero-config installation from optional advanced configuration
- Makes it clearer that configuration is only needed for tool/skill restrictions, directory rules, custom patterns, or log-only mode
- All core protections (secret scanning, prompt injection, SSRF, config file scanning, immutable file protection) work out-of-the-box
Fixed
- Setup command now generates complete configuration with violation_logging section (Issue #214)
- Fixed missing violation_logging section in ai-guardian setup --create-config output
- Added violation_logging property to JSON schema with proper validation
- Users can now discover and configure violation logging from generated config files
- Includes all log types: tool_permission, directory_blocking, secret_detected, secret_redaction, prompt_injection
- Improves discoverability of violation logging feature (available since v1.1.0)
- Overly aggressive self-protection heuristic no longer blocks legitimate content (Issue #188)
- Fixed false positives where commands mentioning "ai-guardian" in content were blocked
- Self-protection patterns are now path-specific, only blocking when targeting actual protected files:
- Config files: *ai-guardian.json, */.config/ai-guardian/*
- IDE hooks: */.claude/settings.json, */.cursor/hooks.json
- Package code: */site-packages/ai_guardian/*, */ai-guardian/src/ai_guardian/*
- Cache files: */.cache/ai-guardian/*
- Directory markers: */.ai-read-deny
- Now allows legitimate use cases:
- Writing code reviews mentioning "ai-guardian" (e.g., echo "Review mentions ai-guardian" > /tmp/review.md)
- Creating documentation about ai-guardian (e.g., echo "Install ai-guardian using pip" > docs/README.md)
- Writing bug reports containing "ai-guardian" text
- Protection remains strong for actual config/hook files - only the heuristic is more precise
- Added 9 new test cases to prevent regression
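The difference between content-based and path-specific matching can be illustrated with glob matching on the target path. This is a sketch, not the shipped heuristic: `is_protected_path` is a hypothetical helper, and extracting the target path from a shell command is assumed to happen elsewhere.

```python
from fnmatch import fnmatchcase

# Glob patterns copied from this changelog entry.
PROTECTED_PATTERNS = [
    "*ai-guardian.json", "*/.config/ai-guardian/*",
    "*/.claude/settings.json", "*/.cursor/hooks.json",
    "*/site-packages/ai_guardian/*", "*/ai-guardian/src/ai_guardian/*",
    "*/.cache/ai-guardian/*", "*/.ai-read-deny",
]


def is_protected_path(path: str) -> bool:
    """True only if the target path matches a protected-file pattern."""
    return any(fnmatchcase(path, pattern) for pattern in PROTECTED_PATTERNS)
```

Only the destination of a write is tested, so a command that merely mentions "ai-guardian" in its content, such as the /tmp/review.md example above, is not affected.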