# Lab 08: AI-Enhanced Vulnerability Scanner

Build an AI-enhanced vulnerability scanner with intelligent prioritization using EPSS, threat intelligence, and asset context.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/depalmar/ai_for_the_win/blob/main/notebooks/lab08_vuln_scanner_ai.ipynb)

## Learning Objectives
- CVE analysis with NVD data enrichment
- CVSS v3.1 scoring interpretation
- EPSS (Exploit Prediction Scoring System) integration
- Threat intelligence correlation (CISA KEV, active exploitation)
- Asset criticality and business context
- AI-powered remediation prioritization
- Automated remediation guidance
- Vulnerability trending and risk forecasting

## Prioritization Factors

Modern vulnerability management considers:
1. **CVSS Base Score** - Technical severity
2. **EPSS Score** - Probability of exploitation in the wild
3. **CISA KEV** - Known exploited vulnerabilities catalog
4. **Threat Intel** - Active exploitation reports
5. **Asset Criticality** - Business impact of affected system
6. **Attack Surface** - Network exposure (internal vs internet-facing)
7. **Compensating Controls** - Existing mitigations
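
One way to see how the first three factors interact is a coarse triage rule that looks at severity and exploitation likelihood together. The sketch below is illustrative only: the function name `triage_bucket`, the bucket labels, and both cutoffs (CVSS 7.0 and EPSS 0.1) are assumptions chosen for this example, not values taken from CVSS, EPSS, or CISA guidance.

```python
def triage_bucket(cvss: float, epss: float, in_kev: bool = False,
                  cvss_cutoff: float = 7.0, epss_cutoff: float = 0.1) -> str:
    """Place a finding in a coarse bucket from severity and likelihood.

    The cutoffs are hypothetical defaults; tune them to your environment.
    """
    if in_kev:
        # Confirmed exploitation outranks any predicted probability
        return "act now"
    severe = cvss >= cvss_cutoff
    likely = epss >= epss_cutoff
    if severe and likely:
        return "patch first"
    if likely:
        return "patch soon"   # likely to be exploited, limited impact
    if severe:
        return "schedule"     # high impact, exploitation unlikely
    return "monitor"


print(triage_bucket(9.8, 0.02))        # schedule
print(triage_bucket(9.1, 0.97, True))  # act now
```

A CVSS 9.8 finding with a 2% EPSS score lands in "schedule" rather than at the top of the queue, which is the behavior CVSS-only ranking cannot express.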

In [None]:
import json
from typing import List, Dict
from dataclasses import dataclass
from enum import Enum

## 1. Vulnerability Data Model

In [None]:
class Severity(Enum):
    CRITICAL = 4
    HIGH = 3
    MEDIUM = 2
    LOW = 1
    INFO = 0

class AssetCriticality(Enum):
    CRITICAL = "critical"      # Domain controllers, production DBs
    HIGH = "high"              # Production servers, web apps
    MEDIUM = "medium"          # Development systems, internal apps
    LOW = "low"                # Workstations, test systems

class ExposureLevel(Enum):
    INTERNET = "internet"      # Directly internet-facing
    DMZ = "dmz"                # In DMZ with some protection
    INTERNAL = "internal"      # Internal network only
    ISOLATED = "isolated"      # Air-gapped or highly segmented

@dataclass
class Vulnerability:
    """Enhanced vulnerability with EPSS and threat intelligence."""
    cve_id: str
    title: str
    cvss_score: float
    cvss_vector: str
    severity: Severity
    description: str
    affected_product: str
    affected_version: str
    exploit_available: bool
    patch_available: bool
    # New fields for enhanced prioritization
    epss_score: float          # 0-1 probability of exploitation in the next 30 days
    epss_percentile: float     # Percentile rank, expressed as 0-100 in this dataset
    in_cisa_kev: bool          # In CISA Known Exploited Vulnerabilities
    kev_due_date: str          # Remediation due date (ISO format) if in KEV, else ""
    actively_exploited: bool   # Current active exploitation reports
    exploit_maturity: str      # None, POC, Functional, High, Weaponized
    ransomware_associated: bool # Used by ransomware groups
    cwe_id: str                # Weakness enumeration
    affected_asset: str        # Asset hostname
    asset_criticality: AssetCriticality
    exposure_level: ExposureLevel
    compensating_controls: List[str]

# Sample dataset: well-known real-world CVEs plus illustrative entries for internal applications
SAMPLE_VULNS = [
    # Critical - Actively exploited, ransomware
    Vulnerability(
        cve_id="CVE-2024-21887",
        title="Ivanti Connect Secure Command Injection",
        cvss_score=9.1,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:H/UI:N/S:C/C:H/I:H/A:H",
        severity=Severity.CRITICAL,
        description="Command injection vulnerability in Ivanti Connect Secure allows remote code execution",
        affected_product="Ivanti Connect Secure",
        affected_version="9.x, 22.x",
        exploit_available=True,
        patch_available=True,
        epss_score=0.97,
        epss_percentile=99.9,
        in_cisa_kev=True,
        kev_due_date="2024-01-22",
        actively_exploited=True,
        exploit_maturity="Weaponized",
        ransomware_associated=True,
        cwe_id="CWE-77",
        affected_asset="vpn-gateway-01",
        asset_criticality=AssetCriticality.CRITICAL,
        exposure_level=ExposureLevel.INTERNET,
        compensating_controls=[]
    ),
    Vulnerability(
        cve_id="CVE-2023-46805",
        title="Ivanti Connect Secure Authentication Bypass",
        cvss_score=8.2,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:L/A:N",
        severity=Severity.HIGH,
        description="Authentication bypass in Ivanti Connect Secure allows unauthorized access",
        affected_product="Ivanti Connect Secure",
        affected_version="9.x, 22.x",
        exploit_available=True,
        patch_available=True,
        epss_score=0.95,
        epss_percentile=99.5,
        in_cisa_kev=True,
        kev_due_date="2024-01-22",
        actively_exploited=True,
        exploit_maturity="Weaponized",
        ransomware_associated=True,
        cwe_id="CWE-287",
        affected_asset="vpn-gateway-01",
        asset_criticality=AssetCriticality.CRITICAL,
        exposure_level=ExposureLevel.INTERNET,
        compensating_controls=[]
    ),
    
    # Critical - Known exploited, not ransomware
    Vulnerability(
        cve_id="CVE-2023-22518",
        title="Atlassian Confluence Improper Authorization",
        cvss_score=9.8,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
        severity=Severity.CRITICAL,
        description="Improper authorization vulnerability allows unauthenticated data destruction",
        affected_product="Atlassian Confluence",
        affected_version="< 8.3.3, < 8.4.3, < 8.5.2",
        exploit_available=True,
        patch_available=True,
        epss_score=0.89,
        epss_percentile=98.2,
        in_cisa_kev=True,
        kev_due_date="2023-11-28",
        actively_exploited=True,
        exploit_maturity="Functional",
        ransomware_associated=False,
        cwe_id="CWE-863",
        affected_asset="wiki-server-01",
        asset_criticality=AssetCriticality.HIGH,
        exposure_level=ExposureLevel.INTERNAL,
        compensating_controls=["WAF rules deployed"]
    ),
    
    # High CVSS but low EPSS - Lower priority
    Vulnerability(
        cve_id="CVE-2024-0001",
        title="Theoretical Buffer Overflow in Legacy App",
        cvss_score=9.8,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
        severity=Severity.CRITICAL,
        description="Buffer overflow in custom legacy application - requires specific conditions",
        affected_product="Legacy Internal App",
        affected_version="1.0",
        exploit_available=False,
        patch_available=False,
        epss_score=0.02,
        epss_percentile=45.0,
        in_cisa_kev=False,
        kev_due_date="",
        actively_exploited=False,
        exploit_maturity="POC",
        ransomware_associated=False,
        cwe_id="CWE-120",
        affected_asset="legacy-app-01",
        asset_criticality=AssetCriticality.LOW,
        exposure_level=ExposureLevel.ISOLATED,
        compensating_controls=["Network segmentation", "Application firewall"]
    ),
    
    # Critical CVSS and high exploitation - Top priority
    Vulnerability(
        cve_id="CVE-2023-20198",
        title="Cisco IOS XE Web UI Command Injection",
        cvss_score=10.0,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H",
        severity=Severity.CRITICAL,
        description="Web UI command injection allows creation of local accounts and privilege escalation",
        affected_product="Cisco IOS XE",
        affected_version="16.x, 17.x",
        exploit_available=True,
        patch_available=True,
        epss_score=0.98,
        epss_percentile=99.95,
        in_cisa_kev=True,
        kev_due_date="2023-10-23",
        actively_exploited=True,
        exploit_maturity="Weaponized",
        ransomware_associated=False,
        cwe_id="CWE-78",
        affected_asset="router-edge-01",
        asset_criticality=AssetCriticality.CRITICAL,
        exposure_level=ExposureLevel.INTERNET,
        compensating_controls=[]
    ),
    
    # SQL Injection - Authenticated and internal, with compensating controls
    Vulnerability(
        cve_id="CVE-2024-0002",
        title="SQL Injection in Internal Portal",
        cvss_score=8.8,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
        severity=Severity.HIGH,
        description="Authenticated SQL injection in search functionality allows data extraction",
        affected_product="Internal HR Portal",
        affected_version="2.1.0",
        exploit_available=False,
        patch_available=False,
        epss_score=0.15,
        epss_percentile=70.0,
        in_cisa_kev=False,
        kev_due_date="",
        actively_exploited=False,
        exploit_maturity="POC",
        ransomware_associated=False,
        cwe_id="CWE-89",
        affected_asset="hr-portal-01",
        asset_criticality=AssetCriticality.MEDIUM,
        exposure_level=ExposureLevel.INTERNAL,
        compensating_controls=["WAF with SQLi rules", "Input validation"]
    ),
    
    # Medium severity, patch available
    Vulnerability(
        cve_id="CVE-2024-0003",
        title="Information Disclosure in Error Pages",
        cvss_score=5.3,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N",
        severity=Severity.MEDIUM,
        description="Verbose error messages expose internal paths and version information",
        affected_product="Custom Web Application",
        affected_version="3.0",
        exploit_available=False,
        patch_available=True,
        epss_score=0.01,
        epss_percentile=25.0,
        in_cisa_kev=False,
        kev_due_date="",
        actively_exploited=False,
        exploit_maturity="None",
        ransomware_associated=False,
        cwe_id="CWE-209",
        affected_asset="web-app-02",
        asset_criticality=AssetCriticality.MEDIUM,
        exposure_level=ExposureLevel.DMZ,
        compensating_controls=["Custom error pages configured"]
    ),
    
    # Log4j - Historical but still relevant
    Vulnerability(
        cve_id="CVE-2021-44228",
        title="Apache Log4j Remote Code Execution (Log4Shell)",
        cvss_score=10.0,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H",
        severity=Severity.CRITICAL,
        description="Remote code execution via JNDI injection in Log4j logging library",
        affected_product="Apache Log4j",
        affected_version="< 2.17.0",
        exploit_available=True,
        patch_available=True,
        epss_score=0.976,
        epss_percentile=99.8,
        in_cisa_kev=True,
        kev_due_date="2021-12-24",
        actively_exploited=True,
        exploit_maturity="Weaponized",
        ransomware_associated=True,
        cwe_id="CWE-917",
        affected_asset="app-server-03",
        asset_criticality=AssetCriticality.HIGH,
        exposure_level=ExposureLevel.INTERNAL,
        compensating_controls=["WAF rules", "Outbound LDAP blocked"]
    ),
    
    # Windows Print Spooler
    Vulnerability(
        cve_id="CVE-2021-34527",
        title="Windows Print Spooler RCE (PrintNightmare)",
        cvss_score=8.8,
        cvss_vector="CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
        severity=Severity.HIGH,
        description="Remote code execution vulnerability in Windows Print Spooler service",
        affected_product="Windows",
        affected_version="All versions before August 2021 patches",
        exploit_available=True,
        patch_available=True,
        epss_score=0.87,
        epss_percentile=97.5,
        in_cisa_kev=True,
        kev_due_date="2021-11-17",
        actively_exploited=True,
        exploit_maturity="Weaponized",
        ransomware_associated=True,
        cwe_id="CWE-269",
        affected_asset="dc-01",
        asset_criticality=AssetCriticality.CRITICAL,
        exposure_level=ExposureLevel.INTERNAL,
        compensating_controls=["Print Spooler disabled"]
    ),
]

print(f"Loaded {len(SAMPLE_VULNS)} vulnerabilities for analysis")
print("\nBreakdown by CISA KEV status:")
kev_count = sum(1 for v in SAMPLE_VULNS if v.in_cisa_kev)
print(f"  In CISA KEV: {kev_count}")
print(f"  Not in KEV: {len(SAMPLE_VULNS) - kev_count}")
print(f"\nRansomware associated: {sum(1 for v in SAMPLE_VULNS if v.ransomware_associated)}")
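
Because each record stores both `cvss_score` and `cvss_vector`, a quick consistency check is to recompute the base score from the vector. The sketch below follows the CVSS v3.1 base-score equations and metric weights as published by FIRST, to the best of my reading of the specification. It handles only base metrics of v3.x vectors, the function names are hypothetical, and results should be checked against the official calculator before being relied on.

```python
import math

# Base metric weights from the CVSS v3.1 specification
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def parse_cvss_vector(vector: str) -> dict:
    """Split 'CVSS:3.1/AV:N/...' into a {metric: value} dict."""
    prefix, *parts = vector.split("/")
    if not prefix.startswith("CVSS:3."):
        raise ValueError(f"Unsupported CVSS vector: {vector}")
    return dict(part.split(":", 1) for part in parts)


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Recompute the CVSS v3.1 base score from a vector string."""
    m = parse_cvss_vector(vector)
    changed = m["S"] == "C"  # Scope changed alters PR weights and formulas
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10.0))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 8.8
```

Comparing `base_score(v.cvss_vector)` with `v.cvss_score` for every entry in `SAMPLE_VULNS` would flag any record whose stored score disagrees with its vector.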

## 2. Risk Prioritization

In [None]:
class VulnPrioritizer:
    """Prioritize vulnerabilities based on risk factors."""
    
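    def calculate_risk_score(self, vuln: Vulnerability) -> float:
        """Calculate a simple risk score from CVSS, exploit availability, and patch status.

        EPSS, CISA KEV, and asset context are carried on the model but not used here.
        """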
        score = vuln.cvss_score
        
        # Boost for exploit availability
        if vuln.exploit_available:
            score += 1.0
        
        # Reduce for patch availability
        if vuln.patch_available:
            score -= 0.5
        
        return min(10.0, max(0.0, score))
    
    def prioritize(self, vulns: List[Vulnerability]) -> List[Dict]:
        """Sort vulnerabilities by risk."""
        scored = []
        for v in vulns:
            risk = self.calculate_risk_score(v)
            scored.append({
                "vulnerability": v,
                "risk_score": risk,
                "priority": "P1" if risk >= 9 else "P2" if risk >= 7 else "P3"
            })
        
        return sorted(scored, key=lambda x: x["risk_score"], reverse=True)

# Prioritize
prioritizer = VulnPrioritizer()
prioritized = prioritizer.prioritize(SAMPLE_VULNS)

print("Prioritized Vulnerabilities:")
print("=" * 60)
for item in prioritized:
    v = item["vulnerability"]
    print(f"[{item['priority']}] {v.cve_id} - Risk: {item['risk_score']:.1f}")
    print(f"    {v.title}")
    print(f"    Exploit: {'Yes' if v.exploit_available else 'No'} | Patch: {'Yes' if v.patch_available else 'No'}")
    print()
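
The scorer above looks only at CVSS, exploit availability, and patch status. A possible extension that also uses EPSS, KEV status, asset criticality, exposure, and compensating controls is sketched below. Everything in it is an assumption made for illustration: the `Finding` stand-in class, the function name `contextual_risk`, the 40/60 blend, the KEV floor of 0.9, the weight tables, and the 10% reduction per control. None of these values come from a standard.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical multipliers; keys mirror the AssetCriticality and ExposureLevel values
ASSET_WEIGHT = {"critical": 1.0, "high": 0.8, "medium": 0.6, "low": 0.4}
EXPOSURE_WEIGHT = {"internet": 1.0, "dmz": 0.8, "internal": 0.6, "isolated": 0.3}


@dataclass
class Finding:
    """Minimal stand-in for the notebook's Vulnerability dataclass."""
    cve_id: str
    cvss_score: float
    epss_score: float
    in_cisa_kev: bool
    asset_criticality: str
    exposure_level: str
    compensating_controls: List[str] = field(default_factory=list)


def contextual_risk(f: Finding) -> float:
    """Blend severity, likelihood, and environment into a 0-10 score."""
    # Treat KEV membership as a floor on exploitation likelihood
    likelihood = max(f.epss_score, 0.9) if f.in_cisa_kev else f.epss_score
    # 40% technical severity, 60% likelihood rescaled to 0-10
    threat = 0.4 * f.cvss_score + 0.6 * 10.0 * likelihood
    # Scale by where the asset sits and how much it matters
    context = ASSET_WEIGHT[f.asset_criticality] * EXPOSURE_WEIGHT[f.exposure_level]
    # Each documented control trims 10%, capped at 30%
    mitigation = 1.0 - min(0.3, 0.1 * len(f.compensating_controls))
    return round(threat * context * mitigation, 2)


findings = [
    Finding("CVE-2024-0001", 9.8, 0.02, False, "low", "isolated",
            ["Network segmentation", "Application firewall"]),
    Finding("CVE-2024-21887", 9.1, 0.97, True, "critical", "internet"),
]
for f in sorted(findings, key=contextual_risk, reverse=True):
    print(f.cve_id, contextual_risk(f))
```

With these weights the internet-facing Ivanti finding ranks far above the isolated legacy application, even though the legacy application has the higher CVSS score.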

## 3. AI Remediation Advisor

In [None]:
class RemediationAdvisor:
    """AI-powered remediation guidance."""
    
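    def __init__(self):
        import os
        self.client = None
        self.available = False
        try:
            from anthropic import Anthropic
        except ImportError:
            return  # SDK not installed; get_remediation() falls back to the mock
        # Only use the live API when a key is configured; otherwise use the mock
        if os.environ.get("ANTHROPIC_API_KEY"):
            self.client = Anthropic()
            self.available = True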
    
    def get_remediation(self, vuln: Vulnerability) -> str:
        """Get remediation advice for vulnerability."""
        if not self.available:
            return self._mock_remediation(vuln)
        
        prompt = f"""Provide remediation guidance for this vulnerability:

CVE: {vuln.cve_id}
Title: {vuln.title}
CVSS: {vuln.cvss_score}
Description: {vuln.description}
Product: {vuln.affected_product} {vuln.affected_version}
Exploit Available: {vuln.exploit_available}
Patch Available: {vuln.patch_available}

Provide:
1. Immediate mitigation steps
2. Long-term remediation
3. Detection recommendations

Be specific and actionable."""
        
        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=512,
            messages=[{"role": "user", "content": prompt}]
        )
        
        return response.content[0].text
    
    def _mock_remediation(self, vuln: Vulnerability) -> str:
        return f"""
## Remediation for {vuln.cve_id}

### Immediate Mitigations
1. Apply vendor patch if available
2. Implement WAF rules to block exploitation
3. Restrict network access to affected systems

### Long-term Remediation
1. Update {vuln.affected_product} to latest version
2. Review and harden configuration
3. Implement defense-in-depth controls

### Detection
- Monitor for exploitation attempts in logs
- Enable IDS/IPS signatures for {vuln.cve_id}
- Set up alerts for suspicious activity
"""

# Get advice
advisor = RemediationAdvisor()
advice = advisor.get_remediation(SAMPLE_VULNS[0])
print(advice)
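
The prompt above passes only CVSS, exploit, and patch details to the model. A variant that also supplies the threat-intelligence and asset fields from the data model might look like the sketch below. The helper names `build_enriched_prompt` and `_label` are hypothetical, and `SimpleNamespace` stands in for a `Vulnerability` instance so the example runs on its own. The example values are copied from the Log4Shell entry in `SAMPLE_VULNS`.

```python
from types import SimpleNamespace


def _label(value) -> str:
    """Return an Enum member's value, or the plain string form of anything else."""
    return str(getattr(value, "value", value))


def build_enriched_prompt(vuln) -> str:
    """Build a remediation prompt that includes exploitation and asset context."""
    kev = f"yes (due {vuln.kev_due_date})" if vuln.in_cisa_kev else "no"
    controls = ", ".join(vuln.compensating_controls) or "none documented"
    lines = [
        "Provide remediation guidance for this vulnerability:",
        "",
        f"CVE: {vuln.cve_id}",
        f"CVSS: {vuln.cvss_score}",
        f"EPSS: {vuln.epss_score:.3f}",
        f"CISA KEV: {kev}",
        f"Asset: {vuln.affected_asset} (criticality: {_label(vuln.asset_criticality)})",
        f"Exposure: {_label(vuln.exposure_level)}",
        f"Compensating controls: {controls}",
        "",
        "Take the existing controls into account and order the steps by urgency.",
    ]
    return "\n".join(lines)


example = SimpleNamespace(
    cve_id="CVE-2021-44228", cvss_score=10.0, epss_score=0.976,
    in_cisa_kev=True, kev_due_date="2021-12-24",
    affected_asset="app-server-03", asset_criticality="high",
    exposure_level="internal",
    compensating_controls=["WAF rules", "Outbound LDAP blocked"],
)
print(build_enriched_prompt(example))
```

Because `_label` accepts both enum members and plain strings, the same function should work with the notebook's `Vulnerability` objects without changes.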

## 4. Scan Report Generator

In [None]:
def generate_scan_report(vulns: List[Vulnerability]) -> str:
    """Generate vulnerability scan report."""
    prioritizer = VulnPrioritizer()
    prioritized = prioritizer.prioritize(vulns)
    
    # Statistics
    stats = {
        "total": len(vulns),
        "critical": sum(1 for v in vulns if v.severity == Severity.CRITICAL),
        "high": sum(1 for v in vulns if v.severity == Severity.HIGH),
        "exploitable": sum(1 for v in vulns if v.exploit_available)
    }
    
    report = f"""
# Vulnerability Scan Report

## Executive Summary
- Total Vulnerabilities: {stats['total']}
- Critical: {stats['critical']}
- High: {stats['high']}
- Exploitable: {stats['exploitable']}

## Prioritized Findings
"""
    
    for item in prioritized:
        v = item["vulnerability"]
        report += f"""
### [{item['priority']}] {v.cve_id} - {v.title}
- **CVSS**: {v.cvss_score} ({v.severity.name})
- **Risk Score**: {item['risk_score']:.1f}
- **Product**: {v.affected_product} {v.affected_version}
- **Exploit Available**: {'Yes' if v.exploit_available else 'No'}
- **Patch Available**: {'Yes' if v.patch_available else 'No'}

{v.description}
"""
    
    return report

# Generate report
report = generate_scan_report(SAMPLE_VULNS)
print(report)
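
The report does not yet surface `kev_due_date`. One option is a line per finding that shows how far past the KEV remediation deadline it is. The sketch below assumes ISO-formatted dates and an empty string for findings without a due date, as in `SAMPLE_VULNS`. The function names are hypothetical, and the reference date is passed in explicitly so the output is reproducible.

```python
from datetime import date
from typing import Optional


def kev_days_overdue(kev_due_date: str, today: date) -> Optional[int]:
    """Days past the KEV due date (negative if still ahead), or None if unset."""
    if not kev_due_date:
        return None
    return (today - date.fromisoformat(kev_due_date)).days


def kev_status_line(kev_due_date: str, today: date) -> str:
    """Format a short status string suitable for a report bullet."""
    days = kev_days_overdue(kev_due_date, today)
    if days is None:
        return "KEV: no due date"
    if days > 0:
        return f"KEV: overdue by {days} day(s)"
    return f"KEV: due in {-days} day(s)"


print(kev_status_line("2024-01-22", date(2024, 2, 1)))  # KEV: overdue by 10 day(s)
print(kev_status_line("", date(2024, 2, 1)))            # KEV: no due date
```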

## Summary

We built an AI-enhanced vulnerability scanner:

1. **Data Model** - Structured vulnerability representation
2. **Prioritization** - Risk-based scoring algorithm
3. **AI Advisor** - LLM-powered remediation guidance
4. **Reporting** - Automated scan reports

### Next Steps:
1. Integrate with CVE databases (NVD API)
2. Feed EPSS, KEV status, and asset context from the data model into the risk score
3. Create automated remediation workflows
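
As a starting point for the first step, the sketch below pulls live EPSS scores instead of hard-coding them; an NVD lookup would likely follow the same fetch-then-parse pattern. It assumes FIRST's public EPSS endpoint at `https://api.first.org/data/v1/epss` and a response whose `data` list holds `cve`, `epss`, and `percentile` as strings, which is my understanding of the API at the time of writing. The function names are hypothetical, and the payload in the example contains made-up numbers, not real EPSS data.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

# Assumed endpoint; confirm against FIRST's current API documentation
EPSS_URL = "https://api.first.org/data/v1/epss"


def parse_epss_response(payload: dict) -> Dict[str, Dict[str, float]]:
    """Map each CVE ID to its EPSS score and percentile as floats."""
    scores = {}
    for row in payload.get("data", []):
        scores[row["cve"]] = {
            "epss": float(row["epss"]),
            "percentile": float(row["percentile"]),
        }
    return scores


def fetch_epss(cve_ids: List[str], timeout: float = 10.0) -> Dict[str, Dict[str, float]]:
    """Query the EPSS API for a batch of CVE IDs (requires network access)."""
    query = urllib.parse.urlencode({"cve": ",".join(cve_ids)})
    with urllib.request.urlopen(f"{EPSS_URL}?{query}", timeout=timeout) as resp:
        return parse_epss_response(json.load(resp))


# Offline demonstration with an illustrative payload (values are invented)
sample_payload = {
    "data": [
        {"cve": "CVE-0000-0000", "epss": "0.5", "percentile": "0.9"},
    ]
}
print(parse_epss_response(sample_payload))
```

Keeping the parser separate from the network call makes it possible to test the mapping offline and to cache raw responses between runs.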