# F4. AI-Driven Mitigation Recommendations

1. Explain My Top Risk
    * Input: asset name
    * Output: “These CVEs pose the highest risk because…”, plus mitigation hints.
2. Mitigation Roadmap
    * Input: list of high-risk CVEs or asset
    * Output: prioritized step-by-step action plan (patch, config hardening, monitoring).
3. Risk Trend Insights
    * Input: date range
    * Output: narrative on emerging risk patterns (e.g., “Web servers saw a 40% spike in Critical CVEs in Q1 2025…”)
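
The cells in this notebook cover features 1 and 3; the Mitigation Roadmap (feature 2) could follow the same prompt-building pattern. The following is a minimal sketch only, not part of the original notebook. It assumes records shaped like the per-CVE dictionaries used elsewhere here, and that the `Title` column identifies the asset. The function name `build_roadmap_prompt`, the prompt wording, and the sample records are all hypothetical.

```python
import json

def build_roadmap_prompt(records, asset=None):
    """Build a user prompt asking for a prioritized mitigation roadmap.

    Assumption: each record is a dict with at least `Title`, `cveID`
    and `riskScore` keys, as in the columns used in this notebook.
    """
    # Optionally restrict to one asset (assumed to be matched on `Title`).
    if asset is not None:
        records = [r for r in records if r.get("Title") == asset]
    # Highest risk first, so the most urgent CVEs appear at the top.
    ordered = sorted(records, key=lambda r: r["riskScore"], reverse=True)
    return (
        "Below are high-risk vulnerabilities (CVEs) affecting my assets, "
        "sorted by riskScore in descending order.\n"
        "Produce a prioritized, step-by-step action plan that covers "
        "patching, configuration hardening, and monitoring.\n"
        "Reference the relevant cveID in each step.\n"
        f"{json.dumps(ordered, indent=2)}"
    )

# Hypothetical sample records, for illustration only
sample = [
    {"Title": "Asset A", "cveID": "CVE-0000-0001", "riskScore": 7.1},
    {"Title": "Asset B", "cveID": "CVE-0000-0002", "riskScore": 9.4},
]
roadmap_prompt = build_roadmap_prompt(sample)
```

The returned string would be sent as the user message through the same `client.chat.completions.create` call used for the other two features.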

In [65]:
from openai import OpenAI
import os
import pandas as pd
from IPython.display import Markdown, display
import json

# Config
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("API key not set.")
client = OpenAI(api_key=api_key)
asset_scores = pd.read_csv('../data/asset_risk_summary.csv')
vuln_scores = pd.read_csv('../data/cve_vuln_summary.csv')
vul_catalogue = pd.read_csv('../data/vuln_catalogue_v2.csv')

# Tidy up
# Join asset-level and CVE-level scores, then attach the catalogue details
scores = pd.merge(asset_scores, vuln_scores, how='inner', on=['cpeName', 'Title'])
df0 = pd.merge(vul_catalogue, scores, how='inner', on=['Title', 'cpeName', 'cveID'])
df0 = df0.drop(columns=['Unnamed: 0', 'vectorString', 'WrittenAt'])
df = df0.sort_values(by='riskScore', ascending=False)
top10 = df.head(10)  # ten rows, matching the variable name

# Top risks prep
cols = (['Title',
        'cpeName',
        'cveID',
        'published',
        'baseScore',
        'exploitabilityScore',
        'impactScore',
        'riskScore',
        'MaxRiskScore',
        'countHighRiskCVEs (>7.0)',
        'baseSeverity',
        'attackVector',
        'attackComplexity',
        'confidentialityImpact',
        'availabilityImpact',
        'description'])
records = top10[cols].to_dict(orient='records')

In [68]:
def explain_top_risks(records, client):
    # Serialize the selected CVE records for inclusion in the prompt
    risks_json = json.dumps(records, indent=2)

    # Construct the system and user prompts
    system_prompt = (
        "You are a cybersecurity expert specializing in vulnerability risk analysis "
        "and mitigation planning with a focus on NIST's Cybersecurity Framework 2.0."
    )

    user_prompt = (
        f"Below is a list of the {len(records)} highest-risk vulnerabilities (CVEs) affecting my assets. "
        "For each item, do the following:\n"
        "1. Briefly explain why this CVE poses a high risk based on its details.\n"
        "2. Suggest concise, actionable mitigation steps or best practices.\n"
        "Present the output as a numbered list. Here are the vulnerabilities:\n"
        f"{risks_json}"
    )

    # Send to OpenAI
    response = client.chat.completions.create(
        model="gpt-4o",  # or "gpt-4", "gpt-3.5-turbo" as available
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        max_tokens=1000,
        temperature=0.2
    )
    # Return the generated text so the caller can save and display it
    return response.choices[0].message.content

top_risks = explain_top_risks(records, client)

with open("../markdown/top_risks.md", "w", encoding="utf-8") as f:
    f.write(top_risks)

with open("../markdown/top_risks.md", "r", encoding="utf-8") as f:
    md_content = f.read()

display(Markdown(md_content))

Here's a detailed analysis and mitigation plan for each of the listed vulnerabilities:

1. **CVE-2020-1953 - Oracle Database Server 19c**
   - **Risk Explanation:** This vulnerability involves Apache Commons Configuration, which can execute arbitrary code if a YAML file from an untrusted source is parsed. The critical risk stems from the potential for remote code execution with low attack complexity and high confidentiality and availability impacts.
   - **Mitigation Steps:**
     - Update Apache Commons Configuration to a version where this issue is resolved.
     - Ensure YAML files are sourced from trusted locations only.
     - Implement strict input validation and sanitization for all external data sources.

2. **CVE-2024-21410 - Microsoft Exchange Server 2019 Cumulative Update 14**
   - **Risk Explanation:** This is an elevation of privilege vulnerability in Microsoft Exchange Server, allowing attackers to gain unauthorized access and potentially execute arbitrary commands. The n

In [78]:
def get_trend_summary(df, freq='Q'):
    # Aggregates counts of CVEs by asset and severity, grouped by quarter or year.
    if freq not in ('Q', 'Y'):
        raise ValueError("freq must be 'Q' for quarter or 'Y' for year.")
    # Work on a copy so the caller's DataFrame is not modified
    df = df.copy()
    df['published'] = pd.to_datetime(df['published'])
    df['period'] = df['published'].dt.to_period(freq)
    # Group by period, asset, and severity; one column per severity level
    trend = df.groupby(['period', 'cpeName', 'baseSeverity']).size().unstack(fill_value=0)
    return trend

# By quarter:
quarterly_trend = get_trend_summary(df, freq='Q')

# Calculate % Change (Quarter-over-Quarter)
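def calculate_qoq_change(trend_df):
    # Ensure DataFrame is sorted by period
    trend_df = trend_df.sort_index(level=0)
    # Percentage change relative to the same asset's previous recorded period
    # (not necessarily the immediately preceding quarter).
    # A zero baseline yields inf (or NaN for 0 -> 0); both are reported as 0,
    # so a jump from 0 to N CVEs appears as 0% rather than as a spike.
    pct_change = (
        trend_df.groupby('cpeName').pct_change()
        .replace([float('inf'), -float('inf')], 0)
        .fillna(0) * 100
    )
    return pct_change.round(1)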

quarterly_change = calculate_qoq_change(quarterly_trend)

| period | cpeName | CRITICAL | HIGH | LOW | MEDIUM |
|--------|---------|----------|------|-----|--------|
| 2016Q2 | `cpe:2.3:a:oracle:database_server:19c:*:*:*:*:*:*:*` | 0.0 | 0.0 | 0.0 | 0.0 |
| 2018Q2 | `cpe:2.3:a:oracle:database_server:19c:*:*:*:*:*:*:*` | 0.0 | -100.0 | 0.0 | 0.0 |
| 2018Q3 | `cpe:2.3:a:oracle:database:19c:*:*:*:enterprise:*:*:*` | 0.0 | 0.0 | 0.0 | 0.0 |
| 2018Q4 | `cpe:2.3:a:microsoft:exchange_server:2019:-:*:*:*:*:*:*` | 0.0 | 0.0 | 0.0 | 0.0 |
| 2018Q4 | `cpe:2.3:a:oracle:database_server:19c:*:*:*:*:*:*:*` | 0.0 | 0.0 | 0.0 | 0.0 |
| ... | ... | ... | ... | ... | ... |
| 2024Q2 | `cpe:2.3:a:adobe:acrobat_reader:20.004.30006:*:*:*:classic:*:*:*` | 0.0 | 128.6 | 0.0 | -42.9 |
| 2024Q3 | `cpe:2.3:a:adobe:acrobat_reader:20.004.30006:*:*:*:classic:*:*:*` | 0.0 | -31.2 | 0.0 | 25.0 |
| 2024Q4 | `cpe:2.3:a:adobe:acrobat_reader:20.004.30006:*:*:*:classic:*:*:*` | 0.0 | -45.5 | 0.0 | 100.0 |
| 2024Q4 | `cpe:2.3:a:microsoft:exchange_server:2019:-:*:*:*:*:*:*` | 0.0 | 0.0 | 0.0 | 0.0 |
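
To make the percentages above easier to read, here is a minimal pure-Python sketch of the arithmetic that `calculate_qoq_change` applies to each asset and severity column. The helper name `qoq_change` and the counts are hypothetical and for illustration only. The zero-baseline branch mirrors the notebook's choice of reporting infinite and undefined changes as 0.

```python
def qoq_change(counts):
    """Return the period-over-period % change for a list of counts."""
    changes = [0.0]  # the first period has no predecessor
    for prev, curr in zip(counts, counts[1:]):
        if prev == 0:
            # Division by zero: pandas yields inf (or NaN for 0 -> 0);
            # the notebook maps both to 0, so a jump from 0 shows as 0%.
            changes.append(0.0)
        else:
            changes.append(round((curr - prev) / prev * 100, 1))
    return changes

# Hypothetical counts of HIGH CVEs for one asset over six periods
high_counts = [7, 16, 11, 6, 0, 3]
print(qoq_change(high_counts))
# [0.0, 128.6, -31.2, -45.5, -100.0, 0.0]
```

Note the last value: the move from 0 to 3 CVEs is reported as 0.0, so a first appearance of a severity level is invisible in the percentage table and has to be read from the raw counts instead.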


In [85]:
def generate_trend_narrative_prompt(trend_df, pct_change_df, start_period, end_period):
    # Filter the dataframes for the date range
    filtered_trend = trend_df.loc[start_period:end_period].reset_index()
    filtered_pct_change = pct_change_df.loc[start_period:end_period].reset_index()

    # Convert Period columns to strings so they're JSON serializable
    for frame in (filtered_trend, filtered_pct_change):
        if 'period' in frame.columns:
            frame['period'] = frame['period'].astype(str)

    trend_data = filtered_trend.to_dict(orient='records')
    change_data = filtered_pct_change.to_dict(orient='records')

    trend_json = json.dumps(trend_data, indent=2)
    change_json = json.dumps(change_data, indent=2)
    prompt = (
        "You are a cybersecurity risk analyst. Below are two tables:\n\n"
        "1. Raw counts of vulnerabilities (CVEs) by asset and severity for each quarter.\n"
        "2. The corresponding quarter-over-quarter percentage change for each asset and severity.\n\n"
        "Analyze the data, highlighting emerging risk patterns. Narrate where there were significant spikes or drops, and call out assets with the most notable changes. Use percentages and time periods (e.g., 'Web servers saw a 40% spike in Critical CVEs in Q1 2025').\n"
        "Finish by suggesting which assets or severities require immediate attention due to recent trends.\n\n"
        f"Raw trend data:\n{trend_json}\n\n"
        f"Quarterly percent change data:\n{change_json}\n"
    )
    return prompt



# Example: full range
prompt = generate_trend_narrative_prompt(quarterly_trend, quarterly_change, 
                                         quarterly_trend.index.get_level_values(0).min(), 
                                         quarterly_trend.index.get_level_values(0).max())

system_prompt = "You are an expert at analyzing vulnerability data for security reporting with a particular interest in NIST's Cybersecurity Framework 2.0."
user_prompt = prompt

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ],
    max_tokens=1000,
    temperature=0.3
)

risk_trend_narrative = response.choices[0].message.content

with open("../markdown/risk_trend_narrative.md", "w", encoding="utf-8") as f:
    f.write(risk_trend_narrative)

with open("../markdown/risk_trend_narrative.md", "r", encoding="utf-8") as f:
    md_content = f.read()

display(Markdown(md_content))

### Analysis of Vulnerability Trends

#### Emerging Risk Patterns

1. **Adobe Acrobat Reader:**
   - **High Severity CVEs:** There was a notable increase in high-severity vulnerabilities for Adobe Acrobat Reader. From 2021Q1 to 2022Q2, high-severity CVEs spiked significantly, with a 466.7% increase in 2022Q1 and a further 111.8% increase in 2022Q2. However, there was a decrease in subsequent quarters, with a 52.8% drop in 2022Q3 and an 88.2% drop in 2022Q4. Despite these fluctuations, the numbers remained relatively high, indicating persistent risk.
   - **Medium Severity CVEs:** Medium-severity vulnerabilities also showed significant volatility, with a 300% increase in 2022Q2, followed by a 54.2% decrease in 2022Q3. The trend continued with a 63.6% decrease in 2022Q4.

2. **Oracle Database Server:**
   - **Critical and High Severity CVEs:** There was a critical vulnerability recorded in 2019Q1, which was not seen in subsequent quarters. High-severity CVEs showed fluctuations, with a notable spike in 2020Q1 (300% increase) but a sharp decline in 2020Q2 (-66.7%).
   - **Medium Severity CVEs:** There was a significant increase in medium-severity vulnerabilities in 2019Q4 (600% increase), followed by a sharp decline in 2020Q1 (-57.1%) and further decreases in subsequent quarters.

3. **Microsoft Exchange Server:**
   - **Critical Severity CVEs:** A critical vulnerability was recorded in 2019Q1 and again in 2023Q3, indicating a resurgence of critical issues.
   - **Medium Severity CVEs:** There was a significant increase in medium-severity vulnerabilities in 2019Q2 (100% increase), followed by fluctuations in subsequent quarters.

4. **Fortinet Fortigate 7000:**
   - A critical vulnerability was recorded in 2023Q2, which is a new entry and indicates an emerging risk for this asset.

#### Significant Spikes or Drops

- **Adobe Acrobat Reader:** The most significant spikes were observed in high-severity CVEs in 2022Q1 and 2022Q2, with increases of 466.7% and 111.8%, respectively. This trend suggests a growing risk in this asset's security posture.
- **Oracle Database Server:** The 600% increase in medium-severity CVEs in 2019Q4 was a notable spike, indicating a temporary surge in vulnerabilities.
- **Microsoft Exchange Server:** The reappearance of critical vulnerabilities in 2023Q3 after a period of absence suggests a potential resurgence of risk.

#### Assets Requiring Immediate Attention

1. **Adobe Acrobat Reader:** Given the significant and persistent increase in high-severity vulnerabilities, this asset requires immediate attention to mitigate potential risks.
2. **Microsoft Exchange Server:** The re-emergence of critical vulnerabilities in 2023Q3 indicates a need for focused security measures.
3. **Fortinet Fortigate 7000:** The appearance of a critical vulnerability in 2023Q2 suggests this asset should be closely monitored for emerging threats.

In conclusion, Adobe Acrobat Reader and Microsoft Exchange Server are the most concerning assets due to their recent trends in high and critical vulnerabilities. Immediate attention and remediation efforts should be prioritized for these assets to mitigate potential security risks.
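
Risk Trend Insights takes a date range as input, whereas the example above passes the full range of periods. One way to narrow the data is to filter the already-serialized records before they are embedded in the prompt. This is a minimal sketch, assuming `period` strings in the `YYYYQn` form (for example `2024Q3`) that result from converting quarterly periods to strings. `parse_quarter`, `filter_by_quarter`, and the sample records are hypothetical.

```python
def parse_quarter(label):
    """Convert a label such as '2024Q3' into a sortable (year, quarter) tuple."""
    year, quarter = label.split("Q")
    return int(year), int(quarter)

def filter_by_quarter(records, start, end):
    """Keep records whose 'period' lies in the inclusive range [start, end]."""
    first, last = parse_quarter(start), parse_quarter(end)
    return [r for r in records if first <= parse_quarter(r["period"]) <= last]

# Hypothetical records, for illustration only
sample = [
    {"period": "2023Q4", "HIGH": 2},
    {"period": "2024Q1", "HIGH": 5},
    {"period": "2024Q3", "HIGH": 1},
]
selected = filter_by_quarter(sample, "2024Q1", "2024Q2")
```

Comparing `(year, quarter)` tuples avoids relying on string ordering, and a smaller range also keeps the prompt shorter.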