# F4. AI-Driven Mitigation Recommendations



## Intended Purpose of Code

### Key Features

1. Explain My Top Risks/Mitigations
    * Input: asset name
    * Output: “These CVEs pose the highest risk because…”, plus mitigation hints.
2. Risk Trend Insights
    * Input: date range
    * Output: narrative on emerging risk patterns (e.g., “Web servers saw a 40% spike in Critical CVEs in Q1 2025…”)
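
The code below ranks CVEs across all assets and does not yet take the "asset name" input listed for the first feature. A minimal sketch of how that filter might look, assuming each record is a dict and that `Title` holds the human-readable asset name (an assumption; the notebook does not say which column names the asset). The helper `top_risks_for_asset` and the sample `cveID` values are hypothetical placeholders.

```python
from typing import Any


def top_risks_for_asset(
    records: list[dict[str, Any]], asset_name: str, n: int = 10
) -> list[dict[str, Any]]:
    """Return the n highest-riskScore records whose Title contains asset_name."""
    needle = asset_name.casefold()  # case-insensitive substring match
    matches = [r for r in records if needle in str(r.get("Title", "")).casefold()]
    return sorted(matches, key=lambda r: r["riskScore"], reverse=True)[:n]


# Placeholder records; the IDs and scores are illustrative, not real data
sample = [
    {"Title": "Adobe Acrobat Reader", "cveID": "CVE-EXAMPLE-1", "riskScore": 7.2},
    {"Title": "Oracle Database Server 19c", "cveID": "CVE-EXAMPLE-2", "riskScore": 9.1},
    {"Title": "Adobe Acrobat Reader", "cveID": "CVE-EXAMPLE-3", "riskScore": 8.8},
]

adobe_top = top_risks_for_asset(sample, "adobe", n=2)
print([r["cveID"] for r in adobe_top])
```

The filtered list could then be serialized with `json.dumps` and placed in the prompt in the same way as the unfiltered records.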

In [3]:
from openai import OpenAI
import os
import pandas as pd
from IPython.display import Markdown, display
import json

# Config
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("API key not set.")
client = OpenAI(api_key=api_key)
asset_scores = pd.read_csv('../data/asset_risk_summary.csv')
vuln_scores = pd.read_csv('../data/cve_vuln_summary.csv')
vul_catalogue = pd.read_csv('../data/vuln_catalogue_v2.csv')

# Tidy up: join the asset, CVE-score, and catalogue tables
scores = pd.merge(asset_scores, vuln_scores, how='inner', on=['cpeName', 'Title'])
df0 = pd.merge(vul_catalogue, scores, how='inner', on=['Title', 'cpeName', 'cveID'])
df0 = df0.drop(columns=['Unnamed: 0', 'vectorString', 'WrittenAt'])
df = df0.sort_values(by='riskScore', ascending=False)
top10 = df.head(10)  # ten rows, as the variable name implies

# Explain My Top Risks & Mitigation Tips Prep
cols = (['Title',
        'cpeName',
        'cveID',
        'published',
        'baseScore',
        'exploitabilityScore',
        'impactScore',
        'riskScore',
        'MaxRiskScore',
        'countHighRiskCVEs (>7.0)',
        'baseSeverity',
        'attackVector',
        'attackComplexity',
        'confidentialityImpact',
        'availabilityImpact',
        'description'])
records = top10[cols].to_dict(orient='records')

# Risk Trends Insight Prep
def get_trend_summary(df, freq='Q'):
    """Count CVEs per period, asset, and severity; freq is 'Q' (quarter) or 'Y' (year)."""
    if freq not in ('Q', 'Y'):
        raise ValueError("freq must be 'Q' for quarter or 'Y' for year.")
    df = df.copy()  # avoid mutating the caller's DataFrame
    df['published'] = pd.to_datetime(df['published'])
    df['period'] = df['published'].dt.to_period(freq)
    # Group by period, asset, and severity; severities become columns
    trend = df.groupby(['period', 'cpeName', 'baseSeverity']).size().unstack(fill_value=0)
    return trend

# By quarter:
quarterly_trend = get_trend_summary(df, freq='Q')

# Calculate % Change (Quarter-over-Quarter)
def calculate_qoq_change(trend_df):
    # Ensure DataFrame is sorted by period
    trend_df = trend_df.sort_index(level=0)
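    # Percentage change per asset, relative to that asset's previous recorded period.
    # The first period (NaN) and any change from a zero baseline (inf or NaN) are reported as 0.
    pct_change = trend_df.groupby('cpeName').pct_change().replace([float('inf'), -float('inf')], 0).fillna(0) * 100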
    pct_change = pct_change.round(1)
    return pct_change

quarterly_change = calculate_qoq_change(quarterly_trend)
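
To make the percentage-change rule concrete, here is a plain-Python sketch of the same arithmetic for one asset and one severity. The function `qoq_pct_change` is a hypothetical stand-in for what `calculate_qoq_change` does to a single column. The counts are illustrative, chosen to be consistent with the percentages quoted in the narrative output, not read from the CSV files.

```python
def qoq_pct_change(counts):
    """Percent change between consecutive counts, rounded to one decimal.

    Mirrors the notebook's convention: the first period and any change
    from a zero baseline are reported as 0.0 rather than NaN or inf.
    """
    changes = [0.0]  # the first period has no predecessor
    for prev, curr in zip(counts, counts[1:]):
        if prev == 0:
            changes.append(0.0)  # division by zero is mapped to 0
        else:
            changes.append(round((curr - prev) / prev * 100, 1))
    return changes


# Illustrative High-severity counts for five consecutive periods
high_counts = [0, 3, 17, 36, 17]
print(qoq_pct_change(high_counts))
# → [0.0, 0.0, 466.7, 111.8, -52.8]
```

Note the second entry: a jump from 0 to 3 CVEs shows up as 0%, so a first appearance of a severity is invisible in the percent-change table and is only visible in the raw counts.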

In [5]:
def explain_top_risks(df, client):
    # Serialize the rows passed in; default=str covers dates and other non-JSON types
    risks_json = json.dumps(df.to_dict(orient='records'), indent=2, default=str)

    # Construct the system and user prompts
    system_prompt = (
        "You are a cybersecurity expert specializing in vulnerability risk analysis "
        "and mitigation planning with a focus on NIST's Cybersecurity Framework 2.0."
    )

    user_prompt = (
        f"Below is a list of the {len(df)} highest-risk vulnerabilities (CVEs) affecting my assets. "
        "For each item, do the following:\n"
        "1. Briefly explain why this CVE poses a high risk based on its details.\n"
        "2. Suggest concise, actionable mitigation steps or best practices.\n"
        "Generate a report memo with only the subject line in the memo header. "
        "Present the output as a numbered list. Here are the vulnerabilities:\n"
        f"{risks_json}"
    )

    # Send to OpenAI
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        max_tokens=1000,
        temperature=0.2
    )
    # Return the memo text so the caller does not depend on a global `response`
    return response.choices[0].message.content

top_risks = explain_top_risks(top10[cols], client)

with open("../markdown/top_risks.md", "w", encoding="utf-8") as f:
    f.write(top_risks)

with open("../markdown/top_risks.md", "r", encoding="utf-8") as f:
    md_content = f.read()

display(Markdown(md_content))


In [4]:
def generate_trend_narrative_prompt(trend_df, pct_change_df, start_period, end_period):
    # Filter the dataframes for the date range
    filtered_trend = trend_df.loc[start_period:end_period].reset_index()
    filtered_pct_change = pct_change_df.loc[start_period:end_period].reset_index()

    # Convert Period columns to strings so they're JSON serializable
    for df in (filtered_trend, filtered_pct_change):
        if 'period' in df.columns:
            df['period'] = df['period'].astype(str)

    trend_data = filtered_trend.to_dict(orient='records')
    change_data = filtered_pct_change.to_dict(orient='records')

    trend_json = json.dumps(trend_data, indent=2)
    change_json = json.dumps(change_data, indent=2)
    prompt = (
        "You are a cybersecurity risk analyst. Below are two tables:\n\n"
        "1. Raw counts of vulnerabilities (CVEs) by asset and severity for each quarter.\n"
        "2. The corresponding quarter-over-quarter percentage change for each asset and severity.\n\n"
        "Analyze the data, highlighting emerging risk patterns. Narrate where there were significant spikes or drops, and call out assets with the most notable changes. Use percentages and time periods (e.g., 'Web servers saw a 40% spike in Critical CVEs in Q1 2025').\n"
        "Finish by suggesting which assets or severities require immediate attention due to recent trends.\n\n"
        f"Raw trend data:\n{trend_json}\n\n"
        f"Quarterly percent change data:\n{change_json}\n"
    )
    return prompt


prompt = generate_trend_narrative_prompt(quarterly_trend, quarterly_change, 
                                         quarterly_trend.index.get_level_values(0).min(), 
                                         quarterly_trend.index.get_level_values(0).max())

system_prompt = "You are an expert at analyzing vulnerability data for security reporting with a particular interest in NIST's Cybersecurity Framework 2.0."
user_prompt = prompt

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ],
    max_tokens=1000,
    temperature=0.3
)

risk_trend_narrative = response.choices[0].message.content

with open("../markdown/risk_trend_narrative.md", "w", encoding="utf-8") as f:
    f.write(risk_trend_narrative)

with open("../markdown/risk_trend_narrative.md", "r", encoding="utf-8") as f:
    md_content = f.read()

display(Markdown(md_content))

### Analysis of Vulnerability Data

#### Emerging Risk Patterns

1. **Adobe Acrobat Reader:**
   - **High Severity:** Adobe Acrobat Reader has shown significant fluctuations in high severity vulnerabilities. Notably, there was a 466.7% increase in Q1 2022, followed by a further 111.8% increase in Q2 2022. Although there was a drop of 52.8% in Q3 2022, the numbers remained high, with a peak of 36 high severity vulnerabilities in Q2 2022. The trend continued with fluctuations, showing a 550% increase in Q1 2023, and a 128.6% rise in Q2 2024. However, there was a notable decrease of 45.5% in Q4 2024.
   - **Medium Severity:** There was a substantial 300% increase in medium severity vulnerabilities in Q2 2022, followed by a drop of 54.2% in Q3 2022. The numbers fluctuated but remained relatively high, with a significant 1800% increase in Q3 2023.

2. **Oracle Database Server 19c:**
   - **Critical and High Severity:** The Oracle Database Server 19c experienced a critical vulnerability in Q1 2019, which was resolved by Q2 2019. High severity vulnerabilities have been sporadic, with a notable spike in Q1 2020 (3 vulnerabilities) and a subsequent drop in Q2 2020 by 66.7%.
   - **Medium Severity:** There was a significant increase in medium severity vulnerabilities in Q4 2019, with a 600% rise, followed by a decrease of 57.1% in Q1 2020. The trend continued with fluctuations, showing a 100% increase in Q2 2021.

3. **Microsoft Exchange Server 2019:**
   - **Critical Severity:** Microsoft Exchange Server 2019 saw critical vulnerabilities in Q1 2019 and Q1 2021, with a recurrence in Q3 2023.
   - **Medium Severity:** There was a 100% increase in medium severity vulnerabilities in Q2 2019, followed by a 50% decrease in Q3 2019. The numbers remained low in subsequent periods.

4. **Oracle Database 19c (Enterprise):**
   - **Medium Severity:** There was a notable 200% increase in medium severity vulnerabilities in Q3 2019 and a 350% increase in Q3 2022. However, there was a significant drop of 100% in Q4 2022.

#### Significant Spikes and Drops

- **Adobe Acrobat Reader:** The asset experienced multiple spikes in high severity vulnerabilities, particularly in Q1 2022 (466.7%) and Q1 2023 (550%). Medium severity vulnerabilities also saw a significant increase in Q3 2023 (1800%).
- **Oracle Database Server 19c:** A notable spike in medium severity vulnerabilities occurred in Q4 2019 (600%), followed by a drop in Q1 2020 (-57.1%).
- **Microsoft Exchange Server 2019:** Critical vulnerabilities reappeared in Q3 2023, indicating a potential resurgence of risk.

#### Assets and Severities Requiring Immediate Attention

1. **Adobe Acrobat Reader:** Given the frequent and significant spikes in both high and medium severity vulnerabilities, this asset requires immediate attention to mitigate potential risks.

2. **Oracle Database Server 19c:** The fluctuations in medium severity vulnerabilities, along with sporadic high severity issues, suggest a need for ongoing monitoring and patching.

3. **Microsoft Exchange Server 2019:** The recurrence of critical vulnerabilities in recent quarters highlights the need for focused security measures and monitoring.

Overall, the data suggests that Adobe Acrobat Reader and Oracle Database Server 19c are the assets with the most volatile vulnerability trends, necessitating proactive risk management strategies.
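
Both API calls above cap the reply at `max_tokens=1000`, so a long memo or narrative could be cut off partway through. A minimal sketch of a guard follows, assuming the Chat Completions response shape the notebook already uses (`response.choices[0]`), where `finish_reason` is `"length"` when the token cap is reached. `extract_text` is a hypothetical helper, and the stand-in object only mimics the response shape for illustration.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return (text, truncated) for a Chat Completions-style response.

    truncated is True when the model stopped because it hit max_tokens.
    """
    choice = response.choices[0]
    truncated = choice.finish_reason == "length"
    return choice.message.content, truncated


# Stand-in for a real API response, used here only to show the shape
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content="1. Adobe Acrobat Reader: ..."),
        )
    ]
)

text, truncated = extract_text(fake_response)
if truncated:
    print("Warning: reply hit max_tokens; raise the limit or send fewer records.")
```

In the notebook, the same check could run before each markdown file is written, so that a truncated report is flagged rather than saved silently.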