# F4. AI-Driven Mitigation Recommendations



## Intended Purpose of Code

1. Explain My Top Risks/Mitigations
    * Input: asset name
    * Output: “These CVEs pose the highest risk because…”, plus mitigation hints.
2. Risk Trend Insights
    * Input: date range
    * Output: narrative on emerging risk patterns (e.g., “Web servers saw a 40% spike in Critical CVEs in Q1 2025…”)
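
The first feature takes an asset name as input. The sketch below shows one way the ranked table could be narrowed to a single asset before a prompt is built. It is only an illustration: `top_risks_for_asset` is a hypothetical helper, and the assumption that the asset name lives in `cpeName` and the ranking key in `riskScore` is taken from the column names used later in this notebook.

```python
import pandas as pd

def top_risks_for_asset(df, asset_name, n=10, asset_col="cpeName"):
    """Return the n highest-risk rows whose asset column contains asset_name.

    Hypothetical helper: the column names are assumptions about the data.
    """
    # Case-insensitive literal substring match; regex=False avoids treating
    # characters such as '.' or '(' in product names as regex syntax.
    mask = df[asset_col].str.contains(asset_name, case=False, regex=False, na=False)
    return df[mask].sort_values(by="riskScore", ascending=False).head(n)

# Toy data standing in for the merged risk table (CVE IDs are placeholders)
demo = pd.DataFrame({
    "cpeName": ["Adobe Acrobat Reader", "Alfresco", "Adobe Acrobat Reader"],
    "cveID": ["CVE-A", "CVE-B", "CVE-C"],
    "riskScore": [7.1, 9.0, 8.4],
})
adobe_top = top_risks_for_asset(demo, "acrobat", n=2)
print(adobe_top)
```

The filtered frame can then be converted with `to_dict(orient='records')` in the same way as the full top-risk table.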

In [5]:
from openai import OpenAI
import os
import pandas as pd
from IPython.display import Markdown, display
import json

# Config
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("API key not set.")
client = OpenAI(api_key=api_key)
asset_scores = pd.read_csv('../data/asset_risk_summary.csv')
vuln_scores = pd.read_csv('../data/cve_vuln_summary.csv')
vul_catalogue = pd.read_csv('../data/vuln_catalogue_v2.csv')

# Tidy up: join the per-asset and per-CVE scores, then attach catalogue details
scores = pd.merge(asset_scores, vuln_scores, how='inner', on=['cpeName', 'Title'])
df0 = pd.merge(vul_catalogue, scores, how='inner', on=['Title', 'cpeName', 'cveID'])
df0 = df0.drop(columns=['Unnamed: 0', 'vectorString', 'WrittenAt'])
df = df0.sort_values(by='riskScore', ascending=False)
top10 = df.head(10)  # ten highest-risk rows, matching the variable name

# Explain My Top Risks & Mitigation Tips Prep
cols = (['Title',
        'cpeName',
        'cveID',
        'published',
        'baseScore',
        'exploitabilityScore',
        'impactScore',
        'riskScore',
        'MaxRiskScore',
        'countHighRiskCVEs (>7.0)',
        'baseSeverity',
        'attackVector',
        'attackComplexity',
        'confidentialityImpact',
        'availabilityImpact',
        'description'])
records = top10[cols].to_dict(orient='records')

# Risk Trends Insight Prep
def get_trend_summary(df, freq='Q'):
    # Aggregates counts of CVEs by asset and severity, grouped by quarter or year.
    if freq not in ('Q', 'Y'):
        raise ValueError("freq must be 'Q' for quarter or 'Y' for year.")
    df = df.copy()  # work on a copy so the caller's DataFrame is not mutated
    df['published'] = pd.to_datetime(df['published'])
    df['period'] = df['published'].dt.to_period(freq)
    # Group by period, asset, and severity; one column per severity level
    trend = df.groupby(['period', 'cpeName', 'baseSeverity']).size().unstack(fill_value=0)
    return trend

# By quarter:
quarterly_trend = get_trend_summary(df, freq='Q')

# Calculate % Change (Quarter-over-Quarter)
def calculate_qoq_change(trend_df):
    # Ensure DataFrame is sorted by period
    trend_df = trend_df.sort_index(level=0)
    # Percentage change per asset; a rise from zero (inf) and undefined values (NaN) are both reported as 0
    pct_change = trend_df.groupby('cpeName').pct_change().replace([float('inf'), -float('inf')], 0).fillna(0) * 100
    pct_change = pct_change.round(1)
    return pct_change

quarterly_change = calculate_qoq_change(quarterly_trend)
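
To make the percentage-change step concrete, here is a small worked example on toy data. The asset name `asset-a` and the counts are invented for illustration; the calls mirror the ones used in `calculate_qoq_change`.

```python
import pandas as pd

# Toy counts for one hypothetical asset: High-severity CVEs per quarter
periods = pd.PeriodIndex(["2022Q1", "2022Q2", "2022Q3", "2022Q4"], freq="Q", name="period")
index = pd.MultiIndex.from_arrays([periods, ["asset-a"] * 4], names=["period", "cpeName"])
toy_trend = pd.DataFrame({"HIGH": [0, 3, 6, 3]}, index=index)

# Raw change: the first quarter has no predecessor (NaN), and 0 -> 3 divides by zero (inf)
raw = toy_trend.groupby("cpeName").pct_change() * 100

# Same clean-up as in the notebook: inf and NaN both become 0
cleaned = raw.replace([float("inf"), -float("inf")], 0).fillna(0).round(1)

print(raw["HIGH"].tolist())
print(cleaned["HIGH"].tolist())  # [0.0, 0.0, 100.0, -50.0]
```

Two things are worth keeping in mind when reading the resulting table. A jump from 0 to 3 is reported as 0.0, so spikes that start from zero are visible only in the raw counts. Also, an asset with no CVEs at all in a quarter has no row for that quarter, so the change is computed against the previous quarter that does have a row, which may not be the calendar-adjacent one.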

In [4]:
def explain_top_risks(records, client):
    # Serialize the top-risk records so they can be embedded in the prompt
    risks_json = json.dumps(records, indent=2)

    # Construct the system and user prompts
    system_prompt = (
        "You are a cybersecurity expert specializing in vulnerability risk analysis "
        "and mitigation planning with a focus on NIST's Cybersecurity Framework 2.0."
    )

    user_prompt = (
        f"Below is a list of the {len(records)} highest-risk vulnerabilities (CVEs) affecting my assets. "
        "For each item, do the following:\n"
        "1. Briefly explain why this CVE poses a high risk based on its details.\n"
        "2. Suggest concise, actionable mitigation steps or best practices.\n"
        "Present the output as a numbered list. Here are the vulnerabilities:\n"
        f"{risks_json}"
    )

    # Send to OpenAI
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        max_tokens=1000,
        temperature=0.2
    )
    # Return the generated text so the caller can save and display it
    return response.choices[0].message.content

top_risks = explain_top_risks(records, client)

with open("../markdown/top_risks.md", "w", encoding="utf-8") as f:
    f.write(top_risks)

with open("../markdown/top_risks.md", "r", encoding="utf-8") as f:
    md_content = f.read()

display(Markdown(md_content))
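
Because the request caps the reply at `max_tokens=1000`, a long numbered list could be cut off before the last CVE. The sketch below shows one way to detect that by inspecting `finish_reason` on the first choice, which the Chat Completions API sets to `"length"` when the token limit is hit. `completion_text` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in that mimics the shape of a real response so the example runs without a network call.

```python
from types import SimpleNamespace

def completion_text(response):
    """Return the first choice's text and whether it was cut off by max_tokens."""
    choice = response.choices[0]
    truncated = choice.finish_reason == "length"
    return choice.message.content, truncated

# Stand-in object mimicking the shape of a chat completion response
fake = SimpleNamespace(choices=[SimpleNamespace(
    finish_reason="length",
    message=SimpleNamespace(content="1. CVE ..."),
)])

text, truncated = completion_text(fake)
if truncated:
    print("Warning: output hit the max_tokens limit; the list may be incomplete.")
```

If truncation shows up in practice, raising `max_tokens` or sending fewer records per request are the obvious options.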

In [3]:
def generate_trend_narrative_prompt(trend_df, pct_change_df, start_period, end_period):
    # Filter the dataframes for the date range
    filtered_trend = trend_df.loc[start_period:end_period].reset_index()
    filtered_pct_change = pct_change_df.loc[start_period:end_period].reset_index()

    # Convert Period columns to strings so they're JSON serializable
    for frame in (filtered_trend, filtered_pct_change):
        if 'period' in frame.columns:
            frame['period'] = frame['period'].astype(str)

    trend_data = filtered_trend.to_dict(orient='records')
    change_data = filtered_pct_change.to_dict(orient='records')

    trend_json = json.dumps(trend_data, indent=2)
    change_json = json.dumps(change_data, indent=2)
    prompt = (
        "You are a cybersecurity risk analyst. Below are two tables:\n\n"
        "1. Raw counts of vulnerabilities (CVEs) by asset and severity for each quarter.\n"
        "2. The corresponding quarter-over-quarter percentage change for each asset and severity.\n\n"
        "Analyze the data, highlighting emerging risk patterns. Narrate where there were significant spikes or drops, and call out assets with the most notable changes. Use percentages and time periods (e.g., 'Web servers saw a 40% spike in Critical CVEs in Q1 2025').\n"
        "Finish by suggesting which assets or severities require immediate attention due to recent trends.\n\n"
        f"Raw trend data:\n{trend_json}\n\n"
        f"Quarterly percent change data:\n{change_json}\n"
    )
    return prompt


prompt = generate_trend_narrative_prompt(quarterly_trend, quarterly_change, 
                                         quarterly_trend.index.get_level_values(0).min(), 
                                         quarterly_trend.index.get_level_values(0).max())

system_prompt = "You are an expert at analyzing vulnerability data for security reporting with a particular interest in NIST's Cybersecurity Framework 2.0."
user_prompt = prompt

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ],
    max_tokens=1000,
    temperature=0.3
)

risk_trend_narrative = response.choices[0].message.content

with open("../markdown/risk_trend_narrative.md", "w", encoding="utf-8") as f:
    f.write(risk_trend_narrative)

with open("../markdown/risk_trend_narrative.md", "r", encoding="utf-8") as f:
    md_content = f.read()

display(Markdown(md_content))

Based on the provided vulnerability data, let's analyze the trends and identify key risk patterns for each asset and severity level over the given periods.

### Emerging Risk Patterns

1. **Adobe Acrobat Reader:**
   - **High Severity:** There was a significant increase in high-severity vulnerabilities from 2021Q1 to 2022Q2. The number of high-severity CVEs spiked from 3 in 2021Q4 to 36 in 2022Q2, marking a 111.8% increase. This trend indicates a growing risk associated with this asset.
   - **Medium Severity:** The medium-severity vulnerabilities also saw fluctuations, with a notable increase in 2022Q2 (24 vulnerabilities) and a subsequent decrease in 2022Q3 (11 vulnerabilities), followed by another increase in 2023Q3 (19 vulnerabilities).

2. **Oracle Database Server 19c:**
   - **Critical Severity:** The critical vulnerabilities remained relatively stable, with occasional spikes such as in 2019Q1 and 2020Q1, each reporting 1 critical vulnerability.
   - **Medium Severity:** There was a significant increase in medium-severity vulnerabilities in 2019Q4, with a 600% increase from the previous quarter. However, this was followed by a decrease in subsequent quarters.

3. **Microsoft Exchange Server 2019:**
   - **Critical Severity:** There were sporadic critical vulnerabilities, with notable occurrences in 2019Q1 and 2023Q3.
   - **Medium Severity:** The medium-severity vulnerabilities showed variability, with a peak in 2019Q2 (2 vulnerabilities) and a decrease in subsequent quarters.

4. **Oracle Database 19c (Enterprise):**
   - **Medium Severity:** There was a notable increase in medium-severity vulnerabilities in 2019Q3, with a 200% increase from the previous quarter. This trend continued with fluctuations over the subsequent periods.

5. **Alfresco:**
   - **Medium Severity:** There was a spike in medium-severity vulnerabilities in 2020Q1, with an increase from 0 to 3 vulnerabilities, indicating a potential emerging risk.

### Significant Spikes or Drops

- **Adobe Acrobat Reader:** The high-severity vulnerabilities saw a dramatic increase in 2022Q2, followed by fluctuations. This asset requires close monitoring due to its volatility in vulnerability counts.
- **Oracle Database Server 19c:** The medium-severity vulnerabilities experienced a significant spike in 2019Q4, which was a key period of increased risk.
- **Microsoft Exchange Server 2019:** The critical vulnerabilities in 2019Q1 and 2023Q3 highlight periods of increased risk, requiring attention.

### Assets and Severities Requiring Immediate Attention

- **Adobe Acrobat Reader:** Given the consistent high numbers and fluctuations in high and medium-severity vulnerabilities, this asset should be prioritized for security reviews and patch management.
- **Oracle Database Server 19c:** The historical spikes in medium-severity vulnerabilities suggest a need for ongoing monitoring and proactive vulnerability management.
- **Microsoft Exchange Server 2019:** The presence of critical vulnerabilities in recent quarters indicates a need for immediate attention to ensure these vulnerabilities are addressed promptly.

In summary, Adobe Acrobat Reader and Oracle Database Server 19c present the most notable changes and require immediate attention due to their vulnerability trends. Regular patching and security assessments should be prioritized for these assets to mitigate potential risks.
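
The narrative above covers every period in the data, because the prompt was built from the minimum and maximum of the period index. The second feature is meant to accept a date range, so the sketch below shows one way to restrict a trend table to a window of quarters before building the prompt. `slice_periods` is a hypothetical helper, and it assumes the table has an index level named `period` holding quarterly periods, as produced by `get_trend_summary` with `freq='Q'`.

```python
import pandas as pd

def slice_periods(trend_df, start, end):
    """Keep rows whose 'period' index level lies in [start, end], inclusive.

    Hypothetical helper; start and end are quarter strings such as '2022Q1'.
    """
    periods = trend_df.index.get_level_values("period")
    start_p, end_p = pd.Period(start, freq="Q"), pd.Period(end, freq="Q")
    mask = (periods >= start_p) & (periods <= end_p)
    return trend_df[mask]

# Toy trend table with invented assets and counts
periods = pd.PeriodIndex(["2022Q1", "2022Q2", "2022Q3", "2023Q1"], freq="Q", name="period")
index = pd.MultiIndex.from_arrays(
    [periods, ["asset-a", "asset-a", "asset-b", "asset-a"]],
    names=["period", "cpeName"],
)
toy = pd.DataFrame({"HIGH": [1, 4, 2, 5]}, index=index)

window = slice_periods(toy, "2022Q2", "2022Q3")
print(window)
```

Applying the same window to both the count table and the percentage-change table keeps the two prompt inputs aligned.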