# F4. AI-Driven Mitigation Recommendations

## Intended Purpose of Code

### Key Features

1. Explain My Top Risks & Mitigation Tips
    * Prompt OpenAI to summarize in a numbered list
        * the user's top risks
        * mitigation recommendations to harden systems
2. Risk Trend Insights
    * Calculate quarter-over-quarter change trends
        * aggregates counts of CVEs by asset and severity
            * grouped by quarter or year
        * groups trends by period, asset, and severity
    * Prompt OpenAI to
        * analyze the trend data
        * highlight emerging risk patterns
        * narrate where there were significant spikes or drops
        * call out assets with the most notable changes
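
The aggregation and quarter-over-quarter steps listed above can be illustrated on a few rows of made-up data. This is a minimal sketch, not the notebook's own cell: the `toy` frame and its values are hypothetical, and only the column names (`cpeName`, `baseSeverity`, `published`) are taken from the notebook.

```python
import pandas as pd

# Hypothetical toy data: one row per CVE, using the notebook's column names.
toy = pd.DataFrame({
    "cpeName": ["db", "db", "db", "web", "web", "web"],
    "baseSeverity": ["HIGH", "HIGH", "HIGH", "HIGH", "MEDIUM", "HIGH"],
    "published": ["2024-01-15", "2024-04-02", "2024-05-20",
                  "2024-02-01", "2024-02-10", "2024-06-30"],
})
toy["period"] = pd.to_datetime(toy["published"]).dt.to_period("Q")

# Count CVEs per (period, asset), with one column per severity.
counts = (
    toy.groupby(["period", "cpeName", "baseSeverity"])
       .size()
       .unstack(fill_value=0)
)

# Percent change from the previous quarter, computed within each asset.
qoq = counts.groupby("cpeName").pct_change() * 100

db_q2 = (pd.Period("2024Q2"), "db")
web_q2 = (pd.Period("2024Q2"), "web")
print(counts.loc[db_q2, "HIGH"])   # 2 HIGH CVEs for "db" in Q2 (1 in Q1)
print(qoq.loc[db_q2, "HIGH"])      # 100.0, i.e. the count doubled
print(qoq.loc[web_q2, "MEDIUM"])   # -100.0, i.e. 1 MEDIUM CVE in Q1, none in Q2
```

The first quarter of each asset has no predecessor, so its change is `NaN` until it is filled or dropped.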

In [1]:
from openai import OpenAI
import os
import pandas as pd
from IPython.display import Markdown, display
import json

# Config
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("API key not set.")
client = OpenAI(api_key=api_key)
asset_scores = pd.read_csv('../data/asset_risk_summary.csv')
vuln_scores = pd.read_csv('../data/cve_vuln_summary.csv')
vul_catalogue = pd.read_csv('../data/vuln_catalogue_v2.csv')

# Tidy up: join asset-level and CVE-level scores, then attach catalogue details
scores = pd.merge(asset_scores, vuln_scores, how='inner', on=['cpeName', 'Title'])
df0 = pd.merge(vul_catalogue, scores, how='inner', on=['Title', 'cpeName', 'cveID'])
df0.drop(columns=['Unnamed: 0', 'vectorString', 'WrittenAt'], inplace=True)
df = df0.sort_values(by='riskScore', ascending=False)
# Despite the name, top10 keeps only the five highest-risk rows
top10 = df.head(5)

# Explain My Top Risks & Mitigation Tips Prep
cols = (['Title',
        'cpeName',
        'cveID',
        'published',
        'baseScore',
        'exploitabilityScore',
        'impactScore',
        'riskScore',
        'MaxRiskScore',
        'countHighRiskCVEs (>7.0)',
        'baseSeverity',
        'attackVector',
        'attackComplexity',
        'confidentialityImpact',
        'availabilityImpact',
        'description'])
records = top10[cols].to_dict(orient='records')

# Risk Trends Insight Prep
def get_trend_summary(df, freq='Q'):
    """Count CVEs by period, asset, and severity; freq is 'Q' (quarter) or 'Y' (year)."""
    if freq not in ('Q', 'Y'):
        raise ValueError("freq must be 'Q' for quarter or 'Y' for year.")
    # Work on a copy so the caller's DataFrame is not modified
    df = df.copy()
    df['published'] = pd.to_datetime(df['published'])
    df['period'] = df['published'].dt.to_period(freq)
    # One row per (period, asset); one column per severity
    trend = df.groupby(['period', 'cpeName', 'baseSeverity']).size().unstack(fill_value=0)
    return trend

# By quarter:
quarterly_trend = get_trend_summary(df, freq='Q')

# Calculate % Change (Quarter-over-Quarter)
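def calculate_qoq_change(trend_df):
    # Ensure DataFrame is sorted by period
    trend_df = trend_df.sort_index(level=0)
    # Percentage change from the previous row of the same asset. Periods in which an
    # asset has no CVEs at all are absent from the index, so "previous" means the last
    # period with data, which is not always the preceding calendar quarter.
    # A rise from 0 yields inf and 0 -> 0 yields NaN; both are reported as 0 here,
    # so a severity that newly appears for an asset shows as 0% rather than as a spike.
    pct_change = trend_df.groupby('cpeName').pct_change().replace([float('inf'), -float('inf')], 0).fillna(0) * 100
    pct_change = pct_change.round(1)
    return pct_change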

quarterly_change = calculate_qoq_change(quarterly_trend)
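
Because `calculate_qoq_change` reports a rise from zero as 0%, a severity that appears for the first time on an asset is invisible in the percent-change table. One way to surface those cases is a separate boolean table. The helper below is a sketch under that assumption: `flag_new_appearances` and the `toy_trend` data are hypothetical and not part of the notebook.

```python
import pandas as pd

def flag_new_appearances(trend_df):
    """Return a boolean frame: True where a count rose from 0 to a positive value."""
    # Previous row of the same asset; the first row of each asset becomes NaN.
    previous = trend_df.groupby("cpeName").shift(1)
    return (previous == 0) & (trend_df > 0)

# Hypothetical counts shaped like the output of get_trend_summary.
idx = pd.MultiIndex.from_tuples(
    [("2024Q1", "db"), ("2024Q2", "db"), ("2024Q3", "db")],
    names=["period", "cpeName"],
)
toy_trend = pd.DataFrame({"CRITICAL": [0, 3, 3], "HIGH": [2, 4, 0]}, index=idx)

flags = flag_new_appearances(toy_trend)
print(flags["CRITICAL"].tolist())  # [False, True, False]: CRITICAL first appears in Q2
```

Passing the flagged rows to the model alongside the percent-change table would let the narrative mention newly appearing severities, at the cost of a longer prompt.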

In [4]:
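def explain_top_risks(df, client):
    # Note: `df` is not used here; the prompt is built from the module-level
    # `records` list prepared in the first cell.
    risks_json = json.dumps(records, indent=2)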

    # Construct the system and user prompts
    system_prompt = (
        "You are a cybersecurity expert specializing in vulnerability risk analysis "
        "and mitigation planning with a focus on NIST's Cybersecurity Framework 2.0."
    )

    user_prompt = (
        "Below is a list of the highest-risk vulnerabilities (CVEs) affecting my assets. "
        "For each item, do the following:\n"
        "1. Briefly explain why this CVE poses a high risk based on its details.\n"
        "2. Suggest concise, actionable mitigation steps or best practices.\n"
        "Generate a report memo with only the subject line in the memo header. "
        "Present the output as a numbered list. Here are the vulnerabilities:\n"
        f"{risks_json}"
    )

    # Send to OpenAI
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        max_tokens=1000,
        temperature=0.2
    )
    return response.choices[0].message.content

# Now, call the function and store the result
top_risks = explain_top_risks(df, client)

with open("../markdown/top_risks.md", "w", encoding="utf-8") as f:
    f.write(top_risks)

with open("../markdown/top_risks.md", "r", encoding="utf-8") as f:
    md_content = f.read()

display(Markdown(md_content))

**Subject: High-Risk Vulnerability Analysis and Mitigation Recommendations**

1. **CVE-2020-1953 - Oracle Database Server 19c**
   - **Risk Explanation:** This vulnerability allows remote code execution due to improper handling of YAML files, which can lead to unauthorized code execution if YAML files are loaded from untrusted sources.
   - **Mitigation Steps:** 
     - Update Apache Commons Configuration to a version that addresses this vulnerability.
     - Ensure YAML files are only loaded from trusted sources.
     - Implement input validation and sanitization for all external data sources.

2. **CVE-2024-21410 - Microsoft Exchange Server 2019 Cumulative Update 14**
   - **Risk Explanation:** This vulnerability allows for elevation of privilege, potentially giving attackers unauthorized access to sensitive data and system controls.
   - **Mitigation Steps:**
     - Apply the latest security patches from Microsoft.
     - Regularly review and update user permissions to ensure least privilege.
     - Monitor Exchange Server logs for unusual activity.

3. **CVE-2019-0586 - Microsoft Exchange Server 2019**
   - **Risk Explanation:** This vulnerability involves remote code execution due to improper memory handling, which can be exploited to execute arbitrary code on the server.
   - **Mitigation Steps:**
     - Install the latest security updates from Microsoft.
     - Implement network segmentation to limit exposure of Exchange servers.
     - Conduct regular security audits and penetration testing.

4. **CVE-2023-27997 - Fortinet FortiGate 7000**
   - **Risk Explanation:** A heap-based buffer overflow vulnerability that allows remote attackers to execute arbitrary code via crafted requests, compromising system integrity.
   - **Mitigation Steps:**
     - Upgrade to the latest version of FortiOS and FortiProxy that addresses this vulnerability.
     - Enable and configure intrusion prevention systems (IPS) to detect and block exploit attempts.
     - Regularly review and update firewall rules and configurations.

5. **CVE-2019-16942 - Oracle Database Server 19c**
   - **Risk Explanation:** This vulnerability can lead to remote code execution through polymorphic typing issues in JSON endpoints, potentially allowing attackers to execute malicious payloads.
   - **Mitigation Steps:**
     - Update FasterXML jackson-databind to a secure version.
     - Disable Default Typing in JSON endpoints unless absolutely necessary.
     - Conduct regular code reviews and security assessments to identify and mitigate similar vulnerabilities.
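
The request above caps the reply at `max_tokens=1000`, so a longer memo would be cut off mid-list without any error. In the Chat Completions API, a choice whose `finish_reason` is `"length"` indicates that the token limit was reached. The helper below is a small sketch of a guard for that case; `reply_text` is a hypothetical name, and the `fake` object is a stand-in used only so the example runs without an API call.

```python
from types import SimpleNamespace

def reply_text(response):
    """Return the first choice's text, raising if the reply hit the token cap."""
    choice = response.choices[0]
    if choice.finish_reason == "length":
        raise RuntimeError(
            "Reply was cut off at max_tokens; raise the limit or send fewer records."
        )
    return choice.message.content

# Stand-in object shaped like a chat completion, for illustration only.
fake = SimpleNamespace(choices=[SimpleNamespace(
    finish_reason="stop",
    message=SimpleNamespace(content="1. Patch Exchange."),
)])
print(reply_text(fake))
```

In the notebook, `return reply_text(response)` could take the place of `return response.choices[0].message.content`.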

In [6]:
def generate_trend_narrative_prompt(trend_df, pct_change_df, start_period, end_period):
    # Filter the dataframes for the date range
    filtered_trend = trend_df.loc[start_period:end_period].reset_index()
    filtered_pct_change = pct_change_df.loc[start_period:end_period].reset_index()

    # Convert Period columns to strings so they're JSON serializable
    for frame in (filtered_trend, filtered_pct_change):
        if 'period' in frame.columns:
            frame['period'] = frame['period'].astype(str)

    trend_data = filtered_trend.to_dict(orient='records')
    change_data = filtered_pct_change.to_dict(orient='records')

    trend_json = json.dumps(trend_data, indent=2)
    change_json = json.dumps(change_data, indent=2)
    prompt = (
        "You are a cybersecurity risk analyst. Below are two tables:\n\n"
        "1. Raw counts of vulnerabilities (CVEs) by asset and severity for each quarter.\n"
        "2. The corresponding quarter-over-quarter percentage change for each asset and severity.\n\n"
        "Analyze the data, highlighting emerging risk patterns. Narrate where there were significant spikes or drops,\n"
        "and call out assets with the most notable changes. Use percentages and time periods\n" 
        "(e.g., 'Web servers saw a 40% spike in Critical CVEs in Q1 2025').\n"
        "Your response should be titled: Risk Trends Insights.\n"
        "Finish by suggesting which assets or severities require immediate attention due to recent trends.\n\n"
        f"Raw trend data:\n{trend_json}\n\n"
        f"Quarterly percent change data:\n{change_json}\n"
    )
    return prompt


prompt = generate_trend_narrative_prompt(quarterly_trend, quarterly_change, 
                                         quarterly_trend.index.get_level_values(0).min(), 
                                         quarterly_trend.index.get_level_values(0).max())

system_prompt = "You are an expert at analyzing vulnerability data for security reporting with a particular interest in NIST's Cybersecurity Framework 2.0."
user_prompt = prompt

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ],
    max_tokens=1000,
    temperature=0.3
)

risk_trend_narrative = response.choices[0].message.content

with open("../markdown/risk_trend_narrative.md", "w", encoding="utf-8") as f:
    f.write(risk_trend_narrative)

with open("../markdown/risk_trend_narrative.md", "r", encoding="utf-8") as f:
    md_content = f.read()

display(Markdown(md_content))

## Risk Trends Insights

### Overview
The analysis of vulnerability data over several quarters reveals notable trends and shifts in risk patterns across different assets and severities. This report highlights significant spikes, drops, and emerging risk patterns, focusing on assets with the most notable changes.

### Key Observations

1. **Adobe Acrobat Reader**:
   - **High Severity**: There was a significant spike in high-severity vulnerabilities in Adobe Acrobat Reader, with a 466.7% increase in Q1 2022 and a further 111.8% increase in Q2 2022. This trend continued with fluctuations, peaking again in Q2 2024 with a 128.6% increase.
   - **Medium Severity**: Medium-severity vulnerabilities also saw a dramatic increase, with a 300% rise in Q2 2022. However, there was a notable decrease of 63.6% in Q4 2022, followed by fluctuations in subsequent quarters.

2. **Oracle Database Server 19c**:
   - **Critical Severity**: Critical vulnerabilities were sporadic, with occurrences in Q1 2019 and Q4 2019, but there was a notable absence of critical vulnerabilities in subsequent quarters.
   - **Medium Severity**: A significant increase of 600% in medium-severity vulnerabilities was observed in Q4 2019, followed by a sharp decline in Q1 2020.

3. **Microsoft Exchange Server 2019**:
   - **Critical Severity**: Critical vulnerabilities appeared intermittently, with occurrences in Q1 2019 and Q3 2023, indicating sporadic but potentially impactful risks.
   - **Medium Severity**: There was a 100% increase in medium-severity vulnerabilities in Q2 2019, followed by fluctuations and a notable absence in Q1 2021.

4. **Oracle Database 19c (Enterprise)**:
   - **Medium Severity**: There was a 200% increase in medium-severity vulnerabilities in Q3 2019, followed by a 350% increase in Q3 2022, indicating a recurring pattern of heightened risk.

### Emerging Risk Patterns

- **Adobe Acrobat Reader** consistently shows a high volume of vulnerabilities, particularly in high and medium severities, suggesting a persistent risk that requires ongoing attention.
- **Oracle Database Server 19c** and **Oracle Database 19c (Enterprise)** exhibit fluctuations in medium-severity vulnerabilities, indicating potential instability and the need for monitoring.
- **Microsoft Exchange Server 2019** shows sporadic critical vulnerabilities, highlighting the need for vigilance despite the absence of consistent patterns.

### Recommendations for Immediate Attention

- **Adobe Acrobat Reader**: Given the consistent increase in high-severity vulnerabilities, this asset requires immediate and sustained attention to mitigate potential risks.
- **Oracle Database Server 19c**: The significant fluctuations in medium-severity vulnerabilities suggest a need for enhanced monitoring and potential remediation efforts.
- **Microsoft Exchange Server 2019**: Despite the sporadic nature of critical vulnerabilities, the potential impact warrants close monitoring and timely patching.

In conclusion, while Adobe Acrobat Reader presents the most pressing risk due to its consistent vulnerability spikes, Oracle Database Server 19c and Microsoft Exchange Server 2019 also require attention to address emerging and sporadic risks effectively.
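
The percentages quoted in a generated narrative like the one above are only as reliable as the model's reading of the JSON tables, so it can help to list the largest movements directly from the percent-change data and compare. The following is a sketch of such a cross-check: `top_movers` is a hypothetical helper, and `toy_change` is made-up data shaped like `quarterly_change`.

```python
import pandas as pd

def top_movers(pct_change_df, n=5):
    """Return the n largest absolute percent changes in long format."""
    long = pct_change_df.reset_index().melt(
        id_vars=["period", "cpeName"],
        var_name="baseSeverity",
        value_name="change_pct",
    )
    # Rank by magnitude so large drops are listed alongside large spikes.
    ranked = long.sort_values("change_pct", key=lambda s: s.abs(), ascending=False)
    return ranked.head(n).reset_index(drop=True)

# Hypothetical percent-change table indexed by (period, cpeName).
idx = pd.MultiIndex.from_tuples(
    [("2024Q1", "db"), ("2024Q2", "db"), ("2024Q1", "web"), ("2024Q2", "web")],
    names=["period", "cpeName"],
)
toy_change = pd.DataFrame(
    {"HIGH": [0.0, 466.7, 0.0, -63.6], "MEDIUM": [0.0, 12.5, 0.0, 300.0]},
    index=idx,
)

movers = top_movers(toy_change, n=2)
print(movers)  # the db HIGH row (466.7) first, then the web MEDIUM row (300.0)
```

Applied to `quarterly_change`, the resulting rows give a short list of figures to verify against the assets and quarters named in the narrative.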