## Risk Generation Prompt Engineering Notes

Risk generation is a new feature that we are adding to the project to help visualize an organization's current risks, so there is no initial data, examples, or templates to draw on. We will have to research and generate our own examples.

Fortunately, risks are naturally tabular: every risk has the same fields (name, overview, severity score, affected elements, and recommendations). A JSON schema is therefore a good fit for constraining the model's output.
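
To make the target shape concrete, here is a minimal sketch of a single risk entry together with a hand-rolled check of its required fields. The field names follow the response schema used later in these notes; the sample values and the `missing_fields` helper are hypothetical and only illustrate the idea.

```python
import json

# Required keys, mirroring the "required" lists in the response schema.
REQUIRED_RISK_FIELDS = (
    "risk_name",
    "overview",
    "severity_cvss_score",
    "affected_elements",
    "recommendations",
)
REQUIRED_RECOMMENDATION_FIELDS = ("easy_fix", "resources")


def missing_fields(risk: dict) -> list[str]:
    """Return the names of required fields absent from one risk entry (hypothetical helper)."""
    missing = [key for key in REQUIRED_RISK_FIELDS if key not in risk]
    recommendations = risk.get("recommendations")
    if isinstance(recommendations, dict):
        missing += [
            f"recommendations.{key}"
            for key in REQUIRED_RECOMMENDATION_FIELDS
            if key not in recommendations
        ]
    return missing


# Illustrative values only; they are not taken from a real report.
sample_risk = {
    "risk_name": "Remote Desktop exposed to the internet",
    "overview": "An external scan found the RDP service reachable from any address.",
    "severity_cvss_score": 8.5,
    "affected_elements": ["Port 3389 (RDP)"],
    "recommendations": {
        "easy_fix": "Restrict the port to trusted addresses at the firewall.",
        "long_term_fix": "Place remote access behind a VPN with MFA.",
        "resources": [],
    },
}

print(json.dumps(sample_risk, indent=2))
print(missing_fields(sample_risk))
```

A check like this is optional when the API enforces the schema, but it can be handy for validating hand-written example files.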

In [1]:
import os
import google.generativeai as genai
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel('gemini-2.5-pro')


In [2]:
def export_html_to_files(full_html_content, html_output_filename="output.html"):
    """
    Writes the generated HTML string to a file.

    Args:
        full_html_content (str): The HTML content generated by generate_full_report_html.
        html_output_filename (str): The path of the HTML file to write.
    """
    if not full_html_content:
        print("Export failed: HTML content is empty or None.")
        return

    try:
        with open(html_output_filename, "w", encoding="utf-8") as f:
            f.write(full_html_content)
        print(f"✅ Successfully exported HTML: {html_output_filename}")

    except Exception as e:
        print(f"❌ An unexpected error occurred during file export: {e}")

In [3]:
proModel = genai.GenerativeModel('gemini-2.5-pro')
flashModel = genai.GenerativeModel('gemini-2.5-flash')

In [4]:
def get_severity(score: float) -> tuple[str, str, str]:
    """Determines the color (hex), label, and Tailwind color class based on the CVSS score."""
    if score >= 9.0:
        return '#EF4444', 'CRITICAL', 'vuln-critical'
    if score >= 7.0:
        return '#F59E0B', 'HIGH', 'vuln-high'
    if score >= 4.0:
        return '#3B82F6', 'MEDIUM', 'vuln-medium'
    return '#10B981', 'LOW', 'vuln-low'

def json_to_html_cards(json_data: dict) -> str:
    """
    Converts the JSON vulnerability report into styled HTML cards.
    
    Args:
        json_data: A dictionary representing the parsed JSON vulnerability report.
        
    Returns:
        A string containing the HTML markup for the vulnerability cards.
    """
    if not json_data or not json_data.get('vulnerabilities'):
        return '<div class="text-center p-8 text-gray-500">No vulnerabilities found in the report.</div>'

    html_cards = []
    
    for vuln in json_data['vulnerabilities']:
        hex_color, label, tw_color_class = get_severity(vuln['severity_cvss_score'])
        
        # 1. Resources List HTML
        resources_html = ""
        for res in vuln['recommendations']['resources']:
            resources_html += f"""
                <li class="mb-1">
                    <a href="{res['url']}" target="_blank" class="text-blue-600 hover:text-blue-800 underline transition duration-150 ease-in-out font-medium">
                        <i class="fas fa-external-link-alt mr-2 text-xs"></i>
                        {res['type'].upper()}: {res.get('description', res['url'])}
                    </a>
                </li>
            """
        
        # 2. Affected Elements HTML
        affected_html = "".join([
            f'<span class="inline-block bg-gray-100 text-gray-700 text-xs px-3 py-1 rounded-full mr-2 mb-2 font-mono">{el}</span>'
            for el in vuln['affected_elements']
        ])
        
        # 3. Long-Term Fix HTML (Conditional)
        long_term_fix_html = ""
        if vuln['recommendations'].get('long_term_fix'):
            long_term_fix_html = f"""
                <div class="mb-4 p-4 bg-red-100 rounded-lg shadow-sm">
                    <p class="font-semibold text-red-800 flex items-center mb-1">
                        <i class="fas fa-cogs mr-2"></i> Long-Term/Strategic Fix:
                    </p>
                    <p class="text-sm text-red-900">{vuln['recommendations']['long_term_fix']}</p>
                </div>
            """

        # 4. Main Card Structure
        card = f"""
            <div class="bg-white shadow-xl rounded-2xl overflow-hidden transform transition duration-500 hover:scale-[1.01] border-t-8" style="border-top-color: {hex_color};">
                
                <!-- Header and Score -->
                <div class="p-6">
                    <div class="flex justify-between items-start mb-4">
                        <h2 class="text-2xl font-bold text-gray-900">{vuln['risk_name']}</h2>
                        <div class="text-center p-2 rounded-lg text-white font-extrabold text-sm shadow-md" style="background-color: {hex_color};">
                            <span class="block">{label}</span>
                            <span class="text-xs">CVSS {vuln['severity_cvss_score']:.1f}</span>
                        </div>
                    </div>
                    
                    <!-- Overview -->
                    <p class="text-gray-600 mb-6 border-l-4 border-gray-200 pl-4 italic">
                        {vuln['overview']}
                    </p>

                    <!-- Affected Elements -->
                    <h3 class="text-lg font-semibold text-gray-800 mb-2 mt-4">Affected Elements</h3>
                    <div class="flex flex-wrap mb-6">
                        {affected_html}
                    </div>
                </div>

                <!-- Recommendations Section -->
                <div class="bg-gray-50 p-6">
                    <h3 class="text-lg font-bold text-gray-800 mb-4">Mitigation Recommendations</h3>
                    
                    <!-- Easy Fix -->
                    <div class="mb-4 p-4 bg-yellow-100 rounded-lg shadow-sm">
                        <p class="font-semibold text-yellow-800 flex items-center mb-1">
                            <i class="fas fa-hammer mr-2"></i> Quick/Easy Fix:
                        </p>
                        <p class="text-sm text-yellow-900">{vuln['recommendations']['easy_fix']}</p>
                    </div>
                    
                    <!-- Long Term Fix (Conditional) -->
                    {long_term_fix_html}

                    <!-- Resources -->
                    <h4 class="font-semibold text-gray-700 mt-4 mb-2 border-t pt-4">External Resources</h4>
                    <ul class="list-none space-y-2">
                        {resources_html}
                    </ul>
                </div>
            </div>
        """
        html_cards.append(card)
        
    return "".join(html_cards)

def generate_full_report_html(vulnerability_data: dict) -> str:
    """Generates the full, standalone HTML page including header, styles, and card content."""
    
    card_content = json_to_html_cards(vulnerability_data)
    
    # Dynamic colors are applied through inline styles (e.g. style="border-top-color: #EF4444;")
    # rather than arbitrary-value Tailwind classes such as border-[#EF4444], which are hard to
    # generate reliably with Python string formatting. json_to_html_cards emits the hex codes inline.
    
    full_html = f"""
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Vulnerability Report Visualizer (Python Generated)</title>
    <!-- Load Tailwind CSS -->
    <script src="https://cdn.tailwindcss.com"></script>
    <!-- Load Font Awesome for icons -->
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css" crossorigin="anonymous" />
    <script>
        // Configuration needed for custom colors
        tailwind.config = {{
            theme: {{
                extend: {{
                    fontFamily: {{
                        sans: ['Inter', 'sans-serif'],
                    }},
                    // Note: These colors are also applied via inline style in Python
                    colors: {{
                        'vuln-critical': '#EF4444', 
                        'vuln-high': '#F59E0B',    
                        'vuln-medium': '#3B82F6',  
                        'vuln-low': '#10B981',     
                    }}
                }}
            }}
        }}
    </script>
</head>
<body class="bg-gray-100 min-h-screen p-4 sm:p-8 font-sans">
    <div class="max-w-7xl mx-auto">
        <header class="text-center py-6 mb-8 bg-white shadow-lg rounded-xl">
            <h1 class="text-4xl font-extrabold text-gray-900 tracking-tight">
                Cybersecurity Vulnerability Dashboard
            </h1>
            <p class="text-lg text-gray-500 mt-2">Innovatech Dynamics - Cloud Security Audit Summary (Python Generated)</p>
        </header>

        <!-- Container for the generated vulnerability cards -->
        <div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-8">
            {card_content}
        </div>
        
    </div>
</body>
</html>
"""
    return full_html
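
The card templates above interpolate model-generated strings directly into HTML. Since that text is not under our control, it may be worth escaping it first. The following is a minimal sketch, not part of the original notebook: `safe_text` and `safe_href` are hypothetical helpers, and the policy of allowing only http(s) links is an assumption.

```python
import html
from urllib.parse import urlparse


def safe_text(value) -> str:
    """Escape a model-generated value for use in HTML text or a quoted attribute."""
    return html.escape(str(value), quote=True)


def safe_href(url: str) -> str:
    """Return an escaped http(s) URL, or '#' for any other scheme (assumed policy)."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https"):
        return html.escape(url, quote=True)
    return "#"


# Illustrative inputs showing what the helpers do to hostile values.
untrusted_name = "<script>alert(1)</script>"
print(f"<h2>{safe_text(untrusted_name)}</h2>")
print(safe_href("javascript:alert(1)"))
print(safe_href("https://example.com/?a=1&b=2"))
```

In the card builder, fields such as `vuln['risk_name']` and `res['url']` could be passed through these helpers before being placed in the f-strings.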

In [None]:
import textwrap
import json

example_report = "risk_template/example.tex"
example_risk_list = "risk_template/example_risk_list.json"
context_report = "risk_template/context.tex"

# Prompt
risk_prompt_text = textwrap.dedent(r'''You are an expert cybersecurity analyst tasked with converting a technical LaTeX vulnerability report into a structured JSON format. You are also given a list of current risks. Do not repeat risks that are already in this list. If it is empty, there are no current risks reported for this organization. It is your job to create new risks.
Your goal is to extract all explicit and implicit risks and map them precisely to the provided JSON schema. Do not include any text outside of the JSON block in your final response.

Focus Areas for Extraction:
    - Risk Name & Overview: Identify distinct vulnerabilities (e.g., exposed ports, missing policies, weak email configuration) and summarize them.
    - Affected Elements: Note the specific IP addresses, ports, domains, or controls mentioned (e.g., Port 3389 (RDP), email domain DMARC, MFA for Email).
    - Recommendations: Match the fixes mentioned in the report's Recommendations section to the easy_fix and long_term_fix fields.
    - CVSS Score: Assign an appropriate severity score (1-10) based on the report's assessment (e.g., Critical, High, Medium, Low) and the nature of the vulnerability. Critical/Severely exposed services should be high (e.g., 8-10).
    - Resources: Since the report does not provide links, use your knowledge to provide relevant, high-quality public links (YouTube or websites) for the proposed fixes, such as implementing DMARC or securing RDP/SSH.
''')

risk_example_text = 'Example Task:\nFollow the structure, formatting, and analytical style of this example precisely.\n\n Example report:\n'

# Example/Template Report
with open(example_report, 'r', encoding='utf-8') as file:
    risk_example_text += file.read()

risk_example_text += "\n\nExample Risk List:\n"

# Example Risk List (parsed and re-serialized so it is pretty-printed consistently)
with open(example_risk_list, 'r', encoding='utf-8') as file:
    risk_example_text += json.dumps(json.load(file), indent=2)

# Context Report
with open(context_report, 'r', encoding='utf-8') as file:
    report_response = file.read()

print(risk_example_text)
print("========================")
print(report_response)

Example Task:
Follow the structure, formatting, and analytical style of this example precisely.

 Example report:
```latex
\documentclass[12pt]{article}
\usepackage[letterpaper, margin=1in]{geometry}
\usepackage{amsmath, amssymb}
\usepackage{pifont} % For the checkmark symbol \ding{51}
\usepackage{booktabs} % For professional-looking tables
\usepackage{hyperref} % For links
\usepackage{fontawesome5}
\usepackage{setspace} % For line spacing
\usepackage{url} % Included and applied to domains
\usepackage{seqsplit} % Applied to long code/IP strings

\hypersetup{
    colorlinks=true,
    urlcolor=blue,
    linkcolor=black
}

\title{\textbf{Cybersecurity Readiness Report for G.A.S. Inc.}}
\author{Date: October 14, 2025}
\date{}

\begin{document}
\maketitle
\onehalfspacing

\section{Overview}
This report provides a cybersecurity assessment for G.A.S. Inc., based on a combination of a self-reported questionnaire and external technical scans. The analysis reveals critical vulnerabilities in net
```

In [16]:
risk_response = proModel.generate_content(
    contents=[risk_prompt_text, report_response + "\n" + risk_example_text],
    generation_config={
        'response_mime_type': 'application/json',
        'response_schema': {
            "type": "object",
            "properties": {
                "vulnerabilities": {
                    "type": "array",
                    "description": "A list of identified cybersecurity risks/vulnerabilities from the report.",
                    "items": {
                        "type": "object",
                        "properties": {
                            "risk_name": {
                                "type": "string",
                                "description": "A concise, descriptive name for the risk (e.g., 'SQL Injection Vulnerability', 'Outdated Library')."
                            },
                            "overview": {
                                "type": "string",
                                "description": "A text summary of the risk, its impact, and how it was identified."
                            },
                            "severity_cvss_score": {
                                "type": "number",
                                "description": "The calculated severity score (1-10) for the risk, based on CVSS metrics. This should be an integer or a decimal number."
                            },
                            "affected_elements": {
                                "type": "array",
                                "description": "A list of system components, files, URLs, or specific functions/code areas affected by this risk.",
                                "items": {
                                    "type": "string"
                                }
                            },
                            "recommendations": {
                                "type": "object",
                                "description": "Specific recommendations for mitigating the risk.",
                                "properties": {
                                    "easy_fix": {
                                        "type": "string",
                                        "description": "A quick, immediate, or easy-to-implement mitigation step."
                                    },
                                    "long_term_fix": {
                                        "type": "string",
                                        "description": "A more difficult, time-consuming, or comprehensive architectural fix, if necessary."
                                    },
                                    "resources": {
                                        "type": "array",
                                        "description": "Links to external resources (YouTube videos, official documentation, articles) for assistance.",
                                        "items": {
                                            "type": "object",
                                            "properties": {
                                                "type": {
                                                    "type": "string",
                                                    "enum": ["youtube", "website", "documentation"],
                                                    "description": "The type of resource."
                                                },
                                                "url": {
                                                    "type": "string",
                                                    "format": "uri",
                                                    "description": "The URL of the resource."
                                                },
                                                "description": {
                                                    "type": "string",
                                                    "description": "A brief description of what the resource contains."
                                                }
                                            },
                                            "required": ["type", "url"]
                                        }
                                    }
                                },
                                "required": ["easy_fix", "resources"]
                            }
                        },
                        "required": [
                            "risk_name",
                            "overview",
                            "severity_cvss_score",
                            "affected_elements",
                            "recommendations"
                        ]
                    }
                }
            },
            "required": ["vulnerabilities"]
        }
    }
)

print(risk_response.text)

{
  "vulnerabilities": [
    {
      "risk_name": "Publicly Accessible S3 Bucket",
      "overview": "The S3 bucket 'innovatech-prod-logs-storage', which contains customer analytics, session IDs, and geolocation data, was found to be publicly accessible due to an incorrect bucket policy. This direct exposure of sensitive data constitutes a critical security risk and a potential compliance breach.",
      "severity_cvss_score": 9.8,
      "affected_elements": [
        "AWS S3 Bucket: arn:aws:s3:::innovatech-prod-logs-storage"
      ],
      "recommendations": {
        "easy_fix": "Immediately modify the S3 bucket policy for 'innovatech-prod-logs-storage' to block all public access using the AWS S3 Block Public Access feature.",
        "resources": [
          {
            "type": "documentation",
            "url": "https://docs.aws.amazon.com/AmazonS3/latest/userguide/security-best-practices.html",
            "description": "Official AWS documentation on S3 security best practices

In [None]:
risk_response = flashModel.generate_content(
    contents=[risk_prompt_text, report_response + "\n" + risk_example_text],
    generation_config={
        'response_mime_type': 'application/json',
        'response_schema': {
            "type": "object",
            "properties": {
                "vulnerabilities": {
                    "type": "array",
                    "description": "A list of identified cybersecurity risks/vulnerabilities from the report.",
                    "items": {
                        "type": "object",
                        "properties": {
                            "risk_name": {
                                "type": "string",
                                "description": "A concise, descriptive name for the risk (e.g., 'SQL Injection Vulnerability', 'Outdated Library')."
                            },
                            "overview": {
                                "type": "string",
                                "description": "A text summary of the risk, its impact, and how it was identified."
                            },
                            "severity_cvss_score": {
                                "type": "number",
                                "description": "The calculated severity score (1-10) for the risk, based on CVSS metrics. This should be an integer or a decimal number."
                            },
                            "affected_elements": {
                                "type": "array",
                                "description": "A list of system components, files, URLs, or specific functions/code areas affected by this risk.",
                                "items": {
                                    "type": "string"
                                }
                            },
                            "recommendations": {
                                "type": "object",
                                "description": "Specific recommendations for mitigating the risk.",
                                "properties": {
                                    "easy_fix": {
                                        "type": "string",
                                        "description": "A quick, immediate, or easy-to-implement mitigation step."
                                    },
                                    "long_term_fix": {
                                        "type": "string",
                                        "description": "A more difficult, time-consuming, or comprehensive architectural fix, if necessary."
                                    },
                                    "resources": {
                                        "type": "array",
                                        "description": "Links to external resources (YouTube videos, official documentation, articles) for assistance.",
                                        "items": {
                                            "type": "object",
                                            "properties": {
                                                "type": {
                                                    "type": "string",
                                                    "enum": ["youtube", "website", "documentation"],
                                                    "description": "The type of resource."
                                                },
                                                "url": {
                                                    "type": "string",
                                                    "format": "uri",
                                                    "description": "The URL of the resource."
                                                },
                                                "description": {
                                                    "type": "string",
                                                    "description": "A brief description of what the resource contains."
                                                }
                                            },
                                            "required": ["type", "url"]
                                        }
                                    }
                                },
                                "required": ["easy_fix", "resources"]
                            }
                        },
                        "required": [
                            "risk_name",
                            "overview",
                            "severity_cvss_score",
                            "affected_elements",
                            "recommendations"
                        ]
                    }
                }
            },
            "required": ["vulnerabilities"]
        }
    }
)
print(risk_response.text)

{"vulnerabilities":[{"risk_name":"S3 Bucket Public Exposure","overview":"The S3 bucket named innovatech-prod-logs-storage, used for storing application logs and analytics data, was found to be publicly accessible due to an incorrect bucket policy setting. This exposes customer analytics records, including anonymized session IDs and geolocation data, which constitutes a breach of compliance standards.","severity_cvss_score":9.8,"affected_elements":["AWS S3 Bucket arn:aws:s3:::innovatech-prod-logs-storage"],"recommendations":{"easy_fix":"The bucket policy for innovatech-prod-logs-storage must be modified to block all public access. The policy should only allow access via specific IAM roles tied to the application services that require logging access.","resources":[{"type":"documentation","url":"https://docs.aws.amazon.com/AmazonS3/latest/userguide/configuring-block-public-access.html","description":"AWS documentation on blocking public access to S3 buckets."},{"type":"website","url":"htt
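
The cells above print the raw response but do not show how it reaches the card renderer. A minimal sketch of that step, assuming the response text is a JSON document with a top-level `vulnerabilities` list; `parse_risk_response` is a hypothetical helper, and falling back to an empty report is one possible design choice.

```python
import json


def parse_risk_response(raw_text: str) -> dict:
    """Parse the model's JSON text; return an empty report if it is unusable."""
    empty_report = {"vulnerabilities": []}
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return empty_report
    if not isinstance(data, dict) or not isinstance(data.get("vulnerabilities"), list):
        return empty_report
    return data


# Illustrative input; in the notebook this would be risk_response.text.
raw = '{"vulnerabilities": [{"risk_name": "Example risk", "severity_cvss_score": 5.0}]}'
report = parse_risk_response(raw)
print(len(report["vulnerabilities"]))

# In the notebook, the parsed report could then be rendered and saved
# with the helpers defined earlier:
# export_html_to_files(generate_full_report_html(parse_risk_response(risk_response.text)))
```

With an empty report, `json_to_html_cards` already falls back to its "No vulnerabilities found" message, so a malformed response would not crash the export.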

## Summary

The Pro model generated correct links for its recommended fixes. The Flash model, however, produced an incorrect YouTube link for a video on securing S3 buckets on AWS.
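
Because link quality differed between the two models, a small automated check can catch some bad links before a report is shown. The sketch below is an assumption about how such a check might look, not something from the original notebook: `collect_resource_urls`, `is_well_formed`, and `url_resolves` are hypothetical helpers. `url_resolves` needs network access and is not called here.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def collect_resource_urls(report: dict) -> list[str]:
    """Gather every resource URL from a parsed risk report."""
    urls = []
    for vuln in report.get("vulnerabilities", []):
        for res in vuln.get("recommendations", {}).get("resources", []):
            if "url" in res:
                urls.append(res["url"])
    return urls


def is_well_formed(url: str) -> bool:
    """Offline check: the URL has an http(s) scheme and a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def url_resolves(url: str, timeout: float = 5.0) -> bool:
    """Online check: send a HEAD request and report whether the status is below 400."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status < 400
    except (OSError, ValueError):
        return False


# Minimal report using a URL that appears in the Pro model's output above.
sample_report = {
    "vulnerabilities": [
        {
            "recommendations": {
                "resources": [
                    {
                        "type": "documentation",
                        "url": "https://docs.aws.amazon.com/AmazonS3/latest/userguide/security-best-practices.html",
                    }
                ]
            }
        }
    ]
}

for url in collect_resource_urls(sample_report):
    print(url, is_well_formed(url))
```

A URL that resolves is not proof that the page covers the intended topic, so generated links, especially video links, still need a manual review.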