# Threat Report Summarizing

This notebook experiments with few-shot prompting to automatically summarize threat reports and generate a mind map of the key findings.

## What is few-shot learning?

Few-shot learning in prompt engineering means giving a model a small number of worked examples ('shots') directly in the prompt so that it can infer the task and the expected output format. The model is not retrained: instead of requiring vast amounts of data, it relies on its prior knowledge and adapts to the new task from the few examples it is shown.

The motivation is to "teach" the model with task-specific examples in order to reduce mistakes. In this experiment, I apply the concept to threat report summarization, but it can be applied to any type of data.
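
As a minimal sketch of the idea, a few-shot prompt for a chat model can be assembled as a system message, followed by example input/output pairs, followed by the real input. The helper name `build_few_shot_messages` and the placeholder texts are hypothetical and used only for illustration; the message layout mirrors the one used in `run_models` further down.

```python
def build_few_shot_messages(system_prompt, examples, user_input):
    """Return a chat message list: system prompt, example pairs, then the real input."""
    messages = [{"role": "system", "content": system_prompt}]
    for example_input, example_output in examples:
        # Each "shot" is a user message paired with the ideal assistant reply
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    # The real input always comes last so the model answers it in the same style
    messages.append({"role": "user", "content": user_input})
    return messages


messages = build_few_shot_messages(
    "Summarize the threat report as a Mermaid mindmap.",
    [("Report about actor A ...", "mindmap\nroot(Actor A)")],
    "Report about actor B ...",
)
print([m["role"] for m in messages])
```

Adding more example pairs to the `examples` list adds more shots without changing the rest of the call.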

## Prerequisites

The code uses the OpenAI API, so you will need an OpenAI API key. The technique itself is not tied to OpenAI and can be adapted to any chat model.

Packages to install:
* openai
* requests
* beautifulsoup4 (imported as `bs4`)
* ipywidgets
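
One possible way to handle the API key without hard-coding it is to read it from the environment and fail early when it is missing or still a placeholder. This is only a sketch: the helper name `load_api_key` and the convention that placeholders start with `<` are assumptions made for illustration.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing."""
    key = os.environ.get(env_var, "").strip()
    # Assumption: placeholder values look like "<openai api key>"
    if not key or key.startswith("<"):
        raise RuntimeError(f"Set the {env_var} environment variable to a valid key")
    return key
```

Calling `load_api_key()` before creating the client surfaces a configuration problem immediately instead of as an authentication error on the first request.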

## Limitations

In some cases the generated mindmap code is not valid Mermaid syntax, especially when node text contains nested parentheses or brackets. This can usually be fixed either by rerunning the model or by editing the mindmap code by hand. Also keep in mind that this is a proof of concept that needs to be adjusted to your needs. :)
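
The manual fix for nested parentheses can also be approximated in code. The sketch below is not part of the original notebook: the function name `sanitize_mindmap` is hypothetical, and it assumes every node uses a single pair of parentheses (as the system prompt requests), so other Mermaid shapes such as double parentheses would be flattened too.

```python
import re

# Optional node id (e.g. "root"), then a label wrapped in one pair of parentheses
NODE_PATTERN = re.compile(r"^(\s*)(\w*)\((.*)\)\s*$")


def sanitize_mindmap(mindmap_code):
    """Strip parentheses, brackets, and braces from inside node labels."""
    cleaned = []
    for line in mindmap_code.splitlines():
        match = NODE_PATTERN.match(line)
        if match:
            indent, node_id, label = match.groups()
            # Keep only the outer parentheses that define the node shape
            label = re.sub(r"[()\[\]{}]", "", label)
            line = f"{indent}{node_id}({label})"
        cleaned.append(line)
    return "\n".join(cleaned)
```

Lines such as `mindmap` and `::icon(fa fa-bug)` do not match the pattern and are passed through unchanged.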

## Code

In [None]:
!pip3 install openai
!pip3 install beautifulsoup4

Collecting openai
  Downloading openai-1.47.0-py3-none-any.whl.metadata (24 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Collecting jiter<1,>=0.4.0 (from openai)
  Downloading jiter-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.6 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl.metadata (20 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)
Downloading openai-1.47.0-py3-none-any.whl (375 kB)
Downloading httpx-0.27.2-py3-none-any.whl (76 kB)
Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)

In [None]:
import requests
from bs4 import BeautifulSoup
from ipywidgets import widgets
from IPython.display import display
import openai
import os
import time
import re
from ipywidgets import Output

output = Output()

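# Configure the OPENAI_API_KEY environment variable before running this cell,
# or replace the placeholder below with your key (never commit a real key).
# setdefault keeps a key that is already set in the environment.
os.environ.setdefault("OPENAI_API_KEY", "<openai api key>")
client = openai.OpenAI()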

# Function to scrape text from a URL
def scrape_text(url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    }

    # A timeout prevents the notebook from hanging on an unresponsive site
    response = requests.get(url, headers=headers, timeout=30)
    if response.status_code == 200:
        page_content = response.content
        soup = BeautifulSoup(page_content, "html.parser")
        text = soup.get_text()
        return text
    else:
        return "Failed to scrape the website"

# Function to summarize the blog
def summarise(input_text):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages= [
            {
                "role": "system",
                "content":"You are responsible for summarizing a threat report for a Threat Analyst. Write a paragraph that will summarize the main topic, the key findings, and all the detailed information relevant for a threat analyst such as detection opportunities, IOCs, and TTPs. Use the title and add an emoji. Do not generate a bullet point list but rather multiple paragraphs."
            },

            {"role": "user", "content": input_text},
        ],
    )
    return response

# Function to generate a mindmap (few-shot technique).
# NB: the more shots you add, the better the result is likely to be
def run_models(input_text):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages= [
            {
                "role": "system",
                "content":"You are tasked with creating an in-depth mindmap designed specifically for a threat analyst. This mindmap aims to visually organize key findings and crucial highlights from the text. Please adhere to the following guidelines: \n1. Avoid using hyphens in the text, as they cause errors in the Mermaid.js code \n2. Limit the number of primary nodes branching from the main node to four. These primary nodes should encapsulate the top four main themes. Add detailed sub-nodes to elaborate on these themes \n3. Incorporate icons where suitable to enhance readability and comprehension\n 4. Use single parentheses around each node to give them a rounded shape."
            },
            {
                "role": "user",
                "content": "Title: 🦠 Lazarus Group's Infrastructure Reuse Leads to Discovery of New Malware\n\nThe Lazarus Group, a North Korean state-sponsored actor famous for its relentless cyber offensive actions, continues to adjust its tactics and expand its arsenal. Recently, the revealed an exploitation of the ManageEngine ServiceDesk vulnerability (CVE-2022-47966) in another campaign. This exposure led to deploying multiple threats, with a new one identified as CollectionRAT, alongside an already used threat named QuiteRAT. \n\nThe advanced malware CollectionRAT has standard remote access trojan (RAT) capabilities, being able to run arbitrary commands on an infected system. Our intense analysis linked CollectionRAT to Jupiter/EarlyRAT, a malware family somewhat known to be associated with Andariel, a subgroup under the Lazarus Group umbrella. Interestingly, the group is gradually increasing its reliance on open-source tools and frameworks in the initial access phase of its attacks, as shown by Lazarus' use of the DeimosC2 framework. \n\nThe Lazarus Group's unchanging use of certain infrastructures, despite them being well-documented by security researchers, is another noteworthy observation. Their modus operandi, captured in the repeated use of the same tactics, techniques, and procedures (TTPs), shows audacious assurance in their operations. But this approach also offers intelligence opportunities for security analysts on tracking these reusable infrastructure components. \n \nDespite their blatant actions, the Lazarus Group's dynamism is evident by their increasingly heavy reliance on using dual-use utilities for activities like reverse tunneling. Some of the reused infrastructure components hosted the new CollectionRAT malware, and an altered copy of PuTTY's Plink utility was downloaded onto compromised endpoints, further demonstrating Lazarus Group's rapid evolution. \n\nIn conclusion, the Lazarus Group continues to reuse and recycle its well-worn tactics while pushing ahead with new threats and evolving TTPs. This information highlights why there's a need for organized cooperation among threat researchers and the critical importance of staying up-to-date with the latest threat intelligence.",
            },
            {
                "role": "assistant",
                "content": "mindmap\nroot(Lazarus Group Threat Analysis)\n    (Infrastructure Reuse)\n      ::icon(fa fa-sync-alt)\n      (Used in latest campaign)\n      (Includes Plink, an open-source tool)\n      (Indicates confidence in operations)\n    (New Malware - CollectionRAT)\n      ::icon(fa fa-bug)\n      (Remote Access Trojan)\n      (Allows running arbitrary commands on infected system)\n      (Similarities to EarlyRAT malware)\n    (Shifting Tactics)\n      ::icon(fa fa-exchange-alt)\n      (Increased use of open-source tools and frameworks)\n      (Including DeimosC2 framework)\n    (Protection Measures)\n      ::icon(fa fa-shield-alt)\n      (Detection and blocking with Cisco security products)\n      (Indicators of Compromise available on GitHub)\n    (Threat Report Significance)\n      ::icon(fa fa-exclamation-circle)\n      (Highlighted continued activity of Lazarus Group)\n      (Essential for enhancing detection and response capabilities)",
            },
            {"role": "user", "content": input_text},
        ],
    )
    return response

# Function to generate MermaidJS HTML
def mermaid_chart(mindmap_code):
    html_code = f"""
    <html>
     <head>
      <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.1/css/all.min.css">
      <script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js">
      </script>
      <script>
        mermaid.initialize({{
            startOnLoad: true,
            flowchart: {{
                width: 1000,  // Set as per your requirements
                height: 800   // Set as per your requirements
            }}
        }});
      </script>
      <style>
        body {{
          margin: 0;
          height: 100vh;
          display: flex;
          justify-content: center;
          align-items: center;
          background-color: transparent;
        }}

        .mermaid {{
          background-color: transparent;
          transform: scale(1.5);  /* Adjust the scaling factor as needed */
          transform-origin: center;
          width: 100%;
          max-width: 1100px;  /* Adjust as needed */
          height: auto;
        }}
      </style>
     </head>
     <body>
      <div class="mermaid">
    {mindmap_code}
      </div>
     </body>
    </html>
    """

    # Write as UTF-8 so emoji and other non-ASCII characters are saved correctly
    with open("mindmap.html", "w", encoding="utf-8") as f:
        f.write(html_code)

# Define button click event
def on_button_click(b):
    with output:
        url = url_input.value
        print("[+] Blog to summarize: " + url)
        scraped_text = scrape_text(url)

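        summary_text = None
        if summarize_checkbox.value:
            summary = summarise(scraped_text)
            # Extract the generated text from the API response object
            summary_text = summary.choices[0].message.content
            print("Summary:")
            print(summary_text)

        if mindmap_checkbox.value:
            # Build the mindmap from the summary when available, otherwise from the scraped text
            mindmap_input = summary_text if summary_text else scraped_text
            mindmap_code = run_models(mindmap_input).choices[0].message.content
            print("Mindmap Code:")
            print(mindmap_code)
            mermaid_chart(mindmap_code)
            print("Mindmap has been saved to 'mindmap.html'.")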

# Create widgets
# url_input = widgets.Text(
#     value='',
#     placeholder='Enter URL to summarize',
#     description='URL:',
#     disabled=False
# )
#
# summarize_checkbox = widgets.Checkbox(
#     value=False,
#     description='Summarize',
#     disabled=False,
#     indent=False
# )
#
# mindmap_checkbox = widgets.Checkbox(
#     value=False,
#     description='Generate Mindmap',
#     disabled=False,
#     indent=False
# )
#
# # Create a button
# button = widgets.Button(description="Go")
# button.on_click(on_button_click)
#
# # Display widgets
# display(url_input, summarize_checkbox, mindmap_checkbox, button)
# display(output)

content = """
Skip to main content
U.S. flag
An official website of the United States government

Here’s how you know
Free Cyber Services
#protect2024
Secure Our World
Shields Up
Report A Cyber Issue

CISA Logo Americas Cyber Defense Agency
Search


Topics
Spotlight
Resources & Tools
News & Events
Careers
About
Breadcrumb
Home  News & Events  Cybersecurity Advisories  Alert
Share:
Alert
Versa Networks Releases Advisory for a Vulnerability in Versa Director, CVE-2024-39717
Release DateAugust 27, 2024
Versa Networks has released an advisory for a vulnerability (CVE-2024-39717) in Versa Director, a key component in managing SD-WAN networks, used by some Internet Service Providers (ISPs) and Managed Service Providers (MSPs). A cyber threat actor could exploit this vulnerability to take control of an affected system.

CISA urges organizations to apply necessary updates, hunt for any malicious activity, report any positive findings to CISA, and review the following for more information:

Versa Security Bulletin: Update on CVE-2024-39717 – Versa Director Dangerous File Type Upload Vulnerability
Lumen: Taking the Crossroads: The Versa Director Zero-Day Exploitation
CISA has added this vulnerability to its Known Exploited Vulnerabilities Catalog based on evidence of active exploitation.

This product is provided subject to this Notification and this Privacy & Use policy.

Please share your thoughts
We recently updated our anonymous product survey; we’d welcome your feedback.

Related Advisories
Aug 27, 2024
Alert
CISA Adds One Known Exploited Vulnerability to Catalog
Aug 26, 2024
Alert
CISA Adds One Known Exploited Vulnerability to Catalog
Aug 23, 2024
Alert
CISA Adds One Known Exploited Vulnerability to Catalog for Versa Networks Director
Aug 22, 2024
Alert
CISA Releases Five Industrial Control Systems Advisories
Return to top
Topics
Spotlight
Resources & Tools
News & Events
Careers
About
Cybersecurity & Infrastructure Security Agency
Facebook
Twitter
LinkedIn
YouTube
Instagram
RSS
CISA Central
1-844-Say-CISA SayCISA@cisa.gov
DHS Seal
CISA.gov
An official website of the U.S. Department of Homeland Security
About CISA
Budget and Performance
DHS.gov
Equal Opportunity & Accessibility
FOIA Requests
No FEAR Act
Office of Inspector General
Privacy Policy
Subscribe
The White House
USA.gov
Website Feedback
"""
response = summarise(content)
chart = run_models(content)

AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: <openai ****key>. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}

In [None]:
print(response.choices[0].message.content)

🔒 **Versa Networks Releases Advisory for a Vulnerability in Versa Director, CVE-2024-39717** 🔒

Versa Networks has issued an advisory for a critical vulnerability identified as CVE-2024-39717 within Versa Director, a pivotal tool used for managing SD-WAN networks, commonly deployed by Internet Service Providers (ISPs) and Managed Service Providers (MSPs). This vulnerability, characterized as a dangerous file type upload flaw, can be exploited by cyber threat actors to gain control over the affected systems. The urgency of this situation has prompted the Cybersecurity and Infrastructure Security Agency (CISA) to add this vulnerability to its Known Exploited Vulnerabilities Catalog, confirming active exploitation in the wild.

Organizations utilizing Versa Director are strongly advised to implement the necessary updates immediately. Detection opportunities include monitoring for signs of unauthorized file uploads and increased privileged operations on affected systems. Indicators of Comp

In [None]:
print(chart.choices[0].message.content)

mindmap
  root(CISA Alert: Versa Director Vulnerability)
    (Vulnerability Details)
      ::icon(fa fa-bug)
      (CVE-2024-39717)
      (Impact on SD-WAN networks)
      (Exploitable by cyber threat actors)
      (Active exploitation evidence)
    (Affected Systems)
      ::icon(fa fa-network-wired)
      (Internet Service Providers (ISPs))
      (Managed Service Providers (MSPs))
    (Mitigation Steps)
      ::icon(fa fa-tools)
      (Apply necessary updates)
      (Hunt for malicious activity)
      (Report findings to CISA)
      (Review security bulletin)
    (Resources)
      ::icon(fa fa-book)
      (Versa Security Bulletin)
      (Lumen Report: Versa Director Zero-Day)
      (CISA Known Exploited Vulnerabilities Catalog)
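
The system prompt asks for at most four primary nodes, and the output above can be checked against that guideline with a small indentation-based parser. This is a rough sketch rather than a full Mermaid parser: the function name `primary_nodes` is hypothetical, and it assumes the root line starts with `root` and that hierarchy is expressed purely through leading spaces.

```python
def primary_nodes(mindmap_code):
    """Return the labels of the nodes directly below the root node."""
    def indent(line):
        return len(line) - len(line.lstrip())

    # Ignore blank lines and decorations such as ::icon(...)
    node_lines = [
        line for line in mindmap_code.splitlines()
        if line.strip() and not line.strip().startswith("::")
    ]
    root_index = next(
        (i for i, line in enumerate(node_lines) if line.strip().startswith("root")),
        None,
    )
    if root_index is None:
        return []
    root_indent = indent(node_lines[root_index])
    children = [l for l in node_lines[root_index + 1:] if indent(l) > root_indent]
    if not children:
        return []
    # Primary nodes are the least indented lines below the root
    level = min(indent(l) for l in children)
    return [l.strip() for l in children if indent(l) == level]
```

For example, `len(primary_nodes(chart.choices[0].message.content)) <= 4` could be used to decide whether to rerun the model.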

