# RAD Security: CVE-Aware Analysis Agent

## AI Engineer Take-Home Exercise

This document outlines the architecture and implementation of an LLM-powered agent designed to analyze security incidents in the context of CVE data. The agent uses semantic search, retrieval-augmented generation, and structured agent tools to provide contextual prioritization of security vulnerabilities.

### Key Features

- **Semantic Search**: Searches over KEV, NVD, and historical incident records
- **Agent Tools**: Uses MCP (Model Context Protocol) for structured tool access
- **Contextual Prioritization**: Ranks CVEs based on their relevance to specific incidents
- **Historical Learning**: Builds a vector store of past analyses for normalization
- **Persistence**: Stores analyses in SQLite for future reference

## Why This Architecture

The architecture is designed to address several key challenges in security incident analysis:

1. **Volume Challenge**: Security teams face thousands of CVEs and alerts daily
2. **Context Challenge**: Understanding the relationship between vulnerabilities and incidents requires contextual knowledge
3. **Expertise Challenge**: Security expertise is scarce and expensive
4. **Consistency Challenge**: Manual analysis leads to inconsistent prioritization

Our solution uses LLMs and semantic search to understand incident context, identify relevant CVEs, prioritize them based on impact, and generate human-readable explanations of the analysis.

## 1. Setup and Dependencies

Let's start by installing the required dependencies and setting up our environment.

**Why we do this:** Ensuring all required packages are available creates a reproducible environment. This setup step loads essential libraries for LangChain, LangGraph, OpenAI, FAISS vector storage, and Redis caching.

In [1]:
# Let's quickly disable all INFO logging before we continue
import logging
logging.getLogger().setLevel(logging.ERROR)

In [2]:
# Install requirements
%pip install -r requirements.txt
# Or whatever equivalent you prefer (poetry, uv)

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Note: you may need to restart the kernel to use updated packages.





## 2. Start Redis (for Idempotency Cache)

We use Redis for request deduplication and caching. This ensures our system is idempotent and avoids redundant processing.

Each analysis request carries a unique `request_id`, which the API uses as the deduplication key.

**Why we do this:** Redis provides fast, in-memory caching that helps us:
1. Deduplicate analysis requests (idempotency)
2. Cache expensive operations like semantic searches
3. Reduce API costs and latency by storing LLM responses
4. Ensure consistent behavior even with intermittent failures
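
The deduplication idea can be sketched as follows. This is a minimal illustration, not the project's actual code: `InMemoryCache`, `analyze_once`, and the `analysis:` key prefix are hypothetical names, and the in-memory class only stands in for a Redis client so the example runs without a server.

```python
import json
from typing import Callable, Optional


class InMemoryCache:
    """Hypothetical stand-in for a Redis client (only get/set are modeled)."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        self._store[key] = value


def analyze_once(cache, request_id: str, run_analysis: Callable[[], dict]) -> dict:
    """Return the cached result for request_id, computing it only on a miss."""
    key = f"analysis:{request_id}"  # hypothetical key scheme
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # repeat request: skip the expensive work
    result = run_analysis()
    cache.set(key, json.dumps(result))
    return result


calls = []


def fake_analysis() -> dict:
    """Placeholder for the real (expensive) agent call."""
    calls.append(1)
    return {"incident_id": "INC-2023-08-01-001", "status": "analyzed"}


cache = InMemoryCache()
first = analyze_once(cache, "req-123", fake_analysis)
second = analyze_once(cache, "req-123", fake_analysis)  # served from the cache
```

A real `redis.Redis` client exposes `get` and `set` as well, so it could be passed in place of `InMemoryCache`; in that case an expiry on the key would likely be worth adding.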

In [3]:
# Start Docker Service for Redis:
!docker run -d --name local-redis -p 6379:6379 redis:latest

docker: Error response from daemon: Conflict. The container name "/local-redis" is already in use by container "d7532e8e07eda16696991be756f16a983f49b82035ed57a286f0bc5f0aaafb22". You have to remove (or rename) that container to be able to reuse that name.

Run 'docker run --help' for more information


## 3. System Architecture Overview

Our system follows a layered architecture with distinct components handling specific responsibilities:

```
┌─────────────────────────────────┐
│          Client Layer           │
│ (Notebook, run_analysis.py)     │
└────────────────┬────────────────┘
                 │
┌────────────────▼────────────────┐
│        API Service Layer        │
│ (main_security_agent_server.py) │
└────────────────┬────────────────┘
                 │
┌────────────────▼────────────────┐
│           Agent Layer           │
│ (LangChain, LangGraph, ReAct)   │
└────────────────┬────────────────┘
                 │
┌────────────────▼────────────────┐
│       Tools Layer               │
│ (MCP Server, mcp_cve_server.py) │
└────────────────┬────────────────┘
                 │
┌────────────────▼────────────────┐
│       Storage Layer             │
│ (FAISS, Redis, SQLite)          │
└─────────────────────────────────┘
```

**Why this architecture:** 

1. **Separation of Concerns**: Each layer has a distinct responsibility
2. **Scalability**: Components can be scaled independently
3. **Resilience**: Failures in one layer don't cascade to others
4. **Maintainability**: Easier to update or replace individual components
5. **Testing**: Components can be tested in isolation

### Key Project Files and Their Roles

```
.
├── main_security_agent_server.py           
│     # FastAPI server:  
│     # • /analyze_incidents endpoint enforces idempotency via unique request_id  
│     # • Config-driven defaults (batch_size, model_name, Redis URL)  
│     # • Orchestrates agent logic (separation of concerns)  
│     # • Captures run metadata & logs (observability)  
│ 
├── mcp_cve_server.py                       
│     # MCP tool server:  
│     # • Defines CVE search tools (semantic, keyword, schema lookups)  
│     # • @timing_metric & @cache_result for latency logging & caching  
│     # • Purely “tools”—keeps retrieval logic out of agent core  
│ 
├── run_analysis.py                         
│     # CLI batch runner:  
│     # • Reads config (API URL, concurrency, batch_size)  
│     # • Fires off concurrent requests to the API  
│ 
├── data/                                   
│   ├── incidents.json                      
│   ├── kev.json                            
│   ├── nvd_subset.json                     
│   └── vectorstore/                        
│         # FAISS indexes for kev, nvd, and incident_history  
│ 
├── setup/                                  
│   ├── download_cve_data.py                
│   │     # Pulls & filters KEV/NVD feeds (config-driven URLs)  
│   ├── build_faiss_KEV_and_NVD_indexes.py  
│   │     # Embeds & builds FAISS indexes for KEV & NVD  
│   ├── build_historical_incident_analyses_index.py  
│   │     # Builds FAISS index on dummy/historical incidents  
│   └── setup_initial_CVE_data_and_FAISS_indexes.sh  
│         # Shell wrapper: runs all setup steps in one go  
│ 
└── utils/                                  
    ├── retrieval_utils.py                  
    │     # Embedding init, batch_match, historical_context, MMR search  
    ├── flatteners.py                       
    │     # JSON → text flatteners for incidents, KEV, NVD  
    ├── prompt_utils.py                     
    │     # Prompt templates, Pydantic models, prompt‐generation logic  
    ├── datastore_utils.py                  
    │     # SQLAlchemy DB ops (incident & run_metadata tables)  
    └── decorators.py                       
          # @timing_metric & @cache_result for observability & idempotency  
``` 
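
The `@timing_metric` decorator mentioned in the tree above is not shown in this notebook. A minimal sketch of what such a decorator could look like is given here; the body is an assumption, with the log message modeled on the "completed in 0.83s" lines that appear in the cell output later on.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)


def timing_metric(func):
    """Log how long each call to func takes (hypothetical sketch)."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether func returned or raised, so failures are timed too
            elapsed = time.perf_counter() - start
            logger.info("%s completed in %.2fs", func.__name__, elapsed)

    return wrapper


@timing_metric
def add(a: int, b: int) -> int:
    return a + b


result = add(2, 3)
```
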
## 4. Data Ingestion & Preprocessing

In this section, we'll explore the input data and prepare it for analysis. We need to:

1. Load and inspect incident data
2. Retrieve and prepare CVE data (KEV and NVD)
3. Create flattened text representations for vector embedding

**Why we do this:** Proper data preparation is critical for effective semantic search. By flattening complex JSON structures into searchable text, we enable the embedding model to capture semantic relationships between incidents and vulnerabilities.

### Set Up Data Sources

In addition to the `incidents.json` file, we draw on two main data sources for the application.

The first is the CISA [Known Exploited Vulnerabilities Catalog](https://www.cisa.gov/known-exploited-vulnerabilities-catalog) (KEV).
- This contains data on vulnerabilities that have _actually been used in attacks_.

The second is the government's [National Vulnerability Database](https://nvd.nist.gov/vuln) (NVD).
- This contains data on known vulnerabilities, often reported by vendors (SAP, Microsoft, etc.), and is published so that developers can handle vulnerabilities safely as they are discovered.

#### Pulling the Data

The NVD API is rate limited (5 requests per 30 seconds), a limit we would far exceed if we queried CVEs individually.  Additionally, **the input data (`incidents.json`) does not specify any CVE IDs**, so the goal of our first step is to acquire as much data as we can up front and then use NLP to map incidents to CVEs for better risk assessment.
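
To make the rate-limit concern concrete, here is a back-of-the-envelope estimate. It assumes one request per CVE and uses the number of CVE items reported in the setup output near the end of this notebook (12,137); both are simplifications for illustration.

```python
import math

TOTAL_CVES = 12_137        # "Total CVE items" from the setup output in this notebook
REQUESTS_PER_WINDOW = 5    # rate limit quoted above
WINDOW_SECONDS = 30

# Number of 30-second windows needed if every CVE costs one request
windows = math.ceil(TOTAL_CVES / REQUESTS_PER_WINDOW)
total_seconds = windows * WINDOW_SECONDS

print(f"{windows} windows, about {total_seconds / 3600:.1f} hours")
```

Under these assumptions, per-CVE lookups would take most of a day, which is why pulling the bulk feed once is the more practical route.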

We begin by pulling the entire current KEV dataset JSON and the NVD.zip for a given year (here we used 2025):
```python
# NVD feed URL for 2025
NVD_FEED_URL = (
    "https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2025.json.zip"
)
# CISA KEV (Known-Exploited Vulnerabilities) JSON feed
KEV_FEED_URL = (
    "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
)
```

The KEV JSON is manageable (around 1.2 MB), but the unzipped NVD JSON for 2025 (thus far) is almost 30 MB.  That isn't impossibly large, but once token counts, latency, and most importantly accuracy enter the equation, we want to narrow the "canonical dataset" to relevant entries where possible.

We use a simple but effective method: we extract software names from the `affected_assets.installed_software` field of `incidents.json`, then use them as a filter to create a second JSON file (`nvd_subset.json`) containing only the entries from the larger NVD dataset that pertain to software affected in our incidents.

```python
# setup/download_cve_data.py
import json

# INCIDENTS_PATH is defined elsewhere in this script; it is assumed to be a
# pathlib.Path pointing at data/incidents.json


def extract_vendor_filters() -> list[str]:
    """Read all installed_software names from incidents and return sorted lowercase tokens."""
    incidents = json.loads(INCIDENTS_PATH.read_text())
    filters = set()
    for inc in incidents:
        for asset in inc.get("affected_assets", []):
            for sw in asset.get("installed_software", []):
                name = sw.get("name", "").strip().lower()
                if not name:
                    continue  # skip entries without a usable name
                # Keep the full product name ...
                filters.add(name)
                # ... and its first word, which is usually the vendor
                filters.add(name.split()[0])
    return sorted(filters)

This brings the size down to about 8.5 MB, which is much more manageable and gives our agent-to-be a narrower scope to search.
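
The filtering step itself is not shown above. A minimal sketch of how the tokens could be applied is given below, assuming the NVD 1.1 feed layout (`cve.CVE_data_meta.ID` and `cve.description.description_data`). The function name `filter_nvd_items` and the toy records are hypothetical, and plain substring matching is an assumption that can produce false positives (for example, `java` also matches `javascript`).

```python
def filter_nvd_items(items: list[dict], filters: list[str]) -> dict[str, dict]:
    """Keep CVE items whose description mentions any filter token, keyed by CVE ID."""
    subset: dict[str, dict] = {}
    for item in items:
        cve = item.get("cve", {})
        cve_id = cve.get("CVE_data_meta", {}).get("ID")
        descriptions = cve.get("description", {}).get("description_data", [])
        text = " ".join(d.get("value", "") for d in descriptions).lower()
        if cve_id and any(token in text for token in filters):
            subset[cve_id] = item
    return subset


# Toy records in the NVD 1.1 shape (made-up IDs and descriptions)
toy_items = [
    {"cve": {"CVE_data_meta": {"ID": "CVE-0000-0001"},
             "description": {"description_data": [
                 {"lang": "en", "value": "Buffer overflow in Cisco IOS XE web UI."}]}}},
    {"cve": {"CVE_data_meta": {"ID": "CVE-0000-0002"},
             "description": {"description_data": [
                 {"lang": "en", "value": "Cross-site scripting in an unrelated CMS."}]}}},
]

subset = filter_nvd_items(toy_items, ["cisco", "cisco ios xe"])
```

Keying the result by CVE ID mirrors how `nvd_subset.json` is accessed later in this notebook (`nvd_data.keys()`).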

In [4]:
import json
from pathlib import Path
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Ensure we have an OpenAI API key
if not os.getenv("OPENAI_API_KEY"):
    raise ValueError(
        "Please set the OPENAI_API_KEY environment variable in your .env file"
    )

# Load incidents data
data_dir = Path('data')
with open(data_dir / 'incidents.json') as f:
    incidents = json.load(f)

# Display overview statistics
print(f"Total incidents: {len(incidents)}")
print("Fields in first incident:", list(incidents[0].keys()))

# Display first incident in pretty format
print("\nSample incident details:")
print(json.dumps(incidents[0], indent=2))

Total incidents: 39
Fields in first incident: ['incident_id', 'timestamp', 'title', 'description', 'affected_assets', 'observed_ttps', 'indicators_of_compromise', 'initial_findings']

Sample incident details:
{
  "incident_id": "INC-2023-08-01-001",
  "timestamp": "2023-08-01T09:15:00Z",
  "title": "Unauthorized Access Attempt on VPN Gateway",
  "description": "Multiple failed login attempts followed by a successful connection from an unusual geographic location on the main VPN gateway.",
  "affected_assets": [
    {
      "hostname": "vpn-gateway-01",
      "ip_address": "203.0.113.1",
      "os": "Cisco IOS XE",
      "installed_software": [
        {
          "name": "Cisco IOS XE",
          "version": "17.3.4a"
        }
      ],
      "role": "VPN Gateway"
    }
  ],
  "observed_ttps": [
    {
      "framework": "MITRE ATT&CK",
      "id": "T1110",
      "name": "Brute Force"
    },
    {
      "framework": "MITRE ATT&CK",
      "id": "T1078",
      "name": "Valid Accounts"
    

In [5]:
# Load KEV (Known Exploited Vulnerabilities) data
with open(data_dir / 'kev.json') as f:
    kev_data = json.load(f)
    
print(f"KEV entries: {len(kev_data.get('vulnerabilities', []))}")
print("\nSample KEV entry:")
print(json.dumps(kev_data.get('vulnerabilities', [])[0], indent=2))

KEV entries: 1351

Sample KEV entry:
{
  "cveID": "CVE-2023-38950",
  "vendorProject": "ZKTeco",
  "product": "BioTime",
  "vulnerabilityName": "ZKTeco BioTime Path Traversal Vulnerability",
  "dateAdded": "2025-05-19",
  "shortDescription": "ZKTeco BioTime contains a path traversal vulnerability in the iclock API that allows an unauthenticated attacker to read arbitrary files via supplying a crafted payload.",
  "requiredAction": "Apply mitigations per vendor instructions, follow applicable BOD 22-01 guidance for cloud services, or discontinue use of the product if mitigations are unavailable.",
  "dueDate": "2025-06-09",
  "knownRansomwareCampaignUse": "Unknown",
  "notes": "https://www.zkteco.com/en/Security_Bulletinsibs ; https://nvd.nist.gov/vuln/detail/CVE-2023-38950",
  "cwes": [
    "CWE-22"
  ]
}


In [6]:
# Load NVD (National Vulnerability Database) data
with open(data_dir / 'nvd_subset.json') as f:
    nvd_data = json.load(f)
    
print(f"NVD entries: {len(nvd_data)}")

# Display a sample NVD entry
sample_cve_id = list(nvd_data.keys())[0]
print(f"\nSample NVD entry ({sample_cve_id}):")
# Just show the description part to keep it manageable
desc = nvd_data[sample_cve_id].get("cve", {}).get("description", {}).get("description_data", [{}])[0].get("value", "")
print(desc)

NVD entries: 3062

Sample NVD entry (CVE-2025-0020):
Violation of Secure Design Principles, Hidden Functionality, Incorrect Provision of Specified Functionality vulnerability in ArcGIS (Authentication) allows Privilege Abuse, Manipulating Hidden Fields, Configuration/Environment Manipulation.

The ArcGIS client_credentials OAuth 2.0 API implementation does not adhere to the RFC/standards; This hidden (known and by-design, but undocumented) functionality enables a requestor (Referred to as client in RFC 6749) to request an, undocumented, custom token expiration from ArcGIS (Referred to as authorization server in RFC 6749).


## 5. Text Flattening for Vector Embedding

Before building our vector indexes, we need to convert the structured data into a flattened text format suitable for embedding. Let's examine our flattening strategies:

**Why we do this:** Embeddings work on raw text, but our data is in complex JSON structures. Flattening transforms these structures into searchable text while preserving the semantic meaning of the data.

Here is an example implementation, followed by an example of using the flatteners in the wild.

In [7]:
# utils/flatteners.py example implementation
def flatten_incident(incident: dict) -> str:
    """
    Flatten an incident into a text representation for embedding.
    
    Args:
        incident: The incident dict to flatten
        
    Returns:
        A string representation of the incident
    """
    # Start with the title and description
    text = f"{incident.get('title', '')}\n{incident.get('description', '')}\n"
    
    # Add initial findings
    text += f"{incident.get('initial_findings', '')}\n"
    
    # Add affected assets
    for asset in incident.get("affected_assets", []):
        text += f"Asset: {asset.get('hostname', '')} ({asset.get('ip_address', '')})\n"
        text += f"OS: {asset.get('os', '')}\n"
        text += f"Role: {asset.get('role', '')}\n"
        
        # Add installed software
        for sw in asset.get("installed_software", []):
            text += f"Software: {sw.get('name', '')} {sw.get('version', '')}\n"
    
    # Add TTPs (Tactics, Techniques, and Procedures)
    for ttp in incident.get("observed_ttps", []):
        text += f"TTP: {ttp.get('name', '')} ({ttp.get('id', '')})\n"
    
    # Add indicators of compromise
    for ioc in incident.get("indicators_of_compromise", []):
        text += f"IoC: {ioc.get('type', '')}: {ioc.get('value', '')} - {ioc.get('context', '')}\n"
    
    return text

In [8]:
from utils.flatteners import flatten_kev, flatten_nvd, flatten_incident
from langchain.docstore.document import Document

# Example KEV entry flattening
sample_kev = kev_data.get('vulnerabilities', [])[0]
doc_kev = flatten_kev(sample_kev)
print("Flattened KEV document:")
print(doc_kev.page_content[:200], "...")
print(f"\n{'-'*50}\n")

# Example NVD entry flattening
sample_nvd = list(nvd_data.values())[0]
doc_nvd = flatten_nvd(sample_nvd)
print("\nFlattened NVD document:")
print(doc_nvd.page_content[:200], "...")
print(f"\n{'-'*50}\n")

# Example Incident flattening
doc_inc = Document(
    page_content=flatten_incident(incidents[0]), 
    metadata={"incident_id": incidents[0]["incident_id"]}
)
print("\nFlattened Incident document:")
print(doc_inc.page_content[:200], "...")


Flattened KEV document:
CVE CVE-2023-38950
ZKTeco
BioTime
ZKTeco BioTime Path Traversal Vulnerability
ZKTeco BioTime contains a path traversal vulnerability in the iclock API that allows an unauthenticated attacker to read a ...

--------------------------------------------------


Flattened NVD document:
CVE CVE-2025-0020
Violation of Secure Design Principles, Hidden Functionality, Incorrect Provision of Specified Functionality vulnerability in ArcGIS (Authentication) allows Privilege Abuse, Manipulat ...

--------------------------------------------------


Flattened Incident document:
Unauthorized Access Attempt on VPN Gateway
Multiple failed login attempts followed by a successful connection from an unusual geographic location on the main VPN gateway.
Credential stuffing or brute  ...
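
For comparison with `flatten_incident`, here is a sketch of a KEV flattener that reproduces the line layout seen in the output above. It is an approximation: the real `flatten_kev` returns a `Document` (note the `.page_content` access), whereas this hypothetical `flatten_kev_text` returns a plain string, and the actual function may include fields beyond the five shown.

```python
def flatten_kev_text(entry: dict) -> str:
    """Flatten a KEV entry into newline-separated text (hypothetical sketch)."""
    lines = [
        f"CVE {entry.get('cveID', '')}",
        entry.get("vendorProject", ""),
        entry.get("product", ""),
        entry.get("vulnerabilityName", ""),
        entry.get("shortDescription", ""),
    ]
    # Drop empty fields so missing keys do not leave blank lines in the text
    return "\n".join(line for line in lines if line)


sample = {
    "cveID": "CVE-2023-38950",
    "vendorProject": "ZKTeco",
    "product": "BioTime",
    "vulnerabilityName": "ZKTeco BioTime Path Traversal Vulnerability",
    "shortDescription": "ZKTeco BioTime contains a path traversal vulnerability in the iclock API.",
}
text = flatten_kev_text(sample)
```
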


## 6. Building Vector Indexes

Now we build FAISS vector indexes for efficient semantic search across our data sources. This process involves:

1. Initializing the OpenAI embeddings model
2. Creating FAISS indexes for KEV, NVD, and historical incident data
3. Setting up utilities for semantic search

**Why we do this:** Vector indexes enable fast similarity search over large datasets. By precomputing embeddings and storing them in FAISS indexes, we can perform semantic searches in milliseconds rather than having to recompute embeddings for each query.
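
The idea behind similarity search can be shown with a brute-force toy version. This is only a conceptual sketch: the three-dimensional vectors are made up, cosine similarity is one common measure among several, and FAISS uses optimized index structures (and possibly a different distance metric) rather than this loop.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2):
    """Score every stored vector against the query and return the k best matches."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Precomputed "embeddings" (toy values standing in for real model output)
toy_index = {
    "kev-doc": [0.9, 0.1, 0.0],
    "nvd-doc": [0.7, 0.7, 0.0],
    "unrelated-doc": [0.0, 0.0, 1.0],
}

hits = top_k([1.0, 0.0, 0.0], toy_index, k=2)
```

Because the document vectors are stored ahead of time, only the query needs to be embedded at search time.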

### Indexing

Initially, we simply gave the agent tools to search the JSON files directly, but this was neither as efficient nor as accurate as the alternatives, and it was not a good long-term approach.

The static, reasonably sized, and consistently structured entries in both KEV and NVD made them natural candidates for chunking, embedding, and indexing, each into its own FAISS index.

We used `langchain_openai`'s `OpenAIEmbeddings` with `text-embedding-3-small` (the default in our setup scripts) for both performance and low cost given the context of this project.

We wrote the setup scripts to run primarily via CLI arguments, since they are not intended to be called frequently, if at all, by our application later.  They are instead intended to be one-time runs that "initialize" a system mimicking what a production environment would likely already have (Weaviate/Pinecone, historical data, known incidents, etc.).

```python
# setup/build_faiss_KEV_and_NVD_indexes.py
# ---------- CLI --------------
parser = argparse.ArgumentParser()
parser.add_argument("--model", default="text-embedding-3-small")
parser.add_argument("--topk-test", type=int, default=3)
args = parser.parse_args()

embeddings = OpenAIEmbeddings(model=args.model, show_progress_bar=True)
```

We then use these embeddings to build our indexes:

```python
def build_kev_index():
    # ---------- Build / Skip KEV index ----------
    # Paths defined locally, mirroring build_nvd_index() below
    kev_json = DATA_DIR / "kev.json"
    kev_out  = OUT_DIR / "kev"

    if index_is_fresh(kev_json, kev_out):
        print("✅ KEV index up-to-date – skipping build")
    else:
        print("🔄 Building KEV index …")
        kev_raw  = json.load(kev_json.open())
        kev_docs = [flatten_kev(e) for e in kev_raw["vulnerabilities"]]
        FAISS.from_documents(kev_docs, embeddings).save_local(kev_out)
        print("✅ Saved KEV index to data/vectorstore/kev\n")

def build_nvd_index():
    # ---------- Build / Skip NVD index ----------
    nvd_json = DATA_DIR / "nvd_subset.json"
    nvd_out  = OUT_DIR / "nvd"

    if index_is_fresh(nvd_json, nvd_out):
        print("✅ NVD index up-to-date – skipping build")
    else:
        print("🔄 Building NVD index …")
        nvd_raw  = json.load(nvd_json.open())
        nvd_docs = [flatten_nvd(item) for item in nvd_raw.values()]
        FAISS.from_documents(nvd_docs, embeddings).save_local(nvd_out)
        print("✅ Saved NVD index to data/vectorstore/nvd\n")
```

**Note:** Each build function first checks whether the index is "fresh".  The index is rebuilt ONLY if the corresponding JSON file was modified more recently than the existing index (and, of course, if no index exists yet).
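
The freshness check is not shown in the excerpt. One way it could be written is sketched below, comparing file modification times. The body is an assumption; it relies on `index.faiss` being the file name that LangChain's `FAISS.save_local` writes by default.

```python
import os
import tempfile
from pathlib import Path


def index_is_fresh(source_json: Path, index_dir: Path) -> bool:
    """Return True if a saved index exists and is at least as new as its source JSON."""
    index_file = index_dir / "index.faiss"  # assumed default file name
    if not index_file.exists():
        return False  # no index yet, so it must be built
    return index_file.stat().st_mtime >= source_json.stat().st_mtime


# Small demonstration using temporary files and explicit timestamps
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "kev.json"
    source.write_text("{}")
    index_dir = Path(tmp) / "kev"

    missing = index_is_fresh(source, index_dir)   # index not built yet

    index_dir.mkdir()
    (index_dir / "index.faiss").write_bytes(b"")
    os.utime(source, (1_000, 1_000))
    os.utime(index_dir / "index.faiss", (2_000, 2_000))
    fresh = index_is_fresh(source, index_dir)     # index newer than source

    os.utime(source, (3_000, 3_000))
    stale = index_is_fresh(source, index_dir)     # source updated afterwards
```
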

Here is an example implementation of initializing our embeddings and indexes:

In [9]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

from utils.retrieval_utils import DATA_DIR

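# Module-level handle so initialize_faiss_indexes() can check whether it is set
embeddings = None

def initialize_openai_embeddings():
    """
    Initialize the global OpenAI embeddings object for vector representations.
    """
    global embeddings
    # Use the same model the indexes were built with; vectors from a different
    # model are not comparable to the ones stored in the indexes
    embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

def initialize_faiss_indexes():
    """
    Initialize global FAISS vector indexes for different vulnerability databases.
    """
    global KEV_FAISS, NVD_FAISS, INCIDENT_HISTORY_FAISS
    if embeddings is None:
        initialize_openai_embeddings()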

    # Load pre-built FAISS indexes
    KEV_FAISS = FAISS.load_local(DATA_DIR / "kev", embeddings,
                              allow_dangerous_deserialization=True)
    
    NVD_FAISS = FAISS.load_local(DATA_DIR / "nvd", embeddings,
                              allow_dangerous_deserialization=True)
    
    INCIDENT_HISTORY_FAISS = FAISS.load_local(DATA_DIR / "incident_analysis_history", 
                               embeddings, allow_dangerous_deserialization=True)

In practice, we import these functions from `utils/retrieval_utils.py`:

In [10]:
from utils.retrieval_utils import initialize_openai_embeddings, initialize_faiss_indexes

# Initialize OpenAI embeddings and FAISS indexes
initialize_openai_embeddings()
initialize_faiss_indexes()
print("Embeddings and FAISS indexes initialized successfully.")

09:15:11 INFO     [root] Initializing OpenAI embeddings...



09:15:12 INFO     [root] OpenAI embeddings initialized!
09:15:12 INFO     [root] initialize_openai_embeddings completed in 0.83s
09:15:12 INFO     [root] Loading KEV FAISS index...
09:15:12 DEBUG    [faiss.loader] Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
09:15:12 INFO     [faiss.loader] Loading faiss with AVX2 support.
09:15:12 INFO     [faiss.loader] Successfully loaded faiss with AVX2 support.
09:15:12 INFO     [faiss] Failed to load GPU Faiss: name 'GpuIndexIVFFlat' is not defined. Will not load constructor refs for GPU indexes. This is only an error if you're trying to use GPU Faiss.
09:15:12 INFO     [root] KEV FAISS index loaded!
09:15:12 INFO     [root] Loading NVD FAISS index...
09:15:12 INFO     [root] NVD FAISS index loaded!
09:15:12 INFO     [root] Loading Incident Analysis History FAISS index...
09:15:12 INFO     [root] Incident Analysis History FAISS index loaded!
09:15:12 INFO     [root] initialize_fai


### Normalizing

When we first gave the agent tools for semantic search, general JSON queries, and so on, so that it could map incidents to CVEs and reason about their ranking, we explored ways to normalize the results.

A common practice is to define tiers in advance and map findings onto them, giving "weights" that sum to a final risk score for an incident, but we found that:
1. The scores were often inflated and came out higher than felt accurate (e.g., a 0.65 for a user who marked an email as phishing without clicking on it)
2. The approach took little advantage of the model's latent intelligence.

So we let the model reason for itself about how to score each incident.

**However...**

For development, we have typically given the pipeline about 5 incident IDs per agent call to balance speed and accuracy.  Because the model scores each incident relative to the others in its current batch, otherwise similar incidents can receive different scores from one batch to the next, so **we needed a way to normalize the risk scores of similar incidents**.

This is when we decided to create **dummy data** mimicking the concept and JSON structure of the incidents.json file to build up some historical data:
```json
    {
        "incident_id": "INC-2024-02-22-002", 
        "title": "Cloud Infrastructure Unauthorized Access",
        "description": "Detected unauthorized access and potential data exfiltration from AWS cloud environment.",
        "initial_findings": "Multiple API calls from unusual geographic locations detected outside of normal business hours.",
        "timestamp": "2024-02-22T14:45:22Z",
        "affected_assets": [
            {
                "hostname": "AWS-PROD-APP-01",
                "ip_address": "10.0.1.45",
                "os": "Amazon Linux 2",
                "installed_software": [
                    {"name": "Docker", "version": "20.10.23"},
                    {"name": "Kubernetes", "version": "1.27.3"}
                ]
            }
        ],
        "observed_ttps": [
            {
                "id": "T1078.004",
                "name": "Valid Accounts: Cloud Accounts",
                "description": "Unauthorized access to cloud infrastructure"
            },
            {
                "id": "T1530",
                "name": "Data Transfer",
                "description": "Potential unauthorized data movement"
            }
        ],
        "indicators_of_compromise": [
            {
                "type": "container_id",
                "value": "a1b2c3d4e5f6",
                "description": "Suspicious container with unexpected network activity"
            },
            {
                "type": "library_name",
                "value": "boto3-exfil-mod",
                "description": "Potential custom exfiltration library"
            }
        ]
    }, ...
```

We manually gave an LLM examples of incidents and asked it for roughly 40 similar entries.  We then had it produce an analysis of each entry, using the format we would later enforce in agent responses via a parser.

That gave us a stand-in for what would be, in a production setting, historical data and analyses.

That left one more FAISS index to build.  Using a standardized flattener on the incidents to ensure quality retrieval later, we built a third index under `data/vectorstore/incident_analysis_history`, with effectively the same method as for the KEV and NVD indexes.

The flatteners that standardize the embedded text are covered in Section 5, and the index implementation is covered under Indexing above.

With that, we now had the initial setup process that, with the right tools, would allow our agent to perform its task:
1. Consistently (accounting for similar incidents in the past, mapping those to assigned risk values in prior analyses)
2. Quickly (removing latency of API calling on-demand, reducing file size for more specific scope where applicable)
3. Accurately (using semantic search combined with general JSON traversal, informed with the relevant data at hand, and with the tools we will see later)
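
To illustrate the consistency goal, here is one possible way historical scores could anchor a new score numerically. This is purely a sketch under stated assumptions: the notebook does not show how the agent actually uses retrieved history (it may simply receive it as prompt context), and `historical_anchor`, `normalize_score`, and the `trust_history` weight are hypothetical.

```python
from typing import Optional


def historical_anchor(neighbors: list[tuple[float, float]]) -> Optional[float]:
    """Similarity-weighted mean of past risk scores.

    neighbors holds (similarity, past_risk_score) pairs for similar historical
    incidents. Returns None when there is no usable history.
    """
    usable = [(sim, score) for sim, score in neighbors if sim > 0]
    total_weight = sum(sim for sim, _ in usable)
    if total_weight == 0:
        return None
    return sum(sim * score for sim, score in usable) / total_weight


def normalize_score(model_score: float,
                    neighbors: list[tuple[float, float]],
                    trust_history: float = 0.5) -> float:
    """Blend the model's score with the historical anchor (hypothetical weighting)."""
    anchor = historical_anchor(neighbors)
    if anchor is None:
        return model_score  # nothing to compare against
    return (1 - trust_history) * model_score + trust_history * anchor


# Two similar past incidents were scored 0.3 and 0.2 (made-up values)
history = [(0.9, 0.3), (0.6, 0.2)]
anchor = historical_anchor(history)
normalized = normalize_score(0.65, history)
```

In this toy case the inflated 0.65 is pulled toward the scores that similar incidents received in the past.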

The setup steps are now wrapped in a simple shell script, `setup/setup_initial_CVE_data_and_FAISS_indexes.sh`; the cell below runs the same steps:

In [11]:
%echo "Downloading CVE data..."
!python ./setup/download_cve_data.py

%echo "Building FAISS indexes for KEV and NVD..."
!python ./setup/build_faiss_KEV_and_NVD_indexes.py

%echo "Building historical incident analysis index..."
!python ./setup/build_historical_incident_analyses_index.py

%echo "Done!"

"Downloading CVE data..."
Using vendor filters: ['.net', '.net framework', 'adobe', 'adobe acrobat reader', 'apache', 'apache http server', 'apache struts', 'apache tomcat', 'axios', 'cisco', 'cisco ios xe', 'custom', 'custom c application', 'custom c++ application', 'custom java application', 'django', 'docker', 'ffmpeg', 'google', 'google chrome', 'java', 'java jre', 'jinja2', 'kubernetes', 'libxml2', 'microsoft', 'microsoft office', 'microsoft powershell', 'microsoft remote desktop services', 'microsoft windows', 'microsoft windows server', 'mozilla', 'mozilla firefox', 'mysql', 'mysql connector/j', 'nginx', 'nmap', 'node.js', 'openldap', 'openssh', 'openssl', 'php', 'php-fpm', 'pillow', 'postgresql', 'python', 'reportlab', 'sqlalchemy', 'wordpress']
NVD ZIP already present.
Raw NVD JSON already present.
Loading raw NVD JSON...
Total CVE items: 12137
Filtered down to 3062 CVEs. Saving subset...
Wrote subset to data\nvd_subset.json
Downloading KEV feed...
Wrote KEV data to data\kev.j


  File "C:\Users\Dan Guilliams\OneDrive\Code Projects\MCP_Agents_RADSecurity\.venv\Lib\site-packages\openai\resources\embeddings.py", line 128, in create
    return self._post(
  File "C:\Users\Dan Guilliams\OneDrive\Code Projects\MCP_Agents_RADSecurity\.venv\Lib\site-packages\openai\_base_client.py", line 1239, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "C:\Users\Dan Guilliams\OneDrive\Code Projects\MCP_Agents_RADSecurity\.venv\Lib\site-packages\openai\_base_client.py", line 1003, in request
    log.debug(
Message: 'HTTP Response: %s %s "%i %s" %s'
Arguments: ('POST', URL('https://api.openai.com/v1/embedd

"Building historical incident analysis index..."
09:15:40 INFO     [root] Testing Logger Levels...
09:15:40 DEBUG    [root] Debug logging initialized
09:15:40 INFO     [root] Info logging initialized
09:15:40 ERROR    [root] Error logging initialized
09:15:41 DEBUG    [faiss.loader] Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
09:15:41 INFO     [faiss.loader] Loading faiss with AVX2 support.
09:15:41 INFO     [faiss.loader] Successfully loaded faiss with AVX2 support.
09:15:41 INFO     [faiss] Failed to load GPU Faiss: name 'GpuIndexIVFFlat' is not defined. Will not load constructor refs for GPU indexes. This is only an error if you're trying to use GPU Faiss.
✅ Historical Incident Analyses index up-to-date – skipping build

🔎 top-3 Historical Incident Analyses matches for 'ransomware'
09:15:41 DEBUG    [openai._base_client] Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'idempotency_key

--- Logging error ---
Traceback (most recent call last):
  File "C:\Python312\Lib\logging\handlers.py", line 74, in emit
    self.doRollover()
  File "C:\Python312\Lib\logging\handlers.py", line 179, in doRollover
    self.rotate(self.baseFilename, dfn)
  File "C:\Python312\Lib\logging\handlers.py", line 115, in rotate
    os.rename(source, dest)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\Dan Guilliams\\OneDrive\\Code Projects\\MCP_Agents_RADSecurity\\logs\\timing_metrics.log' -> 'C:\\Users\\Dan Guilliams\\OneDrive\\Code Projects\\MCP_Agents_RADSecurity\\logs\\timing_metrics.log.1'
Call stack:
  File "C:\Users\Dan Guilliams\OneDrive\Code Projects\MCP_Agents_RADSecurity\setup\build_historical_incident_analyses_index.py", line 18, in <module>
    from utils.flatteners import flatten_incident
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1310,

"Done!"


## 7. Testing Semantic Search

Let's test our vector indexes by performing semantic searches over the different data sources:

**Why we do this:** Verifying semantic search capabilities ensures that our system can effectively identify relevant CVEs and historical incidents. This helps validate our data preparation and embedding strategies.

In [12]:
from utils.retrieval_utils import _search, KEV_FAISS, NVD_FAISS

# Perform a semantic search using an incident title
query_text = incidents[0]['title']
print(f"Search query: {query_text}")

# Search KEV database
kev_results = _search(KEV_FAISS, query_text, k=3)
print("\nTop 3 KEV matches:")
for i, r in enumerate(kev_results, 1):
    print(f"{i}. {r['cve_id']} (variance: {r['variance']:.3f})")
    print(f"   {r.get('preview', '')[:100]}...")

# Search NVD database
nvd_results = _search(NVD_FAISS, query_text, k=3)
print("\nTop 3 NVD matches:")
for i, r in enumerate(nvd_results, 1):
    print(f"{i}. {r['cve_id']} (variance: {r['variance']:.3f})")
    print(f"   {r.get('preview', '')[:100]}...")

Search query: Unauthorized Access Attempt on VPN Gateway
09:15:42 INFO     [root] _search completed in 0.45s

Top 3 KEV matches:
1. CVE-2015-1187 (variance: 1.897)
   CVE CVE-2015-1187 D-Link and TRENDnet Multiple Devices D-Link and TRENDnet Multiple Devices Remote C...
2. CVE-2006-2492 (variance: 1.930)
   CVE CVE-2006-2492 Microsoft Word Microsoft Word Malformed Object Pointer Vulnerability Microsoft Wor...
3. CVE-2020-3452 (variance: 1.905)
   CVE CVE-2020-3452 Cisco Adaptive Security Appliance (ASA) and Firepower Threat Defense (FTD) Cisco A...



09:15:43 INFO     [root] _search completed in 1.01s

Top 3 NVD matches:
1. CVE-2025-25724 (variance: 1.903)
   CVE CVE-2025-25724 list_item_verbose in tar/util.c in libarchive through 3.7.7 does not check an str...
2. CVE-2025-20153 (variance: 1.931)
   CVE CVE-2025-20153 A vulnerability in the email filtering mechanism of Cisco Secure Email Gateway co...
3. CVE-2025-30065 (variance: 1.938)
   CVE CVE-2025-30065 Schema parsing in the parquet-avro module of Apache Parquet 1.15.0 and previous v...



Let's also examine the core search function:

In [13]:
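from typing import Dict, List, Optional

# Import path assumed; adjust to match the LangChain version installed in the project
from langchain_community.vectorstores import FAISS

# NOTE: `embeddings` is assumed to be the OpenAI embeddings client initialized elsewhere in the project

def _search(
    store: FAISS,
    query: str,
    k: int = 5,
    use_mmr: bool = True,
    lambda_mult: float = 0.7,
    fetch_k: Optional[int] = None
) -> List[Dict]: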
    """
    Perform a semantic search on a given FAISS vector store.
    
    Args:
        store: The FAISS vector store to search
        query: The search query string
        k: Number of top results to return
        use_mmr: Use Maximal Marginal Relevance for diverse results
        lambda_mult: MMR trade-off in [0, 1]; values near 1 favor relevance, values near 0 favor diversity
        fetch_k: Number of documents to fetch before filtering for MMR
        
    Returns:
        List of search results with metadata and scores
    """
    if use_mmr:
        # embed the query once
        vec = embeddings.embed_query(query)
        # if fetch_k not provided, default to 2*k
        fk = fetch_k or (2 * k)
        # call the vector-based MMR-with-scores method
        pairs = store.max_marginal_relevance_search_with_score_by_vector(
            vec, k=k, fetch_k=fk, lambda_mult=lambda_mult,
        )
    else:
        # direct text-based similarity search (score included)
        pairs = store.similarity_search_with_score(query, k=k)

    # Format results
    out = []
    for doc, score in pairs:
        meta = doc.metadata.copy()
        meta["variance"] = float(score)
        meta["preview"] = ' '.join(doc.page_content.replace('\n', ' ').split())[:120]
        out.append(meta)

    return out
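
To make the roles of `fetch_k`, `k`, and `lambda_mult` more concrete, here is a small pure-Python sketch of the general Maximal Marginal Relevance technique. It is not LangChain's implementation, and `cosine` and `mmr_select` are illustrative names. The idea: a larger pool of candidates (the `fetch_k` documents) is retrieved first, then `k` of them are picked one at a time, each time trading relevance to the query against similarity to what has already been picked.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.7,
) -> List[int]:
    """Return indices of up to k candidates chosen by Maximal Marginal Relevance."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def mmr_score(i: int) -> float:
            relevance = cosine(query_vec, candidates[i])
            # Highest similarity to anything already selected (0.0 for the first pick)
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
# Candidate 1 is a near-duplicate of candidate 0; candidate 2 is less relevant but different
pool = [[1.0, 0.0], [0.99, 0.14], [0.6, 0.8]]
print(mmr_select(query, pool, k=2, lambda_mult=1.0))  # → [0, 1] (pure relevance)
print(mmr_select(query, pool, k=2, lambda_mult=0.3))  # → [0, 2] (diversity wins)
```

With pure relevance the near-duplicate is returned second; with a lower `lambda_mult` the selection skips it in favor of the more distinct candidate.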

## 8. Agent Architecture and MCP Tools

In this section, we'll explore the core agent architecture and the MCP (Model Context Protocol) tools it uses to analyze security incidents and identify relevant CVEs.

## 8.1 MCP Server: Tool Definitions

Our agent uses a toolkit of specialized functions for incident analysis. These tools are defined in `mcp_cve_server.py` and exposed via the Model Context Protocol (MCP) through a `FastMCP` server.

**Why we do this:** 
- MCP provides a standardized way for LLMs to interact with external tools
- Tools are defined with rich metadata (annotations) to guide the LLM
- The server handles caching, error handling, and logging consistently
- Tool definitions are separate from agent logic, enabling reuse

The tools in the project let the agent run semantic searches at will with either structured or unstructured input, query the JSON files for details as needed, retrieve static strings describing the schema of each JSON file to help with those queries, and more. All of them are centralized in `mcp_cve_server.py`.
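
The caching mentioned above can be pictured as a time-to-live (TTL) decorator. The following is a minimal in-memory sketch under stated assumptions: it is not the project's `cache_result` (which may be backed by Redis), and `ttl_cache_sketch`, its `clock` parameter, and `fake_search` are hypothetical names used only for illustration.

```python
import functools
import time
from typing import Any, Callable, Dict, Tuple


def ttl_cache_sketch(ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
    """Cache results per argument set for ttl_seconds (in-memory, single process)."""

    def decorator(func):
        store: Dict[str, Tuple[float, Any]] = {}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # repr() keeps the key usable even for unhashable arguments such as lists
            key = repr((args, sorted(kwargs.items())))
            now = clock()
            hit = store.get(key)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # still fresh: skip the underlying call
            value = func(*args, **kwargs)
            store[key] = (now, value)
            return value

        return wrapper

    return decorator


calls = []


@ttl_cache_sketch(ttl_seconds=30)
def fake_search(query: str, sources=("kev", "nvd")):
    calls.append(query)
    return {"query": query, "sources": list(sources)}


fake_search("vpn gateway")
fake_search("vpn gateway")  # served from the cache within the TTL
print(len(calls))  # → 1
```

A short TTL such as 30 seconds mainly absorbs repeated identical tool calls within a single agent run, without serving stale results for long.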

In [14]:
# Core tool definitions from mcp_cve_server.py
from typing import Any, Dict, List
from fastmcp import FastMCP
from mcp_cve_server import NVD_INDEX
from utils.decorators import timing_metric, cache_result
from utils.retrieval_utils import match_incident_to_cves, semantic_search_cves

mcp = FastMCP("cve")

@mcp.tool(annotations={
    "title": "Match Incident to CVEs using semantic search",
    "readOnlyHint": True,
    "destructiveHint": False,
    "idempotentHint": False,
    "openWorldHint": False
})
@cache_result(ttl_seconds=30)  # cache identical incident queries for 30s
def match_incident_to_cves_tool(incident_id: str, k: int = 5, use_mmr: bool = True) -> dict:
    """
    Match an incident to potentially relevant CVEs using semantic search.
    
    Args:
        incident_id: The ID of the incident to match
        k: Maximum number of matches to return
        use_mmr: Whether to use MMR for diversity
        
    Returns:
        Dict containing matching CVEs from KEV and NVD databases
    """
    return match_incident_to_cves(incident_id, k, use_mmr)

@mcp.tool(
  annotations={
    "title": "Semantic Free-Form CVE Search",
    "readOnlyHint": True,
    "destructiveHint": False,
    "idempotentHint": False,
    "openWorldHint": False
  }
)
@cache_result(ttl_seconds=30)  # cache identical free-form queries
def semantic_search_cves_tool(
    query: str,
    sources: List[str] = ["kev", "nvd", "historical"],
    k: int = 5,
    use_mmr: bool = False,
    lambda_mult: float = 0.7
) -> Dict[str, Any]:
    """
    Perform a semantic search for CVEs using a free-form query.
    Can search multiple indexes in a single call, which saves time and tokens by sparing the agent
    multiple tool calls against the available FAISS indexes.
    
    Args:
        query: Free-form search query
        sources: Which databases to search ("kev", "nvd", "historical")
        k: Maximum number of results per source
        use_mmr: Whether to use MMR for diversity
        lambda_mult: Diversity parameter for MMR
        
    Returns:
        Dict containing search results from specified sources
    """
    return semantic_search_cves(query, sources, k, use_mmr, lambda_mult)

@mcp.tool(annotations={
    "title": "Search NVD Entries for a specific match for ALL words in the query",
    "readOnlyHint": True,
    "destructiveHint": False,
    "idempotentHint": False,
    "openWorldHint": False
})
@cache_result(ttl_seconds=30)  # cache identical free-form queries
def search_nvd(query: str, limit: int = 10) -> list[dict]:
    """
    Return up to `limit` full CVE records whose fields match ALL words in `query`.
    Case-insensitive substring match over CVE ID, description, and any reference URLs.
    """
    qwords = query.lower().split()
    matches = []
    for cve_id, rec in NVD_INDEX.items():
        cve = rec.get("cve", {})
        # flatten searchable text; guard against a missing or empty description list
        desc_data = cve.get("description", {}).get("description_data") or [{}]
        desc = desc_data[0].get("value", "")
        refs = " ".join(
            r.get("url", "")
            for r in cve.get("references", {}).get("reference_data", [])
        )
        text = f"{cve_id} {desc} {refs}".lower()
        if all(w in text for w in qwords):
            # return the full record so the agent can inspect any fields
            matches.append(rec)
            if len(matches) >= limit:
                break
    return matches

09:15:44 INFO     [root] Initializing OpenAI embeddings...



09:15:44 INFO     [root] OpenAI embeddings initialized!
09:15:44 INFO     [root] initialize_openai_embeddings completed in 0.53s
09:15:44 INFO     [root] Loading KEV FAISS index...
09:15:44 INFO     [root] KEV FAISS index loaded!
09:15:44 INFO     [root] Loading NVD FAISS index...
09:15:44 INFO     [root] NVD FAISS index loaded!
09:15:45 INFO     [root] Loading Incident Analysis History FAISS index...
09:15:45 INFO     [root] Incident Analysis History FAISS index loaded!
09:15:45 INFO     [root] initialize_faiss_indexes completed in 0.05s
09:15:45 DEBUG    [mcp.server.lowlevel.server] Initializing server 'cve'
09:15:45 DEBUG    [mcp.server.lowlevel.server] Registering handler for ListToolsRequest
09:15:45 DEBUG    [mcp.server.lowlevel.server] Registering handler for CallToolRequest
09:15:45 DEBUG    [mcp.server.lowlevel.server] Registering handler for ListResourcesRequest
09:15:45 DEBUG    [mcp.server.lowlevel.server] Registering handler for ReadResourceRequest
09:15:45 DEBUG    [mcp.s


## 8.2 Prompt Engineering

Templated prompts are key for tracking performance over time and ensuring standardization. They are particularly well suited to projects like this one, where the input data sources are well formatted and consistent in theme.

**Why we do this:** Well-crafted prompts are critical for LLM performance. Our prompts are designed to:
- Provide clear instructions and context
- Include example formats for outputs
- Guide the agent to use appropriate tools at the right time
- Support structured JSON output via Pydantic models

We originally had the agent use the tools to perform a semantic search for each incident after being given a list of `incident_id` values. On inspection, we realized that the same three steps ALWAYS took place:

1. Use the `get_incident_tool` function to retrieve the incident's full JSON object
2. Use the `semantic_match_incident_tool` function to retrieve KEV and (potentially) NVD results
3. Use the more agnostic `semantic_search_cves_tool` function to search against similar incidents from the historical index.

We realized we could skip the latency and token usage by simply "preprocessing" that for the LLM and injecting it into a standard prompt template.
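
The preprocessing can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `preprocess_incident_sketch`, the `SearchFn` callables, and the `kev_threshold` value are hypothetical. The "(potentially) NVD" step is modeled on the log messages in this notebook, which indicate that NVD is searched when the best KEV score is above a threshold (lower scores mean closer matches).

```python
from typing import Callable, Dict, List

# A search function takes (text, k) and returns result dicts with a "variance" score
SearchFn = Callable[[str, int], List[Dict]]


def preprocess_incident_sketch(
    incident_text: str,
    search_kev: SearchFn,
    search_nvd: SearchFn,
    search_history: SearchFn,
    k: int = 3,
    kev_threshold: float = 1.5,  # hypothetical cut-off; lower score = closer match
) -> Dict[str, List[Dict]]:
    """Gather KEV, optional NVD, and historical matches for one incident."""
    kev = search_kev(incident_text, k)
    best = min((r["variance"] for r in kev), default=float("inf"))
    # Only pay for the NVD search when KEV did not produce a close enough match
    nvd = search_nvd(incident_text, k) if best > kev_threshold else []
    return {"kev": kev, "nvd": nvd, "historical": search_history(incident_text, k)}


def stub(results: List[Dict]) -> SearchFn:
    """Build a fake search function that returns canned results."""
    return lambda text, k: results[:k]


weak_kev = stub([{"cve_id": "CVE-KEV-EXAMPLE", "variance": 1.909}])
nvd = stub([{"cve_id": "CVE-NVD-EXAMPLE", "variance": 1.903}])
context = preprocess_incident_sketch("vpn gateway", weak_kev, nvd, stub([]))
print(sorted(context), len(context["nvd"]))
```

Because the result is a plain dictionary, it can be serialized and injected into the prompt template without any tool calls from the LLM.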

We break the template into a System message and a Human message, which are combined before being sent to the agent.

In [15]:
# Prompt template from utils/prompt_utils.py
SYSTEM_TMPL = """
You are a CVE‐analysis assistant. Analyze the following incidents and provide structured analysis.

Incident Details:
{incident_details}

Batch FAISS matches (KEV/NVD):
{batch_faiss_results}

Historical FAISS‐anchoring context:
{historical_faiss_results}

{format_instructions}

Now, when I ask you to analyze incidents, use the KEV/NVD context to inform your severity rankings and the historical context to normalize your severity rankings.
"""

# Human query example
query = """
I need you to help me analyze some security incidents and rank their actual severity, and identify potential CVE connections and details.
Let's start with a small sample to test the system:
1. Note the incident IDs and summaries you have available to you already.
2. For each incident:
    a.  Understand Incident Context: Reason about the affected assets, observed TTPs, and initial findings.
    b.  Identify Relevant CVEs: Determine which CVEs are potentially relevant based on the incident context and affected software/hardware, using LLM reasoning and potentially querying data sources.
    c.  Prioritize CVEs: Assess the risk and impact of relevant CVEs in the context of the specific incident, going beyond standard scores like CVSS.
    d.  Generate Analysis: Provide a brief, human-readable explanation of why certain CVEs are prioritized, linking them back to the incident details.
3. Finally, and most importantly, provide an organized list of all analyzed incidents in the following format:
{
    "incidents": [
        {
            "incident_id": "The ID of the incident that caused the error",
            "incident_summary": "A brief summary of the incident",
            "cve_ids": [
                {
                    "cve_id": "The CVE ID that is related to the incident",
                    "cve_summary": "A brief summary of the CVE and its relation to the incident",
                    "cve_relevance": "The estimated relevance level of the CVE match (0.0-1.0)",
                    "cve_risk_level": "The risk level of the CVE on a scale of (0.0-1.0)"
                }
            ],
            "incident_risk_level": "The risk level of the incident (0.0-1.0)",
            "incident_risk_level_explanation": "An explanation of the rationale for the risk level assessment"
        }
    ]
}
"""

## 8.3 Retrieval-Augmented Prompt Enhancement

As mentioned earlier in the notebook, we learned that we were better off injecting the semantic search results from KEV, NVD, and historical incidents into the prompt itself. This makes retrieval-augmented generation (RAG) an effective part of the agentic workflow.

**Test it!**

**Goal:** Demonstrate how `batch_get_historical_context` fetches structured past analyses for prompt injection.    
**Why we do this:** Shows the exact payload the agent will receive for historical context, ensuring transparency.

In [16]:
import json

from utils.retrieval_utils import batch_get_historical_context

batch_ids = [incidents[0]['incident_id'], incidents[1]['incident_id']]
historical_context = batch_get_historical_context(batch_ids)

print("Historical context for batch:")
print(json.dumps(historical_context, indent=2))

09:15:45 INFO     [root] Searching for similar incidents with k=3, MMR=True



09:15:45 INFO     [root] Found 3 similar incidents
09:15:45 INFO     [root] search_similar_incidents completed in 0.41s
09:15:45 INFO     [root] Retrieving analyses from database for 3 incidents...
09:15:45 INFO     [root] Retrieving analyses for 3 incidents...
09:15:45 INFO     [root] Successfully retrieved 0 analyses!
09:15:45 DEBUG    [root] Closed database session
09:15:45 INFO     [root] Retrieved 0 analyses
09:15:45 INFO     [root] get_similar_incidents_with_analyses completed in 0.43s
09:15:45 INFO     [root] Retrieving analyses for 3 incidents...
09:15:45 INFO     [root] Successfully retrieved 0 analyses!
09:15:45 DEBUG    [root] Closed database session
09:15:45 INFO     [root] Retrieved 0 analyses
09:15:45 INFO     [root] Searching for similar incidents with k=3, MMR=True



09:15:45 INFO     [root] Found 3 similar incidents
09:15:45 INFO     [root] search_similar_incidents completed in 0.26s
09:15:45 INFO     [root] Retrieving analyses from database for 3 incidents...
09:15:45 INFO     [root] Retrieving analyses for 3 incidents...
09:15:45 INFO     [root] Successfully retrieved 0 analyses!
09:15:45 DEBUG    [root] Closed database session
09:15:45 INFO     [root] Retrieved 0 analyses
09:15:45 INFO     [root] get_similar_incidents_with_analyses completed in 0.27s
09:15:45 INFO     [root] Retrieving analyses for 3 incidents...
09:15:45 INFO     [root] Successfully retrieved 0 analyses!
09:15:45 DEBUG    [root] Closed database session
09:15:45 INFO     [root] Retrieved 0 analyses
09:15:45 INFO     [root] batch_get_historical_context completed in 0.72s
Historical context for batch:
{
  "success": true,
  "results": [
    {
      "incident_id": "INC-2023-08-01-001",
      "historical_context": {
        "similar_incidents": [
          {
            "incident_i


## 8.4 Pydantic Output Parsing

We use Pydantic models to define the structure of the agent's output:

**Why we do this:** Structured outputs ensure:
- Consistency in the format of analyses
- Validation of required fields
- Clear typing for downstream processing
- Enforced schema compliance

In [17]:
# Pydantic models from utils/prompt_utils.py
from pydantic import BaseModel, Field
from langchain.output_parsers import PydanticOutputParser

class CVEInfo(BaseModel):
    """
    A Pydantic model for CVE information.
    This model defines the structure of the output from the CVE analysis.
    It includes fields for the CVE ID, summary, relevance, and risk level.
    """
    cve_id: str = Field(description="The CVE ID that is related to the incident")
    cve_summary: str = Field(description="A brief summary of the CVE and its relation to the incident")
    cve_relevance: float = Field(description="The estimated relevance level of the CVE match (0.0-1.0)")
    cve_risk_level: float = Field(description="The risk level of the CVE on a scale of (0.0-1.0)")

class IncidentAnalysis(BaseModel):
    """
    A Pydantic model for incident analysis.
    This model defines the structure of the output from the incident analysis.
    It includes fields for the incident ID, summary, list of related CVEs, and the risk level of the incident.
    """
    incident_id: str = Field(description="The ID of the incident that caused the error")
    incident_summary: str = Field(description="A brief summary of the incident")
    cve_ids: list[CVEInfo] = Field(description="List of related CVEs and their details")
    incident_risk_level: float = Field(description="The risk level of the incident (0.0-1.0)")
    incident_risk_level_explanation: str = Field(description="An explanation of the rationale for the risk level assessment")

class IncidentAnalysisList(BaseModel):
    incidents: list[IncidentAnalysis] = Field(description="List of incident analyses")

# Initialize the parser
parser = PydanticOutputParser(pydantic_object=IncidentAnalysisList)
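
Note that the field descriptions state a 0.0-1.0 range, but the models as shown do not enforce it. Pydantic's `Field` accepts `ge` and `le` constraints for this purpose. As a dependency-free alternative, the sketch below checks the parsed JSON after the fact; `find_out_of_range_scores` is a hypothetical helper, and the sample IDs and scores are made-up placeholders rather than real results.

```python
from typing import Any, Dict, List


def find_out_of_range_scores(analysis: Dict[str, Any]) -> List[str]:
    """Return a description of every score that is missing or outside [0.0, 1.0]."""
    problems: List[str] = []
    for inc in analysis.get("incidents", []):
        incident_id = inc.get("incident_id", "?")
        checks = [("incident_risk_level", inc.get("incident_risk_level"))]
        for cve in inc.get("cve_ids", []):
            cve_id = cve.get("cve_id", "?")
            checks.append((f"{cve_id}.cve_relevance", cve.get("cve_relevance")))
            checks.append((f"{cve_id}.cve_risk_level", cve.get("cve_risk_level")))
        for name, value in checks:
            if not isinstance(value, (int, float)) or not 0.0 <= value <= 1.0:
                problems.append(f"{incident_id}: {name}={value!r}")
    return problems


# Placeholder payload shaped like IncidentAnalysisList (values are illustrative only)
sample = {
    "incidents": [
        {
            "incident_id": "INC-EXAMPLE",
            "incident_risk_level": 0.8,
            "cve_ids": [
                {"cve_id": "CVE-0000-0000", "cve_relevance": 1.4, "cve_risk_level": 0.5}
            ],
        }
    ]
}
print(find_out_of_range_scores(sample))
```

A check like this could run right after parsing, so that out-of-range scores are caught before an analysis is persisted or added to the historical index.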

**Test it!**

This leads us to a generalized `generate_prompt` function that allows for growth and modularity over time:

In [18]:
from utils.prompt_utils import generate_prompt, parser
from utils.retrieval_utils import batch_match_incident_to_cves, batch_get_historical_context

# Prepare batch FAISS and historical results for a sample
batch_results = batch_match_incident_to_cves(start_index=0, batch_size=2, top_k=3)
historical_results = batch_get_historical_context(incident_ids=[res['incident_id'] for res in batch_results['results']], top_k=2)

# Generate the prompt messages
prompt_messages = generate_prompt(
    query="Analyze these incidents and output JSON per Pydantic schema.",
    batch_faiss_results=batch_results,
    historical_faiss_results=historical_results
)

# Show the system and human messages
print("System message (click the output to scroll through the results):")
print(prompt_messages[0].content[:500])

print(f"\n{'-'*50}\n")

print("Human message preview (this will match the prompt sent just above this in the generate_prompt call):")
print(prompt_messages[1].content)

09:15:45 INFO     [root] Matching incident against KEV/NVD databases. KEV k=3, NVD k=3
09:15:45 DEBUG    [root] Incident details: {'incident_id': 'INC-2023-08-13-039', 'timestamp': '2023-08-13T14:30:00Z', 'title': 'Open Redirect Vulnerability Exploited', 'description': 'Security logs show users being redirected to external, potentially malicious websites via a trusted internal URL.', 'affected_assets': [{'hostname': 'url-shortener-service', 'ip_address': '192.168.1.160', 'os': 'Ubuntu 20.04 LTS', 'installed_software': [{'name': 'Node.js', 'version': '14.17.0'}], 'role': 'Internal Service'}], 'observed_ttps': [{'framework': 'MITRE ATT&CK', 'id': 'T0864', 'name': 'CLIENT-SIDE ATTACK: Open Redirect'}], 'indicators_of_compromise': [{'type': 'http_request_parameter', 'value': 'redirect_url=http://malicious.site', 'context': 'Observed request parameter'}, {'type': 'user_behavior', 'value': 'clicked link leading to external site', 'context': 'Observed user action'}], 'initial_findings': 'Open


09:15:47 INFO     [root] _search completed in 1.81s
09:15:47 INFO     [root] Found 3 KEV matches
09:15:47 INFO     [root] semantic_match_incident_kev completed in 1.82s
09:15:47 INFO     [root] Lowest KEV variance: 1.909
09:15:47 INFO     [root] KEV score above threshold, searching NVD database
09:15:47 DEBUG    [root] Flattening incident text for NVD search
09:15:47 DEBUG    [root] Searching NVD index with k=3, MMR=True



09:15:47 INFO     [root] _search completed in 0.29s
09:15:47 INFO     [root] Found 3 NVD matches
09:15:47 INFO     [root] semantic_match_incident_nvd completed in 0.30s
09:15:47 INFO     [root] semantic_match_incident completed in 2.12s
09:15:47 INFO     [root] match_incident_to_cves completed in 2.13s
09:15:47 INFO     [root] Matching incident against KEV/NVD databases. KEV k=3, NVD k=3
09:15:47 DEBUG    [root] Incident details: {'incident_id': 'INC-2023-08-13-038', 'timestamp': '2023-08-13T11:00:00Z', 'title': 'Subdomain Takeover Attempt', 'description': 'Threat intelligence alert indicates a dangling DNS record pointing to a service that is no longer active, potentially allowing subdomain takeover.', 'affected_assets': [{'hostname': 'old-blog.example.com', 'ip_address': 'N/A', 'os': 'N/A', 'installed_software': [], 'role': 'Legacy DNS Entry'}], 'observed_ttps': [{'framework': 'MITRE ATT&CK', 'id': 'T1584', 'name': 'Compromise Infrastructure'}, {'framework': 'MITRE ATT&CK', 'id': 'T1


09:15:48 INFO     [root] _search completed in 0.41s
09:15:48 INFO     [root] Found 3 KEV matches
09:15:48 INFO     [root] semantic_match_incident_kev completed in 0.41s
09:15:48 INFO     [root] Lowest KEV variance: 1.930
09:15:48 INFO     [root] KEV score above threshold, searching NVD database
09:15:48 DEBUG    [root] Flattening incident text for NVD search
09:15:48 DEBUG    [root] Searching NVD index with k=3, MMR=True
09:15:48 INFO     [root] _search completed in 0.19s



09:15:48 INFO     [root] Found 3 NVD matches
09:15:48 INFO     [root] semantic_match_incident_nvd completed in 0.20s
09:15:48 INFO     [root] semantic_match_incident completed in 0.62s
09:15:48 INFO     [root] match_incident_to_cves completed in 0.62s
09:15:48 INFO     [root] batch_match_incident_to_cves completed in 2.75s
09:15:48 INFO     [root] Searching for similar incidents with k=2, MMR=True



09:15:48 INFO     [root] Found 2 similar incidents
09:15:48 INFO     [root] search_similar_incidents completed in 0.20s
09:15:48 INFO     [root] Retrieving analyses from database for 2 incidents...
09:15:48 INFO     [root] Retrieving analyses for 2 incidents...
09:15:48 INFO     [root] Successfully retrieved 0 analyses!
09:15:48 DEBUG    [root] Closed database session
09:15:48 INFO     [root] Retrieved 0 analyses
09:15:48 INFO     [root] get_similar_incidents_with_analyses completed in 0.20s
09:15:48 INFO     [root] Retrieving analyses for 2 incidents...
09:15:48 INFO     [root] Successfully retrieved 0 analyses!
09:15:48 DEBUG    [root] Closed database session
09:15:48 INFO     [root] Retrieved 0 analyses
09:15:48 INFO     [root] Searching for similar incidents with k=2, MMR=True
09:15:48 INFO     [root] Found 2 similar incidents



09:15:48 INFO     [root] search_similar_incidents completed in 0.18s
09:15:48 INFO     [root] Retrieving analyses from database for 2 incidents...
09:15:48 INFO     [root] Retrieving analyses for 2 incidents...
09:15:48 INFO     [root] Successfully retrieved 0 analyses!
09:15:48 DEBUG    [root] Closed database session
09:15:48 INFO     [root] Retrieved 0 analyses
09:15:48 INFO     [root] get_similar_incidents_with_analyses completed in 0.19s
09:15:48 INFO     [root] Retrieving analyses for 2 incidents...
09:15:48 INFO     [root] Successfully retrieved 0 analyses!
09:15:48 DEBUG    [root] Closed database session
09:15:48 INFO     [root] Retrieved 0 analyses
09:15:48 INFO     [root] batch_get_historical_context completed in 0.41s
System message (click the output to scroll through the results):

You are a CVE‐analysis assistant. Analyze the following incidents and provide structured analysis.

Incident Details:
[
  {
    "incident_id": "INC-2023-08-13-039",
    "timestamp": "2023-08-13T14


## 8.4 LangChain ReAct Agent

We use LangChain's ReAct agent pattern to orchestrate the analysis process:

**Why we do this:** The ReAct agent pattern combines:
- **Re**asoning: Understanding the task and formulating a plan
- **Act**ion: Using tools to gather information
- Observation: Processing the results of tool calls
- Generation: Producing a final analysis
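
The four stages above can be sketched as a plain loop. This is a minimal illustration, not the LangGraph implementation: `scripted_model`, `lookup_cve`, and `TOOLS` are hypothetical stand-ins for the LLM and for the tools that the real agent loads from the MCP server.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool; the real agent loads its tools from the MCP server.
def lookup_cve(cve_id: str) -> str:
    known = {"CVE-2023-41265": "HTTP tunneling vulnerability in Qlik Sense"}
    return known.get(cve_id, "no record found")

TOOLS: Dict[str, Callable[[str], str]] = {"lookup_cve": lookup_cve}

def scripted_model(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action, action_input)."""
    if not any(line.startswith("Observation:") for line in history):
        return ("I need details on the CVE first.", "lookup_cve", "CVE-2023-41265")
    last_observation = history[-1].removeprefix("Observation: ")
    return ("I have enough context to answer.", "finish", last_observation)

def react_loop(question: str, max_steps: int = 5) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, action_input = scripted_model(history)  # Reason
        history.append(f"Thought: {thought}")
        if action == "finish":                                    # Generate
            return action_input
        observation = TOOLS[action](action_input)                 # Act
        history.append(f"Observation: {observation}")             # Observe
    return "step budget exhausted"

answer = react_loop("What is CVE-2023-41265?")
print(answer)
```

The `max_steps` budget is a design choice worth keeping in any real loop: it bounds cost when the model keeps calling tools without converging.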

In [19]:
# Agent setup from main_security_agent_server.py
import asyncio
import os
from langchain_mcp_adapters.tools import load_mcp_tools
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from mcp import ClientSession, StdioServerParameters, stdio_client

from utils.prompt_utils import generate_prompt
from utils.retrieval_utils import batch_get_historical_context, batch_match_incident_to_cves

# Setup server parameters and model
server_parameters = StdioServerParameters(
    command="python",
    args=["mcp_cve_server.py"],
)
model = ChatOpenAI(model="gpt-4o-mini", openai_api_key=os.getenv("OPENAI_API_KEY"))

async def run_agent(query, start_index, batch_size):
    async with stdio_client(server_parameters) as (read, write):
        # Initialize client session
        async with ClientSession(read, write) as session:
            await session.initialize()
            
            # Load MCP tools and create ReAct agent
            tools = await load_mcp_tools(session)
            agent = create_react_agent(model, tools, name="CVE_Agent")
            
            # Prepare incident batch and historical context
            batch_faiss_results = batch_match_incident_to_cves(
                batch_size=batch_size,
                start_index=start_index,
                top_k=3
            )
            
            historical_results = batch_get_historical_context(
                incident_ids=[r["incident_id"] for r in batch_faiss_results["results"]],
                top_k=2
            )
            
            # Generate prompt with all context
            prompt_messages = generate_prompt(
                query=query,
                batch_faiss_results=batch_faiss_results,
                historical_faiss_results=historical_results
            )
            
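            # Execute agent; ainvoke returns the final graph state as a dict
            state = await agent.ainvoke({"messages": prompt_messages})
            final_msg = state["messages"][-1]
            # Expose the final model call's token usage alongside the message history
            full_response = {**state, "usage_metadata": final_msg.usage_metadata}
            
            # Parse and validate results
            # (`parser` is assumed to be the output parser defined earlier in the notebook)
            analysis = parser.parse(final_msg.content)
            
            return analysis, full_response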

## 8.5 Running the Agent

Let's see the basic logic of how the agent is called.

**Note:** The asynchronous functions do not "play nicely" with Jupyter, which already runs its own event loop. Follow the README guide to start the main server and then run `run_analysis.py`; the cell below only outlines the expected structure.
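
The event-loop conflict can be reproduced in isolation. The sketch below assumes nothing about the project: `analyze_stub` is a hypothetical stand-in for the real async agent call, and `loop_is_running` simply reports whether the current thread already has an active loop (as it does inside a notebook cell).

```python
import asyncio

async def analyze_stub() -> str:
    """Hypothetical stand-in for the real async agent call."""
    await asyncio.sleep(0)
    return "analysis complete"

def loop_is_running() -> bool:
    """True inside Jupyter (or any coroutine), False in a plain script."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False

def run_from_script() -> str:
    # asyncio.run() creates its own event loop, so it only works
    # when no loop is already running in this thread.
    if loop_is_running():
        raise RuntimeError("Event loop already running; use `await analyze_stub()` instead.")
    return asyncio.run(analyze_stub())

if not loop_is_running():
    result = run_from_script()
    print(result)
```

In a notebook cell, the equivalent call is a top-level `await analyze_stub()`, since the kernel's loop is already active.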

In [20]:
import os
import asyncio
from utils.prompt_utils import AnalysisRequest

# Create a request to analyze a batch of incidents
async def analyze_incidents():
    request = AnalysisRequest(
        start_index=0,
        batch_size=2,
        request_id="demo-123",
        openai_api_key=os.getenv("OPENAI_API_KEY"),
        model_name="gpt-4o-mini"
    )
    
    # Run the analysis (`query` is assumed to be defined in an earlier cell)
    analysis, response = await run_agent(
        query=query,
        start_index=request.start_index,
        batch_size=request.batch_size
    )
    
    # Display the results
    print("Analysis Results:")
    for incident in analysis.incidents:
        print(f"\nIncident: {incident.incident_id}")
        print(f"Summary: {incident.incident_summary}")
        print(f"Risk Level: {incident.incident_risk_level}")
        print(f"Explanation: {incident.incident_risk_level_explanation}")
        print("\nRelevant CVEs:")
        for cve in incident.cve_ids:
            print(f"  - {cve.cve_id} (Relevance: {cve.cve_relevance}, Risk: {cve.cve_risk_level})")
            print(f"    {cve.cve_summary}")
    
    # Display usage metrics
    print("\nUsage Metrics:")
    print(f"Input tokens: {response['usage_metadata']['input_tokens']}")
    print(f"Output tokens: {response['usage_metadata']['output_tokens']}")
    print(f"Total tokens: {response['usage_metadata']['total_tokens']}")
    
    return analysis

# Run the analysis. This fails in Jupyter, which already runs its own event loop,
# but works when executed as a plain Python script.
try:
    analysis = asyncio.run(analyze_incidents())
except RuntimeError:
    print("Exception running in Jupyter Notebook")

Exception running in Jupyter Notebook


# 9. Persistence and Data Management

This section covers how the system persists analysis results and manages data for continuous learning and reference.

## 9.1 SQLite Database

Our system uses SQLite for structured persistence of incident analyses. This provides a lightweight, file-based database that requires no external server.

**Why we do this:** Persistent storage enables:
- Historical reference of past analyses
- Audit trails for security review
- Query capabilities for reporting and dashboards
- Cross-referencing between incidents
- Continuous learning for the system (analyses of similar incidents are retrieved and added to the prompt as context)
- "Fire-and-forget" functionality to execute batches asynchronously without waiting for a response

Below you can see the basic way we create two tables:
1. To keep track of incidents and resulting analyses
2. To keep track of run metrics (tokens, latency, tool usage, etc.)

In [21]:
from sqlalchemy import Integer, create_engine, Column, String, Float, Text, DateTime, Index
from sqlalchemy.orm import declarative_base, sessionmaker  # declarative_base lives in sqlalchemy.orm since 1.4
from datetime import datetime, UTC  # datetime.UTC requires Python 3.11+
import json
import os

# --- Configuration ---
# Swap in a postgresql:// URI as needed
DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///data/incident_analysis.db") 

# --- Setup SQLAlchemy ---
engine = create_engine(DATABASE_URL)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base = declarative_base()

# --- Model Definition ---
class IncidentRecord(Base):
    __tablename__ = "incident_analysis"

    id = Column(Integer, primary_key=True, autoincrement=True)
    request_id = Column(String, index=True)  # Uniquely identifies this *analysis*, not just incident
    incident_id = Column(String, index=True)       # Allows filtering or joining across repeated incidents
    created_at = Column(DateTime, default=lambda: datetime.now(UTC))
    incident_raw_json = Column(Text)               # Original incident details
    llm_analysis_json = Column(Text)               # Final analysis from the LLM
    llm_risk_score = Column(Float, nullable=True)  # For quick filtering/analytics
    model_name = Column(String)

    __table_args__ = (
        Index("ix_incident_id_request", "incident_id", "request_id"),  # Fast filtering if needed
    )

class RunMetadata(Base):
    __tablename__ = "run_metadata"
    id               = Column(Integer, primary_key=True, autoincrement=True)
    request_id       = Column(String, index=True, nullable=False)
    start_index      = Column(Integer, nullable=False)
    batch_size       = Column(Integer, nullable=False)
    input_tokens     = Column(Integer, nullable=True)
    output_tokens    = Column(Integer, nullable=True)
    total_tokens     = Column(Integer, nullable=True)
    tools_called     = Column(Text, nullable=True)    # JSON-encoded list of tool names
    duration_seconds = Column(Float, nullable=True)
    error_count      = Column(Integer, default=0)
    created_at       = Column(DateTime, default=lambda: datetime.now(UTC))

# --- Create Tables If Not Exists ---
# Defined once, after both models are registered on Base
def init_db():
    Base.metadata.create_all(bind=engine)



Running `dev/incident_dashboard.py`, a lightweight Streamlit UI, gives us a better idea of what's being captured.

Here is a glimpse of the more straightforward columns in our `incident_analysis` table:

![Run Incident Analysis Summary Example](Documentation/images/incident_analysis_streamlit_preview_screenshot.png)

Rows are grouped by batch: each asynchronous call requests a variable number of `incident_id` values, all associated with that call's unique `request_id`.

The dashboard gives a clean look at the more detailed JSON entries for the incident and resulting agent analysis as well:

![Run Incident Analysis Detail Example](Documentation/images/incident_analysis_streamlit_details_screenshot.png)

We can access this by running:
```bash
streamlit run dev/incident_dashboard.py
```
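
The same data can also be queried directly for reporting. The following is a minimal sketch using the standard-library `sqlite3` module against an in-memory database; the table is a cut-down copy of the `incident_analysis` columns, the rows are illustrative, and `latest_analyses` is a hypothetical helper rather than part of the project.

```python
import json
import sqlite3

# In-memory database with a cut-down copy of the `incident_analysis` columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE incident_analysis (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        request_id TEXT,
        incident_id TEXT,
        llm_analysis_json TEXT,
        llm_risk_score REAL
    )
    """
)

# Illustrative rows only; not real analysis output.
rows = [
    ("req-1", "INC-A", json.dumps({"incident_summary": "first pass"}), 0.40),
    ("req-2", "INC-A", json.dumps({"incident_summary": "second pass"}), 0.75),
    ("req-2", "INC-B", json.dumps({"incident_summary": "other incident"}), 0.20),
]
conn.executemany(
    "INSERT INTO incident_analysis (request_id, incident_id, llm_analysis_json, llm_risk_score) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

def latest_analyses(connection: sqlite3.Connection, min_risk: float = 0.0):
    """Return the most recent analysis per incident, highest risk first."""
    query = """
        SELECT incident_id, request_id, llm_risk_score, llm_analysis_json
        FROM incident_analysis AS a
        WHERE id = (SELECT MAX(id) FROM incident_analysis WHERE incident_id = a.incident_id)
          AND llm_risk_score >= ?
        ORDER BY llm_risk_score DESC
    """
    return [
        (incident_id, request_id, score, json.loads(payload))
        for incident_id, request_id, score, payload in connection.execute(query, (min_risk,))
    ]

for incident_id, request_id, score, analysis in latest_analyses(conn):
    print(incident_id, request_id, score, analysis["incident_summary"])
```

Because the same `incident_id` can be analyzed under several `request_id` values, the query keeps only the newest row per incident before filtering by risk.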


## 9.2 Saving Analysis Results

When the agent completes an analysis, we save the results in both SQLite and as JSON backups:



In [22]:
# utils/datastore_utils.py
def save_incident_and_analysis_to_sqlite_db(
    request_id: str,
    incident_id: str,
    model_name: str,
    incident: dict,
    analysis: dict
):
    """
    Save both the incident and its analysis to the SQLite database.
    This is the primary storage location for incident analyses.
    
    Args:
        request_id (str): Unique identifier for this analysis request
        incident_id (str): ID of the incident being analyzed
        model_name (str): Name of the LLM model used for analysis
        incident (dict): The original incident data
        analysis (dict): The LLM's analysis of the incident
    """
    session = SessionLocal()
    try:
        record = IncidentRecord(
            request_id=request_id,
            incident_id=incident_id,
            incident_raw_json=json.dumps(incident),
            llm_analysis_json=json.dumps(analysis),
            llm_risk_score=analysis.get("incident_risk_level", None),
            model_name=model_name
        )
        session.add(record)
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


In addition to SQLite, we also save JSON backups locally. These files are small and lightweight, but at large volume this step could be disabled or made configurable:

In [23]:
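import json
from datetime import datetime

def save_incident_analysis_backup_json(incident_id, analysis_data):
    """Save a backup of analysis data as JSON."""
    # DATA_DIR is assumed to be the pathlib.Path data directory defined earlier
    backup_dir = DATA_DIR / "backups"
    backup_dir.mkdir(parents=True, exist_ok=True)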
    
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_path = backup_dir / f"analysis_{incident_id}_{timestamp}.json"
    
    with open(backup_path, 'w') as f:
        json.dump(analysis_data, f, indent=2)
    
    return backup_path
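
For the recovery direction, a backup can be read back by relying on the `analysis_{incident_id}_{timestamp}.json` naming used above. This is a minimal sketch: `load_latest_backup` is a hypothetical helper, and the demonstration writes illustrative files to a temporary directory rather than the project's data directory.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

def load_latest_backup(backup_dir: Path, incident_id: str) -> Optional[dict]:
    """Return the newest JSON backup for an incident, or None if there is none."""
    # Timestamps use %Y%m%d_%H%M%S, so lexicographic order is chronological.
    candidates = sorted(backup_dir.glob(f"analysis_{incident_id}_*.json"))
    if not candidates:
        return None
    with open(candidates[-1], encoding="utf-8") as f:
        return json.load(f)

# Demonstration against a throwaway directory with illustrative data.
with tempfile.TemporaryDirectory() as tmp:
    backup_dir = Path(tmp)
    for stamp, risk in [("20240101_090000", 0.4), ("20240102_090000", 0.75)]:
        path = backup_dir / f"analysis_INC-DEMO_{stamp}.json"
        path.write_text(json.dumps({"incident_risk_level": risk}), encoding="utf-8")

    latest = load_latest_backup(backup_dir, "INC-DEMO")
    missing = load_latest_backup(backup_dir, "INC-NONE")
    print(latest, missing)
```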

## 9.3 FAISS Vector Index Updates

To support continuous learning, we update the INCIDENT_HISTORY_FAISS index with new incidents that have been analyzed:

**Why we do this:** Updating vector indexes enables:
- The system to learn from new analyses (pulled from the DB for similar incidents when the prompt is built)
- Improved results over time as more examples are added
- Reference to previous analyses when encountering similar incidents
- Consistency in risk evaluation by referring to precedents
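
The effect of an incremental index update can be shown without FAISS or an embedding API. The sketch below swaps in a toy bag-of-words embedding and a brute-force cosine search; `ToyHistoryIndex`, `embed`, and `VOCAB` are hypothetical names for illustration only, and the real system uses OpenAI embeddings with a FAISS store.

```python
import math
from typing import Dict, List, Tuple

def embed(text: str, vocab: List[str]) -> List[float]:
    """Toy bag-of-words embedding; the real system uses OpenAI embeddings."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class ToyHistoryIndex:
    """In-memory stand-in for the FAISS history index."""

    def __init__(self, vocab: List[str]) -> None:
        self.vocab = vocab
        self.entries: List[Tuple[List[float], Dict[str, str]]] = []

    def add(self, text: str, metadata: Dict[str, str]) -> None:
        self.entries.append((embed(text, self.vocab), metadata))

    def search(self, text: str, k: int = 2) -> List[Tuple[float, Dict[str, str]]]:
        query = embed(text, self.vocab)
        scored = [(cosine(query, vec), meta) for vec, meta in self.entries]
        return sorted(scored, key=lambda item: item[0], reverse=True)[:k]

VOCAB = ["dns", "subdomain", "takeover", "phishing", "email", "ransomware"]
index = ToyHistoryIndex(VOCAB)
index.add("phishing email with malicious attachment", {"incident_id": "INC-OLD-1"})

# Before the update, the only candidate is an unrelated incident.
before = index.search("dangling dns record allows subdomain takeover", k=1)

# After adding a related incident, it becomes the top match.
index.add("subdomain takeover via dangling dns record", {"incident_id": "INC-OLD-2"})
after = index.search("dangling dns record allows subdomain takeover", k=1)

print(before[0][1]["incident_id"], round(before[0][0], 2))
print(after[0][1]["incident_id"], round(after[0][0], 2))
```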

In [24]:
from datetime import datetime

from langchain.docstore.document import Document
from langchain_community.vectorstores import FAISS
from utils.flatteners import flatten_incident
from utils.retrieval_utils import get_incident

def add_incident_to_faiss_history_index(incident_id, analysis):
    """
    Add a completed incident analysis to the historical FAISS index.
    
    Args:
        incident_id: The ID of the analyzed incident
        analysis: The analysis object from the agent
    """
    global INCIDENT_HISTORY_FAISS, embeddings
    
    if INCIDENT_HISTORY_FAISS is None or embeddings is None:
        initialize_openai_embeddings()
        initialize_faiss_indexes()
    
    # Create a document from the incident
    flattened_text = flatten_incident(get_incident(incident_id))
    doc = Document(
        page_content=flattened_text,
        metadata={
            "incident_id": incident_id,
            "analysis_id": analysis.get("analysis_id", "unknown"),
            "created_at": datetime.now().isoformat()
        }
    )
    
    # Add to FAISS index
    INCIDENT_HISTORY_FAISS.add_documents([doc])
    
    # Save updated index
    index_path = DATA_DIR / "vectorstore" / "incident_analysis_history"
    INCIDENT_HISTORY_FAISS.save_local(index_path)
    
    return True

**Test It!**

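Below, we test the setup by adding an incident and then searching for similar incidents. Because the demo enters the same incident repeatedly, identical entries should come back. Note that the `add_incident_to_faiss_history_index` imported from `utils.retrieval_utils` in this cell is awaited and receives the full incident dict, unlike the simplified synchronous version shown above.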

In [25]:
import random
from time import sleep
from utils.retrieval_utils import add_incident_to_faiss_history_index, INCIDENT_HISTORY_FAISS, search_similar_incidents

# We pick an example incident and an example analysis (one actually generated by the agent in a previous run).
# However, we change the incident_id each time to ensure it is actually encoded into the index.
random_incident_id = f"INC-{random.randint(1000,9999)}-{random.randint(10,99)}-{random.randint(10,99)}-0{random.randint(10,99)}f"

example_incident = {
    "incident_id": f"{random_incident_id}",
    "timestamp": "2023-08-13T11:00:00Z",
    "title": "Subdomain Takeover Attempt",
    "description": "Threat intelligence alert indicates a dangling DNS record pointing to a service that is no longer active, potentially allowing subdomain takeover.",
    "affected_assets": [
      {
        "hostname": "old-blog.example.com",
        "ip_address": "N/A",
        "os": "N/A",
        "installed_software": [],
        "role": "Legacy DNS Entry"
      }
    ],
    "observed_ttps": [
      {
        "framework": "MITRE ATT&CK",
        "id": "T1584",
        "name": "Compromise Infrastructure"
      },
      {
        "framework": "MITRE ATT&CK",
        "id": "T1584.001",
        "name": "Compromise Infrastructure: DNS"
      }
    ],
    "indicators_of_compromise": [
      {
        "type": "dns_record",
        "value": "CNAME old-blog.example.com -> inactive-service.cloudprovider.com",
        "context": "Observed DNS record"
      },
      {
        "type": "threat_intel_alert",
        "value": "Dangling DNS record detected",
        "context": "Threat intel alert"
      }
    ],
    "initial_findings": "Potential subdomain takeover risk due to dangling DNS record."
  }

example_analysis = {
    "incident_id": f"{random_incident_id}",
    "incident_summary": "Subdomain Takeover Attempt",
    "cve_ids": [
      {
        "cve_id": "CVE-2023-41265",
        "cve_summary": "HTTP Tunneling Vulnerability in Qlik Sense which could be exploited if a subdomain is compromised.",
        "cve_relevance": 1.93,
        "cve_risk_level": 0.8
      }
    ],
    "incident_risk_level": 0.75,
    "incident_risk_level_explanation": "Dangling DNS records pose a critical risk for subdomain takeover. The related CVE suggests a known vulnerability in HTTP tunneling that could be leveraged in this context."
  }

print(f"Adding incident {example_incident['incident_id']} to historical index...")
await add_incident_to_faiss_history_index(example_incident, example_analysis)

# We now expect to see entries for this same incident that we just stored to be returned in the search results
similar_incidents = search_similar_incidents(example_incident)

# The incident_ids will likely differ due to running this script, but we can check a very specific field such as the incident_risk_level_explanation to ensure valid results
print(f"Same incident: {example_analysis['incident_risk_level_explanation'] == similar_incidents[0]['analysis']['incident_risk_level_explanation']}")
print(f"{'-'*50}")
# Print the results 
print(f"Example Incident:\n\tincident_id: {example_incident['incident_id']}\n\tincident_summary: {example_analysis['incident_summary']}\n\nSimilar incidents: {json.dumps(similar_incidents[:3], indent=2)}")


Adding incident INC-3383-90-64-083f to historical index...
09:15:49 INFO     [root] add_incident_to_faiss_history_index completed in 0.18s
09:15:49 INFO     [root] Searching for similar incidents with k=5, MMR=True



09:15:50 INFO     [root] Found 5 similar incidents
09:15:50 INFO     [root] search_similar_incidents completed in 0.47s
Same incident: True
--------------------------------------------------
Example Incident:
	incident_id: INC-3383-90-64-083f
	incident_summary: Subdomain Takeover Attempt

Similar incidents: [
  {
    "incident_id": "INC-3383-90-64-083f",
    "analysis": {
      "incident_id": "INC-3383-90-64-083f",
      "incident_summary": "Subdomain Takeover Attempt",
      "cve_ids": [
        {
          "cve_id": "CVE-2023-41265",
          "cve_summary": "HTTP Tunneling Vulnerability in Qlik Sense which could be exploited if a subdomain is compromised.",
          "cve_relevance": 1.93,
          "cve_risk_level": 0.8
        }
      ],
      "incident_risk_level": 0.75,
      "incident_risk_level_explanation": "Dangling DNS records pose a critical risk for subdomain takeover. The related CVE suggests a known vulnerability in HTTP tunneling that could be leveraged in this context


## 9.4 Usage Metadata Tracking

We track usage metadata to monitor performance and costs:

In [26]:
def save_run_metadata(
    request_id: str,
    start_index: int,
    batch_size: int,
    usage_metrics: dict, # Extracted from the agent's response object
    tools: list[str], # Extracted from the agent's response object
    duration: float,
    error_count: int = 0
):
    session = SessionLocal()
    try:
        rm = RunMetadata(
            request_id=request_id, # Contains the batch, shows performance of a single call
            start_index=start_index,
            batch_size=batch_size,
            input_tokens=usage_metrics.get("input_tokens"),
            output_tokens=usage_metrics.get("output_tokens"),
            total_tokens=usage_metrics.get("total_tokens"),
            tools_called=json.dumps(tools),
            duration_seconds=duration,
            error_count=error_count
        )
        session.add(rm)
        session.commit()
    except Exception:
        # Roll back the failed transaction before re-raising
        session.rollback()
        raise
    finally:
        session.close()

Example metadata logs:
![Run Metadata Example](Documentation/images/run_metadata_example_screenshot.png)

Each row corresponds to one batch: a single asynchronous call covering a variable number of `incident_id` values, identified by its unique `request_id`.
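
Since batch sizes vary, per-run totals are easier to compare once they are normalized per incident. The sketch below assumes rows shaped like the `run_metadata` columns; the numbers are made up for illustration and `summarize_runs` is a hypothetical helper.

```python
from statistics import mean
from typing import Dict, List

# Illustrative rows shaped like the `run_metadata` columns; the numbers are made up.
runs: List[Dict[str, float]] = [
    {"batch_size": 2, "total_tokens": 4200, "duration_seconds": 6.0, "error_count": 0},
    {"batch_size": 4, "total_tokens": 7800, "duration_seconds": 10.0, "error_count": 1},
    {"batch_size": 2, "total_tokens": 4000, "duration_seconds": 5.0, "error_count": 0},
]

def summarize_runs(rows: List[Dict[str, float]]) -> Dict[str, float]:
    """Normalize by batch size so runs with different batch sizes are comparable."""
    incidents = sum(r["batch_size"] for r in rows)
    return {
        "runs": len(rows),
        "incidents": incidents,
        "tokens_per_incident": sum(r["total_tokens"] for r in rows) / incidents,
        "seconds_per_incident": sum(r["duration_seconds"] for r in rows) / incidents,
        "mean_run_seconds": mean(r["duration_seconds"] for r in rows),
        "error_rate": sum(1 for r in rows if r["error_count"] > 0) / len(rows),
    }

summary = summarize_runs(runs)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```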

## 9.5 Caching Strategy

To improve performance now, and to serve as a stand-in for a cost-reducing cache at larger scale, we implement a simple caching strategy:

**Why we do this:** Effective caching:
- Reduces redundant computation
- Improves response times
- Ensures consistent responses for identical queries
- Optimizes resource usage

In [27]:
import functools
import time
from typing import Any, Callable, Dict

# Simple caching decorator from utils/decorators.py
def cache_result(ttl_seconds: int = 300) -> Callable:
    """Simple in-memory cache with TTL for function results.

    Args:
        ttl_seconds: Time to live for cached results in seconds (default: 300)

    Returns:
        Decorator function that implements caching
    """
    cache: Dict[str, Dict[str, Any]] = {}

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Create cache key from function name and arguments
            key = f"{func.__name__}:{str(args)}:{str(kwargs)}"

            # Check if cached and not expired
            if key in cache:
                result = cache[key]
                if time.time() - result['timestamp'] < ttl_seconds:
                    # logger.debug(f"Cache hit for {func.__name__}")
                    return result['data']

            # Execute function and cache result
            result = func(*args, **kwargs)
            cache[key] = {
                'data': result,
                'timestamp': time.time()
            }
            return result
        return wrapper
    return decorator
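
To see the hit, miss, and expiry behaviour, the sketch below uses a condensed copy of the decorator with a deliberately short TTL. `slow_lookup` and the `calls` counter are hypothetical and exist only to count how often the wrapped function really runs.

```python
import functools
import time
from typing import Any, Callable, Dict

def cache_result(ttl_seconds: float = 300) -> Callable:
    """Condensed copy of the TTL cache decorator shown above."""
    cache: Dict[str, Dict[str, Any]] = {}

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = f"{func.__name__}:{args}:{kwargs}"
            hit = cache.get(key)
            if hit is not None and time.time() - hit["timestamp"] < ttl_seconds:
                return hit["data"]
            result = func(*args, **kwargs)
            cache[key] = {"data": result, "timestamp": time.time()}
            return result
        return wrapper
    return decorator

calls = {"count": 0}

@cache_result(ttl_seconds=0.2)
def slow_lookup(cve_id: str) -> str:
    """Hypothetical expensive call; counts how often it really runs."""
    calls["count"] += 1
    return f"details for {cve_id}"

slow_lookup("CVE-2023-41265")      # miss: function body runs
slow_lookup("CVE-2023-41265")      # hit: served from the cache
after_hit = calls["count"]

time.sleep(0.3)                    # wait past the TTL
slow_lookup("CVE-2023-41265")      # expired: function body runs again
after_expiry = calls["count"]

print(after_hit, after_expiry)
```

Two properties of this design are worth keeping in mind: the key is built from the string form of the arguments, so a positional call and a keyword call with the same value are cached separately, and the cache lives in process memory, so it is not shared across workers.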

## 9.6 Backup and Recovery

To ensure data durability, we propose implementing the following backup and recovery procedures:

- Daily database backups
- Vector index snapshots
- Redundant storage for JSON backups
- Point-in-time recovery capability
- Automated recovery testing (with allotted budgets per-agent/agent-class)
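
As one possible starting point for the database backup and recovery-test items, the sketch below uses the standard-library `sqlite3` online backup API on a throwaway database. `backup_sqlite`, the file names, and the seeded row are illustrative assumptions, not part of the project.

```python
import sqlite3
import tempfile
from pathlib import Path

def backup_sqlite(source_path: Path, backup_path: Path) -> None:
    """Copy a SQLite database using the online backup API."""
    source = sqlite3.connect(str(source_path))
    target = sqlite3.connect(str(backup_path))
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "incident_analysis.db"
    snapshot_path = Path(tmp) / "incident_analysis.backup.db"

    # Seed a throwaway database with one illustrative row.
    conn = sqlite3.connect(str(db_path))
    conn.execute("CREATE TABLE incident_analysis (incident_id TEXT, llm_risk_score REAL)")
    conn.execute("INSERT INTO incident_analysis VALUES (?, ?)", ("INC-DEMO", 0.75))
    conn.commit()
    conn.close()

    backup_sqlite(db_path, snapshot_path)

    # Recovery test: the snapshot must open and contain the same rows.
    restored = sqlite3.connect(str(snapshot_path))
    restored_rows = restored.execute(
        "SELECT incident_id, llm_risk_score FROM incident_analysis"
    ).fetchall()
    restored.close()
    print(restored_rows)
```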

By implementing comprehensive persistence strategies, our system preserves valuable analysis results while maintaining performance, scalability, and observability.

---

I think we've now earned the right to look at our results more directly! Try running the actual streamlit app below:

In [None]:
# Defaults to open on localhost:8501
!streamlit run dev/incident_dashboard.py

# 10. Evaluation

**How do we measure accuracy and drift?**

We instrument our system to capture a variety of metrics—moving averages, variance, per-incident standard deviations, token usage, error counts, etc.—so that we can monitor performance over time. In the long run, these metrics should live behind a configurable layer, making it trivial to add, remove, or adjust what we log without touching the core code.

Right now, a quick look at our test runs shows risk scores that are consistently higher than expected. That’s largely an artifact of using synthetic “dummy” data during our early normalization experiments, which tends to skew the model’s self-calibration.

In a production rollout, we would:

1. **Build a “golden dataset”** of expert-validated incidents and analyses to seed the historical index.
2. **Audit metrics regularly**, especially early on, to verify that the model’s outputs align with real-world expectations.
3. **Incorporate feedback loops**, tagging or flagging outlier runs with human-adjusted confidence or priority scores, and then retraining or re-indexing as needed.
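
To make the golden-dataset idea concrete, the sketch below compares model risk scores against expert-validated scores and reports the mean signed error (bias) and the mean absolute error. The incident IDs, the scores, and the `calibration_report` helper are made up for illustration.

```python
from typing import Dict


def calibration_report(
    predicted: Dict[str, float], golden: Dict[str, float]
) -> Dict[str, float]:
    """Compare model scores with expert scores for incidents present in both."""
    shared = sorted(set(predicted) & set(golden))
    if not shared:
        raise ValueError("no overlapping incident IDs to compare")
    errors = [predicted[i] - golden[i] for i in shared]
    return {
        "n": float(len(shared)),
        # Positive bias means the model scores higher than the experts
        "bias": sum(errors) / len(errors),
        "mae": sum(abs(e) for e in errors) / len(errors),
    }


# Hypothetical scores on a 0-1 scale
model_scores = {"INC-1": 0.9, "INC-2": 0.7, "INC-3": 0.5}
expert_scores = {"INC-1": 0.6, "INC-2": 0.7, "INC-3": 0.6}
report = calibration_report(model_scores, expert_scores)
```

A persistently positive bias, like the inflated risk scores seen in the test runs, would be a signal to re-seed or re-index the historical store.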

For this proof-of-concept, the underlying evaluation framework is sound—the key next step is to populate it with high-quality, representative data so that our statistics and drift detection become truly meaningful.


# 11. Conclusion

## 11.1 Summary

In this notebook we’ve delivered a complete pipeline for contextual CVE analysis:

1. **Semantic Search**  
   FAISS vector stores over KEV, NVD, and historical incident records power high-signal retrieval.  
2. **LLM-Driven Reasoning**  
   A GPT-4o-mini–based ReAct agent interprets incident context, invokes retrieval tools, and synthesizes risk assessments.  
3. **Structured Tooling**  
   An MCP server (`mcp_cve_server.py`) exposes typed, cache-enabled functions for CVE lookup, semantic search, and schema introspection.  
4. **Validated Output**  
   Pydantic models enforce a consistent JSON schema (`IncidentAnalysisList`), ensuring every analysis includes IDs, summaries, relevance scores, and risk explanations.  
5. **Durable Persistence & Learning**  
   Analyses and run metrics are stored in SQLite and vector indexes are incrementally updated—so every run enriches future context.

Together, these components demonstrate how retrieval-augmented generation can automate and standardize the initial triage of security incidents, reducing analyst toil, improving consistency, and providing transparent justifications for each risk score.

### 11.1.1 Platform-Level Best Practices

Beyond the AI logic itself, we baked in core software-engineering principles to ensure this system could evolve into a robust, service-based architecture:

1. **Idempotency**  
   Every analysis request carries a unique `request_id`, and duplicate submissions are automatically rejected via FastAPI dependency injection:  
   ```python
   @app.post("/analyze_incidents")
   async def analyze_incidents(
       request: AnalysisRequest,
       _dedupe: None = Depends(claim_request_id),
   ):
       ...
   ```

2. **Separation of Concerns**

   * **Tool Server** (`mcp_cve_server.py`): Hosts all MCP-decorated functions; thin wrappers around modular utilities (e.g. `retrieval_utils`).
   * **Agent Core** (`main_security_agent_server.py`): Contains prompt orchestration and ReAct logic, isolated from CLI or notebook invocation.
   * **Fire-and-Forget Persistence**: Writes to the database and FAISS indexes asynchronously, so failed I/O doesn’t derail the analysis loop. FAISS write steps take a lock on write so that we don't have concurrent write attempts (for the historical incidents index).

3. **Observability**

   * **Metrics & Tracing**: Decorators capture timing, token use, cache hits/misses, tool-call counts, and error rates.
   * **Structured Logging**: Contextual logs (with request IDs) surface failures and performance bottlenecks.
   * **Pluggable Backends**: Our `datastore_utils` abstraction makes swapping SQLite for PostgreSQL—or even a hosted analytics datastore—a drop-in change.

4. **Scalability & Resilience**

   * **Async Processing**: Batched, asynchronous calls (via `asyncio`/FastAPI) let us process hundreds of incidents in parallel without blocking.
   * **Container-Ready**: Every component (API, MCP server, Redis) can be dockerized with resource limits and orchestrated via Kubernetes or Docker Compose.
   * **Failure Isolation**: Individual tool calls and persistence tasks are wrapped in `try/except`, so transient errors don’t cascade.

5. **Security & Configuration**

   * **Secrets Management**: API keys and database URLs are injected via environment variables or a `.env` file—never hard-coded.
   * **Input Validation**: Pydantic request models enforce schema compliance at the edge.
   * **Least Privilege**: Redis and database connections use dedicated, minimally scoped credentials.
   * **Config-Driven**: All key parameters—API URLs, batch sizes, concurrency levels, API keys, log file paths, etc.—are pulled from configuration (env vars or config files), with sensible defaults applied when no override is provided.

6. **Testability & CI/CD**

   * **Unit & Integration Tests**: Utilities (`utils/`) and MCP tools are covered by pytest suites that are easy to extend.
   * **Automated Builds**: A CI pipeline can lint, type-check, spin up a temporary Redis/SQLite instance, and run the full test suite on each PR.
   * Containerization and deployment are things we could likely implement within a day (or less), moving us to a more production-ready system.
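
The idempotency check in item 1 can be pictured as an atomic "claim" on the request ID. The sketch below is an assumption about how such a dependency might behave, written with only the standard library; the real `claim_request_id` is not shown in this section, and `RequestIdRegistry` and `DuplicateRequestError` are hypothetical names. In a FastAPI dependency, the duplicate case would typically be translated into an HTTP 409 response.

```python
import threading
import time
from typing import Dict


class DuplicateRequestError(Exception):
    """Raised when a request_id has already been claimed."""


class RequestIdRegistry:
    """In-memory registry that lets each request_id be claimed once per TTL."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self._ttl = ttl_seconds
        self._claimed: Dict[str, float] = {}
        self._lock = threading.Lock()

    def claim(self, request_id: str) -> None:
        now = time.monotonic()
        with self._lock:
            # Drop expired claims so the registry does not grow without bound
            expired = [k for k, t in self._claimed.items() if now - t >= self._ttl]
            for k in expired:
                del self._claimed[k]
            if request_id in self._claimed:
                raise DuplicateRequestError(request_id)
            self._claimed[request_id] = now
```

An in-process registry like this only protects a single worker. With Redis available, the same check-and-set could be done with `SET key value NX EX ttl`, which is atomic and shared across API workers.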

By layering these practices on top of our GenAI agent, we ensure that the prototype is not only powerful but also maintainable, observable, and ready to scale.
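
The async processing and failure isolation points can be sketched with `asyncio`: a semaphore caps concurrency, and each incident is wrapped in its own `try/except` so one failure does not cancel the rest. `analyze_one` is a stand-in for the real agent call, and the concurrency limit is an assumed value.

```python
import asyncio
from typing import Any, Dict, List


async def analyze_one(incident: Dict[str, Any]) -> Dict[str, Any]:
    """Placeholder for the real agent call; fails on malformed input."""
    await asyncio.sleep(0)  # yield control, as a network call would
    if "id" not in incident:
        raise ValueError("incident is missing an id")
    return {"incident_id": incident["id"], "status": "ok"}


async def analyze_batch(
    incidents: List[Dict[str, Any]], max_concurrency: int = 10
) -> List[Dict[str, Any]]:
    """Analyze incidents concurrently, isolating per-incident failures."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(incident: Dict[str, Any]) -> Dict[str, Any]:
        async with semaphore:
            try:
                return await analyze_one(incident)
            except Exception as exc:  # isolate the failure, keep the batch going
                return {
                    "incident_id": incident.get("id"),
                    "status": "error",
                    "error": str(exc),
                }

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(i) for i in incidents))


# In a notebook cell, use `await analyze_batch(...)` instead of asyncio.run
results = asyncio.run(analyze_batch([{"id": "INC-1"}, {}, {"id": "INC-3"}]))
```

Returning an error record instead of raising keeps the batch result aligned with its input, so a caller can retry only the failed incidents.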

## 11.2 Next Steps

Building on this foundation, a few high-impact enhancements could include:

1. **Broader Intelligence Feeds**  
   Integrate additional threat feeds (e.g., MITRE ATT&CK, vendor advisories) to widen coverage.  
2. **Remediation Guidance**  
   Extend the agent to suggest concrete mitigation steps alongside risk assessments.  
3. **Model Ensembles**  
   Layer specialized LLMs for discrete tasks (e.g., one for summarization, another for scoring) to optimize cost and performance.  
4. **Interactive Dashboards**  
   Enable analysts to drill down, flag or correct an analysis, and trigger on-demand re-runs.  
5. **Temporal and Trend Analysis**  
   Add time-series views over incidents and CVE metrics to surface emerging patterns.  
6. **Active Learning Loop**  
   Incorporate direct analyst feedback into the training and indexing processes to continuously refine accuracy.

## 11.3 Key Takeaways

1. **Architecture & Workflow**  
   A FastAPI front-end spawns a LangChain/LangGraph ReAct agent that orchestrates prompt generation, tool calls, and output parsing.  
2. **Prompting Strategy**  
   Pre-injected FAISS results and Pydantic instructions give the LLM focused, structured context and a clear output schema.  
3. **Tool Interaction**  
   Caching, timing metrics, and annotations in `mcp_cve_server.py` make tool calls reliable and efficient.  
4. **Context Management**  
   Batching, flattening, and targeted filtering keep the prompt concise while preserving relevance.  
5. **Explainability & Traceability**  
   Every risk score is backed by a Pydantic-validated explanation and a log of tool invocations.  

## 11.4 Final Thoughts

This project illustrates a practical, end-to-end GenAI solution for a real-world security challenge. By combining vector retrieval, declarative tool interfaces, and rigorous output validation, we’ve created an automated triage assistant that:

- **Scales** to hundreds of incidents without manual review  
- **Adapts** as new data feeds and analyses are added  
- **Supports** human oversight with transparent reasoning trails  

As cyber-threat volumes continue to grow, such AI-augmented workflows will become essential for security teams striving to stay ahead of emerging vulnerabilities. This notebook provides a solid, extensible blueprint for that future.  

In [None]:
# Thank you for reviewing this notebook!
print("Analysis complete!")

Please reach out with any questions, or to request a code walkthrough or demonstration, at dan.guilliams@danguilliams.com.