In [1]:
%load_ext autoreload
%autoreload 2

# Exploring Security Advisories

This notebook explores the `Advisories` class and the security advisory documents it parses.

In [2]:
from snyk_ai.advisories import Advisories, Advisory, _Section, _Chunk
from snyk_ai.utils.markdown import BlockType

# Load all advisories
advisories = Advisories("../data/advisories")
print(f"Loaded {len(advisories)} advisories")
print(f"Files: {' '.join(advisories.filenames)}")

Loaded 8 advisories
Files: advisory-001.md advisory-002.md advisory-003.md advisory-004.md advisory-005.md advisory-006.md advisory-007.md advisory-008.md


## Advisory Overview

In [3]:
# List all advisories with their titles
for adv in advisories:
    print(f"{adv.filename}: {adv.title} ({len(adv.blocks)} blocks, {len(adv.sections)} sections)")

advisory-001.md: Cross-Site Scripting (XSS) in express-validator (44 blocks, 13 sections)
advisory-002.md: SQL Injection in webapp-auth (59 blocks, 15 sections)
advisory-003.md: Dependency Confusion in secure-config (79 blocks, 15 sections)
advisory-004.md: Path Traversal in data-processor (62 blocks, 15 sections)
advisory-005.md: Remote Code Execution in file-handler (66 blocks, 15 sections)
advisory-006.md: Cross-Site Request Forgery (CSRF) in api-client (58 blocks, 15 sections)
advisory-007.md: Server-Side Request Forgery (SSRF) in http-server (68 blocks, 15 sections)
advisory-008.md: Insecure Deserialization in json-parser (62 blocks, 15 sections)


## Exploring a Single Advisory

In [4]:
# Pick one advisory to explore
adv = advisories["advisory-001.md"]

print(f"{adv.title}\n\n{adv.executive_summary}")

Cross-Site Scripting (XSS) in express-validator

A critical Cross-Site Scripting (XSS) vulnerability has been discovered in the `express-validator` package affecting versions prior to 4.5.0. This vulnerability allows attackers to inject malicious JavaScript code through validation error messages, potentially compromising user sessions and sensitive data.


In [5]:
print("Markdown blocks:")
for i, block in enumerate(adv.blocks):
    content_preview = block.content[:50].replace("\n", " ")
    if len(block.content) > 50:
        content_preview += "..."
    print(f"  {i:2}: {block.type.value:12} | {content_preview}")

Markdown blocks:
   0: header       | Security Advisory: Cross-Site Scripting (XSS) in e...
   1: paragraph    | **CVE ID:** CVE-2024-1234   **Package:** express-v...
   2: header       | Executive Summary
   3: paragraph    | A critical Cross-Site Scripting (XSS) vulnerabilit...
   4: header       | Vulnerability Details
   5: header       | Description
   6: paragraph    | The `express-validator` library fails to properly ...
   7: header       | Affected Versions
   8: table        | | Version Range | Status | Fixed Version | |------...
   9: header       | Attack Vector
  10: paragraph    | An attacker can exploit this vulnerability by subm...
  11: header       | Vulnerable Code Example
  12: code_block   | const { body, validationResult } = require('expres...
  13: paragraph    | **Attack Payload:**
  14: code_block   | username=<script>alert(document.cookie)</script>
  15: paragraph    | When this payload is submitted, the validation err...
  16: header       | Impact
  17: list

## Sections

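A section is a header plus all blocks that follow it, up to the next header. Judging by the output below, a header that is followed directly by another header (for example `Vulnerability Details`) does not get a section of its own.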
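
As a rough illustration of that grouping, the sketch below splits a flat block list into sections. The `Block` dataclass and `group_sections` function are hypothetical stand-ins, not the `snyk_ai` implementation. Dropping headers that have no content before the next header is an assumption inferred from the section listing printed below.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a parsed markdown block."""
    type: str      # e.g. "header", "paragraph", "code_block"
    content: str


def group_sections(blocks):
    """Group a flat block list into (header, body_blocks) pairs.

    Assumption: a header followed directly by another header
    yields no section of its own.
    """
    sections = []
    header, body = None, []
    for block in blocks:
        if block.type == "header":
            # Close the previous section if it collected any content.
            if header is not None and body:
                sections.append((header, body))
            header, body = block, []
        elif header is not None:
            body.append(block)
    # Close the final section.
    if header is not None and body:
        sections.append((header, body))
    return sections


blocks = [
    Block("header", "Executive Summary"),
    Block("paragraph", "A critical XSS vulnerability ..."),
    Block("header", "Vulnerability Details"),
    Block("header", "Description"),
    Block("paragraph", "The library fails to sanitize input ..."),
]
for header, body in group_sections(blocks):
    print(header.content, [b.type for b in body])
```

Blocks that appear before the first header are ignored in this sketch; whether the real parser does the same is not shown in this notebook.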

In [6]:
# Section breakdown with chunks
print(f"Total sections: {len(adv.sections)}\n")

for i, sec in enumerate(adv.sections):
    block_types = [b.type.value for b in sec.blocks]
    print(f"{i+1:2}. {sec.header.content}")
    print(f"    Blocks: {block_types}")
    
    # Show chunks (skip sections with code blocks - those need a model)
    if sec.has_code_blocks:
        print("    Chunks: (requires model for code summarization)")
    else:
        chunks = sec.get_chunks()
        print(f"    Chunks ({len(chunks)}):")
        for chunk in chunks:
            # Print the full chunk text; slice it here if the output gets too long
            text = chunk.text
            print(f"      [{chunk.source_type.value}] {text}")
    print()

Total sections: 13

 1. Security Advisory: Cross-Site Scripting (XSS) in express-validator
    Blocks: ['paragraph']
    Chunks (1):
      [paragraph] **CVE ID:** CVE-2024-1234   **Package:** express-validator   **Ecosystem:** npm   **Severity:** High   **CVSS Score:** 7.5   **Published:** January 15, 2024

 2. Executive Summary
    Blocks: ['paragraph']
    Chunks (2):
      [paragraph] A critical Cross-Site Scripting (XSS) vulnerability has been discovered in the `express-validator` package affecting versions prior to 4.5.0.
      [paragraph] This vulnerability allows attackers to inject malicious JavaScript code through validation error messages, potentially compromising user sessions and sensitive data.

 3. Description
    Blocks: ['paragraph']
    Chunks (2):
      [paragraph] The `express-validator` library fails to properly sanitize user input when generating validation error messages.
      [paragraph] When validation fails, the library includes user-provided input directly in
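
The paragraph chunks above look like one chunk per sentence. How `get_chunks()` actually splits text is not shown in this notebook, so the following is only a naive sketch of sentence-level chunking with a regular expression. `split_sentences` is a hypothetical helper, not part of `snyk_ai`.

```python
import re

# Split on whitespace that follows '.', '!' or '?'.
# Version strings such as "4.5.0" stay intact because no whitespace
# follows their inner dots.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(paragraph):
    """Naively split a paragraph into sentence-sized chunks."""
    return [s for s in _SENTENCE_END.split(paragraph.strip()) if s]


summary = (
    "A critical Cross-Site Scripting (XSS) vulnerability has been discovered "
    "in the `express-validator` package affecting versions prior to 4.5.0. "
    "This vulnerability allows attackers to inject malicious JavaScript code "
    "through validation error messages, potentially compromising user "
    "sessions and sensitive data."
)
for chunk in split_sentences(summary):
    print(f"[paragraph] {chunk}")
```

A regex like this mis-splits abbreviations followed by a space (such as "e.g. "), so a production chunker would likely need a proper sentence tokenizer.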

## Section to_text()

`to_text()` renders a section, header included, as Markdown text that can be used as context during retrieval.

In [7]:
# Show to_text() for the first section with at least three blocks (mixed content)
for sec in adv.sections:
    if len(sec.blocks) >= 3:
        print(f"=== Section: {sec.header.content} ===")
        print(sec.to_text())
        break

=== Section: Vulnerable Code Example ===
## Vulnerable Code Example

```javascript
const { body, validationResult } = require('express-validator');
const express = require('express');
const app = express();

app.post('/register', 
  body('email').isEmail(),
  body('username').isLength({ min: 3 }),
  (req, res) => {
    const errors = validationResult(req);
    if (!errors.isEmpty()) {
      // VULNERABLE: User input directly inserted into HTML
      return res.status(400).send(`
        <h1>Validation Error</h1>
        <p>${errors.array()[0].msg}</p>
        <p>Input: ${req.body.username}</p>
      `);
    }
    // ... registration logic
  }
);
```

**Attack Payload:**

```
username=<script>alert(document.cookie)</script>
```

When this payload is submitted, the validation error message will include the script tag, which executes in the browser, potentially stealing session cookies or performing unauthorized actions.
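
The rendered text above suggests a simple scheme: a `##` header, then each block separated by a blank line, with code blocks wrapped in fences again. The sketch below reproduces that layout under stated assumptions. `render_section` and its `(type, content, language)` tuples are hypothetical, and the fixed level-2 header is a guess based on this one output, not the actual `to_text()` implementation.

```python
def render_section(header, blocks):
    """Render a header plus (type, content, language) blocks as markdown.

    Assumptions: level-2 headers, and code blocks stored without fences.
    """
    fence = "`" * 3  # built at runtime so this example nests cleanly in markdown
    parts = [f"## {header}"]
    for block_type, content, language in blocks:
        if block_type == "code_block":
            # Re-wrap code in a fence, keeping the language tag if there is one.
            parts.append(f"{fence}{language}\n{content}\n{fence}")
        else:
            parts.append(content)
    # Blank line between blocks, matching the output above.
    return "\n\n".join(parts)


text = render_section(
    "Vulnerable Code Example",
    [
        ("paragraph", "**Attack Payload:**", ""),
        ("code_block", "username=<script>alert(document.cookie)</script>", ""),
    ],
)
print(text)
```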
