<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>

<div style="font-family: Arial, sans-serif; line-height: 1.6;">

<!-- Section 1: Aim -->
<div style="border-left: 6px solid #007BFF; background-color: #e7f3ff; padding: 20px 25px; margin: 25px 0; border-radius: 8px; box-shadow: 0 2px 5px rgba(0,0,0,0.1);">
  <h2 style="margin-top: 0; margin-bottom: 15px; color: #0056b3; border-bottom: 2px solid #0056b3; padding-bottom: 8px;">🎯 1. Aim</h2>
  <p style="margin: 0; color: #333;">
    The primary aim of this work is to develop, deploy, and conduct a security penetration test on the custom-built, interactive multi-omics web application. A secondary aim is to establish a benchmark for responsible and secure scientific software development in a research context, demonstrating that security validation is a necessary and achievable step in the creation of trustworthy computational tools.
  </p>
</div>

<!-- Section 2: Objectives -->
<div style="border-left: 6px solid #FF9800; background-color: #fff3e0; padding: 20px 25px; margin: 25px 0; border-radius: 8px; box-shadow: 0 2px 5px rgba(0,0,0,0.1);">
  <h2 style="margin-top: 0; margin-bottom: 15px; color: #c66900; border-bottom: 2px solid #c66900; padding-bottom: 8px;">🎯 2. Objectives</h2>
  <p style="margin-bottom: 10px; color: #333;">To achieve the stated aim, the following objectives have been set:</p>
  <ol style="margin: 0; padding-left: 20px; color: #333;">
    <li style="margin-bottom: 8px;">
      <strong>Reconnaissance and Endpoint Discovery:</strong> To systematically map the application's attack surface by discovering all accessible routes, static assets, and dynamic API endpoints (callbacks).
    </li>
    <li style="margin-bottom: 8px;">
      <strong>Vulnerability Assessment:</strong> To probe the application for common web vulnerabilities as defined by OWASP, including:
      <ul style="margin-top: 5px; padding-left: 20px; list-style-type: disc;">
        <li style="margin-bottom: 4px;"><strong>Injection Flaws (A03:2021):</strong> To fuzz all dynamic endpoints with a diverse set of malicious payloads to test for Cross-Site Scripting (XSS), SQL Injection (SQLi), and Command Injection.</li>
        <li style="margin-bottom: 4px;"><strong>Broken Access Control (A01:2021):</strong> To simulate an Insecure Direct Object Reference (IDOR) attack to assess the application's authorization logic.</li>
        <li style="margin-bottom: 4px;"><strong>Security Misconfiguration (A05:2021):</strong> To audit HTTP security headers and test for sensitive file exposure.</li>
      </ul>
    </li>
    <li style="margin-bottom: 8px;">
      <strong>Client-Side Validation:</strong> To use a headless browser to confirm the effectiveness of client-side defenses (e.g., Content-Security-Policy) against script execution.
    </li>
    <li style="margin-bottom: 8px;">
      <strong>Remediation and Verification:</strong> To address any identified vulnerabilities through code and configuration changes, and to re-run the tests to validate the effectiveness of the security hardening measures.
    </li>
  </ol>
</div>

<!-- Section 3: Research Questions -->
<div style="border-left: 6px solid #9C27B0; background-color: #f3e5f5; padding: 20px 25px; margin: 25px 0; border-radius: 8px; box-shadow: 0 2px 5px rgba(0,0,0,0.1);">
  <h2 style="margin-top: 0; margin-bottom: 15px; color: #6a0080; border-bottom: 2px solid #6a0080; padding-bottom: 8px;">❓ 3. Research Questions</h2>
  <p style="margin-bottom: 10px; color: #333;">This investigation seeks to answer the following questions:</p>
  <ul style="margin: 0; padding-left: 20px; color: #333; list-style-type: none;">
    <li style="margin-bottom: 8px;">
      <strong>Primary Question:</strong> To what extent is a custom-built scientific web application, developed in a research environment using the Dash framework, inherently vulnerable to common web application attack vectors?
    </li>
    <li style="margin-bottom: 8px;">
      <strong>Secondary Questions:</strong>
      <ul style="margin-top: 5px; padding-left: 20px; list-style-type: disc;">
        <li style="margin-bottom: 4px;">Can a systematic security audit identify and lead to the effective remediation of critical vulnerabilities, such as information leakage from server errors?</li>
        <li style="margin-bottom: 4px;">How effective are modern web frameworks and explicit security configurations (e.g., security headers) at providing a secure-by-default environment against injection attacks?</li>
        <li style="margin-bottom: 4px;">Does the application's design introduce architectural risks, such as a lack of object-level authorization, that must be considered for secure deployment in broader contexts?</li>
      </ul>
    </li>
  </ul>
</div>

<!-- Section 4: Methodology -->
<div style="border-left: 6px solid #4CAF50; background-color: #e8f5e9; padding: 20px 25px; margin: 25px 0; border-radius: 8px; box-shadow: 0 2px 5px rgba(0,0,0,0.1);">
  <h2 style="margin-top: 0; margin-bottom: 15px; color: #2e7d32; border-bottom: 2px solid #2e7d32; padding-bottom: 8px;">📋 4. Methodology</h2>
  <p style="margin-bottom: 16px; color: #333;">
    The penetration test was conducted using a custom, multi-stage automated script developed within a Jupyter Notebook (Web App Penetration Test.ipynb). This approach ensured a reproducible and transparent testing process. The primary tools and stages are detailed below:
  </p>
  <ol style="margin: 0; padding-left: 20px; color: #333; list-style-type: decimal;">
    <li style="margin-bottom: 12px;">
      <strong>Framework & Libraries:</strong>
      <ul style="margin: 6px 0 0 20px; padding: 0; list-style-type: disc;">
        <li style="margin-bottom: 4px;"><strong>Requests:</strong> Used for all server interactions, including sending GET and POST requests to probe endpoints.</li>
        <li style="margin-bottom: 4px;"><strong>BeautifulSoup:</strong> Used for parsing HTML responses during the initial route discovery phase.</li>
        <li style="margin-bottom: 4px;"><strong>Selenium & Webdriver-Manager:</strong> Employed for client-side browser automation to validate the effectiveness of security headers against XSS.</li>
      </ul>
    </li>
    <li style="margin-bottom: 12px;">
      <strong>Test Stages:</strong>
      <ul style="margin: 6px 0 0 20px; padding: 0; list-style-type: disc;">
        <li style="margin-bottom: 4px;"><strong>Enumeration:</strong> Automated discovery of live routes and dynamic callbacks by querying the application's root and its <code>/_dash-dependencies</code> endpoint.</li>
        <li style="margin-bottom: 4px;"><strong>Input Fuzzing:</strong> Submission of a predefined list of malicious payloads to all discovered callbacks to test for injection vulnerabilities and unexpected server responses.</li>
        <li style="margin-bottom: 4px;"><strong>Configuration & Header Analysis:</strong> Automated checks for sensitive file exposure and verification of the presence and content of critical HTTP security headers.</li>
        <li style="margin-bottom: 4px;"><strong>Access Control Simulation:</strong> A test for Insecure Direct Object Reference (IDOR) by dynamically discovering valid object IDs from the application layout and re-requesting them to check for authorization logic.</li>
        <li style="margin-bottom: 4px;"><strong>Client-Side Confirmation:</strong> A final validation using a headless browser to inject a payload and confirm that client-side security policies prevent script execution.</li>
      </ul>
    </li>
  </ol>
</div>
</div>
</body>
</html>


### ⚙️ Initial Configuration and Payloads
This cell sets up the foundational variables for the entire penetration test. It defines:
- **`BASE_URL`**: The root URL of the Dash application being tested.
- **`DASH_CALLBACKS`**: A list of known Dash-specific endpoints that are essential for the application's functionality and are prime targets for inspection.
- **`FILE_CANDIDATES`**: A list of common sensitive filenames (like configuration files or raw data) that will be checked for public exposure.
- **`FUZZ_INPUTS`**: A comprehensive list of malicious and unexpected data payloads. These are designed to test for a wide range of injection vulnerabilities, including Cross-Site Scripting (XSS), SQL Injection (SQLi), Path Traversal, and Command Injection.


In [1]:
import requests
from bs4 import BeautifulSoup
import urllib.parse
import json
import sys

# Root URL of the Dash app under test; adjust the port to match your local instance
BASE_URL = "http://localhost:8280"

# Known Dash callback routes (adjust as needed for your app)
DASH_CALLBACKS = [
    "/_dash-layout",
    "/_dash-dependencies",
    "/_dash-update-component",
    "/_favicon.ico",
]

# Filenames to check (in case your CSV or secrets landed in a public folder)
FILE_CANDIDATES = [
    "enriched_metabolite_data.csv",
    ".env",
    "config.py",
    "credentials.json",
    "settings.yaml",
]

# Malicious payloads for callback fuzzing
FUZZ_INPUTS = [
    # Basic XSS
    "<script>alert('XSS1')</script>",
    # XSS with event handlers
    "<img src=x onerror=alert('XSS2')>",
    "<svg onload=alert('XSS3')>",
    # XSS in attributes
    "\" autofocus onfocus=alert('XSS4')//", 
    # SQL Injection (basic patterns)
    "' OR 1=1; --",
    "admin'--",
    "1; DROP TABLE users--",
    # Path Traversal
    "../../../../etc/passwd",
    "..\\..\\..\\..\\boot.ini", # Windows style
    "/etc/passwd",
    # Command Injection (basic)
    "; ls -la",
    "|id",
    # Template Injection (if a templating engine were used, less direct for Dash unless mishandled)
    "{{ 7*7 }}",
    "${7*7}",
    # Null Bytes (can sometimes truncate strings or bypass filters)
    "value%00", 
    # Unicode - different encodings or characters that might be mishandled
    "你好世界", # Hello world in Chinese
    "test\uFFFDtest", # Replacement character
    # Very Long String (test for buffer issues, unlikely in Python but good practice)
    "A" * 2048,
    # Unexpected JSON-like string with an embedded XSS payload
    "{\"unexpected\": \"json\", \"value\": \"<script>alert(1)</script>\"}",
    # Integer/Float like strings (to see if type conversion is attempted and fails gracefully)
    "12345",
    "-99.99",
    # Boolean like strings
    "true",
    "FALSE",
    # Empty String
    "",
    # Just special characters
    "@#$%^&*()_+[];',./{}:\"<>?\\|`~"
]


### 🗺️ Stage 1: Route & Asset Enumeration
This function serves as the initial reconnaissance step. It automatically discovers the application's attack surface by:
1.  Fetching the main page of the application.
2.  Parsing the HTML to find all links (`<a>`), scripts (`<script>`), and stylesheets (`<link>`).
3.  Adding known Dash-specific endpoints (like `/_dash-layout`) to the list of potential targets.
4.  Sending a `GET` request to each discovered URL to confirm which ones are live (return a `200 OK` status code).

The output provides a clear list of all accessible routes and static assets, forming the basis for subsequent vulnerability tests.
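
Step 4 is only meaningful for URLs that belong to the target. Comparing a URL against the base URL as a plain string prefix can let look-alike hosts through (for example `http://localhost:8280@example.com/`), so one option is to compare scheme, host, and port instead. The following is a minimal sketch of that idea; `is_in_scope` is a hypothetical helper, not part of the notebook, and it does not normalize default ports (`http://host` and `http://host:80` are treated as different).

```python
from urllib.parse import urljoin, urlsplit

def is_in_scope(base_url, candidate):
    """Return True if candidate resolves to the same scheme, host, and port as base_url."""
    try:
        base = urlsplit(base_url)
        cand = urlsplit(urljoin(base_url, candidate))
        return (base.scheme, base.hostname, base.port) == (cand.scheme, cand.hostname, cand.port)
    except ValueError:
        # Raised when a port is not a valid integer; treat such URLs as out of scope
        return False

base = "http://localhost:8280"
print(is_in_scope(base, "/assets/style.css"))                    # relative link on the target
print(is_in_scope(base, "http://localhost:8280@example.com/x"))  # userinfo trick, different host
```

A filter like this could stand in for a `startswith(base_url)` check before any status-code probing is done.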


In [2]:
import requests
from bs4 import BeautifulSoup
import urllib.parse

# DASH_CALLBACKS is expected to be a global variable defined by running Cell A

def enumerate_routes(base_url):
    """
    Crawl the home page for links, plus probe known Dash callback endpoints.
    Return all 200-OK URLs.
    """
    print("=== Stage 1: Route Enumeration (Actual Scan) ===")
    try:
        resp = requests.get(base_url, timeout=5)
        resp.raise_for_status() 
    except requests.exceptions.RequestException as e:
        print(f"Error: Could not fetch home page at {base_url}. Details: {e}")
        return []
    except Exception as e: 
        print(f"An unexpected error occurred fetching {base_url}: {e}")
        return []

    soup = BeautifulSoup(resp.text, "html.parser")
    discovered = set()

    tags_to_scan = [("a", "href"), ("script", "src"), ("link", "href")]
    for tag, attr in tags_to_scan:
        for node in soup.find_all(tag):
            url_value = node.get(attr)
            if not url_value:
                continue
            try:
                full_url = urllib.parse.urljoin(base_url, url_value.strip())
            except Exception: 
                continue
            url_to_add = full_url.split("?")[0]
            if (url_to_add.startswith("http://") or url_to_add.startswith("https://")):
                discovered.add(url_to_add)

    if 'DASH_CALLBACKS' in globals() and DASH_CALLBACKS:
        for cb_path in DASH_CALLBACKS:
            try:
                full_cb_url = urllib.parse.urljoin(base_url, cb_path)
                discovered.add(full_cb_url)
            except Exception:
                continue
    
    if not discovered:
        print("No candidate URLs were discovered.")
        return []

    urls_to_test = sorted([d for d in list(discovered) if d.startswith(base_url)])
    
    if not urls_to_test:
        print(f"No discovered URLs start with the base URL ({base_url}). Other URLs found: {list(discovered)}")
        return []
        
    print(f" → Found {len(urls_to_test)} candidate URLs (filtered to base URL); testing status codes…")
    
    alive = []
    for url_to_test in urls_to_test:
        status = None
        try:
            r = requests.get(url_to_test, timeout=3, allow_redirects=True)
            status = r.status_code
        except requests.exceptions.RequestException:
            pass 

        if status == 200:
            alive.append(url_to_test)
            print(f"  [200 OK] {url_to_test}")
        else:
            display_url_err = url_to_test[:100] + '...' if len(url_to_test) > 103 else url_to_test
            print(f"  [{status or 'ERR'}] {display_url_err}")

    print(f" → {len(alive)} route(s) returned 200 OK.\n")
    return alive

print("enumerate_routes (Actual Scan version) function defined.")

enumerate_routes (Actual Scan version) function defined.


### 💥 Stage 2: Broad Callback Fuzzing

This function forms the core of our vulnerability assessment. It systematically sends every payload from our `FUZZ_INPUTS` list to a specific application callback endpoint (`/_dash-update-component`). It monitors the server's response for each payload, looking for unexpected status codes (like `500 Internal Server Error`) or error messages that might indicate a vulnerability or poor input handling. This is a broad-spectrum test to find initial weaknesses.
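
Reading every status line and response snippet by eye becomes tedious as the payload list grows. The sketch below shows one way the monitoring described above could be automated. `triage_response` and `LEAK_PATTERNS` are hypothetical names, and the patterns are illustrative rather than exhaustive. A reflected payload is only a prompt for manual review, since an echoed value is not necessarily executed by the browser.

```python
import re

# Markers that commonly appear when a Python traceback or debugger page
# leaks into a response body (illustrative list, not exhaustive).
LEAK_PATTERNS = [
    re.compile(r"Traceback \(most recent call last\)"),
    re.compile(r'File "[^"]+", line \d+'),
    re.compile(r"werkzeug", re.IGNORECASE),
]

def triage_response(status_code, body_text, payload):
    """Return a list of labels describing why a response deserves a closer look."""
    findings = []
    # Transport failures are recorded as strings (e.g. "ERR (Timeout)"), so guard the comparison
    if isinstance(status_code, int) and status_code >= 500:
        findings.append("server-error")
    if any(pattern.search(body_text) for pattern in LEAK_PATTERNS):
        findings.append("possible-info-leak")
    # An empty payload is a substring of everything, so skip it
    if payload and payload in body_text:
        findings.append("payload-reflected")
    return findings

print(triage_response(500, 'Traceback (most recent call last):\n  File "app.py", line 10', "zzz"))
```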


In [3]:
import json
import urllib.parse
import requests

# (Make sure these globals are defined in a prior cell like Cell A)
# BASE_URL     = "http://localhost:8280"
# FUZZ_INPUTS  = [ ... expanded list ... ]

def fuzz_specific_callback(base_url, target_input_id, target_input_property, all_outputs_for_target_callback):
    """
    Probes `/_dash-update-component` targeting a specific callback,
    identified by its input and full list of outputs.
    Uses a singular 'output' key with a special dot-concatenated string
    value for multi-output callbacks.
    """
    print(f"\n=== Fuzzing Callback Triggered by Input: '{target_input_id}.{target_input_property}' ===")
    headers = {"Content-Type": "application/json"}

    if 'FUZZ_INPUTS' not in globals() or not FUZZ_INPUTS:
        print("  Error: 'FUZZ_INPUTS' is not defined or empty. Skipping fuzzing.")
        return

    callback_route = "/_dash-update-component"
    url = urllib.parse.urljoin(base_url, callback_route)
    print(f"→ Probing {url} with various payloads for target input '{target_input_id}'...")

    if not all_outputs_for_target_callback:
        print("  Error: 'all_outputs_for_target_callback' list is empty. Cannot construct 'output' string.")
        return
        
    # Construct the special string format for multi-output
    target_multi_output_string = ".." + "...".join(all_outputs_for_target_callback) + ".."

    for idx, payload_val in enumerate(FUZZ_INPUTS, start=1):
        body = {
            "output": target_multi_output_string,
            "inputs": [
                {
                    "id": target_input_id,
                    "property": target_input_property,
                    "value": payload_val
                }
            ],
            "changedPropIds": [f"{target_input_id}.{target_input_property}"],
            "state": []
        }

        print(f"  Payload #{idx}:  input='{payload_val!r}' for component '{target_input_id}'")
        try:
            r = requests.post(url, json=body, headers=headers, timeout=4)
            status_code = r.status_code
            try:
                resp_json = r.json()
                snippet_text = json.dumps(resp_json) 
            except ValueError:  # covers JSONDecodeError from both json and simplejson
                snippet_text = r.text or ""

            if len(snippet_text) > 250:
                snippet = snippet_text[:250].replace("\n", " ") + "..."
            else:
                snippet = snippet_text.replace("\n", " ")

        except requests.exceptions.Timeout:
            status_code = "ERR (Timeout)"
            snippet = "Request timed out"
        except requests.exceptions.ConnectionError:
            status_code = "ERR (ConnectionError)"
            snippet = "Could not connect to server"
        except Exception as e: 
            status_code = f"ERR ({e.__class__.__name__})"
            snippet = str(e)[:100]

        print(f"    → Status: {status_code}, Resp snippet: {snippet!r}")

    print(f"\n=== Fuzzing for '{target_input_id}' complete ===\n")

print("Cell D: fuzz_specific_callback function defined.")


Cell D: fuzz_specific_callback function defined.


### 🔬 Stage 2.5: Deep Fuzzing with Encoding

This function performs a more targeted test. It takes a single malicious payload (like an XSS vector) and sends it to the server in multiple formats: raw, URL-encoded, and HTML-entity-encoded. This helps determine if the application has weak decoding or sanitization routines that can be bypassed by an attacker using common encoding techniques.
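
To make the three formats concrete, the short example below encodes one sample payload with the same standard-library calls (`urllib.parse.quote` and `html.escape`). The payload itself is only an illustration.

```python
import html
import urllib.parse

payload = "<img src=x onerror=alert(1)>"

variants = {
    "Raw": payload,
    "URL Encoded": urllib.parse.quote(payload),
    "HTML Entity Encoded": html.escape(payload),
}

for name, value in variants.items():
    print(f"{name}: {value}")

# Raw: <img src=x onerror=alert(1)>
# URL Encoded: %3Cimg%20src%3Dx%20onerror%3Dalert%281%29%3E
# HTML Entity Encoded: &lt;img src=x onerror=alert(1)&gt;
```

Because the payload travels inside a JSON body, the server receives the encoded variants as literal strings. They only become dangerous if some later step decodes them before use, which is the behavior this stage is probing for.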


In [4]:
import requests
import json
import urllib.parse
import html # Required for HTML entity encoding

# (Make sure BASE_URL is defined in a prior cell)

def fuzz_callback_with_encoding(base_url, target_input_id, target_input_property, all_outputs_for_target_callback, base_payload_to_test):
    """
    Takes a single base payload and probes an endpoint with its raw,
    URL-encoded, and HTML-encoded versions.
    """
    headers = {"Content-Type": "application/json"}
    callback_route = "/_dash-update-component"
    url = urllib.parse.urljoin(base_url, callback_route)

    # Construct the special dot-concatenated string for the multi-output 'output' key
    target_multi_output_string = ".." + "...".join(all_outputs_for_target_callback) + ".."
    
    # A dictionary of different encoding techniques to try
    payloads_to_try = {
        "Raw": base_payload_to_test,
        "URL Encoded": urllib.parse.quote(base_payload_to_test),
        "HTML Entity Encoded": html.escape(base_payload_to_test)
    }
    
    print(f"  --- Testing base payload: {base_payload_to_test!r} ---")
    
    for encoding_type, payload_val in payloads_to_try.items():
        body = {
            "output": target_multi_output_string,
            "inputs": [{"id": target_input_id, "property": target_input_property, "value": payload_val}],
            "changedPropIds": [f"{target_input_id}.{target_input_property}"],
            "state": []
        }

        print(f"    Testing with encoding '{encoding_type}': {payload_val!r}")
        try:
            r = requests.post(url, json=body, headers=headers, timeout=4)
            status_code = r.status_code
            try:
                resp_json = r.json()
                snippet_text = json.dumps(resp_json) 
            except ValueError:  # covers JSONDecodeError from both json and simplejson
                snippet_text = r.text or ""

            if len(snippet_text) > 200:
                snippet = snippet_text[:200].replace("\n", " ") + "..."
            else:
                snippet = snippet_text.replace("\n", " ")

        except requests.exceptions.RequestException as e: 
            status_code = f"ERR ({e.__class__.__name__})"
            snippet = str(e)[:100]

        print(f"      → Status: {status_code}, Resp snippet: {snippet!r}")

print("The function 'fuzz_callback_with_encoding' is defined.")


The function 'fuzz_callback_with_encoding' is defined.


### 📁 Stage 3: Sensitive File Exposure Test

This stage tests for **Security Misconfiguration (A05:2021)** by attempting to directly access potentially exposed sensitive files. The function iterates through a list of common configuration and data filenames (`FILE_CANDIDATES`) and attempts to request them from common web-accessible directories (`/` and `/assets/`).

A key feature is the additional checking used to reduce false positives. Many web frameworks and single-page applications answer requests for unknown paths with the main page rather than a `404`, so a `200 OK` status alone is not enough. The function therefore reports a finding only if the status is `200 OK` **and** the response does not look like the application's HTML homepage, judged by its `Content-Type`, its length, and whether the body starts with HTML markup.
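
The false-positive logic can be isolated into a small pure function, which makes it easy to test without a running server. This is a sketch under the assumption that the homepage body has been fetched beforehand; `looks_like_homepage` is a hypothetical helper name.

```python
def looks_like_homepage(content_type, body, homepage_body=None):
    """Return the reasons a 200 response is probably the HTML fallback page."""
    reasons = []
    if "text/html" in content_type.lower():
        reasons.append("html-content-type")
    # Compare case-insensitively: both '<!DOCTYPE html>' and '<!doctype html>' occur in practice
    head = body.lstrip()[:100].lower()
    if head.startswith(("<!doctype html", "<html")):
        reasons.append("html-markup")
    if homepage_body is not None and body == homepage_body:
        reasons.append("identical-to-homepage")
    return reasons

home = "<!DOCTYPE html>\n<html><body>app</body></html>"
print(looks_like_homepage("text/html; charset=utf-8", home, home))
print(looks_like_homepage("text/plain", "SECRET_KEY=abc\n", home))
```

An empty list means none of the heuristics fired, so the response would be treated as a potential exposure and reviewed manually.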


In [5]:
import requests
import urllib.parse

# (Make sure BASE_URL and FILE_CANDIDATES are defined in a prior cell)

def brute_force_files_advanced(base_url):
    """
    Attempts to GET each entry in FILE_CANDIDATES from common static paths.
    This advanced version adds checks for Content-Type, Content-Length, and
    content sniffing to reduce false positives.
    """
    print("\n=== Stage 3: Advanced Static File Brute-Force ===")

    if 'FILE_CANDIDATES' not in globals() or not FILE_CANDIDATES:
        print("  Error: 'FILE_CANDIDATES' list is not defined or empty. Skipping.")
        return

    # First, get the content and length of the main index page to compare against
    main_page_content = None
    main_page_length = -1
    try:
        main_resp = requests.get(base_url, timeout=3)
        if main_resp.status_code == 200:
            main_page_content = main_resp.text
            main_page_length = len(main_resp.content)
            print(f"  Info: Fetched main page for comparison (length: {main_page_length} bytes).")
    except requests.exceptions.RequestException as e:
        print(f"  Warning: Could not fetch main page at {base_url} for comparison. False positive checks will be less effective. Error: {e}")

    paths_to_check = ["/", "/assets/"] 
    
    for fname in FILE_CANDIDATES:
        for path_prefix in paths_to_check:
            url = urllib.parse.urljoin(base_url, path_prefix.lstrip('/') + fname)

            print(f"  Checking for: {url}")
            try:
                r = requests.get(url, timeout=3, allow_redirects=False)
                status = r.status_code
            except requests.exceptions.RequestException as e:
                print(f"    [ERR] Request failed for {url}: {e}")
                continue # Skip to the next URL

            if status == 200:
                is_false_positive = False
                reason = []
                
                # 1. Check Content-Type
                content_type = r.headers.get('Content-Type', '')
                if 'text/html' in content_type:
                    is_false_positive = True
                    reason.append("Content-Type is 'text/html'")
                
                # 2. Check Content-Length
                content_length = len(r.content)
                if main_page_length != -1 and content_length == main_page_length:
                    is_false_positive = True
                    reason.append("Content-Length matches main page")

                # 3. Content Sniffing (check for HTML doctype)
                # We need to be careful with byte strings vs text strings
                try:
                    # Compare case-insensitively: servers emit both '<!DOCTYPE html>' and '<!doctype html>'
                    if r.text.lstrip().lower().startswith(('<!doctype html', '<html')):
                        is_false_positive = True
                        reason.append("Content starts with HTML tag")
                except Exception:
                    # If it can't be decoded as text, that might be a good sign it's not HTML
                    pass
                
                if is_false_positive:
                    print(f"    [200 OK - LIKELY FALSE POSITIVE] URL: {url}")
                    print(f"      Reason(s): {', '.join(set(reason))}")
                else:
                    # If it's 200 OK and didn't trigger any false positive flags, it's a potential finding.
                    size = len(r.content)
                    try:
                        snippet_text = r.content[:200].decode('utf-8', errors='ignore').replace("\n", " ")
                    except Exception:
                        snippet_text = repr(r.content[:200])
                    print(f"    [!!! CONFIRMED 200 OK !!!] URL: {url} ({size} bytes)")
                    print(f"      Content-Type: {content_type}")
                    print(f"      Snippet: {snippet_text!r} ...\n")
            else:
                print(f"    [{status}] Not found or error.")
                
    print("\n=== File brute-forcing complete ===\n")

print("The function 'brute_force_files_advanced' is defined.")


The function 'brute_force_files_advanced' is defined.


### 🛡️ Stage 4: Security Header Analysis

This function audits the application's security posture by checking for the implementation of recommended HTTP security headers. It sends a `GET` request to key application routes and inspects the response headers for the presence of headers like:
- **Content-Security-Policy (CSP)**: A key defense-in-depth control that limits which scripts a page may run, reducing the impact of XSS.
- **X-Content-Type-Options**: Prevents browsers from MIME-sniffing a response away from the declared content-type.
- **X-Frame-Options**: Protects against clickjacking attacks.
- **Strict-Transport-Security (HSTS)**: Enforces the use of HTTPS.

The results indicate which headers are correctly implemented and which are missing, providing a clear roadmap for configuration hardening.
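
Presence alone does not show that a header is effective, so a follow-up check on the values can be useful. The sketch below is one possible set of rules, not an authoritative policy; `evaluate_headers` is a hypothetical helper. It also reflects the fact that browsers ignore `Strict-Transport-Security` on responses received over plain HTTP, so a missing HSTS header on an `http://localhost` target is not by itself a finding.

```python
def evaluate_headers(headers, is_https):
    """Return review notes for weak or missing security header values."""
    # Header names are case-insensitive, so normalize them first
    h = {name.lower(): value for name, value in headers.items()}
    notes = []

    csp = h.get("content-security-policy")
    if csp is None:
        notes.append("CSP missing")
    elif "'unsafe-inline'" in csp or "'unsafe-eval'" in csp:
        notes.append("CSP present but allows unsafe-inline/unsafe-eval")

    if h.get("x-content-type-options", "").strip().lower() != "nosniff":
        notes.append("X-Content-Type-Options is not 'nosniff'")

    if h.get("x-frame-options", "").strip().upper() not in ("DENY", "SAMEORIGIN"):
        notes.append("X-Frame-Options is not DENY or SAMEORIGIN")

    # HSTS only takes effect when delivered over HTTPS
    if is_https and "strict-transport-security" not in h:
        notes.append("HSTS missing on an HTTPS response")
    return notes

sample = {"Content-Security-Policy": "default-src 'self'", "X-Content-Type-Options": "nosniff"}
print(evaluate_headers(sample, is_https=False))
```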


In [6]:
# Define the check_security_headers function

import requests # Should already be imported from Cell A or your config cell

# Ensure BASE_URL is defined from Cell A

def check_security_headers(base_url, routes_to_check):
    """
    Checks for common security headers on a list of URLs.
    """
    print("\n=== Stage 4: Security Header Analysis ===")

    if not routes_to_check:
        print("  No routes provided to check for security headers. Skipping.")
        return

    # Common security headers to look for (case-insensitive for checking presence)
    common_security_headers = {
        "Content-Security-Policy": "Helps prevent XSS, clickjacking, and other code injection attacks.",
        "X-Content-Type-Options": "Prevents MIME-sniffing (usually set to 'nosniff').",
        "X-Frame-Options": "Protects against clickjacking (e.g., 'DENY', 'SAMEORIGIN').",
        "Strict-Transport-Security": "Enforces secure (HTTPS) connections to the server.",
        "Referrer-Policy": "Controls how much referrer information is sent with requests.",
        "Permissions-Policy": "Controls which browser features can be used by the page.",
        "X-XSS-Protection": "Older XSS filter, often superseded by CSP (e.g., '1; mode=block').",
        # "Cross-Origin-Opener-Policy": "COOP - Helps mitigate cross-origin attacks.", # More modern
        # "Cross-Origin-Embedder-Policy": "COEP - Helps mitigate cross-origin attacks.", # More modern
    }

    # Store findings per route
    all_header_findings = {}

    print(f"→ Checking headers for {len(routes_to_check)} live routes discovered in Stage 1...\n")

    for idx, url in enumerate(routes_to_check):
        # We only want to check headers for the main page and key Dash endpoints,
        # not necessarily for every single JS asset, as those might not set all headers.
        # Let's focus on the base_url and key Dash API routes for this basic check.
        # You can expand this logic if needed.
        if not (url == base_url or \
                "_dash-layout" in url or \
                "_dash-dependencies" in url or \
                "_dash-update-component" in url or \
                "_favicon.ico" in url) : # Check favicon as it's a common direct asset
            # print(f"  Skipping header check for asset: {url}")
            continue

        print(f"  Headers for URL ({idx + 1}/{len(routes_to_check)}): {url}")
        route_findings = {"present": {}, "missing": []}
        try:
            r = requests.get(url, timeout=3, allow_redirects=True)
            # Headers are case-insensitive in HTTP, but Python dicts are case-sensitive.
            # requests.structures.CaseInsensitiveDict handles this for response.headers
            response_headers = r.headers # This is a CaseInsensitiveDict

            for header_name, description in common_security_headers.items():
                if header_name in response_headers:
                    route_findings["present"][header_name] = response_headers[header_name]
                    print(f"    [PRESENT] {header_name}: {response_headers[header_name]}")
                else:
                    route_findings["missing"].append(header_name)
                    print(f"    [MISSING] {header_name} - {description}")
            print("-" * 20)


        except requests.exceptions.RequestException as e:
            print(f"    Could not fetch URL {url} to check headers: {e}")
        except Exception as e_gen:
             print(f"    An unexpected error occurred checking headers for {url}: {e_gen}")
        
        all_header_findings[url] = route_findings
        
    print("\n=== Security header analysis complete ===\n")
    # You could return all_header_findings here if you wanted to process it further
    # return all_header_findings

print("Cell F EXECUTED: check_security_headers function defined.")

Cell F EXECUTED: check_security_headers function defined.


### 🌐 Stage 5: HTTP Method Tampering

This stage tests for another common misconfiguration: improper handling of HTTP methods. The function sends various HTTP verbs (`GET`, `POST`, `PUT`, `DELETE`, etc.) to key application endpoints.

A secure application should only allow the methods it's designed to handle (e.g., `GET` for `/`, `POST` for `/_dash-update-component`) and reject all others, typically with a `405 Method Not Allowed` status code. This test verifies that behavior.
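
One way to turn that expectation into an automatic verdict is to compare each observed status code against a table of methods the route is meant to accept. The table below is an assumption made for illustration (including treating `HEAD` and `OPTIONS` as acceptable) and should be adjusted to the application under test. `EXPECTED_METHODS` and `classify_method_result` are hypothetical names.

```python
from urllib.parse import urlsplit

# Assumed expectations per route; adjust to match the real application
EXPECTED_METHODS = {
    "/": {"GET", "HEAD", "OPTIONS"},
    "/_dash-layout": {"GET", "HEAD", "OPTIONS"},
    "/_dash-dependencies": {"GET", "HEAD", "OPTIONS"},
    "/_dash-update-component": {"POST", "OPTIONS"},
}

def classify_method_result(url, method, status_code, expected=EXPECTED_METHODS):
    """Label one (route, method, status) observation."""
    path = urlsplit(url).path or "/"
    allowed = expected.get(path)
    if allowed is None:
        return "unknown-route"
    if method in allowed:
        return "expected"
    # 405 Method Not Allowed or 501 Not Implemented indicate a clean rejection
    if status_code in (405, 501):
        return "rejected"
    return "review"

print(classify_method_result("http://localhost:8280/_dash-update-component", "PUT", 405))
```

Only observations labeled `review` would need manual follow-up, since they indicate a method outside the table that the server did not clearly reject.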


In [7]:
import requests
import urllib.parse # Should already be imported

# Ensure BASE_URL is defined from Cell A (or your config cell)

def test_http_methods(base_url, routes_to_check):
    """
    Tests key routes with various HTTP methods to check for unexpected responses
    or lack of proper method handling (e.g., not returning 405).
    """
    print("\n=== Stage 5: HTTP Method Tampering Analysis ===")

    if not routes_to_check:
        print("  No routes provided to test HTTP methods. Skipping.")
        return

    # Key Dash routes we are interested in for method tampering.
    # We'll also include the base_url itself.
    # We don't need to test every single JS asset with all methods.
    key_dash_endpoints_for_method_test = [
        base_url, # The root/home page
        urllib.parse.urljoin(base_url, "/_dash-layout"),
        urllib.parse.urljoin(base_url, "/_dash-dependencies"),
        urllib.parse.urljoin(base_url, "/_dash-update-component"),
        # Add any other custom Flask routes if you had them
    ]

    # Filter routes_to_check to only include our key_dash_endpoints_for_method_test
    # that were actually found to be alive in Stage 1.
    # This ensures we don't try to test non-existent key routes.
    actual_key_routes_to_test = [route for route in routes_to_check if route in key_dash_endpoints_for_method_test]
    
    if not actual_key_routes_to_test:
        print("  None of the predefined key Dash endpoints for method testing were found live. Skipping.")
        return

    http_methods_to_try = ["GET", "POST", "PUT", "DELETE", "OPTIONS", "PATCH", "HEAD"]
    # For POST, PUT, PATCH, we might need some dummy data.
    dummy_json_data = {"test": "data"}
    dummy_form_data = {"test_form": "data_form"}


    print(f"→ Testing {len(actual_key_routes_to_test)} key routes with various HTTP methods...\n")

    for url in actual_key_routes_to_test:
        print(f"  Testing methods for URL: {url}")
        for method in http_methods_to_try:
            response_status = None
            response_snippet = ""
            try:
                if method == "GET":
                    r = requests.get(url, timeout=3, allow_redirects=True)
                elif method == "POST":
                    # Send a JSON body. A server that rejects JSON still answers with a
                    # status code (e.g. 400 or 415), which is recorded below. Only network
                    # failures raise RequestException, so a form-data retry in an except
                    # branch would never trigger on a content-type mismatch.
                    r = requests.post(url, json=dummy_json_data, timeout=3)
                elif method == "PUT":
                    r = requests.put(url, json=dummy_json_data, timeout=3)
                elif method == "DELETE":
                    r = requests.delete(url, timeout=3)
                elif method == "OPTIONS":
                    r = requests.options(url, timeout=3)
                elif method == "PATCH":
                    r = requests.patch(url, json=dummy_json_data, timeout=3)
                elif method == "HEAD":
                    r = requests.head(url, timeout=3, allow_redirects=True)
                
                response_status = r.status_code
                # For HEAD requests, there's no body, but headers are key.
                # For OPTIONS, 'Allow' header is important.
                if method == "HEAD":
                    response_snippet = f"Headers: {dict(r.headers)}"
                elif method == "OPTIONS":
                    response_snippet = f"Allow: {r.headers.get('Allow', 'Not Specified')}, Headers: {dict(r.headers)}"
                else:
                    response_snippet = r.text[:100].replace("\n", " ") + "..." if r.text and len(r.text) > 100 else (r.text or "")
                
            except requests.exceptions.RequestException as e:
                response_status = f"ERR ({e.__class__.__name__})"
                response_snippet = str(e)[:100]
            except Exception as e_gen:
                response_status = f"ERR_GENERIC ({e_gen.__class__.__name__})"
                response_snippet = str(e_gen)[:100]

            print(f"    [{method}] → Status: {response_status}")
            if response_snippet: # Only print snippet if it's not empty
                print(f"              Snippet/Info: {response_snippet!r}")
        print("-" * 20)
        
    print("\n=== HTTP method tampering analysis complete ===\n")

print("Cell G EXECUTED: test_http_methods function defined.")


Cell G EXECUTED: test_http_methods function defined.


### 🔎 Stage 1.5: Dynamic Callback Discovery

This function automates the discovery of which components in the Dash application can trigger server-side callbacks. It queries the special `/_dash-dependencies` endpoint, which returns a JSON map of all registered callbacks.

The function then parses this map to identify callbacks that are suitable for our specific fuzzing script. The criteria for a "fuzzable" callback in this context are:
1. It must have exactly one `Input` component.
2. That `Input` component's trigger property must be `value` (typical for dropdowns, text boxes, etc.).
3. Its `output` key must use the multi-output string format (`..id1.prop1...id2.prop2..`); callbacks with a single output are skipped by this script.

This automated discovery is more robust than hard-coding callback names, as it adapts to any changes made in the application's source code. The output is a structured list that can be directly used by the fuzzing functions.
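
The parsing step can be sketched on hand-written data, without a running server. The sample entries below only imitate the shape that the function reads from `/_dash-dependencies`; the IDs `plot-a`, `table-a`, `tabs`, and `content` are made up for illustration. Unlike the notebook's function, which keeps only multi-output callbacks, this sketch also returns a single-output signature as a one-element list.

```python
def parse_output_signature(signature):
    """Split a Dash 'output' string into a list of 'component_id.property' targets."""
    if len(signature) > 4 and signature.startswith("..") and signature.endswith(".."):
        # Multi-output form: '..id1.prop1...id2.prop2..'
        return signature[2:-2].split("...")
    # Single-output form: 'id.prop' (an assumption of this sketch, see lead-in)
    return [signature] if signature else []

def is_fuzzable(callback):
    """Apply the input criteria: exactly one input, triggered by its 'value' property."""
    inputs = callback.get("inputs", [])
    return len(inputs) == 1 and inputs[0].get("property") == "value"

# Hypothetical sample entries in the shape of the dependency map.
sample_dependencies = [
    {"inputs": [{"id": "metabolite-dropdown-explorer", "property": "value"}],
     "output": "..plot-a.figure...table-a.data.."},
    {"inputs": [{"id": "tabs", "property": "active_tab"}],
     "output": "content.children"},
]

fuzzable = [
    {"input_id": cb["inputs"][0]["id"],
     "outputs_list": parse_output_signature(cb["output"])}
    for cb in sample_dependencies
    if is_fuzzable(cb)
]
print(fuzzable)
```

Only the first sample entry passes the filter, because the second is triggered by a property other than `value`.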


In [8]:
import requests
import json
import urllib.parse

def discover_callbacks_to_fuzz(base_url):
    """
    Dynamically discovers callbacks by fetching and parsing the /_dash-dependencies
    endpoint. This version correctly parses the 'output' key for multi-output callbacks.
    """
    print("\n=== Stage 1.5: Dynamic Callback Discovery (Corrected) ===")
    
    dependencies_url = urllib.parse.urljoin(base_url, "/_dash-dependencies")
    print(f"→ Fetching callback dependency map from: {dependencies_url}")
    
    try:
        resp = requests.get(dependencies_url, timeout=5)
        resp.raise_for_status()
        dependencies = resp.json()
    except Exception as e:
        print(f"  Error fetching or parsing dependencies: {e}")
        return []

    print(f"→ Found {len(dependencies)} registered callbacks. Analyzing each one...")

    discovered_callbacks = []
    for idx, cb in enumerate(dependencies):
        inputs = cb.get("inputs", [])
        output_signature = cb.get("output", "") # Get the singular 'output' key
        
        input_structure_for_debug = [f"id='{i.get('id')}', prop='{i.get('property')}'" for i in inputs]
        print(f"\n  Analyzing Callback #{idx + 1} with inputs: [{', '.join(input_structure_for_debug)}]")

        # Our criteria: exactly one input, and that input's property must be 'value'
        if len(inputs) == 1 and inputs[0].get("property") == "value":
            input_id = inputs[0].get("id")
            input_property = inputs[0].get("property")

            # Correctly parse the multi-output string from the 'output' key
            # This is the key fix
            outputs_list_for_fuzzer = []
            if isinstance(output_signature, str) and output_signature.startswith("..") and output_signature.endswith(".."):
                # Strip leading/trailing '..' and split by '...'
                cleaned_signature = output_signature.strip('.')
                outputs_list_for_fuzzer = cleaned_signature.split('...')
            
            if input_id and outputs_list_for_fuzzer:
                print(f"    [FOUND FUZZABLE CALLBACK] Triggered by: '{input_id}.{input_property}'")
                discovered_callbacks.append({
                    "name": f"Discovered: {input_id}",
                    "input_id": input_id,
                    "input_property": input_property,
                    "outputs_list": outputs_list_for_fuzzer
                })
            else:
                print("    [SKIPPING] Callback met input criteria but output format was not a multi-output string.")
        else:
            print(f"    [SKIPPING] Callback did not meet the criteria (len(inputs)==1 and property=='value').")

    print(f"\n→ Discovery complete. Found {len(discovered_callbacks)} callbacks suitable for this fuzzing script.")
    return discovered_callbacks

print("The function 'discover_callbacks_to_fuzz' is defined.")


The function 'discover_callbacks_to_fuzz' is defined.


### 🖥️ Stage 6: Client-Side XSS Confirmation (Browser Automation)

This function provides the final validation for any potential Cross-Site Scripting (XSS) vulnerabilities. Server-side fuzzing can show that input is reflected, but only a browser test confirms whether that reflected content actually executes as a script in a real browser environment.

It uses **Selenium** to launch a headless Chrome browser, automate user actions (like clicking a tab), inject a malicious payload into a component using JavaScript, and then wait to see if a JavaScript `alert()` is triggered. The appearance of an alert box provides definitive proof of a client-side XSS vulnerability.
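
One detail of the tab-clicking step is worth a small sketch: a tab label that contains an apostrophe would break an XPath built with plain string formatting, because XPath 1.0 string literals have no escape character. The helper below is a hypothetical addition (the names `xpath_literal` and `tab_xpath` are not in the notebook) that quotes the label safely, falling back to `concat()` when the label contains both quote types.

```python
def xpath_literal(text):
    """Return `text` as an XPath 1.0 string literal."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types occur: join single-quoted pieces with a double-quoted apostrophe.
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"

def tab_xpath(label):
    """Build the tab selector used to locate a div with role='tab' by its visible text."""
    return f"//div[@role='tab'][contains(text(), {xpath_literal(label)})]"

print(tab_xpath("Metabolite Explorer"))
print(tab_xpath("Patient's data"))
```

The resulting string can be passed to `By.XPATH` in the same way as the selector built inside the function.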


In [9]:
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, UnexpectedAlertPresentException
from selenium.webdriver.common.by import By # Import By for easier selector definitions

# (Make sure BASE_URL is defined in a prior cell)

def confirm_xss_with_browser(base_url, target_tab_label, trigger_component_id, payload):
    """
    Uses a headless browser (Selenium) to inject a payload into a component
    and checks if an alert box appears, confirming client-side XSS.
    This corrected version uses a more robust XPath selector for finding tabs.

    Args:
        base_url (str): The base URL of the Dash app.
        target_tab_label (str): The VISIBLE TEXT LABEL of the dcc.Tab to click.
        trigger_component_id (str): The ID of the component to inject the payload into.
        payload (str): The XSS payload to test.
    """
    print(f"\n=== Advanced XSS Confirmation for Component '{trigger_component_id}' ===")
    print(f"  Payload: {payload!r}")
    
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--disable-dev-shm-usage')
    
    driver = None
    try:
        print("  [1/5] Launching headless Chrome browser...")
        driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()), options=options)
        
        print(f"  [2/5] Navigating to {base_url}...")
        driver.get(base_url)
        
        # CORRECTED STEP 3: Wait for the tab to be clickable and then click it.
        # We find the tab by its visible text label. This is more robust.
        print(f"  [3/5] Waiting for and clicking tab with label '{target_tab_label}'...")
        # XPath selector to find a div with role='tab' that contains the specific text.
        tab_xpath = f"//div[@role='tab'][contains(text(), '{target_tab_label}')]"
        
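        # Use WebDriverWait to wait up to 10 seconds for the element to be clickable.
        # A timeout here means the test could not run, so it is reported as an error
        # rather than falling through to the [SAFE] branch below.
        wait = WebDriverWait(driver, 10)
        try:
            tab_element = wait.until(EC.element_to_be_clickable((By.XPATH, tab_xpath)))
        except TimeoutException:
            print(f"    [ERROR] Tab '{target_tab_label}' was not clickable within 10 seconds; test inconclusive.")
            return False
        tab_element.click()
        time.sleep(2) # Wait for the tab's content to load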

        print(f"  [4/5] Injecting payload into '{trigger_component_id}' via JavaScript...")
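        # Pass the ID and payload as script arguments instead of interpolating them
        # into the JavaScript source, so quotes or backticks in the payload cannot
        # break the injected script.
        js_script = "window.dash_clientside.setProps(arguments[0], {value: arguments[1]});"
        driver.execute_script(js_script, trigger_component_id, payload)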
        
        print("  [5/5] Waiting for an alert box to appear (max 5 seconds)...")
        WebDriverWait(driver, 5).until(EC.alert_is_present())
        
        alert = driver.switch_to.alert
        alert_text = alert.text
        print(f"    [!!! XSS CONFIRMED !!!] Alert box appeared with text: '{alert_text}'")
        alert.accept()
        return True

    except TimeoutException:
        print("    [SAFE] No alert box appeared within the timeout period.")
        return False
    except UnexpectedAlertPresentException as e:
        alert_text = "N/A"
        try:
            alert = driver.switch_to.alert
            alert_text = alert.text
            alert.accept()
        except Exception:
            pass # Alert might already be closed
        print(f"    [!!! XSS CONFIRMED !!!] An unexpected alert box appeared with text: '{alert_text}'")
        return True
    except Exception as e:
        print(f"    [ERROR] An error occurred during the browser test: {e}")
        return False
    finally:
        if driver:
            driver.quit()
            print("  Browser closed.")

print("The function 'confirm_xss_with_browser' is defined.")


The function 'confirm_xss_with_browser' is defined.


### 🔑 Stage 7: Access Control Simulation (IDOR)
This function simulates an **Insecure Direct Object Reference (IDOR)** attack, which falls under **Broken Access Control (A01:2021)**. The goal is to determine if the application correctly enforces authorization on the objects it serves.

The test works by directly requesting a valid object ID via a callback, bypassing the standard user interface (like a dropdown). If the server responds with the data corresponding to that ID, it indicates that the only check being performed is whether the object exists, not whether the current user is *authorized* to see it. In a multi-user or authenticated application, this would be a critical vulnerability. For this single-user app, it confirms a potential architectural risk.
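
The difference between an existence check and an authorization check can be sketched with a toy in-memory store. Everything below (the `RECORDS` table, the user names, and both function names) is hypothetical and only illustrates the pattern; it is not the dashboard's backend code.

```python
# Hypothetical store: object ID -> (owner, payload).
RECORDS = {
    "M001": ("alice", {"name": "glucose"}),
    "M002": ("bob", {"name": "lactate"}),
}

def fetch_existence_only(object_id):
    """IDOR-prone pattern: serves any ID that exists, regardless of who asks."""
    record = RECORDS.get(object_id)
    return None if record is None else record[1]

def fetch_with_authorization(object_id, current_user):
    """Safer pattern: the object must exist and belong to the requesting user."""
    record = RECORDS.get(object_id)
    if record is None:
        return None
    owner, payload = record
    if owner != current_user:
        raise PermissionError(f"{current_user} may not read {object_id}")
    return payload

# 'alice' requests bob's object directly, bypassing any dropdown.
print(fetch_existence_only("M002"))
try:
    fetch_with_authorization("M002", "alice")
except PermissionError as exc:
    print(f"blocked: {exc}")
```

The first call returns bob's record to anyone, which is the behavior the simulation looks for; the second raises instead.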


In [10]:
import requests
import json
import urllib.parse

# This function defines the logic for Stage 7.
# Make sure this cell is run before executing the main script in "Cell G".

def test_for_idor(base_url, target_input_id, target_input_property, all_outputs_for_target_callback, direct_object_reference_id):
    """
    Simulates an Insecure Direct Object Reference (IDOR) test by sending a request
    for a valid object that might not be presented to a user in a UI dropdown.
    """
    print(f"\n=== Advanced Test (IDOR Simulation) for Input: '{target_input_id}.{target_input_property}' ===")
    headers = {"Content-Type": "application/json"}
    
    callback_route = "/_dash-update-component"
    url = urllib.parse.urljoin(base_url, callback_route)
    
    # Construct the special string format for the multi-output 'output' key
    target_multi_output_string = ".." + "...".join(all_outputs_for_target_callback) + ".."

    body = {
        "output": target_multi_output_string,
        "inputs": [
            {
                "id": target_input_id,
                "property": target_input_property,
                "value": direct_object_reference_id
            }
        ],
        "changedPropIds": [f"{target_input_id}.{target_input_property}"],
        "state": []
    }

    print(f"  Testing with a direct object reference: '{direct_object_reference_id}'")
    try:
        r = requests.post(url, json=body, headers=headers, timeout=5)
        status_code = r.status_code
        
        print(f"    → Status: {status_code}")
        
        if status_code == 200:
            try:
                resp_json = r.json()
                resp_str = json.dumps(resp_json)
                
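                # str() guards against non-string option values (e.g. integers),
                # which would otherwise raise TypeError in the substring test.
                if str(direct_object_reference_id) in resp_str: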
                    print("    [FINDING] The server returned data for the directly referenced object.")
                    print("              In a multi-user app, this could be a critical IDOR vulnerability if the object belonged to another user.")
                    print("              For this single-user app, it confirms the backend serves any valid ID it receives.")
                else:
                    print("    [INFO] The server responded with 200 OK, but the requested ID was not found in the response snippet.")
                
                snippet = resp_str[:250].replace("\n", " ") + "..."
                print(f"    Resp snippet: {snippet!r}")

            except json.JSONDecodeError:
                print("    [INFO] Server returned 200 OK but the response was not valid JSON.")

    except requests.exceptions.RequestException as e: 
        print(f"    → Status: ERR ({e.__class__.__name__})")

    print(f"\n=== IDOR simulation for '{target_input_id}' complete ===\n")

print("Function 'test_for_idor' is now defined.")


Function 'test_for_idor' is now defined.


### 🆔 IDOR Prep: Discovering Valid Object IDs
To effectively simulate an IDOR attack in Stage 7, we first need to know what a *valid* object ID looks like. This function provides that information by:
1. Fetching the application's entire component layout from the `/_dash-layout` endpoint.
2. Recursively parsing the JSON structure to find a specific component by its ID (in this case, the `metabolite-dropdown-explorer`).
3. Extracting all the valid `value`s from that component's `options`.

This list of known-good IDs allows us to make a realistic direct request in the next stage.
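
Steps 2 and 3 can be tried offline on a hand-written layout. The sketch below walks a small sample tree with a generator; the sample layout, its option values, and the helper name `iter_components` are assumptions made for illustration, and a real `/_dash-layout` response will be larger and may nest differently.

```python
def iter_components(node):
    """Yield the props dict of every component in a Dash-style layout tree."""
    if isinstance(node, list):
        for child in node:
            yield from iter_components(child)
    elif isinstance(node, dict):
        props = node.get("props", {})
        yield props
        # 'children' may be a list, a single dict, a plain string, or absent.
        yield from iter_components(props.get("children"))

# Hypothetical miniature layout: a Div holding a heading and one dropdown.
sample_layout = {
    "type": "Div",
    "props": {"children": [
        {"type": "H1", "props": {"children": "Explorer"}},
        {"type": "Dropdown", "props": {
            "id": "metabolite-dropdown-explorer",
            "options": [{"label": "Glucose", "value": "M001"},
                        {"label": "Lactate", "value": "M002"}],
        }},
    ]},
}

dropdown = next(p for p in iter_components(sample_layout)
                if p.get("id") == "metabolite-dropdown-explorer")
valid_ids = [opt["value"] for opt in dropdown["options"]]
print(valid_ids)
```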


In [11]:
import requests
import json
import urllib.parse
import random # We'll need this for the test

def find_component_in_layout(layout_node, target_id):
    """
    A recursive helper function to search for a component with a specific ID
    within the nested structure of a Dash layout.
    """
    if isinstance(layout_node, dict):
        # Check if the current node is the one we're looking for
        if layout_node.get('props', {}).get('id') == target_id:
            return layout_node.get('props')
        
        # If not, check its children
        children = layout_node.get('props', {}).get('children')
        if children:
            if isinstance(children, list):
                for child in children:
                    found = find_component_in_layout(child, target_id)
                    if found:
                        return found
            # Also handle the case where children is a single dict
            elif isinstance(children, dict):
                found = find_component_in_layout(children, target_id)
                if found:
                    return found
    return None


def discover_valid_ids_from_layout(base_url, dropdown_id):
    """
    Discovers the list of valid options (IDs) from a specific dropdown
    component by fetching and parsing the app's initial layout.

    Returns:
        A list of strings representing the 'value' of each option in the dropdown.
    """
    print(f"\n=== Advanced IDOR Prep: Discovering Valid IDs from Dropdown '{dropdown_id}' ===")
    
    layout_url = urllib.parse.urljoin(base_url, "/_dash-layout")
    print(f"→ Fetching app layout from: {layout_url}")
    
    try:
        resp = requests.get(layout_url, timeout=5)
        resp.raise_for_status()
        layout = resp.json()
    except Exception as e:
        print(f"  Error: Could not fetch or parse the app layout. Details: {e}")
        return []

    # Find the specific dropdown component in the layout
    dropdown_props = find_component_in_layout(layout, dropdown_id)
    
    if not dropdown_props:
        print(f"  Error: Could not find dropdown with ID '{dropdown_id}' in the layout.")
        return []
        
    options = dropdown_props.get('options', [])
    if not options or not isinstance(options, list):
        print(f"  Error: Dropdown '{dropdown_id}' has no 'options' or options are not a list.")
        return []

    # Extract the 'value' from each option dictionary
    valid_ids = [option.get('value') for option in options if isinstance(option, dict) and 'value' in option]
    
    if not valid_ids:
        print("  Warning: Extracted options list is empty.")
        return []

    print(f"→ Discovery successful. Found {len(valid_ids)} valid IDs for '{dropdown_id}'.")
    return valid_ids

print("The function 'discover_valid_ids_from_layout' is defined.")


The function 'discover_valid_ids_from_layout' is defined.


### ▶️ Main Execution Orchestrator
This is the main driver cell of the notebook. It orchestrates the entire penetration test by calling the previously defined functions in a logical sequence:
1.  **Stage 1 & 1.5:** It begins by enumerating all live routes and dynamically discovering all fuzzable callbacks.
2.  **Stage 2 & 2.5:** It then runs both the broad and deep fuzzing tests on the discovered callbacks.
3.  **Stage 3, 4, 5:** It proceeds to test for file exposure, incorrect HTTP headers, and method tampering.
4.  **Stage 6 & 7:** Finally, it performs the advanced client-side XSS confirmation and the IDOR simulation.

The outputs from each stage are printed in real-time, providing a comprehensive report of the application's security posture at the end of the run.
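
The orchestration pattern, running each stage in order while isolating failures so that one broken stage does not stop the rest, can be sketched compactly. The `run_stages` helper and the two stand-in stage functions below are hypothetical; in the notebook, the stages would wrap the real functions defined in the earlier cells.

```python
def run_stages(stages):
    """Run (name, callable) pairs in order; a failing stage does not stop the others."""
    results = {}
    for name, stage in stages:
        print(f"--- EXECUTING {name} ---")
        try:
            results[name] = ("ok", stage())
        except Exception as exc:  # isolate failures per stage
            results[name] = ("error", f"{type(exc).__name__}: {exc}")
            print(f"ERROR during {name}: {exc}")
    return results

# Stand-in stages for illustration only.
def fake_enumerate():
    return ["http://localhost:8280/"]

def fake_headers():
    raise RuntimeError("connection refused")

summary = run_stages([("STAGE 1", fake_enumerate), ("STAGE 4", fake_headers)])
print(summary["STAGE 1"][0], summary["STAGE 4"][0])
```

Collecting the results in a dictionary also makes it easy to print a short pass/fail summary at the end of the run, in addition to the real-time output.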


In [29]:
# Cell G - Execute All Reconnaissance Stages Sequentially (with all advanced modules)

print("Cell G: --- STARTING FULL RECONNAISSANCE ---")

# Ensure all necessary global variables are defined from your configuration cell (e.g., Cell A)
required_globals = ['BASE_URL', 'FUZZ_INPUTS', 'FILE_CANDIDATES', 'DASH_CALLBACKS']
missing_globals = [var for var in required_globals if var not in globals()]

if missing_globals:
    print(f"Cell G: CRITICAL ERROR - The following global config variables are not defined: {', '.join(missing_globals)}. Please run your configuration cell first!")
else:
    # This entire block is now correctly indented under the 'else'
    
    live_routes = [] # Initialize in case Stage 1 fails (Stages 4 and 5 read this name)
    callbacks_to_fuzz = [] # Initialize in case discovery fails

    # --- Stage 1: Route Enumeration ---
    print("\n\n--- EXECUTING STAGE 1: ROUTE ENUMERATION ---")
    try:
        live_routes = enumerate_routes(BASE_URL)
        print(f"Stage 1 Result: Found {len(live_routes)} live routes.")
    except Exception as e_s1:
        print(f"ERROR during Stage 1 (enumerate_routes): {e_s1}")

    # --- Stage 1.5: Dynamic Callback Discovery ---
    try:
        callbacks_to_fuzz = discover_callbacks_to_fuzz(BASE_URL)
    except Exception as e_discover:
        print(f"ERROR during callback discovery: {e_discover}")

    # --- Stage 2: Multiple Callback Fuzzing (Broad Scan) ---
    print("\n\n--- EXECUTING STAGE 2: MULTIPLE CALLBACK FUZZING (BROAD) ---")
    if not callbacks_to_fuzz:
        print("  No suitable callbacks were discovered to fuzz. Skipping Broad Scan.")
    else:
        try:
            for cb_config in callbacks_to_fuzz:
                print(f"\n--- TARGETING IN BROAD SCAN: {cb_config['name']} ---")
                fuzz_specific_callback(
                    BASE_URL,
                    cb_config["input_id"],
                    cb_config["input_property"],
                    cb_config["outputs_list"]
                )
            print("\nStage 2 (Broad Scan) complete.")
        except Exception as e_s2:
            print(f"UNEXPECTED ERROR during Stage 2 (Broad Scan): {e_s2}")

    # --- Stage 2.5: Advanced Fuzzing with Encoding (Deep Scan) ---
    print("\n\n--- EXECUTING STAGE 2.5: FUZZING WITH ENCODING ---")
    if not callbacks_to_fuzz:
        print("  No callbacks were discovered. Skipping Advanced Fuzzing.")
    else:
        first_discovered_cb = callbacks_to_fuzz[0]
        print(f"  (Selecting first discovered callback for deep scan: '{first_discovered_cb['name']}')")
        
        xss_payloads_for_encoding_test = [
            "<script>alert('XSS')</script>",
            "<img src=x onerror=alert('XSS2')>"
        ]
        
        try:
            for payload in xss_payloads_for_encoding_test:
                fuzz_callback_with_encoding(
                    BASE_URL,
                    first_discovered_cb["input_id"],
                    first_discovered_cb["input_property"],
                    first_discovered_cb["outputs_list"],
                    payload
                )
            print("\nStage 2.5 (Fuzzing) complete.")
        except Exception as e_s2_adv:
             print(f"UNEXPECTED ERROR during Stage 2.5 (Advanced Fuzzing): {e_s2_adv}")


    # --- Stage 3: Advanced File Brute-Force ---
    print("\n\n--- EXECUTING STAGE 3: FILE BRUTE-FORCE ---")
    try:
        brute_force_files_advanced(BASE_URL)
        print("Stage 3 (File Brute-Force) complete.")
    except Exception as e_s3:
        print(f"UNEXPECTED ERROR during Stage 3: {e_s3}")

    # --- Stage 4: Security Header Analysis ---
    print("\n\n--- EXECUTING STAGE 4: SECURITY HEADER ANALYSIS ---")
    try:
        check_security_headers(BASE_URL, live_routes)
        print("Stage 4 (Security Header Analysis) complete.")
    except Exception as e_s4:
        print(f"UNEXPECTED ERROR during Stage 4: {e_s4}")
    
    # --- Stage 5: HTTP Method Tampering Analysis ---
    print("\n\n--- EXECUTING STAGE 5: HTTP METHOD TAMPERING ANALYSIS ---")
    try:
        test_http_methods(BASE_URL, live_routes)
        print("Stage 5 (HTTP Method Tampering Analysis) complete.")
    except Exception as e_s5:
        print(f"UNEXPECTED ERROR during Stage 5: {e_s5}")

    # --- Stage 6: Advanced XSS Confirmation with Headless Browser ---
    print("\n\n--- EXECUTING STAGE 6: XSS CONFIRMATION ---")
    try:
        if 'confirm_xss_with_browser' not in globals():
            raise NameError("confirm_xss_with_browser function is not defined.")

        xss_payload_for_browser = "<h1>XSS_TEST</h1><script>alert('XSS is confirmed via browser!')</script>"
        
        # Use the VISIBLE LABEL of the tab to click
        tab_to_click_label = "Metabolite Explorer"
        # The component that would trigger a callback to potentially display this XSS payload
        trigger_component = "metabolite-dropdown-explorer"

        print("  NOTE: This stage simulates a client-side vulnerability. Given our previous results,")
        print("  we expect this test to report [SAFE] because your app handles bad inputs well.")
        
        # Call the browser automation function
        confirm_xss_with_browser(
            BASE_URL,
            tab_to_click_label,
            trigger_component,
            xss_payload_for_browser
        )
        print("Stage 6 (XSS Confirmation) complete.")
    except NameError as e:
        print(f"ERROR during Stage 6: {e}. Please run the cell defining 'confirm_xss_with_browser'.")
    except Exception as e_s6:
        print(f"UNEXPECTED ERROR during Stage 6 (Browser XSS Test): {e_s6}")

    # --- Stage 7: Advanced IDOR Simulation with Dynamic Payload ---
    # Indented under the 'else' so it only runs when the configuration globals exist.
    print("\n\n--- EXECUTING STAGE 7: IDOR SIMULATION ---")
    try:
        if 'discover_valid_ids_from_layout' not in globals() or \
           'test_for_idor' not in globals() or \
           not callbacks_to_fuzz:
            print("  Skipping IDOR test: A required function or callback configuration is missing.")
        else:
            # Step 1: Dynamically discover the list of valid IDs from the UI
            target_dropdown_id_for_idor = "metabolite-dropdown-explorer"
            discovered_ids = discover_valid_ids_from_layout(BASE_URL, target_dropdown_id_for_idor)

            # Step 2: If IDs were found, pick one and run the IDOR test
            if discovered_ids:
                # Pick a random valid ID from the list we just discovered
                idor_test_id = random.choice(discovered_ids)
                print(f"  Info: Randomly selected '{idor_test_id}' for IDOR test.")

                # Get the callback configuration for the metabolite explorer
                metabolite_cb_config = next((item for item in callbacks_to_fuzz if item["input_id"] == target_dropdown_id_for_idor), None)

                if metabolite_cb_config:
                    test_for_idor(
                        BASE_URL,
                        metabolite_cb_config["input_id"],
                        metabolite_cb_config["input_property"],
                        metabolite_cb_config["outputs_list"],
                        idor_test_id # Use the dynamically discovered ID
                    )
                else:
                    print("  Error: Could not find callback configuration for the metabolite explorer.")
            else:
                print("  Skipping IDOR test execution because no valid IDs were discovered.")

        print("Stage 7 (Advanced IDOR Simulation) complete.")
    except Exception as e_s7:
        print(f"UNEXPECTED ERROR during Stage 7 (Advanced IDOR Simulation): {e_s7}")

print("\n\nCell G: --- FULL RECONNAISSANCE SCRIPT FINISHED ---")



Cell G: --- STARTING FULL RECONNAISSANCE ---


--- EXECUTING STAGE 1: ROUTE ENUMERATION ---
=== Stage 1: Route Enumeration (Actual Scan) ===
 → Found 14 candidate URLs (filtered to base URL); testing status codes…
  [200 OK] http://localhost:8280/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_14_2m1701447077.min.js
  [200 OK] http://localhost:8280/_dash-component-suites/dash/dash_table/bundle.v5_2_8m1701447077.js
  [200 OK] http://localhost:8280/_dash-component-suites/dash/dcc/dash_core_components-shared.v2_12_1m1701447077.js
  [200 OK] http://localhost:8280/_dash-component-suites/dash/dcc/dash_core_components.v2_12_1m1701447077.js
  [200 OK] http://localhost:8280/_dash-component-suites/dash/deps/polyfill@7.v2_14_2m1701447077.12.1.min.js
  [200 OK] http://localhost:8280/_dash-component-suites/dash/deps/prop-types@15.v2_14_2m1701447077.8.1.min.js
  [200 OK] http://localhost:8280/_dash-component-suites/dash/deps/react-dom@16.v2_14_2m1701447077.14.0.min.js
  [200 OK]

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>

<div style="font-family: Arial, sans-serif; line-height: 1.6;">

<!-- Section 5: Conclusion -->
<div style="border-left: 6px solid #D32F2F; background-color: #ffebee; padding: 20px 25px; margin: 25px 0; border-radius: 8px; box-shadow: 0 2px 5px rgba(0,0,0,0.1);">
  <h2 style="margin-top: 0; margin-bottom: 15px; color: #b71c1c; border-bottom: 2px solid #b71c1c; padding-bottom: 8px;">✅ 5. Conclusion &amp; Findings Summary</h2>
  <p style="margin-bottom: 16px; color: #333;">
    The systematic penetration test conducted on the "Comprehensive Multi-Omics Dashboard" confirms that the application, in its final hardened state, possesses a strong and resilient security posture against common web vulnerabilities. The testing process successfully achieved its objectives, providing a clear assessment of the application's risks and validating the effectiveness of its security controls.
  </p>
  <h3 style="margin-top: 20px; margin-bottom: 10px; color: #c62828;">Key Findings:</h3>
  <ul style="margin: 0; padding-left: 20px; color: #333; list-style-type: disc;">
    <li style="margin-bottom: 10px;">
      <strong>Effective Remediation:</strong> The most critical finding was the initial risk of information leakage through unhandled framework-level exceptions (HTTP 500 errors). The implementation of a global error handler in <code>app.py</code> proved to be a successful remediation, effectively preventing the exposure of sensitive stack traces and demonstrating a secure "fail-safe" architecture.
    </li>
    <li style="margin-bottom: 10px;">
      <strong>Robust Defense-in-Depth:</strong> The application demonstrated strong resilience against injection attacks (XSS, SQLi). This is attributed to a layered defense strategy combining the inherent safety of the backend's data-filtering logic with a restrictive Content-Security-Policy (CSP) that was confirmed to prevent script execution in a client-side browser test.
    </li>
    <li style="margin-bottom: 10px;">
      <strong>Secure Configuration:</strong> The audit confirmed that the server is correctly configured. All critical HTTP security headers are present and properly configured, and the application correctly rejects improper HTTP methods, minimizing its attack surface. No sensitive files were found to be exposed.
    </li>
    <li style="margin-bottom: 10px;">
      <strong>Context-Aware Access Control:</strong> The Insecure Direct Object Reference (IDOR) simulation showed that the backend serves any valid object ID it receives, without a per-user authorization check. This is acceptable for the application's intended use as a public, single-source dashboard, but the same design would be a vulnerability in a multi-user context.
    </li>
  </ul>
  <p style="margin-top: 20px; margin-bottom: 0; color: #333; font-weight: bold;">
    Overall, this penetration test serves as a successful case study, validating the "develop and validate" model as a necessary standard for creating trustworthy and responsible scientific software.
  </p>
</div>

</div>
</body>
</html>
