# Use Case Description
Root cause analysis in cybersecurity identifies the underlying causes of security incidents, so that organizations can implement effective measures to prevent future breaches. However, the process is often tedious and time-consuming, since it requires sifting through large volumes of data, correlating events, and detecting patterns manually. AI models can accelerate this work by automating data analysis, improving accuracy, and reducing the effort needed to pinpoint root causes.


## Model used for this use case
Both the Instruct Model and the Reasoning Model are suitable for this task. This example uses the Reasoning Model.

## Setup

The setup scripts below are essentially the same as those in the [Quickstart (Reasoning Model)](https://github.com/RobustIntelligence/foundation-ai-cookbook/blob/main/1_quickstarts/Preview_Quickstart_reasoning_model.ipynb).

### Notice
- The code below assumes that users have access to the models via Hugging Face. If you are using API access instead, please replace the inference code with the API version provided in the Quickstart guide.
- This model is currently in preview mode and may receive updates. As a result, outputs can vary even when parameters are configured to ensure reproducibility.

In [1]:
import os
import transformers
import torch
from IPython.display import display, Markdown

HF_TOKEN = os.getenv("HF_TOKEN")

def _get_device():
    if torch.cuda.is_available():
        return "cuda"
    elif torch.backends.mps.is_available():
        return "mps"
    else:
        return "cpu"

DEVICE = _get_device()

In [2]:
MODEL_ID = ""  # To be replaced with the final model name

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, token=HF_TOKEN)
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=MODEL_ID,
    device_map="auto",
    torch_dtype=torch.float32, # this model's tensor_type is float32
    token=HF_TOKEN,
)


In [3]:
generation_args = {
    "max_new_tokens": 1024,
    "temperature": None,
    "repetition_penalty": 1.2,
    "do_sample": False,
    "use_cache": True,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
}

In [4]:
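import re

def inference(prompt):
    # Wrap the prompt in a single-turn chat and render it with the model's chat template.
    messages = [
        {"role": "user", "content": prompt},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

    # Older model versions did not include the <think> token in the chat template,
    # so append it when it is missing.
    think_token = "<think>\n"
    if not inputs.endswith(think_token):
        inputs += think_token

    inputs = tokenizer(inputs, return_tensors="pt").to(DEVICE)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            **generation_args,
        )
    response = tokenizer.decode(outputs[0], skip_special_tokens=False)

    # Extract everything generated after <think>, up to the end-of-text token.
    # If generation stopped at max_new_tokens, the end-of-text token is absent,
    # so fall back to the end of the string instead of failing on a None match.
    match = re.search(r"<think>(.*?)(?:<\|end_of_text\|>|$)", response, re.DOTALL)
    return match.group(1).strip() if match else response.strip()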
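
The string returned by `inference` contains everything generated after `<think>`. If the model closes its reasoning with a `</think>` tag (an assumption; the raw output is not shown in this notebook), the reasoning trace and the final answer could be separated with a small helper. The name `split_reasoning` and its `close_tag` parameter are hypothetical and not part of the Quickstart.

```python
def split_reasoning(response: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split generated text into (reasoning, answer) at the first close_tag.

    Assumes the model marks the end of its reasoning with close_tag.
    If the tag is missing, the whole text is treated as the answer.
    """
    reasoning, sep, answer = response.partition(close_tag)
    if not sep:
        return "", response.strip()
    return reasoning.strip(), answer.strip()


# Example with a made-up response string
reasoning, answer = split_reasoning("check the logs first\n</think>\n## Issue Analysis")
print(answer)
```

Keeping the reasoning trace separate makes it possible to display only the final answer while still logging the trace for review.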

## Security Logs Assessment

Here is a short mock security log, together with the firewall configuration.

In [5]:
context = """

Sample Security Log

| Timestamp           | Source IP     | Destination IP | Protocol | Action   | Reason                  | User           |
|---------------------|---------------|----------------|----------|----------|-------------------------|----------------|
| 2025-07-14 22:15:03 | 192.168.1.45  | 10.10.10.5     | TCP      | Denied   | Suspicious Port Scan    | N/A            |
| 2025-07-14 22:15:10 | 192.168.1.45  | 10.10.10.5     | TCP      | Denied   | Suspicious Port Scan    | N/A            |
| 2025-07-14 22:15:15 | 192.168.1.45  | 10.10.10.5     | TCP      | Denied   | Suspicious Port Scan    | N/A            |
| 2025-07-14 22:16:00 | 192.168.1.45  | 10.10.10.5     | TCP      | Allowed  | Established Connection  | user123        |
| 2025-07-14 22:16:05 | 192.168.1.45  | 10.10.10.5     | TCP      | Allowed  | Data Transfer           | user123        |
| 2025-07-14 22:17:00 | 192.168.1.45  | 10.10.10.5     | TCP      | Denied   | Suspicious Payload      | user123        |

Firewall rules

| Rule ID | Source IP | Destination IP | Protocol | Port Range | Action |
|---------|-----------|----------------|----------|------------|--------|
| 101     | Any       | 10.10.10.5     | TCP      | 1-65535    | Allow  |


"""

display(Markdown(context))
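
Before handing the logs to the model, the same kind of event correlation can be sketched in plain Python as a baseline. The snippet below is a minimal illustration, not part of the original workflow: `parse_markdown_tables` and `allowed_after_scan` are hypothetical helpers, and the rule "an allowed connection following a denied port scan from the same source is suspicious" is an assumption chosen to match the mock data.

```python
def parse_markdown_tables(text):
    """Parse every pipe-delimited Markdown table in text into a list of row dicts."""
    tables, header, rows = [], None, []
    for line in text.splitlines() + [""]:  # trailing "" flushes the last table
        line = line.strip()
        if line.startswith("|"):
            cells = [c.strip() for c in line.strip("|").split("|")]
            if header is None:
                header = cells  # first row of a table is the header
            elif set("".join(cells)) <= set("-: "):
                continue  # separator row such as |----|----|
            else:
                rows.append(dict(zip(header, cells)))
        elif header is not None:
            tables.append(rows)
            header, rows = None, []
    return tables


def allowed_after_scan(log_rows):
    """Return allowed events whose (source, destination) pair was earlier denied for a port scan."""
    scanned, findings = set(), []
    for row in log_rows:
        pair = (row["Source IP"], row["Destination IP"])
        if row["Action"] == "Denied" and "Port Scan" in row["Reason"]:
            scanned.add(pair)
        elif row["Action"] == "Allowed" and pair in scanned:
            findings.append(row)
    return findings


# Two rows taken from the mock log above
sample = """
| Timestamp           | Source IP    | Destination IP | Protocol | Action  | Reason                 | User    |
|---------------------|--------------|----------------|----------|---------|------------------------|---------|
| 2025-07-14 22:15:03 | 192.168.1.45 | 10.10.10.5     | TCP      | Denied  | Suspicious Port Scan   | N/A     |
| 2025-07-14 22:16:00 | 192.168.1.45 | 10.10.10.5     | TCP      | Allowed | Established Connection | user123 |
"""
log_rows = parse_markdown_tables(sample)[0]
for row in allowed_after_scan(log_rows):
    print(row["Timestamp"], row["Source IP"], "->", row["Destination IP"], row["Reason"])
```

In the notebook, `parse_markdown_tables(context)` would return the security log as the first table and the firewall rules as the second. A hand-written rule like this only catches the pattern it was written for, which is where the model's broader analysis is meant to help.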







In [6]:
def make_prompt(context):
    return f"""You are a security expert in charge of root cause analysis of security incidents.
    
    Go over the context given, and analyze the root cause of the issue
    
    ## CONTEXT
    {context}
    
    Respond with:
    - What is the issue?
    - What is the root cause?
    - How to make modifications to prevent the issue?

    When displaying a table, ensure it is clearly presented using Markdown.
    """

In [7]:
response = inference(make_prompt(context))
display(Markdown(response))

## Issue Analysis

### What is the issue?

The logs indicate that there have been multiple attempts by `192.168.1.45` (source) to connect to `10.10.10.5` (destination). Initially, these connections were denied due to suspicious port scans (`TCP Deny for Suspicious Port Scan`). However, after some time, an established connection was allowed (`Allowed Established Connection`) followed by data transfer. But then another attempt was denied again because of a suspicious payload (`TCP Denied for Suspicious Payload`).

**Issue:** The firewall is inconsistently applying its rules, allowing unauthorized access despite previous denials based on port scanning.

### Root Cause

**Root Cause:** 
The firewall rule set allows all TCP traffic from any source IP to destination `10.10.10.5` on ports 1-65535. This broad rule overrides more specific detections like port scan detection or malicious payloads. Since the allow rule has higher precedence than deny rules for the same protocol/port range, once the initial connection is allowed (after the port scan), subsequent legitimate traffic isn't blocked even if it's part of a potential attack sequence. 

Additionally, the system might not be properly configured to track sessions beyond the initial packet inspection. Therefore, once a connection is established, further packets within that session aren’t being inspected as aggressively, leading to false negatives where malicious activity occurs post-initial allowance.

### Solution Recommendations

To address this issue effectively:

#### Modify Firewall Rules

1. **Implement Layered Defense**: Create separate rules for different stages of communication. For example, block initial port scans entirely rather than just flagging them. Use rate limiting to detect and stop repeated probes.

2. **Update Existing Rules**: Ensure that the "Allow" rule for `Any -> 10.10.10.5 TCP 1-65535` is either removed or made more restrictive. Instead, define specific exceptions for known services required, e.g., only allow certain ports (like HTTP/HTTPS, SSH, etc.) instead of the entire range.

3. **Anomaly Detection Integration**: Integrate with IDS/IPS systems that can perform deeper inspection and anomaly-based blocking. These tools can look for patterns indicative of attacks even after a connection is established.

4. **Session Tracking Improvements**: If possible, configure the firewall to maintain stateful tracking of active sessions. This ensures that each packet is checked against the existing session’s legitimacy, preventing malicious activities during ongoing communications.

Here's how updated firewall rules could look like:

```markdown
| Rule ID | Source IP | Destination IP | Protocol | Port Range | Action         | Notes                           |
|---------|-----------|----------------|----------|------------|---------------|-----------------------------|
| 102     | Any       | 10.10.10.5     | TCP      | 80,443     | Allow         | HTTPS & SSH                 |
| 103     | Any       | 10.10.10.5     | TCP      | 0-1023     | Block          | Disallow non-approved ports   |
| 104     | Any       | 10.10.10.5     | TCP      | 1024-65535  | Detect & Alert | Monitor for unusual ports     |
```

This approach enforces stricter controls at various layers, reducing the risk of inconsistent application of policies and improving overall network security posture.
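
The model's central finding, that rule 101 allows every TCP port from any source, can also be cross-checked mechanically. The sketch below is one possible way to flag overly broad allow rules. The `Rule` dataclass, the `overly_broad` helper, and the `max_ports` threshold of 16 are all hypothetical choices made for illustration, not values from the source.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    rule_id: str
    source: str
    destination: str
    protocol: str
    port_range: str
    action: str


def port_count(port_range: str) -> int:
    """Count the ports covered by a spec such as '1-65535' or '80,443'."""
    total = 0
    for part in port_range.split(","):
        low, _, high = part.strip().partition("-")
        total += int(high or low) - int(low) + 1
    return total


def overly_broad(rules, max_ports=16):
    """Flag allow rules that accept any source and cover more than max_ports ports.

    max_ports is an arbitrary illustrative threshold, not a recommended value.
    """
    return [
        r for r in rules
        if r.action.lower() == "allow"
        and r.source.lower() == "any"
        and port_count(r.port_range) > max_ports
    ]


rules = [
    Rule("101", "Any", "10.10.10.5", "TCP", "1-65535", "Allow"),  # original rule
    Rule("102", "Any", "10.10.10.5", "TCP", "80,443", "Allow"),   # narrowed rule from the response
]
for rule in overly_broad(rules):
    print(f"Rule {rule.rule_id} allows {port_count(rule.port_range)} ports from any source")
```

A check like this does not replace the model's explanation, but it gives a quick way to confirm that a proposed rule set no longer contains the wide-open rule.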