# Use Case Description

Automate security monitoring and guidance during the software development lifecycle (SDLC) by embedding a GitHub Action that triggers on pull requests (PRs). The system summarizes changes, evaluates them for security risks, and provides actionable recommendations to reviewers. This turns every PR into an opportunity to improve security posture proactively, through contextual reasoning rather than static scanning alone.
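The GitHub side of this flow is not shown in this notebook. As a rough sketch only, the code below builds the two GitHub REST API requests such an Action would need: one to fetch the PR as a unified diff and one to post the analysis back as a PR comment. The function names and the `analyze` callback are hypothetical; `analyze` stands in for a wrapper around the `inference` and `make_prompt` functions defined later. It assumes the workflow exposes a token in the `GITHUB_TOKEN` environment variable.

```python
import json
import os
import urllib.request

GITHUB_API = "https://api.github.com"


def build_diff_request(repo: str, pr_number: int, token: str) -> urllib.request.Request:
    """Build a GET request that asks the GitHub REST API for the PR as a unified diff."""
    return urllib.request.Request(
        f"{GITHUB_API}/repos/{repo}/pulls/{pr_number}",
        headers={
            "Accept": "application/vnd.github.diff",  # diff media type instead of JSON
            "Authorization": f"Bearer {token}",
        },
    )


def build_comment_request(repo: str, pr_number: int, token: str, body: str) -> urllib.request.Request:
    """Build a POST request that adds a comment to the PR conversation."""
    payload = json.dumps({"body": body}).encode("utf-8")
    return urllib.request.Request(
        f"{GITHUB_API}/repos/{repo}/issues/{pr_number}/comments",
        data=payload,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


def pr_number_from_event(event_path: str) -> int:
    """Read the PR number from the event payload file that GitHub Actions provides."""
    with open(event_path, encoding="utf-8") as f:
        return json.load(f)["pull_request"]["number"]


def review_pull_request(analyze) -> None:
    """Fetch the diff, run the hypothetical `analyze(diff_text) -> str`, post the result."""
    repo = os.environ["GITHUB_REPOSITORY"]  # "owner/name", set by GitHub Actions
    token = os.environ["GITHUB_TOKEN"]      # assumption: passed in by the workflow
    number = pr_number_from_event(os.environ["GITHUB_EVENT_PATH"])
    with urllib.request.urlopen(build_diff_request(repo, number, token)) as resp:
        diff_text = resp.read().decode("utf-8")
    comment = analyze(diff_text)
    with urllib.request.urlopen(build_comment_request(repo, number, token, comment)):
        pass
```

Keeping request construction separate from the network calls makes the URLs and headers easy to check without touching GitHub. A real Action would also need error handling and a job with permission to write PR comments.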

## Model used for this use case
Both the Instruct Model and the Reasoning Model are suitable for this task. This example uses the Instruct Model.

## Setup

The setup code below is essentially the same as in the [Quickstart (Instruct Model)](https://github.com/RobustIntelligence/foundation-ai-cookbook/blob/main/1_quickstarts/Preview_Quickstart_instruct_model.ipynb).

In [1]:
import os
import transformers
import torch
from IPython.display import display, Markdown

HF_TOKEN = os.getenv("HF_TOKEN")
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

In [2]:
MODEL_ID = "" # To be replaced with the final model name

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, token=HF_TOKEN)
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=MODEL_ID,
    device_map="auto",
    torch_dtype=torch.bfloat16, # this model's tensor_type is BF16
    token=HF_TOKEN,
)


In [3]:
generation_args = {
    "max_new_tokens": 1024,
    "temperature": None,
    "repetition_penalty": 1.2,
    "do_sample": False,
    "use_cache": True,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
}

In [4]:
import re

def inference(prompt, system_prompt):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    
    inputs = tokenizer(inputs, return_tensors="pt").to(DEVICE)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            **generation_args,
        )
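    response = tokenizer.decode(outputs[0], skip_special_tokens=False)

    # Extract only the assistant's reply. The special tokens below are the ones
    # this notebook relies on; adjust them if the model's chat template differs.
    match = re.search(r"<\|assistant\|>(.*?)<\|end_of_text\|>", response, re.DOTALL)
    if match is None:
        # Generation may stop at max_new_tokens before the end token is emitted.
        match = re.search(r"<\|assistant\|>(.*)", response, re.DOTALL)

    # Fall back to the full decoded text rather than failing on a missing match.
    return match.group(1).strip() if match else response.strip()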
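The reply-extraction step can be tried without loading the model. The sketch below is a standalone, hypothetical helper (`extract_assistant_text` is not part of the notebook) that isolates the text between the `<|assistant|>` and `<|end_of_text|>` tokens used above, and tolerates a missing end token. The decoded strings in the usage lines are made up for illustration.

```python
ASSISTANT_TAG = "<|assistant|>"
END_TAG = "<|end_of_text|>"


def extract_assistant_text(decoded: str) -> str:
    """Return the text after the first assistant tag, up to the end tag if present."""
    _, found, tail = decoded.partition(ASSISTANT_TAG)
    if not found:
        # No assistant tag at all: return the input unchanged (minus whitespace).
        return decoded.strip()
    reply, _, _ = tail.partition(END_TAG)  # if END_TAG is absent, reply is all of tail
    return reply.strip()


# Made-up decoded outputs, only to exercise the helper.
complete = "<|system|>sys<|user|>hi<|assistant|> Looks fine. <|end_of_text|>"
truncated = "<|assistant|>Output cut off at the token limit"
print(extract_assistant_text(complete))
print(extract_assistant_text(truncated))
```

A truncated reply is still returned here, which matters because `max_new_tokens` can end generation before the end token appears.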

## Security Analysis

In [5]:
SYSTEM_PROMPT = "You are a security expert reviewing changes introduced in a pull request."

def make_prompt(pr_diff_text):
    return f'''Analyze the following code diff and identify:
    1. Security-relevant changes
    2. Any potential vulnerabilities introduced
    3. A clear summary of affected areas
    4. Recommended remediations (if needed)
    
    ## PULL REQUEST DIFF
    {pr_diff_text}'''


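In this sample PR diff, the use of `eval()` is replaced with `ast.literal_eval()`, which is a significant security improvement: `eval()` can execute arbitrary code, whereas `ast.literal_eval()` only parses Python literals such as strings, numbers, tuples, lists, dicts, sets, booleans, and `None`. The diff also adds lint and test steps before the deploy step in the GitHub Actions workflow.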
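The difference is easy to see in isolation. The following sketch wraps `ast.literal_eval()` in a hypothetical helper, `parse_literal`, which is not part of the sample PR, and shows that a literal is parsed while an expression containing a function call is rejected rather than executed.

```python
import ast


def parse_literal(data: str):
    """Parse a string containing a Python literal; reject anything else."""
    try:
        return ast.literal_eval(data)
    except (ValueError, SyntaxError) as exc:
        # literal_eval raises instead of executing non-literal input.
        raise ValueError(f"not a Python literal: {data!r}") from exc


# A plain literal is parsed into the corresponding Python object.
print(parse_literal("{'retries': 3, 'hosts': ['a', 'b']}"))

# A function call is not a literal, so it is rejected and never runs.
try:
    parse_literal("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected:", exc)
```

`ast.literal_eval()` does not guard against resource exhaustion: the Python documentation notes that sufficiently large or deeply nested input can still exhaust memory or the C stack, so input size limits remain a sensible precaution.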

In [6]:
pr_diff_text = """
diff --git a/app/utils.py b/app/utils.py
index e69de29..b25a3c0 100644
--- a/app/utils.py
+++ b/app/utils.py
@@ def process_input(data):
-    return eval(data)
+    import ast
+    return ast.literal_eval(data)

diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
index a1b2c3d..d4e5f6g 100644
--- a/.github/workflows/deploy.yml
+++ b/.github/workflows/deploy.yml
@@ steps:
-      run: npm run deploy
+      run: |
+        set -e
+        npm run lint
+        npm test
+        npm run deploy
"""


In [7]:
response = inference(make_prompt(pr_diff_text), SYSTEM_PROMPT)
display(Markdown(response))

### Analysis:

#### 1. **Security-Relevant Changes**

**In `app/utils.py`:**
The change from using Python's built-in `eval` function to use `ast.literal_eval` for processing input data is significant.

*Before:* 
```python
def process_input(data):
    return eval(data) # This can evaluate arbitrary expressions, which poses serious risks if untrusted or malicious data is passed as an argument.
```

*After:*
```python
import ast
def process_input(data):
    return ast.literal_eval(data) # Safer than 'eval' because it only evaluates strings that could be assigned to names according to the syntax rules; does not execute statements or expressions with side effects.
```
This switch reduces the risk associated with executing potentially harmful code due to unsafe evaluation of user-provided inputs.

**In `.github/workflows/deploy.yml`:**
The introduction of additional commands (`npm run lint`, `npm test`) before running `npm run deploy` adds more robustness to the deployment workflow by ensuring that any errors related to code quality and tests will halt the pipeline early, preventing deployments of broken software.

#### 2. **Potential Vulnerabilities Introduced**

* The original usage of `eval()` posed a high-risk vulnerability since it allowed execution of arbitrary python code provided through the `data` parameter. This could lead to command injection attacks, information disclosure, denial-of-service (DoS), and other types of attacks depending on how this method was used within the application context.

* There were no new obvious vulnerabilities added directly by these changes themselves but rather the removal of one dangerous practice and addition of good practices like testing and linting improve overall maintainability and reduce future risks.

#### 3. **Summary of Affected Areas**

* **Code Quality Improvement:** The replacement of `eval` with `ast.literal_eval` improves safety when handling string representations of Python objects without allowing the execution of arbitrary code.
  
* **CI/CD Pipeline Enhancement:** The updated GitHub Actions workflow now includes static analysis via linter and unit tests before deploying the application, enhancing the reliability and stability of releases.

#### 4. **Recommended Remediations (If Needed)**

No direct remediation actions are required based solely on the given diffs unless there are further contexts we're unaware of. However, here are some general recommendations:

* Ensure all parts of your application handle user-supplied data safely, avoiding dynamic execution whenever possible.
* Continuously review and update dependencies and their versions to ensure they do not introduce known vulnerabilities into your project.
* Maintain comprehensive test coverage to catch regressions and unexpected behavior changes.
* Regularly audit third-party libraries and tools for known weaknesses.
* Keep CI/CD pipelines rigorous so that they fail fast upon encountering issues such as failing tests or lint warnings.

Given the limited scope of the provided diffs, these suggestions focus broadly on maintaining secure coding standards and improving development processes. If deeper integration with the rest of the system is necessary, consider conducting a full code review or performing automated scans for vulnerabilities across the entire codebase.