# Introduction

This notebook demonstrates how we help users create a bug report from only 4 inputs: 2 request-response pairs, one captured during normal browsing and one captured during a successful hack.

The LLM compares the two pairs to work out the logic behind the hack, i.e. how the hacker successfully exploited the vulnerability, and returns the details of the report: `summary`, `steps to reproduce` and `recommendation to fix this vulnerability`.
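
To make the comparison step concrete, here is a minimal sketch of the kind of difference the model is asked to reason about, using Python's standard `difflib`. The two request strings are hypothetical and only modeled on the change-password example in the demo data; the helper name `diff_requests` is an illustration, and the tool itself relies on the LLM rather than on a textual diff.

```python
import difflib

def diff_requests(baseline, hacker):
    """Return only the added/removed lines between two raw HTTP requests."""
    diff = difflib.unified_diff(
        baseline.splitlines(),
        hacker.splitlines(),
        fromfile="baseline",
        tofile="hacker",
        lineterm="",
    )
    # Keep changed lines only; skip the '---'/'+++' file headers and '@@' hunk markers.
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("---", "+++"))
    ]

# Hypothetical sample requests (not taken from demo_data.csv).
baseline_request = "GET /rest/user/change-password?current=old&new=abc&repeat=abc\nhost: example.test"
hacker_request = "GET /rest/user/change-password?new=abc&repeat=abc\nhost: example.test"

changes = diff_requests(baseline_request, hacker_request)
for line in changes:
    print(line)
```

A diff like this only shows what changed; explaining why the change succeeds as an attack is the part left to the LLM.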

## Impact
- In interviews, our users said this tool saves them about 1 hour per bug report, and they agreed that having it makes them more attached to our platform.

- They also said that, besides saving time, the tool helped them write better reports:
    - More structured (best-practice format: summary, steps to reproduce, recommendation); without it, most users write steps to reproduce that are either too detailed or too brief
    - Deeper knowledge (most of the time, inexperienced hackers cannot give a recommendation)
    - More professional (no grammatical mistakes)
    - Generated in seconds (about 6 seconds per report in the run below) instead of 60 minutes

## Technical details
- We use Google's `gemini-pro` model because most of our inputs exceed the `gpt-3.5-turbo` maximum token limit (4,096 tokens), while Gemini-Pro has a context window of roughly 32,000 tokens.

- To build a prompt template for `generating bug report in your custom style`, with the parameters as input variables (2 pairs of request-response in this case), follow the steps in my Medium article `Generating a report in a tailored format using Reverse Prompt Engineering`:
https://medium.com/@miltonchan_85581/generating-a-report-in-a-tailored-format-using-reverse-prompt-engineering-16d1944a6413
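
A quick pre-check can show whether a prompt is likely to fit a model's window before it is sent. The sketch below is a rough estimate only: it assumes the common rule of thumb of about 4 characters per token instead of either model's real tokenizer, and the function names and the limits table are illustrative. For exact counts, the model's own token counter should be used.

```python
# Approximate limits quoted in the technical details; treat them as rough figures.
CONTEXT_LIMITS = {"gpt-3.5-turbo": 4_096, "gemini-pro": 32_000}

CHARS_PER_TOKEN = 4  # rule-of-thumb assumption, not a real tokenizer

def estimate_tokens(text):
    """Rough token estimate: ceil(len(text) / CHARS_PER_TOKEN)."""
    return -(-len(text) // CHARS_PER_TOKEN)

def fits(text, model_name):
    """True if the estimated token count is within the model's limit."""
    return estimate_tokens(text) <= CONTEXT_LIMITS[model_name]

# Hypothetical prompt of 40,000 characters.
prompt = "x" * 40_000
print(estimate_tokens(prompt))
print(fits(prompt, "gpt-3.5-turbo"), fits(prompt, "gemini-pro"))
```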

In [1]:
!pip install -q google-generativeai==0.3.2
!pip install -q python-dotenv

In [2]:
import os

import pandas as pd
from tqdm import tqdm
from dotenv import load_dotenv, find_dotenv

# Get Gemini API key
try:
    _ = load_dotenv(find_dotenv()) # read local .env file
    GOOGLE_API_KEY = os.environ['GEMINI_API_KEY']
except KeyError:
    # Placeholder only: replace with a real key before calling the API
    GOOGLE_API_KEY = 'your Gemini key'

import google.generativeai as genai
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel('gemini-pro')


In [3]:
#! /usr/bin/env python

def bugreport_prompt_V1(baseline_Request, 
                        baseline_Response,
                        hacker_Request, 
                        hacker_Response): 
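    '''
    Build the bug-report prompt from a baseline (normal browsing) request-response
    pair and a hacker (successful exploit) request-response pair.
    '''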
    instruction_prompt = f"""
Generate a detailed bug report based on the provided baseline log and hacker logs. The focus is on xxxxxx:

[Confidential prompt which cannot be provided]

### Baseline Log:
Request:
'''
{baseline_Request}
'''
Response:
'''
{baseline_Response}
'''

### Hacker Log:
Request:
'''
{hacker_Request}
'''
Response:
'''
{hacker_Response}
'''
    """
    return instruction_prompt


In [4]:
df = pd.read_csv('data/demo_data.csv')

In [5]:
df

| Unnamed: 0 | example | request_baseline | response_baseline | request_hacking | response_hacking |
|---|---|---|---|---|---|
| 0 | 1 | `POST /api/BasketItems/\ncookie:_ga=GA1.1.16973...` | `HTTP/2.0 200 OK\ndate:Sat, 30 Dec 2023 14:41:2...` | `POST /api/BasketItems/\ncookie:_ga=GA1.1.16973...` | `HTTP/2.0 200 OK\ndate:Tue, 02 Jan 2024 11:08:3...` |
| 1 | 2 | `GET /rest/user/change-password?current=qqqqq&n...` | `HTTP/2.0 401 Unauthorized\ndate:Mon, 08 Jan 20...` | `GET /rest/user/change-password?new=qqqqq&repea...` | `HTTP/2.0 200 OK\ndate:Mon, 08 Jan 2024 07:21:4...` |
| 2 | 3 | `POST /rest/user/login\ncookie:_ga=GA1.1.169736...` | `HTTP/2.0 401 Unauthorized\ndate:Fri, 15 Dec 20...` | `POST /rest/user/login\ncookie:_ga=GA1.1.169736...` | `HTTP/2.0 200 OK\ndate:Fri, 15 Dec 2023 01:35:0...` |
| 3 | 4 | `GET /rest/user/whoami\ncookie:_ga=GA1.1.169736...` | `HTTP/2.0 200 OK\ndate:Thu, 14 Dec 2023 14:27:0...` | `GET /rest/user/whoami?callback=admin\ncookie:l...` | `HTTP/2.0 200 OK\ndate:Tue, 16 Jan 2024 06:08:3...` |


### Because the content is about hacking and may trip the LLM's safety filters, we have to customize the safety settings; otherwise the model returns empty content

In [6]:
safety_settings = [
    {
        "category": "HARM_CATEGORY_DANGEROUS",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_NONE",
    },
]

In [7]:
bugreports = []
for index, row in tqdm(df.iterrows(), total=len(df)):
    # Unpack the two request-response pairs (domain masking of the incoming logs is not shown here)
    baseline_Request = row["request_baseline"]
    baseline_Response = row["response_baseline"]
    hacker_Request = row["request_hacking"]
    hacker_Response = row["response_hacking"]

    bugreport_prompt = bugreport_prompt_V1(baseline_Request, baseline_Response, hacker_Request, hacker_Response)
    response = model.generate_content(
        bugreport_prompt,
        generation_config=genai.types.GenerationConfig(temperature=0),  # temperature 0 for repeatable output
        safety_settings=safety_settings,
    )
    # Keep the raw report text (converting it to JSON is not shown here)
    bugreport = response.text
    bugreports.append(bugreport)

100%|██████████| 4/4 [00:22<00:00,  5.74s/it]
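
The loop above stores each report as raw text. One possible way to turn that text into a JSON-ready dictionary is to split it on the bold section headers that appear in the generated reports (`**Summary:**`, `**Steps to Reproduce:**`, `**Recommendation:**`). The helper `split_report` and the sample text are illustrative assumptions; the model is not guaranteed to emit these exact headers every time.

```python
import json
import re

# Matches a bold header on its own line, e.g. "**Summary:**".
HEADER_RE = re.compile(r"^\*\*(.+?):\*\*[ \t]*$", re.MULTILINE)

def split_report(report):
    """Map each bold header (lower-cased) to the text that follows it."""
    sections = {}
    matches = list(HEADER_RE.finditer(report))
    for i, match in enumerate(matches):
        # A section runs until the next header, or to the end of the report.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report)
        sections[match.group(1).strip().lower()] = report[match.end():end].strip()
    return sections

# Hypothetical report text, shortened for illustration.
sample_report = """**Summary:**

Negative quantity accepted.

**Steps to Reproduce:**

1. Send the request.

**Recommendation:**

Validate the quantity field.
"""

parsed = split_report(sample_report)
print(json.dumps(parsed, indent=2))
```

Inline bold text inside a bullet, such as `* **Implement strong input validation:** ...`, is not treated as a header because the pattern requires the bold label to fill the whole line.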


In [8]:
print(bugreports[0])

**Summary:**

The hacker log demonstrates a successful attack on the `/api/BasketItems/` endpoint, allowing the attacker to add an item to the basket with a negative quantity. This vulnerability could be exploited to manipulate inventory levels and potentially lead to financial losses for the e-commerce platform.

**Steps to Reproduce:**

1. Send a POST request to the `/api/BasketItems/` endpoint with the following JSON payload:
```
{"ProductId":33,"BasketId":"6","quantity":-1000}
```
2. Include the following headers in the request:
```
Authorization: Bearer <access_token>
Content-Type: application/json
```
3. The response will be a 200 OK status code with a JSON payload indicating the successful addition of the item to the basket.

**Recommendation:**

To address this vulnerability, the following recommendations are suggested:

* Implement input validation to ensure that the quantity field cannot be set to a negative value.
* Consider using a range of allowed values for the quantity f

In [9]:
print(bugreports[1])

**Summary:**

The hacker log demonstrates a successful password change attack by exploiting a vulnerability in the password change endpoint. The attacker was able to change the password without providing the current password, allowing them to gain unauthorized access to the user's account. This vulnerability could have severe consequences, such as account takeover, data theft, or financial fraud.

**Steps to Reproduce:**

1. Send a GET request to the `/rest/user/change-password` endpoint with the `new` and `repeat` parameters set to the desired new password.
2. Omit the `current` parameter, which is typically required to verify the user's current password.
3. The server will process the request and update the user's password without requiring the current password.

**Recommendation:**

To address this vulnerability, the following recommendations should be implemented:

* **Implement strong input validation:** Ensure that the `current` parameter is always present and matches the user's 

In [10]:
print(bugreports[2])

**Summary:**

The security issue identified in the hacker logs is an SQL injection vulnerability in the login endpoint. This vulnerability allows an attacker to bypass authentication and gain unauthorized access to the application by injecting malicious SQL queries into the login request.

**Steps to Reproduce:**

1. Send a POST request to the `/rest/user/login` endpoint with the following JSON payload:

```json
{"email":"hi' or 1=1 --","password":"vvv"}
```

2. The server responds with an HTTP 200 OK status code and a JSON response containing an authentication token.

3. The attacker can use the authentication token to access protected resources within the application.

**Recommendation:**

To address this vulnerability, it is recommended to implement proper input validation and sanitization on the server-side to prevent malicious SQL queries from being executed. Additionally, consider using prepared statements or parameterized queries to prevent SQL injection attacks.


In [11]:
print(bugreports[3])

**Summary:**

The application is vulnerable to a JSONP (JSON with Padding) attack. The attacker can exploit this vulnerability to execute arbitrary JavaScript code in the victim's browser. This could allow the attacker to steal sensitive information, such as cookies, session tokens, or credit card numbers.

**Steps to Reproduce:**

1. The attacker sends a request to the application with a `callback` parameter. The value of the `callback` parameter is a function name that the application will call when it responds to the request.
2. The application responds to the request with a JSONP response. The JSONP response is a JavaScript function call that includes the data that the application wants to return to the client.
3. The victim's browser executes the JavaScript function call. This executes the arbitrary JavaScript code that the attacker included in the `callback` parameter.

**Recommendation:**

To address this vulnerability, the application should validate the value of the `callback`