Building the prompt from the dataset

In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split

In [37]:
friday = pd.read_csv('Friday-WorkingHours-Afternoon-DDos.pcap_ISCX.csv')
# Strip surrounding whitespace from the column names so they can be selected by their plain names
friday.columns = [column.strip() for column in friday.columns]
friday.head()

,Destination Port,Flow Duration,Total Fwd Packets,Total Backward Packets,Total Length of Fwd Packets,Total Length of Bwd Packets,Fwd Packet Length Max,Fwd Packet Length Min,Fwd Packet Length Mean,Fwd Packet Length Std,...,min_seg_size_forward,Active Mean,Active Std,Active Max,Active Min,Idle Mean,Idle Std,Idle Max,Idle Min,Label
0,54865,3,2,0,12,0,6,6,6.0,0.0,...,20,0.0,0.0,0,0,0.0,0.0,0,0,BENIGN
1,55054,109,1,1,6,6,6,6,6.0,0.0,...,20,0.0,0.0,0,0,0.0,0.0,0,0,BENIGN
2,55055,52,1,1,6,6,6,6,6.0,0.0,...,20,0.0,0.0,0,0,0.0,0.0,0,0,BENIGN
3,46236,34,1,1,6,6,6,6,6.0,0.0,...,20,0.0,0.0,0,0,0.0,0.0,0,0,BENIGN
4,54863,3,2,0,12,0,6,6,6.0,0.0,...,20,0.0,0.0,0,0,0.0,0.0,0,0,BENIGN


In [4]:
friday.columns

Index(['Destination Port', 'Flow Duration', 'Total Fwd Packets',
       'Total Backward Packets', 'Total Length of Fwd Packets',
       'Total Length of Bwd Packets', 'Fwd Packet Length Max',
       'Fwd Packet Length Min', 'Fwd Packet Length Mean',
       'Fwd Packet Length Std', 'Bwd Packet Length Max',
       'Bwd Packet Length Min', 'Bwd Packet Length Mean',
       'Bwd Packet Length Std', 'Flow Bytes/s', 'Flow Packets/s',
       'Flow IAT Mean', 'Flow IAT Std', 'Flow IAT Max', 'Flow IAT Min',
       'Fwd IAT Total', 'Fwd IAT Mean', 'Fwd IAT Std', 'Fwd IAT Max',
       'Fwd IAT Min', 'Bwd IAT Total', 'Bwd IAT Mean', 'Bwd IAT Std',
       'Bwd IAT Max', 'Bwd IAT Min', 'Fwd PSH Flags', 'Bwd PSH Flags',
       'Fwd URG Flags', 'Bwd URG Flags', 'Fwd Header Length',
       'Bwd Header Length', 'Fwd Packets/s', 'Bwd Packets/s',
       'Min Packet Length', 'Max Packet Length', 'Packet Length Mean',
       'Packet Length Std', 'Packet Length Variance', 'FIN Flag Count',
       'SYN Flag Co

In [5]:
# Reduce to the features reported as best for DDoS detection (according to a published paper)

friday_reduced = friday[[
    'Bwd Packet Length Min', 'Bwd Packet Length Std', 'Average Packet Size',
    'Flow Duration', 'Flow IAT Std', 'Label',
]]
friday_reduced.head()

,Bwd Packet Length Min,Bwd Packet Length Std,Average Packet Size,Flow Duration,Flow IAT Std,Label
0,0,0.0,9.0,3,0.0,BENIGN
1,6,0.0,9.0,109,0.0,BENIGN
2,6,0.0,9.0,52,0.0,BENIGN
3,6,0.0,9.0,34,0.0,BENIGN
4,0,0.0,9.0,3,0.0,BENIGN


In [110]:
friday_reduced.describe()

,Bwd Packet Length Min,Bwd Packet Length Std,Average Packet Size,Flow Duration,Flow IAT Std
count,225745.0,225745.0,225745.0,225745.0,225745.0
mean,16.718776,1230.172938,574.568843,16241650.0,4248569.0
std,50.480568,1733.201267,626.096202,31524370.0,7622819.0
min,0.0,0.0,0.0,-1.0,0.0
25%,0.0,0.0,7.5,71180.0,19104.46
50%,0.0,2.44949,141.0,1452333.0,564167.6
75%,6.0,2436.833027,1291.888889,8805237.0,4033232.0
max,1460.0,8194.660487,2528.0,119999900.0,69200000.0


In [16]:
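# Hold out a tiny test set: 0.01% of the rows (23 flows).
# No random_state is set, so the split changes between runs.
train, test = train_test_split(friday_reduced, test_size=1e-4)
test.shape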

(23, 6)

In [17]:
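# Few-shot examples: 5 BENIGN flows followed by 5 DDoS flows from the training split
training_sample = pd.concat([
    train[train['Label'] == 'BENIGN'].sample(5),
    train[train['Label'] == 'DDoS'].sample(5),
])
# Single held-out flow to classify (the list index keeps it a one-row DataFrame)
testing_sample = test.iloc[[0]]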

In [92]:
def promptify_data(df):
    """Render each row as 'name: value | ...' and alternate rows from the two ends of df."""
    # Friendlier display name for the inter-arrival-time feature
    display_names = {'Flow IAT Std': 'Time Between Packets Std'}
    formatted_rows = [
        ' | '.join(f'{display_names.get(name, name)}: {value}' for name, value in row.items())
        for _, row in df.iterrows()
    ]

    # Take rows alternately from the front and the back, so a frame ordered as
    # [BENIGN..., DDoS...] with equal class counts yields alternating labels
    interleaved_rows = []
    while len(formatted_rows) > 1:
        interleaved_rows.append(formatted_rows.pop(0))
        interleaved_rows.append(formatted_rows.pop(-1))
    if len(formatted_rows) == 1:
        interleaved_rows.append(formatted_rows[0])

    return '\n'.join(interleaved_rows)
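
The interleaving step inside promptify_data can be looked at in isolation. The sketch below is a minimal stand-alone version that works on plain strings instead of formatted rows. The name `interleave_ends` and the placeholder items are hypothetical, and `collections.deque` is swapped in for the list so that taking from the front stays cheap; the ordering it produces is the same as the pop(0) / pop(-1) loop.

```python
from collections import deque


def interleave_ends(items):
    """Alternate items taken from the front and the back of a sequence."""
    queue = deque(items)
    result = []
    while len(queue) > 1:
        result.append(queue.popleft())  # next item from the front
        result.append(queue.pop())      # next item from the back
    if queue:
        result.append(queue.pop())      # odd count: the middle item goes last
    return result


# Hypothetical placeholders for formatted rows: two BENIGN rows, then two DDoS rows
example = interleave_ends(['B1', 'B2', 'D1', 'D2'])
print(example)
```

With an input ordered as all BENIGN rows followed by all DDoS rows, the labels alternate only when both classes have the same number of rows, as they do in training_sample (5 and 5). With unequal counts the tail of the prompt contains consecutive rows of the larger class.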

In [94]:
promptify_data(testing_sample.iloc[:, :-1])

'Bwd Packet Length Min: 6.0 | Bwd Packet Length Std: 0.0 | Average Packet Size: 7.0 | Flow Duration: 7607787.0 | Time Between Packets Std: 3392968.51'

In [93]:
print(promptify_data(training_sample))

Bwd Packet Length Min: 128 | Bwd Packet Length Std: 0.0 | Average Packet Size: 96.25 | Flow Duration: 50027 | Time Between Packets Std: 28838.07777 | Label: BENIGN
Bwd Packet Length Min: 0 | Bwd Packet Length Std: 5795.50069 | Average Packet Size: 1661.857143 | Flow Duration: 98446 | Time Between Packets Std: 39859.03376 | Label: DDoS
Bwd Packet Length Min: 6 | Bwd Packet Length Std: 0.0 | Average Packet Size: 9.0 | Flow Duration: 121289 | Time Between Packets Std: 0.0 | Label: BENIGN
Bwd Packet Length Min: 0 | Bwd Packet Length Std: 3668.897 | Average Packet Size: 897.1538462 | Flow Duration: 98319786 | Time Between Packets Std: 26900000.0 | Label: DDoS
Bwd Packet Length Min: 0 | Bwd Packet Length Std: 467.5 | Average Packet Size: 189.5 | Flow Duration: 427605 | Time Between Packets Std: 103718.3593 | Label: BENIGN
Bwd Packet Length Min: 0 | Bwd Packet Length Std: 2177.344966 | Average Packet Size: 1292.555556 | Flow Duration: 976170 | Time Between Packets Std: 344723.8229 | Label: DD

In [98]:
system_prompt = '''You will be provided with a sample of network traffic data that is split between training data and a single testing data (separated by '###'). Each row of data is separated by a newline, and each row has features that are separated by a pipe symbol ('|'). Using information from the training data, predict the best label (BENIGN or DDoS) for the testing data. First explain your reasoning for the selected label. Then indicate the predicted label with '$$$' on each side.'''
user_prompt = promptify_data(training_sample) + '\n###\n' + promptify_data(testing_sample.iloc[:, :-1])
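
Before sending the prompt, it can be worth checking that it really has the layout the system prompt describes, and that the testing row does not carry a label. The following is a rough sketch under the assumption that the prompt uses exactly the separators above ('\n###\n' between the two parts, ' | ' between features, ': ' between name and value). `parse_prompt` is a hypothetical helper, and the demo prompt is a shortened stand-in rather than real notebook output.

```python
def parse_prompt(user_prompt):
    """Split '<training rows>\\n###\\n<testing row>' into a list of dicts and one dict."""
    training_text, testing_text = user_prompt.split('\n###\n')

    def parse_row(line):
        # 'name: value | name: value' -> {'name': 'value', ...}
        return dict(field.split(': ', 1) for field in line.split(' | '))

    training_rows = [parse_row(line) for line in training_text.split('\n')]
    return training_rows, parse_row(testing_text)


# Shortened stand-in prompt with one feature per row (an assumption for illustration)
demo_prompt = (
    'Flow Duration: 50027 | Label: BENIGN\n'
    'Flow Duration: 98446 | Label: DDoS\n'
    '###\n'
    'Flow Duration: 7607787.0'
)
training_rows, testing_row = parse_prompt(demo_prompt)

# Every training row should be labelled; the testing row should not be
assert all('Label' in row for row in training_rows)
assert 'Label' not in testing_row
```

All values come back as strings, which is enough for a format check; converting them to numbers is left out on purpose.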

In [100]:
print(system_prompt)
print(user_prompt)

You will be provided with a sample of network traffic data that is split between training data and a single testing data (separated by '###'). Each row of data is separated by a newline, and each row has features that are separated by a pipe symbol ('|'). Using information from the training data, predict the best label (BENIGN or DDoS) for the testing data. First explain your reasoning for the selected label. Then indicate the predicted label with '$$$' on each side.
Bwd Packet Length Min: 128 | Bwd Packet Length Std: 0.0 | Average Packet Size: 96.25 | Flow Duration: 50027 | Time Between Packets Std: 28838.07777 | Label: BENIGN
Bwd Packet Length Min: 0 | Bwd Packet Length Std: 5795.50069 | Average Packet Size: 1661.857143 | Flow Duration: 98446 | Time Between Packets Std: 39859.03376 | Label: DDoS
Bwd Packet Length Min: 6 | Bwd Packet Length Std: 0.0 | Average Packet Size: 9.0 | Flow Duration: 121289 | Time Between Packets Std: 0.0 | Label: BENIGN
Bwd Packet Length Min: 0 | Bwd Packet 

Testing the prompt

In [22]:
from openai import OpenAI

In [23]:
client = OpenAI()

In [101]:
completion = client.chat.completions.create(
  model='gpt-3.5-turbo',
  messages=[
    {'role': 'system', 'content': system_prompt},
    {'role': 'user', 'content': user_prompt}
  ]
)

print(completion.choices[0].message.content)

Based on the provided training data, we can see that the features for DDoS-labelled entries generally have higher values compared to those labelled as BENIGN. Specifically, the 'Bwd Packet Length Std' and 'Average Packet Size' features seem to be significantly higher for DDoS traffic compared to BENIGN traffic.

In the given testing data entry, the values for 'Bwd Packet Length Std' (0.0) and 'Average Packet Size' (7.0) are more in line with the BENIGN traffic entries from the training data. Therefore, based on this observation, I predict that the label for this testing data entry should be BENIGN.

$$$BENIGN$$$


In [102]:
testing_sample['Label']

95513    BENIGN
Name: Label, dtype: object

Extracting the label

In [103]:
import re

In [104]:
re.search(r'(?<=\${3}).+(?=\${3})', completion.choices[0].message.content).group()

'BENIGN'
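
The one-line search above raises an AttributeError when the model omits the '$$$' markers, because re.search then returns None. A slightly more defensive variant is sketched below. `extract_label` and its `allowed` parameter are hypothetical names, and treating any label outside BENIGN / DDoS as missing is an assumption about how unexpected answers should be handled.

```python
import re

# Text between a pair of '$$$' markers; non-greedy so it stops at the first closing marker
LABEL_PATTERN = re.compile(r'\${3}\s*(.+?)\s*\${3}')


def extract_label(response_text, allowed=('BENIGN', 'DDoS')):
    """Return the label wrapped in '$$$', or None if it is missing or unexpected."""
    match = LABEL_PATTERN.search(response_text)
    if match is None:
        return None  # the model did not follow the output format
    label = match.group(1)
    return label if label in allowed else None


# Stand-in response text (not real model output)
print(extract_label('Some reasoning.\n\n$$$BENIGN$$$'))
```

Returning None instead of raising makes it possible to count malformed responses separately from wrong predictions when the prompt is run over more than one testing row.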