Spam Detection

In this project, we aim to detect SMS spam messages using large language models (LLMs).

We evaluate spam messages in both English and Korean. We use the following prompt to evaluate the effectiveness of the large language models.

Get Started

Create a virtual environment with Python 3.13 and install the required packages listed in requirements.txt.
Place your dataset in the ./data folder. The dataset must be in .csv format.
Place your API_KEY in the ./models/{model_name}/{model_name}_api_0.py file.
Parameters can be adjusted in the same file.
- SHOT, METHOD, FILE_NAME, TIMEOUT_MIN, TIMEOUT_MAX, START_INDEX, and END_INDEX can be adjusted.
Evaluate your dataset using your API_KEY. For example, to evaluate the performance of ChatGPT:

python ./models/chatGPT/chatGPT_api_0.py
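As a rough illustration of the adjustable parameters, the top of such a script might look like the sketch below. The parameter names come from this README, but every value, every comment about what a parameter means, and the helpers `select_rows` and `pick_delay` are assumptions made for illustration; check the actual script for the real defaults and semantics.

```python
import random

# All values below are hypothetical placeholders, not the repository's defaults.
API_KEY = "YOUR_API_KEY"    # replace locally; never commit a real key
SHOT = 0                    # assumed: number of few-shot examples in the prompt
METHOD = 1                  # assumed: index of the prompting method
FILE_NAME = "example.csv"   # assumed: name of a dataset file in ./data
TIMEOUT_MIN = 1             # assumed: shortest wait between requests, in seconds
TIMEOUT_MAX = 3             # assumed: longest wait between requests, in seconds
START_INDEX = 0             # first dataset row to evaluate
END_INDEX = 100             # assumed here to be exclusive


def select_rows(rows, start=START_INDEX, end=END_INDEX):
    """Return the slice of rows to evaluate (hypothetical helper)."""
    return rows[start:end]


def pick_delay(low=TIMEOUT_MIN, high=TIMEOUT_MAX):
    """Pick a random wait between requests (hypothetical helper)."""
    return random.uniform(low, high)
```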

Note:

You can create multiple Python files to run different prompting methods.
- e.g., chatGPT_api_1.py, chatGPT_api_2.py, ...
Multi-threading is not implemented.
You can create multiple API_KEYs for faster results.
You can download the most recent English spam dataset here.
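Since multi-threading is not implemented, one way to speed things up is to give each script copy its own API_KEY and its own slice of the dataset through START_INDEX and END_INDEX. The sketch below shows how such slices could be computed. The function `split_indices` is hypothetical and not part of this repository, and it assumes END_INDEX is exclusive.

```python
def split_indices(total, n_workers):
    """Split the row range [0, total) into n_workers contiguous chunks.

    Returns a list of (start, end) pairs with end exclusive. Any remainder
    is spread over the first chunks so sizes differ by at most one row.
    """
    if n_workers <= 0:
        raise ValueError("n_workers must be positive")
    base, extra = divmod(total, n_workers)
    chunks = []
    start = 0
    for i in range(n_workers):
        end = start + base + (1 if i < extra else 0)
        chunks.append((start, end))
        start = end
    return chunks


if __name__ == "__main__":
    # Print the settings to copy into each script copy by hand.
    for i, (start, end) in enumerate(split_indices(1000, 3)):
        print(f"chatGPT_api_{i}.py: START_INDEX={start}, END_INDEX={end}")
```

Each printed pair would then be set manually in the corresponding file before running it in its own terminal.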

Process the Results

Generated responses are saved in the ./results folder.
- ./results/{model_name}/{prompt_method}/{few_shot}/{dataset_name}_response.csv.
Classify the generated responses using check_response.py in the ./results/utils folder:

python ./results/utils/check_response.py

The processed responses are saved in the same folder under the name {dataset_name}_res_filtered.csv.
Calculate the precision, recall, accuracy, and F1-score using:

python ./results/utils/get_metrics.py
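For reference, the four metrics can be computed from true and predicted labels as in the sketch below. This is not the code in get_metrics.py; the function name `compute_metrics` and the label strings "spam" and "ham" are assumptions for illustration.

```python
def compute_metrics(y_true, y_pred, positive="spam"):
    """Compute precision, recall, accuracy, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }
```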

Note:

We used a substring-matching algorithm to distinguish the generated responses.
We recommend creating a separate check_response.py file for each model and method, as they all generate sentences in a different manner.
get_metrics_m3.py is used to collect the generated score, which ranges from 1 to 10. A score of 0 is used when no score is retrieved.
Some manual work may be needed in the response-processing stage.
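A minimal sketch of substring-based classification is shown below. It is not the code in check_response.py: the marker phrases and the function name `classify_response` are assumptions, and real model outputs will likely need model-specific markers, which is why a separate file per model and method is recommended.

```python
# Hypothetical marker phrases; adjust them to the wording each model produces.
# Negative markers are checked first because "not spam" also contains "spam".
HAM_MARKERS = ("not spam", "not a spam", "legitimate", "스팸이 아")
SPAM_MARKERS = ("spam", "스팸")


def classify_response(text):
    """Label a generated response as 'spam', 'ham', or 'unknown'."""
    lowered = text.lower()
    for marker in HAM_MARKERS:
        if marker in lowered:
            return "ham"
    for marker in SPAM_MARKERS:
        if marker in lowered:
            return "spam"
    # No marker matched: leave the row for manual review.
    return "unknown"
```

Rows labeled "unknown" correspond to the cases where manual work may be needed.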

Main Results

We find that GPT-4o achieves the best performance in detecting spam messages on both the English and Korean datasets.

Using Method_3, we adjust the threshold and find that a value of 6 yields the best results.
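To illustrate how a score-based method with a threshold could work, the sketch below pulls a 1 to 10 score out of a response (falling back to 0 when none is found) and compares it to the threshold. This is not the code in get_metrics_m3.py. The function names, the regular expression, and the choice to treat scores greater than or equal to the threshold as spam are all assumptions.

```python
import re

# Matches a standalone integer from 1 to 10 (so "12" or "0" do not match).
SCORE_PATTERN = re.compile(r"\b(10|[1-9])\b")


def extract_score(text):
    """Return the first standalone 1-10 integer in text, or 0 if none."""
    match = SCORE_PATTERN.search(text)
    return int(match.group(1)) if match else 0


def is_spam(score, threshold=6):
    """Assumed rule: a score at or above the threshold counts as spam."""
    return score >= threshold
```

With this rule, a response without a retrievable score gets 0 and is therefore never counted as spam.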
