[DFRWS USA 2025] SERENA (Systematic Extraction and Reconstruction for ENhanced A2P Message Forensics)

📄 Official repository for our research paper submitted to DFRWS USA 2025.

SERENA (Systematic Extraction and Reconstruction for Enhanced A2P Message Forensics) is a forensic tool that automates the extraction, classification, and highlighting of structured data from Application-to-Person (A2P) messages. It uses GPT-4o to classify A2P messages and extract key forensic information, so investigators can analyze digital communications efficiently.

📌 Key Features

Automated Data Processing: Converts emails (.eml), text messages (.xlsx), and chat logs into structured text format.
A2P vs. P2P Classification: Identifies and categorizes messages using OpenAI’s GPT-4o.
Named Entity Recognition: Extracts essential forensic data (e.g., service name, timestamps, payment details).
GUI-Based Analysis: Provides an interactive interface for forensic investigators.
Highlighting Feature: Displays extracted keywords directly within the original message.
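
As an illustration of the data processing step, the sketch below converts a raw .eml message into plain text using only Python's standard `email` package. The function name `eml_to_text` and the output layout (three headers followed by the body) are assumptions made for this example; they are not taken from `preprocess.py` or `data_preprocessing.py`.

```python
from email import message_from_string, policy


def eml_to_text(raw_eml: str) -> str:
    """Return a simple text rendering of an .eml message (hypothetical layout)."""
    # policy.default yields an EmailMessage, which provides get_body().
    msg = message_from_string(raw_eml, policy=policy.default)

    # Prefer the plain-text part; HTML-only messages yield an empty body here.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content().strip() if body_part is not None else ""

    return "\n".join(
        [
            f"From: {msg.get('From', '')}",
            f"Date: {msg.get('Date', '')}",
            f"Subject: {msg.get('Subject', '')}",
            "",
            body,
        ]
    )
```

For a file on disk, the same function could be called with the file's decoded contents, for example `eml_to_text(open(path, encoding="utf-8").read())`.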

📂 Repository Structure

data_preprocessing.py, preprocess.py
- Converts emails (.eml), text messages (.xlsx), and chat logs into structured text format.
A2P_classifying.py
- Classifies messages as A2P (Application-to-Person) or P2P (Person-to-Person) using GPT-4o.
named_entity_recognition.py
- Extracts structured data such as service name, timestamp, payment amount, etc.
- Normalizes values (e.g., datetimes as YYYY/MM/DD hh:mm:ss; amounts as USD 150, AUD 300).
SERENA-GUI.py
- Provides a GUI-based interface for forensic investigation.
SAMPLE A2P DATASET
- Provides real-world A2P messages (.eml files and chat logs).
- Service names, addresses, numbers, and other identifiable information have been redacted.
requirements.txt
- Lists required Python dependencies for the project.
README.md
- This README file, containing project details and setup instructions.
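
The value normalization listed under named_entity_recognition.py can be pictured with the minimal sketch below. It targets the two output formats named above (YYYY/MM/DD hh:mm:ss and a currency code followed by the amount). The function names, the requirement that the caller supplies the input date format, and the regular expressions are all assumptions for illustration; the repository's own normalization logic may differ.

```python
import re
from datetime import datetime


def normalize_datetime(value: str, input_format: str) -> str:
    """Reformat a timestamp as YYYY/MM/DD hh:mm:ss (input format is caller-supplied)."""
    return datetime.strptime(value, input_format).strftime("%Y/%m/%d %H:%M:%S")


def normalize_amount(value: str) -> str:
    """Reformat an amount such as '150 USD' as 'USD 150' (hypothetical helper)."""
    code = re.search(r"[A-Z]{3}", value)               # three-letter currency code
    number = re.search(r"\d[\d,]*(?:\.\d+)?", value)   # digits with optional commas/decimals
    if code is None or number is None:
        raise ValueError(f"cannot normalize amount: {value!r}")
    return f"{code.group()} {number.group().replace(',', '')}"
```

This sketch only handles amounts that already carry a three-letter code; symbols such as "$" are left out on purpose because they are ambiguous between currencies.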

📌 How to Run the Tool on a Git Repository

To ensure the tool works correctly, the selected folder must contain the following subfolders:

📂 emls/ (For processing email data)
📂 messagingapp/ (For chat and messaging app logs)
📂 textmessage/ (For SMS and text message logs)
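
Before launching the tool, the folder layout above can be verified with a short check like the one below. This is a standalone sketch, not part of SERENA-GUI.py; the function name `missing_subfolders` is hypothetical.

```python
from pathlib import Path

# Subfolder names required by the tool, as listed above.
REQUIRED_SUBFOLDERS = ("emls", "messagingapp", "textmessage")


def missing_subfolders(base_folder: str) -> list[str]:
    """Return the required subfolders that do not exist under base_folder."""
    base = Path(base_folder)
    return [name for name in REQUIRED_SUBFOLDERS if not (base / name).is_dir()]
```

An empty list means the base folder is ready to be selected.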
Once the base folder is selected, the A2P classification and Named Entity Recognition (NER) modules run automatically.
When processing finishes, the terminal displays the message "Processing completed."
Double-clicking a file in the TreeView displays that message with the keywords from its extracted JSON highlighted.
Selecting "Load JSON Data" shows the Named Entities in a structured Table View.
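
The highlighting step can be thought of as locating each extracted value inside the original message. The sketch below assumes the NER output is a flat JSON object mapping field names to string values, which is an assumption about the file format, and `highlight_spans` is a hypothetical helper rather than a function from SERENA-GUI.py.

```python
import json


def highlight_spans(message: str, entities_json: str) -> list[tuple[int, int]]:
    """Return (start, end) character offsets of every extracted value found in message."""
    entities = json.loads(entities_json)
    spans = []
    for value in entities.values():
        keyword = str(value)
        if not keyword:
            continue  # skip empty fields
        start = message.find(keyword)
        while start != -1:
            spans.append((start, start + len(keyword)))
            # Continue searching after the current match.
            start = message.find(keyword, start + len(keyword))
    return sorted(spans)
```

A GUI could pass these offsets to its text widget to apply a highlight style. Note that only values appearing verbatim in the message are matched, so a value that was normalized during extraction would not be found by this simple search.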

📌 Regarding the evaluation dataset

We assessed our methodology for A2P Message Classification using three categories: ground truth, augmented, and unseen, each with 25 A2P messages and 25 non-A2P messages. For A2P Keyword Extraction, our approach was evaluated on 50 A2P messages, with 10 messages per category (Order, Transaction, Notification, Booking, and Miscellaneous).
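
This README does not state which metrics were reported, so the following is only a generic sketch of how a binary A2P vs. non-A2P classification run could be scored. The function name, the label strings, and the choice of accuracy, precision, and recall are assumptions. No results from the paper are reproduced here.

```python
def binary_metrics(y_true: list[str], y_pred: list[str], positive: str = "A2P") -> dict[str, float]:
    """Compute accuracy, precision, and recall for one positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

With the setup described above, each of the three classification categories would contribute 50 label pairs (25 A2P and 25 non-A2P) to such a function.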

This is a sample A2P dataset composed of real-world messages.
Service names, addresses, numbers, and other identifiable information have been redacted.
