<a href="https://colab.research.google.com/github/kennethmugo/Swahili-SMS-Spam-Detection/blob/main/research/Qwen3_swahili_spam_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install transformers torch accelerate bitsandbytes

In [2]:
import os
import pandas as pd
import numpy as np
from tqdm import tqdm
import kagglehub
from google.colab import userdata
from huggingface_hub import login

Now, we want to see how Qwen3 does at zero-shot classification of this dataset. Let us load the model from Hugging Face. Ensure you have `HF_TOKEN` in your Colab secrets so that you can log in to Hugging Face.

In [3]:
my_secret_key = userdata.get('HF_TOKEN')
login(token=my_secret_key, add_to_git_credential=True)

In [4]:
## Download the dataset from kaggle
path = kagglehub.dataset_download("henrydioniz/swahili-sms-detection-dataset")
full_path = os.path.join(path, "bongo_scam.csv")
df = pd.read_csv(full_path)
df.head()

Unnamed: 0,Category,Sms
0,trust,"Nipigie baada ya saa moja, tafadhali."
1,scam,Naomba unitumie iyo Hela kwenye namba hii ya A...
2,scam,"666,KARIBU FREEMASON UTIMIZE NDOTO KATIKA BIAS..."
3,trust,Watoto wanapenda sana zawadi ulizowaletea.
4,scam,IYO PESA ITUME KWENYE NAMBA HII 0657538690 JIN...


In [5]:
## Let us rename: trust -> ham and scam -> spam.
mapper = {"trust": "ham", "scam": "spam"}
df["Category"] = df["Category"].map(mapper)
df.head()

Unnamed: 0,Category,Sms
0,ham,"Nipigie baada ya saa moja, tafadhali."
1,spam,Naomba unitumie iyo Hela kwenye namba hii ya A...
2,spam,"666,KARIBU FREEMASON UTIMIZE NDOTO KATIKA BIAS..."
3,ham,Watoto wanapenda sana zawadi ulizowaletea.
4,spam,IYO PESA ITUME KWENYE NAMBA HII 0657538690 JIN...


Since the dataset has many rows and I have limited compute, I will take 50 rows from each category. The purpose of this exercise is to see how Qwen3 compares to supervised classification methods.

In [6]:
# Set the number of samples per class
n_per_class = 50

df = (
    df.groupby("Category", group_keys=False)
      .apply(lambda x: x.sample(n=n_per_class, random_state=42))
      .reset_index(drop=True)
)
print(f"Length of the dataframe now: {len(df)}")

Length of the dataframe now: 100



In [7]:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)


In [8]:
# prepare the model input
prompt = "Siku mingi kaka... unaweza nipigia simu tuongee baadaye ukitoka kazi."
messages = [
    {"role": "system", "content": "You are a helpful assistant that detects spam messages in both Swahili and English. Classify the provided text as SPAM or HAM (not spam) and provide a brief explanation. Respond with a JSON object containing 'classification' (either 'ham' or 'spam') and 'explanation'."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)

output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids[0:], skip_special_tokens=True).strip("\n")
print("content:", content)

content: {
  "classification": "ham",
  "explanation": "The message is a regular conversation asking if someone can bring a phone later to leave work. There are no suspicious links, urgent requests, or typical spam indicators."
}


In [11]:
import json
import re

def classify_sms_batch(model, tokenizer, messages_batch):
    """Classifies a batch of SMS messages using the given model and tokenizer."""
    # Create a list of message sequences, where each element is a list of dictionaries
    # representing the chat history for a single SMS.
    batched_messages = []
    for message in messages_batch:
        prompt = [
            {"role": "system", "content": "You are a helpful assistant that detects spam messages in both Swahili and English. Classify the provided text as SPAM or HAM (not spam) and provide a brief explanation. Respond with a JSON object containing 'classification' (either 'ham' or 'spam') and 'explanation'."},
            {"role": "user", "content": message}
        ]
        # Apply the chat template to each message sequence to get a formatted string
        formatted_message = tokenizer.apply_chat_template(
            prompt,
            tokenize=False,
            add_generation_prompt=True,
            enable_thinking=False # Switches between thinking and non-thinking modes. Default is True.
        )
        batched_messages.append(formatted_message) # Append the formatted string


    # Now, pass the list of formatted strings to the tokenizer
    model_inputs = tokenizer(
        batched_messages,  # This is now a List[str]
        return_tensors="pt",
        padding=True, # Add padding for batch processing
        truncation=True, # Add truncation to handle long sequences if necessary
        #max_length=512 # Specify a max_length (adjust as needed)
    ).to(model.device)

    # conduct text completion
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=32768
    )

    # Decode the generated tokens. Need to iterate through the batch for decoding.
    classifications = []
    explanations = []

    # The generated_ids tensor has shape (batch_size, sequence_length)
    # The input_ids tensor has shape (batch_size, input_sequence_length)
    # We need to decode from the end of the input_ids for each item in the batch
    for i in range(generated_ids.shape[0]):
        # Find the end of the input sequence for this item
        input_length = model_inputs.input_ids[i].shape[0]
        # Decode the tokens generated after the input sequence
        # Ensure the slice is valid
        if input_length < generated_ids.shape[1]:
            output_ids = generated_ids[i][input_length:].tolist()
            content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")
        else:
            # This case might happen if generation didn't produce any new tokens
            content = ""  # Or handle appropriately

        classification, explanation = parse_decoded_output(content)
        classifications.append(classification)
        explanations.append(explanation)

    return (classifications, explanations)

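def parse_decoded_output(output):
    """Extracts the classification and explanation from the model's JSON reply."""
    try:
        # Grab the first {...} span; the non-greedy match assumes a flat JSON object
        matched = re.search(r'\s*(\{.*?\})\s*', output, re.DOTALL)
        json_str = matched.group(1)

        # Attempt to parse the JSON response
        json_response = json.loads(json_str)

        classification = json_response.get('classification', 'unknown').lower()
        explanation = json_response.get('explanation', 'No explanation provided')
    except (AttributeError, json.JSONDecodeError):
        # AttributeError: no JSON object was found (matched is None) or the label is not a string
        print(f"Error processing model output. Output: {output}")
        classification = 'unknown'
        explanation = 'No explanation provided'

    return classification, explanation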
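
The extraction step can be exercised without loading the model. The sketch below is a standalone illustration of the regex-then-`json.loads` approach with an `'unknown'` fallback; the helper name `extract_label` and the reply strings are made up for the example and are not actual model output.

```python
import json
import re

def extract_label(reply: str) -> str:
    """Return the lowercased label from a raw reply, or 'unknown' (illustrative helper)."""
    match = re.search(r"\{.*?\}", reply, re.DOTALL)
    if match is None:
        return "unknown"  # the reply contained no JSON object at all
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return "unknown"  # braces were found but the content is not valid JSON
    return str(parsed.get("classification", "unknown")).lower()

# Made-up replies covering the three cases the parser has to handle
clean = '{"classification": "SPAM", "explanation": "Asks for money."}'
wrapped = 'Here is the result:\n{"classification": "ham", "explanation": "Casual chat."}'
no_json = "I think this message is spam."

for reply in (clean, wrapped, no_json):
    print(extract_label(reply))
```

Because the match is non-greedy, a reply whose explanation itself contains `}` would be cut short and fall through to `'unknown'`, which is why such rows are filtered out before computing metrics.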


In [12]:
# let us test out the functions that make batch calls to ensure they work
sms = [
     "Utanitumia kwenye ii 0615810764 Vodacom jina LUKA KIMBANGU namba yangu inadeni usiitumie",
     "Siku mingi kaka... unaweza nipigia simu tuongee baadaye ukitoka kazi."
     ]

results = classify_sms_batch(model, tokenizer, sms)
results

(['spam', 'ham'],
 ['The message contains suspicious phone number and name, which are commonly used in spam messages to deceive individuals into providing personal information or making payments.',
  'The message is a regular conversation asking if someone can send a message later. It does not contain any suspicious or misleading content typically associated with spam.'])

In [17]:
# Prepare data for batching
sms_messages = df['Sms'].tolist()
ground_truth_labels = df['Category'].tolist()

# You can adjust the batch size based on your GPU memory
# TODO: Figure out why using batch sizes greater than 1 seems to get stuck at some point when using T4 GPUs
batch_size = 1

predicted_labels = []
model_explanations = []

for i in tqdm(range(0, len(sms_messages), batch_size), desc="Classifying SMS batches"):
    batch_messages = sms_messages[i:i + batch_size]
    batch_predictions, batch_explanations = classify_sms_batch(model, tokenizer, batch_messages)
    predicted_labels.extend(batch_predictions)
    model_explanations.extend(batch_explanations)

# Ensure predicted_labels and ground_truth_labels have the same length
min_len = min(len(predicted_labels), len(ground_truth_labels))
predicted_labels = predicted_labels[:min_len]
ground_truth_labels = ground_truth_labels[:min_len]
model_explanations = model_explanations[:min_len]

# Filter out 'unknown' predictions if necessary for metrics
valid_indices = [i for i, label in enumerate(predicted_labels) if label in ['ham', 'spam']]
filtered_predicted_labels = [predicted_labels[i] for i in valid_indices]
filtered_ground_truth_labels = [ground_truth_labels[i] for i in valid_indices]
filtered_explanations = [model_explanations[i] for i in valid_indices]

Classifying SMS batches: 100%|██████████| 100/100 [08:03<00:00,  4.83s/it]


In [18]:
from sklearn.metrics import recall_score, accuracy_score, precision_score, f1_score

if len(filtered_predicted_labels) > 0:
    # Calculate metrics
    recall = recall_score(filtered_ground_truth_labels, filtered_predicted_labels, pos_label='spam')
    precision = precision_score(filtered_ground_truth_labels, filtered_predicted_labels, pos_label='spam')
    accuracy = accuracy_score(filtered_ground_truth_labels, filtered_predicted_labels)
    f1 = f1_score(filtered_ground_truth_labels, filtered_predicted_labels, pos_label='spam')

    print(f"\n--- Classification Metrics ---")
    print(f"Recall (Spam): {recall:.4f}")
    print(f"Precision (Spam): {precision:.4f}")
    print(f"Accuracy: {accuracy:.4f}")
    print(f"F1 Score (Spam): {f1:.4f}")
else:
    print("\nNo valid predictions were obtained to calculate metrics.")


--- Classification Metrics ---
Recall (Spam): 0.9800
Precision (Spam): 0.9423
Accuracy: 0.9600
F1 Score (Spam): 0.9608
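
These scores can be reproduced by hand. The counts in the sketch below are read off the rest of this notebook (50 messages per class, one spam message predicted as ham, three ham messages predicted as spam), so they apply only to this 100-message sample.

```python
# Confusion counts with spam as the positive class
tp = 49  # spam predicted as spam
fn = 1   # spam predicted as ham
fp = 3   # ham predicted as spam
tn = 47  # ham predicted as ham

precision = tp / (tp + fp)                          # 49 / 52
recall = tp / (tp + fn)                             # 49 / 50
accuracy = (tp + tn) / (tp + tn + fp + fn)          # 96 / 100
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"Recall (Spam): {recall:.4f}")
print(f"Precision (Spam): {precision:.4f}")
print(f"Accuracy: {accuracy:.4f}")
print(f"F1 Score (Spam): {f1:.4f}")
```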


In [19]:
# Add the predicted labels to the dataframe for inspection
df['Predicted_Category'] = predicted_labels
df['Explanation'] = model_explanations
print("\nDataFrame with predictions:")
print(df[['Sms', 'Category', 'Explanation', 'Predicted_Category']].head())
print(df['Predicted_Category'].value_counts())
print(df['Category'].value_counts())


DataFrame with predictions:
                                                 Sms Category  \
0  Bro, kuna movie mpya imeachiwa leo. Je, tutaza...      ham   
1                      Tafadhali nipe maelezo zaidi.      ham   
2          Nitaandika ripoti mara tu nitakapomaliza.      ham   
3  Niambie ukweli, unafikiri Ronaldo bado ana kiw...      ham   
4                        Nisaidie na namba ya fundi.      ham   

                                         Explanation Predicted_Category  
0  The message is a casual and friendly inquiry a...                ham  
1  The message 'Tafadhali nipe maelezo zaidi.' is...                ham  
2  The message is a regular sentence in Swahili t...                ham  
3  Mwendo huu ni mwa kwa kuzinga kwa mwanafunzi y...                ham  
4  The message 'Nisaidie na namba ya fundi.' tran...                ham  
Predicted_Category
spam    52
ham     48
Name: count, dtype: int64
Category
ham     50
spam    50
Name: count, dtype: int64


Qwen seems to work very well as a zero-shot classifier: 96% accuracy and an F1 score of 0.96 on the 100 sampled messages. That is very impressive given that no training was needed. Let us inspect the explanations for the spam messages it classified correctly to see what it picks up on.

In [21]:
predicted_spam_mask = df['Predicted_Category'] == 'spam'
actual_spam_mask = df['Category'] == 'spam'
spam_messages = df[predicted_spam_mask & actual_spam_mask]
spam_messages.head()

Unnamed: 0,Category,Sms,Predicted_Category,Explanation
50,spam,"666,KARIBU FREEMASON UTIMIZE NDOTO KATIKA BIAS...",spam,The message contains suspicious numbers and us...
51,spam,Au nitumie kwenye M-Pesa Namba.0696530433 jina...,spam,The message contains a phone number and a name...
52,spam,Mjukuu wangu ndagu niliyokukabizi hiyo uwe mak...,spam,The text contains suspicious and vague languag...
53,spam,Nitumie tu kwenye hii Tigo 0733822240 jina SAL...,spam,The message contains a phone number and a name...
54,spam,Naomba unitumie iyo pesa kwenye namba hii ya A...,spam,The message contains suspicious content asking...


In [22]:
for row in spam_messages.sample(5).iterrows():
    print("----------------------")
    print(f"SMS: {row[1]['Sms']}")
    print(f"Explanation: {row[1]['Explanation']}")

----------------------
SMS: TUZO POINT hongera umepata zawadi Sh2,000,000 milioni kutoka (TUZO POINT) piga sim,.0733822240 kupata zawadi  asante
Explanation: The message contains suspicious language and is likely a scam. It mentions a fake opportunity to receive Sh2,000,000,000 (which is an unusually large sum) and includes a phone number that is not verified. These are common red flags for spam or scam messages.
----------------------
SMS: 666,KARIBU FREEMASON UTIMIZE NDOTO KATIKA BIASHARA, KILIMO,UFUGAJI,MACHI MBO,MICHEZO N.K KWAMHITAJI KUJIUNGA PG: 0787-406-889 AU 0787-406-889
Explanation: The message contains suspicious numbers and mentions 'KARIBU FREEMASON', which is a red flag for spam. It also includes a request to contact a phone number, which is common in spam messages to gather personal information.
----------------------
SMS: mjukuu wangu utafuta ji wako mgumu ela hazikai mkononi pakazinaisha unasota sana mpenzi hamuelewani je utatunza siri nikikusaidia pesa bila mashaliti 

The explanations for spam messages seem quite good considering I've not had to do any sort of training or fine-tuning. It seems that a phone number in the text, combined with a request for money, is generally flagged as spam.
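
A rough way to check the phone-number observation is to test messages against a regular expression. The sketch below is only illustrative: the pattern is an assumption (a 10-digit number starting with 0, optionally written as 0xxx-xxx-xxx, which matches the numbers seen in this notebook), and `has_phone_number` is a hypothetical helper name. The sample messages are quoted from cells in this notebook.

```python
import re

# Assumed format: 10 digits starting with 0, with optional hyphens after the 4th and 7th digit
PHONE_PATTERN = re.compile(r"(?<!\d)0\d{3}-?\d{3}-?\d{3}(?!\d)")

def has_phone_number(text: str) -> bool:
    """Return True if the text contains something that looks like a phone number."""
    return PHONE_PATTERN.search(text) is not None

samples = [
    "Utanitumia kwenye ii 0615810764 Vodacom jina LUKA KIMBANGU namba yangu inadeni usiitumie",
    "TUZO POINT hongera umepata zawadi Sh2,000,000 milioni kutoka (TUZO POINT) piga sim,.0733822240 kupata zawadi  asante",
    "Siku mingi kaka... unaweza nipigia simu tuongee baadaye ukitoka kazi.",
    "Habari za muda. Mimi  mwenye nyumba wako hii namba yangu ya tigo. Mbona kimya na siku zinazidi kwenda...?",
]

for text in samples:
    print(has_phone_number(text), "|", text[:40])
```

On the notebook's DataFrame, `df['Sms'].apply(has_phone_number)` would give a boolean column that can be cross-tabulated against `Category` and `Predicted_Category`.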

Let us view the spam messages that were misclassified to see what the model could have missed.

In [26]:
predicted_ham_mask = df['Predicted_Category'] == 'ham'
misclassified_spam_messages = df[predicted_ham_mask & actual_spam_mask]

for row in misclassified_spam_messages.iterrows():
    print("----------------------")
    print(f"SMS: {row[1]['Sms']}")
    print(f"Explanation: {row[1]['Explanation']}")

----------------------
SMS: Habari za muda. Mimi  mwenye nyumba wako hii namba yangu ya tigo. Mbona kimya na siku zinazidi kwenda...?
Explanation: The message is a regular conversation in Swahili about daily news and plans, with no suspicious or deceptive content indicating spam.


Looking at the SMS, it is easy to see why the model classified the message as ham: there is no solicitation of funds or anything else overtly suspicious.

Let us see the ham messages that were misclassified.

In [28]:
actual_ham_mask = df['Category'] == 'ham'
misclassified_ham_messages = df[predicted_spam_mask & actual_ham_mask]

for row in misclassified_ham_messages.iterrows():
    print("----------------------")
    print(f"SMS: {row[1]['Sms']}")
    print(f"Explanation: {row[1]['Explanation']}")

----------------------
SMS: Masha Allah amekaa kistaarabu
Explanation: The phrase 'Masha Allah amekaa kistaarabu' contains potential spam elements, including religious references and possibly misleading or suspicious content. The use of 'kistaarabu' (which may relate to Arabic or Islamic context) can be a red flag for spam or phishing attempts.
----------------------
SMS: Noma sanaaaaaaaaaaah
Explanation: The message 'Noma sanaaaaaaaaaaah' contains excessive repetition ('aaaaaaaaaah') which is a common tactic in spam messages to catch attention or appear more urgent. The lack of clear content or context also raises suspicion of being spam.
----------------------
SMS: Daaaah... Hiyo ya nyumba ni hatariii aseee
Explanation: The message contains excessive exclamation marks and possibly nonsensical or exaggerated language, which is common in spam messages to grab attention or appear more urgent.


It seems the model flags elongated words and extra punctuation as spam, which isn't always the case. It is also very curious that the mention of religion in one particular message has been flagged as spam.
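
One way to probe the elongation hypothesis would be to collapse long runs of a repeated letter and classify the normalized text again. The sketch below shows only the normalization step; `collapse_elongation` and its `max_run` parameter are hypothetical names, and whether this changes the model's prediction has not been tested here.

```python
import re

def collapse_elongation(text: str, max_run: int = 2) -> str:
    """Shorten runs of the same letter longer than max_run down to max_run letters."""
    # [^\W\d_] matches letters only, so digits (phone numbers) and "..." are left alone
    pattern = re.compile(r"([^\W\d_])\1{%d,}" % max_run)
    return pattern.sub(lambda m: m.group(1) * max_run, text)

for sms in [
    "Noma sanaaaaaaaaaaah",
    "Daaaah... Hiyo ya nyumba ni hatariii aseee",
    "Masha Allah amekaa kistaarabu",
]:
    print(collapse_elongation(sms))
```

Keeping two letters rather than one preserves ordinary double letters such as the "aa" in "amekaa".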

Let me try 4 messages to ensure religion isn't always classified as spam by Qwen.

In [30]:
sms = [
     "Bwana yesu asifiwe!",
     "Praise God!",
     "Glory be to Allah!",
     "Allah asifiwe!"
     ]

classifications, explanations = classify_sms_batch(model, tokenizer, sms)
# Use a separate loop variable so the `sms` list is not overwritten
for classification, explanation, message in zip(classifications, explanations, sms):
    print("----------------------")
    print(f"SMS: {message}")
    print(f"Classification: {classification}")
    print(f"Explanation: {explanation}")

----------------------
SMS: Bwana yesu asifiwe!
Classification: ham
Explanation: The message 'Bwana yesu asifiwe!' is a greeting in Swahili, which translates to 'Lord Jesus be with you!' This is a common religious greeting and not considered spam.
----------------------
SMS: Praise God!
Classification: ham
Explanation: The message 'Praise God!' is a simple expression of religious devotion and does not contain any suspicious or deceptive content typically associated with spam.
----------------------
SMS: Glory be to Allah!
Classification: ham
Explanation: The text 'Glory be to Allah!' is a religious phrase and does not contain any suspicious or malicious content typically associated with spam. It is a common expression in Islamic contexts and is not likely to be spam.
----------------------
SMS: Allah asifiwe!
Classification: ham
Explanation: The message 'Allah asifiwe!' is a greeting in Swahili, which translates to 'Peace be upon you!' This is a common and respectful greeting in many c

It seems religious references on their own aren't always classified as spam by the model.