##### ### The University of Melbourne, School of Computing and Information Systems
# COMP30027 Machine Learning, 2025 Semester 1

## Assignment 1: Scam detection with naive Bayes


**Student ID(s):**     `1389444`


This iPython notebook is a template which you will use for your Assignment 1 submission.

**NOTE: YOU SHOULD ADD YOUR RESULTS, GRAPHS, AND FIGURES FROM YOUR OBSERVATIONS IN THIS FILE TO YOUR REPORT (the PDF file).** Results, figures, etc. which appear in this file but are NOT included in your report will not be marked.

**Adding proper comments to your code is MANDATORY. **

## 1. Supervised model training


In [1]:
import pandas as pd
import numpy as np
from collections import defaultdict

# Load supervised train dataset
sv_train = pd.read_csv("sms_supervised_train.csv")
sv_train["textPreprocessed"] = sv_train["textPreprocessed"].astype(str)

texts = sv_train["textPreprocessed"] # assign the textPreprocessed column as a separate variable
labels = sv_train["class"] # assign the class column as a separate variable

# Compute prior probabilities P(c)
N_total = len(sv_train) # Dataset size
N_scam = 0  # Scam label count
N_nonmal = 0 # Non-malicious label count

# Iterate and count the frequency of each label
for label in labels:
    if label == 0:
        N_nonmal += 1
    else:
        N_scam += 1

P_scam = N_scam / N_total   # Prior probability of scam instances
P_nonmal = N_nonmal / N_total   # Prior probability of non-malicious instances
prior_prob = {0: P_nonmal, 1: P_scam}

# Display P(c) results
print(f"Prior probability of scam, P(scam) = {P_scam:.4f}")
print(f"Prior probability of non-malicious, P(non-malicious) = {P_nonmal:.4f}\n")

word_count_matrix = {0: defaultdict(int), 1: defaultdict(int)}  # Count matrix for each word separated by label
vocabulary = set()  # List of every unique words in the dataset

# Iterate and count the words frequency in each class while adding new words to the vocabulary
for text, label in zip(texts, labels):
    words = text.split()
    for word in words:
        word_count_matrix[label][word] += 1
        vocabulary.add(word)

N_vocab = len(vocabulary)  # Vocabulary size
alpha = 1  # Laplace smoothing factor

# Compute word probabilities and store them as a matrix
word_probability_matrix = {0: {}, 1: {}}
for word in vocabulary:
    word_probability_matrix[0][word] = (word_count_matrix[0][word] + alpha) / (sum(word_count_matrix[0].values()) + N_vocab * alpha)
    word_probability_matrix[1][word] = (word_count_matrix[1][word] + alpha) / (sum(word_count_matrix[1].values()) + N_vocab * alpha)

top_scam_words = sorted(word_probability_matrix[1].items(), key=lambda x: x[1], reverse=True)[:10]  # Top 10 most probable words for scam
top_nonmal_words = sorted(word_probability_matrix[0].items(), key=lambda x: x[1], reverse=True)[:10] # Top 10 most probable words for non-malicious

# Display the most probable words for each class
print("Most probable words in scam class:")
for word, prob in top_scam_words:
    print(f"{word}: {prob:.4f}")
print()

print("Most probable words in non-malicious class:")
for word, prob in top_nonmal_words:
    print(f"{word}: {prob:.4f}")
print()


# Compute predictive word ratios R (class 0 / class 1)
ratios = {word: word_probability_matrix[0][word] / word_probability_matrix[1][word] 
          for word in vocabulary if word_probability_matrix[1][word] > 0}

top_predictive_nonmal = sorted(ratios.items(), key=lambda x: x[1], reverse=True)[:10]  # Top 10 most predictive words for non-malicious
top_predictive_scam = sorted(ratios.items(), key=lambda x: x[1])[:10]  # Top 10 predictive words for scam

# Display the most predictive words for each class
print("Most predictive words for non-malicious class:")
for word, ratio in top_predictive_nonmal:
    print(f"{word}: {ratio:.2f}")
print()

print("Most predictive words for scam class:")
for word, ratio in top_predictive_scam:
    print(f"{word}: {ratio:.2f}")


Prior probability of scam, P(scam) = 0.2000
Prior probability of non-malicious, P(non-malicious) = 0.8000

Most probable words in scam class:
.: 0.0565
!: 0.0243
,: 0.0235
call: 0.0205
£: 0.0139
free: 0.0105
/: 0.0091
2: 0.0088
&: 0.0087
?: 0.0085

Most probable words in non-malicious class:
.: 0.0793
,: 0.0260
?: 0.0256
u: 0.0189
...: 0.0187
!: 0.0172
..: 0.0149
;: 0.0132
&: 0.0131
go: 0.0111

Most predictive words for non-malicious class:
;: 60.50
...: 57.50
gt: 54.06
lt: 53.55
:): 47.88
ü: 31.92
lor: 28.83
ok: 24.71
hope: 24.71
d: 21.11

Most predictive words for scam class:
prize: 0.01
tone: 0.02
£: 0.02
select: 0.02
claim: 0.02
paytm: 0.03
code: 0.03
award: 0.03
won: 0.03
18: 0.03


## 2. Supervised model evaluation

In [2]:
from math import factorial, log, exp

def posterior_count_class(word_counts, class_label, vocabulary, word_probability_matrix):
    n = sum(word_counts.values())
    
    numerator = factorial(n)
    denominator = 1
    prob_product = 1.0

    for word, count in word_counts.items():
        if word not in vocabulary:
            continue
        word_prob = word_probability_matrix[class_label][word]
        denominator *= factorial(count)
        prob_product *= word_prob ** count

    return (numerator / denominator) * prob_product

def log_posterior_count_class(word_counts, class_label, vocabulary, word_probability_matrix):
    n = sum(word_counts.values())
    
    log_numerator = log(factorial(n))
    log_denominator = 0.0
    log_prob_sum = 0.0

    for word, count in word_counts.items():
        if word not in vocabulary:
            continue
        word_prob = word_probability_matrix[class_label][word]
        log_denominator += log(factorial(count))
        log_prob_sum += count * log(word_prob)

    return log_numerator - log_denominator + log_prob_sum

In [3]:
from sklearn.metrics import accuracy_score, confusion_matrix, precision_recall_fscore_support

# Load test dataset
test = pd.read_csv("sms_test.csv")
test["textPreprocessed"] = test["textPreprocessed"].astype(str)

test_texts = test["textPreprocessed"]   # assign the textPreprocessed column as a separate variable
test_labels = test["class"] # assign the class column as a separate variable

oov_word_count = 0  # Count of out-of-vocabulary words
skipped_instances = 0  # Number of skipped test messages

predictions = []    # list of predictions for test instances
confidence_ratios = []  # list of confidence ratios for test instances
log_prior_prob = {0: np.log(P_nonmal), 1: np.log(P_scam)}

for test_text, test_label in zip(test_texts, test_labels):
    test_words = test_text.split()
    test_word_counts = defaultdict(int)
    
    # Compute word count vector, ignoring OOV words
    for test_word in test_words:
        if test_word in vocabulary:
            test_word_counts[test_word] += 1
    oov_word_count += len(test_words) - len(test_word_counts)

    # Skip instances with no known words
    if len(test_word_counts) == 0:
        skipped_instances += 1
        predictions.append(None)
        confidence_ratios.append(None)
        continue

    log_posterior_class_count_0 = log_prior_prob[0] + log_posterior_count_class(test_word_counts, 0, vocabulary, word_probability_matrix)
    log_posterior_class_count_1 = log_prior_prob[1] + log_posterior_count_class(test_word_counts, 1, vocabulary, word_probability_matrix)

    prediction = 0 if log_posterior_class_count_0 > log_posterior_class_count_1 else 1
    predictions.append(prediction)

    # Compute confidence ratio
    log_confidence_ratio = log_posterior_class_count_0 - log_posterior_class_count_1
    confidence_ratio = np.exp(log_confidence_ratio)
    confidence_ratios.append((test_text, prediction, confidence_ratio))

# Compute performance metrics
accuracy = accuracy_score(test_labels[:len(predictions)], predictions)
conf_matrix = confusion_matrix(test_labels[:len(predictions)], predictions)
precision, recall, f1, _ = precision_recall_fscore_support(test_labels[:len(predictions)], predictions, average=None)

# Print Results
print(f"Overall Accuracy: {accuracy:.4f}")
print(f"Confusion Matrix:\n{conf_matrix}")
print(f"Precision (Non-Malicious, Scam): {precision}")
print(f"Recall (Non-Malicious, Scam): {recall}\n")

print(f"Total OOV words encountered: {oov_word_count}")
print(f"Total skipped test instances: {skipped_instances}")

# Extract examples
high_conf_nonmal = sorted([x for x in confidence_ratios], key=lambda x:x[2], reverse=True)[:5]  # High confidence in non-malicious (class 0)
high_conf_scam = sorted([x for x in confidence_ratios], key=lambda x:x[2])[:5]  # High confidence in scam (class 1)
boundary_cases = sorted([x for x in confidence_ratios], key=lambda x:abs(np.log(x[2])))[:5]  # Somewhere in the middle (low confidence for both class)


# Print results
print("\nExamples of Scam Classified with High Confidence:")
for text, label, R in high_conf_scam:
    print(f"R={R:.2f} | Text: {text}")

print("\nExamples of Non-Malicious Classified with High Confidence:")
for text, label, R in high_conf_nonmal:
    print(f"R={R:.2f} | Text: {text}")

print("\nExamples on the Decision Boundary (Uncertain Classification):")
for text, label, R in boundary_cases:
    print(f"R={R:.2f} | Text: {text}")

Overall Accuracy: 0.9750
Confusion Matrix:
[[785  15]
 [ 10 190]]
Precision (Non-Malicious, Scam): [0.98742138 0.92682927]
Recall (Non-Malicious, Scam): [0.98125 0.95   ]

Total OOV words encountered: 1571
Total skipped test instances: 0

Examples of Scam Classified with High Confidence:
R=0.00 | Text: . 4 + call £ - * holiday & urgent 18 t landline 150ppm cash cs await collection po box sae complimentary 10,000 ibiza
R=0.00 | Text: . 3 4 + ! call : £ offer * holiday & urgent 18 t landline 150ppm cash cs await collection po box sae tenerife 10,000
R=0.00 | Text: . . . , please order text call / : customer tone number [ [ service mobile ] ] colour colour thanks ringtone reference charge 4.50 arrive = red x49 09065989182
R=0.00 | Text: . call £ £ guarantee won customer prize prize claim service 1000 yr 2000 representative cash 10am-7pm
R=0.00 | Text: . . 2 free u + ! 1st / / wk wk txt tone gr8 hit 150p 16 poly 8007 8007 nokia nokia nokia tones polys

Examples of Non-Malicious Classified 

## 3. Extending the model with semi-supervised training

In [4]:
from sklearn.model_selection import train_test_split

# Load unlabelled dataset
unlabelled_df = pd.read_csv("sms_unlabelled.csv")
unlabelled_df["textPreprocessed"] = unlabelled_df["textPreprocessed"].astype(str)

unlabelled_texts = unlabelled_df["textPreprocessed"]
unlabelled_true_labels = unlabelled_df["class"]

# Use stratified sampling to sample 20% of the dataset for validation set
unlabelled_train, unlabelled_validation, labels_train, labels_validation = train_test_split(
    unlabelled_texts, unlabelled_true_labels, 
    test_size=0.2, 
    stratify=unlabelled_true_labels, 
    random_state=42
)

In [5]:
import random

random.seed(42)

random_200 = random.sample(list(zip(unlabelled_train, labels_train)), 200)

# Extract texts and true labels from the selected instances
selected_texts = [x[0] for x in random_200]
selected_labels = [x[1] for x in random_200]

# Combine with original training data from Q1
expanded_texts = list(texts) + selected_texts
expanded_labels = list(labels) + selected_labels

# Rebuild vocabulary and count matrix
expanded_vocabulary = set()
expanded_word_count_matrix = {0: defaultdict(int), 1: defaultdict(int)}

for text, label in zip(expanded_texts, expanded_labels):
    for word in text.split():
        expanded_word_count_matrix[label][word] += 1
        expanded_vocabulary.add(word)

expanded_N_vocab = len(expanded_vocabulary)
expanded_word_probability_matrix = {0: {}, 1: {}}

# Recompute word probabilities with Laplace smoothing
for word in expanded_vocabulary:
    for label in [0, 1]:
        expanded_word_probability_matrix[label][word] = (
            expanded_word_count_matrix[label][word] + alpha
        ) / (sum(expanded_word_count_matrix[label].values()) + expanded_N_vocab * alpha)

# Recompute prior probabilities and the log of them
expanded_total = len(expanded_labels)
expanded_N_scam = sum(1 for l in expanded_labels if l == 1)
expanded_N_nonmal = expanded_total - expanded_N_scam

expanded_prior_prob = {0: expanded_N_nonmal / expanded_total, 1: expanded_N_scam / expanded_total}
expanded_log_prior_prob = {0: np.log(expanded_prior_prob[0]), 1: np.log(expanded_prior_prob[1])}

print("Selecting 200 random instances\n")
print(f"New expanded prior probabilies {expanded_prior_prob}")

# Test on original test set
validation_predictions = [] 
for validation_text in unlabelled_validation:
    validation_word_counts = defaultdict(int)
    for word in validation_text.split():
        if word in expanded_vocabulary:
            validation_word_counts[word] += 1

    if len(validation_word_counts) == 0:
        validation_predictions.append(None)
        continue

    log_0 = expanded_log_prior_prob[0] + log_posterior_count_class(validation_word_counts, 0, expanded_vocabulary, expanded_word_probability_matrix)
    log_1 = expanded_log_prior_prob[1] + log_posterior_count_class(validation_word_counts, 1, expanded_vocabulary, expanded_word_probability_matrix)
    prediction = 0 if log_0 > log_1 else 1
    validation_predictions.append(prediction)

# Clean predictions for scoring
valid_updated_preds = [p for p in validation_predictions if p is not None]
valid_test_labels = [l for p, l in zip(validation_predictions, labels_validation) if p is not None]

# Metrics
accuracy = accuracy_score(valid_test_labels, valid_updated_preds)
precision, recall, f1, _ = precision_recall_fscore_support(valid_test_labels, valid_updated_preds, average=None)
conf_matrix = confusion_matrix(valid_test_labels, valid_updated_preds)

print(f"Accuracy: {accuracy:.4f}")
print(f"Confusion Matrix:\n{conf_matrix}")
print(f"Precision (Non-Malicious, Scam): {precision}")
print(f"Recall (Non-Malicious, Scam): {recall}")

Selecting 200 random instances

New expanded prior probabilies {0: 0.8, 1: 0.2}
Accuracy: 0.9600
Confusion Matrix:
[[309  11]
 [  5  75]]
Precision (Non-Malicious, Scam): [0.98407643 0.87209302]
Recall (Non-Malicious, Scam): [0.965625 0.9375  ]


In [6]:
uncertain_instances = []

for text, true_label in zip(unlabelled_train, labels_train):
    word_counts = defaultdict(int)
    words = text.split()

    for word in words:
        if word in vocabulary:
            word_counts[word] += 1

    if len(word_counts) == 0:
        continue  # Skip completely OOV texts

    log_post_0 = log_prior_prob[0] + log_posterior_count_class(word_counts, 0, vocabulary, word_probability_matrix)
    log_post_1 = log_prior_prob[1] + log_posterior_count_class(word_counts, 1, vocabulary, word_probability_matrix)

    log_confidence_ratio = log_post_0 - log_post_1
    confidence_ratio = np.exp(log_confidence_ratio)

    uncertain_instances.append((confidence_ratio, text, true_label))

# Sort by uncertainty (closest to decision boundary)
uncertain_instances.sort(key=lambda x: abs(np.log(x[0])))

# Select top 200 most uncertain
selected_200 = uncertain_instances[:200]

# Extract texts and true labels from the selected instances
selected_texts = [x[1] for x in selected_200]
selected_labels = [x[2] for x in selected_200]

# Combine with original training data from Q1
expanded_texts = list(texts) + selected_texts
expanded_labels = list(labels) + selected_labels

# Rebuild vocabulary and count matrix
expanded_vocabulary = set()
expanded_word_count_matrix = {0: defaultdict(int), 1: defaultdict(int)}

for text, label in zip(expanded_texts, expanded_labels):
    for word in text.split():
        expanded_word_count_matrix[label][word] += 1
        expanded_vocabulary.add(word)

expanded_N_vocab = len(expanded_vocabulary)
expanded_word_probability_matrix = {0: {}, 1: {}}

# Recompute word probabilities with Laplace smoothing
for word in expanded_vocabulary:
    for label in [0, 1]:
        expanded_word_probability_matrix[label][word] = (
            expanded_word_count_matrix[label][word] + alpha
        ) / (sum(expanded_word_count_matrix[label].values()) + expanded_N_vocab * alpha)

# Recompute prior probabilities and the log of them
expanded_total = len(expanded_labels)
expanded_N_scam = sum(1 for l in expanded_labels if l == 1)
expanded_N_nonmal = expanded_total - expanded_N_scam

expanded_prior_prob = {0: expanded_N_nonmal / expanded_total, 1: expanded_N_scam / expanded_total}
expanded_log_prior_prob = {0: np.log(expanded_prior_prob[0]), 1: np.log(expanded_prior_prob[1])}

print("Selecting 200 most uncertain instances\n")
print(f"New expanded prior probabilies {expanded_prior_prob}")

# Test on original test set
validation_predictions = [] 
for validation_text in unlabelled_validation:
    validation_word_counts = defaultdict(int)
    for word in validation_text.split():
        if word in expanded_vocabulary:
            validation_word_counts[word] += 1

    if len(validation_word_counts) == 0:
        validation_predictions.append(None)
        continue

    log_0 = expanded_log_prior_prob[0] + log_posterior_count_class(validation_word_counts, 0, expanded_vocabulary, expanded_word_probability_matrix)
    log_1 = expanded_log_prior_prob[1] + log_posterior_count_class(validation_word_counts, 1, expanded_vocabulary, expanded_word_probability_matrix)
    prediction = 0 if log_0 > log_1 else 1
    validation_predictions.append(prediction)

# Clean predictions for scoring
valid_updated_preds = [p for p in validation_predictions if p is not None]
valid_test_labels = [l for p, l in zip(validation_predictions, labels_validation) if p is not None]

# Metrics
accuracy = accuracy_score(valid_test_labels, valid_updated_preds)
precision, recall, f1, _ = precision_recall_fscore_support(valid_test_labels, valid_updated_preds, average=None)
conf_matrix = confusion_matrix(valid_test_labels, valid_updated_preds)

print(f"Accuracy: {accuracy:.4f}")
print(f"Confusion Matrix:\n{conf_matrix}")
print(f"Precision (Non-Malicious, Scam): {precision}")
print(f"Recall (Non-Malicious, Scam): {recall}")

Selecting 200 most uncertain instances

New expanded prior probabilies {0: 0.8045454545454546, 1: 0.19545454545454546}
Accuracy: 0.9675
Confusion Matrix:
[[311   9]
 [  4  76]]
Precision (Non-Malicious, Scam): [0.98730159 0.89411765]
Recall (Non-Malicious, Scam): [0.971875 0.95    ]


## 4. Supervised model evaluation

In [7]:
for text, label in zip(expanded_texts, expanded_labels):
    for word in text.split():
        expanded_word_count_matrix[label][word] += 1
        expanded_vocabulary.add(word)

expanded_N_vocab = len(expanded_vocabulary)
expanded_word_probability_matrix = {0: {}, 1: {}}

# Recompute word probabilities with Laplace smoothing
for word in expanded_vocabulary:
    for label in [0, 1]:
        expanded_word_probability_matrix[label][word] = (
            expanded_word_count_matrix[label][word] + alpha
        ) / (sum(expanded_word_count_matrix[label].values()) + expanded_N_vocab * alpha)

# Recompute prior probabilities and the log of them
expanded_total = len(expanded_labels)
expanded_N_scam = sum(1 for l in expanded_labels if l == 1)
expanded_N_nonmal = expanded_total - expanded_N_scam

expanded_prior_prob = {0: expanded_N_nonmal / expanded_total, 1: expanded_N_scam / expanded_total}
expanded_log_prior_prob = {0: np.log(expanded_prior_prob[0]), 1: np.log(expanded_prior_prob[1])}

print("Evaluation on test set")
print(f"New expanded prior probabilies {expanded_prior_prob}")

# Test on original test set
updated_predictions = []
updated_conf_ratios = [] 
for test_text in test_texts:
    test_word_counts = defaultdict(int)
    for word in test_text.split():
        if word in expanded_vocabulary:
            test_word_counts[word] += 1

    if len(test_word_counts) == 0:
        updated_predictions.append(None)
        continue

    log_0 = expanded_log_prior_prob[0] + log_posterior_count_class(test_word_counts, 0, expanded_vocabulary, expanded_word_probability_matrix)
    log_1 = expanded_log_prior_prob[1] + log_posterior_count_class(test_word_counts, 1, expanded_vocabulary, expanded_word_probability_matrix)
    
    prediction = 0 if log_0 > log_1 else 1
    updated_predictions.append(prediction)
    
    log_conf_ratio = log_0 - log_1
    conf_ratio = np.exp(log_conf_ratio)
    updated_conf_ratios.append((test_text, prediction, conf_ratio))

# Clean predictions for scoring
valid_updated_preds = [p for p in updated_predictions if p is not None]
valid_test_labels = [l for p, l in zip(updated_predictions, test_labels) if p is not None]

# Metrics
accuracy = accuracy_score(valid_test_labels, valid_updated_preds)
precision, recall, f1, _ = precision_recall_fscore_support(valid_test_labels, valid_updated_preds, average=None)
conf_matrix = confusion_matrix(valid_test_labels, valid_updated_preds)

print(f"Accuracy: {accuracy:.4f}")
print(f"Confusion Matrix:\n{conf_matrix}")
print(f"Precision (Non-Malicious, Scam): {precision}")
print(f"Recall (Non-Malicious, Scam): {recall}")

# Most probable words after semi-supervised learning
top_scam_words = sorted(expanded_word_probability_matrix[1].items(), key=lambda x: x[1], reverse=True)[:10]
top_nonmal_words = sorted(expanded_word_probability_matrix[0].items(), key=lambda x: x[1], reverse=True)[:10]

print("\nMost probable words in scam class:")
for word, prob in top_scam_words:
    print(f"{word}: {prob:.4f}")

print("\nMost probable words in non-malicious class:")
for word, prob in top_nonmal_words:
    print(f"{word}: {prob:.4f}")

# Most predictive words after semi-supervised learning
ratios = {
    word: expanded_word_probability_matrix[0][word] / expanded_word_probability_matrix[1][word]
    for word in expanded_vocabulary if expanded_word_probability_matrix[1][word] > 0
}

top_predictive_nonmal = sorted(ratios.items(), key=lambda x: x[1], reverse=True)[:10]
top_predictive_scam = sorted(ratios.items(), key=lambda x: x[1])[:10]

print("\nMost predictive words for non-malicious class:")
for word, ratio in top_predictive_nonmal:
    print(f"{word}: {ratio:.2f}")

print("\nMost predictive words for scam class:")
for word, ratio in top_predictive_scam:
    print(f"{word}: {ratio:.2f}")

# Sort and print results
high_conf_ratio_scam = sorted(updated_conf_ratios, key=lambda x: x[2])[:5]
low_conf_ratio_nonmal = sorted(updated_conf_ratios, key=lambda x: x[2], reverse=True)[:5]
boundary_cases = sorted(updated_conf_ratios, key=lambda x: abs(np.log(x[2])))[:5]

# Print results
print("\nExamples of Scam Classified with High Confidence:")
for text, label, R in high_conf_ratio_scam:
    print(f"R={R:.2f} | Text: {text}")

print("\nExamples of Non-Malicious Classified with High Confidence:")
for text, label, R in low_conf_ratio_nonmal:
    print(f"R={R:.2f} | Text: {text}")

print("\nExamples on the Decision Boundary (Uncertain Classification):")
for text, label, R in boundary_cases:
    print(f"R={R:.2f} | Text: {text}")

# Filter instances where confidence ratio of Q1 model is between 0.8 and 1.2
mid_confidence = [x for x in confidence_ratios if 0.8 <= x[2] <= 1.2]
print(f"\nAmount of test instance with R-value between 0.8 and 1.2 (using Q1 model): {len(mid_confidence)}")

# Filter instances where confidence ratio of Q3 model is between 0.8 and 1.2
updated_mid_confidence = [x for x in updated_conf_ratios if 0.8 <= x[2] <= 1.2]
print(f"Amount of test instance with R-value between 0.8 and 1.2 (using Q3 model): {len(updated_mid_confidence)}")

Evaluation on test set
New expanded prior probabilies {0: 0.8045454545454546, 1: 0.19545454545454546}
Accuracy: 0.9770
Confusion Matrix:
[[786  14]
 [  9 191]]
Precision (Non-Malicious, Scam): [0.98867925 0.93170732]
Recall (Non-Malicious, Scam): [0.9825 0.955 ]

Most probable words in scam class:
.: 0.0647
!: 0.0274
,: 0.0267
call: 0.0228
£: 0.0151
free: 0.0117
/: 0.0101
&: 0.0101
2: 0.0099
?: 0.0097

Most probable words in non-malicious class:
.: 0.0863
,: 0.0283
?: 0.0273
u: 0.0194
...: 0.0189
!: 0.0186
..: 0.0153
&: 0.0133
;: 0.0133
go: 0.0113

Most predictive words for non-malicious class:
gt: 101.24
lt: 100.28
:): 89.62
ü: 59.58
lor: 53.77
;: 45.44
d: 39.24
da: 37.30
...: 36.01
let: 33.43

Most predictive words for scam class:
prize: 0.00
tone: 0.01
select: 0.01
paytm: 0.01
code: 0.01
ringtone: 0.02
won: 0.02
18: 0.02
claim: 0.02
mob: 0.02

Examples of Scam Classified with High Confidence:
R=0.00 | Text: . 4 + call £ - * holiday & urgent 18 t landline 150ppm cash cs await collect