Additional notes:

Naive Bayes Algorithm: https://scikit-learn.org/stable/api/sklearn.naive_bayes.html

https://ieeexplore.ieee.org/document/10276274

In [1]:
# Sample data (text messages and labels)
texts = [
    "Congratulations! You won a $1000 Walmart gift card!",
    "Hey, are we still meeting for lunch today?",
    "Free entry in 2 a weekly competition to win prizes",
    "Can you send me the report by 5 PM?",
    "Win a brand new car just by entering this contest!",
    "Don't forget to call mom tonight.",
    "URGENT! Your account has been compromised. Verify now.",
    "How was your weekend?",
]
 
labels = [
    "spam",
    "ham",
    "spam",
    "ham",
    "spam",
    "ham",
    "spam",
    "ham"
]

In [2]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

In [3]:
# Convert text to feature vectors
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)  # 🔸 Converts text into word frequency features

In [4]:
# Labels
y = labels

In [5]:
# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

In [6]:
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
 
# Initialize and train model
model = MultinomialNB()
model.fit(X_train, y_train)  # 🔸 MultinomialNB is designed for count features such as word counts
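
As a rough illustration of what `MultinomialNB` learns, the sketch below recomputes the smoothed per-class word probabilities by hand and compares them with the fitted `feature_log_prob_` attribute. It assumes the default Laplace smoothing (`alpha=1.0`). The toy documents and the names `docs`, `clf`, `manual` and `spam_idx` are made up for this example.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data (illustrative only)
docs = ["win prizes now", "win cash", "lunch today", "call mom today"]
y = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer()
X = vec.fit_transform(docs)
clf = MultinomialNB(alpha=1.0).fit(X, y)

counts = X.toarray()
n_features = counts.shape[1]

# Total count of each word across all spam documents
spam_counts = counts[np.array(y) == "spam"].sum(axis=0)

# Smoothed estimate: (word count + alpha) / (all spam words + alpha * vocabulary size)
manual = np.log((spam_counts + 1.0) / (spam_counts.sum() + 1.0 * n_features))

spam_idx = list(clf.classes_).index("spam")
matches = np.allclose(manual, clf.feature_log_prob_[spam_idx])
print("Manual estimate matches scikit-learn:", matches)
```

The smoothing term keeps a word that never appeared in one class from forcing that class's probability to zero.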
 

In [7]:
# Predict
y_pred = model.predict(X_test)

In [8]:
# Evaluate
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))

Accuracy: 1.0

Classification Report:
               precision    recall  f1-score   support

         ham       1.00      1.00      1.00         2

    accuracy                           1.00         2
   macro avg       1.00      1.00      1.00         2
weighted avg       1.00      1.00      1.00         2
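
Note that the report above lists only `ham`, with a support of 2: the two held-out messages are both ham, so the perfect score says nothing about how well spam is detected. One possible remedy, sketched below, is to pass `stratify` to `train_test_split` so that both classes appear in the test set. The `indices` list is a stand-in for the feature matrix and is an assumption of this example. With only eight messages, any score remains a very rough estimate.

```python
from sklearn.model_selection import train_test_split

# Same label pattern as the notebook: spam and ham alternate, 4 of each
labels = ["spam", "ham"] * 4
indices = list(range(len(labels)))  # stand-in for the rows of X

idx_train, idx_test, y_train, y_test = train_test_split(
    indices,
    labels,
    test_size=0.25,
    random_state=42,
    stratify=labels,  # keep the spam/ham ratio the same in both splits
)

print("Test labels:", sorted(y_test))
print("Train labels:", sorted(y_train))
```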



In [9]:
# New text messages to classify
new_messages = [
    "Win cash prizes now, limited time offer!",
    "Are you coming to the meeting tomorrow?",
    "Update your account info to avoid suspension",
    "Let's grab coffee after work",
]

In [10]:
# Transform using the same vectorizer used for training
new_vectors = vectorizer.transform(new_messages)  # 🔸 Must use same vectorizer as before
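
One way to make sure the same vectorizer is always reused is to bundle it with the classifier in a scikit-learn `Pipeline`. The sketch below is an alternative to the separate steps in this notebook, not what the cells above do. The training messages and the names `train_texts`, `train_labels`, `pipe` and `preds` are illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Small illustrative training set
train_texts = [
    "Win a free gift card now",
    "Free entry to win prizes",
    "Are we meeting for lunch today",
    "Send me the report today",
]
train_labels = ["spam", "spam", "ham", "ham"]

# The pipeline fits the vectorizer and the classifier together
pipe = make_pipeline(CountVectorizer(), MultinomialNB())
pipe.fit(train_texts, train_labels)

# Raw strings go in; the fitted vectorizer is applied automatically
preds = pipe.predict(["win free prizes"])
print(preds)
```

A side benefit is that the vectorizer is fitted only on the data passed to `fit`, so a held-out test set cannot influence the vocabulary.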

In [11]:
# Make predictions
new_predictions = model.predict(new_vectors)

In [12]:
# Show results
for msg, label in zip(new_messages, new_predictions):
    print(f"Message: '{msg}' -> Prediction: {label}")

Message: 'Win cash prizes now, limited time offer!' -> Prediction: spam
Message: 'Are you coming to the meeting tomorrow?' -> Prediction: ham
Message: 'Update your account info to avoid suspension' -> Prediction: spam
Message: 'Let's grab coffee after work' -> Prediction: spam
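
The last prediction looks wrong: "Let's grab coffee after work" reads like ham. A likely explanation is that none of its words appear in the training vocabulary, so its feature vector is all zeros and the model falls back on the class priors. Since the two held-out test messages were both ham, the training set contains more spam than ham, which would tip the prior toward spam. The sketch below rebuilds the notebook's pipeline to check this; `message`, `vector`, `known_words`, `proba` and `priors` are names introduced here.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Same data and settings as the notebook cells above
texts = [
    "Congratulations! You won a $1000 Walmart gift card!",
    "Hey, are we still meeting for lunch today?",
    "Free entry in 2 a weekly competition to win prizes",
    "Can you send me the report by 5 PM?",
    "Win a brand new car just by entering this contest!",
    "Don't forget to call mom tonight.",
    "URGENT! Your account has been compromised. Verify now.",
    "How was your weekend?",
]
labels = ["spam", "ham"] * 4

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=42
)
model = MultinomialNB().fit(X_train, y_train)

message = "Let's grab coffee after work"
vector = vectorizer.transform([message])
known_words = vector.nnz  # count of non-zero entries, i.e. words in the vocabulary

proba = model.predict_proba(vector)[0]
priors = np.exp(model.class_log_prior_)  # class frequencies in the training set

print("Known words in message:", known_words)
print("Classes:", model.classes_)
print("Predicted probabilities:", proba)
print("Class priors:", priors)
```

When no word is recognised, the predicted probabilities equal the priors, so the prediction reflects the class balance of the training set rather than the message itself. More training data is the usual remedy.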



In [None]:
Additional notes:

Machine Learning Algorithms: https://www.linkedin.com/posts/riyazahd_machinelearning-mlalgorithms-datascience-activity-7317014688880308224-TQXW?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAlrAg0BOtlbNoOCIcZd5OPk_gF9qChJn38

Class Recording: https://365umedumy.sharepoint.com/sites/2025-Mar-MachineLearningforDataScience-Group31RL-Sunday3pmto/Shared%20Documents/General/Recordings/Week-4-%20Applied%20ML%20_%20ML4Data%20Science%20-Group-3%20%26%20RL-Online-Class-Sunday-3%20PM%20to%206%20PM-20250413_150901-Meeting%20Recording.mp4?web=1&referrer=Teams.TEAMS-ELECTRON&referrerScenario=MeetingChicletGetLink.view