# Email Spam Detection using Machine Learning
This project trains a spam classifier in Python with scikit-learn on a small, hand-made dataset of ten email messages. The goal is to classify each message as either *spam* or *not spam* (ham).

## Importing Libraries

In [1]:
import pandas as pd
import numpy as np

## Creating Dataset


In [2]:

data = {
    'label': ['ham', 'spam', 'ham', 'spam', 'ham', 'spam', 'ham', 'spam', 'ham', 'spam'],
    'message': [
        'Hey, just wanted to check in with you about the meeting.',
        'Congratulations! You have won a $500 Amazon gift card. Click to claim now.',
        'Are we still on for dinner tonight?',
        'Urgent! Your account has been suspended. Login to verify.',
        'Let’s catch up over coffee tomorrow.',
        'Winner! You have been selected for a free trip to Paris.',
        'Please review the attached document and send feedback.',
        'You’ve been chosen for a cash prize. Respond immediately.',
        'Can we move the call to 4 PM instead of 3?',
        'Your credit card limit has been increased. Apply here!'
    ]
}


## Converting it into DataFrame

In [3]:
df = pd.DataFrame(data)

## Saving and Loading CSV

In [4]:
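# Save the dataset to disk, then load it back to confirm the round trip.
df.to_csv("email_spam.csv", index=False)
df = pd.read_csv("email_spam.csv")
df.head()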

| | label | message |
|---|---|---|
| 0 | ham | Hey, just wanted to check in with you about th... |
| 1 | spam | Congratulations! You have won a $500 Amazon gi... |
| 2 | ham | Are we still on for dinner tonight? |
| 3 | spam | Urgent! Your account has been suspended. Login... |
| 4 | ham | Let’s catch up over coffee tomorrow. |


## Vectorization

In [5]:
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df['message'])

y = df['label'].map({'ham':0, 'spam':1})

## Train-test Split

In [6]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
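
As a quick sanity check on the split, the sketch below reproduces the same `train_test_split` arguments on placeholder data and counts the classes on each side. The `features` array is a stand-in assumed for illustration; only the label pattern (five ham, five spam) mirrors the dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Same alternating ham (0) / spam (1) pattern as the dataset labels.
labels = np.array([0, 1] * 5)
# Placeholder features; the real notebook uses the count matrix X.
features = np.arange(10).reshape(-1, 1)

f_train, f_test, l_train, l_test = train_test_split(
    features, labels, test_size=0.2, random_state=42, stratify=labels
)

# Sizes of the two splits.
print(len(l_train), len(l_test))
# Per-class counts (index 0 = ham, index 1 = spam) in each split.
print(np.bincount(l_train), np.bincount(l_test))
```

With `stratify` set, both splits keep the 50/50 class balance, which matters when the test split is this small.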

## Prediction and Model Evaluation

In [7]:
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

model = MultinomialNB()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy of the Model :", accuracy)

Accuracy of the Model : 1.0


### Conclusion

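The model was trained with scikit-learn's multinomial Naive Bayes classifier (`MultinomialNB`).  
It achieved an accuracy of *100%* on the test split, but that split holds only 2 of the 10 messages, so the score says little about performance on real email.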
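
Because a single small split gives a noisy estimate, one option is stratified k-fold cross-validation, sketched below. This is not part of the original notebook: the pipeline, the choice of 5 folds, and the names `messages`, `labels`, `pipeline`, and `scores` are assumptions made for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# The ten messages from the dataset above, with ham = 0 and spam = 1.
messages = [
    'Hey, just wanted to check in with you about the meeting.',
    'Congratulations! You have won a $500 Amazon gift card. Click to claim now.',
    'Are we still on for dinner tonight?',
    'Urgent! Your account has been suspended. Login to verify.',
    'Let’s catch up over coffee tomorrow.',
    'Winner! You have been selected for a free trip to Paris.',
    'Please review the attached document and send feedback.',
    'You’ve been chosen for a cash prize. Respond immediately.',
    'Can we move the call to 4 PM instead of 3?',
    'Your credit card limit has been increased. Apply here!',
]
labels = [0, 1] * 5

# Putting the vectorizer inside the pipeline means the vocabulary is
# re-fitted on each training fold, so no test words leak into training.
pipeline = make_pipeline(CountVectorizer(), MultinomialNB())

# 5 folds of 2 messages each (one ham, one spam per fold).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(pipeline, messages, labels, cv=cv)

print("Per-fold accuracy:", scores)
print("Mean accuracy:", scores.mean())
```

Even the averaged score should be read cautiously on ten messages; a larger labeled dataset is the real fix.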



## Custom Email Prediction

In this final step, you can test the model on your own email text. The model analyzes the input and predicts whether the email is *Spam* or *Ham*.

### Example:

In [12]:

# Read one email from the user and vectorize it with the fitted vectorizer.
custom_email = input("Enter a custom email to check if it's spam or ham:\n➡ ")
custom_vector = vectorizer.transform([custom_email])
prediction = model.predict(custom_vector)

# Label 1 was mapped to spam and 0 to ham when y was built.
print("\nResult:")
if prediction[0] == 1:
    print("Predicted Label: Spam")
else:
    print("Predicted Label: Ham")


Result:
Predicted Label: Ham
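
For non-interactive use, the same prediction step can be wrapped in a function. The sketch below is one possible shape, not the notebook's own code: `classify_email`, `demo_vectorizer`, and `demo_model` are hypothetical names, and the model is trained on just four of the dataset's messages to keep the example self-contained.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB


def classify_email(text, vectorizer, model):
    """Return (label, spam_probability) for a single email string."""
    vector = vectorizer.transform([text])
    # Find the predict_proba column that corresponds to class 1 (spam).
    spam_column = list(model.classes_).index(1)
    spam_probability = float(model.predict_proba(vector)[0][spam_column])
    label = "Spam" if model.predict(vector)[0] == 1 else "Ham"
    return label, spam_probability


# Tiny stand-in training set: four messages taken from the dataset above.
train_messages = [
    'Are we still on for dinner tonight?',
    'Winner! You have been selected for a free trip to Paris.',
    'Can we move the call to 4 PM instead of 3?',
    'Urgent! Your account has been suspended. Login to verify.',
]
train_labels = [0, 1, 0, 1]  # ham = 0, spam = 1

demo_vectorizer = CountVectorizer()
demo_model = MultinomialNB()
demo_model.fit(demo_vectorizer.fit_transform(train_messages), train_labels)

label, spam_probability = classify_email(
    'Are we still on for dinner tonight?', demo_vectorizer, demo_model
)
print(label, round(spam_probability, 3))
```

Returning the spam probability alongside the label makes it easier to see how confident the model is, which a bare `predict` call hides.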
