# SMS Classifier

### Importing Libraries

In [11]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report

### Loading the dataset

In [12]:
df = pd.read_csv('SMSSpamCollection.csv', sep='\t', names=['label', 'message'])

In [13]:
df

| | label | message |
|---|---|---|
| 0 | ham | Go until jurong point, crazy.. Available only ... |
| 1 | ham | Ok lar... Joking wif u oni... |
| 2 | spam | Free entry in 2 a wkly comp to win FA Cup fina... |
| 3 | ham | U dun say so early hor... U c already then say... |
| 4 | ham | Nah I don't think he goes to usf, he lives aro... |
| ... | ... | ... |
| 5567 | spam | This is the 2nd time we have tried 2 contact u... |
| 5568 | ham | Will ü b going to esplanade fr home? |
| 5569 | ham | Pity, * was in mood for that. So...any other s... |
| 5570 | ham | The guy did some bitching but I acted like i'd... |
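Before preprocessing, it can help to confirm that the file parsed into the two expected columns and to see how the classes are distributed. The sketch below is a minimal illustration on a hypothetical three-row, tab-separated sample held in memory (the rows and the `raw` and `toy` names are assumptions, not the real file), using the same `sep` and `names` arguments as the loading cell.

```python
import io
import pandas as pd

# Hypothetical tab-separated sample with no header row, mimicking the file layout.
raw = (
    "ham\tOk lar... Joking wif u oni...\n"
    "spam\tFree entry in 2 a wkly comp\n"
    "ham\tWill you be home?\n"
)

# Same arguments as the loading cell: tab separator, explicit column names.
toy = pd.read_csv(io.StringIO(raw), sep='\t', names=['label', 'message'])

# Count how many messages fall into each class.
counts = toy['label'].value_counts()
print(toy.shape)
print(counts)
```

On the real DataFrame, the same `value_counts()` call on `df['label']` would show how many ham and spam messages were loaded.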


### Data preprocessing

Here, we encode the text labels as integers so the classifier can use them: `ham` becomes 0 and `spam` becomes 1.

In [14]:
df['label'] = df['label'].map({'ham': 0, 'spam': 1})
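One thing to watch with `Series.map` and a dictionary: any value that is not a key in the dictionary becomes `NaN` rather than raising an error. The sketch below shows one way to check for that on a hypothetical toy frame (the rows, including the deliberately miscapitalised `'Spam'`, are made up for illustration).

```python
import pandas as pd

# Hypothetical rows; the last label is deliberately miscapitalised.
toy = pd.DataFrame({
    'label': ['ham', 'spam', 'ham', 'Spam'],
    'message': ['hi', 'win cash now', 'see you soon', 'free prize'],
})

encoded = toy['label'].map({'ham': 0, 'spam': 1})

# Labels missing from the dictionary are mapped to NaN; list them.
unmapped = toy.loc[encoded.isna(), 'label'].unique().tolist()
print(unmapped)
```

An empty list from the same check on the real data would indicate that every label was encoded.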

### Split the dataset into training and testing sets

In [15]:
X_train, X_test, y_train, y_test = train_test_split(df['message'], df['label'], test_size=0.2, random_state=42)
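Because spam is the minority class, a plain random split can leave the two sets with slightly different spam proportions. As an optional variation (not what the cell above does), `train_test_split` accepts a `stratify` argument that preserves the class ratio in both sets. The sketch below uses a hypothetical 20-row toy frame with 20% spam.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 4 spam (1) and 16 ham (0) messages.
toy = pd.DataFrame({
    'message': [f'msg {i}' for i in range(20)],
    'label': [1] * 4 + [0] * 16,
})

# stratify keeps the spam/ham ratio the same in the train and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(
    toy['message'], toy['label'],
    test_size=0.25, random_state=42, stratify=toy['label'],
)

print(y_tr.mean(), y_te.mean())
```

Both printed proportions should match the overall spam share of the toy data.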

### Text Vectorization

In [16]:
vectorizer = CountVectorizer()
# Learn the vocabulary from the training messages only, then reuse it for the test set
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)

### Train the model

In [17]:
classifier = MultinomialNB()
classifier.fit(X_train_vectorized, y_train)

MultinomialNB()
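As an alternative arrangement (not the one used in this notebook), the vectorizer and classifier can be chained with scikit-learn's `make_pipeline`, so raw text goes in and predictions come out in one call. The sketch below trains on four hypothetical messages; the data and the `model` name are illustrative assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data: 1 = spam, 0 = ham.
toy_messages = ['win free prize now', 'free cash win',
                'see you at lunch', 'are you home now']
toy_labels = [1, 1, 0, 0]

# The pipeline vectorizes the text and then fits the classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(toy_messages, toy_labels)

# Raw strings can be passed directly; the pipeline applies transform internally.
toy_predictions = model.predict(['free prize', 'see you'])
print(toy_predictions)
```

A pipeline like this removes the risk of forgetting to vectorize new input before calling `predict`.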

### Make predictions on the test set

In [18]:
predictions = classifier.predict(X_test_vectorized)

### Evaluate the model

In [19]:
accuracy = accuracy_score(y_test, predictions)
report = classification_report(y_test, predictions)
print(f'Accuracy: {accuracy}')
print(f'Classification Report:\n{report}')

Accuracy: 0.9919282511210762
Classification Report:
              precision    recall  f1-score   support

           0       0.99      1.00      1.00       966
           1       1.00      0.94      0.97       149

    accuracy                           0.99      1115
   macro avg       1.00      0.97      0.98      1115
weighted avg       0.99      0.99      0.99      1115
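The report above shows perfect precision but 0.94 recall for spam, which suggests the errors are spam messages that slipped through as ham. A confusion matrix gives the raw counts behind those ratios. The sketch below uses small hypothetical label lists so it runs on its own; on the notebook's data the same call would take `y_test` and `predictions`.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels: 1 = spam, 0 = ham.
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

# With labels=[0, 1], ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f'ham kept: {tn}, ham flagged as spam: {fp}, '
      f'spam missed: {fn}, spam caught: {tp}')
```

For a spam filter, the `fp` count (legitimate messages flagged as spam) is often the more costly error, so it is worth looking at separately from overall accuracy.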



### Result

In [20]:
user_input = input('Enter SMS Message: ')
user_input_vectorized = vectorizer.transform([user_input])
prediction = classifier.predict(user_input_vectorized)
if prediction[0] == 1:
    print('Hey! It is a spam SMS!')
else:
    print("Don't worry! It is a non-spam SMS!")

Enter SMS Message: how are you?
Don't worry! It is a non-spam SMS!
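The interactive cell above can also be wrapped in a reusable function that reports how confident the model is. The sketch below is one possible shape, assuming a hypothetical helper named `classify_sms` and a toy model trained on four made-up messages; it relies on `MultinomialNB.predict_proba`, which returns one probability per class.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB


def classify_sms(message, vectorizer, classifier):
    """Return ('spam' or 'ham', estimated spam probability) for one message."""
    features = vectorizer.transform([message])
    # classes_ gives the column order of predict_proba; find the spam column (label 1).
    spam_col = list(classifier.classes_).index(1)
    spam_prob = float(classifier.predict_proba(features)[0, spam_col])
    # A 0.5 threshold matches predict() for two classes, apart from exact ties.
    label = 'spam' if spam_prob >= 0.5 else 'ham'
    return label, spam_prob


# Hypothetical toy model so the example runs on its own: 1 = spam, 0 = ham.
toy_messages = ['win free prize now', 'free cash win',
                'see you at lunch', 'are you home now']
toy_labels = [1, 1, 0, 0]
toy_vectorizer = CountVectorizer()
toy_classifier = MultinomialNB().fit(
    toy_vectorizer.fit_transform(toy_messages), toy_labels)

label, spam_prob = classify_sms('free prize', toy_vectorizer, toy_classifier)
print(label, round(spam_prob, 3))
```

In the notebook, the same function could be called with the fitted `vectorizer` and `classifier` objects in place of the toy ones.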
