# **Given a set of documents to be classified, use the Naive Bayes classifier to perform the task.**

## **Step 1: Import Libraries and Modules**

In [11]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score


## **Step 2: Load and preprocess the data**

In [12]:
# document.csv has no header row; each row holds a message and its label
msg = pd.read_csv('document.csv', names=['message', 'label'])
print("Total Instances of Dataset: ", msg.shape[0])

# Map labels to numerical values: 'pos' -> 1, 'neg' -> 0
msg['labelnum'] = msg['label'].map({'pos': 1, 'neg': 0})

# Split into features (messages) and labels (sentiments)
X = msg['message']
y = msg['labelnum']

Total Instances of Dataset:  18


## **Step 3: Split the data into training and testing sets**

In [13]:
# Default 75/25 split; no random_state is set, so the split and the scores below vary between runs
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y)
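
Because `train_test_split` shuffles at random, every run of the cell above can produce a different split. The following is a minimal sketch of a reproducible, stratified variant. The toy DataFrame, `test_size=0.25`, and `random_state=42` are illustrative assumptions and do not come from the original notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for document.csv
toy = pd.DataFrame({
    "message": [f"sample text {i}" for i in range(8)],
    "labelnum": [1, 0, 1, 0, 1, 0, 1, 0],
})

def split(df):
    # random_state fixes the shuffle; stratify keeps the class ratio in both parts
    return train_test_split(
        df["message"], df["labelnum"],
        test_size=0.25, random_state=42, stratify=df["labelnum"],
    )

Xtr, Xte, ytr, yte = split(toy)
Xtr2, Xte2, ytr2, yte2 = split(toy)

print(len(Xtr), len(Xte))        # 6 training rows, 2 test rows
print(list(Xte) == list(Xte2))   # True: same seed, same split
```

With a dataset as small as 18 rows, stratification helps keep both classes present in the test set.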

## **Step 4: Vectorize the text data**

In [14]:
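vectorizer = CountVectorizer()
# Learn the vocabulary from the training messages only, then build their word-count matrix
Xtrain_dm = vectorizer.fit_transform(Xtrain)
# Reuse the same vocabulary for the test messages; unseen words are ignored
Xtest_dm = vectorizer.transform(Xtest)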

## **Step 5: Train the Naive Bayes classifier**

In [15]:
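# Multinomial Naive Bayes suits word-count features; the default alpha=1.0 applies Laplace smoothing
clf = MultinomialNB()
clf.fit(Xtrain_dm, ytrain)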
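
As a rough illustration of what the fitted classifier holds, the sketch below trains `MultinomialNB` on four made-up messages and inspects the learned class priors and the predicted probabilities. The texts, labels, and variable names are assumptions for demonstration and are not taken from `document.csv`.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training data: 1 = pos, 0 = neg
texts = ["good great", "great fun", "bad awful", "awful boring"]
labels = [1, 1, 0, 0]

vec = CountVectorizer()
X_demo = vec.fit_transform(texts)
nb = MultinomialNB(alpha=1.0).fit(X_demo, labels)

# Class priors are estimated from the label frequencies
priors = np.exp(nb.class_log_prior_)
print(nb.classes_, priors)

# predict_proba returns one probability per class, in the order of nb.classes_
proba = nb.predict_proba(vec.transform(["great fun"]))
pred_demo = nb.predict(vec.transform(["great fun"]))
print(proba, pred_demo)
```

Here both classes have a prior of 0.5, so the prediction is driven entirely by the smoothed word likelihoods.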

## **Step 6: Evaluate the model's performance**

In [16]:
pred = clf.predict(Xtest_dm)
print('Accuracy: ', accuracy_score(ytest, pred))
# Rows are true labels and columns are predicted labels, both ordered [0 (neg), 1 (pos)]
print('Confusion Matrix:\n', confusion_matrix(ytest, pred))

Accuracy:  0.6
Confusion Matrix:
 [[1 0]
 [2 2]]
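
Step 1 imports `precision_score` and `recall_score`, but the evaluation cell does not call them. The sketch below shows how they relate to the confusion matrix printed above. The label vectors are reconstructed to match that matrix; the order of the samples is an assumption.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score

# Labels reconstructed to reproduce the confusion matrix [[1 0], [2 2]];
# the ordering of the individual samples is assumed.
y_true = [0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred)   # TP / (TP + FP) = 2 / (2 + 0)
rec = recall_score(y_true, y_pred)       # TP / (TP + FN) = 2 / (2 + 2)

print(cm)
print("Accuracy: ", acc)    # 0.6
print("Precision:", prec)   # 1.0
print("Recall:   ", rec)    # 0.5
```

Every message predicted as positive was truly positive, but the model missed half of the positive messages, which accuracy alone does not show.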


## **Step 7: Predict sentiment for user input**

In [17]:
user_input = input("Enter a message to predict its sentiment: ")
# Reuse the fitted vectorizer so the input is mapped onto the training vocabulary
user_input_dm = vectorizer.transform([user_input])
user_pred = clf.predict(user_input_dm)
sentiment = 'pos' if user_pred[0] == 1 else 'neg'
print(f"The sentiment of your message is: {sentiment}")

Enter a message to predict its sentiment: Hello Kush How are you 
The sentiment of your message is: neg
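
One thing to keep in mind with free-form input: if none of its words appear in the training vocabulary, the count vector is all zeros and `MultinomialNB` falls back on the class priors. Whether that happened in the run above cannot be determined from the notebook. The sketch below demonstrates the effect on made-up data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up, deliberately imbalanced data: three neg (0) messages, one pos (1)
texts = ["bad awful", "boring dull", "awful waste", "great fun"]
labels = [0, 0, 0, 1]

vec = CountVectorizer()
nb = MultinomialNB().fit(vec.fit_transform(texts), labels)

# None of these words are in the vocabulary, so the feature vector is all zeros
oov = vec.transform(["hello there friend"])
oov_pred = nb.predict(oov)[0]

print(oov.toarray().sum())   # 0: no known words
print(oov_pred)              # 0: the majority class wins through its prior
```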
