
# 6: Naive Bayes Classification

The Naïve Bayes variant used here, scikit-learn's MultinomialNB, models features as counts and therefore requires non-negative values. The balance and pdays features in this dataset contain negative values, so fitting the model on them raises a ValueError.
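
To see the error in isolation, here is a minimal sketch on toy data. The arrays `X_toy` and `y_toy` are made up for illustration and are not part of the bank dataset; the exact wording of the error message may vary between scikit-learn versions.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy data (hypothetical): the second feature contains a negative value
X_toy = np.array([[3, -1], [1, 2], [0, 5], [4, 0]])
y_toy = np.array([0, 1, 1, 0])

try:
    MultinomialNB().fit(X_toy, y_toy)
    error_message = None
except ValueError as exc:
    # MultinomialNB rejects negative feature values at fit time
    error_message = str(exc)

print(error_message)
```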

To address this issue, you can consider one of the following options:

Remove Negative Values: If the negative values in balance and pdays are outliers or errors, drop those records from the dataset. The code below takes this approach. Because it discards data, check how many rows are affected before relying on it.

Feature Transformation: Apply a transformation that makes the features non-negative. For example, add a constant to each feature so that its minimum value becomes zero.
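
The shifting option can be sketched as follows. This is only an illustration: the `sample` DataFrame is a made-up stand-in for the balance and pdays columns, and shifting by each column's minimum is one possible choice of constant, not the only one.

```python
import pandas as pd

# Hypothetical sample standing in for the 'balance' and 'pdays' columns
sample = pd.DataFrame({"balance": [-200, 0, 1500], "pdays": [-1, 10, 90]})

# Per-column offset: the column minimum if it is negative, otherwise 0,
# so columns that are already non-negative are left unchanged
offsets = sample.min().clip(upper=0)

# Subtracting a negative offset shifts the column up so its minimum is 0
shifted = sample - offsets

print(shifted)
```

In a real train/test workflow the offsets would presumably be computed on the training set and reused for the test set, so that both are shifted by the same amount.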

In [None]:
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Load the dataset (replace '/content/bank.csv' with the actual path to your dataset)
bank_data = pd.read_csv('/content/bank.csv')
# Remove rows with negative values in 'balance' and 'pdays';
# .copy() avoids a SettingWithCopyWarning when the response column is modified later
bank_data = bank_data[(bank_data['balance'] >= 0) & (bank_data['pdays'] >= 0)].copy()

# Replace these columns with your actual predictor and response variable names
predictor_columns = ["age", "balance", "day", "duration", "campaign", "pdays", "previous"]
response_column = "deposit"

# Convert 'yes' and 'no' to 1 and 0
bank_data[response_column] = bank_data[response_column].map({'yes': 1, 'no': 0})

# Create dummy variables for categorical variables using get_dummies.
# Encoding before the split guarantees that train and test share the same columns.
categorical_columns = ["job", "marital", "education", "default", "housing", "loan", "contact", "month", "poutcome"]
bank_data = pd.get_dummies(bank_data, columns=categorical_columns)

# Split the data into training and test sets
bank_train, bank_test = train_test_split(bank_data, test_size=0.2, random_state=42)

# Extract predictor variables and target variable.
# Of the dummy-encoded columns, only the 'job_' dummies are used as predictors.
job_columns = [col for col in bank_train.columns if col.startswith('job_')]

X_train = bank_train[predictor_columns + job_columns]
y_train = bank_train[response_column]

# Use the same column list for the test set so the feature order matches
X_test = bank_test[predictor_columns + job_columns]
y_test = bank_test[response_column]

# Create and train the Naïve Bayes model
nb_model = MultinomialNB().fit(X_train, y_train)

# Generate predictions for the test set
Y_predicted = nb_model.predict(X_test)

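# Evaluate the model with a contingency table of actual vs. predicted labels
contingency_table = pd.crosstab(y_test, Y_predicted, rownames=['Actual'], colnames=['Predicted'])
# Add row and column totals
contingency_table['Total'] = contingency_table.sum(axis=1)
contingency_table.loc['Total'] = contingency_table.sum()

print("Contingency Table:")
print(contingency_table)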
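
The contingency table above can be condensed into a single accuracy figure. The sketch below uses scikit-learn's `confusion_matrix` and `accuracy_score` on made-up labels; `y_true` and `y_hat` are hypothetical stand-ins for `y_test` and `Y_predicted`, not results from the bank dataset.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels standing in for y_test and Y_predicted
y_true = np.array([1, 0, 1, 1, 0, 0])
y_hat = np.array([1, 0, 0, 1, 0, 1])

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_true, y_hat)

# Accuracy is the share of predictions that match the actual label
accuracy = accuracy_score(y_true, y_hat)

print(cm)
print(f"Accuracy: {accuracy:.3f}")
```

The diagonal of `cm` holds the correctly classified counts, so accuracy equals the diagonal sum divided by the total number of samples.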
