# Naive Bayes Algorithm

##### Naive Bayes is a supervised machine learning algorithm for classification. It applies Bayes' theorem to compute the probability that a data point belongs to each class given its features, then predicts the most probable class. In disease prediction, it estimates the likelihood that a patient has a disease given their symptoms and other attributes. It assumes the features are conditionally independent given the class. This assumption rarely holds exactly, but the classifier often performs well in practice and is fast to train, even on large datasets.
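
To make the probability calculation concrete, the following is a minimal sketch that computes the class posteriors by hand and compares them with scikit-learn's `GaussianNB`. The toy arrays `X`, `y`, the point `x_new`, and the helper `manual_posterior` are hypothetical and are not part of the notebook's dataset. The sketch assumes Gaussian class-conditional densities per feature and ignores the small `var_smoothing` term, so the two results should agree only approximately.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: 6 samples, 2 features, 2 classes
X = np.array([[1.0, 2.1], [1.2, 1.9], [0.8, 2.0],
              [3.0, 4.2], [3.2, 3.9], [2.9, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

def manual_posterior(X, y, x_new):
    """Posterior P(class | x_new) under the Gaussian Naive Bayes assumptions."""
    log_joint = []
    for c in np.unique(y):
        Xc = X[y == c]
        prior = len(Xc) / len(X)                       # P(class)
        mean, var = Xc.mean(axis=0), Xc.var(axis=0)    # per-feature Gaussian parameters
        # Conditional independence: the log-likelihood is a sum over features
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x_new - mean) ** 2 / var)
        log_joint.append(np.log(prior) + log_lik)      # log P(class) + log P(x | class)
    log_joint = np.array(log_joint)
    post = np.exp(log_joint - log_joint.max())         # subtract max for numerical stability
    return post / post.sum()                           # normalize (Bayes' theorem denominator)

x_new = np.array([1.1, 2.0])
manual = manual_posterior(X, y, x_new)
sklearn_proba = GaussianNB().fit(X, y).predict_proba([x_new])[0]
print("manual: ", manual)
print("sklearn:", sklearn_proba)
```

Working in log space avoids underflow when many small per-feature likelihoods are multiplied together.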

In [127]:
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import GridSearchCV, train_test_split

# Load the cleaned dataset
data = pd.read_csv("Dataset_cleaned.csv")

In [128]:
# Separate the predictors from the target column, TenYearCHD
features = data.drop("TenYearCHD", axis=1)
target = data["TenYearCHD"]
# Hold out 20% of the rows as a test set for model evaluation
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.2, random_state=42)

In [129]:
# Create the Naive Bayes classifier
clf = GaussianNB()
# Define the hyperparameter grid for var_smoothing and the class priors
# Note: priors follow the sorted class labels, so [0.1, 0.9] assigns 0.1 to class 0 and 0.9 to class 1
param_grid = {
    'var_smoothing': [1e-9, 1e-8, 1e-7, 1e-6, 1e-5],
    'priors': [None, [0.1, 0.9]]
}
# Set up a cross-validated grid search that selects the combination with the highest accuracy
grid_search = GridSearchCV(clf, param_grid, scoring='accuracy')

# Train the model with different hyperparameter combinations
grid_search.fit(X_train, y_train)

# Access the best model and hyperparameters
best_model = grid_search.best_estimator_
best_params = grid_search.best_params_
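
As a rough illustration of what the grid search above evaluates, this sketch enumerates the same grid with `ParameterGrid`. The variable names `combinations` and `n_folds` are illustrative, and the fold count of 5 assumes the default `cv` of recent scikit-learn versions, since the notebook does not set `cv` explicitly.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the notebook cell above
param_grid = {
    'var_smoothing': [1e-9, 1e-8, 1e-7, 1e-6, 1e-5],
    'priors': [None, [0.1, 0.9]],
}

# Every combination of the listed values is one candidate model
combinations = list(ParameterGrid(param_grid))
n_folds = 5  # assumed default number of cross-validation folds

print(len(combinations), "candidates,", len(combinations) * n_folds, "cross-validation fits")
for params in combinations[:3]:
    print(params)
```

Each candidate is scored by its mean accuracy across the folds, and the best one is then refit on the full training set, which is what `best_estimator_` returns.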

In [130]:
# Predict class labels for the held-out test set with the tuned model
predictions = best_model.predict(X_test)

In [131]:
# Evaluate model performance on the test set; precision and recall are computed for the positive class (label 1 by default)
from sklearn.metrics import accuracy_score, precision_score, recall_score

accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions)
recall = recall_score(y_test, predictions)

In [132]:
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)

Accuracy: 0.8169398907103825
Precision: 0.4032258064516129
Recall: 0.20491803278688525
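
A recall of about 0.20 means that roughly one in five actual positive cases in the test set was detected, even though accuracy is about 0.82. The following sketch shows how the three metrics derive from the confusion matrix. The labels in `y_true` and `y_pred` are hypothetical toy values, not the notebook's predictions.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

# Hypothetical toy labels: 8 negatives and 2 positives
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # share of predicted positives that are correct
recall = tp / (tp + fn)                     # share of actual positives that are found

print("Accuracy:", accuracy, "| sklearn:", accuracy_score(y_true, y_pred))
print("Precision:", precision, "| sklearn:", precision_score(y_true, y_pred))
print("Recall:", recall, "| sklearn:", recall_score(y_true, y_pred))
```

Because the negatives outnumber the positives in this toy example, accuracy stays high while precision and recall for the positive class are much lower.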
