## **Step 0: Data processing and Feature selection**

In [25]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
data="data.csv"
df = pd.read_csv(data)
df.head()

Unnamed: 0,glucose,bloodpressure,diabetes
0,40,85,0
1,40,92,0
2,45,63,1
3,45,80,0
4,40,73,1


In [28]:

features = ["glucose", "bloodpressure"]
# Select all rows of the feature columns and the target column as NumPy arrays.
X = df[features].values
Y = df["diabetes"].values
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.15, random_state=42)

## **Step 1: Compute the prior probabilities**

In [29]:
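# Number of training rows in each class (index = class label).
class_counts = np.bincount(Y_train)
# Prior P(class), estimated as the relative frequency of each class.
priors = class_counts / len(Y_train)
No_of_features = len(features)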

In [30]:
np.unique(Y_train)

array([0, 1], dtype=int64)
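
As a small illustration of the prior computation above, the following sketch applies the same `np.bincount` approach to a made-up label array. The names `y_toy`, `toy_counts`, and `toy_priors` are hypothetical and the values are not taken from `data.csv`.

```python
import numpy as np

# Hypothetical labels, for illustration only (not from data.csv).
y_toy = np.array([0, 0, 1, 0, 1, 1, 0, 0])

# Count how many samples fall in each class, then normalize to get P(class).
toy_counts = np.bincount(y_toy)
toy_priors = toy_counts / len(y_toy)

print(toy_counts)   # counts per class
print(toy_priors)   # relative frequencies, which sum to 1
```

Because the priors are relative frequencies, they always sum to 1, which is a quick sanity check on real data as well.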

## **Step 2: Compute the conditional probabilities**

In [31]:
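conditional_probs = {}
for feature in range(No_of_features):
    for label in np.unique(Y_train):
        # Values of this feature for the training rows that belong to `label`.
        feature_given_label = X_train[Y_train == label, feature]
        # Note: len(feature_given_label) is the number of rows in the class, which
        # equals class_counts[label], so this ratio is always 1.0 and does not
        # depend on the feature values.
        conditional_probs[(feature, label)] = len(feature_given_label) / class_counts[label]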
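
For continuous features such as `glucose` and `bloodpressure`, one common way to model P(feature | class) is a separate Gaussian for each class and feature. The sketch below is not the method used in this notebook. It is an alternative that assumes each feature is roughly normally distributed within a class. The function `fit_gaussian_params` and the toy arrays are hypothetical names introduced for this example.

```python
import numpy as np

def fit_gaussian_params(X, y):
    """Return {(feature_index, label): (mean, std)} estimated per class."""
    params = {}
    for label in np.unique(y):
        rows = X[y == label]              # samples belonging to this class
        for feature in range(X.shape[1]):
            values = rows[:, feature]
            # A small constant keeps std away from zero for constant features.
            params[(feature, int(label))] = (values.mean(), values.std() + 1e-9)
    return params

# Hypothetical toy data: two features, two classes.
X_toy = np.array([[40.0, 85.0], [45.0, 80.0], [50.0, 60.0], [55.0, 65.0]])
y_toy = np.array([0, 0, 1, 1])

toy_params = fit_gaussian_params(X_toy, y_toy)
print(toy_params[(0, 0)])  # (mean, std) of feature 0 within class 0
```

Unlike a ratio of counts, these per-class means and standard deviations depend on the feature values, so they can distinguish between classes.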

## **Step 3: Apply Bayes' theorem**

In [32]:
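def predict_class(input_data):
    """Return the label with the largest (unnormalized) posterior for one sample."""
    labels = np.unique(Y_train)
    posteriors = []
    for label in labels:
        likelihood = 1.0
        # Multiply the per-feature terms (naive independence assumption).
        for feature in range(No_of_features):
            likelihood *= conditional_probs[(feature, label)] ** input_data[feature]
        # Unnormalized posterior: prior times likelihood.
        posterior = priors[label] * likelihood
        posteriors.append(posterior)
    return labels[np.argmax(posteriors)]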
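
The Bayes step above can also be written with Gaussian likelihoods and log probabilities. The following is a minimal sketch under that assumption, not the notebook's implementation. The functions `gaussian_log_pdf` and `predict_gaussian_nb` and all toy parameter values are hypothetical.

```python
import numpy as np

def gaussian_log_pdf(x, mean, std):
    """Log of the normal density N(x; mean, std**2), elementwise."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - (x - mean) ** 2 / (2.0 * std**2)

def predict_gaussian_nb(x, priors, means, stds):
    """Return the index of the class with the largest log posterior.

    priors: shape (n_classes,); means, stds: shape (n_classes, n_features).
    """
    # Summing log densities over features is the naive independence assumption.
    log_likelihood = gaussian_log_pdf(x, means, stds).sum(axis=1)
    log_posterior = np.log(priors) + log_likelihood
    return int(np.argmax(log_posterior))

# Hypothetical parameters for two classes and two features.
toy_priors = np.array([0.5, 0.5])
toy_means = np.array([[42.0, 85.0], [52.0, 62.0]])
toy_stds = np.array([[3.0, 5.0], [3.0, 5.0]])

print(predict_gaussian_nb(np.array([41.0, 83.0]), toy_priors, toy_means, toy_stds))
print(predict_gaussian_nb(np.array([53.0, 60.0]), toy_priors, toy_means, toy_stds))
```

Working in log space replaces the product of many small probabilities with a sum, which avoids numerical underflow when there are many features.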

## **Step 4: Evaluate the performance of the model**

In [33]:
correct = 0
for i in range(len(X_test)):
    input_data = X_test[i]
    true_label = Y_test[i]
    predicted_label = predict_class(input_data)
    if true_label == predicted_label:
        correct += 1
accuracy = correct / len(X_test)
print(f"Accuracy: {accuracy}")

Accuracy: 0.4866666666666667


The accuracy is low, about 0.49, which is close to chance level for a two-class problem; other models may achieve higher accuracy.
However, this approach needs little computational power and can be used where misclassification does not cause much harm, such as classifying mail as spam or not spam.
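
To put an accuracy figure like the one above in context, it can help to compare it with a majority-class baseline, that is, the accuracy obtained by always predicting the most frequent training label. The sketch below uses hypothetical helper names (`accuracy`, `majority_baseline_accuracy`) and made-up label arrays rather than the notebook's data.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def majority_baseline_accuracy(y_train, y_test):
    """Accuracy obtained by always predicting the most frequent training class."""
    majority = np.bincount(y_train).argmax()
    return accuracy(y_test, np.full(len(y_test), majority))

# Hypothetical labels, for illustration only.
y_train_toy = np.array([0, 0, 0, 1, 1])
y_test_toy = np.array([0, 1, 1, 0])
y_pred_toy = np.array([0, 1, 0, 0])

print(accuracy(y_test_toy, y_pred_toy))
print(majority_baseline_accuracy(y_train_toy, y_test_toy))
```

A model whose accuracy is no better than this baseline has not learned anything useful from the features.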