
## The Heart Dataset

File name: 'D3_Heart_Dataset.csv'

This dataset has been obtained from Kaggle: https://www.kaggle.com/fedesoriano/heart-failure-prediction

The data contains 918 observations with 12 attributes as described below:
1. Age: patient's age, range: 28 to 77.
2. Sex: patient's gender, M (79%), F (21%).
3. ChestPainType: ASY (54%), NAP (22%), Other (24%).
4. RestingBP: resting blood pressure, range: 0 to 200.
5. Cholesterol: serum cholesterol, range: 0 to 603.
6. FastingBS: fasting blood sugar, 0 or 1.
7. RestingECG: resting electrocardiogram results, Normal (60%), LVH (20%), Other (19%).
8. MaxHR: maximum heart rate achieved, range: 60 to 202.
9. ExerciseAngina: exercise-induced angina, Y (true): 371 (40%), N (false): 547 (60%).
10. OldPeak: ST depression induced by exercise relative to rest, range: -2.6 to 6.2.
11. ST_Slope: slope of the peak exercise ST segment, Up, Flat, or Down.
12. HeartDisease: target, 1 or 0.

The last column indicates the presence of heart disease given the remaining 11 attributes.

This is a binary classification problem.

The dataset contains categorical features that must be encoded before modelling; otherwise it is clean.
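
One way to confirm which attributes are categorical is to inspect the column dtypes. The following is a minimal sketch on a small made-up frame: the rows are illustrative and not taken from the CSV, and only a few of the column names listed above are used.

```python
import pandas as pd

# Toy rows for illustration only; these values are not from D3_Heart_Dataset.csv.
toy = pd.DataFrame({
    "Age": [40, 49, 37, 54],
    "ChestPainType": ["ATA", "NAP", "ATA", "ASY"],
    "RestingECG": ["Normal", "Normal", "ST", "LVH"],
    "MaxHR": [172, 156, 98, 122],
    "HeartDisease": [0, 1, 0, 1],
})

# Non-numeric columns are the ones that need encoding before fitting a model.
categorical_cols = toy.select_dtypes(exclude="number").columns.tolist()

# Count missing values per column; all zeros means nothing needs imputing.
missing_per_column = toy.isna().sum()

print(categorical_cols)
print(int(missing_per_column.sum()))
```

Running the same two checks on the full dataframe should list the five categorical features that are encoded later in this notebook.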

## Loading and Exploring Dataset

In [None]:
from google.colab import files
uploaded = files.upload()




In [None]:
%pip install scikit-learn

In [None]:
!ls


In [None]:
import pandas as pd
# Reading the file into a dataframe
data = pd.read_csv('D3_Heart_Dataset.csv')
# Displaying the read contents
data

In [None]:
# Finding the type of the loaded object
type(data)

In [None]:
# Displaying general info
data.info()

## Separating Features and Target

In [None]:
# separating predictors
X = data.drop("HeartDisease",axis=1)
X

In [None]:
# separating target
Y = data["HeartDisease"]
Y

## Applying Ordinal Encoding on all Five Categorical Features

In [None]:
# Feature: Gender
X['Gender'].unique()

In [None]:
# Encode M as 1 and F as 0
X['Gender'] = X['Gender'].replace({'M': 1, 'F': 0})
X

In [None]:
# Feature: ChestPainType
X['ChestPainType'].unique()

In [None]:
# Encode the four chest pain types as 1 to 4
X['ChestPainType'] = X['ChestPainType'].replace({'ATA': 1, 'NAP': 2, 'ASY': 3, 'TA': 4})
X

In [None]:
# Feature: RestingECG
X['RestingECG'].unique()

In [None]:
# Encode the three ECG results as 1 to 3
X['RestingECG'] = X['RestingECG'].replace({'Normal': 1, 'ST': 2, 'LVH': 3})
X

In [None]:
# Feature: ExerciseAngina
X['ExerciseAngina'].unique()

In [None]:
# Encode Y as 1 and N as 0
X['ExerciseAngina'] = X['ExerciseAngina'].replace({'Y': 1, 'N': 0})
X

In [None]:
# Feature: ST_Slope
X['ST_Slope'].unique()

In [None]:
# Encode the three slope categories as 0 to 2
X['ST_Slope'] = X['ST_Slope'].replace({'Up': 0, 'Flat': 1, 'Down': 2})
X
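
The per-column replacements in this section can also be collected into one lookup table and applied in a loop. The following is a minimal sketch, assuming the same category codes as the cells above; the `encoding_maps` name and the toy rows are illustrative, and only two of the five features are shown.

```python
import pandas as pd

# Toy rows for illustration only; not taken from the CSV.
toy = pd.DataFrame({
    "ChestPainType": ["ATA", "NAP", "ASY", "TA"],
    "ST_Slope": ["Up", "Flat", "Down", "Up"],
})

# Hypothetical lookup table: one dict of category -> code per column,
# using the same codes as the encoding cells in this section.
encoding_maps = {
    "ChestPainType": {"ATA": 1, "NAP": 2, "ASY": 3, "TA": 4},
    "ST_Slope": {"Up": 0, "Flat": 1, "Down": 2},
}

encoded = toy.copy()
for column, mapping in encoding_maps.items():
    # map() turns any category missing from the dict into NaN,
    # so unexpected labels surface instead of passing through silently.
    encoded[column] = encoded[column].map(mapping)

# Fail early if a label was not covered by the mapping.
assert not encoded.isna().any().any()
print(encoded)
```

Keeping the codes in one place makes it easier to reuse the same encoding on new data, at the cost of a slightly less step-by-step notebook.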

## Splitting the Dataset into Train and Test Sets

In [None]:
from sklearn.model_selection import train_test_split

# Hold out 20% of the rows for testing; random_state fixes the shuffle
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.20, random_state=0)
print(X_train.shape)
print(X_test.shape)
print(Y_train.shape)
print(Y_test.shape)
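
With `test_size=0.20`, scikit-learn rounds the test set size up, so 918 rows split into 734 training rows and 184 test rows. The sketch below checks this on placeholder arrays of the same shape, with zeros standing in for the real features. It also shows the optional `stratify` argument, which this notebook does not use, as one way to keep the class balance the same in both sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same shape as the notebook's X and Y:
# 918 rows, 11 features. The values are dummies, not the real dataset.
X_demo = np.zeros((918, 11))
y_demo = np.arange(918) % 2  # alternating 0/1 labels, 459 of each

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=0
)
print(X_tr.shape, X_te.shape)

# Optional variant (an assumption, not used in this notebook): stratify on
# the labels so both splits keep the same class proportions.
_, _, y_tr_s, y_te_s = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=0, stratify=y_demo
)
print(y_tr_s.mean(), y_te_s.mean())
```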


## Creating Gaussian Naive Bayes Model

In [None]:
from sklearn.naive_bayes import GaussianNB

# Creating Gaussian Naive Bayes Object
classifier1 = GaussianNB()

In [None]:
# Training the model
model1 = classifier1.fit(X_train, Y_train)

In [None]:
from sklearn import metrics
from sklearn.metrics import confusion_matrix, classification_report

# Evaluating the model
Y_pred1 = model1.predict(X_test)
print("The accuracy is "+str(metrics.accuracy_score(Y_test,Y_pred1)*100)+"%")
print(confusion_matrix(Y_test, Y_pred1))

In [None]:
target_names = ['class 0', 'class 1']
print(classification_report(Y_test, Y_pred1, target_names=target_names))
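
The precision and recall figures in the classification report follow directly from the confusion matrix. The following is a minimal sketch on made-up labels (not this notebook's predictions) that derives them by hand for class 1.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Made-up labels for illustration; not the notebook's predictions.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 0])

# scikit-learn lays the binary matrix out as [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found

print(tn, fp, fn, tp)
print(accuracy, precision, recall)
```

For a heart disease screen, recall on class 1 is often the figure to watch, since a false negative means a missed case.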

## Creating Gaussian Naive Bayes Model with Prior Probabilities of Classes

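By default, GaussianNB estimates the class priors from the class frequencies in the training set. The `priors` parameter overrides that estimate with fixed probabilities, one per class in sorted class order, which must sum to 1. This is typically useful when the training set under-represents one target class, or when its class balance does not match the balance expected in practice. The values need not be equal: the example here assigns a prior of 0.25 to class 0 and 0.75 to class 1.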
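
To see what the parameter changes, the sketch below fits two models on a tiny made-up one-feature dataset, one with default priors and one with fixed priors. The data and variable names are illustrative only.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy one-feature data for illustration: six samples of class 0, two of class 1.
X_toy = np.array([[1.0], [2.0], [3.0], [2.0], [1.5], [2.5], [3.0], [4.0]])
y_toy = np.array([0, 0, 0, 0, 0, 0, 1, 1])

# Default: priors are the observed class frequencies (0.75 and 0.25 here).
default_model = GaussianNB().fit(X_toy, y_toy)

# Fixed priors: the stored priors are exactly what was passed in.
custom_model = GaussianNB(priors=[0.25, 0.75]).fit(X_toy, y_toy)

print(default_model.class_prior_)
print(custom_model.class_prior_)

# The per-class Gaussians are identical in both models, so a larger prior
# for class 1 can only raise the posterior probability of class 1.
x_new = np.array([[3.0]])
proba_default = default_model.predict_proba(x_new)
proba_custom = custom_model.predict_proba(x_new)
print(proba_default, proba_custom)
```

A point that sits between the two class means can therefore change its predicted label when the priors change, even though the fitted means and variances are the same.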

In [None]:
# Creating Gaussian Naive Bayes Object with fixed class priors:
# P(class 0) = 0.25, P(class 1) = 0.75
classifier2 = GaussianNB(priors=[0.25, 0.75])
# Training the model
model2 = classifier2.fit(X_train, Y_train)

# Evaluating the model
Y_pred2 = model2.predict(X_test)
print("The accuracy is "+str(metrics.accuracy_score(Y_test,Y_pred2)*100)+"%")
print(confusion_matrix(Y_test, Y_pred2))

## Creating Multinomial Naive Bayes Model

In [None]:
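from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MinMaxScaler

# MultinomialNB rejects negative feature values, and OldPeak goes down to -2.6,
# so each feature is rescaled to [0, 1] first (one possible workaround).
# The scaler is fit on the training set only; clip=True keeps test values in range.
scaler = MinMaxScaler(clip=True)
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Creating Multinomial Naive Bayes Object
classifier3 = MultinomialNB()

# Training the model
model3 = classifier3.fit(X_train_scaled, Y_train)

# Evaluating the model
Y_pred3 = model3.predict(X_test_scaled)
print("The accuracy is "+str(metrics.accuracy_score(Y_test, Y_pred3)*100)+"%")
print(confusion_matrix(Y_test, Y_pred3))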