**Problem Statement**


My heart disease prediction model is a logistic regression classifier that predicts whether an individual has a healthy heart or heart disease. From input parameters such as age, resting blood pressure, cholesterol level, and other health indicators, the model estimates the probability of heart disease and assigns a class label. Predictions of this kind can support healthcare professionals in early diagnosis and timely intervention for individuals at risk of, or already affected by, heart disease.
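
For reference, logistic regression models the probability of the positive class as the sigmoid of a linear combination of the input features. The symbols below are illustrative and do not appear elsewhere in this notebook: $x$ is the feature vector, $w$ the learned weight vector, and $b$ the intercept.

```latex
P(y = 1 \mid x) = \sigma(w^\top x + b) = \frac{1}{1 + e^{-(w^\top x + b)}}
```

With scikit-learn's defaults, a binary `LogisticRegression` predicts class 1 when this probability exceeds 0.5.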

In [None]:
# Importing Dependencies
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

In [None]:
# Loading the dataset into a pandas DataFrame
heart_dataset = pd.read_csv('/content/heart_disease_data.csv')

In [None]:
# Checking the first 5 rows of the DataFrame
heart_dataset.head()

,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1


In [None]:
# Checking the number of rows and columns in the DataFrame
heart_dataset.shape

(303, 14)

In [None]:
# Statistical measures
heart_dataset.describe()

,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
count,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0
mean,54.366337,0.683168,0.966997,131.623762,246.264026,0.148515,0.528053,149.646865,0.326733,1.039604,1.39934,0.729373,2.313531,0.544554
std,9.082101,0.466011,1.032052,17.538143,51.830751,0.356198,0.52586,22.905161,0.469794,1.161075,0.616226,1.022606,0.612277,0.498835
min,29.0,0.0,0.0,94.0,126.0,0.0,0.0,71.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,47.5,0.0,0.0,120.0,211.0,0.0,0.0,133.5,0.0,0.0,1.0,0.0,2.0,0.0
50%,55.0,1.0,1.0,130.0,240.0,0.0,1.0,153.0,0.0,0.8,1.0,0.0,2.0,1.0
75%,61.0,1.0,2.0,140.0,274.5,0.0,1.0,166.0,1.0,1.6,2.0,1.0,3.0,1.0
max,77.0,1.0,3.0,200.0,564.0,1.0,2.0,202.0,1.0,6.2,2.0,4.0,3.0,1.0


In [None]:
# Checking whether the DataFrame contains any missing values
heart_dataset.isnull().sum()

age         0
sex         0
cp          0
trestbps    0
chol        0
fbs         0
restecg     0
thalach     0
exang       0
oldpeak     0
slope       0
ca          0
thal        0
target      0
dtype: int64

Separating Labels and Features


In [None]:
# Features: every column except the label
x = heart_dataset.drop(columns='target')
# Label: 1 = heart disease, 0 = healthy heart
y = heart_dataset['target']

Splitting the training and testing data

In [None]:
# 80/20 split, stratified on the label so both sets keep the same class balance
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, stratify=y, random_state=2)
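
To illustrate what `stratify` does in the split above, here is a minimal sketch on synthetic stand-in data. The arrays `X_demo` and `y_demo` are assumptions made up for this example (303 rows, 13 features, 165 positives, matching the dataset's shape and mean `target`), not the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the notebook's data (assumption, not real patient data):
# 303 rows, 13 features, 165 positive labels and 138 negative labels.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(303, 13))
y_demo = np.array([1] * 165 + [0] * 138)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=2
)

# With test_size=0.2 on 303 rows, scikit-learn puts 61 rows in the test set
# and 242 rows in the training set.
print(X_tr.shape, X_te.shape)

# Stratification keeps the share of positives nearly identical in both sets.
print("positive share overall:", y_demo.mean())
print("positive share in train:", y_tr.mean())
print("positive share in test:", y_te.mean())
```

Without `stratify`, a small test set like this one could end up with a noticeably different class balance from the training set purely by chance.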

Model Building and Training

In [None]:
model = LogisticRegression()

In [None]:
# Training the model with training data
model.fit(x_train,y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
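
The warning above suggests two remedies: raising `max_iter` or scaling the features. The sketch below shows one possible way to do both with a pipeline. It runs on synthetic data (`X_demo` and `y_demo` are made up for this example), and it is not the model whose accuracies are reported in this notebook, so its scores would differ.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with the same shape as the heart dataset (assumption).
X_demo, y_demo = make_classification(n_samples=303, n_features=13, random_state=2)
# Inflate the feature scales to loosely mimic columns such as chol or trestbps.
X_demo = X_demo * 50 + 150

# Standardize each feature, then fit logistic regression with a higher iteration cap.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_demo, y_demo)

# n_iter_ reports how many solver iterations were actually needed.
n_iter = scaled_model.named_steps["logisticregression"].n_iter_[0]
print("solver iterations used:", n_iter)
print("training accuracy:", scaled_model.score(X_demo, y_demo))
```

Keeping the scaler inside the pipeline means the same transformation is applied automatically at prediction time, and the scaling statistics are learned only from the data passed to `fit`.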


In [None]:
# Checking the accuracy on training data
training_prediction = model.predict(x_train)
# accuracy_score expects (y_true, y_pred)
training_accuracy = accuracy_score(y_train, training_prediction)

In [None]:
print("Model's accuracy on training data is: ", training_accuracy)

Model's accuracy on training data is:  0.8512396694214877


In [None]:
# Checking the accuracy on testing data
testing_prediction = model.predict(x_test)
# accuracy_score expects (y_true, y_pred)
testing_accuracy = accuracy_score(y_test, testing_prediction)

In [None]:
print("Model's accuracy on testing data is: ", testing_accuracy)

Model's accuracy on testing data is:  0.819672131147541
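
Accuracy alone does not show which kind of error the model makes, and in a screening setting a missed case (false negative) is usually more costly than a false alarm. As a hedged sketch of how that breakdown could be inspected, the example below computes a confusion matrix and a per-class report on synthetic data; `X_demo`, `y_demo`, and `demo_model` are assumptions for illustration, so the numbers it prints say nothing about the real model.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with the same shape as the heart dataset (assumption).
X_demo, y_demo = make_classification(n_samples=303, n_features=13, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=2
)

demo_model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = demo_model.predict(X_te)

# Rows are true classes, columns are predicted classes:
# cm[1, 0] counts positives predicted as negative (false negatives).
cm = confusion_matrix(y_te, y_pred)
print(cm)

# Precision, recall, and F1 for each class.
print(classification_report(y_te, y_pred, target_names=["healthy", "heart disease"]))
```

In this notebook, the same two calls could be applied to `y_test` and `testing_prediction`.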


Making our own prediction system

In [None]:
# Building one sample (the values of the first dataset row) and reshaping it to (1, n_features)

# Named input_data to avoid shadowing the built-in input()
input_data = [63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]
input_array = np.asarray(input_data)
input_array_reshaped = input_array.reshape(1, -1)

In [None]:
# Making prediction
prediction = model.predict(input_array_reshaped)



In [None]:
if prediction[0] == 1:
    print("You have heart disease")
else:
    print("You don't have heart disease")

You have heart disease
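
Since the problem statement talks about the likelihood of heart disease, it can be useful to report a probability rather than only a label. The sketch below shows one way to do that with `predict_proba`, passing the sample as a DataFrame with the training column names so the input matches what the model was fitted on. The training data here is synthetic (`X_demo`, `y_demo`, and `demo_model` are assumptions for illustration), so the printed probability is meaningless clinically.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

feature_names = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
                 "thalach", "exang", "oldpeak", "slope", "ca", "thal"]

# Synthetic stand-in for x_train / y_train: random values, not real patient data.
rng = np.random.default_rng(2)
X_demo = pd.DataFrame(rng.normal(size=(200, len(feature_names))), columns=feature_names)
noise = rng.normal(size=200)
y_demo = (X_demo["age"] + X_demo["chol"] + noise > 0).astype(int)

demo_model = LogisticRegression().fit(X_demo, y_demo)

# Take one row as a plain list, then wrap it in a one-row DataFrame
# that carries the same column names used during training.
sample_values = X_demo.iloc[0].tolist()
sample = pd.DataFrame([sample_values], columns=feature_names)

# predict_proba returns one row per sample: [P(class 0), P(class 1)].
proba = demo_model.predict_proba(sample)
label = demo_model.predict(sample)[0]
print("probability of class 1:", proba[0, 1])
print("predicted label:", label)
```

In this notebook, the equivalent would be to build the one-row DataFrame with `x.columns` and pass it to `model.predict_proba`.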
