# heart attack predictor.py

---

**Support Vector Machine (SVM) Model**

Support Vector Machine (SVM) is a classification algorithm that works by finding the decision boundary (hyperplane) that separates data points of different classes with the widest possible margin. It’s particularly effective for datasets with clear margins between classes and can be tuned with hyperparameters such as the kernel type, the regularization parameter (C), and gamma for better performance. In this project, SVM is used to classify whether a patient is at high or low risk of a heart attack.
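
As a rough illustration of the idea, the sketch below fits a linear SVM to four made-up 2D points (not the project's data) and prints the support vectors that define the boundary. The toy points and variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (made up for illustration): two well-separated groups
X_toy = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y_toy = np.array([0, 0, 1, 1])

clf = SVC(kernel='linear', C=1.0)
clf.fit(X_toy, y_toy)

# The support vectors are the training points closest to the boundary
print("Support vectors:", clf.support_vectors_)
# For a linear kernel the hyperplane is w . x + b = 0
print("w =", clf.coef_[0], "b =", clf.intercept_[0])

# Classify two new points, one near each group
toy_pred = clf.predict(np.array([[0.5, 0.5], [3.5, 3.5]]))
print("Predictions:", toy_pred)
```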

---
**Importance of Heart Attack Risk Prediction**

Heart attacks are one of the leading causes of death globally, making their early detection crucial. This project uses machine learning to predict heart attack risks based on patient data, such as age, blood pressure, and cholesterol levels. Such a system can provide valuable insights to healthcare providers, enabling timely medical intervention and improving patient outcomes.


---

**Import Necessary Libraries**


This part loads essential libraries like NumPy and Pandas for data handling, Scikit-learn for preprocessing, model building, and evaluation, and Joblib for saving the trained model. These tools form the foundation for building and evaluating the SVM model.

 ---

In [None]:
### Heart Attack Risk Predictor Jupyter Notebook

# Import Necessary Libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.pipeline import Pipeline
import joblib

I uploaded my dataset to Google Drive and gave the notebook access to it. Alternatively, you can download the dataset to your computer and open it directly with code like the commented-out snippet below.

In [None]:
#with open('/content/drive/MyDrive/Colab Notebooks/your_file.py', 'r') as f:
 #file_content = f.read()

In [None]:
# Access Google Drive
from google.colab import drive
drive.mount('/content/drive')


This step mounts my Google Drive in the Colab environment, so datasets stored in the cloud can be read directly from the notebook. This is useful for large datasets or when collaborating with others.

---

Don’t forget to replace the folder name and file path inside the quotes with your own.

In [None]:
#locating the dataset file
import pandas as pd
df = pd.read_csv('/content/drive/MyDrive/my doc/your file name.csv')
df.head()

**Loading the Dataset**
---
The dataset is loaded using Pandas into a DataFrame. A preview of the data (df.head()) ensures that it has been correctly loaded, providing insight into its structure and column names.

In [None]:
# Load the dataset from Google Drive (adjust the path to where your CSV file is stored)
data = pd.read_csv('/content/drive/MyDrive/my doc/heart orginal.csv')

**Splitting the Dataset**
---
The data is split into features (X) and target labels (y), followed by a division into training and testing sets. This step ensures that the model can learn patterns from one part of the data (training) and be evaluated on unseen data (testing).

In [None]:
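# Assumption: the target label is the last column of the CSV; adjust if your file differs
X, y = data.iloc[:, :-1], data.iloc[:, -1]
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)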
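
If the two risk classes are imbalanced, a plain random split can leave the test set with a different class ratio than the training set. One common option is the `stratify` argument of `train_test_split`. The sketch below uses synthetic labels (70 low-risk, 30 high-risk, made up for illustration) rather than the project's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 samples, 30% positive class (assumption)
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([0] * 70 + [1] * 30)

# stratify=y_demo keeps the class ratio the same in both subsets
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("Positive share in train:", yd_train.mean())
print("Positive share in test:", yd_test.mean())
```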


**Building the SVM Pipeline**
---
A pipeline is created that standardizes the features using StandardScaler and applies the SVM model. The pipeline simplifies the workflow by chaining preprocessing and modeling steps.



In [None]:
# Create a pipeline with preprocessing and model
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', SVC(random_state=42))
    ])
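
One practical benefit of the pipeline is that the scaler is fitted only on the data passed to `fit`, so test data never influences the scaling statistics. The sketch below checks this on random stand-in data (the arrays and the name `demo_pipeline` are assumptions, not the heart dataset).

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Random stand-in features and labels (for illustration only)
X_fit = rng.normal(loc=50.0, scale=10.0, size=(40, 3))
y_fit = np.array([0, 1] * 20)

demo_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svm', SVC(random_state=42)),
])
demo_pipeline.fit(X_fit, y_fit)

# The fitted scaler stores the mean of the data it was fitted on
scaler_mean = demo_pipeline.named_steps['scaler'].mean_
print("Scaler mean:  ", scaler_mean)
print("Training mean:", X_fit.mean(axis=0))
```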

**Defining Hyperparameters for Grid Search**
---
Grid search is used to test different hyperparameter combinations for the SVM model. This helps find the best-performing configuration within the grid by tuning parameters like the regularization strength (C), kernel type, and gamma.

In [None]:
# Define hyperparameters for grid search
param_grid = {
    'svm__C': [0.1, 1, 10, 100],
    'svm__kernel': ['rbf', 'linear'],
    'svm__gamma': ['scale', 'auto', 0.1, 0.01, 0.001]
}
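
To get a feel for how much work this grid implies, the combinations can be enumerated with the standard library. The sketch below assumes 5-fold cross-validation, and it also counts how many configurations are actually distinct, since `gamma` has no effect when the kernel is `'linear'`.

```python
from itertools import product

param_grid = {
    'svm__C': [0.1, 1, 10, 100],
    'svm__kernel': ['rbf', 'linear'],
    'svm__gamma': ['scale', 'auto', 0.1, 0.01, 0.001],
}

# Every combination the grid search will try
combos = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]
n_fits = len(combos) * 5  # assuming cv=5

# gamma is ignored by the linear kernel, so collapse it for those rows
distinct = {
    (c['svm__C'], c['svm__kernel'],
     c['svm__gamma'] if c['svm__kernel'] == 'rbf' else None)
    for c in combos
}

print("Combinations:", len(combos))
print("Model fits with 5-fold CV:", n_fits)
print("Distinct configurations:", len(distinct))
```

If training time matters, one option is to pass `GridSearchCV` a list of two dictionaries, one per kernel, so that `gamma` is only varied for the RBF kernel.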

**Training the Model with Grid Search**
---
Grid search fits the pipeline to the training data once for every hyperparameter combination and cross-validation fold, scores each combination by its mean cross-validated accuracy, and then refits the best configuration on the full training set.

In [None]:
# Perform grid search with cross-validation
grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='accuracy', n_jobs=-1)
grid_search.fit(X_train, y_train)

In [None]:
# Get the best model
best_model = grid_search.best_estimator_


In [None]:
# Make predictions on the test set
y_pred = best_model.predict(X_test)
# Print the best parameters and score
print("Best parameters:", grid_search.best_params_)
print("Best cross-validation score:", grid_search.best_score_)
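
Beyond the single best score, `GridSearchCV` keeps the results for every combination in its `cv_results_` attribute. A minimal sketch, using a synthetic dataset from `make_classification` and a smaller grid as stand-ins for the real data and grid:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: 13 features, like the input form)
X_syn, y_syn = make_classification(n_samples=200, n_features=13, random_state=42)

demo_search = GridSearchCV(
    Pipeline([('scaler', StandardScaler()), ('svm', SVC(random_state=42))]),
    {'svm__C': [0.1, 1, 10], 'svm__kernel': ['rbf', 'linear']},
    cv=5,
    scoring='accuracy',
)
demo_search.fit(X_syn, y_syn)

# One row per hyperparameter combination, best first
results = pd.DataFrame(demo_search.cv_results_)
columns = ['param_svm__C', 'param_svm__kernel',
           'mean_test_score', 'std_test_score', 'rank_test_score']
print(results[columns].sort_values('rank_test_score'))
```

Looking at `std_test_score` next to `mean_test_score` gives a sense of whether the top few configurations are meaningfully different or within noise of each other.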


In [None]:
# Print the classification report and confusion matrix
print("\nClassification Report:")
print(classification_report(y_test, y_pred))
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
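
For a risk predictor, the individual cells of the confusion matrix matter more than overall accuracy, because a missed high-risk patient (false negative) is usually the costlier error. The sketch below shows how the four cells can be unpacked and turned into sensitivity and specificity. The labels are made up for illustration, with 1 meaning high risk.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration (1 = high risk, 0 = low risk)
y_true_demo = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred_demo = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

# For binary labels, rows are true classes and columns are predicted classes,
# so ravel() yields the cells in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

sensitivity = tp / (tp + fn)  # share of high-risk patients that were caught
specificity = tn / (tn + fp)  # share of low-risk patients correctly cleared
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"Sensitivity={sensitivity:.2f} Specificity={specificity:.2f}")
```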


In [None]:
# Function to predict heart attack risk for new data
def predict_heart_attack_risk(new_data):
    # new_data must contain the same feature columns, in the same order, as X_train
    return best_model.predict(new_data)


**Saving the Trained Model**
---
The model is saved using Joblib for reuse. This ensures the model does not need to be retrained every time and can be easily loaded for predictions.



In [None]:
# Save the model using joblib
joblib.dump(best_model, 'heart_attack_model.pkl')  # Save with .pkl extension
# Load the trained model (make sure you've run the training code and saved the model first)
model = joblib.load('heart_attack_model.pkl')
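
A quick way to gain confidence in the saved file is to check that a reloaded model gives the same predictions as the original. The sketch below does this with a tiny stand-in model and a temporary file. The data, the model, and the file name `demo_model.pkl` are assumptions for illustration, not the tuned pipeline from this notebook.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.svm import SVC

# Tiny stand-in model (not the tuned pipeline from this notebook)
X_small = np.array([[0.0], [1.0], [2.0], [3.0]])
y_small = np.array([0, 0, 1, 1])
small_model = SVC(kernel='linear').fit(X_small, y_small)

# Save to a temporary directory and load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo_model.pkl')
    joblib.dump(small_model, path)
    restored = joblib.load(path)

# The restored model should give the same predictions as the original
same = np.array_equal(small_model.predict(X_small), restored.predict(X_small))
print("Predictions match after reload:", same)
```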

**Creating Input Fields**
---
Interactive input fields are created using widgets, allowing users to enter new patient data directly into the notebook. This feature makes the project accessible to non-technical users.



In [None]:
# Create input fields
fields = [
    "Age", "Sex (0:Female, 1:Male)", "Chest Pain Type (0-3)",
    "Resting Blood Pressure (mm Hg)", "Alcohol Drink (0:No, 1:Yes)",
    "Fasting Blood Sugar (0:No, 1:Yes)", "Resting ECG Results (0-2)",
    "Max Heart Rate", "Exercise Induced Angina (0:No, 1:Yes)",
    "Previous Peak", "Slope (0-2)", "Number of Major Vessels (0-3)",
    "Thalassemia (0-3)"
]

**Making Predictions**
---
The final step involves using the saved model to predict the heart attack risk based on the user’s inputs. The results are displayed interactively, indicating whether the patient is at high or low risk.



In [None]:
# Imports needed for the interactive form
import ipywidgets as widgets
from IPython.display import display, HTML

# Create input widgets
inputs = {field: widgets.FloatText(description=field) for field in fields}

# Create a button widget
button = widgets.Button(description="Predict")
output = widgets.Output()

# Display widgets
form = widgets.VBox(list(inputs.values()) + [button, output])
display(form)


Finally, I built a mini app that collects the values from the user, passes them to the SVM model, and displays the result.

In [None]:
# Function to make prediction
def predict(b):
    with output:
        output.clear_output()
        try:
            input_data = np.array([[inputs[field].value for field in fields]])
            prediction = model.predict(input_data)
            result = "High Risk" if prediction[0] == 1 else "Low Risk"
            display(HTML(f"<h3>Heart Attack Risk: {result}</h3>"))
        except ValueError:
            display(HTML("<h3 style='color: red;'>Error: Please enter valid numeric values for all fields.</h3>"))

# Attach the function to the button
button.on_click(predict)