# EazyML Explainable AI Template

## Define Imports

In [None]:
!pip install --upgrade eazyml-xai
!pip install --upgrade eazyml-automl
!pip install gdown python-dotenv

In [None]:
import os
from eazyml_xai import (
    ez_init,
    ez_explain
)

from eazyml import (
    ez_display_df,
    ez_build_model
)

import gdown
import pandas as pd

from dotenv import load_dotenv
load_dotenv()

## 1. Initialize EazyML

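`ez_init` authenticates with the access key passed to it, which this notebook reads from the `EAZYML_ACCESS_KEY` environment variable (loaded from a local `.env` file by `load_dotenv()` above). If the variable is not set, `os.getenv` returns `None` and EazyML defaults to a trial license.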

In [None]:
ez_init(access_key=os.getenv('EAZYML_ACCESS_KEY'))
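
Because a missing key falls back to the trial license without raising an error, it can be useful to confirm which mode the notebook will run in before calling `ez_init`. The sketch below only inspects the environment and never calls EazyML; the helper name `license_mode` is hypothetical and not part of the EazyML API.

```python
import os

def license_mode(env=os.environ, var_name="EAZYML_ACCESS_KEY"):
    """Return 'access key' if the variable holds a non-empty value, else 'trial'.

    Hypothetical helper: it reads the given mapping and does nothing else.
    """
    value = env.get(var_name, "")
    return "access key" if value.strip() else "trial"

# Report which mode the current environment would select.
print(f"License mode: {license_mode()}")
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.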

## 2. Define Dataset Files and Outcome Variable

In [None]:
gdown.download_folder(id='1EobxYR3pg_Z3Sd4sETfe4aJLAsT98fL2')

In [None]:
# Names of the files that will be used by EazyML APIs
train_file_path = os.path.join('data', 'Heart_Attack_traindata.csv')
test_file_path  = os.path.join('data', 'Heart_Attack_testdata.csv')

# The column name for outcome of interest
outcome = 'class'
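
If the folder download did not complete (for example because of a network problem), the later `pd.read_csv` calls fail with a less obvious error. A quick existence check on the two paths may help. This is a minimal sketch: `missing_files` is a hypothetical helper, and the paths simply mirror the ones defined in this notebook.

```python
import os

def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]

# Paths mirror the ones defined in this notebook.
expected = [
    os.path.join("data", "Heart_Attack_traindata.csv"),
    os.path.join("data", "Heart_Attack_testdata.csv"),
]

absent = missing_files(expected)
if absent:
    print("Missing dataset files:", absent)
else:
    print("Both dataset files are present.")
```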

## 3. Dataset Information

The dataset used in this notebook is the **Heart Attack Dataset**. Each row describes one patient, with features such as age, gender, blood pressure, and blood-test measurements, and the task is to predict the likelihood of a heart attack.

### Columns in the Dataset:
- **age**: The age of the patient, measured in years.
- **gender**: The gender of the patient, represented as a categorical variable (e.g., 1 = male, 0 = female).
- **impulse**: Refers to the patient's pulse rate, measured in beats per minute (bpm).
- **pressurehight**: Refers to systolic blood pressure, the higher number in a blood pressure reading (e.g., 120/80 mmHg).
- **pressurelow**: Refers to diastolic blood pressure, the lower number in a blood pressure reading (e.g., 120/80 mmHg).
- **glucose**: The patient's blood glucose (blood sugar) level.
- **kcm**: Most likely CK-MB (creatine kinase-MB), a cardiac enzyme whose blood level rises when heart muscle is damaged.
- **troponin**: A protein found in the heart muscle, measured to assess heart damage (especially after a heart attack).
- **class**: The target variable, indicating the presence or absence of a condition or disease (e.g., 1 = heart attack, 0 = no heart attack).
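
The column list above can double as a lightweight schema check. The sketch below compares a collection of column names against the nine names listed here; `missing_columns` and `EXPECTED_COLUMNS` are hypothetical names, and the spellings are copied from this list, so they may need adjusting if the CSV header differs.

```python
# Column names as listed in this notebook (assumed to match the CSV header).
EXPECTED_COLUMNS = [
    "age", "gender", "impulse", "pressurehight", "pressurelow",
    "glucose", "kcm", "troponin", "class",
]

def missing_columns(columns, expected=EXPECTED_COLUMNS):
    """Return expected column names that are absent from `columns`, in order."""
    present = set(columns)
    return [name for name in expected if name not in present]

# Example with a header that lacks the outcome column.
print(missing_columns(["age", "gender", "impulse", "pressurehight",
                       "pressurelow", "glucose", "kcm", "troponin"]))
# -> ['class']
```

In the notebook itself, the helper could be called as `missing_columns(train.columns)` once the training CSV has been loaded.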

### 3.1 Display the Dataset

Below is a preview of the dataset:

In [None]:
# Load the dataset from the provided file
train = pd.read_csv(train_file_path)

# Display the first few rows of the dataset
ez_display_df(train.head())

## 4. EazyML Predictive Models

### 4.1 Reading the Datasets and Dropping Unnecessary Columns

In [None]:
# Columns to exclude from modeling (none for this dataset)
discard_columns = []

# Reading Training Data
train = pd.read_csv(train_file_path)
train = train.drop(columns=discard_columns)

# Reading Test Data
test = pd.read_csv(test_file_path)
test = test.drop(columns=discard_columns)

### 4.2 Model Training: Several Models Trained

In [None]:
# Build predictive models on the training data
options = {'model_type': 'predictive'}
resp = ez_build_model(train, outcome=outcome, options=options)

### 4.3 Show Model Performance

In [None]:
ez_display_df(resp['model_performance'])

## 5. Get Explanations

### 5.1 Use model_info from ez_build_model

In [None]:
# Model metadata returned by ez_build_model; ez_explain needs it below
model_info = resp["model_info"]
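
The cells in this notebook index `resp` directly, so a failed build would surface as a bare `KeyError`. A small guard can make that failure easier to read. This is a sketch under stated assumptions: `require_keys` is a hypothetical helper, and the only keys it relies on are the two this notebook already reads (`model_performance` and `model_info`).

```python
def require_keys(response, keys):
    """Raise a descriptive KeyError if any of `keys` is missing from `response`."""
    missing = [k for k in keys if k not in response]
    if missing:
        raise KeyError(
            f"Response is missing {missing}; available keys: {sorted(response)}"
        )
    return response

# Stand-in dictionary for illustration only; a real `resp` comes from ez_build_model.
demo_resp = {"model_performance": [], "model_info": {}}
require_keys(demo_resp, ["model_performance", "model_info"])
```

Because the helper returns the response unchanged, it can wrap the call site, for example `model_info = require_keys(resp, ["model_info"])["model_info"]`.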

### 5.2 Get Explanations for 5 Records

In [None]:
options = {'record_number': [1, 6, 7, 8, 9]}
response = ez_explain(train_file_path, outcome, test_file_path, model_info, options=options)

### 5.3 Display Explanation DataFrame

In [None]:
# Each explanation is a dict, so the list converts to one row per explained record
ex_df = pd.DataFrame(response['explanations'])
ez_display_df(ex_df)