# PyCaret

https://pycaret.org/

There is low code for everything nowadays, so why not for AI? PyCaret can help you with:

* Exploratory Data Analysis
* Data Preprocessing
* Model Training
* Model Explainability
* MLOps

So let's give it a test run. A [quickstart](https://pycaret.gitbook.io/docs/get-started/quickstart) sounds nice. The following is a copy of the code; for the explanations you'll need to visit the website.

In [None]:
# load sample dataset
from pycaret.datasets import get_data
data = get_data('diabetes')

In [None]:
from pycaret.classification import *
s = setup(data, target = 'Class variable', session_id = 123)

In [None]:
# from pycaret.classification import ClassificationExperiment
# s = ClassificationExperiment()
# s.setup(data, target = 'Class variable', session_id = 123)

# --> we'll be working with the functional API, not the OOP API

In [None]:
# functional API
best = compare_models()

Just noting: do you see how we just trained a dozen models in less than a minute without using a GPU? Try doing that with deep learning!

In [None]:
print(best)

Do you see all the different metrics that are measured?

In [None]:
# functional API
evaluate_model(best)

In [None]:
# functional API
predict_model(best)

In [None]:
# functional API
predictions = predict_model(best, data=data)
predictions.head()


In [None]:
# functional API
save_model(best, '../exports/my_best_pipeline')

In [None]:
# functional API
loaded_model = load_model('../exports/my_best_pipeline')
print(loaded_model)

# The actual exercise

There was not much exercise in the part before this. We also didn't complete the quickstart, but more copy-pasting would not have helped us any further.

What would help us (and the world) much more is to solve heart failure. Or at least to help predict it. We'll be using a [Kaggle](https://www.kaggle.com/datasets/johnsmith88/heart-disease-dataset) dataset.

## Step 1: import dependencies

You need pandas and pycaret. Import them.

In [None]:
# Up to you!



## Step 2: Download and import data

Download the data from the link above and import it as a pandas DataFrame. It's also stored in the files folder.

In [None]:
# Up to you!



Count the number of distinct values per column. PyCaret likes to know which columns are categorical, and columns with only a few distinct values are likely to be categorical.
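As a hint, here is a minimal sketch of the idea on a made-up toy DataFrame (not the heart disease data). The column values and the cut-off of 4 distinct values are assumptions chosen purely for illustration; pick your own threshold for the real dataset.

```python
import pandas as pd

# Toy data, made up for illustration; not the heart disease dataset.
toy = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 44],
    "sex": [1, 1, 0, 1, 0, 1],
    "chol": [233, 250, 204, 236, 354, 263],
    "cp": [3, 2, 1, 1, 0, 2],
})

# Number of distinct values in each column.
distinct_counts = toy.nunique()
print(distinct_counts)

# Hypothetical cut-off: columns with at most 4 distinct values are candidates.
max_distinct = 4
candidates = distinct_counts[distinct_counts <= max_distinct].index.tolist()
print(candidates)
```

A low count only makes a column a candidate. Whether it really is categorical still depends on what the values mean, so check the dataset description as well.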

In [None]:
# Up to you!



Good candidates are 'sex', 'cp', 'fbs', 'restecg', 'exang', 'slope', 'ca' and 'thal'. Target is our target column which has only two values (0 or 1), so no need to include that in the categoricals.

When looking at the description of the dataset (link above, but also [here](https://www.kaggle.com/datasets/johnsmith88/heart-disease-dataset)) we note that we were right on most counts:

1) age: not categorical
1) sex: categorical
1) chest pain type (4 values): categorical
1) resting blood pressure: not categorical
1) serum cholesterol in mg/dl: not categorical
1) fasting blood sugar > 120 mg/dl: categorical (True/False)
1) resting electrocardiographic results (values 0,1,2): categorical
1) maximum heart rate achieved: not categorical
1) exercise induced angina: categorical (True/False)
1) oldpeak = ST depression induced by exercise relative to rest: not categorical
1) the slope of the peak exercise ST segment: **not** categorical
1) number of major vessels (0-3) colored by fluoroscopy: **not** categorical
1) thal: 0 = normal; 1 = fixed defect; 2 = reversible defect: categorical

In [None]:
cat_features = ['sex', 'cp', 'fbs', 'restecg', 'exang', 'thal']

## Step 3: Train and evaluate model

Set up an experiment first. Make sure to pass the list of categorical features.

In [None]:
# Up to you!



Now that the experiment is set up, we can use it to compare the different models. Save the best result in a variable!

In [None]:
# Up to you!



## Step 4: Test model

Now that you have tested a lot of models, test the best model. Use only the last five rows of the data to test on.
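If selecting the last rows of a DataFrame is new to you, the sketch below shows one way to do it with `tail` on a made-up toy DataFrame. The column names and values are assumptions for illustration only.

```python
import pandas as pd

# Toy data, made up for illustration; ten rows with a default 0..9 index.
toy = pd.DataFrame({
    "age": list(range(40, 50)),
    "target": [0, 1] * 5,
})

# Keep only the last five rows; the original index labels are preserved.
last_five = toy.tail(5)
print(last_five)
```

The resulting frame can be passed on like any other DataFrame, for example as the `data` argument in the `predict_model` call shown in the quickstart.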

In [None]:
# Up to you!



## Step 5: Save the model

Save it in a pickle file.
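For background, a pickle file is simply a Python object serialized to disk. The sketch below shows the round trip with the standard library on a stand-in dictionary, not on a real model; the object and the file name are made up. For the exercise itself, the `save_model` call from the quickstart handles this for the whole pipeline and, as far as we know, writes a `.pkl` file for you.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model; any picklable Python object works the same way.
model_stand_in = {"name": "toy_model", "threshold": 0.5}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "toy_model.pkl"

    # Serialize the object to disk.
    with path.open("wb") as f:
        pickle.dump(model_stand_in, f)

    # Read it back into a new object.
    with path.open("rb") as f:
        restored = pickle.load(f)

print(restored == model_stand_in)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.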

In [None]:
# Up to you!



And you may feel bad for your teacher having to look all this up, but [don't](https://youtu.be/sL-4rWuEiVw?si=wr5YAFCrg1LlSkcP).