# Simple Prediction

This is a template notebook for simple prediction modelling.

Author: {{ cookiecutter.author_name }}
Created: {{ cookiecutter.timestamp }}

## How to use the notebook

The following cells:
- specify data filepath, target variable name, and target variable type,
- read dataset,
- set up the lazypredict and autokeras models,
- present results from the models,
- provide the model with the best performance.

By default, the notebook is set up to run with an example (sklearn breast cancer). To see how it works, run the notebook without changing the code.

For your project, edit the User Inputs cell (data filepath, target variable name, target variable type, and save locations), then execute all cells in order.

## Imports

In [0]:
!pip install lazypredict

In [0]:
!pip install autokeras

In [0]:
import os
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split

from functions.simple_predict import SimplePredict, get_example_data

## User Inputs

In [0]:
filepath = "default classification" # Filepath of data csv, or "default classification" / "default regression" for the example data
target = "target" # Name of target column
continuous_target = False # True if continuous, False if categorical
lazypredict_save_filepath = "pipeline.joblib" # Filepath to save final lazypredict pipeline artefact
autokeras_save_name = "autokeras" # Name to save final autokeras model
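
For a regression project, the inputs might look like the following sketch. The path `data/house_prices.csv`, the column name `price`, and the two save names are hypothetical placeholders, not files shipped with this template:

```python
# Hypothetical inputs for a regression project (all values are placeholders)
filepath = "data/house_prices.csv"  # CSV file containing features and target
target = "price"                    # Name of the column to predict
continuous_target = True            # Regression, so the target is continuous
lazypredict_save_filepath = "price_pipeline.joblib"  # Saved under artifacts/
autokeras_save_name = "price_autokeras"              # Saved under artifacts/
```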

End of User Inputs

In [0]:
# Load the bundled example data only when one of the example keywords is given
if filepath in ("default regression", "default classification"):
    filepath = get_example_data(filepath)
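
A note on checking `filepath` against the example keywords: Python parses `a == x or y` as `(a == x) or y`, and a non-empty string is always truthy, so a membership test is the safer way to compare against several keywords. A minimal sketch (the helper name `is_example_keyword` is made up for illustration and is not part of `functions.simple_predict`):

```python
EXAMPLE_KEYWORDS = ("default regression", "default classification")


def is_example_keyword(filepath: str) -> bool:
    """Return True only when filepath is one of the example keywords."""
    return filepath in EXAMPLE_KEYWORDS


# The chained form is truthy for any input, because the right-hand operand
# of `or` is a non-empty string.
chained = ("data/my.csv" == "default regression") or "default classification"

print(bool(chained))                             # True, even for a real path
print(is_example_keyword("data/my.csv"))         # False
print(is_example_keyword("default regression"))  # True
```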

Create a SimplePredict instance

In [0]:
simple = SimplePredict(filepath, target, continuous_target)

Import Data

In [0]:
X, y = simple.read_data()

In [0]:
X

In [0]:
y

Split into training and testing data

In [0]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
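
Without a seed, `train_test_split` draws a different random partition on every run, so metrics can shift between executions. A minimal sketch of a reproducible, class-balanced split, assuming the sklearn breast cancer data that the default example uses; `random_state=42` is an arbitrary seed chosen for illustration, and `stratify` only makes sense for a categorical target:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load the example data as a DataFrame (features) and a Series (target)
X_demo, y_demo = load_breast_cancer(return_X_y=True, as_frame=True)

# Fixing random_state makes the split repeatable; stratify keeps the class
# proportions of y_demo roughly equal in the train and test parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print(X_tr.shape, X_te.shape)
```

For a continuous target, drop the `stratify` argument and keep only `random_state`.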

### Lazypredict

Get predictor

In [0]:
lazy = simple.get_lazy()

Fit predictor models and get metrics

In [0]:
# Some models may fail to execute depending on the data
metrics, predictions = lazy.fit(X_train, X_test, y_train, y_test)

### Autokeras
Trains a neural network (might take some time)

In [0]:
auto = simple.get_autokeras()

In [0]:
epochs = 10
auto.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs)

### Model metrics

Lazy Predict

In [0]:
metrics

In [0]:
top = 20

simple.plot_top_metrics(top, metrics)

In [0]:
simple.model_scores("Best", metrics)

In [0]:
simple.model_prediction("Best", metrics.index[0], predictions, y_test)

Autokeras

In [0]:
autokeras_predicted_y = auto.predict(X_test)
autokeras_predicted_y

In [0]:
simple.autokeras_scores(auto, X_test, y_test)

### Get pipeline/model artefact

In [0]:
# Get pipeline for specified model (Default best model)
# Can call predict() on pipeline object to get prediction from raw X data
pipeline = simple.get_model_pipeline("Best", metrics, lazy.models)
pipeline

Create artifacts folder if it does not exist

In [0]:
# exist_ok=True makes this a no-op when the folder is already there
os.makedirs('./artifacts', exist_ok=True)
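
The same directory setup can also be written with `pathlib`, which gives a tidy way to build the save paths as well. A minimal sketch; it assumes the default values of `lazypredict_save_filepath` and `autokeras_save_name` from the User Inputs cell, and it writes into a temporary directory so it does not touch the project folder:

```python
import tempfile
from pathlib import Path

lazypredict_save_filepath = "pipeline.joblib"  # default from User Inputs
autokeras_save_name = "autokeras"              # default from User Inputs

# A temporary base directory stands in for the notebook's working directory.
base_dir = Path(tempfile.mkdtemp())
artifacts_dir = base_dir / "artifacts"

# exist_ok=True makes the call safe to repeat, so it is run twice here.
artifacts_dir.mkdir(parents=True, exist_ok=True)
artifacts_dir.mkdir(parents=True, exist_ok=True)

# The / operator joins path components without manual string concatenation.
pipeline_path = artifacts_dir / lazypredict_save_filepath
model_path = artifacts_dir / autokeras_save_name

print(pipeline_path.name, model_path.name)
```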

Save pipeline object as joblib artefact

In [0]:
simple.save_pipeline(pipeline, 'artifacts/' + lazypredict_save_filepath)
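
To reuse a saved pipeline later, it can be loaded back and asked for predictions. A minimal sketch, assuming `save_pipeline` writes the object with `joblib.dump` (the `.joblib` default filename suggests this, but the helper's internals are not shown here); the small scaler plus logistic regression pipeline is a stand-in for whichever model lazypredict ranks best:

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_demo, y_demo = load_breast_cancer(return_X_y=True)

# Stand-in pipeline (assumption): not the model chosen by the notebook
stand_in = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
stand_in.fit(X_demo, y_demo)

# Save to a temporary location, then load it back as a fresh object
path = Path(tempfile.mkdtemp()) / "pipeline.joblib"
joblib.dump(stand_in, path)
reloaded = joblib.load(path)

# The reloaded pipeline should reproduce the original predictions exactly
same = np.array_equal(stand_in.predict(X_demo), reloaded.predict(X_demo))
print(same)
```

Only load joblib files from sources you trust, since loading them unpickles arbitrary Python objects.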

Save autokeras model as a keras model

In [0]:
model = auto.export_model()
model.save('artifacts/' + autokeras_save_name)