# Direct Marketing with Amazon SageMaker Autopilot

This notebook works well with the `Python 3 (Data Science)` kernel on SageMaker Studio.

---


## Contents

1. [Introduction](#Introduction)
1. [Setup](#Setup)
1. [AutoPilot Experiment using SageMaker Studio UI](#AutoPilot-Experiment-using-SageMaker-Studio-UI)

## Introduction

[Amazon SageMaker Autopilot](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-create-experiment.html) provides a UI in SageMaker Studio that makes it easier to run Autopilot experiments.

In this notebook, we walk through how to create a SageMaker Autopilot job via the Studio UI.

> **_NOTE_** Please finish running the [01_sagemaker_autopilot_data_preparation.ipynb](./01_sagemaker_autopilot_data_preparation.ipynb) notebook before proceeding.

### Why use the SageMaker Studio UI?

Creating an Autopilot experiment through the SageMaker Studio UI lowers the barrier for non-developers. By clicking through the form and filling in the necessary details about the training data and the Autopilot job settings, you can start tackling an ML problem without writing code.

## Setup

First, restore the shared variables created by the [01_sagemaker_autopilot_data_preparation.ipynb](./01_sagemaker_autopilot_data_preparation.ipynb) notebook, then list the S3 resource URIs needed to fill in the Autopilot experiment details.

In [23]:
%store -r train_data_s3_path
%store -r bucket
%store -r prefix

# Fail early if the data preparation notebook has not stored the training data URI.
try:
    train_data_s3_path
except NameError:
    raise ValueError("Training dataset S3 URI is missing, please execute the data preparation notebook!")

Please note down the following variables:
* `train_data_s3_path` for the training data input.
* `using_studio_ui_output_path` for the Autopilot experiment output.

In [24]:
train_data_s3_path

's3://sagemaker-ap-southeast-2-452533547478/mlu-workshop/autopilot-dm/train/train_data.csv'

In [26]:
using_studio_ui_output_path = f"s3://{bucket}/{prefix}/using-studio-ui-output"
using_studio_ui_output_path

's3://sagemaker-ap-southeast-2-452533547478/mlu-workshop/autopilot-dm/using-studio-ui-output'
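
Before pasting these URIs into the Studio UI, it can help to sanity-check that they are well-formed. The following is a minimal sketch using only the Python standard library; `split_s3_uri` is a hypothetical helper name, and the URI shown is a placeholder rather than a real bucket.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    # The object key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


# Placeholder URI for illustration; in this notebook you would pass
# train_data_s3_path or using_studio_ui_output_path instead.
example_uri = "s3://example-bucket/example-prefix/train/train_data.csv"
example_bucket, example_key = split_s3_uri(example_uri)
print(example_bucket)
print(example_key)
```

If the function raises `ValueError`, the stored variable is probably not the URI you expect, and it is worth re-running the data preparation notebook.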

## AutoPilot Experiment using SageMaker Studio UI

The SageMaker Studio UI provides an easy way to trigger an Autopilot job so that you can start your model experiment quickly. In this section, we walk through the steps to create an Autopilot job and make some predictions on test data.

### Open Amazon SageMaker Studio

Please follow the steps below:
* Log on to the AWS Management Console
* Select the 'Amazon SageMaker' service
* Select 'Studio' in the left-hand menu, under 'SageMaker Domain'
* Click the 'Launch App' dropdown box next to a SageMaker user
* Click the 'Studio' item in the dropdown box.

### Create Autopilot Experiment

On the 'Launcher' tab, choose the **New autopilot experiment** option from the **Build model automatically** box. (If you don't see the 'Launcher' tab, you can open one from the menu 'File' -> 'New Launcher'.)
 
![New Autopilot experiment](./image/ap_new_autopilot_experiment.png)

### Enter information for the Autopilot job

* **Experiment name** - a name that is unique within your account in the current AWS Region and contains at most 63 alphanumeric characters. It can include hyphens (-).

  ![experiment name](./image/ap_experiment_name.png)

  * Type in the 'Experiment name', e.g. 'direct-marketing-autopilot-job'

* **Connect your data** - provide the S3 URI of the training data.

  ![S3 bucket location](./image/ap_enter_s3bucket_location.png)

  * Select 'Enter S3 bucket location'
  * Copy and paste the value of `train_data_s3_path` into 'S3 bucket address'. The value will look similar to 's3://sagemaker-ap-southeast-2-123456789012/mlu-workshop/autopilot-dm/train/train_data.csv'.

* **Is your S3 input a manifest file?** - choose 'Off' for this lab, because we don't have a manifest file containing metadata for our training data.

* **Target** - the target value or label in the training dataset.

  ![target](./image/ap_target.png)

  * Click the dropdown box and select the field 'y', which is the target value.

* **Output data location** - the name of the S3 bucket and directory where you want to store the output data.

  ![output data location](./image/ap_output_data_location.png)

  > **_NOTE_**: You can either select an S3 bucket (in the current AWS Region) and a directory within it, or provide the S3 folder URI. In this exercise, please use a directory under the SageMaker default S3 bucket, such as the value of `using_studio_ui_output_path`.

* **Select the machine learning problem type** - Autopilot can select the machine learning problem type automatically, or you can specify it manually. In this exercise, please choose `Binary classification` in the dropdown box.

  ![machine learning problem type](./image/ap_ml_problem_type.png)

  * Please select `F1` as the objective metric.

* **Do you want to run a complete experiment?** - you can specify how the experiment is run.

  ![complete experiment](./image/ap_complete_experiment.png)

  * If you choose **Yes**, Autopilot runs the experiment with model training and generates the related trials, and you will be able to deploy the best model to a SageMaker endpoint for real-time inference.
  * If you choose **No**, Autopilot does not run the entire experiment; it stops after generating the notebooks for dataset analysis and candidate definitions.

* **Auto deploy** - Autopilot can automatically deploy the best model from an Autopilot experiment to an endpoint (for real-time inference). Accept the default Auto deploy value **On** when creating the experiment, and provide the endpoint name. In this exercise, please enter `dm-autopilot-experiment`.

  ![auto deploy](./image/ap_auto_deploy.png)

* **ADVANCED SETTINGS** - these settings let you specify how the experiment should be run. In particular, set the maximum number of candidates to `10` and accept the default values for the others.

  ![advanced settings](./image/ap_advanced_settings.png)

* **Auto deploy the best model confirmation** - if you chose **On** under 'Auto deploy', Autopilot shows a confirmation dialog to remind you that deploying the model to a SageMaker endpoint incurs cost. In this exercise, please click the `Confirm` button.

  ![auto deploy confirmation](./image/ap_prompt_best_model_deployment.png)

* **Autopilot experiment in progress** - once the Autopilot experiment is kicked off, you can view its progress. The job may take 20-40 minutes to finish, depending on the size of the training dataset and the number of candidates you want to explore. (Autopilot supports up to 250 candidates.)

  ![Autopilot job in progress](./image/ap_auto_pilot_job_in_progress.png)
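
For reference, the settings entered above can also be written down as an API request. The sketch below is not part of the Studio UI flow. It assumes that the UI fields correspond to the boto3 `create_auto_ml_job` parameters as shown, which is our reading of the API and not something the UI states. `build_automl_request` and `create_automl_job` are hypothetical helper names, and the bucket and role ARN are placeholders.

```python
def build_automl_request(job_name, train_uri, output_uri, role_arn,
                         target="y", max_candidates=10,
                         endpoint_name="dm-autopilot-experiment",
                         run_complete_experiment=True):
    """Build keyword arguments for the SageMaker CreateAutoMLJob API."""
    request = {
        "AutoMLJobName": job_name,
        "InputDataConfig": [{
            "DataSource": {
                "S3DataSource": {
                    # Assumption: "S3Prefix" is the counterpart of answering
                    # 'Off' to the manifest file question in the UI.
                    "S3DataType": "S3Prefix",
                    "S3Uri": train_uri,
                }
            },
            "TargetAttributeName": target,
        }],
        "OutputDataConfig": {"S3OutputPath": output_uri},
        "ProblemType": "BinaryClassification",
        "AutoMLJobObjective": {"MetricName": "F1"},
        "AutoMLJobConfig": {"CompletionCriteria": {"MaxCandidates": max_candidates}},
        "RoleArn": role_arn,
        # True means: stop after generating the notebooks (the 'No' choice).
        "GenerateCandidateDefinitionsOnly": not run_complete_experiment,
    }
    if run_complete_experiment and endpoint_name:
        # Counterpart of 'Auto deploy' set to On with an explicit endpoint name.
        request["ModelDeployConfig"] = {
            "AutoGenerateEndpointName": False,
            "EndpointName": endpoint_name,
        }
    return request


def create_automl_job(request):
    """Submit the request. Needs AWS credentials and starts billable resources."""
    import boto3  # imported lazily so building the request works without boto3
    return boto3.client("sagemaker").create_auto_ml_job(**request)


# Placeholder values for illustration only.
request = build_automl_request(
    job_name="direct-marketing-autopilot-job",
    train_uri="s3://example-bucket/example-prefix/train/train_data.csv",
    output_uri="s3://example-bucket/example-prefix/using-studio-ui-output",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)
print(request["ProblemType"], request["AutoMLJobObjective"]["MetricName"])
```

Building the request separately from submitting it lets you inspect the settings without creating any AWS resources; `create_automl_job(request)` is intentionally not called here.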

### View Autopilot Experiment Job 

Once the experiment is complete, we can view the related trials and access the generated notebooks and the deployed endpoint.
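
If you prefer to check for completion from a notebook instead of watching the UI, a polling loop is one option. This is a minimal sketch: `wait_for_automl_job` is a hypothetical helper that takes the describe call as a parameter, on the assumption that you pass it the boto3 SageMaker client's `describe_auto_ml_job` method.

```python
import time


def wait_for_automl_job(describe, job_name, poll_seconds=60, sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return that status."""
    terminal = {"Completed", "Failed", "Stopped"}
    while True:
        response = describe(AutoMLJobName=job_name)
        status = response["AutoMLJobStatus"]
        # The secondary status gives a finer-grained phase when it is present.
        print(status, "-", response.get("AutoMLJobSecondaryStatus", ""))
        if status in terminal:
            return status
        sleep(poll_seconds)


# Usage against AWS (requires credentials, so it is left commented out):
# import boto3
# sm = boto3.client("sagemaker")
# wait_for_automl_job(sm.describe_auto_ml_job, "direct-marketing-autopilot-job")
```

Passing the describe call in as an argument keeps the loop easy to try out with a stub before pointing it at a real job.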

* **Autopilot Job Detail** - to access the Autopilot job details, you can wait until the job from the previous step finishes. Alternatively:
  1. Click the ![icon](./image/sm_studio_sagemaker_resources.png) `SageMaker Resources` icon to open the resources pane.
  2. Select `Experiments and trials` to list the SageMaker experiments.
  3. Right-click the experiment in the list and select `Describe AutoML Job`.
 

  ![To View Autopilot Job Detail](./image/ap_direct_markting_autopilot_job_detail.png)

#### To learn more about the generated notebooks

* Click the `Open candidate generation notebook` button to learn more about how the model candidates are explored.
* Click the `Open data exploration notebook` button to learn more about the statistics of the training data.

#### To view `Best Model`

`Best Model` is the one with the highest performance on the selected `Objective metric`. In our lab, it's the `F1` score.

Right-click the first row, which is labeled `Best Model`, and select the `Describe in model details` menu item.

  ![To View Best Model](./image/ap_describe_best_model.png)
  
The model details page is then shown. For the `Best Model` in particular, Autopilot provides reports under the `Explainability` and `Performance` tabs. Please select them to learn more about model explainability and model performance.

  ![To View Model Details](./image/ap_model_detail_explainability.png)
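
The `Best Model` ranking described above can be reproduced in a few lines. The sketch below assumes candidate records shaped like the entries returned by the boto3 `list_candidates_for_auto_ml_job` call; `best_candidate` is a hypothetical helper, and the candidate names and scores are made up for illustration, not results from this experiment.

```python
def best_candidate(candidates):
    """Return the candidate with the highest final objective metric value."""
    # Candidates that have not finished training carry no final metric yet.
    scored = [c for c in candidates if "FinalAutoMLJobObjectiveMetric" in c]
    if not scored:
        raise ValueError("No candidate has a final objective metric yet")
    return max(scored, key=lambda c: c["FinalAutoMLJobObjectiveMetric"]["Value"])


# Made-up records for illustration only.
sample_candidates = [
    {"CandidateName": "candidate-a",
     "FinalAutoMLJobObjectiveMetric": {"MetricName": "validation:f1", "Value": 0.58}},
    {"CandidateName": "candidate-b",
     "FinalAutoMLJobObjectiveMetric": {"MetricName": "validation:f1", "Value": 0.61}},
    {"CandidateName": "candidate-c"},  # still training, no metric yet
]
print(best_candidate(sample_candidates)["CandidateName"])
```

Taking the maximum is appropriate for `F1`, where higher is better; for an error metric such as MSE you would take the minimum instead.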

#### To experiment with the deployed model

Click `dm-autopilot-experiment` (or whatever name you used for the deployment endpoint) to view the endpoint details. In particular, you can provide a payload to get a prediction result from the endpoint.

  ![To Test Endpoint](./image/ap_test_endpoint.png)
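
The same test can be run from code. This is a minimal sketch, assuming the endpoint accepts a `text/csv` payload containing one record with the feature columns in training order and without the target `y`. `to_csv_payload` and `predict` are hypothetical helper names, and the feature values are placeholders, not a real record from the dataset.

```python
import csv
import io


def to_csv_payload(row):
    """Serialize one record (features only, no label) as a single CSV line."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    # Drop the line terminator that csv.writer appends.
    return buffer.getvalue().rstrip("\r\n")


def predict(endpoint_name, payload):
    """Send the payload to the endpoint. Requires AWS credentials."""
    import boto3  # imported lazily so the payload helper works without boto3
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    return response["Body"].read().decode("utf-8")


# Placeholder feature values; a real record must match the training columns.
payload = to_csv_payload([35, "technician", "married"])
print(payload)
# prediction = predict("dm-autopilot-experiment", payload)
```

The `predict` call is left commented out because it needs a running endpoint, which is billed for as long as it stays deployed.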