# Amazon SageMaker MLOps: Step 0 - Experimentation

This sequence of six notebooks takes you from developing your ML idea in a simple notebook to a production solution with automated model building and deployment pipelines, and model monitoring.

Follow these steps one by one to see how you can move your first model to production:

0. [Experiment in a notebook](00-experiment.ipynb)
1. [Move to SageMaker APIs](01-sagemaker-apis.ipynb)
2. [Orchestrate model building using a SageMaker Pipeline](02-sagemaker-pipelines.ipynb)
3. [Integrate into your CI/CD pipeline](03-sagemaker-projects.ipynb)
4. [Automate model deployment](04-deploy.ipynb)
5. Bonus: [Add data quality monitoring](05-monitoring.ipynb)



## Setup

### Import packages

In [None]:
import time
import os
import json
import boto3
import numpy as np
import pandas as pd
import sagemaker

sagemaker.__version__

### Set constants

In [None]:
boto_session = boto3.Session()
region = boto_session.region_name
sm_session = sagemaker.Session()
bucket_name = sm_session.default_bucket()  # reuse the session instead of creating a second one
bucket_prefix = "from-idea-to-prod/xgboost"  # S3 key prefix for this workshop's artifacts
sm_client = boto_session.client("sagemaker")
sm_role = sagemaker.get_execution_role()

initialized = True

print(sm_role)

In [None]:
%store bucket_name
%store bucket_prefix
%store sm_role
%store region
%store initialized
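
The `%store` magic persists these variables so that later notebooks can restore them with `%store -r`. As a minimal sketch of how the bucket name and prefix might be combined into S3 locations in later steps, the helper `s3_uri` below is hypothetical (it is not part of the SageMaker SDK), and the bucket name is a placeholder rather than the value returned by `default_bucket()`.

```python
def s3_uri(bucket: str, *parts: str) -> str:
    """Join a bucket name and key parts into an s3:// URI (hypothetical helper)."""
    # Drop empty parts and stray slashes so the key never contains "//".
    key = "/".join(p.strip("/") for p in parts if p.strip("/"))
    return f"s3://{bucket}/{key}" if key else f"s3://{bucket}"


# Placeholder values for illustration; in the notebook these come from the cells above.
example_bucket = "my-example-bucket"
example_prefix = "from-idea-to-prod/xgboost"

train_uri = s3_uri(example_bucket, example_prefix, "train")
print(train_uri)
```

Keeping the prefix in one stored variable means every later notebook writes under the same S3 path without repeating the string.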

## Downloading the Dataset

This example uses the [direct marketing dataset](https://archive.ics.uci.edu/ml/datasets/bank+marketing) from UCI's ML Repository:
> [Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014

Download and unzip the dataset:

In [None]:
!wget -P data/ -N https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank-additional.zip

In [None]:
import zipfile

with zipfile.ZipFile("data/bank-additional.zip", "r") as z:
    print("Unzipping the dataset...")
    z.extractall("data")
print("Done")
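
If `wget` is not available in your environment, the same download and extraction can be approximated with the Python standard library. The sketch below is one possible approach, not the workshop's own method; `download_and_extract` is a hypothetical helper name, and unlike `wget -N` it only checks whether the archive already exists locally rather than comparing timestamps.

```python
import urllib.request
import zipfile
from pathlib import Path
from typing import List


def download_and_extract(url: str, data_dir: str = "data") -> List[str]:
    """Download a zip archive into data_dir (if missing) and extract it there.

    Returns the names of the archive members.
    """
    target_dir = Path(data_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last path segment of the URL.
    archive_path = target_dir / url.rsplit("/", 1)[-1]

    # Existence check only: a newer remote copy will not be detected.
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))

    with zipfile.ZipFile(archive_path, "r") as z:
        z.extractall(target_dir)
        return z.namelist()
```

Calling `download_and_extract("https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank-additional.zip")` should leave the extracted files under `data/`, matching the two cells above.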

### Inspect the data

In [None]:
df_data = pd.read_csv("./data/bank-additional/bank-additional-full.csv", sep=";")

pd.set_option("display.max_columns", 500)  # View all of the columns
df_data  # show first 5 and last 5 rows of the dataframe
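
Beyond viewing the rows, a few summary numbers are often worth checking before modeling. The sketch below assumes the target column is named `y`, as in the UCI file; `summarize` is a hypothetical helper, and the small frame is synthetic data for illustration only, not rows from the real dataset.

```python
import pandas as pd


def summarize(df: pd.DataFrame, target: str = "y") -> dict:
    """Collect a few basic facts about a DataFrame before modeling."""
    return {
        "n_rows": len(df),
        "n_columns": df.shape[1],
        "missing_values": int(df.isna().sum().sum()),
        # Share of each class in the target column, useful for spotting imbalance.
        "target_share": df[target].value_counts(normalize=True).to_dict(),
    }


# Tiny synthetic frame for illustration only.
df_example = pd.DataFrame(
    {
        "age": [30, 45, 52, 38],
        "job": ["admin.", "services", "retired", "admin."],
        "y": ["no", "no", "yes", "no"],
    }
)
summary = summarize(df_example)
print(summary)
```

In the notebook, `summarize(df_data)` would apply the same checks to the full dataset.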

## Continue with step 1
Continue with the [step 1 notebook](01-idea-development.ipynb).

## Resources

### Documentation
- [Use Amazon SageMaker Built-in Algorithms](https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html)

### Hands-on examples
- [Get started with Amazon SageMaker](https://aws.amazon.com/sagemaker/getting-started/)


### Workshops
- [Amazon SageMaker 101 Workshop](https://catalog.us-east-1.prod.workshops.aws/workshops/0c6b8a23-b837-4e0f-b2e2-4a3ffd7d645b/en-US)

## Shut down the kernel

In [None]:
%%html

<p><b>Shutting down your kernel for this notebook to release resources.</b></p>
<button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>
        
<script>
try {
    // Click the hidden button to trigger the kernel shutdown command.
    const els = document.getElementsByClassName("sm-command-button");
    els[0].click();
}
catch(err) {
    // NoOp
}
</script>