# Amazon SageMaker MLOps

## Prerequisites

### 1.1 Attach IAM policy to SageMaker execution role (<b>with console</b>)
> step 1. Open the IAM console.  
>
> step 2. Select "Roles" in the left navigation pane.  
> ![nn](../images/Role.png)  
>
> step 3. Search for the SageMaker execution role and select it (the role name is printed by `print(sm_role)` in the "Set constants" cell).  
> ![nn](../images/search-by-rolename.png)  
>
> step 4. Select the "Attach policies" menu.  
> ![nn](../images/attach-policy-menu.png)  
>
> step 5. Search for the "IAMFullAccess" policy and attach it.  
> ![nn](../images/attach-policy.png) 

## Setup
Install the latest SageMaker Python SDK.

<div class="alert alert-info"> 💡 This workshop uses SageMaker Studio to build the MLOps pipeline.
</div>

In [1]:
# Uncomment if you have any compatibility issues and would like to use the specific version of the sagemaker library
# %pip install sagemaker==2.219.0
%pip install --upgrade pip sagemaker boto3

Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install mlflow==2.13.2 sagemaker-mlflow

Note: you may need to restart the kernel to use updated packages.


### Import packages

In [3]:
import time
import os
import json
import boto3
import numpy as np  
import pandas as pd 
import sagemaker
from time import gmtime, strftime, sleep

(sagemaker.__version__,boto3.__version__)

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


('2.228.0', '1.34.161')

### Set constants

In [4]:
# Get some variables you need to interact with SageMaker service
boto_session = boto3.Session()
region = boto_session.region_name
bucket_name = sagemaker.Session().default_bucket()
bucket_prefix = "mlops-workshop/xgboost"  
sm_session = sagemaker.Session()
sm_client = boto_session.client("sagemaker")
sm_role = sagemaker.get_execution_role()
dataset_file_local_path = "data/bank-additional/bank-additional-full.csv"

initialized = True

print(sm_role)

arn:aws:iam::498139060672:role/sagemaker-immersion-day-SageMakerExecutionRole-5ulC9EuIDWIK


In [7]:
# Store some variables to keep the value between the notebooks
%store bucket_name
%store bucket_prefix
%store sm_role
%store region
%store initialized
%store dataset_file_local_path

Stored 'bucket_name' (str)
Stored 'bucket_prefix' (str)
Stored 'sm_role' (str)
Stored 'region' (str)
Stored 'initialized' (bool)
Stored 'dataset_file_local_path' (str)


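### Domain ID
Many SageMaker Python SDK and boto3 SageMaker API calls require the `domain_id` value. The notebook metadata file contains this value. The following code reads the notebook metadata file to get `domain_id`, along with the space name and the owning user profile.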

In [8]:
NOTEBOOK_METADATA_FILE = "/opt/ml/metadata/resource-metadata.json"
domain_id = None
space_name = None  # initialized here so the check below raises the intended error, not a NameError

if os.path.exists(NOTEBOOK_METADATA_FILE):
    with open(NOTEBOOK_METADATA_FILE, "rb") as f:
        metadata = json.loads(f.read())
        domain_id = metadata.get('DomainId')
        space_name = metadata.get('SpaceName')
        print(f"SageMaker domain id: {domain_id}")

if not space_name:
    raise Exception(f"Cannot find the current space name. Make sure you run this notebook in a JupyterLab in the SageMaker Studio")
else:
    print(f"Space name: {space_name}")
    
r = sm_client.describe_space(DomainId=domain_id, SpaceName=space_name)
user_profile_name = r['OwnershipSettings']['OwnerUserProfileName']

assert(user_profile_name)
print(f"User profile: {user_profile_name}")

%store domain_id
%store space_name
%store user_profile_name

SageMaker domain id: d-2vrfnbaod7xr
Space name: danny-lab
User profile: SageMakerUser
Stored 'domain_id' (str)
Stored 'space_name' (str)
Stored 'user_profile_name' (str)
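
The metadata lookup above can be wrapped in a small helper so that a missing file or a missing key yields `None` instead of an unexpected error. The following is a minimal sketch, assuming the same JSON keys (`DomainId`, `SpaceName`) used in the cell above. The function name `read_studio_metadata` is hypothetical, and the demo writes a temporary file with made-up values rather than reading the real Studio path.

```python
import json
import os
import tempfile


def read_studio_metadata(path):
    """Return (domain_id, space_name) from a Studio resource-metadata file.

    Both values are None when the file does not exist; a missing key
    yields None for that value only.
    """
    if not os.path.exists(path):
        return None, None
    with open(path, "r", encoding="utf-8") as f:
        metadata = json.load(f)
    return metadata.get("DomainId"), metadata.get("SpaceName")


# Demo with a temporary file that mimics the metadata layout (values are made up).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "resource-metadata.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"DomainId": "d-example", "SpaceName": "example-space"}, f)
    demo_result = read_studio_metadata(demo_path)
    missing_result = read_studio_metadata(os.path.join(tmp, "missing.json"))

print(demo_result)
print(missing_result)
```

In the notebook, the helper would be called with `NOTEBOOK_METADATA_FILE` as the path.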


### Connect to the MLflow tracking server


To create and manage an MLflow tracking server and to use managed MLflow, the SageMaker execution role needs the following permissions.

In the IAM console, add the following document to the SageMaker execution role as an inline policy.

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker-mlflow:*",
                "sagemaker:CreateMlflowTrackingServer",
                "sagemaker:UpdateMlflowTrackingServer",
                "sagemaker:DeleteMlflowTrackingServer",
                "sagemaker:StartMlflowTrackingServer",
                "sagemaker:StopMlflowTrackingServer",
                "sagemaker:CreatePresignedMlflowTrackingServerUrl"
            ],
            "Resource": "*"
        }
    ]
}
```


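Run the following code to grant the SageMaker execution role the MLflow tracking permissions. The code creates a managed policy named `SageMakerMLflowPolicy` with the permissions above and attaches it to the role.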

In [9]:
# from utils.iam import iam_handler
# iam = iam_handler()
from sagemaker import get_execution_role
strSageMakerRoleName = get_execution_role().rsplit('/', 1)[-1]

import boto3
iam = boto3.client('iam')

# Define the custom policy.
custom_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker-mlflow:*",
                "sagemaker:CreateMlflowTrackingServer",
                "sagemaker:UpdateMlflowTrackingServer",
                "sagemaker:DeleteMlflowTrackingServer",
                "sagemaker:StartMlflowTrackingServer",
                "sagemaker:StopMlflowTrackingServer",
                "sagemaker:CreatePresignedMlflowTrackingServerUrl"
            ],
            "Resource": "*"
        }
    ]
}

# Create the custom policy.
response = iam.create_policy(
    PolicyName='SageMakerMLflowPolicy',
    PolicyDocument=json.dumps(custom_policy)
)

# Get the ARN of the created policy.
custom_policy_arn = response['Policy']['Arn']
#custom_policy_arn = "arn:aws:iam::498139060672:policy/SageMakerMLflowPolicy"

# Attach the custom policy to the role.
iam.attach_role_policy(
    RoleName=strSageMakerRoleName,
    PolicyArn=custom_policy_arn
)

{'ResponseMetadata': {'RequestId': 'a607b420-62af-4ff6-85ba-95f818dec335',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Thu, 15 Aug 2024 08:35:30 GMT',
   'x-amzn-requestid': 'a607b420-62af-4ff6-85ba-95f818dec335',
   'content-type': 'text/xml',
   'content-length': '212'},
  'RetryAttempts': 0}}
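
Running the cell above a second time fails, because `create_policy` raises an error when a policy with the same name already exists. The sketch below shows one way to make the step repeatable. It assumes `iam` is a boto3 IAM client, that the policy lives in the standard `aws` partition with the default path `/`, and that the account ID is supplied by the caller. The function name `ensure_policy_attached` is hypothetical.

```python
def ensure_policy_attached(iam, role_name, policy_name, policy_document_json, account_id):
    """Create the managed policy if needed, then attach it to the role.

    `iam` is expected to be a boto3 IAM client (boto3.client("iam")).
    Returns the ARN of the policy that was attached.
    """
    try:
        response = iam.create_policy(
            PolicyName=policy_name,
            PolicyDocument=policy_document_json,
        )
        policy_arn = response["Policy"]["Arn"]
    except iam.exceptions.EntityAlreadyExistsException:
        # Assumption: standard "aws" partition and default policy path "/".
        policy_arn = f"arn:aws:iam::{account_id}:policy/{policy_name}"

    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return policy_arn
```

In this notebook the call could look like `ensure_policy_attached(iam, strSageMakerRoleName, "SageMakerMLflowPolicy", json.dumps(custom_policy), account_id)`, where `account_id` might be taken from the fifth colon-separated field of the role ARN in `sm_role`.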

Run the following code to check whether an MLflow tracking server already exists in the account.

In [2]:
# Find an active MLflow server in the account
r = boto3.client("sagemaker").list_mlflow_tracking_servers(
    TrackingServerStatus='Created',
)['TrackingServerSummaries']

if len(r) < 1:
    print("You don't have any running MLflow servers. Trying to find a server in the status 'Creating'...")

    r = boto3.client("sagemaker").list_mlflow_tracking_servers(
        TrackingServerStatus='Creating',
    )['TrackingServerSummaries']

    if len(r) < 1:
        print("You don't have any MLflow server in the status 'Creating'. Run the next code cell to create a new one.")
        mlflow_arn = None
        mlflow_name = None
    else:
        mlflow_arn = r[0]['TrackingServerArn']
        mlflow_name = r[0]['TrackingServerName']
        print(f"You have an MLflow server {mlflow_arn} in the status 'Creating', going to use this one")
else:
    mlflow_arn = r[0]['TrackingServerArn']
    mlflow_name = r[0]['TrackingServerName']
    print(f"You have {len(r)} running MLflow server(s). Get the first server ARN:{mlflow_arn}")


In [None]:
# This code cell creates a new MLflow server
if not mlflow_arn:
    ts = strftime('%d-%H-%M-%S', gmtime())
    mlflow_name = f"mlflow-{domain_id}-{ts}"
    r = boto3.client("sagemaker").create_mlflow_tracking_server(
        TrackingServerName=mlflow_name,
        ArtifactStoreUri=f"s3://{bucket_name}/mlflow/{ts}",
        RoleArn=sm_role,
        AutomaticModelRegistration=True,
    )

    mlflow_arn = r['TrackingServerArn']
    print(f"Server creation request succeeded. The server {mlflow_arn} is being created.")

<div style="border: 4px solid coral; text-align: center; margin: auto;">
Creating an MLflow server can take up to 25 minutes. You do not need to wait; continue with the workshop.
</div>
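
When a later step does need the server to be ready, its status can be polled until it reaches `Created`. The following is a minimal sketch, assuming `sm_client` is a boto3 SageMaker client and that `describe_mlflow_tracking_server` returns a `TrackingServerStatus` field. Treating every status that ends in `Failed` as an error is an assumption of this sketch, and the function name `wait_for_tracking_server` is hypothetical.

```python
import time


def wait_for_tracking_server(sm_client, name, poll_seconds=60,
                             timeout_seconds=1800, sleep=time.sleep):
    """Poll until the tracking server is 'Created' or reports a failure.

    `sleep` is injectable so the loop can be exercised without waiting.
    """
    waited = 0
    while True:
        status = sm_client.describe_mlflow_tracking_server(
            TrackingServerName=name
        )["TrackingServerStatus"]
        if status == "Created":
            return status
        if status.endswith("Failed"):
            raise RuntimeError(f"Tracking server {name} is in status {status}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Tracking server {name} is still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

In the notebook this could be called as `wait_for_tracking_server(sm_client, mlflow_name)`. The default timeout of 30 minutes is chosen to sit slightly above the 25 minutes mentioned above.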

In [11]:
(mlflow_arn, mlflow_name)

('arn:aws:sagemaker:us-west-2:498139060672:mlflow-tracking-server/mlflow-d-2vrfnbaod7xr-15-05-39-28',
 'mlflow-d-2vrfnbaod7xr-15-05-39-28')

In [12]:
%store mlflow_arn
%store mlflow_name

Stored 'mlflow_arn' (str)
Stored 'mlflow_name' (str)


## Install Docker to use Studio local mode
Amazon SageMaker Studio applications support local mode, which lets you create estimators, processors, and pipelines and then deploy them to a local environment. With local mode you can test machine learning scripts before running them in Amazon SageMaker managed training or hosting environments. To see which Docker operations Studio currently supports, refer to [Local mode support in Amazon SageMaker Studio](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated-local.html).

To use local mode in Studio applications, you must install Docker in the JupyterLab space.

### Check that Docker access is enabled

In [13]:
# check that docker enabled in the SageMaker domain
docker_settings = sm_client.describe_domain(DomainId=domain_id)['DomainSettings'].get('DockerSettings')
docker_enabled = False

if docker_settings:
    if docker_settings.get('EnableDockerAccess') in ['ENABLED']:
        print(f"The docker access is ENABLED in the domain {domain_id}")
        docker_enabled = True

if not docker_enabled:
    raise Exception(f"You must enable docker access in the domain to use Studio local mode")

The docker access is ENABLED in the domain d-2vrfnbaod7xr


<div style="border: 4px solid coral; text-align: center; margin: auto;">
If the previous code cell raised an exception saying that Docker access is not enabled, you must enable it. Refer to the following instructions.
</div>
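
Docker access can be enabled from code with the SageMaker `update_domain` API. The following is a minimal sketch, assuming `sm_client` is a boto3 SageMaker client and that the calling role is allowed to update the domain (for example through the `sagemaker:UpdateDomain` permission). The function name `enable_docker_access` is hypothetical.

```python
def enable_docker_access(sm_client, domain_id):
    """Enable Docker access for the domain if it is not enabled yet.

    Returns True when an update request was sent, False when the
    setting was already ENABLED.
    """
    settings = sm_client.describe_domain(DomainId=domain_id).get("DomainSettings", {})
    docker_settings = settings.get("DockerSettings") or {}
    if docker_settings.get("EnableDockerAccess") == "ENABLED":
        return False

    sm_client.update_domain(
        DomainId=domain_id,
        DomainSettingsForUpdate={
            "DockerSettings": {"EnableDockerAccess": "ENABLED"}
        },
    )
    return True
```

In the notebook this could be called as `enable_docker_access(sm_client, domain_id)`, after which the domain settings can be checked again.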

In [14]:
print(f"Domain id: {domain_id}")

Domain id: d-2vrfnbaod7xr


### Verify Docker access in the SageMaker domain


In [15]:
# check the updated settings
sm_client.describe_domain(DomainId=domain_id)['DomainSettings']

{'DockerSettings': {'EnableDockerAccess': 'ENABLED',
  'VpcOnlyTrustedAccounts': []}}

### Install Docker

In [16]:
%%bash

# see https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

## Currently only Docker version 20.10.X is supported in Studio: see https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated-local.html
# pick the latest patch from:
# apt-cache madison docker-ce | awk '{ print $3 }' | grep -i 20.10
VERSION_STRING=5:20.10.24~3-0~ubuntu-jammy
sudo apt-get install docker-ce-cli=$VERSION_STRING docker-compose-plugin -y

# validate the Docker Client is able to access Docker Server at [unix:///docker/proxy.sock]
docker version

Hit:1 https://download.docker.com/linux/ubuntu jammy InRelease
Get:2 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Hit:3 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Hit:5 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Get:6 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages [2447 kB]
Fetched 2704 kB in 2s (1395 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
ca-certificates is already the newest version (20230311ubuntu0.22.04.1).
curl is already the newest version (7.81.0-1ubuntu1.17).
0 upgraded, 0 newly installed, 0 to remove and 20 not upgraded.
Hit:1 https://download.docker.com/linux/ubuntu jammy InRelease
Hit:2 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:3 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:5 h

## Data

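This example uses the [direct marketing dataset](https://archive.ics.uci.edu/ml/datasets/bank+marketing) from the UCI ML Repository:
> [Moro et al., 2014] S. Moro, P. Cortez, P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014

The data relates to the direct marketing campaigns of a Portuguese banking institution. The campaigns were based on phone calls. Often, more than one contact with the same client was required to determine whether the product (a bank term deposit) would be subscribed ('yes') or not ('no').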

Download and unzip the dataset:

In [17]:
!wget -P data/ -N https://archive.ics.uci.edu/static/public/222/bank+marketing.zip --no-check-certificate

--2024-08-15 07:47:38--  https://archive.ics.uci.edu/static/public/222/bank+marketing.zip
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified
Saving to: ‘data/bank+marketing.zip’

bank+marketing.zip      [ <=>                ] 999.85K  5.33MB/s    in 0.2s    

Last-modified header missing -- time-stamps turned off.
2024-08-15 07:47:39 (5.33 MB/s) - ‘data/bank+marketing.zip’ saved [1023843]



In [18]:
import zipfile

with zipfile.ZipFile("data/bank+marketing.zip", "r") as z:
    print("Unzipping bank+marketing...")
    z.extractall("data")

with zipfile.ZipFile("data/bank-additional.zip", "r") as z:
    print("Unzipping bank-additional...")
    z.extractall("data")

print("Done")

Unzipping bank+marketing...
Unzipping bank-additional...
Done


### Explore the data

In [19]:
df_data = pd.read_csv(dataset_file_local_path, sep=";")

pd.set_option("display.max_columns", 500)  # View all of the columns
df_data  # show first 5 and last 5 rows of the dataframe

Unnamed: 0,age,job,marital,education,default,housing,loan,contact,month,day_of_week,duration,campaign,pdays,previous,poutcome,emp.var.rate,cons.price.idx,cons.conf.idx,euribor3m,nr.employed,y
0,56,housemaid,married,basic.4y,no,no,no,telephone,may,mon,261,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
1,57,services,married,high.school,unknown,no,no,telephone,may,mon,149,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
2,37,services,married,high.school,no,yes,no,telephone,may,mon,226,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
3,40,admin.,married,basic.6y,no,no,no,telephone,may,mon,151,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
4,56,services,married,high.school,no,no,yes,telephone,may,mon,307,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
41183,73,retired,married,professional.course,no,yes,no,cellular,nov,fri,334,1,999,0,nonexistent,-1.1,94.767,-50.8,1.028,4963.6,yes
41184,46,blue-collar,married,professional.course,no,no,no,cellular,nov,fri,383,1,999,0,nonexistent,-1.1,94.767,-50.8,1.028,4963.6,no
41185,56,retired,married,university.degree,no,yes,no,cellular,nov,fri,189,2,999,0,nonexistent,-1.1,94.767,-50.8,1.028,4963.6,no
41186,44,technician,married,professional.course,no,no,no,cellular,nov,fri,442,1,999,0,nonexistent,-1.1,94.767,-50.8,1.028,4963.6,yes
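
The label column `y` records whether the client subscribed ('yes') or not ('no'), so a quick look at its distribution is a useful first check. The sketch below counts label values with the standard library only. It assumes semicolon-separated text with a header row, as in the dataset file. The function name `label_counts` and the three-row sample are illustrative and are not the real data.

```python
import csv
import io
from collections import Counter


def label_counts(csv_text, label_column="y", delimiter=";"):
    """Count the values of the label column in delimiter-separated text."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return Counter(row[label_column] for row in reader)


# Illustrative three-row sample in the same layout, with only a few columns kept.
sample = "age;job;y\n56;housemaid;no\n73;retired;yes\n46;blue-collar;no\n"
counts = label_counts(sample)
print(counts)
```

With the DataFrame already loaded in the notebook, `df_data["y"].value_counts()` gives the same kind of summary.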


### Upload the data to S3

In [20]:
input_s3_url = sagemaker.Session().upload_data(
    path=dataset_file_local_path,
    bucket=bucket_name,
    key_prefix=f"{bucket_prefix}/input"
)
print(f"Upload the dataset to {input_s3_url}")

%store input_s3_url

Upload the dataset to s3://sagemaker-us-west-2-498139060672/mlops-workshop/xgboost/input/bank-additional-full.csv
Stored 'input_s3_url' (str)
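
The returned URL combines the bucket name, the key prefix, and the file name. Later steps sometimes need the bucket and key separately, so the sketch below splits an S3 URI with the standard library. The function name `split_s3_uri` is hypothetical, and the URI is the one printed above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_uri(
    "s3://sagemaker-us-west-2-498139060672/mlops-workshop/xgboost/input/bank-additional-full.csv"
)
print(bucket)
print(key)
```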


## Restart kernel

In [21]:
# Restart kernel to get the packages
import IPython
IPython.Application.instance().kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}