# Assignment: setup
This series of notebooks takes you through assignments that follow the ML development process using Amazon SageMaker MLOps building blocks.

The assignments are based on the provided notebooks, and you can use the code in those notebooks to complete the exercises.

Refer to the notebook [`00-start-here.ipynb`](../00-start-here.ipynb) for code snippets and general guidance on the exercises in this assignment.

## Import packages

In [8]:
import time
import os
import json
import boto3
import numpy as np  
import pandas as pd 
import sagemaker

sagemaker.__version__

'2.165.0'

## Exercise 1: AWS and SageMaker environment
- Instantiate a [sagemaker session](https://sagemaker.readthedocs.io/en/stable/api/utility/session.html)
- Get the name of the default bucket to use in relevant Amazon SageMaker interactions
- Get the SageMaker execution role
- Get the AWS region
- Instantiate a boto3 [sagemaker client](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html)


In [10]:
# Exercise 1 - write code here
sm_session = sagemaker.Session()
default_bucket = sm_session.default_bucket()
sm_role = sagemaker.get_execution_role()
region = sm_session.boto_session.region_name
sm_client = boto3.client("sagemaker", region_name=region)

## Exercise 2: Studio environment
- Explore the notebook metadata file `/opt/ml/metadata/resource-metadata.json`
- Get the SageMaker `domain_id`
- Get the Studio user profile name
- Get the notebook image name

In [12]:
# Exercise 2 - write code here
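NOTEBOOK_METADATA_FILE = "/opt/ml/metadata/resource-metadata.json"

# Default to an empty dict so the lookups below do not fail outside Studio
metadata = {}
if os.path.exists(NOTEBOOK_METADATA_FILE):
    with open(NOTEBOOK_METADATA_FILE, "r") as f:
        metadata = json.load(f)

# .get() returns None when a key is absent instead of raising KeyError
domain_id = metadata.get("DomainId")
profile_name = metadata.get("UserProfileName")
image_name = metadata.get("ResourceName")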
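
To explore the metadata file before picking individual keys, it can help to pretty-print everything it contains. The following is a minimal sketch, not part of the provided notebooks: `load_metadata` is a hypothetical helper, and the stand-in file with its made-up values exists only so the example runs outside Studio.

```python
import json
import os
import tempfile


def load_metadata(path):
    """Return the parsed JSON metadata, or an empty dict if the file is absent."""
    if not os.path.exists(path):
        return {}
    with open(path, "r") as f:
        return json.load(f)


# Demonstration with a stand-in file; the values are made up for illustration
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "resource-metadata.json")
    with open(sample_path, "w") as f:
        json.dump({"DomainId": "d-example", "UserProfileName": "example-user"}, f)

    sample_metadata = load_metadata(sample_path)
    missing_metadata = load_metadata(os.path.join(tmp_dir, "does-not-exist.json"))

# Pretty-print the whole document to see which keys are available
print(json.dumps(sample_metadata, indent=2, sort_keys=True))
```

In Studio you would call `load_metadata("/opt/ml/metadata/resource-metadata.json")` instead of using the stand-in file.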

## Exercise 3: Data
- Download a dataset. You can use your own dataset and load it from your local storage or from the internet
- Load data into a Pandas dataframe and view the data

In [13]:
!wget -P data/ -N https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank-additional.zip

--2023-06-17 02:57:31--  https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank-additional.zip
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified
Saving to: ‘data/bank-additional.zip’

bank-additional.zip     [  <=>               ] 434.15K  1.28MB/s    in 0.3s    

Last-modified header missing -- time-stamps turned off.
2023-06-17 02:57:32 (1.28 MB/s) - ‘data/bank-additional.zip’ saved [444572]



In [21]:
import zipfile


with zipfile.ZipFile("data/bank-additional.zip", "r") as z:
    print("Unzipping bank-additional...")
    z.extractall("data")

print("Done")

Unzipping bank-additional...
Done
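
Before or after extracting, you may want to check what an archive contains so you know which file to load. The sketch below uses only the standard `zipfile` module. `list_zip_members` is a hypothetical helper, and the throwaway archive and its file name are made up for the demonstration.

```python
import os
import tempfile
import zipfile


def list_zip_members(zip_path):
    """Return (name, size_in_bytes) for every file in the archive."""
    with zipfile.ZipFile(zip_path, "r") as z:
        return [
            (info.filename, info.file_size)
            for info in z.infolist()
            if not info.is_dir()
        ]


# Demonstration on a small throwaway archive (hypothetical file name and content)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_zip = os.path.join(tmp_dir, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as z:
        z.writestr("demo/sample.csv", "age;job\n56;housemaid\n")
    demo_members = list_zip_members(demo_zip)

print(demo_members)
```

For this assignment the call would be `list_zip_members("data/bank-additional.zip")`.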


In [23]:
# Exercise 3 - write code here
df_data = pd.read_csv("data/bank-additional/bank-additional-full.csv", sep=";")
df_data.head()

| | age | job | marital | education | default | housing | loan | contact | month | day_of_week | ... | campaign | pdays | previous | poutcome | emp.var.rate | cons.price.idx | cons.conf.idx | euribor3m | nr.employed | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 56 | housemaid | married | basic.4y | no | no | no | telephone | may | mon | ... | 1 | 999 | 0 | nonexistent | 1.1 | 93.994 | -36.4 | 4.857 | 5191.0 | no |
| 1 | 57 | services | married | high.school | unknown | no | no | telephone | may | mon | ... | 1 | 999 | 0 | nonexistent | 1.1 | 93.994 | -36.4 | 4.857 | 5191.0 | no |
| 2 | 37 | services | married | high.school | no | yes | no | telephone | may | mon | ... | 1 | 999 | 0 | nonexistent | 1.1 | 93.994 | -36.4 | 4.857 | 5191.0 | no |
| 3 | 40 | admin. | married | basic.6y | no | no | no | telephone | may | mon | ... | 1 | 999 | 0 | nonexistent | 1.1 | 93.994 | -36.4 | 4.857 | 5191.0 | no |
| 4 | 56 | services | married | high.school | no | no | yes | telephone | may | mon | ... | 1 | 999 | 0 | nonexistent | 1.1 | 93.994 | -36.4 | 4.857 | 5191.0 | no |
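
Beyond `head()`, a few quick checks help you view the data: the shape, the column types, and the distribution of the target column `y`. This is a minimal sketch on made-up rows in the same semicolon-separated layout. The sample values are illustrative and are not taken from the real file.

```python
import io

import pandas as pd

# Made-up rows in the same semicolon-separated layout; not from the real dataset
sample_csv = io.StringIO(
    "age;job;y\n"
    "56;housemaid;no\n"
    "41;admin.;yes\n"
    "37;services;no\n"
)
df_sample = pd.read_csv(sample_csv, sep=";")

# Number of rows and columns
n_rows, n_cols = df_sample.shape

# How often each target label occurs
target_counts = df_sample["y"].value_counts()

print(f"{n_rows} rows, {n_cols} columns")
print(df_sample.dtypes)
print(target_counts)
```

The same calls work on `df_data` once the full CSV file is loaded.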


In [18]:
!pwd

/root/amazon-sagemaker-from-idea-to-production/assignments


## Continue with assignment 1
Navigate to the [assignment 1](01-assignment-local-development.ipynb) notebook.