In [31]:
import boto3
%run AWS_functions.py

session = boto3.session.Session()
s3_resource = boto3.resource("s3")
ec2_resource = boto3.resource("ec2")
ec2_client = boto3.client('ec2')

<a id="top"></a>
# Following Workflow

### [File Storage](#File_Storage)
* [Create](#File_Storage_Create) an **S3 Bucket**
* [Load](#File_Storage_Load) csv file and script to execute
* [Delete](#File_Storage_Delete) object and content
* [List](#File_Storage_List) objects

### [Execute Code](#Execute_Code)
#### [EC2](#Execute_Code_EC2) Miniconda Image
* [Create](#Execute_Code_EC2_Create) the EC2 Instance
* [Mount](#Execute_Code_EC2_Mount) S3 Storage
* [Execute](#Execute_Code_EC2_Script) Script
* [Stop](#Execute_Code_EC2_Stop) \ Terminate server

#### [SageMaker](#Execute_Code_SM) Instance
* [Create](#Execute_Code_SM_Create) SageView
* Mount S3 Storage
* Execute Script

### Requirements
Create the `pycloud` conda environment:
```
conda env create -f pycloud.env.yml --force
```

From the command line, run `aws configure` to set up your AWS Access Key and AWS Secret Key.

Local test code is located under `example_src`.

[Go back to top](#top)
## File Storage<a id="File_Storage"></a>
Code to create an S3 bucket
https://realpython.com/python-boto3-aws-s3/#creating-a-bucket

### Create S3 Object<a id="File_Storage_Create"></a>

In [78]:
%run AWS_functions.py

bucket_name, first_response = aws_create_bucket(
    bucket_name='iris-train', 
    s3_connection=s3_resource.meta.client,
    current_region=session.region_name)

iris-train-12746b8f-be9f-4ef8-8c06-38323c33c122 us-east-1
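
`aws_create_bucket` lives in `AWS_functions.py`, which is not shown in this notebook. The block below is a minimal sketch of what such a helper might look like, loosely following the linked Real Python tutorial. The names `make_bucket_name` and `create_bucket_sketch` are assumptions for illustration, not the notebook's actual code. A UUID suffix keeps the bucket name globally unique, and the `LocationConstraint` is left out for `us-east-1`, which does not accept it.

```python
import uuid


def make_bucket_name(prefix):
    # Bucket names are shared across all of AWS, so append a random UUID.
    return "-".join([prefix, str(uuid.uuid4())])


def create_bucket_sketch(bucket_name, s3_connection, current_region):
    # Hypothetical stand-in for aws_create_bucket.
    # s3_connection is expected to be a boto3 S3 client.
    full_name = make_bucket_name(bucket_name)
    kwargs = {"Bucket": full_name}
    if current_region != "us-east-1":
        # us-east-1 is the default region and must not be passed as a constraint.
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": current_region}
    response = s3_connection.create_bucket(**kwargs)
    print(full_name, current_region)
    return full_name, response
```

Passing the client in as an argument, as the notebook does with `s3_resource.meta.client`, keeps the helper easy to exercise against a stub.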


### Load Files into S3 Object<a id="File_Storage_Load"></a>
The Python SDK has no native folder sync, so the helper falls back on the `aws` command-line tool.

In [81]:
%run AWS_functions.py

success = aws_upload_files_to_bucket(
    path_to_src="../example_src/iris", 
    bucket_name=bucket_name)

print("Uploaded was a", success)

Completed 3.6 KiB/5.7 KiB (23.0 KiB/s) with 3 file(s) remaining
upload: ../example_src/iris/data/iris.csv to s3://iris-train-12746b8f-be9f-4ef8-8c06-38323c33c122/data/iris.csv
Completed 3.6 KiB/5.7 KiB (23.0 KiB/s) with 2 file(s) remaining
Completed 3.6 KiB/5.7 KiB (13.7 KiB/s) with 2 file(s) remaining
upload: ../example_src/iris/requirements.txt to s3://iris-train-12746b8f-be9f-4ef8-8c06-38323c33c122/requirements.txt
Completed 3.6 KiB/5.7 KiB (13.7 KiB/s) with 1 file(s) remaining
Completed 5.7 KiB/5.7 KiB (19.6 KiB/s) with 1 file(s) remaining
upload: ../example_src/iris/iris_train.py to s3://iris-train-12746b8f-be9f-4ef8-8c06-38323c33c122/iris_train.py

Uploaded was a True
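
The upload helper is also defined in `AWS_functions.py` and not shown here. One way to get folder sync from Python is to shell out to `aws s3 sync`. The sketch below assumes that approach; `build_sync_command` and `upload_files_sketch` are hypothetical names, and the real helper may use a different CLI subcommand.

```python
import subprocess


def build_sync_command(path_to_src, bucket_name):
    # `aws s3 sync <local-dir> s3://<bucket>` copies new and changed files.
    return ["aws", "s3", "sync", path_to_src, "s3://{}".format(bucket_name)]


def upload_files_sketch(path_to_src, bucket_name, run=subprocess.run):
    # Hypothetical stand-in for aws_upload_files_to_bucket.
    # Returns True when the CLI exits with status 0.
    completed = run(build_sync_command(path_to_src, bucket_name))
    return completed.returncode == 0
```

The `run` parameter defaults to `subprocess.run` and exists only so the function can be tried without touching AWS.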


[Go to EC2](#Execute_Code_EC2)
[Go to SageMaker](#Execute_Code_SM)

### Delete S3 Object and Content<a id="File_Storage_Delete"></a>
Remove all resources and delete bucket!<br>
**This does not back-up anything!**

In [18]:
%run AWS_functions.py

aws_delete_bucket(bucket_name, s3_resource)    

[{'Key': '.DS_Store', 'VersionId': 'null'}, {'Key': '.ipynb_checkpoints/Untitled-checkpoint.ipynb', 'VersionId': 'null'}, {'Key': 'Untitled.ipynb', 'VersionId': 'null'}, {'Key': 'data/.DS_Store', 'VersionId': 'null'}, {'Key': 'data/chicagoCrimes10k.csv', 'VersionId': 'null'}, {'Key': 'data/training/iris.csv', 'VersionId': 'null'}, {'Key': 'model/HasDetections_GridSearch_RF_final.pkl', 'VersionId': 'null'}, {'Key': 'model/iris-randomforest.pkl', 'VersionId': 'null'}, {'Key': 'notebooks/.ipynb_checkpoints/ChicagoCrime-RF-checkpoint.ipynb', 'VersionId': 'null'}, {'Key': 'notebooks/ChicagoCrime-RF.ipynb', 'VersionId': 'null'}, {'Key': 'output/failure', 'VersionId': 'null'}, {'Key': 'requirements.txt', 'VersionId': 'null'}, {'Key': 'requirements.yml', 'VersionId': 'null'}, {'Key': 'sm_train.py', 'VersionId': 'null'}, {'Key': 'train.py', 'VersionId': 'null'}]
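
The printed list of `Key` / `VersionId` pairs suggests the helper deletes every object version before removing the bucket. A minimal sketch under that assumption follows; `delete_bucket_sketch` is a hypothetical name, not the code in `AWS_functions.py`.

```python
def delete_bucket_sketch(bucket_name, s3_resource, batch_size=1000):
    # Hypothetical stand-in for aws_delete_bucket.
    # s3_resource is expected to be boto3.resource("s3").
    bucket = s3_resource.Bucket(bucket_name)
    # Collect every object version so versioned buckets are emptied too.
    to_delete = [
        {"Key": version.object_key, "VersionId": version.id}
        for version in bucket.object_versions.all()
    ]
    print(to_delete)
    # delete_objects accepts at most 1000 keys per request, so send batches.
    for start in range(0, len(to_delete), batch_size):
        bucket.delete_objects(Delete={"Objects": to_delete[start:start + batch_size]})
    # A bucket can only be deleted once it is empty.
    bucket.delete()
    return to_delete
```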


### List S3 Buckets<a id=File_Storage_List></a>
Quickly list and clean up all buckets with `iris-train` in their name

In [129]:
import boto3
s3_resource = boto3.resource('s3')
for bucket in s3_resource.buckets.all():
    if "iris-train" in bucket.name:
        print(bucket.name)
        # Uncomment to delete each matching bucket:
        # aws_delete_bucket(bucket.name, s3_resource)

iris-train-12746b8f-be9f-4ef8-8c06-38323c33c122


[Go back to top](#top)
## Execute Code<a id=Execute_Code></a>

## EC2 Instance<a id=Execute_Code_EC2></a>

### Create EC2 Instance<a id=Execute_Code_EC2_Create></a>
https://blog.ipswitch.com/how-to-create-an-ec2-instance-with-python

You need to bring your own:
* Security group
* `.pem` key

We will launch the following EC2 instance:
* Miniconda AMI: ami-062c42cbecc1d5ec0
* Instance type: t2.medium

I built my own security group and granted SSH access.

A bash script passed as User Data creates the S3 mount; see [here](#Execute_Code_EC2_Mount) for details.
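
`aws_create_ec2_instance` is defined in `AWS_functions.py` and not shown. A minimal sketch of such a helper, built on the boto3 resource call `create_instances`, might look like the following. The function name and the optional `key_name` argument are assumptions added for illustration; the notebook's call does not pass a key name.

```python
def create_ec2_instance_sketch(ec2_name, security_group, user_data, ec2_resource,
                               image_id, instance_type, key_name=None):
    # Hypothetical stand-in for aws_create_ec2_instance.
    # ec2_resource is expected to be boto3.resource("ec2").
    kwargs = {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "SecurityGroupIds": security_group,  # list of security group ids
        "UserData": user_data,               # script run at first boot
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": ec2_name}],
        }],
    }
    if key_name is not None:
        # Name of the EC2 key pair, not the path to the .pem file.
        kwargs["KeyName"] = key_name
    instances = ec2_resource.create_instances(**kwargs)
    return instances[0]
```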

### Mount S3 onto EC2 Instance<a id=Execute_Code_EC2_Mount></a>
https://cloudkul.com/blog/mounting-s3-bucket-linux-ec2-instance/

These steps were first worked out on an existing EC2 instance in AWS.
The goal is to use the API to create the EC2 instance and perform this mount setup automatically.

**Required setup on EC2**
```
sudo yum update
sudo yum install automake fuse fuse-devel gcc-c++ git libcurl-devel libxml2-devel make openssl-devel
git clone https://github.com/s3fs-fuse/s3fs-fuse.git
cd s3fs-fuse
./autogen.sh
./configure --prefix=/usr --with-openssl
make
sudo make install
```

Mounting S3 requires IAM credentials with access to the bucket. For the sake of simplicity, I'm using my admin IAM access key.
```
sudo touch /etc/passwd-s3fs
sudo vim /etc/passwd-s3fs
```
Provide `Your_accesskey:Your_secretkey` inside the file
```
sudo chmod 640 /etc/passwd-s3fs
```

Let's mount it! Replace `iris-trainc0e3588c-d9bb-4699-821c-1883670ace42` with your bucket name.
`uid=500` is the `ec2-user` account here; run `id -u ec2-user` to confirm the value on your instance.
```
sudo mkdir /mys3bucket
sudo chown ec2-user:ec2-user /mys3bucket
s3fs iris-trainc0e3588c-d9bb-4699-821c-1883670ace42 -o use_cache=/tmp -o allow_other -o uid=500 -o mp_umask=002 -o multireq_max=5 /mys3bucket
```

Validate
```
df -Th
```

Mount at reboot
```
vi /etc/rc.local
/usr/bin/s3fs iris-trainc0e3588c-d9bb-4699-821c-1883670ace42 -o use_cache=/tmp -o allow_other -o uid=500 -o mp_umask=002 -o multireq_max=5 /mys3bucket
```
**or** add it to the User Data at execution!
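
The next cell calls `aws_user_data_script(bucket_name)`, whose source is not shown. As a rough sketch, such a helper could render the mount commands above into a shell script string. The function name, the `mount_point` and `uid` defaults, and the assumption that `s3fs` and `/etc/passwd-s3fs` already exist on the image are all illustrative.

```python
def user_data_script_sketch(bucket_name, mount_point="/mys3bucket", uid=500):
    # Hypothetical stand-in for aws_user_data_script.
    # User Data runs as root at first boot, so no sudo is needed.
    # Assumes s3fs is installed and /etc/passwd-s3fs is already in place.
    lines = [
        "#!/bin/bash",
        "mkdir -p {mp}",
        "chown ec2-user:ec2-user {mp}",
        "/usr/bin/s3fs {bucket} -o use_cache=/tmp -o allow_other -o uid={uid} "
        "-o mp_umask=002 -o multireq_max=5 {mp}",
    ]
    return "\n".join(lines).format(bucket=bucket_name, mp=mount_point, uid=uid) + "\n"
```

Calling `print(user_data_script_sketch("my-bucket"))` is a quick way to inspect the script before passing it to the instance.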

In [82]:
%run AWS_functions.py

from IPython.display import display
import socket
import time


# create a new EC2 instance
user_data = aws_user_data_script(bucket_name)

instance = aws_create_ec2_instance(
    ec2_name="iris-train", 
    security_group=['sg-0d24aec64507df8b5'], 
    user_data=user_data, 
    ec2_resource=ec2_resource,
    image_id = "ami-062c42cbecc1d5ec0", 
    instance_type="t2.medium")

print("instance id: ",instance.id)

instance id:  i-0c4e0a21c3f6164a9


In [83]:
# Report status once the instance is finally up.
retries = 10
retry_delay = 10
retry_count = 0

print("Wait till instance state changes to running")
instance.wait_until_running()
instance = ec2_resource.Instance(id=instance.id)  # re-fetch to pick up the public IP
print("Instance State Up, waiting for boot-up")

waiting_status = "instance is still loading retrying . . . "
dh = display(waiting_status,display_id=True)

while retry_count <= retries:
    # Probe the SSH port; the context manager closes the socket after each attempt.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(retry_delay)
        result = sock.connect_ex((instance.public_ip_address, 22))
    if result == 0:
        print("Instance is UP & accessible on port 22, the IP address is:  ",instance.public_ip_address)
        break

    # Count the failed attempt; without this the loop would never give up.
    retry_count += 1
    if len(waiting_status) < 50:
        waiting_status += ". "
    else:
        waiting_status = waiting_status[0:41]

    dh.update(waiting_status)
    time.sleep(retry_delay)
else:
    print("Instance did not become reachable on port 22")

Wait till instance state changes to running
Instance State Up, waiting for boot-up


'instance is still loading retrying . . . '

Instance is UP & accessible on port 22, the IP address is:   3.80.171.61


Run the following commands via SSH.

From what I can tell, Miniconda is set up only after the User Data script completes, so the conda packages cannot be installed from User Data.

We will execute the following commands through an SSH client with root access:
```
sudo su
while read requirement; do conda install --yes $requirement; done < /mys3bucket/requirements.txt
```
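
`exec_ssh_cmd` comes from `AWS_functions.py` and is not shown. Judging by how its result is used, it returns one entry per command with the output lines under `"response"`. The sketch below reproduces that shape by shelling out to the `ssh` client. The function names, the `ec2-user` default, and the `my-key.pem` path are placeholders, and the real helper may well use an SSH library instead.

```python
import subprocess


def build_ssh_command(host, command, user="ec2-user", key_path="my-key.pem"):
    # user and key_path are placeholders; point key_path at your own .pem file.
    return [
        "ssh", "-i", key_path,
        "-o", "StrictHostKeyChecking=no",
        "{}@{}".format(user, host),
        command,
    ]


def exec_ssh_cmd_sketch(public_dns_name, commands, run=subprocess.run, **ssh_kwargs):
    # Hypothetical stand-in for exec_ssh_cmd: one result dict per command,
    # with the captured stdout split into lines under "response".
    results = []
    for command in commands:
        completed = run(
            build_ssh_command(public_dns_name, command, **ssh_kwargs),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        results.append({
            "command": command,
            "response": completed.stdout.splitlines(),
            "returncode": completed.returncode,
        })
    return results
```

In this sketch each command runs in its own SSH session, so state such as the working directory does not carry over. That is why a `cd` has to be chained with the command that depends on it, as in `"cd /mys3bucket/; python iris_train.py"`.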

In [84]:
%run AWS_functions.py

commands = [
    "mkdir /mys3bucket/output",
    "mkdir /mys3bucket/model",
    "echo 'for requirement in `cat /mys3bucket/requirements.txt` ; do  conda install --yes  ${requirement}  ;  done' > /tmp/setup.sh",
    "chmod +x /tmp/setup.sh",
    "sudo  -i /tmp/setup.sh",
    "conda list"
]

ssh_result = exec_ssh_cmd(instance.public_ip_address,commands)

# Print the output of the last command (conda list)
for line in ssh_result[5]["response"]:
    print(line)

# packages in environment at /opt/conda:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
asn1crypto                0.24.0                   py37_0  
blas                      1.0                         mkl  
ca-certificates           2019.10.16                    0  
certifi                   2019.9.11                py37_0  
cffi                      1.11.5           py37he75722e_1  
chardet                   3.0.4                    py37_1  
conda                     4.5.12                   py37_0  
conda-env                 2.6.0                         1  
cryptography              2.4.2            py37h1ba5d50_0  
idna                      2.8                      py37_0  
intel-openmp              2019.4                      243  
joblib                    0.14.0                     py_0  
libedit                   3.1.20181209         hc058e9b_0  
libffi                    3.2.1                hd8

### Execute Python Script<a id=Execute_Code_EC2_Script></a>

Execute the `iris_train.py` file!
```
cd /mys3bucket
python iris_train.py
```

In [107]:
%run AWS_functions.py

ssh_result = exec_ssh_cmd(
    public_dns_name = instance.public_ip_address,
    commands = ["cd /mys3bucket/; python iris_train.py"])

for val in ssh_result[0]["response"]:
    print(val)

load data
trian model
save model


### Wait for S3 to refresh then download and kill instance

In [127]:
%run AWS_functions.py

resources, bucket = aws_list_objects(
    bucket_name=bucket_name,
    s3_resource=s3_resource)

models = [ res["file_name"] for res in resources if "model/iris" in res["file_name"]]
models

if len(models) > 0:
    aws_download_objects(
        path_to_src="../example_src/iris",
        bucket_name=bucket_name
    )

Completed 564 Bytes/9.1 KiB (2.5 KiB/s) with 2 file(s) remaining
download: s3://iris-train-12746b8f-be9f-4ef8-8c06-38323c33c122/model/iris-labels.pkl to ../example_src/iris/model/iris-labels.pkl
Completed 564 Bytes/9.1 KiB (2.5 KiB/s) with 1 file(s) remaining
Completed 9.1 KiB/9.1 KiB (40.8 KiB/s) with 1 file(s) remaining 
download: s3://iris-train-12746b8f-be9f-4ef8-8c06-38323c33c122/model/iris-rf.pkl to ../example_src/iris/model/iris-rf.pkl
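
The heading mentions waiting for S3 to refresh, but the cell above checks for the model files only once. A small polling loop is one way to make that wait explicit. This is a sketch under the assumption that the models land under a known key prefix; `wait_for_keys_sketch` is a hypothetical name, and `bucket` is a boto3 `Bucket` resource such as `s3_resource.Bucket(bucket_name)`.

```python
import time


def wait_for_keys_sketch(bucket, prefix, retries=10, delay=10, sleep=time.sleep):
    # Poll the bucket until at least one key with the given prefix appears.
    # Returns the matching keys, or an empty list if none show up in time.
    for attempt in range(retries):
        keys = [obj.key for obj in bucket.objects.filter(Prefix=prefix)]
        if keys:
            return keys
        if attempt < retries - 1:
            sleep(delay)  # no need to sleep after the final attempt
    return []
```

For example, `wait_for_keys_sketch(s3_resource.Bucket(bucket_name), "model/iris")` would block until a model file is visible or the retries run out.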



### Stop \ Terminate EC2 Instance<a id=Execute_Code_EC2_Stop></a>

In [128]:
%run AWS_functions.py

response = aws_stop_ec2(
    instance_id = instance.id, 
    ec2_client = ec2_client)

print( response )

{'StoppingInstances': [{'CurrentState': {'Code': 64, 'Name': 'stopping'}, 'InstanceId': 'i-0c4e0a21c3f6164a9', 'PreviousState': {'Code': 16, 'Name': 'running'}}], 'ResponseMetadata': {'RequestId': '747e5d07-623a-42fe-95f9-07bea87604fa', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'text/xml;charset=UTF-8', 'content-length': '579', 'date': 'Fri, 15 Nov 2019 20:50:26 GMT', 'server': 'AmazonEC2'}, 'RetryAttempts': 0}}
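
The response above shows the instance being stopped, not terminated. A stopped instance can be started again and its EBS volume is still billed, while a terminated instance is gone for good. A sketch of the terminate path using the boto3 client follows; `terminate_ec2_sketch` is a hypothetical name and not part of `AWS_functions.py` as far as this notebook shows.

```python
def terminate_ec2_sketch(instance_id, ec2_client, wait=True):
    # ec2_client is expected to be boto3.client("ec2").
    response = ec2_client.terminate_instances(InstanceIds=[instance_id])
    if wait:
        # Block until EC2 reports the instance as terminated.
        waiter = ec2_client.get_waiter("instance_terminated")
        waiter.wait(InstanceIds=[instance_id])
    return response
```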


# Deploy Model

https://medium.com/@patrickmichelberger/how-to-deploy-a-serverless-machine-learning-microservice-with-aws-lambda-aws-api-gateway-and-d5b8cbead846

We used a Flask microservice and Zappa to automate Lambda, S3, and API Gateway

*Assumptions*
* Flask app created in the project folder under api folder and works locally
* Pickle \ joblib file uploaded to s3
* virtualenv installed `pip install virtualenv`

1) Set up a lambda virtualenv from your project folder
```
cd example_src/iris
virtualenv lambda
source lambda/bin/activate
pip install flask zappa scikit-learn numpy scipy
```
2) Test your flask locally by running `python api/app.py`
3) Initialize zappa by `zappa init`
    * Environment: dev
    * S3 Bucket: use default
    * App Function: use default `api.app.app`
    * Globally: n
*Should look like this*
```
{
    "dev": {
        "app_function": "api.app.app",
        "aws_region": "us-east-1",
        "profile_name": "default",
        "project_name": "iris",
        "runtime": "python3.7",
        "s3_bucket": "zappa-telpd5in0"
    }
}
```
4) Deploy zappa 
```zappa deploy dev```<br>
5) Test Model<br>
```
curl -d '{"data":[[4, 2.1,1,0.4], [6.2, 1,3,2]]}' -X POST https://sgj5uofirg.execute-api.us-east-1.amazonaws.com/dev
```
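
The same request can be sent from Python with only the standard library. This is a minimal sketch: `build_predict_request` and `predict` are hypothetical names, the endpoint URL is whatever `zappa deploy` printed for you, and it assumes the Flask app replies with JSON. Unlike the `curl` call above, it sets a JSON content type explicitly, which the app may or may not require.

```python
import json
import urllib.request


def build_predict_request(url, rows):
    # Wrap the feature rows in the {"data": [...]} payload used by the curl example.
    body = json.dumps({"data": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def predict(url, rows, timeout=30):
    # Send the request and decode the reply, assuming the app returns JSON.
    request = build_predict_request(url, rows)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```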

[Go back to top](#top)
## SageMaker<a id=Execute_Code_SM></a>
SageMaker requires setting up the S3 folder structure a little differently:
```
example_src/
+----data/
    +--iris.csv
+----output/
```

### Setup S3<a id=Execute_Code_SM_S3></a>
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_iris/Scikit-learn%20Estimator%20Example%20With%20Batch%20Transform.ipynb

In [131]:
import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()

# from sagemaker import get_execution_role
# role = get_execution_role()
# role
role = "arn:aws:iam::741519135447:role/service-role/AmazonSageMaker-ExecutionRole-20191018T183794"

In [136]:
import numpy as np
import os
from sklearn import datasets

# Load Iris dataset, then join labels and features
# iris = datasets.load_iris()
# joined_iris = np.insert(iris.data, 0, iris.target, axis=1)

# # Create directory and write csv
# os.makedirs('./data', exist_ok=True)
# np.savetxt('./data/iris.csv', joined_iris, delimiter=',', fmt='%1.1f, %1.3f, %1.3f, %1.3f, %1.3f')

# WORK_DIRECTORY = 'data'

train_input = sagemaker_session.upload_data(
    '../example_src/iris/data', 
    key_prefix="{}/{}".format("iris-train", 'data') )

print(train_input)

s3://sagemaker-us-east-1-741519135447/iris-train/data


In [137]:
from sagemaker.sklearn.estimator import SKLearn

script_path = '../example_src/iris/train.py'

sklearn = SKLearn(
    entry_point=script_path,
    train_instance_type="ml.c4.xlarge",
    role=role,
    sagemaker_session=sagemaker_session)
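
The entry-point script `train.py` is not shown in this notebook. In script mode, SageMaker tells the script where to read data and where to write the model through environment variables; the `train` key passed to `fit` surfaces as `SM_CHANNEL_TRAIN`. A minimal sketch of the argument handling such a script might start with is below. The function name and the local fallback directories are assumptions for illustration.

```python
import argparse
import os


def parse_sagemaker_args(argv=None):
    # Read SageMaker's standard locations, falling back to local folders
    # (hypothetical defaults) so the script can also run outside SageMaker.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-dir",
                        default=os.environ.get("SM_MODEL_DIR", "model"))
    parser.add_argument("--train",
                        default=os.environ.get("SM_CHANNEL_TRAIN", "data"))
    parser.add_argument("--output-data-dir",
                        default=os.environ.get("SM_OUTPUT_DATA_DIR", "output"))
    return parser.parse_args(argv)
```

Inside the script, `args = parse_sagemaker_args()` would then give `args.train` as the folder holding `iris.csv` and `args.model_dir` as the place to save the fitted model.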

In [140]:
sklearn.fit({'train': train_input})

2019-11-16 17:40:58 Starting - Starting the training job...
2019-11-16 17:40:59 Starting - Launching requested ML instances...
2019-11-16 17:42:00 Starting - Preparing the instances for training......
2019-11-16 17:42:57 Downloading - Downloading input data...
2019-11-16 17:43:25 Training - Downloading the training image..
2019-11-16 17:43:39,354 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
2019-11-16 17:43:39,357 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2019-11-16 17:43:39,367 sagemaker_sklearn_container.training INFO     Invoking user training script.
2019-11-16 17:43:39,629 sagemaker-containers INFO     Module train does not provide a setup.py. 
Generating setup.py
2019-11-16 17:43:39,629 sagemaker-containers INFO     Generating setup.cfg
2019-11-16 17:43:39,629 sagemaker-containers INFO     Generating MANIFEST.in
[31m2019-11-16 17:43:39,630 sag

In [141]:
predictor = sklearn.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------*

UnexpectedStatusException: Error hosting endpoint sagemaker-scikit-learn-2019-11-16-17-40-57-075: Failed. Reason:  The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..

In [142]:
import itertools
import pandas as pd

shape = pd.read_csv("../example_src/iris/data/iris.csv")

a = [50*i for i in range(3)]
b = [40+i for i in range(10)]
indices = [i+j for i,j in itertools.product(a,b)]

test_data = shape.iloc[indices[:-1]]
test_X = test_data.iloc[:,1:]
test_y = test_data.iloc[:,0]

In [143]:

print(predictor.predict(test_X.values))
print(test_y.values)

NameError: name 'predictor' is not defined

In [None]:
for requirement in `cat /mys3bucket/requirements.txt` ; do  pip install ${requirement}  ;  done

