# Time to Deploy
Ryan Prasad 10/5/2023 rhprasad@outlook.com

Hello World was a success, but now we need to retrain the model with sklearn inside SageMaker so we can deploy it and get it hooked up to the ArcGIS JS front end.

Commentary is going to be light until we get to new stuff.

In [3]:
from requests import get
ip = get('https://api.ipify.org').content.decode('utf8')
print('My public IP address is: {}'.format(ip))

My public IP address is: 54.203.17.143


In [4]:
%pip install psycopg2-binary

Collecting psycopg2-binary
  Obtaining dependency information for psycopg2-binary from https://files.pythonhosted.org/packages/bc/0d/486e3fa27f39a00168abfcf14a3d8444f437f4b755cc34316da1124f293d/psycopg2_binary-2.9.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached psycopg2_binary-2.9.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.4 kB)
Using cached psycopg2_binary-2.9.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
Installing collected packages: psycopg2-binary
Successfully installed psycopg2-binary-2.9.9
Note: you may need to restart the kernel to use updated packages.


In [5]:
from sqlalchemy import create_engine
from sqlalchemy.sql import text
import psycopg2
import pandas as pd

engine = create_engine(
    "{dialect}+{driver}://{username}:{password}@{host}:{port}/{database}".format(
        dialect="postgresql",
        driver="psycopg2",
        username=":-)",
        password=":-)",
        host="fastdb.com40arouubf.us-west-2.rds.amazonaws.com", 
        port=5432,
        database="geoml"
    )
)

with engine.connect() as db_conn:
    sql_query = "SELECT version()"
    df = pd.read_sql_query(sql=text(sql_query), con=db_conn)
df

Unnamed: 0,version
0,"PostgreSQL 15.3 on x86_64-pc-linux-gnu, compil..."


In [6]:
with engine.connect() as db_conn:
    sql_query = """
    SELECT ST_X(ST_Transform(geom, 4326)) AS lon,
           ST_Y(ST_Transform(geom, 4326)) AS lat,
           cid AS continent_id
    FROM points_xy;
    """
    df = pd.read_sql_query(sql=text(sql_query), con=db_conn)
df = df.sample(frac=1)
df

Unnamed: 0,lon,lat,continent_id
73118,-77.702820,-77.142996,8
55255,-155.669855,19.560042,6
51344,175.772107,-40.282927,6
15278,151.783363,66.942484,2
52519,169.557308,-44.986742,6
...,...,...,...
64649,-68.104537,-6.653018,7
35656,12.275588,62.030286,4
54136,173.007352,-34.526141,6
39603,52.601519,63.971393,4


In [7]:
features = df[["lon", "lat"]].to_numpy()
target = df[["continent_id"]].to_numpy()

features

array([[ -77.70281955,  -77.14299593],
       [-155.66985476,   19.56004177],
       [ 175.77210663,  -40.28292677],
       ...,
       [ 173.00735226,  -34.52614082],
       [  52.60151941,   63.97139263],
       [ 175.58396625,  -40.6907277 ]])

In [8]:
target

array([[8],
       [6],
       [6],
       ...,
       [6],
       [4],
       [6]])

In [9]:
from sklearn.svm import LinearSVC
lsvc = LinearSVC(verbose=0, max_iter=1000)
print(lsvc)

LinearSVC()


In [10]:
lsvc.fit(features, target.ravel())
score = lsvc.score(features, target.ravel())
print("Score: ", score)

Score:  0.629675




Here is where we left off. Let's save that model to this directory (with my other notebooks and images).

In [35]:
import joblib

joblib.dump(lsvc, "model.pkl")

['model.pkl']

This needs to be a .tar.gz file, so I opened a console and ran the following:

`tar -czvf model.tar.gz model.pkl`
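The same packaging step could also be done from the notebook instead of a console. This is a minimal sketch using the standard-library `tarfile` module. The helper names `make_model_archive` and `list_archive` are made up for illustration, and the demo writes placeholder bytes into a scratch directory rather than touching the real model. Listing the archive members is a cheap way to confirm the file name inside the tarball matches what the entry point script will try to load.

```python
import os
import tarfile
import tempfile


def make_model_archive(model_path, archive_path):
    """Write model_path into a gzipped tarball, placed at the archive root."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname strips the directory prefix so the archive holds just the file name
        tar.add(model_path, arcname=os.path.basename(model_path))
    return archive_path


def list_archive(archive_path):
    """Return the member names stored in the tarball."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


# Demo in a scratch directory; placeholder bytes stand in for the real joblib dump.
with tempfile.TemporaryDirectory() as workdir:
    demo_model = os.path.join(workdir, "model.pkl")
    with open(demo_model, "wb") as f:
        f.write(b"placeholder, not a real model")
    archive = make_model_archive(demo_model, os.path.join(workdir, "model.tar.gz"))
    members = list_archive(archive)

print(members)  # ['model.pkl']
```

For the real thing, the call would presumably be `make_model_archive("model.pkl", "model.tar.gz")` in the notebook directory.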

In [40]:
import boto3

s3 = boto3.resource('s3')
s3.Bucket(':-)').upload_file("model.tar.gz", "model.tar.gz")

Saved the model in my S3 bucket. 

In [15]:
import sagemaker
role = sagemaker.get_execution_role()
role

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml


':-)'

Needed to grab the role I am using to deploy. Probably not best practice but whatever.

Now, create the entry_point script. It has to define model_fn, which is required according to the documentation: SageMaker calls it with the directory where model.tar.gz was unpacked, and it returns the loaded model.

In [49]:
%%writefile entry_point.py

import joblib
import os

def model_fn(model_dir):
    clf = joblib.load(os.path.join(model_dir, "model.pkl"))
    return clf

Writing entry_point.py


In [50]:
from sagemaker.sklearn.model import SKLearnModel

model = SKLearnModel(name="Test-model",
                     model_data="s3://sagemaker-studio-waehynjqjqj/model.tar.gz",
                     entry_point="entry_point.py",
                     role="arn:aws:iam::748757098892:role/service-role/AmazonSageMaker-ExecutionRole-20230926T210852",
                     framework_version="1.0-1")

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml


In [51]:
predictor = model.deploy(instance_type="ml.t2.large", initial_instance_count=1)

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
------!

It worked? I can't believe it. That was painful. Let's see if we can get a prediction out of it.

In [57]:
predictor

<sagemaker.sklearn.model.SKLearnPredictor at 0x7ff6b094fc40>

In [63]:
predictor.predict([[-98,36]])

array([7])

Welp, that is definitely not South America. That coordinate was North America yesterday. Guess there is a difference between 1k and 10k training iterations after all! 

Onward! Time to hook this baby up to the Lambda function and then to the map! But let's kill this endpoint before bedtime. I am doing this in the SageMaker Console.
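For the Lambda step, here is a rough sketch of what calling the endpoint might look like without the SageMaker SDK, using the boto3 `sagemaker-runtime` client. Several things here are assumptions and not tested against this endpoint: the endpoint name `my-endpoint-name` is a placeholder, the `lon` and `lat` keys on the event are hypothetical, and the sketch assumes the container accepts and returns `application/json`. The helper names are made up for illustration.

```python
import json


def build_payload(lon, lat):
    """Serialize one point in the same [[lon, lat]] shape passed to predictor.predict."""
    return json.dumps([[lon, lat]])


def parse_prediction(body_bytes):
    """Decode a JSON response body such as b'[7]' into a single continent id."""
    return int(json.loads(body_bytes)[0])


def predict_continent(runtime_client, endpoint_name, lon, lat):
    """Send one point to the endpoint and return the predicted continent id."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # assumption: container parses JSON input
        Accept="application/json",       # assumption: container can answer in JSON
        Body=build_payload(lon, lat),
    )
    return parse_prediction(response["Body"].read())


def lambda_handler(event, context):
    # boto3 is imported lazily so the helpers above can be exercised without AWS access
    import boto3

    client = boto3.client("sagemaker-runtime")
    # "my-endpoint-name" is a placeholder, and the event keys are hypothetical
    continent_id = predict_continent(
        client, "my-endpoint-name", event["lon"], event["lat"]
    )
    return {"continent_id": continent_id}
```

Passing the client in as an argument keeps `predict_continent` easy to try locally with a stand-in object before any Lambda wiring exists.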