## Model Deployment with SageMaker

### 1. Creating a Model
Creating a model tells SageMaker where to find the model components: the S3 path where the model artifacts are stored and the Docker registry path of the image that contains the inference code.

The following example shows how to create a model using the AWS SDK for Python (Boto3). The first few lines define:

- **sagemaker_client:** A low-level SageMaker client object that makes it easy to send and receive requests to AWS services.

- **sagemaker_role:** A string variable with the SageMaker IAM role Amazon Resource Name (ARN).

- **aws_region:** A string variable with the name of your AWS Region.

**Tips-1:** The S3 bucket where the model artifacts are stored must be in the same Region as the model that you are creating.

In [1]:
import boto3

In [3]:
# Specify your AWS Region
aws_region = "us-east-1"

# Create a low-level SageMaker service client
sagemaker_client = boto3.client("sagemaker", region_name = aws_region)

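# IAM role that gives SageMaker permission to access AWS services.
# Placeholder only: IAM role ARNs leave the Region field empty (note the double colon).
sagemaker_role = "arn:aws:iam::<account-id>:role/<role-name>"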

Next, specify the location of the pre-trained model stored in Amazon S3. This example uses a pre-trained XGBoost model named ``demo-xgboost-model.tar.gz``.

In [6]:
# Create a variable with the model s3 URI
s3_bucket = "smworkshop-firat-olcum" # Provide the name of your S3 bucket
bucket_prefix = "saved_models"
model_s3_key = f"{bucket_prefix}/demo-xgboost-model.tar.gz"

# Build the full S3 URI of the model artifact
model_url = f"s3://{s3_bucket}/{model_s3_key}"
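
The tip about keeping the bucket and the model in the same Region can be checked before calling SageMaker. The following is a minimal sketch, not part of the original walkthrough: `normalize_location`, `bucket_region`, and `same_region` are hypothetical helper names, and the S3 client is passed in as an argument so that nothing is called until you choose to. It relies on the S3 `get_bucket_location` call, which reports `None` as the location constraint for buckets in us-east-1 and the legacy value `EU` for some older eu-west-1 buckets.

```python
def normalize_location(constraint):
    """Map an S3 LocationConstraint value to a Region name."""
    if constraint is None:
        # Buckets in us-east-1 report no location constraint.
        return "us-east-1"
    if constraint == "EU":
        # Legacy value used by some older eu-west-1 buckets.
        return "eu-west-1"
    return constraint


def bucket_region(s3_client, bucket):
    """Return the Region of a bucket (hypothetical helper)."""
    response = s3_client.get_bucket_location(Bucket=bucket)
    return normalize_location(response.get("LocationConstraint"))


def same_region(s3_client, bucket, region):
    """True if the bucket lives in the given Region."""
    return bucket_region(s3_client, bucket) == region
```

With the variables defined earlier, a call might look like `same_region(boto3.client("s3"), s3_bucket, aws_region)`.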

Next, specify a primary container. The primary container definition names the Docker image that contains the inference code, the model artifacts (from prior training), and an optional custom environment map that the inference code uses when you deploy the model for predictions.

In this example, we specify an XGBoost built-in algorithm container image:

In [7]:
from sagemaker import image_uris

In [8]:
# Specify an AWS container image.
container = image_uris.retrieve(region = aws_region, framework = "xgboost", version = "0.90-1")

Create a model in Amazon SageMaker with CreateModel. Specify the following:

- **ModelName:** A name for your model (in this example it is stored as a string variable called model_name).

- **ExecutionRoleArn:** The Amazon Resource Name (ARN) of the IAM role that Amazon SageMaker can assume to access model artifacts and Docker images for deployment on ML compute instances or for batch transform jobs.

- **PrimaryContainer:** The location of the primary Docker image containing inference code, associated artifacts, and custom environment maps that the inference code uses when the model is deployed for predictions.

In [None]:
# Placeholder name. Model names may contain only alphanumeric characters and hyphens.
model_name = "the-name-of-your-model"

# Create model
create_model_response = sagemaker_client.create_model(ModelName = model_name,
                                                      ExecutionRoleArn = sagemaker_role,
                                                      PrimaryContainer = {"Image" : container, "ModelDataUrl" : model_url})
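
The primary container can also carry the custom environment map mentioned above, which the example call does not show. The sketch below is one possible way to assemble the request, under stated assumptions: `build_primary_container` and `build_create_model_request` are hypothetical helpers, and any environment variable names you pass are your own. The `Environment` key is part of the container definition accepted by `create_model`, and its values are strings.

```python
def build_primary_container(image_uri, model_data_url, environment=None):
    """Assemble the PrimaryContainer dictionary (hypothetical helper)."""
    container = {"Image": image_uri, "ModelDataUrl": model_data_url}
    if environment:
        # Environment values are passed to the container as strings.
        container["Environment"] = {key: str(value) for key, value in environment.items()}
    return container


def build_create_model_request(model_name, role_arn, container):
    """Assemble keyword arguments for sagemaker_client.create_model."""
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": container,
    }
```

The resulting dictionary can be unpacked into the call, for example `sagemaker_client.create_model(**request)`.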

### 2. Creating an Endpoint Configuration
Once you have a model, create an endpoint configuration with ``CreateEndpointConfig``. Amazon SageMaker hosting services use this configuration to deploy models. In the configuration, you identify one or more models, created with ``CreateModel``, to deploy, along with the resources that you want Amazon SageMaker to provision.

In [14]:
import datetime
from time import gmtime, strftime

In [None]:
# Create an endpoint config name based on the current date and time,
# so that endpoint configs can be searched by creation time.
endpoint_config_name = f"XGBoostEndpointConfig-{strftime('%Y-%m-%d-%H-%M-%S', gmtime())}"

# The name of the model that you want to host. This is the name that you specified when creating the model.
# Placeholder name. Model names may contain only alphanumeric characters and hyphens.
model_name = "the-name-of-your-model"

endpoint_config_response = sagemaker_client.create_endpoint_config(
                           EndpointConfigName = endpoint_config_name, # You will specify this name in a CreateEndpoint request.
                           # List of ProductionVariant objects, one for each model that you want to host at this endpoint.
                           ProductionVariants = [
                               {
                                   "VariantName" : "variant1", # The name of the production variant.
                                   "ModelName" : model_name, # The name of the model to host.
                                   "InstanceType" : "ml.m5.xlarge", # Specify the compute instance type.
                                   "InitialInstanceCount" : 1 # Number of instances to launch initially.
                               }
                           ]
)

print(f"Created EndpointConfig: {endpoint_config_response['EndpointConfigArn']}")

In the example above, you specify the following keys for each entry in the ProductionVariants field:

**VariantName:** The name of the production variant.

**ModelName:** The name of the model that you want to host. This is the name that you specified when creating the model.

**InstanceType:** The compute instance type.

**InitialInstanceCount:** The number of instances to launch initially.
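
Because an endpoint configuration can list more than one production variant, it can help to see how two variants would be described. The sketch below is illustrative only: `production_variant` and `traffic_shares` are hypothetical helpers, and the variant and model names in the usage note are placeholders. It uses the `InitialVariantWeight` key, with each variant receiving traffic in proportion to its weight divided by the sum of all weights.

```python
def production_variant(variant_name, model_name, instance_type="ml.m5.xlarge",
                       instance_count=1, weight=1.0):
    """Build one ProductionVariants entry (hypothetical helper)."""
    return {
        "VariantName": variant_name,
        "ModelName": model_name,
        "InstanceType": instance_type,
        "InitialInstanceCount": instance_count,
        "InitialVariantWeight": weight,
    }


def traffic_shares(variants):
    """Return the fraction of requests each variant would receive."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}
```

For example, `[production_variant("variant1", "model-a", weight=3.0), production_variant("variant2", "model-b", weight=1.0)]` could be passed as `ProductionVariants`, and `traffic_shares` would report how requests are split between the two.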

### 3. Creating an Endpoint
- To create an HTTPS endpoint, provide the endpoint configuration to SageMaker. The service launches the ML compute instances and deploys the model or models as specified in the configuration.

- Once you have your model and endpoint configuration, use the CreateEndpoint API to create your endpoint. The endpoint name must be unique within an AWS Region in your AWS account.

**Tips-1:** Endpoints are scoped to an individual AWS account, and are not public.

In [None]:
# The name of the endpoint. The name must be unique within an AWS Region in your AWS account.
endpoint_name = "endpoint-name"

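# The name of the endpoint configuration associated with this endpoint.
# Placeholder: use the name that you passed to CreateEndpointConfig (alphanumeric characters and hyphens only).
endpoint_config_name = "endpoint-config-name"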

create_endpoint_response = sagemaker_client.create_endpoint(EndpointName = endpoint_name,
                                                            EndpointConfigName = endpoint_config_name)
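
The endpoint is created asynchronously, so it is not ready as soon as the call returns. A minimal sketch of polling until it is ready follows, assuming the `EndpointStatus` field returned by `describe_endpoint`. `wait_for_endpoint` is a hypothetical helper, and the polling interval and limit are arbitrary choices. Boto3 also offers a built-in waiter, `sagemaker_client.get_waiter("endpoint_in_service")`, which may be simpler in practice.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30, max_polls=60,
                      sleep=time.sleep):
    """Poll describe_endpoint until the endpoint is InService (hypothetical helper)."""
    for _ in range(max_polls):
        description = client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            # FailureReason is included in the response when creation fails.
            reason = description.get("FailureReason", "unknown reason")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} was not InService after {max_polls} polls")
```

A call such as `wait_for_endpoint(sagemaker_client, endpoint_name)` blocks until the endpoint can serve requests or raises if creation fails.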

**Tips-2:** Delete the endpoint as soon as you are done with it to avoid additional charges.

In [None]:
# Delete endpoint
sagemaker_client.delete_endpoint(EndpointName = endpoint_name)
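
Deleting the endpoint does not remove the endpoint configuration or the model; both stay in the account until they are deleted separately. The sketch below shows one way to clean up all three resources in order. `delete_deployment` is a hypothetical helper that takes the SageMaker client as an argument, and it assumes you still have the three names used in the earlier steps.

```python
def delete_deployment(client, endpoint_name, endpoint_config_name, model_name):
    """Delete the endpoint, then its configuration, then the model (hypothetical helper)."""
    # Deleting the endpoint releases the ML compute instances behind it.
    client.delete_endpoint(EndpointName=endpoint_name)
    # The configuration and the model are separate resources.
    client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    client.delete_model(ModelName=model_name)
```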

### 4. Deploy Models for Inference
After you build and train your models, you can deploy them to get predictions.

To set up a persistent endpoint to get predictions from your models, use Amazon SageMaker hosting services. 

You can deploy the trained model across one or more load-balanced compute instances. SageMaker provides a real-time endpoint to invoke your model from client applications.
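
As a rough illustration of invoking a real-time endpoint from a client application, the sketch below sends rows of features as CSV through the SageMaker runtime client's `invoke_endpoint` call. It rests on several assumptions: the endpoint hosts a container that accepts `text/csv` (as the built-in XGBoost image does), the response body contains numeric predictions separated by commas or newlines, and `rows_to_csv` and `predict` are hypothetical helper names. The runtime client would be created with `boto3.client("sagemaker-runtime", region_name=aws_region)`.

```python
def rows_to_csv(rows):
    """Serialize feature rows into a CSV payload, one row per line."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def predict(runtime_client, endpoint_name, rows):
    """Send rows to the endpoint and parse numeric predictions (hypothetical helper)."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=rows_to_csv(rows),
    )
    # The response body is a stream of bytes.
    payload = response["Body"].read().decode("utf-8")
    # Accept either comma-separated or newline-separated predictions.
    tokens = payload.replace("\n", ",").split(",")
    return [float(token) for token in tokens if token.strip()]
```

A call might look like `predict(runtime_client, endpoint_name, [[0.5, 1.2, 3.4]])`, where the feature values are placeholders for a row in the format the model was trained on.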