# Deploying Mercury using AWS JumpStart

This notebook shows how to deploy Mercury as a SageMaker endpoint with AWS JumpStart.

## Prerequisites

1. This notebook must be opened from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
2. Ensure that your IAM role has the `AmazonSageMakerFullAccess` policy attached.

**Note**: You can also run this notebook on a local machine if you manually set `role_arn` to an IAM role with `AmazonSageMakerFullAccess` permissions.
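The local-machine fallback described in the note could be sketched roughly as follows. This is only an illustration, not part of the original notebook: `resolve_role_arn` is a hypothetical helper, and the ARN in the usage comment is a placeholder you would replace with a role from your own account.

```python
from typing import Callable, Optional


def resolve_role_arn(lookup: Callable[[], str], fallback_arn: Optional[str] = None) -> str:
    """Return the role ARN from `lookup`, or `fallback_arn` if the lookup fails.

    `lookup` is expected to be a zero-argument callable such as
    `sagemaker.get_execution_role`. Hypothetical helper for illustration;
    it is not part of the SageMaker SDK.
    """
    try:
        return lookup()
    except Exception:
        # Outside SageMaker the lookup may fail; use the manually supplied ARN.
        if fallback_arn is None:
            raise
        return fallback_arn


# Possible usage in the notebook (placeholder ARN, replace with your own role):
# from sagemaker import get_execution_role
# role_arn = resolve_role_arn(
#     get_execution_role,
#     fallback_arn="arn:aws:iam::<account-id>:role/<role-name>",
# )
```

Passing the lookup as a callable keeps the helper independent of the SDK, so it behaves the same inside and outside SageMaker.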

## Step 1: Subscribe to the model package

To subscribe to Mercury:

1. Open the Mercury listing: [Mercury - SageMaker Marketplace](https://aws.amazon.com/marketplace/pp/prodview-ycpatnhxuxgfa)
2. Click **Continue to subscribe**.
3. Accept the offer on the next page after reviewing the EULA, pricing and support terms.

## Step 2: Create an endpoint

In [1]:
# Import the necessary packages and set the correct region and execution role

import sagemaker
from sagemaker import ModelPackage, get_execution_role
import boto3
from sagemaker.utils import name_from_base
from time import perf_counter

region = boto3.Session().region_name
role_arn = get_execution_role()
sagemaker_session = sagemaker.Session()

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


In [None]:
# Set the correct model package ARN corresponding to your region

model_package_map = {
    "us-east-1": "arn:aws:sagemaker:us-east-1:865070037744:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "us-east-2": "arn:aws:sagemaker:us-east-2:057799348421:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "us-west-1": "arn:aws:sagemaker:us-west-1:382657785993:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "us-west-2": "arn:aws:sagemaker:us-west-2:594846645681:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "ca-central-1": "arn:aws:sagemaker:ca-central-1:470592106596:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "eu-central-1": "arn:aws:sagemaker:eu-central-1:446921602837:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "eu-west-1": "arn:aws:sagemaker:eu-west-1:985815980388:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "eu-west-2": "arn:aws:sagemaker:eu-west-2:856760150666:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "eu-west-3": "arn:aws:sagemaker:eu-west-3:843114510376:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "eu-north-1": "arn:aws:sagemaker:eu-north-1:136758871317:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "ap-southeast-1": "arn:aws:sagemaker:ap-southeast-1:192199979996:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "ap-southeast-2": "arn:aws:sagemaker:ap-southeast-2:666831318237:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "ap-northeast-2": "arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "ap-northeast-1": "arn:aws:sagemaker:ap-northeast-1:977537786026:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "ap-south-1": "arn:aws:sagemaker:ap-south-1:077584701553:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
    "sa-east-1": "arn:aws:sagemaker:sa-east-1:270155090741:model-package/mercury-07172025-03cb9fd1240a3c57a5de87fa31d146ab",
}

if region not in model_package_map:
    raise ValueError(f"Unsupported region: {region}")

package_arn = model_package_map[region]
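Because each ARN in the map embeds its region, a quick sanity check can confirm that the selected ARN matches the session region before deploying. The sketch below is an optional addition, assuming the ARN layout `arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>` seen in the map above; `ModelPackageArn`, `parse_model_package_arn`, and `check_arn_region` are hypothetical names introduced here.

```python
from typing import NamedTuple


class ModelPackageArn(NamedTuple):
    region: str
    account_id: str
    package_name: str


def parse_model_package_arn(arn: str) -> ModelPackageArn:
    """Split a SageMaker model package ARN into its region, account, and name."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"Not a SageMaker ARN: {arn!r}")
    resource_type, _, package_name = parts[5].partition("/")
    if resource_type != "model-package" or not package_name:
        raise ValueError(f"Not a model package ARN: {arn!r}")
    return ModelPackageArn(region=parts[3], account_id=parts[4], package_name=package_name)


def check_arn_region(arn: str, session_region: str) -> None:
    """Raise ValueError if the ARN's region differs from the session region."""
    arn_region = parse_model_package_arn(arn).region
    if arn_region != session_region:
        raise ValueError(
            f"ARN region {arn_region!r} does not match session region {session_region!r}"
        )


# Possible usage in the notebook:
# check_arn_region(package_arn, region)
```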

In [3]:
# Create the model package

endpoint_name = name_from_base("mercury-endpoint")  # set this to your liking
model = ModelPackage(role=role_arn, model_package_arn=package_arn, sagemaker_session=sagemaker_session)

In [4]:
# Deploy the model. This may take 5-10 minutes.

instance_type = "ml.p5.48xlarge"  # We only support ml.p5.48xlarge instances at the moment
start = perf_counter()
deployed_model = model.deploy(initial_instance_count=1, instance_type=instance_type, endpoint_name=endpoint_name)
print(f"\nDeployment took {perf_counter() - start:.2f} seconds")

-----------------!
Deployment took 542.79 seconds
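If you return to the notebook later, or the kernel restarts during deployment, you may want to confirm that the endpoint is up before sending requests. A minimal sketch, assuming a boto3 SageMaker client and its `describe_endpoint` call; the helper names `endpoint_status` and `is_in_service` are hypothetical.

```python
from typing import Any


def endpoint_status(sm_client: Any, endpoint_name: str) -> str:
    """Return the EndpointStatus string reported by DescribeEndpoint."""
    response = sm_client.describe_endpoint(EndpointName=endpoint_name)
    return response["EndpointStatus"]


def is_in_service(sm_client: Any, endpoint_name: str) -> bool:
    """True once the endpoint reports the 'InService' status."""
    return endpoint_status(sm_client, endpoint_name) == "InService"


# Possible usage in the notebook:
# sm_client = boto3.client("sagemaker", region_name=region)
# print(endpoint_status(sm_client, endpoint_name))
```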


## Step 3: Run real-time inference on the new endpoint

In [5]:
# Create a simple SageMaker predictor

predictor = sagemaker.Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sagemaker_session,
    serializer=sagemaker.serializers.JSONSerializer(),
    deserializer=sagemaker.deserializers.JSONDeserializer(),
)
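As an alternative to the SDK predictor, the endpoint can presumably also be called through the low-level `sagemaker-runtime` client, which is useful from applications that do not install the SageMaker SDK. The sketch below assumes the endpoint accepts and returns `application/json`, matching the JSON serializer and deserializer configured above; `invoke_json` is a hypothetical helper name.

```python
import json
from typing import Any, Dict


def invoke_json(runtime_client: Any, endpoint_name: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Send a JSON payload to a SageMaker endpoint and decode the JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload).encode("utf-8"),
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())


# Possible usage in the notebook:
# runtime = boto3.client("sagemaker-runtime", region_name=region)
# outputs = invoke_json(runtime, endpoint_name, {"messages": [...], "max_tokens": 2048})
```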

In [6]:
# Run inference

payload = {
    "messages": [
        {
            "role": "user",
            "content": "Write a pomodoro app backend in python",
        },
    ],
    "max_tokens": 2048,
}
start = perf_counter()
outputs = predictor.predict(payload)
elapsed = perf_counter() - start
print(f"Speed: {outputs['usage']['completion_tokens'] / elapsed:.2f} tokens / second\n")
print(outputs["choices"][0]["message"]["content"])

Speed: 589.37 tokens / second

Creating a Pomodoro app backend in Python involves setting up a simple server that can handle requests for starting, stopping, and tracking Pomodoro sessions. Below is a basic example using Flask, a lightweight web framework for Python. This example will include endpoints to start a Pomodoro session, stop it, and get the current status.

First, ensure you have Flask installed. You can install it using pip:

```bash
pip install Flask
```

Now, let's create the backend:

```python
from flask import Flask, jsonify
import time
import threading

app = Flask(__name__)

# Global variables to track the Pomodoro session
pomodoro_session = None
start_time = None
is_running = False

def start_pomodoro():
    global start_time, is_running
    start_time = time.time()
    is_running = True
    print("Pomodoro started.")
    # Simulate a 25-minute Pomodoro session
    time.sleep(25 * 60)
    stop_pomodoro()

def stop_pomodoro():
    global is_running
    is_running = F
```

## Step 4: Cleanup (optional)

In [7]:
# Delete the endpoint

sagemaker_session.delete_endpoint(endpoint_name)
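Deleting the endpoint alone may leave the associated model and endpoint configuration behind. A fuller cleanup could look roughly like this, assuming the `predictor` object created in Step 3 and the SageMaker SDK's `Predictor.delete_model` and `Predictor.delete_endpoint` methods; `cleanup_endpoint` is a hypothetical helper name.

```python
from typing import Any


def cleanup_endpoint(predictor: Any) -> None:
    """Delete the model, then the endpoint, for a deployed predictor."""
    # Delete the model first: it is looked up through the endpoint's
    # configuration, which is no longer reachable once the endpoint is gone.
    predictor.delete_model()
    # By default, delete_endpoint also removes the endpoint configuration.
    predictor.delete_endpoint()


# Possible usage in the notebook (instead of the call above):
# cleanup_endpoint(predictor)
```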