# Lab 9 - Deploying an auto start/stop Amazon SageMaker Foundation Model endpoint backed by an API Gateway/Lambda

In this notebook, we will run through examples of managing your SageMaker endpoint with the endpoint manager functionality.

We will also walk through an example of how to interact with the API Gateway endpoint secured by a Lambda authorizer.

In [None]:
import requests
import json

Set your API gateway URL and auth token

`https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/prod/`

`'Authorization': '<YOUR TOKEN>'`

In [None]:
# Strip any trailing slash so the f"{url}/..." paths used below do not contain "//"
url = 'https://<YOUR API GATEWAY ENDPOINT NAME>.execute-api.us-east-1.amazonaws.com/prod/'.rstrip('/')
headers = {
    'Content-Type': 'application/json',
    'Accept': 'application/json',
    'Authorization': '<YOUR TOKEN>'
}
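For context, the `Authorization` header set above is the value a token-based Lambda authorizer inspects before API Gateway forwards the request. The lab's actual authorizer is not shown in this notebook, so the following is only a minimal sketch of the general pattern. `EXPECTED_TOKEN` and the `lab-user` principal are hypothetical placeholders, and a real deployment would load the token from a secret store rather than hard-code it.

```python
import hmac

# Hypothetical placeholder; not the token used by this lab's deployment.
EXPECTED_TOKEN = "replace-me"


def build_policy(principal_id, effect, resource):
    """Build the IAM policy document API Gateway expects from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resource,
                }
            ],
        },
    }


def lambda_handler(event, context):
    """Allow the call only when the supplied token matches the expected one."""
    token = event.get("authorizationToken", "")
    # Constant-time comparison avoids leaking token contents through timing.
    matches = hmac.compare_digest(token.encode(), EXPECTED_TOKEN.encode())
    effect = "Allow" if matches else "Deny"
    return build_policy("lab-user", effect, event["methodArn"])
```

With a TOKEN-type authorizer, API Gateway passes the header value as `authorizationToken` and the ARN of the invoked method as `methodArn`. A `Deny` policy causes the request to be rejected before it reaches the backend Lambda.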

### Real-time Endpoint Management Functions
When you create your endpoint for the first time, it is initialized with the default provision time in minutes. You can check the time left on your endpoint either by querying a specific endpoint or by listing all managed endpoints.

#### Querying time left for a specific endpoint

You can check the time left for a specific endpoint by calling the `endpoint-expiry` API, which is part of the endpoint manager functionality, and passing in the `EndpointName`. Below is an example of how to do this:

In [None]:
# Set the endpoint you would like to look up
endpoint_name = 'demo-Falcon40B-Endpoint'

In [None]:
response = requests.get(url=f"{url}/endpoint-expiry?EndpointName={endpoint_name}", headers=headers, timeout=60)
print(json.dumps(response.json(), indent=2))
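The cell above calls `response.json()` directly, which hides the HTTP status and raises an unhelpful error if the body is not JSON. As a sketch, a small helper (the name `parse_api_response` is hypothetical) can check the status code first and include part of the body in the error message:

```python
import json


def parse_api_response(status_code, body_text):
    """Return the decoded JSON body, or raise RuntimeError with some context."""
    if not 200 <= status_code < 300:
        # Include a short excerpt of the body to help diagnose auth or routing errors.
        raise RuntimeError(f"Request failed with HTTP {status_code}: {body_text[:200]}")
    try:
        return json.loads(body_text)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"Expected a JSON body but got: {body_text[:200]}") from exc
```

It can be used with any of the calls in this notebook as `parse_api_response(response.status_code, response.text)`.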

#### Querying time left for all managed endpoints

You can also get a list of managed endpoints and their respective time left. This uses the same `endpoint-expiry` API, called without the `EndpointName` query parameter.

In [None]:
response = requests.get(url=f"{url}/endpoint-expiry", headers=headers, timeout=60)
print(json.dumps(response.json(), indent=2))

#### Extending your real-time endpoint expiry time
The managed endpoint API also lets you extend the expiry time. Below is an example of how you can extend an endpoint by 90 minutes.

In [None]:
payload = {
    "EndpointName": endpoint_name,
    "minutes": 90
}
response = requests.post(url=f"{url}/endpoint-expiry", headers=headers, json=payload, timeout=60)
print(json.dumps(response.json(), indent=2))

#### Adding a new real-time endpoint
With the endpoint management API, you can also add a new real-time endpoint with pre-existing Amazon SageMaker endpoint configurations.

Note: You can use the endpoint manager for any model, whether or not it comes from SageMaker JumpStart, as long as you have a defined Amazon SageMaker endpoint configuration.

In [None]:
new_endpoint_name=""
new_endpoint_config_name=""

In [None]:
payload = {
    "EndpointName": new_endpoint_name,
    "EndpointConfigName": new_endpoint_config_name,
    "minutes": 30
}
response = requests.post(url=f"{url}/endpoint-expiry", headers=headers, json=payload, timeout=60)
print(json.dumps(response.json(), indent=2))
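Because `new_endpoint_name` and `new_endpoint_config_name` are left blank above for you to fill in, it is easy to post an empty value by accident. A minimal sketch of a payload builder that catches this before the request is sent is shown below. The function name `build_endpoint_payload` is hypothetical; the field names are the ones used in the cells above.

```python
def build_endpoint_payload(endpoint_name, minutes, endpoint_config_name=None):
    """Build an endpoint-expiry payload, rejecting blank names and bad durations."""
    if not endpoint_name or not endpoint_name.strip():
        raise ValueError("EndpointName must not be empty")
    if not isinstance(minutes, int) or minutes <= 0:
        raise ValueError("minutes must be a positive integer")

    payload = {"EndpointName": endpoint_name, "minutes": minutes}

    # EndpointConfigName is only included when creating a new endpoint.
    if endpoint_config_name is not None:
        if not endpoint_config_name.strip():
            raise ValueError("EndpointConfigName must not be empty")
        payload["EndpointConfigName"] = endpoint_config_name
    return payload
```

The returned dictionary can be passed as `json=` to `requests.post` exactly as in the cells above.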

### Interacting with your endpoint via API gateway

With the deployed API Gateway and model Lambda, you can interact with your Amazon SageMaker endpoint over the internet via API Gateway. Below is an example of how to send a request payload to the Falcon model.

In [None]:
payload = {
    "inputs": "Write a program to compute factorial in python:",
    "parameters": {"max_new_tokens": 200}
}
response = requests.post(url=f"{url}/falcon", headers=headers, json=payload, timeout=60)
print(json.dumps(response.json(), indent=2))
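The exact shape of the response depends on how the model Lambda returns the SageMaker output, which this notebook does not show. Assuming it passes through the common text-generation format of a list containing a `generated_text` field (an assumption, not something confirmed by this lab), the generated string could be pulled out with a helper like this hypothetical `extract_generated_text`:

```python
def extract_generated_text(body):
    """Return the generated string from an assumed text-generation response."""
    # Assumed shape: [{"generated_text": "..."}] or {"generated_text": "..."}.
    if isinstance(body, list) and body:
        body = body[0]
    if isinstance(body, dict) and "generated_text" in body:
        return body["generated_text"]
    raise ValueError(f"Unexpected response shape: {body!r}")
```

For example, `print(extract_generated_text(response.json()))` would print only the model output. If a `ValueError` is raised, inspect the full JSON printed above and adjust the helper to the actual shape.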

#### Interacting with your endpoint via LangChain/API Gateway

The deployed API can be used with the [Amazon API Gateway/Langchain](https://python.langchain.com/docs/ecosystem/integrations/amazon_api_gateway) integration.

The following example walks you through how to interact with the API Gateway backed by a Lambda authorizer.

In [None]:
# Install dependencies
!pip install git+https://github.com/sunbc0120/langchain.git

In [None]:
from langchain.llms import AmazonAPIGateway
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType

In [None]:
llm = AmazonAPIGateway(api_url=f"{url}/falcon", headers=headers)

### Langchain LLM example

In [None]:
parameters = {
    "max_new_tokens": 100,
    "num_return_sequences": 1,
    "top_k": 50,
    "top_p": 0.95,
    "do_sample": False,
    "return_full_text": True,
    "temperature": 0.2,
}

prompt = "what day comes after Friday?"
llm.model_kwargs = parameters
llm(prompt)

### Langchain/APIGateway Agent Example

In [None]:
parameters = {
    "max_new_tokens": 50,
    "num_return_sequences": 1,
    "top_k": 250,
    "top_p": 0.25,
    "do_sample": False,
    "temperature": 0.1,
}

llm.model_kwargs = parameters

# Next, let's load some tools to use. Note that the `llm-math` tool uses an LLM, so we need to pass that in.
tools = load_tools(["python_repl", "llm-math"], llm=llm)

# Finally, let's initialize an agent with the tools, the language model, and the type of agent we want to use.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

# Now let's test it out!
agent.run("""
Write a Python script that prints "Hello, world!"
""")