## Cohere Medium SageMaker JumpStart Deployment

[AWS Marketplace Subscription](https://aws.amazon.com/marketplace/pp/prodview-6dmzzso5vu5my)

This example is adapted from the following notebook in the Cohere SageMaker repository: https://github.com/cohere-ai/cohere-sagemaker/blob/main/notebooks/Deploy%20command%20medium.ipynb.

### Setup

In [None]:
!pip install cohere-sagemaker --quiet

In [4]:
from cohere_sagemaker import Client
import boto3

In [6]:
# Only us-east-1 and eu-west-1 are currently supported
model_package_map = {
    "us-east-1": "arn:aws:sagemaker:us-east-1:865070037744:model-package/cohere-gpt-medium-v1-5-15e34931a06235b7bac32dca396a970a",
    "eu-west-1": "arn:aws:sagemaker:eu-west-1:985815980388:model-package/cohere-gpt-medium-v1-5-15e34931a06235b7bac32dca396a970a",
}

region = boto3.Session().region_name  # None if no default region is configured
if region not in model_package_map:
    raise ValueError(
        f"Current boto3 session region {region!r} is not supported; "
        f"expected one of {sorted(model_package_map)}."
    )

model_package_arn = model_package_map[region]

### Instantiate Client and Endpoint

The Cohere SageMaker SDK wraps existing SageMaker constructs in a `Client` object. Calling `create_endpoint` on the client deploys the model package behind a REST endpoint, which you can then invoke for inference.

In [7]:
# instantiate client
co = Client(region_name=region)

In [None]:
co.create_endpoint(arn=model_package_arn, endpoint_name="cohere-gpt-medium", instance_type="ml.g5.xlarge", n_instances=1)

In [8]:
# If the endpoint already exists, skip create_endpoint and connect to it instead

#co.connect_to_endpoint(endpoint_name="cohere-gpt-medium")
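
The choice between creating and connecting can also be made programmatically. The following is a minimal sketch, not part of the Cohere SDK: `endpoint_exists` and `connect_or_create` are hypothetical helpers, and `sagemaker_client` is assumed to be a boto3 SageMaker client (for example `boto3.client("sagemaker", region_name=region)`). It relies on `describe_endpoint` raising a `ClientError` when the endpoint is missing, and for simplicity it treats every `ClientError` as "not found", which a production version would want to narrow.

```python
def endpoint_exists(sagemaker_client, endpoint_name):
    """Return True if describe_endpoint succeeds for endpoint_name."""
    try:
        sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    except sagemaker_client.exceptions.ClientError:
        # Simplification: any client error (including permission errors)
        # is treated as "endpoint not found".
        return False
    return True


def connect_or_create(co, sagemaker_client, arn, endpoint_name,
                      instance_type="ml.g5.xlarge", n_instances=1):
    """Connect to an existing endpoint, or create it if it is missing.

    `co` is the Cohere SageMaker Client used elsewhere in this notebook.
    Returns "connected" or "created" so the caller can log what happened.
    """
    if endpoint_exists(sagemaker_client, endpoint_name):
        co.connect_to_endpoint(endpoint_name=endpoint_name)
        return "connected"
    co.create_endpoint(
        arn=arn,
        endpoint_name=endpoint_name,
        instance_type=instance_type,
        n_instances=n_instances,
    )
    return "created"
```

With the objects defined in this notebook, a call would look like `connect_or_create(co, boto3.client("sagemaker", region_name=region), model_package_arn, "cohere-gpt-medium")`.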

### Sample Inference

In [10]:
prompt = "Write a LinkedIn post about starting a career in tech:"

In [None]:
response = co.generate(prompt=prompt, max_tokens=100, temperature=0, return_likelihoods='GENERATION')
print(response.generations[0].text)

### Test Temperature Parameter

In [None]:
# Sweep temperature over 0, 1, 2, 3, 4 with everything else held fixed
for temp in range(5):
    response = co.generate(prompt=prompt, max_tokens=100, temperature=temp, return_likelihoods='GENERATION')
    print("-----------------------------------")
    print(response.generations[0].text)
    print("-----------------------------------")
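
For intuition about why the outputs diverge as temperature rises, here is a toy illustration. It assumes the common formulation in which logits are divided by the temperature before a softmax; the sampling procedure actually used behind the endpoint is not documented in this notebook, so treat this as a sketch of the general idea. The function name and the logits are made up for the example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all mass on the top score
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens
toy_logits = [2.0, 1.0, 0.1]
for temp in [0, 0.5, 1, 2, 4]:
    probs = softmax_with_temperature(toy_logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Higher temperatures flatten the distribution, so less likely tokens are sampled more often and the generated text varies more between runs.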

### Test Max Tokens Parameter

In [None]:
token_range = [100, 200, 300, 400, 500]

for max_tokens in token_range:
    response = co.generate(prompt=prompt, max_tokens=max_tokens, temperature=0.9, return_likelihoods='GENERATION')
    print("-----------------------------------")
    print(response.generations[0].text)
    print("-----------------------------------")

### Test Combinations of Max Tokens and Temperature

In [None]:
import itertools

temperature = [0, 1, 2, 3, 4, 5]
# Every (max_tokens, temperature) pair drawn from the two lists
params = [token_range, temperature]
param_combos = list(itertools.product(*params))

In [None]:
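for max_tokens, temp in param_combos:
    response = co.generate(prompt=prompt, max_tokens=max_tokens, temperature=temp, return_likelihoods='GENERATION')
    print(f"--- max_tokens={max_tokens}, temperature={temp} ---")
    print(response.generations[0].text)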
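
To compare runs side by side, it can help to keep each generation together with the parameters that produced it. The sketch below is one possible way to do that. `collect_generations` and its `generate_fn` argument are hypothetical names introduced for this example, not part of the Cohere SDK; `generate_fn` is assumed to accept `prompt`, `max_tokens`, and `temperature` keywords and return the generated text.

```python
import itertools


def collect_generations(generate_fn, prompt, token_range, temperatures):
    """Run generate_fn over every (max_tokens, temperature) pair.

    Returns a list of dicts, one per combination, so results can be
    inspected, filtered, or loaded into a DataFrame afterwards.
    """
    results = []
    for max_tokens, temp in itertools.product(token_range, temperatures):
        text = generate_fn(prompt=prompt, max_tokens=max_tokens, temperature=temp)
        results.append(
            {"max_tokens": max_tokens, "temperature": temp, "text": text}
        )
    return results
```

With the client from this notebook, `generate_fn` could be a small wrapper such as `lambda **kw: co.generate(**kw).generations[0].text`.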

### Cleanup

In [9]:
# Uncomment when finished: the endpoint accrues charges for as long as it is running
# co.delete_endpoint()
# co.close()
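
If the endpoint is only needed for the duration of a script, cleanup can be tied to a context manager so that it runs even when inference raises an error. This is a minimal sketch under that assumption: `temporary_endpoint` is a hypothetical helper, and it only uses the `create_endpoint`, `delete_endpoint`, and `close` calls already shown in this notebook.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(co, **create_kwargs):
    """Create an endpoint, yield the client, and always clean up afterwards.

    `co` is the Cohere SageMaker Client; `create_kwargs` are passed straight
    through to co.create_endpoint (arn, endpoint_name, instance_type, ...).
    """
    co.create_endpoint(**create_kwargs)
    try:
        yield co
    finally:
        # Runs on normal exit and on exceptions raised inside the with-block
        co.delete_endpoint()
        co.close()
```

Inference calls would then go inside a `with temporary_endpoint(co, arn=model_package_arn, endpoint_name="cohere-gpt-medium", instance_type="ml.g5.xlarge", n_instances=1) as client:` block.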