# Deploy cohere-gpt-medium Model Package from AWS Marketplace 


Cohere builds a collection of Large Language Models (LLMs) trained on a massive corpus of curated web data. Our infrastructure lets these models be deployed for a wide range of use cases, including generation (copywriting, etc.), summarization, classification, content moderation, information extraction, semantic search, and contextual entity extraction.

This sample notebook shows you how to deploy [cohere-gpt-medium](https://aws.amazon.com/marketplace/pp/prodview-6dmzzso5vu5my) using Amazon SageMaker.

> **Note**: This is a reference notebook; it cannot run until you make the changes suggested in the notebook.

## Prerequisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has **AmazonSageMakerFullAccess**.
1. To deploy this ML model successfully, ensure that:
    1. Either your IAM role has these three permissions and you have authority to make AWS Marketplace subscriptions in the AWS account used: 
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**  
    2. or your AWS account has a subscription to [cohere-gpt-medium](https://aws.amazon.com/marketplace/pp/prodview-6dmzzso5vu5my). If so, skip step: [Subscribe to the model package](#1.-Subscribe-to-the-model-package)
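
The three AWS Marketplace permissions listed above could be granted with an inline IAM policy along the following lines. This is a minimal sketch: the variable name `marketplace_policy` and the `"Resource": "*"` entry are assumptions made for illustration, so adapt the policy to your organization's rules before attaching it to a role.

```python
import json

# Hypothetical inline policy granting the three AWS Marketplace actions
# named in the prerequisites. "Resource": "*" is an assumption.
marketplace_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "aws-marketplace:ViewSubscriptions",
                "aws-marketplace:Unsubscribe",
                "aws-marketplace:Subscribe",
            ],
            "Resource": "*",
        }
    ],
}

# Serialize to the JSON text that the IAM console or API expects.
policy_json = json.dumps(marketplace_policy, indent=2)
print(policy_json)
```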

## Contents:
1. [Subscribe to the model package](#1.-Subscribe-to-the-model-package)
2. [Create an endpoint and perform real-time inference](#2.-Create-an-endpoint-and-perform-real-time-inference)
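   1. [Create an endpoint](#A.-Create-an-endpoint)
   2. [Create input payload](#B.-Create-input-payload)
   3. [Perform real-time inference](#C.-Perform-real-time-inference)
   4. [Visualize output](#D.-Visualize-output)
   5. [Writing a blog post with co.generate](#E.-Writing-a-blog-post-with-co.generate)
   6. [Entity Extraction using co.generate](#F.-Entity-Extraction-using-co.generate)
   7. [Article Summarization using co.generate](#G.-Article-Summarization-using-co.generate)
   8. [Delete the endpoint](#H.-Delete-the-endpoint)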
3. [Perform batch inference](#3.-Perform-batch-inference) 
4. [Clean-up](#4.-Clean-up)
    1. [Delete the model](#A.-Delete-the-model)
    2. [Unsubscribe to the listing (optional)](#B.-Unsubscribe-to-the-listing-(optional))
    

## Usage instructions
You can run this notebook one cell at a time (press Shift+Enter to run a cell).

## 1. Subscribe to the model package

To subscribe to the model package:
1. Open the model package listing page: [cohere-gpt-medium](https://aws.amazon.com/marketplace/pp/prodview-6dmzzso5vu5my).
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **"Accept Offer"** if you and your organization agree with the EULA, pricing, and support terms.
1. Once you click the **Continue to configuration** button and choose a **region**, you will see a **Product Arn** displayed. This is the model package ARN that you need to specify when creating a deployable model using Boto3. Copy the ARN corresponding to your region and specify it in the following cell.

In [None]:
model_package_arn = "<Customer to specify Model package ARN corresponding to their AWS region>"
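
Because the ARN is region-specific, it can help to check it before deploying. The sketch below relies only on the general ARN layout (`arn:partition:service:region:account:resource`); the helper name `model_package_region` and the sample ARN are hypothetical and not part of the listing.

```python
def model_package_region(arn: str) -> str:
    """Return the region embedded in a SageMaker model package ARN.

    Raises ValueError if the string is still the placeholder or is not
    shaped like arn:<partition>:sagemaker:<region>:<account>:model-package/...
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"Not a SageMaker ARN: {arn!r}")
    if not parts[5].startswith("model-package/"):
        raise ValueError(f"Not a model package ARN: {arn!r}")
    return parts[3]


# Hypothetical ARN used only to show the expected shape.
example_arn = "arn:aws:sagemaker:us-east-1:123456789012:model-package/example-package"
print(model_package_region(example_arn))
```

In the notebook, the returned region could be compared with `boto3.Session().region_name` to catch an ARN copied for the wrong region.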

In [None]:
!pip install cohere-sagemaker

from cohere_sagemaker import Client, CohereError

In [None]:
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role
import boto3
import numpy as np

In [None]:
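# IAM role that SageMaker assumes when creating the model and endpoint
role = get_execution_role()

# Session wrapper used by the SageMaker Python SDK
sagemaker_session = sage.Session()

# Default S3 bucket for this session, plus a low-level runtime client for invoking endpoints
bucket = sagemaker_session.default_bucket()
runtime = boto3.client("runtime.sagemaker")
bucket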

In [None]:
# Client bound to the endpoint name that is used when the model is deployed in section 2
co = Client(endpoint_name='cohere-gpt-medium')

## 2. Create an endpoint and perform real-time inference

If you want to understand how real-time inference with Amazon SageMaker works, see [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html).

In [None]:
model_name = "cohere-gpt-medium"

content_type = "application/json"

real_time_inference_instance_type = (
    "ml.g5.xlarge"
)

### A. Create an endpoint

In [None]:
# create a deployable model from the model package.
model = ModelPackage(
    role=role, model_package_arn=model_package_arn, sagemaker_session=sagemaker_session
)

# Deploy the model
predictor = model.deploy(1, real_time_inference_instance_type, endpoint_name=model_name)

Once the endpoint has been created, you can perform real-time inference.
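
`model.deploy` waits for the endpoint by default, but if a deployment is started without waiting, the endpoint status can be polled. The following is a minimal sketch, assuming a Boto3 SageMaker client is passed in as `sm_client`; the helper name `wait_until_in_service` and its polling defaults are hypothetical.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, max_polls=60):
    """Poll describe_endpoint until the endpoint is InService.

    sm_client is expected to behave like boto3.client("sagemaker").
    """
    for _ in range(max_polls):
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        # Still Creating or Updating: wait before asking again.
        time.sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} not InService after {max_polls} polls")
```

In the notebook this could be called as `wait_until_in_service(boto3.client("sagemaker"), model_name)`. Boto3 also ships a built-in `endpoint_in_service` waiter that serves the same purpose.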

### B. Create input payload

In [None]:
prompt = """
Prompt: Given a product and keywords, this program will generate exciting product descriptions. Here are some examples:
Product: Monitor
Keywords: curved, gaming
Exciting Product Description: When it comes to serious gaming, every moment counts. This curved gaming monitor delivers the unprecedented immersion you need to play your best.
--
Product: Surfboard
Keywords: 6", matte finish
Exciting Product Description: This 6" surfboard is designed for fun. It’s a board that almost anyone can pick up, ride, and get psyched on.
--
Product: Headphones
Keywords: bluetooth, lightweight
Exciting Product Description:
"""

### C. Perform real-time inference

In [None]:
response = co.generate(prompt=prompt, max_tokens=50, temperature=0.8)
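
For readers who prefer the low-level runtime client, the same request can be sketched with `invoke_endpoint`. The JSON request fields (`prompt`, `max_tokens`, `temperature`) mirror the keyword arguments of `co.generate` above; that the endpoint accepts exactly this schema is an assumption, and the helper name `generate_via_runtime` is hypothetical.

```python
import json


def generate_via_runtime(runtime_client, endpoint_name, prompt,
                         max_tokens=50, temperature=0.8):
    """Send a generation request through a SageMaker runtime client.

    Assumption: the endpoint accepts a JSON body with these three fields.
    """
    body = json.dumps(
        {"prompt": prompt, "max_tokens": max_tokens, "temperature": temperature}
    )
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=body,
    )
    # The response body is a stream; read it fully and parse the JSON.
    return json.loads(response["Body"].read())
```

With the objects defined earlier in the notebook, a call might look like `generate_via_runtime(runtime, model_name, prompt)`.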

### D. Visualize output

In [None]:
print(response.generations[0].text)
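
Because the few-shot prompt separates examples with `--`, the model may continue past the first description and start a new example. A small post-processing step can keep only the first completion. This is a sketch; the helper name `first_completion` and the sample text are hypothetical, not output from the model.

```python
def first_completion(text: str, separator: str = "--") -> str:
    """Return the text before the first line that consists only of the separator."""
    kept = []
    for line in text.splitlines():
        if line.strip() == separator:
            break
        kept.append(line)
    return "\n".join(kept).strip()


# Made-up text shaped like a continuation of the prompt above.
sample = " Light and wireless.\n--\nProduct: Kettle\nKeywords: fast"
print(first_completion(sample))
```

In the notebook, this could be applied as `first_completion(response.generations[0].text)`.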

### E. Writing a blog post with co.generate

In [None]:
prompt="""This program will generate an introductory paragraph to a blog post given a blog title, audience, and tone of voice.
--
Blog Title: Best Activities in Toronto
Audience: Millennials
Tone of Voice: Lighthearted
First Paragraph: Looking for fun things to do in Toronto? When it comes to exploring Canada's largest city, there's an ever-evolving set of activities to choose from. Whether you're looking to visit a local museum or sample the city's varied cuisine, there is plenty to fill any itinerary. In this blog post, I'll share some of my favorite recommendations.
--
Blog Title: Mastering Dynamic Programming
Audience: Developers
Tone of Voice: Informative
First Paragraph: In this piece, we'll help you understand the fundamentals of dynamic programming, and when to apply this optimization technique. We'll break down bottom-up and top-down approaches to solve dynamic programming problems.
--
Blog Title: How to Get Started with Rock Climbing
Audience: Athletes
Tone of Voice: Enthusiastic
First Paragraph: """

response = co.generate(prompt=prompt, max_tokens=100, temperature=0.8)

print(response.generations[0].text)


### F. Entity Extraction using co.generate

In [None]:
prompt="""This program will extract relevant information from contracts. Here are some examples:

Contract: This influencer Marketing Agreement (“Agreement”) dated on the 23 day of August, 2022 (the “Effective Date”) is made between Oren & Co (the “Influencer”) and Brand Capital (the “Company”) regarding. The Company will compensate the Influencer with five thousand dollars ($5000.00) for the overall Services rendered. This Agreement is effective upon its signing until July 31, 2023, when the final LinkedIn post is uploaded and all Services and compensation are exchanged.

Extracted Text:
Influencer: Oren & Co
Company: Brand Capital
--
Contract: This Music Recording Agreement ("Agreement") is made effective as of the 13 day of December, 2021 by and between Good Kid, a Toronto-based musical group (“Artist”) and Universal Music Group, a record label with license number 545345 (“Recording Label"). Artist and Recording Label may each be referred to in this Agreement individually as a "Party" and collectively as the "Parties." Work under this Agreement shall begin on March 15, 2022.

Extracted Text:"""

response = co.generate(prompt=prompt, max_tokens=20, temperature=0.5)

print(response.generations[0].text)
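
The extracted text follows a `Label: value` layout, so it can be turned into a dictionary for downstream use. The parser below is a minimal sketch; the helper name `parse_extracted_fields` is hypothetical, and the sample string is made up to match the prompt's format rather than taken from real model output.

```python
def parse_extracted_fields(text: str) -> dict:
    """Parse 'Label: value' lines into a dict, stopping at a '--' separator line."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "--":
            break  # the model has started a new example
        if ":" not in line:
            continue  # skip blank or malformed lines
        label, value = line.split(":", 1)
        label, value = label.strip(), value.strip()
        if label and value:
            fields[label] = value
    return fields


# Made-up text in the same layout as the prompt's "Extracted Text" section.
sample = "\nArtist: Good Kid\nRecording Label: Universal Music Group\n--\nContract: ..."
print(parse_extracted_fields(sample))
```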

### G. Article Summarization using co.generate

In [None]:
prompt="""Passage: Is Wordle getting tougher to solve? Players seem to be convinced that the game has gotten harder in recent weeks ever since The New York Times bought it from developer Josh Wardle in late January. The Times has come forward and shared that this likely isn’t the case. That said, the NYT did mess with the back end code a bit, removing some offensive and sexual language, as well as some obscure words. There is a viral thread claiming that a confirmation bias was at play. One Twitter user went so far as to claim the game has gone to “the dusty section of the dictionary” to find its latest words.

TLDR: Wordle has not gotten more difficult to solve.
--
Passage: ArtificialIvan, a seven-year-old, London-based payment and expense management software company, has raised $190 million in Series C funding led by ARG Global, with participation from D9 Capital Group and Boulder Capital. Earlier backers also joined the round, including Hilton Group, Roxanne Capital, Paved Roads Ventures, Brook Partners, and Plato Capital.

TLDR: ArtificialIvan has raised $190 million in Series C funding.
--
Passage: The National Weather Service announced Tuesday that a freeze warning is in effect for the Bay Area, with freezing temperatures expected in these areas overnight. Temperatures could fall into the mid-20s to low 30s in some areas. In anticipation of the hard freeze, the weather service warns people to take action now.

TLDR: """

response = co.generate(prompt=prompt, max_tokens=50, temperature=0.8)

print(response.generations[0].text)
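
The prompts in this section all share one pattern: worked examples separated by `--`, followed by an unfinished final example. That pattern can be assembled programmatically. The following is a sketch under that assumption; the helper name `build_few_shot_prompt` and its parameters are hypothetical.

```python
def build_few_shot_prompt(examples, query, input_label="Passage",
                          output_label="TLDR", separator="--"):
    """Join (input, output) example pairs and a final open query into one prompt."""
    blocks = [
        f"{input_label}: {source}\n\n{output_label}: {target}"
        for source, target in examples
    ]
    # The last block leaves the output empty for the model to complete.
    blocks.append(f"{input_label}: {query}\n\n{output_label}: ")
    return f"\n{separator}\n".join(blocks)


demo_prompt = build_few_shot_prompt(
    examples=[("A long passage.", "A short summary.")],
    query="Another long passage.",
)
print(demo_prompt)
```

The resulting string can be passed to `co.generate(prompt=...)` in the same way as the hand-written prompts.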

### H. Delete the endpoint

Now that you have successfully performed real-time inference, you no longer need the endpoint. Delete it to avoid further charges.

In [None]:
model.sagemaker_session.delete_endpoint(model_name)
model.sagemaker_session.delete_endpoint_config(model_name)
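
The cell above assumes the endpoint configuration has the same name as the endpoint. If that is not certain, the configuration name can be looked up first. The sketch below assumes a Boto3 SageMaker client is passed in as `sm_client`; the helper name `delete_endpoint_and_config` is hypothetical.

```python
def delete_endpoint_and_config(sm_client, endpoint_name):
    """Delete an endpoint and the endpoint configuration it references.

    sm_client is expected to behave like boto3.client("sagemaker").
    """
    # Look up the configuration name before the endpoint disappears.
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name
```

In the notebook this could be called as `delete_endpoint_and_config(boto3.client("sagemaker"), model_name)`.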

## 4. Clean-up

### A. Delete the model

In [None]:
model.delete_model()

### B. Unsubscribe to the listing (optional)

If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on the [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust).
2. Locate the listing whose subscription you want to cancel, and then choose __Cancel Subscription__.

