# Deploy cohere-summarize Model Package from AWS Marketplace 


Cohere builds a collection of Large Language Models (LLMs) trained on a massive corpus of curated web data. Our infrastructure lets these models be deployed for a wide range of use cases, including generation (copywriting, etc.), summarization, classification, content moderation, information extraction, semantic search, and contextual entity extraction.

This sample notebook shows you how to deploy [cohere-summarize](https://aws.amazon.com/marketplace/pp/prodview-44fhlriqdvlb4) using Amazon SageMaker.

> **Note**: This is a reference notebook; it will not run until you make the changes suggested in it.

## Pre-requisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open it from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio. The conda_python3 kernel should work.
1. Ensure that the IAM role you use has **AmazonSageMakerFullAccess**.
1. To deploy this ML model successfully, ensure that:
    1. Either your IAM role has these three permissions and you have authority to make AWS Marketplace subscriptions in the AWS account used: 
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**  
    2. or your AWS account already has a subscription to [cohere-summarize](https://aws.amazon.com/marketplace/pp/prodview-6dmzzso5vu5my). If so, skip the step [Subscribe to the model package](#1.-Subscribe-to-the-model-package).
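
The three Marketplace permissions listed above could be granted through an IAM policy along the following lines. This is only a sketch: the action names come from the list above, while the helper name `build_marketplace_policy` and the choice of `"Resource": "*"` are assumptions made for illustration, so adapt them to your organization's policies.

```python
import json

# Action names are taken from the prerequisites above.
MARKETPLACE_ACTIONS = [
    "aws-marketplace:ViewSubscriptions",
    "aws-marketplace:Unsubscribe",
    "aws-marketplace:Subscribe",
]


def build_marketplace_policy(actions=MARKETPLACE_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions.

    "Resource": "*" is an assumption made for this sketch.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": "*"}
        ],
    }


# Serialize the policy so it can be pasted into the IAM console or a template.
policy_json = json.dumps(build_marketplace_policy(), indent=2)
print(policy_json)
```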

## Contents:
1. [Subscribe to the model package](#1.-Subscribe-to-the-model-package)
2. [Create an endpoint and perform real-time inference](#2.-Create-an-endpoint-and-perform-real-time-inference)
   1. [Create an endpoint](#A.-Create-an-endpoint)
   2. [Create input payload](#B.-Create-input-payload)
   3. [Perform real-time inference](#C.-Perform-real-time-inference)
   4. [Visualize output](#D.-Visualize-output)
   5. [Optional arguments](#E.-Optional-Arguments)
3. [Clean-up](#4.-Clean-up)
    1. [Delete the endpoint](#A.-Delete-the-endpoint)
    2. [Unsubscribe to the listing (optional)](#B.-Unsubscribe-to-the-listing-(optional))
    

## Usage instructions
You can run this notebook one cell at a time (press Shift+Enter to run a cell).

## 1. Subscribe to the model package

To subscribe to the model package:
1. Open the model package listing page [cohere-summarize](https://aws.amazon.com/marketplace/pp/prodview-6dmzzso5vu5my)
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **Accept Offer** if you and your organization agree with the EULA, pricing, and support terms.
1. Once you click the **Continue to configuration** button and choose a **region**, a **Product Arn** is displayed. This is the model package ARN that you need to specify when creating a deployable model using Boto3. Copy the ARN corresponding to your region and specify it in the following cell.
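
Before pasting a copied ARN into the notebook, it can help to check that it has the expected shape. The sketch below splits a model package ARN into its region, account ID, and package name. The helper `parse_model_package_arn` and the `ModelPackageArn` tuple are hypothetical names introduced here; the example ARN is the us-east-1 value used in this notebook.

```python
from typing import NamedTuple


class ModelPackageArn(NamedTuple):
    region: str
    account_id: str
    package_name: str


def parse_model_package_arn(arn: str) -> ModelPackageArn:
    """Split a SageMaker model package ARN into its parts.

    Expected layout: arn:<partition>:sagemaker:<region>:<account>:model-package/<name>
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"Not a SageMaker ARN: {arn!r}")
    resource_type, _, package_name = parts[5].partition("/")
    if resource_type != "model-package" or not package_name:
        raise ValueError(f"Not a model package ARN: {arn!r}")
    return ModelPackageArn(region=parts[3], account_id=parts[4], package_name=package_name)


example_arn = (
    "arn:aws:sagemaker:us-east-1:865070037744:"
    "model-package/cohere-summarize-v0-1-cf86758bcbdc31ada3cf5b88116913df"
)
print(parse_model_package_arn(example_arn))
```

Comparing the parsed `region` with `boto3.Session().region_name` is one way to catch an ARN copied for the wrong region.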

In [None]:
!pip install --upgrade cohere-sagemaker
# if you upgrade the package, you need to restart the kernel

from cohere_sagemaker import Client
import boto3

In [None]:
cohere_package = "cohere-summarize-v0-1-cf86758bcbdc31ada3cf5b88116913df"

# Mapping for Model Packages
model_package_map = {
    "us-east-1": f"arn:aws:sagemaker:us-east-1:865070037744:model-package/{cohere_package}",
    "us-east-2": f"arn:aws:sagemaker:us-east-2:057799348421:model-package/{cohere_package}",
    "us-west-1": f"arn:aws:sagemaker:us-west-1:382657785993:model-package/{cohere_package}",
    "us-west-2": f"arn:aws:sagemaker:us-west-2:594846645681:model-package/{cohere_package}",
    "ca-central-1": f"arn:aws:sagemaker:ca-central-1:470592106596:model-package/{cohere_package}",
    "eu-central-1": f"arn:aws:sagemaker:eu-central-1:446921602837:model-package/{cohere_package}",
    "eu-west-1": f"arn:aws:sagemaker:eu-west-1:985815980388:model-package/{cohere_package}",
    "eu-west-2": f"arn:aws:sagemaker:eu-west-2:856760150666:model-package/{cohere_package}",
    "eu-west-3": f"arn:aws:sagemaker:eu-west-3:843114510376:model-package/{cohere_package}",
    "eu-north-1": f"arn:aws:sagemaker:eu-north-1:136758871317:model-package/{cohere_package}",
    "ap-southeast-1": f"arn:aws:sagemaker:ap-southeast-1:192199979996:model-package/{cohere_package}",
    "ap-southeast-2": f"arn:aws:sagemaker:ap-southeast-2:666831318237:model-package/{cohere_package}",
    "ap-northeast-2": f"arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/{cohere_package}",
    "ap-northeast-1": f"arn:aws:sagemaker:ap-northeast-1:977537786026:model-package/{cohere_package}",
    "ap-south-1": f"arn:aws:sagemaker:ap-south-1:077584701553:model-package/{cohere_package}",
    "sa-east-1": f"arn:aws:sagemaker:sa-east-1:270155090741:model-package/{cohere_package}",
}

region = boto3.Session().region_name
if region not in model_package_map:
    raise Exception(f"Current boto3 session region {region} is not supported.")

model_package_arn = model_package_map[region]

## 2. Create an endpoint and perform real-time inference

If you want to understand how real-time inference with Amazon SageMaker works, see [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html).

### A. Create an endpoint

In [None]:
co = Client(region_name=region)
co.create_endpoint(arn=model_package_arn, endpoint_name="cohere-summarize", instance_type="ml.p4d.24xlarge", n_instances=1)

# If the endpoint is already created, you just need to connect to it
# co.connect_to_endpoint(endpoint_name="cohere-summarize")

Once the endpoint has been created, you can perform real-time inference.
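
If you connect to an endpoint that was created elsewhere, you may want to confirm that it is in service before sending requests. The following is a minimal sketch, assuming `sm_client` behaves like `boto3.client("sagemaker")`, whose `describe_endpoint` call returns an `EndpointStatus` field. The helper name `wait_until_in_service` and its polling defaults are assumptions, not part of the cohere-sagemaker package.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30,
                          max_polls=60, sleep=time.sleep):
    """Poll describe_endpoint until the endpoint reports InService.

    sm_client is expected to behave like boto3.client("sagemaker").
    Raises RuntimeError if the endpoint fails, TimeoutError if it never
    becomes ready within max_polls attempts.
    """
    for _ in range(max_polls):
        response = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        sleep(poll_seconds)  # still Creating/Updating; wait and retry
    raise TimeoutError(f"Endpoint {endpoint_name} not InService after {max_polls} polls")
```

With real credentials this could be called as `wait_until_in_service(boto3.client("sagemaker"), "cohere-summarize")`. Boto3 also ships a built-in waiter for this, `get_waiter("endpoint_in_service")`, which may be preferable in production code.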

### B. Create input payload

In [None]:
# Sample input text, split into two adjacent string literals so the cell parses.
text = (
    "Ice cream is a frozen dessert typically made from milk or cream that has been flavoured with a sweetener, either sugar or an alternative, and a spice, such as cocoa or vanilla, or with fruit, such as strawberries or peaches. Food colouring is sometimes added in addition to stabilizers. The mixture is cooled below the freezing point of water and stirred to incorporate air spaces and prevent detectable ice crystals from forming. It can also be made by whisking a flavoured cream base and liquid nitrogen together. The result is a smooth, semi-solid foam that is solid at very low temperatures (below 2 °C or 35 °F). It becomes more malleable as its temperature increases. Ice cream may be served in dishes, eaten with a spoon, or licked from edible wafer ice cream cones held by the hands as finger food. Ice cream may be served with other desserts—such as cake or pie—or used as an ingredient in cold dishes—like ice cream floats, sundaes, milkshakes, and ice cream cakes—or in baked items such as Baked Alaska. Italian ice cream is gelato. Frozen custard is a type of rich ice cream. Soft serve is softer and is often served at amusement parks and fast-food restaurants in America. Ice creams made from cow's milk alternatives, such as goat's or sheep's milk, or milk substitutes (e.g., soy, cashew, coconut, almond milk, or tofu), are available for those who are lactose intolerant, allergic to dairy protein, or vegan. Banana 'nice cream'[a] is a 100% fruit-based vegan alternative. Frozen yoghurt, or 'froyo', is similar to ice cream but uses yoghurt and can be lower in fat. Fruity sorbets or sherbets are not ice creams but are often available in ice cream shops. The meaning of the name 'ice cream' varies from one country to another. "
    "In some countries, such as the United States, 'ice cream' applies only to a specific variety, and most governments regulate the commercial use of the various terms according to the relative quantities of the main ingredients, notably the amount of cream.[1] Products that do not meet the criteria to be called ice cream are sometimes labelled 'frozen dairy dessert' instead.[2] In other countries, such as Italy and Argentina, one word is used for all variants."
)

### C. Perform real-time inference

In [None]:
response = co.summarize(text=text)

### D. Visualize output

In [None]:
print(response)
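
Long summaries can be hard to read when printed on a single line. The sketch below wraps text to a fixed width while keeping existing line breaks, which matters for bullet-style output. It assumes the summary is available as a plain string (for example via `str(response)`); the helper name `wrap_summary` is hypothetical.

```python
import textwrap


def wrap_summary(summary_text, width=80):
    """Wrap each line of summary_text to at most `width` characters.

    Existing line breaks are preserved so bullet lists stay one item per line.
    """
    lines = summary_text.splitlines() or [""]
    return "\n".join(textwrap.fill(line, width=width) for line in lines)


sample = "- Ice cream is a frozen dessert made from milk or cream.\n- It is served in dishes or cones."
print(wrap_summary(sample, width=40))
```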

### E. Optional Arguments

In [None]:
response = co.summarize(text=text, length="short", format_="bullets", extractiveness="low", temperature=0, additional_command="focusing on the next steps")
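
To see how the optional arguments change the output, one approach is to run the same text through several argument combinations and compare the results side by side. The sketch below takes the summarize function as a parameter, so `co.summarize` can be passed in directly. The helper `compare_settings` and the preset names in `SETTINGS` are hypothetical; every argument value used is taken from the cell above.

```python
# Hypothetical presets; each argument value also appears in the cell above.
SETTINGS = {
    "default": {},
    "short_bullets": {"length": "short", "format_": "bullets"},
    "short_bullets_low": {
        "length": "short",
        "format_": "bullets",
        "extractiveness": "low",
    },
}


def compare_settings(summarize, text, settings=SETTINGS):
    """Call summarize(text=text, **kwargs) once per preset and collect results by name."""
    results = {}
    for name, kwargs in settings.items():
        results[name] = summarize(text=text, **kwargs)
    return results
```

With a live endpoint, `compare_settings(co.summarize, text)` would return one response per preset. Each call is a separate inference request.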

## 4. Clean-up

### A. Delete the endpoint

Now that you have performed real-time inference, you no longer need the endpoint. Delete it to avoid further charges.

In [None]:
co.delete_endpoint()
co.close()
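
Because a running endpoint keeps accruing charges, it can be worth making sure the clean-up runs even when an earlier cell raises an error. The following is a minimal sketch of that idea using a context manager. The name `temporary_endpoint` is hypothetical, and the only assumption about `client` is that it exposes the `delete_endpoint()` and `close()` methods used above.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(client):
    """Yield the client, then always delete its endpoint and close it.

    The clean-up runs even if the body of the `with` block raises.
    """
    try:
        yield client
    finally:
        try:
            client.delete_endpoint()
        finally:
            client.close()  # close the client even if deletion fails
```

For example, `with temporary_endpoint(co) as client: print(client.summarize(text=text))` would delete the endpoint as soon as the block exits.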

### B. Unsubscribe to the listing (optional)

If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust)
2. Locate the listing that you want to cancel the subscription for, and then choose __Cancel Subscription__ to cancel the subscription.
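
The check for leftover deployable models can also be done programmatically. The sketch below lists SageMaker models and reports those whose containers reference a given model package. It assumes `sm_client` behaves like `boto3.client("sagemaker")`, using its `list_models` paginator and `describe_model` call; the helper name `models_using_package` and the substring match on the package name are assumptions made for illustration.

```python
def models_using_package(sm_client, package_name_fragment):
    """Return names of SageMaker models whose containers reference the package.

    sm_client is expected to behave like boto3.client("sagemaker").
    A model matches if any container's ModelPackageName contains the fragment.
    """
    matches = []
    paginator = sm_client.get_paginator("list_models")
    for page in paginator.paginate():
        for model in page["Models"]:
            name = model["ModelName"]
            description = sm_client.describe_model(ModelName=name)
            # A model may define a single PrimaryContainer or a Containers list.
            containers = list(description.get("Containers", []))
            if "PrimaryContainer" in description:
                containers.append(description["PrimaryContainer"])
            if any(package_name_fragment in c.get("ModelPackageName", "")
                   for c in containers):
                matches.append(name)
    return matches
```

With real credentials, `models_using_package(boto3.client("sagemaker"), "cohere-summarize")` returning an empty list would suggest no models from this package remain in the current region.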

