# Text Summarization using Amazon Bedrock Large Language Models

## Set AWS access credentials

In [1]:
import os
# Fill in your own values before running; do not commit real credentials.
os.environ["AWS_DEFAULT_REGION"] = ""  # e.g. "us-east-1"
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""

## Install Dependencies

In [2]:
!pip install -q --no-cache-dir -r requirements.txt

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
mlflow-cml-plugin 0.0.1 requires typing-extensions==4.1.1, but you have typing-extensions 4.8.0 which is incompatible.

## Imports

In [3]:
import json
import os
from typing import Optional
import boto3
from botocore.config import Config

## Set up Amazon Bedrock Client with boto3

In [4]:
def get_bedrock_client(
    assumed_role: Optional[str] = None,
    endpoint_url: Optional[str] = None,
    region: Optional[str] = None,
):
    """Create a boto3 client for Amazon Bedrock, with optional configuration overrides

    Parameters
    ----------
    assumed_role :
        Optional ARN of an AWS IAM role to assume for calling the Bedrock service. If not
        specified, the current active credentials will be used.
    endpoint_url :
        Optional override for the Bedrock service API Endpoint. If setting this, it should usually
        include the protocol i.e. "https://..."
    region :
        Optional name of the AWS Region in which the service should be called (e.g. "us-east-1").
        If not specified, AWS_REGION or AWS_DEFAULT_REGION environment variable will be used.
    """
    if region is None:
        target_region = os.environ.get("AWS_REGION", os.environ.get("AWS_DEFAULT_REGION"))
    else:
        target_region = region

    print(f"Create new client\n  Using region: {target_region}")
    session_kwargs = {"region_name": target_region}
    client_kwargs = {**session_kwargs}

    profile_name = os.environ.get("AWS_PROFILE")
    if profile_name:
        print(f"  Using profile: {profile_name}")
        session_kwargs["profile_name"] = profile_name

    retry_config = Config(
        region_name=target_region,
        retries={
            "max_attempts": 10,
            "mode": "standard",
        },
    )
    session = boto3.Session(**session_kwargs)

    if assumed_role:
        print(f"  Using role: {assumed_role}", end='')
        sts = session.client("sts")
        response = sts.assume_role(
            RoleArn=str(assumed_role),
            RoleSessionName="langchain-llm-1"
        )
        print(" ... successful!")
        client_kwargs["aws_access_key_id"] = response["Credentials"]["AccessKeyId"]
        client_kwargs["aws_secret_access_key"] = response["Credentials"]["SecretAccessKey"]
        client_kwargs["aws_session_token"] = response["Credentials"]["SessionToken"]

    if endpoint_url:
        client_kwargs["endpoint_url"] = endpoint_url

    bedrock_client = session.client(
        service_name="bedrock-runtime",
        config=retry_config,
        **client_kwargs
    )

    print("boto3 Bedrock client successfully created!")
    print(bedrock_client._endpoint)
    return bedrock_client


In [8]:
# Initializing the bedrock client using AWS credentials
# If you are using a special Assumed role or custom endpoint url, see get_bedrock_client
boto3_bedrock = get_bedrock_client(region=os.environ.get("AWS_DEFAULT_REGION"))

Create new client
  Using region: us-east-1
boto3 Bedrock client successfully created!
bedrock-runtime(https://bedrock-runtime.us-east-1.amazonaws.com)


## Set desired instruction
The Bedrock models shown in this notebook (Amazon's Titan and Anthropic's Claude) are both general-purpose, instruction-following text generation models. This means we can provide an instruction together with some input text, and the model generates a response that follows the instruction.

In [9]:
instruction_text = "Please provide a summary of the following text. Do not add any information that is not mentioned in the text below."

## Set desired input text
This is the text we want summarized. Its length, plus that of any included instructions, must fit within the context window of the selected model.

In [10]:
input_text = '''Machine learning has become one of the most critical capabilities for modern businesses to grow and stay competitive today. From automating internal processes to optimizing the design, creation, and marketing processes behind virtually every product consumed, ML models have permeated almost every aspect of our work and personal lives.

ML development is iterative and complex, made even harder because most ML tools aren’t built for the entire machine learning lifecycle. Cloudera Machine Learning on Cloudera Data Platform accelerates time-to-value by enabling data scientists to collaborate in a single unified platform that is all inclusive for powering any AI use case. Purpose-built for agile experimentation and production ML workflows, Cloudera Machine Learning manages everything from data preparation to MLOps, to predictive reporting. Solve mission critical ML challenges along the entire lifecycle with greater speed and agility to discover opportunities which can mean the difference for your business.

Each ML workspace enables teams of data scientists to develop, test, train, and ultimately deploy machine learning models for building predictive applications all on the data under management within the enterprise data cloud. ML workspaces support fully-containerized execution of Python, R, Scala, and Spark workloads through flexible and extensible engines.'''
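
Because the instruction and the input text together have to fit in the model's context window, a quick size check before calling the API can catch oversized inputs early. The sketch below is only a rough guard. The `estimate_tokens` and `fits_context` helpers are hypothetical, the four-characters-per-token ratio is a common rule of thumb rather than either model's real tokenizer, and the context window size is a made-up number that you would replace with the documented limit for your chosen model. Counting the requested output tokens against the window is a deliberately conservative assumption.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate (assumption: about 4 characters per token)."""
    return int(len(text) / chars_per_token) + 1


def fits_context(instruction: str, text: str, max_output_tokens: int,
                 context_window: int) -> bool:
    """Return True if the estimated prompt plus requested output fits the window."""
    needed = estimate_tokens(instruction) + estimate_tokens(text) + max_output_tokens
    return needed <= context_window


# Example with made-up numbers; substitute the real limit for your model.
print(fits_context("Summarize this.", "word " * 200,
                   max_output_tokens=512, context_window=4000))
```

If the check fails, the usual options are to shorten the input, lower the requested output length, or split the text and summarize it in pieces.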

## Creating Prompt for Titan model
The format of this engineered prompt is specific to the Titan model, with special tags like <text></text>. See the AWS Bedrock documentation for more details.

In [11]:
full_prompt = instruction_text + """\n<text>""" + input_text + """</text>"""
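
The tag-delimited prompt above can also be produced by a small helper, which makes it easier to reuse with different inputs. This is a minimal sketch: `build_titan_prompt` is a hypothetical name, and the check for a stray closing tag in the input is an added precaution of this example, not something the Bedrock API requires.

```python
def build_titan_prompt(instruction: str, text: str) -> str:
    """Place the instruction first, then the input wrapped in <text> tags."""
    if "</text>" in text:
        # A closing tag inside the input would end the delimited block early.
        raise ValueError("input text must not contain a literal </text> tag")
    return f"{instruction}\n<text>{text}</text>"


print(build_titan_prompt("Please summarize the following text.",
                         "Bedrock hosts several foundation models."))
```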

## Creating API request for Titan model
The parameters and format required for this API request are specific to the Titan model. See the AWS Bedrock documentation for more details.

In [12]:
body = json.dumps({
    "inputText": full_prompt,
    "textGenerationConfig": {
        "maxTokenCount": 4096,
        "stopSequences": [],
        "temperature": 0.60,
        "topP": 1,
    },
})

## Titan Inference API Call

In [13]:
modelId = 'amazon.titan-tg1-large'
response = boto3_bedrock.invoke_model(body=body, modelId=modelId, accept='application/json', contentType='application/json')
response_body = json.loads(response.get('body').read())
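
The call above follows a pattern that does not depend on the model: send a JSON body, then read and decode the streamed response body. A possible way to factor that out is sketched below. `invoke_json` is a hypothetical helper, and `FakeBedrockClient` is a stand-in invented here so the example runs without AWS access. It is not part of boto3, and its canned payload is illustrative only.

```python
import io
import json


def invoke_json(client, model_id: str, body: str) -> dict:
    """Call invoke_model and decode the JSON payload from the response body."""
    response = client.invoke_model(
        body=body,
        modelId=model_id,
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())


class FakeBedrockClient:
    """Offline stand-in for testing only; mimics the shape used above."""

    def invoke_model(self, **kwargs):
        payload = json.dumps({"results": [{"outputText": "stub summary"}]})
        return {"body": io.BytesIO(payload.encode("utf-8"))}


print(invoke_json(FakeBedrockClient(), "amazon.titan-tg1-large", "{}"))
```

With a real client, the first argument would be the `boto3_bedrock` object created earlier.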

The response body format is specific to the Titan model API. See the AWS Bedrock documentation for more details.

In [14]:
result = response_body.get('results')[0].get('outputText')
print(result)


Modern companies need machine learning to develop and compete, and Cloudera Machine Learning on Cloudera Data Platform speeds up time-to-value by enabling data scientists to work together in a single, integrated environment. It handles everything from data preparation to MLOps, to predictive reporting, and enables fully-containerized execution of Python, R, Scala, and Spark workloads.
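
The chained `.get('results')[0]` lookup raises a fairly opaque error if the response has no results. A slightly more defensive version might look like the sketch below. `extract_titan_text` is a hypothetical helper that relies only on the `results` and `outputText` keys used in this notebook. Stripping the surrounding whitespace is a choice of this example.

```python
def extract_titan_text(response_body: dict) -> str:
    """Return the first generated text from a Titan-style response body."""
    results = response_body.get("results") or []
    if not results:
        raise ValueError("response contained no results")
    return results[0].get("outputText", "").strip()


sample = {"results": [{"outputText": "\nA short summary.\n"}]}
print(extract_titan_text(sample))
```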


## Creating Prompt for Claude model
The format of this engineered prompt is specific to the Claude model, with Human/Assistant turn markers and special tags like <text></text>. See the AWS Bedrock documentation for more details.

In [15]:
# Claude's text-completion format expects "\n\nHuman:" and "\n\nAssistant:" turn markers.
full_prompt = "\n\nHuman: " + instruction_text + "\n<text>" + input_text + "</text>\n\nAssistant:"

## Creating API request for Claude model
The parameters and format required for this API request are specific to the Claude model. See the AWS Bedrock documentation for more details.

In [16]:
body = json.dumps({
    "prompt": full_prompt,
    "max_tokens_to_sample": 4096,
    "temperature": 0.6,
    "top_k": 250,
    "top_p": 1.0,
    "stop_sequences": [],
})
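
The prompt wrapping and request body for Claude can be combined into one helper, as sketched below. `build_claude_body` is a hypothetical name and its defaults are illustrative. The field names are the ones used in the request above, and the blank-line-prefixed `Human:` and `Assistant:` markers follow Anthropic's documented text-completion prompt format.

```python
import json


def build_claude_body(instruction: str, text: str, max_tokens: int = 512,
                      temperature: float = 0.6) -> str:
    """Wrap the text in Human/Assistant turns and serialize the request."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    prompt = f"\n\nHuman: {instruction}\n<text>{text}</text>\n\nAssistant:"
    return json.dumps({
        "prompt": prompt,
        "max_tokens_to_sample": max_tokens,
        "temperature": temperature,
        "top_k": 250,
        "top_p": 1.0,
        "stop_sequences": [],
    })


body = build_claude_body("Summarize the text.", "Bedrock hosts several models.")
print(json.loads(body)["prompt"])
```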

## Claude Inference API Call

In [17]:
modelId = 'anthropic.claude-v2'
response = boto3_bedrock.invoke_model(body=body, modelId=modelId, accept='application/json', contentType='application/json')
response_body = json.loads(response.get('body').read())

The response body format is specific to the Claude model API. See the AWS Bedrock documentation for more details.

In [18]:
result = response_body.get('completion')
print(result)

 Here is a summary of the key points from the text:

- Machine learning has become critical for modern businesses to stay competitive by automating processes and optimizing products. 

- ML development is complex and iterative. Most ML tools aren't built for the entire lifecycle. 

- Cloudera Machine Learning accelerates time-to-value by enabling collaboration on a unified platform for any AI use case. It manages data preparation, MLOps, and reporting.

- ML workspaces in Cloudera let data science teams develop, test, train, and deploy models to build predictive apps using Python, R, Scala, Spark, etc.
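
When the model answers with a bulleted list, as it did here, the completion can be split into individual points for further processing. The following is a sketch only. `completion_to_bullets` is a hypothetical helper, and it assumes each point starts with `- ` on its own line, which the model is not guaranteed to do on every run.

```python
from typing import List


def completion_to_bullets(completion: str) -> List[str]:
    """Collect the text of every line that starts with '- '."""
    bullets = []
    for line in completion.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            bullets.append(stripped[2:].strip())
    return bullets


sample = " Here is a summary:\n\n- First point. \n\n- Second point.\n"
for point in completion_to_bullets(sample):
    print(point)
```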
