# Bedrock Additional Features - Streaming support & Retry 

This notebook covers two additional Bedrock features and controls: streaming responses and the retry mechanism.

(This notebook was tested on a SageMaker Studio ml.m5.2xlarge instance with the Data Science 3.0 kernel.)

## Pre-requisites

In [2]:
# Check that the Python version is 3.8 or later, which LangChain requires (only needed if you use LangChain)
import sys
sys.version

'3.10.6 (main, Oct  7 2022, 20:19:58) [GCC 11.2.0]'

## Install SDK
We will download the latest version of the SDK and unzip the archive into a local folder.

In [None]:
!wget https://d2eo22ngex1n9g.cloudfront.net/Documentation/SDK/bedrock-python-sdk.zip -P bedrock_docs/

In [None]:
!unzip -o bedrock_docs/bedrock-python-sdk.zip -d bedrock_docs/SDK-1-28

### Uninstall previous version of SDK

In [None]:
!python3 -m pip uninstall boto3 -y
!python3 -m pip uninstall botocore -y

In [None]:
!python3 -m pip install bedrock_docs/SDK-1-28/boto3-1.28.21-py3-none-any.whl
!python3 -m pip install bedrock_docs/SDK-1-28/botocore-1.31.21-py3-none-any.whl

## Install Dependencies

In [None]:
!pip install langchain --upgrade

## Restart Kernel

In [None]:
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)  

In [3]:
# After the restart, confirm again that the Python version is 3.8 or later, which LangChain requires
import sys
sys.version

'3.10.6 (main, Oct  7 2022, 20:19:58) [GCC 11.2.0]'

In [4]:
assert sys.version_info >= (3, 8)

In [5]:
import langchain

In [6]:
langchain.__version__

'0.0.279'

In [7]:
import sagemaker
import boto3
session = boto3.Session()
sagemaker_session = sagemaker.Session()
studio_region = sagemaker_session.boto_region_name 
#sagemaker_session.get_caller_identity_arn()

In [8]:
import json

#Create Bedrock client
bedrock = boto3.client('bedrock' , 'us-east-1', endpoint_url='https://bedrock.us-east-1.amazonaws.com')
prompt_data = """Command: Write me a blog about making strong business decisions as a leader.\nBlog:"""

## Streaming Response
Bedrock provides streaming inference for models that support streaming. To run inference with streaming, use the InvokeModelWithResponseStream operation.

In [9]:
from IPython.display import display, display_markdown, Markdown, clear_output

body = json.dumps({"prompt": prompt_data, "max_tokens_to_sample": 200})
modelId = "anthropic.claude-instant-v1"  
accept = "application/json"
contentType = "application/json"

response = bedrock.invoke_model_with_response_stream(body=body, modelId=modelId, accept=accept, contentType=contentType)
stream = response.get('body')
output = []

if stream:
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            chunk_obj = json.loads(chunk.get('bytes').decode())
            text = chunk_obj['completion']
            clear_output(wait=True)
            output.append(text)
            display_markdown(Markdown(''.join(output)))


Business leaders make many decisions every day that shape and affect their organizations. However, some decisions end up being more critical than others, setting the path for success or leading to challenges down the road. Making strong business decisions requires a thoughtful approach, considering all available options and their potential consequences. Here are some strategies for making decisive yet well-reasoned calls as a leader:
Research thoroughly. Do not rely on first impressions or quick analysis alone. Gather as many relevant facts, data, and expert opinions as possible before deciding on a course of action. Evaluate all possible options methodically, weighing pros and cons in an objective manner. This thorough research helps ensure important considerations are not missed.
Consult with others. Seek input from employees, customers, industry peers, and experts. Multiple viewpoints can reveal insights and angles that one person may not think of. Discussing options with stakeholders also helps explain the rationale behind choices, building understanding and buy-in. Just be careful not to premature
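
The chunk-handling loop above can also be factored into a small generator, which makes it easier to reuse and to test without calling Bedrock. The following is a minimal sketch, assuming each chunk carries a JSON payload with a `completion` key as in the Anthropic example above; other model providers may use a different payload shape. The helper name `iter_completion_text` and the `fake_stream` list are hypothetical and exist only for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_completion_text(stream: Iterable[dict]) -> Iterator[str]:
    """Yield the 'completion' text carried by each chunk event in the stream."""
    for event in stream:
        chunk = event.get("chunk")
        if not chunk:
            continue  # skip events that carry no chunk payload
        payload = json.loads(chunk["bytes"].decode("utf-8"))
        yield payload.get("completion", "")


# Hypothetical stand-in for response.get('body'), used only for illustration.
fake_stream = [
    {"chunk": {"bytes": json.dumps({"completion": "Business "}).encode("utf-8")}},
    {"other": {}},  # an event without a chunk is ignored
    {"chunk": {"bytes": json.dumps({"completion": "leaders"}).encode("utf-8")}},
]

full_text = "".join(iter_completion_text(fake_stream))
print(full_text)
```

In the notebook, the same generator could be fed the `stream` object returned by `invoke_model_with_response_stream`, with each yielded piece appended to `output` before re-rendering.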

## Bedrock boto3 client API calls with retry
The code snippet below shows how to implement retries with the Botocore configuration. For more details, check the [documentation](https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html).

**total_max_attempts**: An integer representing the maximum number of total attempts that will be made on a single request. This includes the initial request, so a value of 1 indicates that no requests will be retried.

**mode**: Possible values are legacy, standard, and adaptive.

- legacy - The pre-existing retry behavior.
- standard - The standardized set of retry rules. This will also default to 3 max attempts unless overridden.
- adaptive - Retries with additional client side throttling.
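
To make the meaning of `total_max_attempts` concrete, here is a minimal, standard-library-only sketch of a retry loop in which the initial call counts as the first attempt. It is an illustration of the counting rule only, not Botocore's actual retry implementation; `TransientError`, `call_with_retries`, and the backoff values are hypothetical names and choices introduced for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable error such as throttling."""


def call_with_retries(
    operation: Callable[[], T],
    total_max_attempts: int = 3,
    base_delay: float = 0.0,
) -> T:
    """Call operation up to total_max_attempts times, counting the first call."""
    if total_max_attempts < 1:
        raise ValueError("total_max_attempts must be at least 1")
    for attempt in range(1, total_max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == total_max_attempts:
                raise  # attempts exhausted: surface the last error
            # Exponential backoff with jitter; illustrative values only.
            time.sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky() -> str:
    """Fail twice with a retryable error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("throttled")
    return "ok"


result = call_with_retries(flaky, total_max_attempts=3)
print(result, calls["count"])
```

With `total_max_attempts=1` the same helper makes exactly one call and re-raises the error, which mirrors the "a value of 1 indicates that no requests will be retried" rule described above.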


In [10]:
import boto3
from botocore.config import Config

config = Config(
   retries = {
      'total_max_attempts': 10, # Total number of attempts, including the initial request
      'mode': 'standard' # legacy, standard, adaptive
   }
)

#Create Bedrock client
bedrock = boto3.client('bedrock' , 'us-east-1', endpoint_url='https://bedrock.us-east-1.amazonaws.com',config=config)
bedrock.list_foundation_models()

{'ResponseMetadata': {'RequestId': 'aee0c94a-572c-49b4-af73-b75cacfedc4c',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Fri, 01 Sep 2023 20:16:52 GMT',
   'content-type': 'application/json',
   'content-length': '1166',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'aee0c94a-572c-49b4-af73-b75cacfedc4c'},
  'RetryAttempts': 0},
 'modelSummaries': [{'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-tg1-large',
   'modelId': 'amazon.titan-tg1-large'},
  {'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-e1t-medium',
   'modelId': 'amazon.titan-e1t-medium'},
  {'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/stability.stable-diffusion-xl',
   'modelId': 'stability.stable-diffusion-xl'},
  {'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/ai21.j2-grande-instruct',
   'modelId': 'ai21.j2-grande-instruct'},
  {'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/ai21.j2-jumbo-instruct',
   'modelId': 'ai21.j2-jumbo-i
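
The `ResponseMetadata` block in the output above includes a `RetryAttempts` field, which should show how many retries the client made before the call succeeded (0 here, meaning the first attempt went through). A minimal sketch for reading it defensively follows; the helper name `retries_used` and the `sample_response` dictionary are hypothetical and only mimic the shape of the output shown above.

```python
def retries_used(response: dict) -> int:
    """Return the RetryAttempts value from a Boto3-style response, or 0 if absent."""
    return response.get("ResponseMetadata", {}).get("RetryAttempts", 0)


# Hypothetical response with the same shape as the output above.
sample_response = {
    "ResponseMetadata": {"HTTPStatusCode": 200, "RetryAttempts": 0},
    "modelSummaries": [],
}

print(retries_used(sample_response))
```

Logging this value per call is one simple way to see whether throttling is occurring even when requests eventually succeed.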

## Retry option when using Langchain
You can pass a Boto3 client configured with retry attempts when creating the Bedrock LLM with LangChain.

In [11]:
import boto3
from botocore.config import Config
from langchain.llms.bedrock import Bedrock

config = Config(
   retries = {
      'total_max_attempts': 10, # Total number of attempts, including the initial request
      'mode': 'standard' # legacy, standard, adaptive
   }
)

#Create Bedrock client
bedrock = boto3.client('bedrock' , 'us-east-1', endpoint_url='https://bedrock.us-east-1.amazonaws.com',config=config)

#Pass the client to create Bedrock
llm = Bedrock(
        client=bedrock,
        model_id="amazon.titan-tg1-large",
        model_kwargs={"temperature": 0.5, "maxTokenCount": 100}
    )

llm(prompt_data)

" Making strong business decisions as a leader requires a combination of strategic thinking, careful analysis, and insight into the ever-changing market and industry trends. In this blog post, we'll explore some key factors that leaders should consider when making important business decisions.\nUnderstand the Business Objective:\nBefore making any business decision, it's crucial to have a clear understanding of the organization's overall objectives and the specific goals that the decision is intended to achieve. This ensures that the decision aligns with the company's vision"

## Submitting a batch with Retry attempts
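You can submit a batch of requests using the client configured with retry attempts. If a request still fails after all attempts, the wrapper below raises a ThrottlingException for that prompt.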

In [18]:
from concurrent.futures import ThreadPoolExecutor, as_completed

class ThrottlingException(Exception):
    "Raised when Langchain llm gets retry error"
    pass

def bedrock_llm_call(prompt):
    try:
        return llm(prompt)
    except ValueError as ex:
        print(f"Received value error. Details {ex}")
        # Chain the original error so the root cause stays visible in the traceback
        raise ThrottlingException(f"Retry error encountered. Failed to process data {prompt}.") from ex

In [19]:
batch_prompt_data = []

prompt_template = """Command: Write me a blog about making strong business decisions as a leader. Limit your response to {n} sentence(s). \nBlog:"""

for i in range(1,6):
    batch_prompt_data.append(prompt_template.format(n=i))

with ThreadPoolExecutor(max_workers=3) as executor:
    #Create KV pairs
    submitted_items = {executor.submit(bedrock_llm_call, d): d for d in batch_prompt_data}

    for i, future in enumerate(as_completed(submitted_items)):
        d = submitted_items[future]
        try:
            resp = future.result()
            print(f"{i +1}: Prompt: {d} Response: {resp}")
        except ThrottlingException as ex:
            print(f"Received Throttling exception: {ex}")

1: Prompt: Command: Write me a blog about making strong business decisions as a leader. Limit your response to 1 sentence(s). 
Blog: Response:  Making strong business decisions as a leader requires a combination of strategic thinking, careful analysis, and insight into market trends and customer needs.
2: Prompt: Command: Write me a blog about making strong business decisions as a leader. Limit your response to 2 sentence(s). 
Blog: Response:  Making strong business decisions as a leader requires a combination of strategic thinking, careful analysis, and insight into market trends. Leaders must be able to weigh the potential risks and benefits of different options and make a decision that aligns with the overall goals and mission of the organization. In addition, effective decision-making also requires effective communication with stakeholders and the ability to adapt to changing circumstances.
3: Prompt: Command: Write me a blog about making strong business decisions as a leader. Limi
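
Because `as_completed` yields futures in completion order, the printed responses may not follow the order of the submitted prompts. The following is a minimal sketch of one way to collect results in input order while still recording per-prompt failures. The helper `run_batch` and the stand-in `fake_llm` are hypothetical; in the notebook, `bedrock_llm_call` would take the place of `fake_llm`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List, Optional, Tuple

Result = Tuple[str, Optional[str], Optional[Exception]]


def run_batch(call: Callable[[str], str], prompts: List[str], max_workers: int = 3) -> List[Result]:
    """Run call(prompt) concurrently and return (prompt, response, error) in input order."""
    results: List[Optional[Result]] = [None] * len(prompts)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the index of the prompt that produced it.
        futures = {executor.submit(call, p): i for i, p in enumerate(prompts)}
        for future in as_completed(futures):
            i = futures[future]
            try:
                results[i] = (prompts[i], future.result(), None)
            except Exception as ex:  # keep the failure alongside its prompt
                results[i] = (prompts[i], None, ex)
    return [r for r in results if r is not None]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; fails for prompts containing 'fail'."""
    if "fail" in prompt:
        raise RuntimeError("simulated throttling")
    return prompt.upper()


batch = run_batch(fake_llm, ["one", "fail", "three"])
for prompt, response, error in batch:
    print(prompt, response, error)
```

Keeping the error next to its prompt makes it straightforward to resubmit only the failed prompts in a follow-up batch.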