# Run inference on the custom provisioned model and the base model via Amazon Bedrock and compare the outputs

## Prepare the inference request

Let's prepare an inference request to send to the fine-tuned custom model provisioned through Amazon Bedrock. The model processes the input text and returns a generated summary; the `textGenerationConfig` parameters control how that summary is sampled and how long it can be.

In [None]:
import json
import boto3

# Initialize Bedrock Runtime client in the specified region
bedrockRuntime = boto3.client(service_name='bedrock-runtime', region_name='us-west-2')

# Sample request body containing text for summarization and parameters for model inference
body = json.dumps({
    "inputText": "Summarize the following:   TOKYO–January 19, 2024–Today, Amazon Web Services (AWS) announced its plans to invest 2.26 trillion yen into its existing cloud infrastructure in Tokyo and Osaka by 2027 to meet growing customer demand for cloud services in Japan. According to the new AWS Economic Impact Study (EIS) for Japan, this planned investment is estimated to contribute 5.57 trillion yen to Japan’s Gross Domestic Product (GDP), and support an estimated average of 30,500 full-time equivalent (FTE) jobs in local Japanese businesses each year. Having already invested 1.51 trillion yen in Japan from 2011 to 2022, AWS’s planned total investment into cloud infrastructure in the country by 2027 will be approximately 3.77 trillion yen. Hundreds of thousands of active customers use the two AWS Regions in Japan to digitally transform (DX) their businesses. AWS opened its first office in Japan in 2009 and launched the AWS Asia Pacific (Tokyo) Region in 2011, and the AWS Asia Pacific (Osaka) Region in 2021. As demand for cloud services to drive the government’s DX agenda grew in Japan, AWS invested 1.51 trillion yen between 2011 and 2022 to construct, connect, operate, and maintain AWS data centers. This is estimated to have contributed 1.46 trillion yen to Japan’s GDP and supported more than 7,100 FTE jobs. These positions, including construction, facility maintenance, engineering, telecommunications, and other jobs within the country’s broader economy, are part of the AWS data center supply chain in Japan.",
    "textGenerationConfig": {
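        "temperature": 0.01,   # near-zero temperature keeps the output close to deterministic
        "topP": 0.99,          # nucleus sampling cutoff
        "maxTokenCount": 300   # upper bound on the number of generated tokens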
    }
})

# Specify content types for request and response
accept = 'application/json'
contentType = 'application/json'
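
The request body above can also be built by a small helper, which makes it easier to try other input texts or generation settings. This is a minimal sketch: `build_titan_request` is a hypothetical helper name that is not part of boto3 or Bedrock, and its defaults simply mirror the values used in this notebook.

```python
import json

def build_titan_request(prompt, temperature=0.01, top_p=0.99, max_token_count=300):
    """Return a JSON request body shaped like the one used in this notebook.

    Hypothetical helper: the defaults mirror the notebook's values.
    """
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "temperature": temperature,
            "topP": top_p,
            "maxTokenCount": max_token_count,
        },
    })

# Placeholder prompt; substitute the text you want summarized
example_body = build_titan_request("Summarize the following: <your text here>")
print(example_body)
```

The returned string can be passed as the `body` argument of `invoke_model` in the cells that follow.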


## Invoke custom model with the provided parameters

In [None]:
# Using the provisionedModelArn from the previous notebook
provisionedModelArn = "<Update the value from the previous notebook>"

Now we make an inference call to the custom model that was provisioned through Amazon Bedrock, passing the provisioned model ARN as the `modelId`. The response body is read and parsed as JSON, and the generated text is printed.

In [None]:
# Invoke the custom model endpoint
response = bedrockRuntime.invoke_model(body=body, modelId=provisionedModelArn, accept=accept, contentType=contentType)

# Parse and print the output from the custom model
response_body_custom = json.loads(response.get('body').read())
print("Custom Model Output:")
print(response_body_custom['results'][0]['outputText'])
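
Indexing `['results'][0]['outputText']` directly raises a bare `IndexError` or `KeyError` if the response has no results. A slightly more defensive way to read the parsed body could look like the sketch below. `extract_output_text` is a hypothetical helper, the sample dictionary is made-up data, and the `completionReason` field is an assumption about the response shape that should be checked against the actual response, which is why it is read with `.get`.

```python
def extract_output_text(response_body):
    """Return (output_text, completion_reason) from a parsed response body.

    Hypothetical helper; completion_reason is None when the field is absent.
    """
    results = response_body.get("results") or []
    if not results:
        raise ValueError("response body contains no results")
    first = results[0]
    return first["outputText"].strip(), first.get("completionReason")

# Made-up sample body for illustration only, not real model output
sample_body = {
    "results": [
        {"outputText": "\nAWS plans to expand its cloud infrastructure in Japan.",
         "completionReason": "FINISH"}
    ]
}

text, reason = extract_output_text(sample_body)
print(text)
print(reason)
```

In the notebook, the same helper could be called as `extract_output_text(response_body_custom)`.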

## Invoke the base model with the same parameters

Let's make an inference call to the base model with the same request body, and therefore the same `temperature`, `topP`, and `maxTokenCount` settings. Because the input and parameters are identical, any difference between the two outputs can be attributed to fine-tuning.

In [None]:
# Base model to use
basemodelId = 'amazon.titan-text-express-v1'

# Invoke the base model endpoint
response = bedrockRuntime.invoke_model(body=body, modelId=basemodelId, accept=accept, contentType=contentType)

# Parse and print the output from the base model
response_body_base = json.loads(response.get('body').read())
print("\n")
print("Base Model Output:")
print(response_body_base['results'][0]['outputText'])

## Notice that the output from the custom model (fine-tuned on summarization data) is more concise and reads as a better summary than the base model output
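
To put a rough number on "more concise", the lengths of the two outputs can be compared. The sketch below is only an illustration: `length_stats` is a hypothetical helper, the two strings are made-up placeholders rather than real model output, and word count is a crude proxy that says nothing about summary quality.

```python
def length_stats(label, text):
    """Return simple size measures for one model output (hypothetical helper)."""
    return {
        "model": label,
        "characters": len(text),
        "words": len(text.split()),
    }

# Made-up placeholder strings; in the notebook, use the two outputText values instead
custom_text = "AWS plans to invest 2.26 trillion yen in Japan by 2027."
base_text = (
    "Amazon Web Services announced that it plans to invest 2.26 trillion yen "
    "into its existing cloud infrastructure in Tokyo and Osaka by 2027."
)

for stats in (length_stats("custom", custom_text), length_stats("base", base_text)):
    print(stats)
```

In the notebook, `custom_text` and `base_text` would be `response_body_custom['results'][0]['outputText']` and `response_body_base['results'][0]['outputText']`.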