# Meta Llama - Text generation with inference parameters

Another text generation LLM available in Amazon Bedrock is Llama. In this example, I use this model to generate text.

Note: Even though some inference parameters apply to most LLMs (temperature, topP, topK, stop-sequence, max_token_count), not all models follow the same inference syntax or request/response structure. Refer to the documentation for the correct inference parameters, response messages, and syntax.

Reference documentation - https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html
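To make the note above concrete, the sketch below builds request bodies for two Bedrock text models from the same settings. The Llama field names match the request format shown in Step 3. The Titan Text field names (`inputText`, `textGenerationConfig`, `maxTokenCount`, `topP`) reflect my understanding of the Amazon Titan Text documentation and should be verified there before use. Both helper names are hypothetical.

```python
import json

def llama_body(prompt, temperature, top_p, max_tokens):
    """Request body in the Meta Llama shape: flat keys, snake_case names."""
    return json.dumps({
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "max_gen_len": max_tokens,
    })

def titan_text_body(prompt, temperature, top_p, max_tokens):
    """Request body in the Titan Text shape (assumed): nested config, camelCase names."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "temperature": temperature,
            "topP": top_p,
            "maxTokenCount": max_tokens,
        },
    })

# Same logical settings, two different payload structures.
settings = dict(prompt="Hello", temperature=0.5, top_p=0.9, max_tokens=50)
print(llama_body(**settings))
print(titan_text_body(**settings))
```

The same output length limit is called `max_gen_len` in one payload and `maxTokenCount` in the other, which is why a body written for one model cannot simply be sent to another.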

## Step 1: Configure AWS environment

In [1]:
# Set environment variables using the os module.
## NOT RECOMMENDED FOR PRODUCTION DEPLOYMENTS OR TO SHARE ON PUBLIC DOMAIN
import os

In [2]:
#os.environ[''] = 'your key id'
#os.environ[''] = 'your secret'

os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'
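
If the region may already be configured outside the notebook, a gentler variant is to set it only when it is missing. This is a minimal sketch; `ensure_region` is a hypothetical helper name, not part of boto3.

```python
import os

def ensure_region(default="us-east-1"):
    """Return the configured region, applying the default only if none is set."""
    # setdefault leaves an existing value untouched and returns the effective value.
    return os.environ.setdefault("AWS_DEFAULT_REGION", default)

region = ensure_region()
print(f"Using region: {region}")
```

Calling `ensure_region` a second time with a different default does not change the value, so a region exported in the shell keeps priority over the notebook.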

## Step 2: Explore and choose the model of your choice
We will use the Meta Llama LLM: `meta.llama3-70b-instruct-v1:0`

In [3]:
# Create a boto3 client to access the Bedrock APIs
import boto3

In [4]:
bedrock = boto3.client(service_name='bedrock',
                      region_name='us-east-1')

In [5]:
bedrock.list_foundation_models()

{'ResponseMetadata': {'RequestId': '05dd09eb-e453-4828-8a95-82a081a63023',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Thu, 08 May 2025 10:19:19 GMT',
   'content-type': 'application/json',
   'content-length': '54644',
   'connection': 'keep-alive',
   'x-amzn-requestid': '05dd09eb-e453-4828-8a95-82a081a63023'},
  'RetryAttempts': 0},
 'modelSummaries': [{'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-tg1-large',
   'modelId': 'amazon.titan-tg1-large',
   'modelName': 'Titan Text Large',
   'providerName': 'Amazon',
   'inputModalities': ['TEXT'],
   'outputModalities': ['TEXT'],
   'responseStreamingSupported': True,
   'customizationsSupported': [],
   'inferenceTypesSupported': ['ON_DEMAND'],
   'modelLifecycle': {'status': 'ACTIVE'}},
  {'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-image-generator-v1:0',
   'modelId': 'amazon.titan-image-generator-v1:0',
   'modelName': 'Titan Image Generator G1',
   'providerName': 'Amazon',
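
The full listing is long, so it can help to filter it down. The sketch below works on a dictionary with the same shape as the output above; `text_models_by_provider` is a hypothetical helper, and `sample` is a hand-written, trimmed stand-in for the real response (its `outputModalities` values are illustrative).

```python
def text_models_by_provider(response, provider):
    """Return IDs of models from one provider that can produce TEXT output."""
    return [
        model["modelId"]
        for model in response.get("modelSummaries", [])
        if model.get("providerName") == provider
        and "TEXT" in model.get("outputModalities", [])
    ]

# Hand-written sample in the shape returned by list_foundation_models().
sample = {"modelSummaries": [
    {"modelId": "amazon.titan-tg1-large", "providerName": "Amazon",
     "outputModalities": ["TEXT"]},
    {"modelId": "amazon.titan-image-generator-v1:0", "providerName": "Amazon",
     "outputModalities": ["IMAGE"]},
    {"modelId": "meta.llama3-70b-instruct-v1:0", "providerName": "Meta",
     "outputModalities": ["TEXT"]},
]}

print(text_models_by_provider(sample, "Meta"))
```

With a live client, the same function can be applied to the result of `bedrock.list_foundation_models()`. The API also accepts server-side filters such as `byProvider`; check the boto3 reference for the options available in your version.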

In [6]:
bedrock.get_foundation_model(modelIdentifier='meta.llama3-70b-instruct-v1:0')

{'ResponseMetadata': {'RequestId': '5730015a-c54f-4ff0-ac41-c575c5cb5d5b',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Thu, 08 May 2025 10:19:19 GMT',
   'content-type': 'application/json',
   'content-length': '521',
   'connection': 'keep-alive',
   'x-amzn-requestid': '5730015a-c54f-4ff0-ac41-c575c5cb5d5b'},
  'RetryAttempts': 0},
 'modelDetails': {'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/meta.llama3-70b-instruct-v1:0',
  'modelId': 'meta.llama3-70b-instruct-v1:0',
  'modelName': 'Llama 3 70B Instruct',
  'providerName': 'Meta',
  'inputModalities': ['TEXT'],
  'outputModalities': ['TEXT'],
  'responseStreamingSupported': True,
  'customizationsSupported': [],
  'inferenceTypesSupported': ['ON_DEMAND'],
  'modelLifecycle': {'status': 'ACTIVE'}}}
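
Before invoking a model, the details returned above can be checked programmatically. This is a sketch under the assumption that "usable here" means active, on-demand, and able to produce the wanted output modality; `can_invoke_on_demand` is a hypothetical helper name.

```python
def can_invoke_on_demand(details, modality="TEXT"):
    """Check lifecycle status, on-demand support, and output modality."""
    return (
        details.get("modelLifecycle", {}).get("status") == "ACTIVE"
        and "ON_DEMAND" in details.get("inferenceTypesSupported", [])
        and modality in details.get("outputModalities", [])
    )

# The 'modelDetails' section of the output above, trimmed to the fields used.
model_details = {
    "modelId": "meta.llama3-70b-instruct-v1:0",
    "outputModalities": ["TEXT"],
    "inferenceTypesSupported": ["ON_DEMAND"],
    "modelLifecycle": {"status": "ACTIVE"},
}

print(can_invoke_on_demand(model_details))
```

With a live client, pass `bedrock.get_foundation_model(...)['modelDetails']` instead of the trimmed dictionary.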

## Step 3: Prepare prompt for the model

Document reference - https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html#model-parameters-meta-request-response 
```
{
    "prompt": string,
    "temperature": float,
    "top_p": float,
    "max_gen_len": int
}
```

Meta also provides recommendations for structuring the prompt text to get optimum results. Refer to the prompt engineering concepts.
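
As an illustration of that prompt structure, here is a small sketch that assembles a Llama 3 instruct prompt with an optional system message. The special tokens are the ones used in the cell below; the exact whitespace (two newlines after each header, no newline before `<|eot_id|>`) follows my reading of Meta's published template and differs slightly from the string used in this notebook. `build_llama3_prompt` is a hypothetical helper name.

```python
def build_llama3_prompt(user_message, system_message=None):
    """Assemble a single-turn Llama 3 instruct prompt."""
    parts = ["<|begin_of_text|>"]
    if system_message:
        parts.append(
            f"<|start_header_id|>system<|end_header_id|>\n\n{system_message}<|eot_id|>"
        )
    parts.append(
        f"<|start_header_id|>user<|end_header_id|>\n\n{user_message}<|eot_id|>"
    )
    # Ending on the assistant header cues the model to write the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

print(build_llama3_prompt("Tell me about United Kingdom",
                          system_message="Answer in two sentences."))
```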

In [7]:
# Use the json module to construct the request payload
import json

In [8]:
# Define the prompt for the model.
prompt = "Tell me about United Kingdom"

# Embed the prompt in Llama 3's instruction format.
formatted_prompt = f"""
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
{prompt}
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
"""

req = json.dumps({
    'prompt': formatted_prompt,
    'temperature': 1.0,
    'top_p': 1.0,
    'max_gen_len': 50
})

In [9]:
print(req)

{"prompt": "\n<|begin_of_text|><|start_header_id|>user<|end_header_id|>\nTell me about United Kingdom\n<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n", "temperature": 1.0, "top_p": 1.0, "max_gen_len": 50}
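
A request like the one printed above can also be built through a small function that rejects out-of-range values before any call is made. This is a sketch: the bounds and defaults are those I understand the referenced Meta parameter documentation to list (`temperature` and `top_p` between 0 and 1, `max_gen_len` between 1 and 2048), so treat them as assumptions and confirm them against the current documentation. `build_request` and `LIMITS` are hypothetical names.

```python
import json

# Assumed bounds; verify against the Bedrock documentation for Meta Llama.
LIMITS = {
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "max_gen_len": (1, 2048),
}

def build_request(prompt, temperature=0.5, top_p=0.9, max_gen_len=512):
    """Validate the inference parameters and return the JSON request body."""
    params = {
        "temperature": temperature,
        "top_p": top_p,
        "max_gen_len": max_gen_len,
    }
    for name, value in params.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return json.dumps({"prompt": prompt, **params})

print(build_request("Tell me about United Kingdom", temperature=1.0,
                    top_p=1.0, max_gen_len=50))
```

Lower `temperature` and `top_p` values make the output more deterministic, while `max_gen_len` caps the number of generated tokens.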


## Step 4: Create the Bedrock runtime client to invoke the model

In [10]:
bedrock_runtime = boto3.client(service_name='bedrock-runtime', region_name='us-east-1')

## Step 5: Invoke the model to get the text generated

In [11]:
response = bedrock_runtime.invoke_model(body=req, modelId='meta.llama3-70b-instruct-v1:0')
response

{'ResponseMetadata': {'RequestId': '1b0e639d-40c8-41c5-a46a-cea0bbd9cc10',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Thu, 08 May 2025 10:19:20 GMT',
   'content-type': 'application/json',
   'content-length': '338',
   'connection': 'keep-alive',
   'x-amzn-requestid': '1b0e639d-40c8-41c5-a46a-cea0bbd9cc10',
   'x-amzn-bedrock-invocation-latency': '1164',
   'x-amzn-bedrock-output-token-count': '50',
   'x-amzn-bedrock-input-token-count': '18'},
  'RetryAttempts': 0},
 'contentType': 'application/json',
 'body': <botocore.response.StreamingBody at 0x1085cafe0>}
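
The response headers above already report token counts and latency, so usage can be read without parsing the body. A minimal sketch, using the header names exactly as they appear in this output; `token_usage` is a hypothetical helper, and the latency unit (milliseconds) is my assumption.

```python
def token_usage(response):
    """Extract token counts and latency from invoke_model response headers."""
    headers = response["ResponseMetadata"]["HTTPHeaders"]
    return {
        "input_tokens": int(headers["x-amzn-bedrock-input-token-count"]),
        "output_tokens": int(headers["x-amzn-bedrock-output-token-count"]),
        # Reported by the service; assumed to be milliseconds.
        "invocation_latency": int(headers["x-amzn-bedrock-invocation-latency"]),
    }

# Header values copied from the response shown above.
sample_response = {"ResponseMetadata": {"HTTPHeaders": {
    "x-amzn-bedrock-invocation-latency": "1164",
    "x-amzn-bedrock-output-token-count": "50",
    "x-amzn-bedrock-input-token-count": "18",
}}}

print(token_usage(sample_response))
```

Here the output token count equals the `max_gen_len` of 50 set in the request, a first hint that the generation was cut off at the limit.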

## Step 6: Extract the generated content from the response body

Response format:
```
{
    "generation": "\n\n<response>",
    "prompt_token_count": int,
    "generation_token_count": int,
    "stop_reason" : string
}
```

In [12]:
body = json.loads(response.get('body').read())
body

{'generation': 'The United Kingdom (UK) - a country steeped in history, culture, and natural beauty!\n\n** Geography and Climate **\n\nThe UK is an island nation located off the northwest coast of Europe, comprising four constituent countries:\n\n1. **England**:',
 'prompt_token_count': 18,
 'generation_token_count': 50,
 'stop_reason': 'length'}
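
A `stop_reason` of `length` means the model stopped because it hit `max_gen_len`, not because it finished its answer. The sketch below parses a body and flags that case. It uses `io.BytesIO` as an offline stand-in for the `StreamingBody` object, since both expose `read()`; `parse_generation` is a hypothetical helper name.

```python
import io
import json

def parse_generation(stream):
    """Read the body once and return (generated_text, was_truncated)."""
    # The stream is consumed by read(), so parse it a single time and keep the result.
    payload = json.loads(stream.read())
    return payload["generation"], payload.get("stop_reason") == "length"

# Offline stand-in for response['body'], with values shaped like the output above.
fake_body = io.BytesIO(json.dumps({
    "generation": "The United Kingdom (UK) is an island nation",
    "prompt_token_count": 18,
    "generation_token_count": 50,
    "stop_reason": "length",
}).encode("utf-8"))

text, truncated = parse_generation(fake_body)
if truncated:
    print("Output was cut off; consider raising max_gen_len.")
print(text)
```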

In [13]:
gen_text = body['generation']
print(gen_text)

The United Kingdom (UK) - a country steeped in history, culture, and natural beauty!

** Geography and Climate **

The UK is an island nation located off the northwest coast of Europe, comprising four constituent countries:

1. **England**:
