# Embeddings with AWS Bedrock
* Notebook by Adam Lang
* Date: 2/21/2025

# Overview
* We will use a pre-trained embedding model from AWS Bedrock to create embeddings.

# AWS Titan Text Embeddings
* Titan Text Embeddings docs: https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html
* Configuration:
  * can intake up to 8,192 tokens
  * outputs a vector of 1,024 dimensions
  * The model is optimized for text retrieval tasks, but can also be optimized for additional tasks, such as semantic similarity and clustering.
```
Amazon Titan Text Embeddings V2 model

Model ID - amazon.titan-embed-text-v2:0

Max input text tokens  8,192

Languages - English (100+ languages in preview)

Output vector size - 1,024 (default), 512, 256

Inference types – On-Demand, Provisioned Throughput

Supported use cases – RAG, document search, reranking, classification, etc.
```

# Create Bedrock Client

In [1]:
%%capture
!pip install boto3

In [5]:
import boto3

## create bedrock client
bedrock_client = boto3.client(service_name="bedrock")

## create bedrock runtime
bedrock_runtime_client = boto3.client(service_name="bedrock-runtime",
                                      region_name="eu-central-1")

# List Foundation Models
* We can easily list the foundation models below.

In [6]:
bedrock_client.list_foundation_models()


{'ResponseMetadata': {'RequestId': '4d90aa03-e81c-49ae-8c6d-50ba269a4c17',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Fri, 21 Feb 2025 20:45:39 GMT',
   'content-type': 'application/json',
   'content-length': '11593',
   'connection': 'keep-alive',
   'x-amzn-requestid': '4d90aa03-e81c-49ae-8c6d-50ba269a4c17'},
  'RetryAttempts': 0},
 'modelSummaries': [{'modelArn': 'arn:aws:bedrock:eu-central-1::foundation-model/amazon.titan-text-express-v1:0:8k',
   'modelId': 'amazon.titan-text-express-v1:0:8k',
   'modelName': 'Titan Text G1 - Express',
   'providerName': 'Amazon',
   'inputModalities': ['TEXT'],
   'outputModalities': ['TEXT'],
   'responseStreamingSupported': True,
   'customizationsSupported': [],
   'inferenceTypesSupported': [],
   'modelLifecycle': {'status': 'ACTIVE'}},
  {'modelArn': 'arn:aws:bedrock:eu-central-1::foundation-model/amazon.titan-text-express-v1',
   'modelId': 'amazon.titan-text-express-v1',
   'modelName': 'Titan Text G1 - Express',
   'providerNam

Summary
* We can see above that the metadata tells us things such as the input data type, if it can be fine-tuned, and the model lifecycle.

In [9]:
## list specific LLM models
all_models = [model['modelId'] for model in bedrock_client.list_foundation_models()['modelSummaries']]
all_models

['amazon.titan-text-express-v1:0:8k',
 'amazon.titan-text-express-v1',
 'amazon.titan-text-lite-v1:0:4k',
 'amazon.titan-text-lite-v1',
 'amazon.titan-embed-text-v1:2:8k',
 'amazon.titan-embed-text-v1',
 'amazon.titan-embed-image-v1:0',
 'amazon.titan-embed-image-v1',
 'amazon.titan-embed-text-v2:0',
 'amazon.rerank-v1:0',
 'anthropic.claude-instant-v1',
 'anthropic.claude-v2:1:18k',
 'anthropic.claude-v2:1:200k',
 'anthropic.claude-v2:1',
 'anthropic.claude-v2',
 'anthropic.claude-3-sonnet-20240229-v1:0',
 'anthropic.claude-3-haiku-20240307-v1:0',
 'anthropic.claude-3-5-sonnet-20240620-v1:0',
 'cohere.embed-english-v3',
 'cohere.embed-multilingual-v3',
 'cohere.rerank-v3-5:0',
 'meta.llama3-2-1b-instruct-v1:0',
 'meta.llama3-2-3b-instruct-v1:0']

In [10]:
## list embedding models
embedding_models = [model for model in all_models if 'embed' in model.lower()]
embedding_models

['amazon.titan-embed-text-v1:2:8k',
 'amazon.titan-embed-text-v1',
 'amazon.titan-embed-image-v1:0',
 'amazon.titan-embed-image-v1',
 'amazon.titan-embed-text-v2:0',
 'cohere.embed-english-v3',
 'cohere.embed-multilingual-v3']

# Generate Embeddings with AWS Bedrock

In [32]:
import json 

## prompt
prompt = "Hello, welcome to AWS Bedrock, I hope you find what you are looking for!"

# ## 2nd prompt 
# prompt2 = """AWS Bedrock supports foundation models from industry-leadin providers such as
# A21 labs, Anthropic, Stability AI, and Amazon. Choose the model that is best suited for your use case and goals!
# """

## init model_id from AWS bedrock --- can use different model_id's 
model_id = 'amazon.titan-embed-text-v2:0' 
body = json.dumps({
    "inputText": prompt, 
    "dimensions": 1024, ## other sizes --> 384, 256
    "normalize": False
    
})
# Invoke the model
response = bedrock_runtime_client.invoke_model(
    body=body,
    modelId=model_id,
    accept='application/json',
    contentType='application/json'
)

# Parse the response
response_body = json.loads(response['body'].read())

# Get embeddings
embedding = response_body['embedding']
print(f"The embedding vector has {len(embedding)} values\n{embedding[:3]+['...']+embedding[-3:]}")

The embedding vector has 1024 values
[-5.40625, 1.28125, 3.03125, '...', -0.71875, -2.953125, -0.51953125]


In [33]:
len(embedding)

1024

In [34]:
## check vector magnitude to see if normalized
import numpy as np 

## calculate magnitude
np.linalg.norm(embedding)

48.168456186155574

Summary
* We can see the length of the vector embeddings above is NOT equal to 1 it is quite high thus we have validated the fact that the embeddings were not normalized since we set the normalize function to FALSE.

# Normalization Parameter in Embeddings
* Normalizing a vector is the process of scaling it to have a unit length of magnitude of 1.
* This is pretty useful to ensure that all vectors have the `same scale` and contribute equally during vector operations preventing some vectors from dominating others due to their larger magnitudes.
* This is usually done in data pre-processing.

## When to normalize?
* Use as default for MOST use cases such as: Retrieval, RAG, related use cases.

## When NOT to normalize?
* Normalization will work for most use cases but there may be some cases such as `Classification` or `Entity extraction` where it may not work well. 

## Example of Normalizing Embeddings with Bedrock

In [24]:
## example of normalization in embedding generation
prompt = """AWS Bedrock supports foundation models from industry-leadin providers such as
A21 labs, Anthropic, Stability AI, and Amazon. Choose the model that is best suited for your use case and goals!
"""

## init model_id from AWS bedrock --- can use different model_id's 
model_id = 'amazon.titan-embed-text-v2:0' 
body = json.dumps({
    "inputText": prompt, 
    "dimensions": 1024, ## other sizes --> 384, 256
    "normalize": True, ## set this to True to NORMALIZE vectors
    
})
# Invoke the model
response = bedrock_runtime_client.invoke_model(
    body=body,
    modelId=model_id,
    accept='application/json',
    contentType='application/json'
)

# Parse the response
response_body = json.loads(response['body'].read())

# Get embeddings
embedding = response_body['embedding']
print(f"The embedding vector has {len(embedding)} values\n{embedding[:3]+['...']+embedding[-3:]}")

The embedding vector has 1024 values
[-0.08297207951545715, 0.04197411239147186, -0.04392639547586441, '...', -0.0021149746607989073, -0.04229949042201042, 0.005490799434483051]


In [25]:
## check vector magnitude to see if normalized
import numpy as np 

## calculate magnitude
np.linalg.norm(embedding)

0.9999999543085568

Summary
* We can see the value rounds up to 1.0 which means that we have normalized the embeddings.
* The main point is that not all embeddings are created equal in a vector space and this depends on the language of origin, special characters, length of the sentences, number of tokens and other variables. So normalization is a scaling step in feature engineering, however it can skew the results so be careful.

# Embeddings and Cosine Similarity
* We can now test the similarity of 2 texts/prompts.

In [83]:
def get_embedding(prompt):
    body = json.dumps({
        "inputText": prompt,
        "dimensions": 1024,
        "normalize": False,
    })
    model = bedrock_runtime_client.invoke_model(
        modelId=model_id,
        body=body, 
        accept="application/json",
        contentType="application/json"
    )
    response_body = json.loads(model.get('body').read())

    return response_body.get("embedding")
    

In [84]:
# Example usage
text1 = "Python is a programming language."
text2 = "I am going to the moon."

# Get embeddings
text1_embedding = get_embedding(text1)
text2_embedding = get_embedding(text2)

# Calculate cosine similarity
def cosine_sim(vec1, vec2):
    dot_product = np.dot(vec1, vec2)
    norm_vec1 = np.linalg.norm(vec1)
    norm_vec2 = np.linalg.norm(vec2)
    similarity = dot_product / (norm_vec1 * norm_vec2)
    return similarity

# Calculate cosine similarity
similarity = cosine_sim(text1_embedding, text2_embedding)
print(f"Cosine similarity between '{text1}' and '{text2}': {similarity}")

Cosine similarity between 'Python is a programming language.' and 'I am going to the moon.': -0.012073559764820775


In [86]:
## another example
text3 = "I am sitting on the river bank."

text3_embedding = get_embedding(text3)

cosine_sim(text1_embedding, text3_embedding)

0.06447508325343351

In [87]:
## another example
text4 = "I am going to the bank."

text4_embedding = get_embedding(text4)

cosine_sim(text3_embedding, text4_embedding)


0.2691220067267417