![Comparing Embeddings](../../images/headings/01_embeddings_02_01_comparing_embeddings.png)

# Comparing Embeddings

## Configure embedding model

Embedding model IDs available in Amazon Bedrock include:

- `cohere.embed-english-v3`
- `cohere.embed-multilingual-v3`
- `amazon.titan-embed-text-v1`
- `amazon.titan-embed-text-v2:0`
- `amazon.titan-embed-image-v1`

In [None]:
from langchain_aws.embeddings import BedrockEmbeddings

# Swap in any of the model IDs listed above to compare models
model = BedrockEmbeddings(model_id='amazon.titan-embed-text-v2:0')

## Generate embeddings for documents

In [None]:
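sentences = [
    'That is a happy person',
    '그 사람은 행복한 사람이야',  # Korean: "That person is a happy person"
    'That is a very happy person',
    'These are some fairly unhappy people',
]
# embed_documents returns one vector (a list of floats) per input sentence
embeddings = model.embed_documents(sentences)
print(embeddings)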
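
Printing the raw embeddings produces a long wall of floats. As a rough sketch of a more readable alternative, the helper below reports each vector's length and its first few values. `summarize_embeddings` and the toy vectors are hypothetical names and data made up for illustration; they are not part of LangChain or Bedrock.

```python
def summarize_embeddings(texts, vectors, preview=3):
    """Return one summary line per text: vector length and first few values."""
    lines = []
    for text, vector in zip(texts, vectors):
        head = ", ".join(f"{value:.4f}" for value in vector[:preview])
        lines.append(f"{text!r}: {len(vector)} dims, starts with [{head}, ...]")
    return lines


# Toy stand-ins for real model output (assumed data, not actual embeddings)
toy_texts = ["first", "second"]
toy_vectors = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]

for line in summarize_embeddings(toy_texts, toy_vectors):
    print(line)
```

In the notebook, the same helper could be called as `summarize_embeddings(sentences, embeddings)` once the cell above has run.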

## Use cosine similarity to compare embeddings

### Generate cosine similarities between each pair of embeddings

In [None]:
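import itertools
from langchain_community.utils.math import cosine_similarity

# cosine_similarity takes two 2D inputs and returns a matrix of scores,
# so each vector is wrapped in a list and the single score is read at [0][0]
results = [
    {'items': [a, b], 'similarity': cosine_similarity([embeddings[a]], [embeddings[b]])[0][0]}
    for a, b in itertools.combinations(range(len(sentences)), 2)
]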
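
To make the `cosine_similarity` call less of a black box, here is a minimal sketch of the same calculation in plain Python: the dot product of two vectors divided by the product of their lengths. The `cosine` function and the toy vectors are illustrative assumptions, not the library's implementation.

```python
import math


def cosine(u, v):
    """Cosine of the angle between u and v: dot(u, v) / (|u| * |v|)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    # Note: a zero-length vector would cause a division by zero here
    return dot / (norm_u * norm_v)


# Toy 2D vectors (assumed data) covering the three characteristic cases
same_direction = cosine([1.0, 2.0], [2.0, 4.0])   # parallel vectors, close to 1
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])       # perpendicular vectors, 0
opposite = cosine([1.0, 2.0], [-1.0, -2.0])       # opposing vectors, close to -1

print(same_direction, orthogonal, opposite)
```

Because the score depends only on direction and not on magnitude, scaling a vector leaves its similarity to other vectors unchanged, which is why the first pair scores as nearly identical.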

### Sort results by similarity (high to low)

In [None]:
results.sort(key=lambda x: x['similarity'], reverse=True)

### Display results

In [None]:
for result in results:
    a, b = result['items']
    similarity = result['similarity']
    print(f'Similarity between "{sentences[a]}" and "{sentences[b]}": {similarity}')

## Exercises

- Try different input sentences to get a feel for how similarity scores change with wording, meaning, and language
- Try different models on the same sentences to see how the embeddings and similarities differ between models

## Discussion Questions

- Do you notice any differences in the similarity scores, or in how the sentence pairs are ranked, across sentences or models? If so, why do you think that is?