[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aurelio-labs/semantic-router/blob/main/docs/encoders/bedrock.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/aurelio-labs/semantic-router/blob/main/docs/encoders/bedrock.ipynb)

# Using Bedrock Embedding Models

The embedding models available through AWS Bedrock (`amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2` and `cohere.embed-english-v3`) can all be used with our `BedrockEncoder`.

## Getting Started

We start by installing semantic-router. Support for the new `Bedrock` embedding models was added in `0.0.40`.

In [None]:
!pip install -qU "semantic-router[bedrock]"

Next, we define a `Route` object that pairs a route name with example utterances that should trigger that route.

In [1]:
from semantic_router import Route

politics = Route(
    name="politics",
    utterances=[
        "isn't politics the best thing ever",
        "why don't you tell me about your political opinions",
        "don't you just love the president",
        "don't you just hate the president",
        "they're going to destroy this country!",
        "they will save the country!",
    ],
)


Let's define another for good measure:

In [2]:
chitchat = Route(
    name="chitchat",
    utterances=[
        "how's the weather today?",
        "how are things going?",
        "lovely weather today",
        "the weather is horrendous",
        "let's go to the chippy",
    ],
)

routes = [politics, chitchat]

Now we initialize our embedding model. We use the `amazon.titan-embed-image-v1` model with a `score_threshold` of `0.5`. The AWS credentials and region are read from environment variables; if a variable is not set, we prompt for its value.

In [12]:
import os
from getpass import getpass
from semantic_router.encoders import BedrockEncoder

aws_access_key_id = os.getenv("AWS_ACCESS_KEY_ID") or getpass(
    "Enter AWS Access Key ID: "
)
aws_secret_access_key = os.getenv("AWS_SECRET_ACCESS_KEY") or getpass(
    "Enter AWS Secret Access Key: "
)
aws_session_token = os.getenv("AWS_SESSION_TOKEN") or getpass(
    "Enter AWS Session Token: "
)
aws_region = os.getenv("AWS_REGION") or getpass("Enter AWS Region: ")

encoder = BedrockEncoder(
    name="amazon.titan-embed-image-v1",
    score_threshold=0.5,
    access_key_id=aws_access_key_id,
    secret_access_key=aws_secret_access_key,
    session_token=aws_session_token,
    region=aws_region,
)
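
The cell above prompts for every missing value, including the session token, which long-lived access keys usually do not have. A minimal sketch of one way to make a setting optional follows. `read_setting` is a hypothetical helper, not part of semantic-router, and passing `None` as `session_token` assumes your installed `BedrockEncoder` accepts that, which is worth verifying.

```python
import os
from getpass import getpass
from typing import Callable, Optional


def read_setting(
    env_name: str,
    prompt: str,
    required: bool = True,
    prompt_fn: Callable[[str], str] = getpass,
) -> Optional[str]:
    """Return the environment variable if set, otherwise prompt for it.

    Hypothetical helper for illustration; not part of semantic-router.
    """
    value = (os.getenv(env_name) or prompt_fn(prompt)).strip()
    if value:
        return value
    if required:
        raise ValueError(f"{env_name} is required")
    # Optional settings (such as a session token) fall back to None.
    return None
```

It could then be used as `aws_session_token = read_setting("AWS_SESSION_TOKEN", "Enter AWS Session Token (optional): ", required=False)`, so pressing Enter at the prompt leaves the token unset.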

## Alternative: Using a Pre-configured Boto3 Client

Instead of passing individual AWS credentials to the `BedrockEncoder`, you can create and pass a pre-configured `boto3` client. This approach is useful when you want more control over the client configuration or when you're already using boto3 in your application.

In [4]:
import boto3
from semantic_router.encoders import BedrockEncoder

boto3_client = boto3.client(
    "bedrock-runtime",
    region_name=aws_region,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    aws_session_token=aws_session_token
)


encoder = BedrockEncoder(
    name="amazon.titan-embed-image-v1",
    client=boto3_client,
)

In [5]:
encoder(["hey"])

[[0.012878418,
  0.028442383,
  -0.022094727,
  -0.020751953,
  -0.008300781,
  0.033691406,
  0.09326172,
  0.0045166016,
  0.033935547,
  0.015319824,
  0.012939453,
  0.015380859,
  0.012756348,
  -0.064453125,
  0.018432617,
  0.03173828,
  -0.018188477,
  -0.007171631,
  0.03955078,
  0.0033874512,
  0.007019043,
  0.010131836,
  -0.025878906,
  0.056152344,
  0.01373291,
  -0.020263672,
  0.055419922,
  -0.06225586,
  0.040039062,
  -0.015075684,
  0.012268066,
  -0.056640625,
  0.04736328,
  -0.002609253,
  -0.0064086914,
  0.011291504,
  -0.019165039,
  -0.005493164,
  0.003189087,
  0.008666992,
  0.03564453,
  -0.0027923584,
  -0.016601562,
  0.014404297,
  -0.01171875,
  0.013183594,
  -0.018920898,
  -0.030639648,
  0.010864258,
  0.052734375,
  -0.006164551,
  0.0035705566,
  0.0060424805,
  -0.021606445,
  -0.040527344,
  0.020385742,
  0.004638672,
  -0.010314941,
  -0.010681152,
  -0.010803223,
  -0.038330078,
  -0.029174805,
  0.036865234,
  -0.03112793,
  -0.034179688,
  ...]]

Now we define the `SemanticRouter`. When called, the router consumes text (a query) and outputs the category (`Route`) it belongs to. To initialize a `SemanticRouter` we need our `encoder` model and a list of `routes`.

In [6]:
from semantic_router.routers import SemanticRouter

rl = SemanticRouter(encoder=encoder, routes=routes)
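
To build intuition for what the router does with the encoder, here is a simplified sketch of similarity-based routing with a score threshold. It is an illustration only, not semantic-router's actual implementation: the 3-dimensional vectors are hand-made stand-ins for real embeddings, and `cosine`, `toy_route`, and `toy_index` are hypothetical names.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def toy_route(
    query_vec: List[float],
    route_vecs: Dict[str, List[List[float]]],
    threshold: float,
) -> Tuple[Optional[str], float]:
    """Pick the route whose best utterance is most similar to the query.

    Returns (None, score) when the best score is below the threshold.
    """
    best_name, best_score = None, -1.0
    for name, vecs in route_vecs.items():
        score = max(cosine(query_vec, v) for v in vecs)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Hand-made vectors standing in for encoded utterances (assumption).
toy_index = {
    "politics": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "chitchat": [[0.1, 0.9, 0.0], [0.0, 0.8, 0.2]],
}

print(toy_route([0.85, 0.15, 0.0], toy_index, threshold=0.5))  # close to "politics"
print(toy_route([0.0, 0.0, 1.0], toy_index, threshold=0.5))    # below threshold
```

The first query lands near the "politics" vectors and is routed there; the second is far from every utterance, so its best score falls under the threshold and no route is returned.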



We can check the dimensionality of our vectors by looking at the `index` attribute of the `SemanticRouter`.

In [7]:
rl.index.dimensions

1024

We have 1024-dimensional vectors. Now let's test the router:

In [None]:
rl("don't you love politics?")

In [None]:
rl("how's the weather today?")

Both are classified accurately. What happens if we send a query that is unrelated to our existing `Route` objects?

In [22]:
rl("How does llama model work?")

RouteChoice(name=None, function_call=None, similarity_score=None)

In this case, the router returns a `RouteChoice` whose `name` is `None` because no route matched. We recommend tuning the score thresholds of your `SemanticRouter` for best performance; you can see how in [this notebook](https://github.com/aurelio-labs/semantic-router/blob/main/docs/06-threshold-optimization.ipynb).
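
In application code, the no-match case usually needs an explicit fallback. The sketch below shows one way to branch on the result. `ChoiceStub` is a hypothetical stand-in that mirrors only the fields visible in the output above, so the example runs without AWS access; in practice you would pass the object returned by `rl(query)`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChoiceStub:
    """Hypothetical stand-in with the fields shown in the RouteChoice output."""

    name: Optional[str] = None
    function_call: Optional[object] = None
    similarity_score: Optional[float] = None


def dispatch(choice) -> str:
    """Return a label for the matched route, or a fallback label."""
    if choice.name is None:
        # No route cleared its threshold: hand off to default handling.
        return "fallback"
    return f"route:{choice.name}"


print(dispatch(ChoiceStub(name="politics")))
print(dispatch(ChoiceStub()))
```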

---