In [None]:
!pip install -qU "semantic-router[pinecone]>=0.1.5"

In [1]:
from semantic_router import Route

# we could use this as a guide for our chatbot to avoid political conversations
politics = Route(
    name="politics",
    utterances=[
        "isn't politics the best thing ever",
        "why don't you tell me about your political opinions",
        "don't you just love the presidentdon't you just hate the president",
        "they're going to destroy this country!",
        "they will save the country!",
    ],
)

# this could be used as an indicator to our chatbot to switch to a more
# conversational prompt
chitchat = Route(
    name="chitchat",
    utterances=[
        "how's the weather today?",
        "how are things going?",
        "lovely weather today",
        "the weather is horrendous",
        "let's go to the chippy",
    ],
)

# we place both of our routes together into a single list
routes = [politics, chitchat]



As of 13 June 2024, two encoders support async functionality:

* `AzureOpenAIEncoder`
* `OpenAIEncoder`

To use either of these encoders in async mode, we simply initialize them as we usually would. When we then include them within a `SemanticRouter` (our route layer) and run `acall`, the route layer will automatically run the encoders in async mode.
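To make the idea of one object serving both modes concrete, here is a minimal sketch of the pattern. `ToyEncoder` and `ToyRouter` are hypothetical stand-ins written for illustration; they are not semantic-router classes and do not reflect how the library is implemented internally.

```python
import asyncio


class ToyEncoder:
    """Hypothetical stand-in encoder; not part of semantic-router."""

    def __call__(self, docs: list[str]) -> list[list[float]]:
        # Synchronous path: blocks until every "embedding" is computed.
        return [[float(len(d))] for d in docs]

    async def acall(self, docs: list[str]) -> list[list[float]]:
        # Asynchronous path: yields to the event loop, as a network call would.
        await asyncio.sleep(0)
        return [[float(len(d))] for d in docs]


class ToyRouter:
    """Hypothetical router that picks the encoder path matching the call style."""

    def __init__(self, encoder: ToyEncoder) -> None:
        self.encoder = encoder

    def __call__(self, text: str) -> list[float]:
        # Sync call -> sync encoder path.
        return self.encoder([text])[0]

    async def acall(self, text: str) -> list[float]:
        # Async call -> async encoder path.
        return (await self.encoder.acall([text]))[0]


router = ToyRouter(ToyEncoder())
sync_vec = router("hello")
# Inside a notebook an event loop is already running, so use
# `await router.acall("hello")` there instead of asyncio.run(...).
async_vec = asyncio.run(router.acall("hello"))
```

The caller never configures an "async mode" on the encoder; the choice is made by which router method is invoked.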

**Azure OpenAI:**

```python
from semantic_router.encoders import AzureOpenAIEncoder

encoder = AzureOpenAIEncoder(
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    deployment_name="YOUR_DEPLOYMENT_NAME",
    azure_endpoint="YOUR_ENDPOINT",
    api_version="2024-02-01",
    model="text-embedding-3-small",
)
```

**OpenAI:**

In [2]:
import os
from getpass import getpass
from semantic_router.encoders import OpenAIEncoder

# get your API key at platform.openai.com
os.environ["OPENAI_API_KEY"] = os.environ.get("OPENAI_API_KEY") or getpass(
    "Enter OpenAI API key: "
)
encoder = OpenAIEncoder(name="text-embedding-3-small")

We can see the encoder details, including the default `score_threshold`, like so:

In [3]:
encoder

OpenAIEncoder(name='text-embedding-3-small', score_threshold=0.3, type='openai', dimensions=NOT_GIVEN, token_limit=8192, max_retries=3)
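As a rough illustration of what a score threshold does, the sketch below gates a match on cosine similarity. This is an assumption made for illustration: the helper names are hypothetical, and the library's actual scoring and aggregation across utterances may differ. Only the value `0.3` is taken from the output above.

```python
import math


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def passes_threshold(query_vec, utterance_vec, score_threshold=0.3):
    # Hypothetical gate: accept the match only if similarity reaches the threshold.
    return cosine_similarity(query_vec, utterance_vec) >= score_threshold


similar = passes_threshold([1.0, 0.0], [0.9, 0.1])     # nearly parallel vectors
dissimilar = passes_threshold([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors
```

Under this assumption, raising the threshold makes routing stricter (more queries match no route), while lowering it makes routing more permissive.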

We can create embeddings asynchronously via our encoder using the `encoder.acall` method:

In [4]:
await encoder.acall(docs=["test", "test 2"])

2025-01-03 14:45:25 - httpx - INFO - _client.py:1740 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


[[-0.009876498021185398,
  0.0015304419212043285,
  0.015627983957529068,
  -0.05478791892528534,
  -0.00641245674341917,
  -0.012935652397572994,
  0.009648099541664124,
  -0.013551635667681694,
  0.02862592600286007,
  0.007862440310418606,
  0.03180966153740883,
  -0.006592406891286373,
  0.0038308631628751755,
  0.010132580995559692,
  0.014077643863856792,
  0.044683024287223816,
  -0.059909578412771225,
  -0.0024206764064729214,
  -0.051188915967941284,
  0.0364883653819561,
  0.036100782454013824,
  0.024334806948900223,
  0.03992126137018204,
  -0.0436033196747303,
  0.034439701586961746,
  -0.019573045894503593,
  -0.013973826542496681,
  0.010617062449455261,
  0.031200598925352097,
  -0.041582342237234116,
  0.0587468259036541,
  -0.028875088319182396,
  -0.0013963445089757442,
  -0.03881387785077095,
  0.055203188210725784,
  0.004097328055649996,
  0.023615004494786263,
  0.018922457471489906,
  -0.0056684319861233234,
  -0.003938141278922558,
  -0.04149928689002991,
  ...
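One reason to embed asynchronously is that several batches can be in flight at once. The sketch below shows the pattern with `asyncio.gather`; `fake_acall` is a hypothetical stand-in for `encoder.acall` that simulates latency and returns dummy vectors, so no API key is needed to run it.

```python
import asyncio


async def fake_acall(docs):
    # Hypothetical stand-in for `encoder.acall`: waits briefly, then returns
    # one dummy one-dimensional vector per document.
    await asyncio.sleep(0.01)
    return [[float(len(d))] for d in docs]


async def embed_batches(batches):
    # Schedule every batch at once; gather returns results in input order.
    results = await asyncio.gather(*(fake_acall(b) for b in batches))
    # Flatten the per-batch lists into a single list of vectors.
    return [vec for batch in results for vec in batch]


batches = [["test", "test 2"], ["a third doc"]]
# In a notebook, use `vectors = await embed_batches(batches)` instead.
vectors = asyncio.run(embed_batches(batches))
```

With a real encoder, the same structure would apply by swapping `fake_acall` for `encoder.acall`, subject to the provider's rate limits.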

For our `PineconeIndex` we do much the same thing, i.e., we initialize it as usual, here passing `init_async_index=True`:

In [7]:
import os
from getpass import getpass
from semantic_router.index.pinecone import PineconeIndex

# get your API key at app.pinecone.io
os.environ["PINECONE_API_KEY"] = os.environ.get("PINECONE_API_KEY") or getpass(
    "Enter Pinecone API key: "
)

In [10]:
pc_index = PineconeIndex(dimensions=1536, init_async_index=True, region="us-east-1")


There are several async methods we can call directly:

In [11]:
await pc_index._async_list_indexes()

{'indexes': [{'name': 'index',
   'metric': 'dotproduct',
   'dimension': 1536,
   'status': {'ready': True, 'state': 'Ready'},
   'host': 'index-96ix5ds.svc.aped-4627-b74a.pinecone.io',
   'spec': {'serverless': {'region': 'us-east-1', 'cloud': 'aws'}},
   'deletion_protection': 'disabled'},
  {'name': 'rerankers',
   'metric': 'dotproduct',
   'dimension': 1536,
   'status': {'ready': True, 'state': 'Ready'},
   'host': 'rerankers-96ix5ds.svc.aped-4627-b74a.pinecone.io',
   'spec': {'serverless': {'region': 'us-east-1', 'cloud': 'aws'}},
   'deletion_protection': 'disabled'}]}

But unless we're using the index directly, we don't need to use these. As with the encoder, once we pass the `PineconeIndex` to our route layer, the route layer will call all async methods automatically when we hit the `acall` method.

## Async RouteLayer

The `SemanticRouter` class (our route layer) supports both sync and async operations by default, so we initialize it as usual:

In [13]:
from semantic_router.routers import SemanticRouter
import time

rl = SemanticRouter(encoder=encoder, routes=routes, index=pc_index, auto_sync="local")
# due to pinecone indexing latency we wait 3 seconds
time.sleep(3)

2025-01-03 14:56:13 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
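A fixed `time.sleep(3)` is a simple way to ride out indexing latency, but it can be too short or needlessly long. A possible alternative is to poll until the expected number of vectors is visible. The helper below is a hypothetical sketch, not part of semantic-router; it is demonstrated against a fake counter so it runs without a Pinecone account.

```python
import time


def wait_for_count(get_count, expected, timeout=10.0, interval=0.5):
    """Poll `get_count()` until it returns at least `expected` or `timeout` elapses.

    Returns True if the count was reached, False if the timeout expired first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_count() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo with a fake counter that only reports the vectors on the third poll.
polls = iter([0, 0, 10])
ready = wait_for_count(lambda: next(polls), expected=10, timeout=5.0, interval=0.01)
```

With the objects from this notebook, a call along the lines of `wait_for_count(lambda: len(rl.index), expected=10)` would be the natural fit, assuming `len(rl.index)` reflects the live vector count.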


We can check our route layer and index information as usual:

In [14]:
rl.list_route_names()

['politics', 'chitchat']

In [15]:
len(rl.index)

0

We can also view the IDs of all records for a given route:

In [16]:
rl.index._get_route_ids(route_name="politics")

['politics#64069085d9d6e98e5a80915f69fabe82bac6c742f801bc305c5001dce88f0d19',
 'politics#af8b76111f260cf44fb34f04fcf82927dcbe08e8f47c30f4d571379c1512fac8',
 'politics#d1bb40236c3d95b9c695bfa86b314b6da4eb87e136699563fccae47fccea23e2',
 'politics#ed0f3dd7bd5dea12e55b1953bcd2c562a5ab19f501f6d5ff8c8927652c3904b8',
 'politics#fc6d15f9e6075e6de82b3fbef6722b64353e4eadc8d663b7312a4ed60c43e6f6']

And now for async vs. sync usage! To call in synchronous mode we simply hit `rl(...)`; to switch to async mode we hit `rl.acall(...)`:

In [17]:
rl("don't you love politics").name  # SYNC mode

2025-01-03 14:56:30 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


'politics'

In [18]:
out = await rl.acall("don't you love politics?")  # ASYNC mode
out.name

2025-01-03 14:56:33 - httpx - INFO - _client.py:1740 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


'politics'
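The async path pays off when many queries need routing at once. The sketch below routes a list of queries concurrently while capping the number of in-flight requests with a semaphore, which can help stay within API rate limits. `fake_acall` is a hypothetical stand-in for `rl.acall` with keyword-based logic, and `max_concurrency` is an illustrative parameter, not a library option.

```python
import asyncio


async def fake_acall(query):
    # Hypothetical stand-in for `rl.acall`: simulates latency and returns a
    # route name based on a trivial keyword check.
    await asyncio.sleep(0.01)
    return "politics" if "politic" in query else "chitchat"


async def route_many(queries, max_concurrency=3):
    # The semaphore limits how many requests run at the same time.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def route_one(query):
        async with semaphore:
            return await fake_acall(query)

    # gather preserves the order of `queries` in its results.
    return await asyncio.gather(*(route_one(q) for q in queries))


queries = ["don't you love politics?", "how's the weather today?"] * 3
# In a notebook, use `names = await route_many(queries)` instead.
names = asyncio.run(route_many(queries))
```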

Let's try a few more sync and async requests:

In [19]:
rl("how's the weather today?").name

2025-01-03 14:56:39 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


'chitchat'

In [20]:
out = await rl.acall("how's the weather today?")
out.name

2025-01-03 14:56:41 - httpx - INFO - _client.py:1740 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


'chitchat'

In [21]:
rl("I'm interested in learning about llama 2").name

2025-01-03 14:56:46 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


In [22]:
out = await rl.acall("I'm interested in learning about llama 2")
out.name

2025-01-03 14:56:48 - httpx - INFO - _client.py:1740 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
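The two llama 2 queries above display no output, which is consistent with `name` being `None` when no route matches. Assuming that behavior, downstream code should handle the no-match case explicitly. In the sketch below, `ToyChoice`, `pick_prompt`, and the prompt strings are all hypothetical and stand in for the object returned by `rl(...)` or `rl.acall(...)`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyChoice:
    """Hypothetical stand-in for the router's return value."""

    name: Optional[str] = None


def pick_prompt(choice, default="You are a helpful assistant."):
    # Map each route name to an illustrative system prompt.
    prompts = {
        "politics": "Politely decline to discuss politics.",
        "chitchat": "Reply in a relaxed, conversational tone.",
    }
    # No matching route (name is None) or an unknown route: use the default.
    if choice.name is None:
        return default
    return prompts.get(choice.name, default)


matched = pick_prompt(ToyChoice(name="chitchat"))
unmatched = pick_prompt(ToyChoice())
```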


We can delete or update routes using the usual synchronous methods:

In [23]:
len(rl.index)

10

In [24]:
import time

rl.delete(route_name="chitchat")
time.sleep(3)
len(rl.index)



5

In [25]:
out = await rl.acall("how's the weather today?")
out.name

2025-01-03 14:57:00 - httpx - INFO - _client.py:1740 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


In [26]:
rl.index.get_routes()

[Route(name='politics', utterances=["they're going to destroy this country!", 'they will save the country!', "don't you just love the presidentdon't you just hate the president", "isn't politics the best thing ever", "why don't you tell me about your political opinions"], description=None, function_schemas=None, llm=None, score_threshold=None, metadata={})]

In [27]:
rl.index.describe()

{'type': 'pinecone', 'dimensions': 1536, 'vectors': 5}

---