# Semantic Router: Hybrid Layer

The Hybrid Layer in the Semantic Router library can improve decision-making performance, particularly for niche use cases that rely on specialized terminology, such as finance or medicine. It lets us give more weight to the keywords contained in our utterances and user queries when a decision is made.
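
To make the idea concrete, here is a minimal sketch of one common way to blend semantic and keyword signals: a weighted combination of a dense (embedding) similarity and a sparse (keyword) similarity. The function names and the `alpha` parameter are illustrative assumptions, not the library's API or a confirmed description of its scoring.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(dense_q, dense_u, sparse_q, sparse_u, alpha=0.5):
    """Blend dense and sparse similarity (hypothetical helper).

    alpha=1.0 is purely semantic, alpha=0.0 is purely keyword-based.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    dense_sim = cosine(dense_q, dense_u)
    sparse_sim = cosine(sparse_q, sparse_u)
    return alpha * dense_sim + (1.0 - alpha) * sparse_sim
```

Lowering `alpha` in a scheme like this is what would give keyword overlap more influence on the final decision.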

## Getting Started

We start by installing the library:

In [None]:
!pip install -qU semantic-router==0.0.5

Next, we set our Cohere API key and define a `Decision` object, which maps a decision name to example utterances that should trigger it.

In [1]:
import os

os.environ["COHERE_API_KEY"] = "<<APIKEY>>"

In [2]:
from semantic_router.schema import Decision

politics = Decision(
    name="politics",
    utterances=[
        "isn't politics the best thing ever",
        "why don't you tell me about your political opinions",
        "don't you just love the president",
        "don't you just hate the president",
        "they're going to destroy this country!",
        "they will save the country!",
    ],
)


Let's define another for good measure:

In [3]:
chitchat = Decision(
    name="chitchat",
    utterances=[
        "how's the weather today?",
        "how are things going?",
        "lovely weather today",
        "the weather is horrendous",
        "let's go to the chippy",
    ],
)

decisions = [politics, chitchat]

Now we initialize our embedding model:

In [4]:
from semantic_router.encoders import CohereEncoder
from getpass import getpass

# Use the key if it is already set; otherwise prompt for it.
os.environ["COHERE_API_KEY"] = os.environ.get("COHERE_API_KEY") or getpass(
    "Enter Cohere API Key: "
)

encoder = CohereEncoder()

Now we define the `HybridDecisionLayer`. When called, the layer consumes text (a query) and outputs the category (`Decision`) it belongs to. To initialize a `HybridDecisionLayer`, we need our `encoder` model and a list of `decisions`.

In [5]:
from semantic_router.layer import HybridDecisionLayer

dl = HybridDecisionLayer(encoder=encoder, decisions=decisions)

2067848296 1405
2212344012 2520
3313717465 206
3076736765 769
1778150425 4131
2067848296 1405
202708381 770
2212344012 2520
3374841595 2375
2067848296 1405
3508911095 2067
3454774732 not in encoder.idx_mapping
2379717389 3565
298452803 4356
1063320047 3369
4186256544 713
1846246980 858
3897916792 643
575623047 1476
3897916792 643


ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
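
The `ValueError` above is the message NumPy raises when arrays of different rank are concatenated. The sketch below reproduces it outside the library and shows one way to align the shapes. The array names and sizes are made up for illustration, and this is not a claim about where the mismatch occurs inside `HybridDecisionLayer`.

```python
import numpy as np

dense = np.zeros((3, 4))  # 2-D: three vectors of width 4
sparse = np.zeros(4)      # 1-D: a single vector of width 4

try:
    np.concatenate([dense, sparse])
    raised = False
except ValueError:
    # The ranks differ (2 vs 1), so NumPy refuses to concatenate.
    raised = True

# Promoting the 1-D vector to shape (1, 4) makes the ranks match.
stacked = np.concatenate([dense, np.atleast_2d(sparse)])
print(raised, stacked.shape)
```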

In [None]:
dl("don't you love politics?")

In [None]:
if 3454774732 in encoder.idx_mapping:
    print("yes")

In [None]:
from semantic_router.encoders import BM25Encoder

encoder = BM25Encoder()

In [None]:
tests = ["hello this is some text", "and more stuff"]

In [None]:
idx_list = encoder.model.get_params()['doc_freq']['indices']
idx_list

In [None]:
sparse_dicts = encoder.model.encode_documents(tests)
sparse_dicts

In [None]:
embeds = [0.0] * len(encoder.idx_mapping)

In [None]:
for output in sparse_dicts:
    indices = output["indices"]
    values = output["values"]
    for idx, val in zip(indices, values):
        position = encoder.idx_mapping[idx]
        embeds[position] = val

In [None]:
encoder.idx_mapping

In [None]:
encoded_output = encoder(tests)
encoded_output

In [None]:
import numpy as np


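# Dense vector with one slot per vocabulary index known to the BM25 model.
sparse_vec = np.zeros(len(idx_list))
# Map each vocabulary index to its position in the dense vector.
idx_position_dict = {idx: i for i, idx in enumerate(idx_list)}

# Note: values from every document are written into the same vector.
for output in encoded_output:
    indices = output['indices']
    values = output['values']
    for idx, value in zip(indices, values):
        # Skip indices that are not part of the fitted vocabulary.
        if idx in idx_position_dict:
            position = idx_position_dict[idx]
            sparse_vec[position] = value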

In [None]:
sparse_vec

In [None]:
sparse_vec.shape
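
The cells above write every document's values into a single vector. If one dense row per document is wanted instead, the conversion can be sketched as follows. This assumes each sparse output is a dict with `indices` and `values` keys, as in the cells above; `sparse_to_dense_rows` is a hypothetical helper, not part of the library, and the example indices are invented.

```python
def sparse_to_dense_rows(sparse_dicts, idx_list):
    """Expand sparse {'indices', 'values'} dicts into one dense row per document."""
    # Map each vocabulary index to a fixed column position.
    position = {idx: i for i, idx in enumerate(idx_list)}
    rows = []
    for doc in sparse_dicts:
        row = [0.0] * len(idx_list)
        for idx, value in zip(doc["indices"], doc["values"]):
            # Indices missing from the vocabulary are skipped.
            if idx in position:
                row[position[idx]] = value
        rows.append(row)
    return rows


example = [
    {"indices": [11, 42], "values": [0.5, 0.25]},
    {"indices": [42, 99], "values": [1.0, 0.75]},  # 99 is out of vocabulary
]
dense_rows = sparse_to_dense_rows(example, idx_list=[11, 42, 7])
```

Every row has the same length as `idx_list`, so the rows can be stacked into a 2-D array without a rank mismatch.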

Now we can test it:

In [None]:
dl("don't you love politics?")

In [None]:
dl("how's the weather today?")

Both are classified accurately. What if we send a query that is unrelated to our existing `Decision` objects?

In [None]:
dl("I'm interested in learning about llama 2")

In this case, the layer returns `None` because no matches were identified.
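
Because the result can be `None`, calling code usually needs a fallback. Here is a minimal sketch using a hypothetical `route_or_default` helper, with a keyword-based stand-in router so the example runs without an API key. Neither function is part of the library.

```python
def route_or_default(router, query, default="fallback"):
    """Return the router's decision, or `default` when it yields None."""
    decision = router(query)
    return default if decision is None else decision


def toy_router(query):
    """Stand-in for the decision layer: a keyword lookup used only for illustration."""
    if "weather" in query:
        return "chitchat"
    if "politics" in query:
        return "politics"
    return None


print(route_or_default(toy_router, "how's the weather today?"))
print(route_or_default(toy_router, "I'm interested in learning about llama 2"))
```

In the notebook, `dl` could be passed in place of `toy_router`, assuming it returns either a decision name or `None` as described above.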