[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aurelio-labs/semantic-router/blob/main/docs/encoders/huggingface-endpoint.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/aurelio-labs/semantic-router/blob/main/docs/encoders/huggingface-endpoint.ipynb)

# Using a HuggingFace Endpoint

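HuggingFace hosts a large ecosystem of open-source models, including one of the largest collections of encoders. Many of these models can be run locally; in this notebook we instead call an encoder served from a HuggingFace endpoint.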

## Getting Started

We start by installing semantic-router.

In [1]:
# !pip install -qU semantic-router==0.0.20

Next, we define a `Route` object, which maps a route name to example utterances that should trigger that route.

In [2]:
from semantic_router import Route

politics = Route(
    name="politics",
    utterances=[
        "isn't politics the best thing ever",
        "why don't you tell me about your political opinions",
        "don't you just love the president",
        "don't you just hate the president",
        "they're going to destroy this country!",
        "they will save the country!",
    ],
)


Let's define another for good measure:

In [3]:
chitchat = Route(
    name="chitchat",
    utterances=[
        "how's the weather today?",
        "how are things going?",
        "lovely weather today",
        "the weather is horrendous",
        "let's go to the chippy",
    ],
)

routes = [politics, chitchat]

Now we initialize our embedding model. We will use the HuggingFace endpoint encoder:

In [2]:
import os
from getpass import getpass
from semantic_router.encoders.huggingface import HFEndpointEncoder

huggingface_url = os.getenv("HF_API_URL") or getpass("Enter HuggingFace API URL: ")
huggingface_api_key = os.getenv("HF_API_KEY") or getpass("Enter HuggingFace API Key: ")

encoder = HFEndpointEncoder(
    huggingface_url=huggingface_url,
    huggingface_api_key=huggingface_api_key,
)

2024-04-14 20:54:07 INFO semantic_router.utils.logger Model Initializing wait for - 26.81s


In [3]:
encoder("Hey")

[[-0.351535439491272,
  -0.0007503691595047712,
  0.22843779623508453,
  0.1197630763053894,
  -0.9190473556518555,
  0.4975742697715759,
  0.08101634681224823,
  0.05379204452037811,
  0.1970519870519638,
  -0.0833292305469513,
  0.36685681343078613,
  0.004698676988482475,
  -0.19794942438602448,
  -0.33827024698257446,
  -0.24040493369102478,
  0.3189111649990082,
  -0.5002703666687012,
  -0.6781224012374878,
  -0.2871842384338379,
  0.0505770742893219,
  -0.03144306689500809,
  0.15773507952690125,
  -1.278083324432373,
  -0.2693241238594055,
  -0.11886045336723328,
  0.3414249122142792,
  0.5203899145126343,
  0.6207894682884216,
  0.6428499817848206,
  0.8519144058227539,
  -0.08569950610399246,
  0.32095733284950256,
  0.38129955530166626,
  -0.5376394391059875,
  -0.23925058543682098,
  -0.975605309009552,
  0.14065563678741455,
  0.08411094546318054,
  0.5871137976646423,
  -0.5853262543678284,
  -0.16012267768383026,
  -0.30777785181999207,
  0.32361841201782227,
  -0.5167716

Now we define the `RouteLayer`. When called, the route layer consumes text (a query) and outputs the category (`Route`) it belongs to. To initialize a `RouteLayer`, we need our `encoder` model and a list of `routes`.
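To make the idea concrete, here is a minimal, self-contained sketch of what a route layer does conceptually: compare a query embedding against stored utterance embeddings and return the best-matching route, or `None` if nothing is similar enough. This is a simplification and not the library's implementation. The names `cosine`, `route_query`, and `toy_index`, the 3-dimensional vectors, and the `0.5` threshold are all made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route_query(query_vec, index, threshold=0.5):
    """Return the route of the most similar utterance, or None below threshold.

    `index` is a list of (route_name, utterance_vector) pairs.
    """
    best_name, best_score = None, -1.0
    for name, vec in index:
        score = cosine(query_vec, vec)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Toy 3-dimensional "embeddings" (hypothetical values, not real encoder output).
toy_index = [
    ("politics", [1.0, 0.1, 0.0]),
    ("politics", [0.9, 0.2, 0.1]),
    ("chitchat", [0.0, 1.0, 0.1]),
    ("chitchat", [0.1, 0.9, 0.0]),
]

print(route_query([0.95, 0.15, 0.0], toy_index))  # politics
print(route_query([0.0, 0.0, 1.0], toy_index))    # None
```

In the notebook, the real layer is built from the `encoder` and `routes` defined earlier: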

In [5]:
from semantic_router.layer import RouteLayer

rl = RouteLayer(encoder=encoder, routes=routes)

2024-04-14 20:02:32 INFO semantic_router.utils.logger local


We can check the dimensionality of our vectors by looking at the `index` attribute of the `RouteLayer`.

In [6]:
rl.index

LocalIndex(index=array([[ 0.37762564,  0.37923592,  0.04006954, ...,  0.2910035 ,
         0.14261879, -0.14989774],
       [ 0.14489685, -0.47280183, -0.13473961, ..., -0.184137  ,
        -0.44280073, -0.96940869],
       [ 1.16709912,  0.38906148,  0.24399863, ...,  0.03619115,
        -0.00167309,  0.50425595],
       ...,
       [-0.64046752,  0.45156148, -0.27317011, ..., -0.64851284,
        -0.10258984,  0.15441738],
       [-0.11908327,  0.4233726 , -0.29102552, ..., -0.69622546,
         0.27602831,  0.2030668 ],
       [-0.06046702, -0.18556708, -0.45608515, ..., -0.86009502,
        -0.01424424, -0.489003  ]]), routes=array(['politics', 'politics', 'politics', 'politics', 'politics',
       'politics', 'chitchat', 'chitchat', 'chitchat', 'chitchat',
       'chitchat'], dtype='<U8'), utterances=array(["isn't politics the best thing ever",
       "why don't you tell me about your political opinions",
       "don't you just love the president",
       "don't you just hate the 
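The printed index is truncated, so the dimensionality is not visible in the output itself. One way to read it off explicitly is to count rows and columns of the embedding matrix. The sketch below uses a hypothetical helper, `embedding_shape`, on stand-in data (11 zero vectors of length 1024, matching the 11 utterances defined earlier); with the real objects you would pass the embeddings returned by the encoder, or, assuming the `index` field shown in the repr is a NumPy array, inspect its `.shape`.

```python
def embedding_shape(vectors):
    """Return (n_vectors, dim) for a list of equal-length embedding vectors."""
    if not vectors:
        return (0, 0)
    dim = len(vectors[0])
    # Guard against ragged input, which would indicate a malformed response.
    if any(len(v) != dim for v in vectors):
        raise ValueError("embeddings have inconsistent lengths")
    return (len(vectors), dim)


# Stand-in data only: 11 utterances x 1024 dimensions, all zeros.
stand_in = [[0.0] * 1024 for _ in range(11)]

n_vectors, dim = embedding_shape(stand_in)
print(n_vectors, dim)  # 11 1024
```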

We do have 1024-dimensional vectors. Now let's test the route layer:

In [7]:
rl("tell me about your political opinions?")

RouteChoice(name='politics', function_call=None, similarity_score=None)

In [8]:
rl("how's the weather today?")

RouteChoice(name='chitchat', function_call=None, similarity_score=None)

Both are classified correctly. What if we send a query that is unrelated to our existing `Route` objects?

In [9]:
rl("I'm interested in learning about llama 2")

RouteChoice(name=None, function_call=None, similarity_score=None)

In this case, the route layer returns a `RouteChoice` whose `name` is `None` because no matches were identified. We recommend optimizing your `RouteLayer` thresholds for best performance; you can see how in [this notebook](https://github.com/aurelio-labs/semantic-router/blob/main/docs/06-threshold-optimization.ipynb).
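As a rough intuition for what threshold tuning involves, the sketch below sweeps candidate thresholds over a handful of labeled examples and keeps the one with the highest accuracy. It is not the procedure used in the linked notebook; the function names `accuracy_at` and `best_threshold` and all scores are invented toy values for illustration.

```python
def accuracy_at(threshold, scored):
    """Fraction of examples routed correctly at a given threshold.

    `scored` holds (best_route, similarity_score, expected_route) tuples,
    where expected_route is None for queries that should match no route.
    """
    correct = 0
    for best_route, score, expected in scored:
        predicted = best_route if score >= threshold else None
        correct += predicted == expected
    return correct / len(scored)


def best_threshold(scored, candidates):
    """Return the first candidate threshold with the highest accuracy."""
    return max(candidates, key=lambda t: accuracy_at(t, scored))


# Toy labeled data (hypothetical scores, not produced by a real encoder).
scored = [
    ("politics", 0.82, "politics"),
    ("chitchat", 0.74, "chitchat"),
    ("politics", 0.41, None),   # unrelated query, should not match
    ("chitchat", 0.35, None),   # unrelated query, should not match
    ("chitchat", 0.58, "chitchat"),
]
candidates = [i / 10 for i in range(1, 10)]

threshold = best_threshold(scored, candidates)
print(threshold, accuracy_at(threshold, scored))  # 0.5 1.0
```

A threshold that is too low lets unrelated queries through, while one that is too high rejects legitimate matches; the sweep makes that trade-off visible on whatever labeled queries you have.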

---