![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)
# Routing Optimization

A semantic router is a lightweight way to add branching logic to your application without taking on additional LLM calls. However, it can be tough to determine the distance threshold for each route that maximizes performance. This guide walks through:

- how to configure a semantic router
- how to optimize the distance thresholds for the routes
- a comparison between performing similar logic with an LLM versus a router

## Let's Begin!
<a href="https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/semantic-router/01_routing_optimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Setup

## Install Packages

In [None]:
%pip install -q sentence-transformers ranx

Note: you may need to restart the kernel to use updated packages.


In [2]:
# install from branch since scheduled for 0.5.0
%pip install git+https://github.com/redis/redis-vl-python.git@0.5.0

Collecting git+https://github.com/redis/redis-vl-python.git@0.5.0
  Cloning https://github.com/redis/redis-vl-python.git (to revision 0.5.0) to /private/var/folders/_g/rr4lnxxx1_z7m78lz89dhvsm0000gp/T/pip-req-build-54zjmrpr
  Running command git clone --filter=blob:none --quiet https://github.com/redis/redis-vl-python.git /private/var/folders/_g/rr4lnxxx1_z7m78lz89dhvsm0000gp/T/pip-req-build-54zjmrpr
  Running command git checkout -b 0.5.0 --track origin/0.5.0
  Switched to a new branch '0.5.0'
  branch '0.5.0' set up to track 'origin/0.5.0'.
  Resolved https://github.com/redis/redis-vl-python.git to commit 3ca4c97baa9640d24feedd3bb3791cf95859367d
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: redisvl
  Building wheel for redisvl (pyproject.toml) ... done
  Created wheel for redisvl: filename=redisvl-0.4.1-py3-none-any

## Run a Redis instance

#### For Colab
Use the shell script below to download, extract, and install [Redis Stack](https://redis.io/docs/getting-started/install-stack/) directly from the Redis package archive.

In [None]:
# NBVAL_SKIP
%%sh
curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list
sudo apt-get update  > /dev/null 2>&1
sudo apt-get install redis-stack-server  > /dev/null 2>&1
redis-stack-server --daemonize yes

#### For Alternative Environments
There are many ways to get the necessary redis-stack instance running:
1. On cloud, deploy a [FREE instance of Redis in the cloud](https://redis.com/try-free/). Or, if you have your own version of Redis Enterprise running, that works too!
2. Per OS, [see the docs](https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/)
3. With docker: `docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest`

### Define the Redis Connection URL

By default, this notebook connects to a local instance of Redis Stack. **If you have your own Redis Enterprise instance**, replace the `REDIS_PASSWORD`, `REDIS_HOST`, and `REDIS_PORT` values with your own.

In [17]:
import os
import warnings

warnings.filterwarnings("ignore")

# Replace values below with your own if using Redis Cloud instance
REDIS_HOST = os.getenv("REDIS_HOST", "localhost") # ex: "redis-18374.c253.us-central1-1.gce.cloud.redislabs.com"
REDIS_PORT = os.getenv("REDIS_PORT", "6379")      # ex: 18374
REDIS_PASSWORD = os.getenv("REDIS_PASSWORD", "")  # ex: "1TNxTEdYRDgIDKM2gDfasupCADXXXX"

# If SSL is enabled on the endpoint, use rediss:// as the URL prefix
REDIS_URL = f"redis://:{REDIS_PASSWORD}@{REDIS_HOST}:{REDIS_PORT}"
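
The comment about `rediss://` can be folded into a small helper. The following is a minimal sketch, not part of redisvl: `build_redis_url` and its `use_ssl` flag are hypothetical names introduced here for illustration. It also percent-encodes the password, on the assumption that your client decodes it again when parsing the URL.

```python
from urllib.parse import quote


def build_redis_url(host: str, port: str, password: str = "", use_ssl: bool = False) -> str:
    """Assemble a Redis connection URL (hypothetical helper, not a redisvl API)."""
    # rediss:// signals a TLS connection, redis:// a plain one
    scheme = "rediss" if use_ssl else "redis"
    # Percent-encode the password so characters such as "@" or ":" do not break the URL
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"{scheme}://{auth}{host}:{port}"


print(build_redis_url("localhost", "6379"))
print(build_redis_url("example.com", "18374", "p@ss", use_ssl=True))
```

With an empty password the helper omits the `:@` segment entirely, which differs slightly from the f-string above but describes the same local connection.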

# Routing with multiple routes

## Define the Routes

Below we define 3 different routes: one for `faq`, one for `general`, and
another for `blocked`. In this example the goal is query "classification"
for an e-commerce assistant, but you can create routes and references for
almost anything.

Each route has a set of references that cover the "semantic surface area" of the
route. The incoming query from a user needs to be semantically similar to one or
more of the references in order to "match" on the route. Note that each route can have its own distinct `distance_threshold` that defines what is considered a match for that route.

In [38]:
from redisvl.extensions.router import Route

faq = Route(
    name="faq",
    references=[
        "How do I reset my password?",
        "Where can I view my order history?",
        "How do I update my shipping address?",
        "Where are my saved payment methods?",
        "How do I change my email preferences?",
        "How can I see my loyalty points balance?",
        "Where do I find my digital receipts?",
        "How do I enable two-factor authentication?",
        "Can I change my username or email?",
        "How do I manage my account settings?"
    ],
    metadata={"category": "account_management", "priority": 1},
    distance_threshold=0.5
)

general = Route(
    name="general",
    references=[
        "I received the wrong item in my order, can you help?",
        "Can you recommend products that match my specific needs?",
        "The assembly instructions for my furniture are unclear",
        "I need help finding a product with particular specifications",
        "My order arrived damaged, what are my options?",
        "Can you help me design a room with your products?",
        "I'm looking for custom sizing options for this product",
        "The item I received doesn't match the online description",
        "I need advice on which model would work best for my situation",
        "Can you help troubleshoot an issue with my recent purchase?"
    ],
    metadata={"category": "customer_service", "priority": 2},
    distance_threshold=0.5
)

blocked = Route(
    name="blocked",
    references=[
        "What is your company's stance on the recent election?",
        "Do you support liberal or conservative policies?",
        "Can you tell me another customer's address?",
        "What's your CEO's opinion on gun control?",
        "I need personal information about one of your employees",
        "How does your company vote on political issues?",
        "Can you provide me with someone's credit card details?",
        "What's your position on immigration reform?",
        "I want to know where a specific customer lives",
        "Does your company donate to political campaigns?"
    ],
    metadata={"category": "prohibited", "priority": 3},
    distance_threshold=0.5
)
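
To make the role of `distance_threshold` concrete, here is a minimal sketch of threshold matching on toy 2-D vectors. It is an illustration under stated assumptions, not how redisvl implements routing: the real router embeds text with a vectorizer and runs the search inside Redis, and it may aggregate distances differently. The names `cosine_distance` and `matches_route` are hypothetical, and whether the comparison is strict or inclusive in the library is not something this sketch asserts.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0 for identical directions, up to 2 for opposite ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def matches_route(query_vec, reference_vecs, distance_threshold):
    """Return (matched, best_distance) using the closest reference (assumption)."""
    best = min(cosine_distance(query_vec, ref) for ref in reference_vecs)
    return best <= distance_threshold, best


# Toy "embeddings" standing in for a route's references
references = [[1.0, 0.1], [0.0, 1.0]]

print(matches_route([1.0, 0.0], references, distance_threshold=0.5))   # close to a reference
print(matches_route([-1.0, 0.0], references, distance_threshold=0.5))  # far from every reference
```

A lower threshold makes a route stricter (fewer false matches, more misses), while a higher one makes it more permissive.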

## Initialize the SemanticRouter

The `SemanticRouter` class automatically creates an index within Redis for the route references upon initialization.

In [39]:
import os
from redisvl.extensions.router import SemanticRouter

os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Initialize the SemanticRouter
ecom_router = SemanticRouter(
    name="ecom-router",
    routes=[faq, general, blocked],
    redis_url=REDIS_URL,  # connection URL defined in the setup section above
    overwrite=True # Blow away any other routing index with this name
)



15:46:13 redisvl.index.index INFO   Index already exists, overwriting.


## View the created index

In [40]:
# look at the index specification created for the semantic router
!rvl index info -i ecom-router



Index Information:
╭──────────────┬────────────────┬─────────────────┬─────────────────┬────────────╮
│ Index Name   │ Storage Type   │ Prefixes        │ Index Options   │   Indexing │
├──────────────┼────────────────┼─────────────────┼─────────────────┼────────────┤
│ ecom-router  │ HASH           │ ['ecom-router'] │ []              │          0 │
╰──────────────┴────────────────┴─────────────────┴─────────────────┴────────────╯
Index Fields:
╭────────────┬─────────────┬────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮
│ Name       │ Attribute   │ Type   │ Field Option   │ Option Value   │ Field Option   │ Option Value   │ Field Option   │   Option Value │ Field Option    │ Option Value   │
├────────────┼─────────────┼────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼─────────────────┼────────────────┤
│ route_name │ route_name

## Test it out

In [41]:
# Query the router with a statement
route_match = ecom_router("Whatup how do i reset my password?")
route_match

RouteMatch(name='faq', distance=0.108501493931)
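
The match result is what drives the branching logic in an application. Below is a minimal sketch of that dispatch step. `MatchStub` is a stand-in with the same two fields shown in the output above (`name` and `distance`) so the example runs without Redis, and the handler functions are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class MatchStub:
    """Stand-in for the router result: a route name (or None) and a distance."""
    name: Optional[str] = None
    distance: Optional[float] = None


# Hypothetical handlers, one per route
def handle_faq(query: str) -> str:
    return f"FAQ lookup: {query}"


def handle_general(query: str) -> str:
    return f"Escalate to customer service: {query}"


def handle_blocked(query: str) -> str:
    return "Sorry, I can't help with that request."


def handle_fallback(query: str) -> str:
    return f"No route matched, using default flow: {query}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "faq": handle_faq,
    "general": handle_general,
    "blocked": handle_blocked,
}


def dispatch(match: MatchStub, query: str) -> str:
    """Pick a handler by route name; unmatched queries go to the fallback."""
    handler = HANDLERS.get(match.name, handle_fallback)
    return handler(query)


print(dispatch(MatchStub(name="faq", distance=0.1085), "how do i reset my password?"))
print(dispatch(MatchStub(), "what's the weather like?"))
```

In the notebook you would pass the object returned by `ecom_router(...)` in place of `MatchStub`, assuming it exposes a `name` attribute that is empty when nothing matches.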

## Optimize route distance thresholds with test data

For optimization within redisvl you can create test data manually or make use of a model to generate some for you. In this case we will use a model to do it for us.

Prompt for creating test data:
> Claude 3.7 Sonnet was used to generate this resource.

```txt
You are a test data creation helper. 

Create test data of the form:

{
    "query": "query about a topic",
    "query_match": "topic-the-query-matches"
}

The 3 available topics are: faq, general, and blocked. Generate many examples that map to these topics such that we can train a model to find the best thresholds for this classification task. Also make sure to include some examples that don't map to any of the topics to check the null case for these leave the query_match field empty.
```

The output of this call was saved to `resources/ecom_train_data.json`, which is loaded below.

In [11]:
import json

with open("resources/ecom_train_data.json", "r") as f:
    train_data = json.load(f)
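
Since the data is model-generated, it can be worth checking its shape before optimizing. This is a minimal sketch: `summarize_labels` is a hypothetical helper, and the inline `sample` records are made up for illustration rather than taken from the resource file.

```python
from collections import Counter

VALID_LABELS = {"faq", "general", "blocked", ""}


def summarize_labels(records):
    """Validate {"query", "query_match"} records and count examples per label."""
    counts = Counter()
    for i, record in enumerate(records):
        if "query" not in record or "query_match" not in record:
            raise ValueError(f"record {i} is missing 'query' or 'query_match'")
        # Treat a null label the same as an empty string (the "no match" case)
        label = record["query_match"] or ""
        if label not in VALID_LABELS:
            raise ValueError(f"record {i} has unknown label {label!r}")
        counts[label or "<no match>"] += 1
    return dict(counts)


# Illustrative records only, in the format requested by the prompt
sample = [
    {"query": "How do I reset my password?", "query_match": "faq"},
    {"query": "My order arrived damaged", "query_match": "general"},
    {"query": "What's the weather today?", "query_match": ""},
]
print(summarize_labels(sample))
```

Running `summarize_labels(train_data)` would surface malformed records and show how balanced the labels are, including the null case.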

## Run optimization with router

In [12]:
from redisvl.utils.optimize import RouterThresholdOptimizer

optimizer = RouterThresholdOptimizer(ecom_router, train_data)
optimizer.optimize()

Eval metric F1: start 0.586, end 0.671 
Ending thresholds: {'faq': 0.5868686868686872, 'general': 0.7626262626262627, 'blocked': 0.12517090092847685}
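
For intuition about what a threshold search is doing, here is a minimal sketch that picks the threshold maximizing F1 for a single route over a fixed grid. This is not necessarily the algorithm `RouterThresholdOptimizer` uses; the function names and the toy `(distance, is_positive)` samples are assumptions made for illustration.

```python
def f1_at_threshold(samples, threshold):
    """samples: (distance, is_positive) pairs; predict a match when distance <= threshold."""
    tp = sum(1 for d, pos in samples if d <= threshold and pos)
    fp = sum(1 for d, pos in samples if d <= threshold and not pos)
    fn = sum(1 for d, pos in samples if d > threshold and pos)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(samples, candidates):
    """Return the candidate threshold with the highest F1 (first one wins ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(samples, t))


# Toy data: distance from each query to the route, and whether it truly belongs
samples = [(0.12, True), (0.33, True), (0.52, True), (0.58, False), (0.81, False)]
candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

chosen = best_threshold(samples, candidates)
print(chosen, f1_at_threshold(samples, chosen))
```

In this toy set the positives and negatives separate cleanly, so one grid point achieves a perfect F1. Real data overlaps, which is why the F1 reported above improves without reaching 1.0.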


## Test classification against LLM

Using the same prompt as above, we generated and stored another 20 questions to use as our `test_data`. We use them to compare the router against an LLM performing the same classification.

In [17]:
import os
import getpass
import time
import numpy as np

from openai import OpenAI

os.environ["TOKENIZERS_PARALLELISM"] = "False"

api_key = os.getenv("OPENAI_API_KEY") or getpass.getpass("Enter your OpenAI API key: ")

client = OpenAI(api_key=api_key)

def ask_openai(question: str) -> str:
    prompt = f"""
    You are a classification bot. Your job is to classify the following query as either faq, general, blocked, or none. Return only the string label or an empty string if no match.

    general is defined as request requiring customer service.
    faq is defined as a request for commonly asked account questions.
    blocked is defined as a request for prohibited information.

    query: "{question}"
    """
    response = client.completions.create(
      model="gpt-3.5-turbo-instruct",
      prompt=prompt,
      max_tokens=200
    )
    return response.choices[0].text.strip()

In [18]:
with open("resources/ecom_test_data.json", "r") as f:
    test_data = json.load(f)


ask_openai(test_data[0]["query"])

'faq'

In [19]:
import time

def test_classifier(classifier, test_data, is_router=False):
    correct = 0
    times = []
    for data in test_data:
        start = time.time()
        if is_router:
            prediction = classifier(data["query"]).name
        else:
            prediction = classifier(data["query"])
        
        if not prediction or prediction.lower() == "none":
            prediction = ""

        times.append(time.time() - start)
        print(f"Expected | Observed: {data['query_match']} | {prediction.lower()}")
        if prediction.lower() == data["query_match"]:
            correct += 1
    accuracy = correct / len(test_data)
    avg_time = np.mean(times)
    return accuracy, avg_time

In [20]:
llm_accuracy, llm_avg_time = test_classifier(ask_openai, test_data)

Expected | Observed: faq | faq
Expected | Observed: faq | 
Expected | Observed: faq | faq
Expected | Observed: faq | faq
Expected | Observed: faq | faq
Expected | Observed: faq | general
Expected | Observed: faq | faq
Expected | Observed: general | general
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | the label is general.
Expected | Observed: general | general
Expected | Observed: general | 
Expected | Observed: blocked | 
Expected | Observed: blocked | blocked
Expected | Observed: blocked | empty string
Expected | Observed: blocked | general
Expected | Observed: blocked | classifier's got no talent :(

none
Expected | Observed: blocked | 
Expected | Observed: blocked | 


In [21]:
llm_accuracy, llm_avg_time

(0.4, 0.6402494192123414)
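
Some of the LLM's misses in the log above are free-form answers such as `the label is general.` rather than a bare label. One option is to normalize the raw text before scoring. The sketch below is a hypothetical post-processing step, not part of the original comparison, and the accuracy reported here was measured without it.

```python
import re

LABELS = ("faq", "general", "blocked")


def normalize_label(raw: str) -> str:
    """Map a free-form model answer to one of LABELS, or "" when none is found."""
    words = re.findall(r"[a-z]+", raw.lower())
    found = [label for label in LABELS if label in words]
    # Accept the answer only when exactly one known label appears
    return found[0] if len(found) == 1 else ""


print(repr(normalize_label("the label is general.")))
print(repr(normalize_label("empty string")))
print(repr(normalize_label("none")))
```

It could be applied by wrapping the classifier, for example `test_classifier(lambda q: normalize_label(ask_openai(q)), test_data)`. Doing so changes what is being measured, so any numbers obtained that way are not directly comparable to the ones in this notebook.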

In [22]:
router_accuracy, router_avg_time = test_classifier(ecom_router, test_data, is_router=True)

Expected | Observed: faq | faq
Expected | Observed: faq | faq
Expected | Observed: faq | 
Expected | Observed: faq | faq
Expected | Observed: faq | 
Expected | Observed: faq | faq
Expected | Observed: faq | 
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | general
Expected | Observed: blocked | 
Expected | Observed: blocked | blocked
Expected | Observed: blocked | blocked
Expected | Observed: blocked | 
Expected | Observed: blocked | 
Expected | Observed: blocked | blocked
Expected | Observed: blocked | blocked


In [23]:
router_accuracy, router_avg_time

(0.45, 0.07264227867126465)

In [None]:
from redisvl.extensions.router.schema import DistanceAggregationMethod
from redisvl.extensions.router import RoutingConfig

# toggle aggregation method
ecom_router.update_routing_config(
    RoutingConfig(aggregation_method=DistanceAggregationMethod.min, max_k=3)
)
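
The aggregation method decides how the distances from a query to a route's matching references are combined into one score before the threshold is applied. The sketch below illustrates the idea in plain Python. The string method names and the `aggregate` helper are illustrative assumptions, not the library's enum or internals, and the distances are made up.

```python
from statistics import mean


def aggregate(distances, method):
    """Combine per-reference distances into a single route score."""
    if method == "min":
        return min(distances)   # closest reference wins
    if method == "avg":
        return mean(distances)  # every matching reference contributes
    if method == "sum":
        return sum(distances)
    raise ValueError(f"unknown aggregation method: {method}")


# Made-up distances from one query to three references of the same route
distances = [0.2, 0.6, 0.7]
threshold = 0.45

for method in ("min", "avg"):
    score = aggregate(distances, method)
    print(method, round(score, 3), "match" if score <= threshold else "no match")
```

With `min`, a single very close reference is enough to match. With `avg`, distant references pull the score up, so the same query can fall outside the threshold.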

In [25]:
router_accuracy_min, router_avg_time_min = test_classifier(ecom_router, test_data, is_router=True)

Expected | Observed: faq | faq
Expected | Observed: faq | faq
Expected | Observed: faq | 
Expected | Observed: faq | faq
Expected | Observed: faq | 
Expected | Observed: faq | faq
Expected | Observed: faq | 
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | 
Expected | Observed: general | general
Expected | Observed: blocked | 
Expected | Observed: blocked | blocked
Expected | Observed: blocked | blocked
Expected | Observed: blocked | 
Expected | Observed: blocked | 
Expected | Observed: blocked | blocked
Expected | Observed: blocked | blocked


In [26]:
router_accuracy_min, router_avg_time_min

(0.45, 0.04600746631622314)

## Analysis

The outputs above illustrate the tradeoffs between using an LLM and using the router. For this particular example the accuracy is similar (0.40 for the LLM versus 0.45 for the router), but the router's average latency is roughly 9x lower (about 0.07 s versus 0.64 s per query), and it avoids per-call LLM costs, with the savings depending on the model used.
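
The latency comparison can be checked directly from the numbers printed in the cells above. This short sketch copies those values in as literals so it runs on its own.

```python
# Values copied from the notebook outputs above
llm_accuracy, llm_avg_time = 0.4, 0.6402494192123414
router_accuracy, router_avg_time = 0.45, 0.07264227867126465
router_avg_time_min = 0.04600746631622314  # after switching the aggregation method to min

speedup_default = llm_avg_time / router_avg_time
speedup_min = llm_avg_time / router_avg_time_min

print(f"router vs. LLM (default aggregation): {speedup_default:.1f}x faster")
print(f"router vs. LLM (min aggregation):     {speedup_min:.1f}x faster")
print(f"accuracy difference: {router_accuracy - llm_accuracy:+.2f}")
```

With only 20 test queries and a single timing run, these ratios are indicative rather than precise.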