# Custom Routing for LLM Prompts with Not Diamond

This notebook demonstrates how to use Weave with [Not Diamond's custom routing](https://docs.notdiamond.ai/docs/router-training-quickstart) to route LLM prompts to the most appropriate model based on evaluation results.

## Routing prompts

When building complex LLM workflows, users may need to prompt different models according to accuracy, cost, or call latency.
[Not Diamond](https://www.notdiamond.ai/) routes prompts in these workflows to the right model for each need, helping maximize accuracy while saving on model costs.

For any given distribution of data, a single model rarely outperforms every other model on every query. By combining multiple models into a "meta-model" that learns when to call each LLM, you can beat every individual model's performance and even drive down costs and latency in the process.
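
The idea can be illustrated with a toy calculation. The sketch below uses made-up scores for two hypothetical models (`model_a` and `model_b` are placeholders, not real candidates) and an idealized router that always knows which model scored best on each query. A trained router only predicts that choice, so this shows an upper bound rather than what any real router achieves.

```python
# Toy per-query scores (1.0 = correct, 0.0 = incorrect). Values are illustrative only.
scores = {
    "model_a": [1.0, 0.0, 1.0, 1.0],
    "model_b": [0.0, 1.0, 1.0, 0.0],
}


def mean(values):
    return sum(values) / len(values)


# Accuracy of each model when it is used for every query.
single_model_accuracy = {name: mean(vals) for name, vals in scores.items()}

# An idealized router picks, per query, whichever model scored best.
# Ties go to the first model in dictionary order.
num_queries = len(next(iter(scores.values())))
oracle_choices = [
    max(scores, key=lambda name: scores[name][i]) for i in range(num_queries)
]
oracle_accuracy = mean([scores[m][i] for i, m in enumerate(oracle_choices)])

print(single_model_accuracy)  # {'model_a': 0.75, 'model_b': 0.5}
print(oracle_choices)         # ['model_a', 'model_b', 'model_a', 'model_a']
print(oracle_accuracy)        # 1.0
```

Neither model is right on every query, but because their errors fall on different queries, the per-query choice scores higher than either model alone.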

## Custom routing

You need three things to train a custom router for your prompts:

1. A set of LLM prompts: Prompts must be strings and should be representative of the prompts used in your application.
1. LLM responses: The responses from candidate LLMs for each input. Candidate LLMs can include both Not Diamond's supported LLMs and your own custom models.
1. Evaluation scores for responses to the inputs from candidate LLMs: Scores are numbers, and can be any metric that fits your needs.

By submitting these to the Not Diamond API, you can train a custom router tuned to each of your workflows.
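
As a rough sketch of how these three ingredients fit together, the snippet below models one scored response per row and checks that every candidate model covers the same prompts. The `EvalRow` class, its field names, and `check_training_data` are hypothetical and exist only for illustration. They are not part of the Not Diamond or Weave APIs, and the check assumes rows are listed in the same prompt order for every model.

```python
from dataclasses import dataclass


@dataclass
class EvalRow:
    """One scored response. Field names are hypothetical, not a Not Diamond schema."""

    prompt: str    # ingredient 1: the input prompt
    response: str  # ingredient 2: a candidate LLM's response to that prompt
    score: float   # ingredient 3: the evaluation score for that response


def check_training_data(rows_by_model: dict[str, list[EvalRow]]) -> int:
    """Return the number of prompts after checking each model covers the same prompts."""
    if not rows_by_model:
        raise ValueError("expected at least one candidate model")
    prompt_lists = {m: [r.prompt for r in rows] for m, rows in rows_by_model.items()}
    reference = next(iter(prompt_lists.values()))
    for model, prompts in prompt_lists.items():
        if prompts != reference:
            raise ValueError(f"{model} does not cover the same prompts as the other models")
    return len(reference)


# Two placeholder models answering the same single prompt.
rows_by_model = {
    "model_a": [EvalRow("Write add(a, b).", "def add(a, b): return a + b", 1.0)],
    "model_b": [EvalRow("Write add(a, b).", "def add(a, b): return a - b", 0.0)],
}
num_prompts = check_training_data(rows_by_model)
print(num_prompts)  # 1
```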


## Setting up the training data

In practice, you will use your own Evaluations to train a custom router. For this example notebook, however, you will use LLM responses 
for [the HumanEval dataset](https://github.com/openai/human-eval) to train a custom router for coding tasks.

Start by downloading the dataset prepared for this example, then parse the LLM responses into EvaluationResults for each model.

In [1]:
!curl -L "https://drive.google.com/uc?export=download&id=1q1zNZHioy9B7M-WRjsJPkfvFosfaHX38" -o humaneval.csv



In [2]:
from weave.integrations.notdiamond.custom_router_test import get_model_evals

model_evals = get_model_evals('./humaneval.csv')
for model, evaluation_results in model_evals.items():
    print(f"Found {len(evaluation_results.rows)} rows for {model}.")

Found 164 rows for anthropic/claude-3-5-sonnet-20240620.
Found 164 rows for openai/gpt-4o-2024-05-13.
Found 164 rows for google/gemini-1.5-pro-latest.
Found 164 rows for openai/gpt-4-turbo-2024-04-09.
Found 164 rows for anthropic/claude-3-opus-20240229.


## Training a custom router

Now that you have EvaluationResults, you can train a custom router. Make sure you have [created an account](https://app.notdiamond.ai/keys) and 
[generated an API key](https://app.notdiamond.ai/keys), then insert your API key below.

![Create an API key](../docs/guides/integrations/imgs/notdiamond/api-keys.png)

In [3]:
import os
from weave.integrations.notdiamond.custom_router import train_evaluations

api_key = os.getenv("NOTDIAMOND_API_KEY", "<YOUR_API_KEY>")

preference_id = train_evaluations(
    model_evals=model_evals,
    prompt_column="prompt",
    response_column="actual",
    language="en",
    maximize=True,
    api_key=api_key,
)

You can then follow the training process for your custom router via the Not Diamond app.

![Check on router training progress](../docs/guides/integrations/imgs/notdiamond/router-preferences.png)

Once your custom router has finished training, you can use it to route your prompts.

In [9]:
from notdiamond import NotDiamond
import weave

weave.init("dev_testing")

llm_configs = [
    "anthropic/claude-3-5-sonnet-20240620",
    "openai/gpt-4o-2024-05-13",
    "google/gemini-1.5-pro-latest",
    "openai/gpt-4-turbo-2024-04-09",
    "anthropic/claude-3-opus-20240229",
]
client = NotDiamond(api_key=api_key, llm_configs=llm_configs)

# Single-quoted delimiters let the prompt contain a literal triple-double-quoted docstring.
new_prompt = '''
You are a helpful coding assistant. Using the provided function signature, write the implementation for the function
in Python. Write only the function. Do not include any other text.

from typing import List


def has_close_elements(numbers: List[float], threshold: float) -> bool:
    """ Check if in given list of numbers, are any two numbers closer to each other than
    given threshold.
    >>> has_close_elements([1.0, 2.0, 3.0], 0.5)
    False
    >>> has_close_elements([1.0, 2.8, 3.0, 4.0, 5.0, 2.0], 0.3)
    True
    """
'''
session_id, routing_target_model = client.model_select(
    messages=[{"role": "user", "content": new_prompt}],
    preference_id=preference_id,
)

print(f"Session ID: {session_id}")
print(f"Target Model: {routing_target_model}")

🍩 https://wandb.ai/notdiamond/dev_testing/r/call/01924a65-7f09-7ba3-8eaf-c8248e2be51e
🍩 https://wandb.ai/notdiamond/dev_testing/r/call/01924a65-7f14-7073-b4bf-dfc7b031fe4c
🍩 https://wandb.ai/notdiamond/dev_testing/r/call/01924a65-7f15-7881-b607-67c0de9928d5
🍩 https://wandb.ai/notdiamond/dev_testing/r/call/01924a65-7f16-71a0-a1d8-456ed6c83544
🍩 https://wandb.ai/notdiamond/dev_testing/r/call/01924a65-7f17-75f2-9316-d5e1ae8c5538
🍩 https://wandb.ai/notdiamond/dev_testing/r/call/01924a65-7f1f-7501-ae43-51dc16ed69b0
Session ID: 71ac45e8-b9c2-44ee-a359-66eb07bfa8ec
Target Model: anthropic/claude-3-opus-20240229


This example also uses Not Diamond's compatibility with Weave auto-tracing. The links printed in the output above open the traced calls in the Weave UI.

![Weave UI for custom routing](../docs/guides/integrations/imgs/notdiamond/weave-trace.png)
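
The router returns a target model but does not call it for you in this example, so the prompt still has to be sent to the chosen provider. A minimal sketch of the first step follows, assuming the target model's string form is `provider/model`, as in the printed output above. `split_model_id` is a hypothetical helper, not part of the Not Diamond client.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' identifier into its two parts."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


# The string below is copied from the example output above.
provider, model = split_model_id("anthropic/claude-3-opus-20240229")
print(provider)  # anthropic
print(model)     # claude-3-opus-20240229
```

With the provider and model name separated, you can pass the prompt to whichever client library you already use for that provider.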