# Bundle LLM Router Usage Guide

## Introduction

The SambaNova Bundle (Composition of Experts) LLM Router routes each query to the expert model best suited to its content. This notebook walks through the router's customizable categories, its four modes of operation, and how to use each mode effectively.

## Understanding the Bundle LLM Router

The Bundle LLM Router uses a customizable approach to classify incoming queries into different categories. Each category corresponds to a specific expert model that is best suited to handle queries in that domain.
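
The classify-then-dispatch flow described above can be sketched in a few lines. This is a toy illustration, not the kit's implementation: the keyword-based `classify` function stands in for the LLM-based classifier, and the expert functions and the `EXPERTS` table are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for expert models: each maps a query to an answer.
def finance_expert(query: str) -> str:
    return f"[finance expert] {query}"

def code_expert(query: str) -> str:
    return f"[code expert] {query}"

def generalist(query: str) -> str:
    return f"[generalist] {query}"

# Assumed category-to-expert table; a real deployment would map to model endpoints.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "finance": finance_expert,
    "code generation": code_expert,
    "generalist": generalist,
}

def classify(query: str) -> str:
    """Toy keyword classifier standing in for the LLM-based router."""
    text = query.lower()
    if any(word in text for word in ("inflation", "stock", "interest")):
        return "finance"
    if any(word in text for word in ("python", "function", "algorithm")):
        return "code generation"
    return "generalist"

def route(query: str) -> Tuple[str, str]:
    """Classify the query, then dispatch it to the matching expert."""
    category = classify(query)
    # Fall back to the generalist if the category has no registered expert.
    expert = EXPERTS.get(category, EXPERTS["generalist"])
    return category, expert(query)

category, answer = route("What is the current inflation rate?")
print(category, "->", answer)
```

The fallback to a generalist expert mirrors the "Generalist" category listed below: a query that matches no specific category still receives an answer.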

### Customizable Categories

Users can define their own categories based on their specific needs. Here are some example categories that could be used:

- Finance
- Economics
- Mathematics
- Code Generation
- Legal
- Medical
- History
- Turkish Language
- Japanese Language
- Literature
- Physics
- Chemistry
- Biology
- Psychology
- Sociology
- Generalist (for queries not fitting into specific categories)

These are only examples. Define categories that match your use case and the expert models you have available.

## The Importance of Prompts

The effectiveness of the Bundle LLM Router depends heavily on the quality and structure of its prompt. A well-crafted prompt helps the router classify each query accurately and direct it to the appropriate expert model. When designing your prompt, consider including:

1. A clear instruction to classify the message into one of your predefined categories.
2. Examples of queries for each category to provide context.
3. Any specific rules or considerations for classification.
4. A request for the model to explain its classification decision.

This structured approach helps in maintaining consistency and accuracy in the routing process.
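
As a rough illustration of the four elements listed above, the sketch below assembles a classification prompt from a dictionary of categories and example queries. The function name `build_router_prompt`, the wording of the instructions, and the `Category: <name>` output convention are all assumptions made for this example; the prompt shipped with the kit may be structured differently.

```python
from typing import Dict, List

def build_router_prompt(categories: Dict[str, List[str]], query: str) -> str:
    """Assemble a classification prompt from categories and example queries."""
    lines = [
        # 1. Clear instruction naming the allowed categories.
        "Classify the message into exactly one of the following categories: "
        + ", ".join(categories) + ".",
        "",
        # 2. Example queries for each category.
        "Examples:",
    ]
    for category, examples in categories.items():
        for example in examples:
            lines.append(f'- "{example}" -> {category}')
    lines += [
        "",
        # 3. A specific rule for ambiguous cases (assumed wording).
        "Rule: if no specific category fits, choose 'generalist'.",
        # 4. Ask for a short explanation plus a parseable final line.
        "Explain your decision in one sentence, then end with 'Category: <name>'.",
        "",
        f"Message: {query}",
    ]
    return "\n".join(lines)

example_categories = {
    "finance": ["What is compound interest?"],
    "generalist": ["Tell me a joke."],
}
print(build_router_prompt(example_categories, "How do bonds work?"))
```

Asking for a fixed final line such as `Category: <name>` is one way to keep the explanation while still making the label easy to extract from the model's output.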

## Modes of Operation

The Bundle LLM Router can be run in four different modes:

1. Expert Mode
2. Simple Mode
3. E2E (End-to-End) Mode with Vector Database
4. Bulk QA Mode

Let's explore each of these modes in detail.

### 1. Expert Mode

In this mode, the router returns only the expert category for a given query; the expert model itself is not invoked.

In [None]:
import os
import sys
import yaml

current_dir = os.getcwd()
kit_dir = os.path.abspath(os.path.join(current_dir, ".."))
repo_dir = os.path.abspath(os.path.join(kit_dir, ".."))
CONFIG_PATH = os.path.join(kit_dir, "config.yaml")

sys.path.append(kit_dir)
sys.path.append(repo_dir)


In [None]:
from bundle_jump_start.src.use_bundle_model import get_expert_only

query = "What is the current inflation rate?"
expert = get_expert_only(query)
print(f"Expert category for query '{query}': {expert}")

### 2. Simple Mode

This mode routes the query to the appropriate expert model and returns both the expert category and the model's response.

In [None]:
from bundle_jump_start.src.use_bundle_model import run_simple_llm_invoke

query = "Write a Python function to calculate the factorial of a number."
expert, response = run_simple_llm_invoke(query)
print(f"Expert category: {expert}")
print(f"Response: {response}")

### 3. E2E Mode with Vector Database

This mode uses a vector database for more complex queries that may require context from multiple documents.

In [None]:
from bundle_jump_start.src.use_bundle_model import run_e2e_vector_database
from langchain_community.document_loaders import PyPDFLoader

# Load your document
doc_path = '/path/to/your/document.pdf'  # Replace with the path to the PDF file you want to ingest.
loader = PyPDFLoader(doc_path)
documents = loader.load()

query = "Summarize the key economic indicators mentioned in the document."
expert, response = run_e2e_vector_database(query, documents)
print(f"Expert: {expert}")
print(f"Response: {response}")
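
To give a feel for what retrieval from a vector database contributes in this mode, here is a deliberately simplified sketch. It is not how `run_e2e_vector_database` is implemented: it replaces learned embeddings with bag-of-words counts and the database with a plain list, and the helper names `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (real systems use learned embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda chunk: cosine(query_vec, embed(chunk)), reverse=True)
    return ranked[:k]

chunks = [
    "Inflation eased to 3 percent while GDP grew 2 percent.",
    "Committee members thanked their outgoing chair.",
    "Unemployment held steady at 4 percent.",
]
context = top_k("How did inflation change?", chunks, k=1)
print(context)
```

The retrieved chunks would then be passed to the selected expert alongside the query, so the answer can draw on document content rather than on the model's parameters alone.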

### 4. Bulk QA Mode

This mode is used for evaluating the router's performance on a large dataset of queries.

In [None]:
import json
import os
import tempfile
from bundle_jump_start.src.use_bundle_model import run_bulk_routing_eval

# JSONL data as a string (without empty lines)
jsonl_data = """{"prompt": "What are the key factors affecting stock market volatility?", "router_label": "finance"}
{"prompt": "Explain the concept of compound interest and its implications for long-term savings.", "router_label": "finance"}
{"prompt": "How does quantitative easing impact inflation rates?", "router_label": "economics"}
{"prompt": "Solve the equation: 3x^2 + 5x - 2 = 0", "router_label": "maths"}
{"prompt": "Explain the concept of derivatives in calculus and provide an example.", "router_label": "mathematics"}
{"prompt": "What is the probability of rolling a sum of 7 with two six-sided dice?", "router_label": "maths"}
{"prompt": "Write a Python function to find the nth Fibonacci number using recursion.", "router_label": "code generation"}
{"prompt": "Explain the difference between object-oriented and functional programming paradigms.", "router_label": "computer science"}
{"prompt": "Implement a binary search algorithm in JavaScript.", "router_label": "code generation"}
{"prompt": "Explain the concept of 'consideration' in contract law.", "router_label": "legal"}
{"prompt": "What are the key differences between civil and criminal law?", "router_label": "legal"}
{"prompt": "Describe the process of intellectual property registration for trademarks.", "router_label": "legal"}
{"prompt": "What are the primary causes and treatments for type 2 diabetes?", "router_label": "medical"}
{"prompt": "Explain the function of antibodies in the human immune system.", "router_label": "medical"}
{"prompt": "Describe the symptoms and potential complications of hypertension.", "router_label": "medical"}
{"prompt": "Analyze the causes and consequences of the French Revolution.", "router_label": "history"}
{"prompt": "Compare and contrast the political systems of ancient Athens and Sparta.", "router_label": "history"}
{"prompt": "Explain the significance of the Industrial Revolution in shaping modern society.", "router_label": "history"}
{"prompt": "Explain the use of the Turkish suffix '-miş' in reported speech.", "router_label": "turkish language"}
{"prompt": "What are the key differences between formal and informal Turkish language?", "router_label": "turkish language"}
{"prompt": "Translate the following Turkish proverb and explain its meaning: 'Damlaya damlaya göl olur.'", "router_label": "turkish language"}
{"prompt": "Explain the use of honorific language (keigo) in Japanese.", "router_label": "japanese language"}
{"prompt": "What are the differences between hiragana, katakana, and kanji writing systems?", "router_label": "japanese language"}
{"prompt": "Translate and explain the meaning of the Japanese phrase 'お疲れ様です' (otsukaresama desu).", "router_label": "japanese language"}
{"prompt": "Analyze the themes of isolation in Gabriel García Márquez's '100 Years of Solitude'.", "router_label": "literature"}
{"prompt": "Compare the writing styles of Ernest Hemingway and William Faulkner.", "router_label": "literature"}
{"prompt": "Discuss the significance of symbolism in F. Scott Fitzgerald's 'The Great Gatsby'.", "router_label": "literature"}
{"prompt": "Explain the concept of quantum entanglement in physics.", "router_label": "physics"}
{"prompt": "Describe the process of photosynthesis and its importance in ecosystems.", "router_label": "biology"}
{"prompt": "What are the main differences between covalent and ionic chemical bonds?", "router_label": "chemistry"}
{"prompt": "Explain the concept of cognitive dissonance in psychology.", "router_label": "psychology"}
{"prompt": "Analyze the impact of social media on modern interpersonal relationships.", "router_label": "sociology"}
{"prompt": "Discuss the theory of social constructionism and its implications for understanding reality.", "router_label": "sociology"}"""

# Create a temporary file and write the JSONL data to it
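# Write as UTF-8 explicitly: the dataset contains Turkish and Japanese characters
with tempfile.NamedTemporaryFile(mode='w+', delete=False, suffix='.jsonl', encoding='utf-8') as temp_file: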
    temp_file.write(jsonl_data)
    dataset_path = temp_file.name

# Set the number of examples to evaluate (set to None to run on the entire dataset)
num_examples = None  # Change this to a number if you want to limit the evaluation

# Run the bulk routing evaluation
results_df, accuracies, confusion_matrix = run_bulk_routing_eval(dataset_path, num_examples)

# Print the results
print("Accuracies by category:", accuracies)
print("\nConfusion Matrix:\n", confusion_matrix)

# Optional: Display the results DataFrame
print("\nResults DataFrame:")
print(results_df)

# Clean up the temporary file
os.unlink(dataset_path)
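
For intuition about the metrics returned above, the following sketch shows one plausible way per-category accuracy and confusion counts can be derived from gold labels and predicted categories. It is an assumption about the general technique, not a description of what `run_bulk_routing_eval` does internally; the function names and sample data are made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

def per_category_accuracy(labels: List[str], predictions: List[str]) -> Dict[str, float]:
    """Fraction of correctly routed queries for each gold label."""
    totals = Counter(labels)
    correct = Counter(label for label, pred in zip(labels, predictions) if label == pred)
    return {category: correct[category] / totals[category] for category in totals}

def confusion_counts(labels: List[str], predictions: List[str]) -> Dict[Tuple[str, str], int]:
    """Count how often each (gold label, predicted category) pair occurs."""
    return dict(Counter(zip(labels, predictions)))

# Illustrative data only; not output from the router.
labels = ["finance", "finance", "legal", "maths"]
predictions = ["finance", "economics", "legal", "mathematics"]

print(per_category_accuracy(labels, predictions))
print(confusion_counts(labels, predictions))
```

With exact string comparison, as in this sketch, `"maths"` and `"mathematics"` count as different categories. If the evaluation compares labels this way, the `router_label` values in the dataset should use the same category names the router emits.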

## Customizing the Bundle LLM Router

To customize the Bundle LLM Router for your specific use case:

1. Define your own categories based on your domain expertise and available expert models.
2. Create a mapping between these categories and your expert models.
3. Design a prompt that effectively distinguishes between your categories.
4. Update the configuration file with your custom categories, expert mappings, and prompt.

Because the categories, expert mappings, and prompt are all configurable, you can tailor the router to your needs and refine its performance iteratively.
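
Before updating the configuration file, it can help to check that the pieces agree with one another. The sketch below validates a hypothetical configuration layout: the keys `categories`, `expert_map`, and `prompt`, the placeholder names, and the model names are all assumptions for illustration and are not taken from the kit's `config.yaml`.

```python
from typing import List

# Hypothetical configuration; keys and model names are illustrative only.
router_config = {
    "categories": ["finance", "legal", "generalist"],
    "expert_map": {
        "finance": "finance-expert-model",
        "legal": "legal-expert-model",
        "generalist": "general-purpose-model",
    },
    "prompt": "Classify the message into one of: {categories}.\nMessage: {query}",
}

def validate_config(config: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config is consistent."""
    problems = []
    categories = config.get("categories", [])
    expert_map = config.get("expert_map", {})
    # Every category needs an expert to route to.
    for category in categories:
        if category not in expert_map:
            problems.append(f"no expert mapped for category '{category}'")
    # Every mapped expert should correspond to a declared category.
    for category in expert_map:
        if category not in categories:
            problems.append(f"expert mapped for undeclared category '{category}'")
    # The prompt template must accept the category list and the query.
    for placeholder in ("{categories}", "{query}"):
        if placeholder not in config.get("prompt", ""):
            problems.append(f"prompt is missing placeholder {placeholder}")
    return problems

print(validate_config(router_config))
```

Running a check like this after each change catches a category that was added to the prompt but never mapped to an expert, which would otherwise surface only when a query is routed to it.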

## Conclusion

The Bundle LLM Router provides a powerful and flexible way to direct queries to specialized expert models, improving the overall quality and relevance of responses. By understanding the different modes of operation, the importance of well-structured prompts, and the ability to customize categories and expert mappings, you can effectively leverage this system for a wide range of applications.

In production code, add appropriate error handling and logging, and respect privacy and security requirements when handling sensitive information.