# WORK IN PROGRESS
# Notebook 4: LLM Pipeline Routing

In this notebook we show how to route a question to one of several LLM pipelines. We reuse the vector databases and approaches from Notebooks 1 and 2; key functions from those notebooks are saved in "genai_utils.py".

We will consider 4 pipelines on Tesla (TSLA):
* RAG on the 10k report
* Static RAG on stock price data
* Tabular data agent for stock data
* Web search retrieval

# Import libraries and load the 10k and stock data

In [1]:
import subprocess
import tiktoken 
import pandas as pd
import os
import csv
import json
import time
import re
import transformers
import torch
import numpy as np
from datetime import datetime

# To use with the router
from sklearn.metrics.pairwise import cosine_similarity

# We use langchain's HuggingFaceEmbeddings to embed the prompt and the question bank
from langchain_community.embeddings import HuggingFaceEmbeddings

from genai_utils import get_retriever, generate_response, generate_kb_response

MODEL = "gpt-4-1106-preview"
EMBEDDING_MODEL_NAME = "all-MiniLM-L6-v2"
company_name="Tesla"

# Get 10k and stocks retrievers

In [2]:
embedding_function = HuggingFaceEmbeddings(
            model_name=EMBEDDING_MODEL_NAME,
            cache_folder="./models/sentencetransformers"
        )

In [3]:
retriever_10k = get_retriever(EMBEDDING_MODEL_NAME, top_k=8, faiss_dir = "../data/faiss",)
retriever_stock = get_retriever(EMBEDDING_MODEL_NAME,top_k=8, faiss_dir = "../data/faiss_stock",)

# Build Question Bank and Similarity Functions

To get the idea across, I use a simplistic method: each question or prompt is routed to an LLM pipeline by comparing it against a templated question bank using ROUGE F1 and dense (embedding) similarity. The cells in this section use only the dense similarity. A more common approach is to use the LLM for orchestration, but this can incur additional cost.
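
ROUGE F1 is not implemented in the cells of this section, so here is a minimal sketch of a unigram (ROUGE-1) F1 score that could sit alongside the dense similarity. The name `rouge1_f1` and the lowercase whitespace tokenization are assumptions made for illustration; this is not the code that produced the results in this notebook.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two strings (hypothetical helper)."""
    # Assumption: lowercase + whitespace tokenization, no stemming or punctuation handling
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each token counts at most as often as it appears in both strings
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Compare a prompt with one templated question from a question bank
score = rouge1_f1(
    "How much was the stock price for Tesla recently",
    "What is the stock price for Tesla today",
)
print(f"ROUGE-1 F1: {score:.3f}")
```

A lexical score like this rewards shared wording, while the dense similarity rewards shared meaning, so the two can disagree on paraphrased questions.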

In [4]:
# See config.py for the questions sets (put in there to save space here)
from config import QUESTIONS_10K, QUESTIONS_STATIC_STOCK, QUESTIONS_AGENT_STOCK, QUESTIONS_NEWS, QUESTIONS_GENERIC, ROUTE_DICT

In [5]:
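def get_max_similarity(prompt, question_list, company_name):
    """Return (best cosine similarity, best-matching entry) for prompt against question_list."""
    # question_list may hold template strings or precomputed embedding vectors
    if isinstance(question_list[0], str):
        question_list = [question.format(company_name=company_name) for question in question_list]
        question_list_embeddings = embedding_function.embed_documents(question_list)
    else:
        question_list_embeddings = question_list

    # prompt may be a string, a single embedding vector, or a list of embedding vectors
    if isinstance(prompt, str):
        prompt_embedding = embedding_function.embed_documents([prompt])
    elif isinstance(prompt[0], list):
        prompt_embedding = prompt
    else:
        prompt_embedding = [prompt]

    # similarity has shape (n_prompts, n_questions)
    similarity = cosine_similarity(prompt_embedding, question_list_embeddings)

    max_score = np.max(similarity)
    # np.argmax indexes the flattened matrix, so recover the question (column) index
    best_index = np.unravel_index(np.argmax(similarity), similarity.shape)[1]
    # Note: if embeddings were passed in, this is the matched embedding, not its text
    best_question = question_list[best_index]
    return max_score, best_question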
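
The core of this function can be illustrated without loading the embedding model. The sketch below uses hand-made 3-dimensional vectors as stand-ins for real embeddings (an assumption; real `all-MiniLM-L6-v2` vectors are much longer) and a small NumPy helper, `cosine_sim_matrix`, which is a hypothetical name and not part of `genai_utils.py`.

```python
import numpy as np


def cosine_sim_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Normalize each row to unit length, then take dot products
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Toy "question bank": labels and made-up embedding vectors (illustrative only)
toy_questions = ["price question", "news question"]
toy_question_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# Toy prompt embedding that points mostly in the "price" direction
toy_prompt_vec = [[0.9, 0.1, 0.0]]

sims = cosine_sim_matrix(toy_prompt_vec, toy_question_vecs)  # shape (1, 2)
best_idx = int(np.argmax(sims[0]))
best_score = float(sims[0, best_idx])

print(toy_questions[best_idx], round(best_score, 4))
```

The prompt vector is closest to the first toy question, so that entry is returned together with its score, which is the same max-over-question-bank logic used for routing.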

Let's ask some questions to test the similarity scores and see how well this would work for routing.

In [6]:
question = f"For {company_name}, how much was the stock price recently?"

print("QUESTIONS_10K: {}".format(get_max_similarity(question, QUESTIONS_10K, company_name=company_name)))
print("QUESTIONS_STATIC_STOCK: {}".format(get_max_similarity(question, QUESTIONS_STATIC_STOCK, company_name=company_name)))
print("QUESTIONS_AGENT_STOCK: {}".format(get_max_similarity(question, QUESTIONS_AGENT_STOCK, company_name=company_name)))
print("QUESTIONS_NEWS: {}".format(get_max_similarity(question, QUESTIONS_NEWS, company_name=company_name)))
print("QUESTIONS_GENERIC: {}".format(get_max_similarity(question, QUESTIONS_GENERIC, company_name=company_name)))

QUESTIONS_10K: (0.7051594009230184, 'How much revenue did Tesla earn in the last quarter?')
QUESTIONS_STATIC_STOCK: (0.9249916934634888, 'What is the stock price for Tesla today?')
QUESTIONS_AGENT_STOCK: (0.8929769567742046, 'What was the average stock price for Tesla in the last 90 days?')
QUESTIONS_NEWS: (0.6056327131882095, 'What are the recent news headlines for Tesla?')
QUESTIONS_GENERIC: (0.5027268188902144, 'For Tesla, what else can you tell me about that?')


In [7]:
question = f"What was last weeks volume weighted average price for {company_name}?"

print("QUESTIONS_10K: {}".format(get_max_similarity(question, QUESTIONS_10K, company_name=company_name)))
print("QUESTIONS_STATIC_STOCK: {}".format(get_max_similarity(question, QUESTIONS_STATIC_STOCK, company_name=company_name)))
print("QUESTIONS_AGENT_STOCK: {}".format(get_max_similarity(question, QUESTIONS_AGENT_STOCK, company_name=company_name)))
print("QUESTIONS_NEWS: {}".format(get_max_similarity(question, QUESTIONS_NEWS, company_name=company_name)))
print("QUESTIONS_GENERIC: {}".format(get_max_similarity(question, QUESTIONS_GENERIC, company_name=company_name)))

QUESTIONS_10K: (0.675001532403714, 'How much revenue did Tesla earn in the last quarter?')
QUESTIONS_STATIC_STOCK: (0.7742964940518129, 'What was the trading volume for Tesla?')
QUESTIONS_AGENT_STOCK: (0.9587606468279539, 'What was the volume weighted average price for Tesla in the last 30 days?')
QUESTIONS_NEWS: (0.5461695248300035, 'What are the recent news headlines for Tesla?')
QUESTIONS_GENERIC: (0.44465238804100415, 'For Tesla, what else do you know?')


In [8]:
question = f"What was yesterday close price for {company_name}?"

print("QUESTIONS_10K: {}".format(get_max_similarity(question, QUESTIONS_10K, company_name=company_name)))
print("QUESTIONS_STATIC_STOCK: {}".format(get_max_similarity(question, QUESTIONS_STATIC_STOCK, company_name=company_name)))
print("QUESTIONS_AGENT_STOCK: {}".format(get_max_similarity(question, QUESTIONS_AGENT_STOCK, company_name=company_name)))
print("QUESTIONS_NEWS: {}".format(get_max_similarity(question, QUESTIONS_NEWS, company_name=company_name)))
print("QUESTIONS_GENERIC: {}".format(get_max_similarity(question, QUESTIONS_GENERIC, company_name=company_name)))

QUESTIONS_10K: (0.6556617315047575, 'What is the gross profit for Tesla?')
QUESTIONS_STATIC_STOCK: (0.9412105469237768, 'What was the close price for Tesla?')
QUESTIONS_AGENT_STOCK: (0.8654705647460444, 'What was the average open to close price difference for Tesla in the last 30 days?')
QUESTIONS_NEWS: (0.5962918214282945, 'What are the recent news headlines for Tesla?')
QUESTIONS_GENERIC: (0.5197022002830722, 'For Tesla, what else do you know?')


In [9]:
question = f"Have there been any material news released on {company_name} this past week?"

print("QUESTIONS_10K: {}".format(get_max_similarity(question, QUESTIONS_10K, company_name=company_name)))
print("QUESTIONS_STATIC_STOCK: {}".format(get_max_similarity(question, QUESTIONS_STATIC_STOCK, company_name=company_name)))
print("QUESTIONS_AGENT_STOCK: {}".format(get_max_similarity(question, QUESTIONS_AGENT_STOCK, company_name=company_name)))
print("QUESTIONS_NEWS: {}".format(get_max_similarity(question, QUESTIONS_NEWS, company_name=company_name)))
print("QUESTIONS_GENERIC: {}".format(get_max_similarity(question, QUESTIONS_GENERIC, company_name=company_name)))

QUESTIONS_10K: (0.6049019743212483, 'What products does Tesla offer?')
QUESTIONS_STATIC_STOCK: (0.582107838790145, 'What is the latest stock price for Tesla?')
QUESTIONS_AGENT_STOCK: (0.529463290275921, 'What was the average trading volume for Tesla in the last 30 days?')
QUESTIONS_NEWS: (0.907196678020115, 'Has any news been published about Tesla recently?')
QUESTIONS_GENERIC: (0.6159116093878791, 'For Tesla, what else can you tell me about that?')


OK, so we know the static stock data pipeline can't answer the VWAP question, yet it scored 0.77 on it, so the threshold for entering that pipeline should be at least 0.77. Luckily, in this instance the agent won out, but this illustrates why a threshold is needed.

We also see that the winning score for each of these answerable questions is above 0.85, so let's use that as the threshold for now.

In [10]:
ROUTE_THRESHOLD = .85
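
To make the role of the threshold concrete, here is a small sketch that applies it to the scores printed for the VWAP question above (rounded to three decimals). The helper `pick_route` is hypothetical, and mapping each question set to the route names "10K", "Stock", "AgentStock", "News" and "Generic" is an assumption about how `ROUTE_DICT` is keyed.

```python
ROUTE_THRESHOLD = 0.85

# Best similarity per question set for the VWAP question (rounded from the output above)
vwap_scores = {
    "10K": 0.675,
    "Stock": 0.774,
    "AgentStock": 0.959,
    "News": 0.546,
    "Generic": 0.445,
}


def pick_route(scores, threshold):
    """Return the top-scoring route, or None if it does not clear the threshold."""
    best_route = max(scores, key=scores.get)
    return best_route if scores[best_route] > threshold else None


# With the agent pipeline available, the agent wins comfortably
print(pick_route(vwap_scores, ROUTE_THRESHOLD))

# Without the agent pipeline, the static stock pipeline would be the top match,
# but its score stays below the threshold, so no route is selected
without_agent = {k: v for k, v in vwap_scores.items() if k != "AgentStock"}
print(pick_route(without_agent, ROUTE_THRESHOLD))
```

Without a threshold, the second call would send the VWAP question to the static stock pipeline, which cannot answer it.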

# Route Questions Using the Similarity Function, Falling Back to the LLM

We will define a routing prompt built from the question sets, to be used when the semantic similarity doesn't reach the routing threshold.

In [11]:
from config import ROUTING_PROMPT

print(ROUTING_PROMPT.format(company_name=company_name))


You are a helpful assistant determining how to route a question or prompt. Please give the category of the question or prompt by learning from the examples below.

Category #1:
Describe Tesla's business.
Describe what Tesla does.
What sector or industry does Tesla operate in?
What market does Tesla serve?
What products does Tesla offer?
What services does the Tesla offer?
Who are Tesla's clients or customers?
Who are the suppliers for Tesla?
What is the revenue for Tesla?
What is the net income for Tesla?
What is the operating income for Tesla?
What's Tesla's EBITDA?
What is the gross profit for Tesla?
What's Tesla's gross margin like?
How much cash does Tesla have?
What are the key risk factors Tesla is facing?
What did management discuss in the 10K for Tesla
How much revenue did Tesla earn in the last quarter?

Category #2:
What is the current stock price for Tesla?
What is the stock price for Tesla?
What is the stock price for Tesla today?
What is the latest stock price for Tesla?


In [12]:
def get_route(prompt, route_dict, company_name=company_name):
    """
    Take a prompt and a dictionary of routes and return the route with the highest similarity score.
    If no score clears ROUTE_THRESHOLD, ask the LLM to pick a category instead.
    """
    max_score = 0
    best_route = None
    best_question = None
    for route, questions in route_dict.items():
        score, question = get_max_similarity(prompt, questions, company_name=company_name)
        if score > max_score:
            max_score = score
            best_route = route
            best_question = question
    if max_score > ROUTE_THRESHOLD:
        print(f"Best route: {best_route}\nWith question: {best_question}\nWith score: {max_score}")
        return best_route
    else:
        print(f"No route selected\nBest route: {best_route}\nWith question: {best_question}\nWith score: {max_score}")
        print("Routing to LLM")
        route_response = generate_response(prompt, MODEL, system_prompt=ROUTING_PROMPT.format(company_name=company_name))
        # Find the category number in the response; the LLM may reply without one
        route_num = re.findall(r'\d+', route_response)
        if route_num and 0 < int(route_num[0]) <= len(route_dict):
            route = list(route_dict.keys())[int(route_num[0])-1]
            print(f"Route selected from LLM: {route}")
            return route
        return None
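
The LLM fallback returns free text, so the step that maps its answer back to a route is worth testing in isolation. Below is a minimal sketch of that step as a standalone function; `parse_route` is a hypothetical helper, and the route names are assumed to follow the order of the categories in the routing prompt.

```python
import re
from typing import Optional, Sequence


def parse_route(response: str, route_names: Sequence[str]) -> Optional[str]:
    """Map the first integer in an LLM response to a route name, or None if invalid."""
    match = re.search(r"\d+", response)
    if match is None:
        # The model answered without any category number
        return None
    number = int(match.group())
    # Categories are 1-indexed in the prompt; reject anything out of range
    if 1 <= number <= len(route_names):
        return route_names[number - 1]
    return None


routes = ["10K", "Stock"]
print(parse_route("Category #2", routes))
print(parse_route("I am not sure which category applies.", routes))
print(parse_route("Category #7", routes))
```

Returning None for unparseable or out-of-range answers lets the caller decide on a default pipeline instead of raising an exception mid-request.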

In [13]:
prompt = "How much was the stock price recently?"

#let's process the prompt and get the route
if prompt.lower().find(company_name.lower()) == -1:
    prompt = f"For {company_name}: {prompt}"

print(prompt)
route = get_route(prompt, ROUTE_DICT)

For Tesla: How much was the stock price recently?


Best route: Stock
With question: What is the stock price for Tesla today?
With score: 0.901985870003409


Now let's put this into a function

In [14]:
def ask_prompt(prompt, company_name):
    #let's process the prompt and get the route
    if prompt.lower().find(company_name.lower()) == -1:
        prompt = f"For {company_name}: {prompt}"
    route = get_route(prompt, ROUTE_DICT)
    if route is not None:
        if route == "10K":
            return generate_kb_response(prompt, MODEL, retriever_10k, system_prompt="",template=None, temperature=0, include_source=False)
        elif route == "Stock":
            return generate_kb_response(prompt, MODEL, retriever_stock, system_prompt="",template=None, temperature=0, include_source=False)
        elif route == "AgentStock":
            return "Agent not supported yet"
        elif route == "News":
            return "News not supported yet"
        elif route == "Generic":
            return "Generic not supported yet"
    else:
        #default to asking the 10k
        return generate_kb_response(prompt, MODEL, retriever_10k, system_prompt="",template=None, temperature=0, include_source=False)
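
As more pipelines come online, the if/elif chain can be replaced by a lookup table. The sketch below shows one possible shape, using stub handlers in place of the real `generate_kb_response` calls so that it runs on its own; all handler names here are hypothetical.

```python
from typing import Callable, Dict, Optional


def answer_10k(prompt: str) -> str:
    # Stand-in for generate_kb_response(prompt, MODEL, retriever_10k, ...)
    return f"[10K pipeline] {prompt}"


def answer_stock(prompt: str) -> str:
    # Stand-in for generate_kb_response(prompt, MODEL, retriever_stock, ...)
    return f"[Stock pipeline] {prompt}"


def not_supported(name: str) -> Callable[[str], str]:
    """Build a handler for a route that is not implemented yet."""
    def handler(prompt: str) -> str:
        return f"{name} not supported yet"
    return handler


# One entry per route name; adding a pipeline means adding one line here
HANDLERS: Dict[str, Callable[[str], str]] = {
    "10K": answer_10k,
    "Stock": answer_stock,
    "AgentStock": not_supported("Agent"),
    "News": not_supported("News"),
    "Generic": not_supported("Generic"),
}


def dispatch(route: Optional[str], prompt: str) -> str:
    """Call the handler for route, defaulting to the 10K pipeline when route is unknown or None."""
    handler = HANDLERS.get(route, answer_10k)
    return handler(prompt)


print(dispatch("Stock", "What was the close price for Tesla?"))
print(dispatch(None, "Describe Tesla's business."))
```

A table like this also makes the default explicit: any route that is None or not in the table falls through to the 10K pipeline, matching the fallback in the function above.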

In [15]:
response = ask_prompt("What was yesterday close price for Tesla?", "Tesla")
response

Best route: Stock
With question: What was the close price for Tesla?
With score: 0.9412105469237768


{'answer': 'Based on the context provided, the most recent date mentioned for Tesla\'s (TSLA) stock market data is 2023-05-26. The closing price for Tesla on that date was $193.169998. Since this is the latest date provided, and there is no data for a date after 2023-05-26, we can infer that "yesterday" in this context refers to 2023-05-26. Therefore, the closing price for Tesla "yesterday" would be $193.169998.',
 'source_documents': "Daily stock market data for Tesla (TSLA):\nDate: 2022-12-19\nOpen: 154.000000\nHigh: 155.250000\nLow: 145.820007\nClose: 149.869995\nAdj Close: 149.869995\nVolume: 139390600\nSource: {'source': './data/TSLA.csv', 'row': 3141, 'last_accessed_at': datetime.datetime(2022, 12, 19, 0, 0)}\n\nDaily stock market data for Tesla (TSLA):\nDate: 2020-03-17\nOpen: 29.334000\nHigh: 31.456667\nLow: 26.400000\nClose: 28.680000\nAdj Close: 28.680000\nVolume: 359919000\nSource: {'source': './data/TSLA.csv', 'row': 2445, 'last_accessed_at': datetime.datetime(2020, 3, 17, 

In [16]:
response = ask_prompt("What are the key risk factors for Tesla and do they hedge foreign currency risk?", "Tesla")
response

No route selected
Best route: 10K
With question: What are the key risk factors Tesla is facing?
With score: 0.7448293291281791
Routing to LLM
Route selected from LLM: 10K


{'answer': "The key risk factors for Tesla, as outlined in the provided context, include:\n\n1. **Public Perception and Commentary:** Tesla's products, business, and management are subject to significant public attention and commentary, including criticism that may be exaggerated or unfounded. Negative perceptions can harm Tesla's business and its ability to raise additional funds.\n\n2. **Foreign Currency Risk:** Tesla operates globally and transacts in multiple currencies, which exposes the company to foreign currency risks. They do not typically hedge foreign currency risk, which means that fluctuations in exchange rates can affect their operating results when expressed in U.S. dollars.\n\n3. **Market Demand for Electric Vehicles:** Tesla's growth and success depend on consumer demand for electric vehicles. If the market does not develop as expected, or if demand for Tesla's vehicles decreases, it could negatively impact the company's business and financial condition.\n\n4. **Compet