### Week 2 - Day 2 - Gradio Chatbot with LiteLLM (Model Routing)

**Author** : [Marcus Rosen](https://github.com/MarcusRosen)

[LiteLLM](https://docs.litellm.ai/docs/) provides the ability to call different LLM providers through a unified interface, returning results in an OpenAI-compatible format.

Features:
- Model Selection in Gradio (Anthropic, OpenAI, Gemini)
- Single inference function for all model providers via LiteLLM (`call_llm`)
- Streaming (**NOTE:** streaming currently fails when run through Gradio, but works when called directly in the notebook)
- Debug Tracing

In [109]:
from litellm import completion
import gradio as gr
from dotenv import load_dotenv
from bs4 import BeautifulSoup
import os
import requests
import json

#### Load API Keys

In [2]:
# Load environment variables in a file called .env
# Print the key prefixes to help with any debugging

load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GEMINI_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:8]}")
   # import google.generativeai
   # google.generativeai.configure()
else:
    print("Google API Key not set")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AIzaSyDC


### Use LiteLLM to abstract out the model provider

In [91]:
def call_llm(model, system_prompt, user_prompt, json_format_response=False, streaming=False):
    if DEBUG_OUTPUT:    
        print("call_llm()")
        print(f"streaming={streaming}")
        print(f"json_format_response={json_format_response}")
    
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]

    payload = {
        "model": model,
        "messages": messages
    }
    # Use JSON response format
    # Link: https://docs.litellm.ai/docs/completion/json_mode
    if json_format_response:
        # Assign with "=": the original "payload[...]: {...}" was only an annotation and set nothing
        payload["response_format"] = {"type": "json_object"}
    
    if streaming:
        payload["stream"] = True
        response = completion(**payload)
        # Return a generator expression instead of using yield in the function
        return (part.choices[0].delta.content or "" for part in response)
    else:
        response = completion(**payload)
        return response["choices"][0]["message"]["content"]
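
The streaming branch of `call_llm` relies on `or ""` because streamed chunks typically include at least one whose `delta.content` is `None` (often the final one). The sketch below reproduces that generator expression against a stand-in stream so it runs without an API key. `fake_stream` and `text_chunks` are hypothetical names introduced here for illustration and only mimic the OpenAI-style chunk shape; they are not part of LiteLLM.

```python
from types import SimpleNamespace


def fake_stream(pieces):
    """Hypothetical stand-in for completion(..., stream=True).

    Yields objects with the same attribute path the notebook reads:
    part.choices[0].delta.content
    """
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def text_chunks(response):
    # Same generator expression as in call_llm: None content becomes ""
    return (part.choices[0].delta.content or "" for part in response)


chunks = list(text_chunks(fake_stream(["Hello", None, " world"])))
full_text = "".join(chunks)
print(repr(full_text))
```

Without the `or ""` fallback, joining or printing the chunks would hit `None` and either raise a `TypeError` or print the literal text `None`.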

### Brochure building functions

In [83]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [84]:
def get_links(url, model):
    if DEBUG_OUTPUT:
        print("get_links()")
    website = Website(url)

    link_system_prompt = "You are provided with a list of links found on a webpage. \
    You are able to decide which of the links would be most relevant to include in a brochure about the company, \
    such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
    link_system_prompt += "You should respond in raw JSON exactly as specified in this example. DO NOT USE MARKDOWN."
    link_system_prompt += """
    {
        "links": [
            {"type": "about page", "url": "https://full.url/goes/here/about"},
            {"type": "careers page", "url": "https://another.full.url/careers"}
        ]
    }
    """
    
    result = call_llm(model=model, 
                      system_prompt=link_system_prompt, 
                      user_prompt=get_links_user_prompt(website), 
                      json_format_response=True, 
                      streaming=False)
    if DEBUG_OUTPUT:
        print(result)
    return json.loads(result)

def get_links_user_prompt(website):
    if DEBUG_OUTPUT:
        print("get_links_user_prompt()")
        
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)

    if DEBUG_OUTPUT:
        print(user_prompt)
    
    return user_prompt

def get_all_details(url, model):
    if DEBUG_OUTPUT:
        print("get_all_details()")
        
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url, model)
    if DEBUG_OUTPUT:
        print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

def get_brochure_user_prompt(company_name, url, model):
    
    if DEBUG_OUTPUT:
        print("get_brochure_user_prompt()")
    
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url, model)
    user_prompt = user_prompt[:5000] # Truncate if more than 5,000 characters
    return user_prompt
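
Two points in `get_links` are fragile: a model may wrap its JSON in a Markdown code fence despite the instruction not to, and it may return relative URLs even though it was asked for full ones. The following is a minimal sketch of a more defensive post-processing step, assuming the reply has the `{"links": [...]}` shape requested in the system prompt. `parse_json_reply` and `absolute_links` are hypothetical helpers, not part of the notebook or of LiteLLM.

```python
import json
from urllib.parse import urljoin

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def parse_json_reply(reply):
    """Parse a model reply as JSON, tolerating an optional Markdown code fence."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (it may carry a language tag such as "json")
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Drop the closing fence if present
        if text.rstrip().endswith(FENCE):
            text = text.rstrip()[: -len(FENCE)]
    return json.loads(text)


def absolute_links(base_url, reply):
    """Return the links from the reply with relative URLs resolved against base_url."""
    data = parse_json_reply(reply)
    return [
        {"type": link["type"], "url": urljoin(base_url, link["url"])}
        for link in data.get("links", [])
    ]


reply = (
    FENCE + "json\n"
    '{"links": [{"type": "about page", "url": "/about"},'
    ' {"type": "careers page", "url": "https://example.com/careers"}]}\n'
    + FENCE
)
links = absolute_links("https://example.com/home", reply)
print(links)
```

With something like this in place, `get_all_details` could pass each resolved `url` to `Website(...)` without failing on a relative path.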


In [106]:
def create_brochure(company_name, url, model, streaming):

    system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."
    user_prompt = get_brochure_user_prompt(company_name, url, model)
    # call_llm returns a generator of text chunks when streaming, otherwise the full string
    return call_llm(model=model, system_prompt=system_prompt, user_prompt=user_prompt, streaming=streaming)
    

#### Testing the model before implementing Gradio

In [107]:
MODEL="claude-3-haiku-20240307"
DEBUG_OUTPUT=False
streaming=True
result = create_brochure(company_name="Rio Tinto", url="http://www.riotinto.com", model=MODEL, streaming=streaming)

if streaming:
    for chunk in result:
        print(chunk, end="", flush=True)
else:
    print(result)


# Rio Tinto: Providing the Materials for a Sustainable Future

## About Rio Tinto

Rio Tinto is a global mining and metals company, operating in 35 countries with over 60,000 employees. Their purpose is to find better ways to provide the materials the world needs. Continuous improvement and innovation are at the core of their DNA, as they work to responsibly supply the metals and minerals critical for urbanization and the transition to a low-carbon economy.

## Our Products

Rio Tinto's diverse portfolio includes:

- Iron Ore: The primary raw material used to make steel, which is strong, long-lasting and cost-efficient.
- Aluminium: A lightweight, durable and recyclable metal.
- Copper: A tough, malleable, corrosion-resistant and recyclable metal that is an excellent conductor of heat and electricity.
- Lithium: The lightest of all metals, a key element for low-carbon technologies.
- Diamonds: Ethically-sourced, high-quality diamonds.

## Sustainability and Innovation

Sustainability i

#### Gradio Setup
Each dropdown choice is a `(label, value)` tuple, so the name shown in the UI maps to the LiteLLM model ID that is passed to `create_brochure`.
Link: https://www.gradio.app/docs/gradio/dropdown#initialization

In [None]:
DEBUG_OUTPUT=True
view = gr.Interface(
    fn=create_brochure,
    inputs=[
        gr.Textbox(label="Company name:"),
        gr.Textbox(label="Landing page URL including http:// or https://"),
        gr.Dropdown(choices=[("GPT 4o Mini", "gpt-4o-mini"), 
                             ("Claude Haiku 3", "claude-3-haiku-20240307"), 
                             ("Gemini 2.0 Flash", "gemini/gemini-2.0-flash")], 
                    label="Select model"),
        gr.Checkbox(label="Stream")
    ],
    outputs=[gr.Markdown(label="Brochure:")],
    flagging_mode="never"
)
view.launch()
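
One plausible cause of the Gradio streaming issue mentioned in the feature list (an assumption, not verified in this notebook) is that Gradio streams output only when `fn` is itself a generator function, and each yielded value replaces the previous output instead of being appended to it. `create_brochure` is an ordinary function that returns a generator object, and its chunks are individual fragments rather than the text so far. The sketch below wraps such a function so that it yields the accumulated Markdown. `accumulate`, `make_gradio_fn`, and `fake_brochure` are hypothetical names, and `fake_brochure` is only a stand-in so the example runs without network access.

```python
def accumulate(chunks):
    """Yield the running concatenation of text chunks."""
    text = ""
    for chunk in chunks:
        text += chunk
        yield text


def make_gradio_fn(brochure_fn):
    """Wrap a create_brochure-style function in a generator function."""

    def gradio_fn(company_name, url, model, streaming):
        result = brochure_fn(company_name, url, model, streaming)
        if streaming:
            # Each yield carries the full text so far
            yield from accumulate(result)
        else:
            # A single yield keeps this a generator function in both modes
            yield result

    return gradio_fn


def fake_brochure(company_name, url, model, streaming):
    # Hypothetical stand-in for create_brochure
    parts = ["# ", company_name, "\n", "Brochure text"]
    return iter(parts) if streaming else "".join(parts)


demo_fn = make_gradio_fn(fake_brochure)
streamed = list(demo_fn("Acme", "https://example.com", "any-model", True))
single = list(demo_fn("Acme", "https://example.com", "any-model", False))
print(streamed[-1])
```

To try this in the interface above, `fn=make_gradio_fn(create_brochure)` could replace `fn=create_brochure`. Whether that resolves the reported bug would need to be confirmed by running it.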