## 🚅 liteLLM Quick Start Demo
### TL;DR: Call 50+ LLM APIs using the ChatGPT input/output format
https://github.com/BerriAI/litellm

liteLLM is a package that simplifies calling **OpenAI, Azure, Llama2, Cohere, Anthropic, and Huggingface API endpoints**. LiteLLM manages



## Installation and setting Params

In [1]:
!pip install litellm

Collecting litellm
  Downloading litellm-1.72.2-py3-none-any.whl.metadata (39 kB)
Collecting python-dotenv>=0.2.0 (from litellm)
  Downloading python_dotenv-1.1.0-py3-none-any.whl.metadata (24 kB)
Downloading litellm-1.72.2-py3-none-any.whl (8.0 MB)
Downloading python_dotenv-1.1.0-py3-none-any.whl (20 kB)
Installing collected packages: python-dotenv, litellm
Successfully installed litellm-1.72.2 python-dotenv-1.1.0


In [3]:
from litellm import completion
import os

## Set your API keys
- liteLLM reads credentials from your `.env` file, environment variables, or a key manager.

Set keys only for the models you want to use in the cell below.

In [10]:
# Only set keys for the LLMs you want to use
os.environ['OPENAI_API_KEY'] = "" #@param
os.environ["ANTHROPIC_API_KEY"] = "" #@param
os.environ["REPLICATE_API_KEY"] = "" #@param
os.environ["COHERE_API_KEY"] = "" #@param
os.environ["AZURE_API_BASE"] = "" #@param
os.environ["AZURE_API_VERSION"] = "" #@param
os.environ["AZURE_API_KEY"] = "" #@param
os.environ["OPENROUTER_API_KEY"] = "" #@param  # never commit a real key to a shared notebook
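
Before making any calls, it can help to confirm which providers actually have credentials set, since an empty string counts as missing. The sketch below uses only the standard library and the variable names from the cell above; `PROVIDER_KEYS` and `configured_providers` are hypothetical names introduced for illustration and are not part of liteLLM.

```python
import os

# Environment variable names copied from the cell above.
# The grouping by provider is an assumption made for this sketch.
PROVIDER_KEYS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "replicate": ["REPLICATE_API_KEY"],
    "cohere": ["COHERE_API_KEY"],
    "azure": ["AZURE_API_BASE", "AZURE_API_VERSION", "AZURE_API_KEY"],
    "openrouter": ["OPENROUTER_API_KEY"],
}


def configured_providers(env=None):
    """Return the providers whose variables are all set to non-empty values."""
    env = os.environ if env is None else env
    return [
        name
        for name, keys in PROVIDER_KEYS.items()
        if all(env.get(key) for key in keys)
    ]


print("Providers with credentials:", configured_providers())
```

Running this after the key cell gives a quick list of the providers worth calling in the sections that follow.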

## Call ChatGPT

In [11]:
completion(model="openrouter/openai/gpt-3.5-turbo", messages=[{ "content": "what's the weather in SF","role": "user"}])


Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.



InternalServerError: litellm.InternalServerError: InternalServerError: OpenAIException - Connection error.

## Call Claude-2

In [12]:
completion(model="claude-2", messages=[{ "content": "what's the weather in SF","role": "user"}])


Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.



AuthenticationError: litellm.AuthenticationError: AnthropicException - {"type":"error","error":{"type":"authentication_error","message":"x-api-key header is required"}}
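
The failures above come from the environment (a connection error and a missing API key), not from the request itself. Because every provider takes the same `model` and `messages` arguments, one way to cope is to try several models in order and keep the first response. The sketch below is an assumption and not part of liteLLM: `first_successful` is a hypothetical helper, and the callable is passed in as an argument so the example runs offline with a stand-in function.

```python
def first_successful(call, models, messages):
    """Try each model in order and return (model, response) for the first success.

    `call` is any function with the signature call(model=..., messages=...),
    for example litellm's `completion`.
    """
    errors = {}
    for model in models:
        try:
            return model, call(model=model, messages=messages)
        except Exception as exc:  # broad on purpose: provider errors vary
            errors[model] = exc
    raise RuntimeError(f"all models failed: {errors}")


# Stand-in for `completion` so the sketch runs without keys or network access.
def fake_completion(model, messages):
    if model == "model-without-key":
        raise PermissionError("missing API key")
    return {"model": model, "echo": messages[-1]["content"]}


messages = [{"content": "what's the weather in SF", "role": "user"}]
used, response = first_successful(
    fake_completion, ["model-without-key", "model-with-key"], messages
)
print(used, response)
```

In the notebook, the same helper could be called as `first_successful(completion, ["claude-2", "command-nightly"], messages)`, assuming the corresponding keys are set.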

## Call llama2 on replicate

In [None]:
model = "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1"
completion(model=model, messages=[{ "content": "what's the weather in SF","role": "user"}])

<ModelResponse chat.completion id=chatcmpl-3151c2eb-b26f-4c96-89b5-ed1746b219e0 at 0x138b87e50> JSON: {
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": " I'm happy to help! However, I must point out that the question \"what's the weather in SF\" doesn't make sense as \"SF\" could refer to multiple locations. Could you please clarify which location you are referring to? San Francisco, California or Sioux Falls, South Dakota? Once I have more context, I would be happy to provide you with accurate and reliable information.",
        "role": "assistant",
        "logprobs": null
      }
    }
  ],
  "id": "chatcmpl-3151c2eb-b26f-4c96-89b5-ed1746b219e0",
  "created": 1695490237.714101,
  "response_ms": 12109.565,
  "model": "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1",
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 78,
    "total_tokens":

## Call Command-Nightly

In [None]:
completion(model="command-nightly", messages=[{ "content": "what's the weather in SF","role": "user"}])

<ModelResponse chat.completion id=chatcmpl-dc0d8ead-071d-486c-a111-78975b38794b at 0x1389725e0> JSON: {
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": " As an AI model I don't have access to real-time data, so I can't tell",
        "role": "assistant",
        "logprobs": null
      }
    }
  ],
  "id": "chatcmpl-dc0d8ead-071d-486c-a111-78975b38794b",
  "created": 1695490235.936903,
  "response_ms": 1022.6759999999999,
  "model": "command-nightly",
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 19,
    "total_tokens": 25
  }
}
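
The output above follows the ChatGPT response layout: the reply text sits under `choices[0].message.content` and the token counts under `usage`. The sketch below pulls those fields out of a plain dictionary copied from that output. It assumes the object returned by `completion` supports the same key-based access; `summarize` is a hypothetical helper name, not a liteLLM function.

```python
# A plain dict mirroring the Command-Nightly output shown above.
response = {
    "object": "chat.completion",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": " As an AI model I don't have access to real-time data, so I can't tell",
                "role": "assistant",
            },
        }
    ],
    "model": "command-nightly",
    "usage": {"prompt_tokens": 6, "completion_tokens": 19, "total_tokens": 25},
}


def summarize(response):
    """Extract the reply text, model name, and token total from a response."""
    message = response["choices"][0]["message"]
    return {
        "model": response["model"],
        "text": message["content"].strip(),
        "total_tokens": response["usage"]["total_tokens"],
    }


summary = summarize(response)
print(summary["model"], summary["total_tokens"])
print(summary["text"])
```

Since the layout is the same for every provider in this notebook, one extraction function like this should cover all of the calls.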

## Call Azure OpenAI

For Azure OpenAI calls, add the `azure/` prefix to `model`. For example, if your deployment ID is `chatgpt-test`, set `model="azure/chatgpt-test"`.

In [None]:
completion(model="azure/chatgpt-v-2", messages=[{ "content": "what's the weather in SF","role": "user"}])

<OpenAIObject chat.completion id=chatcmpl-820kZyCwbNvZATiLkNmXmpxxzvTKO at 0x138b84ae0> JSON: {
  "id": "chatcmpl-820kZyCwbNvZATiLkNmXmpxxzvTKO",
  "object": "chat.completion",
  "created": 1695490231,
  "model": "gpt-35-turbo",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Sorry, as an AI language model, I don't have real-time information. Please check your preferred weather website or app for the latest weather updates of San Francisco."
      }
    }
  ],
  "usage": {
    "completion_tokens": 33,
    "prompt_tokens": 14,
    "total_tokens": 47
  },
  "response_ms": 1499.529
}
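
The `azure/` prefix rule described in this section is easy to get wrong when deployment IDs come from configuration. A minimal sketch of a helper that builds the `model` string is shown below; `azure_model` is a hypothetical name introduced here, and the only behavior taken from the text is that the deployment ID is prefixed with `azure/`.

```python
def azure_model(deployment_id: str) -> str:
    """Build the `model` argument for an Azure OpenAI deployment ID."""
    deployment_id = deployment_id.strip()
    if not deployment_id:
        raise ValueError("deployment_id must not be empty")
    # Avoid double-prefixing if the caller already passed a full model string.
    if deployment_id.startswith("azure/"):
        return deployment_id
    return f"azure/{deployment_id}"


print(azure_model("chatgpt-test"))
```

The result can be passed directly as `completion(model=azure_model("chatgpt-test"), messages=...)`, assuming the three `AZURE_*` variables are set.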