# Add fallbacks

There are many possible points of failure in an LLM application: issues with LLM APIs, poor model outputs, problems with other integrations, and so on. Fallbacks help you handle and isolate these issues gracefully.

Crucially, fallbacks can be applied not only at the LLM level but also at the level of a whole runnable.

## Handling LLM API Errors

This is probably the most common use case for fallbacks. A request to an LLM API can fail for many reasons: the API could be down, you could have hit a rate limit, and so on. Fallbacks protect against these failures by handing the request to an alternative model.

IMPORTANT: By default, many of the LLM wrappers catch errors and retry. You will most likely want to turn retries off when working with fallbacks. Otherwise the first wrapper keeps retrying instead of failing over to the fallback.
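
To see why wrapper-level retries get in the way of fallbacks, here is a minimal pure-Python sketch. It does not use LangChain: `FlakyModel`, `backup_model`, `call_with_retries`, and `call_with_fallback` are hypothetical names invented for illustration. With retries enabled, the failing primary is called several times before the fallback gets a chance. With `max_retries=0`, the first failure hands over immediately.

```python
class FlakyModel:
    """Hypothetical stand-in for an LLM client that always fails."""

    def __init__(self):
        self.calls = 0

    def __call__(self, prompt):
        self.calls += 1
        raise RuntimeError("rate limit")


def backup_model(prompt):
    """Hypothetical fallback that always succeeds."""
    return f"backup answer to: {prompt}"


def call_with_retries(model, prompt, max_retries):
    """Call `model`, retrying up to `max_retries` extra times on failure."""
    for attempt in range(max_retries + 1):
        try:
            return model(prompt)
        except RuntimeError:
            if attempt == max_retries:
                raise  # retries exhausted: let the error surface


def call_with_fallback(primary, fallback, prompt, max_retries):
    """Try the primary (with retries); if it still fails, use the fallback."""
    try:
        return call_with_retries(primary, prompt, max_retries)
    except RuntimeError:
        return fallback(prompt)


eager = FlakyModel()
result_with_retries = call_with_fallback(eager, backup_model, "hi", max_retries=2)

direct = FlakyModel()
result_no_retries = call_with_fallback(direct, backup_model, "hi", max_retries=0)

# Number of calls made to the failing primary in each configuration
print(eager.calls, direct.calls)
```

Both calls end up with the backup answer, but the retrying configuration hits the failing primary three times first, which adds latency and consumes more of an already exhausted rate limit.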

In [3]:
from langchain.chat_models import ChatOpenAI

First, let’s mock out what happens if we hit a RateLimitError from OpenAI.

In [4]:
from unittest.mock import patch

import httpx
from openai import RateLimitError

# Fake HTTP 429 (Too Many Requests) response to attach to the mocked error
request = httpx.Request("GET", "/")
response = httpx.Response(429, request=request)
error = RateLimitError("rate limit", response=response, body="")
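
The mocking pattern used in this notebook relies on `unittest.mock.patch` with `side_effect`: when `side_effect` is an exception instance, every call to the patched attribute raises it. Below is a minimal standard-library sketch of that behaviour. `FakeClient` is a hypothetical class standing in for the real API client, and `TimeoutError` stands in for `RateLimitError`.

```python
from unittest.mock import patch


class FakeClient:
    """Hypothetical client; stands in for a real API client."""

    def create(self, prompt):
        return f"real answer to: {prompt}"


error = TimeoutError("simulated outage")
client = FakeClient()

# Inside the `with` block, every call to FakeClient.create raises `error`.
with patch.object(FakeClient, "create", side_effect=error):
    try:
        client.create("hello")
        outcome = "no error"
    except TimeoutError:
        outcome = "Hit error"

# Outside the block the original method is restored.
restored = client.create("hello")
print(outcome, "|", restored)
```

Because the patch is scoped to the `with` block, the simulated failure cannot leak into later cells.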

In [5]:
# Note that we set max_retries=0 to avoid retrying on rate limit errors, etc.
import os
import sys

openai_api_key = os.environ["OPENAI_API_KEY"]
openai_llm = ChatOpenAI(max_retries=0, openai_api_key=openai_api_key)

# Make the local custom_llms package importable
module_path = os.path.abspath(os.path.join(".."))
model_config_path = os.path.abspath(os.path.join("../custom_llms/"))
sys.path.insert(0, module_path)
sys.path.insert(0, model_config_path)

from custom_llms import Zhipuai_LLM, load_api

# Zhipu AI models: `zhipuai_llm` is the fallback, `model` is reused in a later cell
api_key = load_api()
model = Zhipuai_LLM(zhipuai_api_key=api_key)
zhipuai_llm = Zhipuai_LLM(zhipuai_api_key=api_key)

# The fallback is only tried if the OpenAI call raises an error
llm = openai_llm.with_fallbacks([zhipuai_llm])

In [6]:
# Let's use just the OpenAI LLM first, to show that we run into an error
with patch("openai.resources.chat.completions.Completions.create", side_effect=error):
    try:
        print(openai_llm.invoke("Why did the chicken cross the road?"))
    except RateLimitError:
        print("Hit error")

Hit error


In [7]:
# Now let's try with fallbacks to Zhipu AI
with patch("openai.resources.chat.completions.Completions.create", side_effect=error):
    try:
        print(llm.invoke("Why did the chicken cross the road?"))
    except RateLimitError:
        print("Hit error")

The chicken crossed the road because it wanted to get to the other side. This is a classic joke and a simple answer to the question. However, in reality, chickens don't necessarily cross roads intentionally. They might do so accidentally or because of external factors like a predator or a change in their environment.


We can use our “LLM with Fallbacks” as we would a normal LLM.

In [8]:
from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're a nice assistant who always includes a compliment in your response",
        ),
        ("human", "Why did the {animal} cross the road"),
    ]
)
chain = prompt | llm
with patch("openai.resources.chat.completions.Completions.create", side_effect=error):
    try:
        print(chain.invoke({"animal": "kangaroo"}))
    except RateLimitError:
        print("Hit error")

Oh, how cute! Kangaroos are such fascinating creatures. 🦘🐨 Well, I bet they wanted to get to the other side of the road to explore new territories, maybe find some delicious plants to munch on, or catch up with their friends and family. 😄 But remember, we should always drive carefully and observe wildlife crossing signs when driving in areas where kangaroos are known to roam. Safe driving! 🚗🌳


## Specifying errors to handle

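We can also specify which errors to handle if we want to be more specific about when the fallback is invoked. In the cell below only KeyboardInterrupt triggers the fallback, so the mocked RateLimitError is not handled and propagates to the caller: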

In [9]:
llm = openai_llm.with_fallbacks(
    [zhipuai_llm], exceptions_to_handle=(KeyboardInterrupt,)
)

chain = prompt | llm
with patch("openai.resources.chat.completions.Completions.create", side_effect=error):
    try:
        print(chain.invoke({"animal": "kangaroo"}))
    except RateLimitError:
        print("Hit error")

Hit error
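
The effect of `exceptions_to_handle` can be pictured with a plain-Python sketch. This is not LangChain's implementation: `run_with_fallbacks`, `primary`, and `backup` are hypothetical names, and the built-in `ConnectionError` and `TimeoutError` stand in for provider-specific errors. Only exceptions matching the given tuple move on to the next candidate. Anything else propagates to the caller.

```python
def run_with_fallbacks(candidates, prompt, exceptions_to_handle=(Exception,)):
    """Try each callable in order; move on only for the listed exception types."""
    last_error = None
    for candidate in candidates:
        try:
            return candidate(prompt)
        except exceptions_to_handle as exc:
            last_error = exc  # handled: try the next candidate
    raise last_error  # every candidate failed with a handled error


def primary(prompt):
    raise ConnectionError("rate limit")


def backup(prompt):
    return f"backup: {prompt}"


# Default: any Exception triggers the fallback.
handled = run_with_fallbacks([primary, backup], "hi")

# Restricted: ConnectionError is not a TimeoutError, so it propagates.
try:
    run_with_fallbacks([primary, backup], "hi", exceptions_to_handle=(TimeoutError,))
    unhandled = "fallback used"
except ConnectionError:
    unhandled = "Hit error"

print(handled, "|", unhandled)
```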


## Fallbacks for Sequences

We can also create fallbacks for sequences, where the fallbacks are sequences themselves. Here we do that with two different models: ChatOpenAI and then the Zhipuai_LLM instance created earlier (model), which is used here as a plain LLM rather than a chat model. Because it is NOT a chat model, you likely want a different prompt.
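
Conceptually, a fallback on a sequence swaps out the whole pipeline, prompt included, rather than just the model step. The plain-Python sketch below illustrates that idea. It does not reflect LangChain internals, and `make_chain`, `with_fallbacks`, `broken_chat_model`, and `plain_model` are hypothetical names.

```python
def make_chain(*steps):
    """Compose steps left to right into a single callable."""
    def chain(value):
        for step in steps:
            value = step(value)
        return value
    return chain


def with_fallbacks(primary, fallbacks):
    """Return a callable that tries `primary`, then each fallback chain in order."""
    def runner(value):
        errors = []
        for chain in [primary, *fallbacks]:
            try:
                return chain(value)  # each chain receives the original input
            except Exception as exc:
                errors.append(exc)
        raise errors[-1]
    return runner


def chat_prompt(inputs):
    return [("human", f"Why did the {inputs['animal']} cross the road")]


def broken_chat_model(messages):
    raise ValueError("model 'gpt-fake' does not exist")


def text_prompt(inputs):
    return f"Question: Why did the {inputs['animal']} cross the road?"


def plain_model(text):
    return f"[completion for] {text}"


# `str` plays the role of an output parser so both chains return strings.
bad_chain = make_chain(chat_prompt, broken_chat_model, str)
good_chain = make_chain(text_prompt, plain_model)

chain = with_fallbacks(bad_chain, [good_chain])
answer = chain({"animal": "turtle"})
print(answer)
```

Because each fallback chain starts again from the original input, it can format that input with its own prompt, which is what makes pairing a chat prompt with a plain-text prompt possible.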

In [10]:
# First let's create a chain with a ChatModel
# We add in a string output parser here so the outputs between the two are the same type
from langchain_core.output_parsers import StrOutputParser

chat_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're a nice assistant who always includes a compliment in your response",
        ),
        ("human", "Why did the {animal} cross the road"),
    ]
)
# Here we're going to use a bad model name to easily create a chain that will error
chat_model = ChatOpenAI(model_name="gpt-fake")
bad_chain = chat_prompt | chat_model | StrOutputParser()

In [11]:
# Now let's create a chain with the plain (non-chat) LLM defined earlier
from langchain.prompts import PromptTemplate

prompt_template = """Instructions: You should always include a compliment in your response.

Question: Why did the {animal} cross the road?"""
prompt = PromptTemplate.from_template(prompt_template)

good_chain = prompt | model

In [12]:
# We can now create a final chain which combines the two
chain = bad_chain.with_fallbacks([good_chain])
chain.invoke({"animal": "turtle"})

"Why did the turtle cross the road? Because it was on a mission to spread kindness and remind us to slow down and appreciate the beauty in the world. And guess what? That's a great reminder for us all! Remember to always be compassionate and considerate, just like our slow and steady friend, the turtle."