# Migrating from ConversationalRetrievalChain

The [`ConversationalRetrievalChain`](https://python.langchain.com/api_reference/langchain/chains/langchain.chains.conversational_retrieval.base.ConversationalRetrievalChain.html) was an all-in-one chain that combined retrieval-augmented generation with chat history, allowing you to "chat with" your documents.

The advantages of switching to the LCEL implementation are similar to those described in the [`RetrievalQA` migration guide](./retrieval_qa.ipynb):

- Clearer internals. `ConversationalRetrievalChain` hides an entire question-rephrasing step that resolves references in the latest query against the chat history.
  - This means the class contains two sets of configurable prompts, LLMs, etc.
- More easily return source documents.
- Support for runnable methods like streaming and async operations.
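
To make the hidden step concrete, here is a minimal, library-free sketch of the two-stage flow described in the first bullet: condense the follow-up into a standalone question, then retrieve and answer. The `condense`, `retrieve`, and `answer` callables and the `(role, text)` history format are hypothetical stand-ins for the LLM and retriever calls, not LangChain APIs.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (role, text) pairs; an assumed format


def conversational_rag(
    question: str,
    history: History,
    condense: Callable[[str, History], str],
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[str]], str],
) -> dict:
    # Step 1 (hidden in the legacy chain): rewrite the follow-up as a
    # standalone question, but only when there is history to resolve against.
    standalone = condense(question, history) if history else question
    # Step 2: retrieve with the standalone question and answer from the docs.
    docs = retrieve(standalone)
    return {
        "question": standalone,
        "context": docs,
        "answer": answer(standalone, docs),
    }


# Toy stand-ins so the sketch runs without an LLM or a vector store.
def toy_condense(question: str, history: History) -> str:
    return f"{question} (about: {history[-2][1]})"


def toy_retrieve(query: str) -> List[str]:
    return [f"doc matching '{query}'"]


def toy_answer(query: str, docs: List[str]) -> str:
    return f"answered '{query}' using {len(docs)} doc(s)"


history = [("human", "What are autonomous agents?"), ("ai", "LLM-powered systems.")]
result = conversational_rag(
    "How do they plan?", history, toy_condense, toy_retrieve, toy_answer
)
print(result["question"])
```

Each stage needs its own prompt and model configuration, which is why the legacy class carries two sets of configurable components.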

Below are equivalent implementations with custom prompts.
Both use the same ingestion code to load a [blog post by Lilian Weng](https://lilianweng.github.io/posts/2023-06-23-agent/) on autonomous agents into a local vector store.

## Shared setup

For both versions, we'll need to load the data with the `WebBaseLoader` document loader, split it with `RecursiveCharacterTextSplitter`, and add it to an in-memory `FAISS` vector store.

We will also instantiate a chat model to use.

In [1]:
%pip install --upgrade --quiet langchain-community langchain langchain-openai faiss-cpu beautifulsoup4


In [2]:
import os
from getpass import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass()


In [3]:
# Load docs
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai.chat_models import ChatOpenAI
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

# Split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

# Store splits
vectorstore = FAISS.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())

# LLM
llm = ChatOpenAI()
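
For intuition about what `chunk_size=500, chunk_overlap=0` means, the sketch below splits text into fixed-size character windows. This is a deliberate simplification: `RecursiveCharacterTextSplitter` additionally prefers to break on separators such as paragraph and line breaks, which the hypothetical `fixed_size_chunks` helper here does not attempt.

```python
from typing import List


def fixed_size_chunks(
    text: str, chunk_size: int = 500, chunk_overlap: int = 0
) -> List[str]:
    """Split text into character windows (simplified; ignores separators)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    # Each window starts `step` characters after the previous one.
    step = chunk_size - chunk_overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]


chunks = fixed_size_chunks("a" * 1200)
print([len(c) for c in chunks])
```

With no overlap, concatenating the chunks reproduces the original text, so no content is duplicated across entries in the vector store.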



## Legacy

<details open>

In [4]:
from langchain.chains import ConversationalRetrievalChain
from langchain_core.prompts import ChatPromptTemplate

condense_question_template = """
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""

condense_question_prompt = ChatPromptTemplate.from_template(condense_question_template)

qa_template = """
You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer
the question. If you don't know the answer, say that you
don't know. Use three sentences maximum and keep the
answer concise.

Chat History:
{chat_history}

Other context:
{context}

Question: {question}
"""

qa_prompt = ChatPromptTemplate.from_template(qa_template)

convo_qa_chain = ConversationalRetrievalChain.from_llm(
    llm,
    vectorstore.as_retriever(),
    condense_question_prompt=condense_question_prompt,
    combine_docs_chain_kwargs={
        "prompt": qa_prompt,
    },
)

convo_qa_chain.invoke(
    {
        "question": "What are autonomous agents?",
        "chat_history": "",
    }
)


{'question': 'What are autonomous agents?',
 'chat_history': '',
 'answer': 'Autonomous agents are empowered by LLM to handle complex scientific tasks independently, such as designing, planning, and executing experiments. These agents can browse the Internet, read documentation, execute code, and leverage other LLMs. Boiko et al. (2023) investigated LLM-powered autonomous agents for scientific discovery.'}

</details>

## LCEL

<details open>

In [25]:
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

condense_question_system_template = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)

condense_question_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", condense_question_system_template),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
    ]
)
history_aware_retriever = create_history_aware_retriever(
    llm,
    vectorstore.as_retriever(),
    condense_question_prompt
)

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
    ]
)
qa_chain = create_stuff_documents_chain(llm, qa_prompt)

convo_qa_chain = create_retrieval_chain(history_aware_retriever, qa_chain)

chain_result = convo_qa_chain.invoke(
    {
        "input": "What are autonomous agents?",
        "chat_history": [],
    }
)

chain_result

{'input': 'What are autonomous agents?',
 'chat_history': [],
 'context': [Document(id='64f848e1-c664-4f6a-86c8-0a0905e859a9', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'title': "LLM Powered Autonomous Agents | Lil'Log", 'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection
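
Because the LCEL result keeps the retrieved documents under the `context` key, source attribution reduces to reading their metadata. The sketch below uses a small stand-in `Doc` dataclass instead of LangChain's `Document` so that it runs on its own; the `page_content` and `metadata` attribute names mirror the real class, while the `unique_sources` helper and the sample values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    """Stand-in for a retrieved document (hypothetical, for illustration)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def unique_sources(result: dict) -> List[str]:
    """Collect distinct `source` values from the retrieved context, in order."""
    seen: List[str] = []
    for doc in result.get("context", []):
        source = doc.metadata.get("source")
        if source and source not in seen:
            seen.append(source)
    return seen


# Sample result shaped like the chain output above (values are illustrative).
sample_result = {
    "input": "What are autonomous agents?",
    "chat_history": [],
    "context": [
        Doc("chunk one", {"source": "https://lilianweng.github.io/posts/2023-06-23-agent/"}),
        Doc("chunk two", {"source": "https://lilianweng.github.io/posts/2023-06-23-agent/"}),
    ],
    "answer": "...",
}
print(unique_sources(sample_result))
```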

In [128]:
def build_chain(llm, vectorstore):
    condense_question_system_template = (
        "Given a chat history and the latest user question "
        "which might reference context in the chat history, "
        "formulate a standalone question which can be understood "
        "without the chat history. Do NOT answer the question, "
        "just reformulate it if needed and otherwise return it as is."
    )

    condense_question_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", condense_question_system_template),
            ("placeholder", "{chat_history}"),
            ("human", "{input}"),
        ]
    )
    history_aware_retriever = create_history_aware_retriever(
        llm,
        vectorstore.as_retriever(),
        condense_question_prompt
    )

    system_prompt = (
        "You are an assistant for question-answering tasks. "
        "Use the following pieces of retrieved context to answer "
        "the question. If you don't know the answer, say that you "
        "don't know. Use three sentences maximum and keep the "
        "answer concise."
        "\n\n"
        "{context}"
    )

    qa_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", system_prompt),
            ("placeholder", "{chat_history}"),
            ("human", "{input}"),
        ]
    )
    qa_chain = create_stuff_documents_chain(llm, qa_prompt)

    convo_qa_chain = create_retrieval_chain(history_aware_retriever, qa_chain)
    return convo_qa_chain

def chat_with_retrieval_chain(input, chat_history):
    result = convo_qa_chain.invoke(
        {
            "input": input,
            "chat_history": chat_history,
        }
    )
    chat_history.extend([
        ("human", input),
        ("ai", result["answer"])
    ])
    return result["answer"], chat_history

def chat_with_history_aware(input, chat_history):
    # Answer from the chat history alone, without retrieving documents.
    system_template = (
        "You are an assistant for question-answering tasks. "
        "Use the following chat history as context to answer "
        "the question. If you don't know the answer, say that you "
        "don't know. Use three sentences maximum and keep the "
        "answer concise."
        "\n\n"
        "chat_history: {chat_history}"
        "\n"
        "query: {query}"
    )
    history_prompt = ChatPromptTemplate.from_template(system_template)
    chain = history_prompt | llm
    result = chain.invoke({
        "query": input,
        "chat_history": chat_history
    })
    # Record the new turn so later calls can see it.
    chat_history.extend([
        ("human", input),
        ("ai", result.content)
    ])
    return result.content, chat_history

def chat(input, chat_history):
    # Route the query: retrieve documents, use only the chat history, or answer directly.
    query_classification_template = (
        "Your task is to determine whether the query requires retrieving documents "
        "and whether it requires the chat history as context.\n"
        "Answer with exactly one of the following labels:\n"
        "  'YES': the query requires retrieving documents\n"
        "  'NO': the query requires no additional context\n"
        "  'ONLY_CHAT_HISTORY': the query only requires awareness of the chat history\n"
        "query: {query}"
    )
    query_classification_prompt = ChatPromptTemplate.from_template(query_classification_template)
    chain = query_classification_prompt | llm
    result = chain.invoke({
        "query": input
    })
    ai_msg = result.content
    if "yes" in ai_msg.lower():
        print("The query demands retrieving documents\n")
        return chat_with_retrieval_chain(input, chat_history)
    elif "only_chat_history" in ai_msg.lower():
        print("The query demands only to aware of history of the chat\n")
        return chat_with_history_aware(input, chat_history)
    elif "no" in ai_msg.lower():
        print("The query demands no additional contexts\n")
        system_template = (
            "You are an assistant for question-answering tasks. "
            "If you don't know the answer, say that you "
            "don't know. Use three sentences maximum and keep the "
            "answer concise."
            "\n\n"
            "query: {query}"
        )
        answer_prompt = ChatPromptTemplate.from_template(system_template)
        chain = answer_prompt | llm
        result = chain.invoke({
            "query": input
        })
        chat_history.extend([
            ("human", input),
            ("ai", result.content)
        ])
        return result.content, chat_history
    else:
        # Fail loudly: callers unpack an (answer, chat_history) tuple, so
        # returning None here would surface as a confusing TypeError.
        raise ValueError(f"Unexpected classifier output: {ai_msg!r}")
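
Substring checks such as `"no" in ai_msg.lower()` can misfire when the model adds extra words (for example, "I don't know" contains "no"). One possible hardening, sketched here with a hypothetical `parse_route` helper that is not part of LangChain, is to normalize the reply and accept only an exact label.

```python
from typing import Optional

# The three labels the classification prompt asks the model to choose from.
LABELS = ("YES", "NO", "ONLY_CHAT_HISTORY")


def parse_route(ai_msg: str) -> Optional[str]:
    """Return the matching label, or None if the reply is not exactly one label."""
    # Drop surrounding whitespace, quotes, backticks, and trailing periods.
    cleaned = ai_msg.strip().strip("'\"`.").upper()
    return cleaned if cleaned in LABELS else None


print(parse_route("'YES'"))
print(parse_route("I don't know"))
```

Returning `None` for anything unexpected lets the caller decide whether to retry the classification, fall back to retrieval, or raise an error.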

In [129]:
convo_qa_chain = build_chain(llm, vectorstore)

In [130]:
answer, chat_history = chat(
    "According to the document, what are autonomous agents?",
    chat_history=[],
)
print(answer)

The query demands retrieving documents

Autonomous agents are LLM-empowered agents capable of handling autonomous design, planning, and performance of complex scientific experiments. These agents can browse the Internet, read documentation, execute code, call robotics experimentation APIs, and leverage other LLMs for tasks like developing novel anticancer drugs.


In [131]:
chat_history

[('human', 'According to the document, what are autonomous agents?'),
 ('ai',
  'Autonomous agents are LLM-empowered agents capable of handling autonomous design, planning, and performance of complex scientific experiments. These agents can browse the Internet, read documentation, execute code, call robotics experimentation APIs, and leverage other LLMs for tasks like developing novel anticancer drugs.')]

In [132]:
answer_2, chat_history_2 = chat(
    "Elaborate to me our past conversation",
    chat_history=chat_history,
)
print(answer_2)

The query demands only to aware of history of the chat

In our previous conversation, we discussed autonomous agents being LLM-empowered entities capable of autonomously designing, planning, and executing complex scientific experiments. These agents can access the Internet, read documentation, run code, interact with robotics APIs, and collaborate with other LLMs for tasks such as creating new anticancer drugs.


In [133]:
chat_history_2

[('human', 'According to the document, what are autonomous agents?'),
 ('ai',
  'Autonomous agents are LLM-empowered agents capable of handling autonomous design, planning, and performance of complex scientific experiments. These agents can browse the Internet, read documentation, execute code, call robotics experimentation APIs, and leverage other LLMs for tasks like developing novel anticancer drugs.'),
 ('human', 'Elaborate to me our past conversation'),
 ('ai',
  'In our previous conversation, we discussed autonomous agents being LLM-empowered entities capable of autonomously designing, planning, and executing complex scientific experiments. These agents can access the Internet, read documentation, run code, interact with robotics APIs, and collaborate with other LLMs for tasks such as creating new anticancer drugs.')]

In [135]:
answer_3, chat_history_3 = chat(
    "What is my first question to you?",
    chat_history=chat_history,
)
print(answer_3)

The query demands only to aware of history of the chat

Your first question to me was, "According to the document, what are autonomous agents?"


### Load Retrieval Chain

In [None]:
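# `create_retrieval_chain` is a plain function that takes a retriever and a
# combine-documents chain; it has no `from_llm_and_retriever` constructor.
retrieval_chain = create_retrieval_chain(vectorstore.as_retriever(), qa_chain)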

</details>

## Next steps

You've now seen how to migrate existing usage of some legacy chains to LCEL.

Next, check out the [LCEL conceptual docs](/docs/concepts/lcel) for more background information.