# Step 1: Set up the LLM app
This step follows the [Q&A quickstart from LangChain](https://python.langchain.com/docs/use_cases/question_answering/quickstart) to build a retrieval-augmented Q&A app over a [blog post](https://lilianweng.github.io/posts/2023-06-23-agent/).

In [2]:
%reload_ext autoreload
%autoreload 2

import dotenv
dotenv.load_dotenv("../.env", override=True)


True

In [3]:
import bs4
from langchain import hub
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from operator import itemgetter

In [8]:
# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

def format_to_string_list(docs):
    return [doc.page_content for doc in docs]

def concat_string(str_list):
    return "\n\n".join(str_list)

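# Stage 1 builds a dict from the question: the retrieved chunks as a list,
# the same chunks joined into one string for the prompt, and the question itself.
rag_chain = (
    {"context_list": retriever | format_to_string_list,
     "context": retriever | format_to_string_list | concat_string,
     "question": RunnablePassthrough()}
    # Piping into a runnable turns the dict above into a chain step; without it,
    # `dict | dict` would be evaluated as a plain Python dict operation.
    | RunnablePassthrough()
    # Stage 2 returns the generated answer plus the list of retrieved chunks.
    | {"answer": prompt | llm | StrOutputParser(),
       "context": itemgetter("context_list")}
)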
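
To make the shape of the chain's input and output easier to follow, here is a minimal plain-Python sketch of the same data flow. It does not use LangChain: `fake_retriever`, `fake_answer`, and `run_chain` are hypothetical stand-ins for the vector-store retriever, the `prompt | llm | StrOutputParser()` pipeline, and `rag_chain.invoke`, and the two chunk strings are made up. The sketch also retrieves once, whereas the chain above wires the retriever into two separate branches.

```python
# Plain-Python sketch of the rag_chain data flow (no LangChain required).
# All names except concat_string are hypothetical stand-ins for illustration.

def fake_retriever(question):
    # Stand-in for retriever | format_to_string_list: returns fixed chunks.
    return ["chunk one", "chunk two"]

def concat_string(str_list):
    # Same helper as in the notebook: join chunks with blank lines.
    return "\n\n".join(str_list)

def fake_answer(inputs):
    # Stand-in for prompt | llm | StrOutputParser(): reports what it received.
    return (
        f"answered {inputs['question']!r} "
        f"using {len(inputs['context'])} characters of context"
    )

def run_chain(question):
    # Stage 1: fan the question out into three keys.
    context_list = fake_retriever(question)
    stage1 = {
        "context_list": context_list,
        "context": concat_string(context_list),
        "question": question,
    }
    # Stage 2: generate the answer and expose the chunk list as "context".
    return {
        "answer": fake_answer(stage1),
        "context": stage1["context_list"],
    }

result = run_chain("What is Task Decomposition?")
print(result["context"])
```

The point of the two-stage shape is that the prompt receives the joined string under `context`, while the caller gets back the individual chunks under the same key name, which is convenient for inspecting what was retrieved.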

In [9]:
rag_chain.invoke("What is Task Decomposition?")

{'answer': "Task Decomposition is a technique that breaks down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks to aid in the interpretation of the model's thinking process. This process can be enhanced by prompting techniques like Chain of Thought or Tree of Thoughts.",
 'context': ['Fig. 1. Overview of a LLM-powered autonomous agent system.\nComponent One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.',
  'Fig. 1. Overview of a LLM-powered a