# Q&A Quickstart

This notebook follows LangChain's [Q&A quickstart tutorial](https://python.langchain.com/docs/use_cases/question_answering/quickstart/).

## Setup

### Install Dependencies

In [6]:
%pip install --upgrade --quiet  langchain langchain-community langchainhub langchain-openai langchain-chroma bs4

Note: you may need to restart the kernel to use updated packages.


In [1]:
pip install -qU langchain-openai

Note: you may need to restart the kernel to use updated packages.


### OpenAI API Key

In [1]:
import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

### Imports

In [2]:
import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_chroma import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

ImportError: cannot import name 'hub' from 'langchain' (unknown location)

## Let's code

In [3]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

### 1 - Indexing: Load

In [13]:
import bs4
from langchain_community.document_loaders import WebBaseLoader

# Only keep post title, headers, and content from the full HTML.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs={"parse_only": bs4_strainer},
)
docs = loader.load()

In [14]:
len(docs[0].page_content)

43131

In [15]:
print(docs[0].page_content[:500])



      LLM Powered Autonomous Agents
    
Date: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng


Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.
Agent System Overview#
In


**Go deeper**: A `DocumentLoader` is an object that loads data from a source as a list of `Document` objects ([LINK](https://python.langchain.com/docs/modules/data_connection/document_loaders/)).

### 2 - Indexing: Split

`add_start_index=True` tells the splitter to record, in each chunk's metadata under the key `start_index`, the character offset at which that chunk begins in the original document.

In [16]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(docs)

In [17]:
len(all_splits)

66

In [18]:
len(all_splits[0].page_content)

969

In [19]:
all_splits[10].metadata

{'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/',
 'start_index': 7056}
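
To make `chunk_size`, `chunk_overlap` and `start_index` more concrete, here is a minimal sketch of fixed-width chunking in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines); the function name `split_with_start_index` and the dictionary layout are assumptions made for illustration only.

```python
def split_with_start_index(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-width chunks that overlap, recording each start offset.

    Hypothetical helper for illustration; not part of LangChain.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(
            {
                "page_content": text[start:start + chunk_size],
                "metadata": {"start_index": start},
            }
        )
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 5  # 50 characters
chunks = split_with_start_index(sample, chunk_size=20, chunk_overlap=5)
start_indices = [chunk["metadata"]["start_index"] for chunk in chunks]
print(start_indices)
```

Each chunk shares its last `chunk_overlap` characters with the beginning of the next one, and `start_index` lets you map a chunk back to its position in the source text.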

**Go deeper**: [DocumentTransformer](https://python.langchain.com/docs/integrations/document_transformers/) and its subclass [TextSplitter](https://python.langchain.com/docs/modules/data_connection/document_transformers/).

### 3 - Indexing: Store

We now want to store the chunks created in the previous step. We will store them in [Chroma](https://www.trychroma.com/) and use `OpenAIEmbeddings` to create the embeddings, so that chunks can later be compared with a similarity metric (such as cosine similarity).

In [20]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
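
As a rough illustration of the similarity metric mentioned above, the sketch below computes cosine similarity between two small vectors in plain Python. The vectors are made-up stand-ins for real embeddings, and the metric a vector store actually uses depends on how it is configured, so treat this as an assumption-laden example rather than a description of Chroma's behaviour.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Toy 2-dimensional "embeddings" (real ones have hundreds or thousands of dimensions).
same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```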

**Go deeper**: [Embedding](https://python.langchain.com/docs/use_cases/question_answering/quickstart/#:~:text=text%20to%20embeddings.-,Docs,-%3A%20Detailed%20documentation%20on) and [VectorStore](https://python.langchain.com/docs/use_cases/question_answering/quickstart/#:~:text=and%20querying%20embeddings.-,Docs,-%3A%20Detailed%20documentation%20on) documentation.

### 4 - Retrieval and Generation: Retrieve

Let's build the actual Q&A application. Given a question, the application looks up relevant documents in the vector store, passes them together with the original question to a model, and returns an answer.

The `Retriever` interface offered by LangChain returns relevant documents given a string query. The most common implementation is [VectorStoreRetriever](https://python.langchain.com/docs/modules/data_connection/retrievers/vectorstore/). Any `VectorStore` can be turned into a `Retriever` with `VectorStore.as_retriever()`.

In [22]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})

In [23]:
retrieved_docs = retriever.invoke("What are the approaches to Task Decomposition?")

In [24]:
len(retrieved_docs)

6

In [25]:
print(retrieved_docs[0].page_content)

Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
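
The idea behind a top-`k` similarity retriever can be sketched without any external service. The example below is a toy: `embed` is a hypothetical stand-in that uses word counts instead of a learned embedding model, and `retrieve` is an illustrative name, not a LangChain function. It only shows the ranking step that `search_kwargs={"k": 6}` controls.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an embedding model: a bag of lowercase words."""
    return Counter(text.lower().split())


def similarity(a, b):
    """Cosine-style score between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)  # missing terms count as 0
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents that score highest against the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents, key=lambda doc: similarity(query_vec, embed(doc)), reverse=True
    )
    return ranked[:k]


documents = [
    "task decomposition breaks a big task into smaller steps",
    "chroma stores embeddings for similarity search",
    "tree of thoughts explores multiple reasoning paths",
]
top = retrieve("what is task decomposition", documents, k=1)
print(top)
```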


**Go deeper**: A [Retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/) is an object that returns `Document` objects given a text query.

### 5 - Retrieval and Generation: Generate

In [26]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

We're using a Prompt Template from LangChain [Prompt Hub](https://smith.langchain.com/hub/rlm/rag-prompt)

In [27]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

In [30]:
example_messages = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()

In [29]:
print(example_messages[0].content)

You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: filler question 
Context: filler context 
Answer:
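
Conceptually, the prompt template above is a string with two placeholders that get filled in at call time. The following sketch reproduces that with plain `str.format`; the names `RAG_TEMPLATE` and `fill_prompt` are hypothetical, and the real hub prompt is a chat prompt object rather than a bare string.

```python
# Template text copied from the printed output above; placeholders are filled per call.
RAG_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n"
    "Question: {question} \n"
    "Context: {context} \n"
    "Answer:"
)


def fill_prompt(context, question):
    """Hypothetical helper: substitute the retrieved context and the user question."""
    return RAG_TEMPLATE.format(context=context, question=question)


filled = fill_prompt(context="filler context", question="filler question")
print(filled)
```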


We use the [LCEL Runnable](https://python.langchain.com/docs/expression_language/) protocol to define the chain, which lets us pipe components and functions together. Chains built this way can also be monitored with LangSmith.

In [31]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
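
To build some intuition for the `|` syntax and the dictionary at the head of the chain, here is a minimal sketch of pipe-style composition in plain Python. It is an assumption-based toy, not LangChain's implementation: `Step`, `fan_out` and `passthrough` are hypothetical names that loosely mirror a runnable, the dict of branches, and `RunnablePassthrough`.

```python
class Step:
    """Wraps a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b produces a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


def fan_out(**branches):
    """Run every branch on the same input and collect the results in a dict."""
    return Step(
        lambda value: {name: step.invoke(value) for name, step in branches.items()}
    )


passthrough = Step(lambda value: value)  # forwards the input unchanged

toy_chain = (
    fan_out(context=Step(str.upper), question=passthrough)
    | Step(lambda d: f"{d['context']} / {d['question']}")
)
result = toy_chain.invoke("hi")
print(result)
```

In the real chain, the `context` branch runs the retriever and `format_docs`, while the `question` branch forwards the user's question unchanged to the prompt.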

In [32]:
for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)

Task decomposition involves breaking down complex tasks into smaller and simpler steps using techniques like Chain of Thought or Tree of Thoughts. This process helps transform big tasks into more manageable ones and guides the model in thinking step by step. Task decomposition can be done by prompting the model with specific instructions, task-specific guidance, or human inputs.

**Go deeper**: [ChatModel](https://python.langchain.com/docs/modules/model_io/chat/) and [LLM](https://python.langchain.com/docs/use_cases/question_answering/quickstart/#:~:text=returns%20a%20string.-,Docs,-Integrations%3A%2075%2B%20integrations) documentation.

### Full snippet

In [9]:
# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

# We use Chroma, an open-source embedding database, to store the embeddings of the blog.
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings()) 

# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

In [10]:
rag_chain.invoke("What is Task Decomposition?")

'Task Decomposition involves breaking down complex tasks into smaller and simpler steps to facilitate problem-solving. Techniques like Chain of Thought and Tree of Thoughts help in decomposing tasks by exploring multiple reasoning possibilities at each step. Task decomposition can be achieved through various methods, including using language models with specific prompts or task-specific instructions.'

In [11]:
# cleanup
vectorstore.delete_collection()

## Using Local Models

To run this pipeline with local models, see the [local retrieval Q&A guide](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa/).