# Give me data!

![retrieval](retrieval.png)


In [16]:
# Setup

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.document_loaders import PyPDFLoader
from langchain.vectorstores import FAISS

gpt35 = ChatOpenAI(
    temperature=0, callbacks=[StreamingStdOutCallbackHandler()], streaming=True
)

gpt4 = ChatOpenAI(
    model="gpt-4",
    temperature=0,
    callbacks=[StreamingStdOutCallbackHandler()],
    streaming=True,
)


loader = PyPDFLoader("2305.10601.pdf")
pages = loader.load_and_split()
text_splitter = RecursiveCharacterTextSplitter(
    # Chunks of at most 1000 characters, overlapping by 200 so context carries across boundaries.
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
)

documents = text_splitter.split_documents(pages)

faiss_index = FAISS.from_documents(documents, OpenAIEmbeddings())

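# Note: this memory object is not passed to the chain below; chat history is
# supplied manually on each call instead.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)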

qa = ConversationalRetrievalChain.from_llm(
    gpt4,
    faiss_index.as_retriever(),
    condense_question_llm=gpt35,
    verbose=True,
)
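
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of fixed-size character chunking with overlap. It is a simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks before falling back to raw characters, so its chunks will not match these exactly. The function name `chunk_with_overlap` is hypothetical and not part of LangChain.

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character windows of at most chunk_size characters,
    where consecutive windows share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Toy input (an assumption for illustration, not taken from the PDF).
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.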

In [17]:
# Example of a data-powered agent
chat_history = []
query = "What is the main principle of tree of thought?"

result = qa({"question": query, "chat_history": chat_history})



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the users question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
Tree of Thoughts: Deliberate Problem Solving
with Large Language Models
Shunyu Yao
Princeton UniversityDian Yu
Google DeepMindJeffrey Zhao
Google DeepMindIzhak Shafran
Google DeepMind
Thomas L. Grifﬁths
Princeton UniversityYuan Cao
Google DeepMindKarthik Narasimhan
Princeton University
Abstract
Language models are increasingly being deployed for general problem solving
across a wide range of tasks, but are still conﬁned to token-level, left-to-right
decision-making processes during inference. This means they can fall short in
tasks that require exploration, strategic lookahead, or where initial decisions play
a pivotal role. To surmount these challenges, we introduce a new
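
The log above shows the "stuff" strategy: the retrieved chunks are concatenated and pasted into a single system prompt, followed by the user's question. A rough pure-Python sketch of that assembly is below. The system text is copied from the log; the function name `build_stuff_prompt`, the blank-line separator between chunks, the placeholder chunk strings, and the message dict format are assumptions for illustration, not LangChain's API.

```python
# System text as it appears in the chain's verbose output.
SYSTEM_TEMPLATE = (
    "Use the following pieces of context to answer the users question. \n"
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n"
    "----------------\n"
    "{context}"
)


def build_stuff_prompt(chunks: list[str], question: str) -> list[dict]:
    """Join all retrieved chunks into one context string and wrap it,
    together with the question, as chat messages (hypothetical helper)."""
    context = "\n\n".join(chunks)  # assumed separator between chunks
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE.format(context=context)},
        {"role": "user", "content": question},
    ]


# Placeholder chunks standing in for whatever the retriever returns.
question = "What is the main principle of tree of thought?"
messages = build_stuff_prompt(["chunk A", "chunk B"], question)
for message in messages:
    print(f"[{message['role']}]\n{message['content']}\n")
```

Because every retrieved chunk lands in one prompt, the number of chunks times `chunk_size` has to fit in the model's context window along with the question and the answer.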

In [18]:
chat_history = [(query, result["answer"])]
query = "How is the backtracing implemented?"
result = qa({"question": query, "chat_history": chat_history})



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: What is the main principle of tree of thought?
Assistant: The main principle of the Tree of Thoughts (ToT) framework is to enable language models to explore multiple reasoning paths over coherent units of text called "thoughts" for problem-solving. ToT frames any problem as a search over a tree, where each node represents a partial solution with the input and the sequence of thoughts so far. This approach allows language models to incorporate planning, lookahead, and backtracking, which are characteristic of human problem-solving, and helps overcome the limitations of token-level, left-to-right decision-making processes in traditional language models.
Follow Up Input: How is the backtracing implemented?
Standalone question:
Can you expl