### Basic RAG App: Question & Answering from a Document

In [2]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())
openai_api_key = os.environ["OPENAI_API_KEY"] 


In [3]:
from langchain.llms import OpenAI

In [5]:
# Create the LLM instance (reads OPENAI_API_KEY from the environment)
llm = OpenAI()

**Load the text file**

In [6]:
from langchain.document_loaders import TextLoader

In [9]:
loader = TextLoader("be-good-and-how-not-to-die.txt")

In [10]:
document = loader.load()

**The loader returns a Python list of documents, each with its text in `page_content` and its `metadata`**

In [11]:
print(type(document))

<class 'list'>


In [13]:
print(len(document))

1


In [15]:
print(document[0].metadata)

{'source': 'be-good-and-how-not-to-die.txt'}


In [16]:
print(f"You have {len(document)} document.")

You have 1 document.


In [18]:
print(f"Your document has {len(document[0].page_content)} characters")

Your document has 27419 characters


**Split the document into smaller chunks**

In [19]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [21]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 3000,
    chunk_overlap = 400
)

In [22]:
document_chunks = text_splitter.split_documents(document)

In [23]:
print(f"Now you have {len(document_chunks)} chunks.")

Now you have 12 chunks.
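
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping chunking written in plain Python. It is a simplification, not how `RecursiveCharacterTextSplitter` is implemented: the real splitter prefers to break on separators such as paragraph and line breaks, so its chunk boundaries and chunk count can differ. The function name `split_with_overlap` and the sample string are hypothetical.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Fixed-size sliding window over the text (illustrative only).

    Each chunk starts (chunk_size - chunk_overlap) characters after
    the previous one, so neighbouring chunks share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Hypothetical sample text, small enough to inspect by eye.
sample = "abcdefghijklmnopqrstuvwxyz"
toy_chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(toy_chunks)
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

The last 4 characters of each chunk reappear at the start of the next one. That overlap is what keeps a sentence that straddles a boundary available in at least one chunk.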


**Convert the text chunks into numeric vectors (called "embeddings")**

In [24]:
from langchain.embeddings.openai import OpenAIEmbeddings

In [26]:
# instance of embeddings
embeddings = OpenAIEmbeddings()
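
Embeddings are useful because texts with similar meaning end up with vectors that point in similar directions. The sketch below illustrates that idea with cosine similarity on hand-made 3-dimensional vectors. The vectors and their names are invented for illustration; real embeddings from the API have far more dimensions and are not produced this way.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" (assumed values, not real model output).
startup_advice = [0.9, 0.1, 0.0]
founder_tips = [0.8, 0.2, 0.1]
cooking_recipe = [0.0, 0.2, 0.9]

sim_related = cosine_similarity(startup_advice, founder_tips)
sim_unrelated = cosine_similarity(startup_advice, cooking_recipe)

# The two startup-related vectors score much higher than the unrelated pair.
print(sim_related > sim_unrelated)
```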

**Load the embeddings to a vector database**

In [27]:
from langchain.vectorstores import FAISS

Careful: the next operation sends every chunk to the OpenAI embeddings API, which is billed per token.

In [28]:
# Embed every chunk and store the resulting vectors in a FAISS vector store
stored_embeddings = FAISS.from_documents(document_chunks, embeddings)
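
Conceptually, a vector store keeps each chunk next to its vector and, given a query vector, returns the chunks whose vectors are closest. The following is a brute-force sketch of that lookup in plain Python using Euclidean distance. It is only an illustration of the idea: FAISS uses optimized index structures, and the names `top_k` and `toy_store`, as well as the 2-dimensional vectors, are assumptions made for this example.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vector, store, k=2):
    """Return the texts of the k entries whose vectors are closest to the query."""
    ranked = sorted(
        store,
        key=lambda entry: euclidean_distance(query_vector, entry[1]),
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical store: (chunk text, toy 2-dimensional vector) pairs.
toy_store = [
    ("chunk about perseverance", [0.9, 0.1]),
    ("chunk about user feedback", [0.7, 0.4]),
    ("chunk about fundraising", [0.1, 0.9]),
]

toy_query = [0.8, 0.2]  # assumed embedding of the user's question
nearest = top_k(toy_query, toy_store, k=2)
print(nearest)
# ['chunk about perseverance', 'chunk about user feedback']
```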

**Create a Retrieval Question & Answering Chain**

In [29]:
from langchain.chains import RetrievalQA

In [30]:
QA_chain = RetrievalQA.from_chain_type(
    llm = llm,
    chain_type= "stuff",
    retriever = stored_embeddings.as_retriever()
)
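
The `"stuff"` chain type takes all the retrieved chunks and "stuffs" them into a single prompt together with the question. The sketch below shows the general shape of that step. The function `build_stuff_prompt` and the prompt wording are hypothetical; they are not LangChain's actual template.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Concatenate all retrieved chunks into one context block, then append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Hypothetical retrieved chunks, standing in for the retriever's output.
toy_retrieved = [
    "Startups rarely die in mid keystroke.",
    "Make something people want.",
]
toy_prompt = build_stuff_prompt("What is this article about?", toy_retrieved)
print(toy_prompt)
```

Because every retrieved chunk goes into one prompt, this approach works only as long as the combined text fits in the model's context window, which is one reason to keep an eye on `chunk_size` and the number of chunks retrieved.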

**Now we have a Question & Answering app**

In [31]:
question = """
What is this article about? 
Describe it in less than 100 words.
"""

In [32]:
QA_chain.run(question)

'\nThis article is about the challenges and risks involved in starting a successful company. It discusses the importance of perseverance and not giving up, even when facing difficult obstacles. The author shares personal experiences and advice for avoiding failure and achieving success in the startup world.'

In [33]:
question2 = """
And how does it explain how to create something people want?
"""

In [36]:
QA_chain.run(question2)

' It explains that creating something people want involves having a sense of mission and genuinely wanting to help people, as well as constantly seeking feedback and making improvements based on user needs.'