# Parent Document Retrieval (Small to Big Retrieval)

### Load and Split Document

In [32]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

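# Load the source text as a list of LangChain Document objects.
raw_documents = TextLoader('./data/tesla.txt', encoding='utf-8').load()

# Small child chunks are embedded and searched; larger parent chunks are what the retriever returns.
child_splitter = RecursiveCharacterTextSplitter(chunk_size=200)
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)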

### Store Documents in Vector Database

In [33]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.storage import InMemoryStore


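embedding_model = OpenAIEmbeddings()  # any LangChain-compatible embedding model can be substituted

# Vector store that indexes the child chunks.
db = Chroma(
    embedding_function=embedding_model,
    persist_directory="./chroma_db",
    collection_name="parent_doc_tesla",
)

# Key-value store that holds the parent chunks (in memory only, so it is lost when the process exits).
parent_store = InMemoryStore()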

## Set Up Retriever

In [34]:
from langchain.retrievers import ParentDocumentRetriever

full_retriever = ParentDocumentRetriever(
    vectorstore=db,
    docstore=parent_store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter
)

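# Split into parent and child chunks, index the children in the vector store,
# and keep the parents in the docstore.
full_retriever.add_documents(raw_documents)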
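
To make the small-to-big idea concrete, here is a minimal, dependency-free sketch of what a parent document retriever does conceptually. It is not LangChain's implementation: the fixed-width `split` helper, the word-overlap `score`, the `ToyParentRetriever` class, and the two sample sentences are all hypothetical stand-ins (the notebook itself uses a recursive splitter and embedding similarity).

```python
import re
import uuid


def split(text: str, size: int) -> list[str]:
    """Naive fixed-width splitter (stand-in for a real text splitter)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def tokens(text: str) -> set[str]:
    """Lowercased alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score(query: str, chunk: str) -> int:
    """Toy relevance score: shared word count (stand-in for embedding similarity)."""
    return len(tokens(query) & tokens(chunk))


class ToyParentRetriever:
    def __init__(self, parent_size: int, child_size: int):
        self.parent_size = parent_size
        self.child_size = child_size
        self.docstore: dict[str, str] = {}      # parent_id -> parent text
        self.index: list[tuple[str, str]] = []  # (child text, parent_id)

    def add_documents(self, texts: list[str]) -> None:
        for text in texts:
            for parent in split(text, self.parent_size):
                parent_id = str(uuid.uuid4())
                self.docstore[parent_id] = parent
                # Each child remembers which parent it was cut from.
                for child in split(parent, self.child_size):
                    self.index.append((child, parent_id))

    def search_children(self, query: str, k: int = 4) -> list[tuple[str, str]]:
        """Return the k best-scoring (child, parent_id) pairs."""
        ranked = sorted(self.index, key=lambda item: score(query, item[0]), reverse=True)
        return ranked[:k]

    def retrieve(self, query: str, k: int = 4) -> list[str]:
        """Search over children, then return their parents without duplicates."""
        parent_ids: list[str] = []
        for _, parent_id in self.search_children(query, k):
            if parent_id not in parent_ids:
                parent_ids.append(parent_id)
        return [self.docstore[pid] for pid in parent_ids]


doc_a = "Tesla never married. He said his chastity helped his scientific work."
doc_b = "Tesla worked on alternating current motors and wireless power."

toy = ToyParentRetriever(parent_size=80, child_size=25)
toy.add_documents([doc_a, doc_b])

toy_query = "Why was Tesla never married?"
best_child, _ = toy.search_children(toy_query, k=1)[0]
print("matched child:", repr(best_child))
print("returned parent:", repr(toy.retrieve(toy_query, k=1)[0]))
```

The match is made against a short fragment, but the caller receives the whole parent it came from. Several children of the same parent collapse into a single returned document.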

In [35]:
query = "What was the motivation behind Tesla to not marry?"

### Query Vector Database

In [24]:
# Searching the vector store directly returns the small child chunks.
sub_docs = db.similarity_search(query)

print(sub_docs[0].page_content)

by not marrying, he had made too great a sacrifice to his work, needed] Tesla chose to never pursue or engage in any known relationships, instead finding all the stimulation he needed in his work.


### Query Retriever

In [25]:
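# The retriever matches on child chunks but returns the parent chunks they belong to.
# (`invoke` replaces the deprecated `get_relevant_documents`.)
retrieved_docs = full_retriever.invoke(query)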

print(retrieved_docs[0].page_content)

Relationships
Tesla was a lifelong bachelor, who had once explained that his chastity was very helpful to his scientific abilities. He once said in earlier years that he felt he could never be worthy enough for a woman, considering women superior in every way. His opinion had started to sway in later years when he felt that women were trying to outdo men and make themselves more dominant. This "new woman" was met with much indignation from Tesla, who felt that women were losing their femininity by trying to be in power. In an interview with the Galveston Daily News on 10 August 1924 he stated, "In place of the soft-voiced, a gentlewoman of my reverent worship, has come the woman who thinks that her chief success in life lies in making herself as much as possible like man—in dress, voice and actions, in sports and achievements of every kind ... The tendency of women to push aside man, supplanting the old spirit of cooperation with him in all the affairs of life, is very disappointing to

## RAG

In [36]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

template = """Answer the following question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

model_name = 'gpt-3.5-turbo-0125'
model = ChatOpenAI(model=model_name)

### Simple RAG from Child Document

In [30]:
# Plain vector-store retriever: returns child chunks only.
only_child_retriever = db.as_retriever()

child_chain = (
    {"context": only_child_retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

child_chain.invoke(query)

'The motivation behind Tesla not marrying was that he felt he had made too great a sacrifice to his work and found all the stimulation he needed in his work.'

### Parent Retrieval RAG

In [37]:
full_chain = (
    {"context": full_retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

full_chain.invoke(query)

"Tesla's motivation to not marry was his belief that his chastity was helpful to his scientific abilities, as well as his feelings of inadequacy and admiration for women. He also felt that women were losing their femininity by trying to be dominant and push aside men in society. Tesla ultimately chose to focus on his work and found all the stimulation he needed in his scientific pursuits."
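
Both chains above pass the retriever's output straight into `{context}`, so the prompt receives the string form of a list of `Document` objects, metadata included. A common pattern is to join only the page text first. The sketch below is one way to do that. `format_docs` is a hypothetical helper that is not part of the original notebook, and `FakeDoc` is a stand-in used only so the snippet runs without LangChain.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document (hypothetical, for illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of the retrieved documents, leaving metadata out."""
    return "\n\n".join(doc.page_content for doc in docs)


docs = [
    FakeDoc("Tesla was a lifelong bachelor.", {"source": "tesla.txt"}),
    FakeDoc("He found stimulation in his work."),
]
context = format_docs(docs)
print(context)
```

With the real retrievers, the helper could be composed into the chain by replacing the context entry with `full_retriever | format_docs` (or `only_child_retriever | format_docs`), leaving the rest of the chain unchanged.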