# Document Question Answering with local persistence

An example of using Chroma DB and LangChain to do question answering over documents, with a locally persisted database.
Embeddings and documents are stored on disk once, then reloaded in later sessions.

In [None]:
from dotenv import load_dotenv

# Load the environment variables from .env
load_dotenv()

In [None]:
#from .autonotebook import tqdm as notebook_tqdm
from langchain_chroma import Chroma
from langchain_huggingface.embeddings.huggingface import HuggingFaceEmbeddings  # Local execution
from langchain_community.embeddings import HuggingFaceHubEmbeddings #Legacy
from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_groq import ChatGroq
from langchain.chains import VectorDBQA
from langchain_community.document_loaders import TextLoader

## Load the Database from disk, and create the chain
Be sure to pass the same `persist_directory` and `embedding_function` as you did when you instantiated the database. Initialize the chain we will use for question answering.
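Before loading, it can help to confirm that the directory actually holds something. As far as I know, Chroma creates a fresh, empty store when the path does not exist, so a typo in `persist_directory` tends to show up only as empty retrieval results. Below is a minimal sketch using only the standard library. The helper name `looks_persisted` is hypothetical, and the check is deliberately loose: it only tests that the directory exists and is not empty.

```python
from pathlib import Path


def looks_persisted(persist_directory: str) -> bool:
    """Return True if the path is an existing, non-empty directory.

    Hypothetical helper: it does not verify that the contents are a valid
    Chroma database, only that something was written there.
    """
    path = Path(persist_directory)
    return path.is_dir() and any(path.iterdir())


# 'ChromaDB' is the directory name used in this notebook.
if not looks_persisted("ChromaDB"):
    print("Nothing persisted at 'ChromaDB'; check the path before loading.")
```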

In [None]:
# Now we can load the persisted database from disk, and use it as normal. 

persist_directory = 'ChromaDB'
embedding = HuggingFaceEndpointEmbeddings()


vectordb = Chroma(persist_directory=persist_directory, embedding_function=embedding)
retriever = vectordb.as_retriever()
retriever.invoke("Who stopped the tanks?")

## Ask questions!

Now we can use the chain to ask questions!

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


input_chain = (
    {"context": retriever | format_docs, "input": RunnablePassthrough()}
    | prompt
)

print(input_chain.invoke("Who stopped the tanks?"))

In [None]:
llm = ChatGoogleGenerativeAI(model='gemini-pro')  # Does not allow chains with ChatPromptTemplate
# The complete rules for what can be sent to Gemini are:
#   - The role must alternate between "user" (HumanMessage) and "model" (AIMessage).
#     (If you're using functions, a "function" role (FunctionMessage or ToolMessage) can stand in for "user".)
#   - The history must start with a "user" role; starting with an AIMessage creates a "model" role instead.
#   - The last message sent must have the "user" (or "function") role.
#   - SystemMessages are a separate issue: older Gemini versions did not handle them at all,
#     and newer ones handle them through system_instruction.

rag_chain = (
    {"context": retriever | format_docs, "input": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("What happened with the Ukrainian citizens?")
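
The role rules quoted in the comments above can be expressed as a small check. This is a plain-Python sketch, not part of LangChain or any Gemini SDK. The function name `valid_gemini_roles` is hypothetical, roles are passed as simple strings, and only the "user"/"model" alternation is covered (the "function" role is ignored).

```python
def valid_gemini_roles(roles: list[str]) -> bool:
    """Check a role sequence against the rules listed in the comments.

    - the history starts with "user"
    - roles alternate between "user" and "model"
    - the last message has the "user" role
    """
    if not roles or roles[0] != "user" or roles[-1] != "user":
        return False
    # Adjacent messages must differ, and only the two known roles are allowed.
    return all(
        a != b and {a, b} <= {"user", "model"}
        for a, b in zip(roles, roles[1:])
    )


print(valid_gemini_roles(["user", "model", "user"]))  # True
print(valid_gemini_roles(["system", "user"]))         # False: does not start with "user"
```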

## Solution
Use a single plain prompt, or use ChatGroq.
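
To see what a single prompt looks like once it is filled in, the sketch below reproduces the formatting step with plain Python string formatting instead of LangChain. The `Doc` class and the sample passages are made-up stand-ins for retrieved documents, `build_single_prompt` is a hypothetical helper, and the template is a shortened version of the one used in this notebook.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for a retrieved document; only page_content is used."""
    page_content: str


TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question."
    "\n\n"
    "{context}"
    "\n\n"
    "Original question: {input}"
)


def format_docs(docs):
    # Join document texts with blank lines, as the notebook's helper does.
    return "\n\n".join(doc.page_content for doc in docs)


def build_single_prompt(docs, question):
    # Instructions, context and question end up in one string,
    # so there is no separate system message.
    return TEMPLATE.format(context=format_docs(docs), input=question)


docs = [Doc("First placeholder passage."), Doc("Second placeholder passage.")]
single_prompt = build_single_prompt(docs, "Who stopped the tanks?")
print(single_prompt)
```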

In [None]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

rag_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
    "\n\n"
    "Original question: {input}"
)

prompt = PromptTemplate.from_template(rag_prompt)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


input_chain = (
    {"context": retriever | format_docs, "input": RunnablePassthrough()}
    | prompt
)

print(input_chain.invoke("Who stopped the tanks?"))

## Cleanup

When you're done with the database, you can delete it from disk. You can delete the specific collection you're working with (if you have several), or delete the entire database by nuking the persistence directory.
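
Deleting the persistence directory can be done from Python with the standard library. This is a minimal sketch, assuming nothing else still needs the files. The helper name `remove_persist_directory` is hypothetical, and on some platforms the delete may fail while the database is still open in the running process.

```python
import shutil
from pathlib import Path


def remove_persist_directory(persist_directory: str) -> bool:
    """Delete the directory and everything in it; return True if it existed."""
    path = Path(persist_directory)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

Calling `remove_persist_directory('ChromaDB')` would remove the directory used in this notebook. The deletion cannot be undone.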

In [None]:
# To clean up, you can delete the collection
#vectordb.delete_collection()

# Or just nuke the persist directory
