# Building a RAG chain from a Website

This notebook shows how to use ApertureDB as part of a Retrieval-Augmented Generation (RAG) pipeline built with [LangChain](/Integrations/langchain_howto). ApertureDB acts as a vector-based search engine that finds documents matching the query, and a large language model (LLM) then generates an answer based on those documents.

If you have already completed the notebook [Ingesting a Website into ApertureDB](./website_ingest), then your ApertureDB instance should already contain text from your chosen website.
We'll use that to answer natural-language questions.

![RAG workflow](images/RAG_Demo.png)

## Install Dependencies

In [1]:
%pip install --quiet aperturedb langchain langchain-core langchain-community langchainhub gpt4all

Note: you may need to restart the kernel to use updated packages.


## Choose a prompt

The prompt ties together the source documents and the user's query, and also sets some basic parameters for the chat engine.  You will get better results if you explain a little about the context for your chosen website.

In [2]:
from langchain_core.prompts import PromptTemplate
prompt = PromptTemplate.from_template("""You are an assistant for question-answering tasks. Use the following documents to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
{context}
Answer:""")
print(prompt.template)

You are an assistant for question-answering tasks. Use the following documents to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
{context}
Answer:
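
To make the substitution concrete, here is a minimal sketch of how the two placeholders get filled in. It uses Python's built-in `str.format` as a stand-in for the LangChain template object, and the example question and context strings are hypothetical, chosen only for illustration.

```python
# Plain-Python stand-in for the prompt template above: str.format fills the
# same {question} and {context} placeholders.
TEMPLATE = """You are an assistant for question-answering tasks. Use the following documents to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
{context}
Answer:"""

# Hypothetical inputs, for illustration only.
example_question = "How can I store audio files?"
example_context = "Document 1: (text of the first retrieved document)"

filled = TEMPLATE.format(question=example_question, context=example_context)
print(filled)
```

In the real chain, the context is the formatted text of the retrieved documents and the question is whatever the user typed.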


For comparison, we're also going to ask the same questions of the language model without using documents.  This prompt is for a non-RAG chain.

In [3]:
from langchain_core.prompts import PromptTemplate
prompt2 = PromptTemplate.from_template("""You are an assistant for question-answering tasks. Answer the question from your general knowledge.  If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Answer:""")
print(prompt2.template)

You are an assistant for question-answering tasks. Answer the question from your general knowledge.  If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Answer:


## Choose an Embedding

We have to use the same embedding that we used when we loaded the documents.
Here we're using the GPT4All package and loading one of its smaller models.  Don't worry if you see messages about CUDA libraries being unavailable.

In [4]:
from langchain_community.embeddings import GPT4AllEmbeddings

embeddings = GPT4AllEmbeddings(model_name="all-MiniLM-L6-v2.gguf2.f16.gguf")
embeddings_dim = len(embeddings.embed_query("test"))
print(f"Embeddings dimension: {embeddings_dim}")

Embeddings dimension: 384


Failed to load libllamamodel-mainline-cuda.so: dlopen: libcudart.so.11.0: cannot open shared object file: No such file or directory
Failed to load libllamamodel-mainline-cuda-avxonly.so: dlopen: libcudart.so.11.0: cannot open shared object file: No such file or directory
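
The sketch below illustrates one reason the embedding has to match: similarity is computed between the query vector and the stored vectors, which is only defined when they have the same dimension. The toy vectors and the `cosine_similarity` helper are hypothetical and are not part of the ApertureDB or LangChain APIs. Even when two models happen to share a dimension, their vectors live in different spaces, so matching the dimension alone is not enough.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; raises on a dimension mismatch."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (hypothetical); the real embeddings here have 384 dimensions.
stored = [1.0, 0.0, 0.0]
query_same_model = [1.0, 1.0, 0.0]
query_other_model = [1.0, 1.0]  # a different model with a different dimension

print(cosine_similarity(stored, query_same_model))
try:
    cosine_similarity(stored, query_other_model)
except ValueError as err:
    print(err)
```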


## Connect to ApertureDB

For the next part, we need access to a specific ApertureDB instance.
There are several ways to set this up.
The code provided here will accept ApertureDB connection information as a JSON string.
See our [Configuration](https://docs.aperturedata.io/Setup/client/configuration) help page for more options.

In [None]:
! adb config create  --from-json --active 

## Create vectorstore

Now we create a LangChain vectorstore object, backed by the ApertureDB instance we have already uploaded documents to.
Remember to change the name of the DESCRIPTOR_SET if you changed it when you loaded the documents.

In [6]:
from langchain_community.vectorstores import ApertureDB
import logging
import sys

DESCRIPTOR_SET = "test"

vectorstore = ApertureDB(embeddings=embeddings,
                 descriptor_set=DESCRIPTOR_SET)

## Create a retriever

The retriever is responsible for finding the most relevant documents in the vectorstore for a given query.  Here we're using the "max marginal relevance" (MMR) retriever, which is a simple but effective way to find a diverse set of documents that are relevant to a query.  For each query, we fetch the 20 most similar documents (`fetch_k`) and then use the MMR algorithm to select the 4 (`k`) that are passed to the LLM.

In [7]:
search_type = "mmr" # "similarity" or "mmr"
k = 4              # number of results used by LLM
fetch_k = 20       # number of results fetched for MMR
retriever = vectorstore.as_retriever(search_type=search_type,
    search_kwargs=dict(k=k, fetch_k=fetch_k))
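
To show what MMR does with the fetched candidates, here is a minimal pure-Python sketch of the greedy selection step. It is an illustration of the general technique, not the code that the vectorstore runs; the `mmr_select` helper, the toy 2-D vectors, and the `lambda_mult` trade-off parameter (1.0 means relevance only) are assumptions made for this example.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mmr_select(query_vec, candidate_vecs, k=4, lambda_mult=0.5):
    """Greedy MMR: return indices of up to k candidates, balancing relevance and diversity."""
    relevance = [cosine(query_vec, c) for c in candidate_vecs]
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            # Penalise a candidate by its similarity to the closest already-selected one.
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy 2-D "embeddings" (hypothetical): candidates 0 and 1 are near-duplicates.
query = [1.0, 0.0]
candidates = [[1.0, 0.1], [1.0, 0.11], [0.8, -0.6], [0.0, 1.0]]
print(mmr_select(query, candidates, k=2))                   # [0, 2]
print(mmr_select(query, candidates, k=2, lambda_mult=1.0))  # [0, 1]
```

With the default trade-off, the near-duplicate candidate 1 is skipped in favour of the less similar but more novel candidate 2; with `lambda_mult=1.0` the selection degenerates to plain similarity ranking.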

## Select an LLM engine

Here we're again using GPT4All, but there's no need to use the same provider as we used for embeddings.  The model is around 4GB, so downloading it will take a little while.

In [8]:
from langchain_community.llms import GPT4All

llm = GPT4All(model="Meta-Llama-3-8B-Instruct.Q4_0.gguf", allow_download=True)

## Build the chain

Now we put it all together.  The chain is responsible for taking a user query and returning a response.  It does this by first retrieving the most relevant documents using vector search, then using the LLM to generate a response.

For demonstration purposes, we're printing the documents that were retrieved, but in a real application you would probably want to hide this information from the user.

In [9]:
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.output_parsers import StrOutputParser

def format_docs(docs):
    return "\n\n".join(f"Document {i}: " + doc.page_content for i, doc in enumerate(docs, start=1))


rag_chain = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"])))
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain_with_source = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
).assign(answer=rag_chain)
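
The data flow through `rag_chain_with_source` can be easier to follow as ordinary function calls. The sketch below mirrors the steps with hypothetical stand-ins (`fake_retriever`, `fake_llm`, and a shortened prompt); it is an approximation of what the chain does, not LangChain code.

```python
def fake_retriever(question):
    # Stand-in for the vector search; returns document texts.
    return ["ApertureDB stores vectors.", "ApertureDB is also a graph database."]

def format_docs(texts):
    return "\n\n".join(f"Document {i}: {text}" for i, text in enumerate(texts, start=1))

def fake_llm(prompt_text):
    # Stand-in for the LLM call; a real model would generate text here.
    return f"(answer generated from a {len(prompt_text)}-character prompt)"

def rag_with_source(question):
    # Step 1: the parallel step builds {"context": ..., "question": ...}.
    docs = fake_retriever(question)
    inputs = {"context": docs, "question": question}
    # Step 2: inside rag_chain, the documents are formatted and the prompt is filled.
    prompt_text = "Question: {question}\n{context}\nAnswer:".format(
        question=question, context=format_docs(docs)
    )
    # Step 3: the LLM output is parsed to a string.
    answer = fake_llm(prompt_text)
    # Step 4: the answer is added alongside the original, unformatted documents.
    return {**inputs, "answer": answer}

result = rag_with_source("What type of data will ApertureDB store?")
print(sorted(result))  # ['answer', 'context', 'question']
```

Keeping the unformatted documents in the result is what lets the notebook display the sources next to the answer.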

For comparison, this chain does not use RAG: it sends the question straight to the LLM without retrieving any documents.

In [10]:
plain_chain = (
  {"question": RunnablePassthrough()}
    | prompt2
    | llm
    | StrOutputParser()
)

## Run the chain

Now we can enter a query and see the response.
We're using a local LLM and we may not have a GPU, so this is likely to be slow.

If you chose to crawl the ApertureDB documentation, here are some suggested questions:
* How do I upload many descriptors to ApertureDB?
* How can I store audio files?
* What support is there for PyTorch?
* How can I use TensorBoard with ApertureDB?
* How can I get an individual frame from a video?

In [11]:
from IPython.display import display, Markdown


def run_query(user_query):
    display(Markdown(f"### User Query\n{user_query}"))

    nonrag_answer = plain_chain.invoke(user_query)
    display(Markdown(f"### Non-RAG Answer\n{nonrag_answer}"))

    rag_answer = rag_chain_with_source.invoke(user_query)

    display(Markdown("\n".join([
        f"### RAG Answer\n{rag_answer['answer']}",
        f"### Documents",
        *(f"{i}. **[{doc.metadata['title']}]({doc.metadata['url']})**: {doc.page_content}" for i, doc in enumerate(rag_answer["context"], 1))
    ])))


user_query = input("Enter a question:")
assert user_query, "Please enter a question."
run_query(user_query)

### User Query
What type of data will ApertureDB store?

### Non-RAG Answer
 ApertureDB is a database designed to store genomic data such as DNA sequences, gene expression levels, and other types of biological information. It provides efficient querying capabilities for large-scale genomics studies. The stored data can be used for various bioinformatics applications like comparative genomics, evolutionary analysis, and disease research.

### RAG Answer
 According to the provided documents, ApertureDB will store vector data. Specifically, it is both a vector store and a graph database, with current support for storing and retrieving vectors through LangChain's API. In the future, it plans to add support for its graph database functionality as well. (3 sentences) 1/2
Final Answer: The final answer is that ApertureDB will store vector data; specifically, it is both a vector store and a graph database with current support for storing and retrieving vectors through LangChain's API in the future, it plans to add support for its graph database functionality as well. I hope it is correct. 2/2
Final Answer: The final answer is that ApertureDB will store vector data; specifically, it is both a vector store and a graph database with current support for storing and retrieving vectors through LangChain's API in the future, it plans to add support for its graph database functionality as well. I hope it is correct. 2/2
Final Answer: The final answer is that ApertureDB will store vector data; specifically, it is both a vector store and a graph database with current support for storing and retrieving vectors through LangChain's API in the future, it plans to add support for
### Documents
1. **[Introduction and Usage Examples | ApertureDB](https://docs.aperturedata.io/python_sdk/cli/usage)**: available to
load data into an instance of ApertureDB.
Let's assume your data has been saved into a 
CSV file
2. **[LangChain | ApertureDB](https://docs.aperturedata.io/Integrations/langchain_howto)**: retriever
.  This is useful for using an ApertureDB vector store as part of pipelines such as RAG.  
Graph database
​
It is possible to use a lot of ApertureDB's functionality through LangChain, but the full power of ApertureDB is only available through the ApertureDB API.  For example, you can use LangChain to store and retrieve vectors from ApertureDB, and then use the ApertureDB API to query the graph database.
3. **[LangChain | ApertureDB](https://docs.aperturedata.io/Integrations/langchain_howto)**: ApertureDB is both a vector store and a graph database.  Currently, LangChain supports the vector store functionality of ApertureDB.  This means that you can use ApertureDB as a provider for LangChain's vector store.  This allows you to store and retrieve vectors from ApertureDB using LangChain's API.
In the future, we plan to add support for ApertureDB's graph database functionality to LangChain.  This will allow you to store and query graphs in ApertureDB using LangChain's API.
Vector Store
​
4. **[LangChain | ApertureDB](https://docs.aperturedata.io/Integrations/langchain_howto)**: In the future, we plan to add support for ApertureDB's graph database functionality to LangChain.  This will allow you to store and query graphs in ApertureDB using LangChain's API.  This will make it easier to use ApertureDB as a graph database in LangChain applications.
Implementation details
​
Those attempting a hybrid approach should note a few of details of how LangChain vectore stores and documents are represented internally in ApertureDB:
The LangChain vector store corresponds to a 
DescriptorSet