<a href="https://colab.research.google.com/github/bpalani/blyss-genai-apps/blob/main/google-vertexai/RAG/Testing_and_learning_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### RAG using LangChain Stuff Documents Chain and ChromaDB

*   Load documents from the web and store them in ChromaDB
*   Use a LangChain stuff documents chain to retrieve information relevant to the user's question



In [1]:
#####  Installing required packages including LangChain, Google Genai (from LangChain) and ChromaDB
!pip install -q langchain==0.3.23
!pip install -q langchain-core==0.3.54
!pip install -q langchain-community==0.3.21
!pip install -q langchain-text-splitters==0.3.8
!pip install -q langchain-google-genai==2.1.1
!pip install -q chromadb==1.0.5

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m433.9/433.9 kB[0m [31m5.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.5/2.5 MB[0m [31m30.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m44.4/44.4 kB[0m [31m2.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.9/50.9 kB[0m [31m2.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m40.8/40.8 kB[0m [31m1.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.4/1.4 MB[0m [31m21.0 MB/s[0m eta [36m0:00:00[0m
[?25h[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-generativeai 0.8.4 requires google-ai-generativelanguage==0.6.15, but you have google-ai-generative

In [3]:
##### Set GOOGLE_API_KEY
from google.colab import userdata
import os

GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY

In [4]:
##### Define the LLM
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0.3, max_retries=2)

In [7]:
##### Load RecursiveURL output, split into chunks & use embeddings to store in chromadb vector database
from langchain_community.document_loaders import RecursiveUrlLoader
from langchain_core.documents import Document
import re
from bs4 import BeautifulSoup
from typing import List

def bs4_extractor(html: str) -> str:
    soup = BeautifulSoup(html, "lxml")
    return re.sub(r"\n\n+", "\n\n", soup.text).strip()

def recursive_load_url(url: str) -> List[Document]:
    loader = RecursiveUrlLoader(url, extractor=bs4_extractor)
    docs = loader.load()
    return docs
doc_list = recursive_load_url("https://docs.influxdata.com/influxdb3/core/")

# Split the documents into chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

def chunk_docs(size: int, docs: List[Document]) -> List[Document]:
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=size, chunk_overlap=100)
    documents = text_splitter.split_documents(docs)
    return documents

doc_chunks = chunk_docs(800, doc_list)

# Load the GoogleGenerativeAIEmbeddings model
from langchain_google_genai import GoogleGenerativeAIEmbeddings
gemini_embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Save the chunks to the vector database
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(
    documents=doc_chunks,                    # Data
    embedding=gemini_embeddings,             # Embedding model
    persist_directory="./influxdb_docs.db",  # Directory to save data
)

print(f"Finished storing {len(doc_chunks)} documents into vector database.")

Finished storing 68 documents into vector database.
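
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a minimal sketch in plain Python. This is a simplification: `RecursiveCharacterTextSplitter` prefers to split on separators such as paragraph and line breaks, whereas the hypothetical `sliding_chunks` helper below uses a fixed character window only to show how consecutive chunks share text.

```python
def sliding_chunks(text: str, size: int, overlap: int) -> list[str]:
    """Cut text into windows of `size` characters; neighbours share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
    return chunks

# Hypothetical sample text, small enough to inspect by eye.
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, size=10, overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.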


In [9]:
from langchain_core.prompts import PromptTemplate

##### Prompt template to query Gemini
llm_prompt_template = """You are a researcher for answering user questions.
Use the following context to answer the question.
If you don't know the answer, just say that you don't know.
Use as many sentences as possible to answer the user's question. Explain in detail if necessary.\n
Question: {question} \nContext: {context} \nAnswer:"""

llm_prompt = PromptTemplate.from_template(llm_prompt_template)

print(llm_prompt)

input_variables=['context', 'question'] input_types={} partial_variables={} template="You are a researcher for answering user questions.\nUse the following context to answer the question.\nIf you don't know the answer, just say that you don't know.\nUse as many sentences as possible to answer the user's question. Explain in detail if necessary.\n\nQuestion: {question} \nContext: {context} \nAnswer:"
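
The "stuff" strategy amounts to concatenating every retrieved chunk into the single `{context}` slot of the prompt. A minimal sketch using only `str.format`, assuming a shortened template and two made-up chunks (`TEMPLATE`, `stuff_prompt`, and the sample strings are illustrative names, not part of the notebook):

```python
# Shortened stand-in for the notebook's prompt template.
TEMPLATE = "Question: {question} \nContext: {context} \nAnswer:"

def stuff_prompt(question: str, chunks: list[str]) -> str:
    """Join all chunks with blank lines and place them in the context slot."""
    context = "\n\n".join(chunks)
    return TEMPLATE.format(question=question, context=context)

# Hypothetical retrieved chunks.
prompt = stuff_prompt("How do I write data?", ["Chunk one.", "Chunk two."])
print(prompt)
```

Because every chunk goes into one prompt, the total size of the retrieved text has to fit within the model's context window.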


In [15]:
##### Open vector database as retriever and use LCEL to create RAG stuff documents chain
from langchain_core.tools import tool
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Join the retrieved documents into a single readable string.
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

@tool
def retrieve_info(querystring:str) -> str:
    """Retrieves documentation information about InfluxDB 3 Core"""
    # Expose the vector store as a retriever that returns the 20 most similar chunks.
    retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 20})

    # Create the stuff documents chain using LCEL.
    #
    # The chain implements the following pipeline:
    # 1. The retriever fetches the chunks most relevant to the question from
    #    the Chroma vector store; `format_docs` joins them into `context`.
    # 2. `RunnablePassthrough` forwards the input string unchanged as
    #    `question`.
    # 3. `context` and `question` fill the matching placeholders in the
    #    prompt template.
    # 4. The filled prompt is sent to the LLM (`gemini-2.0-flash`).
    # 5. `StrOutputParser` extracts the text of the model's response as a
    #    plain string.
    rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | llm_prompt
        | llm
        | StrOutputParser()
    )
    return rag_chain.invoke(querystring)
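
What a `similarity` retriever with `k` results does can be sketched without any external service. The example below is an assumption-laden toy: word counts stand in for the Gemini embeddings, and a sorted list stands in for the Chroma index. The names `embed`, `cosine`, `top_k`, and the sample chunks are hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k(query: str, chunks: list[str], k: int) -> list[str]:
    """Return the k chunks most similar to the query, best match first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Hypothetical chunks standing in for the stored documentation.
chunks = [
    "write data with line protocol",
    "query data with sql",
    "install the server",
]
results = top_k("how to write data", chunks, k=2)
print(results)
# ['write data with line protocol', 'query data with sql']
```

In the notebook, `k` is 20 out of 68 stored chunks, so roughly a third of the corpus is stuffed into each prompt; a smaller `k` trades recall for a shorter prompt.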

In [16]:
print("Welcome to InfluxDB 3 Core docs bot! What would you like to do? (e.g., 'Write data', 'Query data', 'exit'): ")
while True:
    user_input = input("> ")
    # Exit only when the user types exactly 'exit'.
    if user_input.strip().lower() == "exit":
        print("Thank you for using InfluxDB Docs Bot. Goodbye!")
        break
    # Tools built with @tool are called through .invoke() with a dict of arguments.
    response = retrieve_info.invoke({"querystring": user_input})
    print(response)

Welcome to InfluxDB 3 Core docs bot! What would you like to do? (e.g., 'Write data', 'Query data', 'exit'): 
> how do i use python processing engine?
To use the Python Processing Engine with InfluxDB 3, follow these steps:

1.  **Activate the Processing Engine:** When starting the InfluxDB 3 Core server, include the `--plugin-dir <PLUGIN_DIR>` option. The PLUGIN_DIR is the file system location where you store plugin files for the Processing Engine.
2.  **Create a Plugin:** A plugin is a Python function with a signature compatible with a Processing Engine trigger. This plugin will receive HTTP request headers and content, allowing it to parse, process, and send data to the database or third-party services.
3.  **Create a Trigger:** When creating a trigger, you specify a plugin, a database, and optional arguments. Triggers define when the plugin is executed and what data it receives. There are different trigger types:

    *   **On WAL flush:** Sends a batch of written data to a plugin.
    