## Introduction
This notebook guides you through building a simple retrieval-augmented generation (RAG) application using LangChain, ChromaDB, and OpenAI's GPT-4 LLM.

![](../assets/4-RAG.png)

### RAG: Retrieval Augmented Generation
Retrieval Augmented Generation (RAG) pairs a language model with external content retrieval so that answers are grounded in specific content. Because the model is handed relevant passages from a database at question time, its responses can be more accurate and contextually relevant than answers drawn from its training data alone.

The process involves several key steps:
- **Content Retrieval and Storage**: First, we ingest and store our content in a searchable format. This involves fetching content from a URL, splitting it into manageable parts, embedding these parts for semantic search, and storing them in a Vector DB.
- **Question Answering**: When a user question is posed, the system retrieves relevant content from the Vector DB and uses it to generate an answer with the LLM.

This notebook will walk you through setting up a RAG system using ChromaDB for content storage and retrieval, and OpenAI's GPT-4 for question answering.
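
The two steps above can be sketched without any external services. The toy below is only an illustration under loud assumptions: `split_text`, `embed`, `cosine_similarity`, and `retrieve` are hypothetical helpers, the "embedding" is a plain word-count vector rather than a learned model, and a Python list stands in for the Vector DB. The notebook cells that follow use LangChain, ChromaDB, and an LLM instead.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 60) -> list[str]:
    """Group sentences into chunks of at most roughly chunk_size characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > chunk_size:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, store: list[tuple[str, Counter]], k: int = 1) -> list[str]:
    """Return the k stored chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Step 1: content retrieval and storage (split, embed, store)
content = (
    "Chroma stores embeddings and documents. "
    "LangChain provides loaders and text splitters. "
    "The LLM writes the final answer from retrieved context."
)
store = [(chunk, embed(chunk)) for chunk in split_text(content)]

# Step 2: question answering (here we only build the prompt an LLM would receive)
question = "Which component stores embeddings?"
context = "\n\n".join(retrieve(question, store))
prompt = f"Context: {context}\n\nQuestion: {question}"
print(prompt)
```

In a real pipeline the word-count vectors would be replaced by model embeddings and the final prompt would be sent to the LLM rather than printed.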

In [None]:
from dotenv import load_dotenv
import os

# Load environment variables from .env file
load_dotenv()

# Run the basic example from the LangChain Chroma vector store page

[Read more here](https://python.langchain.com/v0.2/docs/integrations/vectorstores/chroma/)

# Basic Example

In [None]:
import rich
from goob_ai import debugger

In [None]:
# import
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from langchain_text_splitters import CharacterTextSplitter

# load the document (it is split into chunks in a later cell)
loader = TextLoader("example_data/state_of_the_union.txt")
documents = loader.load()


In [None]:
rich.inspect(loader)

rich.inspect(documents)

In [None]:
# split it into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)


In [None]:
rich.inspect(text_splitter)

rich.inspect(docs)

In [None]:
# create the open-source embedding function
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")


In [None]:
# load it into Chroma
db = Chroma.from_documents(docs, embedding_function)

# query it
query = "What did the president say about Ketanji Brown Jackson"
# keep the search results separate so `docs` still holds every chunk for later cells
results = db.similarity_search(query)

# print results
print(results[0].page_content)

# Persistent saving to disk example

In [None]:
# save to disk
db2 = Chroma.from_documents(docs, embedding_function, persist_directory="./chroma_db")
results = db2.similarity_search(query)

# load from disk
db3 = Chroma(persist_directory="./chroma_db", embedding_function=embedding_function)
results = db3.similarity_search(query)
print(results[0].page_content)

# Passing a Chroma client into LangChain

In [None]:
import chromadb

persistent_client = chromadb.PersistentClient()
collection = persistent_client.get_or_create_collection("collection_name")
collection.add(ids=["1", "2", "3"], documents=["a", "b", "c"])

langchain_chroma = Chroma(
    client=persistent_client,
    collection_name="collection_name",
    embedding_function=embedding_function,
)

print("There are", langchain_chroma._collection.count(), "documents in the collection")

# Use Chroma in a Docker container


In [None]:
# create the chroma client
import uuid

import chromadb
from chromadb.config import Settings

client = chromadb.HttpClient(host="localhost", port="8010", settings=Settings(allow_reset=True))
client.reset()  # resets the database
collection = client.create_collection("my_collection")
for doc in docs:
    collection.add(
        ids=[str(uuid.uuid1())], metadatas=doc.metadata, documents=doc.page_content
    )

# tell LangChain to use our client and collection name
db4 = Chroma(
    client=client,
    collection_name="my_collection",
    embedding_function=embedding_function,
)
query = "What did the president say about Ketanji Brown Jackson"
# keep the search results separate so `docs` still holds the chunks added above
results = db4.similarity_search(query)
print(results[0].page_content)

# Chromadb: Update and Delete

While building toward a real application, you will want to go beyond adding data and also update and delete it.

Chroma has users provide IDs to simplify the bookkeeping here. An ID can be the name of the file, or a combined hash like `filename_paragraphNumber`.
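
One way to build such IDs is sketched below, under assumptions: `make_chunk_ids` is a hypothetical helper (not part of Chroma or LangChain), and the `source` values are made-up paths standing in for the `source` entry each chunk carries in its metadata. Deterministic IDs mean the same chunk maps to the same ID on every run, which makes later updates and deletes easier to target.

```python
from collections import defaultdict
from pathlib import Path


def make_chunk_ids(sources: list[str]) -> list[str]:
    """Return one ID per chunk in the form '<filename>_<paragraphNumber>'."""
    counters: dict[str, int] = defaultdict(int)
    ids = []
    for source in sources:
        name = Path(source).name  # drop the directory part of the path
        counters[name] += 1       # running chunk number within this file
        ids.append(f"{name}_{counters[name]}")
    return ids


# Hypothetical `source` values, one per chunk, in document order
sources = [
    "example_data/state_of_the_union.txt",
    "example_data/state_of_the_union.txt",
    "example_data/other.txt",
]
print(make_chunk_ids(sources))
```

With real chunks, the list could be built as `make_chunk_ids([d.metadata["source"] for d in docs])` and passed through the `ids=` argument of `Chroma.from_documents`.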

Chroma supports all of these operations, though some of them are still being integrated all the way through the LangChain interface. Additional workflow improvements will be added soon.

Here is a basic example showing how to do various operations:

In [None]:
# create simple ids
ids = [str(i) for i in range(1, len(docs) + 1)]

# add data
example_db = Chroma.from_documents(docs, embedding_function, ids=ids)
docs = example_db.similarity_search(query)
print(docs[0].metadata)

# update the metadata for a document
docs[0].metadata = {
    "source": "example_data/state_of_the_union.txt",
    "new_value": "hello world",
}
example_db.update_document(ids[0], docs[0])
print(example_db._collection.get(ids=[ids[0]]))

# delete the last document
print("count before", example_db._collection.count())
example_db._collection.delete(ids=[ids[-1]])
print("count after", example_db._collection.count())

# Use OpenAI Embeddings


In [None]:
from langchain_openai import OpenAIEmbeddings


embeddings = OpenAIEmbeddings()
new_client = chromadb.EphemeralClient()
openai_lc_client = Chroma.from_documents(
    docs, embeddings, client=new_client, collection_name="openai_collection"
)

query = "What did the president say about Ketanji Brown Jackson"
docs = openai_lc_client.similarity_search(query)
print(docs[0].page_content)

# Use OpenAI Embeddings with Chroma in Docker

In [None]:
import os
from goob_ai.aio_settings import AioSettings, aiosettings

os.environ["FAKE"] = aiosettings.openai_api_key.get_secret_value()
# confirm the variable is set without echoing the secret in the cell output
"FAKE" in os.environ


# rich.print(aiosettings)

In [None]:
# create the chroma client
import uuid

import os

import chromadb
from chromadb.config import Settings
from langchain_openai import OpenAIEmbeddings
from getpass import getpass

OPENAI_API_KEY = getpass()

os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY


In [None]:

embeddings = OpenAIEmbeddings()

client = chromadb.HttpClient(host="localhost", port="8010", settings=Settings(allow_reset=True))
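# client.reset()  # resets the database
# get_or_create_collection avoids an error if the collection already exists when the cell is re-run
collection = client.get_or_create_collection("openai_collection")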

openai_lc_client = Chroma.from_documents(
    docs, embeddings, client=client, collection_name="openai_collection"
)

query = "What did the president say about Ketanji Brown Jackson"
docs = openai_lc_client.similarity_search(query)
print(docs[0].page_content)

# # for doc in docs:
# #     collection.add(
# #         ids=[str(uuid.uuid1())], metadatas=doc.metadata, documents=doc.page_content
# #     )

# # # tell LangChain to use our client and collection name
# # db4 = Chroma(
# #     client=client,
# #     collection_name="my_collection",
# #     embedding_function=embedding_function,
# # )
# # query = "What did the president say about Ketanji Brown Jackson"
# # docs = db4.similarity_search(query)
# # print(docs[0].page_content)

# Other Information: Similarity search with score
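
To make "lower is better" concrete, the sketch below computes two common distance measures by hand for small made-up vectors. Nothing here calls Chroma; the vectors and the helper names `cosine_distance` and `squared_l2_distance` are assumptions for illustration. Which metric a collection actually reports depends on how it was configured, so treat the numbers only as a demonstration that a smaller distance means a closer match.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 minus the cosine similarity of a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def squared_l2_distance(a: list[float], b: list[float]) -> float:
    """Sum of squared coordinate differences between a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


query_vec = [1.0, 0.0]
close = [0.8, 0.6]  # unit length, points mostly the same way as the query
far = [0.0, 1.0]    # unit length, orthogonal to the query

for name, vec in [("close", close), ("far", far)]:
    print(
        name,
        round(cosine_distance(query_vec, vec), 3),
        round(squared_l2_distance(query_vec, vec), 3),
    )
```

For unit-length vectors the squared L2 distance is twice the cosine distance, so both measures rank results in the same order.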

In [None]:
# The returned score is a distance, so a lower score means a closer match.
# The metric depends on the collection's configuration (Chroma defaults to squared L2 unless cosine is set).
docs = db.similarity_search_with_score(query)
docs[0]

# Test Goob-ai chroma_service



In [None]:
# # NOTE: This is from GP, let's play with this later
# from langchain_community.document_loaders import WebBaseLoader
# from langchain_openai import AzureOpenAIEmbeddings
# from langchain_text_splitters import RecursiveCharacterTextSplitter
# from langchain_chroma import Chroma
# import bs4

# # Load, chunk and index the contents of the blog.
# loader = WebBaseLoader(
#     web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
#     bs_kwargs=dict(
#         parse_only=bs4.SoupStrainer(
#             class_=("post-content", "post-title", "post-header")
#         )
#     ),
# )
# docs = loader.load()

# # Split the content into manageable chunks for better retrieval.
# text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
# splits = text_splitter.split_documents(docs)

# # Embed the chunks and store them in ChromaDB for efficient retrieval.
# vectorstore = Chroma.from_documents(documents=splits, embedding=AzureOpenAIEmbeddings(azure_deployment=os.environ["AZURE_EMBEDDINGS_DEPLOYMENT"]))

In [None]:
# import os
# from langchain_core.runnables import RunnablePassthrough
# from langchain_core.prompts import ChatPromptTemplate
# from langchain_openai import AzureChatOpenAI
# from langchain_core.output_parsers import StrOutputParser

# # Set up the RAG chain for retrieving and generating answers.
# retriever = vectorstore.as_retriever()
# system_prompt = ("""
# You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
# Context: {context}
# """)

# prompt = ChatPromptTemplate.from_messages(
#     [
#         ("system", system_prompt),
#         ("human", "{question}"),
#     ]
# )


# def format_docs(docs):
#     return "\n\n".join(doc.page_content for doc in docs)


# # Initialize the model with our deployment of Azure OpenAI
# model = AzureChatOpenAI(azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT"])

# rag_chain = (
#     {"context": retriever | format_docs, "question": RunnablePassthrough()}
#     | prompt
#     | model
#     | StrOutputParser()
# )

In [None]:
# question = "What is tool usage?"
# rag_chain.invoke(question)

In [None]:
import os
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import AzureChatOpenAI
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_community.vectorstores.chroma import Chroma
from langchain_core.vectorstores import VectorStoreRetriever
from goob_ai.services.chroma_service import ChromaService
from goob_ai.llm_manager import LlmManager

client = ChromaService.client
test_collection_name = "gp_demos"

db: Chroma = ChromaService.add_to_chroma(
    path_to_document="https://lilianweng.github.io/posts/2023-06-23-agent/",
    collection_name=test_collection_name,
    embedding_function=None,
)

# Set up the RAG chain for retrieving and generating answers.
retriever: VectorStoreRetriever = db.as_retriever()
system_prompt = ("""
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Context: {context}
""")

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{question}"),
    ]
)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# Get the chat model from LlmManager; the Azure OpenAI deployment is kept below as a commented-out alternative
# model = AzureChatOpenAI(azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT"])
model = LlmManager().llm

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

In [None]:
question = "What is tool usage?"
rag_chain.invoke(question)

# Readthedocs collection

In [1]:
from goob_ai.services.chroma_service import ChromaService, DATA_PATH, CHROMA_PATH
from goob_ai.utils import file_functions
from loguru import logger as LOGGER
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

client = ChromaService.client
test_collection_name = "readthedocs"

documents = []

d = file_functions.tree(DATA_PATH)
result = file_functions.filter_pdfs(d)

for filename in result:
    LOGGER.info(f"Loading document: {filename}")
    db = ChromaService.add_to_chroma(
        path_to_document=f"{filename}",
        collection_name=test_collection_name,
        embedding_function=None,
    )

embedding_function = OpenAIEmbeddings()

db = Chroma(
    client=client,
    collection_name=test_collection_name,
    embedding_function=embedding_function,
)


# query it
query = "How do I enable syntax highlighting with rich?"
docs = db.similarity_search(query)

2024-07-08 21:16:06.596 | DEBUG    | goob_ai.utils.file_functions:tree:623 - directory -> /Users/malcolm/dev/bossjones/goob_ai/src/goob_ai/services/../data/chroma/documents
2024-07-08 21:16:06.597 | DEBUG    | goob_ai.utils.file_functions:tree:626 - directory -> /Users/malcolm/dev/bossjones/goob_ai/src/goob_ai/services/../data/chroma/documents
2024-07-08 21:16:06.597 | DEBUG    | goob_ai.utils.file_functions:tree:628 - directory -> /Users/malcolm/dev/bossjones/goob_ai/src/goob_ai/services/../data/chroma/documents
2024-07-08 21:16:06.599 | INFO     | __main__:<module>:16 - Loading document: /Users/malcolm/dev/bossjones/goob_ai/src/goob_ai/data/chroma/documents/opencv-tutorial-readthedocs-io-en-latest.pdf
2024-07-08 21:16:06.599 | DEBUG

+ /Users/malcolm/dev/bossjones/goob_ai/src/goob_ai/services/../data/chroma/documents
    + opencv-tutorial-readthedocs-io-en-latest.pdf
    + pillow-readthedocs-io-en-latest.pdf
    + rich-readthedocs-io-en-latest.pdf
    + state_of_the_union.txt


AttributeError: 'NoneType' object has no attribute 'load'