## Key components
Reliable RAG improves on naive RAG by addressing three problems:
1. Retrieval of irrelevant documents
2. Hallucinated answers that are not grounded in the retrieved information
3. Missing citations for the sources used to generate the response

How does it do this?
1. Use an LLM to filter retrieved documents
2. Something
3. Something
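
As a rough illustration of step 1, the sketch below filters a list of retrieved texts with a pluggable relevance grader. The names `filter_relevant` and `keyword_grader` are hypothetical and not part of this notebook or of LlamaIndex; `keyword_grader` is only a stand-in so the example runs offline, and in practice it would be replaced by a call to an LLM that answers "relevant" or "not relevant".

```python
from typing import Callable, List


def filter_relevant(
    question: str,
    documents: List[str],
    is_relevant: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the documents that the grader judges relevant to the question."""
    return [doc for doc in documents if is_relevant(question, doc)]


def keyword_grader(question: str, document: str) -> bool:
    """Offline stand-in for an LLM grader (assumption, not the real method).

    A document counts as relevant if it contains any question word
    longer than four characters.
    """
    words = {w.strip("?.,!").lower() for w in question.split()}
    keywords = {w for w in words if len(w) > 4}
    text = document.lower()
    return any(keyword in text for keyword in keywords)


retrieved = [
    "Reflection is an agentic design pattern in which the model critiques its own output.",
    "The weather in Paris is mild in spring.",
]
kept = filter_relevant("What are the agentic design patterns?", retrieved, keyword_grader)
print(kept)
```

Passing the grader in as a function keeps the filtering logic independent of any particular model or prompt.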

In [2]:
!poetry add llama-index llama-index-readers-web

The following packages are already present in the pyproject.toml and will be skipped:

  - llama-index

If you want to update it to the latest compatible version, you can use `poetry update package`.
If you prefer to upgrade it to the latest available version, you can use `poetry add package@latest`.

Using version ^0.2.3 for llama-index-readers-web

Updating dependencies
Resolving dependencies...

In [4]:
!poetry add llama-index-vector-stores-chroma

Using version ^0.2.0 for llama-index-vector-stores-chroma

Updating dependencies
Resolving dependencies...

Package operations: 38 installs, 0 updates, 0 removals

  - Installing

In [15]:
!poetry add python-dotenv

The following packages are already present in the pyproject.toml and will be skipped:

  - python-dotenv

If you want to update it to the latest compatible version, you can use `poetry update package`.
If you prefer to upgrade it to the latest available version, you can use `poetry add package@latest`.

Nothing to add.


### Import libraries and environment variables

In [1]:
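import os
from dotenv import load_dotenv

# Load environment variables from the '.env' file into os.environ
load_dotenv()

# Fail early with a clear message if the key is missing
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")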

### Load documents

In [2]:

from llama_index.readers.web import SimpleWebPageReader
from llama_index.core.node_parser import (
    SentenceSplitter,
    SemanticSplitterNodeParser,
)


urls = [
    "https://www.deeplearning.ai/the-batch/how-agents-can-improve-llm-performance/?ref=dl-staging-website.ghost.io",
    "https://www.deeplearning.ai/the-batch/agentic-design-patterns-part-2-reflection/?ref=dl-staging-website.ghost.io",
    "https://www.deeplearning.ai/the-batch/agentic-design-patterns-part-3-tool-use/?ref=dl-staging-website.ghost.io",
    "https://www.deeplearning.ai/the-batch/agentic-design-patterns-part-4-planning/?ref=dl-staging-website.ghost.io",
    "https://www.deeplearning.ai/the-batch/agentic-design-patterns-part-5-multi-agent-collaboration/?ref=dl-staging-website.ghost.io"
]

loaded_docs = SimpleWebPageReader(html_to_text=True).load_data(urls)

### Create vector store

#### Create Chroma vector store if it doesn't exist

In [19]:
import chromadb

# In-memory client; get_or_create_collection reuses the collection if it already exists
chroma_client = chromadb.Client()
chroma_collection = chroma_client.get_or_create_collection("reliable_rag")

### Chunk the documents

In [None]:

from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding()
splitter = SemanticSplitterNodeParser(
    buffer_size=1, breakpoint_percentile_threshold=95, embed_model=embed_model
)

nodes = splitter.get_nodes_from_documents(loaded_docs)
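
To give a feel for what `breakpoint_percentile_threshold=95` controls, here is a simplified, standard-library sketch of percentile-based semantic splitting: measure the cosine distance between adjacent sentence embeddings and start a new chunk wherever the distance exceeds the chosen percentile. It is an illustration under assumptions, not LlamaIndex's implementation: it ignores `buffer_size`, the helper names are hypothetical, and the 2-D vectors are made up in place of real embeddings.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile with linear interpolation between the two closest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    low, high = math.floor(rank), math.ceil(rank)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def semantic_split(
    sentences: Sequence[str],
    embeddings: Sequence[Sequence[float]],
    threshold_pct: float = 95.0,
) -> List[List[str]]:
    """Group sentences, breaking where adjacent embeddings are unusually far apart."""
    distances = [
        cosine_distance(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]
    if not distances:
        return [list(sentences)]
    cutoff = percentile(distances, threshold_pct)
    chunks, current = [], [sentences[0]]
    for sentence, distance in zip(sentences[1:], distances):
        if distance > cutoff:  # large semantic jump: close the current chunk
            chunks.append(current)
            current = []
        current.append(sentence)
    chunks.append(current)
    return chunks


# Toy data (made up): two sentences per topic, with hand-written 2-D "embeddings".
toy_sentences = ["Agents plan.", "Agents use tools.", "Paris is mild.", "Spring is warm."]
toy_embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
chunks = semantic_split(toy_sentences, toy_embeddings)
print(chunks)
```

A higher percentile yields fewer, larger chunks, because only the most pronounced topic shifts exceed the cutoff.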


### Store the chunks in Chroma

In [None]:
from llama_index.core import StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore


vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(nodes, storage_context=storage_context)

In [None]:
retriever = index.as_retriever(similarity_top_k=2)
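
Conceptually, `similarity_top_k=2` asks the retriever for the two stored chunks whose embeddings are most similar to the query embedding. The sketch below shows that idea with plain cosine similarity over made-up 3-D vectors. The function `top_k` and the toy vectors are hypothetical; a real vector store such as Chroma typically uses an approximate nearest-neighbor index rather than this exhaustive scan.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k (doc_id, score) pairs with the highest similarity to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up vectors standing in for chunk embeddings.
toy_vectors = {
    "reflection": [0.9, 0.1, 0.0],
    "tool_use": [0.7, 0.7, 0.0],
    "weather": [0.0, 0.1, 0.9],
}
results = top_k([1.0, 0.2, 0.0], toy_vectors, k=2)
print(results)
```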

### Question

In [None]:
question = "What are the different kinds of agentic design patterns?"

### Retrieve documents

In [None]:
docs = retriever.retrieve(question)

### Check what our docs look like

In [None]:
print(docs[0])
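
Printing a single result can be hard to scan, so a small helper that lists each hit's score next to a text snippet may help. The sketch below assumes each retrieved item exposes `.score` and `.node.get_content()`, as LlamaIndex's `NodeWithScore` objects do; `summarize` is a hypothetical helper, and the stub objects only exist so the example runs without an index.

```python
from types import SimpleNamespace
from typing import Iterable, List


def summarize(results: Iterable, max_chars: int = 80) -> List[str]:
    """Return one line per result: rank, similarity score, and a text snippet."""
    lines = []
    for rank, result in enumerate(results, start=1):
        text = result.node.get_content().replace("\n", " ")
        score = f"{result.score:.3f}" if result.score is not None else "n/a"
        lines.append(f"{rank}. score={score} | {text[:max_chars]}")
    return lines


# Stub results with the same attribute shape as the retriever output (assumption).
stub_results = [
    SimpleNamespace(
        score=0.8312,
        node=SimpleNamespace(
            get_content=lambda: "Reflection lets a model\ncritique its own output."
        ),
    )
]
summary = summarize(stub_results)
print("\n".join(summary))

# With the notebook's retriever output: print("\n".join(summarize(docs)))
```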