<a href="https://colab.research.google.com/github/sanjayakanungo/RAG/blob/main/RAG_Pipeline_Vector-DB.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Customize a RAG Pipeline

In this notebook, we will customize a standard RAG pipeline:

*   Configure a different LLM.
*   Use a different embedding model.
*   Configure a different vector store.
*   Customize with different indices.
*   Synthesize a response for a query.




In [1]:
from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


In [2]:
!pip install llama-index qdrant_client

Collecting llama-index
  Downloading llama_index-0.9.48-py3-none-any.whl (15.9 MB)
Collecting qdrant_client
  Downloading qdrant_client-1.7.3-py3-none-any.whl (206 kB)
Collecting dataclasses-json (from llama-index)
  Downloading dataclasses_json-0.6.4-py3-none-any.whl (28 kB)
Collecting deprecated>=1.2.9.3 (from llama-index)
  Downloading Deprecated-1.2.14-py2.py3-none-any.whl (9.6 kB)
Collecting dirtyjson<2.0.0,>=1.0.8 (from llama-index)
  Downloading dirtyjson-1.0.8-py3-none-any.whl (25 kB)
Collecting httpx (from llama-index)
  Downloading httpx-0.26.0-py3-none-any.whl (75 kB)
Collecting openai>=1.1.0 (from llama-index)

In [3]:
!pip install python-dotenv
import os
import dotenv

# Load variables from the .env file stored in Google Drive
dotenv.load_dotenv(os.path.join('/content/drive/MyDrive/', '.env'))
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')


Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.1


In [4]:
import os
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
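If the `.env` file is missing or does not define the key, `os.getenv` returns `None`, and assigning `None` into `os.environ` raises a `TypeError` that does not say what went wrong. A minimal sketch of a guard, assuming a hypothetical helper named `require_env` (not part of the original notebook) and a demo variable name so that no real secret is touched:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; check that the .env file was found and loaded."
        )
    return value


# Demo with a hypothetical variable name (assumption, for illustration only).
os.environ["DEMO_API_KEY"] = "sk-demo"
demo_key = require_env("DEMO_API_KEY")
```

In the notebook, the same helper could be called as `require_env("OPENAI_API_KEY")` right after `load_dotenv`.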

Download Data

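Access the Uber 2021 10-K SEC filing [here](https://www.sec.gov/Archives/edgar/data/1543151/000154315122000008/uber-20211231.htm). Note that the cells below load a different dataset: VMware Cloud Foundation documentation stored in Google Drive.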

Load Data

In [8]:
!pip install pypdf
from llama_index import SimpleDirectoryReader

# Load every file in the dataset folder into a list of Document objects
documents = SimpleDirectoryReader("/content/drive/MyDrive/GENAI-Pinnacle/VCFdataset").load_data()



In [9]:
len(documents)

771

In [13]:
print(documents[10].text)

About the VMware Cloud Foundation 
Administration Guide
The VMware Cloud Foundation  Administration Guide  provides information about managing a 
VMware Cloud Foundation ™ system, including managing the system's virtual infrastructure, 
managing users, configuring, upgrading, and monitoring the system. 
Intended Audience
The VMware Cloud Foundation  Administration Guide  is intended for cloud architects, 
infrastructure administrators, and cloud administrators who are familiar with and want to use 
VMware software to quickly deploy and manage a software-defined data center (SDDC). The 
information in this document is written for experienced data center system administrators who 
are familiar with:
nConcepts of virtualization, software-defined data centers, and virtual infrastructure (VI)
nVMware virtualization technologies, such as VMware ESXi ™, the hypervisor
nSoftware-defined networking using VMware NSX®
nSoftware-defined storage using VMware vSAN ™
nNetworking concepts such as Laye

Configure OpenAI LLM

In [14]:
from llama_index.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

Load BGE embeddings from HuggingFace

In [15]:
from llama_index.embeddings import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Create the service context by providing the LLM and the embedding model



In [16]:
from llama_index import ServiceContext
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model
)

Configure Qdrant VectorDB

In [17]:
import qdrant_client
from llama_index.vector_stores.qdrant import QdrantVectorStore

# initialize client, setting path to save data
client = qdrant_client.QdrantClient(path="/content/drive/MyDrive/GENAI-Pinnacle/qdrant_db")

# wrap the client in a vector store backed by the "rag_customization" collection
vector_store = QdrantVectorStore(client=client, collection_name="rag_customization")

Create the storage context from the vector store created above

In [18]:
from llama_index.storage.storage_context import StorageContext
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# 1. VectorStore Index

Define the vector store index by passing the storage context and the service context

In [19]:
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
    show_progress=True
)


Build the query engine for the index

In [29]:
query_engine = index.as_query_engine(similarity_top_k=10)
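Setting `similarity_top_k=10` asks the retriever for the ten nodes whose embeddings score highest against the query embedding. The idea can be sketched with plain Python, assuming cosine similarity as the scoring function and using made-up three-dimensional vectors; the names `cosine_similarity`, `top_k`, and `toy_vectors` are hypothetical and this is not how the vector store is implemented internally:

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: List[float], node_vecs: Dict[str, List[float]], k: int
) -> List[Tuple[str, float]]:
    """Score every node against the query and keep the k best, highest first."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in node_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumption: real BGE vectors have far more dimensions).
toy_vectors = {
    "node-a": [1.0, 0.0, 0.0],
    "node-b": [0.7, 0.7, 0.0],
    "node-c": [0.0, 0.0, 1.0],
}
best = top_k([1.0, 0.1, 0.0], toy_vectors, k=2)
```

A larger `k` gives the LLM more context to synthesize from, at the cost of longer prompts and a higher chance of including loosely related nodes.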

In [30]:
response = query_engine.query(
    "What are the Physical Network Design Requirements for VMware Cloud Foundation?"
)

In [31]:
print(response)

The Physical Network Design Requirements for VMware Cloud Foundation encompass several aspects:

1. Leaf-Spine Physical Network Design Requirements: This entails defining the network topology for connecting physical switches and ESXi hosts, configuring switch port settings for VLANs and link aggregation, and designing routing.

2. vSAN Design Requirements: It is necessary to ensure sufficient raw capacity to meet the initial needs of the workload domain cluster and have at least the required minimum number of hosts based on the cluster type.

3. ESXi Server Design Requirements: Consideration should be given to the resources, networking, and security policies needed to support the virtual machines in each workload domain cluster.

4. vCenter Server Design Requirements: This involves deploying an appropriately sized vCenter Server appliance for each workload domain and safeguarding workload domain vCenter Server appliances using vSphere HA.

5. vSphere Cluster Design Requirements: Determ

# 2. Keyword Table

In [32]:
from llama_index.indices import SimpleKeywordTableIndex
keyword_table_index = SimpleKeywordTableIndex.from_documents(
    documents,
    service_context=service_context,
    show_progress=True
)


In [33]:
keyword_table_retriever = keyword_table_index.as_retriever()
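Unlike the vector index, a keyword table index retrieves nodes by matching keywords rather than by comparing embeddings. A deliberately simplified sketch of that idea follows; the stopword list, the regex tokenizer, and the names `extract_keywords`, `build_keyword_table`, and `retrieve` are assumptions made for illustration, not the library's actual extraction logic:

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Tiny hypothetical stopword list; a real extractor would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "for", "of", "to", "are", "what", "is", "in"}


def extract_keywords(text: str) -> Set[str]:
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {token for token in tokens if token not in STOPWORDS}


def build_keyword_table(nodes: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each keyword to the set of node ids whose text contains it."""
    table: Dict[str, Set[str]] = defaultdict(set)
    for node_id, text in nodes.items():
        for keyword in extract_keywords(text):
            table[keyword].add(node_id)
    return dict(table)


def retrieve(query: str, table: Dict[str, Set[str]]) -> List[str]:
    """Rank node ids by how many query keywords they share, best first."""
    hits: Dict[str, int] = defaultdict(int)
    for keyword in extract_keywords(query):
        for node_id in table.get(keyword, ()):
            hits[node_id] += 1
    return sorted(hits, key=lambda node_id: (-hits[node_id], node_id))


# Toy node texts (made up for this sketch).
toy_texts = {
    "n1": "Physical network design uses redundant top-of-rack switches.",
    "n2": "vSAN storage capacity planning for workload domains.",
    "n3": "Network bandwidth and jumbo frames for the physical network.",
}
table = build_keyword_table(toy_texts)
ranked = retrieve("What are the physical network design requirements?", table)
```

Here `n1` shares three query keywords and `n3` shares two, so they are returned in that order, while `n2` shares none and is not retrieved at all.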

In [34]:
from llama_index.query_engine import RetrieverQueryEngine
query_engine = RetrieverQueryEngine.from_args(keyword_table_retriever, service_context=service_context)

In [35]:
response = query_engine.query(
    "What are the Physical Network Design Requirements for VMware Cloud Foundation?"
)

In [36]:
print(response)

The Physical Network Design Requirements for VMware Cloud Foundation include considerations for network bandwidth, trunk port configuration, jumbo frames, and routing configuration for NSX. The design also involves connecting ESXi hosts redundantly to the top-of-rack switches using two 25-GbE ports and configuring the switches to provide necessary VLANs using an 802.1Q trunk. The design ensures that redundant connections are used and no physical interface is overrun.


## Without retriever

In [37]:
query_engine = keyword_table_index.as_query_engine()

response = query_engine.query(
    "What are the Physical Network Design Requirements for VMware Cloud Foundation?"
)

print(response)

The Physical Network Design Requirements for VMware Cloud Foundation include considerations for network bandwidth, trunk port configuration, jumbo frames, and routing configuration for NSX. The design also involves connecting ESXi hosts redundantly to the top-of-rack switches using two 25-GbE ports and configuring the switches to provide necessary VLANs using an 802.1Q trunk. The design requirements ensure that no physical interface is overrun and that available redundant paths are utilized.
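The two keyword-table answers above differ only in their final sentence. One rough way to quantify how close two responses are is a word-level similarity ratio from the standard library's `difflib`. This is a sketch under the assumption that surface overlap is an acceptable proxy; `response_similarity` is a hypothetical helper, and the short strings stand in for real responses:

```python
from difflib import SequenceMatcher


def response_similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1]; 1.0 means the word sequences match."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


# Short stand-in strings (assumption: real inputs would be str(response)).
same = response_similarity("use an 802.1Q trunk", "use an 802.1Q trunk")
different = response_similarity("use an 802.1Q trunk", "enable jumbo frames everywhere")
partial = response_similarity("use an 802.1Q trunk", "use an 802.1Q uplink")
```

A ratio close to 1.0 suggests that both query engines retrieved essentially the same nodes for this question.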
