# **Vector DB - PINECONE DB**

**Note - This script was executed in Google Colab**



- **Pinecone is a cloud-based** vector DB.
- **Pinecone and Weaviate** are **cloud-based DBs** that require a **subscription**, but initial free credits are provided, with which we can create only 1 cluster. If we **don't want to store our private data** there, we should use a local store such as Chroma DB or FAISS.
- Create the **API key** on the Pinecone website.
- We need to **create an index** and define its **dimensions**. At that point we also get the **API_Env**.
	- **PINECONE_API_ENV = 'gcp-starter'**
- If our embedding model creates vectors of 384 dimensions, then we need to set dimensions = 384 while creating the Pinecone index. Then **pinecone.init** initializes the Pinecone client with the API key and environment.

- When **connecting to Pinecone via the API key and environment key**, **import the pinecone library directly** (`import pinecone`).
- When **creating embeddings and storing them in Pinecone**, use the LangChain wrapper: **from langchain.vectorstores import Pinecone**

## **Terminology:**
- The **Chroma / pinecone-client** package is **pip installed**, then called via **langchain's vectorstores** module
- Here we used a **Hugging Face** embedding model
- **Chunking/Chunk_size:** A document/dataset usually has a large number of tokens, but embedding and LLM models have a **token limit** (for example 4k tokens). To fit within that limit, we **split our data into chunks**
- **Chunk_overlap = 50:** Each new chunk **repeats the last 50 units of the previous chunk** (characters by default in RecursiveCharacterTextSplitter, tokens if a token-based length function is used), so context is not lost at chunk boundaries
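
To make chunk_size and chunk_overlap concrete, here is a minimal sketch of a fixed-size character splitter. It is not LangChain's implementation (RecursiveCharacterTextSplitter also prefers to break on separators such as newlines); `split_with_overlap` is a hypothetical helper written only for illustration.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters; each window starts
    chunk_overlap characters before the end of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk begins with the last 3 characters of the previous one, which is what the overlap setting controls.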


## **Below steps followed:**
- Log in to the **Pinecone website (https://www.pinecone.io/)** and create:
	- **API_KEY**
	- **API_Env**
	- **New index**

- **Download a document**
- Then **split it into chunks**
- Then import an **OpenAI or Hugging Face embedding model**, or some other embedding model that converts **tokens/text to vectors**
- Then use the **pinecone library** and pass
    - the **document chunks** to be converted to vectors
    - the **embedding model**
    - the **index**
- This converts the **chunks to vectors/embeddings**, which are **saved inside the index in the Pinecone cloud**
- Each chunk is stored as 1 vector; we can see this on the **Pinecone website, under our index**
- Then we need to **load this vector DB** that we just created by specifying the **index name** and the **embedding model**
- Then use **as_retriever** to **read the vector DB** and **do semantic search on it**
- This **semantic/similarity search** returns the **K=4 most relevant chunks** by default; these, along **with the user's question**, are **fed to the LLM** to produce a **meaningful response to that question**.
- We can use **langchain's chain operation** **RetrievalQA** for this
- We can set the number of relevant chunks returned by passing **search_kwargs={"k": 2}** to **as_retriever**
- Here the **vector DB does the similarity search based on the user's question**, and the **LLM only structures the retrieved content into the final answer**. The LLM does nothing else. **This is called RAG** (retrieval-augmented generation)
- **RetrievalQA** passes the question to the vector DB **retriever**, then passes the retrieved output together with the question to the LLM, which does the **summarization** internally


Pinecone: https://www.pinecone.io/

In [None]:
#!pip install langchain
#!pip install pinecone-client==2.2.4
#!pip install pypdf
#!pip install sentence-transformers==2.2.2

## **1. Read the Document**
- Create a directory named pdfs and keep the PDF file there; it will be used to create the DB
- This pdfs folder is created inside the Colab environment, so it will be deleted once the session ends

In [None]:
!mkdir pdfs

mkdir: cannot create directory ‘pdfs’: File exists
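
The `File exists` message above appears whenever the cell is re-run in the same session. A small sketch of an idempotent alternative using only the standard library; `ensure_dir` is a hypothetical helper name, not part of the notebook.

```python
from pathlib import Path


def ensure_dir(path: str) -> Path:
    """Create the directory if it is missing; do nothing if it already exists."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


pdf_dir = ensure_dir("pdfs")
ensure_dir("pdfs")  # a second call is a no-op rather than an error

# List whatever PDF files are currently in the folder (may be empty)
print(sorted(p.name for p in pdf_dir.glob("*.pdf")))
```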


In [None]:
!ls -l

total 8
drwxr-xr-x 2 root root 4096 Mar 18 04:22 pdfs
drwxr-xr-x 1 root root 4096 Mar 14 13:27 sample_data


### **Extract the Text from the PDF's**

In [None]:
from langchain.document_loaders import PyPDFDirectoryLoader

In [None]:
loader = PyPDFDirectoryLoader("pdfs")
data = loader.load()
data

[Document(page_content='You Only Look Once:\nUniﬁed, Real-Time Object Detection\nJoseph Redmon∗, Santosh Divvala∗†, Ross Girshick¶, Ali Farhadi∗†\nUniversity of Washington∗, Allen Institute for AI†, Facebook AI Research¶\nhttp://pjreddie.com/yolo/\nAbstract\nWe present YOLO, a new approach to object detection.\nPrior work on object detection repurposes classiﬁers to per-\nform detection. Instead, we frame object detection as a re-\ngression problem to spatially separated bounding boxes and\nassociated class probabilities. A single neural network pre-\ndicts bounding boxes and class probabilities directly from\nfull images in one evaluation. Since the whole detection\npipeline is a single network, it can be optimized end-to-end\ndirectly on detection performance.\nOur uniﬁed architecture is extremely fast. Our base\nYOLO model processes images in real-time at 45 frames\nper second. A smaller version of the network, Fast YOLO,\nprocesses an astounding 155 frames per second while\nstill a

### Split the whole document into chunks
- Split the documents into chunks with **chunk_size=500, chunk_overlap=20** using **RecursiveCharacterTextSplitter**

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
text_chunks = text_splitter.split_documents(data)

In [None]:
len(text_chunks)  # In total it creates 94 chunks for the specified chunk_size and chunk_overlap

94

In [None]:
text_chunks[2]

Document(page_content='still achieving double the mAP of other real-time detec-\ntors. Compared to state-of-the-art detection systems, YOLO\nmakes more localization errors but is less likely to predict\nfalse positives on background. Finally, YOLO learns very\ngeneral representations of objects. It outperforms other de-\ntection methods, including DPM and R-CNN, when gener-\nalizing from natural images to other domains like artwork.\n1. Introduction\nHumans glance at an image and instantly know what ob-', metadata={'source': 'pdfs/yolo.pdf', 'page': 0})
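
The chunk above ends at a word boundary instead of at exactly 500 characters because the splitter tries coarse separators first and only falls back to finer ones when a piece is still too long. The sketch below is a simplified illustration of that idea, assuming the separator order `"\n\n"`, `"\n"`, `" "`, `""`; it ignores chunk_overlap and is not LangChain's actual code. `recursive_split` is a hypothetical name.

```python
def recursive_split(text: str, chunk_size: int,
                    separators: tuple[str, ...] = ("\n\n", "\n", " ", "")) -> list[str]:
    """Split on the coarsest separator first, recursing into pieces that are too long."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate          # keep merging small pieces
            continue
        if current:
            chunks.append(current)       # flush what has been merged so far
        if len(piece) > chunk_size:
            chunks.extend(recursive_split(piece, chunk_size, rest))
            current = ""
        else:
            current = piece
    if current:
        chunks.append(current)
    return chunks


text = "para one is short.\n\npara two is a bit longer than the first one.\n\nend"
chunks = recursive_split(text, chunk_size=30)
print(chunks)
# ['para one is short.', 'para two is a bit longer than', 'the first one.', 'end']
```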

## **2. Creating Vector DB**

- Then import an **OpenAI or Hugging Face embedding model**, or some other embedding model that converts **tokens/text to vectors**
- A local store such as Chroma would need a **persist_directory** for the **schema/db**; with Pinecone, the **index** created on the website takes that role
- Then use the **vector DB library (here Pinecone)** and pass
    - the **document chunks** to be converted to vectors
    - the **embedding model**
    - the **index name**
- This converts the **chunks to vectors/embeddings**, which are **saved inside the Pinecone index**

### **Initialize Embedding**

- Used the Hugging Face embedding model **sentence-transformers/all-MiniLM-L6-v2**

In [None]:
from langchain.embeddings import HuggingFaceEmbeddings
#from langchain.embeddings import OpenAIEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

In [None]:
# Test this initialized embedding model with sample text
query_result = embeddings.embed_query("Hello World")
query_result

[-0.03447727486491203,
 0.03102317824959755,
 0.006734995171427727,
 0.026108944788575172,
 -0.039361994713544846,
 -0.16030240058898926,
 0.06692399084568024,
 -0.006441427860409021,
 -0.04745052009820938,
 0.014758813194930553,
 0.07087530195713043,
 0.05552761256694794,
 0.01919334940612316,
 -0.026251301169395447,
 -0.010109584778547287,
 -0.026940491050481796,
 0.0223073810338974,
 -0.022226644679903984,
 -0.1496926248073578,
 -0.017493028193712234,
 0.007676266599446535,
 0.05435226857662201,
 0.0032544408459216356,
 0.031725917011499405,
 -0.08462139219045639,
 -0.0294059906154871,
 0.05159558355808258,
 0.04812406376004219,
 -0.0033148028887808323,
 -0.058279186487197876,
 0.041969265788793564,
 0.02221064269542694,
 0.1281888484954834,
 -0.02233898639678955,
 -0.011656252667307854,
 0.06292840093374252,
 -0.032876357436180115,
 -0.09122604876756668,
 -0.031175360083580017,
 0.0526994913816452,
 0.04703482612967491,
 -0.08420313149690628,
 -0.030056189745664597,
 -0.02074486576

In [None]:
print("Length", len(query_result))

Length 384


> This embedding model creates vectors with **384 dimensions**, so the Pinecone index was also created with 384 dimensions
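
A mismatch between the embedding size and the index dimension only shows up as an error at upload time, so it can help to check it early. A minimal sketch, assuming the index was created with 384 dimensions; `check_dimension` is a hypothetical helper and the zero vector stands in for a real embedding.

```python
def check_dimension(vector: list[float], index_dimension: int) -> None:
    """Raise a clear error if the vector length differs from the index dimension."""
    if len(vector) != index_dimension:
        raise ValueError(
            f"Embedding has {len(vector)} dimensions, "
            f"but the index expects {index_dimension}"
        )


INDEX_DIMENSION = 384        # dimension chosen when the index was created
sample_vector = [0.0] * 384  # stand-in for embeddings.embed_query("Hello World")

check_dimension(sample_vector, INDEX_DIMENSION)  # passes silently when sizes match
```

In the notebook the same check could be applied to `query_result` before any data is sent to Pinecone.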

### **Initialize Pinecone Vector DB**

#### Invoke and Initialize Pinecone

In [None]:
from google.colab import userdata
PINECONE_API_KEY = userdata.get('PINECONE_API_KEY')
PINECONE_API_ENV = userdata.get('PINECONE_API_ENV')

import os
# Make them available as environment variables
os.environ["PINECONE_API_KEY"] = PINECONE_API_KEY
os.environ["PINECONE_API_ENV"] = PINECONE_API_ENV
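
`google.colab.userdata` is only available inside Colab. The sketch below shows one possible way to read a secret that also works elsewhere: it checks the environment first and falls back to Colab's store. `get_secret` and `DEMO_SECRET` are hypothetical names used only for this illustration.

```python
import os


def get_secret(name: str) -> str:
    """Return a secret from the environment, falling back to Colab userdata."""
    value = os.environ.get(name)
    if value:
        return value
    try:
        from google.colab import userdata  # importable only inside Colab
    except ImportError:
        userdata = None
    if userdata is not None:
        value = userdata.get(name)
    if not value:
        raise RuntimeError(f"Secret {name!r} is not set")
    return value


os.environ["DEMO_SECRET"] = "dummy-value"  # placeholder so the sketch runs anywhere
print(get_secret("DEMO_SECRET"))
```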

In [None]:
import pinecone

# initialize pinecone
pinecone.init(
    api_key=PINECONE_API_KEY,  # find at app.pinecone.io
    environment=PINECONE_API_ENV  # next to api key in console
)
index_name = "testindex" # put in the name of your pinecone index here


### **Create Vector DB**
- Then use the **pinecone library** and pass
    - the **document chunks** to be converted to vectors
    - the **embedding model**
    - the **index name**

In [None]:
from langchain.vectorstores import Pinecone

# Build the Pinecone vector store by passing the chunk texts, the embedding model and the index name
docsearch = Pinecone.from_texts([t.page_content for t in text_chunks],
                                embeddings,
                                index_name=index_name)
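
The call above uploads only `page_content`, so the source file and page number of each chunk are not stored. A hedged sketch of how texts and metadata could be prepared side by side; `Doc` is a stand-in for LangChain's Document so the example runs without LangChain, the sample texts are placeholders, and the commented `metadatas=` usage assumes the installed LangChain version accepts that argument.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for langchain's Document (page_content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def to_texts_and_metadatas(docs: list[Doc]) -> tuple[list[str], list[dict]]:
    """Return parallel lists of chunk texts and copies of their metadata."""
    texts = [d.page_content for d in docs]
    metadatas = [dict(d.metadata) for d in docs]
    return texts, metadatas


sample_chunks = [
    Doc("first chunk text", {"source": "pdfs/yolo.pdf", "page": 0}),
    Doc("second chunk text", {"source": "pdfs/yolo.pdf", "page": 1}),
]
texts, metadatas = to_texts_and_metadatas(sample_chunks)
print(metadatas)

# Assumed usage in the notebook (not executed here):
# docsearch = Pinecone.from_texts(texts, embeddings,
#                                 metadatas=metadatas, index_name=index_name)
```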

### Load the Vector DB from Pinecone
- Load the vector DB that we just created by specifying the index_name and the embedding model
- If you already have an index (that is, an existing Pinecone index that already holds all the vector data), you can load it with **Pinecone.from_existing_index(index_name, embeddings)**

In [None]:
docsearch = Pinecone.from_existing_index(index_name, embeddings)
docsearch

<langchain_community.vectorstores.pinecone.Pinecone at 0x7802cef2dea0>

## **3. Semantic/Similarity Search**
- Then use **similarity_search** to read the vector DB and do a **semantic search** on it
- By default this semantic/similarity search returns the K=4 most relevant chunks; these, along with the user's question, are fed to the LLM to produce a meaningful response to that question
- We can use langchain's chain operation RetrievalQA for this

### Set the retrieval argument **k=3**

In [None]:
query = "What is yolo?"

In [None]:
docs = docsearch.similarity_search(query, k=3)
docs

[Document(page_content='Figure 6: Qualitative Results. YOLO running on sample artwork and natural images from the internet. It is mostly accurate although it\ndoes think one person is an airplane.\nincluding the time to fetch images from the camera and dis-\nplay the detections.\nThe resulting system is interactive and engaging. While\nYOLO processes images individually, when attached to a\nwebcam it functions like a tracking system, detecting ob-\njects as they move around and change in appearance. A'),
 Document(page_content='making predictions. Unlike sliding window and region\nproposal-based techniques, YOLO sees the entire image\nduring training and test time so it implicitly encodes contex-\ntual information about classes as well as their appearance.\nFast R-CNN, a top detection method [14], mistakes back-\nground patches in an image for objects because it can’t see\nthe larger context. YOLO makes less than half the number\nof background errors compared to Fast R-CNN.'),
 Documen

In [None]:
len(docs)

3
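
Conceptually, the search embeds the query and ranks the stored vectors by similarity. The toy sketch below shows that ranking with 3-dimensional vectors and cosine similarity, assuming the index uses the cosine metric; the texts and vectors are made up for illustration, and a real index would use approximate search rather than scoring every vector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the texts of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


stored = [
    ("chunk about object detection", [0.9, 0.1, 0.0]),
    ("chunk about rockets", [0.0, 0.2, 0.9]),
    ("chunk about bounding boxes", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.0, 0.0]  # stand-in for the embedded query
results = top_k(query_vec, stored, k=2)
print(results)
# ['chunk about object detection', 'chunk about bounding boxes']
```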

## **4. Make a chain and do Semantic Search**
- We can use **langchain's chain operation** **RetrievalQA** for this
- We can set the number of relevant chunks returned by passing search_kwargs={"k": 2} to as_retriever
- Here the vector DB does the **similarity search** based on the **user's question**, and the **LLM only structures the retrieved content into the final answer**. The LLM does nothing else. This is called RAG
- **RetrievalQA** passes the question to the vector DB **retriever**, then passes the retrieved output together with the question to the LLM, which does the **summarization** internally

In [None]:
#!pip install openai

In [None]:
from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

import os
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

In [None]:
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

In [None]:
llm = OpenAI()

In [None]:
qa = RetrievalQA.from_chain_type(llm=llm,
                                 chain_type="stuff",
                                 retriever=docsearch.as_retriever()
                                 )
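
With `chain_type="stuff"`, the retrieved chunks are placed ("stuffed") into a single prompt together with the question. A minimal sketch of that idea; the template wording here is hypothetical and is not LangChain's actual prompt, and `build_stuff_prompt` is an illustrative name.

```python
def build_stuff_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Join all retrieved chunks into one context block and append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_stuff_prompt("What is yolo?", ["chunk one text", "chunk two text"])
print(prompt)
```

Because every chunk goes into one prompt, the number of retrieved chunks times the chunk size has to stay within the LLM's context limit.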

### Call Chain and get response (RAG)

In [None]:
query = "What is yolo?"  # a topic that is covered in the PDF
print('\n',qa.run(query))


  YOLO is a fast and simple object detection system that uses a single convolutional neural network to simultaneously predict multiple bounding boxes and class probabilities for objects in an image. It is able to process images in real-time and is highly generalizable, but it may struggle with precise localization of some objects.


In [None]:
# Example of a question whose topic is not covered in the PDF
query = "what is spacex?"

print('\n',qa.run(query))


 
I don't know, there is no mention of SpaceX in the given context.


# **END**