# Vector DB Experiments

This notebook uses the Cohere embeddings model to embed chunks of a PDF file chosen as our source data, stores the resulting vectors in Pinecone, and then calls the Cohere API again to answer questions about the PDF through a retrieval QA chain.

## Importing Libraries

In [1]:
from langchain.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import CohereEmbeddings
from langchain.llms import Cohere
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from ApiSecrets import ApiSecrets
import os

## Creating a PDF Directory as our Retrieval Source

In [2]:
loader = PyPDFDirectoryLoader("pdfs")
source_data = loader.load()

In [4]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 20)
text_chunks = text_splitter.split_documents(source_data)
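
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of a fixed-width sliding window. It is a simplification for illustration only: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and newlines, so its real chunks will differ. The helper name `sliding_window_chunks` is hypothetical and not part of LangChain.

```python
def sliding_window_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 20) -> list[str]:
    """Split text into fixed-size windows; consecutive windows share chunk_overlap characters.

    Hypothetical helper for illustration, not the LangChain algorithm.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(demo_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, which is the role the 20-character overlap plays above: text that straddles a chunk boundary still appears intact in at least one chunk.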

In [9]:
print(text_chunks[-1].page_content)

just
-
this
is
what
we
are
missing
,
in
my
opinion
.
<EOS>
<pad>Figure 5: Many of the attention heads exhibit behaviour that seems related to the structure of the
sentence. We give two such examples above, from two different heads from the encoder self-attention
at layer 5 of 6. The heads clearly learned to perform different tasks.
15


## Embeddings and Pinecone

In [25]:
from pinecone import Pinecone
os.environ["COHERE_API_KEY"] = ApiSecrets.COHERE_API_KEY
os.environ["PINECONE_API_KEY"] = ApiSecrets.PINECONE_API_KEY

In [11]:
embeddings = CohereEmbeddings(model="embed-english-v3.0")
text = "this is a test document"
query_result = embeddings.embed_query(text)
len(query_result)

In [28]:
pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
index_name = "testing-vec-db"
index = pc.Index(index_name)

### Creating Embeddings for Each PDF Chunk

In [26]:
from langchain.vectorstores import Pinecone as LC_Pinecone

In [29]:
vecstore = LC_Pinecone.from_texts([chunk.page_content for chunk in text_chunks], embeddings, index_name=index_name)
vecstore.as_retriever()

VectorStoreRetriever(tags=['Pinecone', 'CohereEmbeddings'], vectorstore=<langchain_community.vectorstores.pinecone.Pinecone object at 0x000001EA664117F0>)

In [43]:
simi_prompt = "what is attention?"
simi_result = vecstore.similarity_search_with_score(simi_prompt)
print(f"Answer: {simi_result[0][0].page_content}\n Score: {simi_result[0][1]}")

Answer: described in section 3.2.
Self-attention, sometimes called intra-attention is an attention mechanism relating different positions
of a single sequence in order to compute a representation of the sequence. Self-attention has been
used successfully in a variety of tasks including reading comprehension, abstractive summarization,
textual entailment and learning task-independent sentence representations [4, 27, 28, 22].
 Score: 0.599198282
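
The score returned with each hit depends on the metric the Pinecone index was created with, which this notebook does not show. Assuming the index uses cosine similarity, the score could be computed roughly as in the sketch below. The vectors are tiny made-up examples rather than real Cohere embeddings, and `cosine_similarity` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 2-dimensional vectors standing in for embeddings (made up for illustration)
query_vec = [1.0, 0.0]
doc_vecs = {
    "same direction": [2.0, 0.0],
    "diagonal": [1.0, 1.0],
    "orthogonal": [0.0, 3.0],
}

# Rank documents from most to least similar to the query
ranked = sorted(
    doc_vecs,
    key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    reverse=True,
)
print(ranked)
# ['same direction', 'diagonal', 'orthogonal']
```

Under this assumption, a higher score means the chunk's embedding points in a direction closer to the query's embedding, regardless of vector length.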


### Creating a Retrieval QA Chain

In [44]:
llm = Cohere(cohere_api_key = os.getenv("COHERE_API_KEY"))
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=vecstore.as_retriever())
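
The `"stuff"` chain type places all retrieved chunks into a single prompt and sends it to the LLM in one call. The sketch below illustrates the idea in plain Python. The prompt wording and the `build_stuff_prompt` helper are assumptions made for illustration and are not LangChain's actual template; the second passage is a made-up placeholder.

```python
def build_stuff_prompt(question: str, documents: list[str]) -> str:
    """Concatenate every retrieved passage into one prompt (hypothetical helper)."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Stand-ins for what the retriever would return
retrieved = [
    "Self-attention relates different positions of a single sequence.",
    "Placeholder passage standing in for a second retrieved chunk.",
]
prompt = build_stuff_prompt("what is attention?", retrieved)
print(prompt)
```

Because everything goes into one prompt, this approach is simple but only works while the combined chunks fit in the model's context window.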

In [46]:
qa_prompt = "what is attention?"
qa_result = qa.run(qa_prompt)
print(qa_result)

 Attention is a mechanism that helps an AI model focus on different parts of data while doing training.  It is a way to model the dependency of one part of the data on another part, or self-dependencies. It is very useful for large NLP models so they can understand the context of a sentence since each sentence consists of ordered words. 


In [49]:
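print("Type 'exit' to quit")
while True:
    user_input = input("Enter Prompt: ").strip()
    # Quit on "exit" in any capitalization
    if user_input.lower() == "exit":
        break
    # Ignore empty input and prompt again
    if not user_input:
        continue
    res = qa({"query": user_input})
    # Single quotes inside the f-string keep this valid before Python 3.12
    print(f"Ans: {res['result']}")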

Type 'exit' to quit
Ans:  From the provided context, a Transformer is a model architecture that substitutes recurrence and self-attention mechanisms for drawing global dependencies between input and output. Specifically, The Transformer presents an updated approach to sequence transduction models that utilize multi-headed self-attention mechanisms, replacing the use of recurrent layers in encoder-decoder architectures. The model allows for more parallelization and can reach a new state of the art in translation tasks. 
Ans:  This paper was presented at the 31st Conference on Neural Information Processing Systems (NIPS 2017) and was published on August 2, 2023, as stated in the copyright section of the paper. 
The authors' names and affiliations are listed on the paper, and the identity of the specific author who wrote the paper may be included in this information in some cases. 
However, I don't have access to real-time data on the internet, so I cannot search for any subsequent update

## Embeddings and ChromaDB