<a href="https://colab.research.google.com/github/pinilDissanayaka/Multi-query-RAG/blob/main/Notebook1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install -r requirements.txt

Collecting langchain (from -r requirements.txt (line 1))
  Downloading langchain-0.2.14-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain_community (from -r requirements.txt (line 2))
  Downloading langchain_community-0.2.12-py3-none-any.whl.metadata (2.7 kB)
Collecting langchain_huggingface (from -r requirements.txt (line 3))
  Downloading langchain_huggingface-0.0.3-py3-none-any.whl.metadata (1.2 kB)
Collecting sentence-transformers (from -r requirements.txt (line 4))
  Downloading sentence_transformers-3.0.1-py3-none-any.whl.metadata (10 kB)
Collecting pinecone (from -r requirements.txt (line 6))
  Downloading pinecone-5.0.1-py3-none-any.whl.metadata (18 kB)
Collecting langchain_core (from -r requirements.txt (line 7))
  Downloading langchain_core-0.2.32-py3-none-any.whl.metadata (6.2 kB)
Collecting langchain_pinecone (from -r requirements.txt (line 9))
  Downloading langchain_pinecone-0.1.3-py3-none-any.whl.metadata (1.7 kB)
Collecting langchain_experimental (from -r requirem

In [1]:
import os
from pinecone import ServerlessSpec, Pinecone
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.documents import Document
from langchain_experimental.text_splitter import SemanticChunker
from langchain_pinecone import PineconeVectorStore
from langchain_groq import ChatGroq
from langchain.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_huggingface import HuggingFaceEmbeddings



In [2]:
from google.colab import userdata
os.environ['GROQ_API_KEY']=userdata.get('GROQ_API_KEY')
# The Colab secret must be stored under the name PINECONE_API_KEY.
os.environ['PINECONE_API_KEY']=userdata.get('PINECONE_API_KEY')
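
The cell above only works inside Colab, because `google.colab.userdata` is not available elsewhere. A minimal sketch of a fallback, assuming a hypothetical helper named `load_secret` (it is not part of the notebook): it tries the Colab secret store first and falls back to an ordinary environment variable.

```python
import os


def load_secret(name: str) -> str:
    """Return a secret from Colab's secret store, else from the environment.

    `load_secret` is a hypothetical helper, not part of the original notebook.
    """
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    value = None
    if userdata is not None:
        try:
            value = userdata.get(name)
        except Exception:
            # Secret missing or notebook access not granted; try the environment.
            value = None
    if not value:
        value = os.environ.get(name)
    if not value:
        raise KeyError(f"Secret {name!r} not found in Colab secrets or environment")
    return value
```

With this in place, `os.environ['GROQ_API_KEY'] = load_secret('GROQ_API_KEY')` would behave the same in Colab and in a local Jupyter session.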

In [3]:
llm=ChatGroq(model="mixtral-8x7b-32768",
             temperature=0.7)

In [7]:
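def createIndex(indexName:str, dimension:int):
  """Create a serverless Pinecone index if it does not exist and return its description."""
  try:
    # Pinecone() reads the API key from the PINECONE_API_KEY environment variable.
    pinecone=Pinecone()
    if indexName not in pinecone.list_indexes().names():
      pinecone.create_index(
          name=indexName,
          dimension=dimension,
          metric="cosine",
          spec=ServerlessSpec(cloud='aws',
                              region='us-east-1')
      )
      print(f"Created {indexName}")
    else:
      print(f"{indexName} already exists")
    return pinecone.describe_index(indexName)
  except Exception as e:
    # On failure the error is printed and the function returns None.
    print(f"Error creating index: {e}")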

In [8]:
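# 768 is the embedding size of sentence-transformers/all-mpnet-base-v2, the model loaded further down.
print(createIndex(indexName='multi-rag', dimension=768))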

Created multi-rag
{'deletion_protection': 'disabled',
 'dimension': 768,
 'host': 'multi-rag-4myrn7y.svc.aped-4627-b74a.pinecone.io',
 'metric': 'cosine',
 'name': 'multi-rag',
 'spec': {'serverless': {'cloud': 'aws', 'region': 'us-east-1'}},
 'status': {'ready': True, 'state': 'Ready'}}


In [9]:
model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': False}
embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)
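
The index was created with `metric="cosine"`, while `normalize_embeddings` is set to `False` here. Cosine similarity depends only on the direction of the vectors, so leaving the embeddings unnormalized should not change the ranking under this metric. A minimal sketch in plain Python, with toy 3-dimensional vectors standing in for the 768-dimensional embeddings (the helper names are illustrative, not from the notebook):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


query = [1.0, 2.0, 3.0]   # toy "query embedding"
chunk = [2.0, 1.0, 4.0]   # toy "chunk embedding"

raw = cosine_similarity(query, chunk)
unit = cosine_similarity(normalize(query), normalize(chunk))

# The two values agree up to floating-point rounding.
print(math.isclose(raw, unit))
```

Normalization would matter if the index used a dot-product metric instead, since that metric is sensitive to vector length.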

In [10]:
webLoader=WebBaseLoader(web_path=['https://medium.com/@amanatulla1606/transformer-architecture-explained-2c49e2257b4c'])

data=webLoader.load()

print(f"Loaded {len(data)} documents")

Loaded 1 documents


In [11]:
print(data)

[Document(metadata={'source': 'https://medium.com/@amanatulla1606/transformer-architecture-explained-2c49e2257b4c', 'title': 'Transformer Architecture explained | by Amanatullah | Medium', 'description': 'Transformers are a new development in machine learning that have been making a lot of noise lately. They are incredibly good at keeping track of context, and this is why the text that they write…', 'language': 'en'}, page_content='Transformer Architecture explained | by Amanatullah | MediumOpen in appSign upSign inWriteSign upSign inTransformer Architecture explainedAmanatullah·Follow10 min read·Sep 1, 2023--11ListenShareTransformers are a new development in machine learning that have been making a lot of noise lately. They are incredibly good at keeping track of context, and this is why the text that they write makes sense. In this chapter, we will go over their architecture and how they work.Transformer models are one of the most exciting new developments in machine learning. They w

In [12]:
textSplitter=SemanticChunker(embeddings=embeddings)

splitteDocuments=textSplitter.split_documents(documents=data)

print(f"Split into {len(splitteDocuments)} chunks")

Split into 6 chunks
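
`SemanticChunker` decides where to cut by comparing the embeddings of neighbouring sentences rather than by counting characters. The following is a simplified sketch of that idea, not the library's implementation: the 2-dimensional vectors are hand-made stand-ins for real embeddings, and `semantic_chunks` with its fixed `threshold` is hypothetical (the library computes its breakpoint from the observed distances instead of taking a constant).

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; small when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def semantic_chunks(sentences, vectors, threshold):
    """Start a new chunk wherever consecutive sentences are far apart."""
    chunks = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine_distance(vectors[i - 1], vectors[i]) > threshold:
            chunks.append([])          # topic shift: open a new chunk
        chunks[-1].append(sentences[i])
    return [" ".join(chunk) for chunk in chunks]


sentences = [
    "Transformers track context well.",
    "Attention adds context to each word.",
    "Pinecone stores vectors.",
    "Indexes use cosine distance.",
]
# Hand-made toy vectors: the first two point one way, the last two another.
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]

chunks = semantic_chunks(sentences, vectors, threshold=0.3)
print(chunks)
```

Only the jump between the second and third sentence exceeds the threshold, so the four sentences end up in two chunks.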


In [13]:
splitteDocuments[:3]

[Document(metadata={'source': 'https://medium.com/@amanatulla1606/transformer-architecture-explained-2c49e2257b4c', 'title': 'Transformer Architecture explained | by Amanatullah | Medium', 'description': 'Transformers are a new development in machine learning that have been making a lot of noise lately. They are incredibly good at keeping track of context, and this is why the text that they write…', 'language': 'en'}, page_content='Transformer Architecture explained | by Amanatullah | MediumOpen in appSign upSign inWriteSign upSign inTransformer Architecture explainedAmanatullah·Follow10 min read·Sep 1, 2023--11ListenShareTransformers are a new development in machine learning that have been making a lot of noise lately. They are incredibly good at keeping track of context, and this is why the text that they write makes sense. In this chapter, we will go over their architecture and how they work.Transformer models are one of the most exciting new developments in machine learning. They w

In [14]:
docs=[]
for doc in splitteDocuments:
  # Flatten newlines and keep only the text; the page metadata is dropped here.
  doc.page_content=doc.page_content.replace("\n", " ")
  docs.append(Document(page_content=doc.page_content))


print(f"Created {len(docs)} documents")

Created 6 documents


In [15]:
pineconeVectorStore=PineconeVectorStore.from_documents(
    documents=docs,
    embedding=embeddings,
    index_name="multi-rag"
)

In [16]:
retriever=pineconeVectorStore.as_retriever()

In [32]:
template = """You are an AI language model assistant. Your task is to generate five
different versions of the given user question and also add the given question to the generated questions to retrieve relevant documents from a vector
database. By generating multiple perspectives on the user question, your goal is to help
the user overcome some of the limitations of the distance-based similarity search.
Provide these alternative questions separated by newlines. Original question:

{question}"""


multiQyeryPrompt = ChatPromptTemplate.from_template(template)

print(multiQyeryPrompt)

input_variables=['question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['question'], template='You are an AI language model assistant. Your task is to generate five\ndifferent versions of the given user question and also add the given question to the generated questions to retrieve relevant documents from a vector\ndatabase. By generating multiple perspectives on the user question, your goal is to help\nthe user overcome some of the limitations of the distance-based similarity search.\nProvide these alternative questions separated by newlines. Original question:\n\n{question}'))]


In [33]:
def separateQuestions(text:str):
  return text.split("\n")

In [34]:
multiQyeryChain=multiQyeryPrompt | llm | StrOutputParser() | RunnableLambda(separateQuestions)

In [35]:
question="What is transformer architecture?"

In [36]:
multiQyeryChain.invoke(question)

['1. Can you explain the concept of a transformer architecture?',
 '2. How does a transformer architecture work?',
 '3. What are the key components of a transformer architecture?',
 '4. What makes transformer architecture unique in the field of machine learning?',
 '5. Transformer architecture: Could you provide a detailed overview?',
 '',
 'Additional question for retrieval: What is the transformer architecture in natural language processing?']
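
The output above shows why splitting on `"\n"` alone is fragile: the list contains an empty string, numbering prefixes, and a label in front of the last question, and each of these entries is passed to the retriever as-is. A hedged sketch of a stricter parser follows; `clean_questions` is a hypothetical replacement for `separateQuestions`, and its regular expressions only cover the patterns visible in this one sample output.

```python
import re


def clean_questions(text: str) -> list[str]:
    """Split LLM output into questions, dropping blanks, numbering and labels."""
    questions = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue                                   # skip empty lines
        line = re.sub(r"^\d+[.)]\s*", "", line)        # drop "1. " / "2) " prefixes
        # Drop the "Additional question ...:" label seen in the sample output.
        line = re.sub(r"^Additional question[^:]*:\s*", "", line)
        if line and line not in questions:
            questions.append(line)                     # keep order, skip duplicates
    return questions


sample = (
    "1. Can you explain the concept of a transformer architecture?\n"
    "2. How does a transformer architecture work?\n"
    "\n"
    "Additional question for retrieval: What is the transformer architecture "
    "in natural language processing?"
)
print(clean_questions(sample))
```

In the chain, this function could be wrapped with `RunnableLambda(clean_questions)` in the position where `separateQuestions` is used.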

In [37]:
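def getRelevantDocuments(documents:list):
  # `documents` holds one list of retrieved Documents per generated question.
  context=[]
  for document in documents:
    for subdoc in document:
      # Keep each chunk once, in first-seen order (a set would lose the order).
      if subdoc.page_content not in context:
        context.append(subdoc.page_content)
  # Join into plain text so the prompt does not receive a Python set literal.
  return "\n\n".join(context)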

In [38]:
retrivalChain=multiQyeryChain | retriever.map() | RunnableLambda(getRelevantDocuments)
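
`retriever.map()` runs the retriever once per generated question, so the next step receives a list of result lists, and the same chunk often appears under several questions. A minimal plain-Python sketch of that fan-out and merge, assuming a toy keyword-based `fake_retriever` and a `Doc` dataclass as stand-ins for the Pinecone retriever and LangChain's `Document`:

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for langchain_core.documents.Document (illustration only)."""
    page_content: str


CORPUS = [
    Doc("Attention adds context to each word."),
    Doc("Tokenization breaks text into tokens."),
    Doc("Positional encoding adds order information."),
]


def fake_retriever(query: str) -> list[Doc]:
    """Toy retriever: return chunks sharing at least one word with the query."""
    words = set(query.lower().split())
    return [
        d for d in CORPUS
        if words & set(d.page_content.lower().rstrip(".").split())
    ]


def merge_unique(results: list[list[Doc]]) -> list[str]:
    """Flatten per-query results and drop duplicates, keeping first-seen order."""
    seen = dict.fromkeys(d.page_content for docs in results for d in docs)
    return list(seen)


queries = ["how does attention work", "what adds context", "what is tokenization"]
per_query = [fake_retriever(q) for q in queries]   # one result list per question
merged = merge_unique(per_query)
print(merged)
```

The first chunk is returned for two of the three queries but appears only once in `merged`, which is the reason the chain deduplicates before building the context.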

In [39]:
template = """Answer the following question based on this context:

{context}

Question: {question}
"""


qaPrompt = ChatPromptTemplate.from_template(template)

print(qaPrompt)

input_variables=['context', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template='Answer the following question based on this context:\n\n{context}\n\nQuestion: {question}\n'))]


In [40]:
chain= ({"context" : retrivalChain, "question" : RunnablePassthrough()}|
        qaPrompt |
        llm |
        StrOutputParser())
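
The dict at the head of this chain sends the same input question down two branches: the retrieval chain fills `context`, and `RunnablePassthrough()` forwards the question unchanged. A rough plain-Python equivalent of that data flow, with hypothetical stand-in functions in place of the retriever and with the model call left out:

```python
QA_TEMPLATE = """Answer the following question based on this context:

{context}

Question: {question}
"""


def fake_retrieval(question: str) -> str:
    """Stand-in for retrivalChain: returns fixed context text."""
    return "Transformers use attention to track context."


def passthrough(question: str) -> str:
    """Stand-in for RunnablePassthrough: returns its input unchanged."""
    return question


def build_prompt(question: str) -> str:
    # Both branches receive the same input; their results fill the template.
    inputs = {
        "context": fake_retrieval(question),
        "question": passthrough(question),
    }
    return QA_TEMPLATE.format(**inputs)


prompt = build_prompt("What is transformer architecture?")
print(prompt)
```

In the real chain the formatted prompt is then passed to `llm`, and `StrOutputParser()` extracts the answer text from the model's message.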

In [41]:
chain.invoke(question)

'Transformer architecture is a type of model used in machine learning that is particularly good at keeping track of context in the text it processes. It is composed of several key components, including tokenization, embedding, positional encoding, and the transformer block. The transformer block itself is made up of two main parts: the attention component and the feedforward component. The architecture can seem complex at first, but when broken down into its individual parts, it is easier to understand.\n\nIn tokenization, the input text is broken down into individual tokens, such as words, punctuation signs, etc. These tokens are then turned into vectors of numbers using an embedding. The positional encoding step combines all of these vectors into one vector for processing.\n\nThe transformer block is where the real magic happens. It is made up of several attention components, which are used to add context to each word in the text. This is important because the same word can have diff