<a href="https://colab.research.google.com/github/advik-7/NLP_projects/blob/main/Gen_RAG_LangChain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install langchain_community
!pip install pypdf


In [3]:
!pip install --upgrade google-generativeai -qq

In [4]:

!pip install faiss-cpu==1.7.4

Collecting faiss-cpu==1.7.4
  Downloading faiss_cpu-1.7.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.3 kB)
Downloading faiss_cpu-1.7.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.6 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.7.4
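
The pipeline in this notebook reads the Gemini key from `os.environ["GEMINI_API_KEY"]`, but no cell sets it. The sketch below is one possible guard that fails early with a readable message instead of a bare `KeyError`. The helper name `require_env` is hypothetical and not part of any library used here.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; define it before running the pipeline."
        )
    return value
```

In Colab, one way to populate the variable beforehand is `os.environ["GEMINI_API_KEY"] = userdata.get("GEMINI_API_KEY")` after `from google.colab import userdata`, assuming the key has been stored as a notebook secret under that name.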


In [31]:
import os
from typing import Any, List, Mapping, Optional

import google.generativeai as genai
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_core.language_models.llms import LLM
from sentence_transformers import SentenceTransformer

def process_query(file_path: str, query: str) -> str:
    # Expects GEMINI_API_KEY to be set in the environment before this function is called

    # Load the document
    loader = PyPDFLoader(file_path)
    documents = loader.load()

    # Split the documents into chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    texts = text_splitter.split_documents(documents)

    # Generate embeddings for the chunks
    model = SentenceTransformer('all-MiniLM-L6-v2')
    embeddings = model.encode([t.page_content for t in texts])

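    # Minimal embedding adapter exposing the methods LangChain's FAISS store calls
    class CustomEmbeddings:
        def __init__(self, model):
            self.model = model

        def embed_documents(self, texts):
            # Return plain lists of floats rather than a NumPy array
            return self.model.encode(texts).tolist()

        def embed_query(self, text):
            return self.model.encode(text).tolist()

        def __call__(self, text):
            # Used when the store treats this object as a plain embedding function
            return self.model.encode(text).tolist()

    custom_embeddings = CustomEmbeddings(model)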

    # Create FAISS vector store
    db = FAISS.from_embeddings(
        text_embeddings=[(t.page_content, embedding) for t, embedding in zip(texts, embeddings)],
        embedding=custom_embeddings
    )

    # Thin wrapper around the Gemini client; it configures the API key and generation settings itself
    class CustomGemini:
        """Custom class to interact with Google Gemini."""

        def __init__(self, temperature: float, max_tokens: int, model: str, google_api_key: str):
            self.temperature = temperature
            self.max_tokens = max_tokens
            self.model = model
            genai.configure(api_key=google_api_key)
            self.generation_config = {
                "temperature": temperature,
                "max_output_tokens": max_tokens,
            }
            self.model_instance = genai.GenerativeModel(
                model_name=model,
                generation_config=self.generation_config,
            )

        def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:
            # `stop` is accepted for interface compatibility but is not forwarded to Gemini
            response = self.model_instance.generate_content(prompt)
            return response.text

    custom_gemini = CustomGemini(
        temperature=0.7,
        max_tokens=512,
        model="gemini-1.5-flash",
        google_api_key=os.environ["GEMINI_API_KEY"]
    )

    # Create a wrapper for the generative model
    class CustomLLMWrapper(LLM):
        custom_llm: CustomGemini

        @property
        def _llm_type(self) -> str:
            return "custom_gemini"

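        def _call(
            self,
            prompt: str,
            stop: Optional[List[str]] = None,
            run_manager: Optional[Any] = None,
            **kwargs: Any,
        ) -> str:
            # Delegate generation to the wrapped Gemini client
            return self.custom_llm(prompt, stop=stop)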

        @property
        def _identifying_params(self) -> Mapping[str, Any]:
            return {
                "temperature": self.custom_llm.temperature,
                "max_tokens": self.custom_llm.max_tokens,
                "model": self.custom_llm.model,
            }

    wrapped_llm = CustomLLMWrapper(custom_llm=custom_gemini)

    # Create the RetrievalQA chain
    retriever = db.as_retriever(search_kwargs={"k": 3})
    qa = RetrievalQA.from_chain_type(
        llm=wrapped_llm, chain_type="stuff", retriever=retriever
    )

    # Run the query; RetrievalQA returns a dict whose "result" key holds the answer
    response = qa.invoke({"query": query})
    return response["result"]




The passage describes the attention mechanism as "Scaled Dot-Product Attention".  It takes queries and keys of dimension dk, and values of dimension dv as input.  The output is a weighted sum of the values, where the weight assigned to each value is computed by a compatibility function of the query with the corresponding key.
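
To make the retrieval step in `process_query` more concrete, here is a dependency-free sketch of what the retriever with `k` nearest chunks and the `"stuff"` chain type do conceptually: rank chunks by distance to the query embedding, then paste the best ones into a single prompt. It assumes Euclidean (L2) distance, which I believe is the default for LangChain's FAISS wrapper. The 2-D vectors, the chunk texts, and the prompt wording are made up for illustration and are not the template LangChain actually uses.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[str]:
    """Return the texts of the k chunks closest to the query vector."""
    ranked = sorted(chunks, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


def stuff_prompt(question: str, contexts: List[str]) -> str:
    """Concatenate all retrieved chunks into one prompt (the 'stuff' idea)."""
    joined = "\n\n".join(contexts)
    return (
        "Use the following context to answer the question.\n\n"
        f"{joined}\n\nQuestion: {question}\nAnswer:"
    )


# Toy (text, embedding) pairs; real embeddings from all-MiniLM-L6-v2 have 384 dimensions.
toy_chunks = [
    ("attention maps a query and key-value pairs to an output", [0.9, 0.1]),
    ("training used 8 GPUs", [0.1, 0.9]),
    ("keys and values are vectors", [0.8, 0.3]),
    ("BLEU scores on WMT", [0.0, 1.0]),
]
toy_query_vec = [1.0, 0.0]

selected = top_k(toy_query_vec, toy_chunks, k=2)
prompt = stuff_prompt("What is the attention mechanism?", selected)
```

Because every retrieved chunk is pasted into one prompt, `k` and `chunk_size` together bound how much text the model receives per query.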



In [32]:
file_path = "/content/NIPS-2017-attention-is-all-you-need-Paper.pdf"
query = "What is the attention mechanism?"
response = process_query(file_path, query)
print(response)




The passage describes a "Scaled Dot-Product Attention" mechanism.  The input consists of queries and keys of dimension dk, and values of dimension dv.  The output is a weighted sum of the values, where the weight assigned to each value is computed by a compatibility function of the query with the corresponding key.
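
The answer above corresponds to the paper's formula Attention(Q, K, V) = softmax(QKᵀ / √dk) V. As a rough illustration, the sketch below computes it for a single query vector in plain Python. The function names and the list-based representation are choices made for this example, not code from the paper or from any library used in this notebook.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(
    query: Vector, keys: List[Vector], values: List[Vector]
) -> Vector:
    """Weighted sum of values; weights come from scaled query-key dot products."""
    d_k = len(query)
    # Compatibility of the query with each key, scaled by sqrt(d_k)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k) for key in keys
    ]
    weights = softmax(scores)
    d_v = len(values[0])
    # Output dimension j is the weighted sum of dimension j across all values
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
```

When all keys are identical, every value receives the same weight and the output reduces to the plain average of the values, which is a quick sanity check for the function.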

