<a href="https://colab.research.google.com/github/QaziSaim/CASE-STUDIES/blob/main/LangChain_Documentation_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [20]:
%%capture
%pip install --quiet --upgrade langchain-text-splitters langchain-community langgraph
%pip install -U langchain-google-genai
%pip install -qU langchain-huggingface
%pip install -qU langchain-pinecone

In [21]:
from google.colab import userdata
langsmith_api = userdata.get('langsmith_api')
gemini_api = userdata.get('GEMINI_API')
hugging_face_api = userdata.get('HUGGINGFACE_API')
pinecone_api_key = userdata.get('PINECONE_API')

In [22]:
import os

# Enable LangSmith tracing and expose the API keys to the client libraries.
os.environ['LANGSMITH_TRACING'] = 'true'
os.environ['LANGSMITH_API_KEY'] = langsmith_api
os.environ['GOOGLE_API_KEY'] = gemini_api
os.environ['HUGGINGFACE_API_KEY'] = hugging_face_api
os.environ['PINECONE_API_KEY'] = pinecone_api_key

In [23]:
from langchain.chat_models import init_chat_model

llm = init_chat_model("gemini-2.5-flash", model_provider="google_genai")

In [24]:
# Optional smoke test for the chat model:
# llm.invoke('Do you know about decoder-only models?')

In [25]:
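from langchain_huggingface import HuggingFaceEmbeddings

# all-mpnet-base-v2 produces 768-dimensional vectors; the Pinecone index must use the same dimension.
embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-mpnet-base-v2')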

In [31]:
# One-time setup: create the Pinecone index used in the next cell.
# Left commented out because the index only needs to be created once.
# pip install "pinecone[grpc]"

# Serverless index (dimension=768 matches the all-mpnet-base-v2 embeddings)
# from pinecone.grpc import PineconeGRPC as Pinecone
# from pinecone import ServerlessSpec

# pc = Pinecone(api_key=pinecone_api_key)

# pc.create_index(
#   name="rag-experiment",
#   dimension=768,
#   metric="cosine",
#   spec=ServerlessSpec(
#     cloud="aws",
#     region="us-east-1",
#   ),
#   deletion_protection="disabled"
# )

# Pod-based index (reference only; not used in this notebook).
# Its dimension of 1536 does not match the 768-dimensional embeddings above.
# from pinecone.grpc import PineconeGRPC as Pinecone, PodSpec

# pc = Pinecone(api_key="YOUR_API_KEY")

# pc.create_index(
#   name="docs-example2",
#   dimension=1536,
#   metric="cosine",
#   spec=PodSpec(
#     environment="us-west1-gcp",
#     pod_type="p1.x1",
#     pods=1,
#   ),
#   deletion_protection="disabled"
# )

In [32]:
from langchain_pinecone import PineconeVectorStore
from pinecone import Pinecone

pc = Pinecone(api_key=pinecone_api_key)
index = pc.Index('rag-experiment')

vector_store = PineconeVectorStore(embedding=embeddings, index=index)

In [33]:
import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langgraph.graph import START, StateGraph
from typing_extensions import List, TypedDict

In [34]:
# Keep only the post title, header, and body; drop the rest of the page.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=('post-content', 'post-title', 'post-header')
        )
    ),
)

In [35]:
docs = loader.load()

In [None]:
docs[0]

In [36]:
# Split the post into chunks of about 1000 characters that overlap by 200 characters.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
all_splits = text_splitter.split_documents(docs)
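
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a plain sliding window over characters. This is only a minimal sketch: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, so its chunks will not match these exactly. The helper name `split_with_overlap` and the sample string are hypothetical and not part of the notebook.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into fixed-size character windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical 2500-character sample text.
sample = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(sample)

print([len(c) for c in chunks])
print(chunks[0][-200:] == chunks[1][:200])
```

The overlap means a sentence that straddles a chunk boundary is still likely to appear whole in at least one chunk, at the cost of storing some text twice.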

In [39]:
# Embed and upsert the chunks. Re-running this cell adds the same chunks again.
_ = vector_store.add_documents(documents=all_splits)

In [40]:
prompt = hub.pull('rlm/rag-prompt')

In [61]:
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str

In [62]:
def retrieve(state: State):
    # Fetch the chunks most similar to the question from Pinecone.
    retrieved_docs = vector_store.similarity_search(state['question'])
    return {'context': retrieved_docs}
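
Conceptually, `similarity_search` embeds the question and returns the stored chunks whose vectors are closest to it (four by default in LangChain's vector store interface). The sketch below shows that ranking idea in plain Python, assuming the index uses the cosine metric as in the commented-out setup cell. The function names, the toy three-dimensional vectors, and the chunk IDs are hypothetical; Pinecone does this server-side with an approximate nearest-neighbour index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the IDs of the k stored vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy "index": real embeddings here would have 768 dimensions.
toy_index = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.9, 0.1, 0.0],
    "chunk-c": [0.0, 1.0, 0.0],
    "chunk-d": [0.0, 0.0, 1.0],
}
nearest = top_k([1.0, 0.05, 0.0], toy_index, k=2)
print(nearest)
```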

In [63]:
def generate(state: State):
    # Join the retrieved chunks into one context string and fill the RAG prompt.
    docs_content = '\n\n'.join(doc.page_content for doc in state['context'])
    message = prompt.invoke({'question': state['question'], 'context': docs_content})
    response = llm.invoke(message)
    return {'answer': response.content}

In [64]:
# Run retrieve, then generate; START marks retrieve as the entry point.
graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, 'retrieve')
graph = graph_builder.compile()
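
The way the two nodes cooperate can be pictured as a loop that passes a shared state dictionary through each function and merges the partial update it returns. The following is a rough mental model only, assuming keys without a reducer are simply overwritten; it is not how LangGraph is implemented, and `run_sequence`, `toy_retrieve`, and `toy_generate` are hypothetical stand-ins that need no API keys.

```python
def run_sequence(nodes, initial_state):
    """Call each node in order and merge its returned keys into the shared state."""
    state = dict(initial_state)
    for node in nodes:
        update = node(state)   # each node returns only the keys it changes
        state.update(update)   # later nodes see earlier nodes' results
    return state


def toy_retrieve(state):
    # Stand-in for the vector store lookup.
    return {"context": [f"passage about {state['question']}"]}


def toy_generate(state):
    # Stand-in for prompting the LLM with the joined context.
    joined = "\n\n".join(state["context"])
    return {"answer": f"Based on: {joined}"}


final_state = run_sequence([toy_retrieve, toy_generate], {"question": "task decomposition"})
print(final_state["answer"])
```

Returning partial updates keeps each node small: `retrieve` never needs to know about `answer`, and `generate` only reads what earlier steps wrote.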

In [65]:
response = graph.invoke({'question': 'What is Task Decomposition?'})
print(response['answer'])

Task decomposition is a technique where a complex task is broken down into smaller, simpler, and more manageable steps. This process, often facilitated by methods like Chain of Thought (CoT), helps models address difficult problems by thinking step-by-step. It transforms large tasks into multiple manageable sub-tasks.
