
----

# **`RAG Experiments`**

- This notebook runs a few experiments with LangChain and retrieval-augmented generation (RAG).

-----


### **Import Libraries**

In [None]:
import os
from pathlib import Path
from dotenv import load_dotenv

from langchain_groq import ChatGroq
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders.text import TextLoader
from langchain_community.document_loaders import (
    WebBaseLoader,
    PyPDFLoader,
    Docx2txtLoader,
)
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

load_dotenv()

True

### **Load Documents**

In [11]:
doc_paths = [
    "docs/DeepSeek_R1.pdf",
    "docs/3_months_data_scientist_Roadmap.md",
    "docs/Skin_Disease_Detection_Project_Description.docx",
]

docs = []
for doc_file in doc_paths:
    file_path = Path(doc_file)
    suffix = file_path.suffix.lower()

    try:
        # Pick a loader based on the file extension
        if suffix == ".pdf":
            loader = PyPDFLoader(str(file_path))

        elif suffix == ".docx":
            loader = Docx2txtLoader(str(file_path))

        elif suffix in (".txt", ".md"):
            loader = TextLoader(str(file_path))

        else:
            print(f"\nDocument type {suffix} not supported.")
            continue

        # Actually read the file and collect its documents
        docs.extend(loader.load())

    except Exception as e:
        print(f"\nError while loading document {file_path.name}: {e}")

    # The source files are kept on disk: they are project documents, not temporary uploads.
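
The extension checks in the loop above can also be written as a lookup table. The following is a minimal sketch, not part of the original notebook: `LOADER_BY_SUFFIX` and `pick_loader_name` are hypothetical names, and the table stores loader names as strings so the snippet runs without LangChain installed. In the notebook the values would be the loader classes themselves.

```python
from pathlib import Path

# Hypothetical lookup table: file suffix -> name of the loader to use.
# In the notebook, the values could be the loader classes instead of strings.
LOADER_BY_SUFFIX = {
    ".pdf": "PyPDFLoader",
    ".docx": "Docx2txtLoader",
    ".txt": "TextLoader",
    ".md": "TextLoader",
}


def pick_loader_name(path):
    """Return the loader name for a path, or None if the type is unsupported."""
    return LOADER_BY_SUFFIX.get(Path(path).suffix.lower())


for path in ["docs/DeepSeek_R1.pdf", "notes.MD", "table.csv"]:
    print(path, "->", pick_loader_name(path))
```

Using `Path.suffix.lower()` also handles upper-case extensions such as `.MD`, which a plain `endswith(".md")` check would miss.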

### **Load URL**

In [12]:
url = "https://docs.streamlit.io/develop/quick-reference/release-notes"

try:
    loader = WebBaseLoader(url)
    docs.extend(loader.load())

except Exception as e:
    print(f"\nError While Loading Document from {url}: {e}")

In [None]:
# Preview the first 453 characters of the web page (the last document loaded)
text = docs[-1].page_content[:453]
print(text)

Release notes - Streamlit DocsDocumentationsearchSearchrocket_launchGet startedInstallationaddFundamentalsaddFirst stepsaddcodeDevelopConceptsaddAPI referenceaddTutorialsaddQuick referenceremoveCheat sheetRelease notesremove2025202420232022202120202019Pre-release featuresRoadmapopen_in_newweb_assetDeployConceptsaddStreamlit Community CloudaddSnowflakeOther platformsaddschoolKnowledge baseFAQInstalling dependenciesDeployment issuesHome/Develop/Quick 


### **Split Documents into Chunks**

In [21]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 5000,
    chunk_overlap = 1000,
)

document_chunks = text_splitter.split_documents(docs)
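
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window sketch. It is an illustration only and not the algorithm `RecursiveCharacterTextSplitter` uses, which also tries to break on separators such as paragraph and line boundaries. The function name `sliding_window_chunks` is hypothetical, and the sizes are small so the result is easy to inspect.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap.

    Simplified stand-in for illustration: each new chunk starts
    (chunk_size - chunk_overlap) characters after the previous one.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 3  # 30 characters
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(repr(chunk))
```

With the notebook's settings (`chunk_size = 5000`, `chunk_overlap = 1000`), consecutive chunks can share up to 1000 characters, which helps keep a passage that straddles a boundary retrievable from at least one chunk.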

### **Embed and Load the Documents into the Vector Store**

In [26]:
vector_db = Chroma.from_documents(
    documents = document_chunks,
    embedding = GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)
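
Conceptually, the vector store keeps one embedding per chunk and returns the chunks whose embeddings are closest to the query embedding. The sketch below illustrates that idea with hand-written three-dimensional vectors and cosine similarity. Everything in it is an assumption for illustration: `toy_store`, `cosine_similarity`, and `top_k` are hypothetical names, the vectors are made up, and the distance metric Chroma actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k entries most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Made-up (text, embedding) pairs standing in for embedded chunks
toy_store = [
    ("streamlit release notes", [0.9, 0.1, 0.0]),
    ("deepseek r1 paper", [0.1, 0.9, 0.1]),
    ("data science roadmap", [0.0, 0.2, 0.9]),
]

results = top_k([1.0, 0.0, 0.0], toy_store, k=1)
print(results)
```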

### **Define a Retriever to Get Content from the Vector Store**

In [27]:
def _get_context_retriver_chain(vector_db, llm):
    retriever = vector_db.as_retriever()
    prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="messages"),
        ("user", "{input}"),
        ("user", "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation, focusing on the most recent messages."),
    ])
    retriever_chain = create_history_aware_retriever(llm, retriever, prompt)
    return retriever_chain

### **Define a Conversational RAG Chain**

In [28]:
def get_conversational_rag_chain(llm):
    # Uses the module-level `vector_db` created above
    retriever_chain = _get_context_retriver_chain(vector_db, llm)

    prompt = ChatPromptTemplate.from_messages([
        ("system",
         """You are a helpful assistant. Answer the user's queries.
         You will be given some context to help with your answers, but it will not always be completely relevant or helpful.
         You can also use your own knowledge to help answer the user's queries.\n
         {context}"""),
        MessagesPlaceholder(variable_name="messages"),
        ("user", "{input}"),
    ])
    stuff_documents_chain = create_stuff_documents_chain(llm, prompt)

    return create_retrieval_chain(retriever_chain, stuff_documents_chain)


### **Augmented Response Generation**

In [29]:
llm_stream_groq = ChatGroq(
    model = "deepseek-r1-distill-llama-70b",
    temperature = 0.3,
    streaming = True,
)

llm_stream_gemini = ChatGoogleGenerativeAI(
    model = "gemini-1.5-flash",
    temperature = 0.3,
)

llm_stream = llm_stream_groq # select between groq and gemini for response

messages = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hi there! How can I assist you today?"},
    {"role": "user", "content": "What is the latest version of Streamlit?"},
]

messages = [HumanMessage(content=m["content"]) if m["role"] == "user" else AIMessage(content=m["content"]) for m in messages]

conversation_rag_chain = get_conversational_rag_chain(llm_stream)

response_message = "**RAG Response**\n"

for chunk in conversation_rag_chain.pick("answer").stream({"messages": messages[:-1], "input": messages[-1].content}):
    response_message += chunk
    print(chunk, end="", flush=True)

# Store the reply as an AIMessage so the history stays a list of message objects
messages.append(AIMessage(content=response_message))




<think>
Okay, the user is asking about the latest version of Streamlit. I remember seeing the release notes earlier. Let me check the context provided. 

In the context, under the release notes, it mentions version 1.42.0 with a release date of February 4, 2025. That's the latest one. 

I should confirm that this is indeed the most recent version. Also, the user might be looking to upgrade, so including the upgrade command would be helpful. 

I should present the version clearly and maybe add a tip on how to upgrade using pip. That way, the user gets both the information and the next steps if they need to update.
</think>

The latest version of Streamlit is **1.42.0**, released on **February 4, 2025**. You can upgrade to this version using pip:

```bash
pip install --upgrade streamlit
```
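
The streamed output above begins with a `<think>...</think>` block containing the model's reasoning, and that block ends up in `response_message` too. If only the final answer should be stored in the chat history, one option is to strip the block first. This is a minimal sketch that assumes the reasoning is always wrapped in a single well-formed `<think>` pair, as in the output above; `strip_reasoning` is a hypothetical helper, and the sample string is made up.

```python
import re

# Matches a <think>...</think> block, including newlines inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning(text):
    """Remove <think> blocks and surrounding whitespace from a response."""
    return THINK_BLOCK.sub("", text).strip()


raw = "<think>\nChecking the release notes.\n</think>\n\nThe latest version is 1.42.0."
clean = strip_reasoning(raw)
print(clean)
```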

----