- pip install langchain qdrant-client
- pip install langchain.text_splitter
- pip install --upgrade google-generativeai


Short description:
- Loads a text file and splits it into semantic chunks.
- Creates embeddings using Google Generative AI (Gemini).
- Stores vectors in a Qdrant collection and uploads points.
- Provides functions: make_embedding (embed query), get_chunk_from_db (retrieve top chunks), ask_llm (generate answer via LLMs).
- Implements a run_rag_pipeline to run embed → retrieve → generate, plus a VectorDBTool and a LangChain agent for tooling.
"""

In [1]:
from langchain.text_splitter import RecursiveCharacterTextSplitter # Import the text splitter

# Read your TXT
with open(r"D:\My B.Tech\sem8\internship\LangChainTutorial\RAG\test_text.txt", "r", encoding="utf-8") as f: # Provide the full path to the text file
    text = f.read()

# Create a semantic text splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,       # target chunk size (characters)
    chunk_overlap=50,     # overlap between chunks
    separators=["\n\n", "\n", ".", " "]  # split by paragraph → line → sentence → word
)

chunks = text_splitter.split_text(text)
print(f"Created {len(chunks)} semantic chunks")
print(chunks)


Created 6 semantic chunks
['The Bishop and the Canary\nSmall had earned the canary and loved him. How she did love him! When they had told her, “You may take your pick,” and she leaned over the cage and saw the four fluffy yellow balls, too young to have even sung their first song, her breath and her heart acted so queerly that it seemed as if she must strangle. She chose the one with the topknot. He was the first live creature she had ever owned. \n\n\n“Mine! I shall be his God,” she whispered.', '“Mine! I shall be his God,” she whispered. \n\nHow could she time her dancing feet to careful stepping? She was glad the cage protected him sufficiently so that she could hug it without hurting him. \n\nSave for the flowers that poked their faces through the fences, and for the sunshine, the long street was empty. She wished that there was someone to show him to—someone to say, “He is lovely!”', 'A gate opened and the Bishop stepped into the street. The Bishop was very holy—everybody said so

In [2]:
for i, chunk in enumerate(chunks):
    print(f"--- Chunk {i} ---\n{chunk}\n")

--- Chunk 0 ---
The Bishop and the Canary
Small had earned the canary and loved him. How she did love him! When they had told her, “You may take your pick,” and she leaned over the cage and saw the four fluffy yellow balls, too young to have even sung their first song, her breath and her heart acted so queerly that it seemed as if she must strangle. She chose the one with the topknot. He was the first live creature she had ever owned. 


“Mine! I shall be his God,” she whispered.

--- Chunk 1 ---
“Mine! I shall be his God,” she whispered. 

How could she time her dancing feet to careful stepping? She was glad the cage protected him sufficiently so that she could hug it without hurting him. 

Save for the flowers that poked their faces through the fences, and for the sunshine, the long street was empty. She wished that there was someone to show him to—someone to say, “He is lovely!”

--- Chunk 2 ---
A gate opened and the Bishop stepped into the street. The Bishop was very holy—everybody

In [3]:
from dotenv import load_dotenv
import os

load_dotenv()  # loads variables from .env

api_key_gemini = os.getenv("GEMINI_KEY")
api_key_gorq = os.getenv("GORQ_KEY")
# Check if key is loaded
#print(api_key_gemini)

- pip install --upgrade langchain-google-genai google-generativeai python-dotenv

In [4]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Initialize the embeddings model
embeddings_model = GoogleGenerativeAIEmbeddings(
    model="models/gemini-embedding-001",
    google_api_key=api_key_gemini
)

  from .autonotebook import tqdm as notebook_tqdm


In [29]:
# Generate embeddings for all chunks
embeddings_list = [embeddings_model.embed_query(chunk) for chunk in chunks]

print(f"Generated embeddings for {len(embeddings_list)} chunks")
print(f"Length of first embedding vector: {len(embeddings_list[0])}")

Generated embeddings for 6 chunks
Length of first embedding vector: 3072


In [61]:
print(embeddings_list[0])
vector_dimension = len(embeddings_list[0])

print(f"Your vector dimension is: {vector_dimension}")


[0.009391549974679947, 0.008400637656450272, 0.01899980939924717, -0.060855165123939514, -0.015411262400448322, 0.005167864263057709, 0.0072898268699646, -0.0036714435555040836, -0.010359469801187515, 0.00246584415435791, -0.02100243978202343, -0.003012222470715642, 0.008188084699213505, -0.01375350821763277, 0.12135331332683563, -0.00738591467961669, -0.010418131947517395, 0.01920497789978981, -0.008641323074698448, -0.006009988486766815, -0.009551532566547394, -0.02796027809381485, 0.015064830891788006, -0.0019136556657031178, 0.0019446618389338255, -0.007681249175220728, 0.0373704768717289, -0.007615493610501289, 0.0036665068473666906, -0.008584634400904179, 0.009144812822341919, -0.031608883291482925, -0.022242307662963867, -0.0058705760166049, -0.009547023102641106, 0.0024248971603810787, -0.010679609142243862, -0.0018990867538377643, -0.008171599358320236, 0.017466114833950996, -0.015789907425642014, -0.0002757310576271266, -0.0009340400574728847, -0.0265423022210598, 0.009473704

In [7]:
text = "The Bishop and the Canary Small had earned the canary and loved him..."
embedding_vector = embeddings_model.embed_query(text)

print(f"Embedding vector: {embedding_vector[:10]}...")  # Display first 10 elements


Embedding vector: [0.007367188576608896, -0.004111499059945345, 0.021217752248048782, -0.06289070844650269, -0.02294917218387127, -0.0021960847079753876, -0.0005966837052255869, 0.0008920596446841955, -0.01060901116579771, -0.002885199850425124]...


- pip install qdrant-client


In [17]:
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, PointStruct

# Replace with your Qdrant URL and API key
qdrant_url = os.getenv("QDRANT_URL")       # e.g., "https://xyz-123.qdrantcloud.com"
qdrant_api_key = os.getenv("QDRANT_API_KEY")  # Your Qdrant API key

client = QdrantClient(
    url=qdrant_url,
    api_key=qdrant_api_key
)

In [None]:
# from qdrant_client.models import PointStruct

# points = [
#     PointStruct(
#         id=i,
#         vector=embeddings_list[i],
#         payload={"text": chunks[i]}
#     )
#     for i in range(len(chunks))
# ]


In [None]:
from qdrant_client.models import Distance, VectorParams, OptimizersConfig

# Set a name for your collection
COLLECTION_NAME = "my_text_chunks" # Pick a cool name!

# Set the parameters for your vectors (THIS IS CRITICAL!)
# Replace 3072 with the actual dimension of your embeddings_list vectors
VECTOR_SIZE = 3072  # e.g., 1536 for OpenAI's ada-002 embeddings & 3072 for Gemini Embeddings 

# 1. Define the Collection Structure
vectors_config = VectorParams(
    size=VECTOR_SIZE, 
    distance=Distance.COSINE # You can choose other distances like EUCLID, DOT, etc. range 1 to -1
)

# 2. Execute the creation command
try:
    client.recreate_collection(
        collection_name=COLLECTION_NAME,
        vectors_config=vectors_config
    )
    print(f"Collection '{COLLECTION_NAME}' created successfully! 🎉")
except Exception as e:
    print(f"Error creating collection: {e}")

  client.recreate_collection(


Collection 'my_text_chunks' created successfully! 🎉


In [None]:
# # CHANGE THIS LINE!
# VECTOR_SIZE = 3072 # <--- NOW IT MATCHES YOUR EMBEDDINGS

# vectors_config = VectorParams(
#     size=VECTOR_SIZE,
#     distance=Distance.COSINE
# )

# client.recreate_collection(
#     collection_name=COLLECTION_NAME,
#     vectors_config=vectors_config
# )

  client.recreate_collection(


True

- Final FIXXXX

In [None]:
import os
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct, UpdateStatus

# --- CONFIG ---
QDRANT_URL = os.getenv("QDRANT_URL")
QDRANT_API_KEY = os.getenv("QDRANT_API_KEY")
COLLECTION_NAME = "my_text_chunks" 
VECTOR_SIZE = 3072

# Assuming your variables are available globally or defined right above this
# If they are NOT defined, this script will fail.
try:
    data_count = len(chunks)
except NameError:
    print("FATAL ERROR: 'chunks' variable not found. Run your chunking code first!")
    exit(1)


# --- UPLOAD WORKFLOW ---
client = QdrantClient(url=QDRANT_URL, api_key=QDRANT_API_KEY)
vectors_config = VectorParams(size=VECTOR_SIZE, distance=Distance.COSINE)
    
# 1. Recreate the collection with the correct 3072 size
client.recreate_collection(
    collection_name=COLLECTION_NAME,
    vectors_config=vectors_config
)

# 2. Prepare points
points = [
    PointStruct(
        id=i,
        vector=embeddings_list[i], # Make sure embeddings_list is defined(AI GENTRATED CODES OF CHUNKS)
        payload={"text": chunks[i]} # Make sure chunks is defined (acturel chuks' data)
    )
    for i in range(data_count) # Loop through all data points
]

# 3. Upsert the data
operation_info = client.upsert( # CAPTURE THE RETURNED INFO
    collection_name=COLLECTION_NAME, # Your collection name
    wait=True, # Wait for the operation to complete
    points=points # Your data points
)

# 4. Check status
if operation_info.status == UpdateStatus.COMPLETED: # SUCCESS
    print(f"🔥 CONGRATS! {data_count} points are loaded. Status: COMPLETED.") 
    print("NOW run your RAG workflow, it will work!")
else: # FAILURE
    print(f"FAILED to upload. Status: {operation_info.status}")


  client.recreate_collection(


🔥 CONGRATS! 6 points are loaded. Status: COMPLETED.
NOW run your RAG workflow, it will work!


In [None]:
# client.upsert(collection_name="my_text_chunks", points=points)
# print(f"{len(points)} chunks uploaded successfully!")


In [8]:
import os
from langchain_google_genai import GoogleGenerativeAIEmbeddings # Updated Import!

# --- The Core Function for embedings ---
def make_embedding(text: str) -> list[float]:
    """
    Function 1: Generates the vector embedding for the input text (query).
    """
    print("Step 1: Making query embedding...")
    # The LangChain model client handles the API call and returns a list of floats
    return embeddings_model.embed_query(text)

# if __name__ == '__main__':
#     # This block confirms the function is ready without executing the API call.
#     query_text = "The Bishop and the Canary Small had earned the canary and loved him..."
#     print(f"Function 'make_embedding' is defined and ready for use. 🚀")
#     print(f"Using Google Generative AI Embedding Model: models/gemini-embedding-001")
#     print(f"Test text defined: '{query_text}'")
#     print("\nTo run, import this function into your main script and call it directly.")
#     # Example usage:
#     # from step_1_embed import make_embedding
#     # vector = make_embedding(query_text)


In [9]:
# This file handles the retrieval step: connecting to Qdrant and finding 
# the most relevant text chunks (context) based on a query vector.

import os
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance

# --- Configuration (Set your actual collection name here!) ---
# IMPORTANT: This MUST match the name of the collection where you uploaded your data.
COLLECTION_NAME = "my_text_chunks" 

# --- The Core Function ---
def get_chunk_from_db(query_vector: list[float]) -> str:
    """
    Function 2: Uses the query vector to search Qdrant and retrieve the top 3 relevant chunks.
    (Requires a 'query_vector' generated from the embedding step)
    """
    
    # 1. Configuration Check
    QDRANT_URL = os.getenv("QDRANT_URL")
    QDRANT_API_KEY = os.getenv("QDRANT_API_KEY")
    
    if not QDRANT_URL or not QDRANT_API_KEY:
        error_msg = "FATAL: QDRANT_URL or QDRANT_API_KEY environment variables are missing. Cannot connect to the database."
        print(f"❌ {error_msg}")
        return error_msg

    # 2. Qdrant Client Initialization (Inside function for clean execution)
    try:
        qdrant_client = QdrantClient(
            url=QDRANT_URL,
            api_key=QDRANT_API_KEY
        )
        # Quick check if the collection exists before searching (optional but helpful)
        qdrant_client.get_collection(collection_name=COLLECTION_NAME) # Will raise an error if collection doesn't exist
        
    except Exception as e:
        error_msg = f"FATAL: Qdrant Connection or Collection Error for '{COLLECTION_NAME}'. Check your URL, API Key, and Collection Name. Error: {e}"
        print(f"❌ {error_msg}")
        return error_msg

    # 3. Search Logic
    top_k = 3
    print(f"Step 2: Retrieving top {top_k} chunks from Qdrant...")
    
    try:
        search_result = qdrant_client.search(
            collection_name=COLLECTION_NAME,
            query_vector=query_vector,
            limit=top_k, 
            with_payload=True 
        )
    except Exception as e:
        error_msg = f"FATAL: Search failed. Check if the vector dimension matches the collection dimension. Error: {e}"
        print(f"❌ {error_msg}")
        return error_msg
    
    # 4. Context Processing
    context_list = [hit.payload.get("text", "Payload missing 'text' key") for hit in search_result]
    
    context = "\n---\n".join(context_list)
    
    if len(context_list) == 0:
        print(f"⚠️ Retrieved {len(context_list)} context chunks. Possible causes: wrong COLLECTION_NAME ('{COLLECTION_NAME}') or no data uploaded.")
    else:
        print(f"✅ Retrieved {len(context_list)} context chunks.")
        
    return context


In [16]:
import os
from langchain_groq import ChatGroq 
from langchain_google_genai import GoogleGenerativeAI
# --- Configuration & Initialization ---
GROQ_API_KEY = os.getenv("GROQ_KEY")     # Needed for the Groq LLM
api_key_gemini = os.getenv("GEMINI_KEY")

# LLM Gemini
llm = GoogleGenerativeAI(                  
    model="gemini-2.0-flash", #gemini-2.0-flash  # model selection this is paid model, use "gemini-1.5-pro" for free model but not awailable at this moment
    google_api_key=api_key_gemini, #auth key(API key from .env file)
    temperature=0.6, # creativity level (0-1) 0 strait forward, 1 very creative(may get false information)
    max_tokens=50,
    verbose=True
)
# LLM Groq
llm2 = ChatGroq(
    model="llama-3.1-8b-instant",
    api_key=api_key_gorq,
    temperature=0.6,
    max_tokens=50,
    verbose=True
)

# --- The Core Function ---
def ask_llm(query: str, context: str) -> str:
    """
    Function 3: Sends the question and the retrieved context to the LLM for generation.
    """
    print("Step 3: Asking LLM to generate the final answer...")

    # Build the prompt manually for simplicity
    prompt = f"""
    You are an expert Q&A assistant. Use ONLY the following context to answer the question. 
    If the answer is not found in the context, clearly state that you cannot answer based on the provided information.

    CONTEXT:
    ---
    {context}
    ---

    QUESTION: {query}
    """
    
    # Invoke the Groq LLM
    response = llm2.invoke(prompt)
    return response.content

# if __name__ == '__main__':
#     # NOTE: You would normally get this context from step_2_retrieve.py
#     DUMMY_CONTEXT = "The Bishop is a large, expensive luxury vehicle. The Canary Small is a compact, fuel-efficient economy car used for short trips."
#     DUMMY_QUERY = "What are the vehicles used for?"

#     final_answer = ask_llm(DUMMY_QUERY, DUMMY_CONTEXT)
    
#     print("\n=============================================")
#     print(f"🤖 FINAL ANSWER (LLM Generation Test):")
#     print(final_answer)
#     print("=============================================")


In [None]:
def run_rag_pipeline(query: str, verbose: bool = False) -> str:
    """
    The main execution function that runs the RAG workflow sequentially.
    
    Args:
        query (str): The user's question to be answered based on the knowledge base.
        verbose (bool): If True, prints intermediate steps and context.
    
    Returns:
        str: The final answer generated by the LLM.
    """
    if verbose:
        print("--- Starting RAG Workflow ---")
    
    # 1. Embed the query
    # Takes the user's question and turns it into a vector.
    query_vector = make_embedding(query)

    # if verbose:
    #     print(f"Query embedded. Vector size: {len(query_vector)}")
    #     # Displaying the first 5 vector elements for quick inspection
    #     print(f"Vector sample: {query_vector[:5]}...") 

    # 2. Retrieve the context
    # Uses the vector to search Qdrant for the top 3 most relevant text chunks.
    context = get_chunk_from_db(query_vector)
    
    # 3. Generate the answer
    # Passes the 'verbose' flag down to the LLM function, where the context will be printed.
    answer = ask_llm(query, context)
    
    if verbose:
        print("--- RAG Workflow Complete ---")
    return answer


# if __name__ == '__main__':
#     # --- Test Execution ---
    
#     # Define the question you want to ask your database
#     user_query = "Tell me about the Bishop and the Canary in one line."
    
#     # Execute the full pipeline function with verbose set to True
#     final_answer = run_rag_pipeline(user_query)
    
#     # --- Display Final Result ---
#     print("\n\n" + "="*50)
#     print("✨ FINAL ANSWER FROM LLAMA-3.1 VIA GROQ ✨")
#     print("="*50)
#     print(final_answer)
#     print("="*50)

Step 1: Making query embedding...
Step 2: Retrieving top 3 chunks from Qdrant...


  search_result = qdrant_client.search(


✅ Retrieved 3 context chunks.
Step 3: Asking LLM to generate the final answer...


✨ FINAL ANSWER FROM LLAMA-3.1 VIA GROQ ✨
A young girl named Small had chosen a canary from a cage, declaring "I shall be his God," but her love for the bird was rejected by the Bishop.


# Making RAG WITH LLM

In [None]:
from langchain.tools import BaseTool # Import BaseTool for tool creation

class VectorDBTool(BaseTool): # tool to interface with vector DB
    name: str = "VectorDB"
    description: str = "Use this tool to fetch relevant information from my_text_chunks for any question."

    def _run(self, query: str) -> str:
        # 1. Create embedding
        query_vector = make_embedding(query)
        # 2. Retrieve top chunks from Qdrant
        context = get_chunk_from_db(query_vector)
        return context

    async def _arun(self, query: str) -> str:
        raise NotImplementedError("Async not implemented yet")

In [17]:
from langchain.agents import initialize_agent
#from langchain.chat_models import ChatOpenAI  # or your Groq LLM wrapper

# 1. Initialize your LLM
# llm = ChatOpenAI(model_name="gpt-4", temperature=0.5)

# 2. List of tools (your DB-tool + optional others)
tools = [VectorDBTool()]

# 3. Initialize agent
agent = initialize_agent(
    tools,
    llm,
    agent="zero-shot-react-description",  # enables reasoning + tool use
    verbose=True
)

In [20]:
response = agent.invoke("Who are the Bishop and the Canary? in one line.")
print(response)



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to consult the VectorDB to find information about the Bishop and the Canary.
Action: VectorDB
Action Input: "Bishop and Canary"[0mStep 1: Making query embedding...
Step 2: Retrieving top 3 chunks from Qdrant...


  search_result = qdrant_client.search(


✅ Retrieved 3 context chunks.

Observation: [36;1m[1;3mThe Bishop and the Canary
Small had earned the canary and loved him. How she did love him! When they had told her, “You may take your pick,” and she leaned over the cage and saw the four fluffy yellow balls, too young to have even sung their first song, her breath and her heart acted so queerly that it seemed as if she must strangle. She chose the one with the topknot. He was the first live creature she had ever owned. 


“Mine! I shall be his God,” she whispered.
---
Though she played ladies with his little girls, Small stood in great awe of the Bishop. She had never voluntarily addressed him. When they were playing in his house, the children tiptoed past his study. God and the Bishop were in there making new hymns and collects. 



Her lovely bird! Because there was no one else to show him to she must show him to the Bishop. Birds belonged to the sky. The Bishop would understand. She was not at all afraid now. The bird gave her