## 1. Introduction
This project demonstrates a Retrieval-Augmented Generation (RAG) pipeline using the IBM Watsonx platform. We leverage the power of IBM's granite-13b-chat-v2 model for generation and slate-30m-eng for embeddings. The pipeline ingests a text document, creates a searchable vector knowledge base using ChromaDB, and answers user questions based on the document's content.

## 2. Setup and Dependencies
Install the required Python libraries

In [11]:
# First, uninstall any old packages to ensure a clean slate.
print("Uninstalling potentially conflicting packages...")
!pip uninstall -y langchain langchain-core langchain-community langchain-ibm

# Now, install our core packages. We explicitly add langchain-community
# to the list and let pip resolve the correct versions.
print("\nInstalling the definitive, compatible set of libraries...")
!pip install \
    "langchain" \
    "langchain-community" \
    "langchain-ibm==0.3.14" \
    "ibm-watsonx-ai" \
    "chromadb" \
    "sentence-transformers" \
    "python-dotenv" -q

print("\n✅ Installation complete. Please RESTART the kernel now for the changes to take effect.")

Uninstalling potentially conflicting packages...
Found existing installation: langchain 0.3.26
Uninstalling langchain-0.3.26:
  Successfully uninstalled langchain-0.3.26
Found existing installation: langchain-core 0.3.68
Uninstalling langchain-core-0.3.68:
  Successfully uninstalled langchain-core-0.3.68
[0mFound existing installation: langchain-ibm 0.3.14
Uninstalling langchain-ibm-0.3.14:
  Successfully uninstalled langchain-ibm-0.3.14

Installing the definitive, compatible set of libraries...

✅ Installation complete. Please RESTART the kernel now for the changes to take effect.


## 2: Securely Configure Credentials
Create a virtual .env file within the notebook to securely store your API key and Project ID, keeping them out of your main code.

In [2]:
%%writefile .env
WML_API_KEY="YOUR_API_KEY"
PROJECT_ID="YOUR_PROJECT_ID"

Writing .env


## 3. Load Credentials and Define Connection Parameters
Read the API key and Project ID from the .env file we just created and prepare them for use with the IBM Watsonx API.

In [3]:
import os
from dotenv import load_dotenv

# This function finds the .env file and loads the variables from it into the environment
load_dotenv()

# We can now access the variables using os.getenv()
wml_api_key = os.getenv("WML_API_KEY")
project_id = os.getenv("PROJECT_ID")

# --- Verification Step ---
# It's good practice to check that the variables were loaded correctly.
if not wml_api_key in wml_api_key:
    print("🚨 ERROR: Please go back to the previous cell and replace 'your_api_key_here' with your actual WML API key.")
elif not project_id in project_id:
    print("🚨 ERROR: Please go back to the previous cell and replace 'your_project_id_here' with your actual Project ID.")
else:
    print("✅ Credentials and Project ID loaded successfully!")

# This dictionary will be used by the LangChain integration to connect to Watsonx
credentials = {
    "url": "https://eu-de.ml.cloud.ibm.com",
    "apikey": wml_api_key
}

✅ Credentials and Project ID loaded successfully!


## 4. Prepare the Knowledge Base Data
Create the sample_state_of_union.txt file directly within the notebook's environment. This makes our project self-contained and easy to reproduce.

In [4]:
# Define the content for our knowledge base. This is the information
# our AI will be able to answer questions about.
state_of_union_text = """
Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans.
Six days ago, Russia's Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated.
He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. He met the Ukrainian people.
From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.
The United States is a member of a NATO Alliance along with 29 other nations. The NATO Alliance was created to secure peace and stability in Europe after World War 2.
Putin's latest attack on Ukraine was premeditated and unprovoked. He rejected repeated efforts at diplomacy. He thought the West and NATO wouldn't respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we did.
We spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. I spent countless hours unifying our European allies. America will lead that effort, releasing 30 Million barrels from our own Strategic Petroleum Reserve. And we stand ready to do more if necessary, unified with our allies. In fact, we have already announced a collective release of 60 Million barrels of oil from reserves around the world.
"""

# Define the filename
filename = "sample_state_of_union.txt"

# Write the content to the local file
with open(filename, "w") as f:
    f.write(state_of_union_text)

# You can uncomment the line below to verify the file was created and see its content
!cat sample_state_of_union.txt

print(f"✅ Knowledge base file created: {filename}")


Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans.
Six days ago, Russia's Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated.
He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. He met the Ukrainian people.
From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.
The United States is a member of a NATO Alliance along with 29 other nations. The NATO Alliance was created to secure peace and stability in Europe after World War 2.
Putin's latest attack on Ukraine was premeditated and unprovoked. He rejected repeated efforts at diplomacy. He thought the West and NATO wouldn't respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we did.
We spent months building a coalition of othe

## 5. Build the Vector Knowledge Base (Ingestion)
Load, chunk, embed, and store the document content into a searchable vector database.

In [11]:
# With langchain-community correctly installed, we use the modern import paths.
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma

# from langchain.text_splitter import CharacterTextSplitter
# Swapping to a more robust text splitter
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_ibm import WatsonxEmbeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes

# 1. Load the document from the file we created
print("1. Loading document...")
loader = TextLoader(filename)
documents = loader.load()

# 2. Split the document into smaller chunks
print("2. Splitting document into chunks...")
# This splitter will try different separators until it gets the chunk size right.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=250, chunk_overlap=30)
texts = text_splitter.split_documents(documents)
print(f"   - Document split into {len(texts)} chunks.")

# 3. Create the embedding function using a Watsonx model
print("3. Creating Watsonx embedding function...")
embeddings = WatsonxEmbeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG.value,
    url=credentials["url"],
    apikey=credentials["apikey"],
    project_id=project_id
)
print("   - Embedding function created successfully.")

# 4. Ingest documents into ChromaDB
print("4. Ingesting documents into ChromaDB vector store...")
docsearch = Chroma.from_documents(texts, embeddings)
print("✅ Ingestion complete. Vector store is ready.")

1. Loading document...
2. Splitting document into chunks...
   - Document split into 10 chunks.
3. Creating Watsonx embedding function...
   - Embedding function created successfully.
4. Ingesting documents into ChromaDB vector store...
✅ Ingestion complete. Vector store is ready.


## 6. Define the Language Model and QA Chain
Set up the powerful Granite model from Watsonx and then connect it to our docsearch retriever to form a complete Question-Answering (QA) chain.

In [12]:
from ibm_watsonx_ai.foundation_models.utils.enums import ModelTypes
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai.foundation_models.utils.enums import DecodingMethods
from langchain_ibm import WatsonxLLM

# New imports for prompt templates and manual chain creation
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.chains import RetrievalQA

# 1. Define the LLM from Watsonx that will generate answers
# We are selecting a model that is available in the Frankfurt (eu-de) region.
print("1. Defining the Watsonx LLM...")
# model_id = ModelTypes.GRANITE_13B_CHAT_V2 # This one is not available
model_id = "ibm/granite-3-3-8b-instruct" # Using the available instruction-tuned model

# 2. Define the parameters for text generation. These control the model's output.
parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 250,
    GenParams.TEMPERATURE: 0.0, # Setting to 0.0 for maximum factuality
    GenParams.REPETITION_PENALTY: 1.05,
}

# 3. Initialize the WatsonxLLM class, which is the LangChain wrapper for our model
print("2. Initializing the LangChain wrapper for the LLM...")
watsonx_llm = WatsonxLLM(
    model_id=model_id,
    url=credentials.get("url"),
    apikey=credentials.get("apikey"),
    project_id=project_id,
    params=parameters
)
print("   - LLM wrapper initialized successfully.")


# 4. Create a custom Prompt Template
# This is the key to controlling the LLM's behavior.
print("4. Creating a custom prompt template...")
prompt_template = """
Use the following pieces of context to answer the question at the end. If you don't know the answer from the context, just say that you don't know. Do not use any other information. Do not try to make up an answer.

Context:
{context}

Question: {question}
Helpful Answer:"""

# Create a PromptTemplate object
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
print("   - Prompt template created.")


# 5. Create the RAG chain using our custom prompt that connects the retriever (our ChromaDB) and the LLM
print("5. Creating the RetrievalQA chain...")
qa_chain = RetrievalQA.from_chain_type(
    llm=watsonx_llm,
    chain_type="stuff",
    retriever=docsearch.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT}
)
print("✅ QA chain is ready with improved prompting.")

1. Defining the Watsonx LLM...
2. Initializing the LangChain wrapper for the LLM...
   - LLM wrapper initialized successfully.
4. Creating a custom prompt template...
   - Prompt template created.
5. Creating the RetrievalQA chain...
✅ QA chain is ready with improved prompting.


## 7: Test the RAG System!
Ask questions to our fully assembled QA chain and see it retrieve relevant information and generate accurate answers.

In [13]:
def ask_question(query, chain):
    print(f"\n🤔 Query: {query}")
    print("--------------------------------------------------")
    
    # Invoke the chain. This is where the magic happens!
    # 1. The query is used to find relevant text chunks from ChromaDB.
    # 2. The retrieved chunks and the query are sent to the Watsonx LLM.
    # 3. The LLM generates an answer based on the provided context.
    result = chain.invoke({"query": query})
    
    # Print the final, generated answer
    print(f"\n✅ Answer:\n{result['result']}")
    
    # Print the source documents that the LLM used to create the answer
    print("\n\n📚 Source Documents Used:")
    print("--------------------------------------------------")
    for i, doc in enumerate(result['source_documents']):
        print(f"--- Source {i+1} ---")
        print(f"{doc.page_content}\n")

# --- Let's ask some questions! ---

# Test Case 1: A direct question about a specific person in the text.
ask_question("What did the president say about Vladimir Putin?", qa_chain)

# Test Case 2: A question that requires finding a number.
ask_question("How much oil was released by the USA?", qa_chain)

# Test Case 3: A question that requires combining information.
ask_question("How much oil was released by all countries combined?", qa_chain)

# Test Case 4: A question whose answer is NOT in the text.
# This tests the model's ability to say "I don't know" or rely only on the context.
ask_question("What is the capital of Ukraine?", qa_chain)


🤔 Query: What did the president say about Vladimir Putin?
--------------------------------------------------

✅ Answer:
 The president stated that Vladimir Putin had premeditatedly and unprovokedly attacked Ukraine, rejecting diplomatic efforts and believing the West and NATO wouldn't respond or that he could divide them internally. The president emphasized that Putin miscalculated, as the West and NATO were ready and united in their response.


📚 Source Documents Used:
--------------------------------------------------
--- Source 1 ---
Six days ago, Russia's Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated.

--- Source 2 ---
Putin's latest attack on Ukraine was premeditated and unprovoked. He rejected repeated efforts at diplomacy. He thought the West and NATO wouldn't respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we

--- Source 3 ---
Toni