# Lab 2.2 Alternative – RAG with Watsonx and Elasticsearch Python SDK (Google Colab Version)

![watsonx](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)

---

## 🚀 Run this notebook in Google Colab

**Prerequisites:**
- IBM Cloud API Key ([Create one here](https://cloud.ibm.com/iam/apikeys))
- Watsonx Project ID ([Find it in your watsonx.ai project](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/wml-plans.html?context=wx&audience=wdp))
- **Elasticsearch Cloud Endpoint** (see setup instructions below)

⚠️ **Important:** This notebook requires an **Elasticsearch Cloud** instance. You can:
- Use [Elastic Cloud](https://cloud.elastic.co/) (14-day free trial available)
- Use [IBM Cloud Databases for Elasticsearch](https://cloud.ibm.com/catalog/services/databases-for-elasticsearch)

This notebook demonstrates **Retrieval Augmented Generation (RAG)** using:
- **Watsonx.ai** for LLM inference
- **Elasticsearch Python SDK** directly (no LangChain vector-store wrapper)
- **SentenceTransformers** for embeddings (loaded through LangChain's `SentenceTransformerEmbeddings`)

---

## Step 1: Install Dependencies

In [None]:
# Install all required dependencies
!pip install -qU "langchain==0.0.340"
!pip install -qU elasticsearch
!pip install -qU sentence-transformers
!pip install -qU pandas
!pip install -qU rouge_score
!pip install -qU nltk
!pip install -qU wget
!pip install -qU evaluate
!pip install -qU "pydantic==1.10.0"
!pip install -qU "ibm-watsonx-ai>=1.0.312"

print("✅ All dependencies installed successfully!")

## Step 2: Configure Watsonx Credentials

In [None]:
import os
import getpass
import pandas as pd

# Prompt for Watsonx credentials
watsonx_api_key = getpass.getpass("Enter IBM Cloud API Key: ")
project_id = getpass.getpass("Enter Watsonx Project ID: ")

credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "apikey": watsonx_api_key
}

os.environ["WATSONX_APIKEY"] = watsonx_api_key
os.environ["PROJECT_ID"] = project_id

print("✅ Watsonx credentials configured!")

## Step 3: Configure Elasticsearch Credentials

**For Google Colab with Elasticsearch Cloud:**
- Elasticsearch Host (e.g., `my-cluster.es.us-east-1.aws.found.io`)
- Port (usually `9243` for Elastic Cloud)
- Username (usually `elastic`)
- Password

In [None]:
eshost = input("Enter Elasticsearch hostname (e.g., xxx.es.cloud.es.io): ")
esport = input("Enter Elasticsearch port (usually 9243): ")
esuser = input("Enter Elasticsearch username (usually 'elastic'): ")
espassword = getpass.getpass("Enter Elasticsearch password: ")

print("\n✅ Elasticsearch credentials configured!")
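
Endpoints are often pasted as full URLs (with an `https://` prefix or a trailing slash), while the connection code expects a bare hostname and a port. A minimal sketch of a cleanup helper follows. `normalize_endpoint` and its `default_port` argument are hypothetical names introduced here for illustration; they are not part of the Elasticsearch client.

```python
from urllib.parse import urlsplit


def normalize_endpoint(raw_host, raw_port="", default_port=9243):
    """Return (hostname, port) from loosely formatted user input.

    Hypothetical helper for illustration; 9243 is only the default this
    notebook suggests for Elastic Cloud.
    """
    candidate = raw_host.strip()
    # urlsplit only fills in the host when the input contains "//"
    if "//" not in candidate:
        candidate = "//" + candidate
    parts = urlsplit(candidate)
    if not parts.hostname:
        raise ValueError(f"Could not find a hostname in {raw_host!r}")
    # An explicitly typed port wins over one embedded in the URL
    if raw_port.strip():
        port = int(raw_port)
    else:
        port = parts.port or default_port
    return parts.hostname, port


print(normalize_endpoint("https://my-cluster.es.us-east-1.aws.found.io:9243/"))
```

The returned pair could be assigned back to `eshost` and `esport` before the later steps run.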

## Step 4: Get SSL Fingerprint

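⚠️ **Note:** Retrieving the certificate fingerprint may not work in Colab. If it fails, the connection in Step 9 falls back to standard certificate verification, which is usually sufficient for Elastic Cloud because its endpoints typically present publicly trusted certificates.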

In [None]:
# Try to get SSL fingerprint (may not work in Colab)
try:
    es_ssl_fingerprint_raw = !openssl s_client -connect $eshost:$esport -showcerts </dev/null 2>/dev/null | openssl x509 -fingerprint -sha256 -noout -in /dev/stdin
    # A successful call prints a single line such as "sha256 Fingerprint=AB:CD:..."
    if es_ssl_fingerprint_raw and "Fingerprint=" in es_ssl_fingerprint_raw[0]:
        es_ssl_fingerprint = es_ssl_fingerprint_raw[0].split("=", 1)[1].strip()
        print(f"✅ SSL Fingerprint: {es_ssl_fingerprint}")
    else:
        es_ssl_fingerprint = None
        print("⚠️  Could not retrieve SSL fingerprint. Will attempt connection without it.")
except Exception:
    es_ssl_fingerprint = None
    print("⚠️  Could not retrieve SSL fingerprint. Will attempt connection without it.")
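
If the `openssl` pipeline is unavailable, the same kind of SHA-256 fingerprint can be computed with Python's standard library. The sketch below is an alternative under stated assumptions, not this notebook's method: `der_to_fingerprint` and `fetch_fingerprint` are hypothetical helper names, and `fetch_fingerprint` needs network access to the cluster, so only the offline formatting step is exercised here.

```python
import hashlib
import ssl


def der_to_fingerprint(der_bytes):
    """Format the SHA-256 digest of a DER certificate as colon-separated hex."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fetch_fingerprint(host, port):
    """Download the server's leaf certificate and fingerprint it.

    The certificate is fetched without validation, so the result is only as
    trustworthy as the network path used to obtain it.
    """
    pem = ssl.get_server_certificate((host, int(port)))
    return der_to_fingerprint(ssl.PEM_cert_to_DER_cert(pem))


# Offline demonstration on placeholder bytes (not a real certificate)
print(der_to_fingerprint(b"not-a-real-certificate"))
```

When both approaches are available, it is worth comparing `fetch_fingerprint(eshost, esport)` with the `openssl` output before relying on either value.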

## Step 5: Download Test Data

In [None]:
import wget

questions_test_filename = 'questions_test.csv'
questions_train_filename = 'questions_train.csv'
questions_test_url = 'https://raw.github.com/IBM/watson-machine-learning-samples/master/cloud/data/RAG/questions_test.csv'
questions_train_url = 'https://raw.github.com/IBM/watson-machine-learning-samples/master/cloud/data/RAG/questions_train.csv'

if not os.path.isfile(questions_test_filename): 
    wget.download(questions_test_url, out=questions_test_filename)

if not os.path.isfile(questions_train_filename): 
    wget.download(questions_train_url, out=questions_train_filename)

test_data = pd.read_csv(questions_test_filename)
train_data = pd.read_csv(questions_train_filename)

print("\n✅ Test data downloaded!")
train_data.head()

## Step 6: Download Knowledge Base Documents

In [None]:
knowledge_base_dir = "./knowledge_base"
os.makedirs(knowledge_base_dir, exist_ok=True)

documents_filename = 'knowledge_base/psgs.tsv'
documents_url = 'https://raw.github.com/IBM/watson-machine-learning-samples/master/cloud/data/RAG/psgs.tsv'

if not os.path.isfile(documents_filename): 
    wget.download(documents_url, out=documents_filename)

documents = pd.read_csv(f"{knowledge_base_dir}/psgs.tsv", sep='\t', header=0)
documents['indextext'] = documents['title'].astype(str) + "\n" + documents['text']
documents = documents[:1000]

print(f"\n✅ Loaded {len(documents)} documents for knowledge base")

## Step 7: Create Embedding Function

Using SentenceTransformers `all-MiniLM-L6-v2` model.

In [None]:
from langchain.embeddings import SentenceTransformerEmbeddings

emb_func = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# Get embedding dimensions
dims = emb_func.client.get_sentence_embedding_dimension()

print(f"✅ Embedding function created! Dimensions: {dims}")

## Step 8: Initialize Watsonx Model

In [None]:
from ibm_watsonx_ai.foundation_models.utils.enums import ModelTypes
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai.foundation_models.utils.enums import DecodingMethods
from ibm_watsonx_ai.foundation_models import Model

model_id = ModelTypes.FLAN_UL2

parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 50
}

model = Model(
    model_id=model_id,
    params=parameters,
    credentials=credentials,
    project_id=project_id
)

print("✅ Watsonx model initialized!")

## Step 9: Connect to Elasticsearch

In [None]:
from elasticsearch import Elasticsearch

# Pin the certificate by fingerprint when one was retrieved; otherwise rely on
# standard CA verification, which works for publicly trusted certificates
if es_ssl_fingerprint:
    elastic_client = Elasticsearch(
        [f"https://{eshost}:{esport}"],
        basic_auth=(esuser, espassword),  # credentials are passed here, not in the URL
        request_timeout=None,
        ssl_assert_fingerprint=es_ssl_fingerprint
    )
else:
    # No fingerprint available: verify the server certificate against trusted CAs
    elastic_client = Elasticsearch(
        [f"https://{eshost}:{esport}"],
        basic_auth=(esuser, espassword),
        verify_certs=True,
        request_timeout=None
    )

# Test connection
if elastic_client.ping():
    print("✅ Successfully connected to Elasticsearch!")
else:
    print("❌ Failed to connect to Elasticsearch. Please check your credentials.")

## Step 10: Create Elasticsearch Index

In [None]:
index_name = "elastic_knn_index_colab"

mapping = {
    "properties": {
        "text": {
            "type": "text"
        },
        "embedding": {
            "type": "dense_vector",
            "dims": dims,
            "index": True,
            "similarity": "l2_norm"
        }
    }
}

# Delete index if exists
if elastic_client.indices.exists(index=index_name):
    elastic_client.indices.delete(index=index_name)
    print(f"Deleted existing index: {index_name}")

# Create new index
elastic_client.indices.create(index=index_name, mappings=mapping)

print(f"✅ Created Elasticsearch index: {index_name}")

## Step 11: Index Documents

⚠️ **This may take several minutes** to embed and index 1000 documents.

In [None]:
from elasticsearch.helpers import bulk

texts = documents.indextext.tolist()

print("Embedding documents... This may take a few minutes.")
embedded_docs = emb_func.embed_documents(texts)

print("Indexing into Elasticsearch...")
document_list = []
batch_size = 500

for i, (text, vector) in enumerate(zip(texts, embedded_docs)):
    document = {"_id": i, "_index": index_name, "embedding": vector, "text": text}
    document_list.append(document)
    
    if i % batch_size == batch_size - 1:
        success, failed = bulk(elastic_client, document_list)
        print(f"  Indexed {i+1} documents...")
        document_list = []

# Index remaining documents
if document_list:
    success, failed = bulk(elastic_client, document_list)

elastic_client.indices.refresh(index=index_name)

print(f"\n✅ Indexed {len(texts)} documents into Elasticsearch!")
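
The loop above flushes `document_list` every `batch_size` documents and once more for the remainder. The same batching pattern is sketched below without Elasticsearch so it can run anywhere. `batched_actions` is a hypothetical helper name, and the index name and vectors are placeholders.

```python
def batched_actions(texts, vectors, index_name, batch_size=500):
    """Yield lists of bulk actions, each holding at most batch_size items."""
    batch = []
    for i, (text, vector) in enumerate(zip(texts, vectors)):
        batch.append({"_id": i, "_index": index_name, "embedding": vector, "text": text})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # trailing partial batch
        yield batch


# Toy data: five documents with two-dimensional placeholder vectors
toy_texts = [f"doc {n}" for n in range(5)]
toy_vectors = [[float(n), 0.0] for n in range(5)]
for actions in batched_actions(toy_texts, toy_vectors, "demo_index", batch_size=2):
    print(len(actions), [a["_id"] for a in actions])
```

Each yielded list could be passed to `bulk(elastic_client, actions)` as in the cell above. Note that `bulk` also accepts a `chunk_size` argument and splits large inputs on its own, so manual batching mainly serves progress reporting.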

## Step 12: Test Questions

In [None]:
questions_and_answers = [
    ('names of founding fathers of the united states?', "Thomas Jefferson::James Madison::John Jay::George Washington::John Adams::Benjamin Franklin::Alexander Hamilton"),
    ('who played in the super bowl in 2013?', 'Baltimore Ravens::San Francisco 49ers'),
    ('when did bucharest become the capital of romania?', '1862')
]

## Step 13: Run Semantic Search Queries

In [None]:
relevant_contexts = []

for question_text, _ in questions_and_answers:
    embedded_question = emb_func.embed_query(question_text)
    
    relevant_chunks = elastic_client.search(
        index=index_name,
        knn={
            "field": "embedding",
            "query_vector": embedded_question,
            "k": 4,
            "num_candidates": 50,
        },
        _source=["text"],
        size=5
    )
    
    relevant_contexts.append(relevant_chunks)

print("✅ Retrieved relevant contexts for all questions!")
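
Conceptually, the `knn` clause asks for the `k` stored vectors closest to the query vector. Elasticsearch answers approximately, considering `num_candidates` candidates per shard. A brute-force version over toy vectors shows the exact computation that is being approximated. `exact_knn` is a hypothetical name and the vectors are made up.

```python
import math


def exact_knn(query, vectors, k):
    """Return indices of the k vectors with the smallest L2 distance to query."""
    distances = [(math.dist(query, vec), idx) for idx, vec in enumerate(vectors)]
    distances.sort()
    return [idx for _, idx in distances[:k]]


toy_vectors = [[1.0, 1.0], [0.0, 0.5], [5.0, 5.0], [0.2, 0.1]]
print(exact_knn([0.0, 0.0], toy_vectors, k=2))
```

Raising `num_candidates` generally moves the approximate result closer to this exact ranking at the cost of slower queries.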

## Step 14: View Retrieved Contexts for First Question

In [None]:
relevant_context = relevant_contexts[0]
hits = relevant_context['hits']['hits']

print(f"Question: {questions_and_answers[0][0]}\n")
for hit in hits:
    print("=" * 80)
    print(f"Paragraph index: {hit['_id']}")
    print(f"Paragraph: {hit['_source']['text'][:300]}...")
    print(f"Score (higher is closer): {hit['_score']}")
    print()
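
With `l2_norm` similarity, Elasticsearch documents the hit `_score` as `1 / (1 + d^2)`, where `d` is the L2 distance between the query and the stored vector, so a higher score means a closer passage. A small sketch of the conversion in both directions follows; the function names are hypothetical.

```python
import math


def l2_score(query, vector):
    """Score for l2_norm similarity: 1 / (1 + squared L2 distance)."""
    return 1.0 / (1.0 + math.dist(query, vector) ** 2)


def score_to_distance(score):
    """Invert the formula to recover the L2 distance from a score in (0, 1]."""
    return math.sqrt(1.0 / score - 1.0)


s = l2_score([0.0, 0.0], [3.0, 4.0])  # the distance between these points is 5
print(s, score_to_distance(s))
```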

## Step 15: Create Prompts and Generate Answers

In [None]:
def make_prompt(context, question_text):
    return (
        f"Please answer the following.\n"
        f"{context}:\n\n"
        f"{question_text}"
    )

prompt_texts = []

for relevant_context, (question_text, _) in zip(relevant_contexts, questions_and_answers):
    hits = [hit for hit in relevant_context["hits"]["hits"]]
    context = "\n\n\n".join([rel_ctx["_source"]['text'] for rel_ctx in hits])
    prompt_text = make_prompt(context, question_text)
    prompt_texts.append(prompt_text)

print(f"✅ Created {len(prompt_texts)} prompts")
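
Joining every retrieved passage can produce a long prompt. One possible guard, sketched under the assumption that a simple character budget is acceptable (a token-based limit would be more precise), keeps whole passages in rank order until the budget is used up. `build_context` and `max_chars` are hypothetical names; the hit dictionaries mimic the `hits.hits` shape used above.

```python
def build_context(hits, max_chars=2000, separator="\n\n\n"):
    """Join passage texts in rank order without exceeding max_chars."""
    selected = []
    used = 0
    for hit in hits:
        text = hit["_source"]["text"]
        extra = len(text) + (len(separator) if selected else 0)
        if used + extra > max_chars:
            break
        selected.append(text)
        used += extra
    return separator.join(selected)


# Placeholder hits standing in for an Elasticsearch response
fake_hits = [
    {"_source": {"text": "A" * 30}},
    {"_source": {"text": "B" * 30}},
    {"_source": {"text": "C" * 30}},
]
context = build_context(fake_hits, max_chars=70)
print(len(context))
```

The result can be passed to `make_prompt` in place of the plain join.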

## Step 16: Generate Answers with Watsonx

In [None]:
results = []

for prompt_text in prompt_texts:
    results.append(model.generate_text(prompt_text))

print("✅ Generated all answers!")

## Step 17: Display Results

In [None]:
for idx, result in enumerate(results):
    print("=" * 80)
    print(f"Question: {questions_and_answers[idx][0]}")
    print(f"Answer: {result}")
    print(f"Expected: {questions_and_answers[idx][1]}")
    print()

## Step 18: Calculate ROUGE Metrics

In [None]:
from evaluate import load

rouge = load('rouge')
scores = rouge.compute(
    predictions=results,
    references=[answer for _, answer in questions_and_answers]
)

print("\n" + "=" * 80)
print("ROUGE Scores:")
print("=" * 80)
for metric, score in scores.items():
    print(f"{metric}: {score:.4f}")
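
ROUGE-L scores a prediction by the longest common subsequence (LCS) of tokens it shares with the reference. The sketch below is a simplified re-implementation for intuition only, not the `evaluate` library's code: it lowercases, splits on non-alphanumeric characters, and skips stemming, so its numbers may differ from the reported `rougeL`.

```python
import re


def tokenize(text):
    """Lowercase and keep runs of letters and digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """F-measure built from LCS-based precision and recall."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred or not ref:
        return 0.0
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(pred), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


print(rouge_l_f1("Baltimore Ravens", "Baltimore Ravens::San Francisco 49ers"))
```

Because each reference joins all acceptable answers with `::`, a correct but partial answer loses recall, which is worth keeping in mind when reading the scores.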

## Step 19: Try Your Own Question!

In [None]:
your_question = input("Enter your question: ")

# Embed and search
embedded_question = emb_func.embed_query(your_question)

search_results = elastic_client.search(
    index=index_name,
    knn={
        "field": "embedding",
        "query_vector": embedded_question,
        "k": 4,
        "num_candidates": 50,
    },
    _source=["text"],
    size=5
)

# Create context from search results
hits = search_results['hits']['hits']
context = "\n\n\n".join([hit["_source"]['text'] for hit in hits])

# Generate answer
prompt = make_prompt(context, your_question)
answer = model.generate_text(prompt)

print("\n" + "=" * 80)
print(f"Question: {your_question}")
print("=" * 80)
print(f"Answer: {answer}")
print("\nRelevant sources:")
for i, hit in enumerate(hits[:3], 1):
    print(f"\n{i}. {hit['_source']['text'][:300]}...")

---

## Summary

You successfully completed this notebook! You learned how to:

✅ Connect to Elasticsearch Cloud from Google Colab  
✅ Use SentenceTransformers for embeddings  
✅ Create and index vectors using Elasticsearch Python SDK  
✅ Perform k-NN semantic search  
✅ Generate RAG responses with Watsonx  
✅ Evaluate results using ROUGE metrics  

**Next Steps:**
- Try different embedding models (e.g., `all-mpnet-base-v2`)
- Experiment with different similarity metrics (cosine, dot product)
- Index your own documents
- Fine-tune retrieval parameters (k, num_candidates)
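
When comparing similarity metrics, it helps to know that for unit-length vectors cosine, dot product, and L2 distance rank neighbours identically, because squared L2 distance equals `2 - 2 * cosine`. A quick numeric check on made-up vectors follows; the helper names are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


a = normalize([1.0, 2.0, 3.0])
b = normalize([3.0, 1.0, 0.5])
cosine = dot(a, b)  # equals the dot product because both vectors are unit length
squared_l2 = math.dist(a, b) ** 2
print(cosine, squared_l2, 2 - 2 * cosine)
```

Elasticsearch's `dot_product` similarity expects unit-length vectors, so it is worth checking whether your embedding model normalises its output before switching metrics.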

For more information:
- [Watsonx.ai Documentation](https://ibm.github.io/watsonx-ai-python-sdk/samples.html)
- [Elasticsearch Python Client](https://elasticsearch-py.readthedocs.io/)

---

**Copyright © 2023 IBM. This notebook and its source code are released under the terms of the MIT License.**