![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)
# Use AutoAI RAG and Chroma to create a pattern and get information from `ibm-watsonx-ai` SDK documentation

#### Disclaimers

- Use only Projects and Spaces that are available in the watsonx context.


## Notebook content

This notebook contains the steps and code to demonstrate the usage of IBM AutoAI RAG. The AutoAI RAG experiment conducted in this notebook uses data scraped from the `ibm-watsonx-ai` SDK documentation.

Some familiarity with Python is helpful. This notebook uses Python 3.11.


## Learning goal

The learning goals of this notebook are:

- Create an AutoAI RAG job that will find the best RAG pattern based on provided data


## Table of Contents

This notebook contains the following parts:

- [Setup](#setup)
- [Define the RAG Optimizer](#definition)
- [Run the RAG Experiment](#run)
- [Compare and test RAG Patterns](#comparison)
- [Historical runs](#runs)
- [Clean up](#cleanup)
- [Summary and next steps](#summary)

<a id="setup"></a>
## Set up the environment

Before you use the sample code in this notebook, you must perform the following setup task:

-  Contact your Cloud Pak for Data administrator and ask them for your account credentials

### Install and import the required modules and dependencies

In [None]:
!pip install -U 'ibm-watsonx-ai[rag]>=1.2.4' | tail -n 1
!pip install -U "langchain_community>=0.3,<0.4" | tail -n 1

### Connect to WML

Authenticate the Watson Machine Learning service on IBM Cloud Pak for Data. You need to provide the platform `url`, your `username`, and your `api_key`.

In [None]:
username = 'PASTE YOUR USERNAME HERE'
api_key = 'PASTE YOUR API_KEY HERE'
url = 'PASTE THE PLATFORM URL HERE'

In [None]:
from ibm_watsonx_ai import Credentials

credentials = Credentials(
    username=username,
    api_key=api_key,
    url=url,
    instance_id="openshift",
    version="5.1"
)

Alternatively, you can use your `username` and `password` to authenticate WML services.

```python
credentials = Credentials(
    username=***,
    password=***,
    url=***,
    instance_id="openshift",
    version="5.1"
)

```

In [3]:
from ibm_watsonx_ai import APIClient

client = APIClient(credentials)

### Working with projects

First, you need to create a project for your work. If you do not have a project already, create one by following these steps:

- Open IBM Cloud Pak for Data
- From the menu, click **View all projects**
- Create a new project
- Go to the **Manage** tab
- Copy the `project_id`

**Action**: Assign the project ID below

In [None]:
project_id = 'PASTE YOUR PROJECT ID HERE'

To be able to interact with all resources available in Watson Machine Learning, set the project that you are using.

In [5]:
client.set.default_project(project_id)

'SUCCESS'

<a id="definition"></a>

## RAG Optimizer definition

### Define a connection to the training data

Upload the training data to the project as a data asset and then define a connection to the file. This example uses the `ModelInference` description from the [`ibm_watsonx_ai`](https://ibm.github.io/watsonx-ai-python-sdk/fm_model_inference.html) documentation.

In [6]:
from langchain_community.document_loaders import WebBaseLoader

url = "https://ibm.github.io/watsonx-ai-python-sdk/fm_model_inference.html"

docs = WebBaseLoader(url).load()
model_inference_content = docs[0].page_content

USER_AGENT environment variable not set, consider setting it to identify your requests.


Upload the training data to the project as a data asset.

In [7]:
import os

document_filename = "ModelInference.txt"

if not os.path.isfile(document_filename):
    with open(document_filename, "w") as file:
        file.write(model_inference_content)

document_asset_details = client.data_assets.create(name=document_filename, file_path=document_filename)

document_asset_id = client.data_assets.get_id(document_asset_details)
document_asset_id

Creating data asset...
SUCCESS


'7cf9bcac-b7af-49a1-bfba-a413c006777f'

Define a connection to the training data.

In [8]:
from ibm_watsonx_ai.helpers import DataConnection

input_data_references = [DataConnection(data_asset_id=document_asset_id)]

### Define a connection to the test data

Upload a `json` file that you want to use as a benchmark to the project as a data asset and then define a connection to the file. This example uses content from the [`ibm_watsonx_ai`](https://ibm.github.io/watsonx-ai-python-sdk/index.html) SDK documentation.

In [9]:
benchmarking_data_IBM_page_content = [
    {
        "question": "What is path to ModelInference class?",
        "correct_answer": "ibm_watsonx_ai.foundation_models.inference.ModelInference",
        "correct_answer_document_ids": [
            "ModelInference.txt"
        ]
    },
    {
        "question": "What is method for get model inferance details?",
        "correct_answer": "get_details()",
        "correct_answer_document_ids": [
            "ModelInference.txt"
        ]
    }
]

Upload the benchmark testing data to the project as a data asset with `json` extension.

In [10]:
import json

test_filename = "benchmarking_data_ModelInference.json"

if not os.path.isfile(test_filename):
    with open(test_filename, "w") as json_file:
        json.dump(benchmarking_data_IBM_page_content, json_file, indent=4)

test_asset_details = client.data_assets.create(name=test_filename, file_path=test_filename)

test_asset_id = client.data_assets.get_id(test_asset_details)
test_asset_id

Creating data asset...
SUCCESS


'cf8be77e-a4ff-4ca2-a2b6-0e729df93a18'

Define a connection to the benchmark testing data.

In [11]:
test_data_references = [DataConnection(data_asset_id=test_asset_id)]

### Configure the RAG Optimizer

Provide the input information for the AutoAI RAG optimizer:
- `name` - experiment name
- `description` - experiment description
- `max_number_of_rag_patterns` - maximum number of RAG patterns to create
- `optimization_metrics` - target optimization metrics

In [12]:
from ibm_watsonx_ai.experiment import AutoAI

experiment = AutoAI(credentials, project_id=project_id)

rag_optimizer = experiment.rag_optimizer(
    name='AutoAI RAG run - ModelInference documentation',
    description="AutoAI RAG Optimizer on ibm_watsonx_ai ModelInference documentation",
    max_number_of_rag_patterns=4,
    optimization_metrics=[AutoAI.RAGMetrics.ANSWER_CORRECTNESS]
)

To retrieve the configuration parameters, use `get_params()`.

In [13]:
rag_optimizer.get_params()

{'name': 'AutoAI RAG run - ModelInference documentation',
 'description': 'AutoAI RAG Optimizer on ibm_watsonx_ai ModelInference documentation',
 'max_number_of_rag_patterns': 4,
 'optimization_metrics': ['answer_correctness']}

<a id="run"></a>
## Run the RAG Experiment

Call the `run()` method to trigger the AutoAI RAG experiment. Choose one of two modes: 

- To use the **interactive mode** (synchronous job), specify `background_mode=False` 
- To use the **background mode** (asynchronous job), specify `background_mode=True`

In [14]:
run_details = rag_optimizer.run(
    input_data_references=input_data_references,
    test_data_references=test_data_references,
    background_mode=False
)



##############################################

Running '1ee9d070-c83d-4cb5-a435-43d94de1d87f'

##############################################


pending.....
running.....................................
completed
Training of '1ee9d070-c83d-4cb5-a435-43d94de1d87f' finished successfully.


To monitor the AutoAI RAG jobs in background mode, use the `get_run_status()` method.

In [15]:
rag_optimizer.get_run_status()

'completed'

<a id="comparison"></a>
## Compare and test RAG Patterns

You can list the trained patterns and information on evaluation metrics in the form of a Pandas DataFrame by calling the `summary()` method. Use the DataFrame to compare all discovered patterns and select the one you want for further testing.

In [16]:
summary = rag_optimizer.summary()
summary

Unnamed: 0_level_0,mean_answer_correctness,mean_faithfulness,mean_context_correctness,chunking.chunk_size,embeddings.model_id,vector_store.distance_metric,retrieval.method,retrieval.number_of_chunks,generation.model_id
Pattern_Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
Pattern1,1.0,0.6667,1.0,512,ibm/slate-125m-english-rtrvr,cosine,window,5,meta-llama/llama-3-1-8b-instruct
Pattern2,1.0,0.8281,1.0,1024,ibm/slate-125m-english-rtrvr,euclidean,window,5,meta-llama/llama-3-1-8b-instruct
Pattern4,1.0,0.672,1.0,1024,ibm/slate-125m-english-rtrvr,cosine,window,5,meta-llama/llama-3-1-8b-instruct
Pattern3,0.5,0.3576,1.0,1024,ibm/slate-125m-english-rtrvr,euclidean,simple,5,meta-llama/llama-3-1-8b-instruct


Additionally, you can pass the `scoring` parameter to the summary method to filter RAG patterns, starting with the best.

In [None]:
summary = rag_optimizer.summary(scoring="faithfulness")

### Get the selected pattern

Get the RAGPattern object from the RAG Optimizer experiment. By default, the RAGPattern of the best pattern is returned.

In [17]:
best_pattern_name = summary.index.values[0]
print('Best pattern is:', best_pattern_name)

best_pattern = rag_optimizer.get_pattern(pattern_name="Pattern1")
best_pattern

Best pattern is: Pattern1


<ibm_watsonx_ai.foundation_models.extensions.rag.pattern.pattern.RAGPattern at 0x155736c50>

To retrieve the pattern details, use the `get_pattern_details` method.

In [None]:
rag_optimizer.get_pattern_details(pattern_name='Pattern2')

### Create the index/collection

Build a solution using the best pattern with additional document indexing.

To check the `index_name` that you are working on, use the `best_pattern` method. 

In [None]:
best_pattern.vector_store._index_name

In [18]:
urls = [
    "https://ibm.github.io/watsonx-ai-python-sdk/fm_embeddings.html",
    "https://ibm.github.io/watsonx-ai-python-sdk/fm_custom_models.html",
    "https://ibm.github.io/watsonx-ai-python-sdk/fm_text_extraction.html"
]
docs_list = WebBaseLoader(urls).load()
doc_splits = best_pattern.chunker.split_documents(docs_list)

In [19]:
best_pattern.indexing_function(doc_splits)

['6725adcaf281965af27100854c86e7cca86b2e101ea2aaa03bb535bc73122c66',
 '097a23e86e44625d53a3017a98d5a4480437eec7f07f1190e54d0af638653807',
 '5ca05bba9bdb16a696e16b10763a309fc607141063550ae617a9432f4dd8be45',
 '403478a3ca41145145daf0e6d15cb8a482bc79b0486bcb79f7dc5f4d487a5c62',
 '9d3b51d2aeaa0bad48672e42135825b34548742a4d20cfb13617f7004055edb9',
 '24628f30b0c805ed6f3423d7d8cae878df6cda0601785a02df72a7dbe29672da',
 '02cab2320f3ab03bf2f27f94bcd143202e2cc88c0a7275b2b4f1bef0ab645df8',
 'dfb96bc7f12b52572739ee31a86ed93f97bea43b8ec0b09608943345456ca6f1',
 '9452bf02ff2e978bafa4c9021e4e4774bce393103bd0d3e130bc392b79e9c97d',
 '06c81a352d551a38d5d43f71c4346e26aadd461c8f130371edb97bff2fa63fec',
 '234fdc00083820332c723a4f6bd1a9ab5181726df785899729953aeb2b82e1a6',
 '753eed2b1575615322ee5482dd899b33d5d6f9886034954aec05934ca7a8ee3c',
 'c9965f644c828f1cc3de727b5c83c212c921d40b4aaa7037ab4ce6932ad86a69',
 '2dc9caad80d046bdfa883fac89d8d21d656ea101da1a9d5cb107a3ba3caae117',
 '8102aadc0c8647633aaca64d3bec56c3

Query the RAGPattern locally to test it.

In [20]:
questions = ["How to add Task Credentials?"]

payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [
        {
            "values": questions,
            "access_token": client.token
        }
    ]
}

best_pattern.inference_function()(payload)

{'predictions': [{'fields': ['answer', 'reference_documents'],
   'values': [['\n\nAccording to the document, to add Task Credentials, you can use the following methods:\n\n1. List available task credentials: `client.task_credentials.list()`\n2. Create new task credentials: `client.task_credentials.store()`\n3. Get the status of available task credentials: `client.task_credentials.get_details()`\n\nIf the list is empty, you can create new task credentials using the `store` method. Task Credentials are required on IBM watsonx.ai for IBM Cloud to make a deployment, and they can help deploy a custom foundation model and avoid token expiration issues.',
       'metadata': {'sequence_number': [6, 7, 8, 9, 10],
        'document_id': '-2689579869660702020'}},
       'metadata': {'sequence_number': [5, 6, 7, 8, 9],
        'document_id': '-2689579869660702020'}},
      {'page_content': 'the path to a CA_BUNDLE file\nthe path of a directory with certificates of trusted CAs\nTrue - default path

<a id="runs"></a>
## Historical runs

In this section, you will learn how to work with historical RAG Optimizer jobs (runs).

To list historical runs, use the `list()` method and provide the `'rag_optimizer'` filter.

In [21]:
experiment.runs(filter='rag_optimizer').list()

Unnamed: 0,timestamp,run_id,state,auto_pipeline_optimizer name
0,2024-11-27T10:53:06.823Z,1ee9d070-c83d-4cb5-a435-43d94de1d87f,completed,AutoAI RAG run - ModelInference documentation


In [22]:
run_id = run_details['metadata']['id']
run_id

'1ee9d070-c83d-4cb5-a435-43d94de1d87f'

### Get the executed optimizer's configuration parameters

In [23]:
experiment.runs.get_rag_params(run_id=run_id)

{'name': 'AutoAI RAG run - ModelInference documentation',
 'description': 'AutoAI RAG Optimizer on ibm_watsonx_ai ModelInference documentation',
 'max_number_of_rag_patterns': 4,
 'optimization_metrics': ['answer_correctness']}

### Get the historical rag_optimizer instance and training details

In [24]:
historical_opt = experiment.runs.get_rag_optimizer(run_id)

### List trained patterns for the selected optimizer

In [25]:
historical_opt.summary()

Unnamed: 0_level_0,mean_answer_correctness,mean_faithfulness,mean_context_correctness,chunking.chunk_size,embeddings.model_id,vector_store.distance_metric,retrieval.method,retrieval.number_of_chunks,generation.model_id
Pattern_Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
Pattern1,1.0,0.6667,1.0,512,ibm/slate-125m-english-rtrvr,cosine,window,5,meta-llama/llama-3-1-8b-instruct
Pattern2,1.0,0.8281,1.0,1024,ibm/slate-125m-english-rtrvr,euclidean,window,5,meta-llama/llama-3-1-8b-instruct
Pattern4,1.0,0.672,1.0,1024,ibm/slate-125m-english-rtrvr,cosine,window,5,meta-llama/llama-3-1-8b-instruct
Pattern3,0.5,0.3576,1.0,1024,ibm/slate-125m-english-rtrvr,euclidean,simple,5,meta-llama/llama-3-1-8b-instruct


<a id="cleanup"></a>
## Clean up

To delete the current experiment, use the `cancel_run(hard_delete=True)` method.

**Warning:** Be careful: once you delete an experiment, you will no longer be able to refer to it.

In [26]:
rag_optimizer.cancel_run(hard_delete=True)

'SUCCESS'

To clean up all of the created assets:
- experiments
- trainings
- pipelines
- model definitions
- models
- functions
- deployments

follow the steps in this sample [notebook](https://github.com/IBM/watson-machine-learning-samples/blob/master/cpd5.1/notebooks/python_sdk/instance-management/Machine%20Learning%20artifacts%20management.ipynb).

<a id="summary"></a>
## Summary and next steps

You successfully completed this notebook!

You learned how to use `ibm-watsonx-ai` to run AutoAI RAG experiments. 

 Check out our _<a href="https://ibm.github.io/watsonx-ai-python-sdk/samples.html" target="_blank" rel="noopener no referrer">Online Documentation</a>_ for more samples, tutorials, documentation, how-tos, and blog posts. 

### Authors

**Mateusz Szewczyk**, Software Engineer at Watson Machine Learning

Copyright © 2024-2025 IBM. This notebook and its source code are released under the terms of the MIT License.