![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)
# Use watsonx, LangChain and Milvus to create and deploy RAG function

#### Disclaimers

- Use only Projects and Spaces that are available in the watsonx context.

## Notebook content

This notebook contains the steps and code to demonstrate how to create and deploy a Retrieval Augmented Generation (RAG) solution in watsonx.ai. It introduces commands for data retrieval, knowledge base building and querying, model testing, and deploying a RAG solution for general use.

Some familiarity with Python is helpful. This notebook uses Python 3.11, matching the `runtime-24.1-py3.11` software specification used for deployment.

#### About Retrieval Augmented Generation
Retrieval Augmented Generation (RAG) is a versatile pattern that can unlock a number of use cases requiring factual recall of information, such as querying a knowledge base in natural language.

In its simplest form, RAG requires 3 steps:

- Index knowledge base passages (once)
- Retrieve relevant passage(s) from knowledge base (for every user query)
- Generate a response by feeding retrieved passage into a large language model (for every user query)
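The three steps above can be sketched with a toy example. This is a minimal illustration only: it assumes a three-passage knowledge base, uses plain word overlap instead of embeddings, and stops at building the prompt rather than calling a model. The names `tokenize`, `retrieve` and `make_prompt` are hypothetical helpers, not part of any library used in this notebook.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# Step 1: index the knowledge base once (here: precompute word sets).
passages = [
    "Bucharest became the capital of Romania in 1862.",
    "Super Bowl XLVII was played by the Baltimore Ravens and the San Francisco 49ers.",
    "Idaho was admitted to the Union in 1890.",
]
index = [(passage, tokenize(passage)) for passage in passages]

def retrieve(query, k=1):
    """Step 2: rank passages by Jaccard word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(
        index,
        key=lambda item: len(q & item[1]) / len(q | item[1]),
        reverse=True,
    )
    return [passage for passage, _ in ranked[:k]]

def make_prompt(query, k=1):
    """Step 3: place the retrieved passages in front of the question."""
    context = "\n".join(retrieve(query, k))
    return f"{context}\nQuestion: {query}\nAnswer:"

prompt = make_prompt("when did bucharest become the capital of romania?")
print(prompt)
```

In the rest of the notebook, word overlap is replaced by dense embeddings stored in Milvus, and the prompt is sent to a foundation model.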

## Contents

This notebook contains the following parts:

- [Setup](#setup)
- [Data (test) loading](#data)
- [Set up connectivity information to Milvus](#milvus_conn)
- [Set up VectorStore with Milvus credentials](#vectorstore)
- [Create and deploy RAG solution](#deploy)
- [Calculate rougeL metric](#evaluate)

<a id="setup"></a>
## Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Contact your Cloud Pak for Data administrator and ask for your account credentials.


### Install and import dependencies

In [1]:
%%capture
!pip install wget | tail -n 1
!pip install rouge-score | tail -n 1
!pip install -U "ibm_watsonx_ai>=1.1.22" | tail -n 1
!pip install -U "langchain>=0.3,<0.4" | tail -n 1
!pip install -U "langchain-milvus>=0.1,<0.2" | tail -n 1

In [1]:
import os, getpass, wget

from IPython.display import display, Markdown
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from rouge_score import rouge_scorer

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models import Embeddings
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore
from ibm_watsonx_ai.foundation_models.extensions.rag.utils import verbose_search
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes, ModelTypes
from ibm_watsonx_ai.foundation_models.prompts import PromptTemplate, PromptTemplateManager
from ibm_watsonx_ai.helpers import DataConnection

### Connection to watsonx.ai Runtime

Authenticate the watsonx.ai Runtime service. You need to provide the platform `url` and your API key (`apikey`); the cell below uses the `us-south` endpoint.

In [3]:
credentials_dict = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "apikey": getpass.getpass("Please enter your api key and hit enter: ")
}

In [5]:
credentials = Credentials.from_dict(credentials_dict)

### Defining the project id
The foundation model requires a project ID that provides the context for the call. We will obtain the ID from the project in which this notebook runs. Otherwise, please provide the project ID.


In [5]:
try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id and hit enter: ")

### Defining the space id
Deployed functions are available in deployment spaces. The RAG solution we create will be a deployed function, so you need to provide a space ID.

In [6]:
space_id = input("Please enter your space_id and hit enter: ")

### Initialize client
Create an instance of `APIClient` and set the default project.

In [6]:
client = APIClient(credentials, project_id=project_id)

### Defining the prompt id

We will use a `PromptTemplate` to create a template for our RAG LLM query. If you don't have a prompt template in your project, this code will create an example one.

In [7]:
prompt_id = input("Please enter your prompt template asset id and hit enter, if not provided, a new one would be created: ") or None

if prompt_id is None:
    PROMPT_INSTRUCTION = \
    """
    Use the following pieces of documents to answer the question
    at the end. If you don't know the answer, just say that you
    don't know, don't try to make up an answer. Use three sentences
    maximum. Keep the answer as concise as possible. do not include
    question in your response.Your answers should not include any
    harmful, unethical, racist, sexist, toxic, dangerous, or illegal
    content. Please ensure that your responses are socially unbiased
    and positive in nature.\nPlease provide a concise professional
    response.
    """
    prompt_mgr = PromptTemplateManager(credentials=credentials, project_id=project_id)
    prompt_template = PromptTemplate(name="RAG_prompt_template",
                                     model_id=ModelTypes.LLAMA_2_13B_CHAT,
                                     input_variables=["question", "reference_documents"],
                                     instruction=PROMPT_INSTRUCTION,
                                     input_text="{reference_documents}\nQuestion:{question}\nAnswer:")
    stored_prompt_template = prompt_mgr.store_prompt(prompt_template=prompt_template)
    prompt_id = stored_prompt_template.prompt_id

### Build up knowledge base

The current state-of-the-art in RAG is to create dense vector representations of the knowledge base in order to calculate the semantic similarity to a given user query.

We can generate dense vector representations using embedding models. In this notebook, we use IBM's <a href="https://www.ibm.com/products/watsonx-ai/foundation-models#Embedding+model+library">IBM_SLATE_30M_ENG</a> model to embed both the knowledge base passages and user queries.

A vector database is optimized for dense vector indexing and retrieval. This notebook uses <a href="https://python.langchain.com/docs/integrations/vectorstores/Milvus#basic-example" target="_blank" rel="noopener no referrer">Milvus</a>, an open-source vector database.

The dataset we are using is already split into self-contained passages that can be ingested by Milvus. 

The size of each passage is limited by the embedding model's context window (which is 512 tokens for `IBM Slate 30M`).
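To make "semantic similarity" between dense vectors concrete, here is a minimal sketch of ranking passages by cosine similarity. The three-dimensional vectors are made up for illustration (real embedding models produce much longer vectors), and cosine similarity is only one common choice; the metric Milvus actually applies depends on how the index is configured.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings of three passages and one query.
passage_vectors = {
    "passage about Idaho": [0.9, 0.1, 0.0],
    "passage about Bucharest": [0.1, 0.8, 0.3],
    "passage about the Super Bowl": [0.0, 0.2, 0.9],
}
query_vector = [0.2, 0.9, 0.2]

# Rank passages from most to least similar to the query.
ranked = sorted(
    passage_vectors,
    key=lambda name: cosine_similarity(query_vector, passage_vectors[name]),
    reverse=True,
)
print(ranked[0])
```

A vector database performs the same kind of ranking, but with index structures that avoid comparing the query against every stored vector.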

<a id="data"></a>
### Load knowledge base documents

Load the set of documents that will be used to build the knowledge base, and store them as a project asset.

In [8]:
filename = 'psgs.tsv'
url = f'https://raw.github.com/IBM/watson-machine-learning-samples/master/cloud/data/RAG/{filename}'
if not os.path.isfile(filename):
    wget.download(url)

asset_details = client.data_assets.create(name=filename, file_path=filename)

Creating data asset...
SUCCESS


### Read and prepare documents
Read documents using `DataConnection` and prepare them for vector database ingestion by combining title and text.

In [9]:
data_connection = DataConnection(data_asset_id=client.data_assets.get_id(asset_details))
data_connection.set_client(client)
documents = data_connection.read(csv_separator='\t')

In [10]:
documents['indextext'] = documents['title'].astype(str) + "\n" + documents['text']
documents = documents[:1000]
documents.head()

| | id | text | title | indextext |
|---|---|---|---|---|
| 0 | 1.0 | History of Idaho - wikipedia History of Idaho ... | History of Idaho | History of Idaho\nHistory of Idaho - wikipedia... |
| 1 | 2.0 | 1957 . Location Cataldo , Idaho Built 1848 Arc... | History of Idaho | History of Idaho\n1957 . Location Cataldo , Id... |
| 2 | 3.0 | of the Columbia was created in June 1816 , and... | History of Idaho | History of Idaho\nof the Columbia was created ... |
| 3 | 4.0 | Canyon , he concluded that water transport was... | History of Idaho | History of Idaho\nCanyon , he concluded that w... |
| 4 | 5.0 | 1842 , Father Pierre - Jean De Smet , with Fr.... | History of Idaho | History of Idaho\n1842 , Father Pierre - Jean ... |
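The title is prepended to each passage, presumably so that every indexed text carries its topic even when the passage body does not mention it. The same preparation can be sketched without pandas, using only the standard library. The two-row `sample_tsv` string below is a hypothetical stand-in for `psgs.tsv`, with the same `id`, `text` and `title` columns.

```python
import csv
import io

# Hypothetical two-row sample with the same columns as psgs.tsv.
sample_tsv = (
    "id\ttext\ttitle\n"
    "1\tHistory of Idaho - wikipedia\tHistory of Idaho\n"
    "2\t1957 . Location Cataldo , Idaho\tHistory of Idaho\n"
)

# Read tab-separated rows into dictionaries keyed by column name.
rows = list(csv.DictReader(io.StringIO(sample_tsv), delimiter="\t"))

# Combine title and text into the string that will be embedded.
for row in rows:
    row["indextext"] = f"{row['title']}\n{row['text']}"

print(rows[0]["indextext"])
```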


### Create an embedding function for VectorStore

Note that you can feed a custom embedding function to be used by Milvus. The performance of Milvus may differ depending on the embedding model used. 

In [11]:
embeddings = Embeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
    credentials=credentials,
    project_id=project_id
)

<a id="milvus_conn"></a>
## Set up connectivity information to Milvus

**This notebook focuses on a self-managed Milvus cluster using <a href="https://cloud.ibm.com/docs/watsonxdata?topic=watsonxdata-adding-milvus-service" target="_blank" rel="noopener no referrer">IBM watsonx.data.</a>**

The following cell retrieves the Milvus username, password, host and port from the environment if available and prompts you otherwise.

Alternatively, you can provide a connection asset ID to read all required connection data from it. Before doing so, make sure that the connection asset has been created in your project.

In [13]:
connection_id = input("Provide connection asset ID in your project. Skip this, if you wish to type credentials by hand and hit enter: ") or None

if connection_id is None:
    try:
        username = os.environ["USERNAME"]
    except KeyError:
        username = input("Please enter your Milvus user name and hit enter: ")
    try:
        password = os.environ["PASSWORD"]
    except KeyError:
        password = getpass.getpass("Please enter your Milvus password and hit enter: ")
    try:
        host = os.environ["HOST"]
    except KeyError:
        host = input("Please enter your Milvus hostname and hit enter: ")
    try:
        port = os.environ["PORT"]
    except KeyError:
        port = input("Please enter your Milvus port number and hit enter: ")
    try:
        ssl = os.environ["SSL"]
    except KeyError:
        ssl = bool(input("Please enter ('y'/anything) if your Milvus instance has SSL enabled. Skip if it is not: "))

    # Create connection
    milvus_data_source_type_id = client.connections.get_datasource_type_uid_by_name(
        "milvus"
    )
    details = client.connections.create(
        {
            client.connections.ConfigurationMetaNames.NAME: "Milvus Connection",
            client.connections.ConfigurationMetaNames.DESCRIPTION: "Connection created by the sample notebook",
            client.connections.ConfigurationMetaNames.DATASOURCE_TYPE: milvus_data_source_type_id,
            client.connections.ConfigurationMetaNames.PROPERTIES: {
                "host": host,
                "port": port,
                "username": username,
                "password": password,
                "ssl": ssl,
            },
        }
    )

    connection_id = client.connections.get_id(details)

Creating connections...
SUCCESS
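The repeated "read from the environment, otherwise prompt" pattern in the cell above can be factored into a small helper. The sketch below is one possible refactoring, not part of the SDK: `env_or_prompt` and `parse_flag` are hypothetical names, the environment is passed in as a dictionary so the example runs without user input, and the host and port values are placeholders. Note that `parse_flag` is stricter than the notebook cell, where any non-empty answer counts as "SSL enabled".

```python
import os

def env_or_prompt(name, prompt_fn, environ=os.environ):
    """Return environ[name] if set, otherwise the value returned by prompt_fn()."""
    try:
        return environ[name]
    except KeyError:
        return prompt_fn()

def parse_flag(value):
    """Interpret common spellings of 'true'; anything else is False."""
    return str(value).strip().lower() in {"y", "yes", "true", "1"}

# A fake environment and lambdas standing in for input() / getpass.getpass().
fake_env = {"HOST": "milvus.example.com", "SSL": "false"}
host = env_or_prompt("HOST", lambda: "typed-host", environ=fake_env)
port = int(env_or_prompt("PORT", lambda: "19530", environ=fake_env))
ssl = parse_flag(env_or_prompt("SSL", lambda: "", environ=fake_env))

print(host, port, ssl)
```

In a real notebook the lambdas would be replaced with calls such as `lambda: input("Please enter your Milvus hostname and hit enter: ")`.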


### Promote assets from project to space

Some of the assets need to be promoted from the project to the space. Since RAGPattern, and eventually our RAG function, will run in the deployment space, these resources must be available there for it to work correctly.

In [14]:
assets_to_promote = [connection_id, prompt_id]
promoted_connection_id, promoted_prompt_id = [
    client.spaces.promote(id, project_id, space_id) for id in assets_to_promote
]

In [15]:
client.set.default_space(space_id)

Unsetting the project_id ...


'SUCCESS'

<a id="vectorstore"></a>
## Set up VectorStore with Milvus credentials 

Create a VectorStore class that automatically detects the database type (in our case it will be Milvus) and allows us to add, search and delete documents.

It works as a wrapper for LangChain VectorStore classes. You can customize the settings as long as they are supported. Consult the LangChain documentation for more information about the <a href="https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.milvus.Milvus.html" target="_blank" rel="noopener no referrer">Milvus</a> connector.

Provide the name of your Milvus index for subsequent operations:

In [16]:
index_name = input("Please enter Milvus index name and hit enter: ")

In [17]:
vector_store = VectorStore(
    client=client,
    embeddings=embeddings,
    connection_id=promoted_connection_id,
    index_name=index_name,
    secure=True
)

<a id="milvus_index"></a>
### Embed and index documents with Milvus

**Note: Could take several minutes if you don't have pre-built indices**

In [18]:
texts = documents.indextext.tolist()
metadatas = [{'title': title, 'id': doc_id} for (title, doc_id) in zip(documents.title, documents.id)]
docs_to_add = [Document(page_content=text, metadata=metadata) for text, metadata in zip(texts, metadatas)]

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=10)
docs_to_add_split = text_splitter.split_documents(docs_to_add)

ids = vector_store.add_documents(docs_to_add_split, batch_size=200)

Verify the number of documents loaded into Milvus.

In [19]:
doc_count = vector_store.count()
doc_count

4051
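The count is higher than the 1000 passages that were loaded, presumably because the text splitter cuts each passage into chunks of at most 500 characters that overlap by up to 10 characters. The effect of `chunk_size` and `chunk_overlap` can be illustrated with a simplified fixed-window splitter. This sketch is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to break on separators such as newlines and spaces; `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of at most chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so neighbouring windows share chunk_overlap characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

text = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(text, chunk_size=10, chunk_overlap=2)
print(chunks)
```

The overlap keeps a little shared context at chunk boundaries, so a sentence cut in two is still partly present in both chunks.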

Let's run a sample search. The query is embedded with the same embedding model that was passed to the `VectorStore`, and the five most similar chunks are returned together with their metadata.

In [20]:
vector_store.search("United States of America", k=5, verbose=True)

**Question:** United States of America

| | page_content | title | id | pk |
|---|---|---|---|---|
| 0 | , D.C. States / Territories Alabama Alaska Ame... | United States Senate elections, 2018 | 639.0 | 9a4eb29daf1b4658f073e569f69ae15fc2959b3e4d2469... |
| 1 | United States , 1797 -- 1801 1st Vice Presiden... | Founding Fathers of the United States | 918.0 | 7d3d14905ddb5242dfeb01b9bfa682e9b3f3928636b7ec... |
| 2 | who led the American Revolution against the au... | Founding Fathers of the United States | 878.0 | 5ae53eb7ed9eaba633ceb59196e8383eddede3d017d986... |
| 3 | -- 1793 ) U.S. Minister to France ( 1785 -- 17... | Founding Fathers of the United States | 927.0 | a21f18163b87ed13fa40094c179fa183f2183f0ad77fab... |
| 4 | the United States of America , which was recog... | British colonization of the Americas | 521.0 | 380b686d2cb5b91bac2768e97db84ee75a9b678d720a4b... |


[Document(metadata={'title': 'United States Senate elections, 2018', 'id': 639.0, 'pk': '9a4eb29daf1b4658f073e569f69ae15fc2959b3e4d24694e292af9c9be742b48'}, page_content=', D.C. States / Territories Alabama Alaska American Samoa Arizona Arkansas California Colorado Connecticut Delaware Florida Georgia Guam Hawaii Idaho Illinois Indiana Iowa Kansas Kentucky Louisiana Maine Maryland Massachusetts Michigan Minnesota Mississippi Missouri Montana Nebraska Nevada New Hampshire New Jersey New Mexico New York North Carolina North Dakota Ohio Oklahoma Oregon Pennsylvania Puerto Rico'),
 Document(metadata={'title': 'Founding Fathers of the United States', 'id': 918.0, 'pk': '7d3d14905ddb5242dfeb01b9bfa682e9b3f3928636b7eca129ef02910309c06c'}, page_content='United States , 1797 -- 1801 1st Vice President of the United States , 1789 -- 1797 U.S. Ambassador to the United Kingdom , 1785 -- 1788 U.S. Ambassador to the Netherlands , 1782 -- 1788 Delegate , Second Continental Congress , 1775 -- 1778 Del

<a id="deploy"></a>
## Create and deploy RAG solution

### Define function code

The deployed function for RAG should implement retrieval and augmentation of the prompt for the LLM.
The function defined in the next cell can be used as an example. To modify the behaviour of the deployed function, change the values in `params`.
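The function in the next cell passes a budget (`model_max_input_tokens`, the model's maximum sequence length minus `max_new_tokens`) to `build_prompt`. The internals of `build_prompt` are not shown in this notebook, so the following is only a sketch of the general idea of "augmenting" a prompt under a size limit. It assumes whitespace-separated words as a rough stand-in for tokens, and `fill_prompt` is a hypothetical helper.

```python
def fill_prompt(template, question, reference_documents, max_words):
    """Fill the template, adding documents in rank order while the
    whole prompt stays within max_words whitespace-separated words."""
    kept = []
    for doc in reference_documents:
        candidate = template.format(
            reference_documents="\n".join(kept + [doc]), question=question
        )
        if len(candidate.split()) > max_words:
            break  # adding this document would exceed the budget
        kept.append(doc)
    return template.format(reference_documents="\n".join(kept), question=question)

template = "{reference_documents}\nQuestion:{question}\nAnswer:"
docs = [
    "Bucharest became the capital in 1862 .",
    "Bucharest is a city in Romania .",
]
prompt = fill_prompt(
    template, "when did bucharest become the capital?", docs, max_words=16
)
print(prompt)
```

With a budget of 16 words only the first, highest-ranked document fits, which is why retrieval order matters when the context window is small.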

In [21]:
params={
    'space_id': space_id, 
    'retriever': {'method': 'simple', 'number_of_chunks': 5},
    'credentials': credentials_dict,
    'vector_store': {
        'connection_id': promoted_connection_id, 
        'embeddings': {
            '__class__': 'Embeddings', 
            '__module__': 'ibm_watsonx_ai.foundation_models.embeddings.embeddings', 
            'model_id': 'ibm/slate-30m-english-rtrvr',
            'credentials': credentials_dict, 
            'params': None, 
            'project_id': project_id, 
            'space_id': None, 
            'verify': None
        }, 
        'index_name': index_name, 
        'datasource_type': 'milvus', 
        'distance_metric': None, 
        'secure': True
    }, 
    'prompt_template_text': "\n    Use the following pieces of documents to answer the question\n    at the end. If you don't know the answer, just say that you\n    don't know, don't try to make up an answer. Use three sentences\n    maximum. Keep the answer as concise as possible. do not include\n    question in your response.Your answers should not include any\n    harmful, unethical, racist, sexist, toxic, dangerous, or illegal\n    content. Please ensure that your responses are socially unbiased\n    and positive in nature.\nPlease provide a concise professional\n    response.\n    \n\n{reference_documents}\nQuestion:{question}\nAnswer:", 
    'context_template_text': None, 
    'model': {
        'model_id': 'meta-llama/llama-2-13b-chat', 
        'params': 
        {'decoding_method': 'greedy', 'min_new_tokens': 1, 'max_new_tokens': 200}, 
        'project_id': None, 
        'space_id': space_id
    }, 
    'inference_function_params': {}
}

def default_inference_function(params=params):
    """
    Deployed function.

    Input schema:
    payload = {
        client.deployments.ScoringMetaNames.INPUT_DATA: [
            {'values': ['question 1', 'question 2']}
        ]
    }

    Output schema:
    result = {
        'predictions': [
            {
                'fields': ['answer', 'reference_documents'],
                'values': [
                    ['answer 1', [ {'page_content': 'page content 1',
                                    'metadata':     'metadata 1'} ]],
                    ['answer 2', [ {'page_content': 'page content 2',
                                    'metadata':     'metadata 2'} ]]
                ]
            }
        ]
    }
    """
    from ibm_watsonx_ai import APIClient
    from ibm_watsonx_ai.metanames import GenTextParamsMetaNames
    from ibm_watsonx_ai.foundation_models import ModelInference
    from ibm_watsonx_ai.foundation_models.extensions.rag import Retriever, VectorStore
    from ibm_watsonx_ai.foundation_models.extensions.rag.pattern.prompt_builder import (
        build_prompt,
    )
    client = APIClient(
        credentials=params["credentials"], space_id=params["space_id"]
    )
    vector_store = VectorStore.from_dict(client=client, data=params["vector_store"])
    retriever = Retriever.from_vector_store(
        vector_store=vector_store, init_parameters=params["retriever"]
    )
    prompt_template_text = params["prompt_template_text"]
    context_template_text = params["context_template_text"]
    model = ModelInference(api_client=client, **params["model"])
    model_specs = client.foundation_models.get_model_specs(model_id=model.model_id)
    model_max_new_tokens = (model.params or {}).get(
        GenTextParamsMetaNames.MAX_NEW_TOKENS, 20
    )
    model_max_input_tokens = (
        model_specs["model_limits"]["max_sequence_length"] - model_max_new_tokens
    )

    def score(payload):
        result = {"predictions": [{"fields": ["answer", "reference_documents"]}]}

        all_prompts = []
        all_retrieved_docs = []

        for question in payload[client.deployments.ScoringMetaNames.INPUT_DATA][0][
            "values"
        ]:
            retrieved_docs = retriever.retrieve(query=question)
            all_retrieved_docs.append(retrieved_docs)
            reference_documents = [doc.page_content for doc in retrieved_docs]

            prompt_input_text = build_prompt(
                prompt_template_text=prompt_template_text,
                context_template_text=context_template_text,
                question=question,
                reference_documents=reference_documents,
                model_max_input_tokens=model_max_input_tokens,
            )
            all_prompts.append(prompt_input_text)

        answers = model.generate_text(prompt=all_prompts)

        predictions = [
            [
                answer,
                [
                    {"page_content": doc.page_content, "metadata": doc.metadata}
                    for doc in retrieved_docs
                ],
            ]
            for answer, retrieved_docs in zip(answers, all_retrieved_docs)
        ]

        result["predictions"][0]["values"] = predictions

        return result

    return score
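The function above follows a closure pattern: the outer function runs once and prepares shared objects (client, vector store, model), and the inner `score` function handles each request. The sketch below reproduces that shape and the input and output schemas with stubs, so it runs without any service. It assumes the payload key is the literal string `"input_data"`, and replaces retrieval and generation with trivial stand-ins; `make_inference_function` and the `knowledge` parameter are hypothetical.

```python
def make_inference_function(params):
    """Outer function: runs once and prepares state shared by all requests."""
    knowledge = params["knowledge"]  # stand-in for a vector store connection

    def score(payload):
        """Inner function: called for every scoring request."""
        questions = payload["input_data"][0]["values"]
        values = []
        for question in questions:
            # Stub retrieval: keep passages sharing a word with the question.
            words = set(question.lower().split())
            docs = [p for p in knowledge if words & set(p.lower().split())]
            # Stub generation: echo the first passage instead of calling an LLM.
            answer = docs[0] if docs else "I don't know."
            values.append(
                [answer, [{"page_content": d, "metadata": {}} for d in docs]]
            )
        return {
            "predictions": [
                {"fields": ["answer", "reference_documents"], "values": values}
            ]
        }

    return score

score = make_inference_function({"knowledge": ["Bucharest is the capital of Romania"]})
result = score(
    {"input_data": [{"values": ["what is the capital of romania", "who won"]}]}
)
print(result["predictions"][0]["values"][0][0])
```

Keeping the expensive setup in the outer function means it is not repeated for every scoring call.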


### Test the function locally

To test our solution we can query the function locally without deploying.

In [22]:
questions_and_answers = {
    'what are the names of founding fathers of the united states?': "Thomas Jefferson::James Madison::John Jay::George Washington::John Adams::Benjamin Franklin::Alexander Hamilton",
    'who played in the super bowl in 2013?': 'Baltimore Ravens::San Francisco 49ers',
    'when did bucharest become the capital of romania?': '1862'
}

Define a helper function for formatting the response:

In [23]:
def print_rag_response(response):
    for question, (answer, reference_docs) in zip(questions_and_answers.keys(), response['predictions'][0]['values']):
        verbose_search(question, [Document(**d) for d in reference_docs])
        display(Markdown(f'**Answer:** {answer}'))

Questions must be provided in a payload with the format shown below.

In [24]:
payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [{
        "values": list(questions_and_answers.keys())
    }]
}

In [25]:
response = default_inference_function()(payload)
print_rag_response(response)

**Question:** what are the names of founding fathers of the united states?

| | page_content | title | id | pk |
|---|---|---|---|---|
| 0 | Founding Fathers of the United States | Founding Fathers of the United States | 878.0 | 98aa4910657083171c7baf0a4f07b36ca1736a8fa509cb... |
| 1 | Founding Fathers of the United States | Founding Fathers of the United States | 879.0 | 42d8e6fe20269c23a85346969087d5dd6ebf0e00314479... |
| 2 | Founding Fathers of the United States | Founding Fathers of the United States | 880.0 | e731e99f25e0338de640caab51397e36112000f5cecae2... |
| 3 | Founding Fathers of the United States | Founding Fathers of the United States | 881.0 | 8c018ebca9d712a9ed9daa914c290de8f93c329831582e... |
| 4 | Founding Fathers of the United States | Founding Fathers of the United States | 882.0 | 5f2ef7fbdd0a03d4c16a685060d0a6e6b91e44bca99670... |


**Answer:** The names of the founding fathers of the united states are george washington, john adams, thomas jefferson, benjamin franklin, james madison, alexander hamilton, john jay, and james monroe.

**Question:** who played in the super bowl in 2013?

| | page_content | title | id | pk |
|---|---|---|---|---|
| 0 | Super Bowl XLVII - wikipedia Super Bowl XLVII ... | Super Bowl XLVII | 818.0 | d6ac7acbbf3cfa1d40295c0f3bf426c5eeaf5a5b94e838... |
| 1 | Opponents Announced '' . NewOrleansSaints.com ... | Super Bowl XLVII | 856.0 | 6a9d55d7d47ec5a64744d8b46f661784185fa25b29c5e0... |
| 2 | responded to the claim on Twitter in jest , tw... | Super Bowl XLVII | 848.0 | 4035f1be2332109f293fd4520860b08339d2374206730d... |
| 3 | : Super Bowl 2012 National Football League sea... | Super Bowl XLVII | 876.0 | 1494ee13a6d4ad66c0691b619bce671b52454b2b363134... |
| 4 | February 4 , 2013 . Jump up ^ `` Lights go out... | Super Bowl XLVII | 866.0 | c54217c2c0f9837459ebc4583f57e538a2c6e41c39b9d9... |


**Answer:**  The Baltimore Ravens played against the San Francisco 49ers in Super Bowl XLVII in 2013.

**Question:** when did bucharest become the capital of romania?

| | page_content | title | id | pk |
|---|---|---|---|---|
| 0 | destroying a third of the city . Ottoman massa... | Bucharest | 948.0 | b28759a8b47ed9226b5e06b553deade254920c17a71193... |
| 1 | to become joyful ) , while an early 19th - cen... | Bucharest | 946.0 | 3d6d9a6d9c9ebde4c9a0a882a373d9682c08079c857a0a... |
| 2 | route to the Eastern Front , Bucharest suffere... | Bucharest | 949.0 | 82dddf5c627be56e49818dcd7a486b26b32e698ecfa588... |
| 3 | Bucharest | Bucharest | 973.0 | 8ce9d6f9b874733c3009d0c786006a876313018aa15021... |
| 4 | Bucharest | Bucharest | 974.0 | 2ec3a2f60be5d22b6451ca880b61b8627b82cb0e42ee6e... |


**Answer:**  Bucharest became the capital of Romania in 1862, after Wallachia and Moldavia were united to form the Principality of Romania.

### Deploy RAGPattern

Deployment is done by storing the defined RAG function and then creating a deployed asset. It can then be accessed as an endpoint that we can score. Before the deployment, a custom software specification needs to be created so that the LangChain packages are available.

In [37]:
sw_spec_name = 'rag_sample_spec_24.1-py3.11'
pkg_ext_name = "rag_sample_pkg"

try:
    sw_specs = client.software_specifications.list(limit=200)
    sw_spec_id = sw_specs.loc[sw_specs.NAME == sw_spec_name, 'ID'].values[0]
    client.software_specifications.delete(sw_spec_id)
    print(f'{sw_spec_id} deleted')
except IndexError:
    pass

try:
    pkg_extns = client.package_extensions.list()
    pkg_extn_id = pkg_extns.loc[pkg_extns.NAME == pkg_ext_name, 'ASSET_ID'].values[0]
    client.package_extensions.delete(pkg_extn_id)
    print(f'{pkg_extn_id} deleted')
except IndexError:
    pass

from ibm_watsonx_ai import __version__

BASE_SW_SPEC_NAME = "runtime-24.1-py3.11"
SW_SPEC_NAME = "rag_24.1-py3.11"
PKG_EXTN_NAME = "rag_pattern-py3.11"
CONFIG_PATH = "config.yaml"
CONFIG_TYPE = "conda_yml"
CONFIG_CONTENT = f"""
        name: python311
        channels:
          - empty
        dependencies:
          - pip:
            - ibm-watsonx-ai[rag]
        prefix: /opt/anaconda3/envs/python311
"""
with open(CONFIG_PATH, 'w', encoding='utf-8') as f:
    f.write(CONFIG_CONTENT)
pkg_extn_meta_props = {
    client.package_extensions.ConfigurationMetaNames.NAME: pkg_ext_name,
    client.package_extensions.ConfigurationMetaNames.TYPE: CONFIG_TYPE
}

pkg_extn_details = client.package_extensions.store(meta_props=pkg_extn_meta_props, file_path=CONFIG_PATH)
pkg_extn_uid = client.package_extensions.get_id(pkg_extn_details)

sw_spec_meta_props = {
    client.software_specifications.ConfigurationMetaNames.NAME: sw_spec_name,
    client.software_specifications.ConfigurationMetaNames.BASE_SOFTWARE_SPECIFICATION: {
        'guid': client.software_specifications.get_id_by_name(BASE_SW_SPEC_NAME)
    }
}

sw_spec_details = client.software_specifications.store(meta_props=sw_spec_meta_props)
sw_spec_id = client.software_specifications.get_id(sw_spec_details)

client.software_specifications.add_package_extension(sw_spec_id, pkg_extn_uid)
import os
os.remove(CONFIG_PATH)

meta_data = {
    client.repository.FunctionMetaNames.NAME: 'rag_function_watsonx_ai',
    client.repository.FunctionMetaNames.DESCRIPTION: 'RAG with Milvus',
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_UID: sw_spec_id
}

watsonx_function_details = client.repository.store_function(meta_props=meta_data, function=default_inference_function)

watsonx_function_uid = client.repository.get_function_id(watsonx_function_details)

meta_props = {
   client.deployments.ConfigurationMetaNames.NAME: "rag_function",
   client.deployments.ConfigurationMetaNames.DESCRIPTION: "RAG function",
   client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: { 'name': 'S'},  
   client.deployments.ConfigurationMetaNames.SERVING_NAME: "rag_function"
}

watsonx_deployment_details = client.deployments.create(watsonx_function_uid, meta_props=meta_props)

watsonx_deployment_id = client.deployments.get_id(watsonx_deployment_details)

efffe223-3753-4dd8-8eb0-c3d0bd799d72 deleted
d49988df-e1af-47a0-b25e-4d739787acc4 deleted
Creating package extensions
SUCCESS
SUCCESS


######################################################################################

Synchronous deployment creation for id: 'd55fd0b0-f160-4925-a78a-e385b2c3aacd' started

######################################################################################


initializing.......
ready


-----------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_id='3e369f71-2204-4ae1-a6cb-8ddfc4b0b902'
-----------------------------------------------------------------------------------------------




### Test the deployed function

The RAG service is now deployed in our space. To test the solution, run the cell below. It reuses the `payload` defined for the local test, so the questions are provided in the same format.

In [38]:
response = client.deployments.score(watsonx_deployment_id, payload)
print_rag_response(response)

**Question:** what are the names of founding fathers of the united states?

| | page_content | title | id | pk |
|---|---|---|---|---|
| 0 | Founding Fathers of the United States | Founding Fathers of the United States | 878.0 | 98aa4910657083171c7baf0a4f07b36ca1736a8fa509cb... |
| 1 | Founding Fathers of the United States | Founding Fathers of the United States | 879.0 | 42d8e6fe20269c23a85346969087d5dd6ebf0e00314479... |
| 2 | Founding Fathers of the United States | Founding Fathers of the United States | 880.0 | e731e99f25e0338de640caab51397e36112000f5cecae2... |
| 3 | Founding Fathers of the United States | Founding Fathers of the United States | 881.0 | 8c018ebca9d712a9ed9daa914c290de8f93c329831582e... |
| 4 | Founding Fathers of the United States | Founding Fathers of the United States | 882.0 | 5f2ef7fbdd0a03d4c16a685060d0a6e6b91e44bca99670... |


**Answer:** The names of the founding fathers of the united states are george washington, john adams, thomas jefferson, benjamin franklin, james madison, alexander hamilton, john jay, and james monroe.

**Question:** who played in the super bowl in 2013?

| | page_content | title | id | pk |
|---|---|---|---|---|
| 0 | Super Bowl XLVII - wikipedia Super Bowl XLVII ... | Super Bowl XLVII | 818.0 | d6ac7acbbf3cfa1d40295c0f3bf426c5eeaf5a5b94e838... |
| 1 | Opponents Announced '' . NewOrleansSaints.com ... | Super Bowl XLVII | 856.0 | 6a9d55d7d47ec5a64744d8b46f661784185fa25b29c5e0... |
| 2 | responded to the claim on Twitter in jest , tw... | Super Bowl XLVII | 848.0 | 4035f1be2332109f293fd4520860b08339d2374206730d... |
| 3 | : Super Bowl 2012 National Football League sea... | Super Bowl XLVII | 876.0 | 1494ee13a6d4ad66c0691b619bce671b52454b2b363134... |
| 4 | February 4 , 2013 . Jump up ^ `` Lights go out... | Super Bowl XLVII | 866.0 | c54217c2c0f9837459ebc4583f57e538a2c6e41c39b9d9... |


**Answer:**  The Baltimore Ravens played against the San Francisco 49ers in Super Bowl XLVII in 2013.

**Question:** when did bucharest become the capital of romania?

| | page_content | title | id | pk |
|---|---|---|---|---|
| 0 | destroying a third of the city . Ottoman massa... | Bucharest | 948.0 | b28759a8b47ed9226b5e06b553deade254920c17a71193... |
| 1 | to become joyful ) , while an early 19th - cen... | Bucharest | 946.0 | 3d6d9a6d9c9ebde4c9a0a882a373d9682c08079c857a0a... |
| 2 | route to the Eastern Front , Bucharest suffere... | Bucharest | 949.0 | 82dddf5c627be56e49818dcd7a486b26b32e698ecfa588... |
| 3 | Bucharest | Bucharest | 973.0 | 8ce9d6f9b874733c3009d0c786006a876313018aa15021... |
| 4 | Bucharest | Bucharest | 974.0 | 2ec3a2f60be5d22b6451ca880b61b8627b82cb0e42ee6e... |


**Answer:**  Bucharest became the capital of Romania in 1862, after Wallachia and Moldavia were united to form the Principality of Romania.

<a id="evaluate"></a>
## Calculate rougeL metric 
Calculate the rougeL recall score to verify that the expected answers are present in the generated responses.

In [39]:
text_responses = [v[0] for v in response['predictions'][0]['values']]
targets = [answer for answer in questions_and_answers.values()]

In [40]:
scorer = rouge_scorer.RougeScorer(['rougeL'], use_stemmer=True)
scores = [scorer.score(target, prediction) for target, prediction in zip(targets, text_responses)]
mean_rougeL = sum([s['rougeL'].recall for s in scores]) / len(questions_and_answers)

print(f"Mean rougeL recall score: {mean_rougeL}")

Mean rougeL recall score: 0.8571428571428571
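ROUGE-L recall is the length of the longest common subsequence (LCS) of target and prediction tokens, divided by the number of target tokens. The following is a simplified re-implementation for illustration only: it assumes lowercase alphanumeric tokenization and no stemming, so it may differ slightly from `rouge_score` on other inputs. `lcs_length` and `rouge_l_recall` are hypothetical helper names.

```python
import re

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            # Extend the diagonal on a match, else carry the best neighbour.
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_recall(target, prediction):
    """Share of target tokens recovered, in order, by the prediction."""
    def tokenize(s):
        return re.findall(r"[a-z0-9]+", s.lower())
    t, p = tokenize(target), tokenize(prediction)
    return lcs_length(t, p) / len(t) if t else 0.0

target = "Baltimore Ravens::San Francisco 49ers"
prediction = (
    "The Baltimore Ravens played against the San Francisco 49ers "
    "in Super Bowl XLVII in 2013."
)
print(rouge_l_recall(target, prediction))
```

Because the LCS is order-sensitive, an answer that lists all expected names in a different order scores below 1. Under this sketch, the founding fathers response recovers 8 of the 14 target tokens in order, which is consistent with the reported mean of about 0.857 when the other two questions score 1.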


<a id="summary"></a>
## Summary and next steps

You successfully completed this notebook!

Check out our _<a href="https://ibm.github.io/watsonx-ai-python-sdk/samples.html" target="_blank" rel="noopener no referrer">Online Documentation</a>_ for more samples, tutorials, documentation, how-tos, and blog posts. 

### Authors:
**Dominik Zimny**, Software Engineer at watsonx.ai

**Mateusz Szewczyk**, Software Engineer at watsonx.ai

Copyright © 2024-2025 IBM. This notebook and its source code are released under the terms of the MIT License.