![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)
# Use watsonx to run `meta-llama/llama-3-1-8b-instruct` as an AI service

#### Disclaimers

- Use only Projects and Spaces that are available in the watsonx context.


## Notebook content

This notebook demonstrates the steps and code required to create, test, and deploy a watsonx.ai AI service.

Some familiarity with Python is helpful. This notebook uses Python 3.11.


## Learning goal

The goal of this notebook is to show how to use AI services to generate accurate, contextually relevant responses to a question.


## Table of Contents

This notebook contains the following parts:

- [Setup](#setup)
- [Create AI service](#ai_service)
- [Testing AI service's function locally](#testing)
- [Deploy AI service](#deploy)
- [Example of Executing an AI service](#example)
- [Summary](#summary)

<a id="setup"></a>
## Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Contact your Cloud Pak for Data administrator and ask them for your account credentials.

### Install dependencies

In [1]:
%pip install -U "ibm_watsonx_ai>=1.2.4" | tail -n 1

Successfully installed ibm_watsonx_ai-1.2.4


#### Define credentials

Authenticate the Watson Machine Learning service on IBM Cloud Pak for Data. You need to provide the **admin's** `username` and the platform `url`.

In [2]:
username = "PASTE YOUR USERNAME HERE"
url = "PASTE THE PLATFORM URL HERE"

Use the **admin's** `api_key` to authenticate WML services:

In [None]:
import getpass
from ibm_watsonx_ai import Credentials

credentials = Credentials(
    username=username,
    api_key=getpass.getpass("Enter your watsonx.ai API key and hit enter: "),
    url=url,
    instance_id="openshift",
    version="5.1"
)

Alternatively, you can use the **admin's** `password`:

In [3]:
import getpass
from ibm_watsonx_ai import Credentials

credentials = Credentials(
    username=username,
    password=getpass.getpass("Enter your watsonx.ai password and hit enter: "),
    url=url,
    instance_id="openshift",
    version="5.1"
)

#### Working with spaces

You need a deployment space for your work. If you do not have one, you can create it at `{PLATFORM_URL}/ml-runtime/spaces?context=icp4data`:

- Click New Deployment Space
- Create an empty space
- Go to space `Settings` tab
- Copy `space_id` and paste it below

**Tip**: You can also use the SDK to prepare the space for your work. More information can be found in the [Space management notebook](https://github.com/IBM/watson-machine-learning-samples/blob/master/cpd5.1/notebooks/python_sdk/instance-management/Space%20management.ipynb).

**Action**: Assign space ID below

In [4]:
space_id = "PASTE YOUR SPACE ID HERE"
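
Because the credential and space cells above rely on placeholder strings, a quick check before creating the client can catch a value that was never filled in. The following is a minimal sketch, not part of the SDK: `find_unset` is a hypothetical helper, and the value `"admin"` is an invented example. It assumes every placeholder starts with `PASTE `, as the ones in this notebook do.

```python
def find_unset(settings: dict[str, str]) -> list[str]:
    """Return the names of settings that are empty or still hold a placeholder."""
    return [
        name
        for name, value in settings.items()
        # Assumption: placeholders in this notebook all begin with "PASTE ".
        if not value.strip() or value.startswith("PASTE ")
    ]


# Hypothetical values for illustration only.
example_settings = {
    "username": "admin",
    "url": "PASTE THE PLATFORM URL HERE",
    "space_id": "PASTE YOUR SPACE ID HERE",
}
missing = find_unset(example_settings)
print(missing)
```

In the notebook itself you could pass `{"username": username, "url": url, "space_id": space_id}` and stop early if the returned list is not empty.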

#### Create `APIClient` instance

In [5]:
from ibm_watsonx_ai import APIClient

api_client = APIClient(credentials, space_id=space_id)

#### Specify model

This notebook uses the chat model `meta-llama/llama-3-1-8b-instruct`, which has to be available in your Cloud Pak for Data environment for this notebook to run successfully.  
If this model is not available in your Cloud Pak for Data environment, you can specify any other available chat model.  
You can list the available chat models by running the cell below.

In [6]:
if len(api_client.foundation_models.ChatModels):
    print(*api_client.foundation_models.ChatModels, sep="\n")
else:
    print("Chat models are missing in this environment. Install chat models to proceed.")

meta-llama/llama-3-1-8b-instruct


Specify the `model_id` of the model you will use for the chat.

In [7]:
model_id = "meta-llama/llama-3-1-8b-instruct"
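
If the preferred model might not be installed, the choice can be made programmatically instead of by editing the cell. The sketch below shows one way to do that, assuming you have already turned the listing above into a plain list of model ID strings. `pick_model` is a hypothetical helper, and `example/other-chat-model` is an invented ID used only for illustration.

```python
def pick_model(preferred: str, available: list[str]) -> str:
    """Return the preferred model if present, otherwise the first available one."""
    if not available:
        raise RuntimeError("No chat models are available in this environment.")
    if preferred in available:
        return preferred
    # Fall back to whatever the environment offers first.
    return available[0]


# Hypothetical listings for illustration only.
chosen = pick_model(
    "meta-llama/llama-3-1-8b-instruct",
    ["example/other-chat-model", "meta-llama/llama-3-1-8b-instruct"],
)
fallback = pick_model(
    "meta-llama/llama-3-1-8b-instruct",
    ["example/other-chat-model"],
)
print(chosen, fallback)
```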

<a id="ai_service"></a>
## Create AI service

Prepare the function that will be deployed as an AI service.

Specify the default parameters that will be passed to the function.

In [8]:
def deployable_ai_service(context, space_id=space_id, url=credentials["url"], model_id=model_id, params={"temperature": 1}, **kwargs):
    from ibm_watsonx_ai import APIClient, Credentials
    from ibm_watsonx_ai.foundation_models import ModelInference

    api_client = APIClient(
        credentials=Credentials(
            url=url,
            token=context.generate_token(),
            instance_id="openshift",
        ),
        space_id=space_id,
    )

    model = ModelInference(
        model_id=model_id,
        api_client=api_client,
        params=params,
    )

    def generate(context) -> dict:
        api_client.set_token(context.get_token())

        payload = context.get_json()
        question = payload["question"]

        messages = [
            {
                "role": "system",
                "content": "You are a helpful assistant.",
            },
            {
                "role": "user",
                "content": question
            }
        ]

        response = model.chat(messages=messages)

        return {
            "body": response
        }

    def generate_stream(context):
        api_client.set_token(context.get_token())

        payload = context.get_json()
        question = payload["question"]

        messages = [
            {
                "role": "system",
                "content": "You are a helpful assistant.",
            },
            {
                "role": "user",
                "content": question
            }
        ]

        yield from model.chat_stream(messages)

    return generate, generate_stream
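
The function above is structured as a factory: the outer function performs the setup (client and model) and returns two inner functions that read the request payload from the context. The sketch below isolates that structure with no SDK calls, so the control flow is easier to see. `FakeContext`, `echo_service`, and the `prefix` parameter are hypothetical names invented for this illustration; they are not part of `ibm_watsonx_ai`.

```python
class FakeContext:
    """Hypothetical stand-in for the runtime context; not part of the SDK."""

    def __init__(self, payload=None):
        self.request_payload_json = payload or {}

    def get_json(self):
        return self.request_payload_json


def echo_service(context, prefix="Q: ", **kwargs):
    # Outer scope: setup that the inner functions share through the closure.
    calls = {"count": 0}

    def generate(context) -> dict:
        # Inner scope: work done for each request, using the request payload.
        calls["count"] += 1
        question = context.get_json()["question"]
        return {"body": {"answer": prefix + question, "call": calls["count"]}}

    def generate_stream(context):
        # A generator: each yielded item is one streamed chunk.
        question = context.get_json()["question"]
        for word in (prefix + question).split():
            yield word

    return generate, generate_stream


ctx = FakeContext({"question": "When was IBM founded?"})
generate, generate_stream = echo_service(ctx)
first = generate(ctx)
second = generate(ctx)
streamed = list(generate_stream(ctx))
print(first, second, streamed)
```

The counter shows that state created in the outer function persists across calls to `generate`, which is why the real service can build its client and model once and reuse them.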

<a id="testing"></a>
## Testing AI service's function locally

You can test the AI service's function locally. First, initialize `RuntimeContext`.

In [9]:
from ibm_watsonx_ai.deployments import RuntimeContext

context = RuntimeContext(api_client=api_client)

In [10]:
local_function = deployable_ai_service(context=context)

Prepare the request JSON payload for local invocation.

In [11]:
context.request_payload_json = {"question": "When was IBM founded?"}

Execute the `generate` function locally.

In [12]:
resp = local_function[0](context)
resp

{'body': {'id': 'chat-95847c14bdfe4219a5152a7b05ae14b9',
  'model_id': 'meta-llama/llama-3-1-8b-instruct',
  'model': 'meta-llama/llama-3-1-8b-instruct',
  'choices': [{'index': 0,
    'message': {'role': 'assistant',
     'content': 'IBM, or International Business Machines, was founded on June 16, 1911.'},
    'finish_reason': 'stop'}],
  'created': 1737447745,
  'created_at': '2025-01-21T08:22:25.449Z',
  'usage': {'completion_tokens': 19, 'prompt_tokens': 46, 'total_tokens': 65},
     'more_info': 'https://www.ibm.com/docs/en/cloud-paks/cp-data/4.8.x?topic=models-supported-foundation'},
    {'message': "The value of 'max_tokens' for this model was set to value 1024",
     'id': 'unspecified_max_token',
     'additional_properties': {'limit': 0,
      'new_value': 1024,
      'parameter': 'max_tokens',
      'value': 0}}]}}}
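
The answer text sits several levels deep in the returned dictionary. A small helper makes it easier to pull out; the sketch below assumes the `{'body': {'choices': [...]}}` shape shown in the output above, and `extract_answer` is a hypothetical name. The sample dictionary is an abbreviated copy of that output.

```python
def extract_answer(result: dict) -> str:
    """Return the assistant text from a {'body': {...}} chat result."""
    choices = result.get("body", {}).get("choices", [])
    if not choices:
        raise ValueError("The response contains no choices.")
    return choices[0]["message"]["content"]


# Abbreviated copy of the local response shown above.
sample = {
    "body": {
        "model_id": "meta-llama/llama-3-1-8b-instruct",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "IBM, or International Business Machines, was founded on June 16, 1911.",
                },
                "finish_reason": "stop",
            }
        ],
    }
}
answer = extract_answer(sample)
print(answer)
```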

Execute the `generate_stream` function locally.

In [13]:
response = local_function[1](context)

In [14]:
for event in response:
    print(event["choices"][0]["delta"]["content"], end="")

IBM (International Business Machines) was founded on June 16, 1911. It was formed through the merger of three companies: Tabulating Machine Company, International Time Recording Company, and Computing Scale Company.
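
The loop above indexes directly into every event, which raises an error if an event has no choices or its delta carries no `content`. Whether this deployment emits such events is not shown here, so the following is a defensive sketch under that assumption. `collect_stream_text` is a hypothetical helper and the sample events are invented to mimic the shape used above.

```python
def collect_stream_text(events) -> str:
    """Join the text pieces of streamed chat events, skipping empty ones."""
    parts = []
    for event in events:
        choices = event.get("choices") or []
        if not choices:
            # Assumption: some events may carry metadata only.
            continue
        content = choices[0].get("delta", {}).get("content")
        if content:
            parts.append(content)
    return "".join(parts)


# Invented events for illustration only.
sample_events = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]},
    {"choices": [{"index": 0, "delta": {"content": "IBM was founded "}}]},
    {"choices": [{"index": 0, "delta": {"content": "in 1911."}}]},
    {"choices": []},
]
text = collect_stream_text(sample_events)
print(text)
```

Collecting the pieces into a string also lets you reuse the full answer afterwards, which printing chunk by chunk does not.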

<a id="deploy"></a>
## Deploy AI service

Store the AI service, using the `runtime-24.1-py3.11` software specification.

In [15]:
sw_spec_id = api_client.software_specifications.get_id_by_name("runtime-24.1-py3.11")
sw_spec_id

'45f12dfe-aa78-5b8d-9f38-0ee223c47309'

In [16]:
meta_props = {
    api_client.repository.AIServiceMetaNames.NAME: "AI service with SDK",
    api_client.repository.AIServiceMetaNames.SOFTWARE_SPEC_ID: sw_spec_id
}
stored_ai_service_details = api_client.repository.store_ai_service(deployable_ai_service, meta_props)

In [17]:
ai_service_id = api_client.repository.get_ai_service_id(stored_ai_service_details)
ai_service_id

'bf74d772-f259-47ec-9545-7adbc8d08cd2'

Create an online deployment of the AI service.

In [18]:
meta_props = {
    api_client.deployments.ConfigurationMetaNames.NAME: "AI service with SDK",
    api_client.deployments.ConfigurationMetaNames.ONLINE: {},
}

deployment_details = api_client.deployments.create(ai_service_id, meta_props)



######################################################################################

Synchronous deployment creation for id: 'bf74d772-f259-47ec-9545-7adbc8d08cd2' started

######################################################################################


initializing
Note: online_url is deprecated and will be removed in a future release. Use serving_urls instead.
.......
ready


-----------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_id='90266394-4ed8-421b-9b86-48aabefcf9b9'
-----------------------------------------------------------------------------------------------




Obtain the `deployment_id` of the previously created deployment.

In [19]:
deployment_id = api_client.deployments.get_id(deployment_details)
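
Before sending requests, it can be useful to confirm that the deployment is in the `ready` state, as the log above reports. The sketch below reads the state from a details dictionary such as the one returned by `api_client.deployments.get_details(deployment_id)`. The nested `entity` / `status` / `state` layout is an assumption about that dictionary, `deployment_state` is a hypothetical helper, and the sample dictionary is invented for illustration.

```python
def deployment_state(details: dict) -> str:
    """Return the deployment state, or 'unknown' if the field is absent."""
    # Assumption: the state is stored under entity -> status -> state.
    return details.get("entity", {}).get("status", {}).get("state", "unknown")


# Invented details dictionary for illustration only.
sample_details = {
    "metadata": {"id": "90266394-4ed8-421b-9b86-48aabefcf9b9"},
    "entity": {"status": {"state": "ready"}},
}
state = deployment_state(sample_details)
print(state)
```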

<a id="example"></a>
## Example of Executing an AI service

Execute the `generate` method.

In [20]:
question = "When was IBM founded?"

deployments_results = api_client.deployments.run_ai_service(
    deployment_id, {"question": question}
)

In [21]:
import json

print(json.dumps(deployments_results, indent=2))

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "IBM (International Business Machines) was founded on June 16, 1911, under the name Computing-Tabulating-Recording Company (CTR). It wasn't until 1924 that the company changed its name to International Business Machines Corporation (IBM).",
        "role": "assistant"
      }
    }
  ],
  "created": 1737447811,
  "created_at": "2025-01-21T08:23:32.343Z",
  "id": "chat-94595835a8af48ab87d7cfdf19a4549c",
  "model": "meta-llama/llama-3-1-8b-instruct",
  "model_id": "meta-llama/llama-3-1-8b-instruct",
  "system": {
      {
        "message": "This model is a Non-IBM Product governed by a third-party license that may impose use restrictions and other obligations. By using this model you agree to its terms as identified in the following URL.",
        "more_info": "https://www.ibm.com/docs/en/cloud-paks/cp-data/4.8.x?topic=models-supported-foundation"
      },
      {
        "additi

Execute the `generate_stream` method.

In [22]:
question = "When was IBM founded?"

deployments_results = api_client.deployments.run_ai_service_stream(
    deployment_id, {"question": question}
)

In [23]:
import json

for chunk in deployments_results:
    print(json.loads(chunk)["choices"][0]["delta"]["content"], end="")

IBM (International Business Machines Corporation) was founded on June 16, 1911. It began as the Computing-Tabulating-Recording Company (CTR), a merger of several companies, including the Tabulating Machine Company, which was founded by Herman Hollerith in 1896. The company officially adopted the name IBM in 1924.

<a id="summary"></a>
## Summary and next steps

You successfully completed this notebook!

You learned how to create and deploy an AI service using the `ibm_watsonx_ai` SDK.

Check out our _<a href="https://ibm.github.io/watsonx-ai-python-sdk/samples.html" target="_blank" rel="noopener noreferrer">Online Documentation</a>_ for more samples, tutorials, documentation, how-tos, and blog posts.

### Author

**Rafał Chrzanowski**, Software Engineer Intern at watsonx.ai.

Copyright © 2025 IBM. This notebook and its source code are released under the terms of the MIT License.