![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)
# Use watsonx and `meta-llama/llama-3-2-11b-vision-instruct` to run as an AI service

#### Disclaimers

- Use only Projects and Spaces that are available in watsonx context.


## Notebook content

This notebook demonstrates the steps and code required to create, test, and deploy a watsonx.ai AI service.

Some familiarity with Python is helpful. This notebook uses Python 3.11.


## Learning goal

The goal of this notebook is to use an AI service to generate accurate, contextually relevant answers to a question about a given image.


## Table of Contents

This notebook contains the following parts:

- [Setup](#setup)
- [Create AI service](#ai_service)
- [Testing AI service's function locally](#testing)
- [Deploy AI service](#deploy)
- [Example of Executing an AI service](#example)
- [Summary](#summary)

<a id="setup"></a>
## Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Create a <a href="https://cloud.ibm.com/catalog/services/watson-machine-learning" target="_blank" rel="noopener no referrer">Watson Machine Learning (WML) Service</a> instance (a free plan is offered and information about how to create the instance can be found <a href="https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/wml-plans.html?context=wx&audience=wdp" target="_blank" rel="noopener no referrer">here</a>).

### Install dependencies

In [None]:
!pip install -U "ibm_watsonx_ai>=1.1.22" | tail -n 1

### Define the WML credentials
Use the code cell below to define the WML credentials required for watsonx foundation model inferencing.

**Action:** Provide the IBM Cloud user API key. For details, see <a href="https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui" target="_blank" rel="noopener no referrer">Managing user API keys</a>.

In [2]:
import getpass
from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=getpass.getpass("Enter your WML API key and hit enter: "),
)

### Working with spaces

You need a deployment space for your work. If you do not have one, you can use the [Deployment Spaces Dashboard](https://dataplatform.cloud.ibm.com/ml-runtime/spaces?context=wx) to create it.

- Click **New Deployment Space**
- Create an empty space
- Select Cloud Object Storage
- Select Watson Machine Learning instance and press **Create**
- Go to **Manage** tab
- Copy `Space GUID` and paste it below

**Tip**: You can also use the SDK to prepare the space for your work. More information can be found [here](https://github.com/IBM/watson-machine-learning-samples/blob/master/cloud/notebooks/python_sdk/instance-management/Space%20management.ipynb).

**Action**: Assign the space ID below.

In [3]:
import os

try:
    space_id = os.environ["SPACE_ID"]
except KeyError:
    space_id = input("Please enter your space_id (hit enter): ")

Create an instance of `APIClient` with the authentication details and the space ID.

In [2]:
from ibm_watsonx_ai import APIClient

api_client = APIClient(credentials=credentials, space_id=space_id)

Specify the `model_id` of the model you will use to chat about an image.

In [3]:
model_id = "meta-llama/llama-3-2-11b-vision-instruct"

<a id="ai_service"></a>
## Create AI service

Prepare the function that will be deployed as an AI service.

In [4]:
def deployable_ai_service(context, **custom):
    # Imports are placed inside the function so that the stored service is self-contained.
    import requests
    import base64
    from ibm_watsonx_ai import APIClient, Credentials
    from ibm_watsonx_ai.foundation_models import ModelInference

    # Values supplied through the keyword arguments (the `custom` deployment configuration).
    space_id = custom.get("space_id")
    url = custom.get("url")
    model_id = custom.get("model_id")
    params = custom.get("params")

    api_client = APIClient(
        credentials=Credentials(url=url, token=context.generate_token()),
        space_id=space_id,
    )

    model = ModelInference(
        model_id=model_id,
        api_client=api_client,
        params=params,
    )

    def build_messages(payload: dict) -> list:
        # Download the image and embed it in the chat message as a base64 data URL.
        response = requests.get(payload["image_url"])
        response.raise_for_status()
        base64_image = base64.b64encode(response.content).decode("utf-8")

        return [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": payload["question"]},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "data:image/jpeg;base64," + base64_image,
                            "detail": "auto",
                        },
                    },
                ],
            }
        ]

    def generate(context) -> dict:
        # Use the token of the current request, then return the full chat response.
        api_client.set_token(context.get_token())
        messages = build_messages(context.get_json())
        return {"body": model.chat(messages=messages)}

    def generate_stream(context):
        # Same flow as `generate`, but yield the response chunk by chunk.
        api_client.set_token(context.get_token())
        messages = build_messages(context.get_json())
        for chunk in model.chat_stream(messages):
            yield chunk

    return generate, generate_stream
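The service above always labels the image as `image/jpeg` in the data URL. A minimal sketch of deriving the MIME type from the URL instead is shown below. It assumes the URL path ends in a recognisable file extension and falls back to JPEG otherwise. `to_data_url` is a hypothetical helper for illustration, not part of the `ibm_watsonx_ai` SDK.

```python
import base64
import mimetypes
from urllib.parse import urlparse


def to_data_url(image_bytes: bytes, image_url: str, default_mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    # Guess the MIME type from the path part of the URL, ignoring any query string.
    path = urlparse(image_url).path
    mime_type, _ = mimetypes.guess_type(path)
    # Fall back to the default when the extension is missing or not an image type.
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = default_mime
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# The prefix reflects the extension found in the URL path.
print(to_data_url(b"abc", "https://example.com/logo.png?raw=true"))
```

If you adopt something like this inside the AI service, the helper and its imports would need to live inside `deployable_ai_service`, like the other imports there.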

<a id="testing"></a>
## Testing AI service's function locally

You can test the AI service's function locally. First, initialise `RuntimeContext`.

In [5]:
from ibm_watsonx_ai.deployments import RuntimeContext

context = RuntimeContext(api_client=api_client)

Specify the keyword arguments that will be passed to the function.

In [6]:
kwargs = {
    "space_id": api_client.default_space_id,
    "url": api_client.credentials.url,
    "model_id": model_id,
    "params": {"temperature": 1}
}

local_function = deployable_ai_service(context=context, **kwargs)

Retrieve an image and display it. This example uses the IBM logo.

In [7]:
import requests
from IPython.display import Image

image_url = "https://raw.github.com/IBM/watson-machine-learning-samples/master/cloud/data/logo/ibm_logo.jpg"

response = requests.get(image_url)

Image(url=image_url, width=600)

Prepare the request JSON payload for local invocation.

In [8]:
context.request_payload_json = {"question": "Describe the image", "image_url": image_url}
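The service reads `question` and `image_url` directly from the payload, so a missing key only surfaces as a `KeyError` inside the service. A minimal sketch of checking the payload before sending it is shown below. `validate_payload` is a hypothetical helper, and the rule that the image must be an http(s) URL is an assumption based on how the service downloads it.

```python
def validate_payload(payload: dict) -> tuple[str, str]:
    """Return (question, image_url) or raise ValueError (hypothetical helper)."""
    # Both fields are required and must be non-empty.
    missing = [key for key in ("question", "image_url") if not payload.get(key)]
    if missing:
        raise ValueError(f"Missing required field(s): {', '.join(missing)}")

    image_url = payload["image_url"]
    # Assumption: the service fetches the image over HTTP(S).
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url must be an http(s) URL")
    return payload["question"], image_url


question, url = validate_payload(
    {"question": "Describe the image", "image_url": "https://example.com/logo.jpg"}
)
print(question, url)
```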

Execute the `generate` function locally.

In [9]:
resp = local_function[0](context)
resp

{'body': {'id': 'chat-2b644bfc0f3849d1af368b7d3809e174',
  'model_id': 'meta-llama/llama-3-2-11b-vision-instruct',
  'model': 'meta-llama/llama-3-2-11b-vision-instruct',
  'choices': [{'index': 0,
    'message': {'role': 'assistant',
     'content': 'The image presents the IBM logo, a prominent symbol in the tech industry.'},
    'finish_reason': 'stop'}],
  'created': 1730907393,
  'model_version': '3.2.0',
  'created_at': '2024-11-06T15:36:34.409Z',
  'usage': {'completion_tokens': 16,
   'prompt_tokens': 6523,
   'total_tokens': 6539},
     'more_info': 'https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-models.html?context=wx'},
    {'message': "The value of 'max_tokens' for this model was set to value 1024",
     'id': 'unspecified_max_token',
     'additional_properties': {'limit': 0,
      'new_value': 1024,
      'parameter': 'max_tokens',
      'value': 0}}]}}}
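The response is a nested dictionary, and in most cases only the answer text and the token usage are of interest. A minimal sketch of pulling those fields out is shown below, using a trimmed copy of the output above as sample data. `summarize_chat_response` is a hypothetical helper. It accepts the result with or without the `body` wrapper.

```python
def summarize_chat_response(result: dict) -> dict:
    """Extract the answer text and token usage from a chat response (hypothetical helper)."""
    # The local `generate` function wraps the response in "body"; unwrap it if present.
    body = result.get("body", result)
    choice = body["choices"][0]
    usage = body.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


# Trimmed copy of the response shown above.
sample = {
    "body": {
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "The image presents the IBM logo, a prominent symbol in the tech industry.",
                },
                "finish_reason": "stop",
            }
        ],
        "usage": {"completion_tokens": 16, "prompt_tokens": 6523, "total_tokens": 6539},
    }
}

print(summarize_chat_response(sample))
```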

Execute the `generate_stream` function locally.

In [10]:
response = local_function[1](context)

In [11]:
for event in response:
    print(event["choices"][0]["delta"]["content"], end="")

The image is the IBM (International Business Machines) logo, a stylized representation of the letters "IBM" in blue on a white background. The logo features geometric shapes resembling a light bulb, symbolizing innovation and illumination. The letters are stacked horizontally and vertically, creating a visually striking design that represents IBM's commitment to advancement and transformation in the technology industry.

<a id="deploy"></a>
## Deploy AI service

Store the AI service in the repository, using the `runtime-24.1-py3.11` software specification.

In [12]:
sw_spec_id = api_client.software_specifications.get_id_by_name("runtime-24.1-py3.11")
sw_spec_id

'45f12dfe-aa78-5b8d-9f38-0ee223c47309'

In [13]:
meta_props = {
    api_client.repository.AIServiceMetaNames.NAME: "AI service with SDK",    
    api_client.repository.AIServiceMetaNames.SOFTWARE_SPEC_ID: sw_spec_id
}
stored_ai_service_details = api_client.repository.store_ai_service(deployable_ai_service, meta_props)

In [14]:
ai_service_id = api_client.repository.get_ai_service_id(stored_ai_service_details)
ai_service_id

'407f46ba-a945-4f67-b545-c901e72c90e6'

Create an online deployment of the AI service.

In [15]:
meta_props = {
    api_client.deployments.ConfigurationMetaNames.NAME: "AI service with SDK",
    api_client.deployments.ConfigurationMetaNames.ONLINE: {},
    api_client.deployments.ConfigurationMetaNames.CUSTOM: {
        "space_id": api_client.default_space_id,
        "url": api_client.credentials.url,
        "model_id": model_id,
        "params": {"temperature": 1},
    },
}

deployment_details = api_client.deployments.create(ai_service_id, meta_props)



######################################################################################

Synchronous deployment creation for id: '407f46ba-a945-4f67-b545-c901e72c90e6' started

######################################################################################


initializing
Note: online_url and serving_urls are deprecated and will be removed in a future release. Use inference instead.
...
ready


-----------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_id='99531da1-03aa-45c1-bc70-1a26299579ef'
-----------------------------------------------------------------------------------------------




Obtain the `deployment_id` of the previously created deployment.

In [16]:
deployment_id = api_client.deployments.get_id(deployment_details)

<a id="example"></a>
## Example of Executing an AI service

Execute the `generate` method.

In [19]:
question = "Describe the image"

deployments_results = api_client.deployments.run_ai_service(
    deployment_id, {"question": question, "image_url": image_url}
)

In [20]:
import json

print(json.dumps(deployments_results, indent=2))

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The image depicts the IBM logo, a blue-stamped version of the company's name (IBM) in large letters, with lines extending off the letters to create a three-dimensional raised look. A small, circular symbol with an 'R' inside is set in the lower right corner.\n\nIBM stands for International Business Machines, a multinational technology company that has been in operation since 1911 and is one of the largest technology companies in the world. IBM is headquartered in Armonk, New York, USA and has a significant presence in over 170 countries.",
        "role": "assistant"
      }
    }
  ],
  "created": 1730907511,
  "created_at": "2024-11-06T15:38:33.379Z",
  "id": "chat-8834d4a2173b4fd48681454b99781249",
  "model": "meta-llama/llama-3-2-11b-vision-instruct",
  "model_id": "meta-llama/llama-3-2-11b-vision-instruct",
  "model_version": "3.2.0",
  "system": {
      {
        "messag

Execute the `generate_stream` method.

In [27]:
question = "Describe the image"

deployments_results = api_client.deployments.run_ai_service_stream(
    deployment_id, {"question": question, "image_url": image_url}
)

In [28]:
import json

for chunk in deployments_results:
    print(json.loads(chunk)["choices"][0]["delta"]["content"], end="")

The image shows a blue logo with the letters "IBM" in the middle, surrounded by horizontal lines. The logo appears to be the IBM logo, which is a well-known company that was founded in 1911 and is known for its technology products and services.

The logo features the company's initials, "IBM", in bold, blue letters, with a series of horizontal lines above and below the letters. The lines are evenly spaced and of varying lengths, creating a sense of rhythm and harmony. The blue color of the logo is a deep, rich shade that is often associated with trust, stability, and innovation.

Overall, the IBM logo is a classic example of corporate branding and design. It has been used by the company for many years and is widely recognized around the world. The logo's simplicity, elegance, and consistency have made it a symbol of IBM's commitment to quality, excellence, and innovation.
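The streaming loops in this notebook index straight into `["choices"][0]["delta"]["content"]`. A minimal sketch of a more defensive collector is shown below. It assumes that some chunks may arrive without `choices` or without `delta.content`, which is an assumption about the stream rather than something shown in the outputs above. `collect_stream_text` is a hypothetical helper that accepts both the dictionaries yielded locally and the JSON strings returned by the deployment.

```python
import json
from typing import Iterable, Union


def collect_stream_text(chunks: Iterable[Union[str, dict]]) -> str:
    """Join the text deltas of a chat stream into one string (hypothetical helper)."""
    parts = []
    for chunk in chunks:
        # Deployed streams yield JSON strings; local streams yield dictionaries.
        event = json.loads(chunk) if isinstance(chunk, str) else chunk
        choices = event.get("choices") or []
        if not choices:
            continue  # Skip chunks that carry no choices.
        content = choices[0].get("delta", {}).get("content")
        if content:
            parts.append(content)
    return "".join(parts)


# Hand-written chunks for illustration only, not real model output.
fake_chunks = [
    '{"choices": [{"delta": {"content": "IBM"}}]}',
    {"choices": [{"delta": {}}]},
    {"choices": []},
    {"choices": [{"delta": {"content": " logo"}}]},
]
print(collect_stream_text(fake_chunks))
```

Collecting the full text this way gives up the incremental printing of the original loop, so it suits cases where the complete answer is needed for further processing.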

<a id="summary"></a>
## Summary and next steps

You successfully completed this notebook!

You learned how to create and deploy an AI service using the `ibm_watsonx_ai` SDK.

Check out our _<a href="https://ibm.github.io/watsonx-ai-python-sdk/samples.html" target="_blank" rel="noopener no referrer">Online Documentation</a>_ for more samples, tutorials, documentation, how-tos, and blog posts. 

### Author

**Mateusz Szewczyk**, Software Engineer at Watson Machine Learning.

Copyright © 2024 IBM. This notebook and its source code are released under the terms of the MIT License.