IBM Tech 2025 Lab
=================

LAB 2: RHELAI & InstructLab
---------------------------

This Python notebook is a simple chatbot that uses the `requests` package to chat with a model served on a RHELAI VSI.

If you are unfamiliar with the Jupyter Notebook interface, you can quickly go through the [documentation](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/notebook-editor.html?context=wx&locale=en&audience=wdp).

Goals
-----
1) Chat with a model served on a RHELAI VSI using InstructLab vLLM
2) Try a document summarization use case using external assets
3) Chat to expose some limitations of the base Mistral-7B model
4) Chat with a fine-tuned model that has been trained on synthetic generated data (SGD)

**Exercise 1.a**: Chat with a Mistral-7b model served on a RHELAI VSI instance.

Observe the model server URL (and port) endpoint.
Set the API key in the appropriate cell. Use the same API key that you used for the previous lab.
Provide a prompt to chat with the served model. Type 'exit' or 'quit' to terminate the chat loop.

1. Install prerequisite libraries

In [None]:
!pip install requests

In [None]:
import requests
import json
import getpass

2. Set the variables for accessing the model server

In [None]:
# Define the model server URL (ensure it's correct for your deployment)
MODEL_SERVER_URL = "https://ilab-base.fs-demo.biz/v1/chat/completions"

# Prompt for the API key without echoing it (leave blank if no key is required)
API_KEY = getpass.getpass("API key: ") or None

3. Define a function to chat with the model using the requests package

In [None]:
def chat_with_model(prompt):
    """Send a single user prompt to the model server and return the reply text."""
    headers = {"Content-Type": "application/json"}
    if API_KEY:
        headers["Authorization"] = f"Bearer {API_KEY}"

    # Chat-completions style payload: a system message followed by the user prompt
    payload = {
        "model": "rhelai_service_model",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }

    try:
        response = requests.post(MODEL_SERVER_URL, headers=headers, json=payload, verify=True)
        response.raise_for_status()
        data = response.json()
        # Guard against a missing or empty "choices" list
        choices = data.get("choices") or [{}]
        return choices[0].get("message", {}).get("content", "No response received.")
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        return f"Error communicating with model: {e}"
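The function above reads the reply from `choices[0].message.content` in the JSON response. As a minimal sketch, the helper below shows that lookup in isolation, including the case where `choices` is empty. The name `extract_reply` and the sample dictionaries are illustrative assumptions, not part of the lab.

```python
def extract_reply(data, default="No response received."):
    """Return the assistant text from a chat-completions style response dict."""
    choices = data.get("choices") or []
    if not choices:
        return default
    message = choices[0].get("message") or {}
    return message.get("content") or default


# Hypothetical sample responses, shaped like the JSON the notebook parses.
sample_ok = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
sample_empty = {"choices": []}

print(extract_reply(sample_ok))
print(extract_reply(sample_empty))
```

Keeping the parsing in a small pure function makes it easy to check against saved responses without calling the server.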


4. Provide a prompt to chat with the served model. Wait for response. Type 'exit' or 'quit' to terminate the chat loop.

In [None]:
# Chat loop
while True:
    user_input = input("You: ")
    # Check for exit before announcing a wait
    if user_input.strip().lower() in ["exit", "quit"]:
        print("Exiting chat.")
        break

    print("Please wait....")
    response = chat_with_model(user_input)
    print(f"Model: {response}")
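Each pass through the loop above sends only the latest prompt, so the model does not see earlier turns. If you want follow-up questions to carry context, one option is to resend the previous turns in the `messages` list. The following is a minimal sketch, assuming the server accepts the usual alternating `user` and `assistant` roles; `build_messages` and `history` are hypothetical names, not part of the lab.

```python
def build_messages(history, user_input, system_prompt="You are a helpful assistant."):
    """Build the messages list: system prompt, prior turns, then the new user turn."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, model_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": model_text})
    messages.append({"role": "user", "content": user_input})
    return messages


# Hypothetical earlier exchange, stored as (user, model) pairs.
history = [("Hello", "Hi, how can I help?")]
messages = build_messages(history, "Can you summarize a document for me?")
for m in messages:
    print(m["role"], ":", m["content"])
```

To use this in the chat loop, the resulting list would replace the fixed two-message list in the request payload, and each new (prompt, reply) pair would be appended to `history` after the call.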

**Exercise 1.b**: Summarize documents

Load your external data as assets and use them in your queries. Here we will first load the document into the associated Cloud Object Storage (COS) bucket and then use its contents to form prompts.

Click on ![Image](https://www.ibm.com/docs/en/SSQNUZ_5.1.x/wsj/console/images/find_data_icon.png) to get started. Once the data asset has been imported successfully, click on
![Image](https://www.ibm.com/docs/en/SSQNUZ_5.1.x/wsj/analyze-data/images/code-snippets-icon.png) to generate a code snippet similar to the one below.

Refer: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/load-and-access-data.html?context=wx&locale=en&audience=wdp

Sample Prompts:
1. This is a conversation between a doctor and patient, provide a diagnosis and plan the next steps. Explain your response.
2. This is a conversation between a doctor and patient, is there an emergency involved or does this situation need immediate medical attention?
3. Based on the above conversation between a doctor and patient, what are the major concerns discussed and course of treatment advised.

5. Insert code snippet to read data

In [None]:

import os, types
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share the notebook.

cos_client = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='pQT19GjBZoU33LeoUNiT-dAQf8osQUYpHdzs5lWwNQIR',
    ibm_auth_endpoint="https://iam.cloud.ibm.com/identity/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3.direct.us-south.cloud-object-storage.appdomain.cloud')

bucket = 'rhelaiinstructlab-donotdelete-pr-tjhwbcblh1zk4y'
object_key = 'RES0217.txt'

# load data of type "text/plain" into a botocore.response.StreamingBody object.
# Please read the documentation of ibm_boto3 and pandas to learn more about the possibilities to load the data.
# ibm_boto3 documentation: https://ibm.github.io/ibm-cos-sdk-python/
# pandas documentation: http://pandas.pydata.org/

streaming_body_1 = cos_client.get_object(Bucket=bucket, Key=object_key)['Body']

6. Review the document contents

In [None]:
# Read the object body (bytes) and decode it to text so it can be embedded in prompts
document = streaming_body_1.read().decode("utf-8")
print(f"Document read is \n {document}")

6. Prompt the model with the document contents.
  
Sample Prompts:

1) This is a conversation between a doctor and patient, provide a diagnosis and plan the next steps. Explain your response.
2) This is a conversation between a doctor and patient, is there an emergency involved or does this situation need immediate medical attention?
3) Based on the above conversation between a doctor and patient, what are the major concerns discussed and course of treatment advised.

In [None]:
# Chat loop
while True:
    user_input = input("You: ")
    # Check for exit before announcing a wait
    if user_input.strip().lower() in ["exit", "quit"]:
        print("Exiting chat.")
        break

    print("Please wait....")
    # Prepend the document contents to the question
    response = chat_with_model(f"{document} \n {user_input}")
    print(f"Model: {response}")
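The loop above simply concatenates the document and the question. As a rough sketch of a slightly more structured prompt, the helper below labels the two parts and truncates very long documents. The name `build_document_prompt`, the `max_chars` limit of 8000, and the sample text are assumptions for illustration; the actual context limit depends on the served model.

```python
def build_document_prompt(document, question, max_chars=8000):
    """Combine document text and a question into one prompt string."""
    # Accept raw bytes (as returned by a streaming body) or already-decoded text.
    if isinstance(document, bytes):
        document = document.decode("utf-8", errors="replace")
    text = document.strip()
    # Truncate by characters as a crude stand-in for a token limit.
    if len(text) > max_chars:
        text = text[:max_chars] + "\n[document truncated]"
    return f"Document:\n{text}\n\nQuestion: {question}"


# Hypothetical sample document, not the lab's data asset.
sample = b"Doctor: How are you feeling today?\nPatient: I have had a headache since Monday."
print(build_document_prompt(sample, "What are the major concerns discussed?"))
```

The returned string could be passed to `chat_with_model` in place of the inline f-string.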

**Exercise 2:** Ask about IBM Cloud for Financial Services

The base model has not been trained on IBM Cloud for Financial Services content, so it starts to hallucinate. Ask it about FSCloud with prompts such as the following:

Prompt 1) What is IBM FSCloud and which industry clients is it focused on?
Prompt 2) The security control requirements of IBM Cloud for Financial services are based on what?
Prompt 3) What is IBM Cloud for Financial Services Validated?

Notice how the responses are vague and in some cases downright incorrect. The base model is hallucinating.

7. Set the endpoint for VSI instance serving base model and chat with the model

In [None]:
# Define the base model server URL (ensure it's correct for your deployment)
MODEL_SERVER_URL = "https://ilab-base.fs-demo.biz/v1/chat/completions"

In [None]:
# Chat loop
while True:
    user_input = input("You: ")
    # Check for exit before announcing a wait
    if user_input.strip().lower() in ["exit", "quit"]:
        print("Exiting chat.")
        break

    print("Please wait....")
    response = chat_with_model(user_input)
    print(f"Model: {response}")

**Exercise 3:** The base model has been fine-tuned on an FSCloud knowledge base. The fine-tuned model is served on a different endpoint.

Repeat the prompts from Exercise 2 to learn about IBM Cloud for Financial Services.

8. Set the endpoint for the VSI instance serving the fine-tuned model and chat with the model

In [None]:
# Define the fine-tuned model server URL (ensure it's correct for your deployment)
MODEL_SERVER_URL = "https://ilab-tuned.fs-demo.biz/v1/chat/completions"

In [None]:
# Chat loop
while True:
    user_input = input("You: ")
    # Check for exit before announcing a wait
    if user_input.strip().lower() in ["exit", "quit"]:
        print("Exiting chat.")
        break

    print("Please wait....")
    response = chat_with_model(user_input)
    print(f"Model: {response}")
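To see the difference between the base and fine-tuned models more directly, it can help to send the same prompt to both endpoints and print the replies side by side. The sketch below shows one way to structure that. `compare_models`, `ENDPOINTS`, and `fake_ask` are hypothetical names, and `fake_ask` is a stand-in that makes no network call so the example runs offline.

```python
def compare_models(prompt, endpoints, ask):
    """Send one prompt to each endpoint and return {label: reply}."""
    return {label: ask(url, prompt) for label, url in endpoints.items()}


# The two endpoint URLs used in this notebook.
ENDPOINTS = {
    "base": "https://ilab-base.fs-demo.biz/v1/chat/completions",
    "tuned": "https://ilab-tuned.fs-demo.biz/v1/chat/completions",
}


def fake_ask(url, prompt):
    # Stand-in for a real HTTP call; returns a placeholder instead of a model reply.
    return f"[reply from {url}]"


results = compare_models(
    "What is IBM Cloud for Financial Services Validated?", ENDPOINTS, fake_ask
)
for label, reply in results.items():
    print(f"{label}: {reply}")
```

In the notebook, `ask` could be a small wrapper that posts to the given URL in the same way `chat_with_model` does.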