
# Accessing Watsonx.ai via REST API

In this lab, we take a close look at making HTTP requests to [Watsonx.ai's REST API](https://workbench.res.ibm.com/docs/api-reference) and learn how to use its functionality. The lab explores only a few of the many REST endpoints available, so consult the REST API documentation for the full list of capabilities.

In [None]:
# first we'll start by installing some dependencies
import sys
!{sys.executable} -m pip install -q requests
!{sys.executable} -m pip install -q ibm-cloud-sdk-core

In [8]:
# and verifying they've installed correctly
import json
import requests
from ibm_cloud_sdk_core import IAMTokenManager

## HTTP request headers

Headers carry parameter values that represent the metadata associated with an API request and response. In the following example, the `Authorization` header provides the server with credentials to validate your access. Watsonx.ai uses a "Bearer" access token, which we obtain by exchanging our API key through the IAM token manager. The `Content-Type` header tells the server the media type of the request body, in this case `application/json`. The `Accept` header tells the server which media type we expect in the response.

In [7]:
# THESE ARE THE VALUES YOU'LL NEED TO FILL IN YOURSELF
# API key you created
api_key = "INSERT YOUR KEY HERE"

# Project ID of your watsonx instance
project_id = "INSERT YOUR PROJECT ID HERE"

# URL service endpoint
ibm_cloud_url = "INSERT SERVICE ENDPOINT HERE"

access_token = ''
try:
  access_token = IAMTokenManager(
    apikey = api_key,
    url = "https://iam.cloud.ibm.com/identity/token"
  ).get_token()
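except Exception as err:
  # Catch Exception instead of using a bare except, and show the underlying cause
  print(f'Issue obtaining access token. Check variables? ({err})')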

# The headers we'll send for both our POST and GET requests
headers = {
  "Authorization": "Bearer " + access_token,
  "Content-Type": "application/json",
  "Accept": "application/json"
}
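
If the token request fails, `access_token` stays empty and the `Authorization` header ends up as a bare `"Bearer "`. As a minimal sketch, the header construction could be wrapped in a small helper that refuses an empty token. The name `build_headers` is hypothetical and not part of any IBM SDK.

```python
def build_headers(access_token):
    """Return JSON request headers carrying a Bearer token.

    Hypothetical helper for this lab: it raises early when the token is
    empty instead of sending a request that is bound to be rejected.
    """
    if not access_token or not access_token.strip():
        raise ValueError("access_token is empty; obtain an IAM token first")
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }


# "example-token" is a placeholder, not a real credential
example_headers = build_headers("example-token")
print(example_headers["Authorization"])
```

Failing at this point makes a credentials problem visible before any request is sent.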

## POST vs GET

This lab uses two HTTP methods: GET and POST. With GET, data parameters are included in the URL, where they are visible to anyone who can see it. With POST, the data is not part of the URL but is instead passed in the HTTP message body.

GET requests are intended to retrieve data from a server and do not modify the server’s state. On the other hand, POST requests are used to send data to the server for processing and may modify the server’s state.
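
To make the difference concrete, here is a small sketch that only builds the URL and body for each method and sends nothing over the network. It uses the Python standard library; `EXAMPLE_URL`, `build_get`, and `build_post` are illustrative names, and the URL is a placeholder rather than a real endpoint.

```python
import json
from urllib.parse import urlencode

# Placeholder URL used only for illustration; it is not a real endpoint.
EXAMPLE_URL = "https://example.com/api/items"


def build_get(url, params):
    """Return (url, body) for a GET: parameters go into the query string."""
    return f"{url}?{urlencode(params)}", None


def build_post(url, payload):
    """Return (url, body) for a POST: the payload goes into the message body."""
    return url, json.dumps(payload).encode("utf-8")


get_url, get_body = build_get(EXAMPLE_URL, {"version": "2023-07-07"})
post_url, post_body = build_post(EXAMPLE_URL, {"input": "Hello"})

print("GET :", get_url, "| body:", get_body)
print("POST:", post_url, "| body:", post_body)
```

The GET data shows up after the `?` in the URL, while the POST URL is unchanged and the data travels as JSON bytes in the body.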

## POST requests with 'Generate' endpoint
The generate endpoint `https://us-south.ml.cloud.ibm.com/ml/v1-beta/generation/text` provides an interface for sending prompts to any model supported by Watsonx.ai. Given a text prompt as input and the required parameters, the selected model attempts to complete the provided input and returns the result as `generated_text`.

The request body needs to include:
- Model ID (string): the ID of the model
- Input (string): the prompt to complete
- Parameters for the model (key-value pairs)
- Project ID (string): your watsonx project ID

> Note that the generate endpoint URL depends on the region where your watsonx instance is provisioned.
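
The requirements above can be sketched as one helper that assembles both the endpoint URL and the request body. `build_generation_request` is a hypothetical name, not part of any IBM SDK. The path and the `version` value mirror the ones used in this lab, and the `us-south` host is simply the example region from the text, so substitute your own service endpoint.

```python
def build_generation_request(service_url, project_id, model_id, prompt,
                             version="2023-07-07", **parameters):
    """Return (url, body) for a text generation POST request.

    Hypothetical helper: it checks the required values and leaves the
    actual HTTP call to the caller.
    """
    required = {
        "service_url": service_url,
        "project_id": project_id,
        "model_id": model_id,
        "prompt": prompt,
    }
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError(f"Missing required values: {', '.join(missing)}")

    # The host part depends on where the watsonx instance is provisioned
    url = f"{service_url.rstrip('/')}/ml/v1-beta/generation/text?version={version}"
    body = {
        "model_id": model_id,
        "input": prompt,
        "parameters": parameters,
        "project_id": project_id,
    }
    return url, body


example_url, example_body = build_generation_request(
    "https://us-south.ml.cloud.ibm.com",
    project_id="INSERT YOUR PROJECT ID HERE",
    model_id="google/flan-ul2",
    prompt="This service is",
    max_new_tokens=50,
)
print(example_url)
```

Keeping the service URL as an argument means a different region only requires a different first argument.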

In [9]:
# The body we'll send as part of our POST request
body = {
  "model_id": "google/flan-ul2",
  "input": "Write a short blog post for an advanced cloud service for large language models: This service is",
  "parameters": {
    "temperature": 0,
    "max_new_tokens": 50,
    "min_new_tokens": 25
  },
  "project_id": project_id  
}

In [10]:
version = "2023-07-07"
generation_endpoint = f"{ibm_cloud_url}/ml/v1-beta/generation/text?version={version}"

response = requests.post(url=generation_endpoint, headers=headers, json=body)
response.raise_for_status()  # stop here on an HTTP error instead of parsing an error body
print("Raw JSON response:\n", json.dumps(response.json(), indent=2))

Raw JSON response:
 {
  "model_id": "google/flan-ul2",
  "created_at": "2023-09-12T18:13:54.031Z",
  "results": [
    {
      "generated_text": "a new cloud service for large language models. It is a scalable, high-performance, and secure service for training and deploying language models.",
      "generated_token_count": 33,
      "input_token_count": 20,
      "stop_reason": "EOS_TOKEN"
    }
  ],
  "system": {
    "warnings": [
      {
        "message": "This model is a Non-IBM Product governed by a third-party license that may impose use restrictions and other obligations. By using this model you agree to its terms as identified in the following URL. URL: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-models.html?context=wx"
      }
    ]
  }
}
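
In practice we usually want only the generated text rather than the whole JSON document. The following is a minimal sketch that pulls the text and stop reason out of a response shaped like the one above. `extract_generations` is a hypothetical helper, and `sample_response` is a trimmed copy of that output used as sample data.

```python
# Trimmed copy of the response shown above, used as sample data
sample_response = {
    "model_id": "google/flan-ul2",
    "results": [
        {
            "generated_text": "a new cloud service for large language models. It is a scalable, high-performance, and secure service for training and deploying language models.",
            "generated_token_count": 33,
            "input_token_count": 20,
            "stop_reason": "EOS_TOKEN",
        }
    ],
}


def extract_generations(payload):
    """Return a list of (generated_text, stop_reason) tuples from a response dict."""
    return [
        (result.get("generated_text", ""), result.get("stop_reason"))
        for result in payload.get("results", [])
    ]


for text, stop_reason in extract_generations(sample_response):
    print(f"[{stop_reason}] {text}")
```

With a live request, `response.json()` would take the place of `sample_response`.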


## Using GET requests to retrieve data

The GET method is used to retrieve information from Watsonx.ai using a given URL.

### GET /models

This endpoint returns the list of all models currently supported by Watsonx.ai.

In [11]:
model_endpoint = f"{ibm_cloud_url}/ml/v1-beta/foundation_model_specs?version={version}"
response = requests.get(url=model_endpoint, headers=headers)
models = response.json()['resources']

print(f"{len(models)} models supported by Watsonx.ai")

# Uncomment below to see the raw JSON
# print(json.dumps(models, indent=2))

# prettify the output just to see the model names
model_labels = [m['label'] for m in models]
print(json.dumps(model_labels, indent=2))

7 models supported by Watsonx.ai
[
  "starcoder-15.5b",
  "mt0-xxl-13b",
  "gpt-neox-20b",
  "flan-t5-xxl-11b",
  "flan-ul2-20b",
  "mpt-7b-instruct2",
  "llama-2-70b-chat"
]
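
Once the list is available, it can be filtered locally. As a small sketch, the helper below searches model labels for a keyword. `find_models` is a hypothetical name, and `sample_models` reuses only the `label` values printed above; real entries returned by the endpoint carry more fields.

```python
# Labels copied from the output above; real entries contain more fields
sample_models = [
    {"label": "starcoder-15.5b"},
    {"label": "mt0-xxl-13b"},
    {"label": "gpt-neox-20b"},
    {"label": "flan-t5-xxl-11b"},
    {"label": "flan-ul2-20b"},
    {"label": "mpt-7b-instruct2"},
    {"label": "llama-2-70b-chat"},
]


def find_models(models, keyword):
    """Return the labels that contain keyword, ignoring case."""
    keyword = keyword.lower()
    return [m["label"] for m in models if keyword in m.get("label", "").lower()]


print(find_models(sample_models, "flan"))
# prints ['flan-t5-xxl-11b', 'flan-ul2-20b']
```

Passing the `models` list from the cell above instead of `sample_models` would run the same search against the live results.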


## Conclusion

Although we've used Python as our language of choice in this lab, a REST API lets essentially any language use watsonx.ai programmatically. This means clients are free to integrate watsonx.ai into their existing tech stack however they choose.