Copyright © 2024, SAS Institute Inc., Cary, NC, USA.  All Rights Reserved. SPDX-License-Identifier: Apache-2.0
# Register Azure OpenAI GPT Model Using REST API Calls
This notebook example walks you through the steps required to use a GPT-3.5-Turbo model from Azure OpenAI in SAS® Model Manager and SAS® Intelligent Decisioning.
*** 
## Table of Contents 
1. [Introduction](#Introduction)
1. [Deploying a GPT Model in Azure OpenAI](#Deploying-a-GPT-Model-in-Azure-OpenAI)
1. [Considerations for Key Management](#Considerations-for-Key-Management)
1. [Integration with SAS Model Manager](#Integration-with-SAS-Model-Manager)
1. [Integration with SAS Intelligent Decisioning](#Integration-with-SAS-Intelligent-Decisioning)
1. [Conclusion](#Conclusion)
***
## Introduction 
[Azure OpenAI Service](https://azure.microsoft.com/en-us/products/ai-services/openai-service/) provides access to Large Language Models (LLMs) through a REST API, a Python SDK, and a web-based interface in Azure OpenAI Studio. Various models are available out of the box, including the GPT-4, GPT-4 Turbo with Vision, GPT-3.5-Turbo, and Embeddings model series. For more information about the Azure OpenAI Service, see [this article](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview). To incorporate these models in SAS Intelligent Decisioning and SAS Model Manager, you create a Python function and score code that call the REST API of a GPT-3.5-Turbo model deployed in Azure OpenAI. At the time of writing this notebook, the Python SDK does not support all calls to each model type, but that might change in the future. This example also walks you through the steps for deploying the model, based on what was required when this example was created.
*** 
## Deploying a GPT Model in Azure OpenAI
Access to the Azure OpenAI Service must be requested and approved. You can determine whether you qualify for access and submit your request using [this form](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUNTZBNzRKNlVQSFhZMU9aV09EVzYxWFdORCQlQCN0PWcu). After access is granted, you must create an Azure OpenAI Service resource and deploy your model. [This tutorial](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal) walks you through setting up an Azure OpenAI Service resource and deploying a model in a few simple steps. GPT-3.5-Turbo is used for this example, but not all models are available in all regions. Next, it is recommended that you walk through [this tutorial](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?pivots=rest-api&tabs=command-line%2Cpython) with REST as your preferred method. Make sure that you save your endpoint, API key, and deployment name, because they are needed for the next steps. Run the cell below and then enter each of them in the input boxes that appear.

In [None]:
import getpass

endpoint = input("Endpoint: ")
deployment_name = input("Deployment name: ")
api_key = getpass.getpass("API Key: ")
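
The endpoint and deployment name collected above are later combined into the chat completions URL that the score code calls. As a minimal sketch, the helper below assembles that URL; the name `build_chat_url` is hypothetical, and the path and `api-version` value simply mirror the URL string used later in this notebook.

```python
def build_chat_url(endpoint, deployment_name, api_version="2023-05-15"):
    """Assemble the chat completions URL used later in this notebook."""
    # Drop surrounding whitespace and any trailing slash so the path joins cleanly
    base = endpoint.strip().rstrip("/")
    return (
        f"{base}/openai/deployments/{deployment_name.strip()}"
        f"/chat/completions?api-version={api_version}"
    )


# Example with placeholder values (not a real resource)
print(build_chat_url("https://example-resource.openai.azure.com/", "my-deployment"))
```

Normalizing the trailing slash avoids a double slash in the path when the endpoint is pasted directly from the Azure portal.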

***
## Considerations for Key Management
Although the cell above collects your API key, it uses Python's `getpass` module to hide the key from view as you type it. With the endpoint, deployment name, and API key, anyone can make calls to the deployed LLM. Each call incurs a cost, so you need to prevent unauthorized usage. You should take steps to protect the API key from falling into the wrong hands, but you must use the API key in the score code to call the deployed LLM. Here are a few key management options and the potential risks of each.

First, you can include the key in the score code used in SAS Model Manager and SAS Intelligent Decisioning. This gives anyone with access to the models or Python code files in SAS Viya the ability to view the key, and thus, use the model. You can restrict access to these resources in SAS Viya by leveraging the [access control rules](https://documentation.sas.com/?cdcId=sasadmincdc&cdcVersion=default&docsetId=evfun&docsetTarget=n1uw3er96phzpfn1pxvnf01f6sw3.htm) that are built into SAS Viya so that only a select few can view the models or Python code. 

This approach can be improved slightly by encoding the key with Python's `base64` module and then adding logic to the score code to decode the key before making the REST API call. Base64 is an encoding, not encryption, so anyone can decode the encoded key, but the key is no longer stored as plain text. You can encode your key using this code:

In [9]:
import base64
key = 'example_key'
en = base64.b64encode(key.encode('ascii')).decode('ascii')
en

'ZXhhbXBsZV9rZXk='

And you can decode your key by using this code: 

In [10]:
de = base64.b64decode(en.encode('ascii')).decode('ascii')
de

'example_key'
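
Inside score code, the decoding step would sit just before the request headers are built. The sketch below shows one way to do that; the helper name `build_headers` is hypothetical, and the header fields are the same ones used in the score code later in this notebook.

```python
import base64


def build_headers(encoded_key):
    """Decode a Base64-encoded API key and return the request headers."""
    # Reverse the encoding shown above to recover the plain-text key
    key = base64.b64decode(encoded_key.encode("ascii")).decode("ascii")
    return {
        "Accept": "application/json",
        "Content-type": "application/json; charset=utf-8",
        "api-key": key,
    }


# 'ZXhhbXBsZV9rZXk=' is the encoded form of 'example_key' from the cell above
headers = build_headers("ZXhhbXBsZV9rZXk=")
```

With this approach, only the encoded string appears in the score code file.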

Anyone with access to the model, decision, or deployment can use it to make a call to the deployed LLM. If you want to restrict usage to individuals who have the key, you can add the key or the encoded key as an input to a model or decision. When calling the deployed model or decision, the individual must then pass the key as an input variable. When you perform score tests or publishing validation tests in CAS, this approach writes the key to the CAS table, where others can view it. Also, when you score using the SAS Micro Analytic Service, the key can be written to the log, which administrators can view.

Finally, with help from an administrator, you can save the keys as environment variables in the Python environment within SAS Viya. This gives anyone with access to the model, decision, or deployment the ability to call the deployed LLM, but it hides the key from them. For example, in a Python environment where the model is executed, you can save an environment variable in this way:

In [11]:
import os
os.environ['my_gpt_key'] = key

And retrieve the key in your code like so:

In [12]:
k = os.environ['my_gpt_key']
k

'example_key'
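
If the environment variable has not been set in the Python environment that runs the model, the lookup above raises a bare `KeyError`. A small sketch of a more explicit lookup follows; the helper name `get_api_key` is hypothetical, and the default variable name is the one used in the example above.

```python
import os


def get_api_key(var_name="my_gpt_key"):
    """Return the API key stored in an environment variable, or raise a clear error."""
    key = os.environ.get(var_name)
    if not key:
        # Fail early with a message that names the missing variable
        raise RuntimeError(f"Environment variable '{var_name}' is not set")
    return key
```

In score code, `k = get_api_key()` could then replace the hard-coded key assignment.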

The key management techniques above can be combined to select the best level of security for your needs. You must update the code blocks below to take advantage of the techniques that you have chosen. 
*** 
## Integration with SAS Model Manager
To run the model in SAS Model Manager, you must write the score code and save it as a Python (.py) file. You must also specify the model's inputs, the outputs that the score code generates, and the properties of the model. You can then register everything in SAS Model Manager directly from this notebook using the python-sasctl package. First, use python-sasctl to authenticate with the target SAS Viya environment. Run the cell below and enter your information to create a session with SAS Viya.

In [None]:
import getpass  # also imported earlier; repeated so this cell runs on its own

from sasctl import Session
from sasctl.services import model_repository as mr, model_management as mm

host = input("Host name: ")
username = input("User name: ")
password = getpass.getpass("Password: ")

sess = Session(host, username, password)
sess

Next, write the score code. Keep in mind your chosen key management techniques and edit the score code to reflect your choice. In the code block below, you must perform these steps: 
1. Choose a name for your model and update the `writefile` line with that name. This writes the file to your current directory, but you can edit this line to save the file to another directory. 
2. Update the endpoint and deployment name within the URL string.
3. Update the key variable to reflect your key management strategy, and make any necessary changes within SAS Viya.
4. Run the block to write the score code file. 

In [None]:
%%writefile [INSERT-YOUR-MODEL-NAME].py
import requests

def score(prompt):
    "Output: answer_llm,finish_reason"
    
    # Chat completions endpoint of the deployed model
    url = '[INSERT-YOUR-ENDPOINT]/openai/deployments/[INSERT-YOUR-DEPLOYMENT-NAME]/chat/completions?api-version=2023-05-15'

    # API key; adjust this line to match your key management strategy
    k = '[INSERT-YOUR-API-KEY]'

    h = {"Accept": "application/json", "Content-type": "application/json; charset=utf-8", "api-key": k}

    # Send the prompt as a single user message
    data = {"messages": [{"role": "user", "content": prompt}]}

    response = requests.post(url, json=data, headers=h)

    # Extract the generated text and the reason that generation stopped
    jsonResponse = response.json()
    finish_reason = jsonResponse["choices"][0]['finish_reason']
    answer_llm = jsonResponse["choices"][0]['message']['content']

    return answer_llm, finish_reason
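
The request body in the score code contains only the user message. As a hedged sketch, the helper below shows how the body could be extended with a system message and with the optional `max_tokens` and `temperature` parameters of the chat completions API. The helper name `build_payload` is hypothetical, and the values in the example are illustrative only.

```python
def build_payload(prompt, system_message=None, max_tokens=None, temperature=None):
    """Build a chat completions request body for a single user prompt."""
    messages = []
    if system_message:
        # An optional system message sets the assistant's overall behavior
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": prompt})

    payload = {"messages": messages}
    # Include optional parameters only when they are provided
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        payload["temperature"] = temperature
    return payload


data = build_payload(
    "Write a tagline for an ice cream shop.",
    system_message="You are a helpful marketing assistant.",
    max_tokens=100,
)
```

Limiting `max_tokens` caps the length of the answer; when the cap is reached, the returned `finish_reason` is typically `length` rather than `stop`.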

Next, add the file to SAS Model Manager and update its properties. In the code block below, you must perform these steps:
1. In the first three lines of code, specify the name of your project in SAS Model Manager, the name of your model, and the algorithm of your deployed LLM, such as GPT 3.5 Turbo. 
2. Update the `add_model_content` call with your model name in the two indicated locations. 
After the code block has run, you can open your project in SAS Model Manager to find your model. Using SAS Model Manager, you can run a score test for the model or deploy it to other destinations, including a container. 

In [None]:
# Update these variables to match your project
project = '[INSERT-YOUR-PROJECT-NAME]'
model_name = '[INSERT-YOUR-MODEL-NAME]'
algorithm = '[INSERT-YOUR-LLM-ALGORITHM]'

# Specify input variables and output variables
inputvariables = [{'name': 'prompt', 'role': 'input', 'type': 'string', 'level': 'nominal', 'length': 500}]
outputvariables = [{'name': 'answer_llm', 'role': 'output', 'type': 'string', 'level': 'nominal', 'length': 500}, {'name': 'finish_reason', 'role': 'output', 'type': 'string', 'level': 'nominal', 'length': 15}]

# Create the model
model = mr.create_model(
    model=model_name,
    project=project,
    algorithm=algorithm,
    modeler=username,
    tool='Python 3',
    function = "Text Generation",
    score_code_type = 'Python',
    input_variables = inputvariables,
    output_variables = outputvariables
)

# Add score code; the with block closes the file after the upload
with open('[INSERT-YOUR-MODEL-NAME].py', 'rb') as f:
    scorefile = mr.add_model_content(
        model,
        f,
        name='[INSERT-YOUR-MODEL-NAME].py',
        role='score'
    )
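
The output variables above declare `answer_llm` with a length of 500, while an LLM answer can be longer. How SAS Viya treats longer values is not covered in this notebook, so the following is only a sketch of one precaution: trimming the answer inside the score code before returning it. The helper name `fit_to_length` is hypothetical, and it counts characters; if the declared length is measured in bytes, multibyte text would need a byte-based check instead.

```python
def fit_to_length(text, max_length=500):
    """Trim text so that it does not exceed the declared output length."""
    if len(text) <= max_length:
        return text
    # Keep the first max_length characters and drop the rest
    return text[:max_length]


answer_llm = fit_to_length("A" * 600)
print(len(answer_llm))
```

Alternatively, the declared length in `outputvariables` could be increased to suit the answers that you expect.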


***
## Integration with SAS Intelligent Decisioning 
You can include the Python model that you just developed and registered in SAS Model Manager within a decision flow. If you do not have SAS Model Manager, you can use a Python code file in SAS Intelligent Decisioning instead. To run the GPT-3.5-Turbo model, you must create an execute function that you can then copy and paste into a Python code file in SAS Intelligent Decisioning. The same key management concerns apply, so update the code block below to reflect your key management strategy. In the code block below, you must perform these steps: 
1. Update the endpoint and deployment name within the URL string. 
2. Update the key variable to reflect your key management strategy, and make any necessary changes within SAS Viya. 
3. Copy the code block into a Python code file in SAS Intelligent Decisioning. 

In [None]:
''' List all output parameters as comma-separated values in the "Output:" docString. Do not specify "None" if there is no output parameter. '''
''' List all Python packages that are not built-in packages in the "DependentPackages:" docString. Separate the package names with commas on a single line. '''


import requests

def execute(prompt):
    'Output:answer_llm,finish_reason'
    'DependentPackages: requests'

    # Chat completions endpoint of the deployed model
    url = '[INSERT-YOUR-ENDPOINT]/openai/deployments/[INSERT-YOUR-DEPLOYMENT-NAME]/chat/completions?api-version=2023-05-15'

    # API key; adjust this line to match your key management strategy
    k = '[INSERT-YOUR-API-KEY]'

    h = {"Accept": "application/json", "Content-type": "application/json; charset=utf-8", "api-key": k}

    # Send the prompt as a single user message
    data = {"messages": [{"role": "user", "content": prompt}]}

    response = requests.post(url, json=data, headers=h)

    # Extract the generated text and the reason that generation stopped
    jsonResponse = response.json()
    finish_reason = jsonResponse["choices"][0]['finish_reason']
    answer_llm = jsonResponse["choices"][0]['message']['content']

    return answer_llm, finish_reason


After you make those changes, the code block above defines a valid Python function. You can test it by calling the execute function: 

In [None]:
execute("Write a tagline for an ice cream shop.")
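
If the test call fails with a `KeyError` on `choices`, the service probably returned an error body instead of a completion, for example because of a wrong key or deployment name. The sketch below shows one defensive way to read the response. The helper name `parse_chat_response` is hypothetical, the assumed error shape is `{"error": {"message": ...}}`, and returning `"error"` as the finish reason is a convention chosen for this example, not part of the API.

```python
def parse_chat_response(json_response):
    """Return (answer_llm, finish_reason) from a chat completions response body."""
    if "error" in json_response:
        # Assumed error shape: {"error": {"code": ..., "message": ...}}
        message = json_response["error"].get("message", "unknown error")
        return f"ERROR: {message}", "error"

    choice = json_response["choices"][0]
    # The content can be missing, for example when a response is filtered
    answer = choice.get("message", {}).get("content") or ""
    return answer, choice.get("finish_reason") or ""


# Example with hand-written response bodies (no call to the service is made)
ok = {"choices": [{"message": {"content": "Scoops of joy!"}, "finish_reason": "stop"}]}
bad = {"error": {"code": "401", "message": "Access denied"}}
print(parse_chat_response(ok))
print(parse_chat_response(bad))
```

Returning a value for both outputs in every case keeps the function signature consistent with the declared outputs.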

*** 
## Conclusion
You are now ready to use a GPT-3.5-Turbo model that is deployed in Azure OpenAI with SAS Model Manager or SAS Intelligent Decisioning. The model can then be managed alongside other models within your organization, combined with business logic, orchestrated with other models, and deployed to destinations within SAS Viya or beyond using containers.

***