Copyright © 2024, SAS Institute Inc., Cary, NC, USA.  All Rights Reserved. SPDX-License-Identifier: Apache-2.0
# Register Azure OpenAI GPT Model REST API Calls
In this notebook, we will walk through the steps required for leveraging a GPT-3.5-Turbo model from Azure OpenAI in SAS® Model Manager® and SAS® Intelligent Decisioning®. 
*** 
## Table of Contents 
1. [Introduction](#Introduction)
1. [Deploying a GPT Model in Azure OpenAI](#Deploying-a-GPT-Model-in-Azure-OpenAI)
1. [Considerations for Key Management](#Considerations-for-Key-Management)
1. [Integration with SAS Model Manager](#Integration-with-SAS-Model-Manager)
1. [Integration with SAS Intelligent Decisioning](#Integration-with-SAS-Intelligent-Decisioning)
1. [Conclusion](#Conclusion)
***
## Introduction 
[Azure OpenAI Service](https://azure.microsoft.com/en-us/products/ai-services/openai-service/) provides access to Large Language Models (LLMs) through a REST API, a Python SDK, and a web-based interface in Azure OpenAI Studio. Various models are available out of the box, including the GPT-4, GPT-4 Turbo with Vision, GPT-3.5-Turbo, and Embeddings model series. You can read more about Azure OpenAI Service [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview). To incorporate these models in SAS Intelligent Decisioning and SAS Model Manager, we will create a Python function and score code that call the REST API of a GPT-3.5-Turbo model deployed in Azure OpenAI. At the time of writing this notebook, the Python SDK does not support all calls to each model type, but that may change in the future. Additionally, we will walk through the steps required for deploying the model, based on what is required at the time of writing.
*** 
## Deploying a GPT Model in Azure OpenAI
At the time of writing, access to Azure OpenAI Service must be requested and approved. You can determine if you qualify for access and submit your request via [this form](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUNTZBNzRKNlVQSFhZMU9aV09EVzYxWFdORCQlQCN0PWcu). Once access is granted, you must create an Azure OpenAI Service resource and deploy your model. [This tutorial](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal) walks through setting up an Azure OpenAI Service resource and deploying a model in a few simple steps. For this example, I’ve used GPT-3.5-Turbo, but not all models are available in all regions. Next, I recommend walking through [this tutorial](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?pivots=rest-api&tabs=command-line%2Cpython) with REST as your preferred method. Importantly, save your endpoint, API key, and deployment name. We will need them for the next steps. Run the cell below to enter each of them in the input boxes that appear.

In [None]:
import getpass

endpoint = input("Endpoint: ")
deployment_name = input("Deployment name: ")
api_key = getpass.getpass("API Key: ")
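
The three values collected above fit together into the request URL and headers used by the score code later in this notebook. The following is a minimal sketch of that assembly. The helper name `build_request` and the placeholder values are hypothetical, and the API version is the one used in the score code in this notebook.

```python
def build_request(endpoint, deployment_name, api_key, api_version="2023-05-15"):
    """Return the chat-completions URL and headers for an Azure OpenAI deployment."""
    # Strip a trailing slash so the URL does not contain '//' after the host
    base = endpoint.rstrip("/")
    url = (
        f"{base}/openai/deployments/{deployment_name}"
        f"/chat/completions?api-version={api_version}"
    )
    headers = {
        "Accept": "application/json",
        "Content-type": "application/json; charset=utf-8",
        "api-key": api_key,
    }
    return url, headers


# Placeholder values for illustration only (not a real resource or key)
url, headers = build_request(
    "https://example-resource.openai.azure.com/",
    "example-deployment",
    "example_key",
)
print(url)
```

Building the URL in one place makes it easier to spot a missing deployment name or a doubled slash before any request is sent.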

***
## Considerations for Key Management
The cell above collects your API key with Python's getpass module, which hides the key from view as you type it. With our endpoint, deployment name, and API key, anyone can make calls to our deployed LLM. Each call incurs a cost, so we need to prevent unauthorized usage. We want to take steps to protect our API key from falling into the wrong hands, but we need to use our API key in our score code to call our deployed LLM. Let’s outline a few key management options and discuss the potential risks of each.

First, we can hardcode our key into the score code used in SAS Model Manager and SAS Intelligent Decisioning. This will give anyone with access to our models or Python code files on SAS Viya the ability to view our key, and thus, use our model. We can restrict access to these resources in SAS Viya by leveraging the [access control rules](https://go.documentation.sas.com/doc/en/sasadmincdc/v_049/evfun/n1uw3er96phzpfn1pxvnf01f6sw3.htm) built into SAS Viya so only a select few can see our models or Python code. 

We can make this approach a bit better by encoding our key using Python's base64 module and then adding logic to our score code to decode the key prior to making the REST API call. Base64 is an encoding, not encryption, so anyone can decode our encoded key, but the key is no longer stored in plain text. You can encode your key via:

In [9]:
import base64
key = 'example_key'
en = base64.b64encode(key.encode('ascii')).decode('ascii')
en

'ZXhhbXBsZV9rZXk='

And you can decode your key by: 

In [10]:
de = base64.b64decode(en.encode('ascii')).decode('ascii')
de

'example_key'

Anyone with access to our model, decision, or deployment can use it to make a call to our deployed LLM. If we want to restrict usage to only folks with our key, we can add our key or our encoded key as an input to the model or decision. When calling the deployed model or decision, the individual must also pass the key as an input variable. When performing score tests in CAS or publishing validation, this will write the key to the CAS table, where it may be viewable to others. Additionally, when scoring via MAS, this may write the key to the log viewable by administrators. 
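
As a rough sketch of this option, the score code could accept the encoded key as a second input and decode it before building the request headers. The variable name `encoded_key` and the helper `headers_from_input` are hypothetical. The input variable dictionary follows the format used later in this notebook when registering the model, and its length of 500 is an arbitrary choice.

```python
import base64

# Extra input variable for the key, to be listed alongside the prompt variable
# when registering the model (the name 'encoded_key' is hypothetical)
key_variable = {'name': 'encoded_key', 'role': 'input', 'type': 'string',
                'level': 'nominal', 'length': 500}


def headers_from_input(encoded_key):
    """Decode the Base64-encoded key passed in by the caller and build the headers."""
    k = base64.b64decode(encoded_key.encode('ascii')).decode('ascii')
    return {"Accept": "application/json",
            "Content-type": "application/json; charset=utf-8",
            "api-key": k}


# Using the encoded example key from the cells above
print(headers_from_input('ZXhhbXBsZV9rZXk=')["api-key"])
```

With this approach, the score function signature would become something like `score(prompt, encoded_key)`, and callers without the key cannot reach the deployed LLM.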

Finally, with help from an administrator, you can save the key as an environment variable in the Python environment within SAS Viya. Anyone with access to the model, decision, or deployment can still use our deployed LLM, but the key itself is hidden from them. For example, in the Python environment where the model is to be executed, we can save an environment variable like so:

In [11]:
import os
os.environ['my_gpt_key'] = key

And use it in our code, like so:

In [12]:
k = os.environ['my_gpt_key']
k

'example_key'

The key management techniques above can be combined to select the best level of security for your needs. You will need to update the code blocks below to take advantage of the techniques you’ve chosen. 
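
For instance, the environment variable and Base64 techniques can be combined so that the variable holds the encoded key and the score code decodes it at run time. The sketch below assumes the variable is named `my_gpt_key`, as in the example above. The helper name `load_key` is hypothetical.

```python
import base64
import os


def load_key(var_name='my_gpt_key'):
    """Read a Base64-encoded key from an environment variable and decode it."""
    encoded = os.environ.get(var_name)
    if encoded is None:
        # Fail with a clear message rather than sending a request without a key
        raise RuntimeError(f"Environment variable {var_name!r} is not set")
    return base64.b64decode(encoded.encode('ascii')).decode('ascii')


# Demonstration only: in practice an administrator sets the variable
os.environ['my_gpt_key'] = 'ZXhhbXBsZV9rZXk='
print(load_key())
```

In the score code, the line that assigns the key would then call `load_key()` instead of using a hardcoded string.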
*** 
## Integration with SAS Model Manager
To run our model in SAS Model Manager, we need to write our score code and save it as a .py file. We also need to specify the inputs to our score code, the outputs the score code generates, and properties of the model. Then we can register it all into SAS Model Manager directly from this notebook using the python-sasctl package. First, let’s use python-sasctl to authenticate to our target SAS Viya environment. Run the cell below to input your information to create a session with SAS Viya. 

In [None]:
from sasctl import Session
from sasctl.services import model_repository as mr, model_management as mm

host = input("Hostname: ")
username = input("Username: ")
password = getpass.getpass("Password: ")

sess = Session(host, username, password)
sess

Next, let’s write our score code. Keep in mind your chosen key management techniques and edit the score code to reflect your choice. In the block below, you need to: 
1. Pick a name for your model and update the `%%writefile` line to match it. The file is written to your current directory, but you can edit this line to save it to another directory.
1. Update the endpoint and deployment name within the URL string.
1. Update the key variable to reflect your key management strategy as well as make any necessary changes within SAS Viya.

Run the block to write the score code file. 

In [None]:
%%writefile [INSERT-YOUR-MODEL-NAME].py
import requests

def score(prompt):
    "Output: answer_llm,finish_reason"
    
    url = '[INSERT-YOUR-ENDPOINT]/openai/deployments/[INSERT-YOUR-DEPLOYMENT-NAME]/chat/completions?api-version=2023-05-15'

    k =  '[INSERT-YOUR-API-KEY]'
	
    h =  {"Accept": "application/json", "Content-type": "application/json; charset=utf-8", "api-key": k}

    data = {"messages":[{"role": "user", "content": prompt}]}

    response = requests.post(url, json = data , headers = h )

    jsonResponse = response.json()   
    finish_reason = jsonResponse["choices"][0]['finish_reason']
    answer_llm = jsonResponse["choices"][0]['message']['content']


    return answer_llm,finish_reason

Next, let’s add our file to SAS Model Manager and update its properties. In the block below, you need to:
1. Add the name of your project in SAS Model Manager, your chosen name for the model, and the algorithm of your deployed LLM, such as GPT-3.5-Turbo, in the first three lines of code.
1. Update the add_model_content function with your model name in the two indicated locations.

Once the code block has run, you can open your project in SAS Model Manager to find your model! From SAS Model Manager, you can run the model in a score test or deploy it to other destinations, including a container.

In [None]:
# Update these variables to match your project
project = '[INSERT-YOUR-PROJECT-NAME]'
model_name = '[INSERT-YOUR-MODEL-NAME]'
algorithm = '[INSERT-YOUR-LLM-ALGORITHM]'

# Specify input variables and output variables
inputvariables = [{'name': 'prompt', 'role': 'input', 'type': 'string', 'level': 'nominal', 'length': 500}]
outputvariables = [{'name': 'answer_llm', 'role': 'output', 'type': 'string', 'level': 'nominal', 'length': 500}, {'name': 'finish_reason', 'role': 'output', 'type': 'string', 'level': 'nominal', 'length': 15}]

# Create the model
model = mr.create_model(
    model=model_name,
    project=project,
    algorithm=algorithm,
    modeler=username,
    tool='Python 3',
    function = "Text Generation",
    score_code_type = 'Python',
    input_variables = inputvariables,
    output_variables = outputvariables
)

# Add score code (the with block closes the file once it has been uploaded)
with open('[INSERT-YOUR-MODEL-NAME].py', 'rb') as f:
    scorefile = mr.add_model_content(
        model,
        f,
        name='[INSERT-YOUR-MODEL-NAME].py',
        role='score'
    )


***
## Integration with SAS Intelligent Decisioning 
We can use the Python model we just developed and registered in SAS Model Manager as a model within our decision flow. But, if you don’t have SAS Model Manager, no need to worry! We can leverage a Python code file in SAS Intelligent Decisioning instead. To run the GPT-3.5-Turbo model, we need to write an execute function and then copy and paste it into a Python code file in SAS Intelligent Decisioning. The same key management concerns apply, so update the code block below to reflect your key management strategy. In the code block below, you need to:
1. Update the endpoint and deployment name within the URL string. 
1. Update the key variable to reflect your key management strategy as well as make any necessary changes within SAS Viya. 
1. Copy the code block into a Python code file in SAS Intelligent Decisioning. 

In [None]:
''' List all output parameters as comma-separated values in the "Output:" docString. Do not specify "None" if there is no output parameter. '''
''' List all Python packages that are not built-in packages in the "DependentPackages:" docString. Separate the package names with commas on a single line. '''


import requests

def execute (prompt):
   'Output:answer_llm,finish_reason'
   'DependentPackages: requests'
   
   url = '[INSERT-YOUR-ENDPOINT]/openai/deployments/[INSERT-YOUR-DEPLOYMENT-NAME]/chat/completions?api-version=2023-05-15'

   k =  '[INSERT-YOUR-API-KEY]'
	
   h =  {"Accept": "application/json", "Content-type": "application/json; charset=utf-8", "api-key": k}

   data = {"messages":[{"role": "user", "content": prompt}]}

   response = requests.post(url, json = data , headers = h )

   jsonResponse = response.json()   
   finish_reason = jsonResponse["choices"][0]['finish_reason']
   answer_llm = jsonResponse["choices"][0]['message']['content']


   return answer_llm,finish_reason


Once you have made those changes to the code block above, it becomes a valid Python function, and you can test it here by calling the execute function, like so:

In [None]:
execute("Write a tagline for an ice cream shop.")
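
If the call fails, for example because of a wrong key or deployment name, the response body may not contain a `choices` list, and the indexing in the function above would raise an exception. The sketch below shows one way to parse the response more defensively. It assumes that failed calls return a JSON body with an `error` object containing a `message` field. The helper name `parse_chat_response` and the sample responses are hypothetical.

```python
def parse_chat_response(json_response):
    """Return (answer_llm, finish_reason) from a chat-completions response body."""
    # Assumption: failed calls return an "error" object instead of "choices"
    if "error" in json_response:
        message = json_response["error"].get("message", "unknown error")
        return message, "error"

    choices = json_response.get("choices", [])
    if not choices:
        return "", "no_choices"

    first = choices[0]
    # Fall back to an empty string if "content" is missing or None
    answer = first.get("message", {}).get("content") or ""
    return answer, first.get("finish_reason") or ""


# Hypothetical response bodies for illustration
ok = {"choices": [{"finish_reason": "stop",
                   "message": {"role": "assistant", "content": "Scoops of joy!"}}]}
bad = {"error": {"code": "401", "message": "Access denied"}}

print(parse_chat_response(ok))
print(parse_chat_response(bad))
```

Returning short values such as `error` keeps `finish_reason` within the length of 15 declared for that output variable when the model was registered. Whether to surface the error message in `answer_llm` or raise an exception instead is a design choice that depends on how the decision flow should react to failures.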

*** 
## Conclusion
Now, you are all set to leverage a GPT-3.5-Turbo model deployed in Azure OpenAI in SAS Model Manager or SAS Intelligent Decisioning, where it can be managed with other models in your organization, combined with business logic, orchestrated with other models, and deployed into destinations within SAS Viya or beyond using containers.

***