# 01 - Working with the Azure OpenAI API directly

In this lab, we will perform a couple of simple calls to the Azure OpenAI API.
- The first call will allow us to find out which Model Deployments are available for the Azure OpenAI API.
- The second call will send a prompt to the Azure OpenAI API.

## Setup

First, we load the Azure OpenAI API key, API version, and resource endpoint from environment variables (read from the `.env` file).

In [None]:
# Install the required packages (requests is imported in the next cell)
%pip install python-dotenv requests

In [None]:
import json
import requests
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
API_VERSION = os.getenv("OPENAI_API_VERSION")
RESOURCE_ENDPOINT = os.getenv("OPENAI_API_BASE")
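
If one of these variables is missing from the `.env` file, `os.getenv` returns `None`, and the problem only surfaces later, for example as a `TypeError` when the URL is concatenated. A minimal sketch of an early check follows; `missing_settings` is a hypothetical helper for illustration and is not part of the lab.

```python
import os

# Names of the environment variables the lab reads with os.getenv above.
REQUIRED_SETTINGS = (
    "AZURE_OPENAI_API_KEY",
    "OPENAI_API_VERSION",
    "OPENAI_API_BASE",
)


def missing_settings(env=None, required=REQUIRED_SETTINGS):
    """Return the names in `required` that are unset or empty in `env`.

    Hypothetical helper for illustration; `env` defaults to the
    process environment.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```

Running this right after `load_dotenv()` would point at a misnamed or empty entry before any request is sent.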

Next, we'll set the endpoint that we're going to call. This endpoint will return a list of available model deployments.

In [None]:
url = RESOURCE_ENDPOINT + "/openai/deployments?api-version=" + API_VERSION

print(url)

Above you will see the full URL that we are going to be calling.
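
Plain string concatenation works as long as `OPENAI_API_BASE` has no trailing slash. A sketch of a slightly more defensive version is shown here; `deployments_url` is a hypothetical helper, and the endpoint and version values are placeholders rather than values from the lab.

```python
from urllib.parse import urlencode


def deployments_url(resource_endpoint, api_version):
    """Build the list-deployments URL from an endpoint and API version.

    Hypothetical helper for illustration; it strips a trailing slash from
    the endpoint so the path does not end up with a double slash.
    """
    base = resource_endpoint.rstrip("/")
    query = urlencode({"api-version": api_version})
    return f"{base}/openai/deployments?{query}"


# Placeholder values (assumptions); substitute your own endpoint and version.
print(deployments_url("https://example-resource.openai.azure.com/", "2023-05-15"))
```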

## Get the list of available Model Deployments

Now let's call that Azure OpenAI endpoint and see what it returns. We pass the API key in the HTTP header.

In [None]:
r = requests.get(url, headers={"api-key": API_KEY})

print(r.text)

Above, there should now be a JSON-formatted list of model deployments. The output will contain one or more sections similar to the following:

```
    {
      "scale_settings": {
        "scale_type": "standard"
      },
      "model": "gpt-35-turbo",
      "owner": "organization-owner",
      "id": "gpt35turbo",
      "status": "succeeded",
      "created_at": 1684150536,
      "updated_at": 1684150536,
      "object": "deployment"
    }
```

In the output, you'll see the **Model Name** as the `model` value and the **Deployment Name** as the `id` value. We'll be using an `id` value for the next section.
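
These two values can also be picked out in code rather than by eye. The sketch below assumes the deployments arrive as a list under a top-level `data` key; that wrapper is an assumption, since only a single entry is shown above, and `deployment_names` is a hypothetical helper. The sample dictionary reuses fields from the example entry.

```python
def deployment_names(payload):
    """Map each deployment name (`id`) to its model name (`model`).

    Assumes `payload` is the parsed JSON body and that the deployments
    sit in a list under a top-level "data" key (an assumption).
    """
    return {item["id"]: item["model"] for item in payload.get("data", [])}


# Sample built from the entry shown above, wrapped in the assumed "data" list.
sample = {
    "data": [
        {"model": "gpt-35-turbo", "id": "gpt35turbo", "status": "succeeded"}
    ]
}

print(deployment_names(sample))
```

With a live response, this would be called as `deployment_names(r.json())`.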

## Send a prompt to Azure OpenAI using the API

Next, let's call the Azure OpenAI API with a prompt. To do this, we'll need the `id` of our completion model deployment.

We've already set up this value in the `.env` file, but you should see a matching value in the list that was output above.

In [None]:
DEPLOYMENT_ID = os.getenv("AZURE_OPENAI_COMPLETION_DEPLOYMENT_NAME")

As before, we'll construct a URL to call. This time, we'll also be creating a payload.

In [None]:
url = RESOURCE_ENDPOINT + "/openai/deployments/" + DEPLOYMENT_ID + "/completions?api-version=" + API_VERSION

print(url)

Again, you will see the full URL that we are going to call. This URL will use the specified deployment to call the **completions** API.

Next, we will call the Azure OpenAI API using the URL above. Just like last time, we pass the API key in the HTTP header. We also send a JSON-formatted body as part of the request, which contains the *prompt* that we want the OpenAI model to respond to. In this case, our prompt is "Once upon a time", which should cause the model to complete the prompt by generating a story.

In [None]:
r = requests.post(url, headers={"api-key": API_KEY}, json={"prompt": "Once upon a time"})

print(json.dumps(r.json(), indent=2))

As before, the result of the API call will be JSON data similar to the example below.

```
{
  "id": "cmpl-7WoRsMTvolWjs2IzK9OdI0MXHHXwu",
  "object": "text_completion",
  "created": 1688054776,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": " there was a lovely princess who have magical powers. She lived in a castle far",
      "index": 0,
      "finish_reason": "length",
      "logprobs": null
    }
  ],
  "usage": {
    "completion_tokens": 16,
    "prompt_tokens": 4,
    "total_tokens": 20
  }
}
```

Here is more information about some of the fields in that response.

Key | Description
--- | ---
`model` | The model that was used to generate the response to the prompt
`text` | This is the response that was generated by the OpenAI model
`finish_reason` | In this case, the `finish_reason` is `length`, which indicates that the response was cut short because it reached the *token limit*. We didn't specify a maximum number of tokens in our request, so the default limit of **16** applied.
`completion_tokens` | The number of tokens that were used in generating the response
`prompt_tokens` | The number of tokens that were consumed by the prompt
`total_tokens` | The total number of tokens that were consumed by the request (`prompt_tokens` + `completion_tokens`)
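
The fields in the table can be checked in code. The sketch below uses a trimmed copy of the example response above; `was_truncated` and `completion_payload` are hypothetical helpers, and the value 100 for `max_tokens` is an arbitrary choice for illustration. `max_tokens` is the request field that raises the limit described in the `finish_reason` row.

```python
def was_truncated(response):
    """Return True if any choice stopped because it hit the token limit."""
    return any(c.get("finish_reason") == "length" for c in response["choices"])


def completion_payload(prompt, max_tokens=100):
    """Build a request body that sets an explicit token limit."""
    return {"prompt": prompt, "max_tokens": max_tokens}


# Trimmed copy of the example response shown above.
example = {
    "choices": [
        {
            "text": " there was a lovely princess who have magical powers. She lived in a castle far",
            "index": 0,
            "finish_reason": "length",
        }
    ],
    "usage": {"completion_tokens": 16, "prompt_tokens": 4, "total_tokens": 20},
}

usage = example["usage"]
print(was_truncated(example))
print(usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"])
print(completion_payload("Once upon a time"))
```

The payload could then be passed as the `json` argument of the `requests.post` call used earlier, in place of the body that contains only the prompt.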

## Summary

In this lab, we used the Azure OpenAI API directly to query information about our instance of the Azure OpenAI service and to send a prompt to an OpenAI model.

## Up Next

In the next lab, we will look at using the OpenAI SDK to work with the Azure OpenAI service.