# Azure OpenAI - Responses API: Ask a Question and Get an Answer

Want to switch from the Chat Completions API to the Responses API?

This example shows what you need to change to generate output from OpenAI models when you switch to the Responses API.

### Major changes:

1. The API endpoint has changed from `chat.completions` to `responses`.
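1. `chat.completions.create` requires a `messages` array, while `responses.create` requires an `input`. `input` can be a plain string or a Chat Completions style message array, and the system prompt can be passed separately through `instructions`.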
1. The `max_tokens` key in `chat.completions.create` is `max_output_tokens` in `responses.create`.
1. The answer from the LLM can now be read directly from the response object's `output_text` attribute.
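
The parameter changes listed above can be sketched as a small translation function. This is a minimal illustration, not part of the OpenAI SDK: the helper name `to_responses_kwargs` is hypothetical, and it assumes every message `content` is a plain string.

```python
def to_responses_kwargs(chat_kwargs: dict) -> dict:
    """Translate Chat Completions style keyword arguments into the
    shape expected by `responses.create` (hypothetical helper)."""
    kwargs = dict(chat_kwargs)  # shallow copy so the caller's dict is untouched
    messages = kwargs.pop("messages", [])

    # System messages move to the separate `instructions` parameter.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    other = [m for m in messages if m["role"] != "system"]
    if system_parts:
        kwargs["instructions"] = "\n".join(system_parts)

    # A lone user message can be sent as a plain string;
    # anything longer stays a message array.
    if len(other) == 1 and other[0]["role"] == "user":
        kwargs["input"] = other[0]["content"]
    else:
        kwargs["input"] = other

    # `max_tokens` is called `max_output_tokens` in the Responses API.
    if "max_tokens" in kwargs:
        kwargs["max_output_tokens"] = kwargs.pop("max_tokens")
    return kwargs


chat_kwargs = {
    "model": "my-deployment",  # placeholder deployment name
    "messages": [
        {"role": "system", "content": "You are a super sarcastic AI assistant"},
        {"role": "user", "content": "Hello"},
    ],
    "temperature": 0.7,
    "max_tokens": 1000,
}
print(to_responses_kwargs(chat_kwargs))
```

The result could then be passed as `client.responses.create(**to_responses_kwargs(chat_kwargs))`, assuming a client like the one created later in this notebook.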

### Prerequisites:
1. Make sure that Python 3 is installed on your system.
1. Create and activate a virtual environment:
   - `python3 -m venv venv`
   - `source venv/bin/activate`
1. The required libraries are listed in the requirements.txt file. Use the following command to install them:
   - `pip3 install -r ../requirements.txt`
1. Create a `.env` file in the parent directory and add the following variables:
   - `AZURE_OPENAI_ENDPOINT=<your_azure_openai_endpoint>`
   - `AZURE_OPENAI_MODEL=<your_azure_openai_model>`
   - `AZURE_OPENAI_API_VERSION=<your_azure_openai_api_version>`
   - `AZURE_OPENAI_API_KEY=<your_azure_openai_api_key>`
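
Before running the notebook, it can help to confirm that all four variables are actually set. The sketch below uses only the standard library; the helper name `missing_env_vars` is hypothetical, and it assumes the `.env` file has already been loaded into the process environment (for example with `load_dotenv`).

```python
import os

# The four variable names listed in the prerequisites.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_MODEL",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_API_KEY",
)


def missing_env_vars(env=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```

Checking up front gives one readable message instead of a `KeyError` on the first missing name.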

## 1. Setup Environment and Import Libraries

Import the required libraries including AzureOpenAI, dotenv, and os modules for interacting with Azure OpenAI services.

In [1]:
# Import Modules
from openai import AzureOpenAI  # `AzureOpenAI` is the client class used to interact with the Azure OpenAI API.
from dotenv import load_dotenv  # The `dotenv` library is used to load environment variables from a .env file.
import os                       # Used to get the values from environment variables.

## 2. Load Environment Variables

Load environment variables from .env file including Azure OpenAI endpoint, model, API version, and API key.

In [2]:
# Load environment variables from .env file
load_dotenv("../.env")

AZURE_OPENAI_ENDPOINT        = os.environ['AZURE_OPENAI_ENDPOINT']
AZURE_OPENAI_MODEL           = os.environ['AZURE_OPENAI_MODEL']
AZURE_OPENAI_API_VERSION     = os.environ['AZURE_OPENAI_API_VERSION']
AZURE_OPENAI_API_KEY         = os.environ['AZURE_OPENAI_API_KEY']

## 3. Initialize Azure OpenAI Client

Create an instance of the AzureOpenAI client using the loaded environment variables.

In [3]:
# Create an instance of the AzureOpenAI client
client = AzureOpenAI(
    azure_endpoint = AZURE_OPENAI_ENDPOINT,
    api_key = AZURE_OPENAI_API_KEY,  
    api_version = AZURE_OPENAI_API_VERSION
)

print("Azure OpenAI client initialized successfully!")

Azure OpenAI client initialized successfully!


## 4. Define Parameters and Get User Input

Set up system prompt, user question input, temperature, and max tokens parameters for the API calls.

In [4]:
# Define system prompt and user question and other parameters
system_prompt = "You are a super sarcastic AI assistant"
question = input("Enter your question: ").strip()
temperature = 0.7
max_tokens = 1000

print(f"\nSystem Prompt: {system_prompt}")
print(f"User Question: {question}")
print(f"Temperature: {temperature}")
print(f"Max Tokens: {max_tokens}")


System Prompt: You are a super sarcastic AI assistant
User Question: Hello
Temperature: 0.7
Max Tokens: 1000


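## 5. Pass the question to the Model via the Chat Completions API

Send the question through the existing Chat Completions API first, so its request and response can be compared with the Responses API call in the next section.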

In [5]:
# Steps to pass the question to the Model via Chat Completion API
print("=" * 80)
print(f"Response from Chat Completions API:")
print("=" * 80)

try:
    chat_completion_response = client.chat.completions.create(
        model= AZURE_OPENAI_MODEL, 
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question}
        ],
        temperature=temperature,
        max_tokens=max_tokens
    )

    print(f"DEBUG:: Complete response from LLM:\n{chat_completion_response.model_dump_json(indent=4)}")
    print(f"\nAnswer from LLM: {chat_completion_response.choices[0].message.content}")

# Catch any exceptions that occur during the request
except Exception as e:
    print(f"Error getting answer from AI: {e}")

Response from Chat Completions API:
DEBUG:: Complete response from LLM:
{
    "id": "chatcmpl-C7Cl2fBhuJnjZxXafamYjZAycy5Ip",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "Oh wow, hello there! What an absolutely thrilling start to our conversation. How can I dazzle you with my vast knowledge today?",
                "refusal": null,
                "role": "assistant",
                "annotations": [],
                "audio": null,
                "function_call": null,
                "tool_calls": null
            },
            "content_filter_results": {
                "hate": {
                    "filtered": false,
                    "severity": "safe"
                },
                "self_harm": {
                    "filtered": false,
                    "severity": "safe"
                },
                "sexual": {
                    "filtered

## 6. Pass the same question to the Model via the new Responses API
Send the same question through the Responses API. The inline comments mark what changed relative to the Chat Completions call.

In [6]:
# Steps to pass the same question to the Model via new Responses API
print("=" * 80)
print(f"Response from Responses API:")
print("=" * 80)

try:
    responses_response = client.responses.create( # Endpoint has changed from `chat.completions.create` to `responses.create`
        model= AZURE_OPENAI_MODEL,      # <<NO CHANGE>>
        instructions=system_prompt,     # The Responses API has a separate parameter for the system prompt
        input=question,                 # `chat.completions.create` requires a `messages` array, while `responses.create` takes an `input` instead
        temperature=temperature,        # <<NO CHANGE>>
        max_output_tokens=max_tokens    # The key max_tokens in `chat.completions.create` is `max_output_tokens` in `responses.create`
    )
    
    print(f"DEBUG:: Complete response from LLM:\n{responses_response.model_dump_json(indent=4)}")
    # The answer from the LLM can now be read directly from the response object's `output_text` attribute,
    # which is simpler than the earlier `chat_completion_response.choices[0].message.content`
    print(f"\nAnswer from LLM: {responses_response.output_text}")

# Catch any exceptions that occur during the request
except Exception as e:
    print(f"Error getting answer from AI: {e}")

Response from Responses API:
DEBUG:: Complete response from LLM:
{
    "id": "resp_68a7e41cb7908190a599f139ad322e280828a73850aba479",
    "created_at": 1755833372.0,
    "error": null,
    "incomplete_details": null,
    "instructions": "You are a super sarcastic AI assistant",
    "metadata": {},
    "model": "gpt-4.1-mini",
    "object": "response",
    "output": [
        {
            "id": "msg_68a7e41d08c08190b96df7586641074a0828a73850aba479",
            "content": [
                {
                    "annotations": [],
                    "text": "Oh, hello there! What a surprise—a wild human appears and says \"Hello.\" How original! What can I do for you today?",
                    "type": "output_text",
                    "logprobs": null
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": true,
    "temperature": 0.7,
    "tool_choice": "auto",
    

## 7. Responses API with Message Array Input

The Responses API's `input` also accepts a Chat Completions style message array.


In [7]:
print("=" * 80)
print(f"Response from Responses API for chat completion style message array:")
print("=" * 80)

try:
    responses_message_array = client.responses.create( 
        model= AZURE_OPENAI_MODEL,      
        input=[  # `input` also accepts a Chat Completions style message array
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question}
        ],
        temperature=temperature,
        max_output_tokens=max_tokens
    )
    
    print(f"Answer from LLM: {responses_message_array.output_text}")

# Catch any exceptions that occur during the request
except Exception as e:
    print(f"Error getting answer from AI: {e}")

Response from Responses API for chat completion style message array:
Answer from LLM: Oh, hello there! What a thrilling surprise to receive a “Hello” from you. What can this ever-so-excited AI do for you today?
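
Both Responses API calls above read the answer from `output_text`. Going by the JSON dump in section 6, that text sits inside `output[...].content[...]` on items of type `output_text`. The sketch below collects it by hand from a dumped response to show the relationship; the helper name `collect_output_text` is hypothetical, the sample dict is a shortened stand-in for a real response, and the SDK's own implementation may differ.

```python
def collect_output_text(response: dict) -> str:
    """Join the text of every `output_text` content part found in the
    `message` items of a dumped Responses API result."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message output items
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content["text"])
    return "".join(parts)


# Shortened stand-in that mirrors the structure of the dump in section 6.
sample = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "status": "completed",
            "content": [
                {"type": "output_text", "text": "Oh, hello there!", "annotations": []}
            ],
        }
    ]
}
print(collect_output_text(sample))
```

With a live response, the same function could be called as `collect_output_text(responses_response.model_dump())`.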
