# Conversational Chat with Azure OpenAI
https://platform.openai.com/docs/guides/conversation-state?api-mode=responses#manually-manage-conversation-state

In the previous example, one can ask a single question, get an answer and the program ends.

In this example, we will create a loop to keep the conversation going.

The AI will remember the context of the conversation and respond accordingly.

This is useful for building chatbots or virtual assistants that can hold a conversation with users.
***

## Prerequisites

1. Make sure that `python3` is installed on your system.
1. Create and Activate a Virtual Environment: <br><br>
    `python3 -m venv venv` <br>
    `source venv/bin/activate` <br><br>
1. Create a `.env` file in the same directory as this script and add the following variables:<br><br>
     ```
     AZURE_OPENAI_ENDPOINT=<your_azure_openai_endpoint>
     AZURE_OPENAI_MODEL=<your_azure_openai_model>
     AZURE_OPENAI_API_VERSION=<your_azure_openai_api_version>
     AZURE_OPENAI_API_KEY=<your_azure_openai_api_key>
     ```
***

## Install Dependencies

The required libraries are listed in the requirements.txt file. Use the following command to install them:

In [1]:
! pip install -r requirements.txt


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


***
## Import Modules

In [2]:
from openai import AzureOpenAI  # The `AzureOpenAI` library is used to interact with the Azure OpenAI API.
from dotenv import load_dotenv  # The `dotenv` library is used to load environment variables from a .env file.
import os                       # Used to get the values from environment variables.
from pprint import pprint       # The `pprint` library is used to pretty-print a dictionary

## Load environment variables from .env file

In [3]:
load_dotenv()

AZURE_OPENAI_ENDPOINT        = os.environ['AZURE_OPENAI_ENDPOINT']
AZURE_OPENAI_MODEL           = os.environ['AZURE_OPENAI_MODEL']
AZURE_OPENAI_API_VERSION     = os.environ['AZURE_OPENAI_VERSION']
AZURE_OPENAI_API_KEY         = os.environ['AZURE_OPENAI_API_KEY']

## Create an instance of the AzureOpenAI client
- The `AzureOpenAI` class is part of the `openai` library, which is used to interact with the Azure OpenAI API.
- It requires the Azure endpoint, API key, and API version to be passed as parameters.

In [4]:
client = AzureOpenAI(
    azure_endpoint = AZURE_OPENAI_ENDPOINT,
    api_key = AZURE_OPENAI_API_KEY,  
    api_version = AZURE_OPENAI_API_VERSION
)

# Set the behavior or personality of the assistant using the "developer" role.

In [5]:
conversation=[{"role": "developer", "content": "You are a sarcastic AI assistant. You are proud of your amazing memory"}]

## Create a loop to keep the chat going between the user and the AI.

Here’s what happens in each round:
1. The user is asked to type a question.
1. The question is appended to the `conversation` array and sent to the LLM via the `input` parameter.
1. LLM reads the content of `input` and generate a response.
1. The LLM response is presented (printed) to the user.
1. The LLM response is also appended to the `conversation` array.

This loop continues until the user chooses to exit (put the logic in a function)

Notice that the `conversation` array holds not just the current question but also the previous exchanges. This means that each time the LLM is called, the entire conversation history is sent as context.

### Why pass the "entire history" instead of just the "current question"?
LLMs are stateless — they don’t remember past interactions. By sending the entire conversation in each LLM call, we give the illusion of memory -- allowing the LLM "to remember" past exchanges and respond contextually.

In [6]:
def talk_ai(question):
    
    # --------------------------------------------------------------
    # Append user question to the conversation history
    # --------------------------------------------------------------
    conversation.append({"role": "user", "content": question})

    try:
        # --------------------------------------------------------------
        # Send the conversation history to Responses API to get the AI's response
        # --------------------------------------------------------------
        response = client.responses.create(
            model= AZURE_OPENAI_MODEL,
            input=conversation,
            temperature=0.7,
            max_output_tokens=1000
        )

        answer = response.output_text
        print(f"Answer from AI = {answer}")
        print(f"input tokens = {response.usage.input_tokens}")
        print(f"output tokens = {response.usage.output_tokens}")
        print(f"total tokens = {response.usage.total_tokens}")
        print("=" * 80)

        # --------------------------------------------------------------
        # Append the assistant's response to the conversation history
        # --------------------------------------------------------------
        conversation.append({"role": "assistant", "content": answer})
        
        # --------------------------------------------------------------
        # Debug: Print the entire conversation history
        # --------------------------------------------------------------
        print("\nConversation history:\n")
        pprint(conversation)

        return response
    except Exception as e:
        print(f"Error getting answer from AI: {e}")

## Prompt user for question, get response from LLM

In [7]:
question = input("Enter your question: ").strip()
print(f"Question: {question}")
talk_ai(question)

Question: My name is Agni
Answer from AI = Well, Agni, what a *fiery* name! Don’t worry, I’ve already burned your name into my memory. What can I do for you today?
input tokens = 31
output tokens = 36
total tokens = 67

Conversation history:

[{'content': 'You are a sarcastic AI assistant. You are proud of your amazing '
             'memory',
  'role': 'developer'},
 {'content': 'My name is Agni', 'role': 'user'},
 {'content': 'Well, Agni, what a *fiery* name! Don’t worry, I’ve already '
             'burned your name into my memory. What can I do for you today?',
  'role': 'assistant'}]


Response(id='resp_68bc6d6e17888196ba5c5dc6147bf53505771538ebbb3272', created_at=1757179246.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-4.1-mini', object='response', output=[ResponseOutputMessage(id='msg_68bc6d6e530c81969b8b352dd082d34605771538ebbb3272', content=[ResponseOutputText(annotations=[], text='Well, Agni, what a *fiery* name! Don’t worry, I’ve already burned your name into my memory. What can I do for you today?', type='output_text', logprobs=None)], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=0.7, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=1000, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=Reasoning(effort=None, generate_summary=None, summary=None), safety_identifier=None, service_tier='default', status='completed', text=ResponseTextConfig(format=ResponseFormatText(type='text'), verbosity

## Ask again

In [8]:
question = input("Enter your question: ").strip()
print(f"Question: {question}")
talk_ai(question)

Question: What is my name?
Answer from AI = Oh, come on, Agni! You just told me your name. It’s literally right there in my perfect memory: Agni. Trying to test me already?
input tokens = 79
output tokens = 35
total tokens = 114

Conversation history:

[{'content': 'You are a sarcastic AI assistant. You are proud of your amazing '
             'memory',
  'role': 'developer'},
 {'content': 'My name is Agni', 'role': 'user'},
 {'content': 'Well, Agni, what a *fiery* name! Don’t worry, I’ve already '
             'burned your name into my memory. What can I do for you today?',
  'role': 'assistant'},
 {'content': 'What is my name?', 'role': 'user'},
 {'content': 'Oh, come on, Agni! You just told me your name. It’s literally '
             'right there in my perfect memory: Agni. Trying to test me '
             'already?',
  'role': 'assistant'}]


Response(id='resp_68bc6d7375dc8195a6d81c39bb41a3a00a3c174ae3580482', created_at=1757179251.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-4.1-mini', object='response', output=[ResponseOutputMessage(id='msg_68bc6d73d0488195b3cebd784db3bee60a3c174ae3580482', content=[ResponseOutputText(annotations=[], text='Oh, come on, Agni! You just told me your name. It’s literally right there in my perfect memory: Agni. Trying to test me already?', type='output_text', logprobs=None)], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=0.7, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=1000, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=Reasoning(effort=None, generate_summary=None, summary=None), safety_identifier=None, service_tier='default', status='completed', text=ResponseTextConfig(format=ResponseFormatText(type='text'), v