# **Sales Analyst Agent**

## **About the Scenario**
In this scenario, we analyze sales data for AdventureWorks using Azure OpenAI. The AI agent retrieves files, calculates sales figures, and generates insights using tools such as File Search. This showcases how businesses can employ Azure OpenAI Agents to streamline analytical workflows and decision-making.

Key Steps:

1. *File Conversion*: Converting Excel data into a Markdown format compatible with the AI agent.
2. *File Upload*: Storing the converted files in the Azure OpenAI Project for efficient processing.
3. *AI-Powered Analysis*: Leveraging Azure OpenAI to generate insights such as revenue metrics by region.

This hands-on demonstration will prepare you to use Azure OpenAI Agents for similar real-world data engineering and analytics tasks.

## **Data**
This scenario uses files from the folder [`data/`](./data/) in this repo. You can clone this repo or copy this folder to make sure you have access to these files when running the sample.

The sales data is stored in multiple Excel files located in the data/ directory. These files contain essential details to simulate sales orders, which the AI agent will analyze.

Ensure you have the following files ready in the data/ directory:

- SalesOrder_43659.xlsx
- SalesOrder_43661.xlsx
- SalesOrder_43662.xlsx
- SalesOrder_43665.xlsx

## **Time**
You should expect to spend 10-15 minutes building and running this scenario. 

## **Before you begin**

#### Step 1: Install required libraries
Installing dependencies from within the notebook keeps it self-contained and reproducible: anyone who runs it gets the packages listed in `requirements.txt`. This helps other users or collaborators set up the environment quickly and avoid issues caused by missing or incompatible packages.

In [98]:
# Install the packages
%pip install -r ./requirements.txt

print("\nPackages installed successfully.")


Packages installed successfully.



#### Step 2: Setting up the environment
Before we begin, we need to load the necessary environment variables from a `.env` file. These variables include sensitive information such as API keys and endpoint URLs, which are crucial for running the code successfully.

Here’s what you need to do:
- Ensure your `.env` file is properly configured in the `.venv/.env` format. We have provided a template `.env` file, `.env.example`, for your reference.
- Verify that all required secrets are included in the file before running the code.


The `.env` file must contain the following secrets:
- PROJECT_CONNECTION_STRING: URL to connect to the Azure OpenAI Project to access project resources.
- AZURE_OPENAI_DEPLOYMENT: The name of the Azure OpenAI model deployment.

Now, let’s load these variables and get started!

<code style="background:yellow;color:black">Note: Make sure to keep your `.env` file secure and avoid sharing it publicly. </code>

*More information about Python virtual environments can be found [here](https://docs.python.org/3/library/venv.html).*

In [99]:
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Retrieve the secrets
__PROJECT_CONNECTION_STRING = os.getenv("PROJECT_CONNECTION_STRING")
__AZURE_OPENAI_DEPLOYMENT = os.getenv("AZURE_OPENAI_DEPLOYMENT")

# Verify environment variables
if not all([__PROJECT_CONNECTION_STRING, __AZURE_OPENAI_DEPLOYMENT]):
    raise EnvironmentError("One or more environment variables are missing. Please check the .env file.")
else:
    print("Environment variables loaded successfully.")

Environment variables loaded successfully.
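
The check above reports that something is missing without saying which variable. A minimal sketch of a more specific check is shown below; the helper name `find_missing_vars` and the sample mapping are hypothetical and use placeholder values rather than real credentials.

```python
import os

REQUIRED_VARS = ("PROJECT_CONNECTION_STRING", "AZURE_OPENAI_DEPLOYMENT")


def find_missing_vars(env, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Hypothetical environment mapping with placeholder values (not real secrets)
sample_env = {
    "PROJECT_CONNECTION_STRING": "placeholder-value",
    "AZURE_OPENAI_DEPLOYMENT": "",
}
missing = find_missing_vars(sample_env)
print(f"Missing variables: {missing}")
```

In the notebook, the same helper could be called as `find_missing_vars(os.environ)` after `load_dotenv()`.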


## **Azure OpenAI Agent Setup**

### Step 1: Initializing the Azure AI Studio Project Client
First, we will initialize the Azure AI Studio Project client using Azure’s `DefaultAzureCredential` for authentication, allowing seamless integration with Azure resources. You will need to log into Azure using the Azure CLI.

In [100]:
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

try:
    # Initialize the Azure AI Project client
    project_client = AIProjectClient.from_connection_string(
        credential=DefaultAzureCredential(),
        conn_str=os.environ["PROJECT_CONNECTION_STRING"],
    )

    print("Azure AI Studio client created successfully.")

except Exception as e:
    print(f"Error creating the project client: {e}")

Azure AI Studio client created successfully.


### Step 2: Initializing the Azure Agent Client
Next, we’ll initialize the Azure Agent Runtime client. The Azure Agent Client serves as the interface to interact with Azure OpenAI services. 

In [101]:
agent_client = project_client.agents

print("Agent Client created successfully.")

Agent Client created successfully.


## **Data Processing**

### Step 1: Convert Excel files to Markdown
Azure OpenAI Agents require data in specific file formats for processing. Since Excel files aren’t natively supported, our files need to be converted to Markdown tables in separate Markdown files. This step involves:

- Reading Excel files using pandas.
- Writing the data into Markdown format for compatibility.

In [102]:
import pandas as pd

# Define paths
output_dir = "uploads"
data_dir_path = "data"
sales_order_files = ["SalesOrder_43659.xlsx", "SalesOrder_43661.xlsx", "SalesOrder_43662.xlsx", "SalesOrder_43665.xlsx"]
output_dir_path = os.path.join(data_dir_path, output_dir)

# Ensure output directory exists
os.makedirs(output_dir_path, exist_ok=True)

# Prefix the data directory path to each file
sales_order_files = [os.path.join(data_dir_path, curr_file) for curr_file in sales_order_files]

# Track the paths of the Markdown files written in this step
markdown_file_paths = []

for curr_file in sales_order_files:
    try:
        # Check if the file exists
        if not os.path.exists(curr_file):
            raise FileNotFoundError(f"The file '{curr_file}' does not exist.")

        # Load the Excel File into a DataFrame
        df = pd.read_excel(curr_file)
        print(f"Workbook '{curr_file}' successfully loaded.")

        # Get the base name of the Excel file without the extension
        base_name = os.path.splitext(os.path.basename(curr_file))[0]

        # Convert the DataFrame to a Markdown table
        md_tbl_str = df.to_markdown(index=False, tablefmt="pipe")
        
        # Define the output file name
        output_file = os.path.join(output_dir_path, f"{base_name}.md")
        markdown_file_paths.append(output_file)
        
        # Write the Markdown table to the file
        with open(output_file, "w", encoding="utf-8") as f:
            f.write(md_tbl_str)

        print(f"Markdown file '{output_file}' successfully written.")

    except FileNotFoundError as e:
        print(f"Error: {e}")

    except Exception as e:
        print(f"An error occurred while processing the Excel file '{curr_file}': {e}")

Workbook 'data\SalesOrder_43659.xlsx' successfully loaded.
Markdown file 'data\uploads\SalesOrder_43659.md' successfully written.
Workbook 'data\SalesOrder_43661.xlsx' successfully loaded.
Markdown file 'data\uploads\SalesOrder_43661.md' successfully written.
Workbook 'data\SalesOrder_43662.xlsx' successfully loaded.
Markdown file 'data\uploads\SalesOrder_43662.md' successfully written.
Workbook 'data\SalesOrder_43665.xlsx' successfully loaded.
Markdown file 'data\uploads\SalesOrder_43665.md' successfully written.
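
To make the target format concrete, here is a minimal sketch of a pipe-style Markdown table built with only the standard library. It is a simplified illustration, not what pandas produces character for character (pandas also pads and aligns columns). The helper `to_pipe_table` and the column names and values are hypothetical and are not taken from the sample files.

```python
def to_pipe_table(headers, rows):
    """Render headers and rows as a pipe-delimited Markdown table."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


# Hypothetical columns and values, for illustration only
table = to_pipe_table(
    ["Region", "LineTotal"],
    [["Northwest", 100.5], ["Southeast", 250]],
)
print(table)
```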


### Step 2: Upload Markdown file(s) to Azure OpenAI deployment
Next, we'll upload the Markdown files from the local directory, `data/uploads`, to the Azure OpenAI Project. Before each upload, any file in the project with the same name is deleted so that only the latest version is kept and duplicates are avoided.

In [103]:
uploaded_file_ids = []

try:
    for local_file in os.listdir(output_dir_path):

        # Define the local file path
        local_file_path = os.path.join(output_dir_path, local_file)

        # Check if the file is a Markdown file
        if not local_file.endswith(".md"):
            continue
        else:
            print(f"Processing File '{local_file_path}'...")

            # Check if the file already exists in the cloud, and delete it if it does
            for cloud_file in agent_client.list_files().data:
                if cloud_file.filename == local_file:
                    agent_client.delete_file(cloud_file.id)
                    print(f"Deleted existing cloud file: '{cloud_file.filename}'")

            # Use the upload-and-poll SDK helper to upload the local file and wait
            # until the upload completes. The files are added to a vector store in the next step.
            file = project_client.agents.upload_file_and_poll(file_path=local_file_path, purpose="assistants")
            uploaded_file_ids.append(file.id)
            print(f"Uploaded file, file ID: {file.id}")
            
    print(f"File IDs: {uploaded_file_ids}")

except FileNotFoundError as e:
    print(f"Error: {e}")
    raise

except Exception as e:
    print(f"An error occurred while processing the Excel file: {e}")
    raise

Processing File 'data\uploads\SalesOrder_43659.md'...
An error occurred while processing the Excel file: JSON is invalid: Expecting value: line 1 column 1 (char 0)
Content: Exception Type: BadRequestException
Message: There are more than one Azure Open AI connections associated with this project. Enterprise agents is only supported when project has 1 Azure Open AI connection
Stack Trace:    at Microsoft.MachineLearning.Rag.EntryPoints.Transforms.ProjectProxyTransform.GetAOAIConnection(ConnectionCategory connectionCategory, Guid subscriptionId, String resourceGroupName, String workspaceName) in /mnt/vss/_work/1/s/src/azureml-api/src/RAG/EntryPoints/Transforms/ProjectProxyTransform.cs:line 498
   at Microsoft.MachineLearning.Rag.EntryPoints.Transforms.ProjectProxyTransform.<Apply>b__18_0(RequestTransformContext transformContext) in /mnt/vss/_work/1/s/src/azureml-api/src/RAG/EntryPoints/Transforms/ProjectProxyTransform.cs:line 209



DecodeError: JSON is invalid: Expecting value: line 1 column 1 (char 0)
Content: Exception Type: BadRequestException
Message: There are more than one Azure Open AI connections associated with this project. Enterprise agents is only supported when project has 1 Azure Open AI connection
Stack Trace:    at Microsoft.MachineLearning.Rag.EntryPoints.Transforms.ProjectProxyTransform.GetAOAIConnection(ConnectionCategory connectionCategory, Guid subscriptionId, String resourceGroupName, String workspaceName) in /mnt/vss/_work/1/s/src/azureml-api/src/RAG/EntryPoints/Transforms/ProjectProxyTransform.cs:line 498
   at Microsoft.MachineLearning.Rag.EntryPoints.Transforms.ProjectProxyTransform.<Apply>b__18_0(RequestTransformContext transformContext) in /mnt/vss/_work/1/s/src/azureml-api/src/RAG/EntryPoints/Transforms/ProjectProxyTransform.cs:line 209
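
The delete-then-upload pattern used in this step can be tried in isolation, without a cloud connection. The sketch below uses an in-memory stand-in; `FakeFileStore` and `replace_and_upload` are hypothetical names and do not belong to the Azure SDK.

```python
class FakeFileStore:
    """In-memory stand-in for a remote file service (hypothetical, not the Azure SDK)."""

    def __init__(self):
        self._files = {}  # file ID -> filename
        self._next_id = 1

    def list_files(self):
        # Return a copy so callers can delete while iterating
        return list(self._files.items())

    def delete_file(self, file_id):
        del self._files[file_id]

    def upload_file(self, filename):
        file_id = f"file-{self._next_id}"
        self._next_id += 1
        self._files[file_id] = filename
        return file_id


def replace_and_upload(store, filename):
    """Delete every remote file named `filename`, then upload a fresh copy."""
    for file_id, remote_name in store.list_files():
        if remote_name == filename:
            store.delete_file(file_id)
    return store.upload_file(filename)


store = FakeFileStore()
replace_and_upload(store, "SalesOrder_43659.md")
new_id = replace_and_upload(store, "SalesOrder_43659.md")
print(store.list_files())
```

Uploading the same name twice leaves a single remote copy, which is the behavior the cleanup loop in the cell above is aiming for.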


### Step 3: Create Vector Store
The Vector Store is used to store embeddings of uploaded files, enabling the File Search tool to efficiently locate relevant content.

In [None]:
# Create a vector store called "Sales Orders"
vector_store = agent_client.create_vector_store_and_poll(file_ids=uploaded_file_ids, name="Sales Orders")
print(f"Vector store '{vector_store.name} ({vector_store.id})' created successfully.")
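
To build intuition for how embeddings help locate relevant content, here is a toy sketch of similarity search. It assumes cosine similarity and hand-written 3-dimensional vectors; the metric, the vectors, and the chunk labels are illustrative assumptions, and the service's actual embedding model and retrieval logic are not shown in this notebook.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical); real embeddings have far more dimensions
chunks = {
    "order totals by region": [0.9, 0.1, 0.0],
    "shipping address details": [0.1, 0.9, 0.2],
}
query_vector = [1.0, 0.0, 0.0]

# Pick the chunk whose vector points in the direction closest to the query
best_chunk = max(chunks, key=lambda name: cosine_similarity(chunks[name], query_vector))
print(best_chunk)
```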

### Step 4: Create File Search Tool
The File Search tool enables the AI agent to query the uploaded files. It uses the embeddings stored in the Vector Store to locate and retrieve relevant content.

In [None]:
from azure.ai.projects.models import FileSearchTool

# Create file search tool with resources followed by creating agent
file_search = FileSearchTool(vector_store_ids=[vector_store.id])
print(f"File search tool created successfully for Vector Store {vector_store.name} ({vector_store.id}).")

## **Running the Azure OpenAI Agent**

### Step 1: Create a Sales Analyst Agent

In [None]:
try:
    agent = project_client.agents.create_agent(
        model=__AZURE_OPENAI_DEPLOYMENT,
        name="Sales Analyst Agent",
        instructions=(
            "You are an expert sales analyst. "
            "Use your knowledge base to answer questions about company sales, customers, and products."
        ),
        tools=file_search.definitions,
        tool_resources=file_search.resources,
    )
    print(f"Agent created successfully ({agent.id})")
except Exception as e:
    print("Error creating Agent:", e)

### Step 2: Start a New Conversation
Conversation threads in Azure OpenAI enable context-aware interactions, storing both user prompts and agent responses. This step creates a new thread for the Sales Analyst Agent to handle queries.

In [None]:
# Create a conversation thread
try:
    thread = agent_client.create_thread()
    print(f"Thread created successfully ({thread.id})")
except Exception as e:
    print("Error creating thread:", e)

### Step 3: Query the AI Agent
Next, we’ll add a user message to the thread. The AI agent responds to a user-defined prompt, such as calculating revenue by region. It processes the prompt and retrieves the necessary data to deliver insights.

In [None]:
# Define the user question
prompt_content = "How much revenue did we generate by Region and what is the percentage of revenue generated by each region?"

# Add the question to the thread
try:
    message = agent_client.create_message(
        thread_id=thread.id,
        role="user",
        content=prompt_content,
    )
    print(f"Successfully added User prompt to the thread. (Message ID {message.id})")
except Exception as e:
    print("Error adding user question:", e)
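
It helps to know what arithmetic the prompt asks for so the agent's answer can be sanity-checked by hand. The sketch below computes revenue and percentage share per region; the order lines are hypothetical figures chosen for easy arithmetic and are not taken from the sample files.

```python
from collections import defaultdict

# Hypothetical order lines as (region, line total); not from the sample data
order_lines = [("Northwest", 1200.0), ("Southeast", 300.0), ("Northwest", 500.0)]

# Sum the line totals per region
revenue_by_region = defaultdict(float)
for region, line_total in order_lines:
    revenue_by_region[region] += line_total

# Express each region's revenue as a percentage of the overall total
total_revenue = sum(revenue_by_region.values())
share_by_region = {
    region: round(100 * revenue / total_revenue, 1)
    for region, revenue in revenue_by_region.items()
}

print(dict(revenue_by_region))
print(share_by_region)
```

The percentages should always sum to roughly 100, which is a quick check to apply to the agent's response as well.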

### Step 4: Run the AI Agent
In this step, we instruct the AI agent to process the user’s query within the created thread. The agent analyzes the context, executes the required tools (e.g., File Search), and generates a response. The `create_and_process_run` function starts the run and waits while the agent processes the user's prompt and produces its output.

In [None]:
# Initiate the Agent's response
try:
    run = agent_client.create_and_process_run(
        thread_id=thread.id,
        assistant_id=agent.id,
    )
    print(f"Run {run.id} finished with status: {run.status}")
except Exception as e:
    print("Error starting run:", e)

If the agent run fails, we will identify the issue. A common cause is exceeding the rate limit, requiring additional Azure quotas.

In [None]:
if run.status == "failed":
    # Check if you got "Rate limit is exceeded.", then you want to get more quota
    print(f"Run failed: {run.last_error}")
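
When a run fails because of a transient rate limit, retrying after a pause is sometimes enough, although it is no substitute for requesting more quota. A minimal sketch of exponential backoff follows; `run_with_backoff` and the stand-in `start_run` callable are hypothetical and do not call the Azure SDK.

```python
import time


def run_with_backoff(start_run, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `start_run` until it returns a status other than "failed" or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status = start_run()
        if status != "failed":
            return status, attempt
        if attempt < max_attempts:
            # Double the wait after each failure: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return "failed", max_attempts


# Hypothetical stand-in that fails twice, then succeeds
outcomes = iter(["failed", "failed", "completed"])
delays = []  # record the waits instead of actually sleeping
result = run_with_backoff(lambda: next(outcomes), sleep=delays.append)
print(result, delays)
```

Passing `sleep=delays.append` keeps the example instant; in real use the default `time.sleep` would apply the waits.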

### Step 5: Extract Insights
After the agent processes the prompt, we retrieve its responses, which contain the requested insights. This step ensures that the conversation thread is queried to fetch all relevant messages generated during the interaction.

In [None]:
# Retrieves all messages from the thread in ascending order after the user message
messages = agent_client.list_messages(thread_id=thread.id, order="asc", after=message.id)

print('Run completed!\n\nMESSAGES\n')

# Loop through messages and print content based on role
for msg in messages.data:
    role = msg.role
    content = msg.content[0].text.value
    print(f"{role.capitalize()}: {content}")
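
The loop above reads `msg.content[0].text.value`, which assumes the first content part is text. A more defensive sketch is shown below, under the assumption that a message may carry empty content or non-text parts. The `SimpleNamespace` objects are hypothetical stand-ins shaped like the attributes the loop reads, not real SDK types.

```python
from types import SimpleNamespace


def extract_text(message):
    """Join the text of every content part that has a `text.value`; skip the rest."""
    parts = []
    for part in message.content:
        text = getattr(part, "text", None)
        value = getattr(text, "value", None)
        if value:
            parts.append(value)
    return "\n".join(parts)


# Hypothetical stand-ins: one non-text part followed by one text part
text_part = SimpleNamespace(text=SimpleNamespace(value="Northwest: 85%"))
other_part = SimpleNamespace(image_file=SimpleNamespace(file_id="placeholder"))
sample = SimpleNamespace(role="assistant", content=[other_part, text_part])

print(f"{sample.role.capitalize()}: {extract_text(sample)}")
```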

## **Clean Up**

Properly cleaning up Azure resources after completing the analysis is essential to maintain a tidy Azure AI Studio Project environment and to avoid incurring unnecessary costs.

### Step 1: Deleting the Vector Store
Remove the Vector Store to free up storage resources.

In [None]:
project_client.agents.delete_vector_store(vector_store.id)
print("Deleted vector store")

### Step 2: Deleting Uploaded Files
Ensure all uploaded files are removed from Azure to maintain a clean environment.


In [None]:
# For each file id in the list, delete the file
for file_id in uploaded_file_ids:
    project_client.agents.delete_file(file_id)
    print(f"Deleted file: {file_id}")

### Step 3: Deleting the AI Agent
Remove the AI Agent to release associated resources.

In [None]:
# Get agent list
agents = agent_client.list_agents()

# If the agent exists in the list, delete it
if agent.id in [a.id for a in agents.data]:
    response = agent_client.delete_agent(agent.id)
    print("Deleted agent\n", response)
else:
    print("Agent does not exist to delete.")
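
If one clean-up step fails, the remaining resources are still worth deleting. The sketch below shows a best-effort pattern that records failures and keeps going; `cleanup_all` and the stand-in actions are hypothetical and do not call the Azure SDK.

```python
def cleanup_all(steps):
    """Run each (label, callable) step; collect failures instead of stopping early."""
    failures = []
    for label, action in steps:
        try:
            action()
            print(f"Cleaned up: {label}")
        except Exception as exc:
            failures.append((label, str(exc)))
            print(f"Could not clean up {label}: {exc}")
    return failures


def failing_step():
    # Hypothetical failure, e.g. the resource was already removed
    raise RuntimeError("already deleted")


failures = cleanup_all([
    ("vector store", lambda: None),
    ("uploaded file", failing_step),
    ("agent", lambda: None),
])
print(failures)
```

In the notebook, each stand-in would be replaced by a callable that wraps the corresponding delete call from the steps above.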