# Multiagent CSV Data Analysis with Code Interpreter

This notebook demonstrates a multiagent system that:

- Scans CSV files from a local folder called `data`.
- For each CSV file, uploads it and uses a Code Interpreter–based agent (via the Azure AI Projects service) to perform data analysis (e.g. generating summary statistics and visualizations).
- Uses a second agent (via Semantic Kernel orchestration) to format and present the analysis results in a structured Markdown output.

Make sure you have the required packages installed (e.g., `azure-ai-projects`, `azure-identity`, and `semantic_kernel`) and that your environment variables (such as `PROJECT_CONNECTION_STRING` and `MODEL_DEPLOYMENT_NAME`) are set appropriately.


In [3]:
import os
import sys
from dotenv import load_dotenv
import asyncio
import logging
import requests
import uuid
import json
import csv
import datetime
from typing import List, Dict, Any, Optional, Annotated
from azure.identity import DefaultAzureCredential
from azure.core.credentials import AccessToken
from semantic_kernel import Kernel
from semantic_kernel.utils.logging import setup_logging
from semantic_kernel.functions import kernel_function
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.chat_completion_client_base import ChatCompletionClientBase
from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.functions.kernel_arguments import KernelArguments
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.functions.kernel_function_from_prompt import KernelFunctionFromPrompt
from semantic_kernel.connectors.ai.open_ai.prompt_execution_settings.azure_chat_prompt_execution_settings import (
    AzureChatPromptExecutionSettings,
)
from semantic_kernel.agents import Agent, AgentChat
from semantic_kernel.agents.strategies import (
    DefaultTerminationStrategy,
    SequentialSelectionStrategy,
)
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    AzureChatPromptExecutionSettings,
)
from semantic_kernel.agents.strategies.selection.selection_strategy import SelectionStrategy
from semantic_kernel.agents.strategies.termination.termination_strategy import TerminationStrategy
from semantic_kernel.contents.history_reducer.chat_history_reducer import ChatHistoryReducer
from semantic_kernel.contents import ChatHistoryTruncationReducer
from semantic_kernel.exceptions.agent_exceptions import AgentChatException

from semantic_kernel.agents import AgentGroupChat
from semantic_kernel.agents.strategies import (
    KernelFunctionSelectionStrategy,
    KernelFunctionTerminationStrategy,
)
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

In [None]:
load_dotenv()
# Create a DefaultAzureCredential instance
credential = DefaultAzureCredential()
 
# # Acquire a token for a specific scope
# scope = "https://management.azure.com/.default"
# token: AccessToken = credential.get_token(scope)
 
print("Using Azure OpenAI Chat Deployment:", os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"))

# Initialize the semantic kernel and add an AzureChatCompletion service.
kernel = Kernel()
kernel.add_service(AzureChatCompletion(service_id=os.getenv('GLOBAL_LLM_SERVICE')))

# Configure prompt execution settings for auto function calling.
settings = kernel.get_prompt_execution_settings_from_service_id(os.getenv("GLOBAL_LLM_SERVICE"))
settings.function_choice_behavior = FunctionChoiceBehavior.Auto()
upload_plugin = UploadCSVPlugin()


analyst_agent = ChatCompletionAgent(
    kernel=kernel,
    name="DataAnalyst",
    instructions="""
    "You are a data analyst. Your role is to extract key insights from the provided CSV data. You can use the upload_csv plugin to read the CSV file and return its content as a JSON string.
    You will receive a CSV file path as input. Your task is to read the CSV file and return its content as a JSON string.
    You will receive a JSON string representing the CSV content. Your task is to analyze this data and answer user question
    """,
    plugins=[upload_plugin]
)

formatter_agent = ChatCompletionAgent(
    kernel=kernel,
    name="FormatterAgent",
    instructions="""
    You are a formatting expert. Your role is to take the analysis summary and format it in Markdown 
    with clear headers, bullet points, and a final 'Formatted Result:' section summarizing the insights.
    """,
)

# Define a selection function to determine which agent should take the next turn.
selection_function = KernelFunctionFromPrompt(
    function_name="selection",
    prompt=f"""
        Examine the provided RESPONSE and choose the next participant.
        State only the name of the chosen participant without explanation.
        Never choose the participant named in the RESPONSE.

        Choose only from these participants:
        - SearchAssistantAgent
        - ColorAssistantAgent

        Rules:
        - If RESPONSE is user input, it is DatetimeAssistantAgent's turn.
        - If RESPONSE is by SearchAssistantAgent, it is ColorAssistantAgent's turn.
        - If RESPONSE is by ColorAssistantAgent, it is SearchAssistantAgent's turn.

        RESPONSE:
        {{{{$lastmessage}}}}
        """,
)

# Define a termination function where the reviewer signals completion with "yes".
termination_keyword = "yes"

termination_function = KernelFunctionFromPrompt(
    function_name="termination",
    prompt=f"""
        Examine the RESPONSE and determine whether the content has been deemed satisfactory.
        If the content is satisfactory, respond with a single word without explanation: {termination_keyword}.
        If specific suggestions are being provided, it is not satisfactory.
        If no correction is suggested, it is satisfactory.

        RESPONSE:
        {{{{$lastmessage}}}}
        """,
        )

history_reducer = ChatHistoryTruncationReducer(target_count=5)

# Create the AgentGroupChat with selection and termination strategies.
chat = AgentGroupChat(
    agents=[analyst_agent, formatter_agent],
    selection_strategy=KernelFunctionSelectionStrategy(
        initial_agent=analyst_agent,
        function=selection_function,
        kernel=kernel,
        result_parser=lambda result: str(result.value[0]).strip() if result.value[0] is not None else WRITER_NAME,
        history_variable_name="lastmessage",
        history_reducer=history_reducer,
    ),
    termination_strategy=KernelFunctionTerminationStrategy(
        agents=[analyst_agent, formatter_agent],
        function=termination_function,
        kernel=kernel,
        result_parser=lambda result: termination_keyword in str(result.value[0]).lower(),
        history_variable_name="lastmessage",
        maximum_iterations=10,
        history_reducer=history_reducer,
    ),
)
group_chat = AgentGroupChat(
    agents=[analyst_agent, formatter_agent])

Using Azure OpenAI Chat Deployment: gpt-4o


In [None]:
content = "what is Net Change in Plan Fiduciary Net Position for all yeas?"
# Add the user message
group_chat.add_chat_message(content)
print(f"# User: {content}")
async for response in group_chat.invoke():
    print(f"# {response.role} - {response.name or '*'}: '{response.content}'")

  group_chat.add_chat_message(content)


# User: Based on the CSV data, what are the key insights?
# assistant - DataAnalyst: 'Please provide the file path to the CSV file, and I'll help you process and analyze its data accordingly!'
# assistant - FormatterAgent: 'Let me know if you need assistance analyzing, formatting, or summarizing the data in the file.'
# assistant - DataAnalyst: 'Could you please specify the file path to the CSV file? Once provided, I can analyze the data and answer your questions based on it!'
# assistant - FormatterAgent: 'Could you confirm the file path to proceed with processing the data? Once received, I'll handle further insights generation.'
# assistant - DataAnalyst: 'It seems the file could not be found at the provided path. Could you double-check the file path or provide the correct one?'


In [4]:
load_dotenv()

# Authenticate and initialize services.
credential = DefaultAzureCredential()
print("Using Azure OpenAI Chat Deployment:", os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"))

# Initialize the kernel and add an AzureChatCompletion service.
kernel = Kernel()
kernel.add_service(AzureChatCompletion(service_id=os.getenv('GLOBAL_LLM_SERVICE')))

# (Optional) Configure prompt execution settings for auto function calling.
settings = kernel.get_prompt_execution_settings_from_service_id(os.getenv("GLOBAL_LLM_SERVICE"))
# settings.function_choice_behavior = FunctionChoiceBehavior.Auto()   # Uncomment if needed

# Helper function: Read CSV file and return JSON string.
def read_csv_content(file_path):
    with open(file_path, "r", encoding="utf8") as f:
        reader = csv.DictReader(f)
        data = list(reader)
    return json.dumps(data)

# Define file paths.
# Assuming the notebook is in the "agents" folder and the "data_processing" folder is a sibling folder.
current_dir = os.getcwd()
parent_dir = os.path.dirname(current_dir)
csv_file_path_1 = os.path.join(parent_dir, "data_processing", "csv_tables", "merged_analysis_of_financial_experience.csv")
csv_file_path_2 = os.path.join(parent_dir, "data_processing", "csv_tables", "merged_schedules_of_changes_in_net_pension_liability.csv")

# Read CSV contents.
csv_content_1 = read_csv_content(csv_file_path_1)
csv_content_2 = read_csv_content(csv_file_path_2)

# Create the DataAnalyst agent with CSV content embedded into its instructions.
analyst_instructions = f"""
You are a data analyst. Your role is to extract key insights from the provided CSV data.
Below is the content of two CSV files provided as context.

CSV File 1:
{csv_content_1}

CSV File 2:
{csv_content_2}

Based on this data, analyze and answer the user's question.
"""

analyst_agent = ChatCompletionAgent(
    kernel=kernel,
    name="DataAnalyst",
    instructions=analyst_instructions
)

# Create the FormatterAgent that will format analysis summaries in Markdown.
formatter_agent = ChatCompletionAgent(
    kernel=kernel,
    name="FormatterAgent",
    instructions="""
You are a formatting expert. Your role is to take the analysis summary and format it in Markdown 
with clear headers, bullet points, and a final 'Formatted Result:' section summarizing the insights.
"""
)

# Create an AgentGroupChat using both agents.
group_chat = AgentGroupChat(
    agents=[analyst_agent, formatter_agent]
)



Using Azure OpenAI Chat Deployment: gpt-4o


In [8]:
# Reinitialize the group chat so that its internal history starts fresh.
group_chat = AgentGroupChat(agents=[analyst_agent, formatter_agent])

# Add the user message (await since it's a coroutine).
content = "what is Net Change in Plan Fiduciary Net Position for all yeas?"
await group_chat.add_chat_message(content)
print(f"# User: {content}")

# Invoke the group chat and iterate over responses.
async for response in group_chat.invoke():
    print(f"# {response.role} - {response.name or '*'}: '{response.content}'")




# User: what is Net Change in Plan Fiduciary Net Position for all yeas?
# assistant - DataAnalyst: 'The "Net Change in Plan Fiduciary Net Position" for all years from the second CSV is as follows:

- **2023**: $4,929,152  
- **2022**: ($7,137,547)  
- **2021**: $14,916,460  
- **2020**: $3,449,855  
- **2019**: $3,360,605  
- **2018**: $3,973,310  
- **2017**: $5,085,422  
- **2016**: ($998,101)  
- **2015**: ($100,630)  
- **2014**: $7,361,019  

This represents the annual changes in the plan's fiduciary net position as recorded in the dataset.'
# assistant - FormatterAgent: '### Net Change in Plan Fiduciary Net Position (All Years)

The "Net Change in Plan Fiduciary Net Position" reflects the annual net changes in fiduciary resources for the plan over the years, as documented below:

#### Annual Changes:
- **2023**: **$4,929,152**  
- **2022**: **($7,137,547)**  
- **2021**: **$14,916,460**  
- **2020**: **$3,449,855**  
- **2019**: **$3,360,605**  
- **2018**: **$3,973,310**  
- **2

In [10]:
from IPython.display import Markdown, display

final_formatted_messages = [
    msg for msg in group_chat.history.messages
    if msg.role == "assistant" and msg.name == "FormatterAgent"
]

if final_formatted_messages:
    final_message = final_formatted_messages[-1] 
    final_markdown = final_message.content
    display(Markdown(final_markdown))

### Net Change in Plan Fiduciary Net Position (All Years)

The following table displays the net annual changes in the plan's fiduciary resources:

#### Yearly Changes:
- **2023**: **$4,929,152**  
- **2022**: **($7,137,547)**  
- **2021**: **$14,916,460**  
- **2020**: **$3,449,855**  
- **2019**: **$3,360,605**  
- **2018**: **$3,973,310**  
- **2017**: **$5,085,422**  
- **2016**: **($998,101)**  
- **2015**: **($100,630)**  
- **2014**: **$7,361,019**

---

### Formatted Result:

#### Positive Trends:
- **Highest Positive Growth**: The largest annual gain occurred in **2021** with **$14,916,460**.
- Years with sustained positive growth: 2023, 2020, 2019, 2018, 2017, and 2014.

#### Negative Trends:
- **Largest Decline**: The most significant loss happened in **2022**, with **($7,137,547)**.
- Other years experiencing net declines: **2016** and **2015**, though less pronounced compared to 2022.

This overview illustrates notable volatility with periods of positive growth outweighing occasional declines in the plan's fiduciary net position across the years.