## **Use Case Overview**

Imagine you’re part of an R&D team that needs to merge structured data (e.g., experimental results, market trends) from Microsoft Fabric with unstructured documents (e.g., research reports, engineering notes) in SharePoint, then validate these findings against external references (e.g., Bing) and a high-quality “ground truth” internal knowledge store (e.g., Azure AI Search).

This is a classic Retrieval-Augmented Generation (RAG) scenario: multiple data sources must be queried in real time and cross-checked for consistency. However, by leveraging Azure AI Foundry Agent Services (an agentic, enterprise-ready microservices approach) alongside frameworks like Semantic Kernel, we can evolve beyond basic RAG into a mostly autonomous, agentic system. In this design, Agentic RAG and the Reflection Pattern enable each agent to iteratively refine its output until it’s confident in delivering a high-quality, validated answer, paving the way for intelligent automation that continually learns and improves.

To summarize, you’re not only bringing data to the AI but also bringing AI to the data, thus maximizing the value of your knowledge stores. By leveraging state-of-the-art retrieval solutions like Azure AI Search, while also tapping sources such as SharePoint (unstructured data) and Fabric (structured data), you can harness your most valuable asset—data—to achieve new levels of insight and automation. 

**In this demo, we have two Azure AI Agents (extending beyond a single-agent architecture):**

+ DataRetrievalAgent: Has access to Microsoft Fabric (for structured data) and SharePoint (for unstructured documents). Its job is to gather relevant internal data: for example, “failure rates of Material X in high-temperature tests,” or “engineering notes on prior tests.”

+ ValidationInsightsAgent: Has access to Bing for external references and to highly curated internal knowledge sources (e.g., Azure AI Search) for ground-truth validation, and can run a “reflection” or “validation” step. Its job is to cross-check what the first agent returned, verify its accuracy, and highlight missing or conflicting information.


**Moving from a single-agent setup to a multi-agent system is now simpler than ever with Semantic Kernel. The general flow looks like this:**

1. The user asks a question (e.g., “Retrieve historical failure rates for Material X in extreme temperatures and cross-check if new standards or conflicting data exist.”).
2. The DataRetrievalAgent fetches structured data from Fabric (e.g., lab test results, analytics) and unstructured docs from SharePoint (e.g., research memos, engineering notes).
3. The ValidationInsightsAgent then queries Bing/Azure AI Search to verify or supplement the results. It employs a reflection pattern, iterating over the combined results to look for gaps or inconsistencies. If needed, it loops back to the DataRetrievalAgent for clarifications or additional data.

Finally, thanks to the agents’ back-and-forth reflection, the user receives a validated, summarized answer that merges internal data with external cross-checks.
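
The flow above can be sketched as a small retrieve-validate-retry loop. This is a minimal illustration rather than the notebook's implementation: `Verdict`, `reflect`, and the two stand-in agent functions are hypothetical names, and the stand-ins replace the real Azure AI agents so the loop runs without any cloud access.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Outcome of one validation pass (hypothetical structure)."""
    approved: bool
    refined_query: Optional[str] = None


def reflect(
    query: str,
    retrieve: Callable[[str], str],
    validate: Callable[[str, str], Verdict],
    max_rounds: int = 3,
) -> tuple[str, bool, int]:
    """Retrieve, validate, and retry with a refined query until approved.

    Returns (last evidence, approved flag, rounds used).
    """
    evidence = ""
    for round_number in range(1, max_rounds + 1):
        evidence = retrieve(query)
        verdict = validate(query, evidence)
        if verdict.approved:
            return evidence, True, round_number
        # Loop back to retrieval with the validator's refined query, if any.
        query = verdict.refined_query or query
    return evidence, False, max_rounds


# Stand-in agents (hypothetical) so the loop can run locally.
def fake_retrieve(q: str) -> str:
    return f"evidence for: {q}"


def fake_validate(q: str, evidence: str) -> Verdict:
    if "per range" in q:
        return Verdict(approved=True)
    return Verdict(approved=False, refined_query=q + " per range")


result = reflect("failure rates for Material X", fake_retrieve, fake_validate)
print(result)
```

Capping the loop with `max_rounds` is a design choice of this sketch: it keeps a validator that never approves from retrying indefinitely.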

### **Why This Matters**

+ **Reduced Manual Research**: Instead of manually sifting through multiple data silos and external search engines, the AI Agents automate data gathering and vetting.
+ **Higher Confidence**: Validation ensures data accuracy and highlights missing pieces, improving R&D decision-making.
+ **Enterprise-Grade Security**: Each agent can enforce On-Behalf-Of (OBO) authentication to protect sensitive data (e.g., only pulling data the user is authorized to see).

### **In the Jupyter Notebook**

The first cell imports the dependencies, loads environment variables from the .env file, and sets the working directory:

In [7]:
import os
import re
import time
import logging
import json
from datetime import datetime as pydatetime
from typing import Any, List, Dict, Optional
from dotenv import load_dotenv
import asyncio
from datetime import timedelta

# Azure AI Projects
from azure.identity.aio import DefaultAzureCredential
from azure.core.exceptions import HttpResponseError

# semantic kernel
from semantic_kernel.contents import AuthorRole
from semantic_kernel.agents.azure_ai import AzureAIAgent, AzureAIAgentSettings
from semantic_kernel.agents.open_ai.run_polling_options import RunPollingOptions

# Load environment variables from .env file
load_dotenv()

# configure logging
from utils.ml_logging import get_logger

logger = get_logger()

# set the directory to the location of the script
try:
    target_directory = os.getenv(
        "TARGET_DIRECTORY", os.getcwd()
    )  # Use environment variable if available
    if os.path.exists(target_directory):
        os.chdir(target_directory)
        logging.info(f"Successfully changed directory to: {os.getcwd()}")
    else:
        logging.error(f"Directory does not exist: {target_directory}")
except Exception as e:
    logging.exception(f"An error occurred while changing directory: {e}")

### **Create Client and Load Azure AI Foundry**

Here, we initialize the Azure AI client using DefaultAzureCredential. This allows us to authenticate and connect to the Azure AI service.


In [8]:
project_client = AzureAIAgent.create_client(credential=DefaultAzureCredential())

### **1. Creating Azure AI Agents: DataRetrievalAgent**

The DataRetrievalAgent is responsible for internal data retrieval, combining structured data from Microsoft Fabric with unstructured documents from SharePoint. This agent ensures that research teams can efficiently access critical R&D insights, such as historical failure rates, experimental results, and engineering notes—all while maintaining secure and authorized access controls.

Agent Capabilities
+ ✅ Structured Data Retrieval → Queries Microsoft Fabric for experiment logs, test results, and structured analytics.
+ ✅ Unstructured Document Search → Fetches relevant reports, blueprints, and research notes from SharePoint.
+ ✅ OBO Authentication → Uses On-Behalf-Of (OBO) authentication to ensure users can only access data they are permitted to view.

For a detailed breakdown of how to create a single Azure AI Agent and configure its data (tools) connections, please refer to:
📌 [01-single-agents-with-azure-ai-agents.ipynb](01-single-agents-with-azure-ai-agents.ipynb).

In [9]:
from azure.core.exceptions import ServiceRequestError
from azure.ai.projects.aio import AIProjectClient


async def get_connection_id(client: AIProjectClient, env_var: str) -> Any:
    """
    Retrieves the connection object using a connection name stored in an environment variable.

    Args:
        client: The Azure AI Project client.
        env_var (str): The environment variable holding the connection name.

    Returns:
        Connection object if found, otherwise raises an error.
    """
    connection_name = os.getenv(env_var)
    if not connection_name:
        logger.error(f"Missing environment variable: '{env_var}'")
        raise ValueError(f"Environment variable '{env_var}' is required.")

    try:
        connection = await client.connections.get(connection_name=connection_name)
        logger.info(f"Retrieved Connection ID for {env_var}: {connection.id}")
        return connection
    except Exception as e:
        logger.error(f"Failed to retrieve connection for {env_var}: {e}")
        raise
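
Because `get_connection_id` raises as soon as an environment variable is missing, it can help to check all connection names up front. The sketch below is one possible preflight check, not part of the notebook: `missing_env_vars` and `REQUIRED_CONNECTIONS` are hypothetical names, and the variable names listed are the ones this notebook reads.

```python
import os
from typing import Iterable, Mapping


def missing_env_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names that are unset or blank in the given environment."""
    return [name for name in names if not env.get(name, "").strip()]


# Connection-name variables used by the tool setup cells in this notebook.
REQUIRED_CONNECTIONS = (
    "TOOL_CONNECTION_NAME_FABRIC",
    "TOOL_CONNECTION_NAME_SHAREPOINT",
    "TOOL_CONNECTION_NAME_BING",
)

# Example with an explicit mapping instead of the real environment.
example_env = {
    "TOOL_CONNECTION_NAME_FABRIC": "my-fabric-connection",  # placeholder value
    "TOOL_CONNECTION_NAME_BING": " ",  # blank values count as missing
}
missing = missing_env_vars(REQUIRED_CONNECTIONS, example_env)
print(missing)
```

Calling `missing_env_vars(REQUIRED_CONNECTIONS)` without the second argument checks the real process environment after `load_dotenv()` has run.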

In [4]:
from azure.ai.projects.models import (
    SharepointTool,
    FabricTool,
    ToolSet,
)

# Initialize Azure AI Agent settings
dataretrievalagent_settings = AzureAIAgentSettings.create()
toolset = ToolSet()

try:
    # Retrieve and add Fabric Tool
    fabric_connection = await get_connection_id(
        project_client, "TOOL_CONNECTION_NAME_FABRIC"
    )
    toolset.add(FabricTool(connection_id=fabric_connection.id))

    logger.info("Successfully created ToolSet Fabric tools.")
except Exception as e:
    logger.error(f"Failed to create ToolSet: {e}")
    raise

2025-03-24 22:21:05,118 - micro - MainProcess - INFO     Retrieved Connection ID for TOOL_CONNECTION_NAME_FABRIC: /subscriptions/47f1c914-e299-4953-a99d-3e34644cfe1c/resourceGroups/rg-mukeshag-5297_ai/providers/Microsoft.MachineLearningServices/workspaces/zhuoqunli-5026/connections/glucose_data_fabric (2375541312.py:get_connection_id:22)
INFO:micro:Retrieved Connection ID for TOOL_CONNECTION_NAME_FABRIC: /subscriptions/47f1c914-e299-4953-a99d-3e34644cfe1c/resourceGroups/rg-mukeshag-5297_ai/providers/Microsoft.MachineLearningServices/workspaces/zhuoqunli-5026/connections/glucose_data_fabric
2025-03-24 22:21:05,121 - micro - MainProcess - INFO     Successfully created ToolSet Fabric tools. (2698616309.py:<module>:17)
INFO:micro:Successfully created ToolSet Fabric tools.


In [None]:
dataretrievalagent_settings_definition = await project_client.agents.create_agent(
    model=dataretrievalagent_settings.model_deployment_name,
    name="FabricDataRetrievalAgent",
    description=(
        "An AI agent specialized in retrieving and integrating structured data from Microsoft Fabric. "
        "This includes performance metrics, experiment results, and analytics data. "
        "The agent is designed to assist in research and development by providing accurate, relevant, and actionable data. "
        "If no relevant data is found, the agent must clearly indicate this and provide suggestions for alternative queries or data sources."
    ),
    instructions=(
        "### Role & Objective\n"
        "You are a research-focused AI assistant responsible for retrieving structured data exclusively from Microsoft Fabric. "
        "Your goal is to provide precise, well-referenced, and relevant responses to support research and development efforts.\n\n"
        "### Data Retrieval & Prioritization\n"
        "1. **Structured Data (Microsoft Fabric):** \n"
        "   - Retrieve data from Microsoft Fabric when the query involves numerical metrics, performance statistics, or structured analytics.\n"
        "   - Data is available to support comparisons of the accuracy and reliability of two glucose monitoring products (Product A and Product B). "
        "This structured data evaluates performance across different glucose ranges and includes MARD percentages, accuracy within ±20 mg/dL/±20%, and the number of readings for each product.\n"
        "   - Example: clinical glucose monitoring studies between Product A or Product B.\n\n"
        "2. **Integrated Queries:** \n"
        "   - If the query requires multiple structured metrics, retrieve and integrate the results for a comprehensive response.\n"
        "   - Ensure clarity in presenting combined insights.\n\n"
        "3. **Fallback Behavior:** \n"
        "   - If no relevant data is found in Microsoft Fabric, respond with:\n"
        "     - A clear statement that no relevant data was found.\n"
        "     - Suggestions for alternative queries or data sources (if applicable).\n"
        "     - Example: 'No relevant data was found for the requested query in Microsoft Fabric. Consider refining your query or exploring other data sources.'\n\n"
        "### Response Quality\n"
        "1. **Accuracy & Relevance:** Always prioritize retrieving the most current and applicable data from Microsoft Fabric.\n"
        "2. **Clarity & Transparency:** Clearly indicate the data source(s) used and any limitations in the available information.\n"
        "3. **Fallback Handling:** If no relevant data is found, provide a professional and helpful fallback response as outlined above.\n"
        "4. **Professionalism:** Present findings in a structured and concise manner to facilitate decision-making.\n"
    ),
    toolset=toolset,
    headers={"x-ms-enable-preview": "true"},
    temperature=0.7,
    top_p=1,
    metadata={
        "use_case": "Structured Data Retrieval for R&D",
        "data_source": "Microsoft Fabric (structured metrics only)",
    },
)

# Print the agent's run ID (agent ID)
print(f"FabricDataRetrievalAgent Run ID: {dataretrievalagent_settings_definition.id}")

FabricDataRetrievalAgent Run ID: asst_z7gyKXpYqR7i7N2qIhZoAT6y


In [10]:
import json
import logging
from datetime import timedelta


def process_citations(content) -> str:
    """
    Return content plus any citations in Markdown format, ensuring no duplicated citations.

    Args:
        content: The content object retrieved from the agent.

    Returns:
        A string containing the agent's text plus any unique citations in Markdown format.
    """
    combined_content = content.content

    if hasattr(content, "items"):
        unique_citation_set = set()
        for item in content.items:
            if item.content_type == "annotation" and item.url:
                unique_citation_set.add((item.quote, item.url))

        if unique_citation_set:
            combined_content += "\n\n**Citations:**\n"
            for quote, url in unique_citation_set:
                combined_content += f"- **Quote**: {quote}  \n"
                combined_content += f"  **URL**: [{url}]({url})\n"

    return combined_content


async def run_agent(project_client, agent_id: str, user_input: str) -> tuple[str, str]:
    """
    Runs the agent identified by agent_id on a single user input and returns
    the conversation transcript together with the ID of the thread that was used.

    Args:
        project_client: A client object for interacting with Azure AI project resources.
        agent_id: The ID of the agent to retrieve from project_client.
        user_input: A string representing the user's query.

    Returns:
        A tuple (transcript, thread_id). The transcript is a single string with the user
        input and each agent response (enriched with citations).
    """
    # Initialize the string to include the user query
    responses = f"👤 User: {user_input}\n"

    # Fetch the agent definition using the provided ID
    agent_definition = await project_client.agents.get_agent(agent_id)

    # Create the AzureAIAgent instance
    agent = AzureAIAgent(
        client=project_client,
        definition=agent_definition,
        polling_options=RunPollingOptions(run_polling_interval=timedelta(seconds=1)),
    )

    # Create a new conversation thread
    thread = await project_client.agents.create_thread()

    try:
        # Send user input to the agent
        await agent.add_chat_message(thread_id=thread.id, message=user_input)

        # Stream non-tool responses from the agent
        async for content in agent.invoke(thread_id=thread.id):
            enriched_content = process_citations(content)
            responses += f"\n🤖 {agent.name}: {enriched_content}"

    except HttpResponseError as e:
        try:
            error_json = json.loads(e.response.content)
            message = error_json.get("Message", "No error message found")
            logging.error(f"❌ **HTTP Error:** {message}")
            responses += f"\n❌ **HTTP Error:** {message}"
        except json.JSONDecodeError:
            logging.error(f"❌ **Non-JSON Error Content:** {e.response.content}")
            responses += f"\n❌ **Non-JSON Error Content:** {e.response.content}"

    return responses, thread.id
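
`process_citations` collects citations in a `set`, so the order in which they are printed can differ between runs. The variant below is a sketch of the same de-duplication that keeps first-seen order by using a `dict`. `format_citations` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the attributes read above (`content_type`, `quote`, `url`); they are not Semantic Kernel types.

```python
from types import SimpleNamespace


def format_citations(items) -> str:
    """Render unique (quote, url) pairs as Markdown, in first-seen order."""
    seen = {}
    for item in items:
        if getattr(item, "content_type", None) == "annotation" and getattr(item, "url", None):
            # dict preserves insertion order, so duplicates are dropped without reordering
            seen.setdefault((item.quote, item.url), None)
    if not seen:
        return ""
    lines = ["**Citations:**"]
    for quote, url in seen:
        lines.append(f"- **Quote**: {quote}  ")
        lines.append(f"  **URL**: [{url}]({url})")
    return "\n".join(lines)


# Stand-in annotation items (hypothetical data, example.com URLs).
items = [
    SimpleNamespace(content_type="annotation", quote="MARD table", url="https://example.com/a"),
    SimpleNamespace(content_type="text", quote=None, url=None),
    SimpleNamespace(content_type="annotation", quote="MARD table", url="https://example.com/a"),
    SimpleNamespace(content_type="annotation", quote="Accuracy", url="https://example.com/b"),
]
rendered = format_citations(items)
print(rendered)
```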

In [11]:
user_query = "In which glucose ranges does Product A underperform compared to Product B, and what clinical impact could this have?"
fabric_response, threadID = await run_agent(
    project_client, "asst_iIMaKxWz4JUJ5YyWqoSR9t3n", user_query
)

### **Creating Azure AI Agents: SharePointDataRetrievalAgent**

The SharePointDataRetrievalAgent is responsible for retrieving and analyzing unstructured documents from SharePoint, such as research papers, legal documents, and product engineering files (PDFs). It complements the Fabric agent above, which handles structured data, while maintaining secure and authorized access controls.

Agent Capabilities
+ ✅ Unstructured Document Search → Fetches relevant reports, blueprints, and research notes from SharePoint.
+ ✅ OBO Authentication → Uses On-Behalf-Of (OBO) authentication to ensure users can only access data they are permitted to view.

For a detailed breakdown of how to create a single Azure AI Agent and configure its data (tools) connections, please refer to:
📌 [01-single-agents-with-azure-ai-agents.ipynb](01-single-agents-with-azure-ai-agents.ipynb).

In [6]:
from azure.ai.projects.models import (
    SharepointTool,
    ToolSet,
)

# Initialize Azure AI Agent settings
dataretrievalagent_settings = AzureAIAgentSettings.create()

toolset = ToolSet()

try:
    # Retrieve and add SharePoint Tool
    sharepoint_connection = await get_connection_id(
        project_client, "TOOL_CONNECTION_NAME_SHAREPOINT"
    )
    toolset.add(SharepointTool(connection_id=sharepoint_connection.id))

    logger.info("Successfully created ToolSet with SharePoint and Fabric tools.")
except Exception as e:
    logger.error(f"Failed to create ToolSet: {e}")
    raise

2025-03-24 19:55:01,855 - micro - MainProcess - INFO     Retrieved Connection ID for TOOL_CONNECTION_NAME_SHAREPOINT: /subscriptions/47f1c914-e299-4953-a99d-3e34644cfe1c/resourceGroups/rg-mukeshag-5297_ai/providers/Microsoft.MachineLearningServices/workspaces/zhuoqunli-5026/connections/ContosoAgentDemoSharepoint (2375541312.py:get_connection_id:22)
INFO:micro:Retrieved Connection ID for TOOL_CONNECTION_NAME_SHAREPOINT: /subscriptions/47f1c914-e299-4953-a99d-3e34644cfe1c/resourceGroups/rg-mukeshag-5297_ai/providers/Microsoft.MachineLearningServices/workspaces/zhuoqunli-5026/connections/ContosoAgentDemoSharepoint
2025-03-24 19:55:01,862 - micro - MainProcess - INFO     Successfully created ToolSet with SharePoint and Fabric tools. (2844402077.py:<module>:16)
INFO:micro:Successfully created ToolSet with SharePoint and Fabric tools.


In [10]:
dataretrievalagent_settings_definition = await project_client.agents.create_agent(
    model=dataretrievalagent_settings.model_deployment_name,
    name="SharePointDataRetrievalAgent",
    description=(
        "An AI agent specialized in retrieving and analyzing unstructured documents from SharePoint. "
        "This includes research papers, legal documents, and product engineering files (PDFs). "
        "The agent is designed to assist in research and development by providing accurate, relevant, and actionable insights. "
        "If no relevant data is found, the agent must clearly indicate this and provide suggestions for alternative queries or data sources."
    ),
    instructions=(
        "### Role & Objective\n"
        "You are a research-focused AI assistant responsible for retrieving and analyzing unstructured documents exclusively from SharePoint. "
        "Your goal is to provide precise, well-referenced, and relevant responses to support research and development, legal analysis, and product engineering efforts.\n\n"
        "### Data Retrieval & Prioritization\n"
        "1. **Unstructured Data (SharePoint):** \n"
        "   - Retrieve documents from SharePoint when the query involves research papers, legal documents, or product engineering files (PDFs).\n"
        "   - Focus on extracting key insights, summaries, and actionable information from the retrieved documents.\n"
        "   - Example: 'Retrieve research papers on Material X used in high-temperature environments,' or 'Find legal documents related to patent filings for Product A.'\n\n"
        "2. **Document Types:** \n"
        "   - Research Papers: Summarize findings, methodologies, and conclusions.\n"
        "   - Legal Documents: Extract key clauses, compliance requirements, and patent-related information.\n"
        "   - Product Engineering Files: Highlight design notes, test results, and engineering decisions.\n\n"
        "3. **Integrated Queries:** \n"
        "   - If the query spans multiple document types, retrieve and integrate the results for a comprehensive response.\n"
        "   - Ensure clarity in presenting combined insights.\n\n"
        "4. **Fallback Behavior:** \n"
        "   - If no relevant data is found in SharePoint, respond with:\n"
        "     - A clear statement that no relevant data was found.\n"
        "     - Suggestions for alternative queries or data sources (if applicable).\n"
        "     - Example: 'No relevant data was found for the requested query in SharePoint. Consider refining your query or exploring other data sources.'\n\n"
        "### Response Quality\n"
        "1. **Accuracy & Relevance:** Always prioritize retrieving the most current and applicable documents from SharePoint.\n"
        "2. **Clarity & Transparency:** Clearly indicate the data source(s) used and any limitations in the available information.\n"
        "3. **Fallback Handling:** If no relevant data is found, provide a professional and helpful fallback response as outlined above.\n"
        "4. **Professionalism:** Present findings in a structured and concise manner to facilitate decision-making.\n"
        "5. **Document Context:** Ensure that extracted insights are presented with sufficient context to maintain their relevance and accuracy.\n"
    ),
    toolset=toolset,
    headers={"x-ms-enable-preview": "true"},
    temperature=0.7,
    top_p=1,
    metadata={
        "use_case": "Unstructured Data Retrieval for R&D, Legal, and Engineering",
        "data_source": "SharePoint",
    },
)

# Print the agent's run ID (agent ID)
print(
    f"SharePointDataRetrievalAgent Run ID: {dataretrievalagent_settings_definition.id}"
)

SharePointDataRetrievalAgent Run ID: asst_x73A5QElh86JNelOI50PgH7T


In [12]:
user_query = "In which glucose ranges does Product A underperform compared to Product B, and what clinical impact could this have?"
sharepoint_response, threadID = await run_agent(
    project_client, "asst_x73A5QElh86JNelOI50PgH7T", user_query
)

In [8]:
from azure.ai.projects.models import (
    BingGroundingTool,
    AzureAISearchTool,
    FileSearchTool,
    ToolSet,
    VectorStore,
    OpenAIFile,
)

# Initialize Azure AI Agent settings
validationinsightagent_settings = AzureAIAgentSettings.create()

# Create a ToolSet to manage tools
toolset = ToolSet()

try:
    # Retrieve and add the Bing Grounding Tool
    bing_connection = await get_connection_id(
        project_client, "TOOL_CONNECTION_NAME_BING"
    )
    toolset.add(BingGroundingTool(connection_id=bing_connection.id))
    logger.info("Bing Grounding Tool added successfully.")

    # Retrieve and add the Azure AI Search Tool
    # search_connection = await get_connection_id(project_client, "TOOL_CONNECTION_NAME_SEARCH")
    # azure_ai_search_connection = AzureAISearchTool(
    #     index_connection_id=search_connection.id,
    #     index_name="ai-agentic-index"
    # )
    # toolset.add(azure_ai_search_connection)
    # logger.info("Azure AI Search Tool added successfully.")

    # Dynamically construct the PDF file path using os.path.join
    # pdf_file_path = os.path.join(target_directory, "data", "product_data", "ProductATechncialArchitecture.pdf")
    # logger.info(f"Using PDF file path: {pdf_file_path}")

    # file: OpenAIFile = await project_client.agents.upload_file_and_poll(file_path=pdf_file_path, purpose="assistants")
    # vector_store: VectorStore = await project_client.agents.create_vector_store_and_poll(
    #     file_ids=[file.id], name="my_vectorstore"
    # )

    # # 2. Create file search tool with uploaded resources
    # file_search = FileSearchTool(vector_store_ids=[vector_store.id])

    # toolset.add(file_search)
    # logger.info("Azure AI Search Tool added successfully.")

    # logger.info("Successfully created ToolSet with Bing and File Search tools.")
except Exception as e:
    logger.error(f"Failed to create ToolSet: {e}")
    raise

2025-03-24 16:46:45,869 - micro - MainProcess - INFO     Retrieved Connection ID for TOOL_CONNECTION_NAME_BING: /subscriptions/47f1c914-e299-4953-a99d-3e34644cfe1c/resourceGroups/rg-mukeshag-5297_ai/providers/Microsoft.MachineLearningServices/workspaces/zhuoqunli-5026/connections/agentsbinggrounding (2375541312.py:get_connection_id:22)
INFO:micro:Retrieved Connection ID for TOOL_CONNECTION_NAME_BING: /subscriptions/47f1c914-e299-4953-a99d-3e34644cfe1c/resourceGroups/rg-mukeshag-5297_ai/providers/Microsoft.MachineLearningServices/workspaces/zhuoqunli-5026/connections/agentsbinggrounding
2025-03-24 16:46:45,872 - micro - MainProcess - INFO     Bing Grounding Tool added successfully. (1038912367.py:<module>:19)
INFO:micro:Bing Grounding Tool added successfully.


In [9]:
dataretrievalagent_settings_definition = await project_client.agents.create_agent(
    model=validationinsightagent_settings.model_deployment_name,
    name="BingDataRetrievalAgent",
    description=(
        "An AI agent specialized in parsing user queries against Bing, refining intent, and gathering "
        "high-quality information from web-based sources. This includes identifying relevant research, "
        "news articles, reference documents, and other online data to support research and development. "
        "If no relevant data is found, the agent must clearly indicate this and suggest potential alternative "
        "queries or data sources."
    ),
    instructions=(
        "### Role & Objective\n"
        "You are a research-focused AI assistant that uses Bing to interpret user queries and retrieve "
        "quality data from the web. Your goal is to provide precise, relevant, and actionable insights.\n\n"
        "### Data Retrieval & Prioritization\n"
        "1. **Web-Based Data (Bing):** \n"
        "   - Interpret user queries to refine intent before searching using Bing.\n"
        "   - Retrieve relevant articles, academic papers, or any publicly available online data.\n"
        "   - Example: 'Find the latest research on Product A's performance in clinical trials' "
        "or 'Fetch breaking news about emerging technology in thermal materials.'\n\n"
        "2. **Data Integration:**\n"
        "   - If the query spans multiple topic areas, unify the results for a comprehensive response.\n"
        "   - Provide concise summaries and relevant direct evidence where possible.\n\n"
        "3. **Fallback Behavior:**\n"
        "   - If no relevant data is found, respond with:\n"
        "     - A clear statement that no relevant data was found via Bing.\n"
        "     - Suggestions for query refinement or alternative data sources.\n\n"
        "### Response Quality\n"
        "1. **Accuracy & Relevance:** Aim to gather highly credible, up-to-date online sources.\n"
        "2. **Clarity & Transparency:** Clearly indicate the data source(s) used and any known limitations.\n"
        "3. **Fallback Handling:** Provide a professional, helpful fallback response if no data is found.\n"
        "4. **Professionalism:** Present findings in a structured, concise manner to aid decision-making.\n"
        "5. **Contextual Summaries:** Ensure extracted insights are summarized with enough context to stay relevant.\n"
    ),
    toolset=toolset,
    headers={"x-ms-enable-preview": "true"},
    temperature=0.7,
    top_p=1,
    metadata={
        "use_case": "Web-Based Data Retrieval via Bing for R&D Insights",
        "data_source": "Bing",
    },
)

# Print the agent's run ID (the newly created agent's ID)
print(f"BingDataRetrievalAgent Run ID: {dataretrievalagent_settings_definition.id}")

BingDataRetrievalAgent Run ID: asst_1efh8YZc1dz6TvTpdi2qH9tR


In [13]:
user_query = "In which glucose ranges does Product A underperform compared to Product B, and what clinical impact could this have?"
bing_response, thread_id = await run_agent(
    project_client, "asst_1efh8YZc1dz6TvTpdi2qH9tR", user_query
)

In [14]:
print(bing_response)

👤 User: In which glucose ranges does Product A underperform compared to Product B, and what clinical impact could this have?

🤖 BingDataRetrievalAgent: To provide an accurate answer, I'll need to look up recent data comparing Product A and Product B in terms of glucose range performance. Please hold on while I gather the information.
🤖 BingDataRetrievalAgent: I couldn't find specific data on the glucose ranges where Product A underperforms compared to Product B. You might want to check recent clinical trials or detailed product comparison studies for precise information. If you have more specific details about the products (e.g., their names), I could refine the search further.


## Router 



In [15]:
from src.aoai.aoai_helper import AzureOpenAIManager
from src.chatapp.prompts import (
    generate_user_prompt,
    SYSTEM_PROMPT_PLANNER,
    SYSTEM_PROMPT_VERIFIER,
    generate_verifier_prompt,
)

In [16]:
az_manager = AzureOpenAIManager(
    api_key=os.getenv("AZURE_OPENAI_KEY_PS"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT_PS"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION_PS"),
    chat_model_name=os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_ID_PS"),
)

In [24]:
query = "In which glucose ranges does Product A underperform compared to Product B, and what clinical impact could this have?"

In [25]:
agents = await az_manager.generate_chat_response(
    query=generate_user_prompt(query),
    conversation_history=[],
    system_message_content=SYSTEM_PROMPT_PLANNER,
    response_format="json_object",
)

2025-03-24 22:35:22,819 - micro - MainProcess - INFO     Function generate_chat_response started at 2025-03-24 22:35:22 (aoai_helper.py:generate_chat_response:379)
INFO:micro:Function generate_chat_response started at 2025-03-24 22:35:22
2025-03-24 22:35:22,828 - micro - MainProcess - INFO     Sending request to Azure OpenAI at 2025-03-24 22:35:22 (aoai_helper.py:generate_chat_response:436)
INFO:micro:Sending request to Azure OpenAI at 2025-03-24 22:35:22
2025-03-24 22:35:23,937 - micro - MainProcess - INFO     Function generate_chat_response finished at 2025-03-24 22:35:23 (Duration: 1.12 seconds) (aoai_helper.py:generate_chat_response:490)
INFO:micro:Function generate_chat_response finished at 2025-03-24 22:35:23 (Duration: 1.12 seconds)


In [None]:
agents["response"]["agents_needed"]

['FabricDataRetrievalAgent', 'SharePointDataRetrievalAgent']
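
Once the planner returns the list of agent names, each name has to be mapped to an agent ID and run. The following is a minimal sketch of that dispatch step under stated assumptions: `AGENT_IDS`, `dispatch`, and `fake_runner` are hypothetical, the IDs are placeholders, and the stand-in runner replaces a wrapper around `run_agent` so the example executes without Azure access.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical registry: agent name -> agent ID. Replace the placeholders with real IDs.
AGENT_IDS = {
    "FabricDataRetrievalAgent": "<fabric-agent-id>",
    "SharePointDataRetrievalAgent": "<sharepoint-agent-id>",
    "BingDataRetrievalAgent": "<bing-agent-id>",
}


async def dispatch(
    agents_needed: list[str],
    query: str,
    runner: Callable[[str, str], Awaitable[str]],
) -> dict[str, str]:
    """Run every requested agent concurrently and map agent name -> response."""
    unknown = [name for name in agents_needed if name not in AGENT_IDS]
    if unknown:
        raise ValueError(f"Planner requested unknown agents: {unknown}")
    results = await asyncio.gather(
        *(runner(AGENT_IDS[name], query) for name in agents_needed)
    )
    # gather preserves input order, so names and results line up.
    return dict(zip(agents_needed, results))


# Stand-in runner (hypothetical) used in place of a real agent call.
async def fake_runner(agent_id: str, query: str) -> str:
    await asyncio.sleep(0)
    return f"{agent_id} answered: {query}"


responses = asyncio.run(
    dispatch(
        ["FabricDataRetrievalAgent", "SharePointDataRetrievalAgent"],
        "glucose ranges",
        fake_runner,
    )
)
print(responses)
```

Inside a notebook, where an event loop is already running, `await dispatch(...)` takes the place of `asyncio.run(...)`. A real runner would wrap `run_agent` and return only the transcript part of its result.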


In [20]:
evaluation = await az_manager.generate_chat_response(
    query=generate_verifier_prompt(
        query,
        fabric_data_summary=fabric_response,
        sharepoint_data_summary=sharepoint_response,
    ),
    conversation_history=[],
    system_message_content=SYSTEM_PROMPT_VERIFIER,
    response_format="json_object",
)

2025-03-24 22:23:48,654 - micro - MainProcess - INFO     Function generate_chat_response started at 2025-03-24 22:23:48 (aoai_helper.py:generate_chat_response:379)
INFO:micro:Function generate_chat_response started at 2025-03-24 22:23:48
2025-03-24 22:23:48,662 - micro - MainProcess - INFO     Sending request to Azure OpenAI at 2025-03-24 22:23:48 (aoai_helper.py:generate_chat_response:436)
INFO:micro:Sending request to Azure OpenAI at 2025-03-24 22:23:48
2025-03-24 22:23:50,680 - micro - MainProcess - INFO     Function generate_chat_response finished at 2025-03-24 22:23:50 (Duration: 2.03 seconds) (aoai_helper.py:generate_chat_response:490)
INFO:micro:Function generate_chat_response finished at 2025-03-24 22:23:50 (Duration: 2.03 seconds)


In [23]:
evaluation["response"]

{'status': 'Denied',
 'reason': "The data from Fabric indicates that Product A underperforms compared to Product B in all glucose ranges, whereas SharePoint data suggests underperformance only in glucose ranges below 70 mg/dL and above 180 mg/dL. This inconsistency in the glucose ranges where Product A underperforms creates ambiguity. Additionally, the clinical impact described needs more comprehensive details for each glucose range to fully address the user's query.",
 'response': '',
 'rewritten_query': 'What are the specific glucose ranges where Product A underperforms compared to Product B, and what are the detailed clinical impacts for each range?'}
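
The verifier result above carries everything needed to decide the next move in the reflection loop. The sketch below shows one way to turn it into an explicit decision. `NextStep` and `next_step` are hypothetical names, and treating `"Approved"` as the counterpart of `"Denied"` is an assumption, since only a denied result appears in this notebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NextStep:
    action: str   # "deliver" or "retry"
    payload: str  # final response, or the query to retry with


def next_step(evaluation: dict, original_query: str) -> NextStep:
    """Map a verifier result to the next action in the reflection loop."""
    status = str(evaluation.get("status", "")).strip().lower()
    # Assumption: an approved result has status "Approved" and a non-empty response.
    if status == "approved" and evaluation.get("response"):
        return NextStep("deliver", evaluation["response"])
    # Anything else is treated as a denial: retry with the rewritten query if present.
    return NextStep("retry", evaluation.get("rewritten_query") or original_query)


# Shortened example payload with the same keys as the verifier output.
denied = {
    "status": "Denied",
    "reason": "Sources disagree.",
    "response": "",
    "rewritten_query": "What are the specific glucose ranges?",
}
step = next_step(denied, "original question")
print(step)
```

Falling back to the original query when `rewritten_query` is empty keeps the loop moving even if the verifier omits a rewrite.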

## Validation

In [29]:
from azure.ai.projects.models import (
    SharepointTool,
    ToolSet,
)

# Initialize Azure AI Agent settings
validationinsightsagent_settings = AzureAIAgentSettings.create()

In [30]:
validationinsightsagent_settings_definition = await project_client.agents.create_agent(
    model=validationinsightsagent_settings.model_deployment_name,
    name="ValidationInsightsAgent",
    description=(
        "A cross-referencing AI agent that compares data from two sources: the Fabric agent (structured data) and the "
        "SharePoint agent (unstructured data). If the information is incomplete or contradictory, the agent returns a refined "
        "query or additional instructions. If validated, it produces a final, comprehensive response marked 'APPROVED:', "
        "providing in-depth details and citations from each source."
    ),
    instructions=(
        "### Role & Objective\n"
        "You are responsible for verifying data from:\n"
        "1. Microsoft Fabric (structured data, via the Fabric Agent).\n"
        "2. SharePoint (unstructured data, via the SharePoint Agent).\n\n"
        "Your main goal is to:\n"
        " - Ensure that the combined data set is correct, consistent, and sufficient to answer the user's query.\n"
        " - If the data is valid, produce a final, detailed response (including citations).\n"
        " - If the data is insufficient or conflicting, return a refined query or instructions on what additional data is needed.\n\n"
        "### Validation Flow\n"
        "1. **Compare & Contrast (Cross-Check)**:\n"
        "   - Examine metrics, text descriptions, and any references from Fabric.\n"
        "   - Compare these to the corresponding documents, notes, or references from SharePoint.\n"
        "   - Identify any gaps, contradictions, or outdated information.\n\n"
        "2. **Outcome Determination**:\n"
        "   - **Refined Query / Retry**:\n"
        "     - If data is incomplete or contradictory, list the specific points that are unclear or missing.\n"
        "     - Provide a suggested refined query or instructions for how the retrieval agents should update their search.\n\n"
        "   - **Approved & Detailed Response**:\n"
        "     - If the data is consistent and sufficient to address the user's query, output a consolidated response.\n"
        "     - Begin your final answer with the keyword 'APPROVED:'.\n"
        "     - Include a **detailed explanation** of how the data answers the query, referencing each relevant data point.\n"
        "     - Present **all applicable citations** or URLs from both Fabric and SharePoint, so the user can see the sources.\n"
        "     - Summarize key points, numerical insights, or textual findings in a structured layout (lists, bullet points, etc.).\n"
        "     - If needed, include disclaimers or clarifications about any assumptions.\n\n"
        "### Citations & Transparency\n"
        "1. **Explicit Source Markers**:\n"
        "   - Label each piece of data with its origin (e.g., 'Fabric Query 2025-03-24', or 'SharePoint Document: [URL or title]').\n"
        "2. **Provide Hyperlinks or Document IDs**:\n"
        "   - Where possible, include direct links to the SharePoint document or an identifier for the Fabric dataset.\n"
        "3. **Clarity in Narration**:\n"
        "   - The final user-facing summary must show how each citation supports the conclusion.\n\n"
        "### Response Quality\n"
        "1. **Completeness**:\n"
        "   - If you produce 'APPROVED:' content, it must offer a thorough explanation, ensuring the user sees how the data fully "
        "answers the query.\n"
        "2. **Readability & Professional Tone**:\n"
        "   - Keep your explanation and citations organized and clear.\n"
        "3. **Fallback Handling**:\n"
        "   - If no relevant data is found from either source, explicitly state so and provide suggestions for alternative queries.\n"
        "4. **Structured Summaries**:\n"
        "   - Utilize concise paragraphs, bullet points, or short headings for clarity.\n"
    ),
    toolset=toolset,
    headers={"x-ms-enable-preview": "true"},
    temperature=0.7,
    top_p=1,
    metadata={
        "use_case": "Cross-Referencing & Validation for R&D Insights",
        "data_source": "Fabric & SharePoint",
    },
)

print(
    f"ValidationInsightsAgent ID: {validationinsightsagent_settings_definition.id}"
)

ValidationInsightsAgent ID: asst_9Fa5VtMRAjyv8PvArrAy1tAx


In [None]:
user_query = '''
👤 User: In which glucose ranges does Product A underperform compared to Product B, and what clinical impact could this have?

🤖 SharePointDataRetrievalAgent: Product A underperforms compared to Product B in glucose ranges where interstitial fluid readings differ significantly from capillary blood glucose measurements. This discrepancy can lead to potential miscalculations in insulin dosing, increasing the risk of hypoglycemia or hyperglycemia. Clinically, this could result in adverse events such as severe hypoglycemic episodes or long-term complications from poor glycemic control【3:2†source】【3:3†source】.

**Citations:**
- **Quote**: 【3:2†source】  
  **URL**: [https://microsoft.sharepoint.com/teams/ContosoDemoAgents/Shared Documents/R+D Team/Product A & B Comparison.pdf](https://microsoft.sharepoint.com/teams/ContosoDemoAgents/Shared Documents/R+D Team/Product A & B Comparison.pdf)
- **Quote**: 【3:3†source】  
  **URL**: [https://microsoft.sharepoint.com/teams/ContosoDemoAgents/Shared Documents/Legal Team/Regulatory Requirements.pdf](https://microsoft.sharepoint.com/teams/ContosoDemoAgents/Shared Documents/Legal Team/Regulatory Requirements.pdf)'''
# Send the combined transcript to the validation agent created above
agent_response = await run_agent(project_client, validationinsightsagent_settings_definition, user_query)

### **3. Creating a Multi-Agent System with the Semantic Kernel Agent Framework**

The following sample loads three existing Azure AI agents (SharePoint retrieval, Fabric retrieval, and validation) into a Semantic Kernel `AgentGroupChat`. A prompt-based selection strategy decides which agent speaks next, and a termination strategy ends the chat once the validation agent approves the answer.


In [66]:
import os
import json
import logging
import asyncio
import streamlit as st
import traceback
from azure.core.exceptions import HttpResponseError
from azure.identity.aio import DefaultAzureCredential
from semantic_kernel import Kernel
from semantic_kernel.agents import AgentGroupChat
from semantic_kernel.agents.azure_ai import AzureAIAgent
from semantic_kernel.agents.strategies import (
    KernelFunctionSelectionStrategy,
    KernelFunctionTerminationStrategy,
)
from semantic_kernel.functions import KernelFunctionFromPrompt
from semantic_kernel.contents import ChatHistoryTruncationReducer
from dotenv import load_dotenv

In [67]:
# Agent name constants
SHAREPOINT_AGENT = "SharePointDataRetrievalAgent"
FABRIC_AGENT = "FabricDataRetrievalAgent"
VERIFIER_AGENT = "ValidationInsightsAgent"


def create_kernel():
    from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

    kernel = Kernel()
    kernel.add_service(
        AzureChatCompletion(
            deployment_name=os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_ID"),
            api_key=os.getenv("AZURE_OPENAI_KEY"),
            endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
            api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
        )
    )
    return kernel


selection_function = KernelFunctionFromPrompt(
    function_name="selection",
    prompt=f"""
    You are the routing orchestrator. Examine the provided conversation and determine which participant
    should speak next. Always respond with the name of exactly one participant, and no explanation.

    The participants are:
    - {SHAREPOINT_AGENT}
    - {FABRIC_AGENT}
    - {VERIFIER_AGENT}

    Conversation History:
    {{{{$history}}}}

    Routing Rules (simplified example):
    1. If the last speaker is the User, choose {SHAREPOINT_AGENT}.
    2. If the last speaker is {SHAREPOINT_AGENT}, choose {FABRIC_AGENT}.
    3. If the last speaker is {FABRIC_AGENT}, choose {VERIFIER_AGENT}.
    4. If the last speaker is {VERIFIER_AGENT}, choose {SHAREPOINT_AGENT}.

    Important:
    - Do not choose the same participant who just spoke last.
    - Return only the name of the next participant as plain text.
    """,
)

termination_function = KernelFunctionFromPrompt(
    function_name="termination",
    prompt=f"""
    You are the workflow terminator. Determine if the ValidationInsightsAgent has fully approved
    the retrieved information.

    Conversation History:
    {{{{$history}}}}

    - If the final message from {VERIFIER_AGENT} contains the keyword "APPROVED:", output "yes".
    - Otherwise, output "no".
    """,
)


logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)


def debug_result_parser(result):
    logging.debug(f"Selection function output: {result}")
    return str(result.value[0]).strip() if result.value[0] is not None else FABRIC_AGENT
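
The routing rules in the selection prompt are deterministic, so they can also be written as plain Python, which is handy for unit-testing the expected turn order before a model is involved. This is a sketch only: `next_speaker` and `is_approved` are hypothetical helpers, not Semantic Kernel APIs, and the agent names repeat the constants above so the block is self-contained.

```python
from typing import Optional

SHAREPOINT_AGENT = "SharePointDataRetrievalAgent"
FABRIC_AGENT = "FabricDataRetrievalAgent"
VERIFIER_AGENT = "ValidationInsightsAgent"

# Round-robin order implied by routing rules 2-4 in the selection prompt.
_NEXT = {
    SHAREPOINT_AGENT: FABRIC_AGENT,
    FABRIC_AGENT: VERIFIER_AGENT,
    VERIFIER_AGENT: SHAREPOINT_AGENT,
}


def next_speaker(last_speaker: Optional[str]) -> str:
    """Return the next agent; the user (or any unknown speaker) hands off to SharePoint."""
    return _NEXT.get(last_speaker, SHAREPOINT_AGENT)


def is_approved(last_verifier_message: str) -> bool:
    """Mirror the termination rule: the verifier's message contains 'APPROVED:'."""
    return "APPROVED:" in last_verifier_message


speaker = None  # None stands in for the user's turn
order = []
for _ in range(4):
    speaker = next_speaker(speaker)
    order.append(speaker)
print(order)
```

Comparing the model-driven selection against a function like this in tests can surface cases where the prompt-based router drifts from the intended order.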

In [68]:
# Utility Logging
def log_agent_invocation(agent_name, role, input_data, output_data, metadata):
    trace_data = {
        "agent_name": agent_name,
        "role": role,
        "input": input_data,
        "output": output_data,
        "metadata": metadata,
    }
    logger.info("AGENT INVOCATION:\n" + json.dumps(trace_data, indent=4))


# Main Async Function
async def main():
    async with DefaultAzureCredential() as creds, AzureAIAgent.create_client(
        credential=creds
    ) as client:
        # Map each agent name to the ID of an agent that already exists in the project
        agents_ids = {
            SHAREPOINT_AGENT: "asst_5eeFoBOTSod13UeN6eOIdMrs",
            FABRIC_AGENT: "asst_gKCbHiEH1iGiEc4YRla1VFZ2",
            VERIFIER_AGENT: "asst_w9E6NQ12RjMkWAVJ1lvMM3UE",
        }

        # Load the three agent definitions once and wrap them for Semantic Kernel
        agents = {}
        for name, agent_id in agents_ids.items():
            definition = await client.agents.get_agent(agent_id)
            agents[name] = AzureAIAgent(client=client, definition=definition)

        # Build the multi-agent group chat
        chat = AgentGroupChat(
            agents=list(agents.values()),
            selection_strategy=KernelFunctionSelectionStrategy(
                function=selection_function,
                kernel=create_kernel(),
                result_parser=debug_result_parser,
                agent_variable_name="agents",
                history_variable_name="history",
            ),
            termination_strategy=KernelFunctionTerminationStrategy(
                agents=[agents[VERIFIER_AGENT]],
                function=termination_function,
                kernel=create_kernel(),
                result_parser=lambda result: "yes" in str(result.value[0]).lower(),
                history_variable_name="history",
                maximum_iterations=6,
            ),
        )

        # Shared context message for all agents in the group chat
        SYSTEM_MESSAGE = """
        You are an intelligent multi-agent R&D assistant designed to help Product Managers quickly access, integrate, and evaluate information from multiple specialized sources.
        Aim to support fast and accurate decision-making for R&D insights, leveraging internal and external data comprehensively and effectively.
        """
        await chat.add_chat_message(SYSTEM_MESSAGE)

        user_message = "In which glucose ranges does Product A underperform compared to Product B, and what clinical impact could this have?"
        await chat.add_chat_message(message=user_message)
        print(f"👤 User: {user_message}\n")
        try:
            async for content in chat.invoke():
                agent_name = content.name or "Unknown"
                agent_role = content.role.name
                agent_output = content.content
                # Collect citations from the message's annotation items
                citations = []
                for item in content.items:
                    if item.content_type == "annotation":
                        citation = {"quote": item.quote, "url": item.url}
                        citations.append(citation)
                # Print the turn in a structured layout
                print(f"AGENT ROLE: {agent_role}")
                print(f"AGENT NAME: {agent_name}")
                print("OUTPUT:", agent_output)
                print("CITATIONS:")
                for citation in citations:
                    print(f"  - Quote: {citation['quote']}, URL: {citation['url']}")
                print("-" * 80)

        except HttpResponseError as err:
            logger.error(f"Http Error during conversation: {err}")

        except Exception as e:
            logger.error(f"General Error during conversation: {e}")

        finally:
            chat.is_complete = False
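
An agent message can list the same citation marker more than once. Below is a minimal sketch of collapsing the extracted citations into unique (quote, URL) pairs, using plain dictionaries shaped like the `citation` dicts built in `main()`. `dedupe_citations` is a hypothetical helper, and the markers and URLs in `sample` are placeholders rather than real notebook output.

```python
from typing import Dict, Iterable, List


def dedupe_citations(citations: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop repeated (quote, url) pairs while keeping first-seen order."""
    seen = set()
    unique = []
    for c in citations:
        key = (c.get("quote"), c.get("url"))
        if key not in seen:
            seen.add(key)
            unique.append(c)
    return unique


# Placeholder citations; the second entry repeats the first.
sample = [
    {"quote": "[4:1]", "url": "https://example.com/a.pdf"},
    {"quote": "[4:1]", "url": "https://example.com/a.pdf"},
    {"quote": "[4:2]", "url": "https://example.com/b.pdf"},
]

unique_citations = dedupe_citations(sample)
for c in unique_citations:
    print(f"  - Quote: {c['quote']}, URL: {c['url']}")
```

Citations that share a marker but point at different URLs are kept as separate entries, since the pair is the deduplication key.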

In [69]:
await main()

👤 User: In which glucose ranges does Product A underperform compared to Product B, and what clinical impact could this have?

AGENT ROLE: ASSISTANT
AGENT NAME: SharePointDataRetrievalAgent
OUTPUT: Product A underperforms compared to Product B in glucose ranges where it consistently records lower glucose levels than the reference glucometer. Specifically, while Product B tends to overestimate, Product A tends to underestimate glucose levels【4:1†source】.

The clinical impact of this underperformance is significant. Inaccurate glucose readings can lead to inappropriate insulin dosing decisions, potentially resulting in hypoglycemia if glucose levels are underestimated. This is especially critical in situations requiring precise glucose control【4:1†source】.
CITATIONS:
  - Quote: 【4:1†source】, URL: https://microsoft.sharepoint.com/teams/ContosoDemoAgents/Shared Documents/R+D Team/Product A & B Comparison.pdf
  - Quote: 【4:1†source】, URL: https://microsoft.sharepoint.com/teams/ContosoDemoAge