# 🤖🚀 Lab: Building a Vector-Based AI Solution with Multimodal RAG Pattern

## Introduction
In this lab, you will learn how to set up and build a comprehensive AI solution by leveraging a multimodal Retrieval-Augmented Generation (RAG) pattern. Follow the steps outlined to achieve the learning objectives.

![image.png](attachment:image.png)

## Learning Objectives 🎯

1. [**Setup and Environment Configuration**](#setup-and-environment-configuration): Begin by setting up your development environment and configuring necessary tools and libraries.
2. [**Building a Vector Database in Azure**](#building-a-vector-database-in-azure): Create and manage a vector-based data store leveraging Azure AI Search, with integrated vectorization for efficient indexing and searching.
3. [**Enable Multimodality in Your AI Solution**](#enable-multimodality-in-your-ai-solution): Incorporate text, images, and other data types to provide a richer, more dynamic user experience.
4. [**Master Orchestration Frameworks with Semantic Kernel**](#master-orchestration-frameworks-with-semantic-kernel): Gain a deep understanding of how Microsoft’s Semantic Kernel enables seamless coordination of multiple Azure AI services to build scalable and intelligent AI systems.
5. [**Build a Backend Orchestration Framework**](#build-a-backend-orchestration-framework): Establish a robust architecture that coordinates services and tasks for your AI solution.
6. [**Develop a Front-End with Streamlit**](#develop-a-front-end-with-streamlit): Implement a streamlined interface that interacts with backend microservices and components, facilitating a fully functional AI solution.

## Setup and Environment Configuration

Begin by setting up your development environment. Please refer to the detailed instructions in [**assets/set-up.ipynb**](assets/set-up.ipynb).

#### Checklist

- [ ] Verify that the environment is activated correctly

In [21]:
import importlib

# Map each import name to the pip package that provides it
required_libraries = {
    "azure.ai.documentintelligence": "azure-ai-documentintelligence",
    "azure.search.documents": "azure-search-documents",
    "openai": "openai",
    "streamlit": "streamlit",
    "semantic_kernel": "semantic-kernel",
    "dotenv": "python-dotenv",
}

# Check if libraries are installed
print("Checking required libraries...\n")
all_installed = True
for lib, package in required_libraries.items():
    try:
        importlib.import_module(lib)
        print(f"✅ {lib} is installed.")
    except ImportError:
        print(f"❌ {lib} is NOT installed. Please install it using 'pip install {package}'.")
        all_installed = False

if not all_installed:
    print("\nSome libraries are not installed. Please return to the setup notebook to review the configuration.")

print("\nLibrary check completed.")

Checking required libraries...

✅ azure.ai.documentintelligence is installed.
✅ azure.search.documents is installed.
✅ openai is installed.
✅ streamlit is installed.
✅ semantic_kernel is installed.
✅ dotenv is installed.

Library check completed.


In [22]:
import os

# Print your current directory
print(f"Current Directory: {os.getcwd()}")

Current Directory: c:\Users\pablosal\Desktop\azure-ai-engineer-in-five-weeks


In [23]:

# Define the target directory
target_directory = r"/Users/pablosal/Desktop/azure-ai-engineer-in-five-weeks"  # change your directory to the root folder

# Check if the directory exists
if os.path.exists(target_directory):
    # Change the current working directory
    os.chdir(target_directory)
    print(f"Directory changed to {os.getcwd()}")
else:
    print(f"Directory {target_directory} does not exist.")

Directory changed to c:\Users\pablosal\Desktop\azure-ai-engineer-in-five-weeks


## Building a Vector Database in Azure

In this section, we'll delve into **Context Engineering**: the ingestion and indexing of new data, a practice often referred to as the "new data engineering paradigm." Our objective is to prepare diverse internal data sources so that the Retrieval-Augmented Generation (RAG) pipeline can use them to produce relevant, context-rich responses.

![image-2.png](attachment:image-2.png)

When our data (text, PDFs, audio, or other formats) is carefully indexed, the chatbot can retrieve accurate, grounded information and use it to improve its generated answers.

#### Key Points

1. **New Data Ingestion**  
   - **Collection**: Gather incoming data from various sources, including documents, databases, URLs, audio files, and PDFs.  
   - **Pre-processing**: Ensure this data undergoes appropriate pre-processing to standardize formats and enhance quality.  
   - **Metadata Organization**: Systematically organize and label metadata to facilitate seamless context retrieval in subsequent stages.

2. **Indexing & Vectorization**  
   - **Vector Conversion**: Transform content into vector embeddings using Azure AI services or other embedding generation tools.  
   - **Storage**: Store these embeddings within a vector database, enabling semantic matching and retrieval that surpasses traditional keyword-based searches.

3. **Context-Driven Responses**  
   - **Automated Retrieval**: Once indexed, content can be automatically retrieved during query execution, providing the Large Language Model (LLM) with pertinent context for informed responses.  
   - **Handling Unknowns**: In scenarios where the context is absent from the database, the system is designed to respond with "I don’t know," maintaining the integrity and grounding of its answers.

4. **Retrieval Pipeline**  
   - **Azure AI Search Utilization**: Leverage Azure AI Search or similar vector-enabled search services to access robust retrieval methods, including semantic, keyword, or hybrid searches.  
   - **Strategic Adaptation**: This versatility allows for the adaptation of retrieval strategies tailored to the specific nature of incoming queries.

In summary, **Context Engineering & Retrieval** is pivotal in preparing data for efficient indexing, ensuring that the RAG pipeline can swiftly locate and provide precise context to the LLM. This foundational step is essential for the reliability and scalability of the AI system's responses, laying the groundwork for an effective AI solution.
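
To make the retrieve-or-decline behaviour described above concrete, here is a minimal, framework-free sketch. It is not how Azure AI Search works internally: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the `min_score` threshold, the sample documents, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, index: list[tuple[str, Counter]], min_score: float = 0.2) -> str:
    """Return the best-matching document, or "I don't know" when nothing is close enough."""
    q = embed(query)
    best_text, best_score = max(
        ((text, cosine(q, vec)) for text, vec in index),
        key=lambda pair: pair[1],
        default=("", 0.0),
    )
    return best_text if best_score >= min_score else "I don't know"


# Hypothetical documents, indexed once up front
documents = [
    "PerksPlus reimburses gym memberships and fitness classes",
    "Performance reviews are conducted annually by the manager",
]
index = [(doc, embed(doc)) for doc in documents]

print(retrieve("which fitness classes are reimbursed", index))
print(retrieve("quantum computing roadmap", index))
```

The first query shares terms with the PerksPlus document and clears the threshold, while the second has no overlap with anything in the index and falls back to "I don't know". A real pipeline would swap `embed` for an embedding model and the list scan for a vector index.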


To automate the indexing of policy documents into Azure AI Search, you can utilize the `PolicyIndexingPipeline` class, which streamlines the process by setting up necessary components and configurations. Here's a concise guide to help you understand and implement this pipeline effectively:

### Initialize the Pipeline
Begin by importing and creating an instance of the `PolicyIndexingPipeline` class:



In [24]:
from src.aisearch.run import PolicyIndexingPipeline
indexer = PolicyIndexingPipeline()



This step prepares the pipeline for subsequent configurations and operations.

### Load Configurations
The pipeline loads necessary settings from a YAML configuration file, ensuring that all parameters required for indexing are properly set.

### Set Up Azure Services
The pipeline establishes connections to Azure services, including Azure Blob Storage and Azure AI Search, facilitating seamless data flow and management.

### Prepare Clients
Clients for interacting with Azure Blob Storage and Azure AI Search are initialized, enabling operations such as uploading documents and managing search indexes.
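
Client creation typically fails with hard-to-read errors when a connection setting is missing, so a quick pre-flight check can save time. The sketch below checks a few environment variable names that this notebook reads in later cells. Whether `PolicyIndexingPipeline` reads these same variables (rather than values from its YAML file) is an assumption, and `missing_env_vars` is a hypothetical helper, not part of the lab's code.

```python
import os
from typing import Mapping, Sequence

# Variable names used by later cells in this notebook
REQUIRED_ENV_VARS = (
    "AZURE_AI_SEARCH_SERVICE_ENDPOINT",
    "AZURE_AI_SEARCH_ADMIN_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_KEY",
)


def missing_env_vars(names: Sequence[str], env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names that are unset or empty in `env`."""
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```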

### Indexing Process
With the pipeline configured, you can proceed to index your policy documents. The process involves:

1. **Data Ingestion**: Documents are retrieved from the specified storage location.
2. **Data Chunking**: Large documents are divided into smaller, manageable chunks to enhance search performance.
3. **Data Enrichment**: Optional AI enrichment can be applied to extract key phrases, entities, or perform language detection.
4. **Vectorization**: Content is transformed into vector representations, enabling semantic search capabilities.
5. **Indexing**: Processed data is indexed into Azure AI Search, making it searchable.

### Execution
We are leveraging [Integrated vectorization - Azure AI Search](https://learn.microsoft.com/en-us/azure/search/vector-search-integrated-vectorization). It streamlines the process of converting text into vector embeddings during both indexing and querying, enhancing search capabilities by enabling semantic search and similarity matching.

### Key Components

- **Indexer**: Retrieves raw data from supported data sources and initiates data enrichment, including chunking and vectorization.
- **Skillset**: Defines the processing steps, such as:
  - **Text Split Skill**: Chunks large documents into manageable pieces to meet embedding model token limits.
  - **Embedding Skills**: Generate vector representations of text, utilizing models like Azure OpenAI's text-embedding-ada-002 or custom models.
- **Vector Index**: Stores the vectorized content, facilitating efficient similarity searches.
- **Vectorizer**: Applied at query time, it converts user input into vectors using the same embedding model as during indexing, ensuring consistency in search operations.
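
To build intuition for what the chunking step does, the sketch below splits text into fixed-size, overlapping chunks in plain Python. It is a simplification under stated assumptions: it cuts on character counts only, whereas the service-side skill has its own splitting modes and boundary handling, and the `max_chars` and `overlap` defaults are illustrative rather than the lab's settings.

```python
def split_into_chunks(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split `text` into chunks of at most `max_chars`, each sharing `overlap` characters with the previous one."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("max_chars must be positive and overlap must be in [0, max_chars)")

    step = max_chars - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


print(split_into_chunks("abcdefghij", max_chars=4, overlap=1))
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.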

### Implementation Steps

1. **Data Ingestion**: Use an indexer to pull data from sources like Azure Blob Storage.
2. **Data Chunking**: Apply the Text Split Skill to divide documents into smaller chunks, accommodating embedding model constraints.
3. **Vectorization**: Employ embedding skills to transform text chunks into vector embeddings.
4. **Indexing**: Store the vector embeddings in a vector index within Azure AI Search.
5. **Query Processing**: Configure a vectorizer in the search index to convert user queries into vectors at runtime, enabling semantic search capabilities.

### Benefits

- **Simplified Workflow**: Reduces the need for external processing pipelines by integrating chunking and vectorization directly into Azure AI Search.
- **Scalability**: Supports large-scale document processing with efficient data chunking and embedding.


![image.png](attachment:image.png)


In [25]:
# Upload Document to Landing Zone Blob Storage
indexer.upload_documents(local_path="weeks/week-2/assets/contosodata")

2025-01-15 15:05:39,494 - micro - MainProcess - INFO     Uploaded weeks/week-2/assets/contosodata\Benefit_Options.pdf to lab_rawdata_ocr\Benefit_Options.pdf (run.py:upload_documents:161)
2025-01-15 15:05:39,765 - micro - MainProcess - INFO     Uploaded weeks/week-2/assets/contosodata\employee_handbook (2).pdf to lab_rawdata_ocr\employee_handbook (2).pdf (run.py:upload_documents:161)
2025-01-15 15:05:39,967 - micro - MainProcess - INFO     Uploaded weeks/week-2/assets/contosodata\PerksPlus.pdf to lab_rawdata_ocr\PerksPlus.pdf (run.py:upload_documents:161)
2025-01-15 15:05:40,367 - micro - MainProcess - INFO     Uploaded weeks/week-2/assets/contosodata\role_library.pdf to lab_rawdata_ocr\role_library.pdf (run.py:upload_documents:161)


In [26]:
# Create Data Source (Connect Blob)
indexer.create_data_source()

2025-01-15 15:05:40,701 - micro - MainProcess - INFO     Data source 'lab-ai-blob' created or updated (run.py:create_data_source:186)


In [27]:
# Create the index. Please check https://github.com/pablosalvador10/gbbai-azure-ai-search-indexing/blob/main/01-creation-indexes.ipynb for a deeper discussion and explanation of how to create indexes.
indexer.create_index()

2025-01-15 15:05:40,901 - micro - MainProcess - INFO     Index 'lab-ai-index' created or updated successfully. (run.py:create_index:312)


In [28]:
# Create Skillset
indexer.create_skillset()

2025-01-15 15:05:41,911 - micro - MainProcess - INFO     Skillset 'lab-ai-skillset' created or updated (run.py:create_skillset:530)


In [29]:
# Create Indexer
indexer.create_indexer()

2025-01-15 15:05:55,566 - micro - MainProcess - INFO     Indexer 'lab-ai-indexer' created or updated (run.py:create_indexer:564)


In [30]:
# Run the Indexer (Index the Documents)
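from src.aisearch.run import IndexerRunner
# Use a separate name so the PolicyIndexingPipeline instance in `indexer` is not overwritten
indexer_runner = IndexerRunner(indexer_name="lab-ai-indexer")
indexer_runner.monitor_indexer_status()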

2025-01-15 15:05:55,929 - micro - MainProcess - INFO     Indexer 'lab-ai-indexer' has been started. (run.py:run_indexer:636)
2025-01-15 15:05:55,975 - micro - MainProcess - INFO     Indexer Status: running (run.py:monitor_indexer_status:680)
2025-01-15 15:05:55,982 - micro - MainProcess - INFO     Last Run Time: 2025-01-15 19:56:01.497000+00:00 (run.py:monitor_indexer_status:681)
2025-01-15 15:05:55,983 - micro - MainProcess - INFO     Execution Status: success (run.py:monitor_indexer_status:682)
2025-01-15 15:05:55,983 - micro - MainProcess - INFO     Indexer 'lab-ai-indexer' completed successfully. (run.py:monitor_indexer_status:690)


### In Class Challenge #1

#### Objective
Explore various retrieval methodologies in Azure's Vector Database to optimize search results through indexing and query processes.

- **Explore Retrieval Methods**
  - Combine vector search with keyword search.
  - Compare semantic vs. term-based relevance.

- **Document Quality Matters**
  - Analyze the impact of document formatting.
  - Address data inconsistencies.

- **LLM and Retrieval**
  - Understand LLM reactions to poorly formatted data during RAG.
  - Test retrieval system handling of edge cases.

- **Iterate and Optimize**
  - Refine the search pipeline.
  - Adjust embeddings or indexing for better ranking and relevance.
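
When you combine vector and keyword search, the two ranked lists have to be merged into one. Azure AI Search documents Reciprocal Rank Fusion (RRF) as its merging method for hybrid queries. The sketch below is a plain-Python illustration of the RRF idea, not the service's implementation: the two input rankings are made up, and `k = 60` is a commonly used constant assumed here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists: each document earns 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings for the same query from two retrieval methods
keyword_ranking = ["PerksPlus.pdf", "Benefit_Options.pdf", "role_library.pdf"]
vector_ranking = ["Benefit_Options.pdf", "employee_handbook.pdf", "PerksPlus.pdf"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Documents that rank well in both lists rise to the top, even if neither method placed them first on its own. That is why hybrid search often handles both exact-term queries and paraphrased queries reasonably well.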

#### Test Queries

1. **Health Plan Comparison**
   - **Query**: What are the differences between the Northwind Health Plus and Northwind Standard plans offered by Contoso Electronics?
   - **Target**: Assess the system's ability to retrieve a detailed comparison of health insurance plans from the "Benefit Options" document.

2. **Workplace Safety Policy**
   - **Query**: What is the policy on workplace safety at Contoso Electronics, and what steps are included in the Workplace Safety Program?
   - **Target**: Verify if the system can extract structured details about workplace safety from the "Employee Handbook."

3. **PerksPlus Coverage**
   - **Query**: What fitness activities are covered under the PerksPlus Health and Wellness Reimbursement Program?
   - **Target**: Confirm whether the system can retrieve the specific list of fitness activities from the "PerksPlus" document.

4. **CTO Role Details**
   - **Query**: What are the responsibilities and qualifications for the Chief Technology Officer (CTO) role at Contoso Electronics?
   - **Target**: Evaluate the ability to extract role-specific responsibilities and qualifications from the "Role Library" document.

5. **Performance Review Process**
   - **Query**: How are performance reviews conducted at Contoso Electronics, and what is their purpose?
   - **Target**: Test the retrieval of procedural details and the intent behind performance reviews from the "Employee Handbook."

#### Notes for Future Work
- Focus on automated evaluation pipelines.
- Ensure the search system has the right context.
- Leverage RAG to maintain high retrieval quality.

By experimenting with these tools and methods, you'll gain a deeper understanding of document retrieval systems and how to optimize them for real-world applications.
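
One simple starting point for automated evaluation is a hit-rate check: for each test query, does the expected source document appear among the top results? The sketch below assumes a `retrieve` callable that returns source paths for a query. `fake_retrieve` is a hypothetical stand-in so the example runs without a search service, and the expected file names are taken from the upload log shown earlier.

```python
from typing import Callable

# Expected source file for two of the test queries above
EXPECTED_SOURCES = {
    "What fitness activities are covered under the PerksPlus Health and Wellness Reimbursement Program?": "PerksPlus.pdf",
    "What are the responsibilities and qualifications for the Chief Technology Officer (CTO) role at Contoso Electronics?": "role_library.pdf",
}


def hit_rate_at_k(retrieve: Callable[[str], list[str]], expected: dict[str, str], k: int = 5) -> float:
    """Fraction of queries whose expected source file appears in the top-k retrieved paths."""
    hits = 0
    for query, source in expected.items():
        top_paths = retrieve(query)[:k]
        if any(path.endswith(source) for path in top_paths):
            hits += 1
    return hits / len(expected) if expected else 0.0


def fake_retrieve(query: str) -> list[str]:
    """Hypothetical retriever that always returns the same two paths."""
    return ["lab_rawdata_ocr/role_library.pdf", "lab_rawdata_ocr/Benefit_Options.pdf"]


print(hit_rate_at_k(fake_retrieve, EXPECTED_SOURCES))
```

With a real retriever, `retrieve` could wrap the search call and return each result's `parent_path`, so the same metric can be tracked while you change chunking, embeddings, or query settings.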

In [63]:
import os

from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import (
    QueryAnswerType,
    QueryCaptionType,
    QueryType,
    VectorizableTextQuery,
)

# Use the admin key when it is set; otherwise fall back to DefaultAzureCredential
search_key = os.getenv("AZURE_AI_SEARCH_ADMIN_KEY")
credential = AzureKeyCredential(search_key) if search_key else DefaultAzureCredential()
index_name = os.getenv("AZURE_AI_SEARCH_INDEX_NAME", "lab-ai-index")

search_client = SearchClient(
    endpoint=os.environ["AZURE_AI_SEARCH_SERVICE_ENDPOINT"],
    index_name=index_name,
    credential=credential,
)

In [69]:
SEARCH_QUERY = "What are the differences between the Northwind Health Plus and Northwind Standard plans offered by Contoso Electronics?"

In [70]:
vector_query = VectorizableTextQuery(
    text=SEARCH_QUERY, k_nearest_neighbors=5, fields="vector", weight=0.5
)

In [71]:
results = search_client.search(
    search_text=SEARCH_QUERY,
    vector_queries=[vector_query],
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name="my-semantic-config",
    query_caption=QueryCaptionType.EXTRACTIVE,
    query_answer=QueryAnswerType.EXTRACTIVE,
    top=5
)

In [72]:
def format_azure_search_results(results, truncate: int = 1000) -> str:
    """
    Formats Azure AI Search results into a structured, readable string.

    Each result contains:
    - Chunk ID
    - Reranker Score
    - Source Document Path
    - Content (truncated to the specified number of characters if too long)
    - Caption (highlighted if available)

    :param results: Iterable of results returned by `SearchClient.search`.
    :param truncate: Maximum number of characters to include in the content before truncating.
    :return: Formatted string representation of the search results.
    """
    formatted_results = []

    for result in results:
        # Each result behaves like a dictionary; fall back to 'N/A' for missing fields
        chunk_id = result.get('chunk_id', 'N/A')
        reranker_score = result.get('@search.reranker_score', 'N/A')
        source_doc_path = result.get('parent_path', 'N/A')
        content = result.get('chunk') or 'N/A'

        # Truncate content to the specified number of characters
        if len(content) > truncate:
            content = content[:truncate] + "..."

        # Extract caption (highlighted caption if available)
        captions = result.get('@search.captions') or []
        caption = "Caption not available"
        if captions:
            first_caption = captions[0]
            if first_caption.highlights:
                caption = first_caption.highlights
            elif first_caption.text:
                caption = first_caption.text

        # Format each result section
        result_string = (
            f"========================================\n"
            f"🆔 ID: {chunk_id}\n"
            f"⭐ Reranker Score: {reranker_score}\n"
            f"📂 Source Doc Path: {source_doc_path}\n"
            f"📜 Content: {content}\n"
            f"💡 Caption: {caption}\n"
            f"========================================"
        )

        formatted_results.append(result_string)

    # Join all the formatted results into a single string
    return "\n\n".join(formatted_results)


In [68]:
result = format_azure_search_results(results)
print(result)

🆔 ID: 5e1d6644ccd2_aHR0cHM6Ly9ud2Fpc3RvcmFnZWh1Yi5ibG9iLmNvcmUud2luZG93cy5uZXQvbGFiLWJsb2ItY29udGFpbmVyL2xhYl9yYXdkYXRhX29jci9yb2xlX2xpYnJhcnkucGRm0_normalized_images_5_pages_0
📂 Source Doc Path: https://nwaistoragehub.blob.core.windows.net/lab-blob-container/lab_rawdata_ocr/role_library.pdf
📜 Content: · Proficiency in financial modeling and analysis. · Proficiency with financial software and other computer applications. Chief Technology Officer Job Title: Chief Technology Officer Company: Contoso Electronics Location: Anywhere Job Overview: The Chief Technology Officer of Contoso Electronics will be responsible for leading the company's technology strategy and ensuring the company's technology remains competitive in the marketplace. The CTO will be the driving force behind the development of new products, processes, and standards. They will also be responsible for developing and maintaining a highly secure IT infrastructure and ensuring the quality and reliability of the company's pro

## Master Orchestration Frameworks with Semantic Kernel

Before diving into the implementation, let's understand the key components of Semantic Kernel.

#### The Kernel

The Kernel is the central orchestrator in Semantic Kernel. It manages:

- **Connecting to AI Models**: Interfaces with various AI models via connectors.
- **Registering and Invoking Plugins**: Manages the lifecycle and execution of plugins.
- **Managing Memory and Context**: Maintains state and context across interactions.
- **Interacting with the Planner**: Coordinates with the Planner to achieve specified goals.

#### Connectors

Connectors allow the Kernel to interface with various AI models and services. They define how the Kernel communicates with these models, whether they're:

- **OpenAI Models**: Such as GPT-3.5, GPT-4.
- **Azure OpenAI Services**: Leveraging Microsoft's cloud-based AI capabilities.
- **Local Models**: Using libraries like Hugging Face Transformers.

#### Plugins

Plugins are modular units that extend the Kernel's functionality. They consist of:

- **Prompt Functions**: Use natural language prompts to interact with AI models.
- **Native Functions**: Written in Python, performing deterministic tasks or interfacing with external services.

Plugins act as building blocks for complex workflows.

#### The Planner

The Planner uses AI to dynamically create a sequence of actions (a plan) to achieve a specified goal. It considers:

- **Available Plugins and Their Functions**: Understands the capabilities of each plugin.
- **Function Descriptions**: Uses metadata to understand what each function does.
- **Combining Functions**: Determines how functions can be combined to fulfill the goal.


#### Hands-on Exercise

In medical research, producing high-quality documentation is crucial. In this exercise, we design a system that streamlines the creation of medical documents using three roles:

- **Medical Researcher**: Gathers and summarizes relevant medical information.
- **Clinical Evaluator**: Assesses the clinical relevance and accuracy of the information.
- **Medical Editor**: Refines the language and ensures adherence to medical writing standards.

Using Semantic Kernel's capabilities, we'll create plugins representing these roles and orchestrate their workflow based on a given goal.

#### Implementation

We will:

1. **Define Plugins**: Create plugins for the Medical Researcher, Clinical Evaluator, and Medical Editor.
2. **Configure the Kernel**: Set up the Kernel with the necessary connectors and plugins.
3. **Develop the Workflow**: Implement the logic to sequentially execute tasks for producing high-quality medical documentation.
4. **Execute the Process**: Run the system to generate polished medical content.
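
Before wiring the roles into Semantic Kernel, it can help to see the sequential workflow in plain Python. The sketch below is framework-free and makes no Semantic Kernel calls. The three role functions are hypothetical stand-ins that return canned strings, where the lab's version would invoke plugin functions through the Kernel.

```python
from typing import Callable

Step = Callable[[str], str]


def run_pipeline(goal: str, steps: list[tuple[str, Step]]) -> str:
    """Run each step in order, feeding the output of one step into the next."""
    draft = goal
    for name, step in steps:
        draft = step(draft)
        print(f"[{name}] done")
    return draft


# Hypothetical stand-ins for the three roles
def medical_researcher(topic: str) -> str:
    return f"Summary of findings on {topic}."


def clinical_evaluator(summary: str) -> str:
    return f"{summary} Clinical relevance: reviewed."


def medical_editor(text: str) -> str:
    return text.strip() + " [edited]"


document = run_pipeline(
    "statin therapy",
    [
        ("Medical Researcher", medical_researcher),
        ("Clinical Evaluator", clinical_evaluator),
        ("Medical Editor", medical_editor),
    ],
)
print(document)
```

Keeping each role as a function with the same string-in, string-out shape makes it easy to reorder steps, or later to let a planner choose the order instead of hard-coding it.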

In [77]:
# The MedicalAgents plugin is defined in the directory `C:\Users\pablosal\Desktop\gbbai-agent-architecture-lab\src\plugins\plugins_store`

import os

from dotenv import load_dotenv
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

# Load environment variables from a .env file
load_dotenv()

AZURE_OPENAI_KEY = os.getenv("AZURE_OPENAI_KEY")
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION")
AZURE_OPENAI_CHAT_DEPLOYMENT_ID = os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_ID")

# Initialize the kernel
kernel = Kernel()

service_id = "openai-chat"
# Add the Azure OpenAI chat completion connector to the kernel
kernel.add_service(AzureChatCompletion(
    service_id=service_id,
    deployment_name=AZURE_OPENAI_CHAT_DEPLOYMENT_ID,
    api_key=AZURE_OPENAI_KEY,
    endpoint=AZURE_OPENAI_ENDPOINT,
    api_version=AZURE_OPENAI_API_VERSION,
))

print("Registered services:", kernel.services)

# Define the parent directory and plugin name.
# The relative path is resolved against the current working directory, so run this
# cell from the repository root that contains `src/plugins/plugins_store`.
parent_directory = os.path.abspath(os.path.join("src", "plugins", "plugins_store"))
plugin_name = "medResearch"

# Add the plugin to the kernel
plugin = kernel.add_plugin(parent_directory=parent_directory, plugin_name=plugin_name)

print("Loaded plugin functions:", plugin.functions.keys())

print(plugin.functions['MedicalResearcher'].metadata)


Registered services: {'openai-chat': AzureChatCompletion(ai_model_id='gpt-4o', service_id='openai-chat', client=<openai.lib.azure.AsyncAzureOpenAI object at 0x000001F2B0056650>, ai_model_type=<OpenAIModelTypes.CHAT: 'chat'>, prompt_tokens=0, completion_tokens=0, total_tokens=0)}


PluginInitializationError: Plugin directory does not exist: medResearch