This notebook combines figure understanding with Azure AI Document Intelligence and the Azure AI Foundry Evaluation SDK.

TBU: RAG with Semantic Chunking and AI Search

# Document Intelligence
### For Figure Understanding & Hierarchical Document Structure Analysis

Detect figures in the document and output its hierarchical structure as markdown:

1. Crop figures

2. Send figure content with caption to Azure OpenAI

3. Use the figure description to update the markdown output

4. TBU: Semantic chunking of output (Document Intelligence in EUS / WUS2 / West Europe)

5. TBU: Embedding with Azure OpenAI

6. TBU: AI Search for RAG

## 0. Setup

In [4]:
"""
%%writefile .env

AZURE_OPENAI_ENDPOINT = ""  # Your Azure OpenAI resource's endpoint value.
AZURE_OPENAI_API_KEY = "" # Your Azure OpenAI resource's API key value.
AZURE_SEARCH_ENDPOINT = "" # Your Azure AI Search resource's endpoint value.
AZURE_SEARCH_ADMIN_KEY = "" # Your Azure AI Search resource's admin key value.
AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT= "" # Your Azure Document Intelligence resource's endpoint value.
AZURE_DOCUMENT_INTELLIGENCE_KEY = "" # Your Azure Document Intelligence resource's key value.
"""


In [5]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [2]:
"""
This code loads environment variables using the `dotenv` library and sets the necessary environment variables for Azure services.
The environment variables are loaded from the `.env` file in the same directory as this notebook.
"""

import os
from dotenv import load_dotenv
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import ContentFormat
from openai import AzureOpenAI

load_dotenv()

doc_intelligence_endpoint = os.getenv("AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT")
doc_intelligence_key = os.getenv("AZURE_DOCUMENT_INTELLIGENCE_KEY")

aoai_api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
aoai_api_key= os.getenv("AZURE_OPENAI_API_KEY")
aoai_deployment_name = 'gpt-4o' # your model deployment name for GPT-4o
aoai_api_version = '2024-08-01-preview' # Version specified in Endpoint Target URI

## 1. Crop figure from document

Cropping based on the bounding box

In [3]:
from PIL import Image
import fitz  # PyMuPDF
import mimetypes



def crop_image_from_image(image_path, page_number, bounding_box):
    """
    Crops an image based on a bounding box.

    Args:
        param image_path: Path to the image file.
        param page_number: The page number of the image to crop (for TIFF format).
        param bounding_box: A tuple of (left, upper, right, lower) coordinates for the bounding box.
    Returns:
        A cropped image, rtype: PIL.Image.Image
    """

    with Image.open(image_path) as img:
        if img.format == "TIFF":
            # Open the TIFF image
            img.seek(page_number)
            img = img.copy()
            
        # The bounding box is expected to be in the format (left, upper, right, lower).
        cropped_image = img.crop(bounding_box)
        return cropped_image



def crop_image_from_pdf_page(pdf_path, page_number, bounding_box):
    """
    Crops a region from a given page in a PDF and returns it as an image.

    Args:
        param pdf_path: Path to the PDF file.
        param page_number: The page number to crop from (0-indexed).
        param bounding_box: A tuple of (x0, y0, x1, y1) coordinates for the bounding box.
    Returns:
        A PIL Image of the cropped area.
    """

    doc = fitz.open(pdf_path)
    page = doc.load_page(page_number)
    
    # Cropping the page. The rect requires the coordinates in the format (x0, y0, x1, y1).
    bbx = [x * 72 for x in bounding_box]
    rect = fitz.Rect(bbx)
    pix = page.get_pixmap(matrix=fitz.Matrix(300/72, 300/72), clip=rect)
    
    img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
    
    doc.close()

    return img



def crop_image_from_file(file_path, page_number, bounding_box):
    """
    Crop an image from a file.

    Args:
        file_path (str): The path to the file.
        page_number (int): The page number (for PDF and TIFF files, 0-indexed).
        bounding_box (tuple): The bounding box coordinates in the format (x0, y0, x1, y1).

    Returns:
        A PIL Image of the cropped area.
    """
    mime_type = mimetypes.guess_type(file_path)[0]
    
    if mime_type == "application/pdf":
        return crop_image_from_pdf_page(file_path, page_number, bounding_box)
    else:
        return crop_image_from_image(file_path, page_number, bounding_box)
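
The unit handling behind the cropping functions can be checked in isolation. Document Intelligence reports polygons in inches for PDFs (and in pixels for images), PDF pages use 72 points per inch, and the crop above is rendered at 300 DPI. The sketch below is a minimal illustration under those assumptions; `polygon_to_bbox`, `inches_to_points`, and `pixel_size` are hypothetical helper names, not part of any SDK. Taking the min/max over all corners is an alternative to reading indices 0, 1, 4, and 5 that also tolerates slightly skewed polygons.

```python
def polygon_to_bbox(polygon):
    """Return (x0, y0, x1, y1) enclosing a flat [x, y, x, y, ...] polygon."""
    xs = polygon[0::2]  # every even index is an x coordinate
    ys = polygon[1::2]  # every odd index is a y coordinate
    return (min(xs), min(ys), max(xs), max(ys))


def inches_to_points(bbox):
    """Convert an inch-based bounding box to PDF points (72 points per inch)."""
    return tuple(v * 72 for v in bbox)


def pixel_size(bbox, dpi=300):
    """Approximate (width, height) in pixels of an inch-based box rendered at `dpi`."""
    x0, y0, x1, y1 = bbox
    return (round((x1 - x0) * dpi), round((y1 - y0) * dpi))


# Polygon in inches, in the flat format Document Intelligence returns for a PDF page.
polygon = [2.4042, 1.0158, 2.9979, 1.0158, 2.998, 1.4541, 2.4042, 1.4541]
bbox_inches = polygon_to_bbox(polygon)
bbox_points = inches_to_points(bbox_inches)
print(bbox_inches, bbox_points, pixel_size(bbox_inches))
```

For image inputs the polygon is already in pixels, so only `polygon_to_bbox` applies and no scaling is needed.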


## 2. Understand semantics of the figure content (Azure OpenAI)

In [4]:
import base64
from mimetypes import guess_type



# Function to encode a local image into data URL 
def local_image_to_data_url(image_path):
    # Guess the MIME type of the image based on the file extension
    mime_type, _ = guess_type(image_path)
    if mime_type is None:
        mime_type = 'application/octet-stream'  # Default MIME type if none is found

    # Read and encode the image file
    with open(image_path, "rb") as image_file:
        base64_encoded_data = base64.b64encode(image_file.read()).decode('utf-8')

    # Construct the data URL
    return f"data:{mime_type};base64,{base64_encoded_data}"
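
As a quick sanity check of the data URL format, the sketch below builds a data URL from in-memory bytes and decodes it back. It is a minimal illustration, not part of the notebook's pipeline; `bytes_to_data_url` and `data_url_to_bytes` are hypothetical helper names. Working from bytes could also be convenient when the cropped image is already in memory, although the notebook itself reads the saved file.

```python
import base64


def bytes_to_data_url(data: bytes, mime_type: str = "application/octet-stream") -> str:
    """Encode raw bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def data_url_to_bytes(data_url: str):
    """Split a base64 data URL back into (mime_type, raw bytes)."""
    header, encoded = data_url.split(",", 1)
    mime_type = header[len("data:"):].split(";", 1)[0]
    return mime_type, base64.b64decode(encoded)


# The 8-byte PNG file signature stands in for real image content.
png_signature = b"\x89PNG\r\n\x1a\n"
url = bytes_to_data_url(png_signature, "image/png")
mime, payload = data_url_to_bytes(url)
print(url)
```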


In [None]:
MAX_TOKENS = 2000



def understand_image_with_aoai(api_base, api_key, deployment_name, api_version, image_path, caption):
    """
    Generates a description for an image using a vision-capable Azure OpenAI chat model (GPT-4o in this notebook).

    Args:
        api_base (str): The base URL of the API. (Endpoint)
        api_key (str): The API key for authentication.
        deployment_name (str): The name of the deployment.
        api_version (str): The version of the API.
        image_path (str): The path to the image file.
        caption (str): The caption for the image.

    Returns:
        img_description (str): The generated description for the image.
    """

    client = AzureOpenAI(
        api_key=api_key,  
        api_version=api_version,
        base_url=f"{api_base}/openai/deployments/{deployment_name}"
    )

    data_url = local_image_to_data_url(image_path)

    # We send both image caption and the image body to GPT for better understanding
    if caption != "":
        response = client.chat.completions.create(
                model=deployment_name,
                messages=[
                    { "role": "system", "content": "You are an assistant embedded in a Hyundai vehicle." },
                    { "role": "user", "content": [  
                        { 
                            "type": "text", 
                            "text": f"Describe this image (note: it has image caption: {caption}):" 
                        },
                        { 
                            "type": "image_url",
                            "image_url": {
                                "url": data_url
                            }
                        }
                    ] } 
                ],
                max_tokens=MAX_TOKENS
            )

    else:
        response = client.chat.completions.create(
            model=deployment_name,
            messages=[
                { "role": "system", "content": "You are an assistant embedded in a Hyundai vehicle." },
                { "role": "user", "content": [  
                    { 
                        "type": "text", 
                        "text": "Describe this image:" 
                    },
                    { 
                        "type": "image_url",
                        "image_url": {
                            "url": data_url
                        }
                    }
                ] } 
            ],
            max_tokens=MAX_TOKENS
        )

    img_description = response.choices[0].message.content
    
    return img_description
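
The two branches above differ only in the text prompt. One possible way to avoid the duplication is to build the message list in a single helper and pass the result to `client.chat.completions.create`. The sketch below is an assumption about how that could look; `build_figure_messages` is a hypothetical name, and the system prompt is a parameter so it can be changed without touching the structure.

```python
def build_figure_messages(
    data_url,
    caption="",
    system_prompt="You are an assistant embedded in a Hyundai vehicle.",
):
    """Build chat messages for describing one figure, with or without a caption."""
    prompt = "Describe this image:"
    if caption:
        # Include the caption so the model can use it as extra context.
        prompt = f"Describe this image (note: it has image caption: {caption}):"
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        },
    ]


# Placeholder data URL; a real call would use local_image_to_data_url(image_path).
with_caption = build_figure_messages("data:image/png;base64,AAAA", "Warning lamp")
without_caption = build_figure_messages("data:image/png;base64,AAAA")
```

With this helper, the function body would reduce to a single `create(model=deployment_name, messages=..., max_tokens=MAX_TOKENS)` call.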

## 3. Update markdown with image description

Add image description to figure content section

In [None]:
def update_figure_description(md_content, img_description, idx):
    """
    Updates the figure description in the Markdown content.

    Args:
        md_content (str): The original Markdown content.
        img_description (str): The new description for the image.
        idx (int): The index of the figure.

    Returns:
        str: The updated Markdown content with the new figure description.
    """

    # The substring you're looking for
    start_substring = f"![](figures/{idx})"
    end_substring = "</figure>"
    new_string = f"<!-- FigureContent=\"{img_description}\" -->"
    
    new_md_content = md_content
    # Find the start and end indices of the part to replace
    start_index = md_content.find(start_substring)
    if start_index != -1:  # if start_substring is found
        start_index += len(start_substring)  # move the index to the end of start_substring
        end_index = md_content.find(end_substring, start_index)
        if end_index != -1:  # if end_substring is found
            # Replace the old string with the new string
            new_md_content = md_content[:start_index] + new_string + md_content[end_index:]
    
    return new_md_content
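
One caveat: the description is placed inside an HTML comment, so a model response containing `-->` or double quotes would end the comment or the attribute early. The sketch below shows one possible mitigation using `html.escape` from the standard library; `make_figure_comment` is a hypothetical helper, and whether escaping is acceptable depends on how the markdown is consumed downstream.

```python
import html


def make_figure_comment(img_description: str) -> str:
    """Wrap a description in a FigureContent comment, escaping characters that could break it."""
    # html.escape turns '>' into '&gt;' and '"' into '&quot;', so neither
    # '-->' nor a stray quote can terminate the comment or attribute early.
    safe = html.escape(img_description, quote=True)
    return f'<!-- FigureContent="{safe}" -->'


comment = make_figure_comment('A gauge labelled "MIL" --> see manual')
print(comment)
```

`html.unescape` restores the original text if a later step needs it.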


## RUN

* Analyze a document with Azure AI Document Intelligence Layout model

* Update figure description in the markdown output

In [7]:

def analyze_layout(input_file_path, output_folder):
    """
    Analyzes the layout of a document and extracts figures along with their descriptions, then updates the markdown output with the new descriptions.

    Args:
        input_file_path (str): The path to the input document file.
        output_folder (str): The path to the output folder where the cropped images will be saved.

    Returns:
        str: The updated Markdown content with figure descriptions.

    """
    document_intelligence_client = DocumentIntelligenceClient(
        endpoint=doc_intelligence_endpoint, 
        credential=AzureKeyCredential(doc_intelligence_key),
        headers={"x-ms-useragent":"sample-code-figure-understanding/1.0.0"},
    )

    with open(input_file_path, "rb") as f:
        poller = document_intelligence_client.begin_analyze_document(
            "prebuilt-layout", analyze_request=f, content_type="application/octet-stream", output_content_format=ContentFormat.MARKDOWN 
        )

    result = poller.result()
    md_content = result.content
    
    
    if result.figures:
        print("Figures:")
        for idx, figure in enumerate(result.figures):
            figure_content = ""
            img_description = ""
            print(f"Figure #{idx} has the following spans: {figure.spans}")
            for i, span in enumerate(figure.spans):
                print(f"Span #{i}: {span}")
                figure_content += md_content[span.offset:span.offset + span.length]
            print(f"Original figure content in markdown: {figure_content}")

            # Note: figure bounding regions currently contain both the bounding region of figure caption and figure body
            if figure.caption:
                caption_region = figure.caption.bounding_regions
                print(f"\tCaption: {figure.caption.content}")
                print(f"\tCaption bounding region: {caption_region}")
                for region in figure.bounding_regions:
                    if region not in caption_region:
                        print(f"\tFigure body bounding regions: {region}")
                        # To learn more about bounding regions, see https://aka.ms/bounding-region
                        boundingbox = (
                                region.polygon[0],  # x0 (left)
                                region.polygon[1],  # y0 (top)
                                region.polygon[4],  # x1 (right)
                                region.polygon[5]   # y1 (bottom)
                            )
                        print(f"\tFigure body bounding box in (x0, y0, x1, y1): {boundingbox}")
                        cropped_image = crop_image_from_file(input_file_path, region.page_number - 1, boundingbox) # page_number is 1-indexed

                        # Get the base name of the file
                        base_name = os.path.basename(input_file_path)
                        # Remove the file extension
                        file_name_without_extension = os.path.splitext(base_name)[0]

                        output_file = f"{file_name_without_extension}_cropped_image_{idx}.png"
                        cropped_image_filename = os.path.join(output_folder, output_file)

                        cropped_image.save(cropped_image_filename)
                        print(f"\tFigure {idx} cropped and saved as {cropped_image_filename}")
                        img_description += understand_image_with_aoai(aoai_api_base, aoai_api_key, aoai_deployment_name, aoai_api_version, cropped_image_filename, figure.caption.content)
                        print(f"\tDescription of figure {idx}: {img_description}")
            else:
                print("\tNo caption found for this figure.")
                for region in figure.bounding_regions:
                    print(f"\tFigure body bounding regions: {region}")
                    # To learn more about bounding regions, see https://aka.ms/bounding-region
                    boundingbox = (
                            region.polygon[0],  # x0 (left)
                            region.polygon[1],  # y0 (top)
                            region.polygon[4],  # x1 (right)
                            region.polygon[5]   # y1 (bottom)
                        )
                    print(f"\tFigure body bounding box in (x0, y0, x1, y1): {boundingbox}")

                    cropped_image = crop_image_from_file(input_file_path, region.page_number - 1, boundingbox) # page_number is 1-indexed

                    # Get the base name of the file
                    base_name = os.path.basename(input_file_path)
                    # Remove the file extension
                    file_name_without_extension = os.path.splitext(base_name)[0]

                    output_file = f"{file_name_without_extension}_cropped_image_{idx}.png"
                    cropped_image_filename = os.path.join(output_folder, output_file)
                    # cropped_image_filename = f"data/cropped/image_{idx}.png"
                    cropped_image.save(cropped_image_filename)
                    print(f"\tFigure {idx} cropped and saved as {cropped_image_filename}")
                    img_description += understand_image_with_aoai(aoai_api_base, aoai_api_key, aoai_deployment_name, aoai_api_version, cropped_image_filename, "")
                    print(f"\tDescription of figure {idx}: {img_description}")
            
            # replace_figure_description(figure_content, img_description, idx)
            md_content = update_figure_description(md_content, img_description, idx)

    return md_content
            


In [25]:
# Preparing data: Run code only if needed

'''
import fitz

def extract_pages(input_pdf, output_pdf, start_page, end_page):
    doc = fitz.open(input_pdf)
    new_doc = fitz.open()

    for i in range(start_page - 1, end_page):  # Pages are 0-indexed in PyMuPDF
        new_doc.insert_pdf(doc, from_page=i, to_page=i)
    
    new_doc.save(output_pdf)
    print(f"Extracted pages saved as {output_pdf}")

# Example usage: Extract 2 pages
extract_pages("data/Owner's_Manual.pdf", "data/manual_sample.pdf", 163, 164)
'''


In [26]:
updated_md_with_figure_understanding = analyze_layout("data/manual_sample.pdf", "data/cropped")

print("-------------------------------------------------------------------------------------------")
print(f"Updated markdown content with figure understanding: {updated_md_with_figure_understanding}")

Figures:
Figure #0 has the following spans: [{'offset': 93, 'length': 22}]
Span #0: {'offset': 93, 'length': 22}
Original figure content in markdown: <figure>

!

</figure>
	No caption found for this figure.
	Figure body bounding regions: {'pageNumber': 1, 'polygon': [2.4042, 1.0158, 2.9979, 1.0158, 2.998, 1.4541, 2.4042, 1.4541]}
	Figure body bounding box in (x0, y0, x1, y1): (2.4042, 1.0158, 2.998, 1.4541)
	Figure 0 cropped and saved as data/cropped\manual_sample_cropped_image_0.png
Figure #1 has the following spans: [{'offset': 549, 'length': 18}]
Span #0: {'offset': 549, 'length': 18}
Original figure content in markdown: <figure>
</figure>
	No caption found for this figure.
	Figure body bounding regions: {'pageNumber': 1, 'polygon': [2.4018, 3.7053, 3.0074, 3.7054, 3.0074, 4.1646, 2.4019, 4.1645]}
	Figure body bounding box in (x0, y0, x1, y1): (2.4018, 3.7053, 3.0074, 4.1646)
	Figure 1 cropped and saved as data/cropped\manual_sample_cropped_image_1.png
	Description of figure 1: Thi

In [27]:
# Save as markdown file
with open("data/analyzed_results.md", "w", encoding="utf-8") as file:
    file.write(updated_md_with_figure_understanding)

print("Markdown file saved as data/analyzed_results.md")

Markdown file saved as data/analyzed_results.md


# Evaluator SDK

## 0. Setup

In [3]:
import json
import os
import pathlib
from pprint import pprint

import pandas as pd
from dotenv import load_dotenv
from openai import AzureOpenAI  # used below to generate the responses under evaluation
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import (
    evaluate,
    ContentSafetyEvaluator,
    RelevanceEvaluator,
    CoherenceEvaluator,
    GroundednessEvaluator,
    GroundednessProEvaluator,
    FluencyEvaluator,
    SimilarityEvaluator,
    F1ScoreEvaluator,
    RetrievalEvaluator,
)
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import (
    Evaluation,
    Dataset,
    EvaluatorConfiguration,
    ConnectionType,
    EvaluationSchedule,
    RecurrenceTrigger,
    ApplicationInsightsConfiguration,
)
# from model_endpoint import ModelEndpoint



load_dotenv(override=True)

True

In [4]:
credential = DefaultAzureCredential()

# Initialize the Azure AI project and Azure OpenAI connection from your environment variables
azure_ai_project = {
    "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
    "resource_group_name": os.environ.get("AZURE_RESOURCE_GROUP_NAME"),
    "project_name": os.environ.get("AZURE_PROJECT_NAME"),
}

model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
    "azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME"),
    "api_version": os.environ.get("AZURE_OPENAI_API_VERSION"),
    "type": "azure_openai",
}


In [5]:
project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str=os.environ.get("AZURE_AI_PROJECT_CONN_STR"),  # At the moment, it should be in the format "<Region>.api.azureml.ms;<AzureSubscriptionId>;<ResourceGroup>;<HubName>" Ex: eastus2.api.azureml.ms;xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxxxx;rg-sample;sample-project-eastus2
)

## 1. Create Response

This section evaluates responses generated by Azure OpenAI (AOAI).

The goal is to compare several kinds of response: the desired response, a RAG-based response, an intent-aware response, a Bing Search response for up-to-date data, and so on.

For now, we compare the Document Intelligence + AOAI response (treated as RAG-based) against the desired response. The Document Intelligence markdown output is given as context, on the assumption that a RAG step retrieved it.

Other factors TBU.

In [None]:
# Get markdown output (only use this when re-running the notebook)
"""
with open("data/analyzed_results.md", "r", encoding="utf-8") as file:
    # Strip extra spaces or newlines per line, then rejoin so the context stays a single string
    updated_md_with_figure_understanding = "\n".join(line.strip() for line in file)

print(updated_md_with_figure_understanding)
"""

<!-- PageHeader="Convenience features" -->




<figure>

!

</figure>



. When the ignition is moved to ON,
about 3 seconds and turns off
automatically if no problem.

while driving, it indicates that there
is a problem with the electric power
steering system. In this case, we
recommend that you have the vehi-
cle inspected by an authorized
HYUNDAI dealer.


## Malfunction Indicator Lamp (MIL)


<figure>
</figure>



· When you set the ignition switch or
the Engine Start/Stop button to the
ON position.

\- It illuminates for approximately 3
seconds and then goes off.

. When there is a malfunction with
the emission control system.

In this case, we recommend that
you have the vehicle inspected by
an authorized HYUNDAI dealer.


## NOTICE

Driving with the Malfunction
Indicator Lamp (MIL) on may
cause damage to the emission
control system which could affect
drivability and/or fuel economy.


## ! CAUTION

Gasoline Engine

If the Malfunction Indicator
Lamp (MIL) illuminates, poten-
tial

In [7]:
client = AzureOpenAI(
    api_key=model_config['api_key'],  
    api_version=model_config['api_version'],
    base_url=f"{model_config['azure_endpoint']}/openai/deployments/{model_config['azure_deployment']}"
)

In [8]:
def ask_question(query):
    MAX_TOKENS = 2000

    response = client.chat.completions.create(
        model=model_config['azure_deployment'],
        messages=[
            {"role": "system", "content": 
             """
             You are an assistant embedded in a Hyundai vehicle.
             Using information from the given context, create responses about the following request.
             Specify the reference.
            - Questions about image alerts
            - Questions about reconfiguring vehicle after alert
            - Additional user tips
            """},
            {"role": "user", "content": f"Context: {updated_md_with_figure_understanding}"},
            { "role": "user", "content": query}
        ],
        max_tokens=MAX_TOKENS
        )
    response_parsed = response.choices[0].message.content
    
    print(response_parsed)
    # return response_parsed

In [9]:
ask_question("What does an exclamation mark next to a handle image mean?")


For more detailed clarification:
- If the issue relates to a **charging system or engine oil pressure**, follow the related troubleshooting steps and consult with an authorized Hyundai dealer if necessary.

Make sure to check your vehicle's manual for the specific meaning of the exclamation mark if it’s paired with a handle image. For any uncertainty, it's always recommended to have your car inspected by a Hyundai service center.



In [10]:
ask_question("What does the symbol with plus minus marks mean?")


1. Drive carefully to the nearest safe location and stop your vehicle.
2. Turn the engine off and check the alternator drive belt for looseness or breakage.

In such a situation, it is recommended that you have your vehicle inspected by an authorized Hyundai dealer as soon as possible. 



### Prepare json

Add the previous responses to a JSON Lines file.

Responses will be compared to ground truth.
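
A minimal sketch of writing such a file follows. The column names `query`, `context`, and `response` match the `data_mapping` entries used in the evaluation step; `ground_truth` is an assumed column name for the desired response, and the placeholder strings stand in for real content. `write_jsonl` is a hypothetical helper, and the sketch writes to a temporary directory rather than to `data/evaluate_test_data.jsonl` so it does not overwrite existing data. Capturing the response text also assumes `ask_question` returns it (the `return response_parsed` line is commented out in the cell above).

```python
import json
import os
import tempfile


def write_jsonl(rows, path):
    """Write one JSON object per line (JSON Lines)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


rows = [
    {
        "query": "What does an exclamation mark next to a handle image mean?",
        "context": "<markdown context passed to the model>",
        "response": "<model response captured from ask_question>",
        "ground_truth": "<desired response written by a reviewer>",  # assumed column name
    },
]

out_path = os.path.join(tempfile.mkdtemp(), "evaluate_test_data.jsonl")
write_jsonl(rows, out_path)

# Read the file back to confirm each line parses as a standalone JSON object.
with open(out_path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
print(len(loaded), "row(s) written to", out_path)
```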

## 2. Evaluate

In [11]:
project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str=os.environ.get("AZURE_AI_PROJECT_CONN_STR"),  # At the moment, it should be in the format "<Region>.api.azureml.ms;<AzureSubscriptionId>;<ResourceGroup>;<HubName>" Ex: eastus2.api.azureml.ms;xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxxxx;rg-sample;sample-project-eastus2
)

In [12]:
# id for each evaluator can be found in your AI Studio registry - please see documentation for more information
# init_params is the configuration for the model to use to perform the evaluation
# data_mapping is used to map the output columns of your query to the names required by the evaluator
# Evaluator parameter format - https://learn.microsoft.com/en-us/azure/ai-studio/how-to/develop/evaluate-sdk#evaluator-parameter-format
evaluators_cloud = {
    "f1_score": EvaluatorConfiguration(
        id=F1ScoreEvaluator.id,
    ),
    "relevance": EvaluatorConfiguration(
        id=RelevanceEvaluator.id,
        init_params={"model_config": model_config},
        data_mapping={"query": "${data.query}", "context": "${data.context}", "response": "${data.response}"},
    ),
    "groundedness": EvaluatorConfiguration(
        id=GroundednessEvaluator.id,
        init_params={"model_config": model_config},
        data_mapping={"query": "${data.query}", "context": "${data.context}", "response": "${data.response}"},
    ),
    # RetrievalEvaluator is disabled because it fails on import of
    # azure.ai.evaluation._evaluators._common.math (list_mean_nan_safe) with:
    # ModuleNotFoundError: No module named 'azure.ai.evaluation._evaluators._common.math'
    # "retrieval": EvaluatorConfiguration(
    #     id=RetrievalEvaluator.id,
    #     init_params={"model_config": model_config},
    #     data_mapping={"query": "${data.query}", "context": "${data.context}"},
    # ),
    "coherence": EvaluatorConfiguration(
        id=CoherenceEvaluator.id,
        init_params={"model_config": model_config},
        data_mapping={"query": "${data.query}", "response": "${data.response}"},
    ),
    "fluency": EvaluatorConfiguration(
        id=FluencyEvaluator.id,
        init_params={"model_config": model_config},
        data_mapping={"query": "${data.query}", "context": "${data.context}", "response": "${data.response}"},
    ),
    "similarity": EvaluatorConfiguration(
        id=SimilarityEvaluator.id,
        init_params={"model_config": model_config},
        data_mapping={"query": "${data.query}", "response": "${data.response}"},
    ),

}
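
Because each `data_mapping` refers to dataset columns through `${data.<column>}` placeholders, a mismatch between the mappings and the uploaded file only surfaces once the cloud run starts. The sketch below is a local pre-flight check under stated assumptions: plain dictionaries stand in for the `data_mapping` arguments, the regular expression assumes the `${data.<column>}` syntax shown above, and `missing_columns` is a hypothetical helper, not an SDK function.

```python
import re

# Matches placeholders of the form ${data.column_name}.
DATA_REF = re.compile(r"\$\{data\.([A-Za-z0-9_]+)\}")


def missing_columns(data_mappings, row):
    """Return {evaluator_name: [columns referenced by its mapping but absent from row]}."""
    missing = {}
    for name, mapping in data_mappings.items():
        referenced = {col for value in mapping.values() for col in DATA_REF.findall(value)}
        absent = sorted(referenced - set(row))
        if absent:
            missing[name] = absent
    return missing


mappings = {
    "relevance": {
        "query": "${data.query}",
        "context": "${data.context}",
        "response": "${data.response}",
    },
    "coherence": {"query": "${data.query}", "response": "${data.response}"},
}
sample_row = {"query": "q", "response": "r"}  # deliberately lacks "context"
problems = missing_columns(mappings, sample_row)
print(problems)
```

Running the check against the first row of the JSON Lines file before uploading is usually enough to catch a renamed or missing column.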


### Upload data

Troubleshooting authorization errors

* secrets reader (AI Hub > VM): does not have authorization to perform action 'Microsoft.MachineLearningServices/workspaces/datastores/read' over scope '/subscriptions/{vm id}/resourceGroups/AZ-FoundryRAG-RG-suzy/providers/Microsoft.MachineLearningServices/workspaces/AZ-AIProject-Sweden-suzy/datastores/workspaceblobstore

* AzureML Data Scientist (AI Project > VM): does not have authorization to perform action 'Microsoft.MachineLearningServices/workspaces/data/versions/write'

etc...

In [13]:
# Upload data for evaluation
data_id, _ = project_client.upload_file("data/evaluate_test_data.jsonl")


In [14]:
evaluation = Evaluation(
    display_name="Cloud Evaluation",
    description="Cloud Evaluation of dataset",
    data=Dataset(id=data_id),
    evaluators=evaluators_cloud,
)

# Create evaluation
evaluation_response = project_client.evaluations.create(
    evaluation=evaluation,
)

In [15]:
# Get evaluation
get_evaluation_response = project_client.evaluations.get(evaluation_response.id)

print("----------------------------------------------------------------")
print("Created evaluation, evaluation ID: ", get_evaluation_response.id)
print("Evaluation status: ", get_evaluation_response.status)
print("AI Foundry Portal URI: ", get_evaluation_response.properties["AiStudioEvaluationUri"])
print("----------------------------------------------------------------")

----------------------------------------------------------------
Created evaluation, evaluation ID:  875344c8-5ec4-44b0-aead-8e19ffb3461a
Evaluation status:  Queued
AI Foundry Portal URI:  https://ai.azure.com/build/evaluation/875344c8-5ec4-44b0-aead-8e19ffb3461a?wsid=/subscriptions/f8b0b83e-4e98-41d9-90f8-7dd582a59d62/resourceGroups/AZ-FoundryRAG-RG-suzy/providers/Microsoft.MachineLearningServices/workspaces/AZ-AIProject-Sweden-suzy
----------------------------------------------------------------


In [16]:
from tqdm import tqdm
import time

# Monitor the status of the evaluation run until it completes or fails
def monitor_status(project_client: AIProjectClient, evaluation_response_id: str):
    with tqdm(total=3, desc="Running Status", unit="step") as pbar:
        status = project_client.evaluations.get(evaluation_response_id).status
        if status == "Queued":
            pbar.update(1)
        while status != "Completed" and status != "Failed":
            if status == "Running" and pbar.n < 2:
                pbar.update(1)
            print(f"Current Status: {status}")
            time.sleep(10)
            status = project_client.evaluations.get(evaluation_response_id).status
        while pbar.n < 3:
            pbar.update(1)
        # Report a failed run explicitly instead of printing the success message
        if status == "Completed":
            print("Operation Completed")
        else:
            print(f"Operation ended with status: {status}")
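
The loop above polls indefinitely if the run never reaches `Completed` or `Failed`. A hedged variant with a timeout is sketched below. It takes any zero-argument callable that returns the current status, so it can be exercised without a live project; `wait_for_terminal_status` is a hypothetical helper, and `"Canceled"` is an assumed terminal status name that does not appear in the outputs of this notebook.

```python
import time

# "Completed" and "Failed" come from the loop above; "Canceled" is an assumption.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_terminal_status(get_status, poll_seconds=10.0, timeout_seconds=1800.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    status = get_status()
    while status not in TERMINAL_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still in status {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)
        status = get_status()
    return status


# Simulated status sequence; the no-op sleep keeps the demonstration instant.
fake_statuses = iter(["Queued", "Running", "Finalizing", "Completed"])
final_status = wait_for_terminal_status(lambda: next(fake_statuses), sleep=lambda _: None)
print(final_status)
```

Against the real service, the callable could be `lambda: project_client.evaluations.get(evaluation_response.id).status`.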

In [17]:
monitor_status(project_client, evaluation_response.id)

Running Status:  33%|███▎      | 1/3 [00:00<00:00,  3.00step/s]

Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued
Current Status: Queued


Running Status:  67%|██████▋   | 2/3 [02:56<01:43, 103.65s/step]

Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Running
Current Status: Finalizing


Running Status: 100%|██████████| 3/3 [05:34<00:00, 111.42s/step]

Operation Completed



