# Azure AI Studio Batch Run Evaluation
## Intent Prompt Flow - Base Run

To test the prompt flow more thoroughly, we can use Azure AI Studio to run it, together with the evaluation prompt flow, on a larger batch of test data.

In [1]:
# Import required libraries
import json

from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from promptflow.azure import PFClient

# Helper functions from the evaluate module
from evaluate import run_azure_flow, run_azure_eval_flow

In [2]:
try:
    credential = DefaultAzureCredential()
    # Check whether the credential can obtain a token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work.
    credential = InteractiveBrowserCredential()

Populate the `config.json` file with your `subscription_id`, `resource_group`, and `workspace_name`.
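As a rough sketch, the file can also be generated from Python. The three keys come from the sentence above; the values are placeholders that must be replaced with your own, and `write_config` and the `config.example.json` file name are hypothetical, not part of promptflow.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def write_config(path, subscription_id, resource_group, workspace_name):
    """Write a workspace config file containing the three required keys."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    # Refuse to write a file with empty values.
    missing = [key for key in REQUIRED_KEYS if not config[key]]
    if missing:
        raise ValueError(f"Missing config values: {missing}")
    Path(path).write_text(json.dumps(config, indent=4))
    return config


# Placeholder values (assumptions); replace them with your own before use.
example = write_config(
    "config.example.json",
    subscription_id="<your-subscription-id>",
    resource_group="<your-resource-group>",
    workspace_name="<your-workspace-name>",
)
```

Writing to a separate example file avoids overwriting a real `config.json` by accident.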

In [3]:
config_path = "../config.json"
pf_azure_client = PFClient.from_config(credential=credential, path=config_path)

Found the config file in: ../config.json


Set the properties needed to run the flow in Azure.

In [4]:
# Use the automatic runtime, or replace this with the name of a runtime you created previously
runtime = "automatic"
flow = "../contoso-intent"
data = "../data/alltestdata.jsonl"
run_name = "intent_base_run"
column_mapping = {
    "customerId": "${data.customerId}",
    "question": "${data.question}",
}


Create a base run, which the evaluation run below will reference.

In [5]:
base_run = run_azure_flow(runtime, flow, run_name, data, column_mapping, pf_azure_client)

Uploading contoso-intent (0.08 MBs): 100%|██████████| 82705/82705 [00:01<00:00, 58704.17it/s]



Portal url: https://ai.azure.com/projectflows/bulkrun/run/intent_base_run_03_05_2105/details?wsid=/subscriptions/a195fdab-ef20-44e0-ae3a-48b2e1764e85/resourcegroups/contchat-rg/providers/Microsoft.MachineLearningServices/workspaces/contoso-chat-sf-aiproj
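`run_azure_flow` comes from the `evaluate` module, whose source is not shown here. The following is a minimal sketch of what such a helper might look like, assuming it wraps the client's `run` method and appends a month_day_hourminute timestamp to the run name (which would be consistent with the `intent_base_run_03_05_2105` name in the portal URL above). The name `run_azure_flow_sketch`, the `now` parameter, the `RecordingClient` stand-in, and forwarding `runtime` as a keyword argument are all assumptions, not the module's actual code.

```python
from datetime import datetime


def run_azure_flow_sketch(runtime, flow, run_name, data, column_mapping, client, now=None):
    """Submit a batch run through a PFClient-like object and return the result."""
    # Suffix the run name with a timestamp so repeated submissions get distinct names.
    timestamp = (now or datetime.now()).strftime("%m_%d_%H%M")
    return client.run(
        flow=flow,
        data=data,
        column_mapping=column_mapping,
        runtime=runtime,  # assumption: forwarded as a keyword argument
        name=f"{run_name}_{timestamp}",
        display_name=run_name,
    )


class RecordingClient:
    """Stand-in for PFClient so the sketch runs without Azure access."""

    def run(self, **kwargs):
        # Echo the submitted arguments instead of contacting Azure.
        return kwargs


submitted = run_azure_flow_sketch(
    "automatic",
    "../contoso-intent",
    "intent_base_run",
    "../data/alltestdata.jsonl",
    {"customerId": "${data.customerId}", "question": "${data.question}"},
    RecordingClient(),
    now=datetime(2024, 3, 5, 21, 5),
)
print(submitted["name"])  # intent_base_run_03_05_2105
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object before submitting a real run.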


In [6]:
pf_azure_client.stream(base_run)

(Run status is 'NotStarted', continue streaming...)
2024-03-05 13:10:10 +0000      50 promptflow-runtime INFO     [intent_base_run_03_05_2105] Receiving v2 bulk run request 8d491848-f56f-483f-bdf9-fc3d168c1343: {"flow_id": "intent_base_run_03_05_2105", "flow_run_id": "intent_base_run_03_05_2105", "flow_source": {"flow_source_type": 1, "flow_source_info": {"snapshot_id": "51d26488-879d-426d-8664-656703f3ae01"}, "flow_dag_file": "flow.dag.yaml"}, "connections": "**data_scrubbed**", "log_path": "https://stcontosoy46u2sqpv3h6g.blob.core.windows.net/8927edb2-b165-4c51-a19d-7201534af4ac-azureml/ExperimentRun/dcid.intent_base_run_03_05_2105/logs/azureml/executionlogs.txt?sv=2019-07-07&sr=b&sig=**data_scrubbed**&skoid=3d6e6777-5406-46a6-9

<promptflow._sdk.entities._run.Run at 0x7f0bdd7cd350>

In [7]:
details = pf_azure_client.get_details(base_run)
details.head(10)

| outputs.line_number | inputs.chat_history | inputs.customerId | inputs.question | inputs.line_number | outputs.answer | outputs.intent_context | outputs.context |
|---|---|---|---|---|---|---|---|
| 0 | [] | 4 | tell me about your hiking jackets | 0 | | | |
| 1 | [] | 1 | Do you have any climbing gear? | 1 | | | |
| 2 | [] | 3 | Can you tell me about your selection of tents? | 2 | | | |
| 3 | [] | 6 | Do you have any hiking boots? | 3 | Hey Emily! 👋 Absolutely, we have some amazing ... | intent: chat | [{'content': 'Introducing the TrekReady Hiking... |
| 4 | [] | 2 | What gear do you recommend for hiking? | 4 | | | |
| 5 | [] | 7 | what is the temperature rating of my sleeping ... | 5 | | | |
| 6 | [] | 7 | what is the temperature rating of the cozynigh... | 6 | | | |
| 7 | [] | 8 | what is the waterproof rating of the tent I bo... | 7 | | | |
| 8 | [] | 8 | what is the waterproof rating of the TrailMast... | 8 | Hi Melissa! 🌧️ The TrailMaster X4 Tent's rainf... | intent: chat | [{'content': 'Unveiling the TrailMaster X4 Ten... |
| 9 | [] | 2 | What is your return or exchange policy? | 9 | | | |


## Intent Prompt Flow Evaluation - Eval Run

In [8]:
eval_flow = "multi_flow/"
data = "../data/alltestdata.jsonl"
run_name = "intent_eval_run"
column_mapping = {
    # reference the test data
    "customerId": "${data.customerId}",
    "question": "${data.question}",
    # reference the base run's outputs
    "context": "${run.outputs.context}",
    "answer": "${run.outputs.answer}",
}
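The `${data.<column>}` and `${run.outputs.<name>}` placeholders tell the evaluation flow where each input comes from: the test data file or the outputs of the referenced base run. The resolver below is only a simplified illustration of that lookup for a single row. `resolve_mapping` is a hypothetical helper, not promptflow's implementation, and the sample values are abbreviated from the tables in this notebook.

```python
import re

# Matches "${data.<name>}" or "${run.outputs.<name>}".
PLACEHOLDER = re.compile(r"^\$\{(data|run\.outputs)\.(\w+)\}$")


def resolve_mapping(column_mapping, data_row, run_outputs):
    """Resolve each placeholder against one data row and one base-run output row."""
    sources = {"data": data_row, "run.outputs": run_outputs}
    resolved = {}
    for input_name, reference in column_mapping.items():
        match = PLACEHOLDER.match(reference)
        if match is None:
            # Anything that is not a placeholder is passed through unchanged.
            resolved[input_name] = reference
            continue
        source, key = match.groups()
        resolved[input_name] = sources[source][key]
    return resolved


mapping = {
    "customerId": "${data.customerId}",
    "question": "${data.question}",
    "context": "${run.outputs.context}",
    "answer": "${run.outputs.answer}",
}
data_row = {"customerId": 6, "question": "Do you have any hiking boots?"}
run_outputs = {"answer": "Hey Emily! ...", "context": "[{'content': ...}]"}

inputs = resolve_mapping(mapping, data_row, run_outputs)
```

Here `inputs` holds the four values the evaluation flow would receive for this row: two from the data file and two from the base run.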

In [9]:
eval_run = run_azure_eval_flow(runtime, eval_flow, run_name, data, column_mapping, base_run, pf_azure_client)



Portal url: https://ai.azure.com/projectflows/bulkrun/run/intent_eval_run_03_05_2111/details?wsid=/subscriptions/a195fdab-ef20-44e0-ae3a-48b2e1764e85/resourcegroups/contchat-rg/providers/Microsoft.MachineLearningServices/workspaces/contoso-chat-sf-aiproj


In [10]:
pf_azure_client.stream(eval_run)

(Run status is 'NotStarted', continue streaming...)
2024-03-05 13:11:42 +0000     109 promptflow-runtime INFO     [intent_eval_run_03_05_2111] Receiving v2 bulk run request 2a6abe7a-2619-4c87-9378-9facae64bee9: {"flow_id": "intent_eval_run_03_05_2111", "flow_run_id": "intent_eval_run_03_05_2111", "flow_source": {"flow_source_type": 1, "flow_source_info": {"snapshot_id": "01f27b19-1806-4c9d-b60f-e635ecf93fbc"}, "flow_dag_file": "flow.dag.yaml"}, "connections": "**data_scrubbed**", "log_path": "https://stcontosoy46u2sqpv3h6g.blob.core.windows.net/8927edb2-b165-4c51-a19d-7201534af4ac-azureml/ExperimentRun/dcid.intent_eval_run_03_05_2111/logs/azureml/executionlogs.txt?sv=2019-07-07&sr=b&sig=**data_scrubbed**&skoid=3d6e6777-5406-46a6-9b64-dbf0f2b204d3&sktid=16b3c013-d300-468d-ac64-7eda0820b6d3&skt=2024-03-05T10%3A19%3A06Z&ske=2024-03-06T18%3A29%3A06Z&sks=b&skv=2019-07-07&st=2024-03-05T13%3A01%3A42Z&se=2024-03-05T21%3A11%3A42Z&sp=rcw", "app_insights_instrumentation_key": "InstrumentationKey=

<promptflow._sdk.entities._run.Run at 0x7f0bd419b210>

In [11]:
details = pf_azure_client.get_details(eval_run)
details.head(10)

| outputs.line_number | inputs.customerId | inputs.question | inputs.context | inputs.answer | inputs.line_number | inputs.chat_history | outputs.gpt_coherence | outputs.gpt_fluency | outputs.gpt_groundedness | outputs.gpt_relevance |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 6 | Do you have any hiking boots? | [{'content': "Introducing the TrekReady Hiking... | Hey Emily! 👋 Absolutely, we have some amazing ... | 3 | [] | 5.0 | 5.0 | 5.0 | 4.0 |
| 8 | 8 | what is the waterproof rating of the TrailMast... | [{'content': 'Unveiling the TrailMaster X4 Ten... | Hi Melissa! 🌧️ The TrailMaster X4 Tent's rainf... | 8 | [] | 4.0 | 4.0 | 3.0 | 2.0 |


In [12]:

metrics = pf_azure_client.get_metrics(eval_run)
print(json.dumps(metrics, indent=4))

{
    "gpt_coherence": 4.5,
    "gpt_coherence_pass_rate(%)": 100.0,
    "gpt_fluency": 4.5,
    "gpt_fluency_pass_rate(%)": 100.0,
    "gpt_groundedness": 4.0,
    "gpt_groundedness_pass_rate(%)": 50.0,
    "gpt_relevance": 3.0,
    "gpt_relevance_pass_rate(%)": 50.0
}
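The aggregate metrics can be cross-checked against the two scored rows in the eval run details. The sketch below averages each score and computes a pass rate. The pass threshold of 4 is an assumption inferred from the numbers shown (it reproduces the reported rates), not a value confirmed by the evaluation flow's source, and `aggregate` is a hypothetical helper.

```python
from statistics import mean

# Scores copied from the two rows of the eval run details (line numbers 3 and 8).
scores = {
    "gpt_coherence": [5.0, 4.0],
    "gpt_fluency": [5.0, 4.0],
    "gpt_groundedness": [5.0, 3.0],
    "gpt_relevance": [4.0, 2.0],
}

PASS_THRESHOLD = 4.0  # assumption: a score of 4 or higher counts as a pass


def aggregate(scores, threshold=PASS_THRESHOLD):
    """Return the mean and the pass rate (in percent) for every metric."""
    result = {}
    for name, values in scores.items():
        result[name] = mean(values)
        passed = sum(value >= threshold for value in values)
        result[f"{name}_pass_rate(%)"] = 100.0 * passed / len(values)
    return result


recomputed = aggregate(scores)
```

With only two scored rows, each pass rate can only be 0, 50, or 100 percent, so these aggregates are best read as a smoke test rather than a stable estimate.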


In [13]:
pf_azure_client.visualize([base_run, eval_run])

Web View: https://ml.azure.com/prompts/flow/bulkrun/runs/outputs?wsid=/subscriptions/a195fdab-ef20-44e0-ae3a-48b2e1764e85/resourceGroups/contchat-rg/providers/Microsoft.MachineLearningServices/workspaces/contoso-chat-sf-aiproj&runId=intent_base_run_03_05_2105,intent_eval_run_03_05_2111
