# CICD Example with Azure AI Studio

This notebook is a highly simplified example of how Azure AI Studio and Prompt Flow can be integrated into a CICD pipeline.

This notebook uses the CLI and passes values manually between steps, because the objective of this workshop is to learn how the process works. For a deep dive into LLMOps and CICD, explore this repository: https://github.com/microsoft/llmops-promptflow-template.

## Set up environment variables

In [None]:
import subprocess
import os
from dotenv import load_dotenv
import datetime

# Load environment variables from a .env file
load_dotenv()

# For Jupyter Notebook, use the current working directory as the base directory
base_dir = os.getcwd()
# Move the base path one level up
base_dir = os.path.dirname(base_dir)

# Constructing the flow path relative to the base directory
flow_path = os.path.join(base_dir, "Part-4-custom-evaluation-metrics/PromptFlow/Custom-Evaluation-Metrics")

# Retrieve workspace name and resource group from environment variables
workspace_name = os.getenv('WORKSPACE_NAME')
resource_group = os.getenv('RESOURCE_GROUP')
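
Because `os.getenv` returns `None` for a variable that is not set, a missing entry in the .env file only surfaces later, as a malformed CLI command. The sketch below shows one possible way to fail early instead. The helper name `require_env` and the example values are hypothetical and not part of the workshop code.

```python
import os


def require_env(names, env=os.environ):
    """Return the requested settings, raising if any is missing or empty."""
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Demonstration with placeholder values (assumed, not real resource names)
example_env = {"WORKSPACE_NAME": "example-workspace", "RESOURCE_GROUP": "example-rg"}
settings = require_env(["WORKSPACE_NAME", "RESOURCE_GROUP"], env=example_env)
print(settings["WORKSPACE_NAME"], settings["RESOURCE_GROUP"])
```

In the notebook itself, calling `require_env(["WORKSPACE_NAME", "RESOURCE_GROUP"])` after `load_dotenv()` would check the real environment.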



## Create a flow in Azure AI Studio

Creating the flow in Azure AI Studio stores it in the cloud, so it can be reused for evaluation runs rather than being uploaded from a local directory each time.

In [None]:
current_datetime_utc = datetime.datetime.now(datetime.timezone.utc).strftime('%Y-%m-%d_%H:%M:%S_UTC')

# Include the formatted datetime string in the display name
command = f"""
pfazure flow create --flow {flow_path} --set display_name='llm_evaluation_workshop_{current_datetime_utc}' type=evaluation --resource-group={resource_group} --workspace-name={workspace_name}
"""

# Start the command and get the process
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

# Stream the output
while True:
    output = process.stdout.readline()
    if output == '' and process.poll() is not None:
        break
    if output:
        print(output.strip())

# Check for errors
err = process.stderr.read()
if err:
    print(f"Error: {err}")

You should see output similar to the following.

Flow created successfully:
```json
{
  "name": "72e5ccb9-88bc-430a-90e4-638d71455f92",
  "type": "evaluation",
  "path": "Users/luca.stamatescu/promptflow/Custom-Evaluation-Metrics-06-19-2024-06-30-50/flow.dag.yaml",
  "code": "azureml://locations/australiaeast/workspaces/<your-value>/flows/72e5ccb9-88bc-430a-90e4-638d71455f92",
  "display_name": "llm_evaluation_workshop_2024-06-18_20:30:27_UTC",
  "owner": {
    "user_object_id": "70d305b4-2657-46fe-ae29-a684065e4a0e",
    "user_tenant_id": "<your-value>",
    "user_name": "Luca Stamatescu"
  },
  "is_archived": false,
  "created_date": "2024-06-18 20:31:01.329366+00:00",
  "flow_portal_url": "https://ai.azure.com/projectflows/<your-value>/details/Flow?wsid=/subscriptions/<your-value>/resourcegroups/ragdemo/providers/Microsoft.MachineLearningServices/workspaces/lstamatescu-8228"
}
```

Set the flow path below using the "name" value from the output above. You must add azureml: as a prefix.

Example: "azureml:72e5ccb9-88bc-430a-90e4-638d71455f92"
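
In an automated pipeline, the "name" value would normally be read from the command output rather than copied by hand. The following is a minimal sketch of that idea. It assumes the CLI output has been captured into a string (the cell above only prints it) and that the first `{` in that text begins the flow JSON object. The helper name `flow_reference_from_output` is hypothetical.

```python
import json


def flow_reference_from_output(cli_output):
    """Return 'azureml:<name>' from text that contains the flow JSON object."""
    start = cli_output.index("{")  # skip any status text before the JSON
    flow_info, _ = json.JSONDecoder().raw_decode(cli_output, start)
    return f"azureml:{flow_info['name']}"


# Abbreviated sample based on the output shown above
sample_output = """Flow created successfully:
{
  "name": "72e5ccb9-88bc-430a-90e4-638d71455f92",
  "type": "evaluation"
}"""

flow_reference = flow_reference_from_output(sample_output)
print(flow_reference)
```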

In [None]:
path_on_azure_ai_studio="azureml:<your-flow-name>"

# The evaluation data file is located in the Part 4 Data/Input folder under the base directory
data_path = os.path.join(base_dir, "Part-4-custom-evaluation-metrics/Data/Input/Evaluation_Dataset_Combined_SME_Ground_Truth_And_Logging_Artefacts_quick_test.csv")


In [None]:
# Get the current datetime in UTC and format it as a string
current_datetime_utc = datetime.datetime.now(datetime.timezone.utc).strftime('%Y-%m-%d_%H:%M:%S_UTC')

# Include the formatted datetime string in the display name
command = f"""
pfazure run create --flow {path_on_azure_ai_studio} --data {data_path} --set display_name='CICDEvalRun_{current_datetime_utc}' tags.product='AzureOpenAIQnAChatBot' --column-mapping question='${{data.question}}' answer='${{data.answer}}' context='${{data.context}}' ground_truth='${{data.ground_truth}}' metrics='${{data.metrics}}' groundtruthcontext='${{data.groundtruthcontext}}' selected_tool='${{data.selected_tool}}' selected_tool_ground_truth='${{data.selected_tool_ground_truth}}' --resource-group={resource_group} --workspace-name={workspace_name} --stream -s
"""

# Start the command and get the process
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

# Stream the output
while True:
    output = process.stdout.readline()
    if output == '' and process.poll() is not None:
        break
    if output:
        print(output.strip())

# Check for errors
err = process.stderr.read()
if err:
    print(f"Error: {err}")

At the end of this output, you will see a JSON object similar to the following:

```json
{
  "name": "n72e5ccb9_88bc_430a_90e4_638d71455f92_variant_0_20240619_063748_166730",
  "created_on": "2024-06-18T20:37:54.769895+00:00",
  "status": "Completed",
  "display_name": "CICDEvalRun_2024-06-18_20:37:41_UTC",
  "description": null,
  "tags": {
    "product": "AzureOpenAIQnAChatBot"
  },
  "properties": {
    "azureml.promptflow.inputs_mapping": "{\"question\":\"${data.question}\",\"answer\":\"${data.answer}\",\"context\":\"${data.context}\",\"ground_truth\":\"${data.ground_truth}\",\"metrics\":\"${data.metrics}\",\"groundtruthcontext\":\"${data.groundtruthcontext}\",\"selected_tool\":\"${data.selected_tool}\",\"selected_tool_ground_truth\":\"${data.selected_tool_ground_truth}\"}",
    "azureml.promptflow.runtime_name": "automatic",
    "azureml.promptflow.disable_trace": "false",
    "azureml.promptflow.session_id": "72e5ccb9-88bc-430a-90e4-638d71455f92",
    "azureml.promptflow.definition_file_name": "flow.dag.yaml",
    "azureml.promptflow.flow_lineage_id": "72e5ccb9-88bc-430a-90e4-638d71455f92",
    "azureml.promptflow.flow_definition_resource_id": "azureml://locations/australiaeast/workspaces/<your-value>/flows/72e5ccb9-88bc-430a-90e4-638d71455f92",
    "azureml.promptflow.flow_id": "72e5ccb9-88bc-430a-90e4-638d71455f92",
    "_azureml.evaluation_run": "promptflow.BatchRun",
    "azureml.promptflow.snapshot_id": "12f2bd19-d360-4ce8-9dbe-72620a2782c2",
    "azureml.promptflow.runtime_version": "20240529.v1",
    "_azureml.evaluate_artifacts": "[{\"path\": \"instance_results.jsonl\", \"type\": \"table\"}]",
    "azureml.promptflow.total_tokens": "26856",
    "azureml.promptflow.completion_tokens": "519",
    "azureml.promptflow.prompt_tokens": "26337"
  },
  "creation_context": {
    "userObjectId": "70d305b4-2657-46fe-ae29-a684065e4a0e",
    "userPuId": "100320030B1E9B4C",
    "userIdp": "https://sts.windows.net/72f988bf-86f1-41af-91ab-2d7cd011db47/",
    "userAltSecId": "5::1003200302599F5A",
    "userIss": "https://sts.windows.net/16b3c013-d300-468d-ac64-7eda0820b6d3/",
    "userTenantId": "<your-value>",
    "userName": "Luca Stamatescu",
    "upn": null
  },
  "start_time": "2024-06-18T20:39:10.867385+00:00",
  "end_time": "2024-06-18T20:39:52.249316+00:00",
  "duration": "00:00:41.3819309",
  "portal_url": "https://ai.azure.com/projectflows/bulkrun/run/n72e5ccb9_88bc_430a_90e4_638d71455f92_variant_0_20240619_063748_166730/details?wsid=/subscriptions/<your-value>/resourcegroups/ragdemo/providers/Microsoft.MachineLearningServices/workspaces/lstamatescu-8228",
  "data": "azureml://datastores/workspaceblobstore/paths/LocalUpload/70c9645db5bacbc99157923d70e0e310/Evaluation_Dataset_Combined_SME_Ground_Truth_And_Logging_Artefacts_quick_test.csv",
  "output": "azureml://locations/australiaeast/workspaces/<your-value>/data/azureml_n72e5ccb9_88bc_430a_90e4_638d71455f92_variant_0_20240619_063748_166730_output_data_flow_outputs/versions/1"
}
```

In [None]:
# You can also list the runs using the command below

# # Build the command that lists the runs in the workspace
# command = f"""
# pfazure run list  --resource-group={resource_group} --workspace-name={workspace_name}
# """

# # Start the command and get the process
# process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

# # Stream the output
# while True:
#     output = process.stdout.readline()
#     if output == '' and process.poll() is not None:
#         break
#     if output:
#         print(output.strip())

# # Check for errors
# err = process.stderr.read()
# if err:
#     print(f"Error: {err}")

The "name" value from the run output above is used in the next step.

Example: "n72e5ccb9_88bc_430a_90e4_638d71455f92_variant_0_20240619_063748_166730"
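
When the run is streamed, the JSON object appears after the log lines, so a script has to locate it before it can read "name" or check "status". The sketch below is one possible approach and not part of the workshop code. It assumes the streamed output was captured as a single string and that the run summary is the last top-level JSON object in it. The helper name `last_json_object` and the sample log lines are hypothetical.

```python
import json


def last_json_object(text):
    """Return the last top-level JSON object found in a block of text."""
    decoder = json.JSONDecoder()
    found = None
    pos = text.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one
            pos = text.find("{", pos + 1)
            continue
        found = obj
        # Continue searching after the object that was just decoded
        pos = text.find("{", end)
    if found is None:
        raise ValueError("No JSON object found in the output")
    return found


# Illustrative sample: made-up log lines followed by an abbreviated run summary
sample_stream = """some log line {not json}
another log line
{
  "name": "n72e5ccb9_88bc_430a_90e4_638d71455f92_variant_0_20240619_063748_166730",
  "status": "Completed",
  "tags": {"product": "AzureOpenAIQnAChatBot"}
}"""

run_info = last_json_object(sample_stream)
if run_info["status"] != "Completed":
    raise RuntimeError(f"Run did not complete: {run_info['status']}")
extracted_run_name = run_info["name"]
print(extracted_run_name)
```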

In [None]:
run_name="<your-run-name>"

In [None]:
# Build the command that retrieves the metrics for the evaluation run
command = f"""
pfazure run show-metrics --name {run_name} --resource-group={resource_group} --workspace-name={workspace_name}
"""

# Start the command and get the process
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

# Initialize a list to store output lines
output_lines = []

# Stream the output and store in the list
while True:
    output = process.stdout.readline()
    if output == '' and process.poll() is not None:
        break
    if output:
        print(output.strip())
        output_lines.append(output.strip())

# Join the output lines into a single string
full_output = "\n".join(output_lines)

# Check for errors
err = process.stderr.read()
if err:
    print(f"Error: {err}")

# `full_output` now contains the command's standard output as a single string,
# ready to be parsed in the next step
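
The run-stream-capture pattern repeated in the cells above could be wrapped in a small helper. The following is a sketch under stated assumptions, not the workshop's implementation. The name `run_and_capture` is hypothetical, the command is passed as an argument list instead of a shell string, and stderr is drained on a background thread so that a command writing heavily to stderr cannot stall while stdout is being read.

```python
import subprocess
import sys
import threading


def run_and_capture(args):
    """Run a command, echo stdout line by line, and return (returncode, stdout, stderr)."""
    process = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    stderr_chunks = []
    # Read stderr concurrently so a full stderr pipe cannot block the child process
    drain = threading.Thread(
        target=lambda: stderr_chunks.append(process.stderr.read()), daemon=True
    )
    drain.start()

    stdout_lines = []
    for line in process.stdout:
        line = line.rstrip("\n")
        print(line)
        stdout_lines.append(line)

    returncode = process.wait()
    drain.join()
    return returncode, "\n".join(stdout_lines), "".join(stderr_chunks)


# Example with a harmless command; substitute the pfazure argument list in practice
code, out, err = run_and_capture(
    [sys.executable, "-c", "import sys; print('hello'); print('oops', file=sys.stderr)"]
)
```

Returning the exit code separately from stderr lets the caller distinguish a failed command from one that merely printed warnings.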

## Use results to approve or reject a deployment

You can set a threshold, and choose whether to proceed with a deployment based on the output score.

In [None]:
import json

# Assuming `full_output` is a JSON string that includes the gpt_similarity score
json_output = json.loads(full_output)

# Retrieve the GPT similarity score from the JSON output
gpt_similarity_score = json_output["gpt_similarity"]

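# Minimum GPT similarity score required to approve the deployment
threshold = 4.5

# Decision making based on the GPT similarity score
if gpt_similarity_score > threshold:
    print(f"Deployment approved: GPT similarity score {gpt_similarity_score} is above {threshold}.")
    # Code to approve the deployment
else:
    print(f"Deployment declined: GPT similarity score {gpt_similarity_score} is not above {threshold}.")
    # Code to decline the deployment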
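
In a pipeline, the approve or decline decision is commonly communicated through the exit code of the script, since most CI/CD systems treat a non-zero exit code as a failed step. The sketch below illustrates that convention. The function name `gate_exit_code` and the example metric values are hypothetical, and the metrics are assumed to have already been parsed into a dictionary.

```python
def gate_exit_code(metrics, metric_name="gpt_similarity", threshold=4.5):
    """Return 0 if the metric is above the threshold, otherwise 1."""
    score = metrics.get(metric_name)
    if score is None:
        print(f"Deployment declined: metric '{metric_name}' not found.")
        return 1
    if score > threshold:
        print(f"Deployment approved: {metric_name} {score} is above {threshold}.")
        return 0
    print(f"Deployment declined: {metric_name} {score} is not above {threshold}.")
    return 1


# Illustrative values only, not results from a real run
example_metrics = {"gpt_similarity": 4.8}
exit_code = gate_exit_code(example_metrics)

# In a standalone pipeline script, finish with: sys.exit(exit_code)
```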