# Evaluate Time-Series Forecasting AutoML Models with the Responsible AI Dashboard

This notebook will allow you to use the Responsible AI components in AzureML to assess a time-series forecasting model trained with AutoML.

To use this notebook, you must have:

1. a valid AzureML subscription ID, workspace, and resource group
2. a trained AutoML time-series forecasting model
3. a compute cluster on AzureML (if the compute name you provide does not exist, a cluster is created for you later in this notebook)

If you have not trained an AutoML model yet, use the forecasting-data-preprocessing notebook to prepare your data for training.

In [None]:
%pip install datasets

### 1. Workspace Details

Enter the details of your AzureML workspace below. If the provided compute name does not exist, it will be created for you later.

In [None]:
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AML_WORKSPACE_NAME>"
compute_name = "rai-cluster"
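Before moving on, it can help to confirm that no `<PLACEHOLDER>` values were left in place. The following is a minimal sketch, not part of the AzureML SDK; the helper name `find_unfilled` and the sample values are made up for illustration.

```python
def find_unfilled(settings: dict) -> list:
    """Return the names of settings that still hold a <PLACEHOLDER> value."""
    return [
        name
        for name, value in settings.items()
        if value.startswith("<") and value.endswith(">")
    ]


# Illustrative values only; in the notebook you would pass your own variables.
example_settings = {
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group": "my-resource-group",  # made-up value
    "workspace": "<AML_WORKSPACE_NAME>",
}
missing = find_unfilled(example_settings)
print(missing)
```

An empty list means every setting has been filled in.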

### 2. Model Details
Enter the name of your model. You can find it under **"Output name: best model"** in the **Outputs** section. The value takes the form `name:version`, as in the example below.

In [None]:
model_id = "azureml_plucky_shelf_lgrt16cy1f_40_output_mlflow_log_model_261482756:1"
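The model ID above appears to follow a `name:version` pattern. As a hedged sketch, a small helper can check that shape before the pipeline is submitted; `split_model_id` is a hypothetical name, not an AzureML function.

```python
def split_model_id(model_id: str) -> tuple:
    """Split a 'name:version' string on its last colon and validate both parts."""
    name, sep, version = model_id.rpartition(":")
    if not sep or not name or not version:
        raise ValueError(f"Expected 'name:version', got {model_id!r}")
    return name, version


# Illustrative ID only; substitute your own model_id.
example_name, example_version = split_model_id("my_forecasting_model:3")
print(example_name, example_version)
```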

### 3. Data Details

Enter the details of the data you used to train your AutoML model. If you created your data from the forecasting-data-preprocessing notebook, the dataset names and versions should be the same across the two notebooks.

In [None]:
data_version = "1"
input_train_data = "forecasting_train_mltable"
input_test_data = "forecasting_test_mltable"
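Later in this notebook, the dataset names and version are combined into asset references of the form `azureml:<name>:<version>`. The sketch below only illustrates that string format; `asset_uri` is a hypothetical helper.

```python
def asset_uri(name: str, version: str) -> str:
    """Build an 'azureml:<name>:<version>' reference to a registered data asset."""
    return f"azureml:{name}:{version}"


train_uri = asset_uri("forecasting_train_mltable", "1")
test_uri = asset_uri("forecasting_test_mltable", "1")
print(train_uri)
print(test_uri)
```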

### 4. Get a handle to the workspace

We will use the information provided in the Workspace Details section to get a handle to the required Azure Machine Learning workspace. No additional input is required for this section.

In [None]:
import json
import time

from azure.ai.ml import Input, MLClient, Output, dsl, load_job
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import PipelineJob
from azure.identity import DefaultAzureCredential

# HTML is imported from IPython.display; the IPython.core.display path is deprecated
from IPython.display import HTML, display

In [None]:
# Create MLClient object

credential = DefaultAzureCredential()
ml_client = MLClient(
    credential=credential,
    subscription_id=subscription_id,
    resource_group_name=resource_group,
    workspace_name=workspace,
)
print(ml_client)

# Get handle to azureml registry for the RAI built in components

registry_name = "azureml"
ml_client_registry = MLClient(
    credential=credential,
    subscription_id=subscription_id,
    resource_group_name=resource_group,
    registry_name=registry_name,
)
print(ml_client_registry)

In [None]:
from azure.ai.ml.entities import AmlCompute

all_compute_names = [x.name for x in ml_client.compute.list()]

if compute_name in all_compute_names:
    print(f"Found existing compute: {compute_name}")
else:
    my_compute = AmlCompute(
        name=compute_name,
        size="Standard_D2_v2",
        min_instances=0,
        max_instances=4,
        idle_time_before_scale_down=3600,
    )
    ml_client.compute.begin_create_or_update(my_compute).result()
    # .result() above blocks until the operation finishes, so the cluster exists here
    print(f"Created compute: {compute_name}")

In [None]:
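# Helper function: submit a pipeline job and block until it reaches a terminal status
def submit_and_wait(ml_client, pipeline_job) -> PipelineJob:
    """Submit pipeline_job, poll its status every 30 seconds, and return the completed job."""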
    created_job = ml_client.jobs.create_or_update(pipeline_job)
    assert created_job is not None

    print("Pipeline job can be accessed in the following URL:")
    display(HTML('<a href="{0}">{0}</a>'.format(created_job.studio_url)))

    while created_job.status not in [
        "Completed",
        "Failed",
        "Canceled",
        "NotResponding",
    ]:
        time.sleep(30)
        created_job = ml_client.jobs.get(created_job.name)
        print("Latest status : {0}".format(created_job.status))
    assert created_job.status == "Completed"
    return created_job
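The helper above polls until the job reaches a terminal status, with no upper bound on the wait. If you prefer a bounded wait, the same loop can be sketched with a deadline. This is an illustrative variant, not the notebook's method: `wait_for_terminal_status` and its parameters are hypothetical, and a stand-in status source replaces the real `ml_client.jobs.get` call so the sketch runs on its own.

```python
import time

TERMINAL_STATUSES = {"Completed", "Failed", "Canceled", "NotResponding"}


def wait_for_terminal_status(
    get_status, poll_seconds=30.0, timeout_seconds=3600.0, sleep=time.sleep
):
    """Call get_status() until it returns a terminal status or the deadline passes."""
    deadline = time.monotonic() + timeout_seconds
    status = get_status()
    while status not in TERMINAL_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job still in status {status!r} after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
        status = get_status()
    return status


# Stand-in status source for illustration; a real caller would wrap
# something like: lambda: ml_client.jobs.get(job_name).status
fake_statuses = iter(["Running", "Running", "Completed"])
final_status = wait_for_terminal_status(
    lambda: next(fake_statuses), poll_seconds=0.0
)
print(final_status)
```

Passing the status lookup as a callable keeps the loop independent of the SDK client and easy to exercise without submitting a job.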

### 5. Run the RAI Dashboard Pipeline

We will now define and start the pipeline that generates an RAI Dashboard from your model. Adjust the following four parameters to match your data.

1. **target_column** should be the name of the column that you want your model to predict over time. 

2. **datetime_column** should be the name of the column in your dataset with datetime features.

3. **time_series_id_column** should be the column in your dataset which represents the group IDs that describe which time-series in your dataset each row belongs to. If your dataset contains a single time-series, then the value in this column should be the same in every row. See forecasting-data-preprocessing.ipynb for more details.

4. **categorical_columns** should be a list of all the columns in your data **except** for the three columns you already provided. These columns are the features that you can perturb with What-If Analysis once the Dashboard is launched. The time-series ID column is added to this list automatically in a later cell.

In [None]:
target_column = "demand"
datetime_column = "datetime"
time_series_id_column = "group_id"
categorical_columns = ["precip", "temp"]
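The rule for **categorical_columns** (every column except the target, datetime, and time-series ID columns) can also be applied programmatically. The following is a minimal sketch; `feature_columns` is a hypothetical helper, and the column list is an assumed example rather than the schema of your dataset.

```python
def feature_columns(all_columns, target, datetime_col, series_id):
    """Return every column that is not the target, datetime, or series ID column."""
    reserved = {target, datetime_col, series_id}
    return [column for column in all_columns if column not in reserved]


# Assumed example schema; replace with the columns of your own dataset.
example_columns = ["datetime", "group_id", "precip", "temp", "demand"]
example_features = feature_columns(
    example_columns, target="demand", datetime_col="datetime", series_id="group_id"
)
print(example_features)  # ['precip', 'temp']
```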

In [None]:
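# No changes needed here: serialize the column settings for the RAI components
feature_metadata = json.dumps(
    {
        "datetime_features": [datetime_column],
        "time_series_id_features": [time_series_id_column],
    }
)
# The time-series ID column is also passed as categorical; add it only if it is missing
if time_series_id_column not in categorical_columns:
    categorical_columns.append(time_series_id_column)
categorical_column_names = json.dumps(categorical_columns)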
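For reference, the two strings built in the cell above are plain JSON. The sketch below reproduces them with the example column names from this notebook, using separate variable names so that nothing defined earlier is overwritten.

```python
import json

# Same structure as the cell above, with the example column names filled in
example_feature_metadata = json.dumps(
    {
        "datetime_features": ["datetime"],
        "time_series_id_features": ["group_id"],
    }
)
example_categorical_column_names = json.dumps(["precip", "temp", "group_id"])

print(example_feature_metadata)
# {"datetime_features": ["datetime"], "time_series_id_features": ["group_id"]}
print(example_categorical_column_names)
# ["precip", "temp", "group_id"]
```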

In [None]:
def forecasting_rai_dashboard():

    rai_constructor_component = ml_client_registry.components.get(
        name="microsoft_azureml_rai_tabular_insight_constructor", label="latest"
    )

    rai_gather_component = ml_client_registry.components.get(
        name="microsoft_azureml_rai_tabular_insight_gather", label="latest"
    )

    train_data = Input(
        type="mltable",
        path=f"azureml:{input_train_data}:{data_version}",
        mode="download",
    )
    test_data = Input(
        type="mltable",
        path=f"azureml:{input_test_data}:{data_version}",
        mode="download",
    )

    # The pipeline adds no analysis components; it relies on the constructor component to verify that the model works
    @dsl.pipeline(
        compute=compute_name,
        description="Create RAI Dashboard for Forecasting",
        experiment_name="create_rai_dashboard_forecasting",
    )
    def construct_dashboard(train_data, test_data):
        construct_job = rai_constructor_component(
            title="RAI Forecasting Dashboard",
            task_type="forecasting",
            model_info=model_id,
            model_input=Input(type=AssetTypes.MLFLOW_MODEL, path=f"azureml:{model_id}"),
            train_dataset=train_data,
            test_dataset=test_data,
            target_column_name=target_column,
            feature_metadata=feature_metadata,
            categorical_column_names=categorical_column_names,
            maximum_rows_for_test_dataset=5000,
            classes="[]",  # Should be default value
            use_model_dependency=True,
        )
        construct_job.set_limits(timeout=1800)

        rai_gather_job = rai_gather_component(
            constructor=construct_job.outputs.rai_insights_dashboard
        )
        rai_gather_job.set_limits(timeout=1800)

    insights_pipeline_job = construct_dashboard(
        train_data=train_data,
        test_data=test_data,
    )

    # Submit the pipeline and wait for it to finish
    insights_pipeline_job = submit_and_wait(ml_client, insights_pipeline_job)
    assert insights_pipeline_job is not None

In [None]:
forecasting_rai_dashboard()