# Upload of evaluators
This notebook demonstrates how to save the standard evaluators locally and upload them to Azure.

### Import

In [None]:
import os
import json
import pandas as pd
import shutil
import uuid
import yaml

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.ai.ml.entities import (
    Model
)

from promptflow.client import PFClient
from promptflow.evals.evaluate import evaluate
from promptflow.evals.evaluators import F1ScoreEvaluator

## End-to-end demonstration of evaluator saving and uploading to Azure.
### Saving the standard evaluators to the flex format.
First we will create the promptflow client, which will be used to save the existing flows.

In [None]:
pf = PFClient()

We will use the F1 score evaluator from the standard evaluator set and save it to a local directory.

In [None]:
pf.flows.save(F1ScoreEvaluator, path='./f1_score')

Let us inspect what has been saved.

In [None]:
print('\n'.join(os.listdir('f1_score')))

The file that defines the entry point of our model is called `flow.flex.yaml`. Let us display its entry point.

In [None]:
with open(os.path.join('f1_score', 'flow.flex.yaml')) as fp:
    flex_definition = yaml.safe_load(fp)
print(f"The evaluator entrypoint is {flex_definition['entry']}")
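
The entry value printed above can be taken apart programmatically. The sketch below assumes the entry has the form `module:callable`; the helper `split_entry` and the sample entry string are hypothetical and are not taken from the saved flow.

```python
from typing import Tuple


def split_entry(entry: str) -> Tuple[str, str]:
    """Split an entry of the assumed form 'module:callable' into its two parts."""
    module_name, separator, callable_name = entry.partition(":")
    if not separator or not module_name or not callable_name:
        raise ValueError(f"Unexpected entry format: {entry!r}")
    return module_name, callable_name


# Hypothetical entry value, used only for illustration.
module_name, callable_name = split_entry("my_package.my_module:MyEvaluator")
print(f"module={module_name}, callable={callable_name}")
```

With the real notebook, `flex_definition['entry']` could be passed in place of the sample string.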

In [None]:
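# Submit the saved flow as a batch run over data.jsonl,
# which must already exist in the working directory.
pf = PFClient()
run = pf.run(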
    flow='f1_score',
    data='data.jsonl',
    name=f'test_{uuid.uuid1()}',
    stream=True
)

Now let us test the flow with a simple dataset consisting of one ground truth sentence and one actual sentence.

In [None]:
data = pd.DataFrame({
    "ground_truth": ["January is the coldest winter month."],
    "answer": ["June is the coldest summer month."]
})
in_file = 'sample_data.jsonl'
# orient='records' never writes the index, so no index argument is needed.
data.to_json(in_file, orient='records', lines=True)
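
The JSON Lines layout produced above (one JSON object per line) can be checked without pandas. This is a minimal sketch that writes the same record to a temporary directory using only the standard library and reads it back; the temporary path is an assumption made to keep the example self-contained.

```python
import json
import os
import tempfile

records = [
    {
        "ground_truth": "January is the coldest winter month.",
        "answer": "June is the coldest summer month.",
    }
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample_data.jsonl")
    # Write one JSON object per line.
    with open(path, "w", encoding="utf-8") as fp:
        for record in records:
            fp.write(json.dumps(record) + "\n")
    # Read the file back, skipping any blank lines.
    with open(path, encoding="utf-8") as fp:
        loaded = [json.loads(line) for line in fp if line.strip()]

print(loaded)
```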

Load the evaluator saved in the flex format and test it.

In [None]:
flow_result = pf.test(flow='f1_score', inputs='sample_data.jsonl')
print(f"Flow outputs: {flow_result}")
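
For intuition about what an F1 score over shared words measures, here is a simplified sketch. It is not promptflow's implementation: the tokenization (lowercasing, stripping punctuation, splitting on whitespace) and the function names `tokenize` and `token_f1` are assumptions, so the value may differ from what `F1ScoreEvaluator` returns.

```python
import string
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace (assumed scheme)."""
    stripped = text.lower().translate(str.maketrans("", "", string.punctuation))
    return stripped.split()


def token_f1(answer: str, ground_truth: str) -> float:
    """Harmonic mean of token precision and recall between answer and ground truth."""
    answer_tokens = tokenize(answer)
    truth_tokens = tokenize(ground_truth)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    shared = sum((Counter(answer_tokens) & Counter(truth_tokens)).values())
    if shared == 0:
        return 0.0
    precision = shared / len(answer_tokens)
    recall = shared / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


score = token_f1(
    "June is the coldest summer month.",
    "January is the coldest winter month.",
)
print(f"{score:.3f}")
```

Under this tokenization the two sentences share four of six tokens, so precision and recall are both 4/6 and the F1 score is 2/3.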

Now we have all the tools to upload our model to Azure.
### Uploading data to Azure
First we need to authenticate to Azure. For this purpose we will use a configuration file, `config.json`, with the following structure.
```json
{
    "resource_group_name": "resource-group-name",
    "workspace_name": "ws-name",
    "subscription_id": "subscription-uuid",
    "registry_name": "registry-name"
}
```


In [None]:
with open('config.json') as f:
    configuration = json.load(f)
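
A missing key in the configuration only surfaces later, when the client is created. The sketch below validates the four keys shown above right after loading; `load_config` and `REQUIRED_KEYS` are hypothetical names, and the demo writes a placeholder file to a temporary directory rather than reading a real `config.json`.

```python
import json
import os
import tempfile

# Keys taken from the configuration structure shown above.
REQUIRED_KEYS = {
    "resource_group_name",
    "workspace_name",
    "subscription_id",
    "registry_name",
}


def load_config(path: str) -> dict:
    """Load a JSON configuration file and fail early if a required key is missing."""
    with open(path, encoding="utf-8") as fp:
        config = json.load(fp)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"Missing configuration keys: {sorted(missing)}")
    return config


# Demo with placeholder values only.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "config.json")
    with open(demo_path, "w", encoding="utf-8") as fp:
        json.dump({key: "placeholder" for key in REQUIRED_KEYS}, fp)
    demo_config = load_config(demo_path)

print(sorted(demo_config))
```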

#### Uploading to the workspace
In this scenario we will not need the `registry_name` in our configuration.

In [None]:
config_ws = configuration.copy()
del config_ws["registry_name"]

credential = DefaultAzureCredential()
ml_client = MLClient(
    credential=credential,
    **config_ws,
)
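
The copy-and-delete step can also be written as a small helper that leaves the original dictionary untouched. This is only a sketch: `without_key` is a hypothetical helper, and the configuration values are placeholders.

```python
def without_key(config: dict, key: str) -> dict:
    """Return a shallow copy of config with one key left out."""
    return {k: v for k, v in config.items() if k != key}


# Placeholder values mirroring the configuration structure shown earlier.
configuration = {
    "resource_group_name": "resource-group-name",
    "workspace_name": "ws-name",
    "subscription_id": "subscription-uuid",
    "registry_name": "registry-name",
}

config_ws = without_key(configuration, "registry_name")    # workspace scenario
config_reg = without_key(configuration, "workspace_name")  # registry scenario
print(sorted(config_ws))
print(sorted(config_reg))
```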

We will use the evaluator operations API to upload our model to the workspace.

In [None]:
# Named `evaluator` to avoid shadowing the built-in `eval`.
evaluator = Model(
    path="f1_score",
    name='F1Score-Evaluator',
    description="Measures the ratio of the number of shared words between the model generation and the ground truth answers.",
)
ml_client.evaluators.create_or_update(evaluator)

Now we will retrieve the model and check that it is functional.

In [None]:
ml_client.evaluators.download('F1Score-Evaluator', version='1', download_path='f1_score_downloaded')

In [None]:
flow_result = pf.test(flow=os.path.join('f1_score_downloaded', 'F1Score-Evaluator', 'f1_score'), inputs='data.jsonl')
print(f"Flow outputs: {flow_result}")

In [None]:
shutil.rmtree('f1_score_downloaded')
assert not os.path.isdir('f1_score_downloaded')

#### Uploading to the registry
In this scenario we will not need the `workspace_name` in our configuration.

In [None]:
config_reg = configuration.copy()
del config_reg["workspace_name"]

ml_client = MLClient(
    credential=credential,
    **config_reg
)

We create a new evaluator object here because `create_or_update` modifies the model in place, adding a nonexistent link to the workspace.

In [None]:
# Named `evaluator` to avoid shadowing the built-in `eval`.
evaluator = Model(
    path="f1_score",
    name='F1Score-Evaluator',
    description="Measures the ratio of the number of shared words between the model generation and the ground truth answers.",
    properties={"show-artifact": "true"}
)
ml_client.evaluators.create_or_update(evaluator)

Now we will perform the same sanity check that we did for the workspace.

In [None]:
ml_client.evaluators.download('F1Score-Evaluator', version='1', download_path='f1_score_downloaded')
flow_result = pf.test(flow=os.path.join('f1_score_downloaded', 'F1Score-Evaluator', 'f1_score'), inputs='data.jsonl')
print(f"Flow outputs: {flow_result}")

In [None]:
from promptflow.core import Flow

# This is not working but it should. Will uncomment once PF team provides a fix.
# f = Flow.load('f1_score_downloaded/F1Score-Evaluator/f1_score')
# f(question='What is the capital of France?', answer='Paris', ground_truth='Paris is the capital of France.')

Finally, we will do the cleanup.

In [None]:
shutil.rmtree('f1_score_downloaded')
assert not os.path.isdir('f1_score_downloaded')
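
As an alternative to calling `shutil.rmtree` by hand, a temporary directory can own the downloaded files and remove them automatically. This sketch uses only the standard library and replaces the download step with a stand-in that creates the same folder layout, so it is an illustration under that assumption and not a replacement for the cells above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as download_dir:
    # Stand-in for the download step: recreate the folder layout used above.
    flow_dir = os.path.join(download_dir, "F1Score-Evaluator", "f1_score")
    os.makedirs(flow_dir)
    existed_inside = os.path.isdir(flow_dir)

# The directory and everything in it are removed when the block exits.
removed_after = not os.path.isdir(download_dir)
print(existed_inside, removed_after)
```

In the notebook, `download_dir` could be passed as `download_path`, with the flow test run inside the `with` block.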