# PRODUCTION phase: About this notebook
- Purpose: Creates one AKS webservice that serves the model as an ONLINE endpoint
    - `AKS ONLINE Webservice:` Fetches the best trained model and deploys it on an `AKS cluster` that is always up and running, ready to be called and to return results (via REST / Swagger, or the Python SDK)

## DETAILS - about this notebook and the 2 pipelines, generated            
- 1) `Initiates ESMLProject` and sets `active model` and `active date folder`:
- 2) `DEPLOY & SERVE: Fetches the BEST MODEL and deploys it on AKS`
- 3) `Smoke testing: Fetches some data and calls the webservice` (smoke testing purpose - see that it works...)
    - Gets test data
    - Calls webservice, which both returns data via REST call, and ESML optionally also saves the returned result to datalake 
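
As a rough sketch of what the REST call in step 3 looks like outside ESML: the request is an HTTP POST with a JSON body, with the API key passed as a bearer token. The helper name `build_scoring_request`, the URL, the key, the column names, and the `{"data": [...]}` payload shape are assumptions for illustration only. The actual body format depends on the scoring script of the deployed model, and a private AKS endpoint is only reachable from inside its network.

```python
import json
import urllib.request


def build_scoring_request(api_uri, api_key, rows):
    """Build (but do not send) a POST request for a key-protected scoring endpoint."""
    # Assumed payload shape: {"data": [...]}; check your scoring script for the real one.
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,  # key-based auth header
    }
    return urllib.request.Request(api_uri, data=body, headers=headers, method="POST")


# Placeholder values, not a real endpoint or key
request = build_scoring_request(
    "https://example.invalid/api/v1/service/my-service/score",
    "my-secret-key",
    [{"AGE": 59, "BMI": 32.1}],
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

In the notebook itself, `p.call_webservice(...)` wraps this call, so the sketch is mainly useful for callers that do not use the Python SDK.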

# Login / Switch DEV_TEST_PROD environment (1-timer)

import sys
sys.path.insert(0, "../azure-enterprise-scale-ml/esml/common/")
from azureml.core.authentication import InteractiveLoginAuthentication
from esml import ESMLProject

p = ESMLProject() # Will search in ROOT for your copied SETTINGS folder '../settings/model/active/active_scoring_in_folder.json',
p.dev_test_prod="dev"
auth = InteractiveLoginAuthentication(tenant_id = p.tenant)
#auth = InteractiveLoginAuthentication(force=True, tenant_id = p.tenant)
ws, config_name = p.authenticate_workspace_and_write_config(auth)

# 1) `Initiates ESMLProject` and sets `active model` and `active date folder`:

In [None]:
import sys
sys.path.insert(0, "../azure-enterprise-scale-ml/")
from esmlrt.interfaces.iESMLController import IESMLController
sys.path.insert(0, "../azure-enterprise-scale-ml/esml/common/")
from esml import ESMLProject
import pandas as pd

p = ESMLProject() # Will search in ROOT for your copied SETTINGS folder '../settings/model/active/active_scoring_in_folder.json',
p.active_model = 11
p.inference_mode = False
p.ws = p.get_workspace_from_config() #2) Load DEV or TEST or PROD Azure ML Studio workspace
p.verbose_logging = False

## 2a) `DEPLOY`: Option A - Let ESML find BEST Model ( and its environment, scoring script) 

In [None]:
inference_config, model, best_run = IESMLController.get_best_model_inference_config(p.ws, p.model_folder_name, p.ModelAlias)
service,api_uri, kv_aks_api_secret= p.deploy_model_as_private_aks_online_endpoint(model,inference_config,overwrite_endpoint=True)

## 2b) `DEPLOY`: Option B - Inject YOUR selection of model and run, any model, override "ESML best model logic"

In [None]:
from azureml.core import Experiment
from azureml.core import Model
from azureml.pipeline.core import PipelineRun
from azureml.train.automl.run import AutoMLRun

def get_best_model_inference_config(model_name,model_version, run_id = None):
    print(" - model_name {} | version {} | run_id: {}".format(model_name,model_version,run_id))
    model = Model(workspace=p.ws,name=model_name, version=model_version)
    experiment = Experiment(p.ws,p.experiment_name )
    best_run = None
    if(run_id is not None):
        main_run = PipelineRun(experiment=experiment, run_id=run_id)
        best_run = main_run
    inference_config, model, best_run = IESMLController.get_best_model_inference_config(p.ws, p.model_folder_name, p.ModelAlias,scoring_script_folder_local=None, current_model=model,run_id_tag=run_id, best_run = best_run)
    return inference_config, model, best_run

#### Option B: Model A: ...Deploy ANY model (not only the best promoted model)
- Find the model name, model version, and run ID in the Models registry table in Azure ML Studio

In [None]:
##### Model 1  (Example Pipeline: DatabricksSteps + ManualML)
model_name = 'your_model_folder_name' # '11_diabetes_model_reg' - Find the model name in Azure ML Studio/Model register or lake_settings.json
model_version = 1 # Find the model version in Azure ML Studio/Model register
run_id = 'todo_c70-3ef4-470c-9f55-92b33318c8ad' # '9360ac70-3ef4-470c-9f55-92b33318c8ad' # Main pipeline run - Find the run ID in Azure ML Studio/Model register

inference_config, model, best_run = get_best_model_inference_config(model_name,model_version,run_id)
service,api_uri, kv_aks_api_secret= p.deploy_model_as_private_aks_online_endpoint(model,inference_config,overwrite_endpoint=True)

#### Look at the Run -  Example: 100% Databricks pipeline

In [None]:
best_run

#### Option B: Model B (wrong model type): ...Deploy ANY model (not only the best promoted model)
- `Flexibility:` You can, although it is not recommended, `override the ESML validation_guard to pass any model`:
    ```python 
    validation_guard=False 
    ```

- HOWTO TEST: If you have a model_name of a classification, and p.active_model is of ml_type regression, an error message will be triggered if validation_guard=True
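
The idea behind the guard can be sketched as a simple type check. This is not ESML's implementation: `check_model_compatibility` and the `ml_type` strings below are hypothetical names, used only to illustrate the behaviour described above.

```python
def check_model_compatibility(active_ml_type, candidate_ml_type, validation_guard=True):
    """Return True if the ML types match, False if they differ and the guard is off.

    Raises ValueError when the types differ and the guard is on.
    """
    matches = active_ml_type == candidate_ml_type
    if not matches and validation_guard:
        raise ValueError(
            "Model type '{}' does not match the active model type '{}'. "
            "Pick a compatible model, or set validation_guard=False to override.".format(
                candidate_ml_type, active_ml_type
            )
        )
    return matches


# A classification model offered to a regression project is rejected by the guard
try:
    check_model_compatibility("regression", "classification", validation_guard=True)
except ValueError as err:
    print("Blocked:", err)

# With the guard disabled the mismatch is only reported, not blocked
print(check_model_compatibility("regression", "classification", validation_guard=False))
```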

In [None]:
##### Model 2: AutoML: Titanic classification = wrong/incompatible model. 

model_name = 'your_model_folder_name' # '11_diabetes_model_reg' - Find the model name in Azure ML Studio/Model register or lake_settings.json
model_version = 1 # Find the model version in Azure ML Studio/Model register
run_id = 'todo_c70-3ef4-470c-9f55-92b33318c8ad' # '9360ac70-3ef4-470c-9f55-92b33318c8ad' # Main pipeline run - Find the run ID in Azure ML Studio/Model register

inference_config, model, best_run = get_best_model_inference_config(model_name,model_version,run_id)

# Problem: the test data fetched for smoke testing will have the wrong schema for this model
# Solution: the ESML validation guard catches this and suggests a fix
service,api_uri, kv_aks_api_secret= p.deploy_model_as_private_aks_online_endpoint(model,inference_config,overwrite_endpoint=True, validation_guard=True)

# 3) `Smoke testing:` TEST ENDPOINT - Score with some test data

### Get testdata, and score it

In [None]:
p.connect_to_lake()
X_test, y_test, tags = p.get_gold_validate_Xy()
caller_id = "10965d9c-40ca-4e47-9723-5a608a32a0e4" # Pass an optional tracking ID for the request, parquet file will then have this name

#df = p.call_webservice(p.ws, X_test,caller_id) # Saves to datalake also
df = p.call_webservice(ws=p.ws, pandas_X_test=X_test,user_id=caller_id,firstRowOnly=True,save_2_lake_also=False) # If not saving also to datalake
#df = p.call_webservice(ws=p.ws, pandas_X_test=X_test.iloc[:1],user_id=caller_id,firstRowOnly=True,save_2_lake_also=False) # If not saving also to datalake, predict 1 row only
pd.set_option('display.max_colwidth', None)
df.head()

## END) `SUMMARY - what did the notebook do:DEPLOY & SERVE: Fetched the BEST MODEL, and deploys on AKS`
- ESML saves `API_key in Azure keyvault automatically`
- ESML auto-config solves 4 common 'errors/things': `correct compute name` and `valid replicas, valid agents, valid auto scaling`
    - Tip: You can adjust the number of replicas, change the CPU/memory configuration, or use a different compute target.

### Note: AKS_SETTINGS
- Here you can edit AKS settings (performance for DEV, TEST, PROD environments) under PROJECT specific MODEL settings and ONLINE = AKS
    -  Link: [aks_config_dev.json](../settings/project_specific/model/dev_test_prod_override/online/aks_config_dev.json)
- Note: 
    - Q: Why is `docker_bridge_cidr` a PUBLIC IP? Isn't this a PRIVATE AKS cluster?
    - A: Yes, it is PRIVATE. The docker bridge is not used for pod communication, but because Docker is configured as part of the Kubernetes setup, the docker bridge gets created as well. To avoid it picking a random, unknown CIDR that could collide with one of your existing subnets, there is an option to set it to a known range. The guidance is therefore to choose a CIDR for the docker bridge that is not in use in your Azure network and does not collide with any other subnet. 
        - Read more: [learn.microsoft.com](https://learn.microsoft.com/en-us/answers/questions/199786/how-to-update-docker-bridge-cidr-for-aks-to-a-diff.html)
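
A candidate `docker_bridge_cidr` can be checked against your subnets before deployment with Python's standard `ipaddress` module. This is a minimal sketch: the helper name `find_cidr_collisions` and all address ranges below are made-up examples, not values from any ESML configuration.

```python
import ipaddress


def find_cidr_collisions(docker_bridge_cidr, subnet_cidrs):
    """Return the subnets that overlap with the docker bridge network."""
    # A bridge address such as 172.17.0.1/16 has host bits set,
    # so parse it as an interface and take its network.
    bridge = ipaddress.ip_interface(docker_bridge_cidr).network
    return [s for s in subnet_cidrs if bridge.overlaps(ipaddress.ip_network(s))]


# Example subnets (assumed values)
subnets = ["10.0.0.0/16", "172.17.128.0/24"]

print(find_cidr_collisions("172.17.0.1/16", subnets))  # collides with the second subnet
print(find_cidr_collisions("172.18.0.1/16", subnets))  # no collision
```

An empty list means the chosen range does not overlap with any of the listed subnets; it says nothing about ranges that are missing from the list.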

# `EXTRA`: Logging, edit AutoML scoring script, etc

## `Logging 01` - View request to webservice: Input / Output
#### Purpose: 
 - To view information logged from the score.py file, look at the traces table. The following query searches for logs where the input value was logged
 - Docs: https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-enable-app-insights#view-metrics-and-logs
 - How:
   - 1) Open the link in Azure ML Studio, Endpoints, and property: 'Application Insights url'
     - Note: If this property or link is not visible, edit [aks_config_dev.json](../settings/project_specific/model/dev_test_prod_override/online/aks_config_dev.json) and set:
       - `enable_app_insights:true`
       - `collect_model_data:true`
   - 2) Write queries against the traces table - see examples below: 

### Application insights - examples
- View data input to the request, see `customDimensions` in the result for this query

   ```kusto
   traces
   | where customDimensions contains "input"
   | limit 10
   ```

- View all events:
   ```kusto
   traces
   | limit 10
   ```
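
For the `input` query to return anything, the scoring script has to log the input in the first place. The linked Azure ML v1 documentation describes doing this with `print` statements in `score.py`. The sketch below follows that pattern with a stand-in model; the `{"data": [...]}` payload shape and the row-sum "model" are assumptions, not the script that ESML or AutoML generates.

```python
import json

model = None


def _placeholder_model(rows):
    """Stand-in for a real model: returns the sum of each input row."""
    return [sum(row) for row in rows]


def init():
    """Called once when the container starts; a real script loads the model here."""
    global model
    model = _placeholder_model
    print("model initialized")


def run(raw_data):
    """Called per request; prints input and output so they can be found in the logs."""
    data = json.loads(raw_data)["data"]
    print(json.dumps({"input": data}))
    result = model(data)
    print(json.dumps({"output": result}))
    return result


# Local dry run of the two entry points
init()
print(run('{"data": [[1, 2], [3, 4]]}'))
```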

## `Logging 02` - View `AMLOnlineEndpointConsoleLog`
#### Purpose: 
 - If the container fails to start, the console log may be useful for debugging.
 - It also helps with performance analysis, for example to determine the time the model needs to process each request.
 - Docs: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-monitor-online-endpoints#logs


## Print logging info - Deployment Logs 
- Use this to verify the init() method, that is, whether the model was loaded correctly

In [None]:
from azureml.core import Workspace
from azureml.core.webservice import Webservice

# load existing web service
service = Webservice(name="esml-dev-p02-m11-aksapi", workspace=p.ws)
logs = service.get_logs()
print(logs)

## ESML - `save_2_lake_also`: You can also store response in datalake
- If you are using the flag `save_2_lake_also` as below, here you will see WHERE the data is stored:

    ```python
    df = p.call_webservice(ws=p.ws, pandas_X_test=X_test.iloc[:1],user_id=caller_id,firstRowOnly=True,save_2_lake_also=True)
    ```

    The default behaviour is to store in datalake also: 
    ```python
    df = p.call_webservice(p.ws, X_test,caller_id) # Saves to datalake also
    ```

In [None]:
to_score_folder, scored_folder, date_folder = p.get_gold_scored_unique_path()
print("Example of where your scored data is saved. Unique folder will be different each time though")
print(scored_folder)
print()
print("Note: Last folder, UUID folder, should represent a 'unique scoring' for a day, but can be injected. Example: if we want a customerGUID instead ")

# END

## EXTRA - HOW to customize the AutoML scoring file (that you defined earlier during the TRAIN RUN)
- Info: If using AutoML (in a pipeline as AutoMLStep or AutoMLRun), the scoring script file is autogenerated by Azure ML (not by ESML, as it is for manual ML). 
    - AutoML saves this scoring script file with its Run in Azure ML. You then need to download it, edit it, and use the local copy.
- Download it locally first, then edit it, as below:

In [None]:
import os
os.chdir(os.path.dirname(globals()['_dh'][0]))

scoring_file = "scoring_file_{}_automl.py".format(p.model_folder_name)
script_file_local = "./settings/project_specific/model/dev_test_prod/train/ml/"+scoring_file 
script_file_abs = os.path.abspath(script_file_local)

print("1) Download & EDIT: Local path: to look and edit the file: {}".format(script_file_abs))
best_run.download_file('outputs/scoring_file_v_1_0_0.py', script_file_abs)

In [None]:
print("2) Then set the EDITED local scoring script, to the inference_config")
inference_config.entry_script = script_file_abs

print("3) Then Deploy the model")
service,api_uri, kv_aks_api_secret= p.deploy_model_as_private_aks_online_endpoint(model,inference_config)

In [None]:
X_test, y_test, tags = p.get_gold_validate_Xy()
caller_id = "10965d9c-40ca-4e47-9723-5a608a32a0e4" # Pass an optional tracking ID for the request, parquet file will then have this name

#df = p.call_webservice(p.ws, X_test,caller_id) # Saves to datalake also
df = p.call_webservice(ws=p.ws, pandas_X_test=X_test.iloc[:1],user_id=caller_id,firstRowOnly=True,save_2_lake_also=False) # If not saving also to datalake
df.head()