# External Monitoring

This notebook gives an example of how to use MLOps to host monitoring for models you trained on your own machine.

## MLOpsDataSourceClient

First, register a datasource.

In [None]:
from mlops_codex.datasources import MLOpsDataSourceClient

datasource_client = MLOpsDataSourceClient(
    login="<email>",
    password="<password>",
    tenant="<tenant>"
)

In [None]:
datasource = datasource_client.register_datasource(
    datasource_name='TestAzure',
    provider='Azure',
    cloud_credentials="<path/to/your/cloud/credentials>",
    group='datarisk'
)

## MLOpsTrainingClient

Next, create a training experiment.

In [None]:
from mlops_codex.training import MLOpsTrainingClient
training_client = MLOpsTrainingClient(
    login="<email>",
    password="<password>",
    tenant="<tenant>"
)

In [None]:
training = training_client.create_training_experiment(
    experiment_name='External Train',
    model_type='Classification',
    group='datarisk'
)

In [None]:
import pandas as pd
from lightgbm import LGBMClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

base_path = './samples/train/'
df = pd.read_csv(base_path + "dados.csv")  # base_path already ends with a slash
X = df.drop(columns=['target'])
y = df[["target"]]

pipe = make_pipeline(SimpleImputer(), LGBMClassifier(force_col_wise=True))
pipe.fit(X, y)

with training.log_train(name='External training', X_train=X, y_train=y) as logger:
    logger.save_model(pipe)
    
    model_output = pd.DataFrame({"pred": pipe.predict(X), "proba": pipe.predict_proba(X)[:,1]})
    
    logger.save_model_output(model_output)

    auc = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc")
    f_score = cross_val_score(pipe, X, y, cv=5, scoring="f1")
    logger.save_metric(name='auc', value=auc.mean())
    logger.save_metric(name='f1_score', value=f_score.mean())

    logger.set_python_version('3.10')

In [None]:
training

In [None]:
training.get_training_execution()

The `log_train` block above passes the data needed for an external training: the fitted model, its output, and its metrics.

## MLOpsExternalMonitoringClient

Finally, you can start hosting your locally trained model.

In [None]:
from mlops_codex.external_monitoring import MLOpsExternalMonitoringClient
external_monitoring_client = MLOpsExternalMonitoringClient(
    login="<email>",
    password="<password>",
    tenant="<tenant>"
)

Define the monitoring configuration with the following fields:
- **Name**: External monitoring name.
- **TrainingExecutionId**: Valid MLOps training execution ID.
- **Period**: Day | Week | Quarter | Month | Year
- **InputCols**: Array of input column names.
- **OutputCols**: Array of output column names.
- **DataSourceName**: Valid MLOps datasource name.
- **DataSourceUri**: Valid datasource URI.
- **ExtractionType**: Incremental | Full
- **ReferenceDate**: Reference extraction date.
- **ColumnName**: Name of the data column.
- **PythonVersion**: Python38 | Python39 | Python310. Needed if you plan to have `preprocessing`/`shap` steps.
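
Before registering, it can help to check the enumerated fields locally. The sketch below is only an illustration: `check_monitoring_config` and `ALLOWED_VALUES` are hypothetical names that are not part of `mlops_codex`, the key spelling `OutputCols` is assumed, and the sample values are placeholders borrowed from this notebook. It only checks the value sets listed above and that the column fields are lists of strings.

```python
# Allowed values copied from the field list above.
ALLOWED_VALUES = {
    "Period": ("Day", "Week", "Quarter", "Month", "Year"),
    "ExtractionType": ("Incremental", "Full"),
    "PythonVersion": ("Python38", "Python39", "Python310"),
}


def check_monitoring_config(config):
    """Return a list of problems found in a configuration dictionary.

    Hypothetical local helper, not an mlops_codex function.
    """
    problems = []
    # Enumerated fields are checked only when they are present.
    for key, allowed in ALLOWED_VALUES.items():
        if key in config and config[key] not in allowed:
            problems.append(f"{key}={config[key]!r} is not one of {list(allowed)}")
    # Column fields must be lists of strings when present.
    for key in ("InputCols", "OutputCols"):
        cols = config.get(key)
        if cols is not None and not (
            isinstance(cols, list) and all(isinstance(c, str) for c in cols)
        ):
            problems.append(f"{key} must be a list of column names")
    return problems


# Placeholder values for illustration only.
config = {
    "Name": "Teste",
    "TrainingExecutionId": 1,
    "Period": "Week",
    "InputCols": ["VALOR_A_PAGAR", "TAXA"],
    "OutputCols": ["probas"],
    "DataSourceName": "TestAzure",
    "DataSourceUri": "<datasource uri>",
    "ExtractionType": "Full",
    "ColumnName": "SAFRA_REF",
}

print(check_monitoring_config(config))
```

An empty list means none of the checked fields violates the value sets above; it does not guarantee that the platform will accept the configuration.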

Register your external monitoring

In [None]:
external_monitoring = external_monitoring_client.register_monitoring(
    name="Teste",
    group='<group>',
    training_execution_id=1,
    period="Week",
    input_cols=[
        "VALOR_A_PAGAR", "TAXA", "RENDA_MES_ANTERIOR", "NO_FUNCIONARIOS","RZ_RENDA_FUNC", 
        "VL_TAXA","DDD", "SEGMENTO_INDUSTRIAL","DOMINIO_EMAIL", "PORTE", "CEP_2_DIG"
    ],
    output_cols=["probas"],
    datasource_name=datasource.datasource_name,
    extraction_type="Full",
    datasource_uri="<datasource uri>",
    column_name="SAFRA_REF"
)

In [None]:
external_monitoring

### Uploading files

Before hosting your monitoring, you may want to upload files. You can upload a `model.pkl` file, a `requirements.txt` file, and a preprocessing script.

**Rules**

- If you upload a `model.pkl` file, you must also upload the requirements file and the script, and provide both the shap and preprocessing entrypoints.
- If you upload only the script, you must also upload the requirements file and provide the preprocessing entrypoint.
- You can also choose not to upload any files.
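
The rules above can be sketched as a small local check. This is a minimal illustration, not part of `mlops_codex`: `missing_upload_arguments` is a hypothetical helper, and its keyword names simply mirror the arguments of the `upload_file` call used in this notebook.

```python
def missing_upload_arguments(
    model_file=None,
    requirements_file=None,
    preprocess_file=None,
    preprocess_reference=None,
    shap_reference=None,
):
    """Return the argument names still required under the upload rules.

    Hypothetical local helper, not an mlops_codex function.
    """
    given = {
        "requirements_file": requirements_file,
        "preprocess_file": preprocess_file,
        "preprocess_reference": preprocess_reference,
        "shap_reference": shap_reference,
    }
    if model_file:
        # A model needs requirements, the script, and both entrypoints.
        required = [
            "requirements_file",
            "preprocess_file",
            "preprocess_reference",
            "shap_reference",
        ]
    elif preprocess_file:
        # A script alone needs requirements and the preprocessing entrypoint.
        required = ["requirements_file", "preprocess_reference"]
    else:
        # Uploading nothing is allowed; other combinations are not covered
        # by the rules, so this sketch does not flag them.
        required = []
    return [name for name in required if not given[name]]


print(missing_upload_arguments(model_file="model.pkl"))
print(missing_upload_arguments(preprocess_file="preprocess_async.py",
                               requirements_file="requirements.txt"))
```

An empty list means the combination satisfies the rules as stated here; the platform may apply further checks of its own.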

In [None]:
PATH = './samples/externalMonitoring/'

external_monitoring.upload_file(
    model_file=PATH + 'model.pkl',
    requirements_file=PATH + 'requirements.txt',
    preprocess_file=PATH + 'preprocess_async.py',
    preprocess_reference='build_df',
    shap_reference='get_shap',
    python_version='3.10',
)

Whether or not you uploaded files, you can now start the hosting process.

In [None]:
external_monitoring.host(wait=True)

In [None]:
external_monitoring

Get the logs of the monitoring

In [None]:
external_monitoring.logs()

You can also access an existing monitoring.

In [None]:
external_monitoring_client.get_external_monitoring(
    group='<group>',
    external_monitoring_hash="<external_monitoring_hash>"
)

And list all of your external monitorings.

In [None]:
external_monitoring_client.list_hosted_external_monitorings()