### Managing the model lifecycle with Model Registry

<img src="https://github.com/QuentinAmbard/databricks-demo/raw/main/product_demos/mlops-end2end-flow-4.png" width="1200">

A primary challenge for data scientists and ML engineers is the absence of a central repository for models, their versions, and the means to manage them throughout their lifecycle.

[The MLflow Model Registry](https://docs.databricks.com/applications/mlflow/model-registry.html) addresses this challenge and enables members of the data team to:
<br><br>
* **Discover** registered models, their current stage in model development, their experiment runs, and the code associated with each registered model
* **Transition** models to different stages of their lifecycle
* **Deploy** different versions of a registered model in different stages, giving MLOps engineers the ability to deploy and test different model versions
* **Test** models in an automated fashion
* **Document** models throughout their lifecycle
* **Secure** access and permissions for model registrations, transitions, or modifications

<!-- [metadata={"description":"MLOps end2end workflow: Move model to registry and request transition to STAGING.",
 "authors":["quentin.ambard@databricks.com"],
 "db_resources":{},
  "search_tags":{"vertical": "retail", "step": "Data Engineering", "components": ["mlflow"]},
                 "canonicalUrl": {"AWS": "", "Azure": "", "GCP": ""}}] -->
                 

### A cluster has been created for this demo
To run this demo, just select the cluster `dbdemos-mlops-end2end-shawnzou2020` from the dropdown menu ([open cluster configuration](https://dbc-abdbb8e0-f50f.cloud.databricks.com/#setting/clusters/0410-014028-ndqe9et5/configuration)). <br />
*Note: If the cluster was deleted after 30 days, you can re-create it with `dbdemos.create_cluster('mlops-end2end')` or re-install the demo: `dbdemos.install('mlops-end2end')`*

### How to Use the Model Registry
Typically, data scientists who use MLflow will conduct many experiments, each with a number of runs that track and log metrics and parameters. During the course of this development cycle, they will select the best run within an experiment and register its model with the registry.  Think of this as **committing** the model to the registry, much as you would commit code to a version control system.  

The registry defines several model stages: `None`, `Staging`, `Production`, and `Archived`. Each stage has a unique meaning. For example, `Staging` is meant for model testing, while `Production` is for models that have completed the testing or review processes and have been deployed to applications. 

Users with appropriate permissions can transition models between stages.
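
As a rough sketch of what a stage transition looks like in code, the helper below wraps `MlflowClient.transition_model_version_stage`. The `transition` function and its stage check are illustrative assumptions and are not part of this demo. Note that newer MLflow releases mark stage transitions as deprecated in favor of model aliases, so this applies to the stage-based workflow used here.

```python
# Stage names defined by the stage-based MLflow Model Registry.
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def transition(client, name, version, stage, archive_existing=False):
    """Move one model version to `stage` (hypothetical helper).

    `client` is expected to behave like mlflow.tracking.MlflowClient.
    """
    if stage not in VALID_STAGES:
        raise ValueError(f"Unknown stage {stage!r}; expected one of {VALID_STAGES}")
    return client.transition_model_version_stage(
        name=name,
        version=str(version),
        stage=stage,
        archive_existing_versions=archive_existing,
    )
```

With a real client this could be called as `transition(mlflow.tracking.MlflowClient(), "dbdemos_mlops_churn", 56, "Staging")`, provided the caller has permission to transition the model.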

In [0]:
%run ./_resources/00-setup $reset_all_data=false $catalog="hive_metastore"



USE CATALOG `hive_metastore`
using cloud_storage_path /Users/quentin.ambard@databricks.com/demos/retail
using catalog.database `hive_metastore`.`retail_quentin_ambard`


#### Sending our model to the registry

We'll programmatically select the best model from our last AutoML run and register it in the registry. We can easily do that using the MLflow `search_runs` API:

In [0]:
# Get our last AutoML run. This helper is specific to the demo: it returns the experiment ID of the last AutoML run.
experiment_id = get_automl_churn_run()['experiment_id']

best_model = mlflow.search_runs(experiment_ids=[experiment_id], order_by=["metrics.val_f1_score DESC"], max_results=1, filter_string="status = 'FINISHED'")
best_model

Unnamed: 0,run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.test_accuracy_score,metrics.training_recall_score,metrics.training_precision_score,metrics.val_score,metrics.val_true_positives,metrics.test_score,metrics.training_roc_auc,metrics.test_roc_auc,metrics.val_recall_score,metrics.training_true_negatives,metrics.val_true_negatives,metrics.val_example_count,metrics.val_precision_recall_auc,metrics.val_f1_score,metrics.test_false_negatives,metrics.test_example_count,metrics.training_false_positives,metrics.training_score,metrics.test_true_negatives,metrics.test_f1_score,metrics.test_log_loss,metrics.training_example_count,metrics.test_false_positives,metrics.training_f1_score,metrics.training_false_negatives,metrics.val_precision_score,metrics.training_log_loss,metrics.val_log_loss,metrics.test_true_positives,metrics.test_precision_score,metrics.val_false_negatives,metrics.training_true_positives,metrics.val_false_positives,metrics.training_precision_recall_auc,...,params.classifier__importance_type,params.classifier__subsample,params.preprocessor__boolean__imputers,params.classifier__lambda_l1,params.classifier__colsample_bytree,params.preprocessor__verbose_feature_names_out,params.preprocessor__numerical__standardizer__with_mean,params.preprocessor__numerical__memory,params.preprocessor__numerical__standardizer,params.preprocessor__boolean__cast_type__inverse_func,params.preprocessor__boolean__steps,params.column_selector__cols,params.preprocessor__numerical__imputers__sparse_threshold,params.classifier__subsample_for_bin,params.preprocessor__boolean__cast_type__validate,params.classifier__min_split_gain,params.preprocessor__numerical__converter__check_inverse,params.preprocessor__boolean__onehot__dtype,params.preprocessor__boolean__imputers__remainder,params.preprocessor__boolean__cast_type__accept_sparse,params.classifier__min_child_samples,params.preprocessor__transformer_weights,params.preprocessor__numerical__imputers,params.preprocessor
__numerical__converter__inv_kw_args,params.classifier__class_weight,params.classifier__max_depth,params.preprocessor__numerical__imputers__impute_mean__missing_values,params.classifier__lambda_l2,params.preprocessor__numerical__imputers__n_jobs,params.preprocessor__boolean__cast_type__kw_args,tags.mlflow.user,tags.model_type,tags.mlflow.source.name,tags.mlflow.runName,tags.estimator_class,tags.mlflow.log-model.history,tags.mlflow.databricks.notebookID,tags.mlflow.source.type,tags.estimator_name,tags.mlflow.datasets
0,d0d33d85f56f4561a77baf8c4a678cff,402627122877070,FINISHED,dbfs:/databricks/mlflow-tracking/4026271228770...,2023-06-21 12:53:25.737000+00:00,2023-06-21 12:53:44.833000+00:00,0.810754,0.562677,0.672297,0.795948,215.0,0.810754,0.862129,0.851221,0.547074,2877.0,885.0,1382.0,0.674456,0.603933,170.0,1432.0,291.0,0.821471,916.0,0.64389,0.424309,4229.0,101.0,0.612622,464.0,0.673981,0.387719,0.440234,245.0,0.708092,178.0,597.0,104.0,0.67725,...,split,0.6744020945600699,"ColumnTransformer(remainder='passthrough', tra...",6.454044807263527,0.4909947649465118,True,True,,StandardScaler(),,"[('cast_type', FunctionTransformer(func=<funct...","['online_security_no', 'online_backup_yes', 'p...",0.3,200000,False,0.0,True,<class 'numpy.float64'>,passthrough,False,232,,ColumnTransformer(transformers=[('impute_mean'...,,,5,,93.78090037129849,,,quentin.ambard@databricks.com,lightgbm_classifier,Notebook: LightGBMClassifier,receptive-sloth-463,sklearn.pipeline.Pipeline,"[{""artifact_path"":""model"",""saved_input_example...",402627122878611,NOTEBOOK,Pipeline,"[{""name"":""c798edc4c97f380c61bdcea4defa51de"",""h..."
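
The `search_runs` call above combines three steps: keep only finished runs, sort them by validation F1 score in descending order, and keep the first row. The plain-Python sketch below mirrors that logic on a small list of dictionaries. The run IDs and scores are made up for illustration and do not come from the demo.

```python
# Hypothetical runs; the IDs and scores below are invented for illustration.
runs = [
    {"run_id": "a1", "status": "FINISHED", "val_f1_score": 0.58},
    {"run_id": "b2", "status": "FAILED", "val_f1_score": 0.71},
    {"run_id": "c3", "status": "FINISHED", "val_f1_score": 0.60},
]


def pick_best(runs, metric="val_f1_score"):
    """Return the finished run with the highest `metric`, or None if there is none."""
    finished = [r for r in runs if r["status"] == "FINISHED"]
    return max(finished, key=lambda r: r[metric], default=None)


best = pick_best(runs)
print(best["run_id"])  # c3
```

The failed run `b2` has the highest score but is excluded by the status filter, which is why the filter matters when picking a model to register.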


Once we have our best model, we can register it in the Model Registry using its run ID.

In [0]:
run_id = best_model.iloc[0]['run_id']

#add some tags that we'll reuse later to validate the model
client = mlflow.tracking.MlflowClient()
client.set_tag(run_id, key='demographic_vars', value='seniorCitizen,gender_Female')
client.set_tag(run_id, key='db_table', value=f'{dbName}.dbdemos_mlops_churn_features')

# Register the model from our AutoML run in the MLflow registry
model_details = mlflow.register_model(f"runs:/{run_id}/model", "dbdemos_mlops_churn")

Registered model 'dbdemos_mlops_churn' already exists. Creating a new version of this model...
2023/06/21 12:58:02 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: dbdemos_mlops_churn, version 56
Created version '56' of model 'dbdemos_mlops_churn'.
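
The string passed to `mlflow.register_model` is a `runs:/` URI that points at an artifact logged under a run. Once a version is registered, it can be addressed with a `models:/` URI instead. The two helper functions below are hypothetical conveniences that only build these strings.

```python
def run_model_uri(run_id, artifact_path="model"):
    """URI of a model artifact logged under a run, as passed to mlflow.register_model."""
    return f"runs:/{run_id}/{artifact_path}"


def registered_model_uri(name, version_or_stage):
    """URI of a registered model version (e.g. 56) or stage (e.g. "Staging")."""
    return f"models:/{name}/{version_or_stage}"


print(run_model_uri("d0d33d85f56f4561a77baf8c4a678cff"))
print(registered_model_uri("dbdemos_mlops_churn", 56))
```

A `models:/` URI of this form can then be passed to a loader such as `mlflow.pyfunc.load_model`.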


At this point the model will be in the `None` stage. Let's update the description before moving it to `Staging`.

#### Update Description
We'll do this for the registered model overall and for this particular version.

In [0]:
model_version_details = client.get_model_version(name="dbdemos_mlops_churn", version=model_details.version)

#The main model description, typically done once.
client.update_registered_model(
  name=model_details.name,
  description="This model predicts whether a customer will churn.  It is used to update the Telco Churn Dashboard in DB SQL."
)

#Gives more details on this specific model version
client.update_model_version(
  name=model_details.name,
  version=model_details.version,
  description="This model version was built using XGBoost. Eating too much cake is the sin of gluttony. However, eating too much pie is okay because the sin of pie is always zero."
)

Out[23]: <ModelVersion: creation_timestamp=1687352282042, current_stage='None', description=('This model version was built using XGBoost. Eating too much cake is the sin '
 'of gluttony. However, eating too much pie is okay because the sin of pie is '
 'always zero.'), last_updated_timestamp=1687352288556, name='dbdemos_mlops_churn', run_id='d0d33d85f56f4561a77baf8c4a678cff', run_link='', source='dbfs:/databricks/mlflow-tracking/402627122877070/d0d33d85f56f4561a77baf8c4a678cff/artifacts/model', status='READY', status_message='', tags={}, user_id='7644138420879474', version='56'>

#### Request Transition to Staging

<img style="float: right" src="https://github.com/QuentinAmbard/databricks-demo/raw/main/retail/resources/images/churn_move_to_stating.gif">

Our model is now ready! Let's request a transition to Staging.

While this example uses the API, we can also request the transition by clicking through the Model Registry UI.

In [0]:
request_transition(model_name = "dbdemos_mlops_churn", version = model_details.version, stage = "Staging")

Out[24]: {'request': {'creation_timestamp': 1687352288722,
  'user_id': 'quentin.ambard@databricks.com',
  'activity_type': 'REQUESTED_TRANSITION',
  'comment': '',
  'to_stage': 'Staging'}}
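
The `request_transition` helper is defined in the setup notebook, which is not shown here. One plausible implementation, assuming it wraps the Databricks REST endpoint `POST /api/2.0/mlflow/transition-requests/create`, is sketched below. The function name, host, and token are placeholders, and the request is only built, not sent.

```python
import json
import urllib.request


def build_transition_request(host, token, model_name, version, stage, comment=""):
    """Build (but do not send) a request asking to move a model version to `stage`."""
    payload = {
        "name": model_name,
        "version": str(version),
        "stage": stage,
        "comment": comment,
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/mlflow/transition-requests/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; replace with real workspace values before sending.
req = build_transition_request(
    "https://example.cloud.databricks.com", "<token>", "dbdemos_mlops_churn", 56, "Staging"
)
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` against a real workspace should return a JSON body similar to the output shown above.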

#### Leave Comment in Registry

In [0]:
# Leave a comment for the ML engineer who will be reviewing the tests
comment = "This was the best model from AutoML, I think we can use it as a baseline."

model_comment(model_name = "dbdemos_mlops_churn",
             version = model_details.version,
             comment = comment)

Out[25]: {'comment': {'creation_timestamp': 1687352288947,
  'user_id': 'quentin.ambard@databricks.com',
  'comment': 'This was the best model from AutoML, I think we can use it as a baseline.',
  'last_updated_timestamp': 1687352288947,
  'id': '56e2550e0ca246d5b5f222a7f187abef'}}

### Next: MLOps model testing and validation

Because we defined our webhooks earlier, a job will automatically start, testing the new model being deployed and validating the request.

Remember our webhook setup? That's the orange part in the diagram.

<img style="float: right" src="https://github.com/QuentinAmbard/databricks-demo/raw/main/retail/resources/images/churn-mlflow-webhook-1.png" width=600 >

If the model passes all the tests, it'll be accepted and moved into STAGING. Otherwise it'll be rejected, and a Slack notification will be sent.
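
The accept-or-reject rule can be pictured as a small function over named test results. This is only a sketch: the check names are hypothetical, and the actual tests live in the staging validation notebook.

```python
def review_transition(checks):
    """Decide on a transition request from named check results.

    `checks` maps a check name to True (passed) or False (failed).
    Returns ("approve", []) or ("reject", [names of failed checks]).
    """
    failed = sorted(name for name, passed in checks.items() if not passed)
    return ("reject", failed) if failed else ("approve", [])


# Hypothetical check names and results, for illustration only.
decision, failed = review_transition(
    {"has_description": True, "signature_present": True, "beats_baseline_f1": False}
)
print(decision, failed)  # reject ['beats_baseline_f1']
```

Returning the list of failed checks alongside the decision makes it easy to include the reason in a rejection comment or notification.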

Next: 
 * Find out how the model is tested before being moved to STAGING [using the Databricks Staging test notebook]($./05_job_staging_validation) (optional)
 * Or discover how to [run Batch and Real-time inference from our STAGING model]($./06_staging_inference)