# 2.4 Model management

## Why?

We need a process to keep track of our models. Relying on the file manager alone is error-prone and not recommended.

First, we use this command to disable auto-logging, so that we control exactly what gets logged:

```python
import mlflow

mlflow.xgboost.autolog(disable=True)
```

Then, using MLflow's logging utilities, we can log the model in two ways:

```python
# Log the model as a plain artifact (just the file)
mlflow.log_artifact("models/my_best_model.b", artifact_path="my_best_model")

# Log the model as an MLflow model
mlflow.xgboost.log_model(booster, artifact_path="models_mlflow")
```
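Putting the pieces together, here is a minimal sketch of how these calls fit into a training run. The dummy data, hyperparameters, and variable names are placeholders for illustration, not values from the original notebook:

```python
import mlflow
import numpy as np
import xgboost as xgb

mlflow.xgboost.autolog(disable=True)

# Dummy data just to make the sketch runnable; use the real dataset instead
X_train = np.random.rand(100, 5)
y_train = np.random.rand(100)
train = xgb.DMatrix(X_train, label=y_train)

params = {"max_depth": 6, "learning_rate": 0.1, "objective": "reg:squarederror"}

with mlflow.start_run():
    mlflow.log_params(params)
    booster = xgb.train(params=params, dtrain=train, num_boost_round=100)
    # Log the trained booster as an MLflow model under this run
    mlflow.xgboost.log_model(booster, artifact_path="models_mlflow")
```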

From the `MLmodel` file that MLflow creates alongside the logged model, we can obtain the following information:

```yaml
# Where the model is stored
artifact_path: models_mlflow
flavors:
  # We can use our model as a Python function
  python_function:
    data: model.xgb
    env: conda.yaml
    loader_module: mlflow.xgboost
    # Python version
    python_version: 3.9.12
  # We can also use it as an xgboost object
  xgboost:
    code: null
    data: model.xgb
    model_class: xgboost.core.Booster
    # xgboost library version
    xgb_version: 1.6.1
mlflow_version: 1.26.0
model_uuid: cb835e881b7a410a9f4e1332a5cd4433
run_id: 99a8073207d14d0f86bae40f3489cc44
utc_time_created: '2022-06-04 11:09:59.226073'
```
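If we want to read this metadata programmatically, one option (a sketch, assuming the run ID shown above and a reachable tracking server) is to download the `MLmodel` file with the tracking client:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Download the MLmodel metadata file from the run's artifacts
run_id = "99a8073207d14d0f86bae40f3489cc44"
local_path = client.download_artifacts(run_id, "models_mlflow/MLmodel")

with open(local_path) as f:
    print(f.read())
```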

Apart from the `MLmodel` file, we also have access to a `python_env.yaml` and a `conda.yaml` that can be used to recreate the environment in which the model was trained and tested (e.g., with `conda env create -f conda.yaml`).

Clearly, this is very powerful and useful for keeping track of our model and reproducing it in a controlled environment.

By logging our model as an MLflow model, we can load it back in two different ways:

```python
logged_model = 'runs:/99a8073207d14d0f86bae40f3489cc44/models_mlflow'

# Load the model as a PyFuncModel (generic Python function flavor)
loaded_model = mlflow.pyfunc.load_model(logged_model)

# Load the model as a native xgboost object
xgboost_model = mlflow.xgboost.load_model(logged_model)
```
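As a quick usage sketch (assuming `X_test` is a pandas DataFrame with the same feature columns the model was trained on), the two flavors are called slightly differently:

```python
import xgboost as xgb

# The pyfunc flavor accepts a pandas DataFrame directly
y_pred = loaded_model.predict(X_test)

# The native xgboost flavor expects a DMatrix
y_pred = xgboost_model.predict(xgb.DMatrix(X_test))
```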

This is very important: we can save a model from any framework in the MLflow format and then load it back easily. MLflow serves as an interface between ML frameworks, which also facilitates deploying models to different services such as Spark, AWS SageMaker, etc.
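For example, here is a hypothetical sketch (not from the original notebook, reusing the dummy `X_train` and `y_train` from the training sketch above) of logging a scikit-learn model and loading it through the exact same pyfunc interface:

```python
import mlflow
from sklearn.linear_model import LinearRegression

with mlflow.start_run() as run:
    lr = LinearRegression().fit(X_train, y_train)
    mlflow.sklearn.log_model(lr, artifact_path="models_mlflow")

# The loading code is identical, regardless of the underlying framework
sk_model = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/models_mlflow")
print(sk_model.predict(X_train[:5]))
```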