In [1]:
# run this to shorten the data import from the files
import os
cwd = os.path.dirname(os.getcwd())+'/'
path_data = os.path.join(os.path.dirname(os.getcwd()), 'datasets/')


# MLproject file layout

An MLproject file is a YAML file that describes machine learning code, its dependencies, and its configuration so that the Project can be shared, reproduced, and executed across different environments.

Which key properties make up an MLproject file that describes an MLflow Project?

### Possible Answers

    name{Answer}


    flavor


    entry_points{Answer}


    python_env{Answer}
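
As a rough illustration of this layout, the sketch below lists the top-level keys of an MLproject file using only the standard library. The sample text and the helper name top_level_keys are assumptions made for this example, and the line-based scan is a simplification, not a real YAML parser.

```python
# Hypothetical sample contents, assumed for illustration only.
SAMPLE_MLPROJECT = """\
name: insurance_model
python_env: python_env.yaml
entry_points:
  main:
    command: "python3.9 train_model.py"
"""


def top_level_keys(text):
    """Return the keys of unindented `key: value` lines (simplified scan)."""
    keys = []
    for line in text.splitlines():
        # Skip blank lines, comments, and indented (nested) lines.
        if not line.strip() or line.lstrip().startswith("#") or line[0].isspace():
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.append(key.strip())
    return keys


print(top_level_keys(SAMPLE_MLPROJECT))
# ['name', 'python_env', 'entry_points']
```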

In [2]:
# exercise 01

"""
Creating an MLproject

An MLproject file is a YAML file that stores the configuration of an MLflow Project. The file defines information such as the name of the Project, the Python environment, and the entry points to be executed as part of a workflow.

In this exercise, you will create an MLproject file to describe an MLflow Project. You will define the name of the Project, the Python environment, and also create an entry point.
"""

# Instructions

"""

    Set the name of the Project to insurance_model.

    Set the Python environment to use a file called python_env.yaml.

    Create an entry point called main.

    Create a command for the main entry point that uses python3.9 to execute train_model.py.

"""

# solution

"""
# Set name of the Project
name: insurance_model

# Set the environment
python_env: python_env.yaml

entry_points:
  # Create an entry point
  main:
    # Create a command
    command: 'python3.9 train_model.py'
"""

#----------------------------------#
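
A minimal sketch of how the file from this exercise could be written to disk from Python. The helper name write_mlproject and the tab check are assumptions added for illustration; the check is deliberately conservative because tabs are not valid YAML indentation.

```python
import tempfile
from pathlib import Path

MLPROJECT_TEXT = """\
name: insurance_model
python_env: python_env.yaml
entry_points:
  main:
    command: "python3.9 train_model.py"
"""


def write_mlproject(project_dir, text=MLPROJECT_TEXT):
    """Write `text` to <project_dir>/MLproject, rejecting any tab character."""
    if "\t" in text:
        raise ValueError("YAML indentation must use spaces, not tabs")
    path = Path(project_dir) / "MLproject"
    path.write_text(text, encoding="utf-8")
    return path


# Write into a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    written = write_mlproject(tmp)
    print(written.name)
```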

# Conclusion

"""
Perfect! Creating an MLproject file will allow for the Project to be reproduced when executed on a different machine or environment.
"""


In [3]:
# exercise 02

"""
MLflow run command

The MLflow run command is a command line interface for running MLflow Projects. The run command accepts arguments for the entry point, the experiment name, and the URI of the directory containing the MLproject file that describes the Project.

You can test each run command in the IPython Shell by placing a ! at the beginning of the command.

Which of the following run commands is used to run an MLflow Project where the MLproject file is in the current working directory, uses an experiment called "Insurance", and uses an entry point "main"?
"""

# Instructions

"""
Possible answers:
    
    mlflow run --entry-point main --exp-name "Insurance" --env-manager local ./
    
    mlflow run --entry-point main --experiment-name "Insurance" -d --env-manager local ./
    
    mlflow run --entry-point main --experiment-name "Insurance" --env-manager local ./ {Answer}
    
    mlflow run -p main --entry-point "Insurance" --env-manager local ./
"""

# solution

"""
mlflow run --entry-point main --experiment-name "Insurance" --env-manager local ./
"""

#----------------------------------#
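
One way to assemble the same command from Python is sketched below. The flags are the ones shown in the answer options; the helper name build_run_command is hypothetical.

```python
import shlex


def build_run_command(uri, entry_point, experiment_name, env_manager="local"):
    """Return the `mlflow run` invocation as an argument list."""
    return [
        "mlflow", "run",
        "--entry-point", entry_point,
        "--experiment-name", experiment_name,
        "--env-manager", env_manager,
        uri,
    ]


cmd = build_run_command("./", "main", "Insurance")
# shlex.join quotes arguments only when the shell would need it.
print(shlex.join(cmd))
# mlflow run --entry-point main --experiment-name Insurance --env-manager local ./
```

To execute the command rather than print it, the list could be passed to subprocess.run(cmd, check=True).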

# Conclusion

"""
Yes! This command will run our main entry point which trains our Insurance model.
"""


In [4]:
# exercise 03

"""
MLflow projects module

MLflow Projects can also be run programmatically with Python using the mlflow projects module.

In this exercise you will run an MLflow Project using the projects module to train a model for your "Insurance" Project. You will define the entry point from your MLproject file to execute the training code. You will also define the experiment name of "Insurance" so that the model is properly logged to the correct experiment in MLflow Tracking.

You may read the contents of the MLproject file by executing print(MLproject) in the IPython shell.
"""

# Instructions

"""

    Call the run() function from the mlflow projects module.

    Set the URI for the MLproject file to the current working directory.

    Set the entry point to "main" according to the MLproject file.

    Set the experiment name to "Insurance".

"""

# solution

import mlflow

# Set the run function from the MLflow Projects module
mlflow.projects.run(
    # Set the URI as the current working directory
    uri='./',
    # Set the entry point to main
    entry_point='main',
    # Set the experiment name as Insurance
    experiment_name='Insurance',
    env_manager="local",
    synchronous=True,
)

#----------------------------------#
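
Because the URI points at a local directory, a quick pre-flight check can confirm that the directory actually holds an MLproject file before the Project is run. This is a sketch under assumptions: the helper name has_mlproject is hypothetical, and it only looks for the exact file name MLproject.

```python
import tempfile
from pathlib import Path


def has_mlproject(uri="./"):
    """Return True if the local directory `uri` contains a file named MLproject."""
    return (Path(uri) / "MLproject").is_file()


with tempfile.TemporaryDirectory() as tmp:
    print(has_mlproject(tmp))  # False: nothing has been written yet
    (Path(tmp) / "MLproject").write_text("name: insurance_model\n", encoding="utf-8")
    print(has_mlproject(tmp))  # True
```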

# Conclusion

"""
Awesome! Using the mlflow.projects module is a great way to run MLflow Projects from Python.
"""


In [5]:
# exercise 04

"""
Adding parameters to MLproject

Defining parameters in MLflow Projects allows you to make your ML code reproducible. Parameters also simplify running training experiments with different settings without having to change code.

In this exercise, you are going to add parameters to your MLproject file for the main entry point. This entry point runs the train_model.py script, which trains a Logistic Regression model on Insurance data.

The script accepts two parameters, n_jobs and fit_intercept, which are hyperparameters used to train the model. You will begin by adding the n_jobs parameter in the MLproject file. You will then add the fit_intercept parameter. Finally, you will add the parameters to the command executed in the main entry point.
"""

# Instructions

"""

    Create a parameter called n_jobs of type int with a default value of 1.

    Create a second parameter called fit_intercept of type bool with a default value of True.

    Pass both parameters into the command, with n_jobs first and fit_intercept second.

"""

# solution

"""
name: insurance_model
python_env: python_env.yaml
entry_points:
  main:
    parameters:
      # Create parameter for number of jobs as n_jobs
      n_jobs:
        type: int
        default: 1
      # Create parameter for fit_intercept
      fit_intercept:
        type: bool
        default: True
    # Add parameters to be passed into the command
    command: "python3.9 train_model.py {n_jobs} {fit_intercept}"
"""

#----------------------------------#
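
The effect of the {n_jobs} and {fit_intercept} placeholders can be pictured with plain string formatting. The sketch below is a simplified illustration, not MLflow's own implementation, which does more work than this; the names render_command, COMMAND, and DEFAULTS are assumptions for the example.

```python
COMMAND = "python3.9 train_model.py {n_jobs} {fit_intercept}"
DEFAULTS = {"n_jobs": 1, "fit_intercept": True}


def render_command(command, defaults, overrides=None):
    """Fill `{name}` placeholders, letting overrides win over defaults."""
    values = {**defaults, **(overrides or {})}
    return command.format(**values)


print(render_command(COMMAND, DEFAULTS))
# python3.9 train_model.py 1 True
print(render_command(COMMAND, DEFAULTS, {"n_jobs": 2, "fit_intercept": False}))
# python3.9 train_model.py 2 False
```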

# Conclusion

"""
Great! Parameters must be defined within an MLproject file before the Project can be run.
"""


In [6]:
# exercise 05

"""
Adding parameters to project run

Parameters can be used to configure the behavior of a model by being passed as variables to the model during training. This allows you to train the model several times using different parameters without modifying the training code itself.

In this exercise, you will use the mlflow.projects module to run a Project that trains a Logistic Regression model for your Insurance experiment. You will then add parameters that are passed to the model as hyperparameters during training.
"""

# Instructions

"""

    Call the mlflow.projects.run() function.

    Create the parameters dictionary and set n_jobs to 2 and fit_intercept to False.

"""

# solution

import mlflow

# Set the run function from the MLflow Projects module
mlflow.projects.run(
    uri='./',
    entry_point='main',
    experiment_name='Insurance',
    env_manager='local',
    # Set parameters for n_jobs and fit_intercept; keys must match the MLproject parameter names
    parameters={
        'n_jobs': 2,
        'fit_intercept': False
    }
)


#----------------------------------#
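
Training the model several times with different settings amounts to building one parameters dictionary per combination. A minimal sketch, assuming the keys match the parameter names declared in the MLproject file; the grid values and the helper name parameter_sets are made up for illustration.

```python
from itertools import product

# Hypothetical grid of hyperparameter values to try.
GRID = {"n_jobs": [1, 2], "fit_intercept": [True, False]}


def parameter_sets(grid):
    """Yield one parameters dict per combination of the listed values."""
    names = list(grid)
    for combo in product(*(grid[name] for name in names)):
        yield dict(zip(names, combo))


for params in parameter_sets(GRID):
    # Each dict could be passed as parameters=params to mlflow.projects.run().
    print(params)
```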

# Conclusion

"""
Excellent work! Passing parameters into the MLflow Projects module is a great way to tune hyperparameters.
"""


In [7]:
# exercise 06

"""
Creating an MLproject for the ML Lifecycle: Model Engineering

The MLproject file can include more than one entry point. This means that you can use a single MLproject file to execute multiple entry points, making it possible to execute a workflow of multiple steps using a single MLproject file.

In this exercise, you are going to build the beginning of an MLproject file that contains the model_engineering entry point. This entry point will execute a Python script that accepts parameters used as hyperparameter values for fit_intercept and n_jobs in a Logistic Regression model. This model is used to predict the sex of a person from an insurance claim.
"""

# Instructions

"""

    Create an entry point for the Model Engineering step of the ML lifecycle called model_engineering.

    Set the first entry point parameter to n_jobs and the second to fit_intercept.

    Place the parameters within the command.

"""

# solution

"""
name: insurance_model
python_env: python_env.yaml
entry_points:
  # Set the entry point
  model_engineering:
    parameters: 
      # Set n_jobs 
      n_jobs:
        type: int
        default: 1
      # Set fit_intercept
      fit_intercept:
        type: bool
        default: True
    # Pass the parameters to the command
    command: "python3.9 train_model.py {n_jobs} {fit_intercept}"
"""

#----------------------------------#
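
The command above hands both values to the script as positional command-line arguments, which arrive as strings. The sketch below shows how a hypothetical train_model.py might parse them; the course's actual script is not shown in this notebook, so the function names and parsing rules here are assumptions.

```python
import argparse


def parse_bool(text):
    """Convert "True"/"False" (any case) to a bool; bool("False") would be True."""
    lowered = text.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {text!r}")


def parse_args(argv=None):
    """Parse the two positional arguments in the order used by the command."""
    parser = argparse.ArgumentParser(description="Hypothetical train_model.py arguments")
    parser.add_argument("n_jobs", type=int)
    parser.add_argument("fit_intercept", type=parse_bool)
    return parser.parse_args(argv)


args = parse_args(["2", "False"])
print(args.n_jobs, args.fit_intercept)
# 2 False
```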

# Conclusion

"""
The model_engineering entry point is going to handle building a model. Let's now create the part of the MLproject file responsible for Model Evaluation.
"""


In [8]:
# exercise 07

"""
Creating an MLproject for the ML Lifecycle: Model Evaluation

In this exercise, you will continue creating your MLproject file to manage steps of the ML lifecycle. You will create another entry point called model_evaluation. This step in the workflow accepts the run_id output from the model_engineering step and runs model evaluation using training data from our Insurance dataset.

You can print the current MLproject file using the IPython Shell and executing print(MLproject).
"""

# Instructions

"""

    Create an entry point called model_evaluation.

    Set parameters for run_id.

    Place the parameter within the command.

"""

# solution

"""
  # Set the model_evaluation entry point
  model_evaluation:
    parameters:
      # Set run_id parameter
      run_id:
        type: str 
        default: None
    # Set the parameters in the command
    command: "python3.9 evaluate.py {run_id}"
"""


#----------------------------------#
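
A sketch of how a hypothetical evaluate.py might turn the received run_id into a model URI. The runs:/<run_id>/<artifact_path> form is MLflow's URI scheme for run artifacts; the artifact path "model" and the function name are assumptions, since the evaluation script is not shown here.

```python
def model_uri_for_run(run_id, artifact_path="model"):
    """Build a runs:/ URI; artifact_path="model" is an assumed logging location."""
    # The MLproject default above is the literal text None, so reject it too.
    if not run_id or run_id == "None":
        raise ValueError("a run_id from the model_engineering step is required")
    return f"runs:/{run_id}/{artifact_path}"


print(model_uri_for_run("abc123"))
# runs:/abc123/model
```

A URI of this form could then be handed to a loader such as mlflow.pyfunc.load_model(), provided the training step logged its model under that artifact path.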

# Conclusion

"""
This entry point will help evaluate our model. We are now ready to execute our multi-step workflow for managing the Model Engineering and Model Evaluation steps of the ML lifecycle.
"""


In [9]:
# exercise 08

"""
Creating a multi-step workflow: Model Engineering

The MLflow Projects module can be used to run a multi-step workflow. All steps can be coordinated through a single Python program that passes results from earlier steps to the steps that follow.

In this exercise, you will begin creating a multi-step workflow to manage the Model Engineering and Model Evaluation steps of the ML lifecycle. You will use the run() method from the MLflow Projects module for the model_engineering entry point and pass parameters used as hyperparameters for model training. You will also capture the output of the run_id and set it to a variable so that it can be passed to the model_evaluation step of the workflow as a parameter.

The MLproject created in the previous step is available in the IPython Shell using print(MLproject). The MLflow module is imported.
"""

# Instructions

"""

    Assign the result of the run() function from the MLflow Projects module to a variable called model_engineering.

    Set the entry point argument to "model_engineering".

    Set the parameters for training the model: "n_jobs" to 2 and "fit_intercept" to False.

    Assign the run_id attribute of model_engineering to a variable called model_engineering_run_id.

"""

# solution

# Set run method to model_engineering
model_engineering = mlflow.projects.run(
    uri='./',
    # Set entry point to model_engineering
    entry_point='model_engineering',
    experiment_name='Insurance',
    # Set the parameters for n_jobs and fit_intercept
    parameters={
        'n_jobs': 2,
        'fit_intercept': False
    },
    env_manager='local'
)

# Set Run ID of model training to be passed to Model Evaluation step
model_engineering_run_id = model_engineering.run_id
print(model_engineering_run_id)

#----------------------------------#

# Conclusion

"""
Excellent! The model_engineering_run_id can now be passed to the Model Evaluation step of our workflow.
"""


In [10]:
# exercise 09

"""
Creating a multi-step workflow: Model Evaluation

In this exercise, you will create the Model Evaluation step of our multi-step workflow used to manage part of the ML lifecycle. You will use the run() function from the MLflow Projects module and set the entry point to model_evaluation. You will then pass the model_engineering_run_id generated in the previous exercise to the command as a parameter.

The MLproject created in the previous step is available in the IPython Shell using print(MLproject).

The mlflow module is imported.
"""

# Instructions

"""

    Assign the result of the run() function from the MLflow Projects module to model_evaluation.

    Set the entry point argument to "model_evaluation".

    Set a parameter called "run_id" with a value of model_engineering_run_id.

"""

# solution

# Set the MLflow Projects run method
model_evaluation = mlflow.projects.run(
    uri="./",
    # Set the entry point to model_evaluation
    entry_point="model_evaluation",
    # Set the parameter run_id to the run_id output of previous step
    parameters={
        "run_id": model_engineering_run_id,
    },
    env_manager="local"
)

print(model_evaluation.get_status())

#----------------------------------#
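
The two steps can be wrapped in a single function so that the run ID is handed from Model Engineering to Model Evaluation in one place. This is a sketch under assumptions: run_workflow and its run_fn argument are hypothetical names, run_fn is expected to behave like mlflow.projects.run, the parameter keys are assumed to match the MLproject file, and "FINISHED" is assumed to be the status string of a successful run.

```python
def run_workflow(run_fn, n_jobs=2, fit_intercept=False):
    """Run model_engineering, then model_evaluation with the first run's ID."""
    engineering = run_fn(
        uri="./",
        entry_point="model_engineering",
        experiment_name="Insurance",
        parameters={"n_jobs": n_jobs, "fit_intercept": fit_intercept},
        env_manager="local",
    )
    # Defensive check before evaluating; the status string is an assumption.
    status = engineering.get_status()
    if status != "FINISHED":
        raise RuntimeError(f"model_engineering ended with status {status}")

    # Pass the training run's ID to the evaluation entry point.
    return run_fn(
        uri="./",
        entry_point="model_evaluation",
        experiment_name="Insurance",
        parameters={"run_id": engineering.run_id},
        env_manager="local",
    )
```

With MLflow installed and the MLproject file in the current directory, this could be called as run_workflow(mlflow.projects.run). Accepting the run function as an argument also lets the chaining logic be exercised with a stand-in that does no training.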

# Conclusion

"""
Great! MLflow Projects has just been utilized to execute a workflow that manages several stages of the ML lifecycle.
"""
