# Custom ML Runtime

## Legacy engines vs ML Runtime

The precursor to Cloudera Machine Learning (CML) was Cloudera Data Science Workbench (CDSW). CDSW engines are container images that contain an operating system, interpreters, and libraries. ML Runtimes are the next generation of CDSW engines: they are smaller and offer improvements in performance, maintenance, and security. To read more about the differences between legacy engines and ML Runtimes, refer to the [documentation](https://docs.cloudera.com/machine-learning/cloud/runtimes/topics/ml-runtimes-vs-engines.html).

## Custom ML Runtime

For new users, the standard ML runtimes are enough to get started and run their projects. However, users who have moved past the early stages of a project, or who are required by their organization to 'bake in' organizational standards into CML, may need to customize an ML runtime. Since ML runtimes are container images, it is straightforward to customize a standard runtime or build a new one from scratch.

For the steps to create a custom ML runtime, please refer to the [documentation](https://docs.cloudera.com/machine-learning/cloud/runtimes/topics/ml-creating-a-customized-runtimes-image.html).

## Building custom runtime

Typically, the platform administrator determines which packages to pre-install into a custom ML runtime by surveying the users. In our 'Default prediction' project, we have created a baseline model. To create this model, we had to install the pandas and imblearn Python packages. Since these packages will also be required for hyperparameter tuning, we will create a custom runtime that includes them and add it to the 'Runtime Catalog'.

### Dockerfile

Using the Cloudera standard Python 3.10 ML runtime as the base runtime, we will create a new runtime and install the Python packages. To begin, we create a [Dockerfile](../scripts/base-ps-asset.Dockerfile) and a [requirements](../scripts/requirements.txt) file.
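Before baking a requirements file into an image, it can help to see which of its entries are not yet installed in the current environment. The following is a minimal sketch, not part of the project scripts: the helper names `requirement_name` and `missing_distributions` are hypothetical, and the parsing is deliberately simplified (it reads only the leading distribution name and skips comments and pip option lines).

```python
import re
from importlib import metadata


def requirement_name(line):
    """Return the distribution name from one requirements line, or None.

    Simplified on purpose: comments, blank lines, and pip options
    (lines starting with '-') are skipped; version specifiers are dropped.
    """
    line = line.split("#", 1)[0].strip()
    if not line or line.startswith("-"):
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0) if match else None


def missing_distributions(requirement_lines):
    """List the named distributions that are not installed in this environment."""
    missing = []
    for line in requirement_lines:
        name = requirement_name(line)
        if name is None:
            continue
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

In a session, something like `missing_distributions(open("requirements.txt"))` would report the entries still to be installed; the path here is an assumption and depends on where the file lives in your project. Note that the check uses distribution names as pip knows them, which can differ from the import name of a package.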

### Custom runtime

After creating the Dockerfile and supporting files, the new custom ML runtime was built, published to a repository, and then added to the Runtime Catalog.

In [1]:
!echo $ML_RUNTIME_EDITION

PS Project base runtime


As the ML_RUNTIME_EDITION environment variable shows, the new runtime edition has been built successfully and is in use in the current session. This means pandas, which is not prepackaged in the standard runtime, is now available by default.

In [2]:
import pandas as pd

pandas imported successfully, so the new runtime is working as expected.
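The two manual checks above (the edition name and the pandas import) can also be combined into one programmatic check. This is a sketch under stated assumptions: the function `check_runtime` is hypothetical, it only handles top-level module names, and the expected edition string is simply the value printed earlier in this session.

```python
import os
from importlib.util import find_spec


def check_runtime(expected_edition, modules, env=None):
    """Compare ML_RUNTIME_EDITION and report modules that cannot be found.

    `modules` should contain top-level import names only.
    `env` defaults to the real process environment.
    """
    env = os.environ if env is None else env
    edition = env.get("ML_RUNTIME_EDITION")
    # find_spec returns None when a top-level module is not importable
    unavailable = [name for name in modules if find_spec(name) is None]
    return {
        "edition_ok": edition == expected_edition,
        "unavailable": unavailable,
    }
```

For this project, a call such as `check_runtime("PS Project base runtime", ["pandas", "imblearn"])` would be expected to report a matching edition and an empty `unavailable` list when the session runs on the custom runtime.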

## Model Metrics

Now that we have built a custom runtime, we can use it as the runtime for the next stages of the project. Refer to the notebook [Model Metrics](../notebooks/3_Model_Metrics.ipynb) for the next steps.