# Creating custom Pyfunc models

MLflow’s persistence modules provide convenience functions for creating models with the pyfunc flavor in a variety of machine learning frameworks (scikit-learn, Keras, PyTorch, and more); however, they do not cover every use case. For example, you may want to create an MLflow model with the pyfunc flavor using a framework that MLflow does not natively support. Alternatively, you may want to build an MLflow model that executes custom logic when evaluating queries, such as preprocessing and postprocessing routines. For these cases, mlflow.pyfunc provides utilities for creating pyfunc models from arbitrary code and model data.

## Function-based Model & Class-based Model

### Function-based Model

If you only need to serialize a simple Python function with no additional dependent methods, you can log that function directly by passing it as the python_model keyword argument.




In [2]:
from mlflow_for_ml_dev.experiments.exp_utils import get_or_create_experiment
import mlflow 
import pandas as pd

In [3]:
# create experiment
experiment_name = "custom_models"
experiment = get_or_create_experiment(
    experiment_name = experiment_name,
    tags = {
        "project_name": "UNDEFINED",
        "topic":"pyfunc_flavor",
        "mlflow.note.content": "This experiment is used to store custom models that are not supported by mlflow out of the box. The models are saved using the pyfunc flavor."
    }
)

Experiment with name custom_models and ID 747157440134674892 created.


In [4]:
# Define a simple function to log
def predict(model_input):
    """
    Predicts the input multiplied by 2

    :param model_input: The input to the model
    :return: The input multiplied by 2
    """
    return model_input.apply(lambda x: x * 2)


# Save the function as a model
with mlflow.start_run(run_name="function_model"):
    mlflow.pyfunc.log_model("model", python_model=predict, pip_requirements=["pandas"])
    run_id = mlflow.active_run().info.run_id

# Load the model from the tracking server and perform inference
model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")
x_new = pd.Series([1,2,3,4,5])

prediction = model.predict(x_new)
print(prediction)



0     2
1     4
2     6
3     8
4    10
dtype: int64


### Class-based Model

If you’re looking to serialize a more complex object, for instance a class that handles preprocessing, complex prediction logic, or custom serialization, you should subclass the PythonModel class.

--------
```python
import mlflow
import pandas as pd

class MyModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input, params=None):
        return [x*2 for x in model_input]

# Save an instance of the class as a model
with mlflow.start_run():
    mlflow.pyfunc.log_model("model", python_model=MyModel(), pip_requirements=["pandas"])
    run_id = mlflow.active_run().info.run_id

# Load the model from the tracking server and perform inference
model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")
x_new = pd.Series([1, 2, 3])

print(f"Prediction:\n    {model.predict(x_new)}")
```
--------

#### Basic Guidelines for a PythonModel

The guidelines for this approach are as follows:

* Your class must be a subclass of mlflow.pyfunc.PythonModel.

* Your class must implement a predict method.

* The predict method must adhere to the requirements of the Inference API.

* The predict method must accept context as its first named argument (after self).

* If you wish to provide parameters with your model, these must be defined as part of the model signature. The signature must be saved along with the model.

* If you intend to have additional functionality execute when loading the model (such as loading additional dependent files), you may decide to define the load_context method in your class.



In [5]:
class CustomModel(mlflow.pyfunc.PythonModel):
    def __init__(self):
        pass

    def predict(self, context, model_input):
        return model_input.apply(lambda x: x * 2)

In [6]:
with mlflow.start_run(run_name="class_model") as run:
    print(run.info.run_id)
    mlflow.pyfunc.log_model("model", python_model=CustomModel(), pip_requirements=["pandas"])
    



b4f03dadb14d424a987b5081dd1a53b5


In [7]:
loaded_model = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
prediction = loaded_model.predict(x_new)
print(prediction)

0     2
1     4
2     6
3     8
4    10
dtype: int64


## Obtaining the Original Class

In [17]:
class CustomModel(mlflow.pyfunc.PythonModel):
    def __init__(self):
        pass

    def predict(self, context, model_input):
        return model_input.apply(lambda x: x * 2)
    
    def load_context(self, context):
        print("Running when loading the model")

    def additional_method(self):
        print("Running an additional method")

In [18]:
with mlflow.start_run(run_name="class_model_with_additional_methods") as run:
    print(run.info.run_id)
    mlflow.pyfunc.log_model("model", python_model=CustomModel(), pip_requirements=["pandas"])
    



4108e210ba80443aa9433f66aafac945


In [19]:
loaded_model = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
prediction = loaded_model.predict(x_new)
print(prediction)

Running when loading the model
0     2
1     4
2     6
3     8
4    10
dtype: int64


In [20]:
type(loaded_model)

mlflow.pyfunc.PyFuncModel

In [None]:
# Use unwrap_python_model() to retrieve the original PythonModel instance
original_model = loaded_model.unwrap_python_model()
original_model.additional_method()

Running an additional method
