# Effortless models deployment with Mlflow

## Creating an recommendation system model using a custom flavor in MLFlow

In the examples provided before, we shown how, by simply adopting the MLFlow model specification to persist our models, we can then achieve easy deployments in a variety of platforms. All we had to do is to save the model using, for instance for a FastAI model, `mlflow.fastai.log_model` to save the model using the `MLModel` format. MlFlow supports the most popular frameworks out there.

However, this technique has a bold implication: you can achieve everything you want to do using one single model flavor. It is certainly true that several frameworks like `TensorFlow` or `PyTorch` are flexible enough to be *modeling complete*, meaning that any modeling task we want to do can be achieved by using any of the building blocks they provide. However, it may not be necessary simple and it is very common to use the strengths of multiple frameworks in a single modeling task.

For instance, you might find yourself using `scikit-learn` to do categorical encoding or missing values inputation to then feed the data to a `Keras` model. How can you log this kind of models in MLFlow? Is it one or the other?

In this notebook, I will show you how you con combine multiple flavor in MLFlow to generate your own one to then, achieve the efortless model deployment we have been talking about. Also, this technique is useful to package the multiple pieces that are required to run a model, ensuring that all the dependencies are packaged correctly.

To demostrate this example, we will switch gears to a different modeling task: a recommender for events people can attend to. On this example we will try to build a recommender that will take an user and a history of the events they attended to and recommend 10 events they might be interested to attend to. We will save this model using MLFlow to then deploy it. Let's take a look:

## Exploring the dataset

First, we will load the dataset. It is composed of two columns: 'user_id' and 'event_id':

In [1]:
import pandas as pd
import numpy as np

def getData():
    train = pd.read_csv('train.csv')[['user_id', 'event_id']]
    test = pd.read_csv('test.csv')[['user_id', 'event_id']]
    return train, test

In [2]:
train, test = getData()

If we display our dataset we will see that both the training and testing datasets contains pairs of elements `user` and `event`, both encoded as GUIDs indicating that the given user picked the given event. 

In [3]:
train

Unnamed: 0,user_id,event_id
0,734eb5b3-9852-4456-8bab-86d507965f3d,39e8d4e9-d3a6-47f3-aed1-4849908cb6ba
1,734eb5b3-9852-4456-8bab-86d507965f3d,daaa51bc-e8bf-47c9-9268-a46de19ac7e2
2,734eb5b3-9852-4456-8bab-86d507965f3d,6c0068fb-dc83-4dd1-a5b9-98210a5dd267
3,734eb5b3-9852-4456-8bab-86d507965f3d,40ce7f97-650e-4d3e-b7d3-b776673f4fc2
4,734eb5b3-9852-4456-8bab-86d507965f3d,f426dff5-0e14-40ce-9489-45643982d99f
...,...,...
807800,b91bc14d-347d-4aa4-81ba-b8391fdd8c2c,0385ff2c-6c9a-4467-83b0-5aaf6b5a6c96
807801,b91bc14d-347d-4aa4-81ba-b8391fdd8c2c,f9388434-6056-4432-b25e-b933b22a4ff6
807802,b91bc14d-347d-4aa4-81ba-b8391fdd8c2c,768bcd5b-8c81-4ef0-80dc-cf65ddaf8c1a
807803,b91bc14d-347d-4aa4-81ba-b8391fdd8c2c,798cfc47-0191-4d95-9f96-3babd7539cae


In [4]:
print('Training size:', len(train))
print('Testing size:', len(test))
print('Unique users:', len(train['user_id'].unique()))
print('Unique items:', len(train['event_id'].unique()))

Training size: 807805
Testing size: 62700
Unique users: 1113
Unique items: 6235


## Modeling

We will consider that each interaction the user has with an item (an event) is equally important. This means that any person attending to any event is a signal as strong as any other person attending to any other event (hint: this implies that we are not capturing situations where a person attended to an event they didn't like). To capture that explicitly, we will create a new column in the dataset called `value` that will contain the number `1.0`, representing the weight of the interaction between an user and an event. Take into account that weighting is a very import aspect of a recommender and there are several ways of doing so.

In [5]:
train['value'] = 1.0

In [6]:
train

Unnamed: 0,user_id,event_id,value
0,734eb5b3-9852-4456-8bab-86d507965f3d,39e8d4e9-d3a6-47f3-aed1-4849908cb6ba,1.0
1,734eb5b3-9852-4456-8bab-86d507965f3d,daaa51bc-e8bf-47c9-9268-a46de19ac7e2,1.0
2,734eb5b3-9852-4456-8bab-86d507965f3d,6c0068fb-dc83-4dd1-a5b9-98210a5dd267,1.0
3,734eb5b3-9852-4456-8bab-86d507965f3d,40ce7f97-650e-4d3e-b7d3-b776673f4fc2,1.0
4,734eb5b3-9852-4456-8bab-86d507965f3d,f426dff5-0e14-40ce-9489-45643982d99f,1.0
...,...,...,...
807800,b91bc14d-347d-4aa4-81ba-b8391fdd8c2c,0385ff2c-6c9a-4467-83b0-5aaf6b5a6c96,1.0
807801,b91bc14d-347d-4aa4-81ba-b8391fdd8c2c,f9388434-6056-4432-b25e-b933b22a4ff6,1.0
807802,b91bc14d-347d-4aa4-81ba-b8391fdd8c2c,768bcd5b-8c81-4ef0-80dc-cf65ddaf8c1a,1.0
807803,b91bc14d-347d-4aa4-81ba-b8391fdd8c2c,798cfc47-0191-4d95-9f96-3babd7539cae,1.0


### Encoding

Users and events are encoded as GUIDs in our dataset, and that's what probably we will get as inputs when we deploy this model in production. However, ML models won't work with this kind of data. We will have to transform these two columns to categorical values so we can use them in our recommender. There are multiple ways to achieve this, but using `OrdinalEncoder` is probably one of the easier ones.

In [7]:
from sklearn.preprocessing import OrdinalEncoder

In [8]:
users_encoder = OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=np.nan)
train['user_id'] = users_encoder.fit_transform(train['user_id'].values.reshape(-1,1))

items_encoder = OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=np.nan)
train['event_id'] = items_encoder.fit_transform(train['event_id'].values.reshape(-1,1))

Note that we are creating encoders for both items and users, cause both of them uses GUIDs. Pay attention around how `OrdinalEncoder` is configured, particularly:

- `handle_unknown` is set to `use_encoded_value`, meaning that if the encoder has to encode a new value that was not seen in the training set, it will use a predefined value to denote that. This is useful in the case that new items/users are added to the platform for which the model is not that useful.
- `unknown_value` is set to `np.nan`. This will be our missing value.

> You will see later way `OrdinalEncoder` is an smart choice, specially when working with sparse matrices.

### Creating the training dataset

Now, it is time to create our training dataset. We will use here the library `implicit`, which contains several implementations of recommendation system's algorithmn. An important details about the library `implicit` is that it requires us to provide the data as sparse matrices. Sparse matrices are an efficient way to represent matrices that contains a high number of zeros. They also help on doing computation on them more efficient (at expenses of some other operations being more costly). Recommendation systems tent to deal with datasets that exploit this fact a lot.

We need to construct a matrix of shape `(#EVENTS, #USERS)` where a value of `1` in row `EVENT` and column `USER` means that the user in position `USER` has attended to event in position `EVENT`. Since our `OrdinalEncoder` provides us with values in the range of `[0..#USERS]` for users and `[0..#EVENTS]` for events, the encoded value represents a convenient way to populate this matrix:

In [9]:
from scipy import sparse

sparse_event_user_train = sparse.csr_matrix((train['value'],
                                            (train['event_id'].astype(int), train['user_id'].astype(int))))

> Notes: Type casting is require for `csr_matrix`. Matrix indeces are integers while its values should be float

Let's check the shape of the training matrix:

In [10]:
sparse_event_user_train.shape

(6235, 1113)

This matches the number of users and events we had before. Now, let's create our model:

In [11]:
from implicit.als import AlternatingLeastSquares

als_model = AlternatingLeastSquares(factors=100)

We will construct a model using the algorithm Alternating Least Squares, with a latent space of 100 factors. The implementation corresponds to the paper [Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering](https://dl.acm.org/doi/10.1145/2043932.2043987). Going over the details of the algorithm is out of our scope but you are more than welcome to read the documentation.

Let's train the model:

In [12]:
als_model.fit(sparse_event_user_train)

  0%|          | 0/15 [00:00<?, ?it/s]

## Evaluating the performance

We need to evaluate the performance of the model to have an understanding about how good or bad it is. There are several ways to measure the performance of a recommender. In this case we will pay attention to Precision@K. This metric is an extension of the the metric Precision but extended to a recommender scenario. While precision measures "*Out of the items predicted to be relevant, how many are truly relevant?*", Precision@k measures "*Out of the top k items predicted to be relevant for each user, how many are truly relevant for the user?*". As you can see, this metric is very relevant in a recommender settings.

To get that done, let's start by preparing our testing dataset. It's shape is basically the same one that the one we used for training. In order to be consistent, it is important to apply the same categorical encoding to the elements in the testing set that we did in the training set.

Our `OrdinalEncoder` does this job very simple:

In [13]:
test['event_id'] = items_encoder.transform(test['event_id'].values.reshape(-1,1))
test['user_id'] = users_encoder.transform(test['user_id'].values.reshape(-1,1))

We will use the same weighting schema we used before:

In [14]:
test['value'] = 1.0

Some users or items in the testing set may never been seen in the training set and hence we know nothing about them. As a consequence, we can't recommend anything about them. We will drop them. In reality, we should pick an strategy to deal with them, but that's a battle for tomorrow.

In [15]:
test.dropna(inplace=True)

As we did in the training phase, the evaluation also requires data to be supplied as sparse matrices. Again, our `OrdinalEncoder` is very handy:

In [16]:
sparse_event_user_test = sparse.csr_matrix((test['value'],
                                           (test['event_id'].astype(int),test['user_id'].astype(int))))

Let's check sizes:

In [17]:
sparse_event_user_test.shape

(6187, 1112)

It's time to run our evaluation routines:

### Precision@K

In [18]:
import mlflow
from implicit import evaluation

In [19]:
pk10 = evaluation.mean_average_precision_at_k(als_model, sparse_event_user_train, sparse_event_user_test, K=10)
print(pk10)

  0%|          | 0/270 [00:00<?, ?it/s]

0.8888888888888888


In [20]:
mlflow.log_metric('precision_10', pk10)

> This means that 88% of the time, our model predicts something that is relevant for the user. That's not bad considering the amount of effort we put in building the model itself.

### AUC@K

In [21]:
auc10 = evaluation.AUC_at_k(als_model, sparse_event_user_train, sparse_event_user_test)
print(auc10)

  0%|          | 0/270 [00:00<?, ?it/s]

0.6832641519691369


In [22]:
mlflow.log_metric('auc_10', auc10)

## Running the recommend method from the model

Now it's time to see how we can actually use this model in production. To generate recomendations for a given user we can use the method `recommend` from the model. The method `recommend` requires an sparse matrix as input indicating the items that the user already selected. We can construct this information in the following way:

Let's reload our initial data so we can demostrate how to run the model

In [23]:
train, test = getData()

I will pick a sample user randomly to showcase this (random... well, I will pick the one at position 0 :D):

In [24]:
sample_user_id = test['user_id'][0]
print(sample_user_id)

8c6b414e-b79a-418d-8948-7ffb7538cbd9


The method `recommend` requires to know which are the items the user already selected. In this example, we will consider the items from the training set, so:

In [25]:
sample_user_items = train[train['user_id'] ==  sample_user_id].copy()

We need to do a couple of things now:

- Encode the user ID in the same way we did before, again using our `OrdinalEncoder`.
- Encode the events ID in the same way we did before, again using our `OrdinalEncoder`.
- Weight the items in the same way we did before, `1.0`.

In [26]:
sample_user_items['user_id'] = users_encoder.transform(sample_user_items['user_id'].values.reshape(-1, 1))
sample_user_items['event_id'] = items_encoder.transform(sample_user_items['event_id'].values.reshape(-1,1))
sample_user_items['value'] = 1.

The user we are working with has the following ID:

In [27]:
sample_user = int(sample_user_items['user_id'].iloc[0])
print(sample_user)

612


As mentioned before, our model most of the time works with sparse matrices and this is also the case for the method `recommend`. So we will provide this data using an sparse matrix:

In [28]:
items = sparse.csr_matrix((sample_user_items['value'],
                          (np.zeros(len(sample_user_items), dtype=int), sample_user_items['event_id'].astype(int))))

> **What is `np.zeros` doing here?** The recommend method requires to provide sparse vectors for the items already liked for the user. `np.zeros` is just creating an array with all zeros of lenght `len(sample_user_items)` - which is the amount of items the user selected. This indicates that the items correspond to the same user, being in this case represented by 0.

In [29]:
items.shape

(1, 6235)

Let's run the method now:

In [30]:
ids, scores = als_model.recommend([sample_user], user_items=items, 
                                  N=10, filter_already_liked_items=False)

Let's check results:

In [31]:
pd.DataFrame({"event_id": items_encoder.inverse_transform(ids.reshape(-1,1)).reshape(-1), 
              "score": scores.reshape(-1)})

Unnamed: 0,event_id,score
0,1a9e233a-0b96-4031-8984-f66d527f31c5,1.000906
1,21d373ca-daf7-4225-87d2-56c04dff1e3e,1.000863
2,0f9e2891-373a-419f-8e8c-32ad21e03121,1.000739
3,0f47dc95-d034-4418-8afe-6bf452f7cea3,1.000653
4,159b7b72-a5fd-4233-9fb9-15c1c55fb40e,1.000631
5,1bca8b70-a1c8-419d-b748-b1b7afa73f2b,1.000597
6,09ce03da-e4a6-4129-bc0c-fa49bb0184d1,1.00059
7,217981b1-d95f-4f46-9aed-5796e8249c1b,1.000496
8,0b7df7af-b8e2-4c0b-b27c-d3094b02dc84,1.000494
9,10069266-e0c0-40c7-9ed7-737f14036762,1.000466


> **Too much reshape?** Yes, I know, there is a lot. But `OrdinalEncoder` required 2-D arrays so the first `reshape` call you see converts a 1-D array to a 2-D one, and the second one converts it back to a single dimensional array.

## Can we talk now about MLFlow? 

Now let's get back to business. How we can package this model using MLflow so it is simple to deploy it? I picked this example because it has a lot of interesting pieces which makes it very appeling. Our model now contains elements from `scikit-learn` and elements from another framework that it is not even supported by MLFlow by default, `implicit`. So how can we proceed?

One option, of course it's to package everything using Scikit-Learn pipelines and then log the model using the `scikit` flavor. It will work, but we are looking for a more flexible approach now. One approach that can be useful here is to create our own flavor:

### Creating a custom model in MLFlow

In my last post I introduced the flavor `pyfunc` that MLFlow uses to run models at deployment time. We saw a way to customize how our model is loaded and inference is run by providing a custom model loader module. However, `pyfunc` also provides a way to create our own custom flavor, that can be composed of whatever elements we want. To do that MLFlow requires us to do 2 thing:

- Create a class representing our model, that inherets from `PythoModel`.
- Implement in this class the method `load_context` and `predict`.

Let's see how this looks like. The class should look like the following:

```python
class AlternatingLeastSquaresModel(PythonModel):
    def __init__(self, ...)
        (...)

    def load_context(self, context: PythonModelContext):
        (..)

    def predict(self, context: PythonModelContext, data):
        (...)
```

Let's go over this in details:

### The init method

We will use the constructor to instantiate the model. This is our chance to include in the model all the elements we want to compose it. For instance we can pass here:

- The number of items we want to recommend.
- The model we trained
- The items and user's encoders

```python
def __init__(self, 
             recommender: AlternatingLeastSquares, 
             item2id: OrdinalEncoder,
             user2id: OrdinalEncoder,
             k: int = 10):

    self.model = recommender
    self._k = k
    self._item2id = item2id
    self._user2id = user2id
```

### The load_context method

This method allows you to load any additional asset that you might need. For instance, let's consider the case where the user mappings are stored in a file, a json file for instance. Another typical case would be when you model weights are persisted, let's say, in weights file like `h5`, `pt`, etc. You can load this files here. MLFlow can persist these files and load them on runtime in this point.

In `context.artifacts['artifact_key']` you will find the path to the specific artifact. For instance, let's imagine our model is a Keras model, and we have our model persisted in `h5` format, then `context.artifacts['model']` can be a pointer to the file `model.h5` that you can use to the call `model.load(model_path)` in Keras.

We will skip this step cause our model doesn't require it.

```python
def load_context(self, context: PythonModelContext):
    pass
```

### The predict method

The predict method is what get's called each time MLFLow runs our model. The signature is pretty similar to what you would expect, but there is also a `PythonModelContext` passed as argument. This is the same configuration that get's passed to you in the `load_context` method in case you need it. It's rarely the case you will need this on the predict, but it is there just in case.

The `data` parameter is where you input data is passed. The type of this argument will depend on the signature you define for your model. In our case, I will require this data to be columnar, so it will be `pd.DataFrame`. The return type should also comply with the signature. Again, in our case we will be returning a table with the recommendations so it will be `pd.DataFrame`.

What happens inside the method is almost the same we did before when we esay how to run the `recommend` method. However, if you see, now we are using the `OrdinalEncoder` objects instanciated in the `AlternatingLeastSquaresModel`. 

In [32]:
from mlflow.pyfunc import PythonModel, PythonModelContext
from typing import Dict

class AlternatingLeastSquaresModel(PythonModel):
    def __init__(self, 
                 recommender: AlternatingLeastSquares, 
                 item2id: OrdinalEncoder,
                 user2id: OrdinalEncoder,
                 k: int = 10):
        
        self.model = recommender
        self._k = k
        self._item2id = item2id
        self._user2id = user2id

    def load_context(self, context: PythonModelContext):
        pass

    def predict(self, context: PythonModelContext, data):
        # Inputs are dataframes with 2 columns, `user_id` containing the user to run the recommendations
        # and `event_id` containing the items the user already picked. Notice that the user is repeated
        # multiple time per each event so we can have a tabular input data.
        # We make a copy of the dataframe cause we will modify it.
        data = data.copy()
        
        # Convert GUIDs to the IDs we used for training
        data['event_id'] = self._item2id.transform(data['event_id'].values.reshape(-1,1))
        data['user_id'] = self._user2id.transform(data['user_id'].values.reshape(-1,1))
        
        # Use the same weighting schema we used for training
        data['value'] = 1.0
        
        # Drop NAs that will be generated for events not included in the training dataset we don't know
        # anything about.
        data.dropna(inplace=True)
        
        # The user we will recommend for
        users = data['user_id'].astype('int').unique()
        
        # Items picked by the user as an sparse matrix.
        items = sparse.csr_matrix((data['value'],
                                  (np.zeros(len(sample_user_items), dtype=int), data['event_id'].astype(int))))
        
        # Run the recomendations
        ids, scores = self.model.recommend(users,
                                           user_items=items, 
                                           N=self._k,
                                           filter_already_liked_items=False)
        
        # Return the output as a dataframe
        return pd.DataFrame({
            "event_id": self._item2id.inverse_transform(ids.reshape(-1,1)).reshape(-1), 
            "score": scores.reshape(-1)
        })

Let's test this out:

In [33]:
mlflow_model = AlternatingLeastSquaresModel(als_model, item2id = items_encoder, user2id = users_encoder, k=10)

Running the `predict` function:

In [34]:
sample_user_items = train[train['user_id'] ==  sample_user_id].copy()

In [35]:
data = mlflow_model.predict(None, sample_user_items)

> You will notice I'm passing `None` for the argument `context`. This is just for testing, but in runtime, when the model is deployed using MLFlow, this will have the actual context object for the model. We won't pass the parameter directly, cause it will be passed automatically by MLflow.

In [36]:
data

Unnamed: 0,event_id,score
0,1a9e233a-0b96-4031-8984-f66d527f31c5,1.000906
1,21d373ca-daf7-4225-87d2-56c04dff1e3e,1.000863
2,0f9e2891-373a-419f-8e8c-32ad21e03121,1.000739
3,0f47dc95-d034-4418-8afe-6bf452f7cea3,1.000653
4,159b7b72-a5fd-4233-9fb9-15c1c55fb40e,1.000631
5,1bca8b70-a1c8-419d-b748-b1b7afa73f2b,1.000597
6,09ce03da-e4a6-4129-bc0c-fa49bb0184d1,1.00059
7,217981b1-d95f-4f46-9aed-5796e8249c1b,1.000496
8,0b7df7af-b8e2-4c0b-b27c-d3094b02dc84,1.000494
9,10069266-e0c0-40c7-9ed7-737f14036762,1.000466


So we got what we expected. Let's now save the model using this our custom flavor.

### Logging the model with MLFlow

As usual, we will log the model using the method `mlflow.pyfunc.log_model()`. But first, let's define our signature. Remember we have two ways to do this, using the `infer_signature` method and supplying a sample of the input and the output, or creating that manually.

Using `infer_signature`:

In [37]:
from mlflow.models.signature import infer_signature

signature = infer_signature(train[['user_id', 'event_id']], data)
signature

inputs: 
  ['user_id': string, 'event_id': string]
outputs: 
  ['event_id': string, 'score': float]

Using `ColSpec`:

In [38]:
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import ColSpec, DataType, Schema

signature = ModelSignature(
    inputs=Schema([
        ColSpec(DataType.string, "user_id"),
        ColSpec(DataType.string, "event_id")
    ]), 
    outputs=Schema([
        ColSpec(DataType.string, "event_id"),
        ColSpec(DataType.float, "score")
    ]))
signature

inputs: 
  ['user_id': string, 'event_id': string]
outputs: 
  ['event_id': string, 'score': float]

Let's register the model

In [40]:
mlflow.pyfunc.log_model("recommender", 
                        python_model=mlflow_model,
                        signature=signature,
                        registered_model_name='event_recommender')

Registered model 'event_recommender' already exists. Creating a new version of this model...
2022/04/05 22:19:41 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: event_recommender, version 4
Created version '4' of model 'event_recommender'.


ModelInfo(artifact_path='recommender', flavors={'python_function': {'cloudpickle_version': '2.0.0', 'python_model': 'python_model.pkl', 'loader_module': 'mlflow.pyfunc.model', 'python_version': '3.8.8', 'env': 'conda.yaml'}}, model_uri='runs:/f8845fed-a4bf-45da-9e57-b996d9da442f/recommender', model_uuid='ccefda98165745c79bfb48566026f0bc', run_id='f8845fed-a4bf-45da-9e57-b996d9da442f', saved_input_example_info=None, signature_dict={'inputs': '[{"name": "user_id", "type": "string"}, {"name": "event_id", "type": "string"}]', 'outputs': '[{"name": "event_id", "type": "string"}, {"name": "score", "type": "float"}]'}, utc_time_created='2022-04-05 22:18:11.440174')

Let's explore this arguments:

- `"recommender"` is just the name of the folder where the artifacts will be stored. It can be any name.
- `python_model` is the instance of the class that inherits from `PythonModel`. Pay special attention that this object will be serialized using the `Pickle` format and then loaded into memory in runtime.
- `signature` is the model input and output signature, as usual.
- `registered_model_name` is the name of the model we will register in the model registry. Remember that this parameter is optional and you should only include it when you want the model to be registered in the registry.

In [41]:
mlflow.end_run()

### Testig the MLFlow model

We can load the model from the code using the following line. In this case we are assuming the model was registered using the name `event_recommender`. We are also retrieving the last version of it.

In [42]:
mlflow.set_registry_uri = "azureml://eastus.api.azureml.ms/mlflow/v1.0/subscriptions/18522758-626e-4d88-92ac-dc9c7a5c26d4/resourceGroups/Analytics.Aml.Experiments.Workspaces/providers/Microsoft.MachineLearningServices/workspaces/aa-ml-aml-workspace"

In [43]:
model = mlflow.pyfunc.load_model('models:/event_recommender/latest')

 - cloudpickle (current: 2.1.0, required: cloudpickle==2.0.0)
 - ipython (current: 8.3.0, required: ipython==7.22.0)
 - scikit-learn (current: 1.1.1, required: scikit-learn==1.0.2)
 - typing-extensions (current: uninstalled, required: typing-extensions==3.7.4.3)
To fix the mismatches, call `mlflow.pyfunc.get_model_dependencies(model_uri)` to fetch the model's environment and install dependencies using the resulting environment file.
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


Running the `predict` function:

In [44]:
sample_user_items = train[train['user_id'] ==  sample_user_id].copy()

In [45]:
data = model.predict(sample_user_items)

In [46]:
data

Unnamed: 0,event_id,score
0,159b7b72-a5fd-4233-9fb9-15c1c55fb40e,1.003558
1,10069266-e0c0-40c7-9ed7-737f14036762,1.003018
2,1a9e233a-0b96-4031-8984-f66d527f31c5,1.002954
3,27dfc792-00cb-42aa-9613-748cabb9be29,1.002925
4,1bca8b70-a1c8-419d-b748-b1b7afa73f2b,1.002875
5,0b7df7af-b8e2-4c0b-b27c-d3094b02dc84,1.002443
6,0f47dc95-d034-4418-8afe-6bf452f7cea3,1.002399
7,0f9e2891-373a-419f-8e8c-32ad21e03121,1.002333
8,217981b1-d95f-4f46-9aed-5796e8249c1b,1.002129
9,09ce03da-e4a6-4129-bc0c-fa49bb0184d1,1.001551


### Serving the model locally

We can run the model in an inference server locally in our local compute. Again, with this we can check that our deployment strategy will work. 

To do so, let's serve our model using mlflow:

```bash
mlflow models serve -m models:/event_recommender/latest
```

Creating a sample request

In [36]:
import json

with open("sample.json", "w") as f:
    f.write(sample_user_items.reset_index(drop=True).to_json(orient='split', index=False))

> Note how the model inputs is indicated. MLFlow requires the inputs to the model to be submitted using `JSON` format and multiple specification are supported. In the Cats vs Dogs sample we saw before we used the TensorFlow Serving specification. Now, since we are using tabular data, we can use the Columnar format in Pandas.

Sending the request

In [77]:
!cat -A sample.json | curl http://127.0.0.1:5000/invocations \
                        --request POST \
                        --header 'Content-Type: application/json' \
                        --data-binary @-

[{"event_id": "1a9e233a-0b96-4031-8984-f66d527f31c5", "score": 1.0018339157104492}, {"event_id": "10069266-e0c0-40c7-9ed7-737f14036762", "score": 1.0017820596694946}, {"event_id": "0b7df7af-b8e2-4c0b-b27c-d3094b02dc84", "score": 1.0017375946044922}, {"event_id": "0f9e2891-373a-419f-8e8c-32ad21e03121", "score": 1.0016436576843262}, {"event_id": "09ce03da-e4a6-4129-bc0c-fa49bb0184d1", "score": 1.001570701599121}, {"event_id": "27dfc792-00cb-42aa-9613-748cabb9be29", "score": 1.001529574394226}, {"event_id": "1e22c59f-7542-4ccd-8630-cc387a7c5947", "score": 1.0014907121658325}, {"event_id": "1bca8b70-a1c8-419d-b748-b1b7afa73f2b", "score": 1.0013089179992676}, {"event_id": "159b7b72-a5fd-4233-9fb9-15c1c55fb40e", "score": 1.001258373260498}, {"event_id": "217981b1-d95f-4f46-9aed-5796e8249c1b", "score": 1.0012205839157104}]

### Deploying to Azure ML

#### ACI

In [None]:
from mlflow.deployments import get_deploy_client

In [None]:
import azureml.mlflow

In [None]:
client = get_deploy_client("azureml:/..")

In [None]:
import json

deploy_config = {
  "computeType": "aci",
  "containerResourceRequirements": 
  {
    "cpu": 2,
    "memoryInGB": 4 
  }
}

deployment_config_path = "deployment_config.json"
with open(deployment_config_path, "w") as outfile:
    outfile.write(json.dumps(deploy_config))

In [None]:
webservice = client.create_deployment(model_uri=f'models:/event_recommender/4',
                                      name="event-recommender-10",
                                      config={'deploy-config-file': deployment_config_path})

#### Online Endpoint

In [58]:
import json

with open("sample.json", "w") as f:
    f.write('{ "input_data": ' + sample_user_items.reset_index(drop=True).to_json(orient='split') + '}')

In [59]:
!cat -A sample.json | curl https://event-recommender.eastus.inference.ml.azure.com/score \
                    --request POST \
                    --header 'Content-Type: application/json' \
                    --header 'Authorization: Bearer 2zU7eE3hu6At0J0mtkQQDtxNaFiHtbGt' \
                    --data-binary @-

[{"event_id": "1a9e233a-0b96-4031-8984-f66d527f31c5", "score": 1.0018339157104492}, {"event_id": "10069266-e0c0-40c7-9ed7-737f14036762", "score": 1.0017820596694946}, {"event_id": "0b7df7af-b8e2-4c0b-b27c-d3094b02dc84", "score": 1.0017375946044922}, {"event_id": "0f9e2891-373a-419f-8e8c-32ad21e03121", "score": 1.0016436576843262}, {"event_id": "09ce03da-e4a6-4129-bc0c-fa49bb0184d1", "score": 1.001570701599121}, {"event_id": "27dfc792-00cb-42aa-9613-748cabb9be29", "score": 1.001529574394226}, {"event_id": "1e22c59f-7542-4ccd-8630-cc387a7c5947", "score": 1.0014907121658325}, {"event_id": "1bca8b70-a1c8-419d-b748-b1b7afa73f2b", "score": 1.0013089179992676}, {"event_id": "159b7b72-a5fd-4233-9fb9-15c1c55fb40e", "score": 1.001258373260498}, {"event_id": "217981b1-d95f-4f46-9aed-5796e8249c1b", "score": 1.0012205839157104}]

## Bonus extra: Scoring multiple users at a time

If you pay closer look to the model we created, you will see that it will only be able to score one user at a time. Usually, it can be more efficient to score multiple users at the time, specially if you run the model on hardware with multiple cores. Some changes are required to do this:

In [82]:
from mlflow.pyfunc import PythonModel, PythonModelContext

class AlternatingLeastSquaresModel(PythonModel):
    def __init__(self, 
                 recommender: AlternatingLeastSquares, 
                 item2id: OrdinalEncoder,
                 user2id: OrdinalEncoder,
                 k: int = 10):
        
        self.model = recommender
        self._item2id = item2id
        self._user2id = user2id
        self._k = k

    def load_context(self, context: PythonModelContext):
        pass

    def predict(self, context: PythonModelContext, data):
        data = data.copy()
        
        data['user_cat'] = data['user_id'].astype('category')
        data['event_id'] = self._item2id.transform(data['event_id'].values.reshape(-1,1))
        data['user_id'] = self._user2id.transform(data['user_id'].values.reshape(-1,1))
        data['value'] = 1.0
        
        data.dropna(inplace=True)
        
        items = sparse.csr_matrix((data['value'].astype(float),
                                  (data['user_cat'].cat.codes, data['event_id'].astype(int))))
        
        ids, scores = self.model.recommend(data['user_id'].astype('int').unique(),
                                           user_items=items, 
                                           N=self._k,
                                           filter_already_liked_items=False)
        
        return pd.DataFrame({
            "user_id": np.repeat(data['user_cat'].cat.categories, self._k),
            "event_id": self._item2id.inverse_transform(ids.reshape(-1,1)).reshape(-1), 
            "score": scores.reshape(-1)
        })

Let's create an instance of this model:

In [83]:
mlflow_model = AlternatingLeastSquaresModel(als_model, item2id = items_encoder, user2id = users_encoder, k=10)

The signature for this model would be then:

In [84]:
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import ColSpec, DataType, Schema

signature = ModelSignature(
    inputs=Schema([
        ColSpec(DataType.string, "user_id"),
        ColSpec(DataType.string, "event_id")
    ]), 
    outputs=Schema([
        ColSpec(DataType.string, "user_id"),
        ColSpec(DataType.string, "event_id"),
        ColSpec(DataType.float, "score")
    ]))
signature

inputs: 
  ['user_id': string, 'event_id': string]
outputs: 
  ['user_id': string, 'event_id': string, 'score': float]

Now, we can log the model:

In [85]:
mlflow.pyfunc.log_model("recommender", 
                        python_model=mlflow_model,
                        signature=signature,
                        registered_model_name='event_recommender_mutiuser')

Successfully registered model 'event_recommender_mutiuser'.
2022/04/04 21:50:38 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: event_recommender_mutiuser, version 1
Created version '1' of model 'event_recommender_mutiuser'.


ModelInfo(artifact_path='recommender', flavors={'python_function': {'cloudpickle_version': '2.0.0', 'python_model': 'python_model.pkl', 'loader_module': 'mlflow.pyfunc.model', 'python_version': '3.8.8', 'env': 'conda.yaml'}}, model_uri='runs:/d1076159-8e9f-4bfb-8de5-f15c2be3afb3/recommender', model_uuid='b2d57331fbbf49ae9c25150f97dd8f1e', run_id='d1076159-8e9f-4bfb-8de5-f15c2be3afb3', saved_input_example_info=None, signature_dict={'inputs': '[{"name": "user_id", "type": "string"}, {"name": "event_id", "type": "string"}]', 'outputs': '[{"name": "user_id", "type": "string"}, {"name": "event_id", "type": "string"}, {"name": "score", "type": "float"}]'}, utc_time_created='2022-04-04 21:50:33.460556')

In [86]:
mlflow.end_run()

In [87]:
model = mlflow.pyfunc.load_model('models:/event_recommender_mutiuser/latest')

Running the `predict` function:

In [88]:
data = model.predict(test)

In [89]:
data

Unnamed: 0,user_id,event_id,score
0,00115a30-da72-4ff7-a19d-b8dfa370ed6b,1a9e233a-0b96-4031-8984-f66d527f31c5,1.001834
1,00115a30-da72-4ff7-a19d-b8dfa370ed6b,10069266-e0c0-40c7-9ed7-737f14036762,1.001782
2,00115a30-da72-4ff7-a19d-b8dfa370ed6b,0b7df7af-b8e2-4c0b-b27c-d3094b02dc84,1.001738
3,00115a30-da72-4ff7-a19d-b8dfa370ed6b,0f9e2891-373a-419f-8e8c-32ad21e03121,1.001644
4,00115a30-da72-4ff7-a19d-b8dfa370ed6b,09ce03da-e4a6-4129-bc0c-fa49bb0184d1,1.001571
...,...,...,...
6265,ffdfceac-37cd-484b-b3ab-b2aa0a76fb19,2b4c1b4e-9fa4-4a62-aebd-a6af36663cb4,1.003394
6266,ffdfceac-37cd-484b-b3ab-b2aa0a76fb19,216483f0-dc39-4269-a0e4-42d8689bd75e,1.002992
6267,ffdfceac-37cd-484b-b3ab-b2aa0a76fb19,10aaa832-7334-4e34-805c-87a64adcbc38,1.002810
6268,ffdfceac-37cd-484b-b3ab-b2aa0a76fb19,07987522-106e-4470-bdce-a346faa17bc8,1.002783
