# Putting models into production

One of the key aspects of a machine learning project is putting the models we build to work so that they have an impact on business processes. This can take several forms. Having an environment that guarantees which model is the right one to deploy is one of the keys to serving models at scale in most organizations. We will look at _manual_ ways of doing this, but it is worth knowing the best practices around _model serving_.

![mlflow](https://mlflow.org/docs/latest/_images/mlflow-deployment-overview.png)



## MLFlow

We will extend the exercise we did earlier with Comet to MLFlow deployed locally. MLFlow lets us run its service on our own machine, and that includes serving a model registered in our experiment server.

* https://mlflow.org/docs/latest/introduction/index.html

In [2]:
# !pip install mlflow

![cicloe](https://www.mlflow.org/docs/latest/_images/quickstart_tracking_overview.png)

Once it is installed, we can start the server so that it listens on port 5000. Open a terminal with the Python environment where you installed mlflow activated, and run:

```sh
mlflow ui
```

Do not close the terminal, because that stops the process. Open http://127.0.0.1:5000/ to reach the local interface. You can now configure your Python environment to use this server as the place where metrics and models are recorded.
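Before pointing Python at the server, it can help to confirm that something is actually listening on the port. The sketch below uses only the standard library; the function name `server_is_listening` is made up for this example, and a successful TCP connection only shows that the port is open, not that the process behind it is MLflow.

```python
import socket


def server_is_listening(host: str = "127.0.0.1", port: int = 5000,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError when nothing accepts the connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `server_is_listening()` with the defaults checks the address used in this section; if it returns `False`, the terminal running `mlflow ui` was probably closed.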

In [3]:
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

As we did with Comet, we can log whichever metrics we consider relevant for an experiment.

In [6]:
mlflow.set_experiment("check-localhost-connection")

with mlflow.start_run():
    mlflow.log_metric("foo", 1)
    mlflow.log_metric("bar", 2)

2024/07/26 11:14:43 INFO mlflow.tracking.fluent: Experiment with name 'check-localhost-connection' does not exist. Creating a new experiment.


Go back to the interface to see that a new experiment was registered along with its metrics. There is not much magic here: the data is simply written to folders in the directory we are working in (check the _mlruns_ and _mlartifacts_ folders).
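To make the "no magic" point concrete, the sketch below reads metric values straight from disk. It assumes the local file-based store lays runs out as `mlruns/<experiment_id>/<run_id>/metrics/<metric_name>`, with one `<timestamp> <value> <step>` line per logged value. That layout is an assumption about the file store and may differ between MLflow versions or when a database backend is used; the function name is invented for this example.

```python
from pathlib import Path


def read_logged_metrics(store_dir: str = "mlruns") -> dict:
    """Collect the last logged value of each metric, keyed by (experiment, run, metric)."""
    metrics = {}
    # Assumed layout: <store>/<experiment_id>/<run_id>/metrics/<metric_name>
    for metric_file in Path(store_dir).glob("*/*/metrics/*"):
        if not metric_file.is_file():
            continue
        lines = metric_file.read_text().splitlines()
        if not lines:
            continue
        # Assumed line format: "<timestamp_ms> <value> <step>"
        _timestamp, value, *_rest = lines[-1].split()
        run_dir = metric_file.parent.parent
        key = (run_dir.parent.name, run_dir.name, metric_file.name)
        metrics[key] = float(value)
    return metrics
```

Run from the directory where the server was started, `read_logged_metrics()` should list the `foo` and `bar` values logged above if the layout assumption holds. For anything beyond a quick look, the MLflow client API is the safer route.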

In [1]:
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

import mlflow
import mlflow.sklearn

with mlflow.start_run() as run:
    X, y = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    params = {"max_depth": 2, "random_state": 42}
    model = RandomForestRegressor(**params)
    model.fit(X_train, y_train)

    # Log parameters and metrics using the MLflow APIs
    mlflow.log_params(params)

    y_pred = model.predict(X_test)
    mlflow.log_metrics({"mse": mean_squared_error(y_test, y_pred)})

    # Log the sklearn model and register as version 1
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="sklearn-model",
        input_example=X_train,
        registered_model_name="sk-learn-random-forest-reg-model",
    )


We have just registered our first model: http://127.0.0.1:5000/#/models/sk-learn-random-forest-reg-model. We can attach additional information (tags) to describe what kind of model it is.

![modelo](https://mlflow.org/docs/latest/_images/model-alias-and-tags.png)

Any registered model is accessible as long as the MLFlow server is running. This lets us retrieve different versions of a model from one central place.

In [4]:
import mlflow.sklearn
from sklearn.datasets import make_regression

model_name = "sk-learn-random-forest-reg-model"
model_version = "1"

# Load the model from the Model Registry
model_uri = f"models:/{model_name}/{model_version}"
model = mlflow.sklearn.load_model(model_uri)

# Generate a new dataset for prediction and predict
X_new, _ = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)
y_pred_new = model.predict(X_new)

print(y_pred_new)

  from .autonotebook import tqdm as notebook_tqdm
Downloading artifacts: 100%|██████████| 6/6 [00:00<00:00, 302.76it/s]  

[ 16.36355607 -20.09258424   8.0136586    6.16919118  -1.81185423
   4.03116362 -24.95801449  68.78053495 -45.0766513   64.44760141
 -40.16931792 -25.54191065 -14.39985794 -38.0567874    8.05358765
 -25.73029816 -15.91990041 -10.99985266 -24.2475118  -32.70582446
  17.34781751  68.49980732  44.5541425   41.31593646  48.16602726
 -23.62019943  47.15590018  69.12741949  48.16602726  -0.26024544
 -28.49126919 -10.99985266  10.73067585 -10.61092056  -4.7324722
   2.76556278  58.93099448 -31.19567455 -35.55773052 -23.99366895
  48.16602726  13.34984948  12.56552213 -18.66808469 -32.70582446
 -39.30386685 -34.29680647  48.44675489 -33.40149961  20.35083862
 -15.0214084  -34.55064932  -2.28963784 -19.61227378   7.6979477
 -25.86538741 -11.95702358 -15.36598686   5.88539811 -30.23881739
 -25.47645531 -43.61170248 -43.7442754  -14.59055495 -40.16931792
 -32.70582446  -2.68114572  -5.39418041  16.15991316  -2.28963784
  41.662821    10.04512765  51.22797543 -23.09874036  10.04512765
  46.5774364
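The `models:/` URI used above can point at a numbered version or, in recent MLflow releases, at an alias such as the ones shown in the registry screenshot. The helper below is a small sketch of both forms; the function names and the `champion` alias are hypothetical, and the loader imports MLflow lazily so that the URI helper can be tried without a server.

```python
from typing import Optional, Union


def registry_model_uri(name: str,
                       version: Union[int, str, None] = None,
                       alias: Optional[str] = None) -> str:
    """Build a models:/ URI for either a version or an alias, never both."""
    if (version is None) == (alias is None):
        raise ValueError("pass exactly one of version or alias")
    if alias is not None:
        return f"models:/{name}@{alias}"
    return f"models:/{name}/{version}"


def load_registered_model(name, version=None, alias=None):
    """Load a scikit-learn model from the registry (needs a reachable server)."""
    import mlflow.sklearn  # imported here so the helper above works without MLflow

    return mlflow.sklearn.load_model(registry_model_uri(name, version, alias))


print(registry_model_uri("sk-learn-random-forest-reg-model", version=1))
print(registry_model_uri("sk-learn-random-forest-reg-model", alias="champion"))
```

Referring to an alias rather than a fixed version lets the serving side stay unchanged when a newer version is promoted, provided the alias has been assigned in the registry.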




Let's walk through a complete example. Our data scientist fetches the data and works their magic, finding a model that gives good results.

In [5]:
import pandas as pd
from mlflow.models import infer_signature

# Load dataset
data = pd.read_csv(
    "https://raw.githubusercontent.com/mlflow/mlflow/master/tests/datasets/winequality-white.csv",
    sep=";",
)

# Split the data into training, validation, and test sets
train, test = train_test_split(data, test_size=0.25, random_state=42)
train_x = train.drop(["quality"], axis=1).values
train_y = train[["quality"]].values.ravel()
test_x = test.drop(["quality"], axis=1).values
test_y = test[["quality"]].values.ravel()
train_x, valid_x, train_y, valid_y = train_test_split(
    train_x, train_y, test_size=0.2, random_state=42
)
signature = infer_signature(train_x, train_y)
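As a quick sanity check on the two splits above, the resulting set sizes can be worked out by hand. The sketch assumes the CSV has 4898 rows (an assumption worth confirming with `len(data)`) and mirrors what is, as far as I understand it, scikit-learn's rounding for a fractional `test_size`: the test part is rounded up and the rest goes to training.

```python
import math


def split_sizes(n_samples: int, test_size: float) -> tuple:
    """Return (n_train, n_test) for a fractional test_size, rounding the test part up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


n_rows = 4898  # assumed row count of winequality-white.csv
n_train_full, n_test = split_sizes(n_rows, 0.25)
n_train, n_valid = split_sizes(n_train_full, 0.2)
steps_per_epoch = math.ceil(n_train / 64)  # batch_size=64 is used for training below

print(n_train, n_valid, n_test, steps_per_epoch)
```

Under these assumptions the training set has 2938 rows, which gives 46 batches of 64 per epoch and agrees with the `46/46` counter in the Keras progress output of this section.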

[Hyperopt](https://hyperopt.github.io/hyperopt/) lets us search over a model's hyperparameters efficiently and in a distributed way. This matters a great deal when we need to train heavy models, such as neural networks, at scale.

In [None]:
#!pip install hyperopt

In [15]:
import keras
import numpy as np
from hyperopt import STATUS_OK

def train_model(params, epochs, train_x, train_y, valid_x, valid_y, test_x, test_y):
    # Define model architecture
    mean = np.mean(train_x, axis=0)
    var = np.var(train_x, axis=0)
    model = keras.Sequential(
        [
            keras.Input([train_x.shape[1]]),
            keras.layers.Normalization(mean=mean, variance=var),
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(1),
        ]
    )

    # Compile model
    model.compile(
        optimizer=keras.optimizers.SGD(
            learning_rate=params["lr"], momentum=params["momentum"]
        ),
        loss="mean_squared_error",
        metrics=[keras.metrics.RootMeanSquaredError()],
    )

    # Train model with MLflow tracking
    with mlflow.start_run(nested=True):
        model.fit(
            train_x,
            train_y,
            validation_data=(valid_x, valid_y),
            epochs=epochs,
            batch_size=64,
        )
        # Evaluate the model
        eval_result = model.evaluate(valid_x, valid_y, batch_size=64)
        eval_rmse = eval_result[1]

        # Log parameters and results
        mlflow.log_params(params)
        mlflow.log_metric("eval_rmse", eval_rmse)

        # Log model
        mlflow.tensorflow.log_model(model, "model", signature=signature)

        return {"loss": eval_rmse, "status": STATUS_OK, "model": model}

The objective function, as in any optimization process, tells us how good each change to the parameters is. Here the parameters are the hyperparameters of our training run (learning rate and momentum).

In [16]:
def objective(params):
    # MLflow will track the parameters and results for each run
    result = train_model(
        params,
        epochs=3,
        train_x=train_x,
        train_y=train_y,
        valid_x=valid_x,
        valid_y=valid_y,
        test_x=test_x,
        test_y=test_y,
    )
    return result
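To build some intuition for what a search does with an objective like this one, here is a minimal sketch using plain random search from the standard library. It is not Hyperopt's TPE algorithm, and `toy_objective` is a made-up stand-in for the validation loss; only the ranges for `lr` (log-uniform between 1e-5 and 1e-1) and `momentum` (uniform between 0 and 1) follow the ones used in this section.

```python
import math
import random


def sample_params(rng: random.Random) -> dict:
    """Draw one candidate: lr log-uniform in [1e-5, 1e-1], momentum uniform in [0, 1]."""
    return {
        # Uniform in log space, so every decade of lr is equally likely
        "lr": math.exp(rng.uniform(math.log(1e-5), math.log(1e-1))),
        "momentum": rng.uniform(0.0, 1.0),
    }


def toy_objective(params: dict) -> float:
    """Hypothetical loss surface with its minimum at lr=1e-2, momentum=0.5."""
    return (math.log10(params["lr"]) + 2.0) ** 2 + (params["momentum"] - 0.5) ** 2


def random_search(objective, n_evals: int = 8, seed: int = 0):
    """Evaluate n_evals random candidates and return the best (params, loss) pair."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_evals):
        params = sample_params(rng)
        trials.append((params, objective(params)))
    return min(trials, key=lambda trial: trial[1])


best_params, best_loss = random_search(toy_objective)
print(best_params, best_loss)
```

Hyperopt's `tpe.suggest` replaces the blind sampling step with a model of which regions have looked promising so far, which is why it tends to need fewer evaluations than random search when each evaluation is an expensive training run.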

In [17]:
from hyperopt import Trials, fmin, hp, tpe

space = {
    "lr": hp.loguniform("lr", np.log(1e-5), np.log(1e-1)),
    "momentum": hp.uniform("momentum", 0.0, 1.0),
}

mlflow.set_experiment("/wine-quality")
with mlflow.start_run():
    # Conduct the hyperparameter search using Hyperopt
    trials = Trials()
    best = fmin(
        fn=objective,
        space=space,
        algo=tpe.suggest,
        max_evals=8,
        trials=trials,
    )

    # Fetch the details of the best run
    best_run = sorted(trials.results, key=lambda x: x["loss"])[0]

    # Log the best parameters, loss, and model
    mlflow.log_params(best)
    mlflow.log_metric("eval_rmse", best_run["loss"])
    mlflow.tensorflow.log_model(best_run["model"], "model", signature=signature)

    # Print out the best parameters and corresponding loss
    print(f"Best parameters: {best}")
    print(f"Best eval rmse: {best_run['loss']}")

  0%|          | 0/8 [00:00<?, ?trial/s, best loss=?]

Epoch 1/3

 1/46 ━━━━━━━━━━━━━━━━━━━━ 19s 433ms/step - loss: 35.3525 - root_mean_squared_error: 5.9458
46/46 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 35.5836 - root_mean_squared_error: 5.9652
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - loss: 35.5854 - root_mean_squared_error: 5.9653 - val_loss: 35.6236 - val_root_mean_squared_error: 5.9686

Epoch 2/3

 1/46 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step - loss: 36.8871 - root_mean_squared_error: 6.0735
46/46 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 35.2757 - root_mean_squared_error: 5.9393 - val_loss: 35.1616 - val_root_mean_squared_error: 5.9297

Epoch 3/3

 1/46 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step - loss: 32.0520 - root_mean_squared_error: 5.66

Our best RMSE is 0.79, obtained with these parameters:

* learning-rate: 0.01909
* momentum: 0.32

**NOTE**: Your parameters may differ slightly.

Check in the MLFlow interface whether that is the case.

In [18]:
mlflow.set_experiment("/wine-quality")
with mlflow.start_run():
    # Conduct the hyperparameter search using Hyperopt
    trials = Trials()
    best = fmin(
        fn=objective,
        space=space,
        algo=tpe.suggest,
        max_evals=8,
        trials=trials,
    )

    # Fetch the details of the best run
    best_run = sorted(trials.results, key=lambda x: x["loss"])[0]

    # Log the best parameters, loss, and model
    mlflow.log_params(best)
    mlflow.log_metric("eval_rmse", best_run["loss"])
    mlflow.tensorflow.log_model(best_run["model"], "model", signature=signature)

    # Print out the best parameters and corresponding loss
    print(f"Best parameters: {best}")
    print(f"Best eval rmse: {best_run['loss']}")

Epoch 1/3

 1/46 ━━━━━━━━━━━━━━━━━━━━ 20s 465ms/step - loss: 34.0935 - root_mean_squared_error: 5.8390
46/46 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 11.9452 - root_mean_squared_error: 3.3233
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - loss: 11.8062 - root_mean_squared_error: 3.3021 - val_loss: 1.6276 - val_root_mean_squared_error: 1.2758

Epoch 2/3

 1/46 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step - loss: 0.6805 - root_mean_squared_error: 0.8249
46/46 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 1.3373 - root_mean_squared_error: 1.1552 - val_loss: 1.2218 - val_root_mean_squared_error: 1.1053

Epoch 3/3

 1/46 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step - loss: 1.2790 - root_mean_squared_error: 1.1309

If we are happy with a particular model, we can go ahead and register it:

![register](https://www.mlflow.org/docs/latest/_images/register_model_button.png)

Once that is done, it is easy to start the process that serves the model from the terminal. First, set the URL of the tracking server in an environment variable:

```sh
export MLFLOW_TRACKING_URI=http://localhost:5000
```

To manage the environment, MLflow may also ask you to install [pyenv](https://github.com/pyenv/pyenv) and virtualenv (`!pip install virtualenv`).

Once your machine is set up, serving the model comes down to running the following command:

```sh
mlflow models serve -m "models:/<model name>/1" --port 5002
```
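The same two steps, setting the tracking URI and launching the server, can be prepared from Python. This is only a sketch: the helper names and the model name `my-wine-model` are hypothetical, and it assumes your MLflow version accepts `--env-manager local`, which (as far as I know) tells it to reuse the current environment instead of building one with pyenv and virtualenv.

```python
import os
import shlex


def serve_command(model_name: str, version: int = 1, port: int = 5002,
                  env_manager: str = "local") -> list:
    """Build the argument list for `mlflow models serve`."""
    return [
        "mlflow", "models", "serve",
        "-m", f"models:/{model_name}/{version}",
        "--port", str(port),
        "--env-manager", env_manager,  # assumed to be supported by the installed version
    ]


def serve_environment(tracking_uri: str = "http://localhost:5000") -> dict:
    """Copy the current environment and point MLflow at the tracking server."""
    env = dict(os.environ)
    env["MLFLOW_TRACKING_URI"] = tracking_uri
    return env


cmd = serve_command("my-wine-model")  # hypothetical registered model name
print(shlex.join(cmd))
# To launch it: subprocess.Popen(cmd, env=serve_environment())
```

Passing the variable through `env=` keeps the setting local to the serving process, so the notebook's own environment is left untouched.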

In [21]:
import requests

url_modelo = "http://localhost:5002/invocations"

json_data = {
    "dataframe_split": {
        "columns": [
            "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
            "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
            "pH", "sulphates", "alcohol",
        ],
        "data": [[7, 0.27, 0.36, 20.7, 0.045, 45, 170, 1.001, 3, 0.45, 8.8]],
    }
}
headers = {"Content-Type": "application/json"}

response = requests.post(url=url_modelo, headers=headers, json=json_data)
print(response.status_code)

200


In [22]:
response.content

b'{"predictions": [[6.128170013427734]]}'
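When scoring more than one row, it is easy to send rows that do not match the column list. The sketch below builds the same `dataframe_split` body used in the request above after checking each row's width, and pulls the number out of a response shaped like the one just shown. Both helper names are invented for this example.

```python
import json


def build_split_payload(columns, rows) -> dict:
    """Return a dataframe_split request body after checking every row's width."""
    for i, row in enumerate(rows):
        if len(row) != len(columns):
            raise ValueError(
                f"row {i} has {len(row)} values, expected {len(columns)}"
            )
    return {
        "dataframe_split": {
            "columns": list(columns),
            "data": [list(row) for row in rows],
        }
    }


def first_prediction(body: bytes) -> float:
    """Extract the first value from a response shaped like {"predictions": [[value]]}."""
    return json.loads(body)["predictions"][0][0]


print(first_prediction(b'{"predictions": [[6.128170013427734]]}'))
```

The resulting dictionary can be passed to `requests.post(..., json=payload)` exactly as `json_data` is above.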