# TasteNet-MNL 

This notebook aims to replicate the experiments of the authors of the TasteNet-MNL paper using the Python library Choice-Learn.

Advantages of using the Choice-Learn implementation over the model authors' own implementations:

- Scalability

- Focus on large volumes of data

- Maintenance

- Documentation

- Generalization

# TasteNet

The TasteNet model is available in Choice-Learn. Here is a small example of how it can be used.\
Following the paper, we will use it on the SwissMetro [2] dataset.

### Summary
- [Data Loading](#data-loading)
- [Model Parametrization](#model-parametrization)
- [Model Estimation](#model-estimation)
- [Estimated Tastes Analysis](#estimated-tastes-analysis)
- [References](#references)

In [None]:
import numpy as np
import pandas as pd

from choice_learn.datasets import load_swissmetro
from choice_learn.models.tastenet import TasteNet

In [None]:
# preprocessing="tastenet" formats the data just like in the paper
customers_id, dataset = load_swissmetro(preprocessing="tastenet", as_frame=False)

In [None]:
print("Items Features:", dataset.items_features_by_choice_names)
print("Shared Features:", dataset.shared_features_by_choice_names)

### Model Parametrization

The items in the dataset are ordered: "TRAIN", "SM" and "CAR". We can now set the hyperparameters of the TasteNet model.
- **taste_net_layers:** list of the number of neurons for each layer of the taste neural network
- **taste_net_activation:** activation function used inside the taste neural network
- **items_features_by_choice_parametrization:** parametrization of the estimated coefficients for the items features.

TasteNet uses the customer features (shared_features_by_choice) to estimate different coefficients that are multiplied with the alternative features (items_features_by_choice) to estimate the utility:
$$ U(\text{alternative}) = \sum_{i \in \text{alternative features}} f(NN_i(\text{customer features})) \cdot i$$

With $f$ a normalization function that can be used to set some constraints such as positivity.

**items_features_by_choice_parametrization** describes the parametrization of each alternative feature and must therefore have the same shape, (3, 7) in our case. The indexes must also match.
- if the parameter is a float, the value is used directly to multiply the corresponding feature.
- if the parameter is a string, it indicates which function $f$ to use, meaning that the taste neural network estimates a parameter that is then passed through $f$.
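
The float-or-string rule can be sketched in a few lines of NumPy. This is an illustrative toy, not Choice-Learn's implementation: the helper `utilities`, the `ACTIVATIONS` table, the feature values and the `nn_outputs` array (standing in for the taste network output for one customer) are all hypothetical. The `"-exp"` mapping is assumed to be $x \mapsto -e^{-x}$, matching the utility formulas of this notebook, and the row-major ordering of network outputs is an assumption made for the sketch.

```python
import numpy as np

# Assumed activation definitions (hypothetical stand-ins, not the library's code).
ACTIVATIONS = {
    "linear": lambda x: x,
    "-exp": lambda x: -np.exp(-x),
}

def utilities(parametrization, items_features, nn_outputs):
    """Compute one utility per alternative for a single customer.

    parametrization: nested list of floats or activation names, shape (n_items, n_features)
    items_features: array-like of shape (n_items, n_features)
    nn_outputs: 1-D array, one value per string entry, consumed in row-major order
    """
    items_features = np.asarray(items_features, dtype=float)
    coefficients = np.zeros_like(items_features)
    k = 0  # index of the next taste-network output to consume
    for i, row in enumerate(parametrization):
        for j, param in enumerate(row):
            if isinstance(param, str):
                # string: coefficient is f(NN_k(customer features))
                coefficients[i, j] = ACTIVATIONS[param](nn_outputs[k])
                k += 1
            else:
                # float: fixed coefficient used as is
                coefficients[i, j] = param
    return (coefficients * items_features).sum(axis=1)

# Toy 2 x 3 case; columns are (cost, travel time, ASC), values are made up.
parametrization = [[-1.0, "-exp", "linear"],
                   [-1.0, "-exp", 0.0]]
items_features = [[2.0, 1.0, 1.0],
                  [3.0, 0.5, 0.0]]
nn_outputs = np.array([0.0, 0.5, 0.0])

u = utilities(parametrization, items_features, nn_outputs)
print(u)
```

With these toy numbers the first alternative gets $-2 - e^{0} \cdot 1 + 0.5 = -2.5$ and the second $-3 - e^{0} \cdot 0.5 = -3.5$.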

In [None]:
taste_net_layers = []
taste_net_activation = "relu"
# One row per alternative, one column per item feature: shape (3, 7)
items_features_by_choice_parametrization = [
    [-1., "-exp", "-exp", 0., "linear", 0., 0.],        # train
    [-1., "-exp", "-exp", "linear", 0., "linear", 0.],  # sm
    [-1., "-exp", 0., 0., 0., 0., 0.],                  # car
]

In this example from the paper, the utilities defined by *items_features_by_choice_parametrization* are the following:

With $\mathcal{C}$ the customer features and $NN_k$ the output of the taste embedding neural network:
$$
U(train) = -1 \cdot train_{CO} - e^{-NN_1(\mathcal{C})} \cdot train_{TT} - e^{-NN_2(\mathcal{C})} \cdot train_{HE} + NN_3(\mathcal{C}) \cdot ASC_{train}
$$

$$
U(sm) = -1 \cdot sm_{CO} - e^{-NN_4(\mathcal{C})} \cdot sm_{TT} - e^{-NN_5(\mathcal{C})} \cdot sm_{HE} + NN_6(\mathcal{C}) \cdot sm_{SEATS} + NN_7(\mathcal{C}) \cdot ASC_{sm}
$$

$$
U(car) = -1 \cdot car_{CO} - e^{-NN_8(\mathcal{C})} \cdot car_{TT} 
$$

To evaluate the model, we use a cross-validation scheme. The split must account for the fact that the same person answered several times and therefore appears several times in the dataset. We use a grouped k-fold strategy (scikit-learn's `GroupKFold`), meaning that all the answers of one person fall in the same test fold.
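
To make the grouping constraint concrete, here is a minimal sketch of a leakage check in plain Python. The helper `check_no_group_leakage`, the respondent ids and the index lists are hypothetical and only illustrate the property; they are not part of Choice-Learn or scikit-learn.

```python
def check_no_group_leakage(train_idx, test_idx, groups):
    """Return True when no group id appears on both sides of the split."""
    train_groups = {groups[i] for i in train_idx}
    test_groups = {groups[i] for i in test_idx}
    return train_groups.isdisjoint(test_groups)

# Toy data: three respondents, each with several answers (ids are made up).
groups = [101, 101, 101, 202, 202, 303, 303, 303]

# A grouped split keeps each respondent entirely on one side.
grouped_train, grouped_test = [0, 1, 2, 3, 4], [5, 6, 7]
# A plain row-wise split can cut a respondent in two.
naive_train, naive_test = [0, 1, 2, 3, 5, 6], [4, 7]

print(check_no_group_leakage(grouped_train, grouped_test, groups))  # True
print(check_no_group_leakage(naive_train, naive_test, groups))      # False
```

The same check could be applied to the `train` and `test` indices and `customers_id` inside the cross-validation loop as a sanity test.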

### Model estimation

In [None]:
from sklearn.model_selection import GroupKFold

folds_history = []
folds_test_nll = []
gkf = GroupKFold(n_splits=5)
# pass customers_id as groups so that all answers of a customer stay in the same fold
for train, test in gkf.split(list(range(len(dataset))), list(range(len(dataset))), customers_id): 
    tastenet = TasteNet(taste_net_layers=taste_net_layers,
                    taste_net_activation=taste_net_activation,
                    items_features_by_choice_parametrization=items_features_by_choice_parametrization,
                    optimizer="Adam",
                    epochs=40,
                    lr=0.001,
                    batch_size=32)
    train_dataset, test_dataset = dataset[train], dataset[test]
    hist = tastenet.fit(train_dataset, val_dataset=test_dataset)
    folds_history.append(hist)
    folds_test_nll.append(tastenet.evaluate(test_dataset))

We need to watch for overfitting. The plot below shows the train and test losses of each fold over the fitting epochs:

In [None]:
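import matplotlib.pyplot as plt

colors = ["darkblue", "slateblue", "mediumpurple", "violet", "hotpink"]
for fold, (hist, color) in enumerate(zip(folds_history, colors)):
    # solid line: training loss, dotted line: loss on the held-out fold
    plt.plot(hist["train_loss"], c=color, label=f"fold {fold} train")
    plt.plot(hist["test_loss"], c=color, linestyle="dotted", label=f"fold {fold} test")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()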

In [None]:
print("Average NegativeLogLikelihood on testing set:", np.mean(folds_test_nll))
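
For reference, the metric can be reproduced by hand on toy numbers. The sketch below assumes the usual multinomial logit definition (softmax over utilities, then the average negative log-probability of the chosen alternative) and assumes that `evaluate` reports a per-choice average. The functions `mnl_probabilities` and `mean_nll` and the toy utilities are hypothetical and not taken from Choice-Learn.

```python
import numpy as np

def mnl_probabilities(utilities):
    """Softmax over alternatives, one row per choice situation."""
    # Subtract the row maximum for numerical stability.
    shifted = utilities - utilities.max(axis=1, keepdims=True)
    exp_u = np.exp(shifted)
    return exp_u / exp_u.sum(axis=1, keepdims=True)

def mean_nll(utilities, choices):
    """Average negative log-likelihood of the observed choices."""
    probs = mnl_probabilities(utilities)
    chosen = probs[np.arange(len(choices)), choices]
    return float(-np.log(chosen).mean())

# Two made-up choice situations with three alternatives each.
utilities = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0]])
choices = np.array([2, 0])

nll = mean_nll(utilities, choices)
uniform_nll = mean_nll(np.zeros((4, 3)), np.array([0, 1, 2, 0]))
print(nll, uniform_nll)
```

A model that assigns equal probability to three available alternatives scores $\ln 3 \approx 1.0986$, which gives a simple baseline to compare the cross-validated value against.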

### Estimated Tastes Analysis

In order to analyze the model, one can look at the average output of the taste network.
It is possible to reach the taste network with *tastenet.taste_params_module* or to call *tastenet.predict_tastes*.

In [None]:
for (item_index, feature_index), nn_output_index in tastenet.items_features_to_weight_index.items():
    print("Alternative:", ["train", "sm", "car"][item_index])
    print("Feature:", dataset.items_features_by_choice_names[0][feature_index])
    print("Average value over dataset:")
    act = tastenet.get_activation_function(items_features_by_choice_parametrization[item_index][feature_index])
    print(np.mean(act(tastenet.predict_tastes(dataset.shared_features_by_choice[0])[:, nn_output_index])))
    print("----------------------------\n")