## Introduction to modelling with RUMnet

In this notebook, we reproduce the results of the paper *Representing Random Utility Models with Neural Networks* on the SwissMetro dataset.

In [None]:
import os

# Choose which GPU is visible; set to "" to hide all GPUs and run on CPU
os.environ["CUDA_VISIBLE_DEVICES"] = "0"


import sys

sys.path.append("../")

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

from choice_learn.data import ChoiceDataset
from choice_learn.models import RUMnet
from choice_learn.datasets import load_swissmetro

Note that there are two implementations of RUMnet: one more CPU-oriented and one more GPU-oriented.
The appropriate one is imported automatically, but you can also import either model directly:

```python
from choice_learn.models import CPURUMnet, GPURUMnet
```

First, we download the SwissMetro dataset:

In [None]:
df = load_swissmetro(as_frame=True)
df.head()

We follow the same data preparation as in the original paper in order to get the exact same results.
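
As a rough illustration of the array layout this preparation produces (toy numbers, not actual SwissMetro rows), stacking one block of features per item along `axis=1` gives an array of shape `(n_choices, n_items, n_features)`. Each feature can then be rescaled by slicing the last axis. The divisors mirror the ones used in this notebook; the values and variable names are made up for the example.

```python
import numpy as np

# Toy per-item feature blocks for 2 choices: columns are (travel time, cost, headway).
train = np.array([[112.0, 48.0, 120.0], [103.0, 48.0, 30.0]])
sm = np.array([[63.0, 52.0, 20.0], [60.0, 49.0, 10.0]])
car = np.array([[117.0, 65.0, 0.0], [117.0, 84.0, 0.0]])

# Stack along axis=1: axis 0 indexes choices, axis 1 items, axis 2 features.
items_features = np.stack([train, sm, car], axis=1)
print(items_features.shape)

# Rescale each feature across all choices and items at once.
items_features[:, :, 0] /= 1000  # travel time
items_features[:, :, 1] /= 5000  # cost
items_features[:, :, 2] /= 100   # headway
```

The in-place division only behaves as intended on a float array; on an integer array the assignment would truncate the results.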


In [None]:
# Drop rows without a recorded choice (CHOICE == 0), then make choices 0-indexed
df = df.loc[df.CHOICE != 0]
choices = df.CHOICE.to_numpy() - 1
contexts_items_availabilities = df[["TRAIN_AV", "SM_AV", "CAR_AV"]].to_numpy()
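# Cast to float so the in-place rescaling below is not truncated
# if the columns happen to be integer-typed
contexts_items_features = np.stack([df[["TRAIN_TT", "TRAIN_CO", "TRAIN_HE"]].to_numpy(),
                                    df[["SM_TT", "SM_CO", "SM_HE"]].to_numpy(),
                                    df[["CAR_TT", "CAR_CO", "CAR_HE"]].to_numpy()],
                                   axis=1).astype(float)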
# contexts_features = df[["GROUP", "PURPOSE", "FIRST", "TICKET", "WHO", "LUGGAGE", "AGE", "MALE",
#                         "INCOME", "GA", "ORIGIN", "DEST"]].to_numpy()
fixed_items_features = np.eye(3)

contexts_items_features[:, :, 0] = contexts_items_features[:, :, 0] / 1000
contexts_items_features[:, :, 1] = contexts_items_features[:, :, 1] / 5000
contexts_items_features[:, :, 2] = contexts_items_features[:, :, 2] / 100

# Transform the categorical columns into one-hot encoded columns
categorical_columns = ["GROUP", "PURPOSE", "FIRST", "TICKET", "WHO", "LUGGAGE",
                       "AGE", "MALE", "INCOME", "GA", "ORIGIN", "DEST"]
long_data = pd.get_dummies(df, columns=categorical_columns, drop_first=False)

# Keep the one-hot columns, i.e. those whose name starts with a categorical prefix
contexts_columns = [col for col in long_data.columns
                    if col.startswith(tuple(categorical_columns))]

contexts_features = long_data[contexts_columns].to_numpy()

Now, we can create our ChoiceDataset from the dataframe.
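
Before building the dataset, it can help to check that the arrays agree with each other. The sketch below uses random toy arrays in place of the real ones. The shape conventions in the comments are assumptions inferred from how the arrays are built in this notebook, not a statement of the `ChoiceDataset` specification.

```python
import numpy as np

n_choices, n_items = 4, 3
rng = np.random.default_rng(0)

# Toy stand-ins for the arrays built above (random values, for illustration only).
fixed_items_features = np.eye(n_items)                         # (n_items, n_fixed_features)
contexts_features = rng.random((n_choices, 5))                 # (n_choices, n_context_features)
contexts_items_features = rng.random((n_choices, n_items, 3))  # (n_choices, n_items, n_features)
contexts_items_availabilities = np.ones((n_choices, n_items))  # 1 if the item can be chosen
choices = rng.integers(0, n_items, size=n_choices)             # index of the chosen item

# Per-choice arrays share their first axis; per-item arrays agree on n_items.
shapes_ok = (
    contexts_features.shape[0] == n_choices
    and contexts_items_features.shape[:2] == (n_choices, n_items)
    and contexts_items_availabilities.shape == (n_choices, n_items)
    and fixed_items_features.shape[0] == n_items
    and choices.shape == (n_choices,)
)

# Each chosen item should be flagged as available in its own row.
chosen_available = contexts_items_availabilities[np.arange(n_choices), choices] == 1
print(shapes_ok, bool(chosen_available.all()))
```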

In [None]:
dataset = ChoiceDataset(fixed_items_features=(fixed_items_features.astype("float32"), ),
                        contexts_features=(contexts_features.astype("float32"), ),
                        contexts_items_features=(contexts_items_features.astype("float32"), ),
                        contexts_items_availabilities=contexts_items_availabilities,
                        choices=choices)

Let's cross-validate!
We keep a scikit-learn-like structure.
To avoid adding a dependency, we use our own train/test split code, but the following would work just as well:


```python
from sklearn.model_selection import ShuffleSplit

rs = ShuffleSplit(n_splits=5, test_size=.2, random_state=0)

for i, (train_index, test_index) in enumerate(rs.split(dataset.choices)):
    train_dataset = dataset[train_index]
    test_dataset = dataset[test_index]

    model = RUMnet(**model_args)  # model_args is defined in the next cell
    model.instantiate()
    model.fit(train_dataset)
    model.evaluate(test_dataset)
```

We use a NumPy-based split instead, but the core code is the same!
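
A minimal sketch of such a NumPy-only split, assuming a simple K-fold partition over a shuffled index range. `kfold_indexes` is a hypothetical helper written for this illustration and is not part of choice-learn.

```python
import numpy as np


def kfold_indexes(n_samples, n_folds=5, seed=0):
    """Yield (train_indexes, test_indexes) pairs over a shuffled range (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    indexes = rng.permutation(n_samples)
    # Fold boundaries; the last one equals n_samples so no sample is left out.
    bounds = [int(n_samples * k / n_folds) for k in range(n_folds + 1)]
    for start, stop in zip(bounds[:-1], bounds[1:]):
        test_indexes = indexes[start:stop]
        train_indexes = np.concatenate([indexes[:start], indexes[stop:]])
        yield train_indexes, test_indexes


folds = list(kfold_indexes(n_samples=10, n_folds=5))
for train_indexes, test_indexes in folds:
    print(len(train_indexes), len(test_indexes))
```

Unlike `ShuffleSplit`, which draws each split independently, this partition places every sample in exactly one test fold.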

In [None]:
model_args = {
    "num_products_features": 6,
    "num_customer_features": 83,
    "width_eps_x": 20,
    "depth_eps_x": 5,
    "heterogeneity_x": 10,
    "width_eps_z": 20,
    "depth_eps_z": 5,
    "heterogeneity_z": 10,
    "width_u": 20,
    "depth_u": 5,
    "optimizer": "Adam",
    "lr": 0.0002,
    "logmin": 1e-10,
    "label_smoothing": 0.02,
    "callbacks": [],
    "epochs": 150,
    "batch_size": 128,
    "tol": 1e-5,
}

In [None]:
indexes = np.random.permutation(len(dataset))

fit_losses = []
test_eval = []
for i in range(5):
    test_indexes = indexes[int(len(indexes) * 0.2 * i):int(len(indexes) * 0.2 * (i + 1))]
    train_indexes = np.concatenate([indexes[:int(len(indexes) * 0.2 * i)],
                                    indexes[int(len(indexes) * 0.2 * (i + 1)):]],
                                   axis=0)

    train_dataset = dataset[train_indexes]
    test_dataset = dataset[test_indexes]

    model = RUMnet(**model_args)
    model.instantiate()

    losses = model.fit(train_dataset, val_dataset=test_dataset)
    probas = model.predict_probas(test_dataset)
    # Cross-entropy on the test fold, i.e. the average negative log-likelihood
    fold_eval = tf.keras.losses.CategoricalCrossentropy(from_logits=False)(
        y_true=tf.one_hot(test_dataset.choices, 3), y_pred=probas
    )
    test_eval.append(float(fold_eval))
    print(test_eval)

    fit_losses.append(losses)

In [None]:
colors = ["r", "g", "b", "cyan", "purple"]
for i, losses in enumerate(fit_losses):
    plt.plot(losses["train_loss"], label=f"fold {i}", c=colors[i])
    plt.plot(losses["test_loss"], c=colors[i])
plt.legend()

In [None]:
print("Average negative log-likelihood on test:", np.mean(test_eval))
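
The value reported here is a categorical cross-entropy. With one-hot labels, it equals the average negative log-likelihood of the observed choices, so lower is better. The sketch below checks this equivalence on toy probabilities. `mean_negative_log_likelihood` is a hypothetical helper, and the clipping constant is an arbitrary choice for the example.

```python
import numpy as np


def mean_negative_log_likelihood(probas, choices, eps=1e-10):
    """Average of -log(probability assigned to the chosen item) (hypothetical helper)."""
    probas = np.asarray(probas, dtype=float)
    choices = np.asarray(choices, dtype=int)
    # Pick, for each row, the probability predicted for the item actually chosen.
    chosen = probas[np.arange(len(choices)), choices]
    # Clip to avoid log(0) when a chosen item received zero probability.
    return float(-np.mean(np.log(np.clip(chosen, eps, 1.0))))


probas = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
choices = np.array([0, 1])
nll = mean_negative_log_likelihood(probas, choices)

# Same quantity written as a cross-entropy against one-hot labels.
one_hot = np.eye(3)[choices]
cross_entropy = float(-np.mean(np.sum(one_hot * np.log(probas), axis=1)))
print(nll, cross_entropy)
```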