# Assortment Example
A short example for assortment optimization under the conditional MNL.

In [None]:
# Importing the right base libraries
import os
# Remove GPU use
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import sys
sys.path.append("../")

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

We will use the TaFeng dataset, which is available on [Kaggle](https://www.kaggle.com/datasets/chiranjivdas09/ta-feng-grocery-dataset). You can load it automatically with Choice-Learn!

In [None]:
from choice_learn.datasets.examples import load_tafeng

In [None]:
# Short illustration of the dataset
tafeng_df = load_tafeng(as_frame=True)
tafeng_df.head()

In this example we will use the `sales_price` and `age_group` features to estimate a discrete choice model in the form of a conditional MNL.

For a customer $z$ and a product $i$, we define the utility function:

$$U(i, z) = u_i + e_{dem(z)} \cdot p_i$$

with:
- $u_i$ the base utility of product $i$
- $p_i$ the price of product $i$
- $e_{dem(z)}$ the price elasticity of customer $z$, which depends on their age group

We estimate three price elasticity coefficients, one per age group: customers aged 25 or under, 26 to 55, and 56 or over.
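
To make the formula concrete, here is a minimal NumPy sketch with made-up numbers (the ages, prices, and coefficient values are illustrative assumptions, not estimates from the dataset). It maps raw ages to the three groups, one-hot encodes them, and evaluates $U(i, z)$ for every customer and product:

```python
import numpy as np

# Illustrative values only: 4 products, 3 age groups.
base_utilities = np.array([0.5, 0.2, -0.1, 0.0])   # u_i
prices = np.array([1.0, 2.0, 0.5, 3.0])            # p_i
price_elasticities = np.array([-0.8, -0.5, -0.3])  # e_dem, one per age group

ages = np.array([19, 25, 26, 40, 55, 56, 71])

# Group index: 0 for <=25, 1 for 26..55, 2 for >=56.
group_idx = np.digitize(ages, bins=[25, 55], right=True)
# One-hot encoding, shape (n_customers, 3).
age_one_hot = np.eye(3)[group_idx]

# The one-hot row selects each customer's coefficient, shape (n_customers, 1).
customer_elasticity = age_one_hot @ price_elasticities[:, None]

# U(i, z) = u_i + e_dem(z) * p_i, shape (n_customers, n_products).
utilities = base_utilities[None, :] + customer_elasticity * prices[None, :]
print(group_idx)
print(utilities.shape)
```

The one-hot product is the same trick the custom model relies on: multiplying the encoded age group by the coefficient vector picks out exactly one coefficient per customer.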

In [None]:
# Let's reload the TaFeng dataset as a Choice Dataset
dataset = load_tafeng(as_frame=False, preprocessing="assort_example")

# The age categories are encoded as OneHot features:
print("Age Categories Encoding for choices 0, 4 and 16:")
print(dataset.contexts_features[0][[0, 4, 16]])

Let's define a custom model that fits our formulation by inheriting from Choice-Learn's ChoiceModel:

In [None]:
import tensorflow as tf

from choice_learn.models.base_model import ChoiceModel


class TaFengMNL(ChoiceModel):
    """Custom model for the TaFeng dataset."""

    def __init__(self, **kwargs):
        """Instantiation of our custom model."""
        # Standard inheritance stuff
        super().__init__(**kwargs)

        # Instantiation of base utilities weights
        # We have 25 items in the dataset making 25 weights
        self.base_utilities = tf.Variable(
            tf.random_normal_initializer(0.0, 0.02, seed=42)(shape=(1, 25))
        )
        # Instantiation of price elasticities weights
        # We have 3 age categories making 3 weights
        self.price_elasticities = tf.Variable(
            tf.random_normal_initializer(0.0, 0.02, seed=42)(shape=(1, 3))
        )
        # Don't forget to add the weights to be optimized in self.weights!
        self.weights = [self.base_utilities, self.price_elasticities]

    def compute_batch_utility(
        self,
        fixed_items_features,
        contexts_features,
        contexts_items_features,
        contexts_items_availabilities,
        choices,
    ):
        """Define the utility function of the model.

        It uses the standard ChoiceModel signature, which we adapt to our use case.

        Parameters
        ----------
        fixed_items_features : tf.Tensor
            Fixed features of the items in the choice set. We do not have any here.
        contexts_features : tf.Tensor
            Features of the contexts. Here we have the customer age categories.
        contexts_items_features : tf.Tensor
            Features of the items in the choice set. Item prices in our case.
        contexts_items_availabilities : tf.Tensor
            Availabilities of the items in the choice set. All items are always
            available in the dataset, making it irrelevant here.
        choices : tf.Tensor
            Choices made by the customers. Not relevant in utility computation here.

        Returns
        -------
        tf.Tensor
            Utilities for each item in each choice set.
        """
        # Unused arguments
        _ = (fixed_items_features, contexts_items_availabilities, choices)

        # Get the right price elasticity coefficient according to the age category
        # Shapes: (batch, 3) x (3, 1) -> (batch, 1)
        price_coeffs = tf.tensordot(
            contexts_features, tf.transpose(self.price_elasticities), axes=1
        )
        # Compute the utility: u_i + e_dem(z) * p_i
        return tf.multiply(contexts_items_features[:, :, 0], price_coeffs) + self.base_utilities


Let's estimate the model coefficients using the dataset!

In [None]:
model = TaFengMNL(optimizer="lbfgs", epochs=1000)
history = model.fit(dataset)

We can inspect each estimated weight through the .weights attribute:

In [None]:
print("Model Negative Log-Likelihood: ", model.evaluate(dataset))
print("Model Weights:")
print("Base Utilities u_i:", model.weights[0].numpy())
print("Price Elasticities:", model.weights[1].numpy())

As a short analysis, we can observe that the price elasticity is negative, as expected, and that the younger the population, the more it is impacted by the price.\
Our model looks good enough for a first, quick modeling pass.
Now let's see how to compute an optimal assortment using our model.
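
Before moving on, the price effect noted above can be checked on a toy example. A conditional MNL turns utilities into choice probabilities through a softmax over the available items. The sketch below uses made-up utilities and prices (assumptions for illustration, not the fitted values) to show that, with a negative price coefficient, raising one product's price lowers its choice probability and raises the others':

```python
import numpy as np


def mnl_probabilities(utilities):
    """Softmax over the last axis, shifted by the max for numerical stability."""
    shifted = utilities - np.max(utilities, axis=-1, keepdims=True)
    exp_u = np.exp(shifted)
    return exp_u / np.sum(exp_u, axis=-1, keepdims=True)


# Illustrative values only.
base_utilities = np.array([0.5, 0.2, -0.1])
prices = np.array([1.0, 2.0, 0.5])
price_coefficient = -0.8  # negative: a higher price means a lower utility

probas = mnl_probabilities(base_utilities + price_coefficient * prices)

# Raise the price of the first product by 1 and recompute.
higher_prices = prices + np.array([1.0, 0.0, 0.0])
probas_after = mnl_probabilities(base_utilities + price_coefficient * higher_prices)

print(probas.round(3), probas_after.round(3))
```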

The first step is to compute the utility of each product. Here, let's assume that the most recent prices will also be the prices of the products in our future assortment.\
This can easily be adapted if these prices were to change.\
We can compute the utilities for each age category using the *compute_batch_utility* method of our ChoiceModel:

In [None]:
future_prices = np.stack([dataset.contexts_items_features[0][-1]]*3, axis=0)
age_category = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]).astype("float32")
predicted_utilities = model.compute_batch_utility(fixed_items_features=None,
                                                  contexts_features=age_category,
                                                  contexts_items_features=future_prices,
                                                  contexts_items_availabilities=None,
                                                  choices=None
                                                  )

We compute the share of each age category in our dataset and use it to weight the utilities, giving an average utility for each product.

In [None]:
age_frequencies = np.mean(dataset.contexts_features[0], axis=0)

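# Frequency-weighted average of the utilities over the age categories.
# The frequencies sum to 1, so the weighted terms are summed, not averaged.
final_utilities = []
for freq, ut in zip(age_frequencies, predicted_utilities):
    final_utilities.append(freq*ut)
final_utilities = np.sum(final_utilities, axis=0)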

We need to define which quantity our assortment should optimize. A usual choice is the revenue or the margin. In our case we do not have these values, so let's say that we want the assortment of 12 products that generates the highest turnover.\
We have everything we need to use Choice-Learn's AssortmentOptimizer!

In [None]:
from choice_learn.toolbox.assortment_optimizer import AssortmentOptimizer

opt = AssortmentOptimizer(
    utilities=np.exp(final_utilities),  # Utilities need to be transformed with the exponential function
    itemwise_values=future_prices[0][:, 0],  # Values to optimize for each item, here the price, used to compute turnover
    assortment_size=12,  # Size of the assortment we want
)

In [None]:
assortment, average_estimated_revenue = opt.solve()
print("Our Optimal Assortment is:")
print(assortment)
print("With an estimated average revenue of:", average_estimated_revenue)

## Ending Notes
- In this example, the outside option is automatically integrated in the AssortmentOptimizer and not computed through the model. If you compute the outside option utility yourself and give it to the AssortmentOptimizer, you can set its attribute *outside_option_given* to True.
- The current AssortmentOptimizer uses [Gurobi](https://www.gurobi.com/), for which you need a license. Future developments will integrate OR-Tools, which is open source.
- If you want to add custom constraints, you can use the base code of the AssortmentOptimizer and manually add your constraints. Future developments will add an easy interface to integrate such needs.
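
To illustrate the kind of problem being solved, here is a brute-force sketch of cardinality-constrained assortment optimization under the MNL. It assumes an outside option with exponentiated utility 1 and value 0, and assumes the assortment may hold at most `assortment_size` items. The function name and the numbers are hypothetical; this is not the AssortmentOptimizer implementation, and enumeration is only practical for small catalogs:

```python
from itertools import combinations

import numpy as np


def brute_force_assortment(exp_utilities, itemwise_values, assortment_size):
    """Enumerate all assortments of at most `assortment_size` items.

    Expected value of an assortment S under the MNL with an outside option
    of exponentiated utility 1: sum_{i in S} v_i * r_i / (1 + sum_{i in S} v_i).
    """
    n_items = len(exp_utilities)
    best_value, best_set = 0.0, ()
    for size in range(1, assortment_size + 1):
        for subset in combinations(range(n_items), size):
            idx = list(subset)
            value = np.dot(exp_utilities[idx], itemwise_values[idx]) / (
                1.0 + np.sum(exp_utilities[idx])
            )
            if value > best_value:
                best_value, best_set = value, subset
    return best_set, best_value


# Illustrative values only.
exp_utilities = np.exp(np.array([0.2, -0.5, 0.1, -1.0]))
itemwise_values = np.array([1.0, 3.0, 2.0, 5.0])
best_set, best_value = brute_force_assortment(
    exp_utilities, itemwise_values, assortment_size=2
)
print(best_set, round(float(best_value), 3))
```

A custom constraint, such as forbidding two given items from appearing together, could be added in this sketch by skipping the offending subsets inside the loop.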
