# Assortment Example
A short example for assortment optimization under the conditional MNL.

In [None]:
# Importing the right base libraries
import os
# Remove GPU use
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import sys
sys.path.append("../")

import numpy as np

We will use the TaFeng dataset, which is available on [Kaggle](https://www.kaggle.com/datasets/chiranjivdas09/ta-feng-grocery-dataset). You can load it automatically with Choice-Learn!

In [None]:
from choice_learn.datasets import load_tafeng

In [None]:
# Short illustration of the dataset
tafeng_df = load_tafeng(as_frame=True)
tafeng_df.head()

| Unnamed: 0 | TRANSACTION_DT | CUSTOMER_ID | AGE_GROUP | PIN_CODE | PRODUCT_SUBCLASS | PRODUCT_ID | AMOUNT | ASSET | SALES_PRICE |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11/1/2000 | 1104905 | 45-49 | 115 | 110411 | 4710199010372 | 2 | 24 | 30 |
| 1 | 11/1/2000 | 418683 | 45-49 | 115 | 120107 | 4710857472535 | 1 | 48 | 46 |
| 2 | 11/1/2000 | 1057331 | 35-39 | 115 | 100407 | 4710043654103 | 2 | 142 | 166 |
| 3 | 11/1/2000 | 1849332 | 45-49 | Others | 120108 | 4710126092129 | 1 | 32 | 38 |
| 4 | 11/1/2000 | 1981995 | 50-54 | 115 | 100205 | 4710176021445 | 1 | 14 | 18 |


In this example we will use the sales_price and age_group features to estimate a discrete choice model in the form of a conditional MNL:

for a customer $z$ and a product $i$, we define the utility function:

$$U(i, z) = u_i + e_{dem(z)} \cdot p_i$$

with:
- $u_i$ the base utility of product $i$
- $p_i$ the price of product $i$
- $e_{dem(z)}$ the price elasticity of customer $z$, which depends on their age

We estimate three price elasticity coefficients, one per age group: customers aged 25 or under, 26 to 55, and 56 or over.
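
To make the formula concrete, here is a minimal NumPy sketch of the utility $U(i, z) = u_i + e_{dem(z)} \cdot p_i$ and of the resulting MNL choice probabilities. All numbers are hypothetical values chosen for illustration, not estimates from TaFeng, and the softmax here ignores any outside option.

```python
import numpy as np

# Hypothetical values for illustration only (not estimated from the data)
base_utilities = np.array([0.5, 1.0, -0.2])      # u_i for 3 products
prices = np.array([30.0, 46.0, 18.0])            # p_i for the same products
elasticities = np.array([-0.06, -0.05, -0.04])   # e_dem(z) for 3 age groups

# U(i, z) = u_i + e_dem(z) * p_i, one row per age group, one column per product
utilities = base_utilities[None, :] + elasticities[:, None] * prices[None, :]

# MNL choice probabilities: softmax over products within each age group
shifted = utilities - utilities.max(axis=1, keepdims=True)  # numerical stability
exp_utilities = np.exp(shifted)
probabilities = exp_utilities / exp_utilities.sum(axis=1, keepdims=True)

print("Utilities:\n", utilities)
print("Choice probabilities:\n", probabilities)
```

Each row of the probabilities sums to one; a more negative elasticity penalizes expensive products more strongly for that age group.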

In [None]:
# Let's reload the TaFeng dataset as a Choice Dataset
dataset = load_tafeng(as_frame=False, preprocessing="assort_example")

# The age categories are encoded as OneHot features:
print("Age Categories Encoding for choices 0, 4 and 16:")
print(dataset.shared_features_by_choice[0][[0, 4, 16]])



Age Categories Encoding for choices 0, 4 and 16:
[[0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]]


Let's define a custom model that fits our formulation by inheriting from Choice-Learn's ChoiceModel:

In [None]:
import tensorflow as tf

from choice_learn.models.base_model import ChoiceModel


class TaFengMNL(ChoiceModel):
    """Custom model for the TaFeng dataset."""

    def __init__(self, **kwargs):
        """Instantiation of our custom model."""
        # Standard inheritance stuff
        super().__init__(**kwargs)

        # Instantiation of base utilities weights
        # We have 25 items in the dataset making 25 weights
        self.base_utilities = tf.Variable(
            tf.random_normal_initializer(0.0, 0.02, seed=42)(shape=(1, 25))
        )
        # Instantiation of price elasticities weights
        # We have 3 age categories making 3 weights
        self.price_elasticities = tf.Variable(
            tf.random_normal_initializer(0.0, 0.02, seed=42)(shape=(1, 3))
        )
        # Don't forget to register the weights to be optimized in self.trainable_weights!
        self.trainable_weights = [self.base_utilities, self.price_elasticities]

    def compute_batch_utility(self,
                              shared_features_by_choice,
                              items_features_by_choice,
                              available_items_by_choice,
                              choices):
        """Method that defines how the model computes the utility of a product.

        Parameters
        ----------
        shared_features_by_choice : tuple of np.ndarray (choices_features)
            a batch of shared features
            Shape must be (n_choices, n_shared_features)
        items_features_by_choice : tuple of np.ndarray (choices_items_features)
            a batch of items features
            Shape must be (n_choices, n_items, n_items_features)
        available_items_by_choice : np.ndarray
            A batch of items availabilities
            Shape must be (n_choices, n_items)
        choices : np.ndarray
            Choices
            Shape must be (n_choices, )

        Returns
        -------
        np.ndarray
            Utility of each product for each choice.
            Shape must be (n_choices, n_items)
        """
        # Unused arguments
        _ = (available_items_by_choice, choices)

        # Get the right price elasticity coefficient according to the age category
        price_coeffs = tf.tensordot(shared_features_by_choice,
                                    tf.transpose(self.price_elasticities),
                                    axes=1)
        # Compute the utility: u_i + e_dem(z) * p_i
        return tf.multiply(items_features_by_choice[:, :, 0], price_coeffs) + self.base_utilities
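
The shape logic of the utility computation can be checked in isolation. The sketch below reproduces it with plain NumPy on made-up values (4 items instead of 25, and a batch of 2 choices), under the assumption that the shared features are the one-hot age encoding: multiplying a one-hot row by the elasticity column picks out exactly one coefficient per choice, which is then broadcast over the items.

```python
import numpy as np

# Made-up values for illustration: 3 age categories, 4 items, 2 choices
price_elasticities = np.array([[-0.06, -0.05, -0.04]])   # shape (1, 3)
base_utilities = np.array([[0.5, 1.0, -0.2, 0.1]])       # shape (1, 4)
one_hot_age = np.array([[0.0, 1.0, 0.0],
                        [1.0, 0.0, 0.0]])                # shape (2, 3)
prices = np.array([[30.0, 46.0, 18.0, 24.0],
                   [30.0, 46.0, 18.0, 24.0]])            # shape (2, 4)

# A one-hot row times the elasticity column selects one coefficient per choice
price_coeffs = one_hot_age @ price_elasticities.T        # shape (2, 1)

# Broadcasting: (2, 4) * (2, 1) + (1, 4) -> (2, 4)
utilities = prices * price_coeffs + base_utilities

print("Selected coefficients:", price_coeffs.ravel())
print("Utilities shape:", utilities.shape)
```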


We estimate the coefficient values using .fit:

In [None]:
model = TaFengMNL(optimizer="lbfgs", epochs=1000, tolerance=1e-4)
history = model.fit(dataset, verbose=1)

Using L-BFGS optimizer, setting up .fit() function




We can inspect the estimated coefficients through the trainable_weights attribute:

In [None]:
print("Model Negative Log-Likelihood: ", model.evaluate(dataset))
print("Model Weights:")
print("Base Utilities u_i:", model.trainable_weights[0].numpy())
print("Price Elasticities:", model.trainable_weights[1].numpy())

Model Negative Log-Likelihood:  tf.Tensor(2.765724, shape=(), dtype=float32)
Model Weights:
Base Utilities u_i: [[ 0.5069443   2.9347017   1.9965347   0.54595953  0.726565    1.0065292
  -0.71810067 -0.9723515  -0.00809288 -3.0388074   1.0723325   1.6365193
  -3.635091   -1.2458814   3.0090377   1.6789885   1.8595006  -1.2637141
  -1.1653861  -0.08477285 -1.7731239  -1.965      -1.7938461   1.4977448
  -0.7458043 ]]
Price Elasticities: [[-0.06282081 -0.05757841 -0.05423078]]


As a short analysis, we can observe that the price elasticity is negative, as expected, and that the younger the population, the more it is impacted by price.\
Our model looks good enough for a first, quick modeling pass.
Now let's see how to compute an optimal assortment using our model.

The first step is to compute the utility of each product. Here, let's assume that the most recent prices in the dataset will also be the prices of the products in our future assortment.\
This is easy to adapt if these prices change.\
We can compute the utilities for each age category using the *compute_batch_utility* method of our ChoiceModel:

In [None]:
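# Take the item features (prices) of the last recorded choice, repeated once per age category
future_prices = np.stack([dataset.items_features_by_choice[0][-1]]*3, axis=0)
# One one-hot row per age category
age_category = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]).astype("float32")
# The availabilities and choices are not used by our utility function
predicted_utilities = model.compute_batch_utility(shared_features_by_choice=age_category,
                                                  items_features_by_choice=future_prices,
                                                  available_items_by_choice=None,
                                                  choices=None
                                                  )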

We compute the frequency of each age category in our dataset and use it to weight the utilities, giving an average utility for each product.

In [None]:
age_frequencies = np.mean(dataset.shared_features_by_choice[0], axis=0)

final_utilities = []
for freq, ut in zip(age_frequencies, predicted_utilities):
    final_utilities.append(freq*ut)
final_utilities = np.mean(final_utilities, axis=0)
print("Estimated final utilities for each product:", final_utilities)

Estimated final utilities for each product: [-0.24943848 -0.39114037 -0.70386267 -0.5407389  -0.48053703 -0.3872156
 -0.6577868  -0.9327279  -0.72542304 -1.5835084  -1.3162355  -0.1772189
 -1.6301169  -0.8337137  -0.49949536 -0.80971146 -1.0538461  -0.83965784
 -0.80688196 -0.6939256  -0.9904421  -1.0163628  -1.0163687  -1.3836414
 -0.42928278]
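
As a side note, the loop above can be written in vectorized form. The sketch below uses toy inputs (3 age categories, 4 products) rather than the model outputs, and checks that the two versions agree. It also makes explicit that, because of the final mean over the three categories, the result equals the frequency-weighted sum of utilities divided by the number of categories.

```python
import numpy as np

# Toy inputs for illustration: 3 age categories, 4 products
age_frequencies = np.array([0.2, 0.5, 0.3])
predicted_utilities = np.array([[-1.0, -2.0, -0.5, -1.5],
                                [-0.8, -1.6, -0.4, -1.2],
                                [-0.6, -1.2, -0.3, -0.9]])

# Loop version, mirroring the cell above
loop_result = []
for freq, ut in zip(age_frequencies, predicted_utilities):
    loop_result.append(freq * ut)
loop_result = np.mean(loop_result, axis=0)

# Vectorized version: weight each row by its frequency, then average over rows
vectorized = np.mean(age_frequencies[:, None] * predicted_utilities, axis=0)

# Frequency-weighted sum of utilities, for comparison
weighted_sum = age_frequencies @ predicted_utilities

print("Loop and vectorized agree:", np.allclose(loop_result, vectorized))
print("Equal to weighted sum / 3:", np.allclose(vectorized, weighted_sum / 3))
```

If a plain frequency-weighted average is what you are after, the weighted sum is the quantity to use; it differs from the value above only by the factor 3.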


We need to define which quantity our assortment should optimize. A usual choice is revenue or margin. In our case we do not have these values, so let's say that we want the assortment of 12 products that will generate the highest turnover.\
We have everything we need to use Choice-Learn's AssortmentOptimizer!

In [None]:
from choice_learn.toolbox.assortment_optimizer import AssortmentOptimizer

opt = AssortmentOptimizer(utilities=np.exp(final_utilities), # Utilities need to be transformed with exponential function
                          itemwise_values=future_prices[0][:, 0], # Values to optimize for each item, here price that is used to compute turnover
                          assortment_size=12) # Size of the assortment we want

In [None]:
assortment, opt_obj = opt.solve()
print("Our Optimal Assortment is:")
print(assortment)
print("With an estimated average revenue of:", opt_obj)

Our Optimal Assortment is:
[0. 1. 1. 1. 1. 1. 0. 0. 1. 0. 1. 1. 0. 0. 1. 1. 1. 0. 0. 0. 0. 0. 0. 1.
 0.]
With an estimated average revenue of: 50.06177024937321
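
To make the optimized quantity more tangible, here is a small brute-force sketch of assortment optimization under an MNL model. It is not the AssortmentOptimizer implementation: it assumes the standard MNL expected revenue $\sum_{i \in S} p_i e^{U_i} / (v_0 + \sum_{j \in S} e^{U_j})$, an outside option with exponentiated utility $v_0 = 1$, and an assortment of exactly the requested size. The function names and the toy values are hypothetical.

```python
from itertools import combinations

import numpy as np


def expected_revenue(subset, exp_utilities, values, outside_exp_utility=1.0):
    """Expected value per customer for a subset, assuming an MNL with an outside option."""
    idx = list(subset)
    weights = exp_utilities[idx]
    return float(np.sum(weights * values[idx]) / (outside_exp_utility + np.sum(weights)))


def brute_force_assortment(exp_utilities, values, assortment_size):
    """Enumerate every subset of the given size and keep the best one."""
    best_subset, best_value = None, -np.inf
    for subset in combinations(range(len(exp_utilities)), assortment_size):
        value = expected_revenue(subset, exp_utilities, values)
        if value > best_value:
            best_subset, best_value = subset, value
    return best_subset, best_value


# Toy example: 5 products, choose 2 (values are illustrative only)
exp_utilities = np.exp(np.array([-0.2, -0.4, -0.7, -0.5, -1.0]))
values = np.array([30.0, 46.0, 166.0, 38.0, 18.0])

best_subset, best_value = brute_force_assortment(exp_utilities, values, assortment_size=2)
print("Best subset:", best_subset)
print("Expected revenue:", best_value)
```

Enumeration is only practical for small catalogues, since the number of subsets grows combinatorially; that is the reason to rely on a dedicated solver for the 25-product problem above.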


## Ending Notes
- In this example, the outside option is automatically integrated in the AssortmentOptimizer and not computed through the model. If you compute the outside option utility and give it to AssortmentOptimizer you can set its attribute *outside_option_given* to True.
- The current AssortmentOptimizer uses [Gurobi](https://www.gurobi.com/), for which you need a license. Future developments will integrate OR-Tools, which is open source.
- If you want to add custom constraints you can use the base code of the AssortmentOptimizer and manually add your constraints. Future developments will add an easy interface to integrate such needs.
