# Train the model on your own dataset

## Install choice-learn

You will need to install the choice-learn package with pip:

In [None]:
# !pip install choice-learn

## Instantiating a TripDataset

First, you need to format your data as a TripDataset.

In [None]:
# Base dependencies
import matplotlib.pyplot as plt
import numpy as np

# Import the right classes
from choice_learn.basket_models import Trip, TripDataset

### Purchases, basket & assortment

Each data point (i.e. a basket of purchased items) can be split into three pieces of information:
- the purchased items (the "basket")
- the list of available items at the time of the purchase (the "assortment")
- the price of every item at the time of the purchase (the "prices")

### Small Example

Let's take an example and consider the case of a supermarket selling 4 items: {0, 1, 2, 3}.

On day 1, item 3 is out of stock and the shelves look like this:

<table>
<tr><th> Shelf in the Supermarket </th><th> Customer Purchases</th></tr>
<tr><td>

| Items | Sold | Price |
|-|-|-|
| 0 | Yes | 1.0 $ |
| 1 | Yes | 1.5 $ |
| 2 | Yes | 3.0 $ |
| 3 | No | NA |

</td><td>

| Customer | Purchases | | Customer | Purchases | 
|--|--|--|--|--|
| a | 0 | | d | 1 |
| b | 1 | | e | 1, 2|
| c | 0, 2 | | f | 2 |

</td></tr> </table>


On day 2, item 3 is restocked, while this time item 1 is out of stock.
The owner also decides to change the prices slightly.

<table>
<tr><th> Shelf in the Supermarket </th><th> Customer Purchases</th></tr>
<tr><td>

| Items | Sold | Price |
|-|-|-|
| 0 | Yes | 1.2 $ |
| 1 | No | NA |
| 2 | Yes | 2.9 $ |
| 3 | Yes | 2.0 $ |

</td><td>

| Customer | Purchases | | Customer | Purchases | 
|--|--|--|--|--|
| g | 0 | | j | 3 |
| h | 2, 3 | | k | 2, 3|
| i | 2 | |  |  |

</td></tr> </table>
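
The shelf tables above map directly onto two arrays per day: an availability vector (the "assortment") and a price vector. The snippet below is a minimal sketch of that conversion using only NumPy. The helper `shelf_to_arrays` and the `{item: price or None}` dictionary layout are illustrative assumptions, not part of choice-learn.

```python
import numpy as np


def shelf_to_arrays(shelf, n_items):
    """Convert a {item_id: price or None} shelf into (assortment, prices) arrays.

    Hypothetical helper: None marks an item that is out of stock.
    """
    assortment = np.zeros(n_items)
    prices = np.zeros(n_items)
    for item, price in shelf.items():
        if price is not None:
            assortment[item] = 1.0  # the item is on the shelf
            prices[item] = price
        # Out-of-stock items keep availability 0 and a placeholder price of 0
    return assortment, prices


# Day 1 shelf from the first table: item 3 is out of stock
day_1_shelf = {0: 1.0, 1: 1.5, 2: 3.0, 3: None}
assortment, prices = shelf_to_arrays(day_1_shelf, n_items=4)
print(assortment)
print(prices)
```

Writing the shelf once as a dictionary and deriving both arrays from it keeps availability and prices aligned on the same item indices.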

We show here how to create Trips from the different purchases and instantiate a TripDataset.

In [None]:
# Day 1:
assortment_day_1 = np.array([1., 1., 1., 0.]) # 1 = available, 0 = out of stock (item 3 is oos)
prices_day_1 = np.array([1.0, 1.5, 3.0, 0.0]) # A price is needed for every item, even unsold ones; for those, the value does not matter

trips = []
for basket in [[0], [1], [0, 2], [1], [1, 2], [2]]:
    trips.append(Trip(purchases=basket, prices=prices_day_1, assortment=assortment_day_1)) # Create a Trip for all baskets

# Day 2
assortment_day_2 = np.array([1., 0., 1., 1.]) # 1 = available, 0 = out of stock (item 1 is oos)
prices_day_2 = np.array([1.2, 0.0, 2.9, 2.0]) # A price is needed for every item, even unsold ones; for those, the value does not matter

for basket in [[0], [2, 3], [2], [3], [2, 3]]:
    trips.append(Trip(purchases=basket, prices=prices_day_2, assortment=assortment_day_2)) # Create a Trip for all baskets

trip_dataset = TripDataset(trips=trips, available_items=np.ones((1, 4)))
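
Before building Trips, it can help to check that every basket only contains items that were actually on the shelf that day. The following is a small sketch of such a check with plain NumPy. The helper `unavailable_items` is a hypothetical name introduced for illustration and is not provided by choice-learn.

```python
import numpy as np


def unavailable_items(basket, assortment):
    """Return the items of `basket` whose availability flag is not exactly 1."""
    assortment = np.asarray(assortment)
    return [item for item in basket if assortment[item] != 1]


assortment_check = np.array([1.0, 1.0, 1.0, 0.0])  # day 1: item 3 is out of stock
baskets_check = [[0], [1], [0, 2], [1], [1, 2], [2]]

# Every day 1 basket should pass the check
for basket in baskets_check:
    missing = unavailable_items(basket, assortment_check)
    assert not missing, f"basket {basket} contains unavailable items {missing}"

# A basket containing the out-of-stock item 3 is flagged
flagged = unavailable_items([0, 3], assortment_check)
print(flagged)
```

Because the comparison is against exactly 1, the check also flags availability entries that were mistyped as something other than 0 or 1.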

## Instantiating and training the model

Now, we just need to define a few hyperparameters and launch the training.

In [None]:
from choice_learn.basket_models import AleaCarta

latent_sizes = {"preferences": 2}
n_negative_samples = 1
optimizer = "adam"
lr = 5e-4
epochs = 10
batch_size = 4

aleacarta = AleaCarta(
    item_intercept=False,
    price_effects=False,
    seasonal_effects=False,
    latent_sizes=latent_sizes,
    n_negative_samples=n_negative_samples,
    optimizer=optimizer,
    lr=lr,
    epochs=epochs,
    batch_size=batch_size,
)

aleacarta.instantiate(n_items=4, n_stores=1)

# Fit the model
history = aleacarta.fit(trip_dataset)
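
A quick way to judge whether training went well is to compare the loss at the first and last epoch. The sketch below assumes that `history` behaves like a dictionary holding a per-epoch list of losses under a key such as `"train_loss"`; this structure is an assumption, so inspect `history` in your installed version before relying on it. The values used here are made-up placeholders, not model output.

```python
def summarize_losses(losses):
    """Summarize a sequence of per-epoch losses (hypothetical helper)."""
    losses = [float(loss) for loss in losses]
    first, last = losses[0], losses[-1]
    return {"first": first, "last": last, "decreased": last < first}


# Placeholder standing in for the object returned by fit (assumed structure)
placeholder_history = {"train_loss": [2.0, 1.5, 1.25]}
summary = summarize_losses(placeholder_history["train_loss"])
print(summary)
```

With a dataset as small as the one above and only 10 epochs, the loss may move very little, so a flat curve is not necessarily a bug.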

## Using the model: probabilities & more

Once the model is trained, we can use it to obtain different insights:
- Next-item probabilities
- Item interactions


### Predicting the next purchased item

This time, let's assume that nothing is out of stock and all four items are on sale.
Let's see how a customer might complete their basket.

In [None]:
prices_all = np.array([1.2, 1.5, 2.9, 2.0])  # Prices when every item is on sale
assortment_all = np.array([1., 1., 1., 1.])  # All four items are available
proba_empty_basket = aleacarta.compute_item_likelihood(trip=Trip(purchases=[], prices=prices_all, assortment=assortment_all))
proba_basket_with_one_item = aleacarta.compute_item_likelihood(trip=Trip(purchases=[1], prices=prices_all, assortment=assortment_all))

In [None]:
print("A customer with an empty basket has the following probabilities of purchasing each item:")
for i in range(4):
    print(f"                item {i}: {proba_empty_basket[i]}")


print("A customer with item 1 in their basket has the following probabilities of purchasing each item:")
for i in range(4):
    print(f"                item {i}: {proba_basket_with_one_item[i]}")
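
Rather than reading the probabilities one by one, you may want to rank the items from most to least likely. The sketch below does this with NumPy, assuming the output of `compute_item_likelihood` can be converted with `np.asarray` into a vector with one entry per item. The helper `rank_items` is hypothetical and the probability values are made-up placeholders, not results from the trained model.

```python
import numpy as np


def rank_items(probabilities, basket=()):
    """Return (item, probability) pairs sorted by decreasing probability.

    Items already in `basket` are skipped. Hypothetical helper.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    order = np.argsort(-probabilities)  # indices from highest to lowest probability
    return [(int(i), float(probabilities[i])) for i in order if int(i) not in basket]


# Placeholder vector standing in for the model output
placeholder_proba = np.array([0.1, 0.2, 0.4, 0.3])
ranking = rank_items(placeholder_proba, basket=[1])
for item, proba in ranking:
    print(f"item {item}: {proba:.2f}")
```

Skipping the items already in the basket is a design choice for this sketch; whether the model assigns them a non-zero probability is not something the code above assumes.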


### Visualization of the interactions

The interaction terms can be visualized directly as follows:

In [None]:
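# Entry (i, j) is the dot product between the embeddings of items i and j
interactions = np.dot(aleacarta.alpha.numpy(), aleacarta.alpha.numpy().T)
# Zero out the diagonal so that self-interactions do not dominate the color scale
np.fill_diagonal(interactions, 0.0)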
plt.imshow(interactions)
plt.xlabel("Item i")
plt.ylabel("Item j")
plt.title(r"$a_i \cdot a_j$")  # Raw string so that \c is not treated as an escape sequence
plt.colorbar()
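
Beyond the heatmap, the same matrix can be queried for the pair of distinct items with the largest interaction term. The sketch below shows one way to do this with NumPy. The helper `strongest_pair` is hypothetical, and the embedding matrix is a made-up placeholder standing in for the learned `alpha` values, not output from the trained model.

```python
import numpy as np


def strongest_pair(interactions):
    """Return the pair (i, j), i < j, with the largest interaction value."""
    interactions = np.asarray(interactions, dtype=float)
    rows, cols = np.triu_indices_from(interactions, k=1)  # all pairs with i < j
    best = np.argmax(interactions[rows, cols])
    return int(rows[best]), int(cols[best])


# Placeholder embeddings: 4 items, latent size 2 (made-up values)
placeholder_alpha = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [-1.0, 0.5]])
placeholder_interactions = placeholder_alpha @ placeholder_alpha.T
np.fill_diagonal(placeholder_interactions, 0.0)

pair = strongest_pair(placeholder_interactions)
print(pair)
```

Only the upper triangle is scanned because the matrix is symmetric: the dot product of the embeddings of items i and j equals that of items j and i.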