# Tutorial for Using Different Optimizers for `bemb`.

Author: Tianyu Du

Updated: May 19, 2023

This tutorial uses a simple simulation exercise to demonstrate how to use the BEMB model and the power of BEMB's `obs2prior` feature. We highly recommend reading the BEMB tutorial first.

In [1]:
import torch
import bemb
from bemb.data import load_simulation_dataset
from bemb.model import LitBEMBFlex
from bemb import run

In [2]:
print(torch.__version__)

1.13.0+cu117


In [3]:
num_users = 1500
num_items = 50
data_size = 1000
dataset_train, dataset_val, dataset_test = load_simulation_dataset(
    num_users=num_users, num_items=num_items, data_size=data_size)

No `session_index` is provided, assume each choice instance is in its own session.
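The dataset summaries printed later in this tutorial report 800 training, 100 validation, and 100 test observations, which is consistent with an 80/10/10 split of `data_size = 1000`. The sketch below only reproduces that arithmetic. The helper `split_sizes` and the percentages are assumptions inferred from those printed shapes, not part of the `bemb` API.

```python
def split_sizes(data_size, train_pct=80, val_pct=10):
    """Hypothetical helper: sizes of a train/validation/test split.

    Integer arithmetic is used so the three sizes always sum to data_size.
    """
    num_train = data_size * train_pct // 100
    num_val = data_size * val_pct // 100
    num_test = data_size - num_train - num_val
    return num_train, num_val, num_test


sizes = split_sizes(1000)
print(sizes)
```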


In [4]:
LATENT_DIM = 10  # the dimension of alpha and theta.
bemb = LitBEMBFlex(
    learning_rate=0.01,  # set the learning rate, feel free to play with different levels.
    pred_item=True,  # let the model predict item_index, don't change this one.
    num_seeds=256,  # number of Monte Carlo samples for estimating the ELBO.
    utility_formula='theta_user * alpha_item',  # the utility formula.
    num_users=num_users,
    num_items=num_items,
    num_user_obs=dataset_train.user_obs.shape[1],
    num_item_obs=dataset_train.item_obs.shape[1],
    # whether to turn on obs2prior for each parameter.
    obs2prior_dict={'theta_user': True, 'alpha_item': True},
    # the dimension of the latents; since the utility is an inner product of theta and alpha,
    # they must have the same dimension.
    coef_dim_dict={'theta_user': LATENT_DIM, 'alpha_item': LATENT_DIM},
    model_optimizer="Adam",  # name of the optimizer used to train the model.
)

BEMB: utility formula parsed:
[{'coefficient': ['theta_user', 'alpha_item'], 'observable': None}]
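As the comment in the cell above notes, the utility `theta_user * alpha_item` is an inner product of a user latent and an item latent. The following pure-Python sketch illustrates how such utilities could be turned into choice probabilities, assuming a softmax (multinomial logit) link over items. The function names and the toy two-dimensional latents are hypothetical and chosen for illustration; this is not `bemb`'s internal code.

```python
import math


def inner_product(theta, alpha):
    """Utility of one (user, item) pair as an inner product of latents."""
    return sum(t * a for t, a in zip(theta, alpha))


def choice_probabilities(theta_user, alpha_items):
    """Softmax over the utilities of all items for a single user."""
    utilities = [inner_product(theta_user, alpha) for alpha in alpha_items]
    max_u = max(utilities)  # subtract the max for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Toy latents (hypothetical values): utilities are 1, 0, and -1.
theta_user = [1.0, 0.0]
alpha_items = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
probs = choice_probabilities(theta_user, alpha_items)
print(probs)
```

Items whose latent vectors align with the user's latent vector receive higher utility and therefore a higher choice probability.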


In [5]:
bemb

Bayesian EMBedding Model with U[user, item, session] = theta_user * alpha_item
Total number of parameters: 33000.
With the following coefficients:
ModuleDict(
  (theta_user): BayesianCoefficient(num_classes=1500, dimension=10, prior=N(H*X_obs(H shape=torch.Size([10, 50]), X_obs shape=50), Ix1.0))
  (alpha_item): BayesianCoefficient(num_classes=50, dimension=10, prior=N(H*X_obs(H shape=torch.Size([10, 50]), X_obs shape=50), Ix1.0))
)
[]
Optimizer: Adam, Learning rate: 0.01
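The last line of the summary reports the optimizer (Adam) and the learning rate (0.01). To give some intuition for why the choice of optimizer matters, here is a toy, pure-Python comparison of plain gradient descent and Adam on a one-dimensional quadratic. It is a sketch for illustration only, not `bemb`'s training loop: the objective and function names are hypothetical, and the Adam hyperparameters (`beta1=0.9`, `beta2=0.999`, `eps=1e-8`) are the commonly used defaults.

```python
import math


def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


def gradient_descent(x, lr, steps):
    """Plain gradient descent: step proportional to the raw gradient."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def adam(x, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: step scaled by running estimates of the gradient moments."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


x_gd = gradient_descent(0.0, lr=0.01, steps=1000)
x_adam = adam(0.0, lr=0.01, steps=1000)
print(x_gd, x_adam)
```

Both runs move toward the minimizer at `x = 3`, but along different trajectories: gradient descent takes steps proportional to the gradient, while Adam normalizes each step by the recent gradient magnitude.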

In [6]:
run(bemb, dataset_train=dataset_train, dataset_val=dataset_val, dataset_test=dataset_test,
    batch_size=len(dataset_train) // 20, num_epochs=1000, device="cuda")

Bayesian EMBedding Model with U[user, item, session] = theta_user * alpha_item
Total number of parameters: 33000.
With the following coefficients:
ModuleDict(
  (theta_user): BayesianCoefficient(num_classes=1500, dimension=10, prior=N(H*X_obs(H shape=torch.Size([10, 50]), X_obs shape=50), Ix1.0))
  (alpha_item): BayesianCoefficient(num_classes=50, dimension=10, prior=N(H*X_obs(H shape=torch.Size([10, 50]), X_obs shape=50), Ix1.0))
)
[]
Optimizer: Adam, Learning rate: 0.01
[Train dataset] ChoiceDataset(label=[], item_index=[800], user_index=[800], session_index=[800], item_availability=[], user_obs=[1500, 50], item_obs=[50, 50], device=cpu)
[Validation dataset] ChoiceDataset(label=[], item_index=[100], user_index=[100], session_index=[100], item_availability=[], user_obs=[1500, 50], item_obs=[50, 50], device=cpu)
[Test dataset] ChoiceDataset(label=[], item_index=[100], user_index=[100], session_index=[100], item_availability=[], user_obs=[1500, 50], item_obs=[50, 50], device=cpu)
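With the 800 training observations printed above, `batch_size=len(dataset_train) // 20` evaluates to 40, so each epoch consists of 20 mini-batches. The sketch below works through that arithmetic. The total step count assumes one optimizer step per mini-batch, which is typical for first-order optimizers such as Adam but is an assumption here.

```python
num_train = 800     # size of the training set printed above
num_epochs = 1000   # value passed to run(...)

batch_size = num_train // 20
# Ceiling division: a final partial batch would count as one more batch.
batches_per_epoch = -(-num_train // batch_size)
# Assumes one optimizer step per mini-batch.
total_steps = batches_per_epoch * num_epochs

print(batch_size, batches_per_epoch, total_steps)
```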


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type     | Params
-----------------------------------
0 | model | BEMBFlex | 33.0 K
-----------------------------------
33.0 K    Trainable params
0         Non-trainable params
33.0 K    Total params
0.132     Total estimated model params size (MB)
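The reported total of 33,000 parameters and the 0.132 MB size estimate can be reproduced with simple arithmetic. The decomposition below is an assumption that happens to match the reported total: each coefficient stores a variational mean and a log standard deviation of shape `num_classes x dimension`, and, with `obs2prior` turned on, an `H` matrix of shape `dimension x num_obs` that also stores a mean and a log standard deviation. The size estimate assumes 4 bytes per parameter (float32). The helper name `coefficient_params` is hypothetical.

```python
def coefficient_params(num_classes, dim, num_obs):
    """Hypothetical parameter count for one coefficient with obs2prior."""
    variational = 2 * num_classes * dim  # mean and log-std of the coefficient
    prior_h = 2 * dim * num_obs          # mean and log-std of the H matrix
    return variational + prior_h


theta_params = coefficient_params(num_classes=1500, dim=10, num_obs=50)
alpha_params = coefficient_params(num_classes=50, dim=10, num_obs=50)
total_params = theta_params + alpha_params

# Assumes float32 storage, i.e. 4 bytes per parameter.
size_mb = total_params * 4 / 1e6

print(theta_params, alpha_params, total_params, size_mb)
```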


Sanity Checking: 0it [00:00, ?it/s]


Training: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=1000` reached.


Time taken for training: 193.26278162002563


Testing: 0it [00:00, ?it/s]
