<i>Copyright (c) Recommenders contributors.</i>

<i>Licensed under the MIT License.</i>

# Bilateral Variational Autoencoder (BiVAE)

This notebook is a tutorial on the Bilateral Variational Autoencoder (BiVAE) model for collaborative filtering. The BiVAE paper [1] was presented at the WSDM'21 conference. For all experiments related to the BiVAE model, please refer to [this repository](https://github.com/PreferredAI/bi-vae).

The implementation of the model comes from [Cornac](https://github.com/PreferredAI/cornac) [2], a framework for multimodal recommender systems that focuses on models using auxiliary data (e.g., item descriptions and images, social networks).

## 0 Global Settings and Imports

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
%cd /content/drive/MyDrive/Colab_me/DS300/recommenders/

/content/drive/MyDrive/Colab_me/DS300/recommenders


##### ----Test----

In [5]:
!pip3 install cornac
!pip3 install dgl

Installing collected packages: powerlaw, cornac
Successfully installed cornac-1.18.0 powerlaw-1.5


In [27]:
import cornac
from cornac.eval_methods import RatioSplit
from cornac.datasets import movielens


# Load the MovieLens 100K dataset
feedback = movielens.load_feedback(variant="100k")

# Define an evaluation method to split feedback into train and test sets
ratio_split = RatioSplit(
    data=feedback,
    test_size=0.2,
    rating_threshold=1.0,
    seed=123,
    exclude_unknowns=True,
    verbose=True,
)

backend = "tensorflow"  # or 'pytorch'

# Instantiate the recommender models to be compared
gmf = cornac.models.GMF(
    num_factors=8,
    num_epochs=10,
    learner="adam",
    backend=backend,
    batch_size=256,
    lr=0.001,
    num_neg=50,
    seed=123,
)
mlp = cornac.models.MLP(
    layers=[64, 32, 16, 8],
    act_fn="tanh",
    learner="adam",
    backend=backend,
    num_epochs=10,
    batch_size=256,
    lr=0.001,
    num_neg=50,
    seed=123,
)
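neumf1 = cornac.models.NeuMF(
    num_factors=8,
    layers=[64, 32, 16, 8],
    act_fn="tanh",
    learner="adam",
    backend=backend,
    num_epochs=10,
    batch_size=256,
    lr=0.001,
    num_neg=50,
    seed=123,
)
# NeuMF initialised from the GMF and MLP models above. This definition was
# missing from the cell; the settings are assumed to mirror neumf1.
neumf2 = cornac.models.NeuMF(
    name="NeuMF_pretrained",
    num_factors=8,
    layers=[64, 32, 16, 8],
    act_fn="tanh",
    learner="adam",
    backend=backend,
    num_epochs=10,
    batch_size=256,
    lr=0.001,
    num_neg=50,
    seed=123,
).from_pretrained(gmf, mlp, alpha=0.5)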

# Instantiate evaluation metrics
ndcg_50 = cornac.metrics.NDCG(k=50)
rec_50 = cornac.metrics.Recall(k=50)

# Put everything together into an experiment and run it
cornac.Experiment(
    eval_method=ratio_split,
    models=[
        gmf,
        mlp,
        neumf1,
        neumf2,
    ],
    metrics=[ndcg_50, rec_50],
).run()

rating_threshold = 1.0
exclude_unknowns = True
---
Training data:
Number of users = 943
Number of items = 1656
Number of ratings = 80000
Max rating = 5.0
Min rating = 1.0
Global mean = 3.5
---
Test data:
Number of users = 943
Number of items = 1656
Number of ratings = 19971
Number of unknown users = 0
Number of unknown items = 0
---
Total users = 943
Total items = 1656

[GMF] Training started!


  0%|          | 0/10 [00:00<?, ?it/s]


[GMF] Evaluation started!


Ranking:   0%|          | 0/942 [00:00<?, ?it/s]


[MLP] Training started!


  0%|          | 0/10 [00:00<?, ?it/s]


[MLP] Evaluation started!


Ranking:   0%|          | 0/942 [00:00<?, ?it/s]


[NeuMF] Training started!


  0%|          | 0/10 [00:00<?, ?it/s]


[NeuMF] Evaluation started!


Ranking:   0%|          | 0/942 [00:00<?, ?it/s]


[NeuMF_pretrained] Training started!


  0%|          | 0/10 [00:00<?, ?it/s]


[NeuMF_pretrained] Evaluation started!


Ranking:   0%|          | 0/942 [00:00<?, ?it/s]


TEST:
...
                 | NDCG@50 | Recall@50 | Train (s) | Test (s)
---------------- + ------- + --------- + --------- + --------
GMF              |  0.2359 |    0.2805 |  876.9557 |   1.4877
MLP              |  0.3395 |    0.4086 |  902.8060 |   2.1408
NeuMF            |  0.3452 |    0.4173 |  905.5586 |   2.2903
NeuMF_pretrained |  0.3148 |    0.3706 |  922.0728 |   2.6690



In [25]:
import cornac
from cornac.datasets import movielens
from cornac.eval_methods import RatioSplit
from cornac.models import IBPR


# Load the MovieLens 100K dataset
ml_100k = movielens.load_feedback(variant="100k")

# Instantiate an evaluation method.
ratio_split = RatioSplit(
    data=ml_100k, test_size=0.2, rating_threshold=1.0, exclude_unknowns=True, verbose=True
)

# Instantiate an IBPR recommender model.
ibpr = IBPR(k=10, verbose=True)

# Instantiate evaluation metrics.
rec_20 = cornac.metrics.Recall(k=20)
pre_20 = cornac.metrics.Precision(k=20)

# Instantiate and then run an experiment.
cornac.Experiment(
    eval_method=ratio_split, models=[ibpr], metrics=[rec_20, pre_20], user_based=True
).run()

rating_threshold = 1.0
exclude_unknowns = True
---
Training data:
Number of users = 943
Number of items = 1641
Number of ratings = 80000
Max rating = 5.0
Min rating = 1.0
Global mean = 3.5
---
Test data:
Number of users = 943
Number of items = 1641
Number of ratings = 19954
Number of unknown users = 0
Number of unknown items = 0
---
Total users = 943
Total items = 1641

[IBPR] Training started!




[IBPR] Evaluation started!


Ranking:   0%|          | 0/943 [00:00<?, ?it/s]


TEST:
...
     | Precision@20 | Recall@20 | Train (s) | Test (s)
---- + ------------ + --------- + --------- + --------
IBPR |       0.1482 |    0.1659 |  695.2946 |   1.0016



In [26]:
import cornac
from cornac.datasets import movielens
from cornac.eval_methods import RatioSplit


# Load user-item feedback
data = movielens.load_feedback(variant="100k")

# Instantiate an evaluation method to split data into train and test sets.
ratio_split = RatioSplit(
    data=data,
    test_size=0.2,
    exclude_unknowns=True,
    verbose=True,
    seed=123,
    rating_threshold=0.5,
)

# Instantiate the VAECF model
vaecf = cornac.models.VAECF(
    k=10,
    autoencoder_structure=[20],
    act_fn="tanh",
    likelihood="mult",
    n_epochs=100,
    batch_size=100,
    learning_rate=0.001,
    beta=1.0,
    seed=123,
    use_gpu=True,
    verbose=True,
)

# Instantiate evaluation measures
rec_20 = cornac.metrics.Recall(k=20)
ndcg_20 = cornac.metrics.NDCG(k=20)
auc = cornac.metrics.AUC()

# Put everything together into an experiment and run it
cornac.Experiment(
    eval_method=ratio_split,
    models=[vaecf],
    metrics=[rec_20, ndcg_20, auc],
    user_based=True,
).run()

rating_threshold = 0.5
exclude_unknowns = True
---
Training data:
Number of users = 943
Number of items = 1656
Number of ratings = 80000
Max rating = 5.0
Min rating = 1.0
Global mean = 3.5
---
Test data:
Number of users = 943
Number of items = 1656
Number of ratings = 19971
Number of unknown users = 0
Number of unknown items = 0
---
Total users = 943
Total items = 1656

[VAECF] Training started!


  0%|          | 0/100 [00:00<?, ?it/s]


[VAECF] Evaluation started!


Ranking:   0%|          | 0/942 [00:00<?, ?it/s]


TEST:
...
      |    AUC | NDCG@20 | Recall@20 | Train (s) | Test (s)
----- + ------ + ------- + --------- + --------- + --------
VAECF | 0.9166 |  0.3409 |    0.2818 |    9.7836 |   2.4067



In [22]:
import cornac
from cornac.datasets import movielens
from cornac.eval_methods import RatioSplit

# Load user-item feedback
data_100k = movielens.load_feedback(variant="100K")

# Instantiate an evaluation method to split data into train and test sets.
ratio_split = RatioSplit(
    data=data_100k,
    test_size=0.1,
    val_size=0.1,
    exclude_unknowns=True,
    verbose=True,
    seed=123,
)

pmf = cornac.models.PMF(
    k=10,
    max_iter=100,
    learning_rate=0.001,
    lambda_reg=0.001
)


gcmc = cornac.models.GCMC(
    seed=123,
)

# Put everything together into an experiment and run it
cornac.Experiment(
    eval_method=ratio_split,
    models=[pmf, gcmc],
    metrics=[cornac.metrics.RMSE()],
    user_based=False,
).run()

rating_threshold = 1.0
exclude_unknowns = True
---
Training data:
Number of users = 943
Number of items = 1656
Number of ratings = 80000
Max rating = 5.0
Min rating = 1.0
Global mean = 3.5
---
Test data:
Number of users = 943
Number of items = 1656
Number of ratings = 9990
Number of unknown users = 0
Number of unknown items = 0
---
Validation data:
Number of users = 943
Number of items = 1656
Number of ratings = 9981
---
Total users = 943
Total items = 1656

[PMF] Training started!

[PMF] Evaluation started!


Rating:   0%|          | 0/9990 [00:00<?, ?it/s]

Rating:   0%|          | 0/9981 [00:00<?, ?it/s]


[BiasMF] Training started!

[BiasMF] Evaluation started!


Rating:   0%|          | 0/9990 [00:00<?, ?it/s]

Rating:   0%|          | 0/9981 [00:00<?, ?it/s]


[GCMC] Training started!


Training:  67%|██████▋   | 1345/1999 [04:58<02:24,  4.51iter/s]



[GCMC] Evaluation started!


Rating:   0%|          | 0/9990 [00:00<?, ?it/s]

Rating:   0%|          | 0/9981 [00:00<?, ?it/s]


VALIDATION:
...
       |   RMSE | Time (s)
------ + ------ + --------
PMF    | 0.9175 |   0.2693
BiasMF | 0.8991 |   0.2605
GCMC   | 0.8743 |   0.2081

TEST:
...
       |   RMSE | Train (s) | Test (s)
------ + ------ + --------- + --------
PMF    | 0.9432 |    3.8081 |   0.2156
BiasMF | 0.9182 |    0.0683 |   0.2969
GCMC   | 0.8988 |  298.2121 |   0.2342



In [21]:
import cornac
from cornac.datasets import movielens
from cornac.eval_methods import RatioSplit

# Load user-item feedback
data = movielens.load_feedback(variant="100k")

# Instantiate an evaluation method to split data into train and test sets.
ratio_split = RatioSplit(
    data=data,
    val_size=0.1,
    test_size=0.1,
    exclude_unknowns=True,
    verbose=True,
    seed=123,
    rating_threshold=0.5,
)

# Instantiate the NGCF model
ngcf = cornac.models.NGCF(
    seed=123,
    num_epochs=5,
    emb_size=64,
    layer_sizes=[64, 64, 64],
    dropout_rates=[0.1, 0.1, 0.1],
    early_stopping={"min_delta": 1e-4, "patience": 50},
    batch_size=1024,
    learning_rate=0.001,
    lambda_reg=1e-5,
    verbose=True,
)

# Instantiate evaluation measures
rec_20 = cornac.metrics.Recall(k=20)
ndcg_20 = cornac.metrics.NDCG(k=20)

# Put everything together into an experiment and run it
cornac.Experiment(
    eval_method=ratio_split,
    models=[ngcf],
    metrics=[rec_20, ndcg_20],
    user_based=True,
).run()

rating_threshold = 0.5
exclude_unknowns = True
---
Training data:
Number of users = 943
Number of items = 1656
Number of ratings = 80000
Max rating = 5.0
Min rating = 1.0
Global mean = 3.5
---
Test data:
Number of users = 943
Number of items = 1656
Number of ratings = 9990
Number of unknown users = 0
Number of unknown items = 0
---
Validation data:
Number of users = 943
Number of items = 1656
Number of ratings = 9981
---
Total users = 943
Total items = 1656

[NGCF] Training started!


Training:   0%|          | 0/5 [00:00<?, ?iter/s]

Epoch:   0%|          | 0/79 [00:00<?, ?it/s]

Ranking:   0%|          | 0/922 [00:00<?, ?it/s]

Epoch:   0%|          | 0/79 [00:00<?, ?it/s]

Ranking:   0%|          | 0/922 [00:00<?, ?it/s]

Epoch:   0%|          | 0/79 [00:00<?, ?it/s]

Ranking:   0%|          | 0/922 [00:00<?, ?it/s]

Epoch:   0%|          | 0/79 [00:00<?, ?it/s]

Ranking:   0%|          | 0/922 [00:00<?, ?it/s]

Epoch:   0%|          | 0/79 [00:00<?, ?it/s]

Ranking:   0%|          | 0/922 [00:00<?, ?it/s]


[NGCF] Evaluation started!


Ranking:   0%|          | 0/929 [00:00<?, ?it/s]

Ranking:   0%|          | 0/922 [00:00<?, ?it/s]


VALIDATION:
...
     | NDCG@20 | Recall@20 | Time (s)
---- + ------- + --------- + --------
NGCF |  0.1521 |    0.1956 |   2.0144

TEST:
...
     | NDCG@20 | Recall@20 | Train (s) | Test (s)
---- + ------- + --------- + --------- + --------
NGCF |  0.1553 |    0.1922 |  525.9356 |   2.2945



In [17]:
import cornac
from cornac.datasets import movielens
from cornac.eval_methods import RatioSplit


# Load user-item feedback
data = movielens.load_feedback(variant="100k")

# Instantiate an evaluation method to split data into train and test sets.
ratio_split = RatioSplit(
    data=data,
    test_size=0.2,
    exclude_unknowns=True,
    verbose=True,
    seed=123,
    rating_threshold=0.8,
)

ease_original = cornac.models.EASE(
    lamb=500,
    name="EASEᴿ (B>0)",
    posB=True
)

ease_all = cornac.models.EASE(
    lamb=500,
    name="EASEᴿ (B>-∞)",
    posB=False
)


# Instantiate evaluation measures
rec_20 = cornac.metrics.Recall(k=20)
rec_50 = cornac.metrics.Recall(k=50)
ndcg_100 = cornac.metrics.NDCG(k=100)


# Put everything together into an experiment and run it
cornac.Experiment(
    eval_method=ratio_split,
    models=[ease_original, ease_all],
    metrics=[rec_20, rec_50, ndcg_100],
    user_based=True,  # If `False`, results will be averaged over the number of ratings.
    save_dir=None
).run()

rating_threshold = 0.8
exclude_unknowns = True
---
Training data:
Number of users = 943
Number of items = 1656
Number of ratings = 80000
Max rating = 5.0
Min rating = 1.0
Global mean = 3.5
---
Test data:
Number of users = 943
Number of items = 1656
Number of ratings = 19971
Number of unknown users = 0
Number of unknown items = 0
---
Total users = 943
Total items = 1656

[EASEᴿ (B>0)] Training started!

[EASEᴿ (B>0)] Evaluation started!


Ranking:   0%|          | 0/942 [00:00<?, ?it/s]


[EASEᴿ (B>-∞)] Training started!

[EASEᴿ (B>-∞)] Evaluation started!


Ranking:   0%|          | 0/942 [00:00<?, ?it/s]


TEST:
...
             | NDCG@100 | Recall@20 | Recall@50 | Train (s) | Test (s)
------------ + -------- + --------- + --------- + --------- + --------
EASEᴿ (B>0)  |   0.4587 |    0.3138 |    0.4782 |    0.6287 |   1.1102
EASEᴿ (B>-∞) |   0.4755 |    0.3280 |    0.4944 |    0.5840 |   1.1147



In [None]:
# This code computes the evaluation metrics
import cornac
from cornac.eval_methods import RatioSplit
from cornac.models import PMF, BiVAECF
from cornac.metrics import MAE, RMSE, Precision, Recall, NDCG, AUC, MAP

# load the built-in MovieLens 100K and split the data based on ratio
ml_100k = cornac.datasets.movielens.load_feedback()
rs = RatioSplit(data=ml_100k, test_size=0.2, rating_threshold=4.0, seed=123)

# initialize the model to evaluate: BiVAECF (the PMF baseline below is left commented out)
bivaecf = BiVAECF()
# pmf = PMF(k=10, max_iter=100, learning_rate=0.001, lambda_reg=0.001, seed=123)
models = [bivaecf]

# define metrics to evaluate the models
metrics = [MAE(), RMSE(), Precision(k=10), Recall(k=10), NDCG(k=10), AUC(), MAP()]

# put it together in an experiment, voilà!
cornac.Experiment(eval_method=rs, models=models, metrics=metrics, user_based=True).run()

##### ----Test----

In [None]:
!pip install cornac
!pip install papermill

Installing collected packages: papermill
Successfully installed papermill-2.5.0


In [None]:
!pip install scrapbook

Installing collected packages: jedi, scrapbook
Successfully installed jedi-0.19.1 scrapbook-0.5.0


In [None]:
import sys
import os
import torch
import cornac
import papermill as pm
import scrapbook as sb
import pandas as pd
# from recommenders.datasets import movielens
from recommenders.datasets.python_splitters import python_random_split
from recommenders.evaluation.python_evaluation import map_at_k, ndcg_at_k, precision_at_k, recall_at_k
from recommenders.models.cornac.cornac_utils import predict_ranking
from recommenders.utils.timer import Timer
from recommenders.utils.constants import SEED

print("System version: {}".format(sys.version))
print("PyTorch version: {}".format(torch.__version__))
print("Cornac version: {}".format(cornac.__version__))

System version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
PyTorch version: 2.1.0+cu118
Cornac version: 1.17


In [None]:
# Select MovieLens data size: 100k, 1m, 10m, or 20m
MOVIELENS_DATA_SIZE = '100k'

# top k items to recommend
TOP_K = 10

# Model parameters
LATENT_DIM = 50
ENCODER_DIMS = [100]
ACT_FUNC = "tanh"
LIKELIHOOD = "pois"
NUM_EPOCHS = 500
BATCH_SIZE = 128
LEARNING_RATE = 0.001

## 2 Cornac implementation of BiVAE

BiVAE is implemented in the Cornac framework as part of the [model collections](https://github.com/PreferredAI/cornac#models).
* Detailed documentation of the BiVAE model in Cornac can be found [here](https://cornac.readthedocs.io/en/latest/models.html#module-cornac.models.bivaecf.recom_bivaecf).
* The source code of the BiVAE implementation is available on [Cornac](https://github.com/PreferredAI/cornac/tree/master/cornac/models/bivaecf).
* For all experiments related to BiVAE, please refer to [this repository](https://github.com/PreferredAI/bi-vae).


## 3 Experiments on MovieLens


### 3.1 Load and split data

To evaluate item recommendation performance, we use the provided `python_random_split` tool for consistency with the other notebooks.  The data is randomly split into training and test sets with a 75/25 ratio.


Note that Cornac also provides several [built-in schemes](https://cornac.readthedocs.io/en/latest/eval_methods.html) for model evaluation.
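As a rough illustration of what a 75/25 random split does, here is a minimal standard-library sketch. It is not the `python_random_split` implementation; the helper name `random_split_sketch`, the toy triples, and the seed are assumptions made for this example.

```python
import random


def random_split_sketch(rows, ratio=0.75, seed=42):
    """Shuffle the rows, then cut them into a `ratio` part and the remainder."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # local RNG, so the split is reproducible
    cut = int(round(ratio * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Toy (userID, itemID, rating) triples, purely illustrative.
ratings = [(u, i, float((u + i) % 5 + 1)) for u in range(1, 5) for i in range(1, 6)]
train_rows, test_rows = random_split_sketch(ratings, ratio=0.75, seed=42)
print(len(train_rows), len(test_rows))
```

Every row lands in exactly one of the two parts, so the test set can be used to judge recommendations without leaking into training.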

In [None]:
data = pd.read_csv('/content/drive/MyDrive/Colab_me/DS300/recommenders/recommenders/datasets/Movielens/ml-latest-small/ratings.csv')
data = data[["userId", "movieId", "rating"]]
data = data.rename(columns={'userId': "userID", 'movieId': "itemID"})

data.head()

   userID  itemID  rating
0       1       1     4.0
1       1       3     4.0
2       1       6     4.0
3       1      47     5.0
4       1      50     5.0


In [None]:
train, test = python_random_split(data, 0.75)

### 3.2 Cornac Dataset

To work with models implemented in Cornac, we need to construct an object of the [Dataset](https://cornac.readthedocs.io/en/latest/data.html#module-cornac.data.dataset) class.

The Dataset class in Cornac is the main object that models interact with.  In addition to data transformations, Dataset provides several useful iterators for looping through the data and supports different negative sampling techniques.
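To give a feel for one kind of data transformation such an object performs, the sketch below maps raw user and item IDs from (user, item, rating) triples to contiguous integer indices. It is only an illustration of the idea in plain Python, not Cornac's code; the function name `build_index` and the toy triples are assumptions.

```python
def build_index(triples):
    """Map raw user/item IDs to contiguous indices in order of first appearance."""
    uid_map, iid_map, indexed = {}, {}, []
    for user, item, rating in triples:
        u = uid_map.setdefault(user, len(uid_map))  # new users get the next free index
        i = iid_map.setdefault(item, len(iid_map))  # same for items
        indexed.append((u, i, float(rating)))
    return uid_map, iid_map, indexed


# Toy triples with non-contiguous raw IDs, purely illustrative.
toy = [(1, 50, 5.0), (1, 47, 5.0), (7, 50, 3.0)]
uid_map, iid_map, indexed = build_index(toy)
print(uid_map)  # {1: 0, 7: 1}
print(iid_map)  # {50: 0, 47: 1}
print(indexed)  # [(0, 0, 5.0), (0, 1, 5.0), (1, 0, 3.0)]
```

Contiguous indices let a model store user and item parameters in dense arrays, while the maps allow predictions to be translated back to the original IDs.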

In [None]:
train_set = cornac.data.Dataset.from_uir(train.itertuples(index=False), seed=SEED)

print('Number of users: {}'.format(train_set.num_users))
print('Number of items: {}'.format(train_set.num_items))

Number of users: 610
Number of items: 8787


### 3.3 Train the BiVAE model

The BiVAE has a few important parameters that we need to consider:

- `k`: dimension of the latent space (i.e., the size of $\boldsymbol{\theta}_u$ and $\boldsymbol{\beta}_i$).
- `encoder_structure`: dimension(s) of hidden layer(s) of the user and item encoders.
- `act_fn`: non-linear activation function used in the encoders.
- `likelihood`: choice of the likelihood function being optimized.
- `n_epochs`: number of passes through training data.
- `batch_size`: size of mini-batches of data during training.
- `learning_rate`: step size in the gradient update rules.
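For intuition about the `likelihood` parameter: this notebook sets `LIKELIHOOD = "pois"`, which selects a Poisson likelihood. The sketch below evaluates the standard Poisson log-probability of a single observed value given a positive rate. How BiVAE's decoder derives that rate from the latent variables is not shown here, and the helper name `poisson_log_likelihood` is hypothetical.

```python
import math


def poisson_log_likelihood(count, rate):
    """log P(count | rate) for a Poisson distribution with a positive rate."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


observed = 4
for rate in (1.0, 4.0, 8.0):
    print(rate, round(poisson_log_likelihood(observed, rate), 4))

# For a fixed observation, the log-likelihood peaks when the rate equals it.
best = max((1.0, 4.0, 8.0), key=lambda r: poisson_log_likelihood(observed, r))
print(best)  # 4.0
```

Training pushes the predicted rates toward values that make the observed ratings likely under this distribution.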

To train the model, we simply need to call the `fit()` method.

In [None]:
bivae = cornac.models.BiVAECF(
    k=LATENT_DIM,
    encoder_structure=ENCODER_DIMS,
    act_fn=ACT_FUNC,
    likelihood=LIKELIHOOD,
    n_epochs=NUM_EPOCHS,
    batch_size=BATCH_SIZE,
    learning_rate=LEARNING_RATE,
    seed=SEED,
    use_gpu=torch.cuda.is_available(),
    verbose=True
)

with Timer() as t:
    bivae.fit(train_set)
print("Took {} seconds for training.".format(t))

  0%|          | 0/500 [00:00<?, ?it/s]

Took 519.0101 seconds for training.


### 3.4 Prediction and Evaluation

Now that our model is trained, we can produce ranked lists for recommendation.  Every recommender model in Cornac provides `rate()` and `rank()` methods for predicting the rating of an item and the ranked list of items for a given user.  To make use of the existing evaluation schemes, we will go through the `predict()` and `predict_ranking()` functions in `cornac_utils` to produce the predictions.

Let's measure recommendation performance of the model using top-K ranking metrics.

In [None]:
with Timer() as t:
    all_predictions = predict_ranking(bivae, train, usercol='userID', itemcol='itemID', remove_seen=True)
print("Took {} seconds for prediction.".format(t))

Took 6.8095 seconds for prediction.
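The call above passes `remove_seen=True`, so items a user already interacted with in the training data are excluded from that user's ranking. The plain-Python sketch below shows the idea for a single user. The function name `top_k_unseen` and the toy scores are assumptions; this is not how `predict_ranking` is implemented.

```python
def top_k_unseen(scores, seen, k):
    """Return the k highest-scoring item IDs that the user has not seen."""
    candidates = [(item, s) for item, s in scores.items() if item not in seen]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:k]]


# Toy predicted scores for one user, purely illustrative.
scores = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.7}
seen = {"a"}  # item "a" appeared in this user's training data
print(top_k_unseen(scores, seen, k=2))  # ['b', 'd']
```

Without this filter, the top of the list would tend to be filled with training items, which cannot appear in the test set and would only lower the ranking metrics.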


In [None]:
eval_map = map_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_ndcg = ndcg_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_precision = precision_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_recall = recall_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)

print("MAP:\t%f" % eval_map,
      "NDCG:\t%f" % eval_ndcg,
      "Precision@K:\t%f" % eval_precision,
      "Recall@K:\t%f" % eval_recall, sep='\n')

MAP:	0.099425
NDCG:	0.382272
Precision@K:	0.341639
Recall@K:	0.156017
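To make the last two numbers concrete, here is a minimal sketch of Precision@K and Recall@K for a single user. The `recommenders` evaluation functions aggregate over all users and may differ in details; the helper name `precision_recall_at_k` and the toy lists are assumptions.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and Recall@k for one user's ranked recommendation list."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k  # share of the top-k that is relevant
    recall = hits / len(relevant) if relevant else 0.0  # share of relevant items found
    return precision, recall


# Toy ranked list and test-set items for one user, purely illustrative.
recommended = ["b", "d", "c", "e"]
relevant = {"d", "e", "f"}
p_at_2, r_at_2 = precision_recall_at_k(recommended, relevant, k=2)
print(p_at_2, round(r_at_2, 4))  # 0.5 0.3333
```

Precision is bounded by how many slots K offers, while recall is bounded by how many relevant items a user has, which is one reason the two values can differ widely.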


In [None]:
# Record results with scrapbook for tests
sb.glue("map", eval_map)
sb.glue("ndcg", eval_ndcg)
sb.glue("precision", eval_precision)
sb.glue("recall", eval_recall)

## 4 Discussion

BiVAE is a new variational autoencoder tailored for dyadic data, where observations consist of measurements associated with two sets of objects, e.g., users, items, and their corresponding ratings.  The model is symmetric, which makes it easier to incorporate auxiliary data on both the user and item sides.  In addition to preference data, the model can be applied to other types of dyadic data, such as document-word matrices, and to other tasks, such as co-clustering.

In the paper, there is also a discussion of Constrained Adaptive Priors (CAP), a proposed method for building informative priors to mitigate the well-known posterior collapse problem.  We have purposely left that part out so as not to distract readers.  Nevertheless, it is very interesting and worth a look.

[This repository](https://github.com/PreferredAI/bi-vae) provides a more comprehensive set of experiments related to BiVAE.

## References

1. Truong, Quoc-Tuan, Aghiles Salah, and Hady W. Lauw. "Bilateral Variational Autoencoder for Collaborative Filtering." Proceedings of the 14th ACM International Conference on Web Search and Data Mining. 2021. https://dl.acm.org/doi/pdf/10.1145/3437963.3441759
2. Salah, Aghiles, Quoc-Tuan Truong, and Hady W. Lauw. "Cornac: A Comparative Framework for Multimodal Recommender Systems." Journal of Machine Learning Research 21.95 (2020): 1-5. https://cornac.preferred.ai
3. Liang, Dawen, et al. "Variational autoencoders for collaborative filtering." Proceedings of the 2018 World Wide Web Conference. 2018.