Qcactus/add lightgcn #1123

Qcactus · 2020-06-17T08:36:26Z

Description

Add LightGCN algorithm and lightgcn_deep_dive notebook.

Related Issues

Checklist:

I have followed the contribution guidelines and code style for this project.
I have added tests covering my contributions.
I have updated the documentation accordingly.
This PR is being made to staging and not master.

review-notebook-app · 2020-06-17T08:36:33Z

Check out this pull request on ReviewNB.

Review Jupyter notebook visual diffs & provide feedback on notebooks.

ghost · 2020-06-17T08:36:39Z

All CLA requirements met.

Leavingseason · 2020-06-17T09:15:10Z

Hi all, this is a contributor from a university. They are adding the very recent SIGIR 2020 paper https://arxiv.org/abs/2002.02126 to our codebase. I will go through the code first and let the authors correct any issues. After that I will ping you in Teams so that you can start the review process.

miguelgfierro · 2020-06-17T11:05:29Z

this is awesome! super good work

notebooks/02_model/lightgcn_deep_dive.ipynb

miguelgfierro · 2020-06-19T12:37:54Z

@@ -0,0 +1,807 @@
+{


For all the deprecation warnings:

WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:158: The name tf.sparse_tensor_dense_matmul is deprecated. Please use tf.sparse.sparse_dense_matmul instead.
WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:116: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:117: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:119: The name tf.GPUOptions is deprecated. Please use tf.compat.v1.GPUOptions instead.
WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:120: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:121: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:123: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

Would you mind changing the code to tf.compat.v1? It would help if/when we change to TF2.

Do we have a plan to switch to TF2?
If not, I would suggest keeping the old style, because as far as I know most people in industry are not willing to use TF2, since it would require a lot of platform refactoring in their codebases.

So far we are not planning to; there was a discussion about this in #953.

However, I guess that at some point we will change, but I don't expect us to rewrite the algos for TF2. We are doing a large refactor in PR #1086, and one of the things we are slowly doing is adding tf.compat.v1: https://github.com/microsoft/recommenders/blob/24b6ba9664b808abb41f118c9adefb983b56be1d/reco_utils/recommender/ncf/ncf_singlenode.py#L58. The reason to do this work now is to save work in the future: if at some point we change to TF2 and everything already uses compat.v1, we won't break the repo.
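
To make the request concrete, here is a minimal sketch of the renames, using only the old and new names that appear in the deprecation warnings above. The `RENAMES` table and the `modernize` helper are hypothetical names for illustration, not part of reco_utils, and the text rewrite is only a rough stand-in for editing the code by hand.

```python
import re

# Old name -> suggested replacement, copied from the warnings in this thread.
RENAMES = {
    "tf.sparse_tensor_dense_matmul": "tf.sparse.sparse_dense_matmul",
    "tf.train.AdamOptimizer": "tf.compat.v1.train.AdamOptimizer",
    "tf.train.Saver": "tf.compat.v1.train.Saver",
    "tf.GPUOptions": "tf.compat.v1.GPUOptions",
    "tf.Session": "tf.compat.v1.Session",
    "tf.ConfigProto": "tf.compat.v1.ConfigProto",
    "tf.global_variables_initializer": "tf.compat.v1.global_variables_initializer",
}

# One combined pattern so every name is rewritten in a single pass.
# The lookbehind skips names that are part of a longer dotted path.
_PATTERN = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(name) for name in RENAMES) + r")\b"
)


def modernize(source: str) -> str:
    """Hypothetical helper: replace deprecated TF1 names in a source string."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


# Illustrative line, not copied from lightgcn.py.
old_line = "sess = tf.Session(config=tf.ConfigProto(gpu_options=tf.GPUOptions()))"
new_line = modernize(old_line)
print(new_line)
```

Running `modernize` on already-converted code leaves it unchanged, so it is safe to apply more than once. TensorFlow also ships a `tf_upgrade_v2` script that automates this kind of rename across whole files.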

anargyri

Great work, thanks for contributing this method!

miguelgfierro

The code is really nice, but there are several parts that I think we should change before merging. I think it is important for the fit method to have a single responsibility, and for the metrics code to follow DRY.

reco_utils/recommender/deeprec/graphrec/lightgcn.py

reco_utils/recommender/deeprec/graphrec/ranking_metrics.py

reco_utils/recommender/deeprec/DataModel/ImplicitCF.py

reco_utils/recommender/deeprec/config/lightgcn.yaml

reco_utils/recommender/deeprec/models/graphrec/lightgcn.py

tests/integration/test_notebooks_gpu.py

miguelgfierro

this is really good

miguelgfierro · 2020-06-25T09:59:25Z

hey @Qcactus, this is really nice. If you want to, feel free to add your name to https://github.com/microsoft/recommenders/blob/master/AUTHORS.md

miguelgfierro · 2020-07-07T11:01:43Z

hey @Qcactus, on which machine did you compute the stats in the notebook? I'm trying to replicate them.

FYI @Leavingseason

Qcactus · 2020-07-07T11:15:12Z

@miguelgfierro GeForce GTX 1080Ti. Anything wrong with the notebook?

miguelgfierro · 2020-07-07T12:51:16Z

> @miguelgfierro GeForce GTX 1080Ti. Anything wrong with the notebook?

I was testing the notebook on a K80 GPU with different batch sizes, but interestingly, the GPU memory usage reported by nvidia-smi doesn't change. These are the results I got:

EPOCHS = 5
# BATCH_SIZE = 1024   # with ML1m: Epoch 2 (train) 169.7s, gpu memory 56Mb
# BATCH_SIZE = 4096   # with ML1m: Epoch 2 (train) 47.2s, gpu memory 56Mb
# BATCH_SIZE = 16384  # (=1024*4*4), with ML1m: Epoch 2 (train) 16.2s, gpu memory 56Mb
# BATCH_SIZE = 65536  # (=1024*4*4*4), with ML1m: Epoch 2 (train) 9.0s, gpu memory 56Mb

I tried another machine, this time with 4 K80 GPUs, and got similar results: BATCH_SIZE = 65536 # (=1024*4*4*4), with ML1m: Epoch 2 (train) 8.8s, gpu memory 56Mb

Looking at the code, it seems to be using the GPU: https://github.com/microsoft/recommenders/blob/staging/reco_utils/recommender/deeprec/models/graphrec/lightgcn.py#L107, so I don't understand why the GPU memory usage is so low. Do you know what could be happening?
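
For anyone who wants to repeat this measurement, here is a rough sketch of reading the used GPU memory from nvidia-smi in Python instead of by eye. The two function names are hypothetical. The query flags are standard nvidia-smi options, and the values come back in MiB, one line per GPU.

```python
import subprocess
from typing import List


def parse_memory_used(csv_text: str) -> List[int]:
    """Parse nvidia-smi `memory.used` output: one integer (MiB) per GPU."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def query_gpu_memory_used() -> List[int]:
    """Ask nvidia-smi for the memory currently used on each visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,  # raise if nvidia-smi is missing or fails
    )
    return parse_memory_used(result.stdout)
```

Calling `query_gpu_memory_used()` once per epoch would give a per-epoch trace, which makes it easier to compare batch sizes than watching the terminal.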

Qcactus · 2020-07-07T14:12:49Z

@miguelgfierro
I tested the notebook on GeForce GTX 1080Ti. Here is the result:

MovieLens 1m
BATCH_SIZE=1024, Epoch 2 (train)28.0s, GPU Memory 396Mb
BATCH_SIZE=4096, Epoch 2 (train)13.3s, GPU Memory 396Mb
BATCH_SIZE=16384, Epoch 2 (train)9.7s, GPU Memory 396Mb
BATCH_SIZE=65536, Epoch 2 (train)7.5s, GPU Memory 457Mb

It seems reasonable on my machine. But I don't have access to other kinds of GPU, so I might not be able to find the problem. Have you used TensorFlow 1.15.2? (I noticed that some code in this repo is tested with TF 1.11.) Or maybe you can try the notebook on a GeForce to see whether the result is similar to mine.
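
As a quick side-by-side of the two sets of epoch timings reported in this thread, here is a small sketch that computes how much slower the K80 run was at each batch size. The numbers are copied from the comments above. The variable names are mine.

```python
# Epoch 2 training time in seconds on MovieLens 1M, as reported in this thread.
k80_seconds = {1024: 169.7, 4096: 47.2, 16384: 16.2, 65536: 9.0}
gtx1080ti_seconds = {1024: 28.0, 4096: 13.3, 16384: 9.7, 65536: 7.5}

# Ratio > 1 means the K80 run took longer than the GTX 1080Ti run.
slowdown = {
    batch_size: k80_seconds[batch_size] / gtx1080ti_seconds[batch_size]
    for batch_size in k80_seconds
}

for batch_size, ratio in slowdown.items():
    print(f"BATCH_SIZE={batch_size:>5}: K80 / 1080Ti = {ratio:.2f}x")
```

On these numbers the gap shrinks from roughly 6x at a batch size of 1024 to 1.2x at 65536. That would be consistent with per-batch overhead dominating at small batch sizes, though nothing in this thread confirms the cause.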

miguelgfierro · 2020-07-08T15:10:36Z

I found the issue: the low memory consumption we were seeing was because the datasets were small. When I used ML10M or ML20M, I was getting 6713MiB.

Qcactus added 2 commits June 17, 2020 16:00

add lightgcn model & test & notebook

c66da19

update README

2a32ee6

Qcactus requested review from anargyri, gramhagen, loomlike, miguelgfierro and yueguoguo as code owners June 17, 2020 08:36

modify a parameter of model

c263032

miguelgfierro reviewed Jun 19, 2020

anargyri approved these changes Jun 19, 2020

miguelgfierro requested changes Jun 19, 2020

reco_utils/recommender/deeprec/graphrec/lightgcn.py (outdated)

reco_utils/recommender/deeprec/graphrec/ranking_metrics.py (outdated)

Qcactus requested a review from anargyri June 20, 2020 08:46

Qcactus added 5 commits June 21, 2020 19:42

modify notebook & format codes

23addb4

change dicrectories & DRY evaluation

e5647b0

remove n_fold

3434cf4

add infer method & add comments & modify notebook

d51b864

modify tests

3a70a75