# Wasserstein Collaborative Filtering for Item Cold-start Recommendation
This is the Python code for our UMAP 2020 paper "Wasserstein Collaborative Filtering for Item Cold-start Recommendation".<br>
The code requires Python and PyTorch. To speed up training on a CUDA device, set options['gpu'] to 1. If you do not have a GPU, set options['gpu'] to 0. <br>
Thank you for **citing** our paper:<br>
*@inproceedings{meng2020wcf, title={Wasserstein Collaborative Filtering for Item Cold-start Recommendation}, author={Meng, Yitong and Yan, Xiao and Liu, Weiwen and Wu, Huanhuan and Cheng, James}, booktitle={Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization}, pages={318-322}, year={2020} } <br>*

## Preparing data
We assume there are $n$ users, $m$ warm items, and $k$ cold-start items.<br>
The train matrix is $n$ by $m$, M (the item distance matrix) is $m$ by $k$, and the test matrix is $n$ by $k$.<br>
All three matrices are numpy.array objects, and most of their entries are 0.<br>

Here is a toy example:

In [1]:
import numpy as np
train = np.array([[3,0,0,0,0],[0,4,0,0,0],[0,0,0,5,0],[0,0,3,0,4]])
M = np.array([[0.1,0.8],[0.2,0.9],[0.85,0.15],[0.9,0.1],[0.95,0.05]])
test = np.array([[5,0],[4,0],[0,3],[0,5]])
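
As a quick sanity check, the shape conventions stated above can be verified on the toy data. This is only an illustrative sketch; the variable names `n`, `m`, and `k` mirror the notation in the text and are not part of the WCF code.

```python
import numpy as np

train = np.array([[3,0,0,0,0],[0,4,0,0,0],[0,0,0,5,0],[0,0,3,0,4]])
M = np.array([[0.1,0.8],[0.2,0.9],[0.85,0.15],[0.9,0.1],[0.95,0.05]])
test = np.array([[5,0],[4,0],[0,3],[0,5]])

n, m = train.shape   # n users, m warm items
k = M.shape[1]       # k cold-start items

assert M.shape == (m, k)     # item distance matrix: warm items by cold-start items
assert test.shape == (n, k)  # test ratings: users by cold-start items
print(n, m, k)  # 4 5 2
```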

We transpose train so that each column corresponds to one user, as the later training step requires.

In [2]:
train = train.T

We normalize each user's ratings so that they sum to 1, and convert any NaN (produced when a user has no ratings) to zero.

In [3]:
train=train/sum(train) # the built-in sum() adds up the rows of train, giving the total of each column (user)
train=np.nan_to_num(train) # convert NaN to zero
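
The effect of this normalization can be checked on the toy data: after it, every user's column sums to 1, so it can be read as a distribution over warm items. The sketch below uses `np.divide` with a `where` mask as an alternative that avoids producing NaN in the first place; this variant is an assumption for illustration, not the code used in this repository.

```python
import numpy as np

train = np.array([[3,0,0,0,0],[0,4,0,0,0],[0,0,0,5,0],[0,0,3,0,4]]).T.astype(float)

col_sums = train.sum(axis=0)  # total rating mass of each user (one column per user)
# Divide only where the column sum is positive; other entries stay at 0.
normalized = np.divide(train, col_sums, out=np.zeros_like(train), where=col_sums > 0)

print(normalized.sum(axis=0))  # each user's column now sums to 1
```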


We convert train and M to PyTorch tensors.

In [4]:
import torch as tc

Tensor=lambda M:tc.DoubleTensor(M)
train=Tensor(train)
M=Tensor(M)

## Training WCF

Load packages:

In [5]:
import tcwasserstein_DL as wd
import tcutil as ut
import torch as tc
import sys
import numpy as np

Initialize hyperparameters:

In [6]:
options={}
options['stop']=1e-5 # stop training when the relative change of the loss function is smaller than this ratio.
options['t0']=10 # parameter used for backtracking in linear_projected_gradient_descent, a technique used in the optimization process.
options['verbose']=1 # set to 1 to print progress information.
options['D_step_stop']=1e-3 # parameter used in linear_projected_gradient_descent, a technique used in the optimization process. You can simply use this fixed value.
options['lambda_step_stop']=1e-2 # parameter used in linear_projected_gradient_descent, a technique used in the optimization process.
options['alpha']=0.5 # parameter used for backtracking in linear_projected_gradient_descent, a technique used in the optimization process.
options['beta']=0.8 # parameter used for backtracking in linear_projected_gradient_descent, a technique used in the optimization process.
options['gpu']=0 # 0 uses the CPU, 1 uses the GPU.
k = 2 # the latent dimension of the "[user] by [cold-start item]" matrix.
gamma=1/50 # gamma corresponds to eq. (4) of our paper. It controls the importance of the entropy regularization term.
rho1=0 # rho1 and rho2 should be set to 0. They correspond to eq. (9) in reference [27] and are used for nonnegative matrix factorization. Our paper does not use nonnegative matrix factorization, so these parameters should be 0. That part is implemented in the code, however, and you can enable it for other research purposes by setting rho1 and rho2 to non-zero values.
rho2=0
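
To give some intuition for the backtracking parameters, here is a minimal sketch of a textbook backtracking (Armijo) line search on an unconstrained toy problem. It assumes `t0` is the initial step size, `alpha` the sufficient-decrease constant, and `beta` the shrink factor, which are their usual roles. How `linear_projected_gradient_descent` actually uses them is not shown in this README, and the function `backtracking_step` is hypothetical.

```python
import numpy as np

def backtracking_step(f, grad_f, x, t0=10.0, alpha=0.5, beta=0.8):
    """One gradient step whose size is found by backtracking line search."""
    g = grad_f(x)
    fx = f(x)
    t = t0
    # Shrink the step until the sufficient-decrease (Armijo) condition holds.
    while f(x - t * g) > fx - alpha * t * np.dot(g, g):
        t *= beta
    return x - t * g, t

# Toy objective f(x) = ||x||^2, chosen only for illustration.
f = lambda x: float(np.dot(x, x))
grad_f = lambda x: 2.0 * x

x = np.array([1.0, -2.0])
x_new, t = backtracking_step(f, grad_f, x)
print(t, f(x), f(x_new))
```

A projected variant would additionally map each candidate point back onto the feasible set before evaluating the objective; that step is omitted here.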

Convert the parameters to PyTorch tensors:

In [7]:
Tensor=lambda M:tc.DoubleTensor(M)

for key, value in options.items():
    options[key] = Tensor([value])

gamma= Tensor([gamma])
rho1 = Tensor([rho1])
rho2 = Tensor([rho2])

Initialize model parameters:

In [8]:
sizeD=(M.shape[1],k)
D, HD, Hlambda=ut.initialValue(train,sizeD,options['gpu'],Tensor)

Using a GPU is encouraged: set options['gpu'] to 1 to enable it.

In [9]:
if options['gpu']:
    train = train.cuda()
    M = M.cuda()
    for key, value in options.items():
        options[key]=value.cuda()
    gamma= gamma.cuda()
    rho1 = rho1.cuda()
    rho2 = rho2.cuda()
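
If you prefer not to set the flag by hand, one option is to derive it from `tc.cuda.is_available()`. This is a small sketch rather than part of the repository; `use_gpu` is a hypothetical name, and its value would be assigned to options['gpu'] when the hyperparameters are initialized.

```python
import torch as tc

# 1 if a CUDA device is visible to PyTorch, else 0 (same convention as options['gpu']).
use_gpu = 1 if tc.cuda.is_available() else 0
device = tc.device("cuda" if use_gpu else "cpu")

# Double-precision tensor created directly on the selected device.
x = tc.zeros(3, dtype=tc.float64, device=device)
print(use_gpu, x.device)
```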

Training WCF:

In [10]:
D, lambdA, objectives=wd.wasserstein_DL(train,k,M,gamma,rho1,rho2,D, HD, Hlambda, options,Tensor)
print("done")

k: 2 ; gamma: [0.02] ; rhoL: [0.] ; rhoD: [0.] ; lambda_step_stop: [0.01] ; D_step_stop:  [0.001] ; stop: [1.e-05]
1
Optimize with respect to lambda
Optimize with respect to D
done
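
For readers unfamiliar with the entropy-regularized Wasserstein distance that gamma controls, the sketch below runs plain Sinkhorn iterations between one user's distribution over warm items and a hypothetical distribution over cold-start items. It is a generic NumPy illustration under the common convention K = exp(-M / gamma); it is not the optimization performed by `wd.wasserstein_DL`, and the function `sinkhorn` and the vector `b` are made up for this example.

```python
import numpy as np

def sinkhorn(a, b, M, gamma, n_iter=1000):
    """Entropy-regularized optimal transport plan between a and b with cost M."""
    K = np.exp(-M / gamma)       # Gibbs kernel; smaller gamma means less smoothing
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)        # rescale to match the column marginal b
        u = a / (K @ v)          # rescale to match the row marginal a
    T = u[:, None] * K * v[None, :]
    return T, float(np.sum(T * M))

M = np.array([[0.1,0.8],[0.2,0.9],[0.85,0.15],[0.9,0.1],[0.95,0.05]])
a = np.array([0.0, 0.0, 3/7, 0.0, 4/7])  # last toy user's normalized warm-item ratings
b = np.array([0.2, 0.8])                 # hypothetical cold-start item distribution

T, cost = sinkhorn(a, b, M, gamma=1/50)
print(T.sum(axis=1), T.sum(axis=0), cost)
```

The row sums of the plan `T` reproduce `a`, and its column sums converge to `b`. The transport cost is small when the cold-start items that carry mass are close (according to M) to the warm items the user rated.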


## Evaluation

In [11]:
pred = D @ lambdA
pred = pred.cpu().data.numpy()
# print(pred.T)
import evaluate as ev
performance = ev.eval2(pred.T, test)
print('\tMAP:',performance[0],'NDCG:',performance[1],'recall:',performance[2])

	MAP: 1.0 NDCG: 1.0 recall: 1.0
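
To make the ranking metrics more concrete, here is a minimal sketch of recall@k on the toy test matrix. It is not the implementation of `ev.eval2`, whose internals are not described here. The function `recall_at_k` and the `scores` array are hypothetical and stand in for `pred.T`.

```python
import numpy as np

def recall_at_k(scores, relevance, k):
    """Mean over users of (relevant items ranked in the top k) / (relevant items)."""
    top_k = np.argsort(-scores, axis=1)[:, :k]  # indices of the k highest scores per user
    hits = np.take_along_axis(relevance > 0, top_k, axis=1).sum(axis=1)
    n_relevant = (relevance > 0).sum(axis=1)
    valid = n_relevant > 0                      # skip users with no test ratings
    return float(np.mean(hits[valid] / n_relevant[valid]))

test = np.array([[5,0],[4,0],[0,3],[0,5]])
# Hypothetical predicted scores (users by cold-start items), for illustration only.
scores = np.array([[0.9,0.1],[0.8,0.2],[0.3,0.7],[0.1,0.9]])

print(recall_at_k(scores, test, k=1))  # 1.0
```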


## Miscellaneous
Hope our work can help you in your research:) <br>
If you have any questions regarding our work, please contact Yitong Meng via mengyitongge@163.com .