BPRH

This repository implements the model from Qiu, Huihuai, et al. "BPRH: Bayesian personalized ranking for heterogeneous implicit feedback" Information Sciences 453 (2018): 80-98.

Platform and Packages

The code was written and tested on Python 3.7.6. It should also run on other versions of Python.

bprH.py is the basic model, wrapped in a class for convenient use. The packages below are required to run bprH.py (pickle and random ship with the Python standard library):

pickle
random
numpy==1.18.1
pandas==1.0.1
tqdm==4.42.1
livelossplot==0.5.1
scikit-learn==0.22.1

Because the BPRH model involves repeated vector and matrix manipulations, bprH_gpu.py leverages an NVIDIA GPU for acceleration. The CuPy package is required to run bprH_gpu.py; see the CuPy Installation Guide for installation help. We used cupy-cuda101==7.3.0 with CUDA 10.1.

Sobazaar_cleaning.ipynb is the Jupyter notebook that cleans the raw Sobazaar data "Sobazaar-hashID.csv.gz" located in the data folder. You may need to unzip the file manually before executing Sobazaar_cleaning.ipynb. Note that the Like action is not considered: among the auxiliary actions, only View is processed in bprH_gpu.py and bprH.py.
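If you prefer to unzip the data from Python rather than by hand, a minimal sketch using only the standard library could look like the following. The helper name `gunzip` is hypothetical and not part of this repository; the example path simply combines the folder and file name mentioned above.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dst=None):
    """Decompress a .gz file and return the path of the decompressed copy."""
    src = Path(src)
    # Default target: the same path without the trailing ".gz".
    dst = Path(dst) if dst is not None else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream in chunks instead of loading it all
    return dst


# Example call (path assumed from the file name mentioned above):
# gunzip("data/Sobazaar-hashID.csv.gz")
```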

BRPH_50_1000_0.00001_0.1_0.1.ipynb illustrates the usage and training process of BPRH on a GPU.

Parameters Sensitivity Analysis

gamma	lambda_u, lambda_v	lambda_b	P@5	P@10	R@5	R@10	AUC
0.1	0.00001	0.00001	0.014	0.011	0.061	0.091	0.857
0.1	0.00001	0.0001	0.014	0.011	0.062	0.094	0.858
0.1	0.00001	0.001	0.018	0.013	0.075	0.105	0.861
0.1	0.00001	0.01	0.033	0.021	0.146	0.175	0.866
0.1	0.00001	0.1	0.052	0.033	0.224	0.276	0.89
0.1	0.00001	1.0	0.052	0.034	0.22	0.285	0.902
0.1	0.0001	0.00001	0.014	0.011	0.06	0.092	0.86
0.1	0.0001	0.0001	0.015	0.011	0.064	0.091	0.856
0.1	0.0001	0.001	0.016	0.012	0.071	0.106	0.86
0.1	0.001	0.00001	0.013	0.01	0.054	0.087	0.858
0.1	0.001	0.0001	0.014	0.011	0.058	0.089	0.859
0.1	0.001	0.001	0.016	0.011	0.069	0.097	0.859

The number of iterations is set to 720,000 for the table above. We then select $\gamma = 0.1, \lambda_{u} = \lambda_{v} = 0.00001, \lambda_{b} = 1.0$ for a 5-fold cross-validation with 600,000 iterations. The results are presented below.

FOLD NUM	P@5	P@10	R@5	R@10	AUC
0	0.047167488	0.031958128	0.190700122	0.252367529	0.877665012
1	0.048883666	0.032961222	0.203459515	0.270577332	0.888670492
2	0.050859514	0.033135744	0.213800679	0.270629746	0.881617605
3	0.050421179	0.032430806	0.20976547	0.262078064	0.881358235
4	0.047374702	0.032159905	0.194494702	0.262413134	0.888949442
AVG	0.04894131	0.032529161	0.202444098	0.263613161	0.883652157
STD	0.001693631	0.000506637	0.009807132	0.007549718	0.004962164
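As a quick sanity check, the AVG and STD rows can be recomputed from the per-fold values. The sketch below does this for the P@5 column only. The reported STD appears to be the sample standard deviation (denominator n - 1), not the population one.

```python
from statistics import mean, stdev

# P@5 per fold, copied from the cross-validation table above.
p_at_5 = [0.047167488, 0.048883666, 0.050859514, 0.050421179, 0.047374702]

avg = mean(p_at_5)
std = stdev(p_at_5)  # sample standard deviation, divides by n - 1

print(f"AVG={avg:.9f} STD={std:.9f}")
```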

Implementation Detail

This section covers implementation details not mentioned in Qiu, Huihuai, et al. "BPRH: Bayesian personalized ranking for heterogeneous implicit feedback" Information Sciences 453 (2018): 80-98.

There are nine types of action in the original Sobazaar dataset. We group 'purchase:buy_clicked' as Purchase; 'content:interact:product_clicked', 'content:interact:product_detail_viewed', and 'product_detail_clicked' as View; and 'content:interact:product_wanted' and 'product_wanted' as Like. This yields 4712 users and 7015 items with 15208 purchases, 126846 views, and 96689 likes, which agrees with Table 4 in Qiu, Huihuai, et al. "BPRH: Bayesian personalized ranking for heterogeneous implicit feedback" Information Sciences 453 (2018): 80-98.
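The grouping above amounts to a lookup table. The following is a minimal sketch, not the notebook's actual code: the `(user, item, action)` record layout and the choice to drop any action label not listed above are assumptions made for illustration.

```python
# Raw action labels quoted in the text, mapped to the three feedback types.
ACTION_GROUPS = {
    "purchase:buy_clicked": "Purchase",
    "content:interact:product_clicked": "View",
    "content:interact:product_detail_viewed": "View",
    "product_detail_clicked": "View",
    "content:interact:product_wanted": "Like",
    "product_wanted": "Like",
}


def group_actions(records):
    """Keep (user, item, group) for known actions; drop everything else."""
    grouped = []
    for user, item, action in records:
        group = ACTION_GROUPS.get(action)
        if group is not None:  # assumption: unlisted action types are discarded
            grouped.append((user, item, group))
    return grouped
```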
For the correlation between auxiliary and target actions, we only consider View with Purchase. Hence, $\rho = 1$. Moreover, on the Sobazaar dataset it is possible that $I_{a}^{u} \cap I_{t}^{u} = \emptyset$, which leads to a 0-divided-by-0 error when calculating $C^{u}_{ta}, C^{u}_{at}, C^{u}$. In this case we set $\alpha_{u} = 1$.
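The fallback can be sketched as follows. The exact definitions of $C^{u}_{ta}$, $C^{u}_{at}$, and $C^{u}$ are in the paper. This sketch assumes, for illustration only, that they are overlap ratios combined by a harmonic mean and that $\alpha_{u} = \rho\, C^{u}$. The function name `alpha_u` and its arguments are hypothetical.

```python
def alpha_u(target_items, aux_items, rho=1.0):
    """Return alpha_u, falling back to 1 when the two item sets do not overlap."""
    overlap = len(target_items & aux_items)
    if overlap == 0:
        # Both ratios are 0 (or 0/0), so combining them would divide 0 by 0.
        return 1.0
    c_ta = overlap / len(target_items)      # assumed form of C^u_ta
    c_at = overlap / len(aux_items)         # assumed form of C^u_at
    c_u = 2 * c_ta * c_at / (c_ta + c_at)   # assumed form of C^u (harmonic mean)
    return rho * c_u
```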
For item-set coselection, when item $i$ is purchased by only one user, the set $S^{i} = \{ j \mid |U^{i} \cap U^{j}| \geq 2,\ i,j \in I\}$ is empty by definition, since $|U^{i}| = 1$. However, according to the paper, $S^{i}$ should contain item $i$ regardless of its size. We fix this issue in our code.
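A minimal sketch of this fix, assuming purchases are given as `(user, item)` pairs (the function name and data layout are hypothetical, and the quadratic loop is for clarity, not speed):

```python
from collections import defaultdict


def coselection_sets(purchases):
    """Build S^i for every item from (user, item) purchase pairs."""
    users_of = defaultdict(set)  # U^i: users who purchased item i
    for user, item in purchases:
        users_of[item].add(user)

    coselected = {}
    for i, users_i in users_of.items():
        # Items sharing at least two purchasers with item i.
        s_i = {j for j, users_j in users_of.items() if len(users_i & users_j) >= 2}
        s_i.add(i)  # always keep i itself, even when |U^i| = 1
        coselected[i] = s_i
    return coselected
```

With this convention, an item bought by a single user gets the singleton set containing only itself instead of an empty set.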
For the item-set coselection in Algorithm 1 of the BPRH paper, we believe there are some typos. Take Lines 20-21 as an example: to construct the item-set $K$, we first randomly select an item $k \in I_{n}^{u}$; $K$ should then be $K = I_{n}^{u} \cap S^{k}$, not $K = I_{n}^{u} \cap S^{i}$. The same applies to item-set $J$. We fix this issue in bprH_gpu.py (Lines 286, 301, and 317).
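The corrected sampling step can be sketched like this. It is an illustration rather than the code in bprH_gpu.py: `candidates` stands for $I_{n}^{u}$, `coselected` is assumed to map each item to its set $S^{k}$, and the function name is hypothetical.

```python
import random


def sample_item_set(candidates, coselected, rng=random):
    """Pick a random item k from candidates; return k and candidates & S^k."""
    k = rng.choice(sorted(candidates))  # sorted() turns the set into a sequence
    # Intersect with S^k (the sampled item's set), not with some other S^i.
    return k, candidates & coselected[k]
```

Because $S^{k}$ contains $k$ itself, the returned set is never empty as long as `candidates` is non-empty.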
The BPRH model does not consider user bias. We therefore append an all-ones column as the last column of the user matrix and use the last row of the item matrix as the item bias (bprH_gpu.py, Line 255). We initialize the user and item matrices from a normal distribution with mean 0 and standard deviation 0.1.
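The bias trick can be illustrated with plain Python lists. This is a sketch under assumptions, not the repository's CuPy code: the function names are hypothetical, and drawing the bias row from the same normal distribution as the latent factors is assumed here.

```python
import random


def init_factors(n_users, n_items, dim, seed=0):
    """Return (U, V): U is n_users x (dim + 1), V is (dim + 1) x n_items."""
    rng = random.Random(seed)
    # Latent user factors ~ N(0, 0.1^2), followed by a constant 1.
    U = [[rng.gauss(0.0, 0.1) for _ in range(dim)] + [1.0] for _ in range(n_users)]
    # dim rows of latent item factors plus one final row holding the item biases.
    V = [[rng.gauss(0.0, 0.1) for _ in range(n_items)] for _ in range(dim + 1)]
    return U, V


def score(U, V, u, i):
    """Dot product of user row u and item column i; the trailing 1 adds the bias."""
    return sum(U[u][k] * V[k][i] for k in range(len(V)))
```

With this layout a single dot product yields the latent score plus the item bias, so no separate bias term is needed in the prediction.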
When constructing item-sets $I, J, K$, some of them may be empty because of the random train/test split. bprH_gpu.py (Line 363) addresses this issue. For example, when $J = \emptyset$, the BPRH objective function and its gradients reduce to those of the CoFiSet model.
When recommending items, a user may appear in the test set but not in the training set. Our implementation offers two options: ignore such users, i.e. make no recommendations for them, or recommend items by the target-action popularity learned from the training data. bprH_gpu.py (Line 520) handles this. In addition, we exclude the items user $u$ has already purchased from user $u$'s recommendation list.
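The two cold-start options and the exclusion of purchased items can be sketched as below. The function and argument names are hypothetical, and dictionaries stand in for the model's score matrix. It is not the code at Line 520.

```python
def recommend(user, scores, train_purchases, popularity, top_n=5, skip_cold=False):
    """Return up to top_n item ids for user, or [] for a skipped cold-start user."""
    if user not in scores:        # user never seen during training
        if skip_cold:
            return []
        ranking = popularity      # item -> number of purchases in the training data
    else:
        ranking = scores[user]    # item -> predicted score for this user
    seen = train_purchases.get(user, set())
    candidates = [i for i in ranking if i not in seen]
    # Highest value first; ties are broken by item id for a deterministic order.
    candidates.sort(key=lambda i: (-ranking[i], i))
    return candidates[:top_n]
```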
User online updating scheme.

Mathematical Detail

For mathematical details, please visit my blogs.

Copyright

This repository is released under the MIT License. Please cite this repository if you use our code.
