Bayesian Negative Sampling

Bayesian negative sampling is the theoretically optimal negative sampling algorithm that runs in linear time. This work has been accepted by the 2023 IEEE 39th International Conference on Data Engineering (ICDE 2023):

@INPROCEEDINGS{Bin:ICDE:2023,
  author={Liu, Bin and Wang, Bang},
  booktitle={2023 IEEE 39th International Conference on Data Engineering (ICDE)},
  title={Bayesian Negative Sampling for Recommendation},
  year={2023},
  pages={749-761},
  doi={10.1109/ICDE55515.2023.00063}
}

Required Packages

  • numpy: implements BNS for Matrix Factorization (MF) (run main_MF.py);
  • pytorch: implements BNS for the Light Graph Convolution Network (LightGCN) (run main_lightGCN.py).

About the BNS Framework:

We summarize the Bayesian negative sampling algorithm as follows: randomly select a small candidate set $\mathcal{M}_u$ from the set of negative instances; then, for each negative instance $l \in \mathcal{M}_u$, do:

(i) Computing prior probability

By assuming the popularity of item $l$ follows $pop_l \sim B(N, P_{fn}(l))$, we compute the prior probability as: $$P_{fn}(l) = \frac{pop_l}{N}$$ This step takes $\mathcal{O}(1)$ time.
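A minimal numpy sketch of step (i); `pop` and `N` are illustrative names for the item-popularity array and the number of binomial trials, not the repository's actual variables.

```python
import numpy as np

def prior_false_negative(pop: np.ndarray, N: int) -> np.ndarray:
    """Step (i): estimate the prior P_fn(l) = pop_l / N for every item l."""
    # pop[l] is the observed popularity (interaction count) of item l.
    return pop / N
```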

(ii) Computing cumulative distribution function (likelihood)

$$ F_{n}(\hat{x}_l) = \frac{1}{n} \sum_{j=1}^{n} I_{\{x_j \leq \hat{x}_l\}} $$

This step takes $\mathcal{O}(|\mathcal{I}|)$ time, where $\mathcal{I}$ denotes the item set.

The empirical distribution function $F_n(\cdot)$ converges to the common cumulative distribution function $F(\cdot)$ almost surely by the strong law of large numbers. The Glivenko–Cantelli theorem (1933) strengthens this result by proving uniform convergence of $F_n(\cdot)$ to $F(\cdot)$. This property makes it possible to compute $F(\cdot)$ even if we do not know its explicit expression. Given the observation $\hat{x}_l$, $F(\hat{x}_l)$ assigns a probabilistic prediction, valued in $[0,1]$, of $l$ being a false negative (positive).
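A minimal numpy sketch of step (ii); `scores` is assumed to hold the model's predicted score $x_j$ for every item, and `x_hat` is the score of the candidate negative $l$.

```python
import numpy as np

def empirical_cdf(scores: np.ndarray, x_hat: float) -> float:
    """Step (ii): F_n(x_hat) = fraction of item scores that are <= x_hat."""
    return float(np.mean(scores <= x_hat))
```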

(iii) Computing negative signal (posterior probability)

$$ \mathtt{unbias}(l) = \frac{[1 - F(\hat{x}_l)][1 - P_{fn}(l)]}{1 - F(\hat{x}_l) - P_{fn}(l) + 2F(\hat{x}_l)P_{fn}(l)} $$

Note that $\mathtt{unbias}(l)$ is an unbiased estimator of the probability that $l$ is a true negative (see Lemma 3.1). This step takes $\mathcal{O}(1)$ time.
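A minimal sketch of step (iii), directly evaluating the formula above; `F_hat` stands for $F(\hat{x}_l)$ from step (ii) and `p_fn` for $P_{fn}(l)$ from step (i).

```python
def unbias(F_hat: float, p_fn: float) -> float:
    """Step (iii): unbiased estimate that candidate l is a true negative."""
    numerator = (1.0 - F_hat) * (1.0 - p_fn)
    denominator = 1.0 - F_hat - p_fn + 2.0 * F_hat * p_fn
    return numerator / denominator
```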

(iv) Performing Bayesian negative sampling

$$ j = \mathop{\arg\min}\limits_{l \in\mathcal{M}_u}~ \mathtt{info}(l)\cdot [1-(1+\lambda)\mathtt{unbias}(l)]$$

If the sampler $h^*$ minimizes the conditional sampling risk $R(l|i)$, then the empirical sampling risk is also minimized (see Theorem 3.1). Thus the Bayesian optimal sampling rule is essentially to sample the instance with the minimal conditional sampling risk. This step takes $\mathcal{O}(1)$ time.
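The sketch below ties the four steps together for one user; `info` stands for the informativeness term $\mathtt{info}(l)$ from the paper, and all argument names are illustrative rather than the repository's actual API.

```python
import numpy as np

def bayesian_negative_sampling(candidate_ids, scores, pop, info, N, lam):
    """Sample one negative for a user from the candidate set M_u.

    candidate_ids : int array indexing the random candidate set M_u
    scores        : array of predicted scores x_l for every item
    pop           : array of item popularities
    info          : array of informativeness values info(l) for every item
    """
    candidate_ids = np.asarray(candidate_ids)
    p_fn = pop[candidate_ids] / N                                # step (i): prior
    F_hat = np.array([np.mean(scores <= scores[l])               # step (ii): empirical CDF
                      for l in candidate_ids])
    unb = (1 - F_hat) * (1 - p_fn) / (
        1 - F_hat - p_fn + 2 * F_hat * p_fn)                     # step (iii): posterior
    risk = info[candidate_ids] * (1 - (1 + lam) * unb)           # conditional sampling risk
    return int(candidate_ids[np.argmin(risk)])                   # step (iv): argmin
```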

If you have any questions, please feel free to contact wangbang@hust.edu.cn or liubin0606@hust.edu.cn. For more details about BNS (Bin Liu & Bang Wang, 2022), see our paper or poster at https://doi.org/10.48550/arXiv.2204.06520 .

Distribution Analysis

The key is the class-conditional density. On the basis of an order-relation analysis of negatives' scores, we derive the class-conditional densities of true negatives and of false negatives, and provide an affirmative answer, from a Bayesian viewpoint, to the question of distinguishing true negatives from false negatives.

(Figures: real distribution vs. theoretical distribution of true/false negative scores)

We exhibit the distribution morphology of false negatives and true negatives derived from the ordinal relation under different types of $f(x)$: the Gaussian distribution $x \sim \mathcal{N}(\mu,\sigma)$ (symmetric), the Student's t distribution $x \sim t(n)$ (symmetric), and the Gamma distribution $x \sim Ga(\alpha,\lambda)$ (asymmetric). As we can see, during training the real distributions of true/false negatives gradually exhibit the same structure as the theoretical distributions. Although the probability density function $f(x)$ changes during training, this separated structure is sufficient for distinguishing true negative instances from false negative instances.

Dataset Description:

MovieLens100K; MovieLens1M; Yahoo!-R3.

Our Bayesian negative sampling algorithm does not depend on the dataset. Simply split the dataset you are interested in into a training set and a test set, replace the train.csv and test.csv files, and you can run Bayesian negative sampling easily. You only need to make sure you have chosen appropriate prior information for modeling the prior probability $P_{tn}(\cdot)$ or $P_{fn}(\cdot)$.
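A minimal sketch of preparing a custom dataset; the file name, column layout, and 80/20 split ratio are assumptions to adapt to the format expected by train.csv and test.csv.

```python
import numpy as np

# Load the raw interactions (assumed format: user, item, rating per row).
data = np.loadtxt("ratings.csv", delimiter=",")

# Randomly hold out 20% of the interactions as the test split.
mask = np.random.rand(len(data)) < 0.8
np.savetxt("train.csv", data[mask], delimiter=",", fmt="%g")
np.savetxt("test.csv", data[~mask], delimiter=",", fmt="%g")
```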
