Showing 7 changed files with 256 additions and 194 deletions.
@@ -6,7 +6,7 @@ Introduction
    :caption: Table of Contents
    :maxdepth: 1
 
    task
-   data
-   model
+   model/index
+   benchmark
 
This file was deleted.
@@ -0,0 +1,18 @@
============
CoClustering
============

Co-Clustering [#COC]_ predicts ratings by simultaneously clustering users and items, then combining the co-cluster average rating with user and item offsets:

.. math::
   \hat r_{ij}=A^{COC}_{gh}+(A^R_i-A^{RC}_g)+(A^C_j-A^{CC}_h)

where :math:`A^{COC}_{gh}` is the average rating of co-cluster :math:`(g,h)`; :math:`A^R_i` and :math:`A^C_j` are the average ratings of user :math:`i` and item :math:`j`; and :math:`A^{RC}_g` and :math:`A^{CC}_h` are the average ratings of user-cluster :math:`g` and item-cluster :math:`h`.
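
As a minimal sketch, the prediction rule can be written as a plain function, assuming training has already produced the cluster assignments and average ratings (all names below are illustrative, not the library's API):

```python
# Hypothetical sketch of the Co-Clustering prediction rule. Assumes the
# cluster assignments and the various average ratings were computed during
# training; none of these names come from the library itself.

def coclustering_predict(i, j, A_coc, A_row, A_rc, A_col, A_cc,
                         user_cluster, item_cluster):
    """Predict user i's rating of item j.

    A_coc[g][h]  -- average rating of co-cluster (g, h)
    A_row[i]     -- average rating of user i
    A_rc[g]      -- average rating of user-cluster g
    A_col[j]     -- average rating of item j
    A_cc[h]      -- average rating of item-cluster h
    """
    g, h = user_cluster[i], item_cluster[j]
    # Co-cluster average plus the user's and item's offsets from their clusters.
    return A_coc[g][h] + (A_row[i] - A_rc[g]) + (A_col[j] - A_cc[h])
```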

Training
========

References
==========

.. [#COC] George, Thomas, and Srujana Merugu. "A scalable collaborative filtering framework based on co-clustering." Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM). IEEE, 2005.
@@ -0,0 +1,70 @@
======
Models
======

+----------------------+------------------------------+------------------+
| Model                | Data                         | Task             |
+----------------------+----------+----------+--------+--------+---------+
|                      | explicit | implicit | weight | rating | ranking |
+======================+==========+==========+========+========+=========+
| BaseLine             | √        |          |        | √      | √       |
+----------------------+----------+----------+--------+--------+---------+
| NMF [#NMF]_          | √        |          |        | √      | √       |
+----------------------+----------+----------+--------+--------+---------+
| SVD                  | √        |          |        | √      | √       |
+----------------------+----------+----------+--------+--------+---------+
| SVD++ [#SVDPP]_      | √        |          |        | √      | √       |
+----------------------+----------+----------+--------+--------+---------+
| KNN [#KNN]_          | √        |          |        | √      | √       |
+----------------------+----------+----------+--------+--------+---------+
| CoClustering [#COC]_ | √        |          |        | √      | √       |
+----------------------+----------+----------+--------+--------+---------+
| SlopeOne [#SO]_      | √        |          |        | √      | √       |
+----------------------+----------+----------+--------+--------+---------+
| ItemPop              | √        | √        |        |        | √       |
+----------------------+----------+----------+--------+--------+---------+
| WRMF [#WRMF]_        | √        | √        | √      |        | √       |
+----------------------+----------+----------+--------+--------+---------+
| BPR [#BPR]_          | √        | √        |        |        | √       |
+----------------------+----------+----------+--------+--------+---------+

Models that accept implicit feedback are more general: explicit feedback can be converted to implicit feedback, and item ranking can be derived from predicted ratings.
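
The conversion can be sketched in a few lines; the binarization threshold of 3.0 below is an arbitrary assumption for illustration, not something the text specifies:

```python
# Hypothetical helper: binarize explicit ratings into implicit feedback.
# The 3.0 threshold is an assumption for illustration only.

def explicit_to_implicit(ratings, threshold=3.0):
    """ratings: dict mapping (user, item) -> explicit rating.

    Returns only the positive interactions, each with weight 1.
    """
    return {(u, i): 1 for (u, i), r in ratings.items() if r >= threshold}
```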

Non-Personalized Models
=======================

Personalized Models
===================

.. toctree::
   :caption: Personalized Models
   :maxdepth: 1

   matrix_factorization
   knn
   coclustering
   slopeone

References
==========

.. [#Surprise] Hug, Nicolas. Surprise, a Python library for recommender systems. http://surpriselib.com, 2017.
.. [#LibRec] Guo, Guibing, Jie Zhang, Zhu Sun, and Neil Yorke-Smith. "LibRec: A Java Library for Recommender Systems." Posters, Demos, Late-breaking Results and Workshop Proceedings of the 23rd Conference on User Modelling, Adaptation and Personalization (UMAP), 2015.
.. [#NMF] Luo, Xin, et al. "An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems." IEEE Transactions on Industrial Informatics 10.2 (2014): 1273-1284.
.. [#SO] Lemire, Daniel, and Anna Maclachlan. "Slope one predictors for online rating-based collaborative filtering." Proceedings of the 2005 SIAM International Conference on Data Mining. SIAM, 2005.
.. [#COC] George, Thomas, and Srujana Merugu. "A scalable collaborative filtering framework based on co-clustering." Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM). IEEE, 2005.
.. [#WRMF] Hu, Yifan, Yehuda Koren, and Chris Volinsky. "Collaborative filtering for implicit feedback datasets." Proceedings of the Eighth IEEE International Conference on Data Mining (ICDM). IEEE, 2008.
.. [#KNN] Desrosiers, Christian, and George Karypis. "A comprehensive survey of neighborhood-based recommendation methods." Recommender Systems Handbook. Springer, Boston, MA, 2011. 107-144.
.. [#SVDPP] Koren, Yehuda. "Factorization meets the neighborhood: a multifaceted collaborative filtering model." Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2008.
.. [#BPR] Rendle, Steffen, et al. "BPR: Bayesian personalized ranking from implicit feedback." Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.
@@ -0,0 +1,103 @@
===
KNN
===

KNN
---

Neighbors-based models [#KNN]_ predict a user's rating from similar users or similar items. There are two kinds of neighbors-based models, user-based and item-based, depending on whether predictions are made from similar users or from similar items. In general, item-based models outperform user-based models because items' characteristics are more stable than users' preferences.

.. _similarity:

Similarity
^^^^^^^^^^

Similarity metrics determine which neighbors count as nearest. Let :math:`I_u` and :math:`I_v` denote the sets of items rated by users :math:`u` and :math:`v`, and let :math:`U_i` and :math:`U_j` denote the sets of users who rated items :math:`i` and :math:`j`. The most commonly used similarity functions are:

Cosine
""""""

.. math::
   \cos(u,v)=\frac{\sum\limits_{k\in I_u\cap I_v}r_{uk}\cdot r_{vk}}{\sqrt{\sum\limits_{k\in I_u\cap I_v}r_{uk}^2}\cdot\sqrt{\sum\limits_{k\in I_u\cap I_v}r_{vk}^2}}

Pearson
"""""""

Pearson similarity is cosine similarity applied to ratings that have first been centered by subtracting each user's mean:

.. math::
   \text{pearson}(a,b)=\frac{\sum\limits_{k\in I_a\cap I_b}(r_{ak}-\tilde r_a)\cdot (r_{bk}-\tilde r_b)}{\sqrt{\sum\limits_{k\in I_a\cap I_b}(r_{ak}-\tilde r_a)^2}\cdot\sqrt{\sum\limits_{k\in I_a\cap I_b}(r_{bk}-\tilde r_b)^2}}

where :math:`\tilde r_a` is the mean of the ratings given by user :math:`a`:

.. math::
   \tilde r_a = \frac{1}{|I_a|}\sum_{k\in I_a} r_{ak}

Mean Square Distance
""""""""""""""""""""

The *Mean Square Distance* is

.. math::
   \text{msd}(a,b)=\frac{1}{|I_a\cap I_b|}\sum_{k\in I_a\cap I_b}(r_{ak}-r_{bk})^2

and the *Mean Square Distance Similarity* is its shifted inverse:

.. math::
   \text{msd\_sim}(u, v) = \frac{1}{\text{msd}(u, v) + 1}
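
The three similarity functions above can be sketched over rating dictionaries as follows; this is an illustrative implementation, not the library's actual code:

```python
# Illustrative implementations of the three similarity metrics.
# ra, rb: dicts mapping item id -> rating for two users (or user id ->
# rating for two items); these containers are assumptions for the sketch.
import math

def cosine(ra, rb):
    """Cosine similarity over the co-rated items."""
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    num = sum(ra[k] * rb[k] for k in common)
    den = (math.sqrt(sum(ra[k] ** 2 for k in common))
           * math.sqrt(sum(rb[k] ** 2 for k in common)))
    return num / den if den else 0.0

def pearson(ra, rb):
    """Cosine similarity after centering each user's ratings by their mean."""
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    ma = sum(ra.values()) / len(ra)  # mean over all of the user's ratings
    mb = sum(rb.values()) / len(rb)
    return cosine({k: ra[k] - ma for k in common},
                  {k: rb[k] - mb for k in common})

def msd_sim(ra, rb):
    """Shifted inverse of the mean squared distance over co-rated items."""
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    msd = sum((ra[k] - rb[k]) ** 2 for k in common) / len(common)
    return 1.0 / (msd + 1.0)
```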

Predict
^^^^^^^

A rating can be predicted from the :math:`k` nearest neighbors :math:`\mathcal N_k(u)` (the :math:`k` users or items with the highest similarity):

.. math::
   \hat r_{ui}=\frac{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)r_{vi}}{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)}

This basic prediction ignores rating biases; the variants below correct for them and achieve higher accuracy.
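
The weighted average can be sketched as a small function, assuming the neighbor set, pairwise similarities, and ratings are already available (the containers below are assumptions of the sketch):

```python
# Illustrative basic KNN prediction: a similarity-weighted average of the
# neighbors' ratings. sim and ratings are assumed precomputed lookups.

def knn_predict(u, i, neighbors, sim, ratings):
    """neighbors: ids in N_k(u); sim[(u, v)]: similarity of u and v;
    ratings[(v, i)]: neighbor v's rating of target i."""
    num = sum(sim[(u, v)] * ratings[(v, i)] for v in neighbors)
    den = sum(sim[(u, v)] for v in neighbors)
    return num / den if den else 0.0
```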

KNN with Mean
"""""""""""""

Each neighbor's rating is centered on its mean before averaging, where :math:`\tilde r_l` is the mean of the :math:`l`-th user's (or item's) ratings:

.. math::
   \hat r_{ij}=\tilde r_i+\frac{\sum_{l\in \mathcal N_k(i)}\text{sim}(i,l)(r_{lj}-\tilde r_l)}{\sum_{l\in \mathcal N_k(i)}\text{sim}(i,l)}

KNN with Z-score
""""""""""""""""

Ratings are standardized rather than merely centered, where :math:`\sigma(r_l)` is the standard deviation of the :math:`l`-th user's (or item's) ratings:

.. math::
   \hat r_{ij}=\tilde r_i+\sigma(r_i)\frac{\sum_{l\in \mathcal N_k(i)}\text{sim}(i,l)\frac{r_{lj}-\tilde r_l}{\sigma(r_l)}}{\sum_{l\in \mathcal N_k(i)}\text{sim}(i,l)}

KNN with Baseline
"""""""""""""""""

Each neighbor's rating is centered on its baseline estimate :math:`b_l`, which comes from the baseline model :math:`\hat r_{ij}=b+b_i+b_j+p_i^Tq_j`:

.. math::
   \hat r_{ij}=b_i+\frac{\sum_{l\in \mathcal N_k(i)}\text{sim}(i,l)(r_{lj}- b_l)}{\sum_{l\in \mathcal N_k(i)}\text{sim}(i,l)}

KNN with baseline is usually the most accurate of these variants because it accounts for user and item biases.
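
A sketch of the baseline-centered variant, assuming precomputed similarities, ratings, and baseline estimates (all container names here are illustrative assumptions):

```python
# Illustrative KNN-with-baseline prediction: each neighbor's rating is
# centered on its baseline estimate before the similarity-weighted average.

def knn_baseline_predict(i, j, neighbors, sim, ratings, baseline):
    """neighbors: ids l in N_k(i); sim[(i, l)]: similarity of i and l;
    ratings[(l, j)]: neighbor l's rating of target j;
    baseline[l]: baseline estimate b_l."""
    num = sum(sim[(i, l)] * (ratings[(l, j)] - baseline[l]) for l in neighbors)
    den = sum(sim[(i, l)] for l in neighbors)
    return baseline[i] + (num / den if den else 0.0)
```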

References
==========

.. [#KNN] Desrosiers, Christian, and George Karypis. "A comprehensive survey of neighborhood-based recommendation methods." Recommender Systems Handbook. Springer, Boston, MA, 2011. 107-144.