Added proper references in documentation
NicolasHug committed Dec 28, 2016
1 parent dcad442 commit 1a7a2e8
Showing 13 changed files with 104 additions and 65 deletions.
3 changes: 2 additions & 1 deletion doc/source/conf.py
@@ -38,7 +38,8 @@
'sphinx.ext.viewcode',
'sphinx.ext.graphviz',
'sphinx.ext.inheritance_diagram',
'sphinx.ext.autosummary'
'sphinx.ext.autosummary',
'sphinxcontrib.bibtex',
]

# Add any paths that contain templates here, relative to this directory.
2 changes: 1 addition & 1 deletion doc/source/index.rst
@@ -30,9 +30,9 @@ to contribute and send pull requests (see `GitHub page
:hidden:

getting_started
notation_standards
prediction_algorithms
building_custom_algo
notation_standards


.. toctree::
13 changes: 11 additions & 2 deletions doc/source/notation_standards.rst
@@ -1,7 +1,7 @@
.. _notation_standards:

Notation standards
==================
Notation standards, References
==============================

In the documentation, you will find the following notation:

@@ -30,3 +30,12 @@ In the documentation, you will find the following notation:
* :math:`N_u^k(i)` : the :math:`k` nearest neighbors of item :math:`i` that
are rated by user :math:`u`. This set is computed using a :py:mod:`similarity
metric <surprise.similarities>`.

.. rubric:: References

Here are the papers used as references in the documentation. Links to PDF files
were added when possible. A simple Google search should easily lead you to the
missing ones :)

.. bibliography:: refs.bib
:all:
33 changes: 11 additions & 22 deletions doc/source/prediction_algorithms.rst
@@ -1,7 +1,7 @@
.. _prediction_algorithms:

Prediction algorithms
=====================
Using prediction algorithms
===========================

Surprise provides a bunch of built-in algorithms. You can find the details
of each of these in the :mod:`surprise.prediction_algorithms` package
@@ -43,10 +43,8 @@ First of all, if you do not want to configure the way baselines are computed,
you don't have to: the default parameters will do just fine. If you do want
to, well... this is for you.

You may want to read section 2.1 of `Factor in the Neighbors: Scalable and
Accurate Collaborative Filtering
<http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf>`_ by
Yehuda Koren to get a good idea of what are baseline estimates.
You may want to read section 2.1 of :cite:`Koren:2010` to get a good idea of
what baseline estimates are.

Baselines can be estimated in two different ways:

@@ -60,29 +58,20 @@ values are ``'als'`` (default) and ``'sgd'``. Depending on its value, other
options may be set. For ALS:

- ``'reg_i'``: The regularization parameter for items, corresponding to
:math:`\lambda_2` in the `paper
<http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf>`_.
Default is 10.
:math:`\lambda_2` in :cite:`Koren:2010`. Default is ``10``.
- ``'reg_u'``: The regularization parameter for users, corresponding to
:math:`\lambda_3` in the `paper
<http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf>`_.
Default is 15.
- ``'n_epochs'``: The number of iteration of the ALS procedure. Default is 10.
Note that in the `paper
<http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf>`_, what
is described is a **single** iteration ALS process.
:math:`\lambda_3` in :cite:`Koren:2010`. Default is ``15``.
- ``'n_epochs'``: The number of iterations of the ALS procedure. Default is
``10``. Note that :cite:`Koren:2010` describes a **single**-iteration ALS
process.

And for SGD:

- ``'reg'``: The regularization parameter of the cost function that is
optimized, corresponding to :math:`\lambda_1` and then :math:`\lambda_5` in
the `paper
<http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf>`_.
Default is 0.02.
:cite:`Koren:2010`. Default is ``0.02``.
- ``'learning_rate'``: The learning rate of SGD, corresponding to
:math:`\gamma` in the `paper
<http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf>`_.
Default is 0.005.
:math:`\gamma` in :cite:`Koren:2010`. Default is ``0.005``.
- ``'n_epochs'``: The number of iterations of the SGD procedure. Default is ``20``.
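
As an illustration, here is a minimal sketch of how these options can be
passed (``BaselineOnly`` and every ``bsl_options`` key come from the
documentation above; the values themselves are arbitrary)::

    from surprise import BaselineOnly

    # ALS, with custom regularization terms and a shorter procedure.
    bsl_options = {'method': 'als',
                   'n_epochs': 5,
                   'reg_u': 12,
                   'reg_i': 5,
                   }
    algo = BaselineOnly(bsl_options=bsl_options)

    # SGD, with a learning rate smaller than the default 0.005.
    bsl_options = {'method': 'sgd',
                   'learning_rate': .002,
                   }
    algo = BaselineOnly(bsl_options=bsl_options)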

.. note::
4 changes: 2 additions & 2 deletions doc/source/prediction_algorithms_package.rst
@@ -5,8 +5,8 @@ prediction_algorithms package

.. automodule:: surprise.prediction_algorithms

You may want to check the :ref:`notation_standards` before diving into the
formulas.
You may want to check the :ref:`notation standards <notation_standards>`
before diving into the formulas.


.. toctree::
55 changes: 55 additions & 0 deletions doc/source/refs.bib
@@ -0,0 +1,55 @@
@article{Koren:2010,
author = {Koren, Yehuda},
title = {Factor in the Neighbors: Scalable and Accurate Collaborative Filtering},
journal = {ACM Transactions on Knowledge Discovery from Data (TKDD)},
year = {2010},
url = {http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf},
}

@book{Ricci:2010,
author = {Ricci, Francesco and Rokach, Lior and Shapira, Bracha and Kantor, Paul B.},
title = {Recommender Systems Handbook},
year = {2010},
edition = {1st},
publisher = {Springer},
}

@article{salakhutdinov2008a,
author = {Salakhutdinov, Ruslan and Mnih, Andriy},
title = {Probabilistic Matrix Factorization},
journal = {Advances in Neural Information Processing Systems},
year = {2008},
url = {http://papers.nips.cc/paper/3208-probabilistic-matrix-factorization.pdf},
}

@article{Koren:2009,
author = {Koren, Yehuda and Bell, Robert and Volinsky, Chris},
title = {Matrix Factorization Techniques for Recommender Systems},
journal = {Computer},
year = {2009},
}

@article{lemire2007a,
author = {Lemire, Daniel and Maclachlan, Anna},
title = {Slope One Predictors for Online Rating-Based Collaborative Filtering},
journal = {arXiv preprint cs/0702144},
year = {2007},
url = {http://arxiv.org/abs/cs/0702144},
}

@article{Koren:2008:FMN,
author = {Koren, Yehuda},
title = {Factorization Meets the Neighborhood: A Multifaceted Collaborative Filtering Model},
journal = {Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
year = {2008},
url = {http://www.cs.rochester.edu/twiki/pub/Main/HarpSeminar/Factorization_Meets_the_Neighborhood-_a_Multifaceted_Collaborative_Filtering_Model.pdf},
}

@article{George:2005,
author = {George, Thomas and Merugu, Srujana},
title = {A Scalable Collaborative Filtering Framework Based on Co-Clustering},
journal = {Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM'05)},
year = {2005},
url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.113.6458&rep=rep1&type=pdf},
}
1 change: 1 addition & 0 deletions requirements_dev.txt
@@ -5,3 +5,4 @@ six>=1.10.0
pytest>=3.0.3
sphinx
sphinx_rtd_theme
sphinxcontrib-bibtex
2 changes: 1 addition & 1 deletion surprise/prediction_algorithms/__init__.py
@@ -14,7 +14,7 @@
knns.KNNBaseline
matrix_factorization.SVD
matrix_factorization.SVDpp
slope_one.SlopeOne
slope_one.SlopeOne
co_clustering.CoClustering
"""

4 changes: 2 additions & 2 deletions surprise/prediction_algorithms/baseline_only.py
@@ -16,13 +16,13 @@ class BaselineOnly(AlgoBase):
If user :math:`u` is unknown, then the bias :math:`b_u` is assumed to be
zero. The same applies for item :math:`i` with :math:`b_i`.
See paper *Factor in the Neighbors: Scalable and Accurate Collaborative
Filtering* by Yehuda Koren for details.
See section 2.1 of :cite:`Koren:2010` for details.
Args:
bsl_options(dict): A dictionary of options for the baseline estimates
computation. See :ref:`baseline_estimates_configuration` for
accepted options.
"""

def __init__(self, bsl_options={}):
8 changes: 2 additions & 6 deletions surprise/prediction_algorithms/co_clustering.pyx
@@ -15,10 +15,7 @@ from .predictions import PredictionImpossible
class CoClustering(AlgoBase):
"""A collaborative filtering algorithm based on co-clustering.
This is a straighforward implementation of paper `A Scalable Collaborative
Filtering Framework based on Co-clustering
<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.113.6458&rep=rep1&type=pdf>`_
by George and Merugu.
This is a straightforward implementation of :cite:`George:2005`.
Basically, users and items are assigned some clusters :math:`C_u`,
:math:`C_i`, and some co-clusters :math:`C_{ui}`.
@@ -35,8 +32,7 @@ class CoClustering(AlgoBase):
:math:`i`'s cluster.
Clusters are assigned using a straightforward optimization method, much
like k-means. More details can be found at the authors' `paper
<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.113.6458&rep=rep1&type=pdf>`_
like k-means.
Args:
n_cltr_u(int): Number of user clusters. Default is ``3``.
6 changes: 2 additions & 4 deletions surprise/prediction_algorithms/knns.py
@@ -227,10 +227,8 @@ class KNNBaseline(SymmetricAlgo):
depending on the ``user_based`` field of the ``sim_options`` parameter.
For details, see paper `Factor in the Neighbors: Scalable and Accurate
Collaborative Filtering
<http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf>`_ by
Yehuda Koren.
This algorithm corresponds to formula (3), section 2.2 of
:cite:`Koren:2010`.
Args:
k(int): The (max) number of neighbors to take into account for
29 changes: 12 additions & 17 deletions surprise/prediction_algorithms/matrix_factorization.pyx
@@ -16,10 +16,9 @@ from .algo_base import AlgoBase
class SVD(AlgoBase):
"""The famous *SVD* algorithm, as popularized by `Simon Funk
<http://sifter.org/~simon/journal/20061211.html>`_ during the Netflix
Prize. When baselines are not used, this is equivalent to `Probabilistic
Matrix Factorization
<http://papers.nips.cc/paper/3208-probabilistic-matrix-factorization.pdf>`_
by Salakhutdinov and Mnih (see :ref:`note <unbiased_note>` below).
Prize. When baselines are not used, this is equivalent to Probabilistic
Matrix Factorization :cite:`salakhutdinov2008a` (see :ref:`note
<unbiased_note>` below).
The prediction :math:`\\hat{r}_{ui}` is set as:
@@ -30,11 +29,8 @@ class SVD(AlgoBase):
:math:`p_u` are assumed to be zero. The same applies for item :math:`i`
with :math:`b_i` and :math:`q_i`.
For details, see eq. 5 from `Matrix Factorization Techniques For
Recommender Systems
<http://www.columbia.edu/~jwp2128/Teaching/W4721/papers/ieeecomputer.pdf>`_
by Koren, Bell and Volinsky. See also *The Recommender System Handbook*,
section 5.3.1.
For details, see eq. (5) from :cite:`Koren:2009`. See also
:cite:`Ricci:2010`, section 5.3.1.
To estimate all the unknowns, we minimize the following regularized squared
error:
@@ -73,9 +69,9 @@ class SVD(AlgoBase):
.. math::
\\hat{r}_{ui} = q_i^Tp_u
This is equivalent to `Probabilistic Matrix Factorization
<http://papers.nips.cc/paper/3208-probabilistic-matrix-factorization.pdf>`_
and can be achieved by setting the ``biased`` parameter to ``False``.
This is equivalent to Probabilistic Matrix Factorization
(:cite:`salakhutdinov2008a`, section 2) and can be achieved by setting
the ``biased`` parameter to ``False``.
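
    A minimal sketch of this unbiased variant (only the ``biased`` flag is
    taken from the note above; the rest is illustrative)::

        from surprise import SVD

        # Plain (unbiased) matrix factorization, i.e. PMF.
        algo = SVD(biased=False)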
Args:
@@ -251,17 +247,16 @@ class SVDpp(AlgoBase):
|I_u|^{-\\frac{1}{2}} \\sum_{j \\in I_u}y_j\\right)
Where the :math:`y_j` terms are a new set of item factors that capture
implicite ratings.
implicit ratings. Here, an implicit rating describes the fact that a user
:math:`u` rated an item :math:`j`, regardless of the rating value.
If user :math:`u` is unknown, then the bias :math:`b_u` and the factors
:math:`p_u` are assumed to be zero. The same applies for item :math:`i`
with :math:`b_i`, :math:`q_i` and :math:`y_i`.
For details, see eq. 15 from `Factorization Meets The
Neighborhood
<http://www.cs.rochester.edu/twiki/pub/Main/HarpSeminar/Factorization_Meets_the_Neighborhood-_a_Multifaceted_Collaborative_Filtering_Model.pdf>`_
by Yehuda Koren. See also *The Recommender System Handbook*, section 5.3.1.
For details, see section 4 of :cite:`Koren:2008:FMN`. See also
:cite:`Ricci:2010`, section 5.3.1.
Just as for :class:`SVD`, the parameters are learnt using SGD on the
regularized squared error objective.
9 changes: 2 additions & 7 deletions surprise/prediction_algorithms/slope_one.pyx
@@ -17,9 +17,8 @@ from .predictions import PredictionImpossible
class SlopeOne(AlgoBase):
"""A simple yet accurate collaborative filtering algorithm.
This is a straighforward implementation of the `SlopeOne
<http://lemire.me/fr/documents/publications/lemiremaclachlan_sdm05.pdf>`_
algorithm by Lemire and Maclachlan.
This is a straightforward implementation of the SlopeOne algorithm
:cite:`lemire2007a`.
The prediction :math:`\\hat{r}_{ui}` is set as:
@@ -36,10 +35,6 @@
.. math::
\\text{dev}(i, j) = \\frac{1}{
|U_{ij}|}\\sum\\limits_{u \\in U_{ij}} r_{ui} - r_{uj}
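
For intuition, here is a small, hedged sketch of the deviation computation in
plain Python (a toy dict-of-dicts rating matrix, not this class's optimized
implementation)::

    # Toy ratings: ratings[user][item] = r_ui.
    ratings = {'u1': {'i': 4, 'j': 3},
               'u2': {'i': 5, 'j': 5},
               'u3': {'i': 2, 'j': 1}}

    # U_ij: the users that have rated both item i and item j.
    common = [u for u in ratings if 'i' in ratings[u] and 'j' in ratings[u]]

    # dev(i, j): mean of (r_ui - r_uj) over U_ij, here (1 + 0 + 1) / 3.
    dev = sum(ratings[u]['i'] - ratings[u]['j'] for u in common)
    dev /= float(len(common))
    print(dev)  # 0.666...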
For further details, please refer to the author's `paper
<http://lemire.me/fr/documents/publications/lemiremaclachlan_sdm05.pdf>`_.
"""

def __init__(self):
