Commit fcadf63: document missing data
mdekstrand committed Nov 23, 2018
1 parent 0380d3b
Showing 4 changed files with 75 additions and 42 deletions.
47 changes: 6 additions & 41 deletions doc/evaluation.rst → doc/evaluation/index.rst
@@ -9,6 +9,12 @@ We generally recommend using Jupyter_ notebooks for evaluation.
.. _batch utilities: batch.html
.. _Jupyter: https://jupyter.org

.. toctree::
:caption: Evaluation Topics

predict-metrics
topn-metrics

Loading Outputs
~~~~~~~~~~~~~~~

@@ -47,44 +53,3 @@ data set, you can do::
# group and aggregate
nbr_ndcg = user_ndcg.groupby('max_neighbors').nDCG.mean()
nbr_ndcg.plot()

Prediction Accuracy Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: lenskit.metrics.predict

The :py:mod:`lenskit.metrics.predict` module contains prediction accuracy metrics.

.. autofunction:: rmse
.. autofunction:: mae

Top-*N* Accuracy Metrics
~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: lenskit.metrics.topn

The :py:mod:`lenskit.metrics.topn` module contains metrics for evaluating top-*N*
recommendation lists.

Classification Metrics
----------------------

These metrics treat the recommendation list as a classification of relevant items.

.. autofunction:: precision
.. autofunction:: recall

Ranked List Metrics
-------------------

These metrics treat the recommendation list as a ranked list of items that may or may not
be relevant.

.. autofunction:: recip_rank

Utility Metrics
---------------

The nDCG function estimates a utility score for a ranked list of recommendations.

.. autofunction:: ndcg
38 changes: 38 additions & 0 deletions doc/evaluation/predict-metrics.rst
@@ -0,0 +1,38 @@
Prediction Accuracy Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: lenskit.metrics.predict

The :py:mod:`lenskit.metrics.predict` module contains prediction accuracy metrics.

Metric Functions
----------------

.. autofunction:: rmse
.. autofunction:: mae
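As a rough sketch of what these metrics compute (illustrative only, not the library's implementation), RMSE and MAE over aligned prediction and truth series can be written as:

```python
import numpy as np
import pandas as pd

def sketch_rmse(preds, truth):
    """Root mean squared error over aligned series (illustrative only)."""
    err = preds - truth
    return float(np.sqrt((err ** 2).mean()))

def sketch_mae(preds, truth):
    """Mean absolute error over aligned series (illustrative only)."""
    return float((preds - truth).abs().mean())

preds = pd.Series([3.5, 4.0, 2.0])
truth = pd.Series([4.0, 4.0, 3.0])
print(sketch_rmse(preds, truth))  # sqrt((0.25 + 0 + 1) / 3)
print(sketch_mae(preds, truth))   # (0.5 + 0 + 1) / 3 = 0.5
```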

Working with Missing Data
-------------------------

LensKit rating predictors do not report predictions when their core model is unable
to predict. For example, a nearest-neighbor recommender will not score an item if
it cannot find any suitable neighbors. Following the Pandas convention, these items
are given a score of NaN. (When Pandas implements better missing-data handling,
LensKit will use that, so test for missing values with :py:func:`pandas.Series.isna`
and :py:func:`pandas.Series.notna`, not the ``isnan`` variants.)
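A minimal sketch of filtering missing predictions with the Pandas predicates (hypothetical prediction values):

```python
import numpy as np
import pandas as pd

preds = pd.Series([4.2, np.nan, 3.1], index=['a', 'b', 'c'])
scored = preds[preds.notna()]   # items the model could score
missing = preds[preds.isna()]   # items with no prediction
print(len(scored), len(missing))  # 2 1
```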

However, this causes problems when computing predictive accuracy: recommenders are not
being tested on the same set of items. If a recommender only scores the easy items, for
example, it could do much better than a recommender that is willing to attempt more
difficult items.
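To make the hazard concrete (toy numbers, hypothetical helper): a predictor that declines to score hard items can appear more accurate than one that attempts everything, because its error is computed over an easier item set:

```python
import numpy as np
import pandas as pd

truth = pd.Series([4.0, 1.0, 5.0], index=['easy1', 'hard', 'easy2'])
# predictor A declines to score the hard item; predictor B attempts it
preds_a = pd.Series([4.2, np.nan, 4.8], index=truth.index)
preds_b = pd.Series([4.2, 3.0, 4.8], index=truth.index)

def rmse_on_scored(preds, truth):
    """RMSE computed only over items the predictor actually scored."""
    mask = preds.notna()
    err = preds[mask] - truth[mask]
    return float(np.sqrt((err ** 2).mean()))

print(rmse_on_scored(preds_a, truth))  # easy items only: low error
print(rmse_on_scored(preds_b, truth))  # includes the hard item: higher error
```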

A good solution to this is to use a *fallback predictor* so that every item has a
prediction. In LensKit, :py:class:`lenskit.algorithms.basic.Fallback` implements
this functionality; it wraps a sequence of recommenders, and for each item, uses
the first one that generates a score.

You set it up like this::

cf = ItemItem(20)
base = Bias(damping=5)
algo = Fallback(cf, base)
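The effect of a fallback can be sketched with plain Pandas (hypothetical prediction series, not the LensKit API itself): each missing primary score is filled from the baseline.

```python
import numpy as np
import pandas as pd

cf_preds = pd.Series([4.1, np.nan, np.nan], index=['a', 'b', 'c'])
bias_preds = pd.Series([3.6, 3.2, 2.9], index=['a', 'b', 'c'])

# use the CF score when available, otherwise fall back to the bias score
combined = cf_preds.fillna(bias_preds)
print(combined.isna().sum())  # 0 -- every item now has a prediction
```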
30 changes: 30 additions & 0 deletions doc/evaluation/topn-metrics.rst
@@ -0,0 +1,30 @@
Top-*N* Accuracy Metrics
~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: lenskit.metrics.topn

The :py:mod:`lenskit.metrics.topn` module contains metrics for evaluating top-*N*
recommendation lists.

Classification Metrics
----------------------

These metrics treat the recommendation list as a classification of relevant items.

.. autofunction:: precision
.. autofunction:: recall
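A sketch of the underlying set arithmetic (toy data, not the library implementation): precision is the fraction of recommended items that are relevant, recall the fraction of relevant items that were recommended.

```python
recs = ['a', 'b', 'c', 'd']   # recommended items, in rank order
relevant = {'b', 'd', 'e'}    # ground-truth relevant items

hits = [i for i in recs if i in relevant]
precision = len(hits) / len(recs)     # 2 of 4 recommendations are relevant
recall = len(hits) / len(relevant)    # 2 of 3 relevant items were recommended
print(precision, recall)
```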

Ranked List Metrics
-------------------

These metrics treat the recommendation list as a ranked list of items that may or may not
be relevant.

.. autofunction:: recip_rank
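The idea behind reciprocal rank can be sketched as follows (toy data): the score is the inverse of the 1-based rank of the first relevant item, or 0 if no relevant item appears.

```python
recs = ['a', 'b', 'c']   # recommended items, in rank order
relevant = {'c', 'z'}    # ground-truth relevant items

rr = 0.0
for rank, item in enumerate(recs, 1):
    if item in relevant:
        rr = 1 / rank
        break
print(rr)  # first relevant item is at rank 3
```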

Utility Metrics
---------------

The nDCG function estimates a utility score for a ranked list of recommendations.

.. autofunction:: ndcg
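A sketch of one common nDCG formulation (illustrative only; the library may use a different gain or discount): discounted cumulative gain over the recommended order, normalized by the DCG of the ideal ordering.

```python
import numpy as np

def dcg(scores):
    """Discounted cumulative gain with a log2 rank discount."""
    ranks = np.arange(1, len(scores) + 1)
    return float(np.sum(np.asarray(scores) / np.log2(ranks + 1)))

gains = [3.0, 2.0, 3.0]              # relevance of items in recommended order
ideal = sorted(gains, reverse=True)  # best possible ordering of the same items
ndcg_value = dcg(gains) / dcg(ideal)
print(ndcg_value)  # between 0 and 1; 1 means the ideal ordering
```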
2 changes: 1 addition & 1 deletion doc/index.rst
@@ -45,7 +45,7 @@ Resources
GettingStarted
crossfold
batch
evaluation
evaluation/index
algorithms
util

