Commit fcadf63: document missing data
mdekstrand committed Nov 23, 2018
1 parent 0380d3b
Showing 4 changed files with 75 additions and 42 deletions.
47 changes: 6 additions & 41 deletions doc/evaluation.rst → doc/evaluation/index.rst
@@ -9,6 +9,12 @@ We generally recommend using Jupyter_ notebooks for evaluation.
.. _batch utilities: batch.html
.. _Jupyter: https://jupyter.org

.. toctree::
:caption: Evaluation Topics

predict-metrics
topn-metrics

Loading Outputs
~~~~~~~~~~~~~~~

@@ -47,44 +53,3 @@ data set, you can do::
# group and aggregate
nbr_ndcg = user_ndcg.groupby('max_neighbors').nDCG.mean()
nbr_ndcg.plot()

Prediction Accuracy Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: lenskit.metrics.predict

The :py:mod:`lenskit.metrics.predict` module contains prediction accuracy metrics.

.. autofunction:: rmse
.. autofunction:: mae

Top-*N* Accuracy Metrics
~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: lenskit.metrics.topn

The :py:mod:`lenskit.metrics.topn` module contains metrics for evaluating top-*N*
recommendation lists.

Classification Metrics
----------------------

These metrics treat the recommendation list as a classification of relevant items.

.. autofunction:: precision
.. autofunction:: recall

Ranked List Metrics
-------------------

These metrics treat the recommendation list as a ranked list of items that may or may not
be relevant.

.. autofunction:: recip_rank

Utility Metrics
---------------

The nDCG function estimates a utility score for a ranked list of recommendations.

.. autofunction:: ndcg
38 changes: 38 additions & 0 deletions doc/evaluation/predict-metrics.rst
@@ -0,0 +1,38 @@
Prediction Accuracy Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: lenskit.metrics.predict

The :py:mod:`lenskit.metrics.predict` module contains prediction accuracy metrics.

Metric Functions
----------------

.. autofunction:: rmse
.. autofunction:: mae
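As a rough sketch of what these metrics compute (illustrative only, not the library's implementation), RMSE and MAE over aligned prediction and truth series can be written as:

```python
import numpy as np
import pandas as pd

def sketch_rmse(preds, truth):
    """Root mean squared error over aligned series (illustrative only)."""
    err = preds - truth
    return float(np.sqrt((err ** 2).mean()))

def sketch_mae(preds, truth):
    """Mean absolute error over aligned series (illustrative only)."""
    return float((preds - truth).abs().mean())

preds = pd.Series([3.5, 4.0, 2.0])
truth = pd.Series([4.0, 4.0, 3.0])
print(sketch_rmse(preds, truth))  # sqrt((0.25 + 0 + 1) / 3)
print(sketch_mae(preds, truth))   # (0.5 + 0 + 1) / 3 = 0.5
```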

Working with Missing Data
-------------------------

LensKit rating predictors do not report predictions when their core model is unable
to predict. For example, a nearest-neighbor recommender will not score an item if
it cannot find any suitable neighbors. Following the Pandas convention, these items
are given a score of NaN. (When Pandas implements better missing-data handling,
LensKit will use that, so test for missing values with :py:func:`pandas.Series.isna`
and :py:func:`pandas.Series.notna`, not the ``isnan`` variants.)
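A minimal sketch of filtering missing predictions with the Pandas predicates (hypothetical prediction values):

```python
import numpy as np
import pandas as pd

preds = pd.Series([4.2, np.nan, 3.1], index=['a', 'b', 'c'])
scored = preds[preds.notna()]   # items the model could score
missing = preds[preds.isna()]   # items with no prediction
print(len(scored), len(missing))  # 2 1
```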

However, this causes problems when computing predictive accuracy: recommenders are not
being tested on the same set of items. If a recommender only scores the easy items, for
example, it could do much better than a recommender that is willing to attempt more
difficult items.
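To make the hazard concrete (toy numbers, hypothetical helper): a predictor that declines to score hard items can appear more accurate than one that attempts everything, because its error is computed over an easier item set:

```python
import numpy as np
import pandas as pd

truth = pd.Series([4.0, 1.0, 5.0], index=['easy1', 'hard', 'easy2'])
# predictor A declines to score the hard item; predictor B attempts it
preds_a = pd.Series([4.2, np.nan, 4.8], index=truth.index)
preds_b = pd.Series([4.2, 3.0, 4.8], index=truth.index)

def rmse_on_scored(preds, truth):
    """RMSE computed only over items the predictor actually scored."""
    mask = preds.notna()
    err = preds[mask] - truth[mask]
    return float(np.sqrt((err ** 2).mean()))

print(rmse_on_scored(preds_a, truth))  # easy items only: low error
print(rmse_on_scored(preds_b, truth))  # includes the hard item: higher error
```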

A good solution to this is to use a *fallback predictor* so that every item has a
prediction. In LensKit, :py:class:`lenskit.algorithms.basic.Fallback` implements
this functionality; it wraps a sequence of recommenders, and for each item, uses
the first one that generates a score.

You set it up like this::

cf = ItemItem(20)
base = Bias(damping=5)
algo = Fallback(cf, base)
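The effect of a fallback can be sketched with plain Pandas (hypothetical prediction series, not the LensKit API itself): each missing primary score is filled from the baseline.

```python
import numpy as np
import pandas as pd

cf_preds = pd.Series([4.1, np.nan, np.nan], index=['a', 'b', 'c'])
bias_preds = pd.Series([3.6, 3.2, 2.9], index=['a', 'b', 'c'])

# use the CF score when available, otherwise fall back to the bias score
combined = cf_preds.fillna(bias_preds)
print(combined.isna().sum())  # 0 -- every item now has a prediction
```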
30 changes: 30 additions & 0 deletions doc/evaluation/topn-metrics.rst
@@ -0,0 +1,30 @@
Top-*N* Accuracy Metrics
~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: lenskit.metrics.topn

The :py:mod:`lenskit.metrics.topn` module contains metrics for evaluating top-*N*
recommendation lists.

Classification Metrics
----------------------

These metrics treat the recommendation list as a classification of relevant items.

.. autofunction:: precision
.. autofunction:: recall
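A sketch of the underlying set arithmetic (toy data, not the library implementation): precision is the fraction of recommended items that are relevant, recall the fraction of relevant items that were recommended.

```python
recs = ['a', 'b', 'c', 'd']   # recommended items, in rank order
relevant = {'b', 'd', 'e'}    # ground-truth relevant items

hits = [i for i in recs if i in relevant]
precision = len(hits) / len(recs)     # 2 of 4 recommendations are relevant
recall = len(hits) / len(relevant)    # 2 of 3 relevant items were recommended
print(precision, recall)
```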

Ranked List Metrics
-------------------

These metrics treat the recommendation list as a ranked list of items that may or may not
be relevant.

.. autofunction:: recip_rank
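The idea behind reciprocal rank can be sketched as follows (toy data): the score is the inverse of the 1-based rank of the first relevant item, or 0 if no relevant item appears.

```python
recs = ['a', 'b', 'c']   # recommended items, in rank order
relevant = {'c', 'z'}    # ground-truth relevant items

rr = 0.0
for rank, item in enumerate(recs, 1):
    if item in relevant:
        rr = 1 / rank
        break
print(rr)  # first relevant item is at rank 3
```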

Utility Metrics
---------------

The nDCG function estimates a utility score for a ranked list of recommendations.

.. autofunction:: ndcg
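A sketch of one common nDCG formulation (illustrative only; the library may use a different gain or discount): discounted cumulative gain over the recommended order, normalized by the DCG of the ideal ordering.

```python
import numpy as np

def dcg(scores):
    """Discounted cumulative gain with a log2 rank discount."""
    ranks = np.arange(1, len(scores) + 1)
    return float(np.sum(np.asarray(scores) / np.log2(ranks + 1)))

gains = [3.0, 2.0, 3.0]              # relevance of items in recommended order
ideal = sorted(gains, reverse=True)  # best possible ordering of the same items
ndcg_value = dcg(gains) / dcg(ideal)
print(ndcg_value)  # between 0 and 1; 1 means the ideal ordering
```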
2 changes: 1 addition & 1 deletion doc/index.rst
@@ -45,7 +45,7 @@ Resources
GettingStarted
crossfold
batch
evaluation
evaluation/index
algorithms
util

