Commit

delete old joblib notes
mdekstrand committed Oct 22, 2021
1 parent 1f31998 commit 6130dac
Showing 2 changed files with 0 additions and 18 deletions.
17 changes: 0 additions & 17 deletions doc/impl-tips.rst
@@ -63,20 +63,3 @@ that use randomness at predict or recommendation time, not just training time, s
value ``'user'`` for the ``rng`` parameter, and if it is passed, derive a new seed for each user
using :py:func:`seedbank.derive_seed` to allow reproducibility in the face of parallelism for common
experimental designs. :py:func:`lenskit.util.derivable_rng` automates this logic.

Memory Map Friendliness
-----------------------

LensKit uses :py:class:`joblib.Parallel` to parallelize internal operations (when it isn't using Numba).
Joblib is pretty good about using shared memory to minimize memory overhead in parallel computations,
and LensKit has some tricks to maximize this use. However, it does require a bit of attention in
your algorithm implementation.

The easiest way to make this fail is to use many small NumPy or Pandas data structures. If you have
a dictionary of :py:class:`np.ndarray` objects, for instance, it will cause a problem. This is because
each array will be memory-mapped, and each map will *reopen* the file. Having too many active
open files will cause your process to run out of file descriptors on many systems. Keep your
object count to a small, ideally fixed number; in :py:class:`lenskit.algorithms.basic.UnratedItemSelector`,
we do this by storing user and item indexes along with a :py:class:`matrix.CSR` containing the items
rated by each user. The old implementation had a dictionary mapping user IDs to ``ndarray``s with
each user's rated items. This is a change from :math:`|U|+1` arrays to 5 arrays.
1 change: 0 additions & 1 deletion min-constraints.txt
@@ -4,5 +4,4 @@ scipy==1.2.1
numba==0.51.0
pyarrow==0.15.1
cffi==1.12.2
joblib==0.13.0
binpickle==0.3.2

