
Commit

bug with missing maxiter
david-cortes committed Aug 9, 2018
1 parent 294c646 commit 16c1132
Showing 2 changed files with 6 additions and 5 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -2,7 +2,7 @@

This is a Python package for hierarchical Poisson factorization, a form of probabilistic matrix factorization used for recommender systems with implicit count data, based on the paper _Scalable Recommendation with Hierarchical Poisson Factorization (P. Gopalan, 2015)_.
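As a rough illustration of the model described above (a toy sketch, not the hpfrec API): users and items each get nonnegative Gamma-distributed latent factors, and an observed count is modeled as Poisson with rate equal to their inner product. All sizes and parameter values below are made up:

```python
import numpy as np

# Toy generative sketch of Poisson factorization (sizes/priors are made up)
rng = np.random.default_rng(123)
n_users, n_items, k = 5, 4, 3
theta = rng.gamma(shape=0.3, scale=1.0, size=(n_users, k))  # user factors
beta = rng.gamma(shape=0.3, scale=1.0, size=(n_items, k))   # item factors
rates = theta @ beta.T        # expected counts, shape (n_users, n_items)
counts = rng.poisson(rates)   # simulated implicit count data
```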

Although the package was created with recommender systems in mind, it can also be used for other domains, e.g. as a faster alternative to LDA (Latent Dirichlet Allocation), where users become documents and items become words.
Although the package was created with recommender systems in mind, it can also be used for other domains, e.g. as a faster alternative to LDA (Latent Dirichlet Allocation), where users become documents and items become words. For a similar package that also uses item/user side information, see [ctpfrec](https://github.com/david-cortes/ctpfrec).

Supports parallelization, full-batch variational inference, mini-batch stochastic variational inference (based on batches of data from subsets of users), and different stopping criteria for the coordinate-ascent procedure. The bottleneck computations are written in fast Cython code.
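The mini-batch scheme mentioned above (stochastic variational inference on batches of data from subsets of users) can be sketched with plain NumPy; the number of users and the batch size here are made-up values, not package defaults:

```python
import numpy as np

# Sketch of splitting users into mini-batches for stochastic variational
# inference (user count and batch size are illustrative only)
rng = np.random.default_rng(0)
user_ids = np.arange(1000)
users_per_batch = 100
shuffled = rng.permutation(user_ids)
batches = [shuffled[i:i + users_per_batch]
           for i in range(0, len(shuffled), users_per_batch)]
```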

@@ -47,7 +47,7 @@ pip install hpfrec
As it contains Cython code, it requires a C compiler. In Windows, this usually means it requires a Visual Studio installation (or MinGW + GCC), and if using Anaconda, might also require configuring it to use said Visual Studio instead of MinGW, otherwise the installation from `pip` might fail. For more details see this guide:
[Cython Extensions On Windows](https://github.com/cython/cython/wiki/CythonExtensionsOnWindows)

On Python 2.7 on Windows, it might additionally requiring installing extra Visual Basic modules.
On Python 2.7 on Windows, it might additionally require installing extra Visual Basic modules (untested).

On Linux and Mac, the `pip` install should work out-of-the-box, as long as the system has `gcc` (included by default in most installs).

@@ -112,7 +112,7 @@ recommender.predict(user=10, item=11)
recommender.predict(user=[10,10,10], item=[1,2,3])
recommender.predict(user=[10,11,12], item=[4,5,6])

## Evaluating model likelihood
## Evaluating Poisson likelihood
recommender.eval_llk(counts_df, full_llk=True)
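As a rough illustration of what the "full" Poisson likelihood involves (the `full_llk=True` flag includes the constant terms), here is a hedged NumPy sketch; `poisson_llk` is a hypothetical helper for exposition, not the package's `eval_llk` implementation:

```python
import numpy as np
from math import lgamma

# Sketch: log Poisson pmf summed over entries, y*log(rate) - rate - log(y!)
def poisson_llk(y, rate, full_llk=True):
    llk = y * np.log(rate) - rate
    if full_llk:
        # the constant log(y!) terms that full_llk=True includes
        llk -= np.array([lgamma(v + 1.0) for v in y])
    return float(llk.sum())
```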

## Determining latent factors for a new user, given her item interactions
@@ -162,7 +162,7 @@ It is also internally documented through docstrings (e.g. you can try `help(hpfr

## Speeding up optimization procedure

For faster fitting and predictions, use scipy and numpy libraries compiled against MKL. In Windows, you can find Python wheels (installable with pip after downloading them) of numpy and scipy precompiled with MKL in [Christoph Gohlke's website](https://www.lfd.uci.edu/~gohlke/pythonlibs/). In Linux and Mac, these come by default in Anaconda installations (but are likely to get overwritten if you enable `conda-forge`). In some small experiments from my side, this yields a near 4x speedup compared to using free linear algebra libraries (for AMD cpu's, the speedup might not be as large).
For faster fitting and predictions, use SciPy and NumPy libraries compiled against MKL. On Windows, you can find Python wheels (installable with pip after downloading them) of NumPy and SciPy precompiled with MKL on [Christoph Gohlke's website](https://www.lfd.uci.edu/~gohlke/pythonlibs/). On Linux and Mac, these come by default in Anaconda installations (but are likely to get overwritten if you enable `conda-forge`). In some small experiments on my side, this yields a near 4x speedup compared to using free linear algebra libraries (for AMD CPUs, the speedup might not be as large).
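One informal way to check whether the NumPy in your environment links against MKL is to print its build configuration and look for MKL in the BLAS/LAPACK entries:

```python
import numpy as np

# Prints the BLAS/LAPACK backends NumPy was built against; if the output
# mentions MKL, the speedups described above should apply
np.show_config()
```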

The constructor for HPF allows some parameters to make it run faster (if you know what you're doing): these are `allow_inconsistent_math=True`, `full_llk=False`, `stop_crit='diff-norm'`, `reindex=False`, `verbose=False`. See the documentation for more details.
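The speed-oriented settings listed above, collected into one place for convenience; the inline explanations are short paraphrases of the documentation, and actually passing them assumes hpfrec is installed:

```python
# Speed-oriented constructor settings (see the hpfrec docs for full details)
fast_params = dict(
    allow_inconsistent_math=True,  # trade numerical consistency for speed
    full_llk=False,                # evaluate a cheaper partial likelihood
    stop_crit='diff-norm',         # stop on the norm of parameter changes
    reindex=False,                 # inputs already use contiguous integer IDs
    verbose=False,                 # silence per-iteration printing
)
# recommender = hpfrec.HPF(**fast_params)  # requires hpfrec installed
```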

3 changes: 2 additions & 1 deletion hpfrec/__init__.py
@@ -99,7 +99,7 @@ class HPF:
for the global parameters as output (only used when fitting with stochastic variational inference).
The step size must be a number between zero and one, and should be decreasing with the iteration number.
Ignored when passing users_per_batch=None.
maxiter : int
maxiter : int or None
Maximum number of iterations for which to run the optimization procedure. This corresponds to epochs when
fitting in batches of users. Recommended to use a lower number when passing a batch size.
reindex : bool
@@ -217,6 +217,7 @@ def __init__(self, k=30, a=0.3, a_prime=0.3, b_prime=1.0,
else:
if stop_crit == 'maxiter':
raise ValueError("If 'stop_crit' is set to 'maxiter', must provide a maximum number of iterations.")
maxiter = 10**10

if check_every is not None:
assert isinstance(check_every, int)
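The check this commit fixes can be sketched as a standalone function (`resolve_maxiter` is a hypothetical name for exposition, not part of hpfrec): when `maxiter` is None, a maxiter-based stopping criterion is an error, and any other criterion gets an effectively unbounded iteration cap:

```python
# Standalone sketch of the maxiter validation fixed in this commit
def resolve_maxiter(maxiter, stop_crit):
    if maxiter is not None:
        assert isinstance(maxiter, int) and maxiter > 0
    else:
        if stop_crit == 'maxiter':
            raise ValueError("If 'stop_crit' is set to 'maxiter', "
                             "must provide a maximum number of iterations.")
        maxiter = 10**10  # effectively unbounded; another criterion stops first
    return maxiter
```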
Expand Down
