small bug fixes
david-cortes committed Aug 4, 2018
1 parent 686e7aa commit 5a142fa
Showing 4 changed files with 22 additions and 7 deletions.
17 changes: 16 additions & 1 deletion README.md
@@ -32,7 +32,7 @@ However, Poisson likelihood is given by the formula:
If taking the logarithm (log-likelihood), then this becomes:
```l(y) = -log(y!) + y*log(yhat) - yhat```

-Since `log(0!) = 0`, and the sum of predictions for all combinations of users and items can be quickly calculated by `sum yhat = sum_{i,j} <U_i, V_j> = <sum_i U_i, sum_j V_j>` (`U` and `V` are non-negative matrices), it means the model doesn't ever need to make calculations on values that are equal to zero - simply not adding them to calculations would implicitly assume that they are zero.
+Since `log(0!) = 0`, and the sum of predictions for all combinations of users and items can be quickly calculated by `sum yhat = sum_{i,j} <U_i, V_j> = <sum_i U_i, sum_j V_j>` (since `U` and `V` are non-negative matrices), it means the model doesn't ever need to make calculations on values that are equal to zero - simply not adding them to calculations would implicitly assume that they are zero.

Moreover, negative Poisson log-likelihood is a more appropriate loss for count data than squared loss, which tends to produce not-so-good results when the values to predict follow an exponential rather than a normal distribution.
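
The README argument in this hunk can be checked numerically. The sketch below is only an illustration: the matrices, sizes, and the helper name `sparse_poisson_loglik` are made up for the example and are not part of `hpfrec`. It verifies the sum identity (which follows from linearity of the inner product) and evaluates the Poisson log-likelihood while touching only the non-zero entries.

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 4, 3

# Hypothetical non-negative factor matrices and a small count matrix.
U = rng.gamma(1.0, 1.0, size=(n_users, k))
V = rng.gamma(1.0, 1.0, size=(n_items, k))
Y = rng.poisson(0.5, size=(n_users, n_items))

# Sum of all predictions: full user-by-item product vs. the shortcut.
full_sum = (U @ V.T).sum()
fast_sum = U.sum(axis=0) @ V.sum(axis=0)


def sparse_poisson_loglik(U, V, rows, cols, vals):
    """Poisson log-likelihood computed from the non-zero entries only."""
    # Predictions for the observed (non-zero) user-item pairs.
    yhat_nz = np.einsum("ij,ij->i", U[rows], V[cols])
    # log(y!) = lgamma(y + 1); zero entries would contribute log(0!) = 0.
    log_fact = np.array([lgamma(v + 1.0) for v in vals])
    ll_nz = np.sum(-log_fact + vals * np.log(yhat_nz))
    # The "- yhat" term summed over ALL pairs, via the shortcut.
    return ll_nz - U.sum(axis=0) @ V.sum(axis=0)


rows, cols = np.nonzero(Y)
vals = Y[rows, cols].astype(float)
ll_sparse = sparse_poisson_loglik(U, V, rows, cols, vals)

# Dense reference: evaluate l(y) = -log(y!) + y*log(yhat) - yhat everywhere.
Yhat = U @ V.T
log_fact_dense = np.vectorize(lgamma)(Y + 1.0)
ll_dense = np.sum(-log_fact_dense + Y * np.log(Yhat) - Yhat)
```

Both routes should agree up to floating-point error, while the sparse one never forms the full user-by-item prediction matrix.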

@@ -139,6 +139,21 @@ For a more detailed example, see the IPython notebook [recommending songs with E

This package contains only functionality related to fitting this model. For general evaluation metrics for recommendations on implicit data see other packages such as [lightFM](https://github.com/lyst/lightfm).

## Saving model with pickle

Using pickle to save an `HPF` object might fail due to problems with lambda functions. The following solves it:

```python
import pickle
from hpfrec import HPF

h = HPF()
h.step_size = None
pickle.dump(h, open("HPF_obj.p", "wb"))
```

(Be aware though that afterwards it won't be possible to use `partial_fit` or `add_user` with updates to item parameters.)

## Documentation

Documentation is available at readthedocs: [http://hpfrec.readthedocs.io](http://hpfrec.readthedocs.io/en/latest/)
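
The pickle note added to the README in this hunk can be reproduced without `hpfrec`. In the sketch below, a `SimpleNamespace` is a hypothetical stand-in for a model object, and it is an assumption (inferred from the README wording, not confirmed by the diff) that the problematic attribute holds a lambda. Replacing that attribute with `None` before dumping mirrors the workaround shown in the README.

```python
import pickle
from types import SimpleNamespace

# Hypothetical stand-in for a fitted model; NOT the real hpfrec.HPF class.
model = SimpleNamespace(
    weights=[0.1, 0.2, 0.3],
    # Assumption for illustration: an attribute that stores a lambda.
    step_size=lambda t: 1.0 / (1.0 + t),
)

# Lambdas cannot be pickled by the standard pickle module.
try:
    pickle.dumps(model)
    pickled_with_lambda = True
except (pickle.PicklingError, AttributeError):
    pickled_with_lambda = False

# Drop the lambda, then the round trip works.
model.step_size = None
restored = pickle.loads(pickle.dumps(model))
```

When writing to disk instead of memory, opening the file in a `with open("HPF_obj.p", "wb") as f:` block ensures it is closed after `pickle.dump`.
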
2 changes: 1 addition & 1 deletion hpfrec/__init__.py
@@ -343,7 +343,7 @@ def fit(self, counts_df, val_set=None):

 ## after terminating optimization
 if self.keep_data:
-    if self.users_per_batch != 0:
+    if self.users_per_batch == 0:
         self._store_metadata()
     else:
         self._st_ix_user = self._st_ix_user[:-1]
8 changes: 4 additions & 4 deletions hpfrec/cython_loops.pyx
@@ -303,9 +303,9 @@ def fit_hpf(float a, float a_prime, float b_prime,
np.savetxt(os.path.join(save_folder, "Gamma_rte.csv"), Gamma_rte, fmt="%.10f", delimiter=',')
np.savetxt(os.path.join(save_folder, "Lambda_shp.csv"), Lambda_shp, fmt="%.10f", delimiter=',')
np.savetxt(os.path.join(save_folder, "Lambda_rte.csv"), Lambda_rte, fmt="%.10f", delimiter=',')
-np.savetxt(os.path.join(save_folder, "kappa_rte.csv"), k_rte.reshape(-1,1), fmt="%.10f", delimiter=',')
-np.savetxt(os.path.join(save_folder, "tau_rte.csv"), t_rte.reshape(-1,1), fmt="%.10f", delimiter=',')
-np.savetxt(os.path.join(save_folder, "Phi.csv"), phi/Y, fmt="%.10f", delimiter=',')
+np.savetxt(os.path.join(save_folder, "kappa_rte.csv"), k_rte.reshape((-1,1)), fmt="%.10f", delimiter=',')
+np.savetxt(os.path.join(save_folder, "tau_rte.csv"), t_rte.reshape((-1,1)), fmt="%.10f", delimiter=',')
+np.savetxt(os.path.join(save_folder, "Phi.csv"), phi/Y.reshape((-1,1)), fmt="%.10f", delimiter=',')

## returning objects as needed
if keep_all_objs:
@@ -406,7 +406,7 @@ def calc_user_factors(float a, float a_prime, float b_prime,
Theta_prev = Theta.copy()

 if return_all:
-    return Gamma_shp, Gamma_rte, phi/Y.reshape(-1, 1)
+    return Gamma_shp, Gamma_rte, phi/Y.reshape((-1, 1))
 else:
     return None
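
The `Phi.csv` change in `fit_hpf` swaps `phi/Y` for `phi/Y.reshape((-1,1))`. The NumPy sketch below shows why that matters under assumed shapes: `phi` as a 2-D array with one row per observed entry and `Y` as a 1-D array of the same length. These shapes are inferred from the reshape call and are not stated in the diff; the toy values are invented.

```python
import numpy as np

# Assumed shapes for illustration: phi is (n_obs, k), Y is (n_obs,).
phi = np.arange(12, dtype=float).reshape(4, 3) + 1.0
Y = np.array([1.0, 2.0, 4.0, 5.0])

# Dividing (4, 3) by (4,) aligns Y with the LAST axis of phi and fails here.
try:
    phi / Y
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Reshaping Y into a column of shape (4, 1) divides each ROW of phi by its count.
scaled = phi / Y.reshape((-1, 1))
```

In plain NumPy, `reshape(-1, 1)` and `reshape((-1, 1))` are equivalent; the diff does not state why the tuple form was adopted for the other calls.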

2 changes: 1 addition & 1 deletion setup.py
@@ -27,7 +27,7 @@
'scipy',
'cython'
],
-version = '0.2.1.1',
+version = '0.2.1.2',
description = 'Hierarchical Poisson matrix factorization for recommender systems',
author = 'David Cortes',
author_email = 'david.cortes.rivera@gmail.com',