Optimization objective under explicit feedback #35

Closed · david-cortes opened this issue Nov 5, 2020 · 7 comments

@david-cortes (Contributor):
In WRMF, there is the following loss function:

// [[Rcpp::export]]
double als_loss_explicit(const Rcpp::S4 &m_csc_r, arma::mat& X, arma::mat& Y, double lambda, unsigned n_threads) {
  dMappedCSC mat = extract_mapped_csc(m_csc_r);
  size_t nc = mat.n_cols;
  double loss = 0;
  #ifdef _OPENMP
  #pragma omp parallel for num_threads(n_threads) schedule(dynamic, GRAIN_SIZE) reduction(+:loss)
  #endif
  for(size_t i = 0; i < nc; i++) {
    int p1 = mat.col_ptrs[i];
    int p2 = mat.col_ptrs[i + 1];
    for(int pp = p1; pp < p2; pp++) {
      size_t ind = mat.row_indices[pp];
      double diff = mat.values[pp] - as_scalar(Y.col(i).t() * X.col(ind));
      loss += diff * diff;
    }
  }
  if(lambda > 0)
    loss += lambda * (accu(square(X)) + accu(square(Y)));
  return loss / mat.nnz;
}

The function calculates the squared loss over the non-missing entries only.
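
If I'm reading the loop right, in plain notation that is:

sum over stored (item, user) pairs of ( r[item, user] - t(Y[, user]) %*% X[, item] )^2
  + lambda * ( sum(X^2) + sum(Y^2) ),
all divided by nnz.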

But then there is the following solver:

solver_explicit_feedback = function(R, X) {
  # FIXME - consider to use private XtX
  XtX = tcrossprod(X) + diag(x = private$lambda, nrow = private$rank, ncol = private$rank)
  solve(XtX, as(X %*% R, "matrix"))
}

The solver, however, finds the solution to a problem in which the missing entries are treated as zeros (i.e. they count towards the loss).
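
To see why, here is a minimal check with toy data of my own (not package code): the closed form above is exactly the ridge normal-equations solution of the dense problem with the missing entries filled in as zeros.

library(Matrix)
set.seed(1)
rank = 2L; n_items = 6L; lambda = 0.1
X = matrix(rnorm(rank * n_items), nrow = rank)   # item factors, rank x n_items
# one user with 2 observed ratings, stored sparsely
R = sparseMatrix(i = c(2L, 5L), j = c(1L, 1L), x = c(4.5, 3.0), dims = c(n_items, 1L))

# what solver_explicit_feedback computes
XtX = tcrossprod(X) + diag(x = lambda, nrow = rank, ncol = rank)
w1 = solve(XtX, as.matrix(X %*% R))

# ridge regression on the DENSE ratings vector, with missing entries as 0
r_dense = as.numeric(as.matrix(R))
w2 = solve(X %*% t(X) + diag(x = lambda, nrow = rank, ncol = rank), X %*% r_dense)

all.equal(as.numeric(w1), as.numeric(w2))   # TRUE: same solution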

Should one of them be fixed?

When evaluated in terms of RMSE, the missing-as-zero solver tends to give very poor results (e.g. fixing k=40 and testing on ML10M, I get an RMSE of ~0.95-0.97, compared to 0.78-0.82 with software that ignores missing entries).

@dselivanov (Owner) commented Nov 5, 2020 via email
@dselivanov (Owner):
@david-cortes you are right, solver_explicit_feedback needs to follow the same logic as the implicit one. As a quick fix, a pure-R (but still fast) solver can be adapted from here - http://dsnotes.com/post/2017-05-28-matrix-factorization-for-recommender-systems/

I will try to fix this bug next week (but of course, if you have a solution, feel free to send a PR!)
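
For reference, a rough sketch of what such a per-user fix could look like (the function name is hypothetical, not the package API; it just restricts each user's ridge problem to their observed items):

# Hypothetical pure-R solver (names are mine): for each user, solve the
# normal equations over that user's observed items only.
# X is rank x n_items; R is an items x users dgCMatrix from the Matrix package.
solver_explicit_observed = function(R, X, lambda) {
  rank = nrow(X)
  res = matrix(0, nrow = rank, ncol = ncol(R))
  for (u in seq_len(ncol(R))) {
    if (R@p[u] == R@p[u + 1L]) next          # user has no ratings
    j = (R@p[u] + 1L):R@p[u + 1L]            # positions of this user's entries in @i / @x
    items = R@i[j] + 1L                      # @i is 0-based
    Xu = X[, items, drop = FALSE]
    res[, u] = solve(Xu %*% t(Xu) + diag(x = lambda, nrow = rank, ncol = rank),
                     Xu %*% R@x[j])
  }
  res
}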

dselivanov added the bug label Nov 5, 2020
dselivanov self-assigned this Nov 5, 2020
@dselivanov (Owner):
@david-cortes I've revisited the solver and optimized the "concise" notation I'd used (solving for the full matrix at once instead of iterating over user and item factors one by one). And it indeed ignores zeros (to test whether it is correct, I created a solver which goes one by one).

Now the question is why you are getting a worse RMSE. Would it be possible to share a reproducible example (so I don't need to download the raw ML10M dataset and split and preprocess it myself)? Could it be that the other software solves the factorization with user and item biases?

dselivanov removed the bug label Nov 10, 2020
@david-cortes (Contributor, Author):
Ok, I'll share an example about the RMSE later, but for now, here's an example about solving the user factors.

Suppose we have a dataset with 8 items and 1 user, having these ratings/values:

.   .   19.25 17.47  .   18.58   .   .  

And we have these factors for the items:

      [,1]  [,2]
[1,] -0.56 -0.23
[2,]  1.56  0.07
[3,]  0.13  1.72
[4,]  0.46 -1.27
[5,] -0.69 -0.45
[6,]  1.22  0.36
[7,]  0.40  0.11
[8,] -0.56  1.79

The function solver_explicit_feedback would give these user factors (8 items, k=2):

6.17 2.38

If I calculate the optimization objective as:

sum( (real-predicted)^2 ) + lambda*sum((user_factors)^2)

Assuming something like lambda=1e-1, this is the result: 821.65

But now if I calculate the same for the following factors:

5.31 6.23

The function value would be lower: 779.04

This means the first solution was not the optimum.

Full code example:

library(Matrix)
set.seed(123)
n_items = 8L
n_factors = 2L
n_ratings = 3L
ItemFactors = matrix(rnorm(n_items*n_factors), ncol=n_items)
col_indices = sample(n_items, n_ratings, replace=FALSE)
ratings = rgamma(n_ratings, 20)
R = Matrix::sparseVector(ratings, col_indices, length=n_items)
R = as(R, "CsparseMatrix")

solver_explicit_feedback = function(R, X, lambda=1e-3, rank=2) {
  XtX = tcrossprod(X) + diag(x = lambda, nrow = rank, ncol = rank)
  solve(XtX, as(X %*% R, "matrix"))
}
solver_ignoring_missing = function(R, X, lambda=1e-1, rank=2) {
  col = R@i
  data = R@x
  rank = nrow(X)
  X = X[, col, drop=FALSE]
  LHS = X %*% t(X) + lambda*diag(x = lambda, nrow = rank, ncol = rank)  # note: lambda enters twice here
  RHS = X %*% data
  return(solve(LHS, RHS))
}
calc_obj = function(R, ItemFactors, user_vec, lambda=1e-3) {
  col = R@i
  data = R@x
  pred = t(ItemFactors) %*% user_vec
  pred = as.numeric(pred[col])
  err = pred - data
  return(sum(err^2) + lambda * sum(user_vec^2))
}

opt1 = solver_explicit_feedback(R, ItemFactors, rank=n_factors)
opt2 = solver_ignoring_missing(R, ItemFactors, rank=n_factors)
calc_obj(R, ItemFactors, opt1)
calc_obj(R, ItemFactors, opt2)

Outputs:

[1] 821.6462
[1] 779.0446

@dselivanov (Owner):
Please try the latest commit: https://github.com/rexyai/rsparse/tree/90966fb134fbfdd2e2c649e8198a1eab8fae1d0a

The issue seems to have been that XtX needs to be calculated only over the non-zero users/items (as you did in your example above).

Also note that in your example you need to use col = R@i + 1L (shift the indices by 1). The @i and @p slots of R's sparse matrices start from 0 rather than the usual 1, so they essentially follow the canonical CSC and CSR formats.
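
A minimal illustration of those 0-based slots on a toy matrix:

library(Matrix)
m = sparseMatrix(i = c(1L, 3L), j = c(1L, 1L), x = c(10, 20))  # 3 x 1, rows 1 and 3 set
m@i                    # 0 2  <- 0-based row indices
m@p                    # 0 2  <- 0-based column pointers
which(m[, 1] != 0)     # 1 3  <- the usual 1-based R indices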

dselivanov added the bug label Nov 10, 2020
@david-cortes (Contributor, Author):
Thanks, it seems to be working correctly now.

@dselivanov (Owner):
@david-cortes thanks for the bug report and the really helpful reproducible example!
