Optimization objective under explicit feedback #35

Closed · david-cortes opened this issue Nov 5, 2020 · 7 comments

@david-cortes (Contributor):
In WRMF, there is the following loss function:

// [[Rcpp::export]]
double als_loss_explicit(const Rcpp::S4 &m_csc_r, arma::mat& X, arma::mat& Y, double lambda, unsigned n_threads) {
  dMappedCSC mat = extract_mapped_csc(m_csc_r);
  size_t nc = mat.n_cols;
  double loss = 0;
  #ifdef _OPENMP
  #pragma omp parallel for num_threads(n_threads) schedule(dynamic, GRAIN_SIZE) reduction(+:loss)
  #endif
  for(size_t i = 0; i < nc; i++) {
    int p1 = mat.col_ptrs[i];
    int p2 = mat.col_ptrs[i + 1];
    for(int pp = p1; pp < p2; pp++) {
      size_t ind = mat.row_indices[pp];
      double diff = mat.values[pp] - as_scalar(Y.col(i).t() * X.col(ind));
      loss += diff * diff;
    }
  }
  if(lambda > 0)
    loss += lambda * (accu(square(X)) + accu(square(Y)));
  return loss / mat.nnz;
}

The function calculates the squared loss over the non-missing entries only.
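
If I'm reading the loop right, in plain notation that is:

sum over stored (item, user) pairs of ( r[item, user] - t(Y[, user]) %*% X[, item] )^2
  + lambda * ( sum(X^2) + sum(Y^2) ),
all divided by nnz.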

But then there is the following solver:

solver_explicit_feedback = function(R, X) {
  # FIXME - consider to use private XtX
  XtX = tcrossprod(X) + diag(x = private$lambda, nrow = private$rank, ncol = private$rank)
  solve(XtX, as(X %*% R, "matrix"))
}

The solver, however, finds the solution to a problem in which the missing entries are treated as zeros (i.e. they count towards the loss).
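
To see why, here is a minimal check with toy data of my own (not package code): the closed form above is exactly the ridge normal-equations solution of the dense problem with the missing entries filled in as zeros.

library(Matrix)
set.seed(1)
rank = 2L; n_items = 6L; lambda = 0.1
X = matrix(rnorm(rank * n_items), nrow = rank)   # item factors, rank x n_items
# one user with 2 observed ratings, stored sparsely
R = sparseMatrix(i = c(2L, 5L), j = c(1L, 1L), x = c(4.5, 3.0), dims = c(n_items, 1L))

# what solver_explicit_feedback computes
XtX = tcrossprod(X) + diag(x = lambda, nrow = rank, ncol = rank)
w1 = solve(XtX, as.matrix(X %*% R))

# ridge regression on the DENSE ratings vector, with missing entries as 0
r_dense = as.numeric(as.matrix(R))
w2 = solve(X %*% t(X) + diag(x = lambda, nrow = rank, ncol = rank), X %*% r_dense)

all.equal(as.numeric(w1), as.numeric(w2))   # TRUE: same solution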

Should one of them be fixed?

When evaluated in terms of RMSE, the missing-as-zero solver tends to give very poor results (e.g. fixing k=40 and testing on ML10M, I get an RMSE of ~0.95-0.97, compared to 0.78-0.82 with software that ignores missing entries).

@dselivanov (Owner) commented Nov 5, 2020 via email
@dselivanov (Owner):
@david-cortes you are right, solver_explicit_feedback needs to follow the same logic as the implicit one. As a quick fix, a pure-R (but still fast) solver can be adapted from here - http://dsnotes.com/post/2017-05-28-matrix-factorization-for-recommender-systems/

I will try to fix this bug next week (but of course, if you have a solution, feel free to send a PR!)
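
For reference, a rough sketch of what such a per-user fix could look like (the function name is hypothetical, not the package API; it just restricts each user's ridge problem to their observed items):

# Hypothetical pure-R solver (names are mine): for each user, solve the
# normal equations over that user's observed items only.
# X is rank x n_items; R is an items x users dgCMatrix from the Matrix package.
solver_explicit_observed = function(R, X, lambda) {
  rank = nrow(X)
  res = matrix(0, nrow = rank, ncol = ncol(R))
  for (u in seq_len(ncol(R))) {
    if (R@p[u] == R@p[u + 1L]) next          # user has no ratings
    j = (R@p[u] + 1L):R@p[u + 1L]            # positions of this user's entries in @i / @x
    items = R@i[j] + 1L                      # @i is 0-based
    Xu = X[, items, drop = FALSE]
    res[, u] = solve(Xu %*% t(Xu) + diag(x = lambda, nrow = rank, ncol = rank),
                     Xu %*% R@x[j])
  }
  res
}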

dselivanov added the bug label Nov 5, 2020
dselivanov self-assigned this Nov 5, 2020
@dselivanov (Owner):
@david-cortes I've revisited the solver and optimized the "concise" notation I'd used (solving for the full matrix at once instead of iterating over user and item factors one by one). And it indeed ignores zeros (to test whether it is correct, I created a solver which goes one by one).

Now the question is why you are getting a worse RMSE. Would it be possible to share a reproducible example (so I don't need to download the raw ML10M dataset and split and preprocess it myself)? Could it be that the other software solves the factorization with user and item biases?

dselivanov removed the bug label Nov 10, 2020
@david-cortes (Contributor, Author):
Ok, I'll share an example about the RMSE later, but for now, here's an example about solving the user factors.

Suppose we have a dataset with 8 items and 1 user, having these ratings/values:

.   .   19.25 17.47  .   18.58   .   .  

And we have these factors for the items:

      [,1]  [,2]
[1,] -0.56 -0.23
[2,]  1.56  0.07
[3,]  0.13  1.72
[4,]  0.46 -1.27
[5,] -0.69 -0.45
[6,]  1.22  0.36
[7,]  0.40  0.11
[8,] -0.56  1.79

The function solver_explicit_feedback would give these user factors (8 items, k=2):

6.17 2.38

If I calculate the optimization objective as:

sum( (real-predicted)^2 ) + lambda*sum((user_factors)^2)

Assuming something like lambda=1e-1, this is the result: 821.65

But now if I calculate the same for the following factors:

5.31 6.23

The function value would be lower: 779.04

This means the first solution was not the optimum.

Full code example:

library(Matrix)
set.seed(123)
n_items = 8L
n_factors = 2L
n_ratings = 3L
ItemFactors = matrix(rnorm(n_items*n_factors), ncol=n_items)
col_indices = sample(n_items, n_ratings, replace=FALSE)
ratings = rgamma(n_ratings, 20)
R = Matrix::sparseVector(ratings, col_indices, length=n_items)
R = as(R, "CsparseMatrix")

solver_explicit_feedback = function(R, X, lambda=1e-3, rank=2) {
  XtX = tcrossprod(X) + diag(x = lambda, nrow = rank, ncol = rank)
  solve(XtX, as(X %*% R, "matrix"))
}
solver_ignoring_missing = function(R, X, lambda=1e-1, rank=2) {
  col = R@i
  data = R@x
  rank = nrow(X)
  X = X[, col, drop=FALSE]
  LHS = X %*% t(X) + lambda*diag(x = lambda, nrow = rank, ncol = rank)  # note: lambda enters twice here
  RHS = X %*% data
  return(solve(LHS, RHS))
}
calc_obj = function(R, ItemFactors, user_vec, lambda=1e-3) {
  col = R@i
  data = R@x
  pred = t(ItemFactors) %*% user_vec
  pred = as.numeric(pred[col])
  err = pred - data
  return(sum(err^2) + lambda * sum(user_vec^2))
}

opt1 = solver_explicit_feedback(R, ItemFactors, rank=n_factors)
opt2 = solver_ignoring_missing(R, ItemFactors, rank=n_factors)
calc_obj(R, ItemFactors, opt1)
calc_obj(R, ItemFactors, opt2)

Outputs:

[1] 821.6462
[1] 779.0446

@dselivanov (Owner):
Please try the latest commit: https://github.com/rexyai/rsparse/tree/90966fb134fbfdd2e2c649e8198a1eab8fae1d0a

The issue seems to have been that XtX needs to be calculated only over the non-zero users/items (as you did in your example above).

Also note that in your example you need to use col = R@i + 1L (shift the indices by 1). The @i and @p slots of R's sparse matrices start from 0 rather than the usual 1, so they essentially follow the canonical CSC and CSR formats.
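
A minimal illustration of those 0-based slots on a toy matrix:

library(Matrix)
m = sparseMatrix(i = c(1L, 3L), j = c(1L, 1L), x = c(10, 20))  # 3 x 1, rows 1 and 3 set
m@i                    # 0 2  <- 0-based row indices
m@p                    # 0 2  <- 0-based column pointers
which(m[, 1] != 0)     # 1 3  <- the usual 1-based R indices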

dselivanov added the bug label Nov 10, 2020
@david-cortes (Contributor, Author):
Thanks, it seems to be working correctly now.

@dselivanov (Owner):
@david-cortes thanks for the bug report and the really helpful reproducible example!
