
Cholesky solver #38

Closed · david-cortes opened this issue Nov 15, 2020 · 8 comments

@david-cortes (Contributor) commented Nov 15, 2020

I'm getting some pretty bad results using WRMF with the Cholesky solver.

I tried doing a reproducible experiment as follows:

  • Took the LastFM360k data.
  • Set the first 357k users as train and the remaining ~1800 as test.
  • Fit a WRMF model with k=40, lambda=1, and 10 iterations.
  • Calculated factors and top-5 predictions for the test users.
  • Calculated hit rate at 5 for them.

And got the following results:

  • CG: 0.4441335
  • Cholesky: 0.4179763

This is certainly not what I'd expect, since Cholesky is a more exact method and should lead to better results. Using different random seeds did not change the outcome significantly.

I additionally see strange behavior in the timings:

  • CG: ~21s
  • Chol: ~32s

According to the reference paper "Applications of the conjugate gradient method for implicit feedback collaborative filtering", the Cholesky solver should take close to 3x longer than the CG one in this setting. I ran the same experiment with a different package and got the expected results: Cholesky takes 3x longer and achieves a better end result:

  • CG: ~23s | 0.4501615
  • Chol: ~64s | 0.4512379

I tried playing with the Armadillo option arma::solve_opts::fast, changing it to something more exact, but it didn't make any difference in the HR@5 metric.

I'm not familiar with Armadillo, but I suspect this line is incorrect:

arma::Mat<T> inv = XtX + X_nnz.each_row() % (confidence.t() - 1) * X_nnz.t();

It appears to do a row-wise operation, whereas it should be doing rank-1 updates.
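
As a sanity check, here is a minimal standalone Armadillo sketch (hypothetical sizes; X_nnz, confidence, and XtX are stand-ins rather than the package's actual objects) comparing that expression against explicit rank-1 updates, i.e. XtX + sum_j (c_j - 1) * x_j * x_j', where the x_j are the columns of X_nnz:

#include <armadillo>

int main() {
  const arma::uword rank = 4, nnz = 6;
  arma::mat X_nnz(rank, nnz, arma::fill::randn);  // factors of the nonzero items, one per column
  arma::vec confidence(nnz, arma::fill::randu);
  confidence += 1;                                // confidence weights >= 1
  arma::mat XtX = arma::eye(rank, rank);          // stand-in for the precomputed X'X + lambda*I

  // the expression from the solver, parenthesized as C++ actually parses it
  arma::mat a = XtX + (X_nnz.each_row() % (confidence.t() - 1)) * X_nnz.t();

  // the same matrix built from explicit rank-1 updates
  arma::mat b = XtX;
  for (arma::uword j = 0; j < nnz; ++j)
    b += (confidence(j) - 1) * X_nnz.col(j) * X_nnz.col(j).t();

  std::cout << arma::norm(a - b, "fro") << std::endl;  // ~0 if the two forms agree
  return 0;
}

Algebraically, scaling column j of X_nnz by (c_j - 1) and then multiplying by X_nnz.t() sums exactly those rank-1 terms, so if this prints ~0 the line itself is fine and the problem is elsewhere.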

@dselivanov (Owner)
Thanks for reporting, I will take a look. Actually I expect the problem is the same as in #35: XtX.

> This is certainly not what I'd expect, since Cholesky is a more exact method and should lead to better results.

Loss with Cholesky should be lower, but the hit rate might be worse since we don't optimize it directly...

@dselivanov (Owner)
@david-cortes could you set the logger to debug and check the loss? From what I see, Cholesky achieves a smaller (or about the same) loss:

library(rsparse)
library(lgr)
lg = get_logger('rsparse')
lg$set_threshold('debug')
data('movielens100k')

train = movielens100k
train = log1p(train)

for (seed in c(1, 2, 3)) {
  for (solver in c('cholesky', 'conjugate_gradient')) {
    set.seed(seed)  # make each run reproducible and give both solvers the same initialization
    model = WRMF$new(rank = 50, lambda = 1, solver = solver)
    message(glue::glue("seed {seed}, solver {solver}"))
    user_emb = model$fit_transform(train, n_iter = 10, convergence_tol = -1, cg_steps = 3)
  }
}

@david-cortes (Contributor, Author) commented Nov 15, 2020

But that loss is not the same function that is being minimized. The solver minimizes (or is supposed to minimize) the weighted squared error over all entries of the matrix, whereas the logged loss computes it only over the present entries.
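
To make the distinction concrete, here is a small dense sketch (hypothetical names and shapes, not the package's code) contrasting the two functions, with confidence 1 on missing entries and the 0/1 preference matrix P doubling as the mask for the present entries (regularization omitted for brevity):

#include <armadillo>

int main() {
  const arma::uword n_users = 5, n_items = 7, rank = 3;
  arma::mat X(rank, n_users, arma::fill::randn);    // user factors
  arma::mat Y(rank, n_items, arma::fill::randn);    // item factors
  arma::mat P(n_users, n_items, arma::fill::zeros); // preference: 1 where observed
  arma::mat C(n_users, n_items, arma::fill::ones);  // confidence: 1 where missing
  P(0, 1) = 1; C(0, 1) = 5;                         // a few observed entries
  P(2, 3) = 1; C(2, 3) = 3;

  arma::mat R = P - X.t() * Y;                 // residuals over the full matrix
  double loss_all = arma::accu(C % R % R);     // what the ALS solver minimizes
  double loss_obs = arma::accu(P % C % R % R); // the loss restricted to present entries
  std::cout << loss_all << " vs " << loss_obs << std::endl;
  return 0;
}

A solver that lowers loss_all can still come out slightly worse on loss_obs, which would explain the logged numbers.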

@dselivanov (Owner)
@david-cortes you are right! I'm starting to forget all the details...

@david-cortes (Contributor, Author) commented Nov 15, 2020

Well, I did some experiments calculating the loss over all entries, and the two solvers do indeed end up with a similar loss. It's weird that the result from Cholesky then performs worse on other metrics, and that it takes so little time.

@david-cortes (Contributor, Author) commented Nov 15, 2020

I did some more experiments, this time on the same MovieLens data. After running for 10 iterations with seed=1, I get these losses:

  • CG: 74202.05
  • Cholesky: 74295.25

Whereas with the other package I was comparing against, I get these:

  • CG: 126483.4
  • Cholesky: 78421.88

I wonder where the difference comes from...

@dselivanov (Owner)
And if you remove the arma::solve_opts::fast argument from the solver?
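
For context: arma::solve_opts::fast tells Armadillo's solver to skip its extra accuracy machinery (equilibration and iterative refinement), so removing it trades some speed for a more accurate solution. A minimal standalone illustration of the two call forms (not the package's code):

#include <armadillo>

int main() {
  const arma::uword n = 40;
  arma::mat A(n, n, arma::fill::randu);
  A = A.t() * A + arma::eye(n, n);   // SPD system, like the ALS normal equations
  arma::vec b(n, arma::fill::randu);

  // current code path: fast mode, lower accuracy
  arma::vec x_fast = arma::solve(A, b, arma::solve_opts::fast);
  // proposed: default mode, slower but more accurate
  arma::vec x_def = arma::solve(A, b);

  std::cout << arma::norm(x_fast - x_def) << std::endl;  // size of the discrepancy
  return 0;
}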

@david-cortes (Contributor, Author)
Then the loss does end up lower than for CG, even though on a test set it didn't improve HR@5 or AUC. I guess technically it is working as it should - that is, it minimizes the function it is intended to - so I'll close this.
