Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inaccurate timing results #91

Open
dbackenroth opened this issue Jan 3, 2021 · 1 comment
Open

Inaccurate timing results #91

dbackenroth opened this issue Jan 3, 2021 · 1 comment

Comments

@dbackenroth
Copy link

Thank you for developing the great CVXR package. The timing stats provided by CVXR seem to be drastically underestimating the amount of time required for problem solution. 

Here is my script:

library(CVXR)
library(MASS)

Test <- function(n) {
  n_ds <- 1
  r_sq <- runif(n_ds, min = 0.05, max = 0.15)
  prev <- sample(c(0.005, 0.01, 0.05, 0.1), size = n_ds, replace = T)
  C <- GetCostMatrix(prev = prev, r_sq = r_sq, n)
  P <- GetPMatrix(C = C, m = round(n * 2/3))
}

GetPMatrix <- function(C, m) {
  n <- nrow(C)
  P <- Variable(rows = n, cols = n, name = "P", symmetric = T)
  ones <- matrix(rep(1, n))
  constraints <- list(
    P <= 1,
    0 <= P,
    P %*% ones == m * ones
  )
  objective <- Minimize(sum(C * P))
  problem <- Problem(objective, constraints = constraints)
  a <- Sys.time()
  result <- psolve(problem, solver = "ECOS", verbose = F)
  print("Sys.time timing:")
  print(Sys.time() - a)
  print("psolve timing:")
  print(result$solve_time + result$setup_time)
  P_res <- result$getValue(P)
  return(P_res)
}

GetCostMatrix <- function(prev, r_sq, n) {
  n_ds <- length(prev)
  z_k <- qnorm(prev, lower.tail = F)
  scores <- mvrnorm(n = n, mu = rep(0, n_ds), Sigma = diag(r_sq, nrow = length(r_sq)))
  grid <- expand.grid(m = 1:n, w = 1:n)
  mean_scores <- (scores[grid$m, , drop = F] + scores[grid$w, , drop = F]) / 2
  # 1 - Phi((z_k - c) / sqrt(1 - r^2_ps / 2))
  
  p_d <- matrix(NA, nrow = nrow(grid), ncol = n_ds)
  for (i in 1:n_ds) {
    p_d[, i] <- 1 - pnorm((z_k[i] - mean_scores[, i, drop = F]) / sqrt(1 - r_sq[i] / 2))
  }
  p_no_d <- apply(1 - p_d, 1, prod)
  no_d_matrix <- matrix(p_no_d, nrow = n, ncol = n, byrow = F)
}

I print out the time using both Sys.time() and also using the solve_time and setup_time variables returned by CVXR::psolve. It appears that time is both much higher with Sys.time() than with solve_time and setup_time, and it also increases at a much faster rate with problem size (# rows in symmetric matrix which we want to optimize) as measured by Sys.time() then by using solve_time and setup_time. 

Is there some other operation done by CVXR::psolve other than solve and setup, and is there anything that can reduce how long that operation takes?

Test(100)
[1] "Sys.time timing:"
Time difference of 1.458917 secs
[1] "psolve timing:"
[1] 0.05601097
Test(200)
[1] "Sys.time timing:"
Time difference of 16.15535 secs
[1] "psolve timing:"
[1] 0.3218943
Test(400)
[1] "Sys.time timing:"
Time difference of 3.830424 mins
[1] "psolve timing:"
[1] 1.831462

@bnaras
Copy link
Collaborator

bnaras commented Jan 4, 2021

This is a valid point. The timings returned by the psolve refer to the solver setup and solver solution times. It does not tell anything about CVXR setup time etc. which can substantial as that is pure R code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants