Cross validation on 2nd PC in sPCA #12

Open
cicadawing opened this issue Apr 5, 2024 · 0 comments

@cicadawing
Apologies if this is a novice question!

I am also interested in cross-validation (CV) for PC 2 onwards. If you have time, could you confirm whether the approach below is suitable? I tried to follow some documentation on the proposed iterative deflation method.

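# Cross-validation across PCs for the regularisation values
library(PMA)      # SPC(), SPC.cv()
library(ggplot2)  # plotting
library(officer)  # read_docx(), body_add_gg()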

regularisation_values <- seq(1, sqrt(ncol(X)), len=15)
SPC_Best_Reg <- list()
SPC_CV_Error <- list()
SPC_CV_Info <- list()

# Initialize the matrix for the first iteration
matrix_iter <- as.matrix(X)

wd <- "XXX"
# Word document that collects one CV plot per component
CV_regularisations_doc <- read_docx()
for (k in 1:25) {
  # Apply SPC.cv
  SPC_cv <- SPC.cv(matrix_iter, sumabsvs=regularisation_values, nfolds=10, trace=TRUE, center=TRUE, niter=10, orth=TRUE)
  SPC_Best_Reg[[k]] <- SPC_cv$bestsumabsv1se
  # Note: cv.error is the standard error of the CV error; the mean CV error itself is in SPC_cv$cv
  SPC_CV_Error[[k]] <- SPC_cv$cv.error
 
  # Plot the CV standard errors against the regularisation values
  SPC_cv_plot <- ggplot() +
    geom_point(aes(x = regularisation_values, y = SPC_CV_Error[[k]]),
               color = "purple", shape = 16) +
    labs(x = "Regularisation Values", y = "Standard error of the average sum of squared error",
         title = paste0("CV Error vs Regularisation Values for PC", k)) +
    geom_vline(xintercept = SPC_Best_Reg[[k]], linetype = "dashed") +
    geom_text(aes(x = 12, y = max(SPC_CV_Error[[k]]) - 30000000),
              label = paste0("Best regularisation level: ", round(SPC_Best_Reg[[k]], 2)),
              hjust = 0, vjust = 0) +
    theme_minimal()
  print(SPC_cv_plot)
  # Add the plot to the Word document (only if PNG output is available)
  if (capabilities(what = "png")) {
    CV_regularisations_doc <- body_add_gg(CV_regularisations_doc, value = SPC_cv_plot, style = "centered")
  }
 
  # Fit a single sparse component (K = 1) to the current matrix at the selected regularisation level
  SPC_applied <- SPC(matrix_iter, sumabsv=SPC_Best_Reg[[k]], K=1, center=TRUE, orth=TRUE, cnames=colnames(matrix_iter))
 
  # Store the result
  SPC_CV_Info[[k]] <- list()
  SPC_CV_Info[[k]][['u']] <- SPC_applied$u
  SPC_CV_Info[[k]][['d']] <- SPC_applied$d
  SPC_CV_Info[[k]][['v']] <- SPC_applied$v
  SPC_CV_Info[[k]][['prop_var_explained']] <- SPC_applied$prop.var.explained
 
  # Rank-1 reconstruction u * d * t(v) of the fitted component.
  # With K = 1, d is a single number, so it is used directly;
  # diag(d) on a length-1 value would build an identity matrix instead.
  transposed_v <- t(SPC_applied$v)
  udv <- SPC_applied$u %*% (SPC_applied$d) %*% transposed_v
 
  # Update the matrix for the next iteration
  matrix_iter <- as.matrix(matrix_iter - udv)
 
  # Save progress after every component
  print(CV_regularisations_doc, target = paste0(wd, "/SPC_cv_plot.docx"))
  saveRDS(SPC_Best_Reg, file = paste0(wd, "/SPC_GMV_Best_Reg.rds"))
  saveRDS(SPC_CV_Error, file = paste0(wd, "/SPC_GMV_CV_Error.rds"))
  saveRDS(SPC_CV_Info, file = paste0(wd, "/SPC_GMV_CV_Info.rds"))
}
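
The deflation step in the loop above can be sanity-checked on toy data. The following is only a minimal sketch: it uses base R `svd()` as a stand-in for `SPC()` (so there is no sparsity penalty), random data, and hypothetical names such as `X_toy`. For an ordinary SVD, subtracting the rank-1 reconstruction of the first component should leave a matrix whose leading singular value equals the second singular value of the original. With a sparse fit this holds only approximately, if at all.

```r
# Toy illustration of rank-1 deflation using base R svd() in place of SPC().
# Assumptions: random data, no sparsity penalty; all names are hypothetical.
set.seed(1)
X_toy <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)

full_svd <- svd(X_toy)

# Rank-1 reconstruction of the leading component; d1 is a single number.
u1 <- full_svd$u[, 1, drop = FALSE]
v1 <- full_svd$v[, 1, drop = FALSE]
d1 <- full_svd$d[1]
X_deflated <- X_toy - d1 * (u1 %*% t(v1))

# The leading singular value of the residual should match the
# second singular value of the original matrix.
deflated_svd <- svd(X_deflated)
stopifnot(isTRUE(all.equal(deflated_svd$d[1], full_svd$d[2])))
```

Multiplying by the scalar `d1` avoids any need for `diag()`, which is the same situation as `K = 1` in the loop.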

Any guidance would be appreciated! Thanks for the great package!!!

Best wishes,
