Improve Joint-RPCA #808

Merged
TuomasBorman merged 20 commits into devel from improve_rpca
Mar 16, 2026

@TuomasBorman
Contributor

No description provided.

@TuomasBorman
Contributor Author

The old and new versions now produce the same results:


devtools::load_all()

# Load the example data and add an rclr-transformed assay to both experiments
data("ibdmdb")
mae <- ibdmdb
mae[[1]] <- transformAssay(mae[[1]], assay.type = "mgx", method = "rclr")
mae[[2]] <- transformAssay(mae[[2]], assay.type = "mtx", method = "rclr")

# Exported function, run on the rclr assays
res <- getJointRPCA(
    mae, experiments = c(1, 2), assay.types = c("rclr", "rclr"),
    ncomponents = 3L, test.ratio = 10/60)

# Internal function, run on the first assay of each experiment
mats <- lapply(experiments(mae), function(x) assay(x, 1))
res2 <- .joint_rpca(mats, transform = c("rclr"))

# Reorder samples to follow the first experiment, then compare sample scores
res2$ord_res$samples <- res2$ord_res$samples[match(colnames(mae)[[1]], rownames(res2$ord_res$samples)), ]
all(abs(res - res2[["ord_res"]][["samples"]]) < 0.0001)

head(res2[["ord_res"]][["samples"]])
head(res)

# Compare the cross-validation statistics with the reconstruction error
res2$cv_stats
attributes(res)[["reconstruct_error"]]
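
The comparison above rests on two steps: reordering rows with `match()` and checking element-wise differences against an absolute tolerance. A minimal sketch of that pattern on toy data follows. The matrices `ref` and `new` are made up for illustration and are not output of `getJointRPCA()` or `.joint_rpca()`.

```r
# Hypothetical toy data: the same sample scores stored in a different row order
set.seed(1)
ref <- matrix(
    rnorm(12), nrow = 4,
    dimnames = list(paste0("s", 1:4), paste0("PC", 1:3)))
new <- ref[c(3, 1, 4, 2), ] + 1e-6  # shuffled rows plus tiny numerical noise

# Reorder the rows of `new` so that they follow the row order of `ref`
new <- new[match(rownames(ref), rownames(new)), , drop = FALSE]

# Element-wise comparison with an absolute tolerance
same_abs <- all(abs(ref - new) < 1e-4)

# all.equal() is an alternative that also compares dimnames
same_rel <- isTRUE(all.equal(ref, new, tolerance = 1e-4))
```

Without the `match()` step the comparison would fail even for identical results, because the rows would be paired with the wrong samples.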

@TuomasBorman
Contributor Author

Scripts for testing that Gemelli and our implementation behave similarly:

test_joint_optspace.py
test_joint_optspace.txt (an R file; it seems that GitHub does not support R file attachments)

# Column centering (training means)
mat <- sweep(mat, 2L, center[["col"]], "-")
# Add grand mean to avoid subtracting the mean twice (training means)
# mat <- mat + center[["grand"]]
Contributor Author

Check this.


# We multiply U by singular values so that the sample coordinates reflect
# actual variance magnitude rather than just orthonormal directions.
# u <- u %*% diag(s)
Contributor Author

Check this.
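
To make the question concrete, here is a small sketch of what multiplying `u` by the singular values changes. It uses base R `svd()` on toy data as a stand-in, not the solver used in this PR, and the names `x`, `sv` and `scores` are illustrative only.

```r
# Toy data, column-centred (assumption: stands in for the decomposed matrix)
set.seed(42)
x <- matrix(rnorm(20), nrow = 5)
x <- sweep(x, 2L, colMeans(x), "-")

sv <- svd(x)
u <- sv$u
s <- sv$d

# Unscaled: the columns of U are orthonormal, so every axis has length 1
unit_len <- sqrt(colSums(u^2))

# Scaled: the length of axis k becomes the k-th singular value
scores <- u %*% diag(s)
scaled_len <- sqrt(colSums(scores^2))

# The scaled scores equal the data projected onto the right singular vectors
proj <- x %*% sv$v
```

With the unscaled `u`, all axes have equal spread regardless of how much variation each one explains. Which convention matches Gemelli is the point that needs checking here. Note that `diag(s)` builds an identity matrix instead when `s` has length one, so `diag(s, nrow = length(s))` is safer in general code.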

mat <- sweep(mat, 2L, col_means, "-")
# Add overall mean so that we do not subtract the data effectively 2 times.
# The result is a matrix that has row and column means in zero.
# mat <- mat + grand_mean
Contributor Author

Check this.
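
Whether the grand mean has to be added back depends on when the column means are computed. The toy sketch below illustrates this under the assumption that row centering happens first; the variable names are illustrative and the matrix is random, not data from this PR.

```r
set.seed(7)
mat <- matrix(rnorm(12), nrow = 3)
row_means <- rowMeans(mat)
col_means <- colMeans(mat)   # computed from the ORIGINAL matrix
grand_mean <- mean(mat)

# Case 1: original row and column means are subtracted, grand mean added back
classic <- sweep(mat, 1L, row_means, "-")
classic <- sweep(classic, 2L, col_means, "-") + grand_mean

# Case 2: column means are recomputed AFTER row centering, no correction needed
seq_c <- sweep(mat, 1L, row_means, "-")
seq_c <- sweep(seq_c, 2L, colMeans(seq_c), "-")

# Case 3: original column means, but no grand mean added back
shifted <- sweep(sweep(mat, 1L, row_means, "-"), 2L, col_means, "-")
```

Cases 1 and 2 give the same matrix, with row and column means at zero. In case 3 every row and column mean equals `-grand_mean`, which is the double subtraction the code comment refers to.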

@TuomasBorman
Contributor Author

TODO:

  • Add unit tests
  • Tidy the example dataset: colData has column names that contain many dots
  • Add workflows

# Row centering (new samples)
mat <- sweep(mat, 1L, rowMeans(mat), "-")
# Column centering (training means)
# mat <- sweep(mat, 2L, center[["col"]], "-")
Contributor Author

Check this.

TuomasBorman merged commit abff00d into devel on Mar 16, 2026
3 checks passed
TuomasBorman deleted the improve_rpca branch on Mar 16, 2026, 15:35