Clean up existing entry points #34

Open
Banana1530 opened this issue Jun 25, 2019 · 3 comments
Labels
BREAKING: Possibly breaking changes to current user-facing functionality
Code Cleanliness: Internal changes / clean-ups / refactoring
Parameter Selection: Issues / feature requests related to tuning parameter selection
User Interface

Comments

@Banana1530
Collaborator

Currently we have three R wrappers. They differ in functionality, in the abstraction level of the arguments they take, and in where they are used in the test suite. Eventually their functionality will be a subset of the SFPCA wrappers', so they should be removed.

1. sfpca

https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/R/sfpca.R#L1

It is simply an R interface to the C++ function cpp_sfpca, https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/src/moma_R_function.cpp#L6, which repeatedly calls MoMA::solve and MoMA::deflate. All parameters must be specified explicitly.

What it does: It solves the penalized SVD for fixed alpha_u/v and lambda_u/v. It also finds a rank-k SVD by repeatedly deflating the matrix and rerunning the algorithm. Note that we don't have tests for the latter functionality yet.
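To make the "solve, deflate, rerun" idea concrete, here is a minimal sketch in plain C++. It is not MoMA's implementation: it assumes a lasso penalty only, drops the smoothing terms (alpha_u/v and Omega_u/v), and replaces the PG solver with simple alternating soft-thresholded power iterations. The names soft_threshold, prox_and_normalize, Factor, rank_one and rank_k_by_deflation are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: X[i][j], n rows by p columns

// Soft-thresholding, the proximal operator of the lasso penalty lambda * |x|.
double soft_threshold(double x, double lambda) {
    if (x > lambda) return x - lambda;
    if (x < -lambda) return x + lambda;
    return 0.0;
}

// Threshold every entry, then rescale to unit length.
// A vector that is thresholded to all zeros is left as zeros.
void prox_and_normalize(Vec& x, double lambda) {
    double sq = 0.0;
    for (double& xi : x) {
        xi = soft_threshold(xi, lambda);
        sq += xi * xi;
    }
    if (sq > 0.0) {
        const double nrm = std::sqrt(sq);
        for (double& xi : x) xi /= nrm;
    }
}

struct Factor {
    Vec u;     // left singular vector estimate (length n)
    Vec v;     // right singular vector estimate (length p)
    double d;  // u^T X v
};

// Hypothetical rank-one solver: alternate u <- prox(X v), v <- prox(X^T u)
// until the iterates stop moving. Assumes X is non-empty and rectangular.
Factor rank_one(const Mat& X, double lambda_u, double lambda_v,
                int max_iter = 1000, double eps = 1e-10) {
    const std::size_t n = X.size(), p = X[0].size();
    Vec u(n, 0.0), v(p, 1.0 / std::sqrt(static_cast<double>(p)));
    for (int it = 0; it < max_iter; ++it) {
        Vec u_new(n, 0.0), v_new(p, 0.0);
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < p; ++j) u_new[i] += X[i][j] * v[j];
        prox_and_normalize(u_new, lambda_u);
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < p; ++j) v_new[j] += X[i][j] * u_new[i];
        prox_and_normalize(v_new, lambda_v);

        double change = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            change += (u_new[i] - u[i]) * (u_new[i] - u[i]);
        for (std::size_t j = 0; j < p; ++j)
            change += (v_new[j] - v[j]) * (v_new[j] - v[j]);
        u = u_new;
        v = v_new;
        if (std::sqrt(change) < eps) break;
    }
    double d = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < p; ++j) d += u[i] * X[i][j] * v[j];
    return {u, v, d};
}

// Rank-k by deflation: after each factor, subtract d * u * v^T and rerun.
// X is taken by value so the caller's matrix is not modified.
std::vector<Factor> rank_k_by_deflation(Mat X, int k,
                                        double lambda_u, double lambda_v) {
    std::vector<Factor> factors;
    for (int r = 0; r < k; ++r) {
        Factor f = rank_one(X, lambda_u, lambda_v);
        for (std::size_t i = 0; i < X.size(); ++i)
            for (std::size_t j = 0; j < X[i].size(); ++j)
                X[i][j] -= f.d * f.u[i] * f.v[j];
        factors.push_back(f);
    }
    return factors;
}
```

With both penalties set to zero this reduces to ordinary power iteration, so on a diagonal matrix such as diag(3, 1) the two factors returned by rank_k_by_deflation(X, 2, 0.0, 0.0) should have d close to 3 and 1. A check of that kind is one possible shape for the missing rank-k test.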

Where it is used in the test suite: It is used to test the correctness of the PG algorithm. We pick special cases where closed-form solutions exist and check the results of our algorithm against them. See https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/tests/testthat/test_sfpca.R#L1.

2. moma_svd

https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/R/moma_svd.R#L61

What it does: It supports the following three use cases. Note that it works together with prox argument wrappers such as lasso() and scad(), and with the PG loop settings wrapper (not merged yet). Essentially, what it does is a proper subset of MoMA::select_nestedBIC described in section 3.

  1. Find the rank-k penalized SVD with fixed alpha_u/v and lambda_u/v by calling cpp_sfpca, described above;

  2. Run a nested-BIC search on 2-D grids, whose axes can be any two of the parameters, by calling cpp_sfpca_nestedBIC. cpp_sfpca_nestedBIC performs some sanity checks and then calls MoMA::select_nestedBIC;
    https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/src/moma_R_function.cpp#L179

  3. Run a grid search on 2-D grids by calling cpp_sfpca_grid, which uses MoMA::reset and MoMA::solve.
    https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/src/moma_R_function.cpp#L80
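
The grid-search use case in item 3 can be sketched as a plain double loop in which every grid point is solved independently. This is an illustration only, not the code behind cpp_sfpca_grid: grid_search and the solve_at callback are hypothetical names, and the callback stands in for whatever "reset the solver, then solve at these two parameter values" means in MoMA.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Evaluate solve_at(a, b) at every point of the 2-D grid grid_a x grid_b.
// out[i][j] holds the result for (grid_a[i], grid_b[j]).
// Result is whatever the solver returns (a score, a pair of vectors, ...).
template <typename Result>
std::vector<std::vector<Result>> grid_search(
    const std::vector<double>& grid_a,
    const std::vector<double>& grid_b,
    const std::function<Result(double, double)>& solve_at) {
    std::vector<std::vector<Result>> out(grid_a.size());
    for (std::size_t i = 0; i < grid_a.size(); ++i) {
        out[i].reserve(grid_b.size());
        for (std::size_t j = 0; j < grid_b.size(); ++j) {
            // Each point is solved on its own, so the result must not
            // depend on the order in which the grid is visited.
            out[i].push_back(solve_at(grid_a[i], grid_b[j]));
        }
    }
    return out;
}
```

Two properties follow directly from this shape and are cheap to test: the output has grid_a.size() rows of grid_b.size() entries each, and out[i][j] equals a direct call to solve_at(grid_a[i], grid_b[j]). The second is the same kind of check as comparing cpp_sfpca_grid against cpp_sfpca.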

Where it is used in the test suite: It tests that prox arguments are correctly passed to the C++ side (see test_arguments.R https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/tests/testthat/test_arguments.R#L1 ). We also test that cpp_sfpca_grid and cpp_sfpca give identical results (see test_grid.R https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/tests/testthat/test_grid.R#L1).

3. MoMA::grid_BIC_mix

This will become the core of the SFPCA wrappers (in progress). It supports finding the first k pairs of singular vectors, and combining nested-BIC search with grid search.

Where it is used in the test suite: We test that it returns lists of the correct size. See https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/tests/testthat/test_BIC_gird_mixed.R#L1

@michaelweylandt added the BREAKING, Code Cleanliness, Parameter Selection, and User Interface labels Jun 25, 2019
@michaelweylandt
Member

This is a great issue - thanks for opening!

Do you think it'd be possible to consolidate down to a single C++ entry point or is that too tricky to design?

As you mentioned in a recent email, it's tricky to have grid parameters and multi-rank solutions. (It's ok if it's a "grid" of a single parameter, but it's hard after that...) That might just be a thing we disallow until we can (someday) find a better fix.

@Banana1530
Collaborator Author

Banana1530 commented Jul 18, 2019

@Banana1530
Collaborator Author

Banana1530 commented Jul 27, 2019

| Argument specification | Function | Use case |
| --- | --- | --- |
| Specify all arguments explicitly | sfpca (sfpca.R) | sfpca(X, alpha_u = 1, alpha_v = 2, Omega_u = sec_diff_mat(n), Omega_v = diag(p), EPS = 1e-9, MAX_ITER = 1e+5) |
| Parameter values and penalty types are separated; algorithm precision settings are wrapped in moma_pg_settings() | moma_svd, SFPCA$initialize ... (moma_expose.R) | SFPCA$initialize(X, u_sparsity = lasso(), lambda_u = c(1,10), selection_str = "ggbg") |
| Parameter values and selection method are absorbed in penalty types | moma_sfpca, moma_twpca ... (moma_sf*_wrapper.R) | moma_spca(X, u_sparse = moma_lasso(lambda = seq(0, 2, 0.2), select_scheme = "b")) |

The above table summarizes ways to specify penalty arguments.

@Banana1530 changed the title from "Clean up existing old wrappers" to "Clean up existing entry points" Aug 23, 2019