Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design improve #37

Merged
merged 15 commits into from
Jul 9, 2019
Merged

Conversation

Banana1530
Copy link
Collaborator

Two main focuses in this PR:

  • Add three member functions for the class MoMA.
  • A comprehensive test suite for cpp_multirank_BIC_grid_search, which will be the core of multivariate methods.

- separate class RcppFourDList declaration and implementation
- update R extractor to deal with 5-D list
- include MoMA::Omega_u/v
- remove MoMA::tol and MoMA::iter
- add flags `is_initialzied` and `is_solved`
- add helper functions `initialize_uv`, `set_X`, `check_convergence` and `evaluate_loss` (NOT IMPLEMENTED YET)
- add "level-1" function `multi_rank`, `grid_search` and `criterion_search`, and put them in `moma_level1.cpp`
- modify the almighty function `grid_BIC_mix`
- improve `check_convergence`
- remove `tol` and `iter` (working precision and iters)
- three "level-1" functions are exposed in `moma_expose.cpp`
- they share a common R interface in `moma_expose.R`
Copy link
Member

@michaelweylandt michaelweylandt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First set of comments. Continuing review.

n_lambda_u <- dim(x)[2]
n_alpha_v <- dim(x)[3]
n_lambda_v <- dim(x)[4]
n_rank <- dim(x)[5]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found how to set it in styler. Could you point it out?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the README, it looks like setting strict = FALSE won't mess with existing alignment but I don't see how to force my type of alignment.

Don't worry about it - we'll just commit to what styler uses. I was commenting out of habit.

@@ -0,0 +1,37 @@
get_5Dlist_elem <- function(x, alpha_u_i, lambda_u_i, alpha_v_i, lambda_v_i, rank_i = 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Longer term, we should think about making an accessor which takes alpha_u, lambda_u, etc (values not indices) and does the extraction. If we don't have exactly the right value in the saved list, we should (by default) interpolate with an option for an exact solve.

(I do something similar for the coef function in my ExclusiveLasso package.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened an issue to track this.

#40 (comment)

};

int MoMA::deflate(double d)
{
MoMALogger::warning("Deflating.");
MoMALogger::debug("Deflating:\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a message? This doesn't quite seem debug-level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also include which rank is being used here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh - and just an FYI: I'm working on a new short paper on alternate deflation schemes. Will share when it's ready.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep it as debug message since we also output << X << "u^T = " << u.t() << "v^T = " << v.t() << "d = u^TXv = " << d; (next line). This could be lengthy.

That's great! Can't wait to see it.

src/moma.cpp Outdated
@@ -114,107 +119,55 @@ void MoMA::solve()
u = solver_u.solve(X * v, u);
v = solver_v.solve(X.t() * u, v);

tol = norm(oldu - u) / norm(oldu) + norm(oldv - v) / norm(oldv);
MoMALogger::debug("Outer loop No.") << iter << "--" << tol;
double scale_u = norm(oldu) == 0.0 ? 1 : norm(oldu);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What norm function is this? Should it be arma::norm?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Fixed.

src/moma.cpp Outdated Show resolved Hide resolved
src/moma.cpp Outdated
// NOTE: We must keep the alpha's and lambda's up-to-date
// in both MoMA and solve_u/v
if (std::abs(alpha_u - newalpha_u) > EPS || std::abs(alpha_v - newalpha_v) > EPS ||
std::abs(lambda_v - newlambda_v) > EPS || std::abs(lambda_u - newlambda_u) > EPS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignment - can we line up the || in the first and second line?

A clearer way to write this might be std::max(......) > EPS instead of checking each condition

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I searched for it for 10 minutes and did not find such option. clang-format won't allow me to align them by hand.

True, but let's keep it this way. I changed the EPS here to MOMA_FLOATPOINT_EPS .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - it's harmless.

@@ -82,7 +64,7 @@ Rcpp::List cpp_sfpca_grid(
double EPS_inner,
long MAX_ITER_inner,
std::string solver,
int k = 1)
int rank = 1) // `rank` is not used
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have this argument if not used? At least, maybe we should throw an error if rank != 1 to catch user mistakes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why I want it here is to make the three member functions have the same argument lists. See below:

https://github.com/Banana1530/MoMA-1/blob/afea2740f489994d6213b618ca5005c6da61e208/R/moma_expose.R#L171

Checking added. See commit "check users don't specify rank".

@@ -173,7 +115,7 @@ Rcpp::List cpp_sfpca_nestedBIC(
double EPS_inner,
long MAX_ITER_inner,
std::string solver,
int k = 1)
int rank = 1) // rank not used
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above - why have this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.


// This function solves a squence of lambda's and alpha's
// [[Rcpp::export]]
Rcpp::List cpp_multirank_BIC_grid_search(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name seems a bit over-loaded since it doesn't always do BIC or GS. Maybe just cpp_multirank_svd or something?

flattened_list.attr("class") = "MoMA_5D_list";
};

int RcppFiveDList::insert(Rcpp::List object,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be more clearly written with the ()= paradigm than an insert function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean an () operator for the C++ class RcppFiveDList ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so.

If we add RcppFiveDList::operator(), wont' that let us write

list(1,2,3,4,5) = Rcpp::List::create(....)

in the code that uses this class?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened an issue to track this.

#41 (comment)

Copy link
Member

@michaelweylandt michaelweylandt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test code looks solid on first pass.

// grid_au = alpha_u, bic_au_grid = [-1].
// If alpha_u is selected via nested BIC search,
// then grid_au = [-1], bic_au_grid = alpha_u
const arma::vec &grid_lu = set_greedy_grid(lambda_u, !selection_criterion_lambda_u);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a const ref object instead of just a const object? I'm a bit confused by the semantics. (TBH: I didn't know you could just declare an object as a const ref outside of a function call context.)

Copy link
Collaborator Author

@Banana1530 Banana1530 Jul 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const ref because I want grid_lu to be either a reference of the vector lambda_u, when the parameter lambda_u is determined by grid search, so the vector lambda_u should constitute a dimension of the so-called "greedy grid" - see the doc linked below; or the MOMA_EMPTY_GRID_OF_LENGTH1 (defined in moma_base.h), when lambda_u is determined BIC.

We can do this as long as the functions function returns a const ref.

Copy link
Collaborator Author

@Banana1530 Banana1530 Jul 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it doesn't have to be a ref. const will do just fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting - that's a weird corner of C++ I hadn't played with before, but makes sense.


RcppFiveDList five_d_list(n_alpha_u, n_lambda_u, n_alpha_v, n_lambda_v, rank);

arma::mat original_X =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Const?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. See commit "const arma::mat original_X".

{
for (int m = 0; m < n_lambda_v; m++)
{
arma::vec bic_au_grid = set_bic_grid(alpha_u, selection_criterion_alpha_u, i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused as to what these set_*_grid functions do. Can you clarify?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I get what you are trying to do, but the names of the C++ functions are still a bit confusing.

Maybe construct_grid_for_search and construct_grid_no_search or something like that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. See commit "set_*_gric -> construct_*_search".

@michaelweylandt
Copy link
Member

Anything else you think we should do for this PR or is it good to merge?

@Banana1530
Copy link
Collaborator Author

Let's merge this PR. But one thing of concern is how to deal with deflation for CCA in current design.

@michaelweylandt michaelweylandt merged commit cc9be2e into DataSlingers:master Jul 9, 2019
@Banana1530 Banana1530 deleted the designimprove branch July 26, 2019 06:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants