Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEW Deflation for CCA and LDA #54

Merged
merged 28 commits into from Aug 23, 2019
Merged

Conversation

Banana1530
Copy link
Collaborator

PR #46 is moved here.

@Banana1530
Copy link
Collaborator Author

Banana1530 commented Aug 21, 2019

Merge into develop under the new branch scheme instead of master -DONE

Increase coverage of the new stuff - this is a big hit compared to the previous version -NOT YET

Pull the Shiny stuff out of this PR and just leave it as CCA and LDA - DONE. Pulled out.

Clean up the history - DONE.


## Installation

The newest version of the package can be installed from Github:

```{r, eval=FALSE}
library(devtools)
install_github("michaelweylandt/MoMA", ref="build")
install_github("michaelweylandt/MoMA", ref = "build")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be changed to master now I think.

man/figures/moma-formula.svg Show resolved Hide resolved
@michaelweylandt
Copy link
Member

Is this PR supposed to add pkgdown websiting as well? If so, I think we'll need an adjustment in build_steps.R

#' \item{\code{\link{spfusedlasso}}} TODO
#' \item{\code{\link{cluster}}} TODO
#' \item{\code{\link{moma_lasso}}}: sparsity
#' \item{\code{\link{moma_mcp}}}: adaptive sparsity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just "sparsity" or "non-convex sparsity" here? "Adaptive" makes me think of the adaptive lasso.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

#' \item{\code{\link{moma_grplasso}}}: group-wise sparsity
#' \item{\code{\link{moma_fusedlasso}}}: piece-wise constant
#' \item{\code{\link{moma_spfusedlasso}}}: sparsity and piece-wise constant
#' \item{\code{\link{moma_l1tf}}}: piece-wise quadratic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Piecewise polynomial (default quadratic)" might be more accuracy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

#' \item{\code{\link{moma_fusedlasso}}}: piece-wise constant
#' \item{\code{\link{moma_spfusedlasso}}}: sparsity and piece-wise constant
#' \item{\code{\link{moma_l1tf}}}: piece-wise quadratic
#' \item{\code{\link{moma_cluster}}}: structured sparsity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Structured" is a bit vague - maybe "unordered fusion" (I contrast it with ordered fusion => fused lasso in my head).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

#' @name moma_sparsity
#' These functions specify the value of the \code{u_sparse,v_sparse} arguments in the
#' \code{moma_*pca} series of functions, and the \code{x_sparse,y_sparse} arguments
#' in the \code{moma_*cca} and \code{moma_*lda} series of functions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't picked up on the X vs U distinction for PCA vs multi-source methods. Nice touch.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool~

#' @param lambda A vector containing penalty values
#' @param select_scheme A char being either "b" (nested BIC search) or "g" (grid search).
#'
#' Selection of tuning parameters (four parameters in our case, \code{alpha_u}, \code{alpha_v}, \code{lambda_u}, \code{lambda_v}) is a problem in MoMA. Currently we provide
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not call this a problem - too negative.

Maybe something more like:

MoMA provides a flexible framework for regularized multivariate analysis with several tuning parameters for different forms of regularization. To assist the user in selecting these parameters ....

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

int selection_criterion_lambda_v = 0,
int max_bic_iter = 5,
int rank = 1)
int deflation_scheme = 1, // Defaults to 1 = PCA_Hotelling
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use DeflationScheme::PCA_Hotelling here instead of hard coding 1? It would make the code a bit easier to read. (Might need to change the type on deflation_scheme.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can Rcpp handle exporting a function with an enum-type argument?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm.... it looks like the answer is "sort of." The following works, but isn't totally type-safe. We could probably do a more general solution in the future. Issue?

#include "Rcpp.h"

enum DeflationMethod {
  PCA = 0, 
  PLS = 1,  
  CCA = 2
};

namespace Rcpp {
  SEXP wrap(DeflationMethod dm){
    Rcpp::IntegerVector dm_int = Rcpp::wrap(static_cast<int>(dm));
    dm_int.attr("class") = "DeflationMethod"; 
    return dm_int;
  }

  template <> DeflationMethod as(SEXP dm_sexp){
    Rcpp::IntegerVector dm_iv(dm_sexp);
    int dm_int = dm_iv(0); 
    DeflationMethod dm = static_cast<DeflationMethod>(dm_int);
    return dm; 
  }
}

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::export]]
DeflationMethod make_pls(){
  DeflationMethod pls = DeflationMethod::PLS;
  return pls;
}

// [[Rcpp::export]]
void take_pls(DeflationMethod x){
  Rcpp::Rcout << " x = " << x << std::endl; 
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See new commits "Enum-fy selection scheme encoding on R side", and "Enum-fy selection scheme encoding on C++ side".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#60

double EPS_inner,
long MAX_ITER_inner,
std::string solver,
int deflation_scheme, // PCA = 1, CCA = 2, LDA = 3, PLS = 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this comment is right.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

README.Rmd Outdated
)
```

# MoMA: Modern Multivariate Analysis

[![Travis-CI Build Status](https://api.travis-ci.com/michaelweylandt/MoMA.svg)](https://travis-ci.com/michaelweylandt/MoMA) [![License: GPL v2](https://img.shields.io/badge/License-GPL%20v2-brightgreen.svg)](https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html) [![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/MoMA)](https://cran.r-project.org/package=MoMA) [![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](http://www.repostatus.org/badges/latest/wip.svg)](http://www.repostatus.org/#wip)
<!-- badges: start -->
[![Travis-CI Build Status](https://api.travis-ci.com/michaelweylandt/MoMA.svg)](https://travis-ci.com/michaelweylandt/MoMA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these links need to refer to DataSlingers instead of michaelweylandt now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. See commit "Update badges".

README.Rmd Outdated

* `moma_spfusedlasso()`: Sparse fused LASSO.

* `moma_l1tf()`: L-one trend filtering.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use MathJax / LaTeX here: $\ell_1$ and have it render properly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

_pkgdown.yml Outdated
@@ -1,9 +1,72 @@
template:
params:
bootswatch: paper
package: tidytemplate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to use the tidyverse branding (unless I'm missing something).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I just thought their website template looks better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Their website says "Please don’t use it for your own package." What do you like about it? We might be able to copy/borrow aspects.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update, it is now

template:
  params:
    bootswatch: flatly
    

See commit "Not use tidytemplate".

@michaelweylandt
Copy link
Member

I hadn't given Travis permission to read this under the DataSlingers organization, but I just did. Hopefully the next commit in this branch will kick off a build properly.

@codecov
Copy link

codecov bot commented Aug 22, 2019

Codecov Report

❗ No coverage uploaded for pull request base (develop@784871b). Click here to learn what that means.
The diff coverage is 76.12%.

Impacted file tree graph

@@            Coverage Diff             @@
##             develop      #54   +/-   ##
==========================================
  Coverage           ?   85.63%           
==========================================
  Files              ?       34           
  Lines              ?     3182           
  Branches           ?        0           
==========================================
  Hits               ?     2725           
  Misses             ?      457           
  Partials           ?        0
Impacted Files Coverage Δ
src/moma.h 100% <ø> (ø)
R/moma_solve.R 98.07% <ø> (ø)
src/moma_solver.h 100% <ø> (ø)
src/moma_base.h 100% <ø> (ø)
src/moma_solver_BICsearch.cpp 100% <100%> (ø)
R/moma_arguments.R 94.33% <100%> (ø)
src/moma_solver.cpp 94.73% <100%> (ø)
R/moma_sflda.R 59.8% <59.8%> (ø)
R/moma_sfcca.R 74.64% <74.64%> (ø)
R/moma_expose.R 91.66% <83.33%> (ø)
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 784871b...03923ee. Read the comment docs.

std::string i_solver,
DeflationScheme i_ds)
: MoMA(i_X_working.t() * i_Y_working, // the matrix on which we find pSVD
i_lambda_u,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is why X_original cannot be zero.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this a reference to? Did I ask about zeros?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, it should be "here is why X_original cannot be constant".

@Banana1530
Copy link
Collaborator Author

What changes are needed in the build_step.R?

@@ -218,6 +281,19 @@ DEFLATION_SCHEME <- c(
PCA_Projection = 6
)

# MUST be consistent with
# `SelectionScheme` in moma_base.h
SELECTION_SCHEME <- c(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SELECTION_SCHEME is not exposed to the user and but used internally.

@michaelweylandt
Copy link
Member

Re: build_steps.R I think we need something to actually build the pkgdown site.

@michaelweylandt
Copy link
Member

Merging now to get this in before GSoC ends.

@michaelweylandt michaelweylandt merged commit 5d02a2a into DataSlingers:develop Aug 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants