[New Model]: msaenet (Multi-Step Adaptive Elastic-Net) #561

Closed
coforfe opened this issue Dec 30, 2016 · 9 comments


@coforfe coforfe commented Dec 30, 2016

Hello,

There is a new regression algorithm available that is especially suitable for p >> n cases:

msaenet: Multi-Step Adaptive Elastic-Net

Multi-step adaptive elastic-net (MSAENet) algorithm for feature selection in high-dimensional regressions proposed in Xiao and Xu (2015) (pdf).

Link in CRAN: msaenet

Thanks,
Carlos.

Owner

@topepo topepo commented Apr 7, 2017

Looking at this, I will have train fit models that can tune over alphas and nsteps but use tune = "aic" for a single combination of those parameters. There doesn't seem to be a way to not tune, so I'll default to AIC (but there will be a single parameter set, so no tuning occurs).
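For readers who want to see roughly what a single parameter combination looks like outside of train, here is a minimal sketch that calls msaenet directly. It assumes the argument names alphas, nsteps, scale, and tune from the msaenet documentation; the simulated data and the chosen values are illustrative only, and this is not the code that was added to caret.

```r
library(msaenet)

# Illustrative simulated data (sizes and seed are arbitrary choices)
dat <- msaenet.sim.gaussian(
  n = 150, p = 200, rho = 0.5,
  coef = rep(1, 5), snr = 3, p.train = 0.7,
  seed = 1001
)

# One fixed (alpha, nsteps, scale) combination. With a single alpha there
# is no alpha grid left to search, so tune = "aic" is only used for the
# internal model selection instead of cross-validation.
fit <- msaenet(
  dat$x.tr, dat$y.tr,
  family = "gaussian",
  alphas = 0.5,
  nsteps = 2L,
  scale = 1,
  tune = "aic",
  seed = 1005
)

# Indices of the variables with non-zero coefficients
msaenet.nzv(fit)
```

An outer resampling loop (which is what train provides) would then compare several such fits over a grid of alphas, nsteps, and scale.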

topepo added a commit that referenced this issue Apr 7, 2017
Owner

@topepo topepo commented Apr 7, 2017

If you can do some testing in the next day or so, that would be a good idea.

Author

@coforfe coforfe commented Apr 7, 2017

Thanks.

Sorry, I have just installed the dev version and I do not see msaenet there.

Regards,
Carlos.

Owner

@topepo topepo commented Apr 9, 2017

You can get it by sourcing this file then using method = modelInfo. Thanks

Author

@coforfe coforfe commented Apr 10, 2017

Thanks.
I could run it.

I ran several models without any errors.

I tried to replicate a msaenet example (binomial/classification) and compared it with the equivalent in caret. For the same dataset, both give equivalent results.

Thanks,
Carlos.

Owner

@topepo topepo commented Apr 11, 2017

When I tested, it seemed that the scale parameter had the largest effect on the results.

Author

@coforfe coforfe commented Apr 11, 2017

Yes, I see the same effect with scale.

I did this, just for the sake of reproducibility:

With synthetic binary data generated with an msaenet function:

Not centered and with tuneLength = 5

library(caret)
library(msaenet)

cctrl0 <- trainControl(number = 1, classProbs = TRUE,
                       summaryFunction = twoClassSummary)

dat <- msaenet.sim.binomial(
  n = 300, p = 500, rho = 0.6,
  coef = rep(1, 10), snr = 3, p.train = 0.7,
  seed = 1001)

tr_Y <- as.factor(ifelse(dat$y.tr == 0, "Class1", "Class2"))

set.seed(1003)
msa_caret <- train(
                      x = dat$x.tr,
                      y = tr_Y,
                      method = modelInfo,
                      tuneLength = 5,
                      trControl = cctrl0,
                      metric = "ROC"
)
plot(msa_caret)

Centered and with tuneLength = 5

To check whether scaling the data gives any improvement:

msa_caret_cent <- train(
  x = dat$x.tr,
  y = tr_Y,
  method = modelInfo,
  tuneLength = 5,
  trControl = cctrl0,
  metric = "ROC",
  preProc = c("center", "scale")
)
plot(msa_caret_cent)

This produces an error:

Error in get_types(x) : `x` must have column names

Whereas with this alternative:

Centered but without preProc and with tuneLength = 5

msa_caret_cent <- train(
  x = scale(dat$x.tr),
  y = tr_Y,
  method = modelInfo,
  tuneLength = 5,
  trControl = cctrl0,
  metric = "ROC"
)
plot(msa_caret_cent)

It works and improves ROC a little bit.

In both cases, the highest ROC is achieved with scale = 2.
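A possible workaround for the preProc error, sketched under the assumption that the simulated matrix simply has no column names (which is what the error message suggests). The names "V1", "V2", ... and the variable x_tr are arbitrary labels introduced here for illustration:

```r
library(msaenet)

dat <- msaenet.sim.binomial(
  n = 300, p = 500, rho = 0.6,
  coef = rep(1, 10), snr = 3, p.train = 0.7,
  seed = 1001
)

x_tr <- dat$x.tr

# Assumption: the predictor matrix carries no column names.
# Add generic ones so that preprocessing can refer to columns by name.
if (is.null(colnames(x_tr))) {
  colnames(x_tr) <- paste0("V", seq_len(ncol(x_tr)))
}
```

Passing x_tr instead of dat$x.tr to the train call with preProc = c("center", "scale") should then avoid the error, although this has not been checked against the development version discussed here.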

topepo added a commit that referenced this issue Apr 11, 2017
Owner

@topepo topepo commented Apr 11, 2017

Who doesn't use column names? =]

Just added a check. I'm going to close this one out.

Thanks

@topepo topepo closed this Apr 11, 2017

@nanxstats nanxstats commented Apr 24, 2017

@coforfe @topepo -- thanks a lot for adding my method and package to caret. This is awesome! Please just let me know if there is anything I can do to make the API better.

To share some of my experience: SCAD-net and MCP-net usually do better than elastic-net in terms of reducing false positive selections. I use EBIC myself to select the optimal step, but AIC and BIC are also good choices sometimes. scale can be important, and the adaptive weights transformation is a very interesting direction to explore.

-Nan
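For anyone wanting to try the SCAD-net variant with EBIC, a rough sketch follows. It assumes the msasnet function and the tune and tune.nsteps arguments as documented in recent msaenet releases; older versions may not have tune.nsteps, and the data and settings below are purely illustrative.

```r
library(msaenet)

# Illustrative simulated data (sizes and seed are arbitrary choices)
dat <- msaenet.sim.gaussian(
  n = 150, p = 200, rho = 0.5,
  coef = rep(1, 5), snr = 3, p.train = 0.7,
  seed = 1001
)

# Multi-step adaptive SCAD-net, using EBIC both for the per-step
# model selection and for choosing the number of adaptive steps.
fit_scad <- msasnet(
  dat$x.tr, dat$y.tr,
  family = "gaussian",
  init = "snet",
  alphas = seq(0.2, 0.8, 0.2),
  tune = "ebic",
  nsteps = 3L,
  tune.nsteps = "ebic",
  seed = 1005
)

# Indices of the selected variables
msaenet.nzv(fit_scad)
```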
