cpoDropConstants does not work as intended #59

ja-thomas · 2018-11-05T12:44:13Z

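library(mlr)
library(mlrCPO)

set.seed(3)
N = 2000
# Three standard-normal features; y depends on x2 and x3 only
d = transform(data.frame(
    x1 = rnorm(N),
    x2 = rnorm(N),
    x3 = rnorm(N)),
    y = 2*x2 + (abs(x3) < 1) + rnorm(N))

# Train/test indicator (not used in the benchmark call below)
train = (1 : N) <= 1000

task = makeRegrTask(data = d, target = "y")

# Plain linear model vs. the same model behind cpoDropConstants
lrn1 = makeLearner("regr.lm")
lrn2 = cpoDropConstants() %>>% lrn1

benchmark(list(lrn1, lrn2), task)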

In this case, cpoDropConstants drops up to two of the features, seemingly at random, even though all three are standard normal and clearly not constant.

Edit: added a seed for reproducibility.

ja-thomas · 2018-11-05T13:11:10Z

Found the bug here:

return(!(all(abs(col - cmean) < abs.tol) || all(abs(col - cmean) / cmean < rel.tol)))

should be

return(!(all(abs(col - cmean) < abs.tol) || all(abs((col - cmean) / cmean) < rel.tol)))

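The first version always drops features that have a negative mean: when cmean < 0, abs(col - cmean) / cmean is never positive, so it is below any positive rel.tol for every entry and the column is treated as constant.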
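
To make the difference concrete, here is a minimal base-R sketch that wraps the two expressions quoted above in standalone functions. The function names `is_kept_buggy` and `is_kept_fixed` and the tolerance values of 1e-8 are assumptions chosen for illustration; they are not taken from the package source.

```r
# Hypothetical helpers: each returns TRUE if the column would be kept,
# FALSE if it would be dropped as "constant".
# The tolerance defaults are assumed values for this sketch.
is_kept_buggy = function(col, abs.tol = 1e-8, rel.tol = 1e-8) {
  cmean = mean(col)
  # Original expression: only the numerator is wrapped in abs()
  !(all(abs(col - cmean) < abs.tol) || all(abs(col - cmean) / cmean < rel.tol))
}

is_kept_fixed = function(col, abs.tol = 1e-8, rel.tol = 1e-8) {
  cmean = mean(col)
  # Proposed expression: abs() is applied to the whole ratio
  !(all(abs(col - cmean) < abs.tol) || all(abs((col - cmean) / cmean) < rel.tol))
}

neg = c(-3, -1, -2)   # clearly non-constant, mean is -2
pos = c(3, 1, 2)      # same spread, mean is +2
const = c(5, 5, 5)    # genuinely constant

is_kept_buggy(neg)    # FALSE: wrongly dropped because the mean is negative
is_kept_fixed(neg)    # TRUE
is_kept_buggy(pos)    # TRUE
is_kept_fixed(pos)    # TRUE
is_kept_buggy(const)  # FALSE: correctly dropped
is_kept_fixed(const)  # FALSE: correctly dropped
```

This also matches the behaviour in the report: a standard-normal feature has a negative sample mean about half of the time, so whether it gets dropped looks random.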

mb706 · 2018-11-05T14:46:27Z

Thanks!

ja-thomas mentioned this issue Nov 5, 2018

Poor performance on a simple dataset ja-thomas/autoxgboost#62

Open

mb706 pushed a commit that referenced this issue Nov 6, 2018

fix #59: don't drop cols w/ negative mean

e9a5cf4

mb706 mentioned this issue Nov 6, 2018

Fix dropconst #60

Merged

mb706 closed this as completed in #60 Nov 6, 2018
