Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpoDropConstants does not work as intendet #59

Closed
ja-thomas opened this issue Nov 5, 2018 · 2 comments
Closed

cpoDropConstants does not work as intendet #59

ja-thomas opened this issue Nov 5, 2018 · 2 comments

Comments

@ja-thomas
Copy link
Contributor

ja-thomas commented Nov 5, 2018

set.seed(3)
N = 2000
d = transform(data.frame(
    x1 = rnorm(N),
    x2 = rnorm(N),
    x3 = rnorm(N)),
    y = 2*x2 + (abs(x3) < 1) + rnorm(N))

train = (1 : N) <= 1000

task = makeRegrTask(data = d, target = "y")

lrn1 = makeLearner("regr.lm")
lrn2 = cpoDropConstants() %>>% lrn1

benchmark(list(lrn1, lrn2), task)

In this case, it randomly drops up to 2 features, even though they are standard normal

*added seeding for reproducibility

@ja-thomas
Copy link
Contributor Author

Found the bug here:

return(!(all(abs(col - cmean) < abs.tol) || all(abs(col - cmean) / cmean < rel.tol)))

should be

return(!(all(abs(col - cmean) < abs.tol) || all(abs((col - cmean) / cmean) < rel.tol)))

The first version will always drop features that have a negative mean

@mb706
Copy link
Collaborator

mb706 commented Nov 5, 2018

Thanks!

mb706 pushed a commit that referenced this issue Nov 6, 2018
@mb706 mb706 mentioned this issue Nov 6, 2018
@mb706 mb706 closed this as completed in #60 Nov 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants