GBM/DRF: Remove redundant extraction weights in histogram #7589

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 2 comments

Comments

@exalate-issue-sync

Speed up GBM and DRF by avoiding unnecessary work.

The weights are already extracted once at the beginning, so the removed line was redundant. Worse, it repeated the extraction for every column, concurrently. That can be very bad for performance because different threads are writing to the same shared array.
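
The difference between the two patterns can be sketched as follows. This is a minimal illustration, not the H2O code: the class name, `readWeight`, `extractPerColumn`, and `extractOnce` are hypothetical, and a parallel stream stands in for the per-column tasks.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class WeightExtractionSketch {

    // Hypothetical stand-in for reading one row weight from the training data.
    static double readWeight(int row) {
        return 1.0 + (row % 3);
    }

    // Redundant pattern: every column task refills the same shared array.
    // All tasks write identical values, so the result is correct, but the
    // work is repeated nCols times and threads contend on the same memory.
    static double[] extractPerColumn(int nRows, int nCols) {
        double[] shared = new double[nRows];
        IntStream.range(0, nCols).parallel().forEach(col -> {
            for (int r = 0; r < nRows; r++) {
                shared[r] = readWeight(r);
            }
        });
        return shared;
    }

    // Pattern after the fix: fill the array once up front;
    // column tasks then only read from it.
    static double[] extractOnce(int nRows) {
        double[] weights = new double[nRows];
        for (int r = 0; r < nRows; r++) {
            weights[r] = readWeight(r);
        }
        return weights;
    }

    public static void main(String[] args) {
        double[] perColumn = extractPerColumn(1_000, 8);
        double[] once = extractOnce(1_000);
        // Both approaches yield the same weights; only the amount of work differs.
        System.out.println("same weights: " + Arrays.equals(perColumn, once));
    }
}
```

Because both variants produce the same array, dropping the per-column extraction changes only the cost, not the result, which is why it can be removed safely.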

Timings in seconds, master vs. fix, on the redhat dataset with the gbm-noscoring benchmark configuration and 5-fold cross-validation:

master.jar-1, 326.3
master.jar-2, 326.3
master.jar-3, 325.2
master.jar-4, 326.2
master.jar-5, 326.1
michalk_weights-fix.jar-1, 281
michalk_weights-fix.jar-2, 283.9
michalk_weights-fix.jar-3, 278.7
michalk_weights-fix.jar-4, 278.7
michalk_weights-fix.jar-5, 279.2

We see roughly a 14% reduction in run time (mean 326.0 s on master vs. 280.3 s with the fix).
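
To check the speed-up figure against the timings listed above, one can average the five runs of each build and compute the relative reduction in run time. The sketch below does that; the class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.Locale;

public class SpeedupCheck {

    // Arithmetic mean of the run times.
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Relative reduction in mean run time, in percent of the baseline.
    static double percentReduction(double[] baseline, double[] candidate) {
        double base = mean(baseline);
        return 100.0 * (base - mean(candidate)) / base;
    }

    public static void main(String[] args) {
        // Timings in seconds, copied from the benchmark runs above.
        double[] master = {326.3, 326.3, 325.2, 326.2, 326.1};
        double[] fix = {281.0, 283.9, 278.7, 278.7, 279.2};

        System.out.printf(Locale.ROOT, "mean master: %.1f s%n", mean(master));
        System.out.printf(Locale.ROOT, "mean fix:    %.1f s%n", mean(fix));
        System.out.printf(Locale.ROOT, "reduction:   %.1f %%%n", percentReduction(master, fix));
    }
}
```

Note that this measures the reduction relative to master's run time; dividing the master mean by the fix mean instead gives the throughput ratio, which is a larger number.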

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Details

Jira Issue: PUBDEV-8060
Assignee: Michal Kurka
Reporter: Michal Kurka
State: Resolved
Fix Version: 3.32.1.1
Attachments: N/A
Development PRs: Available

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Linked PRs from JIRA

#5337

@h2o-ops h2o-ops closed this as completed May 14, 2023