Enable probability output from RF binary classifier (alternative implementaton) #3869
Conversation
Apart from my comments, the PR looks good to me.
int max_class_idx = 0;
int max_count = 0;
int total_count = 0;
for (int i = 0; i < input.nclasses; ++i) {
This loop was executed collaboratively by threads in the block. Now it is executed redundantly by all the threads in the block. Any particular reasons for that?
Two reasons:
- I was following Refactor to extract random forest objectives #3854, which also redundantly computes the sum over the classes.
- This PR requires the computation of `total_count`, which is the sum of all elements of `shist`. If the loop were to run collaboratively, I'd need to define an extra data structure to perform the reduction for `total_count`.
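For context, the single-pass logic under discussion can be sketched on the host as follows (a simplified illustration with assumed names, not the PR's actual device code):

```cpp
#include <utility>
#include <vector>

// Hedged sketch: one sequential pass over the per-class histogram yields
// both the majority class and total_count. If the loop ran collaboratively
// across threads, total_count would need a separate block-wide reduction.
std::pair<int, double> leaf_prediction_with_prob(const std::vector<int>& shist)
{
  int max_class_idx = 0;
  int max_count     = 0;
  int total_count   = 0;
  for (int i = 0; i < static_cast<int>(shist.size()); ++i) {
    total_count += shist[i];
    if (shist[i] > max_count) {
      max_count     = shist[i];
      max_class_idx = i;
    }
  }
  // Probability of the predicted class: its count over all samples at the leaf.
  double prob = total_count > 0 ? static_cast<double>(max_count) / total_count : 0.0;
  return {max_class_idx, prob};
}
```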
`total_count` can be calculated with a call to `cub::BlockReduce`. Assuming `nclasses` is small, this is a small penalty. We could change it in the future if that assumption breaks.
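One way that suggestion could look (a hedged sketch, not the PR's actual kernel; the kernel shape and `TPB` block size are assumptions):

```cuda
#include <cub/cub.cuh>

template <int TPB>
__global__ void leaf_prediction_kernel(const int* shist, int nclasses)
{
  typedef cub::BlockReduce<int, TPB> BlockReduce;
  __shared__ typename BlockReduce::TempStorage temp_storage;

  // Each thread accumulates a strided partial sum over the histogram bins.
  int thread_sum = 0;
  for (int i = threadIdx.x; i < nclasses; i += TPB) {
    thread_sum += shist[i];
  }
  // Block-wide reduction; the result is valid only in thread 0.
  int total_count = BlockReduce(temp_storage).Sum(thread_sum);
  if (threadIdx.x == 0) {
    // ... use total_count to normalize the majority count into a probability
  }
}
```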
In that case, can the whole loop be moved inside the `if (tid == 0)` block on line 91?
@vinaydes Thanks. What do you think of #3854, where the summing is performed redundantly in all threads? For example:
cuml/cpp/src/decisiontree/batched-levelalgo/metrics.cuh
Lines 112 to 123 in bf5007c
static DI LabelT LeafPrediction(BinT* shist, int nclasses) {
  int class_idx = 0;
  int count = 0;
  for (int i = 0; i < nclasses; i++) {
    auto current_count = shist[i].x;
    if (current_count > count) {
      class_idx = i;
      count = current_count;
    }
  }
  return class_idx;
}
If I understand correctly, in #3854, `LeafPrediction()` is called at kernels.cuh#L141, which is already inside the `tid == 0` block. I could be wrong if it is called somewhere else too.
@vinaydes Got it. In that case, can I move this loop inside the `tid == 0` block?
  info.prediction = pred;
  info.colid = Leaf;
- info.quesval = DataT(0);  // don't care for leaf nodes
+ info.quesval = aux;
Is this reuse necessary? I understand `quesval` is unused in leaf nodes, but it may be better to use a separate member variable for this.
I wanted to avoid introducing a new member variable, since we want to introduce a new data structure to store the probability distribution for multi-class classifiers.
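The trade-off being discussed can be illustrated with a small sketch (struct layout and names are assumptions for illustration, not cuML's actual types):

```cpp
// Hedged sketch with hypothetical names: the node record's question-value
// field is meaningful only for internal (split) nodes, so a leaf can reuse
// it to carry the predicted-class probability without widening the struct.
constexpr int Leaf = -1;  // assumed sentinel marking a leaf node

struct NodeInfo {
  int colid;          // split feature for internal nodes; Leaf sentinel otherwise
  double quesval;     // split threshold; reused at leaves for the probability
  double prediction;  // predicted label at a leaf
};

NodeInfo make_leaf(double pred, double prob)
{
  NodeInfo info;
  info.prediction = pred;
  info.colid      = Leaf;
  info.quesval    = prob;  // reuse the unused field instead of adding a member
  return info;
}
```

A separate member, or the multi-class probability structure mentioned above, would avoid the overloaded field at the cost of a larger node record.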
Codecov Report
@@ Coverage Diff @@
## branch-21.06 #3869 +/- ##
===============================================
Coverage ? 85.41%
===============================================
Files ? 227
Lines ? 17315
Branches ? 0
===============================================
Hits ? 14790
Misses ? 2525
Partials ? 0
Flags with carried forward coverage won't be shown.
rerun tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM. @vinaydes, was wondering if you have any further comments, or does this look good to merge?
Good to merge 👍
@gpucibot merge
…ive implementaton) (rapidsai#3869)"

This reverts commit 92484fb.
…ementaton) (rapidsai#3869)

Alternative implementation of rapidsai#3862 that does not depend on rapidsai#3854
Closes rapidsai#3764
Closes rapidsai#2518

Authors:
- Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)
- Vinay Deshpande (https://github.com/vinaydes)

URL: rapidsai#3869
Reverts rapidsai#3869, as it was shown to reduce the test accuracy in some cases. Closes rapidsai#3910

Authors:
- Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#3933
Alternative implementation of #3862 that does not depend on #3854
Closes #3764
Closes #2518