FEAT: topk function in all backends. #2061
Minor text changes, everything else looks good.
docs/details/statistics.dox
Outdated
@@ -1,7 +1,7 @@
/*!
\page batch_detail_stat statistics

This function performs the operation across all batches present in the input simultaneously.
This function performs the operation across all dimension of the input array.
across all dimensions of
src/api/c/topk.cpp
Outdated
using namespace detail;

template<typename T>
af_err topkWithIndices(af_array *v, af_array* i, const af_array in,
Can you change this function name to topk as well, since we no longer explicitly say "with indices" anywhere?
array. The indices along with their values are returned. If the input is a
multi-dimensional array, the indices will be the index of the value in that
dimension. Order of duplicate values are not preserved. This function is
optimized for small values of k.
It would be nice to add some guideline on the value of k beyond which performance starts to deteriorate.
This would be different for different devices and implementations.
test/topk.cpp
Outdated
TYPED_TEST_CASE(TopK, TestTypes);

template<typename T>
void topkIndexTest(const unsigned ndims, const dim_t* dims,
We can just make it topkTest instead of topkIndexTest now.
@umar456 why not partial_sort for cpu devices?
68e8449 to 9a8fd9b
Addressed feedback.
@pavanky I have updated the CPU backend to use partial_sort.
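For readers following the partial_sort suggestion, here is a minimal sketch of how a top-k with indices can be built on `std::partial_sort`. This is not the code in this PR: `topkSketch` is a hypothetical helper, it assumes a 1D input held in a `std::vector`, and it always selects the largest values.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only (not ArrayFire's implementation).
// Returns the k largest values of `in` and their positions in `in`.
template<typename T>
std::pair<std::vector<T>, std::vector<unsigned>>
topkSketch(const std::vector<T>& in, std::size_t k) {
    k = std::min(k, in.size());

    // Sort indices rather than values so the positions are kept.
    std::vector<unsigned> idx(in.size());
    std::iota(idx.begin(), idx.end(), 0u);

    // Only the first k slots are ordered; the rest are left unspecified.
    // partial_sort is not stable, so the order of duplicates is not preserved.
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(k),
                      idx.end(),
                      [&in](unsigned a, unsigned b) { return in[a] > in[b]; });
    idx.resize(k);

    // Gather the values that correspond to the selected indices.
    std::vector<T> vals;
    vals.reserve(k);
    for (unsigned i : idx) vals.push_back(in[i]);

    return std::make_pair(std::move(vals), std::move(idx));
}
```

Calling `topkSketch(data, 3)` on a `std::vector<float>` returns the three largest values in descending order in `.first` and their original positions in `.second`. The cost is roughly O(n log k), which is one reason small k is the favorable case.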
array. The indices along with their values are returned. If the input is a
multi-dimensional array, the indices will be the index of the value in that
dimension. Order of duplicate values are not preserved. This function is
optimized for small values of k.
This would be different for different devices and implementations.
Only the CUDA backend implements a custom kernel that fetches the top k elements without sorting all the values. The CPU backend sorts the data and then fetches the top k elements. The OpenCL backend is optimized for CPU devices: it maps the memory and performs a partial sort to get the results.
Sponsored by SDL