Reductions: min, max #2342
Conversation
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
CI MESSAGE: [1688761]: BUILD STARTED
DALI_SCHEMA(ReduceBase)
  .AddOptionalArg(
      "axes",
Also support "axis_names", perhaps?
For sure. How about doing that in a follow-up PR? It is a nice extension of the API, but the op can already be useful as it is.
Fine with me. There are already several operators using it (the SliceAttr family, Erase, ...). Maybe it'd be good to make a PR that extracts that into an AxesAttr or something similar, so we can reuse the implementation and the argument documentation.
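For context, a minimal sketch of what an "axis_names" argument could add on top of "axes": translating dimension names from a layout string (e.g. "HWC") into the integer axis indices the reduction already accepts. The function name and behavior below are illustrative, not DALI's actual API.

```python
def axis_names_to_axes(layout, axis_names):
    """Map each named dimension in `axis_names` to its index in `layout`."""
    missing = [n for n in axis_names if n not in layout]
    if missing:
        raise ValueError(f"dimensions {missing} not in layout {layout!r}")
    return tuple(layout.index(n) for n in axis_names)

# Reducing over height and width of an "HWC" image:
assert axis_names_to_axes("HWC", "HW") == (0, 1)
# Channel-only reduction:
assert axis_names_to_axes("HWC", "C") == (2,)
```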
CI MESSAGE: [1688761]: BUILD FAILED
CI MESSAGE: [1688761]: BUILD PASSED
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
CI MESSAGE: [1691238]: BUILD STARTED
CI MESSAGE: [1691238]: BUILD FAILED
Signed-off-by: Albert Wolant <awolant@nvidia.com>
pipe = Pipeline(batch_size=batch_size, num_threads=4, device_id=0)

with pipe:
    input = fn.external_source(source = get_batch)
Can't you just:

Suggested change:
- input = fn.external_source(source = get_batch)
+ input = fn.external_source(source = batch_fn)
No. The ExternalSource API currently does not work with some callables. I tried a bound method and a partial, and both failed on a check inside. I think it could be reworked, but not in this PR. For now I just wrapped it in an ad hoc regular function so it works.
I put a comment about it.
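The workaround described above can be sketched like this: a callable that ExternalSource rejects (e.g. a `functools.partial`) is wrapped in a plain function. `fn.external_source` is DALI's API; the helper names below are illustrative.

```python
from functools import partial

def make_batch(batch_size):
    """Produce a dummy batch of the requested size."""
    return list(range(batch_size))

bound = partial(make_batch, 8)   # the kind of callable that failed the check

def get_batch():                 # ad hoc plain-function wrapper
    return bound()

# The pipeline then uses the wrapper instead of the partial:
#   input = fn.external_source(source=get_batch)
assert get_batch() == list(range(8))
```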
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
CI MESSAGE: [1693996]: BUILD STARTED
dali/kernels/reduce/reduce_gpu.cu (Outdated)
template class SumGPU<uint16_t, uint16_t>;
template class SumGPU<uint8_t, uint8_t>;
One crazy idea: if we plan to have many reductions that follow the same pattern, we could have a:
#define REDUCTION_IMPL(Kernel, Impl) \
template <typename Out, typename In> \
class Kernel<Out, In>::Impl : public Impl<Out, In> { \
...
Basically it'd cover all the repeated code (including template instantiation) and later you just do:
REDUCTION_IMPL(MinGPU, reduce_impl::MinImplGPU);
REDUCTION_IMPL(MaxGPU, reduce_impl::MaxImplGPU);
...
I have mixed feelings about having so much inside a macro, but on the other hand, there's a lot of boilerplate here now. Second opinion maybe? @mzient ?
CI MESSAGE: [1693996]: BUILD PASSED
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
CI MESSAGE: [1697300]: BUILD STARTED
@@ -1263,6 +1263,18 @@ class SumImplGPU : public ReduceImplGPU<Out, In, default_sum_acc_t<Out, In>, Sum
  reductions::sum GetReduction() const { return {}; }
};

template <typename Out, typename In>
class MinImplGPU : public ReduceImplGPU<Out, In, default_sum_acc_t<Out, In>, MinImplGPU<Out, In>> {
Suggested change:
- class MinImplGPU : public ReduceImplGPU<Out, In, default_sum_acc_t<Out, In>, MinImplGPU<Out, In>> {
+ class MinImplGPU : public ReduceImplGPU<Out, In, In, MinImplGPU<Out, In>> {
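The suggestion above hinges on accumulator types: a sum of many narrow values needs a wider accumulator, while the min/max of values of a type is always itself a value of that type. A small sketch of that difference, simulating uint8 arithmetic in plain Python (the `wrap_u8` helper is illustrative):

```python
def wrap_u8(x):
    """Simulate uint8 arithmetic: wrap modulo 256."""
    return x & 0xFF

values = [200, 100, 50]          # each value fits in uint8

# Summing in a uint8 accumulator wraps around...
acc = 0
for v in values:
    acc = wrap_u8(acc + v)
print(acc)                       # 94, not the true sum 350

# ...but min/max never leave the uint8 range.
print(min(values), max(values))  # 50 200
```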
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
CI MESSAGE: [1697500]: BUILD STARTED
CI MESSAGE: [1697500]: BUILD PASSED
Why do we need this PR?
What happened in this PR?
Added CPU operator for reductions: sum, min, max
Operators, Kernels
Operator implementation
Added a Python test comparing against NumPy
Added docstrings
JIRA TASK: [DALI-1621]
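The testing pattern mentioned in the description (operator output compared against a reference) can be sketched without DALI as follows; the actual test compares the operator against NumPy, while here both the implementation and the reference values are plain Python for self-containment.

```python
def reduce_2d(data, op, axis):
    """Reduce a 2D list with `op` (e.g. min or max) along `axis`."""
    if axis == 0:   # reduce over rows: one result per column
        return [op(row[c] for row in data) for c in range(len(data[0]))]
    else:           # reduce over columns: one result per row
        return [op(row) for row in data]

batch = [[3, 1, 4],
         [1, 5, 9]]

# Implementation output vs. hand-computed reference, per axis and op.
assert reduce_2d(batch, min, axis=0) == [1, 1, 4]
assert reduce_2d(batch, max, axis=0) == [3, 5, 9]
assert reduce_2d(batch, min, axis=1) == [1, 1]
assert reduce_2d(batch, max, axis=1) == [4, 9]
```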