Reduce Sum Op #2379
Conversation
Signed-off-by: Albert Wolant <awolant@nvidia.com>
1a9c461 to e47e965
!build
CI MESSAGE: [1714406]: BUILD FAILED
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
CI MESSAGE: [1716680]: BUILD STARTED
CI MESSAGE: [1716680]: BUILD PASSED
#include "include/dali/core/static_map.h"
#include "dali/operators/generic/reduce/reduce.h"

#define SUM_TYPES_MAP ( \
Please add a comment explaining the format and what this map is for.
Done. Added a comment here and more elaborate doc-style comments for TYPE_MAP with its implementation.
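For readers outside the thread: the map being discussed pairs each input type with the output types a Sum reduction may produce. A rough template analogue of the idea (illustrative only; the actual SUM_TYPES_MAP is a preprocessor list, and the trait name and mappings below are assumptions, not the real DALI ones):

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical trait mirroring what an (input type -> output type) map
// encodes for Sum: each input type maps to a default accumulator/output.
// Names and mappings here are illustrative, not the real DALI entries.
template <typename In> struct DefaultSumOutput;
template <> struct DefaultSumOutput<uint8_t>  { using type = uint64_t; };
template <> struct DefaultSumOutput<uint32_t> { using type = uint64_t; };
template <> struct DefaultSumOutput<int32_t>  { using type = int64_t; };
template <> struct DefaultSumOutput<float>    { using type = float; };
```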
@@ -5,6 +5,18 @@
from nvidia.dali.pipeline import Pipeline
import numpy as np

to_dali_type = {
Please use np_types_to_dali from test_utils.py
Done
(np.uint32, [np.uint64, np.float32]),
(np.int32, [np.int32, np.int64, np.float32])]

for keep_dims in [False, True]:
How long does this test take? Maybe we should split it into a smaller and a bigger flavor?
This whole file as it is in this PR takes around 30 seconds to run, so I don't think it's a problem.
@@ -41,15 +51,21 @@ DALI_SCHEMA(Max)
  .NumOutput(1)
  .AddParent("ReduceBase");

using MinCPU = Reduce<kernels::MinCPU, CPUBackend>;
using SumCPU = SumOp<kernels::SumCPU, CPUBackend>;
DALI_REGISTER_OPERATOR(Sum, SumCPU, CPU);
Didn't we want to have all the reductions in a reductions namespace?
If we haven't released those yet, it's as simple as renaming "Sum" to "reductions__Sum" and so on.
We have that in some nightly builds, but I can change that, no problem. Let's discuss.
Done, moved to reductions
========================

uSHET Library - CPP Magic
Maybe you could mention why we are ack-ing this for the future reader.
Added comment to relevant file - static_map.h
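For context, the acknowledged uSHET cpp_magic header provides preprocessor metaprogramming utilities, such as applying a macro to each element of an argument list. A greatly simplified, fixed-arity sketch of that idiom (the real header supports recursive variadic expansion; these macro names are made up for illustration):

```cpp
// Simplified illustration of the preprocessor-iteration idiom that
// cpp_magic generalizes: apply a macro M to each listed argument.
// Fixed arity here for clarity; the real library handles variadic lists.
#define APPLY_TO_EACH_3(M, a, b, c) M(a) M(b) M(c)
#define DECLARE_ZEROED(name) int name = 0;

// Expands to three zero-initialized int definitions.
APPLY_TO_EACH_3(DECLARE_ZEROED, sum_count, min_count, max_count)
```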
@@ -29,6 +30,15 @@ Not providing any axis results in reduction of all elements.)code",
  "If True, maintains original input dimensions.",
  false);

DALI_SCHEMA(Sum)
  .DocStr("")
missing documentation
Done
class Reduce : public Operator<Backend> {
 public:
  explicit inline Reduce(const OpSpec &spec) :
    Operator<Backend>(spec),
    axes_(spec.GetRepeatedArgument<int>("axes")),
    keep_dims_(spec.GetArgument<bool>("keep_dims")) {
    if (!spec.TryGetArgument<DALIDataType>(output_type_, "dtype")) {
- if (!spec.TryGetArgument<DALIDataType>(output_type_, "dtype")) {
+ output_type_ = spec.GetArgument<DALIDataType>("dtype");

you already have a default value in the schema
This is a leftover from a previous version.
TYPE_SWITCH(data_type, type2id, DataType, REDUCE_TYPES, (
  RunTyped<DataType, DataType>(ws);),
  DALI_FAIL(make_string("Unsupported input type: ", data_type)))
ImplType<ReductionType, Backend>& reduce_impl =
- ImplType<ReductionType, Backend>& reduce_impl =
+ auto& reduce_impl =

?
Done
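For readers unfamiliar with TYPE_SWITCH: it turns a runtime type id into a compile-time template instantiation, so each branch works with a concrete type. A standalone sketch of the pattern (the enum and function names below are simplifications for illustration, not the real type2id machinery):

```cpp
#include <cstdint>
#include <string>

// Minimal model of runtime-id -> template dispatch: the switch selects
// the instantiation, so each branch gets the concrete type at compile time.
enum class TypeId { kInt32, kFloat };

template <typename T> std::string TypeName();
template <> std::string TypeName<int32_t>() { return "int32"; }
template <> std::string TypeName<float>()   { return "float"; }

std::string DispatchTypeName(TypeId id) {
  switch (id) {
    case TypeId::kInt32: return TypeName<int32_t>();
    case TypeId::kFloat: return TypeName<float>();
  }
  return "unsupported";  // analogous to the DALI_FAIL fallback branch
}
```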
DALIDataType input_type = in.type().id();

TYPE_SWITCH(input_type, type2id, DataType, REDUCE_TYPES, (
  Reduce<ReductionType, Backend, ReduceOp>& base =
- Reduce<ReductionType, Backend, ReduceOp>& base =
+ auto& base =
Done
 public:
  explicit inline ReduceOp(const OpSpec &spec) : Reduce<ReductionType, Backend, ReduceOp>(spec) {}

  void RunImplImpl(workspace_t<Backend> &ws) {
RunImplImpl? :) I think that's too much. Why not override RunImpl directly?
Because this is not inheritance but CRTP. I can change the name, but it is quite accurate: RunImpl calls the implementation of RunImplImpl selected by the last template parameter. This way I can share most of the reduction code and only change RunImplImpl based on the ReductionType template parameter. Maybe it's not super clear now, since Max and Min use the default implementation of RunImplImpl, but Sum changes that.
fair enough
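The CRTP dispatch described in that reply can be sketched in isolation (a minimal sketch with hypothetical names; the real operators take a workspace argument and return nothing):

```cpp
#include <string>

// CRTP: the base template knows the derived type at compile time and
// statically dispatches to its RunImplImpl; no virtual calls involved.
template <typename Impl>
class ReduceBase {
 public:
  std::string RunImpl() {
    return static_cast<Impl*>(this)->RunImplImpl();
  }
  // Default used by ops that don't customize the run step (like Max/Min).
  std::string RunImplImpl() { return "default reduction"; }
};

class MaxOp : public ReduceBase<MaxOp> {};  // keeps the default

class SumOp : public ReduceBase<SumOp> {
 public:
  // Shadows the base default; found via the static_cast in RunImpl.
  std::string RunImplImpl() { return "sum with output-type handling"; }
};
```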
dali/operators/generic/reduce/sum.h
Outdated
auto& in = ws.template InputRef<Backend>(0);
DALIDataType input_type = in.type().id();

Reduce<ReductionType, Backend, SumOp>& base =
- Reduce<ReductionType, Backend, SumOp>& base =
+ auto& base =
Done
not done?
Now done, I hope :)
(np.uint32, [np.uint64, np.float32]),
(np.int32, [np.int32, np.int64, np.float32])]

for keep_dims in [False, True]:
Maybe keep_dims could be a random choice in the inner part of the nested loops? You'd cut the number of tests in half and still cover what you need to test.
This whole file as it is in this PR takes around 30 seconds to run, so I don't think it's a problem.
@@ -5,6 +5,18 @@
from nvidia.dali.pipeline import Pipeline
import numpy as np

to_dali_type = {
    np.int8: types.INT8,
    np.uint8: types.UINT8,
Can you add a test to the CPU-only file?
Done
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
CI MESSAGE: [1725506]: BUILD STARTED
CI MESSAGE: [1725506]: BUILD FAILED
Signed-off-by: Albert Wolant <awolant@nvidia.com>
@@ -240,7 +240,7 @@ def test_mfcc_cpu():
  spectrum = fn.spectrogram(data, nfft = 60, window_length = 50, window_step = 25)
  mel = fn.mel_filter_bank(spectrum)
  dec = fn.to_decibels(mel)
- processed = fn.mfc(dec)
+ processed = fn.mfcc(dec)
Unrelated typo fix.
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
dali/operators/generic/reduce/sum.h
Outdated
auto& in = ws.template InputRef<Backend>(0);
DALIDataType input_type = in.type().id();

Reduce<ReductionType, Backend, SumOp>& base =
not done?
CI MESSAGE: [1727221]: BUILD STARTED
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build
CI MESSAGE: [1727370]: BUILD STARTED
CI MESSAGE: [1727370]: BUILD PASSED
dali/operators/generic/reduce/sum.h
Outdated
auto& base =
    static_cast<Reduce<ReductionType, Backend, SumOp>&>(*this);
DALIDataType output_type = base.OutputType();
- auto& base =
-     static_cast<Reduce<ReductionType, Backend, SumOp>&>(*this);
- DALIDataType output_type = base.OutputType();
+ DALIDataType output_type = this->OutputType();

This should do the trick.
Done
Tried that before, doesn't work.
dali/operators/generic/reduce/sum.h
Outdated
static_cast<Reduce<ReductionType, Backend, SumOp>&>(*this); | ||
DALIDataType output_type = base.OutputType(); | ||
if (output_type == DALI_NO_TYPE) { | ||
output_type = input_type; |
I still have my doubts here. This is going to be a major headache for the users of NumPy and PyTorch. The accumulator type for any integer is int64_t anyway, so it's not like we're saving anything by returning the result in a smaller type.
Done
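The reviewer's point about accumulator width can be illustrated standalone (a hedged sketch, not DALI code): when summing small integer types, the accumulation happens in a 64-bit value anyway, so returning the result in a wider output type costs nothing and avoids overflow surprises.

```cpp
#include <cstdint>
#include <vector>

// Sum with a 64-bit accumulator regardless of the (integer) input type:
// the per-element additions are int64_t either way, so a wide result
// type adds no cost while preventing overflow of the input type's range.
template <typename In>
int64_t WideSum(const std::vector<In>& data) {
  int64_t acc = 0;
  for (In v : data) acc += static_cast<int64_t>(v);
  return acc;
}
```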
@@ -144,6 +138,33 @@ class Reduce : public Operator<Backend> {
    false);
  kmgr_.Run<Kernel>(0, 0, ctx, out_view, in_view);
}

DALIDataType OutputType() { return output_type_; }
- DALIDataType OutputType() { return output_type_; }
+ DALIDataType OutputType() const { return output_type_; }
Done
Signed-off-by: Albert Wolant <awolant@nvidia.com>
dali/operators/generic/reduce/sum.h
Outdated
switch (input_type)
{
lint will complain
Fixed
!build
CI MESSAGE: [1728169]: BUILD STARTED
CI MESSAGE: [1728169]: BUILD PASSED
Why we need this PR?
What happened in this PR?
- Added SumOp, using CRTP to reuse the existing reductions code; type mapping is implemented with the new TYPE_MAP macro.
- Affected modules: operators (new op), include (new macro).
- Added the TYPE_MAP macro and the Sum Op implementation.
- Added Python tests for the Sum Op.
- Added docs for the new op and comments for the TYPE_MAP macro.
JIRA TASK: DALI-1675