
Reduce Sum Op #2379

Merged — 17 commits merged into NVIDIA:master on Oct 23, 2020
Conversation

@awolant (Contributor) commented Oct 19, 2020

Why do we need this PR?

  • It adds a new feature because we want to support all reductions

What happened in this PR?

  • What solution was applied:
    Added SumOp using CRTP to reuse the existing reductions code; type mapping is implemented with the new TYPE_MAP macro
  • Affected modules and functionalities:
    Operators (new op), include (new macro)
  • Key points relevant for the review:
    TYPE_MAP macro, Sum Op implementation
  • Validation and testing:
    Added python tests for Sum Op
  • Documentation (including examples):
    Added docs for new op, added comments for TYPE_MAP macro

JIRA TASK: DALI-1675

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant awolant changed the title Reduce sum op Reduce Sum Op Oct 19, 2020
@awolant (Contributor, Author) commented Oct 19, 2020

!build

@dali-automaton (Collaborator):
CI MESSAGE: [1714406]: BUILD FAILED

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant (Contributor, Author) commented Oct 20, 2020

!build

@dali-automaton (Collaborator):
CI MESSAGE: [1716680]: BUILD STARTED

@dali-automaton (Collaborator):
CI MESSAGE: [1716680]: BUILD PASSED

#include "include/dali/core/static_map.h"
#include "dali/operators/generic/reduce/reduce.h"

#define SUM_TYPES_MAP ( \
Contributor:

Please add a comment explaining the format and what this map is for.

Contributor Author:

Done. Added comment here and more elaborate doc style comments for TYPE_MAP with its implementation.
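As a rough illustration of the kind of type mapping discussed above — this is a hypothetical X-macro sketch, not DALI's actual TYPE_MAP or SUM_TYPES_MAP implementation — each map entry pairs an input type with an output (accumulator) type, and a user-supplied macro is expanded once per entry:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sketch of an X-macro style type map (names invented for
// illustration). Each ENTRY pairs an input type with its output type.
#define SUM_TYPES_MAP_SKETCH(ENTRY) \
  ENTRY(uint8_t, uint64_t)          \
  ENTRY(int32_t, int64_t)           \
  ENTRY(float, float)

// Example expansion: build an input-type-name -> output-type-size lookup.
inline size_t OutputTypeSize(const std::string &input_type) {
#define ENTRY(IN, OUT) if (input_type == #IN) return sizeof(OUT);
  SUM_TYPES_MAP_SKETCH(ENTRY)
#undef ENTRY
  return 0;  // unknown input type
}
```

The same map can be expanded with different ENTRY macros to generate switch cases, registration code, and so on, which is why documenting the entry format matters.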

@@ -5,6 +5,18 @@
from nvidia.dali.pipeline import Pipeline
import numpy as np

to_dali_type = {
Contributor:

Please use np_types_to_dali from test_utils.py

Contributor Author:

Done

(np.uint32, [np.uint64, np.float32]),
(np.int32, [np.int32, np.int64, np.float32])]

for keep_dims in [False, True]:
Contributor:

How long does this test take? Maybe we should split it into smaller and bigger flavors?

Contributor Author:

This whole file as it is in this PR takes around 30 seconds to run, so I don't think it's a problem.

@@ -41,15 +51,21 @@ DALI_SCHEMA(Max)
.NumOutput(1)
.AddParent("ReduceBase");

using MinCPU = Reduce<kernels::MinCPU, CPUBackend>;
using SumCPU = SumOp<kernels::SumCPU, CPUBackend>;
DALI_REGISTER_OPERATOR(Sum, SumCPU, CPU);
Contributor:

Didn't we want to have all the reductions in a reductions namespace?
If we haven't released those yet, it's as simple as renaming "Sum" to "reductions__Sum" and so on.

Contributor Author:

We have that in some nightly, but I can change that no problem. Let's discuss.

Contributor Author:

Done, moved to reductions


========================

uSHET Library - CPP Magic
Contributor:

Maybe you could mention, for the future reader, why we are acknowledging this.

Contributor Author:

Added comment to relevant file - static_map.h

@@ -29,6 +30,15 @@ Not providing any axis results in reduction of all elements.)code",
"If True, maintains original input dimensions.",
false);

DALI_SCHEMA(Sum)
.DocStr("")
Contributor:

missing documentation

Contributor Author:

Done

class Reduce : public Operator<Backend> {
public:
explicit inline Reduce(const OpSpec &spec) :
Operator<Backend>(spec),
axes_(spec.GetRepeatedArgument<int>("axes")),
keep_dims_(spec.GetArgument<bool>("keep_dims")) {
if (!spec.TryGetArgument<DALIDataType>(output_type_, "dtype")) {
Contributor:

Suggested change:
-  if (!spec.TryGetArgument<DALIDataType>(output_type_, "dtype")) {
+  output_type_ = spec.GetArgument<DALIDataType>("dtype");

You already have a default value in the schema.

Contributor Author:

This is a leftover from a previous version.

TYPE_SWITCH(data_type, type2id, DataType, REDUCE_TYPES, (
RunTyped<DataType, DataType>(ws);),
DALI_FAIL(make_string("Unsupported input type: ", data_type)))
ImplType<ReductionType, Backend>& reduce_impl =
Contributor:

Suggested change:
-  ImplType<ReductionType, Backend>& reduce_impl =
+  auto& reduce_impl =

?

Contributor Author:

Done

DALIDataType input_type = in.type().id();

TYPE_SWITCH(input_type, type2id, DataType, REDUCE_TYPES, (
Reduce<ReductionType, Backend, ReduceOp>& base =
Contributor:

Suggested change:
-  Reduce<ReductionType, Backend, ReduceOp>& base =
+  auto& base =

Contributor Author:

Done

public:
explicit inline ReduceOp(const OpSpec &spec) : Reduce<ReductionType, Backend, ReduceOp>(spec) {}

void RunImplImpl(workspace_t<Backend> &ws) {
Contributor:

RunImplImpl? :) I think that's too much. Why not override RunImpl directly?

@awolant (Contributor, Author) commented Oct 22, 2020:

Because this is not inheritance but CRTP. I can change the name, but it is quite accurate: RunImpl calls the implementation of RunImplImpl selected by the last template parameter.
This way I can share most of the reduction code and only change RunImplImpl based on the ReductionType template parameter.
Maybe it's not super clear now, since Max and Min use the default implementation of RunImplImpl, but Sum changes that.
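The CRTP dispatch described in this reply can be sketched like this (a simplified, hypothetical illustration, not the actual DALI classes):

```cpp
#include <string>

// CRTP base: RunImpl forwards to the derived class's RunImplImpl, chosen
// at compile time by the template parameter rather than via a vtable.
template <typename Actual>
struct ReduceBase {
  std::string RunImpl() {
    // Static dispatch to the most-derived implementation.
    return static_cast<Actual *>(this)->RunImplImpl();
  }
  // Default implementation, used when the derived class does not shadow it.
  std::string RunImplImpl() { return "default reduction"; }
};

struct MaxOp : ReduceBase<MaxOp> {};  // keeps the default RunImplImpl

struct SumOp : ReduceBase<SumOp> {    // customizes RunImplImpl
  std::string RunImplImpl() { return "sum with output type handling"; }
};
```

The shared RunImpl logic lives in the base once, while each op only supplies the part that differs.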

Contributor:

fair enough

auto& in = ws.template InputRef<Backend>(0);
DALIDataType input_type = in.type().id();

Reduce<ReductionType, Backend, SumOp>& base =
Contributor:

Suggested change:
-  Reduce<ReductionType, Backend, SumOp>& base =
+  auto& base =

Contributor Author:

Done

Contributor:

not done?

Contributor Author:

Now done, I hope :)

(np.uint32, [np.uint64, np.float32]),
(np.int32, [np.int32, np.int64, np.float32])]

for keep_dims in [False, True]:
Contributor:

Maybe keep_dims could be a random choice in the inner part of the nested loops? You'd cut the number of tests in half and still cover what you need to test.

Contributor Author:

This whole file as it is in this PR takes around 30 seconds to run, so I don't think it's a problem.

@@ -5,6 +5,18 @@
from nvidia.dali.pipeline import Pipeline
import numpy as np

to_dali_type = {
np.int8: types.INT8,
np.uint8: types.UINT8,
Contributor:

Can you add a test to the CPU-only file?

Contributor Author:

Done

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant (Contributor, Author) commented Oct 22, 2020

!build

@dali-automaton (Collaborator):
CI MESSAGE: [1725506]: BUILD STARTED

@dali-automaton (Collaborator):
CI MESSAGE: [1725506]: BUILD FAILED

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@@ -240,7 +240,7 @@ def test_mfcc_cpu():
spectrum = fn.spectrogram(data, nfft = 60, window_length = 50, window_step = 25)
mel = fn.mel_filter_bank(spectrum)
dec = fn.to_decibels(mel)
processed = fn.mfc(dec)
processed = fn.mfcc(dec)
Contributor Author:

Unrelated typo fix.

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant (Contributor, Author) commented Oct 23, 2020

!build

auto& in = ws.template InputRef<Backend>(0);
DALIDataType input_type = in.type().id();

Reduce<ReductionType, Backend, SumOp>& base =
Contributor:

not done?

@dali-automaton (Collaborator):
CI MESSAGE: [1727221]: BUILD STARTED

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant (Contributor, Author) commented Oct 23, 2020

!build

@dali-automaton (Collaborator):
CI MESSAGE: [1727370]: BUILD STARTED

@dali-automaton (Collaborator):
CI MESSAGE: [1727370]: BUILD PASSED

Comment on lines 46 to 48
auto& base =
static_cast<Reduce<ReductionType, Backend, SumOp>&>(*this);
DALIDataType output_type = base.OutputType();
@mzient (Contributor) commented Oct 23, 2020:

Suggested change:
-  auto& base =
-      static_cast<Reduce<ReductionType, Backend, SumOp>&>(*this);
-  DALIDataType output_type = base.OutputType();
+  DALIDataType output_type = this->OutputType();

This should do the trick.

Contributor Author:

Done

@awolant (Contributor, Author) left a comment:

Tried that before, doesn't work.

static_cast<Reduce<ReductionType, Backend, SumOp>&>(*this);
DALIDataType output_type = base.OutputType();
if (output_type == DALI_NO_TYPE) {
output_type = input_type;
Contributor:

I still have my doubts here. This is going to be a major headache for the users of NumPy and PyTorch. The accumulator type for any integer is int64_t anyway, so it's not like we're saving anything by returning the result in a smaller type.

Contributor Author:

Done
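The reviewer's point about accumulator types can be illustrated with a small sketch (not DALI code): the sum of small integers is accumulated in int64_t anyway, so returning the result in the narrow input type saves nothing and only reintroduces overflow:

```cpp
#include <cstdint>
#include <vector>

// Wide accumulator: no overflow for realistic input sizes.
int64_t SumWide(const std::vector<uint8_t> &v) {
  int64_t acc = 0;
  for (uint8_t x : v) acc += x;
  return acc;
}

// Truncating the wide result back to the input type wraps modulo 256,
// which is exactly the surprise a NumPy or PyTorch user would hit.
uint8_t SumNarrow(const std::vector<uint8_t> &v) {
  return static_cast<uint8_t>(SumWide(v));
}
```

For four uint8_t values of 200, the true sum is 800, but the narrowed result wraps to 800 % 256 = 32.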

@@ -144,6 +138,33 @@ class Reduce : public Operator<Backend> {
false);
kmgr_.Run<Kernel>(0, 0, ctx, out_view, in_view);
}

DALIDataType OutputType() { return output_type_; }
Contributor:

Suggested change:
-  DALIDataType OutputType() { return output_type_; }
+  DALIDataType OutputType() const { return output_type_; }

Contributor Author:

Done

Signed-off-by: Albert Wolant <awolant@nvidia.com>
Comment on lines 47 to 48
switch (input_type)
{
Contributor:

lint will complain

Contributor Author:

Fixed

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant (Contributor, Author) commented Oct 23, 2020

!build

@dali-automaton (Collaborator):
CI MESSAGE: [1728169]: BUILD STARTED

@dali-automaton (Collaborator):
CI MESSAGE: [1728169]: BUILD PASSED

@awolant awolant merged commit 35c91b6 into NVIDIA:master Oct 23, 2020