Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable support for different layouts in the MelFilterBank GPU Op #2620

Conversation

banasraf
Copy link
Collaborator

Signed-off-by: Rafal Banas.Rafal97@gmail.com

Why we need this PR?

  • It adds a support for the layouts other than #ft in the MelFilterBank GPU Op. Needed for time-major layout support.

What happened in this PR?

  • What solution was applied:
    The frequency axis is queried from the layout string
  • Affected modules and functionalities:
    Operators impl.
  • Key points relevant for the review:
    docstring, tests
  • Validation and testing:
    I've extended the existing tests to run on "tf" layout
  • Documentation (including examples):
    Docstring updated

JIRA TASK: NA

Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
@banasraf
Copy link
Collaborator Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1991582]: BUILD STARTED

@JanuszL JanuszL self-assigned this Jan 18, 2021
the fft bin index and the window index, respectively.
Expects an input with at least 2 dimensions.

Please note that the CPU implementation supports only the layout with the last two
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard would be to rework CPU implementation as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should use a separate implementation of the actual processing loop, but it should some 30-50 lines or so (if the kenrel isn't totally botched).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to implement the CPU part. I can do it by contributing to this PR so that we don't introduce the note about the lack of support in the CPU version

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a problem in nightly, IMO. What I would not like to see is this partial support making it into tagged release.

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1991582]: BUILD PASSED

@awolant awolant self-assigned this Jan 19, 2021
Signed-off-by: Joaquin Anton <janton@nvidia.com>
@JanuszL JanuszL self-requested a review January 19, 2021 17:12
@awolant awolant self-requested a review January 19, 2021 17:13
int nfilter = args_.nfilter;

std::memset(out, 0, sizeof(T) * nfilter * nwindows);
for (int64_t fftbin = fftbin_start_; fftbin <= fftbin_end_; fftbin++) {
auto *in_row_start = in + fftbin * in_stride;
auto *in_row_start = in + fftbin * nwindows;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Having strides had some potential for more generic usage.

int f2 = interval_ends_[m + 2];
for (; fftbin < f1; ++fftbin) {
auto weight_up = T(1) - weights_down_[fftbin];
if (args_.normalize)
Copy link
Contributor

@mzient mzient Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move it after the loops as val *= norm_factors_[m];

@@ -23,8 +23,9 @@ DALI_SCHEMA(MelFilterBank)
.DocStr(R"code(Converts a spectrogram to a mel spectrogram by applying a bank of
triangular filters.

Expects an input with at least 2 dimensions where the last two dimensions correspond to
the fft bin index and the window index, respectively.
Expects an input with at least 2 dimensions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be a requirement - I mean, you can just calculate a single Mel spectrum (not necessarily a spectrogram).
It can be worked around by adding artificial dimensions for use by the kernels.
If, on the other hand, there are more than 2 dimensions, all trailing and leading dimensions can be collapsed.
Can we even handle a case when f is somewhere in the middle - think, AfB layout?

Copy link
Collaborator Author

@banasraf banasraf Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GPU kernel supports any f axis

Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Comment on lines 189 to 197
while (axis > 1) {
in_shape = collapse_dim(in_shape, 0);
out_shape = collapse_dim(out_shape, 0);
axis--;
}
while (axis < in_shape.size() - 2) {
in_shape = collapse_dim(in_shape, in_shape.size() - 2);
out_shape = collapse_dim(out_shape, out_shape.size() - 2);
}
Copy link
Contributor

@mzient mzient Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor:

Suggested change
while (axis > 1) {
in_shape = collapse_dim(in_shape, 0);
out_shape = collapse_dim(out_shape, 0);
axis--;
}
while (axis < in_shape.size() - 2) {
in_shape = collapse_dim(in_shape, in_shape.size() - 2);
out_shape = collapse_dim(out_shape, out_shape.size() - 2);
}
if (axis > 1) {
in_shape = collapse_dims(in_shape, {std::make_pair(0, axis)});
out_shape = collapse_dims(out_shape, {0, axis});
axis = 1;
}
if (axis < in_shape.size() - 2) {
in_shape = collapse_dims(in_shape, {std::make_pair(axis+1, in_shape.size()-axis-1)});
out_shape = collapse_dims(out_shape, {std::make_pair(axis+1, in_shape.size()-axis-1)});
}

in_shape = collapse_dim(in_shape, in_shape.size() - 2);
out_shape = collapse_dim(out_shape, out_shape.size() - 2);
}
bool is_freq_last = axis == in_shape.size() - 1 || in_shape[in_shape.size() - 1] == 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -122,22 +122,28 @@ class MelFilterBankGpu<T, Dims>::Impl : public MelFilterImplBase<T, Dims> {
}
}

void Setup(ScratchpadEstimator &se, const TensorListShape<Dims> &in_shape) {
void Setup(ScratchpadEstimator &se, TensorListShape<> in_shape) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void Setup(ScratchpadEstimator &se, TensorListShape<> in_shape) {
void Setup(ScratchpadEstimator &se, const TensorListShape<> &in_shape) {

It looks like you're not modifying it - no need to copy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftovers from a previous change I removed finally. Thanks.

(128, 16000.0, 0.0, 8000.0, (10, 513, 100), 'Ctf'),
(128, 48000.0, 4000.0, 24000.0, (513, 100), 'tf'),
(128, 44100.0, 0.0, 22050.0, (513, 100), 'tf'),
(128, 44100.0, 1000.0, 22050.0, (513, 100), 'tf')]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test with 1D input.

auto ndim = in_shape.sample_dim();
args_.axis = layout.empty() ? std::max(0, ndim - 2) : layout.find('f');
DALI_ENFORCE(args_.axis >= 0 && args_.axis < ndim,
make_string("'f' axis not present in the layout. Got: ", layout));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
make_string("'f' axis not present in the layout. Got: ", layout));
make_string("'f' axis not present in the layout. Got: `", layout, "`"));

auto ndim = in_shape.sample_dim();
args_.axis = layout.empty() ? std::max(0, ndim - 2) : layout.find('f');
DALI_ENFORCE(args_.axis >= 0 && args_.axis < ndim,
make_string("'f' axis not present in the layout. Got: ", layout));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
make_string("'f' axis not present in the layout. Got: ", layout));
make_string("'f' axis not present in the layout. Got: `", layout, "`"));

interval_ends_.resize(nfilter + 2);
interval_ends_[0] = fftbin_start_;
interval_ends_[nfilter + 1] = fftbin_end_ + 1;
for (int interval = 1; interval < nfilter + 1; interval++, mel += mel_delta_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this loop be merged with previous one?

Copy link
Contributor

@mzient mzient Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be, but I see no reason to do it. know it's initialization only, but the inner loop would likely break the optimizations in the inner one (as the compiler would deem it less frequent).
Also, it would contain an ugly if (interval == 0)

Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao
Copy link
Contributor

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2001716]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2001716]: BUILD PASSED

@jantonguirao jantonguirao merged commit a2f39ba into NVIDIA:master Jan 21, 2021
@JanuszL JanuszL mentioned this pull request May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants