Audio resampling operator #3840

Merged

mzient merged 8 commits into NVIDIA:main from mzient:AudioResampleOp

Apr 29, 2022

Contributor

mzient commented Apr 22, 2022

Category:

New feature (non-breaking change which adds functionality)

Description:

Adds a standalone audio resampling operator.
The resampling can be parameterized with either:

* input and output sampling rates (scale and length are calculated from these)
* scale factor (the length is calculated from the scale)
* new length (scale is the ratio of lengths)
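
The three parameterizations above can be sketched in plain Python. This is only an illustration, not the operator's code: the helper name `resolve_resampling`, rounding the output length up with `ceil`, and rejecting `scale` together with `out_length` are assumptions. The rule that the rates cannot be combined with `scale` or `out_length` comes from the operator docstring quoted later in the review.

```python
import math


def resolve_resampling(in_length, in_rate=None, out_rate=None,
                       scale=None, out_length=None):
    """Return (scale, out_length) for one of the three parameterizations.

    Hypothetical helper; the rounding mode (ceil) is an assumption.
    """
    if in_rate is not None or out_rate is not None:
        if in_rate is None or out_rate is None:
            raise ValueError("in_rate and out_rate must be specified together")
        if scale is not None or out_length is not None:
            raise ValueError("rates cannot be combined with scale or out_length")
        # Scale and length are both derived from the two rates.
        return out_rate / in_rate, math.ceil(in_length * out_rate / in_rate)
    if scale is not None:
        if out_length is not None:
            raise ValueError("scale cannot be combined with out_length")
        # The length is derived from the scale.
        return scale, math.ceil(in_length * scale)
    if out_length is not None:
        # The scale is the ratio of output length to input length.
        return out_length / in_length, out_length
    raise ValueError("specify rates, scale, or out_length")


# Usage: the same 2x downsampling expressed in three ways.
print(resolve_resampling(1000, in_rate=2, out_rate=1))
print(resolve_resampling(1000, scale=0.5))
print(resolve_resampling(1000, out_length=500))
```

Only the ratio of the rates matters in this sketch, so the rates do not need to use any specific unit as long as both use the same one.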

Additional information:

Affected modules and functionalities:

Audio decoder op (some minor refactoring).

Key points relevant for the review:

Checklist

Tests

Documentation

DALI team only

Requirements

Implements new requirements
Affects existing requirements
N/A

REQ IDs: TODO

JIRA TASK: DALI-2750

mzient changed the title ~~Audio resample op~~ Audio resampling operator

Collaborator

dali-automaton commented Apr 22, 2022

CI MESSAGE: [4650544]: BUILD STARTED

Collaborator

dali-automaton commented Apr 22, 2022

CI MESSAGE: [4650623]: BUILD STARTED

Collaborator

dali-automaton commented Apr 22, 2022

CI MESSAGE: [4650544]: BUILD FAILED

Collaborator

dali-automaton commented Apr 22, 2022

CI MESSAGE: [4650916]: BUILD STARTED

Collaborator

dali-automaton commented Apr 22, 2022

CI MESSAGE: [4650623]: BUILD PASSED

Collaborator

dali-automaton commented Apr 22, 2022

CI MESSAGE: [4650916]: BUILD PASSED

jantonguirao assigned stiepan and JanuszL

JanuszL reviewed

dali/operators/audio/resampling_params.h

+              struct ResamplingParams {
+                int lobes = 16;
+                int lookup_size = 2048;

Contributor

JanuszL Apr 25, 2022

Suggested change

-              int lookup_size = 2048;
+              int lookup_size = 1025;

If we use lobes * 64 + 1 formula then for lobes = 16 we should provide 1025 by default, ... I guess.

Contributor Author

mzient Apr 25, 2022

It's taken from the default parameters to resample function. I could change it in both places.
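
For reference, the arithmetic behind the suggested default can be checked with a short sketch. The formula `lobes * 64 + 1` is the one quoted in the comment above; the function name and the `points_per_lobe` parameter are hypothetical, and whether the resampler actually derives its table size this way is not confirmed in this thread.

```python
def default_lookup_size(lobes, points_per_lobe=64):
    """Lookup table size under the `lobes * 64 + 1` formula (hypothetical helper)."""
    return lobes * points_per_lobe + 1


print(default_lookup_size(16))  # 16 * 64 + 1 = 1025
```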

JanuszL reviewed

dali/test/python/test_operator_audio_resample.py

		@@ -0,0 +1,68 @@
		# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Contributor

JanuszL Apr 25, 2022

Please add a test for this to test_dali_cpu_only.py and test_dali_variable_batch_size.py.

JanuszL reviewed

dali/operators/audio/resample.cc Outdated

+              namespace dali {
+              DALI_SCHEMA(experimental__AudioResample)
+                .DocStr(R"(Resamples a signal with a different sampling rate.

Contributor

JanuszL Apr 25, 2022

Suggested change

-              .DocStr(R"(Resamples a signal with a different sampling rate.
+              .DocStr(R"(Resamples a signal with a given sampling rate.

Contributor Author

mzient Apr 25, 2022

Actually, neither is true. "given" suggests an absolute value and that's not how this operator works - you don't even need to supply sampling rates directly (just the ratio). Also, I could add a fast path for scale==1 (so if the rate isn't really different, there would be no perf hit).

Contributor

JanuszL Apr 25, 2022

Up to you. I just wanted to point out that the sampling rate doesn't necessarily need to be different.

JanuszL reviewed

dali/operators/audio/resample.cc Outdated

Comment on lines 29 to 30

		The resampling ratio can specified either directly or as a ratio of target to source sampling rate
		or calculated from the ratio of requested output length to input length.

Contributor

JanuszL Apr 25, 2022

Suggested change

-            The resampling ratio can specified either directly or as a ratio of target to source sampling rate
-            or calculated from the ratio of requested output length to input length.
+            The resampling ratio can specified either directly, or as a ratio of target to source sampling rate,
+            or calculated from the ratio of requested output length to input length.

Contributor Author

mzient Apr 25, 2022

Actually, the rules for a comma before "or" are more complex - there should be one if an independent clause follows "or".

The resampling ratio can be specified directly, or it can be calculated as a ratio of input and output sampling rates.

Here, we have two independent clauses ("[ratio] can be specified..." and "it can be calculated...").
In the original text, however, the clauses are dependent - they both refer to "be specified": "directly or as a ratio...".

Having said all that, the sentence is indeed incorrect, because:

* it's missing the verb
* it uses "either" in front of multiple choice

I think it should be:

The resampling ratio can be specified directly or as a ratio of target to source sampling rate, or calculated...

but perhaps the second comma is not mandatory, as the last clause can be treated as dependent wrt "be" (specified or calculated).

Contributor

JanuszL Apr 25, 2022

Up to you

JanuszL reviewed

dali/operators/audio/resample.cc Outdated

+              The ``in_rate`` and ``out_rate`` parameters cannot be specified together with ``scale`` or
+              ``out_length``.)",
+                  nullptr, true)
+                .AddOptionalArg<float>("out_rate", R"(Input sampling rate.

Contributor

JanuszL Apr 25, 2022

Suggested change

-              .AddOptionalArg<float>("out_rate", R"(Input sampling rate.
+              .AddOptionalArg<float>("out_rate", R"(Output sampling rate.

JanuszL reviewed

dali/operators/audio/resample.cc Outdated

+                  nullptr, true)
+                .AddOptionalArg<float>("out_rate", R"(Input sampling rate.
+              The sampling rate of the input sample. This parameter must be specified together with ``out_rate``.

Contributor

JanuszL Apr 25, 2022 (edited)

Suggested change

-            The sampling rate of the input sample. This parameter must be specified together with ``out_rate``.
+            The sampling rate of the output sample. This parameter must be specified together with ``in_rate``.

JanuszL reviewed

dali/operators/audio/resample.cc Outdated

+                .AddOptionalArg<float>("out_rate", R"(Input sampling rate.
+              The sampling rate of the input sample. This parameter must be specified together with ``out_rate``.
+              The value is relative to ``out_rate`` and doesn't need to use any specific unit as long as the

Contributor

JanuszL Apr 25, 2022

Suggested change

-            The value is relative to ``out_rate`` and doesn't need to use any specific unit as long as the
+            The value is relative to ``in_rate`` and doesn't need to use any specific unit as long as the

JanuszL reviewed

dali/operators/audio/resample.cc

+gives 3 lobes of the sinc filter, 50 gives 16 lobes, and 100 gives 64 lobes.)",
+.0f, false)
+                .AddOptionalArg<DALIDataType>("dtype", R"(The ouput type.

Contributor

JanuszL Apr 25, 2022

I don't know if it's clear enough here that the input should be in the [-1, 1] range for floats, or full range for integers?

Contributor Author

mzient Apr 26, 2022

Well, -1..1 for integers makes very little sense. I could add examples.

Contributor

JanuszL Apr 26, 2022

Up to you. On the other hand we don't do that for other ops....
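
One way the examples mentioned in this thread could look is sketched below. It assumes floating-point samples normalized to [-1, 1], signed integers centered at 0, and unsigned integers centered at the middle of their range. The helper name `float_to_int_sample`, the clamping, and the exact rounding are illustrative assumptions, not the operator's conversion code.

```python
def float_to_int_sample(x, bits, signed):
    """Map a float sample in [-1, 1] to an integer sample (hypothetical helper)."""
    x = max(-1.0, min(1.0, x))  # clamp out-of-range input
    if signed:
        # Signed: 0.0 maps to 0 and +/-1.0 maps to +/-max.
        max_val = 2 ** (bits - 1) - 1
        return round(x * max_val)
    # Unsigned: 0.0 maps to the middle of the range.
    mid = 2 ** (bits - 1)
    return mid + round(x * (mid - 1))


print(float_to_int_sample(0.0, 16, signed=True))   # silence in int16
print(float_to_int_sample(0.0, 8, signed=False))   # silence in uint8
print(float_to_int_sample(1.0, 16, signed=True))   # full scale in int16
```

Under these assumptions, an integer input that only spans -1..1 would be close to silence, which is why full range is expected for integer types.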

JanuszL reviewed

dali/operators/audio/resample.cc

Comment on lines +159 to +180

		kernels::signal::resampling::Resampler R;
		std::vector<std::vector<float>> in_fp32;

Contributor

JanuszL Apr 25, 2022

Suggested change

-              kernels::signal::resampling::Resampler R;
-              std::vector<std::vector<float>> in_fp32;
+              kernels::signal::resampling::Resampler R_;
+              std::vector<std::vector<float>> in_fp32_;

JanuszL reviewed

dali/operators/audio/resample.cc

+                }
+                template <typename T>
+                InTensorCPU<float> ConvertInput(std::vector<float> &tmp, const InTensorCPU<T> &in) {

Contributor

JanuszL Apr 25, 2022

I'm not sure if ConvertInput needs to be a member. All it uses is just dtype_, other parameters are passed as arguments.

JanuszL reviewed

dali/operators/audio/resample.h

+                  DALI_ENFORCE(quality_ >= 0 && quality_ <= 100, make_string("``quality`` out of range: ",
+                    quality_, "\nValid range is [0..100]."));
+                  if (spec_.TryGetArgument(dtype_, "dtype")) {
+                    // silence useless warning -----------------------------vvvvvvvvvvvvvvvv

Contributor

JanuszL Apr 25, 2022

?

Contributor Author

mzient Apr 26, 2022

Without this weird T x; (void)x; I get an "unused local typedef" warning for T.
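
As a side note on the ``quality`` check in the diff quoted in this thread: the docstring quoted earlier in the review says that a quality of 50 gives 16 lobes and 100 gives 64 lobes, while the quality value that gives 3 lobes is cut off in the quoted diff. The sketch below assumes that value is 0, the lower end of the valid range [0..100], and uses the quadratic that passes through those three points. Whether the operator uses exactly this curve is an assumption, and the function name is hypothetical.

```python
def lobes_from_quality(quality):
    """Quadratic through (0, 3), (50, 16), (100, 64); hypothetical helper."""
    if not 0 <= quality <= 100:
        raise ValueError(f"quality out of range: {quality}. Valid range is [0..100].")
    # Coefficients follow from solving for the three points above.
    return round(0.007 * quality * quality - 0.09 * quality + 3)


for q in (0, 50, 100):
    print(q, lobes_from_quality(q))
```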

JanuszL reviewed

dali/operators/audio/resample.h

Comment on lines +45 to +46

		TYPE_SWITCH(dtype_, type2id, T, (AUDIO_RESAMPLE_TYPES), (T x; (void)x;),
		(DALI_FAIL(make_string("Unsupported output type: ", dtype_,

Contributor

JanuszL Apr 25, 2022

Suggested change

-                  TYPE_SWITCH(dtype_, type2id, T, (AUDIO_RESAMPLE_TYPES), (T x; (void)x;),
-                  (DALI_FAIL(make_string("Unsupported output type: ", dtype_,
+                  TYPE_SWITCH(dtype_, type2id, T, (AUDIO_RESAMPLE_TYPES),
+                  (T x; (void)x;),
+                  (DALI_FAIL(make_string("Unsupported output type: ", dtype_,

I think this reads better for the TYPE_SWITCH.

JanuszL reviewed

dali/operators/audio/resample.h

+                ArgValue<float> scale_{"scale", spec_};
+                ArgValue<int64_t> out_length_{"out_length", spec_};
+                std::vector<double> scales_;

Contributor

JanuszL Apr 25, 2022

Easy to confuse with scale_.

mzient force-pushed the AudioResampleOp branch from b68f03c to 75b5f26 Compare

April 26, 2022 14:15

Collaborator

dali-automaton commented Apr 26, 2022

CI MESSAGE: [4679776]: BUILD STARTED

Collaborator

dali-automaton commented Apr 26, 2022

CI MESSAGE: [4679776]: BUILD PASSED

JanuszL reviewed

dali/test/python/test_dali_variable_batch_size.py

		@@ -1020,6 +1020,7 @@ def test_subscript_dim_check():
		operator_fn=fn.subscript_dim_check, num_subscripts=1)

Contributor

JanuszL Apr 26, 2022

I think you need to add fn.experimental.audio_resample to the below as well.

JanuszL approved these changes

mzient added 3 commits

April 27, 2022 11:26


          Add standalone audio resampling operator.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>


          Add rudimentary tests.

810d387

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>


          Fix copyright notice.

c865525

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

mzient added 5 commits

April 27, 2022 11:26


          Fix clang build.

294dac5

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>


          Add rate/scale/length checks.

918f8d3

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>


          Fix docstrings.

cfbfa0d

Add dynamic range conversion tests.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>


          Add CPU-only and variable batch size tests.

5db0732

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>


          Fix variable batch size coverage.

384613c

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

mzient force-pushed the AudioResampleOp branch from 75b5f26 to 384613c Compare

April 27, 2022 12:03

Collaborator

dali-automaton commented Apr 27, 2022

CI MESSAGE: [4689377]: BUILD STARTED

Collaborator

dali-automaton commented Apr 27, 2022

CI MESSAGE: [4689377]: BUILD PASSED

stiepan approved these changes

mzient merged commit d6493aa into NVIDIA:main

cyyever pushed a commit to cyyever/DALI that referenced this pull request


          Audio resampling operator for CPU backend (NVIDIA#3840)

Add standalone (CPU) audio resampling operator with tests
* The operator can work with input/output sample rates, scaling factor or target length.
* The resampling is performed with the same algorithm as in audio decoder.
* Type conversion is supported, with 0-centered signed and midrange-centered unsigned semantics

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

cyyever pushed a commit to cyyever/DALI that referenced this pull request


          Audio resampling operator for CPU backend (NVIDIA#3840)

0d7746e
