-
Notifications
You must be signed in to change notification settings - Fork 618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Audio resampling operator #3840
Conversation
CI MESSAGE: [4650544]: BUILD STARTED |
CI MESSAGE: [4650623]: BUILD STARTED |
CI MESSAGE: [4650544]: BUILD FAILED |
CI MESSAGE: [4650916]: BUILD STARTED |
CI MESSAGE: [4650623]: BUILD PASSED |
CI MESSAGE: [4650916]: BUILD PASSED |
|
||
struct ResamplingParams { | ||
int lobes = 16; | ||
int lookup_size = 2048; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int lookup_size = 2048; | |
int lookup_size = 1025; |
If we use lobes * 64 + 1
formula then for lobes = 16
we should provide 1025 by default, ... I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's taken from the default parameters to resample function. I could change it in both places.
@@ -0,0 +1,68 @@ | |||
# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add this a test to test_dali_cpu_only.py and test_dali_variable_batch_size.py.
dali/operators/audio/resample.cc
Outdated
namespace dali { | ||
|
||
DALI_SCHEMA(experimental__AudioResample) | ||
.DocStr(R"(Resamples a signal with a different sampling rate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.DocStr(R"(Resamples a signal with a different sampling rate. | |
.DocStr(R"(Resamples a signal with a given sampling rate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, neither is true. "given" suggests an absolute value and that's not how this operator works - you don't even need to supply sampling rates directly (just the ratio). Also, I could add a fast path for scale==1 (so if the rate isn't really different, there would be no perf hit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you. I just wanted to point out that not necessarily the sampling rate need to be different.
dali/operators/audio/resample.cc
Outdated
The resampling ratio can specified either directly or as a ratio of target to source sampling rate | ||
or calculated from the ratio of requested output length to input length. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resampling ratio can specified either directly or as a ratio of target to source sampling rate | |
or calculated from the ratio of requested output length to input length. | |
The resampling ratio can specified either directly, or as a ratio of target to source sampling rate, | |
or calculated from the ratio of requested output length to input length. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the rules for comma before "or" are more complex - there should be one if an independent clause follows or.
The resampling ratio can be specified directly, or it can be calculated as a ratio of input and output sampling rates.
Here, we have two independent clauses ("[ratio] can be be specified..." and "it can be calculated...").
In the original text, however, the clauses are dependent - they both refer to "be specified": "directly or as a ratio...".
Having said all that, the sentence is indeed incorrect, because:
- it's missing the verb
- it uses "either" in front of multiple choice
I think it should be:
The resampling ratio can be specified directly or as a ratio of target to source sampling rate, or calculated...
but perhaps the second comma is not mandatory, as the last clause can be treated as dependent wrt "be" (specified or calculated).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you
dali/operators/audio/resample.cc
Outdated
The ``in_rate`` and ``out_rate`` parameters cannot be specified together with ``scale`` or | ||
``out_length``.)", | ||
nullptr, true) | ||
.AddOptionalArg<float>("out_rate", R"(Input sampling rate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.AddOptionalArg<float>("out_rate", R"(Input sampling rate. | |
.AddOptionalArg<float>("out_rate", R"(Output sampling rate. |
dali/operators/audio/resample.cc
Outdated
nullptr, true) | ||
.AddOptionalArg<float>("out_rate", R"(Input sampling rate. | ||
|
||
The sampling rate of the input sample. This parameter must be specified together with ``out_rate``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sampling rate of the input sample. This parameter must be specified together with ``out_rate``. | |
The sampling rate of the output sample. This parameter must be specified together with ``in_rate``. |
dali/operators/audio/resample.cc
Outdated
.AddOptionalArg<float>("out_rate", R"(Input sampling rate. | ||
|
||
The sampling rate of the input sample. This parameter must be specified together with ``out_rate``. | ||
The value is relative to ``out_rate`` and doesn't need to use any specific unit as long as the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value is relative to ``out_rate`` and doesn't need to use any specific unit as long as the | |
The value is relative to ``in_rate`` and doesn't need to use any specific unit as long as the |
|
||
0 gives 3 lobes of the sinc filter, 50 gives 16 lobes, and 100 gives 64 lobes.)", | ||
50.0f, false) | ||
.AddOptionalArg<DALIDataType>("dtype", R"(The ouput type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this is clear enough here, that the input should be either -1, 1 for floats, or full range for integers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, -1..1 for integers makes very little sense. I could add examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you. On the other hand we don't do that for other ops....
kernels::signal::resampling::Resampler R; | ||
std::vector<std::vector<float>> in_fp32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kernels::signal::resampling::Resampler R; | |
std::vector<std::vector<float>> in_fp32; | |
kernels::signal::resampling::Resampler R_; | |
std::vector<std::vector<float>> in_fp32_; |
} | ||
|
||
template <typename T> | ||
InTensorCPU<float> ConvertInput(std::vector<float> &tmp, const InTensorCPU<T> &in) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if ConvertInput
needs to be a member. All it uses is just dtype_
, other parameters are passed as arguments.
DALI_ENFORCE(quality_ >= 0 && quality_ <= 100, make_string("``quality`` out of range: ", | ||
quality_, "\nValid range is [0..100].")); | ||
if (spec_.TryGetArgument(dtype_, "dtype")) { | ||
// silence useless warning -----------------------------vvvvvvvvvvvvvvvv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this weird T x; (void)x;
I get "unused local typedef" warning for T
.
TYPE_SWITCH(dtype_, type2id, T, (AUDIO_RESAMPLE_TYPES), (T x; (void)x;), | ||
(DALI_FAIL(make_string("Unsupported output type: ", dtype_, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TYPE_SWITCH(dtype_, type2id, T, (AUDIO_RESAMPLE_TYPES), (T x; (void)x;), | |
(DALI_FAIL(make_string("Unsupported output type: ", dtype_, | |
TYPE_SWITCH(dtype_, type2id, T, (AUDIO_RESAMPLE_TYPES), | |
(T x; (void)x;), | |
(DALI_FAIL(make_string("Unsupported output type: ", dtype_, |
I think this reads better for the TYPE_SWITCH
.
ArgValue<float> scale_{"scale", spec_}; | ||
ArgValue<int64_t> out_length_{"out_length", spec_}; | ||
|
||
std::vector<double> scales_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Easy to confuse with scale_
.
CI MESSAGE: [4679776]: BUILD STARTED |
CI MESSAGE: [4679776]: BUILD PASSED |
@@ -1020,6 +1020,7 @@ def test_subscript_dim_check(): | |||
operator_fn=fn.subscript_dim_check, num_subscripts=1) | |||
|
|||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to add fn.experimental.audio_resample
to the below as well.
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Add dynamic range conversion tests. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
CI MESSAGE: [4689377]: BUILD STARTED |
CI MESSAGE: [4689377]: BUILD PASSED |
Add standalone (CPU) audio resampling operator with tests * The operator can work with input/output sample rates, scaling factor or target length. * The resampling is performed with the same algorithm as in audio decoder. * Type conversion is supported, with 0-centered signed and midrange-centered unsigned semantics Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Add standalone (CPU) audio resampling operator with tests * The operator can work with input/output sample rates, scaling factor or target length. * The resampling is performed with the same algorithm as in audio decoder. * Type conversion is supported, with 0-centered signed and midrange-centered unsigned semantics Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Category:
New feature (non-breaking change which adds functionality)
Description:
Adds a standalone audio resampling operator.
The resampling can be parameterized with either:
Additional information:
Affected modules and functionalities:
Audio decoder op (some minor refactoring).
Key points relevant for the review:
Checklist
Tests
Documentation
DALI team only
Requirements
REQ IDs: TODO
JIRA TASK: DALI-2750