Extract windows GPU #1538

mzient · 2019-12-03T18:02:23Z

Why we need this PR?

It adds a feature required to implement STFT for GPU

What happened in this PR?

Added an ExtractWindows kernel for GPU that supports both horizontal and vertical output layouts.

How is this tested?

C++ unit tests against naive implementation (some small batches with various parameters + size sweep to check for edge cases)

JIRA TASK: [DALI-1168]
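
For context, a naive reference of the kind such tests compare against could look like the sketch below. It is not the test code from this PR: the function names, the parameter list, the reflect-without-edge-repeat border rule, and the output indexing used for the two layouts are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mirror an out-of-range index back into [0, n)
// without repeating the edge sample (assumed border rule).
inline int ReflectIndex(int idx, int n) {
  if (n == 1) return 0;  // a single sample reflects onto itself
  const int period = 2 * (n - 1);
  idx %= period;
  if (idx < 0) idx += period;
  return idx < n ? idx : period - idx;
}

// Naive reference: computes one output element at a time, no attempt at speed.
// With `vertical`, sample i of every window is stored contiguously (assumed
// layout); otherwise each window occupies win_len consecutive elements.
std::vector<float> ExtractWindowsNaive(const std::vector<float> &src,
                                       const std::vector<float> &window,
                                       int num_windows, int win_center,
                                       int step, bool reflect, bool vertical) {
  assert(!src.empty() && !window.empty() && step > 0);
  const int length = static_cast<int>(src.size());
  const int win_len = static_cast<int>(window.size());
  std::vector<float> out(static_cast<std::size_t>(win_len) * num_windows, 0.0f);
  for (int w = 0; w < num_windows; w++) {
    for (int i = 0; i < win_len; i++) {
      int idx = w * step - win_center + i;
      if (idx < 0 || idx >= length) {
        if (!reflect) continue;  // zero padding: leave the element at 0
        idx = ReflectIndex(idx, length);
      }
      const int o = vertical ? i * num_windows + w : w * win_len + i;
      out[o] = src[idx] * window[i];  // window function applied on extraction
    }
  }
  return out;
}
```

A GPU result can then be copied back to the host and compared element by element with the output of this function for the same arguments.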

mzient · 2019-12-03T18:02:35Z

!build

dali-automaton · 2019-12-03T18:05:08Z

CI MESSAGE: [1015952]: BUILD STARTED

dali-automaton · 2019-12-03T18:18:14Z

CI MESSAGE: [1015952]: BUILD FAILED

mzient · 2019-12-04T10:20:35Z

!build

dali-automaton · 2019-12-04T10:25:27Z

CI MESSAGE: [1017208]: BUILD STARTED

jantonguirao · 2019-12-04T11:37:39Z

dali/kernels/signal/window/extract_windows_gpu.cu

+struct ExtractWindowsGPU<Dst, Src>::Impl : public ExtractWindowsGPUImpl<Dst, Src> {
+};
+
+template <typename Dst, typename Src>


As I mentioned earlier, the CPU variant of extract windows can work on 2D inputs (e.g. stereo audio signal) as long as the temporal dimension is the inner-most. Would it be possible to do something similar here?

For the innermost dimension? Rather hard to do for the general case. For C=2 it should be doable. I'm not really convinced that this kernel is going to stay at all, so I wouldn't put too much effort into making it very flexible.

jantonguirao · 2019-12-04T11:40:59Z

dali/kernels/signal/window/extract_windows_gpu.cuh

+/// @remarks This function must be executed by all (or no) threads in a block!
+template <int num_pages = 1, typename Dst, typename Src>
+__device__ void ExtractWindowsBlock(
+    int first_window_idx,


Note: from the first read perspective at least, the order of those arguments seems a bit random.

jantonguirao · 2019-12-04T11:44:54Z

dali/kernels/signal/window/extract_windows_gpu.cuh

+          break;
+      }
+    }
+    float v = idx >= 0 && idx < length ? ConvertNorm<float>(src[idx]) * w : Src();


I find it a bit non-intuitive that we normalize on the fly as part of the type conversion. Even if it makes sense for most usages, it is not really documented and it is well hidden in the CUDA kernel implementation.

My opinion is that we should probably leave type conversion to the decoder and let this kernel work on Dst = Src.

I think it applies a windowing function during the extraction; it is not a normalization. But it deserves documentation similar to that of ExtractHorizontalWindows.

jantonguirao · 2019-12-04T11:47:21Z

dali/kernels/signal/window/extract_windows_gpu.cuh

+    blockIdx.x * kBlock,        // first window index
+    dst, num_windows, stride,   // output
+    src, length,                // input
+    window, win_len, win_center, step, reflect);  // windowing options


Suggested change

-    window, win_len, win_center, step, reflect);  // windowing options
+    window, win_len, win_center, win_step, reflect);  // windowing options

?

Actually, I'd rename window_step to step in ExtractWindowsArgs - window_step might mean other things, e.g. step between samples when extracting dilated windows (e.g. for multi-channel data or just downsampling).

jantonguirao · 2019-12-04T12:02:16Z

dali/kernels/signal/window/extract_windows_gpu.h

+namespace kernels {
+namespace signal {
+
+struct ExtractWindowsGPUArgs : ExtractWindowsArgs {


Suggested change

-struct ExtractWindowsGPUArgs : ExtractWindowsArgs {
+struct ExtractWindowsGPUArgs : public ExtractWindowsArgs {

jantonguirao · 2019-12-04T12:04:06Z

dali/kernels/signal/window/extract_windows_gpu.h

+  static_assert(std::is_same<Dst, float>::value, "Output type must be float");
+  static_assert(
+    std::is_same<Src, float>::value ||
+    std::is_same<Src, int8_t>::value ||


If you want to keep the type conversion, we should document that when converting from int types to float, the range is normalized to [-1.0, 1.0]
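
To make the documented behavior concrete, here is a minimal sketch of a normalizing conversion from signed integers to float. `ToNormalizedFloat` is a hypothetical stand-in, not DALI's `ConvertNorm`; the scale factor (the type's maximum value) and the clamp at -1.0 are assumptions, and the real implementation may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for a normalizing conversion (not DALI's ConvertNorm).
// Signed integers are divided by the type's maximum, so max() maps to 1.0f.
// min() would land slightly below -1.0f, so the result is clamped (assumption).
template <typename Src>
float ToNormalizedFloat(Src value) {
  static_assert(std::is_integral<Src>::value && std::is_signed<Src>::value,
                "this sketch only covers signed integer inputs");
  const float scale = static_cast<float>(std::numeric_limits<Src>::max());
  return std::max(-1.0f, static_cast<float>(value) / scale);
}

// Float input is passed through unchanged.
inline float ToNormalizedFloat(float value) { return value; }
```

With this rule an `int16_t` sample of 32767 becomes 1.0f and a `float` sample is left as is, which is the kind of behavior a reader would need to see spelled out in the kernel's documentation.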

jantonguirao · 2019-12-04T12:06:59Z

dali/kernels/signal/window/extract_windows_gpu_test.cu

+  }
+
+  ExtractWindowsArgs args;
+  args.window_length = window.empty() ? 55 : window.size();


suggestion: Those tests could be parametrized. At the very least, those magic numbers could be declared as constants at the top of the test.

jantonguirao · 2019-12-04T12:42:46Z

dali/kernels/signal/window/extract_windows_gpu.cuh

+      out_shape.set_tensor_shape(0, { out_win_length, total_windows });
+    }
+
+    while (xgrid > 0x10000 && xgrid > N) {


please make 0x10000 a named constant

I can, I just don't know if it really helps to have kMaxBlocks = 0x10000 defined in previous line and used just once...

szalpal · 2019-12-04T14:14:40Z

dali/kernels/signal/window/extract_windows_gpu.cu

+    const InListGPU<Src, 1> &input,
+    const InTensorGPU<float, 1> &window,
+    const ExtractWindowsBatchedArgs &args) {
+  (void)args;


Is it here only to pass the "unused" warning?

Yes, it is.
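
As a small illustration of the idiom discussed here, the sketch below shows the cast to void next to the alternative of leaving the parameter unnamed. `Args`, `SetupWithCast` and `SetupUnnamed` are hypothetical names and are not part of this PR.

```cpp
// Hypothetical argument struct, for illustration only.
struct Args {
  int window_length = 0;
};

// The cast to void discards the value and counts as a use of `args`,
// which keeps the compiler from reporting an unused parameter.
int SetupWithCast(int num_samples, const Args &args) {
  (void)args;
  return num_samples;
}

// Alternative: an unnamed parameter cannot be reported as unused.
int SetupUnnamed(int num_samples, const Args &) {
  return num_samples;
}
```

The cast keeps the parameter name visible in the definition, which can be handy when the argument is expected to be used later.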

dali-automaton · 2019-12-04T16:56:41Z

CI MESSAGE: [1017208]: BUILD FAILED

mzient · 2019-12-09T11:05:01Z

!build

dali-automaton · 2019-12-09T11:10:32Z

CI MESSAGE: [1023982]: BUILD STARTED

dali-automaton · 2019-12-09T12:13:03Z

CI MESSAGE: [1023982]: BUILD PASSED

JanuszL · 2019-12-09T12:49:43Z

dali/kernels/signal/window/extract_windows_gpu.cuh

+};
+
+struct BlockDesc {
+  int sample_idx;


You can add some @brief to sample_idx as well.

JanuszL · 2019-12-09T12:49:46Z

dali/kernels/signal/window/extract_windows_gpu.cuh

+};
+
+struct HorizontalBlockDesc {
+  int sample_idx;


You can add some @brief to sample_idx as well.

JanuszL · 2019-12-09T12:56:15Z

dali/kernels/signal/window/extract_windows_gpu.h

+namespace signal {
+
+struct ExtractWindowsBatchedArgs : ExtractWindowsArgs {
+  bool vertical = false;


Some @brief for vertical ?

JanuszL · 2019-12-09T13:05:05Z

dali/kernels/signal/window/extract_windows_gpu.h

+  /**
+   * @brief If true, all outputs are concatenated.
+   *
+   * In case of vertical windows, tThe concatenated output will contain all first samples from


Suggested change

-   * In case of vertical windows, tThe concatenated output will contain all first samples from
+   * In case of vertical windows, the concatenated output will contain all first samples from

JanuszL · 2019-12-09T13:06:28Z

dali/kernels/signal/window/extract_windows_gpu.h

+  /**
+   * @brief Indicates that the output should be overallocated (or windows truncated) to this size.
+   */
+  int padded_output_window = -1;


Does it map to out_win_length?

Yes. Neither name really reflects what it does... it's hard to make a descriptive name for it, really.

If we can't figure out the best name, at least use the same one everywhere.

mzient · 2019-12-16T12:03:47Z

!build

dali-automaton · 2019-12-16T12:05:30Z

CI MESSAGE: [1034835]: BUILD STARTED

dali-automaton · 2019-12-16T13:19:09Z

CI MESSAGE: [1034835]: BUILD FAILED

mzient · 2019-12-16T14:30:17Z

!build

dali-automaton · 2019-12-16T14:35:21Z

CI MESSAGE: [1035008]: BUILD STARTED

dali-automaton · 2019-12-16T15:59:38Z

CI MESSAGE: [1035008]: BUILD PASSED

mzient · 2019-12-19T16:59:17Z

!build

dali-automaton · 2019-12-19T17:05:15Z

CI MESSAGE: [1040589]: BUILD STARTED

dali-automaton · 2019-12-19T17:50:44Z

CI MESSAGE: [1040589]: BUILD PASSED

mzient requested review from jantonguirao, JanuszL and a team December 3, 2019 18:02

mzient force-pushed the ExtractWindowsGPU branch from 40a1511 to 30ae7bb Compare December 4, 2019 10:11

jantonguirao reviewed Dec 4, 2019

szalpal reviewed Dec 4, 2019

szalpal reviewed Dec 4, 2019

mzient changed the title ~~Extract windows gpu~~ Extract windows GPU Dec 4, 2019

mzient force-pushed the ExtractWindowsGPU branch from 339bf34 to 28a4115 Compare December 4, 2019 14:53

mzient force-pushed the ExtractWindowsGPU branch from 1ca21ed to 97cb3d2 Compare December 9, 2019 10:49

jantonguirao approved these changes Dec 9, 2019

JanuszL reviewed Dec 9, 2019

JanuszL reviewed Dec 9, 2019

JanuszL reviewed Dec 9, 2019

mzient force-pushed the ExtractWindowsGPU branch 3 times, most recently from 46a5651 to 187e0a6 Compare December 16, 2019 12:03

mzient force-pushed the ExtractWindowsGPU branch from 187e0a6 to 42af201 Compare December 16, 2019 14:29

mzient added 13 commits December 19, 2019 16:47

GPU window extraction

75067e5

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Add tests with a window function.

784d65a

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Added kernel API frontent for GPU window extraction.

bc1fa48

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Bug fix.

429d042

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Added some comments.

ea890f8

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Some renaming. Review fixes.

99ec010

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Fix compilation - implicit instantiation of unique_ptr destructor.

8490d0b

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Add horizontal window extraction.

25f3b51

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Fix index reflection for N == 1.

4193573

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Improved boundary handling.

eb5d6b9

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Add comments about dynamic range normalization.

5c3e783

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Added zero-padding.

ef955af

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Review issues.

e60c6bb

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

mzient force-pushed the ExtractWindowsGPU branch from 42af201 to e60c6bb Compare December 19, 2019 16:56

JanuszL approved these changes Dec 19, 2019

mzient merged commit 2e368b1 into NVIDIA:master Dec 19, 2019
