
Add SaltAndPepper GPU operator #2956

Merged (20 commits into NVIDIA:master, May 19, 2021)

Conversation

jantonguirao
Contributor

Why do we need this PR?

  • It adds a new feature: generating salt-and-pepper noise on the GPU.

What happened in this PR?


  • What solution was applied:
    Extended RNGBase to allow for monochrome noise. Registered a GPU SaltAndPepper operator.
  • Affected modules and functionalities:
    fn.noise.* operators
  • Key points relevant for the review:
    GPU RNGBase changes
  • Validation and testing:
    Existing tests. Added new tests for variable batch size and cpu_only mode
  • Documentation (including examples):
    No new documentation needed.

JIRA TASK: [DALI-1970]
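Conceptually, the operator replaces each value, with some probability, by either the maximum ("salt") or the minimum ("pepper") representable value. A minimal CPU sketch of that idea (not DALI's implementation; the function name and the uint8 value range are assumptions for illustration):

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical CPU reference: each element independently becomes salt (255)
// with probability prob_noise * prob_salt, pepper (0) with probability
// prob_noise * (1 - prob_salt), and is kept unchanged otherwise.
std::vector<uint8_t> salt_and_pepper(const std::vector<uint8_t> &in,
                                     float prob_noise, float prob_salt,
                                     unsigned seed = 1234) {
  std::mt19937 rng(seed);
  std::uniform_real_distribution<float> u(0.f, 1.f);
  std::vector<uint8_t> out(in.size());
  for (size_t i = 0; i < in.size(); i++) {
    if (u(rng) < prob_noise)
      out[i] = u(rng) < prob_salt ? 255 : 0;  // salt : pepper
    else
      out[i] = in[i];
  }
  return out;
}
```

The GPU operator parallelizes this element-wise (or pixel-wise) decision; the per-channel behavior is discussed further down in this review.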

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@JanuszL JanuszL self-assigned this May 13, 2021
@@ -223,7 +223,9 @@ def test_external_source():
fn.hsv,
fn.hue,
fn.jpeg_compression_distortion,
fn.noise.gaussian,
jantonguirao (author):
this one was missing.

@@ -174,9 +174,15 @@ def test_flip_cpu():
def test_jpeg_compression_distortion_cpu():
check_single_input(fn.jpeg_compression_distortion, quality = 10)

def test_noise_gaussian_cpu():
jantonguirao (author):

this one was missing.

}
}
return blocks_num;
return nsamples;
Reviewer:

Does it make sense to return just output.num_samples()?

jantonguirao (author):

It's exactly that; I just put it in a variable to be used in the loop.

Reviewer:

ok

auto &samples_cpu = backend_data_.sample_descs_cpu_;
auto &samples_gpu = backend_data_.sample_descs_gpu_;
int nsamples = SetupSampleDescs(samples_cpu.data(), out_view, in_view, channel_dim);
if (nsamples == 0) {
JanuszL (reviewer), May 13, 2021:

You could just as well put, at L132:

if (out_view.num_samples() == 0) {
  return;
}

jantonguirao (author):

Done.

// For "generate once, apply to all channels" mode, the channel dimension is
// removed, so that each CUDA thread always processes all channels of a pixel.
for (int s = 0; s < nsamples; s++) {
shape_copy.tensor_shape_span(s)[channel_dim] = 1;
Reviewer:

Won't this change affect sample_size inside SetupBlockDescs?

jantonguirao (author):

Yes, that's the whole point. In this case we calculate blocks as a number of full pixels. SetupBlockDescs divides a shape into blocks. Here we remove the channel dimension from the shape copy, calculate the blocks, and then launch a different kernel that uses c_count and c_stride to visit all channels at every point.
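The pixel-to-element mapping this describes can be sketched as follows (an assumed simplification; element_offset is a hypothetical helper, not code from this PR). A flat pixel index splits into the dimensions before and after the channel dimension, and c_stride places each channel:

```cpp
#include <cassert>
#include <cstdint>

// Maps a flat pixel index p (0 <= p < p_count) and a channel index c to the
// flat element index, given c_count channels spaced c_stride apart.
// c_stride is the volume of the dimensions after the channel dimension.
inline int64_t element_offset(int64_t p, int64_t c,
                              int64_t c_count, int64_t c_stride) {
  int64_t outer = p / c_stride;  // position in dims before the channel dim
  int64_t inner = p % c_stride;  // position in dims after the channel dim
  return outer * c_count * c_stride + c * c_stride + inner;
}
```

For interleaved HWC data c_stride is 1 and this reduces to p * c_count + c; for planar CHW data c_stride is H * W and it reduces to c * c_stride + p.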

Reviewer:

ok

Reviewer:

I think it would be better to pass the original shape to SetupBlockDescs and just skip channel_dim when computing the volume.

jantonguirao (author):

Will do

blockdesc_count = SetupBlockDescs(
blocks_cpu, block_sz, max_nblocks, out_view, in_view);

// TODO(janton): set layout explicitly from the user for RNG
jantonguirao (author):

This TODO is about random generators, not noise generators. We should do it at some point but not in this PR.

@@ -28,33 +28,82 @@ namespace dali {

namespace {

template <bool value>
using bool_const = std::integral_constant<bool, value>;

jantonguirao (author):

Here I am dividing the implementation into four categories (the cross product of is_noise_gen and is_per_channel).

template <typename T, typename Dist>
__device__ __inline__ void Generate(BlockDesc<true> desc,
__device__ __inline__ void Generate(const SampleDesc &sample,
jantonguirao (author):

  1. is_noise_gen=True, is_per_channel=True.
    Noise generator (applies random noise to an input) that treats channels independently (each channel can get different noise).

auto n = dist.Generate(in[idx], rng);
dist.Apply(out[idx], in[idx], n);
}
}

template <typename T, typename Dist>
__device__ __inline__ void Generate(BlockDesc<false> desc,
__device__ __inline__ void Generate(const SampleDesc &sample,
jantonguirao (author), May 17, 2021:

  2. is_noise_gen=True, is_per_channel=False.
    Noise generator (applies random noise to an input) that treats all channels in a pixel as a whole (the same noise is applied to all channels in a pixel).

}
}

template <typename T, typename Dist>
jantonguirao (author):

  3. is_noise_gen=False, is_per_channel=True.
    RNG (generates a random number, doesn't depend on any input) that treats channels independently (each channel can get a different value).

}
}

template <typename T, typename Dist>
jantonguirao (author):

  4. is_noise_gen=False, is_per_channel=False.
    RNG (generates a random number, doesn't depend on any input) that produces the same number for all channels in a pixel.
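The is_per_channel axis of this split can be sketched on the CPU like this (a hypothetical sketch assuming an interleaved HWC layout; apply_noise is not code from this PR). With per_channel=false the noise decision is drawn once per pixel and reused for every channel, so channels stay correlated:

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Applies "salt" (sets 255) with probability p, either independently per
// channel or once per pixel, reusing the same draw for all channels.
std::vector<uint8_t> apply_noise(const std::vector<uint8_t> &in, int channels,
                                 float p, bool per_channel,
                                 unsigned seed = 42) {
  std::mt19937 rng(seed);
  std::uniform_real_distribution<float> u(0.f, 1.f);
  std::vector<uint8_t> out(in);
  int64_t npixels = static_cast<int64_t>(in.size()) / channels;
  for (int64_t px = 0; px < npixels; px++) {
    bool noise_pixel = !per_channel && u(rng) < p;  // one draw per pixel
    for (int c = 0; c < channels; c++) {
      bool noise = per_channel ? (u(rng) < p) : noise_pixel;
      if (noise) out[px * channels + c] = 255;
    }
  }
  return out;
}
```

In the per-pixel mode, all channels of any given pixel are either all noised or all left intact, which is the property the monochrome-noise extension provides.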

int sample_idx;
void* output;
size_t size;
struct SampleDesc {
jantonguirao (author):

Divided into BlockDesc and SampleDesc, because we added some info that doesn't need to be repeated for every block (pixel stride, channel stride, number of channels...)

int64_t sample_sz = volume(sh);
if (channel_dim >= 0) {
int nchannels = sh[channel_dim];
samples[s].p_count = sample_sz / nchannels;
jantonguirao (author):

Here, p_count and p_stride are in units of full "pixels".

samples[s].c_count = nchannels;
samples[s].c_stride = volume(sh.begin() + channel_dim + 1, sh.end());
} else {
samples[s].p_count = sample_sz;
jantonguirao (author):

Here p_count is the total number of elements (channels are flattened).

@jantonguirao jantonguirao assigned mzient and unassigned mzient and szalpal May 17, 2021
std::tie(blocks_per_sample, blocks_num) = DistributeBlocksPerSample(shape, block_sz, max_nblocks);
int64_t block = 0;
for (int s = 0; s < shape.size(); s++) {
T *sample_data = static_cast<T *>(output[s].data);
auto sample_size = volume(shape[s]);
Reviewer:

You can extract the TensorShape and doctor the channel dimension here, instead of relying on the caller passing a manipulated input shape.

Suggested change
auto sample_size = volume(shape[s]);
auto shape_in_pixels = shape[s];
if (channel_dim >= 0)
  shape_in_pixels[channel_dim] = 1;
auto sample_size = volume(shape_in_pixels);

jantonguirao (author):

Will do

RNGKernel<T, Dist, false>
<<<gridDim, blockDim, 0, ws.stream()>>>(blocks_gpu, rngs, dists, blockdesc_count);
}
VALUE_SWITCH(use_default_dist ? 1 : 0, DefaultDist, (false, true), (
Reviewer:

A random thought: perhaps we should have a BOOL_SWITCH and get rid of this default: statement nonsense.

jantonguirao (author):

Agree. Let's handle that in a separate PR.
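A BOOL_SWITCH along the suggested lines might look like this (a hypothetical macro sketch, not an existing DALI utility). It expands the body twice, once per compile-time value, so templates can be instantiated on the boolean without a default: branch:

```cpp
#include <cassert>

// Hypothetical BOOL_SWITCH: binds a runtime bool to a constexpr name and
// expands the body for both values, so templated code can use it.
#define BOOL_SWITCH(value, BoolConst, ...)  \
  do {                                      \
    if (value) {                            \
      constexpr bool BoolConst = true;      \
      __VA_ARGS__;                          \
    } else {                                \
      constexpr bool BoolConst = false;     \
      __VA_ARGS__;                          \
    }                                       \
  } while (0)

// Example template that needs a compile-time bool.
template <bool B>
int which() { return B ? 1 : 0; }
```

Usage: `BOOL_SWITCH(flag, B, r = which<B>());` dispatches to the right instantiation without enumerating values or writing a default case.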

@jantonguirao

!build

@dali-automaton
Collaborator

CI MESSAGE: [2383324]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [2383324]: BUILD FAILED

@jantonguirao

!build

@dali-automaton
Collaborator

CI MESSAGE: [2385625]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [2385625]: BUILD PASSED

@jantonguirao jantonguirao merged commit ccdbb03 into NVIDIA:master May 19, 2021
@JanuszL JanuszL mentioned this pull request May 19, 2021
5 participants