Equalize kernel #4565
Conversation
!build
CI MESSAGE: [7027870]: BUILD STARTED
CI MESSAGE: [7027870]: BUILD FAILED
!build
CI MESSAGE: [7028794]: BUILD STARTED
CI MESSAGE: [7028794]: BUILD FAILED
CI MESSAGE: [7028794]: BUILD PASSED
static constexpr int hist_range = 256;

/**
 * @brief Performs per-channel equalization.
This doesn't seem to do per-channel equalization, but equalization on a single-channel input.
The input can have multiple channels (the second extent); each of them will get a different histogram and lookup table.
__global__ void ZeroMem(const SampleDesc *sample_descs) {
  auto sample_desc = sample_descs[blockIdx.y];
  sample_desc.out[blockIdx.x * SampleDesc::range_size + threadIdx.x] = 0;
}
is this better than cudaMemsetAsync?
I didn't try it, likely not.
For a single sample it seems to perform slightly slower, but the difference is so slim that I am not sure it is a real effect. One concern is that I'd have to assume the tensor list is contiguous here (or make num_samples calls).
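For context, a host-side sketch of what a ZeroMem launch computes (a minimal sketch with a hypothetical SampleDesc layout: one 256-bin histogram per channel per sample; in the kernel, blockIdx.y picks the sample, blockIdx.x the channel, and threadIdx.x the bin):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical host-side model of the ZeroMem grid. The GPU grid dimensions
// become plain loops here; the indexing mirrors the kernel's
// blockIdx.x * range_size + threadIdx.x addressing.
constexpr int kRangeSize = 256;

struct SampleDesc {
  uint64_t *out;      // histogram storage: num_channels * kRangeSize bins
  int num_channels;
};

void ZeroMemHost(const std::vector<SampleDesc> &samples) {
  for (const auto &sample : samples)                // grid y dimension
    for (int c = 0; c < sample.num_channels; c++)   // grid x dimension
      for (int bin = 0; bin < kRangeSize; bin++)    // threads in a block
        sample.out[c * kRangeSize + bin] = 0;
}
```

A single cudaMemsetAsync could replace the kernel only if all per-sample histogram buffers were contiguous; otherwise it takes one call per sample, which is the contiguity concern mentioned in the thread.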
static constexpr int64_t kMaxGridSize = 128;
static constexpr int64_t kShmPerChannelSize = SampleDesc::range_size * sizeof(uint64_t);

HistogramKernelGpu() : shared_mem_limit_{GetSharedMemPerBlock()}, sample_descs_{} {}
Suggested change:
- HistogramKernelGpu() : shared_mem_limit_{GetSharedMemPerBlock()}, sample_descs_{} {}
+ HistogramKernelGpu() : shared_mem_limit_{GetSharedMemPerBlock()} {}

^^ redundant?
Right!
const uint64_t *in;
};

struct LutKernelGpu {
no DLL_PUBLIC here?
Added it.
__shared__ uint64_t workspace[SampleDesc::range_size];
auto sample_desc = sample_descs[blockIdx.x];
PrefixSum(workspace, sample_desc.in);
int32_t first_idx = FirstNonZero(workspace);
Do you need every single thread finding the first non-zero? I wonder if it makes a difference.
As discussed elsewhere, we need that value in each thread anyway, and there does not seem to be an obvious alternative solution that would outperform it.
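For reference, the lookup-table construction this code path performs can be sketched on the CPU: prefix-sum the histogram into a CDF, locate the first non-zero bin, then rescale the occupied range onto [0, 255]. This is a minimal sketch assuming the classic equalization formula; names and rounding in the actual kernel may differ.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <numeric>

// Build an equalization lookup table from a 256-bin histogram.
std::array<uint8_t, 256> MakeLut(const std::array<uint64_t, 256> &hist) {
  std::array<uint64_t, 256> cdf{};
  std::partial_sum(hist.begin(), hist.end(), cdf.begin());  // PrefixSum
  int first = 0;
  while (first < 256 && cdf[first] == 0) first++;           // FirstNonZero
  std::array<uint8_t, 256> lut{};
  uint64_t total = cdf[255];
  if (first == 256 || total == cdf[first]) {
    // Degenerate histogram (empty or single value): identity mapping.
    for (int v = 0; v < 256; v++) lut[v] = static_cast<uint8_t>(v);
    return lut;
  }
  uint64_t cdf_min = cdf[first];
  for (int v = 0; v < 256; v++) {
    // Bins before `first` have cdf == 0; clamp so they map to 0.
    uint64_t c = std::max(cdf[v], cdf_min);
    lut[v] = static_cast<uint8_t>((c - cdf_min) * 255 / (total - cdf_min));
  }
  return lut;
}
```

On a uniform histogram this reduces to the identity mapping, which is a handy sanity check.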
     idx += blockDim.x * gridDim.x) {
  const uint8_t *in = sample_desc.in;
  uint8_t *out = sample_desc.out;
  uint64_t channel_idx = idx % sample_desc.num_channels;
Why rely on the modulus to index the channels, when you could use the y dimension (threadIdx.y)?
To have strided accesses in the small lookup table rather than in the global input and output.
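The trade-off can be illustrated with a host-side sketch (hypothetical names; channel-interleaved uint8 data assumed): consecutive flat indices touch consecutive elements of the large input and output buffers, which in the kernel makes those accesses coalesced, while `idx % num_channels` strides only within the small per-channel tables.

```cpp
#include <cstdint>
#include <vector>

// Apply per-channel lookup tables to channel-interleaved data (e.g. RGBRGB...).
// The big in/out buffers are walked contiguously; only the small LUTs, which
// fit in fast memory, see the strided channel access pattern.
void ApplyLuts(const uint8_t *in, uint8_t *out, int64_t num_elements,
               int num_channels, const std::vector<std::vector<uint8_t>> &luts) {
  for (int64_t idx = 0; idx < num_elements; idx++) {
    int channel = static_cast<int>(idx % num_channels);
    out[idx] = luts[channel][in[idx]];
  }
}
```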
for (int64_t idx = 0; idx < batch_shape[0].num_elements(); idx++) {
  sample_in_view.data[idx] = 51 * sample_idx;
}
}
You are not running the test, just setting the data. Is that expected?
Right, I missed it, thanks.
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build
CI MESSAGE: [7056329]: BUILD STARTED
CI MESSAGE: [7056329]: BUILD FAILED
CI MESSAGE: [7056329]: BUILD PASSED
* Adds equalization kernel for uint8 samples
* The kernel computes histogram, lookup table and performs the lookup.

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Category:
New feature (non-breaking change which adds functionality)
Description:
This PR adds the equalize kernel. Equalization consists of the following steps: computing a histogram of the input, building a lookup table from it, and mapping the input through the table. The lookup table is different for different channels, so the existing lookup kernel did not seem to fit here.
The operation is needed for auto augment pipelines. For now it supports only images and videos of uint8 type.
Additional information:
Affected modules and functionalities:
No existing functionalities are affected.
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: DALI-3187