COCOReader: Support for uncompressed RLE masks #2478

jantonguirao · 2020-11-18T12:36:11Z

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Why do we need this PR?

It adds a new feature needed to support uncompressed RLE counts
It adds a new feature needed to support loading from preprocessed annotations, including RLE masks

What happened in this PR?

What solution was applied:
Added support for preprocessed annotation files describing RLE masks
Using cocoapi RLE struct to store the RLE masks.
Affected modules and functionalities:
COCOReader
Key points relevant for the review:
All
Validation and testing:
C++ Tests enhanced, new data added to DALI_extra
Documentation (including examples):
N/A
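
For readers unfamiliar with the format, here is a minimal sketch of how uncompressed RLE counts map to a binary mask. It assumes the usual COCO convention: runs alternate between background and foreground, the first run counts zeros, and pixels are laid out in column-major order. `DecodeUncompressedRle` is a hypothetical name used only for illustration; it is not the DALI or cocoapi implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of DALI or cocoapi): expands uncompressed RLE
// counts into a flat binary mask of height * width pixels, column-major.
// Runs alternate between 0s and 1s and the first run always counts 0s.
std::vector<uint8_t> DecodeUncompressedRle(const std::vector<uint32_t> &counts,
                                           int height, int width) {
  if (height < 0 || width < 0)
    throw std::invalid_argument("negative mask size");
  const size_t total = static_cast<size_t>(height) * static_cast<size_t>(width);
  std::vector<uint8_t> mask;
  mask.reserve(total);
  uint8_t value = 0;  // the first run describes background pixels
  for (uint32_t run : counts) {
    if (run > total - mask.size())
      throw std::invalid_argument("RLE counts exceed height * width");
    mask.insert(mask.end(), run, value);
    value ^= 1;  // switch between background and foreground
  }
  if (mask.size() != total)
    throw std::invalid_argument("RLE counts do not cover height * width");
  return mask;
}
```

For example, counts `{2, 3, 1}` with height 2 and width 3 describe two background pixels, three foreground pixels and one background pixel.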

JIRA TASK: [DALI-1732]

jantonguirao · 2020-11-19T14:10:12Z

dali/operators/reader/coco_reader_op_test.cc

-    ASSERT_EQ(pixelwise_masks_shape[i][0], s.height);
-    EXPECT_EQ(0, std::memcmp(mask.data, labels.data(), s.width * s.height * sizeof(uchar)));
+  Pipeline pipe2(expected_size, 1, 0, kSeed);
+  pipe2.AddOperator(


This second pipeline tests loading from preprocessed binary files.

jantonguirao · 2020-11-19T14:28:30Z

!build

mzient · 2020-11-19T14:29:05Z

dali/operators/reader/loader/coco_loader.h

@@ -59,6 +63,39 @@ inline bool HasSavePreprocessedAnnotationsDir(const OpSpec &spec) {
    (spec.HasArgument("dump_meta_files_path") && spec.GetArgument<bool>("dump_meta_files_path"));
 }

+struct RLEMask {


I wonder... if you expose all the fields for modification, then this looks more like a C++ variant of the RLE structure - then maybe this should be handled by inheritance?

Suggested change

-struct RLEMask {
+struct RLEMask : RLE {

Otherwise, you can treat RLE as a resource and use struct RLEMask : UniqueResource<RLE>:

struct RLEMask : UniqueResource<RLE> {
  RLEMask(const char* str, int h, int w) {
    rleFrString(&handle_, const_cast<char*>(str), h, w);
  }
  RLEMask(span<const unsigned int> counts, int h, int w) {
    rleInit(&handle_, h, w, counts.size(), const_cast<unsigned int*>(counts.data()));
  }
  RLE* operator->() { return &rle_; }  // do you really need this one?
  const RLE* operator->() const { return &rle_; }
  void DestroyHandle(RLE rle) { if (rle.cnts) rleFree(&rle); }
};
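
The "treat RLE as a resource" idea can be illustrated without DALI or cocoapi. The sketch below is self-contained and uses stand-ins throughout: `CRle`, `CRleInit`, `CRleFree` and `OwnedRle` are hypothetical names, and the struct layout (three sizes plus a counts pointer) is an assumption modelled on the fields used in this thread. It shows only the ownership pattern: a move-only wrapper that frees the C resource exactly once.

```cpp
#include <cstdlib>
#include <cstring>
#include <utility>

// Stand-in for a C-style resource such as cocoapi's RLE (assumed layout).
struct CRle {
  unsigned h, w, m;
  unsigned *cnts;
};

// Stand-in for a C init function: copies m counts into a malloc'ed buffer.
// If malloc fails, cnts stays null; a real implementation should report that.
void CRleInit(CRle *r, unsigned h, unsigned w, unsigned m, const unsigned *cnts) {
  r->h = h;
  r->w = w;
  r->m = m;
  r->cnts = m ? static_cast<unsigned *>(std::malloc(sizeof(unsigned) * m)) : nullptr;
  if (r->cnts) std::memcpy(r->cnts, cnts, sizeof(unsigned) * m);
}

// Stand-in for a C free function; safe to call on an empty handle.
void CRleFree(CRle *r) {
  std::free(r->cnts);
  r->cnts = nullptr;
}

// Move-only owner: copying is disabled, moving transfers the buffer and
// leaves the source empty, and the destructor releases whatever is held.
class OwnedRle {
 public:
  OwnedRle() = default;
  OwnedRle(unsigned h, unsigned w, unsigned m, const unsigned *cnts) {
    CRleInit(&rle_, h, w, m, cnts);
  }
  OwnedRle(const OwnedRle &) = delete;
  OwnedRle &operator=(const OwnedRle &) = delete;
  OwnedRle(OwnedRle &&other) noexcept : rle_(std::exchange(other.rle_, CRle{})) {}
  OwnedRle &operator=(OwnedRle &&other) noexcept {
    if (this != &other) {
      CRleFree(&rle_);
      rle_ = std::exchange(other.rle_, CRle{});
    }
    return *this;
  }
  ~OwnedRle() { CRleFree(&rle_); }

  // Read-only access to the underlying C struct.
  const CRle *operator->() const { return &rle_; }

 private:
  CRle rle_{};  // value-initialized: all sizes 0, cnts == nullptr
};
```

Exposing only the const `operator->` keeps callers from modifying fields behind the owner's back, which is the concern raised by the "do you really need this one?" remark.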

dali-automaton · 2020-11-19T14:30:40Z

CI MESSAGE: [1814491]: BUILD STARTED

dali-automaton · 2020-11-19T14:38:59Z

CI MESSAGE: [1814491]: BUILD FAILED

jantonguirao · 2020-11-19T15:41:47Z

!build

dali-automaton · 2020-11-19T15:45:27Z

CI MESSAGE: [1814766]: BUILD STARTED

dali-automaton · 2020-11-19T16:50:22Z

CI MESSAGE: [1814766]: BUILD FAILED

jantonguirao

!build

jantonguirao · 2020-11-19T19:41:44Z

!build

JanuszL · 2020-11-19T20:35:54Z

dali/operators/reader/loader/coco_loader.cc

+    return;
+
+  unsigned size;
+  file.read(reinterpret_cast<char*>(&size), sizeof(unsigned));


Since we are already touching this code, can you:

check that size is smaller than the file size and >= 0

check that m * sizeof(uint) is smaller than the file size and >= 0

check that h and w are sane as well?

Could you do the same for the other LoadFromFile variants?

dali-automaton · 2020-11-19T23:04:10Z

CI MESSAGE: [1815816]: BUILD STARTED

dali-automaton · 2020-11-20T00:28:37Z

CI MESSAGE: [1815816]: BUILD FAILED

jantonguirao · 2020-11-23T10:02:34Z

!build

dali-automaton · 2020-11-23T10:05:46Z

CI MESSAGE: [1824283]: BUILD STARTED

dali-automaton · 2020-11-23T11:14:22Z

CI MESSAGE: [1824283]: BUILD FAILED

jantonguirao · 2020-11-23T13:39:03Z

!build

dali-automaton · 2020-11-23T13:40:41Z

CI MESSAGE: [1824670]: BUILD STARTED

dali-automaton · 2020-11-23T14:56:23Z

CI MESSAGE: [1824670]: BUILD FAILED

jantonguirao · 2020-11-23T15:46:12Z

!build

dali-automaton · 2020-11-23T15:50:50Z

CI MESSAGE: [1824967]: BUILD STARTED

dali-automaton · 2020-11-23T19:49:58Z

CI MESSAGE: [1824967]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>

dali-automaton · 2020-11-24T13:55:27Z

CI MESSAGE: [1828400]: BUILD STARTED

dali-automaton · 2020-11-24T15:23:36Z

CI MESSAGE: [1828400]: BUILD PASSED

JanuszL · 2020-11-25T14:47:03Z

dali/operators/reader/coco_reader_op_test.cc

-    .AddArg("dump_meta_files_path", "/tmp/")
-    ,
+    .AddArg("save_preprocessed_annotations", true)
+    .AddArg("save_preprocessed_annotations_dir", "/tmp/"),


I wonder if we should use mkdtemp instead of hardcoding /tmp/.
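
A minimal sketch of the mkdtemp approach on a POSIX system. `MakeTempDir` and the `/tmp/dali_coco_test_` prefix are hypothetical names chosen for illustration; they are not taken from the DALI test code.

```cpp
#include <stdlib.h>  // mkdtemp (POSIX; declared here on glibc)
#include <unistd.h>  // mkdtemp on some other platforms
#include <stdexcept>
#include <string>

// Hypothetical helper: creates a uniquely named directory and returns its
// path. The caller is responsible for removing the directory afterwards.
std::string MakeTempDir(const std::string &prefix = "/tmp/dali_coco_test_") {
  // mkdtemp requires the template to end in "XXXXXX" and rewrites it in place.
  std::string tmpl = prefix + "XXXXXX";
  if (mkdtemp(&tmpl[0]) == nullptr)
    throw std::runtime_error("mkdtemp failed for template " + prefix + "XXXXXX");
  return tmpl;
}
```

The returned path could then be passed as the value of save_preprocessed_annotations_dir in the test, so that concurrent test runs do not write to the same location.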

Signed-off-by: Joaquin Anton <janton@nvidia.com>

mzient · 2020-11-26T09:22:35Z

DALI_EXTRA_VERSION

@@ -1 +1 @@
-fdd536addddc0f1a5bd52a15db708f95492c813e
+rle_uncompressed


Change to a commit tag before merging!

mzient · 2020-11-26T10:25:04Z

dali/operators/reader/loader/coco_loader.h

+    rleFrString(&handle_, const_cast<char*>(str), h, w);
+  }
+
+  static constexpr RLE null_handle() { return {0, 0, 0, nullptr}; }


Do you need this? Won't the default just work?

I remember it didn't work (ptr was not nullptr, if I remember correctly). I could double-check

It seems to work, so I am removing it
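
Whether "the default just works" depends on how the handle is initialized. The sketch below shows the C++ rule involved, under the assumption that the wrapper value-initializes its handle. `CRle` is a stand-in whose layout (three sizes and a pointer) is assumed from the `{0, 0, 0, nullptr}` initializer in the diff; it is not cocoapi's definition.

```cpp
// Stand-in aggregate with the field layout implied by null_handle() above.
struct CRle {
  unsigned long h, w, m;
  unsigned *cnts;
};

// Value-initialization ("{}") zeroes every member of an aggregate, so the
// pointer is guaranteed to be nullptr. A local declared as plain "CRle r;"
// would instead have indeterminate members.
inline CRle NullRle() { return CRle{}; }
```

With value-initialization, an explicit null_handle() returning `{0, 0, 0, nullptr}` produces the same value as `CRle{}`.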

mzient · 2020-11-26T10:27:13Z

dali/operators/reader/loader/coco_loader.cc

-            sample_rles_idx.push_back(objects_in_sample);
-            sample_rles.push_back(std::move(annotation.rle_.rle_));
+            masks_rles_idx_.push_back(objects_in_sample);
+            masks_rles_.emplace_back(std::move(annotation.rle_));


Nitpick:

Suggested change

-masks_rles_.emplace_back(std::move(annotation.rle_));
+masks_rles_.push_back(std::move(annotation.rle_));

mzient · 2020-11-26T10:30:42Z

dali/operators/reader/loader/coco_loader.h

+  std::vector<RLEMask> masks_rles_;
+  std::vector<int> masks_rles_idx_;
+  std::vector<int64_t> masks_offset_;  // per sample offset
+  std::vector<int64_t> masks_count_;  // per sample size


Nitpick

Suggested change

-std::vector<int64_t> masks_count_;  // per sample size
+std::vector<int64_t> mask_counts_;  // number of masks per sample

mzient · 2020-11-26T10:40:31Z

dali/operators/reader/loader/coco_loader.h

-  std::vector<std::vector<int>> masks_rles_idx_;
+  std::vector<RLEMask> masks_rles_;
+  std::vector<int> masks_rles_idx_;
+  std::vector<int64_t> masks_offset_;  // per sample offset


Suggested change

-std::vector<int64_t> masks_offset_;  // per sample offset
+std::vector<int64_t> mask_offsets_;  // per-sample offsets of masks

mzient · 2020-11-26T13:01:25Z

dali/operators/reader/loader/coco_loader.cc

    int64_t polygons_sample_offset = polygon_data_.size();
    int64_t polygons_sample_count = 0;
    int64_t vertices_sample_offset = vertices_data_.size();
    int64_t vertices_sample_count = 0;
+    int64_t masks_offset = masks_rles_.size();
+    int64_t masks_count = 0;


Suggested change

-int64_t masks_count = 0;
+int64_t mask_count = 0;

mzient · 2020-11-26T13:07:36Z

dali/operators/reader/loader/coco_loader.cc

@@ -392,14 +504,14 @@ void CocoLoader::ParseJsonAnnotations() {

  for (auto &image_info : image_infos) {
    int objects_in_sample = 0;
-    std::vector<int> sample_rles_idx;
-    std::vector<std::string> sample_rles;
    int64_t polygons_sample_offset = polygon_data_.size();


This reads as "offset of a sample in polygons". I think it means the opposite and then it should be sample_polygon(s)_offset, sample_polygon_count, sample_vertex_offset / vertices_offset, sample_vertex_count etc.
Offsets can be preceded by a plural form because it's the offset of a block/range of polygons - "polygons" is the object.
Count should be preceded by a singular form, because we're going into the aforementioned region and counting individual objects. If you're terribly attached to a plural form, then it could be polygons_size / vertices_size.
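
The naming discussion is easier to follow with the storage layout in view: all items live in one flat vector, and each sample owns a contiguous range described by an offset and a count. The sketch below illustrates that layout only. `FlatSamples`, `item_offsets` and `item_counts` are hypothetical names, and strings stand in for masks or polygons.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical container: sample i owns items in the range
// [item_offsets[i], item_offsets[i] + item_counts[i]).
struct FlatSamples {
  std::vector<std::string> items;     // stand-in for masks or polygons
  std::vector<int64_t> item_offsets;  // per-sample index of the first item
  std::vector<int64_t> item_counts;   // per-sample number of items

  void AddSample(const std::vector<std::string> &sample_items) {
    // The offset is the flat size before this sample's items are appended.
    item_offsets.push_back(static_cast<int64_t>(items.size()));
    item_counts.push_back(static_cast<int64_t>(sample_items.size()));
    items.insert(items.end(), sample_items.begin(), sample_items.end());
  }

  // Copies out the items that belong to one sample.
  std::vector<std::string> Sample(size_t idx) const {
    auto first = items.begin() + item_offsets.at(idx);
    return std::vector<std::string>(first, first + item_counts.at(idx));
  }
};
```

Read this way, an offset locates a block of items within the flat array, while a count says how many individual items a sample has, which matches the plural and singular distinction made above.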

mzient · 2020-11-26T13:07:56Z

qa/setup_dali_extra.sh

@@ -2,7 +2,7 @@

 # Fetch test data
 export DALI_EXTRA_PATH=${DALI_EXTRA_PATH:-/opt/dali_extra}
-export DALI_EXTRA_URL=${DALI_EXTRA_URL:-"https://github.com/NVIDIA/DALI_extra.git"}
+export DALI_EXTRA_URL="https://github.com/jantonguirao/DALI_extra.git"


Fix before merging.

mzient · 2020-11-26T13:08:08Z

qa/setup_dali_extra.sh

@@ -13,6 +13,5 @@ if [ ! -d "$DALI_EXTRA_PATH" ] ; then
 fi

 pushd "$DALI_EXTRA_PATH"
-git fetch origin ${DALI_EXTRA_VERSION}
-git checkout ${DALI_EXTRA_VERSION}
+git checkout rle_uncompressed


Fix before merging.

awolant

With the DALI_extra fixes that Michał pointed out, it looks OK.

jantonguirao · 2020-11-27T12:51:39Z

!build

dali-automaton · 2020-11-27T12:55:39Z

CI MESSAGE: [1839138]: BUILD STARTED

dali-automaton · 2020-11-27T13:02:06Z

CI MESSAGE: [1839138]: BUILD FAILED

Signed-off-by: Joaquin Anton <janton@nvidia.com>

jantonguirao · 2020-11-27T13:13:05Z

!build

dali-automaton · 2020-11-27T13:15:27Z

CI MESSAGE: [1839192]: BUILD STARTED

dali-automaton · 2020-11-27T15:49:41Z

CI MESSAGE: [1839192]: BUILD PASSED

jantonguirao force-pushed the coco_uncompressed_rle branch from 49aa2fa to 538c4db on November 18, 2020 15:15

jantonguirao changed the title from "[WIP] COCOReader: Support for uncompressed RLE masks" to "COCOReader: Support for uncompressed RLE masks" on Nov 19, 2020

mzient reviewed Nov 19, 2020

JanuszL reviewed Nov 19, 2020

jantonguirao force-pushed the coco_uncompressed_rle branch from dad74cb to 853328a on November 20, 2020 12:19

jantonguirao force-pushed the coco_uncompressed_rle branch from 84511f1 to 2b1c1e5 on November 23, 2020 15:45

jantonguirao force-pushed the coco_uncompressed_rle branch from 2b1c1e5 to 72b34d1 on November 24, 2020 11:52

jantonguirao added 2 commits November 24, 2020 12:54

Save rle masks with preprocessed annotations.

5e77d58

Signed-off-by: Joaquin Anton <janton@nvidia.com>

Uncompressed RLE counts handling

d569d58

Signed-off-by: Joaquin Anton <janton@nvidia.com>

JanuszL reviewed Nov 25, 2020

JanuszL approved these changes Nov 25, 2020

Code review fixes

ebdf7d7

Signed-off-by: Joaquin Anton <janton@nvidia.com>

jantonguirao force-pushed the coco_uncompressed_rle branch from ddfe592 to ebdf7d7 on November 25, 2020 17:07

mzient reviewed Nov 26, 2020

awolant assigned awolant and unassigned awolant Nov 26, 2020

awolant self-requested a review November 26, 2020 10:37

mzient reviewed Nov 26, 2020

awolant approved these changes Nov 27, 2020

jantonguirao force-pushed the coco_uncompressed_rle branch from 80e085e to e829cdf on November 27, 2020 12:34

Code review fixes

57171bd

Signed-off-by: Joaquin Anton <janton@nvidia.com>

jantonguirao force-pushed the coco_uncompressed_rle branch from e829cdf to 57171bd on November 27, 2020 13:13

nitish-awasthi approved these changes Nov 27, 2020

jantonguirao merged commit 737b7b9 into NVIDIA:master Nov 30, 2020
