Add last_batch_policy to the framework iterator #2269
Conversation
Force-pushed from 39fd3cc to 690a99e
!build
CI MESSAGE: [1618455]: BUILD STARTED
CI MESSAGE: [1618455]: BUILD FAILED
CI MESSAGE: [1618455]: BUILD PASSED
!build
CI MESSAGE: [1623519]: BUILD STARTED
CI MESSAGE: [1623519]: BUILD FAILED
Force-pushed from 2aaed82 to 3439c67
!build
CI MESSAGE: [1623602]: BUILD STARTED
CI MESSAGE: [1623602]: BUILD PASSED
Force-pushed from 3439c67 to 64295c7
!build
CI MESSAGE: [1626200]: BUILD STARTED
CI MESSAGE: [1626200]: BUILD FAILED
CI MESSAGE: [1626200]: BUILD PASSED
@@ -253,18 +256,91 @@ def test_mxnet_iterator_not_fill_last_batch_pad_last_batch():
    assert len(next_img_ids_list_set) == data_size
    assert len(set(next_mirrored_data)) == 1

def check_mxnet_iterator_pass_reader_name(shards_num, pipes_number, batch_size, stick_to_shard, pad, iters, fill_last_batch):
LastBatchPolicy = None
????!
Fixed
if pad and pipes_number == shards_num:
    assert len(set.intersection(*out_set)) == 0, "Shards should not overlap in the epoch"
    if last_batch_policy == LastBatchPolicy.DROP:
This is set afterwards with a global
?? Why can't we just import the LastBatchPolicy at the beginning?
I have added just `from nvidia.dali.plugin.base_iterator import LastBatchPolicy as LastBatchPolicy`.
But I don't know if we want to encourage users to use this internal, undocumented base module.
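For reference, a minimal sketch of how that import can be exercised; the module path is the one quoted above, and the member names FILL, DROP, and PARTIAL come from this PR's description (everything else is illustrative):

# Sketch only: import the policy enum from the internal base module mentioned above.
from nvidia.dali.plugin.base_iterator import LastBatchPolicy

# The three policies introduced by this PR:
for policy in (LastBatchPolicy.FILL, LastBatchPolicy.DROP, LastBatchPolicy.PARTIAL):
    print(policy.name)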
Force-pushed from b1cf9f8 to 6948e9a
!build
Force-pushed from 44cf9ac to 701fc47
CI MESSAGE: [1649313]: BUILD STARTED
CI MESSAGE: [1649313]: BUILD FAILED
Force-pushed from 701fc47 to d1289f0
!build
CI MESSAGE: [1650799]: BUILD STARTED
@unique
class LastBatchPolicy(Enum):
    """
    Describes the last batch policy behavior when there are no enough samples in the epoch
Suggested change:
- Describes the last batch policy behavior when there are no enough samples in the epoch
+ Describes the last batch policy behavior when there are not enough samples in the epoch
Done
class LastBatchPolicy(Enum):
    """
    Describes the last batch policy behavior when there are no enough samples in the epoch
    to fully fill it
Suggested change:
- to fully fill it
+ to fill a whole batch.
Done
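Putting the two accepted suggestions together, a rough sketch of what the enum could look like; only the names FILL, DROP, and PARTIAL come from the PR, while the numeric values and one-line descriptions are assumptions:

from enum import Enum, unique

@unique
class LastBatchPolicy(Enum):
    """
    Describes the last batch policy behavior when there are not enough samples in the epoch
    to fill a whole batch.
    """
    FILL = 0     # keep the last batch full, e.g. by padding it
    DROP = 1     # drop the last batch if it cannot be fully filled
    PARTIAL = 2  # return the last batch only partially filled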
# calculate each shard size for each id, and check how many samples are left by subtracting
# the shard size from the iterator counter, then go through all GPUs and check how much data needs to be dropped
left = self.batch_size - (self._counter - self._shard_sizes_per_gpu_initial[self._shards_id])
if_drop = np.less(left, self.batch_size)
return if_drop, left
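A standalone illustration of the arithmetic above with made-up numbers; the variable names mirror the snippet, the values are purely hypothetical:

import numpy as np

batch_size = 4
shard_size = 10   # stand-in for self._shard_sizes_per_gpu_initial[self._shards_id]
counter = 12      # stand-in for self._counter after reading three batches

left = batch_size - (counter - shard_size)   # 4 - (12 - 10) = 2 relevant samples remain
if_drop = np.less(left, batch_size)          # True -> the last batch is not fully filled
print(if_drop, left)                         # True 2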
def _advance_and_check_drop_last(self):
    """
    Checks if the current batch is not fully filled and if drop it
Suggested change:
- Checks if the current batch is not fully filled and if drop it
+ Checks whether the current batch is not fully filled and whether it should be dropped.
Done
`auto_reset` don't work in such case. It works with only one pipeline inside
the iterator.
Mutually exclusive with `reader_name` argument
reader_name : str, default = None
Name of the reader which will be queried to the shard size, number of shards and
all other properties necessary to count properly the number of relevant and padded
samples that iterator needs to deal with. It automatically sets `fill_last_batch` and
`last_batch_padded` accordingly to match the reader's configuration
samples that iterator needs to deal with. It automatically sets `last_batch_policy` to
I don't understand the last sentence. Can you elaborate?
How about now?
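To make the intended usage concrete, here is a hedged sketch (not taken from the PR) of constructing an iterator with `reader_name`; it assumes the PyTorch plugin's `DALIGenericIterator` and a hypothetical, already-built pipeline `pipe` whose reader op was created with name="Reader":

from nvidia.dali.plugin.base_iterator import LastBatchPolicy
from nvidia.dali.plugin.pytorch import DALIGenericIterator

# `pipe` is assumed to contain a reader operator named "Reader".
dali_iter = DALIGenericIterator(
    pipe,
    ["data", "label"],
    reader_name="Reader",                       # shard size, shard count and padding queried from this reader
    last_batch_policy=LastBatchPolicy.PARTIAL)  # hand the epoch tail to the framework only partially filled

for batch in dali_iter:
    pass  # consume the data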
@@ -70,7 +70,7 @@ def get_train_dali_loader(args, default_boxes, local_seed):
    train_pipe,
    ["images", "boxes", "labels"],
    reader_name="Reader",
    fill_last_batch=True)
    last_batch_policy=LastBatchPolicy.PARTIAL)
You changed the policy. Is it intentional?
Fixed
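For context, the replacement rule stated in the PR description maps the old boolean onto the new enum (True means FILL, False means PARTIAL); a tiny, hypothetical helper illustrating that mapping:

from nvidia.dali.plugin.base_iterator import LastBatchPolicy

def policy_from_fill_last_batch(fill_last_batch):
    # Per the PR description: fill_last_batch=True corresponds to FILL,
    # fill_last_batch=False corresponds to PARTIAL.
    return LastBatchPolicy.FILL if fill_last_batch else LastBatchPolicy.PARTIAL

assert policy_from_fill_last_batch(True) == LastBatchPolicy.FILL
assert policy_from_fill_last_batch(False) == LastBatchPolicy.PARTIAL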
"\n", | ||
"`last_batch_padded` and `last_batch_policy` are set here only for the demonstration purposes. The user may write any custom code and change the epoch size epoch to epoch. In that case, it is recommended to set `size` to -1 and let the iterator just wait for StopIteration exception from the `iter_setup`.\n", | ||
"\n", | ||
"The `last_batch_padded` here tells the iterator that the difference between data set size and batch size alignment is padded by real data that could be skipped when provided to the framework (`last_batch_policy`):" |
Suggested change:
- "The `last_batch_padded` here tells the iterator that the difference between data set size and batch size alignment is padded by real data that could be skipped when provided to the framework (`last_batch_policy`):"
+ "The `last_batch_padded` here tells the iterator that the difference between dataset size and batch size alignment is padded by real data that could be skipped when provided to the framework (`last_batch_policy`):"
Done
@@ -141,9 +141,9 @@
"\n",
"In the end, let us see how it works.\n",
"\n",
"`last_batch_padded` and `fill_last_batch` are set here only for the demonstration purposes. The user may write any custom code and change the epoch size epoch to epoch. In that case, it is recommended to set `size` to -1 and let the iterator just wait for StopIteration exception from the `iter_setup`.\n",
"`last_batch_padded` and `last_batch_policy` are set here only for the demonstration purposes. The user may write any custom code and change the epoch size epoch to epoch. In that case, it is recommended to set `size` to -1 and let the iterator just wait for StopIteration exception from the `iter_setup`.\n",
Suggested change:
- "`last_batch_padded` and `last_batch_policy` are set here only for the demonstration purposes. The user may write any custom code and change the epoch size epoch to epoch. In that case, it is recommended to set `size` to -1 and let the iterator just wait for StopIteration exception from the `iter_setup`.\n",
+ "`last_batch_padded` and `last_batch_policy` are set here only for the demonstration purposes. The user may write any custom code and change the epoch size epoch to epoch. In that case, it is recommended to set `size` to -1 and let the iterator just wait for StopIteration exception from `iter_setup`.\n",
Done
"\n",
"The `last_batch_padded` here tells the iterator that the difference between data set size and batch size alignment is padded by real data that could be skipped when provided to the framework (`fill_last_batch`):"
"The `last_batch_padded` here tells the iterator that the difference between data set size and batch size alignment is padded by real data that could be skipped when provided to the framework (`last_batch_policy`):"
Suggested change:
- "The `last_batch_padded` here tells the iterator that the difference between data set size and batch size alignment is padded by real data that could be skipped when provided to the framework (`last_batch_policy`):"
+ "The `last_batch_padded` here tells the iterator that the difference between dataset size and batch size alignment is padded by real data that could be skipped when provided to the framework (`last_batch_policy`):"
Done
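As a hedged illustration of the notebook's advice (the parameter names come from the discussion above, the pipeline object is hypothetical): when the epoch size may change from epoch to epoch, the iterator can be given `size=-1` so it simply waits for StopIteration from `iter_setup`, while `last_batch_padded` describes how the tail of the data was produced:

from nvidia.dali.plugin.base_iterator import LastBatchPolicy
from nvidia.dali.plugin.pytorch import DALIGenericIterator

# `pipe` is a hypothetical pipeline that raises StopIteration in its iter_setup
# once the (possibly changing) epoch is exhausted.
dali_iter = DALIGenericIterator(
    pipe,
    ["data"],
    size=-1,                                    # unknown / variable epoch size
    last_batch_padded=True,                     # the tail is padded with duplicated real data
    last_batch_policy=LastBatchPolicy.PARTIAL)  # skip that padding when handing data to the framework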
docs/advanced_topics.rst (Outdated)
@@ -217,7 +217,10 @@ Here are the iterator options:
- | ``last_batch_padded``: Determines whether the tail of the data consists of data from the next
  shard (``False``) or is duplicated dummy data (``True``).
  | It is applicable when the shard size is not a multiple of the batch size,

- | ``last_batch_policy`` - whether the last batch should be full no matter if shard size is
Suggested change:
- - | ``last_batch_policy`` - whether the last batch should be full no matter if shard size is
+ - | ``last_batch_policy`` - Determines the policy about the last batch when the shard size is not
Done
docs/advanced_topics.rst (Outdated)
- | ``last_batch_policy`` - whether the last batch should be full no matter if shard size is
  divisible by the batch size.
  | Only partially filled with the data or dropped entirely if it
Suggested change:
- | Only partially filled with the data or dropped entirely if it
+ The possible options are:
+ - FILL ...
+ - DROP ...
+ - PARTIAL ...
Something like this would read better.
Done
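To make the three options concrete, a small example of how they differ in practice; the batch counts below follow the semantics described in this PR, but the exact behavior should be checked against the final documentation:

# Assume a shard of 10 samples and a batch size of 4 -> 2 full batches + 2 leftover samples.
shard_size, batch_size = 10, 4
full_batches, leftover = divmod(shard_size, batch_size)
print(full_batches, leftover)  # 2 2

# FILL    -> 3 batches; the last one is topped up to 4 samples
# PARTIAL -> 3 batches; the last one carries only the 2 remaining samples
# DROP    -> 2 batches; the 2 remaining samples are discarded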
CI MESSAGE: [1676861]: BUILD PASSED
try:
    self._first_batch = self.next()
except StopIteration:
    assert False, "It seems that there is not data in the pipeline, please check if last_batch_policy is set PARTIAL and batch size is bigger than the shard size"
Suggested change:
- assert False, "It seems that there is not data in the pipeline, please check if last_batch_policy is set PARTIAL and batch size is bigger than the shard size"
+ assert False, "It seems that there is no data in the pipeline. This may happen if `last_batch_policy` is set to PARTIAL and the requested batch size is greater than the shard size."
Done
try:
    self._first_batch = self.next()
except StopIteration:
    assert False, "It seems that there is not data in the pipeline, please check if last_batch_policy is set PARTIAL and batch size is bigger than the shard size"
Suggested change:
- assert False, "It seems that there is not data in the pipeline, please check if last_batch_policy is set PARTIAL and batch size is bigger than the shard size"
+ assert False, "It seems that there is no data in the pipeline. This may happen if `last_batch_policy` is set to PARTIAL and the requested batch size is greater than the shard size."
Done
try:
    self._first_batch = self.next()
except StopIteration:
    assert False, "It seems that there is not data in the pipeline, please check if last_batch_policy is set PARTIAL and batch size is bigger than the shard size"
Suggested change:
- assert False, "It seems that there is not data in the pipeline, please check if last_batch_policy is set PARTIAL and batch size is bigger than the shard size"
+ assert False, "It seems that there is no data in the pipeline. This may happen if `last_batch_policy` is set to PARTIAL and the requested batch size is greater than the shard size."
Done
try:
    self._first_batch = self.next()
except StopIteration:
    assert False, "It seems that there is not data in the pipeline, please check if last_batch_policy is set PARTIAL and batch size is bigger than the shard size"
Suggested change:
- assert False, "It seems that there is not data in the pipeline, please check if last_batch_policy is set PARTIAL and batch size is bigger than the shard size"
+ assert False, "It seems that there is no data in the pipeline. This may happen if `last_batch_policy` is set to PARTIAL and the requested batch size is greater than the shard size."
Done
"At the end let us see how it works. Please also notice the usage of `last_batch_padded` that tell iterator that the difference between data set size and batch size alignment is padded by real data that could be skipped at when provided to the framework (`fill_last_batch`):"
"In the end, let us see how it works.\n",
"\n",
"`last_batch_padded` and `last_batch_policy` are set here only for the demonstration purposes. The user may write any custom code and change the epoch size epoch to epoch. In that case, it is recommended to set `size` to -1 and let the iterator just wait for StopIteration exception from `iter_setup`.\n",
Suggested change:
- "`last_batch_padded` and `last_batch_policy` are set here only for the demonstration purposes. The user may write any custom code and change the epoch size epoch to epoch. In that case, it is recommended to set `size` to -1 and let the iterator just wait for StopIteration exception from `iter_setup`.\n",
+ "`last_batch_padded` and `last_batch_policy` are set here only for the demonstration purposes. The user may write any custom code and change the epoch size from epoch to epoch. In that case, it is recommended to set `size` to -1 and let the iterator just wait for StopIteration exception from `iter_setup`.\n",
Done
@@ -223,4 +224,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
missing empty line
Done
@@ -211,4 +216,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
and here
Done
Force-pushed from cfcfa47 to 049d631
!build
CI MESSAGE: [1708269]: BUILD STARTED
CI MESSAGE: [1708269]: BUILD FAILED
- adds a last_batch_policy which can be DROP, FILL, or PARTIAL
- replaces fill_last_batch, True means FILL, False PARTIAL
- when DROP policy is set the last batch is dropped when it cannot be fully filled

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Force-pushed from 049d631 to e0312de
!build
CI MESSAGE: [1708962]: BUILD STARTED
CI MESSAGE: [1708962]: BUILD PASSED
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>

Why we need this PR?
- Adds new feature: a `last_batch_policy` which can be DROP, FILL, or PARTIAL; deprecates `fill_last_batch`.

What happened in this PR?
- framework iterator
- NA
- new test cases are added
- [ Describe here if documentation and examples were updated. ]

JIRA TASK: [NA]