Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FW iterators handling of variable batch size and improve ES examples #2641

Merged
merged 4 commits into from
Jan 29, 2021

Conversation

JanuszL
Copy link
Contributor

@JanuszL JanuszL commented Jan 27, 2021

  • makes FW iterator ExternalSource examples handle properly StopIteration
  • adds a basic check if FW iterators handles properly variable batch size only when there is not externally provided the size of the iterator. Otherwise, it should assert

Signed-off-by: Janusz Lisiecki jlisiecki@nvidia.com

Why we need this PR?

Pick one, remove the rest

  • It adds FW iterators handling of variable batch size and improve ES examples

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    makes FW iterator ExternalSource examples handle properly StopIteration
    adds a basic check if FW iterators handles properly variable batch size only when there is not externally provided the size of the iterator. Otherwise, it should assert
  • Affected modules and functionalities:
    FW data loading examples
    FW iterators
  • Key points relevant for the review:
    NA
  • Validation and testing:
    tests are added
  • Documentation (including examples):
    docs and examples have been updated

Relates to #2639
JIRA TASK: [NA]

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@JanuszL
Copy link
Contributor Author

JanuszL commented Jan 27, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2021336]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2021336]: BUILD FAILED

@JanuszL
Copy link
Contributor Author

JanuszL commented Jan 28, 2021

!build

@JanuszL JanuszL linked an issue Jan 28, 2021 that may be closed by this pull request
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2022819]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2022819]: BUILD FAILED

…ples

- makes FW iterator ExternalSource examples handle properly StopIteration
- adds a basic check if FW iterators handles properly variable batch size
  only when there is not externally provided the size of the iterator. Otherwise,
  it should assert

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL
Copy link
Contributor Author

JanuszL commented Jan 28, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2023433]: BUILD STARTED

@@ -234,7 +234,7 @@ class ExternalSource():

If set to False, StopIteration is raised when the end of data is reached.
This flag requires that the ``source`` is a collection, for example, an iterable object where
``iter(source)`` returns a fresh iterator on each call or a gensource erator function.
``iter(source)`` returns a fresh iterator on each call or a gensource iterator function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
``iter(source)`` returns a fresh iterator on each call or a gensource iterator function.
``iter(source)`` returns a fresh iterator on each call, or a generator function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 24 to 25
"whenever possible. This may lead, in some situations, to miss some " +
"samples or return duplicated ones. Check the Sharding section of the "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lead to something

Suggested change
"whenever possible. This may lead, in some situations, to miss some " +
"samples or return duplicated ones. Check the Sharding section of the "
"whenever possible. This may lead, in some situations, to missing some " +
"samples or returning duplicated ones. Check the Sharding section of the "

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2023433]: BUILD PASSED

@@ -1051,6 +1052,33 @@ def get_data():
counter += 1
assert counter == iter_limit * runs

def check_external_source_variable_size(Iterator, *args, iter_size=-1, **kwargs):
batch_size = 4
Copy link
Contributor

@mzient mzient Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick:

Suggested change
batch_size = 4
max_batch_size = 4

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 1061 to 1068
def get_data():
nonlocal i
if i == iter_limit:
i = 0
raise StopIteration
out = [[np.random.randint(0, 255, size = test_data_shape, dtype = np.uint8) for _ in range(random.randint(1, batch_size))]]
i += 1
return out
Copy link
Contributor

@mzient mzient Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_data():
nonlocal i
if i == iter_limit:
i = 0
raise StopIteration
out = [[np.random.randint(0, 255, size = test_data_shape, dtype = np.uint8) for _ in range(random.randint(1, batch_size))]]
i += 1
return out
dataset = [[[np.random.randint(0, 255, size = test_data_shape, dtype = np.uint8) for _ in range(random.randint(1, batch_size))]] for _ in range(iter_limit)]

This will allow you to actually compare outputs against this pre-generated data. Aslo, passing dataset as the source will raise StopIteration automatically.

Copy link
Contributor Author

@JanuszL JanuszL Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not possible with the current ES behavior.

Comment on lines 1078 to 1079
for __ in enumerate(it):
counter += 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get the data here and compare with dataset, so you'll know for sure that data is intact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment above.

@@ -163,7 +164,7 @@
"eii = ExternalInputIterator(batch_size, 0, 1)\n",
"pipe = ExternalSourcePipeline(batch_size=batch_size, num_threads=2, device_id = 0,\n",
" external_data = eii)\n",
"pii = PyTorchIterator(pipe, size=len(eii), last_batch_padded=True, last_batch_policy=LastBatchPolicy.PARTIAL)\n",
"pii = PyTorchIterator(pipe, size=-1, last_batch_padded=True, last_batch_policy=LastBatchPolicy.PARTIAL)\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it mandatory?

@JanuszL
Copy link
Contributor Author

JanuszL commented Jan 28, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2024247]: BUILD STARTED

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Jan 28, 2021

This pull request introduces 1 alert when merging 402f972d278b6417557406ae5572207f34761218 into 0af127a - view on LGTM.com

new alerts:

  • 1 for Syntax error

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL
Copy link
Contributor Author

JanuszL commented Jan 28, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2025312]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2025312]: BUILD PASSED

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL
Copy link
Contributor Author

JanuszL commented Jan 29, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2026640]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2026640]: BUILD PASSED

Copy link
Member

@szalpal szalpal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two nitpicks also apply to remaining changed functions.

Please double check the set_outputs scope, I don't know how it should look like.

Otherwise LGTM

Comment on lines 26 to 37
def CreateCOCOReaderPipeline(data_paths, batch_size, num_threads, shard_id, num_gpus, random_shuffle, stick_to_shard, shuffle_after_epoch, pad_last_batch, initial_fill=1024, return_labels=False):
pipe = Pipeline(batch_size = batch_size, num_threads = num_threads, device_id = 0, prefetch_queue_depth=1)
with pipe:
_, _, labels, ids = fn.coco_reader(file_root = data_paths[0], annotations_file=data_paths[1],
shard_id = shard_id, num_shards = num_gpus, random_shuffle=random_shuffle,
image_ids=True, stick_to_shard=stick_to_shard,shuffle_after_epoch=shuffle_after_epoch,
pad_last_batch=pad_last_batch, initial_fill=initial_fill, name="Reader")
if return_labels:
pipe.set_outputs(labels, ids)
else:
pipe.set_outputs(ids)
return pipe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two nitpicks:
One is that formatting around = is inconsistent in this function, maybe it would be good idea to make it consistent. The second is that maybe it would be better to use snake_case for the name, since it's a function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +33 to +37
if return_labels:
pipe.set_outputs(labels, ids)
else:
pipe.set_outputs(ids)
return pipe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't pipe.put_outputs be also in the scope of with?

pipe = Pipeline(batch_size = max_batch_size, num_threads = 1, device_id = 0)
with pipe:
outs = fn.external_source(source = get_data, num_outputs=1)
pipe.set_outputs(*outs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I don't really now precisely how this should work, if the set_outputs should be in the scope of with or not... Just putting attention to that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scope matters for the operators with side effects. I don't think set_outputs has any.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL
Copy link
Contributor Author

JanuszL commented Jan 29, 2021

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2027748]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2027748]: BUILD PASSED

@JanuszL JanuszL merged commit 3a4fc33 into NVIDIA:master Jan 29, 2021
@JanuszL JanuszL deleted the fix_external_examples branch January 29, 2021 23:18
@JanuszL JanuszL mentioned this pull request May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The pytorch-external example seems to be wrong
4 participants