Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add implicit scope to batch_permutation #5063

Merged
merged 6 commits into from
Sep 29, 2023

Conversation

stiepan
Copy link
Member

@stiepan stiepan commented Sep 27, 2023

Category:

Bug fix /New feature (non-breaking change which adds functionality)

Description:

The PR makes the following code work.

batch =  fn.decoders.image(batch, device="mixed")
if fn.random.coin_flip():
        permuted_batch = fn.permute_batch(batch, indices=fn.batch_permutation())

Without it, the fn.permute_batch will fail due indicies exceeding the valid indicies for the batch in the inner scope.

Even though we call fn.batch_permutation() inside the if body, the operator will run as if it was called in the outer scope - that is it will produce a permutation for a full batch size. Then, that permutation will be split according to the fn.random.coin_flip() predicate.

Unfortunately, randomly splitting permutation in two smaller lists usually does not produce a valid permutations for smaller lists, i.e. while [3, 0, 2, 1] is a valid permutation of four elements, splitting it in two [3, 0], [2, 1] does not produce valid permutations of two elements.

The hoisting of fn.batch_permutation into an outer scope happens becuase the fn.batch_permutation does not accept any tensor arguments - those args serve as a source of the "local scope" batch size. This PR fixes the problem by introducing an implicit "_scope" argument that is automatically injected with a dummy batch whose size matches the local batch size.

Additional information:

Affected modules and functionalities:

  • OpSchema - by introduction of HiddenOptional arguments.
  • fn.batch_permutation - it now has new parent schema with the _scope arg.
  • ops.py - code that automatically injects a referential scope to those operators that have the _scope argument.

A possible follow-up would be to use the same mechanism for any operator that can potentially have no other tensor arugments. For example random generators - in their case, hositing to global scope and then splitting the output does not impact the correctness, but may incur some performence penalty.

Key points relevant for the review:

Is it tested properly? Is the hidden _scope arg hidden in the docs? Anywhere else where you'd rather not see a hidden argument that I should check?

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-3495

@stiepan
Copy link
Member Author

stiepan commented Sep 27, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10010511]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10016104]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10016104]: BUILD PASSED

@klecki klecki self-assigned this Sep 28, 2023
@mzient mzient self-assigned this Sep 28, 2023
@@ -64,6 +68,20 @@ enum class InputDevice : uint8_t {
GPU = 2,
};

inline void EnforceArgNameFormat(const std::string &name, bool hide_argument) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, why add functions AddHidden... instead of just hiding all arguments that start with _?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I've changed it for an arg to be hidden iff its name starts with "_".

#include "dali/pipeline/operator/operator.h"

DALI_SCHEMA(ImplicitScopeAttr)
.AddHiddenOptionalArg<int>("_scope", R"code(Any CPU batch to infer the current batch size from.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.AddHiddenOptionalArg<int>("_scope", R"code(Any CPU batch to infer the current batch size from.
.AddHiddenOptionalArg<int>("_scope", R"code(Any batch to infer the current batch size from.

That's just current limitation that we don't need to repeat here and may lift in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan stiepan force-pushed the permute_batch_conditional_scope branch from b74bc95 to 32a3a81 Compare September 28, 2023 21:05
@stiepan
Copy link
Member Author

stiepan commented Sep 28, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10030396]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10030396]: BUILD FAILED

Comment on lines 71 to 73
inline bool HideArgument(const std::string &name) {
return name.size() && name[0] == '_';
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move it inside the Schema class as a static function?
Also, the imperative name doesn't look like a boolean query.
Alternatives: ShouldHideArgument, ArgumentHidden, IsArgumentHidden or perhaps just IsHidden(const string &arg_name)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

…nction

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan
Copy link
Member Author

stiepan commented Sep 29, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10037057]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10030396]: BUILD PASSED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10037057]: BUILD PASSED

@stiepan stiepan merged commit e5eb489 into NVIDIA:main Sep 29, 2023
5 checks passed
JanuszL pushed a commit to JanuszL/DALI that referenced this pull request Oct 13, 2023
* Add option to hide arguments from schema
* Add schema with a `_scope` and use it in batch_permutation
* Expose HasArgument method in the pipeline backend
* Automatically inject scope argument in conditional mode

---------

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants