
Add operators for batch reordering #2417

Merged: 3 commits, Oct 30, 2020
Conversation

@mzient (Contributor) commented Oct 29, 2020:

  • BatchPermutation - obtains random indices of samples
  • PermuteBatch - reorders tensors within a batch according to given list of indices.
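In plain terms, `PermuteBatch` gathers whole samples: output sample `i` is a copy of input sample `indices[i]`. A minimal Python sketch of that semantic (the helper below is a hypothetical stand-in for illustration, not DALI code):

```python
def permute_batch(batch, indices):
    """output[i] = input[indices[i]]; repetitions and omissions are allowed."""
    for src in indices:
        if not 0 <= src < len(batch):
            raise IndexError(f"index {src} out of range for batch of {len(batch)}")
    return [batch[src] for src in indices]

samples = ["a", "b", "c", "d"]
print(permute_batch(samples, [3, 0, 0, 2]))  # ['d', 'a', 'a', 'c']
```

Note that because repetitions are allowed, this is a gather, not necessarily a permutation in the strict sense.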

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Why do we need this PR?

  • It adds a new feature needed for mosaicing, soft labels, etc.

What happened in this PR?


  • What solution was applied:
    • BatchPermutation operator - generates a batch of scalar indices in the range 0 to batch_size-1
    • PermuteBatch operator - copies tensors from the input at the given indices
  • Affected modules and functionalities:
    • N/A
  • Key points relevant for the review:
    • N/A
  • Validation and testing:
    • Python test
  • Documentation (including examples):
    • Docstrings, no examples.

JIRA TASK: DALI-1706

- BatchPermutation - obtains random indices of samples
- PermuteBatch - reorders tensors within a batch.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@mzient mzient requested a review from a team October 29, 2020 20:29
int src = indices_[i];
tp.AddWork([&, i, src, size](int tid) {
output.SetMeta(i, input.GetMeta(i));
memcpy(output[i].raw_mutable_data(), input[src].raw_data(), size);
Reviewer:

Why not:

Suggested change
memcpy(output[i].raw_mutable_data(), input[src].raw_data(), size);
cudaStream_t stream = 0;
output[i].Copy(input[src], stream);

Author:

I had some problems with it, but I think it turned out to be something else. I can give it another shot.

R"(If true, the output can contain repetitions and omissions.)", false);

void BatchPermutation::NoRepetitions(int N) {
tmp_out_.resize(N);
Reviewer:

Can you make the tmp_out_ an argument?

Author:

I can.
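For context, the two index-generation modes under discussion (with and without repetitions, per the `allow_repetitions`-style argument quoted above) can be sketched in Python. This is an illustration of the contract only; the function name is an assumption, not the PR's implementation:

```python
import random

def make_indices(batch_size, allow_repetitions=False, rng=random):
    """Generate per-sample indices in [0, batch_size)."""
    if allow_repetitions:
        # Independent draws: some indices may repeat, others may be omitted.
        return [rng.randrange(batch_size) for _ in range(batch_size)]
    # No repetitions: a true permutation, every index appears exactly once.
    out = list(range(batch_size))
    rng.shuffle(out)
    return out

idx = make_indices(8)
assert sorted(idx) == list(range(8))  # every index used exactly once
```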

else
NoRepetitions(N);
for (int i = 0; i < N; ++i) {
out_view.data[i][0] = tmp_out_[i];
Reviewer:

tmp_out_ comes out of the blue, see my previous comment.

Author:

fixed.

@mzient commented Oct 30, 2020:

!build

@dali-automaton (Collaborator):
CI MESSAGE: [1748765]: BUILD STARTED

Add core/random.h include file for random sequence generators.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton (Collaborator):
CI MESSAGE: [1748765]: BUILD FAILED

.AddArg("indices", R"(List of indices, matching current batch size, or a batch
of scalars representing indices of the tensors in the input batch.

The indices must be within ``[0..batch_size)`` range. Repetitions and omissions are allowed.)",
Reviewer:

Suggested change
The indices must be within ``[0..batch_size)`` range. Repetitions and omissions are allowed.)",
The indices must be within ``[0, batch_size)`` range. Repetitions and omissions are allowed.)",

auto &tp = ws.GetThreadPool();
int N = indices_.size();
for (int i = 0; i < N; i++) {
auto size = volume(output_shape.tensor_shape_span(i));
Reviewer:

we have

auto size = output_shape.tensor_size(i);

auto size = volume(output_shape.tensor_shape_span(i));
int src = indices_[i];
tp.AddWork([&, i, src](int tid) {
output.SetMeta(i, input.GetMeta(i));
Reviewer:

I think Copy should also copy the meta. Isn't that the case?

Author:

SetMeta will set it on the underlying TensorList, if the TensorVector is in a contiguous state - but there's a bug, it should be GetMeta(src).
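The bug mentioned here is easy to state abstractly: since output sample `i` receives the data of sample `src`, it must also receive that sample's metadata, not its own. A hypothetical Python sketch of the invariant (not DALI code; names are assumptions):

```python
def permute_with_meta(data, meta, indices):
    """Both the data and the per-sample metadata must follow the source index."""
    out_data = [data[src] for src in indices]
    # Correct: metadata comes from the *source* sample (GetMeta(src))...
    out_meta = [meta[src] for src in indices]
    # ...not from the destination index i, which was the bug being discussed.
    return out_data, out_meta

data = [b"img0", b"img1"]
meta = [{"origin": "file0"}, {"origin": "file1"}]
d, m = permute_with_meta(data, meta, [1, 0])
assert m == [{"origin": "file1"}, {"origin": "file0"}]
```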

int element_size = output.type().size();

for (int i = 0; i < N; i++) {
auto size = volume(out_shape.tensor_shape_span(i)) * element_size;
Reviewer:

Suggested change
auto size = volume(out_shape.tensor_shape_span(i)) * element_size;
auto size = out_shape.tensor_size(i) * element_size;


bool SetupImpl(vector<OutputDesc> &outputs, const workspace_t<Backend> &ws) override {
outputs.resize(1);
auto &input = ws.template InputRef<Backend>(0);
Reviewer:

Suggested change
auto &input = ws.template InputRef<Backend>(0);
const auto &input = ws.template InputRef<Backend>(0);

to make sure we don't magically change the type of the input

Author:

In setup it won't happen, because the workspace is const-qualified. I'll change it in Run, though.

public:
explicit PermuteBatch(const OpSpec &spec)
: PermuteBatchBase<GPUBackend>(spec)
, sg_(1<<18, spec.GetArgument<int>("batch_size")) {}
Reviewer:

Can you run scatter gather with fewer samples than requested here? In other words, is this specifying max size?

Author:

Yes.

out[i] = x;
}
}
// we're above hi now - no fixed points posisble
Reviewer:

Suggested change
// we're above hi now - no fixed points posisble
// we're above hi now - no fixed points possible
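The comment concerns fixed points, i.e. positions where `out[i] == i` (a sample mapped back to itself). As one hedged illustration of the concept only (the PR's actual range-based algorithm is different; resampling is just the simplest way to show the property):

```python
import random

def permutation_no_fixed_points(n, rng=random):
    """Shuffle until no element stays in place (out[i] != i for all i)."""
    if n == 1:
        raise ValueError("a single-element permutation always has a fixed point")
    out = list(range(n))
    while any(out[i] == i for i in range(n)):
        rng.shuffle(out)
    return out

p = permutation_no_fixed_points(6)
assert all(p[i] != i for i in range(6))   # no sample maps to itself
assert sorted(p) == list(range(6))        # still a true permutation
```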

pipe.set_outputs(data, fn.permute_batch(data, indices=perm), perm)
pipe.build()

for i in range(10):
Reviewer:

Suggested change
for i in range(10):
num_iters = 10
for i in range(10):

Maybe make it an argument of the function.

Author:

It doesn't matter - in general, the more the better and there's no need to make it a test argument.
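The property such a test checks can be stated compactly: in every iteration, each output sample must equal the input sample selected by the permutation. A schematic, framework-free version of that check (not the actual DALI test; names are assumptions):

```python
import random

def check_permutation_property(num_iters=10, batch_size=4):
    """Verify out[i] == batch[perm[i]] over several iterations; return check count."""
    checks = 0
    for _ in range(num_iters):
        batch = [random.random() for _ in range(batch_size)]
        perm = list(range(batch_size))
        random.shuffle(perm)
        out = [batch[src] for src in perm]  # what permute_batch should produce
        for i in range(batch_size):
            assert out[i] == batch[perm[i]]
            checks += 1
    return checks

check_permutation_property()
```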

Fix support for non-tensor permutation.
Add test for raising errors for out-of-bounds tensor index.

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@mzient commented Oct 30, 2020:

!build

@dali-automaton (Collaborator):
CI MESSAGE: [1749098]: BUILD STARTED

@dali-automaton (Collaborator):
CI MESSAGE: [1749098]: BUILD PASSED

@mzient mzient merged commit 8234d35 into NVIDIA:master Oct 30, 2020