Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rotate per-frame #3820

Merged
merged 15 commits into from Apr 25, 2022
Merged

Rotate per-frame #3820

merged 15 commits into from Apr 25, 2022

Conversation

stiepan
Copy link
Member

@stiepan stiepan commented Apr 13, 2022

Signed-off-by: Kamil Tokarski ktokarski@nvidia.com

Category:

New feature (non-breaking change which adds functionality)

Description:

  • Adds support for sequence processing to rotate operator.
  • Adds support for per-frame tensor input to angle and axis parameters.
  • Adds sequence processing tests (where processing batches of expanded frames is a reference)
    • Test processing video input
    • Test processing synthetic sequence of random 3D inputs.

Additional information:

  • If not specified, the output shape is inferred from rotation parameters and the input shape: in case of per-frame arguments it requires some coalescing of different shapes of frames that belong to the same output sequence. The coalescing is performed by choosing: for each extent the maximal value from all frames, then the parity correction of the output shape is performed by majority vote.

Affected modules and functionalities:

  • warp base, to utilize SequenceOperator and pass around unfolded extents (to map frames to sequences when coalescing shape)
  • rotate params provider to perform shape coalescing
  • python tests of rotate op
  • sequences test utils - extending the framework so that it can account for shape coalescing in baseline pipline

Key points relevant for the review:

Checklist

Tests

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: ROTATE.10

JIRA TASK: DALI-2508

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@mzient mzient self-assigned this Apr 20, 2022
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan stiepan marked this pull request as ready for review April 20, 2022 09:07
@stiepan stiepan changed the title WIP: Rotate per-frame Rotate per-frame Apr 20, 2022
Comment on lines 94 to 95
// kernels::vec2shape reverses the extents, store them in that order in parity vector
parity[2 - i] = in_size[dominant_src_axis[i]] % 2;
Copy link
Contributor

@mzient mzient Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with storing it naturally and returning vec2shape(parity)? I think the code would be more readable that way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it has + operator defined:)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point :) But now that you've mentioned it, it also has a max. How about switching the order to x, y, z and returning two vectors? You could calculate the size on ivec2/3 and convert to TensorShape at the very end.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for pointing to it.

@stiepan
Copy link
Member Author

stiepan commented Apr 20, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4629623]: BUILD STARTED

Comment on lines 299 to 304
for (int dim_idx = 0; dim_idx < spatial_ndim; dim_idx++) {
bool should_be_odd = 2 * acc_parity[dim_idx] > num_frames;
if (acc_shape[dim_idx] % 2 != should_be_odd) {
acc_shape[dim_idx]++;
}
}
Copy link
Contributor

@mzient mzient Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if acc_shape was an ivec, you could do:

Suggested change
for (int dim_idx = 0; dim_idx < spatial_ndim; dim_idx++) {
bool should_be_odd = 2 * acc_parity[dim_idx] > num_frames;
if (acc_shape[dim_idx] % 2 != should_be_odd) {
acc_shape[dim_idx]++;
}
}
acc_shape += (acc_shape % 2) ^ (2 * acc_parity > num_frames);

(it must be ^, because operator != returns a single boolean).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 228 to 232
assert(len(arrays))
acc_max = arrays[0]
for array in arrays[1:]:
acc_max = np.maximum(acc_max, array)
return acc_max
Copy link
Contributor

@mzient mzient Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert(len(arrays))
acc_max = arrays[0]
for array in arrays[1:]:
acc_max = np.maximum(acc_max, array)
return acc_max
# find the elementwise maximum of the arrays in the list
return np.max(arrays, axis=0)

It will automatically treat the outer list as an extra dimension.
Perhaps you don't even need a function for that (see below).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

corrected_shapes = [
np.array(get_3d_output_size(math.radians(angle), axis, shape, True), dtype=np.int32)
for shape, angle, axis in zip(input_shapes, angles, axes)]
max_shape = maximum_array(no_correction_shapes)
Copy link
Contributor

@mzient mzient Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
max_shape = maximum_array(no_correction_shapes)
max_shape = np.max(no_correction_shapes, axis=0) # elementwise maximum

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 253 to 254
parity = sum([np.array([extent % 2 for extent in shape], dtype=np.int32)
for shape in corrected_shapes])
Copy link
Contributor

@mzient mzient Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
parity = sum([np.array([extent % 2 for extent in shape], dtype=np.int32)
for shape in corrected_shapes])
parity = np.sum(np.array(corrected_shapes, dtype=np.int32) % 2, axis=0)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4629623]: BUILD PASSED

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
for j in range(3):
if rotation[i, j] > maxv:
maxv = rotation[i, j]
dominant_axis[i] = j
Copy link
Contributor

@mzient mzient Apr 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're overwriting dominant_axis here, which is still used in the outer loop to get maxv.
Edit: this is not important - these nested loops can be reimplemented as a one-liner, see below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done



def get_3d_output_size(angle, axis, input_size, parity_correction=False):
rotation = np.abs(get_3d_lin_rotation(angle, axis))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Some other name would be nice - this is not really a rotation matrix...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 80 to 85
for i in range(3):
maxv = rotation[i, dominant_axis[i]]
for j in range(3):
if rotation[i, j] > maxv:
maxv = rotation[i, j]
dominant_axis[i] = j
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for i in range(3):
maxv = rotation[i, dominant_axis[i]]
for j in range(3):
if rotation[i, j] > maxv:
maxv = rotation[i, j]
dominant_axis[i] = j
dominant_axis = np.argmax(rotation, axis=1)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if out_size[i] % 2 != in_size[dominant_axis[i]] % 2:
out_size[i] += 1

return np.array(list(reversed(out_size)), dtype=np.int32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return np.array(list(reversed(out_size)), dtype=np.int32)
return out_size[::-1]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -28,6 +28,49 @@
vid_file = os.path.join(data_root, 'db', 'video',
'sintel', 'sintel_trailer-720p.mp4')

class ParamsProvider:
def __init__(self, input_params):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
def __init__(self, input_params):
def __init__(self, **input_params : ParamDesc):

?

Copy link
Member Author

@stiepan stiepan Apr 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to leave existing tests as they are. I split the ParamsProvider into a base class and actual class, the first is just about providing input data to the provider instance while the actual one take input arguments description (added docs there with some type annotation) and computes parameters. This way derived classes can easily control format of tensor arguments description.

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan
Copy link
Member Author

stiepan commented Apr 25, 2022

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4667693]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4667693]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [4667693]: BUILD PASSED

@stiepan stiepan merged commit f41733b into NVIDIA:main Apr 25, 2022
cyyever pushed a commit to cyyever/DALI that referenced this pull request May 13, 2022
* Add support for FHWC and FDHWC layouts
* Add support for per-frame tensor input to angle and axis parameters

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
cyyever pushed a commit to cyyever/DALI that referenced this pull request Jun 7, 2022
* Add support for FHWC and FDHWC layouts
* Add support for per-frame tensor input to angle and axis parameters

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants