Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Video reader resize #2097

Merged
merged 15 commits into from
Jul 10, 2020
Merged

Video reader resize #2097

merged 15 commits into from
Jul 10, 2020

Conversation

awolant
Copy link
Contributor

@awolant awolant commented Jul 9, 2020

Why we need this PR?

  • It adds new feature needed because of asks from multiple sources for video resize

What happened in this PR?

  • What solution was applied:
    Create new operator VideoReaderResize that combines reading and resizing the videos
  • Affected modules and functionalities:
    Code around video reader and resize
  • Key points relevant for the review:
    VideoReaderResize, refactoring of VideoReader
  • Validation and testing:
    Added Python tests to compare with existing VideoReader and Resize ops
  • Documentation (including examples):
    Spec for new op has docs, inherits docs from existing ops as well

JIRA TASK: [Use DALI-681]

Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant awolant changed the title Video reader resize [WIP] Video reader resize Jul 9, 2020
@awolant
Copy link
Contributor Author

awolant commented Jul 9, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1455408]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1455429]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1455408]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1455429]: BUILD FAILED

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant
Copy link
Contributor Author

awolant commented Jul 9, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1455477]: BUILD STARTED

@awolant
Copy link
Contributor Author

awolant commented Jul 9, 2020

!build

@awolant awolant changed the title [WIP] Video reader resize Video reader resize Jul 9, 2020
@awolant awolant requested a review from a team July 9, 2020 09:31
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1455540]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1455477]: BUILD FAILED

}

frames.set_type(sequence.type());
frames.Resize(shape);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can use

ShareData(void *ptr, size_t bytes, const TensorListShape<> &shape,
                                  const TypeInfo &type = {}```?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1455540]: BUILD FAILED

@awolant
Copy link
Contributor Author

awolant commented Jul 9, 2020

build!

@awolant
Copy link
Contributor Author

awolant commented Jul 9, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1456327]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1456327]: BUILD FAILED


namespace dali {

class VideoReaderResize : public VideoReader, protected ResizeAttr, protected ResizeBase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inheritance or composition?
I don't think ti should inherit from ResizeAttr, not sure about ResizeBase either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how Resize is built, so I wanted to follow this pattern. If we want to change, let's change both, in separate PR maybe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

output.Resize(output_shape);
}

void ShareSingleOutput(int data_idx, TensorList<GPUBackend> &batch_output,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void ShareSingleOutput(int data_idx, TensorList<GPUBackend> &batch_output,
void SequenceToTensorList(int data_idx, TensorList<GPUBackend> &batch_output,

I'm not sure if ShareSingleOutput tells what it does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

for sample in range(batch_size):
yield [sequences_out[sample]]

gt_pipeline = dali.pipeline.Pipeline(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can use ElementExtract to extract all frames from a sequence as a separate batches and call resize on it instead of going through ExternalSource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm mistaken, but I think that writing a pipeline with ElementExtract that would be robust to changes of batch_size and sequence_length throughout the tests might be somewhat involved. You need to adjust number of outputs and calls to Resize depending on sequence_length. If you feel strongly about it, I can do it, but maybe we can leave it as it is?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would depend only on the sequence_length, and you can do it in the loop. So you can compare two pipelines:
VideoReader+seq_lenElementExtract+seq_lenResize vs VideoesizeReader+seq_len*ElementExtract. But it is just an idea. Let us wait for another opinion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant
Copy link
Contributor Author

awolant commented Jul 9, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1457164]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1457164]: BUILD FAILED

Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant
Copy link
Contributor Author

awolant commented Jul 10, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1459098]: BUILD STARTED

Comment on lines 99 to 100
for i in range(video_length):
resized_frames[i] = self.resize(resized_frames[i])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should do:

Suggested change
for i in range(video_length):
resized_frames[i] = self.resize(resized_frames[i])
resized_frames = self.resize(resized_frames)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1459098]: BUILD FAILED

@@ -35,6 +35,14 @@ enum t_idInfo : uint32_t {
t_mirrorVert
};

struct TransformMeta {
int H, W, C;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: non-standard indentation (we are using 2 spaces)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

void share_frames(TensorList<GPUBackend> &frames) {
void *current_sequence = sequence.raw_mutable_data();

TensorListShape<> shape;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can:

auto shape = TensorListShape<>::make_uniform(count, frame_shape());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

struct TransformMeta {
int H, W, C;
int rsz_h, rsz_w;
std::pair<int, int> crop;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this crop pair refers to. Sizes? assuming 0 anchors? I'd rather have a CropWindow instance here but you can dismiss this change as out-of-scope if you consider it so

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to change existing resize related code as little as possible, so maybe let's leave it for now. AFAIK it will be heavily changed soon.

inline ~VideoReaderResize() override = default;

protected:
void SetupSharedSampleParams(DeviceWorkspace &ws) override {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

TensorList<GPUBackend> &video_output,
SequenceWrapper &prefetched_video,
DeviceWorkspace &ws) override {
std::fill_n(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::fill_n(
auto params = detail::GetResamplingParams(...);
for (auto& entry : resample_params_)
entry = params;

is it equivalent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. STL version is more readable for me though, so I would like to leave it as it is, if that is ok with you?

frame = sample[frame_id]
gt_frame = gt_batch[frame_id].at(sample_id)

if gt_frame.shape == frame.shape:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not:

assert (gt_frame.shape == frame.shape), "Shapes are not equal: {} != {}".format(
                        gt_frame.shape, frame.shape)
assert (gt_frame == frame).all(), "Images are not equal"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I did an if to be able to set the breakpoint inside it and look at stuff while working on it.

batch, = pipeline.run()
batch = batch.as_cpu()
gt_batch = list(gt_pipeline.run())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:

gt_batch = [out.as_cpu() for out in gt_pipeline.run()]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return pipeline


def compare_video_resize_pipelines(pipeline, gt_pipeline, batch_size, video_length, iterations=16):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't you use compare_pipelines from test_utils.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that straight forward, because gt_pipeline has different layout. It returns sequence_length outputs where n-th outputs has n-th frame for all videos.

def test_video_resize(batch_size=2):
for vp in video_reader_params:
for rp in resize_params:
yield run_for_params, batch_size, vp, rp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is printed when you run with nosetest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This runs 14 separate tests. All info from vp and rp is printed. Maybe not supper pretty, but it's all there, so I guess it's fine.

label_output->set_type(TypeInfo::Create<int>());
label_output->Resize(label_shape_);
label_output_ = &ws.Output<GPUBackend>(output_index++);
label_output_->set_type(TypeInfo::Create<int>());
Copy link
Collaborator

@banasraf banasraf Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
label_output_->set_type(TypeInfo::Create<int>());
label_output_->set_type(TypeTable::GetTypeInfo(TypeTable::GetTypeID<int>()));

It's rather better to use type table than create a new type info.

BTW, TypeTable should definitely have a method to obtain a TypeInfo from a static type (basically those two methods in one).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added this method as: TypeTable::GetTypeInfoFromStatic<int>()

frame_num_output->set_type(TypeInfo::Create<int>());
frame_num_output->Resize(frame_num_shape_);
frame_num_output_ = &ws.Output<GPUBackend>(output_index++);
frame_num_output_->set_type(TypeInfo::Create<int>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
frame_num_output_->set_type(TypeInfo::Create<int>());
frame_num_output_->set_type(TypeTable::GetTypeInfo(TypeTable::GetTypeID<int>()));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

timestamp_output->set_type(TypeInfo::Create<double>());
timestamp_output->Resize(timestamp_shape_);
timestamp_output_ = &ws.Output<GPUBackend>(output_index++);
timestamp_output_->set_type(TypeInfo::Create<double>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
timestamp_output_->set_type(TypeInfo::Create<double>());
timestamp_output_->set_type(TypeTable::GetTypeInfo(TypeTable::GetTypeID<double>()));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (dtype_ == DALI_FLOAT) {
tl_sequence_output.set_type(TypeInfo::Create<float>());
output.set_type(TypeInfo::Create<float>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here, also GetTypeInfo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

} else { // dtype_ == DALI_UINT8
tl_sequence_output.set_type(TypeInfo::Create<uint8>());
output.set_type(TypeInfo::Create<uint8>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here, GetTypeInfo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

@banasraf banasraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, with minor comments

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1459098]: BUILD PASSED

Signed-off-by: Albert Wolant <awolant@nvidia.com>
@awolant
Copy link
Contributor Author

awolant commented Jul 10, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1459445]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1459445]: BUILD PASSED

@awolant awolant merged commit ada6f49 into NVIDIA:master Jul 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants