Skip to content

Conversation

@putivsky
Copy link
Contributor

@putivsky putivsky commented Feb 5, 2020

Summary: Changed base decoder internals for a faster clip processing.

Differential Revision: D19748379

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D19748379

putivsky pushed a commit to putivsky/vision that referenced this pull request Feb 6, 2020
Summary:
Pull Request resolved: pytorch#1852

Changed base decoder internals for a faster clip processing.

Differential Revision: D19748379

fbshipit-source-id: 6fb3b4b97949ada9ae1e5431848391042cf8ed2c
@putivsky putivsky force-pushed the export-D19748379-to-fbsync branch from 3673fc1 to ed54bd5 Compare February 6, 2020 01:18
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D19748379

Summary:
Pull Request resolved: pytorch#1852

Changed base decoder internals for a faster clip processing.

Reviewed By: stephenyan1231

Differential Revision: D19748379

fbshipit-source-id: 71b16d63e7eb939107d5f4505af05b1d234553d5
@putivsky putivsky force-pushed the export-D19748379-to-fbsync branch from ed54bd5 to 85befde Compare February 8, 2020 21:49
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D19748379

facebook-github-bot pushed a commit that referenced this pull request Feb 9, 2020
Summary:
Pull Request resolved: #1852

Changed base decoder internals for a faster clip processing.

Reviewed By: stephenyan1231

Differential Revision: D19748379

fbshipit-source-id: 58a435f0a0b25545e7bd1a3edb0b1d558176a806
@fmassa
Copy link
Member

fmassa commented Feb 10, 2020

Merged in 64e52a1

@fmassa fmassa closed this Feb 10, 2020
fmassa pushed a commit to fmassa/vision-1 that referenced this pull request Mar 13, 2020
Summary:
Pull Request resolved: pytorch#1852

Changed base decoder internals for a faster clip processing.

Reviewed By: stephenyan1231

Differential Revision: D19748379

fbshipit-source-id: 58a435f0a0b25545e7bd1a3edb0b1d558176a806
fmassa pushed a commit to fmassa/vision-1 that referenced this pull request Mar 16, 2020
Summary:
Pull Request resolved: pytorch#1852

Changed base decoder internals for a faster clip processing.

Reviewed By: stephenyan1231

Differential Revision: D19748379

fbshipit-source-id: 58a435f0a0b25545e7bd1a3edb0b1d558176a806
fmassa pushed a commit to fmassa/vision-1 that referenced this pull request Mar 16, 2020
Summary:
Pull Request resolved: pytorch#1852

Changed base decoder internals for a faster clip processing.

Reviewed By: stephenyan1231

Differential Revision: D19748379

fbshipit-source-id: 58a435f0a0b25545e7bd1a3edb0b1d558176a806
fmassa added a commit that referenced this pull request Mar 17, 2020
* Base decoder for video. (#1747)

Summary:
Pull Request resolved: #1747

Pull Request resolved: #1746

Added the implementation of ffmpeg based decoder with functionality that can be used in VUE and TorchVision.

Reviewed By: fmassa

Differential Revision: D19358914

fbshipit-source-id: abb672f89bfaca6351dda2354f0d35cf8e47fa0f

* Integrated base decoder into VideoReader class and video_utils.py (#1766)

Summary:
Pull Request resolved: #1766

Replaced FfmpegDecoder (incompativle with VUE) by base decoder (compatible with VUE).
Modified python utilities video_utils.py for internal simplification. Public interface got preserved.

Reviewed By: fmassa

Differential Revision: D19415903

fbshipit-source-id: 4d7a0158bd77bac0a18732fe4183fdd9a57f6402

* Optimizating base decoder performance. (#1852)

Summary:
Pull Request resolved: #1852

Changed base decoder internals for a faster clip processing.

Reviewed By: stephenyan1231

Differential Revision: D19748379

fbshipit-source-id: 58a435f0a0b25545e7bd1a3edb0b1d558176a806

* Minor fix and decoder class members access.

Summary:
Found and fix a bug in cropping algorithm (simple mistyping).
Also derived classes need access to some decoder class members, like initialization parameters - make it protected.

Reviewed By: stephenyan1231, fmassa

Differential Revision: D19895076

fbshipit-source-id: 691336c8e18526b085ae5792ac3546bc387a6db9

* Added missing header for less dependencies. (#1898)

Summary:
Pull Request resolved: #1898

Include streams/samplers shouldn't depend on decoder headers. Add dependencies directly to the place where they are required.

Reviewed By: stephenyan1231

Differential Revision: D19911404

fbshipit-source-id: ef322a053708405c02cee4562b456b1602fb12fc

* Implemented VUE Asynchronous Decoder

Summary: For Mothership we have found that asynchronous decoder provides a better performance.

Differential Revision: D20026194

fbshipit-source-id: 627b91844b4e3f917002031dd32cb19c239f4ba8

* fix a bug in API read_video_from_memory (#1942)

Summary:
Pull Request resolved: #1942

In D18720474, it introduces a bug in `read_video_from_memory` API. Thank weiyaowang for reporting it.

Reviewed By: weiyaowang

Differential Revision: D20270179

fbshipit-source-id: 66348c99a5ad1f9129b90e934524ddfaad59de03

* extend decoder to support new video_max_dimension argument (#1924)

Summary:
Pull Request resolved: #1924

Extend `video reader` decoder python API in Torchvision to support a new argument `video_max_dimension`. This enables the new video decoding use cases. When setting `video_width=0`, `video_height=0`, `video_min_dimension != 0`, and `video_max_dimension != 0`, we can rescale the video clips so that its spatial resolution (height, width) becomes
 - (video_min_dimension, video_max_dimension) if original height < original width
 - (video_max_dimension, video_min_dimension) if original height >= original width

This is useful at video model testing stage, where we perform fully convolution evaluation and take entire video frames without cropping as input. Previously, for instance we can only set `video_width=0`, `video_height=0`, `video_min_dimension = 128`, which will preserve aspect ratio. In production dataset, there are a small number of videos where aspect ratio is either extremely large or small, and when the shorter edge is rescaled to 128, the longer edge is still large. This will easily cause GPU memory OOM when we sample multiple video clips, and put them in a single minibatch.

Now, we can set (for instance) `video_width=0`, `video_height=0`, `video_min_dimension = 128` and `video_max_dimension = 171` so that the rescale resolution is either (128, 171) or (171, 128) depending on whether original height is larger than original width. Thus, we are less likely to have gpu OOM because the spatial size of video clips is determined.

Reviewed By: putivsky

Differential Revision: D20182529

fbshipit-source-id: f9c40afb7590e7c45e6908946597141efa35f57c

* Fixing samplers initialization (#1967)

Summary:
Pull Request resolved: #1967

No-ops for torchvision diff, which fixes samplers.

Differential Revision: D20397218

fbshipit-source-id: 6dc4d04364f305fbda7ca4f67a25ceecd73d0f20

* Exclude C++ test files

Co-authored-by: Yuri Putivsky <yuri@fb.com>
Co-authored-by: Zhicheng Yan <zyan3@fb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants