Video transforms #1353

stephenyan1231 · 2019-09-19T21:28:38Z

This PR replaces #1306 because the commit history of that one is polluted.

New features

Implement the following transforms for video clips:

- RandomCropVideo
- RandomResizedCropVideo
- CenterCropVideo
- NormalizeVideo
- ToTensorVideo
- RandomHorizontalFlipVideo
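
For readers unfamiliar with clip-level transforms, here is a rough sketch of what a crop, a horizontal flip, and a normalization could look like on a single clip tensor. It assumes a `(C, T, H, W)` layout, which is what the `permute(3, 0, 1, 2)` in this PR's `ToTensorVideo` code suggests. The helper names `center_crop_clip`, `hflip_clip`, and `normalize_clip` are made up for illustration and are not the functions added by this PR.

```python
import torch


def center_crop_clip(clip, size):
    """Crop the central size[0] x size[1] window from a (C, T, H, W) clip."""
    th, tw = size
    h, w = clip.shape[-2:]
    if th > h or tw > w:
        raise ValueError("crop size must not exceed the clip's spatial size")
    top = (h - th) // 2
    left = (w - tw) // 2
    # Slicing the last two axes applies the same window to every frame.
    return clip[..., top:top + th, left:left + tw]


def hflip_clip(clip):
    """Flip every frame left-to-right by reversing the width axis."""
    return clip.flip(-1)


def normalize_clip(clip, mean, std):
    """Normalize each channel of a float (C, T, H, W) clip."""
    # Reshape the per-channel statistics to (C, 1, 1, 1) so they broadcast
    # over the time, height, and width axes.
    mean = torch.as_tensor(mean, dtype=clip.dtype).view(-1, 1, 1, 1)
    std = torch.as_tensor(std, dtype=clip.dtype).view(-1, 1, 1, 1)
    return (clip - mean) / std


clip = torch.rand(3, 8, 32, 40)  # C=3 channels, T=8 frames, H=32, W=40
out = normalize_clip(hflip_clip(center_crop_clip(clip, (24, 24))), [0.5] * 3, [0.25] * 3)
```

The point of the sketch is that one set of crop or flip parameters is applied to all frames at once, so the frames of a clip stay spatially aligned.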

Unit test

Affected image transforms
- test/test_transforms.py
New unit tests for video transforms
- test/test_transforms_video

codecov-io · 2019-09-19T21:44:44Z

Codecov Report

Merging #1353 into master will increase coverage by 0.51%.
The diff coverage is 90.75%.

@@            Coverage Diff             @@
##           master    #1353      +/-   ##
==========================================
+ Coverage   65.47%   65.98%   +0.51%     
==========================================
  Files          75       77       +2     
  Lines        5827     5932     +105     
  Branches      892      900       +8     
==========================================
+ Hits         3815     3914      +99     
- Misses       1742     1746       +4     
- Partials      270      272       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| torchvision/transforms/__init__.py | `100% <100%> (ø)` | ⬆️ |
| torchvision/transforms/transforms.py | `80.94% <84.61%> (+0.55%)` | ⬆️ |
| torchvision/transforms/transforms_video.py | `88.88% <88.88%> (ø)` | |
| torchvision/transforms/functional_video.py | `95.23% <95.23%> (ø)` | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f677ea3...161848b.

bjuncek · 2019-09-23T09:43:53Z

torchvision/transforms/transforms.py

@@ -434,17 +434,17 @@ def __init__(self, size, padding=None, pad_if_needed=False, fill=0, padding_mode
        self.padding_mode = padding_mode

    @staticmethod
-    def get_params(img, output_size):
+    def get_params(w, h, output_size):


I have some reservations with respect to changing the API of the existing transforms, but I wonder how often this particular one is used externally.

Should we issue a warning maybe (cc @fmassa)?

bjuncek

One high-level reservation that I have is that @fmassa et al. were looking into introducing batched tensors, which would render this unnecessary, but I don't know what the status of that is.

fmassa

Thanks for the PR Zhicheng!

I'm thinking about a way of unifying the video and image cases. I'll come back with a proposal in the next day or so

fmassa · 2019-09-23T19:12:49Z

torchvision/transforms/transforms.py

@@ -434,17 +434,17 @@ def __init__(self, size, padding=None, pad_if_needed=False, fill=0, padding_mode
        self.padding_mode = padding_mode

    @staticmethod
-    def get_params(img, output_size):
+    def get_params(w, h, output_size):


yeah, we should not be doing a BC-breaking change here. There are ways of achieving the same thing without breaking BC, see for example https://github.com/pytorch/vision/pull/1104/files#diff-fc1f220b470714d05cf3ea6acf9fed59R34
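
As an illustration of the kind of backwards-compatible approach hinted at here, the sketch below keeps the original `get_params(img, output_size)` signature and moves the "how do I read the size?" question into a small helper that accepts both PIL-style and tensor-style inputs. This is only one possible approach and is not taken from the linked PR. `PILLike`, `TensorLike`, and `get_image_size` are hypothetical stand-ins so the example runs without PIL or torch, and the `(i, j, h, w)` return convention is an assumption.

```python
import random
from collections import namedtuple

# Hypothetical stand-ins so the sketch runs without PIL or torch.
PILLike = namedtuple("PILLike", ["size"])         # size == (width, height)
TensorLike = namedtuple("TensorLike", ["shape"])  # shape == (..., height, width)


def get_image_size(img):
    """Return (width, height) for either kind of input."""
    if hasattr(img, "shape"):  # tensor-like: trailing dims are (H, W)
        h, w = img.shape[-2:]
        return w, h
    if hasattr(img, "size"):   # PIL-like: size is already (W, H)
        return tuple(img.size)
    raise TypeError("Unexpected type {}".format(type(img)))


def get_params(img, output_size):
    """Keep the old signature; return (i, j, h, w) for a random crop."""
    w, h = get_image_size(img)
    th, tw = output_size
    if w == tw and h == th:
        return 0, 0, h, w
    i = random.randint(0, h - th)  # top offset, inclusive bounds
    j = random.randint(0, w - tw)  # left offset, inclusive bounds
    return i, j, th, tw


# Existing callers that pass an image keep working...
print(get_params(PILLike(size=(40, 32)), (24, 24)))
# ...and a video clip of shape (C, T, H, W) can use the same entry point.
print(get_params(TensorLike(shape=(3, 8, 32, 40)), (24, 24)))
```

With this shape, external code that calls `get_params(img, output_size)` is unaffected, while video transforms can pass a clip tensor directly.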

fmassa · 2019-09-23T19:28:09Z

torchvision/transforms/functional_video.py

+    _is_tensor_video_clip(clip)
+    if not clip.dtype == torch.uint8:
+        raise TypeError("clip tensor should have data type uint8. Got %s" % str(clip.dtype))
+    return clip.float().permute(3, 0, 1, 2) / 255.0


I think I'll be using memory_format in the data reading functionality, so that this permutation is maybe handled automatically for us, in a safer way.

And I'm also thinking about creating a new transform for performing image type conversions, like https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/image/convert_image_dtype , which would let us perform the scaling for different dtypes
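
To make the two ideas above concrete, here is a small self-contained sketch. `to_float_clip` mirrors the quoted diff lines (uint8 check, then permute from `(T, H, W, C)` to `(C, T, H, W)` and divide by 255). `scale_to_unit_range` is a hypothetical helper, not an existing torchvision function, showing how the scaling factor could be derived from the dtype instead of being hard-coded.

```python
import torch


def to_float_clip(clip):
    """Convert a uint8 (T, H, W, C) clip to float32 (C, T, H, W) in [0, 1]."""
    if not (torch.is_tensor(clip) and clip.dim() == 4):
        raise TypeError("clip should be a 4D torch.Tensor")
    if clip.dtype != torch.uint8:
        raise TypeError("clip tensor should have data type uint8. Got %s" % str(clip.dtype))
    # permute(3, 0, 1, 2) moves the channel axis to the front.
    return clip.float().permute(3, 0, 1, 2) / 255.0


def scale_to_unit_range(x):
    """Hypothetical dtype-aware scaling: divide integers by their dtype's maximum."""
    if x.is_floating_point():
        return x  # assume float inputs are already in [0, 1]
    return x.float() / torch.iinfo(x.dtype).max


clip = torch.randint(0, 256, (8, 32, 40, 3), dtype=torch.uint8)  # T, H, W, C
out = to_float_clip(clip)
print(out.shape, out.dtype)
```

Note that `permute` returns a non-contiguous view, so downstream code that needs contiguous memory would have to call `.contiguous()` itself.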

fmassa · 2019-09-24T13:34:08Z

I will be merging this PR as is for now to unblock @stephenyan1231, but I'll be making changes to how things are structured in a follow-up PR.

Summary:
Pull Request resolved: #62

Current dependency torchvision 0.4.0 was released in August. It missed quite a few PRs that are merged after that, and that are needed for video classification, such as
- pytorch/vision#1437
- pytorch/vision#1431
- pytorch/vision#1423
- pytorch/vision#1418
- pytorch/vision#1408
- pytorch/vision#1376
- pytorch/vision#1363
- pytorch/vision#1353
- pytorch/vision#1303

This will fail the CI test when a diff uses changes made in those PRs. Before a new official version of TorchVision is released, we can temporarily use the nightly torchvision to get all the recent PRs, and unblock the PR merging. We plan to use a fixed version of TorchVision later.

Reviewed By: vreis

Differential Revision: D17944239

fbshipit-source-id: 86ff540e3fc4f08ef767e84ef103525db5158201

fepegar · 2020-05-15T17:25:34Z

Are these documented?

fepegar · 2020-05-15T17:26:57Z

I suppose not yet, but they will be :) #1429

fmassa · 2020-05-22T16:56:31Z

@fepegar exactly, the video transforms will probably be unified with the image transforms, so that you can seamlessly use the same transform for both data types.

pulkitkumar95 · 2020-06-22T15:59:31Z

Hey @fmassa, any update on the unification and the doc updates for the video transforms?

fmassa · 2020-06-22T16:53:12Z

@pulkitkumar95 unification is happening, but a bit slower than initially planned. See #2282 for the approach we will be tackling.

zyan3 added 3 commits September 19, 2019 14:31

video transforms

5106cfc

[video transforms]in ToTensorVideo, divide value by 255.0

19f5705

[video transforms] fix a bug

717aeba

stephenyan1231 force-pushed the video_transforms branch from 88ea2c6 to 717aeba on September 19, 2019 21:36

stephenyan1231 mentioned this pull request Sep 19, 2019

video transforms #1306

Closed

fix linting

84cb0c9

bjuncek reviewed Sep 23, 2019

fmassa reviewed Sep 23, 2019

Make changes backwards-compatible

161848b

fmassa merged commit 64917bc into pytorch:master Sep 24, 2019

This was referenced Sep 26, 2019

To extend torchvision for video #855

Open

[RFC] Add scriptable transforms #1375

Closed

stephenyan1231 mentioned this pull request Oct 16, 2019

use nightly torchvision and torch 1.3 facebookresearch/ClassyVision#62

Closed

fmassa mentioned this pull request Oct 31, 2019

[v0.4.2] Release Tracker #1545

Closed

fmassa pushed a commit that referenced this pull request Oct 31, 2019

Video transforms (#1353)

914132c

* video transforms
* [video transforms]in ToTensorVideo, divide value by 255.0
* [video transforms] fix a bug
* fix linting
* Make changes backwards-compatible
