[RFC] Batteries Included - Phase 2 #5410

datumbox · 2022-02-11T15:45:41Z

🚀 The feature

Note: To track the progress of the project check out this board.

This is the 2nd phase of TorchVision's modernization project (see phase 1). We aim to keep TorchVision relevant by ensuring it provides, off the shelf, all the necessary primitives, model architectures and recipe utilities to produce SOTA results for the supported Computer Vision tasks.

1. New Primitives

To enable our users to reproduce the latest state-of-the-art research, we will enhance TorchVision with the following data augmentations, layers, losses and other operators:

Data Augmentations

AugMix - Adding AugMix implementation #5411
Large Scale Jitter - Adding Scale Jitter transform for detection #5435, Fix bbox scaling estimation for Large Scale Jitter #5446, Make ScaleJitter proportional #5559
Fixed Size Crop - Adding FixedSizeCrop transform #5607
Random Shortest Size - Adding RandomShortestSize transform #5610
Simple CopyPaste - Add SimpleCopyPaste augmentation #5825
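To illustrate what a detection augmentation such as Large Scale Jitter has to do, here is a minimal, framework-free sketch. It is not the code from the PRs above: the function name `scale_jitter`, its parameters and the default ranges are assumptions chosen for illustration. It samples a random scale, resizes the image dimensions while preserving the aspect ratio, and rescales the bounding boxes by the same factors.

```python
import random


def scale_jitter(width, height, boxes, target_size=(1024, 1024),
                 scale_range=(0.1, 2.0), rng=random):
    """Return the jittered (new_width, new_height) and the rescaled boxes.

    Hypothetical helper for illustration only.
    boxes: list of (x1, y1, x2, y2) tuples in pixel coordinates.
    target_size: (target_height, target_width).
    """
    # Sample the jitter factor uniformly from the allowed range.
    scale = rng.uniform(*scale_range)
    target_h, target_w = target_size
    # Fit the image inside the target size, then apply the random scale.
    ratio = min(target_h / height, target_w / width) * scale
    new_w, new_h = int(width * ratio), int(height * ratio)
    # Boxes must follow the actual (integer-rounded) resize factors.
    sx, sy = new_w / width, new_h / height
    new_boxes = [(x1 * sx, y1 * sy, x2 * sx, y2 * sy)
                 for x1, y1, x2, y2 in boxes]
    return (new_w, new_h), new_boxes


# A degenerate range of (1.0, 1.0) makes the result deterministic.
size, jittered = scale_jitter(200, 100, [(20, 10, 60, 40)],
                              target_size=(100, 100), scale_range=(1.0, 1.0))
```

Scaling the boxes by the rounded output size rather than by the sampled ratio keeps them aligned with the resized image, which is the kind of detail a bbox scaling fix has to get right.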

Layers

Losses

Operators added in PyTorch Core

Better EMA support in AveragedModel - Remove state_dict from AveragedModel and use buffers instead pytorch#71763
Add support of empty output in SyncBatchNorm - Fix SyncBatchNorm for empty inputs pytorch#74944
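For readers unfamiliar with EMA, the update rule behind the first item can be sketched without any framework. This is only an illustration of the averaging rule, not the AveragedModel implementation; the names `ema_update` and `decay` are hypothetical, and parameters are plain floats rather than tensors.

```python
def ema_update(averaged, current, decay=0.999):
    """Blend the current parameter values into the running average, in place.

    Hypothetical helper: both arguments map parameter names to floats.
    """
    for name, value in current.items():
        # Standard EMA rule: keep `decay` of the old average, add the rest
        # from the freshly updated parameter.
        averaged[name] = decay * averaged[name] + (1.0 - decay) * value
    return averaged


params = {"w": 1.0}
ema = dict(params)  # the average starts as a copy of the initial weights
for new_value in (2.0, 3.0):
    params["w"] = new_value  # stand-in for an optimizer step
    ema_update(ema, params, decay=0.5)
```

A real model also carries non-trainable state such as BatchNorm running statistics, which is why the way buffers are handled matters for the averaged copy.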

2. New Architectures & Model Iterations

To ensure that our users have access to the most popular SOTA models, we will add the following architectures along with pre-trained weights. Moreover, we will improve existing architectures with commonly adopted optimizations introduced in follow-up research:

Image Classification

Object Detection & Segmentation

FCOS - add FCOS #4961
Post-paper optimizations for RetinaNet, FasterRCNN & MaskRCNN - Post-paper Detection Optimizations #5444

Video Classification

MViT - Add MViT architecture in TorchVision #6198

3. Improved Training Recipes & Pre-trained models

To ensure that our users have access to strong baselines and SOTA weights, we will improve our training recipes to incorporate the newly released primitives and offer improved pre-trained models:

Reference Scripts

Update EMA to use PyTorch Core's new implementation - Simplify EMA to use Pytorch's update_parameters #5469
Add support of new Detection primitives in Reference Scripts - Detection recipe enhancements #5715

Pre-trained weights

Other Candidates

There are several other Operators (#5414), Losses (#2980), Augmentations (#3817) and Models (#2707) proposed by the community. Here are some potential candidates that we could implement depending on bandwidth. Contributions are welcome for any of the below:

AutoAugment Detection code - Implement AutoAugment for Detection #6224
Deformable DeTR
Polynomial LR scheduler (upstream to Core)
Shortcut Regularizer (FX-based)
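As a reference point for the Polynomial LR scheduler candidate, here is a minimal sketch of the usual polynomial decay formula. It is an assumption about what such a scheduler would compute, not an existing API: the function name `polynomial_lr` and the `power` and `end_lr` parameters are hypothetical.

```python
def polynomial_lr(base_lr, step, total_steps, power=1.0, end_lr=0.0):
    """Learning rate after `step` steps of polynomial decay.

    Hypothetical helper: decays from base_lr to end_lr over total_steps,
    then stays at end_lr.
    """
    step = min(step, total_steps)  # clamp so the rate never goes below end_lr
    factor = (1.0 - step / total_steps) ** power
    return (base_lr - end_lr) * factor + end_lr


# With power=1.0 the decay is linear; larger powers decay faster early on.
schedule = [polynomial_lr(0.1, s, total_steps=100, power=2.0) for s in (0, 50, 100)]
```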

cc @datumbox @vfdev-5


xiaohu2015 · 2022-02-25T15:46:31Z

@datumbox I think Swin Transformer is a very popular model, so I am planning to add it to torchvision.

datumbox · 2022-02-25T15:57:38Z

Sounds great @xiaohu2015, thanks for the help!

Can you open an "empty" PR similar to what you did for Dropblock initially? It will help us mark the item as in-progress and avoid others trying to do the same.

lezwon · 2022-04-14T06:10:28Z

Hey @datumbox, I'd like to take a shot at Simple CopyPaste augmentation, if it's available. Although I would definitely require some initial guidance on it :)

datumbox · 2022-04-14T07:03:41Z

@lezwon Yes, it's available and very high on our candidate list. :) Note that the API of this transform is tricky because it combines transforms across images in the batch (similar to the MixUp and CutMix transforms in the Classification references, not the ones in prototype).

How about the following? If you write a functional implementation I can help you review, adapt it to the necessary API and test it on real models/data. Let me know your thoughts!

PS: Note that I am currently OOO until Tuesday, so I might be slow to respond until then.

lezwon · 2022-04-14T07:17:33Z

@datumbox sounds good 👍 I'll get started on it and ping you once I have a POC ready.

datumbox changed the title ~~Batteries Included - Phase 2~~ [RFC] Batteries Included - Phase 2 Feb 11, 2022

datumbox added module: models module: ops module: reference scripts module: transforms needs discussion new feature labels Feb 11, 2022

jdsgomes mentioned this issue Feb 23, 2022

Upstream Resnext3d from classivision #5463

Closed

3 tasks

xiaohu2015 mentioned this issue Feb 27, 2022

Adding Swin Transformer architecture #5491

Merged

vfdev-5 mentioned this issue Mar 2, 2022

[RFC] Implement transforms primitives for Bounding Boxes #5514

Closed

8 tasks

datumbox mentioned this issue Apr 5, 2022

[RFC] Loss Functions in Torchvision #2980

Open

20 tasks

datumbox added help wanted and removed needs discussion labels Apr 5, 2022

datumbox mentioned this issue Apr 6, 2022

Improve the accuracy of Detection & Segmentation models by using SOTA recipes and primitives #5307

Closed

vfdev-5 mentioned this issue Apr 7, 2022

[RFC] Implement transforms primitives for Segmentation Masks #5782

Closed

8 tasks

datumbox mentioned this issue Apr 8, 2022

[RFC] New Ops in TorchVision #5414

Open

10 tasks

datumbox mentioned this issue Apr 21, 2022

fix fcos gt_areas calculation #5816

Merged

datumbox mentioned this issue Apr 28, 2022

Are new models planned to be added? #2707

Open

37 tasks

datumbox mentioned this issue Jul 27, 2022

[RFC] Batteries Included - Phase 3 #6323

Open

16 tasks

datumbox closed this as completed Jul 27, 2022
