[RFC] TorchVision with Batteries included - Phase 1 #3911

Closed
16 tasks done
datumbox opened this issue May 25, 2021 · 14 comments

Comments

@datumbox
Contributor

datumbox commented May 25, 2021

🚀 Feature

Note: To track the progress of the project check out this board.

Add popular primitives (Losses, Schedulers, Data Augmentations, Operators etc) which are often used to reproduce SOTA references and new popular highly accurate models with pre-trained weights to TorchVision.

Motivation

Though TorchVision includes many common building blocks necessary for training CV models, it lacks popular primitives which are often used to reproduce SOTA. Some of these primitives are part of our reference scripts (Data utils, transforms etc) because we previously did not want to commit to a specific API. Others are part of libraries from the broader ecosystem. Additionally, TorchVision does not provide some of the newer, popular architectures which currently achieve good results in a variety of vision tasks.

Adding support for such primitives and models to TorchVision will give a “batteries included” experience to its users. Researchers will be able to do SOTA research and reproduce papers by using common building blocks rather than writing their own, while industry users will be able to adapt the models to their domains more easily using SOTA techniques.

Pitch

The addition of primitives should be done in several phases, iterating between trying to reproduce SOTA recipes, identifying accuracy gaps and implementing the necessary methods to close them. The progress of this project is tracked on this board.

During phase 1, add to TorchVision the following primitives and models:

Other potential primitives to be considered during phase 2:

Note that any of the suggested primitives that are not vision-specific should be added to PyTorch, so that all Domain libraries can benefit from them.

cc @vfdev-5 @fmassa @oke-aditya @jbschlosser @iramazanli

@oke-aditya
Contributor

oke-aditya commented May 27, 2021

Adding on a little bit. There have been several approaches to creating batteries-loaded libraries on top of torchvision.
We might as well take inspiration and motivation from them and add features that can be useful.

Edit: Rewrote the ideas after giving them more thought.

  • 3D operations

Requests have come in for 3D NMS, 3D ops, etc. (#2402).

  • An nn API to instantiate building blocks

Say we could easily do nn.MLP, nn.SqueezeExcite, nn.TwoMLPHead, nn.InvertedResidual or nn.BasicBlock (see #4333).
These would help people create models more easily than copy-pasting torchvision files.

@bmanga
Contributor

bmanga commented Jun 3, 2021

+1 from me.
Maybe there should be an experimental namespace where we can refine the API over time before promoting features to stable.

@oke-aditya
Contributor

oke-aditya commented Aug 21, 2021

Any thoughts about adding different types of IoU metrics such as DIoU (Distance IoU) and CIoU (Complete IoU)?
Refer to the paper.

The above paper mentions some benefits in training with FRCNN, SSD and YOLOv4.
These are now used by YOLOv5

This was earlier asked in #3026 #2545

Perhaps these two operations are now mature enough to be included?
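
For readers unfamiliar with the two metrics, here is a minimal sketch of the losses as they are usually defined in the DIoU/CIoU paper, written in plain Python for a single pair of boxes. The function name `diou_ciou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration; this is not a TorchVision or Detectron2 API, and a real implementation would be vectorized over tensors.

```python
import math


def diou_ciou_loss(box_a, box_b, eps=1e-7):
    """Return (diou_loss, ciou_loss) for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration only; box_b plays the role of
    the ground-truth box in the aspect-ratio term.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter + eps)

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    diag_sq = enclose_w ** 2 + enclose_h ** 2 + eps

    # Squared distance between the two box centers.
    dist_sq = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + (
        (ay1 + ay2) / 2 - (by1 + by2) / 2
    ) ** 2

    # DIoU loss: IoU loss plus a normalized center-distance penalty.
    diou_loss = 1.0 - iou + dist_sq / diag_sq

    # CIoU adds an aspect-ratio consistency term v weighted by alpha.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1 + eps))
        - math.atan((ax2 - ax1) / (ay2 - ay1 + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)
    ciou_loss = diou_loss + alpha * v
    return diou_loss, ciou_loss
```

Unlike the plain IoU loss, both variants still change with the distance between the boxes when the boxes do not overlap, which is the property the paper argues helps regression converge.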

@datumbox
Contributor Author

datumbox commented Sep 4, 2021

@oke-aditya I just realized I haven't responded. Apologies for that.

I agree we should explore the ideas you added. I know you have separate issues for all of them, so let's track them there. Happy to move them to Batteries Included once the first set of primitives is added. Some additional discussion will be necessary to prioritize but we'll do this when it's time to review additions.

@oke-aditya
Contributor

Hey, no worries @datumbox 😄 You are doing awesome work, and batteries included is a great initiative 👍

Let me know if there are any features that I could work on. Maybe ones that are not critical or timeline specific so that it won't hinder overall development.

@datumbox
Contributor Author

datumbox commented Sep 4, 2021

@oke-aditya I'm currently collecting feedback from various research teams at FB to see which other operators are worth including. I will definitely let you know once things become clear; I just want to make sure you won't work on something that we later feel shouldn't be added. Anyway, if you want to have an early look, check out the DropBlock paper.

@vaibhava0

I recommend adding Large Scale Jitter data augmentation as well. It is quite simple and powerful. It has shown promise across many use cases. Here is an implementation in D2:

https://github.com/facebookresearch/detectron2/blob/main/configs/new_baselines/mask_rcnn_R_50_FPN_100ep_LSJ.py#L44

Some benchmark results:
https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md#new-baselines-using-large-scale-jitter-and-longer-training-schedule
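
To make the idea concrete, here is a rough sketch of the geometry behind Large Scale Jitter: pick a random scale, resize the image so it fits inside the scaled target while keeping its aspect ratio, then randomly crop and pad to a fixed output size. The function name `lsj_geometry`, its return format, and the default values (a 1024x1024 target and a scale range of 0.1 to 2.0) are assumptions for illustration, loosely modeled on the linked Detectron2 config; it only computes sizes and offsets and does not touch pixels.

```python
import random


def lsj_geometry(height, width, target=(1024, 1024),
                 scale_range=(0.1, 2.0), rng=random):
    """Sample resize, crop and pad parameters for one image.

    Hypothetical helper: returns a dict describing the transform
    rather than applying it to image data.
    """
    target_h, target_w = target

    # 1. Sample the jitter scale.
    scale = rng.uniform(*scale_range)

    # 2. Resize so the image fits inside the scaled target,
    #    preserving the aspect ratio.
    ratio = min(target_h * scale / height, target_w * scale / width)
    new_h = max(1, round(height * ratio))
    new_w = max(1, round(width * ratio))

    # 3. If the resized image is larger than the target, take a random crop.
    crop_h, crop_w = min(new_h, target_h), min(new_w, target_w)
    top = rng.randint(0, new_h - crop_h)
    left = rng.randint(0, new_w - crop_w)

    # 4. If it is smaller, pad on the bottom and right up to the target size.
    return {
        "scale": scale,
        "resized": (new_h, new_w),
        "crop": (top, left, crop_h, crop_w),
        "pad": (target_h - crop_h, target_w - crop_w),
    }
```

Boxes and masks would need the same resize, crop and pad applied to them, which is why the transform is usually implemented jointly over the image and its targets.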

@0x00b1
Contributor

0x00b1 commented Sep 15, 2021

Oof! I feel bad! This is shaping up to be a great release.

@datumbox
Contributor Author

@vaibhava0 Thanks for the proposal. You are right, this is quite important. We have it on #3817 but I'll add it here more prominently (I also added the code and benchmark links you provided).

@0x00b1 Your contribution can still make it in. tap tap tap ⌨️ 🚀

@oke-aditya
Contributor

Any thoughts about adding different types of IoU metrics such as DIoU (Distance IoU) and CIoU (Complete IoU)
Refer paper

CIoU and DIoU have been added to Detectron2

https://github.com/facebookresearch/detectron2/blob/dfe8d368c8b7cc2be42c5c3faf9bdcc3c08257b1/detectron2/layers/losses.py#L66

@datumbox
Contributor Author

datumbox commented Oct 21, 2021

@oke-aditya We are wrapping up phase 1 of this project. There will be a phase 2 in Q1 and we can definitely reassess what else needs to be added. It's great that you keep the linked issues up-to-date with references and proposals so that we can follow them there.

datumbox changed the title from "[RFC] TorchVision with Batteries included" to "[RFC] TorchVision with Batteries included - Phase 1" on Jan 28, 2022
@datumbox
Contributor Author

datumbox commented Jan 28, 2022

Batteries Included - Phase 1 is now concluded! I believe we have successfully refreshed the TorchVision library to support the Classification use-case and we have managed to refresh all of the popular pre-trained models of the library.

I'm going to close this ticket and start scoping Phase 2, which will focus on the Detection and Segmentation use-cases. Massive thanks to all of the people who contributed to this project either by implementing primitives, adding models or training new weights. I'll follow up with a new RFC for phase 2 and start outlining next steps so that we can get the feedback from the community.

Note: Some of the primitives that didn't make the cut in this RFC will move to the next phase.

@xiaoyuan0203

Will Mixup and CutMix be added to torchvision.transforms, and if so, when? #4379

@datumbox
Contributor Author

datumbox commented Jul 5, 2022

@xiaoyuan0203 that's the plan. We have untested implementations in prototype. We are working to finalize the API, document them and start testing them. I don't recommend them yet for production use-cases, but we will make sure to post an update once we are more confident in them.
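
For context on what these two augmentations do, here is a minimal sketch in plain Python. It is not the prototype implementation mentioned above: the names `mixup` and `cutmix_box`, the flat-list inputs, and the default `alpha` are all assumptions chosen to keep the example self-contained.

```python
import math
import random


def mixup(x_a, x_b, y_a, y_b, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels with a Beta-sampled weight.

    Hypothetical helper operating on flat lists of floats.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


def cutmix_box(height, width, lam, rng=random):
    """Pick the patch to paste from the second image for CutMix.

    Returns the box (top, left, bottom, right) and the label weight of the
    first image, recomputed from the area that was actually pasted.
    """
    # The patch should cover roughly (1 - lam) of the image area.
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)

    # Random patch center; the patch is clipped at the image borders.
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)

    # Clipping can shrink the patch, so adjust the label weight to match.
    lam_adjusted = 1.0 - (bottom - top) * (right - left) / (height * width)
    return (top, left, bottom, right), lam_adjusted
```

Both techniques need access to a second sample and to the labels, so they are typically applied per batch (for example in a collate function) rather than as a per-image transform.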
