[FEATURE] Support EfficientViT #1815

Randl · 2023-05-18T05:03:53Z

Add models from
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention https://arxiv.org/abs/2305.07027
https://github.com/microsoft/Cream/tree/main/EfficientViT

These look like fast, high-quality models; it would be nice to have them in timm.

rwightman · 2023-05-18T16:01:05Z

@Randl I noticed that one; also related are their mini/tiny ViT models. They look like a reasonable arch, a blend of LeViT / EfficientFormer w/ Swin and other ideas. BUT, they all need fairly extensive refactoring w/ checkpoint mapping to fit timm, get feat extraction working, etc., so it's not currently something I have bandwidth for.
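
To illustrate what "checkpoint mapping" tends to involve, here is a minimal sketch of rewriting state-dict keys after a refactor. The key names and rename rules below are hypothetical, chosen only for illustration; they are not the actual EfficientViT or timm parameter names. Plain integers stand in for tensors so the snippet runs without PyTorch.

```python
import re

# Hypothetical rename rules: (regex on the original key, replacement).
# Illustrative only; a real port would derive these from the two model definitions.
RENAME_RULES = [
    (r"^patch_embed\.", "stem."),
    # e.g. "blocks1.0.x" -> "stages.0.blocks.0.x" (1-based -> 0-based stage index)
    (r"^blocks(\d+)\.", lambda m: f"stages.{int(m.group(1)) - 1}.blocks."),
]


def remap_state_dict(state_dict):
    """Return a new dict with every rule applied, in order, to each key."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, repl in RENAME_RULES:
            new_key = re.sub(pattern, repl, new_key, count=1)
        if new_key in remapped:
            # Two source keys collapsing to one target key means a rule is wrong.
            raise KeyError(f"duplicate key after remapping: {new_key}")
        remapped[new_key] = value
    return remapped


# Stand-in checkpoint: values would be tensors in a real state dict.
original = {
    "patch_embed.0.c.weight": 1,
    "blocks1.0.mixer.weight": 2,
    "head.weight": 3,
}
converted = remap_state_dict(original)
```

Keys that match no rule (such as `head.weight` here) pass through unchanged, which keeps the mapping limited to the modules that were actually moved or renamed.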

If anyone wants to tackle this or the others, criteria for accepting:

- Refactored into stages, with the downsample always at the beginning of a stage! See the relevant examples.
- Residuals inline with the block code.
- Add feature_info and feature extraction support if the model has an NCHW or NHWC layout at the end of stages
  - see https://github.com/huggingface/pytorch-image-models/blob/main/timm/models/efficientformer_v2.py
- Rename some of the modules, like Conv_BN, to match the above examples; remove the _.
- Extract the stem as a module, as per levit/efficientformer.
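
As a rough sketch of why the "downsample at the beginning of the stage" rule matters for feature extraction: with that layout, each stage's output has one well-defined reduction relative to the input, so the per-stage metadata can be accumulated in a simple loop. The helper name `build_feature_info` and the example dims/strides are assumptions for illustration; only the `num_chs` / `reduction` / `module` entry shape follows the feature_info convention used in timm models.

```python
def build_feature_info(stem_stride, stage_dims, stage_strides):
    """Build per-stage feature metadata (hypothetical helper, not a timm API).

    stem_stride:   total stride of the stem.
    stage_dims:    output channels of each stage.
    stage_strides: stride of the downsample at the START of each stage
                   (1 means the stage does not downsample).
    """
    if len(stage_dims) != len(stage_strides):
        raise ValueError("stage_dims and stage_strides must have the same length")

    feature_info = []
    reduction = stem_stride
    for i, (dim, stride) in enumerate(zip(stage_dims, stage_strides)):
        # The downsample sits at the beginning of the stage, so the stage's
        # output reduction already includes its own stride.
        reduction *= stride
        feature_info.append(dict(num_chs=dim, reduction=reduction, module=f"stages.{i}"))
    return feature_info


# Example with made-up dims: a stride-4 stem followed by three stages.
info = build_feature_info(stem_stride=4, stage_dims=(64, 128, 256), stage_strides=(1, 2, 2))
```

If the downsample instead sat at the end of a stage, the features tapped at `stages.{i}` would already be at the next stage's resolution, which makes the channel count and reduction harder to reason about per stage.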

seefun · 2023-08-02T06:46:45Z

I tried implementing EfficientViT (MSRA) here: #1894

There is another, earlier work with the same name (EfficientViT), which looks good and has also been added.

youssefadr · 2023-08-02T20:10:24Z

Hello @rwightman, is there a model I can work on? I would be very happy to contribute!

rwightman · 2023-08-21T20:31:11Z

@youssefadr this one is done. I currently have MobileOne, FastViT and Inception_neXt (and a prototype) underway... but #1842 (FasterViT) hasn't been tackled. Although it looks like an easy adaptation, I'd want the downsamples moved and some other aspects cleaned up, so that can be a bit of fun...

youssefadr · 2023-09-03T14:13:44Z

Thank you for the answer, I will take a look at FasterViT soon 👍

Kaschi14 · 2023-10-06T11:10:27Z

In the EfficientViT repo https://github.com/mit-han-lab/efficientvit there are weights for higher-res models which are currently not supported in timm. Is it planned to include the higher-res (1024x2048) models in timm as well? :)

seefun · 2023-10-06T14:26:35Z

@Kaschi14 I noticed the original repo added the L-series EfficientViT weights and the SAM distillation weights, but the higher-res (1024x2048) models are Cityscapes segmentation models.

Randl added the "enhancement" label (New feature or request) on May 18, 2023

rwightman added the "help wanted" label (Extra attention is needed) on May 18, 2023

rwightman closed this as completed Aug 21, 2023

youssefadr mentioned this issue on Sep 3, 2023, in [FEATURE] Support FasterViT #1842 (closed)
