[Refactor] Refactor ViT (Continue #295) #395
Codecov Report
@@ Coverage Diff @@
## master #395 +/- ##
==========================================
+ Coverage 77.96% 78.51% +0.54%
==========================================
Files 102 102
Lines 5619 5612 -7
Branches 923 915 -8
==========================================
+ Hits 4381 4406 +25
+ Misses 1111 1087 -24
+ Partials 127 119 -8
LGTM.
* [Squash] Refactor ViT (from open-mmlab#295)
* Use base variable to simplify auto_aug setting
* Use common PatchEmbed, remove HybridEmbed and refactor ViT init structure.
* Add `output_cls_token` option and change the output format of ViT and input format of ViT head.
* Update unit tests and add test for `output_cls_token`.
* Support out_indices.
* Standardize config files
* Support resize position embedding.
* Add README file for ViT
* Rename config file
* Improve docs about ViT.
* Update docstring
* Use local version `MultiheadAttention` instead of mmcv version.
* Fix MultiheadAttention
* Support `qk_scale` argument in `MultiheadAttention`
* Improve docs and change `layer_cfg` to `layer_cfgs` and support sequence.
* Use init_cfg to init Linear layer in VisionTransformerHead
* Update metafile
* Update checkpoints and configs
* Improve docstring.
* Update README
* Revert GAP modification.
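For readers unfamiliar with the `qk_scale` item above, here is a minimal sketch of how such an argument commonly behaves in ViT-style attention: when it is unset, attention scores are scaled by `head_dims ** -0.5`; when it is set, the given value is used instead. This is an assumption based on common practice, not a copy of this PR's `MultiheadAttention`; the helper names `resolve_qk_scale` and `scaled_scores` are hypothetical, and plain Python lists stand in for tensors.

```python
def resolve_qk_scale(embed_dims, num_heads, qk_scale=None):
    """Pick the scale applied to q @ k^T (hypothetical helper).

    Assumption: an unset qk_scale falls back to head_dims ** -0.5,
    the usual scaled dot-product attention default.
    """
    if embed_dims % num_heads != 0:
        raise ValueError("embed_dims must be divisible by num_heads")
    head_dims = embed_dims // num_heads
    return qk_scale if qk_scale is not None else head_dims ** -0.5


def scaled_scores(queries, keys, scale):
    """Return scale * (q . k) for every query/key pair (pre-softmax scores)."""
    return [
        [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        for q in queries
    ]


# ViT-Base style sizes: 768 channels over 12 heads gives 64 dims per head.
default_scale = resolve_qk_scale(768, 12)
custom_scale = resolve_qk_scale(768, 12, qk_scale=0.5)

# One query against two keys, using the custom scale.
scores = scaled_scores([[1.0, 0.0]], [[2.0, 0.0], [0.0, 3.0]], custom_scale)
```

The sketch checks `qk_scale is not None` so that an explicit `0.0` is honored; implementations that write `qk_scale or default` would treat `0.0` as unset.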
Motivation
This PR continues PR #295 and refactors the ViT backbone.
Modification
Continues the work started in #295.
BC-breaking (Optional)
The ViT structure has changed, so old configs and checkpoints will no longer work.
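When a refactor breaks old checkpoints only because parameters were renamed, one common workaround is to rewrite the keys of the saved state dict. The sketch below illustrates that general idea with plain dictionaries. Whether renaming alone is enough for this PR is not stated here, and every key name and prefix in the example is hypothetical rather than taken from the actual old or new ViT implementation.

```python
def remap_state_dict_keys(state_dict, rename_rules):
    """Return a new dict with key prefixes rewritten.

    rename_rules is a list of (old_prefix, new_prefix) pairs; the first
    rule whose old_prefix matches a key is applied, others are ignored.
    """
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in rename_rules:
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        if new_key in remapped:
            raise KeyError(f"two old keys map to the same new key: {new_key}")
        remapped[new_key] = value
    return remapped


# Hypothetical key names, for illustration only.
old_checkpoint = {
    "backbone.old_embed.weight": 1,
    "backbone.norm.weight": 2,
}
rules = [("backbone.old_embed.", "backbone.new_embed.")]
new_checkpoint = remap_state_dict_keys(old_checkpoint, rules)
```

If a refactor also changes tensor shapes or merges layers, key renaming is not sufficient and the affected weights need an explicit conversion step.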