Ersi lig 3912 refactor mae to use timm vit #1461
Conversation
* This is required because torch.no_grad does not change the model configuration, while manually deactivating/activating gradients can have unintended consequences. For example, the MAE ViT positional embeddings are parameters with requires_grad=False that should never receive an update; if we use activate_requires_grad for finetuning, we break those parameters.
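The point above can be sketched in a few lines. `TinyBackbone` below is a hypothetical stand-in for the MAE ViT, and `activate_requires_grad` is a simplified illustration of a blanket re-activation helper, not the library's actual implementation:

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Minimal stand-in for an MAE ViT: one trainable layer plus a
    frozen positional-embedding parameter (requires_grad=False)."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = nn.Linear(4, 4)
        # Fixed positional embedding: a parameter that must never
        # receive gradient updates.
        self.pos_embed = nn.Parameter(torch.zeros(1, 4), requires_grad=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x + self.pos_embed)


def activate_requires_grad(module: nn.Module) -> None:
    """Simplified sketch of a blanket re-activation helper: flips
    requires_grad on for *every* parameter, including frozen ones."""
    for p in module.parameters():
        p.requires_grad = True


model = TinyBackbone()

# torch.no_grad() disables autograd for the forward pass without
# touching any parameter's requires_grad flag.
with torch.no_grad():
    _ = model(torch.randn(2, 4))
assert model.pos_embed.requires_grad is False  # frozen state preserved

# Blanket re-activation, by contrast, flips the frozen positional
# embedding back on, so an optimizer step during finetuning would
# update a parameter that should stay fixed.
activate_requires_grad(model)
assert model.pos_embed.requires_grad is True  # frozen state lost
```

This is why wrapping the forward pass in torch.no_grad is the safer way to freeze a backbone than toggling requires_grad flags and re-enabling them later.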
…om:lightly-ai/lightly into guarin-lig-3056-add-mae-imagenet-benchmark
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff            @@
##           master    #1461     +/-  ##
=========================================
- Coverage   84.35%   82.07%   -2.28%
=========================================
  Files         140      144       +4
  Lines        5802     6065     +263
=========================================
+ Hits         4894     4978      +84
- Misses        908     1087     +179
* modified imagenette benchmark
* formatted
* edited vitb16 benchmark
* added the possibility to handle images of different sizes
* formatted
* removed comments
* revert
* changed import
* initialize class token
* specified that class token should be used
* changed architecture
* addressed comments
* formatted
* Masked vision transformer (#1482)
* added hackathon
* changed comments
* formatted
* addressed comments
* fixed typing
* addressed comments
* added pre-norm and fixed arguments
* added masked vision transformer with Torchvision
* weight initialization
* cleanup
* modified imagenette benchmark
* made mask token optional and adapted benchmarks
* removed unused import
* adapted to dynamic image size
* moved positional embed init to utils
* updated benchmark
* adapted benchmark
* moved mask token to decoder
* revert example
* removed example
* removed file
* inheriting from Module
* reverted dataset paths
* use timm's drop_path_rate
* removed unused import
* removed private method
* changed slicing
* formatted
* path dropout only for fine tune
* formatted
* account for mask token in backbone
* mask token of decoder
* removed appending of mask token in params
Thanks! Left some comments :)
* renamed test class
* fixed imports
* fixed imports
* fixed import
* fixed imports and decreased batch size
* format
* removed comments
* use function defined in utils
* added docstrings
* added docstrings
* added docstring
* formatted
* formatted
* import Tensor
No description provided.