Could the authors share the configs used to produce the AVA v2.2 results in the Masked Autoencoders As Spatiotemporal Learners paper?
Throughout the repo I cannot find any related configs for ViT. The hyperparameters mentioned in the paper (https://arxiv.org/pdf/2205.09113.pdf, Appendix A, Table 6) seem unreasonable to me: with batch size 128, the learning rate is 7.2 for ViT-L with the SGD optimizer.
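For context, unusually large SGD learning rates in video-classification configs often come from the linear scaling rule (lr = base_lr × batch_size / 256). A minimal sketch of that rule, using a hypothetical base_lr (the paper's Table 6 may simply report the already-scaled value):

```python
def scaled_lr(base_lr: float, batch_size: int, base_batch: int = 256) -> float:
    """Effective learning rate under the linear scaling rule
    (lr = base_lr * batch_size / base_batch)."""
    return base_lr * batch_size / base_batch

# Hypothetical example: a base_lr of 14.4 at batch size 128 would give
# the reported lr of 7.2. Whether the paper uses this rule is an assumption.
print(scaled_lr(14.4, 128))  # 7.2
```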
Thanks