Excellent work (mae.ipynb)! #1

@ariG23498 this is fantastic stuff. Super clean, readable, and coherent with the original implementation. A couple of suggestions that would likely make things even better:

- Why not incorporate the random masking into the PatchEncoder itself? That way you could let it accept a test image, apply random masking, and plot it just like the way you are doing in the earlier cells. This way I believe the notebook will be cleaner. (A rough sketch of this idea follows below.)
- AdamW (tfa.optimizers.AdamW) is a better choice when it comes to training Transformer-based models. (See the optimizer snippet below.)

After these points are addressed I will take a crack at porting the training loop to TPUs along with other performance monitoring callbacks.
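A minimal sketch of the first suggestion, assuming MAE-style random masking with a shared learnable mask token. The constructor arguments (num_patches, projection_dim, mask_proportion) and the return values are illustrative assumptions, not the notebook's actual interface:

```python
import tensorflow as tf
from tensorflow.keras import layers


class PatchEncoder(layers.Layer):
    """Projects patches, adds position embeddings, and applies random masking."""

    def __init__(self, num_patches, projection_dim, mask_proportion=0.75, **kwargs):
        super().__init__(**kwargs)
        self.num_patches = num_patches
        self.num_mask = int(mask_proportion * num_patches)
        # Linear projection of flattened patches + learned position embeddings.
        self.projection = layers.Dense(projection_dim)
        self.position_embedding = layers.Embedding(
            input_dim=num_patches, output_dim=projection_dim
        )
        # One learnable token that stands in for every masked patch.
        self.mask_token = self.add_weight(
            name="mask_token",
            shape=(1, projection_dim),
            initializer="random_normal",
            trainable=True,
        )

    def call(self, patches):
        batch_size = tf.shape(patches)[0]
        pos_embeddings = self.position_embedding(tf.range(self.num_patches))
        embeddings = self.projection(patches) + pos_embeddings  # (B, N, D)

        # Shuffle patch indices per image; the first `num_mask` get masked.
        rand_indices = tf.argsort(
            tf.random.uniform(shape=(batch_size, self.num_patches)), axis=-1
        )
        mask_indices = rand_indices[:, : self.num_mask]
        unmask_indices = rand_indices[:, self.num_mask :]

        # Visible patches go to the encoder; masked positions are the shared
        # mask token plus their position embeddings (consumed by the decoder).
        unmasked_embeddings = tf.gather(
            embeddings, unmask_indices, axis=1, batch_dims=1
        )
        masked_embeddings = self.mask_token + tf.gather(
            pos_embeddings, mask_indices, axis=0
        )
        return unmasked_embeddings, masked_embeddings, mask_indices, unmask_indices
```

Because the layer owns the masking, a quick visual sanity check is just a matter of calling it on a single test image's patches and plotting the patches at `unmask_indices`, as the earlier cells already do.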
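And a minimal sketch of the optimizer suggestion via TensorFlow Addons; the learning rate and weight decay below are placeholders, not tuned values from the notebook:

```python
import tensorflow_addons as tfa

# AdamW applies decoupled weight decay (Loshchilov & Hutter), which tends
# to regularize Transformer-based models better than Adam's L2 penalty.
optimizer = tfa.optimizers.AdamW(
    learning_rate=1e-3,  # often paired with warmup + cosine decay in practice
    weight_decay=1e-4,
)
# mae_model.compile(optimizer=optimizer, loss="mse")  # `mae_model` is hypothetical
```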
Comments
This makes sense. I will try incorporating this.
Noted!
Yes, you are right! You will find the correct loss implementation in the TODO:
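The linked TODO is not quoted in the thread; for context, here is a hedged sketch of the reconstruction loss MAE conventionally uses, i.e. mean squared error computed only over the masked patches (tensor names and shapes are assumptions):

```python
import tensorflow as tf

def masked_mse(original_patches, reconstructed_patches, mask_indices):
    """MSE restricted to the masked patch positions.

    original_patches, reconstructed_patches: (batch, num_patches, patch_dim)
    mask_indices: (batch, num_masked) integer indices of the masked patches
    """
    target = tf.gather(original_patches, mask_indices, axis=1, batch_dims=1)
    pred = tf.gather(reconstructed_patches, mask_indices, axis=1, batch_dims=1)
    return tf.reduce_mean(tf.square(target - pred))
```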
Can't wait for this baby to train! 😃
Taking a look now.
@ariG23498
I have updated the TODO accordingly. |
@sayakpaul I have pushed a single notebook.
Closing this. |