Understanding the Difficulty of Training Transformers
Updated May 31, 2022 · Python
Simple implementation of the LSUV initialization in keras
Simple implementation of the LSUV initialization in PyTorch
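The two LSUV repositories above implement Layer-Sequential Unit-Variance initialization (Mishkin & Matas, 2015). Not taken from either repo: a minimal NumPy sketch of the core LSUV step, which rescales a layer's weights until its output variance on a data batch is close to one (the full method also starts from an orthonormal initialization, which this sketch skips):

```python
import numpy as np

def lsuv_scale(W, x, tol=0.01, max_iter=10):
    """LSUV-style rescaling: divide the weight matrix by the
    standard deviation of the layer's batch output until the
    output variance is within `tol` of 1."""
    W = W.copy()
    for _ in range(max_iter):
        out = x @ W.T            # linear layer pre-activation
        var = out.var()
        if abs(var - 1.0) < tol:
            break
        W /= np.sqrt(var)        # scaling W scales out's std linearly
    return W

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 100))  # a batch of 256 inputs
W = rng.normal(size=(50, 100))   # naive Gaussian init
W = lsuv_scale(W, x)             # output variance is now ≈ 1
```

In the full method this rescaling is applied layer by layer, each time feeding the batch through all previously adjusted layers.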
Class Normalization for Continual Zero-Shot Learning
Short description for quick search
[NeurIPS 2022] Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again by Ajay Jaiswal*, Peihao Wang*, Tianlong Chen, Justin F Rousseau, Ying Ding, Zhangyang Wang
This repository contains numerical experiments with convolutional neural networks, in particular for testing initialization procedures. The training data is the CIFAR10 dataset and the CNN architecture is FitNet-1 from Romero et al., "FitNets: Hints for Thin Deep Nets," arXiv:1412.6550, 2014.
Structured Initialization for Attention in Vision Transformers
Odoo add-on that processes the content of one or more configuration folders.
Warmup initialisation procedure for RNNs
Repo for the course project
Used to run some very basic setup steps on initial VM startup.
My solution to an assignment on neural network initializers and optimizers. Covers some of the most popular approaches, including Xavier/He initialization and the SGD, Momentum, AdaGrad, AdaDelta, and Adam optimizers.
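Not from that repository: a minimal NumPy sketch of the two initialization schemes it names, plus a classical-momentum update step, to show the standard formulas (Glorot & Bengio, 2010; He et al., 2015):

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    # Glorot/Xavier uniform: limit = sqrt(6 / (fan_in + fan_out)),
    # keeps forward and backward variance roughly balanced
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

def he_normal(fan_in, fan_out, rng):
    # He/Kaiming normal for ReLU layers: std = sqrt(2 / fan_in)
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

def sgd_momentum(w, grad, velocity, lr=0.01, beta=0.9):
    # classical momentum: v <- beta * v - lr * grad;  w <- w + v
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

rng = np.random.default_rng(0)
W1 = xavier_uniform(784, 256, rng)   # e.g. an MNIST-sized hidden layer
W2 = he_normal(256, 10, rng)         # ReLU-friendly output-layer init
```

The layer sizes are illustrative, not taken from the assignment.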