PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth"
Files in this repo:

- Plots/
- LICENSE
- ReZNet-6x_faster_ResNet_training_via_ReZero.ipynb
- ReZero-Deep_Fast_NeuralNetwork.ipynb
- ReZero-Deep_Fast_Transformer.ipynb
- requirements.txt

This repo contains examples demonstrating the power of the ReZero architecture; see the paper.
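At its core, ReZero adds a single trainable scalar per residual block, initialized to zero, so every layer starts out as the identity map and the network can train stably even at large depth. A minimal PyTorch sketch of this idea (the module and names here are illustrative, not code from this repo):

```python
import torch
import torch.nn as nn

class ReZeroBlock(nn.Module):
    """Illustrative ReZero residual block: x + alpha * F(x)."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        # The single trainable residual weight, initialized to zero,
        # so the block computes the identity at initialization.
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return x + self.alpha * torch.relu(self.fc(x))

block = ReZeroBlock(16)
x = torch.rand(4, 16)
out = block(x)  # equals x at initialization, since alpha == 0
```

During training, each alpha is learned along with the other parameters, letting every block gradually "switch on" its residual branch.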

The official ReZero repo is here.


Final validation errors: Vanilla 7.74, FixUp 7.5, ReZero 6.38.


If you find ReZero or a similar architecture improves the performance of your application, you are invited to share a demonstration here.


To install ReZero via pip, run: pip install rezero


We provide custom ReZero Transformer layers (RZTX).

For example, this will create a Transformer encoder:

import torch
import torch.nn as nn
from rezero.transformer import RZTXEncoderLayer

# Drop-in replacement for nn.TransformerEncoderLayer, using ReZero residual connections
encoder_layer = RZTXEncoderLayer(d_model=512, nhead=8)
transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
src = torch.rand(10, 32, 512)  # (sequence length, batch size, d_model)
out = transformer_encoder(src)


If you find rezero useful for your research, please cite our paper:

@inproceedings{bachlechner2020rezero,
    title = "ReZero is All You Need: Fast Convergence at Large Depth",
    author = "Bachlechner, Thomas and
      Majumder, Bodhisattwa Prasad and
      Mao, Huanru Henry and
      Cottrell, Garrison W. and
      McAuley, Julian",
    booktitle = "arXiv",
    year = "2020",
    url = ""
}