PyTorch implementation of the paper Dynamic Routing Between Capsules by Sara Sabour, Nicholas Frosst and Geoffrey Hinton

Capsule Network


PyTorch implementation of the following paper: Dynamic Routing Between Capsules by Sara Sabour, Nicholas Frosst and Geoffrey Hinton

Official implementation

Visual representation


Image source: Mike Ross, A Visual Representation of Capsule Network Computations

Run the experiment

  • For details, run python --help

Example of reconstructed vs. original images



Default hyper-parameters (similar to the paper):

  • Per-GPU batch_size = 128
  • Initial learning_rate = 0.001
  • Exponential lr_decay = 0.96
  • Number of routing iterations (num_routing) = 3
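
As a rough illustration of how the initial learning_rate and the exponential lr_decay listed above interact, the sketch below computes the learning rate after a given number of decay steps. It assumes the decay is applied once per epoch, which this README does not state; the function name lr_at is hypothetical and is not taken from the repository.

```python
# Minimal sketch of exponential learning-rate decay.
# Assumption: one decay step per epoch (not confirmed by the README).

def lr_at(step, learning_rate=0.001, lr_decay=0.96):
    """Return the learning rate after `step` decay steps."""
    return learning_rate * (lr_decay ** step)


if __name__ == "__main__":
    # Show how the rate shrinks over the first few decay steps.
    for step in (0, 1, 10, 50):
        print(f"step {step:2d}: lr = {lr_at(step):.6f}")
```

With these defaults the rate drops by 4% at every decay step, so it shrinks slowly at first and compounds over a long run.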

Loss function hyper-parameters (see

  • Lambda for Margin Loss = 0.5
  • Scaling factor for reconstruction loss = 0.0005
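
To show where the two values above enter the loss, here is a minimal PyTorch sketch of the margin loss with a scaled reconstruction term, following the formulation in the paper. The margins m_plus = 0.9 and m_minus = 0.1 come from the paper, not from this README. The function names, tensor shapes, and the choice to sum over classes and pixels and then average over the batch are assumptions and may differ from the repository's code.

```python
import torch


def margin_loss(lengths, targets, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss on capsule lengths.

    lengths: (batch, num_classes) norms of the output capsules.
    targets: (batch, num_classes) one-hot labels as floats.
    lam:     down-weighting of the loss for absent classes (0.5 above).
    """
    present = targets * torch.clamp(m_plus - lengths, min=0.0) ** 2
    absent = lam * (1.0 - targets) * torch.clamp(lengths - m_minus, min=0.0) ** 2
    # Assumption: sum over classes, then average over the batch.
    return (present + absent).sum(dim=1).mean()


def total_loss(lengths, targets, reconstruction, images, recon_scale=0.0005):
    """Margin loss plus the scaled sum-of-squares reconstruction loss."""
    flat = images.reshape(images.size(0), -1)
    recon = ((reconstruction - flat) ** 2).sum(dim=1).mean()
    return margin_loss(lengths, targets) + recon_scale * recon


if __name__ == "__main__":
    lengths = torch.tensor([[0.0, 1.0]])
    targets = torch.tensor([[1.0, 0.0]])
    print(margin_loss(lengths, targets).item())
```

The small scaling factor keeps the reconstruction term from dominating the margin loss during training, which is the motivation given in the paper.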

GPU Speed benchmarks:

(with the above-mentioned hyper-parameters)

  • Single GeForce GTX 1080Ti - 35.6s per epoch
  • Two GeForce GTX 1080Ti - 35.8s per epoch (twice the batch size -> half the iterations)