Skip to content

Felix-Petersen/isaac

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

isaac Newton - Accelerating NN Training with Input-based Approximate Curvature for Newton's Method

This repository includes the official implementation of our ICLR 2023 Paper "ISAAC Newton: Input-based Approximate Curvature for Newton's Method".

Paper @ OpenReview

Video @ Youtube

video

💻 Installation

isaac is based on PyTorch and can be installed via pip from PyPI with

pip install isaac

👩‍💻 Usage

isaac.Linear acts as a drop-in replacement for torch.nn.Linear. It only requires additional specification of the regularization parameter $\lambda_a$ la as specified in the paper. A good starting point for $\lambda_a$ is la=1, but the optimal choice varies from experiment to experiment. The method operates by efficiently modifying the gradient of the module in such a way that the input-based curvature information is used when applying a gradient descent optimizer on the modified gradients.

In the following, we specify an example MNIST neural network where ISAAC is applied to the first 3 out of 5 layers:

import torch
import isaac

net = torch.nn.Sequential(
    torch.nn.Flatten(),
    isaac.Linear(784, 1_600, la=1),
    torch.nn.ReLU(),
    isaac.Linear(1_600, 1_600, la=1),
    torch.nn.ReLU(),
    isaac.Linear(1_600, 1_600, la=1),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 1_600),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 10)
)

🧪 Experiments

You can find an example MNIST experiment in examples/mnist.py, which is based on the experiment in Figure 5 in the paper.

To run ISAAC applied to the first X out of 5 layers, run

python examples/mnist.py -nil <X>

To run the baseline, run

python examples/mnist.py -nil 0

The device can be specified, e.g., as --device cuda, the learning rate and $\lambda_a$ may be set via --lr and --la, respectively.

📖 Citing

@inproceedings{petersen2023isaac,
  title={ISAAC Newton: Input-based Approximate Curvature for Newton's Method},
  author={Petersen, Felix and Sutter, Tobias and Borgelt, Christian and Huh, Dongsung and Kuehne, Hilde and Sun, Yuekai and Deussen, Oliver},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2023}
}

License

isaac is released under the MIT license. See LICENSE for additional details about it.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages