This repository contains the official implementation of our ICLR 2023 paper "ISAAC Newton: Input-based Approximate Curvature for Newton's Method".
Paper @ OpenReview
Video @ YouTube
isaac is based on PyTorch and can be installed via pip from PyPI with
pip install isaac
isaac.Linear acts as a drop-in replacement for torch.nn.Linear. It only requires one additional argument, the regularization parameter la, as specified in the paper. A good starting point is la=1, but the optimal choice varies from experiment to experiment.
The method operates by efficiently modifying the gradient of the module in such a way that the input-based curvature
information is used when applying a gradient descent optimizer on the modified gradients.
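As a rough illustration of what such a gradient modification could look like, the sketch below preconditions the weight gradient of a linear layer with a regularized second moment of the layer inputs. It is not taken from the isaac source: the function name input_curvature_precondition, the exact scaling by la, and the use of the Woodbury identity to solve a batch-sized system instead of a feature-sized one are assumptions made for this example. Please refer to the paper for the actual update rule.

```python
import torch


def input_curvature_precondition(grad_weight, x, la):
    """Hypothetical helper: precondition a weight gradient with input curvature.

    grad_weight: (out_features, in_features) gradient of a linear layer
    x:           (batch, in_features) inputs to that layer
    la:          regularization parameter, la > 0

    Returns la * grad_weight @ inv(x.T @ x / batch + la * I), computed
    through a (batch x batch) linear system via the Woodbury identity.
    """
    b = x.shape[0]
    eye_b = torch.eye(b, dtype=x.dtype, device=x.device)
    # Scaled Gram matrix of the inputs, shape (batch, batch).
    gram = x @ x.T / (b * la)
    # Project the gradient onto the batch of inputs, shape (out_features, batch).
    gx = grad_weight @ x.T
    # gx @ inv(I + gram); the matrix is symmetric, so solving with the
    # transposed right-hand side and transposing back is equivalent.
    correction = torch.linalg.solve(eye_b + gram, gx.T).T
    # Woodbury: inv(I + x.T x / (b la)) = I - x.T inv(I + x x.T / (b la)) x / (b la)
    return grad_weight - correction @ x / (b * la)
```

For very large la the correction vanishes and the plain gradient is recovered, which matches the role of la as a regularizer. The batch-sized solve is cheaper than a feature-sized inverse whenever the batch is smaller than the number of input features.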
In the following, we specify an example MNIST neural network where ISAAC is applied to the first 3 out of 5 layers:
import torch
import isaac
net = torch.nn.Sequential(
    torch.nn.Flatten(),
    isaac.Linear(784, 1_600, la=1),
    torch.nn.ReLU(),
    isaac.Linear(1_600, 1_600, la=1),
    torch.nn.ReLU(),
    isaac.Linear(1_600, 1_600, la=1),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 1_600),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 10)
)
You can find an example MNIST experiment in examples/mnist.py, which is based on the experiment in Figure 5 in the paper.
To run ISAAC applied to the first X out of 5 layers, run
python examples/mnist.py -nil <X>
To run the baseline, run
python examples/mnist.py -nil 0
The device can be specified, e.g., as --device cuda, and the learning rate and the regularization parameter via --lr and --la, respectively.
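To make the meaning of -nil concrete, here is a minimal sketch of how a network with ISAAC on the first X out of 5 layers could be assembled. It is not the code from examples/mnist.py: the function build_net and its parameters num_isaac_layers, isaac_linear, and hidden are hypothetical names chosen for this example. The ISAAC layer class is passed in as an argument and is assumed to be callable as isaac_linear(in_features, out_features, la=la), as in the network definition above.

```python
import torch


def build_net(num_isaac_layers, isaac_linear, la=1.0, hidden=1_600):
    """Hypothetical builder for the 5-layer MNIST MLP.

    The first `num_isaac_layers` linear layers use `isaac_linear`
    (e.g., isaac.Linear); the remaining ones use torch.nn.Linear.
    """
    if not 0 <= num_isaac_layers <= 5:
        raise ValueError("num_isaac_layers must be between 0 and 5")
    sizes = [784, hidden, hidden, hidden, hidden, 10]
    layers = [torch.nn.Flatten()]
    for i in range(5):
        if i < num_isaac_layers:
            # ISAAC layer with regularization parameter la.
            layers.append(isaac_linear(sizes[i], sizes[i + 1], la=la))
        else:
            # Plain linear layer without curvature information.
            layers.append(torch.nn.Linear(sizes[i], sizes[i + 1]))
        if i < 4:
            # ReLU after every layer except the output layer.
            layers.append(torch.nn.ReLU())
    return torch.nn.Sequential(*layers)
```

With the package installed, build_net(3, isaac.Linear) would give the same architecture as the network shown earlier, and build_net(0, isaac.Linear) would give the baseline without any ISAAC layers.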
@inproceedings{petersen2023isaac,
title={ISAAC Newton: Input-based Approximate Curvature for Newton's Method},
author={Petersen, Felix and Sutter, Tobias and Borgelt, Christian and Huh, Dongsung and Kuehne, Hilde and Sun, Yuekai and Deussen, Oliver},
booktitle={International Conference on Learning Representations (ICLR)},
year={2023}
}
isaac is released under the MIT license. See LICENSE for additional details.