This repository includes the official implementation of our ICLR 2023 Paper "ISAAC Newton: Input-based Approximate Curvature for Newton's Method".
Paper @ OpenReview
Video @ Youtube
isaac is based on PyTorch and can be installed via pip from PyPI with
pip install isaacisaac.Linear acts as a drop-in replacement for torch.nn.Linear. It only requires additional specification of
the regularization parameter la as specified in the paper. A good starting point for la=1, but the optimal choice varies from experiment to experiment.
The method operates by efficiently modifying the gradient of the module in such a way that the input-based curvature
information is used when applying a gradient descent optimizer on the modified gradients.
In the following, we specify an example MNIST neural network where ISAAC is applied to the first 3 out of 5 layers:
import torch
import isaac
net = torch.nn.Sequential(
torch.nn.Flatten(),
isaac.Linear(784, 1_600, la=1),
torch.nn.ReLU(),
isaac.Linear(1_600, 1_600, la=1),
torch.nn.ReLU(),
isaac.Linear(1_600, 1_600, la=1),
torch.nn.ReLU(),
torch.nn.Linear(1_600, 1_600),
torch.nn.ReLU(),
torch.nn.Linear(1_600, 10)
)You can find an example MNIST experiment in examples/mnist.py, which is based on the experiment in Figure 5 in the paper.
To run ISAAC applied to the first X out of 5 layers, run
python examples/mnist.py -nil <X>To run the baseline, run
python examples/mnist.py -nil 0The device can be specified, e.g., as --device cuda, the learning rate and --lr and --la, respectively.
@inproceedings{petersen2023isaac,
title={ISAAC Newton: Input-based Approximate Curvature for Newton's Method},
author={Petersen, Felix and Sutter, Tobias and Borgelt, Christian and Huh, Dongsung and Kuehne, Hilde and Sun, Yuekai and Deussen, Oliver},
booktitle={International Conference on Learning Representations (ICLR)},
year={2023}
}isaac is released under the MIT license. See LICENSE for additional details about it.

