- Research repo for a deep dive into a couple of zeroth-order optimization techniques.
- We explore how directional sharpness can provide actionable optimization insights.
- Memory-efficient zeroth-order optimizer (MeZO). Paper; official repo.
- Smart evolutionary strategy (SmartES).
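
To give some intuition for the zeroth-order idea behind MeZO, here is a minimal pure-Python sketch of a two-point (SPSA-style) update on a toy quadratic. It is not the code in this repo: the names `loss`, `perturb`, and `zo_step` are hypothetical, and the toy loss stands in for a real model. The direction `z` is regenerated from a seed rather than stored, which is the memory-saving trick the method is known for.

```python
import random

def loss(theta):
    # Toy quadratic with its minimum at (1, -2); a stand-in for a model loss.
    return (theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2

def perturb(theta, seed, scale):
    # Regenerate the same Gaussian direction z from the seed and
    # add scale * z to theta in place, so z is never stored.
    rng = random.Random(seed)
    for i in range(len(theta)):
        theta[i] += scale * rng.gauss(0.0, 1.0)

def zo_step(theta, lr=0.05, eps=1e-3, seed=0):
    perturb(theta, seed, +eps)            # theta + eps * z
    loss_plus = loss(theta)
    perturb(theta, seed, -2.0 * eps)      # theta - eps * z
    loss_minus = loss(theta)
    perturb(theta, seed, +eps)            # back to the original theta
    # Finite-difference estimate of the directional derivative along z.
    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)
    perturb(theta, seed, -lr * projected_grad)  # theta -= lr * g * z
    return projected_grad

theta = [0.0, 0.0]
initial = loss(theta)
for step in range(2000):
    zo_step(theta, seed=step)  # a fresh direction at every step
final = loss(theta)
```

Only forward evaluations of `loss` are used, so no gradients or activations need to be kept in memory. The price is a noisier update than backpropagation would give.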
Define a config YAML file in the config directory, then run:
python main.py --config $path --epochs $n
More options are available (see the main file).
Simply copy mezo.py into your repo and import the optimizer.
import torch
from zeroptim.optim.mezo import MeZO
opt = MeZO(torch.optim.SGD(model.parameters(), lr=0.05), eps=1e-3)
opt = MeZO(torch.optim.AdamW(model.parameters(), lr=0.005), eps=1e-3)
Work in progress. May have bugs. Use at your discretion.
# in torch.autograd.functional
L358: def jvp(func, inputs, v=None, create_graph=False, strict=False, **kwargs):
L437: outputs = func(*inputs, **kwargs)
# and same for vhv!
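
As a rough illustration of directional sharpness, assuming the term means the curvature v^T H v of the loss along a unit direction v, the sketch below estimates it with a central second difference in pure Python. The helper `directional_sharpness` and the toy loss are hypothetical and do not use the `torch.autograd.functional` routines mentioned above.

```python
import math

def loss(theta):
    # Toy quadratic 0.5 * theta^T H theta with H = diag(1, 10).
    return 0.5 * (1.0 * theta[0] ** 2 + 10.0 * theta[1] ** 2)

def directional_sharpness(loss_fn, theta, direction, eps=1e-3):
    # Normalize the direction so the result is v^T H v for a unit vector v.
    norm = math.sqrt(sum(d * d for d in direction))
    v = [d / norm for d in direction]
    plus = [t + eps * d for t, d in zip(theta, v)]
    minus = [t - eps * d for t, d in zip(theta, v)]
    # Central second difference: (L(θ+εv) - 2 L(θ) + L(θ-εv)) / ε².
    return (loss_fn(plus) - 2.0 * loss_fn(theta) + loss_fn(minus)) / eps ** 2

theta = [0.3, -0.2]
flat = directional_sharpness(loss, theta, [1.0, 0.0])   # low-curvature axis
sharp = directional_sharpness(loss, theta, [0.0, 1.0])  # high-curvature axis
```

On this toy loss the two estimates recover the diagonal entries of H (about 1 and 10). A direction with a large value is one where a given step size is more likely to overshoot.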