Description
For Physics-Informed Neural Networks (PINNs), switching to L-BFGS (from Adam-type optimizers) at later stages of training often seems to help final convergence. I know it is probably not in the plans of the MLX team to implement L-BFGS under mlx.optimizers, but I would be interested in giving it a try by coding basic functionality using the Python API.
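For context, the core of L-BFGS is the two-loop recursion, which builds a quasi-Newton search direction from the last m curvature pairs (s_i, y_i) without ever forming a Hessian. A minimal sketch of what I have in mind, using NumPy as a stand-in for mx.array (the function name `lbfgs_direction` is mine, not an MLX API):

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: approximate -H^{-1} @ grad from stored
    curvature pairs s_i = x_{i+1} - x_i, y_i = g_{i+1} - g_i."""
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        q -= alpha * y
        alphas.append(alpha)
    # Initial inverse-Hessian scaling gamma = (s_k . y_k) / (y_k . y_k).
    if s_list:
        gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q
    # Second loop: oldest pair to newest (alphas were stored newest-first).
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return -r  # descent direction
```

With empty history this reduces to plain steepest descent (-grad), which is the usual cold start.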
The key challenge is maintaining compatibility with mx.compile.
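My first thought for staying compile-friendly is to keep all optimizer state in fixed-shape buffers and write every update as a pure function (state in, new state out), since tracing compilers generally dislike Python-side mutation and shape changes across steps. A rough sketch of that pattern, again with NumPy standing in for MLX arrays (`LBFGSState` and `push_pair` are hypothetical names of mine):

```python
import numpy as np
from typing import NamedTuple

class LBFGSState(NamedTuple):
    # Fixed-size circular buffers keep array shapes static across steps,
    # which a tracing compiler like mx.compile generally requires.
    s_buf: np.ndarray   # (m, n): past parameter differences
    y_buf: np.ndarray   # (m, n): past gradient differences
    count: int          # number of valid pairs currently stored

def push_pair(state: LBFGSState, s: np.ndarray, y: np.ndarray) -> LBFGSState:
    """Pure update: shift the history buffers and return a NEW state.
    np.roll returns a copy, so the input state is never mutated."""
    s_buf = np.roll(state.s_buf, -1, axis=0)
    y_buf = np.roll(state.y_buf, -1, axis=0)
    s_buf[-1] = s
    y_buf[-1] = y
    return LBFGSState(s_buf, y_buf, min(state.count + 1, s_buf.shape[0]))
```

The `count` field lets the direction computation ignore the zero-filled rows before the history is full, instead of growing a Python list.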
Before getting started, I would be grateful if @awni could give me a high-level skeleton of the steps I should follow to make the implementation as efficient as possible, given that L-BFGS involves iterative, stateful updates (tracking curvature approximations and doing line searches) that don't always fit neatly into a compiled, pure-function framework. Or, if it is simply not a good idea to go down this path, please let me know so I can avoid wasting time.
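On the line-search point: I would start with simple backtracking under the Armijo sufficient-decrease condition, since it needs only function evaluations and a fixed iteration cap (which seems easier to reconcile with compilation than a Wolfe search). Something along these lines, as a NumPy sketch (`backtracking_line_search` is a name I made up):

```python
import numpy as np

def backtracking_line_search(f, x, grad, direction, c1=1e-4, tau=0.5, max_iter=20):
    """Shrink step size t until the Armijo condition holds:
    f(x + t*d) <= f(x) + c1 * t * <grad, d>."""
    t = 1.0
    fx = f(x)
    slope = float(np.dot(grad, direction))  # negative for a descent direction
    for _ in range(max_iter):
        if f(x + t * direction) <= fx + c1 * t * slope:
            return t
        t *= tau  # backtrack
    return t  # fall back to the smallest step tried
```

The fixed `max_iter` bound means the loop could in principle be unrolled or expressed as a bounded scan inside a compiled function, rather than a data-dependent Python `while`.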