apply linearise in init for Normal and iterative solvers #198
Conversation
This looks reasonable to me! ...although we appear to have a lot of test failures (not just the ongoing LSMR thing)?
I think these are fixed by #200, just not merged here (this PR was the trigger to go and fix it, I think! 😅). If you're happy with #200 I can rebase the other PRs accordingly. Happy to take a relay approach here where we iterate and merge a PR or two at a time rather than one big sweep, just trying to break up the cognitive load into manageable chunks for you. 😊 #200 is definitely the candidate for first merge right now. I can let you know which are next most ready as we go. Sorry about the avalanche; when you get deep in the weeds these things all knot together 😅
I tried updating the optimistix Levenberg-Marquardt benchmarks to use the …

In summary, the …

How do you suggest we handle this (options in order of decreasing user control/visibility)?
My preference would probably be 2 or 4. Even if often more performant, I think materialising for iterative solvers would be a surprise to users looking to minimise their memory requirements. Furthermore, I think the memory cost of the materialised rectangular matrix would be transient, giving more weight to option 4.

For these options, we would need to come up with a nice way of detecting an iterative solver. This could feasibly be "whether it has …".

I've intentionally left out the option of giving the …
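As one illustration of the detection question, here is a minimal sketch of an attribute-based heuristic. All the names here (`Cholesky`, `LSMR`, `max_steps`, `is_iterative`) are hypothetical stand-ins for this example, not lineax's actual API:

```python
# Hypothetical sketch only: these classes are stand-ins, not lineax's API.
class Cholesky:
    """Stand-in for a direct solver: factorises once, no iteration count."""


class LSMR:
    """Stand-in for an iterative solver: runs up to `max_steps` iterations."""

    def __init__(self, max_steps: int = 100):
        self.max_steps = max_steps


def is_iterative(solver) -> bool:
    # "Whether it has max_steps" is one feasible (if fragile) heuristic.
    return hasattr(solver, "max_steps")


print(is_iterative(Cholesky()), is_iterative(LSMR()))  # False True
```

An `isinstance` check against an abstract base class would be more robust than `hasattr`, but requires the library to expose such a class.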
IIUC, the main concern here is that Normal+Cholesky (on presumably a matrix-free operator representation, though I don't think you specify) is unnecessarily slow, in light of the fact that …

What I don't understand is why …

I have one guess, which is that …

Do you have a better idea of what's going on / perhaps an MWE of the above sort, if my guess is correct?

Different topic: I think we're happy with this PR / shall I merge?
Yes, exactly: lineax can't tell that `op.T` and `op` are related, so the linearised function is vmapped over twice. The MWE I'm using is just the optimistix Levenberg-Marquardt benchmarks in the repo, but swapping out your custom `NormalCholesky` for `Normal(Cholesky)`. This PR is indeed a net improvement over lineax main right now, so feel free to merge if you like, but using `materialise` instead of `linearise` on …
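For context on what `linearise` buys here, a minimal JAX sketch (a toy function, not the benchmark itself): `jax.linearize` evaluates the primal once and returns a linear map that can be applied repeatedly without re-running it.

```python
import jax
import jax.numpy as jnp


def f(x):
    # Stand-in for an expensive primal computation.
    return jnp.tanh(x) * jnp.sum(x**2)


x = jnp.arange(3.0)
y, f_jvp = jax.linearize(f, x)  # primal computation runs once, here

# Every subsequent matrix-vector product reuses the cached linearisation;
# a fresh jax.jvp call would re-evaluate the primal each time instead.
v1 = f_jvp(jnp.ones(3))
v2 = f_jvp(jnp.array([1.0, 0.0, -1.0]))
```

The point being discussed is that when `op` and `op.T` are handled independently, this caching happens twice, once for each direction.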
Okay! In that case let's merge this one 🎉

For a follow-up, then maybe it would suffice for …

Alternatively, my guess is that this isn't an optimization that belongs in …

These are just some guesses though, and ultimately I don't really have strong feelings. I'd even be happy with just special-casing …
This is not without precedent, as it is done already in `CG`. The motivation is that both `Normal` and iterative solvers employ multiple matrix-vector products in sequence that cannot be parallelised, in which case calling `linearise` to cache the primal computation for `JacobianLinearOperator`s should be more efficient. This is essentially hiding some complexity and decision-making stress from users, at the cost of reduced control if there really is a case where not caching the primal computation saves significant memory. Typically, when memory is the bottleneck, iterative solvers are the go-to, and I can't really envisage a case where you could run e.g. `LSMR` with a `JacobianLinearOperator` but not with a `FunctionLinearOperator`, but I may be way off the mark.
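As a sketch of the sequential-mv pattern being described (a toy residual, not the actual `Normal` implementation): once the primal is cached via `jax.linearize`, both the matrix-vector product and its transpose can share the same linearisation, so the normal-equations product `A^T (A v)` never re-evaluates the primal.

```python
import jax
import jax.numpy as jnp


def residual(x):
    # Toy nonlinear function whose Jacobian A plays the operator's role.
    return jnp.stack([x[0] * x[1], x[0] + 3.0 * x[1], jnp.sin(x[0])])


x0 = jnp.array([0.5, 2.0])
_, A_mv = jax.linearize(residual, x0)  # primal evaluated once
A_tmv = jax.linear_transpose(A_mv, x0)  # transpose shares that linearisation

# Two mvs in sequence (they cannot be parallelised: the second
# consumes the first's output), with no primal re-evaluation:
v = jnp.array([1.0, 2.0])
(ATAv,) = A_tmv(A_mv(v))
```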
If you'd rather not do this, should we remove `linearise` from `CG`?