
FunctionOperator caches prototypes of the input and output #145

Closed
sbrisard opened this issue Feb 2, 2023 · 13 comments · Fixed by #150

Comments

sbrisard commented Feb 2, 2023

FunctionOperator structures have a cache = (in, out) member, where in and out are prototypes of the input and output.
These prototypes are used in * and \

https://github.com/SciML/SciMLOperators.jl/blob/master/src/func.jl#L297-L307

This seems to me unnecessarily costly. Am I missing something?

vpuri3 (Member) commented Feb 2, 2023

A FunctionOperator L has two important boolean parameters, {iip, oop}, which stand for is-in-place and out-of-place. iip = true means you can call L(v, u, p, t), equivalent to mul!(v, L, u); oop = true means you can call v = L(u, p, t), equivalent to v = L * u.

The methods you highlighted are for FunctionOperator{true,false}, i.e. one that has an in-place method but no out-of-place method. So we're cheekily forming a new array (using the cache as a prototype) and passing it to the in-place method to write the output into.

function Base.:*(L::FunctionOperator{true,false}, u::AbstractVecOrMat)
    _, co = L.cache
    du = zero(co)              # allocate a fresh output from the cached prototype
    L.op(du, u, L.p, L.t)      # write the result into du using the in-place function
    du
end
function Base.:\(L::FunctionOperator{true,false}, u::AbstractVecOrMat)
    ci, _ = L.cache
    du = zero(ci)
    L.op_inverse(du, u, L.p, L.t)
    du
end
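For concreteness, here is a minimal sketch (added for illustration, not from the thread) of the two calling conventions and of the {true,false} fallback, using a hypothetical doubling operator. The constructor form follows the FunctionOperator(func, u_in, u_out) call that appears later in this thread; whether iip/oop are inferred automatically from the supplied function is an assumption here.

using SciMLOperators, LinearAlgebra

# Hypothetical doubling operator: only an in-place function is supplied,
# so this corresponds to the FunctionOperator{true,false} case above.
f_iip!(v, u, p, t) = (v .= 2 .* u; nothing)

u = rand(4)
L = FunctionOperator(f_iip!, u, u)   # prototypes u_in = u_out = u supply size and eltype

v = similar(u)
mul!(v, L, u)   # in-place path: calls f_iip!(v, u, p, t)
w = L * u       # no out-of-place function: falls back to the cached-prototype method quoted above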

vpuri3 (Member) commented Feb 2, 2023

When we have out-of-place methods, we simply use them:

Base.:*(L::FunctionOperator{iip,true}, u::AbstractVecOrMat) where {iip} = L.op(u, L.p, L.t)
Base.:\(L::FunctionOperator{iip,true}, u::AbstractVecOrMat) where {iip} = L.op_inverse(u, L.p, L.t)

And for the non-allocating mul! and ldiv!, we necessarily need a cache.

sbrisard (Author) commented Feb 3, 2023

I understand how these cached arrays are used, but I fail to see why they are needed.
I intend to implement operators that operate on tensor fields in 3D, so ci and co would be vectors of size 6 (number of tensor components) * 1024^3 (size of the grid). It would be very expensive to store two of those guys just for the sake of having a prototype to allocate new vectors.

Would it not be more efficient to just store the size and element-type of ci and co? If you still want to have some freedom with the type of the returned vector, you could pass user-provided functions that create the required vectors on demand.

Does that make sense?

To make things clearer: why not store an allocator, rather than a prototype?
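A minimal sketch of the allocator idea being proposed (purely illustrative; LazyCacheOperator, alloc_in, and alloc_out are hypothetical names, not part of SciMLOperators):

struct LazyCacheOperator{F,Ai,Ao}
    op::F
    alloc_in::Ai     # () -> new input-shaped array, e.g. () -> zeros(6 * 1024^3)
    alloc_out::Ao    # () -> new output-shaped array
end

function apply(L::LazyCacheOperator, u, p, t)
    du = L.alloc_out()   # allocate on demand instead of storing a full prototype
    L.op(du, u, p, t)
    return du
end

This way the operator stores only two small closures (or a size and an eltype) rather than two grid-sized arrays.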

vpuri3 (Member) commented Feb 3, 2023

  1. The caches are used to provide a non-allocating 5-argument mul! (a short usage sketch follows this list):

     function LinearAlgebra.mul!(v::AbstractVecOrMat, L::FunctionOperator{true}, u::AbstractVecOrMat, α, β)
         _, co = L.cache
         copy!(co, v)       # stash the old v
         mul!(v, L, u)      # v ← L * u
         lmul!(α, v)        # v ← α * v
         axpy!(β, co, v)    # v ← v + β * old_v
     end

  2. And to provide a non-allocating 2-argument ldiv!:

     function LinearAlgebra.ldiv!(L::FunctionOperator{true}, u::AbstractVecOrMat)
         ci, _ = L.cache
         copy!(ci, u)       # stash u
         ldiv!(u, L, ci)    # u ← L \ u
     end
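For concreteness, a hedged usage sketch of what these two methods provide to downstream code (the doubling/halving operator is hypothetical, and the op_inverse keyword is assumed from the field name used in the code above):

using SciMLOperators, LinearAlgebra

f!(v, u, p, t)     = (v .= 2 .* u; nothing)   # hypothetical operator action
f_inv!(v, u, p, t) = (v .= u ./ 2; nothing)   # its inverse

u = rand(4)
L = FunctionOperator(f!, u, u; op_inverse = f_inv!)

v = copy(u)
mul!(v, L, u, 2.0, 0.5)   # v ← 2.0*(L*u) + 0.5*old_v, non-allocating via the cache
ldiv!(L, u)               # u ← L \ u in place, non-allocating via the cache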

sbrisard (Author) commented Feb 3, 2023

I see. Thanks for this clarification. I realize now that I did not read your first answer correctly.
So if I want to avoid these cached variables, I need to provide both an in-place and an out-of-place method, right?
This makes perfect sense.
I guess this issue should be closed, then.

vpuri3 (Member) commented Feb 3, 2023

No, even if you have both an in-place L(v, u, p, t) and an out-of-place v = L(u, p, t) method, you still need the cache to support the 5-argument mul!(v, L, u, α, β) and the two-argument ldiv!(L, u). These methods are used by almost every package out there: Krylov solvers, nonlinear solvers, everything. That's why the cache is important for interoperability.

Usually, the memory for the cache arrays (O(N)) is not the limiting factor in these computations, which typically involve some sort of matrix(-free) multiplication costing at least O(N log N) (the Fourier case, which is the best-case scenario).

I understand that you are working on a very large problem (10^9 points) and efficiency does matter. For now, if you want, we can set up a PR to skip caching when FunctionOperator is created. And then optionally create cache if needed by calling cache_operator(L::FunctionOperator, in, out). Let me know if you want that.

Eventually, we should support a truly fused 5-argument mul! as recommended in #48. However, that requires broader agreement on accepted function signatures across SciML.
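A hedged sketch of that opt-in workflow (names and signatures follow the comment above; the exact API only landed with #150, so treat this as illustrative):

L = FunctionOperator(func, u_in, u_out)   # proposed: skip allocating the (in, out) cache here
L = cache_operator(L, u_in, u_out)        # allocate the cache only if the non-allocating methods are needed
mul!(v, L, u)                             # non-allocating once cached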

ChrisRackauckas (Member) commented

The in-place ones still should not pre-allocate and alias the output like that; that's dangerous.

sbrisard (Author) commented Feb 5, 2023

> I understand that you are working on a very large problem (10^9 points) and efficiency does matter. For now, if you want, we can set up a PR to skip caching when FunctionOperator is created. And then optionally create cache if needed by calling cache_operator(L::FunctionOperator, in, out). Let me know if you want that.

For the time being, I would not do anything to SciMLOperators for the sake of my "large" application. By the time I'm up and running, v3.0 of SciMLOperators will probably be out.

I do think that SciMLOperators would nicely fit into my application, so I will keep an eye on it. For the sake of testing on small grids, caching does not matter to me in terms of memory efficiency (so long as the code is correct).

Thanks for these clarifications!

vpuri3 (Member) commented Feb 5, 2023

@ChrisRackauckas what's the dangerous aliasing part?

ChrisRackauckas (Member) commented

The following will calculate the wrong result:

y1 = A * x1
y2 = A * x2
y1 + y2

The reason is that A * x1 makes y1 a reference to the internal cache A.cache. But then A * x2 makes y2 the same cache, and therefore it mutates the values of y1, making y1 + y2 == 2 * y2, which is not what this syntax is intended to mean. Never expose internal caches; it leads to issues that are very difficult to debug. Caches should only ever be used internally. Instead, this is meant to be an allocating operation and should allocate a new array for the output.
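To make the hazard concrete, here is a minimal runnable toy (added for illustration, not SciMLOperators code) in which * is effectively implemented by returning the internal cache:

# Toy "operator" that doubles its input but returns its own cache.
struct CachedDouble
    cache::Vector{Float64}
end

function unsafe_mul(A::CachedDouble, x::Vector{Float64})
    A.cache .= 2 .* x
    return A.cache            # exposes the internal cache: this is the bug
end

A  = CachedDouble(zeros(3))
y1 = unsafe_mul(A, [1.0, 2.0, 3.0])
y2 = unsafe_mul(A, [10.0, 20.0, 30.0])
y1 + y2                       # == 2 .* y2, because y1 === y2 === A.cache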

vpuri3 (Member) commented Feb 7, 2023

@sbrisard , with #150, you would be able to pass a flag to avoid caching as follows:

FunctionOperator(func, u_in, u_out; ifcache = false)

vpuri3 (Member) commented Feb 8, 2023

@ChrisRackauckas it would be good to allow a syntax like op(du, u, p, t, alpha, beta) == mul!(du, op, u, alpha, beta) in the future, so the user can pass their own 5-argument mul! implementation.
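A hedged sketch of what such a user-supplied fused kernel might look like (the doubling action and the function name my_fused_op! are hypothetical):

# Compute du .= α*(op*u) + β*du in one fused pass; here the operator
# action is simply doubling.
function my_fused_op!(du, u, p, t, α, β)
    @. du = α * (2 * u) + β * du
    return du
end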

ChrisRackauckas (Member) commented

The future is now.
