
FunctionOperator caches prototypes of the input and output #145

Closed
sbrisard opened this issue Feb 2, 2023 · 13 comments · Fixed by #150

Comments

sbrisard commented Feb 2, 2023

FunctionOperator structures have a cache = (in, out) member, where in and out are prototypes of the input and output.
These prototypes are used in * and \

https://github.com/SciML/SciMLOperators.jl/blob/master/src/func.jl#L297-L307

This seems to me unnecessarily costly. Am I missing something?

vpuri3 (Member) commented Feb 2, 2023

A FunctionOperator L has two important boolean parameters, {iip, oop}, which stand for is-in-place and out-of-place. iip = true means you can call L(v, u, p, t), equivalent to mul!(v, L, u); oop = true means you can call v = L(u, p, t), equivalent to v = L * u.

The methods you highlighted are for FunctionOperator{true,false}, i.e. one that has an in-place method but no out-of-place method. So we're cheekily forming a new array (using the cache as a prototype) and passing it to the in-place method to write the output into.

function Base.:*(L::FunctionOperator{true,false}, u::AbstractVecOrMat)
    _, co = L.cache
    du = zero(co)              # allocate a fresh output from the cached prototype
    L.op(du, u, L.p, L.t)      # write the result into du using the in-place function
    du
end
function Base.:\(L::FunctionOperator{true,false}, u::AbstractVecOrMat)
    ci, _ = L.cache
    du = zero(ci)
    L.op_inverse(du, u, L.p, L.t)
    du
end
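For concreteness, here is a minimal sketch (added for illustration, not from the thread) of the two calling conventions and of the {true,false} fallback, using a hypothetical doubling operator. The constructor form follows the FunctionOperator(func, u_in, u_out) call that appears later in this thread; whether iip/oop are inferred automatically from the supplied function is an assumption here.

using SciMLOperators, LinearAlgebra

# Hypothetical doubling operator: only an in-place function is supplied,
# so this corresponds to the FunctionOperator{true,false} case above.
f_iip!(v, u, p, t) = (v .= 2 .* u; nothing)

u = rand(4)
L = FunctionOperator(f_iip!, u, u)   # prototypes u_in = u_out = u supply size and eltype

v = similar(u)
mul!(v, L, u)   # in-place path: calls f_iip!(v, u, p, t)
w = L * u       # no out-of-place function: falls back to the cached-prototype method quoted above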

vpuri3 (Member) commented Feb 2, 2023

When we have out-of-place methods, we simply use them:

Base.:*(L::FunctionOperator{iip,true}, u::AbstractVecOrMat) where {iip} = L.op(u, L.p, L.t)
Base.:\(L::FunctionOperator{iip,true}, u::AbstractVecOrMat) where {iip} = L.op_inverse(u, L.p, L.t)

And for the non-allocating mul! and ldiv!, we necessarily need a cache.

sbrisard (Author) commented Feb 3, 2023

I understand how these cached arrays are used, but I fail to see why they are needed.
I intend to implement operators that operate on tensor fields in 3D, so ci and co would be vectors of size 6 (number of tensor components) * 1024^3 (size of the grid). It would be very expensive to store two of those guys just for the sake of having a prototype to allocate new vectors.

Would it not be more efficient to just store the size and element-type of ci and co? If you still want to have some freedom with the type of the returned vector, you could pass user-provided functions that create the required vectors on demand.

Does that make sense?

To make things clearer: why not store an allocator, rather than a prototype?
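A minimal sketch of the allocator idea being proposed (purely illustrative; LazyCacheOperator, alloc_in, and alloc_out are hypothetical names, not part of SciMLOperators):

struct LazyCacheOperator{F,Ai,Ao}
    op::F
    alloc_in::Ai     # () -> new input-shaped array, e.g. () -> zeros(6 * 1024^3)
    alloc_out::Ao    # () -> new output-shaped array
end

function apply(L::LazyCacheOperator, u, p, t)
    du = L.alloc_out()   # allocate on demand instead of storing a full prototype
    L.op(du, u, p, t)
    return du
end

This way the operator stores only two small closures (or a size and an eltype) rather than two grid-sized arrays.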

vpuri3 (Member) commented Feb 3, 2023

  1. The caches are used to provide a non-allocating 5-argument mul! (a short usage sketch follows this list):

     function LinearAlgebra.mul!(v::AbstractVecOrMat, L::FunctionOperator{true}, u::AbstractVecOrMat, α, β)
         _, co = L.cache
         copy!(co, v)       # stash the old v
         mul!(v, L, u)      # v ← L * u
         lmul!(α, v)        # v ← α * v
         axpy!(β, co, v)    # v ← v + β * old_v
     end

  2. And to provide a non-allocating 2-argument ldiv!:

     function LinearAlgebra.ldiv!(L::FunctionOperator{true}, u::AbstractVecOrMat)
         ci, _ = L.cache
         copy!(ci, u)       # stash u
         ldiv!(u, L, ci)    # u ← L \ u
     end
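For concreteness, a hedged usage sketch of what these two methods provide to downstream code (the doubling/halving operator is hypothetical, and the op_inverse keyword is assumed from the field name used in the code above):

using SciMLOperators, LinearAlgebra

f!(v, u, p, t)     = (v .= 2 .* u; nothing)   # hypothetical operator action
f_inv!(v, u, p, t) = (v .= u ./ 2; nothing)   # its inverse

u = rand(4)
L = FunctionOperator(f!, u, u; op_inverse = f_inv!)

v = copy(u)
mul!(v, L, u, 2.0, 0.5)   # v ← 2.0*(L*u) + 0.5*old_v, non-allocating via the cache
ldiv!(L, u)               # u ← L \ u in place, non-allocating via the cache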

sbrisard (Author) commented Feb 3, 2023

I see. Thanks for this clarification. I realize now that I did not read your first answer correctly.
So if I want to avoid these cached variables, I need to provide both an in-place and an out-of-place method, right?
This makes perfect sense.
I guess this issue should be closed, then.

vpuri3 (Member) commented Feb 3, 2023

No, even if you have both an in-place L(v, u, p, t) and an out-of-place v = L(u, p, t) method, you still need the cache to support the 5-argument mul!(v, L, u, α, β) and the two-argument ldiv!(L, u). These methods are used by almost every package out there: Krylov solvers, nonlinear solvers, everything. That's why the cache is important for interoperability.

Usually, the memory for the cache arrays (O(N)) is not the limiting factor in these computations, which typically involve some sort of matrix(-free) multiplication costing at least O(N log N) (the Fourier case, which is the best-case scenario).

I understand that you are working on a very large problem (10^9 points) and efficiency does matter. For now, if you want, we can set up a PR to skip caching when FunctionOperator is created. And then optionally create cache if needed by calling cache_operator(L::FunctionOperator, in, out). Let me know if you want that.

Eventually, we should support a truly fused 5-argument mul! as recommended in #48. However, that requires broader agreement on accepted function signatures across SciML.
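A hedged sketch of that opt-in workflow (names and signatures follow the comment above; the exact API only landed with #150, so treat this as illustrative):

L = FunctionOperator(func, u_in, u_out)   # proposed: skip allocating the (in, out) cache here
L = cache_operator(L, u_in, u_out)        # allocate the cache only if the non-allocating methods are needed
mul!(v, L, u)                             # non-allocating once cached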

ChrisRackauckas (Member) commented

The in-place ones still should not pre-allocate and alias the output like that; that's dangerous.

sbrisard (Author) commented Feb 5, 2023

> I understand that you are working on a very large problem (10^9 points) and efficiency does matter. For now, if you want, we can set up a PR to skip caching when FunctionOperator is created. And then optionally create cache if needed by calling cache_operator(L::FunctionOperator, in, out). Let me know if you want that.

For the time being, I would not do anything to SciMLOperators for the sake of my "large" application. By the time I'm up and running, v3.0 of SciMLOperators will probably be out.

I do think that SciMLOperators would nicely fit into my application, so I will keep an eye on it. For the sake of testing on small grids, caching does not matter to me in terms of memory efficiency (so long as the code is correct).

Thanks for these clarifications!

vpuri3 (Member) commented Feb 5, 2023

@ChrisRackauckas what's the dangerous aliasing part?

ChrisRackauckas (Member) commented

The following will calculate the wrong result:

y1 = A * x1
y2 = A * x2
y1 + y2

The reason is that A * x1 makes y1 a reference to the internal cache A.cache. But then A * x2 makes y2 the same cache, and therefore it mutates the values of y1, making y1 + y2 == 2 * y2, which is not what this syntax is intended to mean. Never expose internal caches; it leads to issues that are very difficult to debug. Caches should only ever be used internally. Instead, this is meant to be an allocating operation and should allocate a new array for the output.
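To make the hazard concrete, here is a minimal runnable toy (added for illustration, not SciMLOperators code) in which * is effectively implemented by returning the internal cache:

# Toy "operator" that doubles its input but returns its own cache.
struct CachedDouble
    cache::Vector{Float64}
end

function unsafe_mul(A::CachedDouble, x::Vector{Float64})
    A.cache .= 2 .* x
    return A.cache            # exposes the internal cache: this is the bug
end

A  = CachedDouble(zeros(3))
y1 = unsafe_mul(A, [1.0, 2.0, 3.0])
y2 = unsafe_mul(A, [10.0, 20.0, 30.0])
y1 + y2                       # == 2 .* y2, because y1 === y2 === A.cache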

vpuri3 (Member) commented Feb 7, 2023

@sbrisard , with #150, you would be able to pass a flag to avoid caching as follows:

FunctionOperator(func, u_in, u_out; ifcache = false)

vpuri3 (Member) commented Feb 8, 2023

@ChrisRackauckas it would be good to allow a syntax like op(du, u, p, t, alpha, beta) == mul!(du, op, u, alpha, beta) in the future, so the user can pass their own 5-argument mul! implementation.
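A hedged sketch of what such a user-supplied fused kernel might look like (the doubling action and the function name my_fused_op! are hypothetical):

# Compute du .= α*(op*u) + β*du in one fused pass; here the operator
# action is simply doubling.
function my_fused_op!(du, u, p, t, α, β)
    @. du = α * (2 * u) + β * du
    return du
end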

ChrisRackauckas (Member) commented

The future is now.
