In [1]:
using FunctionOperators
using BenchmarkTools

## Generate some 3D data

In [2]:
data = [sin(i+j+k)^2 for i=1:300, j=1:300, k=1:50];
size(data)

(300, 300, 50)

## Define some operators

In [3]:
?FunctionOperator

search: FunctionOperator FunctionOperators



Constructor for FunctionOperator object

FunctionOperator is an operator that maps from a multidimensional space to another multidimensional space. The mapping is defined by a function (`forw`), and optionally the reverse mapping can also be defined (`backw`). The input of the mapping must be a subtype of AbstractArray.

Arguments

  * `name::String` (Optional but strongly recommended) The operator is referenced later in error messages by this string. **Warning!** It is also used to check equality of (composite) FunctionOperators. Default value: `OpX` where X is a number incremented in each constructor-call.
  * `forw::Function` Function defining the mapping. Must accept one or two arguments. In case of two arguments, the first argument is a preallocated buffer to write the result into (to speed up code by avoiding repeated allocations). In case of both one and two arguments, the return value must be the result of the mapping.
  * `backw::Function` (Optional) Same as `forw`, but defines the backward mapping
  * `inDims::Tuple{Vararg{Int}}` Size of input array
  * `outDims::Tuple{Vararg{Int}}` Size of output array

The following constructors are available:

  * Positional constructor #1: `FunctionOperator{eltype}(forw, inDims, outDims)`
  * Positional constructor #2: `FunctionOperator{eltype}(forw, backw, inDims, outDims)`
  * Positional constructor #3: `FunctionOperator{eltype}(name, forw, inDims, outDims)`
  * Positional constructor #4: `FunctionOperator{eltype}(name, forw, backw, inDims, outDims)`
  * Keyword constructor: `FunctionOperator{eltype}(;kwargs...)`

where `eltype` is the type enforced on elements of input array.


Squaring operator and square root as its adjoint operation ⟶ **Dimension preserving**

In [4]:
# Using the keyword constructor:
Op₁ = FunctionOperator{Float64}(name = "Op₁",
    forw = x -> x.^2, backw = x -> sqrt.(x),
    inDims = (300, 300, 50), outDims = (300, 300, 50))

FunctionOperator with eltype Float64
    Name: Op₁
    Input dimensions: (300, 300, 50)
    Output dimensions: (300, 300, 50)

A weighting operator that collapses the new dimension on adjoint operation ⟶ **Changes size**

In [5]:
weights = [sin((i-j)*l) + 1 for i=1:300, j=1:300, k=1:50, l=1:10]
# Using the positional constructor:
Op₂ = FunctionOperator{Float64}("Op₂",
    x -> reshape(x, 300, 300, 50, 1) .* weights, # broadcasting: 3D to 4D
    x -> reshape(sum(x ./ weights, dims=4), 300, 300, 50),
    (300, 300, 50), (300, 300, 50, 10))

FunctionOperator with eltype Float64
    Name: Op₂
    Input dimensions: (300, 300, 50)
    Output dimensions: (300, 300, 50, 10)

## Apply these operators to the data

Apply the first operator: Left multiplication by the operator is equal to calling the `forw` function

In [6]:
Op₁ * data == Op₁.forw(data)

true

Result of application of the second operator: size increased

In [7]:
size(data), size(Op₂ * data)

((300, 300, 50), (300, 300, 50, 10))

Combine the two operators:

In [8]:
Op₂ * Op₁ * data == Op₂.forw(Op₁.forw(data))

true

Applying the adjoint of an operator is equal to calling the `backw` function

In [9]:
Op₁' * Op₁ * data == Op₁.backw(Op₁.forw(data))

true
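
The two facts above (left multiplication calls `forw`, the adjoint calls `backw`) can be imitated in a few lines of plain Julia. This is only a sketch: `TinyOp` and `TinyAdjoint` are hypothetical names, and nothing here is claimed to match how FunctionOperators is implemented internally.

```julia
# Hypothetical miniature of a function-backed operator;
# NOT the FunctionOperators.jl implementation.
struct TinyOp
    forw::Function    # forward mapping
    backw::Function   # mapping used when the operator is adjointed
end

# Wrapper that marks an operator as adjointed, so `op'` does no work by itself.
struct TinyAdjoint
    parent::TinyOp
end

Base.adjoint(op::TinyOp) = TinyAdjoint(op)

# Left multiplication of an array dispatches to the stored functions.
Base.:*(op::TinyOp, x::AbstractArray) = op.forw(x)
Base.:*(op::TinyAdjoint, x::AbstractArray) = op.parent.backw(x)

square = TinyOp(x -> x .^ 2, x -> sqrt.(x))
x = [1.0, 2.0, 3.0]

y = square * x      # same as square.forw(x)
z = square' * y     # same as square.backw(y)
println(y, " ", z)
```

Unlike the real library, this sketch cannot combine two operators (`square' * square` is not defined), which is why the array is multiplied in two separate steps.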

Combine operators with addition and subtraction

In [10]:
Op₂ * (Op₁ + Op₂'*Op₂) * Op₁ * data ==
    Op₂.forw(Op₁.forw(Op₁.forw(data)) + Op₂.backw(Op₂.forw(Op₁.forw(data))))

true

It is also possible to combine operators with `UniformScaling` from the `LinearAlgebra` library

In [11]:
using LinearAlgebra
Op₁ * I * data == Op₁.forw(data),
Op₁ * 3I * data == Op₁.forw(3 * data),
Op₂*(Op₁ - 2.5*I)*Op₁'*data == Op₂.forw(Op₁.forw(Op₁.backw(data))-2.5*Op₁.backw(data))

(true, true, true)

Adjoints of nested operators also work:

In [12]:
(Op₂ * Op₁)' * (Op₂ * Op₁) * data == (Op₂ * Op₁)' * Op₂ * Op₁ * data ==
    Op₁.backw(Op₂.backw(Op₂.forw(Op₁.forw(data))))

true

...but *not* with addition or subtraction:

In [13]:
(Op₁ + 3I)' * data

ErrorException: Sorry, I don't know how to calculate the adjoint of ((Op₁ + (3*I)))'

You can store a combination of some operators, and apply it later to data:

In [14]:
comb_OP = 5I * Op₁
comb_OP' * comb_OP * data == (5I * Op₁)' * (5I * Op₁) * data ==
    Op₁.backw(conj(5)*(5*Op₁.forw(data)))

true

*Note that the adjoint of scaling by a constant (in this case: 5) is scaling by the conjugate of that constant (which is equal to the original constant in the case of real numbers).*
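
The conjugation rule can be checked directly on `UniformScaling` objects from `LinearAlgebra`, without any FunctionOperators code. A small sketch (the constant `λ` is just an arbitrary example value):

```julia
using LinearAlgebra

λ = 2.0 + 3.0im

# The adjoint of a complex scaling is the scaling by the conjugate.
complex_case = (λ * I)' == conj(λ) * I

# For a real constant the adjoint leaves the scaling unchanged.
real_case = (5I)' == 5I

println((complex_case, real_case))
```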

## Performance
Unfortunately, our naive approach above is quite slow...

In [15]:
@benchmark Op₂*(Op₁ - 2.5*I)*Op₁'*data

BenchmarkTools.Trial: 
  memory estimate:  858.32 MiB
  allocs estimate:  273
  --------------
  minimum time:     349.828 ms (39.01% GC)
  median time:      350.290 ms (39.01% GC)
  mean time:        355.272 ms (39.85% GC)
  maximum time:     404.788 ms (47.11% GC)
  --------------
  samples:          15
  evals/sample:     1

A possible reason is that `Op₂` accesses the global variable `weights`, which is considered bad practice for performance. (*See: [Performance Tips](https://docs.julialang.org/en/v1/manual/performance-tips/index.html)*)
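
To make the distinction concrete, here is a minimal sketch that does not depend on FunctionOperators. The names `w_global`, `scale_global`, and `make_scaler` are hypothetical. Both functions compute the same result; the difference is only in where the weights live.

```julia
# Non-constant global: its type may change at any time, so code that reads it
# cannot be specialized on a concrete type.
w_global = [1.0, 2.0, 3.0]
scale_global(x) = x .* w_global

# Closure over a local variable: the captured array has a known concrete type.
function make_scaler()
    w_local = [1.0, 2.0, 3.0]
    return x -> x .* w_local
end
scale_local = make_scaler()

x = [10.0, 20.0, 30.0]
same = scale_global(x) == scale_local(x)
println(same)
```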

We can avoid that by wrapping the definition of `Op₂` with a function:

In [16]:
function getOp₂()
    weights = [sin((i-j)*l) + 1 for i=1:300, j=1:300, k=1:50, l=1:10]
    Op₂ = FunctionOperator{Float64}(name="Op₂",
        forw = x -> reshape(x, 300, 300, 50, 1) .* weights, # broadcasting: 3D to 4D
        backw = x -> reshape(sum(x ./ weights, dims=4), 300, 300, 50),
        inDims=(300, 300, 50), outDims=(300, 300, 50, 10))
end
Op₂ = getOp₂()

FunctionOperator with eltype Float64
    Name: Op₂
    Input dimensions: (300, 300, 50)
    Output dimensions: (300, 300, 50, 10)

In [17]:
@benchmark Op₂*(Op₁ - 2.5*I)*Op₁'*data

BenchmarkTools.Trial: 
  memory estimate:  858.32 MiB
  allocs estimate:  271
  --------------
  minimum time:     295.893 ms (27.91% GC)
  median time:      351.640 ms (39.20% GC)
  mean time:        352.211 ms (39.29% GC)
  maximum time:     406.226 ms (47.35% GC)
  --------------
  samples:          15
  evals/sample:     1

Well, that didn't solve our problem... In fact, the main reason for the slowness is excessive memory allocation: every intermediate result allocates a new array.
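
The cost of such intermediate arrays can be seen with plain broadcasting. The sketch below compares an out-of-place function with an in-place variant that writes into a caller-supplied buffer. The names `square`, `square!`, and `measure` are illustrative only, and the exact byte counts depend on the Julia version, so none are quoted here.

```julia
# Out-of-place: every call allocates a fresh result array.
square(x) = x .^ 2

# In-place: the result is written into a buffer supplied by the caller.
square!(buffer, x) = (buffer .= x .^ 2)

# Measure inside a function so that global-scope overhead is not counted.
function measure(x)
    buffer = similar(x)
    square(x); square!(buffer, x)              # warm-up calls trigger compilation
    bytes_out = @allocated square(x)
    bytes_in  = @allocated square!(buffer, x)
    return bytes_out, bytes_in
end

bytes_out, bytes_in = measure(rand(1000, 1000))
println("out-of-place: ", bytes_out, " bytes, in-place: ", bytes_in, " bytes")
```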

We can avoid that by defining the `forw` and `backw` functions a bit differently: they can also accept two arguments, where the first is a preallocated buffer (of appropriate size) that is supposed to hold the output of the operation:

In [18]:
function getBufferedOps()
    Op₁ = FunctionOperator{Float64}(name="Op₁",
        forw = (buffer, x) -> buffer .= x.^2,
        backw = (buffer, x) -> broadcast!(sqrt, buffer, x),
        inDims = (300, 300, 50), outDims = (300, 300, 50))
    weights = [sin((i-j)*l) + 1 for i=1:300, j=1:300, k=1:50, l=1:10]
    Op₂ = FunctionOperator{Float64}(name="Op₂",
        forw = (buffer,x) -> buffer .= reshape(x, 300, 300, 50, 1) .* weights,
        backw = (buffer,x) -> dropdims(sum!(reshape(buffer, 300, 300, 50, 1), x ./ weights), dims=4),
        inDims=(300, 300, 50), outDims=(300, 300, 50, 10))
    Op₁, Op₂
end
bOp₁, bOp₂ = getBufferedOps()

(FunctionOperator{Float64}(Op₁, (300, 300, 50), (300, 300, 50)), FunctionOperator{Float64}(Op₂, (300, 300, 50), (300, 300, 50, 10)))

In [19]:
@benchmark bOp₂*(bOp₁ - 2.5*I)*bOp₁'*data

BenchmarkTools.Trial: 
  memory estimate:  446.33 MiB
  allocs estimate:  261
  --------------
  minimum time:     205.879 ms (5.70% GC)
  median time:      263.566 ms (26.43% GC)
  mean time:        261.499 ms (25.84% GC)
  maximum time:     280.075 ms (30.79% GC)
  --------------
  samples:          20
  evals/sample:     1

Better, but it still should be much faster...

#### Let's have a look at what is under the hood!

When we combine operators, nothing special happens: only a wrapper object is created that defines the connections between the operators:

In [20]:
Op₂*(Op₁ - 2.5*I)*Op₁'

FunctionOperatorComposite with eltype Float64
    Name: Op₂ * (Op₁ - (2.5*I)) * Op₁'
    Input dimensions: (300, 300, 50)
    Output dimensions: (300, 300, 50, 10)
    Plan: no plan

*Note the last line: "Plan: no plan" ⟶ it is going to be significant later...*

The real magic happens when we apply this composite operator to data. To see what is going on behind the scenes, let's enable verbosity.

In [21]:
FO_settings.verbose = true

true

Now we can see how these composite operators work: when we apply one to data, it creates a function that aggregates the functionality of all combined operators and preallocates buffers for the intermediate results.

In [22]:
Op₂*(Op₁ - 2.5*I)*Op₁' * data;

Allocation of buffer1, size: (300, 300, 50, 10)
Allocation of buffer2, size: (300, 300, 50)
Allocation of buffer3, size: (300, 300, 50)
Allocation of buffer4, size: (300, 300, 50)
Plan calculated: buffer1 .= Op₂.forw((buffer2 .= Op₁.backw(x); broadcast!(-, buffer3, Op₁.forw(buffer2), broadcast!(*, buffer4, 2.5, buffer2))))


On the other hand, `bOp₁` and `bOp₂` have a slightly different aggregated function:

In [23]:
bOp₂*(bOp₁ - 2.5*I) * bOp₁' * data;

Allocation of buffer1, size: (300, 300, 50, 10)
Allocation of buffer2, size: (300, 300, 50)
Allocation of buffer3, size: (300, 300, 50)
Allocation of buffer4, size: (300, 300, 50)
Plan calculated: buffer1 .= Op₂.forw(buffer1, (buffer2 .= Op₁.backw(buffer2, x); broadcast!(-, buffer3, Op₁.forw(buffer3, buffer2), broadcast!(*, buffer4, 2.5, buffer2))))


The good thing is that the plan (along with the preallocated buffers) is cached, so if we save the combined operator to a variable, then the plan is created only once. See the difference:

In [24]:
bOp₂*(bOp₁ - 2.5*I) * bOp₁' * data
bOp₂*(bOp₁ - 2.5*I) * bOp₁' * data;

Allocation of buffer1, size: (300, 300, 50, 10)
Allocation of buffer2, size: (300, 300, 50)
Allocation of buffer3, size: (300, 300, 50)
Allocation of buffer4, size: (300, 300, 50)
Plan calculated: buffer1 .= Op₂.forw(buffer1, (buffer2 .= Op₁.backw(buffer2, x); broadcast!(-, buffer3, Op₁.forw(buffer3, buffer2), broadcast!(*, buffer4, 2.5, buffer2))))
Allocation of buffer1, size: (300, 300, 50, 10)
Allocation of buffer2, size: (300, 300, 50)
Allocation of buffer3, size: (300, 300, 50)
Allocation of buffer4, size: (300, 300, 50)
Plan calculated: buffer1 .= Op₂.forw(buffer1, (buffer2 .= Op₁.backw(buffer2, x); broadcast!(-, buffer3, Op₁.forw(buffer3, buffer2), broadcast!(*, buffer4, 2.5, buffer2))))


In [25]:
combined = bOp₂*(bOp₁ - 2.5*I) * bOp₁'
combined * data
combined * data;

Allocation of buffer1, size: (300, 300, 50, 10)
Allocation of buffer2, size: (300, 300, 50)
Allocation of buffer3, size: (300, 300, 50)
Allocation of buffer4, size: (300, 300, 50)
Plan calculated: buffer1 .= Op₂.forw(buffer1, (buffer2 .= Op₁.backw(buffer2, x); broadcast!(-, buffer3, Op₁.forw(buffer3, buffer2), broadcast!(*, buffer4, 2.5, buffer2))))
Allocation of buffer1, size: (300, 300, 50, 10)


Now we can see that the `combined` object carries the plan already created:

In [26]:
combined

FunctionOperatorComposite with eltype Float64
    Name: Op₂ * (Op₁ - (2.5*I)) * Op₁'
    Input dimensions: (300, 300, 50)
    Output dimensions: (300, 300, 50, 10)
    Plan: Op₂.forw(buffer1, (buffer2 .= Op₁.backw(buffer2, x); broadcast!(-, buffer3, Op₁.forw(buffer3, buffer2), broadcast!(*, buffer4, 2.5, buffer2))))
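
The caching behaviour observed above (plan and intermediate buffers created on first use, output array allocated on every application) can be imitated with a small wrapper. This is a hypothetical sketch: `CachedPipeline` and `apply` are made-up names, and the code is not taken from FunctionOperators.

```julia
# Hypothetical sketch of lazy plan caching; NOT the FunctionOperators.jl implementation.
mutable struct CachedPipeline
    stage1::Function                  # (buffer, x) -> buffer, first step
    stage2::Function                  # (buffer, x) -> buffer, second step
    plan::Union{Nothing, Function}    # built on first use, then reused
    builds::Int                       # how many times the plan was built
end

function apply(c::CachedPipeline, x)
    if c.plan === nothing
        scratch = similar(x)          # intermediate buffer, allocated only once
        c.plan = (out, input) -> c.stage2(out, c.stage1(scratch, input))
        c.builds += 1
    end
    return c.plan(similar(x), x)      # fresh output array on every call
end

pipeline = CachedPipeline((buf, x) -> (buf .= x .^ 2),
                          (buf, x) -> (buf .= x .+ 1),
                          nothing, 0)

x = [1.0, 2.0, 3.0]
r1 = apply(pipeline, x)
r2 = apply(pipeline, x)
println(r1, " plan built ", pipeline.builds, " time(s)")
```

Storing `pipeline` in a variable is what makes the second call cheap. Constructing a new `CachedPipeline` for every call would rebuild the plan each time, which mirrors the difference between cells 24 and 25.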

A side note: we can also set the plan manually if the computed one is wrong or if FunctionOperators was not able to compute one. For example, for the adjoint of an addition:

In [27]:
tricky = (bOp₁ + 2.5I)'

FunctionOperatorComposite with eltype Float64
    Name: ((Op₁ + (2.5*I)))'
    Input dimensions: (300, 300, 50)
    Output dimensions: (300, 300, 50)
    Plan: no plan

In [28]:
tricky * data

Allocation of buffer1, size: (300, 300, 50)


ErrorException: Sorry, I don't know how to calculate the adjoint of ((Op₁ + (2.5*I)))'

In [29]:
setPlan(tricky, (buffer, x) -> @.(√(2 - x) / √(2x)), "√(2 - x) / √(2x)")
tricky

FunctionOperatorComposite with eltype Float64
    Name: ((Op₁ + (2.5*I)))'
    Input dimensions: (300, 300, 50)
    Output dimensions: (300, 300, 50)
    Plan: √(2 - x) / √(2x)

In [30]:
tricky * data == @. √(2 - data) / √(2data)

Allocation of buffer1, size: (300, 300, 50)


true

But back to the question of performance: if we preallocate an array for the output manually and use `mul!`, then we can also avoid the reallocation of `buffer1`:

In [31]:
combined = bOp₂ * (bOp₁ - 2.5*I) * bOp₁'
output = Array{Float64}(undef, 300, 300, 50, 10)
mul!(output, combined, data)
mul!(output, combined, data);

buffer1 = <previously allocated>
Allocation of buffer2, size: (300, 300, 50)
Allocation of buffer3, size: (300, 300, 50)
Allocation of buffer4, size: (300, 300, 50)
Plan calculated: buffer1 .= Op₂.forw(buffer1, (buffer2 .= Op₁.backw(buffer2, x); broadcast!(-, buffer3, Op₁.forw(buffer3, buffer2), broadcast!(*, buffer4, 2.5, buffer2))))
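
`mul!` follows the same convention for ordinary matrices in `LinearAlgebra`: the first argument is the preallocated output, and that same array is returned. A minimal sketch with small example matrices:

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
B = [0.0 1.0; 1.0 0.0]
C = Matrix{Float64}(undef, 2, 2)   # preallocated output

result = mul!(C, A, B)             # writes A * B into C and returns C
println(result === C, " ", C == A * B)
```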


If we apply the combined operator multiple times, we can save a lot on computation time:

In [32]:
FO_settings.verbose = false
@benchmark mul!(output, combined, data)

BenchmarkTools.Trial: 
  memory estimate:  256 bytes
  allocs estimate:  7
  --------------
  minimum time:     124.221 ms (0.00% GC)
  median time:      124.382 ms (0.00% GC)
  mean time:        124.380 ms (0.00% GC)
  maximum time:     124.702 ms (0.00% GC)
  --------------
  samples:          41
  evals/sample:     1

Let's compare it to a manually written function with identical functionality and optimizations:

In [33]:
function getAggregatedFunction()
    weights = [sin((i-j)*l) + 1 for i=1:300, j=1:300, k=1:50, l=1:10]
    buffer2 = Array{Float64}(undef, 300, 300, 50)
    buffer3 = Array{Float64}(undef, 300, 300, 50)
    buffer4 = Array{Float64}(undef, 300, 300, 50)
    (buffer, x) -> begin
        broadcast!(sqrt, buffer2, x)  # Of course, these two lines could be simplified, since
        buffer3 .= buffer2 .^ 2       # (√x)^2 = x for x ≥ 0, but let's ignore that here
        broadcast!(-, buffer3, buffer3, broadcast!(*, buffer4, 2.5, buffer2))
        buffer .= reshape(buffer3, 300, 300, 50, 1) .* weights
    end
end

getAggregatedFunction (generic function with 1 method)

In [34]:
aggrFun = getAggregatedFunction()
@benchmark aggrFun(output, data)

BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  2
  --------------
  minimum time:     124.123 ms (0.00% GC)
  median time:      124.212 ms (0.00% GC)
  mean time:        124.206 ms (0.00% GC)
  maximum time:     124.307 ms (0.00% GC)
  --------------
  samples:          41
  evals/sample:     1

Basically, there is no overhead of using FunctionOperators!

## Syntactic sugar
Let's consider the following function:

In [35]:
function foo1(A, bOp₁, bOp₂)
    for i in 1:10
        C = (bOp₁ - 2.5*I) * bOp₁ * A
        B = bOp₁ * (C - 3A)
        A .= bOp₁ * (C + 2B)
        A ./= maximum(bOp₂ * A)
    end
end

foo1 (generic function with 1 method)

In [36]:
@benchmark foo1(copy(data), bOp₁, bOp₂)

BenchmarkTools.Trial: 
  memory estimate:  6.07 GiB
  allocs estimate:  1822
  --------------
  minimum time:     3.835 s (20.62% GC)
  median time:      3.836 s (20.57% GC)
  mean time:        3.836 s (20.57% GC)
  maximum time:     3.837 s (20.53% GC)
  --------------
  samples:          2
  evals/sample:     1

Using the methods we have seen earlier, we can quickly optimize this code and get something like this:

In [37]:
function foo2(A, bOp₁, bOp₂)
    combOp = (bOp₁ - 2.5*I) * bOp₁
    C = similar(A)
    buffer1 = similar(A)
    B = similar(A)
    buffer2 = Array{Float64}(undef, (300, 300, 50, 10))
    for i = 1:10
        mul!(C, combOp, A)
        @. buffer1 = C - 3A
        mul!(B, bOp₁, buffer1)
        @. buffer1 = C + 2B
        mul!(A, bOp₁, buffer1)
        A ./= maximum(mul!(buffer2, bOp₂, A))
    end
end

foo2 (generic function with 1 method)

In [38]:
@benchmark foo2(copy(data), bOp₁, bOp₂)

BenchmarkTools.Trial: 
  memory estimate:  514.99 MiB
  allocs estimate:  267
  --------------
  minimum time:     2.086 s (3.44% GC)
  median time:      2.088 s (3.57% GC)
  mean time:        2.092 s (3.69% GC)
  maximum time:     2.101 s (4.05% GC)
  --------------
  samples:          3
  evals/sample:     1

This speedup is quite pleasing, but the tradeoff is that the code is now much less readable. To avoid the mess caused by manual optimization, the `FunctionOperators` library offers the recycle macro (`@♻`), which does the same automatically using the following markers: `🔝`, `🔃`, and `@🔃`.

In [39]:
?@♻

**Recycling macro**: Reduce the number of allocations inside a for loop by preallocation of arrays for the outputs of marked operations. Markers: `@♻` (`\:recycle:`), `🔝` (`\:top:`), `🔃` (`\:arrows_clockwise:`), and `@🔃`

Macro @♻ should be placed right before a for loop, and then it executes the following substitutions:

  * Expressions marked by `🔝` are going to be calculated before the loop, the result is stored in a variable, and the expression will be replaced by that variable. It also can be useful when a constant expression is used in the loop, but the idea behind creating that substitution is to allow caching of composite FunctionMatrices. Eg:

```julia
@♻ for i=1:5
    result = 🔝((FuncOp₁ + 2I) * FuncOp₂) * data
end
```

will be transformed to 

```julia
🔝_1 = (FuncOp₁ + 2I) * FuncOp₂
for i = 1:5
    result = 🔝_1 * data
end
```

so that way plan is calculated only once, and also buffers for intermediate results of the composite operator are allocated once.

  * Expressions marked by `🔃` are going to be calculated before the loop (to allocate an array to store the result), but the expression is also evaluated in each loop iteration. The difference after the substitution is that the result of the expression is always saved to the preallocated array. Eg:

```julia
@♻ for i=1:5
    result = FuncOp₁ * 🔃(A + B)
end
```

will be transformed to 

```julia
🔃_1 = A + B
for i = 1:5
    result = FuncOp₁ * @.(🔃_1 = A + B)
end
```

This transformation first allocates an array named `🔃_1`; then, in every iteration, the expression is recalculated, saved to `🔃_1`, and this value is used for the rest of the operation (i.e., `FuncOp₁ * 🔃_1`). Note that the `@.` macro is inserted before the inline assignment. This is needed because otherwise `A + B` would allocate a new array before it is stored in `🔃_1`. **Warning!** It can break your code, e.g. `@.(🔃_1 = A * B)` ≠ `(🔃_1 = A * B)` (elementwise multiplication vs. matrix multiplication)! On the other hand, when the marked expression consists only of a multiplication, it is transformed into a call of `mul!`. Eg:

```julia
@♻ for i=1:5
    result = FuncOp₁ * 🔃(A * B)
end
```

will be transformed to 

```julia
🔃_1 = A * B
for i = 1:5
    result = FuncOp₁ * mul!(🔃_1, A, B)
end
```

  * Lastly, assignments marked by `@🔃` will be transformed into a call of `mul!`. Of course, it works only if `@🔃` is directly followed by an assignment that has a single multiplication on the right side. Eg:

```julia
@♻ for i=1:5
    @🔃 result = FuncOp₁ * A
end
```

will be transformed to 

```julia
result = FuncOp₁ * A
for i = 1:5
    mul!(result, FuncOp₁, A)
end
```

Final note: `🔝` can be arbitrarily nested, and it can be embedded in expressions marked by `🔃`. `🔃` can also be nested, and it can be used in assignments marked by `@🔃` (along with `🔝`, of course).
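
The warning about `@.` in the documentation above is easy to reproduce with plain arrays: dotting an assignment turns `*` into elementwise multiplication. A short sketch with two small example matrices:

```julia
A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]
C = similar(A)

matrix_product = A * B    # matrix multiplication, allocates a new array
@. C = A * B              # expands to C .= A .* B, i.e. elementwise

println(matrix_product)
println(C)
println(matrix_product == C)
```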


In our example:

In [40]:
FO_settings.macro_verbose = true # if true, @♻ prints the transformed loop
function foo3(A, bOp₁, bOp₂)
    @♻ for i in 1:10
        @🔃 C = 🔝((bOp₁ - 2.5*I) * bOp₁) * A
        @🔃 B = bOp₁ * 🔃(C - 3A)
        @🔃 A .= bOp₁ * 🔃(C + 2B)
        A ./= maximum(🔃(bOp₂ * A))
    end
end

begin
    🔝_1 = (bOp₁ - 2.5I) * bOp₁
    C = 🔝_1 * A
    🔃_1 = C - 3A
    B = bOp₁ * @__dot__(🔃_1 = C - 3A)
    🔃_2 = C + 2B
    🔃_3 = bOp₂ * A
    for i = 1:10
        mul!(C, 🔝_1, A)
        mul!(B, bOp₁, @__dot__(🔃_1 = C - 3A))
        mul!(A, bOp₁, @__dot__(🔃_2 = C + 2B))
        A ./= maximum(mul!(🔃_3, bOp₂, A))
    end
end


foo3 (generic function with 1 method)

In [41]:
@benchmark foo3(copy(data), bOp₁, bOp₂)

BenchmarkTools.Trial: 
  memory estimate:  618.00 MiB
  allocs estimate:  410
  --------------
  minimum time:     2.269 s (3.89% GC)
  median time:      2.329 s (6.47% GC)
  mean time:        2.310 s (5.63% GC)
  maximum time:     2.331 s (6.47% GC)
  --------------
  samples:          3
  evals/sample:     1

It is slightly slower and allocates a bit more memory because the macro can't detect whether a buffer can be reused. But when the loop body consists of many computationally heavy operations, the difference is mostly negligible.

## Further notes
### Global settings

In [42]:
?FO_settings

search: FO_settings



Object that holds global settings for `FunctionOperators` library

Fields:

  * `verbose::Bool` If set to true, then allocation information and calculated plan function will be displayed upon creation (i.e., when a composite operator is first used). Default: `false`
  * `macro_verbose::Bool` If set to true, then recycling macro (@♻) will print the transformed loop. Default: `false`


### Equality operator
Equality operator is defined between (composite) `FunctionOperators` based on their names. In our case, this implies:

In [43]:
(bOp₁ - 2.5*I) * bOp₁ == (Op₁ - 2.5*I) * Op₁

true

### Superclass
Combinations of `FunctionOperator` objects are of type `FunctionOperatorComposite`. Both types are subtypes of `FunOp`.

In [44]:
bOp₁ isa FunctionOperator,
bOp₁ isa FunOp,
bOp₂ * bOp₁ isa FunctionOperator, # false because it is a FunctionOperatorComposite
bOp₂ * bOp₁ isa FunOp

(true, true, false, true)
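
A shared supertype is useful because a function can accept either kind of operator by annotating its argument with the supertype. The sketch below shows the pattern with hypothetical stand-in types (`AbstractOp`, `SingleOp`, `CompositeOp`); only the shape of the hierarchy mirrors `FunOp`, `FunctionOperator`, and `FunctionOperatorComposite`.

```julia
# Hypothetical stand-ins; only the shape of the hierarchy mirrors the text above.
abstract type AbstractOp end

struct SingleOp <: AbstractOp end

struct CompositeOp <: AbstractOp
    parts::Vector{AbstractOp}
end

# A method declared on the abstract supertype accepts both concrete types.
describe(op::AbstractOp) = "operator of type $(typeof(op))"

single = SingleOp()
composite = CompositeOp([single, single])

println(describe(single))
println(describe(composite))
println((composite isa AbstractOp, composite isa SingleOp))
```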