Skip to content

RFC: Facilitate not paying double lookup cost when mutating a value in a LinearSlow AbstractArray #15630

@KristofferC

Description

@KristofferC

There are many cases where one wants to modify a value in an array like A[i, j] = f(A[i, j]), If A is a LinearSlow array this would currently need two transformation between the Cartesian index and the linear index, one in getindex and one in setindex!.

A typical real world example is in Finite Element Analysis where one wants to "assemble" a local small dense matrix Ke into a large sparse matrix K

function assemble!(K, Ke, dofs)
    for (loc1, glob1) in enumerate(dofs)
        for (loc2, glob2) in enumerate(dofs)
            K[glob1, glob2] += Ke[loc1, loc2]
        end
    end
end

Since K[i,j] += Ke[a,b] lowers to K[i,j] = K[i,j] + K[a,b] we will do the Cartesian to linear index searches twice. Typically in Finite Element codes to avoid this, one usually creates a add(K, v, i::Integer, j::Integer) method one sub(K, v, i::Integer, j::Integer) etc, see for example https://github.com/dealii/dealii/blob/0cb18a1aba58177c59bfe3755d13e67cc6099244/examples/step-5/step-5.cc#L391.

It could perhaps be useful for all AbstractArrays to optionally support a fast version of A[i, j] = f(A[i, j]), lets call it mutateindex!

mutateindex! would have the signature mutateindex(A, f, i...) where f is a function (v) -> f(v) where v is the value of A[i...]

A default fall back for mutateindex would be simple:

function mutateindex!(A, f, i...)
    A[i...] = f(A[i...])
end

Optionally an AbstractArray could implement a fast version of mutateindex, like for a SparseMatrix:

import Base.Order.Forward

function mutateindex!{T,Ti}(A::SparseMatrixCSC{T,Ti}, f, i0::Integer, i1::Integer)
    i0 = convert(Ti, i0)
    i1 = convert(Ti, i1)
    if !(1 <= i0 <= A.m && 1 <= i1 <= A.n); throw(BoundsError()); end
    r1 = Int(A.colptr[i1])
    r2 = Int(A.colptr[i1+1]-1)
    i = (r1 > r2) ? r1 : searchsortedfirst(A.rowval, i0, r1, r2, Forward)

    if (i <= r2) && (A.rowval[i] == i0)
        A.nzval[i] = f(A.nzval[i])
    else
        insert!(A.rowval, i, i0)
        insert!(A.nzval, i, f(zero(T)))
        @simd for j = (i1+1):(A.n+1)
            @inbounds A.colptr[j] += 1
        end
    end
    return A
end

We could now write our assembly function again as:

add!(A::AbstractMatrix, v, i0::Integer, i1::Integer) = mutateindex!(A, (i) -> i + v, i0, i1)

function assemble2!(K, Ke, dofs)
    for (loc1, glob1) in enumerate(dofs)
        for (loc2, glob2) in enumerate(dofs)
            add!(K, Ke[loc1, loc2], glob1, glob2) 
        end
    end
end

and by doing some benchmarks:

function benchmark(K, Ke, dofs)
    @time for i in 1:10^4
        assemble!(K, Ke, dofs)
    end

    @time for i in 1:10^4
        assemble2!(K, Ke, dofs)
    end
end

const Ke = rand(8,8)
const dofs = [20,21, 500, 501, 1001, 1002, 1086, 1087]
const K = sprand(10^5, 10^5, 0.001)
assemble!(K, Ke, dofs)

julia> benchmark(K, Ke, dofs)
  0.033032 seconds
  0.017603 seconds

we see that we are now about twice as fast due to halving the dominating lookup cost.

It would of course be nice to have some good syntax for this, which ties on to #249.

We could also just have setindex!(A, v, i, j) be defined as `mutateindex!(A, (_) -> v, i...) but that is probably too disruptive.

Maybe our AbstractArray Super Saiyans @timholy and @mbauman has any comments on the feasibility of something like this.

Metadata

Metadata

Assignees

No one assigned

    Labels

    arrays[a, r, r, a, y, s]designDesign of APIs or of the language itselfperformanceMust go faster

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions