Skip to content

avoid quadratic behavior when accumulating the residual when computing A*x - b#92

Merged
AayushSabharwal merged 1 commit into
mainfrom
kc/perf_unhack
Jun 4, 2026
Merged

avoid quadratic behavior when accumulating the residual when computing A*x - b#92
AayushSabharwal merged 1 commit into
mainfrom
kc/perf_unhack

Conversation

@KristofferC
Copy link
Copy Markdown
Contributor

Accumulating into a Num inside an iteration with standard += is O(n^2).

The following benchmark script tests the two implementations:

Script
using Symbolics, BenchmarkTools
using SymbolicUtils
import SymbolicUtils as SU
using Symbolics: SymbolicT, VartypeT, unwrap

function linear_scc_residuals_old(A_mat::Matrix{SymbolicT}, b_vec::Vector{SymbolicT}, x::Vector{SymbolicT})
    N = length(b_vec)
    residuals = Vector{SymbolicT}(undef, N)
    for i in 1:N
        res_i = -b_vec[i]
        for j in 1:N
            aij = A_mat[i, j]
            SU._iszero(aij) && continue
            res_i += aij * x[j]
        end
        residuals[i] = res_i
    end
    return residuals
end

function linear_scc_residuals(A_mat::Matrix{SymbolicT}, b_vec::Vector{SymbolicT}, x::Vector{SymbolicT})
    N = length(b_vec)
    residuals = Vector{SymbolicT}(undef, N)
    add_buffer = SU.ArgsT{VartypeT}()
    for i in 1:N
        empty!(add_buffer)
        push!(add_buffer, -b_vec[i])
        for j in 1:N
            aij = A_mat[i, j]
            SU._iszero(aij) && continue
            push!(add_buffer, aij * x[j])
        end
        residuals[i] = SU.add_worker(VartypeT, add_buffer)
    end
    return residuals
end

function make_system(N; sparsity = 0)
    A = Matrix{SymbolicT}(undef, N, N)
    for i in 1:N, j in 1:N
        A[i, j] = sparsity > 0 && (i + j) % sparsity == 0 ? unwrap(Num(0)) :
                  unwrap(Symbolics.variable(:a, i, j))
    end
    b = SymbolicT[unwrap(Symbolics.variable(:b, i)) for i in 1:N]
    x = SymbolicT[unwrap(Symbolics.variable(:x, i)) for i in 1:N]
    return A, b, x
end

# correctness
let (A, b, x) = make_system(6; sparsity = 3)
    @assert all(isequal.(linear_scc_residuals_old(A, b, x), linear_scc_residuals(A, b, x)))
end

for N in (3, 10, 30, 100)
    A, b, x = make_system(N)
    t1 = @benchmark linear_scc_residuals_old($A, $b, $x)
    t2 = @benchmark linear_scc_residuals($A, $b, $x)
    println("N=$N  old: ", round(median(t1).time / 1000, digits = 1), "μs   new: ",
        round(median(t2).time / 1000, digits = 1), "μs   speedup: ",
        round(median(t1).time / median(t2).time, digits = 2), "x")
end

with the result:

     N=3  old: 76.5μs   new: 59.4μs   speedup: 1.29x
     N=10  old: 1136.6μs   new: 554.5μs   speedup: 2.05x
     N=30  old: 19424.4μs   new: 4989.2μs   speedup: 3.89x
     N=100  old: 480587.6μs   new: 58232.0μs   speedup: 8.25x

@AayushSabharwal
Copy link
Copy Markdown
Member

Thanks!

@AayushSabharwal AayushSabharwal merged commit 8bed54a into main Jun 4, 2026
20 of 23 checks passed
@AayushSabharwal AayushSabharwal deleted the kc/perf_unhack branch June 4, 2026 05:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants