
Faster Jacobian via Split backends #98

Merged: 14 commits into SciML:master on Sep 4, 2023

Conversation

@avik-pal (Member) commented Aug 8, 2023

Goals

  • Fast Code Path for TwoPointBVProblem (Banded Matrix so we can do complete sparse diff)
    • Rewrite the loss function to reorder the outputs correctly.
  • Split Mode
    • Sparse Diff for the Banded Part
    • Dense Diff for the bc! Part
      • We won't be able to use Zygote since bc! assumes in-place operations
      • The Jacobian has inputs > outputs so we want to use ReverseMode ideally. Enzyme might be a good choice. For fallback, we can just use ReverseDiff.
  • Sparsity Detection for bc! and if sparse, then use SparseDiff here as well
    • Requires Sparse ReverseMode (Not implemented upstream yet)
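The split-mode idea in the goals above can be sketched as follows. This is an illustrative Julia snippet, not the BoundaryValueDiffEq internals: `collocation`, `bc`, and `fd_jacobian` are hypothetical stand-ins, and a hand-rolled finite difference is used to keep the sketch dependency-free.

```julia
using LinearAlgebra, SparseArrays

# Toy residuals: the collocation part has banded structure (each output
# touches only neighboring states), while the boundary part couples the
# first and last states with few outputs.
collocation(u) = [u[i + 1] - u[i] - 0.1 * u[i] for i in 1:(length(u) - 1)]
bc(u) = [u[1] - 5.0, u[end]]

# Naive forward-difference Jacobian, one column per input perturbation.
function fd_jacobian(f, u; ε = sqrt(eps()))
    y0 = f(u)
    J = zeros(length(y0), length(u))
    for j in eachindex(u)
        up = copy(u)
        up[j] += ε
        J[:, j] = (f(up) .- y0) ./ ε
    end
    return J
end

u = collect(1.0:4.0)
J_coll = sparse(fd_jacobian(collocation, u))  # banded part: sparse diff pays off
J_bc = fd_jacobian(bc, u)                     # few rows, many cols: reverse mode is ideal
J = vcat(J_coll, J_bc)                        # Jacobian of the stacked residual
```

In the real package the two blocks would be computed with different AD backends (sparse coloring for the banded block, reverse mode for the boundary block) rather than finite differences; the sketch only shows why splitting the Jacobian by structure is profitable.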

@ai-maintainer (bot) left a comment:

AI-Maintainer Review for PR - Faster Jacobian via Split backends

Title and Description 👍

The Title and Description are clear and focused
The title and description of the pull request are clear and focused. They effectively communicate the purpose of the changes, which is to improve the efficiency of the Jacobian computation by utilizing split backends. The description provides a detailed list of goals and the approach to achieve them.

Scope of Changes 👍

The changes are narrowly focused
The changes in the pull request are narrowly focused on improving the efficiency of the Jacobian computation. While there are multiple tasks involved, they all contribute to the overarching goal of faster Jacobian computation. This focus helps in maintaining the clarity and manageability of the changes.

Documentation 👎

Missing Docstrings
Several new functions, classes, and methods have been added without docstrings. It is recommended to add docstrings to these entities to provide clear documentation of their behavior, arguments, and return values. The entities needing docstrings include `_mirk_jacobian_setup`, `_mirk_complete_jacobian_setup`, `_mirk_dense_jacobian_setup`, `_mirk_sparse_jacobian_setup`, `__mirk_compute_jacobian`, `__mirk`, `__mirk_compute_sparse_jacobian!`, `__mirk_compute_dense_jacobian!`, `AutoMultiModeDifferentiation`, `AutoFastDifferentiation`, `AutoSparseFastDifferentiation`, `_exploit_sparsity`, and `_dense_mode`.

Testing 👎

Testing Methodology Not Described
The description does not mention how the changes were tested. It would be beneficial to include information about the testing methodology, such as specific test cases used, any benchmarking or performance testing conducted, or any other relevant details regarding the testing of the changes.

Suggested Changes

  1. Add docstrings to all new functions, classes, and methods. This will help other developers understand the purpose and usage of these entities.
  2. Include information about how the changes were tested. This could include specific test cases, benchmarking data, or other relevant details.

Potential Issues

Without information on testing, it's hard to identify potential issues. Once testing information is provided, potential issues may be easier to spot.

Reviewed with AI Maintainer

@avik-pal avik-pal marked this pull request as draft August 8, 2023 22:39
@avik-pal (Member, Author) commented Aug 9, 2023

An initial version of Sparse Reverse Mode AD with Zygote:

using Pkg
Pkg.activate(@__DIR__)

using BenchmarkTools, BoundaryValueDiffEq, BVProblemLibrary
using Symbolics, ArrayInterface
using FiniteDiff, ForwardDiff, SparseDiffTools
using Zygote
using LinearAlgebra
using Test

function jacobian_sparsity(f, x)
    f!(y, x) = (y .= f(x))
    return jacobian_sparsity(f!, f(x), x)
end

jacobian_sparsity(f!, y, x) = Symbolics.jacobian_sparsity(f!, y, x)

f1(x) = [x[1], x[2] * x[3], x[3]]

x_in = [1.0, 3.0, 2.0]

module ABC
    struct ReverseModeJacobianCache{S, C, I, NR, NC}
        sparsity::S
        colorvec::C
        idx_vec::I
        nz_rows::NR
        nz_cols::NC
    end
end

function sparse_jacobian_setup(f, x)
    y = f(x)
    Jₛ = jacobian_sparsity(f, x)
    (nz_rows, nz_cols) = collect.(ArrayInterface.findstructralnz(Jₛ))
    colorvec = matrix_colors(Jₛ')
    return ABC.ReverseModeJacobianCache(Jₛ, colorvec, collect(1:size(Jₛ, 1)), nz_rows,
        nz_cols)
end

function sparse_jacobian_reversemode(f, x)
    cache = sparse_jacobian_setup(f, x)
    J = similar(cache.sparsity, eltype(x))
    return sparse_jacobian_reversemode!(J, f, x, cache)
end

function sparse_jacobian_reversemode!(J, f, x, cache::ABC.ReverseModeJacobianCache)
    # Reverse-mode coloring trick: a single VJP with a 0/1 seed vector
    # recovers all Jacobian rows that share a color.
    partial_f(x, idx_vec) = dot(f(x), idx_vec)
    for c in 1:maximum(cache.colorvec)
        # Seed selecting the output rows of the current color.
        @. cache.idx_vec = cache.colorvec == c
        gs = first(Zygote.gradient(partial_f, x, cache.idx_vec))
        # Structural nonzeros whose row belongs to this color.
        pick_idxs = [i for i in eachindex(cache.nz_rows) if cache.colorvec[cache.nz_rows[i]] == c]
        row_idxs = cache.nz_rows[pick_idxs]
        col_idxs = cache.nz_cols[pick_idxs]
        len_cols = length(col_idxs)
        unused_cols = setdiff(1:size(J, 2), col_idxs)
        perm_cols = sortperm(vcat(col_idxs, unused_cols))
        # Map each entry of the pulled-back gradient to its destination row.
        row_idxs = vcat(row_idxs, zeros(Int, size(J, 2) - len_cols))[perm_cols]
        for i in axes(J, 1), j in axes(J, 2)
            i == row_idxs[j] && (J[i, j] = gs[j])
        end
    end
    return J
end

cache = sparse_jacobian_setup(f1, x_in)
J = zeros(3, 3)
sparse_jacobian_reversemode!(J, f1, x_in, cache)

sparse_jacobian_reversemode(f1, x_in)

@benchmark Zygote.jacobian($f1, $x_in)

@test Array(sparse_jacobian_reversemode(f1, x_in)) ≈ first(Zygote.jacobian(f1, x_in))
@benchmark sparse_jacobian_reversemode($f1, $x_in)

@test sparse_jacobian_reversemode!(J, f1, x_in, cache) ≈ first(Zygote.jacobian(f1, x_in))
@benchmark sparse_jacobian_reversemode!($J, $f1, $x_in, $cache)

Results:

julia> @benchmark Zygote.jacobian($f1, $x_in)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  21.010 μs … 202.265 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     27.285 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   26.745 μs ±   5.267 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▄█▇▄▁                ▁▁▁                                    
  ▂▇█████▆▅▄▃▂▂▂▂▂▂▂▃▄▅▇██████▇▆▅▄▄▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▁▁▁▁ ▃
  21 μs           Histogram: frequency by time         38.6 μs <

 Memory estimate: 6.03 KiB, allocs estimate: 177.

julia> @test Array(sparse_jacobian_reversemode(f1, x_in)) ≈ first(Zygote.jacobian(f1, x_in))
Test Passed

julia> @benchmark sparse_jacobian_reversemode($f1, $x_in)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  35.469 μs … 226.444 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     42.254 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   46.914 μs ±  11.458 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▄▅▇█▆▅▂                                                     
  ▂▆████████▆▅▄▄▃▃▃▃▃▄▄▅▄▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁ ▃
  35.5 μs         Histogram: frequency by time         84.5 μs <

 Memory estimate: 23.52 KiB, allocs estimate: 363.

julia> @test sparse_jacobian_reversemode!(J, f1, x_in, cache) ≈ first(Zygote.jacobian(f1, x_in))
Test Passed

julia> @benchmark sparse_jacobian_reversemode!($J, $f1, $x_in, $cache)
BenchmarkTools.Trial: 10000 samples with 7 evaluations.
 Range (min … max):  3.924 μs … 17.234 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.860 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.307 μs ±  1.623 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

        ▄██▄▃▄▆▅▄▃▁                                           
  ▂▅▃▂▂▆███████████▇▇▆▆▆▆▅▄▄▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁ ▃
  3.92 μs        Histogram: frequency by time        12.1 μs <

 Memory estimate: 6.88 KiB, allocs estimate: 99.

@avik-pal changed the title from "[Very WIP] Faster Jacobian via Split backends" to "Faster Jacobian via Split backends" on Aug 17, 2023
@avik-pal (Member, Author) commented:

Benchmarking Script

import Pkg;
Pkg.activate(@__DIR__);

using BVProblemLibrary, DiffEqDevTools, Plots
using BenchmarkTools # needed for @belapsed below
using Symbolics, Zygote, Enzyme

using BoundaryValueDiffEq

# Test Case 1
function func_1!(du, u, p, t)
    du[1] = u[2]
    du[2] = 0
end

# Not able to change the initial condition.
# Hard coded solution.
func_1 = ODEFunction(func_1!, analytic = (u0, p, t) -> [5 - t, -1])

function boundary!(residual, u, p, t)
    residual[1] = u[1][1] - 5
    residual[2] = u[length(u)][1]
end

tspan = (0.0, 5.0)
u0 = [5.0, -3.5]

prob = BVProblem(func_1, boundary!, u0, tspan)

alg = MIRK3()

sol = solve(prob, MIRK3(); dt = 0.2)

solve(prob,
    MIRK3(;
        jac_alg = MIRKJacobianComputationAlgorithm(; bc_diffmode = AutoSparseFiniteDiff(),
            collocation_diffmode = AutoSparseFiniteDiff())); dt = 0.2)

jac_algs = [MIRKJacobianComputationAlgorithm(),
    MIRKJacobianComputationAlgorithm(; bc_diffmode = AutoSparseFiniteDiff(),
        collocation_diffmode = AutoSparseFiniteDiff()),
    MIRKJacobianComputationAlgorithm(; bc_diffmode = AutoFiniteDiff(),
        collocation_diffmode = AutoFiniteDiff()),
    MIRKJacobianComputationAlgorithm(; bc_diffmode = AutoSparseEnzyme(),
        collocation_diffmode = AutoSparseFiniteDiff()),
]

alg_names = ["Default", "Auto Detect Sparsity Patterns", "No Sparsity",
    "Auto Sparse Enzyme BC"]

for mirk_method in (MIRK3, MIRK4, MIRK5, MIRK6)
    for (i, jac_alg) in enumerate(jac_algs)
        sol = solve(prob, mirk_method(; jac_alg); dt = 0.2)
        solver = mirk_method(; jac_alg)
        t = @belapsed solve($prob, $solver; dt = 0.2)
        name = alg_names[i]
        println("$(mirk_method) $(name) took $(t)s")
    end
end

for mirk_method in (MIRK3, MIRK4, MIRK5) # , MIRK6)
    for (i, jac_alg) in enumerate(jac_algs)
        sol = solve(prob_bvp_nonlinear_15, mirk_method(; jac_alg); dt = 0.05)
        solver = mirk_method(; jac_alg)
        t = @belapsed solve($prob_bvp_nonlinear_15, $solver; dt = 0.05)
        name = alg_names[i]
        println("$(mirk_method) $(name) took $(t)s")
    end
end

@avik-pal avik-pal marked this pull request as ready for review August 18, 2023 19:38
@ai-maintainer (bot) left a comment:

AI-Maintainer Review for PR - Faster Jacobian via Split backends

Title and Description 👍

The Title and Description are clear and focused
The title and description of the pull request are clear and focused. They effectively communicate the purpose of the changes, which is to improve the efficiency of the Jacobian calculation by utilizing split backends. The description provides a detailed breakdown of the specific tasks involved in achieving this goal.

Scope of Changes 👍

The changes are narrowly focused
The changes in the pull request are narrowly focused on a specific goal, which is to achieve a faster Jacobian calculation through the use of split backends. While there are multiple tasks involved, they all contribute to the main objective of improving the efficiency of the Jacobian calculation. Therefore, the author is focused on resolving a single issue, rather than trying to address multiple unrelated issues simultaneously.

Testing ⚠️

Testing details are missing
The description does not provide information on how the changes were tested. It is important for the author to provide details on the testing strategies used to ensure the correctness and performance of the changes. This could include information about unit tests, integration tests, or any specific scenarios or datasets used for testing.

Documentation ⚠️

Some functions and classes are missing docstrings
The following newly added functions, classes, or methods do not have docstrings:
  • construct_MIRK_nlproblem in nlprob.jl
  • MIRKJacobianComputationAlgorithm struct in types.jl

These should be updated with appropriate docstrings to describe their behavior, arguments, and return values.

Suggested Changes

  • Please add docstrings to the construct_MIRK_nlproblem function in nlprob.jl and the MIRKJacobianComputationAlgorithm struct in types.jl to describe their behavior, arguments, and return values.
  • Please provide information on how the changes were tested. This could include details about unit tests, integration tests, or specific scenarios or datasets used for testing.


@avik-pal (Member, Author) commented:

For the TwoPointBVP specialization, we will need some changes in how the problem is specified; holding off on those until the upstream PRs are ready.

Review thread on src/types.jl: outdated, resolved.

@avik-pal (Member, Author) commented:
(The current master used sparse differentiation but never really constructed a sparse matrix, so that dispatch was not being triggered)
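To illustrate the distinction (a generic sketch, not the package internals): sparse dispatches only trigger when the Jacobian prototype is actually materialized as a `SparseMatrixCSC`, not merely a dense array that happens to be mostly zero.

```julia
using LinearAlgebra, SparseArrays

# The same tridiagonal Jacobian, stored densely vs. as a true sparse matrix.
J_dense = Matrix(Tridiagonal(ones(3), fill(-2.0, 4), ones(3)))
J_sparse = sparse(J_dense)

# `\` on the dense matrix takes the dense LU path; on the SparseMatrixCSC it
# dispatches to a sparse factorization. Only the second hits sparse code paths,
# even though both matrices hold identical values.
b = ones(4)
x_dense = J_dense \ b
x_sparse = J_sparse \ b
```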

@avik-pal (Member, Author) commented Sep 4, 2023

@ChrisRackauckas this is ready from my end

@ChrisRackauckas merged commit dbf35ea into SciML:master on Sep 4, 2023
4 of 6 checks passed
@avik-pal avik-pal deleted the ap/autosparse branch September 5, 2023 00:04