Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@inbounds in copyto! for structured broadcasting #48437

Merged
merged 4 commits into from
Feb 19, 2023
Merged

Conversation

jishnub
Copy link
Contributor

@jishnub jishnub commented Jan 28, 2023

This seems to cut down the runtime of D .* D by half for 5000x5000 diagonal matrices. Using nightly v"1.10.0-DEV.450"

julia> using LinearAlgebra, BenchmarkTools

julia> D = Diagonal(rand(5000));

julia> @btime $D * $D;
  3.648 μs (2 allocations: 39.11 KiB)

julia> @btime $D .* $D; # ideally, should be identical to matrix multiplication
  8.729 μs (2 allocations: 39.11 KiB)

julia> B = Broadcast.instantiate(Broadcast.Broadcasted(*, (D, D)));

julia> @btime Base.copyto!($(copy(D)), $B); # the main method called
  8.390 μs (0 allocations: 0 bytes)

julia> function copyto2!(dest::Diagonal, bc::Broadcast.Broadcasted{<:LinearAlgebra.StructuredMatrixStyle})
           !LinearAlgebra.isstructurepreserving(bc) && !LinearAlgebra.fzeropreserving(bc) && return copyto!(dest, convert(Broadcast.Broadcasted{Nothing}, bc))
           axs = axes(dest)
           axes(bc) == axs || Broadcast.throwdm(axes(bc), axs)
           @inbounds for i in axs[1]
               dest.diag[i] = Broadcast._broadcast_getindex(bc, CartesianIndex(i, i))
           end
           return dest
       end
copyto2! (generic function with 1 method)

julia> @btime copyto2!($(copy(D)), $B);
  4.207 μs (0 allocations: 0 bytes)

Given that the axes are verified to be identical in the previous line, this seems safe.

This seems to cut down the runtime of `D .* D` by half for `5000x5000` diagonal matrices.
Using nightly `v"1.10.0-DEV.450"`
```julia
julia> using LinearAlgebra, BenchmarkTools

julia> D = Diagonal(rand(5000));

julia> @Btime $D * $D;
  3.648 μs (2 allocations: 39.11 KiB)

julia> @Btime $D .* $D;
  8.729 μs (2 allocations: 39.11 KiB)

julia> B = Broadcast.instantiate(Broadcast.Broadcasted(*, (D, D)));

julia> @Btime Base.copyto!($(copy(D)), $B);
  8.390 μs (0 allocations: 0 bytes)

julia> function copyto2!(dest::Diagonal, bc::Broadcast.Broadcasted{<:LinearAlgebra.StructuredMatrixStyle})
           !LinearAlgebra.isstructurepreserving(bc) && !LinearAlgebra.fzeropreserving(bc) && return copyto!(dest, convert(Broadcast.Broadcasted{Nothing}, bc))
           axs = axes(dest)
           axes(bc) == axs || Broadcast.throwdm(axes(bc), axs)
           @inbounds for i in axs[1]
               dest.diag[i] = Broadcast._broadcast_getindex(bc, CartesianIndex(i, i))
           end
           return dest
       end
copyto2! (generic function with 1 method)

julia> @Btime copyto2!($(copy(D)), $B);
  4.207 μs (0 allocations: 0 bytes)
```
@jishnub jishnub added the domain:linear algebra Linear algebra label Jan 28, 2023
@N5N3
Copy link
Member

N5N3 commented Jan 28, 2023

Since default copyto! has @inbounds I think there would be no objection if this helps performance.
But perhaps we should add @inbounds to other copyto! in this file?

@jishnub jishnub changed the title @inbounds in diagonal broadcasting @inbounds in copyto! for structured broadcasting Jan 28, 2023
Copy link
Sponsor Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem like a very large chunk of code to entirely mark @inbounds, since I think I count at least 6 different indexing operations in each of these. Can we mark just the operation that matters for it?

@jishnub
Copy link
Contributor Author

jishnub commented Jan 29, 2023

I've updated it to only mark the broadcasted getindex calls, as these are the annotations that seem to matter in most cases. For the Lower and Upper triangular cases, both the getindex and setindex! calls improve performance.

@ViralBShah
Copy link
Member

good to merge?

@jishnub
Copy link
Contributor Author

jishnub commented Feb 13, 2023

This should be ready

@dkarrasch dkarrasch added the status:merge me PR is reviewed. Merge when all tests are passing label Feb 18, 2023
@N5N3 N5N3 merged commit c82aeb7 into JuliaLang:master Feb 19, 2023
@N5N3 N5N3 removed the status:merge me PR is reviewed. Merge when all tests are passing label Feb 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants