
Improve how we build A_and_Aᵀ #156

Merged: gdalle merged 4 commits into main from am/A_and_At on Feb 17, 2025

Conversation

@amontoison
Collaborator

@amontoison amontoison commented Nov 11, 2024

close #161

If we want to store exactly [0 A'; A 0], we can't use symmetric_pattern to build Aᵀ.

Aᵀ = if symmetric_pattern || A isa Union{Symmetric,Hermitian}
  A
else
  transpose(A)
end

In that case, A_and_Aᵀ has the correct sparsity pattern but the wrong nonzero values.
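To make the point concrete, here is a minimal sketch on a 2×2 matrix whose pattern is symmetric but whose values are not. The matrix and the variable names (`exact`, `shortcut`) are made up for illustration and are not taken from the package.

```julia
using SparseArrays, LinearAlgebra

# Structurally symmetric pattern, but A[1, 2] != A[2, 1].
A = sparse([1, 2, 2], [2, 1, 2], [10.0, 20.0, 30.0], 2, 2)

Z = spzeros(2, 2)
exact    = [Z sparse(transpose(A)); A Z]  # the true [0 Aᵀ; A 0]
shortcut = [Z A; A Z]                     # A reused in place of Aᵀ

# The stored positions (row and column indices) coincide ...
same_pattern = findnz(exact)[1:2] == findnz(shortcut)[1:2]
# ... but the values in the upper-right block do not.
same_values = exact == shortcut
```

Here `same_pattern` is `true` and `same_values` is `false`: the shortcut is enough for coloring, which only looks at the pattern, but not if the stored matrix is meant to equal [0 A'; A 0] exactly.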

I'm wondering if we should store a different matrix than b = A_and_Aᵀ in StarSetColoringResult or TreeSetColoringResult.
If I remember correctly, we want to store B to have the correct type for the function decompress, but in this case it's always a SparseMatrixCSC, whereas A could have a different type.

We probably only want the AdjacencyGraph of A_and_Aᵀ and not the matrix itself, and we could likely achieve this with a keyword argument for AdjacencyGraph or another constructor that takes two arguments (A and Aᵀ).
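As a rough sketch of what a two-argument graph could look like, the snippet below answers neighbor queries for the graph of [0 Aᵀ; A 0] from A and Aᵀ alone. The type `LazyBipartiteGraph` and the function `lazy_neighbors` are hypothetical names introduced here; they are not the package's `AdjacencyGraph` API. It assumes vertices 1:n are the columns of A and vertices n+1:n+m are its rows.

```julia
using SparseArrays

# Hypothetical lazy graph: holds A and Aᵀ, never the (m+n) x (m+n) matrix.
struct LazyBipartiteGraph{M<:SparseMatrixCSC}
    A::M
    Aᵀ::M
end

LazyBipartiteGraph(A::SparseMatrixCSC) = LazyBipartiteGraph(A, sparse(transpose(A)))

function lazy_neighbors(g::LazyBipartiteGraph, v::Integer)
    n = size(g.A, 2)
    if v <= n
        # Column vertex j = v: neighbors are the rows i with A[i, j] stored, shifted by n.
        return [n + i for i in view(rowvals(g.A), nzrange(g.A, v))]
    else
        # Row vertex i = v - n: neighbors are the columns j with A[i, j] stored.
        return collect(view(rowvals(g.Aᵀ), nzrange(g.Aᵀ, v - n)))
    end
end

A = sparse([1, 2, 2], [1, 1, 3], [1.0, 2.0, 3.0], 2, 3)
g = LazyBipartiteGraph(A)
```

Because both diagonal blocks are zero, a vertex is never its own neighbor, so a lookup like this needs no filtering step to drop diagonal entries.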

Off-topic: We can modify the function neighbors(g::AdjacencyGraph, v::Integer) to not use filter in the case of bicoloring because we know that we don't have diagonal coefficients.

@amontoison amontoison requested a review from gdalle November 11, 2024 04:08
@gdalle
Member

gdalle commented Nov 16, 2024

> We probably only want the AdjacencyGraph of A_and_Aᵀ and not the matrix itself, and we could likely achieve this with a keyword argument for AdjacencyGraph or another constructor that takes two arguments (A and Aᵀ).

You're right, in fact that's how I did it initially in #152 (check out the first commit, which is itself a sub-PR). I removed it afterwards because it felt simpler, but looking at the code again we probably never need to actually materialize A_and_Aᵀ. The lazy graph approach would likely work. I have opened #161 to keep track.

Comment thread src/interface.jl Outdated
Comment thread src/interface.jl Outdated
@codecov

codecov Bot commented Feb 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (50f3f23) to head (7e4f62d).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #156   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           13        13           
  Lines         1500      1541   +41     
=========================================
+ Hits          1500      1541   +41     


@amontoison added the performance (Speeding things up) and benchmark (Run benchmarks on PR) labels Feb 14, 2025
@amontoison amontoison requested a review from gdalle February 14, 2025 17:18
@amontoison
Collaborator Author

amontoison commented Feb 14, 2025

@gdalle
I updated this old PR to directly build the augmented adjacency matrix in the format SparsityPatternCSC.
I also updated the definition such that SparsityPatternCSC <: AbstractMatrix{Bool}.
StarSetColoringResult and TreeSetColoringResult were not happy that A_and_Aᵀ was not an AbstractMatrix.
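For readers unfamiliar with the idea, the following is a simplified sketch of a values-free CSC pattern that subtypes `AbstractMatrix{Bool}`, together with a direct construction of the augmented pattern. `PatternSketch` and `augmented_pattern` are hypothetical names; the real `SparsityPatternCSC` definition and the PR's construction are not shown in this thread and may differ. For brevity, this version still materializes Aᵀ.

```julia
using SparseArrays

# Hypothetical stand-in for a CSC pattern type: structure only, no values.
struct PatternSketch <: AbstractMatrix{Bool}
    m::Int
    n::Int
    colptr::Vector{Int}
    rowval::Vector{Int}
end

Base.size(P::PatternSketch) = (P.m, P.n)

function Base.getindex(P::PatternSketch, i::Int, j::Int)
    @boundscheck checkbounds(P, i, j)
    # true if row i is stored in column j
    return i in view(P.rowval, P.colptr[j]:(P.colptr[j + 1] - 1))
end

# Pattern of [0 Aᵀ; A 0], written straight into colptr / rowval.
function augmented_pattern(A::SparseMatrixCSC)
    m, n = size(A)
    Aᵀ = sparse(transpose(A))  # only its structure is used (assumption: fine for a sketch)
    colptr = Vector{Int}(undef, m + n + 1)
    rowval = Vector{Int}(undef, 2 * nnz(A))
    k = 1
    for j in 1:n               # columns 1:n hold A, rows shifted down by n
        colptr[j] = k
        for idx in nzrange(A, j)
            rowval[k] = n + rowvals(A)[idx]
            k += 1
        end
    end
    for i in 1:m               # columns n+1:n+m hold Aᵀ, rows unshifted
        colptr[n + i] = k
        for idx in nzrange(Aᵀ, i)
            rowval[k] = rowvals(Aᵀ)[idx]
            k += 1
        end
    end
    colptr[m + n + 1] = k
    return PatternSketch(m + n, m + n, colptr, rowval)
end

A = sparse([1, 2, 2], [1, 1, 3], [1.0, 2.0, 3.0], 2, 3)
P = augmented_pattern(A)
```

Defining `size` and `getindex` is enough for the type to behave as an `AbstractMatrix{Bool}`, which is the property the result types are described as requiring above.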

We should check with some benchmarks that we still target the optimized star / acyclic decompression for SparseMatrixCSC.
Update: It seems to be fine for the dispatch 👍

Comment thread src/interface.jl Outdated
@amontoison amontoison force-pushed the am/A_and_At branch 2 times, most recently from f995396 to 3d50b16 Compare February 14, 2025 22:28
@JuliaDiff JuliaDiff deleted a comment from github-actions Bot Feb 15, 2025
@amontoison amontoison closed this Feb 15, 2025
@amontoison amontoison reopened this Feb 15, 2025
@github-actions
Contributor

Benchmark Results

Time (columns: benchmark, main, 3d50b16..., ratio main/3d50b1681b2fcc...)
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.002 2.98 ± 0.077 ms 2.67 ± 0.12 ms 1.12
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.005 6.09 ± 0.19 ms 5.75 ± 0.21 ms 1.06
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.01 12.2 ± 0.24 ms 11.8 ± 0.24 ms 1.04
coloring/nonsymmetric/bidirectional/direct/n=100000/p=0.0001 2.8 ± 0.013 s 2.72 ± 0.013 s 1.03
coloring/nonsymmetric/bidirectional/direct/n=100000/p=2.0e-5 0.425 ± 0.012 s 0.395 ± 0.0063 s 1.08
coloring/nonsymmetric/bidirectional/direct/n=100000/p=5.0e-5 1.12 ± 0.021 s 1.07 ± 0.0093 s 1.04
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.002 7.23 ± 0.62 ms 6.93 ± 0.8 ms 1.04
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.005 16.7 ± 1.1 ms 16.3 ± 1.3 ms 1.02
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.01 0.0357 ± 0.0016 s 0.0354 ± 0.0011 s 1.01
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=0.0001 7.47 s 7.57 s 0.986
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=2.0e-5 1.03 ± 0.012 s 1.01 ± 0.035 s 1.03
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=5.0e-5 2.98 ± 0.084 s 2.91 ± 0.11 s 1.02
coloring/nonsymmetric/column/direct/n=1000/p=0.002 0.431 ± 0.0096 ms 0.431 ± 0.01 ms 1
coloring/nonsymmetric/column/direct/n=1000/p=0.005 0.965 ± 0.011 ms 0.965 ± 0.0082 ms 1
coloring/nonsymmetric/column/direct/n=1000/p=0.01 2.06 ± 0.31 ms 2.06 ± 0.15 ms 0.999
coloring/nonsymmetric/column/direct/n=100000/p=0.0001 0.377 ± 0.0078 s 0.378 ± 0.0063 s 0.997
coloring/nonsymmetric/column/direct/n=100000/p=2.0e-5 0.0676 ± 0.00039 s 0.0675 ± 0.0003 s 1
coloring/nonsymmetric/column/direct/n=100000/p=5.0e-5 0.163 ± 0.0022 s 0.163 ± 0.0022 s 0.999
coloring/nonsymmetric/row/direct/n=1000/p=0.002 0.422 ± 0.0099 ms 0.421 ± 0.0097 ms 1
coloring/nonsymmetric/row/direct/n=1000/p=0.005 0.953 ± 0.011 ms 0.956 ± 0.009 ms 0.997
coloring/nonsymmetric/row/direct/n=1000/p=0.01 2.05 ± 0.019 ms 2.05 ± 0.018 ms 0.998
coloring/nonsymmetric/row/direct/n=100000/p=0.0001 0.345 ± 0.0025 s 0.373 ± 0.0031 s 0.924
coloring/nonsymmetric/row/direct/n=100000/p=2.0e-5 0.0629 ± 0.00046 s 0.0665 ± 0.00031 s 0.946
coloring/nonsymmetric/row/direct/n=100000/p=5.0e-5 0.152 ± 0.0018 s 0.163 ± 0.0017 s 0.936
coloring/symmetric/column/direct/n=1000/p=0.002 1.46 ± 0.084 ms 1.46 ± 0.025 ms 0.997
coloring/symmetric/column/direct/n=1000/p=0.005 3.7 ± 0.026 ms 3.71 ± 0.028 ms 0.998
coloring/symmetric/column/direct/n=1000/p=0.01 8.72 ± 0.09 ms 8.74 ± 0.092 ms 0.997
coloring/symmetric/column/direct/n=100000/p=0.0001 1.79 ± 0.019 s 1.8 ± 0.0027 s 0.996
coloring/symmetric/column/direct/n=100000/p=2.0e-5 0.192 ± 0.0052 s 0.192 ± 0.0024 s 0.997
coloring/symmetric/column/direct/n=100000/p=5.0e-5 0.586 ± 0.0092 s 0.597 ± 0.0055 s 0.982
coloring/symmetric/column/substitution/n=1000/p=0.002 4.02 ± 0.11 ms 4.03 ± 0.14 ms 0.999
coloring/symmetric/column/substitution/n=1000/p=0.005 9.6 ± 0.27 ms 9.59 ± 0.3 ms 1
coloring/symmetric/column/substitution/n=1000/p=0.01 20.4 ± 0.45 ms 20.3 ± 0.6 ms 1
coloring/symmetric/column/substitution/n=100000/p=0.0001 3.78 ± 0.09 s 3.76 ± 0.11 s 1.01
coloring/symmetric/column/substitution/n=100000/p=2.0e-5 0.53 ± 0.036 s 0.53 ± 0.023 s 1
coloring/symmetric/column/substitution/n=100000/p=5.0e-5 1.51 ± 0.059 s 1.5 ± 0.084 s 1.01
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.002 0.187 ± 0.007 ms 0.184 ± 0.0066 ms 1.01
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.005 0.304 ± 0.0097 ms 0.303 ± 0.0094 ms 1
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.01 0.551 ± 0.016 ms 0.556 ± 0.016 ms 0.991
decompress/nonsymmetric/bidirectional/direct/n=100000/p=0.0001 0.0862 ± 0.003 s 0.0848 ± 0.0044 s 1.02
decompress/nonsymmetric/bidirectional/direct/n=100000/p=2.0e-5 27.8 ± 1.9 ms 27.9 ± 2 ms 0.999
decompress/nonsymmetric/bidirectional/direct/n=100000/p=5.0e-5 0.046 ± 0.0018 s 0.0458 ± 0.0024 s 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.002 0.145 ± 0.0047 ms 0.146 ± 0.0046 ms 0.996
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.005 0.3 ± 0.0096 ms 0.3 ± 0.0099 ms 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.01 0.634 ± 0.015 ms 0.634 ± 0.014 ms 0.999
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=0.0001 0.171 ± 0.003 s 0.169 ± 0.0022 s 1.01
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=2.0e-5 30.5 ± 0.54 ms 30.2 ± 0.44 ms 1.01
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=5.0e-5 0.0708 ± 0.0026 s 0.0708 ± 0.0025 s 0.999
decompress/nonsymmetric/column/direct/n=1000/p=0.002 26.1 ± 1.3 μs 26 ± 1.3 μs 1.01
decompress/nonsymmetric/column/direct/n=1000/p=0.005 0.049 ± 0.0019 ms 0.0487 ± 0.002 ms 1.01
decompress/nonsymmetric/column/direct/n=1000/p=0.01 0.0871 ± 0.0038 ms 0.0876 ± 0.0047 ms 0.994
decompress/nonsymmetric/column/direct/n=100000/p=0.0001 25.5 ± 1.1 ms 26.5 ± 0.98 ms 0.964
decompress/nonsymmetric/column/direct/n=100000/p=2.0e-5 4.3 ± 0.16 ms 4.28 ± 0.12 ms 1
decompress/nonsymmetric/column/direct/n=100000/p=5.0e-5 12.7 ± 0.32 ms 12.6 ± 0.27 ms 1.01
decompress/nonsymmetric/row/direct/n=1000/p=0.002 25.9 ± 1.5 μs 25.6 ± 1.4 μs 1.01
decompress/nonsymmetric/row/direct/n=1000/p=0.005 0.0453 ± 0.0018 ms 0.0454 ± 0.0019 ms 0.997
decompress/nonsymmetric/row/direct/n=1000/p=0.01 0.0786 ± 0.004 ms 0.0792 ± 0.0041 ms 0.994
decompress/nonsymmetric/row/direct/n=100000/p=0.0001 11.8 ± 0.5 ms 12.4 ± 0.5 ms 0.952
decompress/nonsymmetric/row/direct/n=100000/p=2.0e-5 3.12 ± 0.083 ms 3.26 ± 0.075 ms 0.959
decompress/nonsymmetric/row/direct/n=100000/p=5.0e-5 6.08 ± 0.18 ms 6.43 ± 0.15 ms 0.945
decompress/symmetric/column/direct/n=1000/p=0.002 25.5 ± 1.5 μs 25.6 ± 1.4 μs 0.997
decompress/symmetric/column/direct/n=1000/p=0.005 0.0468 ± 0.0019 ms 0.0466 ± 0.0023 ms 1
decompress/symmetric/column/direct/n=1000/p=0.01 0.082 ± 0.0044 ms 0.082 ± 0.0044 ms 1
decompress/symmetric/column/direct/n=100000/p=0.0001 23.3 ± 0.86 ms 23.8 ± 1.1 ms 0.979
decompress/symmetric/column/direct/n=100000/p=2.0e-5 3.79 ± 0.16 ms 3.55 ± 0.16 ms 1.07
decompress/symmetric/column/direct/n=100000/p=5.0e-5 11.8 ± 0.43 ms 12 ± 0.27 ms 0.989
decompress/symmetric/column/substitution/n=1000/p=0.002 0.103 ± 0.003 ms 0.109 ± 0.003 ms 0.947
decompress/symmetric/column/substitution/n=1000/p=0.005 0.195 ± 0.0086 ms 0.192 ± 0.008 ms 1.01
decompress/symmetric/column/substitution/n=1000/p=0.01 0.364 ± 0.012 ms 0.362 ± 0.012 ms 1.01
decompress/symmetric/column/substitution/n=100000/p=0.0001 0.099 ± 0.0014 s 0.103 ± 0.002 s 0.965
decompress/symmetric/column/substitution/n=100000/p=2.0e-5 28 ± 0.47 ms 27.8 ± 0.28 ms 1.01
decompress/symmetric/column/substitution/n=100000/p=5.0e-5 0.0545 ± 0.0011 s 0.0547 ± 0.00054 s 0.996
time_to_load 0.284 ± 0.0032 s 0.284 ± 0.004 s 0.999
Memory (columns: benchmark, main, 3d50b16..., ratio main/3d50b1681b2fcc...)
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.002 9.51 k allocs: 3.85 MB 8.97 k allocs: 3 MB 1.29
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.005 11.1 k allocs: 6.15 MB 10.6 k allocs: 4.76 MB 1.29
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.01 11.2 k allocs: 10.7 MB 10.7 k allocs: 8.39 MB 1.27
coloring/nonsymmetric/bidirectional/direct/n=100000/p=0.0001 1 M allocs: 1.1 GB 1 M allocs: 0.876 GB 1.25
coloring/nonsymmetric/bidirectional/direct/n=100000/p=2.0e-5 0.831 M allocs: 0.398 GB 0.831 M allocs: 0.318 GB 1.25
coloring/nonsymmetric/bidirectional/direct/n=100000/p=5.0e-5 0.994 M allocs: 0.644 GB 0.994 M allocs: 0.511 GB 1.26
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.002 0.0345 M allocs: 6.16 MB 0.034 M allocs: 5.3 MB 1.16
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.005 0.0646 M allocs: 12.2 MB 0.0641 M allocs: 10.9 MB 1.13
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.01 0.119 M allocs: 21.4 MB 0.119 M allocs: 19.1 MB 1.12
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=0.0001 11.5 M allocs: 2.29 GB 11.5 M allocs: 2.07 GB 1.11
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=2.0e-5 3.2 M allocs: 0.583 GB 3.2 M allocs: 0.503 GB 1.16
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=5.0e-5 6.22 M allocs: 1.18 GB 6.22 M allocs: 1.05 GB 1.13
coloring/nonsymmetric/column/direct/n=1000/p=0.002 0.12 k allocs: 0.315 MB 0.12 k allocs: 0.315 MB 1
coloring/nonsymmetric/column/direct/n=1000/p=0.005 0.12 k allocs: 0.539 MB 0.12 k allocs: 0.539 MB 1
coloring/nonsymmetric/column/direct/n=1000/p=0.01 0.12 k allocs: 0.928 MB 0.12 k allocs: 0.928 MB 1
coloring/nonsymmetric/column/direct/n=100000/p=0.0001 0.12 k allocs: 0.0894 GB 0.12 k allocs: 0.0894 GB 1
coloring/nonsymmetric/column/direct/n=100000/p=2.0e-5 0.12 k allocs: 30.5 MB 0.12 k allocs: 30.5 MB 1
coloring/nonsymmetric/column/direct/n=100000/p=5.0e-5 0.12 k allocs: 0.0521 GB 0.12 k allocs: 0.0521 GB 1
coloring/nonsymmetric/row/direct/n=1000/p=0.002 0.12 k allocs: 0.315 MB 0.12 k allocs: 0.315 MB 1
coloring/nonsymmetric/row/direct/n=1000/p=0.005 0.12 k allocs: 0.539 MB 0.12 k allocs: 0.539 MB 1
coloring/nonsymmetric/row/direct/n=1000/p=0.01 0.12 k allocs: 0.928 MB 0.12 k allocs: 0.928 MB 1
coloring/nonsymmetric/row/direct/n=100000/p=0.0001 0.12 k allocs: 0.0894 GB 0.12 k allocs: 0.0894 GB 1
coloring/nonsymmetric/row/direct/n=100000/p=2.0e-5 0.12 k allocs: 30.5 MB 0.12 k allocs: 30.5 MB 1
coloring/nonsymmetric/row/direct/n=100000/p=5.0e-5 0.12 k allocs: 0.0521 GB 0.12 k allocs: 0.0521 GB 1
coloring/symmetric/column/direct/n=1000/p=0.002 6.31 k allocs: 0.865 MB 6.31 k allocs: 0.865 MB 1
coloring/symmetric/column/direct/n=1000/p=0.005 14.6 k allocs: 1.68 MB 14.6 k allocs: 1.68 MB 1
coloring/symmetric/column/direct/n=1000/p=0.01 30.7 k allocs: 3.28 MB 30.7 k allocs: 3.28 MB 1
coloring/symmetric/column/direct/n=100000/p=0.0001 3.02 M allocs: 0.377 GB 3.02 M allocs: 0.377 GB 1
coloring/symmetric/column/direct/n=100000/p=2.0e-5 0.589 M allocs: 0.0902 GB 0.589 M allocs: 0.0902 GB 1
coloring/symmetric/column/direct/n=100000/p=5.0e-5 1.44 M allocs: 0.195 GB 1.44 M allocs: 0.195 GB 1
coloring/symmetric/column/substitution/n=1000/p=0.002 28.7 k allocs: 2.97 MB 28.7 k allocs: 2.97 MB 1
coloring/symmetric/column/substitution/n=1000/p=0.005 0.0531 M allocs: 5.64 MB 0.0531 M allocs: 5.64 MB 1
coloring/symmetric/column/substitution/n=1000/p=0.01 0.101 M allocs: 11.2 MB 0.101 M allocs: 11.2 MB 1
coloring/symmetric/column/substitution/n=100000/p=0.0001 9.56 M allocs: 1.06 GB 9.56 M allocs: 1.06 GB 1
coloring/symmetric/column/substitution/n=100000/p=2.0e-5 2.77 M allocs: 0.281 GB 2.77 M allocs: 0.281 GB 1
coloring/symmetric/column/substitution/n=100000/p=5.0e-5 5.17 M allocs: 0.565 GB 5.17 M allocs: 0.565 GB 1
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.002 0.06 k allocs: 0.884 MB 0.06 k allocs: 0.884 MB 1
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.005 0.073 k allocs: 1.66 MB 0.073 k allocs: 1.66 MB 1
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.01 0.06 k allocs: 3.52 MB 0.06 k allocs: 3.52 MB 1
decompress/nonsymmetric/bidirectional/direct/n=100000/p=0.0001 0.06 k allocs: 0.355 GB 0.06 k allocs: 0.355 GB 1
decompress/nonsymmetric/bidirectional/direct/n=100000/p=2.0e-5 0.06 k allocs: 0.105 GB 0.06 k allocs: 0.105 GB 1
decompress/nonsymmetric/bidirectional/direct/n=100000/p=5.0e-5 0.06 k allocs: 0.187 GB 0.06 k allocs: 0.187 GB 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.002 0.075 k allocs: 0.594 MB 0.075 k allocs: 0.594 MB 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.005 0.075 k allocs: 1.21 MB 0.075 k allocs: 1.21 MB 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.01 0.075 k allocs: 2.63 MB 0.075 k allocs: 2.63 MB 1
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=0.0001 0.075 k allocs: 0.257 GB 0.075 k allocs: 0.257 GB 1
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=2.0e-5 0.075 k allocs: 0.0559 GB 0.075 k allocs: 0.0559 GB 1
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=5.0e-5 0.075 k allocs: 0.121 GB 0.075 k allocs: 0.121 GB 1
decompress/nonsymmetric/column/direct/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/nonsymmetric/column/direct/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/nonsymmetric/column/direct/n=1000/p=0.01 0.045 k allocs: 0.802 MB 0.045 k allocs: 0.802 MB 1
decompress/nonsymmetric/column/direct/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/nonsymmetric/column/direct/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/nonsymmetric/column/direct/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
decompress/nonsymmetric/row/direct/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/nonsymmetric/row/direct/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/nonsymmetric/row/direct/n=1000/p=0.01 0.045 k allocs: 0.802 MB 0.045 k allocs: 0.802 MB 1
decompress/nonsymmetric/row/direct/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/nonsymmetric/row/direct/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/nonsymmetric/row/direct/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
decompress/symmetric/column/direct/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/symmetric/column/direct/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/symmetric/column/direct/n=1000/p=0.01 0.045 k allocs: 0.802 MB 0.045 k allocs: 0.802 MB 1
decompress/symmetric/column/direct/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/symmetric/column/direct/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/symmetric/column/direct/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
decompress/symmetric/column/substitution/n=1000/p=0.002 0.06 k allocs: 0.235 MB 0.06 k allocs: 0.235 MB 1
decompress/symmetric/column/substitution/n=1000/p=0.005 0.06 k allocs: 0.457 MB 0.06 k allocs: 0.457 MB 1
decompress/symmetric/column/substitution/n=1000/p=0.01 0.06 k allocs: 0.841 MB 0.06 k allocs: 0.841 MB 1
decompress/symmetric/column/substitution/n=100000/p=0.0001 0.06 k allocs: 0.0819 GB 0.06 k allocs: 0.0819 GB 1
decompress/symmetric/column/substitution/n=100000/p=2.0e-5 0.06 k allocs: 22.9 MB 0.06 k allocs: 22.9 MB 1
decompress/symmetric/column/substitution/n=100000/p=5.0e-5 0.06 k allocs: 0.0447 GB 0.06 k allocs: 0.0447 GB 1
time_to_load 0.159 k allocs: 11.2 kB 0.159 k allocs: 11.2 kB 1

@gdalle removed the benchmark (Run benchmarks on PR) label Feb 17, 2025
Member

@gdalle gdalle left a comment


Can we reduce code duplication by using the same code for transpose and bidirectional_pattern? For instance defining a subroutine like

transpose_pattern!(A, colptr, rowval)

which could then also be applied on colptr = view(...) and rowval = view(...) in the case of the bigger matrix (and followed by a vectorized addition to account for the shift)
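One possible reading of this suggestion is sketched below. Only the signature `transpose_pattern!(A, colptr, rowval)` comes from the comment; the body (a counting sort over the rows of A) and the buffer names are assumptions made for illustration, not the code in the PR.

```julia
using SparseArrays

# Hypothetical implementation: write the CSC pattern of Aᵀ into colptr (length m + 1)
# and rowval (length nnz(A)). Both may be views into larger buffers.
function transpose_pattern!(A::SparseMatrixCSC, colptr::AbstractVector{Int}, rowval::AbstractVector{Int})
    m, n = size(A)
    rows = rowvals(A)
    # 1) Count the stored entries in each row of A (= each column of Aᵀ).
    fill!(colptr, 0)
    for idx in 1:nnz(A)
        colptr[rows[idx] + 1] += 1
    end
    # 2) A prefix sum turns the counts into 1-based column starts.
    colptr[1] = 1
    for i in 1:m
        colptr[i + 1] += colptr[i]
    end
    # 3) Scatter the column indices, using a copy of the starts as write cursors.
    cursor = colptr[1:m]
    for j in 1:n, idx in nzrange(A, j)
        i = rows[idx]
        rowval[cursor[i]] = j
        cursor[i] += 1
    end
    return nothing
end

A = sparse([1, 2, 2], [1, 1, 3], [1.0, 2.0, 3.0], 2, 3)
m, n = size(A)
nnzA = nnz(A)

# Plain transpose: fresh buffers.
colptr_t = zeros(Int, m + 1)
rowval_t = zeros(Int, nnzA)
transpose_pattern!(A, colptr_t, rowval_t)

# Augmented matrix [0 Aᵀ; A 0]: columns 1:n are A with rows shifted by n,
# columns n+1:n+m are Aᵀ, written through views.
colptr = zeros(Int, m + n + 1)
rowval = zeros(Int, 2 * nnzA)
colptr[1:n] .= A.colptr[1:n]
rowval[1:nnzA] .= rowvals(A)[1:nnzA] .+ n
cp = view(colptr, (n + 1):(n + m + 1))
rv = view(rowval, (nnzA + 1):(2 * nnzA))
transpose_pattern!(A, cp, rv)
cp .+= nnzA  # vectorized shift: these columns start after the nnz(A) entries of A
```

The same routine serves both cases; the only extra work for the bigger matrix is the two shifts (row indices of the A block by n, column starts of the Aᵀ block by nnz(A)).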

@amontoison
Collaborator Author

amontoison commented Feb 17, 2025

> Can we reduce code duplication by using the same code for transpose and bidirectional_pattern? For instance defining a subroutine like
>
> transpose_pattern!(A, colptr, rowval)
>
> which could then also be applied on colptr = view(...) and rowval = view(...) in the case of the bigger matrix (and followed by a vectorized addition to account for the shift)

Yes, we could reduce code duplication by implementing a subroutine like transpose_pattern!.
However, the gain would only be around 20–30 lines of code (including comments and spacing).

That said, introducing the offset logic for rows, columns, and nnz might slightly impact readability.
Personally, I don't think it's worth spending time on this right now.

I suggest we merge the PR as it is.
If you feel strongly about reducing duplication, we could open an issue to revisit this improvement later.

Member

@gdalle gdalle left a comment


I don't fully understand the code, but I trust the tests I added.

@gdalle gdalle merged commit 1ce9fb8 into main Feb 17, 2025
@gdalle gdalle deleted the am/A_and_At branch February 17, 2025 19:32

Labels

performance Speeding things up

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Lazy bicoloring adjacency graph

2 participants